Suggested Reading:
Chapters 5-7 of Kernighan, B. & Pike, R. (1999). The practice of programming. Reading, MA: Addison-Wesley. Though the authors use C code snippets, much of the advice about debugging, testing, and performance tuning applies to other languages as well. (pdf)
The concepts in this section are taken from the book linked above, and will be illustrated with a high-level language (Python) for readability.
Some Python features, such as dynamic typing, trade performance for convenience.
Debugging
There are several strategies for debugging code efficiently. The best way to reduce bugs is prevention: test functionality before production by proactively trying to break the program. Once you discover a bug, you need to narrow down the inspection area (for example, to specific lines of code), add more checks before execution, and watch the program flow.
Debugging Approaches
- Understand error messages (during build time) - Build-time errors are easier to debug than run-time errors, but only if you understand the error message. The reported line number is sometimes only an approximation, and the compiler might not flag the real error. Consistent indentation and formatting make problems such as a missing semicolon easier to spot and speed up code review.
- Think before writing code - try explaining the code aloud as a sanity check
- Look for common bugs
- Divide and conquer - find, by trial and error, the smallest subset of code that reproduces the bug, then incrementally add lines back and retest
- Add more internal tests - validating parameters and checking invariants can eliminate excessive headaches
- Display output at critical spots
- Use a debugger - alternative to displaying output (see debugging tools section)
- Focus on recent changes - keep testing as you add new code. Version control can be helpful with this approach
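The divide-and-conquer approach applies to input data as well as to code. The following is a minimal sketch, assuming that a single bad record triggers the failure; the names process, fails, and shrink are hypothetical and exist only for this illustration.

```python
import math


def process(records):
    # Hypothetical function under test: fails on negative input.
    return [math.sqrt(r) for r in records]


def fails(records):
    """Return True if processing these records raises an error."""
    try:
        process(records)
    except ValueError:
        return True
    return False


def shrink(records):
    """Halve a failing input until one record remains.

    Assumes exactly one bad record causes the failure, so one of the
    two halves must still fail at every step.
    """
    while len(records) > 1:
        mid = len(records) // 2
        left, right = records[:mid], records[mid:]
        records = left if fails(left) else right
    return records


print(shrink([4, 9, 16, -1, 25]))  # [-1]
```

If several records interact to cause the bug, this simple halving is not enough, and you would need to test combinations of subsets instead.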
Debugging Tools
The Python debugger pdb is a module that supports setting conditional breakpoints, single stepping at the source line level, and inspecting stack frames. GDB is a debugger for C (and C++), so it won’t be covered in this class.
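As a small sketch of how pdb might be used, the built-in breakpoint() function (Python 3.7 and later) enters pdb by default at the line where it is called. The function running_total and its debug parameter are hypothetical, used here only so that the breakpoint is conditional and off unless requested.

```python
def running_total(values, debug=False):
    """Sum the values, optionally stopping in pdb on a negative value."""
    total = 0
    for value in values:
        if debug and value < 0:
            # Pauses here and opens the pdb prompt. Useful commands:
            #   p total   print an expression
            #   n         run the next line
            #   s         step into a function call
            #   w         show the stack frames
            #   c         continue running
            breakpoint()
        total += value
    return total


print(running_total([1, 2, 3]))  # 6
```

Calling running_total([1, -2, 3], debug=True) would stop at the negative value. Inside pdb, a conditional breakpoint can also be set without editing the code, using the form b lineno, condition.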
Assertions are a way to perform internal self-checks in your Python code. The assert statement tests whether a condition evaluates to true. If it does, the program keeps running without any explicit feedback. If not, the program raises an AssertionError. https://wiki.python.org/moin/UsingAssertionsEffectively
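A minimal sketch of an assertion used as an internal self-check follows; the function mean is a hypothetical example, not something from the suggested reading.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    # The optional message after the comma is shown if the check fails.
    assert len(values) > 0, "mean() requires at least one value"
    return sum(values) / len(values)


print(mean([2, 4]))  # 3.0

try:
    mean([])
except AssertionError as err:
    print("caught:", err)  # caught: mean() requires at least one value
```

Note that running Python with the -O option removes assert statements, so assertions are suited to catching programming mistakes rather than to validating user input.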
Testing
Testing can take various forms. Programmers working on the code should conduct white-box testing: using their knowledge of the code to test all statements, logic, and boundaries. Black-box testing means checking inputs and outputs without knowing the code design.
External Testing: Designing data to test your program
- Statement testing - each statement in a program should be executed at least once during program testing
- Logic/Path testing - each logical path through the program is tested
- Boundary testing - a technique of using input values at, just below, and just above the defined limits of an input domain in addition to input values causing outputs to be at, just below, and just above the defined limits of an output domain.
- Stress testing - evaluating a system or component at or beyond the limits of its specified requirements with respect to the quantity and variety of data.
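Boundary testing can be sketched as follows, assuming a hypothetical function is_valid_percentage whose input domain is 0 through 100 inclusive. The test data sits at, just below, and just above each limit.

```python
def is_valid_percentage(score):
    """Hypothetical function under test: accepts 0 through 100 inclusive."""
    return 0 <= score <= 100


# Inputs at, just below, and just above each limit, with expected results.
boundary_cases = {
    -1: False,   # just below the lower limit
    0: True,     # at the lower limit
    1: True,     # just above the lower limit
    99: True,    # just below the upper limit
    100: True,   # at the upper limit
    101: False,  # just above the upper limit
}

for value, expected in boundary_cases.items():
    assert is_valid_percentage(value) == expected, f"failed for {value}"

print("all boundary cases passed")
```

Off-by-one mistakes, such as writing < where <= was intended, tend to show up at exactly these values.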
Internal Testing: Designing your program to test itself
- Validating parameters - make sure values of parameters are valid (commonly using the assert statement)
- Checking invariants - checking aspects of data structures that should not vary, including in edge cases such as an empty or one-element data structure
- Checking function return values
- Changing code temporarily - generate artificial boundary or stress test
- Leaving testing code intact
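The internal testing ideas above can be sketched in one small class. SortedList is a hypothetical example: it validates its parameter, checks its ordering invariant after every change, and leaves that check in place rather than deleting it once the code works.

```python
import bisect


class SortedList:
    """Hypothetical container that keeps numbers in ascending order."""

    def __init__(self):
        self._items = []

    def _check_invariant(self):
        # Invariant: every item is <= the item that follows it.
        # This also holds trivially for empty and one-element lists.
        pairs = zip(self._items, self._items[1:])
        assert all(a <= b for a, b in pairs), "items out of order"

    def add(self, value):
        # Validate the parameter before touching the data structure.
        if not isinstance(value, (int, float)):
            raise TypeError(f"expected a number, got {type(value).__name__}")
        bisect.insort(self._items, value)
        self._check_invariant()

    def smallest(self):
        if not self._items:
            raise ValueError("smallest() called on an empty SortedList")
        return self._items[0]


numbers = SortedList()
for n in (3, 1, 2):
    numbers.add(n)
print(numbers.smallest())  # 1
```

The choice of TypeError for bad parameters and an assertion for the invariant is one convention among several: the first reports a caller's mistake, the second reports a bug inside the class itself.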
General Testing Strategies
- Automate the tests
- Test incrementally - test as you compose new code, making sure that after a fix previously working code is not broken (rerun all tests)
- Bug-driven testing - this can be reactive: when you find a bug, you create a test case that catches it. Or it can be proactive: with fault injection, you intentionally introduce a bug to make sure the testing mechanism catches it.
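Automated tests and fault injection can be combined in a short sketch using the standard unittest module. The functions clamp, buggy_clamp, and run_suite are hypothetical; buggy_clamp contains a deliberately injected fault so that we can confirm the tests notice it.

```python
import unittest


def clamp(value, low, high):
    """Restrict value to the range low..high."""
    return max(low, min(value, high))


def buggy_clamp(value, low, high):
    # Deliberately injected fault: the upper limit is ignored.
    return max(low, value)


def run_suite(impl):
    """Run the same tests against an implementation; True if all pass."""

    class ClampTests(unittest.TestCase):
        def test_inside_range(self):
            self.assertEqual(impl(5, 0, 10), 5)

        def test_below_range(self):
            self.assertEqual(impl(-5, 0, 10), 0)

        def test_above_range(self):
            self.assertEqual(impl(15, 0, 10), 10)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTests)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


print(run_suite(clamp))        # True
print(run_suite(buggy_clamp))  # False
```

If the suite had reported success for buggy_clamp, that would indicate a gap in the tests rather than in the code.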
Profiling
In software engineering, software profiling is a form of program analysis that measures time or space usage to aid program optimization. Various techniques can help you answer questions such as: How slow is my program? Where is my program slow? How can I make my program run faster, or with less memory?
Measuring time efficiency
- Timing studies with a tool such as the time command
- Gather statistics about program execution
- Big-O analysis of data structures or algorithms
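In Python, the first two techniques might look like the sketch below, which uses the standard timeit and cProfile modules. The data and the function count_hits are hypothetical; the comparison of list and set membership is chosen because Big-O analysis predicts a difference (a linear scan versus a hash lookup).

```python
import cProfile
import io
import pstats
import timeit

data_list = list(range(10_000))
data_set = set(data_list)

# Timing study: repeat each membership test 1,000 times.
list_time = timeit.timeit(lambda: 9_999 in data_list, number=1_000)
set_time = timeit.timeit(lambda: 9_999 in data_set, number=1_000)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")


def count_hits(container, probes):
    """Count how many probe values are present in the container."""
    return sum(1 for p in probes if p in container)


# Execution statistics: profile one call and print the top entries.
profiler = cProfile.Profile()
profiler.enable()
count_hits(data_list, range(0, 10_000, 100))
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The exact timings depend on the machine, so it is the relative difference between the two measurements that matters.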
Improving memory efficiency comes down to thinking more carefully about how data is stored. In a language like C, you might select a smaller data type, such as short instead of int, because it typically takes up fewer bits. Or you might determine a linked list's length by traversing its nodes instead of storing a node count, which saves space at the cost of extra time whenever the length is needed.
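Both ideas can be approximated in Python, as a rough sketch. The standard array module stores values using C types, where the type code "h" is a signed short and "i" is a signed int. The Node and LinkedList classes are hypothetical and store no node count, so the length is computed by traversal.

```python
from array import array

# Same values, different element types. Sizes are platform dependent;
# 2 bytes for "h" and 4 bytes for "i" are typical.
shorts = array("h", range(1000))
ints = array("i", range(1000))
print(shorts.itemsize, ints.itemsize)


class Node:
    __slots__ = ("value", "next")  # avoids a per-node attribute dictionary

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


class LinkedList:
    def __init__(self):
        self.head = None  # no stored node count

    def push(self, value):
        self.head = Node(value, self.head)

    def __len__(self):
        # Trade time for space: walk every node on each call.
        count = 0
        node = self.head
        while node is not None:
            count += 1
            node = node.next
        return count


items = LinkedList()
for v in (1, 2, 3):
    items.push(v)
print(len(items))  # 3
```

Whether the saved space is worth the slower length query depends on how often the length is needed.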