My previous post on S4D omitted some of my notes from the conference. In particular, it left out the very entertaining and serious keynote by Barry Lock of Lauterbach, and some more philosophical observations on the nature of debugging.
Barry Lock gave a very entertaining keynote from his viewpoint as essentially the champion of physical hardware debug. Lauterbach is clearly focused on debugging real systems using hardware assists, with little involvement in high-level programming or virtual platforms. Barry has been working with computers for longer than I have lived, and has seen both the semiconductor and software sides of things.
His main message was that you have to take debuggability into account when buying chips for your embedded project. Saving a few cents by buying a chip with no or limited debug features will come back to haunt you, many times, in many nightmares. He has had grown men crying over the phone, asking for a miracle to save their projects after debugging had utterly failed for many months. He has seen startup companies go under, burning all their money chasing the last bug… and claimed that 75% of all product starts never make it to market, blaming debug problems for a large proportion of that.
The most important debug feature is trace – which fits the theme of this being the S4D of trace. After trace, you want hardware breakpoints. Apparently, you need at least two breakpoints to debug a system running an RTOS with virtual memory: one to keep an eye on the MMU, and one to actually debug code. More are better, but it is rare to see silicon vendors include many more breakpoints.
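To make the two-breakpoint point concrete, here is a sketch of what such a debug session might look like in GDB over a JTAG probe. The symbol names are hypothetical – real RTOSes name their MMU context-switch routine and tasks differently – and this is only an illustration of the idea, not any particular vendor's workflow:

```
(gdb) hbreak mmu_set_context   # hypothetical RTOS routine that reprograms the
                               # MMU on a task switch; the debugger needs to
                               # stop here to track which address space is live
(gdb) hbreak app_task_entry    # the code you actually want to debug, which only
                               # exists in one task's virtual address space
(gdb) info breakpoints         # both slots now used – with a single hardware
                               # breakpoint you would have to pick one or the other
```

The reason the MMU breakpoint matters is that a virtual address is only meaningful while the right page tables are active; without a stop on the context switch, the debugger cannot reliably tell which task's memory it is looking at.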
Barry gave a number of examples of projects that had failed because the right hardware was not bought. He put the blame both on buyers chasing a few cents of cost in the end product, and on the poor quality of silicon salespeople, who took the price route rather than the quality route when selling chips.
He also noted that there seemed to be a positive correlation between being an industry leader and buying debuggable hardware. Companies like Bosch, Ericsson, and Nokia always spend the extra money to get hardware that can be debugged, and have the results to show for it.
Philosophy of Debug
During our two panel discussions on debug, there were two ways to look at debugging that stood out from the crowd.
The first was the observation that debugging today is very much a craft. When things go really bad, you go to the proven expert. Debugging is a craft you learn by apprenticeship with a master, and master debuggers are incredibly valuable to their organizations. This reliance on masters indicates that general programming education to a large extent overlooks debugging as a crucial skill for programmers. It also means that, in the words of one member of the audience, debug cannot scale. As problems become more complex, we still rely on single individuals, which limits our ability to tackle them.
The second observation was to liken debugging to the diagnosis of human diseases. As systems become more complex, their behavior gets so complicated and rich that it is hard to even precisely identify what a bug is. A simple crash or illegal operation is clear-cut – but what about when the results of a program are just a bit off? When control loops don’t quite do the right thing, but almost? When the quality of the picture on a television just feels wrong? In those cases, we might be looking at composite measurements of many different parameters and factors in a system, and making a diagnosis of error based on the whole picture rather than each factor in isolation.
Based on these observations, I can envision a somewhat weird future where we train computer doctors (as in medical doctors, not PhDs) to diagnose computer problems based on holistic, systematic approaches. Such education could be separate from the training of programmers and testers, as their specialty would be the diagnosis of system outputs against expected outcomes at a high level, rather than the details of code.