04 January 2017

Experimental Computer Science

I studied computer science in college, and was struck by the lack of science in the discipline.  Computer science as a field is fundamentally applied mathematics in the style of theoretical physics.  Software engineering, the other side of the coin, is as superstitious as theoretical computer science is formal.  Given the long time periods and large budgets required to construct large software projects, it is little surprise that software engineering is still largely imitative in character ("Well, Google did it this way...").  We cannot afford to conduct worthwhile experiments, and the art suffers as a result.

A senior colleague at my first internship was so kind as to reveal to me the mundane nature of experimental computer science, however.  I had encountered a bug in my code, and was frustrated.  He came over, sat down on his trademark inflatable exercise ball, asked me what my hypothesis was, and started bouncing beatifically.  And so I learned that lowly debugging was the experimental computer science that I had long sought.  You consider your known facts.  You formulate a set of hypotheses about what might have happened consistent with those facts.  You find a way to test your hypotheses and gather more facts.  Repeat until phenomenon is explained, mystery solved.
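To make that loop concrete, here is a minimal sketch in Python; the parse_record function and both hypotheses are hypothetical, invented purely for illustration:

    # Hypothetical suspect: a parser that splits "key=value;key=value" lines.
    def parse_record(line):
        return dict(pair.split("=") for pair in line.split(";"))

    # Known fact: crash reports mention a ValueError on some inputs.
    # Hypothesis 1: an empty field between semicolons breaks the split.
    try:
        parse_record("a=1;;b=2")
        print("Hypothesis 1 ruled out")
    except ValueError:
        print("Hypothesis 1 confirmed: empty fields are not handled")

    # Hypothesis 2: a '=' inside a value breaks the split.
    try:
        parse_record("a=1;b=x=y")
        print("Hypothesis 2 ruled out")
    except ValueError:
        print("Hypothesis 2 confirmed: '=' inside a value is not handled")

Each hypothesis is nothing more than a reproducible check that either confirms or rules out a cause; the facts it produces feed the next round of guesses.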

Engineering builds artifacts using facts; experimental science builds facts using artifacts.  Debugging is most certainly in the latter category.

In the years since, debugging has become probably my favorite part of my job, and in the style of Lakoff and Johnson's Metaphors We Live By, I've picked up a couple more perspectives on it.

The professor of my operating systems class once said: "Debugging is telling two stories.  One is the story of what you wanted the computer to do, and the other is the story of what the computer did.  Where they diverge, there is your bug."  This narrative view is a very temporal way to think about debugging, well-suited to stepping through code in a debugger.
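The debugger is where that divergence becomes visible.  Here is a minimal sketch using Python's built-in pdb; the running_total function and its off-by-one are hypothetical, chosen only to show the two stories diverging:

    import pdb

    def running_total(values):
        # Intended story: add up every element.
        # Actual story: range(1, ...) silently skips values[0].
        total = 0
        for i in range(1, len(values)):
            total += values[i]
        return total

    pdb.set_trace()                    # step through and watch where the stories part ways
    print(running_total([3, 4, 5]))    # intended: 12; actually prints 9

Stepping line by line, the two narratives agree right up until the loop's first iteration, and that is where the bug lives.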

A third view that I have used while debugging distributed systems is that of the police procedural / forensics.  A symptom of a "crime" appears: an invariant violated.  Careful notes are taken on the evidence: places and times, the frequency if repeated or irregular, any commonalities between multiple events.  A list of "suspect" components is drawn up.  "Means, motive, and opportunity" sort of still holds: components with the permissions and logic to do something like the crime, as well as components which are historically known to be buggy, or which have been changed in a recent commit.  Then you investigate the suspects, entertaining the possibility that they acted alone or in conspiracy with each other.  Fundamentally this differs from the scientific approach in two respects: chunking (suspicion attaches to whole components rather than individual lines of code) and anthropomorphization.  Anthropomorphization is a dangerous falsehood to permit oneself, but it works very well for me, perhaps because it lets me leverage some measure of social intelligence in an otherwise asocial situation.  I have had some great successes with this method, in several cases correctly calling complex sequences of race conditions in a cluster within single-digit minutes of receiving a bug report, without looking at any code.

So, three faces of debugging:
  • Science
  • Storytelling
  • Policework
There are, to be sure, more such lenses to be found and played with.  I look forward to it.
