Tuesday, June 8, 2010
Genius or Mad Man?
I presented an ambitious proposal to a couple of engineers more experienced in the space, and whilst everything they said and asked about was measured and reasoned, the look on their faces suggested they were wondering whether I was sane. On reflection, it is a look I have likely given others many times in the past, whenever I was thinking "if you manage to achieve what you propose, I will be suitably impressed by your genius. If not, I will merely be right."
Sunday, April 25, 2010
Unit Tests
Early on in my career I used to think that there existed code that was "too complicated" to unit test, especially code doing multithreading. A few years' experience on this type of code base taught me that "too complicated to unit test" is just another way of saying "unmaintainable". What I really needed to do was change the idioms I was using so I could write something unit testable [1].
The flip side of that mistake was that I argued that since we couldn't unit test the most complicated parts, we shouldn't unit test the simple code either. The mistake here was the assumption that bugs were much more likely to appear in the complex code than in the simple code. In reality, because the complex code was complex, it was subject to a lot more ad hoc and integration testing than the simple code, which just "obviously" worked. The key thing is that most bugs turn out to be trivial (e.g. cutting and pasting a loop and not changing the loop variables in all cases), but code is so dense with information that it is often very hard to spot these mistakes by eye. With their path untested, they all too easily make it to production and cause some behavioral bug in some edge case; and after days go by before it is observed, and you have spent hours tracking it down to a one-line bug, you will wonder how the entire software world holds itself together.
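As a hypothetical sketch of how such a cut-and-paste bug reads as perfectly plausible by eye yet fails in practice (the bug is the point here, and is marked):
// Sums each row of a matrix. Correct.
static int[] rowSums(int[][] m) {
    int[] sums = new int[m.length];
    for (int i = 0; i < m.length; i++) {
        for (int j = 0; j < m[i].length; j++) {
            sums[i] += m[i][j];
        }
    }
    return sums;
}

// Cut and pasted from rowSums and "adapted", but one index swap was
// missed, so for a square matrix this silently returns row sums.
static int[] colSums(int[][] m) {
    int[] sums = new int[m[0].length];
    for (int j = 0; j < m[0].length; j++) {
        for (int i = 0; i < m.length; i++) {
            sums[j] += m[j][i]; // bug: should be m[i][j]
        }
    }
    return sums;
}
A unit test using any non-symmetric matrix catches this in seconds; spotting the transposed indices by eye is much harder.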
That being said, you have to be careful with unit tests. They can add a lot of near-deadweight code to your code base, making trivial changes to functionality take several hours just in refactoring unit tests. Modern IDEs with refactoring tools can help a lot here [2], but some of these problems cannot be fixed by refactoring.
The most important thing is not to treat test code as a second-class citizen. In terms of quality it must be treated the same as production code. So, for example, since we all know that copying and pasting a code block 10 times is bad in production code, it is just as bad, for all the same reasons, in a test, or across a set of tests. What this means in practice is that you end up with lots of little methods to construct helper objects to put into tests, e.g. replacing N replicas of this [3]:
MyObjectUnderTest obj = new MyObjectUnderTest();
obj.setFoo("foo");
obj.setBar("bar");
obj.setBaz("baz");
doTest(obj);
With N of this:
doTest(createObjectUnderTest("foo", "bar", "baz"));
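The helper itself is nothing more than the original block with the literals pulled out as parameters - a sketch, assuming the setters shown above:
static MyObjectUnderTest createObjectUnderTest(String foo, String bar, String baz) {
    // One place to change when construction details change.
    MyObjectUnderTest obj = new MyObjectUnderTest();
    obj.setFoo(foo);
    obj.setBar(bar);
    obj.setBaz(baz);
    return obj;
}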
A second important aspect is to test just one object at a time - don't couple objects under test. Testing two (or more) at once often seems like a time saver ("I'm testing three objects with one test, yay productivity!"), but as soon as you want to refactor one of the objects (e.g. to use it in another place) you will regret your choice, as you have to understand the complex interactions the test was testing, and then rewrite them. Mocking frameworks and judicious use of abstract classes/interfaces can be your friends here, as sketched below.
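A minimal sketch of the interface approach (all names here are hypothetical): the object under test depends on an interface, so the test can substitute a trivial fake instead of dragging in the real collaborator:
// The collaborator is hidden behind an interface.
interface PriceSource {
    int priceOf(String item);
}

// The object under test never names the real implementation.
class Basket {
    private final PriceSource prices;
    Basket(PriceSource prices) { this.prices = prices; }
    int total(String... items) {
        int sum = 0;
        for (String item : items) {
            sum += prices.priceOf(item);
        }
        return sum;
    }
}

// The test drives Basket alone, with a one-off fake standing in for
// the real PriceSource.
void testTotal() {
    Basket basket = new Basket(new PriceSource() {
        public int priceOf(String item) { return 7; }
    });
    assert basket.total("a", "b", "c") == 21;
}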
As to when to write the tests, TDD is fine if it works for you. I find it most useful when trying to write classes where I want to get the API right. But I am generally writing applications as opposed to libraries, so any individual class's interface is not that important, and I use it rarely.
One thing I do force myself to do is write code in units, and write the test straight after each unit. The thing I noticed is that if I delay writing unit tests until all the units are working together end to end, then, because the system "already works", my subconscious enthusiasm for writing unit tests falls markedly, and their quality and coverage fall likewise. Whereas if I write the unit tests just after each unit, it's part of "getting everything to work", and so I am willing to put the effort into doing it well.
In the end though, I have found the greatest value of unit tests is that they give me direct feedback as to just how high my error rate is in writing code. I have lost count of how many times I have written the code, written the test, thought "this is sure to work", and in fact had several quite serious bugs that needed to be fixed. It is forever a humbling experience, but humility is a good value to have when working on large, complex software systems in a corporate "we need it yesterday" environment.
[1] In particular, I was writing multithreaded code on the Win32 API, trying to get by with semaphores, events, and interlocked operations for cross-thread synchronization. All very difficult to unit test. When I finally switched to POSIX condition variables, everything was much easier. The other thing I learned here was that, as much as possible, objects should not own threads - rather, they should be driven by threads. Thus you just need to unit test the state transitions of method calls (a sketch follows). Note that this still does not check for race conditions and deadlocks - for those, the only solution I have found is good up-front design with well-defined semantics.
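A sketch of the "driven by threads" idiom, with hypothetical names: the object is a passive state machine whose methods are called from outside, so production can drive it from a worker thread while a test drives it single-threaded:
// Owns no thread; whoever calls its methods supplies the thread.
class Connection {
    enum State { IDLE, OPEN, CLOSED }
    private State state = State.IDLE;

    synchronized State state() { return state; }

    synchronized void open() {
        if (state != State.IDLE) throw new IllegalStateException("open from " + state);
        state = State.OPEN;
    }

    synchronized void close() {
        if (state != State.OPEN) throw new IllegalStateException("close from " + state);
        state = State.CLOSED;
    }
}

// A test needs no threads at all:
//   Connection c = new Connection();
//   c.open();
//   assert c.state() == Connection.State.OPEN;
//   c.close();
//   assert c.state() == Connection.State.CLOSED;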
[2] It's one of the many reasons that dinosaurs still using Emacs for writing Java are doing a bad job - they are either not writing enough unit tests or not doing enough refactoring.
[3] Of course a better language with named parameters would make this all less monotonous. But like most people in the real world, I am stuck with Java or C++ on any project large enough to need type safety.
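(Within Java, the closest approximation is probably a builder, which at least names each value at the call site - a hypothetical sketch, not part of the original class:)
class MyObjectUnderTestBuilder {
    private String foo, bar, baz;
    MyObjectUnderTestBuilder foo(String v) { foo = v; return this; }
    MyObjectUnderTestBuilder bar(String v) { bar = v; return this; }
    MyObjectUnderTestBuilder baz(String v) { baz = v; return this; }
    MyObjectUnderTest build() {
        MyObjectUnderTest obj = new MyObjectUnderTest();
        obj.setFoo(foo);
        obj.setBar(bar);
        obj.setBaz(baz);
        return obj;
    }
}

// The call site then reads almost like named parameters:
//   doTest(new MyObjectUnderTestBuilder().foo("foo").bar("bar").baz("baz").build());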
Friday, April 23, 2010
Pruning
One underappreciated skill that you need as a software engineer is the ability to quickly and accurately prune your solution space. For any problem you will have a range of possible options for solving it, and most of them will be bad. There are three tricks I have found useful for doing this successfully:
- In the initial evaluation step, knowing and using the right heuristics to choose between the options. The ideal heuristic is low cost, in the sense that you don't spend too long evaluating each option, but also accurate, in that it picks the option most likely to succeed.
- Once an option is chosen and implementation is underway, continually and critically evaluating the new information gained about the option's viability, and intuiting when the downsides are getting too high relative to the next alternative.
- Once that point is reached, being willing to put aside what you have done and try the next alternative. This is often hard because a) it seems like you have wasted your time (i.e. the sunk cost fallacy), and b) it is sometimes intellectually dissatisfying to put aside a problem unsolved.
Monday, April 5, 2010
Crap All The Way Down
You have a project with requirements W, X, Y, and Z. You find an existing bit of software, be it a tool, a library, or even source code, that looks like it will do a lot of what you want. This is how it proceeds:
Start: "Wow this bit of software looks good, it should do exactly what I want."
Proceeds To: "Hmm, it handles these edge cases a little bit funny, let me investigate more."
Proceeds To: "What a pile of crap, why did they handle such important cases in such a shitty, half assed way. This will never work. I'm gonna look at something else."
Proceeds To: "I can't believe it. All the other options are just as crappy. I'm gonna start from scratch, how hard can it be?"
Proceeds To: "Wow, this underlying bit of software is just an incomprehensible ball of crap, this would take years."
Proceeds To: "Oh well, if I just cut out requirements W and X, and just do crappy half way implementations for Y and Z, then at least I will have something."
Then someone else comes along, and chooses to use your software for their project.
Start: "Wow this bit of software looks good, it should do exactly what I want."
Proceeds To: "Hmm, it handles these edge cases a little bit funny, let me investigate more."
Proceeds To: "What a pile of crap, why did they handle such important cases in such a shitty, half assed way. This will never work. I'm gonna look at something else."
Proceeds To: "I can't believe it. All the other options are just as crappy. I'm gonna start from scratch, how hard can it be?"
Proceeds To: "Wow, this underlying bit of software is just an incomprehensible ball of crap, this would take years."
Proceeds To: "Oh well, if I just cut out requirements W and X, and just do crappy half way implementations for Y and Z, then at least I will have something."
Then someone else comes along, and chooses to use your software for their project.
Saturday, October 17, 2009
Marginal Thinking
One of the most powerful ideas I have taken from economics is thinking at the margin. To quote one of Greg Mankiw's principles of economics, "Rational People Think at the Margin". I think this quote is ironically mischievous, as Mankiw occasionally tends to be, and to some extent the mischievousness undermines the profundity of the message. Mankiw is not backing a horse in the old debate of whether "people are rational" or "markets are rational" - the "rational thinking" applies to the analyst, not the actors. So what he is basically saying is: "In analyzing a complex system of actors or events in light of possible changes, a rational analyst thinks in terms of how actors/events at the margins are affected by the changes, rather than trying to reason about the effect on all of them, or even an average one."
A good example is if you ask someone, "Do mandatory seat belts cause fewer fatal road accidents?" The immediate common response is "of course, since fatal accidents can become merely injurious". This is true but incomplete; the follow-up question is, "Do you suppose that, by engendering a feeling of safety, seat belts could cause more dangerous driving, and so more accidents?" Here people think about their own driving, and whether it is influenced by their wearing of a seat belt, conclude it isn't, and so answer "No". But it is here that their thinking is less than rational. Yes, most people are good, safe, average drivers in their normal range of experience, and a fairly small accident rate across the population reflects this. But it is the margin where things matter: are there marginal drivers who might feel a little safer with their seat belt on, and so might drive a little closer to the edge? It seems the answer must be yes. So on balance, are mandatory seat belts good or bad? Analysis of actual before/after measurements where such laws have been enacted shows mixed results.
This is a fairly unintuitive result, and economists like Steven Landsburg and Steven Levitt have managed to fill books by applying similar reasoning to a range of similar systems, getting similarly unintuitive outcomes. The important point is that they follow their reasoning with actual experiments to test it, and it is this that carries the weight of the argument and allows it to win over the default, naive, intuitive analysis.
Which brings me back to the "ironically mischievous" aspect of Mankiw's principle: since results from thinking at the margin seldom match people's intuition, they are extremely difficult to convince people of over "average case" thinking. Or in other words, most people are just not rational.
Friday, July 3, 2009
Lessons of Waterfall
No thinking person did Waterfall the way it was described in the literature written by the "Big Process" gatekeepers. Rather, they did what was appropriate to the problem at hand and then did the minimal amount of Waterfall waggle dancing needed to keep the gatekeepers off their back. Unfortunately this had the side effect of making both the gatekeepers and the management hierarchy think that their process was working, and the whole Big Process industry managed to perpetuate itself for well over 20 years. Until Extreme Programming came along, made a lot of noise about being different, and then, through its "one right way" dogma, managed to continue the charade, just with a different set of gatekeepers cashing in.
Thursday, July 2, 2009
TDD as Future Self Paternalism
I have a lot of trouble stopping myself snacking when I am concentrating on a problem. In the moment when my brain is distracted I lack the mental awareness to control my behavior. I can be back and forth to the cupboard and halfway through a pack of chips before it even hits me what I'm doing. For a while, I struggled with this and struggled with my weight, but eventually I hit upon an ingeniously simple solution: just stop buying snacks. Then, in the moment of thought, they won't be around to be eaten.
I am not a Test Driven Development (TDD) dogmatist: my coverage only averages about 60%, and I rarely write tests before code. However, I do force myself to write code in units, and to write the test straight after each unit. The thing I noticed is that if I delay writing unit tests until all the units are working together end to end, then, because the system "already works", my subconscious enthusiasm for writing unit tests falls markedly, and their quality and coverage fall likewise. Whereas if I write the unit tests just after each unit, it's part of "getting everything to work", and so I am willing to put the effort into doing it well.