"Measuring programming progress by lines of code is like measuring aircraft building progress by weight." - Bill Gates
The other day, someone on my project team proposed ranking developers by their lines-of-code committed in Subversion. I really hope the suggestion was meant to be tongue-in-cheek. But lest anyone take LOC measures too seriously, it's worth pointing out that LOC is a bad, bad metric of productivity, and only gets worse when one tries to apply it across different application areas, developer roles, etc.
Here are a few reasons why LOC is not a good measure of development output:
- LOC metrics, sooner or later, intentionally or unintentionally, encourage people to game the system with padded lines, gratuitous check-ins, etc. That alone is reason enough not to measure per-developer LOC, and it is actually the least serious of the problems.
- LOC cannot say anything about the quality of the code. It cannot distinguish between an overly complex, bad solution to a simple problem and a long, complex, but necessary solution to a hard problem. And yet it "rewards" poorly-thought-out, copy-paste code, which is not a desirable trait in a metric.
- In software development, we want nicely factored, elegant solutions -- in a perfect world, the least-LOC solution to a problem that still meets requirements such as readability. So a metric that rewards the opposite -- maximum LOC for each task -- is counterproductive. And since high LOC doesn't necessarily mean bad code, there isn't even a negative correlation to be extracted from the measurement.
- In general, spending time figuring out the right way to do something, as opposed to hacking and hacking and hacking, lowers your LOC per unit time. And if you do succeed in finding a nice compact solution, then it lowers your gross LOC overall.
- Even in the same application and programming environment, some tasks lend themselves to much higher LOC counts than others, because of the level of the APIs available. For example, in a Java application with some fancy graphics and a relational persistence store, Java2D UI code probably requires more statements than persistence code leveraging EJB 3 (built on Hibernate), simply by the nature of the respective APIs. Persistence code using straight JDBC and SQL strings will require more lines than the EJB 3 code, even though straight JDBC is most likely the "wrong" choice for such an application for all sorts of well-known reasons (see the sketch after this list).
- In the same application and environment, not every LOC is equal in terms of business value: there is core code, high-value edge-case code, and low-value edge-case code. Pretending that every line of every feature is worth the same disregards the business reality of software.
- You may have read that the average developer on Windows Vista at Microsoft produced very few lines of code per day (from 50 down to about 5, depending on whom you read). Is Microsoft full of lazy, clueless coders? Is that why the schedule slipped? I doubt it. There were management issues, but Microsoft also worked extremely hard to make certain aspects of Vista secure. Do security and reliability come from adding lines of code? Unlikely -- in fact, industry data suggest the opposite: more code means more errors and more vulnerabilities.
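To make the API-level point above a little more concrete, here is a hedged sketch of the same persistence task written both ways, using a hypothetical Customer table and entity that are not from the original post (in a real project each class would live in its own file). The point is only that the straight-JDBC version spends noticeably more lines to deliver exactly the same business value:

```java
// Hypothetical Customer persistence, written twice against the same table.
// The class, table name, and columns are illustrative only.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// --- Straight JDBC: SQL, parameter binding, and resource handling by hand ---
class JdbcCustomerDao {
    private final Connection connection;

    JdbcCustomerDao(Connection connection) {
        this.connection = connection;
    }

    void save(long id, String name, String email) throws SQLException {
        String sql = "INSERT INTO customer (id, name, email) VALUES (?, ?, ?)";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setLong(1, id);
            ps.setString(2, name);
            ps.setString(3, email);
            ps.executeUpdate();
        }
    }
}

// --- EJB 3 / JPA style: the mapping is declarative; saving is one call ---
@javax.persistence.Entity
class Customer {
    @javax.persistence.Id
    private long id;
    private String name;
    private String email;
    // constructors, getters, and setters omitted for brevity
}

// Elsewhere, inside a transaction: entityManager.persist(customer);
```

Neither developer is more productive than the other here; the line counts differ because the APIs sit at different levels.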
But no one on our team would ever write bad code, right? So we don’t need to worry about those issues.
Not so fast… Even developers writing “good” code make roughly the same number of errors per line, on average (a near-constant for a given developer). So if I write twice as many lines of code, I create twice as many bugs. Will I find them soon? Will I fix them properly? Will they be very expensive sometime down the line? Who pays for this? Is it worth it? Complex questions, and never an easy “yes.” Or, as Jeff Atwood puts it, “The Best Code is No Code At All.”
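If that sounds abstract, here is a minimal sketch of the arithmetic, assuming a hypothetical per-developer defect rate; the specific numbers are made up for illustration:

```java
// Back-of-the-envelope arithmetic for "twice the lines, twice the bugs".
// The defect rate below is an assumed per-developer constant, not a measured figure.
class DefectScaling {
    public static void main(String[] args) {
        double defectsPerKloc = 15.0;    // assumed roughly constant for one developer
        int compactSolutionLoc = 400;    // tight, well-factored implementation
        int verboseSolutionLoc = 800;    // twice the code for the same feature

        System.out.printf("Compact solution: ~%.0f expected defects%n",
                compactSolutionLoc / 1000.0 * defectsPerKloc);  // ~6
        System.out.printf("Verbose solution: ~%.0f expected defects%n",
                verboseSolutionLoc / 1000.0 * defectsPerKloc);  // ~12
    }
}
```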
And beautiful, elegant, delightful code is expensive to write, because it requires thought and testing. The profitability of my firm depends on delivering software while controlling costs at the same time. We don’t fly first class and we don’t use $6,000 development rigs, even though they might offer some benefit to the client or customer. And we don’t write arbitrary volumes of arbitrarily sophisticated code if we can help it.
OK, so why is the LOC metric still around? If it’s such a bad idea, surely it would be gone by now!
Here’s why: while LOC is a poor measure of developer output, it’s easy to use, and it’s a (primitive but functional) measure of the overall complexity and cost of a system. When all the code in a large system is averaged together, one can establish a number of lines per feature, a number of bugs per line, a number of lines per developer-dollar, and a cost to maintain each line for a long, long time into the future.
These metrics can be valuable when applied to estimating similar efforts in the same organization under similar constraints. So they’re worth collecting in aggregate for that reason.
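As a hedged illustration of that aggregate use, a back-of-the-envelope calculation might look like the sketch below; every number in it is a hypothetical placeholder, not data from any real project:

```java
// Hedged sketch of the aggregate, whole-system use of LOC.
// All figures are hypothetical placeholders for illustration only.
class SystemLocStats {
    public static void main(String[] args) {
        double totalLoc = 250_000;              // whole system
        double featureCount = 180;
        double defectsFound = 3_100;
        double developmentCost = 4_500_000;     // dollars
        double maintenanceCostPerLinePerYear = 0.75;

        System.out.printf("Lines per feature:          %.0f%n", totalLoc / featureCount);
        System.out.printf("Defects per KLOC:           %.1f%n", defectsFound / (totalLoc / 1000.0));
        System.out.printf("Lines per developer-dollar: %.3f%n", totalLoc / developmentCost);
        System.out.printf("Yearly maintenance cost:    $%.0f%n", totalLoc * maintenanceCostPerLinePerYear);
    }
}
```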
But for comparing individual productivity? I don’t think so.