Over two years ago, I wrote about how neither the assurances of static compiler technology nor the ardent enthusiasm and discipline of TDD (and its offshoots) represent major headway against the difficulty and complexity of large software projects.
At the time, this issue came up in the context of static languages versus dynamic languages. The political dimension still exists, although today it is more transparently about different organizations and their views of the computing business. I will revisit that successor debate in my next post.
For now, however, I want to talk about a tool. In my post of two years ago, I suggested that significantly better analysis tools would be needed in order to make real progress, regardless of your opinion of languages.
So I've been excited to see the latest tools from Microsoft Research fast-tracking their way into product releases -- tools which can really move the ball downfield on software quality, testing, productivity, and economy.
The most significant of these is called Pex, short for Program Explorer. Pex is a tool that analyzes code and automatically creates test suites with high code coverage. By high coverage, I mean that it attempts to cover every branch in the code, and -- since throwing exceptions or failing assertions or contracts count as branches -- it will automatically attempt to determine all of the conditions that can trigger these occurrences.
Let me say that again: Pex will attempt to construct a set of (real, editable code) unit tests that cover every intentional logic flow in your methods, as well as any exceptions, assertion failures, or contract failures, even ones you did not write yourself (for example, any runtime-style error like a null pointer dereference or division by zero) or which seem like "impossible-to-break trivial sanity checks" (e.g. x = 1; AssertNotEqual(x, 0)).
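To make that concrete, here is a toy sketch in Python (Pex itself targets .NET, so this is only an analogy; the function and test names are all hypothetical) of the kind of test suite such a tool aims to produce: one concrete test per branch, including the implicit runtime-error branches the developer never wrote a check for.

```python
import unittest

# Hypothetical function under test: a Pex-like tool would explore its
# branches and emit one concrete test per distinct path, including the
# implicit runtime-error paths (a None argument, for instance).
def average_rate(total, count):
    if count == 0:
        raise ValueError("count must be non-zero")
    return total / count

class AverageRateGeneratedTests(unittest.TestCase):
    # Covers the explicit guard branch the developer wrote.
    def test_zero_count_raises(self):
        with self.assertRaises(ValueError):
            average_rate(10, 0)

    # Covers the normal path, with a concrete solver-derived input.
    def test_normal_path(self):
        self.assertEqual(average_rate(10, 2), 5)

    # Covers the implicit TypeError branch (None argument) that the
    # developer never coded a check for.
    def test_none_total_raises(self):
        with self.assertRaises(TypeError):
            average_rate(None, 2)
```

The point is that the third test is the interesting one: no human wrote a "what if total is None" check, yet the failure path exists in the code, so a coverage-driven tool will find it.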
Moreover, Pex does not randomly generate input values, blanket-cover input ranges, or search only for generic edge cases (e.g., MAX_INT). Instead, it takes a sophisticated theoretical approach called "abstract interpretation," coupled with an SMT (satisfiability modulo theories) constraint solver, to explore the space of code paths as it runs the code and derives new, significant inputs.
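A toy sketch of that exploration loop may help. The core idea is: run the code with a concrete input, record which way each branch went, then negate one branch decision and ask a solver for an input that follows the same path prefix but flips that decision. Everything below is illustrative, not Pex's actual implementation -- in particular, a brute-force search stands in for the real SMT solver, and the function under test is invented.

```python
# Toy illustration of path exploration by negating branch conditions.
# A real tool records symbolic constraints and hands them to an SMT
# solver; here a brute-force search over a small range stands in.

def function_under_test(x, trace):
    # Each branch records which way it went, so the driver can later
    # negate a decision and look for an input that flips it.
    if x > 100:                      # branch 1
        trace.append(("x > 100", True))
        if x * 2 == 246:             # branch 2: the "hard to hit" case
            trace.append(("x * 2 == 246", True))
            raise AssertionError("bug reached")
        trace.append(("x * 2 == 246", False))
    else:
        trace.append(("x > 100", False))

def run(x):
    trace = []
    try:
        function_under_test(x, trace)
    except AssertionError:
        pass  # the failure branch is itself a path worth covering
    return trace

def explore(start=0, search_space=range(-1000, 1000)):
    """Discover one concrete input per distinct execution path."""
    seen, worklist, inputs_found = set(), [start], []
    while worklist:
        x = worklist.pop()
        trace = run(x)
        path = tuple(trace)
        if path in seen:
            continue
        seen.add(path)
        inputs_found.append(x)
        # For each prefix of this path, negate the next decision and
        # "solve" for an input that takes the flipped branch.
        for i in range(len(trace)):
            cond, taken = trace[i]
            for candidate in search_space:
                t2 = run(candidate)
                if t2[:i] == trace[:i] and i < len(t2) and t2[i] == (cond, not taken):
                    worklist.append(candidate)
                    break
    return inputs_found
```

Starting from the arbitrary input 0, the loop discovers three distinct paths, including the input 123 that triggers the buried assertion failure -- a value random testing would be very unlikely to stumble on.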
In addition, as far as I can tell from the materials I've seen, Pex's runtime-based analysis (via IL) means that it should work as well on dynamic languages as on static ones.
To get an idea of how this might work, have a look at this article showing the use of Pex to analyze a method (in the .Net base class library) which takes an arbitrary stream as input.
For those of you who are inherently skeptical of anything Microsoft -- or anything that sounds like a "big tool" or "big process" from a "big company" -- I'll have more for you in my next article. But for now, keep in mind that if Microsoft can show that the approach works and is productive, user-friendly, and fun (it is), then we will certainly see similar open source tools. After all, it appears the same approach could work in any environment with a bytecode-based runtime.
Last, I do recognize that even if this tool works beyond our wildest expectations, it still has significant limitations, including:
- reaching its full potential requires clarity of business requirements in the code, which in turn requires human decision-making and input; and
- for reasons of implementation and sheer complexity, this tool operates at the module level, so you can't point it at one end of your giant SOA infrastructure, go home for the weekend, and expect it to produce a report of all the failure branches across your company.
That said, here are a couple more great links:
- "Contract Checking and Automated Test Generation with Pex" from PDC 2008
- "Pex: Automated White Box Testing for .NET" on MS DevLabs