Thursday, November 20, 2008

Microsoft Pex Moves the Needle Bigtime on Software Testing and Correctness

Over two years ago, I wrote about how neither the assurances of static compiler technology nor the ardent enthusiasm and discipline of TDD (and its offshoots) represent major headway against the difficulty and complexity of large software projects.

At the time, this issue came up in the context of static languages versus dynamic languages. There still exists a political issue, although today it is more transparently about different organizations and their view of the computing business. I will revisit the successor debate in my next post.

For now, however, I want to talk about a tool. In my post of two years ago, I suggested that significantly better analysis tools would be needed in order to make real progress, regardless of your opinion languages.

So I've been excited to see the latest tools from Microsoft Research fast-tracking their way into product releases -- tools which can really move the ball downfield as far as software quality, testing, productivity, and economy.

The most significant of these is called Pex, short for Program Explorer. Pex is a tool that analyses code and automatically creates test suites with high code coverage. By high coverage, it attempts to cover every branch in the code and -- since throwing exceptions or failing assertions or contracts count as branches -- it will automatically attempt to determine all of the conditions which can generate these occurrences.

Let me say that again: Pex will attempt to construct a set of (real, code, editable) unit tests that cover every intentional logic flow in your methods, as well as any exceptions, assertion failures, or contract failures, even ones you did not code in yourself (for example, any runtime-style error like Null Pointer or Division by Zero) or which seem like "impossible-to-break trivial sanity checks" (e.g. x = 1; AssertNotEqual(x, 0))

Moreover, Pex does not randomly generate input values, cover all input ranges, or search just for generic edge cases (e.g., MAX_INT). Instead, it takes a complex theoretical approach called "abstract interpretation" coupled with a SMT (satisfiability modulo theories) constraint solver to explore the space of code routes as it runs the code and derives new, significant inputs.

In addition, so far as I can understand from the materials I've seen, Pex's runtime-based (via IL) analysis means that it should work equally well on dynamic languages as on static ones.

To get an idea of how this might work, have a look at this article showing the use of Pex to analyze a method (in the .Net base class library) which takes an arbitrary stream as input.

For those of you who are inherently skeptical of anything Microsoft -- or anything that sounds like a "big tool" or "big process" from a "big company" -- I'll have more for you in my next article. But for now keep in mind that if Microsoft can show that the approach can work and is productive, user friendly, and fun (it is), then certainly we will see similar open source tools. After all, it appears the same exact approach could work for any environment with a bytecode-based runtime.

Last, I do recognize that even if this tool works beyond our wildest expectations, it still has significant limitations including

  1. reaching its full potential requires clarity of business requirements in the code, which in turn require human decision making and input and
  2. for reasons of implementation and sheer complexity this tool operates at the module level, so you can't point it at one end of your giant SOA infrastructure, go home for the weekend, and expect it to produce a report of all the failure branches all over your company.

That said, here are a couple more great links:

7 comments:

Jamie said...

Darn, now there's one less metric I can use when looking at an unfamiliar codebase to decide if I want to run screaming rather than work on it. :)

Seriously, though, I assume that it'll still be pretty obvious that lazy individuals just mass-enabled "allow exception" and pumped up their coverage to 100%. Otherwise we're all doomed, unless we become "test-test-infected" and learn to write tests for the tests, to stay one step ahead of the code manglers.

Taumuon said...

Hi,

I've written a blog post on using Pex to write some theories (more general unit tests) that you might find interesting, as you were talking about Pex:

http://taumuon-jabuka.blogspot.com/2009/01/theory-driven-development-using_11.html

Cheers,

Gary

petrenkov said...

It was rather interesting for me to read that post. Thanks for it. I like such topics and anything that is connected to them. I definitely want to read more on that blog soon.

Sincerely yours
Steave Markson

pex said...

Agree with Jamie for pumped up their coverage to 100%

Sell RS Gold said...

I just stumbled on a person's clause and also have ended up reading material on. I must express the wonderment of this composition accomplishment and power to make audience study right from the start to your conclusion. I would really like to read more sophisticated discussions also to plowshare the thoughts on hand.

RS Gold
Runescape Gold

WORLD OF WARCRAFT said...

If you might get yourself in a great guild or group up with some increased degree avid gamers in team quests, you can typically get Cheap WOW Gold Eugear that you simply couldn’t get in your own.Though the most beneficial way by which the elite avid gamers accomplish accomplishment at Buying WOW Gold.

Anonymous said...

To get this sort of critters you undoubtedly dog collar require being urged to assist you to certify that possibly enjoying all the presents and they are near practical fitness once review. If you've gained over one puppy, subsequent the rrs really a must as you would like make sure that the dog carrier kitties may be within the perfect luxury combined with physical health whereas you have been off. Despite the presence of individuals devoid of everyone all round, certify they are being satisfied and content such as www.lovelonglong.com you are going to have them till you go back.