Tuesday, December 30, 2008

On the Wide Range of Ruby Talent: Rails as a 4GL(?)

For a long time, I was puzzled by the extremely wide range of skill among Ruby talent: more than with any other language I could think of, "Ruby Programmers" seemed to range from the truly clueless to hardcore accomplished engineering veterans.

I had more or less chalked this observation up to the language and APIs themselves -- easy to get running with, highly intuitive ... and yet packing deep and powerful features that would attract the more sophisticated (easy C integration, metaprogramming, continuations, etc.)

On top of this was Rails with its famous screencasts showing database-backed websites automagically constructing themselves, a feat that got novices very excited, and reminded experts that they should be asking harder questions of any other framework they were using for the same kind of app.

But lately I've come up with another -- or perhaps stronger variant -- hypothesis.

Rails itself aspires to be a 4GL -- a DSL together with a framework, specializing in database-backed, RESTful web application development.

It appears that some programmers see (or want to see) just the 4GL parts (the DSL, the declarative bits, the "conventions" and the "magic") while others see a 4GL in close symbiosis with a 3[+]GL (Ruby) for doing procedural work.

In some sense, both groups are seeing accurately -- for certain problems. That is, apps that truly hit the "sweet spot" for which Rails was designed, and which do nothing else, can be treated as compositions of 4GL bits much of the time. Highest productivity, get your check, go home, watch a movie.

For other problems, additional logic and/or integration is required.

And here's where the two groups of programmers part company. The pure-4GL-ers want to search high and low for a plug-in, gem, or other black-box component to solve the problem. Even if the black-box component is harder to use, poorly documented or maintained, does not include a test suite verifying it does what it claims, this group of coders wants the plug-in. They'll search and argue and try different ones even if the functionality is 15 lines of Ruby code. But they won't write the functionality themselves.

The other group just writes the code. Perhaps they even bundle it into a plug-in and throw it on github. They also do things like submit patches to Rails itself.

Depending on the situation, either approach might be fine ... but: when something doesn't go right, in terms of logic or process or performance or data integrity or something else ... the first group hits the wall. All of a sudden it's a "Big Problem" ... whether or not it's really a big problem. That is the Achilles heel of all 4GLs: if you view them as "closed" systems that require pluggable components then anything outside their bounds becomes a "Big Problem."

And lately I've watched as some participants in the Rails community systematically leap tall buildings in a single bound when they encounter them, while others literally sit on their hands waiting for a plug-in to appear (with no track record or test cases), which they then gratefully throw into their app and hope for the best.

Monday, December 29, 2008

Fear of the Driver Disk

The problem of bloatware/crapware on retail PCs is well known -- to the extent that Apple makes fun of it, pointing out the absence of such software on new Macs, while PC tools exist just to clean it.

But bloatware has a less-famous, equally annoying sibling: all the garbage that brand-name hardware devices install off their driver or utility disk.

Pick up a peripheral -- printer, web cam, DVD drive -- from a major brand, and if you follow the automagical installer on the driver disk, you'll get a half-dozen apps that you may not need or like. In some cases, they're just bad apps (that have a habit of arranging to start at boot), while in other cases they can destabilize a perfectly-running system.

The problems are that

  1. in some cases you do need these apps, because some hardware features require "support software" to be present, and don't fully leverage the many built-in drivers and control panels available for free in Windows ...
  2. most hardware companies internally view the driver/utility software as an afterthought, writing it hastily, testing it inadequately, and staffing it with ... well ... whomever they can find.

There are two main remedies.

In many cases, getting an unbranded or "strange-branded" device is a smart idea (provided you know what you're getting). I've found these devices have straightforward, minimalist support apps, make great use of built-in Windows drivers, and don't put any garbage on your system -- for the simple reason that their makers don't have the resources to write a bunch of half-baked apps, or to form "distribution partnerships" with people who do.

If I do have a brand-name product, I generally attempt to install it without its own driver disk, no matter what the instructions say. In many cases, the device is fully supported by Windows out of the box; in other cases, some features may not be available -- but I may not need them. (E.g., if I wanted to use my digital camera as a webcam, that would require the vendor driver disk ... but I have never wanted to use that feature of the device.)

And if that latter approach fails, it's pretty easy to uninstall the device or otherwise convince the PC it's never seen the device before -- so that you can go the RTFM route and use the supplied disk.

Tuesday, December 23, 2008

Stax Brings more Standard App Models to the Cloud, Marginalizes Custom Platform Further

Stax, which recently launched in private beta, is a cloud hosting/deployment/ops platform based on Java appservers. The coolest thing about Stax is that it offers many flavors of JavaEE-deployable applications, including Python apps (via Jython) and Rails (via JRuby) with ready-to-roll built-in templates.

Stax has a very AppEngine-y feel, not just on the website, but in terms of the SDK interactions, local development, etc.

This is good news for all of the popular platforms ... and bad news for those rattling around the corners with non-standard APIs. As the app-hosting industry continues to mature, the emphasis will clearly be on established players like Rails, ASP.Net, JavaEE, Pylons, et al. at the expense of guys like AppJet.

It's not about the language (JavaScript) but about learning a set of APIs, patterns/practices, and sustaining a community ... based on a roll-your-own platform.

It is true that some of these built-for-the-cloud platforms were designed from the start to default to hash-style or big-table style storage -- popular for content-oriented cloud apps because of its easy horizontal scaling -- where the "traditional" platforms focus on relational stores and have a variety of bolt-on APIs for cloud DBs.

But now that there are so many standard alternatives, it is unlikely developers will pay any custom-platform-tax no matter how elegant that platform might be.

Thursday, December 18, 2008

We May Look Back Fondly on Our 68% Project Failure Rate

The report [PDF] referenced here -- focusing on failure originating in the business requirements realm and offering a "68% project 'failure'" headline -- inspired two thoughts.

First, it remains a head-scratcher why nearly every project, despite talk of risk analysis and mitigation, expects to be in the 32% success area. Even if a company has the best resources (and generally it will not), many causes of project failure originate in organizational factors -- friction, process, politics -- and externalities (e.g., no budget this quarter for the tools in the original plan).

Since these issues are rarely known -- and, I would argue, actively denied in the sense that whistle-blowers and problem-spotters are systematically excluded from planning (at best) or forced down or out -- the average "expected value" of the d100 roll is way less than a 68.

In startups, I've heard the excuse that the whole company is a "long shot" -- as though that justifies taking on disproportionate risk in each project implementation.

In enterprises, it seems as though management is so used to failure (in a timeline or budgetary sense), that they have simply redefined success to mean anything that they can pull off in their tenure -- and if that means a system that kinda works, but no one likes, shipped in 200% time and 400% budget, well, that's just the "reality" (which, to be sure, it is once all other possible paths have been discarded).

This redefinition of success also has the side effect of making strict accountability impossible and pretty much assuring a nice bonus.

Another thought inspired by the report is how the gap between enterprise development and small / startup development is widening. On the one hand, large businesses could benefit from the high-productivity tools and agile approaches popular with startups; for a variety of reasons ranging from policies to personnel, they are not exploiting the latest wave of technology, and it's costing them.

On the other hand, what they need regardless of tech is solid analysis and estimation capabilities. Analysis and estimation are possible, but hard, and the agile camp has moved mountains to try and reframe the problem so that they can advocate punting on all of the hard parts of these disciplines. That works great for the hourly agile consultants, but inside large businesses and large projects, it just doesn't cut it. A business needs to be able to attempt to estimate and plan months or even years of a project (hence the prevalence of "fake" agile in companies that purport to use it).

The fact that the business does a horrible job with the estimation today does not mean that the organization (which, generally, is run by business people not developers) won't keep planning and targeting in the large.

The result of these two pieces (different tech, different process) is that enterprise and small development are moving farther and farther apart, which is a damaging long-term trend. Ideally, these two groups should be learning from each other. They should spend more time moving in the same world.

The enterprise learns that it probably doesn't need JavaEE and Oracle and a big planning process for myriad internal utility apps that could be done with Rails at a fraction of the cost and effort. The small company learns that relational integrity, transactions, estimation, and operations management are sometimes both necessary and profitable.

Even more importantly, individual employees in the industry can cross-pollinate ideas as they move between these environments over their careers, sorting fact from fiction and getting better at determining what will work.

The farther apart these groups are, the less appealing it will be for "startup guys" to work in an enterprise, or for a startup to hire an "enterprise guy."

This trend -- since it keeps useful knowledge hidden -- can only help a 68% failure rate go up.

Friday, December 12, 2008

How is Google Native Client Faster Than -- Or As Safe As -- JVM or x86 VM?

When I saw Google's proposed native-code execution plug-in earlier this week, my initial reaction was: "Don't we already have those, and they're called exploits?"

I decided to mull it over a bit, and I still don't like it. While the idea of sandboxed native x86 execution based on real-time analysis of instruction sequences makes for a great Ph.D. thesis or 20%-time project, it sounds like an awfully large attack surface for questionable benefit.

Here's what I'd like to know from a practical nuts and bolts point-of-view: how many "compute-intensive" scenarios could not be implemented effectively using either (1) the Java VM or (2) a plug-in based on widely-available open-source machine virtualization, running a tiny Linux snapshot.

While the JVM falls short of native code in some places, it can be as fast -- or even faster -- in other cases (faster because the runtime behavior provides opportunities for on-the-fly optimization beyond what is known at compile time). Yes, there are issues with clients not all having the latest Java version -- but that seems a small issue compared with the operational issue of deploying a brand-new plug-in or browser (Chrome).

Another approach is to use a plug-in that runs a virtual machine, which in turn runs a small Linux (started from a static snapshot). User-mode code should run at approximately native speed in the VM, which should cover the pure computation cases. In a rare situation where someone wants to run an OLTP file-based db (which would behave badly in a VM environment) or get access to hardware-accelerated OpenGL facilities, specific well-defined services could be created on the host, which code could access via driver "ports" installed in the VM image.

These approaches to the security boundary -- Java VM and x86 VM -- are well-known, long-tested and based essentially on a white-list approach to allowing access to the physical host. While Google's idea is intriguing, it sounds as though it's based on black-listing (albeit with dynamic discovery rather than a static list) of code. I'm not yet convinced it's necessary or safe.

Tuesday, December 09, 2008

Yes, VC May Be Irrelevant if it Continues Focusing on Weekend Projects

Where are Web 2.0's Amazon.com, PayPal, Google, or Travelocity? They were never funded.

Paul Graham and now Erick Schonfeld are extending the several-year-old "VCs don't know what to do with Web 2.0's super-low-cost startups" meme. The extension has to do with the recession, arguing that it will accelerate the decline of VC relevance, as VCs become more reluctant to fund these shoestring startups, and more entrepreneurs pull a DIY anyway and ignore VC.

Well... maybe. The VC problem with micro-cap micro-startups is real, in the sense that it's a real math problem where the VC fund size divided by the proposed investment size equals too many portfolio companies to interact with, and a kind of interaction that is a big break away from the old model.
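
The division is easy to sketch. The fund size and check sizes below are hypothetical round numbers chosen only to illustrate the point, and the calculation is a deliberate simplification (it ignores fees and follow-on reserves):

```python
def portfolio_size(fund_size: int, investment_per_company: int) -> int:
    """Number of companies a fund must back to deploy all of its capital.

    Simplified on purpose: no management fees, no follow-on reserves.
    """
    if investment_per_company <= 0:
        raise ValueError("investment size must be positive")
    return fund_size // investment_per_company


# Hypothetical figures, not taken from any real fund.
FUND_SIZE = 200_000_000

for label, check in [("micro-startup check", 250_000),
                     ("traditional A round", 5_000_000)]:
    count = portfolio_size(FUND_SIZE, check)
    print(f"{label}: ${check:,} each -> {count} portfolio companies")
```

Under these assumed numbers the same fund goes from a few dozen companies to several hundred, which is the "too many portfolio companies to interact with" problem in miniature.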

But the easiest fix is for VCs to simply invest in more expensive businesses. The current predicament is a classic chicken/egg: after the dot-com crash, VC money was hard or impossible to get, so the businesses people started were these undergraduate-level 1 man-year (or couple-of-weekends!) efforts.

Small product, small team, small money. We got things like 'Remember the Milk.' Cute, sure. Useful, sure. But nothing too hard or too ambitious. Nothing a tiny shop -- or a bored college student -- couldn't hack out in their spare time. Indeed many were spare time or side projects.

Some of these apps got a lot of users, especially where network effects were involved, and eventually VC wanted nothing that wasn't viral, network-effect, social. Never mind that there was never a big challenge or big value in those. Facebook is the biggest thing in Web 2.0 at the moment, and it's nothing but network effect and questionable monetization. "Not that there's anything wrong with that..." It's a fine, useful application. But in the absence of any real substance the model became more Hollywoodesque, more personality- and connection-dominated than it should have.

So where is the Amazon? PayPal? Google? Travelocity? Ariba? Netflix? Danger (maker of the Sidekick devices)? Those big projects that take tens of man-years, maybe hundreds? The projects that can't "launch in beta" after a few months and acquire tens of thousands of fanatical users because they're more than a glossy AJAX UI on a local database?

Where are the startups that drag whole industries whining and screaming into the 21st century and liberate billions in value, trapped in transactions that real people make every day?

Those big projects haven't been A-round darlings for a long, long time. VCs, terrified of risk, moving in a tight pack, loved the new ethos: let the entrepreneur build a product, get it launched ("beta"), get customers, get mirror-hall PR (blogosphere), then later drop in a few bucks. Less risk. But not easy money. Count the exits.  And the ad-only valuation has no more magic today than it did in 1998.

And no one even notices when hundreds or thousands of these little pownces and iwantsandys hit the deadpool.

It's no coincidence that many of the step-by-step tutorials for new frameworks and tools teach you to clone a blogger, a flickr, or a wiki farm in a sitting. After all, those are trivial undertakings, lacking only for a network-effect mob signing on. And we'd all have a good laugh about widgets one day... if we weren't laughing so hard already.

Can you imagine a tool vendor of the dot-com era giving an hour tutorial that produces a working Travelocity clone? a working PayPal clone? It's sketch comedy, or something sadder and more disturbing. Frankly, the most ambitious projects in all of web 2.0 are the tooling and infrastructure plays that are largely open source.

Investors, entrepreneurs, engineers and end users might all do well by hunting some bigger game.

Thursday, December 04, 2008

Approaches to the Silverlight Dilemma: Cashback? Vista SP? Win 7?

The jury's still out on Windows Live Search Cashback -- apparently there were some issues on Black Friday and, overall, things aren't up to goal ... but that could change.

In light of this cashback program, though, my proposal from last year that Microsoft simply pay people to install Silverlight seems astonishingly realistic.

The problem with Silverlight is that it doesn't have wide enough deployment -- so there is a chicken-and-egg problem deploying apps on the platform. Microsoft has released stats about how many people 'have access to a PC running Silverlight' -- but that's an odd definition of penetration (if a state university has 15,000 students, and 10 computers in a library lab have Silverlight, then presumably all 15,000 'have access to a PC with Silverlight'.)

So ... why not just pay folks a small amount to install? Maybe a $2 coupon (Starbucks/Amazon/PayPal/etc.) to install the plug-in in the user's default browser.

The WGA or XP key-checking components could be used to keep track of machines that have participated, making it impractical to game the system by performing extra installs. I'm sure a suitable equivalent could be used to verify installs on Macs and Linux boxes.

Another approach is to wait for a future Vista SP or Windows 7 upgrade cycle. Since Silverlight will presumably be included for IE, the trick is to hook the default browser selection and attempt to install a suitable version of Silverlight into the target browser (Firefox, Chrome) when the user selects it.

This move would surely be controversial -- even if there were a way to opt out -- and would bring back memories of the Microsoft of the 90s for some. But, since the web page itself decides what plug-ins to use, and no content is "forced" to render via Silverlight, it's only moderately different from making WMP or any other dynamically loadable object available.

And for those "open web" advocates who have nothing good to say about proprietary plug-ins ... consider that Silverlight does not really compete with the open web, but rather with Flash, which otherwise has a de facto monopoly on any RIA that cannot easily or cost-effectively be implemented in HTML/JavaScript. Breaking that monopoly could actually help the open web cause by creating a real platform debate in a space that doesn't really have any debate right now.

Tuesday, December 02, 2008

Microsoft Begins Posting "Proceedings" Docs for PDC Sessions

Microsoft has started posting an accompanying document for each 2008 PDC session, containing a detailed content summary, links, and indexes (into the video stream) for demos.

These docs, called "proceedings," had been part of the plan to publish the PDC content this year, and it's nice to see them showing up alongside the slide decks and full-length session video streams that had already been posted.

While not full transcripts, or even detailed outlines, the information value of these docs is far higher than the full-length videos (per minute of attention investment), so I highly recommend them if you have interest in any of the newest Microsoft technology.

Docs, transcripts, and outlines seem to be a more efficient way to present content for multicast consumption than live video. While full-length web video is cheap and easy to produce, it wastes an enormous amount of time on the viewer side. Instead of reading/scanning the content as fast as the reader likes, he or she must sit through the lower-information-density spoken content and -- what's worse -- is forced to do so in a linear fashion. The time loss is then multiplied by all of the viewers ...

If an employer is paying for your time watching videos -- whether they be technical sessions, product demo videos (all the rage now), or so-called webinars -- then perhaps it's nice to sit back, have a cup of tea, and go into watcher mode.

But if you're investing your own time/resource/productivity into acquiring some knowledge, it's nice to (1) be able to do it at your own reading pace and (2) be able to skim/scan sections of less value at high speed and pop out into 'careful reading mode' as necessary.

Sunday, November 30, 2008

Most Laser Printers are Razors, not Cars

Once upon a time, buying a small or midrange laser printer was like buying a car. Big upfront expenditure, lots of sweating the details, and a moderate amount of thought about gas mileage and scheduled maintenance, er, toner and fusers and all that.

Now, however, it's more like buying a razor. The core features are mostly reasonable, each "size" printer has a speed/duty cycle that determines its suitability for an installation, and the cost is so small that it's all about the consumables (blades).

So why won't vendors -- or manufacturers -- print the cost-per-page of consumables right next to the dpi, ppm, idle power consumption, and time-to-first-page?

It's easy to find 10x differences between otherwise similar printers in the cost-per-page based on the manufacturer's price for toner cartridges and their intended yield.
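
The calculation itself is trivial, which makes its absence from the spec sheet all the more odd. A minimal sketch, assuming cost-per-page is simply the cartridge price divided by its rated yield; the model names, prices, and yields below are made up for illustration and do not describe any real printer:

```python
def cost_per_page(cartridge_price: float, rated_yield_pages: int) -> float:
    """Consumables cost per page: cartridge price divided by rated page yield."""
    if rated_yield_pages <= 0:
        raise ValueError("rated yield must be positive")
    return cartridge_price / rated_yield_pages


# Hypothetical figures, not taken from any real product.
printers = {
    "budget_model": cost_per_page(cartridge_price=75.00, rated_yield_pages=1500),
    "workgroup_model": cost_per_page(cartridge_price=100.00, rated_yield_pages=20000),
}

for name, per_page in printers.items():
    print(f"{name}: ${per_page:.3f} per page")

ratio = printers["budget_model"] / printers["workgroup_model"]
print(f"the budget model costs about {ratio:.0f}x as much per page")
```

A fuller TCO figure would also fold in drums, fusers, and the purchase price spread over the expected page count, but toner alone is enough to show the spread.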

Big companies, of course, have IT purchasing folks who perform these calculations, factor in the discount they get because the CIO plays golf with the right people, and order the gear. In the case of printers, large companies are typically buying high-volume printers that are among the cheapest per page anyway.

But startups, professional practices (think doctors, accountants), small to midsize businesses -- they rarely calculate the TCO for each device. It would be helpful to have the consumables price per page listed right on the sticker, like MPG.

Saturday, November 29, 2008

For Individuals, Top Laptop Values May Be At Low-End Prices

About a year ago, when I started my latest stint doing 100% contracting, I realized I would need a laptop for meetings, client presentations, etc. Not for coding -- I've written about my views on that, and I'll never trade away my nuclear aircraft carrier of a dev box for an inflatable dinghy just so I can hang with hipsters at the coffee shop.

Since I wouldn't be using the laptop much, and historically the many company-provided laptops I've used have turned out to be poor performers, malfunction-prone, and costly to deal with, I resolved to get the cheapest new laptop I could find. (Laptops have a strange depreciation curve, with the result that a cheap new laptop is often a much better device than a used one at the same price.)

In addition to the usual holiday sales (you can guess what prompted this article now), the whole Vista-Basic-capable-not-really thing was going on, with the result that many machines were being cleared out for good, considered illegitimate for Vista and not saleable.

I snagged a Gateway floor model at BestBuy for under $300, put another gig of RAM in ($30?), snagged XP drivers, and installed Windows Server 2003 R2, since I had found that Server 2003 was very light on the hardware while offering more than all the benefits of XP.

At this price point, I figured the laptop borders on disposable, and I could keep the TCO from getting high, come what may.

Well, a year or so on I have some results to report.

It has performed far beyond my expectations (and as a developer my expectations tend to be unreasonably high).

The only negative is the comically poor build quality -- this is a machine that one must literally 'Handle With Care' as it's built about as well as those tiny toy cars out of a quarter-vending-machine. I think I could snap it in half if I tried, and a careless twist could rip the drive door off or crack the case right open. The keyboard rattles a bit on some keys.

I have a padded briefcase and the machine was never intended for "heavy duty," so that wasn't a big deal for me. And, in any case, it seems more of a reflection on Gateway than on the price point, since, e.g., Acer offers rock-bottom laptops with much higher build quality.

That issue aside, the machine has performed flawlessly. No problems with any part of it, despite being a display model. And performance adequate to some real programming.

The ergonomics are poor for programming (single monitor, low-ish resolution, etc.) -- but it snappily runs NetBeans/Ruby/Java; Eclipse/Java plus various mobile device emulators (e.g., Android) which I needed for a course I taught this summer; even Visual Studio 2008. I do run MySQL rather than SQLServer, in part to keep the load down.

Let's see ... what else has run well on here? ... the Symbian S60 emulators (along with dev tools) ... Sun's VirtualBox virtualization software, with an OS inside. All the usual productivity stuff (Office 2007, MindManager 8) ... Microsoft's Expression design/UI tool suite ... video encoding software and high-bitrate x264 streams from hi-def rips ... often many of these at the same time. Everything I've asked it to do, it does seamlessly.

My conclusions are that sturdier laptops may well be worth it, especially for corporate IT departments -- I'm thinking about products like the ThinkPad and Tecra lines, where the price doesn't just reflect the specs but also a sturdy enclosure, standard serviceable components, slow-evolution/multi-year-lifecycle per model etc.

But for an individual, unless you have a very specific hard-to-fill need (e.g. you want to do hardcore 3D gaming on your laptop or capture DV, a bad idea with a 4200 RPM HDD), the top end of the value equation for laptops appears to be at or near the bottom of the price range. When one considers that higher-end peripherals (e.g., a Blu-ray writer) can easily be plugged in via USB, and a faster hard drive will snap right into the standard slot, the value-price equation seems to get seriously out of whack for those $1200-$2500 machines.

That's not to say these higher end machines are not great ... they just don't represent value at the pricepoint. Just as a Mercedes E-class is a fine car, but the radio commercials that try to make it out to be some kind of value purchase are downright funny, I think the same applies for the high-end VAIOs, MacBook Pros, etc. Those machines are a style and brand statement for people who care about making such a statement.

This possibility is interesting because, in most products, the "optimum value" position is somewhat above the bottom end of the price range ... that is, the least expensive products are missing critical features, making them a poor value, while the high-end ones charge a "luxury premium." If laptops are different, that seems worth noting.

The usability of an ultra-cheap laptop also suggests a response to folks who commented on my earlier article, saying that companies are loath to buy a desktop and a laptop, so if an employee needs any mobility at all, they get a laptop. It appears a good solution might be to provide a high-end desktop and an ultra-cheap laptop. At these prices, the employee's time costs more than the laptop, and my experience suggests little productivity (given remote scenarios such as a training class or client demo) is sacrificed.

Tuesday, November 25, 2008

Do FlexBuilder and MXMLC Really Feature Incremental Compilation?

I use FlexBuilder in my work, and, overall, it's a decent tool. Eclipse gets a lot of points for being free; Flex SDK gets a lot of points for being free. FlexBuilder doesn't get points because it's basically the above two items glued together along with a GUI builder, and it costs real cash.

Wait, I'm off track already. The price isn't the issue for me. Rather, I want to know why FlexBuilder doesn't feature incremental compilation.

Hold up again, actually, I guess I want to know how Adobe defines incremental compilation since they insist that it is present and switched on by default in FlexBuilder.

Now, if I make any change (even spacing) to any code file -- or even a non-compiled file, like some html or JavaScript that happens to be hanging out in the html-template folder -- FlexBuilder rebuilds my entire project. And it's a big project, so, the job even on a 3.6GHz box means a chance to catch up on RSS or grab more coffee.

Interesting take on incremental compilation. See, I thought the whole idea was to allow compilation of some, ah, compilation unit -- say a file, or a class -- into an intermediate format which would then be linked, stitched or, in the case of Java .class files, just ZIPped into a final form.

Besides allowing compilation in parallel, this design allows for an easy way to recompile only the units that have changed: just compare the date on the intermediate output file to the date on the source file. If the source file is newer, recompile it. It does not appear that this is how the tool is behaving.
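
For the record, here is roughly what I mean, as a minimal sketch. This is the generic make-style timestamp check, not a claim about how mxmlc works internally; the function names and the source-to-output mapping are my own inventions for illustration:

```python
from pathlib import Path


def needs_recompile(source: Path, output: Path) -> bool:
    """True when the intermediate output is missing or older than its source."""
    if not output.exists():
        return True
    return source.stat().st_mtime > output.stat().st_mtime


def stale_units(units: dict[Path, Path]) -> list[Path]:
    """Given a source -> intermediate-output mapping, list the sources to rebuild."""
    return [src for src, out in units.items() if needs_recompile(src, out)]
```

A build driver would call `stale_units` with its own mapping, recompile only what comes back, and then relink. Real tools also have to track dependencies between units (a changed base class can invalidate its subclasses), which this sketch deliberately ignores.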

Perhaps this logic is already built into FlexBuilder -- mxmlc, really, since that's the compiler -- and the minutes of build time are spent on linking everything into a SWF. Since Adobe revs Flash player regularly, and many movies are compiled with new features to target only the new player, it should be possible to update the SWF format a bit in the next go-around, so that linking doesn't take egregiously long.

Apparently, at MAX this year, Adobe has started referring to the Flash "platform" -- meaning all of the related tools and tech involved around the runtime. Fair enough, it is a robust ecosystem. But "platform" kind of implies that the tools support writing real -- and big -- applications, not just a clone of Monkey Ball or another custom video player for MySpace pages.

Sunday, November 23, 2008

Software Discipline Tribalism

Unfortunately, people seem rarely able to stop at a reasoned preference -- e.g., "I like X, since X may offer better outcomes than Y" ... and too often end up, at least whenever group persuasion is involved, somewhere more dramatic, personal, extreme, and narrow -- e.g., "I'm the X kind of person who rebels against Y, since Y can offer worse outcomes."

This is as true in cultures of software development as anywhere else.

While many people have been guilty of this unproductive shift in attitudes, it seems much of the Agile development, dynamic languages, small-tools/small-process/small-companies crowd has long since fallen prey to it.

To be sure, there was much temptation to rebel.

At the start of the decade, proprietary Unix was strong; many processes came with expensive consultants, training, books, and tools ... and overwhelmed the projects they were meant to guide; many tools were expensive and proprietary. Web services were coming onto the scene and large players, with licenses and consulting hours to sell, created specs that were unwieldy for the small, agile, and less-deep-pocketed.

When the post-dot-com nuclear winter set in, small companies had no money to pay for any of that, and we got LAMP, Agile, TDD, REST, etc. Opposition to OO, which had been strong in many quarters, suddenly faded as OO was no longer identified (rightly or not) with certain problematic processes. Ironically, many new OO language fans had been ignoring the lightweight, free (speech and beer) processes that some OO advocates had been producing for years.

These have all proven to be useful tools and techniques and have created whole companies and enormous value ... but somewhere along the line, instead of being in favor of these tools and techniques because in some cases they produced better outcomes, either the leaders or the converts started thinking they were the rebels against anything enterprise, strongly typed, thoroughly analyzed, designed and well tooled.

This shift in attitudes does not help the industry ... nor even the clever consultants who lead the charge, deprecating last year's trend for a new one which they just happen to have written a book about.

We desperately need a broader perspective that integrates all of these pieces. There are things manually-written tests just won't do -- tools like Pex can help immensely, even if (or because...) they are from a big company.

Analysis and design are not bad words, while Agile can get dangerously close to simply surrendering to the pounding waves of change (and laughing at goals all the way to the bank) rather than building against the tide, and trying to manage to a real outcome on a real budget.

Static languages can get hideously verbose for cases with functor-like behavior (Java and C# [pre-3.0], I'm looking at you). At the same time, go talk to some ActionScript developers -- who have had dynamic and functional for years -- and you'll see an amazing appreciation for the optional strict typing and interfaces in AS3.

REST is great, but in playing at dynamic, it turns out to be rather like C -- it's as dynamic as the strings you pipe into the compiler, and no more. Absent proper metadata, it cannot reflect and self-bind, so it sacrifices features that dynamic language developers love in their day-to-day coding.

Ironically, most of the critical elements of this "movement" -- along with open source -- are being subsumed into the big enterprise software companies at a prodigious pace. Sun owns MySQL and halfway owns JRuby; Java servers may soon serve more Rails apps than Mongrel/Ebb/Thin/etc.; Microsoft is all over TDD, IronRuby, IronPython...

I suppose the sort of tribalizing we see here is at least partly inevitable in any field. But it would serve the entire industry if that "part" could be made as small as reasonably possible. As a young industry with a poor track record and few rules, we ought to be more interested in better software outcomes than in being rebellious.

Thursday, November 20, 2008

Microsoft Pex Moves the Needle Bigtime on Software Testing and Correctness

Over two years ago, I wrote about how neither the assurances of static compiler technology nor the ardent enthusiasm and discipline of TDD (and its offshoots) represent major headway against the difficulty and complexity of large software projects.

At the time, this issue came up in the context of static languages versus dynamic languages. There still exists a political issue, although today it is more transparently about different organizations and their view of the computing business. I will revisit the successor debate in my next post.

For now, however, I want to talk about a tool. In my post of two years ago, I suggested that significantly better analysis tools would be needed in order to make real progress, regardless of your opinion on languages.

So I've been excited to see the latest tools from Microsoft Research fast-tracking their way into product releases -- tools which can really move the ball downfield as far as software quality, testing, productivity, and economy.

The most significant of these is called Pex, short for Program Explorer. Pex is a tool that analyses code and automatically creates test suites with high code coverage. By high coverage, it attempts to cover every branch in the code and -- since throwing exceptions or failing assertions or contracts count as branches -- it will automatically attempt to determine all of the conditions which can generate these occurrences.

Let me say that again: Pex will attempt to construct a set of (real, code, editable) unit tests that cover every intentional logic flow in your methods, as well as any exceptions, assertion failures, or contract failures, even ones you did not code in yourself (for example, any runtime-style error like Null Pointer or Division by Zero) or which seem like "impossible-to-break trivial sanity checks" (e.g. x = 1; AssertNotEqual(x, 0)).

Moreover, Pex does not randomly generate input values, cover all input ranges, or search just for generic edge cases (e.g., MAX_INT). Instead, it takes a complex theoretical approach called "abstract interpretation" coupled with an SMT (satisfiability modulo theories) constraint solver to explore the space of code routes as it runs the code and derives new, significant inputs.
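To make the goal concrete, here is a minimal sketch in Python of what "one test input per branch, including the branches you never wrote" looks like. It is emphatically not how Pex works: Pex solves constraints, while this toy simply enumerates a handful of candidate inputs. The function `scale` and both helpers are hypothetical names invented for the illustration.

```python
def scale(total, parts):
    """Toy method under test: two lines of code, three distinct outcomes."""
    if parts < 0:
        raise ValueError("parts must be non-negative")
    return total // parts  # implicit branch: ZeroDivisionError when parts == 0


def outcome(func, *args):
    """Label a call by how it ends: a normal return or an exception type."""
    try:
        func(*args)
        return "returns"
    except Exception as exc:
        return type(exc).__name__


def one_input_per_outcome(func, candidates):
    """Keep the first input that triggers each distinct outcome.

    Brute-force enumeration, NOT constraint solving; it only illustrates
    the end result: a small suite that reaches every branch.
    """
    suite = {}
    for args in candidates:
        suite.setdefault(outcome(func, *args), args)
    return suite


# Hypothetical candidate inputs: parts ranges over -2..2.
candidates = [(10, p) for p in range(-2, 3)]
suite = one_input_per_outcome(scale, candidates)

for label, args in suite.items():
    print(label, args)
```

The interesting entry is the division by zero: nobody coded that branch explicitly, yet it counts as a path that needs a test. A tool in the Pex mold would derive the input `parts == 0` from the code itself rather than stumbling on it in a list of guesses.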

In addition, so far as I can understand from the materials I've seen, Pex's runtime-based (via IL) analysis means that it should work equally well on dynamic languages as on static ones.

To get an idea of how this might work, have a look at this article showing the use of Pex to analyze a method (in the .Net base class library) which takes an arbitrary stream as input.

For those of you who are inherently skeptical of anything Microsoft -- or anything that sounds like a "big tool" or "big process" from a "big company" -- I'll have more for you in my next article. But for now keep in mind that if Microsoft can show that the approach can work and is productive, user friendly, and fun (it is), then certainly we will see similar open source tools. After all, it appears the same exact approach could work for any environment with a bytecode-based runtime.

Last, I do recognize that even if this tool works beyond our wildest expectations, it still has significant limitations including

  1. reaching its full potential requires clarity of business requirements in the code, which in turn require human decision making and input and
  2. for reasons of implementation and sheer complexity this tool operates at the module level, so you can't point it at one end of your giant SOA infrastructure, go home for the weekend, and expect it to produce a report of all the failure branches all over your company.

That said, here are a couple more great links:

Wednesday, November 12, 2008

BizSpark Shows Wider Microsoft View Around SaaS Innovation

Despite a lot of reporting about Microsoft's new BizSpark program, one interesting bit wasn't featured in the coverage:

Microsoft's existing nearly-free-licenses-to-help-developers-get-going-on-the-platform program, Empower, has been around for a while and has always required that a company plan to distribute a "packaged and resalable" application targeting a Microsoft platform.

It could be an app on Vista or Server, an Office add-in, a Windows Mobile app, or one of a few other options. But it had to be packaged, at least in the sense that it was digitally bundled into an installable set of files even if it never got put in a physical cardboard box.

This requirement made some sense as far as promoting the client OS ecosystem but it disqualified any online offering. An online service had to work around the restriction: for example, by offering a small Windows Mobile app that has some interaction with the service.

But the language makes a statement -- Empower was largely about helping ISVs new or old develop apps on the platform, thus making the client platform stronger. Never mind that an online service that targets browsers and the iPhone might lead to Server license sales later.

BizSpark, on the other hand, takes another approach. There is much talk of online solutions -- the program is meant to dovetail with hosting providers (or Microsoft's Azure platform) to offer the server-side muscle a solution will need after its incubation period. The program is aimed exclusively at new companies -- if a firm has been in business 3 years or more, it does not qualify.

Both programs exist today and will presumably continue. So I'm not suggesting there is a big move from one view of the world to another. But there does seem to be a conscious broadening of horizons in terms of seeing where innovation is taking place and how Microsoft can be part of it.

Tuesday, November 11, 2008

On the Knocking at the Gate, VCs, and a Math Problem


This economy may be the wakeup to VCs (and CEOs alike) that their future isn't what they want. But the murder happened years ago, and the "I don't want yes men, but strangely I don't listen to much else" hivemind has just woken up.

The IPO window isn't shutting or recently shut -- it never re-opened after the dot-com meltdown. A handful of IPOs (including a Google) doesn't make a "window." Whether you blame SarbOx, or a trend of investing in companies with a wink-wink style of sustainable competitive advantage, that could not produce high enough valuations to warrant a public offering, there were to be few IPOs.

The startup and VC world started getting this idea a couple of years ago, when they realized that at the smaller exit values (for the exits that were to be had), in order to get a high multiple return, the initial investments would have to be so small that the venture fund couldn't afford to service the quantity of investments. That is, the investment would have to be too small to be worth the firm's time. Uh-oh. A few innovative programs came out of that realization. But for the most part everyone acted like this was just a bad dream.
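The squeeze described above is easy to see with a little arithmetic. Here is a rough sketch in Python; every number in it (the exit size, the ownership stake, the target multiple, the fund size, the partner count) is a made-up assumption chosen only to show the shape of the problem, not data from any real fund.

```python
def max_check_size(exit_value, ownership_pct, target_multiple):
    """Largest investment that still returns target_multiple at this exit."""
    proceeds = exit_value * ownership_pct // 100  # the fund's share of the exit
    return proceeds // target_multiple


def deals_to_deploy(fund_size, check_size):
    """How many checks of this size it takes to put the whole fund to work."""
    return fund_size // check_size


# All figures are hypothetical, in whole dollars.
check = max_check_size(exit_value=30_000_000, ownership_pct=20, target_multiple=10)
deals = deals_to_deploy(fund_size=200_000_000, check_size=check)
deals_per_partner = deals // 5  # assuming five partners

print(check, deals, deals_per_partner)  # prints 600000 333 66
```

Under those assumptions a 10x return on a $30M exit caps the check at $600K, and a $200M fund would have to write hundreds of them -- dozens of companies per partner, which is the "too small to be worth the firm's time" problem in numbers.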

Moreover, the vaunted "get acquired" exit that appeared to be the next-best exit option, has been rather overrated. The real acquisition numbers for the most part are not what investors (or founders) would like. Not to mention the acquisition could well mean the end of the road for the business (Google is the most famous for this) which is the opposite of what founders should want, and so produces some strange incentives. Yes, YouTube and Skype ... but the curve falls off quickly.

The bad news is that this pile of trouble has been sitting in the corner stinking up the room for years.

The good news is that it's not a sudden crisis, and may well be correctable by VCs who are willing to, um, take some risk (this means getting out of their comfort zone in terms of rituals and assumptions, or expanding said zone) which is, ironically, what they are supposed to be doing for their investors.

But what about all of that advertising? Isn't there money in all of those targeted ads? Or, at least, wasn't there supposed to be until the advertising market started downhill?

In the short term, maybe ... but in the long term, the model doesn't work at the macro level and here is some math that suggests why.

Showing ads is kind of like printing money. You can show as many as you like up to a function of your pageviews. In order for the ad-economy to grow, the attention economy has to grow. That is, the aggregate amount of attention-hours spent against ad-supported pages needs to grow. Ok, there's definitely evidence for that (GMail, etc.)

But what about the ratio of the ad growth rates to attention growth? Attention growth has real-world limits (number of people, amount of time, ad-blockers, desensitization to ads) while theoretical ad supply does not. The limit on attention growth does not limit real-world ad growth. For example, if I view 50 GMail pages where I used to view 15, it's entirely possible that the same amount of attention is now divided across more ads -- or that the total spent attention is even smaller.

In the long run, the ad-value growth is smaller than the application-value growth. So the deficit of uncaptured value for businesses relying on ad revenue grows larger over time.
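A back-of-the-envelope version of that argument, sketched in Python. The 15 and 50 page views come from the example above; the 30 minutes of attention, the one-ad-per-page rule, and both growth rates are assumptions picked purely for illustration.

```python
def attention_per_ad(attention0, ads0, attention_growth, ad_growth, years):
    """Attention available per ad after compounding both growth rates."""
    attention = attention0 * (1 + attention_growth) ** years
    ads = ads0 * (1 + ad_growth) ** years
    return attention / ads


# Same attention budget (hypothetical: 30 minutes), one ad per page view.
before = 30 / 15  # minutes of attention per ad at 15 page views
after = 30 / 50   # minutes of attention per ad at 50 page views

# Hypothetical rates: attention grows 5% a year, ad supply grows 20% a year.
trend = [attention_per_ad(30, 15, 0.05, 0.20, year) for year in range(6)]

print(before, after)
print([round(value, 2) for value in trend])
```

As long as ad supply compounds faster than attention, the attention behind each ad shrinks every year, whatever the starting point.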

We can check this analysis by looking at the numbers from another point of view. Start with the raw resource itself -- a piece of a data center that includes a unit of compute power, storage, bandwidth, and hosting.

Hosting and displaying ads is relatively less expensive (in units of the resource) than hosting application functionality. So the very resource which makes the ad-supported model plausible will support growth on the ad side that is at least as strong as growth in hosted functionality, which is the attention-harnessing product. That is, the resource availability supports growing the supply of units for spending attention as fast or faster than the supply of units for capturing attention. Again, over time, in the aggregate, more ads are powering fewer features. The value of each ad goes down.

This has nothing to do with the overall economy, consumer spending, etc. It's simply a side-effect of the coupling between the monetization mechanism and the product.

If, at the same time, the "real-world" spending that is driven by even successful ads is flat or in decline, you have an even bigger problem.

Thursday, November 06, 2008

Windows 7 Puts on a Good Show

I took Windows 7 for a quick test drive yesterday. My main goal was to see whether the performance would be so brutally bad as to make me relive my Vista experience.

For those who haven't read my Vista posts, the short version is: early Microsoft developer releases had unbearably bad performance; Microsoft made excuses ("debug build" etc.); turns out the RTM was nearly as bad. I made an honest attempt to run Vista but, as a developer, I just couldn't bear the excruciating waiting, knowing that I could be screaming along in XP. I run XP (or Server) to this day for my development.

Just the Vista-like look of the early Windows 7 bits made me anxious -- I wasn't expecting a lot. I set it up in a VMWare VM with 768 MB of RAM and no VMWare tools (= minimal video perf) in order to torture the OS. Naturally, things would be better running on the metal in a new machine designed for Win 7.

Install was very fast and seamless. I could see a difference right away in perf: even the shell in Vista runs slow, and this one was snappy. I saw the new-and-removed UAC, and I liked.

To add some more load, I installed Visual Studio 2008, which is a fairly heavy app. In addition, it was Visual Studio that had made me give up on Vista in 2007, so I thought it was fitting to try it again.

Inside Visual Studio, I opened a WPF windows project. Mucked around with the GUI editors, built, debugged ... and it cruised along nicely in the VM. Next, I set up an ASP.Net web project, and got that going in debug mode with the integrated server. Finally I started to feel some minor slowdown -- but it appeared I was running out of RAM with my 768MB VM. This was not a huge shock, since my install of Win 7 was consuming about 450MB RAM at idle, with no user apps running.

The 450MB RAM usage is a little disturbing, but, hey, even fast RAM is cheap. And my Server 2008 setup was idling at about 350MB with few services enabled, so I suppose this usage is to be expected.

Overall, I was very happy with my Win 7 preview. I could see myself actually using this as my OS and not cursing all the time, which was a pleasant surprise.

The big unanswered, and unanswerable, question is: how similar will this experience be to the final RTM of Windows 7?

On one hand, Microsoft might have released this preview "stripped down" -- either to make it run better on today's hardware, or just because the additional components with which it will ship are not yet ready for public consumption. In that case, future builds might be slower.

On the other hand, still smarting from Vista, Microsoft might adjust the previews in the opposite direction -- a sort of "under-promise, over-deliver" thing -- lest anyone see a later build and say anything except "wow this is fast."

Tuesday, November 04, 2008

On BASIC and JavaScript and 25 Years of Coding

I realized I've been putting programs together on a regular basis for 25 years now. I distinctly remember some of the earliest programs I worked on, around October 1983, and the fourth-grade teacher who let me get away with a lot more access to my school's computers than was the norm. When I somehow convinced my parents later that year to get me a machine ... things got even more out of control.

I worked with a bunch of the classic 80s "home computers": Tandy Color Computers (1, 2, and 3) and Apple II (and II+, IIe, IIc, IIgs, ...), and some not-so-classics like a CP/M machine that ran via an add-in board (with its own Z80) inside an Apple.

The languages and tools were primitive, performance was always a concern, most serious programs required at least some assembly-language component for speed and hardware access and, even if they didn't, compatibility across computer types was nearly impossible.

A lot like programming in JavaScript nowadays (I guess replace assembly language and hardware with native browser plug-in [e.g., gears] and OS APIs).

I could flog the 80s-BASIC / JavaScript analogy to death, but (1) if you read this blog, you likely can fill in the blanks yourself and (2) my goal isn't to bash JavaScript, which would be a side-effect of the drawn-out analogy.

What I find interesting is the question of why these things seem similar, and I have a hypothesis.

I have noticed that many members of my peer group, who started programming at a young age on these early microcomputers, have an affinity for tools, structured languages, and to a lesser extent models and processes. I wonder whether this affinity is not some kind of reaction against the difficulties of programming these early microcomputers in what can only be called a forced-agile approach, where debugging and testing took an inordinate proportion of the time, "releases" were frequent, and where the only evidence of working software was ... working software.

I will be the first to admit I am quite conscious that my experiences in the years before the C/DOS/Windows/C++/Mac era make me appreciative of -- and (depending upon your perspective) perhaps overly tolerant of -- process, tools, models, definitions, infrastructure, etc. as a kind of reaction.

Let's stretch the hypothesis a little further: Gen-Y, who missed this era in computing (I would say it really ended between 1987 and 1989) will have typically had their first experience with coding in a structured, well documented, "standardized" ecosystem -- whether that was C on DOS, or Pascal on a Mac, or for those still younger perhaps C++ on Windows or even Java on Linux.

Depending on their age today, and the age at which they first took up coding, this generation had compilation, linking, structured programming, OS APIs, perhaps even OO and/or some process from the beginning. For them, it is possible to imagine, the overhead of the structure and process was something to rebel against, or at least a headache worth questioning.

Hence their enthusiasm for interpreted languages and agile approaches, and the general disdain for processes with formal artifacts, or extensive before-the-fact specs.

That's the hypothesis.

A simple study could validate at least the correlation -- by gathering data on age, age when someone started coding, the language/machine/environment they were using, perceived benefits or disadvantages in those years, and their interests today. And even the bare correlation data would be fascinating.

Considering that these approaches are often "best fits" for rather different kinds of development projects, knowing what sort of prejudices ("everything looks like a nail") individuals bring to the debate might add a lot of context to decisions about how to make software projects more successful, and how to compose the right teams and resources to execute them.

Friday, October 31, 2008

Great Business Model: Pseudo-Business, Pseudo-Freemium Online Software

I can't come up with a better name for this model, but, not to worry, you'll recognize it right away. In this period of renewed discussion of "how to make money," I'm trotting out my favorite -- perhaps the best one for a startup today.

A couple of examples are YouSendIt.com and JotSpot.com (prior to its acquisition by Google).

Let's take the explanation in two parts. Pseudo-Business Software is software used to conduct business, but which is not necessarily sold directly to businesses. Put another way, it is priced and offered in such a way that individuals and small work groups inside of businesses can buy and use the software directly, without larger purchasing approval and without IT department approval. It offers a businesslike function so that it is easy to justify on an individual expense report -- and it's cheap enough that some folks may happily pay for it themselves just to be more effective at work, the same way they might shell out $25 for a DayRunner or $50 for a nice portfolio without thinking twice.

YouSendIt fits this model: it offers an oft-needed business function -- transferring large files. It lets employees bypass the tortuous and unwinnable debates with IT over why and whether attachments fail, how to share files with others, etc. Pay a few bucks and if you can get to the web at work, you're good to go. Easy to expense or even pay for on your own. It's enterprise software sold cheaply and one user at a time.

JotSpot (now Google Sites) -- in its original freemium wiki form -- fits as well. Easy to justify as project groups scale and each monthly increment is a small charge. Instantly bypass all the broken collaboration infrastructure your company can't get right.

At the same time, these products are Pseudo-Freemium software. If freemium is software that offers one level of functionality or resources for free, with more available for a price, then pseudo-freemium is like freemium except that the free version is not terrifically usable as a business solution except to make the customer comfortable with the product.

YouSendIt has, and JotSpot had, free versions. Unlike many consumer freemium use cases where many users happily use the free version and never need to upgrade, these pseudo-freemium products are specified so as to be more like a mini-free-trial.

IIRC, the JotSpot free account was limited to 5 users and 20 wiki pages, while YouSendIt free limits the file size to just shy of what I always seemed to need. There are countless other examples, ranging from single-user hosted source code services to online storage that offers only minimal free space.

The free version gives you the "warm fuzzy" of a free never-goes-away account and lets you see that the product works as advertised and won't embarrass you with some crazy or unprofessional aspect when you're asking your boss to sign for it on your expense report. These have always been the easiest-sell, no-brainer for-pay services that I've subscribed to. In general, they appeal to that certain purchasing area in people's minds -- next to the planners, a beer in the airport restaurant, or a nice tie -- as modest costs of being a professional that are either expensable or should just be paid for oneself.

Monday, October 27, 2008

Azure -- and the Other Clouds Players -- Should Lean Forward

Since I covered Azure pretty well two weeks ago, there's not much to add except the name and the open question of which parts of the platform can be run in-house, on AMIs, or anywhere outside of MSFT data centers (via a hosting partner). And Microsoft hasn't really addressed that either (I have questions in at PDC) so the answer appears to be "not yet, stay tuned."

Now that the semi-news is out of the way, I am a little disappointed that all the cloud players haven't leaned in more, in terms of providing added-value capabilities beyond scaling. Elastic scaling is valuable, but it's a tradeoff. You are paying significantly more to be in the cloud than you would be to host equivalent compute power on your own machines, or on VMs or app server instances at a consolidated host.

If you have reasonable projections about your capacity, then you're wasting money on the elasticity premium. You do get some nice operations/management capabilities ... but for apps that really need them, you still need to bring a bunch of your own, and you're taking on someone else's ops risks too.

For some businesses, these costs make sense. Here are some value-added features that would make the price persuasive for more people outside that core group:

  1. Relational and transaction capabilities. Microsoft does get the prize here, as they are the only ones offering this right now. Distributed transactions and even joins are expensive. So charge as appropriate. It's a meaningful step beyond the $/VM-CPU-cycle model that dominates now.
  2. Reverse AJAX (comet and friends). Here is a feature that is easy to describe, tricky to get right and multiplies the value of server resource elasticity. It's a perfect scenario for an established player to sell on-demand resources, and could be a differentiator in a field sorely lacking qualitative differentiation.
  3. XMPP and XMPP/BOSH (leveraging the reverse AJAX capability above). XMPP is clearly not just for IM anymore, and may evolve into the next generation transport for "web" services. Not to mention, having a big opinionated player involved may help at the next layer in the stack, namely how a payload+operation gets represented over XMPP for interop.

Those are just a couple of ideas that spring to mind -- I'm sure there are much better ones out there. To make the cloud more of a "pain killer" than a "vitamin" for more people, some new hard-to-DIY features are the way to go.

Monday, October 20, 2008

hquery: Separate data from HTML ... without templates!

Out of a bunch of cool presentations at last week's San Francisco Ruby meetup, my favorite was Choon Keat's demo of a working implementation of his hquery project -- a lib that lets you use Ruby, CSS, and jQuery patterns to bind data to views ... in the DOM, on the server side.

He had previously blogged about the motivation -- the core idea is that 10-15 years into the web application era, we are largely using template languages for integrating data into HTML. We've come a long way toward avoiding procedural code in our templates, enforcing MVC, etc. But aside from collectively agreeing to avoid a set of 'worst practices,' we're still inserting data into HTML in a manner reminiscent of 1998.

For well designed pages, with CSS classes and/or IDs, it should be possible to specify the data binding using CSS on the server, without any special binding tokens or markup. hquery does just that. So a designer can create full mockups with dummy data, and hquery can swap in the live data.
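To give a feel for the idea (not for hquery itself, which is Ruby and has its own API), here is a minimal sketch in Python using only the standard library. The mockup, the class names, and the `bind` function are all invented for illustration, and since ElementTree has no CSS selector engine, an XPath attribute predicate stands in for a class selector like `.name`.

```python
import xml.etree.ElementTree as ET

# A designer's mockup: plain, well-formed markup with dummy data and no
# template tokens. (Hypothetical content.)
MOCKUP = """<div id="profile">
  <h1 class="name">Jane Placeholder</h1>
  <p class="city">Anytown</p>
</div>"""


def bind(markup, data):
    """Replace the text of every element whose class matches a key in data."""
    root = ET.fromstring(markup)
    for css_class, value in data.items():
        # Stand-in for the CSS selector ".<css_class>"; exact match only,
        # so it does not handle elements carrying several classes.
        for node in root.iterfind(f".//*[@class='{css_class}']"):
            node.text = str(value)
    return ET.tostring(root, encoding="unicode")


html = bind(MOCKUP, {"name": "Ada Lovelace", "city": "London"})
print(html)
```

The point is that the mockup stays a valid, viewable page the whole time: the binding lives in the server-side code and is keyed off the same classes the stylesheet already uses.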

The current demos and syntax might be a touch verbose, but this is just an initial proof-of-concept, and Ruby lends itself to easy re-arrangement of an API if you need to.

This sort of radical division of dynamic data from HTML, while still using standards and not introducing yet-another-meta-templating-scheme reminds me a little of the old idea of using XML+XSLT to create pages. We all know how popular that ended up.

The hquery approach seems about fifty times more accessible to the present community of developers ... so the question is: how do we make this more popular and build community interest?

Thursday, October 16, 2008

Listen to Early Windows 7 Feedback, Even From Developers

At the upcoming 2008 Professional Developers Conference, Microsoft will be showing Windows 7 to developers.

My bet is that the 160GB portable hard drives they are handing out to distribute preview bits will actually contain Virtual PC images of Windows 7 in various states or configurations. Such a setup will be more convenient to try, even if it does narrow what aspects of the OS can be seen.

In any case, Microsoft would do well to pay attention to the feedback it receives from these developers.

We know all of the reasons why geeks can make poor proxies for "real end users." Nonetheless, I recall the 2005 PDC, when Microsoft gave us the latest beta of Windows Vista. A chorus of complaints arose from many who tried the new OS. It's way too slow; it doesn't work with the hardware we have; we can't explain the 10-odd different SKUs to our customers.

Do these sound familiar? They should, because they're uncannily similar to the problems "real end users" found -- and continue to find -- with Vista.

At the time, the 'softies at the conference, who are generally open, approachable, and humble with regard to technical matters, didn't want to hear these complaints about Vista. I was rebuffed more than once: the SKUs haven't been ironed out yet; the beta build is a checked debug build, so of course it's slower. Well, maybe. But I found it to be little slower than the release build on the same hardware. Either way it was unusable.

I think everyone's learned from the Vista experience -- and that includes Microsoft, ISVs, consumers, PC builders ... and Apple.

Let's try it differently this time around, starting with feedback from PDC.

One last thing: it would make sense to release the Windows 7 preview to the general public at the same time. Why? It'll be on the file-sharing networks instantly, where there is a greater chance of folks downloading a trojaned image, etc. So it will help everyone to have an official distro from Redmond instead.

Monday, October 13, 2008

What's In Microsoft's "Strata"[?] Cloud OS


Just for fun, let's do a little educated speculation on Microsoft's "cloud os" initiative. It's not too hard to make some good guesses -- Microsoft's existing and unreleased products telegraph a lot about what they are likely assembling. For example, the semi-well-known "COOL" and Visual J++/WFC gave you most of what you needed to know to imagine the real .Net platform.

There are lots of pieces out there -- certainly enough to comprise a pretty interesting cloud stack and application model.

Since Microsoft -- and platform vendors in general -- like to go all out, let's imagine this stack reaching from real hardware up through virtualized hardware up to application servers and then to client components and the end-user's browser or alternative on the other end.

Let's start in the middle of this stack and work our way out.

What would the "middle" look like? Well, what makes a hosted ASP.net account different from a cloud platform? Some answers: storage and bandwidth may not be elastic; clustering the app is neither automatic nor declarative, but requires programmatic and operational work; the database is typically a SQL Server instance (perhaps a mirrored failover cluster) with all of the usual capabilities and scaling constraints.

So imagine a hosted ASP.net account with a few changes that address these limitations.

First, swap in an alternative implementation of sessions, that supports clustering, proper caching, etc., with zero config. Add a lint-like tool to warn about code that isn't properly stateless. And an asynchronous worker service for any long-running, background, or scheduled tasks that could be "fudged" with threads or events in a controlled Windows Server environment, but won't work that way in the cloud.

Next, replace the datastore with something like ... SSDS, and a LINQ provider so that in many cases code won't need to be changed at all. The interesting thing about SSDS, of course, is that unlike other non-relational cloud datastores, Microsoft has said the roadmap will offer more relational capability (subject to constraints, no pun intended). So ASP.net apps that need real relational behavior might have an easier time moving to this new datastore.

So, without much new, we have a flavor of ASP.net that is more cloud-centric and less server-centric.

Now on the hardware and VM end of the stack, bear in mind also that -- to add value and sell the Server product, as well as to service enterprises which would like cloud architecture but need parts of the "cloud" to stay inside the firewall -- the whole enchilada is likely to be available as a service (or its own SKU) on Windows Server.

In fact, a number of Microsoft products related to modeling data centers, virtualization, and automated migration of services and machine images suggest that a key thrust of the "cloud os" might be that a customer can easily move services from individual servers up to a private cloud implementation and on to one or more public cloud data centers (perhaps an opportunity for the hosting partners) ... provided they are coded to conform to the API.

ADO.Net Data Services (aka Astoria) already supports AtomPub, the format Microsoft is using or moving to for all of its Live services, so minimal wrappers (not to say minimal effort in the API design) could turn this into a platform API. A simple using directive brings in a File object and API that works with Skydrive instead of My Documents.

Last, look at the client end of things. Right now, we have ASP.net serving web pages, and we have web services for Silverlight clients. There is also a project (named "Volta", and which has just recently gone offline while a "new version" is in the works) aimed at dynamic tier splitting and retargeting apps in terms of the client runtime. Hmmm... Sounds like a critical part of the front end of the cloud os stack.

In order to provide a RIA experience via Silverlight (or even desktop experience for a cloud-os edition of office), promote the client os product by offering a best-of-breed experience on Windows clients, and at the same time offer a legitimate cross-platform web-browser-consumable app, a piece like Volta is critical, and makes complete sense.

Microsoft tends to hunt big game, and I doubt they are interested in a me-too web app environment. They really intend to offer a cloud os, allowing developers to code libraries and GUIs that are outside of the web paradigm. These bits can run as .Net on Windows ... as .Net in Silverlight on Mac or (one day) Linux ... and as Javascript apps in non-.Net-capable browsers.

The big question in my mind is timing -- how far along are they on the supportable, RTM version of this stuff? Whether this is relevant -- or even becomes a reality -- will depend on how fast they can get this out of beta.

It seems that when Microsoft is quite close to production with a platform they can grab enormous mindshare (recall the release of the .Net platform). If this is an alpha look, with no promised timeline, things are a lot more tenuous. If there is a 1.0 planned before mid 2009, this could make things interesting.

Sunday, October 12, 2008

Time for a Wireless Coverage Map that Shows Utilization, Not Just "3G"

It's easy to get distracted with mobile protocol (HSDPA vs. EVDO) or "generational" system (GSM-3G vs. EDGE) speed claims. In fact, that's the most common conversation that mobile operators, hardware manufacturers, and customers have.

But it's only a piece of the puzzle. The other big ones include latency (the time required to establish a connection and get the packets flowing) and congestion (the instantaneous demand for bandwidth relative to the current capacity in a location).

Without downplaying the value of 3G+ data speeds, it is still instructive to see how well a "slow" connection can work when congestion is low, and how badly 3G can work when congestion is high.

Apple iPhone customers have complained about the high congestion experience. The other day I had an interesting low congestion experience.

I was in a corner of San Rafael where my phone could only negotiate GPRS -- but I had the air to myself. Subjectively, the browsing experience was better than a typical EDGE connection on the same hardware, and similar (for modest amounts of data) to the 3G experience.

Demand on the network changes constantly as users do different things, and the effective capacity changes due to everything from weather to RF interference to upstream network congestion. So it's not easy for an operator to make a priori statements about actual speeds or actual congestion ... hence they talk about the protocols they offer and their "optimal throughput."

But congestion/capacity issues are a first-order concern in many areas, so I propose some mechanism be created so that customers and operators can have an informed negotiation about service.

I'd like to see a coverage map, for example, that doesn't just show "3G" areas in a certain color -- but also color codes the average real utilization over the past 90 days. Sort of like shopping for an airline ticket and looking at that column that shows "on-time percentage." It lets a customer separate the hypothetical performance from what can actually be expected.
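To make the idea a little more concrete, here is a minimal Python sketch of the aggregation such a map would need. Everything in it is an assumption for illustration: the sample format (area, demand, capacity), the function names, and the color thresholds are hypothetical, and a real operator would have its own measurements and bands.

```python
from collections import defaultdict

# Hypothetical color bands: (upper bound on average utilization, color).
BANDS = [(0.5, "green"), (0.8, "yellow"), (float("inf"), "red")]


def average_utilization(samples):
    """Average demand/capacity per area.

    samples: iterable of (area, demand, capacity) readings collected
    over the reporting window (e.g. the past 90 days).
    """
    totals = defaultdict(lambda: [0.0, 0])  # area -> [sum of ratios, count]
    for area, demand, capacity in samples:
        if capacity <= 0:
            continue  # skip readings with no usable capacity figure
        totals[area][0] += demand / capacity
        totals[area][1] += 1
    return {area: total / count for area, (total, count) in totals.items()}


def color_for(utilization):
    """Map an average utilization to the first band that contains it."""
    for upper, color in BANDS:
        if utilization <= upper:
            return color


if __name__ == "__main__":
    readings = [("A", 2, 10), ("A", 4, 10), ("B", 9, 10)]
    for area, value in sorted(average_utilization(readings).items()):
        print(area, round(value, 2), color_for(value))
```

The interesting design question is the averaging window and the sampling times: a 90-day mean hides rush-hour peaks, so a per-hour-of-day breakdown might be more honest.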

That information would also motivate operators to invest in capacity and infrastructure where the demand is, rather than trying to extend that "patch of orange" on their coverage map to one more town for sales purposes.

Wednesday, October 08, 2008

Engineering Departments Should Take Their Medicine Too

Without getting into gloomy predictions, it's clear a lot of companies will be feeling some pain soon, if they aren't already feeling it.

Software development groups at tech companies should take some medicine too -- cutting costs, becoming more competitive, improving morale, and attracting good talent at the same time.

How is that possible? Here's one way:

Unless your company is brand new, and you've got geniuses agile-ly coding your hot frog-sticker-collectors' social network, your project has a bunch of history, cruft, and mistakes in it.

It's nothing to get upset about, that's just the way of the world when developing software over the course of years in an organizational setting.

Yet many companies don't make any effort -- or actively resist any effort -- to identify those legacy problems and mitigate them. Perhaps it's fear of blame (why didn't you see this before? why didn't you say something? aren't you guys supposed to be experts?) or fear of appearing backward-looking rather than forward-looking (we'll make do with all that stuff; now can we shrink wrap and ship your latest proof-of-concept and call it next quarter's product?)

Suppose that, instead, your development group listed all the crufty bits that bug them. Stuff that maybe made sense at the time, but just isn't right any more -- wrong code, wrong architecture, wrong network protocol, wrong database, wrong format, whatever. Suppose the team got to rank these in order of annoyance factor, and impediment to productivity. Then, picking the top handful, they got to decide how to use the latest and greatest (and in many cases less expensive) technology to refactor and fix those modules.

A shortsighted manager might complain that 'if it ain't broke, don't fix it' -- it's a waste of resources.

But we know better than that.

In many projects a significant percentage of resources (up to two-thirds in extreme cases, based on my research and experience) can be spent wrestling with these "legacy cruft" issues. So, from a simple economics standpoint, it's definitely "broke" if you're spending $1 million per year on a dev group that could theoretically deliver the same functionality on $333,000, or 3x as much for the same $1 million.
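The arithmetic behind those figures is simple enough to write down. This is just a sketch of the extreme case above; the function name is made up, and the two-thirds fraction is the upper bound I mentioned, not a typical value.

```python
def cruft_economics(budget, cruft_fraction):
    """Return (cost of the productive work, output multiple if cruft vanished).

    budget: yearly spend on the dev group.
    cruft_fraction: share of effort lost to legacy cruft (0 <= f < 1).
    """
    productive_share = 1 - cruft_fraction
    productive_cost = budget * productive_share
    output_multiple = 1 / productive_share
    return productive_cost, output_multiple


if __name__ == "__main__":
    cost, multiple = cruft_economics(1_000_000, 2 / 3)
    print("productive spend: $%.0f, potential output: %.1fx" % (cost, multiple))
```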

A project that removes these old bits becomes more competitive. Why? The competition, particularly if it's a newer company or on a newer platform, isn't running with these parking brakes holding them back. Why should your team? If you can release the brake, you can deny your competitors something they definitely view as an advantage against you.

Moreover, these moves can boost morale and make your company more attractive to prospective employees in several different ways.

Getting rid of old morale-busting code makes everyone feel good.

Using a newer technology -- not as a novelty but because it's solving a real problem better than existing code -- is appealing to developers who want to learn new skills.

Doing a refactor of critical code without breaking existing code, wrecking release schedules, or introducing excessive ops downtime is a challenging and rewarding skill, kind of like working on the engine while the plane is flying -- and top developers relish this kind of "black-diamond" assignment.

Finally, it tells everyone, inside the company and out, that this isn't Initech, where an engineer will have to work on 15-year-old tech for the next 15 years.

Tuesday, October 07, 2008

TextMarks and AppEngine Make Building SMS-Enabled Webapps Simple

TextMarks has been on my radar for a little while. It's a company offering free (ad-supported) access to SMS shortcode 41411, via a secondary keyword of your choice. They also have pro options with no ads, and shorter (or reserved) keywords.

Not having a great idea for a SMS-based app (I think I'm biased from having web-enabled, push-email PDAs for too long), but wanting to kick the tires, I decided to build an iteration of the now-classic "to-do list" app.

Meanwhile, I've been spending a little time with the Google AppEngine. When a hosting account came due for renewal, I decided to see if I could replicate it using AppEngine. AppEngine covers all the basics, and makes it easy to stream BLOBs out of the datastore as well. But if you host files (anything over 1MB), those are not allowed in AppEngine, either as static files or as datastore objects. Too great a magnet for bandwidth or data hogs I suppose. So those files live somewhere else.

But AppEngine makes a fine place to put a small SMS-driven app (assuming you don't need the background processes or scheduled processes that AppEngine doesn't allow).

Registering a TextMark is simple. Set the URL to the one you want called when 41411 receives your keyword, and add dynamic data from the SMS into the URL request GET params using escape sequences (e.g., \0 means "the whole message after my keyword"). So your URL might look something like http://foo.appspot.com/sms?text=\0.

Go into the "Manage" tab, pick "Configuration" and turn off the various options related to "posting messages" -- these aren't necessary for your app. If you aren't planning to asynchronously send messages out to your users (hard to do with AppEngine, as mentioned above), you can also turn off the subscription options.

Now you just need a handler on AppEngine to receive those calls and do something. Whatever the handler renders will be truncated and sent back to the user as a reply SMS, so you get a semi-web-like mechanism for confirming that you've done an operation, or returning results.

Create your AppEngine app, a handler, and yaml that ties 'em together, per the tutorial. Now you can just modify the basic HelloWorld to do your bidding instead.

Here is my little memo handler -- it files memos under the phone number of the person sending the text, and forwards a list of all the memos to an email address if it receives the command "export (email address)." Don't put anything private in there, since there's no PIN/authentication (although it would be easy enough to add) ... so anyone who knows your phone number and can build a URL by hand can just ask for all your notes :)
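For readers who just want the shape of such a handler, here is a minimal stand-in written against the Python standard library only. It is a sketch, not the AppEngine code: an in-memory dict replaces the datastore, the "export" branch only reports what it would send instead of calling a mail API, and the `phone` query parameter (along with however the sender's number gets into the URL) is an assumption, since only the \0 escape is described above.

```python
from urllib.parse import parse_qs

# In-memory stand-in for the datastore: phone number -> list of memos.
MEMOS = {}

REPLY_LIMIT = 120  # reply length for free accounts, per the emulator


def handle_sms(phone, text):
    """Return the reply text for one incoming message."""
    text = text.strip()
    memos = MEMOS.setdefault(phone, [])
    if text.lower().startswith("export "):
        address = text.split(None, 1)[1]
        # A real handler would email the memos here; this sketch only reports.
        return "Would send %d memo(s) to %s" % (len(memos), address)
    if not text:
        return "Send some text to save a memo."
    memos.append(text)
    return "Saved memo #%d" % len(memos)


def application(environ, start_response):
    """WSGI entry point for GET /sms?phone=...&text=..."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    phone = params.get("phone", [""])[0]
    text = params.get("text", [""])[0]
    reply = handle_sms(phone, text)[:REPLY_LIMIT]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [reply.encode("utf-8")]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    make_server("localhost", 8080, application).serve_forever()
```

Run locally, a request for http://localhost:8080/sms?phone=15551234&text=buy+milk should come back with a short confirmation, which is the text the user would see as the reply SMS.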

Of course, it's easy to test out your handler with just the browser address bar, but TextMarks also provides a nice emulator so you can send messages to your handler -- and see what the response will look like with their 120 char (free account) limit and their ad tacked on. And there are a bunch of other neat things you can do, like create short term stateful contexts, where a user can just text back Y/N or a response to a numbered "menu" and the messages will get to your app automagically.

Friday, October 03, 2008

zBoost Cell Extender: Refinement Post

This is a refinement, or follow-up, to my first post about cell signal boosting with zBoost:

While the base station allows decent communication over a specific area once a data or voice call is established, it seems to have little or no ability to propagate the signaling part of the GSM protocols when there is no connection established.

This symptom means that the network is unaware of a handset, and a handset thinks there is no service. As a result, it often will not even try to make a call.

Once the handset decides there's a network -- usually by picking up a very faint signal from a real tower -- it will attempt the call and then discovers a near-perfect connection.

zBoost may not be responsible for this behavior -- the product docs specifically say that you have to have some real signal level in order to use the repeater. I had imagined this restriction was solely due to the zBoost device needing a tower to talk to (it cannot bridge to another backhaul) -- but in fact it may also be due to zBoost's inability to simulate the idle signaling between the handset and the tower. That is, you need a tower to make or get a call, but the zBoost can boost the data in the call stream.

So what does this all mean?

Well, basically that if your RF visibility to a tower is marginal, as mine is, zBoost may still be able to help. But you don't want to rely on this to reach 911 or make any other kind of need-it-right-now phone call, as it might take you a few minutes of moving around or monkeying with the phone to get it to realize there's any service.

Likewise, if you absolutely positively cannot miss an incoming call, this may not be the solution for you, although, interestingly, incoming signaling (calls and SMS) seems to be stronger/prioritized traffic from the tower than idle updates, and so gets to the handset more frequently, without any kind of booster present.

Tuesday, September 30, 2008

Credit Crisis: At Least One Great Thing Already Happening for Silicon Valley

The unfolding credit crisis may have severe implications for the region -- or the world -- a bit down the line. But it is already helping chip away at one of the Bay Area's biggest long-term problems: housing (un)affordability.

I'm going to keep with my narrow technology focus here and, fairly or not, make an analysis specifically of the region's viability as a technology and innovation hub, its ability to route investment into businesses that attract top engineers and churn out world-changing products.

Every year when the region's large business leaders meet, housing is at or near the top of their list of concerns. It's hard to hire in a place people cannot afford to live. Especially when potential employees are talented and often mobile, with a lot of options in front of them.

Owning a house in the Bay Area has been expensive for some time. But -- in the space of the last eight years or so -- it has crossed the line from "painful but affordable" to "not realistically affordable" for the technology professionals smart enough not to take one of those wild-'n'-wacky subprime loans with a handful of bogus low payments up front.

I'm going to be concrete here, and use specific numbers... numbers which may leave you gasping or slapping your head if you live elsewhere in the world, but numbers which are real.

In the wake of the dot-com crash, a large portion of the area's "starter house" stock was available for $350,000 to $500,000. These were often 2-bedroom tract homes, not always in the most desirable locations, but not in the worst either. They were no bargain, but they were affordable for the "senior software engineers" with a decent education, a half-dozen years of experience, and the willingness to save for a real down payment all that time instead of going wild with the Visa card. These engineers had seen their earning power top $100,000 (partly due to the dot-com boom), and if they avoided Aqua when the company wasn't paying, they could sock away some serious cash.

A widely held rule of thumb regarding housing affordability suggests that at most one-third -- perhaps a little more -- of gross income can go to housing. More than that, and the probability of the buyer experiencing financial hardship or outright defaulting goes up sharply.

So the senior engineer, with say six or seven years of professional experience, making a little over $100,000 -- exactly the "heart of the batting order" in a growing tech company's talent pool -- could afford a $400,000-$500,000 house, at around 6%, with the traditional 20% down payment. If they had a significant other who was also earning money, they could afford a little more. But not too much more if they didn't want to be dependent on those full dual incomes indefinitely.

Almost all of the folks in my "cohort" (say, within four years or so of my age) whom I know and who bought houses in the Bay Area did so in this way, at this time, and with this sort of cost.

Fast forward three years.

Easy money and bogus mortgages have caused prices to balloon. Those $400k-$500k houses become $700k-$800k houses.

Salaries have drifted up a touch as the worst "crash" years pass, but not much beyond the rate of inflation, maybe $5k-$10k per year for these mid-level positions. Certainly not enough to cover the change in housing prices.

And -- just like that -- those engineers, the early-to-mid-career core of any tech company trying to scale, the folks who know enough to use a little process and not re-invent the wheel, while still working hard and willing to take chances to innovate and make something happen, have no access to the stock of starter homes.

Beyond the larger down payment, they discover that making a "real" (i.e., based on traditional mortgage terms) monthly payment on this $750,000 house means making over $150,000. And that's well beyond the typical salary for a senior engineer or even a lead engineer / architect type role.
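For anyone who wants to check the numbers, here is a small Python sketch of the calculation. The 6% rate, 20% down payment, and one-third rule come from the figures above; the 30-year fixed term and the 1.25%-of-price yearly allowance for property tax and insurance are my assumptions for illustration, and the function names are made up.

```python
def monthly_payment(principal, annual_rate, years=30):
    """Standard fixed-rate amortization payment (principal and interest)."""
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # number of monthly payments
    return principal * r / (1 - (1 + r) ** -n)


def income_needed(price, down=0.20, annual_rate=0.06, years=30,
                  tax_insurance_rate=0.0125, housing_share=1 / 3):
    """Gross income at which yearly housing cost equals housing_share of income.

    tax_insurance_rate is an assumed yearly cost as a fraction of price.
    """
    loan = price * (1 - down)
    yearly_cost = (12 * monthly_payment(loan, annual_rate, years)
                   + price * tax_insurance_rate)
    return yearly_cost / housing_share


if __name__ == "__main__":
    for price in (400_000, 500_000, 750_000):
        print("$%d house -> about $%.0f gross income"
              % (price, income_needed(price)))
```

Under those assumptions the $500,000 house lands a little over $100,000 in required income and the $750,000 house lands above $150,000, which is the gap described above. Change the assumed tax and insurance rate and the totals move, but the ratio between the two cases does not.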

And there's the story: that bubble cut off the up-and-coming generations of engineers from homeownership. Indefinitely if not permanently. If the Silicon Valley business leaders roundtable thought housing was an issue before, they are in a whole new landscape now.

But this is exactly where the credit crisis is starting to help. It was never practical to build our way out of the housing shortage because not only is buildable land scarce here, but ready credit meant each housing unit got bid up based on what lenders were in the mood to invest.

Now that the brakes are on, we've already seen the median home price in the Bay Area drop by nearly a third from its high in '06. We need a couple more years of this -- together with lenders that want to see payments under that one-third of income mark, and a solid down payment.

When it all shakes out, perhaps some of my friends who missed that narrow window early in the decade and so, despite working hard, excelling, and making serious incomes, missed a chance to own any kind of house, will get their opportunity.

And, if it's too late for them -- after all, families grow, the kids get bigger, and that worn-down starter house won't look so attractive when we're all middle-aged -- at least the next generation of geeks and whiz kids will have a reason to work in Silicon Valley.

Saturday, September 20, 2008

Low-Hanging Fruit: a Server-Side JavaScript API (or Standards, or ...)

There's a big chunk of stuff missing from JavaScript cloud-hosting platforms (like 10gen) as well as from JavaScript semi-app-servers (like Phobos).

It's called any kind of API or standard.

Hard to believe, but after several years of growing JavaScript influence, and a whole web culture that tilts toward openness and standards, all of the players -- Bungee Labs, AppJet, 10gen, Phobos, and many others -- are rolling their own little server-side platform APIs.

Standards make a platform easier to learn, understand, debate, debunk, and fix. They allow a larger community to share code and ideas, and provide a small degree of lock-in-proofing and future-proofing. Standards also allow transparent competition on the basis of implementation quality, tooling, SLA, etc., rather than obscuring those things behind incompatible facades (APIs).

New platforms on new technologies with no standards behind them can be a hard sell -- especially when they do not offer any new capabilities.

According to Techcrunch, Bungee is in a "freefall." And the interesting bit is that their CEO ascribed the recent round of layoffs to 'actual vs. anticipated rates of adoption.'

Hello, if you are trying to sell the world on your server-side JavaScript programming and deployment environment, you're not helping your 'rates of adoption' by also asking people to learn and commit to your own home-brew platform API.

Now to be fair, there aren't a lot of alternatives in the absence of a standard. But ... it would make a lot more sense for all these players to get together and create some standard APIs and commit to using them. The APIs would cover all the basics: e.g., persistence (of object, key-value and relational flavors), templates, request/response handling, calls out to other web services and processing of their responses, publishing SOAP services (which still remains critical in the enterprise world), and interop with other server-side environments (Java, Python, etc.)

Overnight, there would be a single community (and acronym!) instead of a dozen fragments. Like any standard, it would generate books, conferences, training materials -- and controversy, which is never a bad thing when you need publicity. We would see real performance tests, and get a real debate over where the JavaScript-to-SomethingElse boundary should be and why.

And these vendors would gain instant legitimacy by being founding contributors to a specific platform "trend," rather than lone voices in the woods. That legitimacy (and, via the sad logic of large companies, the "legitimacy" of being printed on the top of some conference bag) would help them appear credible to customers big enough to pay them real money.

Thursday, September 11, 2008

BYOA (Bring Your Own Analogies)

I found this brilliant label on the side of an industrial-strength wood chipper/eater/pulverizer.

There are so many other places -- especially in software development -- where a label like this (including both explicit content and implicit assumptions about the attitude of the reader) would be appropriate.

I won't ruin your fun by babbling on about all the specific cases; instead I'll leave you the pleasure and satisfaction that will come as you begin thinking of them.


Tuesday, September 09, 2008

Presentations from SF Flash Hackers August '08

A couple of weeks ago I gave two mini-presentations to the SF Flash Hackers group, on topics I've talked about here before ... but I figured I'd post the slides to slideshare.

Here are slides about porting a large Windows app (Mindjet MindManager 7.2 with Connect) to run in the browser via Flash (developed with Flex):


And here are slides on generating ActionScript 3 code from UML class diagrams using my VASGen tool -- which is really a contribution building on two existing tools, the Violet UML modeler, and the Metaas ActionScript 3 meta-library (in Java):

As3 Code Gen from Uml

Thursday, September 04, 2008

Citibank Needs to Get Their PKI Act Together

While I'm thinking about security ... there has been plenty of debate over whether Firefox 3's hostility toward self-signed certs is a good idea.

Either way, this should be a non-issue in the banking world, which ought to have proper certs on any public facing machines.

So I was more than a little surprised when I went through a workflow with Citi, where a number of links, widgets, etc., triggered the Firefox self-signed-cert blockade/warning. The problem resources were loading from subdomains like foo.citibank.com or bar.citimortgage.com.

When I did some work with [insert other extremely large bank here], everything was SSL, even internal web service and app communications, and usually (not for external customers) mutual authentication. The bank had an entire PKI department, which controlled numerous separate CAs corresponding to dev, test, production, different business units/functions, etc.

Hooking up to most things meant sorting out the right kind of cert to present, and working the proper cert chain for whatever the server gave you. In some situations -- e.g., programmatically sorting this out in Java -- it was a major hassle. Not to mention that the PKI group was cooperative but busy, so there could be delays. And the certs were set to expire in not-so-long, so a whole mechanism was necessary to make sure you didn't have system failures in your department due to not getting a new cert placed in time.
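The expiry-tracking piece is the easiest part to illustrate. Below is a minimal Python sketch, not anything the bank actually ran: the 30-day threshold and the function names are hypothetical, and it relies only on the standard library's `ssl.cert_time_to_seconds` to parse a certificate's notAfter string.

```python
import ssl
import time

WARN_DAYS = 30  # hypothetical renewal threshold


def days_until_expiry(not_after, now=None):
    """Days left before a certificate expires.

    not_after: the certificate's notAfter string,
    e.g. 'Jan  5 09:34:43 2018 GMT'.
    now: seconds since the epoch; defaults to the current time.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def needs_renewal(not_after, now=None, warn_days=WARN_DAYS):
    """True once the certificate is within warn_days of expiring."""
    return days_until_expiry(not_after, now) <= warn_days
```

The notAfter string is in the format returned by `getpeercert()` on a connected `ssl.SSLSocket`, so a scheduled job could walk a list of hosts and flag anything inside the window.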

It would have been much easier to use self-signed certs all over the place, but the bank wanted some extra protection even against rogue calls from inside the network. The policy made sense and, even if it didn't to you, your alternative was to box up your stuff and leave the building.

Of course intentionally deploying a production service to external customers with a bogus cert was so unimaginable it wouldn't have even been funny ... in order to be funny there would've had to have been some molecule of possibility in it, and there wasn't.

Can Citibank really not have controls that prevent this?

Chrome's Unusual Installation Location: Good, Bad, or Ugly?

I -- and many other folks -- have noticed that Google Chrome installs only for a single user, and does so in a way that does not require administrative privileges to run the installer.

Basically, it just drops its files into a subdirectory of the user's home directory, places its shortcuts in the user's specific Start Menu folder, Desktop folder, etc., and arranges for its GoogleUpdate.exe helper app to launch from Windows/CurrentVersion/Run under HKEY_CURRENT_USER, rather than HKEY_LOCAL_MACHINE.
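If you want to see what is registered to launch this way on your own account, a few lines of Python will list the per-user Run entries. This is a minimal sketch using the standard `winreg` module; the function name is made up, and on anything other than Windows it simply returns an empty list.

```python
try:
    import winreg  # only available on Windows
except ImportError:
    winreg = None

RUN_KEY = r"Software\Microsoft\Windows\CurrentVersion\Run"


def list_user_run_entries():
    """Return (name, command) pairs from the current user's Run key."""
    if winreg is None:
        return []  # not on Windows: nothing to inspect
    entries = []
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, RUN_KEY) as key:
            value_count = winreg.QueryInfoKey(key)[1]  # number of values
            for index in range(value_count):
                name, command, _value_type = winreg.EnumValue(key, index)
                entries.append((name, command))
    except OSError:
        pass  # key missing or unreadable
    return entries


if __name__ == "__main__":
    for name, command in list_user_run_entries():
        print(name, "->", command)
```

Note that reading this key needs no elevated privileges, and neither does writing to it, which is the whole point of the pattern.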

This is an unusual pattern for a Windows installer, almost certainly rigged in order to allow minimal-privilege user accounts on corporate networks to install and run Chrome ... under the radar of IT or management policy, if need be.

The question is whether this is inherently a security problem.

On one hand, I've read posts pointing out that this setup leaves the executable vulnerable to other executables that run with the user's permissions. This means another app could replace Chrome with a compromised Chrome, and the user would never know. At the same time, if Chrome can install, then any other malware could install itself the same way -- set itself up to launch under HKCU/.../CurrentVersion/Run, and sit in the background doing anything it wanted (like listen to keystrokes for another HWND). Then again, being in the user's browser might make snarfing credentials and scripting their use (or taking advantage of an auth cookie being present) a lot easier. The point is that a traditional executable under Program Files should be less vulnerable -- a nonprivileged user account can't rewrite those files.

On the other hand ... this is not terribly unlike the install/run routine on *nix servers. If I'm a "regular" user, I'm not installing to /usr/bin, I'm just untarring in a local directory, possibly building, and then running the binary. Of course a user doing this is likely more sophisticated than general Windows users, and fewer *nix end users means less malware at the moment.

Wednesday, September 03, 2008

Always-On JavaScript Mildly Disturbing

Google Chrome doesn't have a switch to turn off or restrict scripts. While this might be an upcoming feature, my guess is that Chrome is about "running" web 2.0 apps, and so JavaScript is considered essential.

Well, maybe that's ok if the security model around JavaScript execution is as fantastic as the comic book suggests.

On the other hand, the launch of this browser featuring a well-known security flaw (admittedly not a JavaScript flaw) makes me a little less comfortable about always-100% script execution.