Monday, January 26, 2009

Is that Service Really a Scalable Cloud or Just Full-Service Web Hosting?

A lot of cloud stacks and cloud app platforms promise scalability for your app -- "With a little EC2 in every box!" (TM). There is a big catch and a little catch, though, and if your app gets big, either or both may be a deal-breaker.

First, and most important: Running a vanilla RDBMS (e.g. MySQL) in a VM somewhere does not make it magically scalable. Read that sentence one more time.

Some cloud offerings integrate tightly with the traditional sort of DB instance you might attach to your web app on a single server. Examples include Heroku, which applies your Rails migrations to a PostgreSQL instance, and Stax, which offers MySQL.

The great thing about these environments is that they don't require significant changes to your standard app built on their supported platforms (mostly Rails and Java variants). Upload, minimal admin, and IJW (it just works).

That's turn-key, full-service web hosting, right there. It's beautiful -- in fact, in an OO and Rails course I wrote, I chose Heroku for deployment as a way to let students get something up and running on the web without getting into the operations/deployment/tuning aspects of Rails which deserve their own course.

But if your app gets large -- or just uses large datasets -- the database is rapidly going to be a bottleneck. Scaling out an app logic tier to a dozen EC2 instances automatically may sound good, but it won't do a thing for a DB-bound app (it may make it worse). And these databases don't scale out without a little architecture, planning, configuration -- all of the things which these cloud platforms are designed to avoid. And which, on some platforms, you cannot do at all.

For example, as far as I can tell, on Heroku or Stax there is no way to even configure multiple servers and replication, which is just a minimum starting point for scaling a DB to multiple machines. Stax may allow for a logical sharding setup, but it's not clear how one would control which VMs and disks the databases run on. Rightscale seems like the kind of firm that would specialize in the management scripts / meta-API one would need to automate sharding, but sharding doesn't appear in any of the models on their website. With replication, which Rightscale does offer (though they're not exactly an app platform, more an infrastructure play), you get to this (still limited) picture:

Other cloud platforms offer datastores specifically designed to scale out, including Google App Engine, 10gen, and others. These platforms offer a non-relational or pseudo-relational datastore, with different data access APIs and a variety of restrictions relative to what you may be used to. These datastores are architected to scale easily, but there are real tradeoffs that must be considered. In fact, if you don't know these tradeoffs cold, you are not the right person to be making this platform decision. Get on craigslist and hire (or borrow) someone who knows the stuff.

The other catch is that whichever approach you choose, these vendors are offering you convenience, some outsourced operations management, and (in some tiers) elasticity and scalability ... but they are not offering cheap compute cycles. That is, if you know you'll need a large, predictable amount of raw compute time, then know also that you're paying a premium to do that computation in one of these environments.

A friend who has designed, built, and operated feature film renderfarms for a number of studios confirmed that he has, on a semi-regular basis, analyzed the costs of remote VM-based datacenters (e.g., EC2) compared to their physical ones. Because the studios use these machines intensely and are consistently consuming raw compute power, the local physical servers have always made more sense.

What does this have to do with your web app and datastore? Well, suppose you have designed your app to leverage a scalable datastore. Such datastores may not be tunable, may not be fast, and may require you to do certain operations in code that traditionally are done in the DB. You may never see these slow queries or operations ... until they show up in your bill. That is, if the system is truly elastic and scalable, it will apply resources as needed to handle your work. If your query or sort or filter takes a lot of CPU cycles, the cycles will be made (almost) instantly available, so the user always sees your app perform well. And then you'll pay for all those cycles or instances at the end of the month.

Either way, there is no free lunch on the data persistence side. Which is not in itself a reason to avoid cloud environments. But it should be a bigger part of the conversation than it is today. And it absolutely must be part of the conversation, if larger businesses are going to move their services into the cloud.

Wednesday, January 21, 2009

Using AppEngine -- Or Similar Datastore -- To Integrate Complex Legacy Data Formats

I gave a lightning talk last night at the SF Bay Area App Engine Developers, showing some work I've been doing to represent gnarly legacy records in AppEngine so as to maintain source fidelity, minimize upfront analysis, and make them easy to integrate with other systems.

I had started with an XML record that I wanted to parse and represent in the datastore -- without knowing which tags and structures would be present, since this format had, ahem, evolved to obscurity over time, as often happens with real-world legacy records.

Before I talk about my approach, here's why I thought this effort might be interesting to the group: a lot of data structures have a tree structure in common with XML. From C structs and file blocks that include a header, telling which types to cast the next n bytes to (and so on inside of those) ... to mainframe "structured data" records I've encountered which consist of nested records, parsed recursively, with their meanings occasionally opaque, lost to history, or belonging to some partner company.

My approach -- which is simply to create a mapping of how to assemble and disassemble the records -- enables a record to be stored in a single App Engine record. But not as a block (or blob) -- rather with fine-grained addressable fields that are easy to talk to using the GAE Datastore API.

In my case, since my original was XML, I created a mechanism similar to a tiny subset of XPath describing the sequence of tags where a data element lived -- but with the characters changed so that it would be Python and GAE-friendly. That is, instead of "/foo/bar[2]/baz" I used _Foo_Bar__2_Baz.

This let me "flatten" the XML into a set of key-value pairs, while allowing that the XML might contain arbitrary structures injected by others ... and that I might want to inject my own extra structures. This arrangement is perfect for the Expando models in App Engine Datastore, or any similar store (e.g. Hypertable, which is modeled after BigTable, or Microsoft SQL Data Services which uses SQL 2008's sparse tables to similar effect).

So now I can store and retrieve my records. Any fields/subrecords which I understand and care about, I can easily work with from other systems, by mapping to the appropriate "key" in the stored record.

For example, if I'm storing a bunch of catalog data, and another system just cares about enumerating each "Product" with "Name" and "Price," then I can create a facade or wrapper in GAE that maps, say, Price to _Strange_Old_Way_To_Represent_Current_Price, and we're all set.

To be sure, there could be performance issues if you tried to use this to create arbitrary queries and reports against the data. That's not really the purpose and, in my experience, if there are no "shortcuts" to processing these legacy records, then the business folks are not used to being able to make an OLAP cube out of them either. (They probably have a batch or offline extraction process.)

Nonetheless, it's another tool in our chest when we need to work with systems and data that have been out in enough real-world battles to come home scarred with lots of cruft.

Monday, January 19, 2009

Twitter's Underwhelming (Former?) Architecture Problem

I recently came across this post from May 2008 comparing Twitter traffic and the Options Price Reporting Authority data feed. Needless to say, the stock market feed is many orders of magnitude larger, at 700,000+ messages per second (!).

It's also not the fairest comparison in the world on its face, for a variety of reasons: the OPRA data system was planned (Twitter met success more or less by accident), Twitter is minimally funded, etc.

A more relevant comparison, in my opinion, is that provided by newzwag, which presented its performance challenges, triumphs, and secrets at a recent SF Ruby meetup.

newzwag's site and trivia game is built on Rails, started small, and had to grow to meet traffic driven by Yahoo and the Beijing Olympics to 9 million pageviews per hour (using a total of a half-dozen machines or so). And lest you think this is a content site served out of a cache, most of the traffic consists of data writes by game players that then need to be ranked, published, etc.

As far as I can tell, that's somewhat larger than Twitter, even considering that Twitter has grown 3-4x since last May's stats.

newzwag's solutions, which they share here, are a study in sanity, reasoned problem solving, and smart efficient architecture.

Without the timelines or resources of a stock-market app, newzwag produced a nice solution that -- at least in hindsight -- appears drama-free.

Interestingly, a newzwag-versus-Twitter comparison can be enlisted to support a variety of different startup narratives.

One narrative is that an amateur-hour effort yields amateur-hour results, and aspiring startups shouldn't fool themselves into thinking that they won't need old-time Architecture and Sophistication to scale.

A different narrative says it doesn't matter -- if Twitter's success is your worst-case scenario, you still win. That is, build it fast, get it out there where people can try it, and you should be so lucky as to need a real re-arch to fix your scaling problems. In this model, both Twitter and newzwag played it right -- newzwag because they knew the Olympics would provide a narrower time window to showcase their system, so they managed risk against that stricter goal.

And yet another narrative says if you accept these two stories, you still wouldn't want your brokerage transaction flowing through a system built to "see what sticks," and hence Web 2.0 startup methodologies stare at mission-critical business apps from across a huge chasm.

I see this last story as persuasive but also as a big opportunity: there is a chasm, to be sure, but it needn't be quite so big. There are legacy mainframe apps that can speak webservices. Every manager in a big company wants their product to be "100% critical" even if they could create more value by admitting that a lot of nice-to-have two-nines business apps are the real bricks in the wall. If enterprises can get better at separating their Twitters from their OPRAs, they can make more money and have both.

Wednesday, January 14, 2009

New iPhone App Store Rules Take a Step Closer to Scriptable Apps

A lot of folks commented today on newly-approved web browsers appearing in the App Store. Or, more precisely, a handful of apps using the existing web browser widget to offer a slightly tweaked browser experience.

While iPhone apps could include the UIWebView component before -- and indeed this has proven a popular route to getting hybrid native/web apps up and running quickly -- today's change is about allowing apps that "duplicate" a built-in feature of the phone. And one of the fundamental characteristics of any web browser nowadays is that it is thoroughly scriptable.

If you build an app this way, it already includes a scripting environment ... so the question (since scripting and dynamic apps are verboten on un-jailbroken iPhones) is how far one can let the scripting go and still pass muster with the App Store overlords.

Using stringByEvaluatingJavaScriptFromString we can inject script into the browser ... including script that pulls data back out.

And although the JavaScript bridge is "one directional" compared to the OSX desktop API, there are workarounds such as registering a protocol handler to receive scripted "requests" from inside the page ... or by hooking decidePolicyForNavigationAction with a script-initiated navigation request (disclaimer: I haven't checked to see if this is in the phone API, but it seems plausible) to signal the availability of data.

So native code becomes effectively scriptable. Or, for an even less controversial but perhaps equally powerful route: just inject a bunch of JavaScript API libraries into the browser and keep the scripting (and more of your app) in Safari. That's not too different from pointing a browser at a web site (where the page loads various scripting libraries) ... except that underneath it all we are in native-caps mode ...

Unless I'm missing something here, a somewhat ambiguous situation has gotten thornier with the admission of this new class of general purpose browser apps.

Monday, January 12, 2009

Windows 7 Product Name is Missing a Feature

I didn't feel strongly one way or the other about the Windows 7 product name (i.e. "Windows 7") ... until recently when I wanted to troubleshoot the Azure SDK on Windows 7. (Apparently Azure on 7 has worked with the M3 build for at least one intrepid forum poster, but it's not behaving with the beta build for me at the moment.)

I started searching newsgroups, forums, blogs, etc., and realized that "Windows 7" is not a great search term.

On an engine like Google you can put quotes around it, specifying exact phrase, but some other full-text search systems don't seem to want to keep the Windows and the 7 together. Or perhaps they have an index by single words, and they link the results together to match your phrase later, but once you throw in other terms like SDK and Azure, the matching engine becomes a little more promiscuous, offering you a "promising" combo of Azure, SDK, and Windows ... or SDK and 7 ... as a higher-ranked match. Making it, in any case, rather harder to find what you want.

One-word product names, like "Silverlight," "Vista," and "XP" work a lot better for this kind of search.

Which is perhaps a reason that folks include the release name with the version number on products such as Ubuntu (Hardy Heron, Intrepid Ibex, etc.)

So ... what would be a good nickname to put next to Windows 7?

Ruby and Python as Cloud Lingue Franche; Ruby/Rails on 10gen

Not sure how this one slipped past me, but 10gen announced support for the Ruby language and most of the Rails framework APIs on their open-source cloud service last month.

This addition is great news for 10gen and for cloud computing (the hosted-application-platform flavor, not the hosted-hardware/datacenter flavor).

For 10gen, support for a well-known API and app model is a huge bonus, which makes it easy for people to move an app into the cloud without learning and coding to new APIs, and also lowers the perceived "lock-in" involved, should the move not work out.

Their original JavaScript platform approach, as I've written before, is problematic not only because folks are unlikely to have meaningful (for their business) apps lying around to try mounting in the cloud, but more so because there is no standard server-side JS API set. A half-dozen companies offer a JS app server or cloud and they all have different platform APIs for even the simplest things, such as reading HTTP request variables, or deleting a session.

10gen takes a big step forward, joining Stax, Heroku, and morph labs in supporting Ruby on Rails in the cloud.

This move also reinforces another emerging trend: Ruby and Python serving as lingue franche for cloud app stacks. While many cloud offerings support JavaScript or other languages, Ruby and Python seem to be emerging as the ones with broadest support: 10gen will support both; AppEngine supports Python and a language-to-be-named-later; Stax supports both; Azure will likely support IronRuby and IronPython (some Python apps can already work in Azure).

Of course, the language is only half of the battle -- there are the APIs to deal with as well, and issues will typically arise where the impedance mismatch is highest with cloud-related infrastructure. E.g., cloud databases are mostly non-relational and don't support SQL ... so an ActiveRecord or SQLAlchemy API won't work on 10gen's 'grid database' (a reasonable tradeoff for simpler scalability.)

Even so, it is starting to appear as though one could write a lot of core business logic using, say, Python, and expect it to run unmodified on most vendors' clouds. Not a bad position to be in for the Python folks.

Sunday, January 11, 2009

Another Windows 7 Milestone: Bounce vs. Hibernate vs. VM Suspend Times

Lately I've started running the Windows 7 Beta for some development experiments, using VMWare's fantastic dual-screen support. As I've written before, the general experience is great, even under virtualization with 1GB of RAM, and having it wall-to-wall on multiple displays makes the illusion more convincing.

An interesting thing I've noticed involves an old habit of mine: when I want to stop working in a VM and free up the resources, I suspend the VM. This action is roughly (but not exactly, depending on the VM software you're using) equivalent to "hibernating" a laptop (S4 power state) -- memory is mapped out to a file and the device is powered off.

I usually do this because this hibernate/wake is faster than a shutdown/boot-up, not because I'm trying to save my actual work state (open apps, etc.) Especially with Windows server, but even with XP and Linux, this approach is the faster way to hop in and out of a work session. On the laptop, it's a way to save the battery power involved in a longish hard boot.

In Windows 7, VM suspend/resume (==hibernate/wake) seems to be slower than shutdown/boot. That is, even with no user apps running (which could take up an arbitrary amount of memory and thus lengthen the map-out / map-in time), boot seems faster. I say "seems" because I have only 2 machines to play with, and they are not clean images just for this test, so I won't pretend they represent absolute objective truth.

What does this mean?

It would appear that the boot process has been cleverly streamlined so that a cold machine gets to a running, usable state before all of the additional services and apps have fully loaded and gotten running, and that this is orchestrated using knowledge that a white-box VM player doesn't have.

Some folks may point out that having to reboot an OS is itself questionable ... and indeed the boot is optional -- I only reboot my XP desktop every couple of months when some security patch or other requires a restart.

But in the world of laptops and netbooks, things are different: every minute of juice is valuable, so there's always the consideration of the cost of a hibernate/wake vs. sleep vs. leaving it on with the LCD off. And that equation has just gotten a little more interesting: for Win 7 on a laptop, if you're not going to be using the machine for a while, it may turn out to be faster and use less power to do a shutdown, and just reboot later.

While this may seem like a fairly inconsequential gimmick about boot times, it is a step in the right direction as we look at the huge array of gadgets we all use and which eat a ton of phantom power. The Windows PC is kind of the holy grail for a fully-off / instant-on experience, and Win 7 appears to take a measurable step in that direction.

Wednesday, January 07, 2009

Embarrassment of Meetup Riches and a Suggestion for Compelling Talks

In the Bay Area, we are fortunate enough to have so many great tech meetup groups that there are frequent collisions, and it's always a bummer to pass on what promises to be a great presentation.

On January 20, the SF Java group has three sharp guys presenting on Scala.

Meanwhile, down in Mountain View, the Google App Engine group has a "hacking..." night with a handful of presentations, including an update from Google AppEngine Product Manager Pete Koomen.

What to do? Hmmmm...

Ultimately, I decided on AppEngine.

Principally, it appeared that more new, not-readily-available-today-on-the-net material would be presented at the AppEngine group. Lightning talks give a forum to quirky thinkers, very early startups, and other interesting folks, while representation from the mothership might be able to offer a little detail or timeline on upcoming features, like large BLOB support and the next language.

The Scala talk just seems less likely to include info that I can't get from existing resources.

Which leads to the following conclusion about what makes for more compelling talks, at least for an audience of me:

A focus on information that is not readily available, and which gains from the presence and experience of the speaker.

So, for example, I've seen many talks on "my cool [insert language] library that does [insert function]."

A fine topic. Now in the execution, perhaps best to talk about the problem being solved, how you solved it, what tradeoffs were made, constraints dealt with, any magic foo inside ... rather than a bunch of examples showing how clever/elegant the external API is and what one can do with it.

Not that the latter is unimportant, but the latter is (or should be!) readily available from the online docs/examples or the presentation notes; whereas the former represents the specialized knowledge and experience of the library's creator.

Tuesday, December 30, 2008

On the Wide Range of Ruby Talent: Rails as a 4GL(?)

For a long time, I was puzzled by the extremely wide range of talent among Ruby programmers: more than in any other language I could think of, "Ruby Programmers" seemed to range from the truly clueless to hardcore, accomplished engineering veterans.

I had more or less chalked this observation up to the language and APIs themselves -- easy to get running with, highly intuitive ... and yet packing deep and powerful features that would attract the more sophisticated (easy C integration, metaprogramming, continuations, etc.)

On top of this was Rails with its famous screencasts showing database-backed websites automagically constructing themselves, a feat that got novices very excited, and reminded experts that they should be asking harder questions of any other framework they were using for the same kind of app.

But lately I've come up with another -- or perhaps a stronger variant of the same -- hypothesis.

Rails itself aspires to be a 4GL -- a DSL together with a framework, specializing in database-backed, RESTful web application development.

It appears that some programmers see (or want to see) just the 4GL parts (the DSL, the declarative bits, the "conventions" and the "magic") while others see a 4GL in close symbiosis with a 3[+]GL (Ruby) for doing procedural work.

In some sense, both groups are seeing accurately -- for certain problems. That is, apps that truly hit the "sweet spot" for which Rails was designed, and which do nothing else, can be treated as compositions of 4GL bits much of the time. Highest productivity, get your check, go home, watch a movie.

For other problems, additional logic and/or integration is required.

And here's where the two groups of programmers part company. The pure-4GL-ers want to search high and low for a plug-in, gem, or other black-box component to solve the problem. Even if the black-box component is harder to use, is poorly documented or maintained, or lacks a test suite verifying it does what it claims, this group of coders wants the plug-in. They'll search and argue and try different ones even if the functionality is 15 lines of Ruby code. But they won't write the functionality themselves.

The other group just writes the code. Perhaps they even bundle it into a plug-in and throw it on github. They also do things like submit patches to Rails itself.

Depending on the situation, either approach might be fine ... but: when something doesn't go right, in terms of logic or process or performance or data integrity or something else ... the first group hits the wall. All of a sudden it's a "Big Problem" ... whether or not it's really a big problem. That is the Achilles heel of all 4GLs: if you view them as "closed" systems that require pluggable components then anything outside their bounds becomes a "Big Problem."

And lately I've watched as some participants in the Rails community systematically leap tall buildings in a single bound when they encounter them, while others literally sit on their hands waiting for a plug-in to appear (with no track record or test cases), which they then gratefully throw into their app and hope for the best.

Monday, December 29, 2008

Fear of the Driver Disk

The problem of bloatware/crapware on retail PCs is well known -- to the extent that Apple makes fun of it, pointing out the absence of such software on new Macs, while PC tools exist just to clean it.

But bloatware has a less-famous, equally annoying sibling: all the garbage that brand-name hardware devices install off their driver or utility disk.

Pick up a peripheral -- printer, web cam, DVD drive -- from a major brand, and if you follow the automagical installer on the driver disk, you'll get a half-dozen apps that you may not need or like. In some cases, they're just bad apps (that have a habit of arranging to start at boot), while in other cases they can destabilize a perfectly-running system.

The problems are that

  1. in some cases you do need these apps, because some hardware features require "support software" to be present, and don't fully leverage the many built-in drivers and control panels available for free in Windows ...
  2. most hardware companies internally view the driver/utility software as an afterthought, writing it hastily, testing it inadequately, and staffing it with ... well ... whomever they can find.

There are two main remedies.

In many cases, getting an unbranded or "strange-branded" device is a smart idea (provided you know what you're getting). I've found these devices have straight-forward, minimalist support apps, make great use of built-in Windows drivers, and don't put any garbage on your system -- for the simple reason that they don't have the resources to write a bunch of half-baked apps, or to form "distribution partnerships" with people who do.

If I do have a brand-name product, I generally attempt to install it without its own driver disk, no matter what the instructions say. In many cases, the device is fully supported by Windows out of the box; in other cases, some features may not be available -- but I may not need them. (E.g., if I wanted to use my digital camera as a webcam, that would have required the vendor driver disk ... but I have never wanted to use that feature of the device.)

And if that latter approach fails, it's pretty easy to uninstall the device or otherwise convince the PC it's never seen the device before -- so that you can go the RTFM route and use the supplied disk.

Tuesday, December 23, 2008

Stax Brings more Standard App Models to the Cloud, Marginalizes Custom Platform Further

Stax, which recently launched in private beta, is a cloud hosting/deployment/ops platform based on Java appservers. The coolest thing about Stax is that it offers many flavors of JavaEE-deployable applications, including Python apps (via Jython) and Rails (via JRuby) with ready-to-roll built-in templates.

Stax has a very AppEngine-y feel, not just on the website, but in terms of the SDK interactions, local development, etc.

This is good news for all of the popular platforms ... and bad news for those rattling around the corners with non-standard APIs. As the app-hosting industry continues to mature, the emphasis will clearly be on established players like Rails, ASP.Net, JavaEE, Pylons, et al. at the expense of guys like AppJet.

It's not about the language (JavaScript) but about learning a set of APIs and patterns/practices, and sustaining a community ... based on a roll-your-own platform.

It is true that some of these built-for-the-cloud platforms were designed from the start to default to hash-style or big-table style storage -- popular for content-oriented cloud apps because of its easy horizontal scaling -- where the "traditional" platforms focus on relational stores and have a variety of bolt-on APIs for cloud DBs.

But now that there are so many standard alternatives, it is unlikely developers will pay any custom-platform-tax no matter how elegant that platform might be.

Thursday, December 18, 2008

We May Look Back Fondly on Our 68% Project Failure Rate

The report [PDF] referenced here -- focusing on failure originating in the business requirements realm and offering a "68% project 'failure'" headline -- inspired two thoughts.

First, it remains a head-scratcher why nearly every project, despite talk of risk analysis and mitigation, expects to be in the 32% success area. Even if a company has the best resources (and generally it will not), many causes of project failure originate in organizational factors -- friction, process, politics -- and externalities (e.g., no budget this quarter for the tools in the original plan).

Since these issues are rarely known -- and, I would argue, actively denied in the sense that whistle-blowers and problem-spotters are systematically excluded from planning (at best) or forced down or out -- the average "expected value" of the d100 roll is way less than a 68.

In startups, I've heard the excuse that the whole company is a "long shot" -- as though that justifies taking on disproportionate risk in each project implementation.

In enterprises, it seems as though management is so used to failure (in a timeline or budgetary sense), that they have simply redefined success to mean anything that they can pull off in their tenure -- and if that means a system that kinda works, but no one likes, shipped in 200% time and 400% budget, well, that's just the "reality" (which, to be sure, it is once all other possible paths have been discarded).

This redefinition of success also has the side effect of making strict accountability impossible and pretty much assuring a nice bonus.

Another thought inspired by the report is how the gap between enterprise development and small / startup development is widening. On the one hand, large businesses could benefit from the high-productivity tools and agile approaches popular with startups; for a variety of reasons ranging from policies to personnel, they are not exploiting the latest wave of technology, and it's costing them.

On the other hand, what they need regardless of tech is solid analysis and estimation capabilities. Analysis and estimation are possible, but hard, and the agile camp has moved mountains to try and reframe the problem so that they can advocate punting on all of the hard parts of these disciplines. That works great for the hourly agile consultants, but inside large businesses and large projects, it just doesn't cut it. A business needs to be able to attempt to estimate and plan months or even years of a project (hence the prevalence of "fake" agile in companies that purport to use it).

The fact that the business does a horrible job with the estimation today does not mean that the organization (which, generally, is run by business people not developers) won't keep planning and targeting in the large.

The result of these two pieces (different tech, different process) is that enterprise and small development are moving farther and farther apart, which is a damaging long-term trend. Ideally, these two groups should be learning from each other. They should spend more time moving in the same world.

The enterprise learns that it probably doesn't need JavaEE and Oracle and a big planning process for myriad internal utility apps that could be done with Rails at a fraction of the cost and effort. The small company learns that relational integrity, transactions, estimation, and operations management are sometimes both necessary and profitable.

Even more importantly, individual employees in the industry can cross-pollinate ideas as they move between these environments over their careers, sorting fact from fiction and getting better at determining what will work.

The farther apart these groups are, the less appealing it will be for "startup guys" to work in an enterprise or a startup to hire on an "enterprise guy."

This trend -- since it keeps useful knowledge hidden -- can only help a 68% failure rate go up.

Friday, December 12, 2008

How is Google Native Client Faster Than -- Or As Safe As -- JVM or x86 VM?

When I saw Google's proposed native-code execution plug-in earlier this week, my initial reaction was: "Don't we already have those, and they're called exploits?"

I decided to mull it over a bit, and I still don't like it. While the idea of sandboxed native x86 execution based on real-time analysis of instruction sequences makes for a great Ph.D. thesis or 20%-time project, it sounds like an awfully large attack surface for questionable benefit.

Here's what I'd like to know from a practical nuts and bolts point-of-view: how many "compute-intensive" scenarios could not be implemented effectively using either (1) the Java VM or (2) a plug-in based on widely-available open-source machine virtualization, running a tiny Linux snapshot.

While the JVM falls short of native code in some places, it can be as fast -- or even faster -- in other cases (faster because the runtime behavior provides opportunities for on-the-fly optimization beyond what is known at compile time). Yes, there are issues with clients not all having the latest Java version -- but that seems a small issue compared with the operational issue of deploying a brand-new plug-in or browser (Chrome).

Another approach is to use a plug-in that runs a virtual machine, which in turn runs a small Linux (started from a static snapshot). User-mode code should run at approximately native speed in the VM, which should cover the pure computation cases. In the rare situation where someone wants to run an OLTP file-based db (which would behave badly in a VM environment) or get access to hardware-accelerated OpenGL facilities, specific well-defined services could be created on the host, which code could access via driver "ports" installed in the VM image.

These approaches to the security boundary -- Java VM and x86 VM -- are well-known, long-tested and based essentially on a white-list approach to allowing access to the physical host. While Google's idea is intriguing, it sounds as though it's based on black-listing (albeit with dynamic discovery rather than a static list) of code. I'm not yet convinced it's necessary or safe.

Tuesday, December 09, 2008

Yes, VC May Be Irrelevant if it Continues Focusing on Weekend Projects

Where are Web 2.0's Amazon.com, PayPal, Google, or Travelocity? They were never funded.

Paul Graham and now Eric Schonfeld are extending the several-year-old "VCs don't know what to do with Web 2.0's super-low-cost startups" meme. The extension has to do with the recession, arguing that it will accelerate the decline of VC relevance, as VCs become more reluctant to fund these shoestring startups, and more entrepreneurs pull a DIY anyway and ignore VC.

Well... maybe. The VC problem with micro-cap micro-startups is real, in the sense that it's a real math problem where the VC fund size divided by the proposed investment size equals too many portfolio companies to interact with, and a kind of interaction that is a big break away from the old model.

But the easiest fix is for VCs to simply invest in more expensive businesses. The current predicament is a classic chicken/egg: after the dot-com crash, VC money was hard or impossible to get, so the businesses people started were these undergraduate-level 1 man-year (or couple-of-weekends!) efforts.

Small product, small team, small money. We got things like 'Remember the Milk.' Cute, sure. Useful, sure. But nothing too hard or too ambitious. Nothing a tiny shop -- or a bored college student -- couldn't hack out in their spare time. Indeed many were spare time or side projects.

Some of these apps got a lot of users, especially where network effects were involved, and eventually VC wanted nothing that wasn't viral, network-effect, social. Never mind that there was never big challenge or big value in those. Facebook is the biggest thing in Web 2.0 at the moment, and it's nothing but network effect and questionable monetization. "Not that there's anything wrong with that..." It's a fine, useful application. But in the absence of any real substance the model became more Hollywoodesque, more personality- and connection-dominated than it should have.

So where is the Amazon? PayPal? Google? Travelocity? Ariba? Netflix? Danger (maker of the Sidekick devices)? Those big projects that take tens of man-years, maybe hundreds? The projects that can't "launch in beta" after a few months and acquire tens of thousands of fanatical users because they're more than a glossy AJAX UI on a local database?

Where are the startups that drag whole industries whining and screaming into the 21st century and liberate billions in value, trapped in transactions that real people make every day?

Those big projects haven't been A-round darlings for a long, long time. VCs, terrified of risk, moving in a tight pack, loved the new ethos: let the entrepreneur build a product, get it launched ("beta"), get customers, get mirror-hall PR (blogosphere), then later drop in a few bucks. Less risk. But not easy money. Count the exits.  And the ad-only valuation has no more magic today than it did in 1998.

And no one even notices when hundreds or thousands of these little pownces and iwantsandys hit the deadpool.

It's no coincidence that many of the step-by-step tutorials for new frameworks and tools teach you to clone a blogger, a flickr, or a wiki farm in a sitting. After all, those are trivial undertakings, lacking only for a network-effect mob signing on. And we'd all have a good laugh about widgets one day... if we weren't laughing so hard already.

Can you imagine a tool vendor of the dot-com era giving an hour tutorial that produces a working Travelocity clone? a working PayPal clone? It's sketch comedy, or something sadder and more disturbing. Frankly, the most ambitious projects in all of web 2.0 are the tooling and infrastructure plays that are largely open source.

Investors, entrepreneurs, engineers and end users might all do well by hunting some bigger game.

Thursday, December 04, 2008

Approaches to the Silverlight Dilemma: Cashback? Vista SP? Win 7?

The jury's still out on Windows Live Search Cashback -- apparently there were some issues on Black Friday and, overall, things aren't up to goal ... but that could change.

In light of this cashback program, though, my proposal from last year that Microsoft simply pay people to install Silverlight seems astonishingly realistic.

The problem with Silverlight is that it doesn't have wide enough deployment -- so there is a chicken-and-egg problem deploying apps on the platform. Microsoft has released stats about how many people 'have access to a PC running Silverlight' -- but that's an odd definition of penetration (if a state university has 15,000 students, and 10 computers in a library lab have Silverlight, then presumably all 15,000 'have access to a PC with Silverlight'.)

So ... why not just pay folks a small amount to install? Maybe a $2 coupon (Starbucks/Amazon/PayPal/etc.) for installing the plug-in in the user's default browser.

The WGA or XP key-checking components could be used to keep track of machines that have participated, making it impractical to game the system by performing extra installs. I'm sure a suitable equivalent could be used to verify installs on Macs and Linux boxes.

Another approach is to wait for a future Vista SP or Windows 7 upgrade cycle. Since Silverlight will presumably be included for IE, the trick is to hook the default browser selection and attempt to install a suitable version of Silverlight into the target browser (Firefox, Chrome) when the user selects it.

This move would surely be controversial -- even if there were a way to opt out -- and would bring back memories of the Microsoft of the 90s for some. But, since the web page itself decides what plug-ins to use, and no content is "forced" to render via Silverlight, it's only moderately different from making WMP or any other dynamically loadable object available.

And for those "open web" advocates who have nothing good to say about proprietary plug-ins ... consider that Silverlight does not really compete with the open web, but rather with Flash, which otherwise has a de facto monopoly on any RIA that cannot easily or cost-effectively be implemented in HTML/JavaScript. Breaking that monopoly could actually help the open web cause by creating a real platform debate in a space that doesn't really have any debate right now.

Tuesday, December 02, 2008

Microsoft Begins Posting "Proceedings" Docs for PDC Sessions

Microsoft has started posting an accompanying document for each 2008 PDC session, containing a detailed content summary, links, and indexes (into the video stream) for demos.

These docs, called "proceedings," had been part of the plan to publish the PDC content this year, and it's nice to see them showing up alongside the slide decks and full-length session video streams that had already been posted.

While not full transcripts, or even detailed outlines, the information value of these docs is far higher than the full-length videos (per minute of attention investment), so I highly recommend them if you have interest in any of the newest Microsoft technology.

Docs, transcripts, and outlines seem to be a more efficient way to present content for multicast consumption than live video. While full-length web video is cheap and easy to produce, it wastes an enormous amount of time on the viewer side. Instead of reading/scanning the content as fast as the reader likes, he or she must sit through the lower-information-density spoken content and -- what's worse -- is forced to do so in a linear fashion. The time loss is then multiplied by all of the viewers ...

If an employer is paying for your time watching videos -- whether they be technical sessions, product demo videos (all the rage now), or so-called webinars -- then perhaps it's nice to sit back, have a cup of tea, and go into watcher mode.

But if you're investing your own time/resource/productivity into acquiring some knowledge, it's nice to (1) be able to do it at your own reading pace and (2) be able to skim/scan sections of less value at high speed and pop out into 'careful reading mode' as necessary.

Sunday, November 30, 2008

Most Laser Printers are Razors, not Cars

Once upon a time, buying a small or midrange laser printer was like buying a car. Big upfront expenditure, lots of sweating the details, and a moderate amount of thought about gas mileage and scheduled maintenance, er, toner and fusers and all that.

Now, however, it's more like buying a razor. The core features are mostly reasonable, each "size" printer has a speed/duty cycle that determines its suitability for an installation, and the cost is so small that it's all about the consumables (blades).

So why won't vendors -- or manufacturers -- print the cost-per-page of consumables right next to the dpi, ppm, idle power consumption, and time-to-first-page?

It's easy to find 10x differences between otherwise similar printers in the cost-per-page based on the manufacturer's price for toner cartridges and their intended yield.

Big companies, of course, have IT purchasing folks who perform these calculations, factor in the discount they get because the CIO plays golf with the right people, and order the gear. In the case of printers, large companies are typically buying high-volume printers that are among the cheapest per page anyway.

But startups, professional practices (think doctors, accountants), small to midsize businesses -- they rarely calculate the TCO for each device. It would be helpful to have the consumables price per page listed right on the sticker, like MPG.

Saturday, November 29, 2008

For Individuals, Top Laptop Values May Be At Low-End Prices

About a year ago, when I started my latest stint doing 100% contracting, I realized I would need a laptop for meetings, client presentations, etc. Not for coding -- I've written about my views on that, and I'll never trade away my nuclear aircraft carrier of a dev box for an inflatable dinghy just so I can hang with hipsters at the coffee shop.

Since I wouldn't be using the laptop much, and historically the many company-provided laptops I've used have turned out to be poor performers, malfunction-prone, and costly to deal with, I resolved to get the cheapest new laptop I could find. (Laptops have a strange depreciation curve, with the result that a cheap new laptop is often a much better device than a used one at the same price.)

In addition to the usual holiday sales (you can guess what prompted this article now), the whole Vista-Basic-capable-not-really thing was going on, with the result that many machines were being cleared out for good, considered illegitimate for Vista and not saleable.

I snagged a Gateway floor model at BestBuy for under $300, put another gig of RAM in ($30?), snagged XP drivers, and installed Windows Server 2003 R2, since I had found that Server 2003 was very light on the hardware while offering more than all the benefits of XP.

At this price point, I figured the laptop bordered on disposable, and I would simply try to keep the TCO from getting high, come what may.

Well, a year or so on I have some results to report.

It has performed far beyond my expectations (and as a developer my expectations tend to be unreasonably high).

The only negative is the comically poor build quality -- this is a machine that one must literally 'Handle With Care' as it's built about as well as those tiny toy cars out of a quarter-vending-machine. I think I could snap it in half if I tried, and a careless twist could rip the drive door off or crack the case right open. The keyboard rattles a bit on some keys.

I have a padded briefcase and the machine was never intended for "heavy duty," so that wasn't a big deal for me. And, in any case, it seems more of a reflection on Gateway than on the price point, since, e.g., Acer offers rock-bottom laptops with much higher build quality.

That issue aside, the machine has performed flawlessly. No problems with any part of it, despite being a display model. And performance adequate to some real programming.

The ergonomics are poor for programming (single monitor, low-ish resolution, etc.) -- but it snappily runs NetBeans/Ruby/Java; Eclipse/Java plus various mobile device emulators (e.g., Android) which I needed for a course I taught this summer; even Visual Studio 2008. I do run MySQL rather than SQLServer, in part to keep the load down.

Let's see ... what else has run well on here? ... the Symbian S60 emulators (along with dev tools) ... Sun's VirtualBox virtualization software, with an OS inside. All the usual productivity stuff (Office 2007, MindManager 8) ... Microsoft's Expression design/UI tool suite ... video encoding software and high-bitrate x264 streams from hi-def rips ... often many of these at the same time. Everything I've asked it to do, it does seamlessly.

My conclusions are that sturdier laptops may well be worth it, especially for corporate IT departments -- I'm thinking about products like the ThinkPad and Tecra lines, where the price doesn't just reflect the specs but also a sturdy enclosure, standard serviceable components, slow-evolution/multi-year-lifecycle per model etc.

But for an individual, unless you have a very specific hard-to-fill need (e.g. you want to do hardcore 3D gaming on your laptop or capture DV, a bad idea with a 4200 RPM HDD), the top end of the value equation for laptops appears to be at or near the bottom of the price range. When one considers that higher-end peripherals (e.g., a BlueRay writer) can easily be plugged in via USB, and a faster hard drive will snap right into the standard slot, the value-price equation seems to get seriously out of whack for those $1200-$2500 machines.

That's not to say these higher end machines are not great ... they just don't represent value at the pricepoint. Just as a Mercedes E-class is a fine car, but the radio commercials that try to make it out to be some kind of value purchase are downright funny, I think the same applies for the high-end VAIOs, MacBook Pros, etc. Those machines are a style and brand statement for people who care about making such a statement.

This possibility is interesting because, in most products, the "optimum value" position is somewhat above the bottom end of the price range ... that is, the least expensive products are missing critical features, making them a poor value, while the high-end ones charge a "luxury premium." If laptops are different, that seems worth noting.

The usability of an ultra-cheap laptop also suggests a response to folks who commented on my earlier article, saying that companies are loath to buy both a desktop and a laptop, so if an employee needs any mobility at all, they get a laptop. It appears a good solution might be to provide a high-end desktop and an ultra-cheap laptop. At these prices, the employee's time costs more than the laptop, and my experience suggests little productivity (given remote scenarios such as a training class or client demo) is sacrificed.

Tuesday, November 25, 2008

Do FlexBuilder and MXMLC Really Feature Incremental Compilation?

I use FlexBuilder in my work, and, overall, it's a decent tool. Eclipse gets a lot of points for being free; Flex SDK gets a lot of points for being free. FlexBuilder doesn't get points because it's basically the above two items glued together along with a GUI builder, and it costs real cash.

Wait, I'm off track already. The price isn't the issue for me. Rather, I want to know why FlexBuilder doesn't feature incremental compilation.

Hold up again, actually, I guess I want to know how Adobe defines incremental compilation since they insist that it is present and switched on by default in FlexBuilder.

Now, if I make any change (even spacing) to any code file -- or even to a non-compiled file, like some html or JavaScript that happens to be hanging out in the html-template folder -- FlexBuilder rebuilds my entire project. And it's a big project, so the rebuild, even on a 3.6GHz box, means a chance to catch up on RSS or grab more coffee.

Interesting take on incremental compilation. See, I thought the whole idea was to allow compilation of some, ah, compilation unit -- say a file, or a class -- into an intermediate format which would then be linked, stitched or, in the case of Java .class files, just ZIPped into a final form.

Besides allowing compilation in parallel, this design allows for an easy way to only recompile the units that have changed: just compare the date on the intermediate output file to the date on the source file. If the source file has changed later, then recompile it. It does not appear that this is how the tool is behaving.

Perhaps this logic is already built into FlexBuilder -- mxmlc, really, since that's the compiler -- and the minutes of build time are spent on linking everything into a SWF. Since Adobe revs Flash player regularly, and many movies are compiled with new features to target only the new player, it should be possible to update the SWF format a bit in the next go-around, so that linking doesn't take egregiously long.

Apparently, at MAX this year, Adobe has started referring to the Flash "platform" -- meaning all of the related tools and tech involved around the runtime. Fair enough, it is a robust ecosystem. But "platform" kind of implies that the tools support writing real -- and big -- applications, not just a clone of Monkey Ball or another custom video player for MySpace pages.

Sunday, November 23, 2008

Software Discipline Tribalism

Unfortunately, people seem rarely able to stop at a reasoned preference -- e.g., "I like X, since X may offer better outcomes than Y" ... and too often end up, at least whenever group persuasion is involved, somewhere more dramatic, personal, extreme, and narrow -- e.g., "I'm the X kind of person who rebels against Y, since Y can offer worse outcomes."

This is as true in cultures of software development as anywhere else.

While many people have been guilty of this unproductive shift in attitudes, it seems much of the Agile development, dynamic languages, small-tools/small-process/small-companies crowd has long since fallen prey to it.

To be sure, there was much temptation to rebel.

At the start of the decade, proprietary Unix was strong; many processes came with expensive consultants, training, books, and tools ... and overwhelmed the projects they were meant to guide; many tools were expensive and proprietary. Web services were coming onto the scene and large players, with licenses and consulting hours to sell, created specs that were unwieldy for the small, agile, and less-deep-pocketed.

When the post-dot-com nuclear winter set in, small companies had no money to pay for any of that, and we got LAMP, Agile, TDD, REST, etc. Opposition to OO, which had been strong in many quarters, suddenly faded as OO was no longer identified (rightly or not) with certain problematic processes. Ironically, many new OO language fans had been ignoring the lightweight, free (speech and beer) processes that some OO advocates had been producing for years.

These have all proven to be useful tools and techniques and have created whole companies and enormous value ... but somewhere along the line, instead of being in favor of these tools and techniques because in some cases they produced better outcomes, either the leaders or the converts started thinking they were the rebels against anything enterprise, strongly typed, thoroughly analyzed, designed and well tooled.

This shift in attitudes does not help the industry ... nor even the clever consultants who lead the charge, deprecating last year's trend for a new one which they just happen to have written a book about.

We desperately need a broader perspective that integrates all of these pieces. There are things manually-written tests just won't do -- tools like Pex can help immensely, even if (or because...) they are from a big company.

Analysis and design are not bad words, while Agile can get dangerously close to simply surrendering to the pounding waves of change (and laughing at goals all the way to the bank) rather than building against the tide and trying to manage to a real outcome on a real budget.

Static languages can get hideously verbose for cases with functor-like behavior (Java and C# [pre-3.0], I'm looking at you). At the same time, go talk to some ActionScript developers -- who have had dynamic and functional for years -- and you'll see an amazing appreciation for the optional strict typing and interfaces in AS3.

REST is great, but in playing at dynamic, it turns out to be rather like C -- it's as dynamic as the strings you pipe into the compiler, and no more. Absent proper metadata, it cannot reflect and self-bind, so it sacrifices features that dynamic language developers love in their day-to-day coding.

Ironically, most of the critical elements of this "movement" -- along with open source -- are being subsumed into the big enterprise software companies at a prodigious pace. Sun owns MySQL and halfway owns JRuby; Java servers may soon serve more Rails apps than Mongrel/Ebb/Thin/etc.; Microsoft is all over TDD, IronRuby, IronPython...

I suppose the sort of tribalizing we see here is at least partly inevitable in any field. But it would serve the entire industry if that "part" could be made as small as reasonably possible. As a young industry with a poor track record and few rules, we ought to be more interested in better software outcomes than in being rebellious.