Sunday, April 27, 2008

Don't Feel Bad, It's Also Amateur Hour at Analog Devices

Analog Devices, whose D/A, A/D, and DSP circuitry powers on-board sound in everything from ThinkPads to ASUS' main line of motherboards, makes a wicked bad driver suite.

Last summer, I was troubleshooting a situation where laptops were exhibiting lots of really short freezes at a close-to-the-metal level ... keystroke input being delayed, or mouse cursors getting jittery. Turned out to be DPC storms resulting from the Analog Devices SoundMAX driver and usermode controls.

Today I was struggling with the latest flavor of this driver set, which attempts to detect what you've plugged into which port, and then auto-configure. Which it does wrong, forcing you to come up with tricks to get basic things like a headset and microphone working. Guys, we've had the "green" and "pink" sockets on sound cards for like 15 years now, give me some credit here.

But that's not what got me really annoyed. No, these guys have gotten themselves really confused about how multi-user logon and fast user switching work on XP. They've only had seven years to get that right. They pop up some of their controls and wizards in the wrong user session ... when I tried to close them and switch back to the account I needed to be on, their driver blue-screened my box.

As a developer, I install all kinds of stuff on my machines and I often torture them in unseemly ways. But, since 2001 when I started using Windows XP, this is the first time I've had a blue screen during a regular old user session, after a successful boot.

This is a WHQL certified driver, too. For a long time the problem ran the other way: developers didn't get their drivers WHQL certified at all, which led to users clicking the infamous "continue anyway" button and to companies ignoring WHQL. I'd hate to suppose Microsoft solved that problem by letting junk like this pass the test.

Saturday, April 26, 2008

DVD Flick Installer Needs to Rewrite a System Library ... Why?

I feel bad ripping on the crew that builds DVD Flick, a well-regarded open source tool. From what I can tell, this is a fully-open, Sourceforge-hosted, quick-authoring tool similar to the excellent ConvertXToDVD.

During its installation, though, it attempts to overwrite the richtx32.ocx file in the Windows\System32 directory. Richtx32.ocx is a Microsoft-authored GUI widget component (an ActiveX library) present on all or most Windows systems.

Trying to overwrite this is bad on so many levels. Who knows which version of this lib is being supplied by DVD Flick? The installer definitely doesn't compare versions and make an intelligent choice, or it would have chosen not to even try on my systems, where everything is up to date. So it's an old version, maybe one with bugs or security vulnerabilities, that could compromise my system because any process might load this code up.

Not to mention it appears to the user that the app needs to alter their system in order to continue. Since they trust this app (it's been written up in blogs like LifeHacker), they start thinking "oh, sometimes free apps I download off the Internet need to overwrite my system libraries, no big deal" ... all the UAC in the world won't be able to overcome that mentality (although theoretically Vista has other protections to remedy this specific problem).

I tried to think of any justification for this behavior, but I couldn't. True, there are aspects of the ActiveX/Registry/DLL Hell problem that can make something like this a little tricky, but there are workarounds as well.

Starting with the easiest fix: include a legit Microsoft merge module containing the latest DLL version, and have the installer first check the installed library's version/signature to see whether it's necessary to install a new one at all.
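To show how cheap that check is, here's a minimal sketch in Python. It assumes pywin32 for the version API, and the bundled-file path and decision logic are mine, not DVD Flick's; real installer toolchains (MSI merge modules, Inno Setup, NSIS) have this logic built in:

    import os
    import win32api  # pywin32

    def file_version(path):
        # Read (major, minor, build, revision) from the file's VERSIONINFO resource.
        info = win32api.GetFileVersionInfo(path, "\\")
        ms, ls = info["FileVersionMS"], info["FileVersionLS"]
        return (ms >> 16, ms & 0xFFFF, ls >> 16, ls & 0xFFFF)

    system32 = os.path.join(os.environ["SystemRoot"], "system32")
    installed = file_version(os.path.join(system32, "richtx32.ocx"))
    bundled = file_version(os.path.join("redist", "richtx32.ocx"))  # hypothetical bundled copy

    if bundled > installed:  # tuple comparison does the right thing
        print("OK to update:", installed, "->", bundled)
    else:
        print("System copy is current or newer; don't touch it.")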

The other possibility, that DVD Flick isn't trying to update this library, but needs to install its own version on top of the existing one, with some extended functionality, would be even more ridiculous... I entertained this idea for a moment, but it appears not to be the case.

Wednesday, April 23, 2008

Augmented Reality is Here, You Just Can't See It

Augmented reality is here now, and is going to get bigger fast. In fact, it's going to be a killer app for the semantic web.

Where did I come up with that?

Well, as far as not seeing it ... that's because it doesn't look like the picture you have in your head. You're thinking of stuff floating in midair like in Minority Report, or at least a heads-up display with goggles or a screen that overlays data onto video.

Those may be nice implementations. But the core utility of augmented reality is getting information on stuff you're near (where near can be physical or by mental association) and having it delivered to you in an actionable "heads-up way" even if it's not on a HUD.

In the information sense, AR is already popular. Joe calls Fred on his cell phone: "I'm trying to find a hardware store down here near 20th, can you hop on Google Maps and find it?" Fred looks it up, maybe hits Street View, done. Later Joe calls Fred again from a party: "That girl is here -- the one I met at your office party, used to work with you... what is her name? She's looking great. What is her whole deal again?" That's augmented reality by cellphone and human reverse TTY; call it v 0.5.

Then there's v 0.6, which is Google Local for Mobile/iPhone, Windows Live Mobile, or the like. GLM offers "My Location"-biased search results, while WLM has solid speech recognition (you can just tell it where you are or what you want). These mobile search products are great, except that they are relatively active, not passive -- you need to tell them what you're interested in; they don't already know. And they only know a little bit about places and a little more about businesses.

Despite the limitations, these two methods are real examples of AR in use today, even if people don't call it that.

I assert that AR is a killer semantic web app because it's the semantic tools that let the machine filter the Internet down to what's relevant in your (metaphorical) field of view, when you're out in the real world.

Googling for answers in many real-world situations is drinking from a firehose, and you need to go all the way into the cyber world (iPhone, laptop, etc.) and invest effort to get what you want. That's not AR, it's just context switching and portable cyberspace.

AR is a system that can matrix your interests, contacts, places, and needs against all the current information germane to where you are or what you're doing ... and then pick out just the high-level parts you want, with a mechanism to drill down by concept. Doing that by brute-force search, or even collaborative methods (think geotags), won't get you far enough. You need a true semantic layer to front-load the work and make this real-time.
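To make that concrete, here's a toy of the filtering step. Every name, weight, and tag below is an invented placeholder -- the whole point of the semantic layer is to do this matching on meaning rather than on literal tags:

    # Filter a feed of nearby items down to what intersects my interests,
    # most relevant first. Real AR needs semantics, not exact tag matches.
    interests = {"hardware store": 0.9, "live music": 0.8, "thai food": 0.6}

    nearby = [
        {"name": "20th St Hardware", "tags": ["hardware store"], "km": 0.3},
        {"name": "Bangkok Palace",   "tags": ["thai food"],      "km": 1.1},
        {"name": "Dry Cleaner",      "tags": ["laundry"],        "km": 0.1},
    ]

    def relevance(item):
        score = max((interests.get(t, 0.0) for t in item["tags"]), default=0.0)
        return score / (1.0 + item["km"])  # nearer beats farther

    for item in sorted(nearby, key=relevance, reverse=True):
        if relevance(item) > 0:
            print(f"{item['name']}  (relevance {relevance(item):.2f})")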

On the other side, once you have a workable if basic semantic layer, AR becomes a very basic, incredibly useful flavor of personal, semantic search.

Saturday, April 19, 2008

Social Net 2.0 is about the Edges, not the Nodes

In the social graph model, people are usually the nodes and their relationships to one another are the edges. Profile data become node properties. Relationship data (worked with, married to, used to date) become the edge properties. There are some twists on this, like making a company or school (say, Stanford) into a node that people connect to, rather than an aspect of a relationship ("used to go to Stanford with").

Proposition 1

Today's graphs have lots of node data, but are very weak on edge data. It's the problem where

  1. If I come to your Facebook profile, it is hard or impossible to tell, just from your friends list, who you are actually friends with and why.
  2. You may reasonably hesitate to fess up to some legit social graph links because of this missing data. Like, say, an anti-copyright crusader who's really an old buddy from junior high, but whose link might jeopardize your Hollywood job chances if the relationship were misconstrued.

This binary friendship is the weakest form of social graph, especially as it gets syndicated via apps and APIs throughout the Internet. It's poor quality information.

Proposition 2

The next step, the v 1.1 of edge data, appears in schemes like FOAF and apps like LinkedIn. LinkedIn at least asks how you know someone ... did you work with them? where? when? That information adds value to their network.

The uber-social-graph (the hypothetical union of graphs) that we imagine and talk about can only have real value when the edges have quality (how do you connect), quantity (how well), and sign/directionality (let's face it, there are some people I might like, know, or relate to more than they do me, even if we're generally on the same level). In fact, the edges are multidimensional ... multiple linearly independent quality/quantity/sign groups might all apply over a single relationship. Like anytime you've spent time working professionally with a good friend. If it's someone you've been romantically involved with, toss another set of attributes on there as well.
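Here's a rough data-structure sketch of such an edge -- the names and fields are hypothetical, not drawn from FOAF, LinkedIn, or any real API:

    from dataclasses import dataclass, field

    @dataclass
    class Facet:
        context: str     # how we connect: "coworker", "junior high", "dated"
        strength: float  # how well, 0.0..1.0, as asserted by the source node
        since: int = 0   # year the facet started; 0 = unknown

    @dataclass
    class Edge:
        src: str
        dst: str
        facets: list = field(default_factory=list)

    graph = {}

    def assert_facet(src, dst, facet):
        graph.setdefault((src, dst), Edge(src, dst)).facets.append(facet)

    # A friend who is also a former coworker: two independent facets.
    assert_facet("me", "alex", Facet("coworker", 0.7, since=2003))
    assert_facet("me", "alex", Facet("friend", 0.9))
    # Directionality: the reverse edge can carry a different strength entirely.
    assert_facet("alex", "me", Facet("friend", 0.6))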

Proposition 3

One way to get to 2.0 would be to ask people to create a profile for each relationship (friend) they add. Probably not likely to happen, because it's labor intensive in proportion to the value of the data: Saying I worked with someone is not as good as saying where it was and when, adding that we were pretty friendly and had beers a lot after work, adding that we were both really interested in, say, programming languages or biking, adding that I convinced him to volunteer with me on so-and-so's political campaign.

It's a lot of work.

Passive acquisition of the data would be easier and more accurate: my emails or IMs with someone would tell how often I talked to them, what about, in what context, etc. The existing email, IM, and soc net comms providers (e.g. facebook messaging) have most of this data already. Ideally they would get my approval and once-over before asserting any conclusions, since it's still early days as far as accuracy of automated semantic analysis.
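Sketched in Python, with a made-up message log standing in for the providers' archives (the topic tags paper over the genuinely hard semantic-analysis part):

    from collections import defaultdict

    # Pretend log; a real system would pull from email/IM archives.
    messages = [
        {"peer": "alex", "channel": "email", "topic": "java"},
        {"peer": "alex", "channel": "im",    "topic": "biking"},
        {"peer": "alex", "channel": "im",    "topic": "biking"},
        # ... thousands more
    ]

    tally = defaultdict(lambda: defaultdict(int))
    for m in messages:
        tally[m["peer"]][m["topic"]] += 1

    # Propose, never silently assert: the user gets a once-over.
    for peer, topics in tally.items():
        top = max(topics, key=topics.get)
        total = sum(topics.values())
        print(f"Suggest: you and {peer} mostly discuss '{top}' ({total} messages). Confirm?")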

Proposition 4

Even if passive acquisition and analysis were occurring and were accurate, the data could be quite wrong as long as my life spans multiple systems. E.g., gmail sees 50 emails with someone, all related to Java work. The Google graph system draws one conclusion. But if there are 2500 AIM messages somewhere else, about wacky topics and at different times of day or night to the same person, the picture of the relationship might look a lot different. So the data from many modalities of communication (chat, IM, email, TXT, phone calls) and many systems needs to be analyzed together.
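In code terms the silo problem is simple: each provider computes over its own slice, and the conclusion flips once the slices are merged. The numbers below mirror the example above; everything is invented:

    # Each silo's per-peer topic counts.
    gmail = {"alex": {"java": 50}}
    aim   = {"alex": {"movies": 1400, "music": 1100}}

    def merge(*silos):
        out = {}
        for silo in silos:
            for peer, topics in silo.items():
                merged = out.setdefault(peer, {})
                for topic, n in topics.items():
                    merged[topic] = merged.get(topic, 0) + n
        return out

    print(max(gmail["alex"], key=gmail["alex"].get))  # "java" -- looks like a work contact
    combined = merge(gmail, aim)
    print(max(combined["alex"], key=combined["alex"].get))  # "movies" -- actually a close friend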

Proposition 5

This will require an ingenious entity to manage the graph. The government's attempt to do more or less exactly what I'm talking about is likely to be kept far from us law-abiding citizens. (I consider it only a matter of time before the world's top cybercrime / warfare / terrorism groups actually compromise this database, but they're not about to give us an API to it either.)

It won't be practical to keep this data secret; users will need to understand that once they allow data to flow into the system, it will be syndicated and replicated forever; it can't be pulled back. This should not shock current users of social networks, who must assume that not only their friends list/network, but the history of its deltas over time, and any tracking-cookie-enabled assumptions about the people in it, may already be in a Google cache somewhere or in someone's data-scraping startup.

What is to be hoped is that there is some centralized mechanism for auditing, correcting, and marking things as questionable. I.e., some group of graph engines that have a higher degree of trust than generic web scraping. Incorrect data could be challenged by allowing the engines access to additional information.

Proposition 6

Automation in turn gives rise to graph spam and graph phishing: if a med vendor sends me 100 emails a day, does that imply something? If a foreign con-man tricks me into clicking his link or sending him an email, does that get him mileage in my social graph?
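One crude guard, sketched below: weight an inferred edge by reciprocity, so one-way traffic -- the med vendor's 100 emails a day, the con-man's single baited click -- earns roughly nothing. A real graph engine would need far more than this, but it shows the shape of a defense:

    def inferred_strength(received, sent):
        # received: messages they sent me; sent: messages I sent them.
        if sent == 0 or received == 0:
            return 0.0  # no reciprocity, no edge
        return min(received, sent) / max(received, sent)

    print(inferred_strength(100, 0))  # 0.0 -- spam vendor gets nothing
    print(inferred_strength(60, 45))  # 0.75 -- a real back-and-forth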

One way or another, this stuff is coming, so we may as well start figuring it out right now.

Wednesday, April 16, 2008

Real Psystar Story: Mac Users Caught Looking ... or Apple != Hardware

Apple believes that they are a hardware company, even though their fabulous software and consistently broken, mediocre, (tech-and-policy-)crippled, and overpriced hardware make it clear that's not the case.

These folks have enablers, too: fans who think about the hardware like they think about tables at Design Within Reach ... who get a warm fuzzy knowing they have the authentic overpriced article. And other fans who don't think at all, but love a glowy apple logo because it makes them fit in with the crowd they want to imitate, er, um, that came out wrong, the crowd who all want to, uh, think different with. Identically. Or something.

Truth is, aside from the hardcore enablers, all the other Mac users don't give a damn about the hardware (until it breaks). They sit at their glass desks from Ikea or Target (not DWR), they boot up their who-cares-I-just-like-Mac-OS machines, and they work.

The real Psystar story is not about a vaporware company, or unenforceable EULAs. It's a story about the story. The real Psystar story is this: it's a big deal because most Mac lovers don't love, worship, or want to pay for Apple's charade of being about hardware. So they would consider buying a Mac clone.

Think of it this way:

  • PC users who don't want a Mac don't care about Psystar
  • PC users who are technophiles and would like to play with Mac stuff without a real Mac don't care much, because they can build their own Hackintosh for very cheap
  • Mac users who believe there is critical value in the Apple hardware don't care either, because they (and, they figure, those in the know) would not be interested in a Mac clone to save a couple hundred bucks.

Who's left? Why is everyone so spun up? (Google shows 220,000 results for Psystar right now, and only a few are psystar.com itself)

Because when the mere possibility of a Mac clone gets so many people feverishly looking, writing, and thinking, it puts the lie to the Apple-is-really-a-hardware-company positioning. And that realization ... if it were to take hold ... would have implications for Apple's strategy, product line, and stock price.

At some level, the Macosphere has always known that the software was 98% of the experience. And a big part of them viscerally responded to the proposition that all they need is a software subscription to a really nice OS.

Tuesday, April 15, 2008

Not News, But Still Weak: PPV Timeouts on DirecTV DVR

Just got my DirecTV bill which mentions how, from now on, pay-per-view videos will self-destruct from my DVR after 24 hours (they used to remain indefinitely). This wasn't a surprise, as it was reported ages ago.

Sad and amusing really: the PPV offerings are already more expensive (by 2x) than the video store 4 minutes from my house. Oh, but PPV has second-run films that aren't on video, and it's more convenient.

But they're not competing with the video store, are they?

BitTorrent has first-run films and is even more convenient, and the films stick around as long as you want. Oh, and they have gorgeous HD specimens with an open codec and no HDCP, no special video cards, no fancy cables, no strange monitors, blah blah blah.

And if I'm going the PPV route, and I really wanted to keep the PPV videos from the DVR, I could just push the record button on the attached DVD recorder, which records in the clear, obviating any CSS-based, DMCA-granted restrictions (beyond the normal copyright restrictions) that might get in the way of my time- and place- shifting.

Those DirecTV guys and the Hollywood schemers who thought this up are so smart, I want to know what they're eating for breakfast so I can get some too.

Saturday, April 12, 2008

A Hint in the Quest for S3 Suspend on Windows

Unlike most geeks I know, when I'm working, I don't like to put on headphones and listen to music. I concentrate and focus better in quiet places. So when I put together a new PC, one of my top priorities was quiet. Quiet case, quiet cooling, low power consumption (relative to horsepower) and S3 suspend.

Since I switch between a few different boxes, multiple power supply fans churning in a small room is a loud problem. Plus, S3 is a great user experience (instant restart, exactly where you were) and enviro/energy-bill friendly (far less power consumption than an S1 fake suspend).

Google around and you'll find lots of tips for S3:

  • First, you need to be running the ACPI HAL underneath Windows. Typically not a problem, but you can check under Device Manager / Computer node.
  • Second, check that your BIOS ACPI setting allows S3 (e.g., mine has an option "S1 Only" that would prevent it from going to S3 if it were selected)
  • Windows XP normally omits a registry entry (USBBIOSx) required for S3, apparently because some hardware can't wake from S3 via USB devices. Why that warrants disabling a sleep mode I've always woken from with the big ol' power button, I don't know. (A sketch of the registry fix follows this list.)
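For reference, here's a sketch of adding that entry with Python's winreg module. The key path and value reflect my reading of the Microsoft KB fix, so verify against the KB article (and back up your registry) before running anything like this; it also needs admin rights:

    import winreg

    # XP omits this value; 0 tells the OS to allow S3 despite USB wake concerns.
    key = winreg.CreateKey(
        winreg.HKEY_LOCAL_MACHINE,
        r"SYSTEM\CurrentControlSet\Services\usb",
    )
    winreg.SetValueEx(key, "USBBIOSx", 0, winreg.REG_DWORD, 0)
    winreg.CloseKey(key)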

Check the BIOS before you install Windows and the ACPI HAL immediately afterward, because conventional wisdom is you may need to reinstall the OS if you want to change these items.

I did all this stuff, still no S3. What next? There's a Microsoft command-line utility called DUMPPO.EXE which outputs a bunch of useful power management status stuff and lets you set power policy as well.

When I ran it with the cap argument, it produced a report that said no suspend modes were available on my hardware at all! Huh? Whah?

At the bottom of the report is a separate line about a legacy driver. It was the default VGA driver that XP uses on install, since I hadn't loaded the real nVidia driver yet.

Hmmm...

I installed nVidia's latest and rebooted -- all of the power options were now enabled. DUMPPO also reported all of the suspend modes available.

So... first, bad, bad Microsoft for defaulting virtually all video hardware to a legacy driver that breaks the OS's own power management. Maybe there's a fix for that in XP SP3.

But second, check ALL of your drivers -- chipset, sound hardware, and video at least, and don't bother testing S3 until after you've gotten your drivers installed.

Tuesday, April 08, 2008

The Big Bad Rewrite Isn't Always a Bad Idea

The other day I wrote about how a lot of value in legacy systems could be liberated by more freethinking and vigorous modification and refactoring -- not the defaults of "replacement" or "incremental maintenance."

Now I'm going to look at the other end of things -- when it makes sense to replace even "young" systems with a full-on rewrite.

Joel Spolsky has written one of the more famous essays on the big rewrite as anti-pattern. And he's right in that essay. Don't rewrite because you have grand dreams of tearing down the house to build a beautiful code palace, or because code is easier to write than to read, or because you don't like the platform the other guys used, or because your team ends every status meeting with "we need to throw the codebase out and start over," like Cato the Elder ending every speech with Carthago delenda est.

A rewrite makes a lot of sense, though, in situations where the original implementation was pushing the bounds of the platform, resulting in some hairy, troublesome, and expensive-to-maintain code ... and meanwhile the industry has caught up and moved past you to the point that the hard areas are now trivial.

In the late 90s, you might have spent a bunch of time on a very complex desktop reporting app with Java 2D graphics. Your engineers hacked to the limits of what Java could do ... and maybe even tore out into the world of JNI/OpenGL or COM/DirectX to push the bounds. You used CORBA or wrote your own communication protocol, worked around firewall issues, etc. Suppose that today, the code still causes trouble, is difficult to maintain -- and could be replaced with a few hundred lines of Adobe AIR/ActionScript code and web service calls. The smart thing to do is rewrite.

Look at the apps your company may have built 5-10 years ago, especially anywhere you were pushing the bleeding edge not as value-add but just as support. Do you have a roll-your-own communication protocol that made sense at the time, but could now be replaced with 10 lines of script, or a point-and-click binding in your IDE of choice? Did you write a persistence layer (or work over the source code of an open-source one) when databases were more expensive and each server could house much less data than today?
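For flavor, here's roughly what "10 lines of script" looks like in Python. The endpoint and payload are invented stand-ins for whatever your roll-your-own protocol used to carry:

    import json
    from urllib.request import urlopen

    def fetch_report(report_id):
        # One stock HTTP call replaces the custom framing, retries,
        # and firewall gymnastics of the home-grown protocol.
        url = f"https://reports.example.com/api/reports/{report_id}"  # hypothetical endpoint
        with urlopen(url) as resp:
            return json.load(resp)

    # print(fetch_report(42)["title"])  # works once a real endpoint exists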

Here are some common factors indicating it may be time to do a rewrite:

  1. Volumes of code can be replaced with much less code (leveraging newer open source and/or vendor-supported libraries).
  2. The existing code is creating real, measurable, avoidable expense at a large ratio (like 2:1, not 1.15:1).
  3. Your goal is to leverage other smart people's work, not prove that you're smarter than the guy who wrote your old system.
  4. At the business level, you need to gain leverage by getting back up on top of a technology stack and out of the middle where you're toiling away creating no externally measurable value.

AppEngine is Pretty Much What I Was Talking About

None of the people I know at Google had leaked me any info whatsoever about AppEngine.

It's pretty much the environment I wrote about when I talked about simplifying cloud computing. AppEngine could stand some more simplification and standardization, but it's a rockin' first step.

Although it was tempting to be a little self-satisfied, discussing something like this initiative a month or so before launch means ... well ... discussing it many months after it was already thought of, planned, coded, and up and running inside of Google, so humility is probably a better approach.

Sunday, April 06, 2008

Want More Money? Repair, Remediate, Refactor

In the time it takes you to read this sentence, about 6.02 x 10^23 dollars will have been wasted in lost productivity due to old, rickety, hard-to-use, non-performant software systems.

Well not so many dollars, but the loss is real -- whether it's the energy and cost-of-operation for more servers, when fewer could do the same job; whether it's hours spent waiting for CSRs to tab through tens of screens in legacy line-of-business apps to find the one field they still use; or worse, when a police dispatcher makes a mistake because software makes it hard to click the right thing fast and dangerously easy to click the wrong one.

Strategies for making "bad old systems" better are not the most popular topic for the tech elite. This makes sense for a bunch of reasons. The mythologies we live by are generally startup-oriented. But, more importantly, real roll-up-your-sleeves work on old systems doesn't scale. The hard problems are business-specific and not ones that licensable products can solve ... so anyone on a mission to save the world from these systems ends up with a services approach -- i.e., a process approach -- that is bottlenecked by human quality-and-scaling problems.

That's a bummer, because there is a lot that can be done with old systems once a really good team starts thinking about them. Forget the big rewrites and the magic bullets. Forget the awful HTML front ends that barely hide COBOL.

In between are some great possibilities. A lot of legacy architectures lend themselves to SOA without even knowing it. Inexpensive hardware and storage means, in some cases, legacy client bits can be moved back to the datacenter and integrated into a middle tier. Other times, the clients can be done away with altogether if someone has the chutzpah to figure out the "old" protocols and implement them with modern tech.
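As a sketch of that last move -- a thin modern facade over the old protocol, so new clients never have to speak it -- here's a toy HTTP wrapper in Python. The legacy call is a stub; in real life it would drive the scripted terminal session or custom socket protocol:

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def legacy_lookup(account_id):
        # Stub for the crufty part: the emulated terminal, the custom
        # binary protocol, whatever the old clients used to speak.
        return {"account": account_id, "balance": "123.45"}

    class Facade(BaseHTTPRequestHandler):
        def do_GET(self):
            account_id = self.path.rstrip("/").split("/")[-1]
            body = json.dumps(legacy_lookup(account_id)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), Facade).serve_forever()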

While I mostly work with startups, I've been involved with at least three projects in the last five years that included some re-engineering of legacy systems. One of these systems was almost 40 years old and used emulation of custom hardware (and a custom physical network) just to communicate. Two of the three were barely documented, or not documented at all. And at least one was a total clusterfudge, in the sense that it hadn't aged and "crufted" naturally, but was the result of crackpot work that was known to be broken from the relatively recent start.

In all of these cases, it was possible to improve the architecture, performance, and functionality of the system without the mega-rewrite. And since the existing systems had tons of data and transactions running, the new software needed to run smoothly side-by-side with existing software, allowing a limited and careful transition.

Here's the kicker: in two out of the three cases, although the "new" software was a vast improvement, it was never allowed to take on a larger role in the enterprise outside of a niche for which it had been developed. A bunch of startup hotshots writing cutting-edge stuff was simply not in the Enterprise IT Legacy Technology Lifecycle Planning Regime and so, like the dismissal of Heron's steam engine in the first century, the new technology was regarded as a mere novelty.

There are some reasons for conservatism and stodginess around legacy systems. But there is also a ton of value waiting to be liberated as soon as some of that stodginess can be shed.