Monday, February 26, 2007

BitTorrent's paid movie rentals are just silly

Not the concept. But these offerings are so predictable, and so predictably bad, in their pricing that each one is just another big delay before a service can finally be produced that offers real value to consumers, studios, content creators, and publishers/distributors.

Here are a couple of examples. First up, this BitTorrent service: for $4, I get a 24-hour viewing period on a media file playable on some devices, with Windows Media Player.

Now, behind door #2, I have a video store where for a little over $3, I can get a physical disk that will play in far more devices, for a period of several days, most likely with better video and sound quality.

Maybe this demo (males 15-35) is too cool to go to a video store? I doubt it, but you also have Netflix. A conservative calculation (cycling 2 disks per week on a 3-at-a-time sub) yields about $2 per rental. Plus I can keep the disk as long as I like, and I have the luxury of not having to watch it within 24 hours if I’m busy. Although I do pay another convenience cost in that I am dealing with the Netflix queue, not an on-demand selection, this cost is essentially paid to Netflix; that is, it is a virtual subsidy of the Netflix operational model. The studios get no benefit from that at all, since the relative physical scarcity of disks is not in their model (they fix the original disk price and press as many as they can sell), but only enters into Netflix’s model (Netflix can only reasonably acquire, use, and then dispose of a modest number of disks for each film).
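For anyone who wants to check the "about $2 per rental" figure, here is a rough back-of-the-envelope sketch. The monthly fee is my assumption (roughly what a 3-at-a-time plan cost at the time; the number is not stated above), so treat the result as illustrative only.

```python
# Back-of-the-envelope cost per Netflix rental.
# ASSUMPTION: the monthly fee below is not given in the post; it is a
# plausible 3-at-a-time plan price used only for illustration.
MONTHLY_FEE = 17.99
DISKS_PER_WEEK = 2            # "cycling 2 disks per week", from the post
WEEKS_PER_MONTH = 52 / 12     # average number of weeks in a month

rentals_per_month = DISKS_PER_WEEK * WEEKS_PER_MONTH
cost_per_rental = MONTHLY_FEE / rentals_per_month

print(f"{rentals_per_month:.1f} rentals/month, ${cost_per_rental:.2f} per rental")
```

Cycling disks faster pushes the per-rental cost lower still, which is why 2 per week counts as the conservative case.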

There are also on-demand films from satellite/cable. Cost: $4. Terms of use? Basically old-school TV-on-VHS rules: I can record the show to my TiVo and make suitable personal copies (e.g. with my DVD recorder), which last indefinitely and which I am allowed to watch whenever I want. Downside? NTSC quality video.

BitTorrent is selling me a strictly inferior good at the same or a higher price. Economics says they’re not going to succeed, and the studios will claim it’s because viewers are all crooked.

Not to pick on BitTorrent in particular – without doing a rundown of all of these work-alike online stores, let’s look at one more: Amazon Unbox made headlines for its onerous Terms of Service as well as its implausible pricing. Here are a few typical price matchups, all from amazon.com:

The Departed:
Unbox restricted download: $14.99;
actual DVD, widescreen with extras: $15.99;
Blu-ray high-def 1080p disk: $23.95

The Devil Wears Prada:
Unbox restricted download: $14.99;
DVD new $13.89;
DVD used ("very good condition") $8.76

Babel:
Download: $14.99;
DVD new: $14.24;
HD DVD or Blu-ray: $27.95

In some cases there are shipping charges, but there are numerous ways to avoid shipping charges on Amazon. For reference, iTunes new releases are around $12.99 and are also heavily restricted.
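To put the three matchups above side by side, here is a small sketch that computes how much more (or less) the restricted download costs than the new DVD. The prices are the ones listed above, in cents; the dictionary layout and function name are just mine for illustration.

```python
# Download price minus new-DVD price, using the figures quoted above.
# Prices are in cents to avoid floating-point rounding noise.
MATCHUPS = {
    "The Departed": {"download": 1499, "dvd_new": 1599},
    "The Devil Wears Prada": {"download": 1499, "dvd_new": 1389},
    "Babel": {"download": 1499, "dvd_new": 1424},
}


def download_premium_cents(prices):
    """Positive means the restricted download costs more than the DVD."""
    return prices["download"] - prices["dvd_new"]


premiums = {title: download_premium_cents(p) for title, p in MATCHUPS.items()}

for title, cents in premiums.items():
    print(f"{title}: download is {cents / 100:+.2f} relative to the new DVD")
```

In two of the three cases the restricted download costs more than the unrestricted physical disk, and in the third it saves a dollar.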

The point here is that not only do these ventures refuse to concede that more restrictions on media make it less valuable to the consumer, but they actually imagine that they are somehow innovating in a way that will let them charge more than the baseline cost (physical DVD/CD and accompanying rights are the baseline).

Nothing here suggests that the media should or must be "free" – only that BitTorrent president/cofounder Ashwin Navin and the studios are all yanking our chains when Navin says, "We're really hammering the studios to say, 'Go easy on this audience' ... We need to give them a price that feels like a good value relative to what they were getting for free."

Thursday, February 22, 2007

MSFT apologia

Ok, it's not a huge secret that I'm a Microsoft apologist (that is to say defender). Not that Microsoft hasn't made its share of mistakes and done some things wrong. But yesterday a friend, not a developer but a power user, lightheartedly referred to Bill et al. as "software bozos" and I felt obliged to point out a few things...

Microsoft produces great products under an unbelievable set of constraints. Customers want Microsoft stuff to work seamlessly on everything from cell phones to PCs to set-top boxes to web servers to Xbox 360s; they want it to make sense to everyone from CEOs to doctors to my mom; they want it to be localized (support local language, culture, currency, calendar, phones) everywhere in the world, to be accessible to the handicapped, and to be secure even when an extremely unsophisticated user tries to do really dumb things.

They also want it to be inexpensive and to work on any cheap hardware you buy off the 'net and install it on (unlike, say, Apple, where the OS is only legal and supported on the hardware they're in the mood to offer this month); oh and besides being a general purpose operating system, customers like it that Windows is one of the most advanced 3D gaming platforms, competing with dedicated gaming consoles that cost just as much to build as a PC and need do nothing except play games...

Oh, and also, unlike pretty much any other OS I'm familiar with, customers (especially business customers) need it to be perpetually backward compatible, so that when they put a new Vista machine together today it'll still run line-of-business apps that were written for DOS 4.01 in the 80s, and somehow magically these old apps will print reports on the new color laser printers attached to the computer, that were never even dreamt of when the apps were written. And mostly this actually works.

Now let's say you live in America and you buy a new/upgrade copy of Windows every 4 years for about $200, and a new copy of Office for about $400. You're paying about $12.50 per month. And you get the security updates, and browser updates, media player, Virtual PC, development tools (if that's your thing) and all kinds of other stuff for free (or included in your $12.50 per month admission price if that's the way you want to think about it.)
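The arithmetic behind the $12.50 figure, as a quick sketch. It assumes both Windows and Office are repurchased on the same 4-year cycle, which is how the numbers above work out.

```python
# Monthly cost of Windows + Office, using the rough prices from the post.
# ASSUMPTION: both products are repurchased on the same 4-year cycle.
WINDOWS_PRICE = 200
OFFICE_PRICE = 400
UPGRADE_CYCLE_YEARS = 4

months_in_cycle = UPGRADE_CYCLE_YEARS * 12
monthly_cost = (WINDOWS_PRICE + OFFICE_PRICE) / months_in_cycle

print(f"${monthly_cost:.2f} per month over {months_in_cycle} months")
```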

I'm not sure there's anything else I pay $12.50/month for that even tries to think about solving problems on this kind of scale, let alone succeeds.

Lastly, someone will be tempted to point out that Microsoft's enormous presence in the client OS and office productivity space may inhibit all kinds of other software ecosystems from flourishing. There are a number of open questions about this. First, it is reasonable to believe that standardization at one level in a stack enables massive innovation at the next level up the stack, which would otherwise have been impractical. This goes for any platform piece -- Ethernet, Windows, *nix, Java, HTTP...

More importantly, do not assume for a minute that the open PC architecture would even exist without the dominating historical presence of Microsoft Windows. The fact that you can even sit down with an assembler and start hacking a boot image and work your way up to running literally whatever you want on a readily available PC has never been a given. Considering the attitudes of more closed OS and hardware makers in other ecosystems (like cell phones), it is entirely possible that without Microsoft and the need for backwards compatibility, just running code on a cheap mass-produced box would long ago have required signed code, a crypto key from some industry licensing group, and more cash for membership and fees than any small company is ever going to have.

Tuesday, February 20, 2007

Grepping in PowerShell

I originally wrote this for the company wiki the other day, and thought it might be useful to a wider audience. The context is parsing and processing an iTunes library.xml file (just a one-off task), which I thought might be a fun and educational opportunity to slice, dice, and ... how does that Ron Popeil commercial go? ... with PowerShell.

PowerShell is the new shell for Windows. It is new and supported, but not "the official" shell, in the sense that it doesn't ship with Vista, although I'm guessing it will ship in the Longhorn Server rev.

If you're used to Unix shells, then you'll probably be floored by the power of PowerShell and somewhat annoyed by the syntax, which, despite liberal aliases to familiar things like ls, takes some getting used to.

.NET Framework integration means you can easily access any object in the .NET base class library, and there are some special tricks that do some of this for you too. The canonical example seems to be this one, a quickie RSS reader:


$wc = new-object System.Net.WebClient
$rssdata = [xml]$wc.DownloadString('http://foo.bar/rss.xml')
write-host $rssdata.rss.channel.title
$rssdata.rss.channel.item | foreach { write-host $_.title }


Since the source file is xml, I had thought the XML parsing would come in handy, but it turned out that there was no real data model to the XML. Basically, there is just a big nested map structure (key-value pairs in blocks) in the item list. Sort of XML for the "takes-void*-returns-void*" crowd. So then grep looked promising because the keys and values (and their tags) were grouped on individual lines.

Grepping is a little counterintuitive with PowerShell because the pipeline between cmdlets in PowerShell is filled with full-on objects, not strings. If you just want text, you can use Get-Content, which provides its output as a bunch of string objects, 1 per line, which is convenient. Here's an example I came up with after struggling a little bit to get a grep type of functionality. I throw a sort and unique on here for fun:


Get-Content Library.xml | ForEach-Object { if ($_ -match [regex]"(?<=Artist\<.{13}).*(?=\<\/)" ) { $matches[0] }} | Sort-Object | Get-Unique | Out-File lib.txt


Many of these things can be abbreviated too, so if you want your script to read a little tighter, you can use


gc Library.xml | % { if ($_ -match [regex]"(?<=Artist\<.{13}).*(?=\<\/)") { $matches[0] }} | sort | unique


Isn't that sweet?

Since the regex uses zero-width lookahead and lookbehind assertions instead of extracting a marked subexpression, I'm curious if anyone has input on whether one approach is faster / better / shinier than the other.

My first guess is that they are similar, since my first cut at implementing lookahead + lookbehind would probably be to match the whole outer expression while naming the non-zero-width-bit in the middle, and assigning the value of that to the expression.
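To make the two approaches concrete, here is a minimal sketch of both, written in Python rather than PowerShell so it is easy to run anywhere. The sample lines are hypothetical stand-ins for the iTunes file, and Python's re engine is not the .NET one, so this only shows that the two patterns extract the same text. It says nothing definitive about which is faster in PowerShell.

```python
import re

# Hypothetical sample lines in the style of the iTunes library file.
SAMPLE = [
    "\t\t\t<key>Name</key><string>So What</string>",
    "\t\t\t<key>Artist</key><string>Miles Davis</string>",
    "\t\t\t<key>Artist</key><string>Nina Simone</string>",
    "\t\t\t<key>Artist</key><string>Miles Davis</string>",
]

# Approach 1: zero-width lookbehind and lookahead; the whole match is the value.
LOOKAROUND = re.compile(r"(?<=Artist<.{13}).*(?=</)")

# Approach 2: match the surrounding text too and capture the middle.
CAPTURE = re.compile(r"Artist<.{13}(.*)</")


def artists_lookaround(lines):
    """Sorted unique artist names, using the lookaround pattern."""
    found = (LOOKAROUND.search(line) for line in lines)
    return sorted({m.group(0) for m in found if m})


def artists_capture(lines):
    """Sorted unique artist names, using the capture-group pattern."""
    found = (CAPTURE.search(line) for line in lines)
    return sorted({m.group(1) for m in found if m})


print(artists_lookaround(SAMPLE))
print(artists_capture(SAMPLE))
```

Both functions return the same list for this sample. To compare speed, the two could be wrapped in timeit against a real library file, but any result would apply to Python's engine only.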

Monday, February 12, 2007

Yipes it's Y! Pipes

Super cool: there is no reason that a human should need to handwrite HTTP/XML/mashup/filtering logic for simple cases. Even with the highest-level toolkit, it still requires time, introduces bugs, needs to be hosted...
Systems like this are about moving toward a declarative specification for extracting semantics from web services (in this case RSS).
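As a rough illustration of what "declarative" means here, the sketch below describes a pipe as plain data (a list of named steps) and runs it over RSS items. Everything in it is hypothetical: the feed is inlined, the step names are made up, and this is not how Y! Pipes is actually implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, inlined so the sketch runs without network access.
FEED = """<rss version="2.0"><channel><title>Example feed</title>
<item><title>PowerShell tips</title><link>http://example.com/1</link></item>
<item><title>Movie rentals</title><link>http://example.com/2</link></item>
<item><title>More PowerShell</title><link>http://example.com/3</link></item>
</channel></rss>"""

# The "pipe" is data, not code: each step names an operation and its argument.
# These step names are invented for this sketch.
PIPE = [
    ("filter_title_contains", "PowerShell"),
    ("sort_by", "title"),
    ("take", 5),
]


def parse_items(rss_text):
    """Turn each <item> into a dict of its child tags and their text."""
    root = ET.fromstring(rss_text)
    return [{child.tag: child.text for child in item} for item in root.iter("item")]


def run_pipe(items, pipe):
    """Apply each declared step in order and return the resulting items."""
    for op, arg in pipe:
        if op == "filter_title_contains":
            items = [i for i in items if arg in (i.get("title") or "")]
        elif op == "sort_by":
            items = sorted(items, key=lambda i, k=arg: i.get(k) or "")
        elif op == "take":
            items = items[:arg]
        else:
            raise ValueError(f"unknown step: {op}")
    return items


titles = [i["title"] for i in run_pipe(parse_items(FEED), PIPE)]
print(titles)
```

Changing what the pipe does means editing the PIPE list, not the code, which is the part a graphical editor can take over.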

This particular implementation is a bit fancy on the graphics, which makes it run slowly, and it seems like it needs to extract data from RSS only. That is, if you try it out, it expects every URL "fetch" result to look like an RSS formatted collection of "somethings" ... which is nice, but it would be cool if you could also process XML from REST queries, or build SOAP queries as well. My first inclination was to ask for some kind of RegEx widget, but perhaps the Y! Pipes team intentionally doesn't want to allow us to go down that route ... over time they want more structure, not less structure in the data. They probably feel like RegEx has already been done in the HTML scraping world, although there is certainly lots more work to do there.

If you are interested in this stuff, check out some other approaches and flavors of this notion too:

- Dapper, which tries to build web services on top of any web page as a data source. These guys have a "virtual browser" which lets you point and click your way through existing pages to build a service.

- Kapow and OpenKapow -- enterprise and "free online" design tools for scraping, mixing, mashing and republishing the web

- QL2: an "old-school" enterprise software product used for industrial-strength scraping. It implements a query language so that you can treat the web data sources that are being used as a virtual database (!) (frighteningly enough for an "unstructured data" query tool, this system is used in some large mission-critical apps)

- YubNub: this souped-up version of wget lets you define "commands" (aka abbreviations) for issuing web queries, can substitute parameters, and pipe things together. It's arbitrarily extensible since you can always write a servlet/ashx/&c. to provide any data access or transformation you might want. On the other hand, it's more about plaintext (or human readable anyway) than XML