Here's a semantic app I'd really like, one that could make the Internet -- and data stores in general -- more valuable. If anyone sees this and thinks it's a great startup idea, you're welcome to develop it.
There are also a bunch of up-and-coming semantic-web apps, like Twine, that add semantic analysis to the extant web 2.0 experience. But while the RDF-enabled-del.icio.us-on-steroids-with-autotagging may be nice -- heck, I'm sure I'll be using one of those systems -- I want something that can carry out the kind of mental associations that I normally would have to do myself.
In order to make that a reality, I'll need several things:
- The Model: it works by association, and associations have direction, degree, and kind, among other things. So we need more than just a network. We need a model that implements a metric space or vector space, allowing distances to be computed between any two points with sensible behavior, and where measures (length, volume) can easily and intuitively work.
- Concepts (e.g., "politics"), and the places concepts are referenced (say, a political blog page), both live inside this space as subspaces, just like in my brain. Some of these spaces may consist of just one element.
- The formulae that support the metric need to be fine-tuned so that abstract tags ("politics", "justice" as opposed to "Davis, CA" or "bananas") don't suck a million other things to within epsilon of themselves. We can't have everything that linguistically has to do with politics cluster tightly around a politics node, or else the system isn't terribly helpful.
- I want the system to start out thinking like me ... and then I can experiment later with "social thinking." What does this mean? My semantic tagging is different from everyone else's. Each group or culture I'm in sees the content differently from other groups and cultures; there is no universal invariant conceptual structure. One person sees a news story and thinks "economics" while someone else thinks "environment" and another thinks "social justice." If we mush all these tags together we get nothing terribly useful. So: let's start out with my view of the world, and we'll compare and integrate others' later.
- How to do #4? Start with every web page I visit (not just those I actively tag) -- read my history file or my network traffic (obviously, keep raw data local for now). Read my email and my calendar and my notes and phone (PIM) and my to-do list. And weight accordingly: associations in a web page I write (like this blog post) count more than stuff I browse through; notes I make in my phone or Outlook count for even more; the metadata in my calendar and the titles of my contacts mean a heck of a lot for the model. Walk my social graph and look at what my friends know and are interested in! These are all straightforward algorithmic steps. Leaving aside any self-tuning in the metrics engine, there is no AI or black box here.
- The data from #5 is part of the metric function ... that's how the system shapes itself to my view of the world, or at least my "attention waveform" as I transmit that through keystrokes and mouse clicks. Concretely, nodes "move" in the space based on whether I actually type them, whether they appear in meetings in my calendar, whether they are tightly clustered around my personal contacts, etc. Even my contacts are arranged based on how long I've known them, what I talk to them about, how often, etc.
- The system will make mistakes. So all the more reason for (1) privacy around my core data and (2) a dashboard where I can "juice" certain things or move them around. (This is where interesting goal-oriented self-retuning can come in.) There is plenty of other data that can be shared and deduced for network-effect-dependent revenue streams.
- A UI into the model. What I'd really like is something brilliant and minimalist (that I can't myself invent!). I know there are lots of desktop-based visualization methods that would be fascinating and could dazzle a crowd at a presentation, but I want something day-to-day useful ... and ideally something that fits on a mobile phone (or at least an iPhone) screen. So that in a perfect world, if I'm out and about, I can poke this system with one piece of data and have it return the associations my brain makes -- and the ones it would make if I were jacked up on caffeine and had the whole internet in my frontal lobe somewhere. I'd like maps, charts, and pictures in there too.
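To make the list above a little more concrete, here is a rough sketch of one way the model, the source weighting, and the "abstract tags" problem might fit together. Everything in it is an assumption for illustration, not a design: the `AssociationSpace` class, the `SOURCE_WEIGHTS` values, and the edge-length formula are all made up. Concepts are nodes, co-occurrences strengthen the edge between them (weighted by where I saw them), and the distance between two concepts is the shortest path. Shortest-path distance over symmetric, positive edge lengths does behave like a proper metric: it is zero from a concept to itself, symmetric, and obeys the triangle inequality.

```python
import heapq
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical weights: the more deliberate the source, the more it counts.
SOURCE_WEIGHTS = {"browsed": 1.0, "authored": 3.0, "note": 5.0, "calendar": 8.0}


class AssociationSpace:
    """Concepts as graph nodes; distance is the shortest weighted path."""

    def __init__(self):
        # strength[a][b] = accumulated association strength between a and b
        self.strength = defaultdict(lambda: defaultdict(float))

    def observe(self, concepts, source):
        """Strengthen every pairwise association seen together in one item."""
        weight = SOURCE_WEIGHTS[source]
        for a, b in combinations(sorted(set(concepts)), 2):
            self.strength[a][b] += weight
            self.strength[b][a] += weight

    def _edge_length(self, a, b):
        # Stronger association -> shorter edge. The log terms lengthen edges
        # that touch hubs such as "politics", so hubs pull fewer things close.
        hub_penalty = (1.0 + math.log(len(self.strength[a]))
                       + math.log(len(self.strength[b])))
        return hub_penalty / self.strength[a][b]

    def distance(self, start, goal):
        """Dijkstra's algorithm; returns math.inf if no path exists."""
        best = {start: 0.0}
        heap = [(0.0, start)]
        while heap:
            dist, node = heapq.heappop(heap)
            if node == goal:
                return dist
            if dist > best[node]:
                continue  # stale heap entry
            for neighbor in self.strength.get(node, {}):
                candidate = dist + self._edge_length(node, neighbor)
                if candidate < best.get(neighbor, math.inf):
                    best[neighbor] = candidate
                    heapq.heappush(heap, (candidate, neighbor))
        return math.inf


space = AssociationSpace()
space.observe(["politics", "Davis, CA", "city council"], "authored")
space.observe(["politics", "bananas"], "browsed")
space.observe(["politics", "economics"], "browsed")
space.observe(["Davis, CA", "farmers market"], "note")

print(space.distance("Davis, CA", "farmers market"))
print(space.distance("bananas", "economics"))
```

In this sketch, nodes "move" simply because repeated observations strengthen an edge and so shorten it; a note counts for more than a page I merely browsed. The hub penalty is only one crude way to keep "politics" from dragging everything toward itself, and real tuning would surely look different.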
I hope this outline makes some sort of sense. Like I said, it's really not as tricky or complex as it sounds. Fine-tuning the metrics will take real work, as will optimizing the data structures so that the relevant queries are fast, and so that the system can work in "tinfoil-hat-private-mode" as well as "publish-whatever-you-deduce-from-my-friendfeed-and-private-chats-mode."
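As a small illustration of the two modes, here is a sketch of a "nearest associations" query that can hide anything deduced from private data. The `Association` record, its `private` flag, and the `nearest` function are hypothetical names I'm inventing here, and the distances are made-up numbers; a real system would need an index rather than a scan over a list.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    concept: str
    distance: float  # smaller means more closely associated
    private: bool    # True if deduced from local-only data (email, chats)


def nearest(associations, k, include_private=False):
    """Return the k closest associations, hiding private ones by default."""
    visible = (a for a in associations if include_private or not a.private)
    return heapq.nsmallest(k, visible, key=lambda a: a.distance)


candidates = [
    Association("farmers market", 0.4, private=False),
    Association("dentist appointment", 0.2, private=True),
    Association("city council", 1.1, private=False),
]

# Shareable view: private associations never leave the machine.
print([a.concept for a in nearest(candidates, 2)])
# Local view: everything the model knows.
print([a.concept for a in nearest(candidates, 2, include_private=True)])
```

Defaulting `include_private` to `False` is a deliberate choice in this sketch: the private data only shows up when I explicitly ask for it.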
Incidentally, this app could also help solve the augmented reality dilemma of "how do you narrow down all the possible info about the input objects and coordinates, so that the user sees something interesting and/or actionable," so that's another angle.
Have some capital or time you want to throw in this direction? Feel free to email me -- firstname.lastname@example.org. Or would you rather loot this idea because you think you can build a killer app for the semantic web era? That's fine too -- send me a link when I can sign up for the beta.