A few months ago I noted the release of Google Base as a pivotal moment in the maturation of the Web. I said the move from unstructured to structured data was an important improvement in making the Web usable and re-usable.
I have also discussed mashups as an important, lightweight development paradigm for making this reusability possible. Thus far the mashups I have seen have been features, mostly visualizations of someone else’s data on Google or Yahoo maps. The owners of the primary data have Web 1.0 business models, including Google Base. The Web 1.0 model is “all your data are belong to us. So come to
Well, now there is about to be a Vast improvement.
Vast has been in development for about a year, creating what I think of as the first inverted portal. Now the preview release is up, showing the first three "applications" of the technology. Vast crawls the web for data and extracts that data in vertical categories. The URL Vast.com is a demonstration site, not really a destination site. The real destination site is everyplace else on the web – blogs, discussion boards, microsites, social networking sites, etc. Anyplace where content creates context.
Vast.com looks like a classic classifieds site, but it is not. It is not a data aggregator. It is a data disseminator. It is a hub targeted to the developer community to enable the mashup of structured data in a reusable form. And it’s not just about classifieds. It’s about adding structure to content. But the first application of structuring the Web is to take free-form descriptions on the Web and make them structured, searchable listings, i.e. classifieds.
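To make "adding structure to content" concrete, here is a minimal sketch of turning a free-form description into a structured, searchable listing. It is not Vast's extraction method, which is not described here; the `CarListing` fields, the `KNOWN_MAKES` list, and the regular expressions are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CarListing:
    """Hypothetical structured record for one car ad."""
    year: Optional[int]
    make: Optional[str]
    price: Optional[int]


# Illustrative list only; a real extractor would need a far larger vocabulary.
KNOWN_MAKES = ("Mercedes-Benz", "Toyota", "Honda", "Ford")


def extract_listing(text: str) -> CarListing:
    """Pull a year, a make, and a price out of free-form ad text."""
    # First standalone four-digit number starting with 19 or 20.
    year_match = re.search(r"\b(?:19|20)\d{2}\b", text)
    # A dollar sign followed by digits, with optional thousands separators.
    price_match = re.search(r"\$\s?(\d[\d,]*)", text)
    # Case-insensitive lookup against the known makes.
    make = next((m for m in KNOWN_MAKES if m.lower() in text.lower()), None)
    return CarListing(
        year=int(year_match.group(0)) if year_match else None,
        make=make,
        price=int(price_match.group(1).replace(",", "")) if price_match else None,
    )


ad = "For sale: 2003 Mercedes-Benz C240, runs great, asking $14,500 obo."
listing = extract_listing(ad)
```

Once the text is reduced to fields like these, it can be searched, sorted, and filtered like any classified listing, which is the point of the exercise.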
What’s different from the aggregators besides the business model? It is the data itself. The data do not come from feeds. They come from crawls of primary data sources – cars listed on dealer sites, jobs posted on companies’ web sites, personal profiles listed on blogs, personal web sites, and dating sites. As a result, the data are Vast – millions of cars, millions of jobs, and millions of profiles, with more categories of objects to come.
This is the true long tail of listings. The user benefit is obvious -- find the exceptional value. Like the old joke -- why is it always that you find something in the last place you look? -- the exceptional value is always in the long tail.
You don’t see a lot of end user features on Vast.com – no AJAX widgets, no mapping, no integration with reviews and ratings. What you will see is a Vast amount of data. If you have a web site about Miami – create a Miami-only view of the data. If you have a Mercedes-Benz discussion site, create an M-B specialty classifieds component. If you think you can do the next HotorNot.com, build it. You can even build the next Vast.com on the API. All the features you see are available in the API for free.
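A Miami-only view or an M-B specialty component is, at bottom, a filter over structured listings. The sketch below shows that idea on in-memory records. It does not call the Vast API, whose actual endpoints and parameters are not given in this post; the field names (`city`, `make`, `price`) and the `filtered_view` helper are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

Listing = Dict[str, object]


def filtered_view(
    listings: Iterable[Listing],
    city: Optional[str] = None,
    make: Optional[str] = None,
) -> List[Listing]:
    """Return listings matching the given city and/or make, cheapest first."""
    def matches(item: Listing) -> bool:
        # Compare case-insensitively; a missing field never matches a filter.
        if city is not None and str(item.get("city", "")).lower() != city.lower():
            return False
        if make is not None and str(item.get("make", "")).lower() != make.lower():
            return False
        return True

    return sorted((l for l in listings if matches(l)), key=lambda l: l["price"])


# Hypothetical sample data standing in for whatever a listings API would return.
sample = [
    {"city": "Miami", "make": "Mercedes-Benz", "price": 14500},
    {"city": "Boston", "make": "Mercedes-Benz", "price": 9900},
    {"city": "Miami", "make": "Toyota", "price": 7200},
]

miami_only = filtered_view(sample, city="Miami")
mb_only = filtered_view(sample, make="mercedes-benz")
```

A site about Miami would render `miami_only`; a Mercedes-Benz discussion site would render `mb_only`. The context of the host site decides which slice of the data is shown.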
I don’t think of Vast (or Riya) as just Web 2.0. Web 2.0 is largely about tagging, social annotation, and sharing. I think of Vast and Riya as bricks in the road toward a Structured Web, beyond Web 2.0. Vast does for free form text what Riya does for images – extracts structure for reusability. There is some additional discussion of Vast here, here, and here.
Vast is not a walled garden. It is “All your data are belong to you.” Have at it.
I like the idea, and it's been a while since I reviewed this space, but how is Vast different from Web 1.0 Autonomy's main USP?
Posted by: Kevin Russell | March 14, 2006 at 10:43 AM
Peter:
This is real well written. Unfortunately, I did not see this key point come out of any of the blog coverage that Vast got earlier this week, though Naval tried to clarify this point on a couple of posts.
I agree with your point about bringing structure to unstructured data. But I do not agree that the way to do it is by crawling the entire web and categorizing. For one, Google can do it very quickly. Secondly, and more importantly, data usually has multiple parameters which may not be easily captured or understood, so the structured data created might not be as useful in the end. For example, in classifieds, the shelf life, the reputation of the poster, and the overall credibility of the listing are more important than just having any listing. All these aspects are tough to capture with pure software genius. And I think the problem just increases when you consider other data beyond classifieds.
The concept is definitely good - structured data opens enormous opportunities. We have battled these same issues at iNods, but we have moved away from the approach you have mentioned here. I'd be curious to have a chat with you on this sometime, if possible.
Posted by: Vaibhav Domkundwar - iNods | March 16, 2006 at 01:18 AM
Peter: another nice post.
I was a little fixated on clutter and how to cut through it a few months back...started thinking about the long tail of web services (i.e. there are going to be more and more in each niche) and thinking about the tools we're going to need to sort it all out. Vast.com looks like a solution to this emerging problem...again..nice post.
Posted by: Mike McDerment | March 17, 2006 at 10:58 AM