Collecta real-time search: So fast it hurts
Collecta’s new real-time search engine has an impressive interface that allows for amazingly fast delivery of results. The problem is that in such a short timeline, the results can lose context and relevance to users.
The real-time search market has been heating up as of late, with Facebook testing the waters and other upstarts such as OneRiot and Scoopler fighting to gain a foothold in the search market. Even Google co-founder Larry Page has spoken about the need to “index the Web every second.”
Enter Collecta, led by Gerry Campbell, formerly President of Search for Reuters and SVP of Search for AOL. Campbell boasts that Collecta beats the competition in terms of both immediacy and breadth of results.
The Good
Collecta’s user interface (seen below) is intuitive and allows for lightning-fast loading of Web results. It does so by using an XMPP gateway of the kind many companies use for instant messaging. Think about how quickly a site like Meebo loads new instant messages and you can imagine the user experience.
As you type in new queries, Collecta stores your last search in the left sidebar so that you can go back to it. It doesn’t continue to gather results on each query that you have on the page, but runs whichever you’re focused on in the moment.
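To make that sidebar behavior concrete, here is a minimal sketch in Python. It is not Collecta's code: the SearchSession class, its fields, and the simple substring match are all assumptions made up for illustration. It only models the idea that past queries are remembered while pushed items are collected for the focused query alone.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SearchSession:
    """Hypothetical model of the sidebar behavior (not Collecta's actual design)."""
    history: List[str] = field(default_factory=list)             # past queries, oldest first
    focused: Optional[str] = None                                # the only query receiving results
    results: Dict[str, List[str]] = field(default_factory=dict)  # query -> items seen while focused

    def search(self, query: str) -> None:
        # A new search is remembered in the sidebar and becomes the focused query.
        if query not in self.history:
            self.history.append(query)
        self.results.setdefault(query, [])
        self.focused = query

    def on_push(self, item: str) -> None:
        # Called for each item pushed from the stream (for example over XMPP).
        # Only the focused query collects matches; earlier queries stay paused.
        if self.focused and self.focused.lower() in item.lower():
            self.results[self.focused].append(item)
```

In this sketch, anything that arrives for an older query while you are focused elsewhere is simply never collected, which matches the behavior described above.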
The Bad
Since Collecta doesn’t start collecting results until you hit the search button, each search captures only a flash in time. This means that the results are dominated by WordPress comments and Twitter results.
While Twitter results are generally relevant for real-time searches, WordPress comments aren’t always about current news. The problem with returning blog comments is that they could be on a post from weeks or months ago, and they often lose context when read without the article they’re attached to.
The other problem is that it’s impossible to get the full story within such a tight timeframe. If a Twitter post links to an article that came out the same day or even the same week, it could be essential to understanding the topic.
The Ugly
The challenge for the emerging real-time search engine is balancing immediacy with relevance, and it’s no small task. As you compress the timeline for any issue, you have less information from which to judge relevance and spot trends.
The other challenge is that real-time search engines generally leverage APIs from companies like Twitter to return results. This means that they aren’t actually creating an index in most cases, instead relying on the search API to return the best results.
This makes it hard for Collecta to interpret which results are most relevant or to apply trending data in a useful way, as the same result isn’t likely to ever be returned in a subsequent search query. Unfortunately a truly real-time search engine, much like the fish from Finding Nemo, can’t learn when all it provides is results from the last few seconds.
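A rough sketch in Python shows why that matters. The function names, the (timestamp, text) stream format, and the substring match are all assumptions for illustration, not how Collecta or any search API actually works. A pass-through search only needs the current window, while even the simplest trend signal needs the window before it, which means something has to be stored.

```python
def snapshot(stream, now, window):
    """Pass-through view: only the items from the last `window` seconds."""
    return [text for ts, text in stream if now - window < ts <= now]


def trend(stream, now, window, term):
    """Change in mentions of `term` between the previous window and the current one.

    This needs history from before the current window, which a pure
    pass-through search never keeps.
    """
    term = term.lower()
    current = sum(
        1 for ts, text in stream
        if now - window < ts <= now and term in text.lower()
    )
    previous = sum(
        1 for ts, text in stream
        if now - 2 * window < ts <= now - window and term in text.lower()
    )
    return current - previous
```

Here snapshot can tell you what is being said right now, but only trend can tell you whether a topic is rising or fading, and it can do that only if older items were kept around.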

June 19th, 2009
Everyone now is trying to launch a search engine whose name gonna be forgotten within a week. Someone remember Cuil? The only search engine that may compete Google in a mid or long-term period is Bing. Period.
Yeah real-time results are great, but they are totally irrelevant. Searching takes 2 minutes and all you finally get is some totally stupid results that are most of the time related to twitter. Completely useless. Traditional search engines are the way to go.
June 21st, 2009
hey SixStorm, i work at OneRiot so i’d argue with realtime search being “completely useless” ;-) Industry figures show that 20-40% of searches display some intent that would be best served by results from the realtime web (e.g. when you are trying to find out what’s going on right now concerning Iran). We’re doing a lot of work on realtime web ranking algorithms to get the most relevant realtime results to you fast (http://blog.oneriot.com/content/2009/06/oneriot-pulse-rank) but there’s also huge value in the realtime “firehose” approach of Collecta. It’s a fast-moving and innovative space, so keep watching :) Tobias @ OneRiot