Narrowing the search

Vivisimo CEO Raul Valdes-Perez says that in today's Web clutter, search companies should think beyond ranked lists.

Early Web search engines succeeded spectacularly at turning up interesting results.

With the growth of online content and the improvement in the ranking of search results, the situation is now flipped: Any query turns up an overabundance of results, both relevant and irrelevant.

Search engines struggle with the challenge of helping users deal with this information overload. Some search engines are placing their bets on personalization, which I contend is a dead end: Top talent will be expended on the problem with little to show for it in the end.

True search personalization means that software observes your Web surfing and other habits and infers a profile of your information tastes. Then your next search for, say, "anthrax" will turn up the rock band rather than the disease.

But true search personalization has several inherent problems. Chief among them are the following:

• People are not static; they have many fleeting and seasonal interests. A student might intensely research Abraham Lincoln for a school project but may care nothing at all about the subject later on.

• The surfing data used for personalizing search is weak. The data that online booksellers like Amazon.com use is strong: I'm paying $20 for a book and committing 10 hours of my life to reading it. (Let's ignore the problems with gift purchases.) Surfing data involves the minimal commitments of a mouse click and a few seconds to look at a page before leaving.

• If the data used for inferring user profiles is the whole Web page that the user visited, then it's misleading: The user decided to visit the page based on the title and brief excerpt (snippet) shown in the search results, not on the whole page.

• Home computers are often shared among family members, whose surfing interests obviously diverge.

• Queries tend to be short. My own spouse couldn't figure out my interests from a one- or two-word utterance, so how is a computer going to do better?

Given all these difficulties, search personalization is likely to waste the talents of top computer scientists for years to come. But if not search personalization, then what? Some companies are placing bets on a display of search results that goes beyond simple ranked lists. The idea is to analyze the search results, show users the variety of themes therein and let them explore their interests at that moment, which only they are in a position to recognize.

One approach is to cluster the search results into possibly overlapping categories. These can be displayed as simple category folders that expand into subfolders and list their contents. They can also be displayed as spatial or temporal objects that are visualized on a computer screen in various dimensions with a time component. Clustering into categories takes place first, and exactly how to show the clusters is a second, independent decision.
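
To make the idea of overlapping category folders concrete, here is a minimal Python sketch. It is a toy illustration, not any search company's actual method: it assumes that a word shared by at least two result snippets can serve as a folder label, and the function names, stopword list and sample snippets are all made up for the example. It also keeps the two decisions separate: cluster_results() builds the folders, and the loop at the end is just one simple way to display them.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; a real system would use a far larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "is", "by", "with"}


def tokenize(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def cluster_results(query, snippets, min_size=2):
    """Group snippets into possibly overlapping folders keyed by a shared word.

    A snippet lands in every folder whose label it contains, so folders
    can overlap. Words from the query itself are skipped as labels,
    because they appear in nearly every result.
    """
    query_terms = tokenize(query)
    folders = defaultdict(list)
    for snippet in snippets:
        for term in sorted(tokenize(snippet) - query_terms):
            folders[term].append(snippet)
    # Keep only labels shared by at least min_size results.
    kept = {term: items for term, items in folders.items() if len(items) >= min_size}
    # Largest folders first; ties are broken alphabetically.
    return dict(sorted(kept.items(), key=lambda kv: (-len(kv[1]), kv[0])))


# Made-up snippets for an "anthrax" query.
RESULTS = [
    "Anthrax band announces tour dates",
    "Anthrax vaccine approved for military use",
    "Anthrax band releases new album",
    "Anthrax vaccine side effects studied",
    "Metal band tour with Anthrax",
]

folders = cluster_results("anthrax", RESULTS)

# Display step: a plain folder listing, chosen independently of the clustering.
for label, items in folders.items():
    print(f"[{label}] ({len(items)} results)")
    for item in items:
        print("    " + item)
```

With these sample snippets, the first result appears under both "band" and "tour," which is the overlap described above. The user, not the software, then decides which folder matches the interest of the moment.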

Other approaches do not make use of categories at all and instead directly embed the search results into a map of some sort, possibly with an added time dimension.

Two researchers at the University of Maryland experimentally compared zoomable interfaces to folders-style clustering interfaces along the user dimensions of task accuracy, efficiency and subjective levels of satisfaction. Among their statistically significant conclusions was that users preferred the folders-style clustering interface partly because of its simplicity.

Which approach is best? The best personalization is done by individuals themselves. Software or even other people can only guess--usually poorly--at a person's interests. The next advances in search technology will acknowledge this limitation and make it easy for people to act on their interests of the moment--which only they can recognize.