Powerset: Re-indexing the Web

A new search engine is indexing meaning, not just keywords.

Rafe Needleman Former Editor at Large

My first thought when stepping into the Powerset offices: "Overfunded." The company, which aims to create a better search engine than Google, already has some of the search giant's trappings: fancy offices (though rented), a game room, and a victor's arrogance. Yet if the Powerset team can pull off what it's set out to do, it will indeed revolutionize search and the way people use the Web, not to mention its economics.

Only natural

Powerset does "natural language search." Instead of searching the Web based on keywords, as Google does, it searches on meaning. Powerset understands what a search query means, and it understands what every sentence it has indexed is about, too. The company's shining example (which is getting a little old) is this: If you enter the query, "politicians who died from disease," Powerset will return a list that begins, "Edward Heath," with the supporting snippet from Wikipedia, "Sir Edward Heath died from pneumonia." It returns this result because it knows that Heath was Prime Minister of the United Kingdom (and thus a politician), and that pneumonia is a disease.

Powerset's well-worn show-off query. (Credit: Powerset)
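To make the keyword-versus-meaning distinction concrete, here is a toy sketch in Python. It is emphatically not how Powerset or XLE works: the two sample sentences, the hand-written `IS_A` fact table, and the function names are all assumptions invented for illustration. It only shows why a literal keyword match misses the Heath sentence while a match on categories finds it.

```python
# Toy illustration only. The sentences, the IS_A table, and both functions
# are hypothetical; none of this reflects Powerset's index or the XLE parser.
import re

QUERY = "politicians who died from disease"

SENTENCES = [
    "Sir Edward Heath died from pneumonia.",
    "Politicians who died from disease are rarely discussed.",
]

# Hand-written background knowledge: the category each term belongs to.
IS_A = {
    "edward heath": "politician",
    "pneumonia": "disease",
}


def keyword_match(query, sentence):
    """True if every word of the query appears literally in the sentence."""
    sentence_words = set(re.findall(r"[a-z]+", sentence.lower()))
    query_words = re.findall(r"[a-z]+", query.lower())
    return all(word in sentence_words for word in query_words)


def meaning_match(subject_type, cause_type, sentence):
    """True if the sentence says '<X> died from <Y>' where X and Y belong
    to the requested categories according to IS_A."""
    found = re.search(r"(.+?) died from (.+?)\.", sentence.lower())
    if not found:
        return False
    subject = re.sub(r"^sir ", "", found.group(1)).strip()  # drop the title
    cause = found.group(2).strip()
    return IS_A.get(subject) == subject_type and IS_A.get(cause) == cause_type


keyword_hits = [s for s in SENTENCES if keyword_match(QUERY, s)]
meaning_hits = [s for s in SENTENCES if meaning_match("politician", "disease", s)]

print("keyword:", keyword_hits)
print("meaning:", meaning_hits)
```

The keyword matcher returns only the sentence that happens to repeat the query's words, while the category matcher returns only the Heath sentence. The hard part, which this sketch skips entirely, is building that knowledge and parsing real sentences at Web scale.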

Understanding Web content this way is, as they say, nontrivial. Powerset acquired an exclusive license to a 35-year-old Xerox research exercise called XLE, which does the job. Powerset COO Steve Newcomb told me that recent breakthroughs in both the XLE algorithms and computing hardware (the predictable Moore's Law) have made it economically feasible to index the Web for meaning.

(Newcomb said it took a year and a half of negotiating to strike the license deal with PARC, the research arm that Xerox spun out. The deal includes provisions that prevent any other company--like Google--from getting access to the technology even if the other company acquires Xerox or PARC.)

Building a semantic index, as opposed to simply a semantic search query parser, is fundamentally new and different, and if Powerset can pull it off, it will make Web searches more accurate and useful. No longer will users have to experiment with subtle variations in search queries to get useful results. Slight differences in wording that mean the same thing will pull up the same results. Also, Powerset technology enables the display of results that are more readable than Google's: Powerset highlights passages that answer the query, instead of simply flagging keywords that match.

The hype curve

Am I skeptical that this will work? Of course. For the past several months, Powerset has been slowly peeling back layers of its work, trying to stay just ahead of the building sentiment that it's more hype than reality. The demo is impressive, to be sure. But Whatsit-style queries are just one kind of search. And to date, no outsiders have been turned loose on Powerset's engine. Only Powerset execs drive during the public demos.

That changes in September, when Powerset will launch PowerLabs, a special site for early Powerset testers that will unleash the search technology on limited corpuses of knowledge, like Wikipedia. After a few months of beta testers banging on the algorithm--and Powerset tweaking its engine--it will shut down PowerLabs, turn its technology loose on the Web itself for a few months, and then launch Powerset proper.


Powerset's search technology is more expensive to run than Google's. It takes more computing power to parse semantics than to simply index, and nearly 20 percent of Powerset's ongoing budget is spent on compute resources, Newcomb told me. That's an awful lot for a Web startup, and although the price of compute cycles keeps dropping, Powerset's technology will always cost more than other search engines.

So it remains to be seen how Powerset will make a buck, even if it is better than Google. Perhaps Google AdWords on Powerset's highly precise search results will do the trick. I believe there is margin to spare in Google's advertising business, so even though Powerset queries are more expensive than Google's, the economics might work. Powerset can also be turned loose on corporate databases for the big bucks. Imagine what it could do for lawyers.