New start-up Factery Labs is launching its first service on Tuesday, a technology called FactRank that can tear through Web pages and collect what it calls "facts." These are bits of information from each source page that Factery Labs' algorithm then organizes into an order of importance.
What this means for you is that developers will soon make use of the technology in third-party search engines or on Web pages to very quickly deliver reading summaries. This cuts out most (or all) of the parts you don't care about, while organizing the bits you might. It also manages to do all this in real time.
The FactRank technology was created by Paul Pedersen, who has a good background in search, including gigs at Inktomi, Google, and Powerset. CNET News met with him and co-founder Sean Gaddis (former Skype and eBay'er) on Monday to get a demo of how the technology works.
In a nutshell, it goes like this: FactRank goes through each Web page or source (in whatever index it's searching from), finding semantic tip-offs like declarative sentences. It then cross-references those against one another, surfacing the most relevant ones to the top while also factoring in the order in which they appeared. What the user then gets is a tidy list of statements, each of which is sourced and given a level of relevancy based on its appearances across all of the indexed source pages combined.
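To make that description concrete, here is a minimal Python sketch of the general idea, not Factery Labs' actual algorithm, which the company has not published. Everything in it is an assumption made for illustration: the function names, the crude "ends with a period and has at least four words" test standing in for real semantic tip-offs, and the scoring rule (number of sources containing a statement, with earlier position on the page as a tie-breaker).

```python
import re
from collections import defaultdict


def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_declarative(sentence):
    # Hypothetical stand-in for a "semantic tip-off": a sentence that
    # ends with a period and contains at least four words.
    return sentence.endswith(".") and len(sentence.split()) >= 4


def normalize(sentence):
    # Lowercase and drop punctuation so the same statement matches across pages.
    return re.sub(r"[^a-z0-9 ]", "", sentence.lower()).strip()


def rank_facts(pages):
    """pages maps a source URL to its text; returns (statement, sources) pairs."""
    sources = defaultdict(set)   # normalized statement -> URLs containing it
    first_pos = {}               # normalized statement -> earliest position seen
    original = {}                # normalized statement -> first original wording
    for url, text in pages.items():
        for pos, sentence in enumerate(split_sentences(text)):
            if not is_declarative(sentence):
                continue
            key = normalize(sentence)
            sources[key].add(url)
            first_pos[key] = min(first_pos.get(key, pos), pos)
            original.setdefault(key, sentence)
    # Assumed scoring: more sources first, then earlier appearance on the page.
    ranked = sorted(sources, key=lambda k: (-len(sources[k]), first_pos[k]))
    return [(original[k], sorted(sources[k])) for k in ranked]
```

Called with a dictionary of two or three short pages, `rank_facts` returns the statements that appear in the most sources first, each paired with the list of URLs it came from, which mirrors the "sourced and ranked" list described above.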
Whew. Got that? Great, here's an example of what it looks like in motion, as seen on a basic search for Sarah Palin on Twitter:
Of course, one of the problems with Factery Labs' approach across multiple sources--be it Twitter or multiple URLs--is accuracy. How can it tell that something like The Onion is not the same as the Associated Press?
The short answer is that it can't. Factery Labs can't determine the truth value of what it finds, nor will it ever. "It goes beyond any existing technology. And nobody knows how to do that. I mean, I don't even know how to do that--people don't even know how to do that," Pedersen said. "We are absolutely neutral. We have nothing in the system that has any bias in terms of anything. The only mechanism we maintain is egregious spam, the bad guys."
Along with maintaining a blacklist of these bad sites, Factery Labs also keeps a list of good sources, or ones that continuously deliver. The more often an author successfully recommends a usable page, the faster they'll accumulate rank among the results.
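The bookkeeping behind that could look something like the Python sketch below. It is purely illustrative: the `SourceTracker` class, its method names, and the one-point-per-usable-recommendation rule are assumptions, since Factery Labs has not said how it stores or weights either list.

```python
from collections import Counter


class SourceTracker:
    """Hypothetical tracker for blocked sites and author reputation."""

    def __init__(self, blacklist=()):
        self.blacklist = set(blacklist)   # domains treated as egregious spam
        self.good = Counter()             # author -> count of usable recommendations

    def is_blocked(self, domain):
        return domain in self.blacklist

    def record(self, author, domain, usable):
        # Only a usable page from a non-blacklisted domain earns the author credit.
        if usable and not self.is_blocked(domain):
            self.good[author] += 1

    def rank(self, author):
        return self.good[author]

    def ranked_authors(self):
        # Highest count first; ties broken alphabetically for stable output.
        return sorted(self.good, key=lambda a: (-self.good[a], a))
```

An author who keeps pointing to usable pages climbs `ranked_authors()` with each call to `record`, while links to blacklisted domains earn nothing.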
What you can play with today
As for applying that technology to consumer products, Factery Labs is launching with a handful of development partners, each of which has already built a tool that makes use of FactRank. The most notable one comes from Sobees, which is using the service to add relevancy to Twitter and FriendFeed search results--something that's no small feat.
Users can do a search on Sobees' Silverlight-based Twitter client as usual, but there will now be a FactRank button that can sort through those tweets. It does a quick once-over of all of the results, and will filter the most relevant information to the very top. Included in each of its results is also a shortlist of the facts it finds on every page.
Advanced users might find more utility in an updated version of Ultimate Info, an extension for Firefox that does a number of things with on-page data. Starting Tuesday, it will let users select links on a page, each of which gets the fact-finding treatment using FactRank.
In our demo, Gaddis used Ultimate Info on the front page of popular site Drudge Report, highlighting about six or seven URLs that were on the page, then running a FactRank query, which brought in its fact results in just a few seconds. As Pedersen explained, users could run something similar on a long article (or several long articles about the same subject), and FactRank's algorithm would be able to provide a fact summary in short order.
Mobile devices are not part of Tuesday's launch, but that is where the company expects to see the most development. "Our analysis shows that mobile devices are a prime target for this technology because the latency produces a lot of resistance in the browse experience," said Pedersen. Instead of handing the user a link dump of every URL a search finds, the FactRank engine will go out, process those results, then deliver a summary of the best selection of facts--a move that will save the end user from having to wait for any extra pages to load.
If you want to give some of the third-party Factery Labs tools a run, you can find them in the company's implementations section. There you'll also find a test search engine that's running off of Twitter's index.