Spock: Search's final frontier?

As the search engine goes live, co-founder Jaideep Singh explains why he thinks the new search engine's people-related focus can spell the difference.

What newbie Web 2.0 company wouldn't want to become the next Google? Easier said than done, of course, and if you had a nickel every time you heard that prediction for a start-up, you'd be in Eric Schmidt's tax bracket.

But if a Google could come out of nowhere to blast past Yahoo, why can't another unknown emerge to eat Google's lunch? Nobody at Spock is making that bold claim just yet--the invitation-only search service went live only this morning.

As it launches, Spock has more than 100 million people in its database, and the company plans to quickly add more by scouring other publicly available sites. While people-related search sites such as Wink and LinkedIn have had their 15 minutes of fame without upending the constellation of forces in the search arena, Spock takes a slightly different tack, offering meta-tag searching and Wikipedia-like tagging privileges to trusted users.

CNET recently sat down with CEO and co-founder Jaideep Singh to find out more. By the way, Singh says the company name has nothing to do with the Vulcan science officer of the Starship Enterprise. It's an acronym for "single point of contact and knowledge."

Q: How many people has Spock indexed now?
Singh: A little over 100 million people.

And you're adding approximately how many each day?
Singh: There are two things: one is people, and the other is how many documents we're processing, because one person may have many documents. We're really crawling and indexing the entire Web and picking out documents and organizing those documents around people.

Can you explain exactly how the technology works?
Singh: If you're looking for some specific keyword, Google is great. The issue is that when you now search for people on Google, what you get is a bunch of documents about people. If you have a popular name like David Stern, who is the NBA commissioner, the first couple of pages are really about that person. So you really can't find the David Stern you met at the bar or from a business meeting.

That's a simple manifestation of the thing. It takes a lot more technology to do what we're doing, which is really trying to figure out the unique David Stern and organize documents and information and images and relationships--all those things--around a person.

How much harder is it to do that than a general search?
Singh: A lot harder. It's actually a different technology stack. The only thing that's common is crawling.

So where's the difference?
Singh: When we're done crawling, we go off in a different direction. Instead of just doing metadata extraction, we try to figure out who this document is about. We want to figure out the most relevant thing in that document. So, say there's a document about Charlie, and it says, "Jaideep likes to play tennis with Renee." That doesn't mean Charlie plays tennis or likes to play tennis. So you really have to understand language and understand what this document is all about, and that requires things like natural language processing and other technologies.

Is there anything that you folks have come up with that's proprietary?
Singh: Absolutely. We have numerous patents. We have seven Ph.D.-type people in our company working on the algorithms for this thing. We have a lot of other outside help, including a lot of notable advisers from Stanford and from industry who are helping us really solve these problems. It's not just solving the problem, it's solving it at scale for billions of Web documents. That's the largest-scale problem there is out there, so that's a challenge.

Is what we now see on the screen what the public will see when Spock opens up?
Singh: That's correct.

And one of the first questions they'll have is how is this different from Google.
Singh: Let me just step back a little bit. When users come to use the site, we think they're going to find it to be a very cool service, and not only because it's Google-esque in a way. You can give a query and type in a name or any keyword--you can say "Give me all the astronauts"--but when you do that, you get very well-organized results and see the picture of the person. You see the most relevant terms or words that define this person and you'll see where they are on the Web and their relationships.

I tested the service earlier and it pulled up a lot more personal information about me than I found with a Google search.
Singh: You raised a really good point. Let's talk about that for a second. One has to realize that what we're doing is identical to what Google is doing in terms of indexing the Web. We're going out to public documents and picking up content. One has to realize that there is a lot of stuff about you on the Internet. You may have blogged someplace but it's on the Web. You may have a MySpace profile and it's on the Web. What we're finding is our users--when they come on to Spock--can really find this valuable in terms of "Hey, what has Spock discovered that's on the Web about me?" So, just knowing that is valuable.

Can you go beyond a firewall?
Singh: We don't do that. Unless it's out on the public Web, we don't try to get inside.

Why do you believe that personal search is something the market really wants now?
Singh: Two or three things, frankly. The reason we built this company was the problem we faced every day. I was sitting there with my co-founder and I would try to look at my contacts. I use Outlook and I'd say, "OK, you know what? I'm a venture capitalist and I'm looking to find a marketing vice president for one of my companies." I know at least 30 of these guys. I've met them, but I just can't remember who they are. I know that information is out there somewhere so I'd look in my contacts and I wouldn't be able to find that person. Then I'd go to Google and type in their name. But there's so much noise and so many documents that it'd be really hard to find that information. Then I'd go to LinkedIn, which is a lot better actually if we're looking for VP marketing, but again, you just get that biographical data about that person. This was really the source of the problem.

So, to answer your question, No. 1 was that it was a real pain point that we were seeing every day. We knew we were building an application that's going to be an order of magnitude better than anything out there. Secondly, our beta has been going on for a couple of months and we have a lot of users in our beta and we're seeing that the feedback has been phenomenal--people are really finding this to be utilitarian and interesting.

Was that frustration the genesis of the idea behind Spock?
Singh: That was the genesis of the idea and that's when we realized that we're in a very interesting state in the market where there's a lot of people-related information out there: people have their bio pages under "about me," people have their MySpace page and so on and so forth.

OK, vertical search. What about the barrier to entry? You guys were able to put something together in a reasonably short time. If the space is so hot why wouldn't--or why couldn't--some company with the resources of a Google also decide this makes sense to focus on?
Singh: I think we have lots of barriers to entry. No. 1 is that people love the product. We're going to scale this really fast, so that itself is a huge thing. Secondly, from a technology standpoint, this is not easy stuff to re-create. We have a lot of very competent people from engineering who have come from leading search engines out there and these are people with a lot of experience. It's not like putting together a much simpler consumer site that is just a couple of databases.

What is going to be the business model?
Singh: The business model is actually pretty straightforward. It's targeted advertising, very much similar to search that is there today.

When I looked up different bits of information before sitting down for this interview, I found Spock to be a bit thin. Is the search algorithm only being directed toward certain prechosen Web repositories of information? What are your plans there?
Singh: It's just a hard problem to have a ton of data on day one. Even if you look at Google, they started with a much smaller index and grew that daily after the launch. The thing about Spock is every day you come, there's new data and new content being added or indexed.

I'm a sports fan and when I clicked on the name of an old football player, George Sauer, one of the links that came up was the University of Nebraska. But when I clicked there, I expected more specific information about Sauer and his career. Instead, it brought up a browser full of people who had some connection to the University of Nebraska. It wasn't really clear what the point was.
Singh: I think that's a great point. We are collecting feedback and improving our user interface and our service right now. The fact of the matter is sometimes we can find something about a person in a document and we give a link to that document. I think what we need to do is make our URLs as deep as possible. That's a problem we know about and we'll fix it.

When the service is entirely built out to your satisfaction, how will it look different from what it looks like today?
Singh: There will be a lot more links and content. Although there is a ton of information about people today on the Web, it's very fragmented over thousands of different sites. As we index more data, what you'll find is that when you do a search, you'll get the most relevant results. It may be a little bit light for some people right now but that's going to increase over time.

You've talked about people-related search taking off in the same way that Google grew. But that's a really big statement considering how far Google's come in such a brief time. Netting out the hype, what's your realistic expectation?
Singh: I think we're going to see exactly the same trajectory and I say this not because it's just a hope and dream. If it's not us, someone will do this. The problem is universal. About 30 percent of search traffic for all search engines is people-related traffic. That's huge. It's actually the largest category of search today.

It's not just Google. People go to LinkedIn or to Outlook to search their contacts. Or Gmail or on their mobile phones. It's something they do often. We project that an average person does a people-related search about 10 times a day. That's why the need is broad and that's why we think this is a product that's designed for everybody.