Google's man behind the curtain

Craig Silverstein is director of technology at Google, which, in these pre-IPO days, may very well be the search company's most important job.

Stefanie Olsen Staff writer, CNET News
Stefanie Olsen covers technology and science.
If there ever was an employee who carried the water for Google, it's Craig Silverstein, employee No. 1, technology director and loyal chanter of the search company's "don't be evil" mantra.

Silverstein, 31, left his doctoral studies at Stanford University in 1998, joining school chums Sergey Brin and Larry Page in a nearby garage to build the now famed search engine.

It turned out to be a wise diversion, too, now that the search company is poised to raise $2.7 billion in one of the hottest tech initial public offerings since 2000.

Imminent wealth aside, Silverstein has long been a champion of working hard and whistling while you do it. As Google's director of technology, he balances pie-in-the-sky visions for search--in other words, artificially intelligent search pets--and churning out products that improve people's access to information. Just a sampling includes new technology to personalize the company's Web site; comparative shopping prices on wireless devices; and the ability to send, store and manage up to 1 gigabyte of free e-mail, otherwise known as Gmail.

In an interview before Google's IPO filing, Silverstein discussed the backlash against Gmail among privacy advocates, the company's cultural changes and its shifting reliance on PageRank, the mathematical algorithm that has helped Google shine. The company recently renewed an exclusive PageRank license from Stanford that's valid through 2011.

Q: What is your perspective on Google's role in the history of search?
A: Google was in the right place at the right time. The history of search, since the advent of computers, is one where more and more information is available for people, and you need ever more sophisticated techniques to make sense of it and to make it useful--and Google was at the cusp of (that).

You have portrayed the ideal search engine as one resembling the intelligence of the Starship Enterprise or a world populated with intelligent search pets. Can you talk a little bit about those ideas?
Well, the third idea is having the computer be as smart as a reference librarian. That's interesting, because reference librarians, of course, use computers, use Google to help them search, but they put some element of intelligence into it that the computer cannot do by itself.

So, part of the goal is to make computers smart enough so that when you interact with them, they can do something with that information to help you actually get better results. That is certainly something Google thinks about to improve quality.

When do you think that kind of artificially intelligent search will happen?
I think that understanding language is kind of the last frontier in artificial intelligence, and then talking to a computer will be just like talking to a reference librarian, because they will both be equally knowledgeable about the world and about you.

The big difference, and this is where the search pets come in, is that the reference librarian will understand emotions and other nonfactual information that even a fully intelligent computer may have trouble with.

In terms of timing, I typically say about 200 to 300 years. I think it is probably closer to the 300th year end of it. But if it ends up being closer to the 200th year, I would not be around in any case, and I will not be able to have anyone gainsay me.

Good thinking.
Going back further, even 30 years, the people who were working on artificial intelligence in the '60s thought all these problems would be solved by today--and we are basically not very much closer in terms of those overall high AI goals of understanding language.

Some computer scientists suspect that PageRank is dead, because Internet marketers have managed to exploit it by creating false popularity for their sites. Is that true? Has it been altered, or is it playing less of a role?
The point of view that PageRank is dead is kind of a very static view of the world. It will always continue to be a part of our ranking scheme but, over time, as we develop new ideas on how to do ranking, as we tweak existing ideas, as we think about new ways to have them play together--the role of any one of the techniques that we use will obviously change.
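
For readers who want a concrete picture of the link-based scoring under discussion, here is a minimal Python sketch of the textbook PageRank iteration. It is not Google's production ranking. The function name `pagerank`, the 0.85 damping factor, the iteration count and the four-page toy web are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank (illustrative only).

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform score.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Score held by pages with no outbound links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Each page passes its score, in equal shares, to the pages it links to.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "c", nothing links to "d".
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
```

In this toy graph, `c` ends up with the highest score because three pages link to it, and `d`, which nothing links to, ends up with the lowest. The false-popularity problem raised in the question corresponds to adding pages whose only purpose is to link to a target.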

Are there any other algorithm techniques that you are using that are playing a bigger role?
Well, there are certainly other techniques that we are using. Talking about it is the trickier part. In broad terms, techniques we use fall into, like, two or three categories, and one is we try to understand and leverage human intelligence. We look for signals that people put in to indicate intelligence, like deciding to link from one page to another or annotating text with the description of what the text is about.

How many servers is Google currently running? Some say 100,000; others say 10,000. Others say Google's computing setup is the most interesting thing about the company, in that search is just an application that is running on a platform that can do literally anything you want it to--for example, Gmail. Is that a fair assessment of Google's strengths?
That is very interesting you should say that. The history of search is actually a history of search engines being put on top of an application that was not developed for search. AltaVista, for instance, was developed as a proof of concept of Digital Equipment's Alpha servers.

We have more than 10,000 computers, as part of a rich tradition, in terms of commercial Web search engines. However, it is definitely the case for us that we developed the infrastructure we have in order to better be able to do search.

We needed something that could grow very easily, because we knew the Web would grow very quickly. We had to develop algorithms that we could easily scale so that we could just get more capacity, as we added more computers, and we would not need to rewrite any code. So, keeping those ideals in mind let us grow Google to the size it is today from something that was orders of magnitude smaller--a thousand times smaller--from when we first started the company.

But the thing that we found is that a lot of these techniques are useful for the more general task of making lots of information available. Gmail is a perfect example of this. And this amount of information could be as big as the Web or even bigger in aggregate. We have the technical know-how to be able to do that as well.

Then what other applications is Google working on?
I cannot talk in specifics. The general direction I think are some of the things that I have been talking about already: making more and more types of information available. Gmail is trying to search over private information--that is our first real effort into that area.

What have you learned from the negative reactions to Gmail from privacy advocates and now lawmakers?
What I have learned is that Google plays a very important part in people's lives, and it is worthwhile for people to get worked up about. I remember the last time there was a big brouhaha over something that Google did, which was when we acquired the Usenet archives from Deja.com, and the Usenet community was all up in arms about what this meant for the future of Usenet and being able to get access to the information.

Over time, it became familiar, and they had the chance to play around with the product and see that it actually was really good. That brouhaha subsided, and I expect and hope that the same thing will happen in this case. The issues that are important to people any company should take seriously, and I feel that we are doing so.

How do you think the service might change?
It is premature for me to speculate on what changes might happen.

In the long run, what do you think will be more interesting: one gigantic search space, or lots of little ones partitioned off from one another--different databases for this Web site or that company's e-mail archive?
From a user's point of view, you want one place you can go to do the search. I do not necessarily have a technical preference. The important thing for me is that it be as easy for the users to get the information that they want and, to me, that means they just have to only go to one place, and that one place should be smart enough to figure out, out of the zillions of different types of information sources in the world, which ones have the right results for you.

What are your ideas on the need for privacy, with search histories, registration data, e-mail documents in one place?
Well, we definitely respect the fact that the people who create the information and who own the information have the rights to decide how that information should be viewed. We give all sorts of controls to let people control very finely how their information is made available through Google. That is going to be our policy.

Do Google's algorithms scale? And if the amount of data in your database doubles, for example, does it take twice as many computers to return a search result?
Our algorithms do scale, and if, you know, the size of the Web doubles, and the machines double, then we are keeping pace.
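
One common way to get this kind of scaling is to partition the documents across machines and send each query to every partition. The Python sketch below illustrates that general idea only. It is not a description of Google's actual infrastructure, and the helper names `shard_for`, `build_shards` and `search`, along with the sample documents, are hypothetical.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Pick a shard for a document with a stable hash of its id."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def build_shards(docs, num_shards):
    """Split a {doc_id: text} collection into num_shards smaller collections."""
    shards = [dict() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[shard_for(doc_id, num_shards)][doc_id] = text
    return shards


def search(shards, term):
    """Ask every shard for matching documents and merge the answers."""
    hits = []
    # Here the shards are scanned one after another; on a real cluster
    # each shard would sit on its own machine and be searched in parallel.
    for shard in shards:
        hits.extend(doc_id for doc_id, text in shard.items()
                    if term in text.split())
    return sorted(hits)


# Hypothetical sample collection.
docs = {
    "doc1": "the web grows quickly",
    "doc2": "search the web",
    "doc3": "free e-mail storage",
    "doc4": "usenet archives",
}
two_shards = build_shards(docs, 2)
four_shards = build_shards(docs, 4)
results_two = search(two_shards, "web")
results_four = search(four_shards, "web")
```

The number of shards is just a parameter, so the same code serves a larger collection on more machines. If the documents and the shards both double, the average number of documents each shard has to scan stays the same.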

Does it break at some point? Does it work with arbitrarily large data sets?
As far as I know, it works with arbitrarily large data sets. If there is a constraint, we have not run into it yet.

Do you think advanced search features should be built into the operating system, and does that allow Microsoft to create a tool that is far better tuned to the individual? And if so, would Google want access to the information Microsoft collects?
I remember when the whole Microsoft-Netscape debate happened several years ago, and there were all these talks about what should and should not be in the operating system. It all kind of foundered on the definition of what an operating system is.

At some point, it's not an interesting question to me. (What interests) me is that it be as easy as possible for people to get the information they need.

Do you think that Microsoft is creating fear and uncertainty around search, considering that its products are not likely going to come out until 2006?
I do not really pay so much attention to those kinds of things. Microsoft has decided and stated publicly they think search is very important to people, and that is certainly something that we would agree with.

What are the complexities of building a video or audio search engine?
Part of the complexity is speaking of and (leveraging) nontextual information--having humans describe it in some way. I think that is possible for audio and video, though certainly, the challenges are nontrivial.

It is a hot area in the academic community, but I would say the challenges in the short term are nontechnical issues. The people who own this content do not necessarily want to make it publicly available or available for searching. We respect that and, until a time comes where there is a business model or some other arrangement where they feel comfortable making the information available for searching over the Web, we are not going to really provide the functionality.

There are some personalization tools emerging. Amazon's A9.com and MSN are using different techniques. Google's tool is a little bit more like, "Give us information, and we will help you out," and the others take the approach, "We will learn from you, and then we will help you out." Tell me why your approach is superior.
In the latter scenario, where first you learn, and then you help the visitor out, you have two places where the computer has to make intelligent judgments. I am not saying that is not an interesting or promising approach, but it does put more strain on the computer. When you tell it what your interests are, then the computer only has to be intelligent to use that information to try to help you out. They are both part of the same goal of trying to help people out with personal information--it is just a matter of how you get there. We will be seeing more of this in the future.

Can you talk about how the culture at Google has changed since you started there, as employee No. 1?
It has certainly changed. I used to know everyone in the company, and now I do not, and it makes me sad. But what impresses me and is basically the reason I am still here is that even though the culture has changed, the basic principles that underlie Google, both in terms of the products and how we run internally as a company, have not really changed since it started.

We still believe that it is important to have a work environment that is fun. That is still true, just as much now as it was when we started, even though instead of having one massage therapist come in, you know, a few times a day, we have, you know, a whole crew going in, making sure that everyone can get a massage who wants or needs it.

And on the other side of the products, we're a very technology-focused company, and we are very much focused on the user experience. There are a lot of pressures on a company, as it goes through its life, and certainly, five and a half years is a long time for an Internet company. To see it stay so constant through all those pressures, I think, is really remarkable, and I am really grateful for it.