The search giant has to make sure its vast data on customer behavior isn't used to violate privacy.
Elinor Mills, Former Staff Writer
Elinor Mills covers Internet security and privacy. She joined CNET News in 2005 after working as a foreign correspondent for Reuters in Portugal and writing for The Industry Standard, the IDG News Service and the Associated Press.
That such detailed personal information is so readily available on public Web sites makes most people uncomfortable. But it's nothing compared with the information Google collects and doesn't make public.
What Google knows about you
• The Gmail e-mail service offers two gigabytes of free storage and scans the content of messages to serve up context-related ads.
• Google's Desktop Search lets users easily search files stored on their computer.
• A Google application that speeds Web surfing stores cached copies of Web pages you've visited; those page requests can include personal information.
Assuming Google CEO Eric Schmidt uses his company's services, someone with access to Google's databases could find out what he writes in his e-mails and to whom he sends them, where he shops online or even what restaurants he's located via online maps. Like that of so many other Google users, his virtual life has been meticulously recorded.
The fear, of course, is that hackers, zealous government investigators, or even a Google insider who falls short of the company's ethics standards could abuse that information. Google, some worry, is amassing a tempting record of personal information, and the onus is on the Mountain View, Calif., company to keep that information under wraps.
Privacy advocates say information collected at Yahoo, Microsoft's MSN, Amazon.com's A-9 and other search and e-commerce companies poses similar risks. Indeed, many of those companies' business plans tend to mimic what Google is trying to do, and some are less careful with the data they collect. But Google, which has more than a 50 percent share of the U.S. search engine market, according to the , has become a lightning rod for privacy concerns because of its high profile and its unmatched impact on the Internet community.
"Google is poised to trump Microsoft in its potential to invade privacy, and it's very hard for many consumers to get it because the Google brand name has so much trust," said Chris Hoofnagle of the Electronic Privacy Information Center. "But if you step back and look at the suite of products and how they are used, you realize Google can have a lot of personal information about individuals' Internet habits--e-mail, saving search history, images, personal information from (social network site) Orkut--it represents a significant threat to privacy."
Kevin Bankston, staff attorney at the Electronic Frontier Foundation, said Google is amassing data that could create some of the most detailed individual profiles ever devised.
"Your search history shows your associations, beliefs, perhaps your medical problems. The things you Google for define you," Bankston said.
The Google record
As is typical for search engines, Google retains log files that record search terms used, Web sites visited and the Internet Protocol address and browser type of the computer for every single search conducted through its Web site.
In addition, search engines are collecting personally identifiable information in order to offer certain services. For instance, Gmail asks for name and e-mail address. By comparison, Yahoo's registration also asks for address, phone number, birth date, gender and occupation and may ask for home address and Social Security number for financial services.
"It's data that's practically a printout of what's going on in your brain: What you are thinking of buying, who you talk to, what you talk about."
--Kevin Bankston, staff attorney, Electronic Frontier Foundation
If search history, e-mail and registration information were combined, a company could see intimate details about a person's health, sex life, religion, financial status and buying preferences.
It's "data that's practically a printout of what's going on in your brain: What you are thinking of buying, who you talk to, what you talk about," Bankston said. "It is an unprecedented amount of personal information, and these third parties (such as Google) have carte blanche control over that information."
Google uses the log information to analyze traffic in order to prevent people from rigging search results, for blocking denial-of-service attacks and to improve search services, said Nicole Wong, associate general counsel at Google.
Correction: The original article incorrectly implied that Google Desktop Search can track what's stored on a user's PC. The service does not expose a user's content to Google or anyone else without the user's explicit permission.
need it to process the data on Google's behalf.
Concern about Google's data retention practices has become more acute since the company went public last August. The company's motto of doing no evil remains, but some people question Google's ability to adequately balance the heavy burden of safeguarding consumer privacy rights with the pull toward intermingling and mining data for ever more lucrative targeted advertising.
"Although Google is held in high esteem by the public as a good corporate citizen, past performance is no guarantee of future behavior, especially following Google's IPO when the company will have a legal duty to maximize shareholder wealth," Hoofnagle said in testimony in March before the California Senate Judiciary Committee on the privacy risks of e-mail scanning.
"It's very hard for many consumers to get it, because the Google brand name has so much trust."
--Chris Hoofnagle, director, Electronic Privacy Information Center
Google, like virtually all companies, also complies with legal orders such as search warrants and subpoenas.
"The prospect of unlimited data retention creates a honey pot for law enforcement," Hoofnagle said in his testimony. In addition, e-mail stored for longer than 180 days has less protection from law enforcement than e-mail deleted before then, he said.
Google knows people are worried
Google is very much concerned with protecting the privacy of its users, Wong said. "We take privacy very seriously from the design of the products through launch and beyond," including by building in privacy-protection options in new products, she said. Google does not have a privacy officer, but it does have Wong and a team of lawyers who work with her to address privacy issues, among other matters.
Even if Google is well-intentioned, the data could eventually end up being misused, Bankston fears.
"I think the mantra of not being evil is not disingenuous, but it is a hard credo to stick to when you're a public corporation with stockholders to please and economic incentives driving you to collect as much information as possible," Bankston said. "I'm not saying it's evil to collect this information; I'm saying it's dangerous for them to collect this."
The largest outcry against Google so far has been in response to Gmail. Gmail now offers a whopping two gigabytes of storage for free and scans the content of messages to serve up context-related ads.
Gmail users can delete messages, but the process isn't intuitive. Deletion takes multiple steps, and it takes an undetermined period of time to remove the messages from all the Google servers that may have copies of them, Wong said.
People can use Google search without a cookie. If a cookie is used and is not deleted by the user, the searches may then be linked to the cookie, Wong said. However, Google cannot correlate searches to a specific user unless that person voluntarily provides personally identifiable information. For example, Google does not correlate Gmail accounts with users' searches.
Google's Desktop Search, an application that lets users search for personal files and Web history stored locally on their computer, also created a stir when it was launched last year. Privacy advocates worried that someone with access to a user's computer could easily search for sensitive data.
A free version of Google's Desktop Search for businesses has an option that allows users to require a password to access it. The free consumer version does not.
Other privacy concerns were raised with downloadable software for broadband users that was designed to speed access to Web pages by serving up cached or compressed copies of Web sites from Google's servers. However, the service does not really retain any more data than a user's Internet service provider can.
Underpinning many of the privacy concerns is the longevity of Google's data retention.
The log files created during Web searches, which don't personally identify the user, are kept for as long as the data "is useful," Wong said. She did not give any time frame or elaborate.
"Overall, the issues with Google are not any different from the issues you have with Yahoo, Microsoft and others."
--Danny Sullivan, editor, Search Engine Watch
Google is able to link log file data, cookies and Google accounts to help it identify attempts to manipulate Web site ranking on its search pages, help track down originators of denial-of-service attacks against Web sites, and provide improvements to services in general, Wong said.
Concerned Googlers can either choose not to register for Google services or use two browsers, one for their Web searches and another for Gmail and other Google services.
For the more paranoid there are anonymizing proxy networks, such as the EFF's Tor, that bounce Internet communication through a series of routers that encrypt and decrypt it so that the origination and destination cannot be traced.
"Before you Google for something, think about whether you want that on your permanent record," Bankston advised. "If not, don't Google, or take steps so the search can't be tied back to you."
Google is no DoubleClick
In fairness, the level of anxiety hasn't come close to what online ad network DoubleClick faced in the late 1990s. DoubleClick became the subject of a Federal Trade Commission lawsuit for its attempt to combine offline and online consumer data. It settled federal and state suits.
In a question-and-answer session during Google's media day in May, Schmidt addressed the trade-off between privacy issues and offering better services.
"Our general philosophy on those things is very much to allow people to opt in," Schmidt said. "There are always options to not use that set of technology and remain anonymous with respect to the functionality that you're using on Google."
Gartner analyst Allen Weiner opined: "Overall, I think the privacy concerns are probably overblown."
Search engines have reached a plateau in their ability to serve up the best results, Weiner said, adding that tracking users' ongoing searches will lead to improvements.
"Have search engines gotten to the point where they have developed enough trust with consumers in order to get them to give up some of their privacy?" he asked rhetorically. "At some point there's a leap of faith that needs to occur."
And it's not as though Google is the only company asking Web surfers to make that leap, said Danny Sullivan, editor of Search Engine Watch.
"Overall, the issues with Google are not any different from the issues you have with Yahoo, Microsoft and others. They tend to get singled out, and unfairly, in my view," Sullivan said. "They're the biggest, and they make a big target for someone to take a swing at. It's not that the issues are not important. It's that they are applicable to the search industry" as a whole.
Trust is the key. As software industry analyst Stephen O'Grady wrote in his Tecosystems blog late last year: "Google is nearing a crossroads in determining its future path. They can take the Microsoft fork--and face the same scrutiny Microsoft does, or they can learn what the folks from Redmond have: Trust is hard to earn, easy to lose and nearly impossible to win back."