April 9, 2008 8:46 AM PDT

Why Google puts privacy second

European regulators sent shock waves through the search engine industry earlier this week when they proposed significantly tighter rules for logging data. If the EU adopts the proposed rules, Google, Yahoo and Microsoft will have to sharply reduce the amount of time they keep identifying search logs, and will have to start treating IP addresses as personally identifiable data -- something that Google has been particularly vocal in opposing.

Google has recently engaged in a major public relations effort to make a credible argument for keeping log data. The company has trotted out respected employee researchers to make the case that deleting such data will hurt search results. When all of their claims are analyzed, however, one thing becomes clear: It's all about the money (and the clicks).

Google has a genuine need to retain detailed log information on one kind of user: Those who click on ads. However, in order to avoid creating a situation where only clickers lose their privacy, the company logs data on all searchers instead. That is, the privacy of millions is threatened, to protect the incentive for users to click on ads.

The excuses

Over the last few months, a number of Google's engineers have issued statements on the company's public policy blog to defend its much-criticized log data retention policies. The company claims that the data can be used to hunt down malware, to catch people defrauding its advertising system, and to improve search results, especially localized ones.

Google claims that accurate logging data can improve localized searches. The data is used to respond intelligently to searches, such that a search for "GM" will return General Motors-related information for an American user, while someone in France will be presented with information on "Guerre Mondiale" (World War).

What Google has done here is attempt to muddy the waters of the debate. Yes, accurate logging data improves localized searches. However, the company does not need to retain the exact network address (known as an IP address) behind each and every search. Instead of tracking my searches by my network address, 129.53.136.23, the company could instead log that I came from San Francisco, California. That, in itself, would be more than enough information to help it localize and improve search results.
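To make the idea concrete, here is a minimal sketch of what generalizing a log entry could look like. It is an illustration only, not a description of how Google's logging works: the prefix-to-region table, the function names, and the log entry fields are all hypothetical, and a real system would need a proper geolocation database rather than a hard-coded table.

```python
import ipaddress

# Hypothetical prefix-to-region table, for illustration only.
# A real system would consult a geolocation database instead.
REGION_BY_PREFIX = {
    ipaddress.ip_network("129.53.0.0/16"): "San Francisco, California",
}


def coarsen(ip: str) -> str:
    """Return a coarse region label for an IP address, or "unknown"."""
    addr = ipaddress.ip_address(ip)
    for network, region in REGION_BY_PREFIX.items():
        if addr in network:
            return region
    return "unknown"


def generalize_entry(entry: dict) -> dict:
    """Copy a log entry, replacing the exact IP with a region label."""
    scrubbed = {key: value for key, value in entry.items() if key != "ip"}
    scrubbed["region"] = coarsen(entry["ip"])
    return scrubbed


entry = {"ip": "129.53.136.23", "query": "GM"}
print(generalize_entry(entry))
```

The generalized entry still carries enough to localize results (the region), but no longer ties the query to a single network address.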

Avoiding disincentives

Of all the excuses that Google's puppets have presented for retaining search logs, there is only one case where Google actually has a legitimate need to store information that identifies the individual user and network address: advertising clicks.

Google is an advertising company first, and a search engine second. Sometimes, we forget this, but Google has a lot of bills to pay. After all, those free meals and massages for employees have to be paid for somehow.

Google displays text advertisements on all of its web search results pages. Advertisers, for the most part, pay per click. That is, every time a user clicks on one of the ads, Google charges an advertiser a few cents (or dollars, depending on the search term). Because of the amounts of money at play, this tends to attract criminals wishing to defraud the system. Thus, it is not terribly surprising that Google wishes to retain information on the user who clicked.

What is most interesting to note, though, is that if a user does not click on one of Google's web advertisements, the only credible reason for retaining detailed search information becomes moot. If a user doesn't click, they can't possibly be engaged in click fraud, and thus there is no reason to retain identifying information on the user's search.

Were Google to institute a logging policy based on its actual information needs, it would find itself in a curious position: users who clicked on advertisements would have detailed logs retained for months, if not years, while users who didn't click on ads would quickly have any identifying information scrubbed from the logs and replaced with more generalized info.
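A rough sketch of such a two-tier policy follows. Everything in it is an assumption made for illustration: the retention windows, the field names, and the function itself are hypothetical, and the article's only claim is that clickers' records would be kept far longer than everyone else's.

```python
from datetime import datetime, timedelta

# Assumed retention windows; no specific figures are given for either tier.
CLICK_RETENTION = timedelta(days=180)
NO_CLICK_RETENTION = timedelta(days=1)


def apply_policy(entry: dict, now: datetime) -> dict:
    """Return a copy of the entry, with the IP scrubbed once its window has elapsed."""
    window = CLICK_RETENTION if entry["clicked_ad"] else NO_CLICK_RETENTION
    result = dict(entry)
    if now - entry["timestamp"] > window:
        # Keep only the generalized fields (such as region); drop the exact address.
        result["ip"] = None
    return result


now = datetime(2008, 4, 9)
base = {
    "ip": "129.53.136.23",
    "region": "San Francisco, California",
    "timestamp": datetime(2008, 4, 1),
}
print(apply_policy({**base, "clicked_ad": True}, now))
print(apply_policy({**base, "clicked_ad": False}, now))
```

Under these assumed windows, the week-old record of a clicker keeps its address, while the same record for a non-clicker retains only the region.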

The obvious problem with such a scenario would be that of incentives, especially once the policy was made public. Users would lose their privacy each time they clicked on an advertisement. Unfortunately for the company, this is exactly the wrong kind of message to send. It wants to encourage users to click on its text ads, not to provide incentives for customers to skip them.

Thus, to avoid creating that situation, and with it the disincentive to click on ads, Google logs data on every search, by every user. And because of this, we all suffer -- even those users who never even see ads, because they use technologies like AdBlockPlus and CustomizeGoogle.


Disclaimer: In 2006, I worked as a summer intern in Google's click fraud team. Shuman Ghosemajumder, Google's "Business Product Manager for Trust & Safety" and the person claiming that search logs prevent fraud, worked in the same team.

None of the information in this blog post involves confidential company information.

I was awarded a Google fellowship in both 2006 and 2007, for $5000 each time. Finally, I just returned from a Scholar Retreat in San Francisco, which the company paid for.

Comments
by Daniel_Brandt April 9, 2008 5:22 PM PDT
It's more than a desire to protect their advertising model, which admittedly accounts for 99 percent of Google's revenue. AdWords started in late 2000 or early 2001. When that happened it was something of a shock to those who were watching Google. They were considered pure compared to other corporations in Silicon Valley. PageRank was hyped as some sort of secret sauce, even though it was a rather obvious way to pre-rank the entire web and instantly show better results than the competition while using less real-time CPU overhead. Everyone was brainwashed by Google's propaganda and false modesty. I lost count of the stories about hard disks made out of Legos in a dorm room, to business out of a rented garage, to the guy who used to cook for the Grateful Dead. Google was brilliant, employees can bring a dog to work, Google was cool. The cookie with a unique ID in it that expired in 2038 was in use before AdWords. I remember seeing it in 2000, the year when President Clinton's administration issued an order forbidding all federal agencies from using permanent cookies. Until Google came along with that cookie, it was unheard of for any website to issue a cookie that expired more than ten years later. This all happened before Google used any advertising. Google's excuse then and today is that their cookie was needed to save the user's preferences. This is a lie. You don't need a unique ID in the cookie to save user preferences. And Google certainly didn't have click fraud in mind that early, because anyone who engages in such fraud will refuse cookies. Even today, journalists report that in mid-2007 Google changed their cookie to expire after two years, and neglect to mention that every time you visit a Google site, the expiration date is renewed for another two years. In other words, the two-year expiration date is basically a public relations trick. It really expires two years after you throw your hard disk into a dumpster. 
It's probably true that storing and analyzing IP addresses is the easiest way to defend against click fraud. But this could be done using a moving 30-day history of IP address activity, instead of a longer period. And it doesn't explain the reasons behind Google's cookie. There is much more going on here than Google's problems with click fraud. There is amazing arrogance across the entire spectrum of Google's efforts: copyright law, customer service, corporate nondisclosure policies, ageism, and the list goes on. Google has always been arrogant, and has always tended to defend its policies with either silence or with lies. I put most of the blame on those high-tech journalists who specialize in Google groveling. During 10 years of their coverage Google kept growing market share, and almost everyone at the Googleplex got rich beyond their wildest dreams. Under such conditions one can hardly expect Google to do things differently.
by unlimitedj April 23, 2008 2:13 PM PDT
I was researching to see what type of things I should be doing to protect myself better and found site after site about internet privacy concerns, like my ISP tracking keystrokes or Google keeping tabs on me for way too long--which I feel they shouldn't be keeping records at all. I definitely didn't, and still don't like the sound of that, or how my data can be seen by anyone able to tap into my network. Privacy concerns stem from anywhere possible; as long as you type something you are putting yourself at risk. I know I don't do anything illegal, so I don't need my keystrokes monitored or my ISP address logged by every site I visit. It is almost like Big Brother is coming with the way everything is monitored. Some places are freer than others but that doesn't mean people know how to protect themselves while surfing the internet. I use a proxy server so that any tracking efforts or data retention are kept free of any of my real information. This recent press release I came across discusses some new features for proxy servers that should be included in everyone's internet tool-belt.
About Surveillance State

Christopher Soghoian, a graduate student in the School of Informatics at Indiana University, delves into the areas of security, privacy and e-crime. He is a member of the CNET Blog Network. His homepage is www.dubfire.net/chris and his research group is available at www.stop-phishing.com.