Google's recent announcement that it may have found a way to predict U.S. flu trends has led to the inevitable expressions of concern from some privacy groups.
The Electronic Privacy Information Center and Patient Privacy Rights sent a letter this week to Google CEO Eric Schmidt saying that if the records are "disclosed and linked to a particular user, there could be adverse consequences for education, employment, insurance, and even travel." The letter asks for more disclosure about how Google Flu Trends protects privacy.
In reality, Google is releasing precisely zero personally identifiable information about its users.
Instead, Google Flu Trends publishes one number per state, representing the company's best guess, based on search queries, at the level of influenza-related cases in that state. These are the same type of regional statistics that the Centers for Disease Control and Prevention already publishes.
If you think that knowing that Alaska's "influenza-like illness" number for the week of November 9 is 2.035 and California's number is 1.384 is somehow worrisome and can identify you personally, it's time to break out your tinfoil hat.
"There are no new privacy implications," Mike Yang, a Google lawyer, told me on Friday.
EPIC acknowledges that Google Flu Trends may prove useful. But the group is also making a more subtle argument.
"The basic question I'm asking to Google is: how can it be that across all these key terms, you can generate aggregate anonymized data without any risk of reidentification?" said Marc Rotenberg, EPIC executive director.
Put another way, what if an attorney general in a state where marijuana was illegal sent a subpoena to Google asking for the identities of anyone who typed in "how to grow pot"? Or, if abortion were illegal in a certain state, what if the subpoena wanted to know who typed in "how to get an abortion"?
Google has told us in the past that, given a list of search terms, it can produce a list of people who searched for those terms, identified by IP address and/or cookie value. If they're registered with Google, the company also knows the names they typed in when registering. (Google partially anonymizes log files after nine months. And, of course, the company has fought legal battles to keep these data confidential.)
EPIC's answer to these hypotheticals is to pressure Google--and this week's letter was a part of that strategy--to keep logs for an even shorter period of time, or not at all.
EPIC's Web site darkly warns: "There are no clear legal or technological privacy safeguards that prevent the disclosure of individual search histories. Without such privacy safeguards Google Flu Trends could be used to reidentify users who search for medical information. Such user-specific investigations could be compelled, even over Google's objection, by court order or presidential authority."
Yet keeping search logs for nine months may be useful for dealing with advertising-related questions and for optimizing a search engine's responsiveness. If users don't like that, nobody's forcing them to use Google. They also have the choice of using an anonymizing service like Tor.
But if the problem is bad laws and nebulous "presidential authority" that permits fishing expeditions, then it makes sense to fix them. This week's letter might have been better addressed to President-elect Barack Obama and Democratic leaders in Congress, asking them to make sure the Fourth Amendment's protections are extended to information stored by third parties like search engines. Unfortunately, that's not likely to happen.
Disclosure: The author is married to a Google employee.