
Researchers could face legal risks for network snooping


Chris Soghoian
Christopher Soghoian delves into the areas of security, privacy, technology policy and cyber-law. He is a student fellow at Harvard University's Berkman Center for Internet and Society, and is a PhD candidate at Indiana University's School of Informatics. His academic work and contact information can be found by visiting www.dubfire.net/chris/.

A group of researchers from the University of Colorado and University of Washington could face both civil and criminal penalties for a research project in which they snooped on users of the Tor anonymous proxy network. Should federal prosecutors take interest in the project, the researchers could also face up to 5 years in jail for violating the Wiretap Act.

The team of two graduate students and three professors neither sought legal review of the project nor ran it past the Human Subjects Committee at their university, putting them in a particularly dangerous position.

The academic paper, "Shining Light in Dark Places: Understanding the Tor Network" (PDF), was presented at the Privacy Enhancing Technologies Symposium yesterday in Leuven, Belgium. The authors are listed as Damon McCoy, Kevin Bauer, Dr. Dirk Grunwald, Dr. Tadayoshi Kohno and Dr. Douglas Sicker.

The goal of the project was to learn what kind of traffic was flowing over Tor -- a free network providing anonymous web and other Internet services to hundreds of thousands of users worldwide. Some of Tor's users include pro-democracy dissidents, journalists and bloggers in countries like China, Egypt and Burma, who would otherwise face arrest and torture for their work.

Tor relies on volunteers who donate computing power and bandwidth to run approximately 2500 publicly accessible proxy servers, which are then used by hundreds of thousands of people to hide their Internet traffic.

In order to study Tor, the researchers set up their own 'exit node' server on the University of Colorado's high-speed network. For four days in December 2007, they logged and stored the first 150 bytes of each network packet that passed through their server, revealing both the kind of traffic being carried and the remote websites that Tor users were visiting. While the authors do not state how many sessions they snooped on, they do state that their server carried over 700GB of data.

In a second part of the study, the researchers ran an 'entry node' on the network for 15 days, which allowed them to determine the source IP addresses of a large number of Tor users. They used this to learn which countries use Tor more heavily than others. Note that in this second part of the study, the researchers did not have access to destination site information, nor were they able to observe the kinds of traffic going through their server.
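The asymmetry between the two experiments follows directly from Tor's onion-routing design: the client wraps each message in one layer of encryption per relay, so an entry node learns who is connecting but not what they are sending, while an exit node sees the traffic in the clear but not who sent it. The toy sketch below illustrates that idea only; it assumes Python's third-party cryptography package, the relay names are hypothetical, and real Tor uses telescoping circuit construction with fixed-size AES-CTR cells rather than anything like this.

```python
# Conceptual sketch of Tor-style onion layering. This is NOT the real Tor
# protocol; it only illustrates who can see what at each hop.
# Requires the third-party 'cryptography' package.
from cryptography.fernet import Fernet

# One symmetric key per relay in a three-hop circuit.
keys = {hop: Fernet(Fernet.generate_key()) for hop in ("entry", "middle", "exit")}

def build_onion(payload: bytes) -> bytes:
    """The client wraps the payload once per hop, innermost layer for the exit."""
    cell = payload
    for hop in ("exit", "middle", "entry"):  # innermost to outermost
        cell = keys[hop].encrypt(cell)
    return cell

def peel(hop: str, cell: bytes) -> bytes:
    """Each relay can strip exactly one layer: the one matching its own key."""
    return keys[hop].decrypt(cell)

cell = build_onion(b"GET http://example.com/ HTTP/1.1")
cell = peel("entry", cell)      # entry relay: knows the client's IP, sees only ciphertext
cell = peel("middle", cell)     # middle relay: knows neither endpoint, sees only ciphertext
plaintext = peel("exit", cell)  # exit relay: sees the destination and any unencrypted payload
print(plaintext)
```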

The researchers found that HTTP (web traffic) was responsible for 58% of their server's bandwidth. They also found that the BitTorrent file-sharing protocol, while accounting for only 3% of connections, was responsible for over 40% of the overall bandwidth. German users, they observed, were responsible for over 30% of the requests through their server.
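Those disparities are easy to reproduce from per-connection records: a protocol that opens few connections can still dominate bandwidth if each connection moves a large volume of data, which is exactly the BitTorrent pattern observed. The sketch below is a rough illustration of that kind of aggregation, using made-up placeholder records rather than anything from the study.

```python
# Rough illustration of the kind of aggregate analysis reported in the paper.
# The per-connection records below are made-up placeholders, not study data.
from collections import Counter

# (protocol, bytes carried) for each observed connection -- hypothetical values
connections = [
    ("HTTP", 40_000), ("HTTP", 25_000), ("HTTP", 15_000), ("HTTP", 10_000),
    ("SSL", 30_000), ("SMTP", 5_000),
    ("BitTorrent", 900_000),   # few connections, but each moves a lot of data
]

conn_counts = Counter(proto for proto, _ in connections)
byte_totals = Counter()
for proto, nbytes in connections:
    byte_totals[proto] += nbytes

total_conns = sum(conn_counts.values())
total_bytes = sum(byte_totals.values())

for proto in sorted(conn_counts):
    print(f"{proto:<10} {conn_counts[proto] / total_conns:6.1%} of connections, "
          f"{byte_totals[proto] / total_bytes:6.1%} of bytes")
```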

No Legal Review Sought

In his presentation of the work at the PET Symposium yesterday, Kevin Bauer, one of the graduate students who wrote the paper, shed some light on the limited amount of legal analysis performed on the project.

Bauer said that the researchers "spoke informally with one lawyer, who told us that that area of the law is ill defined." Based on this, the researchers felt it was "unnecessary to follow up with other lawyers."

The lawyer they spoke to was Professor Paul Ohm, who teaches at the University of Colorado Law School. Ohm has previously collaborated with two of the researchers on an earlier publication, which discussed the legal risks faced by academics engaged in network monitoring research. Ohm, a former federal computer crimes prosecutor, has also been the subject of some media attention in recent months, after he publicly stated that ISP-level advertising and traffic-shaping systems may violate US wiretap laws.

In response to questions from this blogger, Professor Ohm seemed to distance himself from the researchers, writing by email:

I met with the research team once before they had finished their research, although I don't know how far along they were at that point. At the meeting, I gave them a very brief sketch about federal Wiretap law and they gave me a very brief sketch of their research. They seemed to have put in place a number of controls to try to minimize the risk of liability. I haven't seen the final paper (as far as I can recall).

I'm not their lawyer, and I've never been their lawyer, and I haven't produced any official or unofficial legal advice about their research, but because I spoke with them about this, I don't think it would be appropriate for me to give you any opinions about the research other than this brief statement.

Legal Risks

The Electronic Frontier Foundation, which wrote a legal guide for operators of Tor servers, strongly advises server administrators against snooping on their users. A section in the legal guide makes this clear:

Should I snoop on the plaintext that exits through my Tor relay?

No. You may be technically capable of modifying the Tor source code or installing additional software to monitor or log plaintext that exits your node. However, Tor relay operators in the U.S. can create legal and possibly even criminal liability for themselves under state or federal wiretap laws if they affirmatively monitor, log, or disclose Tor users' communications .... Do not examine the contents of anyone's communications without first talking to a lawyer.

While state laws vary, one immediate concern would be the Wiretap Act, a federal law that broadly prohibits snooping by network operators and others. The core prohibition of the Wiretap Act is found at section 2511(1)(a), which prohibits any person from intentionally intercepting, or attempting to intercept, "any wire, oral, or electronic communication." A violation of these rules is a Class D felony, and can result in fines of up to $250,000 and up to 5 years in jail.

It is this same law that groups such as the ACLU and EFF sued AT&T and other telecom companies for violating when they shared customer communications with the US National Security Agency. AT&T was able to obtain retroactive immunity from the US Congress, but only after spending tens of millions of dollars on lobbyists.

In order to learn more about the legal issues at play, I spoke with Kevin Bankston, the EFF lawyer who wrote the legal guide for Tor server operators and who also led the EFF's lawsuit against AT&T. Bankston told me:

"I agree that their logging the content exiting their nodes would appear to constitute interceptions of those electronic (not wire) communications under the Wiretap Act, and I don't think they qualify for the narrow provider exceptions [18 USC 2511, 2 (a) I], so I still see the same potential civil and criminal liability that was noted in our FAQ."

No Human Subjects Committee Review

In addition to possible legal issues, the project also raises serious ethical concerns related to the study of users' communications without their consent.

During his presentation, Bauer revealed that the researchers did not seek the approval of their university's Institutional Review Board -- a body that reviews research projects that involve human subjects. He said that "we were advised that it wasn't necessary," adding that the IRB review process is "used more in medical and psychology research at our university" and is not generally consulted for computer science projects.

Information listed on the website of the University of Colorado's Human Research Committee states: "All research involving human participants that is conducted by UCB faculty, staff or students must receive some level of review by the Human Research Committee."

Of particular concern to all Institutional Review Boards is any research that involves the study of participants under the age of 18, and other at-risk or vulnerable persons. Given that the users of the Tor network have gone out of their way to seek anonymity, and that in some cases their discovery could lead to arrest or torture, these users would almost certainly be considered vulnerable. Furthermore, it is quite likely that the snooped communications include at least a few users under the age of 18 -- something that the researchers did not address in their paper.

In a paper published earlier this year, Dr. Simson Garfinkel explored some of the common myths and pitfalls for computer security researchers who study real users and their behavior, and the need to submit such projects for IRB review.

Dr. Garfinkel specifically deals with one of the researchers' claims:

Myth: Because the Common Rule exempts research involving subjects that cannot be identified, IRB approval is not required when using anonymized data

Although this would certainly be convenient, most institutions only allow a determination of exemption to be made by the IRB itself.

A request for clarification on these issues, left with the director of the University of Colorado Human Research Committee, had not been answered by press time.

Other concerns

In addition to the issues surrounding US legal liability and the ethical concerns over research on human subjects, there is one other problem: international law.

While the researchers are Americans and conducted their study on a server based in the US, there is certainly an international angle to their study. Users from around the world sent traffic through the researchers' server, and as such, stricter Canadian and European intercept and data privacy laws may apply.

Furthermore, one of the strongest privacy protections inherent in the Tor system is the complete lack of logging. That is, if law enforcement agencies approach a Tor server administrator seeking information on a user of the system, the admin can truthfully reply that they have no logs, and thus have nothing that they can be compelled to produce.

Taking questions before their presentation, two of the authors told me that they still have a copy of the data they collected, and admitted that it was not currently stored on an encrypted disk. They did stress, however, that it was being kept in a "secure" location.

What this means, of course, is that law enforcement agencies could easily subpoena the data, legally compelling the researchers to hand it over. This places the users of the Tor network at significant risk, and it certainly violates the expected social norms of the system.
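For what it is worth, encrypting such a data set at rest is straightforward. The sketch below is a minimal illustration, assuming Python's third-party cryptography package and hypothetical file names; it is not a description of the researchers' own tooling. Even then, an encrypted copy can still be the target of a subpoena, which is why the strongest protection remains not retaining the data at all.

```python
# Minimal sketch of encrypting a captured data set at rest before archiving it.
# Assumes the third-party 'cryptography' package; file names are hypothetical.
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # store the key separately from the archive
fernet = Fernet(key)

with open("tor_capture.log", "rb") as f:        # hypothetical plaintext capture
    ciphertext = fernet.encrypt(f.read())

with open("tor_capture.log.enc", "wb") as f:
    f.write(ciphertext)
```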

During the question and answer session after his presentation, Bauer stated that the researchers were still not sure what they were going to do with the data set, and were exploring possibilities for releasing it to researchers in an anonymized and non-personally identifiable way. This statement was met with boos from the audience, which was mainly made up of privacy researchers and activists, a number of whom run their own legitimate Tor servers.

Caveat Emptor

While the US government did not send officials to this annual meeting of privacy researchers, the Canadian government did. A representative for Dr. Ann Cavoukian, the Information and Privacy Commissioner of Ontario, was in the audience during the presentation.

When asked for comment on the research project and any potential impact on Canadian citizens who may have used the snooping Tor server, Cavoukian issued the following statement:

"Whether you run an ISP, a search engine, a Tor server node, or a research project, the principle of Data Minimization should rule. Universal privacy practices require that strong limits be placed on the processing and storage of personal data. In today's online world of constant data availability, privacy requires data minimization at every stage of the information life-cycle: If you don't need the data, don't collect it in the first place; if you don't need it any more, then destroy it securely -- don't keep it any longer than you need to. Full stop."

Wise words indeed.