Blackout shows Net's fragility

A dispute between major Internet backbone companies has selectively shut down e-mail and Web sites for many online.

John Borland Staff Writer, CNET News.com
A correction was made to this story. Read below for details.
Since early Wednesday, Phil Bradham, the network engineer at Boston's Museum of Fine Arts, has been cut off from the parts of the Internet he needs the most.

He can't reach his Web hosting company to update his site. Critical e-mails aren't going through, and some aren't reaching him. He can't get to some important sites on the Net, such as the popular Wikipedia encyclopedia.

The source of Bradham's difficulties is a feud between two big Internet backbone companies--the long-haul networks that most consumers and even most businesses ordinarily have little to do with. One of these companies, Level 3 Communications, has cut off direct communications with rival Cogent Communications, causing many of each company's customers to lose access to potentially significant swaths of the Net.

"We've been working with both (companies), but neither one will do anything until the other one budges," said Bradham. "It's very frustrating that two top companies would try to resolve this with a standoff like this."

In theory, this kind of blackout is precisely the kind of problem the Internet was designed to withstand. The complicated, interlocking nature of networks means that data traffic is supposed to be able to find an alternate route to its destination, even if a critical link is broken.

In practice, obscure contract disputes between the big network companies can make all these redundancies moot.

At issue is a type of network connection called "peering." Most of the biggest network companies, such as AT&T, Sprint and MCI, as well as companies including Cogent and Level 3, strike "peering agreements" in which they agree to establish direct connections between their networks.

That means that when a Cogent customer wants to visit a Web site hosted by Level 3, the data can take a short, fast path, instead of winding its way around the broader Internet.

Typically, peering agreements are made without any money changing hands, since each company expects to hand off a roughly comparable amount of traffic. Smaller network companies buy what are called "transit" agreements with larger companies, in order to hand off their customers' traffic to the big networks.

Peering gone wrong
These collegial peering relationships among big companies allow traffic to flow efficiently across the Net without most customers knowing anything about the under-the-hood relationships. But when these relationships go sour, the feuding parties' lack of flexibility can result in blackouts like the one that occurred this week.

In this case, Level 3 says that it believes it is substantially larger than its rival, and told Cogent as long as 90 days ago that it was planning to sever the direct connection between the two networks. The connection could be re-established if Cogent were to pay Level 3 access fees for use of its network, the company says.

For its part, Cogent contends that it is similar in size to Level 3, and that it makes no sense to pay for the kind of peering relationship that it maintains with many other companies. Cogent is offering any Level 3 user who can't get to Cogent sites free Internet service for a year, in an attempt to attract its rival's customers.

"Our goal is to have this problem go away, whether through Level 3 reconsidering, or their customers coming to us," said Dave Schaeffer, chief executive officer of Cogent.

As of mid-Thursday, both sides said they were committed to their positions and showed no willingness to budge, despite complaints from customers on both networks that they can't reach Web sites, can't send e-mail to some addresses or can't receive it from others. This means that there is no immediate fix ahead, unless customers (or their ISPs) find an alternative or auxiliary network provider.

The scale of the problem
It's impossible to say precisely how many people are affected. Many customers of the two companies, and customers of the ISPs that use one of the networks, buy connections from several providers simultaneously to avoid outages of this kind.

Correction: This story incorrectly stated the day Phil Bradham began having difficulties accessing his Web site. The problem began Wednesday.

However, many businesses, individuals and even some ISPs have so-called single-homed network connections, which means they depend on a single provider to reach the Internet. (Think of this as a town with a single road leading in and out, instead of several different highways.)

These single-connection customers are the ones hardest hit by Level 3's decision. Because Level 3 and Cogent each use direct connections to other networks to exchange traffic--rather than paying a third party to provide redundant or backup transmission service--there is no alternate route for data from one network to reach the other.
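To see why a third backbone doesn't automatically bridge the gap, it helps to model the business rules as well as the wires. The sketch below is a simplified illustration, not a description of either company's actual routing: it assumes the common convention that a network carries traffic between two parties only when at least one of them is its paying customer, so a route may climb through providers, cross at most one peering link, and then descend to customers. The names `single_homed_site`, `multi_homed_site`, `hosting_company` and `other_backbone` are hypothetical and do not come from the story.

```python
from collections import deque


def peer_map(links):
    """Turn a list of (a, b) peering links into a symmetric lookup table."""
    table = {}
    for a, b in links:
        table.setdefault(a, set()).add(b)
        table.setdefault(b, set()).add(a)
    return table


def reachable(src, dst, providers, peers):
    """Return True if dst can be reached from src under the simplified rule:
    climb customer->provider links, cross at most one peering link,
    then only descend provider->customer links."""
    # Invert the customer->providers table to get provider->customers.
    customers = {}
    for cust, provs in providers.items():
        for prov in provs:
            customers.setdefault(prov, set()).add(cust)

    start = (src, "up")  # "up": still allowed to climb or cross a peering link
    seen = {start}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        if node == dst:
            return True
        steps = []
        if phase == "up":
            steps += [(p, "up") for p in providers.get(node, ())]
            steps += [(q, "down") for q in peers.get(node, ())]
        # Descending to a customer is always allowed.
        steps += [(c, "down") for c in customers.get(node, ())]
        for state in steps:
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False


# Hypothetical topology: who buys Internet service from whom.
PROVIDERS = {
    "single_homed_site": {"cogent"},
    "multi_homed_site": {"cogent", "other_backbone"},
    "hosting_company": {"level3"},
}

# Peering links before and after the direct connection is cut.
peers_before = peer_map([("cogent", "level3"),
                         ("cogent", "other_backbone"),
                         ("level3", "other_backbone")])
peers_after = peer_map([("cogent", "other_backbone"),
                        ("level3", "other_backbone")])

for label, peers in (("before", peers_before), ("after", peers_after)):
    for site in ("single_homed_site", "multi_homed_site"):
        ok = reachable(site, "hosting_company", PROVIDERS, peers)
        print(label, site, "->", "hosting_company", ok)
```

In this toy model the single-homed site reaches the hosting company before the cut but not after it, even though both backbones still peer with `other_backbone`: that network has no paid obligation to relay traffic between two of its peers. The multi-homed site keeps working because it can send traffic up through its second provider.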

The result: blackouts such as those Bradham and other customers are seeing.

According to Cogent, between 5 percent and 10 percent of its customers were affected. Level 3 did not provide an estimate. Because some of those customers could be ISPs with thousands or hundreds of thousands of their own customers, the number of people affected could range into the millions.

CNET News.com readers have reported problems with both business and home connections.

William Steele, a senior network engineer for Syncro Services, said his company noticed one such problem Wednesday morning.

"There are some people I can't send an e-mail to," Steele said. "At home I have Road Runner as an ISP, and wasn't even able to remotely connect in order to manage our servers."

A spokesman for Time Warner Cable confirmed that many of the company's Road Runner cable modem customers would be affected.

"That means some sites they might normally visit are not available to them right now," the company said in a statement. "We are working to find alternate pathways so our customers can be reconnected with these Web sites as soon as possible."

In the past, network outages stemming from this kind of private contract dispute have prompted some to call for regulatory oversight, or at least legal action.

In 2001, a similar contract dispute led Cable & Wireless to cut off its connection to PSINet, one of the oldest Net backbone companies. After outcries by customers, the connection was restored several days later.

Even Cogent says it prefers to handle this kind of problem without government getting involved.

"We don't think there should be any involvement in terms of regulatory oversight," Cogent spokesman Jeff Henriksen said. "These are individual contracts based on specific needs of individual providers."

As the outage stretches on, however, it highlights fragility in what seems like a deeply interconnected Net. Many people remain unaware of the problem, and it can be expensive for users to address it.

"I have been pushing for years to have a redundant ISP for our traffic," Bradham said. "But we're a nonprofit. We don't have the money available to do that."