The problems slowed data transfers across the Net for a few days, network engineers said. But the outage highlights growing concerns about Net bottlenecks, which are pushing network companies toward new, expensive connections to keep their quality standards high.
"This was a particularly bad set of timing and circumstances," said Deepak Jain, director of strategic operations at AiNet, a Washington, D.C.-based network service provider.
The problem also comes at an awkward time for WorldCom, which is under tight federal scrutiny of its control over the Internet data backbone market and suffered a crippling outage of key data services late last year. Federal regulators last month nixed the company's planned merger with Sprint, in large part because of concerns that the two companies would control too much of this backbone network. The merger was formally called off today.
WorldCom was able to fix the problem early this morning, bringing the network back to normal capacity.
The Net is built around the ability of huge networks, run by companies like Sprint, WorldCom, Genuity and many smaller companies, to connect with each other. These companies can make individual arrangements to connect at private "peering points," swapping data that travels between their customers.
But a large part of this swapping is done at public peering points, where multiple networks connect at a single hub. These points are the network equivalent of a major city's railroad hub, where a train can switch tracks for dozens of new destinations instead of just one or two.
WorldCom's MAE West facility in San Jose, Calif., the site of this week's problem, is one of the largest of these public points.
According to a WorldCom spokeswoman, the problem stemmed from a failure in one of six early-generation traffic switches. Since 1998, the company has been moving its customers to a more powerful type of traffic-routing technology, and only about a third of its customers are still on the old system, she said.
Nevertheless, the single failure rippled through the system to the point where the facility's total traffic dropped to about a quarter of its usual level, according to WorldCom's own statistics.
The sharp drop-off in traffic gave the industry an immediate reminder of how much each network bottleneck can affect others. As one connection point goes down, others have to work harder to take up the slack, slowing even well-functioning parts of the network.
That's what happened this week, as WorldCom customers began to see slowdowns or serious traffic jams in the MAE West facility.
Many of the Internet service providers that were connected to the older generation of switches at the WorldCom facility moved their traffic elsewhere. But many of them went to the same places for the needed capacity, slowing connections at these other bottlenecks, Jain said.
The problem was less severe for companies that have already moved most of their service to the new generation of technology at the WorldCom connection points.
"We've been in the process of moving our peering points over for a while," said Scott Marcus, chief technology officer of Genuity. "We didn't experience much congestion."
The outage highlights a move among some in the industry toward new kinds of network-connection points run by companies like Equinix and AboveNet. These companies, which offer services similar to those of WorldCom and the handful of public peering points, are hoping to reduce the bottlenecks by creating a larger number of such facilities around the world.
News.com's Kurt Oeler contributed to this report.