What started out as a router glitch at a small Internet service provider in Virginia today triggered a major outage in Internet access across the country, lasting more than two hours in some places.
The problem started this morning at 8:30 a.m. PT when MAI Network Services, an ISP headquartered in McLean, Virginia, unwittingly passed some bad router information from one of its customers on to Sprint, one of the largest Internet backbone operators in North America. Because Sprint's backbone is used by so many smaller ISPs, the router problem was echoed, causing temporary network outages across the country and, perhaps, internationally.
The outage underscored the fragility of the infrastructure that underlies the global network and how easily a problem with one small ISP can be amplified throughout the Internet. Even so, the Net displayed a remarkable resilience that seems to disprove its doomsayers, who have predicted that the network is on the verge of collapse.
"This particular thing was a confluence of two or three things happening--human error, bug, and some policy problems--that all came together on the same day," said Jack Rickard, publisher of BoardWatch magazine.
"There are probably a hundred guys in back rooms keeping this stuff together, just barely," Rickard said of the Internet.
As of this evening, most ISPs, including Sprint, said their networks were operating normally. Earlier today, MAI Network Services, Sprint, and others appeared more like actors in a comedy of networking errors.
MAI's problems stemmed from bad router "table" information that directed routers operated by Sprint and other ISPs to transmit all Internet traffic to MAI's network. Routers are the hubs that guide data traffic throughout networks; router tables are essentially network road maps for directing data from router to router.
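To make the "road map" idea concrete, here is a minimal sketch in Python of how a router might pick a next hop from its table. It assumes the standard rule that the most specific (longest) matching prefix wins. The prefixes are reserved documentation ranges, and the names "peer-a", "peer-b", and "small-isp" are hypothetical. The details of the bad table entries in today's outage were not disclosed, so the sketch only illustrates how a wrong entry can, in principle, pull traffic toward a network that should not receive it.

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop for the longest prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None  # no route: the packet would be dropped
    # The most specific route (largest prefix length) wins.
    _, hop = max(matches, key=lambda item: item[0].prefixlen)
    return hop


# Hypothetical routing table; next-hop names are illustrative only.
table = {
    ipaddress.ip_network("198.51.100.0/24"): "peer-a",
    ipaddress.ip_network("203.0.113.0/24"): "peer-b",
}

before = lookup(table, "203.0.113.10")

# Assumption for illustration: a bad announcement adds a more specific
# route that points at the wrong network.
table[ipaddress.ip_network("203.0.113.0/25")] = "small-isp"

after = lookup(table, "203.0.113.10")
```

In this sketch the same destination resolves to "peer-b" before the bad entry is added and to "small-isp" afterward, even though the original, correct route is still in the table.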
MAI's network was almost instantly overwhelmed by all of the traffic pointing to its routers, and the company disconnected itself from the Internet by 9:15 a.m. PT. A number of major ISPs, including Sprint and UUNet, saw their networks turned into data "black holes" since most Internet routers were directing traffic to MAI.
Vincent Bono, director of the network services group at MAI, said there are safeguards that should have prevented the router table errors from being propagated throughout the network, but he was unsure why they didn't work.
"When you have thousands of routers, incorrect routing tables can have lots of problems," Bono said. "In theory, the minute we unplugged ourselves everything should have gone back to normal. We are not hooking back until we are certain we're not going to do it again."
A spokesman for Sprint said today that its network was down between 8:30 a.m. and 10:30 a.m. PT, when the company finally corrected its routing tables.
"Sprint recognized the situation and immediately began corrective action, including notifying other Internet backbone providers," Sprint said in a statement.
Sprint spokesman Charles Fleckenstein said the problem was not exclusive to the East Coast and could even have affected Internet access internationally. Sprint and other access providers were able to correct the problem by resetting their routing tables.
ISPs around the country reported problems with their networks. A spokeswoman for UUNet Technologies said that its network was affected on the West Coast for a short time but that it is functioning normally now.
Leonard Conn, chief executive of Oklahoma ISP Ionet, said his network was disconnected from other access providers for about 30 minutes, though users could communicate within Ionet's service.
One analyst said that the Internet is bound to experience more problems like today's outage but predicted that they would be manageable.
"Outages like this are very important to track," said Rebecca Wedsell, an Internet analyst for the TeleChoice consultancy. However, she added, "the problems will not bring the system to its knees."