Thus, as more businesses become dependent on the Internet, its reliability is increasingly questioned.
"I don't think anyone is immune to this," said Keith Lowry, vice president of security operations for network consulting firm Pilot Network Services. "It's science, but there's a lot about the connectivity of electronics that we are still learning."
"One misconfigured router or unforeseen event can take down a network," Lowry said. "It has to do with architecture, but there are a lot of unknown things that happen."
One thing is clear, stressed Lowry: "If you have one single point of failure, you are in trouble."
A so-called single point of failure seems to have caused--or at least exacerbated--Microsoft's problems this week.
Several Microsoft servers that provide a key Internet function--known as the Domain Name System (DNS)--apparently are on the same network. DNS servers act as a phone book for the Internet, linking the names used by Web sites, such as Microsoft.com and Yahoo.com, to the numerical computer addresses that locate their Web servers on the Internet. If a problem occurs with all of the key DNS servers that tell where on the Internet a company can be found, then that company's Web sites essentially drop off the Internet.
And that--seemingly--is what happened to Microsoft. By putting its DNS servers in the same place and not providing a Plan B in case of attack or disaster, Microsoft's design made it easier for the company's Web sites to virtually disappear from the Internet.
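The failure mode described here can be sketched in a few lines of Python. This is a toy model, not Microsoft's actual setup: the record, the server names (ns1, ns2, ns3) and the network labels (net-a, net-b, net-c) are all hypothetical, and the address comes from a range reserved for documentation.

```python
# Toy model of DNS resolution: a name resolves only while at least one
# name server for the domain is reachable from the Internet.
# Every name, network label and address here is a made-up placeholder.

ZONE = {"www.example.com": "192.0.2.10"}  # hypothetical record


def resolve(name, servers, reachable_networks):
    """Return the address for name, or None if no name server is reachable.

    servers: list of (server_name, network_label) pairs
    reachable_networks: set of network labels currently reachable
    """
    for _server_name, network in servers:
        if network in reachable_networks:
            return ZONE.get(name)
    return None


# All name servers sit behind one network: a single router error hides the site.
same_network = [("ns1", "net-a"), ("ns2", "net-a"), ("ns3", "net-a")]
# Name servers spread across independent networks.
spread_out = [("ns1", "net-a"), ("ns2", "net-b"), ("ns3", "net-c")]

after_router_failure = {"net-b", "net-c"}  # net-a has been cut off

print(resolve("www.example.com", same_network, after_router_failure))  # None
print(resolve("www.example.com", spread_out, after_router_failure))    # 192.0.2.10
```

With every name server on net-a, the lookup fails even though the Web server itself may be running. With the servers spread out, the same failure goes unnoticed by visitors.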
On Tuesday, a critical piece of networking hardware--known as a router--had been configured improperly by a technician, cutting off the giant's DNS servers from the Internet. As a result, many of the company's sites, including MSN.com, Hotmail.com, Expedia.com, Encarta.com and Microsoft.com, could not reliably be reached. At times, the blackout seemed near total. By 5 p.m. PST Wednesday, nearly 24 hours after the error was made, the software giant claimed it had solved the problem.
However, on Thursday, Microsoft's sites were down again--this time for almost two hours. The company later claimed that a hacker had targeted its systems with what is known as a denial-of-service attack.
A denial-of-service attack overloads a site's servers with a flood of data, effectively blocking legitimate Web surfers from accessing the site. In this case, the attack was aimed not at the servers, but at the hardware switches that route data to the Web sites--the "single point of failure" pointed out by experts. By flooding these routers with bogus requests for Web pages, the hacker ensured that legitimate requests for pages could not be processed by Microsoft's servers.
At one point, such legitimate page requests to the Microsoft Network languished at anywhere from an abysmal 1.5 percent success rate to around 70 percent, according to network consulting company Keynote Systems. Normally, the sites are able to fulfill 97 percent of all page requests, said Dan Todd, chief technologist for public services at Keynote.
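A rough way to see why a flood crowds out real visitors is the back-of-the-envelope model below. It assumes, purely for illustration, that a device can handle a fixed number of requests per interval and cannot tell bogus requests from legitimate ones. The function name and the traffic figures are invented and are not Keynote's measurements.

```python
def legit_success_rate(capacity, legit, bogus):
    """Expected fraction of legitimate requests served in one interval.

    Assumes the device serves arrivals without distinguishing legitimate
    from bogus traffic, up to `capacity` requests per interval.
    """
    total = legit + bogus
    if total <= capacity:
        return 1.0  # everything fits, so nothing is dropped
    return capacity / total


# Hypothetical numbers: a device that can handle 1,000 requests per interval.
print(legit_success_rate(capacity=1000, legit=800, bogus=0))     # 1.0
print(legit_success_rate(capacity=1000, legit=800, bogus=9200))  # 0.1
```

In this simplified picture the attacker never has to break into anything. Sending enough junk is sufficient to push the success rate for real visitors toward zero.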
However, Microsoft is not alone. Several outages caused primarily by server errors have plagued other companies as well.
In fact, a survey prompted by this week's Microsoft attack found that 38 percent of the companies with a .com address have a similar design flaw: critical DNS servers with a single connection to the Internet.
"It is clear that a stunning number of companies have serious DNS configuration problems, which can lead to failure at any time," said a statement by Sjofn Agustsdottir, an Icelandic researcher who queried a sample of 5,000 domains to complete the survey. "A single point of failure can go undetected for months, which is simply a disaster waiting to happen."
Earlier this month, online auctioneer eBay suffered a day of lengthy outages. During the outages, eBay visitors could access the company's home page and its category listings but weren't able to view individual auctions, place bids or list items. The company said the interruption resulted from a series of failures that affected its primary and backup systems.
Thursday's attack on Microsoft comes nearly a year after massive distributed denial-of-service (DDoS) attacks slowed, and in some cases halted, access to eight major Web sites, including Yahoo, eBay and CNN.com. DDoS attacks are denial-of-service attacks that use hundreds of servers to attack a single target, making the source of the attack much more difficult to find.
Eggs in a basket
Although the lesson for all companies seems to be "Don't put your eggs in one basket," others believe the entire Internet may be doing just that by relying so heavily on the domain name system to locate sites on the Internet. In other words, nearly everyone recognizes that putting DNS functions on a single server or a small group of servers leaves companies vulnerable to their URLs dropping off the Web, especially with the increasing complexity of Web sites, but few people do anything about it.
"DNS is now serving functions and protocols that it was never intended to be responsible for," said Paul Robertson, director of risk assessment at security provider TruSecure. "Load balancing, load sharing, and high-availability Web sites--those sorts of things are obviously not what the protocol was designed to do."
"Instead of going back and looking at DNS to see how we can re-engineer it, people are adding tricks to it to let them do what they want," he said.
Worse, if DNS starts fracturing under the stress, the whole Internet could be at risk, said the manager of research and development for network security firm @Stake, who would give only his old-school hacker handle, "Weld Pond."
"DNS itself is a single point of failure," he said. "Everything else relies on it. If a mail server is down, that doesn't mean the Web is down. But if a DNS server is down, then your site is off the Internet."
Despite that, at least one DNS expert asserts that the problem is one that can be easily repaired.
"This is trivially fixable by doing intelligent things in setting up your (DNS) servers," said David Conrad, chief technology officer of Nominum, a DNS software and service company.
"It is not directly related to the growth of the Internet aside from the fact that there are a lot of people that are connected to the Net today."