
Amazon working again, but what went wrong?

Site is back up and running after a two-hour outage. What went wrong at the e-commerce giant?

Stephen Shankland Former Principal Writer

Update 4:36 p.m. PDT with outside comment about possible causes of the Amazon.com outage.

Amazon posted an apology placeholder page for broken links during a two-hour outage. Amazon.com

A two-hour Amazon.com outage is over. Now on to the post-mortem: what triggered the problem?

Amazon declared itself clear of the problem this afternoon. "The Amazon retail site was down for approximately two hours earlier today beginning around 10:25 a.m. The site (is) back up," the company said in a statement.

But as to the explanation, the company only hinted that its complicated computing infrastructure was, unsurprisingly, a culprit.

"Amazon's systems are very complex and on rare occasions, despite our best efforts, they may experience problems. We work to minimize any disruption and to get the site back as quickly as possible," the company said, declining to comment further.

Human error?
The most likely culprit was simple human error, in the estimation of Shawn White, director of operations for Keynote Systems, which monitors Web site availability.

"Some engineer might have made a particular change, not knowing it could cause a trickle-down effect" that eventually brought down the site, White said.

For example, he said, somebody in charge of maintenance might have been directing Internet traffic to a particular group of servers, but selected the wrong group.

But at Amazon? "What I find still so surprising is it happened in the middle of the day. Typically you do that in off-peak hours," White said. "They rank on the top with performance and availability, consistently, time and time again."

Network attack?
Another possible explanation is an attack such as the distributed denial-of-service (DDoS) attack that struck Amazon and other high-profile sites in 2000. White thinks it unlikely, though, that a crushing load of network traffic brought Amazon down.

"These guys are experts at dealing with flash floods of users," including those that routinely arrive during peak shopping days, White said. "Usually, when you see a site going under because of traffic issues or a denial-of-service attack, you see a gradual slowdown in performance and drop in availability. Here we saw at 10:16 a.m. it completely dropped off 100 percent."

Soups Ranjan, a senior member of the technical staff of network protection and management company Narus, hasn't yet found any attack evidence.

"It doesn't seem to be the result of a network-initiated attack, at least from my preliminary analysis from our probes," Ranjan said.

Human error may not sound as gripping a tale as a network attack, but there's plenty of drama for the people responsible. And it's the career-limiting variety of drama, said Illuminata analyst Gordon Haff, who hazarded a guess that Amazon's problem involved its front-end Web servers.

The security group of WebSense, a Web site and communications protection company, also saw no evidence that Amazon's problem was security-related.

CNET staff writer Robert Vamosi contributed to this report.