Twitter explains service meltdown

Twitter's vice president of engineering explains the system failure in a message to users.

Donna Tam Staff Writer / News

Twitter's service went down and came back up multiple times over a three-hour period this morning, all because of a bug, the company told users. The company said it was a "cascading bug" that spread into other parts of its system, causing the outage.

Twitter's VP of engineering, Mazen Rawashdeh, wrote in a blog post today that the bug isn't confined to a particular software element and instead "cascades" into other elements, affecting users worldwide. He said the company took "corrective actions" after it discovered the bug.

"Not how we wanted today to go. At approximately 9 a.m. PDT, we discovered that Twitter was inaccessible for all Web users, and mobile clients were not showing new tweets. We immediately began to investigate the issue and found that there was a cascading bug in one of our infrastructure components. This wasn't due to a hack or our new office or Euro 2012 or GIF avatars, as some have speculated today," he wrote.

Twitter appeared to be completely down starting at 9 a.m. PT, according to monitoring site Pingdom. According to the stats on Pingdom, this looks to be Twitter's worst crash in months.

Service came back mid-morning after about an hour of interruption, but the site then went down again. Rawashdeh wrote that recovery began around 10:10 a.m. PT and that full recovery began at 11:08 a.m. PT.

"Update: The issue is ongoing and engineers are working to resolve it," the company's blog status read at 10:57 a.m. PT. The site's service continued to yo-yo into the afternoon.

"We are currently conducting a comprehensive review to ensure that we can avoid this chain of events in the future," Rawashdeh wrote, adding that in the past six months, Twitter was at its highest marks for site reliability and stability. He cited a tweet from one user who said Twitter being down was the "closest thing to living without oxygen..."

"It's imperative that we remain available around the world, and today we stumbled," Rawashdeh wrote. "For that we offer our most sincere apologies and hope you'll be able to breathe easier now."