
Facebook explains what caused its widespread outage

The social network's services, including Facebook, Facebook Messenger, Instagram and WhatsApp, were offline for about six hours on Monday.

Queenie Wong Former Senior Writer

Facebook relies on data centers, such as this one in Oregon, to handle its massive traffic.

James Martin/CNET

Facebook said late Monday that the company believes a "faulty configuration" change caused a widespread outage that lasted roughly six hours.

"Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication," Facebook's vice president of engineering and infrastructure, Santosh Janardhan, said in a blog post. "This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt." 

Monday's outage also impacted the tools that Facebook employees use. Facebook said it hasn't found any evidence that user data was compromised during the outage. 

In a more detailed post published Tuesday, Janardhan said there was a "bug" in a tool meant to prevent mistakes like the one that triggered the outage. Facebook encountered multiple problems, including difficulty getting access to its data centers and to its domain name system servers, which had become unreachable. Often referred to as the phone book of the internet, DNS translates domain names like Facebook.com to numeric Internet Protocol addresses. "The total loss of DNS broke many of the internal tools we'd normally use to investigate and resolve outages like this," Janardhan said.
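
To illustrate the phone book comparison, here is a minimal Python sketch of a name lookup and of what happens when the server answering lookups cannot be reached. It is a toy model, not Facebook's setup or a real DNS resolver: the ToyResolver class, the example.com name and the 192.0.2.10 address (taken from a range reserved for documentation) are all hypothetical.

```python
class ToyResolver:
    """In-memory stand-in for a DNS server (hypothetical, not a real resolver)."""

    def __init__(self, records):
        self.records = dict(records)  # maps domain name -> IP address
        self.reachable = True         # flips to False to simulate an outage

    def resolve(self, name):
        if not self.reachable:
            # No answer at all: the "phone book" itself is offline.
            raise ConnectionError("DNS server unreachable")
        # Domain names are case-insensitive, so normalize before the lookup.
        key = name.lower().rstrip(".")
        if key not in self.records:
            raise KeyError(f"no record for {name}")
        return self.records[key]


# Illustrative record only; the address is from a documentation-reserved range.
resolver = ToyResolver({"example.com": "192.0.2.10"})
address = resolver.resolve("Example.com")

# Simulate the server dropping off the network, then try the same lookup.
resolver.reachable = False
try:
    resolver.resolve("example.com")
    outcome = "resolved"
except ConnectionError:
    outcome = "lookup failed"
```

In this sketch the record for the name still exists, but nothing can read it once the server is unreachable, so any tool that starts by looking up a name fails at that first step.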

Facebook also had to carefully manage how quickly it brought its services back online because a sudden surge in traffic could cause a new round of crashes. "Every failure like this is an opportunity to learn and get better, and there's plenty for us to learn from this one," Janardhan said. The company is extensively reviewing what happened.

The rare outage, which also impacted other apps owned by Facebook such as Instagram, WhatsApp and Facebook Messenger, showcased how dependent people and businesses are on social media even as the company faces more scrutiny from lawmakers and regulators. The Wall Street Journal recently published a series of stories detailing how Facebook knew about the platform's problems, including its harmful impact on the mental health of teenagers.

Former Facebook product manager Frances Haugen, the whistleblower who gathered the internal documents used by the Journal, testified before Congress on Tuesday.

Monday's outage was reminiscent of other times Facebook's services went offline. For instance, Facebook experienced an outage in 2019 that lasted more than 14 hours, which the social network said was the result of a "server configuration change."

