The company's Office 365, Hotmail, SkyDrive, and other online services were down last night but have reportedly all been restored.
Lance Whitney, Contributing Writer
Lance Whitney is a freelance technology writer and trainer and a former IT professional. He's written for Time, CNET, PCMag, and several other publications. He's the author of two tech books--one on Windows and another on LinkedIn.
Several of Microsoft's online services suffered an outage last night but are reportedly all back up at this point.
The company's Office 365, Hotmail, SkyDrive, and various Windows Live services were down throughout the world for a period of around three hours. Microsoft acknowledged the outage late yesterday in its Inside Windows Live blog and on its Office 365 Twitter feed and said that it was working to resolve the issue.
After a couple of hours of investigation, the company pinned the cause on a DNS (Domain Name System) issue and said that it was starting to see intermittent recovery in various regions of the world. DNS is responsible for translating domain names, such as microsoft.com, into IP addresses, such as 22.214.171.124, so that Internet traffic can be delivered to the right location.
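For readers curious what that translation step looks like from a program's point of view, here is a minimal Python sketch. It is an illustration only, not a description of how Microsoft's services or its DNS infrastructure work. The helper name `resolve` and its `lookup` parameter are hypothetical; `socket.gethostbyname` is the standard-library call that asks the system resolver for an IPv4 address.

```python
import socket
from typing import Callable, Optional


def resolve(hostname: str,
            lookup: Callable[[str], str] = socket.gethostbyname) -> Optional[str]:
    """Return an IPv4 address for hostname, or None if the lookup fails.

    `resolve` and `lookup` are hypothetical names used for illustration.
    By default the lookup is delegated to the system resolver.
    """
    try:
        return lookup(hostname)
    except OSError:
        # socket.gaierror (raised when a name cannot be resolved)
        # is a subclass of OSError, so it is caught here.
        return None
```

On a machine with a working resolver, `resolve("microsoft.com")` would return an address string. When the name cannot be resolved, the function returns `None`, which is roughly what client software sees when a DNS problem makes a service's name unresolvable even though the servers behind it may still be running.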
"Still working to restore service," tweeted Microsoft at around 11:00 p.m. PT. "Preliminary root cause suggests a DNS issue, though we're still working hard to restore."
An earlier update on the Inside Windows Live blog, posted at 9:45 p.m. PT, incorrectly stated that all services had been restored at that point. The blog corrected that statement about an hour later, acknowledging that customers were still experiencing connection problems.
After propagating its DNS fix throughout the world, Microsoft tweeted around midnight that it believed service had been restored for all Office 365 customers (and presumably for customers of Hotmail and the other affected online services).
Providing further details, a Microsoft representative told CNET that "on Thursday, September 8th at approximately 8 p.m. PDT, Microsoft became aware of a Domain Name Service (DNS) problem causing service degradation for multiple cloud-based services. A tool that helps balance network traffic was being updated, and for a currently unknown reason, the update did not work correctly. As a result, the configuration was corrupted, which caused service disruption. Service restoration began at approximately 10:30 p.m. PDT, with full service restoration completed at approximately 11:30 p.m. PDT. We are continuing to review the incident."
Amazon, too, has experienced outages over the past year with its Elastic Compute Cloud (EC2) service, which hosts the Web sites of many major companies. One disruption in April affected such customers as Quora and Reddit, while another one last month took Netflix, Foursquare, Quora, and Reddit offline.
Updated on 9/10 at 4:15 a.m. PT with a more detailed statement from Microsoft.