The massive data failure at Microsoft's Danger subsidiary threatens to put a dark cloud over the company's broader "software plus services" strategy.
A key tenet of that approach is that businesses and consumers can trust Microsoft to reliably store valuable data on its servers.
A week ago, though, Microsoft's Danger unit experienced a huge outage that left many T-Mobile Sidekick users without access to their calendar, address book, and other key data. That's because the Sidekick keeps nearly all its data in the cloud, rather than keeping the primary copy on the device itself.
Things got even worse on Saturday, when Microsoft said in a statement that data not recovered thus far may be permanently lost. It's not immediately clear how many people lost their data. The outage earlier in the week affected a broad swath of Sidekick users, though many saw their data return during the week.
While outages in the cloud computing world are common (one need only look at recent issues with Twitter or Gmail), data losses are another story, and this one stands among the more stunning in recent memory.
The Danger outage comes just a month before Microsoft is expected to launch its operating system in the cloud--Windows Azure. That announcement is expected at November's Professional Developers Conference. One of the characteristics of Azure is that programs written for it can run only in Microsoft's data centers and not on a company's own servers.
To be clear, the Azure setup is entirely different from what Danger uses: the Sidekick runs on an architecture Microsoft inherited rather than built (Microsoft bought Danger last year). Still, the failure would seem to be enough to give any CIO pause.
Update, 2 p.m. PT, 10/11/2009: I asked Microsoft for comment Saturday when I was writing this, in particular as to how the rest of its cloud might differ from the Danger setup.
Microsoft said Sunday that the fabric controller that manages the Azure service is built with redundancy in mind.
"We write multiple replicas of user data to multiple devices so that the data is available in a situation where a single or multiple physical nodes may fail," Windows Azure general manager Doug Hauger said in a statement to CNET News.
That doesn't mean Azure is immune to data loss, though I'm told an entire data center would have to be wiped out, as opposed to just a server or collection of servers. I'd be interested to know whether Microsoft will also offer multiple location options, so that users who want to can keep their data in more than one physical spot as well.
But that's just one of many questions raised by this spectacular failure. Among the other questions still looming large in my head are:
1. What backup procedures did Danger have?
2. Just how many of T-Mobile's Sidekick customers lost their data? (Feel free to let me know, Sidekick users.)