Hallelujah! BlackBerry service is finally restored

Three days of disrupted service for millions around the world come to an end. RIM says service is back at 100 percent after a "frustrating" inability to fix things quickly.

Marguerite Reardon Former senior reporter
Marguerite Reardon started as a CNET News reporter in 2004, covering cellphone services, broadband, citywide Wi-Fi, the Net neutrality debate and the consolidation of the phone companies.

Research In Motion's BlackBerry service, which had been out since Monday in some parts of the world, has been fully restored, executives at the company said Thursday morning.

Co-CEOs Mike Lazaridis and Jim Balsillie informed investors and reporters on a conference call that the service to all BlackBerry customers in all regions of the world had been restored as of the wee hours of Thursday morning.

Lazaridis explained in slightly more detail what caused the problem. And he once again apologized to customers. Earlier on Thursday the company released a taped video apology from Lazaridis.

"Our inability to quickly fix this has been frustrating," he said.

He explained that on Monday a dual redundant core switch, which had been designed to help protect the BlackBerry infrastructure, suffered a hardware failure. This switch failure caused the e-mail and messaging services to go down in Europe, India, Africa, the Middle East, and parts of South America.

The backup switching architecture did not work as intended, and the systems in Europe quickly became overwhelmed, which is how the issue began rippling to other parts of the world. When technicians restarted the system, it took a long time to work through the backlog of messages and data passing through the infrastructure and for the system to become stable.

Now that service is restored, customers should be seeing inboxes on their BlackBerry smartphones fill up with messages that had been sent and queued up in the system.

"When you start to see the traffic flowing very quickly, that's a very good thing," co-CEO Jim Balsillie said.

Lazaridis said the company still doesn't know what caused the switch to fail and the redundant switch to not work properly. He wouldn't say who the vendor or vendors are that provide this equipment to RIM, but he said that the company is working with them to determine how this unusual failure occurred. The company is also auditing all of its systems to minimize the risk of something like this happening again.

Lazaridis explained that the interconnected nature of the telecommunications network is what caused the problems to spread throughout different regions of the world.

The very nature of RIM's architecture, which sends all e-mail and messaging traffic through RIM's own BlackBerry servers at data centers around the world, is both a blessing and a curse for the company. It provides the added security and device management that helped make the BlackBerry a success, but it also creates single points of failure.

Still, Balsillie defended the company's architecture. He said that over the past 18 months the company has had excellent reliability, with 99.97 percent uptime for its service. This translates into downtime of no more than 160 minutes per year. The traditional telephone network is considered to be the most reliable communications network, with 99.999 percent reliability, or no more than 5 minutes, 15 seconds of downtime in a year.
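For readers who want to check the arithmetic behind those uptime figures, here is a minimal sketch in Python. It assumes a 365-day year, and the function name annual_downtime_minutes is purely illustrative, not anything RIM uses.

```python
# Assumption: a 365-day year (525,600 minutes), ignoring leap years.
MINUTES_PER_YEAR = 365 * 24 * 60


def annual_downtime_minutes(uptime_percent: float) -> float:
    """Return the yearly downtime, in minutes, implied by an uptime percentage."""
    return (100 - uptime_percent) / 100 * MINUTES_PER_YEAR


# The two uptime figures cited in the article.
for label, uptime in [("BlackBerry service", 99.97),
                      ("Telephone network", 99.999)]:
    downtime = annual_downtime_minutes(uptime)
    # Split the total into whole minutes and leftover seconds.
    minutes, seconds = divmod(round(downtime * 60), 60)
    print(f"{label}: {uptime}% uptime -> about {minutes} min {seconds} s per year")
```

Under that assumption, 99.97 percent works out to roughly 158 minutes a year, in line with the "no more than 160 minutes" figure, and 99.999 percent works out to about 5 minutes, 15 seconds.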

Both executives acknowledged that they had let down customers. And they said they will work to regain their trust. But they stopped short of mentioning specifics for compensating customers. Lazaridis said that up until the call, the company was purely focused on restoring service. But now that service has been restored, it will turn its attention to figuring out how to win back the trust of its customers.

"That's something we will turn our attention to now," he said. "We plan to come to customers on this and make things right with them."

The outage, which is the worst in the company's 12-year history, has come at the worst possible time for RIM. It is facing stiff competition from rivals, such as Apple and Google, and it's been losing market share. But Balsillie said the company is not worried that this latest snafu will send customers fleeing.

"We worked 12 years to win the trust of our 70 million customers," Balsillie added. "And we are fully committed to winning that trust back."