Internet worms and critical infrastructure

Security expert Bruce Schneier asks why a report into the big August blackout overlooks MSBlast as a culprit.

Did MSBlast cause the Aug. 14 blackout? The official analysis says "no," but I'm not so sure. A November interim report issued by a panel of government and industry officials concluded that the blackout was caused by a series of failures, with the chain of events starting at FirstEnergy, a power company in Ohio. A series of human and computer failures then turned a small problem into a major one. And because critical alarm systems failed, workers at FirstEnergy did not know what was happening and so did not stop the cascade.

This is where I think MSBlast, also known as Blaster, may have been involved.

The report gives a specific timeline for the failures. At 2:14 p.m. EDT, the "alarm and logging software" at FirstEnergy's control room failed. This alarm software "provided audible and visual indications when a significant piece of equipment changed from an acceptable to problematic condition." Of course, no one knew that it failed.

Six minutes later, "several" remote control consoles failed. At 2:41 p.m., the primary server computer that hosted the alarm function failed. Its functions were passed to a backup computer, which failed at 2:54 p.m.

Doesn't this sound like a computer worm wending its way through FirstEnergy's operational computers?

The report states: "For over an hour no one in FE's control room grasped that their computer systems were not operating properly, even though FE's Information Technology support staff knew of the problems and was working to solve them."

Doesn't this sound like IT working to clean a worm out of its network?

This massive computer failure was critical to the cascading power failure. The report continues: "Power system operators rely heavily on audible and on-screen alarms, plus alarm logs, to reveal any significant changes in their system's conditions. After 2:14 p.m. EDT on Aug. 14, FE's operators were working under a significant handicap without these tools. However, they were in further jeopardy because they did not know that they were operating without alarms, so that they did not realize that system conditions were changing."

Other computer glitches are mentioned in the report. At the Midwest Independent Transmission System Operator, a regional agency that oversees power distribution, there's something called a "state estimator." It's a computer used to determine whether the power grid is in trouble. This computer also failed, at 12:15 p.m. According to the report, a technician tried to repair it and forgot to turn it back on when he went to lunch.

The MSBlast worm first appeared on Aug. 11 and infected more than a million computers in the days following. It targeted a vulnerability in the Microsoft operating system, infected computers and, in turn, tried to infect other computers. And in this way, the worm automatically spread from computer to computer and network to network.

Although the worm didn't have to perform any malicious actions on the computers it infected, its mere existence drained resources and often caused the host computer to crash. To remove the worm, a system administrator had to run a program that erased the malicious code; then, the administrator had to patch the vulnerability so that the computer would not get reinfected.

The coincidence is too obvious to ignore. At 2:14 p.m. EDT, the MSBlast worm was dropping systems all across North America. The report doesn't explain why so many computers--both primary and backup systems--at FirstEnergy were failing at around the same time. But MSBlast is certainly a reasonable suspect.

Unfortunately, the report doesn't directly address the MSBlast worm and its effects on FirstEnergy's computers. The closest I could find is this paragraph, on page 99: "Although there were a number of worms and viruses impacting the Internet and Internet connected systems and networks in North America before and during the outage, the SWG's preliminary analysis provides no indication that worm/virus activity had a significant effect on the power generation and delivery systems. Further SWG analysis will test this finding."

Why the tortured prose? The writers take pains to assure us that "the power generation and delivery systems" were not affected by MSBlast. But what about the alarm systems? Clearly, they were all affected by something--and all at the same time.

Let's be fair. I don't know that MSBlast caused the blackout. The report doesn't say that MSBlast caused the blackout. Conventional wisdom is that MSBlast did not cause the blackout. But it's certainly possible that MSBlast contributed to the blackout. The primary and backup computers that hosted the alarm systems failed at the same time MSBlast was attacking Windows computers on the Internet. What operating system were the alarm computers running? Were they on the Internet? These are questions worth answering.

And regardless of the answers, there's a very important moral here. As networked computers infiltrate more and more of our critical infrastructure, that infrastructure is vulnerable not only to attacks but also to sloppy software and sloppy operations. And these vulnerabilities are not the obvious ones.

The computers that directly control the power grid are well protected. It's the peripheral systems that are less protected and more likely to be vulnerable. And a direct attack is unlikely to cause our infrastructure to fail, because the connections are too complex and too obscure. It's only by accident--MSBlast affecting systems at just the wrong time, allowing a minor failure to become a major one--that these massive failures occur.

In late January 2003, the Slammer worm knocked out 911 emergency telephone service in Bellevue. More recently, the Nachi worm disabled automatic teller machines made by Diebold. As commercial operating systems become more commonplace in critical systems, this sort of thing will become more common.