What cloud computing can learn from 'flash crash'

Last week's "flash crash" may long be remembered as one of the first significant economic events caused by competing computer algorithms, in the form of high-frequency trading systems.

James Urquhart
James Urquhart is a field technologist with almost 20 years of experience in distributed-systems development and deployment, focusing on service-oriented architectures, cloud computing, and virtualization. James is a market strategist for cloud computing at Cisco Systems and an adviser to EnStratus, though the opinions expressed here are strictly his own. He is a member of the CNET Blog Network and is not an employee of CNET.

May 6, 2010, may long be remembered as one of the most significant dates in the young history of electronic trading. As has been widely reported, at about 2:15 p.m. EDT on that Thursday, several financial indexes dropped suddenly and precipitously, losing around 8 percent of the value they held at the start of the day in a matter of minutes. The market recovered much of that loss quickly but closed the day down overall.

Screenshot by Tom Krazit/CNET

While no definitive cause has been identified for the day's events, many financial market experts have pointed to the increasing presence of automated trading and electronic exchanges as a key cause of this "flash crash." The New York Times explained the importance of the new automated regime as follows:

In recent years, what is known as high-frequency trading--rapid automated buying and selling--has taken off and now accounts for 50 percent to 75 percent of daily trading volume. At the same time, new electronic exchanges have taken over much of the volume that used to be handled by the New York Stock Exchange.

In fact, more than 60 percent of trading in stocks listed on the New York Stock Exchange takes place on separate computerized exchanges.

Complex adaptive systems and unexpected behaviors
High-frequency trading is performed by automated systems that attempt to beat competitors to the best matches of buyers and sellers for particular stocks. These systems are deployed in the same data centers as the exchange systems themselves, and the success of a system often depends on shaving milliseconds off network and computing latencies.

What is critical to note, however, is that the number of high-frequency trading algorithms operating independently against the same market environment creates a sort of complex adaptive system, in which many interdependent agents adhering to known rules create a system that exhibits unpredictable or unexpected behaviors as a whole. In fact, financial markets are often heralded as one of the best examples of complex adaptive behavior.

One of the key traits that science has determined about these systems is that small causes can sometimes trigger giant effects. Think of a pile of sand on a table top. Drop one grain at a time on that pile, and you'll note not a gradual shift in the shape of the pile, but rather moments of relative quiet punctuated by noticeable avalanches. Eventually, one grain of sand triggers an avalanche that sends many grains to the floor.
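To make the sand pile image concrete, here is a minimal sketch of a toy sandpile model on a small square "table." The grid size, the toppling threshold of four grains, and the function names are all assumptions chosen for illustration; the point is only that identical single grains produce avalanches of wildly different sizes.

```python
import random

THRESHOLD = 4  # assumed: a cell topples once it holds this many grains


def drop_grain(grid, row, col):
    """Add one grain at (row, col) and topple cells until the pile is stable.

    Returns (topplings, lost): the avalanche size and the number of grains
    that fell off the edge of the table.
    """
    size = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    lost = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < THRESHOLD:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= THRESHOLD
        topplings += 1
        if grid[r][c] >= THRESHOLD:
            unstable.append((r, c))  # still overloaded, topple again later
        # Pass one grain to each of the four neighbors.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                grid[nr][nc] += 1
                if grid[nr][nc] >= THRESHOLD:
                    unstable.append((nr, nc))
            else:
                lost += 1  # the grain falls to the floor
    return topplings, lost


def simulate(size=11, grains=2000, seed=0):
    """Drop grains one at a time at random spots; record each avalanche size."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    avalanches = []
    total_lost = 0
    for _ in range(grains):
        topplings, lost = drop_grain(grid, rng.randrange(size), rng.randrange(size))
        avalanches.append(topplings)
        total_lost += lost
    return grid, avalanches, total_lost


if __name__ == "__main__":
    _, avalanches, total_lost = simulate()
    quiet = sum(1 for a in avalanches if a == 0)
    print("grains that caused no avalanche:", quiet)
    print("largest avalanche (topplings):", max(avalanches))
    print("grains on the floor:", total_lost)
```

Every grain is the same size, yet most do nothing while a few set off long chains of topplings, which is the behavior the analogy describes.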

This is what some assume happened in the "flash crash." One unexpectedly big trade seems to have triggered several automated systems to begin selling stock. Attempts to stop the drop by halting trading for specific stocks on the major exchanges were thwarted by continued selling on the new electronic exchanges, and probably confused the high-frequency trading systems further.

The result: about 800 points shaved off of the Dow in less than 30 minutes.

The future of cloud computing as a complex adaptive system
So what does this have to do with cloud computing? Well, today, very little really. Most cloud computing automation happens in pockets, with systems under management "belonging" to one set of algorithms that decide how application needs are matched with available resources. While the public clouds have some competition for resources today--Amazon's spot pricing option is evidence of this--the truth is, there is neither the volume nor the interconnectedness of cloud systems needed to create true complex adaptive system behaviors.

But think ahead. Imagine a world in which cloud systems operate in an increasingly interconnected fashion; a world in which the automation controlling the needs of a single application--or even a single aspect of a complex distributed application system--must compete for resources across the globe with every other application. Add to that the different service automation environments run by the cloud providers themselves, each making decisions for the good of the resources that make up that service.

You now have an environment with a large number of highly interdependent software agents attempting to operate a "market" of IT resources and services on behalf of other applications or even human beings. I can almost guarantee that this system--what some may think of as the Inter-Cloud--will display some quite unexpected behavior of its own.

Most of that will likely be good: strong pricing efficiencies, rapid re-provisioning of resources in response to unpredictable events, and so on. However, there will also be an increased possibility that some relatively small event somewhere--let's say the shutdown of a data center due to political conflict--will trigger a chain reaction of events that negatively affects a large number of customers.

Perhaps other data centers become overloaded themselves and can't meet their own service level agreements. Or the application management systems confuse each other into believing that resources are not available when they are, and the applications are degraded or shut down entirely.
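The first of those scenarios can be sketched as a toy cascade model. Everything here is an assumption made for illustration: the site names, the load and capacity figures, and the rule that a failed site's load is split evenly among the survivors. Real placement and failover logic would be far more involved.

```python
def cascade(loads, capacities, initially_failed):
    """Return the set of sites that end up failed.

    Assumed rule: the load of every failed site is shared equally by the
    surviving sites, and any survivor pushed past its capacity fails too.
    """
    failed = set(initially_failed)
    while True:
        survivors = [site for site in loads if site not in failed]
        if not survivors:
            return failed  # nothing left to absorb the load
        orphaned = sum(loads[site] for site in failed)
        share = orphaned / len(survivors)
        newly_failed = {
            site for site in survivors if loads[site] + share > capacities[site]
        }
        if not newly_failed:
            return failed  # the remaining sites absorbed the load
        failed |= newly_failed


if __name__ == "__main__":
    capacities = {"a": 100, "b": 100, "c": 100, "d": 100}

    # Lightly loaded sites: losing "d" is absorbed by the others.
    relaxed = {"a": 40, "b": 40, "c": 40, "d": 40}
    print(sorted(cascade(relaxed, capacities, {"d"})))

    # Heavily loaded sites: losing "d" tips "c" over, which takes out the rest.
    stressed = {"a": 70, "b": 80, "c": 85, "d": 60}
    print(sorted(cascade(stressed, capacities, {"d"})))
```

The same initial failure is harmless in one configuration and total in the other; the only difference is how close the surviving sites already were to their limits.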

The most likely scenario, in my opinion, would be sudden wild price swings that could add greatly to the cost of computing for all.

To meet the challenge of the "flash crash," regulators and exchanges are working together to increase the effectiveness of so-called circuit-breaker processes. As we build out the federation and interoperability capabilities of our cloud marketplace, we should do so with a full understanding that this market needs failure protection of its own.
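As a rough illustration of what such protection might look like in a resource market, here is a minimal sketch of a price circuit breaker. The class name, the 10 percent limit, and the five-minute window and halt period are all hypothetical values, not rules taken from any exchange or cloud provider. Prices are assumed to be positive.

```python
from collections import deque


class PriceCircuitBreaker:
    """Halts a hypothetical resource market after an outsized price move.

    A trade is rejected, and trading pauses, when its price differs from the
    oldest price still inside the time window by more than `max_move`.
    """

    def __init__(self, max_move=0.10, window_seconds=300.0, halt_seconds=300.0):
        self.max_move = max_move
        self.window_seconds = window_seconds
        self.halt_seconds = halt_seconds
        self._history = deque()  # (timestamp, price) pairs inside the window
        self._halted_until = None

    def is_halted(self, now):
        return self._halted_until is not None and now < self._halted_until

    def accept(self, now, price):
        """Return True if a trade at `price` may proceed at time `now`."""
        if self.is_halted(now):
            return False
        # Forget prices that have aged out of the window.
        while self._history and now - self._history[0][0] > self.window_seconds:
            self._history.popleft()
        if self._history:
            reference = self._history[0][1]
            if abs(price - reference) / reference > self.max_move:
                # Trip the breaker and start fresh once trading resumes.
                self._halted_until = now + self.halt_seconds
                self._history.clear()
                return False
        self._history.append((now, price))
        return True


if __name__ == "__main__":
    breaker = PriceCircuitBreaker()
    for t, price in [(0, 1.00), (10, 1.05), (20, 1.50), (100, 1.00), (320, 1.50)]:
        print(t, price, "accepted" if breaker.accept(t, price) else "rejected")
```

The design choice worth noting is that the breaker pauses everyone at once. As the "flash crash" suggested, a halt that applies to only some venues may simply move the selling elsewhere.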

For now, however, it's no big deal. Which is why the cloud market probably won't remember May 6, 2010, until it's too late.