
James Hamilton on cloud economies of scale

The Amazon.com vice president and distinguished engineer gives what may be a landmark presentation on economies of scale in data centers.

James Urquhart
James Urquhart is a field technologist with almost 20 years of experience in distributed-systems development and deployment, focusing on service-oriented architectures, cloud computing, and virtualization. James is a market strategist for cloud computing at Cisco Systems and an adviser to EnStratus, though the opinions expressed here are strictly his own. He is a member of the CNET Blog Network and is not an employee of CNET.

While it is often cited that cloud computing will change the economics of IT operations, it is rare to find definitive sources of information about the subject. However, the influence of economies of scale on the cost and quality of computing infrastructure is a critical reason why cloud computing promises to be so disruptive.

James Hamilton

James Hamilton, a vice president and distinguished engineer at Amazon.com and one of the true gurus of large-scale data center practices, recently gave a presentation at Mix 10 that may be one of the most informative--and influential--overviews of data center economies of scale to date. Hamilton was clearly making a case for the advantages that public cloud providers such as Amazon have over enterprise data centers when it comes to cost of operations.

However, as he made his case, he offered a wide variety of observations on everything from data center design to how human resource costs differ between the enterprise and a service provider.

Here are the key points that I took away from the presentation:

  • Everything is (probably) cheaper for a large-scale service provider than for the average enterprise. This is "economies of scale 101" stuff, but Hamilton backed up the assertion with his own data, claiming that servers, networking, and administration cost the average enterprise five to seven times what they cost a large provider. He did not cite the source of this data, but the enterprise values are consistent with my own observations.

  • The two quickest hits in data center operations are server costs and the cost of delivering power to the servers. In a calculation that runs somewhat counter to what some vendors have claimed about server power costs vs. the cost of the server itself, Hamilton lays out why, over the 10-year lifetime of a data center (with proper amortization cycles for the servers, the networking, and the facility itself), server costs are by far the dominant expense (54 percent), while the cost of the power actually consumed by the servers is a small fraction (11 percent). However, once you add the cost of delivering that power to the servers (and cooling them), the total cost of powering the servers becomes much more significant (34 percent).

    This means that focusing on getting better value from servers and from data center power and cooling equipment will deliver the biggest bang for the buck for the data center operator. (A back-of-the-envelope sketch of this amortization math follows the list below.)

  • Turning off a server is not as economically efficient as using the server fully at all times.

    Want to know what prompted Amazon's spot pricing model? Hamilton lays it all out for you. It works out better financially if Amazon can keep its servers running at all times and utilize them for paying customers. Spot pricing is a way to encourage consumption while allowing reclamation of the resources should they be needed for other purposes. Of course, you already knew that. (A toy sketch of the utilization math also follows this list.)
  • Large computing providers have a different relationship with their vendors than you do. The large volumes they deal in get them better prices, for one. However, server vendors are also willing to work with large providers on custom server designs because of that volume. That, in turn, makes the providers even more efficient than they would have been with off-the-shelf components.
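
To make the amortization point concrete, here is a back-of-the-envelope sketch, in Python, of the kind of math behind that cost breakdown. Every dollar figure and amortization period below is an assumption I made up for illustration; only the structure of the calculation (amortize each asset over its own lifetime, then compare monthly shares) reflects the argument in the talk.

    # Back-of-the-envelope monthly cost model for a large data center.
    # All dollar figures and amortization periods are illustrative assumptions,
    # not Hamilton's actual inputs.
    servers_capex = 45_000_000      # total server spend, amortized over 3 years
    facility_capex = 95_000_000     # power distribution and cooling plant: 15 years
    networking_capex = 9_000_000    # switches and routers: 4 years
    power_bill_per_month = 255_000  # utility bill for the energy actually consumed

    def monthly(capex, years):
        """Straight-line amortization to a monthly figure."""
        return capex / (years * 12)

    costs = {
        "servers": monthly(servers_capex, 3),
        "power distribution and cooling": monthly(facility_capex, 15),
        "networking": monthly(networking_capex, 4),
        "power consumed": power_bill_per_month,
    }

    total = sum(costs.values())
    for name, dollars in sorted(costs.items(), key=lambda kv: -kv[1]):
        print(f"{name:30s} ${dollars:>10,.0f}/month  {dollars / total:5.1%}")

    # Group everything power-related to see why delivering (and removing) the
    # power matters more than the electricity bill itself.
    power_related = costs["power distribution and cooling"] + costs["power consumed"]
    print(f"all power-related costs combined: {power_related / total:.1%}")

With these invented inputs, servers dominate and the electricity bill by itself is small, but the power-related total grows considerably once delivery and cooling are folded in, which is the shape of the result Hamilton reports.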
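
In the same spirit, here is a toy calculation of why an idle (or powered-off) server is so expensive per unit of useful work. The hourly figures are invented; the point is that the amortized capital and facility costs accrue whether the machine is busy or not.

    # Toy model: cost per useful server-hour as a function of utilization.
    # Both hourly figures are invented for illustration.
    fixed_cost_per_hour = 0.40     # amortized server + facility share, paid regardless
    marginal_cost_per_hour = 0.06  # extra energy drawn when the server is actually busy

    def cost_per_useful_hour(utilization):
        """Total hourly cost divided by the fraction of the hour doing paid work."""
        return (fixed_cost_per_hour + marginal_cost_per_hour * utilization) / utilization

    for u in (0.10, 0.30, 0.60, 0.90):
        print(f"utilization {u:4.0%}: ${cost_per_useful_hour(u):.2f} per useful server-hour")

Any spot price above the small marginal cost recovers money that an idle server would simply forfeit, which is the economic logic behind offering reclaimable capacity at a discount.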

All of that being said, I would be remiss if I didn't highlight a few things that Hamilton glossed over in this presentation:

  • "You can have any color you want, as long as it's black." That paraphrasing of a famous quote from Henry Ford in 1909 when discussing the Model "T" is reflected in a basic fact about today's large-scale infrastructure and platform clouds: you can build any application you like, as long as it fits into the infrastructure architecture prescribed by the provider. Configuration of the infrastructure itself is extremely limited, if not nonexistant.

    For example, several users of Tomcat have had to rework that application server's clustering mechanism to get it to work in Amazon EC2. Why? Because Amazon does not allow multicast, and has shown no indication that it will anytime soon.

    Another common example is the limited set of security configuration options and the amount of "do-it-yourself" work left to users of large-scale IaaS offerings. I'm not saying that's a deal breaker; you may not need much more than is provided, and you can get additional security capabilities through management platforms like enStratus. But risk mitigation is a key part of data center investment in the enterprise, and it pays to be aware of what you will have to accept from the cloud in that regard. (A minimal sketch of that do-it-yourself configuration appears after this list.)

  • Price isn't everything. If computing were all about getting basic CPU, storage, and networking capabilities at the lowest possible price, there would be no market for such things as high-availability infrastructure, high-performance computing, and the plethora of data center security software and hardware. Driving toward maximum economies of scale for those basic services without enabling support for more specialized needs means that those large-scale services won't be right for all workloads.

    Now, let me be clear that I think Amazon knows that, and we will see some pretty significant innovation from them to address alternative architectures and configurations in the coming year.

  • Enterprise data centers aren't going away overnight. Building new applications in the cloud is one thing, but transferring existing systems is something else entirely. Hamilton's calculations are entirely from the perspective of operating a data center. The cost of re-architecting and moving applications and data is not factored in, and that cost is the "barrier of exit" that will keep many applications in internal data centers for some time to come. They may or may not move to private cloud infrastructures, but they'll stay in house until they are replaced or the savings justify the cost of rework. (A toy break-even calculation appears after this list.)

    Joe Weinman, VP of Business Strategy at AT&T, has a great site discussing how you calculate the savings that public cloud computing brings to your applications.
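
As a concrete example of the "do-it-yourself" security work mentioned above, here is a minimal sketch of opening only the ports an application needs via an EC2 security group, using the boto library. The group name, ports, and address ranges are placeholders rather than a recommendation; everything beyond this kind of network filtering is left to you (or to a management platform).

    # Minimal sketch: a user-defined EC2 security group created with boto.
    # Group name, ports, and CIDR ranges are placeholders.
    import boto

    conn = boto.connect_ec2()  # credentials come from the environment or boto config

    web = conn.create_security_group('web-tier', 'front-end instances')
    # allow HTTPS from anywhere
    web.authorize(ip_protocol='tcp', from_port=443, to_port=443, cidr_ip='0.0.0.0/0')
    # administrative SSH narrowed to a (placeholder) corporate address range
    web.authorize(ip_protocol='tcp', from_port=22, to_port=22, cidr_ip='203.0.113.0/24')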
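
To put a rough number on that "barrier of exit," here is a toy break-even calculation, far cruder than a real analysis: how long the monthly savings from a cloud move take to pay back the one-time cost of re-architecting and migration. All figures are invented for illustration.

    # Toy break-even calculation for moving an existing application to the cloud.
    # Every figure is invented for illustration.
    onprem_monthly = 18_000   # current run cost in the internal data center
    cloud_monthly = 11_000    # projected run cost after the move
    rework_cost = 250_000     # one-time re-architecture, testing, and migration work

    monthly_savings = onprem_monthly - cloud_monthly
    months_to_break_even = rework_cost / monthly_savings
    print(f"monthly savings:      ${monthly_savings:,}")
    print(f"months to break even: {months_to_break_even:.1f}")

With these made-up numbers the rework takes roughly three years to pay for itself; if the application will be replaced sooner than that, the move never pays off and the workload stays in house.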

One has to be impressed by the true disruption that infrastructure and platform services are having on data center economics. Both enterprise IT customers and vendors should pay close attention and understand exactly how they will play in the changing data center landscape.