IBM will release a radical new chip next year that will go into a University of Illinois supercomputer in a quest to build what may become the world's fastest supercomputer.
That university's supercomputer center is a storied place, home to famous supercomputers both fictional and real. The notorious HAL 9000 sentient supercomputer in "2001: A Space Odyssey" was built in Urbana, Illinois, presumably on the University of Illinois Urbana-Champaign campus.
Though it does not aspire to artificial intelligence, the IBM Blue Waters supercomputer, like the HAL 9000 series, will be able to do massively complex calculations in an instant and, like HAL, will be built in Urbana-Champaign. It will be housed on the Urbana-Champaign campus in a building constructed specifically for the computer, which will theoretically be capable of achieving 10 petaflops, about 10 times as fast as the fastest supercomputer today. (A petaflop is 1 quadrillion floating point operations per second, a key indicator of supercomputer performance.)
Part of the National Center for Supercomputing Applications (NCSA) at the University of Illinois, it will be the largest publicly accessible supercomputer in the world when it's turned on sometime in 2011.
Supercomputers are essentially large collections of microprocessors acting in concert on a complex problem. As processor designs go, the IBM Power7 processor that will drive Blue Waters--due in the first half of 2010--is a big step for IBM: it integrates features of a chip used in the company's "Roadrunner" supercomputer, which has often been ranked as the fastest supercomputer in the world. Power7 fuses the flagship Power chip design with key technology from the separate "Cell" processor, which is part of IBM's Roadrunner system at the Los Alamos National Laboratory, according to Bradley McCredie, an IBM Fellow in the Systems and Technology Group.
"We took some of that genetic material from the Cell program--ways to do floating point (calculations)--and embedded that right into the Power7 core," McCredie said in an interview with CNET.
But that's not the only thing that makes the Power7 chip special. It integrates eight processing cores in one chip package and each core can execute four tasks--called "threads"--turning an individual chip into a virtual 32-core processor. As a yardstick, Intel's high-end Xeon processors typically have two threads per processing core.
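The thread arithmetic above can be checked with a short sketch. The constant and function names are illustrative only, and the Xeon comparison assumes an eight-core part purely to keep the core count equal; the article gives only the two-threads-per-core figure.

```python
# Figures taken from the article.
POWER7_CORES_PER_CHIP = 8
POWER7_THREADS_PER_CORE = 4
XEON_THREADS_PER_CORE = 2  # "typically", per the article


def hardware_threads(cores, threads_per_core):
    """Number of threads one chip can run at the same time."""
    return cores * threads_per_core


power7_threads = hardware_threads(POWER7_CORES_PER_CHIP, POWER7_THREADS_PER_CORE)

# Assumption for comparison only: a hypothetical chip with the same
# core count but two threads per core.
two_way_threads = hardware_threads(POWER7_CORES_PER_CHIP, XEON_THREADS_PER_CORE)

print(f"Power7 threads per chip: {power7_threads}")
print(f"Same core count at two threads per core: {two_way_threads}")
```

The "virtual 32-core" label counts threads, not physical cores: the four threads on a core share that core's execution resources, so 32 threads will not generally deliver the throughput of 32 separate cores.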
IBM is also using novel memory technology. Widely used "static" RAM, the on-chip memory in almost all processors today, can add as much as a billion transistors to high-end processors. IBM wanted to avoid these ballooning--and costly--transistor counts and elected to use a technology called E-DRAM (embedded DRAM), keeping the total number of transistors to 1.2 billion. "The equivalent number of transistors if we had done all of the cache in (static RAM) is well in excess of two billion," McCredie said.
And the chip's speed? Between 3GHz and 4GHz (IBM has yet to make a final decision), which is actually lower than the previous Power6 chip, which ran at 5GHz. "We have gotten performance from other spots, such as the dense E-DRAM. We had to back off from the gigahertz in order to get eight of these cores on to the chip and not have it melt," according to McCredie.
IBM has also made other tweaks to get the performance up. It has brought the circuitry for communicating with system memory--which was previously external to the processor--onto the chip and returned to "out of order" instruction execution.
And how does IBM keep this dense collection of ultrafast processors cool? In a word, water. "We actually went a bit further environmentally," said Ed Seminaro, an IBM Fellow who is involved with the University of Illinois project. "We took a lot of the infrastructure that's typically inside of the computer room for cooling and powering and moved the equivalent of that infrastructure right into that same cabinet with the server, storage, and interconnect hardware."
Seminaro continued: "The whole rack is water-cooled. We actually water-cool the processor directly to pull the heat out. We take it right to water, which is very power efficient," he said.
World's fastest supercomputer?
The Blue Waters project is funded by the National Science Foundation and follows a Defense Advanced Research Projects Agency (DARPA) project that also uses IBM's Power7 chip, according to Seminaro.
"We did a lot more than what we would have typically done because of this (DARPA) engagement," Seminaro said. "And what followed was the Blue Waters contract. Blue Waters was a bid to the National Science Foundation. We had strong story based on what we did for DARPA," he said.
Blue Waters will theoretically be able to hook together 16,384 Power7 chips--referred to as "nodes"--for a total theoretical performance of 16 petaflops. IBM said, however, that at least initially the theoretical peak performance will likely be closer to 10 petaflops, and that the much stricter (and more realistic) "sustained" performance on real-world software applications (not cited in Top 500 supercomputer statistics) will be one petaflop.
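A back-of-the-envelope sketch shows what those totals imply per chip. The per-chip and per-core numbers below are derived here by simple division and are not figures IBM has stated; the eight-cores-per-chip count comes from earlier in the article, and the variable names are illustrative.

```python
PETA = 10**15
TERA = 10**12

# Figures quoted in the article.
chips = 16_384
max_peak_flops = 16 * PETA      # theoretical maximum configuration
initial_peak_flops = 10 * PETA  # likely initial theoretical peak
sustained_flops = 1 * PETA      # expected on real-world applications
cores_per_chip = 8

# Derived values (assumption: peak performance divides evenly across chips).
per_chip_teraflops = max_peak_flops / chips / TERA
per_core_gigaflops = max_peak_flops / chips / cores_per_chip / 10**9

# Share of the initial theoretical peak that real applications would sustain.
sustained_fraction = sustained_flops / initial_peak_flops

print(f"Implied peak per chip: {per_chip_teraflops:.3f} teraflops")
print(f"Implied peak per core: {per_core_gigaflops:.1f} gigaflops")
print(f"Sustained / initial peak: {sustained_fraction:.0%}")
```

Under these assumptions each chip contributes a little under one teraflop of theoretical peak, and the sustained target is about a tenth of the initial peak figure.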
But organizations like DARPA and the NSF are not looking only at Top 500 "peak" benchmarks, which can be achieved rather crudely, according to Seminaro. "You can get a pretty good number without a lot of bandwidth (speed) between nodes," he said, referring to the Top 500 Supercomputer "LINPACK" benchmark. "Because there's almost no communication between nodes that you have to do for this benchmark. And you can get away with very poor memory bandwidth."
In Blue Waters' case, the transfer rate between nodes is a game changer, Seminaro believes. "The transfer of data between any of those two nodes in the system is at the full rate of 192GB per second--peak," he said. "So, you can get data from anyplace to anyplace at that kind of speed with latency on the order of less than one microsecond."
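To put those interconnect figures in perspective, here is a rough estimate of node-to-node transfer times. It uses a simple latency-plus-size-over-bandwidth model, which is an assumption made for illustration and not a description of how the Blue Waters interconnect actually behaves. It also treats 1GB as 10^9 bytes and uses the quoted one-microsecond figure as an upper bound on latency.

```python
# Figures quoted by Seminaro; both are peak or best-case values.
BANDWIDTH_BYTES_PER_S = 192e9  # 192GB per second, decimal gigabytes assumed
LATENCY_S = 1e-6               # "less than one microsecond", used as a ceiling


def transfer_time(num_bytes):
    """Estimated seconds to move num_bytes between two nodes.

    Simplified model: a fixed latency plus the payload size divided
    by the peak bandwidth.
    """
    return LATENCY_S + num_bytes / BANDWIDTH_BYTES_PER_S


for size in (1e3, 1e6, 1e9):
    microseconds = transfer_time(size) * 1e6
    print(f"{size:>13,.0f} bytes: {microseconds:,.2f} microseconds")
```

In this model, latency dominates small messages while bandwidth dominates large ones, which is why both numbers matter for applications that exchange data constantly between nodes.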
This kind of performance is crucial for big companies and governments alike. "Companies [including] Boeing, GM, and Ford, use these systems heavily. Most of the crash tests are now done on these machines. And weather prediction--a large percentage is done on this platform," Seminaro said. More specialized government-centric applications include simulations of how to properly dispose of nuclear waste, he added.
Blue Waters is the largest supercomputer project that Seminaro has been involved in since 1999, when he first participated in a supercomputer project. "This is really the biggest," he said.
And watch out, Intel: Power7 is coming to commercial server products too. Said Seminaro: "We will be shipping [Power7 processors] sometime in the first half of next year in some [of] our commercial products."
Updated at 3:45 p.m. PST: clarifying the statement about the IBM Power architecture and the IBM Roadrunner supercomputer. Roadrunner uses IBM Cell processors in combination with processors from Advanced Micro Devices.