SGI claims lead in supercomputer race
Computer maker evolves from its "Jurassic" era with a Linux-based machine. NASA gives it a thumbs-up.
The speed is a notch faster than the 36.01 teraflops IBM reported for its Blue Gene/L system in September. That performance was enough to edge Big Blue ahead of NEC's Earth Simulator, which since 2002 has led a list of the world's 500 fastest supercomputers. IBM's test, performed Sept. 16, also is likely to be outdone by a later score.
The Top500 rankings are updated twice yearly, and new results will be published at the SC2004 supercomputing show in Pittsburgh beginning Nov. 6.
SGI isn't promising it will remain the speed king. "It's clear there are a few horses all making a dash for the finish line," said Dave Parry, senior vice president of SGI's server and platform group. "IBM may scrape up a few more cabinets and do a little more."
Virginia Tech is another organization looking for a little supercomputer glory before the new list is published. The school announced Tuesday that it clocked its upgraded System X machine at 12.25 teraflops.
Columbia uses Itanium 2 processors--a combination of current models that come with 6MB of high-speed cache memory and as-yet-unannounced models with 9MB of cache.
SGI has been struggling to reclaim the prowess and prestige it had in the 1990s, when its high-end computers stood out for demanding graphics challenges such as digital animation in the movie "Jurassic Park." But using Intel processors has given the company's equipment a boost, said Walt Brooks, division chief of NASA's Advanced Supercomputing Center--the Itanium machines are six times faster than the SGI models they replace.
SGI's system differs from the clusters of low-end machines that make up most supercomputers today. Columbia is made of twenty 512-processor machines connected with the high-speed InfiniBand networking technology, and each machine runs a single operating system.
That "single-system image" approach is good for tasks such as simulations of the space shuttle's aerodynamics, Brooks said. Clusters can be used for fluid dynamics, "but it's extremely inefficient with those systems and the programming is very difficult," he said.
Another task the systems are being used for is hurricane forecasting. Software under development now shows promise at being able to forecast a hurricane's path five days into the future with the accuracy of current two-day forecasts, Brooks said.
The system is also much more complicated than a conventional server. After a surprise power outage hit the machine early Tuesday, engineers were called in early to oversee the two-hour reboot of the complete machine, Brooks said.
The system was assembled with extraordinary rapidity--120 days from the close of the deal to design, assemble and test the system, executives and NASA officials said at an unveiling Tuesday. NASA had to secure approval from the Office of Management and Budget, the Office of Science and Technology Policy, Republican and Democratic appropriations committee staffs for the House and Senate, and six NASA divisions.
"That all happened in a month. Can you imagine moving the entire government in 30 days?" asked G. Scott Hubbard, director of NASA Ames. And the system has been used as sections were brought online. "This is like making up the bed with the patient in it," Hubbard said.
SGI has just completed work connecting four of the 512-processor machines into a single system with 2,048 processors running a single instance of Linux. Future upgrade possibilities include joining another four in a similar way, along with boosts from newer Intel processors, Brooks said.
Initial Columbia components were SGI's conventional Altix 3700 machines, but the newer parts are a replacement model that's got twice the processor density, Parry said.
NASA has been a longtime customer of SGI's top-end systems. The new 512-processor machines can perform at about three teraflops each--six times the performance of the earlier systems using SGI's MIPS chips and Irix operating system, Brooks said.
However, that performance comes at a cost: heat. The cabinets of the SGI systems in Columbia have been specially modified with water-cooled radiator systems that chill the hot exhaust air that rises off the chips.
Columbia takes up the area of about three basketball courts, Brooks said. It currently consumes about 2 megawatts of the facility's 8-megawatt capacity.
The largest MIPS-Irix system SGI ever built had 1,024 processors--another NASA Ames machine, Chief Executive Bob Bishop said.
Intel Chief Operating Officer Paul Otellini said in September that the system would have a performance of 60 teraflops, but that was an estimate of the machine's peak speed. The Top500 list ranks computers according to a number that's typically lower, the sustained performance.
The ratio of sustained performance to peak performance is called efficiency. The 16-system speed test had an efficiency of 88 percent.
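As a rough illustration of the efficiency figure, the short Python sketch below works through the ratio using only numbers quoted in this article. It rests on two assumptions that are not stated in the article: that the 60-teraflop peak estimate covers all 20 machines, and that peak speed scales linearly with the number of machines. The function and variable names are made up for this example, and the result is a back-of-envelope figure, not the measured test score.

```python
def efficiency(sustained_tflops: float, peak_tflops: float) -> float:
    """Return sustained performance divided by peak performance (a fraction)."""
    if peak_tflops <= 0:
        raise ValueError("peak performance must be positive")
    return sustained_tflops / peak_tflops


def sustained_from_efficiency(peak_tflops: float, eff: float) -> float:
    """Return the sustained performance implied by a peak figure and an efficiency."""
    return peak_tflops * eff


# Figures quoted in the article.
MACHINES_TOTAL = 20        # 512-processor machines in Columbia
MACHINES_TESTED = 16       # machines used in the speed test
PEAK_FULL_TFLOPS = 60.0    # peak estimate for the system (assumed: all 20 machines)
TEST_EFFICIENCY = 0.88     # efficiency reported for the 16-system test
IBM_TFLOPS = 36.01         # IBM's reported Blue Gene/L result

# Assumption: peak speed scales linearly with machine count.
peak_tested = PEAK_FULL_TFLOPS * MACHINES_TESTED / MACHINES_TOTAL
implied_sustained = sustained_from_efficiency(peak_tested, TEST_EFFICIENCY)

print(f"Estimated peak for {MACHINES_TESTED} machines: {peak_tested:.2f} teraflops")
print(f"Implied sustained speed at 88% efficiency: {implied_sustained:.2f} teraflops")
print(f"Ahead of IBM's {IBM_TFLOPS} teraflops: {implied_sustained > IBM_TFLOPS}")
```

Because the 60-teraflop number was itself an estimate, the implied sustained speed is only approximate. The point of the sketch is the relationship among the three quantities, not the exact value.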