Inside NASA's world-class supercomputer center

NASA's Pleiades supercomputer is the world's sixth-most powerful. Servicing the entire space agency, the computer has been measured at 973 teraflops.

Daniel Terdiman Former Senior Writer / News
Daniel Terdiman is a senior writer at CNET News covering Twitter, Net culture, and everything in between.
A visualization of the Ares-1's main engine plume interacting during a type-4 stage separation with the Interstage, created at the NASA advanced supercomputing facility. The facility's current top-end supercomputer, known as Pleiades, is the sixth-fastest computer on Earth, measured recently at 973 teraflops--or 973 trillion floating point operations a second. Goetz Klopfer, NASA Exploration Systems Mission Directorate

MOUNTAIN VIEW, Calif.--If you're a materials scientist at NASA's Glenn Research Center, or an engineer at the Johnson or Marshall Space Centers studying Space Shuttle flow-control valves, or any one of countless others in the agency needing a supercomputer, there's really just one place to go.

That place is the advanced supercomputing facility at the Ames Research Center here, the home of Pleiades, NASA's flagship computer, a monster of a machine that, with a current rating of 973 teraflops--or 973 trillion floating point operations per second--is today ranked the sixth-most powerful supercomputer on Earth.

The computing facility, which services about 1,500 users across NASA, according to Rupak Biswas, the agency's advanced supercomputing division chief, is somewhat of a one-stop shop for those needing the highest-end processing power NASA has to offer: the division provides not just computing power, but also a "fully integrated environment where people have access to the machine, and [where] we assist them to get the most out of the machine."

Pleiades, like most, if not all, supercomputers, is a work in progress. Debuted in late 2008 with a world No. 3 ranking and a measurement of 487 teraflops, the machine has now doubled its capacity, even as it has dropped three places in the rankings. Based on SGI's Altix ICE system, Pleiades still has room to grow, and as Biswas and his staff at the supercomputing division add more SGI racks, it will do just that.

And that's vital to NASA because the demands for the computer's use are non-stop. Across the agency, Biswas must support anyone granted time on the computer, be it for doing climate modeling, researching weather patterns caused by El Nino, understanding how galaxies merge or investigating the next generation of space vehicles.

Twenty-five-plus years of NASA supercomputing
Last week, I got a chance to visit Biswas and see Pleiades up close. Biswas explained that he and his staff work 24 hours a day, seven days a week servicing the demands for time on Pleiades and the other machines under his command, making sure to offer NASA's scientists support for application performance optimization, data analysis and visualization and networking.

Essentially, though, the supercomputers are used for a never-ending supply of modeling and simulation tasks, and Biswas said that just about every available computer cycle that the division's machines have to offer is spoken for.

While NASA has offered its people high-end computers for years, the agency changed its model in 1984, when it opened the doors of the advanced supercomputing facility and began focusing on providing "leading-edge computational capabilities based on an innovative network-centric environment," according to a 2008 brochure celebrating the 25th anniversary of the division.

Things have come a very long way since then. The division's first supercomputer, the Cray X-MP, was measured at 0.00021 teraflops--or as the brochure puts it today, less power than a single Apple Mac Mini personal computer. But over the years, supercomputers in general have outperformed Moore's Law, Biswas told me, because they are made up of the very latest equipment and processors.

In 2003, the Space Shuttle Columbia broke apart during reentry, and NASA went into full-scale investigation mode, trying to determine what caused the disaster. That became one of the primary motivations for rolling out what came to be known as the Columbia supercomputer in 2004.

But that wasn't the only driving factor, he said. There also was a touch of national competitiveness. Indeed, Biswas said, despite the fact that most Americans thought the Japanese were bluffing, computer scientists in Japan managed to build the Earth Simulator, which was the fastest supercomputer in the world. That caused, Biswas recalled, "a lot of panic in the U.S." as people thought "we are losing leadership in high-end computing."

By the time Columbia debuted in 2004, Japan had been surpassed, not just by the new NASA supercomputer--which came in at No. 2 in the world--but also by a new machine at Lawrence Livermore National Laboratory, which was the fastest on the planet. By then, the Earth Simulator had dropped off the very top of the leader board.

Three-year cycles
Supercomputers are like most other classes of machine: they have limited lifespans--in their case, about three years. After that, Biswas said, "it's not cost-effective to run them anymore."

That's in part because the industry advances so fast that after three years, new technologies have far surpassed what was once cutting-edge. Further, the industry moves quickly in figuring out new ways to package supercomputers, meaning newer models require much less power than older ones. "Supercomputers advance so rapidly," Biswas said, it "does not make sense from an economic standpoint to run old supercomputers."

Rupak Biswas, the division chief at the NASA advanced supercomputing division, standing in front of one of many racks of SGI machines that comprise Pleiades. Daniel Terdiman/CNET

During a three-year cycle, supercomputer divisions will typically look at what's new that's out there, take in proposals from the different vendors, and then procure a small test-bed, Biswas explained. After "kicking the tires" on a new machine, and seeing how it works in an existing supercomputing environment, and how well it can support those who would be using it, a division like that at NASA will make a decision on which new supercomputer to purchase.

Prior to purchasing the SGI equipment that makes up Pleiades, the NASA supercomputing division purchased an IBM p575+ supercomputer in order to evaluate it. Eventually, the division decided on the SGI approach, but the IBM equipment is still in the supercomputing facility at Ames, and today operates under the name Schirra, after the Mercury 7 astronaut Wally Schirra.

The move to exaflop supercomputers
Today, the world's fastest supercomputers are topping out at about 1 petaflop--or 1,000 trillion floating point operations per second. Biswas said there are about five such computers on Earth today, two at the Oak Ridge National Lab in Tennessee, one in China, and one in Germany. And Pleiades is just behind that. But already, he said, the next-generation thinking in the industry is envisioning machines capable of exaflop computing, which is the equivalent of 1,000 petaflops.
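To put those prefixes side by side, here is a minimal Python sketch of the unit arithmetic. The 973-teraflop figure comes from the Pleiades measurement quoted above; the constant and function names are illustrative only.

```python
# Unit prefixes for floating point operations per second (flops).
TERA = 10**12   # 1 teraflop  = one trillion operations per second
PETA = 10**15   # 1 petaflop  = 1,000 teraflops
EXA = 10**18    # 1 exaflop   = 1,000 petaflops

# Pleiades' measured speed as quoted in the article.
PLEIADES_FLOPS = 973 * TERA


def in_petaflops(flops):
    """Express a raw flops figure in petaflops."""
    return flops / PETA


print(in_petaflops(PLEIADES_FLOPS))  # 0.973, just under the 1-petaflop mark
print(EXA // PETA)                   # 1000 petaflops in one exaflop
```

By this arithmetic, Pleiades sits at roughly 97 percent of a petaflop, which is why Biswas describes it as just behind the petaflop-class machines.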

Of course, as with any new supercomputing threshold, the question isn't necessarily whether it's possible to build the hardware, but whether it's also possible to optimize applications for such a powerful system. And not only that, Biswas said, but there's also a crucial question of whether it's possible to build machines that powerful and yet have them be energy efficient.

A petaflop supercomputer draws about 7 megawatts of power, he explained. That would mean that without increased efficiencies, an exaflop machine would draw 7 gigawatts. And that's simply out of the question. "You can't expect to have a nuclear reactor sitting next to a supercomputer," he said.
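The scaling behind that figure is simple enough to sketch in a few lines of Python. The 7-megawatt number is the one Biswas cites; the linear-scaling assumption is his "without increased efficiencies" case, and the 20-megawatt budget in the second function call is a purely hypothetical value chosen for illustration, not a figure from NASA.

```python
MW_PER_PETAFLOP = 7          # power draw of a petaflop machine, per Biswas
PETAFLOPS_PER_EXAFLOP = 1000


def naive_power_mw(petaflops, mw_per_petaflop=MW_PER_PETAFLOP):
    """Power draw in megawatts, assuming no efficiency gains (linear scaling)."""
    return petaflops * mw_per_petaflop


def required_efficiency_gain(budget_mw, petaflops=PETAFLOPS_PER_EXAFLOP):
    """How many times more efficient a machine must be to fit a power budget."""
    return naive_power_mw(petaflops) / budget_mw


exaflop_mw = naive_power_mw(PETAFLOPS_PER_EXAFLOP)
print(exaflop_mw / 1000, "gigawatts")  # 7.0 gigawatts

# Hypothetical budget of 20 megawatts (an assumption, not from the article).
print(required_efficiency_gain(20))    # 350.0
```

Under those assumptions, an exaflop machine held to a 20-megawatt budget would need to do about 350 times more work per watt than the petaflop systems Biswas describes.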

The bigger question is what the right applications will be for computers that could be 1,000 or more times faster than today's top-end machines, and which could potentially arrive by 2018. There's no point in building such machines, Biswas suggested, unless they're being used at peak efficiency.

At NASA, one use case could be to try to, finally, accurately predict weather. "Today, forget about predicting weather five days from now," Biswas said. "You can't even do today."

And the reason, he explained, is that even with today's massively powerful computers, it's not possible to create models fine enough to distill weather forecasts at the kind of small resolution necessary to be correct and relevant and timely. But with a thousand-fold boost in supercomputing juice, it could theoretically happen at that resolution.

Another use of an exaflop supercomputer would be to do richer modeling of the relationship between the Earth's atmosphere and our oceans. And that relationship must be coupled with what's taking place on terra firma. And then add in what's going on at the polar ice caps, and you have, as Biswas put it, "a huge, complex calculation" that cannot be done today. "That requires exaflop" computing.

Being NASA, of course, there are also plenty of applications for rocket science. Biswas said engineers wanting a complete digital model of a launch could find ample uses for an exaflop supercomputer. Today, he said, it is possible to model the first few seconds of a Shuttle launch, but to be able to put together a full profile, from launch to stage separation to an entire mission, "that's easily a multi-exaflop" problem.

And marrying the idea of weather forecasting with launch analysis, such computers may also for the first time allow NASA to make better decisions surrounding when it's safe to launch. "Right now, NASA launches by looking at the weather forecast," Biswas said. "If it's not good, [don't] launch. Ultimately, you want to couple the weather with the launch."

There are, of course, countless ways an agency like NASA could use such paradigm-shifting computing resources, and the good thing is that we likely won't have to wait long to see first-hand what they are.