Hewlett Packard Enterprise and AMD will deliver what they expect to be the world's fastest supercomputer in 2023, a $600 million machine at Lawrence Livermore National Laboratory called El Capitan that they promise will perform at 2 exaflops, or 2 quintillion calculations per second. That's fast enough that if every person on Earth performed one such calculation per second, it would take all of them eight years to match 1 second's worth of El Capitan computing.
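A quick back-of-envelope check of that comparison, assuming a world population of roughly 7.8 billion (the approximate 2020 figure, not stated in the article):

```python
# One second of El Capitan computing: 2 exaflops = 2e18 calculations.
calculations = 2e18
population = 7.8e9     # assumed world population, circa 2020
rate_per_person = 1.0  # one calculation per person per second

seconds_needed = calculations / (population * rate_per_person)
years_needed = seconds_needed / (365 * 24 * 3600)
print(round(years_needed, 1))  # 8.1 -- about eight years, as claimed
```

The result lands at roughly 8.1 years, consistent with the article's figure.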
The current fastest machine, as measured by the Top500 ranking released by supercomputing researchers twice each year, is the Summit supercomputer at Oak Ridge National Laboratory in Tennessee. Upgrades have boosted its performance to 143 petaflops.
"We expect when it's delivered to the laboratory in 2023, it will be the fastest supercomputer in the world," Bill Goldstein, director of the Livermore lab, said Wednesday. He spoke at a press conference at HPE offices in San Jose, California, in the heart of Silicon Valley, at which HPE, AMD and LLNL announced their El Capitan ambitions along with design details.
Supercomputers are mammoth systems assembled from hundreds or thousands of computers linked with high-speed interconnects to shuttle data and coordinate operations. They occupy rooms the size of tennis courts, use thousands or millions of processors, cost millions of dollars and consume enough electricity to power a town.
But they can tackle computing challenges out of reach of lesser machines. In the case of El Capitan, that means full 3D simulations of nuclear weapons explosions that the US Department of Energy demands to ensure that its aging stockpile of thermonuclear weapons will work as advertised, not fizzle or pose unexpected safety risks. The DOE has been funding such supercomputers since the 1990s, embracing them as the US ceased real-world nuclear tests.
The nuclear weapons simulations it'll perform are extraordinarily complex, modeling matter and energy shifting through temperatures ranging from room temperature to that of the center of the sun. Simulations must accommodate detail down to billionths of a meter for devices measuring meters in length. They take steps lasting billionths of a second through an event that lasts for seconds. And the lab runs different simulations over and over.
"The range of scales is tremendous that we have to represent," Goldstein said.
Supercomputers are also in demand for health and genetics research, astrophysical modeling, aircraft and automotive design, climate change simulations and, more recently, new artificial intelligence algorithms. El Capitan will be used for some of these nonmilitary tasks, too.
"The unique architecture of El Capitan will allow us to further advance new work that we're doing to combine machine learning with the traditional modeling and simulation that has undergirded our stockpile work," Goldstein said. "The ability to combine machine learning and simulation is going to be a game changer for our ability to rapidly and accurately come up with predictions."
AI today is used for detecting patterns like fraudulent credit card transactions and interpreting complex data like medical scans or voice commands. For nuclear weapons, it can be used for tasks like spotting unusual phenomena in a simulation that merit closer attention. It could also be helpful in choosing which variations of a simulation to try, zeroing in faster on what's important, Goldstein said.
El Capitan supercomputer nuts and bolts
El Capitan will take up about two tennis courts' worth of space in a Livermore data center and weigh as much as 35 school buses. If you stacked up its system boards end to end, you'd get a tower three times taller than the real El Capitan cliff in Yosemite National Park.
It'll need 30 megawatts of power -- about the same consumption as 12,000 homes, according to federal energy consumption rates.
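Those two figures imply an average draw of about 2.5 kilowatts per home, which is in the right range for US residential electricity use; a minimal sanity check of the comparison:

```python
total_watts = 30e6  # El Capitan's stated power budget: 30 megawatts
homes = 12_000      # number of homes cited in the comparison

watts_per_home = total_watts / homes
print(watts_per_home)  # 2500.0 -- about 2.5 kW per home, on average
```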
The allies behind the machine wouldn't reveal how many processors it'll use. But it will combine next-next-gen AMD "Genoa" processors with AMD graphics chips that also are good for mathematical calculations.
AMD accelerated its product plans to meet the El Capitan deadline, said Chief Executive Lisa Su. "We've pulled in our roadmap so we could meet this requirement," she said.
Each CPU will connect to four graphics chips and to shared memory with a new, higher-speed connection technology, AMD's third-generation Infinity Architecture. Data transfer across the whole supercomputer will use a new HPE optical network that significantly shrinks the overall size of the supercomputer, said Terri Quinn, Livermore's deputy associate director for high-performance computing.
A related HPE-AMD supercomputer called Frontier is scheduled to arrive at Oak Ridge in 2021 with a speed of 1.5 exaflops. A performance level of 1 exaflops ("flops" stands for floating-point operations per second) is 1,000 times 1 petaflops, making El Capitan about 10 times faster than Summit. While El Capitan is for classified research, Livermore will get a smaller system -- still faster than the lab's fastest machine today -- for open science research.
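In flops terms (peta = 10^15, exa = 10^18), the "about 10 times" comparison works out when both machines are measured the same way, using the sustained figures cited in this article:

```python
PETA = 1e15
EXA = 1e18

summit_sustained = 143 * PETA      # Summit's Top500 figure, in flops
el_capitan_sustained = 1.5 * EXA   # the lab's sustained estimate for El Capitan

speedup = el_capitan_sustained / summit_sustained
print(round(speedup, 1))  # 10.5 -- roughly the "about 10 times" cited
```

Comparing El Capitan's 2-exaflops peak number directly against Summit's sustained figure would instead suggest about 14x, which is why the peak-versus-sustained distinction discussed below matters.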
The 2-exaflops performance level -- 2,000,000,000,000,000,000 calculations per second -- is a notable boost over today's machines. But a lot could happen in the three years it'll take to build El Capitan. The Summit machine reclaimed the top spot for the US after years during which Chinese supercomputers were fastest, and other countries are now racing with the US to be the first to cross the exaflops threshold.
Confident El Capitan will be fastest
But HPE and its partners are confident El Capitan will indeed be the fastest machine, in part because of all the new research and development it's funding to make such a mammoth machine practical.
"We believe the technology we're building, between AMD's CPUs and GPUs and the Shasta architecture we're building at HPE, is absolutely at the edge of what's possible in this time frame," said Peter Ungaro, leader of HPE's high-performance computing business. "We feel very strongly we're going to have the greatest machine on the planet."
Also, take the numbers with a grain of salt. The Top500 rankings are based on a supercomputer's sustained performance. But El Capitan's score of 2 exaflops is based on peak performance, a number that, while higher, isn't as representative of real-world performance.
Peak performance is easier to predict this far in advance of actually building El Capitan, the Livermore lab said in a statement. They expect the difference between sustained and peak performance to be similar to that of Summit, which would give El Capitan a sustained performance of about 1.5 exaflops.
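Taken at face value, the lab's own figures imply El Capitan will sustain roughly three-quarters of its peak rate:

```python
peak_exaflops = 2.0       # announced peak performance
sustained_exaflops = 1.5  # the lab's estimate of sustained performance

efficiency = sustained_exaflops / peak_exaflops
print(efficiency)  # 0.75 -- sustained performance at about 75% of peak
```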
The Top500 organizers have been working for years on a new benchmark, which they hope will more broadly represent different computing tasks.