
Nvidia's new data center GPU packs 20 times the performance of its predecessor

The graphics chip will power servers used by Amazon, Google, Microsoft and other cloud service providers. And it will be used to fight COVID-19.

Shara Tibken Former managing editor

Nvidia's new GPU technology will power supercomputers and server farms. 

Nvidia

Whenever a new GPU or CPU hits the market, it boasts faster speeds, usually two or three times those of its predecessor at most. But Nvidia on Thursday blew right past those typical gains. The company's new GPU performs 20 times better than its predecessor, giving a major boost to the cloud computing companies that will use the chips in their data centers.

Nvidia on Thursday unveiled its A100 graphics processor. It's the first GPU based on the Santa Clara, California, company's new Ampere architecture. 

"The Ampere architecture provides the greatest generational leap out of our eight generations of GPUs," Paresh Kharya, Nvidia's director of product management for data center and cloud platforms, said Wednesday in a briefing with reporters. 

"Eight" comes up again in Thursday's announcement: The DGX A100 board combines eight A100s into a super-GPU that can work as one giant processor -- or as separate GPUs for different users or tasks. It weighs 50 pounds and fits into Nvidia CEO Jensen Huang's oven as you can see in the video.

The A100 is aimed at intensive tasks like AI training, conversational AI, high-performance data analytics, genomics, scientific simulation, seismic modeling and financial forecasting. And it will be used to help explore cures and vaccines for the novel coronavirus, which has infected over 4.3 million people so far around the globe. Because of the GPU's speed, it will let researchers crunch data in days or months instead of years. 

Nvidia is one of the world's biggest graphics chip makers, and it has built a cult following among gamers. In earlier days, GPUs mainly went into computers and gaming consoles, aimed at tasks that required high-quality, responsive graphics. Today, GPUs are used by anything that needs to crunch a lot of data quickly. Because of the way GPUs efficiently process information, they're key to robots, self-driving cars, data centers powering artificial intelligence and supercomputers trying to find cures to diseases. They're also typically more efficient and require less floor space than CPUs, which traditionally have served as the brains of systems.

The A100 chip is based on a 7-nanometer design. A key part of semiconductor manufacturing is shrinking the components called transistors, extraordinarily tiny electronic switches that process data for everything from microwave oven clocks to artificial intelligence algorithms running in our phones. The smaller the transistors, the better the battery life and performance. The A100 packs 54 billion transistors, "making it the world's largest 7-nanometer chip," Nvidia said.

With the A100, not only will machines be capable of crunching a lot of data quickly, but also the servers will be more flexible.

"It's going to unify that infrastructure into something much more flexible, much more fungible and increase its utility makes it a lot easier to predict how much capacity you need," Huang said Wednesday in a briefing with reporters. 


Training AI

Nvidia's new A100 GPU is already shipping to customers around the globe. It will be used by the biggest names in cloud computing, including Alibaba, Amazon, Baidu, Google and Microsoft. The companies operate huge server farms that house the world's data. Netflix, Reddit and most other online services rely on the cloud operators to keep their sites up and running. Nvidia said that Microsoft, with its cloud, will be one of the first companies to use the A100.

"Azure will enable training of dramatically bigger AI models using Nvidia's new generation of A100 GPUs to push the state-of-the-art on language, speech, vision and multi-modality," Mikhail Parakhin, Microsoft corporate vice president, said in a press release.

Other organizations that plan to use the A100 include national laboratories, leading universities and research institutions like Indiana University; Germany's Julich Supercomputing Centre, Karlsruhe Institute of Technology, and Max Planck Computing and Data Facility; and the US Department of Energy's National Energy Research Scientific Computing Center.

The A100 will be particularly useful in training and operating AI systems. The technology is flexible, letting companies scale their servers up or down as needed, and the A100's speed can reduce the amount of time it takes to teach an artificial intelligence program. 

"Modern and complex AI training and inference workloads that require a large amount of data can benefit from state-of-the art technology like Nvidia A100 GPUs, which help reduce model training time and speed up the machine learning development process," Gary Ren, machine learning engineer at food delivery service DoorDash, said in a press release. 

Along with cloud and supercomputer organizations, many tech companies will use the A100 in servers. That includes Atos, Dell, Fujitsu, Lenovo and Supermicro.

Five 'miracles'

The new A100 pulls off five "miracles," as Nvidia's Kharya put it. First is the Ampere architecture.

"This is unquestionably the first time that we've unified the acceleration workload of the entire data center into one single platform," Huang said. "Everything from video analytics to image processing through voice to training to inference to data processing is now on one unified server."


The Nvidia A100 GPU will speed up intensive tasks like crunching data to develop a vaccine to fight COVID-19.

Nvidia

Second is Nvidia's third-generation Tensor cores, which improve high-performance computing applications. Third is a multi-instance GPU feature that lets a single A100 be partitioned into as many as seven separate GPUs to "deliver varying degrees of compute for jobs of different sizes, providing optimal utilization and maximizing return on investment," according to Nvidia.

Fourth, the A100 uses the third generation of Nvidia's NVLink, a high-speed GPU-to-GPU interconnect. In the A100, NVLink is twice as fast as before, letting multiple GPUs be connected to operate as one giant GPU. The fifth advancement in the A100 is something called structural sparsity. The efficiency technique "harnesses the inherently sparse nature of AI math to double performance," Nvidia said.
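For readers curious what structural sparsity looks like in practice, here is a minimal sketch in Python. The announcement does not spell out the pattern. The sketch assumes the "2:4" scheme described in Nvidia's Ampere documentation, in which two of every four consecutive weights are set to zero so the hardware can skip them. The function names `prune_2_of_4` and `sparsity` are hypothetical and exist only for this illustration. This is not Nvidia's implementation.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four.

    Assumption: a 2:4 pattern (keep 2 of every 4 consecutive weights).
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def sparsity(weights):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in weights if v == 0) / len(weights)


# Made-up example weights, purely illustrative.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
pruned_row = prune_2_of_4(row)
print(pruned_row)
print(sparsity(pruned_row))
```

Because exactly half of the values in every group are zero, hardware built for the pattern can skip half of the multiplications, which is where a claimed doubling of throughput would come from. In practice a network is usually fine-tuned after pruning to recover accuracy.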

Everything else

Along with the A100, Nvidia on Thursday introduced its DGX A100 system. It features eight A100 GPUs connected with Nvidia NVLink and is aimed at intensive AI computing. One DGX A100, which starts at $199,000, is capable of delivering 5 petaflops of AI performance and consolidates the power and capabilities of an entire data center into a single system. 

The first organization using the DGX A100 is the US Energy Department's Argonne National Laboratory. It plans to use the cluster's AI and computing power to better understand and fight COVID-19.

"The compute power of the new DGX A100 systems coming to Argonne will help researchers explore treatments and vaccines and study the spread of the virus, enabling scientists to do years' worth of AI-accelerated work in months or days," Rick Stevens, associate laboratory director for computing, environment and life sciences at Argonne, said in a press release. 

The company also updated its software linked to the A100 GPU. That includes Jarvis, a multimodal conversational AI system; Merlin, a deep recommender application framework; and Nvidia's high-performance computing SDK to help supercomputer makers debug and optimize their code for the A100.

Jarvis provides a complete GPU software stack and tools to make it easy for developers to build and launch real-time conversational bots that can understand terminology unique to each company and its customers. For instance, a bank's app built with Jarvis would understand what financial terms mean.

Nvidia expects Jarvis to be helpful during the pandemic, when more people are working from home and telemedicine and remote learning are becoming the norm. 

"Conversational AI is central to the future of many industries, as applications gain the ability to understand and communicate with nuance and contextual awareness," CEO Huang said in a press release. "Nvidia Jarvis can help the healthcare, financial services, education and retail industries automate their overloaded customer support with speed and accuracy."  

Companies that will use Jarvis include Voca, an AI agent for call center support; Kensho, which provides automatic speech transcriptions for finance and business; and Square, which has a virtual assistant for appointment scheduling.
