Speaker 1: Today we are announcing the next generation. The engine of the world's AI computing infrastructure makes a giant leap. Introducing NVIDIA H100. The H100 is a massive 80-billion-transistor chip using TSMC's 4N process. We designed the H100 for scale-up and scale-out infrastructures, [00:00:30] so memory bandwidth, networking, and NVLink chip-to-chip data rates are vital. H100 is the first PCI Express Gen 5 GPU and the first HBM3 GPU. A single H100 sustains 40 terabits per second of IO bandwidth. To put it in perspective, 20 H100s can sustain the equivalent of the entire world's internet traffic. [00:01:00] The Hopper architecture is a giant leap over Ampere. Let me highlight five groundbreaking inventions. First, the H100 has incredible performance and a new tensor processing format, FP8. H100 has 4 petaflops of FP8, 2 petaflops of FP16, 1 petaflops of TF32, and 60 [00:01:30] teraflops of FP64 and FP32, and is designed for air and liquid cooling.
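[Editor's note: a rough back-of-envelope check of the IO bandwidth comparison above, in Python. It assumes the quoted figure of roughly 40 terabits per second of IO per H100; the numbers are illustrative, not official specifications.]

```python
# Back-of-envelope check of the quoted IO bandwidth comparison (illustrative only).
per_gpu_tbps = 40   # assumed: ~40 terabits/second of IO per H100
num_gpus = 20       # the keynote's "20 H100s" comparison

aggregate_tbps = per_gpu_tbps * num_gpus
print(f"Aggregate IO bandwidth of {num_gpus} H100s: {aggregate_tbps} Tb/s")
# ~800 Tb/s, which is the scale the keynote compares to the world's internet traffic.
```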
Speaker 1: H100 is also the first GPU to scale in performance to 700 watts. Over the past six years, through Pascal, Volta, Ampere, and now Hopper, we developed technologies to train with FP32, then FP16, and now FP8 for AI processing. [00:02:00] Hopper H100's 4 petaflops of FP8 is an amazing six times the performance of Ampere A100's FP16, our largest generational leap ever. The transformer is unquestionably the most important deep learning model ever invented. Hopper introduces a Transformer Engine. The Hopper Transformer Engine combines a new Tensor Core and software that uses [00:02:30] FP8 and FP16 numerical formats and dynamically processes the layers of a transformer network. Transformer model training can be reduced from weeks to days. For cloud computing, multi-tenant infrastructure translates directly to revenue and cost of service. A service can partition H100 into up to seven instances.
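[Editor's note: the Transformer Engine is hardware plus NVIDIA software, but the underlying idea of dynamically picking a numerical format per layer can be sketched in plain Python. The `choose_precision` helper, the FP8 range constants, and the toy layers below are hypothetical illustrations, not NVIDIA's actual heuristics.]

```python
import numpy as np

# Dynamic range FP8 (E4M3) can cover after per-tensor scaling, roughly
# max_normal / min_subnormal = 448 / 2**-9. The exact limits are assumptions.
FP8_DYNAMIC_RANGE = 448.0 / 2.0 ** -9

def choose_precision(tensor: np.ndarray) -> str:
    """Hypothetical per-layer format selection: keep FP8 when the tensor's
    dynamic range fits after scaling, otherwise fall back to FP16."""
    nonzero = np.abs(tensor[tensor != 0])
    if nonzero.size == 0:
        return "fp8"  # all-zero tensors are representable in any format
    dynamic_range = float(nonzero.max() / nonzero.min())
    return "fp8" if dynamic_range <= FP8_DYNAMIC_RANGE else "fp16"

# Toy walk over a few "layers" of activations with different spreads.
rng = np.random.default_rng(0)
layers = {
    "attention": rng.normal(scale=1.0, size=4096),
    "mlp": rng.normal(scale=8.0, size=4096),
    "logits": np.concatenate([rng.normal(size=4096), [1e-7, 5e4]]),
}
for name, acts in layers.items():
    print(f"{name}: {choose_precision(acts)}")
```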
Speaker 1: Ampere can also do this. However, Hopper adds complete per-instance [00:03:00] isolation and per-instance IO virtualization to support multi-tenancy in the cloud. H100 can host seven cloud tenants, while A100 can only host one. Each instance is equivalent in performance to two full T4 GPUs, our most popular cloud inference GPU. Each Hopper multi-instance supports confidential computing with a trusted execution environment. Sensitive data is often encrypted [00:03:30] at rest and in transit over the network, but unprotected during use. The data can be an AI model that is the result of millions of dollars of investment, trained on years of domain knowledge or company proprietary data, and is valuable or secret. Hopper Confidential Computing, a combination of processor architecture and software, addresses this gap by protecting both data and application during use. Confidential computing today is [00:04:00] CPU-only. Hopper introduces the first GPU confidential computing. Hopper Confidential Computing protects the confidentiality and integrity of AI models and the algorithms of their owners. Software developers and services can now distribute and deploy their proprietary and valuable AI models on shared remote infrastructure, protecting their intellectual property and scaling their business models.
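[Editor's note: a quick worked version of the tenancy arithmetic above, assuming the stated figures of seven isolated instances per H100 and roughly two T4-equivalents of throughput per instance; it is illustrative, not a benchmark.]

```python
# Rough arithmetic behind the multi-tenancy claim (illustrative, not measured).
instances_per_h100 = 7            # stated: H100 partitions into seven isolated instances
t4_equivalents_per_instance = 2   # stated: each instance ~ two full T4 GPUs

tenants = instances_per_h100      # each isolated instance can serve one cloud tenant
total_t4_equivalents = instances_per_h100 * t4_equivalents_per_instance
print(f"One H100 hosts {tenants} isolated tenants,")
print(f"for roughly {total_t4_equivalents} T4-equivalents of inference throughput in total.")
```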
Speaker 1: And there's more. Hopper [00:04:30] introduces a new set of instructions called DPX, designed to accelerate dynamic programming algorithms. Many real-world algorithms grow with combinatorial or exponential complexity. Examples include the famous traveling salesperson optimization problem, Floyd-Warshall for shortest-route optimization [00:05:00] used for mapping, Smith-Waterman pattern matching for gene sequencing and protein folding, and graph optimization algorithms. Dynamic programming breaks complex problems down into simpler subproblems that are solved recursively, reducing complexity and time to polynomial scale. H100 is the newest engine of AI infrastructures. H100s are packaged with HBM3 memories using TSMC's CoWoS 2.5D packaging and integrated with voltage [00:05:30] regulation into a super chip module called SXM.
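[Editor's note: the DPX instructions themselves are not shown here, but the dynamic programming pattern they accelerate is easy to sketch. Below is a plain-Python Floyd-Warshall, one of the algorithms named above: all-pairs shortest paths solved in O(n^3) time by reusing the answers to smaller subproblems. The graph data is a made-up example.]

```python
from math import inf

def floyd_warshall(weights):
    """All-pairs shortest paths via dynamic programming (Floyd-Warshall).

    weights[i][j] is the edge cost from i to j, or inf if there is no edge.
    Each pass through k answers the subproblem "shortest i -> j path using
    only intermediate nodes 0..k", building on the previous pass.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input stays untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Tiny road-map example: 4 nodes, directed edge weights.
graph = [
    [0,   5,   inf, 10],
    [inf, 0,   3,   inf],
    [inf, inf, 0,   1],
    [inf, inf, inf, 0],
]
print(floyd_warshall(graph))  # shortest path 0 -> 3 becomes 5 + 3 + 1 = 9
```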