Nvidia introduces DGX GH200 AI supercomputer with 1 exaFLOPS performance

Training AI models requires extremely powerful hardware, and it is not only for these purposes that Nvidia has developed its new DGX GH200 supercomputer. It is built on the Nvidia GH200 Grace Hopper Superchip and the Nvidia NVLink Switch System, which lets the entire DGX GH200 appear as one big unit; with the previous generation, a maximum of 8 chips could be connected via NVLink without a performance penalty. NVLink data throughput has been increased 48-fold, and the result is a massive supercomputer for developing AI systems that offers users the same ease of programming as if it were a single GPU.
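As a quick back-of-the-envelope comparison of that scaling, here is a minimal Python sketch using only the figures quoted above:

```python
# Scale of a single NVLink domain: previous generation vs. DGX GH200.
prev_gen_gpus = 8      # max chips per NVLink domain previously (per the article)
dgx_gh200_gpus = 256   # superchips in one DGX GH200 system
nvlink_bw_factor = 48  # quoted NVLink throughput increase

print(f"GPUs per NVLink domain: {dgx_gh200_gpus // prev_gen_gpus}x more")  # 32x
print(f"NVLink data throughput: {nvlink_bw_factor}x higher")
```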

Nvidia DGX GH200

The Grace Hopper Superchip has 72 ARM Neoverse V2 cores in its CPU part (the Grace CPU), each with 64 KB of L1 instruction cache, 64 KB of L1 data cache, and 1 MB of L2 cache. The L3 cache is shared and has a capacity of 117 MB. The chip uses LPDDR5X memory with a throughput of up to 512 GB/s and a capacity of up to 480 GB. Each processor can provide four PCIe Gen5 x16 connections.
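Summing the per-core figures gives the on-chip cache totals for the whole Grace CPU (a minimal Python sanity check; only the numbers quoted above are used):

```python
cores = 72

l1_per_core_kb = 64 + 64   # 64 KB instruction + 64 KB data per core
l2_per_core_mb = 1         # per-core L2
l3_shared_mb = 117         # shared L3

total_l1_mb = cores * l1_per_core_kb / 1024   # 9 MB of L1 across the CPU
total_l2_mb = cores * l2_per_core_mb          # 72 MB of L2
print(f"L1 total: {total_l1_mb:.0f} MB, L2 total: {total_l2_mb} MB, "
      f"L3 shared: {l3_shared_mb} MB")
```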

The GPU part of the superchip (the Hopper H100 GPU) achieves 67 TFLOPS in FP32, 494 TFLOPS in TF32 Tensor Core operations, and 1979 TFLOPS in FP8 (3958 TFLOPS with sparse matrices). It is paired with up to 96 GB of HBM3 memory with a throughput of 4 TB/s, more than three times that of desktop cards such as the RTX 4090. NVLink-C2C between the GPU and CPU achieves a throughput of 900 GB/s; Nvidia says this is 7 times faster than a PCIe connection, at a fifth of the interface's power consumption. The TDP of the superchip (CPU + GPU + memory) is configurable between 450 and 1000 W. Altogether, this should speed up AI training by a factor of 9 and the processing of AI algorithms by a factor of 30 compared to the previous generation.
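The quoted ratios are easy to verify with a small Python sketch. Note that the RTX 4090's roughly 1008 GB/s of memory bandwidth and the roughly 128 GB/s combined rate of a PCIe Gen5 x16 link are outside figures taken as assumptions here, not numbers from the article:

```python
hbm3_gbps = 4000        # 4 TB/s HBM3 on the GH200's GPU part
rtx4090_gbps = 1008     # assumed: RTX 4090 GDDR6X memory bandwidth
nvlink_c2c_gbps = 900   # NVLink-C2C between Grace CPU and Hopper GPU
pcie5_x16_gbps = 128    # assumed: PCIe Gen5 x16, both directions combined

print(f"HBM3 vs RTX 4090: {hbm3_gbps / rtx4090_gbps:.1f}x")                    # ~4.0x
print(f"NVLink-C2C vs PCIe Gen5 x16: {nvlink_c2c_gbps / pcie5_x16_gbps:.1f}x") # ~7.0x
```

Both results line up with the "more than three times" and "7 times faster" claims above.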

And now back to the DGX GH200 supercomputer itself. It can connect up to 256 of these superchips, delivering a massive 1 exaFLOPS of FP8 AI performance. The system can be equipped with up to 144 TB of shared memory (480 GB from each CPU plus 96 GB from each GPU, 256 times over). The first companies and projects to get access to the DGX GH200 will be Google Cloud, Meta and Microsoft. Nvidia will also build one supercomputer for itself, called Nvidia Helios, and use it for its own development teams. It will consist of four DGX GH200 supercomputers connected via Nvidia Quantum-2 InfiniBand, for a total of 1024 Grace Hopper Superchips.
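The headline figures follow directly from the per-superchip numbers; here is a minimal Python check (1 TB is taken as 1024 GB so the memory total comes out as quoted):

```python
superchips = 256
cpu_mem_gb = 480           # LPDDR5X per Grace CPU
gpu_mem_gb = 96            # HBM3 per Hopper GPU
fp8_sparse_tflops = 3958   # per superchip, FP8 with sparsity

shared_mem_gb = superchips * (cpu_mem_gb + gpu_mem_gb)
print(f"Shared memory: {shared_mem_gb / 1024:.0f} TB")  # 144 TB

fp8_exaflops = superchips * fp8_sparse_tflops / 1e6
print(f"FP8 performance: {fp8_exaflops:.2f} exaFLOPS")  # ~1.01 exaFLOPS

print(f"Helios superchips: {4 * superchips}")  # 4 DGX GH200 systems = 1024
```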


Source: Svět hardware by www.svethardware.cz.
