AMD released the world’s first MCM-GPU for servers: the Instinct MI200

A single MI200 combines two Graphics Compute Die chips with an 8192-bit HBM2E memory interface and up to 3.2 TB/s of memory bandwidth

Milan-X processors and a first look at Zen 4 were not the only topics at the AMD Accelerated Data Center Premiere. In addition to the new processors, AMD officially unveiled what it calls “the world’s first multi-chip GPU,” the Instinct MI200.

Although AMD has only now announced the MI200 GPU and the accelerators based on it, it has already been shipping them for some time, at least for the U.S. Department of Energy’s new Frontier supercomputer. The MI200 will be sold, at least initially, as three products: the MI250X and MI250 in the Open Compute Project’s OCP Accelerator Module (OAM) form factor, and the MI210 for the PCIe bus. Contrary to expectations, the MI200 is manufactured on TSMC’s N6 process.

The full MI200 has two GPU chips, called Graphics Compute Dies (GCDs), and eight HBM2E memory stacks. Each GCD contains 112 Compute Units of the CDNA2 architecture, a 4096-bit memory controller, and eight Infinity Fabric links. With CDNA2, the Compute Units execute FP64 at full rate (1:1) for the first time, and the FP32 rate can rise to 2:1 when two identical instructions can be packed to run together (Packed Math). Each Compute Unit has four second-generation Matrix Cores, which now handle FP64 matrices and deliver two to four times the throughput of the previous generation, with the fourfold gain applying to BFloat16. The Infinity Fabric links enable a cache-coherent memory space between the processors and memory. Each GCD appears to the system as a separate GPU, and the two are interconnected by IF links.
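The 1:1 and 2:1 rates above can be turned into a rough per-clock throughput figure for the full two-GCD package. The sketch below is a back-of-the-envelope estimate only: the 64 vector lanes per Compute Unit and the counting of one fused multiply-add as two operations are assumptions that do not come from the article, and the function name is purely illustrative.

```python
# Rough peak vector throughput per clock for a full MI200 package.
# ASSUMPTIONS (not stated in the article): 64 vector lanes per Compute Unit
# and one fused multiply-add (counted as 2 FLOPs) per lane per clock.

LANES_PER_CU = 64   # assumed SIMD width of one Compute Unit
FLOPS_PER_FMA = 2   # a fused multiply-add counts as two operations


def flops_per_clock(cus_per_gcd: int, gcds: int = 2, rate: int = 1) -> int:
    """Peak vector FLOPs per clock; `rate` multiplies the base (1:1) rate."""
    return cus_per_gcd * gcds * LANES_PER_CU * FLOPS_PER_FMA * rate


# Full die: 112 Compute Units per GCD, two GCDs per package.
fp64_per_clock = flops_per_clock(112)                 # FP64 at the 1:1 rate
packed_fp32_per_clock = flops_per_clock(112, rate=2)  # packed FP32 at 2:1

print(fp64_per_clock)         # 28672
print(packed_fp32_per_clock)  # 57344
```

Multiplying either figure by the clock frequency in Hz gives a theoretical peak in FLOPS; real workloads will land below that.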

The top model, the MI250X, uses 110 Compute Units per GCD, while the MI250 uses 104 per GCD, but otherwise the two are cut from nearly the same cloth: both have a maximum clock frequency of 1700 MHz and a total 8192-bit HBM2E memory interface connecting 128 GB of 3.2 Gbps memory at 3.2 TB/s. The MI250 has also lost two IF links and support for cache coherence with the processor. Details of the PCIe-based MI210 were still incomplete at this stage. Both OAM models have a TDP of 560 watts when liquid-cooled and 500 watts when air-cooled.
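The quoted memory bandwidth can be sanity-checked from the bus width. The sketch below assumes a per-pin data rate of 3.2 Gbps and uses a hypothetical helper name; it is a simple arithmetic check, not an AMD-provided formula.

```python
def hbm_bandwidth_gb_s(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bits per transfer times per-pin rate, over 8."""
    return bus_width_bits * pin_rate_gbps / 8


PIN_RATE_GBPS = 3.2  # ASSUMPTION: per-pin data rate of the HBM2E stacks

per_gcd = hbm_bandwidth_gb_s(4096, PIN_RATE_GBPS)  # one GCD's controller
total = hbm_bandwidth_gb_s(8192, PIN_RATE_GBPS)    # both GCDs combined

print(f"per GCD: {per_gcd:.1f} GB/s")  # per GCD: 1638.4 GB/s
print(f"total:   {total:.1f} GB/s")    # total:   3276.8 GB/s
```

The total of 3276.8 GB/s corresponds to the quoted 3.2 TB/s when 1 TB is taken as 1024 GB.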

The AMD Instinct MI200 series accelerators are available in OAM form immediately. The PCIe-based MI210 will go on sale later. For those who want to dig deeper into the topic, we recommend, for example, AnandTech’s article on the subject.

