
AMD Unveils MI300X AI Chip with 192GB Memory and High Efficiency for Large Language Models

AMD recently unveiled its MI300X chip, which is poised to compete with Nvidia’s H100 in the AI space. As demand for generative AI increases, the chip’s combination of raw power and efficiency could fill an important role in the market.

AMD has long taken a back seat to Nvidia in the AI chip market. But with its new MI300X chip, the company hopes to change that trend. Announced at a recent event in San Francisco, the data center GPU packs 192 gigabytes of memory and aims to rival Nvidia’s flagship H100 chip.

The move comes amid growing interest in generative artificial intelligence (AI). As applications like OpenAI’s ChatGPT keep the technology in the public eye, demand for chips capable of running large language models (LLMs) has soared. AMD claims its latest chip will be ready to meet that demand as soon as 2024.

Powerful, Efficient Performance

AMD calls the MI300X the most complex chip it has ever built. Indeed, the GPU includes several high-impact features designed to improve its performance and efficiency on AI tasks. The chip follows AMD’s recently announced MI300A, which it dubs “the first APU Accelerator for HPC and AI workloads.”

The new GPU, with its 192 GB of memory, easily outpaces Nvidia’s H100, which tops out at 80 GB. But that isn’t the most impressive feature. The chip’s transistor count is a massive 153 billion, and its memory bandwidth is 5.2 terabytes per second.

Of the design, AMD CEO Dr. Lisa Su said in a statement, “Our use of chiplets in this product is very, very strategic. The generative AI, large language models have changed the landscape. The need for more [computing power] is growing exponentially, whether you’re talking about training or about inference.”  

The MI300X is a dedicated GPU accelerator: it drops the CPU chiplets used in the MI300A and replaces them with two additional CDNA 3 chiplets, devoting the entire package to graphics compute and memory.

To demonstrate the new chip’s computing power, AMD used it to run Falcon-40B, one of the most popular LLMs currently available. With 40 billion neural network parameters, Falcon-40B is a challenge for most chips. But the MI300X ran it entirely in on-package memory, rather than shuttling data to and from external memory, to help Su compose a poem about the Bay Area.

Notably, the 40 billion parameters the LLM requires fill only about half of the chip’s memory capacity. A single MI300X chip can run models with up to 80 billion parameters.
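The arithmetic behind those figures is straightforward. The sketch below is a rough back-of-the-envelope estimate, not taken from AMD’s presentation: it assumes model weights are stored in 16-bit precision (2 bytes per parameter) and ignores activations, KV cache, and framework overhead.

```python
# Back-of-the-envelope check: how large a 16-bit model fits in 192 GB of HBM?
# Assumptions (not from the article): FP16/BF16 weights only, no activation
# or KV-cache overhead.

BYTES_PER_PARAM = 2      # FP16/BF16 weight
HBM_CAPACITY_GB = 192    # MI300X on-package memory

def weight_footprint_gb(params_billions: float) -> float:
    """Approximate weight memory in GB for a dense model."""
    return params_billions * 1e9 * BYTES_PER_PARAM / 1e9

for params in (40, 80):
    gb = weight_footprint_gb(params)
    print(f"{params}B params -> ~{gb:.0f} GB of weights "
          f"({gb / HBM_CAPACITY_GB:.0%} of {HBM_CAPACITY_GB} GB HBM)")
```

Under those assumptions, a 40-billion-parameter model needs roughly 80 GB for its weights, a bit under half of the MI300X’s 192 GB, while an 80-billion-parameter model needs roughly 160 GB, which still fits on a single chip.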

Su says, “When you compare MI300X to the competition, [it] offers 2.4 times more memory, and 1.6 times more memory bandwidth, and with all of that additional memory capacity, we actually have an advantage for large language models because we can run larger models directly in memory.”  

This reduces the number of GPUs required to run a given LLM and can significantly improve performance, since less data has to shuttle between devices. As the industry turns its eye toward efficiency and energy usage among AI chips, keeping more of the workload in local memory gives the MI300X an edge over its competitors.
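The same weight-size arithmetic hints at why per-GPU capacity translates into fewer GPUs. The sketch below is purely illustrative: it assumes 16-bit weights, uses hypothetical 70B- and 175B-parameter models (not mentioned in the article), compares a 192 GB card against an 80 GB card, and counts only weight storage.

```python
import math

BYTES_PER_PARAM = 2  # FP16/BF16 weight (assumption)

def gpus_needed(params_billions: float, hbm_gb: int) -> int:
    """Minimum GPUs required just to hold the model weights."""
    weights_gb = params_billions * BYTES_PER_PARAM
    return math.ceil(weights_gb / hbm_gb)

for params in (70, 175):
    print(f"{params}B params: {gpus_needed(params, 192)} x 192 GB cards "
          f"vs {gpus_needed(params, 80)} x 80 GB cards (weights only)")
```

Even in this simplified view, a 192 GB card holds models on a single device that would need two or more 80 GB cards, before accounting for activations or batching.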

Su also notes that this feature reduces the total cost of ownership for LLMs, making the tech more accessible. Indeed, as generative AI grows in popularity, improved efficiency ensures developers of all sizes will be able to take advantage of the benefits and explore the technology’s potential.  

Package Deal

AMD plans to package the MI300X as part of the AMD Instinct Platform to better compete with offerings from Nvidia, such as its DGX systems. The first Instinct machine will feature eight MI300X chips and a whopping 1.5 terabytes of HBM3 memory (8 × 192 GB).

Thanks to a design that conforms to industry-standard Open Compute Project specs, it should slot into existing server infrastructure with minimal changes.

Su says, “The whole purpose is to make AI much, much more accessible. So, everybody who wants to use AI needs more GPUs, and we have a GPU that is incredibly powerful; very, very efficient; and we believe will be a significant winner in the AI market.”  

To that point, Su claims AMD will have the capacity to produce enough MI300X chips to meet demand in 2024. At this time, the company has not unveiled pricing details for its new AI chip.

Moving forward, it will be interesting to see how the MI300 family, from the GPU-only MI300X to the combined CPU/GPU MI300A, is received by companies looking to adopt generative AI. Similar chips have struggled in the past, but AMD’s latest design is promising. With demand soaring for AI applications and for efficient silicon to support them, the MI300X could fill an important gap in the current market.
