Event Coverage

AMD's Radeon Instinct is the company's new machine intelligence platform

By John Law & Zachary Chan - 13 Dec 2016


There’s no longer any question about it. Machine learning models and deep neural networks are the future of computing, and machine intelligence is reshaping the world as we know it. Self-driving cars, world champion-beating AIs such as AlphaGo, and even your photo app’s uncanny new ability to identify people, pets, and locations all on its own rely on some implementation of machine learning. The only problem is that machine learning models require huge amounts of processing power and very large data sets to train, demands that typically exceed the capabilities of any single personal device.

In the past two years, graphics processing units (GPUs) have proven incredibly well-suited to powering machine learning systems, and NVIDIA has arguably been at the forefront of this push for GPU-based machine learning with its CUDA-based cuDNN software library and this year’s launch of the Pascal architecture, particularly the Tesla P100 chip. Even Google has gotten in on the act with its own machine learning processor, the Tensor Processing Unit (TPU).

AMD was, of course, not going to sit this one out and let NVIDIA take all the glory in this bold new world of High Performance Computing (HPC). At the AMD Tech Summit 2016 just last week in Sonoma, California, the company launched Radeon Instinct, its own HPC solution comprising new Radeon Instinct GPU accelerators, a free and open-source deep learning library called MIOpen, and the Radeon Open Compute (ROCm) platform.

What AMD is trying to do, however, is advocate for open, heterogeneous computing. This means opening up its software libraries and supporting different computing architectures. To this end, AMD’s ROCm platform will not only support all three major chip architectures (x86, ARM, and Power), but will also support ports of NVIDIA CUDA applications.

AMD’s MIOpen library, on the other hand, is middleware that enables GPU acceleration of popular deep learning frameworks, such as Google’s TensorFlow, Microsoft’s CNTK, and Caffe, on Radeon Instinct hardware. It is essentially AMD’s counterpart to NVIDIA’s cuDNN (CUDA Deep Neural Network) library, which plays the same role for NVIDIA GPU acceleration.
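To give a sense of what such middleware does conceptually, here is a minimal Python sketch of backend dispatch: a framework calls a single operation, and whichever accelerated library has registered itself supplies the implementation. All names here are illustrative; this is not MIOpen’s or cuDNN’s actual API.

```python
# Hypothetical backend registry. A framework calls conv1d(), and the
# highest-priority registered backend (e.g. a MIOpen- or cuDNN-style
# library, or a plain fallback) provides the implementation.
BACKENDS = {}

def register_backend(name, priority):
    """Decorator that records a backend implementation with a priority."""
    def wrap(fn):
        BACKENDS[name] = (priority, fn)
        return fn
    return wrap

@register_backend("reference", priority=0)
def conv1d_reference(signal, kernel):
    """Naive 'valid' 1-D convolution used as the always-available fallback."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]

def conv1d(signal, kernel):
    """Framework-facing entry point: dispatch to the best backend."""
    _, fn = max(BACKENDS.values(), key=lambda pair: pair[0])
    return fn(signal, kernel)

print(conv1d([1, 2, 3, 4], [1, 1]))  # -> [3, 5, 7]
```

In the real stack, the accelerated library would register hardware-tuned kernels in place of the naive fallback, and the framework’s code would be unchanged.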

The last piece of AMD’s Radeon Instinct solution is, of course, the hardware accelerators themselves. Three Radeon Instinct accelerators were announced: the MI6, MI8, and MI25. No, the naming convention has nothing to do with any branch of the British Secret Service; it is rather practical. MI stands for Machine Intelligence, and the number corresponds to the card’s relative compute performance. So, you can deduce that the MI6 will deliver close to six teraflops, while the MI25 will be around 25 teraflops.

The Radeon Instinct MI6 is a Polaris-based accelerator with 5.7 teraflops of FP16 performance and 16GB of memory, and it will operate at 150W with passive cooling. This card is optimized to be an inference accelerator.

The Radeon Instinct MI8 is also an inference accelerator, in a small form factor package. It is a Fiji-based card with 8.2 teraflops of FP16 performance and 4GB of HBM, operating under 175W.

If it seems like there’s a huge performance jump from the MI8 to the MI25, that’s because the Radeon Instinct MI25 will be based on the yet-to-be-revealed Vega architecture. As such, beyond the performance indicator you can derive from its name, we can’t actually tell you anything else about it.

What we can tell you is that, according to AMD, the Fiji-based MI8 is already able to outperform a Pascal-based Titan X in a DeepBench GEMM benchmark. AMD also claims its MIOpen library can offer up to three times the acceleration with Radeon Instinct over the competition.
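For context, a GEMM (general matrix multiply) benchmark like DeepBench’s simply times large dense matrix multiplications and reports achieved throughput. Here is a minimal CPU-only sketch using NumPy; the function name and shapes are illustrative, not DeepBench’s actual harness, and real runs would target the GPU libraries themselves.

```python
import time
import numpy as np

def gemm_tflops(m, n, k, dtype=np.float32, repeats=5):
    """Time a dense (m, k) x (k, n) matrix multiply and report TFLOPS.

    A GEMM of these shapes performs roughly 2*m*n*k floating-point
    operations; dividing by the measured wall time gives the achieved
    throughput, which is what DeepBench-style benchmarks compare
    across accelerators.
    """
    a = np.random.rand(m, k).astype(dtype)
    b = np.random.rand(k, n).astype(dtype)
    a @ b  # warm-up run so one-time setup costs are excluded
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    elapsed = (time.perf_counter() - start) / repeats
    return (2 * m * n * k) / elapsed / 1e12

# Small illustrative run; real benchmarks sweep much larger shapes.
print(f"{gemm_tflops(1024, 1024, 1024):.4f} TFLOPS")
```

The same arithmetic explains the accelerator names above: a card’s teraflop rating is just operations per second divided by 10^12.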

Now, if you were impressed by NVIDIA’s DGX-1 supercomputer announced earlier this year, which packed 170 teraflops of compute performance from eight Tesla P100 GPU accelerators, AMD already has partners (Supermicro, Falconwitch, and Inventec, to name a few) signed up to build Radeon Instinct servers. And some of the early specs we’ve seen put them in the range of 100 teraflops to three petaflops! That’s right, a three-petaflop GPU supercomputer that will fit into a single 39U server rack.