Nvidia just published its latest MLPerf benchmark results, and they have some big implications for the future of computing. In addition to maintaining a lead over other A.I. hardware, which Nvidia has claimed for the last three batches of results, the company showcased the power of ARM-based systems in the data center, with results nearly matching traditional x86 systems.
In the six tests MLPerf includes, ARM-based systems came within a few percentage points of x86 systems, with both using Nvidia A100 A.I. graphics cards. In one of the tests, the ARM-based system actually beat the x86 one, showcasing the advancements made in deploying different instruction sets in A.I. applications.
“The latest inference results demonstrate the readiness of ARM-based systems powered by ARM-based CPUs and Nvidia GPUs for tackling a broad array of A.I. workloads in the data center,” David Lecomber, senior director of HPC at Arm, said. Nvidia tested the ARM-based systems only on the data center benchmarks, not on the edge or other MLCommons benchmarks.
MLPerf is a series of benchmarks for A.I. that are designed, contributed to, and validated by industry leaders. Although Nvidia has led the charge in many ways with MLPerf, the leadership of the MLCommons consortium is made up of executives from Intel, the Video Electronics Standards Association, and Arm, to name a few.
The latest benchmarks pertain to MLCommons’ inference tests for the data center and edge devices. A.I. inference is the stage where a trained model is put to work producing results on new data. It comes after the training phase, in which the model is still learning and for which MLCommons also has benchmarks. Nvidia’s Triton software, which deals with inference, is in use at companies like American Express for fraud detection and Pinterest for image segmentation.
Nvidia also highlighted its Multi-Instance GPU (MIG) feature when speaking with press. MIG allows the A100 and A30 graphics cards to be partitioned from a single A.I. processing unit into several separate A.I. accelerators. The A100 can split into seven separate accelerators, while the A30 can split into four.
By splitting up the GPU, Nvidia is able to run the entire MLPerf suite at the same time with only a small loss in performance. Nvidia says it measured 95% of per-accelerator performance, compared to a baseline reading, when running all of the tests at once, which lets a single GPU handle multiple A.I. workloads at the same time.
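To put the 95% figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes one reading of Nvidia's claim: that each MIG instance keeps 95% of the throughput it would reach running alone. The function name and the idea of measuring in "isolated instance" units are illustrative assumptions, not anything Nvidia published; only the instance counts and the 95% figure come from the results described above.

```python
# Maximum MIG instance counts as reported in this article.
MAX_MIG_INSTANCES = {"A100": 7, "A30": 4}


def aggregate_relative_throughput(gpu: str, retained_fraction: float) -> float:
    """Estimate combined throughput of a fully partitioned GPU.

    The result is expressed in units of "one MIG instance running alone".
    Assumption: every instance retains the same fraction of its isolated
    performance when all instances are busy at once.
    """
    if not 0.0 <= retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be between 0 and 1")
    return MAX_MIG_INSTANCES[gpu] * retained_fraction


if __name__ == "__main__":
    total = aggregate_relative_throughput("A100", 0.95)
    print(f"A100, 7 instances at 95% each: {total:.2f}x one isolated instance")
```

Under that assumption, a fully partitioned A100 would deliver roughly 6.65 times the work of one isolated instance, rather than the ideal seven. Whether the same 95% figure holds for the A30 is not stated in the results, so treat any A30 number from this sketch as hypothetical.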