News Nvidia switches to 64-core EPYC Rome for DGX A100

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
And that's quite the massive system they offer there: "the universal system for all AI workloads"


  • 8X NVIDIA A100 GPUs WITH 320 GB TOTAL GPU MEMORY
    12 NVLinks/GPU, 600 GB/s GPU-to-GPU Bi-directional Bandwidth
  • 6X NVIDIA NVSWITCHES
    4.8 TB/s Bi-directional Bandwidth, 2X More than Previous Generation NVSwitch
  • 9X MELLANOX CONNECTX-6 200 Gb/s NETWORK INTERFACES
    450 GB/s Peak Bi-directional Bandwidth
  • DUAL 64-CORE AMD CPUs AND 1 TB SYSTEM MEMORY
    3.2X More Cores to Power the Most Intensive AI Jobs
  • 15 TB GEN4 NVMe SSD
    25 GB/s Peak Bandwidth, 2X Faster than Gen3 NVMe SSDs
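
For anyone wondering how those headline numbers hang together, here's a quick back-of-the-envelope check in Python. The 50 GB/s bidirectional per third-gen NVLink and the dual 20-core DGX-1 as the "more cores" baseline are my assumptions, not something stated on the spec sheet:

    # Sanity check of how the quoted DGX A100 figures relate to each other.
    # Assumed: 50 GB/s bidirectional per third-gen NVLink, dual 20-core DGX-1 baseline.

    nvlinks_per_gpu = 12
    nvlink_bw_gbs = 50                      # GB/s bidirectional per link (assumed)
    gpu_to_gpu = nvlinks_per_gpu * nvlink_bw_gbs
    print(gpu_to_gpu)                       # 600 GB/s, the quoted GPU-to-GPU figure

    gpus = 8
    print(gpus * gpu_to_gpu / 1000)         # 4.8 TB/s across the NVSwitch fabric

    nics = 9
    nic_gbs = 200 / 8                       # 200 Gb/s ConnectX-6 -> 25 GB/s per direction
    print(nics * nic_gbs * 2)               # 450 GB/s peak bidirectional

    new_cores = 2 * 64                      # dual 64-core EPYC Rome
    old_cores = 2 * 20                      # dual 20-core Xeons in DGX-1 (assumed baseline)
    print(new_cores / old_cores)            # 3.2x "more cores"
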

[Image: DGX A100 data center tour]


Related yellow press reporting:
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
Not surprising as only AMD offers PCIe gen 4 connectivity as of now.
IBM also has PCIe Gen 4, and IBM and NV partnered for Summit. But they went with EPYC Rome.
But it's fairly hard to acquire a few POWER9 servers :nomouth:
IBM POWER9: Enhanced core and chip architecture for next-gen workloads. Built from the ground up for data-intensive workloads, POWER9 is the only processor with state-of-the-art I/O subsystem technology, including next-generation NVIDIA NVLink, PCIe Gen4 and OpenCAPI.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
If I'm not mistaken they are more commonly known as the Lenovo enterprise division now...
You are mistaken. :tonguewink:
Lenovo DC is a different company.
IBM still maintains its mainframe/server/supercomputer business.
 

piokos

Senior member
Nov 2, 2018
554
206
86
In a way that's the surprise actually. There are many players in the industry that seem to prefer making illogical compromises instead of going for the least surprise. Nvidia clearly isn't one of them.
If you mean people who prefer Intel, there are quite a few reasons why that both makes sense and is cheaper.

But not in the case of something like a DGX, which you buy for the GPUs and for AI tasks (not general computing, running databases, or ERP systems). The CPU is almost transparent to the end user.
Nvidia needed PCIe 4.0, so Intel wasn't an option at all, and AMD was almost surely much cheaper than IBM.
This may well be the last DGX with x86. Nvidia is not hiding the fact that it wants to build these with in-house ARM chips (again: no PCIe 4.0 for now).
 

piokos

Senior member
Nov 2, 2018
554
206
86
There's been a rumor they are working on their own chip that uses RISC-V.
That's for an embedded chip in the GPU ("Falcon").

I meant the CPU that will run future deep learning servers. ARM is the natural choice - they're already using it in smaller products (Jetson, DRIVE), so all the necessary software already exists. They just need a really big one - at least Graviton2-like.
Also, I haven't seen any signs that ARM plans to support PCIe 4.0 at all. They may jump straight to 5.0.