Question Nvidia to enter the server CPU market


Thala

Golden Member
Nov 12, 2014
It's kind of sad that it gets beaten by 33% by a CPU that will have been out for over a year by the time it launches. By the time Genoa comes out (close to Grace's launch), I'm sure it will get beaten by 100%.

Integer performance is not even its main selling point...
In addition, I believe the more relevant metric is perf/W.

PS: From what I understand, the Grace SPECint rating is just for a single socket, while this is compared to a dual-socket EPYC.
 

Hitman928

Diamond Member
Apr 15, 2012
Integer performance is not even its main selling point...
In addition, I believe the more relevant metric is perf/W.

PS: From what I understand, the Grace SPECint rating is just for a single socket, while this is compared to a dual-socket EPYC.

Grace number is from dual socket, each CPU is 72 cores.

I agree though, Nvidia doesn't care what its general compute capabilities are, that's not their purpose in this situation.
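For what it's worth, the socket math is easy to sanity-check. A quick Python sketch, with an assumed placeholder for the Grace Superchip score (Nvidia's exact published number may differ); the ~900 figure for a tuned 2x EPYC 7763 run comes up elsewhere in the thread:

```python
# Per-socket / per-core normalization of SPECrate-style scores.
# The Grace Superchip score below is an ASSUMED placeholder, not an
# official figure; the ~900 2P EPYC 7763 score is the tuned
# submission discussed in this thread.
grace_score = 740          # hypothetical dual-die Grace Superchip result
epyc_score = 900           # ~900, tuned 2x EPYC 7763 submission

grace_cores = 2 * 72       # two 72-core Grace dies per Superchip
epyc_cores = 2 * 64        # two 64-core EPYC 7763 sockets

print(grace_score / 2, epyc_score / 2)            # per-socket throughput
print(round(grace_score / grace_cores, 2),        # per-core throughput
      round(epyc_score / epyc_cores, 2))
```

Whatever the absolute numbers turn out to be, normalizing per socket or per core is the only way a 144-core module vs. 128-core 2P comparison says anything useful.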
 

videogames101

Diamond Member
Aug 24, 2005
Thanks for the heads up. I thought that AT's own internal estimates would be good enough; looks like I was wrong.

I don't think this is quite right. Official SPEC runs can use (almost) any compiler and compiler settings they please. The resulting numbers are about as far from apples-to-apples as it gets. As I understand it the AT numbers use comparable compilers and compiler settings, and are much better for platform comparisons. Here's the published config file from one of those official 2xEPYC 7763 ~900 scores. The compiler tuning is bordering on absurd:

 

Saylick

Diamond Member
Sep 10, 2012
Can someone do a die size estimate of Grace based on what we know about Hopper (814 mm²) and the size of the memory modules? I was looking at the render of Grace and I counted 7 rows of 12 cores, which implies there are 84 cores per die, so 12 of them have to be disabled for yield reasons? Either way, Grace looks almost the size of Hopper itself, meaning Grace is probably in the ~600 mm² range?
 

tomatosummit

Member
Mar 21, 2019
Can someone do a die size estimate of Grace based on what we know about Hopper (814 mm²) and the size of the memory modules? I was looking at the render of Grace and I counted 7 rows of 12 cores, which implies there are 84 cores per die, so 12 of them have to be disabled for yield reasons? Either way, Grace looks almost the size of Hopper itself, meaning Grace is probably in the ~600 mm² range?
A very dirty estimate using the blurry image from STH: it looks like Grace is around 80% of the size of Hopper, i.e. ~650-675 mm². Not too far from the 698 mm² Skylake-XCC.

Looking at it closer, it might be another SPR situation. Some of those cores/modules near the memory interfaces might be memory controllers, which would take 8 out of your calculation, leaving a more reasonable 4 spare cores for yield. My only basis for that is that they look a bit darker.
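The arithmetic behind that estimate, sketched in Python. The 80% scaling is eyeballed from a blurry render, so treat every number as rough:

```python
# Rough die-area arithmetic from the numbers in this thread.
hopper_area = 814.0        # mm^2, known Hopper die size
scale = 0.80               # Grace looks ~80% of Hopper in the (blurry) image

grace_est = hopper_area * scale       # lands inside the quoted 650-675 range
print(round(grace_est))               # -> 651

# Core-count bookkeeping: 7 rows x 12 "core-looking" tiles in the render.
tiles = 7 * 12                        # 84 tiles
active = 72                           # cores actually enabled per die
spares_if_all_cores = tiles - active          # 12 disabled for yield
# If 8 of those tiles are really memory controllers (the SPR-style guess),
# only 4 tiles remain as yield spares:
spares_if_8_are_mcs = tiles - 8 - active      # 4
```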
 

jamescox

Senior member
Nov 11, 2009
A very dirty estimate using the blurry image from STH: it looks like Grace is around 80% of the size of Hopper, i.e. ~650-675 mm². Not too far from the 698 mm² Skylake-XCC.

Looking at it closer, it might be another SPR situation. Some of those cores/modules near the memory interfaces might be memory controllers, which would take 8 out of your calculation, leaving a more reasonable 4 spare cores for yield. My only basis for that is that they look a bit darker.
The image is almost certainly just a rendering; the STH article mentions that. I don't know if the process it will be made on is even finalized yet. They might have made the image based on the current state of the design work, but it could also be an essentially fake graphic. You can't base much on it.

Given the current data, it will not be usable for a lot of applications. Some applications do a significant amount of work on the CPU before sending data to the GPU. This CPU will likely be too slow for many of them. There is no point in buying this if the expensive GPU portion would be underutilized due to insufficient CPU power. Not everything can easily be run on the GPU.

I suspect AMD and Intel will both have similar solutions, with much more powerful CPUs and likely GPUs of similar capability. They won't have the CUDA lock-in, though. A lot of applications don't actually need that much bandwidth to system memory, so many will be able to get by with PCI Express levels of bandwidth; it depends on the algorithm and the amount of processing required. For many things, the copy to GPU memory can be overlapped with the compute to make good utilization of the GPU. They can also use multiple smaller GPUs, each with its own separate link.
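That copy/compute overlap can be illustrated with a toy timing model in Python: split the transfer into chunks so the copy of chunk i+1 runs while chunk i is being computed, the way a CUDA-stream pipeline would. The millisecond figures are made up purely for illustration:

```python
def serial_time(copy_t, compute_t):
    # Copy everything over the link first, then compute.
    return copy_t + compute_t

def pipelined_time(copy_t, compute_t, chunks):
    # Split into equal chunks; the copy of chunk i+1 overlaps the
    # compute of chunk i, so only the first copy is fully exposed
    # and thereafter the slower of the two stages sets the pace.
    c = copy_t / chunks
    k = compute_t / chunks
    return c + (chunks - 1) * max(c, k) + k

# Toy numbers: 80 ms total transfer, 100 ms total compute, 8 chunks.
print(serial_time(80, 100))           # -> 180
print(pipelined_time(80, 100, 8))     # -> 110.0
```

With enough chunks the transfer cost almost disappears behind the compute, which is why "PCIe levels of bandwidth" are fine for algorithms with a reasonable compute-to-data ratio.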
 

nicalandia

Diamond Member
Jan 10, 2019
This is a good take on it, instead of Grace being a direct competitor to EPYC or Intel Sapphire Rapids:


Nvidia Grace is Not a Server Platform
 

moinmoin

Diamond Member
Jun 1, 2017
This is a good take on it, instead of Grace being a direct competitor to EPYC or Intel Sapphire Rapids:


Nvidia Grace is Not a Server Platform
The article says something different from its title. It claims Nvidia's approach is more akin to a DPU, but it ends by framing Grace as essentially a preview of where the whole industry is moving with CXL.

Also, the article is nearly a year old already ("if the acquisition of Arm goes through" heh).
 

nicalandia

Diamond Member
Jan 10, 2019
The article says something different from its title. It claims Nvidia's approach is more akin to a DPU, but it ends by framing Grace as essentially a preview of where the whole industry is moving with CXL.

Also, the article is nearly a year old already ("if the acquisition of Arm goes through" heh).
It is old, but still effective at laying out Nvidia's purpose for their Grace CPU/DPU, which only needs to be powerful enough to complement their GPU/accelerator/AI processing power.
 

Saylick

Diamond Member
Sep 10, 2012
Hardwareluxx got some Nvidia Grace previews:

Edit: ServeTheHome has a bit more commentary, along with a few extra slides not shown in the HWL article:
 

SarahKerrigan

Senior member
Oct 12, 2014
So... Neoverse N2, y'all think? Still no Poseidon announcement and Grace is seemingly about to sample.

Also, I'm hesitant to call the Grace Superchip "dual socket" - it's two dice on a single module, versus Epyc, which is nine dice on a single module - but it's basically splitting hairs regardless. Grace looks like a very good product for Nvidia's needs.
 

itsmydamnation

Platinum Member
Feb 6, 2011

View attachment 85172

AMD may have to go ARM or RISC-V to compete on power efficiency.
No, they don't.

AMD has chosen a different path with different trade-offs.
For the market that really requires throughput plus power efficiency, Nvidia didn't include the relevant AMD product in their comparison.
Furthermore, that market is so nonexistent that they didn't even decide to productize the CPU-only configuration...
 
Jul 27, 2020
Furthermore, that market is so nonexistent that they didn't even decide to productize the CPU-only configuration...
Isn't the bigger reason that they are not a CPU company, and selling just CPUs wouldn't be as profitable for them? The CPU exists just so they don't have to put their GPUs in Intel/AMD or other ARM vendors' servers.
 

itsmydamnation

Platinum Member
Feb 6, 2011
Isn't the bigger reason that they are not a CPU company, and selling just CPUs wouldn't be as profitable for them? The CPU exists just so they don't have to put their GPUs in Intel/AMD or other ARM vendors' servers.
Sorry, I don't understand how that relates to my original point; it's entirely for that reason that AMD chose chiplets despite the power penalty.

My further point was that they also have a product that doesn't have that penalty and comes with gobs of HBM and cache ...
 

A///

Diamond Member
Feb 24, 2017
This notion of ARM server processors being more energy efficient than x86 is largely getting out of hand now. They can use as much power as the traditional stuff to do similar or more work.
 

Schmide

Diamond Member
Mar 7, 2002

moinmoin

Diamond Member
Jun 1, 2017

View attachment 85172

AMD may have to go ARM or RISC-V to compete on power efficiency.
Is that you doing sarcasm? The graph (apparently using some odd normalization and rounding) only talks about performance and throughput; both say nothing about power efficiency at all, and the latter is furthermore more related to uncore design.
 