Question Nvidia to enter the server CPU market

NTMBK · Apr 12, 2021

NVIDIA Unveils Grace: A High-Performance Arm Server CPU For Use In Big AI Systems

www.anandtech.com

Based on Neoverse cores, not a Denver derivative.

This makes sense with the ARM acquisition attempt. They want to fully control the server stack.

ThatBuzzkiller · Apr 22, 2021

jpiniero said:
You're missing the point of this system. The question really is does the CPU need to be that powerful in a system designed around the GPUs. We also don't know how many cores it has, the power consumption, etc.

Even in compute dense scenarios like we see in supercomputers, well over half of them (under category "accelerators/co-processors") don't have any accelerators or co-processors to speak of so yes high performance CPUs are still very relevant ...

Heterogeneous compute is about complimenting the CPU and was never about complimenting the GPU or any other accelerators for that matter. AMD learned this the hard way after an unsuccessful attempt with their Fusion project. Subpar CPU performance weren't going to cut it with most customers ... (what we're seeing from Nvidia is that they're clearly scrambling to coble something up together after IBM dropped NVLink from POWER10 and onwards)

The US government must've been smoking something pretty good if they didn't actually share the same vision like AMD/Intel (CPU performance leaders) in regards to heterogeneous compute ... (GPU compute was a trend that started last decade and heterogeneous compute is going to be the new hot trend in this decade)

jpiniero · Apr 22, 2021

ThatBuzzkiller said:
Even in compute dense scenarios like we see in supercomputers, well over half of them (under category "accelerators/co-processors") don't have any accelerators or co-processors to speak of so yes high performance CPUs are still very relevant ...

That's almost entirely due to GPU memory capcity issues, which is exactly the problem that Grace is trying to fix.

ThatBuzzkiller · Apr 22, 2021

jpiniero said:
That's almost entirely due to GPU memory capcity issues, which is exactly the problem that Grace is trying to fix.

If that were truly the case then why didn't AMD APUs take off in the high-end compute segment ? An integrated CPU/GPU solution would've elegantly solved the asymmetric memory system differences between the CPU and the GPU ...

jpiniero · Apr 22, 2021

ThatBuzzkiller said:
If that were truly the case then why didn't AMD APUs take off in the high-end compute segment ? An integrated CPU/GPU solution would've elegantly solved the asymmetric memory system differences between the CPU and the GPU ...

AMD hasn't made an APU aimed at HPC yet.

ThatBuzzkiller · Apr 22, 2021

jpiniero said:
AMD hasn't made an APU aimed at HPC yet.

Because there was obviously no demand for such a system and it would've largely failed ... (the only saving grace that came out of APUs were the console contracts)

If Intel, out of all vendors who don't take GPU compute seriously kept half-assing it or winging it the whole time has been the dominant vendor in high-end compute and AMD starts seeing some success because they gave their own CPUs the spotlight then it only validates Intel's strategy in the end ...

Ultimately, what this tells us is that following AMD's original strategy has little to no basis for a successful outcome so how is Nvidia going to change this ?

beginner99 · Apr 22, 2021

ThatBuzzkiller said:
If that were truly the case then why didn't AMD APUs take off in the high-end compute segment ?

Because they didn't make one for that segment and because it's too niche to be profitable? In such systems dedicated will always win as power is much less if a concern. The APUs failed because single-threaded CPU perfromance is actually most relevant metric on consumer client devices. Apple got that right. In HPC with high parallel workloads? MT is more important and dedicated accelerators. But yeah bulldozer was just bad in all metrics.

EDIT:

ThatBuzzkiller said:
Ultimately, what this tells us is that following AMD's original strategy has little to no basis for a successful outcome so how is Nvidia going to change this ?

Not say it will but NV does have a near monopoly in the AI/ML space with CUDA. So moving all your software away from that is probably more painful than a bit slower CPU, in ST at least.

NTMBK · Apr 22, 2021

ThatBuzzkiller said:
If that were truly the case then why didn't AMD APUs take off in the high-end compute segment ? An integrated CPU/GPU solution would've elegantly solved the asymmetric memory system differences between the CPU and the GPU ...

The memory subsystem on the APUs was always pitiful, and AMD were miles behind on the tools for compute. They had half-baked OpenCL going up against CUDA.

ThatBuzzkiller · Apr 22, 2021

beginner99 said:
Not say it will but NV does have a near monopoly in the AI/ML space with CUDA. So moving all your software away from that is probably more painful than a bit slower CPU, in ST at least.

Just because Nvidia has leadership in GPU compute doesn't mean that they'll be guaranteed leadership in heterogenous compute too ...

Heck, Intel is arguably closer to competing in heterogeneous compute than Nvidia are and guess what ? If most of the supercomputers reflect the state of the high-end compute market then Nvidia's leadership in GPU compute means nothing to most customers! Many of them probably had little to no CUDA specific code so the software stack deficit isn't nearly as bad as you think it is ...

"A bit slower CPU" is putting it nicely when the AMD equivalent in the future is going to be well over >50% faster per socket compared to Grace. It's way harder for Nvidia to solve their CPU design deficit compared to AMD/Intel solving their software deficit since they have optimized x86 code going for them at least while Nvidia has never had a good CPU design team or CPU performance leadership in the past ... (and they still aren't going to be anywhere close to either AMD or Intel in the future too)

jpiniero · Apr 22, 2021

ThatBuzzkiller said:
"A bit slower CPU" is putting it nicely when the AMD equivalent in the future is going to be well over >50% faster per socket compared to Grace. It's way harder for Nvidia to solve their CPU design deficit compared to AMD/Intel solving their software deficit since they have optimized x86 code going for them at least while Nvidia has never had a good CPU design team or CPU performance leadership in the past ... (and they still aren't going to be anywhere close to either AMD or Intel in the future too)

That's why they are buying ARM. nVidia is going to put them to work in making a high end server CPU. Grace isn't that and doesn't need to be.

ThatBuzzkiller · Apr 22, 2021

NTMBK said:
The memory subsystem on the APUs was always pitiful, and AMD were miles behind on the tools for compute. They had half-baked OpenCL going up against CUDA.

Performance wise the memory system in the APUs wasn't good but from a programming perspective it was ideal since there was no GPU specific device memory so memory system performance characteristics were identical on both the CPU/GPU ...

If you really wanted to bypass OpenCL's limitations then kernels could've been written in GPU specific assembly ...

I fail to see how the GPU memory capacity issue can be solved without potentially constraining GPU performance in some way because high-end GPUs will always be stuck with their unique pool of HBM memory ...

A compelling heterogenous compute platform won't ever start with the GPU if we take AMD's fruitless experiments as an example so whatever approach Nvidia takes will doom them if they don't have competent CPUs ... (their CUDA stack might even have to take a back seat if they want to start porting many existing x86 code into ARM code since emulation will never be good enough in this segment)

DrMrLordX · Apr 22, 2021

ThatBuzzkiller said:
A compelling heterogenous compute platform won't ever start with the GPU if we take AMD's fruitless experiments as an example so whatever approach Nvidia takes will doom them if they don't have competent CPUs ...

As has been mentioned elsewhere in this thread, I'm pretty sure CXL 2.0 allows attached devices (such as dGPUs) to maintain coherency with system memory while mostly bypassing the CPU. The only element of the CPU that might be involved would be the memory controller. That would fit in nicely with nVidia's existing UVM model on NVLink-equipped systems.

xpea · Apr 22, 2021

ThatBuzzkiller said:
if Nvidia's future DGX systems needs twice the number of CPU sockets to be competitive against the x86 alternatives then maybe Grace CPUs are mediocre and they have a failure on their hand ... (the most common systems will always consist of either 1P or 2P and rarely does 4P/8P ever see deployment)

Even Frontier, the newest supercomputer uses single CPU socket server nodes and the delayed Aurora supercomputer would've used dual CPU socket server nodes ...

Every vendor is free to balance the ratio CPU/GPU. They are no magic number. And NV is a disruptor. We will see which system will be mediocre when Grace will be available.

One thing is sure, in future DGX, Nvidia won't have to pay $7000 for an EPYC CPU. This money buys a lot of silicon...

Regarding the supercomputers, everybody knows that they are a lot of politics involved, it's not only based on technical merit.

moinmoin · Apr 22, 2021

I find it odd to read talks about how "fruitless" AMD's HSA efforts supposedly have been when we are currently watching live all the results of DARPA's Fast Forward initiatives with the whole Zen and R/CDNA families which are furthermore about to accumulate in the Frontier and El Capitan exascale supercomputers, both which will feature tightly coupled CPU/GPU nodes.

xpea · Apr 22, 2021

moinmoin said:
A system which replaces CXL capability with a lot of mediocre CPUs, gotcha.

Funny, mediocre is exactly what I think of CXL. Too little too late, like every standard that is ratified on lowest common ground. CXL bandwidth is totally unable to handle 8 Ampere Next without huge loss of performance. It's really mediocre compared to Grace DGX...

xpea · Apr 22, 2021

ThatBuzzkiller said:
If that were truly the case then why didn't AMD APUs take off in the high-end compute segment ? An integrated CPU/GPU solution would've elegantly solved the asymmetric memory system differences between the CPU and the GPU ...

Integrated CPU/GPU like in the most powerful computer in the world? Say hello to Fujitsu A64FX and their "mediocre" ARM CPU

PS: don't be foul by namming. Vector extensions do the same job as GPGPU...

moinmoin · Apr 22, 2021

xpea said:
Funny, mediocre is exactly what I think of CXL. Too little too late, like every standard that is ratified on lowest common ground. CXL bandwidth is totally unable to handle 8 Ampere Next without huge loss of performance. It's really mediocre compared to Grace DGX...

Open standards have to rely on existing infrastructure which in case of CXL is PCIe v5 which isn't on the market yet. Manufacturers are obviously free to push ahead with proprietary solutions, which (as well as the marketing around it) is exactly what Nvidia is very good at, as well as the resulting ability to use that as vendor lock-in.

ThatBuzzkiller · Apr 22, 2021

moinmoin said:
I find it odd to read talks about how "fruitless" AMD's HSA efforts supposedly have been when we are currently watching live all the results of DARPA's Fast Forward initiatives with the whole Zen and R/CDNA families which are furthermore about to accumulate in the Frontier and El Capitan exascale supercomputers, both which will feature tightly coupled CPU/GPU nodes.

I think you might want to read my posts again if you got that impression because I didn't say that 'HSA' was fruitless but their concept of 'Fusion' certainly is ... (only redeeming quality behind Fusion were the console contracts like I stated before)

And while looking back in the past, HSA was a failure too since no one else in the HSA Foundation adopted the standard aside from AMD so it was no where close to meeting it's intended goal. AMD took a radically different approach to heterogenous compute by putting GPUs on the back burner, participating less in standards like OpenCL and introducing ROCm to keep low maintenance. Ever since HSA was mostly dead, AMD changed a lot compared to their original vision and arguably now has more in common with Intel's strategy ... (less focus on GPU compute/more focus on the CPU)

moinmoin · Apr 22, 2021

ThatBuzzkiller said:
I think you might want to read my posts again if you got that impression because I didn't say that 'HSA' was fruitless but their concept of 'Fusion' certainly is ... (only redeeming quality behind Fusion were the console contracts like I stated before)

And while looking back in the past, HSA was a failure too since no one else in the HSA Foundation adopted the standard aside from AMD so it was no where close to meeting it's intended goal. AMD took a radically different approach to heterogenous compute by putting GPUs on the back burner, participating less in standards like OpenCL and introducing ROCm to keep low maintenance. Ever since HSA was mostly dead, AMD changed a lot compared to their original vision and arguably now has more in common with Intel's strategy ... (less focus on GPU compute/more focus on the CPU)

Fusion was little more than a consumer brand for the HSA strategy at best, necessary since at the time the iGPUs were competitive whereas the CPUs weren't.

I disagree that AMD "took a radically different approach". What it did was out of financial and organisational necessity. The initial focus on CPU was natural since a competitive CPU offered the biggest TAM to get back into relevancy. ROCm was natural since AMD couldn't keep up with the competitors' software prowess so going all out open source while jump-starting their efforts promised the best possible outcome. And GPUs I don't think were that much on the backburner than everybody seems to think; tech was bound to be shared between CPUs and GPUs, but the former had to reach a certain level first to be usable for the latter (like for MCM in GPUs). And Fast Forward started back in late 2012 at which point Papermaster was there for a year and people like Su and Keller were hired during the year. The result of the planning back then is what we see now, and HSA most certainly was part of it.

ThatBuzzkiller · Apr 22, 2021

moinmoin said:
Fusion was little more than a consumer brand for the HSA strategy at best, necessary since at the time the iGPUs were competitive whereas the CPUs weren't.

I disagree that AMD "took a radically different approach". What it did was out of financial and organisational necessity. The initial focus on CPU was natural since a competitive CPU offered the biggest TAM to get back into relevancy. ROCm was natural since AMD couldn't keep up with the competitors' software prowess so going all out open source while jump-starting their efforts promised the best possible outcome. And GPUs I don't think were that much on the backburner than everybody seems to think; tech was bound to be shared between CPUs and GPUs, but the former had to reach a certain level first to be usable for the latter (like for MCM in GPUs). And Fast Forward started back in late 2012 at which point Papermaster was there for a year and people like Su and Keller were hired during the year. The result of the planning back then is what we see now, and HSA most certainly was part of it.

I'm pretty sure their Fusion project existed before HSA ever did but that doesn't matter anymore since both of it's biggest components (HSAIL & HSA Foundation) don't have a legacy to stand on left so I don't see how HSA can be seen as a 'success' in any sense of the word. At most AMD tried to salvaged their work on HSA and used it as a basis for their ROCm kernel driver ... (it's the other way around since Fusion 'started' HSA)

Also, ROCm might be open source but that doesn't mean it's a "community project" when it's largely developed behind closed doors at AMD with no outside contributors. ROCm doesn't see the main benefit of open source which is receiving patches from the community so it's not different from most open source projects which are either dead or aren't community developed. AMD GPUs pretty much had very little going on in terms of activity and even now I think getting AVX-512 on CPUs at AMD is a higher priority because it's much more important to be at parity with Intel. If the CPU had to reach a certain level for the GPUs to be viable then this only endorses Intel's strategy. AMD's strategy in regards to their GPUs have definitely changed and their letting their CPU performance leadership go into their heads. AMD's new approach isn't what I'd describe to be a positive influence to their GPU division either since tons of their old time employees over their jumped ship to Nvidia because AMD isn't doing a good job of fulfilling their ego ... (all of the interesting stuff behind GPUs happens over at Nvidia so I can see their motivation for working over there)

AMD just like Intel doesn't have the "GPU guru culture" going for them anymore so a lot of employees working at their GPU division eventually end up at Nvidia. AMD regained their CPU guru culture over the course of their Zen project but it came at the cost of letting their GPU guru culture fall apart entirely and that's okay because it'll mean that AMD will be able to set industry standards more easily in high-end compute than they did before. As long as AMD and Intel have nearly all of the CPU gurus in the world, no ARM vendor including Nvidia can come close to even replacing them since they can make all the rules they want in their favour like standardizing AVX-512 at their expense. No amount of advantage in GPU compute is going to help Nvidia one bit because if AMD and Intel really wanted to they could collude together to block Nvidia GPUs to kill off their entire business ... (we can all see that AMD doesn't try nearly as hard with GPUs when they're far more powerful being CPU performance leaders)

DrMrLordX · Apr 23, 2021

ThatBuzzkiller said:
I'm pretty sure their Fusion project existed before HSA

Fusion was/is HSA. It's what AMD called their integrated GPGPU initiative in early press releases. They later rebranded it to HSA when they dialed it back a little. Fusion originally was meant to do a lot of what CCIX allows them to do now. I remember being quite enthusiastic about AMD using Hypertransport to do all kinds of interesting things, like make GPUs equal partners on the HT bus and allow them to access main memory the same way the CPU could. There was even speculation that HTX (rather than PCIe) video cards would be possible. HTX slots did exist, and they supported cards like Infiniband adapters, but that was about it.

Also, ROCm might be open source but that doesn't mean it's a "community project" when it's largely developed behind closed doors at AMD with no outside contributors.

Just because AMD open sources something doesn't mean people have to/want to work on it. The option is there, which is the main thing.

AMD just like Intel doesn't have the "GPU guru culture" going for them anymore so a lot of employees working at their GPU division eventually end up at Nvidia.

. . . what?

AMD regained their CPU guru culture over the course of their Zen project but it came at the cost of letting their GPU guru culture fall apart entirely

I could argue against this, but I'd be going pretty far off-topic to dismantle this statement. Needless to say, you should probably rethink your position.

As long as AMD and Intel have nearly all of the CPU gurus in the world

lolwut

no ARM vendor including Nvidia can come close to even replacing them since they can make all the rules they want in their favour like standardizing AVX-512 at their expense.

Unless NV does something that just ruins the ARM ecosystem for the HPC world, I would expect there to be more interest in SVE2 than AVX-512. When it comes to ML things get more complicated, but:

AMD relies almost exclusively on their GPU products for AI/ML
Intel is throwing spaghetti against the wall hoping for traction in the AI/ML market, including but not limited to: various AVX extensions (bfloat16 etc), two different AI company buyouts, FPGAs, and Xe.

In the end, Intel and AMD's best AI/ML products may well wind up being dGPUs where NV already has them beat (for now).

No amount of advantage in GPU compute is going to help Nvidia one bit because if AMD and Intel really wanted to they could collude together to block Nvidia GPUs to kill off their entire business

I hope you understand that's why NV is buying ARM Ltd.

ThatBuzzkiller · Apr 23, 2021

DrMrLordX said:
Fusion was/is HSA. It's what AMD called their integrated GPGPU initiative in early press releases. They later rebranded it to HSA when they dialed it back a little.

HSA is a continuation of Fusion but the latter was definitely realized before the former was since Fusion started with pre-GCN APUs ...

DrMrLordX said:
Just because AMD open sources something doesn't mean people have to/want to work on it. The option is there, which is the main thing.

What differentiates a vibrant open source project from other open source projects are the community contributions and currently ROCm does not get this benefit. As for there being an option, while that is true I highly doubt many people outside of AMD would want to touch such a poorly documented complex code base that's been developed many years behind closed doors ...

DrMrLordX said:
. . . what?

I could argue against this, but I'd be going pretty far off-topic to dismantle this statement. Needless to say, you should probably rethink your position.

It's true, you should look at the linkedin profiles of former AMD graphics employees and their most popular destination is working for Nvidia. Same applies to many Intel graphics employees too. Ever since the console contracts from the last decade or the death of HSA, AMD has suffered quite a bit of defections of their graphics team to Nvidia since AMD had no grand vision for their GPU team to work on. AMD's graphics team in the past pioneered the first DX11 implementation, started the HSA Foundation as one of the founding members, and was responsible for influencing the design of modern graphics APIs with Mantle but all of this came from a time when AMD still had a graphics guru culture right before their Zen project started ...

AMD has been trailing Nvidia harder now than they ever did before the start of the Zen project in terms of graphics technology. Even AMD's CPU design team gets to do all the cool things like optimizing Vega's register file and designing infinity cache too. It's all of the above that makes me think that AMD has lost their GPU guru culture in favour of a CPU guru culture since they seem to want to flex their CPU design team as much as possible even in their graphics chips ...

DrMrLordX said:
Unless NV does something that just ruins the ARM ecosystem for the HPC world, I would expect there to be more interest in SVE2 than AVX-512. When it comes to ML things get more complicated, but:

AMD relies almost exclusively on their GPU products for AI/ML
Intel is throwing spaghetti against the wall hoping for traction in the AI/ML market, including but not limited to: various AVX extensions (bfloat16 etc), two different AI company buyouts, FPGAs, and Xe.

In the end, Intel and AMD's best AI/ML products may well wind up being dGPUs where NV already has them beat (for now).

I hope you understand that's why NV is buying ARM Ltd.

ML is only a small fraction of the high-end compute market so AMD and Intel can keep doing the bare minimum with GPUs whilst Nvidia can't do the same for CPUs if they want a widespread reach of the market ...

Are the Grace CPUs even going to feature SVE/SVE2 ?

DrMrLordX · Apr 23, 2021

ThatBuzzkiller said:
HSA is a continuation of Fusion but the latter was definitely realized before the former was since Fusion started with pre-GCN APUs ...

Fusion started with HT sockets/HTX slots. HTX slots were only ever realized on certain Opteron boards.

What differentiates a vibrant open source project from other open source projects

Full stop. You're missing the point.

ROCm = open source
CUDA = closed source

That is the primary differentiator. AMD would be pleased as punch if people made actual contributions to ROCm, but that's not really the point. Other hardware vendors are free to comply with ROCm fully if they so choose, versus CUDA where only nVidia hardware is supported natively. Not that, you know, anyone else has chosen to support ROCm that I know of.

OpenCL (also ope n source), for all its warts, is (mostly) supported by AMD; Intel; and NV hardware.

It's true, you should look at the linkedin profiles of former AMD graphics employees and their most popular destination is working for Nvidia.

Meanwhile, AMD's dGPU products are positioned better against NV's than they have been in years. Remember Vega? You don't want them to go back to that do you?

AMD had no grand vision for their GPU team to work on.

Completely false. CDNA and RDNA2 are working quite nicely. CCIX is bringing AMD a step closer to the Fusion they had envisioned years ago. Yknow, before they had even produced Llano.

AMD has been trailing Nvidia harder now than they ever did before the start of the Zen project in terms of graphics technology.

Blatantly false. RDNA2 is positioned better against NV's product lineup than Vega ever was. And CDNA cards like the MI100 are beastly. AMD still loses when it comes to ROCm vs. CUDA, but there's enough CUDA lock-in out there that AMD is behind the 8-ball to begin with.

ML is only a small fraction of the high-end compute market

Really? REALLY???

Artificial Intelligence Market Size, Share | Industry Report, 2030

The global artificial intelligence market size was valued at USD 279.22 billion in 2024 and is projected to grow at a CAGR of 35.9% from 2025 to 2030

www.grandviewresearch.com

Are the Grace CPUs even going to feature SVE/SVE2 ?

Doesn't matter, since that isn't NV's priority. Point being:

AVX-512 isn't a significant threat to GPGPU-based anything (FP64, 16-bit ML, 8-bit ML)
AVX-512 isn't a significant threat to ARM vendors that do choose to support SVE/SVE2, such as Fujitsu

NV has a master plan of offering you a complete platform to host their expensive-and-oh-so-wonderful compute cards which is where they make all their money currently. They'll sell you the entire hardware AND software stack, top to bottom. Their compute cards are the stars of the show.

ThatBuzzkiller · Apr 24, 2021

DrMrLordX said:
Full stop. You're missing the point.

ROCm = open source
CUDA = closed source

That is the primary differentiator. AMD would be pleased as punch if people made actual contributions to ROCm, but that's not really the point. Other hardware vendors are free to comply with ROCm fully if they so choose, versus CUDA where only nVidia hardware is supported natively. Not that, you know, anyone else has chosen to support ROCm that I know of.

OpenCL (also ope n source), for all its warts, is (mostly) supported by AMD; Intel; and NV hardware.

If that's what you think then I don't think you understand what open source truly entails ...

Open source is more than just about hosting code in public. Have you ever considered that open source has interesting endpoints other than just for archiving purposes ? On some open source projects, their main goal is releasing their code for the purpose that others will use it in their own projects which is why permissive licenses like MIT licenses is ideal in this circumstance. For other open source projects, their goals could involve attracting contributors so less permissive like GPL licenses are more popular in these cases. In both cases, ROCm meets neither criteria since it only works on AMD hardware so others can't really use most of the codebase in their own projects and AMD isn't interested in community contributions either ...

Just because ROCm meets the definition of what it means to be an open source project by letter does not mean it has the spirit of an open source project which is being useful to other projects or getting maintainers on boards. There's literally nothing impressive about ROCm compared to any of the other run of the mill open source project many of which are already dead or rendered useless. The only thing ROCm has going for it compared to other open source projects is that it's being maintained by AMD but that's not a recipe for long-term success when we consider the fact that AMD has a tendency to abandon GPU compute stacks ...

OpenCL has two parts. The standard which could be argued to be open and then we have driver implementations many of which are closed source. ROCm is ironically the polar opposite since it's a closed standard dictated purely by AMD with an open source implementation so it doesn't really change anything with respect to CUDA ...

DrMrLordX said:
Meanwhile, AMD's dGPU products are positioned better against NV's than they have been in years. Remember Vega? You don't want them to go back to that do you?

The Evergreen architecture and earlier GCN iterations were truly greatness at the time. Vega and RDNA or even RDNA2 shows that AMD are just a shadow of their past graphics division. Instead of ever one upping Nvidia like they did in the past, all AMD ever does now is follow them so they don't have a "GPU guru culture" anymore ...

DrMrLordX said:
Completely false. CDNA and RDNA2 are working quite nicely. CCIX is bringing AMD a step closer to the Fusion they had envisioned years ago. Yknow, before they had even produced Llano.

Fusion was a dead end since the HSA Foundation fell apart. AMD already had concept working but they had to massively readapt it to run on discrete CPUs and GPUs to make it more successful ...

DrMrLordX said:
Really? REALLY???

Artificial Intelligence Market Size, Share | Industry Report, 2030

The global artificial intelligence market size was valued at USD 279.22 billion in 2024 and is projected to grow at a CAGR of 35.9% from 2025 to 2030

www.grandviewresearch.com

Yes, really and Intel makes almost just as much money in a quarter as Nvidia does in it's entire year when we compare their data center sales. Most supercomputers also don't deal with trite like GPUs so the market for GPU compute is far smaller than you imagine it is ...

DrMrLordX said:
Doesn't matter, since that isn't NV's priority. Point being:

AVX-512 isn't a significant threat to GPGPU-based anything (FP64, 16-bit ML, 8-bit ML)
AVX-512 isn't a significant threat to ARM vendors that do choose to support SVE/SVE2, such as Fujitsu

NV has a master plan of offering you a complete platform to host their expensive-and-oh-so-wonderful compute cards which is where they make all their money currently. They'll sell you the entire hardware AND software stack, top to bottom. Their compute cards are the stars of the show.

"AVX-512 isn't a significant threat to GPGPU-based anything (FP64, 16-bit ML, 8-bit ML)"

The same could be said for the other way around ...

"AVX-512 isn't a significant threat to ARM vendors that do choose to support SVE/SVE2, such as Fujitsu"

AVX-512 will be standardized before we even see a SVE2 implementation. If there's no hardware, programmers won't deal with a hypothetical programming model ...

Is Nvidia's master plan is to offer a GPU computing platform ? Been there, done that before! If they want to keep pushing out mediocre CPUs, they'll never be able to properly compete in the upcoming trend of heterogeneous compute ...

DrMrLordX · Apr 24, 2021

ThatBuzzkiller said:
If that's what you think then I don't think you understand what open source truly entails ...

I'll repeat myself

AMD IS OFFERING A LICENSE-FREE ALTERNATIVE TO CUDA FOR HARDWARE VENDORS THAT WANT TO USE IT.

Nothing more, nothing less. You can implement ROCm without paying licensing fees to anyone. OpenCL once served the same purpose ,but Khronos Group has done such a poor job of updating it that AMD had to go their own way. AMD does not give a fig if anyone contributes to the project, so long as it works. Sadly the toolchain often doesn't, but oh well. Learn 2 code AMD.

ROCm is ironically the polar opposite since it's a closed standard dictated purely by AMD

Only because nobody else has adopted it.

The Evergreen architecture and earlier GCN iterations were truly greatness at the time. Vega and RDNA or even RDNA2 shows that AMD are just a shadow of their past

Ehhh don't agree. Raster performance on RDNA2 is fantastic. I don't know where you're going with this. It's incredibly off-topic. Only thing wrong with RDNA2 is that you can't really buy one.

"GPU guru culture"

Yeah okay.

Fusion was a dead end since the HSA Foundation fell apart.

Fusion lives on in CCIX. The name is gone, but the original concepts are fulfilled. To be specific, they're finally achieving something similar to Torrenza:

Torrenza - Wikipedia

en.wikipedia.org

AMD already had concept working but they had to massively readapt it to run on discrete CPUs and GPUs to make it more successful ...

AMD announced Torrenza in 2006. And to say that HSA was working is extremely generous. Did you ever mess with the existing HSA toolchains for Kaveri or Carrizo? I did.

Yes, really and Intel makes almost just as much money in a quarter as Nvidia does in it's entire year

What's that got to do with anything? Intel isn't making all that money in HPC! Data centre sales aren't classified as HPC. Don't mix up numbers.

The same could be said for the other way around ...

How many people do you think actually use AVX-512? For anything? Honestly. Why did Intel buy not one but two AI chip companies AND start a GPGPU initiative when they already have AVX-512?

AVX-512 was the centerpiece upon which Xeon Phi was built. Phi is dead. GPGPU killed it, and if Phi were still alive, A64FX and similar would also be capable of killing it, assuming NV doesn't kill SVE/SVE2.

Is Nvidia's master plan is to offer a GPU computing platform ? Been there, done that before!

Since when?

ThatBuzzkiller · Apr 24, 2021

@DrMrLordX

I just find it really funny how AMD pioneered HSA but then they reused their work (HSA kernel driver) only to copy CUDA (HIP) in the end. Whatever leadership AMD had in GPUs was truly gone by that point which is why I maintain that they don't have a GPU guru culture anymore but regardless we can see eye to eye that this isn't the end of the world for them ...

I'm pretty sure Xeon Phi got killed off by regular CPUs and GPGPU can't run standard C++ code either. Xeon phi did not win all that much compared to regular CPUs since there was kernel launch overhead and slower core design in general so Intel found out that most of the performance boost came from increased memory bandwidth. They were better off just implementing AVX-512 straight into the CPUs. NV doesn't need to kill off SVE/SVE2 either since they never had a CPU guru culture to make competent CPU designs ...

GPU compute platforms have existed ever since CUDA created so adding crappy CPUs to it won't make their entire stack more compelling than it already is ...

Question Nvidia to enter the server CPU market

Lifer

Golden Member

Lifer

Golden Member

Lifer

Golden Member

Diamond Member

Lifer

Golden Member

Lifer

Golden Member

Lifer

Senior member

Diamond Member

Senior member

Senior member

Diamond Member

Golden Member

Diamond Member

Golden Member

Lifer

Golden Member

Lifer

Golden Member

Lifer

Golden Member