Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe that's because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's, for example).

See here for the GFX940-specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before things get merged to GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of no host CPU capable of PCIe 5 being available in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it :grimacing:

This is nuts, the MI100/200/300 cadence is impressive.


Previous thread on CDNA2 and RDNA3 here

 

Joe NYC

Diamond Member
Jun 26, 2021
According to Morgan Stanley Research, MI300X and MI355X are not profitable for running LLM inference, with worse returns than even Huawei's platform. I would get it for training, but for inference? The claim is that software is to blame. I could not find the original report to check their methodology, but on common sense alone the conclusion seems off (no one would be buying such accelerators otherwise). But maybe there is still some truth to it, and that is why sales are not as high as "investors in the second shovel company in the AI gold rush" hoped?

Secondary source: https://wccftech.com/nvidia-blackwe...ns-miles-ahead-of-amd-software-optimizations/

I have seen that disputed (on Twitter) by someone who runs a cloud company using these.
 

Mopetar

Diamond Member
Jan 31, 2011
The other products are very likely not profitable either.

HPC actually stands for Hype Per (venture) Capital months of operation. AMD had a somewhat higher BB (bubble burst) factor, so companies are more hesitant to squander someone else's money on (invest in) their product. Current LLMs (largesse-luring models) are proving to be unsustainable regardless of vendor, though, based on reports of a man in a leather jacket trying to figure out who to sell shovels to next.
 

dr1337

Senior member
May 25, 2020
According to Morgan Stanley Research, MI300X and MI355X are not profitable for running LLM inference, with worse returns than even Huawei's platform. I would get it for training, but for inference? The claim is that software is to blame. I could not find the original report to check their methodology, but on common sense alone the conclusion seems off (no one would be buying such accelerators otherwise). But maybe there is still some truth to it, and that is why sales are not as high as "investors in the second shovel company in the AI gold rush" hoped?

Secondary source: https://wccftech.com/nvidia-blackwe...ns-miles-ahead-of-amd-software-optimizations/
It would be very interesting to know if these are benchmark numbers or vendor numbers.

If this is purely considering customers only, then the data is bupkis. Nvidia has long been entrenched with their customers, and in terms of sheer count there should be more profitable clusters operating on Nvidia hardware from market share alone.

Like, where are the Google numbers coming from? How are they deciding profit on that? Gemini licensing sales? Are people actually paying Google big money on business accounts to use Gemini?
 

Joe NYC

Diamond Member
Jun 26, 2021
It would be very interesting to know if these are benchmark numbers or vendor numbers.

If this is purely considering customers only, then the data is bupkis. Nvidia has long been entrenched with their customers, and in terms of sheer count there should be more profitable clusters operating on Nvidia hardware from market share alone.

Like, where are the Google numbers coming from? How are they deciding profit on that? Gemini licensing sales? Are people actually paying Google big money on business accounts to use Gemini?

Precisely. It is just "data" points derived through analysis, based on a set of assumptions, where half of the assumptions may be incorrect.

There is no raw data.
 

marees

Golden Member
Apr 28, 2024

AMD RDNA 4 GPU Architecture at Hot Chips 2025

By Ryan Smith - August 25, 2025

Recapping AMD’s logical design: a single GPU is made up of multiple shader engines. The L2 cache was enlarged this generation to better prep the architecture for RT workloads. This also marks the 3rd generation of AMD’s Infinity Cache. All of which works to help keep the cores fed.


RT performance in a nutshell: doubling BVH throughput accounts for most of RDNA 4’s RT performance improvement. But out-of-order (OoO) memory accesses, hardware instance transforms, and oriented bounding boxes all add further gains on top, allowing for ~2x RT performance versus RDNA 3.


RDNA 4 also updates the shader engine with dynamic register allocation.
RT tends to eat up a lot of registers, but not during all stages of RT execution. Traversal uses relatively few registers, for example.
RDNA 3 would allocate registers for the worst-case scenario. RDNA 4, on the other hand, can allocate registers dynamically, holding only the registers currently needed and releasing them once they're no longer in use.
In practice, this has allowed AMD to increase the number of waves in flight versus RDNA 3, squeezing another wave into the freed-up registers. A rough sketch of that arithmetic is below.
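To make the occupancy win concrete, here is a minimal back-of-the-envelope sketch. The register file size and per-stage register counts below are made-up round numbers for illustration, not actual RDNA 4 figures:

```python
# Illustrative occupancy arithmetic, not AMD's actual allocator.
# All capacities and per-stage register counts below are assumptions.

VGPRS_PER_SIMD = 1536   # assumed register file capacity per SIMD
WORST_CASE_REGS = 256   # assumed registers needed by the heaviest RT stage
TRAVERSAL_REGS = 96     # assumed registers needed during BVH traversal

# RDNA 3 style: every wave reserves the worst case for its whole lifetime.
static_waves = VGPRS_PER_SIMD // WORST_CASE_REGS            # -> 6 waves

# RDNA 4 style: waves holding only traversal-sized allocations free up
# registers, so extra waves can be squeezed into the reclaimed space.
# Say 4 of the resident waves are currently in traversal:
in_traversal = 4
freed = in_traversal * (WORST_CASE_REGS - TRAVERSAL_REGS)   # 4 * 160 = 640
extra_waves = freed // WORST_CASE_REGS                      # -> 2 more waves

print(f"static allocation:  {static_waves} waves in flight")
print(f"dynamic allocation: up to {static_waves + extra_waves} waves")
```

Under the static scheme, every wave pays for its worst-case stage forever; under the dynamic scheme, registers idle waves aren't using become schedulable capacity, which is exactly the extra-wave-in-flight claim from the slides.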


The media block received some major encoder updates, such as B-frame support for AV1 encoding and overall lower latency.

Meanwhile, the display block adds some features, such as integrating Radeon Image Sharpening 2 into the block itself rather than processing it as a shader effect.
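For anyone wondering what "processing it as a shader effect" involves: RIS has historically been based on AMD's open-source Contrast Adaptive Sharpening (CAS). The exact RIS 2 math isn't public, so the NumPy sketch below is only a heavily simplified illustration of the kind of per-pixel filter the display block would now run in fixed function instead of on the shader cores:

```python
import numpy as np

def simple_cas(img, sharpness=0.5):
    """Heavily simplified contrast-adaptive sharpening (CAS-like) for a
    2D grayscale image with values in [0, 1]. Illustrative only."""
    p = np.pad(img, 1, mode="edge")          # clamp-to-edge borders
    n, s = p[:-2, 1:-1], p[2:, 1:-1]         # north / south neighbours
    w, e = p[1:-1, :-2], p[1:-1, 2:]         # west / east neighbours
    c = img

    mn = np.minimum.reduce([n, s, w, e, c])  # local min of the cross
    mx = np.maximum.reduce([n, s, w, e, c])  # local max of the cross

    # Adaptive amount: back off where local contrast is already near the
    # limits of the displayable range (the "contrast adaptive" part).
    headroom = np.minimum(mn, 1.0 - mx)
    amount = np.sqrt(np.clip(headroom / np.maximum(mx, 1e-6), 0.0, 1.0))

    w_neg = -amount * (sharpness / 4.0)      # negative neighbour weights
    out = (c + w_neg * (n + s + w + e)) / (1.0 + 4.0 * w_neg)
    return np.clip(out, 0.0, 1.0)

# Usage: sharpened = simple_cas(np.random.rand(64, 64).astype(np.float32))
```

Cheap and local as it is, doing this on every displayed frame still occupies shader ALUs; moving it into the display pipeline makes it effectively free for the game.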


 

coercitiv

Diamond Member
Jan 24, 2014
One year ago AMD announced their shift in strategy, supposedly focused on regaining market share:
So, my number one priority right now is to build scale, to get us to 40 to 50 percent of the market faster. Do I want to go after 10% of the TAM [Total Addressable Market] or 80%? I’m an 80% kind of guy because I don’t want AMD to be the company that only people who can afford Porsches and Ferraris can buy. We want to build gaming systems for millions of users.
AMD’s Jack Huynh

So how does this pan out? AMD market share has increased by... checks notes... -2.1%. Yep, that's a minus.

Q2’25 PC graphics add-in board shipments increased 27.0% from last quarter
 

adroc_thurston

Diamond Member
Jul 2, 2023
So how does this pan out? AMD market share has increased by... checks notes... -2.1%. Yep, that's a minus.
well, yeah.
As I've said, the only way to gain mss is to *win*.
Until they ship an 800W, 300+ CU monster, they lose. Until their MSS is 0%.
People (translation: ignoramuses) see only Nvidia in laptops and assume they are the best. Having a sticker on laptops helps a lot.
laptops are a collateral market to AIB DT.
You need to win here to win there.
 

ToTTenTranz

Senior member
Feb 4, 2021
One year ago AMD announced their shift in strategy, supposedly focused on regaining market share:

AMD’s Jack Huynh

So how does this pan out? AMD market share has increased by... checks notes... -2.1%. Yep, that's a minus.

Q2’25 PC graphics add-in board shipments increased 27.0% from last quarter
The 9070XT only released in March, and the 9060XT in June. Nvidia introduced 6 SKUs out of 4 different chips between January and May.
AMD insisted on waiting until they were ultra sure about their competitors' prices before even starting to think about launching their own products.


AMD was months late with RDNA4 for playing that game. The stupidest thing is that I (and many others) saw a fully functioning RX 9070 XT at CES, inside a PC running FSR4 on Ratchet and Clank, with some others on display. But then AMD went silent for two whole months.
 

adroc_thurston

Diamond Member
Jul 2, 2023
The 9070XT only released in March, and the 9060XT in June. Nvidia introduced 6 SKUs out of 4 different chips between January and May.
AMD insisted on waiting until they were ultra sure about their competitors' prices before even starting to think about launching their own products.


AMD was months late with RDNA4 for playing that game. The stupidest thing is that I (and many others) saw a fully functioning RX 9070 XT at CES, inside a PC running FSR4 on Ratchet and Clank, with some others on display. But then AMD went silent for two whole months.
None of that matters.
They just don't win, and you can't move boards when you don't win.
It's pretty dang simple (but something ATi people have failed to grasp for 15+ years).
 

branch_suggestion

Senior member
Aug 4, 2023
One year ago AMD announced their shift in strategy, supposedly focused on regaining market share:

AMD’s Jack Huynh

So how does this pan out? AMD market share has increased by... checks notes... -2.1%. Yep, that's a minus.

Q2’25 PC graphics add-in board shipments increased 27.0% from last quarter
Radeon sales are up q/q for the last couple of quarters; it's just that NV sales are bonkers due to releasing built-up supply at the low end and bubble effects at the high end.
This cannot last, just like both crypto bubbles; eventually there will be an oversupply and a normalisation of things, partially offset by the mid-gen refresh.
Still, the market is officially peak-Intel levels of cooked, but eventually AIB share will become less meaningful as laptops are progressively taken over by APUs.
Desktop DIY share is okay for AMD right now, but that is only one part of the market; Laptop/SI/OEM share for NV is >95%.
But when you factor in all GPUs shipped, including consoles and APUs, it is not impossible to challenge such a moat.
 

marees

Golden Member
Apr 28, 2024
The 9070XT only released in March, and the 9060XT in June. Nvidia introduced 6 SKUs out of 4 different chips between January and May.
AMD insisted on waiting until they were ultra sure about their competitors' prices before even starting to think about launching their own products.


AMD was months late with RDNA4 for playing that game. The stupidest thing is that I (and many others) saw a fully functioning RX 9070 XT at CES, inside a PC running FSR4 on Ratchet and Clank, with some others on display. But then AMD went silent for two whole months.
Looks like what Jack Huynh said was spin.

They are not willing to go for an all-out war for market share / revenue at the expense of margin.

RDNA 5 strategy seems sustainable
  1. AT0 (xx90xt) — shared with cloud gaming / professional use cases
  2. AT2 (xx70xt) — shared with Xbox
  3. AT4 (xx50xt) — shared with Z3E (PC & 'Xbox' handhelds)
  4. AT3 (xx60xt) — shared with edge inferencing / premium laptops, tablets, NUCs

Only AT3 looks a bit dodgy. They may have to sacrifice margins there to scale.

Once they master chiplets, future archs (RDNA 6 or 7) should be more scalable.
 

adroc_thurston

Diamond Member
Jul 2, 2023
Looks like what Jack Huynh said was spin.
yeah lmao.
RDNA 5 strategy seems sustainable
It's sustainable in the sense that no single tapeout is wasted on client discrete graphics, yes.
It's a really dang creative approach to building a stack, but also a de facto surrender in DIY discrete graphics.
AT4 (xx50xt) — shared with Z3E (PC & 'Xbox' handhelds)
handhelds are too poor for mdsP.