Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
With the GFX940 patches in full swing since the first week of March, it is looking like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices is much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe the US Govt is starting to prepare the SW environment for El Capitan early (perhaps to avoid a slow bring-up situation like Frontier's).

See here for the GFX940-specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before things get merged to GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of there being no host CPU capable of PCIe 5.0 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it :grimacing:

This is nuts; the MI100/200/300 cadence is impressive.


Previous thread on CDNA2 and RDNA3 here

 
Last edited:

Jaskalas

Lifer
Jun 23, 2004
34,208
8,249
136
Unfortunately, 8K is a fool's errand on small(ish) monitors. It's really only useful for extra-large TVs (85" and above). But the spec wars continue undaunted. Same with 500Hz refresh rate gaming monitors. Nobody's visual cortex processes images that fast.
I did not think I would appreciate 1440p as much as I have.
You'll have to drag me kicking and screaming back to 1080p.
 

SolidQ

Senior member
Jul 13, 2023
578
706
96
Well... N43 is dead, long live N48? )
You mean this?
[attached image]
 

Ajay

Lifer
Jan 8, 2001
16,094
8,110
136
You mean this?
[attached image]
Geez, what a totally blown generation if true. Too bad AMD couldn't salvage one 60/64 WGP die for a decent performance gain, maybe even over the 7900XT at a lower price. Though, there is that AMD GPU engineer who posted a protest.
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
For anyone thinking these leaks are true, you may have missed what appears to be a totally legitimate AMD engineer calling BS on all this and saying they were testing the new top-end card as he was typing. So yeah, total BS.

Anyway, I just realized: the Sony PS5 Pro leak put it at 60 CUs (not 64), and the 3.5 leak puts it at 40 CUs. So, erm, updated guessed specs, because now it's 20 CUs per?:

Edit- Hmmm
(?)160CU N3E 384-bit 32Gbps 2.5GHz 450W: 70% faster than a 4090
100CU N3E 256-bit 2.8GHz 350W: 20% faster than a 4090
60CU N4P 192-bit 3.0GHz 250W: 4070 Ti < this < 4080
40CU N4P 128-bit 3.1GHz 175W: basically a 7700
 
Last edited:

TESKATLIPOKA

Platinum Member
May 1, 2020
2,554
3,100
136
Edit- Hmmm
(?)160CU N3E 384-bit 32Gbps 2.5GHz 450W: 70% faster than a 4090
100CU N3E 256-bit 2.8GHz 350W: 20% faster than a 4090
60CU N4P 192-bit 3.0GHz 250W: 4070 Ti < this < 4080
40CU N4P 128-bit 3.1GHz 175W: basically a 7700
Some of your performance projections are wrong.

7900 XTX vs RTX 4090
100% vs 124% (TPU, 4K raster)
This 160CU 2.5GHz GPU has only 67% more CUs; at best it would be 67% faster than the 7900 XTX, or ~35% faster than the RTX 4090, and in reality less because of imperfect scaling.

The 100CU 2.8GHz part also won't be 20% faster than the 4090. The extra 4 CUs and 12% higher frequency would at best make it ~17% faster than the 7900 XTX, but in reality less. So still under the RTX 4090.

The 60CU 3GHz part would at best be a few % faster than the 4070 Ti, while the 4080 would be ~20% faster than it.

6700 XT vs 7700 XT
100% vs 128%
The 40CU 3.1GHz GPU could be more or less on the level of the 7700 XT, but fast GDDR7 is a must.
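
If anyone wants to sanity-check that arithmetic, here is a quick Python sketch. It assumes performance scales linearly with CU count times clock, which is a best-case ceiling; the 96 CU / ~2.5 GHz baseline and the 124% figure for the 4090 are the 7900 XTX numbers from above.

```python
# Best-case scaling check for the rumored specs above.
# Assumption: performance scales linearly with CU count * clock.
# Real scaling is worse, so these are upper bounds.

BASE_CU, BASE_CLK = 96, 2.5   # 7900 XTX: 96 CUs at ~2.5 GHz
RTX_4090 = 1.24               # TPU 4K raster: 4090 = 124% of the 7900 XTX

def best_case_vs_4090(cu: int, clk_ghz: float) -> float:
    """Upper-bound speedup vs the RTX 4090 under linear CU*clock scaling."""
    vs_xtx = (cu / BASE_CU) * (clk_ghz / BASE_CLK)
    return vs_xtx / RTX_4090

for cu, clk in [(160, 2.5), (100, 2.8), (60, 3.0), (40, 3.1)]:
    print(f"{cu}CU @ {clk}GHz: at best {best_case_vs_4090(cu, clk):.2f}x a 4090")
# The 160CU part lands at ~1.34x; the 100CU part at ~0.94x, i.e. under the 4090.
```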
 
Last edited:

Timorous

Golden Member
Oct 27, 2008
1,769
3,313
136
For anyone thinking these leaks are true, you may have missed what appears to be a totally legitimate AMD engineer calling BS on all this and saying they were testing the new top-end card as he was typing. So yeah, total BS.

Anyway, I just realized: the Sony PS5 Pro leak put it at 60 CUs (not 64), and the 3.5 leak puts it at 40 CUs. So, erm, updated guessed specs, because now it's 20 CUs per?:

Edit- Hmmm
(?)160CU N3E 384-bit 32Gbps 2.5GHz 450W: 70% faster than a 4090
100CU N3E 256-bit 2.8GHz 350W: 20% faster than a 4090
60CU N4P 192-bit 3.0GHz 250W: 4070 Ti < this < 4080
40CU N4P 128-bit 3.1GHz 175W: basically a 7700

A new top-end card does not necessarily mean a new enthusiast-tier card. If the stack is truncated like it was with RDNA 1, then the top-end card may be mid-range.

What I would say though is that the supposed MI300-like GPU would be expensive, like $2K+ expensive, to make it even worth existing, and at that price it would need to be by far the fastest thing money can buy, so likely a 450W 295X2-style product without the CrossFire drawbacks. I also do not see this kind of part having the perf/W to make the performance it needs viable at closer to 300W. It would need to be 2x the perf/W of RDNA 3 to stand a chance, and that is just unrealistic IMO.

With that in mind it still leaves a 350W-and-below product stack, so if we split the parts by power tiers we get something like 350W, 300W, 250W, 200W and 150W or thereabouts.

A 50% perf/W gain at 350W would give you 4090 + 20%, and given the perf/W miss on RDNA 3 I expect 50% is the minimum target. Based on TPU data the 7900XTX has 37% more perf/W than the 6900XT. AMD must be trying to make up lost ground, so I would not be surprised if the internal target is around a 65% perf/W increase over RDNA 3, which would net them a ~125% perf/W gain over RDNA 2 and put them back on track.
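
The compounding behind that ~125% figure, as a quick sketch (perf/W gains multiply across generations; the 37% is TPU's 6900 XT to 7900 XTX number, the 65% is my guessed internal target):

```python
# Perf/W gains compound multiplicatively across generations.
rdna2_to_3 = 1.37   # TPU: 7900 XTX vs 6900 XT perf/W
rdna3_to_4 = 1.65   # guessed internal target for RDNA 4

total = rdna2_to_3 * rdna3_to_4
print(f"RDNA 2 -> RDNA 4 perf/W: {total:.2f}x (~{(total - 1) * 100:.0f}% gain)")

# At a fixed 350W, a 50% perf/W gain over the 7900 XTX (which trails
# the 4090 by 24% in 4K raster) lands at roughly 4090 + 20%:
print(f"350W part vs 4090: {1.50 / 1.24:.2f}x")
```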

That would give the following as a rough performance stack vs current and an estimate of 5000 series.

5090 - peerless
350W ~ 4090 + 20% ~ 5080
300W ~ 4090 ~ 5070 Ti
250W ~ 4080 / 7900XTX ~ 5070
200W ~ 4070Ti / 7900XT
175W ~ 4070 / 7800XT ~ 5060Ti
150W ~ 7700XT ~ 5060

Somewhat competitive depending on price, RT performance and feature set but nothing to compete with the 5090 imo.
 

eek2121

Diamond Member
Aug 2, 2005
3,118
4,454
136
For anyone thinking these leaks are true, you may have missed what appears to be a totally legitimate AMD engineer calling BS on all this and saying they were testing the new top-end card as he was typing. So yeah, total BS.

Anyway, I just realized: the Sony PS5 Pro leak put it at 60 CUs (not 64), and the 3.5 leak puts it at 40 CUs. So, erm, updated guessed specs, because now it's 20 CUs per?:

Edit- Hmmm
(?)160CU N3E 384-bit 32Gbps 2.5GHz 450W: 70% faster than a 4090
100CU N3E 256-bit 2.8GHz 350W: 20% faster than a 4090
60CU N4P 192-bit 3.0GHz 250W: 4070 Ti < this < 4080
40CU N4P 128-bit 3.1GHz 175W: basically a 7700
Engineers sign NDAs just like everyone else.

He likely was saying that to kill the conversation. It is a popular tactic used by lawyers when they aren't in a courtroom to shut someone up.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,110
136
Engineers sign NDAs just like everyone else.

He likely was saying that to kill the conversation. It is a popular tactic used by lawyers when they aren’t in a court room to shut someone up.
He didn't disclose any IP, market strategy, financials, etc. I think he was venting. The question is: did AMD 1) realize the problems early enough to develop an alternative to the original 'BIG' RDNA4 design and get one or two 60 WGP tiles into an enthusiast card, or 2) already have a fallback plan given the known difficulty of fully implementing the multi-tile/chiplet architecture of RDNA4? In the first case, the higher-end cards will probably take a bit longer to get out. In the second, they may come out in a more timely manner if enough engineers were working on it. Anyway, this is all 100% speculation until we get some update from AMD in a quarterly report or a public event.
 

Timorous

Golden Member
Oct 27, 2008
1,769
3,313
136
He didn't disclose any IP, market strategy, financials, etc. I think he was venting. The question is: did AMD 1) realize the problems early enough to develop an alternative to the original 'BIG' RDNA4 design and get one or two 60 WGP tiles into an enthusiast card, or 2) already have a fallback plan given the known difficulty of fully implementing the multi-tile/chiplet architecture of RDNA4? In the first case, the higher-end cards will probably take a bit longer to get out. In the second, they may come out in a more timely manner if enough engineers were working on it. Anyway, this is all 100% speculation until we get some update from AMD in a quarterly report or a public event.

Option 3 is that the super RDNA 4 card was always a super-halo-tier product, because anything that complex and using that much silicon needs to sell for a lot of money to make the margin remotely worthwhile, and as such the rest of the stack below this tier was also planned and is still on schedule.
 

PJVol

Senior member
May 25, 2020
726
680
136
Option 3 is that the super RDNA 4 card was always a super-halo-tier product, because anything that complex and using that much silicon needs to sell for a lot of money to make the margin remotely worthwhile, and as such the rest of the stack below this tier was also planned and is still on schedule.
I think the incentive to cancel such a product is not margin-related (I don't see an issue selling it for $1.5K if the rest is assured), but rather the time (human resources) plus the money spent on R&D. It's too expensive at the moment in that sense.
 
Last edited:

Tigerick

Senior member
Apr 1, 2022
701
628
106
AMD must be trying to make up lost ground, so I would not be surprised if the internal target is around a 65% perf/W increase over RDNA 3, which would net them a ~125% perf/W gain over RDNA 2 and put them back on track.
AMD totally REDACTED up the design of N31: with such a huge memory bandwidth increase, they only increased the CU count by 20%. No wonder they have to price it below $1,000. With GDDR7 support on the upcoming RDNA5, hopefully AMD will bump up the CU count to match the bandwidth improvement. Assuming clock speeds stay the same across the generation, my rough estimate of the CU count needed to feed the bandwidth is slightly higher than your 125% figure; then it hit me that 210 CUs might be the number, since there was a rumor about 270 CUs before. We shall see...
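
To put rough numbers on that bandwidth-per-CU argument, here is a sketch using public board specs; the 36 Gbps GDDR7 pin speed is a pure assumption on my part, and this ignores Infinity Cache entirely:

```python
# Back-of-envelope: how many CUs would keep N31's bandwidth-per-CU ratio?
# Assumes flat clocks and ignores Infinity Cache, like the estimate above.

def bandwidth_gb_s(bus_bits: int, gbps_per_pin: float) -> float:
    return bus_bits / 8 * gbps_per_pin

n21_cu, n21_bw = 80, bandwidth_gb_s(256, 16)   # 6900 XT: 512 GB/s
n31_cu, n31_bw = 96, bandwidth_gb_s(384, 20)   # 7900 XTX: 960 GB/s

print(f"N21 -> N31: bandwidth {n31_bw / n21_bw:.2f}x, CUs {n31_cu / n21_cu:.2f}x")

# Hypothetical 384-bit GDDR7 board at 36 Gbps per pin:
gddr7_bw = bandwidth_gb_s(384, 36)             # 1728 GB/s
print(f"CUs to match N31's bandwidth per CU: {n31_cu * gddr7_bw / n31_bw:.0f}")
```

That naive ratio lands around 170-ish CUs rather than 210, so a 210 CU part would imply growing CUs faster than raw bandwidth.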

That would give the following as a rough performance stack vs current and an estimate of 5000 series.

5090 - peerless
350W ~ 4090 + 20% ~ 5080
300W ~ 4090 ~ 5070 ti
250W ~ 4080 / 7900XTX ~ 5070
200W ~ 4070Ti / 7900XT
175W ~ 4070 / 7800XT ~ 5060Ti
150W ~ 7700XT ~ 5060

Somewhat competitive depending on price, RT performance and feature set but nothing to compete with the 5090 imo.

Yeap, based on my estimate the 5070 Ti, or whatever name NV gives it, would perform similarly to the current 4090 at much lower price points. This model is going to offer plenty of incentive for current RTX 3080 users to upgrade if NV prices it at $799.

The 7700XT and 7800XT are surprisingly competitive against NV's offerings; that's why I think AMD will keep selling these models for two years until N52/N53 comes up...
 
Last edited by a moderator:

Ajay

Lifer
Jan 8, 2001
16,094
8,110
136
Yeap, based on my estimate the 5070 Ti, or whatever name NV gives it, would perform similarly to the current 4090 at much lower price points. This model is going to offer plenty of incentive for current RTX 3080 users to upgrade if NV prices it at $799.
Going by recent trends, if the 5070 Ti has performance that good, it'll be $1000 - $1200 US. And it'll probably only have 16GB of RAM.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,701
5,419
96
AMD totally fucked up the design of N31: with such a huge memory bandwidth increase, they only increased the CU count by 20%
Because the thing is missing 30% of its clockrate.
hopefully AMD will bump up the CU count to match the bandwidth improvement
mmmmmmore speeeeeeeeeed.
Yeap, based on my estimate the 5070 Ti, or whatever name NV gives it, would perform similarly to the current 4090 at much lower price points.
N3E is barely a shrink off N4, especially for SRAM-heavy designs like modern GPUs.
 

Mopetar

Diamond Member
Jan 31, 2011
8,145
6,843
136
AMD put the Infinity Cache on the MCDs, which use N6, so they aren't eating a lot of extra cost for cache that barely scales. I don't know if it's necessary, but using V-Cache on those seems like a good way to increase the cache size and give users a reason to drop extra $$$ on a model that's got extra bells and whistles.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,659
3,801
106
Looking at the RDNA 4c design, AMD moved on from MCDs in the future chiplet design. There will not be individual MCDs.

In RDNA 4c, the base die took over the functionality of two MCDs, including their Infinity/MALL caches.

The concept is still similar to the MCD approach, but the implementation will be different (in RDNA 5, chiplet based). And the odds are it will continue to be on TSMC N6.