Question Speculation: RDNA3 + CDNA2 Architectures Thread


uzzi38

Platinum Member
Oct 16, 2019
2,624
5,894
146

Kepler_L2

Senior member
Sep 6, 2020
330
1,162
106
Nothing is crazier than 2.7x. That has never been done before. The 9700Pro did not manage it, the 8800GTX did not manage it. Heck, as good as RDNA2 is, it did not come anywhere close once you factor in that AMD never made a 300W RDNA part for a like-for-like comparison.

You can talk about interesting layout ideas, crazy low TDPs etc., but if AMD manages to pull off a 2.7x perf increase in a sub-400W TDP part, it will be the first time in the history of GPUs and 3D accelerators that such a leap has been made.
The 9700Pro was actually 3x faster than previous gen in some situations, although that certainly wasn't the average. 8800GTX was like 2.2 to 2.5x faster also.

RDNA3 will be one of the largest leaps ever in performance, efficiency and especially price :p
 

Gideon

Golden Member
Nov 27, 2007
1,625
3,650
136
The 9700Pro was actually 3x faster than previous gen in some situations, although that certainly wasn't the average. 8800GTX was like 2.2 to 2.5x faster also.

RDNA3 will be one of the largest leaps ever in performance, efficiency and especially price :p
Oh, I believe the top end will cost a number I'm not even prepared to utter. But yeah, I can see them achieving speedups in that ballpark if they go all-out on the stacked cache, e.g. 512MB or 1GB. That alone could radically improve performance (>2x) in all bandwidth-bound workloads.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,601
5,780
136
I have some questions on X3D and V-Cache --> Will the tech be ready for N5 by 2Q22? Or is N31 late '22?

Otherwise there are a bunch of key enablers, and it would be sad if AMD failed to hit that target:
X3D Chiplets
V-Cache
N5P PPA 30% Efficiency
N5P MTr gain
New uArch

V-Cache, or rather X3D in general, and GPUs are a match made in heaven.
GPUs have massive die sizes, so moving the cache to chiplets is a double win. Cache chiplets can be made with SRAM-tailored libs for maximum density per mm2. Logic dies can stay within reasonable sizes if the IC is moved out. And IC is a very good fit for RDNA2.
RDNA2 can clock sky high, N5P PPA gains will ensure those insane clocks can be maintained 24/7
~50 TF on the fully enabled N31 on AIB cards would be what I am expecting.
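For reference, a rough sketch of where a number like ~50 TF could come from. The peak FP32 formula (CUs x 64 lanes x 2 FLOPs per FMA x clock) is how RDNA2 figures are normally quoted; the 160 CU count and 2.5 GHz clock used for N31 below are purely assumptions for illustration, not numbers from this post.

```python
def peak_fp32_tflops(compute_units: int, clock_ghz: float, lanes_per_cu: int = 64) -> float:
    """Peak FP32 throughput in TFLOPS, counting an FMA as 2 FLOPs per lane per clock."""
    flops_per_clock = compute_units * lanes_per_cu * 2
    return flops_per_clock * clock_ghz / 1000.0

# Navi 21 (RX 6900 XT): 80 CUs at its 2.25 GHz boost clock.
navi21 = peak_fp32_tflops(80, 2.25)

# Hypothetical N31: 160 CUs at 2.5 GHz (both values are assumptions).
n31_guess = peak_fp32_tflops(160, 2.5)

print(f"Navi 21: {navi21:.2f} TF, hypothetical N31: {n31_guess:.2f} TF")
```

Under those assumptions the hypothetical part lands right around the 50 TF mark, a bit over twice Navi 21's peak figure.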
 

Timorous

Golden Member
Oct 27, 2008
1,608
2,753
136
The 9700Pro was actually 3x faster than previous gen in some situations, although that certainly wasn't the average. 8800GTX was like 2.2 to 2.5x faster also.

RDNA3 will be one of the largest leaps ever in performance, efficiency and especially price :p

If it manages 2.7x performance it will be one for the history books.
 

Mopetar

Diamond Member
Jan 31, 2011
7,835
5,981
136
The crazy rumors start again. But I remember that I was one of the guys who thought that 2.5GHz RDNA2 on OC was crazy talk by some user here.

I think most of us were already pretty skeptical by the time the clock speeds were over 2.2 GHz, so I wouldn't feel bad for thinking that 2.5 GHz was ludicrous, particularly given AMD's past performance with Polaris and Vega not clocking particularly well.
 

Timorous

Golden Member
Oct 27, 2008
1,608
2,753
136
I think most of us were already pretty skeptical by the time the clock speeds were over 2.2 GHz, so I wouldn't feel bad for thinking that 2.5 GHz was ludicrous, particularly given AMD's past performance with Polaris and Vega not clocking particularly well.

I was skeptical until Sony announced 2.23GHz in the PS5.
 

Mopetar

Diamond Member
Jan 31, 2011
7,835
5,981
136
I was skeptical until Sony announced 2.23GHz in the PS5.

I think people were still rightly skeptical because Sony was being a bit coy with how they presented their numbers, and some assumed that it was more of a boost figure, especially given that Microsoft announced a 1.825 GHz clock speed around that same time frame.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
If it manages 2.7x performance it will be one for the history books.
It depends on the context.

2.7 times over Navi 21 - it would have to have 200 CUs, plus some performance per clock increases.
2.7 times over Navi 22 - it would have to have 100 CUs, plus some performance per clock increases.
2.7 times over Navi 23 - it means that it would have to have 80 CUs plus some performance per clock increases.
2.7 times over Navi 24 - it means that it would have to have 24 CUs, plus some performance per clock increases.

So it may not turn out to be "THAT" crazy as people make it to be.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,355
2,848
106
I fixed it for you. ;)
2.7 times over Navi 24 - it means that it would have to have 40 CUs, plus some performance per clock increases.
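The corrected figure follows from the same rule as the other three lines in the quoted list: scale the CU count by 2.5x and leave the rest of the 2.7x to clocks or perf/clock. A quick sketch, assuming the commonly cited full-die CU counts (80/40/32/16, with Navi 24's 16 being the least certain) and perfectly linear scaling with CU count, which is optimistic:

```python
# Full-die CU counts for the RDNA2 lineup (Navi 24's 16 CUs is assumed here).
RDNA2_CUS = {"Navi 21": 80, "Navi 22": 40, "Navi 23": 32, "Navi 24": 16}

CU_SCALE = 2.5   # assumed share of the gain coming from more CUs
TARGET = 2.7     # rumored overall speedup

def required_cus(base_cus: int, cu_scale: float = CU_SCALE) -> int:
    """CU count needed if the CU-driven part of the gain scales linearly."""
    return round(base_cus * cu_scale)

needed = {name: required_cus(cus) for name, cus in RDNA2_CUS.items()}

# What clocks and perf/clock together still have to deliver on top of the CU scaling.
residual = TARGET / CU_SCALE

for name, cus in needed.items():
    print(f"2.7x over {name}: {cus} CUs plus ~{(residual - 1) * 100:.0f}% from clocks/IPC")
```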




BTW Bondrewd mentioned something interesting not so long ago.
Honestly the interesting part will be NV21 vs NV33 because those have similar enough config.
Link
The question is if he meant the full version or not. Even with only 60CU it would end up pretty big; maybe IF won't be integrated into the GPU like it is now.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
BTW Bondrewd mentioned something interesting not so long ago.
Link
The question is if he meant the full version or not. Even with only 60CU it would end up pretty big; maybe IF won't be integrated into the GPU like it is now.
Well, if you look at the specs that I posted above, Navi 33 vs 23 would need to have 80 CUs in order to get 2.5 times the performance.

To get 2.7 times the performance it would need something more, be it higher clock speeds or higher perf/clock.

Navi 21 in this context has 80 CUs.

:)

I'm pretty sure the GDDR bus of Navi 33 will be... surprising to a lot of people.
 

uzzi38

Platinum Member
Oct 16, 2019
2,624
5,894
146
Well, if you look at the specs that I posted above, Navi 33 vs 23 would need to have 80 CUs in order to get 2.5 times the performance.

To get 2.7 times the performance it would need something more, be it higher clock speeds or higher perf/clock.

Navi 21 in this context has 80 CUs.

:)

I'm pretty sure the GDDR bus of Navi 33 will be... surprising to a lot of people.
That's still middling in wtf factor.

N31 die count is the one to wait for.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,355
2,848
106
Well, if you look at the specs that I posted above, Navi 33 vs 23 would need to have 80 CUs in order to get 2.5 times the performance.

To get 2.7 times the performance it would need something more, be it higher clock speeds or higher perf/clock.

Navi 21 in this context has 80 CUs.

:)
It doesn't need to be 80CU. It could be even 72CU or even more than 80CU. ;)

I'm pretty sure the GDDR bus of Navi 33 will be... surprising to a lot of people.
If they can put 256MB IF or more, then I wouldn't be surprised if it had only 128bit GDDR6.
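For scale, raw GDDR6 bandwidth is just bus width times per-pin data rate. A small sketch, assuming 16 Gbps chips (the speed used on the RX 6900 XT; what N33 would actually ship with is unknown):

```python
def gddr_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Raw memory bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8

narrow = gddr_bandwidth_gbs(128, 16.0)  # hypothetical 128-bit N33
navi21 = gddr_bandwidth_gbs(256, 16.0)  # RX 6900 XT: 256-bit at 16 Gbps

print(f"128-bit: {narrow:.0f} GB/s, 256-bit: {navi21:.0f} GB/s")
```

A 128-bit card at the same memory speed would have half of Navi 21's raw bandwidth, which is why the cache size matters so much here.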
 

Trumpstyle

Member
Jul 18, 2015
76
27
91
Based on the latest rumors, this is where I'm at.

Navi 31 = 160CU, 256bit bus
Navi 32 = 120CU, 192bit bus
Navi 33 = 80CU, 128bit bus

They will all have Infinity Cache/3D cache to make up for the lack of memory bandwidth. RDNA3 will need 1.2x higher IPC and a 1.3x clock increase to reach a 2.5x performance gain, which is what I'm going with. (Not 240 CUs or 128 GPU cores per CU.)

The latest rumors have Navi 31 at around a 2.7x performance increase, but since RDNA2 failed to live up to its expectations we need to be extra careful; that's why 2.5x is the safer bet.

RDNA3's top GPU will be around 20-30% faster than Nvidia's Lovelace in gaming performance (raster) and both GPUs will consume about 450W. You can air-cool 450W, so these GPUs will still have mainstream appeal.
 

Timorous

Golden Member
Oct 27, 2008
1,608
2,753
136
RDNA2 failed to live up to its expectations

If by failed to live up to expectations you mean greatly exceeded what most people expected then okay.

People were really skeptical of a 2x performance increase over the 5700XT even though the perf/watt + TDP budget made that seem entirely doable. As it so happens 6900XT is just over 2x the 5700XT at 4k.
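The back-of-the-envelope version of that argument, assuming AMD's claimed ~1.5x perf/watt for RDNA2 over RDNA and the official board power figures (225W for the 5700XT, 300W for the 6900XT), and assuming performance simply scales with both:

```python
def projected_speedup(perf_per_watt_gain: float, old_power_w: float, new_power_w: float) -> float:
    """Speedup if performance scales with the perf/watt gain times the larger power budget."""
    return perf_per_watt_gain * (new_power_w / old_power_w)

speedup = projected_speedup(1.5, 225, 300)
print(f"Projected 6900XT vs 5700XT: {speedup:.2f}x")
```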
 

GodisanAtheist

Diamond Member
Nov 16, 2006
6,783
7,117
136
What if N33 is the largest single GPU chiplet, and all "N3x's" above it are multi-die "chips"?

So you're looking at something really crazy like:

7900XTX = 3x N33 dies, 240CU
7800XT = 2x N33 dies, 160 CU
7700XT = 1x N33 die, 80 CU

Everything below that point can be serviced by an additional N34 die and/or recycled N23/24 dies for the extreme low end.

If we're gonna start the hypetrain, let's go big.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
What if N33 is the largest single GPU chiplet, and all "N3x's" above it are multi-die "chips"?

So you're looking at something really crazy like:

7900XTX = 3x N33 dies, 240CU
7800XT = 2x N33 dies, 160 CU
7700XT = 1x N33 die, 80 CU

Everything below that point can be serviced by an additional N34 die and/or recycled N23/24 dies for the extreme low end.

If we're gonna start the hypetrain, let's go big.
N33 is monolithic.

And I personally expect this GPU to deliver RX 6900 XT +25% performance.

At least, based on rumors. Will it be true? We will see. If we thought that RDNA2 rumors were crazy, RDNA3 rumors blow them out of the water.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,355
2,848
106
Based on the latest rumors, this is where I'm at.

Navi 31 = 160CU, 256bit bus
Navi 32 = 120CU, 192bit bus
Navi 33 = 80CU, 128bit bus

They will all have Infinity Cache/3D cache to make up for the lack of memory bandwidth. RDNA3 will need 1.2x higher IPC and a 1.3x clock increase to reach a 2.5x performance gain, which is what I'm going with. (Not 240 CUs or 128 GPU cores per CU.)

The latest rumors have Navi 31 at around a 2.7x performance increase, but since RDNA2 failed to live up to its expectations we need to be extra careful; that's why 2.5x is the safer bet.
20% higher IPC and 30% higher clocks would mean 56% higher performance, like Mopetar said. With a 160CU chiplet GPU you would far exceed the 2.5x (150%) performance gain you set.
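Spelling out that arithmetic: the gains multiply, so 1.2 x 1.3 is 1.56x at the same CU count, and doubling the CUs on top (the rumored 160 vs Navi 21's 80) gives over 3x if scaling were perfectly linear, which is an assumption and an optimistic one:

```python
ipc_gain = 1.2
clock_gain = 1.3
cu_gain = 160 / 80  # rumored Navi 31 vs Navi 21

per_cu = ipc_gain * clock_gain
total = per_cu * cu_gain  # assumes perfect scaling with CU count

print(f"Per-CU gain: {per_cu:.2f}x, total: {total:.2f}x")
```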

RDNA3's top GPU will be around 20-30% faster than Nvidia's Lovelace in gaming performance (raster) and both GPUs will consume about 450W. You can air-cool 450W, so these GPUs will still have mainstream appeal.
People had problems with 300W (AMD) or 350W (NVIDIA) TBP and now 450W should be OK? Coolers would need to be very expensive and gigantic. The likelihood of this is pretty close to zero.