Question Speculation: RDNA3 + CDNA2 Architectures Thread



Timorous

Golden Member
Oct 27, 2008
1,607
2,747
136
Just to get a rough idea of the RT performance of the RX7900XTX:

I'll use the AMD slide and the Kitguru RTX 4080 review, since I find it to have similar fps on the RX 6950XT to AMD's own slides.

Unfortunately the review only has CP77 and Resident Evil.


Cyberpunk 77 4K RayTracing (no DLSS/FSR)


4080 = 28.1fps

7900XTX = 21fps

6950XT = 13fps



Resident Evil : Village 4K RayTracing (no DLSS/FSR)

4080 = 138.4fps

7900XTX = 135fps

6950XT = 92fps

RADEON-RX-7900-2.jpg


Cyber-DXR3-768x768.png


REVDXR3-768x768.png

This is always tricky to judge, but I think if we go by the deltas between the 6950XT and the 7900XTX in the slides AMD showed we can get some ballpark info.

Using these numbers the XTX gets 134 fps in RE:V and 19.9 fps in CP2077.

At Techspot we see Dying Light hit 36 fps vs the 4080's 39 fps, and CP2077 there gets 21 fps vs the 4080's 31 fps.

At TPU the same scaling would get a 7900XTX to 20.6 fps vs the 4080's 29 fps. RE:V is 120.6 vs 120.6.

So CP2077 will probably be a big win for the 4080 in RT, but RE:V and Dying Light look pretty close, with the edge towards the 4080 but by far less than the price increase.

At 4K the 4080 does seem to lose a bit more performance when turning RT on than the 4090 does.
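Back-of-napkin version in Python, if anyone wants to poke at the numbers. The slide fps values are the ones above; the review_6950xt values are just placeholders I picked so the output lines up with the estimates quoted in this post, not numbers pulled from the review itself.

```python
# Sketch of the estimate above: scale a review's measured 6950XT result by the
# 6950XT -> 7900XTX uplift implied by AMD's own slides. Not measured data.

def project_7900xtx(slide_6950xt, slide_7900xtx, review_6950xt):
    """Apply AMD's slide uplift to a third-party 6950XT number."""
    return review_6950xt * (slide_7900xtx / slide_6950xt)

# Cyberpunk 2077, 4K RT, no upscaling: AMD slide shows 13 -> 21 fps (~1.6x)
print(project_7900xtx(13, 21, review_6950xt=12.3))   # placeholder review value -> ~19.9 fps

# RE: Village, 4K RT: AMD slide shows 92 -> 135 fps (~1.47x)
print(project_7900xtx(92, 135, review_6950xt=91.0))  # placeholder review value -> ~133.5 fps
```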
 
  • Like
Reactions: Kaluan

GodisanAtheist

Diamond Member
Nov 16, 2006
6,783
7,115
136
With a glut of N6 capacity, AMD can crank these out and sell them at competitive prices.

- Ok, looks like we're back to N33 having 2048 Dual Pumped SPs, so take a 6600xt with a 17.5% IPC bump and that's essentially what we're looking at.

No way N33 hits 6900xt or even 6800 levels, I think. It may be closer to a 6750xt at the end of the day, but it will go for peanuts thanks to a basic design and a cheap manufacturing process.

Edit: let's say AMD gets it to clock to 2.8-3.0 GHz, in which case we're looking at a solid 6800 competitor for likely $399. Not shabby.
 
Last edited:

Joe NYC

Golden Member
Jun 26, 2021
1,934
2,272
106
- Ok, looks like we're back to N33 having 2048 Dual Pumped SPs, so take a 6600xt with a 17.5% IPC bump and that's essentially what we're looking at.

No way N33 hits 6900xt or even 6800 levels, I think. It may be closer to a 6750xt at the end of the day, but it will go for peanuts thanks to a basic design and a cheap manufacturing process.

Edit: let's say AMD gets it to clock to 2.8-3.0 GHz, in which case we're looking at a solid 6800 competitor for likely $399. Not shabby.

N33 does not have to beat N21. That was a fantasy scenario that was hard to believe; if for no other reason, the best-case scenario would be 1/2 of the memory bandwidth of N21.

With a die size of ~N23 and the cost of N23, AMD can sell these in the $300 range.
 

Kaluan

Senior member
Jan 4, 2022
500
1,071
96
This is always tricky to judge, but I think if we go by the deltas between the 6950XT and the 7900XTX in the slides AMD showed we can get some ballpark info.

Using these numbers the XTX gets 134 fps in RE:V and 19.9 fps in CP2077.

At Techspot we see Dying Light hit 36 fps vs the 4080's 39 fps, and CP2077 there gets 21 fps vs the 4080's 31 fps.

At TPU the same scaling would get a 7900XTX to 20.6 fps vs the 4080's 29 fps. RE:V is 120.6 vs 120.6.

So CP2077 will probably be a big win for the 4080 in RT, but RE:V and Dying Light look pretty close, with the edge towards the 4080 but by far less than the price increase.

At 4K the 4080 does seem to lose a bit more performance when turning RT on than the 4090 does.
Native 4K RT is a relevant metric, but realistically no one would run that; no one buys $1K+ GPUs to play at ~30 FPS.

Almost everyone who really wants RT at 4K would use upscaling tech to achieve good framerates (where applicable).

This is why how N31 (and AD103) performs at native 1080p-1440p RT would be more relevant; just add an approximation of the FSR/DLSS overhead. But we have zero sub-4K numbers from AMD, so eh.
N33 does not have to beat N21. That was a fantasy scenario that was hard to believe; if for no other reason, the best-case scenario would be 1/2 of the memory bandwidth of N21.

With a die size of ~N23 and the cost of N23, AMD can sell these in the $300 range.
The N33* die itself should be quite a bit cheaper than N33, it's <86% of N23 and on N6 (which is allegedly cheaper to produce than N7/N7+/N7P).

IDK the GDDR clocks, but even if they go 20Gbps instead of 18 (I don't think they'll use 16 again), the costs likely won't be obscene for an 8GB setup.
 
Last edited:

TESKATLIPOKA

Platinum Member
May 1, 2020
2,355
2,848
106
The N33 die itself should be quite a bit cheaper than N23, it's <86% of N23 and on N6 (which is allegedly cheaper to produce than N7/N7+/N7P).
Fixed It for you.
- Ok, looks like we're back to N33 having 2048 Dual Pumped SPs, so take a 6600xt with a 17.5% IPC bump and that's essentially what we're looking at.

No way N33 hits 6900xt or even 6800 levels, I think. It may be closer to a 6750xt at the end of the day, but it will go for peanuts thanks to a basic design and a cheap manufacturing process.

Edit: let's say AMD gets it to clock to 2.8-3.0 GHz, in which case we're looking at a solid 6800 competitor for likely $399. Not shabby.
RX 6600 XT has a 2359 MHz game frequency.
N33 -> 2800 MHz gaming frequency.
The IPC gain should be lower than 17.5%, let's say 15%.
A gaming frequency of 2.8GHz would mean 100 * 1.15 * (2800/2359) = ~137%.

TPU   | RX 6600 XT | RX 6700 XT | RX 6800
1080p | 100%       | 122%       | 141%
1440p | 100%       | 127%       | 155%
2160p | 100%       | 141%       | 181%

Hopefully they won't go over $399; then it will be a good card.
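Same napkin math as a quick Python sketch, for both ends of the clock range discussed above. The 15% IPC figure and the clocks are the assumptions from this post, nothing official.

```python
# Relative performance vs the RX 6600 XT from an assumed IPC gain and game clock.
def relative_perf(new_clock_mhz, ipc_gain, base_clock_mhz=2359):
    return (1.0 + ipc_gain) * (new_clock_mhz / base_clock_mhz)

print(f"{relative_perf(2800, 0.15):.1%}")  # ~136.5% of a 6600 XT at 2.8 GHz
print(f"{relative_perf(3000, 0.15):.1%}")  # ~146% at 3.0 GHz
```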
 

uzzi38

Platinum Member
Oct 16, 2019
2,622
5,880
146
The N33 die itself should be quite a bit cheaper than N23, it's <86% of N23 and on N6 (which is allegedly cheaper to produce than N7/N7+/N7P).
PS: N6 is actually ever so slightly more expensive for a given die area, but not by much. Like, cost/transistor is basically flat from N7.

Either way, your point is still correct, N33 should be cheaper to produce than N23.
 
  • Like
Reactions: Tlh97 and Kaluan

Timorous

Golden Member
Oct 27, 2008
1,607
2,747
136
PS: N6 is actually ever so slightly more expensive for a given die area, but not by much. Like, cost/transistor is basically flat from N7.

Either way, your point is still correct, N33 should be cheaper to produce than N23.

TSMC were pushing companies to use N6 because it has fewer steps, so higher WPM. They were offering discounts to move to that node vs N7, so unless TSMC have jacked up prices now that more people have moved, I am not sure this is actually correct.
 
  • Like
Reactions: Tlh97 and KompuKare

DisEnchantment

Golden Member
Mar 3, 2017
1,601
5,780
136
TSMC were pushing companies to use N6 because it has fewer steps, so higher WPM. They were offering discounts to move to that node vs N7, so unless TSMC have jacked up prices now that more people have moved, I am not sure this is actually correct.
Jacked-up prices of N6 forcing people to stay on N7, and then N7/N6 utilization at 50%? Some things are not adding up.
Capacity utilization rates for TSMC's 7nm process platform and its process variants N6, N7/N6 have fallen below 50%, according to industry sources.
DigiTimes usually downplays pretty much everyone else other than TSMC, but their TSMC info is usually very good.

BTW we also have ASICs on N7, but it's meaningless to disclose anything (even if I wanted to) because everyone pays a different rate for wafers. It is not a commodity item you buy from the fish market; you enter into a business contract for wafer supply. You take a penalty if you don't accept delivery of the wafers, and vice versa.
 

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
Jacked-up prices of N6 forcing people to stay on N7, and then N7/N6 utilization at 50%? Some things are not adding up.

DigiTimes usually downplays pretty much everyone else other than TSMC, but their TSMC info is usually very good.

BTW we also have ASICs on N7, but it's meaningless to disclose anything (even if I wanted to) because everyone pays a different rate for wafers. It is not a commodity item you buy from the fish market; you enter into a business contract for wafer supply. You take a penalty if you don't accept delivery of the wafers, and vice versa.
Can you at least say if it's going up or down? ;)
 

Kaluan

Senior member
Jan 4, 2022
500
1,071
96
Fixed It for you.

RX 6600 XT has a 2359 MHz game frequency.
N33 -> 2800 MHz gaming frequency.
The IPC gain should be lower than 17.5%, let's say 15%.
A gaming frequency of 2.8GHz would mean 100 * 1.15 * (2800/2359) = ~137%.

TPU   | RX 6600 XT | RX 6700 XT | RX 6800
1080p | 100%       | 122%       | 141%
1440p | 100%       | 127%       | 155%
2160p | 100%       | 141%       | 181%

Hopefully they won't go over $399; then it will be a good card.
Thanks, was incredibly tired and sleepy when I wrote that.

RDNA3 CU scaling is also a factor that's hard to account for (so far, from 1st party data, N31 5376 SP to N31 6144 SP seems to scale very well). Then there's the reduced VGPR size on N33, which may or may not affect performance on lower CU count designs like it.

RX 6800 perf @ 1080p to 1440p @ ~170W is probably a safe bet, for now. If they price it a bit more aggressively than the N23 launch MSRP (and considering inflation), I keep thinking they could finally have a true Polaris successor on the market.
 

PJVol

Senior member
May 25, 2020
533
446
106
Is it me, or do the recent slides clearly indicate the N31 raster performance (unless the math below is flawed somewhere)?
80CU -> 96CU - 1.20x
IPC - 1.174x (RX-810 endnote, "minimal uplift" based on tests in ~20 games and 5 synthetic benchmarks)
Clocks - 1.1x (~10%; I added 1% to the 9% boost clock increase from 6950XT -> 7900XTX)
--------------------------
Total: +55% performance relative to the 6950XT
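Spelled out, the same factors multiply to the +55% figure:

```python
cu_scaling = 96 / 80   # 1.20x more CUs
ipc        = 1.174     # from the RX-810 endnote
clocks     = 1.10      # ~10% higher clocks
print(cu_scaling * ipc * clocks)  # ~1.55 -> roughly +55% over the 6950XT
```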
 
Last edited:

SteinFG

Senior member
Dec 29, 2021
410
464
106
Is it me, or do the recent slides clearly indicate the N31 raster performance (unless the math below is flawed somewhere)?
80CU -> 96CU - 1.20x
IPC - 1.174x (RX-810 endnote, "minimal uplift" based on tests in ~20 games and 5 synthetic benchmarks)
Clocks - 1.1x (~10%; I added 1% to the 9% boost clock increase from 6950XT -> 7900XTX)
--------------------------
Total: +55% performance relative to the 6950XT
Well yes, AMD already released slides with 4K game FPS. It's basically +55%.
 

Mopetar

Diamond Member
Jan 31, 2011
7,831
5,980
136
Not gonna lie, the so-called 7"9"00XT looks like poor value, on its own but also compared to the XTX.

Naming doesn't fit IMO. Should've been 7800XT and 7900XT, although I think AMD can't go below that $899 price tag without sacrificing margins, and that's why it's called 7900XT and not 7800XT.

The chiplet based design does give them more flexibility in terms of how many of those they produce. I think that AMD took a page out of Apple's book and are using the 7900XT as the entry level model that is just there to make the next level (only $100 more) look so much better by comparison.

Depending on what Navi 32 winds up looking like, the 7900XT might have a hard time justifying itself against a 16 GB card if it clocks a lot higher. Especially if that card tops out at $600.
 

Saylick

Diamond Member
Sep 10, 2012
3,125
6,296
136
Saw this Tweet where WildC calculates the energy usage for the fan out for the MCDs:

Now, 17W divided by 5% implies a GPU power of 340W, which is rather high?

1668619785028.png

If we assume that the effective bandwidth is more appropriate, rather than the 5.3TB/s of peak bandwidth, then the fan out costs 11W effective rather than 17W. This would mean a GPU power of 221W minimum, which is much more in line with my expectations. Ultimately, 11W is really a small price to pay to enable MCM/chiplets for GPUs. There's going to be a latency hit of course, but GPUs have historically been good at hiding latency. Besides, the latency of a cache hit is still far lower than going out to VRAM.

1668620114290.png
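For anyone wanting to sanity-check that arithmetic, here it is as a sketch. The ~0.4 pJ/bit link energy is inferred from the 17 W at 5.3 TB/s pairing in the tweet; it's not a figure AMD has stated.

```python
# Fan-out power from bandwidth, and the GPU power implied by AMD's "<5%" claim.
PJ_PER_BIT = 0.4e-12   # assumed link energy (J/bit), inferred from 17 W @ 5.3 TB/s

def fanout_watts(bandwidth_tb_s):
    return bandwidth_tb_s * 1e12 * 8 * PJ_PER_BIT

peak_w = fanout_watts(5.3)          # ~17 W at peak bandwidth
eff_w  = fanout_watts(3.5)          # ~11 W at the quoted effective bandwidth
print(peak_w / 0.05, eff_w / 0.05)  # ~340 W vs ~225 W implied "GPU power"
```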
 

SteinFG

Senior member
Dec 29, 2021
410
464
106
Now, 17W divided by 5% implies a GPU power of 340W, which is rather high?
No. The 7900 XTX TDP is 355W, and it just seems like they were basing the 5% claim on 355W.
If we assume that the effective bandwidth is more appropriate, rather than the 5.3TB/s of peak bandwidth, then the fan out costs 11W effective rather than 17W.
AMD page on the XTX GPU states: "Effective Memory Bandwidth: Up to 3500 GB/s"
So 3.5TB/s is tops, not 5.3.
The 5.3 figure is probably max link speed, but not max data speed. Like with DDR memory, you can't reach the theoretical 3200*128/8 = 51.2 GB/s on DDR4-3200, even if the link speed is equal to that.
This would mean a GPU power of 221W minimum, which is much more in line with my expectations
How is a 221W power consumption more in line than 340W, if the subject is a 355W+ GPU?
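Quick illustration of peak vs quoted numbers. The DDR4 case is the one above; the GDDR6 line assumes the XTX's rated 20Gbps modules on a 384-bit bus, which is how you get a raw figure well under AMD's "effective" one.

```python
def peak_gb_s(transfers_mt_s, bus_bits):
    """Theoretical peak bandwidth: transfer rate x bus width in bytes."""
    return transfers_mt_s * bus_bits / 8 / 1000

print(peak_gb_s(3200, 128))   # 51.2 GB/s for DDR4-3200 on a 128-bit bus
print(peak_gb_s(20000, 384))  # 960 GB/s raw GDDR6 at 20 Gbps on a 384-bit bus
# AMD's "up to 3500 GB/s effective" then comes from cache hits on top of the
# DRAM link, which is why it can exceed the raw GDDR6 number.
```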
 

Kaluan

Senior member
Jan 4, 2022
500
1,071
96
Is it me, or do the recent slides clearly indicate the N31 raster performance (unless the math below is flawed somewhere)?
80CU -> 96CU - 1.20x
IPC - 1.174x (RX-810 endnote, "minimal uplift" based on tests in ~20 games and 5 synthetic benchmarks)
Clocks - 1.1x (~10%; I added 1% to the 9% boost clock increase from 6950XT -> 7900XTX)
--------------------------
Total: +55% performance relative to the 6950XT

Here's the rest of the data extrapolated from the performance uplift numbers they've shown:

6950XT to 7900XTX
~55% (4K raster)
~68% (4K RT)
~62% (4K combined)

6950XT to 7900XT
~30% (4K raster)
~46% (4K RT)
~38% (4K combined)

7900XT to 7900XTX
~19% (4K raster)
~15% (4K RT)
~17% (4K combined)
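The XT-to-XTX deltas are just the ratio of the two uplifts over the 6950XT, as a quick consistency check:

```python
xtx = {"raster": 1.55, "rt": 1.68, "combined": 1.62}  # 6950XT -> 7900XTX
xt  = {"raster": 1.30, "rt": 1.46, "combined": 1.38}  # 6950XT -> 7900XT

for k in xtx:
    print(k, f"{xtx[k] / xt[k] - 1:.0%}")  # raster ~19%, rt ~15%, combined ~17%
```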


BTW, there are some more numbers (some new) they have up on their RX 7900 series master page, but I don't care for comparing them to 3rd party data; it might be messy/misleading:

rdna3 5.jpg
 
  • Like
Reactions: Tlh97 and Elfear

Saylick

Diamond Member
Sep 10, 2012
3,125
6,296
136
No. The 7900 XTX TDP is 355W, and it just seems like they were basing the 5% claim on 355W.

AMD page on the XTX GPU states: "Effective Memory Bandwidth: Up to 3500 GB/s"
So 3.5TB/s is tops, not 5.3.
The 5.3 figure is probably max link speed, but not max data speed. Like with DDR memory, you can't reach the theoretical 3200*128/8 = 51.2 GB/s on DDR4-3200, even if the link speed is equal to that.

How is a 221W power consumption more in line than 340W, if the subject is a 355W+ GPU?
First off, great questions. Same ones I asked myself when I tried to make sense of it.

The 7900XTX has a TDP/TBP of 355W but that includes EVERYTHING on the board. When I said GPU power, I meant the power from the N31 MCM package itself. At least that's my interpretation of the term when AMD uses it.

Igor's Lab had an article where they estimated the GPU power for the 3090 and they figured the die itself consumes roughly 230W, even though the total card is rated for 350W.

1668626310356.png

AMD said that the fan out costs <5% of the GPU power. I think we can agree that the GPU power means only the GPU die itself, and does not include the memory modules, VRM losses, etc. Taking 11W and dividing by 5% gives us 221W, which is closer to what Igor got for his estimate.

1668626511110.png

Secondly, regarding your question about the transfer rate for the MCDs. AMD reports that the peak bandwidth that can be transferred is 5.3 TB/s, but everyone knows that peak does not equal effective because it won't be delivering the peak bandwidth 100% of the time. I interpret that 5.3 TB/s as the measured transfer rate, not the link speed, so if you don't want to believe AMD on their choice of words, that's fine by me.

Going off of Wild_C's tweet, the peak power demand for the fan out should be 17W but if we assume the effective bandwidth is representative of the average or typical bandwidth, then it corresponds to an average or typical power usage of ~11W for the fan out.

1668626353194.png
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,355
2,848
106
Igor mentions 60W for GDDR6X, yet he also wrote 2.5W/module, and that would mean 2.5 * 12 = 30W. o_O
I think the memory controller is not included, and that is probably ~10-20W more, so 70-80W in total.

As a continuation of my older post. Link
I still question if using HBM wouldn't be better than MCD+GDDR6.
It would consume less power than GDDR6.
It was estimated at 20W for 16GB HBM2 + 10W for the controller, or 30W in total. Link
With RDNA3 you also have to include the fan out in the power consumption, so I think ~80-90W power consumption is not unreasonable for MCD+GDDR6.
You would save 50-60W, and that is a lot, especially for mobile.

Cost for 8GB HBM2 was $175 ($150 memory + $25 interposer) vs $52-68 ($6.5-8.5 per module) for GDDR5 at the time. Link
A huge difference certainly, but this was in 2017.
In 2019 you could supposedly get 16GB HBM2 (4 stacks) for $120, so $145 with the interposer, if that didn't get cheaper. Link
Cost of GDDR6 in 2019 was $10.79-11.69 for 12-14Gbps 1GB modules if you bought 2,000 units; that would put it at $173-187 for 16GB, but big customers get up to a 40% discount, so only $104-112. Link

Currently we already have 2GB modules, but at digikey the price is $26.22 per unit, so $210 for 16GB, or $105 for 16GB with a 50% discount. Link
The RX 7900XTX has 24GB of VRAM, and that would mean 12 * $26.22 for VRAM + 6 * $6.2 for the MCDs, so we are already at $352 just for the chips. If the discount is 40-50%, then $157-189 for VRAM + $37.2 for MCDs, for a total of $194-226.

For HBM you could either choose HBM2E with 4 stacks (32GB) for a total of 1843 GB/s (+92% over N31),
or HBM3 with 2 stacks (32GB) for a total of 1639 GB/s (+71% over N31).
That should be enough to feed N31.

My conclusion is that HBM would consume less power and cost less to make. If 2 stacks (32GB) of HBM3 cost more than what 4 stacks (16GB) cost in 2019, I would be surprised.

P.S. Does someone have a subscription to Techinsights? They do analyses of GPUs' Bills of Materials.

edit: I wrongly calculated the cost of 24GB of VRAM, so I fixed it, and it's bolded out.
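The corrected 24GB math spelled out, in case anyone wants to rerun it. The digikey list price and the $6.2 per MCD figure are the ones quoted/assumed above; the 40-50% volume discount is this post's assumption, not a known AMD rate.

```python
module_price = 26.22   # USD per 2GB GDDR6 module (digikey list price, as quoted)
mcd_price    = 6.2     # USD per MCD (assumed above)

vram_list = 12 * module_price        # 24GB -> ~$315 at list prices
mcds      = 6 * mcd_price            # ~$37
print(vram_list + mcds)              # ~$352 just for the chips

for discount in (0.40, 0.50):
    print(vram_list * (1 - discount) + mcds)   # ~$226 and ~$195 with the VRAM discounted
```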
 
Last edited:
  • Like
Reactions: Tlh97 and Vattila

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
I have the feeling RX7900XT (MSRP $899) will be 2-3% slower at 4K vs RTX 4080 (MSRP $1199) in raster performance.
And it could be 2-3% faster or at least equal at 1440p vs RTX4080.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
PS: N6 is actually ever so slightly more expensive for a given die area, but not by much. Like, cost/transistor is basically flat from N7.

Either way, your point is still correct, N33 should be cheaper to produce than N23.

Incorrect. Ask anyone in the industry. TSMC charges less for N6 because they can output more wafers per month thanks to decreased machine time (thanks to EUV). That means they actually make more money, despite charging less.
 

Timorous

Golden Member
Oct 27, 2008
1,607
2,747
136
I have the feeling RX7900XT (MSRP $899) will be 2-3% slower at 4K vs RTX 4080 (MSRP $1199) in raster performance.
And it could be 2-3% faster or at least equal at 1440p vs RTX4080.

Taking the 30% scaling from a 6950XT in 4K raster and applying it to the TPU charts you get 99% with the 4080 at 100%, and in the Techspot charts you get 111 fps vs the 4080's 111 fps.

So I think it will be trading blows in raster with the 4080.
 

moinmoin

Diamond Member
Jun 1, 2017
4,944
7,656
136
My conclusion is that HBM would consume less power and cost less to make. If 2 stacks (32GB) of HBM3 cost more than what 4 stacks (16GB) cost in 2019, I would be surprised.
The big difference is that GDDR memory is added by the card manufacturers, so it's a cost of manufacturing the card. HBM on the other hand adds to the cost of the GPU package AMD creates and has to sell to the manufacturers. As such, the latter needs more investment and risk-taking by AMD, and as a result looks significantly more expensive than comparable GPU packages using standard GDDR memory.

(It's also the reason why only Apple adds all the memory on the same package as CPU/GPU on its A and M series chips: They don't need to sell the resulting package to anybody but themselves.)
 
  • Like
Reactions: Tlh97 and Vattila