Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,791
136
1655034287489.png
1655034259690.png

1655034485504.png

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around 3 quarters to get support into LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But looking at the flurry of code in LLVM, that is a lot of commits. Maybe it's because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem that no host CPU capable of PCIe 5 would be available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again I believe MI300 could launch before it :grimacing:

This is nuts, MI100/200/300 cadence is impressive.

1655034362046.png

Previous thread on CDNA2 and RDNA3 here

 

Kronos1996

Member
Dec 28, 2022
69
114
76
Using 1440p TPU data, the RTX 5080 has 9% better performance per area, while using, in theory, 20W more power and 50% more memory bandwidth.
Hardware Unboxed found it significantly worse than Blackwell. They measure while gaming, not doing synthetics. It’s barely improved over RDNA 3. At least for the 9070 XT.
 

Attachments: IMG_2550.png, IMG_2551.png, IMG_2552.png
  • Like
Reactions: dr1337

adroc_thurston

Diamond Member
Jul 2, 2023
7,083
9,840
106
Hardware Unboxed found it significantly worse than Blackwell. They measure while gaming, not doing synthetics. It’s barely improved over RDNA 3. At least for the 9070 XT.
TPU measures on 3 games too.
Their numbers are an outlier anyway.
 

SolidQ

Golden Member
Jul 13, 2023
1,501
2,464
106
1 card for whole country and premium price :p
smoll-pricing-mistake-1-3mil-v0-v6erjwyxltme1.png



Hardware Unboxed found it significantly worse than Blackwell.
Results differ a lot from game to game
 

coercitiv

Diamond Member
Jan 24, 2014
7,355
17,424
136
Only 64MB of Infinity Cache. The chip was made for 1080p to 1440p, never 4K.
Watch the HUB video, there are several instances where the 9070 XT performs better at 4K than it does at 1440p, relative to cards like 7900XTX / 5070 Ti etc.

Common knowledge says the scaling should be worse, and yet the numbers tell a different story (at least partially).
 

gdansk

Diamond Member
Feb 8, 2011
4,568
7,681
136
Hardware Unboxed found it significantly worse than Blackwell. They measure while gaming, not doing synthetics. It’s barely improved over RDNA 3. At least for the 9070 XT.
TPU's data uses only games too, not synthetics. You can really tell who only looked at the HWUB review instead of reading many reviews. From the HWUB review it looks pretty bad. Many others are more positive overall.

I don't know the cause of the discrepancy, perhaps they underclocked again. But as usual expect them to retest it shortly.
 

Josh128

Golden Member
Oct 14, 2022
1,319
1,985
106
I’m thoroughly whelmed. Navi 48 has much worse PPA vs GB203. It’s a 5% smaller die but 20-25% slower. All I can conclude is that they lost a lot of PPA efficiency by just doubling Navi 44. It should be in-between GB203 + GB205 for this performance level. Closer to 300-330mm2. Also screwed power efficiency. It’s an impressive achievement vs Navi 31/32 but second place is still second place.
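A quick sanity check of the quoted PPA claim, as a minimal sketch. It uses only the two figures in the quote (a die about 5% smaller, performance 20-25% lower) and treats them as ratios against GB203. `relative_ppa` is a hypothetical helper name, and power is left out entirely.

```python
def relative_ppa(perf_ratio: float, area_ratio: float) -> float:
    """Performance per area relative to a baseline chip (baseline = 1.0)."""
    if area_ratio <= 0:
        raise ValueError("area_ratio must be positive")
    return perf_ratio / area_ratio


# Figures taken from the quote: die ~5% smaller, performance 20-25% lower.
AREA_RATIO = 0.95
ppa_low = relative_ppa(0.75, AREA_RATIO)   # 25% slower case
ppa_high = relative_ppa(0.80, AREA_RATIO)  # 20% slower case

print(f"Navi 48 perf/area vs GB203: {ppa_low:.2f}x to {ppa_high:.2f}x")
```

On these inputs the quote implies roughly 16-21% lower performance per area than GB203. Different review data would move that range.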
1741203582731.png

It's not terrible, but as we discussed after the HWUB power leaks, there's still a good bit of distance between it and Blackwell / Ada. However, the chart above is pretty interesting, as is the perf/watt it can achieve from undervolting / power limiting, as shown in the link below.

 

Timorous

Golden Member
Oct 27, 2008
1,978
3,864
136
Only 64MB of Infinity Cache. The chip was made for 1080p to 1440p, never 4K.


https://cdn0.techbang.com/system/images/579780/original/dd2f093559150f1960e8f0be0d98b55a.jpg?1607437364




GDDR7 on N48 would've been nice for 4K with RT.
Or a bigger chip with more IF and/or more memory channels, of course.

If only they made a 96CU model in a 500mm² area with a 384-bit bus.

Would likely sit between 4090 and 5090 in perf and could be sold for $1,000. Would have made the 5080 look stupid.
 

Kronos1996

Member
Dec 28, 2022
69
114
76
TPU measures on 3 games too.
Their numbers are an outlier anyway.
HUB avoids using built-in benchmarks because they’re rarely a good representation of the game. They specifically choose the heaviest parts of the game to use for benchmark runs. Worst case scenario basically but IMO that’s good test methodology. Canned benchmarks tend to be easily manipulated or optimized for marketing purposes. Cough Apple Cough
 
  • Like
Reactions: madtronik

adroc_thurston

Diamond Member
Jul 2, 2023
7,083
9,840
106
Only 64MB of Infinity Cache. The chip was made for 1080p to 1440p, never 4K.
RDNA4 introduces a whole bunch of new MALL policies tho.
HUB avoids using built-in benchmarks because they’re rarely a good representation of the game. They specifically choose the heaviest parts of the game to use for benchmark runs. Worst case scenario basically but IMO that’s good test methodology. Canned benchmarks tend to be easily manipulated or optimized for marketing purposes. Cough Apple Cough
a canned game benchmark is just a repeatable slice of the game.
Manual stuff is only ever worth it for, say, MMO CPU benchmarking.
 
  • Like
Reactions: Tlh97

Timorous

Golden Member
Oct 27, 2008
1,978
3,864
136
RDNA4 introduces a whole bunch of new MALL policies tho.

a canned game benchmark is just a repeatable slice of the game.
Manual stuff is only ever worth it for, say, MMO CPU benchmarking.

This is where Digital Foundry has done a good job with their automation. It is essentially their own canned benchmark in custom scenes. Best of both worlds imo.
 

Kronos1996

Member
Dec 28, 2022
69
114
76
RDNA4 introduces a whole bunch of new MALL policies tho.

a canned game benchmark is just a repeatable slice of the game.
Manual stuff is only ever worth it for, say, MMO CPU benchmarking.
Right and I’m sure no game studios have ever had a reason to create a canned benchmark that makes the game look like it runs way better than it actually does. Surely there would be absolutely ZERO benefit except maybe enticing gamers with older hardware to buy their game expecting a good experience? But they would NEVER do that because game studios are the absolute pinnacle of moral excellence right?

I need to stop before I get a lethal dose of sarcasm. 😂
 
  • Like
Reactions: lightmanek

gdansk

Diamond Member
Feb 8, 2011
4,568
7,681
136
Hardware Unboxed found it significantly worse than Blackwell.
They also measure it using a proxy: system power in only a few games, in what you described as the heaviest sections of those games. There are sites that use a PCAT average across all games in their suite, and there it ends up much closer (to the 5070 Ti). Make of that what you will.

I don't think the 9070 XT is in the efficiency zone for N48 anyway. But pushing GB203 below 300W for the comparison wouldn't be fair either, because it uses a lot more memory power.

If you go by HWUB alone then even the 9070 isn't efficient despite nearly every other review saying otherwise. Their method might be right. Or it may not be. But look around.
 
  • Like
Reactions: Tlh97

Keller_TT

Member
Jun 2, 2024
158
186
76
The 9070 XT has been pushed beyond its intended scope for that last 3-4% to appear closer to 5070 Ti (and $50↑ price revision), and makes it look dull imo.

The 9070 Pulse and Hellhound are MSRP, near reference designs, and both are very similar in power draw. The Hellhound is cool and quiet too. They come in at 225W sustained, while the 9070 XT is 310W for the Pulse.
The perf gap is a median +13%, and the OC models net a further 2-3% I suppose, but get into 7900 XTX power levels for worse results in raster.

This smells like Zen 4 frequency maxing. 65W 7700 vs 105W 7700X (97% v 100%). I would take a 97% 9070 XT for 265W.
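The arithmetic behind that comparison can be sketched as below. It only uses figures from this post (the 65W/105W Zen 4 pair at 97% vs 100%, the 225W and 310W Pulse numbers, the median +13% gap, and the wished-for 265W point). The Zen 4 wattages are the rated values quoted above rather than measured draw, and `perf_per_watt_gain` is a hypothetical helper name.

```python
def perf_per_watt_gain(perf_a: float, watts_a: float,
                       perf_b: float, watts_b: float) -> float:
    """Ratio of config A's perf/W to config B's perf/W (1.0 = equal)."""
    return (perf_a / watts_a) / (perf_b / watts_b)


# Zen 4 example from the post: 7700 (65W, 97%) vs 7700X (105W, 100%).
zen4 = perf_per_watt_gain(0.97, 65, 1.00, 105)

# Wished-for 265W 9070 XT at 97% vs the 310W Pulse at 100%.
xt_265 = perf_per_watt_gain(0.97, 265, 1.00, 310)

# 9070 at 225W vs 9070 XT at 310W, using the median +13% perf gap.
r9070 = perf_per_watt_gain(1 / 1.13, 225, 1.00, 310)

print(f"7700 vs 7700X:        {zen4:.2f}x perf/W")
print(f"265W XT vs 310W XT:   {xt_265:.2f}x perf/W")
print(f"9070 vs 9070 XT:      {r9070:.2f}x perf/W")
```

Under these assumptions, giving up the last 3% of performance would buy roughly 13% better perf/W on the 9070 XT, a much smaller swing than the Zen 4 pair shows on paper.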

Edit: Redacted my critical judgement of the +42% perf claims. Not nearly as bad as RDNA3 claims.

9070 Series Power Efficiency.jpg
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,072
3,897
136
LM Studio
2ce1e5f3e1bac09f4af361c410e708f81f91ae2e9ed6cbcf74e9f6c659066fdf.jpg




Good comparison between models, length, output, etc.
If you're a gamer, thank god it only has 16GB.


Also ROFL at the people on the PPW/PPA anti-RDNA4 crusade... so dumb. It has 40% less bandwidth, and that's worth so much extra performance. Normalise for that and, like magic, the uarch is wonderful. Just sigh.

RDNA4 PPW/PPA is good. SKU configs will vary wildly, but let's not pretend that people actually care about PPW.
 

adroc_thurston

Diamond Member
Jul 2, 2023
7,083
9,840
106
If Nvidia decides to waste billions of dollars of shareholder value by allocating wafers for gaming card supply, then people will choose the Blackwell cards over the Radeon cards if both sell at msrp.
DC parts are never wafer-limited.
 

gdansk

Diamond Member
Feb 8, 2011
4,568
7,681
136
If you want a 275W 9070 XT you can probably get that with a slider in the drivers.
Usually neither AMD nor Nvidia stops you from setting lower power limits.
And the sites which have tested the 9070 XT at a fixed frame rate as a proxy for equal work efficiency (TPU, CB) usually have it near the top of the charts.
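Why a fixed frame rate works as an equal-work proxy: with every card capped to the same fps, each one renders the same number of frames per second, so energy per frame is just average power divided by the cap. A minimal sketch follows. The card names, the 60 fps cap, and the wattages are made-up placeholders, not review data, and `joules_per_frame` is a hypothetical helper.

```python
def joules_per_frame(avg_power_watts: float, fps: float) -> float:
    """Energy spent per rendered frame (watts are joules per second)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return avg_power_watts / fps


FPS_CAP = 60.0  # assumed cap, identical for every card

# Placeholder power readings under the cap (NOT measured values).
power_at_cap = {"card_a": 150.0, "card_b": 180.0}

energy = {name: joules_per_frame(w, FPS_CAP) for name, w in power_at_cap.items()}
ranking = sorted(energy, key=energy.get)  # most efficient first

for name in ranking:
    print(f"{name}: {energy[name]:.2f} J/frame")
```

Because the fps term is the same for every card, the ranking is simply the ranking by power draw under the cap, which is why those charts can be read directly as efficiency charts.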
 

gdansk

Diamond Member
Feb 8, 2011
4,568
7,681
136
It doesn't quite match up, including that. Also see their individual game benchmark numbers. They are beyond margin of error.
To which numbers are you comparing?
I have checked a few sites and for the same games it is pretty similar...