Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,785
136
With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around 3 quarters to get support into LLVM and amdgpu. Since RDNA2, though, the window in which they push support for new devices has been much reduced, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's).

See here for the GFX940-specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before the patches get merged to GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Hopper, though, had the problem of no host CPU capable of PCIe 5 in the very near term, so it may have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it :grimacing:

This is nuts; the MI100/200/300 cadence is impressive.


Previous thread on CDNA2 and RDNA3 here

 

mostwanted002

Member
Jun 16, 2023
47
79
71
mostwanted002.page
Is there a toggle to switch to a lower res than 4K?
The 9070XT isn't a 4K GPU, and in such GPU-heavy situations the differences are minimal ...
You can switch down resolutions within games, or use upscaling (FSR3/4), which essentially renders at a lower resolution and then upscales to the 4K target, giving a performance boost while maintaining fidelity.
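
For anyone wondering what those presets actually do, here is a minimal sketch using the standard FSR quality-preset scale ratios (assumed values; individual games can expose different presets):

Python:
# Internal render resolution per FSR quality preset for a 4K output.
# Ratios are the standard FSR per-axis scale factors.
TARGET = (3840, 2160)

PRESETS = {
    "Quality":           1.5,   # ~67% per axis (2560x1440 internal)
    "Balanced":          1.7,
    "Performance":       2.0,   # 1080p internal for a 4K target
    "Ultra Performance": 3.0,
}

for name, ratio in PRESETS.items():
    w, h = (round(d / ratio) for d in TARGET)
    print(f"{name:<17} -> {w}x{h} (~{100 / ratio**2:.0f}% of the pixels)")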
 

Z O X

Junior Member
Oct 31, 2022
16
10
51
You can switch down resolutions within games, or use upscaling (FSR3/4), which essentially renders at a lower resolution and then upscales to the 4K target, giving a performance boost while maintaining fidelity.

Comparing the averages, native res vs. upscaling, it turns out the bigger gains are at 4K rather than at lower resolutions/FSR.
 

SolidQ

Golden Member
Jul 13, 2023
1,451
2,363
106
FSR4 is gonna be available in C2077


So that's why Azor was meeting with the CD Projekt team
 

ToTTenTranz

Senior member
Feb 4, 2021
456
844
136
Yeah, it's a big discount compared to previous 32GB workstation Radeons.
I guess AMD is finally trying harder to get people to adopt their GPUs for AI workloads.

And it's not like putting another 16GB of GDDR6 on the back of the PCB costs them much. It's only ~$40 at the moment, so with a $500 premium over the regular 9070XT they're getting more than a 1000% markup on the memory upgrade.
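
Back-of-envelope, taking that ~$40 spot-price estimate as given:

Python:
# Sanity check on the memory-markup math above.
extra_vram_cost = 40    # USD, ~16GB GDDR6 at the quoted spot price
price_premium   = 500   # USD over the regular 9070XT

markup_pct = (price_premium - extra_vram_cost) / extra_vram_cost * 100
print(f"Markup on the memory upgrade: {markup_pct:.0f}%")  # -> 1150%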
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
8,129
3,067
146
This is actually pretty decent for AI hobbyists.
I am curious, how does it fare in AI against the 7900XTX, with its 24GB of memory? Obviously the 7900XTX is RDNA3 and not a pro card, and has a bit less memory, but it is a bit cheaper.
 
Jul 27, 2020
26,022
17,952
146
I am curious, how does it fare in AI against the 7900XTX, with its 24GB of memory? Obviously the 7900XTX is RDNA3 and not a pro card, and has a bit less memory, but it is a bit cheaper.


Would be roughly 25% slower in brute speed unless significant bottlenecks have been removed in the RDNA4 drivers.
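
Rough sketch behind that estimate, using approximate peak FP16 spec numbers (ballpark figures I'm assuming here, and peak throughput rarely maps 1:1 to real workloads):

Python:
# Peak FP16 comparison -- approximate spec-sheet numbers, treat as ballpark.
specs_fp16_tflops = {
    "7900XTX (RDNA3)": 123.0,  # approx. peak FP16 (dual-issue)
    "R9700 (RDNA4)":    97.0,  # approx. peak FP16
}

xtx = specs_fp16_tflops["7900XTX (RDNA3)"]
r97 = specs_fp16_tflops["R9700 (RDNA4)"]
print(f"R9700 vs 7900XTX peak FP16: {(1 - r97 / xtx) * 100:.0f}% slower")  # ~21%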
 

mostwanted002

Member
Jun 16, 2023
47
79
71
mostwanted002.page
Would be roughly 25% slower in brute speed unless significant bottlenecks have been removed in the RDNA4 drivers.
FP16 is usually training territory.
On the inference side of things, it may be faster than the XTX, since RDNA4 now has dedicated hardware for FP8 and four times the INT4/INT8 throughput.
The majority of out-of-the-box inference models are served in INT8 or INT4 quantizations, which reduces VRAM cost and improves tokens/sec.
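
To put the VRAM point in numbers, a quick sketch of weight memory alone at common quantizations (KV cache and activations add a few GB on top):

Python:
# Approximate weight memory for an LLM at common quantizations.
def weights_gb(params_b: float, bits: int) -> float:
    """GB of weight memory for params_b billion parameters at `bits` per weight."""
    return params_b * 1e9 * bits / 8 / 1e9

for params_b in (7, 13, 32):
    row = ", ".join(f"{bits}-bit: {weights_gb(params_b, bits):5.1f} GB"
                    for bits in (16, 8, 4))
    print(f"{params_b:>2}B -> {row}")
# e.g. 7B: 14.0 GB at FP16, 7.0 GB at INT8, 3.5 GB at INT4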
 


RnR_au

Platinum Member
Jun 6, 2021
2,532
5,935
136


Would be roughly 25% slower in brute speed unless significant bottlenecks have been removed in the RDNA4 drivers.
Compute is not that important in inference. Memory bandwidth is king, and the 7900XTX has nearly 50% more membw than the R9700. So the 7900XTX should be a fair bit faster.
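
Rough rule of thumb for why membw dominates: each generated token streams roughly the whole set of weights through the GPU once, so the single-stream decode ceiling is bandwidth divided by model size. A sketch with approximate spec bandwidths:

Python:
# Bandwidth-bound upper limit on decode speed:
# tok/s <= memory bandwidth / bytes of weights read per token.
MEMBW_GBPS = {   # approximate spec-sheet bandwidth
    "7900XTX": 960,
    "R9700":   640,
}
model_gb = 14.0  # e.g. a 7B model at FP16

for card, bw in MEMBW_GBPS.items():
    print(f"{card}: ceiling ~{bw / model_gb:.0f} tok/s on a {model_gb:.0f} GB model")
# -> 7900XTX ~69 tok/s vs R9700 ~46 tok/s, the same ~50% gap as the membw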
 
  • Like
Reactions: mostwanted002

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
8,129
3,067
146
Huh, so the 7900XTX is actually faster for AI? Interesting. How would it compare to a 3090 or 4090?
 

mostwanted002

Member
Jun 16, 2023
47
79
71
mostwanted002.page
Compute is not that important in inference. Memory bandwidth is king, and the 7900XTX has nearly 50% more membw than the R9700. So the 7900XTX should be a fair bit faster.
Indeed. The R9700 will be a mix of perf stats: the upside of newer compute accelerators, alongside being capped by memory bandwidth. Gonna be interesting to see the initial 3rd-party numbers after the embargo.
 

gdansk

Diamond Member
Feb 8, 2011
4,208
7,055
136
At what price is it worth buying the trap version?

$200??
At $250, isn't it much better than a B580 or RTX 5050?
It should be low enough, in my opinion.

The BoM savings aren't actually wise for board partners. Just make the 16GB version. No one in DIY wants the 8GB one, and OEMs only want Nvidia, so board partners should learn quickly to make very few of them.
 
  • Like
Reactions: marees