Question Speculation: RDNA3 + CDNA2 Architectures Thread


uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146

Joe NYC

Platinum Member
Jun 26, 2021
2,539
3,472
106
I would even consider the 7950X3D if it's not too expensive for you; this way you don't have to worry about the CPU for a pretty long time.

N32 should be faster than the RTX 3080; RT will likely be worse, but 16GB of VRAM is very tempting.
Is this level of performance enough for your monster monitor?

I would consider longevity for components differently. The last thing I would want to replace is the mobo, second to last is the CPU (and I have no problem replacing the GPU).

So regarding 7800X3D vs. 7950X3D: I think there is a good case for saving the price difference between the 7800X3D and the 7950X3D and applying it to one more CPU upgrade on the AM5 motherboard, to Zen 5 or something after it.

For now, I have no big need for 16 cores, so I will likely also be getting the 7800X3D. Unless there is a measurable performance increase for the 7950X3D in gaming...

The one way I could see the 7950X3D gaining in games would be to vacate the V-Cache CCD of all threads not related to the game being played and keep all of the background / system threads on the non-V-Cache CCD, which would leave the V-Cache CCD and all of its cache data for the game.

But this would need some sort of customized work on the scheduler...
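
To make the idea concrete, here is a minimal sketch of such a split. The CCD layout (logical CPUs 0-15 on the V-Cache CCD, 16-31 on the other one), the process list and the function name are all hypothetical; the real numbering depends on the system. The sketch only plans the split; on Linux the resulting sets could then be applied with `os.sched_setaffinity`.

```python
# Hypothetical CCD layout: the real logical-CPU numbering depends on the system.
VCACHE_CPUS = frozenset(range(0, 16))
OTHER_CPUS = frozenset(range(16, 32))


def plan_affinity(processes, game_names):
    """Map each pid to the CPU set it should be confined to.

    processes: iterable of (pid, name) pairs (hypothetical input).
    game_names: names of the processes that belong to the game.
    """
    plan = {}
    for pid, name in processes:
        # The game keeps the V-Cache CCD; everything else moves off it.
        plan[pid] = VCACHE_CPUS if name in game_names else OTHER_CPUS
    return plan


example = plan_affinity(
    [(100, "game.exe"), (200, "browser"), (300, "updater")],
    {"game.exe"},
)
for pid, cpus in example.items():
    print(pid, sorted(cpus))
```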
 
  • Like
Reactions: TESKATLIPOKA

PJVol

Senior member
May 25, 2020
708
632
136
But do note that "IP version" and "GFX ID" are not the same from what we can see.
I'm not sure I understand what "GFX ID" is, but driver code usually follows certain naming rules, and the GC IP version (major, minor, rev.) refers to a certain SKU (or SKUs, if cut-down versions exist),
which in turn make up the 'GPU family', e.g. dGPU - 11.0.0 or APU - 11.0.1.
So, for example, the string "1102" just means Graphics Engine major version 11, minor version 0, and revision 2, which devs seem to have adopted instead of those horrible hard-to-spell fish names.
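
A small sketch of that decoding, assuming the last two characters are a single-digit minor version and a single-digit revision, with everything before them being the major version. The function name is made up for illustration and is not taken from any driver.

```python
def parse_gc_ip(version):
    """Split a string such as "1102" into (major, minor, rev).

    Assumes single-digit minor and rev fields at the end of the string.
    """
    if len(version) < 3 or not version.isdigit():
        raise ValueError(f"unexpected version string: {version!r}")
    return int(version[:-2]), int(version[-2]), int(version[-1])


major, minor, rev = parse_gc_ip("1102")
print(f"{major}.{minor}.{rev}")  # prints 11.0.2
```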
 
  • Like
Reactions: Kaluan

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,038
136
ComputerBase tested Adrenalin 22.12.2.
Power consumption while playing YouTube dropped:
7900XT: 71W -> 46W
7900XTX: 80W -> 54W
But it's still higher than N21; if you also enable HDR, then it's comparable.
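
For a sense of scale, the relative drop can be worked out from the four wattages quoted above. This is just arithmetic on those numbers, not additional measurement data.

```python
def reduction_pct(before_w, after_w):
    """Percentage drop from before_w to after_w."""
    return (before_w - after_w) / before_w * 100


# Wattages as quoted from the ComputerBase test.
readings = {"7900XT": (71, 46), "7900XTX": (80, 54)}
for card, (before, after) in readings.items():
    print(f"{card}: {before} W -> {after} W ({reduction_pct(before, after):.1f}% lower)")
```

That works out to roughly a third less power on both cards.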

Power consumption while gaming didn't change, but capping the frame rate at 144 FPS now decreases power draw much more than before.
Maybe some of you still remember that debate, where N31 ended up less efficient than N21 when FPS was limited to 144.
Attachments: Screenshot_32.png, Screenshot_36.png
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,038
136
Now it's N21 that's behaving a bit odd in the ComputerBase tests: all things being equal, the 6800XT and 6900XT should use the same or even lower power than the 6800.
You are right. For the same performance, RX6800 should need higher clocks and voltage than RX6900XT, which will increase the power consumption.
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
Paul from RGT suggests in his recent video that Strix Point has 24 CUs/12 WGPs, not 16 CUs/8 WGPs.

1536 ALUs/3072 vALUs.

This thing HAS to have Infinity Cache. Otherwise it doesn't make any sense to put such a big amount of sheer horsepower into an APU that would be fed just by the memory controller.
 

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
Even with IC that's way too many CUs. On APUs up to now, the limited bandwidth has also helped limit the power usage of the iGPU. Those CUs being optimally used (which is what IC is for) would make the chip a desktop-grade one in terms of power consumption.
 

Kepler_L2

Senior member
Sep 6, 2020
537
2,199
136
Paul from RGT suggests in his recent video that Strix Point has 24 CUs/12 WGPs, not 16 CUs/8 WGPs.

1536 ALUs/3072 vALUs.

This thing HAS to have Infinity Cache. Otherwise it doesn't make any sense to put such a big amount of sheer horsepower into an APU that would be fed just by the memory controller.
If only there was a way to double memory bandwidth...
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
If only there was a way to double memory bandwidth...
LPDDR5X-8533 (8533 MT/s) on a 128-bit bus has ~136 GB/s of bandwidth.

That's nearly as much as Navi 24 has on a 64-bit GDDR6 memory bus (144 GB/s). N24 obviously has 16 MB of IC and is limited to 4 GB of VRAM, a limit that a unified memory architecture will not have :).
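
Both figures follow from transfer rate times bus width. A quick sketch of the arithmetic; the 18 Gbps per-pin rate for Navi 24 is an assumption implied by the 144 GB/s figure rather than something stated in the post.

```python
def peak_bandwidth_gbs(transfer_rate_mtps, bus_width_bits):
    """Peak bandwidth in GB/s = transfers per second * bytes per transfer."""
    return transfer_rate_mtps * (bus_width_bits / 8) / 1000


lpddr5x = peak_bandwidth_gbs(8533, 128)   # 128-bit LPDDR5X-8533
navi24 = peak_bandwidth_gbs(18000, 64)    # 64-bit GDDR6, assuming 18 Gbps per pin

print(f"LPDDR5X-8533, 128-bit: {lpddr5x:.1f} GB/s")
print(f"GDDR6 18 Gbps, 64-bit: {navi24:.1f} GB/s")
```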

Also, the on-die caches are going to be larger: L2 has to double again, from 4 to 8 MB. If the IC is 32 MB, it should be fine. If, however, AMD goes for something more, like 64 MB of IC, it will have more than enough bandwidth to feed both the CPUs and the iGPU, even at the same time.
 
  • Like
Reactions: Kaluan

Kaluan

Senior member
Jan 4, 2022
504
1,074
106
ComputerBase tested Adrenalin 22.12.2.
Power consumption while playing YouTube dropped:
7900XT: 71W -> 46W
7900XTX: 80W -> 54W
But it's still higher than N21; if you also enable HDR, then it's comparable.

Power consumption while gaming didn't change, but capping the frame rate at 144 FPS now decreases power draw much more than before.
Maybe some of you still remember that debate, where N31 ended up less efficient than N21 when FPS was limited to 144.
Attachments: 74524, 74525

Well now, how cute of them... only 3 weeks late lol

Ancient Gameplays already investigated the 22.12.2 (7900) improvements mere days after the driver was out though; I posted about it in the RX 7900 review thread a while back:

With what drivers? The ones released 5-6 days ago (22.12.2) seem to overhaul clocking behavior. Better temps, power draw behavior and even some performance uplifts (CP2077 1% lows are a massive 22% higher). Going by Fabio's (Ancient Gameplays) testing:

Attachments: 73500, 73497, 73498, 73499

Results are from his tuned/OC 400W PL (Sapphire) reference 7900 XTX, unlocked framerate first, 100 fps cap second. CP2077 raster only, maxed.


Fabio (Ancient Gameplays) also has a short video up on the newer 23.1.1 (7900), on the subject of power efficiency improvements, here's what he found:
Attachments: AG 23.1.1 update1.jpg, AG 23.1.1 update2.jpg

Full video here:
 

Kaluan

Senior member
Jan 4, 2022
504
1,074
106
Also, the on-die caches are going to be larger: L2 has to double again, from 4 to 8 MB. If the IC is 32 MB, it should be fine. If, however, AMD goes for something more, like 64 MB of IC, it will have more than enough bandwidth to feed both the CPUs and the iGPU, even at the same time.
Smart Access Cache incoming? :grin:

A.M.D... it's in the BAG SAC!
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,038
136
As @moinmoin said, 24 CUs is a lot. That's twice what Phoenix has, and at 2.5GHz it would provide the same TFLOPS as the 7600S; at 3.3GHz it would compare to the 7700S.
Even if Strix Point used a 3nm process, that's only ~25-30% lower power consumption compared to 4nm. A 35-45W TDP is clearly not enough; you would need to lower clocks or increase TDP compared to PHX.
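
A rough sketch of where those TFLOPS figures come from. The lane count and dual-issue factor follow the 24 CUs -> 1536 ALUs -> 3072 vALUs numbers quoted earlier in the thread; counting a fused multiply-add as two FLOPs is an assumption of this sketch, and the result is a theoretical peak, not a measured value.

```python
LANES_PER_CU = 64   # 24 CUs -> 1536 ALUs, as quoted in the thread
DUAL_ISSUE = 2      # 1536 ALUs -> 3072 "vALUs"
FLOPS_PER_FMA = 2   # assumption: one fused multiply-add counts as two FLOPs


def peak_tflops(cus, clock_ghz):
    """Theoretical peak FP32 throughput in TFLOPS."""
    return cus * LANES_PER_CU * DUAL_ISSUE * FLOPS_PER_FMA * clock_ghz / 1000


for clock in (2.5, 3.3):
    print(f"24 CUs @ {clock} GHz: {peak_tflops(24, clock):.2f} TFLOPS")
```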

Then the largest problem is BW. 128-bit LPDDR5 won't give you enough bandwidth to feed this.
A 32MB IC is not enough to compensate in my opinion; you would need 64-72MB, which gives a >70% hit rate at Full HD.

I made a table of Strix Point models in the correct thread. I even calculated the needed amount of IC as part of LLC for GPU. It looks interesting.
 
  • Like
Reactions: moinmoin

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,038
136
I made a BW table for Infinity cache 2.
2005502394.png


Infinity cache 2 | 16 MB | 32 MB | 48 MB | 64 MB | 96 MB | 128 MB (doesn't exist)
B/CLK | 384 | 768 | 1152 | 1536 | 2304 | 3072
Theoretical BW at 2.5GHz | 960 GB/s | 1920 GB/s | 2880 GB/s | 3840 GB/s | 5760 GB/s | 7680 GB/s
Hit rate at 1080p (BW) | 37 % (355 GB/s) | 55 % (1056 GB/s) | 65 % (1872 GB/s) | 72 % (2765 GB/s) | 78 % (4493 GB/s) | 80 % (6144 GB/s)
Hit rate at 1440p (BW) | 23 % (221 GB/s) | 38 % (730 GB/s) | 48 % (1382 GB/s) | 57 % (2189 GB/s) | 69 % (3974 GB/s) | 74 % (5683 GB/s)
Hit rate at 4K (BW) | 19 % (182 GB/s) | 27 % (518 GB/s) | 34 % (979 GB/s) | 42 % (1613 GB/s) | 53 % (3053 GB/s) | 62 % (4762 GB/s)
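
The bandwidth figures in the table can be reproduced as bytes per clock times clock speed, scaled by the hit rate. A minimal sketch using only the table's own B/CLK values and 1080p hit rates; the function names are made up for illustration.

```python
CLOCK_GHZ = 2.5
# Cache size in MB -> bytes per clock, as listed in the table.
BYTES_PER_CLK = {16: 384, 32: 768, 48: 1152, 64: 1536, 96: 2304, 128: 3072}
# Cache size in MB -> hit rate at 1080p, as listed in the table.
HIT_RATE_1080P = {16: 0.37, 32: 0.55, 48: 0.65, 64: 0.72, 96: 0.78, 128: 0.80}


def theoretical_bw_gbs(size_mb):
    # bytes/clock * Gclocks/s = GB/s
    return BYTES_PER_CLK[size_mb] * CLOCK_GHZ


def effective_bw_gbs(size_mb, hit_rate):
    # Only the fraction of accesses that hit the cache gets the cache bandwidth.
    return theoretical_bw_gbs(size_mb) * hit_rate


for size in BYTES_PER_CLK:
    peak = theoretical_bw_gbs(size)
    eff = effective_bw_gbs(size, HIT_RATE_1080P[size])
    print(f"{size} MB: {peak:.0f} GB/s peak, {eff:.0f} GB/s at 1080p")
```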

BTW 6MB L2 in RDNA3 at 2.5GHz has 7680 GB/s.
AMD did a good job with this.

P.S. I think Infinity cache 2 is working at ~1.9GHz based on this slide.
amd_rdna_3_tech_day_press.webp

5300 - 960 = 4340 GB/s; 4340 GB/s / 2304 B/clock => 1.88GHz
Similar speed as the first Infinity cache.

Edit: 5300 GB/s / 2304 B/clock = 2.3GHz is the correct value.
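
Both clock estimates are just bandwidth divided by bytes per clock. A short sketch of the two variants, using the 5300 GB/s and 960 GB/s figures and the 2304 B/clock value for 96 MB from the post:

```python
def implied_clock_ghz(bandwidth_gbs, bytes_per_clk):
    """Clock in GHz needed to reach bandwidth_gbs at bytes_per_clk bytes per cycle."""
    return bandwidth_gbs / bytes_per_clk


first_guess = implied_clock_ghz(5300 - 960, 2304)  # with GDDR6 bandwidth subtracted
corrected = implied_clock_ghz(5300, 2304)          # 5300 GB/s taken as cache-only

print(f"first guess: {first_guess:.2f} GHz, corrected: {corrected:.2f} GHz")
```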
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
So if your math is correct, 32 MB IC2 for Strix Point would have 1.65 TB/s bandwidth.

Very interesting.

Edit: ah heck. Only now I see that I misclicked and wrote 1.65 instead of 1.75 TB/s.
 
  • Wow
Reactions: Stuka87

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,038
136
I divided 96 MB and 5.3 TB/s by 3, which resulted in 32 MB IC and 1.75 TB/s.

IC on Strix Point should be executed the same way as it is on Navi 31 chips.
That's wrong, because it includes BW from GDDR6.
5300 - 960 = 4340 GB/s is the BW for the 96MB Infinity cache.
I just don't know if this is the theoretical max or not.
I need to find endnote RX-818.

Edit: I found it.
Screenshot_4.png
It looks like it's just the max BW of the Infinity cache, without GDDR6.
So what you wrote is correct for 32MB, but the effective BW will be lower because it depends on the actual hit rate.
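
As a rough illustration of that last point, here is the 96 MB figure scaled down by 3 and then weighted by the 32 MB hit rates from the earlier Infinity cache table. Treating effective bandwidth as peak times hit rate follows that table; whether a Strix Point cache would scale this way is pure speculation.

```python
IC_96MB_BW_GBS = 5300

# One third of the 96 MB figure: about 1767 GB/s, close to the ~1.75 TB/s quoted.
scaled_32mb = IC_96MB_BW_GBS / 3

# Hit rates for a 32 MB cache, taken from the earlier table.
HIT_RATES_32MB = {"1080p": 0.55, "1440p": 0.38, "4K": 0.27}

effective = {res: scaled_32mb * rate for res, rate in HIT_RATES_32MB.items()}
for res, bw in effective.items():
    print(f"{res}: ~{bw:.0f} GB/s effective")
```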
 
  • Like
Reactions: Kaluan