Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around 3Qs to get support into LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But judging by the flurry of code in LLVM, there are a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem that no host CPU capable of PCIe 5 would be available in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it :grimacing:

This is nuts, MI100/200/300 cadence is impressive.


Previous thread on CDNA2 and RDNA3 here

 

Hans Gruber

Platinum Member
Dec 23, 2006
2,305
1,218
136
well yeah gen12 is a bad baseline to build off.
PVC suffered the same fate.

that's not a driver issue lmao.
It is a driver issue if the Adrenalin software package does not function correctly with stable overclocks. Adrenalin has an automatic overclock feature for the memory, GPU clock and fan curve. The fan curve (manually adjusted) works perfectly. With GDDR6 memory, you do not need to push the GPU clock. Overclocking the GDDR6 memory alone can give a 10-15% performance boost. Normally, when you overclock memory you get artifacts, micro stutters, ghost-in-the-machine effects and, if pushed too far, a black screen.

With AMD, you get none of those GDDR6 errors. You get a good gaming experience and then, for no reason at all, it ends with a DX11 crash or some error report, or your system crashes. Adrenalin resets your overclocks back to stock settings for you and you do it all over again. The problem is that the video card becomes unstable after several rounds of overclocking. That is why I say the AMD drivers are fragile. If you stay within the lines, you should have a good gaming experience. That means running everything stock. To truly reset Adrenalin, you have to uninstall the drivers using the AMD cleanup utility. Doing a full removal with the utility removes the ghost-in-the-machine effect.

I fully understand overclocking can lead to system instability. AMD shows that warning before you select auto overclocking in Adrenalin. Even when manually overclocked, the system becomes unstable after being stable for many hours. I am speaking about the memory overclock specifically because that is where all the benefits come from with GDDR6 memory. Going from 1750 MHz to 1770 MHz should not cause instability in GDDR6 memory, in my opinion. Adrenalin's auto overclock takes the memory up to 1830 MHz. If any minor memory overclock leads to instability, I am putting it on the software package.

All you have to do is google Adrenalin driver crashes. It's all over the web. Supposedly they fixed the issue a couple of years ago, but people with RDNA3 cards are complaining about driver issues. My guess is they were using Adrenalin to overclock their cards: a visually stable overclock runs flawlessly for hours and then crashes out of nowhere. With Nvidia, overclocking is binary. It either works or it doesn't. If it's unstable you will see it right away. In Afterburner you lower the overclock, the system runs smoothly, and you quickly find your wall.

With Intel, they have a fancy dashboard with a bunch of features, none of which work. You cannot even overclock the GDDR6 memory on the Intel cards. Supporting that absent feature would increase Arc card performance by 10-15%. All you have to do is look at old reviews of the 1660 Ti vs. the 1660 Super. The Super has the faster GDDR6, while the 1660 Ti has the buffed GPU core with slower GDDR6. When you overclock the 1660 Super's memory, it's magic.

The 7900 GRE just got its GDDR6 memory buffed via an Adrenalin driver update. The memory limit was bumped from 2300 MHz up to 3000 MHz. Sounds great, but if the Adrenalin software introduces issues because of software flaws, there are going to be a lot of unhappy customers getting driver crashes. I call them driver crashes because the Adrenalin software package includes the Radeon drivers. I am not saying 3000 MHz should be attainable. But if people start getting crashes bumping the 7900 GRE up to 2500 MHz, that is a software/driver problem and not a memory module limit issue.

Back in the day, I could move the memory slider all the way up with my Radeon 7950. That card is almost 14 years old. It cannot achieve any overclock anymore but runs stable at stock settings. So I have a baseline and a good understanding of what hardware and silicon/memory degradation can do over many years of use.
 

SolidQ

Senior member
Jul 13, 2023
539
629
96
>7600 performance in a 130mm² die size??? WOW!!!
if you notice there is 2x>>


16WGP/32CU/128bit
but what do 288/515 mean?
 

CouncilorIrissa

Senior member
Jul 28, 2023
540
2,120
96

branch_suggestion

Senior member
Aug 4, 2023
392
875
96
So, uh...
Navi 48: 32 / 64 / 256 / 693 / 2770, ~240 mm²
Navi 44: 16 / 32 / 128 / 288 / 515, ~130 mm²


Navi 44 die size 130mm²... WTF

>7600 performance in a 130mm² die size??? WOW!!!
240/140mm^2 was my estimate.
So if it is 4/2SE instead of 2/1, extremely impressive area eff.
First number is WGP
Second is CU and probably Infinity Cache too
Third is bus width
Fourth is memory bandwidth, 21.7Gbps GDDR6 for N48, 18Gbps GDDR6 for N44, same as N33.
Final number appears to be effective memory bandwidth, basically confirming IC is the same as N32/33 with a clock bump. https://www.amd.com/en/products/graphics/amd-radeon-rx-7800-xt https://www.amd.com/en/products/graphics/amd-radeon-rx-7600-xt
So, is the 515 a typo and should it be 2515 instead?
I assume these are base clocks and not boost (or whatever the hell AMD are calling them now) ones.
AMD no longer gives base clocks (or at least not for RDNA3, as base clocks were weird), just game (typical median) clocks and boost (typical peak) clocks.
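
The raw-bandwidth reading of the fourth number is easy to check by hand: GDDR6 bandwidth in GB/s is bus width in bits times per-pin data rate in Gbps, divided by 8. A quick Python sketch, taking the leaked figures at face value (the function names are made up for illustration):

```python
def raw_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: pins x Gbps per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


def implied_data_rate_gbps(bandwidth_gbs: float, bus_width_bits: int) -> float:
    """Per-pin data rate in Gbps implied by a bandwidth figure and a bus width."""
    return bandwidth_gbs * 8 / bus_width_bits


if __name__ == "__main__":
    # Leaked figures from the thread: (name, bus width in bits, fourth number in GB/s)
    for name, bus, bw in [("Navi 48", 256, 693), ("Navi 44", 128, 288)]:
        rate = implied_data_rate_gbps(bw, bus)
        print(f"{name}: {bw} GB/s on a {bus}-bit bus -> {rate:.2f} Gbps per pin")
```

693 GB/s on 256-bit works out to roughly 21.66 Gbps per pin, and 288 GB/s on 128-bit to exactly 18 Gbps, which is where the 21.7/18 Gbps figures come from.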
 

CouncilorIrissa

Senior member
Jul 28, 2023
540
2,120
96
Yeah, that number looks like IC capacity; otherwise it does not really make sense to list it, as 1 WGP = 2 CU anyway.

2770 MHz sounds slow-ish? I was expecting them to finally meet the 3GHz clk meme (unless there are serious IPC gains compared to RDNA3?)
 

branch_suggestion

Senior member
Aug 4, 2023
392
875
96
The last number cannot be a clock speed, as AMD has made clocks non-static; plus, each SE and the frontend have their own clock domains on RDNA3.
Both numbers line up with effective memory bandwidth.
Oh, and clocks are probably not final yet anyway.
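
One way to sanity-check the effective-bandwidth reading is to divide the last number by the raw bandwidth and see what multiplier falls out. A small Python sketch, assuming the fourth and fifth leaked numbers are both in GB/s (the function name is made up, and the leak does not say how the final figure is derived):

```python
def amplification(effective_gbs: float, raw_gbs: float) -> float:
    """Ratio of the claimed effective bandwidth to the raw DRAM bandwidth."""
    return effective_gbs / raw_gbs


# Leaked figures from the thread: name -> (fourth number, fifth number)
leak = {"Navi 48": (693, 2770), "Navi 44": (288, 515)}

for name, (raw, eff) in leak.items():
    print(f"{name}: {eff} / {raw} = {amplification(eff, raw):.2f}x")
```

That gives roughly 4.0x for N48 and 1.8x for N44. Whether those multipliers match the N32/N33 ones would need checking against the AMD product pages linked earlier in the thread.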
 

branch_suggestion

Senior member
Aug 4, 2023
392
875
96
Assuming this leak is legit, I guess I need to kneel to Teska.
Oh well, I guess it was a tad too soon for GDDR7, but for AMD to fit so much in such a small space is outstanding, I was expecting a more exotic memory subsystem but it turns out it is basically N32/33 after L2. Looking forward to uncore changes too, and how much of that uncore is being gutted in N44.
The meat is of course the WGP/SA/SE changes.
 

Timorous

Golden Member
Oct 27, 2008
1,748
3,240
136
Regardless of whether all of those leaks are spot on or not, they're still missing the most important number of all, which is cost.

We've seen that when AMD has good prices they succeed in the market. Let's hope they've learned from this experience.

RV770 was not great for AMD. It gave them market share, but not enough to offset the low margins. They would have been better off pricing it far more in line with its performance relative to NV's cards.

If N48 can offer 7900XT performance then only expect a small price advantage despite the cheaper die because AMD would much rather have the margin than get into a price competition with NV. Especially since a lot of buyers don't want to buy AMD they just want to buy NV for less (or get a higher tier at the same price).
 

Mahboi

Golden Member
Apr 4, 2024
1,033
1,897
96
>7600 performance in a 130mm² die size??? WOW!!!
Is it? I thought they aimed higher, closer to a 7700 XT than a 7600, but that seems excessive. A 128-bit bus is not surprising, but a bit disappointing if it's not on GDDR7 and doesn't get a 12/10 GB version.

The current (botched ofc) N33 is 202mm² I believe.
So...we're looking at a possible ~50% improvement, plus a 20% general improvement from the lost perf in RDNA 3.
Seems consistent with earlier RDNA gens, no? 50% per gen? Although that's perf not area.
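
For what it's worth, the area side of that comparison can be put in numbers. A rough Python sketch, assuming the 202 mm² N33 figure quoted above and the leaked ~130 mm² for N44 (both unconfirmed), and assuming equal performance purely for illustration (the function names are made up):

```python
def area_reduction(old_mm2: float, new_mm2: float) -> float:
    """Fraction of die area saved going from the old die to the new one."""
    return 1 - new_mm2 / old_mm2


def density_gain(old_mm2: float, new_mm2: float) -> float:
    """Perf-per-area gain if both dies performed the same (old / new - 1)."""
    return old_mm2 / new_mm2 - 1


n33_mm2, n44_mm2 = 202.0, 130.0  # figures as quoted in the thread
print(f"area reduction: {area_reduction(n33_mm2, n44_mm2):.1%}")
print(f"perf/area gain at equal perf: {density_gain(n33_mm2, n44_mm2):.1%}")
```

So the die would be about 36% smaller, which is the same as about 55% more performance per mm² if it only matched N33. The ~50% figure fits the second reading.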
 