Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,602
5,788
136

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much reduced to prevent leaks.
But judging by the flurry of code in LLVM, there are a lot of commits. Maybe that is because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before things get merged on GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
That said, I believe Hopper had the problem of no host CPU capable of PCIe 5 being available in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it :grimacing:

This is nuts, MI100/200/300 cadence is impressive.


Previous thread on CDNA2 and RDNA3 here

 

soresu

Platinum Member
Dec 19, 2014
2,657
1,860
136
Gamers Nexus routinely reruns their benchmarks.
Phoronix:

ERuwt0YW4AMpePU.jpg:large
 

Tigerick

Senior member
Apr 1, 2022
651
536
106
While we are waiting for RDNA4 specs, how about some speculation on the RDNA5 cards, which are supposedly coming out next year? We still don't know how the chiplets are going to be arranged, be it an SED modular design or a single GCX die. We also don't know how AMD is going to rearrange RDNA5's WGP design. But we do know AMD is going to employ GDDR7 as standard. Let's assume AMD uses GCX and each GCX is linked to a base tile with a 128-bit GDDR7 memory bus, as shown below:

AMD-NAVI-4C-PATENT-850x333.jpg
The picture above is from AMD's patent for Navi4c; it could be used for RDNA5. What AMD describes as a virtual compute die could be a single GCX (or multiple SEDs) sitting on top of a base IC die with Infinity Cache and GDDR7 memory controllers. I am not sure the bridge chip is necessary, though. Anyhow, since I am maintaining the specs of Nvidia's upcoming Blackwell, let's put them together and compare the product positioning with estimated pricing:


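| RDNA5 Lineup | 8600 XT ? | 8700 ? | 7900M | 8700 XT ? | 8800 XT ? | 8900 XT ? | 8900 XTX ? |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Estimated Price | $299 ? | $399 ? | N5 Node | $499 ? | $699 ? | $899 ? | $999 - $1,099 ? |
| GPU chiplet | 160 mm² ? | 250 mm² ? | 304 mm² | One | Two | Three | Three |
| Bridge chip | NA | NA | NA | 0 | 1 | 3 | 3 |
| Infinity Cache | 32 MB ? | 48 MB ? | 64 MB | 32 MB ? | 64 MB ? | 80 MB ? | 96 MB ? |
| GDDR7 | 16GB GDDR6 | 12GB GDDR6 | 16GB GDDR6 | 12 GB | 16 GB | 20 GB | 24 GB |
| Memory Bus | 128-bit | 192-bit | 256-bit | 128-bit | 256-bit | 320-bit | 384-bit |
| Memory BW | 320 GB/s | 480 GB/s | 576 GB/s | 512 GB/s | 1 TB/s | 1.25 TB/s | 1.5 TB/s |
| WGP ? | 16 | 20 | 36 | 16 | 32 | 40 | 48 |
| CU (with DI) | 32 | 40 | 72 | 32 | 64 | 80 | 96 |
| CU (4 CU per WGP) | NA | NA | NA | 64 | 128 | 160 | 192 |
| Blackwell Lineup | RTX 5060 | RTX 5060Ti | | RTX 5070 | RTX 5070 Ti | RTX 5080 | RTX 5080 Ti |
| Estimated Price | $299 ? | $449 ? | | $599 ? | $799 ? | $999 ? | $1,199 ? |
| GDDR6X | 16GB GDDR6 | 12GB GDDR6 | | 12 GB | 16 GB | 20 GB | 24 GB |
| Memory BW | 320 GB/s | 480 GB/s | | 576 GB/s | 768 GB/s | 960 GB/s | 1.15 TB/s |
| % of RDNA5 | | | | > 12.5% | ~ 76% | ~ 76% | ~ 76% |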

  • The WGP numbers are purely speculative; we should look at the final CU numbers, which are crucial to rasterization performance. The CU numbers are slightly lower, but compared with the 7900M's numbers they make sense. Why compare to the 7900M? Because the 8700XT would most likely replace the current 7900M as the 8900M. Going from the 7900M's 16GB 256-bit GDDR6 to the 8900M's 16GB 128-bit GDDR7 provides the transition from GDDR6 to GDDR7 in notebooks.
  • Even though the 8700XT comes with 18.5% more memory bandwidth, it is still slower than the 7800XT. Let's see how much performance AMD is able to squeeze out of RDNA5. The 8700XT's competitor would be the upcoming RTX5070 with slightly faster 12GB GDDR6X.
  • The 7800XT is an oddball for NV; that might explain why NV had to replace the RTX4070 with the RTX4070S, which is only 7% slower than the RTX4070Ti. Even though the RTX4070S is faster than the 7800XT, the 7800XT still has a 4GB RAM advantage. Unfortunately, the 8800XT would be priced much higher than the 7800XT; my estimated price is $699, $200 extra. The reason is one more GCX and base tile. But it is still $100 cheaper than the RTX5070Ti, the direct competitor of the 8800XT; this time there is no memory size difference, only a memory type difference... ;)
  • The 8900XT and 8900XTX replace the current 7900XT series; that is simple to understand. OTOH, many people may be confused as to why NV launched the RTX4080S with only 2% more performance but $200 cheaper. Well, I think I know why: the upcoming RTX5080 with 20GB is going to replace the RTX4080S at the $999 price point, which puts the 8900XT in direct competition at a $100 cheaper price point.
  • If my calculations are correct, RDNA5 with GDDR7 has a memory BW advantage over GDDR6X on the Blackwell series. However, there are power and memory bandwidth penalties due to the chiplet design; that's why NV has stuck with a monolithic design for the moment. Anyhow, we will see NV and AMD fighting head to head once again next year.
  • There are four models listed above; AMD should be working on the successor of Navi4c, let's call it Navi5c. I am not sure AMD will pull it off though; with four GCXs and four base tiles, the power requirements are quite high. We shall see...
Well, that's all for the moment. There are a lot of assumptions above, so let's see what percentage of them I get right :p. Feel free to disagree, that's the purpose of discussion....
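For anyone who wants to check the "% of RDNA5" row, here is a minimal Python sketch. It assumes the bandwidth figures from the table as written, reads "1 TB/s" as 1000 GB/s, and pairs each Blackwell card with the RDNA5 card in the same column. The dictionary and function names are only illustrative.

```python
# Memory bandwidth figures copied from the table above, in GB/s.
# Assumption: TB/s values are converted with 1 TB/s = 1000 GB/s.
RDNA5_BW = {"8700 XT": 512, "8800 XT": 1000, "8900 XT": 1250, "8900 XTX": 1500}
BLACKWELL_BW = {"RTX 5070": 576, "RTX 5070 Ti": 768, "RTX 5080": 960, "RTX 5080 Ti": 1150}

# Assumed pairing: each Blackwell card against the RDNA5 card in the same column.
PAIRS = [
    ("RTX 5070", "8700 XT"),
    ("RTX 5070 Ti", "8800 XT"),
    ("RTX 5080", "8900 XT"),
    ("RTX 5080 Ti", "8900 XTX"),
]


def percent_of_rdna5(nv_card: str, amd_card: str) -> float:
    """Blackwell bandwidth expressed as a percentage of the paired RDNA5 card's."""
    return 100.0 * BLACKWELL_BW[nv_card] / RDNA5_BW[amd_card]


for nv_card, amd_card in PAIRS:
    print(f"{nv_card} vs {amd_card}: {percent_of_rdna5(nv_card, amd_card):.1f}%")
```

Under these assumptions the RTX 5070 lands at 112.5% of the 8700 XT (the "> 12.5%" entry), and the other three pairs land a little under 77%.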
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
7,403
2,439
146
Honestly, OCN is pretty dead, but some of the subforums are quite active, e.g. the watercooling and GPU subforums. https://www.overclock.net/threads/official-amd-radeon-rx-7900-xtx-xt-owners-club.1802706/
Long but worthy read. User input is way more valuable than TPU or GN re-review charts, IMO.
I found in the past OCN to be very helpful with BIOS mods/flashing and the like. I think there was info there that showed me how to mod the BIOS on my X99 for PCIe bifurcation.
 
Mar 11, 2004
23,074
5,557
146
Er, I thought we're supposed to be getting megahammer 999999XTXXXXXTTTTXXXXTTTXXT with MSRP of $3k+?

Why are you labeling RDNA5 as 7xxx and 8xxx series? Wait, were you the person insisting that 7800 is RDNA4? Oh, you are. Yeah, so you've already been shown to have no clue what you're talking about.

Sorry we really don't need this thread mucked up even more than it is with inane pointless speculation based on essentially nothing. Go make the RDNA5/CDNA4 thread if you want to start in on that.
 
Aug 4, 2023
137
270
96
Once the package is large enough, unless there are a lot of bridge connections, but that isn't a big deal at the ASP's for such things. Package yields become the main concern.
I suppose it does depend, the bridge connection is SoIC, no? So that is better than 2.5D routing underneath for raw metrics. Both can do stuff the other cannot so that is the main thing.
 

adroc_thurston

Platinum Member
Jul 2, 2023
2,050
2,637
96
Once the package is large enough, unless there are a lot of bridge connections, but that isn't a big deal at the ASP's for such things. Package yields become the main concern.
Piling up on SoIC means lower packaging yield.
I suppose it does depend, the bridge connection is SoIC, no? So that is better than 2.5D routing underneath for raw metrics. Both can do stuff the other cannot so that is the main thing.
Better, yes.
More money, also yes.
 

krawcmac

Junior Member
Nov 21, 2014
7
13
81
Hello guys, please don't hate me but there is a new video from MLID.

The video is on projected performance and launch window for RDNA4 cards. He talks about Navi 48 and 44. MLID gives a very wide window for launch for those cards (Q3.24 - Q1.25). It is stated that Navi 48 can clock to 3-3.3 GHz. So it looks like RDNA4 is fixed RDNA3 with some additional features. One thing I am missing is the projection on RDNA4 power consumption.
 

soresu

Platinum Member
Dec 19, 2014
2,657
1,860
136
Hello guys, please don't hate me but there is a new video from MLID.

The video is on projected performance and launch window for RDNA4 cards. He talks about Navi 48 and 44. MLID gives a very wide window for launch for those cards (Q3.24 - Q1.25). It is stated that Navi 48 can clock to 3-3.3 GHz. So it looks like RDNA4 is fixed RDNA3 with some additional features. One thing I am missing is the projection on RDNA4 power consumption.
#1. A short text summary is enough to cover what little information is actually imparted in any video from MLID, RGT or GamerMeld.

There's no need for promoting these channels directly by posting the video itself 😒

#2. Lower end RDNA3 dies are fabbed on N6.

If his information is correct then N48 and N44 are on N4P which is enough to account for the clocking difference.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,602
5,788
136
Hello guys, please don't hate me
Ahh ...

#1. A short text summary is enough to cover what little information is actually imparted in any video from MLID, RGT or GamerMeld.
- 9 months launch window
- RDNA4 has Matrix ops (LLVM says only GFX940)
- His leaks are more worthy than usual 'Linux driver stuffs' (the changes are actually in LLVM)
Not sure if these "leaks" are the kind that you like to hear :)


One interesting thing is that RGT said he heard RDNA5 has register renaming. That is odd, since I have only seen it in a patent. Not sure if he is actively digging through patents. If this register renaming is there, they could be solving lots of stalling issues.
 

Glo.

Diamond Member
Apr 25, 2015
5,707
4,551
136
One interesting thing is that RGT said he heard RDNA5 has register renaming. That is odd, since I have only seen it in a patent. Not sure if he is actively digging through patents. If this register renaming is there, they could be solving lots of stalling issues.
It would also improve (internal) bandwidth and efficiency of the GPUs with this arch.

This would be especially interesting for APUs.
 

moinmoin

Diamond Member
Jun 1, 2017
4,946
7,656
136
CUDA -> HIP/AMD finally with minimum to zero work from ZLUDA....

Yay, more open source!

Kinda bad news:
  1. It was open sourced because AMD stopped funding it.
  2. It used to only support Intel GPUs, now it only supports AMD GPUs.
With it being open source, I hope it will still see wide usage, which ideally helps make CUDA projects running on non-Nvidia GPUs a more common expectation.
 

soresu

Platinum Member
Dec 19, 2014
2,657
1,860
136
Yay, more open source!

Kinda bad news:
  1. It was open sourced because AMD stopped funding it.
  2. It used to only support Intel GPUs, now it only supports AMD GPUs.
With it being open source, I hope it will still see wide usage, which ideally helps make CUDA projects running on non-Nvidia GPUs a more common expectation.
Indeed.

I was initially hopeful about hipSYCL but now it seems like they are just fooling around with names while ZLUDA seems to be making dreams come true.

If this gets even a little bit of GPU rendering access with Arnold it's a win for me.

The fact that AMD and Intel in one way or another chose to cut funding or not fund to begin with makes me wonder what is going on.

Perhaps the coder is just difficult to work with.

Either way, it's already doing better than the current HIP backend for Blender Cycles, so I'd rather see AMD's money go toward ZLUDA than Blender, given it could allow commercial GPU renderers that still lack HIP backends to run on it.
 

Aapje

Golden Member
Mar 21, 2022
1,381
1,863
106
What we really need is an industry standard like in gaming, where all cards support DirectX, so the better implementation of the same API wins.

I think that the cloud companies are the most likely to force that onto the market, as they want to offer GPU-computing to the market, but don't want to be forced into choosing one vendor. They are even designing their own chips, which allows them to set a standard. So I can see Google/Amazon/etc coming up with a shared API and then demanding that Nvidia, AMD & Intel support it for them to even be in the running to deliver chips.
 

PJVol

Senior member
May 25, 2020
534
447
106
It is cheaper, more flexible and just plain better than CoWoS, so it really is necessary.
Oh, not this again!
In all honesty, I'd like to see AMD throw out all this bridge / interconnect / whatever **** from the gaming segment and pack what's left in BGA, leaving "innovations" for their datacenter moneybags.
 

SteinFG

Senior member
Dec 29, 2021
416
470
106
Some people are guessing, so I think why not post my expectations. I'm considering recent (unreliable) rumors: 64 CU Navi 48, 7900 XT level of performance; 32 CU Navi 44.
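| Model | VRAM | Cores | Memory bus | TDP | Comparable to | Price |
| --- | --- | --- | --- | --- | --- | --- |
| RX 8800 XT | 16GB | 64 CU | 256bit, 20Gbps | 260W | ~7900 XT | $530 |
| RX 8700 XT | 16GB | 56 CU | 256bit, 18Gbps | 230W | ~6950 XT | $470 |
| RX 8600 XT | 16GB | 32 CU | 128bit, 20Gbps | 160W | ~4060 Ti 16G | $330 |
| RX 8500 XT | 8GB | 28 CU | 128bit, 18Gbps | 130W | ~3060 Ti | $250 |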

I expect 8700 XT to have 16GB (7700XT with 12GB is a fail). I also think there's no demand for an 8GB card that's more powerful than 3060 Ti, that's why there's no 8GB 32 CU chip on this table. RX 7600 should slip to $200-$220 and RX 6600 will get discontinued.
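As a rough sanity check on the memory bus column, peak bandwidth follows from bus width and per-pin data rate: bits times Gbps, divided by 8 to get GB/s. The sketch below applies that to the four guessed cards. The function name and the `CARDS` dictionary are illustrative, and the inputs are the rumored values from the table, not confirmed specs.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s from bus width (bits) and per-pin rate (Gbps)."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps), taken from the table above.
CARDS = {
    "RX 8800 XT": (256, 20),
    "RX 8700 XT": (256, 18),
    "RX 8600 XT": (128, 20),
    "RX 8500 XT": (128, 18),
}

for name, (bus_bits, rate) in CARDS.items():
    print(f"{name}: {bandwidth_gb_s(bus_bits, rate):.0f} GB/s")
```

That works out to 640, 576, 320 and 288 GB/s respectively, if the rumored bus widths and speeds hold.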
 