Question Speculation: RDNA3 + CDNA2 Architectures Thread


uzzi38

Platinum Member
Oct 16, 2019
2,565
5,574
146

Kaluan

Senior member
Jan 4, 2022
500
1,071
96
New AMD "Meet the Experts" webcast on the 15th next month:


Not expecting much, but perhaps some teasers for product from the lower end of the RX 7000 stack and some hints about what to expect from Adrenalin driver updates and what FSR3 is and how it works.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,222
5,224
136

Probably the event that GN (previous video I linked) was at, where they had a quick Q&A with one of the engineers, so I expect there will be some overlap with that.
 

Kaluan

Senior member
Jan 4, 2022
500
1,071
96
Maybe, but I don't see Sam Naffziger in the speakers list.

Will skim through it regardless. They may go into lengthy ROCm and GPUOpen talks as well.
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
Boy, I'm getting so tired of so called leaks lately. Recently we had "only 6 and 8 core get V-Cache" then a few days later "actually, 7950X3D will be a thing", now we've had "custom 7900XT/XTX only 1-2 weeks after launch?" followed a day ago by "actually, AIBs will have custom cards ready on the 13th".

Bleah.

Either way, I'm keeping my fingers crossed for both reference and custom SKU reviews being up on the lifting of the NDA.

Don’t confuse “leaks” with “wishful thinking”. A 7950X3D is something myself and others drool over 🤤, it is a desire vs a rumor. It will likely never take hold. Regarding OEM parts, the situation is a bit complicated because those involved can’t actually say anything except what AMD allows them to say, which is absolutely nothing prior to launch day.

I suspect that we will see a mix of cards from shortly after launch and onward. We already know this of course, because big players are leaking.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,329
2,811
106
…and rumors of 3ghz N32 are a continuing thing apparently.
If I understand it correctly, the RX 7900 XTX has a max shader boost of 2.5GHz, but the game clock drops to 2.3GHz to save power; the frontend is probably at 2.5GHz for both boost and game clock.

Comparison:
GPU          Specs                     Frequency  Processing power      Texture fillrate  Pixel fillrate   Bandwidth        TPU 4K performance
RX 7900 XTX  6SE:6144SP:384TMU:192ROP  2500 MHz   61,440 GFlops (380%)  960 GT/s (190%)   480 GP/s (238%)  960 GB/s (188%)  ~201% ?
RX 7900 XT   6SE:5376SP:336TMU:192ROP  2400 MHz   51,610 GFlops (319%)  806 GT/s (160%)   461 GP/s (228%)  800 GB/s (156%)  ~169% ?
RX 6950 XT   4SE:5120SP:320TMU:128ROP  2310 MHz   23,654 GFlops (146%)  739 GT/s (146%)   296 GP/s (147%)  576 GB/s (113%)  132%
Full N32     3SE:3840SP:240TMU:96ROP   3000 MHz   46,080 GFlops (285%)  720 GT/s (143%)   288 GP/s (143%)  640 GB/s (125%)  ~151% ?
RX 6800      3SE:3840SP:240TMU:96ROP   2105 MHz   16,166 GFlops (100%)  505 GT/s (100%)   202 GP/s (100%)  512 GB/s (100%)  100%
How did I calculate performance for N31 and N32?
RX 7900 XTX -> 61440/2*1.174/23654 = 1.52 * 132% = 201%
RX 7900 XT -> 51610/2*1.174/23654 = 1.28 * 132% = 169%
N32 -> 46080/2*1.174/23654 = 1.14 * 132% = 151%
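
For anyone who wants to replay these numbers, here is a minimal Python sketch of the same estimate. It only restates the arithmetic above: the dual-issue GFlops figure is halved, multiplied by the 1.174 factor, divided by the 6950 XT's GFlops, and scaled by its 132% TPU result. The function and constant names are made up for illustration, and the output is only as good as those assumptions.

```python
# Sketch of the scaling estimate above. All figures come from the table in the post;
# the names below are hypothetical and only serve the illustration.

RDNA3_FACTOR = 1.174        # per-shader uplift factor used in the post
RX6950XT_GFLOPS = 23_654    # 5120 SP * 2 FLOPs * 2310 MHz
RX6950XT_TPU_4K = 1.32      # TPU 4K performance relative to the RX 6800 (= 1.00)


def dual_issue_gflops(shaders: int, mhz: int) -> float:
    """Peak FP32 GFlops with dual issue: SP * 2 (FMA) * 2 (dual issue) * clock."""
    return shaders * 2 * 2 * mhz / 1000


def estimated_4k_perf(gflops: float) -> float:
    """Halve the dual-issue figure, apply the factor, scale from the 6950 XT."""
    ratio = gflops / 2 * RDNA3_FACTOR / RX6950XT_GFLOPS
    return ratio * RX6950XT_TPU_4K


cards = {
    "RX 7900 XTX": (6144, 2500),
    "RX 7900 XT": (5376, 2400),
    "Full N32": (3840, 3000),
}

for name, (shaders, mhz) in cards.items():
    gflops = dual_issue_gflops(shaders, mhz)
    print(f"{name}: {gflops:,.0f} GFlops -> ~{estimated_4k_perf(gflops):.0%} of RX 6800")
```

Changing the clock in `cards` shows how sensitive the N32 estimate is to the rumoured 3GHz figure.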

I don't think the RX 7900 XT will be only 12% faster than N32. GFLOPs and texture fillrate are 12% better, but pixel fillrate is 60% better and BW is 25% better.
Let's not forget that N32 has a big disadvantage in having only 3 Shader Engines, which means only half the rasterizers, primitive units and L2 cache of N31.
My personal opinion is that N32 will cost $649-699.
 

Timorous

Golden Member
Oct 27, 2008
1,532
2,535
136

If AMD are changing the number of WGPs (dual CUs) per SE from 8 to 10 for N32, there is no reason they wouldn't increase the number of ROPs from 32 to 40 or something.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,329
2,811
106
They could do It, but likely won't.
N22 and N23 had 2SE but with different amount of CUs per SE, but ROPs didn't change.
They didn't even bother deactivating some ROPs in cutdown versions of N1x, N2x or N31.
If AMD wanted to have more ROPs then It would have been better to have 4SE for 128 ROPs to begin with.
Even with 96 ROPs It is just a bit behind 6950XT in pixel fillrate.
Still, we have to wait to see If N32 will really be clocked at 3GHz.
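
A quick sketch of the pixel fillrate comparison, assuming peak fillrate is simply ROPs times clock (one pixel per ROP per clock). The 3000 MHz N32 clock is the rumour discussed in this thread, and the 2310 MHz figure for the 6950 XT is the one from the earlier table. The function name is made up for illustration.

```python
def pixel_fillrate_gpix(rops: int, mhz: int) -> float:
    """Peak pixel fillrate in GP/s, assuming one pixel per ROP per clock."""
    return rops * mhz / 1000


n32 = pixel_fillrate_gpix(96, 3000)        # rumoured N32 clock, not confirmed
rx6950xt = pixel_fillrate_gpix(128, 2310)  # clock taken from the earlier table

print(f"N32: {n32:.0f} GP/s, 6950 XT: {rx6950xt:.0f} GP/s, ratio {n32 / rx6950xt:.2f}")
```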
 

Timorous

Golden Member
Oct 27, 2008
1,532
2,535
136

N21 and N22 had 20 CUs per SE. N23 was cut from that.

N32 has more CUs per SE than N31 so it is not quite the same.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,329
2,811
106
True, but It's the same 20CUs per SE as N21 and N22 or N10 had.
Of course, we can debate If the ratio 32 ROPs for 20CUs is still enough for RDNA3 as It was for RDNA1 and RDNA2.
 

Kepler_L2

Senior member
Sep 6, 2020
308
977
106
The amount of ROPs is determined by the number of RBs (Render Backends). If RDNA3 is still using RDNA2-style fat RBs then it's pretty much guaranteed that Navi32 is 96 ROPs. But if they went back to RDNA1-style smaller RBs then we could see some customization in RBs per SE for each die.
 

Timorous

Golden Member
Oct 27, 2008
1,532
2,535
136

Unless it clocks to the absolute moon (3.2Ghz+ in game as an average) then 96 ROPS seems too little for a 7800XT tier part. If it did that then 96 would be sufficient to provide a good uplift vs the 6800XT and it would leave a lot of room for a lower clocked, lower shader count 7700XT that does not need to disable ROPS so I could see a 48CU 3SE 7700XT.

Maybe a 5SE N31 design with 14 CUs per SE (so cut like the 7900XT and with 1 disabled SE) for 70 CUs but that seems like a lot of cuts on a 50% larger die for one of the more volume oriented parts. I just don't see it on its own merit and I don't see how it would fit within the product stack because then you end up with 7800 and 7700 both being 16GB or you end up with 7800 and 7900XT both being 20GB. Nah. I just don't see N31 as a 7800XT as something AMD would do.
 

Aapje

Golden Member
Mar 21, 2022
1,312
1,773
106
I expect the 7800 (XT) to be Navi 32. Then they have more separation between the tiers like Nvidia put more separation between the 4090 and the 4080 than it had with the 3090 vs 3080.

The 7800 XT then remains a 256 bit part with 16 GB and the 7800 can become 192 bit with 12 GB.

Although I'm wondering how the 7700 tier then ends up. That could also become the 192 bit part with the 7800 XT and 7800 both having 256 bit and 16 GB, but just different cuts.
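
The bus width and capacity pairings above follow from simple arithmetic, sketched here under two assumptions the post does not state: each GDDR6 chip sits on a 32-bit slice of the bus, and every chip is a 2 GB (16 Gb) part. The function name is hypothetical.

```python
GDDR6_BITS_PER_CHIP = 32   # each GDDR6 device exposes a 32-bit interface
GB_PER_CHIP = 2            # assumption: 16 Gb (2 GB) chips, one per 32-bit slice


def vram_capacity_gb(bus_width_bits: int, gb_per_chip: int = GB_PER_CHIP) -> int:
    """Capacity when every 32-bit slice of the bus has exactly one chip attached."""
    if bus_width_bits % GDDR6_BITS_PER_CHIP:
        raise ValueError("bus width must be a multiple of 32 bits")
    return bus_width_bits // GDDR6_BITS_PER_CHIP * gb_per_chip


for bus in (256, 192):
    print(f"{bus}-bit -> {vram_capacity_gb(bus)} GB")
```

Under the same assumptions a 320-bit bus gives 20 GB, which is why cutting the bus is the usual way to separate tiers by memory size.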
 

maddie

Diamond Member
Jul 18, 2010
4,723
4,628
136
No 7800 probably. The 6800 was almost an orphan. I think they regretted it, at least as an initial offering.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,329
2,811
106
Why is everyone discussing the amount of ROPs in N32? Is 96ROPs really a problem or the only potential bottleneck?
If you check N32, then It is basically 1.5x of N22 except IC.
If they can't significantly increase the clocks, then It will compare to N21 in performance.

How It could theoretically perform:
N22: 40CU*64Shaders*2Flops*2600MHz= 13,312 GFlops
N32: 60CU*64Shaders*2Flops*2600MHz*1.174= 23,442 GFlops
+76% over N22 at 4K.
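
The same calculation as a small Python sketch, assuming both chips run at 2600 MHz and that the 1.174 factor applies to N32 exactly as in the lines above. The function name is made up for illustration.

```python
RDNA3_FACTOR = 1.174  # uplift factor applied to N32 in the post


def fp32_gflops(cus: int, mhz: int, shaders_per_cu: int = 64, flops_per_clock: int = 2) -> float:
    """Peak FP32 GFlops: CUs * shaders per CU * FLOPs per clock * clock."""
    return cus * shaders_per_cu * flops_per_clock * mhz / 1000


n22 = fp32_gflops(40, 2600)
n32 = fp32_gflops(60, 2600) * RDNA3_FACTOR

print(f"N22: {n22:,.0f} GFlops, N32: {n32:,.0f} GFlops, uplift {n32 / n22 - 1:+.0%}")
```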
 

Timorous

Golden Member
Oct 27, 2008
1,532
2,535
136

Just look at the specs.

7900XT    6950XT    Delta  7800XT (N32)  6800      Delta
84 CU     80 CU     +5%    60 CU         60 CU     0%
192 ROPs  128 ROPs  +50%   96 ROPs       96 ROPs   0%
800 GBps  576 GBps  +39%   640 GBps      512 GBps  +25%
Overall FPS Delta   Per AMD slides   +30%

If we take AMD at face value that the RDNA3 CUs are 17% faster clock for clock then to hit 6800XT + 30% you need to be about 50% faster than the 6800 at 4K so that would require 30% higher clocks. The 6800 manages to sustain between 2.1 and 2.2 Ghz so call it 2.15. That means overall clocks for N32 need to be 2.8Ghz to hit the desired performance target of 6800XT + 30%.

Even with 30% higher clocks though it would have a lower pixel fillrate than the 6800XT and given a minimum gen on gen gain of 30% you probably need more pixel fill (perhaps not 30% more but some amount more). To do that you need more like + 50% clocks which is 3.2Ghz.
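
A rough sketch of the clock targets above, assuming performance scales linearly with clock and with the 17% clock-for-clock claim, and taking 2.15GHz as the sustained RX 6800 clock as stated in the post. The names are hypothetical.

```python
ARCH_GAIN = 1.17        # AMD's clock-for-clock CU claim as quoted above
BASE_CLOCK_GHZ = 2.15   # sustained RX 6800 clock assumed in the post


def required_clock_uplift(target_speedup: float, arch_gain: float = ARCH_GAIN) -> float:
    """Clock multiplier needed if the rest of the speedup comes from the architecture."""
    return target_speedup / arch_gain


uplift = required_clock_uplift(1.5)      # 50% faster than the RX 6800
clock_plus_30 = BASE_CLOCK_GHZ * 1.30    # the post's rounded +30% clock target
clock_plus_50 = BASE_CLOCK_GHZ * 1.50    # the post's pixel fillrate target

print(f"needed uplift: {uplift:.2f}x")
print(f"+30% clocks: {clock_plus_30:.1f} GHz, +50% clocks: {clock_plus_50:.1f} GHz")
```

The exact uplift works out slightly below the rounded 30% used in the post, so the 2.8GHz figure is a mild overestimate under these assumptions.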

Provided it can clock that high (and AMD did say RDNA3 was architected to hit > 3GHz, and N32 is the ideal die for that to be true) it will work out fine. If it falls short then the gen on gen gain will be lacklustre. Good pricing could make up for it, but I don't think AMD will want to charge less than $650.
 