Question Speculation: RDNA3 + CDNA2 Architectures Thread


TESKATLIPOKA

Platinum Member
May 1, 2020
Definitely premature for that.
kopite7kimi
RTX 4070, AD104-275-Kx(x is a number)-A1, 7168FP32, 160bit 18Gbps 10G GDDR6, 300W.
I am basing it on leaks.
The RTX 4070 has 7168 FP32 + 3584 INT32 (Ampere: 3584 FP32 + 3584 FP32/INT32), so this change should increase performance, and the frequency is supposedly >=2.7GHz.
N33 has only 4096 FP32 and a frequency >=3GHz.
To me it looks like the RTX 4070 will be faster even in raster.

If those 4096 FP32 are actually 4096 FP32 (+VOPD), then I wouldn't be so sure, at least for raster.
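For concreteness, here is a quick back-of-envelope sketch of the peak-throughput arithmetic behind this comparison. The shader counts and clocks are the rumoured figures quoted above, and the 1.35x dual-issue factor on the last line is purely an assumption (within the 1.3-1.4x range discussed later in the thread):

```python
# Back-of-envelope peak FP32 throughput from the leaked specs above.
# All inputs are rumours, not confirmed specs.

def peak_fp32_tflops(fp32_lanes: int, clock_ghz: float) -> float:
    """Peak TFLOPS = lanes x 2 ops per clock (FMA = mul + add) x clock."""
    return fp32_lanes * 2 * clock_ghz / 1000

ad104 = peak_fp32_tflops(7168, 2.7)  # RTX 4070 rumour: ~38.7 TFLOPS
n33   = peak_fp32_tflops(4096, 3.0)  # N33 rumour:      ~24.6 TFLOPS

# If N33's "4096 FP32" is really 2048 base lanes that dual-issue (VOPD)
# only part of the time, the effective figure is lower. 1.35x is an
# assumed uplift, not a known number:
n33_effective = peak_fp32_tflops(2048, 3.0) * 1.35  # ~16.6 "RDNA2-equivalent" TFLOPS

print(f"AD104 ~{ad104:.1f} TFLOPS peak, N33 ~{n33:.1f} TFLOPS peak, "
      f"~{n33_effective:.1f} TFLOPS effective under the VOPD assumption")
```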
 
Last edited:
  • Like
Reactions: Mopetar

Grabo

Senior member
Apr 5, 2005
Guys, let's not forget that RT performance will be way more important than when RDNA2 launched.
Let's be honest, RDNA2 is very bad with RT enabled, and if they don't fix it now, they can't get away with only competitive raster performance. We say that Ampere has worse perf/W than RDNA2, but that's only true when RT is disabled. If you enable it, RDNA2 loses badly.

N33 is supposedly comparable to the RX 6950XT. Performance worse than a 3070? Very sad.
[Image: cyberpunk-2077-rt-1920-1080.png, Cyberpunk 2077 RT benchmark at 1920x1080]


I feared that 8GB of VRAM would have a big impact on RT performance, but after looking at the RTX 3070 (Ti) 8GB, it mostly happens at 4K with RT enabled. OK, in some games even at 1440p, but not by that much, and the FPS is still high enough.

[Image: cyberpunk-2077-rt-3840-2160.png, Cyberpunk 2077 RT benchmark at 3840x2160]
[Image: doom-eternal-rt-3840-2160.png, Doom Eternal RT benchmark at 3840x2160]


My conclusion is that N33 shouldn't have a problem at 1080p-1440p with RT enabled just because of its 8GB of VRAM, at least in current games.
I am more worried about the RT performance of RDNA3 itself. Hopefully it's better than Ampere's, because Ada's will be.

There has been some information and speculation regarding ray tracing, for instance https://overclock3d.net/news/gpu_di...cted_compute_units_that_enhance_ray_tracing/1 and https://appuals.com/rdna3-ray-tracing-boost-explained/; it seems AMD has promised better RT performance, not necessarily implemented in the same way as Nvidia's. For the end user the implementation matters not a whit: tick "ray tracing", get whatever visual fidelity the game implements, and hope it isn't a horrorshow for the fps, I guess.

From the overclock3d article:
Full-on path tracing, where an entire scene and all aspects of a game are ray traced, is not something that AMD is pushing for with RDNA 3. AMD is pushing hybrid rendering, where ray tracing is used alongside traditional rasterised graphics to deliver high performance while accessing the visual benefits of ray tracing. AMD is currently investing in techniques that enable ray tracing in a more performance-friendly manner, allowing gamers to get the most performance out of their graphics cards and games.
This makes clear that while AMD plans to deliver a ray tracing boost with RDNA 3, they are not promising earth-shattering benefits. Hybrid rendering is the future, as we will have to wait for another console generation before the entire gaming industry pushes things to another level.
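Purely as an illustration of the distinction the article draws (a hypothetical sketch, not AMD's actual pipeline; every function below is a placeholder):

```python
# Hybrid rendering vs full path tracing, sketched. All functions are
# hypothetical placeholders standing in for real GPU passes.

def rasterize(scene, camera):
    # Cheap primary-visibility pass producing a G-buffer.
    return {"depth": ..., "normals": ..., "albedo": ...}

def trace_shadow_rays(scene, gbuffer):
    # One short occlusion ray per pixel toward the light: far cheaper
    # than tracing full light transport.
    return "shadow_mask"

def trace_reflection_rays(scene, gbuffer):
    # Rays cast only from glossy surfaces identified in the G-buffer.
    return "reflection_buffer"

def shade(gbuffer, shadows, reflections):
    # Composite the selective RT results over the rastered frame.
    return (gbuffer, shadows, reflections)

def render_frame_hybrid(scene, camera):
    """Raster does the heavy lifting; rays are spent only on effects
    raster fakes poorly, so ray count scales with the effect, not the scene."""
    gbuffer = rasterize(scene, camera)
    return shade(gbuffer,
                 trace_shadow_rays(scene, gbuffer),
                 trace_reflection_rays(scene, gbuffer))

# Full path tracing would instead trace every light path for every
# pixel - the workload the article says AMD is *not* targeting yet.
```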
 
  • Like
Reactions: Ranulf

Kaluan

Senior member
Jan 4, 2022
kopite7kimi

I am basing it on leaks.
The RTX 4070 has 7168 FP32 + 3584 INT32 (Ampere: 3584 FP32 + 3584 FP32/INT32), so this change should increase performance, and the frequency is supposedly >=2.7GHz.
N33 has only 4096 FP32 and a frequency >=3GHz.
To me it looks like the RTX 4070 will be faster even in raster.

If those 4096 FP32 are actually 4096 FP32 (+VOPD), then I wouldn't be so sure, at least for raster.
Not entirely sure why you (and others TBH) are comparing a die that is likely going to go in RTX 4070 class SKUs to one that is likely to go in RX 7600 class SKUs tho
 

jpiniero

Lifer
Oct 1, 2010
kopite7kimi

I am basing it on leaks.
The RTX 4070 has 7168 FP32 + 3584 INT32 (Ampere: 3584 FP32 + 3584 FP32/INT32), so this change should increase performance, and the frequency is supposedly >=2.7GHz.
N33 has only 4096 FP32 and a frequency >=3GHz.
To me it looks like the RTX 4070 will be faster even in raster.

If those 4096 FP32 are actually 4096 FP32 (+VOPD), then I wouldn't be so sure, at least for raster.

That's a good question. I'm not expecting the 4070 to be faster than the 3090/6900 XT, but I suppose it could be. And if N33 were decently slower than the 3090/6900 XT, AMD probably wouldn't bother with the desktop release.

I think the main purpose of N33 is for Dragon Range gaming laptops.
 

moinmoin

Diamond Member
Jun 1, 2017
Guys, let's not forget that RT performance will be way more important than when RDNA2 launched.
Is that actually the case? The current gen of consoles is effectively the baseline wrt RT performance. Full-on RT is currently a luxury both for consumers (since only costly high-end cards offer decent performance) and for developers (since using RT won't save them the work of doing baked-in lighting etc., as the high-end cards offering decent performance are only a tiny part of the TAM). So for now the parts of RT that can be accelerated well even on consoles (like real-time shadows and selective reflections) see more widespread adoption, whereas good full-on RT performance still needs to trickle down to cheaper cards, ideally to the point that, by the time next-gen consoles arrive, using full-on RT by default has become a no-brainer. But to get there, that RT tech may well deviate significantly from current RT tech in its implementation.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
That's a good question. I'm not expecting the 4070 to be faster than the 3090/6900 XT, but I suppose it could be. And if N33 were decently slower than the 3090/6900 XT, AMD probably wouldn't bother with the desktop release.

I think the main purpose of N33 is for Dragon Range gaming laptops.
N33 is supposedly as fast as N21, at least at 1080p; I'm not sure about 1440p, and I highly doubt it at 2160p.
I compared the RX 6600 XT vs the RX 6900 XT:
1080p: +60%
1440p: +81%
2160p: +117%
FullHD looks realistic.
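As a sketch of why that reads as 1080p-only parity (using the percentage leads quoted above; the interpretation is mine):

```python
# How much of a 6900 XT the 6600 XT delivers at each resolution,
# derived from the leads quoted above (6900 XT is +60/+81/+117% faster).
lead_6900xt = {"1080p": 0.60, "1440p": 0.81, "2160p": 1.17}

for res, lead in lead_6900xt.items():
    frac = 1 / (1 + lead)
    print(f"{res}: RX 6600 XT ~ {frac:.0%} of an RX 6900 XT")

# Output: ~63% at 1080p, ~55% at 1440p, ~46% at 2160p. The narrow bus
# and small cache hurt more as resolution rises, and N33 is similarly
# narrow (128-bit), hence "as fast as N21" is only plausible at FullHD.
```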
 
  • Like
Reactions: Tlh97 and Kepler_L2

Timorous

Golden Member
Oct 27, 2008
The competition is 10 GB. If it's faster than the 4070, people will gloss over it. Even if it's slower people won't mind.

They won't at all, because the 4070/7700XT are 1440p cards and there are already cases where 8GB falls short. With RT added on top, 8GB is even less usable at 1440p. Further, I don't think there is any expectation that N33 will be faster than the 4070; it should be competing with the 4060.

You do realize that the cut N23's original MSRP was $329. I hope you aren't expecting Zen 4 to start at under $299 either.

Do you really think AMD would have set MSRPs of $330 and $380 for the two N23 parts in a non-inflated market? Even with such 'bad' MSRPs, prices went a lot higher at the time anyway due to demand.

To fill the price-gap void that these high N33 prices and not having an N34 leave. And there's no N34 unless AMD can manage to get serious mobile deals for it, and it would probably be mobile-only. So you just keep selling the RDNA 2 Refresh as long as you can hold its MSRP. If The Flood happens, it happens. You probably won't even see desktop AD106 until The Flood fears subside.

The only potential gap to fill is below the 7600, and for that you need a cheaper part than N33. The only option would be N23, but with 8GB of RAM and it being on N7 rather than N6, chances are the actual cost per die is pretty similar. On top of that, N33 will have better perf/W than N23, so cooling and power requirements go down, which helps lower the BOM, as does using just 6GB of RAM (possible with N23 as well).

Bottom line: a cut N33 with 6GB of RAM and the efficiency improvements of RDNA 3 is probably no worse BOM-wise than a 6GB N23-based part, and I guarantee it will be cheaper than an N22-based part. So from a margin POV, why go with the slower N23 config for the 7500XT when you can make it faster using cut N33, with no real increase in cost to AMD?

Why would AMD do that? They only did two cuts with Navi 21 which is the biggest die.

Because the other options cost more and / or give you a far worse product.

With just 3 graphics dies AMD can cover the whole non-APU stack, and for N31/N32 parts the number of MCDs can vary to lower the BOM slightly in the cut versions, making them less painful to manufacture. This is what I think AMD are going to do with the lineup.

AMD Die | Product | Spec | NV Die | Product
N31 | 7900XT | 384bit - 24GB - 96CU - 192MB IC | AD102 | 4090Ti
N31 | 7900 | 384bit - 24GB - 84CU - 192MB IC | AD102 | 4090
N31 | 7850XT | 320bit - 20GB - 80CU - 160MB IC | AD103 | 4080Ti
N32 | 7800XT | 256bit - 16GB - 64CU - 128MB IC | AD103 | 4080
N32 | 7800 | 256bit - 16GB - 64CU - 128MB IC | ?? | 4070Ti
N32 | 7700XT | 192bit - 12GB - 48CU - 96MB IC | AD104 | 4070
N32 | 7700 | 192bit - 12GB - 42CU - 96MB IC | AD104 | 4060Ti
N33 | 7600XT | 128bit - 8GB - 32CU - 64MB IC | AD106 | 4060
N33 | 7600 | 128bit - 8GB - 28CU - 64MB IC | ?? | 4050Ti
N33 | 7500XT | 96bit - 6GB - 24CU - 32MB IC | ?? | 4050

Not sure on NV specs so excluded them but I think something like this can work for AMD.

My issue is where AMD puts the 5SE/5MCD part. It has a place, I am just not sure where, so here is the super wishful thinking version AMD could produce (with a more realistic version too), with some price guesses.

Die | Wishful thinking stack | More likely stack | Spec | Target Res
N31 | 7900XT | Rage Fury MAXXX ($2,000) | 384bit - 24GB - 96CU - 576MB IC (V-cache on the MCDs, because why not for $2k?) | 8K / 4K with RT
N31 | 7900 | Rage Fury ($1,500) | 384bit - 24GB - 96CU - 192MB IC | 8K / 4K with RT
N31 | 7800XT | 7900XT ($1,100) | 320bit - 20GB - 80CU - 160MB IC | 4K with RT
N31 | 7800 | 7900 | 320bit - 20GB - 70CU - 160MB IC | 4K with RT
N32 | 7700XT | 7800XT ($750) | 256bit - 16GB - 64CU - 128MB IC | 4K / 1440p with RT
N32 | 7700 | 7800 | 256bit - 16GB - 56CU - 128MB IC | 4K / 1440p with RT
N32 | 7600XT | 7700XT ($550) | 192bit - 12GB - 48CU - 96MB IC | 1440p / 1080p with RT
N32 | 7600 | 7700 | 192bit - 12GB - 42CU - 96MB IC | 1440p / 1080p with RT
N33 | 7500XT | 7600XT ($400) | 128bit - 8GB - 32CU - 64MB IC | 1080p
N33 | 7500 | 7600 | 128bit - 8GB - 28CU - 64MB IC | 1080p

With the more likely stack, each tier is going to get around a 1.7-2x perf gain, with the full-fat N31 occupying a brand-new tier that AMD doesn't really compete in today. It also means that in theory the x900 can stay a 300W part, which might be more palatable for some.

*Edited the amount of IC for the MAXXX because with a cache-only die AMD can fit 64MB in 36mm², and if the ~40mm² estimate for the MCD is accurate, they can just stack it on top and end up with an MCD carrying 96MB of IC.
 
Last edited:

jpiniero

Lifer
Oct 1, 2010
Again, with yields as good as they are, there shouldn't be any need to have more than one cut for each level of the stack.

you need a cheaper part than N33.

They don't, really. See what AMD did with Zen 3: they are getting out of the low end and even the lower mid-range. And with TSMC's price hikes, it's an even easier call.
 

Kepler_L2

Senior member
Sep 6, 2020
*Edited the amount of IC for the MAXXX because with a cache-only die AMD can fit 64MB in 36mm², and if the ~40mm² estimate for the MCD is accurate, they can just stack it on top and end up with an MCD carrying 96MB of IC.
The N31 version with 3D V-Cache has 384MB of Infinity Cache, not 576MB.
 

Timorous

Golden Member
Oct 27, 2008
1,669
2,939
136
Again, with the yields as good as they are there shouldn't be any need to have more than one cut for each level of the stack.

With traditional monolithic dies, correct (although AMD have done it before, with N10 for example: 5700XT, 5700, and 5600XT in a 192bit 6GB config). Part of this is because power and cost scaling tapers off: you are using the same amount of silicon as the more expensive part while charging less for it, and you don't get as much in the way of power savings, so your reductions in PCB and cooler costs are not as great as with a dedicated part designed for that segment.

The difference here, though, is that AMD are using multiple dies for the top tiers. This means that when they decide to cut an SE out of N31 to make a 20GB part, they can save on silicon costs by not having to use 6 MCDs. They can save on power because they only need to power the cut GPU and 5 MCDs, and they can save on cooling because they have fewer chips to cool. It is entirely possible that these BOM benefits of the chiplet approach make designing 4 GCDs, one for each segment, and then managing the inventory and wafer allocation the more expensive option, and that approach will always be less flexible than having fewer chips.

So while the need to cut for yield reasons will be minimal, the flexibility to decide post-manufacture where to allocate parts is pretty powerful for maximising units sold.

On top of that, it is possible AMD will ignore the non-XT parts for a while, but that would leave a lot of large pricing gaps in the stack, so they could fill them with parts that are defective differently (2 WGPs down across more than 1 SE for the non-XT configs, vs 1 or more WGPs down in a single SE for the cut-SE XT configs).

Bottom line is AMD are working with much smaller GPUs this time around. A full-fat N32 will be ~390mm² of silicon, which for an x800-tier part is probably less silicon cost than the N21 die used in the 6800XT. The 7700XT will be ~350mm² of silicon, which is about the same as N22, but about 33% of it is cheaper N6-based silicon, so die costs for that might not be any greater than N22's. With that in mind, AMD can saturate the 7800-tier market and then allocate any remaining N32 dies to the 7700-tier market to maximise profits. Much easier to do this than to predict upfront how many dies for each part you need. It also allows AMD to bring the lower-tier products to market faster, because they are based on the same GCD.
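To make the area claim concrete, here is a rough sketch. The per-die areas are assumptions picked to reproduce the ~390mm²/~350mm² totals above; the real N32 GCD/MCD sizes weren't public at the time:

```python
# Rough silicon-area reconstruction. GCD/MCD sizes are assumptions
# chosen to match the totals claimed above, not confirmed figures.
GCD_N32_MM2 = 240.0  # assumed N5 graphics compute die
MCD_MM2     = 37.5   # assumed N6 memory/cache die

def total_silicon_mm2(n_mcds: int) -> float:
    """Total silicon for an N32 package with the given MCD count."""
    return GCD_N32_MM2 + n_mcds * MCD_MM2

full_n32 = total_silicon_mm2(4)  # 256-bit part: ~390 mm^2
cut_n32  = total_silicon_mm2(3)  # 192-bit "7700 XT": ~353 mm^2

n6_share = 3 * MCD_MM2 / cut_n32  # ~32% of the cut part is cheaper N6
print(f"full N32 ~{full_n32:.0f} mm^2, cut N32 ~{cut_n32:.0f} mm^2, "
      f"N6 share of the cut part ~{n6_share:.0%}")
```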

The N31 version with 3D V-Cache has 384MB of Infinity Cache, not 576MB.

So either AMD have decided to increase granularity and manufacture 32MB cache dies for all V-cache parts going forward (which makes something like a 7600X3D with half the V-cache of the 7800X3D viable, as the V-cache die cost is halved, and allows further product differentiation), or AMD are taking MCDs with a defect that makes the PHY no longer work and stacking those, because why waste perfectly good cache.

Outside of those two reasons, I don't see why AMD wouldn't just re-use the 64MB cache dies they already make, unless the TSV pitch has changed.
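The arithmetic behind the two totals being debated, as a sketch (the 32MB-per-MCD base figure follows from the 192MB/6-MCD config above; which stacked die AMD actually uses is exactly the open question):

```python
# Total Infinity Cache for a V-cache N31 under the two stacking options
# discussed above. Base capacity per MCD follows from 192MB / 6 MCDs.
N_MCDS  = 6
BASE_MB = 192 // N_MCDS  # 32MB of IC per MCD

reuse_64mb_die = N_MCDS * (BASE_MB + 64)  # stack Zen's existing 64MB die
new_32mb_die   = N_MCDS * (BASE_MB + 32)  # stack a hypothetical 32MB die

print(f"64MB stacks -> {reuse_64mb_die} MB total IC")  # 576MB (original guess)
print(f"32MB stacks -> {new_32mb_die} MB total IC")    # 384MB (Kepler_L2's figure)
```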
 
  • Like
Reactions: Tlh97

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Guys, let's not forget that RT performance will be way more important than when RDNA2 launched.
Let's be honest, RDNA2 is very bad with RT enabled, and if they don't fix it now, they can't get away with only competitive raster performance. We say that Ampere has worse perf/W than RDNA2, but that's only true when RT is disabled. If you enable it, RDNA2 loses badly.

N33 is supposedly comparable to the RX 6950XT. Performance worse than a 3070? Very sad.

I don't think RT will be way more important. Raster performance is going to be king for a very long time. The only place RT will be important is games where the developer purposely doesn't implement "old school" reflections, making the game look like garbage if RT is off (which several nVidia-backed games did).

If you showed people the same game with old-school reflections and with RT, side by side, in most cases they could not tell the difference, other than one running twice as fast.

But, AMD has stated there will be better RT performance, so we will just have to wait and see. But as it is, most people don't care about RT.
 

jpiniero

Lifer
Oct 1, 2010
14,678
5,303
136
My point about the yield is that AMD's not going to cut chips further than they need to in order to have extra SKUs.

On top of that, it is possible AMD will ignore the non-XT parts for a while, but that would leave a lot of large pricing gaps in the stack

Not really.

N31 is the $1,500-2k+ market. N32 is $900-1,200. N33 is $500-700. Stick the RDNA 2 Refresh below that for as long as you can get the MSRP. That's simple and all you need.
 

Kepler_L2

Senior member
Sep 6, 2020
My point about the yield is that AMD's not going to cut chips further than they need to in order to have extra SKUs.



Not really.

N31 is the $1,500-2k+ market. N32 is $900-1,200. N33 is $500-700. Stick the RDNA 2 Refresh below that for as long as you can get the MSRP. That's simple and all you need.
You're vastly overestimating how expensive the RDNA3 lineup is.
 

leoneazzurro

Senior member
Jul 26, 2016
About N31: if the claim of "50+% more perf/W compared to RDNA2" refers to FPS (and it did in the case of the RDNA to RDNA2 comparison), then the performance of N31 can be (very roughly) estimated. If N31 is in the 400W range, for instance, the baseline increase alone would already result in double the performance of a 6900XT. If AMD is sandbagging, it may be even more (2.1-2.2x). Something I wonder is how it will reach this performance. N33 is supposed to be in the same ballpark as the 6900XT, at least in FullHD, yet it has less bandwidth and less IC. The discussion above seems to conclude that VLIW2, while doubling the FP32 resources per CU/WGP, increases throughput per CU/WGP by only 1.3x, 1.4x at best. That means a 4096-ALU N33 would have the per-clock throughput of a 2870-ALU N21, so it would require clocks way above 3GHz (almost 4GHz) to achieve similar performance, without even considering the inferior bandwidth and IC amount. So what's the "secret sauce" here, if these performance claims are true? Is VLIW2 really that relatively inefficient, or is the real throughput per CU higher than 1.3-1.4x per clock? Are there other secrets (caching, compression, a return to Wave64 programming...) we don't know about?
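Making that back-of-envelope math explicit (a sketch using only the assumptions stated in this post, plus rough 6900XT reference figures of ~300W, 5120 ALUs, and a ~2.25GHz game clock):

```python
# The post's estimates, spelled out. All inputs are this thread's
# assumptions or rough public 6900 XT figures, not confirmed specs.
PERF_PER_W_UPLIFT = 1.5    # claimed "50+% more perf/W" vs RDNA2
N31_POWER_W       = 400.0  # assumed N31 board power
REF_POWER_W       = 300.0  # RX 6900 XT board power (rough)

# Performance scales as perf/W x power:
n31_vs_6900xt = PERF_PER_W_UPLIFT * (N31_POWER_W / REF_POWER_W)  # 2.0x

# VLIW2 throughput: 2048 base lanes dual-issuing ~1.4x behave like
# ~2870 RDNA2 ALUs per clock.
n33_equiv_alus = 2048 * 1.4           # ~2867
REF_ALUS, REF_CLOCK_GHZ = 5120, 2.25  # 6900 XT shaders / rough game clock

# Clock N33 would need to match a 6900 XT on raw throughput alone:
needed_ghz = REF_CLOCK_GHZ * REF_ALUS / n33_equiv_alus  # ~4.0 GHz

print(f"N31 estimate: ~{n31_vs_6900xt:.1f}x a 6900 XT at 400W")
print(f"N33 would need ~{needed_ghz:.1f} GHz to match on ALU throughput alone")
```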
 

Saylick

Diamond Member
Sep 10, 2012
N31 is the $1500-2k+ market. N32 is 900-1200. N33 is 500-700. Stick RDNA 2 Refresh below as long you can get the MSRP. That's simple and all you need.
I would never buy N33 for $700 if it just ties the 6900XT @ 1080p.

Those price ranges need a roughly 0.7x factor applied to be realistic.
 

jpiniero

Lifer
Oct 1, 2010
You're vastly overestimating how expensive the RDNA3 lineup is.

Unless you are within AMD's marketing department, you have no idea what AMD is truly planning on charging at this point.

The prices I am estimating are what I think nVidia is planning to charge, mapped onto the competitive AMD part.

The old AMD is over, buddy. Plus TSMC price hikes, etc.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
Not sure if it came up in conversation, but all Zen 4 parts will be coming with a basic IGP included.

So I figure AMD will treat the sub-$200 GPU market as effectively dead at this point, as it's almost completely covered by Intel IGPs, AMD IGPs, and AMD APUs.

I figure we'll see even a 7400 start at $230 or something.
 

Aapje

Golden Member
Mar 21, 2022
I think that the prices Nvidia plans to charge are going to result in such low sales that they will need to lower them.

Companies are not simply able to charge whatever they want, and many signs point to severely worse market conditions for video card makers.
 

Hans Gruber

Platinum Member
Dec 23, 2006
I'll repeat what I said earlier... I hope you guys aren't expecting sub-$300 Zen 4 either.
We're not. I hope AMD and Intel read the Gartner article. Now would be a good time to raise prices... if supply and demand were not real.
 

Saylick

Diamond Member
Sep 10, 2012
But, why would you buy either for 1080p?
Lol, I didn't say I would buy it for 1080p. I said I wouldn't buy it at $700 if it just matched 6900XT performance at 1080p, with the implication that the 6900XT was on par or faster at higher resolutions.
 
  • Like
Reactions: Ajay

jpiniero

Lifer
Oct 1, 2010
We're not. I hope AMD and Intel read the Gartner article. Now would be a good time to raise prices... if supply and demand were not real.

If you read the article, most of that was Chromebooks, which are dominated by 14nm Atom. Desktop and traditional laptop sales were up.
 
  • Like
Reactions: Kaluan and Mopetar