Question Speculation: RDNA3 + CDNA2 Architectures Thread

uzzi38 · Jan 23, 2021

Man I have been dying to make this one for a while now.

First rumours for RDNA3 are here so new thread time!

Just going to start off with this one for now: kopite7kimi on Twitter: "@VideoCardz Ah, I mean a simple mcm design with 10240 cores is not enough. Because the lift from RDNA2 to RDNA3 is much bigger than from RDNA1 to RDNA2. We should expect many big improvements in GFX11. 🤔" / Twitter

dr1337 · May 3, 2022

Glo. said:
Just watch it yoursleves guys.

never

Frenetic Pony · May 3, 2022

jpiniero said:
I still insist that both N33 will be more than the 6750 XT's MSRP. Maybe if they keep the size down it won't be that much more.

That's the benefit of a process shrink and chiplets (probably just 1 from SRAM bu still) though. Shrink the equivalent die by about 80%, cut another 25% off of that by making a cheaper SRAM chip for the big cache thing and you get about double the chips per wafer versus 67XX. Even with packaging costs BOM should be similar between.

Guess it'll all depend on how good Alchemist is and how available, same with Ada, and same with zombie crypto coming back though.

Also the new RDNA3 arrangement fits in with the lowest (current) end chip just kinda looking like a 6nm shrink of 66XX. Wonder if they'll change the arch to RDNA3 or just shrink the current chip and call it good.

Timorous · May 4, 2022

jpiniero said:
I still insist that both N33 will be more than the 6750 XT's MSRP. Maybe if they keep the size down it won't be that much more.

On the one hand die size is going to be similar ish between N33 and ~~N23~~ N22* because 4k shaders + 128 bit bus + 8 (assumed) PCIE 5 lanes + 128MB IC on N6 is probably similar in size to 2.5k shaders + 192 bit bus + 16 PCIE 4 lanes + 96MB IC on N7. N33 may even be a bit larger.

OTOH 8GB of GDDR6 vs 12GB does reduce the memory cost, lower TBP means the PCB and cooling can be a bit cheaper as well and N6 is cheaper than N7 too so overall BOM cost is probably about even at worst and at best is an advantage for the N33 based part. Given that the MSRP of the 6700XT was $480 a $450 7600XT is not out of the question with similar or better margin.

If performance is faster than a 6900XT at 1080p and similar at 1440p (outside of the cases where 8GB is limiting) then $450 would not be terrible value even if it is expensive for a 600 series card historically. I see no way AMD can sell N33 as the 7700XT with just 8GB of VRAM. Performance wise it is there but since 700 class is usually targeting 1440p or entry level 4K 8GB just will not cut it for the upcoming gen of GPUs so that won't fly. Giving it 16GB of VRAM is an option but that adds expense and it reduces max volume so that looks doubtful to me as well.

*Thank you uzzi for point that out.

uzzi38 · May 4, 2022

Timorous said:
On the one hand die size is going to be similar ish between N33 and N23 because 4k shaders + 128 bit bus + 8 (assumed) PCIE 5 lanes + 128MB IC on N6 is probably similar in size to 2.5k shaders + 192 bit bus + 16 PCIE 4 lanes + 96MB IC on N7. N33 may even be a bit larger.

OTOH 8GB of GDDR6 vs 12GB does reduce the memory cost, lower TBP means the PCB and cooling can be a bit cheaper as well and N6 is cheaper than N7 too so overall BOM cost is probably about even at worst and at best is an advantage for the N33 based part. Given that the MSRP of the 6700XT was $480 a $450 7600XT is not out of the question with similar or better margin.

If performance is faster than a 6900XT at 1080p and similar at 1440p (outside of the cases where 8GB is limiting) then $450 would not be terrible value even if it is expensive for a 600 series card historically. I see no way AMD can sell N33 as the 7700XT with just 8GB of VRAM. Performance wise it is there but since 700 class is usually targeting 1440p or entry level 4K 8GB just will not cut it for the upcoming gen of GPUs so that won't fly. Giving it 16GB of VRAM is an option but that adds expense and it reduces max volume so that looks doubtful to me as well.

*N22

But yeah, I'm expecting N33 to be in the 350-400mm^2 range. A little ahead of N22 on die area, but also lower memory and board costs.

Ajay · May 4, 2022

Timorous said:
OTOH 8GB of GDDR6 vs 12GB does reduce the memory cost, lower TBP means the PCB and cooling can be a bit cheaper as well and N6 is cheaper than N7 too so overall BOM cost is probably about even at worst and at best is an advantage for the N33 based part. Given that the MSRP of the 6700XT was $480 a $450 7600XT is not out of the question with similar or better margin.

I seem to recall that AMD is looking to boost it’s gross margins. Assuming that includes GPUs, prices will be going up. Hopefully, the increases will tend towards the incremental (< =10%).

Timorous · May 4, 2022

Ajay said:
I seem to recall that AMD is looking to boost it’s gross margins. Assuming that includes GPUs, prices will be going up. Hopefully, the increases will tend towards the incremental (< =10%).

Make N33 attractively priced on N6 then just don't make many N31 and N32 parts and use most N5 capacity for Zen4 CCDs to use in Genoa / Ryzen etc. That might be a way to boost GMs while still providing enough stock of a part that will probably sell like hot cakes.

Stuka87 · May 4, 2022

Timorous said:
OTOH 8GB of GDDR6 vs 12GB does reduce the memory cost, lower TBP means the PCB and cooling can be a bit cheaper as well and N6 is cheaper than N7 too so overall BOM cost is probably about even at worst and at best is an advantage for the N33 based part. Given that the MSRP of the 6700XT was $480 a $450 7600XT is not out of the question with similar or better margin.

There are shortages of everything. Prices of everything have gone up (raw materials, components, labor). The 7600XT is going to be a $500+ card. There is nothing to suggest a price cut over the current card.

uzzi38 · May 4, 2022

Kepler on Twitter: "Confirmed just a Jebait, RDNA3 does support AV1 encoding. https://t.co/ii8bYHz1wY" / Twitter

Well that didn't last long.

Bigos · May 4, 2022

gfx11 (a.k.a. RDNA3) support in mesa (mostly OpenGL stuff with some VCN as well; the compiler backend lives in LLVM though).

amd: add gfx11 support (!16328) · Merge requests · Mesa / mesa · GitLab

This is most of it. Some bits are missing and they will be added later.

gitlab.freedesktop.org

Not sure if there was any doubt, but AV1 decode seems to be supported at the very least (could not find any info about encode though).

amd: add gfx11 support (!16328) · Merge requests · Mesa / mesa · GitLab

This is most of it. Some bits are missing and they will be added later.

gitlab.freedesktop.org

xilli_fiberbit · May 4, 2022

What's mean that :

https://twitter.com/x/status/1521954389942054912

?

Saylick · May 4, 2022

xilli_fiberbit said:
What's mean that :
https://twitter.com/x/status/1521954389942054912
?

Perhaps someone can explain better than me, but it appears that previous RDNA1 and RDNA2 had to decompress data at certain stages of memory subsystem before it could be used, but it sounds like in RDNA3 the data does not need to be decompressed at all regardless of where it exists, hence @Kepler_L2 's comment about "end-to-end DCC".

Edit: To help others better understand what's going on here, here's AMD's own article on Delta Color Compression (DCC). Looks like RDNA3 completely does away with needing to decompress as all the structures are rebuilt to work on compressed data, which helps AMD get away with a smaller memory bus.

Saylick · May 4, 2022

Anyone here can help explain what might be going on here?

https://twitter.com/x/status/1521917087496687616

lobz · May 5, 2022

Glo. said:
Just watch it yoursleves guys.

At this point I don't even know what to say.

Just watch it yourselves..... dude it's supposed to be a written forum and not a freaking videoconference. I can read through 2 pages of interesting arguments here instead of listening to Paul (whom I otherwise quite like) trying and failing to finish a sentence under 5 minutes.

Nothing personal, you're not the only one doing this, but my cup has run over just now 😂

Bigos · May 5, 2022

Saylick said:
Anyone here can help explain what might be going on here?

Exactly what the comments say.

NGG is the new hardware shader stage introduced in RDNA that replaces the post-tessellation pre-rasterization stages (vertex and geometry). It also directly implements mesh shaders. RDNA1/2 supports both the legacy and the new stage but RDNA3 removes the former. This means that AMD had to make NGG work good with all of the vertex processing features, like geometry shading and transform feedback (stream-out in DX nomenclature). Though notable is that transform feedback is currently unimplemented for gfx11 in mesa (they probably left it for later).

More details here: https://gitlab.freedesktop.org/mesa...h-software-stage-runs-on-which-hardware-stage

Another notable change is more versatile DCC support. Now all hardware blocks understand DCC so decompression is usually not required (though not always, imagine Intel iGPU + AMD dGPU situation where display is driven by the iGPU - you need to decompress the frame before sending it to display since Intel does not understand DCC).

DisEnchantment · May 5, 2022

Quite interesting commit, advanced PSR

For additional power savings, PSR SU (also referred to as PSR2) can be
enabled on eDP panels with PSR SU support.

PSR2 saves more power compared to PSR1 by allowing more opportunities
for the display hardware to be shut down. In comparison to PSR1, Shut
down can now occur in-between frames, as well as in display regions
where there is no visible update. In otherwords, it allows for some
display hw components to be enabled only for a **selectively updated**
region of the visible display. Hence PSR SU.

AMD X.Org drivers - Patchwork

patchwork.freedesktop.org

Mopetar · May 5, 2022

lobz said:
I can read through 2 pages of interesting arguments here instead of listening to Paul (whom I otherwise quite like) trying and failing to finish a sentence under 5 minutes.

Paul is really like the kid who realized that he needed a 10 page paper and is desperately trying to stretch the 4 pages he has to fit.

It's okay though because he has a soothing voice, so I can always put one of his videos on to help me drift off to sleep before it's finished. If it's important information someone here will summarize it the next day.

Saylick · May 5, 2022

Mopetar said:
Paul is really like the kid who realized that he needed a 10 page paper and is desperately trying to stretch the 4 pages he has to fit.

It's okay though because he has a soothing voice, so I can always put one of his videos on to help me drift off to sleep before it's finished. If it's important information someone here will summarize it the next day.

lol, especially when he is compelled to always say "Nice

" whenever he mentions the 6900XT.

Aapje · May 5, 2022

lobz said:
Just watch it yourselves..... dude it's supposed to be a written forum and not a freaking videoconference. I can read through 2 pages of interesting arguments here instead of listening to Paul (whom I otherwise quite like) trying and failing to finish a sentence under 5 minutes.

Thank God for video speedup.

DeathReborn · May 6, 2022

Aapje said:
Thank God for video speedup.

Thank god for seeing the logo and going "nah, I'm good thanks". Red, Green Blue... not gonna watch it.

DisEnchantment · May 11, 2022

RDNA3 has dual issue wave32 vector op support.

⚙ D125261 [AMDGPU] gfx11 subtarget features & early tests

reviews.llvm.org

Legacy geometry engine is gone.

soresu · May 13, 2022

DisEnchantment said:
RDNA3 has dual issue wave32 vector op support.

Is that a tacit affirmation of the doubled shader logic per WGP?

Olikan · May 13, 2022

soresu said:
Is that a tacit affirmation of the doubled shader logic per WGP?

Nope, a wavefront will use as many shaders it can...

I can see 2 uses of it:
1- Similar to hypertreading in CPUs, a way to use all resources avaible when possible. The drivers would need to be aware of it, will have diminushing returns If devs fine tune for it;

2- to Double the shaders performance, Remember Fermi? This time not inceasing clocks, but alot of silicon... AMD would need to double the resources to feed them. Registry pressure would be insane

Saylick · May 13, 2022

Looks like the rumors for N31 are corralling on the new theory that it's a single N5 GCD and all the Infinity Cache and memory controllers are on N6 MCDs (6 total). Still MCM architecture, probably uses fan out bridges, but not 3D stacked. Still, with all of the logic on N5 and all of the IO and cache on N6, it appears to be very cost optimized. The N5 GCD likely is only around 400mm2 knowing that N21 was 520mm2 and around half of that was shaders. 520mm2 / 2 * 2.4x shader count / 2x node shrink gives you around 315mm2. Add on the PHY for the fan out bridges and extra die space for better RT and architectural improvements, 400mm2 seems to be within the ballpark.

Edit: Just adding more flavor from various leakers:

- RedGamingTech:

- KittyYYuko:

https://twitter.com/x/status/1525013156384604160

Translation:

Many people see the Navi 31's performance goal as three times the Navi 21's, but that's a mistake, and some say it's four times the Navi 21's.
I would like to reiterate that Navi31 is always designed to achieve 6 times the performance of Navi10.

- Kopite7Kimi:

https://twitter.com/x/status/1523480266437988352

Frenetic Pony · May 13, 2022

Saylick said:
Looks like the rumors for N31 are corralling on the new theory that it's a single N5 GCD and all the Infinity Cache and memory controllers are on N6 MCDs (6 total). Still MCM architecture, probably uses fan out bridges, but not 3D stacked. Still, with all of the logic on N5 and all of the IO and cache on N6, it appears to be very cost optimized. The N5 GCD likely is only around 400mm2 knowing that N21 was 520mm2 and around half of that was shaders. 520mm2 / 2 * 2.4x shader count / 2x node shrink gives you around 315mm2. Add on the PHY for the fan out bridges and extra die space for better RT and architectural improvements, 400mm2 seems to be within the ballpark.

There's probably other shrinks elsewhere. Missing backwards compat for deprecated features like the old geo pipeline, more SIMD32 versus work dispatch, etc. Arch improvements so far seem more about fixing old problems like needing to decompress in certain stages; and the improved RT pipe that was patented a while ago used the texturing units, meaning any expansion there is probably minimal. Bridge is also pretty minimal afaik, could see the chip end up in the mid 300mm range.

With the price hike from TSMC and extra dies a single 128 simd32 compute chip alongside bus and etc. probably costs a bit more to produce than the 6700xt. Thus my guess of $500 for a full chip and $400 for a partially disabled/binned (non "xt") version. Of course if things go bad those prices could rise by $50-60 or so

Hopefully though it'll be a bit smaller, and that price increase won't happen. It'd be real nice to see 6800(non xt) or a bit better at $400.

Edit- really though I don't understand the rumored specs entirely. The most logical compute tile should be like, 96simds (48 RDNA2 compute units). The cost would be tiny, probably as low as a Navi 23 (6600) board for each compute die. You could pair it with 128bit/256bit/etc. bus and SRAM as you scale up to 1/2/3/4 compute dies. A single one would be theoretically 20% faster than a 6750xt at minimum, and more for raytracing. That should be the sweet spot for cost/performance right? Eg 12 teraflops/8gb for $329; 15tf/16gb $400; 24tf/16gb $700; 30tf/16gb $1000; 36 tf/12gb $1250; 44tf/24gb $1600; 58tf/16gb $2500.

Aapje · May 13, 2022

Yeah, RDNA3 seems to be heavily optimized for cost across the range, which also makes sense if it is to be the basis for console refreshes like the PS5 pro. In general, AMD may create an extremely compelling product with relatively low power consumption, relatively low cost and excellent performance.

Question Speculation: RDNA3 + CDNA2 Architectures Thread

Platinum Member

Senior member

Senior member

Golden Member

Platinum Member

Lifer

Golden Member

Diamond Member

Platinum Member

Member

Member

Diamond Member

Diamond Member

Platinum Member

Member

Golden Member

Diamond Member

Diamond Member

Golden Member

Platinum Member

Golden Member

Platinum Member

Platinum Member

Diamond Member

Senior member

Golden Member