Question Speculation: RDNA3 + CDNA2 Architectures Thread

Page 134

uzzi38

Platinum Member
Oct 16, 2019
2,626
5,927
146

Heartbreaker

Diamond Member
Apr 3, 2006
4,227
5,228
136
Not a single question goes beyond what is already known.
Pretty useless one.

Sorry it bored you. For me it was very cool seeing the engineer who basically convinced AMD to do chiplets in the first place talk about it.

Have we seen those slides before? I haven't seen them. Link if we have?

This is the first time I have seen any solid indication of how much chiplets save AMD: a monolithic 16-core Ryzen would cost them roughly 2x as much to build as the chiplet one. These savings are MUCH higher than I would have expected, and he said the figure came from their internal yield models, which he described as very accurate. So this isn't some vague marketing slide. The same slide also showed the recent increase in cost per area on smaller processes.

Plus the slide on the different scaling of memory, analog, and logic was interesting, and it explains why the memory controller chiplets work so well.

Also very interesting is what a massive effort it is to port memory controllers/cache to a new node. A lot of the time, people assume porting the same stuff to a new node is trivial. This makes it clear that it's the total opposite, and that just putting this stuff in a chiplet on the previous node saves them a massive amount of work, and the kind of work no one really likes. Engineers want to work on new architecture logic, not port memory controllers.

I also wondered if they were going to use an expensive silicon interposer for the chiplets, but he indicated they are using some much less expensive plastic tech.

In short I got a lot out of it. I hope some others did as well, even if you got nothing.
 

maddie

Diamond Member
Jul 18, 2010
4,740
4,674
136
Sorry it bored you. For me it was very cool seeing the engineer who basically convinced AMD to do chiplets in the first place talk about it.
When Zen originally arrived, there were extensive discussions on these fora about the cost savings and the significant binning benefits of chiplets. We found that the cost benefit increased with increased core counts, the opposite of the norm for the industry.

Maybe you can do a search.
 

Timorous

Golden Member
Oct 27, 2008
1,610
2,764
136
When Zen originally arrived, there were extensive discussions on these fora about the cost savings and the significant binning benefits of chiplets. We found that the cost benefit increased with increased core counts, the opposite of the norm for the industry.

Maybe you can do a search.

Sure, but did anybody expect the cost to be less than half for a 16-core part using chiplets vs monolithic, or that an 8-core monolithic part would cost about the same as a 16-core chiplet part?
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,227
5,228
136
Sure, but did anybody expect the cost to be less than half for a 16-core part using chiplets vs monolithic, or that an 8-core monolithic part would cost about the same as a 16-core chiplet part?

Exactly. Everyone talked about, and theorycrafted, savings. This is the first time I've seen them essentially quantified by AMD, and IMO they are much larger than expected.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Now that they have these MCDs, it sounds like they can greatly reduce the time and cost of MCDs across generations, and only do a major update when it's really needed.

Yup. The MCDs don't gain much from node shrinks. So they can stay on the cheaper node until there is a real reason to move them forward. Really saves a lot of time for things like respins, or porting the GCD to a new process.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,602
5,788
136

Git:
@@ -2220,6 +2220,7 @@  int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
     case IP_VERSION(10, 3, 6):
     case IP_VERSION(10, 3, 7):
     case IP_VERSION(11, 0, 1):
+    case IP_VERSION(11, 0, 4):
         adev->flags |= AMD_IS_APU;
         break;
New GPU added upstream
GC 11.0.4 --> Is this Strix Point? Or Van Gogh Successor?

Strange thing: GC 11.0.1 is indicated as an APU (PHX), but I thought it was N32.
 

Kepler_L2

Senior member
Sep 6, 2020
331
1,162
106

New GPU added upstream
GC 11.0.4 --> Is this Strix Point? Or Van Gogh Successor?
Phoenix 2
 

PJVol

Senior member
May 25, 2020
534
447
106
In short I got a lot out of it. I hope some others did as well, even if you got nothing.
Sorry if it was a discovery for you, but I honestly think for most of the educated people here it wasn't, at least for those building on knowledge from more than just tech-press slides.
There are many other sources of info, such as patents, ISSCC and VLSID publications, etc.
Anyways, I'm glad you found the interview worthwhile.
 

PJVol

Senior member
May 25, 2020
534
447
106
New GPU added upstream
Don't know if this is old, but anyway... just curious if an 8-SE ASIC was ever planned.
/**
* GFX11 could support more than 4 SEs, while the bitmap
* in cu_info struct is 4x4 and ioctl interface struct
* drm_amdgpu_info_device should keep stable.
* So we use last two columns of bitmap to store cu mask for
* SEs 4 to 7, the layout of the bitmap is as below:
* SE0: {SH0,SH1} --> {bitmap[0][0], bitmap[0][1]}
* SE1: {SH0,SH1} --> {bitmap[1][0], bitmap[1][1]}
* SE2: {SH0,SH1} --> {bitmap[2][0], bitmap[2][1]}
* SE3: {SH0,SH1} --> {bitmap[3][0], bitmap[3][1]}
* SE4: {SH0,SH1} --> {bitmap[0][2], bitmap[0][3]}
* SE5: {SH0,SH1} --> {bitmap[1][2], bitmap[1][3]}
* SE6: {SH0,SH1} --> {bitmap[2][2], bitmap[2][3]}
* SE7: {SH0,SH1} --> {bitmap[3][2], bitmap[3][3]}
*/
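The SE-to-bitmap layout that comment describes reduces to a small index helper. This is a hypothetical illustration of the mapping, not actual kernel code:

```c
/* Map a GFX11 (se, sh) pair onto the 4x4 cu_info bitmap described above:
   SEs 0-3 use columns 0-1; SEs 4-7 reuse rows 0-3 in columns 2-3. */
static unsigned int cu_bitmap_get(unsigned int bitmap[4][4], int se, int sh)
{
    int row = se % 4;                  /* SE4..SE7 wrap back to rows 0..3 */
    int col = (se < 4) ? sh : 2 + sh;  /* upper SEs spill into columns 2..3 */
    return bitmap[row][col];
}
```

So SE4/SH0 lands in bitmap[0][2] and SE7/SH1 in bitmap[3][3], exactly as the comment's table lists, without breaking the stable drm_amdgpu_info_device layout.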

 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
To me it sounded like they might reuse the MCD design for RDNA 4.

What it means is they won't have to port the design to a new node. Porting a design is WAY more work than updating an existing design for the same node. They can very easily make tweaks to the current design to improve it while keeping it on the same node.

So not only are the MCDs being manufactured on a really cheap, reliable process, but there is way less overhead when it comes to any required changes.
 

Aapje

Golden Member
Mar 21, 2022
1,381
1,863
106
Yeah, and because they can use the same design for longer, the design cost per chip drops as well. So the chip is cheap to produce and cheap to design.
 

Aapje

Golden Member
Mar 21, 2022
1,381
1,863
106
Pricing entirely depends on clocking. With the 7900 XT at $899, and a high-clocking N32 coming rather close to it in performance, I think even $699 is wishful thinking for a 7800 XT. That would be a $200 gap, an almost 30% jump. The 7900 XT is already unattractive right now, and a 7800 XT with, say, 15% less performance for 30% less money would make it useless. Therefore I expect it to be priced higher, or to not really hit those 3 GHz clocks. I think the performance gap will need to be at least 20% for a $699 price.

But they will want to make the 7900 XT quite unattractive, because they will have great yields with such a small GCD. So it makes the most sense to give the cut dies poor perf/$, so only people who really have a specific budget buy them. That way, they don't have to sell fully functioning chips as a lower tier. So it's better for AMD if the 7900 XTX (full N31), 7800 XT (full N32) and 7600 XT (full N33) are the big sellers.
 

moinmoin

Diamond Member
Jun 1, 2017
4,948
7,656
136
GN talk with an AMD engineer about chiplets:
One part that's news to me is that Naffziger was pushing for chiplets back in 2016. That seemed late to me, but I guess the Zeppelin MCM doesn't count as chiplets yet. Still, this claim means AMD made the decision to disaggregate compute and I/O only sometime in 2016 and launched it with Zen 2 in 2019 (and has kept the formula essentially unchanged since). That's still quite a speedy TTM for previously unproven tech.
 


Kepler_L2

Senior member
Sep 6, 2020
331
1,162
106
What it means is they won't have to port the design to a new node. Porting a design is WAY more work than updating an existing design for the same node. They can very easily make tweaks to the current design to improve it while keeping it on the same node.

So not only are the MCDs being manufactured on a really cheap, reliable process, but there is way less overhead when it comes to any required changes.
I'm not sure if that's still the plan, but the original roadmap for RDNA4 was 3nm GCD + 5nm MCD.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
6,808
7,162
136
I'm not sure if that's still the plan, but the original roadmap for RDNA4 was 3nm GCD + 5nm MCD.

-Wonder if this will allow for an RDNA3+ gen where we get a tweaked GCD and an unchanged MCD (maybe more of them?)

The only time the MCD would really need to change is when GDDR7 shows up, at which point it might make sense to do all the work to get a shrink in too. Alternatively, TSMC might decide to sunset the node.

Maybe AMD doesn't see GDDR7 meeting their timelines and will just stick with the existing MCD for RDNA4.

Actually, it would be wild to see AMD tick-tock their GPUs with a new GCD for a new gen and a new MCD for the mid-gen refresh.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
That will be up to each individual board maker.

I was referring to the reference card. When it comes to third parties, they nearly always throw efficiency out the window.

-Wonder if this will allow for an RDNA3+ gen where we get a tweaked GCD and unchanged MCD (maybe more of them?)

My current thought is this: no, they will not touch the MCDs; yes, they will tweak the GCD. For the *50 versions we will see higher clocks, similar to last gen. We won't see significant changes unless AMD moves N32 chips to an N31 die, which I don't think will happen. However, if there is an actual clock-frequency "bug", we will see some rather large uplifts in frequency without a similarly large uplift in power consumption, or maybe WITH a large uplift in power consumption. Early rumors for Navi31 claimed a 400-450W TGP. Maybe THAT part is the *50 part. I'm not going to bother speculating, TBH.
 

Aapje

Golden Member
Mar 21, 2022
1,381
1,863
106
That'd be a surprise. I'd honestly expect something like N4 GCD + N6 MCD for midrange and N3E GCD + N6 MCD for the high end of RDNA 4. Not sure if we'll see dual GCDs but it would be nice to see that in the next generation.

Same here. They may switch to a new node when they switch to GDDR7, but even then it is questionable. Why switch to N5 unless those wafers really drop in price?
 

Kaluan

Senior member
Jan 4, 2022
500
1,071
96
Boy, I'm getting so tired of so-called leaks lately. Recently we had "only 6 and 8 core parts get V-Cache", then a few days later "actually, the 7950X3D will be a thing". Now we've had "custom 7900 XT/XTX only 1-2 weeks after launch?", followed a day later by "actually, AIBs will have custom cards ready on the 13th".

Bleah.

Either way, I'm keeping my fingers crossed for both reference and custom SKU reviews going up when the NDA lifts.
 