Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
Mar 3, 2017
1,567
5,553
136

With the GFX940 patches in full swing since the first week of March, it is looking like MI300 is not far off!
Usually AMD takes around three quarters to get support into LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But judging by the flurry of code in LLVM, there are a lot of commits. Maybe the US Government is starting to prepare the software environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's, for example).

See here for the GFX940-specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before things get merged to GitHub), but I am not going to link AMD employees.
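For anyone who wants to skim those commits themselves, a rough sketch (the local "llvm-project" path is an assumption; point it at your own clone of https://github.com/llvm/llvm-project):

```python
# Rough sketch: grep a local llvm-project checkout's history for GFX940
# commits. The "llvm-project" directory is an assumption; clone
# https://github.com/llvm/llvm-project and adjust the path as needed.
import subprocess

result = subprocess.run(
    ["git", "-C", "llvm-project", "log", "--oneline", "-i", "--grep=gfx940"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```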

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5.0 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it :grimacing:

This is nuts; the MI100/200/300 cadence is impressive.


Previous thread on CDNA2 and RDNA3 here

 

Ajay

Lifer
Jan 8, 2001
14,826
7,436
136
From a hardware POV, it is the same form factor - OAM.
Okay, so I looked it up. OAM is a physical and electrical standard that supports up to 112 Gbps SerDes (PAM4). So it's the same PHY for both vendors; apparently the protocols are different. That's interesting.
 

Ajay

Lifer
Jan 8, 2001
14,826
7,436
136
"the lack of universal support is your biggest detriment to widespread ROCm adoption"

Yeah, the guys running supercomputers will do whatever they have to in order to get the highest performance - everybody else wants plug-in software components, as is the case with CUDA. Honestly, providing ROCm on discrete GPUs under Windows would make for great development machines for students (as Nvidia did). Last I checked, ROCm was only available under Linux. Please correct me if I'm wrong.
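For anyone curious, here is a minimal sketch of how one might check whether a ROCm-enabled PyTorch build sees an AMD GPU (assuming a ROCm build of PyTorch is installed; on those builds the GPU is exposed through the torch.cuda namespace):

```python
# Rough sanity check for a ROCm-enabled PyTorch install (Linux today);
# torch.version.hip is None on CUDA/CPU-only builds.
import torch

print("HIP runtime:", torch.version.hip)
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```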
 

Ajay

Lifer
Jan 8, 2001
14,826
7,436
136
Also, can we please have RDNA4 this year? (I know, it isn't happening.) AMD just needs to up the ante, and I'm annoyed that they fell short since I'll likely be buying an AMD GPU this year. They must have seen the problems they faced with RDNA3 when they started getting samples back from TSMC - but it was too late to make any significant changes at that point.
 
  • Like
Reactions: Tlh97 and coercitiv

tajoh111

Senior member
Mar 28, 2005
289
305
136
Also, can we please have RDNA4 this year? (I know, it isn't happening.) AMD just needs to up the ante, and I'm annoyed that they fell short since I'll likely be buying an AMD GPU this year. They must have seen the problems they faced with RDNA3 when they started getting samples back from TSMC - but it was too late to make any significant changes at that point.

I think if anything, expect delays with RDNA4. Just as is implied with Nvidia, consumer graphics is likely to take a back seat to the data center. The reasons for AMD to give up on the consumer market are much more pronounced than they are for Nvidia.

The consumer and gaming market is generally a poor market to be in unless you're the dominant player: high costs with large silicon, and a market that is generally cheap except at the high end.

Add in stagnant revenue when mining isn't a factor, and it's really not worth it for AMD considering the resources they need to spend to keep up with Nvidia.

Take out the consoles and AMD might be making a billion dollars in revenue annually from discrete GPUs, which is basically zero dollars in terms of net profit once you add the cost of production and the R&D. Maybe a loss at this point. Remember, Sony is AMD's biggest customer at 16% of revenue, with MS also being a significant portion. Remove that from AMD's gaming revenue and AMD is making two to three hundred million a quarter from gaming.
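As a back-of-the-envelope sketch of that estimate (every input below is a rough figure from this discussion, and the Microsoft share in particular is just an assumption):

```python
# Back-of-envelope version of the estimate above; every input is a rough
# figure, and the Microsoft share in particular is just a guess.
gaming_rev_q = 1.8e9   # AMD "Gaming" segment per quarter (consoles + dGPU)
amd_total_q  = 5.5e9   # rough total AMD quarterly revenue (assumption)
sony_share   = 0.16    # Sony, reported as ~16% of total AMD revenue
msft_share   = 0.10    # Microsoft, "also significant" - pure assumption

console_rev_q = (sony_share + msft_share) * amd_total_q
dgpu_rev_q    = gaming_rev_q - console_rev_q

print(f"Implied console revenue: ${console_rev_q / 1e9:.2f}B / quarter")
print(f"Implied dGPU revenue:    ${dgpu_rev_q / 1e9:.2f}B / quarter")
# -> roughly $0.3-0.4B per quarter left for discrete GPUs under these guesses
```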

Now AMD has even more reason to invest in CDNA, with the AI market growing while the increasingly entitled gaming market is largely stagnant. This is going to pull resources away from RDNA development, which was already strained, particularly on the software side. Look at how long FSR3 is taking to come out, and look at how RDNA2 driver development slowed down right after RDNA3 launched.

So if any market is going to get R&D spending cuts, it's AMD consumer graphics, which is useless for the data center market. At least Nvidia has tensor cores in their consumer GPUs, so the product can pivot into AI or vice versa. A pure raster card is going to be difficult to justify going forward, taking into account AMD's limited resources and R&D spending on CPUs.
 

Saylick

Platinum Member
Sep 10, 2012
2,869
5,663
136
I think if anything, expect delays with RDNA4. Just as is implied with Nvidia, consumer graphics is likely to take a back seat to the data center. The reasons for AMD to give up on the consumer market are much more pronounced than they are for Nvidia.

The consumer and gaming market is generally a poor market to be in unless you're the dominant player: high costs with large silicon, and a market that is generally cheap except at the high end.

Add in stagnant revenue when mining isn't a factor, and it's really not worth it for AMD considering the resources they need to spend to keep up with Nvidia.

Take out the consoles and AMD might be making a billion dollars in revenue annually from discrete GPUs, which is basically zero dollars in terms of net profit once you add the cost of production and the R&D. Maybe a loss at this point. Remember, Sony is AMD's biggest customer at 16% of revenue, with MS also being a significant portion. Remove that from AMD's gaming revenue and AMD is making two to three hundred million a quarter from gaming.

Now AMD has even more reason to invest in CDNA, with the AI market growing while the increasingly entitled gaming market is largely stagnant. This is going to pull resources away from RDNA development, which was already strained, particularly on the software side. Look at how long FSR3 is taking to come out, and look at how RDNA2 driver development slowed down right after RDNA3 launched.

So if any market is going to get R&D spending cuts, it's AMD consumer graphics, which is useless for the data center market. At least Nvidia has tensor cores in their consumer GPUs, so the product can pivot into AI or vice versa. A pure raster card is going to be difficult to justify going forward, taking into account AMD's limited resources and R&D spending on CPUs.
You make a good argument.

That effect you describe, where market leaders have an advantage, is similar to TSMC vs. Samsung vs. Intel, and it exists because R&D costs in the semiconductor space are so high. The only way to recoup your initial investment is to be the dominant player. This results in dominant companies being able to maintain their R&D budget, which lets them create better products, which lets them maintain their market leadership. It's a positive feedback loop for market leaders and a negative feedback loop for those who aren't. It ends up crowding out start-ups because the barrier to entry becomes almost insurmountable. At least with Intel, the US government has a vested interest, so they'll get whatever financial backing it takes to regain market leadership, but with AMD vs. Nvidia in the consumer GPU market, such backing will never exist.

I've mentioned this in the past, but another example of this is FSR vs. DLSS. I've seen plenty of people moan, "Oh, if AMD GPUs had special features that differentiated them from Nvidia, I'd have more reason to buy them. FSR is basically just a poor man's DLSS." What they don't realize is that FSR is open source for the exact same reason I described in the paragraph above: only market leaders can develop proprietary features that are guaranteed wide adoption, because they have a wide user base. If AMD walled off FSR to AMD cards only, even fewer developers would implement it, and when a feature is not adopted, it dies off. This is yet another reason why Nvidia is able to maintain a software advantage over AMD. I honestly don't know how they could combat it unless Nvidia dropped the ball for a few generations in a row, but I don't see that happening.

Nvidia hires the best software developers because software developers want to work for Nvidia. They also have the resources to basically fund university researchers who then end up working at Nvidia after they complete their doctorates. Again, it's a positive feedback loop for them. It's probably not considered anti-competitive behavior, but it really is cutthroat. Success begets success.
 
  • Like
Reactions: Lodix and Tlh97

Ajay

Lifer
Jan 8, 2001
14,826
7,436
136
Take out the consoles and AMD might be making a billion dollars in revenue annually from discrete GPUs, which is basically zero dollars in terms of net profit once you add the cost of production and the R&D. Maybe a loss at this point. Remember, Sony is AMD's biggest customer at 16% of revenue, with MS also being a significant portion. Remove that from AMD's gaming revenue and AMD is making two to three hundred million a quarter from gaming.
It's so hard to tell what discrete graphics brings in. AMD lumps it under Gaming revenue, which includes consoles. That's a total of $1.8B US per quarter (+/-). Wish it were broken down further. I guess what really matters is how long this slump continues. If this is the 'new normal', well, we are going to see few dies and a small product stack, I would think - and AMD will just leverage developments for consoles and iGPUs to keep up a smaller dGPU presence. Or they are more clever than that and have a better plan.

If you have any specific slides, facts or figures that break this down more, I'd love to see them.
 

maddie

Diamond Member
Jul 18, 2010
4,661
4,498
136
I think if anything, expect delays with RDNA4. Just as is implied with Nvidia, consumer graphics is likely to take a back seat to the data center. The reasons for AMD to give up on the consumer market are much more pronounced than they are for Nvidia.

The consumer and gaming market is generally a poor market to be in unless you're the dominant player: high costs with large silicon, and a market that is generally cheap except at the high end.

Add in stagnant revenue when mining isn't a factor, and it's really not worth it for AMD considering the resources they need to spend to keep up with Nvidia.

Take out the consoles and AMD might be making a billion dollars in revenue annually from discrete GPUs, which is basically zero dollars in terms of net profit once you add the cost of production and the R&D. Maybe a loss at this point. Remember, Sony is AMD's biggest customer at 16% of revenue, with MS also being a significant portion. Remove that from AMD's gaming revenue and AMD is making two to three hundred million a quarter from gaming.

Now AMD has even more reason to invest in CDNA, with the AI market growing while the increasingly entitled gaming market is largely stagnant. This is going to pull resources away from RDNA development, which was already strained, particularly on the software side. Look at how long FSR3 is taking to come out, and look at how RDNA2 driver development slowed down right after RDNA3 launched.

So if any market is going to get R&D spending cuts, it's AMD consumer graphics, which is useless for the data center market. At least Nvidia has tensor cores in their consumer GPUs, so the product can pivot into AI or vice versa. A pure raster card is going to be difficult to justify going forward, taking into account AMD's limited resources and R&D spending on CPUs.
I disagree. If the discrete desktop card market were the entirety of AMD, then yes, but we have console gaming, desktop gaming and now the growing handheld APU market, all sharing the same GPU tech. The IP R&D costs will be the same, and the additional cost to produce a discrete GPU is a marginal cost. That is the true cost of keeping up with Nvidia in the desktop market, not the overall cost, which you have to spend anyhow.
 

moinmoin

Diamond Member
Jun 1, 2017
4,720
7,237
136
I disagree. If the discrete desktop card market were the entirety of AMD, then yes, but we have console gaming, desktop gaming and now the growing handheld APU market, all sharing the same GPU tech. The IP R&D costs will be the same, and the additional cost to produce a discrete GPU is a marginal cost. That is the true cost of keeping up with Nvidia in the desktop market, not the overall cost, which you have to spend anyhow.
Indeed. Add to that that AMD's best gen so far, RDNA2, was essentially paid for by the console manufacturers. QA-wise that apparently helped as well, considering how RDNA3 turned out.
 
Jul 27, 2020
13,300
7,894
106
AMD just needs to up the ante, and I'm annoyed that they fell short since I'll likely be buying an AMD GPU this year.
Look what's coming: https://portal.eaeunion.org/sites/o...7&ListId=d84d16d7-2cc9-4cff-a13b-530f96889dbc


 

Ajay

Lifer
Jan 8, 2001
14,826
7,436
136
Look what's coming: https://portal.eaeunion.org/sites/o...7&ListId=d84d16d7-2cc9-4cff-a13b-530f96889dbc


Well, hopefully performance matches the 4070 (in raster), except with that precious 16GB, so I don't have to worry that a AAA gaming title I buy in three years will only run at HQ settings at 1080p.
 

Kepler_L2

Senior member
Sep 6, 2020
268
762
106
It's so hard to tell what discrete graphics brings in. AMD lumps it under Gaming revenue, which includes consoles. That's a total of $1.8B US per quarter (+/-). Wish it were broken down further. I guess what really matters is how long this slump continues. If this is the 'new normal', well, we are going to see few dies and a small product stack, I would think - and AMD will just leverage developments for consoles and iGPUs to keep up a smaller dGPU presence. Or they are more clever than that and have a better plan.

If you have any specific slides, facts or figures that break this down more, I'd love to see them.
RDNA4 should be the start of a truly modular approach to GPUs, so I don't think the product stack will get smaller.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,014
2,414
106
RDNA4 should be the start of a truly modular approach to GPUs, so I don't think the product stack will get smaller.
I don't think I am very excited about that.

N31 turned out to be a big transistor eater.
I calculated a monolithic RDNA3 GPU with N21's specs. Link
It had only 33 billion transistors, while N31 had 75% more.
For that increase, you got only 50% more SEs, 20% more CUs (shaders, TMUs), 50% more ROPs, 50% more bandwidth and -25% Infinity Cache, which translates to ~20-25% more performance.
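For reference, a quick sanity check of those percentages against the public Navi 21 / Navi 31 unit counts (the counts themselves are not in the post above, so treat this as a rough cross-check):

```python
# Sanity check of the percentages quoted above, using the public
# Navi 21 / Navi 31 unit counts (these counts are not from the post itself).
n21 = {"SE": 4, "CU": 80, "ROP": 128, "bus_bits": 256, "IC_MB": 128}
n31 = {"SE": 6, "CU": 96, "ROP": 192, "bus_bits": 384, "IC_MB": 96}

for key in n21:
    change = (n31[key] / n21[key] - 1) * 100
    print(f"{key}: {change:+.0f}%")
# SE +50%, CU +20%, ROP +50%, bus width +50%, Infinity Cache -25%
```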
 

Joe NYC

Golden Member
Jun 26, 2021
1,821
2,104
106
I don't think I am very excited about that.

N31 turned out to be a big transistor eater.
I calculated a monolithic RDNA3 GPU with N21's specs. Link
It had only 33 billion transistors, while N31 had 75% more.
For that increase, you got only 50% more SEs, 20% more CUs (shaders, TMUs), 50% more ROPs, 50% more bandwidth and -25% Infinity Cache, which translates to ~20-25% more performance.
RDNA4 will likely be modular in a way that more resembles MI300 (CDNA3) than Navi 31/32 (RDNA3).

Here is a picture from earlier in this thread, from @DisEnchantment, from a patent:

[patent figure attachment]


The base die, similar to MI300's, will likely have 2 memory controllers and (more) cache.

Currently, the MCD is 37.5 mm2. The new base die will have 2 of them, plus some more cache (easily 2x that of RDNA3), all on N6, for a total die size of ~100-110 mm2.

Then the GCD (SED?), of about 100 mm2, will be stacked on top of this base die. The top and bottom dies will form a unit equivalent to a Navi 43 with 8 GB.

Additional units like this will be joined by some kind of bridge, possibly also a 3D-stacked silicon bridge (704x in the pic), for very high bandwidth and low power overhead.

Adding more units is how performance would scale:
1 unit - Navi 43, 8 GB
2 units - Navi 42, 16 GB
3 units - Navi 41, 24 GB
4 units - Navi 40, 32 GB
Half steps would also be possible with only 1 of 2 memory chips populated - possibly 4 more products. As @Kepler_L2 said, there could be more products with the modular approach.

There would be only 2 main types of dies to tape out:
- N6 base die
- N4(?) compute die
- possibly (tiny) silicon bridge die
- possibly (small) IO die

With this approach, AMD will have fewer dies, less validation, and more manufacturing flexibility, as the component dies can become, at the point of final assembly, different products.

If the dies are ~100 mm2, yields will be extremely high. That's really the sweet spot AMD is aiming for with all of its dies: Zen 3, Zen 4 and Zen 4c are all in the 70-100 mm2 range, and the MI300 GPU die is probably also in the same range, 100-125 mm2.
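A small sketch of how such a stack would scale, using the rough die sizes and 8 GB per unit from above (nothing here is confirmed; it just spells out the speculation):

```python
# Rough sketch of the speculated building-block scaling described above;
# all sizes and capacities are the thread's guesses, nothing official.
BASE_DIE_MM2     = 105   # N6 base die: 2 memory controllers + cache (~100-110 mm2)
COMPUTE_DIE_MM2  = 100   # GCD/SED stacked on top (~100 mm2, N4?)
VRAM_PER_UNIT_GB = 8

skus = {1: "Navi 43", 2: "Navi 42", 3: "Navi 41", 4: "Navi 40"}
for units, name in skus.items():
    print(f"{name}: {units} unit(s), "
          f"~{units * COMPUTE_DIE_MM2} mm2 compute + ~{units * BASE_DIE_MM2} mm2 base, "
          f"{units * VRAM_PER_UNIT_GB} GB")
```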
 

Ajay

Lifer
Jan 8, 2001
14,826
7,436
136
RDNA4 should be the start of a truly modular approach to GPUs, so I don't think the product stack will get smaller.
Fair point. Hope it works out well. Also hope AMD can churn out a real 'halo' GPU. That would really put some punch into their GPU marketing, because all the 'influencers' on social media would rave about it.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,014
2,414
106
RDNA4 will likely be modular in a way that more resembles MI300 (CDNA3) than Navi 31/32 (RDNA3).

Here is a picture from earlier in this thread, from @DisEnchantment, from a patent:

.....


The base die, similar to MI300's, will likely have 2 memory controllers and (more) cache.

Currently, the MCD is 37.5 mm2. The new base die will have 2 of them, plus some more cache (easily 2x that of RDNA3), all on N6, for a total die size of ~100-110 mm2.

Then the GCD (SED?), of about 100 mm2, will be stacked on top of this base die. The top and bottom dies will form a unit equivalent to a Navi 43 with 8 GB.

Additional units like this will be joined by some kind of bridge, possibly also a 3D-stacked silicon bridge (704x in the pic), for very high bandwidth and low power overhead.

Adding more units is how performance would scale:
1 unit - Navi 43, 8 GB
2 units - Navi 42, 16 GB
3 units - Navi 41, 24 GB
4 units - Navi 40, 32 GB
Half steps would also be possible with only 1 of 2 memory chips populated - possibly 4 more products. As @Kepler_L2 said, there could be more products with the modular approach.

There would be only 2 main types of dies to tape out:
- N6 base die
- N4(?) compute die
- possibly (tiny) silicon bridge die
- possibly (small) IO die

With this approach, AMD will have fewer dies, less validation, and more manufacturing flexibility, as the component dies can become, at the point of final assembly, different products.

If the dies are ~100 mm2, yields will be extremely high. That's really the sweet spot AMD is aiming for with all of its dies: Zen 3, Zen 4 and Zen 4c are all in the 70-100 mm2 range, and the MI300 GPU die is probably also in the same range, 100-125 mm2.
I don't think you understood my point. It was that the chiplet design, at least in N31 but likely also in N32, uses up a significant number of transistors just to connect the GCD to the MCDs.

2 SE, 16 CU, 1024 SP, 64 TMUs, 64 ROPs and a 64-bit memory controller is the difference between a hypothetical RDNA3-based N21 and N31, as I linked before.
That certainly doesn't need 24.7 billion transistors, and to be honest, the difference is even bigger because of the missing 32MB of IC in N31.
I think even N32 will have a lot more transistors than N21.

To me, it looks like ~18 billion transistors were used just for those interconnects in N31; I don't have any other explanation for what else that amount of transistors could be needed for.
That's 31% of all transistors, which is a ridiculously high amount of budget and area just to connect 6 MCDs to a single GCD.
I hope N4* won't end up the same way.

I agree with your thought about more cache per MCD; I always thought only 16MB was too little. 32MB looks like a good choice.
I won't speculate about the size of the SED (GCD) or what it is, but for comparison I got ~138mm2 on N5 for 2 SE, 40 CU, 2560 SP, 160 TMU, 64 ROP and the interconnects for 2 MCDs.
N31 Die sizes.png
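Spelling that accounting out (the 33B estimate and the ~18B interconnect figure are from the post above; 57.7B is N31's public transistor count, which is roughly 33B x 1.75):

```python
# The transistor accounting behind the argument above: 33B is the estimated
# monolithic RDNA3 chip at N21 specs, 57.7B is N31's public transistor count
# (roughly 33B x 1.75), and ~18B is the estimate above for the GCD<->MCD
# interconnect overhead.
monolithic_estimate   = 33.0e9
n31_total             = 57.7e9
interconnect_estimate = 18.0e9

spec_delta = n31_total - monolithic_estimate        # ~24.7B for the extra units
print(f"Delta vs hypothetical monolithic: {spec_delta / 1e9:.1f}B")
print(f"Interconnect share of N31:        {interconnect_estimate / n31_total:.0%}")  # ~31%
```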
 

Joe NYC

Golden Member
Jun 26, 2021
1,821
2,104
106
I don't think you understood my point. It was that the chiplet design, at least in N31, uses up a significant number of transistors just to connect the GCD to the MCDs.

2 SE, 16 CU, 1024 SP, 64 TMUs, 64 ROPs and a 64-bit memory controller is the difference between a hypothetical RDNA3-based N21 and N31, as I linked before.
That certainly doesn't need 24.7 billion transistors, and to be honest, the difference is even bigger because of the missing 32MB of IC in N31.
To me, it looks like ~18 billion transistors were used just for those interconnects; I don't have any other explanation for what else that amount of transistors could be needed for.
That's 31% of all transistors, which is a ridiculously high amount of budget and area just to connect 6 MCDs to a single GCD.
I hope N4* won't end up the same way.

I agree with your thought about more cache per MCD; I always thought only 16MB was too little. 32MB looks like a good choice.
I won't speculate about the size of the SED (GCD) or what it is, but for comparison I got ~138mm2 on N5 for 2 SE, 40 CU, 2560 SP, 160 TMU, 64 ROP and the interconnects for 2 MCDs.
I think I did understand your point - that the Navi 3x style of modularity did not deliver, and that an incremental monolithic approach from Navi 2x would likely have performed better.

But IMO that is now only a theoretical argument. There is no way AMD is going back to monolithic dGPU dies; it will be more modularity, just a different kind of modularity. With Navi 3x, AMD just dipped its toes into modularity - with mixed success.

I think the problem was that AMD was not sure about diving into 3D stacking on mainstream parts in H2 2022. By the time of RDNA4 production (2024?), that will not be a problem. Every one of AMD's product lines will incorporate 3D stacking, possibly even the high-end notebook chips.

If the RDNA4 chips look anything like the picture from the patent application, including a separate IO die, that would mean even more area savings on the GCD compute die and even more specialization of those transistors for compute.

With 3D stacking, the communication will go through TSVs, which have higher density and may be placed under other functional blocks, not necessarily taking up a dedicated area of the compute die for connecting to other dies.

The fact that the Navi 3x transistor budget fell short of expected performance may have a number of causes (including bugs?), so it may not be something to draw conclusions from for the next generation.
 

Ajay

Lifer
Jan 8, 2001
14,826
7,436
136
I think I did understand your point - that the Navi 3x style of modularity did not deliver, and that an incremental monolithic approach from Navi 2x would likely have performed better.

But IMO that is now only a theoretical argument. There is no way AMD is going back to monolithic dGPU dies; it will be more modularity, just a different kind of modularity. With Navi 3x, AMD just dipped its toes into modularity - with mixed success.

I think the problem was that AMD was not sure about diving into 3D stacking on mainstream parts in H2 2022. By the time of RDNA4 production (2024?), that will not be a problem. Every one of AMD's product lines will incorporate 3D stacking, possibly even the high-end notebook chips.

If the RDNA4 chips look anything like the picture from the patent application, including a separate IO die, that would mean even more area savings on the GCD compute die and even more specialization of those transistors for compute.

With 3D stacking, the communication will go through TSVs, which have higher density and may be placed under other functional blocks, not necessarily taking up a dedicated area of the compute die for connecting to other dies.

The fact that the Navi 3x transistor budget fell short of expected performance may have a number of causes (including bugs?), so it may not be something to draw conclusions from for the next generation.
Well, to really benefit from a multi-chip strategy, the GCDs must be modular as well. That will help with scaling across product lines (as I believe @Kepler_L2 was pointing to). Then AMD will need a very high-bandwidth, low-latency interconnect like Apple uses in its dual-chip M2 Ultra SoC. Apple and TSMC have shown that this can be done with two 'chiplets'. This will have to be extended to three or four interconnects with smaller GCDs (maybe 150mm^2) to go from a lower-end GPU up to the top performance rung. Sadly, if rumors are correct, AMD will be using N4P for RDNA4 - rather than the higher-density N3E; maybe the performance (clocks) just isn't going to be there with that node when it becomes widely available at the start of HVM.
 
  • Like
Reactions: Tlh97

Joe NYC

Golden Member
Jun 26, 2021
1,821
2,104
106
Well, to really benefit from a multi-chip strategy, the GCDs must be modular as well. That will help with scaling across product lines (as I believe @Kepler_L2 was pointing to). Then AMD will need a very high-bandwidth, low-latency interconnect like Apple uses in its dual-chip M2 Ultra SoC. Apple and TSMC have shown that this can be done with two 'chiplets'. This will have to be extended to three or four interconnects with smaller GCDs (maybe 150mm^2) to go from a lower-end GPU up to the top performance rung. Sadly, if rumors are correct, AMD will be using N4P for RDNA4 - rather than the higher-density N3E; maybe the performance (clocks) just isn't going to be there with that node when it becomes widely available at the start of HVM.
That's precisely what the patent describes: multiple GCDs stacked on base dies (which contain the memory controllers and cache), with the base dies connected together by a stacked active silicon bridge.

So this would be the first time a stacked chip spans 2 base chips, which seems quite ambitious. This type of connection would go way beyond any other horizontal link, including Apple's.

BTW, the patent was filed in 2021 and published in October 2022, but it's still an interesting read, even if you just look at the pictures (by downloading the PDF):

 

Joe NYC

Golden Member
Jun 26, 2021
1,821
2,104
106
Well, to really benefit from a multi-chip strategy, the GCDs must be modular as well. That will help with scaling across product lines (as I believe @Kepler_L2 was pointing to). Then AMD will need a very high-bandwidth, low-latency interconnect like Apple uses in its dual-chip M2 Ultra SoC. Apple and TSMC have shown that this can be done with two 'chiplets'. This will have to be extended to three or four interconnects with smaller GCDs (maybe 150mm^2) to go from a lower-end GPU up to the top performance rung. Sadly, if rumors are correct, AMD will be using N4P for RDNA4 - rather than the higher-density N3E; maybe the performance (clocks) just isn't going to be there with that node when it becomes widely available at the start of HVM.
BTW, the GCD dies, once stripped of all non-compute elements (L3 and I/O) and connected to other dies via TSVs and 3D stacking, would be quite dense.

I was kind of assuming it would be N4P rather than N3, for cost and capacity reasons. The lowest-end SKU, with one of the ~100 mm2 GCDs + a ~100 mm2 N6 base die, will have to sell in the $200 range.

As far as allocating the most advanced nodes' capacity between AMD products goes, I think dGPUs are not really a contender for the leading-edge node. Datacenter/AI will likely be the first to get N3.
 
  • Like
Reactions: Tlh97

TESKATLIPOKA

Platinum Member
May 1, 2020
2,014
2,414
106
I think I did understand your point - that the Navi 3x style of modularity did not deliver, and that an incremental monolithic approach from Navi 2x would likely have performed better.
....
The fact that the Navi 3x transistor budget fell short of expected performance may have a number of causes (including bugs?), so it may not be something to draw conclusions from for the next generation.
I can agree that there are some reserves in dual-issue performance and also in frequency with N31, but in all honesty, it still performs decently for those specs.

The reason I am not very happy with N31 is that it has >2x the transistors of N21, but the difference between comparable monolithic RDNA2 and RDNA3 GPUs is only 20-25% more transistors.
I can only conclude that ~half of the increased transistor budget vs N21 was used for the interconnects, and that's very bad.

If a more modular N4x keeps the same interconnects, then it will be even worse. Let's hope it's connected differently, via TSVs and 3D stacking.
 
  • Like
Reactions: Tlh97 and Joe NYC

TESKATLIPOKA

Platinum Member
May 1, 2020
2,014
2,414
106
BTW, the GCD dies, once stripped of all non-compute elements (L3 and I/O) and connected to other dies via TSVs and 3D stacking, would be quite dense.

I was kind of assuming it would be N4P rather than N3, for cost and capacity reasons. The lowest-end SKU, with one of the ~100 mm2 GCDs + a ~100 mm2 N6 base die, will have to sell in the $200 range.
I don't think we can expect the RX 8600 to cost only $199-249 when the cheaper N6-based N33 sells for $269.
The question is what they can pack inside a 100mm2 GCD.
If they can put in at least 2 SE, 40 CU, 2560 SP, 160 TMUs and 64 ROPs, which is ~25% more than what N33 has, then I wouldn't be surprised if they asked $299 for it.
The only real problem would be the 8GB of VRAM.
Seriously, will they never release 3GB (24Gbit) GDDR6 modules?
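For reference, the VRAM math that makes bigger modules so tempting - a simple sketch that assumes a 128-bit bus for an N33-class (or single-unit) card and treats 3GB (24Gbit) devices as a hypothetical part:

```python
# Simple VRAM-capacity sketch: GDDR6 devices are 32 bits wide, so capacity is
# (bus width / 32) * capacity per device. The 128-bit bus is an assumption for
# an N33-class card, and the 3GB (24Gbit) device is the hoped-for part, not
# something confirmed for these products.
def vram_gb(bus_width_bits: int, gb_per_device: int) -> int:
    devices = bus_width_bits // 32      # one GDDR6 device per 32-bit channel
    return devices * gb_per_device

print(vram_gb(128, 2))   # 8  -> today's 2GB (16Gbit) devices
print(vram_gb(128, 3))   # 12 -> hypothetical 3GB (24Gbit) devices
```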
 
  • Like
Reactions: Tlh97 and Joe NYC