Speculation: RDNA3 + CDNA2 Architectures Thread


moinmoin

Diamond Member
Jun 1, 2017
5,240
8,454
136
Steam Deck is actually a great counterexample against the Big GPU APU meme. Steam Deck has a custom part, so Valve could have ordered any size GPU in their APU that they wanted, but the Steam Deck has only half the GPU size of AMD's standard 6800U GPU.

That's a dedicated handheld game machine, and they chose only half the GPU of AMD's standard APU.
It also has only half the number of CPU cores. I don't think Van Gogh as used in the Steam Deck fits into this discussion at all.

Steam Deck is actually a pretty well-balanced system, rather portable, and its screen resolution/GPU/CPU performance ratio is comparable to that of current-gen consoles at 4K, which should ensure that many games running well on consoles also run reasonably well on Steam Deck (potential compatibility issues aside).
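(A quick back-of-the-envelope check of that ratio claim, using commonly cited figures rather than anything from this thread: the Deck's 1280x800 panel and ~1.6 TFLOPS GPU against the PS5's 4K target and ~10.3 TFLOPS. Treat the numbers as approximate.)

```python
# Rough check: does Steam Deck's performance-per-pixel land near a PS5 at 4K?
# All figures are commonly cited specs, not from this thread; treat as approximate.

deck_pixels = 1280 * 800          # Steam Deck native panel
ps5_pixels = 3840 * 2160          # current-gen 4K target

deck_tflops = 1.6                 # Van Gogh GPU, FP32
ps5_tflops = 10.3                 # PS5 GPU, FP32

print(f"Pixel ratio:  {ps5_pixels / deck_pixels:.1f}x")   # ~8.1x more pixels at 4K
print(f"TFLOPS ratio: {ps5_tflops / deck_tflops:.1f}x")   # ~6.4x more compute

# FLOPS per pixel land in the same ballpark, which is the point being made:
print(f"Deck FLOPS/pixel vs PS5: {(deck_tflops / deck_pixels) / (ps5_tflops / ps5_pixels):.2f}x")
```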
 

Heartbreaker

Diamond Member
Apr 3, 2006
5,062
6,636
136
It also has only half the number of CPU cores. I don't think Van Gogh as used in the Steam Deck fits into this discussion at all.

Steam Deck definitely comes into play when answering someone who argues we need a bigger GPU for handheld consoles, and Steam Deck ships with about half the GPU of the generic APU for laptops.
 

insertcarehere

Senior member
Jan 17, 2013
712
701
136
And who told you that the target for Strix Point is N33 (which measures a bit more than 200 mm² on its own)? I spoke about low-end dGPUs (GTX 1650/RTX 2050/RX 6400 class). Having a greater number of WGPs can also mean they can be clocked lower, lowering power consumption as well. Also, the CUs can help with some GPU compute if available.

That implies AMD will go for a "wide and slow" approach and trade off die area for power efficiency, or, in other words, make the chips comparatively more expensive for a similar level of performance.

It's not impossible per se (Apple arguably does this with their SoCs), but AMD is not Apple, and they don't have a track record of taking such an approach for their integrated products, especially when their customers are OEMs who'd just take the cheaper chip any day.
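For what it's worth, here is a minimal sketch of the tradeoff being argued over, assuming the textbook dynamic-power model (P ≈ C·V²·f, with voltage scaling roughly linearly with clock near the efficiency knee). Neither poster spelled this out, so it's an illustration, not anyone's actual math:

```python
# "Wide and slow" back-of-the-envelope: same throughput, more area, less power.
# Assumption: dynamic power ~ C * V^2 * f, with V scaling roughly linearly
# with f, so power per unit scales roughly with f^3.

def relative_power(units: float, clock: float) -> float:
    """Dynamic power relative to a 1.0-unit, 1.0-clock baseline."""
    return units * clock**3

def relative_throughput(units: float, clock: float) -> float:
    """Throughput scales with units * clock for embarrassingly parallel work."""
    return units * clock

baseline = relative_power(1.0, 1.0)    # narrow and fast
wide_slow = relative_power(2.0, 0.5)   # 2x the WGPs at half the clock

print(relative_throughput(1.0, 1.0), relative_throughput(2.0, 0.5))  # 1.0 vs 1.0: same throughput
print(baseline, wide_slow)             # 1.0 vs 0.25: ~4x less dynamic power, 2x the area
```

Under those assumptions, doubling the units and halving the clock holds throughput while cutting dynamic power to roughly a quarter, which is exactly the area-for-efficiency trade being described.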

Anyway, it seems this new AMD leaker has been willing to put more ballpark performance estimates on N33 and Phoenix:
 

moinmoin

Diamond Member
Jun 1, 2017
5,240
8,454
136
Steam Deck definitely comes into play when answering someone who argues we need a bigger GPU for handheld consoles, and Steam Deck ships with about half the GPU of the generic APU for laptops.

Maybe I read @leoneazzurro's post wrong, but to me it sounded more like emphasizing the importance of a "reasonably powerful iGPU", which is the case with Van Gogh combined with Steam Deck's low screen resolution.

I honestly don't see the point of the whole "big APU" discussion anyway. As somebody noted earlier, Subor launched a custom big APU before, so whoever sees a big market in doing the same should be able to follow suit. As long as nobody sees a market for repeating that, why should AMD itself take such a risk (especially since AMD won't create the end product and as such needs to rely on the reluctant OEMs anyway)?
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
So you are expecting a big transistor increase for minimal cost increase.

The 2010s would like to inform you that ship has sailed.

I put the math there, feel free to use it. By your reasoning, Phoenix should not have existed at all.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
This All the Watts guy seems like a copy of Greymon, but even less accurate. His previous claims about N3x were called out as BS, and I will not be surprised if he's comparing mobile N33 SKUs to desktop N22/N23 cards...
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,696
3,260
136
Anyway, it seems this new AMD leaker has been willing to put more ballpark performance estimates on N33 and Phoenix:
I think he is wrong about Phoenix.
Dual issue still brings some performance improvement, although it depends on the game.
On top of that, even if we don't know game clocks, boost is still 25% higher.
Just this should put it above Rembrandt.
From what he wrote it's just a bit better, but it's true that BW is a bottleneck.
 

insertcarehere

Senior member
Jan 17, 2013
712
701
136
I think he is wrong about Phoenix.
Dual issue still brings some performance improvement, although it depends on the game.
On top of that, even if we don't know game clocks, boost is still 25% higher.
Just this should put it above Rembrandt.
From what he wrote it's just a bit better, but it's true that BW is a bottleneck.

To be frank, I don't think we can conclude that dual issue brings a performance improvement by itself. Yes, the ComputerBase test comparing the 7900 XT vs the 6900 XT shows some improvement (9% average at 4K), but those are not equal GPUs even if you normalize CUs/clocks.
- The 7900 XT has 800 GB/s of VRAM bandwidth vs 512 GB/s for the 6900 XT; this is counteracted somewhat by the larger Infinity Cache the latter has, but at 4K the 7900 XT should still have more usable bandwidth.
- The 7900 XT has 192 ROPs vs 128 ROPs for the 6900 XT, so it has a 50% higher pixel rate even at the same clocks.

How much of the observed improvement is due to 2x FP32, as opposed to the 7900 XT simply being better endowed in bandwidth and GPU front end?

In a heavily BW-constrained environment with no extra front end (which Phoenix is going to be), I could see the gains not being that significant vs Rembrandt, even taking clock speeds into account.
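To make the normalization concrete, here is a rough sketch using only the spec figures cited above, with clocks assumed equalized as in the test being discussed:

```python
# Rough normalization of the ComputerBase 7900 XT vs 6900 XT comparison.
# Spec figures are the ones cited in the post (ROPs, VRAM bandwidth).

specs = {
    "7900 XT": {"rops": 192, "vram_gbps": 800},
    "6900 XT": {"rops": 128, "vram_gbps": 512},
}

rop_ratio = specs["7900 XT"]["rops"] / specs["6900 XT"]["rops"]
bw_ratio = specs["7900 XT"]["vram_gbps"] / specs["6900 XT"]["vram_gbps"]

print(f"Pixel rate advantage at equal clocks: {rop_ratio:.2f}x")  # 1.50x
print(f"Raw VRAM bandwidth advantage:         {bw_ratio:.2f}x")   # 1.56x
# The observed gain was ~1.09x at 4K, so dual issue cannot be isolated:
# the 7900 XT's bandwidth and front-end edge could plausibly account for it.
```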
 

Heartbreaker

Diamond Member
Apr 3, 2006
5,062
6,636
136
I put the math there, feel free to use it. By your reasoning, Phoenix should not have existed at all.

You put your made-up guesses there.

Semi Analysis has a massive analysis of 3nm, and the bottom line is that it's a more complex, more expensive node that will struggle to deliver any improvement in cost per transistor:

Moreover, with a standard monolithic chip (50% Logic + 30% SRAM + 20% Analog), density only increases by 1.3x. This is effectively flat on cost per transistor for the typical monolithic chip designs, with higher development costs.

During IEDM, TSMC revealed that N3E had a bit-cell size of 0.021 μm², precisely the same as N5. This is a devastating blow to SRAM.

N3 designs with significantly more transistors will cost significantly more. Even worse if you attempt to add a giant "Infinity Cache" to compensate for poor memory BW.
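The quoted 1.3x figure can be reproduced from the area mix above. A minimal sketch, assuming the ~1.7x logic scaling TSMC cites for N3 and essentially no scaling for SRAM and analog (the scaling factors are my assumptions, consistent with the quote):

```python
# Reconstructing the blended density estimate for a "standard" monolithic
# die of 50% logic, 30% SRAM, 20% analog (the mix quoted above).
# Assumptions: logic ~1.7x denser (TSMC's N3 figure), SRAM ~1.0x
# (bit-cell unchanged vs N5 per IEDM), analog ~1.0x.

mix = {"logic": 0.50, "sram": 0.30, "analog": 0.20}        # area fractions on N5
density_gain = {"logic": 1.7, "sram": 1.0, "analog": 1.0}

# New relative area: each block's old area shrinks by its density gain.
new_area = sum(frac / density_gain[block] for block, frac in mix.items())
print(f"Effective full-chip density gain: {1 / new_area:.2f}x")  # ~1.26x, i.e. roughly the quoted 1.3x
```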
 

Timorous

Golden Member
Oct 27, 2008
1,976
3,861
136
You put your made-up guesses there.

Semi Analysis has a massive analysis of 3nm, and the bottom line is that it's a more complex, more expensive node that will struggle to deliver any improvement in cost per transistor:

N3 designs with significantly more transistors will cost significantly more. Even worse if you attempt to add a giant "Infinity Cache" to compensate for poor memory BW.

The cache won't be on N3, just like the cache in the RDNA3 MCM parts is not on N5.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
You put your made-up guesses there.

Semi Analysis has a massive analysis of 3nm, and the bottom line is that it's a more complex, more expensive node that will struggle to deliver any improvement in cost per transistor:

N3 designs with significantly more transistors will cost significantly more. Even worse if you attempt to add a giant "Infinity Cache" to compensate for poor memory BW.

Your guesses are not much better. Which version of N3 will be used? The "standard" one, the density-optimized one, or, as has already happened with current products, customer-oriented variants? Even within N5 there are variants where density varies wildly. Also, we don't even know whether there is an Infinity Cache, yet you assume there will be one. Chip stacking may play a role in reducing costs. In any case, what you say about costs changes nothing about the target market scenario, or do you think the new CPUs and dGPUs will magically stay on existing processes forever?
 

Heartbreaker

Diamond Member
Apr 3, 2006
5,062
6,636
136
Dragon Range? They can make MCM laptop chips just fine now.

That's just the desktop part with a different name, made for high-power laptops with a dGPU.

Note that as a desktop part, it has a MUCH smaller GPU (only 2 CUs) than the real laptop parts.

As always, there is strong pressure to make everything as small as possible.


Your guesses are not much better. Which version of N3 will be used? The "standard" one, the density-optimized one, or, as has already happened with current products, customer-oriented variants? Also, we don't even know whether there is an Infinity Cache, yet you assume there will be one. Chip stacking may play a role in reducing costs. In any case, what you say about costs changes nothing about the target market scenario, or do you think the new CPUs and dGPUs will magically stay on existing processes forever?

Those aren't my guesses. It's Semi Analysis' detailed work vs your guesses.

Of course they will move on to new processes. But that doesn't mean they are going to pay for a large increase in transistor count when transistor costs are flat. Your faulty assumption is that they were getting a big increase in transistor budget for free, which they aren't.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
That's just the desktop part with a different name, made for high-power laptops with a dGPU.

Note that as a desktop part, it has a MUCH smaller GPU (only 2 CUs) than the real laptop parts.

As always, there is strong pressure to make everything as small as possible.


Those aren't my guesses. It's Semi Analysis' detailed work vs your guesses.

Of course they will move on to new processes. But that doesn't mean they are going to pay for a large increase in transistor count when transistor costs are flat. Your faulty assumption is that they were getting a big increase in transistor budget for free, which they aren't.

Semi Analysis details many N3 variants, and there are also many N5 variants, so area density and transistor cost must be evaluated on the final design. Without knowing it, everything is a guess. Otherwise we could not have a 39% increase in transistor density going from N23 to N33 when the theoretical N7-to-N6 increase is 18% for logic alone.

And I never said it was for free. Please quote me where I said that. I said that the area dedicated to the GPU was kept constant and is likely to stay flat. That also means costs will go higher, but we have already seen this with N5.
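That 39% figure checks out against commonly cited die data. A quick sketch (the transistor counts and die sizes below are TechPowerUp-style figures, not from this thread, so treat them as approximate):

```python
# Checking the ~39% density-jump claim for Navi 23 (N7) vs Navi 33 (N6).
# Commonly cited figures, treated as approximate:
#   Navi 23: ~11.06B transistors in ~237 mm^2
#   Navi 33: ~13.3B transistors in ~204 mm^2

n23_density = 11.06e9 / 237   # transistors per mm^2
n33_density = 13.3e9 / 204

print(f"N23: {n23_density / 1e6:.1f} MTr/mm^2")              # ~46.7
print(f"N33: {n33_density / 1e6:.1f} MTr/mm^2")              # ~65.2
print(f"Density gain: {n33_density / n23_density - 1:.0%}")  # ~40%, vs ~18% theoretical N7->N6 logic scaling
```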
 

Heartbreaker

Diamond Member
Apr 3, 2006
5,062
6,636
136
And I never said it was for free. Please quote me where I said that. I said that the area dedicated to the GPU was kept constant and is likely to stay flat. That also means costs will go higher, but we have already seen this with N5.

You listed your guess of a minimal increase in price per area and a BIG increase in transistor density; that equals a big increase in free transistor budget.

When I pointed this out, you just answered with "I put the math there, feel free to use it."

You can't pretend you now meant something completely different.
 

Timorous

Golden Member
Oct 27, 2008
1,976
3,861
136
We are talking about the APUs for laptops, which have been monolithic so far.

There will come a point where it is cheaper to 3D-stack or tile two or more smaller dies than to make one larger monolithic die on an advanced node. Trying to predict what N3 products will look like is not easy. Look at MI300: 3D-stacked and tiled, with cache under the cores/shaders.

The tech is there; it just needs scaling up.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
You listed your guess of a minimal increase in price per area and a BIG increase in transistor density; that equals a big increase in free transistor budget.

When I pointed this out, you just answered with "I put the math there, feel free to use it."

You can't pretend you now meant something completely different.

Frankly, I never said it was free; that was your assumption alone. The simple fact that I assumed the same area on a new process with higher wafer costs means that the die/GPU area cost will go up. All die costs will go up. As for the transistor increase, yes, it could be big depending on design choices, as demonstrated by actual examples (N23 vs N33) even on a similar node. If you have comprehension issues, please don't push them onto others.
 

Heartbreaker

Diamond Member
Apr 3, 2006
5,062
6,636
136
Frankly, I never said it was free. As for the transistor increase, yes, it could be big depending on design choices. If you have comprehension issues, please don't push them onto others.

You don't understand the implications of your own math? Based on faulty assumptions as it was, it amounted to a large increase in transistor budget at the same cost.

If you are going to say something like "I put the math there", you should understand the implications of that math.

To spell it out for you: removing your faulty assumptions, just keeping the same area will increase costs significantly, so they won't do that.

And NO, the same does not apply to Phoenix.

Note that 4nm is actually an economical node, with improved transistor economics, unlike 3nm, where Semi Analysis says: "Shrinking finally costs more, Moore's Law is now dead in economic terms."

3nm is a particularly uneconomic node.

Even given the more favorable economics of 4nm, AMD still stayed with a 12 CU design and shrank the APU by 18% in area vs the previous generation.

Given the worse transistor economics at 3nm, expect an even greater shrink to contain costs.
 

leoneazzurro

Golden Member
Jul 26, 2016
1,114
1,867
136
You don't understand the implications of your own math? Based on faulty assumptions as it was, it amounted to a large increase in transistor budget at the same cost.

If you are going to say something like "I put the math there", you should understand the implications of that math.

To spell it out for you: removing your faulty assumptions, just keeping the same area will increase costs significantly, so they won't do that.

And NO, the same does not apply to Phoenix.

Note that 4nm is actually an economical node, with improved transistor economics, unlike 3nm, where Semi Analysis says: "Shrinking finally costs more, Moore's Law is now dead in economic terms."

3nm is a particularly uneconomic node.

Even given the more favorable economics of 4nm, AMD still stayed with a 12 CU design and shrank the APU by 18% in area vs the previous generation.

Given the worse transistor economics at 3nm, expect an even greater shrink to contain costs.


My argument started from the point that there are markets where APUs with a powerful iGPU side make sense, because an APU will generally cost less than a CPU plus a comparable dGPU plus accessory costs (comparable dGPU meaning the lower mainstream class), so a Strix Point with 12 WGPs at a similar size to Phoenix (which means up to around 200 mm²) could be entirely within the realm of possibility.

You started by denying it with considerations about BW (and I pointed out that even without using IC there are already new memory standards offering far higher bandwidth than current solutions), then started attacking the costs, not understanding that the original point (an APU of similar size to current ones costing less than a discrete GPU plus a separate CPU plus all the PCB and accessory costs) was still valid, since considerations about density and transistor costs (which can vary greatly even on the same process, but you deny that) apply to the discrete-component solution as well, or even more so, as some parts with poor scaling are replicated on both the CPU and the GPU (i.e. memory controllers). And this holds not only for AMD but for other players as well (Apple being APU-only in the portable market should be a hint). Not to mention other advantages (the possibility of very small form factors, which usually come at a hefty price premium). It seems you don't understand, or don't want to understand, what I have been saying since the beginning, and I will stop here because at this point it is practically trolling.
 

Heartbreaker

Diamond Member
Apr 3, 2006
5,062
6,636
136
You started by denying it with considerations about BW (and I pointed out that even without using IC there are already new memory standards offering far higher bandwidth than current solutions)

A future memory standard that, even combined with an Infinity Cache, would still be inadequate for the previously discussed increase in the APU.

If you are going to theorycraft, it at least needs to stand up to rudimentary analysis.

Unless you can get 200 GB/s of BW on top of a sizeable Infinity Cache, the proposed increase in the APU makes little sense, simply because it would be bottlenecked by BW limitations.
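For scale, a small sketch of what standard LPDDR configurations actually deliver. The configurations are illustrative assumptions, not confirmed specs for any particular product:

```python
# Peak theoretical DRAM bandwidth: transfer rate (MT/s) * bus width (bits) / 8 / 1000.

def mem_bandwidth_gbps(mts: int, bus_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s for a given DRAM config."""
    return mts * bus_bits / 8 / 1000

print(mem_bandwidth_gbps(6400, 128))   # LPDDR5-6400, 128-bit:  102.4 GB/s (Rembrandt-class)
print(mem_bandwidth_gbps(8533, 128))   # LPDDR5X-8533, 128-bit: 136.5 GB/s
print(mem_bandwidth_gbps(8533, 256))   # a 256-bit bus would be needed to clear ~200 GB/s: 273.1 GB/s
```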

then started attacking the costs, not understanding that the original point (an APU of similar size to current ones costing less than a discrete GPU plus a separate CPU plus all the PCB and accessory costs) was still valid,

It really isn't valid, and never was, which is why AMD never does this; it's a false comparison. This is not a niche part meant to compete with more powerful dGPUs. This is the generic mass-market part, and it must compete on cost against the Intel laptop parts going into the majority of laptops, whose buyers don't care about a more powerful GPU.

If you make a big-GPU part for a majority market that doesn't care about having a big GPU, then you make an overpriced part that can't compete in that market.


It seems you don't understand, or don't want to understand, what I have been saying since the beginning...

More like you don't understand the full implications of what you are saying.

Most people just buy basic laptops and don't care about the GPU at all. AMD's part must compete economically with Intel's, so the cost must be contained. Going for a large GPU that most people don't care about just drives up the cost and makes the part less competitive.

You make the mistake that many on forums do: assuming that what you want is what everyone wants. Most people aren't looking for more powerful GPUs in their laptops, so this is not a case where a big-APU laptop competes against a more expensive dGPU laptop; it's a case where the more expensive big-APU laptop ends up competing against a laptop with a less expensive Intel chip.

You think it's about competing against a dGPU because you want dGPU performance.

Think about it: it has pretty much always been the case that AMD could build a more powerful APU to challenge dGPUs, but they NEVER do, because a big-GPU APU is a niche part, not a mainstream part, and the APU needs to be a mainstream part.
 

maddie

Diamond Member
Jul 18, 2010
5,152
5,540
136
Compared to 5nm:

N3: 25-30% less power, 10-15% more performance, 1.7x logic density
N3E: 34% less power, 18% more performance, 1.6x logic density

"Ho says that TSMC's original N3 features up to 25 EUV layers and can apply multi-patterning for some of them for additional density. By contrast, N3E supports up to 19 EUV layers and only uses single-patterning EUV, which reduces complexity, but also means lower density."

This means that an N3E wafer would have to cost 60% more than a 5nm wafer for the logic transistor cost to merely stay equal.

Rough wafer prices: N3 ~$20K, N5 ~$16K, N7 ~$10K.

For logic, cost per transistor is still falling.
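Putting those numbers together, a minimal cost-per-logic-transistor sketch. The N7-to-N5 logic scaling of ~1.8x is my assumption; the wafer prices are the rough figures above:

```python
# Relative cost per logic transistor, normalized to N5 = 1.0.
# Wafer prices are the rough figures quoted above; density figures are
# logic-only (SRAM/analog barely shrink, which is why blended cost looks flat).

nodes = {
    #        wafer $      logic density relative to N5
    "N7": {"wafer": 10_000, "density": 1 / 1.8},  # assumed ~1.8x N7->N5 logic scaling
    "N5": {"wafer": 16_000, "density": 1.0},
    "N3": {"wafer": 20_000, "density": 1.6},      # N3E logic density figure
}

for name, n in nodes.items():
    # Cost per logic transistor ~ wafer cost / (density * wafer area).
    rel_cost = (n["wafer"] / nodes["N5"]["wafer"]) / n["density"]
    print(f"{name}: {rel_cost:.2f}x N5 cost per logic transistor")

# N3 at $20K is only ~25% dearer than N5's $16K while logic is ~1.6x denser,
# so logic-only cost/transistor still falls (~0.78x). The break-even quoted
# above is a 1.6x (i.e. +60%) wafer price for flat logic transistor cost.
```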