Question Speculation: RDNA3 + CDNA2 Architectures Thread



jpiniero

Lifer
Oct 1, 2010
15,223
5,768
136
If I were AMD, when Strix Point releases I would just maintain production of ONLY Phoenix and Strix Point APUs. I don't know how their wafer supply agreements with TSMC play out, and if they are the reason why AMD has to maintain production of older hardware.

And not get any sales? It's obvious that AMD is forced to charge a lot more for even Rembrandt (if they want to make similar margins...) compared to Cezanne/Barcelo because of the wafer prices, let alone Phoenix. Strix Point is going to be even worse if it uses N3.

Cezanne/Barcelo is going to be around for a while, I think.
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
And not get any sales? It's obvious that AMD is forced to charge a lot more for even Rembrandt (if they want to make similar margins...) compared to Cezanne/Barcelo because of the wafer prices, let alone Phoenix. Strix Point is going to be even worse if it uses N3.

Cezanne/Barcelo is going to be around for a while, I think.
That is the unfortunate downside (the high wafer prices).

What I proposed would simplify the lineup.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
Any iGPU in an APU increases the cost, so it's never really free, but it's still pretty useful. Especially when your dGPU dies. :D

Nah I mean chipset iGPUs added just $5-7. The CPU iGPUs add $30 plus the cost of the faster chip. So $30 if you were going to buy the CPU anyway, but the real cost might be $50-100. On mobile it might be even higher since you often need to buy an i7 to get the top of the line iGPU.

When Iris devices with eDRAM were being made, it was a $200-300 adder on top of the maxed out i7 config!

And $30 is a HUGE amount in terms of production cost. If it adds $30 to the MSRP of the CPU, sure, but $30 is likely the cost of the entire chip.

You don't want a backup display GPU that expensive.

@Kronos1996 I second that it's more reasonable for a custom APU, such as the consoles.

I am not going to rule it out entirely but I am saying it doesn't make sense.

Everyone just assumes RDNA3+ (assuming same node) WGPs are the exact same size as Phoenix's RDNA3 WGPs.

And why would it be significantly smaller? The '+' suggests they are likely adding features; it also means the changes aren't that big.

Things like what RDNA3 did to double the FP32 instruction rate required substantial circuitry changes, and they had to cut down on other areas to get there. You don't get things for free. They are not going to have that kind of change with RDNA3+.
 
Last edited:

MrTeal

Diamond Member
Dec 7, 2003
3,614
1,816
136
What would be the point of a 24CU IGP on an APU? That's a pretty massive GPU, just slightly smaller than an RX 7600M/S, but that's a 90W/75W TDP solution. It also has 32MB of IC and 256GB/s of bandwidth out to GDDR6. With dual-channel LPDDR5X the whole APU would have 120GB/s of bandwidth; with DDR5-5600 it would be less, and shared with the CPU. Unless they completely change the memory system, that poor GPU would be completely memory starved.
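For reference, a quick sanity check of those bandwidth figures (a rough sketch, assuming a 128-bit APU memory bus and the 7600M/S's 128-bit GDDR6 at 16 Gbps):

Code:
# Peak theoretical bandwidth = bus width (bits) * transfer rate (MT/s) / 8, in GB/s
def peak_bw_gbs(bus_bits, mts):
    return bus_bits * mts / 8 / 1000

print(peak_bw_gbs(128, 7500))   # dual-channel LPDDR5X-7500: 120.0 GB/s, shared with the CPU
print(peak_bw_gbs(128, 5600))   # dual-channel DDR5-5600:     89.6 GB/s
print(peak_bw_gbs(128, 16000))  # RX 7600M/S GDDR6 @ 16 Gbps: 256.0 GB/s, dedicated to the GPU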
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
That's a CPU-bound resolution though. The 7900 XTX is getting 361 fps, which essentially makes the benchmark worthless.

At 4K, the 7900 XTX is getting 190 fps, which is 40% faster than the 6900 XT.

That's why I also included Civ IV. We're really nitpicking at this point, when the leakers were so far off that in reality we didn't get any increase in perf/$ this generation at all.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
7,166
7,666
136
Ok, just cause my memory is getting a little hazy here: Why is AMD even pursuing APUs anymore? What is the benefit of making a whole separate line of much larger monolithic dies with beefier IGPs when they can theoretically just slap something more powerful onto the Zen4 IO die or even package a CPU with a GCD (obviously not a proper RDNA3 GCD but something purpose built) that is much smaller and leverages their existing interconnect/substrate strategy?

APUs felt like they were the solution back before chiplets could be a thing, but now that we have chiplet CPUs and chiplet GPUs... is the clock ticking here on the APU?
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Ok, just cause my memory is getting a little hazy here: Why is AMD even pursuing APUs anymore? What is the benefit of making a whole separate line of much larger monolithic dies with beefier IGPs when they can theoretically just slap something more powerful onto the Zen4 IO die or even package a CPU with a GCD (obviously not a proper RDNA3 GCD but something purpose built) that is much smaller and leverages their existing interconnect/substrate strategy?

APUs felt like they were the solution back before chiplets could be a thing, but now that we have chiplet CPUs and chiplet GPUs... is the clock ticking here on the APU?

Because Intel is. AMD lost out on quite a bit of OEM business because their CPUs required a dedicated GPU. If they want to get in on that business, they have to have a CPU/GPU that can compete with Intel.
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Because Intel is. AMD lost out on quite a bit of OEM business because their CPUs required a dedicated GPU. If they want to get in on that business, they have to have a CPU/GPU that can compete with Intel.

Except that issue did get remedied by Zen 4 putting a barebones IGP on their IOD. Pretty big difference between that and making the IGP beefy enough to run games well, especially when the extra die area can be used for higher margin products elsewhere for AMD.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,037
136
What would be the point of a 24CU IGP on an APU? That's a pretty massive GPU, just slightly smaller than an RX 7600M/S, but that's a 90W/75W TDP solution. It also has 32MB of IC and 256GB/s of bandwidth out to GDDR6. With dual-channel LPDDR5X the whole APU would have 120GB/s of bandwidth; with DDR5-5600 it would be less, and shared with the CPU. Unless they completely change the memory system, that poor GPU would be completely memory starved.
It's not that big, only 2x bigger than a Phoenix without the IGP, if my calculation is correct. :D
I calculated here 227mm2 for a Phoenix with 24CU + 32MB IC.
The RX 7600S goes as low as 50W including GDDR6 memory, and let's not forget it's on a 6nm process.
It's not like this 4nm APU must be limited to 35-45W. For this one, add 30W for a total of 65-75W and you are ready to go. If it can clock at ~2650MHz on average, then it will have 90% of the 100W RX 7700S's TFLOPs.
32MB of IC would certainly help, but whether that amount would be enough is pretty questionable.
At worst, I would use a shared 64MB LLC instead of a separate 16MB L3 and 32MB IC. This would increase the size by ~12-13mm2 to 240mm2.
I think with this amount, even a slower DDR5 shouldn't be a big problem.
Price could be set at $399 for 8C+24CU+64MB LLC, where it could offer a better perf/$ ratio than a similarly performing CPU+dGPU combination.
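A minimal back-of-the-envelope check of that TFLOPs claim (assuming RDNA3 dual-issue FP32 is counted, i.e. 128 FP32 ops per clock per CU, and that the 100W RX 7700S averages roughly 2.2 GHz - both are assumptions, not official figures):

Code:
def fp32_tflops(cus, ghz):
    # RDNA3: 64 shaders/CU, dual-issue FP32 (x2), FMA counts as 2 ops (x2)
    return cus * 64 * 2 * 2 * ghz / 1000

apu_24cu = fp32_tflops(24, 2.65)  # ~16.3 TFLOPS at the ~2650MHz quoted above
rx_7700s = fp32_tflops(32, 2.20)  # ~18.0 TFLOPS, assuming ~2.2 GHz average
print(apu_24cu / rx_7700s)        # ~0.90, i.e. roughly 90% of the 7700S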

The real problem in my opinion is the OEM manufacturers. As we saw with Rembrandt, they love to put AMD in premium laptops, so this in a laptop would cost more than a CPU+dGPU combination, so it would be pointless.

If they released it for DIY, then it could have a very good price/performance ratio and be a great option for a cheaper gaming machine or small form factor. In this case you can even set the TDP a lot higher than 65-75W, so the IGP could be close to 3GHz on average.
This is my personal opinion.

P.S. The only reason I wouldn't buy this even if it existed is that I am limited to a laptop, and a CPU+dGPU option would end up cheaper. Still, buying a new laptop for >€1500 with only an 8GB dGPU (N33, AD106, AD107) doesn't look so great.
 
Last edited:
  • Like
Reactions: Kaluan

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
What would be the point of a 24CU IGP on an APU? That's a pretty massive GPU, just slightly smaller than an RX 7600M/S, but that's a 90W/75W TDP solution. It also has 32MB of IC and 256GB/s of bandwidth out to GDDR6. With dual-channel LPDDR5X the whole APU would have 120GB/s of bandwidth; with DDR5-5600 it would be less, and shared with the CPU. Unless they completely change the memory system, that poor GPU would be completely memory starved.
The point of a large iGPU is to scale the usage of such a product to a larger number of use cases: car infotainment, wearables, wearable VR, handhelds, a push for efficiency, a push for reduced design and manufacturing costs, a push for AI expansion and use cases, and plenty more.

People really live in the past, thinking that everything will be as it always was, when we are on the brink of a software/hardware/experience paradigm shift.

If you do not get it already: scaling iGPUs larger is about increasing TAM and use cases, to grow both volume and profit margins. The goal (of AMD and Intel) is that APUs/SoCs are going to be 90% of all computing. That is the reason why Nvidia tried to buy ARM: to have a competitive edge in a world where all they have are essentially non-APU projects.

And no, we are not talking about small iGPUs integrated into CPUs or CPU packages. We are talking about big and powerful GPUs.
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
In the discussion about Strix Point and 24 CUs:

IF the rumors are true, and if Kepler is correct about the RDNA3 architecture being fixed for Strix Point, it would mean that dual-issue is working properly, and we should expect the fabled 256 ALUs/WGP.

So if SP has 24 CUs/12 WGPs, then it also has 3072 vALUs/1536 ALUs.
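Spelled out, that arithmetic is just the following sketch (assuming 2 CUs per WGP and 64 native FP32 ALUs per CU, with the doubling coming from dual-issue):

Code:
cus  = 24
wgps = cus // 2              # 2 CUs per WGP -> 12 WGPs
alus = cus * 64              # 64 FP32 ALUs per CU -> 1536 ALUs
vector_lanes = alus * 2      # 256 per WGP if dual-issue is fully usable -> 3072
print(wgps, alus, vector_lanes)  # 12 1536 3072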
 
  • Like
Reactions: Tlh97 and Kaluan

insertcarehere

Senior member
Jan 17, 2013
639
607
136
The point of a large iGPU is to scale the usage of such a product to a larger number of use cases: car infotainment, wearables, wearable VR, handhelds, a push for efficiency, a push for reduced design and manufacturing costs, a push for AI expansion and use cases, and plenty more.

Except for car infotainment, none of the other use cases listed here require, or frankly want, a large iGPU that needs more power than those use cases can supply for it to scale; a 24CU RDNA3 is a waste running at 15W.

People really live in the past, thinking that everything will be as it always was, when we are on the brink of a software/hardware/experience paradigm shift.

We live in a world where cost per transistor has basically stagnated and wafer costs are skyrocketing with each new process, and therefore every mm^2 of silicon is precious. An APU with a large GPU component (24CU RDNA3 IGP would definitely qualify) being sold to the public will inevitably face at least some consumers that don't assign a large premium to the GPU part, effectively making that chunk of die space a waste. It just makes more sense to cut that extra silicon out of the APU and assign it to actual GPUs, where there is far more certainty that potential buyers would assign value to the GPU in question.

If you do not get it already: scaling iGPUs larger is about increasing TAM and use cases, to grow both volume and profit margins. The goal (of AMD and Intel) is that APUs/SoCs are going to be 90% of all computing. That is the reason why Nvidia tried to buy ARM: to have a competitive edge in a world where all they have are essentially non-APU projects.

And no, we are not talking about small iGPUs integrated into CPUs or CPU packages. We are talking about big and powerful GPUs.

As per the above, in an era where wafer costs are skyrocketing, spending extra silicon on attributes that end consumers may not value makes little sense.

To put it in more concrete terms:
- AMD can probably make a hell of an APU if they were willing to go big with a ~280mm^2 die on 5nm. That's the sort of space where ~N33 performance in an iGPU would be very possible. The problem is that 280mm^2 is a lot of silicon, equivalent to either:
- 4 Zen 4 CCDs, which is not very far away from 2 7950Xs, selling for ~$600+ each.
- the Navi 31 GCD, which is the core chip of the 7900 XTX, a product that sells for $1k.
Now, can an 8c Zen 4 with an N33-class iGPU sell for the sort of premium that can compete with these options for AMD internally?
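As a very rough illustration of that opportunity-cost argument (a sketch using the figures above plus the $399 APU price floated earlier; the Zen 4 CCD and N31 GCD sizes are approximate, and this ignores packaging, IOD/MCD silicon and the rest of the BOM, so read it as relative revenue per mm^2 of N5, not margin):

Code:
ccd_mm2 = 70  # approx. Zen 4 CCD size
options = {
    "hypothetical 280mm^2 APU @ $399": 399 / 280,
    "2x 7950X (4 CCDs) @ ~$600 each":  (2 * 600) / (4 * ccd_mm2),
    "7900 XTX (GCD ~300mm^2) @ $1k":   1000 / 300,
}
for name, usd_per_mm2 in options.items():
    print(f"{name}: ~${usd_per_mm2:.2f} per mm^2 of N5")
# roughly $1.4 vs $4.3 vs $3.3 per mm^2 of leading-edge silicon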
 
  • Like
Reactions: Tlh97

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
Except for car infotainment, none of the other use cases listed here require, or frankly want, a large iGPU that needs more power than those use cases can supply for it to scale; a 24CU RDNA3 is a waste running at 15W.
First: they don't need it NOW. The use case for large iGPUs is coming soon-ish. Actually, next year.

Secondly: tell all of that to Intel, who is doing EXACTLY the same thing as AMD. Why is Intel doing exactly the same thing as AMD on this front, hmmm?
As per the above, in an era where wafer costs are skyrocketing, spending extra silicon on attributes that end consumers may not value makes little sense.

To put it in more concrete terms:
- AMD can probably make a hell of an APU if they were willing to go big with a ~280mm^2 die on 5nm. That's the sort of space where ~N33 performance in an iGPU would be very possible. The problem is that 280mm^2 is a lot of silicon, equivalent to either:
- 4 Zen 4 CCDs, which is not very far away from 2 7950Xs, selling for ~$600+ each.
- the Navi 31 GCD, which is the core chip of the 7900 XTX, a product that sells for $1k.
Now, can an 8c Zen 4 with an N33-class iGPU sell for the sort of premium that can compete with these options for AMD internally?
And how much more financially feasible is designing two separate designs on a 3nm process than a single one, with a much simpler implementation and much simpler needs in terms of PCB, controllers, memory, etc.?
 
  • Like
Reactions: Tlh97 and Kaluan

Aapje

Golden Member
Mar 21, 2022
1,515
2,065
106
We live in a world where cost per transistor has basically stagnated and wafer costs are skyrocketing with each new process, and therefore every mm^2 of silicon is precious.

You are forgetting about chiplets. That way they can add fairly small iGPU chiplets that cost just a few bucks to make and have huge yields.

They can also vary the iGPU based on need. So they can add a fairly slow iGPU on a big node if the demands are low and a fairly fast one on a smaller node for more demanding uses.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
7,166
7,666
136
You are forgetting about chiplets. That way they can add fairly small iGPU chiplets that cost just a few bucks to make and have huge yields.

They can also vary the iGPU based on need. So they can add a fairly slow iGPU on a big node if the demands are low and a fairly fast one on a smaller node for more demanding uses.

- This is the root of my initial question on the last page: why do monolithic APUs exist anymore? IMO AMD's next step is to have a heterogeneous compute package that has a CPU, a GCD, and an IO die on package.

The GCD would be an N34 (or N35 even) class die, a tiny <100mm2 die that can go on add-in cards for more power and bandwidth, or right on the CPU package for a more powerful IGP.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,340
5,464
136
- This is the root of my initial question on the last page: why do monolithic APUs exist anymore? IMO AMD's next step is to have a heterogeneous compute package that has a CPU, a GCD, and an IO die on package.

The GCD would be an N34 (or N35 even) class die, a tiny <100mm2 die that can go on add-in cards for more power and bandwidth, or right on the CPU package for a more powerful IGP.

Monolithic is still more power efficient, and the APUs are aimed at mobile.
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
- This is the root of my initial question on the last page: why do monolithic APUs exist anymore? IMO AMD's next step is to have a heterogeneous compute package that has a CPU, a GCD, and an IO die on package.

The GCD would be an N34 (or N35 even) class die, a tiny <100mm2 die that can go on add-in cards for more power and bandwidth, or right on the CPU package for a more powerful IGP.
Because it's cheaper to design and yield ONE monolithic design than TWO separate designs.

The only place where, for AMD, it's cheaper to break the design into pieces is the way N31 was executed: by moving the cache and memory controllers into a separate chiplet on a separate process.

Two separate designs, each with development costs of over 1 bln USD, are going to cost less than a single one with development costs of 1 bln dollars, period? iGPU and CPU designs are going to be different, even on the same process.

That's the whole reason why you keep APUs monolithic, in the case of AMD.

Intel has its own fabs for their CPUs and chipsets, and uses TSMC for the iGPUs. In their case, it will be beneficial for them to break it apart.
 

Kronos1996

Junior Member
Dec 28, 2022
15
17
41
I don't mean that there would be both N24 and this chip. I meant this one should have been designed and released as N24.

N24 was aimed against GA107.
MX models are much weaker, with the single exception being the MX570.
The GeForce MX570 was announced a month earlier than N24, has comparable performance to a cut-down N24, and is based on GA107.
The 6500M (full N24) is comparable to the RTX 3050 (GA107).
Phoenix should already provide the same level of performance as this cut-down N24, making it rather pointless.
There is absolutely no good reason for N24 to be produced for an additional 3-5 years when even now barely anyone wants it, which is evident from the number of laptops with it.

Because AMD needs something for <=$249.
That 150mm2 chip wouldn't have worse profits than N24, and it also wouldn't cost much more to make.
It's not like N24 is much cheaper to make than N33 when you compare versions with 8GB of VRAM, yet the price will be very different.
Making a 107mm2 GPU which after a year is made pointless by an IGP from the same company doesn't make much sense to me. :cool: At least my beefed-up version of N24 would still be >50% faster than Phoenix and could be sold at least until Strix is out. :)

P.S. I think the 12GB version of mine is too costly to make because of the clamshell; I would keep only the 6GB version for $239.
Navi 24 and Navi 33 can coexist because the cost and price point are far enough apart that it makes sense to have both. Your 150mm2 design has way too much overlap with both and would likely have worse price/perf than what AMD made. It would probably have the same performance as Navi 24, just with the PCIe lanes, Infinity Cache and/or bus width increased. Navi 14 is 158mm2 with similar performance on a similar node. Most of the space savings came from stripping out IO. So if we say that restoring minimum acceptable IO for a desktop chip would add ~50mm2, and then add another ~50mm2 to get that 50% performance improvement, guess what we end up with? A 200mm2 Navi 33…
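Spelling out that die-size arithmetic (a rough sketch starting from the ~107mm2 Navi 24 figure quoted above; the two ~50mm2 adders are this post's estimates, not measured numbers):

Code:
navi24_mm2  = 107   # Navi 24 die size quoted above
restored_io = 50    # rough estimate: restore "minimum acceptable" desktop IO
extra_perf  = 50    # rough estimate: ~50% more performance
print(navi24_mm2 + restored_io + extra_perf)  # ~207 mm^2, i.e. roughly Navi 33 territory (~200 mm^2)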

You're looking only at gaming performance and ignoring the largest segment of the laptop market: business notebooks and micro-desktops. Many businesses need a GPU with professional driver support, but nothing super powerful. Still, Navi 24 with its meager 4GB of VRAM is plenty and probably kicks the shit out of iGPUs in professional applications. That's almost certainly where most of Navi 24 is being sold, along with a huge chunk of all AMD laptops right now. Hence why you don't see many in stores. Business PCs are like the server market: very high volume and long-term profitable sales. However, it takes years to build up relationships with those customers.
 
Last edited:

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,037
136
Navi 24 and Navi 33 can coexist because the cost and price point are far enough apart that it makes sense to have both. Your 150mm2 design has way too much overlap with both and would likely have worse price/perf than what AMD made. Forgive me if I give more weight to the opinion of the company that designed and built them.
If you are sure my 150mm2 design priced at $239 would be too close to Navi 33, then you can share with us how much N33 will be sold for.
My design would certainly have much better performance/price than N24. I can't tell how it would fare against N33, because I don't know its price.

You're looking only at gaming performance and ignoring the largest segment of the laptop market: business notebooks and micro-desktops. Many businesses need a GPU with professional driver support, but nothing super powerful. Still, Navi 24 with its meager 4GB of VRAM is plenty and probably kicks the shit out of iGPUs in professional applications. That's almost certainly where most of Navi 24 is being sold, along with a huge chunk of all AMD laptops right now. Hence why you don't see many in stores. Business PCs are like the server market: very high volume and long-term profitable sales. However, it takes years to build up relationships with those customers.
Mobile N24 is the RX 6300M, RX 6450M, RX 6500M and RX 6550M, plus the Pro W6300M and Pro W6400M for mobile workstations.

There is such a huge demand for N24 in laptops (consumer + business) that I could find only 3 different laptops with Navi 24.
Laptop models with the 6500M: HP VICTUS, ThinkPad Z16 G1 and Bravo 15 B5E.
Nothing else exists as far as I know.
This doesn't say anything positive about N24's sales.

It would be best if you provided some data to back up what you said.
 

MrTeal

Diamond Member
Dec 7, 2003
3,614
1,816
136
The point of a large iGPU is to scale the usage of such a product to a larger number of use cases: car infotainment, wearables, wearable VR, handhelds, a push for efficiency, a push for reduced design and manufacturing costs, a push for AI expansion and use cases, and plenty more.

People really live in the past, thinking that everything will be as it always was, when we are on the brink of a software/hardware/experience paradigm shift.

If you do not get it already: scaling iGPUs larger is about increasing TAM and use cases, to grow both volume and profit margins. The goal (of AMD and Intel) is that APUs/SoCs are going to be 90% of all computing. That is the reason why Nvidia tried to buy ARM: to have a competitive edge in a world where all they have are essentially non-APU projects.

And no, we are not talking about small iGPUs integrated into CPUs or CPU packages. We are talking about big and powerful GPUs.
I thought we were talking about iGPUs integrated into CPUs or CPU packages. Unless things have changed, Strix Point is going to be a pretty typical APU with a dual-channel DDR5/LPDDR5 interface. Rembrandt/6900HX is a 12CU solution, and it already shows good performance scaling moving from DDR5-4800 to DDR5-5600, showing its bandwidth dependency.
A 24CU solution with really no more bandwidth is really going to be starved. Not saying it wouldn't be more performant than the 12CU one, but there's only so much GPU you can shove into an APU with a typical dual-channel memory interface before you run into huge diminishing returns. That's even before crappy OEMs ship 'em with a single 16GB SODIMM populated. :p
 
Last edited:

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
A 24CU solution with really no more bandwidth is really going to be starved. Not saying it wouldn't be more performant than the 12CU one, but there's only so much GPU you can shove into an APU with a typical dual-channel memory interface before you run into huge diminishing returns. That's even before crappy OEMs ship 'em with a single 16GB SODIMM populated. :p
Indeed, Strix Point appears to be a typical 128-bit bus DDR5 APU.

But the GPU will not be starved, thanks to the 32 MB L4 cache/System Cache, as per:
 

MrTeal

Diamond Member
Dec 7, 2003
3,614
1,816
136
I'm not sure you can say it won't be bandwidth starved with a shared 32MB L4 without knowing more details on how the cache works. Even in the absolute best case, where it's 32MB of dedicated IC, that's still a 50-60% hit rate at 1080p, and then you're going out to the shared 90GB/s bus (at DDR5-5600) to main memory. That's a big if, and even then it's still a huge chunk of CUs with limited bandwidth. That 90GB/s is still only 62% of the total bandwidth of the top Navi 24 part, and that's a part with only 16 CUs.

It'd be an interesting part as an 8800G or something, but it still feels like an answer in search of a problem. Something with that kind of silicon distribution feels a lot more like it'd be in a console or other custom solution with a higher-bandwidth memory interface.
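Putting rough numbers on that comparison (a sketch assuming dual-channel DDR5-5600 on a 128-bit bus for the APU, and the desktop RX 6500 XT as the top Navi 24 part, with 64-bit GDDR6 at 18 Gbps):

Code:
apu_bw_gbs    = 128 * 5600 / 8 / 1000   # ~89.6 GB/s, shared between the CPU and a 24CU iGPU
navi24_bw_gbs = 64 * 18000 / 8 / 1000   # 144.0 GB/s, dedicated to Navi 24's 16 CUs
print(apu_bw_gbs / navi24_bw_gbs)       # ~0.62 -> the ~62% figure above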
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
First: they don't need it NOW. The use case for large iGPUs is coming soon-ish. Actually, next year.

Secondly: tell all of that to Intel, who is doing EXACTLY the same thing as AMD. Why is Intel doing exactly the same thing as AMD on this front, hmmm?

Yes, Nostradamus, please tell me how a Steam Deck that can empty its battery in 90 minutes as-is benefits from having a 40-50W APU, which is what 24 CUs will need to show decent performance gains. Or the sci-fi batteries it would take to power a laptop-class GPU in a wearable, of all things.

And no, Intel is not doing exactly the same thing, given that they're breaking out the GPU into dedicated chiplets instead of staying with a monolithic design.
And how much more financially feasible is designing two separate designs on a 3nm process than a single one, with a much simpler implementation and much simpler needs in terms of PCB, controllers, memory, etc.?
AMD evidently thought it was worth it to break out their CPUs and GPUs into small modular components (CCDs, IODs, etc.) and incur additional design costs there. I struggle to see why that suddenly stops being the case with APUs.

You are forgetting about chiplets. That way they can add fairly small iGPU chiplets that cost just a few bucks to make and have huge yields.

They can also vary the iGPU based on need. So they can add a fairly slow iGPU on a big node if the demands are low and a fairly fast one on a smaller node for more demanding uses.

Indeed, I think GPU chiplets are increasingly the way to go for iGPUs that aspire to be more than "boot up the computer" and "basic media acceleration"; a single die with everything included will inevitably have some parts of it not be valued, in a way that a more modular solution with chiplets can mitigate.
 
  • Like
Reactions: Tlh97