Question Speculation: RDNA3 + CDNA2 Architectures Thread


leoneazzurro

Senior member
Jul 26, 2016
923
1,451
136
Navi 33 will be smaller than Navi 23 and will be using N6 rather than N7. AMD is likely going to be getting N6 dies at a lower price than equivalent N7 dies.

So, 2 areas of cost saving.

And it offers better features such as improved video decoder/encoder, and improved RT (for what it's worth in this class of GPUs). Transistor density improved almost 40% going from N23 to N33.
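
For anyone who wants to check the "almost 40%" density figure, here is a minimal sketch of the arithmetic. The transistor counts and die areas are assumptions taken from commonly reported public figures for Navi 23 and Navi 33, not from this post, so treat the result as approximate.

```python
# Commonly reported die figures (assumptions, not stated in the post above).
NAVI23 = {"transistors": 11.06e9, "area_mm2": 237.0}  # TSMC N7
NAVI33 = {"transistors": 13.3e9, "area_mm2": 204.0}   # TSMC N6


def density_mtr_per_mm2(die):
    """Transistor density in millions of transistors per square millimetre."""
    return die["transistors"] / die["area_mm2"] / 1e6


def density_gain(old, new):
    """Fractional density improvement going from `old` to `new`."""
    return density_mtr_per_mm2(new) / density_mtr_per_mm2(old) - 1.0


if __name__ == "__main__":
    print(f"Navi 23: {density_mtr_per_mm2(NAVI23):.1f} MTr/mm^2")
    print(f"Navi 33: {density_mtr_per_mm2(NAVI33):.1f} MTr/mm^2")
    print(f"Gain:    {density_gain(NAVI23, NAVI33):.1%}")
```

With those inputs the gain comes out a little under 40%, which lines up with the claim.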
 

Kronos1996

Junior Member
Dec 28, 2022
15
17
41
That's just the Desktop part with a different name, made for high power laptops with dGPU.

Note that, as a desktop part, it has a MUCH smaller GPU (only 2 CU) than the real laptop parts.

As always, there is strong pressure to make everything as small as possible.




Those aren't my guesses. It's SemiAnalysis's detailed work vs. your guesses.

Of course they will move on to new processes. But that doesn't mean they are going to pay to do a large increase in transistors, when transistor costs are flat. Your faulty assumption is that they were getting a big increase in transistor budget for free, which they aren't.
Yes, I’m aware; the point is that MCM packaging is ready for mobile applications. In fact, Intel’s entire product stack will be MCM starting with Meteor Lake, and I’m sure AMD will follow suit. Modular APUs with different CPU/GPU/cache configurations will be the norm very shortly.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,226
5,228
136
Yes, I’m aware; the point is that MCM packaging is ready for mobile applications. In fact, Intel’s entire product stack will be MCM starting with Meteor Lake, and I’m sure AMD will follow suit. Modular APUs with different CPU/GPU/cache configurations will be the norm very shortly.

A desktop part used in a high power laptop doesn't indicate MCM is ready for mobile. They have been putting high power desktop chips in laptops for as long as there have been laptops.

Monolithic is still more power efficient, and is the preferred option where mobile efficiency is needed, which is the bulk of the mainstream mobile market. That is why Phoenix will still be monolithic. No indication on Strix Point yet.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
Yes, I’m aware; the point is that MCM packaging is ready for mobile applications. In fact, Intel’s entire product stack will be MCM starting with Meteor Lake, and I’m sure AMD will follow suit. Modular APUs with different CPU/GPU/cache configurations will be the norm very shortly.
For AMD, the powerful iGPU + powerful CPU combo will be monolithic. What will be on chiplets is cache + memory controllers.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
Number of transistors is not that important, what's important is actual size on the same process.

RDNA3 CU is supposedly a bit smaller than RDNA2 CU on the same process according to SkyJuice from Angstronomics.


Then this is from Locuza. If N5 provides 70% better scaling, then without it, it would still be 50% denser.
View attachment 75410
There is no reason to use RDNA2 when RDNA3(+) has comparable size on the same node and is still faster.

Steam Deck has only a 15W power budget for the SoC, and that SoC is a 4C8T Zen2 with a 1-1.6GHz 8CU RDNA2 IGP on the N7 process.
16-18CU RDNA2(3) is too much for the N4 or N5 process at 15W.
Phoenix is using N4 instead of N7 and still kept a 12CU IGP; only the boost clock increased by 25%.
They can either keep 8CU and significantly increase frequency to 2-2.4GHz (+50%), depending on the V/F curve, or keep the frequency and increase the CU count to 12 (+50%).
Phoenix reviews will show us how high it can clock at a limited TDP with 12CU.
For Steam Deck 2 and a 15W TDP you'll have a completely standard Strix Point, the smallest die.
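
The two +50% options can be compared with a rough first-order sketch. It assumes theoretical shader throughput scales linearly with CU count times clock, which ignores power, the V/F curve, and memory bandwidth, so it only shows that both paths land on the same paper number. The function name and the choice of the 1.6GHz top clock as the baseline are illustrative assumptions.

```python
def relative_throughput(cus, clock_ghz, base_cus=8, base_clock_ghz=1.6):
    """Theoretical throughput relative to the Steam Deck baseline.

    Assumes throughput is proportional to CUs * clock (a simplification).
    """
    return (cus * clock_ghz) / (base_cus * base_clock_ghz)


# Option A: keep 8 CUs, raise the top clock from 1.6GHz to 2.4GHz.
option_clock = relative_throughput(cus=8, clock_ghz=2.4)

# Option B: keep the 1.6GHz clock, grow the GPU from 8 to 12 CUs.
option_wide = relative_throughput(cus=12, clock_ghz=1.6)

if __name__ == "__main__":
    print(f"8 CU @ 2.4GHz:  {option_clock:.2f}x baseline")
    print(f"12 CU @ 1.6GHz: {option_wide:.2f}x baseline")
```

Both come out at 1.5x on paper; which one fits in 15W depends on where the V/F curve puts 2.4GHz, and this sketch does not model that.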
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
LPDDR5T is 64 bit per memory chip, a 9600 MT/s memory standard. So two of these and you get 153 GB/s, and four of these on a 256 bit bus gives 307 GB/s.

That would be plenty enough for feeding even more than 24 CUs.

N24 has 1024 ALUs, 16 MB of Infinity Cache, a 64 bit bus, and 144 GB/s of memory bandwidth in the highest possible SKU.
Strix Point is rumored to have 1536 ALUs, 32 MB of L4 cache, and a 128 bit DDR5 memory controller, most likely running at 6400 MT/s for 102 GB/s.

Strix Point will have 50% more ALUs, 100% more IC, and around 29% less memory bandwidth, while having far more memory capacity available as VRAM.

I think Strix Point will be fine; even with 6400 MT/s memory and only 102 GB/s of bandwidth, it should be fast enough to deliver 6000-6500 pts in 3DMark Time Spy.

And 6000 pts is RTX 2060-RX 5600 XT mobile levels of performance.
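
The bandwidth figures in this post can be reproduced with a small sketch. It uses the usual peak formula, transfer rate times bus width in bytes, and gives theoretical peaks only. The function name is illustrative, and the 144 GB/s N24 figure is taken directly from the post rather than derived.

```python
def peak_bandwidth_gbs(transfer_rate_mts, bus_width_bits):
    """Peak theoretical bandwidth in GB/s.

    transfer_rate_mts: memory data rate in megatransfers per second.
    bus_width_bits:    total memory bus width in bits.
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


lpddr5t_128 = peak_bandwidth_gbs(9600, 128)  # two 64 bit packages
lpddr5t_256 = peak_bandwidth_gbs(9600, 256)  # four 64 bit packages
ddr5_128 = peak_bandwidth_gbs(6400, 128)     # rumored Strix Point setup

N24_BANDWIDTH_GBS = 144.0                    # figure quoted in the post
deficit_vs_n24 = 1.0 - ddr5_128 / N24_BANDWIDTH_GBS

if __name__ == "__main__":
    print(f"LPDDR5T, 128 bit:  {lpddr5t_128:.1f} GB/s")
    print(f"LPDDR5T, 256 bit:  {lpddr5t_256:.1f} GB/s")
    print(f"DDR5-6400, 128 bit: {ddr5_128:.1f} GB/s")
    print(f"Deficit vs N24:    {deficit_vs_n24:.0%}")
```

Note that on an APU this bandwidth is shared with the CPU, so the GPU sees less than the peak in practice.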
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,226
5,228
136
LPDDR5T is 64 bit per memory chip, a 9600 MT/s memory standard. So two of these and you get 153 GB/s, and four of these on a 256 bit bus gives 307 GB/s.

There is no 256 bit bus AMD design for APUs. They are meant to run on the standard 128 bit AM5 socket, with a similar pinout when soldered into a laptop.

~150 GB/s is theoretically possible on 128 bit designs, if they support LPDDR5T. Do they even support LPDDR5X yet?
 

MrTeal

Diamond Member
Dec 7, 2003
3,569
1,698
136
I'm not sure I'd say 153GB/s is plenty for a 24CU GPU and the CPU, but it's certainly better. It would definitely help if Strix Point was used in something like a Steam Deck or console. For mobile, it's harder to say. Most big gaming laptops ship with DDR5 SODIMMs instead of LPDDR5. You see LPDDR5 in high margin ultralights like the Carbon, but sizing RAM is probably a bit of a nightmare if you're talking about a shared memory system with soldered RAM.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
There is no 256 bit bus AMD design for APUs. They are meant to run on the standard 128 bit AM5 socket, with a similar pinout when soldered into a laptop.

~150 GB/s is theoretically possible on 128 bit designs, if they support LPDDR5T. Do they even support LPDDR5X yet?
What makes you think they are meant to run in standard AM5 boards?

What if they will come soldered to MoBos with soldered RAM, or with CAMM sockets?
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,226
5,228
136
What makes you think they are meant to run in standard AM5 boards?

What if they will come soldered to MoBos with soldered RAM, or with CAMM sockets?

Same reason, ALL their APUs in history fit their current socket.

Because it makes business sense, instead of being based on wishful thinking.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
Same reason, ALL their APUs in history fit their current socket.

Because it makes business sense, instead of being based on wishful thinking.
It's not wishful thinking, it's necessary.

Intel's large SOCs are going to be soldered as well. They are mobile, BGA-only parts. Those SOCs will not land on DIY platforms.

So now, let me ask you a question. Knowing that DIY is a dying platform, that it's not financially feasible to maintain low-end products, that OEMs want different solutions, and that next-generation memory will require changing the tech and, well, soldering the RAM, what will happen with desktop DIY PCs, hmmm?

Why are Intel and AMD focusing on development of Mini-PCs that have an external PCIe connection, like the Compute Element from Intel, or a full Mini-PC with an external PCIe port that connects to a docking station which then connects to a dGPU?

DIY is dying, and it will become only the highest end of high-end solutions. That's why I have said that 90% of the market in the very near future is going to be APUs/SOCs.

And not being bound by DIY platform limitations will allow companies like Intel or AMD to innovate on the Unified Memory Architecture front.

Apple is their biggest competition, and OEMs want to compete with Apple. And Apple sells only Unified Memory Architecture solutions.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,226
5,228
136
It's not wishful thinking, it's necessary.

Necessary, to fulfill your wish for a big integrated GPU. :rolleyes:

AMD's competition is Intel, and there is no sign Intel is even going to catch Rembrandt GPU performance with its iGPU, so there is no real pressure on AMD to increase its iGPU.

When AMD needs to grow its iGPU, then a half step would make more sense, so a 16-18 CU part.

24 CU is just the typical clickbait to excite people and get clicks. Rumors said Phoenix would be 24 CU as well. It's more exciting so it gets more clicks.

We have an APU thread, which is probably the best place to continue those discussions:
 

MrTeal

Diamond Member
Dec 7, 2003
3,569
1,698
136
This is getting really far afield and off topic for this thread. Phoenix and Strix Point were being discussed because they are RDNA3(+) chips, and lately specifically the rumors that SP would have a 24CU iGPU. Unannounced possible future APUs with >2 channel memory architectures and powerful GPUs to compete with the Apple M chips might be an interesting thread, but it doesn't have anything to do with RDNA3.
 

Mopetar

Diamond Member
Jan 31, 2011
7,831
5,980
136
The Infinity cache reduces the need for BW only to a certain degree. The 24 CU APU of the rumor is a 9 TF part.

Sure that seems like a lot, but eventually it won't be. It's roughly 50% more than top-end Polaris GPUs which are almost 7 years old at this point or essentially what a 3050 will get you today. That probably represents a good performance tier to aim for at the entry level. Considering even just the MSRP of a 3050, it makes a tempting pie to take a bite of.
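
As a rough sanity check on the 9 TF figure and the Polaris comparison, here is a hedged sketch of the usual peak FP32 formula. The 64 lanes per CU, the dual-issue factor for RDNA3, and the top-end Polaris configuration (36 CUs at about 1.34GHz) are assumptions based on publicly listed specs, not numbers from this thread. The rumor does not say which clock or issue rate it assumes, so both cases are shown.

```python
def fp32_tflops(cus, clock_ghz, lanes_per_cu=64, dual_issue=False):
    """Peak FP32 TFLOPS: lanes * 2 FLOPs per FMA * clock (x2 if dual-issue)."""
    flops_per_lane_per_clock = 2 * (2 if dual_issue else 1)
    return cus * lanes_per_cu * flops_per_lane_per_clock * clock_ghz / 1000.0


def clock_for_tflops(target_tflops, cus, lanes_per_cu=64, dual_issue=False):
    """Clock in GHz needed to reach `target_tflops` with the given config."""
    tflops_per_ghz = fp32_tflops(cus, 1.0, lanes_per_cu, dual_issue)
    return target_tflops / tflops_per_ghz


# Clock a 24 CU part would need for 9 TF under each assumption.
clock_dual = clock_for_tflops(9.0, cus=24, dual_issue=True)
clock_single = clock_for_tflops(9.0, cus=24, dual_issue=False)

# Assumed top-end Polaris configuration: 36 CUs at roughly 1.34GHz boost.
polaris_tflops = fp32_tflops(cus=36, clock_ghz=1.34)
ratio_vs_polaris = 9.0 / polaris_tflops

if __name__ == "__main__":
    print(f"24 CU, dual-issue counted:     {clock_dual:.2f} GHz for 9 TF")
    print(f"24 CU, dual-issue not counted: {clock_single:.2f} GHz for 9 TF")
    print(f"Assumed Polaris peak: {polaris_tflops:.2f} TF")
    print(f"9 TF vs Polaris:      {ratio_vs_polaris:.2f}x")
```

Under these assumptions 9 TF is a bit under 1.5x the Polaris figure, consistent with "roughly 50% more". Peak TFLOPS across different architectures is only a loose proxy for game performance.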

Particularly challenging at 3nm is that SRAM (cache) didn't scale at all from 5nm. Zero SRAM scaling and increased process costs make adding a big cache challenging (AKA expensive).

All the more reason to make a shared last level cache that both the CPU and GPU cores can utilize. Having two separate dies that both need some cache just means extra silicon for each.

Maybe some company does need to work with AMD for them to want to make a product like this, but as the market changes and evolves I think we'll see stronger APUs that creep upwards in capabilities to capture the eroding low end of the GPU market.

Perhaps consider it as an entry-level GPU with a CPU attached rather than the other way around. Given how expensive even low-end GPUs have become, there's money to be had in that market segment given it's already so price sensitive.
 

Joe NYC

Golden Member
Jun 26, 2021
1,934
2,272
106
AMD's competition is Intel, and there is no sign Intel is even going to catch Rembrandt GPU performance with its iGPU, so there is no real pressure on AMD to increase its iGPU.

AMD's competition (for the APU, iGPU) ranges from less capable Intel processors to the more capable Apple M1/M2, to mobile CPUs paired with a dGPU.

AMD does better when it tries to lead than when it tries to react to Intel.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,355
2,848
106
I'm confused. How is 7nm RDNA2 the "same process" as 5nm RDNA3?

But the Deck is really using only 8CUs! I always believed it was 12CU.
You have everything in that post.
An RDNA3 WGP on 5nm is a lot smaller than an RDNA2 WGP on 7nm.
If they were using the same process for both, then it should still be a bit smaller.
Just for your info:
RDNA2 is using 7nm (N21, N22, N23) and 6nm (N24).
RDNA3 is using 5nm (N31, N32) and 6nm (N33).
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,355
2,848
106
For Steam Deck 2 and a 15W TDP you'll have a completely standard Strix Point, the smallest die.
And that's how many CU? 12 or 16?
The question is whether Strix is really 3nm and whether Valve is willing to use it in their upcoming Steam Deck 2.
If they use Strix, then that will cost more than what they currently use. Will they increase the price or sacrifice margin?
 

Timorous

Golden Member
Oct 27, 2008
1,608
2,751
136
I guess the million dollar question is whether or not more cache would actually mean appreciably more performance for the 7900 series.

Probably not as is, but maybe if it had hit the much higher clocks it seems it was supposed to, then the extra bandwidth from double the Infinity Cache would have come in handy.