[Question] Zen 6 Speculation Thread

Page 228 | AnandTech Forums

marees

Golden Member
Apr 28, 2024
1,446
2,031
96
The low-end to mid-range dGPUs would use LPDDR instead of GDDR.

And a professional / AI version of the same card could use the biggest LPDDR chips that are currently on the market.

I find it quite intriguing that AMD is able to contain so much of the bandwidth requirement using on-die L2s, to the point that it can get away with LPDDR memory.
Imagine the RDNA 5 version of the 3060 12 GB being a 10050 XT 64 GB!!
 

Joe NYC

Diamond Member
Jun 26, 2021
3,428
5,024
136
Imagine the RDNA 5 version of the 3060 12 GB being a 10050 XT 64 GB!!

AMD could be trolling NVidia on the low end with big LPDDR5 memory sizes.

What I wonder, though, is why not do the same throughout the stack?

If NVidia can go up to a 512-bit memory bus (8 channels), why not go to 6 LPDDR6 channels on a high-end card, which would be 576 bits?

Because then, if the biggest LPDDR6 package is 64 GB, the high-end professional / AI card could have 384 GB, which would be maximum trolling.
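To make the arithmetic concrete, here's a toy sketch. The 96-bit-per-package width (which would make 6 packages come out to 576 bits) and the one-package-per-channel capacity rule are my assumptions for illustration, not confirmed LPDDR6 specs:

```python
# Toy bus-width and capacity math for the hypothetical cards above.
# Channel widths and the 64 GB package size are assumptions.

def total_bus_bits(channels: int, bits_per_channel: int) -> int:
    """Total memory bus width in bits."""
    return channels * bits_per_channel

def max_capacity_gb(channels: int, gb_per_package: int) -> int:
    """Max capacity if each channel is fed by one package."""
    return channels * gb_per_package

# NVidia-style GDDR bus: 8 x 64-bit channels
print(total_bus_bits(8, 64))    # 512

# Hypothetical 6-channel LPDDR6 card with 96-bit packages
print(total_bus_bits(6, 96))    # 576

# With assumed 64 GB packages, one per channel
print(max_capacity_gb(6, 64))   # 384
```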

Or maybe split the stack: high-end gaming on GDDR7, high-end professional / AI on LPDDR6.

But it's also good to keep in mind that NVidia is doing a lot of work with LPDDR across its product stack too, so AMD may not have a monopoly here.
 
  • Like
Reactions: Tlh97 and marees

MS_AT

Senior member
Jul 15, 2024
812
1,637
96
The Unreal compilation workload shows memory activity in HWiNFO similar to running a memory stress test like Karhu or OCCT.
I don't deny it. But I suspect you don't hit peak bandwidth; at least I don't when compiling LLVM with LLVM on Windows, which I'd guess is a somewhat comparable load to Unreal. Besides, looking at various Phoronix tests, compilation does not really scale well with memory bandwidth (the compilation benchmarks are usually in the middle of the page):
https://www.phoronix.com/review/threadripper-9000-ddr5-6400-4800/2
https://www.phoronix.com/review/amd-ryzen9-9950x-ddr5/4
https://www.phoronix.com/review/8-12-channel-epyc-9005/4

Or maybe it would be better to say that once you satisfy a specific bandwidth threshold (likely related to how many cores you have and how much parallelism exists in the build), the compilation workload doesn't care anymore, and the bandwidth requirement is relatively modest in the first place.
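That threshold idea can be sketched as a toy roofline-style model: throughput is the lesser of a compute limit and a bandwidth limit, so bandwidth beyond the crossover buys nothing. All numbers here are invented for illustration:

```python
# Toy model of the "BW threshold": build throughput is capped by
# min(compute limit, bandwidth limit). Numbers are made up.

def build_throughput(cores: int, per_core_rate: float,
                     bw_gbs: float, gb_per_unit: float) -> float:
    """Compilation units per second under a min(compute, bandwidth) cap."""
    compute_limit = cores * per_core_rate    # units/s if BW were infinite
    bandwidth_limit = bw_gbs / gb_per_unit   # units/s if cores were infinite
    return min(compute_limit, bandwidth_limit)

# 16 cores, 0.5 units/s per core, ~4 GB of memory traffic per unit:
# throughput stops improving once bandwidth passes 32 GB/s.
for bw in (20, 40, 80, 160):
    print(bw, build_throughput(16, 0.5, bw, 4.0))
```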
 

blackangus

Senior member
Aug 5, 2022
247
451
106
I don't deny it. But I suspect you don't hit peak bandwidth; at least I don't when compiling LLVM with LLVM on Windows, which I'd guess is a somewhat comparable load to Unreal. Besides, looking at various Phoronix tests, compilation does not really scale well with memory bandwidth (the compilation benchmarks are usually in the middle of the page):
https://www.phoronix.com/review/threadripper-9000-ddr5-6400-4800/2
https://www.phoronix.com/review/amd-ryzen9-9950x-ddr5/4
https://www.phoronix.com/review/8-12-channel-epyc-9005/4

Or maybe it would be better to say that once you satisfy a specific bandwidth threshold (likely related to how many cores you have and how much parallelism exists in the build), the compilation workload doesn't care anymore, and the bandwidth requirement is relatively modest in the first place.
I'd GUESS that compilation in both cases is bound by memory operations (latency) rather than by bandwidth. But that is just a guess.
 

Magras00

Junior Member
Aug 9, 2025
21
46
46
The funny thing about "AMD traded MALL in Strix Point for the NPU" is that both the MALL and the NPU are deprecated in the future.
Cost-saving measures, right?

NPU + GPU vs. a GPU-only design with beefed-up ML hardware: the latter gives higher TOPS/mm^2.
Imagination Technologies is also deprecating the NPU with its E-Series GPU.

Navi 48's MALL has around 10% area overhead. IDK how much area L2+MALL unification (matching NVIDIA's L2) would add vs. L2 only.

AMD is clearly going the NVIDIA route with no MALL in RDNA 5. Look at how NVIDIA's GB203 squeezes all non-shader GPU core logic into an area footprint only ~10% higher than Navi 48's. Obviously a different µarch, but Navi 48 could've been significantly smaller with a 64MB L2 instead of 8MB L2 + 64MB MALL.
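As a back-of-the-envelope illustration of that L2-vs-MALL area point: the SRAM density and tag/control overhead figures below are rough assumptions I picked for illustration, not measured Navi 48 numbers.

```python
# Toy area comparison: split 8MB L2 + 64MB MALL vs. a unified 64MB L2.
# Density (~20 mm^2 per 32 MB of last-level SRAM) and the 30% overhead
# factor are assumptions, not die-shot measurements.

MM2_PER_MB = 20 / 32   # assumed high-density SRAM, mm^2 per MB

def cache_area_mm2(mb: float, overhead: float = 1.3) -> float:
    """SRAM array area plus an assumed 30% tag/control overhead."""
    return mb * MM2_PER_MB * overhead

split = cache_area_mm2(8) + cache_area_mm2(64)   # 8 MB L2 + 64 MB MALL
unified = cache_area_mm2(64)                     # single 64 MB L2
print(round(split, 1), round(unified, 1))        # split is larger
```

Under these assumptions the saving is only the 8MB L2's worth of area; the bigger wins a unified design might offer (shared tags, one coherence point) aren't captured by a pure SRAM-area count.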
 
  • Like
Reactions: Joe NYC and marees

Magras00

Junior Member
Aug 9, 2025
21
46
46
For actual chiplet stuff (SED+AID+MID) I think so, but for the quasi-monolithic GPUs like ATx with just GMD+MID it's gone.

What is GMD and AID?

SED = Shader Engine Die

MID = Media Interface Die

GMD = ?

AID = ?

GFX13 and later introduce SE autonomy (L1 cache), so could the MALL effectively be the L2 in actual chiplet designs? I just can't see why SED chiplets would require one more cache level when SEs are almost completely autonomous in GFX13 and later. Unless AMD is clustering SEs and needs a larger local data store than the L1s within the SEs. That would make sense, as IIRC one RDNA 4 SE is only ~26-27mm^2.
This assumes SEs aren't completely rearchitected and blown up to massive size in RDNA 6 and later (RDNA 5 isn't going full MCM). Do we even know anything about RDNA 6 right now?

Maybe UDNA is about laying the groundwork for heterogeneous compute in GFX14+? Hypothetically, mixing and matching CDNAx, RDNAx, and ASIC chiplets behind one frontend (the OS only sees one GPU). Effectively a Zen-like GPU chiplet architecture.
Maybe we'll get a few tidbits at AMD's 2025 Financial Analyst Day in November?
 

ToTTenTranz

Senior member
Feb 4, 2021
508
920
136
I went to the mountain, and I was given these words of wisdom, carved on marble tablets, to share with you:

[Image attachment 129070]


I guess Medusa Halo Mini is the actual Strix Point successor that goes into handhelds and small gaming laptops.
128-bit LP5X might be a bottleneck, but LP6 would be a big improvement.


AMD going with LPDDR for its dGPUs sounds crazy, though. How does pJ-per-byte for LP6 compare to GDDR7?
It does lend itself to the very high memory capacities that many will be looking for to run AI models locally.
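For what it's worth, converting energy-per-bit into interface watts is simple; the energy figures below are ballpark assumptions (real GDDR7 and LPDDR6 pJ/bit numbers aren't public), used only to show the conversion:

```python
# Interface power from energy-per-bit and bandwidth.
# 7 pJ/bit (GDDR-class) and 3.5 pJ/bit (LP-class) are assumed figures.

def interface_power_watts(pj_per_bit: float, bandwidth_gbs: float) -> float:
    """pJ/bit * bits/s -> watts, simplified: pj * 8 bits/byte * GB/s / 1000."""
    return pj_per_bit * 8 * bandwidth_gbs / 1000

print(interface_power_watts(7.0, 500))   # 28.0 W at an assumed 7 pJ/bit
print(interface_power_watts(3.5, 500))   # 14.0 W at an assumed 3.5 pJ/bit
```

If the roughly 2x pJ/bit gap holds, halving interface power at the same bandwidth is exactly the kind of win that would make LPDDR attractive below the high end.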
 
  • Like
Reactions: bearmoo and marees

marees

Golden Member
Apr 28, 2024
1,446
2,031
96
I guess Medusa Halo Mini is the actual Strix Point successor that goes into handhelds and small gaming laptops.
128-bit LP5X might be a bottleneck, but LP6 would be a big improvement.


AMD going with LPDDR for its dGPUs sounds crazy, though. How does pJ-per-byte for LP6 compare to GDDR7?
It does lend itself to the very high memory capacities that many will be looking for to run AI models locally.
If true, then Medusa (Mini) Premium will also go into the 2027 version of the Xbox ROG Ally handhelds.
 
  • Like
Reactions: ToTTenTranz

soresu

Diamond Member
Dec 19, 2014
3,975
3,421
136
And Sony's rumored "PS6 handheld" is DOA unless it's a third of the price.
If people can play PS4 and PS5 games on it (plus maybe PS3, and certainly PS1 + PS2 via emulation on Zen 6), then it has something other handhelds don't have, at least not legally or without some measure of technical aptitude.

Don't underestimate the draw of off-the-shelf legacy game libraries.
 

Kepler_L2

Senior member
Sep 6, 2020
954
3,928
136
What is GMD and AID?

SED = Shader Engine Die

MID = Media Interface Die

GMD = ?

AID = ?

GFX13 and later introduce SE autonomy (L1 cache), so could the MALL effectively be the L2 in actual chiplet designs? I just can't see why SED chiplets would require one more cache level when SEs are almost completely autonomous in GFX13 and later. Unless AMD is clustering SEs and needs a larger local data store than the L1s within the SEs. That would make sense, as IIRC one RDNA 4 SE is only ~26-27mm^2.
This assumes SEs aren't completely rearchitected and blown up to massive size in RDNA 6 and later (RDNA 5 isn't going full MCM). Do we even know anything about RDNA 6 right now?

Maybe UDNA is about laying the groundwork for heterogeneous compute in GFX14+? Hypothetically, mixing and matching CDNAx, RDNAx, and ASIC chiplets behind one frontend (the OS only sees one GPU). Effectively a Zen-like GPU chiplet architecture.
Maybe we'll get a few tidbits at AMD's 2025 Financial Analyst Day in November?
GMD = Graphics Memory Die

AID = Active Interposer Die