Discussion: RDNA 5 / UDNA (CDNA Next) speculation


marees

Golden Member
Apr 28, 2024
I was wondering if these would end up as standalone or shared dies.

Sharing memory I/O would suggest a shared die is the better option.
If I grokked MLID correctly, the CPU is in the I/O die.

Then you have 2 options:
  1. Add another CPU die (replacement for Strix Point)
  2. Add another GPU die (Medusa Premium & also Halo, I think)
 

Tuna-Fish

Golden Member
Mar 4, 2011
I wonder what will be the smallest LPDDR6 chip available? IIRC the smallest LPDDR5 ones are 8Gb, but I dunno if those are even still in production; the most common ones I see around are 12Gb.

AMD might be forced to put more memory on AT3 cards than on AT2 ones?
 
  • Like
Reactions: Tlh97 and marees

marees

Golden Member
Apr 28, 2024
I wonder what will be the smallest LPDDR6 chip available? IIRC the smallest LPDDR5 ones are 8Gb, but I dunno if those are even still in production; the most common ones I see around are 12Gb.

AMD might be forced to put more memory on AT3 cards than on AT2 ones?
Need a thread on how much VRAM is too much VRAM 😉
 

Joe NYC

Diamond Member
Jun 26, 2021
If I grokked MLID correctly, the CPU is in the I/O die.

Then you have 2 options:
  1. Add another CPU die (replacement for Strix Point)
  2. Add another GPU die (Medusa Premium & also Halo, I think)

One thing that contradicts this is the Xbox configuration shown in previous videos, where there is a base monolithic CPU/SoC die and a separate GPU die.

And this video suggests that approach may be shared with laptops.
 

marees

Golden Member
Apr 28, 2024
One thing that contradicts this is the Xbox configuration shown in previous videos, where there is a base monolithic CPU/SoC die and a separate GPU die.

And this video suggests that approach may be shared with laptops.
No, the Xbox is GDDR7 (desktop mode);
Medusa Halo & Premium are LPDDR6 (laptop mode).

So the architecture changes. But still, take all this with mountains of salt, as MLID is the source.
 

Joe NYC

Diamond Member
Jun 26, 2021
No, the Xbox is GDDR7 (desktop mode);
Medusa Halo & Premium are LPDDR6 (laptop mode).

So the architecture changes. But still, take all this with mountains of salt, as MLID is the source.

It is not spelled out what is on each die, but the way I understand it, what he calls the IOD in Medusa Point is a die that also has the base set of cores: ~4 full + 4-8 dense + 2 LP cores.

And this would be the base, low-cost monolithic laptop die.

Then, on one end, you can add the 12-core Zen 6 CPU chiplet.
And in Medusa Mini, on the other end, you can add the small GPU chiplet.

In Medusa Full, instead, add the big GPU chiplet.
 

marees

Golden Member
Apr 28, 2024
Any guesses on (2027) launch prices?

  1. AT0 (10090xt > 5090) — 384bit bus, so $1500+?
  2. AT1 (10080xt) — scrapped
  3. AT2 (10070xt = 5080 > Xbox Next) — 72 CU, 192bit GDDR7, so $600+
  4. AT3 (10060xt < 5070) — 48 CU, 384bit LPDDR6, so $400+
  5. 9060xt 16GB (= PS5 Pro) ~ $300
  6. AT4 (10050xt > 3060 12GB in raster) — 24 CU, 128bit LPDDR6, so $250?
My revised estimations / guesstimates (no LLMs used)

(Assuming this LPDDR VRAM thingy is true & also assuming it works out) Imagine this line-up (in 2027):

  • AT0
    • 10090xt+ — multiple models starting at $1500+, with huge VRAM like the Radeon VII or Titan
  • AT1
    • 10080xt — scrapped (Lisa Su took her toys & went home)
  • AT2 (GDDR7)
    • 10070 xtx 24GB = $700 (~5080)
    • 10070 xt 18GB = $600 (~5070 Ti)
    • 10070 gre 15GB = $500-$550 (~5070 Super)
  • AT3 (LPDDR6)
    • 10060 xt 24GB = $450-$500 (~5070)
    • 10060 16GB = $400 (~5060 Ti 16GB)
  • AT4 (LPDDR6/LPDDR5X)
    • 10050xt 32GB = $350 (~9060xt 16GB)
    • 10050xt 24GB = $300 (~9060)
    • 10040xt 16GB = $250 (~3060 12GB in raster)
 
Last edited:

marees

Golden Member
Apr 28, 2024
Why add so much memory to AT2, AT3 and AT4? I would assume 18GB / 16GB / 12GB for these.

I'd like to get more, but it is unlikely that we are seeing that.
AT3 & AT4 are joke guesses, because MLID said LPDDR.


AT2 I have to give a serious rethink.
I am now thinking the xtx will use 4GB GDDR7 chips while the xt & gre will use 3GB GDDR7 chips.
 

Saylick

Diamond Member
Sep 10, 2012
My revised estimations / guesstimates (no LLMs used)

(Assuming this LPDDR VRAM thingy is true & also assuming it works out) Imagine this line-up (in 2027):

  • AT0
    • 10090xt+ — multiple models starting at $1500+, with huge VRAM like the Radeon VII or Titan
  • AT1
    • 10080xt — scrapped (Lisa Su took her toys & went home)
  • AT2 (GDDR7)
    • 10070 xtx 24GB = $700 (~5080)
    • 10070 xt 18GB = $600 (~5070 Ti)
    • 10070 gre 15GB = $500-$550 (~5070 Super)
  • AT3 (LPDDR6)
    • 10060 xt 24GB = $450-$500 (~5070)
    • 10060 16GB = $400 (~5060 Ti 16GB)
  • AT4 (LPDDR6/LPDDR5X)
    • 10050xt 32GB = $350 (~9060xt 16GB)
    • 10050xt 24GB = $300 (~9060)
    • 10040xt 16GB = $250 (~3060 12GB in raster)
I'd be surprised if the 10070 XT, or whatever they call it, ends up being only a 5070 Ti at $600. That's basically the same perf/$ as a 9070 XT, but with 50% more VRAM.
 

basix

Member
Oct 4, 2024
AT3 & AT4 are joke guesses, because MLID said LPDDR.


AT2 I have to give a serious rethink.
I am now thinking the xtx will use 4GB GDDR7 chips while the xt & gre will use 3GB GDDR7 chips.
Even if they are using LPDDR6, why add more memory than is useful? These things have to be cheap, and 16 GByte for a mainstream GPU and 12 GByte for the low-end part seem reasonable.
Yes, you can add more memory. But why should AMD do that when it is of no big benefit to the average gamer?
The same logic applies to AT2 with GDDR7. 18 GByte is the most reasonable on 192bit with 24 Gbit chips. And 18 GByte is perfectly suited for a 1440p card, and fine enough for 4K with upscaling. Most gamers won't benefit from 24 GByte but would have to pay more.

For workstation and professional parts it will be another story. There you could attach 128 GByte to a 2-ch LPDDR6 bus if AMD wants to.
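To make the capacity arithmetic explicit, here is a quick Python sanity check (a sketch, assuming one x32 GDDR7 device per 32 bits of bus, no clamshell; densities in Gbit):

```python
# GDDR7 capacity back-of-the-envelope: one x32 device per 32 bits of bus.
def gddr7_capacity_gbyte(bus_bits: int, density_gbit: int) -> float:
    devices = bus_bits // 32          # GDDR7 devices are 32 bits wide
    return devices * density_gbit / 8 # Gbit -> GByte

print(gddr7_capacity_gbyte(192, 24))  # 18.0 -> the 18 GByte AT2 config above
print(gddr7_capacity_gbyte(192, 32))  # 24.0 -> the 24 GByte option with 32 Gbit chips
```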

never say the AI bubble didn't do anything for you

Maybe this ML/AI stuff is even the sole reason for exchanging a GDDR7 memory interface for an LPDDR6 one on the lower-end parts. You can build dGPUs alongside APUs with the same chips and the same humongous amounts of memory for the professional market. Good for ML/AI workloads, and probably also nice for other workstation applications where the FLOPS of a midrange GPU are enough but more memory is welcome (EDA etc.).
 
Last edited:
  • Like
Reactions: Magras00 and marees

marees

Golden Member
Apr 28, 2024
Even if they are using LPDDR6, why add more memory than is useful? These things have to be cheap, and 16 GByte for a mainstream GPU and 12 GByte for the low-end part seem reasonable.
Yes, you can add more memory. But why should AMD do that when it is of no big benefit to the average gamer?
The same logic applies to AT2 with GDDR7. 18 GByte is the most reasonable on a 192bit SI with 24 Gbit chips. And 18 GByte is perfectly suited for a 1440p card, and fine enough for 4K with upscaling. Most gamers won't benefit from 24 GByte but would have to pay more.
Not sure what will happen with AT2.

But for AT3 & AT4, the digital-camera megapixels scenario applies, IMO.
Basically marketing.

Anecdotally, there was a 2GB VRAM Nvidia card much slower than a 1GB VRAM card, but my colleague bought the slower 2GB one & very proudly proclaimed that he bought the 2GB. That is the market I have in mind for AT3 & AT4, definitely not forum users.
 

basix

Member
Oct 4, 2024
I know what you are thinking of. But does that work as well today as it did in the past? Anybody can pull out ChatGPT and ask for the better GPU (and might get the correct answer - or not).
For example, the megapixel race has pretty much ended. Many new phones and cameras get released with fewer pixels than their predecessors. People either have gained more knowledge (more pixels != more quality), simply don't care because it's good enough, or are not interested in technical details.

Higher VRAM amounts on lower-end parts make the more expensive ones less attractive as well.
 

marees

Golden Member
Apr 28, 2024
I know what you are thinking of. But does that work as well today as it did in the past? Anybody can pull out ChatGPT and ask for the better GPU (and might get the correct answer - or not).
For example, the megapixel race has pretty much ended. Many new phones and cameras get released with fewer pixels than their predecessors. People either have gained more knowledge (more pixels != more quality), simply don't care because it's good enough, or are not interested in technical details.

Higher VRAM amounts on lower-end parts make the more expensive ones less attractive as well.
You are being logical.
But IMO AMD needs a marketing trick to beat the 6050 9GB?
This could be it.

But now that Jensen knows of this, he will be scheming up a riposte.

If you had these 3 options (for an entry-level GPU), which one are you buying? 🤔

  • 10050xt 32GB = $350 (~9060xt 16GB)
  • 10050xt 24GB = $300 (~9060)
  • 10040xt 16GB = $250 (~3060 12GB in raster)
 

Tuna-Fish

Golden Member
Mar 4, 2011
Why add so much memory to AT2, AT3 and AT4? I would assume 18GB / 16GB / 12GB for these.

I'd like to get more, but it is unlikely that we are seeing that.

AT3 and AT4 use LPDDR interfaces. AT3 supposedly has 384bit LPDDR6. This puts a lower limit on the amount of RAM, as I doubt there will ever be very small LPDDR6 chips.
 

basix

Member
Oct 4, 2024
Good point. 16...24 Gbit chips are probably the lower boundary. There will for sure be no 8 Gbit chips.

Which then makes 4 modules, and therefore 8...12 GByte, on a "dual-channel" 192bit LPDDR6 interface, and 8 modules, resulting in 16...24 GByte, at quad-channel. Still, everything is possible ;)
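A minimal sketch of that module math, assuming x48 LPDDR6 modules (two 24bit channels each) at 16...24 Gbit densities:

```python
# LPDDR6 capacity from bus width and per-module density (no clamshell).
def lpddr6_capacity_gbyte(bus_bits: int, density_gbit: int) -> float:
    modules = bus_bits // 48          # assuming x48 modules (2x 24bit channels)
    return modules * density_gbit / 8 # Gbit -> GByte

for bus in (192, 384):
    lo = lpddr6_capacity_gbyte(bus, 16)   # 16 Gbit modules
    hi = lpddr6_capacity_gbyte(bus, 24)   # 24 Gbit modules
    print(f"{bus}bit: {lo:.0f}...{hi:.0f} GByte")
# 192bit: 8...12 GByte
# 384bit: 16...24 GByte
```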
 

Magras00

Member
Aug 9, 2025
Cc @dangerman1337

AT3 — 48 CU, 384bit LPDDR6
AT4 — 24 CU, 128bit LPDDR6


MLID said LPDDR5X, not LPDDR6.

I wonder what will be the smallest LPDDR6 chip available? IIRC the smallest LPDDR5 ones are 8Gb, but I dunno if those are even still in production; the most common ones I see around are 12Gb.

32Gb per chip, according to Cadence. Yep, no 16Gb or even 24Gb capacities.

They list 4-64GB device densities, but that's for a 48bit bus width vs LPDDR5X's 32bit. AT3's quad-channel 384bit LPDDR6 has 8 modules, as @basix said, so a minimum capacity of 32GB, capping out at 512GB with no clamshell.
A 256bit LPDDR5X AT3 design using 9600Mbps or faster: the lowest amount is 24GB, and up to 128GB without clamshell.

Yeah, Samsung only lists 12Gb, but that's for slower LPDDR5X, not practical for AT4 unless it's really weak. Samsung lists LPDDR5X 9600Mbps densities from 24-128Gb, so AT4-based designs can be 12-64GB without clamshell.
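Running those quoted density ranges through the same arithmetic (a sketch using the Cadence and Samsung figures as given above, x48 dies for LPDDR6 and x32 for LPDDR5X, no clamshell):

```python
# (min, max) capacity = (bus width / die width) * die density range.
def capacity_range_gbyte(bus_bits, die_bits, min_gbit, max_gbit):
    dies = bus_bits // die_bits
    return dies * min_gbit / 8, dies * max_gbit / 8

print(capacity_range_gbyte(384, 48, 32, 512))  # AT3 LPDDR6:  (32.0, 512.0) GByte
print(capacity_range_gbyte(256, 32, 24, 128))  # AT3 LPDDR5X: (24.0, 128.0) GByte
print(capacity_range_gbyte(128, 32, 24, 128))  # AT4 LPDDR5X: (12.0, 64.0) GByte
```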

AT3 & AT4 are joke guesses, because MLID said LPDDR.


AT2 I have to give a serious rethink.
I am now thinking the xtx will use 4GB GDDR7 chips while the xt & gre will use 3GB GDDR7 chips.

LPDDR is unconventional but not unreasonable. It's about trade-offs. GDDR6 has a much higher GB/s per mm^2 for PHYs, while LPDDR6 lowers board complexity and power draw, and at least halves $/GB while allowing for true LLM slot-in cards. Trade-offs reminiscent of Infinity Cache.

Did some pixel counting to figure out GB/s/mm^2 for LPDDR5X and LPDDR6 (speculative). Skip to the conclusion for comparisons.
  • Strix Halo (N4) 256bit LPDDR5X PHYs (256GB/s @8000Mbps) = ~37mm^2
  • GB203/5080 (4N) 256bit GDDR7 PHYs (960GB/s @30Gbps) = ~51mm^2
  • Navi 48/9070 XT (N4) 256bit GDDR6 PHYs (645GB/s @20.1Gbps) = 41mm^2

The area overhead of GDDR7 is ~25%. No info on LPDDR6 yet, but for a worst case let's use the 25% overhead from GDDR7 and add 50 percent for the 384bit PHY (could be lower if 256bit area matches 384bit). That puts a 384bit LPDDR6 PHY at ~69mm^2, but there's no way the PHYs will be this big.

AT3: 128bit GDDR7 vs 384bit LPDDR6, with higher data rates (assuming no change to area):
69mm^2 384bit @12Gbps LPDDR6 = 576GB/s
51mm^2 256bit @36Gbps GDDR7 = 1152GB/s

128bit GDDR7 is enough for 576GB/s. That's 25.5mm^2 vs 69mm^2, so -43.5mm^2 worst case. With N3P at ~$21K/wafer, that's +$16 in silicon cost.

Conclusion
12Gbps LPDDR6 requires 2.71x (speculative worst case) more area per GB/s than 36Gbps GDDR7.
8000Mbps LPDDR5X requires 2.27x more area per GB/s than 20.1Gbps GDDR6.

GDDR6 and GDDR7 require more interconnects and spacing than LPDDR5X, so this somewhat reduces the gap, but it's still massive. If AMD is using LPDDR for AT3 and AT4, then they're increasing GPU die size for some other benefit. With that said, 2X memory per tier vs NVIDIA without a higher BOM is probably wishful thinking, but deprecating MALL and shrinking the supersized L2 (as MLID implied) will result in significant area savings.
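The conclusion numbers fall out of a few divisions; a small script reproducing them (the PHY areas are the pixel-counted and speculative figures above, so treat everything as rough; the wafer math assumes a 300mm wafer at perfect yield, so the +$16 above presumably includes yield loss):

```python
# Bandwidth = bus bits * Gbps / 8; efficiency = GB/s per mm^2 of PHY.
phys = {                    # name: (bus bits, Gbps, PHY area mm^2)
    "lpddr5x": (256,  8.0, 37.0),   # Strix Halo, pixel-counted
    "gddr6":   (256, 20.1, 41.0),   # Navi 48, pixel-counted
    "gddr7":   (256, 36.0, 51.0),   # GB203 area, at 36 Gbps
    "lpddr6":  (384, 12.0, 69.0),   # speculative worst-case area
}
eff = {n: (b * r / 8) / a for n, (b, r, a) in phys.items()}
for n, e in eff.items():
    print(f"{n}: {e:.1f} GB/s per mm^2")

print(eff["gddr7"] / eff["lpddr6"])     # ~2.71x -> LPDDR6 area penalty
print(eff["gddr6"] / eff["lpddr5x"])    # ~2.27x -> LPDDR5X area penalty

# Extra silicon: 69 mm^2 (384bit LPDDR6) vs 25.5 mm^2 (128bit GDDR7).
wafer_mm2 = 3.14159 * 150**2            # ~70,700 mm^2 on a 300mm wafer
print(43.5 * 21000 / wafer_mm2)         # ~$12.9 at perfect yield
```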
 

basix

Member
Oct 4, 2024
The board-complexity point can be an important one for mobile designs. If you can move the LPDDRx packages closer to the chip (because of the lower data rate compared to GDDR), your design gets more compact. Neat for laptops.

I did look up LPDDR5 packages, and there are 64bit packages available. If that extends to LPDDR6 (96bit per package, then), you could serve dual-channel with just two packages and quad-channel with four packages. That would be very dense regarding PCB space.
But a single die should always be 16/32bit (LPDDR5) or 24/48bit (LPDDR6) in width. So if 32 Gbit is the smallest available LPDDR6 die, AT4 would land at 16 GByte.

As a side note:
There are dual-PHYs available which support both LPDDR5X and LPDDR6, so we could see "anything" regarding SKU definition: some SKUs might use LPDDR6, some might use LPDDR5X. For chips like AT3 and AT4, such a dual-PHY would make sense and allow for a very broad specification range (bandwidth and memory capacity). Ideal for two dies which will allegedly be used for a wide range of applications: both dGPUs (low-end to mainstream, professional, ML/AI) and APUs (premium to high-end, professional, ML/AI), with a wide range of memory capacities and bandwidths (we could think of 8...512 GByte capacity and 100...700 GByte/s bandwidth). LPDDR6/5X is just much better suited for that than GDDR7. So for die re-use between dGPUs and APUs, that choice of memory type (LPDDR instead of GDDR) makes additional sense.
 
Last edited:

Magras00

Member
Aug 9, 2025
The board-complexity point can be an important one for mobile designs. If you can move the LPDDRx packages closer to the chip (because of the lower data rate compared to GDDR), your design gets more compact. Neat for laptops.

I did look up LPDDR5 packages, and there are 64bit packages available. If that extends to LPDDR6 (96bit per package, then), you could serve dual-channel with just two packages and quad-channel with four packages. That would be very dense regarding PCB space.
But a single die should always be 16/32bit (LPDDR5) or 24/48bit (LPDDR6) in width. So if 32 Gbit is the smallest available LPDDR6 die, AT4 would land at 16 GByte.

As a side note:
There are dual-PHYs available which support both LPDDR5X and LPDDR6, so we could see "anything" regarding SKU definition: some SKUs might use LPDDR6, some might use LPDDR5X. For chips like AT3 and AT4, such a dual-PHY would make sense and allow for a very broad specification range (bandwidth and memory capacity).

x96 packages could allow some interesting designs for mobile indeed. 2027 will be an interesting year for PC tech.

Yep, the smallest x64/64bit LPDDR5X package is 32Gb, and that's LPDDR5X 7500-8533; the faster memory (>8533Mbps) begins at 48Gb per package. Anything less than 96bit 64Gb packages won't happen. AT3 LPDDR6 = 32GB, AT3 LPDDR5X = 24GB. AT4 LPDDR6 = 16GB, AT4 LPDDR5X = 12GB.

Could be a weird situation with AT3 vs AT2. AT2 could be restricted to 18/24GB offerings (without clamshell), while AT3 defaults to 24/32GB.

Totally forgot: IIRC, haven't AMD mobile APUs sported dual-PHYs since IDK how long? Zen 2 or even earlier? So one dual-PHY that supports, let's say, LPDDR5X 10677 and LPDDR6 12000, used for AT3 and AT4. Likely a repeat of RDNA 4's Navi 48 and 44: a mirrored die for lower design cost, with the addition of IO, display and media moved to a separate MID.

Edit: Like @basix said, dual-PHYs are available, and I can't see AMD not going for an easy win here. AT4 with 24 RDNA 5 CUs and 192bit LPDDR6 12000: 16GB at 288 GB/s, only a 10% regression vs the 9060 XT, could be very interesting. Add 16-24MB of L2, increase clocks by 10-15% vs the 9060 XT, and add ~20% higher raster IPC (Kepler's guesstimate), and AMD could easily match the 9060 XT 16GB in raster with a significantly lower BOM. $249-279 seems doable. With a 2X RT performance bump (@Kepler_L2's guesstimate) an AT4 dGPU would annihilate the 9060 XT 16GB in path-traced games. A 2X increase in raw ray traversal and intersection throughput should land it around an RX 9070. However, based on MLID's claims, the massive PT perf gap between Blackwell and RDNA 4, and AMD's patent filings, I don't think that 2X figure vs RDNA 4 is high enough.

For the entry market, AMD could use slower LPDDR5X + a cut-down config or clocks for a PCIe-power-only card with 12GB VRAM, around the 4060-level perf in the leak. But it'll probably be 4060-level or higher, not 3060-level.
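For the bandwidth claim in the edit above, the quick math (the 9060 XT's 128bit 20 Gbps GDDR6 figure is its public spec; the AT4 config is speculative):

```python
# GB/s = bus bits * Mbps / 8 / 1000
def bw_gbs(bus_bits: int, mbps: int) -> float:
    return bus_bits * mbps / 8 / 1000

at4  = bw_gbs(192, 12000)   # speculative 192bit LPDDR6-12000 -> 288 GB/s
n44  = bw_gbs(128, 20000)   # 9060 XT: 128bit 20 Gbps GDDR6   -> 320 GB/s
print(at4, n44, f"{1 - at4 / n44:.0%}")   # 288.0 320.0 10% regression
```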
 
Last edited:
  • Like
Reactions: basix and marees

basix

Member
Oct 4, 2024
And another consideration could be Apple. The two Medusa Halo versions (12C / 24C), together with AT3 and AT4, could challenge the M5/M6 Pro & Max.

Medusa Point competes with the base M5/M6 SoC.

And I would like to get a workstation card with AT3 and 256 or even 512 GByte VRAM :)

Regarding ML/AI, AT3 and AT4 with LPDDR6 would operate in a bandwidth range per tensor/matrix TFLOPS similar to an RTX 5090. Crazy, if you think about it. So AT3 and AT4 could be quite decent "accelerator cards" for ML/AI. Together with the huge VRAM capacity, you could even outmatch the RTX Pro 6000 with its 96 GByte, or a GB202 successor with max. 128 GByte (if 32 Gbit GDDR7 modules are available), for some tasks, despite featuring much lower raw TFLOPS numbers. And because the chips are small and use cheap memory: much better bang for the buck. And having the possibility to pair that in APU style with a CPU is just the cherry on top.

I see a strategy there... ;)

Edit:
One funny thought I just had concerns US export restrictions to China. Because AT3 and AT4 are well below the bandwidth and TFLOPS limits of those restrictions, they could potentially sell well in China.
 
Last edited: