Info LPDDR6 @ Q3-2025: Mother of All CPU Upgrades


Tuna-Fish

Golden Member
Mar 4, 2011
1,659
2,513
136
Why is it only 114.1GB/s for LPDDR6 10.7GT/s 96-bits? It should be 128.4GB/s, which is 50% higher than LPDDR5 10.7GT/s because of the 50% higher width. Is the new algorithm for LPDDR6 resulting in a loss?
LPDDR6 mixes the ECC/DBI bits into the data that's being moved, instead of having separate pins for them. Because of this, it moves 288 bits for each 256 bits of actual data. You have to multiply the width of the interface by 8/9 to get the actual throughput.

This is part of what made a non-power-of-2 interface width work. A subchannel moves 12 bits at a time over 24 cycles, for 288 bits total, of which 256 are data. The unit CPUs actually fetch from memory is one cache line, which on x86 is 64B = 512b, so exactly two such bursts.

Of that 288, 16 are metadata actually stored on the memory, accessible to the system, and will likely be used for ECC. The other 16 are bits that are generated on transfer by the memory chips, and either used for DBI (reducing power use) or for link protection ECC.
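The accounting above can be checked with a few lines of Python. This is only a sketch of the arithmetic in this post; the constant names are made up for illustration and are not JEDEC terminology.

```python
# Packet accounting for one LPDDR6 subchannel burst, using the numbers above.
# All names are illustrative, not taken from the JEDEC spec.
SUBCHANNEL_WIDTH_BITS = 12   # bits moved per beat
BEATS_PER_BURST = 24         # the "cycles" in the post
DATA_BITS = 256              # actual payload
METADATA_BITS = 16           # stored in memory, likely used for ECC
LINK_BITS = 16               # generated on transfer: DBI or link ECC

burst_bits = SUBCHANNEL_WIDTH_BITS * BEATS_PER_BURST
# The three fields should add up to the full burst.
assert burst_bits == DATA_BITS + METADATA_BITS + LINK_BITS

efficiency = DATA_BITS / burst_bits               # the 8/9 factor
CACHE_LINE_BITS = 64 * 8                          # x86 cache line, 512 bits
bursts_per_cache_line = CACHE_LINE_BITS // DATA_BITS

print(burst_bits, round(efficiency, 4), bursts_per_cache_line)
```

The 8/9 factor is just 256/288, and a 64B cache line maps onto two 256-bit payloads.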
 

Tigerick

Senior member
Apr 1, 2022
803
765
106
Why is it only 114.1GB/s for LPDDR6 10.7GT/s 96-bits? It should be 128.4GB/s, which is 50% higher than LPDDR5 10.7GT/s because of the 50% higher width. Is the new algorithm for LPDDR6 resulting in a loss?
Here is the official calculation from the JEDEC PDF:

10.667 Gbps x (256 / 288) x (24 / 8) = 28.5 GB/s per 24-bit memory bus

Thus 96-bit memory bus with 10.667 Gbps LPDDR6 = 28.5 x 4 = 114 GB/s

I have listed all the figures with 96-bit, 192-bit and 384-bit in the table on main page. Have a look!
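For anyone who wants to reproduce the figures for other bus widths, here is a minimal sketch of the same formula. The function name is hypothetical, and it assumes the 256/288 payload fraction applies uniformly regardless of bus width.

```python
def lpddr6_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Effective bandwidth in GB/s: pin rate x payload fraction x bus width in bytes."""
    payload_fraction = 256 / 288  # 256 data bits out of every 288 transferred
    return data_rate_gbps * payload_fraction * (bus_width_bits / 8)


# Widths mentioned in the thread: one 24-bit channel, then 96/192/384-bit buses.
for width in (24, 96, 192, 384):
    print(f"{width:>3}-bit: {lpddr6_bandwidth_gbs(10.667, width):.1f} GB/s")
```

Without intermediate rounding the per-channel figure comes out a touch under 28.5 GB/s, and the 96-bit result is about 113.8 GB/s, which still rounds to 114 GB/s.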
 

511

Diamond Member
Jul 12, 2024
3,724
3,504
106
10.667 Gbps x (256 / 288) x (24 / 8) = 28.5 GB/s per 24-bit memory bus

Thus 96-bit memory bus with 10.667 Gbps LPDDR6 = 28.5 x 4 = 114 GB/s
This would mean a 384-bit bus would be only roughly 455GB/s. RIP.
A 256-bit LPDDR5X connection yields: 10.667 Gbps x (256 / 8) = 341GB/s
A 384-bit LPDDR6 connection yields: 10.667 Gbps x (256 / 288) x (384 / 8) = 455GB/s
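The gap between the two figures above can be expressed as an exact ratio. A small sketch, assuming the same pin speed on both and no in-band overhead on LPDDR5X (as in the calculation above); the variable names are only illustrative.

```python
from fractions import Fraction

payload_fraction = Fraction(256, 288)      # LPDDR6 in-band overhead, reduces to 8/9
width_ratio = Fraction(384, 256)           # 384-bit LPDDR6 vs 256-bit LPDDR5X
net_gain = width_ratio * payload_fraction  # same pin speed assumed on both sides

print(net_gain, float(net_gain))
```

So the 50% wider bus nets out to 4/3, i.e. about 33% more usable bandwidth at the same data rate.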
 

Tigerick

Senior member
Apr 1, 2022
803
765
106
I didn't follow this. Why can't AT3 & AT4 be stand alone GPU dies ?
I think I have an answer to your question. First, table time, with the RDNA5 lineup:

Model | CU | Memory | Memory BW | IC | RDNA5 | CU | Memory | Memory BW | IC
Radeon 9070 XT | 64 | 16 GB 256-bit GDDR6 | 644.6 GB/s | 64 MB | AT2 - 70XT | 64 - 70 | 18 GB 192-bit GDDR7 | 864 GB/s | 24 MB
Radeon 9070 | 56 | 16 GB 256-bit GDDR6 | 644.6 GB/s | 64 MB | | | | |
Radeon 9070 GRE | 48 | 12 GB 192-bit GDDR6 | 432 GB/s | 48 MB | AT2 - 70GRE | 48 | 15 GB 160-bit GDDR7 | 720 GB/s | 20 MB
Radeon 9060 XT | 32 | 8/16 GB 128-bit GDDR6 | 322 GB/s | 32 MB | AT2 - 60XT | 44 | 12 GB 128-bit GDDR7 | 576 GB/s | 16 MB
Radeon 9060 | 28 | 8 GB 128-bit GDDR6 | 322 GB/s | 32 MB | Radeon 9050 ? | 32 | 8/16 GB 128-bit GDDR6 | 322 GB/s | 32 MB
Radeon 7400 | 28 | 8 GB 128-bit GDDR6 | 288 GB/s | 32 MB | | | | |

  • See, it's simple with the table provided by MLID. The AT2 GPU die will be used by Xbox Next/PS6(?) and RDNA5 discrete GPUs. It helps AMD spread the high cost of the N3 process.
  • Specs-wise, 70 CU seems to be the max die. AMD will most likely use that one as the 70XT, so MLID's claim of 20% faster than the RTX 4080 seems valid. IC might be cut by half; I'm not sure whether the higher speed of GDDR7 will remedy the loss.
  • The rumored die size of 264 mm2 might not be 100% correct, but due to the reduction of IC and memory bus, die area should be much smaller than N48's 357 mm2.
  • Above is the AT2 die; how about AT3 and AT4? Unlike AT2, AT3 and AT4 should integrate the IOD and NPU. If AMD has no other uses for them, why not just name the die Medusa Halo? There must be some use other than an APU for Zen 6. Well, I think I am the only one who could answer that... They are going to be used in APUs for the ARM platform. NOT Soundwave, but the successor of Soundwave. Yep, AMD won't release just one ARM APU. It is going to be a full transition. And that's why I named the AMD ARM thread The Beginning of Transformation!!! If you don't believe it, fine by me. Time will prove my points. And please check out N1x before commenting.
  • adroc, as usual, blindly believes the roadmap without applying his own judgment. Now that I have laid out my points, adroc, what do you think?
 

branch_suggestion

Senior member
Aug 4, 2023
745
1,599
106
  • See, it's simple with the table provided by MLID. The AT2 GPU die will be used by Xbox Next/PS6(?) and RDNA5 discrete GPUs. It helps AMD spread the high cost of the N3 process.
Yes, now why wouldn't AMD also try to do the same for AT3/4?
  • Specs-wise, 70 CU seems to be the max die. AMD will most likely use that one as the 70XT, so MLID's claim of 20% faster than the RTX 4080 seems valid. IC might be cut by half; I'm not sure whether the higher speed of GDDR7 will remedy the loss.
70 is not divisible by 4, it has to be 72. Also MALL is kill outside of CDNA for now, just a fat L2, my guess is 8MB per 64b of G7 and 4MB per 64b of L5X, assuming they use the combo PHY which is 256b L5X/384b L6
  • The rumored die size of 264 mm2 might not be 100% correct, but due to the reduction of IC and memory bus, die area should be much smaller than N48's 357 mm2.
It is also N3P and the uncore is on a separate die. Seems doable.
  • Above is the AT2 die; how about AT3 and AT4? Unlike AT2, AT3 and AT4 should integrate the IOD and NPU.
NPU is dead, and yes it will have the necessary SoC stuff on die like the Z6LP complex and all of the usual GPU uncore.
  • If AMD has no other uses for them, why not just name the die Medusa Halo?
Because the other use is being sold as a standalone dGPU?
Not exactly hard to figure out. Yes they lose a bit of margin compared to a pure dGPU part but the reuse on mobile SoCs is well worth it.
  • There must be some use other than an APU for Zen 6. Well, I think I am the only one who could answer that... They are going to be used in APUs for the ARM platform. NOT Soundwave, but the successor of Soundwave. Yep, AMD won't release just one ARM APU. It is going to be a full transition. And that's why I named the AMD ARM thread The Beginning of Transformation!!! If you don't believe it, fine by me. Time will prove my points. And please check out N1x before commenting.
I checked N1X, it is a cost optimised GB10, good GPU, meh CPU for a 2025 part.
Oh wait sorry it got delayed to 2026, not even Mediatek can overcome the Tegra curse.
Give a genuine argument why AMD should go all in on WoA and abandon decades of x86 entrenchment?
November 11 will likely confirm that Zen 7 exists and it is x86, and I get to say I told you so.
Soundwave is a semicustom APU for Microsoft Surface WoA products, that is quite literally it. Van Gogh was Surface semicustom too before presumably Intel did Intel things again and Valve bailed AMD out.
Oh and the last thing, ARM are about to go on the warpath with a full spectrum of first party merchant Si, competing against their own customers since they want to make real money for the first time.
Soon it will not be cheap to use ARM at all, at that point they are little different to the business model of x86.

I thought the $INTC diehards on twitter were unhinged, but they always have good competition.
 

Kepler_L2

Senior member
Sep 6, 2020
958
3,947
136
70 is not divisible by 4, it has to be 72. Also MALL is kill outside of CDNA for now, just a fat L2, my guess is 8MB per 64b of G7 and 4MB per 64b of L5X, assuming they use the combo PHY which is 256b L5X/384b L6
I now believe it's actually

8 SE * 2 SA * 6 CU for AT0
4 SE * 2 SA * 5 CU for AT2
2 SE * 2 SA * 6 CU for AT3
1 SE * 2 SA * 6 CU for AT4

With the new gfx13 CU being equivalent to old WGP

L2 should be 4MB per UMC, so 64MB/24MB/32MB/16MB for ATx lineup.
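A quick sketch to tally what that layout implies. The per-die tuples are copied from the list above; the UMC counts are simply back-derived from the 4MB-per-UMC guess, and the "old CU equivalent" column assumes one new CU equals one old WGP, i.e. two old CUs. The helper name is made up.

```python
# (shader engines, shader arrays per SE, CUs per SA, L2 in MB) as listed above
lineup = {
    "AT0": (8, 2, 6, 64),
    "AT2": (4, 2, 5, 24),
    "AT3": (2, 2, 6, 32),
    "AT4": (1, 2, 6, 16),
}
L2_MB_PER_UMC = 4  # speculative figure from the post


def summarize(se: int, sa: int, cu: int, l2_mb: int) -> dict:
    total_cu = se * sa * cu
    return {
        "cu": total_cu,
        "old_cu_equiv": total_cu * 2,        # assumes new CU == old WGP == 2 old CUs
        "umc": l2_mb // L2_MB_PER_UMC,       # implied memory controller count
    }


summary = {name: summarize(*cfg) for name, cfg in lineup.items()}
for name, row in summary.items():
    print(name, row)
```

That works out to 96/40/24/12 CUs for AT0/AT2/AT3/AT4.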
 

poke01

Diamond Member
Mar 8, 2022
4,007
5,335
106
Oh wait sorry it got delayed to 2026, not even Mediatek can overcome the Tegra curse.
Mediatek isn’t a good CPU designer, combine them with Tegra folks and you get delays. Xiaomi made a better X925 CPU on the same process.

Give a genuine argument why AMD should go all in on WoA and abandon decades of x86 entrenchment?
November 11 will likely confirm that Zen 7 exists and it is x86, and I get to say I told you so
Zero reason for AMD to ditch x86 as they control it.

People need to understand why Apple moved. Not because ARM is the better ISA; Apple wanted greater control than what AMD and Intel could offer them, and moving to in-house ARM designs provided that.
NOT Soundwave, but the successor of Soundwave. Yep, AMD won't release just one ARM APU. It is going to be a full transition. And that's why I named the AMD ARM thread The Beginning of Transformation!!! If you don't believe it, fine by me. Time will prove my points. And please check out N1x before commenting.
 

eek2121

Diamond Member
Aug 2, 2005
3,400
5,031
136
margins.
It's for margins.

This, but yes, also control. Also, the whole x86 vs ARM argument is literally decades old. Give it a rest already. x86 has been dead since the 80s! Long live x86! Meanwhile, AMD just keeps moving forward, taking all the marketshare that everyone claims Qualcomm should be taking right about now.

Maybe just wait for it to finally happen and gloat after? 😉

With specific regards to the topic: We are likely 2-3 years away from widespread DDR6 adoption, and more than a year away from widespread LPDDR6. My own speculation, of course. 😉
 

johnsonwax

Senior member
Jun 27, 2024
325
490
96
margins.
It's for margins.
Not here. Cook Doctrine:
We believe that we need to own and control the primary technologies behind the products we make, and participate only in markets where we can make a significant contribution.
Every time Apple has an opportunity to gain an IP advantage, which their ARM architecture license allowed them to hold, they will take it. There aren't meaningful margin opportunities on the silicon, but there are meaningful control opportunities. They build margins off of the control. x86 doesn't allow for this.

It's hard to say what the margin opportunities are in a neutral way. Sure, they don't have to pay AMD's margins, but there are substantial economies of scale in silicon that going your own way makes difficult to match. A series has huge volume, but M series not nearly as much.
 

branch_suggestion

Senior member
Aug 4, 2023
745
1,599
106
I now believe it's actually

8 SE * 2 SA * 6 CU for AT0
4 SE * 2 SA * 5 CU for AT2
2 SE * 2 SA * 6 CU for AT3
1 SE * 2 SA * 6 CU for AT4

With the new gfx13 CU being equivalent to old WGP

L2 should be 4MB per UMC, so 64MB/24MB/32MB/16MB for ATx lineup.
Yay a not completely unhinged lineup. AT3/4 having 1SE would've been weird.
So in effect the CU is now WGP thicc and now CU mode is good enough that WGP mode has no reason to exist?
So AT2 would be 80 old CU equivalent.
 

Kepler_L2

Senior member
Sep 6, 2020
958
3,947
136
Yay a not completely unhinged lineup. AT3/4 having 1SE would've been weird.
So in effect the CU is now WGP thicc and now CU mode is good enough that WGP mode has no reason to exist?
So AT2 would be 80 old CU equivalent.
Yeah, MI400 has the WGP-sized CU with WGP mode and also wave64 support deprecated, and I think this carries over to gfx13.
 

johnsonwax

Senior member
Jun 27, 2024
325
490
96
With specific regards to the topic: We are likely 2-3 years away from widespread DDR6 adoption, and more than a year away from widespread LPDDR6. My own speculation, of course. 😉
I think longer than that, depending on your definition of 'widespread'. So much of the high-volume, low-end PC space really hangs on to outdated tech to keep costs down. On Amazon there are still 85 pages of products using LPDDR4, and 22 pages using LPDDR3, compared to 225 pages of LPDDR5 - which is now 5+ years old. iPhone picked up LPDDR5 3 years ago. All rumors are that M5 will pick it up so Apple will have it across their line, and we will see the performant x86 silicon pick it up, but they're going to have a lot of new silicon that won't use it for quite a while because that's how the SKU spam works - you need to throw off a lot of LPDDR5 SKUs to justify upselling to the LPDDR6 ones.