
Zen 6 Speculation Thread

Page 361
If they go wider on CCD count, the real question is scheduling and inter-chip communication. You can throw more chiplets at it, but gaming and mixed loads get weird if the OS is bouncing threads across die boundaries. A better play might be fewer CCDs, but bigger per CCD: more cache, a better memory controller, and tighter fabric latency.
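The scheduling concern can be worked around today by pinning. A minimal Linux sketch (the helper name is mine) that restricts a process to its first few allowed CPUs, keeping its threads on one CCD:

```python
import os

def pin_to_first_n_cpus(n):
    """Restrict the calling process to its first n allowed CPUs (Linux only).
    On a part with 8 cores / 16 threads per CCD, n=16 keeps every thread of
    the process on CCD0, avoiding cross-die cache and fabric hops."""
    cpus = sorted(os.sched_getaffinity(0))[:n]
    os.sched_setaffinity(0, cpus)
    return os.sched_getaffinity(0)
```

Tools like Process Lasso or `numactl` do the equivalent automatically; the open question is whether the OS scheduler can be trusted to get it right on its own.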

That's what they appear to be doing. Also, lifting the cap on the bandwidth available to a single CCD, which was a limiting factor in single-CCD desktop parts, and lifting the overall bandwidth limit with a new memory controller supporting faster DDR5 chips.

A bigger L3 pool will automatically lower latency for low-thread-count workloads such as games, and an even bigger V-Cache lowers it further.

Also, L3 helps bandwidth. The more bandwidth that stays internal to the CCD, the more external bandwidth it frees up.

DDR6 and a new socket would make sense as a platform refresh, but I wouldn’t expect it unless the memory controller and IO side are actually the limiting factor they want to fix.

I think AMD is just going to wait for Intel to get burned on DDR6
 
oh look. fun product rumour.

This is going to be a really fast SoC huh
 
oh look. fun product rumour.

This is going to be a really fast SoC huh
I'm curious to see how it will perform. The CU count isn't much different from an RX 9070, with the benefit of being a newer gen, but power-wise there will probably be a huge difference.
Also hoping the brands adopt it more than they did with Strix Halo
 
Also hoping the brands adopt it more than they did with Strix Halo
For starters, it will be costlier than Strix Halo:
  1. 20% more GPU CUs
  2. 3nm rather than 4nm
The other option is Medusa Premium (half the GPU size of Halo). I am hoping sheer scale brings the price down. It might take until 2030 for OpenAI to go bankrupt; until then, both Halo and mini-Halo will be used for edge AI use cases (such as DGX Spark), so the price of these systems will stay at a premium level.
 
A bigger L3 pool will automatically lower latency for low-thread-count workloads such as games, and an even bigger V-Cache lowers it further.
It's a victim cache. So it would be more correct to say low thread count non-streaming workloads such as games.
Also, L3 helps bandwidth. The more bandwidth that stays internal to the CCD, the more external bandwidth it frees up.

The reason being, for a streaming workload you are unlikely to visit the same data cache line twice, so once stuff gets evicted from L2 you are unlikely to see it accessed from L3 before it gets evicted from there too.

Of course this behavior varies from program to program, but people generally seem to assume L3 and L2 are the same in all respects, L3 just being a bit slower and bigger. That's not entirely the case. Just something to keep in mind.
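A toy model of the distinction, assuming simple LRU at both levels (the class name and sizes are made up for illustration):

```python
from collections import OrderedDict

class VictimCacheSim:
    """Toy model: LRU L2; L3 holds only lines evicted from L2 (victim cache)."""
    def __init__(self, l2_lines, l3_lines):
        self.l2 = OrderedDict()
        self.l3 = OrderedDict()
        self.l2_lines, self.l3_lines = l2_lines, l3_lines
        self.l3_hits = self.misses = 0

    def access(self, line):
        if line in self.l2:
            self.l2.move_to_end(line)      # L2 hit, refresh LRU position
            return
        if line in self.l3:                # L3 hit: promote back into L2
            self.l3_hits += 1
            del self.l3[line]
        else:
            self.misses += 1               # fetched from memory straight into L2
        self.l2[line] = None
        if len(self.l2) > self.l2_lines:
            victim, _ = self.l2.popitem(last=False)
            self.l3[victim] = None         # evicted line fills the victim L3
            if len(self.l3) > self.l3_lines:
                self.l3.popitem(last=False)

# Streaming: every line touched exactly once -> the victim L3 never hits.
s = VictimCacheSim(l2_lines=512, l3_lines=4096)
for i in range(100_000):
    s.access(i)
# s.l3_hits == 0

# Reuse of a working set bigger than L2 but smaller than L3 -> real L3 hits.
r = VictimCacheSim(l2_lines=512, l3_lines=4096)
for _ in range(10):
    for i in range(2000):
        r.access(i)
# r.l3_hits > 0
```

The streaming loop never revisits a line after it has been evicted, so the extra L3 capacity buys nothing; the looping workload, like game data that overflows L2 but fits in L3, is where the victim cache pays off.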
 
oh look. fun product rumour.

This is going to be a really fast SoC huh

Being able to use LPDDR6 could be transformative to both Medusa Premium and Halo. It would also grant AMD the ability to sell higher-end versions.
That and giving Vcache to the CCDs in those Medusa options would give a substantial boost to gaming performance.
 
For starters, it will be costlier than Strix Halo:
  1. 20% more GPU CUs
  2. 3nm rather than 4nm
The other option is Medusa Premium (half the GPU size of Halo). I am hoping sheer scale brings the price down. It might take until 2030 for OpenAI to go bankrupt; until then, both Halo and mini-Halo will be used for edge AI use cases (such as DGX Spark), so the price of these systems will stay at a premium level.
Medusa Premium could end up being a killer budget gaming chip. If it uses LPDDR6, the memory bandwidth will be close to Strix Halo. 24 RDNA5 CUs could match 40 RDNA 3.5 CUs too.

The big unknown is cost, but there's a lot of potential to keep costs down by reducing redundancy.
  • Most gaming laptops have a DDR5 memory bus for the CPU and a GDDR7 memory bus for the dGPU. Medusa Premium only has one memory bus and it's a standard size. That keeps motherboard BOM lower and uses less die area.
  • Most dGPU gaming laptops have a mostly-unused iGPU, along with two media engines, two display engines, and two sets of display PHYs. These redundant components take up a lot of die area.
  • I think it's unlikely but there's been speculation that RDNA5 could fulfill the Copilot+ TOPS requirement. If so, it wouldn't need an NPU.
  • The 8 PCIe 4.0 lanes could be cut, since this would never be paired with a dGPU, though the die-to-die link would offset this.
[Strix Point die shot with the area of some blocks annotated.]

I don't know how much stuff they'll actually cut, but Medusa Premium could perform like Strix Halo and be priced closer to Strix Point.
 
Medusa Premium could end up being a killer budget gaming chip.
/0
If it uses LPDDR6, the memory bandwidth will be close to Strix Halo. 24 RDNA5 CUs could match 40 RDNA 3.5 CUs too.
The PHY's a combo, but the FP10 pinout means LP5X in mobile.
I don't know how much stuff they'll actually cut, but Medusa Premium could perform like Strix Halo and be priced closer to Strix Point.
Yeah it'll get close to the current 106 tier.
Problem is, it's a mix of N3P and N2P and that's all $$$.
 
New version of RAM is always expensive when it just comes out. That plus the fact that we are in a RAM shortage means LPDDR6 might be priced like gold.

Historically, high-end Android flagships have been the first to adopt a new LPDDR version, and laptops have followed 2-3 years later.
 
Budget in the context of gaming laptops, competing with RTX 5060 and RTX 6050 laptops in price and performance.
Oh yeah that's true, they're designated discrete compete parts.
They could do on-package LPDDR6 for a few laptops, but that increases the price.
No way man, that requires a separate platform basically.
QCOM X2ee is a meme part for a reason.
No way it's cheaper than a 6050 laptop, given it's the same build quality and such.
Yeah it will be, AMD will gladly swallow lower margin for design volume increment.
Plus AT3/4 APUs are built off commodity parts; nothing bespoke like stxH.
 
Historically, high-end Android flagships have been the first to adopt a new LPDDR version, and laptops have followed 2-3 years later.
History went outta the window when fat GPGPU solutions (VR200/Helios) started loading on gobs and gobs of LPDDR.
GPGPU stuff will probably get to LP6 first too.
 
No way it's cheaper than a 6050 laptop, given it's the same build quality and such.
My point has already been rendered moot since it sounds like LPDDR6 is off the table for Medusa Premium laptops. I'm curious why you think that though. Medusa Premium could use 60mm^2 less die area from removing the iGPU, and have simpler and cheaper motherboards and heatsinks.
 
oh look. fun product rumour.

This is going to be a really fast SoC huh
Interestingly, memory manufacturers like Samsung and Innosilicon are already supplying LPDDR6 modules to customers for validation. Innosilicon's LPDDR6 modules boast an impressive speed of 14.4 Gbps, significantly faster than Samsung's initial modules, which achieve 10.7 Gbps.
Uh, looks like TechPowerUp made an error. Innosilicon is a memory controller provider, whereas Samsung is a memory module maker. They are comparing apples and oranges.

14.4 Gbps is the max JEDEC speed for LPDDR6, whereas modules are expected to start at 10.7 Gbps.
 
Uh, looks like TechPowerUp made an error. Innosilicon is a memory controller provider, whereas Samsung is a memory module maker. They are comparing apples and oranges.

14.4 Gbps is the max JEDEC speed for LPDDR6, whereas modules are expected to start at 10.7 Gbps.

Due to the bit overhead, LPDDR6-10667 (the initial speed) offers about the same bandwidth as LPDDR5X-9600. LPDDR6-14400, when it arrives, would be equivalent to LPDDR5X-12800 (which AFAIK does not exist). This assumes you use the same overall bus width - which is a given in any design that uses dual LPDDR6/LPDDR5X controllers.

-14400 may be the max JEDEC speed "for now", but -6400 was the max for LPDDR5, so it's safe to say there will be a future 'X' extension to goose that further. I just wouldn't expect a revolution because of the use of LPDDR6. It is no different than going to LPDDR5X-9600, which I doubt people would be as excited about. It's a decent boost, but unlikely to be noticeable unless you're running benchmarks or running the iGPU at the ragged edge of its frame rate capability.
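Back-of-envelope for those equivalences. The 256/288 payload factor below is my inference from the 14400 → 12800 pairing (256 data bits carried per 288-bit access), not an official figure:

```python
# LPDDR6 carries 256 data bits per 288-bit access (metadata/ECC overhead),
# i.e. an 8/9 payload factor -- inferred, not an official spec number.
LP6_PAYLOAD = 256 / 288

def lp5x_equivalent(lp6_rate_mtps):
    """LPDDR5X per-pin rate delivering the same data bandwidth as a given
    LPDDR6 per-pin rate, on the same overall bus width."""
    return lp6_rate_mtps * LP6_PAYLOAD

print(lp5x_equivalent(14400))         # 12800.0
print(round(lp5x_equivalent(10667)))  # 9482, i.e. roughly LPDDR5X-9600
```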
 
LPDDR6-14400, when it arrives, would be equivalent to LPDDR5X-12800 (which AFAIK does not exist)
Samsung: Who's talking?
It is no different than going to LPDDR5X-9600, which I doubt people would be as excited about. It's a decent boost, but unlikely to be noticeable unless you're running benchmarks or running the iGPU at the ragged edge of its frame rate capability.
The big deal is the 50% increased channel width. So going from 128b LPDDR5X-9600 to 192b LPDDR6-10700 is a 50% bandwidth improvement.
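Running those numbers, with the LPDDR6 payload overhead assumed at 256/288 (256 data bits per 288-bit access, an inference from the equivalences discussed above):

```python
def bandwidth_gbs(bus_bits, rate_mtps, payload=1.0):
    """Peak bandwidth in GB/s for a given bus width and per-pin data rate."""
    return bus_bits / 8 * rate_mtps / 1000 * payload

LP6_PAYLOAD = 256 / 288  # assumed LPDDR6 data payload per access

lp5x_128 = bandwidth_gbs(128, 9600)               # 153.6 GB/s
lp6_192 = bandwidth_gbs(192, 10700, LP6_PAYLOAD)  # ~228.3 GB/s effective
print(round(lp6_192 / lp5x_128, 2))               # 1.49, the ~50% jump
```

The raw ratio would be higher, but after the payload overhead it lands almost exactly on the 50% figure.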
 
This assumes you use the same overall bus width - which is a given in any design that uses dual LPDDR6/LPDDR5X controllers.
That's not how it works. Both of the combo PHYs announced so far keep the number of channels constant between the memory types, not the width of the interface. Because all the lines other than data lines would need to be increased for more channels, and you'd have to do weird remapping between data lines of adjacent channels for different widths. (Also, because of changes in how LPDDR6 works, a wider bus is cheaper with LPDDR6 than with LPDDR5X.) A device using such a combo PHY that supports 192-bit LPDDR6 can only support 128-bit LPDDR5X.
 
The big deal is the 50% increased channel width. So going from 128b LPDDR5X-9600 to 192b LPDDR6-10700 is a 50% bandwidth improvement.
Still only a toy when it comes to the AI use case. Apple shows the way on memory bandwidth, but at least with Strix Halo you can attach eGPUs to give prompt processing and token generation speed a kick in the pants using sparse models.
 
This assumes you use the same overall bus width - which is a given in any design that uses dual LPDDR6/LPDDR5X controllers.

That's not how it works. Both of the combo PHYs announced so far keep the number of channels constant between the memory types, not the width of the interface. Because all the lines other than data lines would need to be increased for more channels, and you'd have to do weird remapping between data lines of adjacent channels for different widths. (Also, because of changes in how LPDDR6 works, a wider bus is cheaper with LPDDR6 than with LPDDR5X.) A device using such a combo PHY that supports 192-bit LPDDR6 can only support 128-bit LPDDR5X.

Forgot to source it. You can find it clearly from Synopsys data sheet for the combo PHY here: https://www.synopsys.com/dw/doc.php/ds/c/dwc_lpddr6_5_5x_5_phy_ds.pdf Bottom of page 2.

You have to give them your phone number and email for the download, but they don't actually check anything except that they are potentially valid, you can just use throwaways if you want to.
 
According to JEDEC, LPDDR6-14400 has 2x the bandwidth of LPDDR5X-9600. https://www.jedec.org/sites/default/files/Brett Murdock_FINAL_Mobile_2024.pdf
Extrapolating from that, a 192-bit LPDDR6 memory bus has 1.33x the bandwidth of a 128-bit LPDDR5X memory bus at the same per-pin speed.

An LPDDR6-10667 192-bit config (which should be the base LPDDR6 config for laptops) has 89% as much bandwidth as Strix Halo.
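Checking both figures, taking Strix Halo as a 256-bit LPDDR5X-8000 bus and again assuming the 256/288 LPDDR6 payload factor:

```python
def bandwidth_gbs(bus_bits, rate_mtps, payload=1.0):
    """Peak bandwidth in GB/s for a given bus width and per-pin data rate."""
    return bus_bits / 8 * rate_mtps / 1000 * payload

LP6_PAYLOAD = 256 / 288  # assumed LPDDR6 data payload per access

strix_halo = bandwidth_gbs(256, 8000)             # 256.0 GB/s
medusa = bandwidth_gbs(192, 10667, LP6_PAYLOAD)   # ~227.6 GB/s
print(round(medusa / strix_halo, 2))              # 0.89

# The 1.33x figure: 192-bit LPDDR6 vs 128-bit LPDDR5X at equal per-pin speed.
print(round((192 * LP6_PAYLOAD) / 128, 2))        # 1.33
```

Both the 89% and the 1.33x numbers fall straight out of the width ratio times the payload factor.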
 