Question Zen 6 Speculation Thread

Page 258 - AnandTech Forums

OneEng2

Senior member
Sep 19, 2022
834
1,104
106
I am personally in the market for a laptop for my wife. After plenty of review reading, despite the great battery life the ARM-based PCs offer, there are simply WAY too many downsides, and performance is consistently criticized by the people who buy them.

Even compatibility (which today we take for granted) is in question, with many applications not working correctly on Windows-on-ARM machines.

Perhaps a day will come when this isn't true; however, the reality TODAY is that the vast majority of laptops will continue to be x86. I think ARM on Windows is more a competitor to Chromebooks.

Perhaps I am different from the average buying consumer, but while I am very interested in the long battery life, the moment I see ARM when laptop shopping, I run the other way.
 

Antey

Member
Jul 4, 2019
125
180
116
I think ARM on Windows is more a competitor to Chromebooks.
WoA laptops are expensive, premium devices. Chromebooks, even Chromebook "Plus" models, are very cheap laptops for students. They don't compete in the same bracket.

And Google announced (days ago) that ChromeOS will be merged with Android. So in 2026 ChromeOS will be something like "Android for Desktop".
 
  • Like
Reactions: Tlh97 and marees

OneEng2

Senior member
Sep 19, 2022
834
1,104
106
Why would you buy WoA if LNL exists and is widely available? Either buy a MacBook or LNL.
I would buy LNL in a PC. I would never buy anything running an Apple OS (other than a phone), as I am totally tied into the MS infrastructure.
WoA laptops are expensive, premium devices.
That perform more like Chromebooks when saddled with legacy x86 applications. I just can't see the value proposition. Perhaps someone could explain it to me.
 
  • Like
Reactions: marees

basix

Senior member
Oct 4, 2024
241
493
96
Most of this is pure speculation:
  • No NPU, given that RDNA 5-based 2 WGPs (4 CUs) @ 3 GHz = ~50 dense INT8 TOPS.
The problem with Copilot+:
It specifically asks for an NPU. Look at Lunar Lake: 48 TOPS NPU and 67 TOPS GPU. Why bother adding an NPU if the iGPU already exceeds the 40 TOPS requirement? The same applies to Strix Halo.
 

MrMPFR

Member
Aug 9, 2025
103
194
71
The problem with Copilot+:
It specifically asks for an NPU. Look at Lunar Lake: 48 TOPS NPU and 67 TOPS GPU. Why bother adding an NPU if the iGPU already exceeds the 40 TOPS requirement?
That requirement could change as more companies add systolic arrays to their GPU IP, but AMD can't plan based on that. Seems like an NPU is mandatory unless something major is going on behind the scenes that remains to be disclosed.

Power efficiency. And since desktop already pretty much disregards perf/watt, why bother including an NPU if the GPU is already plenty powerful, unless it's still mandatory of course.
 
  • Like
Reactions: Tlh97 and marees

basix

Senior member
Oct 4, 2024
241
493
96
Microsoft could change the Copilot+ requirements, for sure.

Going forward, you either have an iGPU as NPU replacement (including API compatibility), or you keep your 40 TOPS NPU requirement, but the relative silicon real estate for those 40 TOPS will get smaller and smaller over time. I would not increase the NPU requirement to 100 TOPS; it would make more sense to shift that to the GPU. With that you could theoretically also add dGPUs to the framework, but that somewhat makes a plain Copilot+ sticker useless.

Not sure what Microsoft will do.

I would suggest keeping the 40 TOPS NPU requirement (maybe refine it to 50 TOPS INT8/FP8 and 100 TOPS FP4 to have some shiny new number -> INT8 TOPS are becoming meaningless because many DNNs are shifting towards FP8 and FP4).
Strix NPUs already support 50 TOPS INT8/FP8, so the only addition would be FP4, and with N3P etc. the silicon area should get much smaller.
Then add a 2nd Copilot+ criterion (yay, new sticker) which scales with GPU TFLOPS (iGPU or dGPU). For example "Level 1...10" for e.g. 100...1000 TFLOPS FP8, and double that with FP4. Merely 4 RDNA 5 CUs would be required for Level 1.
That should be doable with any APU and does not add extra or unused function blocks to the chip. You could extend the levels to 100 if Microsoft wants to (10 PFLOPS FP8 -> Rubin CPX). The levels are just a relative indication of how fast the system can process DNN matrices. Software could also read out the "Copilot Level" and do something with it.
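The level scheme proposed here boils down to a one-line mapping; a minimal sketch in Python, assuming the level is simply floor(TFLOPS / 100) clamped to the sticker's range (the function name and the scheme itself are hypothetical, taken from this post's proposal, not anything Microsoft has announced):

```python
def copilot_level(fp8_tflops: float, max_level: int = 10) -> int:
    """Map sustained FP8 TFLOPS to the hypothetical 'Copilot Level' sticker.

    Level N requires N * 100 TFLOPS FP8 (Level 1 = 100, Level 10 = 1000).
    Returns 0 when the GPU criterion is not met. Purely illustrative.
    """
    level = int(fp8_tflops // 100)        # 100 TFLOPS FP8 per level
    return max(0, min(level, max_level))  # clamp to the scheme's range
```

With the post's numbers, ~100 TFLOPS FP8 (the quoted 4 RDNA 5 CUs) would land at Level 1, and extending `max_level` to 100 covers the 10 PFLOPS Rubin CPX-class example.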

Edit:
One thing I forgot: XDNA2 already supports MX6 and MX9. Only MX4 (FP4) is missing: https://docs.amd.com/r/en-US/am027-versal-aie-ml-v2/Functional-Overview
So there is very little to add to the existing NPU (MX4 can be derived from the MX6 units). INT2 / BitNet support would be cool as well; you could derive that from the existing INT8 units.
 
Last edited:
  • Like
Reactions: MrMPFR

MoistOintment

Member
Jul 31, 2024
101
153
76
The problem with Copilot+:
It specifically asks for an NPU. Look at Lunar Lake: 48 TOPS NPU and 67 TOPS GPU. Why bother adding an NPU if the iGPU already exceeds the 40 TOPS requirement? The same applies to Strix Halo.
Because the NPU is more power efficient, and Copilot+ plans to use it often.


Power efficiency. And since desktop already pretty much disregards perf/watt, why bother including an NPU if the GPU is already plenty powerful, unless it's still mandatory of course.
Unification. An NPU on desktop makes sense from the CPU designer's perspective, just to keep the desktop and laptop designs more similar. It also makes sense on the software side to have the same accelerators on both platforms.
 

Schmide

Diamond Member
Mar 7, 2002
5,743
1,031
126
2-3. The 9070 XT at 2.97 GHz lists 389 INT8 TOPS. 389 x 4/64 = 24.31 TOPS INT8 dense. +3% clocks to 3.06 GHz x doubled throughput = 50 TOPS. 3 WGPs/CUs = 75 TOPS.
Haha, I thought you meant 2-3 9070 XT(s). (dangling modifier)
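The quoted napkin math does check out; a quick sketch re-running it (every input comes from the post itself, and the 2x factor is the poster's assumed RDNA 5 per-CU INT8 throughput increase, not a confirmed spec):

```python
# Re-run the quoted TOPS scaling step by step (inputs taken from the post).
full_gpu_tops = 389                   # dense INT8 TOPS: 64-CU 9070 XT @ 2.97 GHz
per_4cu = full_gpu_tops * 4 / 64      # scale down to a 2 WGP / 4 CU iGPU slice
clocked = per_4cu * 3.06 / 2.97       # ~+3% clock bump to 3.06 GHz
doubled = clocked * 2                 # assumed doubled INT8 rate in RDNA 5
three_wgp = doubled * 3 / 2           # scale from 2 WGPs up to 3 WGPs

print(round(per_4cu, 2), round(doubled), round(three_wgp))  # prints: 24.31 50 75
```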
 

basix

Senior member
Oct 4, 2024
241
493
96
Because the NPU is more power efficient and CoPilot+ plans to use it often.

Unification. NPU on desktop makes sense from CPU designer to just make the designs between desktop and laptop more similar. Makes sense on the software side to have the same accelerators on both platforms.
Afaik NPUs are also better regarding time to first token, or in other words execution latency. They work better with small batch sizes. For many applications and single-user use cases this is helpful. But for big number crunching it should be better to move towards the GPU in the long term. The GPU also has the backing of a big and wide memory system; replicating that for an NPU is a waste of sand.

But funnily enough, doesn't Qualcomm add a better link between GPU and NPU to move matrix computations to the NPU (the GPU does not support such acceleration)?

Regarding software:
HW differences could be abstracted away by HALs and APIs.
 
  • Like
Reactions: Tlh97 and MrMPFR

Antey

Member
Jul 4, 2019
125
180
116
About that video. @Krteq

Nothing new, I think. Strix Halo moved on from SerDes and is using TSMC InFO-oS. This will mean better latencies and much better idle power consumption. And because of this, 3D V-Cache on top of the die is much simpler, but they could figure it out and put it under the die.

It's interesting for mobile because until now (and before Strix Halo) you had to go with a monolithic design, and now they can bring chiplet designs to mobile.
 

Josh128

Golden Member
Oct 14, 2022
1,318
1,980
106
About that video. @Krteq

Nothing new, I think. Strix Halo moved on from SerDes and is using TSMC InFO-oS. This will mean better latencies and much better idle power consumption. And because of this, 3D V-Cache will probably need to be on top of the die again.

No way. There's no way they will choose to go back on top of a supposed >6 GHz Zen 6 and limit themselves back to ~5 GHz after a single gen of V-Cache below the CCD that enabled large gains. If Zen 6 vanilla is >6 GHz, I'd completely expect Zen 6 3D to be no less than ~5.5 GHz.
 

basix

Senior member
Oct 4, 2024
241
493
96
Yeah, I also don't think the V-Cache gets moved on top again.

I do not see a problem with routing a few wires through TSVs on a bottom V-Cache die. Zen 5's V-Cache die already does it for all the power delivery stuff. And with Zen 7 you probably have to do it anyway, when all of the L3 cache gets moved to a stacked die.
 

OneEng2

Senior member
Sep 19, 2022
834
1,104
106
Before anyone can point to the very good benchmarking figures for the new Snapdragon, let me preface it with this:

Price will also be a serious consideration, as Qualcomm representatives told us that we should expect the X2 Elite Extreme to arrive at a price tier higher than the range we saw last year with the launch of its first-gen Snapdragon X chips. In short, X2 Extreme systems will probably sell for significantly more than $1,000.
 

Meteor Late

Senior member
Dec 15, 2023
299
323
96
That perform more like Chromebooks when saddled with legacy x86 applications. I just can't see the value proposition. Perhaps someone could explain it to me.

The value proposition was about having great battery life while running everyday apps that are basically already ported, such as Excel, browsing, watching videos, etc. That was the value proposition.
Problem is, of course, Lunar Lake came in a bit later and, especially but not exclusively thanks to MoP, offers very good battery life too.
 
  • Like
Reactions: Tlh97

Joe NYC

Diamond Member
Jun 26, 2021
3,630
5,172
136
Same as what Apple's been shipping for like almost 4 years in Ultra parts.
As to how AMD's gonna use all the microbumps, well, take a guess.

More microbumps, wider bus?

Another guess would be creating connections to other CCDs, which would lead to the question: "What for?"