Question Zen 6 Speculation Thread

Page 223
Jul 27, 2020
28,173
19,203
146
If it's 15% ST and then increased the core count by 50%, wouldn't MT be more like 57%?
Since Zen 3, AMD has been making designs that crave bandwidth, because gobs of it are available in workstations and servers; but then they used the same designs for client too, where they got hobbled by limited bandwidth. No idea if client Zen 6 gets the bandwidth it needs to unlock its full potential.
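Spelling out the scaling math, as a sketch: under perfect scaling, aggregate MT throughput is per-core gain times core-count scaling. The 5% figure below is purely an assumption, to show one way a ~57% number could arise if power limits eat most of the per-core gain in all-core workloads:

```python
# Back-of-the-envelope MT scaling. All numbers are illustrative,
# not confirmed Zen 6 figures.

def mt_uplift(per_core_gain: float, core_scaling: float) -> float:
    """Aggregate multi-thread uplift (as a fraction), assuming per-core
    throughput improves by `per_core_gain` and core count scales by
    `core_scaling` with perfect scaling."""
    return (1 + per_core_gain) * core_scaling - 1

# Full 15% per-core gain carried into MT, with 50% more cores:
print(f"{mt_uplift(0.15, 1.5):.1%}")   # 72.5%

# If power limits cut the all-core per-core gain to ~5%:
print(f"{mt_uplift(0.05, 1.5):.1%}")   # 57.5%
```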
 

LightningZ71

Platinum Member
Mar 10, 2017
2,603
3,290
136
When comparing Intel and AMD's mobile market share, one must realize that AMD remains rather (intentionally) capacity constrained. They refuse to flood the market with low-margin parts, nor do they want to reduce margins on the parts that they actually do make. As for Intel, judging by the bids that I'm seeing pass through my email these days, they are stuffing channels with Alder/Raptor Lake U/P/H products at near break-even prices. The big push to eradicate Windows 10 is bringing a lot of purchases forward, and on the desktop, local AI is just not a priority.

AMD could be doing notably better in pro/mid to high end non-gaming laptops if they hadn't taken a double barrel sawed off shotgun to their own reproductive organs with Strix Point by trading out MALL cache for a useless NPU.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,603
3,290
136
No one uses the iGPU in the first world because it is essentially only usable for data visualization and handheld games. With 16MB of MALL, the 16 CU of Strix Point would be in the ballpark of the Radeon RX 6500, with better memory capacity; probably near mobile 3050 performance.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
AMD is making money hand over fist. I wouldn't say they are failing. Furthermore, they may be losing mobile market share, but they are punishing Intel terribly financially. The current trend and Intel strategy is not sustainable. Looks to me like AMD has a better chess game going on.

Good point. It's like the war of attrition going on in Ukraine. AMD is in a healthy position, Intel is sickly and scrambling to hold the front lines.

AMD is in a great position to outlast Intel. Even though these threads would like to see more "maneuver warfare", Lisa is content to just grind Intel down. Time is on AMD's side.

AMD has weapons to break the deadlock in server and desktop. We will see if Zen 6 notebook can deliver a breakthrough or just more attrition warfare.

I am guessing you say this because you believe that N3P is about equal to 18A? While this COULD be true, I don't think there is any evidence to suggest it. Personally, I expect 18A to be close to N2... but cost more and possibly clock slower to some extent. In DC, the lower clock shouldn't make any difference... so it might be a more potent technology than many are thinking... at least for DC.

That seems to be the consensus, that 18A is on par with N3P but the actual performance of the product remains to be seen.

MLID hinted that Gorgon Point (N4P) and Panther Lake will be close to on par in performance, which would point to Intel's design side underdelivering and AMD's overdelivering.

Then, when Zen 6 arrives, with N3P and N2P and LP cores, AMD might finally gain the advantage. We will see.

LOL. I agree. AMD's "Server First" design philosophy has been a very big success financially. I think they will get around to pushing their advantage in other markets, but as you say, they will stay focused on DC and AI for Zen 6 and Zen 7 IMO.

DC CPU was a little depressed, overall, in the last 2 years, as a lot of CAPEX shifted from CPU to GPU.

But, according to Lisa, all the past growth in GPU now needs more datacenter CPUs to process and present the results and generally keep up. So both Intel and AMD grew last quarter in datacenter CPU, but there is a difference between AMD growing with Turin and Intel growing with obsolete Sapphire Rapids and Emerald Rapids.

As I said in other threads, Intel will face a brick wall trying to release Diamond Rapids with 256 P-cores, which don't excel in power efficiency or area efficiency and will lack SMT, against Venice Dense, which will excel in power efficiency and area efficiency and has SMT.

Yep. As I stated earlier, this is a pretty bad chess strategy for Intel.
By unit sales, client is 20-30x the size of datacenter.
By revenue, they are about equal (~$20-30B/yr)

Looking at x86 datacenter revenue, it is about $5 billion per quarter, $20 billion per year
Client is about $10 billion per quarter, $40 billion per year.

In server, AMD is at roughly $2 billion per quarter, with $3 billion upside to $5 billion
In client, AMD is also at ~$2 billion but here with $8 billion upside to $10 billion

So there is far more to gain for AMD on the client end than on the server end.
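The upside math above, spelled out (all figures are the post's own rough quarterly estimates, in billions of USD, not official numbers):

```python
# Quarterly x86 revenue figures from the post, in $B (rough estimates).
dc_market, client_market = 5, 10   # total datacenter vs client per quarter
amd_dc, amd_client = 2, 2          # AMD's approximate run-rate in each

dc_upside = dc_market - amd_dc              # $3B/qtr left on the table
client_upside = client_market - amd_client  # $8B/qtr left on the table

print(f"server upside: ${dc_upside}B/qtr, client upside: ${client_upside}B/qtr")
```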
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
The funny thing about "AMD traded MALL in Strix Point for the NPU" is that both MALL and NPU are deprecated in the future

Makes sense on the CPU side, where CPU benefits from L3 far more than from MALL.

But deprecating MALL for the GPU: I guess there will be more SRAM embedded inside the GPU die, where it can be accessed with much lower latencies.

I wonder if AMD will ever figure out a V-Cache for GPU, which could use a cheaper die...
 

Kepler_L2

Golden Member
Sep 6, 2020
1,029
4,396
136
Really interesting. In favour of using the GPU for copilot features or dropping the whole damn thing?
Using the iGPU: there's no point in a 50-100 TOPS NPU when you have a 200+ TOPS iGPU.
But deprecating MALL for the GPU: I guess there will be more SRAM embedded inside the GPU die, where it can be accessed with much lower latencies.
They're going with larger L2 like NVIDIA
I wonder if AMD will ever figure out a V-Cache for GPU, which could use a cheaper die...
They already do for MI series
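As a rough sanity check on where TOPS figures of that size come from: peak INT8 TOPS is just MAC count × 2 ops per MAC × clock. The unit counts below are purely illustrative assumptions, not actual NPU or iGPU specs:

```python
def peak_tops(int8_macs: int, freq_ghz: float) -> float:
    """Theoretical peak INT8 TOPS: each MAC counts as 2 ops (multiply + add)."""
    return int8_macs * 2 * freq_ghz / 1000

# Illustrative only: a ~50 TOPS NPU needs on the order of 8K INT8 MACs
# at ~3 GHz, while an iGPU with ~4x that INT8 throughput at similar
# clocks lands in the 200+ TOPS class.
print(round(peak_tops(8192, 3.05)))    # ~50
print(round(peak_tops(32768, 3.05)))   # ~200
```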
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
They already do for MI series

The MI series cache is closer to MALL, AFAIK. At least it was in MI300/MI325. I am not sure if it has changed since then.

I was thinking a V-Cache that is more closely coupled with the GPU die - a logical unit with the GPU, with low latencies.

As opposed to a standalone unit (MALL) closely coupled with the memory controller, and not with the GPU.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
Wouldn't the NPU be much more efficient on die space?

iGPU + NPU vs. optimized iGPU?

My guess is that the incremental die-size increase for an iGPU optimized to also process NPU workloads will be smaller than a standalone NPU.
 

inquiss

Senior member
Oct 13, 2010
559
816
136
iGPU + NPU vs. optimized iGPU?

My guess is that the incremental die-size increase for an iGPU optimized to also process NPU workloads will be smaller than a standalone NPU.
I'd guess it's more like Microsoft asked for an NPU because they didn't want to miss a potential trend in on-device AI, but it's kinda useless. AI looks like it will be served from the cloud for now. You can just use the GPU instead, because the NPU won't be used at the same time as the GPU, and if future games involve it, then you wouldn't be gaming on the iGPU anyway. That's niche. So makes sense to me.

Wonder if the GPU has anything in the next gen that makes it better at what the NPU used to do?
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
That's not how any of that works.
MALL is already part of the GPU cache ladder.

Not like latency matters for GPUs much.

Well, more like serving some of the bandwidth needed from caches rather than from memory, where the bandwidth can become a bottleneck.
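A minimal sketch of that effect, with made-up numbers: a last-level cache with hit rate h absorbs a fraction h of the GPU's bandwidth demand, so only (1 − h) of it has to come off the memory bus:

```python
def dram_traffic(demand_gbs: float, hit_rate: float) -> float:
    """GB/s that must come from DRAM when a cache (MALL or big L2)
    absorbs `hit_rate` of the GPU's total bandwidth demand."""
    return demand_gbs * (1 - hit_rate)

# Hypothetical GPU demanding 800 GB/s: a 50%-hit-rate cache halves the
# DRAM traffic, effectively doubling the usable bandwidth of the bus.
for hr in (0.0, 0.3, 0.5):
    print(hr, dram_traffic(800, hr))
```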
 

branch_suggestion

Senior member
Aug 4, 2023
848
1,876
106
The funny thing about "AMD traded MALL in Strix Point for the NPU" is that both MALL and NPU are deprecated in the future
MALL will still be used in 3D/chiplet parts right?
At the end of the day it does come back to meeting bandwidth requirements; GDDR7 is a pretty nice bump, so a big L2 is enough for the compute this gen.
It's a much, much smaller cache.
Membw scaling itself is enough.
Indeed, I do wonder if the new L2 will have some MALL like functions.
And well a big L2 is ideal, much higher bw and lower latency for the SEs to feed upon. I think AMD figured out that their L2 is extremely good and making it big and simple is a net gain in PPA, though probably only possible with G7.