Question Zen 6 Speculation Thread

Page 223
Jul 27, 2020
28,173
19,203
146
If it's 15% ST and then increased the core count by 50%, wouldn't MT be more like 57%?
Since Zen 3, AMD has been making designs that crave bandwidth, because gobs of it are available in workstations and servers; but then they used the same designs for client too, where they got hobbled by limited bandwidth. No idea if client Zen 6 gets the bandwidth it needs to unlock its full potential.
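Spelling out the scaling math, as a sketch: under perfect scaling, aggregate MT throughput is per-core gain times core-count scaling. The 5% figure below is purely an assumption, to show one way a ~57% number could arise if power limits eat most of the per-core gain in all-core workloads:

```python
# Back-of-the-envelope MT scaling. All numbers are illustrative,
# not confirmed Zen 6 figures.

def mt_uplift(per_core_gain: float, core_scaling: float) -> float:
    """Aggregate multi-thread uplift (as a fraction), assuming per-core
    throughput improves by `per_core_gain` and core count scales by
    `core_scaling` with perfect scaling."""
    return (1 + per_core_gain) * core_scaling - 1

# Full 15% per-core gain carried into MT, with 50% more cores:
print(f"{mt_uplift(0.15, 1.5):.1%}")   # 72.5%

# If power limits cut the all-core per-core gain to ~5%:
print(f"{mt_uplift(0.05, 1.5):.1%}")   # 57.5%
```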
 

LightningZ71

Platinum Member
Mar 10, 2017
2,603
3,290
136
When comparing Intel and AMD's mobile market share, one must realize that AMD remains rather (intentionally) capacity constrained. They refuse to flood the market with low-margin parts, nor do they want to reduce margins on the parts that they actually do make. As for Intel, judging by the bids that I'm seeing pass through my email these days, they are stuffing channels with Alder/Raptor Lake U/P/H products at near break-even prices. The big push to eradicate Windows 10 is bringing a lot of purchases forward, and on the desktop, local AI is just not a priority.

AMD could be doing notably better in pro/mid to high end non-gaming laptops if they hadn't taken a double barrel sawed off shotgun to their own reproductive organs with Strix Point by trading out MALL cache for a useless NPU.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,603
3,290
136
No one uses the iGPU in the first world because it is essentially only usable for data visualization and handheld games. With 16MB of MALL, the 16 CU of Strix Point would be in the ballpark of the Radeon RX 6500, with better memory capacity; probably near mobile 3050 performance.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
AMD is making money hand over fist. I wouldn't say they are failing. Furthermore, they may be losing mobile market share, but they are punishing Intel terribly financially. The current trend and Intel strategy is not sustainable. Looks to me like AMD has a better chess game going on.

Good point. It's like the war of attrition going on in Ukraine. AMD is in a healthy position, Intel is sickly and scrambling to hold the front lines.

AMD is in a great position to outlast Intel. Even though these threads would like to see more "maneuver warfare", Lisa is content to just grind Intel down. Time is on AMD's side.

AMD has weapons to break the deadlock in server and desktop. We will see if Zen 6 notebook can deliver a breakthrough or just more attrition warfare.

I am guessing you say this because you believe that N3P is about equal to 18A? While this COULD be true, I don't think there is any evidence to suggest it. Personally, I expect 18A to be close to N2... but cost more and possibly clock slower to some extent. In DC, the lower clock shouldn't make any difference... so it might be a more potent technology than many are thinking... at least for DC.

That seems to be the consensus, that 18A is on par with N3P but the actual performance of the product remains to be seen.

MLID hinted that Gorgon Point (N4P) and Panther Lake will be close to on par in performance, which would point to Intel's design side underdelivering and AMD's overdelivering.

Then, when Zen 6 arrives, with N3P and N2P and LP cores, AMD might finally gain the advantage. We will see.

LOL. I agree. AMD's "Server First" design philosophy has been a very big success financially. I think they will get around to pushing their advantage in other markets, but as you say, they will stay focused on DC and AI for Zen 6 and Zen 7 IMO.

DC CPU was a little depressed, overall, in the last 2 years, as a lot of CAPEX shifted from CPU to GPU.

But, according to Lisa, all the past growth in GPU now needs more datacenter CPUs to process and present the results and generally keep up. So both Intel and AMD grew last quarter in datacenter CPU, but there is a difference between AMD growing with Turin and Intel growing with obsolete Sapphire Rapids and Emerald Rapids.

As I said in other threads, Intel will face a brick wall trying to release Diamond Rapids with 256 P-cores, which don't excel in power efficiency or area efficiency and will lack SMT, against Venice Dense, which will excel in power efficiency and area efficiency and has SMT.

Yep. As I stated earlier, this is a pretty bad chess strategy for Intel.
By unit sales, client is 20-30x the size of datacenter.
By revenue, they are about equal (~$20-30B/yr)

Looking at x86 datacenter revenue, it is about $5 billion per quarter, $20 billion per year
Client is about $10 billion per quarter, $40 billion per year.

In server, AMD is at roughly $2 billion per quarter, with $3 billion upside to $5 billion
In client, AMD is also at ~$2 billion but here with $8 billion upside to $10 billion

So there is far more to gain for AMD on the client end than on the server end.
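The upside math above, spelled out (all figures are the post's own rough quarterly estimates, in billions of USD, not official numbers):

```python
# Quarterly x86 revenue figures from the post, in $B (rough estimates).
dc_market, client_market = 5, 10   # total datacenter vs client per quarter
amd_dc, amd_client = 2, 2          # AMD's approximate run-rate in each

dc_upside = dc_market - amd_dc              # $3B/qtr left on the table
client_upside = client_market - amd_client  # $8B/qtr left on the table

print(f"server upside: ${dc_upside}B/qtr, client upside: ${client_upside}B/qtr")
```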
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
The funny thing about "AMD traded MALL in Strix Point for the NPU" is that both MALL and NPU are deprecated in the future

Makes sense on the CPU side, where CPU benefits from L3 far more than from MALL.

But deprecating MALL for the GPU: I guess there will be more SRAM embedded inside the GPU die, where it can be accessed with much lower latencies.

I wonder if AMD will ever figure out a V-Cache for GPU, which could use a cheaper die...
 

Kepler_L2

Golden Member
Sep 6, 2020
1,029
4,396
136
Really interesting. In favour of using the GPU for copilot features or dropping the whole damn thing?
Using the iGPU: there's no point in a 50-100 TOPS NPU when you have a 200+ TOPS iGPU.
But deprecating MALL for the GPU: I guess there will be more SRAM embedded inside the GPU die, where it can be accessed with much lower latencies.
They're going with larger L2 like NVIDIA
I wonder if AMD will ever figure out a V-Cache for GPU, which could use a cheaper die...
They already do for MI series
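As a rough sanity check on where TOPS figures of that size come from: peak INT8 TOPS is just MAC count × 2 ops per MAC × clock. The unit counts below are purely illustrative assumptions, not actual NPU or iGPU specs:

```python
def peak_tops(int8_macs: int, freq_ghz: float) -> float:
    """Theoretical peak INT8 TOPS: each MAC counts as 2 ops (multiply + add)."""
    return int8_macs * 2 * freq_ghz / 1000

# Illustrative only: a ~50 TOPS NPU needs on the order of 8K INT8 MACs
# at ~3 GHz, while an iGPU with ~4x that INT8 throughput at similar
# clocks lands in the 200+ TOPS class.
print(round(peak_tops(8192, 3.05)))    # ~50
print(round(peak_tops(32768, 3.05)))   # ~200
```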
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
They already do for MI series

The MI series cache is closer to MALL, AFAIK. At least it was in MI300/MI325. I am not sure if it has changed since then.

I was thinking a V-Cache that is more closely coupled with the GPU die - a logical unit with the GPU, with low latencies.

As opposed to a standalone unit (MALL) closely coupled with the memory controller, and not with the GPU.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
Wouldn't the NPU be much more efficient on die space?

iGPU + NPU vs. optimized iGPU?

My guess is that the incremental die-size increase for an iGPU optimized to also process NPU workloads will be smaller than a standalone NPU.
 

inquiss

Senior member
Oct 13, 2010
559
816
136
iGPU + NPU vs. optimized iGPU?

My guess is that the incremental die-size increase for an iGPU optimized to also process NPU workloads will be smaller than a standalone NPU.
I'd guess it's more like Microsoft asked for an NPU because they didn't want to miss a potential trend in on-device AI, but it's kinda useless. AI looks like it will be served from the cloud for now. You can just use the GPU instead, because the NPU won't be used at the same time as the GPU, and if future games involve it, then you wouldn't be gaming on the iGPU anyway. That's niche. So makes sense to me.

Wonder if the GPU has anything in the next gen that makes it better at what the NPU used to do?
 

Joe NYC

Diamond Member
Jun 26, 2021
3,837
5,378
136
That's not how any of that works.
MALL is already part of the GPU cache ladder.

Not like latency matters for GPUs much.

Well, more like serving some of the bandwidth needed from caches rather than from memory, where the bandwidth can become a bottleneck.
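A minimal sketch of that effect, with made-up numbers: a last-level cache with hit rate h absorbs a fraction h of the GPU's bandwidth demand, so only (1 − h) of it has to come off the memory bus:

```python
def dram_traffic(demand_gbs: float, hit_rate: float) -> float:
    """GB/s that must come from DRAM when a cache (MALL or big L2)
    absorbs `hit_rate` of the GPU's total bandwidth demand."""
    return demand_gbs * (1 - hit_rate)

# Hypothetical GPU demanding 800 GB/s: a 50%-hit-rate cache halves the
# DRAM traffic, effectively doubling the usable bandwidth of the bus.
for hr in (0.0, 0.3, 0.5):
    print(hr, dram_traffic(800, hr))
```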
 

branch_suggestion

Senior member
Aug 4, 2023
848
1,876
106
The funny thing about "AMD traded MALL in Strix Point for the NPU" is that both MALL and NPU are deprecated in the future
MALL will still be used in 3D/chiplet parts right?
At the end of the day it does come back to meeting bandwidth requirements; GDDR7 is a pretty nice bump, so a big L2 is enough for the compute this gen.
It's a much, much smaller cache.
Membw scaling itself is enough.
Indeed, I do wonder if the new L2 will have some MALL like functions.
And well a big L2 is ideal, much higher bw and lower latency for the SEs to feed upon. I think AMD figured out that their L2 is extremely good and making it big and simple is a net gain in PPA, though probably only possible with G7.