Discussion Zen 7 speculation thread


adroc_thurston

Diamond Member
Jul 2, 2023
7,472
10,241
106
Not sure I agree. Produce a crap ton of 8-core CCDs with Silverking and there'll still be quite a few 6-core CCDs.

My big question: will there be enough defective Silverton CCDs, with half the core and cache count but still using X3D cache, that you'd still have 192MB L3 SKUs running the same clock speed as the full-fat versions? Silverking will be "budget" Zen 7 (I mean, it'll probably be produced for a very long time, since AM5 will become the "budget" platform, like AM4 is right now).
Ah we were talking Zen6.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,815
5,363
136
In the MLID Zen 7 leak video, he refers to:
- FP512 - I wonder what that might be. Perhaps an AVX-512 equivalent, synchronized with Intel?
- 1/2 ACE - ACE seems to come from the Advanced Matrix Extensions work in the AMD / Intel collaboration, and presumably, since this referred to 1/2 CCD (8 cores), it could be 1/2 of AMD's planned ACE unit.
- Tom was wondering if this can replace the NPU, and I am wondering the same thing. Probably not.
- It mentions 4x FP8 performance and 2x Int8 performance. I am assuming that these will be new datatypes for AVX-512. Zen 6 is already adding 2x FP16 performance, so Zen 7 seems to be extending it further to 8x FP8. I presume this will also be part of the AVX-512 definition.

Overall, it looks to me like AMD is determined to take the performance crown from Apple in client in every measure and category, including efficiency.

Similarly, in server, it aims to extend its lead in absolute performance per core and also in efficiency.

With full-core CCDs moving to 16 cores, the full-core CPU goes from 96 to 128 cores and, with 224 MB of L3 (including V-Cache), total L3 goes to 1.7 GB; the 264-core dense part goes to > 2 GB of L3 per CPU.
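
The full-core L3 total can be checked with quick arithmetic. This is only a minimal sketch, assuming the 224 figure means 224 MB of L3 per 16-core CCD (the post does not spell that out); the dense part is left out because its CCD layout is not given.

```python
# Back-of-the-envelope check of the full-core L3 total.
# Assumption (not stated in the post): 224 MB of L3, V-Cache included,
# per 16-core CCD.
CORES_PER_CPU = 128
CORES_PER_CCD = 16
L3_PER_CCD_MB = 224


def total_l3_mb(cores_per_cpu: int, cores_per_ccd: int, l3_per_ccd_mb: int) -> int:
    """Total L3 in MB for a CPU built from identical CCDs."""
    ccd_count = cores_per_cpu // cores_per_ccd
    return ccd_count * l3_per_ccd_mb


full_core_l3_mb = total_l3_mb(CORES_PER_CPU, CORES_PER_CCD, L3_PER_CCD_MB)
print(f"{full_core_l3_mb} MB total, about {full_core_l3_mb / 1024:.2f} GB")
```

Under that assumption, 8 CCDs at 224 MB each land in the same range as the roughly 1.7 GB quoted.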
 

511

Diamond Member
Jul 12, 2024
4,822
4,386
106
- FP512 - I wonder what that might be. AVX-512 equivalent synchronized with Intel
- 1/2 ACE - ACE seem to come from Advanced Matrix Extensions from AMD / Intel collaboration, and presumably, since this referred to 1/2 CCD (8 cores) it could be 1/2 of AMD's planned ACE unit.
- Tom was wondering if this can replace NPU, and I am wondering the same thing. Probably not.
- it mentions 4x FP8 performance and 2x Int8 performance. I am assuming that these will be new datatypes for AVX-512. Zen 6 is already adding 2x FP16 performance, so Zen 7 seems to be extending it further to 8xFP8. I presume this will also be part of the AVX-512 definition.
It's a lot lower than AMX, and Int8 is supported with VNNI, which is coming with Zen 6. I wouldn't put much stock in MLID's assumption, because he can't read a spec sheet properly and 100% would have gotten the numbers mixed up.

1762753893530.png
 

Joe NYC

It's a lot lower than AMX and Int8 is supported with VNNI which is coming with Zen 6. I wouldn't put much stock in MLID assumption cause he can't read a spec sheet properly and 100% he would have gotten the numbers mixed up.

View attachment 133508

To his credit, he expressed the lowest level of confidence in it and more or less admitted he does not know what he is talking about. He questioned himself about what it means.

So, if ACE is going to migrate from per core to per CPU (or CCD or CCX), then it will be a new ballgame as far as performance goes. We could see anywhere from an 8x to 32x performance increase for the co-processor unit vs. per-core AMX.

So the question then is how that level of performance compares with the NPUs in current computers, and whether the NPU can be discarded this way.

As for the additional data type enhancements to AVX-512, he said their usefulness would be in feeding AI GPUs data in their preferred formats, not as a substitute for matrix manipulation in either a datacenter GPU or the ACE unit.
 

mikegg

Platinum Member
Jan 30, 2010
2,033
586
136
Overall, it looks to me like AMD is determined to take performance crown from Apple in client in every measure and category, including efficiency.
Can you estimate how much more performance and efficiency gains AMD needs to overtake Apple's M8?

Here's a baseline for you via Notebookcheck. M4 Pro is roughly 52% faster in Cinebench ST and 3.6x more efficient than Strix Halo.

Let's suppose M8 doubles M4 perf/watt to 19 pts/W. Let's suppose ST is increased by 46% over 4 generations to 260 points.

Will Zen7 increase Strix Halo efficiency by 7.2x while also increasing ST performance by 2.2x?

Benchmark | Strix Halo 395+ | M4 Pro Mini | M4 Max | % Difference (M4 Max vs Strix Halo)
Memory Bandwidth | 256 GB/s | 273 GB/s | 546 GB/s | +113.3%
Cinebench 2024 ST | 116.8 | 178 | 178 | +52.4%
Cinebench 2024 MT | 1648 | 1729 | 2069 | +25.6%
Geekbench ST | 2978 | 3836 | 3880 | +30.3%
Geekbench MT | 21269 | 22509 | 25760 | +21.1%
3DMark Wildlife (GPU) | 19615 | 19345 | 37434 | +90.8%
GFX Bench (fps) (GPU) | 114 | 125.8 | 232 | +103.5%
Blender GPU Party Tug (GPU) | 55 sec | 43 sec | - | -
Cinebench ST Power Efficiency | 2.62 pts/W | 9.52 pts/W | - | -
Cinebench MT Power Efficiency | 14.7 pts/W | 20.2 pts/W | - | -
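
The multipliers in the question can be reproduced from the quoted figures. The sketch below is only arithmetic on the numbers in this post; the M8 values are the post's own suppositions, not real data, and the variable names are made up for illustration.

```python
# Figures as quoted in the post (Notebookcheck numbers per the author).
STRIX_HALO_ST = 116.8       # Cinebench 2024 ST, points
M4_PRO_ST = 178.0           # Cinebench 2024 ST, points
STRIX_HALO_ST_EFF = 2.62    # Cinebench ST efficiency, pts/W
M4_PRO_ST_EFF = 9.52        # Cinebench ST efficiency, pts/W

# Hypothetical M8 targets, exactly as supposed in the post (not measurements).
m8_st = M4_PRO_ST * 1.46         # +46% ST over 4 generations
m8_st_eff = M4_PRO_ST_EFF * 2    # doubled perf/watt

current_eff_gap = M4_PRO_ST_EFF / STRIX_HALO_ST_EFF
needed_eff_gain = m8_st_eff / STRIX_HALO_ST_EFF
needed_st_gain = m8_st / STRIX_HALO_ST

print(f"Current ST efficiency gap: {current_eff_gap:.2f}x")
print(f"Efficiency gain needed to match the supposed M8: {needed_eff_gain:.2f}x")
print(f"ST gain needed to match the supposed M8: {needed_st_gain:.2f}x")
```

The 7.2x in the question is the rounded 3.6x gap doubled; computed from the unrounded values it comes out slightly higher.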
 

Joe NYC

Can you estimate how much more performance and efficiency gains AMD needs to overtake Apple's M8?

The biggest game changer for AMD will be the inclusion of LP cores, which AMD does not currently have. That matters most for battery life, which most people take as a proxy for efficiency.

So this advantage, the perception of efficiency, will shrink significantly.

Here's a baseline for you via Notebookcheck. M4 Pro is roughly 52% faster in Cinebench ST and 3.6x more efficient than Strix Halo.

The absolute ST scores of Mac CPUs are why they are so legendary, and why they are something of an inspiration, I bet, to all CPU designers. It seems like AMD is gunning for those numbers. But +52% is a big gap.

The ST performance per Watt has too much noise to be of any use.

Let's suppose M8 doubles M4 perf/watt to 19 pts/W. Let's suppose ST is increased by 46% over 4 generations to 260 points.

Will Zen7 increase Strix Halo efficiency by 7.2x while also increasing ST performance by 2.2x?

MT is a more useful measure of core efficiency, because it maximizes power usage by the cores and minimizes extraneous variables.

In MT performance, the M4 leads by 21-25%, a gap that (even with future Apple gains) should be quite achievable to close.

In MT efficiency, the M4 leads by 37%. So this would be more challenging, but don't forget that Apple has a node advantage now and will not have one when Zen 6 and Zen 7 ship.

All of the gaps shrink just from closing the gap in process technology, before any new CPU design features are introduced.
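
The MT gaps cited here follow directly from the figures quoted in this thread. A minimal sketch, assuming the 20.2 pts/W MT efficiency figure belongs to the M4 Pro (the flattened table does not make the column explicit); `lead_pct` is a helper name introduced for illustration.

```python
# MT figures as quoted in this thread.
CINEBENCH_MT = {"Strix Halo": 1648, "M4 Max": 2069}
GEEKBENCH_MT = {"Strix Halo": 21269, "M4 Max": 25760}
# Assumption: the second efficiency value is the M4 Pro's.
CINEBENCH_MT_EFF = {"Strix Halo": 14.7, "M4 Pro": 20.2}  # pts/W


def lead_pct(leader: float, trailer: float) -> float:
    """Percentage by which `leader` exceeds `trailer`."""
    return (leader / trailer - 1.0) * 100.0


cinebench_lead = lead_pct(CINEBENCH_MT["M4 Max"], CINEBENCH_MT["Strix Halo"])
geekbench_lead = lead_pct(GEEKBENCH_MT["M4 Max"], GEEKBENCH_MT["Strix Halo"])
efficiency_lead = lead_pct(CINEBENCH_MT_EFF["M4 Pro"], CINEBENCH_MT_EFF["Strix Halo"])

print(f"Cinebench MT lead: {cinebench_lead:.1f}%")
print(f"Geekbench MT lead: {geekbench_lead:.1f}%")
print(f"Cinebench MT efficiency lead: {efficiency_lead:.1f}%")
```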

 

mikegg

The biggest game changer for AMD will be inclusion of LP cores, which AMD does not have currently. That's a game changer for battery life, which most people take as a proxy of efficiency.

So, this advantage, perception of efficiency will shrink significantly.
Efficiency cores are one thing. I can play AAA games on my MacBook Pro on battery and it remains cool, with no fans spinning, and lasts 5-6 hours. Efficiency cores play little to no role in this scenario.

The ST performance per Watt has too much noise to be of any use.
Why? Numbers are right there for you.

MT is a more useful measure of core efficiency, because it maximizes power usage by the cores and minimizes extraneous variables.
Nope. MT efficiency is a function of how many cores (and threads) you have, which you then run at lower clock speeds. MT efficiency is relatively easy to achieve. If AMD wants, they can beat M4 Pro in MT efficiency right now. In fact, their Epyc/Threadripper CPUs probably already do.

Funny how you talked up Zen 7's core design, then switched to using MT as the efficiency barometer. What?
 

511

Can you estimate how much more performance and efficiency gains AMD needs to overtake Apple's M8?

Here's a baseline for you via Notebookcheck. M4 Pro is roughly 52% faster in Cinebench ST and 3.6x more efficient than Strix Halo.

Let's suppose M8 doubles M4 perf/watt to 19 pts/W. Let's suppose ST is increased by 46% over 4 generations to 260 points.

Will Zen7 increase Strix Halo efficiency by 7.2x while also increasing ST performance by 2.2x?

You can obviously run Strix Halo ST at lower clocks to get better perf/watt in ST/MT, and comparing an N3E product to an N4P product is an apples-to-oranges comparison. Let Zen 6/M6 launch.
 

mikegg

You can obviously run Strix Halo ST at lower clocks to get better perf/watt in ST/MT, and comparing an N3E product to an N4P product is an apples-to-oranges comparison. Let Zen 6/M6 launch.
So what is Strix Halo's ST speed if you lower the clocks to get the same perf/watt as M4 Pro?

At stock clocks, M4 Pro is already 52% faster than Strix Halo ST.
 

mikegg

You can't get the same perf/watt as M4 Pro but it would be like half of M4 pro
Hmm... So Strix Halo would have the ST performance of an iPhone 12 in that case. I think probably even worse given that you're decreasing power usage by 3.6x. Maybe iPhone 10 ST performance.

Are we sure @Joe NYC's speculation about AMD's Zen 7 goal is achievable given how far behind AMD is right now? It seems like AMD is about 3-4 generations behind M4. That means that for Zen 7 to beat M8, it needs about 8 generations of improvements in 2 generations. Is it achievable?
Overall, it looks to me like AMD is determined to take performance crown from Apple in client in every measure and category, including efficiency.
 

Joe NYC

Here's a baseline for you via Notebookcheck. M4 Pro is roughly 52% faster in Cinebench ST and 3.6x more efficient than Strix Halo.

What's the link to this review? Because when I look at the site, they only have a Strix Halo tablet for comparison. So, comparing a tablet vs. a "Pro" premium laptop vs. a Mini PC does not seem like apples to apples.

When I see Mini PC comparisons, the results I find are different from what you posted.
 

mikegg

What's the link to this review? Because when I look at the site, they only have a Strix Halo tablet for comparison. So, comparing a tablet vs. a "Pro" premium laptop vs. a Mini PC does not seem like apples to apples.

When I see Mini PC comparisons, the results I find are different from what you posted.
 

Joe NYC

Why? Numbers are right there for you.

I don't trust the power consumption figures for the ST benchmark, and I don't think it is a useful metric.

ST performance is unconstrained by the overall power limit of the chip, unless it is constrained by various BIOS settings. You don't know what they are. It can be anything. And it can be influenced by non-core-related uses of power that are not being accurately measured, much more so than MT.

An MT test, which is constrained to the same power limit, is a much better measure of efficiency.

Nope. MT efficiency is a function of how many cores (and threads) you have, which you then run at lower clock speeds. MT efficiency is relatively easy to achieve. If AMD wants, they can beat M4 Pro in MT efficiency right now. In fact, their Epyc/Threadripper CPUs probably already do.

Funny how you talked up Zen 7's core design, then switched to using MT as the efficiency barometer. What?

No, you started it by raising points about it and I just answered.
 

mikegg

I don't trust the power consumption for ST benchmark and I don't think it is a useful metrics.

ST performance is unconstrained by overall power limit of the chip - unless it is constrained by various BIOS settings. You don't know what they are. It can be anything. And it can be influenced by non-Core related uses of power that are not being accurately measured - much more than MT.

MT test, which is constrained to the same power limit is a much better measure of efficiency
Translation: AMD is so far behind in ST efficiency and performance that burying one's head in the sand is a valid option.
 

Joe NYC


Thanks. So, it is using the Strix Halo tablet vs. MacBook Pro, and I still don't see where the Mac Mini results are coming from.
 

mikegg

Thanks. So, it is using the Strix Halo tablet vs. MacBook Pro, and I still don't see where the Mac Mini results are coming from.
Mac Mini label was my mistake. I used this chart somewhere else. Regardless, the numbers themselves don't change.
 

Joe NYC

Translation: AMD is so far behind in ST efficiency and performance that burying one's head in the sand is a valid option.

No. Set the core power limit (if possible) to 3W, 5W, 10W, and 15W, see where the performance lands, and then derive the performance/W from that.
 

mikegg

No, set the power limit to the core (if possible) to 3W, 5W, 10W, 15W and see where the performance lands and then draw the performance / W from that.
Why just the core? Will real users experience special physics that their computer will only consume power from the core?

Let us know if you have numbers for those wattages.
 

Joe NYC

So what is Strix Halo's ST speed if you lower the clocks to get the same perf/watt as M4 Pro?

At stock clocks, M4 Pro is already 52% faster than Strix Halo ST.

If the last 1% of performance came at the expense of 10% more power, you would not be reporting anything useful, since you don't know where you are on the curve for either processor.

If there were overclocking tools for Mac and you raised the voltage limit and the clock, you could get into a very power-inefficient zone too. But Apple just prevents you from doing it.

BTW, don't limit clocks as @511 suggested, limit power. That way you are getting some useful information out of the test.
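
The point about not knowing where you are on the curve can be sketched with two operating points. The numbers below are hypothetical, built only from the "last 1% of performance for 10% more power" example above; they are not measurements of any chip, and `OperatingPoint` is a name made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    power_w: float  # power drawn at this power limit, in watts
    score: float    # benchmark score reached at that power


def perf_per_watt(point: OperatingPoint) -> float:
    """Average efficiency at a single operating point."""
    return point.score / point.power_w


def marginal_perf_per_watt(low: OperatingPoint, high: OperatingPoint) -> float:
    """Extra score gained per extra watt when moving from `low` to `high`."""
    return (high.score - low.score) / (high.power_w - low.power_w)


# Hypothetical points: +10% power buys +1% score (illustration only).
base = OperatingPoint(power_w=10.0, score=100.0)
pushed = OperatingPoint(power_w=11.0, score=101.0)

print(f"perf/W at base:   {perf_per_watt(base):.2f}")
print(f"perf/W at pushed: {perf_per_watt(pushed):.2f}")
print(f"marginal perf/W:  {marginal_perf_per_watt(base, pushed):.2f}")
```

A single stock-settings measurement reports only one of these points, which is why sweeping the power limit and recording a point at each step says more about the core than one perf/W figure.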
 