Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


poke01

Senior member
Mar 8, 2022
998
1,094
106
Yeah it's not hard, but GB6 is worthless now since you can pump the matrix math subtest with shared accelerators.
They seem to be continuously fudging Geekbench 6, so not my favorite, but here it is anyway:
Geekbench 5 ain't any better, as it uses AVX-512 to pump up scores. So any non-Intel (e.g. 11th gen) and ARM CPUs will score lower.

Cinebench 2024 is probably the best one now for consumers
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
Honestly, I don't believe in an average IPC increase of +10%. That figure is really just one point on the IPC growth curve, for one specific task.

I think it will be closer to +20%.

But I don't think I'd be far off in guessing that the IPC growth curve for Zen 5 will run from +1-5% to +40-50% ;)
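To make the "average vs. curve" point concrete, here is a minimal sketch of how a spread of per-workload uplifts collapses into one headline number. The geometric mean is used as one common way to summarize such ratios; whether AMD uses it is an assumption, and the uplift values are made up for illustration, not leaked or measured figures.

```python
import math

def geomean_uplift(uplifts):
    """Geometric-mean uplift for a list of fractional per-workload uplifts."""
    ratios = [1.0 + u for u in uplifts]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return math.exp(log_mean) - 1.0

# Hypothetical per-workload IPC uplifts (illustrative only).
hypothetical_uplifts = [0.02, 0.05, 0.10, 0.20, 0.45]

average = geomean_uplift(hypothetical_uplifts)
print(f"range: +{min(hypothetical_uplifts):.0%} to +{max(hypothetical_uplifts):.0%}")
print(f"geomean: +{average:.1%}")
```

The single average always lands somewhere inside the range, so one quoted figure says little about the best and worst cases.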
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
Zen3 did that with barely any area investment.
Zen 3 is a new and better design than Zen 2, adding approximately 15% more transistors to the core logic (excluding L2).

Additionally, the brand-new L3 cache design, shared across 8 cores instead of being split into 2 CCXs of 4 cores each, had a significant impact on IPC.
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
amd-zen-3-ipc-gains.png

Note that not all of the tests are 1T. AMD shows the IPC growth curve for 8 cores vs 8 cores.

Do you think that the L3 cache in Zen3 has absolutely nothing to do with the achieved IPC?
 

SpudLobby

Senior member
May 18, 2022
788
489
106
Any predictions on whether Zen 5 can match the Apple M4 Geekbench 6 single-thread score that has been posted? They seem to be continuously fudging Geekbench 6, so not my favorite, but here it is anyway:

M4: ~3800
7950: 3068
You can just subtract 9-10% from the Apple score for the non-AMX value.
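
As a quick sanity check on that adjustment, here is a minimal sketch of the arithmetic. It uses the two single-thread scores quoted above and treats the 9-10% discount as a flat multiplier, which is an assumption; the function name is made up for illustration.

```python
def without_amx(score, discount):
    """Scale a Geekbench 6 score down by a flat fractional discount."""
    return score * (1.0 - discount)

m4_score = 3800      # approximate M4 single-thread score quoted above
r7950_score = 3068   # 7950 single-thread score quoted above

for discount in (0.09, 0.10):
    adjusted = without_amx(m4_score, discount)
    lead = adjusted / r7950_score - 1.0
    print(f"-{discount:.0%}: M4 ~{adjusted:.0f}, lead over 7950 ~{lead:.1%}")
```

That puts the adjusted M4 score at roughly 3420 to 3458, still about 11-13% ahead of the 3068 figure.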

And yeah GB5 AVX512 makes it unreliable as well.
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
I've literally posted the detailed breakdown of microarchitectural updates contributing to Zen3 IPC uplift.
Deck's 2yo and public.
I've known this graphic since it was published online; it gives a percentage breakdown of how much of the uplift is attributed to each part of the core.

My question is still: doesn't the RAM controller, L3 cache capacity and design, etc. affect IPC?

If you reduce RAM bandwidth and increase access latency, won't you lower the achieved IPC?
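
The question above can be framed with the textbook stall model, where measured cycles per instruction are the core's own CPI plus memory stall cycles. This is only a sketch: it assumes stalls never overlap with useful work (no memory-level parallelism), and every number below is hypothetical rather than taken from any Zen core.

```python
def effective_ipc(core_ipc, misses_per_kilo_instr, miss_penalty_cycles):
    """Measured IPC once memory stall cycles are added to the core's CPI."""
    core_cpi = 1.0 / core_ipc
    stall_cpi = misses_per_kilo_instr / 1000.0 * miss_penalty_cycles
    return 1.0 / (core_cpi + stall_cpi)

# Hypothetical core: 4 instructions per cycle when it never waits on memory.
core_ipc = 4.0
mpki = 2.0  # hypothetical last-level cache misses per 1000 instructions

for penalty in (100, 150):  # hypothetical miss penalties in cycles
    ipc = effective_ipc(core_ipc, mpki, penalty)
    print(f"miss penalty {penalty} cycles -> measured IPC {ipc:.2f}")
```

In this model, raising the miss penalty or the miss rate lowers measured IPC even though the core itself is unchanged. How much of that shows up in practice depends on how well prefetching and out-of-order execution hide the latency.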
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
No, they're literally the things responsible for the IPC uplift.
So why does AMD quote an IPC increase for games on Zen 3, when it is known that games use a larger number of cores, e.g. 8, and that the increase there comes from one large L3 shared by all 8 cores rather than split across CCXs? Can you explain this to me?

Isn't it the case that with unified L3 on 8 cores, communication latencies between cores are lower, which means that the microarchitecture has less downtime and is able to process more data?

What do you think the 3D V-Cache does?
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,668
14,676
136
The following is ONLY from a distributed computing perspective. That means two things: computing power and efficiency. We use all cores, and most of the time SMP.

Zen 1 : way better than Bulldozer in all respects, and if I remember correctly cheaper and more efficient than the Intel counterparts.
Zen 2 : small improvements in performance, about the same efficiency as Zen 1.
Zen 3 : MUCH better performance AND efficiency compared to Zen 2. The larger L3 cache made a big difference in some apps.
Zen 4 : MUCH better performance than Zen 3, but about the same efficiency. But in apps that use AVX-512, nothing could touch the performance. For PrimeGrid, we had to disable SMT and pin cores to a CCX for maximum performance, but when we do, nothing that Intel has comes close.

Zen 5 : Most of us anticipate that it will be the same in efficiency, but far better in performance.

Again, speaking for the DC community, and that means 100% load 24/7/365. And yes, the electric bill means more to us than to any other group of users.
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
Because those are branchy membound workloads, Zen3 new BP with bigger BTBs did wonders there.
There are no miracles, otherwise you could say it's a miracle that it works. If you hypothetically cut off all L3 memory, wouldn't the performance drop be an IPC drop? Would the BTB make up for that too?
After all, the core will wait longer for data and will see a measurable drop in IPC. Remember, I'm not talking about the theoretical IPC of the microarchitecture, but the measurable one.

For example, the difference in a given title at 4GHz with 3D V-Cache is +15% compared to cores without 3D V-Cache.
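
The reason a fixed-clock comparison like this reads as an IPC difference can be sketched as follows. It assumes performance equals IPC times clock for the same instruction stream; the function name is hypothetical, and the only figures used are the +15% and 4GHz from the example above.

```python
def implied_ipc_ratio(perf_ratio, clock_a_ghz, clock_b_ghz):
    """IPC ratio (B over A) implied by a performance ratio (B over A).

    Assumes perf = IPC * clock, so
    perf_b / perf_a = (ipc_b * clock_b) / (ipc_a * clock_a).
    """
    return perf_ratio * clock_a_ghz / clock_b_ghz

# Both parts locked to 4 GHz, the part with the extra cache is 15% faster.
ratio = implied_ipc_ratio(perf_ratio=1.15, clock_a_ghz=4.0, clock_b_ghz=4.0)
print(f"implied measured IPC gain: +{ratio - 1.0:.0%}")
```

With the clocks equal, the frequency terms cancel and the whole performance delta lands on measured IPC.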
 

H433x0n

Senior member
Mar 15, 2023
933
1,032
96
Power usage is going up quite a bit for DT SKUs. It’s unclear if it’s an accident or not, but there are numerous instances where a DT 8C SKU has a listed TDP of 170W.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,668
14,676
136
Power usage is going up quite a bit for DT SKUs. It’s unclear if it’s an accident or not, but there are numerous instances where a DT 8C SKU has a listed TDP of 170W.
Efficiency is the key. If an 8-core Zen 5 performs better than a 16-core Zen 4, then it would be worth it to us.

No absolutes on power, just efficiency.
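
For anyone who wants to put numbers on "just efficiency", here is a minimal sketch of performance per watt plus the yearly energy of a chip under constant load. The two chips and their throughput figures are hypothetical, not measurements. The 170 W value is the listed TDP mentioned above, used here as a stand-in for actual draw, which it may not match.

```python
def perf_per_watt(throughput, watts):
    """Work completed per watt; throughput is in arbitrary units."""
    return throughput / watts

def annual_kwh(watts, hours_per_day=24.0, days_per_year=365.0):
    """Energy used in a year at a constant power draw."""
    return watts * hours_per_day * days_per_year / 1000.0

# Hypothetical chips (illustrative numbers only).
chips = [
    {"name": "16-core (hypothetical)", "throughput": 100.0, "watts": 200.0},
    {"name": "8-core (hypothetical)", "throughput": 105.0, "watts": 170.0},
]

for chip in chips:
    ppw = perf_per_watt(chip["throughput"], chip["watts"])
    kwh = annual_kwh(chip["watts"])
    print(f'{chip["name"]}: {ppw:.3f} perf/W, {kwh:.1f} kWh/year at 24/7 load')
```

A constant 170 W works out to 1489.2 kWh over a year of 24/7 load, which is why perf per watt matters more here than the absolute power figure.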
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
I don't care about hypotheticals, you have detailed microarch breakdown from AMD.
The division on the slide is of a marketing nature and is not very insightful. It doesn't take cache into account at all because specific parts of the core are responsible for retrieving data.

The truth is that the cache and the RAM controller are part of the measurable IPC. By removing or adding cache (for example, an APU with a smaller L3) or changing the RAM controller, the measurable IPC will also change.

Changing from 2 CCXs of 4 cores to 1 CCX of 8 cores will also increase measurable IPC, as it provides faster access to more cache, but also faster data exchange between the 8 cores (less downtime).

That's why it's funny when someone claims that cache is not part of measurable IPC and has absolutely no impact on IPC.
 

AMDK11

Senior member
Jul 15, 2019
313
205
116
It's for ISSCC. From the people who designed it. I'll take that over 'a gamer explains'.

There's a reason AMD markets the X3Ds as 'the ultimate in latency reduction' not 'the ultimate in IPC'.
On C&C you will also read about solutions that AMD advertises as unique but that in fact are not.

Latency reduction is part of measurable IPC because it makes the core logic wait less for data, so it can execute more instructions in the same amount of time.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,821
3,312
136
On C&C you will also read about solutions that AMD advertises as unique but that in fact are not.

Latency reduction is part of measurable IPC because it makes the core logic wait less for data, so it can execute more instructions in the same amount of time.
Yes, but you're going to see that covered by the prefetch and load parts of the breakdown diagram. But that will only be a component of those sections.
 

Abwx

Lifer
Apr 2, 2011
11,101
3,776
136
On C&C you will also read about solutions that AMD advertises as unique but that in fact are not.

Latency reduction is part of measurable IPC because it makes the core logic wait less for data, so it can execute more instructions in the same amount of time.
You just forgot that V-Cache is shared by 8 cores; it's not single-core IPC which is improved, only MT throughput in some select apps. If you do an ST test you'll see that the score is the same with and without V-Cache.
 

adroc_thurston

Platinum Member
Jul 2, 2023
2,813
4,129
96
The division on the slide is of a marketing nature and is not very insightful
It's from ISSCC.
The truth is that the cache and the RAM controller are part of the measurable IPC
Sort of.
MTL nuked the memperf outta low orbit and it's still a barely notable (like 4%) regression.
tl;dr modern caches are good and work.

Truly awful mem implementations you haven't seen in your life since you're terminally gamer™.