Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).



What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

Tup3x

Senior member
Dec 31, 2016
964
949
136
I guess most of the gains come from the new process, higher power draw and clocks. Actual IPC gains are decent/somewhat underwhelming, nothing groundbreaking. I'm disappointed about the lack of USB4 when the connectivity is otherwise nice.
 

UsandThem

Elite Member
May 4, 2000
16,068
7,380
146
I thought that was the very reason why this forum exists in the first place :D
I didn't say people shouldn't do it, but since the person quoted me concerning a comment I made about their quarterly financials, I just wanted them to know I'm not one to debate on unreleased hardware.

Everyone else, carry on. :p
 

uzzi38

Platinum Member
Oct 16, 2019
2,632
5,957
146
Cinebench doesn't care about system memory or inter-core latency and bottlenecks.
It also doesn't see much gain from things like improved branch predictors, for example. Zen 3 already sees a >99% branch prediction rate in R23.

Even if you want to measure just the performance of the core itself, it's a poor benchmark.

Maybe you should just wait for more useful universal tests to be performed, like SPEC and Geekbench? Just a thought. Not saying I think there's going to be monumental gains in other workloads, just that treating Cinebench as a perfect workload for determining the capabilities of a CPU core taking memory out of the equation is a bit ridiculous.
 

amrnuke

Golden Member
Apr 24, 2019
1,181
1,772
136
The architecture on these machines (Zen3/ADL) is already so wide that it has been my opinion that significant ST performance increases due to architectural enhancements are going to be tough to come by. The low hanging fruit, and even the fruit higher up the tree has been picked. The only fruit left is really high up and hard to harvest.

Now a few admittedly more knowledgeable members here have informed me that there is plenty of IPC to be gained from architecture.

That being said (written), given what AMD has reported thus far it seems ST IPC on these current architectures is topping out. Perhaps the battle will be moving to cache and the rest of the memory subsystem in an effort to most efficiently "feed the beast?"
While x86 and ARM are different beasts, the differences do not account for the vast disparity in performance/watt and IPC seen between the 5950X and the M1. I'd bet there is still yet much improvement that can be made on x86 in IPC and performance/watt via uarch design. Those members who informed you that there is plenty of IPC to be gained from architecture are most likely correct. The question is balancing IPC with all the other design targets -- that is, whether sacrifice in any other area is worth the incremental IPC benefit.
 

dnavas

Senior member
Feb 25, 2017
355
190
116
200 second renders like the one shown tend to benefit Intel more.

s/tend/tended/ :>
Perhaps whatever advantage Intel had is gone now, but the fix may be targeted rather than indicating a general 40% advantage. I think it best to assume we're getting 15-20% improvement and be happy for those cases where they exceed. If it's generally 40%, well, party time, and also, expect for your wallet to be unhappy.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,556
14,511
136
I thought that was the very reason why this forum exists in the first place :D
This forum has multiple uses: to-be-released hardware, current hardware, just-released hardware, helping people with hardware, a few social forums, a DC forum, etc. His point was the same as mine. ARGUING about non-released hardware makes no sense to us. Discussing it is a different matter. There are even those who troll this thread because they are "Intel supporters".
 

Timmah!

Golden Member
Jul 24, 2010
1,419
631
136
Since 24c is not happening and it seems there is not really much space under the IHS to fit additional chiplets (for whatever reason I thought there was; I remember seeing some mockup scheme showing it should be possible, but can't find it anymore), do you think AMD is not really counting on more than 16 cores on the mainstream platform in the upcoming years? I mean, throughout the life of AM5? Or do you think future chiplets (say, as soon as Zen 5) are going to be smaller or pack more cores?

EDIT: found that mockup: https://pbs.twimg.com/media/FGBnRKQXoAADm3G?format=jpg&name=large
 

biostud

Lifer
Feb 27, 2003
18,250
4,763
136
Since 24c is not happening and it seems there is not really much space under the IHS to fit additional chiplets (for whatever reason I thought there was; I remember seeing some mockup scheme showing it should be possible, but can't find it anymore), do you think AMD is not really counting on more than 16 cores on the mainstream platform in the upcoming years? I mean, throughout the life of AM5? Or do you think future chiplets (say, as soon as Zen 5) are going to be smaller or pack more cores?
The rumors suggest that Zen 5 will be heterogeneous, with a chiplet with 8 Zen 5 cores and a chiplet with 16 Zen 4c cores.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,350
1,535
136
s/tend/tended/ :>
Perhaps whatever advantage Intel had is gone now, but the fix may be targeted rather than indicating a general 40% advantage. I think it best to assume we're getting 15-20% improvement and be happy for those cases where they exceed. If it's generally 40%, well, party time, and also, expect for your wallet to be unhappy.
... Soo, I just want to make the point that Blender (46% gain) can make use of AVX-512, while CB R23 (15% gain) cannot. Just one small stone in the well of "non-AVX512 performance mostly comes from clocks and the cache, with no other significant changes".

Tiny thing of note: the increase in TLB size (from the GB leak) would probably significantly improve the gain from V-Cache for those models with it.
 

gdansk

Platinum Member
Feb 8, 2011
2,106
2,594
136
Since 24c is not happening and it seems there is not really much space under the IHS to fit additional chiplets (for whatever reason I thought there was; I remember seeing some mockup scheme showing it should be possible, but can't find it anymore), do you think AMD is not really counting on more than 16 cores on the mainstream platform in the upcoming years? I mean, throughout the life of AM5? Or do you think future chiplets (say, as soon as Zen 5) are going to be smaller or pack more cores?

EDIT: found that mockup: https://pbs.twimg.com/media/FGBnRKQXoAADm3G?format=jpg&name=large
It supports the speculation that AMD will use Bergamo CCD in consumer products to compete better with Intel. At the rate Intel is adding cores they'll need it by fall 2023.
 

Abwx

Lifer
Apr 2, 2011
10,947
3,457
136
... Soo, I just want to make the point that Blender (46% gain) can make use of AVX-512, while CB R23 (15% gain) cannot. Just one small stone in the well of "non-AVX512 performance mostly comes from clocks and the cache, with no other significant changes".

Tiny thing of note: the increase in TLB size (from the GB leak) would probably significantly improve the gain from V-Cache for those models with it.

Blender is MT while the CB number you're talking about is ST; in CB MT the improvement over a 5950X is apparently 45%, AVX-512 or not.
 

Doug S

Platinum Member
Feb 8, 2020
2,263
3,513
136
At the end what we want is performance.

So if we get 15% higher ST performance coming from 5% IPC + 10% higher clocks or
If we get 15% higher ST performance coming from 15% IPC and zero higher clocks

The end result is the same ;)

Yes and no. If you get 15% greater IPC then you get increased performance at all clocks, including when running in lower power mode. If you get it via higher clocks, unless you are magically able to increase clock rates without power increasing AND maintain the same power curve so you are getting 15% higher clocks at the same power at all lower clock rates, it is an inferior solution.

There's a reason Apple is designing their cores (targeted at power efficiency moreso than absolute performance) at lower clocks gaining more of their performance via higher IPC. If they were targeting absolute performance they'd probably be capable of hitting if not 5 GHz at least 4 GHz.
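
To put rough numbers on the IPC-versus-clocks trade-off discussed above, here is a minimal Python sketch. It assumes single-thread performance scales as IPC × clock, and it uses a toy dynamic-power model (power proportional to V² × f, with voltage assumed to rise linearly with frequency near the top of the curve). The function names and the `voltage_exponent` parameter are made up for illustration, and the model deliberately ignores whatever power a wider, higher-IPC core would cost, so treat it as back-of-the-envelope arithmetic rather than a description of any real chip.

```python
def relative_performance(ipc_gain, clock_gain):
    """Relative ST performance, assuming performance ~ IPC * clock."""
    return (1 + ipc_gain) * (1 + clock_gain)


def relative_dynamic_power(clock_gain, voltage_exponent=1.0):
    """Toy model: dynamic power ~ V^2 * f.

    ASSUMPTION: voltage scales as f ** voltage_exponent. An exponent of 1.0
    (voltage rising linearly with frequency) is only a crude stand-in for
    the steep end of a real voltage/frequency curve.
    """
    f = 1 + clock_gain
    v = f ** voltage_exponent
    return v * v * f


# Scenario A: 5% IPC + 10% clocks. Scenario B: 15% IPC, same clocks.
perf_a = relative_performance(0.05, 0.10)
perf_b = relative_performance(0.15, 0.00)

# Power cost of the clock increase alone under the toy model.
power_a = relative_dynamic_power(0.10)
power_b = relative_dynamic_power(0.00)

print(f"A: perf x{perf_a:.3f}, clock-related power x{power_a:.3f}")
print(f"B: perf x{perf_b:.3f}, clock-related power x{power_b:.3f}")
```

Both scenarios land at roughly the same performance (the gains multiply, so 5% + 10% comes out slightly above 15%), but under these assumptions the 10% clock bump alone costs about a third more dynamic power. Whether an IPC gain is actually cheaper depends entirely on how it is achieved, which this sketch does not model.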
 

Saylick

Diamond Member
Sep 10, 2012
3,158
6,382
136
If (or when) AMD puts multiple core types in a single chip, I presume they will have different names for the cores vs. the chips themselves. It would be weird to call it a Zen 5 chip with Zen 5 and Zen 4c cores.
They already have different names for the chips and products. Using Zen 3 as an example, there are many products that use the Zen 3 core but they are all called something different at the product level, e.g. Vermeer, Cezanne, Rembrandt, Milan, etc. I think the Zen 3 CCD that is used in DT and Server had a specific name as well, but it's not mentioned often enough for me to remember exactly what it was called. The Zen 5 generation of products will be no different.
 

majord

Senior member
Jul 26, 2015
433
523
136
Yes and no. If you get 15% greater IPC then you get increased performance at all clocks, including when running in lower power mode. If you get it via higher clocks, unless you are magically able to increase clock rates without power increasing AND maintain the same power curve so you are getting 15% higher clocks at the same power at all lower clock rates, it is an inferior solution.

There's a reason Apple is designing their cores (targeted at power efficiency moreso than absolute performance) at lower clocks gaining more of their performance via higher IPC. If they were targeting absolute performance they'd probably be capable of hitting if not 5 GHz at least 4 GHz.

There are no hard and fast rules like that. IPC vs frequency is a balancing act with many factors. In the case of these Zen cores, it's the performance target range, process node characteristics, use of SMT, and the relationship that creates between peak ST and MT throughput.

Increasing frequency without increasing power consumption is no more magical than increasing IPC without increasing power consumption, and the benefits of both can scale down the performance curve in the same way. It all depends on how it's achieved.

There's no evidence that they aren't simply taking advantage of process headroom in 5nm and optimising the core to scale to higher frequencies. This may make sense for architectural reasons, such as poor perf/watt scaling when going wider (without a major uArch overhaul). Who knows. They have highly effective performance modelling tools at their disposal to weigh up these things. Provided they're doing it based on engineering decisions and not marketing ones, it's probably the best option.

As a side note, I don't think people should be getting quite so fixated on this 15% number. Given AMD's track record with performance teasers over the last few generations (40% IPC turning into more like 50% with Zen 1, quoting performance based on what turn out to be non-halo SKUs, only to reveal them at the last minute, etc.), I'd be giving some benefit of the doubt for now. There's every chance it's closer to 20%, something that's realistically in the same ballpark as previous uplifts. Not putting any bets on that, but I wouldn't be doing the opposite either. The ">" is there for a reason.

Frustrating as it is, the reality is we're still not much wiser than before the teaser. At best we know the minimum to expect, and that, it seems, frequency uplift vs IPC uplift is skewed towards the former this time. How much is uncertain.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
And the only thing we have is >15% ST uplift in one benchmark that is not cache sensitive, all done on a pre-prod. sample running at unknown clock speeds. Too much is unknown to call it a failure at this point. If all they managed to get is a clock boost with minor IPC while using the 5nm node then something is way off versus all the previous Zen cores.
This. Partially.

This is one benchmark that is likely not sensitive to caches much at all and that is the main improvement in Zen 4. The marketing and fanboys seem to be out in force with this announcement, but I think the signal to noise ratio here is still a lot better than most forums. It is good for AMD to temper expectations but I think this 15% number is going to turn out on the low end. There is always someone making outlandish claims, either for the views or to attempt to make very good products seem like a disappointment, so sandbagging a little bit is a good idea.

I don’t think I have said much about Zen 4 performance other than I expected a good improvement. I am a lot more interested in what we get after the initial basic Zen 4; stacked devices are likely coming, possibly with SoIC used to stack things other than cache. I also expect that there will be some sharing between GPUs (CDNA/RDNA) and CPUs.

There are likely going to be significantly larger improvements with Zen 4 for some applications that are more sensitive to the cache changes. Some improvement will be just due to increased clock speeds though. There isn’t anything wrong with that as long as they aren’t overclocking the processors by pushing them too hard and burning massive amounts of power like Intel seems to be.

I don’t have time to go through and reply to a bunch of posts. Some observations though:

We have basically known that Zen 4 is tweaked Zen 3 with a new process. It is the same family; this makes a lot of changes unlikely. A huge amount of new things will come with Zen 5 though, since it will be a new family. Possibly not even really “zen”. Zen 4 will not be any more of a “speed racer” than Zen 3. Rebalancing the pipeline is a huge change. It isn’t likely to happen with Zen 4. Clock speed increases may be due to cache design changes, since cache is often a limiter, but it is likely mostly due to process tech. I am thinking initial stacked devices come with a Zen 4 refresh and Bergamo and Zen 5 may use stacking for a much wider range of products.

Multi-threaded applications will likely show a much larger improvement than the single thread benchmarks would suggest. The doubled L2 size will significantly decrease cache contention when multiple SMT threads are active. It also decreases L3 contention, which is probably only a problem with many cores active. I expected a significant TLB improvement since, AFAIK, Zen 3 could only access 8 MB of the L3 per core without overflowing the TLB and needing to access the page table (higher latency). There is also the possible massive improvement in all-core clock speed for multithreaded applications.
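
The 8 MB figure above follows from simple TLB-reach arithmetic: reach = number of TLB entries × page size. Here is a minimal Python sketch, assuming a 2048-entry L2 data TLB and 4 KiB pages for Zen 3 (which reproduces the 8 MB mentioned in the post). The 3072-entry value is only a hypothetical larger TLB for comparison, not a confirmed Zen 4 spec, and the function name is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach_mib(entries, page_size_bytes=4 * KIB):
    """Memory (in MiB) that can be mapped without missing in the TLB."""
    return entries * page_size_bytes / MIB


# ASSUMPTION: 2048 entries with 4 KiB pages, matching the 8 MB quoted above.
zen3_reach = tlb_reach_mib(2048)

# HYPOTHETICAL larger TLB, for comparison only.
bigger_reach = tlb_reach_mib(3072)

# The same 2048 entries with 2 MiB huge pages cover far more memory.
huge_page_reach = tlb_reach_mib(2048, page_size_bytes=2 * MIB)

print(f"2048 entries, 4 KiB pages: {zen3_reach:.0f} MiB")
print(f"3072 entries, 4 KiB pages: {bigger_reach:.0f} MiB")
print(f"2048 entries, 2 MiB pages: {huge_page_reach:.0f} MiB")
```

With 4 KiB pages, anything beyond the reach of the TLB needs a page-table walk, which is why a larger TLB (or huge pages) matters more as the L3 grows with stacked cache.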
 

poke01

Senior member
Mar 8, 2022
735
712
106
We have basically known that Zen 4 is tweaked Zen 3 with a new process. It is the same family; this makes a lot of changes unlikely. A huge amount of new things will come with Zen 5 though, since it will be a new family. Possibly not even really “zen”. Zen 4 will not be any more of a “speed racer” than Zen 3. Rebalancing the pipeline is a huge change. It isn’t likely to happen with Zen 4. Clock speed increases may be due to cache design changes, since cache is often a limiter, but it is likely mostly due to process tech. I am thinking initial stacked devices come with a Zen 4 refresh and Bergamo and Zen 5 may use stacking for a much wider range of products.
Do you know this to be true or are you just guessing?

Leakers said up to 37% ST uplift for Zen 4 and people hyped Zen 4 so much. Yeah, after this Zen 4 reveal I am not trusting any more leaks.

It took AMD nearly 24 months to deliver Zen 4 and we shall see later this year if Zen 4 "wrecks Intel".
 

StinkyPinky

Diamond Member
Jul 6, 2002
6,766
784
126
Honestly, if I was on something like a 2700X I'd just throw in a 5800X3D or 5900 series CPU and call it a day. Way more cost effective than an entire new MB/RAM/CPU combo. Especially as higher end DDR5 is expensive.