Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least) and will come with a new platform (SP5) with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

yuri69

Senior member
Jul 16, 2013
366
561
136
The market position where Intel has no answer to Threadripper is why there isn't anything new in the pipeline. Just like Intel could get away with sitting on 4 core CPUs for all of those years, AMD can get away with continuing to sell existing TR parts.

They'll probably eventually have something new, but right now they're going to prioritize the areas where they can't rest on their laurels due to competition from Intel, or where they can make more money. Threadripper is in the awkward middle position where it's not quite as profitable as the server market and not nearly as contested as the desktop and mobile markets.

We'll get some new parts eventually, but there just isn't any pressure on AMD to do it as soon as some might like. Maybe there are some technical hurdles in there as well, but they're not vastly different from what would need to be solved for new server parts.
Intel will launch SPR 'X' for HEDT. Seeing the performance of Golden Cove, it has potential. 50+c of highly clocked GCs on a new platform will definitely beat the current Zen 2 TR and most likely also the 'still upcoming' Zen 3 TR.

TBH I'm surprised AMD even bothers with Chagall/Zen 3 TR, launching it in Q2/Q3 2022.
 

eek2121

Platinum Member
Aug 2, 2005
2,883
3,859
136
Zen 3 Threadripper will be a pretty decent upgrade over Zen 2 Threadripper, and launching it now allows them to avoid having to release another product on 5nm.

Regarding Zen 4, unsure where the Q4 rumor ever came from. I’ve always believed it was Q3. Note that Zen 4 will likely be “announced” at Computex, not launched. That very likely puts it in Q3.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Intel will launch SPR 'X' for HEDT. Seeing the performance of Golden Cove, it has potential. 50+c of highly clocked GCs on a new platform will definitely beat the current Zen 2 TR and most likely also the 'still upcoming' Zen 3 TR.

TBH I'm surprised AMD even bothers with Chagall/Zen 3 TR, launching it in Q2/Q3 2022.
How high could they possibly clock? People buy such high core count processors because they are going to use them continuously, not in bursts of processing like low core count processors. I haven’t paid much attention to Intel parts, but doesn’t Alder Lake pull something like 200 to 300 watts for 8 performance cores? You’re talking about 50 of those cores, and they are going to clock high somehow? It will not be competitive with Threadripper if it pulls that much more power. Intel can kind of get away with it on 8-core processors, but that isn’t going to work for a 50-core part.
 

eek2121

Platinum Member
Aug 2, 2005
2,883
3,859
136
How high could they possibly clock? People buy such high core count processors because they are going to use them continuously, not in bursts of processing like low core count processors. I haven’t paid much attention to Intel parts, but doesn’t Alder Lake pull something like 200 to 300 watts for 8 performance cores? You’re talking about 50 of those cores, and they are going to clock high somehow? It will not be competitive with Threadripper if it pulls that much more power. Intel can kind of get away with it on 8-core processors, but that isn’t going to work for a 50-core part.

Doubt they will clock that high, though for what it is worth, dropping the clocks down to 2.5-4.0 GHz drops power consumption considerably.
 

yuri69

Senior member
Jul 16, 2013
366
561
136
How high could they possibly clock? People buy such high core count processors because they are going to use them continuously, not in bursts of processing like low core count processors. I haven’t paid much attention to Intel parts, but doesn’t Alder Lake pull something like 200 to 300 watts for 8 performance cores? You’re talking about 50 of those cores, and they are going to clock high somehow? It will not be competitive with Threadripper if it pulls that much more power. Intel can kind of get away with it on 8-core processors, but that isn’t going to work for a 50-core part.
The current top-of-the-line high-end desktops based on Golden Cove are clocked well past their efficiency range (chasing 5.2GHz...). Igor has recently simulated a low-end (4.4GHz) configuration, aka the i5 12400, in gaming workloads. That chip consumes over 30% *less* energy compared to AMD Zen 3 (5600X) while achieving more gaming FPS.

Of course gaming workload doesn't squeeze the CPU to its limits but still, it doesn't seem bleak for high core count SPR.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
The current top-of-the-line highend desktops based on Golden Cove are clocked well past their efficiency range (chasing 5.2GHz...). Igor has recently simulated a lowend (4.4GHz) configuration aka i5 12400 in gaming workloads. That chip consumes over 30% *less* energy compared to AMD Zen 3 (5600X) while achieving more gaming FPS.

Of course gaming workload doesn't squeeze the CPU to its limits but still, it doesn't seem bleak for high core count SPR.
I don’t have high expectations for Intel anymore, so they are strictly believe-it-when-I-see-it now. I wouldn’t buy one of their high clocked parts since they are essentially overclocked if they are that far up the power consumption / frequency curve. Same thing with AMD’s attempt with the FX 9590 processors.

Also, if you are simulating a lower clocked part by using a high clocked part, then you are probably using a high binned part that will actually have very good power consumption characteristics, or at least better than a more mid-range bin. Then you are comparing that to a 5600X, which is probably one of the lowest binned parts. That doesn’t sound like a very valid comparison. Although, I don’t know if you actually get many really high binned AMD parts as desktop parts. The really good bins probably mostly go to Epyc.
 

Abwx

Lifer
Apr 2, 2011
10,842
3,295
136
The current top-of-the-line highend desktops based on Golden Cove are clocked well past their efficiency range (chasing 5.2GHz...). Igor has recently simulated a lowend (4.4GHz) configuration aka i5 12400 in gaming workloads. That chip consumes over 30% *less* energy compared to AMD Zen 3 (5600X) while achieving more gaming FPS.

Of course gaming workload doesn't squeeze the CPU to its limits but still, it doesn't seem bleak for high core count SPR.

You can get much lower power by running with almost no voltage margin. That works even better in gaming tests, since power is lower than with the usual MT software, so those numbers are not relevant at all to a stock SKU whose voltage is set for robust stability under any circumstances.
 

Timmah!

Golden Member
Jul 24, 2010
1,389
596
136
The current top-of-the-line highend desktops based on Golden Cove are clocked well past their efficiency range (chasing 5.2GHz...). Igor has recently simulated a lowend (4.4GHz) configuration aka i5 12400 in gaming workloads. That chip consumes over 30% *less* energy compared to AMD Zen 3 (5600X) while achieving more gaming FPS.

Of course gaming workload doesn't squeeze the CPU to its limits but still, it doesn't seem bleak for high core count SPR.

For the right price, under 1500 euros, I would be down for some Golden Cove 22~28C beauty at 4.4GHz. If only Intel pulled their collective you-know-what together and delivered.
Anyway, what are the odds for the Zen 4, which apparently is going to be revealed at CES, to be more than 16C? I read today more cores are apparently expected, which was news to me, but it was on wccftech, so.....
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
How high could they possibly clock?

Assuming the following:
-Only V/F matters for power use
-10% is taken up by the uncore

An 8 core Golden Cove using 200W at 5.2GHz would end up being a...

3.4GHz 52 core Sapphire Rapids-X at 350W. You also forget that this clock would likely apply when all 52 cores are active. They can make it so that with 8 cores active it'd be at 5GHz, and somewhere in the 4.xGHz range anywhere between 8 and 52 cores.

24 cores @ 4.5GHz
36 cores @ 4GHz

Yea, that's going to be a beast in every way.
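
A rough sketch of the arithmetic behind the estimate above, for anyone who wants to play with the inputs. The post only states that V/F drives power and that 10% goes to the uncore; the cubic power-versus-clock relation (voltage scaling linearly with frequency), the function name and the parameter names here are assumptions made for illustration, not something the post spells out.

```python
def all_core_clock_ghz(cores, package_w, ref_cores=8, ref_package_w=200.0,
                       ref_clock_ghz=5.2, uncore_share=0.10, exponent=3.0):
    """Estimate the all-core clock of a hypothetical part from a reference part.

    Assumption (not from the post): per-core power scales with clock**exponent,
    with exponent=3 standing in for "voltage rises linearly with frequency".
    """
    # Per-core power budget of the reference part after removing the uncore share.
    ref_core_w = ref_package_w * (1.0 - uncore_share) / ref_cores
    # Per-core power budget of the hypothetical part under the same uncore share.
    core_w = package_w * (1.0 - uncore_share) / cores
    # Invert P ~ f**exponent to get the clock that fits the smaller budget.
    return ref_clock_ghz * (core_w / ref_core_w) ** (1.0 / exponent)


# Core counts taken from the post; 350 W package power is the post's figure.
for active_cores in (52, 36, 24):
    print(active_cores, "cores:", round(all_core_clock_ghz(active_cores, 350.0), 2), "GHz")
```

With these inputs the 52-core case lands at roughly 3.4GHz, matching the post. Under a pure cubic assumption the 36 and 24 core cases come out somewhat below the 4GHz and 4.5GHz quoted, so the real V/F curve assumed there is presumably a bit flatter than cubic in that range.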
 

uzzi38

Platinum Member
Oct 16, 2019
2,556
5,531
146
The current top-of-the-line highend desktops based on Golden Cove are clocked well past their efficiency range (chasing 5.2GHz...). Igor has recently simulated a lowend (4.4GHz) configuration aka i5 12400 in gaming workloads. That chip consumes over 30% *less* energy compared to AMD Zen 3 (5600X) while achieving more gaming FPS.

Of course gaming workload doesn't squeeze the CPU to its limits but still, it doesn't seem bleak for high core count SPR.

What is high core count SPR more likely to be doing? Gaming, or MT workloads?

Don't get me wrong, SPR is fine from a performance aspect in a bubble. But using gaming loads as a measure of power consumption? That's just being silly.

Besides, the main reason why Intel pull ahead in gaming atm is that the I/O die on AMD's side ruins any sort of efficiency on that side. Compare power consumption in gaming between a 5800X or even a 5600X vs a 5700G, and you'll find the 5700G pulls half the power.



I'm not exaggerating.

Anyway, back onto SPR. It should beat Milan unless Intel drastically screw up the uncore or something. Provided the cores have higher per-core power, they can make up any power efficiency deficit Golden Cove still has much more easily, and AMD's 100W IOD makes that rather feasible.

But it really should win vs Milan on performance, because on the cost side of things it's looking VERY bad for Intel. Intel's using more 7nm silicon on SPR than AMD are using silicon of any kind on Milan. The thing pulls 25% higher power, employs advanced packaging in EMIB, features much larger (and thus more defect-prone) dies, and is limited to DDR5. The latter is a huge increase in cost in the current market, and will stay that way for a couple of years.

Even vs Milan I don't see SPR being a good competitor, and frankly Genoa is going to end up absolutely demolishing it less than half a year later.

So yes, things look bleak for SPR, even if the product itself is a huge step up from previous DC parts.
 

eek2121

Platinum Member
Aug 2, 2005
2,883
3,859
136
The current top-of-the-line highend desktops based on Golden Cove are clocked well past their efficiency range (chasing 5.2GHz...). Igor has recently simulated a lowend (4.4GHz) configuration aka i5 12400 in gaming workloads. That chip consumes over 30% *less* energy compared to AMD Zen 3 (5600X) while achieving more gaming FPS.

Of course gaming workload doesn't squeeze the CPU to its limits but still, it doesn't seem bleak for high core count SPR.

Don’t sell gaming short. A CPU that performs great in games has never performed poorly in other workloads, likewise, a CPU that performs poorly in games…

The only exceptions I have ever seen were software stack related. Many games are an absolute worst case scenario for CPUs. Memory bandwidth and/or latency constrained, lightly threaded, etc.
I don’t have high expectations for intel anymore, so they are strictly believe it when I see it now. I wouldn’t buy one of their high clocked parts since they are essentially overclocked if they are that far up the power consumption / frequency curve. Same thing with AMD’s attempt with the FX 9590 processors.

Also, if you are simulating a lower clocked part by using a high clocked part, then you are probably using a high binned part that will actually have very good power consumption characteristics, or at least better than a more mid-range bin. Then you are comparing that to a 5600x which is probably one of the lowest binned parts. That doesn’t sound like a very valid comparison. Although, I don’t know if you actually get much really high binned AMD parts as desktop parts. The really good bins probably mostly go to Epyc.

I mean, the tests are already there. I get where you are coming from, however. Even if I were still on Zen 2, I would not be considering ADL-S, but I have been preaching for quite a while about how SPR-S is going to be a problem for AMD (mostly due to > Zen 3 IPC and > 2S support, allowing for higher compute density).

We will see how it pans out. Things are quite different in the enterprise world. Quite a few folks here claim that vendors are reluctant to embrace > 2S designs. Knowing what I know about the industry, I disagree, but I am not afraid to admit when I am wrong. That being said, I suspect that at minimum, Intel will win purely based on brand recognition. Beyond that, SPR-S looks to soundly beat Milan in nearly every workload and a 4S SPR system will absolutely wipe even Milan-X off the map for the most popular and relevant cloud based workloads. I do strongly suspect that AMD will continue to win some key benchmarks, but those benchmarks will be niche compared to the broader industry.

I hope I am wrong, of course (my AMD shares and options will suffer), but after watching the players in the industry for this long…
 

uzzi38

Platinum Member
Oct 16, 2019
2,556
5,531
146
Are we talking public release of Genoa, or . . . ? AMD is going to be selling Genoa through ODM channels well before any of us see it in the public eye. Sapphire Rapids won't even get a few months in those channels.
SPR is for all intents and purposes really available in Q2, as Patrick states here: https://www.servethehome.com/intel-sapphire-rapids-xeon-expectations-reset-to-q2-2022/

Last I knew (this is a bit old now and might be out of date), Genoa begins shipping to hyperscalers before the end of Q2 as well. I'm anticipating we'll see a full launch of Genoa around the beginning of Q4 as a result (at the latest). It almost certainly will be before the end of the year.

Just a reminder - Genoa's already sampling. This is pretty much the expected timeline at this point.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,587
5,703
136
I guess Genoa shipments would already be claimed by Azure and GCP and other Hyperscalers for two quarters at least.
Azure took a lot of deliveries of Milan already before AMD even launched it officially. Like, they took Milan shipments in Q4 2020 before it was launched in Q2 21.
Same story with Milan-X. According to Patrick/STH they have been populating their DC with Milan-X in Q3 21 before AMD announcements and official availability in 1Q22.

So, low availability for Genoa in 2022 for everybody else.
But Genoa should make EPYC Milan or TR Zen 3 cheaper and more available.

What is happening also is that TSMC is opting to invest in N3 instead of more capacity in N5 (because they want to push customers to leading nodes to get the upper hand vs Samsung/Intel)
AMD will be forced to N3 sooner than they would have liked, if they want more capacity. (4 years of N7, 2019/20/21/22, 2 years of N5 2022/23, N3 2023)
So it looks like Milan, Genoa and Turin will coexist at some point as standard offerings.
 
May 17, 2020
122
233
86
What is happening also is that TSMC is opting to invest in N3 instead of more capacity in N5 (because they want to push customers to leading nodes to get the upper hand vs Samsung/Intel)
AMD will be forced to N3 sooner than they would have liked, if they want more capacity. (4 years of N7, 2019/20/21/22, 2 years of N5 2022/23, N3 2023)
So it looks like Milan, Genoa and Turin will coexist at some point as standard offerings.
TSMC has said in the past that it will offer more capacity for N5 than for N7. Customers that use N5 can move to N4, because Apple and Intel are the leading customers on N3.

Going by the last transcript, AMD could not start using N3 before 2024, so in 2023 they could move to N4:

 

DisEnchantment

Golden Member
Mar 3, 2017
1,587
5,703
136
TSMC in the past had said that will offer more capacity for N5 compared to N7, for customer which use N5 they can move to N4 because Apple and Intel are the leading customers on N3

Last transcript, AMD could start to use N3 not before 2024 so in 2023 they could move to N4 :

N4 is basically tweaked N5, just like how TSMC considers N6 to be part of N7 family. They will be produced in the same fab F18P1/2/3. Capacity used for N4 means capacity taken from N5.
I don't think N5 will hit N7 capacity.

Today, N7/6 has >220K wpm (F14 8 phases, F15 4+ phases) compared to >100K wpm of N5/4 (F18P1/2/3, even F18P4 is going to produce N3).
But outside of AZP1 there are no additional N5 fabs being made. Most investments are on F18P5/6/7/8 for N3. F20 is N2
N7 will replace all 10FF and 16FF offerings.
You can check official TSMC Foundry update data or their earnings report.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Assuming the following:
-Only V/F matters for power use
-10% is taken up by the uncore

An 8 core Golden Cove using 200W at 5.2GHz would end up being a...

3.4GHz 52 core Sapphire Rapids-X at 350W. You also forget that clock will likely be when 52 cores are active. They can make it so at 8 cores it'd be at 5GHz and somewhere in the 4.xGHz range anywhere between 8 and 52 cores.

24 cores @ 4.5GHz
36 cores @ 4GHz

Yea, that's going to be a beast in every way.
I would hope people are buying a 64-core Threadripper or Threadripper WX for applications that will actually make use of all of those cores, IO, and / or memory bandwidth. The performance with a low number of cores active probably doesn’t really figure into this equation at all. Intel still needs higher clocks and higher power to match AMD.
 

andermans

Member
Sep 11, 2020
151
153
76
I would hope people are buying a 64-core Threadripper or Threadripper WX for applications that will actually make use of all of those cores, IO, and / or memory bandwidth. The performance with a low number of cores active probably doesn’t really figure into this equation at all. Intel still needs higher clocks and higher power to match AMD.

Some systems are multi-purpose though. E.g. my current system has a Threadripper for software development purposes (lots of cores does wonders for large C++ codebases), but at the same time I'd like to play the odd game on it. The higher max single-core clocks are a major draw over EPYC (or even Threadripper Pro) for that case, and I notice even my current 2990WX is getting slow-ish for gaming, in particular if one wants to go above 60 fps in modern AAA games. In some cases you can get a whole bunch of extra perf by limiting the game to one CCX, but it is all kinda messy.
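
As an illustration of what "limiting the game to one CCX" can look like on Linux, here is a minimal sketch using CPU affinity. The helper names are hypothetical, and the mapping assumes that the logical CPUs of one CCX are numbered contiguously, four cores per CCX, with SMT siblings left out. Real numbering varies by system, so check it (for example with `lscpu -e`) before relying on this.

```python
import os


def ccx_cpu_set(ccx_index, cores_per_ccx=4):
    """Return the logical CPU ids assumed to belong to one CCX.

    Assumption: CPUs of a CCX are numbered contiguously starting at
    ccx_index * cores_per_ccx. This is illustrative, not guaranteed.
    """
    if ccx_index < 0 or cores_per_ccx <= 0:
        raise ValueError("ccx_index must be >= 0 and cores_per_ccx > 0")
    start = ccx_index * cores_per_ccx
    return set(range(start, start + cores_per_ccx))


def pin_to_ccx(pid, ccx_index, cores_per_ccx=4):
    """Restrict a process to the CPUs of one CCX (pid 0 means this process)."""
    if not hasattr(os, "sched_setaffinity"):
        # os.sched_setaffinity is only available on some platforms (e.g. Linux).
        raise NotImplementedError("CPU affinity is not supported by os on this platform")
    cpus = ccx_cpu_set(ccx_index, cores_per_ccx)
    os.sched_setaffinity(pid, cpus)
    return cpus


# Example (not run here): pin an already running game with a known pid to CCX 0.
# pin_to_ccx(game_pid, 0)
```

Child processes started after the call inherit the affinity mask, so pinning a launcher before it starts the game is another option.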
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I would hope people are buying a 64-core Threadripper or Threadripper WX for applications that will actually make use of all of those cores, IO, and / or memory bandwidth.

Yes, cause we're back in 2003 right? When quad cores were enterprise Xeon chips?

That's a silly argument since even servers moved away from that more than a decade ago. I remember when Xeons were ultra-low clocked Pentiums with large amounts of cache, so per socket they heavily underperformed and only made sense for transactional servers or when you wanted scalability with MP systems.

That's no longer the case, since Turbo allows clocks to be almost as high as low core count parts when low number of cores are active. So servers DO care since we know even in that space there are limits to scalability.

Besides, these are HEDT. I assure you rich gamers will be among the people who buy them. These will be tested by Anandtech, Techpowerup and the rest. Of course low thread count performance matters, because even Cinebench loses scaling pretty quickly. They will cry foul if with only 16 cores active it runs at 3GHz, rather than the 4.5GHz it should be able to reach.

And that's if you consider 16 cores "low threads". Not that a 50 core 3.4GHz Golden Cove will lose to competitors.

The power usage figure is also irrelevant in this space if it also performs significantly higher.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Don’t sell gaming short. A CPU that performs great in games has never performed poorly in other workloads, likewise, a CPU that performs poorly in games…

The only exceptions I have ever seen were software stack related. Many games are an absolute worst case scenario for CPUs. Memory bandwidth and/or latency constrained, lightly threaded, etc.


I mean, the tests are already there. I get where you are coming from, however. Even if I were still on Zen 2, I would not be considering ADL-S, but I have been preaching for quite a while about how SPR-S is going to be a problem for AMD (mostly due to > Zen 3 IPC and > 2S support, allowing for higher compute density).

We will see how it pans out. Things are quite different in the enterprise world. Quite a few folks here claim that vendors are reluctant to embrace > 2S designs. Knowing what I know about the industry, I disagree, but I am not afraid to admit when I am wrong. That being said, I suspect that at minimum, Intel will win purely based on brand recognition. Beyond that, SPR-S looks to soundly beat Milan in nearly every workload and a 4S SPR system will absolutely wipe even Milan-X off the map for the most popular and relevant cloud based workloads. I do strongly suspect that AMD will continue to win some key benchmarks, but those benchmarks will be niche compared to the broader industry.

I hope I am wrong, of course (my AMD shares and options will suffer), but after watching the players in the industry for this long…
We were talking about Threadripper and Threadripper WX. Games don’t really figure into this equation. From the benchmarks I have seen, most games stop scaling with core count at a small number of cores. Benchmarks I have seen scaled a little up to 10 cores, but that was with the most demanding games and the highest performance GPU. Most people aren’t running 3090s, so they will be GPU limited even with relatively weak, but modern CPUs. The regular Threadripper 5000 will do very well in games, but that isn’t a reason to buy one.

If you start talking Epyc enterprise level, then power consumption becomes much more of an issue. My work has had to upgrade the AC in our modest server room several times. The cost of powering new servers is a limitation. I doubt Intel will even beat Milan-X in most benchmarks, and they will likely take more power to do it. It will be application specific; it always is. Milan-X gets a massive boost in many applications from the massive amount of SRAM. That will be difficult to beat. Genoa will be significantly more power efficient with a new, 6nm IO die. Later, but possibly not that much later, will be Bergamo, with likely pervasive use of stacking. Bergamo will be a massive increase in power efficiency.

AMD’s chips will likely be significantly cheaper to make also. There is a possibility that Bergamo will actually be a single reticle sized stacked package since most of the cache could be stacked. That also could be on the order of GB of stacked cache. GPUs may get up to 512 MB with RDNA3. It may be some of the same chips used in Bergamo for truly massive amounts of SRAM.

I haven’t been paying too much attention to Intel and their overclocked parts. Selling high power consumption, overclocked parts will not work in the server space. Depending on when/if Intel delivers, they may have a small window, but I don’t think it looks good. Perhaps they catch up with Milan, but then there is Milan-X, and then there is Genoa with a new lower power IO die and up to 96 cores, and you’re talking about 50-something? That isn’t even getting into Bergamo with 128 cores and possibly near ARM server chip levels of power consumption; the real competition may end up being between AMD and ARM solutions, with Intel in third place. I saw some benchmarks, probably at Phoronix, where Intel actually was in third place behind the AMD and ARM processors in first and second.

Going to 4-socket capable is unlikely to help. Four-socket boards are going against the trends. They are huge and expensive. Most systems these days are single or dual socket boards connected by InfiniBand or another high speed interconnect. A lot of systems are things like 4 separate single or dual socket servers packed into a 2U or similar chassis (1U high and half width for each server). The market for actual 4-socket servers is tiny. They even have GPU servers built this way.

I suspect the reason we might get a Zen 4 preview at CES is that a lot of hyperscalers and other large partners will have final Genoa silicon soon, if not already. It will be difficult to keep things secret under those circumstances. Milan, Milan-X, and Genoa may coexist for a while since the server market will be slower to switch to a completely new platform. It may be slowed down by DDR5 availability also.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Some systems are multi-purpose though. e.g. my current system with a threadripper is a threadripper for software development purposes (lots of cores does wonders for large C++ codebases) but at the same time I'd like to play the odd game on it. The higher max clocks for single-core are a major draw over EPYC(or even threadripper pro) for that case and even my current 2990wx I notice is getting slow-ish for gaming, in particular if one wants to upgrade to >60 fps on modern AAA games. In some cases you can get a whole bunch of extra perf by limiting the game to one CCX but it is all kinda messy.
For Threadripper, maybe. For Epyc and Threadripper Pro, probably unlikely. I have seen a lot of server systems where they actually lock it to the base clock for consistency. Do you really think that a Threadripper 5000 is going to be a performance bottleneck in games? This gaming argument for 32+ core processors is ridiculous. I think Threadripper 5000 will do very well in games, and if they make a version with stacked cache it will crush everything else, but it is still not worth the cost as a gaming-only processor at all.

The 2990WX was marketed as a workstation processor for a reason. I think that is the one that had 4 CPU chips but only 2 quadrants connected to memory. You might be able to get more performance with some NUMA configuration. I have been playing with that on some Epyc processors. Anyway, that is around 4 years old now and wasn’t the best gaming processor from the start.