Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)

Vattila

Senior member
Oct 22, 2004
821
1,457
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,250
16,108
136
Let me get this straight. You can set it to eco mode, turn ryzen master off or exit it and it releases that 120 megs but keeps the processor in eco mode until next startup? It doesn't do any scheduling right? 6pm to 12 pm full power and 6 am to 5 pm eco mode every day?
Since the change I mentioned is in the BIOS, unless I specifically change something in Ryzen Master (I never do), it's always in "sort of" Eco mode. It stays at 142 watts or below. I use Ryzen Master to make sure the changes I made in the BIOS have had the desired effect; it's just a monitoring tool. On my 7950Xs I boot Windows, check it out, and if all is fine, then boot Linux. They stay in Linux for months or years.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,655
5,199
136
V-cache is really only useful for gaming. If you look at the 5800X3D reviews it gets beat out by a regular 5800X in almost everything outside of gaming. Unless you primarily use a computer for one of those small number of other workloads, v-cache isn't going to give you a better experience.

Voltage problems with the 5800X3D reduced its clock speeds by between 4% and 10.5%. So if a workload did not benefit from the extra V-Cache, you just ended up with slower clocks and a 4% to 10.5% reduction in performance.

But the issue that plagued the 5800X3D is likely limited to AM4-socket CPUs, so Zen 4 V-Cache will likely see little to no clock-speed degradation.

Perhaps the one area where it shines, which isn't really captured by benchmarks, is improving general system and application responsiveness when running a mixed app workload. The extra last level cache means that other applications have more room in the cache. I doubt that too many people are going to be willing to pay the ~$100 extra for the v-cache model though.

I would venture a guess of +$50 for the 8-core V-Cache model.

AMD just needs to be more price competitive with Intel in the midrange, because the E-core spam on the i5 works well enough for the 13600K to pull ahead in many different thread heavy workloads and the chip does well enough in gaming that AMD doesn't have any real advantage there either.

The trick to the E-core spam is the infinitesimal number of users it benefits.

It's almost like a magic trick, or a political trick: tricking people into caring about something that does not affect them.

Among PC users who are in any way bottlenecked by the CPU, the difference between the number of gamers and the number of E-core spam users is huge. You are at risk of getting a divide-by-zero error when calculating: gamers / E-core spam

The advantage of Zen 4 V-Cache vs. 13600k in gaming will be double digit. A generational difference.

The platform costs will be digested by the time Zen 4 V-Cache is out, at which time you will be able to get the best gaming CPU, bar none, plus PCIe 5 M.2 and a PCIe 5 GPU (from AMD). I think it will be quite compelling.

The only issues are timing, platform availability, and cost reduction. AMD shot itself in the foot by releasing X670 before B650. That dumb decision is a thing of the past. AMD just has to suffer through the pain of that, and not start shooting itself in the other foot in the meantime...

Instead of panicking with price cuts, AMD could offer a discount / coupon for motherboard purchase, to directly target the pain point.
 

A///

Diamond Member
Feb 24, 2017
4,351
3,160
136
Since the change I mentioned is in the BIOS, unless I specifically change something in Ryzen Master (I never do), it's always in "sort of" Eco mode. It stays at 142 watts or below. I use Ryzen Master to make sure the changes I made in the BIOS have had the desired effect; it's just a monitoring tool. On my 7950Xs I boot Windows, check it out, and if all is fine, then boot Linux. They stay in Linux for months or years.
Oh, that's right. You work in datacenters. I'd seen you mention weeks, months and years for a long time and wondered how you dealt with Windows slowing down during a session like that. I may move to Zen 4 once prices come down. I paid through the nose for my current system because I said screw it and paid scalper prices.
 

Kaluan

Senior member
Jan 4, 2022
515
1,092
106
V-cache is really only useful for gaming. If you look at the 5800X3D reviews it gets beat out by a regular 5800X in almost everything outside of gaming. Unless you primarily use a computer for one of those small number of other workloads, v-cache isn't going to give you a better experience.
[PRESS "X" FOR DOUBT]

Anyone who has actually looked at comprehensive review data and also knows the clocking behaviour of the 5800X and 5800X3D, namely that the 5800X clocks 6.5% higher under all-core load (~4.6 GHz vs ~4.32 GHz) and 9% higher in ST boost (~4.85 GHz vs ~4.45 GHz), could deduce that the measly 1-3% lead the 5800X has over the 5800X3D in productivity isn't because V-Cache doesn't help (it does); it's because the 5800X3D has a pretty chunky clock handicap vs its vanilla brother.
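
As a rough check of the arithmetic above, here is a minimal Python sketch. The clock figures are the approximate ones quoted in this post; the idea that productivity performance scales linearly with clock speed is a simplifying assumption added purely for illustration, and the helper names are made up.

```python
def pct_higher(fast, slow):
    """Return how much higher `fast` is than `slow`, in percent."""
    return (fast / slow - 1.0) * 100.0


# Approximate clocks quoted above (GHz): 5800X vs 5800X3D.
all_core_gap = pct_higher(4.60, 4.32)  # all-core load
st_boost_gap = pct_higher(4.85, 4.45)  # single-thread boost


def implied_per_clock_gain(clock_gap_pct, perf_gap_pct):
    """Per-clock advantage of the lower-clocked chip, in percent.

    ASSUMPTION: performance scales linearly with clock speed.
    """
    return ((1 + clock_gap_pct / 100) / (1 + perf_gap_pct / 100) - 1) * 100


print(f"all-core clock gap: {all_core_gap:.1f}%")
print(f"ST boost clock gap: {st_boost_gap:.1f}%")
for perf_gap in (1.0, 3.0):
    gain = implied_per_clock_gain(all_core_gap, perf_gap)
    print(f"5800X ahead by {perf_gap:.0f}% -> 5800X3D per-clock gain ~{gain:.1f}%")
```

Under those assumptions, a 1-3% overall lead for the 5800X against a ~6.5% clock advantage would mean the 5800X3D is doing roughly 3-5% more work per clock in those workloads.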

If AMD can "fix" part or all of the clocking deficiency that stacked Zen 3 has, I see no reason Raphael-X can't be an extra 5% more "productive" than vanilla Zen 4. And it's not just clocks: Zen 4's DDR5 IMC is more reminiscent of the Zen-era DDR4 IMC than of the Zen 3 IMC, so a bigger cache may alleviate its potential bottlenecks. At least until more mature Zen 4 microcode is a common thing.

Let's be honest here: the only reason people don't perceive Vermeer-X as a productivity CPU 'family' is because the 5950X3D doesn't exist. Perception and bias are rarely self-aware.

Looking at AI, physics/simulation, image processing, archiving and even some rendering workloads (like Corona), it's pretty obvious V-Cache can be for more than just gaming (it does best in workloads with huge sets of data), and the fact that AMD seems to be preparing an entire product stack around it suggests they will likely not market it just to gamers this time around.

Look, even Intel's benchmark agrees 😅
crossmark-amd-ryzen-7-5800x3d-performance.png
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,250
16,108
136
This thread is Zen 4, not "5800x3d..."
 

Mopetar

Diamond Member
Jan 31, 2011
8,488
7,729
136
Which is quite a major area when thinking about it. Benchmarks are run in isolation to make the results comparable but are quite artificial that way. The more cores and cache a CPU has the more it is capable of running multiple of them at once.

But how do you quantify it in a meaningful way? That's why we use benchmarks in the first place, so unless someone develops a repeatable benchmark to model multi-application system use, all we can do is hypothesize that it ought to be faster, but by how much is another matter. Is the extra $50-$100 worth a slightly snappier response when switching between applications? I don't think most users will feel that way. It's just added benefit for gamers.
 

Harry_Wild

Senior member
Dec 14, 2012
860
169
106
It would be nice if an updated BIOS just had a single “TDP wattage” setting with options like 65, 90, 120, 140, 200, 240, instead of having to set 6-7 parameters in the BIOS or in Ryzen Master. o_O

In Ryzen Master, there is no single option called “Eco” that I could find! :oops:
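
For illustration only, a single "TDP" selector like the one wished for above could expand to a package power limit roughly like this. The sketch assumes the commonly cited relationship PPT ≈ 1.35 × TDP for AMD desktop sockets (consistent with the 142 W figure mentioned earlier in the thread for a 105 W part). The function name is hypothetical, and the current limits (TDC, EDC) and thermal limits that a real BIOS also sets are deliberately not modeled.

```python
# ASSUMPTION: package power tracking limit (PPT) ~= 1.35 x TDP.
PPT_RATIO_PERCENT = 135


def ppt_from_tdp(tdp_watts: int) -> int:
    """Return an approximate PPT in watts, rounded to the nearest watt."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (tdp_watts * PPT_RATIO_PERCENT + 50) // 100


# The TDP options suggested in the post above.
for tdp in (65, 90, 120, 140, 200, 240):
    print(f"TDP {tdp:>3} W -> PPT ~{ppt_from_tdp(tdp)} W")
```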
 

Kaluan

Senior member
Jan 4, 2022
515
1,092
106
Maybe the real 6 core chips were actually the 12 & 16 core chips we made along the way? 😂

Very interesting find by der8auer (priceless live reaction too lol) :
 

maddie

Diamond Member
Jul 18, 2010
5,156
5,545
136
Maybe the real 6 core chips were actually the 12 & 16 core chips we made along the way? 😂

Very interesting find by der8auer (priceless live reaction too lol) :
Seeing that the 7600X is one of the slowest sellers, if not the slowest, this is most likely a 7900X die-bonding failure.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,667
2,537
136
Has anyone seen if they can activate the other cores on the 7900X? Since Chips&Cheese was able to deactivate 3 cores per CCD to fake a 10 Core 7800X

Everyone just assumes that it can't be done because after Phenom AMD moved to on-die antifuses that they permanently burn to disable cores, in a way you cannot tamper with in software.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
Everyone just assumes that it can't be done because after Phenom AMD moved to on-die antifuses that they permanently burn to disable cores, in a way you cannot tamper with in software.
Yes, but there have been reports (albeit old) of 1600X chips that were actually 8C/16T... I have not heard of any after the OG Zen. Perhaps someone with a 7900X can check?
 

lightmanek

Senior member
Feb 19, 2017
512
1,252
136
Yes, but there have been reports (albeit old) of 1600X chips that were actually 8C/16T... I have not heard of any after the OG Zen. Perhaps someone with a 7900X can check?

I have one still in use in a daily gaming PC. The "Ryzen 5 1600X 8-core processor" text in the BIOS is a sight to behold :)
PS. The processor has worked flawlessly as an 8-core since day 1, and it's even overclocked to 4 GHz on all cores.
 

Rekluse

Member
Sep 16, 2022
36
46
61
Getting back up to speed on all this.
Current Rumors point to the following chiplet configurations for upcoming Zen4 Server chips :

Genoa 96 Core - 12 Chiplets - 8 Zen4 Cores Each - Arranged 3 across on each corner around IOD
Bergamo 128 Core - 8 Chiplets - 16 Zen4c Cores Each - Arranged 2 across on each corner around IOD

Correct ?
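
A quick sanity check of the rumored configurations, as a minimal Python sketch. The numbers are the rumored ones listed above, not confirmed specifications, and the dictionary layout and helper names are purely illustrative.

```python
# Rumored (unconfirmed) chiplet layouts from the list above.
RUMORED = {
    "Genoa": {"chiplets": 12, "cores_per_chiplet": 8, "per_corner": 3},
    "Bergamo": {"chiplets": 8, "cores_per_chiplet": 16, "per_corner": 2},
}


def total_cores(cfg):
    """Total cores on the package."""
    return cfg["chiplets"] * cfg["cores_per_chiplet"]


def corners_used(cfg):
    """Number of IOD corners occupied if chiplets are grouped per corner."""
    return cfg["chiplets"] // cfg["per_corner"]


for name, cfg in RUMORED.items():
    print(f"{name}: {total_cores(cfg)} cores across {corners_used(cfg)} corners")
```

Both layouts are internally consistent: the chiplet counts multiply out to the stated core counts, and each fills all four corners around the IOD.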
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
Getting back up to speed on all this.
Current Rumors point to the following chiplet configurations for upcoming Zen4 Server chips :

Genoa 96 Core - 12 Chiplets - 8 Zen4 Cores Each - Arranged 3 across on each corner around IOD
Bergamo 128 Core - 8 Chiplets - 16 Zen4c Cores Each - Arranged 2 across on each corner around IOD

Correct ?

It remains a mystery how AMD plans to implement 128 cores per CPU package. I did a few mock-ups many pages ago. They need to, at the very minimum, reduce the CCD size to 60% of the original to make it fit on the CPU substrate.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,509
3,191
136
128 cores per package will be based on 16-core CCDs. They will need only 8 CCDs to accomplish that on the same package. It will look like Milan, with two in each quadrant. The 16-core CCDs are two 8-core CCXs with 1/2 the L3 cache. The L3 cache makes up a large chunk of the CCX size, so halving it means that the CCD only needs to be roughly 50% larger to accommodate all 16 cores. That alone gets you most of the way there. Recent rumors point to AMD using a slightly denser track library on the Bergamo CCDs to increase transistor density and further shrink the CCDs.
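
The area argument above can be put into a back-of-the-envelope Python sketch. The fraction of an 8-core CCX taken up by L3 is an assumed input here (the post only says it is "a large chunk"), the function name is made up, and everything on the CCD outside the two CCXs is ignored.

```python
def relative_ccd_area(l3_fraction: float, l3_scale: float = 0.5,
                      ccx_count: int = 2) -> float:
    """Area of a multi-CCX CCD relative to one standard 8-core CCX.

    l3_fraction: share of a standard CCX's area that is L3 (ASSUMED input).
    l3_scale:    how much of its L3 each CCX keeps (0.5 = half the L3).
    """
    if not 0.0 <= l3_fraction <= 1.0:
        raise ValueError("l3_fraction must be between 0 and 1")
    cores_and_rest = 1.0 - l3_fraction
    shrunk_ccx = cores_and_rest + l3_fraction * l3_scale
    return ccx_count * shrunk_ccx


# Try a few assumed L3 shares to see how sensitive the estimate is.
for frac in (0.3, 0.4, 0.5):
    growth = (relative_ccd_area(frac) - 1.0) * 100
    print(f"L3 = {frac:.0%} of CCX -> 16-core CCD ~{growth:.0f}% larger")
```

Under this simple model, "roughly 50% larger" corresponds to L3 being about half of the CCX area; a smaller L3 share makes the 16-core die proportionally bigger.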
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
The L3 Cache makes up a large chunk of the CCX size, so halving it means that the CCD only needs to be roughly 50% larger to accommodate all 16 cores. That, alone, gets you most of the way there.
I've done the math and the drawing; it's more complicated than you believe.

Basically, halving the L3$ will not suffice. The L3$ needs to be a quarter of the size (which can be done with the dense libraries from TSMC), the FPU about 40% of the size (à la PS5 FPU), and no TSV rails.

Let me try to find my diagrams.
 

Rekluse

Member
Sep 16, 2022
36
46
61
128 cores per package will be based on 16-core CCDs. They will need only 8 CCDs to accomplish that on the same package. It will look like Milan, with two in each quadrant. The 16-core CCDs are two 8-core CCXs with 1/2 the L3 cache. The L3 cache makes up a large chunk of the CCX size, so halving it means that the CCD only needs to be roughly 50% larger to accommodate all 16 cores. That alone gets you most of the way there. Recent rumors point to AMD using a slightly denser track library on the Bergamo CCDs to increase transistor density and further shrink the CCDs.

I've done the math and the drawing; it's more complicated than you believe.

Basically, halving the L3$ will not suffice. The L3$ needs to be a quarter of the size (which can be done with the dense libraries from TSMC), the FPU about 40% of the size (à la PS5 FPU), and no TSV rails.

Let me try to find my diagrams.

I can't quite remember whether cache benefits the most from a node shrink or the least. Or am I confusing all this with the memory controllers in the IOD? Possibly it's the less bandwidth-sensitive chiplet MCD structure of RDNA3, which is still on 6nm, that's throwing me off.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,509
3,191
136
Cache is benefiting less and less from node shrinks; however, choosing different design libraries can apparently influence its density notably. AMD mentioned, with respect to the 3D cache on the 5800X3D, that they had used a cache-optimized design rule set for the cache die. I suspect that using a higher-density library for Bergamo won't be focused on cache density.