Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
Well, about the 24 core... IF TRUE (which I think it may very well be): in the Pentathlon currently going on, in one project that is CPU specific, I only have about 785 threads going, and only about half of those on this one project, but I am outputting more than the number one user of all time on this project. Two 5950Xs are equal in output to a 7742 64-core, due to their high speed and IPC. I have 4 of them, plus a 3950X, a 7452 EPYC (32c/64t) and three 7742s doing this. I would LOVE to replace them with 24-core boxes.

On the current chart, for the day, up till now, the number one user has 3,170,667 and I have 3,446,000 points.
:eek:

It's also coz using 128 threads needs very, very optimized software. Five years down the line, we might see more and more software start to support 128 threads. Windows 11 may even have its max thread count get upgraded to 128.
Windows allows the use of more than 128 threads already, though there are probably per process limitations (I haven’t researched).

It is typically more sensible to use separate processes anyway for a variety of reasons.

The software that I use (and write) would benefit from more cores, however Threadripper is almost too much of a step up, so I do wish we could get a 24-32 core chip from AMD. My 5950x definitely gets saturated.
I think it might depend on the date of release of Raptor Lake? When is that supposed to be again? 16C 7950x might not cut it against 8+16 13900k, if they happen to compete.
Anyway, it's a funny proposition that I might hypothetically be upgrading from an Intel 7940X to an AMD 7950X :)

With Zen 4 expected to have a total performance uplift of > 25%, AMD has nothing to worry about.

Also, Intel is currently power limited, so you likely won’t see even close to a linear increase.

I still wish AMD would explore the idea of a hybrid chip (beyond Zen4c, which is rumored to be EPYC only).
 
Jul 27, 2020
15,787
9,838
106
Windows allows the use of more than 128 threads already, though there are probably per process limitations (I haven’t researched).
It's a bit messed up.


1652718006935.png
Ultimately this puts us in a bit of a quandary for our CPU-to-CPU comparisons on the following pages. Normally we run our CPUs on W10 Pro with SMT enabled, but it’s clear from these benchmarks that in every multithreaded scenario, we won’t get the best result. We may have to look at how we test processors >16 cores in the future, and run them on Windows 10 Enterprise.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
For the desktop 7000-series, I hope we will see a core count increase across all of the performance tiers — Ryzen 3, 5, 7, 9 — with 6, 8, 12 and 24 cores for the top SKU in each tier, respectively. I want to see a relegation of 6 cores and below to the value tier (Ryzen 3), thereby establishing 8 cores as the mainstream (Ryzen 5) configuration for gaming. This would align PC and console gaming configurations in terms of core count.

Here is my attempt at a good SKU line-up and a simplified model naming scheme for the 7000-series:

HEDT Tier (old "Ryzen 9", replacing "Ryzen Threadripper"):
  • Ryzen 7950X — 24C/48T + V-Cache
  • Ryzen 7950 — 24C/48T
  • Ryzen 7900X — 16C/32T + V-Cache
  • Ryzen 7900 — 16C/32T
Premium Tier (old "Ryzen 7"):
  • Ryzen 7750 — 12C/24T
  • Ryzen 7700X — 8C/16T + V-Cache
Mainstream Tier (old "Ryzen 5"):
  • Ryzen 7500 — 8C/16T
Value Tier (old "Ryzen 3"):
  • Ryzen 7350 — 6C/12T
  • Ryzen 7300 — 4C/8T
Here the second digit in the model number is the old tier number (3|5|7|9), making separate numbering of the tier redundant and allowing simpler model naming. The first digit of the model number is the generation. The last two digits designate the positioning within the tier, with a higher number indicating more performance. The suffix "X" now designates V-Cache and is reserved for the premium and HEDT tiers.
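
To show how the proposed scheme would read in practice, here is a minimal sketch of a decoder for it. This only illustrates the hypothetical naming proposed in this post, not any real AMD scheme; the function name `decode_model` and the returned keys are made up for the example.

```python
import re

# Tier names as proposed in this post (hypothetical line-up, not an AMD scheme).
TIERS = {"3": "Value", "5": "Mainstream", "7": "Premium", "9": "HEDT"}


def decode_model(model: str) -> dict:
    """Split a proposed model number such as '7950X' into its parts."""
    # generation digit, tier digit (3|5|7|9), two position digits, optional X
    match = re.fullmatch(r"(\d)([3579])(\d{2})(X?)", model)
    if match is None:
        raise ValueError(f"not a valid model number in this scheme: {model!r}")
    generation, tier, position, suffix = match.groups()
    return {
        "generation": int(generation),
        "tier": TIERS[tier],
        "position": int(position),   # higher means more performance within the tier
        "v_cache": suffix == "X",    # "X" marks V-Cache in this proposal
    }


for name in ("7950X", "7750", "7500", "7300"):
    print(name, decode_model(name))
```

The tier is recovered from the second digit alone, which is the point of the proposal: no separate "Ryzen 5/7/9" label is needed.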

For the workstation market, with core counts spanning 16-64 cores, replacing the "Ryzen Threadripper PRO" brand with "EPYC Threadripper", using the rumoured cut-down SP6 server socket, sounds like a good and plausible way to go, rather than maintaining a dedicated Threadripper socket.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
The software that I use (and write) would benefit from more cores, however Threadripper is almost too much of a step up, so I do wish we could get a 24-32 core chip from AMD. My 5950x definitely gets saturated.

It just works on Linux, but on Windows it requires extra effort. You can read the high-level discussion of the concepts and problems here:


Also supposedly it is solved starting with Windows 11 and Server 2022:


How optimal are the assignments? Who knows, but it makes sense to run these new OSes if one is serious about the perf of a 64C TR or is running any system that has > 64 threads.
And old OSes had nasty behavior, where systems that had, say, 40 or 48 total cores with HT were hurt the most, due to madness with NUMA / HT assignments to the first (default) processor group.
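
To put rough numbers on the processor group issue, here is a minimal sketch, assuming the documented Windows limit of 64 logical processors per processor group. The function names are invented for illustration, and real group assignment also depends on NUMA topology, which this ignores.

```python
import math

# Windows caps a processor group at 64 logical processors.
MAX_GROUP_SIZE = 64


def processor_groups(logical_processors: int) -> int:
    """Minimum number of processor groups needed for this many logical processors."""
    return math.ceil(logical_processors / MAX_GROUP_SIZE)


def max_single_group_share(logical_processors: int) -> float:
    """Upper bound on the fraction of the machine that a single group can cover."""
    return min(MAX_GROUP_SIZE, logical_processors) / logical_processors


# (label, logical processors) for a few SMT-enabled configurations
for label, threads in [("16c/32t", 32), ("24c/48t", 48), ("48c/96t", 96), ("64c/128t", 128)]:
    groups = processor_groups(threads)
    share = max_single_group_share(threads)
    print(f"{label}: {groups} group(s), one group covers at most {share:.0%}")
```

By this arithmetic, anything up to 64 threads fits in a single group, so the problem only shows up beyond that point, where software confined to one group sees at most part of the machine.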
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
HEDT Tier (old "Ryzen 9", replacing "Ryzen Threadripper"):
  • Ryzen 7950X — 24C/48T + V-Cache
  • Ryzen 7950 — 24C/48T
  • Ryzen 7900X — 16C/32T + V-Cache
  • Ryzen 7900 — 16C/32T

Has anyone, with any kind of reliable knowledge, shown us how 3 chiplets would fit on an AM5 package? The IOD will wind up being larger to handle all the extra routing/buffering. Then there needs to be space, not just for the CCDs, but also for all the extra traces from that CCD to the IOD and the socket.
 

CakeMonster

Golden Member
Nov 22, 2012
1,385
483
136
And old OSes had nasty behavior, where systems that had, say, 40 or 48 total cores with HT were hurt the most, due to madness with NUMA / HT assignments to the first (default) processor group.

Does this mean that a theoretical 24c/48t Z4 could run into trouble on W10 compared to a 16c/32t 5950X which AFAIK works perfectly? Or is there some technical stuff that went over my head..?
 

tomatosummit

Member
Mar 21, 2019
184
177
116
Has anyone, with any kind of reliable knowledge, shown us how 3 chiplets would fit on an AM5 package? The IOD will wind up being larger to handle all the extra routing/buffering. Then there needs to be space, not just for the CCDs, but also for all the extra traces from that CCD to the IOD and the socket.
I don't think anyone useful has seen under the hood yet.
The only thing we have is the CPU with the new heatspreader. What it does show is the SMDs sitting outside the copper cover, which would mean there's more room for chiplets or whatever underneath the heat spreader.

Is the cpu package itself the same size as am4? I always assumed it was a little larger, due to extra pins and signal integrity for ddr5/pcie5, I know it keeps the same cooler mounting.
 

SteinFG

Senior member
Dec 29, 2021
400
454
106
Fitting chiplets is not the real problem. I've seen so many paint.exe edits where people cram in whatever they want. Look at the bottlenecks: AMD has kept to roughly one memory channel per chiplet (the 3990X is an exception). This gives them nice scaling. Ryzen: 2 channels, 2 chiplets. Threadripper: 4 channels, 4 chiplets. Milan: 8 channels, 8 chiplets. Genoa: 12 channels, 12 chiplets.

So as long as there are 2 memory channels on mainstream, I expect there to be 2 chiplets. And AMD also designed AM5 to be as small as possible (no components under the package, only pins).

Is the cpu package itself the same size as am4? I always assumed it was a little larger, due to extra pins and signal integrity for ddr5/pcie5, I know it keeps the same cooler mounting.

Package size is the same as AM4: 40x40 mm. Accounting for the heatspreader and capacitors, about a 25x25 mm area can be used for silicon on AM5, about the same as on AM4.
 

maddie

Diamond Member
Jul 18, 2010
4,723
4,628
136
Has anyone, with any kind of reliable knowledge, shown us how 3 chiplets would fit on an AM5 package? The IOD will wind up being larger to handle all the extra routing/buffering. Then there needs to be space, not just for the CCDs, but also for all the extra traces from that CCD to the IOD and the socket.
Traces in the substrate are key. AMD already explained the challenges in routing 2 chiplets to the IOD. 3 might be too much for the package size.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
If they really needed to do it they probably could. Genoa had up to 96 cores so they obviously figured out how to fit additional chiplets into a given area. Either that or Zen 4 chiplets are actually 12-core parts.

However, I don't think they necessarily need to do so. A 16C Zen 4 CPU is going to be a monster that Intel will have problems beating.

Anyone who would get an advantage from 24 cores over 16 will get even more of an advantage from 32, 64, etc. and be better served with a Threadripper part.

I suspect that if we do see more than 16 cores it will be due to desktop parts that use the Zen 4c chiplets that are more densely packed. If that's the case then we'll get up to 32-core desktop parts.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,324
1,462
136
Has anyone, with any kind of reliable knowledge, shown us how 3 chiplets would fit on an AM5 package? The IOD will wind up being larger to handle all the extra routing/buffering. Then there needs to be space, not just for the CCDs, but also for all the extra traces from that CCD to the IOD and the socket.

The traces require a lot less space (and the chiplets can be closer together) if the rumors about the InFO packaging are correct. This would also help reduce the size of the io chiplet, it might be surprisingly small if it's made on N6.

I am not convinced at all that there will be a 24-core AM5 Ryzen in the next generation, but it's not because they can't do it; I just don't see the need for it. If there is one, I would however wager that it will set price records for a mainstream platform CPU.
 

Timmah!

Golden Member
Jul 24, 2010
1,399
604
136
Fitting chiplets is not the real problem. I've seen so many paint.exe edits where people cram in whatever they want. Look at the bottlenecks: AMD has kept to roughly one memory channel per chiplet (the 3990X is an exception). This gives them nice scaling. Ryzen: 2 channels, 2 chiplets. Threadripper: 4 channels, 4 chiplets. Milan: 8 channels, 8 chiplets. Genoa: 12 channels, 12 chiplets.

So as long as there are 2 memory channels on mainstream, I expect there to be 2 chiplets. And AMD also designed AM5 to be as small as possible (no components under the package, only pins).



Package size is the same as AM4: 40x40 mm. Accounting for the heatspreader and capacitors, about a 25x25 mm area can be used for silicon on AM5, about the same as on AM4.

Can't you have 2 chiplets served by a single memory channel?
 

Timmah!

Golden Member
Jul 24, 2010
1,399
604
136
Yes. Running with 1 memory stick. Not wise, but it will work.

Why? There are boards with 4 slots, yet only 2 channels, so it's 2 sticks per channel, right? How does the number of sticks concern the inner layout of the CPU, i.e. whether the traces there are routed to two 6-core chiplets rather than a hypothetical single 12-core chip? (I assume that a hypothetical 24-core would be four 6-core chiplets, not three 8-core ones.)
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
Regarding increased core counts on the desktop, and apart from the immensely interesting technical discussion in this thread over the last couple of years — which has gone over all the challenges and possible solutions in great detail — I recall statements from AMD CEO Lisa Su and CTO Mark Papermaster, both saying that we should expect further core count scaling:

"There will be more core counts in the future. I would not say that somehow 16-core and 64-core are the limits. They will come as we scale other parts of the system as well." (Lisa Su)

AMD CEO Lisa Su talks core counts, console launches, and Apple relationship | VentureBeat

"We asked Papermaster if it would make sense to move up to 32 cores for mainstream users: "I don’t see in the mainstream space any imminent barrier, and here's why: It's just a catch-up time for software to leverage the multi-core approach," Papermaster said. "But we're over that hurdle, now more and more applications can take advantage of multi-core and multi-threading. [...] In the near term, I don’t see a saturation point for cores. You have to be very thoughtful when you add cores because you don’t want to add it before the application can take advantage of it. As long as you keep that balance, I think we'll continue to see that trend.""

AMD CTO Mark Papermaster: More Cores Coming in the 'Era of a Slowed Moore's Law' | Tom's Hardware (tomshardware.com)
Article - "AMD CTO Mark Papermaster: More Cores Coming in the 'Era of a Slowed Moore's Law'" - @ Tom's (AnandTech Forums)

Notably, there has not been a core count bump on the desktop since "Zen 2" ("Matisse") and the launch of 16-core Ryzen 9 3950X back in 2019-NOV. Now that the system platform is set to scale, as Lisa Su says, with the upcoming AM5 socket providing vastly more memory bandwidth with DDR5 — it feels like the time is right. As we know, there will certainly be a 50-100% core count bump in the server segment over the next 12 months, so it is not unreasonable to expect a 50% bump on the desktop, I would think. Intel is increasing core counts with "Raptor Lake" as well, making another step up for AMD all the more urgent.

How they have engineered it all will be interesting to see.
 

maddie

Diamond Member
Jul 18, 2010
4,723
4,628
136
Why? There are boards with 4 slots, yet only 2 channels, so it's 2 sticks per channel, right? How does the number of sticks concern the inner layout of the CPU, i.e. whether the traces there are routed to two 6-core chiplets rather than a hypothetical single 12-core chip? (I assume that a hypothetical 24-core would be four 6-core chiplets, not three 8-core ones.)
The question was, "Can't you have 2 chiplets served by a single memory channel?".

My answer was, yes, which happens if you only have 1 memory stick installed.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Keep in mind:

Threadripper 2990WX: 32 cores, quad DRAM channels at 2400 to maybe 3000 if you were lucky.

Conjectural 24-core Zen 4 desktop part: 24 cores, effectively quad-channel DRAM (DDR5 DIMMs are essentially two channels each), at the equivalent of 2400 (1/2 of DDR5-4800 bandwidth per channel) to around 3000 per channel (DDR5-5800+ will be available).

Feeding the beast won’t be a problem.
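
The comparison above can be worked through with a minimal sketch of theoretical peak bandwidth. It assumes 64-bit DDR4 channels, two 32-bit subchannels per DDR5 DIMM, and the speed grades quoted in the post; the 24-core part is purely hypothetical, and the function name is made up for the example. Real sustained bandwidth will be lower than these peak figures.

```python
def peak_bandwidth_gbs(channels: int, bus_width_bits: int, mega_transfers: int) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second)."""
    bytes_per_transfer = bus_width_bits / 8
    return channels * bytes_per_transfer * mega_transfers / 1000


# (description, cores, channels, bus width in bits, MT/s)
configs = [
    ("2990WX, 4 x 64-bit DDR4-2400", 32, 4, 64, 2400),
    ("2990WX, 4 x 64-bit DDR4-3000", 32, 4, 64, 3000),
    # Hypothetical 24-core part: 2 DIMM channels = 4 x 32-bit DDR5 subchannels
    ("24-core, 4 x 32-bit DDR5-4800", 24, 4, 32, 4800),
    ("24-core, 4 x 32-bit DDR5-5800", 24, 4, 32, 5800),
]

for name, cores, channels, width, speed in configs:
    total = peak_bandwidth_gbs(channels, width, speed)
    print(f"{name}: {total:.1f} GB/s total, {total / cores:.2f} GB/s per core")
```

Under these assumptions the totals are in the same range for both platforms, so with fewer cores the hypothetical 24-core part ends up with more peak bandwidth per core than the 2990WX at comparable points in each range.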