Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3). The system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least) and will come with a new platform (SP5) with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
VERY impressive considering 226W - 244W isn't that much higher than AMD's max socket power of 1.35 * 170W ≈ 230W. It literally does look like the limitation to hitting 5.4 GHz all-core just comes down to keeping it cool.

But look how low the voltage appears here (on the world record holder):
1663795553753.png

1663795721127.png

This should be doable for anyone running a good water-cooling loop.
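
For anyone who wants to check the socket-power arithmetic quoted above, here is a minimal Python sketch. It assumes the 170 W TDP and the 1.35 multiplier from the quoted post; the function name is made up for illustration.

```python
def socket_power_limit(tdp_watts: float, multiplier: float = 1.35) -> float:
    """Return the maximum socket power for a given TDP (assumed: TDP x 1.35)."""
    return tdp_watts * multiplier


ppt = socket_power_limit(170)
print(f"Socket power limit: {ppt:.1f} W")  # the post rounds this to 230 W

# Compare the observed 226-244 W range from the post against that limit.
for observed in (226, 244):
    print(f"{observed} W is {observed / ppt - 1:+.1%} relative to the limit")
```

The low end of the observed range sits slightly under the limit and the high end a few percent over it.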
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,764
3,131
136
The number of FPU units in a core has no bearing on AVX512 adoption. The fundamental issue had always been that vectorizing code to improve performance is difficult. AVX512 is supposed to help in making things easier.
It's actually quite important: it changes the load/store-to-execution ratio, and if you're optimising your loops etc. then you need to choose which one to target.
 

inf64

Diamond Member
Mar 11, 2011
3,697
4,015
136
It's quite obvious when you see the Vcore and speeds. To reach that type of performance, Raptor Lake needs to be pushed to 5.8 GHz, which is way past its efficiency point. Ryzen 7950X is not being pushed that hard here.
16 big cores with similar IPC and clocks are always going to win vs a hybrid design like ADL/RL. I expect that Intel will come to their senses and step up the number of P cores at some point - if not it will not be pretty.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
16 big cores with similar IPC and clocks are always going to win vs a hybrid design like ADL/RL. I expect that Intel will come to their senses and step up the number of P cores at some point - if not it will not be pretty.
Eh? It's not a tradeoff between one big core and one small core. It's more like 1:4.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
16 big cores with similar IPC and clocks are always going to win vs a hybrid design like ADL/RL. I expect that Intel will come to their senses and step up the number of P cores at some point - if not it will not be pretty.

Not really surprised tbh. The 5950X also had superior performance per watt in heavy multithreaded applications as I recall, so naturally the 7950X would continue that trend in 3D rendering, which is the epitome of a heavy multithreaded workload.

Lightly threaded workloads, the type more likely to be encountered by most end consumers, are likely going to be very different. There the hybrid architecture will show its true strength.

And if we're being honest, how many end consumers are going to be doing serious rendering workloads using mainstream CPUs? A very small minority, and even then mostly for benchmark purposes. Real pro consumers will be using Xeons and Threadrippers along with GPUs.

Also, we'll see if you're still singing the same song when Intel's mainstream CPUs eventually have 32 efficiency cores with the IPC of a Zen 3 core or better.

I wasn't really a proponent of the hybrid architecture until I realized how competitive it made Alder Lake in workloads that Zen 3 with double the big cores should have destroyed it in. Yes, Intel had to push its performance far past the optimal efficiency range, but the fact that it could still compete in those workloads was quite the eye opener for me.

Gracemont is only the beginning my dude.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,542
14,496
136
Not really surprised tbh. The 5950X also had superior performance per watt in heavy multithreaded applications as I recall, so naturally the 7950X would continue that trend in 3D rendering, which is the epitome of a heavy multithreaded workload.

Lightly threaded workloads, the type more likely to be encountered by most end consumers, are likely going to be very different. There the hybrid architecture will show its true strength.

And if we're being honest, how many end consumers are going to be doing serious rendering workloads using mainstream CPUs? A very small minority, and even then mostly for benchmark purposes. Real pro consumers will be using Xeons and Threadrippers along with GPUs.

Also, we'll see if you're still singing the same song when Intel's mainstream CPUs eventually have 32 efficiency cores with the IPC of a Zen 3 core or better.

I wasn't really a proponent of the hybrid architecture until I realized how competitive it made Alder Lake in workloads that Zen 3 with double the big cores should have destroyed it in. Yes, Intel had to push its performance far past the optimal efficiency range, but the fact that it could still compete in those workloads was quite the eye opener for me.

Gracemont is only the beginning my dude.
Well, we will see next week. But at the moment it looks like Zen4 will win in single core, while also winning in efficiency, and will also win the multicore while also winning in efficiency.

So where does that leave the hybrid approach ? Unless of course all these pre-release benchmarks are crap.
 

Exist50

Well, we will see next week. But at the moment it looks like Zen4 will win in single core, while also winning in efficiency, and will also win the multicore while also winning in efficiency.

So where does that leave the hybrid approach ? Unless of course all these pre-release benchmarks are crap.
Even if Zen 4 were to win across the board, why would that change anything about the value of hybrid? 12 Golden/Raptor Cove cores wouldn't fare any better, in the general case.
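
A rough sketch of where that "12 cores" figure comes from. It assumes the roughly 1:4 P-core to E-core area ratio mentioned earlier in the thread and an 8P+16E part like the 13900K; the constant and function names are illustrative only, and this says nothing about actual performance.

```python
# Assumption: four E-cores take about the same die area as one P-core
# (the "more like 1:4" ratio claimed earlier in the thread).
E_CORES_PER_P_CORE_AREA = 4


def p_core_area_equivalent(p_cores: int, e_cores: int) -> float:
    """Express a hybrid core mix as the number of P-cores fitting in the same area."""
    return p_cores + e_cores / E_CORES_PER_P_CORE_AREA


# 8 P-cores + 16 E-cores occupy roughly the area of 12 P-cores under this assumption.
print(p_core_area_equivalent(8, 16))  # 12.0
```

So under that assumption the alternative to 8P+16E is about 12 big cores, not 24.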
 

Markfw

Even if Zen 4 were to win across the board, why would that change anything about the value of hybrid? 12 Golden/Raptor Cove cores wouldn't fare any better, in the general case.
OK, well, that's a personal opinion. I fail to see where the hybrid approach is a good idea, UNLESS the max number of threads for most applications meets the number of P-cores AND the thread director on all operating systems is perfect or very close to perfect.

I don't see that happening in the real world. I could be wrong. Talk to me in 5 years.

Oh, and this IS a Zen 4 thread....
 

Carfax83

Well, we will see next week. But at the moment it looks like Zen4 will win in single core, while also winning in efficiency, and will also win the multicore while also winning in efficiency.

To me, the Achilles heel of Zen 4 from what I've seen will be memory performance. This will be mitigated mostly by the large L3 cache and excellent branch predictors, but from the AIDA64 benchmarks I've seen, Zen 4 is definitely at a disadvantage to Raptor Lake in that respect, with the latter having higher read, write, and copy bandwidth as well as lower memory latency.

This could translate into Zen 4 not taking the gaming crown, as gaming is notoriously sensitive to memory performance.

Zen 4's strength will probably be mostly compute-oriented applications due to having up to 16 big cores. Rendering is a good example of a compute-oriented workload.

In the end, I think Raptor Lake will take the gaming crown and be more dominant in lightly threaded workloads while Zen 4 will definitely close the gap somewhat in gaming and be more dominant in heavy multithreaded workloads.

I'm personally more interested in the encoding performance of Zen 4. Alder Lake had strong encoding performance which will be reinforced with Raptor Lake, but Zen 4 could potentially be more powerful due to more cores.

When it comes to SIMD throughput, Alder Lake can do 3x 256-bit loads per cycle. Does anyone know how many Zen 4 can do?
 

Saylick

Diamond Member
Sep 10, 2012
3,125
6,296
136
When it comes to SIMD throughput, Alder Lake can do 3x 256-bit loads per cycle. Does anyone know how many Zen 4 can do?
Seeing as how Zen 3 could do 3 Loads and 2 Stores but only 2 Loads and 1 Store if 256-bit, it wouldn't surprise me if they just bumped up the whole L/S unit to do full 256-bit for Zen 4. But to answer your question, no, I don't think we've seen evidence to confirm just yet.
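
To put those load counts into bytes per cycle, here is a small sketch. The Alder Lake and Zen 3 figures are the ones stated in this thread; the Zen 4 line is purely the speculation above, not a confirmed spec, and the function name is just for illustration.

```python
def load_bytes_per_cycle(loads_per_cycle: int, width_bits: int) -> int:
    """Peak bytes loaded per cycle for a given number of loads of a given width."""
    return loads_per_cycle * width_bits // 8


configs = {
    "Alder Lake (3x 256-bit, per the question)": (3, 256),
    "Zen 3 (2x 256-bit, per this post)": (2, 256),
    "Zen 4 (3x 256-bit, speculation only)": (3, 256),
}

for name, (loads, width) in configs.items():
    print(f"{name}: {load_bytes_per_cycle(loads, width)} bytes/cycle")
```

If the speculation held, Zen 4 would match Alder Lake's peak 256-bit load rate per cycle, a 1.5x step up from Zen 3.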

1663818258778.png
 

biostud

Lifer
Feb 27, 2003
18,238
4,755
136
Lightly threaded workloads, the type more likely to be encountered by most end consumers, are likely going to be very different. There the hybrid architecture will show its true strength.

And if we're being honest, how many end consumers are going to be doing serious rendering workloads using mainstream CPUs? A very small minority, and even then mostly for benchmark purposes. Real pro consumers will be using Xeons and Threadrippers along with GPUs.
If you are buying a 13900K/7950X, you'd better have some work that requires heavily threaded performance.

If you only have work that is lightly threaded, how is it going to perform better on a hybrid design than a regular design, as it is only using a few cores?