Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3). The system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).



What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
How much will the V-cache offset DDR5 gains?

What I mean is how much performance boost will we see in moving to V-cache with current DDR4 for memory bound applications?

How much without V-cache but with DDR5?

How much with V-cache and then adding DDR5?

That is the neat thing about Zen3+ on Rembrandt: it's Zen3 with DDR5. Since the 6900HX will boost to 5 GHz we can extrapolate many of the ST performance
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
My 5950X boosts to 5.05 - 5.1 GHz ST. All core boost is around 4.05 - 4.15 GHz. At stock, results depend on the room temperature (it is higher when the heating system is not turned on).
I have mostly seen 5800X parts boost to around 4.5 GHz all core.

Assuming the part used for that demo is an 8 Core part,
5.4+ GHz ST and 5.0 GHz all core should be easy on a hypothetical 7800X.
5.6+ GHz ST and 4.8 GHz all core on a hypothetical 7950X. With PBO should be easy to go 5 GHz on all 16 cores.
With AVX512 added at the right moment alongside DDR5, FP will see a massive jump, considering Raphael will launch with DDR5-5200 (or even much higher, since 5200 is for Genoa).
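
As a rough sanity check on the DDR5 point, peak theoretical bandwidth follows directly from the transfer rate. This is a minimal sketch, assuming a dual-channel desktop setup with 64 data bits per channel and DDR4-3200 as the comparison baseline (both are assumptions, not figures from the post). Sustained bandwidth in real workloads will be lower.

```python
def peak_bandwidth_gbs(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Peak theoretical memory bandwidth in GB/s (1 GB = 10^9 bytes).

    mt_per_s  : transfer rate in mega-transfers per second (e.g. 5200)
    channels  : number of 64-bit memory channels (assumed 2 for desktop)
    bus_bytes : bytes moved per transfer per channel (64 bits = 8 bytes)
    """
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


ddr4 = peak_bandwidth_gbs(3200)  # assumed baseline: DDR4-3200
ddr5 = peak_bandwidth_gbs(5200)  # DDR5-5200, the speed mentioned above
print(f"DDR4-3200: {ddr4:.1f} GB/s")
print(f"DDR5-5200: {ddr5:.1f} GB/s")
print(f"ratio: {ddr5 / ddr4:.3f}x")
```

Under those assumptions the peak figure rises from 51.2 GB/s to 83.2 GB/s, which is where the expectation of a large gain for bandwidth-bound FP code comes from.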
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,905
136
Okay, I still don’t believe that Intel will be particularly competitive in HEDT in the near future. If you are buying a chip with more than about 10 cores, then I hope you are doing something other than gaming to justify it. Yes, those idiots are customers, but they are also a super tiny niche. AMD almost didn’t offer a HEDT product at all, so AMD isn’t going to expend too much energy chasing such idiots. They may still get them on the strength of their rebadged server parts though.

I also don’t believe that Intel will win this fight based on low core count boost clocks. The number of applications where Intel boost clocks will win vs. AMD boost clocks, given that you are already buying a high core count HEDT processor for something other than gaming, is super tiny.
Modern games can make use of > 10 cores thanks to DirectX 12. I have seen a few games drive my 5950X to about 50-70% utilization across 32 threads. I will have to make a list. That being said, I mostly agree with you.
My 5950X boosts to 5.05 - 5.1 GHz ST. All core boost is around 4.05 - 4.15 GHz. At stock, results depend on the room temperature (it is higher when the heating system is not turned on).
I have mostly seen 5800X parts boost to around 4.5 GHz all core.

Assuming the part used for that demo is an 8 Core part,
5.4+ GHz ST and 5.0 GHz all core should be easy on a hypothetical 7800X.
5.6+ GHz ST and 4.8 GHz all core on a hypothetical 7950X. With PBO should be easy to go 5 GHz on all 16 cores.
With AVX512 added at the right moment alongside DDR5, FP will see a massive jump, considering Raphael will launch with DDR5-5200 (or even much higher, since 5200 is for Genoa).
I suspect Zen 4 will be around 5.2, maybe 5.3. It is pretty easy to get a well binned Zen 3 core to those levels.
 

CakeMonster

Golden Member
Nov 22, 2012
1,384
482
136
Modern games can make use of > 10 cores thanks to DirectX 12. I have seen a few games drive my 5950X to about 50-70% utilization across 32 threads. I will have to make a list. That being said, I mostly agree with you.
I see that too on my 5900X, but when looking up benchmarks for the 6 and 8 core parts, apparently the threads scale just fine on fewer cores, it seems overall utilization increases on 12/16 just because the threads are available.

One example is HZD, it pushes above 50% for me in crowded areas and it visually lights up on massive threads usage in Task Manager, but it still does just fine on lower core parts, and HUB considers HZD a game 'that is not heavy on the CPU'.
 
Jul 27, 2020
15,738
9,792
106
I see that too on my 5900X, but when looking up benchmarks for the 6 and 8 core parts, apparently the threads scale just fine on fewer cores, it seems overall utilization increases on 12/16 just because the threads are available.

One example is HZD, it pushes above 50% for me in crowded areas and it visually lights up on massive threads usage in Task Manager, but it still does just fine on lower core parts, and HUB considers HZD a game 'that is not heavy on the CPU'.
This needs to be investigated further. Maybe the extra threads on extra cores makes the 1% minimum FPS better. Also, threads across CCDs might decrease minimum FPS due to greater inter-CCD latency.
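
For anyone wanting to test this, the 1% low figure can be computed from a frame-time capture. A minimal sketch, assuming the "average of the slowest 1% of frames" definition (tools differ, and some report the 99th percentile frame time instead); the frame-time list is made-up data, not a measurement.

```python
def average_fps(frame_times_ms):
    """Average FPS over the whole capture."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def one_percent_low_fps(frame_times_ms):
    """FPS corresponding to the mean frame time of the slowest 1% of frames."""
    slowest_first = sorted(frame_times_ms, reverse=True)
    n = max(1, len(slowest_first) // 100)  # always use at least one frame
    worst = slowest_first[:n]
    return 1000.0 / (sum(worst) / n)


# Hypothetical capture: 990 smooth frames at 10 ms plus 10 stutters at 40 ms.
frames = [10.0] * 990 + [40.0] * 10
print(f"average: {average_fps(frames):.1f} FPS")
print(f"1% low:  {one_percent_low_fps(frames):.1f} FPS")
```

In this made-up capture the average stays near 97 FPS while the 1% low drops to 25 FPS, so two CPUs with similar averages could still differ noticeably in this metric.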
 

andermans

Member
Sep 11, 2020
151
153
76
I see that too on my 5900X, but when looking up benchmarks for the 6 and 8 core parts, apparently the threads scale just fine on fewer cores, it seems overall utilization increases on 12/16 just because the threads are available.

One example is HZD, it pushes above 50% for me in crowded areas and it visually lights up on massive threads usage in Task Manager, but it still does just fine on lower core parts, and HUB considers HZD a game 'that is not heavy on the CPU'.

Games tend to be horrible and have their threads waiting in a busy wait loop on work. So they are all spending CPU time when waiting for work. I believe this is mostly because Windows is (was?) pretty bad with thread scheduling.
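
To illustrate the difference, here is a minimal sketch comparing a busy-wait loop with a blocking wait. It uses Python threads purely for illustration (game engines would do this in native code), and the function names are hypothetical. It measures the CPU time the process burns while a worker waits 0.2 s for "work" to arrive.

```python
import threading
import time


def busy_wait(ready):
    # Spin: poll the flag continuously, so the thread looks busy the whole time.
    while not ready.is_set():
        pass


def blocking_wait(ready):
    # Block in the OS until signalled; uses almost no CPU time while waiting.
    ready.wait()


def cpu_burned_while_waiting(wait_fn, delay_s=0.2):
    """Return CPU seconds consumed by the process while a worker waits."""
    ready = threading.Event()
    worker = threading.Thread(target=wait_fn, args=(ready,))
    start = time.process_time()
    worker.start()
    time.sleep(delay_s)  # the "work" arrives after delay_s
    ready.set()
    worker.join()
    return time.process_time() - start


spin_cpu = cpu_burned_while_waiting(busy_wait)
block_cpu = cpu_burned_while_waiting(blocking_wait)
print(f"busy-wait: {spin_cpu:.3f} s CPU, blocking wait: {block_cpu:.3f} s CPU")
```

Both workers do the same amount of useful work (none), but the spinning one shows up as utilization in a tool like Task Manager, which is why a high utilization number alone does not prove the extra cores are helping.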
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Modern games can make use of > 10 cores thanks to DirectX 12. I have seen a few games drive my 5950X to about 50-70% utilization across 32 threads. I will have to make a list. That being said, I mostly agree with you.

I suspect Zen 4 will be around 5.2, maybe 5.3. It is pretty easy to get a well binned Zen 3 core to those levels.
Using cores doesn’t mean that performance continues scaling. In the benchmarks I have seen, some games continued scaling up to around 10 cores, but most stopped scaling at lower than that. It is subject to diminishing returns. In many cases, it is actually the larger cache that is available on some higher core count parts that was making the difference, not actually the additional cores.
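
The diminishing returns can be sketched with Amdahl's law. This is only an illustration: the 80% parallel fraction is an assumed value, not something measured from any game, and real games are also affected by cache size and GPU limits as discussed here.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


P = 0.80  # assumed parallel fraction of a frame's CPU work (hypothetical)

previous = None
for cores in (4, 6, 8, 10, 12, 16):
    speedup = amdahl_speedup(P, cores)
    gain = "" if previous is None else f" (+{speedup - previous:.2f})"
    print(f"{cores:2d} cores: {speedup:.2f}x{gain}")
    previous = speedup
```

With that assumption, going from 4 to 10 cores adds more than going from 10 to 16, and no number of cores can exceed 5x because the serial 20% remains.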

Also, you have to be in a situation where you aren’t gpu limited from the start. If you are playing at 4K with high quality, then you need a powerful cpu to keep up with the gpu, but most higher end CPUs will put the bottleneck on the gpu. This is why I would expect most new HEDT parts (regardless of who makes them) to perform the same in games; they will all put the bottleneck on the gpu for any reasonable resolution and quality levels. Someone will probably still test them at 1080 with a 3090Ti or something. Although, Intel may not win such contrived benchmarks going forward due to the ridiculous amount of cache available on AMD parts. Since DX12 actually allows software to take better advantage of many cpu cores, we are generally going to be even more bottlenecked by the GPUs going forward.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Not saying low core count MHz is the only thing that matters, but that a 56 core SPR-X will win in all areas. Yes, it'll use more power, but we all know that. 450W RTX 3090 Ti, 600W Turin, 1KW+ accelerators, etc, etc.
Will win in all areas against what? Threadripper 5000? Threadripper 5000 X3D? If AMD were to throw 4 high leakage, 8-core X3D chiplets on a Threadripper package, it will likely dominate in many applications and in a lot of cases, not by a little. Intel is solidly in the “I will believe it when I see it” category, but most companies are in that category right now due to covid delays and shortages anyway. AMD almost certainly can answer any challenge from Intel due to the modularity and flexibility of their server platform if they are inclined to do so. The Intel parts could be very late and will certainly be massive power hogs, which could limit performance severely.

I am dubious as to whether we will actually see an SP5 Threadripper though. If they have plenty of 5 nm chiplets, then maybe sooner than expected, but SP5 is even bigger and more ridiculous for the consumer market than current Epyc / Threadripper sockets. I have always thought that they should have 3 sockets to allow full Epyc, 1/2 Epyc, and 1/4 Epyc (desktop socket). If AM5 is limited to dual channel memory, then it isn’t even 1/4 Epyc. 1/4 Epyc Genoa would be triple channel memory and up to 3 cpu chiplets. The AM5 package is looking real weird though. They seem to be trying to allow for a lot of z-height and they have pushed capacitors out to the edge with the little cut outs, so I am wondering if there will be some interesting expansion possible in the future. They have quite a bit of room. I don’t know if we can rule out models with an 8 core high performance chiplet combined with a 16 core efficiency chiplet. If there is no SP5 based Threadripper, then a 3 chiplet AM5 might make sense. They can get a lot of IO by using multiple IO die / chipset chips. I am wondering if those will be made at GF or Samsung rather than TSMC.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
How much will the V-cache offset DDR5 gains?

What I mean is how much performance boost will we see in moving to V-cache with current DDR4 for memory bound applications?

How much without V-cache but with DDR5?

How much with V-cache and then adding DDR5?
The effect of cache size and bandwidth is very application dependent. The performance improvement will vary by a large amount between applications. It can make a large difference, even in some bandwidth constrained applications. This is why infinity cache on GPUs can hide the lower memory bandwidth in some cases.
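
One way to see why it varies so much is the textbook average memory access time (AMAT) formula: L3 hit time plus L3 miss rate times the DRAM penalty. The sketch below uses invented numbers throughout (latencies, miss rates, and the idea that a larger L3 halves the miss rate are all assumptions for illustration, not V-cache measurements).

```python
def amat_ns(l3_hit_ns: float, l3_miss_rate: float, dram_penalty_ns: float) -> float:
    """Average memory access time as seen at the L3 level, in nanoseconds."""
    return l3_hit_ns + l3_miss_rate * dram_penalty_ns


# All numbers below are illustrative assumptions, not measurements.
L3_HIT_NS = 10.0
DRAM_PENALTY_NS = 80.0

workloads = {"cache-friendly": 0.05, "memory-bound": 0.40}  # assumed L3 miss rates
for name, miss_rate in workloads.items():
    base = amat_ns(L3_HIT_NS, miss_rate, DRAM_PENALTY_NS)
    # Assume, for illustration only, that the larger L3 halves the miss rate.
    bigger_l3 = amat_ns(L3_HIT_NS, miss_rate / 2, DRAM_PENALTY_NS)
    print(f"{name}: {base:.1f} ns -> {bigger_l3:.1f} ns "
          f"({100 * (1 - bigger_l3 / base):.0f}% lower)")
```

With these made-up inputs the cache-friendly case barely moves while the memory-bound case improves a lot, which matches the point that the benefit depends heavily on how often the application misses in L3 to begin with.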
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
Why is that exactly?
Pretty slim on product announcements and future product information. Except for Rembrandt - that's a big deal for laptops and maybe some mini business desktops. I'm curious how much supply there will be. Other than that, a weak GPU, V-cache only announced on one CPU. Not much useful info on Zen4. So, just a kind of boring CES. Zen4 desktop/server and RDNA3 will be much more interesting, once they are formally announced.

Admittedly, I'm a long time DIY enthusiast - so I'm biased.
 
Jul 27, 2020
15,738
9,792
106
Pretty slim on product announcements and future product information. Except for Rembrandt - that's a big deal for laptops and maybe some mini business desktops. I'm curious how much supply there will be. Other than that, a weak GPU, V-cache only announced on one CPU. Not much useful info on Zen4. So, just a kind of boring CES. Zen4 desktop/server and RDNA3 will be much more interesting, once they are formally announced.

Admittedly, I'm a long time DIY enthusiast - so I'm biased.
Well, AMD's presentation was positively erotic compared to the snoozefest Intel put on. Mobileye? They will get sued to oblivion when their precious technology starts causing accidents.
 


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Well, AMD's presentation was positively erotic compared to the snoozefest Intel put on. Mobileye? They will get sued to oblivion when their precious technology starts causing accidents.

Ah well, Intel will sell Mobileye anyway. Intel could have given a little more detail about ARC at least.

Yeah, they've been weird for a few years now. I remember endless launches of Tiger Lake. Was it 3 or 4 times? I can't remember.

They know what they are ahead on and what they are not. At least they are not falsely flaunting it.

AMD being quiet on the RMB-H and Intel being quiet on ADL-P/U performance is similar to that.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
SemiAnalysis can confirm that Zen 4 based desktop and server products will use a fan out. This fan out will then be packaged traditionally on top of a standard organic substrate which will have LGA pins on the bottom of this. The company packaging these products and technical reason for moving to fan out will be revealed behind the paywall.
Dylan seems confident that Genoa and Raphael will use fan out packages.
While I don't know his track record, I have seen him active with a lot of well known industry folks.

Nevertheless, this is rather in line with the LinkedIn snippets I dug up regarding GMI3 SI/PI being modeled for X3D
Also much thinner substrate this time around for SP5 and AM5.
Zen 4 could be a good time to do the FO migration, with the full architecture update coming in Zen 5 along with a node shrink.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
Pretty slim on product announcements and future product information.

All of the presentations were pretty boring, but what else is there to expect when no one has any of their next generation products launching until the end of the year? Almost all of the Radeon stack has been released, Zen 4 is still half a year away at least so we're not going to get a lot of details now, and there's even enough of a wait on Zen 3D that AMD isn't releasing all of the details yet.

Similarly, Intel just released their new architecture so there isn't much to say beyond additional products that weren't available at launch. Maybe they could have said more about their GPU efforts, but if they don't have a product ready yet there's no real reason to or they wouldn't be adding anything new.

Pretty much the same story with Nvidia. They announced their low-end GPU part of the stack and a new top-end GPU for the people that will gladly buy it to replace their last top-end NVidia GPU. Just be glad AMD didn't talk about another cloud service that no one wants.
 

TBytemaster

Junior Member
Jun 23, 2020
7
19
61
Dylan seems confident that Genoa and Raphael will use fan out packages.
While I don't know his track record, I have seen him active with a lot of well known industry folks.

Nevertheless, this is rather in line with the LinkedIn snippets I dug up regarding GMI3 SI/PI being modeled for X3D
Also much thinner substrate this time around for SP5 and AM5.
Zen 4 could be a good time to do the FO migration, with the full architecture update coming in Zen 5 along with a node shrink.
Hopefully this means direct compute chiplet communication? Though, it could just as well only be for IO-compute chiplet connections, bah.

In magical Christmas land where everything my layman mind can dream comes true, we'd get direct die-to-die CCDs, and they'd use each other's massive stacked L3 as an L4 too. Like the Zen 3 CCX unification all over again almost.

EDIT: in hindsight I'm not sure why I was thinking any of this would lead to direct CCD-CCD communication.
 