Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging looks to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

Untitled2.png


What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
It was from the Zen 3 announcement stream October 2020:

On Zen3 AMD told us they have completed the design: https://www.tweaktown.com/news/67008/amds-new-zen-3-design-complete-4-ready-2021/index.html

We haven't heard something like this for Zen 4.
There's an investor day on May 19th so we may get some news then.

That being said I doubt they want to do that with capacity constraint meaning they have not sold as much Zen3 product as they would wish to at this point.

I'd expect them to want as much ROI on Zen3 as they possibly can before telegraphing Zen4 readiness causing people like us to dig in and wait for it.

Though Rembrandt is still to come anyways - that will make them a nice return in mobile/laptops, SFF and embedded while Zen4 Raphael is having its big moment in the mainstream desktop segment.
 

Tarkin77

Member
Mar 10, 2018
70
147
106
There's an investor day on May 19th so we may get some news then.

That being said I doubt they want to do that with capacity constraint meaning they have not sold as much Zen3 product as they would wish to at this point.

I'd expect them to want as much ROI on Zen3 as they possibly can before telegraphing Zen4 readiness causing people like us to dig in and wait for it.

Though Rembrandt is still to come anyways - that will make them a nice return in mobile/laptops, SFF and embedded while Zen4 Raphael is having its big moment in the mainstream desktop segment.
It's their annual stockholder meeting; they usually don't reveal new stuff there.

The next important/interesting event is a Computex Keynote with Dr. Su.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
Stockholder meeting will probably be just as interesting since there are really just as many questions about when they can be expected to satisfy current demand for various product lines as there might be about future ones. Obviously they won't (or probably can't) give too many specific answers, but even some general guidelines might be helpful.
 

tomatosummit

Member
Mar 21, 2019
184
177
116
From this article https://www.hardwaretimes.com/amd-i...per-epyc-trento-cpus-via-infinity-fabric-3-0/ it seems that the modification made to EPYC Trento, which will be in Frontier, is support for 4 MI200 GPUs via Infinity Fabric 3.0 on a mesh interconnect. So with two EPYC Trento on the same motherboard, does that mean support for a maximum of 8 MI200 GPUs?

The roadmap was revealed a while ago and shows where they're intending to go with it. The PCIe complex in AMD CPUs is multipurpose and already connects socket to socket coherently. The 4-GPU link is just a direct PCIe/CCIX/IF link to each GPU plus a ring connection between the GPUs themselves. The real difference from what's available to consumers is that the GPU link is memory coherent.
I think the supercomputer nodes are sticking to 1 CPU + 4 GPUs, with eight of those complexes in a single tray, according to what Cray has revealed.
amd mgpu stuff.jpg
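
As a rough illustration of that layout, here is a minimal Python sketch that just counts links and GPUs. It assumes exactly what the post describes (one direct CPU link per GPU, a ring between the 4 GPUs, eight such nodes per tray); the function and node names are made up for illustration and do not come from any AMD or Cray documentation.

```python
# Toy model of the node layout described above: 1 CPU + 4 GPUs,
# one direct CPU<->GPU link per GPU and a ring among the GPUs.
GPUS_PER_NODE = 4    # from the post
NODES_PER_TRAY = 8   # from the post ("eight of those complexes")


def node_links(gpus: int) -> list[tuple[str, str]]:
    """Return the links inside one node (hypothetical naming)."""
    cpu_links = [("cpu", f"gpu{i}") for i in range(gpus)]
    # Ring: each GPU connects to the next one, the last wraps to the first.
    ring_links = [(f"gpu{i}", f"gpu{(i + 1) % gpus}") for i in range(gpus)]
    return cpu_links + ring_links


links = node_links(GPUS_PER_NODE)
gpus_per_tray = GPUS_PER_NODE * NODES_PER_TRAY
print(len(links), "links per node")
print(gpus_per_tray, "GPUs per tray")
```

With 4 GPUs that is 4 CPU links plus 4 ring links per node, and 32 GPUs per tray if the eight-per-tray figure holds.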
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
Related to above post
Linux 5.14 will be the kernel version with initial support of AMD's cache coherent heterogeneous memory management

amdkfd:
- Initial SR-IOV support for Aldebaran
- Topology fixes
- Initial HMM SVM support
- Misc code cleanups and bug fixes

Update:
I am very surprised that Intel let this one through so fast.
For 2 years they were blocking AMD's cgroup implementation for GPU, and then last month they came up with an implementation of their own... using some of the rejected implementation from AMD.
AMD's Linux support has been excellent lately: HMM, GPU cgroup, CRIU, multi-GPU VM.
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,572
146

On this slide Rome is listed as 2019Q2 - 2020Q1, and launched in 2019Q3.

Milan is listed as 2020Q3 - 2021Q2, and launched in 2021Q1 (barely). That said, it was in a state where AMD could have launched it in late 2020 as well, but because of the Ice Lake delays they held Milan back to tidy up the firmware side as much as possible and smooth out the launch.

Now we have Genoa listed as 2022 Q2 - 2022 Q4. Should be safe to say that you should expect it to launch within that range of dates.
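
The same comparison can be sketched in a few lines of Python, assuming the roadmap windows and launch quarters exactly as quoted in this post (Milan's launch year taken as 2021). The helper names are hypothetical and only there to make the comparison explicit.

```python
def quarter_index(year: int, quarter: int) -> int:
    """Map (year, quarter) to a single increasing integer."""
    return year * 4 + (quarter - 1)


# (window start, window end, actual launch) as stated in the post.
ROADMAP = {
    "Rome": ((2019, 2), (2020, 1), (2019, 3)),
    "Milan": ((2020, 3), (2021, 2), (2021, 1)),
}


def launched_in_window(name: str) -> bool:
    """True if the launch quarter falls inside the listed roadmap window."""
    start, end, launch = (quarter_index(*q) for q in ROADMAP[name])
    return start <= launch <= end


for name in ROADMAP:
    print(name, launched_in_window(name))

# Genoa has no launch date yet; only the listed window 2022Q2 - 2022Q4.
genoa_window = quarter_index(2022, 4) - quarter_index(2022, 2) + 1
print("Genoa window length in quarters:", genoa_window)
```

Both earlier parts landed inside their listed windows, which is the only basis for expecting the same of Genoa; two data points are not much of a trend.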
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
That is the embedded roadmap though.

EPYC:

Embedded EPYC:

It isn't saying Milan should have been 2020Q3-2021Q2;

It is saying Embedded Milan should have been 2020Q3-2021Q2;

Also, the Google date for the Embedded website is effectively Aug 4, 2020.
It was only listed on Aug 5, 2020.

Meaning that the Embedded Epyc 7001/7002 only launched then in August 2020.

Jan 19, 2021 upload date
Googling it with "V3000"/"Zen4"/"7004" points it to having the slide.

Identifying the asterisk:
*AMD roadmaps are subject to change without notice or obligations to notify of changes.
Placement of boxes is not intended to represent first year of product shipment.
 

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
That is the embedded roadmap though.

EPYC:

Embedded EPYC:

It isn't saying Milan should have been 2020Q3-2021Q2;

It is saying Embedded Milan should have been 2020Q3-2021Q2;

Also, the Google date for the Embedded website is effectively Aug 4, 2020.
It was only listed on Aug 5, 2020.

Meaning that the Embedded Epyc 7001/7002 only launched then in August 2020.

Jan 19, 2021 upload date
Googling it with "V3000"/"Zen4"/"7004" points it to having the slide.

Identifying the asterisk:
*AMD roadmaps are subject to change without notice or obligations to notify of changes.
Placement of boxes is not intended to represent first year of product shipment.
Wait a second here....

There's an embedded EPYC???!!!
 
May 17, 2020
122
233
86
Yes, embedded EPYC exists: https://www.amd.com/en/processors/embedded-epyc-series and the 3000 series has been around for a while.

Jan 19, 2021 upload date
Googling it with "V3000"/"Zen4"/"7004" points it to having the slide.

Identifying the asterisk:
*AMD roadmaps are subject to change without notice or obligations to notify of changes.
Placement of boxes is not intended to represent first year of product shipment.
Sadly, the link to the AMD presentation hosted by aicipc doesn't work.

AMD released the V2000 series last year, but it doesn't appear on the roadmap.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
Wait a second here....

There's an embedded EPYC???!!!
Yep, and the embedded series launch and become available after the normal series.

On another note, Supermicro has leaked that the "Icelake" server platform and the "Genoa" server platform are launching in the same year.
=> https://learn-more.supermicro.com/jump-start-program // Later in the year, this should include "Genoa" / "SP5"
February 16, 2021 (Icelake's Jumpstart) -> sometime later in 2021 (Genoa's Jumpstart)

The "Genoa" references at Supermicro are in the past tense, not present or future tense. Quanta Computer Incorporated is also using past-tense phrasing for "Genoa", which makes me think hyperscale data centers already have SP5 platforms in some form.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
AMD has had an extensive EPYC embedded portfolio since the Zen1/+ days. The embedded Snowy Owl products were the only way to get at the 4 ten-Gigabit MACs in the original zeppelin die.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
How does it work then? I am curious because I have not gone through the JEDEC spec, only the AT overview
and the Rambus writeup, and this is what I can surmise:

View attachment 40432
Channels A and B can receive independent commands and addresses from the MC. There are also dedicated data buffers for channel A and channel B.
Channel A can perform operations while channel B is in recovery.
This lets a smart MC optimize addressing so that queued memory requests can be handled in parallel instead of waiting out the recovery time as they would on a single channel.
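
To make the overlap argument concrete, here is a toy scheduling model in Python. Everything in it is an assumption for illustration: the time units are arbitrary, the busy and recovery values are invented rather than taken from the JEDEC spec, all requests are available at time 0, and any request may go to either subchannel. The two narrower subchannels are given twice the transfer time per request, on the assumption that each moves the same amount of data over half the width.

```python
def finish_time(n_requests: int, n_channels: int, busy: int, recovery: int) -> int:
    """Time at which the last request's data transfer completes.

    Each request occupies a channel for `busy` units, after which that
    channel needs `recovery` units before it can start the next request.
    Requests are handed to whichever channel becomes ready first.
    """
    ready = [0] * n_channels  # next time each channel can accept a command
    last_done = 0
    for _ in range(n_requests):
        ch = min(range(n_channels), key=lambda c: ready[c])
        done = ready[ch] + busy
        ready[ch] = done + recovery
        last_done = max(last_done, done)
    return last_done


# One wide channel: short transfers, but every recovery is fully exposed.
single = finish_time(8, n_channels=1, busy=4, recovery=6)
# Two half-width subchannels: transfers take twice as long,
# but one subchannel can work while the other recovers.
dual = finish_time(8, n_channels=2, busy=8, recovery=6)
print(single, dual)
```

With these made-up numbers the dual-subchannel case finishes earlier even though each transfer is slower. How much of that survives in practice depends on real timings and on how well the MC spreads addresses across the subchannels.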

A Corsair blog post says basically the same thing as what I wrote:


With DDR5, individual modules are split into two separate channels by design, allowing for shorter traces that contribute to less latency and higher speeds when it comes to communicating with individual memory ICs on a memory module. This also allows for what’s referred to as command/address mirroring since the signal from the CPU has to travel a shorter overall path to access specific banks of memory whereas in DDR4 a command/address signal had to travel through all banks of memory in a longer chain.

On DDR4, when a single bank of memory needed to be refreshed, the CPU had to wait for all banks of memory on a module to refresh before doing another read or write. With DDR5, we’ve got double the bank groups and when a bank needs to be refreshed only the same bank of each group is refreshed, allowing for the other memory bank groups to be accessed without the CPU having to wait.
If the IMC is improved to take advantage of this architecture, we can expect some gains in applications which have big data sets that spill beyond the L3.

Zen4 APUs will have respectable graphics capability if they have some kind of SLC as well. 100GB/s+ BW in dual kit mode.
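
For a sense of where a 100 GB/s+ figure could come from, here is a quick peak-bandwidth calculation in Python. It reads "dual kit mode" as a conventional 128-bit dual-channel setup, and the DDR5-4800 and DDR5-6400 speed grades are assumptions chosen for illustration; none of these numbers are confirmed for any Zen4 APU.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: int, bus_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for a given bus."""
    bytes_per_transfer = bus_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Assumed speed grades, 128-bit total bus (two 64-bit DIMM channels).
ddr5_4800 = peak_bandwidth_gbs(4800, 128)
ddr5_6400 = peak_bandwidth_gbs(6400, 128)
print(f"DDR5-4800: {ddr5_4800:.1f} GB/s")
print(f"DDR5-6400: {ddr5_6400:.1f} GB/s")
```

Under those assumptions, only the faster grade crosses 100 GB/s, and these are theoretical peaks rather than sustained throughput.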

On the server side, nice timing to move to 96 cores with DDR5. Those high core count parts could really stretch their legs.
Zen4 also supports 57-bit VA, so it should be able to handle insane amounts of memory.
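
To size that up, here is a small Python calculation comparing a 57-bit virtual address space with the 48-bit space that 4-level paging on x86-64 provides. It only shows the theoretical size of the address space, not how much memory any platform can actually populate.

```python
TIB = 1 << 40  # tebibyte
PIB = 1 << 50  # pebibyte


def address_space_bytes(bits: int) -> int:
    """Number of bytes addressable with the given address width."""
    return 1 << bits


va48_tib = address_space_bytes(48) // TIB
va57_pib = address_space_bytes(57) // PIB
ratio = address_space_bytes(57) // address_space_bytes(48)
print(f"48-bit VA: {va48_tib} TiB")
print(f"57-bit VA: {va57_pib} PiB")
print(f"ratio: {ratio}x")
```

That is 256 TiB versus 128 PiB, a 512x larger virtual address space.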
 

Hougy

Member
Jan 13, 2021
77
60
61
Why doesn't AMD use the high density process? Wouldn't the much higher IPC made possible by many more transistors make up for the lost frequency? Plus, it would be much more energy efficient
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
Why doesn't AMD use the high density process? Wouldn't the much higher IPC made possible by many more transistors make up for the lost frequency? Plus, it would be much more energy efficient
The Zen architecture is a high frequency oriented design. Maybe after Zen, AMD will go the other route, especially if future nodes (beyond what we know) will be trending towards lower peak frequencies.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
Why doesn't AMD use the high density process? Wouldn't the much higher IPC made possible by many more transistors make up for the lost frequency? Plus, it would be much more energy efficient
I'm pretty sure since Zen2/3, AMD has been using the most dense libs available.

Ex:
ZenLibs.png

The actual design reasons for AMD going for Hi-Freq, rather than going full on Hi-Inst will probably remain unknown. For all we know it could be marketing-sided rather than engineer-sided. As it is psychologically easier to sell a chip that has increased frequency over the last generation.

Above, extended w/ actual high-end on AMx:
Ryzen 7 1800X = 4.1 GHz boost
Ryzen 7 2700X = 4.3 GHz boost
Ryzen 9 3950X = 4.7 GHz boost
Ryzen 9 5950X = 4.9 GHz boost
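
For reference, the generation-to-generation gains implied by that list can be worked out with a few lines of Python. The boost clocks are taken as listed above and nothing else is assumed.

```python
# Boost clocks in GHz, keyed by model number, as listed above.
boost_ghz = {"1800X": 4.1, "2700X": 4.3, "3950X": 4.7, "5950X": 4.9}

clocks = list(boost_ghz.values())
# Percentage gain of each part over the one before it.
step_gains = [(b / a - 1) * 100 for a, b in zip(clocks, clocks[1:])]
total_gain = (clocks[-1] / clocks[0] - 1) * 100

for model, gain in zip(list(boost_ghz)[1:], step_gains):
    print(f"{model}: +{gain:.1f}% over previous")
print(f"total: +{total_gain:.1f}%")
```

That works out to roughly +5%, +9% and +4% per step, or just under +20% from the first part to the last.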
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
I'm pretty sure since Zen2/3, AMD has been using the most dense libs available.

Ex:
View attachment 44460

The actual design reasons for AMD going for Hi-Freq, rather than going full on Hi-Inst will probably remain unknown. For all we know it could be marketing-sided rather than engineer-sided. As it is psychologically easier to sell a chip that has increased frequency over the last generation.

Above, extended w/ actual high-end on AMx:
Ryzen 7 1800X = 4.1 GHz boost
Ryzen 7 2700X = 4.3 GHz boost
Ryzen 9 3950X = 4.7 GHz boost
Ryzen 9 5950X = 4.9 GHz boost
Zen3 is using N7 HD indeed. While RDNA is using N7 HP.
For N5, the optimal range for Zen 4 would be around 3.8-4.2 GHz according to the Shmoo plot; beyond that, a considerable jump in voltage would be needed to reach the same frequencies as the 5950X, for example.
One of the advantages of designs that top out at ~3.2 GHz is that they sit well below this range, resulting in big gains in efficiency.
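
The efficiency point follows from the usual first-order rule that dynamic power scales with capacitance times voltage squared times frequency. Here is a minimal Python sketch of that rule. All voltages are hypothetical values picked for illustration and are not read off the Shmoo plot; leakage and everything else beyond the simple rule is ignored.

```python
def relative_dynamic_power(freq_ghz: float, volts: float,
                           ref_freq_ghz: float, ref_volts: float) -> float:
    """Dynamic power relative to a reference point, using P ~ C * V^2 * f."""
    return (freq_ghz / ref_freq_ghz) * (volts / ref_volts) ** 2


# Reference operating point: 4.0 GHz at a hypothetical 1.0 V.
REF_F, REF_V = 4.0, 1.0

# Hypothetical high-clock point: 4.9 GHz needing 1.3 V.
high = relative_dynamic_power(4.9, 1.3, REF_F, REF_V)
# Hypothetical low-clock point: 3.2 GHz running at 0.8 V.
low = relative_dynamic_power(3.2, 0.8, REF_F, REF_V)

print(f"4.9 GHz @ 1.3 V: {high:.2f}x power for {4.9 / REF_F:.2f}x clock")
print(f"3.2 GHz @ 0.8 V: {low:.2f}x power for {3.2 / REF_F:.2f}x clock")
```

With these made-up voltages, the last 22% of clock costs about twice the power, while giving up 20% of clock roughly halves it. That is the shape of the trade-off being described, even if the real numbers differ.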

1621235308312.png

Why doesn't AMD use the high density process? Wouldn't the much higher IPC made possible by many more transistors make up for the lost frequency? Plus, it would be much more energy efficient
My thinking is that when they originally designed Zen1-Zen3 around the CCX concept, they intended the core to be very small and easily manufacturable. It was supposed to be cheap to produce, and the high clocks could help get more performance.
The Zen3 core up to and including L2 is quite small in comparison to most contemporary designs. The Zen3 core (without L3) is less than half the MTr of M1.
However, when Zen2 and later Zen3 landed, they needed to tack on the big L3 to handle the weakness of the memory hierarchy with multiple CCXs.
In the end Zen3 got big anyway. Also, making the MTr of the active core too high while operating at high frequency would have increased the TDP by a bit.

For Zen4, with this learning, it will be interesting to see whether AMD raises the transistor count drastically or again sticks with a smaller core.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Going with high density libraries is the sane choice since it leaves all paths open for the actual design implementation, whereas with high performance the way back to higher energy efficiency is harder. Think of it this way: With HD you essentially get to work with a finer grid for your design. This doesn't mean you have to make actual use of all the density, instead the finer grid allows finer selective placing of dense area, dark silicon and deliberate spacing for achieving higher frequencies.

See also our old thread on this topic: https://forums.anandtech.com/thread...ets-warning-many-images.2577325/post-40075202
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
Going with high density libraries is the sane choice since it leaves all paths open for the actual design implementation, whereas with high performance the way back to higher energy efficiency is harder. Think of it this way: With HD you essentially get to work with a finer grid for your design. This doesn't mean you have to make actual use of all the density, instead the finer grid allows finer selective placing of dense area, dark silicon and deliberate spacing for achieving higher frequencies.

See also our old thread on this topic: https://forums.anandtech.com/thread...ets-warning-many-images.2577325/post-40075202
Oh, right. AMD can just use design rules that are stricter than the PDK allows to minimize hot spotting, and they can also use xtor in parallel to increase drive current and maintain high switching speeds (like in the execution pipelines).
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
Zen3 is using N7 HD indeed. While RDNA is using N7 HP.
For N5, the optimal range for Zen 4 would be around 3.8-4.2 GHz according to the Shmoo plot; beyond that, a considerable jump in voltage would be needed to reach the same frequencies as the 5950X, for example.
One of the advantages of designs that top out at ~3.2 GHz is that they sit well below this range, resulting in big gains in efficiency.

View attachment 44466

Ugh, hope that's an early sample :(. That, or there are a lot of tricks in Zen4 to increase throughput within the cores (aka more IPC, which is a misnomer). Moving more towards an Apple A (or M) series CPU architecture??
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
According to ExecutableFix on Twitter, AMD's next socket will be LGA-1718 and will support PCIe 4.0 and DDR5 with a 600-series chipset. This says to me that we might see Zen 4 sooner rather than later. DDR5 will probably require a new version of IF, so I really don’t see a basic “Zen 3+“ being pushed out, as Zen 3 would need to be modified to support everything.