Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)

Vattila · Oct 6, 2019

Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging looks to be the same, with the same 9-die chiplet design and the same maximum core and thread-count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides did also show that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts!

jpiniero · Jun 7, 2021

https://twitter.com/x/status/1401862974986285065

This guy seems to think Zen 4 (desktop?) will be Q4 2022.

Kepler_L2 · Jun 7, 2021

jpiniero said:
https://twitter.com/x/status/1401862974986285065

This guy seems to think Zen 4 (desktop?) will be Q4 2022.

I think he's wrong but we'll see. This would mean over 3 years of development for Zen 4 and 2.5 years for RDNA3, much longer than usual for AMD.

jpiniero · Jun 7, 2021

Kepler_L2 said:
I think he's wrong but we'll see. This would mean over 3 years of development for Zen 4 and 2.5 years for RDNA3, much longer than usual for AMD.

It could be just that the Epyc/MI products are coming first and there isn't room to do the client products until later.

Kepler_L2 · Jun 7, 2021

jpiniero said:
It could be just that the Epyc/MI products are coming first and there isn't room to do the client products until later.

MI200 is coming this year, MI300 should be after RDNA3.

EPYC does release early for the big boys (google, facebook, tencent, etc.) but general availability is usually after the desktop release. Lisa Su confirmed recently that Genoa will launch 2022, so imo we should expect Raphael Q3 and Genoa Q4.

exquisitechar · Jun 7, 2021

Kepler_L2 said:
MI200 is coming this year, MI300 should be after RDNA3.

EPYC does release early for the big boys (google, facebook, tencent, etc.) but general availability is usually after the desktop release. Lisa Su confirmed recently that Genoa will launch 2022, so imo we should expect Raphael Q3 and Genoa Q4.

It has been rumored that Genoa will launch before Raphael, I think.

Kepler_L2 said:
I think he's wrong but we'll see. This would mean over 3 years of development for Zen 4 and 2.5 years for RDNA3, much longer than usual for AMD.

I believe it's true that RDNA3 hasn't been taped out yet, at least. Unfortunately, the rest might be too.

CakeMonster · Jun 7, 2021

jpiniero said:
https://twitter.com/x/status/1401862974986285065

This guy seems to think Zen 4 (desktop?) will be Q4 2022.

My bet would also be Q4 '22 since we have confirmation of a Zen3 with more cache. Gives them time to get everything right and build up some stock + account for unforeseen problems.

I'm not making shit up for Twitter views though, I'm just a clueless hardware fan who goes with what makes most sense given my very limited knowledge.

maddie · Jun 7, 2021

Kepler_L2 said:
I think he's wrong but we'll see. This would mean over 3 years of development for Zen 4 and 2.5 years for RDNA3, much longer than usual for AMD.

TSMC 5nm on 5nm die stacking won't be available until Q3 2022. The consensus is that this is the future, so expected. In fact, this is the earliest timeframe possible if Zen4 has die stacking as standard and not an add on.

Production timeframes are not only design limited but also if the product can be made.

exquisitechar · Jun 7, 2021

maddie said:
TSMC 5nm on 5nm die stacking won't be available until Q3 2022. The consensus is that this is the future, so expected. In fact, this is the earliest timeframe possible if Zen4 has die stacking as standard and not an add on.

It won't be standard, I'm sure.

moinmoin · Jun 7, 2021

Depending on source El Capitan, which uses Genoa, is to be delivered in late 2022, early 2023 or by mid-2023 at the latest so Genoa has to be in production before then.

HurleyBird · Jun 7, 2021

Kepler_L2 said:
I think he's wrong but we'll see. This would mean over 3 years of development for Zen 4 and 2.5 years for RDNA3, much longer than usual for AMD.

I think it will depend a lot on how Alder Lake compares with Zen3D.

DrMrLordX · Jun 7, 2021

moinmoin said:
Depending on source El Capitan, which uses Genoa, is to be delivered in late 2022, early 2023 or by mid-2023 at the latest so Genoa has to be in production before then.

It is theoretically possible that there are Zen4 versions of EPYC without v-cache.

CakeMonster · Jun 7, 2021

HurleyBird said:
I think it will depend a lot on how Alder Lake compares with Zen3D.

I can't imagine a Z3 with more cache and 16c/32t would have any problems compared to AD at all. Also with the 'budget' option of 5950x for that same time period.

moinmoin · Jun 7, 2021

DrMrLordX said:
It is theoretically possible that there are Zen4 versions of EPYC without v-cache.

Sure, but if launching v-cache in datacenters now is an issue of validation, validating and launching Epyc 7004 without it would be very odd.

DisEnchantment · Jun 7, 2021

DrMrLordX said:
It is theoretically possible that there are Zen4 versions of EPYC without v-cache.

I would say this is almost guaranteed.
Lambda services, load balancers, proxy servers, REST/gRPC API gateways etc scale well with cores. Not sure if these folks hosting such services wanna pay premium for cache heavy SKUs which won't bring any gain vs regular EPYC 7002/3 type cache SKUs.
I would even suggest that Altra even cut the cache for their 128 core chip which is squarely aimed at nginx type loads.
The weaker but more cores and with lesser cache did well in such tests on Phoronix.

maddie · Jun 7, 2021

exquisitechar said:
It won't be standard, I'm sure.

Why not? Is it unreasonable to think that the top of the range will have die stacking as standard? Standard does not mean all products having it, and as any new release is a top down one, then Q4, at earliest, it must be.

Doug S · Jun 7, 2021

maddie said:
TSMC 5nm on 5nm die stacking won't be available until Q3 2022. The consensus is that this is the future, so expected. In fact, this is the earliest timeframe possible if Zen4 has die stacking as standard and not an add on.

Production timeframes are not only design limited but also if the product can be made.

If they are waiting a full two years before the initial 5nm ramp to make 5nm die stacking available, that would 100% be because of customer scheduling. If Apple was going to use it they would have had it available much earlier, so you can probably use it to tell when those AMD products will ship (though it is possible they have other customers wanting to use it we don't know about)

maddie · Jun 7, 2021

Doug S said:
If they are waiting a full two years before the initial 5nm ramp to make 5nm die stacking available, that would 100% be because of customer scheduling. If Apple was going to use it they would have had it available much earlier, so you can probably use it to tell when those AMD products will ship (though it is possible they have other customers wanting to use it we don't know about)

I actually don't understand your points. What was the time lag for 7nm? AMD designed the CCDs for it knowing that it would only be available for use in Q4 2021. It takes time to R&D new techniques. Why would you think that they could do it earlier but held back? someone could use that argument for almost everything. Why 3d stacking only now? Why chiplets only recently? Why many others.

DrMrLordX · Jun 7, 2021

DisEnchantment said:
I would say this is almost guaranteed.
Lambda services, load balancers, proxy servers, REST/gRPC API gateways etc scale well with cores. Not sure if these folks hosting such services wanna pay premium for cache heavy SKUs which won't bring any gain vs regular EPYC 7002/3 type cache SKUs.
I would even suggest that Altra even cut the cache for their 128 core chip which is squarely aimed at nginx type loads.
The weaker but more cores and with lesser cache did well in such tests on Phoronix.

Bear in mind that, in the case of v-cache, it isn't a matter of sacrificing area that could be used for cores in favor of cache (as was the case with the Altra). It's more as @moinmoin indicated - waiting for validation. AMD can get you Genoa today without v-cache, or you can wait a year and get it with v-cache if your workload would actually benefit from the extra L3.

Not all workloads benefit from L3.

Genoa DOES offer an increase in core count vs. Milan, so it isn't necessarily a choice between Genoa w/out v-cache vs. Genoa with v-cache. It's a matter of choosing between Milan-X and Genoa-not-X.

DisEnchantment · Jun 7, 2021

DrMrLordX said:
Bear in mind that, in the case of v-cache, it isn't a matter of sacrificing area that could be used for cores in favor of cache (as was the case with the Altra). It's more as @moinmoin indicated - waiting for validation. AMD can get you Genoa today without v-cache, or you can wait a year and get it with v-cache if your workload would actually benefit from the extra L3.

Not all workloads benefit from L3.

Genoa DOES offer an increase in core count vs. Milan, so it isn't necessarily a choice between Genoa w/out v-cache vs. Genoa with v-cache. It's a matter of choosing between Milan-X and Genoa-not-X.

Mmmm ... I am not sure I understood the relation between Milan-X and Genoa, but what I am trying to say is that there are lots of customers who would be interested in a plain Genoa without the V Cache, especially if it comes at a lower cost. The whole point with V Cache is to have another tool (like IF for scaling cores) to scale the end product, be it cache, core count and so on.

Therefore I believe AMD would definitely offer such a high core count Genoa SKU w/o V Cache, because it is suitable for many common loads.
We have a whole bunch of services on Azure/AKS that does nothing but authenticate and process request/response to/from the worker nodes within our DMZ and doing nothing but the very bare minimum of operations and processing at most a dozen bytes of data. Changing the instance type does nothing for us, changing vCPU count makes a difference. For such a service I would select an SKU for my Azure Subscription that allows me the highest vCPU count possible which is what we did.

JoeRambo · Jun 7, 2021

DisEnchantment said:
We have a whole bunch of services on Azure/AKS that does nothing but authenticate and process request/response to/from the worker nodes within our DMZ and doing nothing but the very bare minimum of operations and processing at most a dozen bytes of data. Changing the instance type does nothing for us, changing vCPU count makes a difference. For such a service I would select an SKU for my Azure Subscription that allows me the highest vCPU count possible which is what we did.

I think eventually ARM will eat both AMD/Intel in such usages. Graviton X or Altra Biturbo or whatever they will call it is perfect fit for such workloads, going to be unbeatable in all metrics relevant for cloud and usecases like Your AWS stuff.
AMD should be shooting for higher performance segment, and with V-Cache, ZEN3 they are doing just that. 8C chiplet with 96MB of private L3 is force is nice unit of computing for those non-trivial computing backend nodes.

Doug S · Jun 7, 2021

maddie said:
I actually don't understand your points. What was the time lag for 7nm? AMD designed the CCDs for it knowing that it would only be available for use in Q4 2021. It takes time to R&D new techniques. Why would you think that they could do it earlier but held back? someone could use that argument for almost everything. Why 3d stacking only now? Why chiplets only recently? Why many others.

What I'm saying is that the time lag depends on having customers who want to use it. Sure it takes time to R&D new techniques, but doing the same thing with 5nm they are doing with 7nm is not a "new technique".

What would be the point of having all the equipment ready and employees trained to handle stacking 5nm chips when Apple is your only 5nm customer and they aren't interested in doing it? They are releasing it in Q3 2022 because that's when the first customer is ready to take delivery of 3D stacked 5nm chips.

If Apple had wanted to 3D stack the A15, you can bet they would have had it ready by Q3 2021 instead of Q3 2022.

itsmydamnation · Jun 7, 2021

JoeRambo said:
I think eventually ARM will eat both AMD/Intel in such usages. Graviton X or Altra Biturbo or whatever they will call it is perfect fit for such workloads, going to be unbeatable in all metrics relevant for cloud and usecases like Your AWS stuff.
AMD should be shooting for higher performance segment, and with V-Cache, ZEN3 they are doing just that. 8C chiplet with 96MB of private L3 is force is nice unit of computing for those non-trivial computing backend nodes.

its funny with Zen 1 of the big iron cpu's ( regardless of ISA) AMD would have had to be the least preferred CPU uarch for big DB. Now with Zen3 + Vcache it has to be the most, only down side right now it maximum memory pool because of only 2P.

MadRat · Jun 7, 2021

eek2121 said:
Zen 4, for example, has a GPU included.

GPU on die sounds like a great business case for at least 2GB of HBM. Aim at the consumers that don't want top of the line 4K or higher gaming. (Many are content at 1920x1280 @ 30fps.) Leaves top end video cards securely in hands of Bitcoin mining.

It's not like they have an endless glut of video card stock to sell off. Great timing for it.

maddie · Jun 7, 2021

Doug S said:
What I'm saying is that the time lag depends on having customers who want to use it. Sure it takes time to R&D new techniques, but doing the same thing with 5nm they are doing with 7nm is not a "new technique".

What would be the point of having all the equipment ready and employees trained to handle stacking 5nm chips when Apple is your only 5nm customer and they aren't interested in doing it? They are releasing it in Q3 2022 because that's when the first customer is ready to take delivery of 3D stacked 5nm chips.

If Apple had wanted to 3D stack the A15, you can bet they would have had it ready by Q3 2021 instead of Q3 2022.

I have to disagree.

What makes you think that stacking 5nm is the same as stacking 7nm?

TSMC has already stated, and was already commented on here, that the copper pillar interconnect spacing is going to decrease by a factor of 10 over time. Advancing the state of the art is and will be continuing.

jamescox · Jun 7, 2021

scineram said:
They need to drastically increase L1 as well to get that IPC uplift.
Also my impression is the GPU is integrated into the IOD. That is what the old image GN shared showed as well.

There are a lot of possibilities if stacking is involved. It could be directly on the IO die or it could be a chiplet stacked on top of the IO die or a chiplet stacked on top of a larger interposer with other chiplets. If the IO die is made on the latest process, then it may make sense for it to be directly on the same die. For lower cost systems, it would make sense for it to just be directly on the IO die with no stacking; basically the same as current Zen 3.

That kind of comes back to making a chip for stacking on an interposer vs. a non-stacked solution. If they make a cpu chiplet specifically designed for stacking, then how do you do a lower end design where stacking is possibly too expensive? Do they make two different chiplets? It seems like they wouldn’t do two different chiplets. It seems like the lower end would just be a fully integrated APU. Where does the IO die with graphics fit? What market does it cover? It might be that the integrated graphics is so much better than previous solutions (due to DDR5 or large caches or something) that it can compete well with low end discrete graphics. That would change the marketing positions if the integrated IO die graphics was sufficient for 1080p. I have been suspicious of graphics in the IO die due to the market segmentation. If you are going for cheap, then a monolithic APU seems to make more sense. It does make some sense to have some graphics functionality across the whole line though.

Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)

Senior member

Lifer

Senior member

Lifer

Senior member

Senior member

Golden Member

Diamond Member

Senior member

Diamond Member

Platinum Member

Lifer

Golden Member

Diamond Member

Golden Member

Diamond Member

Platinum Member

Diamond Member

Lifer

Golden Member

Golden Member

Platinum Member

Platinum Member

Lifer

Diamond Member

Senior member