Discussion Speculation: Zen 4 (EPYC 4 "Genoa", Ryzen 7000, etc.)


Vattila

Senior member
Oct 22, 2004
799
1,351
136
Except for the details about the improvements in the microarchitecture, we now know pretty well what to expect with Zen 3.

The leaked presentation by AMD Senior Manager Martin Hilgeman shows that EPYC 3 "Milan" will, as promised and expected, reuse the current platform (SP3), and the system architecture and packaging look to be the same, with the same 9-die chiplet design and the same maximum core and thread count (no SMT-4, contrary to rumour). The biggest change revealed so far is the enlargement of the compute complex from 4 cores to 8 cores, all sharing a larger L3 cache ("32+ MB", likely to double to 64 MB, I think).

Hilgeman's slides also showed that EPYC 4 "Genoa" is in the definition phase (or was at the time of the presentation in September, at least), and will come with a new platform (SP5), with new memory support (likely DDR5).

[Attached image: Untitled2.png]

What else do you think we will see with Zen 4? PCI-Express 5 support? Increased core-count? 4-way SMT? New packaging (interposer, 2.5D, 3D)? Integrated memory on package (HBM)?

Vote in the poll and share your thoughts! :)
 

amd6502

Senior member
Apr 21, 2017
971
360
136
Eh, the TDP being taken up by the IO chiplet is way too high. Power efficiency is probably one good reason among many that both Amazon and, as rumored, Microsoft are looking at or already building their own server CPUs. AMD is no doubt cognizant of this and will be looking at vastly increasing efficiency where it can. All the profit margins in the world aren't helpful if there aren't any sales, and AMD would be wise not to get too far ahead of itself with its recent success.

Zen 4 APUs will probably get the new RDNA2 iGPU design over anything else initially. The more interesting concept might be large chiplet APUs. Shoving the same CU count as a 65/600xt into a chiplet to compete with the M1x and future variants seems entirely plausible. People being able to do professional video work on a laptop, for example, is awesome for that market segment, and there's also motivation for AMD to compete in other segments. A 3-4 chiplet APU (one monolithic die with 8 cores/IO/media, 1-2 64 MB V-Cache chiplets, and a big GPU chiplet) could still save die space over separate GPU/CPU setups, be enough power for a giant segment of the market including gamers, and open up desktop form factors like mini PCs to more uses for AMD. Imagine how well an 8-core CPU/11 teraflop RDNA3 GPU combo would sell at around $400-500. Seems like an all-around win.

Well, the server IO hub is going to be completely different and much larger than the consumer IOX. I think most expect those server IO chiplets for multi-CCD setups would be worth shrinking to more advanced nodes like 6/7nm, because of the importance of the energy efficiency factor you mention (likely even with advanced features like stacked memory or more cache).

For the consumer side, I was wondering what the odds are that they will somewhat deviate from the current strategy of using server oriented CCD for enthusiast and diy desktop, and 8c-monolithic OEM-oriented APU silicon for the OEM desktop/AiO and high end mobile.

They have a strong history of liking powers of 2, but my thought was that Zen4 will be more powerful, and that the sweet spot to distinguish itself from the current products (which are quite good already) might be something between a 4c and 8c APU for the monolithic APU. This APU would be optionally complemented in the OEM market by server-oriented CCD 8c Zen4 Ryzens in a 1+1 MCM/chiplet configuration, by integrating a small iGPU onto the IOX. I think 1+1 MCM would still be pretty energy efficient; definitely not a power budget outside of what would be in an All-in-One consumer desktop.

So, that is, two projects, a consumer IOX with a small gpu, and a monolithic M1 6c/12t competitor, both of which would have DDR4 plus DDR5 ability.
 

dr1337

Senior member
May 25, 2020
311
514
106
Charlie teases that for Genoa there'll be a smaller Epyc socket alongside SP5. If true this one could also be what's used for future Threadripper chips.

My bet is that this is about a BGA version of SP5 for the next generation of Epyc embedded chips. AMD already has a smaller Epyc 'socket', technically speaking, having shrunk SP4 down quite a bit by ditching LGA.
 

Joe NYC

Golden Member
Jun 26, 2021
1,899
2,195
106
Charlie teases that for Genoa there'll be a smaller Epyc socket alongside SP5. If true this one could also be what's used for future Threadripper chips.


Here is a tweet from ExecutableFix:


So maybe the SP6 will just go after lower end of server market (now owned by Xeon D) and TR5 will still be a separate socket for Threadripper.

My wish list for TR5 would be 4 to 6 memory channels supporting only 1 DIMM per channel, but at the highest possible speed, and a generous amount of PCIe lanes. Maybe 1/2 of the SP5 socket.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Which don't do them much good until they're ready to go and require serious design considerations that look like a serious gamble. Take for example your idea that bridges replace IFOP. That's an all-in design choice because you now need all IO dies to be fabricated at TSMC because you've built a chiplet that won't integrate with anything that can't be connected by that bridge, or you have to build all of your chiplets so that they can work in either configuration.



Even if you're making that decision years ago, you choose Global Foundries because it allows for more production. AMD would know how many wafers that they can buy, they would know that they're going to be making new console SoCs, they would know that they're going to be expanding their GPU offerings, and they would know that they're likely to have a strong advantage in HEDT and servers due to their chiplet-based approach and be able to grow in those markets.

If you move your IO dies to TSMC you need almost twice as many wafers to offset the ~40% of area that the CPUs devote to the IO dies. Even without a crystal ball to predict a pandemic and any supply chain issues you'd have to be foolish to make that move. It just doesn't make sense logistically.
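
A rough way to sanity-check the wafer arithmetic above: if the IO die takes a given share of total silicon area and only the CPU dies are at TSMC today, moving the IO die over scales the TSMC area by (CPU share + IO share) / CPU share. The sketch below computes only that ratio. The function name and the `io_shrink` parameter are made up for illustration, and it ignores yield, wafer-edge loss, and any other products competing for the same wafers.

```python
def tsmc_area_multiplier(io_area_fraction, io_shrink=1.0):
    """Ratio of TSMC silicon area after moving the IO die to TSMC vs. before.

    io_area_fraction: share of total package silicon taken by the IO die
                      (0.40 is the ~40% figure from the post).
    io_shrink:        hypothetical area scaling of the IO die on the new node
                      (1.0 = no shrink, 0.67 = a 33% smaller die).
    """
    if not 0.0 <= io_area_fraction < 1.0:
        raise ValueError("io_area_fraction must be in [0, 1)")
    if io_shrink <= 0.0:
        raise ValueError("io_shrink must be positive")

    cpu_fraction = 1.0 - io_area_fraction      # area already made at TSMC
    io_at_tsmc = io_area_fraction * io_shrink  # area added by moving the IO die
    return (cpu_fraction + io_at_tsmc) / cpu_fraction


if __name__ == "__main__":
    # No shrink: the IO die keeps its original area on the new node.
    print(round(tsmc_area_multiplier(0.40), 2))
    # Hypothetical 33% shrink of the IO die on the newer node.
    print(round(tsmc_area_multiplier(0.40, io_shrink=0.67), 2))
```

With the ~40% figure and no shrink this works out to roughly 1.67x the TSMC area, and about 1.45x with a hypothetical 33% shrink, so "almost twice" reads as an upper bound under these simplified assumptions.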



I'm not sure that's happening. I keep pointing out why it doesn't make sense, and the only response is "But what about this other really cool, but unlikely possibility?" I honestly don't think the basic points that strongly suggest continued use of Global Foundries have even been addressed. Recently they extended their commitments to GF and have now agreed to buy $2.1 billion in wafers through the next four years. But rather than address that it's just "But what about cool hypothetical future technology? Wouldn't that be sweet?"



I'm not sure that's necessarily the case. It seems most likely that RDNA3 uses two bridged chiplets if it uses any at all. It may not even do that for all models, and those chiplets are perfectly fine if they aren't connected, because they're really just GPUs being put on the same package as opposed to modules that need a separate die in order to function.

Keep in mind that Zen 2 apparently had the technology in place to utilize stacked cache, but it took until the tail end of Zen 3 to actually implement it. We're talking about bleeding edge technologies that haven't been used at scale before and it takes a lot of time to get the kinks worked out. The path to the cool future technology is a slow one with incremental evolution and building on top of previous successes, not a giant leap made with reckless abandon and all caution thrown to the wind.
What, specifically, do you think doesn’t make sense? I posted here quite a while ago that I expect Bergamo will make use of stacked die. It makes a lot of sense for a super high core count product to use stacking to lower interconnect power consumption significantly. It is also suspiciously limited to 8 cpu die instead of 12 like Genoa. I assume Bergamo was developed in parallel with Genoa, with Genoa being very similar to Milan; the low risk path.

There was an AnandTech article about TSMC 2.5D and 3D stacking from 2020 posted here a long time ago. Some of the stacking tech has been available since 2018. I don’t know how bleeding edge it actually is at this point. TSMC was already demonstrating 12 high SoIC stacks in 2020. AMD has, so far, used 1 high stacks. Is that too risky 2 years later? AMD seems likely to use an MCD (bridge chiplet with cache) for GPUs so it doesn’t seem that unlikely that it would be used in other products. Also, if AMD does not make use of TSMC stacking soon, then they will be at risk of falling behind Intel. Use of such stacking tech could allow Intel to surpass AMD’s SerDes based MCM designs due to the significantly lower power consumption of the interconnect. AMD’s SerDes based packages are actually just MCMs (chip based, not chiplet based), which have been around for a long time. See the IBM Power 5 MCM from 2004. That had 8 chips, 4 CPU die and 4 cache die. Intel is going to jump directly to stacking.

The old article about TSMC stacking tech indicates that they were working on making larger and larger sizes available for the stacking tech using reconstituted wafers. Some of them may go up to 3x reticle size now; may have been at 1.5 in 2020. I suspect Bergamo could be done in a single reticle size; smaller IO die plus 8 cpu chiplets similar in size to current cpu chiplets. That would possibly be less than 600 mm2 for the cpu chiplets with around 230 mm2 left for the IO die. IFOP links are 2x width of IFIS, so eliminating 8 (or 12 if comparing to Genoa) and making it on an advanced TSMC process would save a lot of area. So, while this might be risky, if it can be done in a single reticle size, it may be less risky than you might think.
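
The single-reticle estimate above is simple area arithmetic and can be checked quickly. A minimal sketch, assuming the standard 26 mm x 33 mm lithography field (858 mm2) as the upper bound and a hypothetical 75 mm2 per CPU chiplet (8 x 75 = 600 mm2, the upper end of the "less than 600 mm2" figure). `io_die_budget_mm2` is a made-up helper, and it ignores spacing between dies and any bridge silicon.

```python
# Maximum single-exposure field of current lithography scanners: 26 mm x 33 mm.
RETICLE_FIELD_MM2 = 26 * 33  # 858 mm^2


def io_die_budget_mm2(n_chiplets, chiplet_area_mm2,
                      total_budget_mm2=RETICLE_FIELD_MM2):
    """Area left for the IO die once the CPU chiplets are accounted for."""
    cpu_total = n_chiplets * chiplet_area_mm2
    remaining = total_budget_mm2 - cpu_total
    if remaining < 0:
        raise ValueError("CPU chiplets alone exceed the area budget")
    return remaining


if __name__ == "__main__":
    # 8 chiplets at an assumed 75 mm^2 each against the full reticle field.
    print(io_die_budget_mm2(8, 75))
    # Same chiplets against an ~830 mm^2 budget, which is the total that
    # the 600 + 230 mm^2 split in the post adds up to.
    print(io_die_budget_mm2(8, 75, total_budget_mm2=830))
```

Against the full 858 mm2 field this leaves about 258 mm2 for the IO die; the ~230 mm2 in the post corresponds to a total budget of roughly 830 mm2, so the estimate sits just inside a single reticle under these assumptions.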

They would have to make all of the component chips at TSMC, but I don’t know if that is an issue. Bergamo will exist alongside Genoa (less risk; you could still get a 96-core Genoa) and probably Milan and Milan-X for a while. It will be a very high end cloud product, likely with very high margins. We are talking about a 128-core processor, possibly with 512 MB or even 1 GB of cache. How much will that cost? They likely already have all of the design work for the required units since they are also required for other products (APUs, GPUs, etc). AMD has excellent design and implementation reuse.

It would probably still make sense that the Genoa IO die be made at GF. AMD will also be making Milan and Genoa parts for a while. I know some companies spec a system and then buy the same thing for a few years, so they will be buying Milan for a couple years yet; the server market moves slowly. I assume that they would make all of their chipsets at GF. Chipsets are likely to still be essentially an IO die, just without the memory controllers. It would also be plausible to continue to make desktop IO die at GF. I have wondered if it would be plausible for them to make some low end Zen 4 APUs at Samsung or even GF. We still don’t have lower end Zen 3 based parts. Using MCMs for 8 core or less is a bit wasteful.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
What, specifically, do you think doesn’t make sense?

Specifically claims that AMD would be exclusively using N6 for IO dies with Zen 4 products. It doesn't sound like that's something you believe though.

It isn't as though the only thing that AMD might use GF for is IO dies. There have been some rumors that future Athlon APUs are going to be made on the GF 12LP+ node and I've also read some people who think that they could be using them for HBM manufacturing for future products.

Eventually they certainly will move to using something like N6, but mainly because it's an older process and isn't being used for AMD's chiplets, GPUs, APUs, or any other products. At that point it doesn't matter as much because it doesn't compete with their other products for wafers and the cost will have come down.
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
Specifically claims that AMD would be exclusively using N6 for IO dies with Zen 4 products. It doesn't sound like that's something you believe though.

It isn't as though the only thing that AMD might use GF for is IO dies. There have been some rumors that future Athlon APUs are going to be made on the GF 12LP+ node and I've also read some people who think that they could be using them for HBM manufacturing for future products.

Eventually they certainly will move to using something like N6, but mainly because it's an older process and isn't being used for AMD's chiplets, GPUs, APUs, or any other products. At that point it doesn't matter as much because it doesn't compete with their other products for wafers and the cost will have come down.

I am willing to bet against AMD using GloFo simply because of the timing of the WSA renegotiation. Zen 4 work predates the WSA changes. It just screams “Zen 3, Milan-X, Threadripper, and Milan have exceeded expectations and we anticipate sales through the next 3 years.” To me, 12LP+ is too new to even be under consideration for a new IO die. The announcement was Q3 2019.
 

Joe NYC

Golden Member
Jun 26, 2021
1,899
2,195
106
Specifically claims that AMD would be exclusively using N6 for IO dies with Zen 4 products. It doesn't sound like that's something you believe though.

That part seemed clear originally, when some leaks came out saying that the IOD size shrank some 33% (despite adding more IO), so AMD must have moved the IOD to TSMC N6.

But now that a new leak has come out about AMD using fan-out, it is possible that the IOD shrank because it was pad limited and the use of fan-out allowed it to shrink.

As far as these mysteries go, we will have the answer shortly...
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
Specifically claims that AMD would be exclusively using N6 for IO dies with Zen 4 products. It doesn't sound like that's something you believe though.

It isn't as though the only thing that AMD might use GF for is IO dies. There have been some rumors that future Athlon APUs are going to be made on the GF 12LP+ node and I've also read some people who think that they could be using them for HBM manufacturing for future products.

Eventually they certainly will move to using something like N6, but mainly because it's an older process and isn't being used for AMD's chiplets, GPUs, APUs, or any other products. At that point it doesn't matter as much because it doesn't compete with their other products for wafers and the cost will have come down.
I wouldn’t rule out most IO dies being made at TSMC. It would still be split processes, probably with some 6 nm process tweaks specifically for the IO die and the cpu die on an optimal 5 nm process. There are still a lot of things they could make at GF, but to take best advantage of a lot of the advanced packaging technologies, it will be a lot easier to make all of the components at TSMC for each product. They may need to use some advanced packaging tech to avoid being pad limited on the IO dies. The higher clock speeds and power consumption required of the IO die for DDR5, pci-e 5, and an integrated GPU also make me think that 6 nm TSMC is more likely. The Genoa IO die would still be rather large for 6 nm though. That is why I was wondering, initially, if it would be split in quarters or halves to make a smaller die. That doesn’t seem to be the case; seems to be monolithic.

With Genoa being a completely new platform that requires new boards and memory, it will take a while for it to ramp in the server market. AMD will be making massive Milan IO die at GF for quite a while, especially with Milan-X just releasing. They will likely be making the chipsets at GF, with some systems rumored to take two of them somehow. I have seen and thought I mentioned the possibility of a low end GF made APU. The APU wouldn’t be likely to need any advanced packaging tech. It should be interesting how it compares with an integrated GPU. I haven’t seen anything about HBM. Standard HBM uses micro-solder balls, so it can be made at different fabs. I guess that is plausible. Not sure what else they could make at GF. Since they may be constrained to TSMC processes for a lot of things due to advanced packaging tech limitations, it isn’t much of a stretch that most, if not all, Zen 4 IO die will be at TSMC. If Bergamo makes use of stacked bridge chips, then I would expect all component chips to be made at TSMC. It is possible that the rumors may have mixed up Bergamo and Genoa info, so it is hard to tell. I guess I just don’t find it to be that unlikely that AMD would use TSMC for Zen 4 IO die.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
I expect AMD has already planned to switch existing products away from GlobalFoundries.

1. May 2021 => Removed greater than 7nm node exclusivity.
2. December 2021 => Removed 2022+ 14nm/12nm price hike.

AMD's older gen cIOD/sIOD/Embedded Ryzen/EPYC will most definitely be moved to TSMC.

Reason of movement:
Forewarned inconsistent supply of 14nm/12nm FinFET caused by:
45RH => 45FDX-RH
45RFe => 45FDX-RFA
45SPCLO => 45FDX-PH
45FDBase => 45FDX

Which utilize the same 193i tools, but have significantly more customers utilizing that node. Given the known customers of the above node, it would put the switch from GF to TSMC in the 2H of 2023.

This also reflects GlobalFoundries' new tradition of letting go of customers who can fab elsewhere. GlobalFoundries only wants customers that can only fab at GlobalFoundries to match their needs. Since every other fab makes better Fins than GF, it is better to end that relationship than continue it. Side-GF is gone, it is Sole-GF from now on.

Essentially, AMD has planned to leave GloFo's Fins for a while, and the December 2021 price negotiation is a concession to reduce fiscal weight on AMD, since AMD's in-the-dark projects might be solely produced on 22FDX/12FDX.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
@NostaSeronx So you say AMD is moving away from GloFo's FinFET. But AMD extended its WSA with GloFo, and recent deals like Tesla using Picasso point to being long time deals as well.
Picasso-H APU is most likely a stop-gap since AMD hasn't released the 6nm V-series or R-series.
2023 Tesla models will most likely be using Rembrandt or Mendocino or an actual semi-custom design.

The WSA agreement only has pre-orders of FinFETs up to 2023. There is no guarantee that GlobalFoundries will continue to support net income losses provided by 14nm/12nm FinFET forever. In this case it is better to kill off FinFETs as soon as possible when an alternative same-gen node pops up. Feature Rich Bulk(Majority of the profits) -> FDX and FDX goes to 12-nm. For example, 2016-2022 majority of GloFo's customers went for Singapore's 55nm-40nm lineup. Which shot Singapore from the worst (2010-2015) to the best (2016-present). Dresden's 28nm and Singapore's 55/40nm upgrade track is 22FDX, and Malta's 45nm upgrade track is also 22FDX. So, all new customers given the 10-year and 20-year partnerships agreements are expected to move to 22FDX/12FDX eventually.

Zen4 = TSMC 6nm for IOD, only
Zen2/Zen3 Refreshes = TSMC 16nm/12nm for IOD or re-use of TSMC 6nm IOD(die-harvested).

GlobalFoundries isn't going to keep AMD from a better option, especially if they have been running it at a loss since 2017.

GloFo killed 7LP before production because AMD was the sole customer.
GloFo killed 28BLK-HP(28HP->28HPP->28SHP) after production because AMD was the sole customer. ;; both instances GloFo was a second-source.
GloFo given the above is likely to cancel 14/12nm Fin production because the 2nd customer has already agreed to move to 12FDX.

Given:
14nm/12nm/12nm-Fin+ vs 12nm-FD and 45nm-FD
45nm FDSOI and 12nm FDSOI has more customers slotted at Fab8 than 14nm/12nm/12nm-Fin+ for 2023+.
(22nm FDSOI is supposed to come in after second plan Module 2; Module 1 = 45nm/12nm-FD and Module 2 = 45nm/22nm-FD; first plan Module 2 was the only hope for 14/12nm-Fin)
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
I wonder if Samsung has a cost effective node available in the "7nm" < ? < "12nm" space? AMD still needs diversity in supply sources just to protect their own business in case of interruptions in a single supplier (natural disasters have a way of being surprises, and no one can predict the geo-political situation in the future).
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
I wonder if Samsung has a cost effective node available in the "7nm" < ? < "12nm" space?

Cost-effective? Maybe. Last year's 5LPE seems to have similar power/performance characteristics to N7. Whether or not it could be tweaked to AMD's specifications is an entirely different matter. Regardless, I would imagine Samsung would be charging a discount versus TSMC, assuming they have the volume to offer (which they may not).

AMD would be better off not touching anything older than 5LPE though.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
I wonder if Samsung has a cost effective node available in the "7nm" < ? < "12nm" space? AMD still needs diversity in supply sources just to protect their own business in case of interruptions in a single supplier (natural disasters have a way of being surprises, and no one can predict the geo-political situation in the future).
The general issue is AMD already had Zen at TSMC. So why wouldn't they open that back up, if GlobalFoundries allows them to produce at TSMC? Why choose another fab when, technically, AMD has already established IP over there?

Single supplier is fine, if they have multiple modules. Do not forget TSMC is offshoring 28nm, 16nm/12nm, and 5nm. It is not like they are strapped for cash like another fab. AMD adopting more nodes at TSMC means higher ranking in customers, thus better deals once the market goes from supply shortage to supply glut.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
I'm thinking more of AMD's trailing and low end parts. Samsung's "8nm" class foundry nodes should be sufficient for things like low end APUs, embedded products, IODs and related items. Those things don't need leading edge performance, they just need to be low(er) cost and "Reasonable" for power/performance. What we don't know is pricing structure there. I don't think it's worth the time or effort to go back to their 14nm class 11LPP node, unless it's just VERY cheap. It seems to me that 8LPP/LPU might be an attractive node for many purposes.
 

Henry swagger

Senior member
Feb 9, 2022
356
235
86
I'm thinking more of AMD's trailing and low end parts. Samsung's "8nm" class foundry nodes should be sufficient for things like low end APUs, embedded products, IODs and related items. Those things don't need leading edge performance, they just need to be low(er) cost and "Reasonable" for power/performance. What we don't know is pricing structure there. I don't think it's worth the time or effort to go back to their 14nm class 11LPP node, unless it's just VERY cheap. It seems to me that 8LPP/LPU might be an attractive node for many purposes.
AMD can't use Samsung nodes because TSMC gets very mad; an analyst revealed that. GloFo's 14nm is Samsung's, just loaned or licensed. If AMD wants to compete with Intel in supply, they need three foundry suppliers, because an analyst said Intel ships 400 million CPUs a year and AMD 80 million.
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116
TSMC is in a class by itself, as your post indicates. There's no way 5LPE costs as much as N7.

I suspect that at the moment they can charge a good bit higher, but long term that can easily change. I would assume however any business AMD goes with Samsung for might wait till 2023 and the inclusion of EUV pellicles. The yield problems are what seem to be driving people away from high end Samsung wafers even more than power/performance considerations versus TSMC. Pellicles and some recent improvements in foundry processes and machines (upgradeability) could potentially see those yields rise for Samsung; and they certainly turned around quite quickly from using pellicles on "some" parts to "everywhere" starting next year. Pure speculation but that could indicate a potentially large customer wanted such before signing an agreement.
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
GloFo killed 7LP before production because AMD was the sole customer.
GloFo killed 28BLK-HP(28HP->28HPP->28SHP) after production because AMD was the sole customer. ;; both instances GloFo was a second-source.
GloFo given the above is likely to cancel 14/12nm Fin production
Sound reasoning, but it doesn't answer at all what's up with the extended WSA between GloFo and AMD. An additional $1.4 billion until 2024 when it was widely thought that the WSA would run out and not be renewed in 2021. Did the WSA turn into a poison pill for GloFo instead of AMD, securing 14/12nm Fin production for AMD until 2024 while GloFo wanted to convert those fabs earlier?
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
Sound reasoning, but it doesn't answer at all what's up with the extended WSA between GloFo and AMD. An additional $1.4 billion until 2024 . . .

Not that it changes your point, but it's been further extended to $2.1 billion through 2025.

Again, I think the obvious answer is that they're going to continue using Global Foundries for IO dies, although there are rumors regarding other products as well.