Speculation: Ryzen 4000 series/Zen 3


jamescox

Senior member
Nov 11, 2009
637
1,103
136
I am not sure I understood everything properly, but I'll try to share my point of view on this. I am happy to be corrected.

When a thread is executing and the data/instruction is not found in L2, the L3 is probed; on a hit, a cache line is evicted from L2 and replaced with the one from L3. On a miss, the request goes to main memory.
If multiple CCXs are present, an L3 miss causes the other L3s to be probed (if there is an entry in the coherency directory; otherwise the request goes to main memory).

I understood there are proactive coherency probes, where directory entries are maintained across the L3s and used to invalidate data on other CCXs when that data has been modified by another CCX. Additionally, the L3 directory contains entries for the private L2 caches.
Probing across L3s goes over the fabric, hence some additional latency. But it does not matter how many L3s there are (at least to some extent), so it scales reasonably well across a lot of CCXs.
The directory entries across the L3s only refer to regions, not specific cache lines, so this way they can deal with big sizes. Probing the L2s involves cache lines, but is way faster as well.
I don't know if inclusive cache or others would be more suited in this case.
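To make the lookup order above a bit more concrete, here is a minimal Python sketch of it. This is only a toy model of the flow as described in this post, not AMD's actual protocol: the `CCX` class, the `lookup` function, the 4 KiB `REGION_BYTES` value and the dictionary-based directory are all made up for illustration.

```python
from dataclasses import dataclass, field

REGION_BYTES = 4096  # hypothetical region size tracked by one directory entry


@dataclass
class CCX:
    name: str
    l2: set = field(default_factory=set)  # line addresses held in the private L2s
    l3: set = field(default_factory=set)  # line addresses held in the shared L3


def lookup(addr, local, others, directory):
    """Return a label for where a load of `addr` would be served from."""
    if addr in local.l2:
        return "L2"
    if addr in local.l3:
        return "local L3"
    # Local L3 miss: the directory tracks regions, not individual lines.
    owners = directory.get(addr // REGION_BYTES, set())
    for ccx in others:
        if ccx.name in owners and addr in ccx.l3:
            return "remote L3 (" + ccx.name + ")"
    # No directory entry (or the line is no longer there): go to DRAM.
    return "main memory"


ccx0 = CCX("ccx0", l2={0x40}, l3={0x80})
ccx1 = CCX("ccx1", l3={0x1040})
directory = {0x1040 // REGION_BYTES: {"ccx1"}}

for addr in (0x40, 0x80, 0x1040, 0x2000):
    print(hex(addr), "->", lookup(addr, ccx0, [ccx1], directory))
```

The point of the region-granular directory in this toy is that a missing entry lets the request skip the remote probes entirely and go straight to memory.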
All of this is already on Zen2. So for Zen3:
  • For a single core on the 8-core CCX, the bigger L3 means it has a lot of space to swap with before eviction to main memory finally happens. Additionally, thanks to memory prefetching into L3, the chances of a hit when data is not found in L2 are higher with a bigger L3 (possibly avoiding the probes across to other L3s, which incur a penalty).
  • For a very high core count part, the traffic on the Infinity Fabric for highly threaded loads is probably going to be much lower than with 4-core CCXs, simply because there are fewer CCXs to talk to.
  • On the other hand, the 8-core CCX would be more complex due to a lot of routing between the cores/L2 and the L3, but that is counterbalanced by having a single fabric connection instead of the two needed previously for two 4-core CCXs. Fewer SerDes blocks.
  • Requests to the IMC are also streamlined because, for an 8-core part for example, the IMC (coherency slave) is handling only one master. This is very important because DRAM has big recovery times and this could degrade memory performance.
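As a rough illustration of the "fewer CCXs to talk to" point, the counting can be sketched like this. The 64-core total is just an example figure, and the helper names are made up for this sketch.

```python
def ccx_count(total_cores, cores_per_ccx):
    """Number of CCXs needed to hold total_cores (rounded up)."""
    return -(-total_cores // cores_per_ccx)


def remote_l3_peers(total_cores, cores_per_ccx):
    """How many other L3s a given CCX could have to probe across the fabric."""
    return ccx_count(total_cores, cores_per_ccx) - 1


for cores_per_ccx in (4, 8):
    print(
        cores_per_ccx, "cores per CCX:",
        ccx_count(64, cores_per_ccx), "CCXs,",
        remote_l3_peers(64, cores_per_ccx), "remote L3s",
    )
```

This only counts possible probe targets. It says nothing about how much traffic a real workload generates.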

I don’t see why the number of serdes blocks would go down. It is probably connecting to the same IO die and it needs the same bandwidth; it still needs to feed 8 cores. We probably don’t get a new IO die until Zen 4 with DDR5 and pci-express 5. Connecting to the same IO die should mean exactly the same number of serdes. I don’t know how the current links are arranged. Previously they were 32-bit links (IFOP) between chips for Zen 1. They were at roughly half the clock compared to the 16-bit external links (IFIS). I was assuming that each CCD still had a single 32-bit link to the IO die. There shouldn’t be any serdes between the 2 on die CCX and the fabric link on the die. That seems to be what you are implying here? That would be a waste. On-chip links would be 256 bits wide and would only be packetized (256 bit -> 8 x 32 bit) for going over the external serdes link. It wouldn’t make any sense to packetize it on chip, send it somewhere still on chip, depacketize it, switch it at 256 bits wide or more, and then repacketize it for sending to the IO die.
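The 256 bit -> 8 x 32 bit packetizing step can be sketched as below. This is only a toy: the function names are made up, the low-chunk-first ordering is an assumption, and real fabric packets carry headers, CRC and so on that are ignored here.

```python
def packetize(flit, width=256, lane=32):
    """Split a `width`-bit value into width // lane chunks of `lane` bits each."""
    if not 0 <= flit < (1 << width):
        raise ValueError("flit does not fit in the given width")
    mask = (1 << lane) - 1
    # Lowest-order chunk first (an arbitrary choice for this sketch).
    return [(flit >> (i * lane)) & mask for i in range(width // lane)]


def depacketize(chunks, lane=32):
    """Reassemble the chunks produced by packetize()."""
    value = 0
    for i, chunk in enumerate(chunks):
        value |= chunk << (i * lane)
    return value


flit = (1 << 255) | 0xDEADBEEF
chunks = packetize(flit)
print(len(chunks), [hex(c) for c in chunks])
print(depacketize(chunks) == flit)
```

Doing this round trip more than once for data that never leaves the die is the extra work the post argues would be a waste.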

Increasing FP performance probably means that they need to improve cache bandwidth. L2 and L3 are 32 bytes per cycle. L1 can do two 32-byte loads and one 32-byte store per cycle. L2 and L3 bandwidth is actually only enough for one operand of a 256-bit AVX2 unit. Zen 2 has 2 FMA units that each can consume 3 32-byte operands per clock for FMA operations. This is why GPUs with massive bandwidth and data paths are much better at some algorithms. They actually have the bandwidth to feed their compute resources more continuously. So, how are they going to get the extra bandwidth? Doubling the paths to 64-bytes kind of seems unlikely. Perhaps the L1 will be able to load more than 2 x 32-bytes per clock? That may improve performance significantly without even increasing the number of AVX units. I was thinking that they would go up to 3 AVX256 units, but it may be all cache improvements. It would be great if they managed to reduce latency in L1/L2 to make up for the slightly higher L3 latency.
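The bandwidth gap can be put in numbers using only the figures quoted in this post. This is a worst-case sketch that assumes every FMA operand has to come from cache on every clock; real code reuses registers, so actual demand is lower. The variable names are made up for this sketch.

```python
OPERAND_BYTES = 32        # one 256-bit AVX2 operand
FMA_UNITS = 2             # Zen 2, per the post
OPERANDS_PER_FMA = 3      # a * b + c

# Worst case: every operand is streamed from cache each clock.
peak_demand = FMA_UNITS * OPERANDS_PER_FMA * OPERAND_BYTES

# Per-clock load bandwidth figures quoted in the post.
supply = {
    "L1 load": 2 * 32,
    "L2": 32,
    "L3": 32,
}

print("peak operand demand:", peak_demand, "bytes/clock")
for level, bytes_per_clock in supply.items():
    operands = bytes_per_clock // OPERAND_BYTES
    share = bytes_per_clock / peak_demand
    print(f"{level}: {bytes_per_clock} B/clk = {operands} operand(s), {share:.0%} of peak demand")
```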
 

moinmoin

Diamond Member
Jun 1, 2017
4,944
7,656
136
We probably don’t get a new IO die until Zen 4 with DDR5 and pci-express 5. Connecting to the same IO die should mean exactly the same number of serdes.
Since AMD's WSA with GloFo runs out early next year I expect them to stop using GloFo for the Zen 3 IOD. Since the IOD will then change node anyway there likely will be Zen 3 specific changes to it as well. I wouldn't rule out more significant changes to the package layout either.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,601
5,780
136
There shouldn’t be any serdes between the 2 on die CCX and the fabric link on the die. That seems to be what you are implying here? That would be a waste.
You are right actually. I was imagining the Zen layout. The Zen2 CCD has one IFOP.
We probably don’t get a new IO die until Zen 4 with DDR5 and pci-express 5
I think there will be a new IO die or at the very least it will be updated. EPYC is using a lot of power for the interconnects/IF, which could be responsible for it not being able to sustain high clocks with 8 chiplets within that thermal envelope.

So, how are they going to get the extra bandwidth? Doubling the paths to 64-bytes kind of seems unlikely. Perhaps the L1 will be able to load more than 2 x 32-bytes per clock? That may improve performance significantly without even increasing the number of AVX units. I was thinking that they would go up to 3 AVX256 units, but it may be all cache improvements. It would be great if they managed to reduce latency in L1/L2 to make up for the slightly higher L3 latency.
Yeah, the more we try to piece things together from info in the public domain, the more we realize how much info is missing.

Since AMD's WSA with GloFo runs out early next year I expect them to stop using GloFo for the Zen 3 IOD. Since the IOD will then change node anyway there likely will be Zen 3 specific changes to it as well. I wouldn't rule out more significant changes to the package layout either.
Tbh, I am really looking forward to such a change. The interconnect and core topology seem rather more interesting to me.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,686
1,221
136
Since AMD's WSA with GloFo runs out early next year ...
AMD's WSA continues till 2024.
"January 1, 2019 and continuing through March 1, 2024"

Any and all products beyond or at 7nm do not have to be produced at GlobalFoundries.

AMD can port the IOD to 7nm at Samsung, TSMC, or SMIC. Nothing in the WSA prevents that, since Jan 2019.

The only line potentially being produced at GlobalFoundries up till 2024 is 5G Wireless, Edge Compute/AI, Networking, Security & Storage segments and associated product lines.
 

moinmoin

Diamond Member
Jun 1, 2017
4,944
7,656
136
AMD's WSA continues till 2024.
"January 1, 2019 and continuing through March 1, 2024"
If it does it's not combined with any significant purchase obligations which would be the same difference, see:
AMD recently (well, actually across the last two months for different parts) published its annual report on Form 10-K for 2019.

Page 39 (45 in the PDF) agrees with scannall, "Purchase obligations" is 1,677 mil this year, 592 mil next year, and after that there's only 21 mil left that could well be mostly unrelated to the WSA.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,686
1,221
136
If it does it's not combined with any significant purchase obligations which would be the same difference, see:
GlobalFoundries and AMD haven't declared the obligations for 2022, 2023, 2024. Those are to be done with an updated WSA soon(tm).

If you read before that, AMD are obligated to have annual purchasing agreements beyond 2024 as well. However, they won't be covered by the Original WSA after that. (December 31st 2024 is the end of Original WSA annual purchasing agreements.)
 

Veradun

Senior member
Jul 29, 2016
564
780
136
Is the new 12nm + made by GF good enough for a new infinity gauntlet?

I'm fully expecting a change in the gauntlet to come with DDR5 tho, on 12FDX maybe, while they keep a steady update path for the infinity stones with no core count increase anytime soon
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
Man, trying to buy a console right now is super frustrating. I am not sure where supply is at, but tracking sales it's been a super solid quarter for tech. A friend of mine @ Target corp has been saying their consumer tech sales have been like Cyber Week/BF continuously since the start of the lock down type measures and they have been selling every TV, Console and other device they have been able to stock.

If Sony is serious and can have stock AND pricing (seriously, xbox's and switches are going for $100-$200 more than they were going for last fall if you can even find them) they could really sweep this holiday season. That would be quite the achievement to start out with 2x the install base heading into 2021 just because you could put them on the shelves.

This is really getting off topic, but US console hardware sales were down in June compared to last year, after being up big the previous three months.

I would say that the increase in PS5 production actually is to avoid any kind of disruptions a second wave could have on the supply chain. I don't see anything to suggest that AMD would do anything similar with Ryzen (keep the channel stuffed with products they are OK with selling into next year) but it's something to watch.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,686
1,221
136
Is the new 12nm + made by GF good enough for a new infinity gauntlet?
Nope, TSMC's 6nm is though. However, I believe the switch will happen next year, since 6nm at AMD/TSMC started in 2019.

AMD's 6nm node is custom just like the 5nm node. I believe they also have unique diffusion breaks, contact over active gate, and potentially SRB wafers (SSRW tSi NFET/SSRW cSiGe PFET) for N6 early. AMD's N5 node will up-port N4/N3 perf/power boosters as well.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,764
3,131
136
This is really getting off topic, but US console hardware sales were down in June compared to last year, after being up big the previous three months.

I would say that the increase in PS5 production actually is to avoid any kind of disruptions a second wave could have on the supply chain. I don't see anything to suggest that AMD would do anything similar with Ryzen (keep the channel stuffed with products they are OK with selling into next year) but it's something to watch.
You have to remember the rest of the first world hasn't handled its corona battle as badly as the US has. I don't think it's about supply chain; I think Sony will sell everything they make, it's just a question of which countries. MS, who is more heavily reliant on the US/UK markets, might feel the effects of Corona more.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
You have to remember the rest of the first world hasn't handled its corona battle as badly as the US has. I don't think it's about supply chain; I think Sony will sell everything they make, it's just a question of which countries. MS, who is more heavily reliant on the US/UK markets, might feel the effects of Corona more.

10 million is a lot... the PS4 had sold 4.2 million by the end of 2013. We'll have to see on pricing, but if it's $499 for the cheapest model, that's going to be tough. I think the original rumor was that Sony was only going to produce like a million or two.

Maybe the cheapest console will be less than $499. Even then I think it's more channel stuffing than them thinking they will sell 10 million in two+ months.
 

blckgrffn

Diamond Member
May 1, 2003
9,123
3,057
136
www.teamjuchems.com
10 million is a lot... the PS4 had sold 4.2 million by the end of 2013. We'll have to see on pricing, but if it's $499 for the cheapest model, that's going to be tough. I think the original rumor was that Sony was only going to produce like a million or two.

Maybe the cheapest console will be less than $499. Even then I think it's more channel stuffing than them thinking they will sell 10 million in two+ months.

To your original point - I couldn't even find NIB consoles to buy when I was shopping a couple weeks ago (xbox one) - everything was weird third party sellers, marketplaces and second hand (FB & CL). It did seem like I could find Playstations, but that probably changes week to week when big box retailers have stock.

I guess what I am saying is that given an entire quarter of huge sales that likely blew out any sales models the procurement departments had & the supply chain issues & the ramp up of probably their best partners on the new hardware... the inventory on hand was really drawn down and there was maybe/likely a dip due to availability, not just purchase intent. I know my own online store is looking at losing our most popular skus for months because of the India shut down :(

That said, it could also be awful timing for launching a next gen product considering the US political maelstrom approaching and the fact that the $600 per week unemployment benefit is ending. It is my *guess* that this "extra" income likely helped spur discretionary purchases that otherwise would not have happened in the US given the current economic climate.

I mean, Q4 is going to be a huge trip here in the US. It has implications for many things :rolleyes: Haha, one of the reasons I came back to these forums was to think on these things less. Sorry for the continued OT.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
To your original point - I couldn't even find NIB consoles to buy when I was shopping a couple weeks ago (xbox one) - everything was weird third party sellers, marketplaces and second hand (FB & CL). It did seem like I could find Playstations, but that probably changes week to week when big box retailers have stock.

The surge might have caught MS off guard while they wind down the One and ramp up the Series X and Lockhart.
 
Mar 11, 2004
23,073
5,554
146
The surge might have caught MS off guard while they wind down the One and ramp up the Series X and Lockhart.

It's possible. Microsoft did put the One X on sale ($299) back in like March (think it was late March or maybe April). I have a hunch that was already their plan though.

Er, wasn't Lockhart cancelled, at least in the form it was planned? I hope people aren't expecting a new cheaper Xbox this year as I don't think that's in the plans. Seems that they're aiming to roll with the One S and Series X.

I think Sony's goal is to gain an edge in the early stages. A mix of making sure they have availability and possibly trying to restrict Microsoft's output, so that if people are looking to buy a console, their choice is basically made for them. I actually think that was partly Microsoft's plan with the One X (get it cleared out so that consumers basically have to opt for the Series X).
 

ksec

Senior member
Mar 5, 2010
420
117
116
Is Sony's doubling of PS5 production the reason for the delay of ZEN3 (on Desktop)?

TL;DR: there is no delay.

While some would argue it was about building hype for Zen 2, I still think they announced Zen 2 details way too early, which makes it feel like Zen 2 has been with us for a long time. As a matter of fact, Zen 2 was announced in early 2019 and launched in mid-July, and EPYC followed in August.

Even if Zen 3 were released in August it would still have been within a 12-month period, and we have always known Zen 3 would not launch within 12 months of Zen 2 and eat into its sales.

Zen 3 is aiming to launch in late 2020. So it is within its original schedule, and there is no delay (that we know of).

Although AMD had better have lots of stock ready... especially for their EPYC server processors. Doubling last year's CPU sales is barely a pass given the current market valuation.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
Er, wasn't Lockhart cancelled, at least in the form it was planned? I hope people aren't expecting a new cheaper Xbox this year as I don't think that's in the plans. Seems that they're aiming to roll with the One S and Series X.

Appears it is coming, and if it's cheap enough, it might end up overshadowing the Series X.
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,131
1,088
136
I don't understand AMD right now. Delaying the release of RDNA 2 and Zen 3. If AMD beat Intel on every performance metric, people would not be questioning their release dates. Tying in the release of RDNA 2 and Zen 3 to the new Playstation 5 and Xbox is not wise. I say this because AMD is behind Nvidia and has been for years. They are not beating Intel in gaming performance. Obviously, that will all change with Zen 3 but the long history of AMD being a failed company before Ryzen cannot be ignored. Was it 2012 the last time AMD was the performance king in discrete graphics? I am thinking 7950/7970 era.

The new Playstation 5 and Xbox should have been released 2 years ago. The 1080 is a 2016 card and is still a performance beast today. That should not be. We have been cheated by the tech industry for years from lack of innovation and price gouging.
 

DrMrLordX

Lifer
Apr 27, 2000
21,620
10,829
136
The new Playstation 5 and Xbox should have been released 2 years ago.

That would not have been possible, not with the specs we see today. And I don't think anyone would want to see PS5 with GF 12nm chips in it to be perfectly honest. As far as Zen3 and RDNA2 go, I agree that AMD shouldn't be taking this long, but they obviously want to sell through their current-gen products just a little longer.
 

jamescox

Senior member
Nov 11, 2009
637
1,103
136
You are right actually. I was imagining the Zen layout. The Zen2 CCD has one IFOP.

I think there will be a new IO die or at the very least it will be updated. EPYC is using a lot of power for the interconnects/IF, which could be responsible for it not being able to sustain high clocks with 8 chiplets within that thermal envelope.


Yeah, the more we try to piece things together from info in the public domain, the more we realize how much info is missing.


Tbh, I am really looking forward to such a change. The interconnect and core topology seem rather more interesting to me.

The power consumption of the IO die could be improved, but are they going to put a lot of work into porting the old DDR4/PCI-E 4 IO die over to TSMC with the probably completely new IO die for DDR5 and PCI-E 5 coming soon? I kind of expect that the Zen3 Epyc IO die will be the same. I could see them maybe making use of GlobalFoundries' latest and greatest low power process and tweaking the design slightly; that may not be nearly as much work but I still don't know if it makes sense to spend the resources to do it. It is unclear how small the IO die can actually be. It has 128 PCI-E links, 8 DDR4 channels, and 256 infinity fabric links for CCD (8 x 32 bit). That is a lot of pins, especially considering many of these take multiple pins per bit and are bidirectional. It is a big chip, but there is a limit to how small it can be due to the number of pins required / pad space on the die. There may not be that many reasons to get it onto a smaller process.
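For a feel of the pin count, here is a very rough lower-bound sketch in Python. Several inputs are assumptions, not known facts about the IO die: 4 signal pins per PCI-E lane (one differential pair each way), 72 data pins per DDR4 channel (64 data + 8 ECC, ignoring strobes, address/command and clocks), and a made-up 2 pins per bit for the fabric links. Power and ground pins are ignored entirely, and the function name is hypothetical.

```python
def min_signal_pins(pcie_lanes, ddr_channels, if_links, if_bits_per_link,
                    ddr_data_bits=72, if_pins_per_bit=2):
    """Rough lower bound on signal pins; every default is an assumption."""
    pcie = pcie_lanes * 4                   # TX pair + RX pair per lane
    ddr = ddr_channels * ddr_data_bits      # data pins only
    fabric = if_links * if_bits_per_link * if_pins_per_bit
    return {"pcie": pcie, "ddr4_data": ddr, "fabric": fabric,
            "total": pcie + ddr + fabric}


# Figures from the post: 128 PCI-E lanes, 8 DDR4 channels, 8 x 32-bit fabric links.
estimate = min_signal_pins(pcie_lanes=128, ddr_channels=8,
                           if_links=8, if_bits_per_link=32)
for name, pins in estimate.items():
    print(name, pins)
```

Even this stripped-down count lands in the four-digit range before any power or ground pads, which is the pad-limited argument made above.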
 

juergbi

Junior Member
Apr 27, 2019
12
14
41
The power consumption of the IO die could be improved, but are they going to put a lot of work into porting the old DDR4/PCI-E 4 IO die over to TSMC with the probably completely new IO die for DDR5 and PCI-E 5 coming soon? I kind of expect that the Zen3 Epyc IO die will be the same.

They already have a DDR4 controller on 7nm for Renoir. PCIe is only 3.0 on Renoir but this might also be an artificial limitation. That said, I don't expect Milan to have a 7nm IO die.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
but there is a limit to how small it can be due to the number of pins required / pad space on the die. There may not be that many reasons to get it onto a smaller process.

Good point. When they go to an interposer design, they can narrow the I/O pad pitch and shrink.