Discussion Zen 5 Architecture & Technical discussion

DisEnchantment · Jun 3, 2024

Now that the official unveil is done, this is a good time to discuss the architecture and technical specifications of the Zen 5 core.

More details, such as die size and architecture specifics, will be added once they are public.

---------------------------------------------------------------------------------------

No Market share discussion
No Leaks
No speculation unrelated to publicly released technical materials/patches/manuals etc.

---------------------------------------------------------------------------------------

DisEnchantment · Jun 3, 2024

Zen 5 was supposed to focus on efficiency as per FAD2022. Quite odd there was no mention of it.

Dual decode seems interesting, so Z5 would now be 8 wide (2x 4 wide), Tremont style.
They did not increase the BW to the integer units. I wonder what the reason behind that is. Is the int side not BW constrained at all?

Nothingness · Jun 3, 2024

DisEnchantment said:
Zen 5 was supposed to focus on efficiency as per FAD2022. Quite odd there was no mention of it.

Yes, I find it a bit concerning.

DisEnchantment said:
Dual decode seems interesting, so Z5 would now be 8 wide (2x 4 wide), Tremont style.
They did not increase the BW to the integer units. I wonder what the reason behind that is. Is the int side not BW constrained at all?

It makes sense to increase the number of integer units if you have enough improvements in branch prediction and data prefetch. If you don't, assuming the micro architecture design is already well balanced, you'll be limited by branch mispredictions and memory latency, so adding integer units would only increase power with little benefit.

Saylick · Jun 3, 2024

DisEnchantment said:
Zen 5 was supposed to focus on efficiency as per FAD2022. Quite odd there was no mention of it.

Probably a stretch, but could there have been an assumption made at that time that vanilla Zen 5 would be on N3 and not N4?

DisEnchantment · Jun 3, 2024

Hitman928 said:
Ahh well, at least we should have some calm before the next hype storm starts brewing. I also want to give a shout out to @Exist50 who seems to have called quite a lot of things with good accuracy on both sides and did so without being abrasive or seeking to demean other posters. Hopefully we can get back to that type of discussion which seemed to be what we (mostly) had for a while until fairly recently.

We can continue here. Indeed, we used to have sane, courteous discussions before, but I was warned by the mods when I lamented the deterioration of civility in the thread.

itsmydamnation said:
The post mortem will be interesting, is the frontend not as bigly as we thought? PRFi or ROB sizes smaller than expected? Not a large enough jump in prefetch / predict? If ROB >512 and decode a complete rework then I will say Zen 5 is bulldozer tier execution, they just are in a much better place beforehand and made far better architectural choices.

The front end should be bigly; the ROB we don't know yet.
But I am wondering why the L1 BW to the core is not increased for integer RF

Nothingness said:
It makes sense to increase the number of integer units if you have enough improvements in branch prediction and data prefetch. If you don't, assuming the micro architecture design is already well balanced, you'll be limited by branch mispredictions and memory latency, so adding integer units would only increase power with little benefit.

Indeed, they did increase the decode width, but the BW between the int RF and L1 seems unchanged, which seems odd. This would be unlike Intel's GLC.

They also have a unified int scheduler; I would expect some latency from there too.

adroc_thurston · Jun 3, 2024

Saylick said:
Probably a stretch, but could there have been an assumption made at that time that vanilla Zen 5 would be on N3 and not N4?

It was already N4 in 2022.
What became N3b slipped a year and some before that.

Nothingness · Jun 3, 2024

DisEnchantment said:
Indeed, they did increase the decode width, but the BW between the int RF and L1 seems unchanged, which seems odd. This would be unlike Intel's GLC.

They also have a unified int scheduler; I would expect some latency from there too.

Hmm, so if we assume the changes in the microarchitecture are deeper, then I have 3 hypotheses:

1. It's the first (public) iteration, so not everything is well balanced.
2. Increasing the number of ALUs would have increased power too much, so they are waiting for the next process.
3. AMD thought the improvements were enough for this generation and is keeping some improvements for Zen 6.

Pure speculation 🙂

poke01 · Jun 3, 2024

I think efficiency has improved. They didn’t focus on Zen 5 technical details at all this time.

Hopefully, we get more details soon.

SteinFG · Jun 3, 2024

poke01 said:
I think efficiency has improved. They didn’t focus on Zen 5 technical details at all this time.

Hopefully, we get more details soon.

We'll definitely get more details at Hot Chips (Aug 25).

Nothingness · Jun 3, 2024

poke01 said:
I think efficiency has improved. They didn’t focus on Zen 5 technical details at all this time.

Hopefully, we get more details soon.

Efficiency might have improved, but I don't think that invalidates my hypotheses: even if it improved, power was perhaps still too high to widen the core.

majord · Jun 3, 2024

Well, she did say "incredibly energy efficient" or something like that, but gave no specifics, so perhaps it's not exciting enough for numbers. I don't recall if they've ever really gone into specifics for a desktop launch; I'd expect that for the mobile announcement / press releases.

coercitiv · Jun 3, 2024

majord said:
Well, she did say "incredibly energy efficient" or something like that, but gave no specifics, so perhaps it's not exciting enough for numbers. I don't recall if they've ever really gone into specifics for a desktop launch; I'd expect that for the mobile announcement / press releases.

We can ignore what they say and look at what they do: they're dropping TDP for desktop SKUs except the 16-core. These SKUs must at least match the old gen counterparts in MT performance with a ~40W handicap.

If AMD is really honest about the TDP change, then it is the exact opposite of what happened with Zen 4, where the performance gain was partially fed with extra power headroom.
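
As a rough sketch of the arithmetic behind that point: if a new SKU only matches its predecessor in MT performance at a lower TDP, the implied perf/W gain is simply the ratio of the two TDPs. The 105 W and 65 W figures below are illustrative assumptions chosen to give a 40 W drop, not numbers from this thread, and TDP is only a proxy for measured package power.

```python
def implied_efficiency_gain(old_tdp_w, new_tdp_w, perf_ratio=1.0):
    """Perf-per-watt ratio (new / old) implied by a TDP change.

    perf_ratio is new MT performance divided by old MT performance;
    1.0 means the new part merely matches the old one.
    """
    return perf_ratio * old_tdp_w / new_tdp_w


# Illustrative values only (assumption): a 40 W drop from 105 W to 65 W.
OLD_TDP_W = 105
NEW_TDP_W = 65

gain = implied_efficiency_gain(OLD_TDP_W, NEW_TDP_W)
print(f"implied perf/W gain at equal MT performance: {gain:.2f}x")
```

Any MT uplift on top of parity multiplies that ratio further via `perf_ratio`.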

DisEnchantment · Jun 3, 2024

Anybody got the die sizes?

Triskain · Jun 3, 2024

I tried a quick Paint eyeball based on the package render AMD published (the IOD size fits what is known from Zen 4 -> 122.2mm², so it can't be totally off), and the CCD size was quite close to Zen 4. The increase is in the low single-digit mm² range, certainly less than 5mm² added (I measured 68.4mm²). Maybe that's the consolation prize: whatever they did with the microarchitecture & implementation at least didn't blow out the area...
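
For anyone who wants to repeat this kind of eyeball estimate, here is a minimal sketch of the scaling: measure the known die and the unknown die in pixels in the same image, then scale by the known area. The pixel dimensions below are made up for illustration (picked so the result lands near the 68.4mm² figure); only the 122.2mm² IOD reference comes from the post.

```python
def estimate_area_mm2(target_px, ref_px, ref_area_mm2):
    """Scale a pixel measurement to mm² using a die of known area in the same image.

    target_px and ref_px are (width, height) tuples in pixels.
    """
    mm2_per_px2 = ref_area_mm2 / (ref_px[0] * ref_px[1])
    return target_px[0] * target_px[1] * mm2_per_px2


IOD_AREA_MM2 = 122.2   # reference area quoted in the post
iod_px = (500, 400)    # hypothetical pixel measurement of the IOD
ccd_px = (400, 280)    # hypothetical pixel measurement of the CCD

ccd_area = estimate_area_mm2(ccd_px, iod_px, IOD_AREA_MM2)

# Sensitivity: how far the estimate moves if each CCD edge is off by +/- 2 px.
err_px = 2
ccd_high = estimate_area_mm2((ccd_px[0] + err_px, ccd_px[1] + err_px), iod_px, IOD_AREA_MM2)
ccd_low = estimate_area_mm2((ccd_px[0] - err_px, ccd_px[1] - err_px), iod_px, IOD_AREA_MM2)

print(f"CCD estimate: {ccd_area:.1f} mm² (range {ccd_low:.1f} to {ccd_high:.1f} mm²)")
```

With these made-up numbers, a 2-pixel error per edge already moves the result by just under 1mm², so a low single-digit mm² difference between generations is close to the noise floor of this method.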

DisEnchantment · Jun 3, 2024

Triskain said:
I tried a quick Paint eyeball based on the package render AMD published (the IOD size fits what is known from Zen 4 -> 122.2mm², so it can't be totally off), and the CCD size was quite close to Zen 4. The increase is in the low single-digit mm² range, certainly less than 5mm² added (I measured 68.4mm²). Maybe that's the consolation prize: whatever they did with the microarchitecture & implementation at least didn't blow out the area...

There is this one which is an actual picture not a render.

jpiniero · Jun 3, 2024

Triskain said:
I tried a quick Paint eyeball based on the package render AMD published (the IOD size fits what is known from Zen 4 -> 122.2mm², so it can't be totally off), and the CCD size was quite close to Zen 4. The increase is in the low single-digit mm² range, certainly less than 5mm² added (I measured 68.4mm²). Maybe that's the consolation prize: whatever they did with the microarchitecture & implementation at least didn't blow out the area...

And the density reduction isn't much between N5 and N4P. You do get solid power savings though.

Ajay · Jun 3, 2024

Some, rough at this point, die info:

https://twitter.com/x/status/1797638258723365189

Core size went up quite a bit, even though overall die size didn't.

DisEnchantment · Jun 3, 2024

Ajay said:
Core size went up quite a bit, even though overall die size didn't.

It seems so, but an error of a few pixels in the outlines would amount to a few tenths of a mm², so it is not really accurate. Still, it is enough of an indication that there is hardly any die size increase for the CCD, if any at all!

Triskain · Jun 3, 2024

Ajay said:
Some, rough at this point, die info:

https://twitter.com/x/status/1797638258723365189

Core size went up quite a bit, even though overall die size didn't.

According to Fritz's estimate, the CCDs for Zen 4 and Zen 5 are basically the same size (which fits my rendered package shot eyeball). It looks like improved L3 layout density & topology (Ladder Cache for the win?) is doing the heavy lifting to compensate for the core area increase at the die level. If we take the core area increase at face value (big IF!), then the IPC increase (16%) fits okay-ish with Pollack's Rule, the square root of the area increase (√(1.26) -> 1.122 "predicted" IPC increase). So I wouldn't call the arch a disaster, more like an AMD interpretation of your average "Cove" iteration, but it's certainly no Zen 3 in terms of bang per gate. In retrospect, Zen 3 appears to have been a lightning-in-a-bottle moment unlikely to be repeated any time soon by AMD...
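
The Pollack's Rule comparison above is easy to reproduce. A small sketch, taking the 26% core area increase and the 16% IPC claim from the post at face value (the function name is just for illustration):

```python
import math


def pollack_predicted_gain(area_ratio):
    """Pollack's Rule: performance scales roughly with the square root of core area."""
    return math.sqrt(area_ratio)


AREA_RATIO = 1.26    # core area increase, taken at face value from the estimate
CLAIMED_IPC = 1.16   # IPC uplift quoted in the post

predicted = pollack_predicted_gain(AREA_RATIO)

# Area ratio that Pollack's Rule would "expect" for the claimed IPC uplift.
implied_area = CLAIMED_IPC ** 2

print(f"predicted gain from area: {predicted:.3f}x")   # 1.122x
print(f"claimed IPC gain:         {CLAIMED_IPC:.3f}x")
print(f"area implied by the claim: {implied_area:.3f}x")
```

Read the other way around, a 16% uplift would "cost" about 35% more area under the rule, so on these rough numbers the claimed gain sits slightly above the square-root trend rather than below it.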

Ajay · Jun 3, 2024

Triskain said:
In retrospect, Zen3 appears to have been a lightning in a bottle moment unlikely to be repeated any time soon by AMD...

Possibly. I think that the CPU team still had some so-called 'low-hanging fruit' in Zen 3 that helped achieve that IPC jump on a mature N7 process. That is probably over; everything else will be harder. Process node shrinks aren't going to be much help on the xtor/mm² front either. Going forward, node changes will be geared more towards power and performance improvements, IMHO.

naukkis · Jun 3, 2024

DisEnchantment said:
But I am wondering why the L1 BW to the core is not increased for integer RF

Because they didn't add ports; they only made them wider to do 2x 512-bit loads to AVX-512 registers, pretty much like Intel. Adding ports while maintaining very high clocks seems to be a difficult combination.

Saylick · Jun 3, 2024

Triskain said:
According to Fritz's estimate, the CCDs for Zen 4 and Zen 5 are basically the same size (which fits my rendered package shot eyeball). It looks like improved L3 layout density & topology (Ladder Cache for the win?) is doing the heavy lifting to compensate for the core area increase at the die level. If we take the core area increase at face value (big IF!), then the IPC increase (16%) fits okay-ish with Pollack's Rule, the square root of the area increase (√(1.26) -> 1.122 "predicted" IPC increase). So I wouldn't call the arch a disaster, more like an AMD interpretation of your average "Cove" iteration, but it's certainly no Zen 3 in terms of bang per gate. In retrospect, Zen 3 appears to have been a lightning-in-a-bottle moment unlikely to be repeated any time soon by AMD...

To use an analogy, I think a lot of us were expecting Zen 5 to be like ARM's X cores, where ST performance is prioritized above all else, but what it ended up being is more like ARM's middle cores, which provide a balance of performance, area, and power.

poke01 · Jun 3, 2024

https://twitter.com/x/status/1797693321047126499

There have been node shrinks on the cache to offset the bigger cores and, according to clam, this is why the CCD looks the same size.

gdansk · Jun 3, 2024

With a cache shrink, is it possible there is a 96MB V-Cache chip? If the L3 is smaller, did the TSV pattern change?

adroc_thurston · Jun 3, 2024

poke01 said:
There have been node shrinks on the cache to offset the bigger cores

There is no SRAM bitcell shrink, that's the thing.
It's the exact same bitcell across the N5, N4 and N3E families of nodes.
