Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Page 857

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Aug 22, 2001
32,012
32,465
146
I'm happy to pay bro 😭
 

Gideon

Platinum Member
Nov 27, 2007
2,030
5,034
136
32 MB on-die L3$ is already quite a lot for 8c/16t for most applications. The next time they spend more CCD area on cache, they might perhaps spend it on L2$ rather than L3$.
I really wish the x86 world (for desktop and mobile) would also offer a tiled L2 similar to Qualcomm's and Apple's:

Snapdragon X has a 96KB L1 and 12MB L2 per core cluster (4 x 3MB slices, but a single core can use all 12MB), with nearly identical latency to AMD’s private 1MB L2.




Oryon uses an Apple-like caching strategy. A large 96 KB L1 and relatively fast L2 with 20 cycles of latency together mean Oryon doesn’t need a mid-level cache. Firestorm has a bigger 128 KB L1, but Oryon’s L1 is still much larger than the 32 or 48 KB L1 caches in Zen 4 or Redwood Cove.


AMD has a 1 MB L2 mid-level cache private to each core, then a 16 MB L3. That setup makes it easier to increase caching capacity, because the L2 cache can insulate the core from L3 latency. However, that advantage is minimal for mobile Zen 4 parts, which max out at 16 MB of L3. Oryon therefore provides competitive latency especially as accesses spill out of Zen 4’s L2. Meteor Lake follows a similar caching strategy to Zen 4, but has more caching capacity at the expense of higher latency.
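
(Side note: latency figures like the ones above are typically measured with a pointer-chasing loop over a buffer sized to the cache level of interest. Below is a minimal sketch of that technique in C, with illustrative buffer sizes and iteration counts; it is not the actual harness behind the numbers quoted here.)

```c
// Minimal pointer-chasing latency sketch: each load depends on the previous
// one, so the average time per iteration approximates load-to-use latency for
// whatever cache level the buffer fits into. Sizes are illustrative only.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    // Size the working set to target a cache level, e.g. 8 MiB to land in L3
    // on a Zen 4 CCD, 512 KiB to stay in L2. Illustrative values.
    size_t bytes = 8u << 20;
    size_t n = bytes / sizeof(size_t);

    size_t *chain = malloc(n * sizeof(size_t));
    if (!chain) return 1;

    // Build a random cyclic permutation (Sattolo's algorithm) so hardware
    // prefetchers cannot predict the next address.
    for (size_t i = 0; i < n; i++) chain[i] = i;
    srand(42);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = chain[i];
        chain[i] = chain[j];
        chain[j] = tmp;
    }

    // Chase the chain; the data dependency serializes the loads.
    size_t idx = 0;
    const size_t iters = 100u * 1000u * 1000u;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < iters; i++) idx = chain[idx];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    // Print idx so the compiler can't optimize the loop away.
    printf("~%.2f ns per load (sink: %zu)\n", ns / iters, idx);
    free(chain);
    return 0;
}
```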

I'm not sure what the optimal cache hierarchy would be for desktop chips, but the current design sure seems nonoptimal for client workloads (it makes much more sense on server).

Could you imagine the gaming performance of an AMD chip that would have:
  • 24MB tiled L2 (12MB + 12MB shared by 4 cores each, if it's required to not regress in latency)
  • 96 - 128MB L3 (3D cache)
  • 2nd gen chiplets (similar to Strix Halo) with full-width memory bandwidth per CCD and faster FCLK support (2600+ MHz)

Ideally it should also have 3-4 memory channels using aggressive low-latency CAMM2 modules, but you can't have it all I guess (that would need a new socket) ...


TL;DR:

A 1MB - 3MB private L2 only makes sense if it provides 2-3x better latency or bandwidth than a 12MB shared, tiled L2. Otherwise, it’s a waste of SRAM potential, IMO.
 
Last edited:

Kepler_L2

Senior member
Sep 6, 2020
998
4,258
136

MS_AT

Senior member
Jul 15, 2024
868
1,762
96
I really wish the x86 world (for desktop and mobile) would also offer a tiled L2 similar to Qualcomm's and Apple's:

Snapdragon X has a 96KB L1 and 12MB shared L2 per 4 cores (4x 3MB slices, but a single core can use all 12MB), but it has the same latency as AMD's private 1MB L2.
It does have similar latency in cycles, but worse absolute latency [ns].
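
To put rough numbers on it: ns = cycles / GHz, so the same cycle count gets worse in absolute terms as the clock drops. The clocks below are purely illustrative assumptions, not measurements:

```c
// Latency in ns = cycles / clock (GHz): identical cycle counts translate to
// very different wall-clock latency at different frequencies.
// The clock values below are illustrative assumptions, not measured figures.
#include <stdio.h>

int main(void) {
    const double cycles = 20.0;                   /* e.g. the ~20-cycle L2 quoted above */
    const double clocks_ghz[] = {3.4, 4.3, 5.7};  /* assumed: laptop cluster clock,
                                                     boost clock, desktop Zen 4 boost */
    for (size_t i = 0; i < sizeof clocks_ghz / sizeof clocks_ghz[0]; i++)
        printf("%.0f cycles @ %.1f GHz -> %.2f ns\n",
               cycles, clocks_ghz[i], cycles / clocks_ghz[i]);
    return 0;
}
/* 20 cycles @ 3.4 GHz -> 5.88 ns
   20 cycles @ 4.3 GHz -> 4.65 ns
   20 cycles @ 5.7 GHz -> 3.51 ns */
```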
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,665
2,530
136
I really wish the x86 world (for desktop and mobile) would also offer a tiled L2 similar to Qualcomm's and Apple's:

Snapdragon X has a 96KB L1 and 12MB L2 per core cluster (4 x 3MB slices, but a single core can use all 12MB), with nearly identical latency to AMD’s private 1MB L2.

At what clocks? Cache latency measured in clock cycles is identical, but they could only implement that because they had twice the wall-clock time per cycle to work with.

The AMD L2 is extremely tight; you are not increasing its size at all without a latency regression. You are absolutely not sharing it with anything without a latency regression.
 

StefanR5R

Elite Member
Dec 10, 2016
6,670
10,551
136
This is supposed to be an "official spec sheet". One thing that's missing from the previous "official spec sheet" is DDR5-6000 support:

Already posted in #21,400 and #21,401. And it hasn't become any more official than it was at the time of #21,405 and #21,411. :-) IOW, it's quite possibly not a leak, but likely just a rehash of previous leaks and wannabe-leaks.

It's an Austrian price comparison site. They are unlikely to receive 1st party spec sheets before product launch.
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
6,670
10,551
136
V-Cache isn't visible on retail silicon. Those shots of Lisa holding X3D parts with visible dies are without the top supporting silicon.
To be fair, it's not just the very lowest end of online journalism who don't get it. ComputerBase.de have been trolled ;-) by AMD in the same way. "... the 3D cache, which can otherwise be seen with the naked eye, ..."

(Edit: CB were made aware of this and have now corrected their article.)
 
Last edited:

coercitiv

Diamond Member
Jan 24, 2014
7,355
17,424
136
This writing is either outsourced to Malaysia or fake.
or both
You can add LLM-generated as well:
Introducing the AMD Ryzen 7 9800X3D, the latest powerhouse in gaming and multitasking performance, featuring revolutionary 3D V-Cache technology. Elevate your computing experience with unbeatable speeds and unparalleled efficiency.
  • Next-Gen 3D V-Cache Technology: Enhanced performance with up to 96MB of L3 cache.
  • Unmatched Gaming Performance: Boosts gaming performance by up to 26% over the previous generation.
  • Higher Clock Speeds: Achieves up to 5.2GHz for lightning-fast processing.
  • Improved Thermal Performance: Better cooling efficiency for sustained high performance.
  • Zen 5 Architecture: Built on the latest Zen 5 core architecture for superior efficiency and power.
  • AM5 Compatibility: Fully compatible with the AM5 platform, supporting PCIe Gen 5 and DDR5 memory.
  • Multi-Threaded Excellence: Ideal for multitasking and content creation with 8 cores and 16 threads.
  • Future-Proof Design: Ready for upcoming technologies and applications.
 

Gideon

Platinum Member
Nov 27, 2007
2,030
5,034
136
It does have similar latency in cycles, but worse absolute latency [ns].
Yeah, that's true. I wish we had a (relatively) apples-to-apples latency comparison between the M4 @ 4.4 GHz and Zen 5 to see what the actual latencies are. The only info chipsandcheese has up is a rather out-of-date 7950X vs M1 comparison (from here):

[chipsandcheese chart: M1 vs 7950X cache latency]

Roughly 5.4 ns for the M1 vs 2.4 ns for the 7950X. So yeah, a very significant difference of over 2x.

But the M1 clocked only up to 3.2 GHz, while the M4 clocks up to 4.4 GHz. I'm more than certain Apple relaxed the L2 latency by a few cycles in doing that, but I'd still very much like to see where they ended up (and I hope reviewers measure it).
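
Working backwards from those two figures (implied cycles = ns x GHz), with the M1 at its ~3.2 GHz and the 7950X assumed to be running near its ~5.7 GHz boost:

```c
// Back out implied cycle counts from the measured ns latencies quoted above.
// Clock assumptions: M1 ~3.2 GHz (as stated), 7950X ~5.7 GHz boost (assumed).
#include <stdio.h>

int main(void) {
    printf("M1:    %.1f ns * 3.2 GHz = ~%.0f cycles\n", 5.4, 5.4 * 3.2);
    printf("7950X: %.1f ns * 5.7 GHz = ~%.0f cycles\n", 2.4, 2.4 * 5.7);
    return 0;
}
/* ~17 cycles vs ~14 cycles: the cycle counts are in the same ballpark,
   so most of the ~2x ns gap comes from the clock-speed difference. */
```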

The AMD L2 is extremely tight; you are not increasing its size at all without a latency regression. You are absolutely not sharing it with anything without a latency regression.

That's true, it's not possible without some latency regression. My whole point was that with the ever-more-prevalent 3D cache, there is a growing gap between the rather small 1MB L2 and the gigantic 96MB L3.

Take the TechPowerUp reviews as an example (as they use the same mobo and RAM configs):

  • For the Ryzen 9700X review they registered 7.7 ns L3 latency
  • For the Ryzen 7700X it was 9.9 ns L3 latency
  • For the Ryzen 7800X3D it was 12.7 ns L3 latency, for a 3x bigger cache

A <30% regression for 3x the size. Looks to be a pretty decent tradeoff (and I expect it to be less for the 9800X3D, as it clocks higher!).

But then again, the latency gap between L2 and L3 went from 3x to 4x.
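
Spelled out (the ~3 ns L2 latency here is an assumption, the same ballpark figure I use further down):

```c
// The L3 regression and the L2-to-L3 gap, computed from the TechPowerUp
// numbers above. The ~3 ns L2 latency is an assumption (see further down).
#include <stdio.h>

int main(void) {
    const double l2_ns = 3.0;                      /* assumed Zen 4/5 L2 latency */
    const double l3_7700x = 9.9, l3_7800x3d = 12.7;

    printf("X3D L3 regression: %.0f%% for 3x the capacity\n",
           (l3_7800x3d / l3_7700x - 1.0) * 100.0);
    printf("L2->L3 gap, 7700X:   %.1fx\n", l3_7700x / l2_ns);
    printf("L2->L3 gap, 7800X3D: %.1fx\n", l3_7800x3d / l2_ns);
    return 0;
}
/* ~28% regression; the L2->L3 latency gap grows from ~3.3x to ~4.2x. */
```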

As many consumer applications are heavily cache/memory bound, there seems to be performance there, waiting to be extracted.

What options are there to do that?

1. Adding extra cache layers - possible, but it comes with numerous other significant drawbacks
2. Upsizing the private L2 to 2MB or 3MB (as Intel did) - this is the easiest solution, but even with "just" 2MB of L2, we'd use 16MB of the CCD's SRAM budget on L2 while limiting the amount a single thread can use to 2MB. Going beyond that (3MB for 24MB total) seems insanely wasteful to me.
3. Sharing the L2 between cores - a much more complex solution, with obvious latency regressions as you stated

Extrapolating from what AMD did with the L3, it should be possible to go from 1MB to 3MB with a ~30% latency increase (3 ns -> 4 ns). Actually I think AMD would do better, as AFAIK going from 512KB to 1MB they managed to regress much less than that!
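
Roughly sketched, here's the per-CCD SRAM budget for option 2 plus that latency extrapolation (the ~3 ns starting point is an assumption):

```c
// Per-CCD L2 SRAM budget for option 2, plus the rough latency extrapolation
// used above (~30% more latency for 3x the capacity, starting from ~3 ns).
#include <stdio.h>

int main(void) {
    const int cores_per_ccd = 8;
    const int l2_mb[] = {1, 2, 3};

    for (size_t i = 0; i < sizeof l2_mb / sizeof l2_mb[0]; i++)
        printf("%d MB private L2 x %d cores = %2d MB of CCD SRAM on L2\n",
               l2_mb[i], cores_per_ccd, l2_mb[i] * cores_per_ccd);

    // Extrapolated private-L2 latency if 3x capacity costs ~30% latency.
    const double l2_now_ns = 3.0;          /* assumed current L2 latency */
    printf("1 MB -> 3 MB: ~%.1f ns -> ~%.1f ns\n", l2_now_ns, l2_now_ns * 1.3);
    return 0;
}
/* 8 / 16 / 24 MB of SRAM; ~3.0 ns -> ~3.9 ns, i.e. roughly the 3 -> 4 ns
   figure above. */
```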

TL;DR: So a private 2MB L2 is indeed the most obvious solution to address this.

It's just that in my La La land, I'd like to see a shared L2 solution where the banks next to the core have almost no latency regression and the ones further away have a 20-30% regression, but allow a core to use up to 8MB of L2 instead of "just" 2MB.

The intriguing alternative is to keep the L2 at 2MB on the base SKU and take the "2-3 cycle hit" on 3D cache parts by also doubling the private L2 on the V-cache die to 4MB (keeping the relative latency between L2 and L3 the same).
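
To illustrate why I think the shared-banks idea could pay off, here's a toy average-latency model; every hit fraction and latency in it is made up purely for illustration:

```c
// Toy model of a NUCA-style shared L2: near banks keep today's latency,
// far banks cost 20-30% more, and the average depends on where hits land.
// All fractions and latencies below are made-up, illustrative numbers.
#include <stdio.h>

int main(void) {
    const double near_ns = 3.0;        /* assumed: near bank, ~today's L2 */
    const double far_ns  = 3.0 * 1.25; /* assumed: far bank, +25% */
    const double l3_ns   = 12.7;       /* 7800X3D L3 figure from above */

    /* Assumed hit split for a cache-hungry thread: 60% near bank,
       25% far banks (capacity it couldn't have had with a 2 MB private L2),
       15% spills to L3. */
    double shared_avg = 0.60 * near_ns + 0.25 * far_ns + 0.15 * l3_ns;

    /* Same thread with a 2 MB private L2: the 25% that hit far banks above
       now miss to L3 instead. */
    double private_avg = 0.60 * near_ns + 0.40 * l3_ns;

    printf("shared tiled L2 : %.2f ns average\n", shared_avg);
    printf("2 MB private L2 : %.2f ns average\n", private_avg);
    return 0;
}
/* ~4.6 ns vs ~6.9 ns with these made-up numbers: the far banks only pay off
   if they actually convert L3 (or DRAM) accesses into L2 hits. */
```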
 
Last edited: