Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

MALL only makes sense for GPUs.
Broadwell, Crystal Well, etc. showed that it can matter for certain applications, especially those with limited data sets that do repetitive computations, like game engines.

But I think MALLs could matter more if a hardware/software profiling solution were developed, so that data which is still needed does not keep getting evicted and reloaded over and over just because there are short periods where it isn't touched.
 

LightningZ71

Platinum Member
Yeah but we need 512MB of that. At least I won't be satisfied till that happens
In general, the miss rate on a last-level cache roughly halves as the size of the cache quadruples. For example, if your hit rate on a 512 KB cache was 90%, your miss rate would be 10%. Doubling that cache twice, to 2 MB, would improve the miss rate to 5% and the hit rate to 95%. It makes a noticeable difference mainly in programs whose hot working set now fits in the expanded cache but spilled before. These are very general numbers for x86 code, as the effect is still HIGHLY dependent on the hot working set size of each program.
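That quadrupling rule (miss rate roughly proportional to one over the square root of cache size) can be sketched numerically. This is only a rule-of-thumb model, and the 512 KB / 90%-hit baseline is just the figure from the example above:

```python
import math

def scaled_miss_rate(base_size_kb: float, base_miss_rate: float,
                     new_size_kb: float) -> float:
    """Rule of thumb: miss rate halves each time cache size quadruples,
    i.e. miss_rate ~ base_miss_rate * sqrt(base_size / new_size)."""
    return base_miss_rate * math.sqrt(base_size_kb / new_size_kb)

# Baseline from the example: 512 KB cache, 10% miss rate (90% hit rate).
for size_kb in (512, 1024, 2048, 8192):
    mr = scaled_miss_rate(512, 0.10, size_kb)
    print(f"{size_kb:>5} KB -> miss rate {mr:.1%}")
# 2048 KB (two doublings) lands at exactly 5.0%, matching the text.
```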

Be aware that, for every doubling of cache size, you either introduce additional access latency (which the program also sees on every memory operation that results from a miss), or you make the design of the cache more complex, taking up more area and adding product cost. Eventually you just aren't making any useful impact on working-set latencies and have to resort to LOTS of speculative prefetches from main memory to preload the cache with data you think the program will need next. That burns a lot of energy on memory accesses that are often unneeded.
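The trade-off can be made concrete with a standard average-memory-access-time (AMAT) calculation. The cycle counts below are illustrative assumptions, not measurements of any real part:

```python
def amat(hit_latency_cycles: float, miss_rate: float,
         miss_penalty_cycles: float) -> float:
    """Average memory access time: the hit cost every access pays,
    plus the miss penalty weighted by how often you actually miss."""
    return hit_latency_cycles + miss_rate * miss_penalty_cycles

# Hypothetical L3 configurations: the bigger cache halves the miss rate
# but costs 10 extra cycles per access, so the net win shrinks.
small = amat(hit_latency_cycles=40, miss_rate=0.10, miss_penalty_cycles=300)
large = amat(hit_latency_cycles=50, miss_rate=0.05, miss_penalty_cycles=300)
print(small, large)  # 70.0 65.0 -- a modest gain despite 4x the capacity
```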

I think that AMD is currently happy with their L3 cache ratio and may look to maintain that ratio into larger CCXs with respect to VCache packages.
 
Eventually, you just aren't making any useful impact in working set latencies and will have to resort to LOTS of predictive extra data loads from main memory to attempt to preload the cache with data that you think that the program will need next. This burns up a lot of energy making memory calls that are often unneeded.
This should be exposed as a BIOS option so users can make that call themselves. I personally have no issue burning a few extra watts for maximum performance.
 

LightningZ71

Platinum Member
I vaguely remember from long ago that there were processors that had bios settings where you could turn cache prefetch on and off. It's been a minute, I've slept since then, and there may have been an alcohol or two in my system along the way, so that's about all I have at the moment.
 

MS_AT

Senior member
I vaguely remember from long ago that there were processors that had bios settings where you could turn cache prefetch on and off. It's been a minute, I've slept since then, and there may have been an alcohol or two in my system along the way, so that's about all I have at the moment.
It should be available on AM5. Usually the option can be found in an AMD-specific menu, but your mileage may vary depending on the motherboard manufacturer.