Info: 64MB V-Cache on 5XXX Zen3, Average +15% in Games


Kedas

Senior member
Dec 6, 2018
355
339
136
Well, we now know how they will bridge the long wait for Zen 4 on AM5 in Q4 2022.
Production start for V-Cache is the end of this year, which is too early for Zen 4, so this is certainly coming to AM4.
The +15%, Lisa said, is "like an entire architectural generation."
 
  • Like
Reactions: Tlh97 and Gideon
Jul 27, 2020
15,755
9,819
106
In the long term (if AMD doesn't abandon V-Cache all of a sudden), game engines might get optimized to keep their most-used functions/data/assets etc. inside the V-Cache, with minimal need to go out to "hundreds of cycles slower" RAM. Procedural generation of game levels especially would benefit immensely. Why bundle gigabytes of assets with a game when you can just create them on the fly and serve them from V-Cache in nanoseconds with gobs of bandwidth? Imagine addictive and fun indie games under 100 MB that give AAA titles a run for their money. Can't wait to see the mind-boggling creations of hackers from the demo scene.
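As a toy illustration of the idea (everything below is made up and has nothing to do with any real engine), here's what "ship a seed, not the asset" looks like; as long as the generator's working set stays small, the regenerated asset basically lives in cache:

```python
import random

TILE = 256  # one 256x256 8-bit texture ~= 64 KB, tiny next to a 96 MB L3

def procedural_texture(seed: int, size: int = TILE) -> bytearray:
    """Deterministically regenerate a texture from a seed instead of
    loading it from disk. Same seed -> same bytes, every run."""
    rng = random.Random(seed)
    cells = 16
    # Coarse random lattice, bilinearly upsampled into smooth value noise.
    lattice = [[rng.randrange(256) for _ in range(cells + 1)]
               for _ in range(cells + 1)]
    tex = bytearray(size * size)
    step = size // cells
    for y in range(size):
        gy, fy = divmod(y, step)
        ty = fy / step
        for x in range(size):
            gx, fx = divmod(x, step)
            tx = fx / step
            top = lattice[gy][gx] * (1 - tx) + lattice[gy][gx + 1] * tx
            bot = lattice[gy + 1][gx] * (1 - tx) + lattice[gy + 1][gx + 1] * tx
            tex[y * size + x] = int(top * (1 - ty) + bot * ty)
    return tex

# A "level" is just a handful of seeds; the multi-gigabyte asset pack never exists.
level = {name: procedural_texture(seed)
         for name, seed in (("rock", 1), ("moss", 2), ("sand", 3))}
print({name: len(tex) for name, tex in level.items()})
```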
 
  • Like
Reactions: Tlh97 and ozzy702

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
Pity there appears to be no 5900X3D which would be a logical replacement for my 3900X. Did AMD give a reason? Was it just cost? Limited supply?

I guess I'll just keep an eye out for a heavy price cut for a 5900X.
There are a few 5900X3D chips, but only as lab prototypes (one was shown at Computex 2021). There are plenty of reasons why they didn't do it, but the main ones are that they are ramping up Milan-X, which they sell for $1,340 a pop (per CCD), because that is where the money is, and that they deem the 5800X3D enough to recapture the gaming crown and hold off Intel until Zen 4 drops.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
Dude, premium CPUs command premium prices. The 1800X was launched at $499 and was not even the top dog at gaming. Intel has always charged a pretty penny for top of the line (see 11700K vs 11900K).

I don't even think the 1800X was really pitched as a gaming CPU and that's not the reason it was worth $500 at launch. The 1800X was half the price of one of Intel's 8-core HEDT CPUs and delivered better performance at lower power in a lot of benchmarks. $500 was a bargain price for that CPU.

The 5800X3D will launch at $500 or more; it's a halo product for extreme gamers. This is not for people worrying about budgets, like many of you do.

It probably won't cost much more than the 12900KF, if it costs more at all, so I think $500 is probably about the limit. We also don't have a wider set of gaming benchmark results to go off of either, so it's still hard to say exactly how well it squares up against Intel's best.
 
  • Like
Reactions: Thunder 57 and Zepp

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
Another thing to remember is that games can't take advantage of another CCD's cache (otherwise the 16-core, 256MB EPYC would have been the best gaming processor). So having all of the extra 3D V-Cache would only have lowered the core speed (the 5900X3D prototype was shown at 4 GHz at Computex) for the same ~15% gains, so there is a trade-off.

On Milan-X, where many apps are meant to hop across CPUs and clusters, this is a no-brainer. But for gaming? The 5800X3D is the most balanced approach.
 
  • Like
Reactions: Tlh97 and Markfw

HurleyBird

Platinum Member
Apr 22, 2003
2,670
1,250
136
If they had released such an SKU they would have had to scale the single-core frequency down to 4.6 GHz at best, from its usual 4.9.

4.9 GHz was what they were comfortable binning at six quarters before the 5800X3D, and you're speaking very confidently about what must be complete speculation.

Again, if the limit is thermals, why did base clocks drop by 400 MHz while boost clocks only dropped by 200 MHz? (I mistakenly said earlier it was the same amount.) That's a 10.5% reduction at base vs. a 4.3% reduction at boost.

Obviously there's an element of thermals and an element of binning, but any thermal bottleneck is going to be more pronounced at high frequencies and voltages. The massive reduction in base clock speed compared with a much smaller reduction in boost clock doesn't make much sense if the primary consideration is thermals, but it may if it's binning.
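Just to put numbers on it (spec clocks: 3.8/4.7 GHz for the 5800X, 3.4/4.5 GHz announced for the 5800X3D):

```python
# Spec-sheet base/boost clocks in GHz.
base_5800x, base_x3d = 3.8, 3.4
boost_5800x, boost_x3d = 4.7, 4.5

base_drop = (base_5800x - base_x3d) / base_5800x      # 0.4 GHz -> ~10.5%
boost_drop = (boost_5800x - boost_x3d) / boost_5800x  # 0.2 GHz -> ~4.3%

print(f"base:  -{base_5800x - base_x3d:.1f} GHz ({base_drop:.1%})")
print(f"boost: -{boost_5800x - boost_x3d:.1f} GHz ({boost_drop:.1%})")
```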
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
I don't think it's all down to thermals, but aside from just dissipating heat, the additional cache is going to be consuming power. That may also have something to do with it on top of binning decisions, since powering all of that additional cache means there's less power to go around. If that's the case then the people willing to go beyond the default limits might be able to get some additional OC headroom.
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
I don't think it's all down to thermals, but aside from just dissipating heat, the additional cache is going to be consuming power. That may also have something to do with it on top of binning decisions, since powering all of that additional cache means there's less power to go around. If that's the case then the people willing to go beyond the default limits might be able to get some additional OC headroom.
The 3D V-Cache die can be turned off when not in use to save power (how the OS deems that necessary is something we don't know yet).
 

Abwx

Lifer
Apr 2, 2011
10,847
3,297
136
4.9 GHz was what they were comfortable binning at six quarters before the 5800X3D, and you're speaking very confidently about what must be complete speculation.

Again, if the limit is thermals, why did base clocks drop by 400 MHz while boost clocks only dropped by 200 MHz? (I mistakenly said earlier it was the same amount.) That's a 10.5% reduction at base vs. a 4.3% reduction at boost.

Obviously there's an element of thermals and an element of binning, but any thermal bottleneck is going to be more pronounced at high frequencies and voltages. The massive reduction in base clock speed compared with a much smaller reduction in boost clock doesn't make much sense if the primary consideration is thermals, but it may if it's binning.

All-core boost produces more heat, so silicon temp is higher, so they had to dial the clocks down to stay within the TDP.

For single-core boost it is assumed that the neighbouring cores are not boosted much, so the nearby silicon acts as a heatsink, which allows for higher frequencies.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
The 3D V-Cache die can be turned off when not in use to save power (how the OS deems that necessary is something we don't know yet).

Why turn it off when we know that the performance is better with the additional cache even though the clock speed is lower? Furthermore, if you're considering the total system power, then having more cache is better for lowering the total power use. Having to load something from main memory is more expensive in terms of power use. Unless you're going to put the computer to sleep, there aren't too many cases I can think of where you'd see an advantage from turning the additional cache off.
 

Hitman928

Diamond Member
Apr 15, 2012
5,180
7,631
136
Why turn it off when we know that the performance is better with the additional cache even though the clock speed is lower? Furthermore, if you're considering the total system power, then having more cache is better for lowering the total power use. Having to load something from main memory is more expensive in terms of power use. Unless you're going to put the computer to sleep, there aren't too many cases I can think of where you'd see an advantage from turning the additional cache off.

It doesn't get disabled, just power-gated so it isn't drawing power, and turned back on when needed; essentially it gets put to sleep. I imagine this all happens at the hardware level, so the OS/software doesn't have visibility into whether the extra cache is on or off.
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
Why turn it off when we know that the performance is better with the additional cache even though the clock speed is lower? Furthermore, if you're considering the total system power, then having more cache is better for lowering the total power use. Having to load something from main memory is more expensive in terms of power use. Unless you're going to put the computer to sleep, there aren't too many cases I can think of where you'd see an advantage from turning the additional cache off.

Just pointing out that AMD confirmed that it can be powered down when not in use

[Attached image: 1641506227049.png]
 
  • Like
Reactions: Elfear

HurleyBird

Platinum Member
Apr 22, 2003
2,670
1,250
136
All-core boost produces more heat, so silicon temp is higher, so they had to dial the clocks down to stay within the TDP.

For single-core boost it is assumed that the neighbouring cores are not boosted much, so the nearby silicon acts as a heatsink, which allows for higher frequencies.

You aren't explaining the reduction to base clocks. In a vanilla 5800X, there isn't a huge difference between single core and all core boost. The official specified boost frequency is closer to all core than single core. The difference between single and all core boost is significantly less than the 400 MHz reduction to base clocks.


It's possible that the 5800X3D will change this behavior so that there is a larger gradient between single and all core boost, but that would be pure speculation.
 

Abwx

Lifer
Apr 2, 2011
10,847
3,297
136
You aren't explaining the reduction to base clocks. In a vanilla 5800X, there isn't a huge difference between single core and all core boost. The official specified boost frequency is closer to all core than single core. The difference between single and all core boost is significantly less than the 400 MHz reduction to base clocks.


It's possible that the 5800X3D will change this behavior so that there is a larger gradient between single and all core boost, but that would be pure speculation.

A 5800X consumes as much as, if not more than, a 5950X; that's twice the heat in a single chiplet, so by definition it is the most thermally limited chip of the lot.

That they used this SKU as the first V-Cache product is quite surprising given said thermal constraints, but whatever the motives, it's not a matter of cost or short supply, since it's in TSMC's best interest to have a high-profile firm providing a commercial proof of concept. The only remaining reasons are either technical or uncertain market demand.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
It doesn't get disabled, just power-gated so it isn't drawing power, and turned back on when needed; essentially it gets put to sleep. I imagine this all happens at the hardware level, so the OS/software doesn't have visibility into whether the extra cache is on or off.

Perhaps my wording didn't make it clear, but in what situations would the hardware want to do that? The larger cache improves performance and reduces the amount of time the CPU needs to run because it doesn't have to wait on as many memory accesses. If that weren't the case, we wouldn't see a 15% performance improvement despite a ~5% reduction in clock speed.
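Rough math on what that implies, taking the claimed +15% at face value and using the spec boost clocks (4.5 GHz vs. 4.7 GHz):

```python
perf_gain = 1.15         # AMD's claimed average gaming uplift
clock_ratio = 4.5 / 4.7  # 5800X3D boost vs. 5800X boost

# Performance ~= clock x work-per-clock, so in those games the extra
# cache must be worth roughly this much more work per clock:
per_clock_gain = perf_gain / clock_ratio
print(f"~{per_clock_gain - 1:.0%} more per-clock throughput")  # ~20%
```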

Really, the only time you'd want to turn it off is when the system is idle, at which point the cores don't need additional power to boost either. Otherwise, as long as the system is operating, you'd always want to have the additional cache active because it ultimately saves power.

My original point was that AMD has kept the same TDP for the 5800X3D as it had for the 5800X. Because the cache requires power, there's less that can be supplied to the remainder of the chip, so the clock speeds naturally decrease. Of course, if you don't care about that and are willing to bypass those limits, it's entirely possible that you can still achieve the same clock speeds on a 5800X3D as you could on a 5800X, assuming the limits aren't also due to binning changes. Even if there were binning changes, that just means greater variability in silicon quality and that some parts could still reach those clocks, but there's no guarantee.
 

Hitman928

Diamond Member
Apr 15, 2012
5,180
7,631
136
Perhaps my wording didn't make it clear, but in what situations would the hardware want to do that? The larger cache improves performance and reduces the amount of time the CPU needs to run because it doesn't have to wait on as many memory accesses. If that weren't the case, we wouldn't see a 15% performance improvement despite a ~5% reduction in clock speed.

Really, the only time you'd want to turn it off is when the system is idle, at which point the cores don't need additional power to boost either. Otherwise, as long as the system is operating, you'd always want to have the additional cache active because it ultimately saves power.

You don't always need such a massive cache. Many programs fit just fine inside Zen's existing 32 MB of L3 cache and won't see any uplift from the extra cache. If the extra cache isn't being used, it doesn't make sense to waste energy keeping it powered up.
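Back-of-the-envelope on when gating would even be a win (every number below is invented purely to show the shape of the trade-off; AMD hasn't published any of this):

```python
# Toy model only: invented figures, not AMD's actual policy or numbers.
VCACHE_IDLE_POWER_W = 1.5       # assumed static draw of the stacked SRAM when on
DRAM_ENERGY_PER_MISS_J = 20e-9  # assumed extra energy per access pushed out to DRAM

def gating_saves_power(extra_hits_per_s: float) -> bool:
    """True if gating the V-Cache is a net win for this workload.

    extra_hits_per_s: accesses per second that only hit because the extra
    64 MB is there, i.e. they turn into DRAM traffic once it's gated.
    """
    extra_dram_power = extra_hits_per_s * DRAM_ENERGY_PER_MISS_J
    return extra_dram_power < VCACHE_IDLE_POWER_W

print(gating_saves_power(1e6))  # working set fits in the base 32 MB -> gate it
print(gating_saves_power(1e9))  # cache-hungry game -> keep it powered
```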

My original point was that AMD has kept the same TDP for the 5800X3D as it had for the 5800X. Because the cache requires power, there's less that can be supplied to the remainder of the chip, so the clock speeds naturally decrease. Of course, if you don't care about that and are willing to bypass those limits, it's entirely possible that you can still achieve the same clock speeds on a 5800X3D as you could on a 5800X, assuming the limits aren't also due to binning changes. Even if there were binning changes, that just means greater variability in silicon quality and that some parts could still reach those clocks, but there's no guarantee.

The base clocks will be set based upon max core utilization, meaning the full L3 cache is engaged. Potentially, in real-world clocks, if the V-Cache is powered off you would still see the same base clocks as the standard 5800X, all else being equal. I'm honestly not sure that thermals will be an issue, as the cache doesn't get that hot, and the stacked die sits only above existing cache, so it shouldn't get as hot as it would above the FPU units, for instance.

The rest of the top layer that sits above the rest of the base CPU is substrate designed for low thermal resistance. The amount of substrate above the CPU logic/registers is the same in the V-Cache SKUs as in the regular SKUs because they have the same total thickness. The stability substrate in the V-Cache SKUs should conduct heat as well as, if not better than, the substrate in the regular SKUs. The question is how much thermal resistance appears at the bonding interface. I don't know, but it should be relatively thin and use low-thermal-resistance bonding, so I wouldn't think it adds all that much, but I don't really know.
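For a rough feel of the numbers, a one-dimensional series-resistance estimate (the thicknesses, areas and interface conductivity below are pure guesses on my part; only the bulk silicon value is a textbook ballpark):

```python
# 1-D series thermal resistance: R = t / (k * A); temperature rise = q * R.
# All thicknesses/areas are assumptions for illustration, not measured values.
AREA = 80e-6       # m^2 (~80 mm^2 of CCD under the stack, assumed)
K_SILICON = 140.0  # W/(m*K), bulk silicon, textbook ballpark

layers = [
    ("base die above the logic",  20e-6, K_SILICON),  # assumed remaining thickness
    ("bonding interface",          1e-6,  5.0),       # assumed thin, poorer conductor
    ("structural/cache die",      40e-6, K_SILICON),  # assumed thickness
]

r_stack = sum(t / (k * AREA) for _, t, k in layers)  # K/W
q = 90.0  # assumed watts flowing up through the stack
# Uniform flux understates local hotspots, but shows the stack itself is thin.
print(f"average delta-T across the stack ~= {q * r_stack:.1f} K at {q} W")
```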

Does the L3 cache clock at the same speed as the cores in Zen 3? Anyone know?
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
You aren't explaining the reduction to base clocks. In a vanilla 5800X, there isn't a huge difference between single core and all core boost. The official specified boost frequency is closer to all core than single core. The difference between single and all core boost is significantly less than the 400 MHz reduction to base clocks.


It's possible that the 5800X3D will change this behavior so that there is a larger gradient between single and all core boost, but that would be pure speculation.

The reason for the reduction in base and boost clocks is power budget. A 5800X uses up to 142W of power. The V-Cache component itself uses a slice of that power, which lowers the power budget for the cores. The reason the base clocks drop more than boost is that the base clock is a baseline rating when all cores are active, and the boost clock is the peak boost for one core. The reason the boost clock is slightly lower is actually for better binning. It has absolutely nothing to do with thermals. You can enable PBO or overclock to get the performance back.

EDIT: this information comes directly from AMD. If you read AnandTech you would already know this.
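To put rough numbers on the budget argument (only the 142 W package power figure is AMD's; the SoC and V-Cache draws below are guesses):

```python
PPT_W = 142.0    # package power limit for AMD's 105 W TDP AM4 parts
SOC_IO_W = 20.0  # assumed draw of the IO die, Infinity Fabric, memory PHY, etc.
VCACHE_W = 4.0   # assumed draw of the stacked 64 MB SRAM under load
CORES = 8

per_core_plain = (PPT_W - SOC_IO_W) / CORES
per_core_x3d = (PPT_W - SOC_IO_W - VCACHE_W) / CORES
print(f"per-core budget: {per_core_plain:.1f} W -> {per_core_x3d:.1f} W with V-Cache")
```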
 
  • Like
Reactions: igor_kavinski

HurleyBird

Platinum Member
Apr 22, 2003
2,670
1,250
136
A 5800X uses up to 142W of power. The V-Cache component itself uses a slice of that power, which lowers the power budget for the cores. The reason the base clocks drop more than boost is that the base clock is a baseline rating when all cores are active, and the boost clock is the peak boost for one core. The reason the boost clock is slightly lower is actually for better binning. It has absolutely nothing to do with thermals. You can enable PBO or overclock to get the performance back.

So, one guy is saying it's all thermals. You think it's not thermals at all, but rather that it's all about power. In actuality there's a bit of both, but even together they can't explain a 400 MHz drop in base clocks.

A 5950X also has a 3.4 GHz base clock with the same 105W TDP, even accounting for better binning on the 5950X (and again, six quarters passed between that and the 5800X3D; average silicon quality has obviously gone up). You wouldn't make the argument that the V-Cache die consumes anywhere near as much as a compute die, because that would be silly. Yet here we are.
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
So, one guy is saying it's all thermals. You think it's not thermals at all, but rather that it's all about power. In actuality there's a bit of both, but even together they can't explain a 400 MHz drop in base clocks.

A 5950X also has a 3.4 GHz base clock with the same 105W TDP, even accounting for better binning on the 5950X (and again, six quarters passed between that and the 5800X3D; average silicon quality has obviously gone up). You wouldn't make the argument that the V-Cache die consumes anywhere near as much as a compute die, because that would be silly. Yet here we are.

Where has anyone from AMD ever stated it was thermals? In the strictest sense, AMD’s TDP is pretty close to their power consumption, so maybe someone got confused?
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
You don't always need such a massive cache. Many programs fit just fine inside Zen's existing 32 MB of L3 cache and won't see any uplift from the extra cache. If the extra cache isn't being used, it doesn't make sense to waste energy keeping it powered up.

What general workload is going to fit in 32 MB of cache? Sure, the instructions aren't necessarily going to use that much, but it's rare that anything is operating on less than 32 MB of data.

That also doesn't include other programs that are running. Sure, for benchmarking purposes everything else is turned off and background processes are eliminated to the greatest extent possible, but most people are going to be running multiple other applications in the background and will make great use of that extra cache space.

The base clocks will be set based upon max core utilization, meaning the full L3 cache is engaged. Potentially, in real-world clocks, if the V-Cache is powered off you would still see the same base clocks as the standard 5800X, all else being equal.

I suppose if we get the magic workload that doesn't need the additional cache, and it's being run in something more akin to a server environment where only that program is running, you could just get the regular performance by not utilizing the V-Cache. But if you know that, why even bother buying the 5800X3D instead of a 5800X, or something else like a 59xxX?
 
  • Like
Reactions: Tlh97

Hitman928

Diamond Member
Apr 15, 2012
5,180
7,631
136
What general workload is going to fit in 32 MB of cache? Sure, the instructions aren't necessarily going to use that much, but it's rare that anything is operating on less than 32 MB of data.

That also doesn't include other programs that are running. Sure, for benchmarking purposes everything else is turned off and background processes are eliminated to the greatest extent possible, but most people are going to be running multiple other applications in the background and will make great use of that extra cache space.

I suppose if we get the magic workload that doesn't need the additional cache, and it's being run in something more akin to a server environment where only that program is running, you could just get the regular performance by not utilizing the V-Cache. But if you know that, why even bother buying the 5800X3D instead of a 5800X, or something else like a 59xxX?

Cinebench is an example that doesn't use all that cache. Several of the SPEC2006 benchmarks get nowhere near saturating Zen 3's cache; I'm pretty sure they don't even break out of Apple's M1 L2 cache. It's not like the whole software program gets loaded into the cache, just the data and instructions it needs to do the calculations. We'll have to see when it gets released how often the extra L3 cache is really used. It's usually in very branchy code, like compiling and games, where a lot of cache helps.
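A quick way to see the working-set effect is a pointer-chasing toy (sizes are approximate because CPython objects are fat, but the jump once the list outgrows the L3 still shows up):

```python
import random
import time

def ns_per_hop(n_elems: int, hops: int = 1_000_000) -> float:
    """Average time per hop while chasing a random cycle of n_elems entries.
    Once the working set spills out of L3, each hop is roughly a DRAM access."""
    order = list(range(n_elems))
    random.shuffle(order)
    nxt = [0] * n_elems
    for i in range(n_elems):  # link the shuffled order into one big cycle
        nxt[order[i]] = order[(i + 1) % n_elems]
    idx = order[0]
    t0 = time.perf_counter()
    for _ in range(hops):
        idx = nxt[idx]
    return (time.perf_counter() - t0) / hops * 1e9

# Nominal sizes only; CPython ints are heap objects, so the real footprint is bigger.
for nominal_mb in (1, 8, 32, 128):
    n = nominal_mb * 1024 * 1024 // 8  # ~8 bytes per list slot
    print(f"~{nominal_mb:>3} MB list: {ns_per_hop(n):.1f} ns/hop")
```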
 

Hitman928

Diamond Member
Apr 15, 2012
5,180
7,631
136
So, one guy is saying it's all thermals. You think it's not thermals at all, but rather that it's all about power. In actuality there's a bit of both, but even together they can't explain a 400 MHz drop in base clocks.

A 5950X also has a 3.4 GHz base clock with the same 105W TDP, even accounting for better binning on the 5950X (and again, six quarters passed between that and the 5800X3D; average silicon quality has obviously gone up). You wouldn't make the argument that the V-Cache die consumes anywhere near as much as a compute die, because that would be silly. Yet here we are.

While in reality it is only a 10.5% reduction in base clock, it does seem excessive for power or thermal necessity alone; maybe it will be discussed more upon review. The only other reason I can think of is signal integrity: with all the extra cache to communicate with, the CPU may not be able to sustain clocks as high, and it would be worse when all cores are accessing L3. That's just a wild guess though. Hope a reviewer asks AMD about it.
 

Hitman928

Diamond Member
Apr 15, 2012
5,180
7,631
136
As far as I am aware, only the L1 is as fast as the core. L2 is half of that and L3 is even slower.

If L3 speed isn't fixed but is somehow tied to the clock of the cores, that might have something to do with it, given the massive amount of L3 cache added. (Again, just throwing out guesses here.)