Info 64MB V-Cache on 5XXX Zen3 Average +15% in Games


Kedas

Senior member
Dec 6, 2018
Well, we now know how they will bridge the long wait to Zen 4 on AM5 in Q4 2022.
Production start for V-Cache is at the end of this year, too early for Zen 4, so this is certainly coming to AM4.
The +15%, Lisa said, is "like an entire architectural generation"
 

Insert_Nickname

Diamond Member
May 6, 2012
Odd that they released the 3700X and the 3800X around the same time, while for Vermeer they released the 5800X on 11/05/20 and are only now releasing the 5700X on 04/04/22. Once again they're proving the 5800X a bad buy. Dang, got suckered again, lol

The 5800X has always been the odd member of the bunch. If you've been going for value, the 5600X has been the better buy, and if you're going for performance, the step up in price to the 5900X is relatively small. You get so much more with the 5900X: 4 more cores, double the L3 cache, etc. Add that the 5800X is difficult to cool due to the single chiplet.

That new 5700X is killer value if you have an older X370/B450 board to put it in, even if you lose PCIe 4.0. It's even 65W, so it'll be a breeze to cool (pun intended). They've got a real winner there, which may explain why they didn't launch it with the others: too popular, thus requiring too many shipped units.

I'll probably get one for my Crosshair VI. It'll bring it completely up to modern standards. Such an update for a 5-year-old board is almost unheard of.
 

Insert_Nickname

Diamond Member
May 6, 2012
AMD has been selling every good 8C/16T CCD as Milan in the middle of a pandemic/chip shortage. What were they supposed to do?

Exactly. Not enough dies to go 'round. It'd have ended up like the 3100/3300X. Impossible to get hold of, apart from the initial supply.

About the only thing missing is a low-end APU. Maybe the Athlon 4150GE will see a general launch too eventually.
 

Shivansps

Diamond Member
Sep 11, 2013
AMD has been selling every good 8C/16T CCD as Milan in the middle of a pandemic/chip shortage. What were they supposed to do?

Not discontinue 12nm APUs, CPUs and Polaris GPUs in the middle of this mess, maybe? It certainly didn't help that they put all the pressure on 7nm at the same time.
 

Shivansps

Diamond Member
Sep 11, 2013
They didn't. They just fed them to the Chromebooks.

Most of those were 14nm native dual cores. There is no way they sold enough 3500C/3700C to kill the entire PC APU market for over a year. And in notebooks they got replaced by Renoir. 12nm APUs/CPUs disappeared some time after Renoir launched.

Also, the 12/14nm 3000G Athlons were doing just fine in the PC market... until the mining craze; then they got sucked up by miners. Now Athlons are MIA and have been for at least half a year.

Also keep in mind that the 5600G and 5700G have been holding up just fine since launch, and Cezanne is still the main notebook APU as of today.
 

nicalandia

Diamond Member
Jan 10, 2019
Most of those were 14nm native dual cores. There is no way they sold enough 3500C/3700C to kill the entire PC market. And in notebooks they got replaced by Renoir.

Also, the 12/14nm 3000G Athlons were doing just fine in the PC market... until the mining craze; then they got sucked up by miners. Now Athlons are MIA and have been for at least half a year.
Believe it or not, there was (and still is) a shortage of 14nm/12nm capacity. Better to use that capacity on the IO die than on low-return Athlons.
 

jpiniero

Lifer
Oct 1, 2010
Most of those were 14nm native dual cores. There is no way they sold enough 3500C/3700C to kill the entire PC APU market for over a year. And in notebooks they got replaced by Renoir. 12nm APUs/CPUs disappeared some time after Renoir launched.

Demand is declining, but there are also still Windows laptops out there with Picasso. I think some models that came with Dali now come with a quad-core model.
 

jamescox

Senior member
Nov 11, 2009
AMD has been selling every good 8C/16T CCD as Milan in the middle of a pandemic/chip shortage. What were they supposed to do?
It isn't just the money. It is also an image/reputation issue. It is very bad for corporations to decide to make the jump to AMD and then not be able to get the parts.
 

Mopetar

Diamond Member
Jan 31, 2011
There's also the massive Asian market to consider. India and China account for over a third of the world's population by themselves, and while there are some wealthier segments of the population that buy the same products Western countries do, a lot of the population can't afford that but could buy low-end parts that won't even be released in the US or Europe.
 

epsilon84

Golden Member
Aug 29, 2010
The 5800X has always been the odd member of the bunch. If you've been going for value, the 5600X has been the better buy, and if you're going for performance, the step up in price to the 5900X is relatively small. You get so much more with the 5900X: 4 more cores, double the L3 cache, etc. Add that the 5800X is difficult to cool due to the single chiplet.

That new 5700X is killer value if you have an older X370/B450 board to put it in, even if you lose PCIe 4.0. It's even 65W, so it'll be a breeze to cool (pun intended). They've got a real winner there, which may explain why they didn't launch it with the others: too popular, thus requiring too many shipped units.

I'll probably get one for my Crosshair VI. It'll bring it completely up to modern standards. Such an update for a 5-year-old board is almost unheard of.

Gotta agree with this, I see the 5600/5700X as the true value options to extend the life of aging AM4 platforms.

WRT the 5900X, since it has twice the L3 cache of the 5800X, is there a reason why there is no appreciable difference in gaming performance? I'm guessing it has something to do with the dual-CCD layout of the 5900X and the latencies involved between the two 32MB L3 pools?
Basically, what confuses me is that a 5900X w/64MB L3 shows no improvement over a 5800X w/32MB L3 in games, yet a 5800X3D w/96MB L3 is supposed to beat a 5900X by 15%. Are the CCD latencies so bad as to completely negate the extra 32MB of L3 on the 5900X vs the 5800X?
 

coercitiv

Diamond Member
Jan 24, 2014
WRT the 5900X, since it has twice the L3 cache of the 5800X, is there a reason why there is no appreciable difference in gaming performance?
As you guessed, it's the dual CCD. Here's the picture that speaks more than words:

[attached image: core-to-core latency chart showing intra-CCX vs inter-CCD latency]

For gaming you need a big cache that is also very fast, and 2 x 32MB separated by this much latency should not be counted as a 100% increase over 32MB, but more like a 50% increase that also comes with an inherent performance penalty in gaming. It turns out the gains from the extra cache are mostly offset by the loss of performance due to the dual-CCD layout.

Meanwhile the 5800X3D gets 200% more cache that is (almost) as fast. It's not even a contest in terms of efficiency for latency-sensitive tasks that love a fat L3.
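
If you want to put a rough number on that inter-CCD hop yourself, a ping-pong microbenchmark between two pinned cores will show it. This is just a minimal Linux sketch, and the core IDs are an assumption: whether 0 and 11 actually land on different CCDs depends on your system's topology (check with lstopo first).

```c
// pingpong.c - rough core-to-core latency probe
// build: gcc -O2 -pthread pingpong.c -o pingpong
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define ITERS 1000000
static _Atomic int flag = 0;

/* Assumed core IDs: 0 and 11 *may* sit on different CCDs of a 5900X,
   but the numbering is system-specific. */
static const int core_a = 0, core_b = 11;

static void pin(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

static void *pong(void *arg) {
    pin(core_b);
    for (int i = 0; i < ITERS; i++) {
        while (atomic_load_explicit(&flag, memory_order_acquire) != 1)
            ;                                   /* wait for ping */
        atomic_store_explicit(&flag, 0, memory_order_release);
    }
    return NULL;
}

int main(void) {
    pthread_t t;
    pin(core_a);
    pthread_create(&t, NULL, pong, NULL);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++) {
        atomic_store_explicit(&flag, 1, memory_order_release);
        while (atomic_load_explicit(&flag, memory_order_acquire) != 0)
            ;                                   /* wait for pong */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(t, NULL);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    /* one iteration = one full round trip; a single hop is roughly half */
    printf("avg round trip: %.1f ns\n", ns / ITERS);
    return 0;
}
```

Pin both threads inside the same CCX instead and the round trip collapses to local cache-coherency latency, which is exactly the gap in that chart.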
 

kognak

Junior Member
May 2, 2021
As you guessed, it's the dual CCD. Here's the picture that speaks more than words:

[attached image: core-to-core latency chart showing intra-CCX vs inter-CCD latency]

For gaming you need a big cache that is also very fast, and 2 x 32MB separated by this much latency should not be counted as a 100% increase over 32MB, but more like a 50% increase that also comes with an inherent performance penalty in gaming. It turns out the gains from the extra cache are mostly offset by the loss of performance due to the dual-CCD layout.

Meanwhile the 5800X3D gets 200% more cache that is (almost) as fast. It's not even a contest in terms of efficiency for latency-sensitive tasks that love a fat L3.
Pictures are nice, but threads in reality don't behave like that. Because Zen's L3 is a victim cache, it only contains data pushed out of the L2, so the data in a given L3 comes from threads running on that particular CCD. That also means it's unlikely that the L3 on one CCD contains anything needed on the other CCD. Schedulers also try to keep threads from migrating between CCDs.

Latency between CCDs is a pretty insignificant factor overall; it's about the same as going to system memory, and you have to create specific circumstances to bring it up. What's significant is the amount of L3 a single thread can access within its core complex. Both the 5800X and 5900X have the same amount of L3 from that perspective: if the data isn't in the local 32MB, the next step is over a higher-latency path, whether that's the other CCD or system memory, so it doesn't really matter which.
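
To make the scheduler point concrete: you can enforce that confinement yourself with an affinity mask. A quick sketch, assuming logical CPUs 0-5 plus their SMT siblings 12-17 make up CCD0 on a 5900X (the mapping varies per system, so verify it with lstopo before trusting the numbers):

```c
// pin_ccd0.c - confine the current process to one CCD (Linux)
// build: gcc -O2 pin_ccd0.c -o pin_ccd0
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    /* Assumption: logical CPUs 0-5 and their SMT siblings 12-17 form
       CCD0 on a 5900X; the real mapping is system-specific. */
    for (int c = 0; c <= 5; c++)
        CPU_SET(c, &set);
    for (int c = 12; c <= 17; c++)
        CPU_SET(c, &set);
    /* pid 0 = this process; children inherit the mask, so no thread
       can ever migrate to the other CCD's cores. */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pid %d confined to CCD0's 32MB L3 domain\n", (int)getpid());
    /* exec the workload here, e.g. execlp("some_game", "some_game", (char *)NULL); */
    return 0;
}
```

(taskset -c 0-5,12-17 does the same thing from the shell.)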
 

JoeRambo

Golden Member
Jun 13, 2013
The data in a given L3 comes from threads running on that particular CCD. That also means it's unlikely that the L3 on one CCD contains anything needed on the other CCD. Schedulers also try to keep threads from migrating between CCDs.


That's unfortunately an oversimplified and incomplete picture of how things interact. A game can have zero inter-thread communication and get scheduled perfectly on one CCD only, and still suffer greatly from inter-CCD latency. How? The DirectX runtime and GPU drivers are also multithreaded and might get scheduled on the other CCD (and obviously the chance of this increases the more loaded the CCD is, or the more game-thread load there is).
So now data has to traverse from being generated on one CCD by the game to being "consumed" by the DX/GPU driver threads on the other CCD, and the speed of this transfer is RAM speed (bandwidth and latency). Obviously Zen 3 did a lot to reduce this impact with the 6-core CCDs in the 5900X and the huge 32MB pool of L3, and the DX runtime/GPU drivers probably have heuristics to try to move closer to the threads that are generating data (like all optimizations, they are not perfect and might backfire in various ways).
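
Here's a toy version of that producer/consumer handoff, with the core IDs again being pure assumptions about CCD placement. Run the consumer on core 11 (presumed other CCD) vs core 1 (same CCD) and compare the consume time:

```c
// handoff.c - one thread generates a buffer, another consumes it
// build: gcc -O2 -pthread handoff.c -o handoff
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define N (1 << 20)                /* 4 MiB of ints, fits in one 32MB L3 */
static int buf[N];
static _Atomic int ready = 0;

static void pin(int core) {
    cpu_set_t s;
    CPU_ZERO(&s);
    CPU_SET(core, &s);
    pthread_setaffinity_np(pthread_self(), sizeof(s), &s);
}

/* the "game" side: generates a frame's worth of data */
static void *producer(void *arg) {
    pin(0);                        /* assumption: core 0 sits on CCD0 */
    for (int i = 0; i < N; i++)
        buf[i] = i;
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* the "driver" side: consumes it on whichever core we chose */
static void *consumer(void *arg) {
    pin(*(const int *)arg);
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                          /* spin until the data is published */
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int64_t sum = 0;
    for (int i = 0; i < N; i++)
        sum += buf[i];             /* dirty lines must migrate to this core */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double ms = (t1.tv_sec - t0.tv_sec) * 1e3
              + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("consumed %lld in %.2f ms\n", (long long)sum, ms);
    return NULL;
}

int main(void) {
    int consumer_core = 11;        /* assumption: core 11 sits on CCD1 */
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, &consumer_core);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```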
 

kognak

Junior Member
May 2, 2021
That's unfortunately an oversimplified and incomplete picture of how things interact. A game can have zero inter-thread communication and get scheduled perfectly on one CCD only, and still suffer greatly from inter-CCD latency. How? The DirectX runtime and GPU drivers are also multithreaded and might get scheduled on the other CCD (and obviously the chance of this increases the more loaded the CCD is, or the more game-thread load there is).
So now data has to traverse from being generated on one CCD by the game to being "consumed" by the DX/GPU driver threads on the other CCD, and the speed of this transfer is RAM speed (bandwidth and latency). Obviously Zen 3 did a lot to reduce this impact with the 6-core CCDs in the 5900X and the huge 32MB pool of L3, and the DX runtime/GPU drivers probably have heuristics to try to move closer to the threads that are generating data (like all optimizations, they are not perfect and might backfire in various ways).
Not denying any of this; however, it doesn't seem to be relevant when benchmarking games. Multiple CCXs don't really hurt Zen 2 either: in the best cases the single-CCX variant is a couple of percent faster than the rest, which is meaningless. Even the 5700G is barely faster than the "lowly" 3700X despite having a higher clock speed, the advantages of the Zen 3 architecture, and all cores in one CCX. The 5700G should be closer to the 5800X than to the 3700X, but it's not; with 32MB of L3 it certainly would be. Considering how much attention the multi-CCX "issue" gets, one could think it actually made a difference somewhere.
 


nicalandia

Diamond Member
Jan 10, 2019
I hope they release the original prototype they showed as a real product at some point, even after Zen 4 launches. I want that more than the 5800X3D. It has 192MB of total cache.

If you are looking for total L3$, regardless of whether it's 3D-stacked or not, then Milan/Milan-X is the way to go. Look at these models; they all have more L3$ than the 5900X3D prototype.

[attached image: table of Milan/Milan-X models with their L3$ sizes]

You could build a very nice mini-ITX-sized workstation with that ASRock board, and it's compatible with 3D V-Cache too.

[attached images: ASRock mini-ITX board]

 

JoeRambo

Golden Member
Jun 13, 2013
Multiple CCXs don't really hurt Zen 2 either: in the best cases the single-CCX variant is a couple of percent faster than the rest, which is meaningless.

Actually, we had a very good test point in the 3100 vs 3300X in the Zen 2 generation. Very extreme, sure, as there are only 2 cores per CCX in the 3100, but still very good.


And there are plenty of other results around the web, not exactly in the "couple percent" range. And the gap widens in 99th-percentile FPS => exactly where the predicted effects would be.

Now, the 5900X obviously has 6 cores to play with instead of just two, but there certainly are, and will be, scenarios in the future where its workloads are impacted by the latencies discussed.
 

kognak

Junior Member
May 2, 2021
Actually, we had a very good test point in the 3100 vs 3300X in the Zen 2 generation. Very extreme, sure, as there are only 2 cores per CCX in the 3100, but still very good.

10% higher clock speed. L3 size is doubled, 8MB to 16MB. Take those away, and how much actual difference is left? Not much, considering 2+2 is the worst possible configuration. The 3600 in the same test pretty much matched the 3300X, and those have the same 16MB L3 and similar clock speeds.
 

JoeRambo

Golden Member
Jun 13, 2013
10% higher clock speed. L3 size is doubled, 8MB to 16MB. Take those away, and how much actual difference is left? Not much, considering 2+2 is the worst possible configuration. The 3600 in the same test pretty much matched the 3300X, and those have the same 16MB L3 and similar clock speeds.

The graphs have OC variants and other CPUs with 8MB of L3, yet the gap is still not the couple of percent you claimed.