Info 64MB V-Cache on 5XXX Zen3 Average +15% in Games


Kedas

Senior member
Dec 6, 2018
355
339
136
Well, we now know how they will bridge the long wait to Zen 4 on AM5 in Q4 2022.
Production start for V-Cache is at the end of this year, which is too early for Zen 4, so this is certainly coming to AM4.
The +15% Lisa cited is "like an entire architectural generation."
 
Last edited:
  • Like
Reactions: Tlh97 and Gideon

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Not really. As long as you have your Windows power plan set up properly, Ryzen uses very little power at idle. Ryzen mobile chips actually have lower idle power than Intel chips.

The desktop CPUs (the non-APU ones) have bad idle/low-load power usage, even if we ignore the elephant in the room: lacking an iGPU, they can't power down the dGPU. APUs are fine, but this is a discussion of the 5800X3D in a DTR-class laptop, not anything "mobile".
 

ZGR

Platinum Member
Oct 26, 2012
2,052
656
136
Dang, some of those performance gains are ridiculous; it's almost as if the data for some benchmarks fits entirely within the L3. Let's hope Zen 4 provides 16+ cores with stacked cache! The next CPU I get will absolutely have another big cache. Even with the lack of overclocking, I'd definitely choose a stacked-cache variant again.
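To make that intuition concrete, here is a minimal working-set sweep in Python. It is only a sketch: Python's object overhead inflates the real footprint well past the nominal sizes, and the exact cliff depends on your own CPU's cache hierarchy, but the shape (flat latency until the working set outgrows the last-level cache, then a jump) is the effect being described.

```python
import random
import time

def make_cycle(n):
    # Single-cycle random permutation: every read is a dependent load
    # the hardware prefetcher can't predict.
    order = list(range(n))
    random.shuffle(order)
    perm = [0] * n
    for i in range(n):
        perm[order[i]] = order[(i + 1) % n]
    return perm

def ns_per_access(nominal_bytes, steps=1_000_000):
    # Chase pointers through a working set of roughly nominal_bytes.
    n = max(2, nominal_bytes // 8)  # pretend each slot is ~8 bytes
    perm = make_cycle(n)
    idx = 0
    t0 = time.perf_counter()
    for _ in range(steps):
        idx = perm[idx]
    return (time.perf_counter() - t0) / steps * 1e9

for mb in (1, 4, 16, 64):
    print(f"{mb:3d} MB nominal: {ns_per_access(mb * 2**20):6.1f} ns/access")
```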
 

KompuKare

Golden Member
Jul 28, 2009
1,016
934
136
A note for the future: don't use Anandtech charts when making a point about gaming performance with relation to the memory subsystem. Anandtech tests with stock memory speeds @ JEDEC timings, which means DDR4 2933 for the 10900K.
The problem, surely, is: has any other reviewer gone back to revisit desktop Broadwell recently?

It would be interesting to see some of the workloads the 5800X3D does so well in benched on Broadwell, but that's a big ask.

(If any 5775C Broadwell owners want to contribute to the Fallout 4 drawcall thread: https://forums.anandtech.com/thread...u-draw-call-performance-in-fallout-4.2548618/ that would be nice; still no 12900K results in there either.)
 

ZGR

Platinum Member
Oct 26, 2012
2,052
656
136
The problem, surely, is: has any other reviewer gone back to revisit desktop Broadwell recently?

It would be interesting to see some of the workloads the 5800X3D does so well in benched on Broadwell, but that's a big ask.

(If any 5775C Broadwell owners want to contribute to the Fallout 4 drawcall thread: https://forums.anandtech.com/thread...u-draw-call-performance-in-fallout-4.2548618/ that would be nice; still no 12900K results in there either.)

I can try to do the test, but I gotta back up my game and settings files.
edit: I did it
 
Last edited:

ondma

Platinum Member
Mar 18, 2018
2,721
1,281
136
@Markfw I think that is kind of the point. Intel can *attempt* to claim bragging rights on the desktop; however, Genoa is shipping. No joke. Word on the street is that more Zen 4 chips have been shipped than Sapphire Rapids, albeit to far fewer customers.

I feel like a certain percentage of people here are trying to compare Milan or Ryzen 5000 to Intel's Sapphire Rapids or Alder Lake, and they are forgetting the big picture: they are comparing very old AMD chips (ignoring Milan-X, which already beats Intel's current chips) vs. Intel's current lineup, and AMD is not only releasing a new chip soon, but also doing so on a node shrink, and they had two years to do it. The performance gains will be much larger from Zen 3 -> Zen 4 than from Skylake to Alder Lake. To be clear: Intel 14+eXX < TSMC N7 -> custom TSMC N5P. Intel has technically caught up with TSMC N7 DUV (we hope?). AMD's current mobile chips are on a better node, and its desktop/server chips are on an even better one. Milan-X alone is an absolute headache for Intel. Genoa?

Shoot, under the right circumstances (ahem), one could argue the gains from Zen 2 -> Zen 3...ON THE SAME NODE...were larger than Skylake to Alder Lake.
That seems a bit ... optimistic. You do know that the gains from Skylake to Alder Lake were on the order of 40% (or more) in productivity workloads, right?
 

Saylick

Diamond Member
Sep 10, 2012
3,172
6,410
136
It seems to make perfect sense to put these 3D chips in notebooks; the amount of performance they can achieve with little power usage and heat just looks perfect. That undervolted 5800X3D could be put in a small ITX build with a small cooler.
The only reason not to put these chips there is that AMD can't meet demand.
Yeah, given that iGPUs could use more bandwidth too, it really does seem like a theoretical APU with shared V-cache between the CPU and GPU that is partitioned dynamically to give you the best frame-rate (e.g. if CPU bound, allocate more to CPU, and vice-versa) would be pretty sweet.
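No such hardware or driver interface exists today, but a toy feedback loop shows how simple the dynamic partitioning idea could be in principle. Everything here (the way count, the function name, the frame-time inputs) is invented for illustration:

```python
TOTAL_WAYS = 16  # assumed number of allocatable cache "ways"

def rebalance(cpu_frame_ms, gpu_frame_ms, cpu_ways):
    # Shift one way per frame toward whichever side is the bottleneck,
    # always leaving each side at least one way.
    if cpu_frame_ms > gpu_frame_ms and cpu_ways < TOTAL_WAYS - 1:
        cpu_ways += 1   # CPU-bound: give the CPU cores more cache
    elif gpu_frame_ms > cpu_frame_ms and cpu_ways > 1:
        cpu_ways -= 1   # GPU-bound: give the iGPU more cache
    return cpu_ways, TOTAL_WAYS - cpu_ways

# A CPU-bound frame nudges capacity toward the CPU:
print(rebalance(cpu_frame_ms=9.5, gpu_frame_ms=7.0, cpu_ways=8))  # (9, 7)
```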
 

Mopetar

Diamond Member
Jan 31, 2011
7,848
6,015
136
The 5800X3D's typical power usage for gaming is below 65W, so it could work wonderfully in a DTR laptop with a lower locked TDP. Only application workloads would see lower performance.

Are there many gaming laptops that have a powerful enough GPU to create a large enough CPU bottleneck to matter?

I can see it being sold if only for marketing purposes, and the fact that the TDP makes it more reasonable to include, but the 3080 mobile is basically just a desktop 3070 Ti class card: same 48 SMs and roughly the same clock speeds.

Maybe it matters for a few titles (likely Flight Simulator or anything that AMD does embarrassingly well at) but I'm not sure if a 5800X3D performs much differently than a 5600X in those circumstances.
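A toy model makes the point concrete: per-frame time is roughly the slower of the CPU and GPU, so a cache-driven CPU uplift only shows up when the CPU side is the limiter. All the millisecond figures below are made up for illustration:

```python
def fps(cpu_ms, gpu_ms):
    # A frame can't finish faster than its slowest stage.
    return 1000.0 / max(cpu_ms, gpu_ms)

# GPU-bound laptop: a faster CPU changes nothing.
print(fps(cpu_ms=8.0, gpu_ms=12.0))  # 83.3 fps
print(fps(cpu_ms=6.0, gpu_ms=12.0))  # still 83.3 fps

# CPU-bound title: the cache uplift actually shows up.
print(fps(cpu_ms=8.0, gpu_ms=5.0))   # 125.0 fps
print(fps(cpu_ms=6.0, gpu_ms=5.0))   # ~166.7 fps
```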
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
The 5800X3D is crushing the 5950X in some Linux benchmarks.

Also, it would be interesting to test whether there is any performance boost for Windows games running on Linux + Proton/Wine (SteamOS) vs. Windows 10/11 with 3D V-Cache.
 

StinkyPinky

Diamond Member
Jul 6, 2002
6,766
784
126
So what was their reasoning for not making 12- and 16-core versions of this beast? The chiplet design and latency between the chiplets?

Some of the gains are insane; I had zero idea cache could make that much of a difference. It kinda makes me question why this technology wasn't looked into previously.
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
So what was their reasoning for not making 12- and 16-core versions of this beast? The chiplet design and latency between the chiplets?

The 5900X3D prototype showed the same 15% gaming performance uplift (using the same benchmarks) as the 5800X3D, and people would be up in arms because they would have to lower the clocks to 4.5 GHz or less (the prototype ran at 4 GHz).



Some of the gains are insane; I had zero idea cache could make that much of a difference. It kinda makes me question why this technology wasn't looked into previously.

It's the execution that AMD got right. Big L3/L4 victim caches have been implemented in the past, but as Chips and Cheese pointed out, they regress performance if the added latency is too high.
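A back-of-the-envelope average-memory-access-time (AMAT) model shows the trade-off. The latencies and hit rates below are illustrative assumptions, not figures from Chips and Cheese:

```python
def amat(l3_ns, l3_hit, victim_ns, victim_hit, dram_ns):
    # Every access pays the L3 lookup; an L3 miss pays the victim-cache
    # lookup, and a victim miss additionally pays the DRAM trip.
    miss = 1.0 - l3_hit
    return l3_ns + miss * (victim_ns + (1.0 - victim_hit) * dram_ns)

no_victim   = amat(10, 0.90, victim_ns=0,  victim_hit=0.0, dram_ns=80)  # 18.0 ns
fast_victim = amat(10, 0.90, victim_ns=15, victim_hit=0.5, dram_ns=80)  # 15.5 ns
slow_victim = amat(10, 0.90, victim_ns=60, victim_hit=0.5, dram_ns=80)  # 20.0 ns: net loss
print(no_victim, fast_victim, slow_victim)
```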

 
Last edited:
  • Like
Reactions: Tlh97

DrMrLordX

Lifer
Apr 27, 2000
21,643
10,860
136
@nicalandia

Leave it to Phoronix to find the edge cases where a 5800X3D works as a productivity processor. For people who want to profile performance on Milan-X systems at work, it'll be a nice home computer / prototyping workstation. I think those workloads won't be relevant for most people, though, which is why AMD is billing it as a gaming CPU.

It would be interesting to see some of the workloads the 5800X3D does so well in benched on Broadwell, but that's a big ask.

Bear in mind that Crystalwell's L4 really didn't perform much better than tuned DDR4. And by tuned I mean tuned by the standards of, like, 2016.

You do know that the gains from Skylake to Alder Lake were on the order of 40% (or more) in productivity workloads, right?

You're skipping Ice Lake/Rocket Lake and Tiger Lake. It took Intel three architectural updates to hit those IPC gains, and multiple process updates to finally bring Golden Cove to market.
 

Makaveli

Diamond Member
Feb 8, 2002
4,724
1,060
136
So what was their reasoning not to make 12 and 16 cores versions of this beast? The chiplet design and latency between the chiplets?

Some of the gains are insane, like I had zero idea cache could make that much of a difference. Kinda makes me question why this technology hasn't been looked into previously?

My personal opinion: there are smaller gains on the dual-CCD models, and you need to double the amount of cache, so it's also much more expensive to make.

And the last reason: what's the point six months before Zen 4?
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
My personal opinion: there are smaller gains on the dual-CCD models, and you need to double the amount of cache, so it's also much more expensive to make.
It's the price. They are selling a 16-core Milan-X for $4,000, and that CPU is just owning the 5950X in HPC applications thanks to the stacked 3D V-Cache, so a 5950X3D would undermine that Milan-X segment.

 
Last edited:

jamescox

Senior member
Nov 11, 2009
637
1,103
136
That's what AMD has said at least. I was specifically replying to someone talking about future 3D-stacked products. I guess they can stick to stacking more and more cache layers as long as there's cache to stack it on top of, but I'm not sure it will work out as well for logic like the cores or other hotspots on the chip. The patent you've posted about basically shows that it wouldn't be possible without sandwiching a cooler in between the two.

Something this exotic is probably only going to be created for the high-margin server market. Maybe we get a limited edition gamer CPU if the extra cache layers continue to add performance uplift. What I'm curious about is if the cooling is good enough for them to be able to put logic layers on either side of it.
A TEC is a semiconductor device, so integrating it into a stack may not be that exotic. Also, if it is in a stack, it can't really be that high-powered a device. That is one of the problems with stacking: getting power up the stack. Thermal transfer is proportional to the temperature difference, so inducing a larger temperature difference could help significantly in transferring heat out of the high-powered logic die. It would be nice if the high-powered device could just sit on top of the stack, but it is then difficult to transfer that much power up the stack.

If the TEC is actually in the stack, with TSVs routed through it, then it will almost certainly need to be a very thin, low-powered device. They likely are not going for sub-ambient or anything really exotic here, just increasing the temperature difference enough to increase the heat transfer rate. This might explain some of the weirdly thick renders, how thick the Zen 4 package appears to be, and the rumored (?) 170-watt, water-cooling-recommended type of part. Or it may just be an unused patent.

I think we will see a completely different type of stacking with Bergamo and GPUs. I suspect they will have Infinity Cache chips used as EFB bridge dies. This doesn't give the massive level of interconnect offered by SoIC, but it still allows HBM levels of interconnect, so it's good for an L4. It also allows the bridge chip / cache die to sit under the logic die.
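The "thermal transfer is proportional to the temperature difference" point is just steady-state Fourier conduction, Q = k * A * dT / d, and it's easy to put rough numbers on. The die area, silicon thickness, conductivity, and temperature deltas below are generic textbook-style assumptions, not AMD figures:

```python
def heat_watts(k_w_per_mk, area_mm2, thickness_um, delta_t_k):
    # Steady-state conduction through a slab: Q = k * A * dT / d
    area_m2 = area_mm2 * 1e-6
    d_m = thickness_um * 1e-6
    return k_w_per_mk * area_m2 * delta_t_k / d_m

# ~80 mm^2 die, 750 um of silicon (k ~ 130 W/m.K), die 30 K above the lid:
print(heat_watts(130, 80, 750, 30))  # ~416 W of conduction capacity
# Same stack with a TEC widening the difference to 45 K:
print(heat_watts(130, 80, 750, 45))  # ~624 W: 1.5x the heat moved
```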
 

Det0x

Golden Member
Sep 11, 2014
1,031
2,963
136
Is your RAM actually running at 1.6 V, or is that VDIMM number not accurate?
My sticks are 1.55 V stock (4000 MT/s CL14), and I have been running them at 1.6 V for pretty much half a year already.
I also have custom cooling on them.
The max load temp I'm getting @ 1.6 VDIMM is around ~34°C with these RZQ settings (amps).

Since I'm already writing in this thread, I can also share a little info about my 5800X3D, I guess. :)

The 5800X3D doesn't seem hard to cool at all compared to my 5950X at above 300 W PPT.
I'm getting a 67°C max load temp while running Cinebench R23 at all-stock settings -> 14,948 points.

It seems to run my CL13 1900/3800 5950X memory settings just fine -> AIDA memory latency of 55.6 ns in a somewhat bloated Windows install.
Also, with a -30 all-core CO, I'm getting a 58°C max load temp while running Cinebench R23 -> 15,181 points.

There is no way around the WHEA errors above 1900:3800 speeds, but at least I'm not getting reduced scaling/performance until above 2033:4066 in the hardest torture benchmarks for the Infinity Fabric, such as Linpack Xtreme and Y-cruncher 2.5b.
(Since I'm using Win11, the WHEA suppressor is not available.)

1900: Linpack Extended 8GB = 349.4 average GFLOPS; Y-cruncher 2.5b = 95.231 s
1966: Linpack Extended 8GB = 347.5 average GFLOPS; Y-cruncher 2.5b = 94.966 s
2000: Linpack Extended 8GB = 349.1 average GFLOPS; Y-cruncher 2.5b = 94.377 s
2033: Linpack Extended 8GB = 349.9 average GFLOPS; Y-cruncher 2.5b = 93.764 s
 
Last edited: