Info 64MB V-Cache on 5XXX Zen3 Average +15% in Games

Kedas

Senior member

Jun 1, 2021

#1

Well, now we know how they will bridge the long wait until Zen4 on AM5 in Q4 2022.
Production of V-Cache starts at the end of this year, which is too early for Zen4, so this is certainly coming to AM4.
Lisa said +15% is "like an entire architectural generation".

Last edited: Jun 1, 2021

Reactions: Tlh97 and Gideon

Hitman928

Diamond Member

Mar 22, 2022

#2,301

MadRat said:
Apparently they will fill the void with silicon spacers.

I'm surprised that silicon is the solution.

Yes, they use dummy substrate to match the height of the stacked die. If this is what you were asking before, then I wasn't understanding your question, I thought you were asking about the gap between the top of the stacked dies and the heatspreader.

jamescox

Senior member

Mar 22, 2022

#2,302

Hitman928 said:
The wafers are typically around 700 µm thick when completed. The layers that are used in stacking will have to be thinned to be a part of the process, so if you need to adjust heights to match heights on a bridged IC, it's not a problem. If you get really complex with multiple layers, then you'll obviously need to make sure you plan accordingly, but you shouldn't have a situation where you have different stack heights unless it's on purpose.

I don’t know if the SoIC will actually be used for bridging. The bridged parts will use Elevated FanOut Bridge (possibly TSMC’s InFO-L). This is nowhere near as dense as SoIC with hybrid bonding, if I understand correctly. EFB will likely be used for AMD GPUs. I am still wondering if it will be used for Bergamo, but without SoIC, it will not be L3 cache. It would need to be L4.

maddie

Diamond Member

Mar 22, 2022

#2,303

MadRat said:
Apparently they will fill the void with silicon spacers.

I'm surprised that silicon is the solution.

That's what you meant?

Hitman928

Diamond Member

Mar 22, 2022

#2,304

jamescox said:
I don’t know if the SoIC will actually be used for bridging. The bridged parts will use Elevated FanOut Bridge (possibly TSMC’s InFO-L). This is nowhere near as dense as SoIC with hybrid bonding, if I understand correctly. EFB will likely be used for AMD GPUs. I am still wondering if it will be used for Bergamo, but without SoIC, it will not be L3 cache. It would need to be L4.

Yeah, I don't know either. We don't even have confirmation from AMD yet that they are doing bridges. My remark was very much in the hypothetical realm and really just addressing the matching heights question.

jamescox

Senior member

Mar 22, 2022

#2,305

Hitman928 said:
The stack goes CPU > V-cache > Solder TIM > Heatspreader. The same as a non-Vcache CPU today.

The upper die does not have direct contact with the heatspreader. The V-cache does sit above the normal L3 cache and will cause greater heat buildup in the base L3 cache region, but the L3 cache region should still be less heat density than other parts of the CPU when under load. The FPU region, for instance, will typically have much higher heat density due to both the transistor density and how often the transistors are actually switching.

It is the same height as a standard die since the CPU die is polished down very thin. Since it is the same height, heat still has the same amount of silicon to transfer through. The TSVs are copper though, so they would likely have higher thermal conductivity. I don’t think the thermal interface between the two dies is actually important. They are polished down to exceptional flatness. The main thing is that SRAM will produce some extra heat rather than just being a passive piece of silicon.

eek2121

Diamond Member

Mar 22, 2022

#2,306

Unsure if you guys saw this: Geekbench score: X570 Taichi - Geekbench Browser

1633/11250 - speeds were 4.53 GHz.

EDIT: Similar MC score to an overclocked 5 GHz 5800X. Interesting.

Reactions: Tlh97 and lightmanek

uzzi38

Platinum Member

Mar 23, 2022

#2,307

Geekbench is limited by memory throughput on MT so that makes sense.

Also why are we still entertaining the idea of V-Cache or anything similar on Bergamo ffs. The entire point of it is that it's a balance of price and per-core performance tailored to the hyperscaler market (which is inherently relatively low margin). Stacking memory on top is not an ideal trade-off to be making here, to say the least, PARTICULARLY not shared memory between all cores.

Reactions: Tlh97 and moinmoin

deasd

Senior member

Mar 23, 2022

#2,308

It's not a surprise, since V-Cache would be a game changer in heavy workloads while doing far less for ST loads. I guess when it comes to rendering in something like Blender, which uses almost all the threads your CPU has, V-Cache could bring much higher efficiency than CPUs without it (same arch, same core/thread count).

coercitiv

Diamond Member

Mar 23, 2022

#2,309

deasd said:
I guess when it comes to rendering in something like Blender, which uses almost all the threads your CPU has, V-Cache could bring much higher efficiency than CPUs without it (same arch, same core/thread count).

It's not about how many cores the workload uses, but rather about dataset size and affinity towards memory throughput/latency.
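To make the dataset-size point concrete, here is a rough sketch of the reasoning, not a performance model. It assumes the 5800X's 32 MiB of on-die L3 and the 64 MiB stacked die from the thread title; the function names and the all-or-nothing "fits or doesn't fit" rule are made up for illustration, and real workloads see partial hit-rate gains on either side of these thresholds.

```python
# Illustrative only: helper names and the binary "fits" rule are assumptions.
MIB = 1024 * 1024

L3_BASE = 32 * MIB           # on-die L3 of a 5800X
L3_VCACHE = (32 + 64) * MIB  # on-die L3 plus the 64 MiB stacked V-Cache die


def fits_in_l3(working_set_bytes: int, l3_bytes: int) -> bool:
    """Return True when the hot data set could stay resident in L3."""
    return working_set_bytes <= l3_bytes


def vcache_helps(working_set_bytes: int) -> bool:
    """The extra cache matters most when the data spills out of the
    base L3 but still fits once the stacked die is added."""
    return (not fits_in_l3(working_set_bytes, L3_BASE)
            and fits_in_l3(working_set_bytes, L3_VCACHE))


if __name__ == "__main__":
    for size_mib in (8, 48, 512):
        print(f"{size_mib} MiB working set: V-Cache helps = "
              f"{vcache_helps(size_mib * MIB)}")
```

Core count does not appear anywhere in this: an all-core render with a small hot data set lands in the first case, where the extra cache buys little.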

Reactions: Mopetar and Tlh97

igor_kavinski

Lifer

Mar 23, 2022

#2,310

Does Zen 3 have cache control instructions to prevent certain required data from being evicted again and again due to cache pressure?

My prediction for non-gaming workloads that V-cache should benefit: VMs, compression/decompression, anything using JITs (so JavaScript performance in browsers), console emulators, compilation, dotnet/Java runtime performance and, last but not least, possibly ZFS.

I'm like

whenever I see benchmarks catapulting i7-5775C ahead of 5950X or where the i7-5775C is really close. Scientists are going to love 5800X3D and I bet quite a few of them already love their i7-5775Cs. Makes me wish AMD had made 5950X3D instead. The ultimate swansong of AM4!

igor_kavinski

Lifer

Mar 23, 2022

#2,311

Full comparison here: Intel Core i7-5775C vs. AMD Ryzen 9 5950X Benchmarks - OpenBenchmarking.org

Timorous

Golden Member

Mar 23, 2022

#2,312

Looking at the AMD slides again, they showed a 40% gain over the 5900X in Watch Dogs: Legion, which is really impressive. Makes me think games with high AI counts like Legion, grand strategy, RTS and 4X games might see a benefit from the cache. It also makes me think that in a few games the 5800X3D might be faster than Zen 4.

It also makes me think that when the likes of HUB, GN, LTT, TPU etc test the CPU they will do the usual and compare it at 1080p in AAA games then call it a waste of sand. TPU less so, since they test more games and use 720p.

Shame nobody bothers to test games where 60 FPS is easy to hit and more than enough for the game but simulation rates suffer late game where a beefier CPU matters more. You get the odd Civ 6 turn time test and that is about it for non FPS based metrics.

Reactions: Tlh97, Zepp, DAPUNISHER and 1 other person

igor_kavinski

Lifer

Mar 23, 2022

#2,313

Intel Core i7-5775C vs. Apple M1 Benchmarks - OpenBenchmarking.org

Wow!

igor_kavinski

Lifer

Mar 23, 2022

#2,314

Timorous said:
It also makes me think that when the likes of HUB, GN, LTT, TPU etc test the CPU they will do the usual and compare it at 1080p in AAA games then call it a waste of sand.

I actually want it to be declared that early on so the scalpers stay away from it.

Reactions: Mopetar, Tlh97, Ranulf and 2 others

nicalandia

Diamond Member

Mar 23, 2022

#2,315

MadRat said:
Apparently they will fill the void with silicon spacers.

This has been done for quite a while now, and Silicon spacers are required for thermal/conductive coherency and support.

Last edited: Mar 23, 2022

Tuna-Fish

Golden Member

Mar 23, 2022

#2,316

uzzi38 said:
Geekbench is limited by memory throughput on MT so that makes sense.

Also why are we still entertaining the idea of V-Cache or anything similar on Bergamo ffs. The entire point of it is that it's a balance of price and per-core performance tailored to the hyperscaler market (which is inherently relatively low margin). Stacking memory on top is not an ideal trade-off to be making here, to say the least, PARTICULARLY not shared memory between all cores.

Without a large, local cache at the chiplet, the power used for memory traffic totally blows out your power budget. You cannot make a CPU for the hyperscaler market by just doubling cores, pulling off the L3 and calling it good. It wouldn't work: it would either be so starved for memory bandwidth that it would be weaker than the variant with fewer cores, or because of power limits you'd have to pull frequencies so low that it would, again, be weaker than the variant with fewer cores.

The rumor floating around is that Bergamo is what Zen looks like when you pull all of the L3 off the base die and stack it on top. So they can fit twice the cores utilizing the space freed up by the L3, and use a cheaper process (probably N6) for the cache. Cache per chip is a bit lower than a normal EPYC that also has L3 on the same die, cost per chip is similar, but you have twice the cores. The downside is that the cores under the cache have lower max frequency and power.

uzzi38

Platinum Member

Mar 23, 2022

#2,317

Tuna-Fish said:
Without a large, local cache at the chiplet, the power used for memory traffic totally blows out your power budget. You cannot make a CPU for the hyperscaler market by just doubling cores, pulling off the L3 and calling it good. It wouldn't work: it would either be so starved for memory bandwidth that it would be weaker than the variant with fewer cores, or because of power limits you'd have to pull frequencies so low that it would, again, be weaker than the variant with fewer cores.

The rumor floating around is that Bergamo is what Zen looks like when you pull all of the L3 off the base die and stack it on top. So they can fit twice the cores utilizing the space freed up by the L3, and use a cheaper process (probably N6) for the cache. Cache per chip is a bit lower than a normal EPYC that also has L3 on the same die, cost per chip is similar, but you have twice the cores. The downside is that the cores under the cache have lower max frequency and power.

That wouldn't be cheaper at all, and what's more I don't know why you're fixated on removing the L3 cache from the base die. There's no point to that. You're trying to fit the exact same size cores in the exact same amount of space without realising that

1. The cores themselves might see some changes to better suit them towards hyperscaler workloads (or alternatively, cut corners that would hurt them in the general server market but not in the hyperscaler market to anywhere near the same degree).

2. Doubling the number of cores per die means you don't need to stick to the exact same die size - there's extra space on package. This would be a detriment if N5 was a poorly yielding node, but it's not.

nicalandia

Diamond Member

Mar 23, 2022

#2,318

Tuna-Fish said:
The rumor floating around is that Bergamo is what Zen looks like when you pull all of the L3 off the base die and stack it on top. So they can fit twice the cores utilizing the space freed up by the L3, and use a cheaper process (probably N6) for the cache. Cache per chip is a bit lower than a normal EPYC that also has L3 on the same die, cost per chip is similar, but you have twice the cores. The downside is that the cores under the cache have lower max frequency and power.

How else do you think they are going to fit 16 cores on a 72.225 mm² chiplet? I've done the math, and even if they cut the L3 to 1/4 (the area of 8 MiB, but if it is as dense as the L3 chiplet that would make it 16 MiB), it would be larger than that, due to the AVX-512 registers taking twice as much area as the current 256-bit ones. Also, the L2 doubled to 1 MiB.

Let me pull my mock up

Okay, here it is. This is based on Locuza's die annotations of Zen3. We know Zen4 will be a die shrink (from 7nm to 5nm) with AVX-512 and double the L2$. Going by what Apple was able to accomplish, the die area reduction (logic and SRAM) from TSMC 7nm to TSMC 5nm is only about 20%.

The die on top is my mock-up of the Zen4 core with double L2, double FP registers, and a potential 20% die shrink (logic and SRAM).
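For anyone who wants to redo the area arithmetic, here is a minimal sketch using only the two numbers from this post: the leaked 72.225 mm² chiplet size and the roughly 20% shrink from 7nm to 5nm. The function names are made up, and the per-core figure is a whole-chiplet average, so it includes each core's share of L2, L3 and fabric rather than the core alone.

```python
# Back-of-envelope sketch; function names are hypothetical.
CHIPLET_AREA_MM2 = 72.225  # leaked Genoa CCD size quoted in this post
SHRINK_7NM_TO_5NM = 0.20   # ~20% area reduction assumed in this post


def area_budget_per_core(chiplet_mm2: float, cores: int) -> float:
    """Average area available per core, including its share of
    cache and fabric, if the whole chiplet is split evenly."""
    return chiplet_mm2 / cores


def equivalent_7nm_area(area_5nm_mm2: float,
                        shrink: float = SHRINK_7NM_TO_5NM) -> float:
    """Area the same block would have needed before the shrink."""
    return area_5nm_mm2 / (1.0 - shrink)


if __name__ == "__main__":
    for cores in (8, 16):
        budget = area_budget_per_core(CHIPLET_AREA_MM2, cores)
        print(f"{cores} cores: {budget:.2f} mm2 per core at 5nm, "
              f"about {equivalent_7nm_area(budget):.2f} mm2 in 7nm terms")
```

Comparing the 16-core line against an annotated Zen3 core plus its cache slice is the comparison the mock-up is making.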

Last edited: Mar 23, 2022

Reactions: Tlh97, Elfear and lightmanek

StefanR5R

Elite Member

Mar 23, 2022

#2,319

nicalandia said:
How else do you think they are going to fit 16 cores on a 72.225 mm² chiplet?

Oh, the core count per CCD and the CCD size were already announced?

nicalandia

Diamond Member

Mar 23, 2022

#2,320

StefanR5R said:
Oh, the core count per CCD and the CCD size were already announced?

The core counts of Genoa and Bergamo have been known for a while, but the chiplet size was leaked by Gigabyte

wccftech.com

AMD EPYC Genoa & SP5 Platform Leaked - 5nm Zen 4 CCD Measures Roughly 72mm, 12 CCD Package at 5428mm2, Up To 700W Peak Socket Power

Aside from the AM5 platform, the leaked Gigabyte documents have also detailed AMD's EPYC Genoa Zen 4 CPUs & SP5 server platform.

Genoa will have 12 chiplets, which works out to a lane per chiplet (Milan had 8 chiplets and 8 channel PCIe). What we don't know is how Bergamo will have 128 cores: will it be 8 chiplets with 16 cores each?

How is AMD going to fit so many cores in such small chiplets? They said they reworked/tweaked the cache, and they will most likely be using TSMC's super-high-density SRAM libraries, so even with 1/4 of the die area it will amount to half of the cache (16 MiB), or perhaps even lower? 8 MiB? We know that Zen2 APUs had that amount of cache and did pretty well in performance.

Last edited: Mar 23, 2022

uzzi38

Platinum Member

Mar 23, 2022

#2,321

nicalandia said:
The core counts of Genoa and Bergamo have been known for a while, but the chiplet size was leaked by Gigabyte

AMD EPYC Genoa & SP5 Platform Leaked - 5nm Zen 4 CCD Measures Roughly 72mm, 12 CCD Package at 5428mm2, Up To 700W Peak Socket Power

Aside from the AM5 platform, the leaked Gigabyte documents have also detailed AMD's EPYC Genoa Zen 4 CPUs & SP5 server platform.

wccftech.com

Genoa will have 12 chiplets, which works out to a lane per chiplet (Milan had 8 chiplets and 8 channel PCIe). What we don't know is how Bergamo will have 128 cores: will it be 8 chiplets with 16 cores each?

How is AMD going to fit so many cores in such small chiplets? They said they reworked/tweaked the cache, and they will most likely be using TSMC's super-high-density SRAM libraries, so even with 1/4 of the die area it will amount to half of the cache (16 MiB), or perhaps even lower? 8 MiB? We know that Zen2 APUs had that amount of cache and did pretty well in performance.

Mind you that chiplet size is specific to Genoa. Bergamo fits more cores per chiplet, but nowhere did AMD say each chiplet was the same size as Genoa.

Reactions: Tlh97, lightmanek and lobz

nicalandia

Diamond Member

Mar 23, 2022

#2,322

uzzi38 said:
Mind you that chiplet size is specific to Genoa. Bergamo fits more cores per chiplet, but nowhere did AMD say each chiplet was the same size as Genoa.

But we know they will be using the same Socket and Die Package and they are subject to the same Genoa limitations on Lane count and size.

uzzi38

Platinum Member

Mar 23, 2022

#2,323

nicalandia said:
But we know they will be using the same Socket and Die Package and they are subject to the same Genoa limitations on Lane count and size.

I don't see why that would mean the chiplets are smaller in size? You only have to fit 8 of them on package to get 128 cores if each sports 16 cores, not 12 like Genoa.
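A quick sketch of that arithmetic, with labelled assumptions: the 8 x 16 Bergamo layout is hypothetical, as is the idea that Bergamo could spend the same total CCD silicon area as Genoa's 12 chiplets at the leaked 72.225 mm² each. The function names are made up for illustration.

```python
# Rough arithmetic only; the Bergamo layout and the equal-total-area
# assumption are hypothetical, not anything AMD has stated.
GENOA_CHIPLETS = 12
GENOA_CORES_PER_CHIPLET = 8     # 96 cores across 12 chiplets
GENOA_CHIPLET_AREA_MM2 = 72.225  # leaked figure discussed in this thread


def total_cores(chiplets: int, cores_per_chiplet: int) -> int:
    """Core count for a package with identical chiplets."""
    return chiplets * cores_per_chiplet


def max_chiplet_area(total_area_mm2: float, chiplets: int) -> float:
    """Largest equal-sized chiplet that fits a fixed total silicon area."""
    return total_area_mm2 / chiplets


genoa = total_cores(GENOA_CHIPLETS, GENOA_CORES_PER_CHIPLET)
bergamo = total_cores(chiplets=8, cores_per_chiplet=16)  # assumed layout

genoa_total_area = GENOA_CHIPLETS * GENOA_CHIPLET_AREA_MM2
bergamo_chiplet_area = max_chiplet_area(genoa_total_area, chiplets=8)

if __name__ == "__main__":
    print(f"Genoa: {genoa} cores, Bergamo (assumed 8 x 16): {bergamo} cores")
    print(f"Same total CCD area over 8 chiplets: "
          f"{bergamo_chiplet_area:.1f} mm2 each")
```

Under those assumptions each Bergamo chiplet could be about half again as large as a Genoa one, so the Genoa chiplet size by itself does not pin down how big a 16-core chiplet can be.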

lobz

Platinum Member

Mar 23, 2022

#2,324

nicalandia said:
But we know they will be using the same Socket and Die Package and they are subject to the same Genoa limitations on Lane count and size.

I mean... OK? How does that relate to the exact area of CCDs? Or do you happen to know the exact parameters upon which AMD arranges any given yet-to-be-released package? If so, by all means please, do share!!!

nicalandia

Diamond Member

Mar 23, 2022

#2,325

uzzi38 said:
I don't see why that would mean the chiplets are smaller in size? You only have to fit 8 of them on package to get 128 cores if each sports 16 cores, not 12 like Genoa.

Without a chiplet resize, I just don't see where they are going to fit 128 regular-sized Zen4 cores... Hence Zen4C, which is said to be dense (as in SRAM density, not logic, as TSMC is not there yet, especially on 5nm).
