
Info 64MB V-Cache on 59XX Zen3 Average +15% in Games


Hitman928

Diamond Member
Apr 15, 2012
3,520
3,786
136
Regarding the building analogy, do we know how the data travel to this additional "floor"? Is it accessible at single place, like a floor would be via stairway/elevator shaft, or its connected on many points all over the surface of the chip?
There will be many TSVs for the routing between dies.
 

maddie

Diamond Member
Jul 18, 2010
3,448
2,418
136

jamescox

Senior member
Nov 11, 2009
220
392
136
This article indicates that the spacing between TSVs is 9 microns (9000 nm):


That is from last year though; these look like they may be closer, but it is hard to tell. The article talks about it going down to 0.9 microns, which is still 900 nm. It is a lot closer to being on die compared to micro-solder bump solutions (50 micron pitch), but it still is not the same as on die. It will be interesting to find out how this is logically arranged. The access latency is still very low even on 96 MB.
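To put those pitch figures side by side, here is a rough sketch of how connection density scales with pitch. It assumes the connections sit on a plain square grid, which is my simplification and not something the article states; the three pitch values are the ones quoted above, and the function name is just for illustration.

```python
# Rough density comparison for vertical connections at different pitches.
# Assumption: connections are laid out on a simple square grid, so the
# density is (1 / pitch)^2. Real layouts may differ.

def connections_per_mm2(pitch_um: float) -> float:
    """Connections per square millimetre for a square grid at the given pitch (microns)."""
    per_mm = 1000.0 / pitch_um  # how many connections fit along one millimetre
    return per_mm ** 2

# Pitch values as quoted in the post.
PITCHES_UM = {
    "micro-solder bumps": 50.0,
    "TSVs (article, last year)": 9.0,
    "TSVs (article, future)": 0.9,
}

if __name__ == "__main__":
    for name, pitch in PITCHES_UM.items():
        density = connections_per_mm2(pitch)
        print(f"{name:26s} {pitch:5.1f} um pitch -> {density:12,.0f} per mm^2")
```

Because density goes with the inverse square of the pitch, going from 50 microns to 9 microns is about a 30x jump in connections per unit area under this assumption, and each further 10x reduction in pitch is another 100x.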
 
  • Like
Reactions: Elfear

DrMrLordX

Lifer
Apr 27, 2000
17,268
6,267
136
Anyone else getting a Pentium-Pro vibe?
Not entirely. I know what you're getting at - PPros had those on-chip 256K L2s in an era when, up to that point, L2 had been soldered onto the motherboard and ran at the bus speed rather than the CPU speed. Pentium Pro represented lots of other changes to the x86 world, however, that I don't see happening with v-cache-equipped Zen3. And honestly I don't expect the massive L3 of these v-cache CPUs to really outperform existing L3 in terms of bandwidth or latency. Socket 5 and Socket 7 L2 ran at 66 MHz in the days of Pentium Pro (100 MHz FSB Super 7 didn't come until later), whereas Pentium Pro L2 ran as high as 200 MHz, or higher if you were overclocking them (which some people did). L2 latency was hugely improved by PPro. I don't have numbers in front of me since that hardware is ancient by modern standards, but still. That on-board L2 of Socket 5/Socket 7 was slowwwww.
 
  • Like
Reactions: Tlh97 and krumme

Gideon

Golden Member
Nov 27, 2007
1,353
2,588
136

cytg111

Lifer
Mar 17, 2008
16,035
5,959
136
Not entirely. I know what you're getting at - PPros had those on-chip 256K L2s in an era when, up to that point, L2 had been soldered onto the motherboard and ran at the bus speed rather than the CPU speed. Pentium Pro represented lots of other changes to the x86 world, however, that I don't see happening with v-cache-equipped Zen3. And honestly I don't expect the massive L3 of these v-cache CPUs to really outperform existing L3 in terms of bandwidth or latency. Socket 5 and Socket 7 L2 ran at 66 MHz in the days of Pentium Pro (100 MHz FSB Super 7 didn't come until later), whereas Pentium Pro L2 ran as high as 200 MHz, or higher if you were overclocking them (which some people did). L2 latency was hugely improved by PPro. I don't have numbers in front of me since that hardware is ancient by modern standards, but still. That on-board L2 of Socket 5/Socket 7 was slowwwww.
Yea ok :). I was merely hovering around "same CPU core and different cache configs" .. and, IIRC, PPro had a beefed-up FPU on top of that as well.
 
  • Like
Reactions: Tlh97 and Gideon

DrMrLordX

Lifer
Apr 27, 2000
17,268
6,267
136
It's true that latency will, if anything, be slightly worse, but AMD promised 2 TB/s bandwidth. That has to be a considerable improvement (source):
Okay, fair point. Not sure how that additional bandwidth will affect things, but it should be interesting, especially if the branch prediction is good enough.

Yea ok :). I was merely hovering around "same CPU core and different cache configs" .. and, IIRC, PPro had a beefed-up FPU on top of that as well.
It did, it did. It was pretty revolutionary for its time. In some ways, the Klamath PIIs were a step backwards.
 

DrMrLordX

Lifer
Apr 27, 2000
17,268
6,267
136
Until it became L3 thanks to K6-III.
Since we are going down history lane, you have to remember that the Super 7 boards were running a 100 MHz FSB minimum, which was 50% faster than the original Socket 5/Socket 7 boards with on-board L2 cache running on a 66 MHz FSB. Latency and bandwidth both improved considerably. You'll also have to remember that older Socket 7 boards had limitations on cacheable area (which were not well-documented at the time, but did exist). Those had been engineered out of Super 7 by the time of K6-III.

Just adding L3 to AMD's Super 7 lineup was a big deal in and of itself, but . . . can you imagine what v-cache would be like, if AMD halved the latency AND increased the size?
 

lightmanek

Senior member
Feb 19, 2017
289
528
136
Had an Epox MVP3-G5 with 2MB(!) L2 for work. Coupled with a K6-3 450MHz that thing was a complete beast.

It had a massive, for the time, 192MB RAM.
Yep, I too had 2MB of on-board cache, which acted as L3 with my K6-III 400MHz.
Funny that your total RAM at the time will soon be equal to the L3 cache on a consumer desktop CPU :D

Pretty much the first tests I will run on that thing will be Quake 2 and Quake 3 timedemos :D
 

IntelUser2000

Elite Member
Oct 14, 2003
7,497
2,279
136
It's true that latency will, if anything, be slightly worse, but AMD promised 2 TB/s bandwidth. That has to be a considerable improvement (source):
Bandwidth on current Zen 3's L3 cache is no worse when taking into account capacity.

The earlier versions of AIDA64 had problems measuring it properly. With proper testing Zen 3's L3 will measure 1TB/s.
 

Insert_Nickname

Diamond Member
May 6, 2012
4,067
652
126
Funny that your total RAM at the time will soon be equal to the L3 cache on a consumer desktop CPU :D
Had the same feeling back when I got my Ryzen 3600. Back in 1999 32MB RAM was pretty common for mid-range systems. 20 years later, and that's the CPU's L3 cache. Now that's progress. :D

Got 32GB RAM to go along with it. The symmetry is beautiful, isn't it?
 

zir_blazer

Golden Member
Jun 6, 2013
1,022
253
136
With L3 cache sizes snowballing, I'm actually rather disappointed that AMD removed CAR (Cache-as-RAM) support, as noted in the Coreboot documentation about Zen Picasso. CAR was a rather minor and mostly unknown feature used pretty much exclusively by firmware during hardware initialization, which could set up the cache in CAR mode so that there was some usable memory other than the processor's GPRs (General Purpose Registers) before the DRAM controller and the system RAM behind it were fully operational. The L3 cache isn't a lot, yet I always found interesting the idea of someone getting MS-DOS working without any memory modules installed. With snowballing cache sizes, it should be possible to run a small video framebuffer for APUs, too. I always thought it could be very useful to run a system like that: the firmware could run some diagnostic tools from a self-contained processor without requiring memory to be installed, and you could pretty much consider it a fully operational computer as long as you didn't go beyond the limited cache memory boundaries.
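As a back-of-the-envelope check on the framebuffer idea, here is a small sketch that compares the size of an uncompressed framebuffer against an L3 budget. The display modes are illustrative choices of mine, the 32 MB and 96 MB budgets are simply the L3 sizes mentioned elsewhere in this thread, and nothing here says anything about whether display hardware could actually scan out of cache.

```python
# Size of an uncompressed framebuffer versus a cache budget.
# Assumptions: a single buffer, no padding or alignment, and illustrative
# display modes. This is only arithmetic, not a model of CAR itself.

MIB = 1024 * 1024

def framebuffer_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes needed for one uncompressed frame."""
    return width * height * bits_per_pixel // 8

# (label, width, height, bits per pixel); hypothetical example modes.
MODES = [
    ("640x480 @ 8 bpp", 640, 480, 8),
    ("1024x768 @ 32 bpp", 1024, 768, 32),
    ("1920x1080 @ 32 bpp", 1920, 1080, 32),
]

# L3 sizes mentioned in this thread, in MiB.
CACHE_BUDGETS_MIB = (32, 96)

if __name__ == "__main__":
    for label, w, h, bpp in MODES:
        size = framebuffer_bytes(w, h, bpp)
        shares = ", ".join(
            f"{size / (budget * MIB):.1%} of {budget} MiB" for budget in CACHE_BUDGETS_MIB
        )
        print(f"{label:20s} {size / MIB:6.2f} MiB ({shares})")
```

Even the 1080p case stays under 8 MiB, so on size alone a single framebuffer would leave most of a 32 MB L3 free for code and data.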
 

lobz

Golden Member
Feb 10, 2017
1,663
2,142
136
So if it's 2 x 36 mm^2 for two stacks of additional cache to get 12% more performance in gaming only, it seems very inefficient. The additional 72 mm^2 of silicon should be costly, and since Zen 3 is 81 mm^2, 88.9% more silicon should be giving 37% more performance by the square root rule of thumb.
And we should call that The Hougy Coefficient from now on.
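For anyone who wants to check the quoted arithmetic, here is a minimal sketch of the square-root rule of thumb as I read it: performance is assumed to scale with the square root of silicon area. The area figures are the ones from the quoted post, and the function name is made up for illustration.

```python
import math

# Figures as quoted in the post above.
BASE_DIE_MM2 = 81.0    # Zen 3 chiplet
CACHE_DIE_MM2 = 36.0   # one stack of additional cache
NUM_STACKS = 2

def sqrt_rule_speedup(base_area: float, extra_area: float) -> float:
    """Expected performance ratio, assuming performance scales with sqrt(area)."""
    return math.sqrt((base_area + extra_area) / base_area)

extra_mm2 = CACHE_DIE_MM2 * NUM_STACKS
area_increase = extra_mm2 / BASE_DIE_MM2
speedup = sqrt_rule_speedup(BASE_DIE_MM2, extra_mm2)

print(f"extra silicon: {extra_mm2:.0f} mm^2, i.e. {area_increase:.1%} more area")
print(f"square-root rule expectation: {speedup - 1:.1%} more performance")
```

That reproduces the roughly 89% more silicon and roughly 37% expected gain from the quote. Whether a rule of thumb meant for core logic applies to extra cache at all is a separate question.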
 
  • Haha
Reactions: Tlh97

moinmoin

Platinum Member
Jun 1, 2017
2,507
3,177
136
With L3 cache sizes snowballing, I'm actually rather disappointed that AMD removed CAR (Cache-as-RAM) support, as noted in the Coreboot documentation about Zen Picasso. CAR was a rather minor and mostly unknown feature used pretty much exclusively by firmware during hardware initialization, which could set up the cache in CAR mode so that there was some usable memory other than the processor's GPRs (General Purpose Registers) before the DRAM controller and the system RAM behind it were fully operational. The L3 cache isn't a lot, yet I always found interesting the idea of someone getting MS-DOS working without any memory modules installed. With snowballing cache sizes, it should be possible to run a small video framebuffer for APUs, too. I always thought it could be very useful to run a system like that: the firmware could run some diagnostic tools from a self-contained processor without requiring memory to be installed, and you could pretty much consider it a fully operational computer as long as you didn't go beyond the limited cache memory boundaries.
Yeah, I remember when that news broke originally. Maybe AMD can make it work again, but my assumption is that AMD removed public support for this because it is already using CAR privately for running AGESA/PSP and handling SCF with the extensive internal firmware that's likely running there.
 

soresu

Golden Member
Dec 19, 2014
1,650
841
136
As someone whose first computer was a maxed out Atari 800 with a glorious 48K of RAM, I have to take issue with the claim that an L3 of tens of megabytes "isn't a lot" :laughing:
Even if Zen4 only increases EPYC4/TR5 core count to 96 we will actually see 1GB+ of SRAM cache in an x86 socket for the first time?
 

Bigos

Member
Jun 2, 2019
27
56
61
Isn't it rumoured that a single Zen 3 chiplet can host up to 4 stacks of additional L3 cache, so 288MB in total? With that, you should get over 2GB of L3 cache with the highest-end Milan-X.
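The totals are easy to sanity-check. This sketch assumes 32 MB of base L3 per Zen 3 chiplet, 64 MB per V-Cache stack (per the thread title), and 8 chiplets in a top-end Milan-class socket; the 4-stack configuration is the rumour mentioned above, not a confirmed product, and the function names are only for illustration.

```python
# L3 totals per chiplet and per socket for a given number of cache stacks.
# Assumptions: 32 MB base L3 per chiplet, 64 MB per stack, 8 chiplets per socket.

BASE_L3_MB = 32
STACK_MB = 64
CHIPLETS_PER_SOCKET = 8

def l3_per_chiplet_mb(stacks: int) -> int:
    """Base L3 plus stacked cache for one chiplet, in MB."""
    return BASE_L3_MB + stacks * STACK_MB

def l3_per_socket_mb(stacks: int, chiplets: int = CHIPLETS_PER_SOCKET) -> int:
    """Total L3 across all chiplets in a socket, in MB."""
    return l3_per_chiplet_mb(stacks) * chiplets

if __name__ == "__main__":
    for stacks in (0, 1, 4):
        per_chiplet = l3_per_chiplet_mb(stacks)
        per_socket = l3_per_socket_mb(stacks)
        print(f"{stacks} stack(s): {per_chiplet} MB per chiplet, "
              f"{per_socket} MB ({per_socket / 1024:.2f} GB) per socket")
```

With 4 stacks that is 288 MB per chiplet and 2304 MB per socket, so "over 2GB" checks out under these assumptions. For the 96-core case raised earlier, assuming 12 chiplets of 8 cores with a single stack each, `l3_per_socket_mb(1, chiplets=12)` gives 1152 MB, which is where the 1GB+ figure would come from.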
 
