Skylake-EP Geekbench 4

jpiniero

Lifer
Oct 1, 2010
14,605
5,224
136
That result is old. There's a second result if you search on the mobo which is also old but slightly faster.
 

Drazick

Member
May 27, 2009
53
70
91
A large L2$ will be great for signal / image / data processing workloads.

Will it also be on the i7 78xx / 79xxx?
 
Reactions: Sweepr

itsmydamnation

Platinum Member
Feb 6, 2011
2,773
3,151
136
Perhaps L2 and L3 are exclusive. Even if they are not, remember that the L3 is shared across all cores on a die.
If they are "copying Zen" (rofl, flame-bait away) and going from a ring bus to meshes of X cores with Z type of interconnect between "core complexes", then yeah: if the L1 writes back to the L2 (not the L3) and the L3 becomes an eviction cache (just like Zen :p), then it makes sense.

If they keep their same ring bus cache design and keep the L1+L2 contents in the L3, then it doesn't make sense. Being a server CPU, I would assume all cores are in use, so the L3 would hold nothing but L1 and L2 data.

Or it's reading it wrong and either the L2 is half the size or the L3 is double the size.
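
To put rough numbers on the inclusive-versus-victim argument, here is a minimal sketch. The per-core sizes (1 MB of L2, 1.375 MB of L3) are illustrative assumptions, not figures confirmed in this thread, and the function names are made up for the example. It ignores the L1, assumes the L2 is fully occupied, and treats the L3 as a fixed per-core slice, which a shared L3 is not.

```python
def unique_capacity_mb(l2_mb: float, l3_mb: float, inclusive: bool) -> float:
    """Per-core capacity that can hold distinct cache lines, in MB."""
    if inclusive:
        # Every L2 line is also resident in the L3, so only the L3 counts.
        return l3_mb
    # Victim/exclusive L3: a line lives in the L2 or the L3, not both.
    return l2_mb + l3_mb


def l3_not_mirroring_l2_mb(l2_mb: float, l3_mb: float, inclusive: bool) -> float:
    """L3 space per core that is not spent duplicating L2 contents, in MB."""
    if inclusive:
        return max(l3_mb - l2_mb, 0.0)
    return l3_mb


# Hypothetical per-core sizes, chosen only for illustration.
L2_MB, L3_MB = 1.0, 1.375

for inclusive in (True, False):
    kind = "inclusive" if inclusive else "victim"
    print(kind,
          unique_capacity_mb(L2_MB, L3_MB, inclusive),
          l3_not_mirroring_l2_mb(L2_MB, L3_MB, inclusive))
```

Under those assumed sizes, the inclusive case leaves only a small part of each L3 slice for anything beyond a copy of the L2, which is the "doesn't make sense" scenario. The victim case counts both levels in full.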
 
Mar 10, 2006
11,715
2,012
126
If they are "copying Zen" (rofl, flame-bait away) and going from a ring bus to meshes of X cores with Z type of interconnect between "core complexes", then yeah: if the L1 writes back to the L2 (not the L3) and the L3 becomes an eviction cache (just like Zen :p), then it makes sense.

If they keep their same ring bus cache design and keep the L1+L2 contents in the L3, then it doesn't make sense. Being a server CPU, I would assume all cores are in use, so the L3 would hold nothing but L1 and L2 data.

Or it's reading it wrong and either the L2 is half the size or the L3 is double the size.

iqFhmX1.jpg
 
Reactions: NTMBK

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
SANDRA confirms it.

If we work under the assumption that the world's best server processor maker is competent, we should be trying to understand why Intel made the choices that they did.
The current BDW-EP 22 core has 55MB cache.

So a 32-core SKL-EP will have 76MB L2+L3 cache?

Seems about in line with the core count increase.
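
A quick check of that scaling, as a sketch using only the numbers quoted in this post. Note that the 55MB figure appears to be L3 only while the 76MB figure is L2+L3, so the comparison is loose. The helper name is hypothetical.

```python
def cache_per_core_mb(total_mb: float, cores: int) -> float:
    """Total cache divided evenly across cores, in MB per core."""
    return total_mb / cores


bdw_per_core = cache_per_core_mb(55, 22)   # BDW-EP figure from the post
skl_per_core = cache_per_core_mb(76, 32)   # rumored SKL-EP figure from the post

core_ratio = 32 / 22    # growth in core count
cache_ratio = 76 / 55   # growth in quoted cache

print(f"BDW-EP: {bdw_per_core} MB/core, SKL-EP: {skl_per_core} MB/core")
print(f"cores x{core_ratio:.2f}, cache x{cache_ratio:.2f}")
```

So the quoted cache grows a little more slowly than the core count, and the per-core figure drops slightly.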
 

mikk

Diamond Member
May 15, 2012
4,141
2,154
136
By cutting some of the L3 cache they are able to increase the L2 cache size at the same time without sacrificing die size for the chip as a whole. There can be only one reason: performance or power/efficiency. Per-core L3 is still bigger than on a Core i5 with 6 MB overall. The IPC difference between an i5 @ 6 MB and an i7 @ 8 MB was always negligible. Therefore, saying that there is no point in the L3 is complete nonsense.
 
Reactions: Arachnotronic

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Maybe because L2 has lower latency than LLC and the LLC has become big enough?
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Kinda interesting moves from Intel with L2 caches. First they nerf desktop Skylake L2 from 8-way to 4-way. Now they can burn some of the power gains on a larger L2 cache?

And L2 latency is always a function of the "target" clock. Apple gets epic IPC by targeting a lower clock. Since this is Intel's specialized server core, maybe they are no longer targeting 4GHz+ but 3GHz+, and can put the gains into latency?
 

jpiniero

Lifer
Oct 1, 2010
14,605
5,224
136
Kinda interesting moves from Intel with L2 caches. First they nerf desktop Skylake L2 from 8-way to 4-way. Now they can burn some of the power gains on a larger L2 cache?

You have to remember that the mainstream Skylake was... nerfed I guess, to get the power down so that it would realistically sort of work in a tablet. With a big server, that's less of an issue.
 
Reactions: Arachnotronic
Mar 10, 2006
11,715
2,012
126
Something tells me that the HEDT CPUs are going to become the choice for gamers/enthusiasts rather than the mobile/power optimized mainstream chips beginning with SKL-X.

I might hold off on recommending KBL-S to people if they can hold out for X299.
 
Mar 10, 2006
11,715
2,012
126
The current BDW-EP 22 core has 55MB cache.

So a 32-core SKL-EP will have 76MB L2+L3 cache?

Seems about in line with the core count increase.

Broadwell-EP 22C/44T version has 55MB L3$, 256KB L2$/core, so in terms of raw cache on the die the Broadwell has more.
 
witeken

Diamond Member
Dec 25, 2013
3,899
193
106
By cutting some of the L3 cache they are able to increase the L2 cache size at the same time without sacrificing die size for the chip as a whole. There can be only one reason: performance or power/efficiency. Per-core L3 is still bigger than on a Core i5 with 6 MB overall. The IPC difference between an i5 @ 6 MB and an i7 @ 8 MB was always negligible. Therefore, saying that there is no point in the L3 is complete nonsense.
Broadwell-EP 22C/44T version has 55MB L3$, 256KB L2$/core, so in terms of raw cache on the die the Broadwell has more.

Not really: 66MB vs. 76MB. But in any case, it's very obvious, isn't it?

Why would Intel want to put more L3$ on the die when they have 3D XPoint?

Intel-Xeon-E7-E5-Skylake-EX-_Purely-Platform_Vs-Nehalem.jpg
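
The post does not show how the 66MB and 76MB totals were reached. One reading, assumed here and not confirmed in the thread, is that 66MB counts the full 24-core Broadwell die (60MB L3 plus 24 x 256KB L2) rather than the 22-core SKU. The Skylake split of 1MB L2 plus 1.375MB L3 per core is likewise an assumption, picked because it reproduces 76MB for 32 cores. A sketch of the arithmetic, with a hypothetical helper name:

```python
KB_PER_MB = 1024


def total_cache_mb(cores: int, l2_kb_per_core: float, l3_mb_total: float) -> float:
    """Sum of all private L2s plus the shared L3, in MB."""
    return cores * l2_kb_per_core / KB_PER_MB + l3_mb_total


# 22-core Broadwell-EP SKU, figures as quoted in the thread.
bdw_22c = total_cache_mb(22, 256, 55)

# ASSUMPTION: full 24-core Broadwell die with 60MB of L3.
bdw_24c_die = total_cache_mb(24, 256, 60)

# ASSUMPTION: 32-core Skylake with 1MB L2 and 1.375MB L3 per core.
skl_32c = total_cache_mb(32, 1024, 32 * 1.375)

print(bdw_22c, bdw_24c_die, skl_32c)
```

On these assumptions the 22-core part comes out below either total, so which chip "has more" depends on whether you count the enabled cores or the whole die.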
 

Nothingness

Platinum Member
Jul 3, 2013
2,421
751
136
By cutting some of the L3 cache they are able to increase the L2 cache size at the same time without sacrificing die size for the chip as a whole. There can be only one reason: performance or power/efficiency. Per-core L3 is still bigger than on a Core i5 with 6 MB overall. The IPC difference between an i5 @ 6 MB and an i7 @ 8 MB was always negligible. Therefore, saying that there is no point in the L3 is complete nonsense.
Though I agree with your conclusion, you shouldn't compare server chips against mainstream ones, especially non-i7 vs Xeon. As Arachnotronic points out, the Broadwell-EP 22C has more raw cache than this Skylake. And in fact all Broadwell-EP parts have more cache per core; some Broadwell E7s go even further, with 15MB/core of L3.
 
Reactions: Arachnotronic

itsmydamnation

Platinum Member
Feb 6, 2011
2,773
3,151
136
SANDRA confirms it.

If we work under the assumption that the world's best server processor maker is competent, we should be trying to understand why Intel made the choices that they did.
Coming from you that is amazingly funny. I was deliberately stirring, as it is known that Intel is moving away from the full dual-lane ring bus. But even if it's still a ring bus and the L3 is 1.25MB per core, the L3 still has a reason: it's there to keep cache coherency "simple".

KNL doesn't have an L3. It has to keep all the L2s coherent, and it also shares each L2 between two cores. KNL also has 3 modes the interconnect operates in, because I'm assuming all-to-all has quite bad latency. I can't find latency numbers for any of the modes; I'm guessing they're not that great (they don't need to be for KNL, it just needs massive bandwidth). On top of this, the L2 can only do one read and one write per cycle (and it has two cores attached).

That doesn't sound very good for a latency-sensitive core. If the fabric logic is the same for Skylake Xeon then you would expect each tile to serve only one core. Now the questions are:

Is the L3 victim or write-back?
What mode does the fabric run in?

1. If the L3 is victim and the fabric is in Quadrant mode, then you have something very similar to Zen CCX + GMI, but it would probably have more inter-quadrant/CCX bandwidth.

2. If the L3 is write-back then it's not there for extra storage but to keep the L2 fast for the local core; maybe they could run that in all-to-all mode. Hell, EX cut L2 in half to 1MB per core and there are very few workloads that saw big performance losses, but I think the first option looks better. The crazy choice would be that both factors are configurable.
 
Reactions: Dave2150