Skylake-EP Geekbench 4

Arachnotronic · Dec 3, 2016

2x, 12-Core Intel Skylake-E (CPUID 50652, SKX) + GB4 http://bit.ly/2fywouB 1024KB L2, 1408KB L3 / core @Intel #Skylake-E

https://twitter.com/InstLatX64/status/794160063971725312

1MB L2$ per CPU core, 1.375MB of L3$ per core

VirtualLarry · Dec 3, 2016

Wow, serious competition for Zen.

jpiniero · Dec 3, 2016

That result is old. There's a second result if you search on the mobo which is also old but slightly faster.

Arachnotronic · Dec 3, 2016

jpiniero said:
That result is old. There's a second result if you search on the mobo which is also old but slightly faster.

The key is the L2$ cache size, not the performance number. It's not 512KB as thought, but 1MB.

Drazick · Dec 4, 2016

Large L2$ Cache will be great for Signal / Image / Data Processing workloads.

Will it also be on the i7 78xx / 79xxx?

Nothingness · Dec 4, 2016

Arachnotronic said:
The key is the L2$ cache size, not the performance number. It's not 512KB as thought, but 1MB.

That will definitely have an impact on cache latency. Can't wait to see benchmark results to see the impact

itsmydamnation · Dec 4, 2016

its odd it has to be miss reading it there is almost no point int the L3......

Nothingness · Dec 4, 2016

itsmydamnation said:
its odd it has to be miss reading it there is almost no point int the L3......

Perhaps L2 and L3 are exclusive. Even if they are not, remember that L3 are shared across all cores on a die.

itsmydamnation · Dec 4, 2016

Nothingness said:
Perhaps L2 and L3 are exclusive. Even if they are not, remember that L3 are shared across all cores on a die.

If they are "copying Zen" (rolf flame bate away) and going from ring bus to meshes of X number of cores with Z type of interconnect between "core complex" then yeah if they move to the L1 writing back to the L2 ( not the L3) and the L3 to an eviction cache (just like Zen

) then it makes sense.

If they keep their same ring bus cache design and keep the L1+L2 in the L3 then it doesn't make sense. being a server CPU i would assume all cores in use so the L3 would hold nothing but L1 and L2 data.

Or its reading it wrong and either the L2 is 1/2 the size or the L3 is double the size.

jpiniero · Dec 4, 2016

http://ranker.sisoftware.net/top_ru...f3cefbddb588bd9be3deefc9acc9f4c4e291ac9c&l=en

I wouldn't read anything into the scores (esp since it's Sandra), but the top three are all Skylake-EP it looks.

Arachnotronic · Dec 4, 2016

itsmydamnation said:
If they are "copying Zen" (rolf flame bate away) and going from ring bus to meshes of X number of cores with Z type of interconnect between "core complex" then yeah if they move to the L1 writing back to the L2 ( not the L3) and the L3 to an eviction cache (just like Zen ) then it makes sense.

If they keep their same ring bus cache design and keep the L1+L2 in the L3 then it doesn't make sense. being a server CPU i would assume all cores in use so the L3 would hold nothing but L1 and L2 data.

Or its reading it wrong and either the L2 is 1/2 the size or the L3 is double the size.

Arachnotronic · Dec 4, 2016

itsmydamnation said:
its odd it has to be miss reading it there is almost no point int the L3......

SANDRA confirms it.

If we work under the assumption that the world's best server processor maker is competent, we should be trying to understand why Intel made the choices that they did.

witeken · Dec 4, 2016

Arachnotronic said:
SANDRA confirms it.

If we work under the assumption that the world's best server processor maker is competent, we should be trying to understand why Intel made the choices that they did.

The current BDW-EP 22 core has 55MB cache.

So a 32-core SKL-EP will have 76MB L2+L3 cache?

Seems about in-line with core count increase.

mikk · Dec 4, 2016

By cutting some of the L3 cache they are able to increase its L2 cache size at the same time without sacrificing for Die size of the chip as a whole. There can be only one reason. This can be only a performance or power/efficiency as the reason. Per core L3 is still biger than a Core i5 with 6 MB overall. IPC difference between i5 @6MB and i7 @8 MB always was negligible. Therefore saying that there is no point in the L3 is a complete nonsense.

witeken · Dec 4, 2016

Maybe because L2 has lower latency than LLC and the LLC has become big enough?

JoeRambo · Dec 4, 2016

Kinda interesting moves from Intel with L2 caches. First they nerf desktop Skylake L2 from 8-way to 4-way. Now they can burn some of power gains on larger L2 cache?

And L2 latency is always a function of "target" clock, Apple has epic IPC by targetting lower clock, since this is Intel's specialized server core, maybe they are no longer targetting 4Ghz+, but 3Ghz+ and can put the gains in latency?

jpiniero · Dec 4, 2016

JoeRambo said:
Kinda interesting moves from Intel with L2 caches. First they nerf desktop Skylake L2 from 8-way to 4-way. Now they can burn some of power gains on larger L2 cache?

You have to remember that the mainstream Skylake was... nerfed I guess, to get the power down so that it would realistically sort of work in a tablet. With a big server, that's less of an issue.

Arachnotronic · Dec 4, 2016

Something tells me that the HEDT CPUs are going to become the choice for gamers/enthusiasts rather than the mobile/power optimized mainstream chips beginning with SKL-X.

I might hold off on recommending KBL-S to people if they can hold out for X299.

witeken · Dec 4, 2016

BTW, isn't Purley actually called Skylake-EX?

http://wccftech.com/massive-intel-xeon-e5-xeon-e7-skylake-purley-biggest-advancement-nehalem/

Arachnotronic · Dec 4, 2016

witeken said:
The current BDW-EP 22 core has 55MB cache.

So a 32-core SKL-EP will have 76MB L2+L3 cache?

Seems about in-line with core count increase.

Broadwell-EP 22C/44T version has 55MB L3$, 256KB L2$/core, so in terms of raw cache on the die the Broadwell has more.

Arachnotronic · Dec 4, 2016

witeken said:
BTW, isn't Purley actually called Skylake-EX?

http://wccftech.com/massive-intel-xeon-e5-xeon-e7-skylake-purley-biggest-advancement-nehalem/

Purley is the name of the platform. In the past, Intel's EX and EP platforms were separate (e.g. Grantley was Haswell-EP/Broadwell-EP; Brickland was Ivy/Haswell/Broadwell-EX). With Purley, the EP/EX platforms have converged. So Skylake-EP and Skylake-EX share the same Purley platform.

witeken · Dec 4, 2016

mikk said:
By cutting some of the L3 cache they are able to increase its L2 cache size at the same time without sacrificing for Die size of the chip as a whole. There can be only one reason. This can be only a performance or power/efficiency as the reason. Per core L3 is still biger than a Core i5 with 6 MB overall. IPC difference between i5 @6MB and i7 @8 MB always was negligible. Therefore saying that there is no point in the L3 is a complete nonsense.

Arachnotronic said:
Broadwell-EP 22C/44T version has 55MB L3$, 256KB L2$/core, so in terms of raw cache on the die the Broadwell has more.

Not really: 66MB vs. 76MB. But in any case, it's very obvious, isn't it?

Why would Intel want to put more L3$ on the die when they have 3D XPoint?

Intel-Xeon-E7-E5-Skylake-EX-_Purely-Platform_Vs-Nehalem.jpg

Arachnotronic · Dec 4, 2016

witeken said:
Why would Intel want to put more L3$ on the die when they have 3D XPoint?

On-chip caches are much higher bandwidth/lower latency than external DRAM memory. 3DXPoint is much slower than regular DRAM.

Nothingness · Dec 4, 2016

mikk said:
By cutting some of the L3 cache they are able to increase its L2 cache size at the same time without sacrificing for Die size of the chip as a whole. There can be only one reason. This can be only a performance or power/efficiency as the reason. Per core L3 is still biger than a Core i5 with 6 MB overall. IPC difference between i5 @6MB and i7 @8 MB always was negligible. Therefore saying that there is no point in the L3 is a complete nonsense.

Though I agree with your conclusion, you shouldn't compare server chips against mainstream ones, especially non-i7 vs Xeon. As Arachnotronic points out, Broadwell-EP 22C have more raw cache than this Skylake. And in fact all Broadwell-EP have more cache/core; some Broadwell E7 go even farther with 15MB/core of L3.

itsmydamnation · Dec 4, 2016

Arachnotronic said:
SANDRA confirms it.

If we work under the assumption that the world's best server processor maker is competent, we should be trying to understand why Intel made the choices that they did.

Coming from you that is amazingly funny. I was deliberately stirring as it is known that intel is moving away from the full dual lane ring bus. But even if its still ring bus and is 1.25Mb per core the L3 still has a reason, its to keep cache coherency "simple".

KNL doesn't have an L3. It has to keep all the L2's coherrent, it also shares its L2 between two cores. KNL also has 3 modes the interconnect operates in because im assuming all-to-all has quite bad latency , i can't find any latency number for any of the modes, im guessing their not that great (they dont need to be for KNL, it just needs to have massive bandwidth). On top of this the L2 can also only do one read and write per cycle (it has two cores attached).

That doesn't sound very good for a latency sensitive core. If the fabric logic is the same for Skylake Xeon then you would expect each tile to only serve one core, now the questions are:

is the L3 victim or write back?
what mode does the fabric run in?

1. if the L3 is victim and the fabric is in Quadrant Mode, then you have something very similar to Zen CCX + GMI, but it would probably have more inter quadrant/ccx bandwidth.

2. If the L3 is write back then its not there for extra storage but to keep the L2 fast for the local core maybe they could run that in all-to-all mode. Hell EX cut L2 in half to 1mb per core and there are very few workloads that saw big performance losses but i think the first option looks better. The crazy choice would be both factors are configurable.

Skylake-EP Geekbench 4

Lifer

No Lifer

Lifer

Lifer

Member

Platinum Member

Platinum Member

Platinum Member

Platinum Member

Lifer

Lifer

Lifer

Diamond Member

Diamond Member

Diamond Member

Golden Member

Lifer

Lifer

Diamond Member

Lifer

Lifer

Diamond Member

Lifer

Platinum Member

Platinum Member