Intel Skylake / Kaby Lake


Edrick

Golden Member
Feb 18, 2010
1,939
230
106
Anyone seen any OC reports on the MESH? I'm kinda scared to go above 3.2 on mine.
 
Mar 10, 2006
11,715
2,012
126
Anyone seen any OC reports on the MESH? I'm kinda scared to go above 3.2 on mine.

On my motherboard (MSI X299 Carbon), when I set the mesh ratio (well, it's called "ring") to anything above stock, it automatically limits my CPU speed to that clock.

Something ain't right with this BIOS. Will update it tomorrow morning, hoping that'll fix things.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
What I find very surprising is the lower L2 bandwidth. I would have expected higher bandwidth to help feed the AVX-512 units.

I suspect they made a mistake on the cache latency tests. Take a look at the 6900K @ 3GHz here: http://ocinc.in/intel-broadwell-e-cpu-comparison-10c-vs-8c-vs-6c/

Hardware.fr L3 Read results normalized to 3GHz and 8 cores:
6900K: 210GB/s
7900X: 64.5GB/s

You can see ocinc results are roughly in line with Hardware.fr results for the 6900K.

Now look at 7900X results here: https://www.youtube.com/watch?v=YgHhV3ZdiuI

The 7900X @ 4.3GHz with 10 cores achieves 1300GB/s of bandwidth for its L3.

Look also here: https://www.hardwaremag.fr/wp-content/uploads/2017/06/aida64-memory-cache-intel-core-i9-7900x.png

7900X @ 4.4GHz gets L3 bandwidth of nearly 1200GB/s. The L2 cache bandwidth on Hardware.fr is seriously off too. I'm not talking about a 50% difference, I'm talking about 10x.

Something's really up with the Hardware.fr test.

The mesh is staying though, it's got to be a prerequisite for EMIB.

Why? Another poster and I talked about this earlier in this very thread. Mesh is an internal interconnect. EMIB is an off-die connection. Totally different.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
It's clear that the beast needs to be fed better. Hopefully, with the 14nm++ refresh (Cascade Lake), Intel ships these things with a faster L3 cache and higher out-of-the-box memory speed support.

The difference, if true, is astonishing: a 33% increase in mesh speed results in a 12% increase in Tomb Raider and a 5% increase in Total War.

Unfortunately, I do not believe the situation will improve much, if at all.

Skylake-X is disappointing for client because it's a derivative of a server architecture. On servers, many cores are more readily taken advantage of, so you can go with lower clocks but more cores. Since there's an upper limit on frequency of about 5GHz, and that's with overclocking, the stock chips end up clocked a fair bit lower.

The die is already complex with all the additions, so to reach the clock speeds expected on the client side, something had to go. That happens to be the mesh, which the L3 cache uses to communicate, and since it's a much more integrated interconnect (unlike Infinity Fabric), it essentially comes to represent L3 performance.

But on server, they won't have such issues. Server cores are in the lower-3GHz range, which means client cores run 30-45% higher in frequency. Because uncore speeds likely won't change, relative to the cores it feeds, the mesh will perform 30-45% better on server than on client.

The real end of Moore's Law isn't about density, or whatever the nm figure (or what marketing wants you to believe) is supposed to represent. It's that there's no such thing as pure gains anymore. Using a car analogy: with a newer generation, you used to end up with a truck that blows away the hauling capacity of preceding trucks while still outperforming the previous generation's race-oriented cars in acceleration and top speed.

Now, it's all a trade-off. That's why purpose-built chips and circuits are all the rage: GPUs, ASICs, various accelerator instructions and circuits in CPUs. Trade-offs are a normal thing in every other part of human life. Except... computers. Until now.

Presumably it travels from one die to another via EMIB using a mesh "lane".

Yea... no. Totally different. The mesh is a purpose-built internal interconnect. An off-die connection has different electrical and physical aspects to consider, and because of that it ends up being a very different thing.
 

DrMrLordX

Lifer
Apr 27, 2000
21,583
10,785
136
That's not too bad. How does your chip's performance stack up against a 6800K in the 4.2-4.3 GHz range?
 

moonbogg

Lifer
Jan 8, 2011
10,635
3,095
136
In Cinebench R15 I'm looking at 1474 cb for multicore and 195 cb for single-core.

That's a good score. I wonder how gaming performance is in comparison. I bet if you direct-die cooled that chip you could sit pretty at 4.8 without issue. That would be sick.
 

DrMrLordX

Lifer
Apr 27, 2000
21,583
10,785
136
I'm not 100% sure what HWMonitor is going to report, though that is probably the amount of power it's estimating that you're using at the socket. I was more shooting for a total system power draw via something like a Kill-a-Watt.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
EMIB isn't even that, it is merely a packaging technology. It is basically a cheaper way to do interposers.

You are very correct. Thanks for pointing it out.

Technologies like interposers and EMIB are ways to allow very high bandwidth off-chip connections. What protocol the connection uses is entirely up to the designer of the chip. The mesh interconnect in Skylake-X seems to have almost been turned into a marketing term. The thing is, though, the ring bus and the interconnects before it that were responsible for internal communication have always existed and have almost always been proprietary.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
Hardware Unboxed redid their overclock test using a custom water loop. Stable clocks at 1.2V increased from 4.6GHz to 4.7GHz, temps dropped from 90C+ to 73C, and system power usage in CB dropped from 402W to 375W. Delidding the chip would likely shave another 15-20W off system power consumption.

For those of you who defend Intel's silence on their TIM decision, it's good to know the price enthusiasts end up paying for it, both in performance and, more importantly, in more powerful and expensive cooling solutions.
 
Mar 10, 2006
11,715
2,012
126
Anyone seen any OC reports on the MESH? I'm kinda scared to go above 3.2 on mine.

I tried 3.3GHz on my 7820X and even with a mesh voltage of 1.1V, Windows wouldn't even boot (BSOD).

3.2GHz seems to be the practical limit for the mesh, but that's a 33% overclock over the stock 2.4GHz and probably enough to pretty much remove the mesh as a perf bottleneck, so... I'm not complaining.
 

imported_ats

Senior member
Mar 21, 2008
422
63
86
You are very correct. Thanks for pointing it out.

Technologies like interposers and EMIB are ways to allow very high bandwidth off-chip connections. What protocol the connection uses is entirely up to the designer of the chip. The mesh interconnect in Skylake-X seems to have almost been turned into a marketing term. The thing is, though, the ring bus and the interconnects before it that were responsible for internal communication have always existed and have almost always been proprietary.

Intel has really barely mentioned it beyond the blog post they did, which would have been part of an IDF presentation (if that still existed). Compared to the mania around AMD and "Infinity Fabric", the hype is very muted. And while transitioning to a mesh is a significant engineering change, how much visibility/impact it has on customers remains to be seen. Realistically, the LCC die would probably be better off with a ring, but for the larger dies that would have gone beyond the confines of a single ring, it will be a pretty big deal.

It will be interesting to see if Xeon-D transitions to a mesh or stays with a ring (though I do suspect the next-gen Xeon-D will transition to more memory channels). Honestly, for most PC users, Xeon-D is probably a better starting point for a HEDT chip than Xeon, as Xeon carries a lot of overhead to support multi-socket configurations. The main issue with Xeon-D for the HEDT market is its current lack of memory channels (and people's weird fascination with more PCIe lanes than they'll ever realistically use). If the next-gen Xeon-D goes to 4 memory channels and, say, 32 PCIe lanes, I could see it easily taking the place of the LCC die in the HEDT lineup.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
I tried 3.3GHz on my 7820X and even with a mesh voltage of 1.1V, Windows wouldn't even boot (BSOD).

3.2GHz seems to be the practical limit for the mesh, but that's a 33% overclock over the stock 2.4GHz and probably enough to pretty much remove the mesh as a perf bottleneck, so... I'm not complaining.

Can you do default MLC runs at 2.4 and 3.2GHz, keeping everything else the same? Thanks in advance.
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
I tried 3.3GHz on my 7820X and even with a mesh voltage of 1.1V, Windows wouldn't even boot (BSOD).

3.2GHz seems to be the practical limit for the mesh, but that's a 33% overclock over the stock 2.4GHz and probably enough to pretty much remove the mesh as a perf bottleneck, so... I'm not complaining.

Well, that is a disappointment. It rules out SKL-X as the top gaming CPU. We have to wait for Coffee Lake.
So basically, if Intel:
  • improves L2 cache latency a bit
  • improves L3 cache latency a lot (mesh overclock) - the gaming crown is what no one admits to but lots of people look for, including me. This is the benefit Intel can offer over AMD: they cannot compete with Threadripper on MT performance/price, so they must offer something clearly better, and they are losing the gaming-crown advantage here
  • improves Skylake IPC a little
  • reduces power consumption
then we have a clear winner here.

I don't understand why they are going the same route as with the P4: chasing high frequencies no matter what and improving SIMD while not improving general workloads. Is AVX-512 so important, or is it the same as SSE2 in the P4 era?

But if AMD does the same with Ryzen... well.
 

jj109

Senior member
Dec 17, 2013
391
59
91
Hardware Unboxed redid their overclock test using a custom water loop. Stable clocks at 1.2V increased from 4.6GHz to 4.7GHz, temps dropped from 90C+ to 73C, and system power usage in CB dropped from 402W to 375W. Delidding the chip would likely shave another 15-20W off system power consumption.

For those of you who defend Intel's silence on their TIM decision, it's good to know the price enthusiasts end up paying for it, both in performance and, more importantly, in more powerful and expensive cooling solutions.

Mind you, that result likely isn't just from cooling from 93C down to 73C. Tom's Hardware measured the difference in leakage power, and it was only 5% between 60C and 100C, and very linear.

The 90C+ power result that HWUnboxed measured also involved thermal throttling, which means 402W wasn't even the ceiling.

There were a large number of firmware updates between the two tests as well, which could have tweaked the applied voltages or the power management. Honestly, it wouldn't have killed HWUnboxed to do a controlled test; it takes a few minutes to swap between a CLC and a custom loop on a bench.

Do know that SKL-X was never meant to be a gaming chip, let alone the top gaming chip. That's what the mainstream platform is for.

Here's something from the OCN X299 owners thread:

[image: benchmark results screenshot]


A common point of regression (the GPU driver) would explain why games aren't scaling as expected even when applications, 3DMark CPU tests, and game sub-tests (like the Civ 6 AI test) are fine.
 

TheF34RChannel

Senior member
May 18, 2017
786
309
136
Correct me if I'm wrong, but I'm pretty sure it's the new cache design causing the poor game performance, and not the GPU driver.