Discussion Intel current and future Lakes & Rapids thread

Page 460

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
-Sapphire Rapids doesn't have Gracemont cores. They're physically not there in the die shot. Where did you get that idea?

-The 34C MCC is weird.

-When Jim Keller said 800+ OoOE resources, he didn't mean a specific Intel CPU, but that it could be expanded and the performance be taken advantage of. He was basically saying there's room to grow.

-The MCC/LCC Intel dies are relatively inefficient in terms of core/die area because the I/O takes a lot of space.

Icelake-SP according to semianalysis:

XCC 42 cores: 640mm2
HCC 28 cores: 505mm2
LCC 16 cores: 370mm2

Does it mean that LCC has cores that are 1.5x larger than the ones used in XCC? No. Rather than having the I/O section separate as with AMD, Intel has it split between dies.

Yes, the cores are large, but not 80% larger.
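
To put rough numbers on the area-per-core point, here is a quick Python sketch. It only divides the quoted die areas by the quoted core counts, so it inherits whatever error is in those estimates; the names are made up for illustration.

```python
# Die areas (mm^2) and core counts as quoted above from SemiAnalysis.
# These are quoted estimates, not measured values.
ICELAKE_SP_DIES = {
    "XCC": (42, 640.0),
    "HCC": (28, 505.0),
    "LCC": (16, 370.0),
}

def area_per_core(dies):
    """Return total die area divided by core count for each die."""
    return {name: area / cores for name, (cores, area) in dies.items()}

per_core = area_per_core(ICELAKE_SP_DIES)
for name, value in per_core.items():
    print(f"{name}: {value:.1f} mm^2 of die per core")

# Ratio of die area per core, smallest die vs. largest die.
lcc_vs_xcc = per_core["LCC"] / per_core["XCC"]
print(f"LCC/XCC: {lcc_vs_xcc:.2f}x")
```

The die area per core rises as the core count falls, which is what you would expect if a roughly fixed I/O and uncore area is being spread over fewer cores.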
 

naukkis

Senior member
Jun 5, 2002
702
570
136
-When Jim Keller said 800+ OoOE resources, he didn't mean a specific Intel CPU, but that it could be expanded and the performance be taken advantage of. He was basically saying there's room to grow.

Wasn't he referring to the Sunny Cove successor after Willow Cove?

 
Reactions: AMDK11

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
The IO is definitely taking up quite a bit of space.

There is also the possibility that Intel might have some dark silicon in there to hit higher clocks.

EDIT: beyond the disabled core.

We still don’t know what this chip clocks at, correct?

EDIT: It would not surprise me if these chips managed to hit higher clocks than Ice Lake SP.
 

repoman27

Senior member
Dec 17, 2018
342
488
136
Even using those accurate measurements it seems silly to have an 80% increase in area per core on average. I hope whatever inflated size that much is worth it.


Yeah, that could be a reason, but at that point I'd really expect a lot more than the projected 20% gains. Even Icelake over Skylake gets +19% IPC for +38% area, so it seems weird that they need almost twice that again to reach the same increase. Otherwise it would be better to just place 80% more Icelake cores...
As noted by others, increasing the core count significantly alters the ratio of core to uncore area. By my measurement the ICL-SP CPU core tiles are 8.9 mm², and the SPR-SP ones are 13.5 mm². That makes the Golden Cove cores on 10+++ a little over 50% larger than Sunny Cove on 10+, which is still not insignificant.
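
As a rough sanity check on the gain-versus-area argument, the figures above can be plugged into a small Python sketch. It mixes an IPC gain with an area increase the same way the quoted post does, which is a simplification, and the "projected +20%" is the quoted projection rather than a measured result. The function name is hypothetical.

```python
def gain_per_area(perf_gain_pct, area_growth_pct):
    """Relative performance divided by relative area (1.0 = perf scaled with area)."""
    return (1 + perf_gain_pct / 100) / (1 + area_growth_pct / 100)

# Core tile sizes as measured in the post above (mm^2).
ICL_SP_TILE = 8.9
SPR_SP_TILE = 13.5
tile_growth_pct = (SPR_SP_TILE / ICL_SP_TILE - 1) * 100

cases = {
    "Icelake over Skylake (+19% IPC, +38% area, as quoted)": gain_per_area(19, 38),
    "SPR over ICL, measured tile growth, projected +20%": gain_per_area(20, tile_growth_pct),
    "SPR over ICL, the 80% per-core figure, projected +20%": gain_per_area(20, 80),
}

print(f"Core tile growth: {tile_growth_pct:.1f}%")
for label, value in cases.items():
    print(f"{label}: {value:.2f}")
```

With the measured tile sizes the ratio lands much closer to the Icelake-over-Skylake case than the 80% figure suggests.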

Intel has been touting how their tiles are superior to conventional chiplets on an organic substrate because the EMIB bridges can act as long wires, so they don't need to implement a separate SerDes and PHY on each end for the interconnect fabric. This does lead to considerable area and power savings; however, the reality is that disaggregation / disintegration always comes with significant trade-offs. The area dedicated to interconnect is still gated by the pitch of the top metal layer and micro-bump array. A little over 20% of the total area of each Sapphire Rapids XCC die is dedicated to on-package interconnection with the other tiles.

...
Icelake-SP according to semianalysis:
...
Those numbers from SemiAnalysis are rubbish. If you take a minute to read that page, you'll see that they were based on inaccurate floor-plan mockups. Furthermore, they extrapolated everything from the size of the LCC die, but didn't even manage to get that right (it's actually closer to 20.5 mm x 19.5 mm or 400 mm²). Most of their analysis stems from assumptions which have since proven incorrect, and they also blatantly lifted images from other websites and slapped their own watermarks on them, which isn't cool.

I measured images of the actual dies. Feel free to double check my work, but please stop citing what amounts to guesswork from a dodgy source.
 
Reactions: Tlh97 and Exist50

vstar

Member
May 8, 2019
46
39
61
Looks like Windows 11 comes with a noticeable (few %) performance uplift (vs Win 10 21H1) for Hybrid architectures. This bodes well for Alder Lake when it comes out later this year.

 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Wasn't he referring to the Sunny Cove successor after Willow Cove?

I just found the part in the presentation and he says "if you take Sunny Cove scaled to 800 instructions..."

Considering Keller is regarded as one of the best CPU architects, referring to 800+ instructions for a CPU that's going to get very little out of it doesn't sound right to me.
Again, he likely means there's room to grow when people keep saying there's no more performance to be had since voltage scaling is dead, Moore's law is dead, clock scaling is dead, etc.

Those numbers from SemiAnalysis are rubbish. If you take a minute to read that page, you'll see that they were based on inaccurate floor-plan mockups. Furthermore, they extrapolated everything from the size of the LCC die, but didn't even manage to get that right (it's actually closer to 20.5 mm x 19.5 mm or 400 mm²).

So that further proves my point. Yes, I get that they are not completely correct, but they are within the ballpark. 400mm2 is 8% larger than 370mm2, and that's for the smallest configuration. And my point was that IO takes a disproportionate amount of area in smaller core count configurations.
 

repoman27

Senior member
Dec 17, 2018
342
488
136
So that further proves my point. Yes, I get that they are not completely correct, but they are within the ballpark. 400mm2 is 8% larger than 370mm2, and that's for the smallest configuration. And my point was that IO takes a disproportionate amount of area in smaller core count configurations.
Your point was totally fine, btw. The only thing I took issue with was citing SemiAnalysis, which is why I snipped everything besides that in my response.
 
Reactions: IntelUser2000

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,685
136
Looks like Windows 11 comes with a noticeable (few %) performance uplift (vs Win 10 21H1) for Hybrid architectures. This bodes well for Alder Lake when it comes out later this year.

Here's the problem with not establishing a baseline for testing Win 11 on hybrid CPUs:

Windows 11 apparently offers big performance improvements over Windows 10
The laptop being used has an Intel Core i7-10875H and an NVIDIA GeForce RTX 2070 Super.
One test that he ran was Time Spy in 3DMark, which jumped from a score of 6,872 to 7,613. That’s both a bump in CPU and GPU, with the GPU score increasing from 6,927 to 7,426, and the CPU score increasing from 6,573 to 8,886.
For Geekbench, the single-core performance increased from 1,138 on Windows 10 to 1,251 on Windows 11, and the multi-core score increased from 6,284 to 7,444, so there are some impressive improvements there too.

Now here's the cherry on top, this may also be linked to power management:
It would seem that Microsoft is adding some new power management. The person that did the testing uninstalled any ASUS software that might have been controlling the fans, so everything that happened there was controlled by the OS. It was notable that the fans ran differently while running the tests, and as we all know, the cooler the CPU and GPU stay, the better the performance is.

But the temperatures don’t even seem to reflect that. When running 3DMark, the results said that it ran hotter, but the fans were running louder and more consistently.
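
For reference, the quoted scores can be turned into percentage uplifts with a few lines of Python. This is just arithmetic on the numbers quoted above; the dictionary labels are my own.

```python
# Windows 10 -> Windows 11 scores for the i7-10875H laptop, as quoted above.
SCORES = {
    "3DMark Time Spy overall": (6872, 7613),
    "3DMark Time Spy GPU": (6927, 7426),
    "3DMark Time Spy CPU": (6573, 8886),
    "Geekbench single-core": (1138, 1251),
    "Geekbench multi-core": (6284, 7444),
}

def uplift_percent(before, after):
    """Percentage change from the Windows 10 score to the Windows 11 score."""
    return (after - before) / before * 100.0

uplifts = {name: uplift_percent(b, a) for name, (b, a) in SCORES.items()}
for name, pct in uplifts.items():
    print(f"{name}: +{pct:.1f}%")
```

The i7-10875H is not a hybrid CPU, so gains of this size on it, including in the GPU score, cannot be credited to hybrid scheduling.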
 

Asterox

Golden Member
May 15, 2012
1,026
1,775
136
Looks like Windows 11 comes with a noticeable (few %) performance uplift (vs Win 10 21H1) for Hybrid architectures. This bodes well for Alder Lake when it comes out later this year.


Hm, all these tests, but not a single everyday application used by a lot of people. For example, the free applications Handbrake and 7-Zip run with no problem on this non-final trial version of W11. It's absurd, like little kids playing in the "Windows sand".

"This is still Windows"; it will never be as optimized as, for example, the Apple macOS + ARM ecosystem.

It is very naive to rely on a magical new Windows OS that significantly improves CPU performance. :mask:
 

mikk

Diamond Member
May 15, 2012
4,112
2,106
136
Here's the problem with not establishing a baseline for testing Win 11 on hybrid CPUs:

Windows 11 apparently offers big performance improvements over Windows 10

Now here's the cherry on top, this may also be linked to power management:


Just to make it clear, this refers to the i7-10875H test on YouTube. In this review there is also a big 3DMark gain, whereas there wasn't any 3DMark gain in the Lakefield test from hothardware, and the Digital Content Creation score wasn't better either. Therefore it can't be purely CPU related; it looks like the i7-10875H can use more power, so he should check the power management. If there was a higher power limit for the Lakefield system, I would expect improved 3D scores.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
Here's the problem with not establishing a baseline for testing Win 11 on hybrid CPUs:

Windows 11 apparently offers big performance improvements over Windows 10




Now here's the cherry on top, this may also be linked to power management:
Win11, or whatever it may be called, seems to target newer CPUs. That alone makes a difference in system performance, because the entire code base would be compiled for not-so-antique CPUs.
RHEL is doing the same; its baseline architecture is x86-64-v2. A scheduling improvement bringing a performance gain for single-core load sounds strange.

MS should set up something like libc HW Caps, or deliver system binaries from the Update server according to the CPU found on the system.
Or at least set a minimum requirement for CPUs from not so distant past.
x86 legacy is a blessing and a curse.
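
To make the baseline idea concrete, here is a minimal Python sketch of the kind of check a loader or update service could do before picking binaries built for x86-64-v2. It assumes Linux-style `/proc/cpuinfo` flag names (`pni` is how the kernel reports SSE3), and the function names are hypothetical; it is not how Windows or RHEL actually implement this.

```python
# Features required by the x86-64-v2 level, using /proc/cpuinfo flag names:
# CMPXCHG16B, LAHF/SAHF, POPCNT, SSE3 (reported as "pni"), SSE4.1, SSE4.2, SSSE3.
X86_64_V2_FLAGS = frozenset(
    {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
)

def parse_cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()

def missing_for_v2(flags):
    """Sorted list of x86-64-v2 features that are absent from `flags`."""
    return sorted(X86_64_V2_FLAGS - set(flags))

def supports_x86_64_v2(flags):
    return not missing_for_v2(flags)

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            host_flags = parse_cpu_flags(handle.read())
    except OSError:
        host_flags = set()  # not Linux, or /proc is unavailable
    print("x86-64-v2 capable:", supports_x86_64_v2(host_flags))
    print("missing:", missing_for_v2(host_flags))
```

A system that fails the check would simply fall back to the generic x86-64 build, which is the trade-off between one baseline for everyone and per-CPU binaries.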
 
Reactions: Tlh97

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
Here's the problem with not establishing a baseline for testing Win 11 on hybrid CPUs:

Windows 11 apparently offers big performance improvements over Windows 10




Now here's the cherry on top, this may also be linked to power management:
Just to make it clear, this refers to the i7-10875H test on YouTube. In this review there is also a big 3DMark gain, whereas there wasn't any 3DMark gain in the Lakefield test from hothardware, and the Digital Content Creation score wasn't better either. Therefore it can't be purely CPU related; it looks like the i7-10875H can use more power, so he should check the power management. If there was a higher power limit for the Lakefield system, I would expect improved 3D scores.
Win11, or whatever it may be called, seems to target newer CPUs. That alone makes a difference in system performance, because the entire code base would be compiled for not-so-antique CPUs.
RHEL is doing the same; its baseline architecture is x86-64-v2. A scheduling improvement bringing a performance gain for single-core load sounds strange.

MS should set up something like libc HW Caps, or deliver system binaries from the Update server according to the CPU found on the system.
Or at least set a minimum requirement for CPUs from not so distant past.
x86 legacy is a blessing and a curse.

According to the comments I have seen, they were different builds of Windows 11.
 
Reactions: Tlh97 and mikk

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,685
136
According to the comments I have seen, they were different builds of Windows 11.
Probably different builds, but what's more important is that the source of this benchmark documented his tests, showing different boosting behavior under Windows 11. The CPU boosted more aggressively, to the point of reaching thermal throttling more often. The scores themselves matter very little, as they were obtained in a very unorthodox manner: power management was left entirely to the OS instead of OEM software, and the benchmarks ran while both monitoring and video recording software were running.

Whether this happened to the Lakefield system is anybody's guess (more aggressive boost), the point is we shouldn't just assume Win 11 has a better hybrid CPU scheduler. We lack a proper baseline, and the hothardware tests lack proper thermal & power information. If Lakefield silicon was white, we could also assume that Win 11 runs faster on white CPUs.

Happily for us, Win 11 gets unveiled on June 24, so we'll find out what gains we're getting soon enough.
 
Reactions: Tlh97

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
Stumbled across this from @InstLatX64 on twitter:

1624129995540.png

He provides valid sources.

There are still some unknowns, but Gracemont looks like an absolute beast, possibly even faster than Skylake. Maybe we will be pleasantly surprised by ADL-S.

EDIT: ADL-P also appears to have 48 EUs (for the top SKU)?

EDIT: ADL-P ES 8x8 had 48 EUs running at 1.2 GHz.

EDIT:

Ryzen 4800H (2,894 MHz), Linux 5.8.13 crypto test:
[ 17.722331] raid6: avx2x4 gen() 29399 MB/s
[ 17.778997] raid6: avx2x4 xor() 8932 MB/s
[ 17.835665] raid6: avx2x2 gen() 30120 MB/s
[ 17.892331] raid6: avx2x2 xor() 24125 MB/s
[ 17.948997] raid6: avx2x1 gen() 25706 MB/s
[ 18.005663] raid6: avx2x1 xor() 18915 MB/s
[ 18.062331] raid6: sse2x4 gen() 16327 MB/s
[ 18.118997] raid6: sse2x4 xor() 7738 MB/s
[ 18.175664] raid6: sse2x2 gen() 16681 MB/s
[ 18.232330] raid6: sse2x2 xor() 12929 MB/s
[ 18.288998] raid6: sse2x1 gen() 14136 MB/s
[ 18.345664] raid6: sse2x1 xor() 10110 MB/s
[ 18.345665] raid6: using algorithm avx2x2 gen() 30120 MB/s

ADL-P (800 MHz), Linux 5.13.0rc6:
[ 0.692326] raid6: avx2x4 gen() 32869 MB/s
[ 0.709337] raid6: avx2x4 xor() 5236 MB/s
[ 0.726327] raid6: avx2x2 gen() 33171 MB/s
[ 0.743327] raid6: avx2x2 xor() 20256 MB/s
[ 0.760327] raid6: avx2x1 gen() 29890 MB/s
[ 0.777325] raid6: avx2x1 xor() 20386 MB/s
[ 0.794327] raid6: sse2x4 gen() 17645 MB/s
[ 0.811329] raid6: sse2x4 xor() 4680 MB/s
[ 0.828328] raid6: sse2x2 gen() 18736 MB/s
[ 0.845325] raid6: sse2x2 xor() 12584 MB/s
[ 0.862326] raid6: sse2x1 gen() 14603 MB/s
[ 0.879325] raid6: sse2x1 xor() 12719 MB/s
[ 0.879335] raid6: using algorithm avx2x2 gen() 33171 MB/s

Note that crypto tests can vary based on kernel version, cflags, etc.
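
For anyone who wants to compare the two logs line by line, here is a rough Python sketch that parses the `raid6:` lines and prints the ADL-P result relative to the 4800H. Only a few lines from each log are pasted in to keep it short. The listed clock figures are deliberately not used, on the assumption that it is unclear what clock either chip actually ran at during the test.

```python
import re

# Matches benchmark lines such as "raid6: avx2x4 gen() 29399 MB/s".
RAID6_LINE = re.compile(r"raid6:\s+(\w+)\s+(gen|xor)\(\)\s+(\d+)\s+MB/s")

def parse_raid6(log_text):
    """Map (algorithm, operation) -> MB/s, skipping the 'using algorithm' summary."""
    results = {}
    for line in log_text.splitlines():
        if "using algorithm" in line:
            continue
        match = RAID6_LINE.search(line)
        if match:
            algo, op, mbps = match.groups()
            results[(algo, op)] = int(mbps)
    return results

def relative_throughput(baseline, other):
    """Ratio other/baseline for every test present in both logs."""
    return {key: other[key] / baseline[key] for key in baseline if key in other}

# A few lines copied from the two logs above (not the full output).
RYZEN_4800H_LOG = """\
[ 17.722331] raid6: avx2x4 gen() 29399 MB/s
[ 17.778997] raid6: avx2x4 xor() 8932 MB/s
[ 17.835665] raid6: avx2x2 gen() 30120 MB/s
[ 18.345665] raid6: using algorithm avx2x2 gen() 30120 MB/s
"""
ADL_P_LOG = """\
[ 0.692326] raid6: avx2x4 gen() 32869 MB/s
[ 0.709337] raid6: avx2x4 xor() 5236 MB/s
[ 0.726327] raid6: avx2x2 gen() 33171 MB/s
[ 0.879335] raid6: using algorithm avx2x2 gen() 33171 MB/s
"""

ryzen = parse_raid6(RYZEN_4800H_LOG)
adl_p = parse_raid6(ADL_P_LOG)
for (algo, op), ratio in relative_throughput(ryzen, adl_p).items():
    print(f"{algo} {op}(): ADL-P at {ratio:.2f}x the 4800H result")
```

On these lines the gen() results come out roughly 10% higher on ADL-P while avx2x4 xor() is lower, so the picture is mixed even before accounting for the different kernel versions.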
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
There are still some unknowns, but Gracemont looks like an absolute beast, possibly even faster than Skylake. Maybe we will be pleasantly surprised by ADL-S.

God I really hope so. The wait to see how Alder Lake/Golden Cove performs is absolutely killing me! I was hoping there would have been a monumental leak by now, rather than just indications......although to be fair, the indications so far have been positive.

I have to admit I was disappointed by Intel's big little aspirations (understandable for mobile but for desktop?!?) when I first heard about it, but it's possible they may be on to something and seem to be going all in on that new paradigm shift; which I didn't think they would do if there wasn't a great benefit to it.

Whatever happens, I will do an upgrade this year as I've been putting it off long enough.
 
Reactions: pcp7

Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
And so the cycle begins. Early leaks show ADL could be a "beast," that comment gets extrapolated over the coming weeks/months, and then no matter how good the first real benches are ADL is immediately declared a failure and Intel is finished.
 
Reactions: Tlh97 and ryan20fun

Thala

Golden Member
Nov 12, 2014
1,355
653
136
Whether this happened to the Lakefield system is anybody's guess (more aggressive boost), the point is we shouldn't just assume Win 11 has a better hybrid CPU scheduler.

Actually, and unlike the folks from hothardware.com, you only need half a brain to conclude that a higher multicore score cannot be the result of a better hybrid scheduler. The thing is, when a benchmark generates as many threads as there are HW threads, the optimum schedule is trivial: even a non-hybrid scheduler is able to find the optimum schedule for hybrid architectures in this case. In other words, the optimum schedule is independent of the core type distribution.
The really interesting cases, where a hybrid scheduler is an advantage, are situations where the system is only partially loaded, but this is not exercised by a typical performance benchmark.
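
A toy model in Python illustrates the argument. It assumes one thread per core, made-up relative core speeds, and throughput that is simply the sum of the speeds of the busy cores. Real schedulers also have to deal with SMT, shared caches, power limits and imperfect scaling, none of which is modelled here.

```python
# Hypothetical relative core speeds (percent of a big core). Small cores are
# listed first so a type-unaware scheduler that fills cores in index order
# lands on them first.
CORE_SPEEDS = [60, 60, 60, 60, 100, 100, 100, 100]

def total_throughput(core_speeds, n_threads, hybrid_aware):
    """Sum of the speeds of the cores that receive a thread (one thread per core)."""
    if n_threads > len(core_speeds):
        raise ValueError("toy model assumes at most one thread per core")
    if hybrid_aware:
        order = sorted(core_speeds, reverse=True)  # fastest cores first
    else:
        order = list(core_speeds)                  # plain index order
    return sum(order[:n_threads])

for n in (2, 4, 8):
    naive = total_throughput(CORE_SPEEDS, n, hybrid_aware=False)
    aware = total_throughput(CORE_SPEEDS, n, hybrid_aware=True)
    print(f"{n} threads: naive={naive}, hybrid-aware={aware}")
```

In this model the two schedulers only differ under partial load; once every core has a thread, the totals are identical regardless of how the core types are distributed.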
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Actually, and unlike the folks from hothardware.com, you only need half a brain to conclude that a higher multicore score cannot be the result of a better hybrid scheduler. The thing is, when a benchmark generates as many threads as there are HW threads, the optimum schedule is trivial: even a non-hybrid scheduler is able to find the optimum schedule for hybrid architectures in this case.

Programs don't scale perfectly, so there is some room for optimization.
 
Reactions: Tlh97 and ryan20fun

Thala

Golden Member
Nov 12, 2014
1,355
653
136
Programs don't scale perfectly, so there is some room for optimization.

Sure, there could be many optimizations, mostly in the implementation of synchronization and message-passing primitives, the implementation of context switching, cache and TLB management, etc., but not in the algorithm that determines the schedule in the above situation (i.e. the task of the scheduler). That's because the schedule is trivial, and even the most barebones scheduler would be able to determine the optimum schedule.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
We'll see if Windows 11 actually delivers better Hybrid performance or even better performance in general.

You can see from Geekbench Windows 10 10875H devices performing just as well, if not better than that Windows 11 test.

From my memory, Windows versions never really brought performance improvements.

Stumbled across this from @InstLatX64 on twitter:

He's basically saying FMA enabling increased scores by 1.4x from SSE, and it took FMA + 256-bit to double the scores, which it manages to do.

So based on that Gracemont has full support for AVX2, meaning 256-bit FMA, which is 4x theoretical FP throughput from Tremont.

I don't expect it to be significantly faster than Skylake if it does. Maybe 5%.
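
The 4x figure above is just vector width times FMA, which a small Python sketch makes explicit. It assumes the same number of FP pipes and the same issue rate on both cores, and that 256-bit operations execute at full width; if either assumption is wrong for the real hardware, the ratio changes. The names are illustrative only.

```python
def flops_per_cycle_per_pipe(vector_bits, element_bits=32, fma=False):
    """Peak FP operations per cycle for one pipe issuing one vector op per cycle."""
    lanes = vector_bits // element_bits
    ops_per_lane = 2 if fma else 1  # an FMA counts as a multiply plus an add
    return lanes * ops_per_lane

# 128-bit SSE without FMA, as described for Tremont.
sse_128 = flops_per_cycle_per_pipe(128, fma=False)
# 256-bit AVX2 with FMA, as the post infers for Gracemont.
avx2_256_fma = flops_per_cycle_per_pipe(256, fma=True)

print(f"128-bit, no FMA: {sse_128} FP32 ops/cycle/pipe")
print(f"256-bit, FMA:    {avx2_256_fma} FP32 ops/cycle/pipe")
print(f"Theoretical ratio: {avx2_256_fma / sse_128:.0f}x")
```

This is peak throughput only; real workloads rarely get close to it, which fits with the modest expectations stated above.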
 
Reactions: Carfax83

mikk

Diamond Member
May 15, 2012
4,112
2,106
136
The fans ran differently on this i7-10875H device, as he said. If his device struggled with temps before, it is logical that his scores went up with a more aggressive fan curve. The CPU+GPU increase is not Windows 11 (scheduler) related, imho. The tested Lakefield device from hothardware, on the other hand, is a fanless device, and a more aggressive boost would pretty much require a power limit increase, because just 5W or 7W is a big bottleneck. GPU scores didn't improve, so a PL1/PL2 increase seems unlikely to me.