Discussion Intel current and future Lakes & Rapids thread

Ajay · Aug 25, 2021

dullard said:
I'm not sure where you got the 4th power from.

An amazing ability to make simple math errors 😵.

eek2121 · Aug 25, 2021

Abwx said:
Including those two (and a two others subscores that perform not well) ADL is still like 17% faster than your CML clock/clock, and close to 20% faster than the original SKL since IPC has been slightly improved by CML.

And still perform better at 4.2GHz than a 5.1GHz 10C CML..?.

The clocks of the chip were 4.8ghz, not 4.2, and yes. Absolutely possible. A quad core 1195g7 scores around 5500. Add 15%, and then you have the small cores…

EDIT: I think people think this chip has 8 big and no small, but the cache implies otherwise.

coercitiv · Aug 25, 2021

eek2121 said:
Aren’t the non-K variants locked to 65W?

Nope.

Abwx · Aug 25, 2021

eek2121 said:
The clocks of the chip were 4.8ghz, not 4.2, and yes. Absolutely possible. A quad core 1195g7 scores around 5500. Add 15%, and then you have the small cores…

EDIT: I think people think this chip has 8 big and no small, but the cache implies otherwise.

Datas and instructions caches are accurate in respect of big core (L2 should be 1.25MB/core), small core has 64kB instruction cache.

Intel Alder Lake im technischen Detail

Zum Architecture Day 2021 hat Intel umfangreiche Details zur größten CPU-Neuvorstellung des Jahres geteilt: Alder Lake. Ein Überblick.

www.computerbase.de

eek2121 · Aug 25, 2021

coercitiv said:
Nope.

The PL1 for non-K variants is 65W. I was just curious if it could be changed, since I have only ever owned K variants…not that I ever buy Intel these days.

Abwx said:
Datas and instructions caches are accurate in respect of big core (L2 should be 1.25MB/core), small core has 64kB instruction cache.

Intel Alder Lake im technischen Detail

Zum Architecture Day 2021 hat Intel umfangreiche Details zur größten CPU-Neuvorstellung des Jahres geteilt: Alder Lake. Ein Überblick.

www.computerbase.de

EDIT: The L2 count rather, sorry! tabbing between things and I got mixed up. It says 2x.

eek2121 · Aug 25, 2021

eek2121 said:
The PL1 for non-K variants is 65W. I was just curious if it could be changed, since I have only ever owned K variants…not that I ever buy Intel these days.

EDIT: The L2 count rather, sorry! tabbing between things and I got mixed up. It says 2x.

you are right. Each cluster of 4 cores gets 1.25mb cache.

jpiniero · Aug 25, 2021

eek2121 said:
The PL1 for non-K variants is 65W. I was just curious if it could be changed, since I have only ever owned K variants…not that I ever buy Intel these days.

Just the multiplier limit is enforced.

Mopetar · Aug 25, 2021

I wouldn't read anything into slides labeled "for illustrative purposes only" because they contain no hard data.

However on the face of it something has gone horribly wrong at Intel if the energy efficient Atom core isn't energy efficient and actually worse than the performance core.

Instead, I think the way to "read" the charts is that the efficiency cores don't scale past a certain amount of power. Take Apple's M1 SoC which does quite well, but you can't feed it the same 100W+ that x86 desktop CPUs are designed to handle.

Once you push any CPU core to a certain point it falls off hard and an efficiency core going to 4 GHz is a bit surprising. As others have pointed out the base clock will be much lower and so will the sweet spot of where it gets peak performance per watt.

Doug S · Aug 25, 2021

Mopetar said:
Instead, I think the way to "read" the charts is that the efficiency cores don't scale past a certain amount of power. Take Apple's M1 SoC which does quite well, but you can't feed it the same 100W+ that x86 desktop CPUs are designed to handle.

Once you push any CPU core to a certain point it falls off hard and an efficiency core going to 4 GHz is a bit surprising. As others have pointed out the base clock will be much lower and so will the sweet spot of where it gets peak performance per watt.

You can call anything an "efficiency" core. It just needs to be more efficient than what you call your "performance" core.

If you strip out SMT, strip out AVX512, reduce the size of caches, buffers, rename registers etc. and have narrower decode/execution/load/store and redo the timing so your FO4 delays limit you to lower max clock rate then you have a core that is smaller, has a higher IPC, and is more energy efficient than your big core.

How much more efficient depends on how cut down it is. If your goal is for maximum efficiency then it won't contribute much to multithreaded loads but when your big cores are shut down you'll have huge power savings. If you want it to be a big contributor to multithreaded loads (and indeed it seems like that was one of Intel's design goals) you have to aim for less efficiency to get there. Its a compromise, you can't have both massive efficiency and major contribution to multithreaded loads.

coercitiv · Aug 25, 2021

Mopetar said:
However on the face of it something has gone horribly wrong at Intel if the energy efficient Atom core isn't energy efficient and actually worse than the performance core.

Instead, I think the way to "read" the charts is that the efficiency cores don't scale past a certain amount of power. Take Apple's M1 SoC which does quite well, but you can't feed it the same 100W+ that x86 desktop CPUs are designed to handle.

I'm pretty sure E-cores will be energy efficient in mobile, but on the desktop their aim will be to maximize area efficiency instead.

Even if P-core is twice as power efficient than E-core at high clocks, the 4x E-core complex is still twice as area efficient.

JoeRambo · Aug 26, 2021

Intel Corporation Alder Lake Client Platform - Geekbench Browser

Benchmark results for an Intel Corporation Alder Lake Client Platform with an Intel Core i9-12900K processor.

browser.geekbench.com

Getting better, almost 1900ST score.

uzzi38 · Aug 26, 2021

This is much more where I'd expect final silicon to perform. Could still be a touch slow, but any more differences it might be worth just chalking up to poor DDR5 or something.

Here's my 5950X for comparison: https://browser.geekbench.com/v5/cpu/6576128

Idk what PBO setting I ran the test at so ignore the MT score. Looking at ST score and we have:

5950X vs 12900K
Crypto: 4083 vs 4990
INT: 1459 vs 1614
FP: 1872 vs 1980

Det0x · Aug 26, 2021

Here is my max tuned 5950x vs Alderlake

Not all much difference in ST score, and its mostly down to AVX it seems..

Not sure this will be enough against Zen3d considering how good rocket lake 11900k score in this synthetic benchmark compared to "realworld performance"..

Timorous · Aug 26, 2021

uzzi38 said:
This is much more where I'd expect final silicon to perform. Could still be a touch slow, but any more differences it might be worth just chalking up to poor DDR5 or something.

Here's my 5950X for comparison: https://browser.geekbench.com/v5/cpu/6576128

Idk what PBO setting I ran the test at so ignore the MT score. Looking at ST score and we have:

5950X vs 12900K
Crypto: 4083 vs 4990
INT: 1459 vs 1614
FP: 1872 vs 1980

That does not seem great when the 11900K can get
Crypto: ~6,000
Int: ~ 1,600
FP: ~1,900

JoeRambo · Aug 26, 2021

Memory tuning matters, Text Compression / Nav scores while improved vs previuos score are still horribad.

It looks like GC will score ~2100 GB5 ST when paired with 6400+ DDR5

uzzi38 · Aug 26, 2021

Timorous said:
That does not seem great when the 11900K can get
Crypto: ~6,000
Int: ~ 1,600
FP: ~1,900

Comparing highly tuned scores with obviously not tuned ones is gonna give you a bad time.

Timorous · Aug 26, 2021

uzzi38 said:
Comparing highly tuned scores with obviously not tuned ones is gonna give you a bad time.

There are higher 11900K results. Those ones had 3600 ram with unknown timings. Besides 1850-1900 total matches the pre-release scores for the 11900K that were leaking.

Geekbench 5 stock result from a review 1870

March runs - Anything sub 1800 is using sub 3200 ram.

I expect the AL results are using sub-optimal ram but those Int and FP scores are not great vs RL.

IntelUser2000 · Aug 26, 2021

On a properly behaving system, the MT/ST score ratio is somewhat lower than the MT/ST core ratio.

So if you see 7x gains, that's a CPU with 8 cores. With Hyperthreading, it's slightly over 8x.

Now for the Intel chips, that seems to hold true for earlier Skylake parts, like up to Kabylake. The ratio falls a bit after that. Probably because it's overextending 14nm?

Based on that the 17K score for that chip means it only has the 8 Golden Cove cores enabled and Hyperthreading is on.

Det0x said:
Here is my max tuned 5950x vs Alderlake
Not sure this will be enough against Zen3d considering how good rocket lake 11900k score in this synthetic benchmark compared to "realworld performance"..

It's not just due to AVX. The Intel chips do better relative to AMD on Geekbench.

I speculate maybe it has to do with optimizations they have done back in the days when they wanted to compete in the portable space. I noticed that with the Bay Trail graphics doing much better on the mobile oriented ones such as GFXBench relative to other PC oriented benches.

Also, the Amberlake chip does pretty damn well on Geekbench. In the real world, it's actually quite horrible. The U chips are something like 50%+ faster in single thread, but in Geekbench ST the differences are about 10%. Obviously for certain scenarios Geekbench is only second next to Dhrystone in how bad it is.

naukkis · Aug 26, 2021

IntelUser2000 said:
Now for the Intel chips, that seems to hold true for earlier Skylake parts, like up to Kabylake. The ratio falls a bit after that. Probably because it's overextending 14nm?

It's because for ST they are boosting way higher clocks than those maintainable in MT. Intel slides are all about that situation. The bigger they make performance cores the less MT scalability they got from given power envelope. With Golden Cove they need those efficient cores to be able to be competitive in MT workloads.

Gideon · Aug 26, 2021

I was thinking what Intel could do about their Alder Lake AVX-512 situation. They need to support it in the future but there is a nasty tradeoff:

On the one hand they absolutely need to support all the big-core instructions on the small cores for hybrid to work well
On the other hand they really don't want t bloat up every single small core with full AVX-512 units, AI instructions, etc. Especially as Raptor lake will have 16 of those.

The solution is quite obvious: CMT (just as ARM is doing)

If a workload maxes out AVX-512 units on every core, it best be run on the big cores anyway. If those are full, then having less full-fat FPUs on small cores would help with thermals and hotspots.

CMT is sort-of a win-win:

Some tasks using AVX-512 erratically or lightly (speeding up memory operations, etc) do not need to be moved to big cores and do fine with a shared FPU.
FP heavy tasks will still do decently on little cores (as there are 16 small cores and 8 FPUs). Overall there would be a similar uplift as 8C Comet Lake -> 8C Rocket Lake.
Power and heat are much more manageable due to having less FPUs. These can also be placed a bit more sparsely than they otherwise could.

I'm not sure if Intel will pull it off by Raptor Lake, but I'm quite confident that's the way they're going. And they absolutely need AVX-512 if for no other reason then massively wasted die-space on big cores.

Timorous · Aug 26, 2021

Is this a legit 12700 score?

8c 16t? Does that mean 8 p-cores and 0 e-cores?

11700 comparison

ST scores compared - 11700 score used

	12700	11700
Crypto	1804	4358
Int	1484	1378
FP	1802	1638

Both are saying 4.8Ghz max frequency so this looks to be a + 7.7% INT and + 10% FP improvement over Rocket Lake if the result is legit.

JoeRambo · Aug 26, 2021

Best way to solve AVX-512 is to drop 512bit support and introduce AVX-256 by plundering AVX512 instruction set of useful instructions and keeping 256bit vector width.

They have made halfhearted effort in Gracemont already, by taking marketing department favourites like AVX512-VNNI and turning into AVX256-VNNI so they can make a slide how they beat "competition" in benchmarks that are useless for 99.99% people.

lobz · Aug 26, 2021

Timorous said:
That does not seem great when the 11900K can get
Crypto: ~6,000
Int: ~ 1,600
FP: ~1,900

yeah well, rocket lake is the undisputed x86 king of geekbench, before its release, going by geekbench scores people were out of their minds around here.

eek2121 · Aug 26, 2021

FYU regarding the memory, The 12900k result is with DDR5, but with standard JEDEC timings (CL40).

This chip is going to be a beast. Intel just has to not mess things up by trying to up the price. They need to sell the chip for around the same price as the 11900k. If they do that, they will have a winner on their hands.

JoeRambo · Aug 26, 2021

eek2121 said:
FYU regarding the memory, The 12900k result is with DDR5, but with standard JEDEC timings (CL40).

That is bad news for Intel actually. Cause web sites that produce results that are irrelevant for enthusiasts ( like Anandtech ) use official speeds and JEDEC timings. And those results are then used forever by forum members.

eek2121 said:
They need to sell the chip for around the same price as the 11900k.

Not sure if You are seriuos or sarcastic here, Intel will sell chip that is faster than 5950x for 11900K prices? LOL.

Discussion Intel current and future Lakes & Rapids thread

Lifer

Diamond Member

Diamond Member

Lifer

Diamond Member

Diamond Member

Lifer

Diamond Member

Diamond Member

Diamond Member

Golden Member

Platinum Member

Golden Member

Golden Member

Golden Member

Platinum Member

Golden Member

Elite Member

Golden Member

Platinum Member

Golden Member

Golden Member

Platinum Member

Diamond Member

Golden Member