Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads

Tigerick · Aug 22, 2022

Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the Intel 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. The tiles are connected through UCIe, not D2D; a first from Intel. Expected to launch in Q1 2026.

	Intel Raptor Lake U	Intel Wildcat Lake 15W?	Intel Lunar Lake	Intel Panther Lake 4+4+4
Launch Date	Q1-2024	Q2-2026	Q3-2024	Q1-2026
Model	Intel 150U	Intel Core 7	Core Ultra 7 268V	Core Ultra 7 365
Dies	2	2	2	3
Node	Intel 7 + ?	Intel 18-A + TSMC N6	TSMC N3B + N6	Intel 18-A + Intel 3 + TSMC N6

CPU	2 P-core + 8 E-cores	2 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores
Threads	12	6	8	8
Max Clock	5.4 GHz	?	5 GHz	4.8 GHz
L3 Cache	12 MB		12 MB	12 MB
TDP	15 - 55 W	15 W ?	17 - 37 W	25 - 55 W

Memory	128-bit LPDDR5-5200	64-bit LPDDR5	128-bit LPDDR5x-8533	128-bit LPDDR5x-7467
Size	96 GB		32 GB	128 GB
Bandwidth			136 GB/s

GPU	Intel Graphics	Intel Graphics	Arc 140V	Intel Graphics
RT	No	No	YES	YES
EU / Xe	96 EU	2 Xe	8 Xe	4 Xe
Max Clock	1.3 GHz	?	2 GHz	2.5 GHz

NPU	GNA 3.0	18 TOPS	48 TOPS	49 TOPS
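The 136 GB/s bandwidth entry for Lunar Lake follows from bus width times transfer rate. A minimal sketch, assuming theoretical peak bandwidth with no protocol overhead; the function name is illustrative, and Wildcat Lake is left out because the table gives no memory speed for it.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s times bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return bytes_per_transfer * transfer_rate_mt_s / 1000


# Bus widths and speeds as listed in the Memory row of the table.
configs = {
    "Raptor Lake U (128-bit LPDDR5-5200)": (128, 5200),
    "Lunar Lake (128-bit LPDDR5x-8533)": (128, 8533),
    "Panther Lake (128-bit LPDDR5x-7467)": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(width, rate):.1f} GB/s")
```

By the same formula, a 64-bit bus at any given speed would deliver half the bandwidth of a 128-bit one.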

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.

Kepler_L2 · Oct 11, 2024

MoistOintment said:
MLID is claiming that Intel is working on a PTL desktop SKU because ARL is a disappointment...

PTL-S got cancelled like 2 years ago lmao.

511 · Oct 11, 2024

Hulk said:
Intel is saying Skymont is +32% IPC over Gracemont so that is where I based my figure. 32% over Gracemont does not equal Raptor Cove. That would be more like 48%.

32% is integer IPC; for FP it is 72%. As you can see, FP is where they lacked.

Intel-Raptor-Cove-13th-Gen-Raptor-Lake-AMD-Zen-4-Ryzen-7000-CPU-Core-IPC-Performance-_-DDR5-60...png

Hulk · Oct 11, 2024

511 said:
32% is integer IPC; for FP it is 72%. As you can see, FP is where they lacked.

Yes you are correct. I didn't think it over enough before I started typing. I will be more careful moving forward.

DavidC1 · Oct 11, 2024

Hulk said:
I have gone over the Intel provided CB R24 data in some detail and here is what I have found. Please let me know where I've lost the trail.

It's fine if you got it wrong. You are asking questions, and it challenges us too. It's all good.

Hulk said:
Based on my testing of my 14900K in CB R24 MT, Raptor Cove does about 21.2 points/GHz (no HT), 27.8 points/GHz (with HT) and Gracemont about 12.8 in MT.

This is from the Chips and Cheese analysis of Cinebench 2024.

When hitting the execution units, Cinebench 2024 uses scalar and 128-bit packed floating point operations. Wider vector execution units are not useful. Scalar integer performance plays an important role in keeping the FP execution units fed.

So both scalar performance and 128-bit floating point performance are important. Scalar performance improves 32% on Skymont, and on top of that they are doubling the number of FP units, meaning you get results that are better.

What is the difference between E and P in R23? TechPowerUp shows ~50% between E and P for Alder Lake. So yes, Skymont can improve to a point where it's on par with (or even better than) Raptor Cove.
These results are R23 on your system:

CB ST shows Raptor to have 38% better IPC than Gracemont. Of course Raptor loses its HT capability here.

So R24 must increase FP load for the difference to be greater, where Skymont entirely closes the gap.

Why do I also have a feeling the gap between the E and P is greater than it should be for your system? A guy with a 14700K on YouTube is getting 75 for E and 128 for P. That's 70%, but that's at default clocks, and P clocks quite a bit higher.

511 · Oct 11, 2024

MoistOintment said:
MLID is claiming that Intel is working on a PTL desktop SKU because ARL is a disappointment...

Yes, I saw that, but MLID also says Intel Foundry is bad. He will not stop bashing Intel.

511 · Oct 11, 2024

Hulk said:
Yes you are correct. I didn't think it over enough before I started typing. I will be more careful moving forward.

No worries
MLID should learn a thing or two from you 🤣

511 · Oct 11, 2024

Kepler_L2 said:
PTL-S got cancelled like 2 years ago lmao.

MLID being MLID, nothing new. A year from now, close to launch, he will make a new leak saying PTL-S got canned and nothing is going well at Intel. All doom and gloom over again.

DavidC1 · Oct 11, 2024

@Hulk Your Gracemont cores are underperforming. Are you sure you tested CB2024 ST?

Intel Processor N100 CPU - Benchmarks and Specs (www.notebookcheck.net)

3.4GHz Turbo clock N100 is getting 60 points. That's 17.6 points per GHz.
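The points-per-GHz figure is just the score divided by the clock. A small sketch of that arithmetic, using the N100 numbers above and the roughly 12.8 points/GHz Gracemont figure quoted earlier in the thread; the function name is illustrative.

```python
def points_per_ghz(score: float, clock_ghz: float) -> float:
    """Normalize a benchmark score by clock speed."""
    return score / clock_ghz


n100 = points_per_ghz(60, 3.4)
print(f"N100: {n100:.1f} points/GHz")  # 17.6

# Relative gap against the ~12.8 points/GHz quoted for Gracemont in MT.
gap = n100 / 12.8 - 1
print(f"gap vs 12.8 points/GHz: {gap:.0%}")
```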

511 · Oct 11, 2024

DavidC1 said:
GNR is underperforming in Tomshardware review too. Behind Emerald Rapids in most of their tests. Strange they don't mention this.

Well, they won't, but I think I know why it's happening. This is just my guess: frequency is affecting the performance. Zen 5 may have higher clocks at low core load than Granite Rapids, and the workloads are not embarrassingly parallel, so GNR doesn't get to show its strengths. There may also be a few bugs with software like NAMD.

DavidC1 · Oct 12, 2024

511 said:
Well, they won't, but I think I know why it's happening. This is just my guess: frequency is affecting the performance. Zen 5 may have higher clocks at low core load than Granite Rapids, and the workloads are not embarrassingly parallel, so GNR doesn't get to show its strengths. There may also be a few bugs with software like NAMD.

Clock doesn't explain a lot of the losses.

They screwed up or the platform is rushed out. Based on Sierra Forest's scaling problems it may not be fixed until the Clearwater Forest generation.

Hulk · Oct 12, 2024

DavidC1 said:
@Hulk Your Gracemont cores are underperforming. Are you sure you tested CB2024 ST?

Intel Processor N100 CPU - Benchmarks and Specs (www.notebookcheck.net)

3.4GHz Turbo clock N100 is getting 60 points. That's 17.6 points per GHz.

The E's perform differently when working with the P's. The P's can be fully isolated by turning off the E's. Then the E's can be "determined" by making the final score match the actual score. When you do that you will end up with about 12.8. Run them with 1 P at 800MHz and you will get around the number you quoted.

But if you use that number along with the number found for the P's with the E's off you will get an outrageously high total score. As the E's join in with the P's in greater number their IPC decreases in both versions of Cinebench. I've noticed this for years. I'm thinking it might have something to do with the shared L3 cache or something? That's what I mean when I write benching the E's is hard because they are slippery and refuse to be nailed down.
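The "make the final score match" step can be sketched as solving a simple additive model for the E-core rate. This is only an illustration under stated assumptions: the model treats every core as contributing points/GHz times clock, and the clocks and total score below are placeholder values, not measurements from this thread. Function names are hypothetical.

```python
def predicted_mt_score(n_p, p_rate, p_clock_ghz, n_e, e_rate, e_clock_ghz):
    """Additive model: each core contributes (points per GHz) * (clock in GHz)."""
    return n_p * p_rate * p_clock_ghz + n_e * e_rate * e_clock_ghz


def implied_e_rate(total_score, n_p, p_rate, p_clock_ghz, n_e, e_clock_ghz):
    """Solve the additive model for the E-core points per GHz."""
    p_contribution = n_p * p_rate * p_clock_ghz
    return (total_score - p_contribution) / (n_e * e_clock_ghz)


# Placeholder inputs: the clocks and the total score are assumed, not measured.
N_P, N_E = 8, 16              # 14900K core counts
P_RATE = 27.8                 # points/GHz with HT, as quoted earlier in the thread
P_CLOCK, E_CLOCK = 5.5, 4.3   # assumed all-core clocks in GHz
TOTAL = 2104                  # assumed full-chip MT score

e_rate = implied_e_rate(TOTAL, N_P, P_RATE, P_CLOCK, N_E, E_CLOCK)
print(f"implied E-core rate: {e_rate:.1f} points/GHz")

# Plugging an isolated E-core figure (17.6 points/GHz) into the same model overshoots.
overshoot = predicted_mt_score(N_P, P_RATE, P_CLOCK, N_E, 17.6, E_CLOCK)
print(f"predicted total with 17.6 points/GHz: {overshoot:.0f} vs assumed actual {TOTAL}")
```

With these placeholder inputs the implied rate lands near 12.8, while the isolated figure predicts a total well above the assumed score, which is the mismatch described above.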

511 · Oct 12, 2024

DavidC1 said:
Clock doesn't explain a lot of the losses.

They screwed up or the platform is rushed out. Based on Sierra Forest's scaling problems it may not be fixed until the Clearwater Forest generation.

I think it will be fixed by Q1 25

PJVol · Oct 12, 2024

Det0x said:
9950X stock PPT limit is 200w

Isn't it 230W? IIRC, AMD typically sets the PPT limit at 1.35 * TDP

gdansk · Oct 12, 2024

PJVol said:
Isn't it 230W? IIRC, AMD typically sets the PPT limit at 1.35 * TDP

That was typical but this time they chose 200W. 🤷‍♂️
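For reference, the arithmetic behind the 230W expectation, as a quick sketch. It assumes the 9950X's 170W TDP rating, which the posts do not state.

```python
TDP_W = 170            # assumed: Ryzen 9 9950X rated TDP, not stated in the posts
TYPICAL_RATIO = 1.35   # PPT/TDP ratio quoted above
STOCK_PPT_W = 200      # stock PPT quoted above

typical_ppt_w = TYPICAL_RATIO * TDP_W
actual_ratio = STOCK_PPT_W / TDP_W

print(f"PPT at the typical ratio: {typical_ppt_w:.1f} W")
print(f"ratio implied by a 200 W PPT: {actual_ratio:.2f}")
```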

Thunder 57 · Oct 12, 2024

H433x0n said:
You’re predicting Zen 5 to be 15-20% more efficient than ARL? Guess we’ll find out on the 24th.

What if ARL is more efficient? Does that mean we can finally move on from everybody pretending their desktop PC is a server rack with strict TCO requirements?

I don't think most people care about power cost, which in many areas is negligible. Rather, I think many, like myself, are more concerned about extra heat being pumped into the room.

TESKATLIPOKA · Oct 12, 2024

Abwx said:
.........................

The answer is in January or so with Strix Halo's 16C/32T.

That will be Fire Range + dGPU.
Strix Halo won't be paired with a dGPU, so its GPU performance will be limited to 4070 level.

PJVol · Oct 12, 2024

gdansk said:
That was typical but this time they chose 200W. 🤷‍♂️

Yes, indeed. Better thermals this time, lower ratio.

DrMrLordX · Oct 12, 2024

H433x0n said:
Why?

CPU A gets N performance at 200W. CPU B gets N * 1.01 performance at 300W.

Lower the power target to 125W for both CPUs. Which do you expect to be faster @125W? A or B? The only way the answer is B is if the frequency scaling past 125W is insanely bad.
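One way to put a number on "insanely bad" scaling is a toy model. The sketch below assumes performance follows a power law in package power, perf ∝ power^alpha, with the same exponent for both CPUs; that model and the function name are assumptions for illustration, not anything measured.

```python
import math


def perf_at(power_w, ref_perf, ref_power_w, alpha):
    """Scale a reference performance point to another power level (toy power law)."""
    return ref_perf * (power_w / ref_power_w) ** alpha


# B beats A at any lower power target only if 1.01 > (300 / 200) ** alpha.
break_even_alpha = math.log(1.01) / math.log(300 / 200)
print(f"B wins only if alpha < {break_even_alpha:.3f}")

# Example with an assumed, fairly modest exponent of 0.3.
a_125 = perf_at(125, 1.00, 200, 0.3)
b_125 = perf_at(125, 1.01, 300, 0.3)
print(f"A @125W: {a_125:.3f}  B @125W: {b_125:.3f}")
```

Under this model, B only comes out ahead at 125W if performance is almost completely flat with respect to power.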

Thunder 57 said:
I don't think most people care about power cost, which in many areas is negligible. Rather, I think many, like myself, are more concerned about extra heat being pumped into the room.

It's not just the extra heat being pumped into the room. It's removing it from the CPU in the first place. Most CPUs at that power level require custom water. Even an AiO would struggle to keep up.

naukkis · Oct 12, 2024

Hulk said:
But if you use that number along with the number found for the P's with the E's off you will get an outrageously high total score. As the E's join in with the P's in greater number their IPC decreases in both versions of Cinebench. I've noticed this for years. I'm thinking it might have something to do with the shared L3 cache or something? That's what I mean when I write benching the E's is hard because they are slippery and refuse to be nailed down.

E-cores have a shared L2, which leaves them bandwidth-starved when many E-cores need bandwidth. Skymont doubles that L2 bandwidth vs Gracemont. And you are right that a 4-core cluster has only the same amount of L3 bandwidth as a single P-core, so L3-dependent workloads may become L3-bandwidth-starved.

DavidC1 · Oct 12, 2024

Hulk said:
But if you use that number along with the number found for the P's with the E's off you will get an outrageously high total score. As the E's join in with the P's in greater number their IPC decreases in both versions of Cinebench. I've noticed this for years. I'm thinking it might have something to do with the shared L3 cache or something? That's what I mean when I write benching the E's is hard because they are slippery and refuse to be nailed down.

Yeah, there's probably an impact due to that. They said the 32%/72% numbers in ST become 32%/55% in MT.

naukkis said:
E-cores have a shared L2, which leaves them bandwidth-starved when many E-cores need bandwidth. Skymont doubles that L2 bandwidth vs Gracemont. And you are right that a 4-core cluster has only the same amount of L3 bandwidth as a single P-core, so L3-dependent workloads may become L3-bandwidth-starved.

There's also being able to do L1-L1 transfers, which will improve performance in MT scenarios and scaling.

We don't know for sure how it'll perform in a wide range of workloads until we see Arrow Lake in people's hands.

Numbers for Lunar Lake and Arrow Lake show that lacking a high-performance L3 cache is enough to starve Skymont to a point where it barely outperforms its predecessor. That also means Skymont is high-performing enough that such things become a serious bottleneck. A 4.6GHz Raptor Cove class core would indeed be starved.

DavidC1 · Oct 12, 2024

naukkis said:
E-cores have a shared L2, which leaves them bandwidth-starved when many E-cores need bandwidth. Skymont doubles that L2 bandwidth vs Gracemont. And you are right that a 4-core cluster has only the same amount of L3 bandwidth as a single P-core, so L3-dependent workloads may become L3-bandwidth-starved.

@Hulk The above theory would be easy to test. Change the ring frequency and see if performance in MT workload changes.

I would also change memory speeds, since it would show if it's memory bandwidth starved. I don't know how the CB2024 behaves. I know earlier versions basically didn't care about memory speeds after a certain point. It behaved very much like SpecInt_1T. Since it had a nice mix of Integer and FP, it was an easy test for what people like to erroneously call "IPC".

4 core cluster means the ring also has to transfer data from the memory controller and that's being shared.

On a different topic, I would have said forget the P cores and go with all Skymont cores for Wildcat Lake. And if it had some of the optimizations that Lunar Lake had, such as the SLC, it would have been a great low-cost laptop chip and would completely kill WoA.

igor_kavinski · Oct 12, 2024

Hulk said:
That's what I mean when I write benching the E's is hard because they are slippery and refuse to be nailed down.

Get a Core i3-N305 laptop. All E's.

EDIT: https://up-shop.org/default/up-squared-pro-7000-edge-series.html

AMDK11 · Oct 12, 2024

Hulk said:
Since the P cores are similar architectures we'll start with the +9% for Lion Cove over Raptor Cove, using the non HT Raptor 21.2 points/GHz and increase it to 23.1.

LionCove is not even roughly the same microarchitecture as RaptorCove despite a modest average 9% IPC increase. LionCove loses a lot in the overall construction of ArrowLake.

The execution engine in LionCove has been thoroughly rebuilt and now closely resembles Zen5 and Skymont.

I hope to see a detailed analysis of LionCove with ArrowLake and an IPC test with HTT disabled in Raptor.

TESKATLIPOKA · Oct 12, 2024

I don't know, Arrow Lake is not that exciting to me, but to be fair I moved from desktop to laptop camp a long time ago.
LNL is exciting although not particularly powerful.
To me Panther Lake-P looks the most exciting next thing, although not that keen on 3 different core clusters, but whatever.
That 12 Xe3 IGP looks particularly appealing, although BW will be a problem unless they put some SLC inside.

@igor_kavinski: LPDDR6 should start at 10.667 Gbps, and that's only 25% more than what LNL has. And there is still the question of whether PTL will use it. I vote for getting rid of the NPU and putting in an SLC instead; it would help BW and save on power.

edit: I take it back. LPDDR6-10667 has 28.5 GB/s effective BW, while 8500 only has 17 GB/s, so a 67.5% increase.
That would be enough, but at least a 16MB SLC still wouldn't hurt.
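A rough sketch of where figures like these could come from. The channel widths and payload fraction below are assumptions chosen because they reproduce the quoted numbers: a 16-bit LPDDR5X channel, and a 24-bit LPDDR6 channel carrying 256 data bits per 288-bit burst. Treat them as illustrative rather than confirmed.

```python
def channel_bandwidth_gb_s(rate_gbps_per_pin, channel_width_bits, payload_fraction=1.0):
    """Per-channel bandwidth in GB/s: pin rate * width / 8, scaled by usable payload."""
    return rate_gbps_per_pin * channel_width_bits / 8 * payload_fraction


# Assumed channel parameters (see the note above); not taken from the post.
lpddr5x = channel_bandwidth_gb_s(8.533, 16)
lpddr6 = channel_bandwidth_gb_s(10.667, 24, payload_fraction=256 / 288)

pin_rate_gain = 10.667 / 8.533 - 1
effective_gain = lpddr6 / lpddr5x - 1

print(f"LPDDR5X-8533 per channel: {lpddr5x:.1f} GB/s")
print(f"LPDDR6-10667 per channel: {lpddr6:.1f} GB/s")
print(f"pin-rate gain: {pin_rate_gain:.0%}, effective gain: {effective_gain:.0%}")
```

With unrounded inputs the effective gain comes out near 67%, close to the figure in the edit above, while the pin-rate gain alone is the 25% from the first estimate.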

igor_kavinski · Oct 12, 2024

TESKATLIPOKA said:
That 12 Xe3 IGP looks particularly appealing, although BW will be a problem unless they put some SLC inside.

Probably gonna have LPDDR5-9600 or something even faster.
