Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads

Tigerick · Aug 22, 2022

Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the Intel 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. The tiles are connected through UCIe, not D2D; a first for Intel. Launch is expected in Q1 2026.

	Intel Raptor Lake U	Intel Wildcat Lake 15W?	Intel Lunar Lake	Intel Panther Lake 4+0+4
Launch Date	Q1-2024	Q2-2026	Q3-2024	Q1-2026
Model	Intel 150U	Intel Core 7	Core Ultra 7 268V	Core Ultra 7 365
Dies	2	2	2	3
Node	Intel 7 + ?	Intel 18-A + TSMC N6	TSMC N3B + N6	Intel 18-A + Intel 3 + TSMC N6

CPU	2 P-core + 8 E-cores	2 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores
Threads	12	6	8	8
Max Clock	5.4 GHz	?	5 GHz	4.8 GHz
L3 Cache	12 MB		12 MB	12 MB
TDP	15 - 55 W	15 W ?	17 - 37 W	25 - 55 W

Memory	128-bit LPDDR5-5200	64-bit LPDDR5	128-bit LPDDR5x-8533	128-bit LPDDR5x-7467
Size	96 GB		32 GB	128 GB
Bandwidth			136 GB/s

GPU	Intel Graphics	Intel Graphics	Arc 140V	Intel Graphics
RT	No	No	YES	YES
EU / Xe	96 EU	2 Xe	8 Xe	4 Xe
Max Clock	1.3 GHz	?	2 GHz	2.5 GHz

NPU	GNA 3.0	18 TOPS	48 TOPS	49 TOPS
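
The Bandwidth row only lists a figure for Lunar Lake. As a rough sketch, the theoretical peak for the other entries can be estimated from the Memory row as bus width in bytes times transfer rate. The function name and the dictionary labels below are illustrative, and Wildcat Lake is left out because the table gives no memory speed for it.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal): bytes per transfer x transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000


# (bus width in bits, transfer rate in MT/s), taken from the Memory row of the table
configs = {
    "Raptor Lake U, 128-bit LPDDR5-5200": (128, 5200),
    "Lunar Lake, 128-bit LPDDR5x-8533": (128, 8533),
    "Panther Lake, 128-bit LPDDR5x-7467": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

The Lunar Lake result comes out at about 136.5 GB/s, in line with the 136 GB/s in the table. These are peak numbers only; sustained bandwidth will be lower.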

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.

Kepler_L2 · Jul 5, 2024

Fjodor2001 said:
So 8P+32E+iGPU+Big NPU. Sounds nice. Would be lots of compute perf.

8+32 is dead.

AcrosTinus · Jul 5, 2024

Fjodor2001 said:
So 8P+32E+iGPU+Big NPU. Sounds nice. Would be lots of compute perf.

32E cores, where does this rumor come from again? Judging by the bigger E core size, this might never happen, and it should never happen. What an unbalanced chip.

Fjodor2001 · Jul 5, 2024

Kepler_L2 said:
8+32 is dead.

Where does that info come from? Or are you just guessing?

Fjodor2001 · Jul 5, 2024

AcrosTinus said:
Judging by the bigger E core size, this might never happen, and it should never happen. What an unbalanced chip.

Why unbalanced? 8P cores is sufficient for ST perf. If you need more MT perf, better provide that via more E cores.

AcrosTinus · Jul 5, 2024

Fjodor2001 said:
Why unbalanced? 8P cores is sufficient for ST perf. If you need more MT perf, better provide that via more E cores.

It is just a personal thing, I don't like the P to E ratio. Furthermore, how many stops will the ringbus have for that? Will the cluster size increase?

Any sources for 32E? This is the first time I'm hearing this; hopefully it is not some MLID conjecture.

Kepler_L2 · Jul 5, 2024

Fjodor2001 said:
Where does that info come from? Or are you just guessing?

Intel has already briefed OEMs on ARL refresh. The only change is the SoC tile.

AMDK11 · Jul 5, 2024

[Image: Lion-Cove-Predict, hosted on ImgBB (ibb.co)]

It turns out that the prediction block in LionCove alone constitutes about 80-90% of the core logic of RedwoodCove.

LionCove's core is bigger than it seems.

DavidC1 · Jul 5, 2024

AMDK11 said:
[Image: Lion-Cove-Predict, hosted on ImgBB (ibb.co)]

It turns out that the prediction block in LionCove alone constitutes about 80-90% of the core logic of RedwoodCove.

LionCove's core is bigger than it seems.

Another indicator that the P core design needs to go the way of Netburst. No "Sea of FUBS" or whatever is going to change this. That's merely a symptom, not a cause.

The future is with the Austin E core design/team. They have not only made big changes every generation, but also tried ideas never attempted before. Innovation and smarts are what give designs a chance at beating the square root law.

If anyone at Intel has a chance of taking performance leadership from the ARM vendors, yes, even Apple, it's probably going to be the Austin team.

Exist50 said that it was because the Austin team was constantly under pressure of being cancelled that it had to execute. You try new things when under threat of extinction!
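
For readers unfamiliar with the "square root law" mentioned above: it is usually called Pollack's rule, a rule of thumb that single-thread performance grows roughly with the square root of core complexity (area or transistor count). A minimal sketch of what that implies, with illustrative function names and example ratios that are not taken from any Intel data:

```python
import math


def pollack_speedup(area_ratio: float) -> float:
    """Expected single-thread speedup for a core `area_ratio` times larger, per Pollack's rule."""
    return math.sqrt(area_ratio)


def area_needed(speedup: float) -> float:
    """Core size ratio the rule of thumb would expect for a given speedup."""
    return speedup ** 2


# Hypothetical core size ratios, for illustration only
for ratio in (1.5, 2.0, 3.0):
    print(f"{ratio:.1f}x larger core -> about {pollack_speedup(ratio):.2f}x performance")

# The 14% IPC figure discussed in this thread, viewed through the same rule
print(f"+14% performance -> about {area_needed(1.14):.2f}x core size expected")
```

Under this rule a 14% gain would be expected to cost roughly 1.3x the core size, so a design that spends much more than that for the same gain is falling short of the rule rather than beating it.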

Saylick · Jul 5, 2024

AMDK11 said:
[Image: Lion-Cove-Predict, hosted on ImgBB (ibb.co)]

It turns out that the prediction block in LionCove alone constitutes about 80-90% of the core logic of RedwoodCove.

LionCove's core is bigger than it seems.

I think we're going to need die shots eventually, but if this is true, that's a huge amount of xtors being thrown at the core all for a measly 14% IPC increase.

AMDK11 · Jul 5, 2024

Saylick said:
I think we're going to need die shots eventually, but if this is true, that's a huge amount of xtors being thrown at the core all for a measly 14% IPC increase.

The IPC for ArrowLake is not certain. I suspect that the description on the LionCove slide for LunarLake, saying that the prediction block is up to 8x larger, means that LunarLake has a smaller predictor (4x larger? This is somewhat suggested by the early LunarLake graphics).

I suspect LionCove in ArrowLake is around 800-900 million transistors.

These are just guesses, but the presentation of ArrowLake-S will dispel any doubts.

I believe that the P-Core division will be closed if e-Core passes the large core design.

Intel officially states that future generations of P-Core will have 8 FPU ports and 10 ALU ports and I assume a 12-wide decoder.

Jan Olšan · Jul 5, 2024

DavidC1 said:
Another indicator that the P core design needs to go the way of Netburst. No "Sea of FUBS" or whatever is going to change this. That's merely a symptom, not a cause.

The future is with the Austin E core design/team. They have not only made big changes every generation, but also tried ideas never attempted before. Innovation and smarts are what give designs a chance at beating the square root law.

If anyone at Intel has a chance of taking performance leadership from the ARM vendors, yes, even Apple, it's probably going to be the Austin team.

Exist50 said that it was because the Austin team was constantly under pressure of being cancelled that it had to execute. You try new things when under threat of extinction!

Branch prediction is extremely important, it's where the performance potential starts, you can't succeed without it being beefy.

That said, where is that image from? We'll have to see if it even is correct... Sounds dubious that Lion's cove would be 3x bigger than Redwood Cove (or even bigger).

AMDK11 · Jul 5, 2024

Jan Olšan said:
Branch prediction is extremely important, it's where the performance potential starts, you can't succeed without it being beefy.

That said, where is that image from? We'll have to see if it even is correct... Sounds dubious that Lion's cove would be 3x bigger than Redwood Cove (or even bigger).

Sorry. I made the mistake of not specifying that I made this image myself (a bit hastily). It is based on Intel's claim that LionCove has an up to 8 times larger prediction block.

poke01 · Jul 5, 2024

Jan Olšan said:
Branch prediction is extremely important, it's where the performance potential starts, you can't succeed without it being beefy.

There are other CPUs from AMD, Apple, Qualcomm showing otherwise. This is too much increase for just 14% IPC gain.

DavidC1 · Jul 5, 2024

"IPC"(I hate that term)

LPE - 1.376
MTL E - 1.57
RPL E - 1.57
13500H E - 1.46
RPL P - 1.86
MTL P - 1.75
13500H P - 1.75
M3 P - 2.48

So MTL's E is 7% faster than the E in mobile Raptor Lake (13500H). But the desktop RPL E is equal to MTL's E, meaning the desktop ones are faster. Also, the gap is similar between the P cores. You can also see in 13500H vs RPL-S comparisons that the desktop chip is 6-7% faster.

Maybe the loss isn't due to Tiles, but Mobile vs. Desktop, which I noticed before. We never got a proper Meteorlake desktop. In that case, Lion Cove comparisons have to be done against Raptor Cove, thus it'll be 14%, not 14% - Tile Losses.

38% over MTL E is 1.9 for Skymont on Integer, and 2.86 for FP. Another 30% is M3 performance. Actually if the 7% difference in Mobile vs Desktop applies here, we'd get 2.04 on Int and 2.93 for FP on Arrowlake's Skymont.

30% on top of that is 2.652 and 3.8, or 7% faster than Apple M3.

Who thought just two years ago the Intel E core had a chance to beat any of Apple's P cores?
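
The integer arithmetic in this post can be retraced with a short sketch. One assumption is made here: the 1.9 starting figure appears to be the LP E-core score (1.376) plus 38%, since 1.376 x 1.38 is about 1.90. The 2.04 desktop figure is used as quoted rather than re-derived, and the FP numbers are not reproduced because the post does not give their baseline. Variable and function names are illustrative.

```python
# Integer scores quoted in the post (units as given by the poster)
LPE_SCORE = 1.376   # Meteor Lake LP E-core
M3_P_SCORE = 2.48   # Apple M3 P-core


def apply_uplift(score: float, percent: float) -> float:
    """Scale a score by a percentage uplift."""
    return score * (1 + percent / 100)


# Assumption: the post's 1.9 is the LP E-core score plus 38%
skymont_mobile = apply_uplift(LPE_SCORE, 38)

# The post's 2.04 includes the roughly 7% desktop advantage; used here as quoted
skymont_desktop = 2.04

# A further 30% on top of that, as in the post
projected = apply_uplift(skymont_desktop, 30)
lead_over_m3 = (projected / M3_P_SCORE - 1) * 100

print(f"Skymont, mobile estimate:    {skymont_mobile:.2f}")
print(f"Projected after another 30%: {projected:.3f}")
print(f"Lead over M3 P-core:         {lead_over_m3:.1f}%")
```

This reproduces the 2.652 figure and the roughly 7% lead over the M3 score quoted in the post.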

AMDK11 · Jul 5, 2024

poke01 said:
There are other CPUs from AMD, Apple, Qualcomm showing otherwise. This is too much increase for just 14% IPC gain.

+14% for the LunarLake variant. ArrowLake testing will reveal the full picture of what LionCove and its 8x larger prediction block are all about.

I'm curious about the C&C analysis of the predictions and the rest of the LionCove core.

DavidC1 · Jul 5, 2024

AMDK11 said:
+14% for the LunarLake variant. ArrowLake testing will reveal the full picture of what LionCove and its 8x larger prediction block are all about.

I'm curious about the C&C analysis of the predictions and the rest of the LionCove core.

I see. So the picture is just a guess. But you enlarged the entire branch predictor, which suggests to me that you misunderstand parts of CPU architecture:

6:00 He talks about that block. "We significantly widened our prediction block". This isn't about 8x larger BTBs (never mind enlarging the ENTIRE BPU block by 8 times). It's a lot more of a corner case than you think. So he's probably talking about how much it can handle in parallel, or data access paths.

It's like saying the L2 went from 2-way to 16-way. It sounds big at 8x, but the performance impact will be a lot, a LOT less. I could quote other big numbers:
-Xe2's Draw XI is >13x faster
-Skymont's VNNI performance is 4x Gracemont

How do we know? Because we got just 14%. Just like people thought "Strange, why does Zen 5 only get 16%?" Because in Zen 5, it wasn't a straight-up increase. They REDUCED features. It's so strange how most of the time we're like "Oh, they are overstating their performance". But this time many people are saying "We don't want to believe the numbers are that low".

Another great example is RDNA3. "Flops doubled, why does the performance suck?" Because Flops did not double. AMD did not add enough transistors for double shaders. They added dual issue, which is not the same thing. Die size numbers dashed hopes.

You/We have to stop listening to people that lack deeper understanding of CPU/GPU architectures.

AMDK11 · Jul 5, 2024

That's why we need photos of the ArrowLake-S system.

In another video he gives details. He claims that the prediction scheme has been fundamentally changed, increasing the prediction block by up to 8 times. To this end, the throughput of requests to L2 has been tripled, and the predictor can now fetch 128 bytes per cycle (instead of 64).

He does not claim that the prediction capabilities have increased 8 times.

He literally claims that LionCove got an 8x larger prediction block!

However, I am not saying that this gives a large IPC gain; I wanted to refer to the prediction block in LionCove itself, and I am curious about more details and the C&C analysis.

For convenience, from minute 9:35:

Fjodor2001 · Jul 6, 2024

Kepler_L2 said:
Intel has already briefed OEMs on ARL refresh. The only change is the SoC tile.

So does this mean that you have also received the same info as the OEMs? And that includes all the specs for Arrow Lake Refresh for desktop too, or only Arrow Lake Refresh for laptop?

Because the leak/rumor about 8P+32E was for desktop, not laptop.

Henry swagger · Jul 6, 2024

DavidC1 said:
"IPC"(I hate that term)

LPE - 1.376
MTL E - 1.57
RPL E - 1.57
13500H E - 1.46
RPL P - 1.86
MTL P - 1.75
13500H P - 1.75
M3 P - 2.48

So MTL's E is 7% faster than the E in mobile Raptor Lake (13500H). But the desktop RPL E is equal to MTL's E, meaning the desktop ones are faster. Also, the gap is similar between the P cores. You can also see in 13500H vs RPL-S comparisons that the desktop chip is 6-7% faster.

Maybe the loss isn't due to Tiles, but Mobile vs. Desktop, which I noticed before. We never got a proper Meteorlake desktop. In that case, Lion Cove comparisons have to be done against Raptor Cove, thus it'll be 14%, not 14% - Tile Losses.

38% over MTL E is 1.9 for Skymont on Integer, and 2.86 for FP. Another 30% is M3 performance. Actually if the 7% difference in Mobile vs Desktop applies here, we'd get 2.04 on Int and 2.93 for FP on Arrowlake's Skymont.

30% on top of that is 2.652 and 3.8, or 7% faster than Apple M3.

Who thought just two years ago the Intel E core had a chance to beat any of Apple's P cores?

I think firestorm core set the blueprint for high ipc and low power.. and intel learned from apple because lunar is a direct copy of the m2 😁🥇

deasd · Jul 6, 2024

A leaker called Jaykihn was spotted in the chat replay of MLID's video:

Mirage I tested 4P cores and 4E cores at Lunar Lake's TDP on Meteor Lake and got about 6,600 pts or something like that in r23. 50% higher score is only 9,900 pts...
Jaykihn @Mirage It’s the MT at the power consumption that matters. That said, the ST is not looking promising.

I’ll use ARL-S as an example of lion cove, despite the different cache behavior. When I tested ARL-S 8+16+1 back in June, it was a 3% ST uplift on average. That is not promising whatsoever.

That particular 6+8 20% leak is false. Can confirm.

It’s 3%. That’s not a prediction, that’s testing.

I mean, I’m the source of the LNL benchmarks circulating right now. You probably should, lest you disregard the other info I’d provided.

The marketing figures are (currently) +3% ST, +15% MT. 8+16+1 comparisons, anyway.

Triston Davis At what power? With what configuration?
Jaykihn @Triston Davis 250W PnP, 8+16+1.

Eh, it’s ES2 numbers, I’m sure it’ll change before launch. But that’s the info I have access to right now, and it’d be the same realm as any accurate benchmarks that are showcased nowadays. And NP.

@Triston Davis The +1 means there’s one GFX complex (yes, some chips internally have multiple). @Rudolf van Wijk Lion cove does have HT capabilities, ARL and LNL just don’t have it enabled.

Arrow Lake desktop does not have HT. Can confirm.

Panther lake desktop is canned. And yes, ARL is October.

You can read most of his comments at 1:00:00.

poke01 · Jul 6, 2024

DavidC1 said:
30% on top of that is 2.652 and 3.8, or 7% faster than Apple M3.

Going by this, 2.652 IPC for int would also be for M4 since M4 is 7% faster than M3.

poke01 · Jul 6, 2024

Henry swagger said:
I think firestorm core set the blueprint for high ipc and low power.. and intel learned from apple because lunar is a direct copy of the m2 😁🥇

Yep. It will be interesting to compare an Apple design and an Intel design on the same N3B node and to see their efficiency and performance differences.

FlameTail · Jul 6, 2024

poke01 said:
Yep. It will be interesting to compare an Apple design and an Intel design on the same N3B node and to see their efficiency and performance differences.

Lunar Lake vs M3

Both on N3B.

This kind of comparison can only be done once in a blue moon.

SiliconFly · Jul 6, 2024
