Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads

Tigerick · Aug 22, 2022

Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D; a first from Intel. Expected to launch in Q1 2026.

	Intel Raptor Lake U	Intel Wildcat Lake 15W?	Intel Lunar Lake	Intel Panther Lake 4+4+4
Launch Date	Q1-2024	Q2-2026	Q3-2024	Q1-2026
Model	Intel 150U	Intel Core 7	Core Ultra 7 268V	Core Ultra 7 365
Dies	2	2	2	3
Node	Intel 7 + ?	Intel 18-A + TSMC N6	TSMC N3B + N6	Intel 18-A + Intel 3 + TSMC N6

CPU	2 P-core + 8 E-cores	2 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores
Threads	12	6	8	8
Max Clock	5.4 GHz	?	5 GHz	4.8 GHz
L3 Cache	12 MB		12 MB	12 MB
TDP	15 - 55 W	15 W ?	17 - 37 W	25 - 55 W

Memory	128-bit LPDDR5-5200	64-bit LPDDR5	128-bit LPDDR5x-8533	128-bit LPDDR5x-7467
Size	96 GB		32 GB	128 GB
Bandwidth			136 GB/s

GPU	Intel Graphics	Intel Graphics	Arc 140V	Intel Graphics
RT	No	No	YES	YES
EU / Xe	96 EU	2 Xe	8 Xe	4 Xe
Max Clock	1.3 GHz	?	2 GHz	2.5 GHz

NPU	GNA 3.0	18 TOPS	48 TOPS	49 TOPS
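
The one bandwidth figure in the table (136 GB/s for Lunar Lake) follows from the bus width and transfer rate in the Memory row. A minimal sketch of that arithmetic, assuming theoretical peak bandwidth and decimal gigabytes; the function name is made up for illustration, and Wildcat Lake is left out because the table gives no LPDDR5 speed for it:

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer = MB/s; divide by 1000 for GB/s
    return bytes_per_transfer * transfer_rate_mts / 1000


# (bus width in bits, transfer rate in MT/s), taken from the Memory row above
configs = {
    "Raptor Lake U (128-bit LPDDR5-5200)": (128, 5200),
    "Lunar Lake (128-bit LPDDR5x-8533)": (128, 8533),
    "Panther Lake 4+4+4 (128-bit LPDDR5x-7467)": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
# Raptor Lake U (128-bit LPDDR5-5200): 83.2 GB/s
# Lunar Lake (128-bit LPDDR5x-8533): 136.5 GB/s
# Panther Lake 4+4+4 (128-bit LPDDR5x-7467): 119.5 GB/s
```

These are peak numbers only; sustained bandwidth in practice is lower.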

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.

DavidC1 · Oct 2, 2024

H433x0n said:
To kill off a successful product segment for BOM optimization is a mistake. If LNL is a sales success, they could charge more to offset the higher costs.

Something more important than margins and revenue is the ability to create new markets and shut off competition.

It's the same with the arguments regarding iGPUs. Sure, it increases die costs a bit, but without it you are completely locked out of the majority of markets. That is above and beyond the impact any die size savings would give you.

DavidC1 · Oct 2, 2024

Based on the size of the NPU, they could have put 24MB MLC and further lowered power and improved performance. It's way way too large.

cannedlake240 said:
Nah, this is a misconception likely originating from MLID, about Lioncove being a part of royal core/an entirely new project and uarch team led by Jim Keller. That's simply not true. Lioncove still has a lot in common with Redwood Cove and all prior Intel core uarchs, as shown by C&C's and David Huang's articles. The core still behaves largely the same in a lot of key metrics

Yup.

Layout changed substantially on Prescott too; it just did not perform well. Lion Cove is the size of Zen 5 despite a substantially denser process, without a top clock speed advantage (both at 5.7 GHz), and without SMT, the removal of which according to Intel should give noticeable ST and die area advantages.

The biggest red flag to me is the branch prediction regression. Branch prediction is THE key to improving performance, and it's now worse than that of the distant successor to the "Atom" core, which had very very humble beginnings. The branch prediction regression tells me the team is really really struggling.

adroc_thurston · Oct 2, 2024

H433x0n said:
the costs are worth it

Nope.

H433x0n said:
and required to compete at the highest level.

Premium tablet chips are very very very niche.

H433x0n said:
I think it’d be in best interest of Intel to create a new product segment based off of LNL

It's literally dead already.
PTL and NVL are both focused on BOM optimization.

H433x0n said:
To kill off a successful product segment

It's not. Premium Windows tablets have been dead for like 10 years.

DavidC1 said:
Something more important than margins and revenue is the ability to create new markets and shut off competition.

It's not a new market, Core Y is like 12 years old.

DavidC1 said:
It's same with the arguments regarding iGPUs

Fat iGP is niche. Which is why the ULV PTL has a tiny one.

SiliconFly · Oct 2, 2024

FlameTail · Oct 2, 2024

Integer performance of Lion Cove seems really strong.

After watching Geekerwan's review, I am overall very impressed by Lunar Lake. It even beats the X Elite!

SPEC2017 INT

Battery Life test

FlameTail · Oct 2, 2024

Comparison of CPU core areas

This was posted on reddit.
Private caches have been included in the core area;
Intel/AMD = L0, L1, L2.
Apple/Qualcomm = L1.

It seems Lion Cove is still rather bloated, compared to the competition. On the other hand, Skymont area efficiency is impressive.

The Hardcard · Oct 2, 2024

FlameTail said:
Comparison of CPU core areas
This was posted on reddit.
Private caches have been included in the core area;
Intel/AMD = L0, L1, L2.
Apple/Qualcomm = L1.

It seems Lion Cove is still rather bloated, compared to the competition. On the other hand, Skymont area efficiency is impressive.

Is it bloated? The L2 cache is what it is and it’s 2.5 MB. It would be interesting if logic area could be compared.

poke01 · Oct 2, 2024

adroc_thurston said:
It's not a new market, Core Y is like 12 years old.

Remembers 12” MacBook ugh that core M but still.

FlameTail said:
Comparison of CPU core areas
This was posted on reddit.
Private caches have been included in the core area;
Intel/AMD = L0, L1, L2.
Apple/Qualcomm = L1.

It seems Lion Cove is still rather bloated, compared to the competition. On the other hand, Skymont area efficiency is impressive.

I think Intel got lucky with N3B; if Lunar was on N3E, Lion Cove would have been even bigger.

Hulk · Oct 2, 2024

Re Geekbench 6 testing: I set my P cores to 2.7 GHz to simulate 8 more E cores. No hyperthreading, of course. The result shows GB6 scaling basically going to 0 by the time you get to 24 cores. By that I mean the 24th core should theoretically increase the overall score by 4.3%, but in reality it only adds a 0.6% increase in performance, or 13.2% of that theoretical amount.

Those first 8 or so cores are really important for a high GB6 score.
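
The arithmetic behind those percentages can be sketched as follows, assuming "theoretical" means perfectly linear scaling, so that the n-th core adds 1/(n-1) on top of the score from the first n-1 cores. The function names are made up for illustration. With the rounded 0.6% figure the ratio comes out near 13.8%; the 13.2% quoted above presumably comes from the unrounded scores.

```python
def ideal_marginal_gain(n_cores: int) -> float:
    """Relative score gain from adding the n-th core under perfect linear scaling."""
    if n_cores < 2:
        raise ValueError("need at least 2 cores to talk about a marginal gain")
    return 1.0 / (n_cores - 1)


def scaling_efficiency(measured_gain: float, n_cores: int) -> float:
    """Fraction of the ideal marginal gain that was actually observed."""
    return measured_gain / ideal_marginal_gain(n_cores)


ideal = ideal_marginal_gain(24)
print(f"ideal gain from the 24th core: {ideal:.1%}")  # 4.3%

# 0.6% is the rounded measured gain quoted in the post
print(f"share of ideal gain achieved: {scaling_efficiency(0.006, 24):.1%}")  # 13.8%
```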

adroc_thurston · Oct 2, 2024

poke01 said:
Remembers 12” MacBook ugh that core M but still.

Earlier. This started with Ivy-Y which was a special set of parts at I think 12W.

FlameTail · Oct 2, 2024

poke01 said:
if Lunar was on N3E lion cove would have been even bigger.

Not by much, I think.

511 · Oct 2, 2024

I think they still have AVX-512 related logic and stuff in the core, it's just fricking disabled. Intel's presentation slides that

AMDK11 · Oct 2, 2024

cannedlake240 said:
Nah, this is a misconception likely originating from MLID, about Lioncove being a part of royal core/an entirely new project and uarch team led by Jim Keller. That's simply not true. Lioncove still has a lot in common with Redwood Cove and all prior Intel core uarchs, as shown by C&C's and David Huang's articles. The core still behaves largely the same in a lot of key metrics

Of course it is. This is not due to any Keller or Royal Core.

LionCove is a transition from a unified scheduler with 3xFP/ALU and 2xALU to separate 4xFP + 6xALU. This is a radical change, not seen since the days of the Pentium Pro (P6).

AMDK11 · Oct 2, 2024

DavidC1 said:
Based on the size of the NPU, they could have put 24MB MLC and further lowered power and improved performance. It's way way too large.

Yup.

Layout changed substantially on Prescott too; it just did not perform well. Lion Cove is the size of Zen 5 despite a substantially denser process, without a top clock speed advantage (both at 5.7 GHz), and without SMT, the removal of which according to Intel should give noticeable ST and die area advantages.

The biggest red flag to me is the branch prediction regression. Branch prediction is THE key to improving performance, and it's now worse than that of the distant successor to the "Atom" core, which had very very humble beginnings. The branch prediction regression tells me the team is really really struggling.

Even though I agree that Intel is pushing for clock speeds like in the Pentium 4 times, you are transferring it too literally to LionCove. Pentium 4 compared to Pentium III had a regression from 3-Wide decoder to 1-Wide.

LionCove is pushing for higher IPC as opposed to Pentium 4.

FlameTail · Oct 3, 2024

https://twitter.com/x/status/1841528934770344189

Panther Lake to have a similar NPU to Lunar Lake (~50 TOPS) ?

DrMrLordX · Oct 3, 2024

DavidC1 said:
Which means Lunarlake is efficient at the system level that they can juice more to the SoC yet offer same battery life, ending up as a superior product.

That doesn't make any sense. If it pulls more power then it's going to wear out the battery faster unless it somehow is racing to idle quickly enough to justify the additional power draw. That doesn't seem to be the case based on the results of the Geekerwan video.

@Abwx also has a point. Why are the SoC power draw numbers so different? They don't conform to the board power limits, and there's no indicator that the opposition system has parasitic losses outside of the SoC. Also nobody really wants to address the main point he was making, which is that the Lunar Lake SoC requires more power to achieve higher benchmark results. The whole point of the test wasn't to speculate about parasitic losses or platform efficiency, it was to isolate each SoC and determine perf/watt. If Geekerwan intended for both SoCs to use the same amount of power, then he failed in execution, because they simply did not.

Jan Olšan said:
It's quite possible it was carried over from Alder Lake.

The only way anyone would have noticed such a problem on Alder Lake is if they somehow tuned it to pull the same TVB volts (1.6v+) as could be requested on a Raptor Lake system. Honestly I have no idea if Alder Lake can exhibit that behavior.

FlameTail · Oct 3, 2024

DavidC1 said:
And Intel/AMD needs to follow that direction, because I suspect large L1 caches are also contributing to efficiency in their designs because it keeps lot of data from going out into slower, higher power cache levels and memory.

The basis of Apple's design is also having a stellar design team, because being able to hit 4 GHz clocks at such low power with just 9 pipeline stages and a humongous L1 cache with 3-cycle latency is amazing.

Apple's cache hierarchy is also more cost/area effective. If you compare M3 vs Lunar Lake, Intel is spending more capacity (and hence area!) on caches.

Apple M3

4P + 4E

(192 KB pL1i/128 KB pL1d)×4 + (128 KB pL1i/64 KB pL1d)×4

16 MB sL2 + 4 MB sL2

8 MB SLC

Lunar Lake

4P + 4LPE

(48 KB pL0d)×4 + 0

(192 KB pL1d/128 KB pL1i)×4 + (64 KB L1i/32 KB L1d)×4

(2.5 MB pL2)×4 + 4 MB sL2

12 MB sL3 + 0

8 MB SLC
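
To put a number on "more capacity", the figures listed above can simply be summed per chip. This is only a sketch that takes every value exactly as written in this post; note that another post in this thread lists Lion Cove's L1-I as 64 KB rather than 128 KB, which would lower the Lunar Lake total slightly without changing which chip comes out ahead. The dictionary layout and helper name are illustrative.

```python
KB = 1
MB = 1024 * KB

# All sizes in KB, copied from the lists above
apple_m3 = {
    "L1": (192 + 128) * 4 + (128 + 64) * 4,   # P-cores + E-cores
    "L2": 16 * MB + 4 * MB,                   # shared P-cluster + E-cluster L2
    "SLC": 8 * MB,
}

lunar_lake = {
    "L0": 48 * 4,                             # P-cores only
    "L1": (192 + 128) * 4 + (64 + 32) * 4,    # P-cores + LP E-cores
    "L2": int(2.5 * MB) * 4 + 4 * MB,         # private P-core L2 + shared LPE L2
    "L3": 12 * MB,
    "SLC": 8 * MB,
}


def total_mb(levels: dict) -> float:
    """Total cache capacity across all listed levels, in MB."""
    return sum(levels.values()) / MB


print(f"Apple M3:   {total_mb(apple_m3):.1f} MB")    # 30.0 MB
print(f"Lunar Lake: {total_mb(lunar_lake):.1f} MB")  # 35.8 MB
```

Capacity is only a proxy for area here, since different cache levels use different SRAM cells and densities.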

poke01 · Oct 3, 2024

FlameTail said:
Apple's cache hierarchy is also more cost/area effective. If you compare M3 vs Lunar Lake, Intel is spending more capacity (and hence area!) on caches.

Apple M3

4P + 4E

(192 KB pL1i/128 KB pL1d)×4 + (128 KB pL1i/64 KB pL1d)×4

16 MB sL2 + 4 MB sL2

8 MB SLC

Lunar Lake

4P + 4LPE

(48 KB pL0d)×4 + 0

(192 KB pL1d/128 KB pL1i)×4 + (64 KB L1i/32 KB L1d)×4

(2.5 MB pL2)×4 + 4 MB sL2

12 MB sL3 + 0

8 MB SLC

So where does this idea “Apple has more caches than Intel/AMD” come from then?

FlameTail · Oct 3, 2024

poke01 said:
So where does this idea “Apple has more caches than Intel/AMD” come from then?

Good question. Two reasons, I think;

(1) When Apple M1 came out, it did actually have much larger caches than Intel/AMD peers. However since then (M1 -> M4), Apple has hardly increased the sizes at all (L1i/L1d has been the same size, P-core L2 went from 12 MB to 16 MB). Meanwhile in the same time period, Intel/AMD made large increases to their cache capacities, so much so that now Intel has surpassed Apple (as you can see in my previous post).

(2) Apple has more cache for an equivalent level compared to Intel/AMD. Apple's L1 and L2 caches are huge (But they don't have an L3 like Intel/AMD do).

DavidC1 · Oct 3, 2024

DrMrLordX said:
That doesn't make any sense. If it pulls more power then it's going to wear out the battery faster unless it somehow is racing to idle quickly enough to justify the additional power draw.

You are also missing the point, or maybe you aren't paying attention. AMD likely has incorrect sensors, as two outlets point out: the system power is comparable to that of the Intel platform, which has the SoC TDP set 5W or so higher.

You have a handheld, you are gaming. One has a 22W SoC but 30W system power, for 2 hours of gaming with a 60WHr battery. The other has a 15W SoC but still same 30W system power, for the same 2 hours of gaming with a 60WHr battery.

If Geekerwan or other outlets set the TDP to be the same between Intel platforms and AMD platforms, the Intel system power would end up being LESS than that of the AMD system, thus you end up with better battery life.

As a user do you really care about SoC power? Especially when the system power numbers show different than expected?
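
The handheld example above reduces to one division: runtime depends on total system power, not on how that power splits between the SoC and everything else. A minimal sketch using only the numbers from the example (22 W vs 15 W SoC, 30 W system, 60 Wh battery); the class and field names are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Handheld:
    name: str
    soc_power_w: float
    system_power_w: float

    def rest_of_system_w(self) -> float:
        """Power drawn by everything that is not the SoC."""
        return self.system_power_w - self.soc_power_w

    def runtime_hours(self, battery_wh: float) -> float:
        """Battery life at a constant system power draw."""
        return battery_wh / self.system_power_w


BATTERY_WH = 60.0
devices = [
    Handheld("22 W SoC", soc_power_w=22.0, system_power_w=30.0),
    Handheld("15 W SoC", soc_power_w=15.0, system_power_w=30.0),
]

for d in devices:
    print(f"{d.name}: rest of system {d.rest_of_system_w():.0f} W, "
          f"runtime {d.runtime_hours(BATTERY_WH):.1f} h")
# 22 W SoC: rest of system 8 W, runtime 2.0 h
# 15 W SoC: rest of system 15 W, runtime 2.0 h
```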

poke01 said:
So where does this idea “Apple has more caches than Intel/AMD” come from then?

Apple has way more L1 cache than both vendors. It's the fastest cache so it performs better, and it's the closest so you save power because you need to move data less far.

The Hardcard said:
Is it bloated? The L2 cache is what it is and it’s 2.5 MB. It would be interesting if logic area could be compared.

Lion Cove is on the N3B process, which affords at least a 30% density advantage over N4, so Zen 5 on N3B would end up being 3.2 mm², making Lion Cove almost 50% larger.
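
The normalization in that last paragraph can be sketched as below, assuming the whole core shrinks uniformly by the quoted density gain (real designs do not, since SRAM and logic scale differently). The function names are illustrative, and the 4.16 mm² and 4.8 mm² values are only what the 3.2 mm², 30% and "almost 50%" figures imply, not measurements.

```python
def scale_area(area_mm2: float, density_gain: float) -> float:
    """Area after porting to a node that is `density_gain` denser (0.30 = 30%)."""
    return area_mm2 / (1.0 + density_gain)


def implied_original_area(scaled_area_mm2: float, density_gain: float) -> float:
    """Inverse of scale_area: the area before the shrink."""
    return scaled_area_mm2 * (1.0 + density_gain)


DENSITY_GAIN_N3B_OVER_N4 = 0.30   # "at least 30%" per the post
ZEN5_ON_N3B_MM2 = 3.2             # figure quoted in the post

# Zen 5 area on N4 implied by the post's own numbers
zen5_n4 = implied_original_area(ZEN5_ON_N3B_MM2, DENSITY_GAIN_N3B_OVER_N4)
print(f"implied Zen 5 area on N4: {zen5_n4:.2f} mm^2")  # 4.16 mm^2

# "almost 50% larger" puts Lion Cove just under this bound
lion_cove_upper = ZEN5_ON_N3B_MM2 * 1.5
print(f"Lion Cove upper bound: {lion_cove_upper:.1f} mm^2")  # 4.8 mm^2
```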

DavidC1 · Oct 3, 2024

adroc_thurston said:
It's not a new market, Core Y is like 12 years old.

That's because Y chips weren't excelling in any area. The battery life was nowhere near competitive unless the device was thicker than most tablets. The Y chips were super slow in both single thread and iGPU performance, while Lunar Lake offers excellent battery life and top-notch single thread and iGPU performance.

It still is a market they should pursue, even just to keep WoA out. You don't abandon a market because it sucks for one year.

The BOM argument is short-sighted. The only reason WoA had any chance of taking even 1% market share is that Intel refused to make an Intel platform with truly good battery life. This is also important on a psychological level: it shows that x86 CAN rival ARM chips in battery life, which, judging from comments on LNL reviews, surprises people. Of course Intel is likely going to do what you expect and just focus on BOM, because they are really a finance company that happens to hire engineers.

adroc_thurston said:
Fat iGP is niche. Which is why the ULV PTL has a tiny one.

I agree with this. But I'm pretty sure you are smart enough to figure out that wasn't my point.

poke01 · Oct 3, 2024

DavidC1 said:
The only reason WoA had any chance of taking even 1% marketshare is because Intel refused to make a truly good battery life Intel platform.

Is WoA ever a threat to Intel? First MS/Qualcomm need to fix basic productivity software like the Adobe suite, Blender, etc., and then Qualcomm needs to make a low-power design, which the next-gen X Elite isn’t going to be.

I don’t think it will be Qualcomm that Intel should be worried about, although if that 5 GHz core happens and it’s efficient, then it’s different. Intel should be worried about Nvidia; if those rumours of an Nvidia entry into the WoA CPU market are true, then that’s more of a threat to Intel. Nvidia has potent engineers, mindshare and excellent GPU IP. If Nvidia is smart they would release a 4-12 P-core SoC with a 20 SM GPU and up to 128 GB of RAM.

This is exactly what Jetson Thor is meant to be; if Nvidia releases this on the WoA platform then it will be more than 1%.

AMDK11 · Oct 3, 2024

As for the LionCove vs Zen5 area:

LionCove
L1-I 64KB
L0-D 48KB
L1-D 192KB
L2 2.5MB

Zen5
L1-I 32KB
L1-D 48KB
L2 1MB

In addition, LionCove has larger buffers, scheduler, ROB, etc. All this costs transistors. Besides, a single 8-Wide decoder makes the Front-End more complicated.

We'll see what LionCove will show at ArrowLake. I suspect that outside of L2, the LionCove logic in LunarLake has been slimmed down. Confirmation will be available in a few days.
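
For a rough sense of how much of the area gap the caches alone could explain, the per-core private cache sizes listed above can be totalled. This is a sketch using only the figures in this post; capacity is not area, and the variable names are illustrative.

```python
# Per-core private cache sizes in KB, as listed above
lion_cove = {"L1-I": 64, "L0-D": 48, "L1-D": 192, "L2": 2560}  # 2.5 MB L2
zen5 = {"L1-I": 32, "L1-D": 48, "L2": 1024}                    # 1 MB L2

lion_cove_total = sum(lion_cove.values())
zen5_total = sum(zen5.values())

print(f"LionCove private cache per core: {lion_cove_total} KB")  # 2864 KB
print(f"Zen5 private cache per core:     {zen5_total} KB")       # 1104 KB
print(f"ratio: {lion_cove_total / zen5_total:.2f}x")             # 2.59x
```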

SiliconFly · Oct 3, 2024

luro · Oct 3, 2024

adroc_thurston said:
It's a one-off and for a good reason.

And what's the reason?
