Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
941
857
106
Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D, a first for Intel. Launch is expected in Q1 2026.

| | Intel Raptor Lake U | Intel Wildcat Lake 15W? | Intel Lunar Lake | Intel Panther Lake 4+0+4 |
|---|---|---|---|---|
| Launch Date | Q1-2024 | Q2-2026 | Q3-2024 | Q1-2026 |
| Model | Intel 150U | Intel Core 7 | Core Ultra 7 268V | Core Ultra 7 365 |
| Dies | 2 | 2 | 2 | 3 |
| Node | Intel 7 + ? | Intel 18-A + TSMC N6 | TSMC N3B + N6 | Intel 18-A + Intel 3 + TSMC N6 |
| CPU | 2 P-core + 8 E-cores | 2 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores |
| Threads | 12 | 6 | 8 | 8 |
| Max Clock | 5.4 GHz | ? | 5 GHz | 4.8 GHz |
| L3 Cache | 12 MB | | 12 MB | 12 MB |
| TDP | 15 - 55 W | 15 W ? | 17 - 37 W | 25 - 55 W |
| Memory | 128-bit LPDDR5-5200 | 64-bit LPDDR5 | 128-bit LPDDR5x-8533 | 128-bit LPDDR5x-7467 |
| Size | 96 GB | | 32 GB | 128 GB |
| Bandwidth | | | 136 GB/s | |
| GPU | Intel Graphics | Intel Graphics | Arc 140V | Intel Graphics |
| RT | No | No | YES | YES |
| EU / Xe | 96 EU | 2 Xe | 8 Xe | 4 Xe |
| Max Clock | 1.3 GHz | ? | 2 GHz | 2.5 GHz |
| NPU | GNA 3.0 | 18 TOPS | 48 TOPS | 49 TOPS |
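
The 136 GB/s bandwidth figure in the specs lines up with a 128-bit LPDDR5x-8533 configuration. A minimal sketch of that arithmetic, assuming theoretical peak bandwidth is simply bus width in bytes times transfer rate (the helper name `peak_bandwidth_gbs` is made up for illustration, and real-world throughput will be lower):

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s, ignoring efficiency losses."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000  # MB/s -> GB/s


# Memory configurations as listed in the specs.
# Wildcat Lake is left out because no memory speed is given for it.
configs = {
    "Raptor Lake U (128-bit LPDDR5-5200)": (128, 5200),
    "Lunar Lake (128-bit LPDDR5x-8533)": (128, 8533),
    "Panther Lake (128-bit LPDDR5x-7467)": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

The same helper can be reused for the 64-bit WCL configuration once a memory speed is known.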






PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, going by Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



LNL-MX.png
 

Attachments

  • PantherLake.png
  • LNL.png
  • INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg
  • Clockspeed.png

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
For the MT comparison it's 4x Skymont in LNL vs 2x Crestmont LP in MTL.
Why do you insist that the LPE cores in LNL are equivalent to the LP Island cores in MTL?

There is absolutely no indication that they are functionally the same.
 

poke01

Diamond Member
Mar 8, 2022
4,818
6,143
106
Skymont is what makes LNL and ARL interesting. I really want to see a deep dive of this core.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Yeah that's not an L3 cache.
So what? For low priority tasks you don't need a low-latency caching subsystem. So the SLC can be away from the cores, and maybe even clock differently to save power. But it is a "Last Level" cache nonetheless.

It is increasingly becoming clear that MT perf of Skymont is being compared with Crestmont at the same core counts.
 

Kepler_L2

Golden Member
Sep 6, 2020
1,081
4,668
136
So what? For low priority tasks you don't need a low-latency caching subsystem. So the SLC can be away from the cores, and maybe even clock differently to save power. But it is a "Last Level" cache nonetheless.

It is increasingly becoming clear that MT perf of Skymont is being compared with Crestmont at the same core counts.
How is the MT perf increase so much higher than ST perf increase? A puny 8MB System Level Cache is not going to result in 70% higher performance.
 

Abwx

Lifer
Apr 2, 2011
12,004
4,968
136
So what? For low priority tasks you don't need a low-latency caching subsystem. So the SLC can be away from the cores, and maybe even clock differently to save power. But it is a "Last Level" cache nonetheless.

It is increasingly becoming clear that MT perf of Skymont is being compared with Crestmont at the same core counts.

It's just that you can't have at the same time 1.7x the ST perf at the same power + the same ST perf at 0.3x the power, all while having 2.9x the MT perf at the same power + the same MT perf at 0.3x the power with the same core count.

At the same core count, if you have 2.9x the MT perf at the same power then it should be the same MT perf at about 0.12x the power, since power scales as the square of performance.

If you take 1.7x ST perf at the same power with one core, then with 4 cores you'll get 6.8x the perf at 4x the power in MT, and hence 2.9x the MT perf at the same power as 2C. This is assuming a square law, and that's in line with the single core being at 1.7x the perf at isowatt, so Kepler is right on this one.
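
A rough sketch of the square-law arithmetic in this post, assuming power scales exactly as the square of performance and that MT performance scales perfectly with core count. Both function names are hypothetical, and the model is a simplification rather than anything Intel has published:

```python
import math


def iso_perf_power(perf_gain_at_iso_power: float) -> float:
    """Relative power needed to match the old performance, assuming power ~ perf**2."""
    return 1.0 / perf_gain_at_iso_power ** 2


def mt_gain_at_iso_power(st_gain: float, new_cores: int, old_cores: int) -> float:
    """MT gain when the same total power is split across a different core count.

    Assumes perfect multi-core scaling and per-core perf ~ sqrt(per-core power).
    """
    per_core_power = old_cores / new_cores            # each new core gets a smaller share
    per_core_perf = st_gain * math.sqrt(per_core_power)
    return per_core_perf * new_cores / old_cores


print(f"same MT perf at {iso_perf_power(2.9):.2f}x power")   # 2.9x MT claim, same core count
print(f"same ST perf at {iso_perf_power(1.7):.2f}x power")   # 1.7x ST claim
print(f"4C vs 2C at iso power: {mt_gain_at_iso_power(1.7, 4, 2):.2f}x")
```

Under these assumptions the 1.7x ST figure corresponds to roughly a third of the power at the same performance, and going from 2 to 4 cores explains why the MT gain exceeds the ST gain. The 4C vs 2C result comes out somewhat below the quoted 2.9x, so the square law is best read as a rough guide.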
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
How is the MT perf increase so much higher than ST perf increase? A puny 8MB System Level Cache is not going to result in 70% higher performance.
Lower MT frequency penalty vs Crestmont due to differences between N3B and Intel 4 is one possibility.

The other (and more likely) possibility is that this isn't a comparison with Lunar Lake as the underlying SoC, but with Arrow Lake.

In which case it will be an apples to apples comparison with both having access to LLC.
 

Mahboi

Golden Member
Apr 4, 2024
1,058
1,969
96
At some point, diminishing returns will hit the E core like they do for every design. These huge gains are the product of 1) starting with a lower baseline, and 2) giving way more transistor budget to the design than before. It used to be ~4 E-cores equal the area of 1 P-core, and with Skymont it's closer to 2:1. While I don't doubt they could probably scale up the E core's performance if they were given the P core's transistor budget, they won't get 2x the performance. If I recall correctly, a rule of thumb for ST performance is that it's roughly proportional to the square root of the transistor budget, so with 2x the transistors a core should have ~40% higher ST performance.
Obviously, but that doesn't change the P core situation does it?
It's now a steaming hot failure because it went past diminishing returns long ago.
Starting fresh with the Monts as the new growing baseline makes a lot more sense. Or would, if Zen 5 wasn't cornering Intel in a few days.
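
The square-root rule of thumb quoted at the top of this post (often referred to as Pollack's rule) can be sketched as follows. The function name is made up for illustration, and the rule is an empirical approximation, not a guarantee for any particular core:

```python
import math


def st_gain_from_transistor_budget(budget_ratio: float) -> float:
    """Estimated ST performance ratio, assuming perf ~ sqrt(transistor budget)."""
    return math.sqrt(budget_ratio)


for ratio in (2, 3, 4):
    gain = st_gain_from_transistor_budget(ratio)
    print(f"{ratio}x transistors -> ~{(gain - 1) * 100:.0f}% higher ST perf")
```

With a 2x budget this gives roughly the ~40% figure from the quote, and it takes about 4x the transistors to double ST performance.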
 

DavidC1

Platinum Member
Dec 29, 2023
2,160
3,302
106
The thing has no chances against anything Zen5.
Didn't you also say "no chance" when I said I was expecting 30% gains (which they beat here by a huge amount) with Skymont?
Those increases seem astronomical. I hope they aren't.
It's not.

For FP performance, scaling up the number of units benefits a wider range of workloads because you don't need to recompile, as you do when using a new ISA extension like AVX. So in FP it's straight up faster that way.

In Integer, it is consistent with previous gains of 30%, but they seem to be scaling up this time.

By the way Lion Cove is supposed to have 4x 256-bit vector now so the gain will be uniform too.
 

DavidC1

Platinum Member
Dec 29, 2023
2,160
3,302
106
IDC in absolute shambles. Almost guaranteed we get Conroe 2.0 in a few years.
The Austin Atom team has been far more innovative and willing to try new ideas than IDC ever was, even back during their peak.

Read back into Anandtech's article about Bonnell.

-Goldmont with pre-decode cache
-Tremont with clustered decode
-Skymont with the Nanocode

The Atom team is beating the P core team, like in a rabbit vs turtle comparison. Now that the E core has scaled up a lot, each step the turtle takes is comparatively massive.
C&C posted an article on Skymont, which has some interesting technical discussion. Enjoy!
-96 byte fetch (up from 64 byte in Gracemont and 32 in Golden Cove)

C&C is WRONG about it being 16 ports, when leakers said it was 20+ and Gracemont already has 17 ports.
My inference is that Keller made a far bigger impression with the Atom team than IDC.
He's just one man. Go look at the history again. They executed with consistency. That's what got them here. Previously people thought of Atom cores as "useless", so no one paid attention. Keller might have helped them, but earlier posts say the 12-wide core (future Atom) is the beneficiary, not this one. Credit is all on the team, not Tech Jesus.
 

DavidC1

Platinum Member
Dec 29, 2023
2,160
3,302
106

itsmydamnation

Diamond Member
Feb 6, 2011
3,130
3,985
136

18% faster with 3x the core size difference. Saying the P core is in shambles is an understatement.

This is what I mean that the uop cache is doomed folks. I agree with Eric Quinnell on this.
Why? The other vendor's core with a far larger uop cache has no such problems.

To quote my daughter, skill issue.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
I've said it many times before that Gracemont was already at Sunny Cove/Cypress Cove-level IPC, at least for integer.

I had no doubt in my mind that the initial claims where it was said that Skymont is targeting ADL level performance would be true in the end.
 

DavidC1

Platinum Member
Dec 29, 2023
2,160
3,302
106
TBH most of the area comes from transistors aimed at sustaining 5.5 GHz or more fMax.
Not just for spacing but for things such as extra pipeline stages and uop caches (which are a remnant of the high clock speed era).

IMO even Atom should pare down on the clocks, and go from 14 stages to 10-12 stages.
 

Philste

Senior member
Oct 13, 2023
308
486
106
Skymont gains just got confirmed by a ComputerBase editor, along with "double digit IPC gains" for Lion Cove.