Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
911
829
106
Wildcat Lake (WCL) Preliminary Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing ADL-N. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. The two are connected through UCIe, not D2D; a first from Intel. Launch is expected in Q2/Computex 2026. In case people don't remember Alder Lake-N, I have created a table below to compare the detailed specs of ADL-N and WCL. Just for fun, I am throwing in LNL and the upcoming MediaTek D9500 SoC.

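Spec            | Intel Alder Lake-N  | Intel Wildcat Lake       | Intel Lunar Lake         | MediaTek D9500
----------------+---------------------+--------------------------+--------------------------+---------------------
Launch Date     | Q1-2023             | Q2-2026 ?                | Q3-2024                  | Q3-2025
Model           | Intel N300          | ?                        | Core Ultra 7 268V        | Dimensity 9500 5G
Dies            | 2                   | 2                        | 2                        | 1
Node            | Intel 7 + ?         | Intel 18A + TSMC N6      | TSMC N3B + N6            | TSMC N3P
CPU             | 8 E-cores           | 2 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores | C1 1+3+4
Threads         | 8                   | 6                        | 8                        | 8
Max Clock (CPU) | 3.8 GHz             | ?                        | 5 GHz                    | -
L3 Cache        | 6 MB                | ?                        | 12 MB                    | -
TDP             | 7 W                 | Fanless ?                | 17 W                     | Fanless
Memory          | 64-bit LPDDR5-4800  | 64-bit LPDDR5-6800 ?     | 128-bit LPDDR5X-8533     | 64-bit LPDDR5X-10667
Memory Size     | 16 GB               | ?                        | 32 GB                    | 24 GB ?
Bandwidth       | -                   | ~55 GB/s                 | 136 GB/s                 | 85.6 GB/s
GPU             | UHD Graphics        | -                        | Arc 140V                 | G1 Ultra
EU / Xe         | 32 EU               | 2 Xe                     | 8 Xe                     | 12
Max Clock (GPU) | 1.25 GHz            | -                        | 2 GHz                    | -
NPU             | NA                  | 18 TOPS                  | 48 TOPS                  | 100 TOPS ?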
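
The bandwidth row follows from the memory row: peak bandwidth is the bus width in bytes times the transfer rate. A minimal Python sketch of that arithmetic, assuming the LPDDR speed grades in the table are in MT/s and ignoring real-world efficiency losses (the function name is made up for illustration):

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mts * 1e6 / 1e9

# Bus width (bits) and data rate (MT/s) as listed in the table.
configs = {
    "ADL-N (64-bit LPDDR5-4800)": (64, 4800),
    "WCL (64-bit LPDDR5-6800 ?)": (64, 6800),
    "LNL (128-bit LPDDR5X-8533)": (128, 8533),
    "D9500 (64-bit LPDDR5X-10667)": (64, 10667),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

This gives about 54.4 GB/s for WCL and 136.5 GB/s for LNL, in line with the table. The D9500 comes out near 85.3 GB/s, slightly under the 85.6 GB/s listed.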






PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



LNL-MX.png
 

Attachments

  • PantherLake.png
  • LNL.png
  • INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg
  • Clockspeed.png

GTracing

Senior member
Aug 6, 2021
478
1,114
106
How meaningful is platform TOPS as a metric? Genuinely asking. I don't know much about this stuff.

Can models easily be run on the CPU, NPU, and GPU at the same time? Will a 10 TOPS CPU run AI software as fast as a 10 TOPS NPU or 10 TOPS GPU? Or does it depend on the software? Would running a model on the CPU use all the CPU horsepower and make the system slow and unusable until the software finishes?
 

511

Diamond Member
Jul 12, 2024
5,017
4,528
106
How meaningful is platform TOPS as a metric? Genuinely asking. I don't know much about this stuff.
TOPS is an even worse metric than FP32 FLOPS, because at least FP32 is a fixed size, while companies don't even list which data type their TOPS figure uses. It can be INT8/BF16/FP16/INT4/FP4.

Nvidia quotes TOPS with an edge case called sparsity: they take the dense figure, multiply it by 2 because of the edge case, and report that as TOPS. For example, the 980 TOPS on the Nvidia RTX 5070 is FP4 with sparsity, so to convert it into what Intel reports (INT8) we divide it by 4: a factor of 2 for going from FP4 (4-bit) with sparsity to without sparsity, and another factor of 2 for going from FP4 to INT8 (8-bit). That translates to approximately 245 TOPS of INT8, or roughly double the throughput of the Panther Lake iGPU.
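
To make the conversion above concrete, here is a rough Python sketch. It assumes, as the post does, that throughput halves each time the operand width doubles and that a "with sparsity" figure is exactly 2x the dense one; real hardware does not always scale this cleanly. The function and the width table are hypothetical, not any vendor's API.

```python
# Bit widths of common AI data types (used only for the width ratio).
BITS = {"FP4": 4, "INT4": 4, "INT8": 8, "FP16": 16, "BF16": 16}

def to_dense_int8_tops(tops: float, dtype: str, sparse: bool) -> float:
    """Rescale a marketing TOPS figure to a dense INT8 equivalent.

    Assumption: throughput scales inversely with operand width, and a
    'with sparsity' number is 2x the dense number.
    """
    dense = tops / 2 if sparse else tops
    return dense * BITS[dtype] / BITS["INT8"]

# The RTX 5070 example from the post: 980 TOPS, FP4, with sparsity.
print(to_dense_int8_tops(980, "FP4", sparse=True))  # 245.0
```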

TOPS = ops/clk × clock frequency × cores
This is the mathematical part of the conversion.
Can models easily be run on the CPU, NPU, and GPU at the same time? Will a 10 TOPS CPU run AI software as fast as a 10 TOPS NPU or 10 TOPS GPU? Or does it depend on the software? Would running a model on the CPU use all the CPU horsepower and make the system slow and unusable until the software finishes?
It all comes down to software. Also, if they are not memory bandwidth or memory size bound, they would perform the same, but that's an ideal scenario. An iGPU may be faster than an RTX 4090 if the model is so big that it spills out of VRAM and becomes slow.
For AI you need a few things:
> Memory large enough to fit the model.
> Memory bandwidth so the model isn't slowed down.
> Compute, ofc, measured in TOPS in this case. BF16 is used for training; smaller sizes like FP4 are limited to inference.
> Software support (hence the CUDA moat, because AMD's MI300X is more powerful but lacking in SW).
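
A rough Python sketch of the first two points, under loudly stated assumptions: weights dominate memory use (KV cache and activations are ignored), and token generation is memory bound, so every weight is read once per token. The 7B-parameter model and the helper names are hypothetical; the 136 GB/s and 32 GB figures are the LNL numbers from the table in the first post.

```python
def model_size_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8

def fits(model_gb: float, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory."""
    return model_gb <= memory_gb

def max_tokens_per_s(model_gb: float, bandwidth_gbs: float) -> float:
    """Upper bound on decode speed if every weight is read once per token."""
    return bandwidth_gbs / model_gb

size = model_size_gb(7, 4)                    # hypothetical 7B model at 4-bit
print(size, fits(size, 32))                   # 3.5 True
print(round(max_tokens_per_s(size, 136), 1))  # 38.9
```

The same model at 16-bit would be 14 GB and the bound drops to under 10 tokens/s on the same bandwidth, which is why compute TOPS alone says little about real speed.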

Hope that helps
 
  • Like
Reactions: GTracing

511

Diamond Member
Jul 12, 2024
5,017
4,528
106
Does anyone care about TOPS though? I thought the consensus was that it was a waste of die space that could be used for something more valuable. As far as the NPU goes, that is.
Btw Intel has cut the NPU size in half, from the rumors.
 

dullard

Elite Member
May 21, 2001
26,138
4,796
126
Yes, E cores are quite competent at SPECINT2017 at low clock speeds compared to P cores, but just to point out the obvious....

  • Real applications are not SPECINT
That test was SPECint. But there have been other tests of P core vs. E core on other applications. The results so far have always been the same: the performance curves cross each other at some point, meaning one core is better at low power and the other core is better at high power.

Here are some admittedly dated ones, but they were easy to find and are real applications and not just SPECint:
1738941063587.png
1738941076754.png
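
To illustrate what "the curves cross" means, here is a toy Python sketch. It assumes each core's performance follows a simple power law in power, perf = k * watts**alpha. Both the model and the coefficients are invented for illustration and are not fitted to the charts above or to any real core.

```python
def perf(watts: float, k: float, alpha: float) -> float:
    """Toy performance model: perf = k * watts**alpha."""
    return k * watts ** alpha

def crossover_watts(k_a: float, alpha_a: float, k_b: float, alpha_b: float) -> float:
    """Power at which two power-law curves meet (requires alpha_a != alpha_b)."""
    return (k_b / k_a) ** (1.0 / (alpha_a - alpha_b))

# Invented coefficients: the 'E-like' core leads at low power,
# the 'P-like' core scales better as power rises.
E_CORE = (10.0, 0.30)
P_CORE = (7.0, 0.50)

x = crossover_watts(*E_CORE, *P_CORE)
print(f"curves cross at about {x:.1f} W")
```

Below the crossover the E-like curve is ahead, above it the P-like curve is; where the real crossover sits depends entirely on the measured curves.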

but my point is that for power-constrained devices P cores are not that useful beyond maybe 4 P cores; the additional 2 P cores will steal too much from the power budget.
If your revised statement is that 6 P cores is too much, then lucky for you the 235H exists with fewer P cores and costs 31% less.
 

511

Diamond Member
Jul 12, 2024
5,017
4,528
106
I was thinking of the NPU on Meteor Lake and Arrow Lake which uses TSMC N6. I assume you were talking about Lunar Lake?
Yes, they have made significant changes compared to LNL to minimize the NPU area while still meeting Copilot requirements.
 

OneEng2

Senior member
Sep 19, 2022
951
1,163
106
That test was SPECint. But there have been other tests of P core vs. E core on other applications. The results so far have always been the same: the performance curves cross each other at some point, meaning one core is better at low power and the other core is better at high power.

Here are some admittedly dated ones, but they were easy to find and are real applications and not just SPECint:
View attachment 116511
View attachment 116512


If your revised statement is that 6 P cores is too much, then lucky for you the 235H exists with fewer P cores and costs 31% less.
Thanks. This is for older P and E cores and not LLK/ARL, but I suspect the story would still be the same.
 

511

Diamond Member
Jul 12, 2024
5,017
4,528
106
Thanks. This is for older P and E cores and not LLK/ARL, but I suspect the story would still be the same.
There are massive architectural gaps between GLC and GMT vs SKT and LNC, so there is a massive uArch difference; the curve has shifted more in favour of SKT vs LNL.
 

fastandfurious6

Senior member
Jun 1, 2024
826
993
96
Isn't it strange that both camps still keep review embargoes etc. despite all product details having been announced a month+ ago?

When are the embargoes lifted?
 

AcrosTinus

Senior member
Jun 23, 2024
221
226
76

🫣🫣🫣
Raptor is and was fast, so fast in fact it burned out like a star.
Only 18A or 14A based chips with a lot more logic can really surpass it; Zen5 is not much faster than Raptor Cove either.
 
  • Like
Reactions: lightmanek

511

Diamond Member
Jul 12, 2024
5,017
4,528
106
Raptor is and was fast, so fast in fact it burned out like a star.
Only 18A or 14A based chips with a lot more logic can really surpass it; Zen5 is not much faster than Raptor Cove either.
It's embarrassing that an Intel 7 product is making the ST performance of Zen5/ARL look meh.
 

511

Diamond Member
Jul 12, 2024
5,017
4,528
106
I am waiting for Igor to leak NVL performance projections like they did with Arrow Lake.

Since the core count is 16P+32E, I bet the 32 E cores alone would have greater multi-threaded performance than ARL-S, and 2x over ARL is possible. 100K in R23?

I hope they don't use DLVR on desktop; there is no need for it on desktop.

Any guesstimate for ST Performance?
 

Thunder 57

Diamond Member
Aug 19, 2007
4,155
6,933
136
I am waiting for Igor to leak NVL performance projections like they did with Arrow Lake.

Since the core count is 16P+32E, I bet the 32 E cores alone would have greater multi-threaded performance than ARL-S, and 2x over ARL is possible. 100K in R23?

I hope they don't use DLVR on desktop; there is no need for it on desktop.

At what kind of TDP do you think they could pull that off? And with enough memory bandwidth to feed the cores? Maybe they have a halo SKU for that, but I suspect Intel will stick to 8 P cores for the most part. Not necessarily a bad thing, look at Nvidia ATM.
 

511

Diamond Member
Jul 12, 2024
5,017
4,528
106
At what kind of TDP do you think they could pull that off? And with enough memory bandwidth to feed the cores? Maybe they have a halo SKU for that, but I suspect Intel will stick to 8 P cores for the most part. Not necessarily a bad thing, look at Nvidia ATM.
Around 300-350 W, which is kind of needed to feed the P cores. If it were 48 E cores, 250 W would have been enough, considering the 144 E-core SRF parts are fine at around 250 W. As for bandwidth, I would guess triple-channel memory would be needed.
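
A back-of-the-envelope check of that power argument in Python, using only the figures from the post. It assumes package power divides evenly across cores, which ignores clocks, uncore and memory, so treat it as a sanity check rather than a prediction.

```python
def watts_per_core(package_watts: float, cores: int) -> float:
    """Naive average: package power spread evenly over all cores."""
    return package_watts / cores

srf = watts_per_core(250, 144)           # 144 E-core SRF at ~250 W
e_only = watts_per_core(250, 48)         # hypothetical 48 E-core part at 250 W
nvl_low = watts_per_core(300, 16 + 32)   # 16P+32E at 300 W
nvl_high = watts_per_core(350, 16 + 32)  # 16P+32E at 350 W

print(f"SRF: {srf:.2f} W/core")
print(f"48E @ 250 W: {e_only:.2f} W/core")
print(f"16P+32E @ 300-350 W: {nvl_low:.2f}-{nvl_high:.2f} W/core")
```

On this naive average, 48 E cores at 250 W would get roughly three times the per-core budget of the 144-core SRF part.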