Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
941
857
106
Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D; a first from Intel. Expected to launch in Q1 2026.

| Spec | Intel Raptor Lake U | Intel Wildcat Lake 15W? | Intel Lunar Lake | Intel Panther Lake 4+4+4 |
| Launch Date | Q1-2024 | Q2-2026 | Q3-2024 | Q1-2026 |
| Model | Intel 150U | Intel Core 7 | Core Ultra 7 268V | Core Ultra 7 365 |
| Dies | 2 | 2 | 2 | 3 |
| Node | Intel 7 + ? | Intel 18-A + TSMC N6 | TSMC N3B + N6 | Intel 18-A + Intel 3 + TSMC N6 |
| CPU | 2 P-core + 8 E-cores | 2 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores |
| Threads | 12 | 6 | 8 | 8 |
| CPU Max Clock | 5.4 GHz | ? | 5 GHz | 4.8 GHz |
| L3 Cache | 12 MB | | 12 MB | 12 MB |
| TDP | 15 - 55 W | 15 W ? | 17 - 37 W | 25 - 55 W |
| Memory | 128-bit LPDDR5-5200 | 64-bit LPDDR5 | 128-bit LPDDR5x-8533 | 128-bit LPDDR5x-7467 |
| Memory Size | 96 GB | | 32 GB | 128 GB |
| Memory Bandwidth | | | 136 GB/s | |
| GPU | Intel Graphics | Intel Graphics | Arc 140V | Intel Graphics |
| RT | No | No | YES | YES |
| EU / Xe | 96 EU | 2 Xe | 8 Xe | 4 Xe |
| GPU Max Clock | 1.3 GHz | ? | 2 GHz | 2.5 GHz |
| NPU | GNA 3.0 | 18 TOPS | 48 TOPS | 49 TOPS |
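
For reference, the 136 GB/s figure lines up with the usual peak-bandwidth arithmetic: bus width in bytes times transfer rate. A quick Python sketch follows; the function name is just for illustration, and these are theoretical peaks computed from the bus widths and data rates listed in the table, not measured numbers. Wildcat Lake is left out because no data rate is listed for it.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


# (bus width in bits, data rate in MT/s), taken from the Memory row of the table
configs = {
    "Raptor Lake U (128-bit LPDDR5-5200)": (128, 5200),
    "Lunar Lake (128-bit LPDDR5x-8533)": (128, 8533),
    "Panther Lake (128-bit LPDDR5x-7467)": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

The Lunar Lake entry works out to about 136.5 GB/s, which matches the table once rounded down.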






PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



LNL-MX.png
 


DavidC1

Platinum Member
Dec 29, 2023
2,138
3,273
106
They will benefit more from Win10 EOL than from AI
I meant it benefits sales of servers because you need CPUs to go with the accelerators like GPUs. It's a rising tide even if it's indirect. When the AI market crashes it'll be like a mini post-lockdown effect for servers.
Well, since the node debacle the focus was always on their node; their design issues have only been showing for the past two years, after the nodes were fine.

I am still in awe that design teams were allowed near-infinite tape-outs, and that it was the foundry that had to accommodate the changes, not design.
When an inefficiently run organization that "needed" all the waste to perform like it used to gets that waste cut, it'll do worse. Intel needs a much more radical change to improve this. Like I said, 30 years' worth of changes is necessary because that's how long the culture has been rotten for, maybe even 35 years. And after that you need someone who can rebuild a new culture. Don't know if it's even possible to find someone that capable.

The P vs E core team differences are likely just the tip of the iceberg in how many inefficient and badly run parts there are at Intel. And the worst part is I think the E core team could do better - much better.
 

511

Diamond Member
Jul 12, 2024
5,394
4,816
106
I meant it benefits sales of servers because you need CPUs to go with the accelerators like GPUs. It's a rising tide even if it's indirect. When the AI market crashes it'll be like a mini post-lockdown effect for servers.

When an inefficiently run organization that "needed" all the waste to perform like it used to gets that waste cut, it'll do worse. Intel needs a much more radical change to improve this. Like I said, 30 years' worth of changes is necessary because that's how long the culture has been rotten for, maybe even 35 years. And after that you need someone who can rebuild a new culture. Don't know if it's even possible to find someone that capable.
Maybe. I think it's since or after Otellini, so 20 years you could say, because under the Trinity they were fine
The P vs E core team differences are likely just the tip of the iceberg in how many inefficient and badly run parts there are at Intel. And the worst part is I think the E core team could do better - much better.
Yes
 

LightningZ71

Platinum Member
Mar 10, 2017
2,673
3,372
136
There's no architectural improvement, it's nearly entirely due to process. Core Ultra 7 265U is Meteor Lake cores on Intel 3. The ST clocks are 10% higher so that's why the gains are close to it. As for MT, Geekbench does vary a ton (it's a userbenchmark after all), so we'll have to compare top result to top result. It should be 10% or more faster.
The two biggest regressions are clang and background blur, going from a healthy ST improvement each to a big drop in MT. That speaks to a memory or inter-thread communication latency issue. This maps to either a BIOS issue or the different memory in the two units having different latency settings. It's not unreasonable for the newer one with 64 GB of RAM to have higher latency.
 

DavidC1

Platinum Member
Dec 29, 2023
2,138
3,273
106
Maybe. I think it's since or after Otellini, so 20 years you could say, because under the Trinity they were fine
Craig Barrett had issues too. That's why I call him a sane Krzanich. So yeah, ~30 years, give or take a few.

And the "Trinity" weren't perfect either, it's just that people who are good leaders make up for their deficiencies with their good side.
 
  • Like
Reactions: Thibsie

511

Diamond Member
Jul 12, 2024
5,394
4,816
106
Craig Barrett had issues too. That's why I call him a sane Krzanich. So yeah, ~30 years, give or take a few.
Sure
And the "Trinity" weren't perfect either, it's just that people who are good leaders make up for their deficiencies with their good side.
No one is perfect, but what I said was that there was no cultural rot. If they had cultural rot at that time, AMD wouldn't exist
 

sgs_x86

Junior Member
Dec 20, 2020
17
26
91
Hypothetical: What if the E-core team is given 50% extra die space per E-core? They can expand the core, add more cache to feed it and run it at higher clocks and still be smaller than an all P-core or hybrid design.
 

Kepler_L2

Golden Member
Sep 6, 2020
1,076
4,638
136
Hypothetical: What if the E-core team is given 50% extra die space per E-core? They can expand the core, add more cache to feed it and run it at higher clocks and still be smaller than an all P-core or hybrid design.
That's what Unified Core is.
 
  • Like
Reactions: sgs_x86

511

Diamond Member
Jul 12, 2024
5,394
4,816
106
Hypothetical: What if the E-core team is given 50% extra die space per E-core? They can expand the core, add more cache to feed it and run it at higher clocks and still be smaller than an all P-core or hybrid design.
Yes, the P core is too power hungry
That's what Unified Core is.
From the rumors, it's better than the Royal Core project, because the base of the core was wrong in Royal Core
 

dullard

Elite Member
May 21, 2001
26,191
4,855
126
1738876381697.png

Why do the P core Exist sigh
1) As shown in the data you present, the P core averaged 7.5% faster.

2) But the data you presented was run with both the P core at 3.8 GHz and the E core at 3.8 GHz. The P core can go much faster than 3.8 GHz. The turbo P core speed is 5.1 GHz on the 255H and 5.4 GHz on the 285H. Meaning, the end result is another ~20% boost over the 7.5% faster shown in that data.

3) As @desrever stated, when running at high clocks, the P cores are more efficient, see the red circled portion of the graph below where the P cores (blue line) do more work at the same power as the E cores (purple line).
1738876643576.png
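
To put rough numbers on point 2, the clock ratios and the compounded lead can be sketched in a few lines of Python. The 1.075 iso-clock factor and the clocks come from the post; the `scaling` parameter is a hypothetical knob added for illustration (1.0 means performance scales perfectly with clock, which real workloads rarely manage), and the E core is held at 3.8 GHz as in the data being discussed.

```python
def combined_speedup(iso_clock_gain: float, test_ghz: float, turbo_ghz: float,
                     scaling: float = 1.0) -> float:
    """P-core lead once the P core runs at turbo_ghz instead of test_ghz.

    scaling is an assumed fraction of the extra clock that turns into
    extra performance (1.0 = perfectly linear, lower = diminishing returns).
    """
    clock_gain = 1.0 + scaling * (turbo_ghz / test_ghz - 1.0)
    return iso_clock_gain * clock_gain


for model, turbo in (("255H", 5.1), ("285H", 5.4)):
    linear = combined_speedup(1.075, 3.8, turbo)
    print(f"{model}: clock ratio {turbo / 3.8:.2f}x, "
          f"P-core lead with linear scaling {linear:.2f}x")

# The ~20% boost from the post compounds with the 7.5% iso-clock lead:
print(f"1.075 * 1.20 = {1.075 * 1.20:.2f}x")
```

The raw clock ratios are larger than 20%, so the ~20% figure reads as a conservative estimate that already allows for less-than-linear scaling with clock.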
 

OneEng2

Senior member
Sep 19, 2022
981
1,193
106
1) As shown in the data you present, the P core averaged 7.5% faster.

2) But the data you presented was run with both the P core at 3.8 GHz and the E core at 3.8 GHz. The P core can go much faster than 3.8 GHz. The turbo P core speed is 5.1 GHz on the 255H and 5.4 GHz on the 285H. Meaning, the end result is another ~20% boost over the 7.5% faster shown in that data.

3) As @desrever stated, when running at high clocks, the P cores are more efficient, see the red circled portion of the graph below where the P cores (blue line) do more work at the same power as the E cores (purple line).
View attachment 116455
... and a voice of reason ;).

In engineering, "YOU NEVER GET SOMETHING FOR NOTHING".

Yes, E cores are quite competent at SPECINT2017 at low clock speeds compared to P cores, but just to point out the obvious....

  • Real applications are not SPECINT
  • Performance = IPC * Clock
  • Designing cores that clock higher costs more die space (for a few reasons)
  • Designing cores that clock higher makes them less energy efficient
I don't know if we can ever really put this discussion to bed with proof as it is difficult to divide the P and E cores out like that and run real applications on just one or the other.
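
The "Performance = IPC * Clock" bullet can be made concrete with a few lines of Python. This is only a toy model: IPC is treated as a constant (in practice it varies with workload and with clock, since memory latency does not scale), the values are normalized, and the 7.5% iso-clock P-core lead quoted above is used as the IPC ratio. The function names are made up for the example.

```python
def performance(ipc: float, clock_ghz: float) -> float:
    """Performance = IPC * Clock, in arbitrary units."""
    return ipc * clock_ghz


def clock_to_match(target_perf: float, ipc: float) -> float:
    """Clock (GHz) a core with the given IPC needs to reach target_perf."""
    return target_perf / ipc


E_IPC = 1.0    # normalized baseline (assumption)
P_IPC = 1.075  # 7.5% higher at the same clock, per the data quoted above

p_perf = performance(P_IPC, 3.8)
print(f"E core needs {clock_to_match(p_perf, E_IPC):.2f} GHz "
      f"to match a P core at 3.8 GHz")
```

In this model the E core closes the gap at roughly 4.1 GHz; whether it can reach that clock without giving up its area and efficiency advantage is exactly the trade-off in the last two bullets.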
 
  • Like
Reactions: lightmanek

511

Diamond Member
Jul 12, 2024
5,394
4,816
106
1) As shown in the data you present, the P core averaged 7.5% faster.

2) But the data you presented was run with both the P core at 3.8 GHz and the E core at 3.8 GHz. The P core can go much faster than 3.8 GHz. The turbo P core speed is 5.1 GHz on the 255H and 5.4 GHz on the 285H. Meaning, the end result is another ~20% boost over the 7.5% faster shown in that data.
It's a laptop, so sustaining that beyond 2-3 cores is not possible in the chassis
3) As @desrever stated, when running at high clocks, the P cores are more efficient, see the red circled portion of the graph below where the P cores (blue line) do more work at the same power as the E cores (purple line).
View attachment 116455
From this graph, the LNL P core has better PPW than ARL-H, so it seems that the ring bus/SoC design is holding back the P cores, though that may be true for the E cores as well. My point is that for power-constrained devices, P cores are not that useful beyond maybe 4 P cores; the additional 2 P cores will steal too much from the power budget.
 
  • Like
Reactions: MoistOintment

511

Diamond Member
Jul 12, 2024
5,394
4,816
106
... and a voice of reason ;).

In engineering, "YOU NEVER GET SOMETHING FOR NOTHING".

Yes, E cores are quite competent at SPECINT2017 at low clock speeds compared to P cores, but just to point out the obvious....

  • Real applications are not SPECINT
  • Performance = IPC * Clock
  • Designing cores that clock higher costs more die space (for a few reasons)
  • Designing cores that clock higher makes them less energy efficient
I don't know if we can ever really put this discussion to bed with proof as it is difficult to divide the P and E cores out like that and run real applications on just one or the other.
Well, Intel can. They must have done internal benchmarks; that is why Intel is going with the E core as the base for Unified Core
 

OneEng2

Senior member
Sep 19, 2022
981
1,193
106
It's a laptop, so sustaining that beyond 2-3 cores is not possible in the chassis

From this graph, the LNL P core has better PPW than ARL-H, so it seems that the ring bus/SoC design is holding back the P cores, though that may be true for the E cores as well. My point is that for power-constrained devices, P cores are not that useful beyond maybe 4 P cores.
I have been guessing that the P cores can't be fed properly from the ring bus... at least in the latency department. Perhaps Panther Lake will show us what the P cores can do when freed up a little?
 
  • Like
Reactions: Schmide and 511

GTracing

Senior member
Aug 6, 2021
478
1,114
106
It's a laptop, so sustaining that beyond 2-3 cores is not possible in the chassis

From this graph, the LNL P core has better PPW than ARL-H, so it seems that the ring bus/SoC design is holding back the P cores, though that may be true for the E cores as well. My point is that for power-constrained devices, P cores are not that useful beyond maybe 4 P cores.
If that's your point, you have an odd way of saying it. "Why do the P core Exist sigh" sounds to me like you're saying that P cores shouldn't exist. But maybe that's just me.
 

511

Diamond Member
Jul 12, 2024
5,394
4,816
106
I have been guessing that the P cores can't be fed properly from the ring bus... at least in the latency department. Perhaps Panther Lake will show us what the P cores can do when freed up a little?
This will be true for the E cores as well, so we have to wait, like the same time as with LNL, for PTL benchmarks
 

DavidC1

Platinum Member
Dec 29, 2023
2,138
3,273
106
In engineering, "YOU NEVER GET SOMETHING FOR NOTHING".
Assuming the two teams are equal in everything like a benchmark, which is NEVER the case.

Competency and innovation set competing teams apart. See DeepSeek.

Or how the WinChip team (which eventually made Cyrix-branded chips) made x86 chips in 1/3 of the time Intel did, which at that time was thought to be "impossible". They said "Give us 10 million and 2 years and we'll make an x86 compatible chip". It still is amazing what they were able to do.

The rumor is that since the E core team is in proximity to the Austin VIA team, and with the buyouts, they probably learned a lot about how to actually make efficient cores.

The E core design also had more similarities (compared to the P) to the ARM cores, with distributed schedulers, more FP units versus doubling bit-width, and lack of SMT, so there's likely a push in that direction as well.

And I keep pointing out that changes and innovations in the E core designs happen at nearly every uarch change, or every two years, while the P core has only seen that during P-M, Core 2, and SNB.
Yes, E cores are quite competent at SPECINT2017 at low clock speeds compared to P cores, but just to point out the obvious....
Games are very sensitive to uarch changes, yet in Arrow Lake-S 8P is slower than 1P+16E. Whatever cache/ring differences such a config may bring, they aren't enough for the "Performance" cores to outperform the E core config, so they are very close.

SpecINT is quite accurate from a mile-high point of view by the way.
 

DavidC1

Platinum Member
Dec 29, 2023
2,138
3,273
106
From this graph, the LNL P core has better PPW than ARL-H, so it seems that the ring bus/SoC design is holding back the P cores, though that may be true for the E cores as well. My point is that for power-constrained devices, P cores are not that useful beyond maybe 4 P cores; the additional 2 P cores will steal too much from the power budget.
If you have a core that uses 10W when active but the SoC uses 5W, then the load TDP will end up being ~15W.

Since Lunar Lake has a much lower SoC idle, it ends up lowering peak power by a bit too.

I assume Arrow Lake is NOT the best implementation, but higher performance does increase cost, whether in money terms or power. Since Lunar came before Arrow, even though it was supposed to be the other way around, Lunar Lake had far better execution, thus the teams were able to meet their goals, while Arrow Lake probably failed in lots of them.
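
The 10W + 5W arithmetic, together with the quoted point about extra P cores eating into the budget, can be sketched like this in Python. The numbers are hypothetical throughout: the 10 W core and 5 W SoC figures are the example from the post, while the 3 W SoC case and the even split across cores are assumptions for illustration, not measurements of any real chip.

```python
def package_power_w(core_power_w: float, soc_power_w: float) -> float:
    """Load power is roughly active core power plus SoC (uncore) power."""
    return core_power_w + soc_power_w


def per_core_budget_w(package_limit_w: float, soc_power_w: float,
                      active_cores: int) -> float:
    """Power left per active core, assuming an even split (a simplification)."""
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    return (package_limit_w - soc_power_w) / active_cores


print(f"10 W core + 5 W SoC = {package_power_w(10, 5):.0f} W at load")

# Same 15 W limit, two assumed SoC power levels, 4 vs 6 active cores
for soc_w in (5.0, 3.0):
    for cores in (4, 6):
        budget = per_core_budget_w(15.0, soc_w, cores)
        print(f"SoC {soc_w:.0f} W, {cores} cores: {budget:.2f} W per core")
```

Under a fixed limit, every watt saved in the SoC goes straight to the cores, and going from 4 to 6 active cores cuts the per-core share by a third.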
 
  • Like
Reactions: 511

511

Diamond Member
Jul 12, 2024
5,394
4,816
106
If you have a core that uses 10W when active but the SoC uses 5W, then the load TDP will end up being ~15W.

Since Lunar Lake has a much lower SoC idle, it ends up lowering peak power by a bit too.

I assume Arrow Lake is NOT the best implementation, but higher performance does increase cost, whether in money terms or power. Since Lunar came before Arrow, even though it was supposed to be the other way around, Lunar Lake had far better execution, thus the teams were able to meet their goals, while Arrow Lake probably failed in lots of them.
I think the decision to stay with MTL's bad tile implementation couldn't be reversed midway