Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
942
857
106
Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of two tiles: a compute tile and a PCD tile. The compute tile is a true single die containing CPU, GPU and NPU, fabbed on the Intel 18A process. Last time I checked, the PCD tile is fabbed on TSMC's N6 process. The tiles are connected through UCIe rather than Intel's D2D link, a first for Intel. Expecting a launch in Q1 2026.

| | Intel Raptor Lake U | Intel Wildcat Lake 15W? | Intel Lunar Lake | Intel Panther Lake 4+0+4 |
|---|---|---|---|---|
| Launch Date | Q1-2024 | Q2-2026 | Q3-2024 | Q1-2026 |
| Model | Intel 150U | Intel Core 7 | Core Ultra 7 268V | Core Ultra 7 365 |
| Dies | 2 | 2 | 2 | 3 |
| Node | Intel 7 + ? | Intel 18A + TSMC N6 | TSMC N3B + N6 | Intel 18A + Intel 3 + TSMC N6 |
| CPU | 2 P-cores + 8 E-cores | 2 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores |
| Threads | 12 | 6 | 8 | 8 |
| CPU Max Clock | 5.4 GHz | ? | 5 GHz | 4.8 GHz |
| L3 Cache | 12 MB | 12 MB | 12 MB | ? |
| TDP | 15 - 55 W | 15 W? | 17 - 37 W | 25 - 55 W |
| Memory | 128-bit LPDDR5-5200 | 64-bit LPDDR5 | 128-bit LPDDR5X-8533 | 128-bit LPDDR5X-7467 |
| Max Memory | 96 GB | ? | 32 GB | 128 GB |
| Bandwidth | ? | ? | 136 GB/s | ? |
| GPU | Intel Graphics | Intel Graphics | Arc 140V | Intel Graphics |
| Ray Tracing | No | No | Yes | Yes |
| EU / Xe | 96 EU | 2 Xe | 8 Xe | 4 Xe |
| GPU Max Clock | 1.3 GHz | ? | 2 GHz | 2.5 GHz |
| NPU | GNA 3.0 | 18 TOPS | 48 TOPS | 49 TOPS |
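The 136 GB/s bandwidth figure can be sanity-checked from the memory spec, since peak theoretical bandwidth is just bus width times transfer rate; a quick sketch, with numbers taken from the spec rows above:

```python
# Peak theoretical memory bandwidth = (bus width in bytes) x (transfer rate)
def peak_bw_gbs(bus_bits: int, mts: int) -> float:
    """Return peak bandwidth in GB/s for a given bus width and MT/s rate."""
    return bus_bits / 8 * mts * 1e6 / 1e9

# Lunar Lake: 128-bit LPDDR5X-8533 -> ~136.5 GB/s, matching the 136 GB/s figure
print(peak_bw_gbs(128, 8533))
# Panther Lake: 128-bit LPDDR5X-7467 -> ~119.5 GB/s
print(peak_bw_gbs(128, 7467))
```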






[Slides: PPT1.jpg, PPT2.jpg, PPT3.jpg]



With Hot Chips 34 starting this week, Intel will unveil technical details of the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.



[Slide: LNL-MX.png]

Attachments

  • PantherLake.png (283.5 KB)
  • LNL.png (881.8 KB)
  • INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg (181.4 KB)
  • Clockspeed.png (611.8 KB)

dullard

Elite Member
May 21, 2001
26,196
4,868
126
I was thinking it's either because there's a massive security hole in HT that Intel doesn't want to admit to right now or they are simply cutting costs by not doing the validation.
What is the latest on the performance impact of the Spectre/Meltdown mitigations? I don't follow that thread regularly. At least for a while at the start, there was a pretty big performance loss from the mitigations when hyperthreading was on.

Hyperthreading is realistically more like a 10% performance boost (ranging roughly from +30% in a few benchmarks to -10% in others, with the typical average close to +10%). And that was before the mitigation performance losses. So how does the current Spectre/Meltdown mitigation performance loss compare to the potential hyperthreading gain? Potentially a wash?
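One back-of-the-envelope way to frame the "wash" question; the percentages below are illustrative assumptions, not measurements:

```python
# Toy model: does an SMT uplift survive a mitigation penalty?
# Both percentages are hypothetical placeholders for illustration.
def net_speedup(smt_gain: float, mitigation_loss: float) -> float:
    """Combine a multiplicative SMT gain with a mitigation slowdown."""
    return (1 + smt_gain) * (1 - mitigation_loss) - 1

# ~10% HT gain vs. ~9% mitigation loss -> essentially a wash
print(f"{net_speedup(0.10, 0.09):+.3f}")   # +0.001
```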
 
Mar 8, 2024
66
199
66
Chalking up the loss of HT to Apple alone is pretty silly; I think a more reasonable explanation has something to do with Intel's utter lack of success at reining in power budgets in a way that scales with performance. If you're a company with a history of utterly catastrophic duds, you're on the back foot against AMD, and you NEED a successful generational launch to stop the coming tide of OEM mutiny, you axe the thing that makes it harder to hit performance goals. I'm not sure how it'll play out in marketing terms, though (people generally like seeing big numbers, because that's more bigger and better and gooder).
 

naukkis

Golden Member
Jun 5, 2002
1,030
854
136
Big cores in hybrid designs are there to offer better per-thread performance. Using HT nullifies that, as HT splits a core's per-thread performance roughly in half. The only beneficial case for HT in those hybrid designs is massively parallelized loads where single-thread performance doesn't matter, and if power is limited, it's more beneficial to assign that power to the efficiency cores anyway for better total performance. Intel was actually slow to drop HT; they should have done it as soon as they went to hybrid designs.
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
Big cores in hybrid designs are there to offer better thread performance.
Those cores are also used in DC, where SMT loss hurts.
Nope. LNL actually competes with the likes of the Snapdragon X series to fill the gap left by ARL.
no? lol.
X Elite is higher power than LNL in like all cases.
For a lot higher nT perf but I digress.
 

Hulk

Diamond Member
Oct 9, 1999
5,385
4,098
136
Anyone have a technical understanding of how the cycles that were unused by the primary thread and diverted to the secondary logical thread are going to be utilized solely for the primary thread? Is the removal of HT just to reduce die area, or is something being changed to minimize lost cycles during thread stalls?

I mean other than Apple doesn't do it.
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
Is the removal of HT just to reduce die area or is something being changed to minimize lost cycles during thread stalls?
Less validation work, and you duplicate a bit fewer structures.
SMT's area/power impact was overall negligible; you mostly pay in validation costs/time.
 

naukkis

Golden Member
Jun 5, 2002
1,030
854
136
Anyone have a technical understanding of how the cycles that were unused by the primary thread and diverted to the secondary logical thread are going to be utilized solely for the primary thread? Is the removal of HT just to reduce die area, or is something being changed to minimize lost cycles during thread stalls?

I mean other than Apple doesn't do it.

Intel HT is symmetric multithreading. Both threads are equal: instructions are fed from the other thread every other clock cycle. There is no primary/secondary thread; both threads execute at a bit more than half the speed at which a single thread would execute alone on that core.
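A toy simulation of that alternating-fetch behavior; the stall probability and the slot-stealing rule are simplifying assumptions for illustration, not a model of any real core:

```python
import random

# Toy model of round-robin SMT fetch: two threads alternate ownership of the
# fetch slot each cycle; if the owner is stalled, the sibling takes the slot.
def smt_fetch(cycles: int, stall_prob: float, seed: int = 42):
    rng = random.Random(seed)
    slots = [0, 0]
    for cycle in range(cycles):
        owner = cycle % 2                # threads alternate each cycle
        if rng.random() < stall_prob:    # owner stalled (e.g. cache miss)
            owner ^= 1                   # sibling reclaims the wasted slot
        slots[owner] += 1
    return [s / cycles for s in slots]

per_thread = smt_fetch(1_000_000, stall_prob=0.2)
alone = 1 - 0.2   # a lone thread simply wastes its stalled cycles
print(per_thread)          # each thread gets ~50% of the fetch slots
print(0.5 / alone)         # ~0.625: "a bit more than half" of lone-thread speed
```

The last line is the point: each thread gets about half the slots, but because a lone thread would have wasted its stall cycles anyway, half the slots is more than half the lone-thread speed.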
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
In those cases where single-thread speed doesn't matter, they should drop the big cores and just use more E-cores. Actually, Intel is doing that right now.
Atoms have a rather castrated feature set and middling perf in a ton of workloads, and they won't replace mainline Xeons that way.
You still need big cores in many, many places. The loss of SMT hurts there.
 
  • Like
Reactions: Tlh97 and Kepler_L2

dullard

Elite Member
May 21, 2001
26,196
4,868
126
Less validation work, and you duplicate a bit fewer structures.
SMT's area/power impact was overall negligible; you mostly pay in validation costs/time.
Don't forget the non-negligible part: the cache. When two threads share a core's cache, each thread gets half as much cache and cache thrashing is much more likely. That means either significantly less performance per thread, or you need significantly more cache than you would otherwise (more area, more expense, and more cache latency). Hyperthreading can be a nice performance boost in some cases, but it comes with significant drawbacks.
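A minimal LRU simulation of that cache-sharing effect; the cache size and access patterns are made up purely for illustration:

```python
from collections import OrderedDict

def hit_rate(accesses, cache_size):
    """Simulate an LRU cache and return the hit rate."""
    cache, hits = OrderedDict(), 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)          # refresh recency on a hit
        else:
            cache[addr] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)    # evict least-recently-used line
    return hits / len(accesses)

# One thread looping over a 64-line working set fits a 64-line cache...
one = [i % 64 for i in range(10_000)]
# ...but two such threads interleaved in the same cache thrash each other:
# 128 hot lines cycling through a 64-line LRU cache never hit.
two = [x for i in range(10_000) for x in (i % 64, 1000 + i % 64)]

print(hit_rate(one, 64))   # ~0.99
print(hit_rate(two, 64))   # 0.0
```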
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
Don't forget the non-negligible part: the cache. When two threads share a core's cache, each thread gets half as much cache and cache thrashing is much more likely. That means either significantly less performance per thread, or you need significantly more cache than you would otherwise (more area, more expense, and more cache latency). Hyperthreading can be a nice performance boost in some cases, but it comes with significant drawbacks.
SMT-friendly workloads nuke your caches anyway (stuff like server-side Java and other JITs, etc.).
The real SMT drawbacks are security (cloud guys are anal about that) and validation time.
 
Jul 27, 2020
28,174
19,217
146
I have been thinking about the rumored removal of HT from ARL
While HT may seem like an unnecessary headache on desktop for Intel (and maybe AMD), in mobile CPUs Intel may keep HT alive for a few more years, simply because it's the cheapest way to advertise more cores to consumers without incurring the significant area penalty of replacing the HT virtual cores with physical efficiency cores. I recently used a Core i5-1235U Dell Inspiron laptop and its BIOS had no setting to turn off HT, which I found pretty weird. It's like Intel doesn't want the majority of its users working without HT.

Another factor is the core occupancy determination by Intel Thread Director. Suppose Windows is using a P-core for something, so that core is "awake", and a lightweight thread needs to do something at the same time. Does the ITD wake up a sleeping efficiency core, or does it allocate the virtual HT core of the active P-core to that thread? I'm thinking the latter would be a more efficient use of the available resources, and it could even save time if the lightweight thread finishes its task in less time than it takes to context switch and wake up an efficiency core.
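That trade-off can be sketched as a toy placement heuristic; the function, thresholds and costs below are entirely hypothetical, since Thread Director's actual heuristics aren't public:

```python
# Hypothetical sketch: where should a short, low-priority task run?
# All names and cost numbers here are invented for illustration.
def place_light_thread(task_us: float, wake_e_core_us: float = 200.0) -> str:
    """Pick a placement for a lightweight task on a hybrid CPU."""
    # If the task would finish before an E-core even wakes up, the HT
    # sibling of an already-awake P-core is the cheaper choice.
    if task_us < wake_e_core_us:
        return "HT sibling of active P-core"
    return "wake E-core"

print(place_light_thread(50))     # HT sibling of active P-core
print(place_light_thread(5000))   # wake E-core
```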

Then there's the rumor about rentable units. Let's suppose they really do increase performance, similar to or even better than HT. But because adjacent idle cores' resources get rented out, those cores will be woken up more often, so power efficiency will take some hit. What if ARL desktop has rentable units and no HT, while ARL mobile has HT and no rentable units? If the silicon area dedicated to rentable-unit functionality is similar to HT's requirements, Intel could put both on the compute die and enable one or the other depending on the use case and targeted market.
 

Doug S

Diamond Member
Feb 8, 2020
3,832
6,767
136
The reason Intel did HT was to increase MT throughput; they were pretty clear about that when they introduced it. They don't need it anymore with their E-cores.

Look at it this way. MT throughput is always going to be power limited - you can't run every core at its max frequency in a CPU with a lot of cores. So you (or rather Intel's chip designers) have to ask yourself, where do I get the best increase in performance for each additional watt of power I can pump into the chip?

I'll bet Intel's designers did the math/simulations/benchmarks and determined that if they disabled HT and used the power saved to spin up a few more E-cores, they got better throughput. What's more, it wouldn't suffer from the vagaries of HT performance, where on average it helps but, across the wide world of MT workloads, there are some where it helps more and some where it actually HURTS. One nice advantage of an extra E-core is that it is almost impossible to come up with a benchmark where it will hurt. Maybe it won't help (i.e. you're maxing out memory bandwidth), but you won't see benchmarks where it makes things worse like you do with HT.
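That perf-per-watt argument can be put into a toy model; every number below is an invented placeholder, not a measured or leaked value:

```python
# Toy model: spend a fixed power budget on HT or on extra E-cores.
# All figures are made-up placeholders for illustration only.
P_CORES, P_CORE_PERF = 8, 1.0            # P-core count, relative perf each
HT_EXTRA_W, HT_GAIN = 0.5, 0.10          # HT: +10% MT throughput for +0.5 W/core
E_CORE_W, E_CORE_PERF = 1.5, 0.55        # one E-core: 1.5 W, 0.55x P-core perf

budget = P_CORES * HT_EXTRA_W            # power freed by disabling HT (4.0 W)

with_ht = P_CORES * P_CORE_PERF * (1 + HT_GAIN)              # 8.8
with_e = P_CORES * P_CORE_PERF + (budget // E_CORE_W) * E_CORE_PERF  # 9.1

# Under these assumptions, two extra E-cores beat HT on total throughput.
print(with_ht, with_e)
```

Flip the placeholder numbers (say, cheaper HT or hungrier E-cores) and the conclusion flips too, which is exactly why this comes down to Intel's internal simulations rather than anything outsiders can settle.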
 

Saylick

Diamond Member
Sep 10, 2012
4,121
9,641
136
Where can I read more about that whole rentable units concept? Seems to be wild.

Probably something similar to this. We called it "reverse Hyperthreading".
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106

Probably something similar to this. We called it "reverse Hyperthreading".

That inverse hyperthreading is wild stuff.

If Intel can get that working, they'll become the undisputed king of Single Thread performance.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,135
4,003
136
In those cases where single-thread speed doesn't matter, they should drop the big cores and just use more E-cores. Actually, Intel is doing that right now.
Except for all those workloads that have high latency but also need lots of brawn, like relational DBs, or generally anything in the server space that deals with I/O.

I can't wait for 1000s of terribly performing Kubernetes containers running on 1000s of average-performing cores. But I'm cloud scale!!!!! 2024 IT is lit.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Except for all those workloads that have high latency but also need lots of brawn, like relational DBs, or generally anything in the server space that deals with I/O.

I can't wait for 1000s of terribly performing Kubernetes containers running on 1000s of average-performing cores. But I'm cloud scale!!!!! 2024 IT is lit.
Isn't Skymont targeting Golden Cove level of performance?

Hardly average performing if that is indeed the case.