Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
942
857
106
Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of two tiles: a compute tile and a PCD tile. The compute tile is a true single die containing CPU, GPU and NPU, fabbed on the Intel 18A process. Last time I checked, the PCD tile is fabbed on TSMC's N6 process. The tiles are connected through UCIe rather than Intel's D2D link, a first for Intel. Expecting a launch in Q1 2026.

| | Intel Raptor Lake U | Intel Wildcat Lake 15W? | Intel Lunar Lake | Intel Panther Lake 4+0+4 |
|---|---|---|---|---|
| Launch Date | Q1-2024 | Q2-2026 | Q3-2024 | Q1-2026 |
| Model | Intel 150U | Intel Core 7 | Core Ultra 7 268V | Core Ultra 7 365 |
| Dies | 2 | 2 | 2 | 3 |
| Node | Intel 7 + ? | Intel 18A + TSMC N6 | TSMC N3B + N6 | Intel 18A + Intel 3 + TSMC N6 |
| CPU | 2 P-cores + 8 E-cores | 2 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores |
| Threads | 12 | 6 | 8 | 8 |
| CPU Max Clock | 5.4 GHz | ? | 5 GHz | 4.8 GHz |
| L3 Cache | 12 MB | 12 MB | 12 MB | ? |
| TDP | 15 - 55 W | 15 W? | 17 - 37 W | 25 - 55 W |
| Memory | 128-bit LPDDR5-5200 | 64-bit LPDDR5 | 128-bit LPDDR5X-8533 | 128-bit LPDDR5X-7467 |
| Max Memory | 96 GB | ? | 32 GB | 128 GB |
| Bandwidth | ? | ? | 136 GB/s | ? |
| GPU | Intel Graphics | Intel Graphics | Arc 140V | Intel Graphics |
| Ray Tracing | No | No | Yes | Yes |
| EU / Xe | 96 EU | 2 Xe | 8 Xe | 4 Xe |
| GPU Max Clock | 1.3 GHz | ? | 2 GHz | 2.5 GHz |
| NPU | GNA 3.0 | 18 TOPS | 48 TOPS | 49 TOPS |
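The 136 GB/s bandwidth figure can be sanity-checked from the memory spec, since peak theoretical bandwidth is just bus width times transfer rate; a quick sketch, with numbers taken from the spec rows above:

```python
# Peak theoretical memory bandwidth = (bus width in bytes) x (transfer rate)
def peak_bw_gbs(bus_bits: int, mts: int) -> float:
    """Return peak bandwidth in GB/s for a given bus width and MT/s rate."""
    return bus_bits / 8 * mts * 1e6 / 1e9

# Lunar Lake: 128-bit LPDDR5X-8533 -> ~136.5 GB/s, matching the 136 GB/s figure
print(peak_bw_gbs(128, 8533))
# Panther Lake: 128-bit LPDDR5X-7467 -> ~119.5 GB/s
print(peak_bw_gbs(128, 7467))
```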






[Slides: PPT1.jpg, PPT2.jpg, PPT3.jpg]



With Hot Chips 34 starting this week, Intel will unveil technical details of the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.



[Slide: LNL-MX.png]

Attachments

  • PantherLake.png (283.5 KB)
  • LNL.png (881.8 KB)
  • INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg (181.4 KB)
  • Clockspeed.png (611.8 KB)

dullard

Elite Member
May 21, 2001
26,196
4,868
126
I was thinking it's either because there's a massive security hole in HT that Intel doesn't want to admit to right now or they are simply cutting costs by not doing the validation.
What is the latest on the performance impact of the Spectre/Meltdown mitigations? I don't follow that thread regularly. At least for a while at the start, there was a pretty big performance loss from the mitigations when hyperthreading was on.

Hyperthreading is realistically more like a 10% performance boost (ranging roughly from +30% in a few benchmarks to -10% in others, with the typical average close to +10%). And that was before the mitigation performance losses. So how does the current Spectre/Meltdown mitigation performance loss compare to the potential hyperthreading gain? Potentially a wash?
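One back-of-the-envelope way to frame the "wash" question; the percentages below are illustrative assumptions, not measurements:

```python
# Toy model: does an SMT uplift survive a mitigation penalty?
# Both percentages are hypothetical placeholders for illustration.
def net_speedup(smt_gain: float, mitigation_loss: float) -> float:
    """Combine a multiplicative SMT gain with a mitigation slowdown."""
    return (1 + smt_gain) * (1 - mitigation_loss) - 1

# ~10% HT gain vs. ~9% mitigation loss -> essentially a wash
print(f"{net_speedup(0.10, 0.09):+.3f}")   # +0.001
```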
 
Mar 8, 2024
66
199
66
Chalking up the loss of HT to Apple alone is pretty silly; I think a more reasonable explanation has something to do with Intel's utter lack of success at reining in power budgets in a way that scales with performance. If you're a company with a history of utterly catastrophic duds, you're on the back foot against AMD, and you NEED a successful generational launch to stop the coming tide of OEM mutiny, you axe the thing that makes it harder to hit performance goals. I'm not sure how it'll play out in marketing terms, though (people generally like seeing big numbers, because that's more bigger and better and gooder).
 

naukkis

Golden Member
Jun 5, 2002
1,030
854
136
Big cores in hybrid designs are there to offer better per-thread performance. Using HT nullifies that, as HT splits a core's per-thread performance roughly in half. The only beneficial case for HT in those hybrid designs is massively parallelized loads where single-thread performance doesn't matter, and if power is limited, it's more beneficial to assign that power to the efficiency cores anyway for better total performance. Intel was actually slow to drop HT; they should have done it as soon as they went to hybrid designs.
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
Big cores in hybrid designs are there to offer better thread performance.
Those cores are also used in DC, where SMT loss hurts.
Nope. LNL actually competes with the likes of the Snapdragon X series to fill the gap left by ARL.
no? lol.
X Elite is higher power than LNL in like all cases.
For a lot higher nT perf but I digress.
 

Hulk

Diamond Member
Oct 9, 1999
5,385
4,098
136
Anyone have a technical understanding of how the cycles that were unused by the primary thread and diverted to the secondary logical thread are going to be utilized solely for the primary thread? Is the removal of HT just to reduce die area, or is something being changed to minimize lost cycles during thread stalls?

I mean other than Apple doesn't do it.
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
Is the removal of HT just to reduce die area or is something being changed to minimize lost cycles during thread stalls?
Less validation work, and you duplicate a bit fewer structures.
SMT's area/power impact was overall negligible; you mostly pay in validation costs/time.
 

naukkis

Golden Member
Jun 5, 2002
1,030
854
136
Anyone have a technical understanding of how the cycles that were unused by the primary thread and diverted to the secondary logical thread are going to be utilized solely for the primary thread? Is the removal of HT just to reduce die area, or is something being changed to minimize lost cycles during thread stalls?

I mean other than Apple doesn't do it.

Intel HT is symmetric multithreading. Both threads are equal: instructions are fed from the other thread every other clock cycle. There is no primary/secondary thread; both threads execute at a bit more than half the speed at which a single thread would execute alone on that core.
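A toy simulation of that alternating-fetch behavior; the stall probability and the slot-stealing rule are simplifying assumptions for illustration, not a model of any real core:

```python
import random

# Toy model of round-robin SMT fetch: two threads alternate ownership of the
# fetch slot each cycle; if the owner is stalled, the sibling takes the slot.
def smt_fetch(cycles: int, stall_prob: float, seed: int = 42):
    rng = random.Random(seed)
    slots = [0, 0]
    for cycle in range(cycles):
        owner = cycle % 2                # threads alternate each cycle
        if rng.random() < stall_prob:    # owner stalled (e.g. cache miss)
            owner ^= 1                   # sibling reclaims the wasted slot
        slots[owner] += 1
    return [s / cycles for s in slots]

per_thread = smt_fetch(1_000_000, stall_prob=0.2)
alone = 1 - 0.2   # a lone thread simply wastes its stalled cycles
print(per_thread)          # each thread gets ~50% of the fetch slots
print(0.5 / alone)         # ~0.625: "a bit more than half" of lone-thread speed
```

The last line is the point: each thread gets about half the slots, but because a lone thread would have wasted its stall cycles anyway, half the slots is more than half the lone-thread speed.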
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
In those cases where single-thread speed doesn't matter, they should drop the big cores and just use more E-cores. Actually, Intel is doing that right now.
Atoms have a rather castrated feature set and middling perf in a ton of workloads, and they won't replace mainline Xeons that way.
You still need big cores in many, many places. The loss of SMT hurts there.
 
  • Like
Reactions: Tlh97 and Kepler_L2

dullard

Elite Member
May 21, 2001
26,196
4,868
126
Less validation work, and you duplicate a bit fewer structures.
SMT's area/power impact was overall negligible; you mostly pay in validation costs/time.
Don't forget the non-negligible part: the cache. When two threads share a core's cache, each thread gets half as much cache and cache thrashing is much more likely. That means either significantly less performance per thread, or you need significantly more cache than you would otherwise (more area, more expense, and more cache latency). Hyperthreading can be a nice performance boost in some cases, but it comes with significant drawbacks.
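A minimal LRU simulation of that cache-sharing effect; the cache size and access patterns are made up purely for illustration:

```python
from collections import OrderedDict

def hit_rate(accesses, cache_size):
    """Simulate an LRU cache and return the hit rate."""
    cache, hits = OrderedDict(), 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)          # refresh recency on a hit
        else:
            cache[addr] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)    # evict least-recently-used line
    return hits / len(accesses)

# One thread looping over a 64-line working set fits a 64-line cache...
one = [i % 64 for i in range(10_000)]
# ...but two such threads interleaved in the same cache thrash each other:
# 128 hot lines cycling through a 64-line LRU cache never hit.
two = [x for i in range(10_000) for x in (i % 64, 1000 + i % 64)]

print(hit_rate(one, 64))   # ~0.99
print(hit_rate(two, 64))   # 0.0
```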
 

adroc_thurston

Diamond Member
Jul 2, 2023
8,550
11,282
106
Don't forget the non-negligible part: the cache. When two threads share a core's cache, each thread gets half as much cache and cache thrashing is much more likely. That means either significantly less performance per thread, or you need significantly more cache than you would otherwise (more area, more expense, and more cache latency). Hyperthreading can be a nice performance boost in some cases, but it comes with significant drawbacks.
SMT-friendly workloads nuke your caches anyway (stuff like server-side Java and other JITs, etc.).
The real SMT drawbacks are security (cloud guys are anal about that) and validation time.
 
Jul 27, 2020
28,174
19,217
146
I have been thinking about the rumored removal of HT from ARL
While HT may seem like an unnecessary headache on desktop for Intel (and maybe AMD), in mobile CPUs Intel may keep HT alive for a few more years, simply because it's the cheapest way to advertise more cores to consumers without incurring the significant area penalty of replacing the HT virtual cores with physical efficiency cores. I recently used a Core i5-1235U Dell Inspiron laptop and its BIOS had no setting to turn off HT, which I found pretty weird. It's like Intel doesn't want the majority of its users working without HT.

Another factor is the core occupancy determination by Intel Thread Director. Suppose Windows is using a P-core for something, so that core is "awake", and a lightweight thread needs to do something at the same time. Does the ITD wake up a sleeping efficiency core, or does it allocate the virtual HT core of the active P-core to that thread? I'm thinking the latter would be a more efficient use of the available resources, and it could even save time if the lightweight thread finishes its task in less time than it takes to context switch and wake up an efficiency core.
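That trade-off can be sketched as a toy placement heuristic; the function, thresholds and costs below are entirely hypothetical, since Thread Director's actual heuristics aren't public:

```python
# Hypothetical sketch: where should a short, low-priority task run?
# All names and cost numbers here are invented for illustration.
def place_light_thread(task_us: float, wake_e_core_us: float = 200.0) -> str:
    """Pick a placement for a lightweight task on a hybrid CPU."""
    # If the task would finish before an E-core even wakes up, the HT
    # sibling of an already-awake P-core is the cheaper choice.
    if task_us < wake_e_core_us:
        return "HT sibling of active P-core"
    return "wake E-core"

print(place_light_thread(50))     # HT sibling of active P-core
print(place_light_thread(5000))   # wake E-core
```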

Then there's the rumor about rentable units. Let's suppose they really do increase performance, similar to or even better than HT. But because adjacent idle cores' resources get rented out, those cores will be woken up more often, so power efficiency will take some hit. What if ARL desktop has rentable units and no HT, while ARL mobile has HT and no rentable units? If the silicon area dedicated to rentable-unit functionality is similar to HT's requirements, Intel could put both on the compute die and enable one or the other depending on the use case and targeted market.
 

Doug S

Diamond Member
Feb 8, 2020
3,832
6,767
136
The reason Intel did HT was to increase MT throughput; they were pretty clear about that when they introduced it. They don't need it anymore with their E-cores.

Look at it this way. MT throughput is always going to be power limited - you can't run every core at its max frequency in a CPU with a lot of cores. So you (or rather Intel's chip designers) have to ask yourself, where do I get the best increase in performance for each additional watt of power I can pump into the chip?

I'll bet Intel's designers did the math/simulations/benchmarks and determined that if they disabled HT and used the power saved to spin up a few more E-cores, they got better throughput. What's more, it wouldn't suffer from the vagaries of HT performance, where on average it helps but, across the wide world of MT workloads, there are some where it helps more and some where it actually HURTS. One nice advantage of an extra E-core is that it is almost impossible to come up with a benchmark where it will hurt. Maybe it won't help (i.e. you're maxing out memory bandwidth), but you won't see benchmarks where it makes things worse like you do with HT.
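That perf-per-watt argument can be put into a toy model; every number below is an invented placeholder, not a measured or leaked value:

```python
# Toy model: spend a fixed power budget on HT or on extra E-cores.
# All figures are made-up placeholders for illustration only.
P_CORES, P_CORE_PERF = 8, 1.0            # P-core count, relative perf each
HT_EXTRA_W, HT_GAIN = 0.5, 0.10          # HT: +10% MT throughput for +0.5 W/core
E_CORE_W, E_CORE_PERF = 1.5, 0.55        # one E-core: 1.5 W, 0.55x P-core perf

budget = P_CORES * HT_EXTRA_W            # power freed by disabling HT (4.0 W)

with_ht = P_CORES * P_CORE_PERF * (1 + HT_GAIN)              # 8.8
with_e = P_CORES * P_CORE_PERF + (budget // E_CORE_W) * E_CORE_PERF  # 9.1

# Under these assumptions, two extra E-cores beat HT on total throughput.
print(with_ht, with_e)
```

Flip the placeholder numbers (say, cheaper HT or hungrier E-cores) and the conclusion flips too, which is exactly why this comes down to Intel's internal simulations rather than anything outsiders can settle.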
 

Saylick

Diamond Member
Sep 10, 2012
4,121
9,641
136
Where can I read more about that whole rentable units concept? Seems to be wild.

Probably something similar to this. We called it "reverse Hyperthreading".
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106

Probably something similar to this. We called it "reverse Hyperthreading".

That inverse hyperthreading is wild stuff.

If Intel can get that working, they'll become the undisputed king of Single Thread performance.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,135
4,003
136
In those cases where single-thread speed doesn't matter, they should drop the big cores and just use more E-cores. Actually, Intel is doing that right now.
Except for all those workloads that have high latency but also need lots of brawn, like relational DBs, or generally anything in the server space that deals with I/O.

I can't wait for 1000s of terribly performing Kubernetes containers running on 1000s of average-performing cores. But I'm cloud scale!!!!! 2024 IT is lit.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Except for all those workloads that have high latency but also need lots of brawn, like relational DBs, or generally anything in the server space that deals with I/O.

I can't wait for 1000s of terribly performing Kubernetes containers running on 1000s of average-performing cores. But I'm cloud scale!!!!! 2024 IT is lit.
Isn't Skymont targeting Golden Cove level of performance?

Hardly average performing if that is indeed the case.