Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads

Tigerick · Aug 22, 2022

Wildcat Lake (WCL) Preliminary Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing ADL-N. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the Intel 18A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. The two are connected through UCIe, not D2D; a first for Intel. Expect a launch in Q2/Computex 2026. In case people don't remember Alder Lake-N, I have created a table below to compare the detailed specs of ADL-N and WCL. Just for fun, I am throwing in LNL and the upcoming MediaTek D9500 SoC.

	Intel Alder Lake - N	Intel Wildcat Lake	Intel Lunar Lake	Mediatek D9500
Launch Date	Q1-2023	Q2-2026 ?	Q3-2024	Q3-2025
Model	Intel N300	?	Core Ultra 7 268V	Dimensity 9500 5G
Dies	2	2	2	1
Node	Intel 7 + ?	Intel 18A + TSMC N6	TSMC N3B + N6	TSMC N3P

CPU	8 E-cores	2 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores	C1 1+3+4
Threads	8	6	8	8
Max Clock	3.8 GHz	?	5 GHz
L3 Cache	6 MB	?	12 MB
TDP	7 W	Fanless ?	17 W	Fanless

Memory	64-bit LPDDR5-4800	64-bit LPDDR5-6800 ?	128-bit LPDDR5X-8533	64-bit LPDDR5X-10667
Size	16 GB	?	32 GB	24 GB ?
Bandwidth		~ 55 GB/s	136 GB/s	85.3 GB/s

GPU	UHD Graphics		Arc 140V	G1 Ultra
EU / Xe	32 EU	2 Xe	8 Xe	12
Max Clock	1.25 GHz		2 GHz

NPU	NA	18 TOPS	48 TOPS	100 TOPS ?
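
The Bandwidth row follows from the Memory row: theoretical peak bandwidth is bus width in bytes times the transfer rate. Here is a minimal Python sketch of that arithmetic, assuming peak figures in decimal GB/s; the function name is made up for illustration and the WCL entry is only the rumored configuration from the table.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s: bytes per transfer x MT/s / 1000."""
    return (bus_width_bits / 8) * transfer_rate_mts / 1000


# (bus width in bits, transfer rate in MT/s), taken from the Memory row of the table.
configs = {
    "ADL-N": (64, 4800),
    "WCL (preliminary)": (64, 6800),
    "LNL": (128, 8533),
    "D9500": (64, 10667),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

Real sustained bandwidth will be lower than these peak numbers.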

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.

511 · Aug 16, 2024

jdubs03 said:
The post below references the QS data that we’ve been talking about for the past few weeks. He followed up with a post saying he thinks there will be a 3% single-core uplift (in GB6).

https://twitter.com/x/status/1824268748456595594

4.6 GHz Skymont E-cores. Someone should do a 16T SKT vs 8C/16T RPL comparison at launch, at iso and non-iso frequency.

dttprofessor · Aug 16, 2024

Wolverine2349 said:
How good will latency be on Arrow Lake?

I know it's now on a tile instead of a monolithic die, but isn't it monolithic on a tile instead of 10nm? Like

So wouldn't it still be much better than AMD's chiplets, because it's a ring bus on the tile with all cores on the ring bus, and Intel is designing it like that? So could latency be as good as on the 10nm Alder Lake and Raptor Lake?

BEOL OK
NOC ?

TESKATLIPOKA · Aug 16, 2024

OriAr said:
Core Ultra 3 is expected to have 4P + 4E cores.
Should be a huge upgrade in MT perf over current gen i3.

Considering both RPL and MTL have only 2P+4E, it's not really surprising that nT will be better.
BTW, what do you consider a huge upgrade? +30% or more in nT?

Here is a table where I compare mobile MTL vs ARL. Not accurate, just an example of how much better it could be.
MTL P-core: 100 points
MTL P-core HT: 25 points
MTL E-core: 65 points
ARL P-core: 110 points
ARL E-core: 90 points

Thread count	Ultra 3 105UL 2P+4E	ARL 4P+4E	Difference in %
1	100	110	+10%
2	200	220	+10%
3	265	330	+24.5%
4	330	440	+33.3%
5	395	530	+34.2%
6	460	620	+34.8%
7	485	710	+46.4%
8	510	800	+56.9%
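
The table above is just a running sum of the per-thread points. Here is a short Python sketch that rebuilds it, assuming threads fill P-cores first, then E-cores, then HT siblings (that order is inferred from the table, not stated in the post).

```python
from itertools import accumulate

# Per-thread points from the post; the fill order (P-cores, then E-cores,
# then HT siblings) is an assumption inferred from the table.
MTL_2P_4E = [100] * 2 + [65] * 4 + [25] * 2
ARL_4P_4E = [110] * 4 + [90] * 4

mtl_totals = list(accumulate(MTL_2P_4E))
arl_totals = list(accumulate(ARL_4P_4E))

for threads, (mtl, arl) in enumerate(zip(mtl_totals, arl_totals), start=1):
    gain = (arl / mtl - 1) * 100
    print(f"{threads}\t{mtl}\t{arl}\t{gain:+.1f}%")
```

Changing the HT points from 25 to a lower value only moves the last two rows.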

cebri1 · Aug 16, 2024

3% seems a bit too low based on the information we have. Maybe there are latency issues? If it's a 14% IPC increase over RWC at 0.3 GHz lower clocks... it doesn't add up.

Even if it only had a 10% IPC increase over RPC, 1T should still show about a 5% ST improvement.
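
The estimate above can be checked with a simple IPC x clock model. This is a rough Python sketch, not a benchmark: the 6.0 GHz and 5.7 GHz clocks are assumptions consistent with the "0.3 GHz lower" remark, and the function name is made up for illustration.

```python
def st_uplift(ipc_gain: float, new_clock_ghz: float, old_clock_ghz: float) -> float:
    """Fractional single-thread uplift, assuming performance = IPC x clock."""
    return (1 + ipc_gain) * new_clock_ghz / old_clock_ghz - 1


# Assumed clocks: 6.0 GHz for the old part, 5.7 GHz for the new one.
for ipc_gain in (0.14, 0.10):
    print(f"IPC +{ipc_gain:.0%}: ST {st_uplift(ipc_gain, 5.7, 6.0):+.1%}")
```

Under these assumptions either IPC figure lands above the rumored 3%, which is the poster's point.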

MT is going to murder the 14900K.

dullard · Aug 16, 2024

TESKATLIPOKA said:
MTL P-core: 100 points
MTL P-core HT: 25 points

I don't know specifically about the MTL 105UL, but has any recent Intel chip actually gotten even close to 25% from hyperthreading on average? Maybe on a very specific benchmark or two, but that is far from the norm.

Intel's hyperthreading never gained as much as AMD's version did. I think the most generous assumption would be 20%, and that isn't often the case. Here is an example (admittedly gaming focused) where HT on lowers the average performance; if you go a few seconds further into the video, the individual differences range from -8.6% to +3.7%. Never close to 25%.

Basically doing a generic table with 25% just seems to only be reasonable for the few niche areas where HT does the very, very best and that is generally in artificial loads that keep all threads active at 100% at all times.

Abwx · Aug 16, 2024

dullard said:
Intel's hyperthreading never gained as much as AMD's version did.

That's not true. At the time, Hardware.fr made a comparison and the difference wasn't huge, and since those days Intel has improved its SMT capability; they even talked of up to 30%.

22.47% for Intel, and 25.6% for AMD

Impact du SMT/HT - AMD Ryzen 7 1800X en test, le retour d'AMD ? - HardWare.fr

After 4 years of development, AMD is finally launching the Ryzen 7 1800X, the spearhead of its new lineup meant to catch up with or even surpass Intel's latest processors. Has it delivered?

www.hardware.fr

Wolverine2349 · Aug 16, 2024

How will big.little hybrid work with Arrow Lake?

Because Skymont is so much better than Gracemont, will all the big.little hybrid scheduling issues be completely solved, so that all software works on hybrid Arrow Lake as perfectly as it would on a homogeneous design?

Or is hybrid still a potential issue for some software, regardless of the dramatically improved E-cores?

gdansk · Aug 16, 2024

Realistically, assuming the P-core is 1.14x faster and runs at 5.7 GHz while the E-core maxes out at 4.6 GHz, any high-priority thread scheduled on the E cluster runs at ~0.7x the performance it would have on a P-core.

So it won't make scheduling any easier but does reduce the penalty when it messes up.
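
The ~0.7x figure can be reproduced like this. It is a sketch that reads "1.14x faster" as the P-core's per-clock advantage over the E-core, which is my interpretation of the post; the function name is hypothetical.

```python
def e_core_relative_perf(p_ipc_ratio: float, p_clock_ghz: float, e_clock_ghz: float) -> float:
    """E-core throughput as a fraction of P-core throughput (IPC x clock model)."""
    return e_clock_ghz / (p_ipc_ratio * p_clock_ghz)


# 1.14x per-clock advantage, 5.7 GHz P-core, 4.6 GHz E-core (numbers from the post).
print(f"{e_core_relative_perf(1.14, 5.7, 4.6):.2f}x")
```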

dullard · Aug 16, 2024

Abwx said:
That's not true. At the time, Hardware.fr

Impact du SMT/HT - AMD Ryzen 7 1800X en test, le retour d'AMD ? - HardWare.fr

After 4 years of development, AMD is finally launching the Ryzen 7 1800X, the spearhead of its new lineup meant to catch up with or even surpass Intel's latest processors. Has it delivered?

www.hardware.fr

You linked a situation where Intel's HT gained less (22.5%) than AMD's SMT (25.6% to 28% if core parking is disabled). That just supports my statement.

Plus, I would hope we look at something more recent than HEDT Broadwell-E for debates on how Meteor Lake's hyperthreading performs. As time went on, for newer chips, HT performed less and less and less. More cores meant less reason for HT, better use of the core meant fewer idle resources, and more power limit stretching just meant that HT's transistor flipping added more heat and more thermal throttling.

TESKATLIPOKA · Aug 16, 2024

cebri1 said:
MT is going to murder the 14900K.

It doesn't look like that, or rather it depends on thread count and on whether HT is usable in that app or not.

I made another table for desktop, tried to be more accurate, but this is still just a guess!
14900KS vs QS ARL in CB R23
RPL P-core: 197 points (5.9GHz all-core Turbo, 50% higher IPC than E-core)
RPL P-core HT: 49 points (1/4 of P-core)
RPL E-core: 100 points (4.5GHz all-core Turbo)
ARL P-core: 206 points (5.4GHz all-core Turbo, 14% higher IPC than RPL P-core)
ARL E-core: 148 points (4.6GHz all-core Turbo, 45% higher IPC than RPL E-core)

Thread count	14900KS 8P+16E	QS ARL 8P+16E	Difference in %
1	197	206	+4.6%
2	394	412	+4.6%
3	591	618	+4.6%
4	788	824	+4.6%
5	985	1030	+4.6%
6	1182	1236	+4.6%
7	1379	1442	+4.6%
8	1576	1648	+4.6%

9	1676	1796	+7.2%
10	1776	1944	+9.5%
11	1876	2092	+11.5%
12	1976	2240	+13.4%
13	2076	2388	+15%
14	2176	2536	+16.5%
15	2276	2684	+17.9%
16	2376	2832	+19.2%

17	2476	2980	+20.4%
18	2576	3128	+21.4%
19	2676	3276	+22.4%
20	2776	3424	+23.3%
21	2876	3572	+24.2%
22	2976	3720	+25%
23	3076	3868	+25.7%
24	3176	4016	+26.4%

25	3225	4016	+24.5%
26	3274	4016	+22.7%
27	3323	4016	+20.9%
28	3372	4016	+19.1%
29	3421	4016	+17.4%
30	3470	4016	+15.7%
31	3519	4016	+14.1%
32	3568	4016	+12.6%

Of course the biggest difference will be at 24 threads, where I got +26.4%. Don't forget this table is just my guesswork.
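
For anyone who wants to play with the inputs, here is a Python sketch that derives the per-core points from the clocks and IPC ratios stated in the post and finds the thread count with the largest gap. It assumes threads fill P-cores, then E-cores, then HT siblings, and that ARL gains nothing past 24 threads because it has no HT; all names are made up for illustration.

```python
from itertools import accumulate

RPL_E = 100  # baseline from the post: RPL E-core at 4.5 GHz all-core turbo

rpl_p = round(RPL_E * 5.9 / 4.5 * 1.50)  # 5.9 GHz, 50% higher IPC than the E-core
rpl_ht = round(rpl_p / 4)                # HT sibling worth 1/4 of a P-core
arl_p = round(rpl_p * 5.4 / 5.9 * 1.14)  # 5.4 GHz, 14% higher IPC than RPL P-core
arl_e = round(RPL_E * 4.6 / 4.5 * 1.45)  # 4.6 GHz, 45% higher IPC than RPL E-core

# Assumed fill order: 8 P-cores, 16 E-cores, then 8 HT siblings (RPL only).
rpl_totals = list(accumulate([rpl_p] * 8 + [RPL_E] * 16 + [rpl_ht] * 8))
arl_totals = list(accumulate([arl_p] * 8 + [arl_e] * 16 + [0] * 8))

gains = [(a / r - 1) * 100 for r, a in zip(rpl_totals, arl_totals)]
best_threads = gains.index(max(gains)) + 1
print(f"Largest gap at {best_threads} threads: {max(gains):+.1f}%")
```

The peak sits at the last thread count where ARL still adds a core while RPL has not yet started using HT.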

Abwx · Aug 16, 2024

dullard said:
You linked a situation where Intel's HT gained less than AMD's SMT. That just supports my statement.

Plus, I would hope we look at something more recent than HEDT Broadwell-E for debates on how Meteor Lake's hyperthreading performs.

A 3% difference in gain at the time is marginal, and as I said it has been improved in recent uarchs; IIRC it's 31% in Cinebench for Intel and something like 32% for AMD.

For an 8C/16T part this amounts to 2 extra cores at worst and as much as 2.5 cores at best.

The reason to drop SMT lies elsewhere: they thought that each SMT thread could be replaced by a small core that has much more throughput. On paper that makes sense, but in practice it's not as straightforward, because a small core costs more in silicon and power than three pieces of HT circuitry for 3 P-cores.

dullard · Aug 16, 2024

Wolverine2349 said:
How will big.little hybrid work with Arrow Lake?

Because Skymont is so much better than Gracemont, will all the big.little hybrid scheduling issues be completely solved, so that all software works on hybrid Arrow Lake as perfectly as it would on a homogeneous design?

Or is hybrid still a potential issue for some software, regardless of the dramatically improved E-cores?

With Meteor Lake, the scheduler had Big, Little, Little-LPE, and HT on Big cores. Four totally different ways that adding a thread would impact performance. And four drastically different performance levels between them. Probably the most difficult to schedule correctly is when the Big core is impacted to add in an HT thread and suddenly needs to share cache with that other thread.

With Arrow Lake, it is a bit easier. (1) There are only three types: Big, Little, Little-LPE. (2) The performance delta between types of cores is less. (3) Resources on the Big cores are more reliably known than when another thread is added with HT (the core doesn't have to suddenly share cache or swap back and forth between processing threads).

Plus, hopefully more and more software keeps getting written to identify the proper core for each thread.

dullard · Aug 16, 2024

Abwx said:
A 3% difference in gain at the time is marginal, and as I said it has been improved in recent uarchs; IIRC it's 31% in Cinebench for Intel and something like 32% for AMD.

Look at averages; don't cherry-pick single software benchmarks written to keep all threads at 100% at all times. The average HT performance change is way less (closer to 10% to 15%, depending on the software in the average) for Intel. There is not so much of a drop for AMD in average SMT on/off.

Or wait 2.5 weeks for Lunar Lake to come out. That might be better than speculating. Hopefully we get some reviews that cover the differences.

Wolverine2349 · Aug 16, 2024

dullard said:
With Meteor Lake, the scheduler had Big, Little, Little-LPE, and HT on Big cores. Four totally different ways that adding a thread would impact performance. And four drastically different performance levels between them. Probably the most difficult to schedule correctly is when the Big core is impacted to add in an HT thread and suddenly needs to share cache with that other thread.

With Arrow Lake, it is a bit easier. (1) There are only three types: Big, Little, Little-LPE. (2) The performance delta between types of cores is less. (3) Resources on the Big cores are more reliably known than when another thread is added with HT (the core doesn't have to suddenly share cache or swap back and forth between processing threads).

Plus, hopefully more and more software keeps getting written to identify the proper core for each thread.

Though on desktop Arrow Lake, there are no LPE cores, right?

All P- and E-cores are on the same die, right, so there are only 2 types of places to put threads?

dullard · Aug 16, 2024

Wolverine2349 said:
Though on desktop Arrow Lake, there are no LPE cores, right?

All P- and E-cores are on the same die, right, so there are only 2 types of places to put threads?

That would make scheduling even easier with just 2 types of cores--especially without as much performance delta between them. I have seen conflicting rumors about whether or not Arrow Lake reuses Meteor Lake's tile with the LPE cores though. At this point I don't know--hard to keep the rumors straight.

Abwx · Aug 16, 2024

dullard said:
Look at averages, not cherry pick single software benchmarks written to put all threads at 100% at all times. The average HT performance change is way less (closer to 10% to 15% depending on software in the average) for Intel. Not so much for AMD.

At this rate you don't even need to add more cores, because the point is precisely to increase the max throughput; otherwise why would they implement more cores?

So the apps that benefit substantially from SMT in the Hardware.fr test are 7-Zip, WinRAR, DxO Optics, Visual Studio, GCC, Stockfish, Komodo, x264 encoding, and 3ds Max with Mental Ray and V-Ray as plugins; that is, almost all tested apps.

That's a lot, and as you can see the list doesn't even include Cinema 4D or many other programs that are known to benefit from SMT. Otherwise the 7950X would trail a 14900K, if not a 14700K, in almost all apps.

dullard · Aug 16, 2024

Abwx said:
So the apps that benefit substantially from SMT in the Hardware.fr test are 7-Zip, WinRAR, DxO Optics, Visual Studio, GCC, Stockfish, Komodo, x264 encoding, and 3ds Max with Mental Ray and V-Ray as plugins; that is, almost all tested apps.

Again, I wouldn't use that Hardware.fr benchmark link for conclusions of modern day CPU comparisons. The article is over 7.5 years old on ~8 year old CPUs.

Here are SMT gains over the years. To make it as lenient towards your argument as possible, I'll exclude gaming results which show SMT declines in more cases than not.

10 years ago: Intel HT is +23.5%, AMD SMT is +55%: https://blog.stuffedcow.net/2014/01/amd-modules-hyperthreading/
7.5 years ago: Intel HT is +22.5%, AMD SMT is +25.6% (or +28% with core parking off): https://www.hardware.fr/articles/956-7/impact-smt-ht.html
7 years ago: Intel HT is +20%, AMD SMT +28%: https://www.anandtech.com/show/11544/intel-skylake-ep-vs-amd-epyc-7000-cpu-battle-of-the-decade/15
5 years ago: Regarding security mitigations that destroyed Intel's HT performance gains. Phoronix's recent testing of all mitigations in Linux found the fixes reduce Intel's performance by 16% (on average) with Hyper-Threading enabled, while AMD only suffers a 3% average loss. https://www.tomshardware.com/news/intel-amd-mitigations-performance-impact,39381.html
4 years ago: AMD SMT is +20.6% on average. Only 4 of the 16 tests met your 32% number. https://www.anandtech.com/show/1626...-multithreading-on-zen-3-and-amd-ryzen-5000/2
2 weeks ago: AMD SMT is +18% on average. https://www.phoronix.com/review/amd-ryzen-zen5-smt/8
2 weeks ago: AMD SMT is +16% on average (I excluded the single threaded benchmarks and the games to make SMT as good as I possibly could). https://www.techpowerup.com/review/amd-ryzen-9-9700x-performance-smt-disabled/2.html

I could go on and on with links. In the average software, AMD's SMT gained more than Intel's HT. As the years go on, SMT's improvement is getting less and less on average. Then adding to that decline, Spectre / Meltdown mitigations destroyed much of the HT gains that Intel had. And in no average was Intel getting +25% as used in the table that sparked my discussion. The only average where AMD got at least +32% as claimed in this thread above was from 10 years ago.

TESKATLIPOKA · Aug 16, 2024

dullard said:
I don't know specifically about the MTL 105UL, but has any recent Intel chip actually gotten even close to 25% from hyperthreading on average? Maybe on a very specific benchmark or two, but that is far from the norm.

Let's say It was in CB R23.

inf64 · Aug 16, 2024

I know that this is about ArrowLake and impact of SMT in benchmarks, but for Zen5/Zen5c the performance impact on Linux is 18% (geomean of 57 workloads) with no measurable impact on power consumption:

SMT Performance Benchmarks Continue To Show Benefit With AMD Zen 5/5C Review - Phoronix

www.phoronix.com

"Simultaneous Multi-Threading (SMT) still proved very much useful for AMD Strix Point with the new Zen 5/5C cores. Contrary to Intel abandoning Hyper Threading (HT/SMT) with Lunar Lake, SMT was providing measurable performance gains across a wide mix of multi-threaded workloads. The impact of SMT varied but when taking the geometric mean of 57 benchmarks in full, SMT on the Ryzen AI 9 HX 370 was around 1.18x the performance of running this 12-core Zen 5/5C laptop SoC with SMT disabled.

Equally important is that leaving SMT enabled on this AMD Ryzen AI 300 series laptop did not negatively impact the CPU power consumption. Overall the CPU package power consumption averaged out to being the same across both runs."
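
Since averages keep coming up in this thread, here is a small Python sketch of how a geometric mean over per-benchmark SMT on/off ratios is computed. The ratios below are hypothetical placeholders, not numbers from the Phoronix review or any other source.

```python
from statistics import geometric_mean


def geomean_speedup(ratios):
    """Geometric mean of per-benchmark (SMT on / SMT off) performance ratios."""
    return geometric_mean(ratios)


# Hypothetical placeholder ratios, NOT measured data from any review.
example_ratios = [1.00, 1.05, 1.20, 1.35, 0.95]
print(f"{geomean_speedup(example_ratios):.2f}x")
```

The geometric mean is the usual choice for ratios because a 2x gain in one test and a 0.5x loss in another cancel out to 1.0x, which an arithmetic mean would not show.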

IEC · Aug 16, 2024

I'm holding out hope that Arrow Lake is competitive (and fixes Intel 13th/14th gen stability issues) so I have an excuse to build or upgrade another system in the next six months.

On the other hand, the realist in me expects marginal gains overall with some regressions. There are just too many known items working against ARL:
1) Clock speed regression (even vs most optimistic "leaks")
2) Tile architecture = there will be some performance regressions due to this (latency and power penalties vs monolithic)
3) Lack of SMT
4) New platform/motherboards - I expect some things to be broken at launch judging from recent Intel and non-Intel launches

Not too much longer til we know for sure.

inf64 · Aug 16, 2024

IEC said:
I'm holding out hope that Arrow Lake is competitive (and fixes Intel 13th/14th gen stability issues) so I have an excuse to build or upgrade another system in the next six months.

On the other hand, the realist in me expects marginal gains overall with some regressions. There are just too many known items working against ARL:
1) Clock speed regression (even vs most optimistic "leaks")
2) Tile architecture = there will be some performance regressions due to this (latency and power penalties vs monolithic)
3) Lack of SMT
4) New platform/motherboards - I expect some things to be broken at launch judging from recent Intel and non-Intel launches

Not too much longer til we know for sure.

Gaming performance will be hard to predict due to the tile architecture and its possible impact on latency. On the other hand, ST and MT performance should be fairly predictable, with some caveats (average IPC boost of 14% for the P-core and 35% for the E-core, all-core boost of 5.4 GHz/4.7 GHz for ARL and 5.6 GHz/4.4 GHz for RPL; SMT adds 20% performance on average for RPL).

ST

RPL 14900K - relative ST performance 1.0 with 6 GHz boost
ARL 285K (5.7 GHz ST boost) - relative ST performance: 5.7/6 x 1.14 = 1.083, or around 8% faster

MT
RPL 14900K - relative MT performance with 5.6 GHz/4.4 GHz all-core P/E boost: 8 x 1.2 + 16 = 25.6
ARL 285K - relative MT performance with 5.4 GHz/4.7 GHz all-core P/E boost and no SMT: (8 x 5.4/5.6 x 1.14) + (16 x 1.35 x 4.7/4.4) = 31.87 => around 25% faster
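
The MT arithmetic above can be written out as a small Python sketch so the inputs are easy to vary. It follows the post's simplification of counting each RPL P-core and each RPL E-core as one unit at its own all-core clock; function and constant names are made up for illustration.

```python
P_CORES, E_CORES = 8, 16


def rpl_mt(smt_gain: float = 0.20) -> float:
    """RPL baseline: each core counts as 1 unit, P-cores get the SMT bonus."""
    return P_CORES * (1 + smt_gain) + E_CORES


def arl_mt(p_ipc: float = 1.14, e_ipc: float = 1.35) -> float:
    """ARL relative to RPL units: scale by clock ratio and IPC gain, no SMT."""
    p_part = P_CORES * (5.4 / 5.6) * p_ipc  # P-core all-core clocks: ARL 5.4 vs RPL 5.6 GHz
    e_part = E_CORES * (4.7 / 4.4) * e_ipc  # E-core all-core clocks: ARL 4.7 vs RPL 4.4 GHz
    return p_part + e_part


print(f"RPL {rpl_mt():.2f}, ARL {arl_mt():.2f}, ratio {arl_mt() / rpl_mt():.3f}")
```

For example, calling rpl_mt(0.10) shows how the gap widens if HT is assumed to add only 10%.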

desrever · Aug 16, 2024

The 14% IPC improvement isn't over RPL though. Why are people still using it for estimates?

gdansk · Aug 16, 2024

desrever said:
The 14% IPC improvement isn't over RPL though. Why are people still using it for estimates?

What is a better estimate?

desrever · Aug 16, 2024

gdansk said:
What is a better estimate?

compare vs MTL then to RPL

gdansk · Aug 16, 2024

desrever said:
compare vs MTL then to RPL

Which is?
The results we've seen so far match 1.14x pretty closely (at least in GB6), so I was wondering what result you estimate.
