Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads

Tigerick · Aug 22, 2022

Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D; a first from Intel. Expected to launch in Q1 2026.

	Intel Raptor Lake U	Intel Wildcat Lake 15W?	Intel Lunar Lake	Intel Panther Lake 4+0+4
Launch Date	Q1-2024	Q2-2026	Q3-2024	Q1-2026
Model	Intel 150U	Intel Core 7	Core Ultra 7 268V	Core Ultra 7 365
Dies	2	2	2	3
Node	Intel 7 + ?	Intel 18-A + TSMC N6	TSMC N3B + N6	Intel 18-A + Intel 3 + TSMC N6

CPU	2 P-core + 8 E-cores	2 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores
Threads	12	6	8	8
Max Clock	5.4 GHz	?	5 GHz	4.8 GHz
L3 Cache	12 MB		12 MB	12 MB
TDP	15 - 55 W	15 W ?	17 - 37 W	25 - 55 W

Memory	128-bit LPDDR5-5200	64-bit LPDDR5	128-bit LPDDR5x-8533	128-bit LPDDR5x-7467
Size	96 GB		32 GB	128 GB
Bandwidth			136 GB/s

GPU	Intel Graphics	Intel Graphics	Arc 140V	Intel Graphics
RT	No	No	YES	YES
EU / Xe	96 EU	2 Xe	8 Xe	4 Xe
Max Clock	1.3 GHz	?	2 GHz	2.5 GHz

NPU	GNA 3.0	18 TOPS	48 TOPS	49 TOPS
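The one Bandwidth figure in the table can be sanity-checked from the Memory row. A minimal sketch, assuming theoretical peak bandwidth is simply bus width in bytes times transfer rate in MT/s (the function name is made up for illustration, and Wildcat Lake is left out because no memory speed is listed for it):

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal), ignoring real-world efficiency."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000

# (bus width in bits, MT/s) taken from the Memory row of the table
configs = {
    "Raptor Lake U": (128, 5200),
    "Lunar Lake": (128, 8533),
    "Panther Lake 4+0+4": (128, 7467),
}

for name, (bits, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(bits, rate):.1f} GB/s")
```

The Lunar Lake result (about 136.5 GB/s) lines up with the 136 GB/s listed in the table. The other two are derived the same way and are theoretical peaks, not measured numbers.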

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what the Intel roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.

SiliconFly · Nov 29, 2024

mzocyteae · Nov 29, 2024

LightningZ71 said:
The 2+8 ADL p and u chips do just fine in most general use cases. 2 P cores is plenty for web browsing and light usage. It's not amazing, but, having used them for work, it's not hampering me in any way.

Browsers have lots of threads/processes and workloads are irregular.
2P and 0P will make little or no difference, unless there is some steady workload (usually js + webgl, but then 2P is probably insufficient).

mzocyteae · Nov 29, 2024

coercitiv said:
Falling behind at high perf levels is normal, as currently they are built with a different performance target in mind (both core layout and cache structure are aimed at PPA). You can think of this as the reverse of what happened to Zen and Zen C, when optimized for PPA the Zen core loses efficiency at high clocks (and clocks lower too). So in theory, if you wanted to make a P core out of Skymont, you would optimize the layout to ensure better voltage scaling at 4-5Ghz and give it access to proper L2/L3. The core would be bigger, use a bit more power at lower perf levels, but would scale much better at high clocks.

I think folks in the forum should talk less about replacing P cores with E cores, and more about replacing Cove with Mont (arch families). P and E are roles, and they can even be played by the same arch with some tweaks (as shown by AMD). A real world product based on the pedigree of the Mont cores would benefit from a properly planned architecture, they would have this target in mind as they plan the core. Ideally we would want a core close in size to the latest Coves, but with performance to justify the area.

Coves aren't that big -- any core will be similarly big with a big cache (under similar process/library).
Maybe Intel should consider Telum 2's shared L2/3 approach (plus vertical cache), which looks pretty good on numbers.

coercitiv · Nov 29, 2024

mzocyteae said:
Coves aren't that big

Coves are big relative to Monts, and they're big relative to the performance they offer.

SiliconFly · Nov 29, 2024

mzocyteae · Nov 29, 2024

SiliconFly said:
Actually, coves are big. They're super big compared to M3, Skymont, Zen 5, Zen 5C, etc.

A few dozen percent more logic+L1 is not super big, and the difference is diluted in a product by cache, interconnect (server) or other tiles (client).

SiliconFly · Nov 29, 2024

LightningZ71 · Nov 29, 2024

While the numbers are empirically measured, you aren't necessarily comparing apples to apples. The Lion Cove cores, as implemented in Arrow Lake, have different frequency targets, and as such, have more transistors added to help with getting there. In addition, they have more blank space between transistors on critical paths to assist with speed as well.

A more accurate comparison would be to compare the number of core logic transistors that it takes for each one to function. Unfortunately, Intel no longer releases those numbers. I suspect that, were one to add the needed transistors for frequency scaling to the Mont cores and also intentionally add buffer space to critical paths, they would find that the size comparison is much closer. Also, keep in mind, Lion Cove is carrying unused functional blocks for SMT and AVX-512 as the core will be reused in servers.

Kepler_L2 · Nov 29, 2024

LightningZ71 said:
Also, keep in mind, Lion Cove is carrying unused functional blocks for SMT and AVX-512 as the core will be reused in servers.

It's not.

SiliconFly · Nov 29, 2024

mzocyteae · Nov 29, 2024

SiliconFly said:
From what I gather, (without L2, L3, etc.), the sizes of the core (with just L1) are:

Lion Cove - 3.4 mm2

Skymont - 1.15 mm2

The size ratio of Skymont to Lion Cove is 1 : 3 which is massive! A Lion Cove is like 3X bigger than a Skymont!

This is what you wrote:

SiliconFly said:
Actually, coves are big. They're super big compared to M3, Skymont, Zen 5, Zen 5C, etc.

M3 is around 2.5mm^2, 1.36x is normally not in the range of "super big".
Skymont doesn't reach the performance level of Lion Cove, so the comparison is moot.
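For reference, the ratios being argued over here follow directly from the areas quoted in this thread. A quick sketch, taking those numbers as given (they are forum estimates for core logic plus L1, not official figures):

```python
# Core areas in mm^2 (logic + L1 only), as quoted in the posts above
areas_mm2 = {
    "Lion Cove": 3.4,
    "M3": 2.5,
    "Skymont": 1.15,
}

def area_ratio(bigger: str, smaller: str) -> float:
    """How many times larger one core is than another, by area."""
    return areas_mm2[bigger] / areas_mm2[smaller]

print(f"Lion Cove vs Skymont: {area_ratio('Lion Cove', 'Skymont'):.2f}x")
print(f"Lion Cove vs M3: {area_ratio('Lion Cove', 'M3'):.2f}x")
```

That gives roughly 2.96x against Skymont and 1.36x against M3, which is where both the "3X" and the "1.36x" in the posts come from.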

LightningZ71 · Nov 29, 2024

Kepler_L2 said:
It's not.

Woah... I thought the "dirty little secret" with Lion Cove was that they have a full SMT implementation in the cores as lithographed in Lunar and Arrow Lake CPU CCDs and that it wasn't enabled because validation couldn't be completed by their ship-to-market deadlines. It's been discussed on the forum several times... I thought I read the same for AVX-512 being in the core, but not enabled on client.

Meteor Late · Nov 29, 2024

mzocyteae said:
This is what you wrote:

M3 is around 2.5mm^2, 1.36x is normally not in the range of "super big".
Skymont doesn't reach the performance level of Lion Cove, so the comparison is moot.

It is, relative to performance and efficiency. M3 is faster and wayy more efficient.

igor_kavinski · Nov 29, 2024

Kepler_L2 said:
It's not.

Any source for that?

Kepler_L2 · Nov 29, 2024

igor_kavinski said:
Any source for that?

Intel's Computex presentation.

igor_kavinski · Nov 29, 2024

Kepler_L2 said:
Intel's Computex presentation.

Ah can we really trust Intel at face value?

511 · Nov 29, 2024

Someone needs to do an autopsy on ARL silicon to verify it. Btw, if it had HT and AVX-512 and they were disabled in silicon, that would have been a plus point that might justify this area.

DavidC1 · Nov 29, 2024

LightningZ71 said:
While the numbers are empirically measured, you aren't necessarily comparing apples to apples. The Lion Cove cores, as implemented in Arrow Lake, have different frequency targets, and as such, have more transistors added to help with getting there. In addition, they have more blank space between transistors on critical paths to assist with speed as well.

Yes, but it doesn't take 3x the size for the mediocre gains over Skymont. Some tests show even gaming performs better with most of Lion Cove cores off.

And a more straightforward comparison is against AMD. Less dense process but the chip clocks just as high, supports SMT, and is slightly faster, while being a smaller core too.

LightningZ71 · Nov 29, 2024

I'm just discussing the approach differences between Skymont and Lion Cove. One is aimed for higher clocks, particularly sustained higher clocks, and that takes more transistors and effective density sacrifices. I'm not saying that Lion is particularly area efficient in and of itself, just that it does have different targets and, to my personal understanding, transistors present for unexposed features.

The actual functional transistor count difference between the two is not as great as implementation area makes it seem.

GTracing · Nov 29, 2024

LightningZ71 said:
I'm just discussing the approach differences between Skymont and Lion Cove. One is aimed for higher clocks, particularly sustained higher clocks, and that takes more transistors and effective density sacrifices. I'm not saying that Lion is particularly area efficient in and of itself, just that it does have different targets and, to my personal understanding, transistors present for unexposed features.

The actual functional transistor count difference between the two is not as great as implementation area makes it seem.

The thing is it's not hard to increase clocks. That'll be the easiest thing to change if the mont lineage takes over as the p-core. Intel also needs to greatly widen the FPU and add AVX10, improve the L3 cache latency, and get another 30%+ int ipc (which they've been doing every two years, but it's not guaranteed that they can keep up the pace).

But I wouldn't be surprised to see Intel go wider and slower like the arm cores anyways.

DavidC1 · Nov 29, 2024

GTracing said:
The thing is it's not hard to increase clocks.

It takes heroic effort beyond the 5GHz frequency range.

Proof? 9 stage CPU clocks at 4.5GHz while a 19 stage one does 5.7GHz.

All the while:
-The lower clocked chip performs better in absolute performance
-It uses multiples of power to reach those clocks
-Cores are much larger
-Sucks for mobile and server, while being mediocre even for desktops

Nevermind chips like Raptorlake.

It used to be that doubling the stages got you 80%+ gains. How is it worth it now? Unified Core should aim for low-5GHz at maximum.

As for FP, it should stay 256-bit for client. 512-bit is a waste, and was a mistake to do it with AVX-512. It should have been AVX3-256.
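To put numbers on the pipeline-depth point: a rough sketch using only the two stage counts and clocks quoted above (which CPUs they belong to is not stated, so the labels are generic):

```python
def relative_gain(new: float, old: float) -> float:
    """Fractional increase going from old to new, e.g. 0.25 means +25%."""
    return new / old - 1

# Figures as quoted in the post; "shallow" and "deep" are placeholder labels
shallow = {"stages": 9, "clock_ghz": 4.5}
deep = {"stages": 19, "clock_ghz": 5.7}

stage_increase = relative_gain(deep["stages"], shallow["stages"])
clock_increase = relative_gain(deep["clock_ghz"], shallow["clock_ghz"])

print(f"Pipeline stages: +{stage_increase:.0%}")
print(f"Clock speed: +{clock_increase:.0%}")
```

So a bit more than double the stages buys about 27% more clock in this comparison, well short of the 80%+ that doubling used to give.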

511 · Nov 29, 2024

https://twitter.com/x/status/1862509554736800177

Darkmont area efficiency: the 24-core compute tile is just 55mm2 with the L2 on 18A😮
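Back-of-envelope, assuming the whole 55mm2 tile is split evenly across the 24 cores (the tile also holds the L2 and whatever else sits on it, so this overstates the core itself):

```python
tile_area_mm2 = 55
core_count = 24

# Average tile area per core, with L2 and any other tile logic lumped in
area_per_core_mm2 = tile_area_mm2 / core_count
print(f"{area_per_core_mm2:.2f} mm2 per core, L2 included")
```

That works out to roughly 2.29 mm2 per core slot. It is not directly comparable to the core-only figures quoted earlier in the thread, since those exclude L2.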

Kepler_L2 · Nov 30, 2024

511 said:
https://twitter.com/x/status/1862509554736800177
Skymont area efficiency the 24 core compute tile is just 55mm2 with the L2 on 18A😮

Darkmont*

511 · Nov 30, 2024

Kepler_L2 said:
Darkmont*

Forgive me Lord Kepler btw are you still getting driver updates 🤔

Kepler_L2 · Nov 30, 2024

511 said:
Forgive me Lord Kepler btw are you still getting driver updates 🤔

NVIDIA Support

nvidia.custhelp.com

Also support for Maxwell/Pascal is supposed to end soon.
