Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads

Tigerick · Aug 22, 2022

Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D; a first from Intel. Expected to launch in Q1 2026.

	Intel Raptor Lake U	Intel Wildcat Lake 15W?	Intel Lunar Lake	Intel Panther Lake 4+4+4
Launch Date	Q1-2024	Q2-2026	Q3-2024	Q1-2026
Model	Intel 150U	Intel Core 7	Core Ultra 7 268V	Core Ultra 7 365
Dies	2	2	2	3
Node	Intel 7 + ?	Intel 18-A + TSMC N6	TSMC N3B + N6	Intel 18-A + Intel 3 + TSMC N6

CPU	2 P-core + 8 E-cores	2 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores	4 P-core + 4 LP E-cores
Threads	12	6	8	8
Max Clock	5.4 GHz	?	5 GHz	4.8 GHz
L3 Cache	12 MB		12 MB	12 MB
TDP	15 - 55 W	15 W ?	17 - 37 W	25 - 55 W

Memory	128-bit LPDDR5-5200	64-bit LPDDR5	128-bit LPDDR5x-8533	128-bit LPDDR5x-7467
Size	96 GB		32 GB	128 GB
Bandwidth			136 GB/s

GPU	Intel Graphics	Intel Graphics	Arc 140V	Intel Graphics
RT	No	No	YES	YES
EU / Xe	96 EU	2 Xe	8 Xe	4 Xe
Max Clock	1.3 GHz	?	2 GHz	2.5 GHz

NPU	GNA 3.0	18 TOPS	48 TOPS	49 TOPS
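
The single bandwidth figure in the table can be cross-checked from the Memory row: theoretical peak bandwidth is the bus width in bytes times the transfer rate. A minimal Python sketch, assuming 1 GB = 10^9 bytes and taking widths and speeds straight from the table. The function name is made up for illustration, and WCL is left out because the table gives no memory speed for it.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s -> transfers per second, then bytes per second -> GB/s
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9

# (bus width in bits, transfer rate in MT/s) as listed in the table above
configs = {
    "Raptor Lake U (128-bit LPDDR5-5200)": (128, 5200),
    "Lunar Lake (128-bit LPDDR5x-8533)": (128, 8533),
    "Panther Lake (128-bit LPDDR5x-7467)": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

For Lunar Lake this works out to about 136.5 GB/s, in line with the 136 GB/s listed. These are theoretical peaks, not measured throughput.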

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what the Intel roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.

cannedlake240 · Oct 11, 2024

poke01 said:
when is this arriving? Nova Lake or even later say 2027?

Doubt they can swap the main cores 2 years before launch, even if the uarch was ready. If this team only recently started the work, it should take years until the first product is out, if it isn't canned in the meantime.

Cstops · Oct 11, 2024

For those that like diving into datasheets, it seems [the initial release of] Vol. 1 is up for Arrow Lake. (subject to change/revisions, etc.)

https://www.intel.com/content/www/u...eries-processors-datasheet-volume-1-of-2.html

Here's what it's got for official DDR5 support presently in there.

511 · Oct 11, 2024

alcoholbob said:
Intel also bragged about 15% less die space otherwise needed to make hyper-threading work during the Lunar Lake unveil. So they had a bunch of extra die space to do something interesting for single thread performance and this is all they came up with. No wonder the E-Core team is taking over the next generation uarch.

They should. The P core is barely faster in server/mobile with SKT/LNC because we are thermally and power limited.

igor_kavinski · Oct 11, 2024

Just because Skymont is doing well NOW doesn't mean it will scale all the way up to 5 GHz and beyond. It may run into the same limitations as Lion Cove if pushed higher.

511 · Oct 11, 2024

igor_kavinski said:
Just because Skymont is doing well NOW doesn't mean it will scale all the way up to 5 GHz and beyond. It may run into the same limitations as Lion Cove if pushed higher.

That is why I said Server and Mobile lol, not desktop. 4.6 GHz but a better PPA and IPC than the P core in future 🙂

Hulk · Oct 11, 2024

Just for my benchmarking pleasure I hope it is possible to disable all P cores in Arrow Lake and just run the Skymont cores. I want to finally corner those slippery E cores!

Magio · Oct 11, 2024

cannedlake240 said:
Doubt they can swap the main cores 2 years before launch, even if the uarch was ready. If this team only recently started the work, it should take years until the first product is out, if it isn't canned in the meantime.

IMO what is plausible is that Arctic Wolf was already set to be another big jump that would see the E cores once again widen their range of operation and take over more of the P cores' duties.

A progressive evolution of the E cores until they're ready to fully take over seems much more likely to me than one uarch unifying the two coming out of nowhere at some point in the future.

Wolverine2349 · Oct 11, 2024

Magio said:
IMO what is plausible is that Arctic Wolf was already set to be another big jump that would see the E cores once again widen their range of operation and take over more of the P cores' duties.

A progressive evolution of the E cores until they're ready to fully take over seems much more likely to me than one uarch unifying the two coming out of nowhere at some point in the future.

Well, if it's true that Skymont has almost the same IPC as Lion Cove, and Arctic Wolf e-cores take another big jump, why wouldn't e-cores already be ready to replace P cores, unless P cores can take another huge jump? That seems to be struggling to happen in the Israel Design Center P core team.

Or is there more to the story than just IPC? Is it that Skymont has IPC close to Lion Cove in some areas but is not ready to take over by itself, even with a big jump? Is there some specific limitation the Austin Atom team cores have right now, regardless of jumps, that prevents them from being the primary core? Are they dependent on P cores for all-around functionality?

Like, is there much more to the situation and story than this "oh, e-cores are ready to take over from P cores with Arctic Wolf in a few years or longer"?

igor_kavinski · Oct 11, 2024

Wolverine2349 said:
Like, is there much more to the situation and story than this "oh, e-cores are ready to take over from P cores with Arctic Wolf in a few years or longer"?

E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.

dullard · Oct 11, 2024

igor_kavinski said:
E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.

I agree. P-core for snappy response and complex tasks. E-core for brute force of heavy multi-threaded work. They were separated to get the best of both worlds. Combining both into just one core is likely to get the worst of both worlds.

cannedlake240 · Oct 11, 2024

igor_kavinski said:
E-cores are built from the ground up for MT workloads

The E core is no more, according to the rumor mill. This new mystical core is being developed from the ground up to replace both P/E and return to having just one core, because of Intel's financial struggles and shifting priorities on AI.

DrMrLordX · Oct 11, 2024

cannedlake240 said:
The E core is no more, according to the rumor mill. This new mystical core is being developed from the ground up to replace both P/E and return to having just one core, because of Intel's financial struggles and shifting priorities on AI.

As long as it's based on the current mont cores, it should be okay.

gdansk · Oct 11, 2024

DrMrLordX said:
As long as it's based on the current mont cores, it should be okay.

I don't think it needs to be based strictly upon Skymont as long as it incorporates lessons learned.

MS_AT · Oct 11, 2024

igor_kavinski said:
E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.

The whole point of AVX10/256 that is meant for consumer hardware is to get AVX512 features (new instructions, masking etc) with 256b registers, that would then be common between P and E cores.

Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just that the core was built with area efficiency in mind, so it would be easier to spam E-cores. It's not like there is something specific blocking them from putting AVX512 on Skymont other than the area taken by the core.

Hulk · Oct 11, 2024

According to my early estimations, Lion Cove in Arrow Lake should still see about 25% IPC over Skymont. When you add in another 20% for the clock speed advantage that Lion Cove has, simple napkin math shows 45% IPC advantage for the P cores. It goes without saying this is significant. Skymont is impressive as it has cut down the performance discrepancy of nearly 100% with Gracemont/Raptor Cove to half of that with Skymont/Lion Cove.

gdansk · Oct 11, 2024

Hulk said:
When you add in another 20% for the clock speed advantage that Lion Cove has, simple napkin math shows 45% IPC (instructions per clock)

Might want to check that.

ondma · Oct 11, 2024

Hulk said:
According to my early estimations, Lion Cove in Arrow Lake should still see about 25% IPC over Skymont. When you add in another 20% for the clock speed advantage that Lion Cove has, simple napkin math shows 45% IPC advantage for the P cores. It goes without saying this is significant. Skymont is impressive as it has cut down the performance discrepancy of nearly 100% with Gracemont/Raptor Cove to half of that with Skymont/Lion Cove.

I thought Skymont is supposed to have RC IPC. That is only about 10% faster than RC, based on the latest slide released by Intel. Am I missing something? I think clockspeed is the biggest problem, and perhaps the fact that Skymont is not efficient in all workloads.

cannedlake240 · Oct 11, 2024

MS_AT said:
It's not like there is something specific blocking them from putting AVX512 on Skymont other than the area taken by the core.

Something about E cores not having a uop cache, hence being less suited for vector workloads was suggested by CnC

dullard · Oct 11, 2024

MS_AT said:
Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just that the core was built with area efficiency in mind, so it would be easier to spam E-cores. It's not like there is something specific blocking them from putting AVX512 on Skymont other than the area taken by the core.

You are correct with half of the picture. Smaller area = far cheaper to spam E-cores.

But, there is the other half of the picture that you are missing. The E-cores were designed and optimized for low power situations. Spamming a core consuming 5 W each is one thing. Spamming a core taking 25 W each is totally different.

It is just so energy inefficient to spam P-cores. So, you are left with two possibilities with spamming P-cores: (1) massive power draw, tons of heat produced, huge energy bills, and then you have to pay to cool it all. Or (2) run the P-cores far from their design power level and even further from their optimum power level--performance suffers.

This graph is way back from Alder Lake. Even back then the E-cores do more work with each Watt of energy for any power setting less than 15 W/core. Guess what, at 125 W TDP, even an 8 core chip is right at that cutoff. Put in a 16 core chip with 125 W power (7.8 W each) and you are already 30% better performing with E-cores than P-cores. Then if you are considering spamming for multi-threading tasks, you get further and further into the area where P-cores just don't even operate.

https://chipsandcheese.com/p/alder-lakes-power-efficiency-a-complicated-picture

If you are an extreme enthusiast that doesn't care about practical limits like power, then yes, go ahead and spam P-cores and build a nuclear power plant in your back yard to run it.
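
The per-core numbers above can be reproduced with a quick sketch. It assumes the whole package power is split evenly across the cores (ignoring uncore, GPU and memory controller), and it uses the ~15 W/core crossover quoted in this post for Alder Lake. The names are made up for illustration.

```python
# Per-core power below which the post says Alder Lake E-cores do more work per watt
E_CORE_CROSSOVER_W = 15.0

def per_core_budget_w(package_power_w: float, core_count: int) -> float:
    """Even split of package power across cores (a simplifying assumption)."""
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return package_power_w / core_count

def e_cores_favoured(package_power_w: float, core_count: int) -> bool:
    """True when the per-core budget falls below the quoted crossover."""
    return per_core_budget_w(package_power_w, core_count) < E_CORE_CROSSOVER_W

for cores in (8, 16, 32):
    budget = per_core_budget_w(125, cores)
    print(f"{cores} cores @ 125 W: {budget:.1f} W/core, "
          f"E-cores favoured: {e_cores_favoured(125, cores)}")
```

At 8 cores the budget sits just above the crossover (about 15.6 W), and at 16 cores it drops to about 7.8 W, matching the figures in the post.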

desrever · Oct 11, 2024

alcoholbob said:
Intel also bragged about 15% less die space otherwise needed to make hyper-threading work during the Lunar Lake unveil. So they had a bunch of extra die space to do something interesting for single thread performance and this is all they came up with. No wonder the E-Core team is taking over the next generation uarch.

The hyperthreading is still there in silicon, just not enabled.

MS_AT · Oct 11, 2024

dullard said:
You are correct with half of the picture. Smaller area = far cheaper to spam E-cores.

But, there is the other half of the picture that you are missing. The E-cores were designed and optimized for low power situations. Spamming a core consuming 5 W each is one thing. Spamming a core taking 25 W each is totally different.

It is just so energy inefficient to spam P-cores. So, you are left with two possibilities with spamming P-cores: (1) massive power draw, tons of heat produced, huge energy bills, and then you have to pay to cool it all. Or (2) run the P-cores far from their design power level and even further from their optimum power level--performance suffers.

This graph is way back from Alder Lake. Even back then the E-cores do more work with each Watt of energy for any power setting less than 15 W/core. Guess what, at 125 W TDP, even an 8 core chip is right at that cutoff. Put in a 16 core chip with 125 W power (7.8 W each) and you are already 30% better performing with E-cores than P-cores. Then if you are considering spamming for multi-threading tasks, you get further and further into the area where P-cores just don't even operate.

View attachment 109222
https://chipsandcheese.com/p/alder-lakes-power-efficiency-a-complicated-picture

If you are an extreme enthusiast that doesn't care about practical limits like power, then yes, go ahead and spam P-cores and build a nuclear power plant in your back yard to run it.

Sorry but I don't remember where I would advocate spamming P cores. The idea behind my post was that Skymont could be a base for a more performant core as there is nothing in its design that inherently prevents it. AMD has shown it is possible to implement a core that scales fairly well across the range of power targets.

dullard · Oct 11, 2024

MS_AT said:
Sorry but I don't remember where I would advocate spamming P cores. The idea behind my post was that Skymont could be a base for a more performant core as there is nothing in its design that inherently prevents it. AMD has shown it is possible to implement a core that scales fairly well across the range of power targets.

I was referring to these sentences of yours: "Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just that the core was built with area efficiency in mind, so it would be easier to spam E-cores."

There is something specific that lets E-cores handle MT workloads better: designed and optimized for low power.

Hulk · Oct 11, 2024

gdansk said:
Might want to check that.

Ha! Yes. I meant overall performance. I was "on the table" ready to go in for a colonoscopy when I typed that into my phone so I have a pretty good excuse.
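
For the overall-performance version of that napkin math, a rough sketch: if performance is taken as IPC times clock, the two advantages multiply rather than add. This assumes the workload scales linearly with clock speed, and the 25% and 20% inputs are just the estimates from the earlier post. The function name is made up for illustration.

```python
def combined_speedup(ipc_gain: float, clock_gain: float) -> float:
    """Overall performance gain when performance ~ IPC x clock (assumed model)."""
    return (1 + ipc_gain) * (1 + clock_gain) - 1

ipc_gain = 0.25    # estimated Lion Cove IPC lead over Skymont
clock_gain = 0.20  # estimated Lion Cove clock speed lead

print(f"added:      {ipc_gain + clock_gain:.0%}")
print(f"compounded: {combined_speedup(ipc_gain, clock_gain):.0%}")
```

Adding the two gives 45%, while compounding them gives 50%, so the simple sum slightly understates the gap under this model.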

Hulk · Oct 11, 2024

ondma said:
I thought Skymont is supposed to have RC IPC. That is only about 10% faster than RC, based on the latest slide released by Intel. Am I missing something? I think clockspeed is the biggest problem, and perhaps the fact that Skymont is not efficient in all workloads.

Intel is saying Skymont is +32% IPC over Gracemont so that is where I based my figure. 32% over Gracemont does not equal Raptor Cove. That would be more like 48%.

I think we will see that what Intel means by Skymont~RPC IPC is in some specific best case scenarios (heavy FP) much in the same way Gracemont~Skylake in some use cases.

I have found that taking the more conservative Intel performance estimates will correlate better to actual performance. The more hyperbolic sounding Intel claims, like Skymont = Raptor Cove are generally hard to replicate and rare corner cases.

We'll know the truth soon enough. One thing I can't figure out is how Intel is claiming +18% CB R24 MT for ARL over RPL? That is nuts.

I'm curious to see actual CB R24 MT results and compare them to RPL, stock-for-stock clocks to verify this seemingly outrageous Intel claim.
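
The IPC figures in this exchange can be chained together as a rough sanity check. This sketch normalizes everything to Gracemont = 1.00 and reads "more like 48%" as Raptor Cove's IPC lead over Gracemont, which is an interpretation. All inputs are estimates quoted in the thread, not measurements.

```python
# IPC relative to Gracemont = 1.00; every figure is an estimate quoted in the thread
GRACEMONT = 1.00
SKYMONT = GRACEMONT * 1.32      # Intel's +32% claim over Gracemont
RAPTOR_COVE = GRACEMONT * 1.48  # assumption: "more like 48%" read as Raptor Cove over Gracemont
LION_COVE = SKYMONT * 1.25      # the earlier ~25% estimate of Lion Cove over Skymont

def lead(a: float, b: float) -> float:
    """Fractional IPC lead of core a over core b."""
    return a / b - 1.0

print(f"Raptor Cove over Skymont:   {lead(RAPTOR_COVE, SKYMONT):.1%}")
print(f"Lion Cove over Raptor Cove: {lead(LION_COVE, RAPTOR_COVE):.1%}")
```

Under these inputs Raptor Cove would lead Skymont by roughly 12%, and Lion Cove would lead Raptor Cove by a bit over 11%.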

gdansk · Oct 11, 2024

Hulk said:
One thing I can't figure out is how Intel is claiming +18% CB R24 MT for ARL over RPL? That is nuts.

Base clocks are up all around. I think that means the typical all core clock rate will be higher too.

And I bet the power allocation of Skymont cluster increased a bit to allow for maximum throughput.
