Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
846
799
106
Wildcat Lake (WCL) Preliminary Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing ADL-N. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the Intel 18A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. The tiles are connected through UCIe, not D2D; a first from Intel. Launch is expected in Q2 2026, around Computex. In case people don't remember Alder Lake-N, I have created a table below to compare the detailed specs of ADL-N and WCL. Just for fun, I am also throwing in LNL and the upcoming Mediatek D9500 SoC.

| | Intel Alder Lake-N | Intel Wildcat Lake | Intel Lunar Lake | Mediatek D9500 |
|---|---|---|---|---|
| Launch Date | Q1-2023 | Q2-2026 ? | Q3-2024 | Q3-2025 |
| Model | Intel N300 | ? | Core Ultra 7 268V | Dimensity 9500 5G |
| Dies | 2 | 2 | 2 | 1 |
| Node | Intel 7 + ? | Intel 18A + TSMC N6 | TSMC N3B + N6 | TSMC N3P |
| CPU | 8 E-cores | 2 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores | C1 1+3+4 |
| Threads | 8 | 6 | 8 | 8 |
| CPU Max Clock | 3.8 GHz | ? | 5 GHz | |
| L3 Cache | 6 MB | ? | 12 MB | |
| TDP | 7 W | Fanless ? | 17 W | Fanless |
| Memory | 64-bit LPDDR5-4800 | 64-bit LPDDR5-6800 ? | 128-bit LPDDR5X-8533 | 64-bit LPDDR5X-10667 |
| Memory Size | 16 GB | ? | 32 GB | 24 GB ? |
| Memory Bandwidth | | ~ 55 GB/s | 136 GB/s | 85.6 GB/s |
| GPU | UHD Graphics | | Arc 140V | G1 Ultra |
| EU / Xe | 32 EU | 2 Xe | 8 Xe | 12 |
| GPU Max Clock | 1.25 GHz | | 2 GHz | |
| NPU | NA | 18 TOPS | 48 TOPS | 100 TOPS ? |
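The bandwidth figures in the table can be sanity-checked from the bus width and transfer rate in the Memory row. The snippet below is only a rough sketch: `peak_bandwidth_gbs` is a name made up for illustration, it computes theoretical peak bandwidth in decimal GB/s, and the WCL memory configuration is still a guess.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s: bytes per transfer x transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s x bytes per transfer = MB/s; divide by 1000 for GB/s.
    return bytes_per_transfer * transfer_rate_mts / 1000


# Memory configurations as listed in the table (the WCL entry is speculative).
configs = {
    "ADL-N (64-bit LPDDR5-4800)": (64, 4800),
    "WCL (64-bit LPDDR5-6800 ?)": (64, 6800),
    "LNL (128-bit LPDDR5X-8533)": (128, 8533),
    "D9500 (64-bit LPDDR5X-10667)": (64, 10667),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

This gives about 54.4, 136.5 and 85.3 GB/s for WCL, LNL and D9500, close to the table's figures, and 38.4 GB/s for ADL-N. Real-world bandwidth will be lower than these theoretical peaks.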






PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



LNL-MX.png
 


511

Diamond Member
Jul 12, 2024
4,520
4,136
106
Intel also bragged about 15% less die space otherwise needed to make hyper-threading work during the Lunar Lake unveil. So they had a bunch of extra die space to do something interesting for single thread performance and this is all they came up with. No wonder the E-Core team is taking over the next generation uarch.
They should. The P core is barely faster in server/mobile with SKT/LNC, because we are thermally and power limited there.
 

511

Diamond Member
Jul 12, 2024
4,520
4,136
106
Just because Skymont is doing well NOW doesn't mean it will scale all the way up to 5 GHz and beyond. It may run into the same limitations as Lion Cove if pushed higher.
That is why I said server and mobile, lol, not desktop. 4.6 GHz, but with better PPA and IPC than the P core in future 🙂
 

Magio

Member
May 13, 2024
170
201
76
Doubt they can swap the main cores 2 years before launch, even if the uarch was ready. If this team only recently started the work, it should take years until the first product is out, if it isn't canned in the meantime.
IMO what is plausible is that Arctic Wolf was already set to be another big jump that would see the E cores once again widen their range of operation and take over more of the P cores' duties.

A progressive evolution of the E cores until they're ready to fully take over seems much more likely to me than one uarch unifying the two coming out of nowhere at some point in the future.
 

Wolverine2349

Senior member
Oct 9, 2022
525
178
86
IMO what is plausible is that Arctic Wolf was already set to be another big jump that would see the E cores once again widen their range of operation and take over more of the P cores' duties.

A progressive evolution of the E cores until they're ready to fully take over seems much more likely to me than one uarch unifying the two coming out of nowhere at some point in the future.

Well, if it's true that Skymont has almost the same IPC as Lion Cove, and the Arctic Wolf E-cores take another big jump, why wouldn't the E-cores already be ready to replace the P cores? Unless the P cores can take another huge jump, though that seems to be a struggle for the P core team at the Israel Design Center.

Or is there more to the story than just IPC? Is it that Skymont has IPC close to Lion Cove in some areas but is not ready to take over by itself, even with a big jump? Is there some specific limitation in the Austin Atom team's cores, regardless of jumps, that prevents them from being the primary core right now? Are they dependent on P cores for all-around functionality?

Like, is there much more to the situation and story than just "oh, E-cores will be ready to take over from P cores with Arctic Wolf in a few years or longer"?
 
Jul 27, 2020
27,996
19,122
146
Like, is there much more to the situation and story than just "oh, E-cores will be ready to take over from P cores with Arctic Wolf in a few years or longer"?
E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.
 

dullard

Elite Member
May 21, 2001
25,994
4,607
126
E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.
I agree. P-core for snappy response and complex tasks. E-core for brute force of heavy multi-threaded work. They were separated to get the best of both worlds. Combining both into just one core is likely to get the worst of both worlds.
 

cannedlake240

Senior member
Jul 4, 2024
247
138
76
E-cores are built from the ground up for MT workloads
The E core is no more, according to the rumor mill. This new mystical core is being developed from the ground up to replace both P/E and return to having just one core, because of Intel's financial struggles and shifting priorities toward AI.
 

DrMrLordX

Lifer
Apr 27, 2000
22,901
12,967
136
The E core is no more, according to the rumor mill. This new mystical core is being developed from the ground up to replace both P/E and return to having just one core, because of Intel's financial struggles and shifting priorities toward AI.
As long as it's based on the current mont cores, it should be okay.
 

MS_AT

Senior member
Jul 15, 2024
868
1,762
96
E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.
The whole point of AVX10/256 that is meant for consumer hardware is to get AVX512 features (new instructions, masking etc) with 256b registers, that would then be common between P and E cores.

Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just that the core was built with area efficiency in mind, so it would be easier to spam E-cores. It's not like there is something specific blocking them from putting AVX512 on Skymont other than the area taken by the core.
 

Hulk

Diamond Member
Oct 9, 1999
5,138
3,727
136
According to my early estimations, Lion Cove in Arrow Lake should still see about 25% higher IPC than Skymont. When you add in another 20% for the clock speed advantage that Lion Cove has, simple napkin math shows a 45% performance advantage for the P cores. It goes without saying this is significant. Skymont is impressive, as it has cut the performance discrepancy of nearly 100% between Gracemont and Raptor Cove to half of that between Skymont and Lion Cove.
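For anyone who wants to redo the napkin math: performance scales roughly with IPC times clock, so the two advantages compound rather than add. A rough sketch, using only the 25% and 20% estimates from this post as inputs (`combined_speedup` is just an illustrative name):

```python
def combined_speedup(ipc_gain: float, clock_gain: float) -> float:
    """Performance ~ IPC x clock, so relative gains multiply rather than add."""
    return (1 + ipc_gain) * (1 + clock_gain) - 1


ipc_gain = 0.25    # Lion Cove IPC lead over Skymont (estimate from the post)
clock_gain = 0.20  # Lion Cove clock speed lead (estimate from the post)

additive = ipc_gain + clock_gain                       # simple "add them up" approximation
compounded = combined_speedup(ipc_gain, clock_gain)    # multiplicative model

print(f"additive approximation: {additive:.0%}")
print(f"compounded:             {compounded:.0%}")
```

Adding gives 45%, while compounding gives 50%, which lines up with the "half of nearly 100%" remark. Either way these are estimates, not measurements.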
 

ondma

Diamond Member
Mar 18, 2018
3,308
1,692
136
According to my early estimations, Lion Cove in Arrow Lake should still see about 25% higher IPC than Skymont. When you add in another 20% for the clock speed advantage that Lion Cove has, simple napkin math shows a 45% performance advantage for the P cores. It goes without saying this is significant. Skymont is impressive, as it has cut the performance discrepancy of nearly 100% between Gracemont and Raptor Cove to half of that between Skymont and Lion Cove.
I thought Skymont is supposed to have RC IPC. That is only about 10% faster than RC, based on the latest slide released by Intel. Am I missing something? I think clockspeed is the biggest problem, and perhaps the fact that Skymont is not efficient in all workloads.
 

dullard

Elite Member
May 21, 2001
25,994
4,607
126
Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just that the core was built with area efficiency in mind, so it would be easier to spam E-cores. It's not like there is something specific blocking them from putting AVX512 on Skymont other than the area taken by the core.
You are correct with half of the picture. Smaller area = far cheaper to spam E-cores.

But, there is the other half of the picture that you are missing. The E-cores were designed and optimized for low power situations. Spamming a core consuming 5 W each is one thing. Spamming a core taking 25 W each is totally different.

It is just so energy inefficient to spam P-cores. So, you are left with two possibilities with spamming P-cores: (1) massive power draw, tons of heat produced, huge energy bills, and then you have to pay to cool it all. Or (2) run the P-cores far from their design power level and even further from their optimum power level--performance suffers.

This graph is way back from Alder Lake. Even back then the E-cores do more work with each Watt of energy for any power setting less than 15 W/core. Guess what, at 125 W TDP, even an 8 core chip is right at that cutoff. Put in a 16 core chip with 125 W power (7.8 W each) and you are already 30% better performing with E-cores than P-cores. Then if you are considering spamming for multi-threading tasks, you get further and further into the area where P-cores just don't even operate.

1728663332270.png
https://chipsandcheese.com/p/alder-lakes-power-efficiency-a-complicated-picture

If you are an extreme enthusiast that doesn't care about practical limits like power, then yes, go ahead and spam P-cores and build a nuclear power plant in your back yard to run it.
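The per-core numbers above are easy to reproduce. This is only a sketch: it splits the package power evenly across cores and ignores uncore, iGPU and memory controller power, and the 15 W/core crossover is the value read off the Alder Lake graph in this post, not a general constant.

```python
def watts_per_core(package_power_w: float, cores: int) -> float:
    """Even split of the package power budget (ignores uncore and iGPU power)."""
    return package_power_w / cores


# Per-core power below which the post says E-cores do more work per watt.
CROSSOVER_W = 15.0

for cores in (8, 16, 32):
    budget = watts_per_core(125, cores)
    side = "below" if budget < CROSSOVER_W else "at or above"
    print(f"{cores} cores at 125 W: {budget:.1f} W/core ({side} the {CROSSOVER_W:.0f} W crossover)")
```

At 16 cores the budget is already 7.8 W per core, and it only shrinks as the core count goes up.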
 

desrever

Senior member
Nov 6, 2021
309
776
106
Intel also bragged about 15% less die space otherwise needed to make hyper-threading work during the Lunar Lake unveil. So they had a bunch of extra die space to do something interesting for single thread performance and this is all they came up with. No wonder the E-Core team is taking over the next generation uarch.
The hyperthreading is still there in silicon, just not enabled.
 

MS_AT

Senior member
Jul 15, 2024
868
1,762
96
You are correct with half of the picture. Smaller area = far cheaper to spam E-cores.

But, there is the other half of the picture that you are missing. The E-cores were designed and optimized for low power situations. Spamming a core consuming 5 W each is one thing. Spamming a core taking 25 W each is totally different.

It is just so energy inefficient to spam P-cores. So, you are left with two possibilities with spamming P-cores: (1) massive power draw, tons of heat produced, huge energy bills, and then you have to pay to cool it all. Or (2) run the P-cores far from their design power level and even further from their optimum power level--performance suffers.

This graph is way back from Alder Lake. Even back then the E-cores do more work with each Watt of energy for any power setting less than 15 W/core. Guess what, at 125 W TDP, even an 8 core chip is right at that cutoff. Put in a 16 core chip with 125 W power (7.8 W each) and you are already 30% better performing with E-cores than P-cores. Then if you are considering spamming for multi-threading tasks, you get further and further into the area where P-cores just don't even operate.

View attachment 109222
https://chipsandcheese.com/p/alder-lakes-power-efficiency-a-complicated-picture

If you are an extreme enthusiast that doesn't care about practical limits like power, then yes, go ahead and spam P-cores and build a nuclear power plant in your back yard to run it.
Sorry but I don't remember where I would advocate spamming P cores. The idea behind my post was that Skymont could be a base for a more performant core as there is nothing in its design that inherently prevents it. AMD has shown it is possible to implement a core that scales fairly well across the range of power targets.
 

dullard

Elite Member
May 21, 2001
25,994
4,607
126
Sorry but I don't remember where I would advocate spamming P cores. The idea behind my post was that Skymont could be a base for a more performant core as there is nothing in its design that inherently prevents it. AMD has shown it is possible to implement a core that scales fairly well across the range of power targets.
I was referring to these sentences of yours: "Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just that the core was built with area efficiency in mind, so it would be easier to spam E-cores."

There is something specific that lets E-cores handle MT workloads better: designed and optimized for low power.
 

Hulk

Diamond Member
Oct 9, 1999
5,138
3,727
136
I thought Skymont is supposed to have RC IPC. That is only about 10% faster than RC, based on the latest slide released by Intel. Am I missing something? I think clockspeed is the biggest problem, and perhaps the fact that Skymont is not efficient in all workloads.
Intel is saying Skymont is +32% IPC over Gracemont so that is where I based my figure. 32% over Gracemont does not equal Raptor Cove. That would be more like 48%.

I think we will see that what Intel means by Skymont~RPC IPC is in some specific best case scenarios (heavy FP) much in the same way Gracemont~Skylake in some use cases.

I have found that taking the more conservative Intel performance estimates will correlate better to actual performance. The more hyperbolic sounding Intel claims, like Skymont = Raptor Cove are generally hard to replicate and rare corner cases.
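To put a number on the gap implied by those two figures: both are quoted against Gracemont, so the remaining Raptor Cove lead over Skymont is a ratio, not a difference. A small sketch, taking Intel's +32% claim and my ~48% estimate as the only inputs (`relative_gap` is just an illustrative helper):

```python
def relative_gap(a_vs_base: float, b_vs_base: float) -> float:
    """How far core B is ahead of core A when both gains are quoted against the same baseline."""
    return (1 + b_vs_base) / (1 + a_vs_base) - 1


skymont_vs_gracemont = 0.32      # Intel's claimed Skymont IPC uplift
raptor_cove_vs_gracemont = 0.48  # estimate from this post, not an Intel figure

gap = relative_gap(skymont_vs_gracemont, raptor_cove_vs_gracemont)
print(f"Raptor Cove IPC lead over Skymont: {gap:.1%}")
```

Under those assumptions Raptor Cove would still be roughly 12% ahead of Skymont in IPC on average.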

We'll know the truth soon enough. One thing I can't figure out is how Intel is claiming +18% CB R24 MT for ARL over RPL? That is nuts.

I'm curious to see actual CB R24 MT results and compare them to RPL, stock-for-stock clocks to verify this seemingly outrageous Intel claim.
 

gdansk

Diamond Member
Feb 8, 2011
4,568
7,681
136
One thing I can't figure out is how Intel is claiming +18% CB R24 MT for ARL over RPL? That is nuts.
Base clocks are up all around. I think that means the typical all core clock rate will be higher too.

And I bet the power allocation of Skymont cluster increased a bit to allow for maximum throughput.