
Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D; a first from Intel. Launch is expected in Q1 2026.

| | Intel Raptor Lake U | Intel Wildcat Lake 15W | Intel Lunar Lake | Intel Panther Lake 4+0+4 |
| Launch Date | Q1-2024 | Q2-2026 | Q3-2024 | Q1-2026 |
| Model | Intel 150U | Intel Core 7 360 | Core Ultra 7 268V | Core Ultra 7 365 |
| Dies | 2 | 2 | 2 | 3 |
| Node | Intel 7 + ? | Intel 18-A + TSMC N6 | TSMC N3B + N6 | Intel 18-A + Intel 3 + TSMC N6 |
| CPU | 2 P-core + 8 E-cores | 2 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores |
| Threads | 12 | 6 | 8 | 8 |
| Max Clock (CPU) | 5.4 GHz | 4.8 GHz | 5 GHz | 4.8 GHz |
| L3 Cache | 12 MB | 6 MB | 12 MB | 12 MB |
| TDP | 15 - 55 W | 15 - 35 W | 17 - 37 W | 25 - 55 W |
| Memory | 128-bit LPDDR5-5200 | 64-bit LPDDR5x-7467 | 128-bit LPDDR5x-8533 | 128-bit LPDDR5x-7467 |
| Size | 96 GB | 48 GB | 32 GB | 128 GB |
| Bandwidth | 83 GB/s | 60 GB/s | 136 GB/s | 120 GB/s |
| GPU | Intel Graphics | Intel Graphics | Arc 140V | Intel Graphics |
| RT | No | No | YES | YES |
| EU / Xe | 96 EU | 2 Xe | 8 Xe | 4 Xe |
| Max Clock (GPU) | 1.3 GHz | 2.6 GHz | 2 GHz | 2.5 GHz |
| NPU | GNA 3.0 | 17 TOPS | 48 TOPS | 49 TOPS |
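
For anyone wanting to sanity-check the Bandwidth row, it appears to follow directly from the Memory row: bus width in bytes times the transfer rate. A minimal sketch, assuming the LPDDR5 suffix (5200, 7467, 8533) is the transfer rate in MT/s and 1 GB = 10^9 bytes; the function name is only illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return bytes_per_transfer * transfer_rate_mt_s / 1000


# (bus width in bits, transfer rate in MT/s), copied from the Memory row.
configs = {
    "Raptor Lake U": (128, 5200),
    "Wildcat Lake 15W": (64, 7467),
    "Lunar Lake": (128, 8533),
    "Panther Lake 4+0+4": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(width, rate):.1f} GB/s")
```

The results land within about 1 GB/s of the table's figures, so the table values look like rounded theoretical peaks rather than measured numbers.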






PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



LNL-MX.png
 

Attachments

  • PantherLake.png
  • LNL.png
  • INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg
  • Clockspeed.png
Intel also bragged about 15% less die space otherwise needed to make hyper-threading work during the Lunar Lake unveil. So they had a bunch of extra die space to do something interesting for single thread performance and this is all they came up with. No wonder the E-Core team is taking over the next generation uarch.
They should. The P core is barely faster in mobile/server with SKT/LNC, because there we are thermally and power limited.
 
Just because Skymont is doing well NOW doesn't mean it will scale all the way up to 5 GHz and beyond. It may run into the same limitations as Lion Cove if pushed higher.
That is why I said server and mobile lol, not desktop: 4.6 GHz, but better PPA and IPC than the P core in the future 🙂
 
Doubt they can swap the main cores 2 years before launch, even if the uarch was ready. If this team only recently started the work, it should take years until the first product is out, if it isn't canned in the meantime.
IMO what is plausible is that Arctic Wolf was already set to be another big jump that would see the E cores once again widen their range of operation and take over more of the P cores' duties.

A progressive evolution of the E cores until they're ready to fully take over seems much more likely to me than one uarch unifying the two coming out of nowhere at some point in the future.
 
Well, if it's true that Skymont has almost the same IPC as Lion Cove, and Arctic Wolf e-cores take another big jump, why wouldn't e-cores already be ready to replace P cores? Unless P cores can take another huge jump, though that seems to be a struggle for the P core team at the Israel Design Center.

Or is there more to the story than just IPC? Is it that Skymont has IPC close to Lion Cove in some areas but is not ready to take over by itself, even with a big jump? Is there some specific limitation in the Austin Atom team's cores, regardless of jumps, that prevents them from being the primary core right now? Are they dependent on P cores for all-around functionality?

Like, is there much more to the situation and story than this "oh, e-cores are ready to take over from P cores with Arctic Wolf in a few years or longer"?
 
E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.
 
I agree. P-core for snappy response and complex tasks. E-core for brute force of heavy multi-threaded work. They were separated to get the best of both worlds. Combining both into just one core is likely to get the worst of both worlds.
 
E-cores are built from the ground up for MT workloads. If they mess with that formula to also enhance their ST performance to the point of not needing P cores, something will regress. There is no perfect core yet that excels at every kind of workload. I believe in future they will use the P-cores for FP heavy duties with full 512 bit width AVX-512 while the AVX10 enabled E-cores will just chip in to provide slightly more boost to such workloads, instead of shouldering that heavy responsibility all on their own.
The whole point of AVX10/256 that is meant for consumer hardware is to get AVX512 features (new instructions, masking etc) with 256b registers, that would then be common between P and E cores.

Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just the core was built with area efficiency in mind, so it would be easier to spam E-cores. It's not like there is something specific blocking them from putting AVX512 on Skymont other than the area taken by the core.
 
According to my early estimates, Lion Cove in Arrow Lake should still see about 25% higher IPC than Skymont. When you add in another 20% for the clock speed advantage that Lion Cove has, simple napkin math shows a 45% performance advantage for the P cores. It goes without saying this is significant. Skymont is impressive, as it has cut the performance discrepancy of nearly 100% with Gracemont/Raptor Cove to half of that with Skymont/Lion Cove.
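
A quick check on the napkin math: IPC and clock speed multiply rather than add, so the 25% and 20% estimates compound to a slightly larger figure than their sum. A minimal sketch, assuming single-thread performance scales as IPC times frequency; the two inputs are the post's own estimates, not measurements.

```python
def combined_speedup(ipc_gain: float, clock_gain: float) -> float:
    """Combined gain when performance = IPC * frequency (gains given as fractions)."""
    return (1 + ipc_gain) * (1 + clock_gain) - 1


ipc_gain = 0.25    # estimated Lion Cove IPC lead over Skymont (from the post)
clock_gain = 0.20  # estimated Lion Cove clock lead (from the post)

additive = ipc_gain + clock_gain
compounded = combined_speedup(ipc_gain, clock_gain)

print(f"adding the two gains: {additive:.0%}")
print(f"compounding them:     {compounded:.0%}")
```

Compounding gives roughly 50%, which matches the "half of nearly 100%" framing at the end of the post.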
 
I thought Skymont is supposed to have RC IPC. Lion Cove is only about 10% faster than RC, based on the latest slide released by Intel. Am I missing something? I think clockspeed is the biggest problem, and perhaps the fact that Skymont is not efficient in all workloads.
 
Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just the core was built with area efficiency in mind, so it would be easier to spam E-cores. It's not like there is something specific blocking them from putting AVX512 on Skymont other than the area taken by the core.
You are correct with half of the picture. Smaller area = far cheaper to spam E-cores.

But, there is the other half of the picture that you are missing. The E-cores were designed and optimized for low power situations. Spamming a core consuming 5 W each is one thing. Spamming a core taking 25 W each is totally different.

It is just so energy inefficient to spam P-cores. So, you are left with two possibilities with spamming P-cores: (1) massive power draw, tons of heat produced, huge energy bills, and then you have to pay to cool it all. Or (2) run the P-cores far from their design power level and even further from their optimum power level--performance suffers.

This graph is way back from Alder Lake. Even back then the E-cores do more work with each Watt of energy for any power setting less than 15 W/core. Guess what, at 125 W TDP, even an 8 core chip is right at that cutoff. Put in a 16 core chip with 125 W power (7.8 W each) and you are already 30% better performing with E-cores than P-cores. Then if you are considering spamming for multi-threading tasks, you get further and further into the area where P-cores just don't even operate.

1728663332270.png
https://chipsandcheese.com/p/alder-lakes-power-efficiency-a-complicated-picture
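
The per-core figures quoted above are just the package power split evenly across cores. A minimal sketch of that arithmetic, assuming an even split with nothing reserved for uncore, GPU or memory, and taking the 15 W/core crossover from the post as given rather than from any measurement of mine:

```python
CROSSOVER_W_PER_CORE = 15.0  # assumption: crossover point cited in the post


def watts_per_core(package_power_w: float, cores: int) -> float:
    """Even split of package power across cores (ignores uncore and other consumers)."""
    return package_power_w / cores


for cores in (8, 16):
    budget = watts_per_core(125, cores)
    side = "below" if budget < CROSSOVER_W_PER_CORE else "at or above"
    print(f"{cores} cores at 125 W: {budget:.1f} W/core ({side} the crossover)")
```

In practice the real per-core budget would be lower still, since part of the package power goes to everything that is not a core.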

If you are an extreme enthusiast that doesn't care about practical limits like power, then yes, go ahead and spam P-cores and build a nuclear power plant in your back yard to run it.
 
Intel also bragged about 15% less die space otherwise needed to make hyper-threading work during the Lunar Lake unveil. So they had a bunch of extra die space to do something interesting for single thread performance and this is all they came up with. No wonder the E-Core team is taking over the next generation uarch.
the hyperthreading is still there in silicon, just not enabled.
 
Sorry but I don't remember where I would advocate spamming P cores. The idea behind my post was that Skymont could be a base for a more performant core as there is nothing in its design that inherently prevents it. AMD has shown it is possible to implement a core that scales fairly well across the range of power targets.
 
I was referring to these sentences of yours: "Now, I would argue that there is nothing specific in Skymont that would let it handle MT workloads better than ST. It's just the core was built with area efficiency in mind, so it would be easier to spam E-cores."

There is something specific that lets E-cores handle MT workloads better: designed and optimized for low power.
 
I thought Skymont is supposed to have RC IPC. Lion Cove is only about 10% faster than RC, based on the latest slide released by Intel. Am I missing something? I think clockspeed is the biggest problem, and perhaps the fact that Skymont is not efficient in all workloads.
Intel is saying Skymont is +32% IPC over Gracemont so that is where I based my figure. 32% over Gracemont does not equal Raptor Cove. That would be more like 48%.
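
To put those two figures on the same scale: if both are taken relative to Gracemont, the remaining gap between Raptor Cove and Skymont is the ratio of the two, not their difference. A minimal sketch, taking the +32% and roughly +48% numbers at face value from the post (neither is verified here):

```python
def relative_gap(faster: float, slower: float) -> float:
    """Fractional advantage of `faster` over `slower`, both on the same baseline."""
    return faster / slower - 1


GRACEMONT = 1.00                 # baseline IPC, normalized
SKYMONT = GRACEMONT * 1.32       # Intel's claimed +32% over Gracemont (per the post)
RAPTOR_COVE = GRACEMONT * 1.48   # the post's rough figure for Raptor Cove

gap = relative_gap(RAPTOR_COVE, SKYMONT)
print(f"Raptor Cove over Skymont: {gap:.1%}")
```

On those inputs the gap works out to a bit over 12%, slightly less than the 16-point difference between the two percentages.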

I think we will see that what Intel means by Skymont~RPC IPC holds in some specific best-case scenarios (heavy FP), much in the same way Gracemont~Skylake held in some use cases.

I have found that taking the more conservative Intel performance estimates will correlate better to actual performance. The more hyperbolic-sounding Intel claims, like Skymont = Raptor Cove, are generally hard to replicate and apply only to rare corner cases.

We'll know the truth soon enough. One thing I can't figure out is how Intel is claiming +18% CB R24 MT for ARL over RPL? That is nuts.

I'm curious to see actual CB R24 MT results and compare them to RPL, stock-for-stock clocks to verify this seemingly outrageous Intel claim.
 
One thing I can't figure out is how Intel is claiming +18% CB R24 MT for ARL over RPL? That is nuts.
Base clocks are up all around. I think that means the typical all core clock rate will be higher too.

And I bet the power allocation of Skymont cluster increased a bit to allow for maximum throughput.
 