Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
850
801
106
Wildcat Lake (WCL) Preliminary Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing ADL-N. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D; a first from Intel. Launch is expected in Q2/Computex 2026. In case people don't remember Alder Lake-N, I have created a table below to compare the detailed specs of ADL-N and WCL. Just for fun, I am throwing in LNL and the upcoming Mediatek D9500 SoC.

| | Intel Alder Lake-N | Intel Wildcat Lake | Intel Lunar Lake | Mediatek D9500 |
|---|---|---|---|---|
| Launch Date | Q1-2023 | Q2-2026 ? | Q3-2024 | Q3-2025 |
| Model | Intel N300 | ? | Core Ultra 7 268V | Dimensity 9500 5G |
| Dies | 2 | 2 | 2 | 1 |
| Node | Intel 7 + ? | Intel 18-A + TSMC N6 | TSMC N3B + N6 | TSMC N3P |
| CPU | 8 E-cores | 2 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores | C1 1+3+4 |
| Threads | 8 | 6 | 8 | 8 |
| Max Clock | 3.8 GHz | ? | 5 GHz | |
| L3 Cache | 6 MB | ? | 12 MB | |
| TDP | 7 W | Fanless ? | 17 W | Fanless |
| Memory | 64-bit LPDDR5-4800 | 64-bit LPDDR5-6800 ? | 128-bit LPDDR5X-8533 | 64-bit LPDDR5X-10667 |
| Size | 16 GB | ? | 32 GB | 24 GB ? |
| Bandwidth | | ~ 55 GB/s | 136 GB/s | 85.6 GB/s |
| GPU | UHD Graphics | | Arc 140V | G1 Ultra |
| EU / Xe | 32 EU | 2 Xe | 8 Xe | 12 |
| Max Clock | 1.25 GHz | | 2 GHz | |
| NPU | NA | 18 TOPS | 48 TOPS | 100 TOPS ? |
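
For anyone wondering where the Bandwidth row comes from: peak theoretical bandwidth is just the bus width in bytes times the transfer rate. Here is a rough Python sketch of that arithmetic. The function name `peak_bandwidth_gbs` is made up for illustration, the bus widths and speeds are taken from the Memory row, and the WCL memory speed is still a rumor.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Peak theoretical memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000


# (bus width in bits, transfer rate in MT/s), as listed in the Memory row.
configs = {
    "ADL-N (LPDDR5-4800)": (64, 4800),
    "WCL (LPDDR5-6800, rumored)": (64, 6800),
    "LNL (LPDDR5X-8533)": (128, 8533),
    "D9500 (LPDDR5X-10667)": (64, 10667),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

This gives roughly 54.4 GB/s for WCL and 136.5 GB/s for LNL, in line with the table. The D9500 works out closer to 85.3 GB/s than the 85.6 GB/s listed, so treat that cell as approximate.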









With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.




DavidC1

Golden Member
Dec 29, 2023
1,833
2,961
96
It's "free" performance, as in up to 25% more MT perf with just 5% extra transistors per core.
SMT has a worse penalty than just a negligible increase in die size or power consumption. SMT makes validation more difficult, so it increases the risk of designing it. It can potentially delay the introduction of a new core. The E core team is taking an entirely different approach.

If the E core lead is taking over, it is plausible that across all segments HT is going away, including servers.
 

yuri69

Senior member
Jul 16, 2013
677
1,215
136
SMT has a worse penalty than just a negligible increase in die size or power consumption. SMT makes validation more difficult, so it increases the risk of designing it. It can potentially delay the introduction of a new core.
Dunno. More than a decade ago, AMD, on the verge of bankruptcy, went all-in on the Zen core. The core was clearly a rather rushed thing and was released full of bugs, yet one of its basic design goals was to feature SMT.
 

OneEng2

Senior member
Sep 19, 2022
840
1,107
106
Lunar lake cpu tile <140 mm2
Not sure where I got the 170 from. I stand corrected.

The 288-core variants may or may not be competitive on some targeted workloads, but CLWF will certainly be better than 192C/384T in non-AVX-512 workloads: 288 physical cores vs 192C/384T, and they are Chadmont, not weak like Sierra Forest.
I'm a bit confused here. I thought that Intel's Sierra Forest 144 and 288 core variants (288 not yet released as I understand) were all that is planned for that market segment until Clearwater Forest (now looking more like early 2026 from what I am hearing) on 18A GAA.

This places the 288 core Sierra Forest with its Sierra Glen cores (tweaked Crestmont) (on Intel 3) (identical to today's Sierra Forest) up against a 192 core (384 thread) Zen 5c Turin (on N3E). From what I can tell, the N3B being used for e cores in LNL is a better process than Intel 3 and nearly as good as 18A is expected to be in many respects.

Granite Rapids is a 128-core Redwood Cove design up against AMD Turin 128-core Zen 5 (full).

So Intel is using last gen architecture in its server lineup against AMD's ZEN 5 lineup. What I think will be most interesting is seeing this matchup (where the money is). Additionally, much is riding on Intel's Arrow Lake as it will have to take on ZEN 5 in the laptop and desktop markets (with the former being of much more consequence).

Intel is definitely in a better position than it was a year ago, but it is doing so while bleeding cash while AMD is still running green. Intel is also changing everything while AMD is following an evolutionary path.
Arrow Lake supposedly will have hyperthreading, it's just "optional" and likely will be enabled for server parts.

That said, hyperthreading is "free" performance, but the majority of the gains come at the cost of more power consumption. If you are in a power-constrained system, HT gains are much slimmer, closer to 6-10% more multi-core performance per watt, at the cost of 2-3% single-threaded performance. Once you have enough threads to cover any conceivable workload, you are better off focusing back on more ST performance IMO.
I find this hard to believe (and not what I have read). SMT isn't free though. It costs die space and energy. The question really is, does that die size and energy cost pay off in performance?

FWIW, the standard benchmarking I have seen done on ZEN 5 shows a 20-30% performance boost in threaded loads when SMT is enabled. Intel's implementation is as you stated and much lower.

In the data center market, no one cares about single threaded performance ..... and this is where the big profits are.
It's not free, and it also needs synergy with the rest of the design. Intel chose to go a different path with E cores, and as long as they're willing to keep a high number of E cores in the mainstream consumer platform... HT is optional for them.

The transistor cost is ~10% or higher based on what we know from other companies and also based on napkin math using Intel's own marketing slides:


For the sake of napkin math, let's assume HT uses 12.5% more core area. In a design with 8 P cores, the savings would accumulate to 100% of a P core, or roughly 3 E cores (the modern ones). Modern E core to P core equivalence is probably something like 3:2 (based on very rough math I just did in my head), so the savings from removing HT can be used to increase MT performance by about 25%. And that with fewer threads, so arguably better scaling. Everybody can chill now, nothing was lost.

There's a catch though, the design needs to be hybrid and this brings a host of other tradeoffs and design decisions. The loss of HT is arguably the least of Intel's worries. As long as consumer Lion Cove does not spend the transistor budget to include HT and keep it disabled in firmware, everything is just fine and the saving can translate to either more E cores or lower price. (or an NPU, so we can all weep silently in a corner).
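
The napkin math in the quoted post can be written out step by step. This is only a sketch of that one estimate, not a measurement: every input (12.5% HT area, 3 E cores per P core of area, 3 E cores matching 2 P cores in MT) is the quoted poster's assumption, the function and parameter names are made up here, and the gain is measured against the 8 P cores alone.

```python
def mt_gain_from_dropping_ht(
    p_cores: int = 8,
    ht_area_fraction: float = 0.125,  # assumed share of P-core area spent on HT
    e_cores_per_p_area: float = 3.0,  # assumed E cores fitting in one P core's area
    e_per_p_perf: float = 3 / 2,      # assumed E cores needed to match one P core in MT
) -> float:
    """Fractional MT gain from spending the HT area on extra E cores."""
    saved_p_area = p_cores * ht_area_fraction            # in units of one P core
    extra_e_cores = saved_p_area * e_cores_per_p_area
    extra_p_equivalents = extra_e_cores / e_per_p_perf   # E cores expressed as P cores
    return extra_p_equivalents / p_cores


gain = mt_gain_from_dropping_ht()
print(f"Estimated MT gain: {gain:.0%}")
```

With the default inputs this reproduces the ~25% figure. Changing `ht_area_fraction` to something like 0.05 shows how sensitive the conclusion is to the area assumption.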
I am very interested in seeing the comparisons (die size, energy usage, performance) between Turin on N3E and Sierra on Intel 3. 192 cores of Zen 5c up against 288 cores of Crestmont. Should tell us quite a bit IMO.
SMT has a worse penalty than just a negligible increase in die size or power consumption. SMT makes validation more difficult, so it increases the risk of designing it. It can potentially delay the introduction of a new core. The E core team is taking an entirely different approach.

If the E core lead is taking over, it is plausible that across all segments HT is going away, including servers.
Seems to me that the tools and processes to validate SMT are well worked out. After all, it's been around since P4. I suspect that some people in this forum aren't even that old :).
 
  • Like
Reactions: igor_kavinski

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,247
16,108
136
SMT has a worse penalty than just a negligible increase in die size or power consumption. SMT makes validation more difficult, so it increases the risk of designing it. It can potentially delay the introduction of a new core. The E core team is taking an entirely different approach.

If the E core lead is taking over, it is plausible that across all segments HT is going away, including servers.
Intel has historically had a bad security problem with SMT and also little gain. Not sure, but it was like 10% or so. AMD, on the other hand, got a 25-30% boost from SMT; not sure why, but those are approximately the numbers.
 

MS_AT

Senior member
Jul 15, 2024
869
1,765
96
SMT has a worse penalty than just a negligible increase in die size or power consumption. SMT makes validation more difficult, so it increases the risk of designing it. It can potentially delay the introduction of a new core. The E core team is taking an entirely different approach.

If the E core lead is taking over, it is plausible that across all segments HT is going away, including servers.
While validation sounds plausible, I think another reason why the E core is not using SMT is the OS scheduling nightmare the hybrid CPUs would become if the P cores and E cores were both SMT-enabled ;)
I find this hard to believe (and not what I have read). SMT isn't free though. It costs die space and energy. The question really is, does that die size and energy cost pay off in performance?
I would argue that the cost is the die space and additional validation. The increased energy usage is a consequence of the fact that you are actually using the backend resources that would otherwise sit idle; therefore, you will not exceed the power usage that the core would have if it were running well-optimized code able to use all resources without SMT.
 

AcrosTinus

Senior member
Jun 23, 2024
221
226
76
I finished reading the APX documentation. This seems to be a real next-generation x86 improvement akin to the AMD64 extension. The doubled register count lessens the burden on memory as well as lowering pressure on the branch predictor.

Very nice, though I don't know when these changes will be implemented. If you are still okay with your current performance, I would wait for CPUs implementing APX, since it improves general performance and not only vector performance.
 

DavidC1

Golden Member
Dec 29, 2023
1,833
2,961
96
While validation sounds plausible, I think another reason why E-core is not using SMT is the OS scheduling nightmare the hybrid CPUs would become if the P cores and E cores were both SMT enabled;)
IBM Power 5/6 presentations say SMT verification is much harder than SMP for various reasons, such as splitting the core into two without running into conflicts and deciding what to share and what not to share. This is also speculation on why Conroe didn't have HT but Nehalem did, since the Nehalem team was the one that worked on Pentium 4.

That complication can be translated into boosting per thread performance of the core, and some can be made up by straight up adding more cores.
 
  • Like
Reactions: Thunder 57

Hulk

Diamond Member
Oct 9, 1999
5,138
3,727
136
For better or worse, the reality is that most software still relies on 8 or fewer cores for the bulk of its performance. Of course, this is why Geekbench 6 does not scale well with lots of cores: it's trying to emulate the software and routines most CPUs are seeing these days.

20 years ago when many people had 1 core processors, HT was amazing. I had a P4 3.06 (Northwood) and HT made it a dual core, kind of. But 2 cores, even if one is logical is so much better than 1 core. I distinctly remember understanding what "multitasking" really meant once I went from my PIII Tualatin to the Northwood P4. 2 "real" cores with Core2Duo was even better. When we were dealing with 2 or 4 cores and 6 was the "high end," HT/SMT was a big deal and quite useful.

Today we are looking at 6 or 8 big cores minimum and either SMT with AMD or E's with Intel. If you have 8 P's and 8 or 12 or 16 E cores, considering how most software isn't scaling very well to the 25th core, I'd much rather have stronger cores 1 through 8, and do away with those logical cores 25 to 32 since most software isn't utilizing them much anyway. I can also do away with the extra thermal concentration in the P cores and the sacrifice in ST performance SMT brings.
 

511

Diamond Member
Jul 12, 2024
4,533
4,161
106
Not sure where I got the 170 from. I stand corrected.


I'm a bit confused here. I thought that Intel's Sierra Forest 144 and 288 core variants (288 not yet released as I understand) were all that is planned for that market segment until Clearwater Forest (now looking more like early 2026 from what I am hearing) on 18A GAA.
It is Q3 25
This places the 288 core Sierra Forest with its Sierra Glen cores (tweaked Crestmont) (on Intel 3) (identical to today's Sierra Forest) up against a 192 core (384 thread) Zen 5c Turin (on N3E). From what I can tell, the N3B being used for e cores in LNL is a better process than Intel 3 and nearly as good as 18A is expected to be in many respects.
Intel 3 and N3B are not that far apart in power/performance tbf, only in density. 18A is between N3P and N2, around 10-15% better than N3B.
Granite Rapids is a 128-core Redwood Cove design up against AMD Turin 128-core Zen 5 (full).
Yes, both have full-fat AVX-512, but GNR has AMX Superior Memory Accelerators; there is a difference in terms of platform.
So Intel is using last gen architecture in its server lineup against AMD's ZEN 5 lineup. What I think will be most interesting is seeing this matchup (where the money is). Additionally, much is riding on Intel's Arrow Lake as it will have to take on ZEN 5 in the laptop and desktop markets (with the former being of much more consequence).

Intel is definitely in a better position than it was a year ago, but it is doing so while bleeding cash while AMD is still running green. Intel is also changing everything while AMD is following an evolutionary path.
There is no other way around it for Intel: you have to bleed cash or you accept defeat. Intel 3 is also very good in terms of cost vs Intel 7.
 

Hulk

Diamond Member
Oct 9, 1999
5,138
3,727
136
Just to add another fact to the SMT/E cores discussion it should be noted that in Cinebench, which scales almost perfectly with the number of cores thrown at it, hyperthreading on a P core running at 5.5GHz is equal to 1/2 of an E core running at 4.4GHz.

So, 8 P cores in a 14900K running Cinebench R23 MT with hyperthreading enabled score the same as the same 8 P cores running at the same frequency without hyperthreading along with 4 E cores running at 4.4GHz.
 

dttprofessor

Member
Jun 16, 2022
163
45
71
Just to add another fact to the SMT/E cores discussion it should be noted that in Cinebench, which scales almost perfectly with the number of cores thrown at it, hyperthreading on a P core running at 5.5GHz is equal to 1/2 of an E core running at 4.4GHz.

So, 8 P cores in a 14900K running Cinebench R23 MT with hyperthreading enabled score the same as the same 8 P cores running at the same frequency without hyperthreading along with 4 E cores running at 4.4GHz.
SMT is useless for games.
 

Hulk

Diamond Member
Oct 9, 1999
5,138
3,727
136
SMT is useless for games.
Just to keep this discussion on a path to truth let's keep in mind it's really not if SMT is good or bad, but more when and where it is good or bad.

If you are running a 4 core processor then I would think SMT would be helpful.

8 cores plus a handful of E's, or a 12 core, then the SMT transistors could probably be used better elsewhere.
 
Last edited:

dttprofessor

Member
Jun 16, 2022
163
45
71
Just to keep this discussion on a path to truth let's keep in mind it's really not if SMT is good or bad, but more when and where it is good or bad.

If you are running a 4 core processor then I would think SMT would be helpful.

8 cores plus a handful of E's, or a 12 core, then the SMT transistors could probably be used better elsewhere.
Yes.
 

511

Diamond Member
Jul 12, 2024
4,533
4,161
106
Just to add another fact to the SMT/E cores discussion it should be noted that in Cinebench, which scales almost perfectly with the number of cores thrown at it, hyperthreading on a P core running at 5.5GHz is equal to 1/2 of an E core running at 4.4GHz.

So, 8 P cores in a 14900K running Cinebench R23 MT with hyperthreading enabled score the same as the same 8 P cores running at the same frequency without hyperthreading along with 4 E cores running at 4.4GHz.
You verified this on your CPU, right? If yes, then your per-core performance difference between P and E is very large vs what it will be in Arrow Lake. On a side note:
ARL-H will have better battery life than Meteor Lake for a few reasons:
> Finer clock Granularity
> N3B
> E core being massively better than crestmont so workload won't jump on P core
 

Hulk

Diamond Member
Oct 9, 1999
5,138
3,727
136
Double checked the numbers.

8P @5.5GHz with HT on - CB R23 MT score is 22,783, package power is 173W

8P @5.5GHz no HT plus 5 E at 4.4GHz - CB R23 MT score is 22,620, package power is 166W.

So at 5.5GHz and 4.4GHz for P's and E's, respectively, it takes 5 E's to equal 8 P logical processors in CB R23 MT.

Package power is actually less without HT for the same score.

Just wanted to put out the correct numbers.

So at least for this scenario it would take 8/5*16=25.6 logical P cores at 5.5GHz to equal 16 physical E cores at 4.4GHz.
Not that the calculation means anything because physical cores scale with logical cores in this case but I just wanted to know what those 16E's are worth in terms of logical P cores.

Finally, Intel could have simply removed HT from the P's in Raptor and only had to add 5 E's to make up for it in MT in the most extreme (linear scaling) case.

They went one better and dramatically increased the compute with Skymont.
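
For anyone who wants to rerun the arithmetic from the numbers above, here is a small Python sketch. The scores and package power are the ones posted above; the helper names (`points_per_watt`, `logical_p_equivalent`) are made up for illustration, and the conversion assumes the linear scaling described above, which is the most extreme case.

```python
# Cinebench R23 MT results from the post above (14900K, P cores at 5.5GHz).
ht_on = {"score": 22783, "watts": 173}       # 8P with HT enabled
ht_off_5e = {"score": 22620, "watts": 166}   # 8P without HT, plus 5 E cores at 4.4GHz


def points_per_watt(run: dict) -> float:
    """CB R23 MT points per watt of package power."""
    return run["score"] / run["watts"]


def logical_p_equivalent(e_cores: int, ht_threads: int = 8, e_match: int = 5) -> float:
    """Logical P cores worth of throughput for a given number of E cores,
    using the 5 E cores ~ 8 HT threads equivalence and linear scaling."""
    return ht_threads / e_match * e_cores


print(f"HT on:        {points_per_watt(ht_on):.1f} pts/W")
print(f"HT off + 5E:  {points_per_watt(ht_off_5e):.1f} pts/W")
print(f"16 E cores ~= {logical_p_equivalent(16):.1f} logical P cores")
```

On these two runs the no-HT plus 5 E configuration comes out a few percent ahead in points per watt (about 136.3 vs 131.7), and 16 E cores map to the same 25.6 logical P cores worked out above.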
 

OneEng2

Senior member
Sep 19, 2022
840
1,107
106
Intel has historically had a bad security problem with SMT and also little gain. Not sure, but it was like 10% or so. AMD, on the other hand, got a 25-30% boost from SMT; not sure why, but those are approximately the numbers.
Agreed. AMD appears to have a much better implementation of SMT. My suspicion is that this pays off in the data center business in terms of performance per watt and performance per die area (the former being the most important in this space).
This means 18A is delayed then? Because Intel's been hinting at a Q3 launch
I am reading what others in the industry are saying and speculating for the most part: https://semiwiki.com/forum/index.php?threads/what-is-really-going-on-with-intel’s-18a-process.20944/

Every other manufacturer that brought GAA to production was able to get the process to the point of creating chips, and had very bad issues trying to get yields up to the point where the process made money and was able to support high volume. I don't believe Intel will be any different in this respect. From a philosophical point of view, Intel gambled on too many new technologies on 10 nm and it cost them a great number of delays. It appears to me that Intel has even more gambles going on with 18A than they did with 10nm. Additionally, others in the industry were able to complete 7nm production with high yields before Intel managed to get their 10nm cleaned up. In the case of 18A we are told to believe (by Intel) that this time it will be different. This time, they will do it and not experience what every other fab has experienced. I find this very difficult to believe.

From the above article, Intel does not expect to receive appreciable revenue from 18A foundry business until 2027. That is a whole long time to finance a God awful expensive process for a company that is bleeding cash. Color me skeptical.
For better or worse, the reality is that most software still relies on 8 or fewer cores for the bulk of its performance. Of course, this is why Geekbench 6 does not scale well with lots of cores: it's trying to emulate the software and routines most CPUs are seeing these days.
True, and for desktop and laptop applications, that is fair. I am concerned that Intel will concede another round of server products to AMD with its decision, and concede the huge profits that market provides.
It is Q3 25

Intel 3 and N3B are not that far apart in power/performance tbf, only in density. 18A is between N3P and N2, around 10-15% better than N3B.

Yes, both have full-fat AVX-512, but GNR has AMX Superior Memory Accelerators; there is a difference in terms of platform.

There is no other way around it for Intel: you have to bleed cash or you accept defeat. Intel 3 is also very good in terms of cost vs Intel 7.
The numbers I have read put 18A somewhere between N3P and N2, and in some respects equal to N3P. As I understand it, N3B is better in all respects than the other N3X variants, just more expensive to use.

What do you base your statement that GNR has a superior memory accelerator to Turin on? I haven't heard this before.

Intel can't bleed cash forever. Currently it looks like their plan won't work without the foundry being lifted up by external customers using it. This makes sense as the exponential cost of new process technology has made it very hard for an IDM to justify new nodes when those new nodes must be exclusively paid for by a single chip program.
SMT is useless for games.
True; however, there is little profit in gaming CPUs while there is a ton of profit in data center CPUs. Seems like a good trade-off to me. Of course, one could simply have totally different CPU designs for data center and consumer products ..... but I do believe that history has shown that this results in failed companies for those who have tried it.
Highest single core on desktop. Arrow lake looking good
But getting killed in multi-core.

Also, I am guessing that Arrow Lake will not be able to compete with X3D chips from AMD in gaming .... which I consider to be the #1 application that actually cares about single core performance.

Intel did make some great design choices for the Lunar Lake processor though. For thin and light laptops, battery life plus decent performance is a great combination. I wonder about Arrow Lake though. Seems like it will be squeezed between X3D being better for games, and the lower cost non X3D being better at multi-threaded workloads.

Just a few more days and we will see.
 
Jul 27, 2020
28,042
19,146
146
I wonder about Arrow Lake though. Seems like it will be squeezed between X3D being better for games, and the lower cost non X3D being better at multi-threaded workloads.
Should sell well if it beats 14900KS in ST. I'm more interested in what Intel has planned for 295K. Maybe 8P+24E?