
Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18-A process. Last time I checked, the PCD tile is fabbed on the TSMC N6 process. They are connected through UCIe, not D2D; a first from Intel. Expected to launch in Q1 2026.

| | Intel Raptor Lake U | Intel Wildcat Lake 15W? | Intel Lunar Lake | Intel Panther Lake 4+0+4 |
|---|---|---|---|---|
| Launch Date | Q1-2024 | Q2-2026 | Q3-2024 | Q1-2026 |
| Model | Intel 150U | Intel Core 7 | Core Ultra 7 268V | Core Ultra 7 365 |
| Dies | 2 | 2 | 2 | 3 |
| Node | Intel 7 + ? | Intel 18-A + TSMC N6 | TSMC N3B + N6 | Intel 18-A + Intel 3 + TSMC N6 |
| CPU | 2 P-core + 8 E-cores | 2 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores |
| Threads | 12 | 6 | 8 | 8 |
| Max Clock | 5.4 GHz | ? | 5 GHz | 4.8 GHz |
| L3 Cache | 12 MB | | 12 MB | 12 MB |
| TDP | 15 - 55 W | 15 W ? | 17 - 37 W | 25 - 55 W |
| Memory | 128-bit LPDDR5-5200 | 64-bit LPDDR5 | 128-bit LPDDR5x-8533 | 128-bit LPDDR5x-7467 |
| Size | 96 GB | | 32 GB | 128 GB |
| Bandwidth | | | 136 GB/s | |
| GPU | Intel Graphics | Intel Graphics | Arc 140V | Intel Graphics |
| RT | No | No | YES | YES |
| EU / Xe | 96 EU | 2 Xe | 8 Xe | 4 Xe |
| Max Clock | 1.3 GHz | ? | 2 GHz | 2.5 GHz |
| NPU | GNA 3.0 | 18 TOPS | 48 TOPS | 49 TOPS |
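
The 136 GB/s bandwidth figure in the spec table matches the 128-bit LPDDR5x-8533 configuration if you work it out as bus width times transfer rate. Here is a minimal sketch of that arithmetic, assuming the number after "LPDDR5/LPDDR5x" is the transfer rate in MT/s and that 1 GB = 10^9 bytes. The function name is made up for illustration, and these are theoretical peaks, not measured numbers.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8          # e.g. 128 bits -> 16 bytes
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


# Configurations copied from the spec table; Wildcat Lake is left out
# because the table gives no memory speed for it.
configs = {
    "Raptor Lake U (128-bit LPDDR5-5200)": (128, 5200),
    "Lunar Lake (128-bit LPDDR5x-8533)": (128, 8533),
    "Panther Lake 4+0+4 (128-bit LPDDR5x-7467)": (128, 7467),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

The same formula should fill in the blank bandwidth cells for the other columns once their memory speeds are confirmed.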






PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, a first from Intel to use GAA transistors, called RibbonFET.



LNL-MX.png
 

Power efficiency is still pretty meh even with 3nm and all these "efficiency" cores. Multi-threaded performance is also not that amazing; it basically ties Zen 5 on average.

People seem to be severely overestimating E-cores. The only thing they seem good for is optimizing the die area for Cinebench R24.
E cores are less efficient than P cores at the upper part of the curve. Yes, less efficient, not more or even equal; there is data out there that shows this. So on desktop they are worse for MT than P cores. The other thing, of course, is that their much lower area allows Intel to stuff in more of them.
 
Intel really needed to keep hyperthreading. It's easy to see why Intel is behind without it.
AGREED

Their biggest lie: "We axed HT for single threaded performance"

WHAT single thread performance????

You are literally losing in almost 99% of games!!!!

Like, you don't wanna hear my shrieking screaming voice right now, Intel!

Lion Cove's expanded structures probably only work to the max with HT enabled!
 
E cores are less efficient than P cores at the upper part of the curve. Yes, less efficient, not more or even equal; there is data out there that shows this. So on desktop they are worse for MT than P cores. The other thing, of course, is that their much lower area allows Intel to stuff in more of them.
My point is, people here act like E-cores should replace P-cores, but they can't even clock to 4.6 GHz without completely tanking efficiency.

Even if they can significantly improve IPC for the E-core next gen, if they are still clocked at <5 GHz, good luck replacing the P-core as a design.
 
AGREED

Their biggest lie: "We axed HT for single threaded performance"

WHAT single thread performance????

You are literally losing in almost 99% of games!!!!

Like, you don't wanna hear my shrieking screaming voice right now, Intel!

Lion Cove's expanded structures probably only work to the max with HT enabled!
Let's go back to 9th generation Intel. They dropped Hyperthreading because of security issues. Without Hyperthreading, no security issues. Hyperthreading returned for the 10th generation Intel chips and stayed through the 14th generation. Hyperthreading will probably return for Panther Lake (18A).
 
My point is, people here act like E-cores should replace P-cores, but they can't even clock to 4.6 GHz without completely tanking efficiency.

Even if they can significantly improve IPC for the E-core next gen, if they are still clocked at <5 GHz, good luck replacing the P-core as a design.
Well, that's why it's an E core. It's not designed for high clock speed. E cores won't replace P cores; they'll have to redesign them for a proper replacement.
 
285K stock DDR5-6400 Phoronix results:

Speedometer 3: Ah what?? Pathetic fail
Jetstream 2: Same. FAIL
OSPRAY: Fail
Embree: Fail
IndigoBench Bedroom: Fail
Intel Open Image Denoise: Fail
Appleseed Emily: Fail
V-ray: Fail
LuxCore Benchmark: Fail
ACES DGEMM: Fail
miniBUDE: Fail
GROMACS: FAIL
NAMD: FAIL
Xmrig: Fail
Clickhouse: Fail
DuckDB Clickbench: Fail
simdjson: Fail
Numpy: Fail
Cryptsetup: Fail (Decisive win only in Serpent-XTS and fails elsewhere)
SVT-AV1: Fail (despite being great in one test)
x265: Fail (because it is miserable in one test)
Kvazaar: Fail
uvg266: Fail
LibRAW: Fail (loses to 9700X!)
Liquid-DSP: Fail
Ngspice: Fail
srsRAN: Fail (despite being good in one test)
TensorFlow: Fail
OpenVINO: Fail
Whisper.cpp: Fail
Tested games: FAIL

WASM collisiondetection: Decisive Win
WASM imageconvolute: Decisive Win
Godot compilation: Decisive Win
LLVM compilation: Decisive Win
Mesa compilation: Decisive Win
CoreMark: Decisive Win
QuantLib: Decisive Win
BRL-CAD: Decisive Win
libxsmm: Decisive Win
GPAW: Decisive Win
Xcompact3D: Decisive Win
SPECFEM3D: Decisive Win
nginx: Decisive Win
Apache: Decisive Win
Cpuminer: VERY Decisive Win (despite being slightly slower in one test)
Apache IoTDB: Decisive Win
PGSQL: Decisive Win (despite losing in Read Only but most real world DB workloads are rarely read only)
CockroachDB: Decisive Win
PyBench: Decisive Win
PyPerformance: Decisive Win
PHPBench: Decisive Win
SVT-VP9: Decisive Win
WebP: Decisive Win
C-ray: Decisive Win

Linux kernel compilation: Great and 2nd only to 9950X
FFmpeg compilation: Great and barely faster than 9950X
PHP compilation: Great and 2nd only to 9950X
Gem5 compilation: Great and 2nd only to 9950X
7-zip compression: Great and 2nd only to 9950X
Blender BMW: Great and 2nd only to 9950X
Blender Junkshop: Great and 2nd only to 9950X
LuxCoreRender DLSC: Great and 2nd only to 9950X
LuxCore Orange Juice: Great and 2nd only to 9950X
Appleseed Disney: Great and 2nd only to 9950X
IndigoBench Supercar: Great and 2nd only to 9950X
DuckDB TPC-H: Great and 2nd only to 9950X
ASTC Encoder: Great and 2nd only to 9950X
Memcached: Great and 2nd only to 9950X
OpenFOAM: Very Great and 2nd only to 7950X3D (quite a feat!)
Pennant: Great and 2nd only to the X3Ds

Blender Fishycat: Just OK
Blender Pabellon Barcelona: Just OK

LuxCoreRender Rainbow: Barely wins
LAMMPS: Barely wins against 9950X

OpenRadioss: Overall great with 1 loss and two wins against 9950X
NAS Parallel Benchmarks: Overall great with 1 loss and two great wins against competitors

RocksDB: Serviceable (Stellar in random read but loses in Read while write, the latter being more important)
 
AGREED

Their biggest lie: "We axed HT for single threaded performance"

WHAT single thread performance????

You are literally losing in almost 99% of games!!!!

Like, you don't wanna hear my shrieking screaming voice right now, Intel!

Lion Cove's expanded structures probably only work to the max with HT enabled!
Obviously the answer is, it would have been even worse with HT (or at least in Lunar).
😎

Seriously, lack of HT is not Arrow Lake's problem. To all the guys saying it is losing in benchmarks due to that: do you realise that HT only covered a third of the cores? If HT could add 15% performance on P-Cores, that becomes +5% in the whole picture, and likely less due to non-linear scaling. Hardly relevant.

Consider that the core was designed with even hybrid-er hybrid designs, like 8+32 (20 % of cores affected) in mind. For Intel's hybrid design, it is pretty much true or close enough that you can offset the removal of SMT with adding more little/efficient cores, because the area saved can likely buy some.
Add to that that your scheduling decisions are simplified from "P-Core × E-Core × whether to populate HT threads on P-Cores" to just "P-Core × E-Core". Scheduling still isn't a fully solved issue, so perhaps this helps enough to offset some of that lost 5% MT performance.

It's different for AMD because they have SMT on all cores, so removal would cost them much more performance.
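
The back-of-the-envelope math in this post can be sketched in a few lines. This is only a rough sketch, assuming every core contributes equally to MT throughput, which is the same simplification the +5% estimate makes. The function name is made up for illustration and the 15% per-core gain is the hypothetical figure from the post, not a measurement. The 8+16 layout is the 285K; 8+32 is the hypothetical wider hybrid mentioned above.

```python
def chip_level_gain(cores_with_smt: int, total_cores: int, per_core_gain: float) -> float:
    """Upper-bound chip-wide MT gain when only some cores have SMT.

    Assumes all cores contribute equally to throughput (a simplification),
    so the per-core gain is simply diluted by the share of SMT-capable cores.
    """
    return per_core_gain * cores_with_smt / total_cores


# 8 P-cores with HT out of 24 total cores (8P+16E), 15% hypothetical gain per P-core
print(f"8P+16E: {chip_level_gain(8, 24, 0.15):.1%}")

# Hypothetical 8+32 design: only 20% of the cores would be affected
print(f"8P+32E: {chip_level_gain(8, 40, 0.15):.1%}")

# All cores have SMT, as on AMD: nothing is diluted
print(f"SMT on all cores: {chip_level_gain(16, 16, 0.15):.1%}")
```

Since P-cores are faster than E-cores, a throughput-weighted version would shift these numbers somewhat, so treat them as a ballpark only.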
 
E cores are good for stuffing many of them AND clocking them low so that efficiency is similar to or slightly higher than P cores. We are talking 3-3.5 GHz kind of low at maximum. Basically, they are a laptop type of core: efficient and cheap MT, coupled with at least two P cores for smooth day-to-day workloads, ideally at least 4.
Put them in a desktop on a high power limit CPU and they will not be that good.
E cores are a really good idea, just in the wrong segment and pushed way out of their sweet spot in order to chase AMD's high core count parts.
 
285K stock DDR5-6400 Phoronix results: [full list quoted from the post above]
Intel losing at their own software
 
E cores are good for stuffing many of them AND clocking them low so that efficiency is similar to or slightly higher than P cores. We are talking 3-3.5 GHz kind of low at maximum. Basically, they are a laptop type of core: efficient and cheap MT, coupled with at least two P cores for smooth day-to-day workloads, ideally at least 4.
Put them in a desktop on a high power limit CPU and they will not be that good.
E is perhaps a misnomer. Perhaps D for Dense.

Skymont is likely a partial inspiration for their next P and E cores. But don't expect it to be small. The advantage is the front-end which seems most likely to punch through the current x64 instruction level parallelism limitations and is scalable for both P and E designs.
 
Hope they realize their mistake and release the Refresh with HT enabled next year.
A mistake is not when you do something on purpose and realize later that the decision was wrong.

This is incompetence.
The advantage is the front-end which seems most likely to punch through the current x64 instruction level parallelism limitations and is scalable for both P and E designs.
There are more details to it than just the front-end.
 
Isn't the APO thing supposed to know what to do with the E-Cores? From Hardware Nexus, it did not seem to make much difference, even on the games that are supposed to support it.
Yeah, it's supposed to, but certain games/benchmarks are abnormally bad, and a bad scheduler seems to be the most likely cause.
 
Yeah, it's supposed to, but certain games/benchmarks are abnormally bad, and a bad scheduler seems to be the most likely cause.
And this:
DebauerCS.png
they are still clocked at <5 ghz good luck replacing P-core as a design.
Which is what they should be doing. Cut the clocks to ~5 GHz, which will allow reducing pipeline stages and tightening memory and cache latencies, and end up with a better core that's fit for desktop, server, and mobile.

Why do you think they had to back down ring clocks drastically? Because they couldn't clock it that high without turning into Raptor Lake.
 