Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads

Tigerick · Aug 22, 2022

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.

Intel Core Ultra 100 - Meteor Lake

As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

511 · 2025-08-02T14:03:30-0400

gdansk said:
Do you remember how that ended? People were so damn confident ARL had it after Zen 5% that they're still coping by comparing tweaked ARL to untweaked GR. Today Intel is selling 20 cores that cost much more to make than 1CCD GR for $210 here. You know it's "similarish" when retailers have to undercut the competition's part by ~$150 to move inventory.
I would not want a repeat of that.

ARL only sucked in gaming vs Zen 5. Anyway, they learned their lesson and made less ARL.

gdansk · 2025-08-02T14:06:41-0400

511 said:
ARL only sucked in gaming vs Zen 5. Anyway, they learned their lesson and made less ARL.

It also sucks in SPECint 2017. And Phoronix. And Geekbench. And Speedometer, which I was assured it would dominate. Not just gaming.
Mind you, this message is sent from an LNL laptop via a VPN + firewall running on an ARL W880 server. So it's not like I am against Intel selling things cheaply. But don't bury your head in the sand. ARL managed to fail to meet even its modest hypetrain.

511 · 2025-08-02T14:13:52-0400

gdansk said:
It also sucks in SPECint 2017. And Phoronix. And Geekbench. And Speedometer, which I was assured it would dominate. Not just gaming.

Pretty evenly matched in Geekbench. As for SPEC, you are right. Phoronix has many AVX-512 benches, which widens the delta a lot in comparisons; that would be fixed with NVL.


Edit: jfc, so much worse in Speedometer. Did the boost profile make a difference?

reb0rn · 2025-08-02T14:26:16-0400

gdansk said:
What real world, big boy work are you doing that doesn't need to communicate outside of its L2 ghettocluster quickly?
The current monts in desktop parts are configured for one thing and one thing only - chart wankery.

Just because you do not need it does not mean you are right. I have 10 PCs atm, all for MT use, currently 7950 and 9950X, but I will upgrade to whichever next part offers more perf for less power. I am hoping for Intel this time: the code I use is mostly AI and MT load + GPU, and it is optimized, so I only need real cores. But that remains to be seen once Zen 6 vs the 52-core Intel part is out. I also sometimes need AVX-512, so that will play a part.

511 · 2025-08-02T14:29:50-0400

AVX-512 is on both platforms though

Fjodor2001 · 2025-08-02T14:30:06-0400

gdansk said:
What real world, big boy work are you doing that doesn't need to communicate outside of its L2 ghettocluster quickly?
The current monts in desktop parts are configured for one thing and one thing only - chart wankery.

So just because a core occasionally needs to communicate outside L2 you mean it's doomed?

jpiniero · 2025-08-02T14:34:39-0400

Fjodor2001 said:
No, because we're talking about MT perf.

In Cinebench.

reb0rn · 2025-08-02T14:44:03-0400

jpiniero said:
In Cinebench.

This is quite boring, as it is not true at all:
PyTorch / TensorFlow

Hugging Face Transformers

XGBoost / LightGBM

Apache Spark (local mode)

Blender (AI-assisted features)

DaVinci Resolve Studio

Adobe Premiere Pro (Windows)

HandBrake

FFmpeg

Topaz Video AI

GCC / Clang / MSVC

Visual Studio 2022

Rust / Cargo

Unity

Unreal Engine 5 — compiling shaders and assets scales extremely well.

FPGA toolchains

RealityCapture / Metashape / Meshroom (photogrammetry)

GIMP (with GEGL multi-threaded plugins)

Emulators like Yuzu, RPCS3, PCSX2

OBS Studio — with filters, background encoding, streaming, and GPU passthrough.

Game Servers (Valheim, ARK, Minecraft, Rust)

Mathematica / MATLAB / Octave

OpenFOAM, ANSYS, COMSOL, ABAQUS

Quantum Espresso, GROMACS, LAMMPS

R with parallel or data.table

Docker or Podman (with multiple containers)

VirtualBox / VMware / KVM

qBittorrent or Transmission with high speed and connection count — CPU for checksums, encryption, etc.

Tor relays / nodes

511 · 2025-08-02T14:49:26-0400

Handbrake and FFmpeg are the same thing for me 🤣

Fjodor2001 · 2025-08-02T14:58:43-0400

jpiniero said:
In Cinebench.

Maybe you're talking about that, but others aren't.

gdansk · 2025-08-02T15:03:36-0400

Fjodor2001 said:
So just because a core occasionally needs to communicate outside L2 you mean it's doomed?

Nope. Merely extrapolating from benchmarks where it doesn't to where it does is flawed. And many people will look at proxies instead of the actual workload they run. And the current E-core configurations make these comparisons even more flawed than past configurations.

jpiniero · 2025-08-02T15:03:53-0400

reb0rn said:
This is quite boring, as it is not true at all

Those people are buying Threadrippers, not this, lol

Fjodor2001 · 2025-08-02T15:06:54-0400

511 said:
AVX-512 is on both platforms though

Yes, and on both E and P cores, assuming you mean NVL-S.

Fjodor2001 · 2025-08-02T15:10:48-0400

jpiniero said:
Those people are buying Threadrippers, not this, lol

Not when you can get similar performance from NVL-S much cheaper, especially when considering the total system cost.

If you need ~64+C (or more PCIe lanes / RAM) then TR will be an option though, since NVL-S does not have any SKU matching that. But TR with 32C and below? Not so much market for that once NVL-S arrives I think.

Also, you really have to consider whether the performance improvement of going from 54C NVL-S -> 64C TR is really worth it, because the cost difference will be huge.

reb0rn · 2025-08-02T15:13:41-0400

jpiniero said:
Those people are buying Threadrippers, not this, lol

Sure, I will send you the bill for the overpriced CPU. The 52-core Nova Lake will have more PCIe lanes too, and it will be a great upgrade for me if Intel delivers. If not, Zen 6 is there too.

gdansk · 2025-08-02T15:29:00-0400

reb0rn said:
This is quite boring, as it is not true at all
...

You can check Phoronix to see the lower core count 4nm 200W part win at most of these compared to the higher core count 3nm 250W part. Uniformity triumphs in the real world.

But if one were to extrapolate from Cinebench, then the 16 mont cores in the 285K alone should be on par with the 7950X. Yet you can see that in many of these workloads even the 7950X is on par with the 285K, even though the 285K has another 8 cores, faster memory, and 20W more.

I think the idea that NVL-SK will be significantly faster in MT depends on how much one extrapolates wildly from non-representative workloads. But we'll see.

Fjodor2001 · 2025-08-02T15:34:39-0400

gdansk said:
Nope. Merely extrapolating from benchmarks where it doesn't to where it does is flawed. And many people will look at proxies instead of the actual workload they run. And the current E-core configurations make these comparisons even more flawed than past configurations.

I think the L2 focus is a huge oversimplification. You really have to check benchmarks for various workloads. Just boiling down MT performance in general for all use cases only to what extent the cores need to communicate outside L2 is really not serious. There are so many other aspects determining MT performance, and it varies per type of workload.

gdansk · 2025-08-02T15:35:51-0400

Fjodor2001 said:
I think the L2 focus is a huge oversimplification. You really have to check benchmarks for various workloads. Just boiling down MT performance for all use cases only to what extent they need to communicate outside L2 is really not serious. There are so many other aspects determining MT performance, and it varies per type of workload.

Totally agree on that front.

MS_AT · 2025-08-02T16:32:16-0400

reb0rn said:
GCC / Clang / MSVC

Yes, we are better off forcing CPU affinity to avoid E cores on Meteor Lake, sticking to P cores + HT. On HX Raptor Lake with 8 P-cores and 12 E-cores, using the E-cores gives us a tiny benefit. Needless to say, people look at Strix Halo with longing, but Arrow Lake would probably do. This is at work, with data gathered from a cross section of dev laptops.

I'm just providing a data point; of course, YMMV.
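For anyone who wants to try the same affinity trick on Linux, here is a minimal sketch, not the setup described above. It assumes a hybrid Intel system where the kernel lists the P-core logical CPUs (HT siblings included) in `/sys/devices/cpu_core/cpus`; that path, and the helper names `parse_cpu_list` and `pin_to_p_cores`, are assumptions for illustration.

```python
import os

# Assumption: on hybrid Intel parts, recent Linux kernels list the
# P-core logical CPUs (including HT siblings) in this sysfs file.
P_CORE_LIST = "/sys/devices/cpu_core/cpus"


def parse_cpu_list(text):
    """Parse a Linux cpulist string such as "0-3,8,10-11" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_to_p_cores(pid=0, path=P_CORE_LIST):
    """Restrict `pid` (0 = this process) to P cores.

    Returns the set of CPUs applied, or None when the platform or the
    sysfs file is unavailable (non-Linux, non-hybrid CPU, older kernel).
    """
    if not hasattr(os, "sched_setaffinity") or not os.path.exists(path):
        return None
    with open(path) as f:
        p_cores = parse_cpu_list(f.read())
    # Only keep CPUs this process is already allowed to run on.
    allowed = p_cores & os.sched_getaffinity(pid)
    if not allowed:
        return None
    os.sched_setaffinity(pid, allowed)
    return allowed
```

Calling `pin_to_p_cores()` in a wrapper script before launching the compiler should be enough, since child processes inherit the affinity mask. Whether it helps depends on the workload, so measure both ways.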

Fjodor2001 said:
Yes, and on both E and P cores, assuming you mean NVL-S.

Just keep in mind that we know nothing about the AVX10 setup on Nova Lake, afaik. If they go with the sensible option, 512b on the P-core and 256b on the E-core, the per-cycle throughput of the 48-core part will match a 24-core Zen 6 (assuming they keep the Zen 5 DT arrangement), but Zen 6 might clock higher.

511 · 2025-08-02T23:28:06-0400

Fjodor2001 said:
Yes, and on both E and P cores, assuming you mean NVL-S.

Yes

MS_AT said:
Just keep in mind that we know nothing about the AVX10 setup on Nova Lake, afaik. If they go with the sensible option, 512b on the P-core and 256b on the E-core, the per-cycle throughput of the 48-core part will match a 24-core Zen 6 (assuming they keep the Zen 5 DT arrangement), but Zen 6 might clock higher.

It is dual-pumped like Zen 4, not full fat, for the E cores.

gdansk · 2025-08-02T23:34:30-0400

511 said:
It is dual-pumped like Zen 4, not full fat, for the E cores.

Do you know if Cougar Cove or Coyote Cove is 32 bytes fetch per cycle from L2 or is it still 16?

511 · 2025-08-03T00:13:02-0400

gdansk said:
Do you know if Cougar Cove or Coyote Cove is 32 bytes fetch per cycle from L2 or is it still 16?

Cougar Cove is still the same as Lion Cove. For Coyote Cove, I don't know.

igor_kavinski · 2025-08-03T00:14:50-0400

511 said:
coyote cove

At least in name, there's potential for being a howling success

Fjodor2001 · 2025-08-03T05:38:44-0400

MS_AT said:
Just keep in mind that we know nothing about the AVX10 setup on Nova Lake, afaik. If they go with the sensible option, 512b on the P-core and 256b on the E-core, the per-cycle throughput of the 48-core part will match a 24-core Zen 6 (assuming they keep the Zen 5 DT arrangement), but Zen 6 might clock higher.

I wonder how you reached that conclusion. NVL-S will have 16 P cores + 32 E cores, and Zen6 24 cores. So assuming 16 P cores approximately will match 16 Zen6 cores, do you mean that the remaining 8 Zen6 cores will be faster than 32 E cores?

reaperrr3 · 2025-08-03T06:47:25-0400

Fjodor2001 said:
I wonder how you reached that conclusion. NVL-S will have 16 P cores + 32 E cores, and Zen6 24 cores. So assuming 16 P cores approximately will match 16 Zen6 cores, do you mean that the remaining 8 Zen6 cores will be faster than 32 E cores?

The 48c NVL will be much more power-/clockspeed-constrained, and you're forgetting that SMT matters in MT workloads, so 16 Zen6 cores will often beat 16 P cores in MT.

The current NVL-Sx2 rumors say 1.1x for ST and 1.6x for MT vs. ARL (285K).
285K and 9950X in MT workloads are roughly a tie on average, according to computerbase.de (in non-gaming applications, mind you).

Zen6 increases core count by 50%, increases IPC by ~6-12%, and will probably be able to reach similar or higher all-core clocks compared to Zen5, so expecting roughly 1.6x MT perf for 10950X vs. 9950X is reasonable.
So unless NVLx2 punches well above its rumored MT weight, 10950X should roughly match it in MT.
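As a quick sanity check of the ~1.6x figure, here is a back-of-the-envelope sketch. It assumes MT throughput scales linearly with core count, IPC, and all-core clock, which is an idealization; power limits and memory bandwidth would pull real numbers lower. The function name `mt_scaling` is made up for illustration.

```python
def mt_scaling(core_ratio, ipc_gain, clock_ratio=1.0):
    """Idealized MT speedup: cores x IPC x clock, assuming perfect scaling."""
    return core_ratio * (1.0 + ipc_gain) * clock_ratio


# Figures from the post: +50% cores, ~6-12% IPC, similar all-core clocks.
low = mt_scaling(1.5, 0.06)
high = mt_scaling(1.5, 0.12)
print(f"{low:.2f}x to {high:.2f}x")  # prints "1.59x to 1.68x"
```

So the rumored 1.6x sits at the low end of the idealized range, which leaves little room for scaling losses unless clocks go up.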
