Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads

Tigerick · Aug 22, 2022

As Hot Chips 34 starting this week, Intel will unveil technical information of upcoming Meteor Lake (MTL) and Arrow Lake (ARL), new generation platform after Raptor Lake. Both MTL and ARL represent new direction which Intel will move to multiple chiplets and combine as one SoC platform.

MTL also represents new compute tile that based on Intel 4 process which is based on EUV lithography, a first from Intel. Intel expects to ship MTL mobile SoC in 2023.

ARL will come after MTL so Intel should be shipping it in 2024, that is what Intel roadmap is telling us. ARL compute tile will be manufactured by Intel 20A process, a first from Intel to use GAA transistors called RibbonFET.

Intel Core Ultra 100 - Meteor Lake

As mentioned by Tomshardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

gdansk · Jul 24, 2025

dr1337 said:
Why isn't it a problem for Apple and their LLCs?

What's the size of their LLC?

LightningZ71 · Jul 24, 2025

The rumor that I've seen for the BLCC die is that they just duplicated the area in the middle of the die where all the L3 already was, making the whole thing wider. I think that they're going to have to make a material improvement in the ring itself to full realize the performance gains of that, though, as there's going to be a lot of constant ring traffic for L3 hits and writes that's going to make an already overstressed design even more heavily stressed. We'll see. I don't think that there will be many places where people will find where it hurts things. The question is more about how much will it really help?

AcrosTinus · Jul 24, 2025

igor_kavinski said:
What workloads does your 265K usually crash in? I haven't faced a serious crash on my 245KF yet during normal use but then I haven't used it for 24 hours straight yet. In your case, it seems like possibly a bad sample (could be one reason why they are having a firesale on 265K) or the mobo needs the latest BIOS and try running with only Intel base profile or 200S Boost profile and see if the crashes persist. If they still do, either bad mobo, some other finicky hardware causing the crashes (process of elimination to identify) or finally bad CPU sample.

I assume you encounter no issues because you a) have no IGPU or b) have no MSI board.

Visual Studio compiles
Handbrake encodes
OBS recording via quicksync
Docker Desktop containers
VMs with VM workstation
VMs with Hyper-V
Simple Browsing
Gaming (Apex, Marvel Rivals)

All these things trigger crashes randomly. Also crashed happen if I do nothing at around 17h it just crashes as well. Also I have used the boost profile, it is buggy as hell and triggers a different set of side effects with the NPU. For now I have gotten a beta BIOS and some voltage settings to apply. The system still crashes though, only using "PEG only" and XMP off provides maximum stability. If I have the time I will troubleshoot more. The 2xU7 and 1xU9 all show the same issues on the MSI board, the ASUS is not affected but has some issues regarding the NPU as well.

OneEng2 · Jul 24, 2025

Just my opinion, but an on-die large L3 is a bad economic idea. Creating a single monolithic large die to get a big L3 is going to decrease yields and increase the wafer waste.

AMD's implementation limits the size of the 3D L3 cache to the size of the original die it is being stacked onto. This is a much better idea for a number of reasons.

Still, Intel will likely achieve the end goal (drastically lowering overall memory latency) using their approach, it will just be expensive for them to do it.

I am not sure how this fits into Intel's desire to increase margins.

igor_kavinski · Jul 24, 2025

AcrosTinus said:
I assume you encounter no issues because you a) have no IGPU or b) have no MSI board.

Correct. No to both.

I purposely refused to update the drivers. Haven't even checked to see if the NPU driver is installed. I just wanted the vanilla virgin experience (don't laugh).

It was going so well until I hit that stupid RAM OC issue where the Lion Cove cores either don't want to do any work or randomly wake up and work and then stop after about 60 seconds. And 7200 RAM should boot at stock 6400 (Arrow Lake's stock RAM config) but instead resorts to 5600 MT/s. It's pathetic that the memory controller can't figure out if the RAM can run at stock 6400.

I just read this: https://www.xda-developers.com/intel-serious-problem-arrow-lake-memory-compatibility/

I guess I got lucky that I didn't face THAT many RAM problems.

coercitiv · Jul 24, 2025

OneEng2 said:
Just my opinion, but an on-die large L3 is a bad economic idea. Creating a single monolithic large die to get a big L3 is going to decrease yields and increase the wafer waste.

AMD's implementation limits the size of the 3D L3 cache to the size of the original die it is being stacked onto. This is a much better idea for a number of reasons.

I think the monolithic approach is indeed more expensive to produce, but it's probably less expensive to design... in terms of time to market that is. Intel is still a slow, reactive beast, they're trading cost efficiency for relevance in the consumer market. It will only cost them a few hundred engineer jobs. /s

Just for the fun of it, here's a very crude vertical multiplication of the sections that contain L3 on Arrow Lake. It's obviously not accurate, but fast to put together, and puts the big cache die at just under 1.6x the ARL compute die size. Napkin math says this means a more realistic layout estimate would fall under 1.5x area. If the vanilla NVL-S ends up around 95 mm2 then BLLC version would be ~140 mm2.

Joe NYC · Jul 24, 2025

coercitiv said:
I think the monolithic approach is indeed more expensive to produce, but it's probably less expensive to design... in terms of time to market that is. Intel is still a slow, reactive beast, they're trading cost efficiency for relevance in the consumer market. It will only cost them a few hundred engineer jobs. /s

Just for the fun of it, here's a very crude vertical multiplication of the sections that contain L3 on Arrow Lake. It's obviously not accurate, but fast to put together, and puts the big cache die at just under 1.6x the ARL compute die size. Napkin math says this means a more realistic layout estimate would fall under 1.5x area. If the vanilla NVL-S ends up around 95 mm2 then BLLC version would be ~140 mm2.

View attachment 127657

Then the comparison with Zen 6 + V=Cache would be:
NVL: 140 mm2 N2
Zen 6: 75 mm2 N2 + 75 mm2 N4 or N6

coercitiv · Jul 24, 2025

Joe NYC said:
Then the comparison with Zen 6 + V=Cache would be:
NVL: 140 mm2 N2
Zen 6: 75 mm2 N2 + 75 mm2 N4 or N6

I'm not comfortable comparing rough estimates like this, the figures used for NVL-S were meant to give a sense of scale, not make 1:1 comparisons with Zen 6. I do agree however that AMD is in a position to make the more cost efficient product, at least as far as perf/cost is concerned.

LightningZ71 · Jul 24, 2025

The x factor is the cost of the stacking process, the good dies that they lose in the process, the extra packaging and movement time costs.

I can only guess that cache die yields must be near 100% as there has never even been a mention of a die recovery product for it.

AcrosTinus · Jul 24, 2025

Intel CEO Letter to Employees

Lip-Bu Tan announces 15% workforce reduction

morethanmoore.substack.com

Just look at this, SMT is coming back and everyone on the 1T per P-Core train have just officially bought a nerfed P-Core CPU, me included.

gdansk · Jul 24, 2025

AcrosTinus said:
Intel CEO Letter to Employees

Lip-Bu Tan announces 15% workforce reduction

morethanmoore.substack.com

Just look at this, SMT is coming back and everyone on the 1T per P-Core train have just officially bought a nerfed P-Core CPU, me included.

Note that it is included under the "data center" category. Doesn't say it's coming back to client, where it's of questionable utility when you have e cores to spam.

Personally I'm more interested in what "simplified SKU stacks" will bring than anything else mentioned there.

AcrosTinus · Jul 24, 2025

gdansk said:
Note that it is included under the "data center" category. Doesn't say it's coming back to client, where it's of questionable utility when you have e cores to spam.

Personally I'm more interested in what "simplified SKU stacks" will bring than anything else mentioned there.

I think you can make the inference that it is coming back to client why:

The move towards 1T per P-Core was not just in client but also a DC thing.
The CEO does not want multiple CPU architectures in development, meaning if SMT is reintroduced with a core family, one can assume that it will will be used across the entire stack, making sure that advancement influence all products.

Saylick · Jul 24, 2025

coercitiv said:
I think the monolithic approach is indeed more expensive to produce, but it's probably less expensive to design... in terms of time to market that is. Intel is still a slow, reactive beast, they're trading cost efficiency for relevance in the consumer market. It will only cost them a few hundred engineer jobs. /s

Just for the fun of it, here's a very crude vertical multiplication of the sections that contain L3 on Arrow Lake. It's obviously not accurate, but fast to put together, and puts the big cache die at just under 1.6x the ARL compute die size. Napkin math says this means a more realistic layout estimate would fall under 1.5x area. If the vanilla NVL-S ends up around 95 mm2 then BLLC version would be ~140 mm2.

View attachment 127657

It’s starting to look like a Zen CCD where the cores take up a little under half the total die area lol

gdansk · Jul 24, 2025

AcrosTinus said:
The CEO does not want multiple CPU architectures in development, meaning if SMT is reintroduced with a core family, one can assume that it will will be used across the entire stack, making sure that advancement influence all products.

It might impact unified core, something far off, restoring SMT across all products. But I doubt that.

In either case the cat coves for client do not have SMT validated. Delay and time on rework/retest at this point are unacceptable. I.e. Arrow Lake will not be alone in this regard.

reb0rn · Jul 24, 2025

SMT is only good for server cloud as they offer clients mostly threads ....so they for them more threads more value
for home users and even for severs maybe best optimization would be without it

OneEng2 · Friday at 8:56 AM

coercitiv said:
I'm not comfortable comparing rough estimates like this, the figures used for NVL-S were meant to give a sense of scale, not make 1:1 comparisons with Zen 6. I do agree however that AMD is in a position to make the more cost efficient product, at least as far as perf/cost is concerned.

That is what I am thinking.

reb0rn said:
SMT is only good for server cloud as they offer clients mostly threads ....so they for them more threads more value
for home users and even for severs maybe best optimization would be without it

While I agree that server benefits the most from SMT, I think that SMT offers the best PPA you can get even in desktop for MT. Getting 40% of a core for 15% die space add is a good deal. This is particularly true because all your cores are identical and don't require you rely on OS scheduling to avoid putting inappropriate loads on processors that can't handle them efficiently ..... or bogging down your P cores with trivial tasks.

I did a comparison of the PPA of Zen 5c vs Skymont (with lots of napkin math) and the were about equal (Skymont was ~10% better using simple math and lots of assumptions). That 10% PPA comes at a pretty big cost though.

reb0rn · Friday at 1:27 PM

What I use I never seen more then 10% benefit from SMT... in MT, sure its depend on core but I alone would prefer always real cores

511 · Friday at 1:52 PM

reb0rn said:
What I use I never seen more then 10% benefit from SMT... in MT, sure its depend on core but I alone would prefer always real cores

even i did benchmarks on my MTL and in cinebench R23 with and without SMT it was roughly 10-15% difference for 16C/22T vs 16C/16T

gdansk · Friday at 2:07 PM

reb0rn said:
What I use I never seen more then 10% benefit from SMT... in MT, sure its depend on core but I alone would prefer always real cores

In what config? A 13100 gains a lot from HT. I measured about 25% in VM benches.

But if you're looking at a part with 1:2 P:E core mix about 10% is what one would expect...

igor_kavinski · Friday at 2:11 PM

gdansk said:
I measured about 25% in VM benches.

What benchmarks did you run inside the VMs?

511 · Friday at 2:11 PM

I need to rebench my setup with 6+8+HT/6+8no HT/6+0/6+0HT.
But the question is what should I bench? Cinebench R23 or 24?

gdansk · Friday at 2:18 PM

igor_kavinski said:
What benchmarks did you run inside the VMs?

wrk against nginx

reb0rn · Friday at 3:15 PM

I can presume non optimized MT load would benefit from SMT more, while dedicated MT optimized code have almost zero benefit

AcrosTinus · Friday at 4:37 PM

If I am less pessimistic, I would call the reintroduction of SMT a positive thing, but somehow my gut tells me, this will be at the cost of ST performance gains because time has to be invested into resurrection and hardening to prevent a SpectreV-X. Time that could have gone into the 1T performance. A super wide P-Core or the so called rumored RU is officially a bed time story.

LightningZ71 · Friday at 4:41 PM

It very much depends on the code. Completely homogenous tasks will typically run into contention issues as the code streams will fight for the same back end resources. Heterogenous code will better exploit the back end, especially if it is very wide. Low predictability branchy code won't stall the whole core, just one thread. Code that has lower memory demand can better share the caches.

Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads

Senior member

Attachments

Diamond Member

Platinum Member

Senior member

Senior member

Lifer

Diamond Member

Diamond Member

Diamond Member

Platinum Member

Senior member

Diamond Member

Senior member

Diamond Member

Diamond Member

Senior member

Senior member

Senior member

Diamond Member

Diamond Member

Lifer

Diamond Member

Diamond Member

Senior member

Senior member

Platinum Member