Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
686
576
106
PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical details of the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

Model | Code-Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores
Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4
? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 8 MB | Arc | 8
? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12



Comparison of die size of Each Tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

Spec | Meteor Lake | Arrow Lake (20A) | Arrow Lake (N3B) | Arrow Lake Refresh (N3B) | Lunar Lake | Panther Lake
Platform | Mobile H/U Only | Desktop Only | Desktop & Mobile H&HX | Desktop Only | Mobile U Only | Mobile H
Process Node | Intel 4 | Intel 20A | TSMC N3B | TSMC N3B | TSMC N3B | Intel 18A
Date | Q4 2023 | Q1 2025 ? | Desktop: Q4 2024; H&HX: Q1 2025 | Q4 2025 ? | Q4 2024 | Q1 2026 ?
Full Die | 6P + 8E | 6P + 8E ? | 8P + 16E | 8P + 32E | 4P + 4E | 4P + 8E
LLC | 24 MB | 24 MB ? | 36 MB ? | ? | 8 MB | ?
tCPU | 66.48 | | | | |
tGPU | 44.45 | | | | |
SoC | 96.77 | | | | |
IOE | 44.45 | | | | |
Total | 252.15 | | | | |
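
As a quick sanity check on the Meteor Lake column, the four tile figures should add up to the listed total. The sketch below only re-uses the numbers from the table; the dictionary and variable names are illustrative.

```python
# Meteor Lake tile sizes as listed in the table above (unit as given by the table).
mtl_tiles = {
    "tCPU": 66.48,
    "tGPU": 44.45,
    "SoC": 96.77,
    "IOE": 44.45,
}
listed_total = 252.15

# Sum of the individual tiles, rounded to the table's two decimals.
computed_total = round(sum(mtl_tiles.values()), 2)

# Share of the listed total taken by each tile, in percent.
shares = {name: round(100 * size / listed_total, 1) for name, size in mtl_tiles.items()}

print(computed_total)
print(shares)
```

The SoC tile is the largest single contributor in this breakdown, ahead of the compute tile.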



Intel Core Ultra 100 - Meteor Lake

INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg

As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

Clockspeed.png
 

Attachments: PantherLake.png, LNL.png

AMDK11

Senior member
Jul 15, 2019
350
246
116
You took my words about the scheduler too literally. Never mind.

Look carefully again. You can clearly see at the beginning, where the FPU part is, 4 execution ports and then 6 execution ports, which gives a total of 10. Right? The remaining 8 are from SD and AGU.
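
The port count being argued here is simple addition. A small tally, using only the figures from this post (4 + 6 ports, plus 8 for SD and AGU); the group names are illustrative labels, not Intel's terminology, and the counts are a reading of the diagram rather than confirmed specifications:

```python
# Port groups as read off the Lunar Lake diagram in this post (unconfirmed reading).
ports = {
    "fp_side": 4,      # ports next to the FPU part
    "alu_side": 6,     # the following group of ports
    "sd_and_agu": 8,   # store-data and address-generation ports
}

math_ports = ports["fp_side"] + ports["alu_side"]
total_ports = math_ports + ports["sd_and_agu"]

print(math_ports, total_ports)  # 10 18
```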
 

Cheesecake16

Junior Member
Aug 5, 2020
5
26
61
You took my words about the scheduler too literally. Never mind.

Look carefully again. You can clearly see at the beginning, where the FPU part is, 4 execution ports and then 6 execution ports, which gives a total of 10. Right? The remaining 8 are from SD and AGU.
Except that's not how Intel sets up their math scheduler...
Here is what Golden Cove's Math Scheduler looks like.....
Notice how the FP ALUs are on the same ports as the Integer ALUs..... That's what I expect of Lion Cove.....

1714093907086.png
 

Cheesecake16

Junior Member
Aug 5, 2020
5
26
61
C&C is continuing their MTL investigation, this time with an article on the NPU:

In summary, it looks like the NPU is of limited use because of the data types it supports (or rather, doesn't support). And in the cases it does support, the NPU offers lower power but not necessarily higher performance. The author seems to think that the iGPU is the better approach here because it's more powerful and more flexible, even at the cost of higher power, because there are many situations where you can plug in a laptop these days. That is, until Intel develops an NPU which covers more use cases with higher performance while continuing to use lower power.
Yeah..... Getting the NPU to work was a pain and in the end it ended up just being faster to run stuff on the iGPU....
Maybe with LNL, STX, and SDXE that will change and we will have to test it when those CPUs are available, but for right now we just didn't see the point of the NPUs in MTL or in PHX/HWK..... they just aren't fast or efficient enough to justify the headaches of programming them......
 

Hulk

Diamond Member
Oct 9, 1999
4,385
2,270
136
Adding another set of cores to the mix does not change the outcome: the work done by the P cores can still be optimized. Consider running a highly parallelized workload on an 8P+16E / 24T CPU, so SMT is disabled. The task is split between the P and E cores based on predetermined ratios, and the work done by the 8P/8T cores is finite. That same work can be done more efficiently by 8P/16T with lower clocks and lower voltage, as long as there's good scaling from 8T to 16T. The work done by the E cores is already accounted for.

I think SMT is the victim of a misunderstanding based on the power race in the recent years. Yes, SMT will increase power and temps when allowed to push the package power higher. However, when enforcing a sane power limit, SMT increases efficiency instead.
Theoretically yes, you are correct.

But in reality there are cases where this isn't the best option.

For example, with my 14900K I have HT turned off. I can achieve 5.5/4.3 in a more stable manner and with lower temps than with HT on. While there are very few applications that will slam all cores with HT on, when it happens BIOS settings that are stable with HT off will cause a restart with HT on.

The tiny bit of performance lost by having HT off in those few applications is more than made up for by having a cooler, lower voltage, more stable rig with less chance of CPU degradation.

If I was simply tuning the MT performance and efficiency then I would leave HT on and limit clocks as you noted. But setting up for that usage scenario for daily usage reduces overall performance in the many apps that still rely heavily on ST performance.
 

Hulk

Diamond Member
Oct 9, 1999
4,385
2,270
136
Raptor Cove is 40-45% in overall, where the Integer gap is closer and FP gap is large. It's something like 25% Int and 50-60% FP.

"Raptormont" gets 1-3% gain while Crestmont gets 4-6% gain. So a 30% gain with Skymont as did with Atom-based predecessors gets us to the "aiming for ADL" claim on Twitter. I think SKT will be able to reach Golden Cove similar to GMT reaching Skylake.

It means 10-15% faster than Golden Cove for Int while being 10-15% slower in FP. Consequently it means Sierra Glen(which is Crestmont without the 6-wide retire/allocate, or IOTW Gracemont) is similar to 10-15% faster per core than Skylake and is an excellent Cloud core.

If we extrapolate that to Darkmont-based Clearwater Forest, you essentially have an 18A 144-288 core better-than Golden Cove core chip.

On a side note, I speculate the possibility that they aren't backing down on clocks with Skymont hence the greater than expected core size while they are for Lion Cove.

Good analysis there. I know from reading your posts that you have a deep understanding of this topic so I appreciate the reply.

That type of "mont" IPC increase is mouth-watering...
 

mikk

Diamond Member
May 15, 2012
4,185
2,220
136
Edit:
It seems that Skymont has a 3x 3-Way decoder (Gracemont and Crestmont 2x 3-Way).

Not too surprising given that Raichu said so 6 months ago. And yes, it does look like there are 3 decoders in the LNL picture, I can see it.

It is based on three 3-way decoder clusters and the prediction bandwidth looks like has obvious improvement (more than 2X).
 

AMDK11

Senior member
Jul 15, 2019
350
246
116
Except that's not how Intel sets up their math scheduler...
Here is what Golden Cove's Math Scheduler looks like.....
Notice how the FP ALUs are on the same ports as the Integer ALUs..... That's what I expect of Lion Cove.....

I know how Intel, up to Redwood Cove, uses a scheduler for ALU and FP.

In the Lunar Lake diagram, the scheduler in the Lion Cove core appears to be separate, and this may be a big change from what is currently used. FP units and ALU units have separate ports. Alternatively, the FP part also has ALUs.

Either way, the ALU ports and FP ports give a total of 10 ports, as you can see in the diagram. Unless you think it's fake.
 

AMDK11

Senior member
Jul 15, 2019
350
246
116
Not too surprising given that Raichu said so 6 months ago. And yes, it does look like there are 3 decoders in the LNL picture, I can see it.
Yes. I know that. I even heard 2x 4-Way. Nevertheless, the LunarLake graphic and Skymont diagram confirm 3x 3-Way.
 

Cheesecake16

Junior Member
Aug 5, 2020
5
26
61
I know how Intel, up to Redwood Cove, uses a scheduler for ALU and FP.

In the Lunar Lake diagram, the scheduler in the Lion Cove core appears to be separate, and this may be a big change from what is currently used. FP units and ALU units have separate ports. Alternatively, the FP part also has ALUs.

Either way, the ALU ports and FP ports give a total of 10 ports, as you can see in the diagram. Unless you think it's fake.
Like I said, I think you are reading too much into it.....

At this point we don't know enough about what Lion Cove (or Skymont for that matter) actually looks like to say whether the image is actually showing anything notable... Sure, there are folks like Raichu on Twitter who claim certain things, but there is no "hard" evidence like GCC patches, LLVM patches, perf patches, or MSR documentation out yet, other than the ISA manual that Intel puts out, which isn't that helpful here (other than to imply that Skymont doesn't have AVX10, but that's a different discussion)...

But if I were to speculate based on the assumption that there are in fact 10 math ports split between 6 integer ports and 4 vector ports, then I would assume that the unified math scheduler is no more and that they have split the schedulers up into a 6-port scheduler for integer operations and a 4-port scheduler for the vector operations, while keeping the individual load and store schedulers...

Which is starting to look a lot like that fake Zen 5 slide MLID claimed was real, ironically enough...
Except here it would be a single vector scheduler instead of the 2 vector schedulers in the slide, and split load and store schedulers instead of the unified AGU scheduler in the slide...
 

AMDK11

Senior member
Jul 15, 2019
350
246
116
The diagram most likely comes from Intel. You can clearly see 10 ports (4+6).

You can also see three blocks under AGU+SD. It looks to me like L1-D and L2 divided into 512KB + 2.5MB.

In the UOP cache decoding and sending location, I see 24 entries. GoldenCove has 12 items, including 6 decoders and 8 uop caches.

Skymont also appears to have fewer execution ports than Gracemont and Crestmont.
 

coercitiv

Diamond Member
Jan 24, 2014
6,458
13,119
136
For example, with my 14900K I have HT turned off. I can achieve 5.5/4.3 in a more stable manner and with lower temps than with HT on. While there are very few applications that will slam all cores with HT on, when it happens BIOS settings that are stable with HT off will cause a restart with HT on.

The tiny bit of performance lost by having HT off in those few applications is more than made up for by having a cooler, lower voltage, more stable rig with less chance of CPU degradation.
Enforce temperature limits, enforce power limits, enforce current limits. My argument is you can have your CPU work at lower clocks and have both more performance and better stability. If you want a more stable and better lasting system, lower your max temp from 100C to something like 85-95C. The same applies to power, find the power target that suits your config, this way clocks will push to the max under light loads, then pull back under MT loads, then pull even lower when SMT is getting yields.

It's very weird to me to see a system pushed to the limit of stability and then have SMT blamed for tipping it over the edge. What are you doing up there in the first place? :p
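
The power-limit argument can be sketched with a toy model. Everything numeric here is an assumption made for illustration, not a measurement: voltage is taken to scale linearly with clock (so power goes roughly with the cube of frequency), and the SMT throughput gain and extra power draw are made-up placeholder values.

```python
def clock_at_power_limit(power_limit: float, activity: float) -> float:
    """Clock (arbitrary units) at which power = activity * f**3 reaches the limit.

    Assumes P ~ C * V^2 * f with V proportional to f, i.e. P ~ f^3.
    """
    return (power_limit / activity) ** (1.0 / 3.0)


def throughput(power_limit: float, smt_yield: float, smt_power_factor: float) -> float:
    """Relative multithreaded throughput of the P-cores under a fixed power limit."""
    f = clock_at_power_limit(power_limit, smt_power_factor)
    return f * (1.0 + smt_yield)


POWER_LIMIT = 1.0        # normalised power budget (hypothetical)
SMT_YIELD = 0.25         # hypothetical: +25% throughput per core at equal clock
SMT_POWER_FACTOR = 1.15  # hypothetical: +15% power per core at equal clock

smt_off = throughput(POWER_LIMIT, smt_yield=0.0, smt_power_factor=1.0)
smt_on = throughput(POWER_LIMIT, SMT_YIELD, SMT_POWER_FACTOR)
smt_on_clock = clock_at_power_limit(POWER_LIMIT, SMT_POWER_FACTOR)

print(f"SMT off: {smt_off:.3f}  SMT on: {smt_on:.3f}")
print(f"Relative clock with SMT on: {smt_on_clock:.3f}")
```

In this model SMT comes out ahead under a fixed budget whenever (1 + yield) exceeds the cube root of the power factor, and it does so at a lower clock. It says nothing about stability at a given voltage, which is the other side of this discussion.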
 

Hulk

Diamond Member
Oct 9, 1999
4,385
2,270
136
Enforce temperature limits, enforce power limits, enforce current limits. My argument is you can have your CPU work at lower clocks and have both more performance and better stability. If you want a more stable and better lasting system, lower your max temp from 100C to something like 85-95C. The same applies to power, find the power target that suits your config, this way clocks will push to the max under light loads, then pull back under MT loads, then pull even lower when SMT is getting yields.

It's very weird to me to see a system pushed to the limit of stability and then have SMT blamed for tipping it over the edge. What are you doing up there in the first place? :p
My temps currently never go over 75C. I have tweaked quite a bit and this is where I ended up. I'm at a very safe voltage right now. Enabling HT would require more voltage even with other limits you specified in place otherwise the momentary voltage demand would reset or freeze the system. I learned about this over at Overclocking.net where most everybody turns off HT for best daily performance.

It comes down to the fact that it's simply easier to tune the P's without HT on and then rely on the 16E's for MT. Trying to do both gets messy quickly. Handbrake is one app that will "break" the tune quickly with HT on when you are undervolted and tuned for efficiency.
 

Henry swagger

Senior member
Feb 9, 2022
454
286
106
Raptor Cove is 40-45% in overall, where the Integer gap is closer and FP gap is large. It's something like 25% Int and 50-60% FP.

"Raptormont" gets 1-3% gain while Crestmont gets 4-6% gain. So a 30% gain with Skymont as did with Atom-based predecessors gets us to the "aiming for ADL" claim on Twitter. I think SKT will be able to reach Golden Cove similar to GMT reaching Skylake.

It means 10-15% faster than Golden Cove for Int while being 10-15% slower in FP. Consequently it means Sierra Glen(which is Crestmont without the 6-wide retire/allocate, or IOTW Gracemont) is similar to 10-15% faster per core than Skylake and is an excellent Cloud core.

If we extrapolate that to Darkmont-based Clearwater Forest, you essentially have an 18A 144-288 core better-than Golden Cove core chip.

On a side note, I speculate the possibility that they aren't backing down on clocks with Skymont hence the greater than expected core size while they are for Lion Cove.
Clock speed will be key
 

Hulk

Diamond Member
Oct 9, 1999
4,385
2,270
136

SemiAccurate thinks Qualcomm is cheating on their Snapdragon X Elite/Pro benchmarks. It's being compared to Celeron. What's happening?
The fact that they are only allowing certain benchmarks certainly lends credence to the possibility that they are not as confident as they would have us believe.
 

moinmoin

Diamond Member
Jun 1, 2017
4,998
7,791
136

trivik12

Senior member
Jan 26, 2006
321
288
136

Intel has confirmed that the upcoming 18A process Panther Lake CPU generation is on track for a mid-2025 release date.

As confirmed in the company's Q1 2024 Quarterly Results, Intel CEO Pat Gelsinger reaffirmed that the new upcoming 18A chipset line is being fabricated right now. That puts the processors on track to be coming out about a year from now.


Gelsinger said in the earnings call: "The Core Ultra platform delivers leadership AI performance today with our next-generation platforms launching later this year, Lunar Lake and Arrow Lake tripling our AI performance." Continuing with bold claims for its future tech: "In 2025 with Panther Lake, we will grow AI performance up to an additional 2x".

It's the culmination of what Team Blue has described as its 'Execution Engine' with five new process nodes in production over four years. It began with the Intel 7 process, codenamed Sapphire Rapids and Emerald Rapids (Xeon) to Intel 4's Meteor Lake (Ultra), and the upcoming Intel 20A Arrow Lake process.
 

mikk

Diamond Member
May 15, 2012
4,185
2,220
136
A mid-2025 release for Panther Lake would be surprising; this is the first time they have talked about a release, and usually there is at least 1 year between generations. Or maybe he is referring to Clearwater Forest and wasn't specific about it.
 

Philste

Member
Oct 13, 2023
139
280
96
A mid-2025 release for Panther Lake would be surprising; this is the first time they have talked about a release, and usually there is at least 1 year between generations. Or maybe he is referring to Clearwater Forest and wasn't specific about it.
Panther Lake as a 4+4+4 design and mobile-only sounds more like a Lunar Lake successor than an Arrow Lake successor. And Lunar Lake looks like it will arrive before Arrow Lake (probably a Computex launch with availability in September at the latest). For Arrow Lake I guess launch at Innovation at the end of September and availability at the end of October at best, more likely November.

So if you see it as Lunar Lake successor it might launch at Computex 2025. But after all the CPU side shouldn't be that special, both archs should be refreshes of Lion Cove and Skymont, like Redwood Cove and Crestmont are refreshes of Golden Cove and Gracemont. iGPU gets another bump tho, with 12 Xe³ Cores rumored, so 1536 Celestial ALUs.
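
The ALU figure follows from 128 ALUs per Xe core, which is what 1536 / 12 implies. A quick check, also applied to the Xe-core counts from the comparison table in the first post; the function name is illustrative, and using the same per-core figure for every generation is an assumption made here for comparison only:

```python
# 128 ALUs per Xe core, implied by the rumored 12-core / 1536-ALU figure.
ALUS_PER_XE_CORE = 1536 // 12


def alu_count(xe_cores: int) -> int:
    """Total ALUs for a given number of Xe cores, assuming 128 ALUs per core."""
    return xe_cores * ALUS_PER_XE_CORE


# Xe-core counts taken from the U-series comparison table in the first post.
for name, cores in [("Core Ultra 100U", 4), ("Lunar Lake", 8), ("Panther Lake", 12)]:
    print(name, alu_count(cores))
```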
 

mikk

Diamond Member
May 15, 2012
4,185
2,220
136
Panther Lake as a 4+4+4 design and mobile-only sounds more like a Lunar Lake successor than an Arrow Lake successor. And Lunar Lake looks like it will arrive before Arrow Lake (probably a Computex launch with availability in September at the latest). For Arrow Lake I guess launch at Innovation at the end of September and availability at the end of October at best, more likely November.

So if you see it as Lunar Lake successor it might launch at Computex 2025. But after all the CPU side shouldn't be that special, both archs should be refreshes of Lion Cove and Skymont, like Redwood Cove and Crestmont are refreshes of Golden Cove and Gracemont. iGPU gets another bump tho, with 12 Xe³ Cores rumored, so 1536 Celestial ALUs.

Panther Lake comes with 4+8+4 cores from what we know and it comes for the UPH segment. It also comes with 3 tiles unlike Lunar Lake with only 1 tile. They need something better for ARL-H with a faster NPU and the first tile design wasn't great.