Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
696
602
106
PPT1.jpg
PPT2.jpg
PPT3.jpg



With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.



Comparison of upcoming Intel's U-series CPU: Core Ultra 100U, Lunar Lake and Panther Lake

Model | Code-Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores
Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4
? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 8 MB | Arc | 8
? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12



Comparison of die size of Each Tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

 | Meteor Lake | Arrow Lake (20A) | Arrow Lake (N3B) | Lunar Lake | Panther Lake
Platform | Mobile H/U Only | Desktop Only | Desktop & Mobile H&HX | Mobile U Only | Mobile H
Process Node | Intel 4 | Intel 20A | TSMC N3B | TSMC N3B | Intel 18A
Date | Q4 2023 | Q1 2025 ? | Desktop: Q4 2024; H&HX: Q1 2025 | Q4 2024 | Q1 2026 ?
Full Die | 6P + 8E | 6P + 8E ? | 8P + 16E | 4P + 4E | 4P + 8E
LLC | 24 MB | 24 MB ? | 36 MB ? | 12 MB | ?
tCPU | 66.48 | | | |
tGPU | 44.45 | | | |
SoC | 96.77 | | | |
IOE | 44.45 | | | |
Total | 252.15 | | | |
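
As a quick sanity check on the tile figures in the table, the short Python sketch below sums the four listed tile sizes and works out each tile's share of the total. It assumes the numbers belong to the Meteor Lake column (the only column with tile data) and carries them through without units, since the table does not state any. The variable names are illustrative only.

```python
# Tile sizes as listed in the table (assumed to be the Meteor Lake column).
# The table gives no units, so none are assumed here.
tile_area = {"tCPU": 66.48, "tGPU": 44.45, "SoC": 96.77, "IOE": 44.45}

# Sum of the four tiles, rounded to the table's two decimal places.
total = round(sum(tile_area.values()), 2)

# Each tile's share of the total, in percent.
share = {name: round(100 * area / total, 1) for name, area in tile_area.items()}

print("Total:", total)
for name, pct in share.items():
    print(f"{name}: {pct}%")
```

The four values add up to the 252.15 total shown in the table, and the SoC tile is the largest single contributor.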

LNL-MX.png

Intel Core Ultra 100 - Meteor Lake

INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg

As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)



Clockspeed.png
 

Attachments: PantherLake.png, LNL.png
Last edited:

DavidC1

Senior member
Dec 29, 2023
940
1,473
96
Interesting how power hungry the "LPE" cores are.
img_2693-jpeg.99480

My guess is on Meteorlake, rather than just powering up the SoC tile, it's also powering up the compute tile, whereas if you have P or E core active you only have the compute tile active.

You can see the E core uses less power per performance than "Low Power" E cores. Really sucks how it is. Maybe that's a reason why in addition to LPE core being low performance the power benefits are situational. The thread director "sees" that it's not worth moving to the LPE core.

If they can address this, lower the power further, and improve perf/W on the low end by 3x like it says on the presentation, it'll be a huge improvement.
 

H433x0n

Golden Member
Mar 15, 2023
1,209
1,572
96
Interesting how power hungry the "LPE" cores are.
img_2693-jpeg.99480

My guess is on Meteorlake, rather than just powering up the SoC tile, it's also powering up the compute tile, whereas if you have P or E core active you only have the compute tile active.

You can see the E core uses less power per performance than "Low Power" E cores. Really sucks how it is. Maybe that's a reason why in addition to LPE core being low performance the power benefits are situational. The thread director "sees" that it's not worth moving to the LPE core.

If they can address this, lower the power further, and improve perf/W on the low end by 3x like it says on the presentation, it'll be a huge improvement.
Looking at that graph ARL-U (which is just MTL refresh with RWC+ on Intel 3) should do pretty well as a low range option with the Intel 3 perf/watt improvement.
 

TwistedAndy

Member
May 23, 2024
159
150
76
You can see the E core uses less power per performance than "Low Power" E cores. Really sucks how it is. Maybe that's a reason why in addition to LPE core being low performance the power benefits are situational. The thread director "sees" that it's not worth moving to the LPE core.

The two LP E-cores in Meteor Lake suffered from the missing L3 cache and high LPDDR5 latencies. They could not handle even some basic tasks on their own. As a result, the CPU tile was used much more frequently than it should have been.

In Lunar Lake, Intel addresses this issue by using the side cache (SLC) and increasing the number of cores in the LP island. It will probably help.
 

DavidC1

Senior member
Dec 29, 2023
940
1,473
96
Looking at that graph ARL-U (which is just MTL refresh with RWC+ on Intel 3) should do pretty well as a low range option with the Intel 3 perf/watt improvement.
Yes, but if they don't improve the behavior of the LPE, then it would still be a so-so product.
The two LP E-cores in Meteor Lake suffered from the missing L3 cache and high LPDDR5 latencies. They could not handle even some basic tasks on their own. As a result, the CPU tile was used much more frequently than it should have been.
This is much more than that. The LPE core should, at most, use the same power as the E core cluster, not MORE. The E core behaves properly.

This is due to an execution problem that brought delays to the product. They clearly missed their target.
 

TwistedAndy

Member
May 23, 2024
159
150
76
This is much more than that. The LPE core should, at most, use the same power as the E core cluster, not MORE. The E core behaves properly.

This is due to an execution problem that brought delays to the product. They clearly missed their target.

Yes, Intel clearly missed its target with LP E-cores in Meteor Lake.

On the other hand, it makes sense to consider Meteor Lake as a development platform that was pushed into some real products.
 

DavidC1

Senior member
Dec 29, 2023
940
1,473
96
Why Skymont is a vastly more sensible and superior design:

-In an effort to save transistors (power and area), the decoders are weaker than in the P cores. This results in relying more on microcode, which significantly reduces decoder throughput.
-Since Tremont, they moved to a Clustered Decode setup, which allows the second cluster to execute even if another is blocked by microcode.
-Skymont duplicates common microcode instructions across all 3 clusters so it can continue to execute across all. They call it "Nanocode".
-The chief architect of Skymont also says that 3-wide decode is easier to fill than 4.
-From C&C: From our testing, Tremont’s decoder starts behaving like a 3-wide one after around 128 to 160 instructions without a taken branch. Its throughput peaks with 3 to 64 instructions in the loop. Average applications have 5% to 20% branches, and about half of those are taken so Tremont’s lack of automatic load balancing between the clusters shouldn’t be a big issue.
-Gracemont eliminates the need to use taken branches to take full advantage of the new decoders. It achieves full output, limited only by the backend:
Gracemont improves load balancing between the two decode clusters by automatically switching, instead of relying on taken branches.
On the surface, the P core approach seems better. However, the E core approach is much more sensible. The P core approach is Bigger, More, Badder, which is contrary to what is needed in modern designs that are power limited.

3-wide being easier to fill means better utilization, and there are enough taken branches in x86 code to take full advantage of the clusters, and Gracemont and beyond eliminate the corner case where no branch is taken for many instructions. Clustered decode is also significantly more compact in terms of transistors. This is a careful balance, unlike the P cores, which are big, bigger, and BIGGER!
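
The C&C figures quoted above can be turned into a back-of-envelope check. The sketch below assumes a simple ratio model (one taken branch every 1 / (branch fraction x taken fraction) instructions on average), which is my simplification and not something C&C or Intel states. The function name and the constant are illustrative.

```python
def mean_run_between_taken_branches(branch_fraction, taken_fraction=0.5):
    """Average instructions per taken branch under a simple ratio model."""
    return 1.0 / (branch_fraction * taken_fraction)

# C&C's quoted range: 5% to 20% of instructions are branches, about half taken.
sparse = mean_run_between_taken_branches(0.05)  # branch-light code
dense = mean_run_between_taken_branches(0.20)   # branch-heavy code

# Lower bound of the quoted 128 to 160 instruction window after which
# Tremont's decoder starts behaving like a 3-wide one.
TREMONT_FALLBACK_MIN = 128

print(f"branch-light code: one taken branch every ~{sparse:.0f} instructions")
print(f"branch-heavy code: one taken branch every ~{dense:.0f} instructions")
print("average run shorter than the fallback window:", sparse < TREMONT_FALLBACK_MIN)
```

Under that model the average run between taken branches is roughly 10 to 40 instructions, well short of the 128-instruction window, which is consistent with the point that the missing load balancing only hurts in corner cases such as long unrolled loops.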

-Uop cache vs no Uop cache: Pentium 4's problem was that it relied nearly entirely on the Trace Cache for decoder throughput. The TC was able to output the normal 3-wide, but the uarch had only 1 decoder. While the uop cache has a much higher hit rate and is much more efficient, there's still a 2-pipeline penalty for missing. And the hit rate is a lot lower at about 60%. The E cores go for a straight-up extension of decoders.

-Many dedicated and not shared ports: Sharing has benefits, but the boundary where it shares the bandwidth drops to zero. You also need to add algorithms so they can effectively share the data.
-Trying new ideas, instead of just sticking with the old: Ultra wide retire, and more store AGUs. The changes that took 5 generations for the P cores are now being seen nearly every generation.
-Doubled 128-bit FP versus new wider instructions: Doubling the number of units, as in Skymont, benefits ALL existing code, unlike AVX, AVX2, and AVX512, which all needed recompiling. That is why Skymont is 68% faster in FP. Never mind the flip-flopping and endless fighting over whether AVX512 is needed or not. Skymont's FP is 2x as capable in literally all code written since 128-bit vectors were introduced in 2006 with Core 2!
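
A toy model of the "more units versus wider units" argument, for illustration only. The unit counts and widths below are hypothetical placeholders, not confirmed figures for any core, and the model ignores everything except peak issue width (no scheduling, memory, or frequency effects).

```python
def peak_bits_per_cycle(units, unit_width_bits, code_vector_bits):
    """Peak vector bits processed per cycle for code compiled at a given width.

    Assumes each unit executes one vector instruction per cycle, and that an
    instruction uses at most the width the code was compiled for.
    """
    return units * min(unit_width_bits, code_vector_bits)

LEGACY = 128  # existing code compiled for 128-bit vectors

# Hypothetical baseline: two 128-bit units running legacy code.
baseline = peak_bits_per_cycle(2, 128, LEGACY)
# Option A: double the number of 128-bit units, same legacy code.
more_units = peak_bits_per_cycle(4, 128, LEGACY)
# Option B: double the unit width, same legacy code (not recompiled).
wider = peak_bits_per_cycle(2, 256, LEGACY)
# Option B again, but with code recompiled for 256-bit vectors.
wider_recompiled = peak_bits_per_cycle(2, 256, 256)

print(baseline, more_units, wider, wider_recompiled)
```

In this simplified model, doubling the unit count doubles peak throughput for unmodified 128-bit code, while doubling the width only pays off once the code is recompiled for the wider vectors.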

10 years of consistent execution, while the P core team was flopping on its face more than a seal on land flops on its belly. It should offer significant advantages iso-node over existing designs, even Zen 5c. Can't wait to see what the future holds for them. Bring on Arctic Wolf. Another 30% in two years will render all arguments for the current P core design moot.
 
  • Like
Reactions: Hulk

DavidC1

Senior member
Dec 29, 2023
940
1,473
96
Yes, Intel clearly missed its target with LP E-cores in Meteor Lake.

On the other hand, it makes sense to consider Meteor Lake as a development platform that was pushed into some real products.
No it's not. Not at all. It was a delayed design hit by Intel's 7nm problems. I don't know why it's so hard for some people to grasp. You have done projects in school, right? If you are not lazy, start from day 1, and carefully plan it out, you are neither late nor have problems. Those who wait too long or don't execute aren't only late; the project also sucks.

This is why Meteorlake was mediocre. If you hear of delays on a future project, treat it with suspicion, since it might disappoint. NV30 was very late too. They miss targets, so the product is delayed while they try to meet the original goals. More often than not they don't.

Knights Landing Xeon Phi went from being a late-2015 chip with a 200W TDP and 3.3 TFLOPS DP FP to a mid-2016 chip with a 230W TDP and 3 TFLOPS DP FP. The original target was about 15% less power, 10% faster, and 9 months earlier. KNL was a victim of the 14nm delay.
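
The percentages in the Knights Landing comparison can be reproduced from the TDP and DP throughput figures given in the post. The sketch below takes those figures at face value (they are the poster's numbers, not verified here) and also derives the resulting perf/W change, which the post does not state explicitly.

```python
# Figures as quoted in the post: planned versus shipped Knights Landing.
planned = {"tdp_w": 200, "dp_tflops": 3.3}
shipped = {"tdp_w": 230, "dp_tflops": 3.0}

# Relative changes going from the planned part to the shipped part.
power_increase = shipped["tdp_w"] / planned["tdp_w"] - 1
perf_drop = 1 - shipped["dp_tflops"] / planned["dp_tflops"]

# Perf/W of the shipped part relative to the planned part.
eff_ratio = (shipped["dp_tflops"] / shipped["tdp_w"]) / (
    planned["dp_tflops"] / planned["tdp_w"]
)

print(f"power: +{power_increase:.0%}")
print(f"DP throughput: -{perf_drop:.0%}")
print(f"perf/W: {eff_ratio:.0%} of the planned figure")
```

On those inputs the shipped part draws 15% more power for about 9% less throughput, which works out to roughly a fifth lower perf/W than the original target.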

Meteorlake was an Alderlake replacement. Raptorlake was not supposed to exist. Think about that. Imagine a better product than the current Meteorlake, with more being offloaded to lower-power LPE cores, but at the end of 2022. This would have demolished the competition.
 
  • Like
Reactions: Nothingness

TwistedAndy

Member
May 23, 2024
159
150
76
I agree that Meteor Lake was initially planned as a replacement for Alder Lake on the newer node with a tiled approach and many other features. To make this transition safer, Intel planned to use the existing P-core architecture and slightly update the E-cores. It is reminiscent of their Tick-Tock model.

But still, there are a lot of new features in Meteor Lake, including a tiled approach, a separate memory controller and media engine, a new NPU, Thread Director, and other stuff. Intel is still polishing some software and drivers for Meteor Lake.

I hope that all the lessons have been learned and that we will see some interesting stuff in Lunar Lake and Arrow Lake ;)

As for Skymont, I have written a lot about it before and it looks really impressive on paper.

Lunarlake looks like it might crush it.
And it will. Qualcomm X Elite is also a test platform. And it looks pretty promising.
 

DavidC1

Senior member
Dec 29, 2023
940
1,473
96
2) Any chips that get 20A would have an entirely new transistor (RibbonFET with faster transistor switching) and PowerVia (enabling higher frequencies, lower resistance, and lower capacitance).
With just a section of one market said to be using 20A, I don't think you'll see any advantages for the 20A chip. Think in terms of Cannonlake, where it was used to bring the real 10nm node. In this case it's used to bring up 18A.

You need significant design and process cooperation nowadays so a simple port no longer brings big advantages. Things like PowerVia even more so because now the circuit design needs to change.
 

SiliconFly

Golden Member
Mar 10, 2023
1,541
897
96
Looking at that graph ARL-U (which is just MTL refresh with RWC+ on Intel 3) should do pretty well as a low range option with the Intel 3 perf/watt improvement.
Omg! RWC again? Yikes! Intel will never learn.

... Meteorlake, with more being offloaded to lower-power LPE cores, but at the end of 2022. This would have demolished the competition.
Meteor Lake with its RWC cores wouldn't have demolished any competition. The only thing it's capable of doing is demolishing Intel.
 
Last edited:

DavidC1

Senior member
Dec 29, 2023
940
1,473
96
But still, there are a lot of new features in Meteor Lake, including a tiled approach, a separate memory controller and media engine, a new NPU, Thread Director, and other stuff. Intel is still polishing some software and drivers for Meteor Lake.

And it will. Qualcomm X Elite is also a test platform. And it looks pretty promising.
It doesn't matter. Most, if not all, of it was planned for late 2022. The 7nm delay was 6+ months; that's why it became 2023 and we got Raptorlake.

Saying it's a "test platform" is an excuse. We don't care, neither do other consumers. Just say as it is. Meteorlake is mediocre and so is X Elite.
Meteor Lake with its RWC cores wouldn't have demolished any competition. The only thing it's capable of doing is demolishing Intel.
What the heck are you talking about? Did you even read what I said?

Page 31

They said they worked to eliminate the "2-3% disaggregation tax". They clearly missed their goals. There's no reason that they can't do better. It's the low-level details they don't tell us.
 
Last edited:

TwistedAndy

Member
May 23, 2024
159
150
76
Saying it's a "test platform" is an excuse. We don't care, neither do other consumers. Just say as it is. Meteorlake is mediocre and so is X Elite.

Meteor Lake is mediocre in terms of CPU performance but offers a noticeably better battery life and has a more powerful GPU. It's OK for regular customers.

As for Snapdragon X Elite, it has a lot of software and driver issues. I don't see any reason to buy it now, but in a year or two it might be a pretty decent option.
 

Henry swagger

Senior member
Feb 9, 2022
504
306
106
Meteor Lake is mediocre in terms of CPU performance but offers a noticeably better battery life and has a more powerful GPU. It's OK for regular customers.

As for Snapdragon X Elite, it has a lot of software and driver issues. I don't see any reason to buy it now, but in a year or two it might be a pretty decent option.
Meteor Lake will still outsell the Elite and AMD. Even Apple is struggling with sales. Brand power is way more important than performance.
 
  • Like
Reactions: pcp7

ondma

Diamond Member
Mar 18, 2018
3,005
1,528
136
Looks like July 15 for the Ryzen 300 Series laptop chips and either preorder or actual availability (unclear) of July 31 for the Ryzen 9000s. Since this is a discussion about Arrow Lake, which is best compared to the Ryzen 9000 line, I used July 31st. https://videocardz.com/newz/amd-ryz...00-sales-start-july-31-according-to-retailers

I don't follow how rumor of one complaint of Lunar Lake availability has much to do with Arrow Lake's release date and/or availability.

What is Mountain Lake?

Lunar Lake and Arrow Lake are totally different market segments. And Lunar Lake is launching sooner than Arrow Lake. So, why does focus on the sooner launching ultralight notebook Lunar Lake have anything to say about higher powered Arrow Lake desktop and higher power mobile chips?

If I follow your logic, that means if Ford promotes its upcoming redesigned Mustang released during the summer, then an F-150 released later in the fall is both bad and delayed until winter?
Obviously, I had a brain fart and meant Meteor Lake, not Mountain Lake, although maybe Intel needs a new lake of some sort. Otherwise I still stand by what I have said. As far as Meteor Lake vs LL vs ARL, it just goes to show the pattern of Intel's execution. Meteor Lake was late, didn't make it to the desktop, was equal to or a regression in performance, and failed to live up to the expectations of greatly improved power usage. Lunar Lake "looks" promising in mobile, I will admit.

However, I don't see how anyone can really expect ARL to be more than just "ok". Coming out after Zen 5 and maybe Zen 5 X3D, it certainly needs to be better than that. They apparently gave up hyperthreading, supposedly to get more ST gains, but Lion Cove, based on earlier leaks and LL IPC improvements, doesn't show any more (actually probably less) IPC improvement than other recent new Intel releases. And the new "foundry competitor" is using TSMC nodes for LL and the most performant ARL.

As I said, ARL would probably have been great if it had come out instead of RL-R. It would probably have given Intel the lead for at least a few months. But coming out near the end of the year, it probably will make Intel competitive at best until Zen 6. And what does Intel have to counter? Probably an ARL-R.
 

ondma

Diamond Member
Mar 18, 2018
3,005
1,528
136
There is attention for performance reasons. But performance wasn't the discussion point. See this quote below and notice how it was NOT about performance but about release dates. I still have yet to see why the release date of one product and availability of that product are reliant on an unrelated product.

As for desktop Arrow Lake CPUs in October, that is exactly what I said earlier. https://forums.anandtech.com/thread...akes-discussion-threads.2606448/post-41232851
You are either mis-interpreting or deliberately obfuscating the point of my post. It is most certainly about both performance and release dates. As I keep saying, ARL probably would have been a very good product if it had been released instead of Raptor Lake refresh. It might have been a modest improvement, but would have given Intel a better performing product than AMD for at least a few months, and would have mitigated at least some of their efficiency disadvantage. But coming after Zen 5, Intel needs a home run, and I think anyone who still expects that at this point is dreaming.
 

ondma

Diamond Member
Mar 18, 2018
3,005
1,528
136
MTL may sell. But it's still a mediocre product.

Well, I'm now thoroughly convinced anything with RWC is mediocre (including Granite Rapids).

LNC+SKT is their real future.
Skymont seems like a great product, the kind of improvement Intel needed in their P cores. But how much improved is Lion Cove, actually, over RWC for desktop use? I don't really care that much about the E cores for desktop. Just give me a big jump in P core performance and efficiency.
 
  • Like
Reactions: Elfear

dullard

Elite Member
May 21, 2001
25,511
4,008
126
You are either mis-interpreting or deliberately obfuscating the point of my post. It is most certainly about both performance and release dates.
I have been speaking about your specific post here:
Problem is that ARL is late to the party. What reason will there be to buy it if AMD has a comparable or better performing chip out 6 months earlier?
ARL is not 6 months after Zen 5. I am simply trying to correct that fallacy. If I misinterpreted or obfuscated "6 months earlier", please help correct me.

If you wish to have a discussion about performance, go ahead. Last we left it, I used your claim of Zen 5 having comparable or better performance than ARL and went with that. Do you have anything concrete to add to your original performance estimate? We will find out in a few months which of those claims is true. But that has nothing to do with your timeline post that we have been discussing for days. A post which was based on ARL being 6 months after Zen 5.

If you want to discuss performance and do not have anything performance-wise to add (especially unlikely given that you have no idea what you are talking about with respect to hyperthreading), how about we go back to your first post on this. What happens to Intel if:
1) Arrow Lake is bad compared to Zen 5
2) Arrow Lake is comparable
3) Arrow Lake beats Zen 5
 
Last edited:

SiliconFly

Golden Member
Mar 10, 2023
1,541
897
96
I have been speaking about your specific post here:

ARL is not 6 months after Zen 5. I am simply trying to correct that fallacy. If I misinterpreted or obfuscated "6 months earlier", please help correct me.

If you wish to have a discussion about performance, go ahead. Last we left it, I used your claim of Zen 5 having comparable or better performance than ARL and went with that. Do you have anything concrete to add to your original performance estimate? We will find out in a few months which of those claims is true. But that has nothing to do with your timeline post that we have been discussing for days. A post which was based on ARL being 6 months after Zen 5.

If you want to discuss performance and do not have anything performance-wise to add (especially unlikely given that you have no idea what you are talking about with respect to hyperthreading), how about we go back to your first post on this. What happens to Intel if:
1) Arrow Lake is bad compared to Zen 5
2) Arrow Lake is comparable
3) Arrow Lake beats Zen 5
My two cents. ARL and Zen5 are separated by around 2.5 months at best. Not 6.
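
As a rough check on that gap, the sketch below counts the days from the July 31 Ryzen 9000 date mentioned earlier in the thread to a few possible October launch days. The year (2024) and the specific October days are assumptions for illustration; the thread only says "October" for desktop Arrow Lake.

```python
from datetime import date

# July 31 per the retailer report linked earlier in the thread; year assumed.
ZEN5_DESKTOP = date(2024, 7, 31)

def gap_days(october_day):
    """Days from the Zen 5 desktop date to a hypothetical October launch day."""
    return (date(2024, 10, october_day) - ZEN5_DESKTOP).days

# Early, mid, and late October as illustrative launch days.
for day in (1, 15, 31):
    days = gap_days(day)
    print(f"Oct {day:2d}: {days} days, about {days / 30.44:.1f} months")
```

Any day in October gives a gap of about two to three months, nowhere near six.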

The top ARL 8P+16E part is going to take the MT crown over the top Zen5 X3D part thanks to so many performant E cores. ST is yet to be seen. But I think it's gonna be very close.

(i.e., Skymonts in ARL are gonna be faster than the Skymonts in LNL).

Skymont seems like a great product, the kind of improvement Intel needed in their P cores. But how much improved is Lion Cove, actually, over RWC for desktop use? I don't really care that much about the E cores for desktop. Just give me a big jump in P core performance and efficiency.
Like I mentioned once before, it appears the P core team has reached an evolutionary dead end. Now that they have ported the design to industry standard tools and made it more modular and node agnostic, they need to slice and dice it and come up with something awesome. Otherwise, the entire team needs to be dissolved. They're very close to becoming irrelevant if they don't have anything good to show real soon.

Intel P core team sucks! (as of now)
 
Last edited:
  • Like
Reactions: reb0rn

TwistedAndy

Member
May 23, 2024
159
150
76
Like I mentioned once before, it appears the P core team has reached an evolutionary dead end. Now that they have ported the design to industry standard tools and made it more modular and node agnostic, they need to slice and dice it and come up with something awesome. Otherwise, the entire team needs to be dissolved. They're very close to becoming irrelevant if they don't have anything good to show real soon.
Yes, but things are more complicated here.

It's a classic situation for big projects (P-core/Cove architecture). Usually, they have some kind of a mix of ancient legacy stuff and some new parts. Once the project grows, it becomes harder and harder to make some changes and implement new features. At a certain point, you have to do some refactoring, remove some old legacy structures, make it more flexible, modular, decoupled, etc., just to be able to move forward.

I think that's the reason why Intel decided to clean up the architecture in Lion Cove and switch from "Sea of Fubs" to "Sea of Cells". Usually, the project's success depends on how successful you are at dealing with the technical debt and refactoring.

At the same time, Skymont is a far more promising architecture just because it's smaller and much more innovative.
 

SiliconFly

Golden Member
Mar 10, 2023
1,541
897
96
...dealing with the technical debt and refactoring.
My point too.

Once they came out with RWC, the whole world fell in love with the P core team for their awesomeness. And now that they're about to deliver a massive IPC uplift with LNC, there's this overwhelming feeling of admiration for the P core team! Wonder what incredible things they're gonna accomplish next.
 
  • Like
Reactions: TwistedAndy

jur

Member
Nov 23, 2016
30
9
81
The Skymont design definitely seems very modern and sophisticated, but the question is: "can it achieve the same IPC, vector throughput, frequency... as the P-core without having the P-core's size (and power)?". It seems that Intel's P-cores dedicate a lot of die space to lower-latency instructions and vector units. Also, the large die size of the P-core makes it less power dense, which enables it to achieve higher frequency and power.
 

SiliconFly

Golden Member
Mar 10, 2023
1,541
897
96
The Skymont design definitely seems very modern and sophisticated, but the question is: "can it achieve the same IPC, vector throughput, frequency... as the P-core without having the P-core's size (and power)?". It seems that Intel's P-cores dedicate a lot of die space to lower-latency instructions and vector units. Also, the large die size of the P-core makes it less power dense, which enables it to achieve higher frequency and power.
Well, it's been almost 3 years since Golden Cove and the P core team hasn't come out with anything worthwhile to date. At this point, Pat can't even blame his predecessors for this mess. This one definitely belongs to him.
 
Last edited: