Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
846
799
106
Wildcat Lake (WCL) Preliminary Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing ADL-N. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing CPU, GPU and NPU, fabbed on the Intel 18A process. Last time I checked, the PCD tile is fabbed on TSMC's N6 process. They are connected through UCIe rather than Intel's own D2D link, a first for Intel. Expecting a launch around Q2/Computex 2026. In case people don't remember Alder Lake-N, I have created a table below comparing the detailed specs of ADL-N and WCL. Just for fun, I am throwing in LNL and the upcoming Mediatek D9500 SoC.

| | Intel Alder Lake-N | Intel Wildcat Lake | Intel Lunar Lake | Mediatek D9500 |
|---|---|---|---|---|
| Launch Date | Q1-2023 | Q2-2026 ? | Q3-2024 | Q3-2025 |
| Model | Intel N300 | ? | Core Ultra 7 268V | Dimensity 9500 5G |
| Dies | 2 | 2 | 2 | 1 |
| Node | Intel 7 + ? | Intel 18A + TSMC N6 | TSMC N3B + N6 | TSMC N3P |
| CPU | 8 E-cores | 2 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores | C1 1+3+4 |
| Threads | 8 | 6 | 8 | 8 |
| CPU Max Clock | 3.8 GHz | ? | 5 GHz | |
| L3 Cache | 6 MB | ? | 12 MB | |
| TDP | 7 W | Fanless ? | 17 W | Fanless |
| Memory | 64-bit LPDDR5-4800 | 64-bit LPDDR5-6800 ? | 128-bit LPDDR5X-8533 | 64-bit LPDDR5X-10667 |
| Size | 16 GB | ? | 32 GB | 24 GB ? |
| Bandwidth | | ~55 GB/s | 136 GB/s | 85.6 GB/s |
| GPU | UHD Graphics | ? | Arc 140V | G1 Ultra |
| EU / Xe | 32 EU | 2 Xe | 8 Xe | 12 |
| GPU Max Clock | 1.25 GHz | ? | 2 GHz | |
| NPU | NA | 18 TOPS | 48 TOPS | 100 TOPS ? |
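The bandwidth row is just transfer rate times bus width, so it's easy to sanity-check. A quick sketch (assuming the ~55 GB/s figure belongs to WCL's rumored LPDDR5-6800 config; all rates here are the table's guesses, not confirmed specs):

```python
# Peak DRAM bandwidth: transfer rate (MT/s) x bus width in bytes.
def peak_bw_gbs(mt_per_s: int, bus_bits: int) -> float:
    return mt_per_s * (bus_bits // 8) / 1000  # GB/s

wcl = peak_bw_gbs(6800, 64)     # ~54.4, i.e. the "~55 GB/s" above
lnl = peak_bw_gbs(8533, 128)    # ~136.5
d9500 = peak_bw_gbs(10667, 64)  # ~85.3 (marketed as 85.6)
```

The LNL and D9500 numbers line up with the table, which is a decent sign the WCL memory guess is self-consistent.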









As Hot Chips 34 starts this week, Intel will unveil technical information about the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap tells us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



 

Attachments

  • PantherLake.png
  • LNL.png
  • INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg
  • Clockspeed.png
Last edited:

DavidC1

Golden Member
Dec 29, 2023
1,833
2,960
96
I wonder if Intel considered something like 2+10 for Lunar Lake.
2+10 sounds good but that means creating another configuration as Skymont is only quad cluster. So... 2+8.
What are the advantages and disadvantages of Skymont's 3x3 decoder vs. Lion Cove's 8 simple decoders?
The x86 variable instruction length makes it hugely advantageous to go with the cluster decode configuration.

Here's the page talking about variable length decode:
Whatever advantage a full decoder has, it's not worth it past a certain width, especially when you consider that wide decode mostly raises peak throughput, which already diminishes whatever perceived advantage there might be. Back in the P3, P4, and Core days decoders were narrow. Now we're talking double and triple that width, which dramatically increases the complexity.

Here's a great thread talking about Skymont's clustered decode: https://news.ycombinator.com/item?id=40711835

And because it's a brand new idea, you have potentially other opportunities to further improve it, compared to traditional superscalar approach which existed for 30+ years. Kinda hard to see it as anything other than a win-win.
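The throughput side of that argument can be sketched in a few lines. This is a toy model only (not a pipeline simulator, and the cycle counts ignore the serial boundary-finding that makes wide x86 decoders expensive in hardware): several narrow clusters starting at already-known boundaries, like branch targets, can match one wide decoder's throughput.

```python
# Toy decode-throughput model. x86 instructions have variable length, so a
# single wide decoder must resolve boundaries as a serial chain and its
# complexity grows quickly with width. Clustered decode uses several narrow
# decoders that each start at a known boundary (e.g. a branch target).

def decode_serial(instr_lengths, width):
    """Cycles for one decoder consuming up to `width` instructions/cycle."""
    cycles, i = 0, 0
    while i < len(instr_lengths):
        i += width
        cycles += 1
    return cycles

def decode_clustered(streams, width):
    """Cycles when each stream (split at branch targets) feeds its own
    `width`-wide cluster, all running in parallel."""
    return max(decode_serial(s, width) for s in streams)

# 24 instructions: one 8-wide decoder and three 3-wide clusters both finish
# in 3 cycles, but each narrow cluster is far cheaper to build.
insts = [1] * 24
wide = decode_serial(insts, 8)                  # 3 cycles
clustered = decode_clustered([[1] * 8] * 3, 3)  # 3 cycles
```

The catch, of course, is that the clustered scheme only wins when the front end can find enough entry points (taken branches) to keep all clusters fed.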
 
Last edited:

coercitiv

Diamond Member
Jan 24, 2014
7,355
17,424
136
I wonder if Intel considered something like 2+10 for Lunar Lake. Given the huge increase in IPC for Skymont coupled with the fact that the clockspeed advantage of Lion Cove is quite small in this segment, 2+10 seems like it could have been a nice configuration and probably even a bit less area than 4+4.
Like @gdansk already wrote, given their previous config they sure did consider it. However, 4P config is much more suitable for the kind of workloads this chip will see, from browser engines to low power gaming. And again, the IPC delta between Skymont and Lion Cove does not tell you the full story:
  • Skymont in LNL runs at much lower clocks than Lion, and would also have trouble scaling without access to a fast L3
  • P and E cores no longer share the ring bus, so it's a good idea to have enough of them on each side to tackle most common workloads
The reality of the situation is we would not be having this conversation if LC was more efficient and a bit faster. I would argue 4P is a very good foundation for the 12-17W envelope. From here Intel could add 2-4 more E cores to keep scaling the design within the same TDP range (assuming the quad cluster is a limiting factor then it would be 4), or alternatively add +2P and maybe another +4E for higher TDP designs. Obviously it ain't happening this way, as we have no direct iteration from LNL, so we'll need to see how they follow up.

Personally I'm getting increasingly convinced the high-performance NPUs are wasted in all the CPUs of this gen (in the Windows ecosystem). I think we would have been fine with 12-20 TOPS as hardware support for the first generation of software. By the time we get proper AI feature implementation on Windows, the current "high performance" NPUs will likely be obsolete in terms of performance or hardware support. In the case of LNL we could have gotten another 4E cluster if the NPU was half the size, and that would have improved performance scaling up to 28W.
 

DavidC1

Golden Member
Dec 29, 2023
1,833
2,960
96
Like @gdansk already wrote, given their previous config they sure did consider it. However, 4P config is much more suitable for the kind of workloads this chip will see, from browser engines to low power gaming. And again, the IPC delta between Skymont and Lion Cove does not tell you the full story:
  • Skymont in LNL runs at much lower clocks than Lion, and would also have trouble scaling without access to a fast L3
  • P and E cores no longer share the ring bus, so it's a good idea to have enough of them on each side to tackle most common workloads
The lack of a fast L3 cache hurts Skymont in LNL quite a bit, especially in FP, where Crestmont is quite weak. It seems it loses almost all of the gains from the doubled FP units.

We wondered exactly how it would perform, considering it's an improvement over the LP E-cores in Meteor Lake, but the SLC is quite slow, so the LPE moniker is appropriate in Lunar Lake.
In the case of LNL we could have gotten another 4E cluster if the NPU was half the size, and that would have improved performance scaling up to 28W.
Yea. It's really only useful if you have a discrete GPU, like Nvidia's top-end parts. What does it matter if you go from 0.01x of the performance to 0.05x?

They could have doubled the SLC cache to 16MB too, reduced power further and improved performance. Or 12 Xe2 cores. Heck, 2 extra Lion Cove cores! Anything really other than a honking NPU.
 

poke01

Diamond Member
Mar 8, 2022
4,202
5,551
106
Anything really other than a honking NPU.
The Gen4 NPU is too big; the Gen5 in PTL should be much smaller.

IMO, NPUs right now are useless in the Windows ecosystem. Especially AMD's, which is half-baked in terms of support.
 
Last edited:

511

Diamond Member
Jul 12, 2024
4,523
4,144
106
My point is: what is the point of nearly 6x the NPU performance on Arrow Lake-H? On LNL the NPU is 48 TOPS vs the iGPU's 67 TOPS, only about 40% more for the iGPU.
 

Attachments

  • 2024-10-10_2-03-31-1456x819.png

moinmoin

Diamond Member
Jun 1, 2017
5,242
8,456
136
The design company (Centaur Tech) architecting their cores was a relatively small team.
That reminds me, Intel bought that design staff from VIA for $125M back in 2021. Anybody happen to know what happened to them? I guess they strengthened the Atom/e core team in Austin?
 

511

Diamond Member
Jul 12, 2024
4,523
4,144
106
The more time you save by writing down stream of consciousness instead of purposely structured sentences, the less people will care to read your replies.
I can't see the die space wasted that should have been better utilized, or removed to make it cheaper, but yes, I will keep it in mind 😅

Thanks Microsoft for another useless feature

They should have just slashed the NPU by 1/3 and made it cheaper for us to buy. I am pretty sure people would have bought it.
 

cannedlake240

Senior member
Jul 4, 2024
247
138
76
I can't see the die space wasted that should have been better utilized, or removed to make it cheaper, but yes, I will keep it in mind 😅

Thanks Microsoft for another useless feature

They should have just slashed the NPU by 1/3 and made it cheaper for us to buy. I am pretty sure people would have bought it.
Intel's NPUs are too big. Look at Apple: it's 5-7 mm² at most on the A18. LNL's NPU4 isn't much smaller than the entire 4C CPU cluster. How is Apple so good at designing all these IPs, man... This is why Intel didn't stand a chance in the mobile market.
 

coercitiv

Diamond Member
Jan 24, 2014
7,355
17,424
136
I can't see the Die Space wasted that should have been better utilized or removed to make it cheaper
Here's a simple comparison based on the LNL floor plan. Two of those NPU NCEs would almost cover an entire Skymont cluster.

1728984110117.png

Today, from a consumer point of view, the NPU just adds to the weight of the chip. MS failed to deliver anything of substance this year, and by the time they do come up with cool features we'll probably have a new generation of chips anyway. Today's chips need to have NPUs so that devs can count on the functionality, but whether they need this much NPU is debatable in my opinion (as a simple consumer, the one who's supposed to pay for this as a product). There's a high chance this hardware will become obsolete before the really good AI features come online.

I'll stop here, as this was supposed to be just an observation on @Hulk's commentary about P and E core configs in Lunar Lake... and it's borderline a rant now.
 

poke01

Diamond Member
Mar 8, 2022
4,202
5,551
106
How is apple so good at designing all these IPs man
Apple is more of a hardware company than a software company; their hardware teams are among the best in the industry. Their software is okay but not industry-leading. It comes down to planning, setting achievable goals, and iterating on IP every year till you've mastered it.

You also have to hire the right people and manage them properly. Apple ain't perfect either; they made plenty of mistakes, like Intel. Their car project was the perfect case study in how not to handle a long-term project. You just have to remember not to repeat those failures.
 
  • Like
Reactions: Tlh97 and 511

511

Diamond Member
Jul 12, 2024
4,523
4,144
106
Intel's NPUs are too big. Look at Apple: it's 5-7 mm² at most on the A18. LNL's NPU4 isn't much smaller than the entire 4C CPU cluster. How is Apple so good at designing all these IPs, man... This is why Intel didn't stand a chance in the mobile market.
They didn't, because they never bothered, and when they did bother it was too late.

Apple's and Intel's use of libraries makes a difference as well: one uses HP, the other HD. Apple is moving to HP now, since they have started chasing clocks.

There is also the stagnation of both the design team and the foundry; the two were interdependent. Meanwhile Apple multi-sourced, and had the execution ability Intel lacked.

Combine these with Apple's amazing design and you get the M-series SoCs.
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,761
106
Apple's and Intel's use of libraries makes a difference as well: one uses HP, the other HD. Apple is moving to HP now, since they have started chasing clocks.
I doubt Intel is using HP libraries for the NPU.

Intel's NPU does support more data types than Apple's does, iirc. Though I don't think that alone explains the much bigger NPU size.

Edit: Apple's NPU is 35 TOPS of INT8, Lunar Lake's NPU is 48 TOPS of INT8. So Intel does have more "TOPS".
 

dullard

Elite Member
May 21, 2001
25,994
4,608
126
Today, from a consumer point of view, the NPU just adds to the weight of the chip. MS failed to deliver anything of substance this year, and by the time they do come up with cool features we'll probably have a new generation of chips anyway.
I'd have to disagree. The previews of AI computers coming out in the next 3 months are finally getting good reviews for the AI features. Some of this is brand-focused (HP's AI print seems to make people quite happy: using natural language to reformat documents for exactly the right margins, correct printer problems, etc.). This includes reviewers who previously panned AI as useless or not ready.

But some things from Microsoft will work on all Windows computers. I'm most looking forward to the new Windows search: no longer do you have to search for file names or specific text in documents. Just type "BBQ Party" in any search bar and all photos from your 2017 barbecue party appear (regardless of their file names or any tags you might have used). Voice clarity for conference calls and Paint gaining features like erase or fill will also be quite useful to me. Click to Do might be good; it depends on implementation, and I haven't tried it yet.
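For what it's worth, that kind of search is usually embedding-based: an on-device model maps both photos and the query text into a shared vector space, and matching is nearest-neighbour by cosine similarity. A tiny sketch of the idea (filenames, vectors, and scores below are all made up for illustration; this is the generic technique, not Microsoft's actual implementation):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend embeddings, computed offline (e.g. by the NPU) for each photo.
photo_index = {
    "IMG_0412.jpg": [0.9, 0.1, 0.0],   # grill, smoke, people
    "IMG_0999.jpg": [0.0, 0.2, 0.95],  # snowy mountain
}
query = [0.85, 0.15, 0.05]             # embedding of "BBQ Party"

# Search = rank photos by similarity to the query embedding.
best = max(photo_index, key=lambda name: cosine(photo_index[name], query))
```

The NPU angle is that embedding every photo in a library is the heavy, batchable part, which is exactly the kind of work you'd want off the CPU.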
 
  • Like
Reactions: Elfear

Wolverine2349

Senior member
Oct 9, 2022
525
178
86
The NPU is only because of Wall Street's obsession with AI. Just going to have to live with it until it ends.

BINGO Well said.

SO sick of AI, and I hope it blows up and crashes much worse than the dot-com bubble ever did.

The irony of dot-com is that the Internet had a positive, revolutionary impact on our lives at the time. There was just irrational exuberance, with the market bidding up anything with so much as a dot-com name, even companies with zero assets or bad debt, which caused the severe crash.

AI is far more useless, and there is starting to be irrational exuberance for something that is dangerous and should never be allowed to revolutionize and enslave our lives. Cannot wait until it's wiped out of Wall Street.
 
Last edited:
  • Like
Reactions: sgs_x86

jur

Member
Nov 23, 2016
47
37
91
AI is not going away; if anything its presence will increase. MS will probably push it to all Office products. I bet it will also become a core part of Windows, so that we'll be able to actually talk to the OS. Google will integrate it into its services. Photo/video editing software will have it (or already has it). All text editing software... Essentially, it will be everywhere, and from my experience it can be a big time saver in some cases. Accelerators are the future, unless there's a big jump in CPU performance, but from the latest Intel/AMD releases the outlook is pessimistic.
 

CakeMonster

Golden Member
Nov 22, 2012
1,630
809
136
My problem is more with the sharing of data, if they want to introduce local AI accelerated by CPU/GPU that can run offline with local accounts, then fine. Just let me enable/disable it at will in Edge/Office/Desktop etc and don't make any other Windows features dependent on it.
 

dullard

Elite Member
May 21, 2001
25,994
4,608
126
My problem is more with the sharing of data, if they want to introduce local AI accelerated by CPU/GPU that can run offline with local accounts, then fine.
That is the whole reason for NPUs: to do the AI work locally without sharing data.
Just let me enable/disable it at will in Edge/Office/Desktop etc and don't make any other Windows features dependent on it.
You can choose not to use many of the AI features. But much of it will be built in. For example, I doubt that you'll be able to turn off natural-language file searching (nor, once you use it, will you want to).

I realize that I'm just about the only AI cheerleader here. But, damn is it useful when I use it. It is like night and day better.
 

Jan Olšan

Senior member
Jan 12, 2017
574
1,131
136
The fact that Pat killed Royal core was something. That team was the only team at Intel who could have matched Apple's P core in one generation.

But nooo, kill that team. Now that group has founded a RISC-V startup.
Do we even know the project was any good?

Actually, where do we even know about "Royal Core" from (and Royal Cove supposedly being the best thing since sliced bread)? Isn't it just from some MLID video? Remember that Arrow Lake was supposed to have +40% ST performance according to the same source.

I mean, if Royal Core or Beast Lake or whatever was supposed to have the rentable units (also from MLID?), those things sounded awfully like the VISC concept Intel bought in with a startup (Soft Machines). And VISC sounded like a design trying to implement inverse HT (which was an April Fools' joke at one point), likely through some software-translation scheme a la Nvidia's ARM cores and Transmeta.

I think that was more likely to be the next colossal trainwreck than the next CPU revolution. If Gelsinger killed these crazy projects and ordered the teams to focus on cores that get good the conventional way, instead of trying to be the next Bulldozer or Itanium, that is probably a good thing, not a bad one.

"Conventional" may sound bad but that is the success path so far - Apple, AMD, ARM, even Intel post Netburst (and Itanium).
 
  • Like
Reactions: techjunkie123

DrMrLordX

Lifer
Apr 27, 2000
22,902
12,971
136
That reminds me, Intel bought that design staff from VIA for $125M back in 2021. Anybody happen to know what happened to them? I guess they strengthened the Atom/e core team in Austin?
It would be interesting to know what happened to that team. Somehow I doubt they're contributing to Atom design, but I could be wrong.
 
  • Like
Reactions: moinmoin

OneEng2

Senior member
Sep 19, 2022
840
1,105
106
The lack of a fast L3 cache hurts Skymont in LNL quite a bit, especially in FP, where Crestmont is quite weak. It seems it loses almost all of the gains from the doubled FP units.
This, and lots of other limitations, will hurt Skymont in many situations and applications. That's OK though; that is exactly why a P core cluster is needed.

I believe that the future of processing is more stratification of workloads, not unification of the CPU Core designs.

Already we have:
  • High Performance Core
  • Efficiency Core
  • Graphics Core
  • AI Core
I think that as time moves on, the number of specific core types will increase, and products will be defined mostly by the mix of core types and the number of each. Musical instruments, for example, already employ a DSP core for sound-processing algorithms. They can do in microseconds the kind of work a PC workstation takes many seconds to do, to the point where all channel processing within a mixer is done and converted to analog signals in under 1 ms. In other words, the dedicated hardware is THOUSANDS of times faster than a very fast PC algorithm.

I do not imagine a world where a single compute design rules them all.

You disparage Lion Cove and praise Skymont; however, I think that while there may be some justification for your Lion Cove animosity, the design is undoubtedly more efficient than the previous generation. That efficiency is where things will pay off in the future, IMO.
 

Josh128

Golden Member
Oct 14, 2022
1,319
1,986
106
That is the whole reason for NPUs: to do the AI work locally without sharing data.

You can choose not to use many of the AI features. But, so much of it will be built in. For example, I doubt that you'll be able to turn of natural language file searching (nor once you use it will you want to turn it off).

I realize that I'm just about the only AI cheerleader here. But, damn is it useful when I use it. It is like night and day better.
What precisely do you use it for? Other than image generation, which often requires a lot of manual tweaking once you obtain the output, web search summary is the only thing I use it for (because it's the default for most browsers now), and that's not a particularly must-have feature to me.