Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads

Tigerick

Senior member
Apr 1, 2022
919
834
106
Wildcat Lake (WCL) Preliminary Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing ADL-N. WCL consists of two tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18A process. Last time I checked, the PCD tile is fabbed on TSMC's N6 process. They are connected through UCIe, not D2D; a first from Intel. Launch is expected in Q2/Computex 2026. In case people don't remember Alder Lake-N, I have created a table below comparing the detailed specs of ADL-N and WCL. Just for fun, I am throwing in LNL and the upcoming Mediatek D9500 SoC.

                | Intel Alder Lake-N | Intel Wildcat Lake      | Intel Lunar Lake        | Mediatek D9500
Launch Date     | Q1-2023            | Q2-2026 ?               | Q3-2024                 | Q3-2025
Model           | Intel N300         | ?                       | Core Ultra 7 268V       | Dimensity 9500 5G
Dies            | 2                  | 2                       | 2                       | 1
Node            | Intel 7 + ?        | Intel 18A + TSMC N6     | TSMC N3B + N6           | TSMC N3P
CPU             | 8 E-cores          | 2 P-core + 4 LP E-cores | 4 P-core + 4 LP E-cores | C1 1+3+4
Threads         | 8                  | 6                       | 8                       | 8
Max Clock (CPU) | 3.8 GHz            | ?                       | 5 GHz                   |
L3 Cache        | 6 MB               | ?                       | 12 MB                   |
TDP             | 7 W                | Fanless ?               | 17 W                    | Fanless
Memory          | 64-bit LPDDR5-4800 | 64-bit LPDDR5-6800 ?    | 128-bit LPDDR5X-8533    | 64-bit LPDDR5X-10667
Size            | 16 GB              | ?                       | 32 GB                   | 24 GB ?
Bandwidth       |                    | ~ 55 GB/s               | 136 GB/s                | 85.6 GB/s
GPU             | UHD Graphics       |                         | Arc 140V                | G1 Ultra
EU / Xe         | 32 EU              | 2 Xe                    | 8 Xe                    | 12
Max Clock (GPU) | 1.25 GHz           |                         | 2 GHz                   |
NPU             | NA                 | 18 TOPS                 | 48 TOPS                 | 100 TOPS ?
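
For anyone wanting to sanity-check the Bandwidth row, here is a minimal Python sketch of the usual peak-bandwidth arithmetic (bus width in bytes times transfer rate). The function name and the dictionary labels are made up for illustration, and the WCL memory speed is still unconfirmed, so treat the output as a rough check rather than a spec.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


# Bus width (bits) and data rate (MT/s) as listed in the Memory row of the table.
configs = {
    "ADL-N (64-bit LPDDR5-4800)": (64, 4800),
    "WCL (64-bit LPDDR5-6800, unconfirmed)": (64, 6800),
    "LNL (128-bit LPDDR5X-8533)": (128, 8533),
    "D9500 (64-bit LPDDR5X-10667)": (64, 10667),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

This gives about 54.4 GB/s for WCL, 136.5 GB/s for LNL and 85.3 GB/s for D9500, close to the figures in the table. ADL-N works out to 38.4 GB/s by the same arithmetic.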









With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which is based on EUV lithography, a first from Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what the Intel roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, the first from Intel to use GAA transistors, called RibbonFET.




AcrosTinus

Senior member
Jun 23, 2024
221
226
76
Maybe they can release a 280K SKU with 6P+16E to dial down the power consumption a bit or higher all core clocks for both P and E cores (maybe 290K in that case?).
Cope on :
There is a reason Intel left so much space in the naming, remember the ADM LV4 cache, my "Sooooooooources" told me it was for the gaming version of Arrow Lake. Let AMD run into a trap of taking the crown in gaming just for it to be snatched away in a humiliating defeat.
 

Meteor Late

Senior member
Dec 15, 2023
347
382
106
It is getting hard for me to take Intel process timelines seriously because they seem to miss their deadlines so regularly. ARL disappoints architecturally and isn't on Intel 20A. Might be time for an AMD/Intel pair trade.

It just seems like IPC-wise x86 is nearly topped out. Zen 5 didn't show big gains and Lion Cove showed both gains and regressions. It seems more likely that the E cores will approach current ST iso-frequency performance and peter out at that performance level as well; they just end up there in a more area-efficient manner.

Without software help like new instructions or something I don't see how we get another 15 to 20% increase on a generation from either Zen 5 or LNC? 8 wide is already too wide, right?

On that same note, considering performance regression is LNC still +9% over Raptor Cove?

Apple did get the biggest IPC boost since M1 with M4, and I think it was around 8% or something like that. I understand it's not all about IPC and frequency is the other part of the equation, but it seems IPC gains have been shrinking in both camps, at least the ones with highest performance to begin with.
Don't think IPC is nearly topped out but it seems like it's just harder and harder to obtain.
 

Meteor Late

Senior member
Dec 15, 2023
347
382
106
Non-English YT video showing the 245K as more efficient and faster than 9600X:

But slower in games obviously.

No idea how accurate or true the bit about lower power consumption is, though.

Maybe Intel has a chance now to convince budget users to go for Arrow Lake?

EDIT: Scratch that! From same video:

View attachment 110774

Left ARL and right Zen 5 in PUBG. ARL is legendarily bad!

View attachment 110775

Guy trying his best to teach his followers that Zen 5 delivers more frames using more power :D

More cores increase MT efficiency versus fewer cores because of the quadratic relationship between voltage and power. That is to say, increasing frequency by 33% is going to increase power far more than increasing the number of cores by 33%; conversely, an 8-core CPU will be much more efficient than a 6-core CPU just by reducing the frequency a little bit.
So the 245K has 6 + 8 cores without HT and the 9600X has 6 cores with SMT. The core count difference is really big: if you add the +20-30% from SMT, it's like 14 vs 7.5 cores, almost double, so MT efficiency is going to be better with the 245K. On the other hand, the 285K is 8 + 16 cores vs the 9950X with 16 cores with SMT. That's closer in core count: if you add the +20-30% from SMT, it's like 24 vs 20 cores, about 20% more. The E cores would make the count a bit less, but you get the point.
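
To put rough numbers on that argument, here is a small Python sketch of the textbook dynamic-power relation (power proportional to cores x V^2 x f). It assumes voltage has to rise linearly with frequency, which is a simplification and not a measured V/f curve, and it assumes MT throughput scales perfectly with cores x frequency. The function names and the 25% SMT uplift (the midpoint of the +20-30% range above) are illustrative choices.

```python
def relative_power(core_scale: float, freq_scale: float) -> float:
    """Relative dynamic power from P ~ N * V^2 * f."""
    voltage_scale = freq_scale  # assumption: voltage rises linearly with frequency
    return core_scale * voltage_scale ** 2 * freq_scale


def relative_throughput(core_scale: float, freq_scale: float) -> float:
    """Relative MT throughput, assuming perfect scaling with cores and clocks."""
    return core_scale * freq_scale


def smt_equivalent_cores(cores: int, smt_uplift: float = 0.25) -> float:
    """Core count with SMT counted as a fractional uplift per core."""
    return cores * (1 + smt_uplift)


# Two ways to get 33% more throughput under this model.
power_more_freq = relative_power(1.0, 1.33)   # same cores, 33% higher clock
power_more_cores = relative_power(1.33, 1.0)  # 33% more cores, same clock
print(f"+33% frequency: {power_more_freq:.2f}x power")
print(f"+33% cores:     {power_more_cores:.2f}x power")

# 8 cores clocked at 75% match 6 cores at full clock in throughput.
print(f"8 cores @ 0.75x clock vs 6 cores: {relative_power(8 / 6, 0.75):.2f}x power")

# The SMT-adjusted core counts used above.
print(smt_equivalent_cores(6), smt_equivalent_cores(16))
```

Under these assumptions the frequency route costs about 2.35x the power while the core route costs 1.33x, and the 8-core part matches the 6-core part at roughly 56% of the power. Real chips have leakage and a voltage floor, so the actual gap will be smaller.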
 

Hulk

Diamond Member
Oct 9, 1999
5,226
3,857
136
Apple did get the biggest IPC boost since M1 with M4, and I think it was around 8% or something like that. I understand it's not all about IPC and frequency is the other part of the equation, but it seems IPC gains have been shrinking in both camps, at least the ones with highest performance to begin with.
Don't think IPC is nearly topped out but it seems like it's just harder and harder to obtain.
Here's what I don't comprehend when people around here (who know a lot more than I do about this) say IPC isn't nearly topped out.

IPC relates to single thread performance. Correct me when needed with my somewhat naive line of reasoning.

A single thread is essentially a sequential line of instructions that must (theoretically) be executed one after the other. The problem when trying to execute them in parallel is that some instructions later in the line require results from earlier ones. The result is that you can't simply take 10 instructions and execute them at the same time, because of interdependencies among the instructions. I have done some coding and have a basic understanding of how this works.

This is why we have out-of-order execution, caches, branch prediction, and other "smart" structures in the microprocessor architecture that do their best to extract as much parallelism as possible from the code. It amazes me that people have actually devised methods to execute instructions out of order and make it work.

So, to my question: it seems there must eventually be a limit to how much of a given sequential stream of interdependent instructions can be executed in parallel? I assume some parts of the code are very amenable to parallel execution, while other parts are very dependent on neighboring instructions, and the result is branch mispredictions, pipeline stalls, and other slowdowns.

So when it is said, "we have a long way to go with increasing IPC" it seems like the very nature of the instructions and their interdependencies contradict that statement.
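
One way to make that limit concrete is the so-called dataflow limit: even with infinite width and perfect prediction, a block of code can't finish faster than its longest dependency chain. The Python sketch below is a toy model under those idealized assumptions (unit latency for every instruction, dependencies given by hand). The function name and the three example instruction streams are invented for illustration.

```python
from typing import Dict, List


def dataflow_ilp(deps: Dict[int, List[int]]) -> float:
    """Upper bound on instructions per cycle for a dependency graph.

    deps maps each instruction index to the earlier instructions whose
    results it needs. Assumes unit latency and unlimited execution width.
    """
    depth: Dict[int, int] = {}
    for instr in sorted(deps):  # program order, so producers are seen first
        depth[instr] = 1 + max((depth[d] for d in deps[instr]), default=0)
    critical_path = max(depth.values())
    return len(deps) / critical_path


# Ten instructions where each needs the previous result: fully serial.
chain = {i: ([i - 1] if i > 0 else []) for i in range(10)}
# Ten instructions with no dependencies at all: fully parallel.
independent = {i: [] for i in range(10)}
# Two separate chains of five instructions each.
two_chains = {i: ([i - 1] if i % 5 else []) for i in range(10)}

for name, graph in [("chain", chain), ("independent", independent), ("two chains", two_chains)]:
    print(f"{name}: ILP limit = {dataflow_ilp(graph):.1f}")
```

In this model the serial chain is stuck at 1 instruction per cycle no matter how wide the core is, while the independent stream could in principle issue all ten at once. Real code sits somewhere in between, which is the limit being asked about.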

Okay, that's what's bouncing around in my brain regarding this. Please feel free to expand my horizons on this topic and let me know where my understanding is incorrect.
 

ondma

Diamond Member
Mar 18, 2018
3,316
1,708
136
non-K is the volume desktop. 4P+16E as a volume SKU only works if the bins allow it to.
Why would you need 16E for mainstream desktop? Seems like even 4P alone would be enough; for sure 4P/8E should do it.
As far as that goes, I would think even 8E would suffice, with the big jump in E core IPC.
 

adroc_thurston

Diamond Member
Jul 2, 2023
7,886
10,601
106
Why would you need 16E for mainstream desktop? Seems like even 4P alone would be enough; for sure 4P/8E should do it.
As far as that goes, I would think even 8E would suffice, with the big jump in E core IPC.
6+8 is the INTC MSDT offering now.
 

Doug S

Diamond Member
Feb 8, 2020
3,746
6,613
136
N2P won't come until the latter half of '26 and only for Apple; it is 18A for the majority, and Graphics+IO will be TSMC

I doubt Apple will ever use N2P. It arrives at the same time as A16, which is N2P+BSPDN, so that's what Apple will use.
 

DavidC1

Platinum Member
Dec 29, 2023
2,021
3,157
96
Here's what I don't comprehend when people around here (who know a lot more than I do about this) say IPC isn't nearly topped out.

IPC relates to single thread performance. Correct me when needed with my somewhat naive line of reasoning.
For x86 they have the insane clock speed focus plus their execution sucks. If M4 is close to the peak of what can be achieved, then that's how much they can get better.

4000 in GB6 ST on a 4.5GHz chip, never mind the power consumption differences. AMD/Intel are currently embarrassing.

How is it justified in any fashion to have a 5.7GHz chip consuming an enormous amount of power, risking degradation, and having to clock the rest of the chip way lower, decreasing density, when it loses to a 4.5GHz one?
 

poke01

Diamond Member
Mar 8, 2022
4,606
5,916
106
Is there a 1080p game benchmark where Apple is shown to be superior in handling game engine performance than x86?
For gaming, the 9800X and later the 9950X3D will be the leaders despite Apple's IPC advantage. IPC isn't the be-all and end-all. Apple will, however, remain king in laptop power efficiency.
 

OneEng2

Senior member
Sep 19, 2022
958
1,172
106
Intel's "five nodes in four years" kept too much capacity out of service doing upgrades, plus they've been playing catchup on acquiring EUV scanners. I think that's the real reason they canceled 20A, outfitting for an internal only node that will be used by only one or two products just doesn't make a whole lot of sense. It also causes problems for Intel's "copy exactly" strategy if the nodes aren't around long enough to get copied to other fabs, because those other fabs are being set up for the next node in line.

I didn't read the transcript but from what I heard they claim to have added two customers for 18A. Whether they add meaningful volume I don't know.
Indeed! Intel's "Copy Exact" philosophy made them a ton of money over the last few decades. Changing equipment and nodes so quickly must have totally decimated this approach.
It is getting hard for me to take Intel process timelines seriously because they seem to miss their deadlines so regularly. ARL disappoints architecturally and isn't on Intel 20A. Might be time for an AMD/Intel pair trade.

It just seems like IPC-wise x86 is nearly topped out. Zen 5 didn't show big gains and Lion Cove showed both gains and regressions. It seems more likely that the E cores will approach current ST iso-frequency performance and peter out at that performance level as well; they just end up there in a more area-efficient manner.

Without software help like new instructions or something I don't see how we get another 15 to 20% increase on a generation from either Zen 5 or LNC? 8 wide is already too wide, right?

On that same note, considering performance regression is LNC still +9% over Raptor Cove?
I think the issue is more complicated than you (and others) are making it out to be. Yes, in the past we have seen some pretty crazy IPC improvements, ILP improvements, and MT improvements; however, almost all of it has been accomplished by increasing the transistor budget and/or power budget.

As we see the end of quick gains in transistor budget and power budget, only small incremental changes to general computing can be accomplished, and then only through very clever designs.

The only "big" changes in performance I believe will come from new specialized cores designed for specific tasks.
Zen5 is still much smaller vs Apple cores.

4-wide decode vs 10-wide
8-wide rename vs 10-wide
6 ALU vs 8 ALU
448 ROB vs 960
You don't happen to know how big the die area is for the various Apple cores?
Cope on :
There is a reason Intel left so much space in the naming, remember the ADM LV4 cache, my "Sooooooooources" told me it was for the gaming version of Arrow Lake. Let AMD run into a trap of taking the crown in gaming just for it to be snatched away in a humiliating defeat.
I can't decide if you are being serious, or telling a very good joke.
Apple did get the biggest IPC boost since M1 with M4, and I think it was around 8% or something like that. I understand it's not all about IPC and frequency is the other part of the equation, but it seems IPC gains have been shrinking in both camps, at least the ones with highest performance to begin with.
Don't think IPC is nearly topped out but it seems like it's just harder and harder to obtain.
For single threaded performance, I believe that essentially he with the bigger transistor budget wins (or should win).

For DC loads, it is more complex. It is more about PPA and staying within a power budget.
 

Hulk

Diamond Member
Oct 9, 1999
5,226
3,857
136
For gaming, the 9800X and later the 9950X3D will be the leaders despite Apple's IPC advantage. IPC isn't the be-all and end-all. Apple will, however, remain king in laptop power efficiency.
So true. The IPC doesn't matter much if the decoders are constantly waiting on memory for data. This is why improvements in the memory subsystem without any IPC changes can improve throughput at iso frequency. This is what we saw with the move from Golden Cove to Raptor Cove when the L2 was increased and performance increased a bit.
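
A back-of-the-envelope way to see this is the simple additive stall model: effective CPI is the core's CPI when it isn't waiting, plus misses per instruction times the miss penalty. The Python sketch below uses that model with made-up numbers (a core that sustains 4 IPC when fed, a 200-cycle miss penalty, and two hypothetical miss rates). It ignores overlapping misses, so it overstates the stall cost on a real out-of-order core.

```python
def effective_ipc(core_ipc: float, misses_per_kilo_instr: float,
                  miss_penalty_cycles: float) -> float:
    """Effective IPC once memory stalls are added to the core's own CPI."""
    base_cpi = 1.0 / core_ipc
    stall_cpi = misses_per_kilo_instr / 1000.0 * miss_penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


# Hypothetical core: identical execution resources in both cases.
small_cache = effective_ipc(core_ipc=4.0, misses_per_kilo_instr=5.0, miss_penalty_cycles=200)
# Same core with a larger cache that halves the miss rate (illustrative figure).
large_cache = effective_ipc(core_ipc=4.0, misses_per_kilo_instr=2.5, miss_penalty_cycles=200)

print(f"small cache: {small_cache:.2f} IPC")
print(f"large cache: {large_cache:.2f} IPC")
```

With these illustrative inputs the same core goes from 0.80 to about 1.33 instructions per cycle purely from fewer misses, which is the kind of gain a bigger L2 or L3 can deliver without touching the core itself.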
 

Meteor Late

Senior member
Dec 15, 2023
347
382
106
Is there a 1080p game benchmark where Apple is shown to be superior in handling game engine performance than x86?

Most likely not, especially with the use of slower LPDDR with worse timings and all that, but also lack of big L3 cache. Memory and L3 cache have been the main enablers of AMD and Intel gaming performance increases lately:
-With Alder Lake and Raptor Lake, Intel increased L3 cache across the stack, so that even with DDR4 there was a gaming improvement, but also DDR5 improved things a lot.
-Zen 1 had 8MB L3 per CCX, so that's what a core could access. Zen 2 doubled the L3 cache per CCD, with each CCX having 16MB of L3. Zen 3 got rid of the CCX split, so each core could now access the full 32MB. Zen 4 got no cache increase but moved to DDR5. Zen 5 got no cache increase and the same memory support, hence the paltry gaming results compared to Zen 4.
 

DavidC1

Platinum Member
Dec 29, 2023
2,021
3,157
96
For gaming, the 9800X and later the 9950X3D will be the leaders despite Apple's IPC advantage. IPC isn't the be-all and end-all. Apple will, however, remain king in laptop power efficiency.
See, gaming performance is like how things were with Core 2 and Nehalem.

Core 2 had a really good architecture, but it had an FSB and an off-die memory controller, so it wasn't that good in servers compared to Opteron. But once it got a point-to-point interconnect with QPI and an IMC with Nehalem, it dominated, because the underlying core was good.

If Apple chips are ever optimized for gaming, they can fly, because they have a ridiculously good core. Take a hypothetical M7 with dedicated L2 per core, L3 cache, an optimized interconnect, and X3D-like memory, and it'll fly.
 

poke01

Diamond Member
Mar 8, 2022
4,606
5,916
106
Zen5 is still much smaller vs Apple cores.

4-wide decode vs 10-wide
8-wide rename vs 10-wide
6 ALU vs 8 ALU
448 ROB vs 960
For a 10-wide core it's really efficient. AMD, Intel, ARM, and even Qualcomm are nowhere near Apple in that department, and Apple still offers the fastest core.
 

poke01

Diamond Member
Mar 8, 2022
4,606
5,916
106
See, gaming performance is like how things were with Core 2 and Nehalem.

Core 2 had a really good architecture, but it had an FSB and an off-die memory controller, so it wasn't that good in servers compared to Opteron. But once it got a point-to-point interconnect with QPI and an IMC with Nehalem, it dominated, because the underlying core was good.

If Apple chips are ever optimized for gaming, they can fly, because they have a ridiculously good core. Take a hypothetical M7 with dedicated L2 per core, L3 cache, an optimized interconnect, and X3D-like memory, and it'll fly.
It would be so hard for ARM, even with Apple cores, to be good at gaming. There is so much optimization that goes into x86 in terms of software for games.

Now with the Switch 2, and IF Nvidia makes an ARM SoC for WoA, that will change, but until then x86 will remain the king for gaming. Apple's SoC is great for productivity, but in the end both ecosystems will exist in their own bubble; it really always depends on the person's use case.