Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
942
858
106
Wildcat Lake (WCL) Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing Raptor Lake-U. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing CPU, GPU, and NPU, fabbed on the Intel 18A process. Last time I checked, the PCD tile is fabbed on TSMC's N6 process. The tiles are connected through UCIe rather than D2D, a first for Intel. Expect a launch in Q1 2026.

| | Intel Raptor Lake-U | Intel Wildcat Lake 15W? | Intel Lunar Lake | Intel Panther Lake 4+0+4 |
|---|---|---|---|---|
| Launch date | Q1 2024 | Q2 2026 | Q3 2024 | Q1 2026 |
| Model | Intel 150U | Intel Core 7 | Core Ultra 7 268V | Core Ultra 7 365 |
| Dies | 2 | 2 | 2 | 3 |
| Node | Intel 7 + ? | Intel 18A + TSMC N6 | TSMC N3B + N6 | Intel 18A + Intel 3 + TSMC N6 |
| CPU | 2 P-cores + 8 E-cores | 2 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores |
| Threads | 12 | 6 | 8 | 8 |
| CPU max clock | 5.4 GHz | ? | 5 GHz | 4.8 GHz |
| L3 cache | 12 MB | ? | 12 MB | 12 MB |
| TDP | 15-55 W | 15 W? | 17-37 W | 25-55 W |
| Memory | 128-bit LPDDR5-5200 | 64-bit LPDDR5 | 128-bit LPDDR5X-8533 | 128-bit LPDDR5X-7467 |
| Max memory | 96 GB | ? | 32 GB | 128 GB |
| Bandwidth | ? | ? | 136 GB/s | ? |
| GPU | Intel Graphics | Intel Graphics | Arc 140V | Intel Graphics |
| Ray tracing | No | No | Yes | Yes |
| EU / Xe | 96 EU | 2 Xe | 8 Xe | 4 Xe |
| GPU max clock | 1.3 GHz | ? | 2 GHz | 2.5 GHz |
| NPU | GNA 3.0 | 18 TOPS | 48 TOPS | 49 TOPS |









With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the next-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.



 

Attachments: PantherLake.png · LNL.png · INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg · Clockspeed.png

Fjodor2001

Diamond Member
Feb 6, 2010
4,644
748
126
What are we really expecting from Titan Lake (successor to Razor Lake) w.r.t. unified core?

Reading this article:

it sounds like we'll still have P and E cores. Says it'll switch to a common ISA, but don't we have that already with NVL for P and E cores?
 

Khato

Golden Member
Jul 15, 2001
1,385
492
136
it sounds like we'll still have P and E cores. Says it'll switch to a common ISA, but don't we have that already with NVL for P and E cores?
As the article implies, Intel's unified core approach is going to mirror AMD's: don't design an entirely separate core to differentiate between P and E, just change parameters and synthesis targets on the same design.
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,644
748
126
As the article implies, Intel's unified core approach is going to mirror AMD's: don't design an entirely separate core to differentiate between P and E, just change parameters and synthesis targets on the same design.
Assuming Intel uses the E core as the base for the unified core, can we expect them to push its P core variant to P core levels of ST perf?
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,644
748
126
Obviously, they're not going to regress ST
So how do you modify the current E core design to arrive at a unified core whose P core variant reaches P core levels of ST perf (at the level expected in 2028, not now), without sacrificing E core targets (area / perf/watt / price / …)?
 

Khato

Golden Member
Jul 15, 2001
1,385
492
136
I'm not sure what all the plans are for the unified core. One interesting notion that might be carried over from the Royal Core development is a more scalable design. The simplest example based on the current E core implementation would be for the P core variant to have a 5x3 clustered decode front end while the E core variant remains 3x3. It'll be interesting to see if they come up with a similar 'clustered' approach for the back end. Similarly, they could keep a larger L2 cache shared between 2 cores for the P core variant and a smaller L2 cache shared between 4 cores for the E core variant.

Another important point to keep in mind is that there won't be the current size disparity between the P and E core variants: probably more like 2:1 instead of the current 4:1, mostly because the P core variant won't have as poor PPA as the current P core.
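That "same design, different parameters" flow can be caricatured in a few lines. Everything here (structure names, sizes) is hypothetical and only echoes the 5x3-vs-3x3 decode and shared-L2 examples above:

```python
from dataclasses import dataclass, replace

# Hypothetical parameter set for one base core design; the numbers are
# made up and only illustrate "one architecture, two knob settings".
@dataclass(frozen=True)
class CoreConfig:
    decode_clusters: int  # number of 3-wide decode clusters
    l2_kb: int            # L2 size shared within a cluster of cores
    cores_per_l2: int     # how many cores share that L2

e_variant = CoreConfig(decode_clusters=3, l2_kb=4096, cores_per_l2=4)
# Same base design, different knob settings for the "P" variant:
p_variant = replace(e_variant, decode_clusters=5, l2_kb=5120, cores_per_l2=2)

print(p_variant.decode_clusters * 3, "decode lanes on the P variant")  # 5x3 = 15
```

The point of the sketch: only the parameter values differ, not the type describing the design, which is roughly what "parameterize and re-synthesize" means.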
 

DavidC1

Platinum Member
Dec 29, 2023
2,187
3,340
106
So how do you modify the current E core design to arrive at a unified core whose P core variant reaches P core levels of ST perf (at the level expected in 2028, not now), without sacrificing E core targets (area / perf/watt / price / …)?
Of course it'll be bigger than the current ~1 mm² target for the E cores; that's to be expected. The difference, and the hope, is that a grown-up version will be power efficient and take up less area than whatever lackluster trajectory the P core was on. Or you get similar parameters but better performance. This mirrors the Netburst-to-Core transition. It's not exact, but as the saying goes, "history doesn't repeat, but it rhymes".
I'm not sure what all the plans are for the unified core. One interesting notion that might be carried over from the Royal Core development is a more scalable design. The simplest example based on the current E core implementation would be for the P core variant to have a 5x3 clustered decode front end while the E core variant remains 3x3. It'll be interesting to see if they come up with a similar 'clustered' approach for the back end. Similarly, they could keep a larger L2 cache shared between 2 cores for the P core variant and a smaller L2 cache shared between 4 cores for the E core variant.
Arctic Wolf is very likely 4x3. Unified Core means the base architecture stays the same. If one is 5x3 and the other is say 6x3, then it's not unified. Likely they are both on 7x3, but one is frequency optimized and the other is density. Clusters are not needed for the back end. Remember, clustered decode came out of addressing the x86 decode issue. In some respects clustered is even better than monolithic, but I think that's a happy side effect. The back end doesn't have that "issue".

They've been aiming at ~30% gain every generation going back to Silvermont in 2013. Silvermont was the outlier with 50% but it also had the very anemic Bonnell predecessor.
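For a rough sense of how that ~30% cadence compounds, here is a back-of-the-envelope sketch. The flat 30% per generation is a simplification of the claim above, not measured data:

```python
# Compound effect of ~30% per-generation gains (assumed flat, for illustration).
gens = ["Goldmont", "Goldmont Plus", "Tremont", "Gracemont", "Skymont"]
perf = 1.0  # Silvermont baseline
for name in gens:
    perf *= 1.30
    print(f"{name}: {perf:.2f}x Silvermont")
# Five ~30% steps multiply out to ~3.7x, not the 2.5x you'd get by adding.
```

The takeaway is that per-generation percentages multiply rather than add, which is why a steady 30% cadence is a much bigger deal over a decade than it sounds.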
 

Khato

Golden Member
Jul 15, 2001
1,385
492
136
Unified Core means the base architecture stays the same. If one is 5x3 and the other is say 6x3, then it's not unified. Likely they are both on 7x3, but one is frequency optimized and the other is density.
It depends on how it's designed, no? Why should parameterizing most of the design to work with an arbitrary number of decode clusters be difficult? I'd guess that only a few blocks would require separate logic for each configuration. Would that really qualify as a different architecture? I know it would with old-school design methodology, but with current design and synthesis flows it could be almost entirely reuse.

Note that I was mentioning the backend side of the design adopting a similar approach simply to keep resource ratios similar. Basically take any structures that are amenable to scaling/duplication and use them to differentiate between the P and E variants, while other areas are exactly the same and designed for the P core performance level.
 

OneEng2

Golden Member
Sep 19, 2022
1,012
1,212
106
What are we really expecting from Titan Lake (successor to Razor Lake) w.r.t. unified core?

Reading this article:

it sounds like we'll still have P and E cores. Says it'll switch to a common ISA, but don't we have that already with NVL for P and E cores?
I think the real question is how much more the "unified core" will look like a P core than an E core. Or will it look more like an E core?

I have been thinking that those who imagine Zen 6-level performance out of a Skymont-level die area and power envelope will be disappointed. In engineering you never get something for nothing.
As the article implies, Intel's unified core approach is going to mirror AMD's: don't design an entirely separate core to differentiate between P and E, just change parameters and synthesis targets on the same design.
AMD has a good approach IMO. It is very cost effective and design time friendly (in comparison to making 2 totally different core architectures and then trying to get everything scheduled correctly).
This is mirroring Netburst to Core transition. It's not exact, but as the saying goes "history doesn't repeat, but rhymes".
LOL. Indeed.

I still have trouble giving "Cove" the same black eye as "Netburst". I still have hope that, should Intel free up the latency in NVL, "Cove" might breathe much better than people give it credit for.

Still, your point is well taken.
 

Thunder 57

Diamond Member
Aug 19, 2007
4,294
7,101
136
I think the real question is how much more the "unified core" will look like a P core than an E core. Or will it look more like an E core?

I have been thinking that those who imagine Zen 6-level performance out of a Skymont-level die area and power envelope will be disappointed. In engineering you never get something for nothing.

AMD has a good approach IMO. It is very cost effective and design time friendly (in comparison to making 2 totally different core architectures and then trying to get everything scheduled correctly).

LOL. Indeed.

I still have trouble giving "Cove" the same black eye as "Netburst". I still have hope that, should Intel free up the latency in NVL, "Cove" might breathe much better than people give it credit for.

Still, your point is well taken.

You keep saying that. I think it would be more appropriate to say there's "no free lunch" regarding physics. To use Netburst as you mentioned, compare it to Core 2: Conroe was a lot faster, used less die area, drew less power, and produced less heat than Netburst at the time. Time and cost are more difficult to consider, but it still seemed like a giant win.

Just recently you posted this:

I guess I believe that in this day and age, there aren't any mysterious CPU architectures that magically work better than everything that came before it.

I believe that Apple, Intel, and AMD all have equivalent engineering teams and tools. The difference is what you target your architecture to do and what things you decide to prioritize and what things you decide to give up.


I believe you can't say I want it all and actually get it all. If you say I want a core that is very power efficient, you can't also say I want a core that clocks higher than the competition.

You can't say I want the core to be very small AND I want 4 way SMT, AVX512, etc, etc.

I do agree that Lion Cove appears to have lower PPA than Zen 5, although I think ARL in general gets a pretty bad rap on the basis of its poor showing in latency sensitive applications (which is mostly a ring bus issue IMO vs a core problem).

I think it is a pretty tall order to take ANY derivative of Skymont and make it compete with Zen 5 across the board. I think you can make it do some things better, but at the expense of doing other things worse.

I just don't see getting something for nothing in engineering.

Well, that's a rather defeatist attitude. Rather than pushing to innovate, you think teams just throw in the towel and say, "Well, that's it lads, we've thought of everything; best to figure out how to put out the best CPU with all that will ever exist." That might be simplifying it a bit. If I am wrong, though, I would like to hear your thoughts.
 

Josh128

Banned
Oct 14, 2022
1,542
2,295
106
You keep saying that. I think it would be more appropriate to say there's "no free lunch" regarding physics. To use Netburst as you mentioned, compare it to Core 2: Conroe was a lot faster, used less die area, drew less power, and produced less heat than Netburst at the time. Time and cost are more difficult to consider, but it still seemed like a giant win.

Just recently you posted this:



Well, that's a rather defeatist attitude. Rather than pushing to innovate, you think teams just throw in the towel and say, "Well, that's it lads, we've thought of everything; best to figure out how to put out the best CPU with all that will ever exist." That might be simplifying it a bit. If I am wrong, though, I would like to hear your thoughts.
It's probably better to say there's very little low-hanging fruit left to pick in x86 CPU design. Gains are small and often localized to certain functions. It's all dependent on the current state of the art in Boolean algebra, logic design, and materials science. Small advancements in any of them can lead to significant performance gains, but it's a slog, and breakthroughs are slow. Maybe AI will help speed things up, lol.
 

511

Diamond Member
Jul 12, 2024
5,495
4,897
106
Thanks for sharing. A few percent slower than the equivalent 255H. Data encryption and compression scores were dragging the 358H down quite a bit. But this still isn't a launch BIOS, and we don't know the power consumption or frequencies used.
It's a 400 MHz lower clock (4.7 vs 5.1 GHz) with roughly 6% IPC improvement, so the best case is PTL matching ARL's ST performance at lower power.
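As a quick sanity check on that claim (taking the rumored 4.7 GHz vs 5.1 GHz clocks and the ~6% IPC figure at face value; neither is confirmed), the arithmetic does work out roughly flat:

```python
# Rough ST-performance estimate: perf ~ clock x IPC (relative units).
# The 4.7/5.1 GHz clocks and the 6% IPC uplift are rumors, not confirmed specs.
arl_clock, ptl_clock = 5.1, 4.7   # GHz
ipc_uplift = 1.06                 # ~6% IPC gain for PTL over ARL

relative_st = (ptl_clock / arl_clock) * ipc_uplift
print(f"PTL ST vs ARL: {relative_st:.3f}x")  # ~0.977x, i.e. within a few percent
```

So the ~8% clock deficit and the ~6% IPC gain nearly cancel, leaving PTL within about 2% of ARL in single thread on these numbers.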
 

dullard

Elite Member
May 21, 2001
26,196
4,869
126
It's a 400 MHz lower clock (4.7 vs 5.1 GHz) with roughly 6% IPC improvement, so the best case is PTL matching ARL's ST performance at lower power.
I think you misunderstood my post. 4.8 GHz for the P cores is what rumors state the final 358H configuration will be (I haven't seen E core speed rumors yet; do you know?). But we don't know at what speed or power that specific test was performed. We can assume it is what the rumors state; I just like to be conservative and point out that the actual values are not specified.
 

OneEng2

Golden Member
Sep 19, 2022
1,012
1,212
106
Well, that's a rather defeatist attitude. Rather than pushing to innovate, you think teams just throw in the towel and say, "Well, that's it lads, we've thought of everything; best to figure out how to put out the best CPU with all that will ever exist." That might be simplifying it a bit. If I am wrong, though, I would like to hear your thoughts.
I would argue that, for the most part, generation-on-generation performance gains have come on the back of generation-on-generation lithography improvements.

The notable difference was that Core 2 was simply a MUCH better architecture than Netburst, which was wrongheaded in so many ways.

I do not believe that either AMD or Intel's current architectures are fundamentally "wrong headed" like Netburst.

Because I don't expect much to change in lithography in the next generation (15%?), I also don't expect much to change in processor design. This comes from my belief that the big improvements come at the expense of a higher transistor budget (deeper buffers, wider execution, more execution units, more L1, L2, L3, etc.).

When there isn't more transistor budget to be had, I believe the best you can do is make good tradeoffs. Yes, special instructions help, but only in limited circumstances.
Plenty left, but none of that fits in area or timing budgets.
Agree. I think we might see a little surprise here or there, but without the huge density improvements of the past, it's hard to see big generation-on-generation performance improvements.

In fact, I think it may be even less than the transistor budget improvement percentage, since many of the current methods have reached an inflection point: adding x% more transistors no longer gets you x% more performance, it gets you y%, and y keeps getting smaller.
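One old heuristic for that inflection is Pollack's rule, which says single-thread performance scales roughly with the square root of the transistor budget. A quick sketch (the +15% case matches the density guess above; the rule itself is only a rule of thumb, not a law):

```python
import math

# Pollack's rule sketch: perf ~ sqrt(transistor budget).
# The +15% figure is the lithography guess from the post above; the other
# budgets are just for comparison.
for budget_gain in (0.15, 0.50, 1.00):   # +15%, +50%, +100% transistors
    perf_gain = math.sqrt(1 + budget_gain) - 1
    print(f"+{budget_gain:.0%} transistors -> ~+{perf_gain:.0%} perf")
```

On this heuristic, a 15% transistor budget increase buys only about 7% more single-thread performance, which is exactly the x%-in, y%-out (y < x) shape described above.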