Discussion Intel Meteor, Arrow, Lunar & Panther Lakes + WCL Discussion Threads


Tigerick

Senior member
Apr 1, 2022
851
802
106
Wildcat Lake (WCL) Preliminary Specs

Intel Wildcat Lake (WCL) is an upcoming mobile SoC replacing ADL-N. WCL consists of 2 tiles: a compute tile and a PCD tile. The compute tile is a true single die containing the CPU, GPU and NPU, fabbed on the 18A process. Last time I checked, the PCD tile is fabbed on TSMC's N6 process. They are connected through UCIe, not D2D; a first from Intel. Launch is expected in Q2/Computex 2026. In case people don't remember Alder Lake-N, I have created a table below to compare the detailed specs of ADL-N and WCL. Just for fun, I am throwing in LNL and the upcoming MediaTek D9500 SoC.

| | Intel Alder Lake-N | Intel Wildcat Lake | Intel Lunar Lake | MediaTek D9500 |
|---|---|---|---|---|
| Launch Date | Q1-2023 | Q2-2026 ? | Q3-2024 | Q3-2025 |
| Model | Intel N300 | ? | Core Ultra 7 268V | Dimensity 9500 5G |
| Dies | 2 | 2 | 2 | 1 |
| Node | Intel 7 + ? | Intel 18A + TSMC N6 | TSMC N3B + N6 | TSMC N3P |
| CPU | 8 E-cores | 2 P-cores + 4 LP E-cores | 4 P-cores + 4 LP E-cores | C1 1+3+4 |
| Threads | 8 | 6 | 8 | 8 |
| CPU Max Clock | 3.8 GHz | ? | 5 GHz | |
| L3 Cache | 6 MB | ? | 12 MB | |
| TDP | 7 W | Fanless ? | 17 W | Fanless |
| Memory | 64-bit LPDDR5-4800 | 64-bit LPDDR5-6800 ? | 128-bit LPDDR5X-8533 | 64-bit LPDDR5X-10667 |
| Memory Size | 16 GB | ? | 32 GB | 24 GB ? |
| Memory Bandwidth | | ~55 GB/s | 136 GB/s | 85.6 GB/s |
| GPU | UHD Graphics | | Arc 140V | G1 Ultra |
| EU / Xe | 32 EU | 2 Xe | 8 Xe | 12 |
| GPU Max Clock | 1.25 GHz | | 2 GHz | |
| NPU | NA | 18 TOPS | 48 TOPS | 100 TOPS ? |
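
The bandwidth figures can be sanity-checked from the memory entries: peak bandwidth is roughly the bus width in bytes times the transfer rate. A minimal Python sketch, assuming the LPDDR speed grades are in MT/s and ignoring real-world efficiency; the function name is made up for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits, transfer_rate_mts):
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: total memory bus width in bits (e.g. 64 or 128)
    transfer_rate_mts: transfers per second in MT/s (e.g. 8533 for LPDDR5X-8533)
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000


# Memory configurations as listed in the table (the WCL entry is marked "?" there).
configs = {
    "ADL-N (64-bit LPDDR5-4800)": (64, 4800),
    "WCL (64-bit LPDDR5-6800 ?)": (64, 6800),
    "LNL (128-bit LPDDR5X-8533)": (128, 8533),
    "D9500 (64-bit LPDDR5X-10667)": (64, 10667),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

This gives about 38.4, 54.4, 136.5 and 85.3 GB/s, which lines up with the ~55 GB/s and 136 GB/s entries; the 85.6 GB/s listed for the D9500 is slightly above what this simple formula yields.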






PPT1.jpg
PPT2.jpg
PPT3.jpg



As Hot Chips 34 is starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first from Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, the first from Intel to use GAA transistors, called RibbonFET.



LNL-MX.png
 

Attachments

  • PantherLake.png · 283.5 KB · Views: 24,031
  • LNL.png · 881.8 KB · Views: 25,525
  • INTEL-CORE-100-ULTRA-METEOR-LAKE-OFFCIAL-SLIDE-2.jpg · 181.4 KB · Views: 72,433
  • Clockspeed.png · 611.8 KB · Views: 72,319

DavidC1

Golden Member
Dec 29, 2023
1,884
3,031
96
Anyway, does anyone have an area value for 4C Skymont + L2?
Can't really figure out the 4C+L2 size from that figure. The compute tile has other stuff such as the interconnect necessary to connect to other tiles. It has EMIB too, so that takes a bit. If it needs a router, it'll take a little bit more too.

55 mm² for 24 cores on 18A is OK, considering.
 

mzocyteae

Member
Dec 29, 2020
26
19
81
It used to be that doubling the stages got you 80%+ gains. How much is it worth now? Unified Core should aim for low-5 GHz at most.

As for FP, it should stay 256-bit for client. 512-bit is a waste, and it was a mistake to do it with AVX-512. It should have been AVX3-256.
It is Intel's (political) stubbornness not to implement AVX-512 in client cores. AMD already proved that AVX-512 can be implemented with a 256-bit datapath and still get reasonable throughput gains.
If you had ever touched SIMD code, you wouldn't spam the "AVX3-256" bullshit. Most (all?) AVX-512 instructions have xmm/ymm counterparts.

IMO neither Intel nor Arm got it right.
SIMD should be designed with both a fixed-size vector width and predicate registers.
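
A minimal sketch of what "fixed width plus predicate registers" means in practice, modeled in plain Python rather than real intrinsics. The 8-lane width and all function names are assumptions chosen for illustration; the point is only that the loop tail is handled by the same vector body under a mask, with no scalar cleanup loop.

```python
LANES = 8  # hypothetical fixed vector width, in elements


def make_mask(remaining):
    """Predicate with the first min(remaining, LANES) lanes active."""
    active = min(remaining, LANES)
    return [i < active for i in range(LANES)]


def masked_load(src, base, mask):
    """Load LANES elements; inactive lanes read as 0 and never touch memory."""
    return [src[base + i] if mask[i] else 0 for i in range(LANES)]


def masked_store(dst, base, vec, mask):
    """Store only the lanes whose predicate bit is set."""
    for i in range(LANES):
        if mask[i]:
            dst[base + i] = vec[i]


def vector_add(a, b):
    """Element-wise add of two equal-length lists, LANES elements per step."""
    n = len(a)
    out = [0] * n
    for base in range(0, n, LANES):
        mask = make_mask(n - base)  # full mask except on the final partial step
        va = masked_load(a, base, mask)
        vb = masked_load(b, base, mask)
        vsum = [x + y for x, y in zip(va, vb)]
        masked_store(out, base, vsum, mask)
    return out


print(vector_add(list(range(11)), [10] * 11))
```

With 11 elements, the second iteration runs with only 3 active lanes, so the code never reads or writes past the end of the arrays.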
 
  • Like
Reactions: Gideon

511

Diamond Member
Jul 12, 2024
4,587
4,214
106
Can't really figure out the 4C+L2 size from that figure. The compute tile has other stuff such as the interconnect necessary to connect to other tiles. It has EMIB too, so that takes a bit. If it needs a router, it'll take a little bit more too.
I meant for Arrow Lake 4C+L2 on N3B
This confirms the die has empty area to facilitate EMIB+Mesh+TSV
 
  • Like
Reactions: Elfear

DavidC1

Golden Member
Dec 29, 2023
1,884
3,031
96
If cluster size is similar to Skymont, and doesn't have more L2 cache, then it means 18A is similar to N3B in size.
 

DavidC1

Golden Member
Dec 29, 2023
1,884
3,031
96
It is Intel's (political) stubbornness not to implement AVX-512 in client cores. AMD already proved that AVX-512 can be implemented with a 256-bit datapath and still get reasonable throughput gains.
If you had ever touched SIMD code, you wouldn't spam the "AVX3-256" bullshit. Most (all?) AVX-512 instructions have xmm/ymm counterparts.
And what's the BS? That 512-bit FPUs are overkill for CPUs in this day and age?

AVX3-256 means ALL AVX-512 instructions should be kept, without needing 512-bit registers and FPU.

AMD cores are also very far away from being the most efficient design, so it very much applies. Go look at many of Intel's older presentations. 512-bit was solely to stave off Nvidia's advance in HPC; note the first AVX-512 product was supposed to be Xeon Phi. And they paid for it with decreased clocks and fragmented ISAs. Something they could have avoided if they had focused on GPUs way earlier and stayed on a far saner 256-bit FPU.

Arm vendors and Skymont take a far better approach: adding more FPUs. It straight up benefits everything without putting the burden on users and programmers.
 

adroc_thurston

Diamond Member
Jul 2, 2023
7,192
9,969
106
Arm vendors and Skymont take a far better approach: adding more FPUs
Lmao no, 2 FMAs is the most any real SIMD code would saturate.
without putting the burden on users and programmers.
It literally puts the burden directly on the SIMD slave. You need to juggle more math to saturate moar FMA units.
Man, you really never ever heard SIMD people ranting?
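
To put a rough number on "juggle more math": a pipelined FMA unit only stays busy if there are at least as many independent operations in flight as its latency in cycles, so N units need about N × latency independent accumulator chains. A small Python sketch of that arithmetic; the 4-cycle latency is an assumed figure for illustration, not a measured value for any core discussed here, and the function names are made up.

```python
def independent_chains_needed(fma_units, fma_latency_cycles):
    """Independent FMA dependency chains needed to issue on every unit each cycle."""
    return fma_units * fma_latency_cycles


def peak_utilization(chains_in_code, fma_units, fma_latency_cycles):
    """Best-case fraction of FMA issue slots a kernel with this many chains can fill."""
    needed = independent_chains_needed(fma_units, fma_latency_cycles)
    return min(1.0, chains_in_code / needed)


ASSUMED_FMA_LATENCY = 4  # cycles; assumption for illustration only

for units in (2, 4):
    needed = independent_chains_needed(units, ASSUMED_FMA_LATENCY)
    util = peak_utilization(8, units, ASSUMED_FMA_LATENCY)
    print(f"{units} FMA units: need {needed} chains; a kernel with 8 chains fills {util:.0%}")
```

Under that assumption, a kernel written with 8 independent accumulators saturates 2 FMA units but only half-fills 4, so the extra units help only if the code is restructured to expose more independent work.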
 
  • Like
Reactions: lightmanek

mzocyteae

Member
Dec 29, 2020
26
19
81
AVX3-256 means ALL AVX-512 instructions should be kept, without needing 512-bit registers and FPU.
Lmao, do you understand what an ISA is?
And you can't even understand this: :p
avx512 instructions have xmm/ymm counterparts.
Arm vendors and Skymont take a far better approach: adding more FPUs. It straight up benefits everything without putting the burden on users and programmers.
AVX-512 takes much less programming effort than 256-bit AVX.
The only (albeit big) problem is the ISA fragmentation caused by Intel's silly decision not to implement AVX-512 in client cores.
 

OneEng2

Senior member
Sep 19, 2022
842
1,109
106
If cluster size is similar to Skymont, and doesn't have more L2 cache, then it means 18A is similar to N3B in size.
Which has been the general consensus of the speculation, given the information available to date.

It looks to me like AMD intends to compete with these chips using N3P in desktop and laptop, while in server N2 will be used.

... which again brings me back to the financial implications of Intel using a more expensive process and more expensive equipment. Of course, some of this cost is mitigated by Intel not having to pay for TSMC's profit.

I wonder if CWF has AVX512 and SMT? Hard to see how it can compete in DC and HPC without them.
 

511

Diamond Member
Jul 12, 2024
4,587
4,214
106
Which has been the general consensus of the speculation, given the information available to date.
Yes, we have all been saying N3-class density and slightly better performance than N3P but a bit less than N2, and Daniel Nenni confirms it.
It looks to me like AMD intends to compete with these chips using N3P in desktop and laptop, while in server N2 will be used.

... which again brings me back to the financial implications of Intel using a more expensive process and more expensive equipment. Of course, some of this cost is mitigated by Intel not having to pay for TSMC's profit.
Where do you get the expensive equipment from? Both use almost the same equipment; there is no dual sourcing for many of the critical things in semiconductor manufacturing. Intel even sells one such item, masks, which an Intel subsidiary produces, so they can charge TSMC more for it lol.
How do you rate one process as more expensive than the other without proof? And you simply said the most important point, the foundry, doesn't matter. You know that if foundry has a 30% margin on 18A and product another 30%, that is roughly 70% margin on a chip, vs. AMD's, which would not be that much.
Also, Zen 6 is not arriving before 2H26 at the earliest, so it has about a year of reign.
I wonder if CWF has AVX512 and SMT? Hard to see how it can compete in DC and HPC without them.
First, it is not an HPC chip. It would lose to 5C in AVX-512, but that's about it. Does it state anywhere that SMT is necessary? It depends entirely on the buyers and how much they value security; SMT is not necessary. AVX-512 might be, which might be mitigated somewhat by AVX10/256.
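
On the stacked-margin arithmetic above: the result depends on whether "30%" is read as gross margin (share of the selling price) or as markup (over cost). A small Python sketch of both readings; it assumes the product group's only cost is the price paid to the foundry, which is a simplification, and the function names are illustrative.

```python
def stacked_margin(foundry_margin, product_margin):
    """Combined gross margin when each stage keeps its margin as a share of its own price."""
    cost_share_of_final_price = (1.0 - foundry_margin) * (1.0 - product_margin)
    return 1.0 - cost_share_of_final_price


def stacked_markup(foundry_markup, product_markup):
    """Combined markup over the original cost when each stage adds its markup on top."""
    return (1.0 + foundry_markup) * (1.0 + product_markup) - 1.0


print(f"30% + 30% as gross margins -> combined margin {stacked_margin(0.30, 0.30):.0%}")
print(f"30% + 30% as markups       -> combined markup {stacked_markup(0.30, 0.30):.0%}")
```

Read as markups, two 30% steps compound to 69% over cost, which is close to the "roughly 70%" quoted; read as gross margins, they compound to a 51% combined margin.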
 
  • Like
Reactions: ajsdkflsdjfio

OneEng2

Senior member
Sep 19, 2022
842
1,109
106
Where do you get the expensive equipment from? Both use almost the same equipment; there is no dual sourcing for many of the critical things in semiconductor manufacturing. Intel even sells one such item, masks, which an Intel subsidiary produces, so they can charge TSMC more for it lol.
How do you rate one process as more expensive than the other without proof? And you simply said the most important point, the foundry, doesn't matter. You know that if foundry has a 30% margin on 18A and product another 30%, that is roughly 70% margin on a chip, vs. AMD's, which would not be that much.
Also, Zen 6 is not arriving before 2H26 at the earliest, so it has about a year of reign.

First thing it is not a HPC Chip it would loose to 5C in AVX-512 but that's about it does it states anywhere SMT is necessary it is entirely dependent on the people who buy it how much they feature Security SMT is not necessary make or break AVX-512 might be
Your last sentence is a bit confusing, can you clarify?

Intel originally purchased ASML's 5000-series (High-NA) machines for 18A. Now Intel intends to wring out the High-NA machines in 2025 but not use them for production.

GAA and BSPD both require more process passes than FinFET and FSPD do. This makes the process more expensive. Intel also produces fewer chips on its high-end equipment than TSMC does, which makes the amortization costs higher for Intel. Furthermore, without High-NA, Intel will have to rely on double patterning, which will also raise costs. Throughout 2025, Intel will be producing only CWF chips on 18A as I understand it. This is a pretty low-volume chip compared to the desktop and laptop markets.

AMD using N3P for the high-volume desktop and laptop segments significantly reduces its cost compared with Intel's use of N3B today (N3B also has more passes, as I understand it, than N3E, N3P and N3X) and (my guess) Intel's own 18A process.

It seems like Intel is willing to throw money at their chips to keep them competitive while AMD manages to maintain competitive products at much lower production costs on less expensive process nodes.
 

cannedlake240

Senior member
Jul 4, 2024
247
138
76
Throughout 2025, Intel will be producing only CWF chips on 18A as I understand it
The 18A production fab won't be online until 2H 25, when it'll make PTL, CWF and external foundry chips. And CWF isn't very low volume; each one has 660 mm² of 18A silicon.
 

511

Diamond Member
Jul 12, 2024
4,587
4,214
106
Your last sentence is a bit confusing, can you clarify?
I meant that SMT is not a make-or-break feature: some people will turn it off, some may not. As for AVX-512, CWF supports AVX10/256, so it gets everything AVX-512 has, just with a 256-bit data path.
Intel originally purchased ASML's 5000-series (High-NA) machines for 18A. Now Intel intends to wring out the High-NA machines in 2025 but not use them for production.
That is a gamble; whether it pays off or not, only time will tell.
GAA and BSPD both require more process passes than FinFET and FSPD do. This makes the process more expensive. Intel also produces fewer chips on its high-end equipment than TSMC does, which makes the amortization costs higher for Intel. Furthermore, without High-NA, Intel will have to rely on double patterning, which will also raise costs. Throughout 2025, Intel will be producing only CWF chips on 18A as I understand it. This is a pretty low-volume chip compared to the desktop and laptop markets.
Both TSMC and Intel will use multi-patterning at N2/18A, but BSPDN allows Intel to relax pitches a bit, though it increases complexity. As for how expensive it is, only Intel and TSMC know the cost.
AMD using N3P for the high-volume desktop and laptop segments significantly reduces its cost compared with Intel's use of N3B today (N3B also has more passes, as I understand it, than N3E, N3P and N3X) and (my guess) Intel's own 18A process.
N3P products (Zen 6) won't be launching before 2H26. As for the cost of 18A vs. N3B/N3P, you are assuming this on the basis of Intel 7's cost structure. Also, the N3B and N3P price difference won't be significant; both are in the N3 family.

We will know more details by IEDM 24 in December, so hold your horses before drawing conclusions 🙂
It seems like Intel is willing to throw money at their chips to keep them competitive while AMD manages to maintain competitive products at much lower production costs on less expensive process nodes.
This is true rn, but how true will it be next year once the process ramps?