Discussion Intel current and future Lakes & Rapids thread


Hougy

Thanks. Was on mobile at the time. Didn't want to scrub through a video. Or give it a view, tbh. And quality is plenty fine.

So I'll just give my take item by item.

MTL:
  • Socket - No clue. I thought MTL/ARL would use LGA 1800 1700 (Edit: Typo?), but I also knew they would not be platform compatible with ADL/RPL. Could see this being true, but little difference either way. Though a new socket typically hurts motherboard prices.

  • IPC - Bull. Maybe bigger than Raptor Lake, but 12-21%? Nah, he's just making stuff up.

  • Clock speed regressions - Probably bull. Maybe they'll lose 100-200MHz or so, but large enough to compare to ICL? Nah. I fully expect the node shrink to make up any regression from arch/design, and personally guess that clocks will ultimately be higher between comparable SKUs.

  • VPU - sure

  • 2+8, 6+8 - sure, 8+16 - no

  • RPL/MTL volume split - Not sure, but certainly suspicious.

  • Timing - Sounds reasonable enough.
ARL:
  • 8+32 on 20A - Think so?

  • LNC IPC - I'm thinking comparable-ish to Golden Cove's gains. Expecting >>GLC gains (like his previous "at least 30%" claim) is just nonsense. But make no mistake, LNC is probably the most important evolution of Core since its inception. Much better in a whole host of ways.

  • Lion Cove is not Royal. Royal is Royal. How hard is this to understand? Clearly no clue what he's talking about.

  • Skymont - We're in for a treat with this one.

  • Timing - Sounds reasonable enough.

In short, I think all the "new", important details range from suspect to nonsensical, and the rest just reiterating well established rumors.
12-21% IPC increase for MTL over RPL? Locuza and Semianalysis already did a great die shot analysis of MTL. It seems there were few changes to the core, so doesn't that make this IPC prediction completely impossible?

I've seen MLID be proved wrong many times before, but never as soon as he made the prediction 😆
 

IntelUser2000

12-21% IPC increase for MTL over RPL? Locuza and Semianalysis already did a great die shot analysis of MTL. It seems there were few changes to the core, so doesn't that make this IPC prediction completely impossible?

I've seen MLID be proved wrong many times before, but never as soon as he made the prediction 😆

It's not small. Here's what Raichu said:
About Meteor lake. MTL focus on how to improve the efficiency of the instruction execution, it will not widen the microarchitecture crazy like Alder lake.
More improvements maybe will focus on branch prediction, Micro-operation fusion, instruction dispatch, register remake, and EU execution efficiency.

Some are expecting Haswell-like gains, which is low 10%. 12% seems to be a good average. Haswell was also efficiency focused, so some have said they reduced potential performance to lower power.

Maybe they are doing small/Medium gains rather than zero/Large gains they are doing now. Sunny = Large, Willow = tiny/zero, Golden = Large

to,

Raptor = small, Redwood = Medium, etc. Which would be better for planning I guess?
 

biostud

Obviously we don't know if Zen 6 is going to be on AM5, but the continued socket changing definitely makes me want to stay with AMD.
 

Exist50

Yes desktop haha.
So, what I've heard of MTL desktop is...weird, and I'm not sure I've seen any leak quite in alignment. That said, my info might very well be out of date, but I'm ultimately quite curious what Intel decides to do. Timing of Arrow Lake is probably critical.

I don't think the CPU comparisons will be bad, at least performance. The GPU is a problem.
I'm not nearly so optimistic for Intel. Obviously graphics will be a huge win for AMD, but they also have a full node jump + new architectures going for them on the CPU side. The efficiency gap is going to be pretty darn stark, and Intel's battery life numbers are bad enough as it is. I can't believe they actually regressed from Tiger Lake.

But it doesn't really matter if Raptorlake mobile is having a comprehensive lineup. Then Meteorlake mobile is going to come 12 month later, period. [snip]
So, I get what you're saying here, but it would definitely be better for Intel to have Meteor Lake come out ASAP, and deal with any consequences to Raptor Lake or whatever. I consider myself a bit of an MTL pessimist, but it's still going to be a significant improvement over RPL, and I imagine the fabs are begging for something on Intel 4/3 to ramp with.

It's always a tradeoff. So apples-to-apples the square root law says 30% performance needs a core that's 70% larger. Also the single core focused design will have higher frequencies, so that'll result in the core larger as well.
I dislike the naive application of the square root law here, as clearly there are many different designs in the market, some of which are strictly superior to others, and I think that's the root of my argument - the differences in the underlying engineering investment.

Like, if the single thread difference between the two is small enough, then it seems to me like a Zen 4/4c arrangement is better. Basically have a base design and a derivative for a different market, instead of "reinventing the wheel" for two separate architectures. Granted, massive political implications of that at a company like Intel, but still.
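
For reference, the square root law quoted above (often called Pollack's rule) is easy to sanity-check. This is only a sketch, assuming performance scales with the square root of core area; the function name is made up for illustration and nothing here is an Intel figure.

```python
def area_ratio_for_speedup(speedup: float) -> float:
    """Relative core area needed for a given per-core speedup,
    assuming performance ~ sqrt(area) (the 'square root law')."""
    # If perf ~ sqrt(area), then area ~ perf ** 2.
    return speedup ** 2


ratio = area_ratio_for_speedup(1.30)  # 30% more performance
print(f"+30% performance -> {ratio:.2f}x area (+{(ratio - 1) * 100:.0f}%)")
```

Under that assumption a 30% faster core needs about 1.69x the area, which is where the "70% larger" figure comes from. It says nothing about how well any particular design uses its area, which is the objection being raised.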
 

Exist50

It's not small. Here's what Raichu said:



Some are expecting Haswell-like gains, which is low 10%. 12% seems to be a good average. Haswell was also efficiency focused, so some have said they reduced potential performance to lower power.

Maybe they are doing small/Medium gains rather than zero/Large gains they are doing now. Sunny = Large, Willow = tiny/zero, Golden = Large

to,

Raptor = small, Redwood = Medium, etc. Which would be better for planning I guess?
So, I admittedly took something of a leap here. The actual info I'm working with is the claim that IDC's best engineers (and the majority in general) rolled off from Golden Cove to Lion Cove, meaning that Redwood Cove and especially Raptor Cove get the scraps. Therefore, my logic goes, surely the IPC gains from either won't be competitive with Golden Cove. Plus, I've heard that a lot of effort on Lion Cove is focused on "modernizing" it. I'm still uncertain what numbers that will translate to for IPC, freq, etc., but the implication was that it's a painful, if necessary adjustment, so I interpreted it as a negative modifier. In any case, I do think that MLID's >30% number is too high. When's the last time any company achieved that gen/gen? Hah, then again, I guess I have hyped up LNC a bit myself.

Maybe I'll regret making such strong claims, but I'm sticking to my guns for now. I think they're justified based on the available information.

Plus, MLID has a pretty terrible track record with IPC claims, and usually overshoots. We're seeing that play out right now in the Zen 4 thread.
 

DrMrLordX

So aside of a 10 cores glitch with Comet Lake Intel is stagnating with 8 P-cores for years to come and hoping people won't notice because eventually 32 E-cores?

Maybe Intel will petition Congress to rescind Amdahl's Law.

Have we legitimately seen a DESKTOP consumer grade application that would really benefit from having more than 8 P cores as opposed to having 4 times as many additional E cores?

Handbrake comes to mind. Its scaling diminishes beyond a certain number of cores. I would certainly rather have 12-16P cores to commit to it than a bunch of e-cores. Also I would much rather have 16P cores for a unified gaming + streaming box.
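
To put a rough number on the Amdahl's Law point, here is a minimal sketch. The 90% parallel fraction is purely an assumed value for illustration, not a measured Handbrake figure, and the model treats all cores as identical, which a P+E mix is not.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's Law: speedup over one core when only part of the
    work can be spread across `workers` identical cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed workload: 90% parallel, 10% serial (illustrative only).
for cores in (8, 16, 32, 64):
    print(f"{cores:>2} cores -> {amdahl_speedup(0.90, cores):.2f}x")
```

With a 10% serial portion the speedup can never exceed 10x no matter how many cores are added, which is why the serial part ends up resting on how fast the individual cores are.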
 

Timmah!

So aside of a 10 cores glitch with Comet Lake Intel is stagnating with 8 P-cores for years to come and hoping people won't notice because eventually 32 E-cores?

Yeah. Hard pass as far as I am concerned. Gimme 16 + 16 instead of 8 + 32 and then I might be interested.

It's interesting we are getting tidbits of info about 2024 products like Arrow Lake, meanwhile all there is to say about Fishhawk Falls, supposedly to be released this year, is that the sources are starting to “get briefed”. Tells you everything you need to know about it, I guess.
 

IntelUser2000

Plus, MLID has a pretty terrible track record with IPC claims, and usually overshoots. We're seeing that play out right now in the Zen 4 thread.

Don't think it was just MLID that expected that gain. Pretty much everyone did. Maybe there was a confusion between total performance versus architectural.

Yea I don't know about Lion Cove. Maybe the efficiency gain will be great, but the absolute performance gain for the uarch is Golden Cove level. Like when we look at Haswell, it was great for laptops, but performance-wise? Not much.

20% gain takes amazing amount of enhancement and additions to get there. 30% takes a serious, almost forced flaw in the predecessor to happen.
 

mikk

12-21% IPC increase for MTL over RPL? Locuza and Semianalysis already did a great die shot analysis of MTL. It seems there were few changes to the core, so doesn't that make this IPC prediction completely impossible?

I've seen MLID be proved wrong many times before, but never as soon as he made the prediction 😆


There is a good analysis from Cardyak on this topic. Redwood is more focused on integer upgrades.




Some people say the LGA2551 picture from MLID belongs to BGA2551 and is likely the successor of BGA1964 (ADL-HX, basically ADL-S on BGA). If you say 8+16 won't exist, and assuming that's true, it doesn't seem like MTL-S is able to replace the upcoming 13900K.
 

jpiniero

Going back to this...

SPR on the other hand, since it only has a 1/4th subset of memory controllers, PCIe, CXL, UPI on each of the tiles, they cannot make a chip with full IO unless all 4 chiplets are used. That makes sense for large core count parts. For smaller core counts it’s much cheaper to make monolithic. It would be a disaster if they had to use 1600 mm2 silicon (not counting the EMIB tiles) for every SKU, some of them may even go as low as 8/12/16 cores. Keeping the 4 chiplet design for lower core count also doesn’t make sense, because if you decide to make smaller “1/4th split chiplets”, you might as well make a monolithic which will be far cheaper to make. Intel in the past has made several different size Xeon chips to address the core count range: XCC, MCC, LCC. So they are likely to make several size chips; they do not have the financial constraints to only make one tile (and its mirror) for SPR.

How big do you think a 24c/8ch/4xIO monolithic die would be? The problem is that 10 nm yield is still very mediocre to the point where you'd have to slash the core count way down to the point where it wouldn't really be viable as a Metal Xeon.

Now MAYBE you could make an HEDT monolithic die that could work if you are frugal enough. The problem with that is I don't think Xeon W demand is all that much. Kind of wondering if you would be able to satisfy the Xeon W demand with just partially busted EMIB chips.

Xeon E does have some demand. But you'd have to come up with something that would make sense. I'm sure there would be some fans of an 24 15 GC core on LGA 1700.
 

nicalandia

So aside of a 10 cores glitch with Comet Lake Intel is stagnating with 8 P-cores for years to come and hoping people won't notice because eventually 32 E-cores?

Not sure about you...

But 8 P-cores with High IPC for gaming and the equivalent of a Xeon W-3175X for highly threaded apps on a Single CPU sounds like a Fine CPU to me.

That is a very powerful CPU, we are talking about a CPU that will likely match or beat a Zen3 ThreadRipper 5975WX in MT workloads.

1654356149415.png

1654356215442.png
 

Timmah!

Not sure about you...

But 8 P-cores with High IPC for gaming and the equivalent of a Xeon W-3175X for highly threaded apps on a Single CPU sounds like a Fine CPU to me.

That is a very powerful CPU, we are talking about a CPU that will likely match or beat a Zen3 ThreadRipper 5975WX in MT workloads.


Then again, you could have say 24 of those high IPC cores, with only say 8 of them boosting to 5+ GHz while gaming, and then most/all of them running on lower clocks for highly-threaded apps and very likely be significantly faster than Xeon 3175x, and not just its equivalent. As a bonus, you could do away with all the big.LITTLE core scheduling shenanigans and the lack of AVX-512 for the same reason.

Granted, that would require Intel to finally get their process to the TSMC levels.
 

Glo.

Thanks. Was on mobile at the time. Didn't want to scrub through a video. Or give it a view, tbh. And quality is plenty fine.

So I'll just give my take item by item.

MTL:
  • Socket - No clue. I thought MTL/ARL would use LGA 1800 1700 (Edit: Typo?), but I also knew they would not be platform compatible with ADL/RPL. Could see this being true, but little difference either way. Though a new socket typically hurts motherboard prices.

  • IPC - Bull. Maybe bigger than Raptor Lake, but 12-21%? Nah, he's just making stuff up.

  • Clock speed regressions - Probably bull. Maybe they'll lose 100-200MHz or so, but large enough to compare to ICL? Nah. I fully expect the node shrink to make up any regression from arch/design, and personally guess that clocks will ultimately be higher between comparable SKUs.

  • VPU - sure

  • 2+8, 6+8 - sure, 8+16 - no

  • RPL/MTL volume split - Not sure, but certainly suspicious.

  • Timing - Sounds reasonable enough.
ARL:
  • 8+32 on 20A - Think so?

  • LNC IPC - I'm thinking comparable-ish to Golden Cove's gains. Expecting >>GLC gains (like his previous "at least 30%" claim) is just nonsense. But make no mistake, LNC is probably the most important evolution of Core since its inception. Much better in a whole host of ways.

  • Lion Cove is not Royal. Royal is Royal. How hard is this to understand? Clearly no clue what he's talking about.

  • Skymont - We're in for a treat with this one.

  • Timing - Sounds reasonable enough.

In short, I think all the "new", important details range from suspect to nonsensical, and the rest just reiterating well established rumors.
And he also "leaked" a photo of socket LGA2551 which, judging by the photo itself, is clearly a BGA type of socket.
 

ashFTW

How big do you think a 24c/8ch/4xIO monolithic die would be? The problem is that 10 nm yield is still very mediocre to the point where you'd have to slash the core count way down to the point where it wouldn't really be viable as a Metal Xeon.
28 core Icelake Xeon is around 470 mm2 **, and is shipping in high volume along with its bigger sibling, as has been said by Intel management on their earnings calls. I expect SPR 24 core chip to be around the same size, maybe a tad smaller like 450 mm2. SPR is on Intel 7, an improved 10nm compared to ICL. Alder and Raptor Lake are also on Intel 7. Raptor will probably be half the size of the above monolithic SPR, which is again going to be produced in large volume.

So my conviction stands as before -- 1 (maybe even 2) SPR monolithic die at the lower core count end, and these being repurposed for Xeon-W3400 series. Lower core count servers are probably much higher volume, so addressing them separately (and not throwing 1600 mm2 plus at them, plus the complexity and cost of advanced packaging) makes total sense!

** The 40 core die is 628 mm2, and the 28 core die visually looks to be 3/4th the size (Source wikichip).

skx-icx-org-comp.png
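
The footnote's arithmetic, spelled out as a quick sketch. The 3/4 factor is the visual estimate from the post, not a measured value, and the per-core line is only a naive lower bound added for comparison, since uncore and IO do not shrink with core count.

```python
XCC_40C_AREA_MM2 = 628.0   # 40-core Ice Lake die size cited in the post
VISUAL_FRACTION = 3 / 4    # "visually looks to be 3/4th the size" (eyeballed)

# Estimate used in the post: scale the big die by the visual fraction.
visual_estimate = XCC_40C_AREA_MM2 * VISUAL_FRACTION

# Naive comparison (an assumption, not from the post): scale purely by core count.
per_core_estimate = XCC_40C_AREA_MM2 * 28 / 40

print(f"28-core die, visual estimate:   {visual_estimate:.0f} mm2")
print(f"28-core die, per-core scaling:  {per_core_estimate:.0f} mm2")
```

The visual estimate works out to 471 mm2, matching the "around 470 mm2" figure above.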
 

nicalandia

Can I just say that it is kind of amazing that someone was able to take pictures of wafers on a show floor of a high enough quality that we can not only determine die sizes, but do so even for the various functions on the die!


And thanks to that we are able to further guesstimate future Products based on this info.


For example here. 8 + 16 Raptor Lake CPU 13900K Mock Up compared to a 8 + 16 Meteor Lake Compute Cluster with size difference and die size estimates based on info from Locuza and semianalysis

1654371479565.png
 

ashFTW

Can I just say that it is kind of amazing that someone was able to take pictures of wafers on a show floor of a high enough quality that we can not only determine die sizes, but do so even for the various functions on the die!
Yes, indeed!
 

jpiniero

I expect SPR 24 core chip to be around the same size, maybe a tad smaller like 450 mm2. SPR is on Intel 7, an improved 10nm compared to ICL.

I crudely did it out, based upon the available SPR tile shot, and got 719 mm2, lol. 428 mm2 for 24 cores and the memory tiles, 228 mm2 for 4xIO and 63 for 4xmemory controllers. That can't be right but surely that's closer than your estimate. Golden Cove Server is a lot bigger than Client because of AMX and the extra AVX-512 unit.
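
Tallying the pieces of that crude estimate, as a small sketch. The component figures are the ones given in this post, and the 450 mm2 comparison value is the estimate being replied to; neither is a confirmed die size.

```python
# Component areas from the crude tile-shot estimate above (mm2).
components_mm2 = {
    "24 cores + memory tiles": 428,
    "4x IO": 228,
    "4x memory controllers": 63,
}
OTHER_ESTIMATE_MM2 = 450  # the "maybe a tad smaller like 450 mm2" guess

total = sum(components_mm2.values())
print(f"Crude monolithic total: {total} mm2")
print(f"Gap vs. the 450 mm2 guess: {total - OTHER_ESTIMATE_MM2} mm2 "
      f"({total / OTHER_ESTIMATE_MM2:.2f}x)")
```

The parts do add up to 719 mm2, so the two estimates differ by roughly a factor of 1.6.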
 

IntelUser2000

Then again, you could have say 24 of those high IPC cores, with only say 8 of them boosting to 5+ GHz while gaming, and then most/all of them running on lower clocks for highly-threaded apps and very likely be significantly faster than Xeon 3175x, and not just its equivalent.

24 cores would take an insane amount of space, even on the latest process from TSMC.

24P cores is actually roughly equal to 8 P and almost 64 E cores not 32.

There is a good analysis from Cardyak on this topic. Redwood is more focused on integer upgrades.

So he's saying it can get 20% gain for Integer but much less for FP? When they say "Integer" it means basically overall perf/clock improvement. It's like increasing the speed limit of the highway and widening it. It benefits every block that uses it, not just "integer". Branch prediction, ROBs, OoOE blocks, L/S units, micro op cache, all benefit all code. It's not like FPU is on a separate block connected by a ring bus.

It's way easier to get FP gains than Integer. The latter you cannot just double blocks and double performance that way.

Based on what @Exist50 is saying, I'd put that as wild ass speculation.
 

nicalandia

So my conviction stands as before -- 1 (maybe even 2) SPR monolithic die at the lower core count end
Just be prepared to be disappointed when Intel officially announces Sapphire Rapids for Workstations.

24 cores would take an insane amount of space, even on the latest process from TSMC.

24P cores is actually roughly equal to 8 P and almost 64 E cores not 32.

24 P-Cores in a world where thermals don't matter would get 56,000 points in CB R23; a realistic 8 P + 32 E configuration gets you 50,000+ points in CB R23 (as estimated by Puget Systems)
 

Exist50

Yea I don't know about Lion Cove. Maybe the efficiency gain will be great, but the absolute performance gain for the uarch is Golden Cove level. Like when we look at Haswell, it was great for laptops, but performance-wise? Not much.
Just to be clear, do you mean Lion Cove or Redwood Cove here? I'm inclined to believe that Raichu is correct, and that Redwood Cove is more efficiency focused. I think Lion Cove is a much better opportunity from a performance perspective, but I think they're going to be careful to balance power and area as well.

20% gain takes amazing amount of enhancement and additions to get there. 30% takes a serious, almost forced flaw in the predecessor to happen.
Well I expect Royal to be far, far beyond a mere +30%, but that's worth a thread in its own right. Maybe will make one if anything of substance actually leaks.
 

Exist50

There is a good analysis from Cardyak on this topic. Redwood is more focused on integer upgrades.




Some people say the LGA2551 picture from MLID belongs to BGA2551 and is likely the successor of BGA1964 (ADL-HX, basically ADL-S on BGA). If you say 8+16 won't exist, and assuming that's true, it doesn't seem like MTL-S is able to replace the upcoming 13900K.
I am very unimpressed with this take, and think they're basically just reading the tea leaves and pulling actual numbers out of thin air. They have no idea what individual changes actually consist of, and while I can admire effort being spent into analyzing the information we have, I'm considerably colder towards any attempts to make confident assertions about the results.

Also, those two in the OP have been rather confidently incorrect on some past assertions. Remember Locuza's original floorplan for MTL? Yeah, that was way off...
 

IntelUser2000

Just to be clear, do you mean Lion Cove or Redwood Cove here? I'm inclined to believe that Raichu is correct, and that Redwood Cove is more efficiency focused. I think Lion Cove is a much better opportunity from a performance perspective, but I think they're going to be careful to balance power and area as well.

I am addressing your whole post, so yes I'm talking about Lion Cove.

Yea I'm not sure if I buy the fantastic, 30%+ gains even for the Royal Cove project. We'll see when it happens. Yes I can believe an amazing amount of effort and reorganization of teams would happen, but in terms of absolute numbers I am skeptical.

Like when Nvidia was claiming some epic level adjustments with Pascal but it was nowhere near the hype in terms of numbers. Or how with FinFET they claimed some revolution but performance/watt gains were just in line with normal trends. Things like FinFET are enablers to continue, that's all. RibbonFET and PowerVia seem awesome, but the performance gains are 15% and less than previous ones which were plain old boring FinFETs.

Of course, if you are already at the top level, you want to toot your own horn about any progress you make, but I am not going to believe that you are raising the trendline. It's like sprinters making 1ms progress each year. Eventually it's all marketing.
 

IntelUser2000

24 P-Cores in a world where thermals don't matter would get 56,000 points in CB R23; a realistic 8 P + 32 E configuration gets you 50,000+ points in CB R23 (as estimated by Puget Systems)

Yea, and 8+32 would take way less space and power. In area, 8+32 is roughly equal to 16P.

8+64 equals 24P. Now tell me how that performs!
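
The area bookkeeping behind those equivalences, as a rough sketch. The ratio of 4 E-cores per P-core of area is an assumption backed out of the "8+64 equals 24P" claim (16 P-cores traded for 64 E-cores), not an official figure, and the helper name is made up.

```python
E_CORES_PER_P_CORE_AREA = 4  # assumed: 64 E-cores ~ 16 P-cores of area


def p_core_area_equivalent(p_cores: int, e_cores: int,
                           e_per_p: float = E_CORES_PER_P_AREA
                           if (E_CORES_PER_P_AREA := E_CORES_PER_P_CORE_AREA)
                           else 4) -> float:
    """Express a P+E configuration as an equivalent number of P-cores by area."""
    return p_cores + e_cores / e_per_p


for p, e in ((8, 32), (8, 64)):
    print(f"{p}+{e} ~ {p_core_area_equivalent(p, e):.0f} P-cores of area")
```

With that ratio, 8+32 comes out to 16 P-cores of area and 8+64 to 24, which is the comparison being made against the 24P idea.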

And there's an additional benefit where the P core can be P+, and be bigger and more performant than otherwise for even better ST performance and responsiveness. That's the whole point of hybrid. You get to specialize the cores way more than otherwise.

The real promise is this: Rather than doing 8+64 in place of 24P, you do 8 supercharged P cores + 32 E cores. Of course the P cores would be a lot larger. Let's say 30% faster per clock and twice the size.

Remember, this is in addition to whatever they would do normally. So I believe for risk mitigation it'll be spread out over a few generations. So rather than a new gen P being 18% faster, you have it being 24% faster for the next 4-5 generations. And at the end, you have a very large P core and a sea of E cores. Supercharging it for low and high thread counts.
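
How the 18% versus 24% per-generation figures compound, as a quick sketch. It assumes both rates hold for every generation and are purely multiplicative; the numbers are the hypothetical ones from this post, not a roadmap.

```python
BASELINE_GAIN = 0.18   # hypothetical "normal" per-generation gain from the post
BOOSTED_GAIN = 0.24    # hypothetical gain with the extra P-core investment


def extra_gain_after(generations: int) -> float:
    """Cumulative advantage of the boosted cadence over the baseline one."""
    return ((1 + BOOSTED_GAIN) / (1 + BASELINE_GAIN)) ** generations


for gens in (4, 5):
    extra = (extra_gain_after(gens) - 1) * 100
    print(f"After {gens} generations: about {extra:.0f}% faster than baseline")
```

Compounded over four to five generations that works out to somewhere between roughly 22% and 28% on top of the normal trend, so in the same ballpark as the "30% faster per clock" P core described above.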
 