Discussion Intel current and future Lakes & Rapids thread


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
So let me get this straight: a 112C/224T Sapphire Rapids is 2.1X faster than a gimped 120C/120T Milan-X? On the absolute best possible benchmark (Intel compiler and all)? Tough times ahead for Sapphire Rapids.

Actually, on some very demanding HPC applications Hyperthreading is turned off because the application performs better that way. Linpack, for example, performs better with HT off, since it saturates the FPU and the second thread has absolutely no room to improve performance.

SMT benefits performance by filling pipeline bubbles AND covering LLC misses. HPC applications LOVE memory bandwidth, the datasets are large, and the code fully saturates the execution units. Plus, even if HT were faster, the chip would have to clock lower because full AVX execution is very demanding anyway.
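
To put rough numbers on the "no room for the second thread" argument, here is a deliberately idealized sketch. It assumes a single thread keeps a fixed fraction of the core's issue slots busy and that a second thread can only use whatever is left over. The function name and the utilization figures are made up for illustration, not measurements of any real CPU.

```python
def smt_speedup(single_thread_utilization: float, threads: int = 2) -> float:
    """Idealized throughput gain from SMT on one core.

    Assumes each thread alone would use the given fraction of the core's
    issue slots, and that the core can never exceed 100% utilization.
    Cache contention and AVX clock drops are NOT modeled, so this toy
    can never show SMT making things slower.
    """
    if not 0.0 < single_thread_utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    combined = min(1.0, threads * single_thread_utilization)
    return combined / single_thread_utilization


# Hypothetical utilization figures, chosen only to show the trend.
for label, u in [("stall-heavy code", 0.5), ("mixed code", 0.8), ("Linpack-like", 1.0)]:
    print(f"{label:>16}: {smt_speedup(u):.2f}x")
```

In this model the gain shrinks to exactly 1.00x once a single thread already saturates the core. Real hardware can do worse than that, because the two threads also share caches and the power budget.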
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,541
14,495
136
Actually, on some very demanding HPC applications Hyperthreading is turned off because the application performs better that way. Linpack, for example, performs better with HT off, since it saturates the FPU and the second thread has absolutely no room to improve performance.

SMT benefits performance by filling pipeline bubbles AND covering LLC misses. HPC applications LOVE memory bandwidth, the datasets are large, and the code fully saturates the execution units. Plus, even if HT were faster, the chip would have to clock lower because full AVX execution is very demanding anyway.
All those benchmarks were made by Intel. They're worthless to me; Intel is just trying to justify its server chips any way it can, at least for the investors if nothing else.

Pretty pathetic though.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
In the reddit leak "the biggest architectural change in CPU architecture since the Core architecture" was claimed for Nova Lake. There are rumors about Panther Lake after Lunar, I hope it doesn't mean it has been delayed one generation. If there is Panther Lake the Panther Cove core naming would make sense though.
So, I've put some thought into this. I do think it's reasonably likely that Royal first shows up in Nova Lake around 2026. If so, I see three possibilities.
  1. There is a significant issue with Royal execution, which pushes development out that far. I'm optimistic on this not being the case, but admittedly that's just blind faith in the team.

  2. Integrating Royal would require significant rework of other parts of the SoC (fabric, etc) that would not fit in the schedule for an incremental redesign.

  3. IDC (who're developing Lunar Lake and [last I heard] Panther Lake) are not politically willing to take the core due to the influence of the big core team.
In the last two cases, I would hope that Royal's first appearance would be more like a "Royal 2", in which case I'd hope for IPC more like 2.5-3.0x Golden Cove, but maybe I'm getting ahead of myself.
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
Definitely not. Cost and board space are far more dominant concerns there than cooling.
Oh my dear God..... obviously not as a conscious design goal... but sure, from now on I'll try to communicate with you either through citing officially filed patents or court rulings to be as literate as possible.
 

Cardyak

Member
Sep 12, 2018
72
159
106
So, I've put some thought into this. I do think it's reasonably likely that Royal first shows up in Nova Lake around 2026. If so, I see three possibilities.
  1. There is a significant issue with Royal execution, which pushes development out that far. I'm optimistic on this not being the case, but admittedly that's just blind faith in the team.

  2. Integrating Royal would require significant rework of other parts of the SoC (fabric, etc) that would not fit in the schedule for an incremental redesign.

  3. IDC (who're developing Lunar Lake and [last I heard] Panther Lake) are not politically willing to take the core due to the influence of the big core team.
In the last two cases, I would hope that Royal's first appearance would be more like a "Royal 2", in which case I'd hope for IPC more like 2.5-3.0x Golden Cove, but maybe I'm getting ahead of myself.

Do we actually know any of the details for Royal Core yet?

I've heard lots of rumours/speculation, and I've been digging through a few patents, but I'm unable to source any information that links Intel's R&D directly to any ongoing projects. As an example, here are some of the ideas that have been floated over the past few years to massively push CPU performance forward:
  • Fusion Reconfigurable Multicore Architecture - Basically stitching together small cores dynamically to create big cores as and when needed, as opposed to "statically" partitioned Big/Little cores.
  • Post-silicon CPU adaptation made practical using machine learning - Scaling the size of components in a CPU core up & down using ML in order to improve efficiency. Golden Cove has started to do this with the BTB, but I expect to see a lot more of this going forward. Note also that this paper was co-authored by Ronak Singhal, a Senior Fellow at Intel.
  • Auto-Predication of Critical Branches - Seems to be an idea where you analyse branches, determine the "trouble makers" that are nearly impossible to predict, and basically give up and decode both sides of the branch as insurance. Seems like a costly insurance policy but if used sparingly on only Hard-to-Predict branches it could cause a massive performance increase, as in most programs a small number of branches cause the majority of mispredictions.
  • Selective Pipeline Flush - Implementing some sort of process so that when a branch mis-predict occurs, you only discard the illegitimate instructions, and retain the correct code, as opposed to flushing everything and throwing the baby out with the bath water. Incredibly convoluted, but the performance gains from doing this are obvious. Worth noting that this research wasn't conducted by Intel, but I'll be damned if they haven't looked into something like this.
  • Value Prediction - One of the key academics who has pioneered research in this area is André Seznec. He recently accepted a position at Intel. Value Prediction is arguably the "Holy Grail" of ST Perf increases and is pretty much required at some point to increase performance. This is a question of not if, but when. There's an absolute ton of research around this and the opportunity is enormous.
  • Shared front end components - No source for this one, but I stumbled across this when glancing at a thread on RealWorldTech - Due to the decode units being idle a large percentage of the time because the Micro-Op cache primarily feeds the back end, not every core needs a whole decode unit to itself. In the interest of saving die space it may be possible to have 1 large decode block that is shared among a small number of cores and behaves in a "round-robin" time slice sort of process.
  • Dropping x86 baggage such as old redundant instructions and emulating them in software to reduce the burden on the CPU. No idea how or even if this would work, but this is something people have spoken about before on numerous occasions, and it stands to reason we will have to have a clean slate at some point or another. (I highly doubt Intel will ever move away from x86 altogether)
I have no idea which of these, if any, will feature in Royal Core; if anyone does, I would be glad if they shared information. @Exist50 seems to have reputable sources inside Intel, so it would be great to have his insight into this.
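
For anyone unfamiliar with value prediction from the list above, a minimal sketch of the simplest textbook form (a last-value predictor gated by a confidence counter) may help. This is not Intel's design or anything from Seznec's papers. The class name, table layout, and thresholds are all assumptions chosen for illustration.

```python
class LastValuePredictor:
    """Toy last-value predictor: guess that an instruction produces the
    same result as last time, but only once it has repeated often enough."""

    def __init__(self, threshold: int = 2, max_conf: int = 3):
        self.table = {}            # pc -> [last_value, confidence]
        self.threshold = threshold  # hypothetical confidence needed to predict
        self.max_conf = max_conf    # saturating counter ceiling

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is not None and entry[1] >= self.threshold:
            return entry[0]
        return None  # not confident: do not speculate

    def update(self, pc, actual):
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = [actual, 0]
        elif entry[0] == actual:
            entry[1] = min(self.max_conf, entry[1] + 1)
        else:
            entry[0] = actual   # value changed: retrain from scratch
            entry[1] = 0


def run(trace):
    """Replay (pc, value) pairs; return (predictions made, predictions correct)."""
    predictor = LastValuePredictor()
    used = correct = 0
    for pc, value in trace:
        guess = predictor.predict(pc)
        if guess is not None:
            used += 1
            correct += int(guess == value)
        predictor.update(pc, value)
    return used, correct


# One instruction that always produces 7, one whose result changes every time.
trace = [(0x40, 7)] * 10 + [(0x44, i) for i in range(10)]
used, correct = run(trace)
print(f"predictions made: {used}, correct: {correct}")
```

The confidence gate is the important part: a wrong value prediction costs a pipeline flush much like a branch mispredict, so the predictor stays silent on the instruction whose result keeps changing and only speculates on the stable one.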
 

DrMrLordX

Lifer
Apr 27, 2000
21,617
10,824
136
It makes Intel look bad but AMD can't provide enough chips for all enterprise customers so Intel still sells a lot of server chips despite having fewer cores.

Intel can't sell Cascade Lake in volume anymore. They're still struggling to supply 10nm anything to the server market. People are going to be on waiting lists no matter what.

Correction: you can bet that AMD actively decided against using N3 after the mess it is.

Jfc I've been ragging on about this for a while now for a reason.

N3e isn't a mess though. I still don't think AMD will be an early adopter. Intel may be the first!
 

ashFTW

Senior member
Sep 21, 2020
307
231
96
Intel can't sell Cascade Lake in volume anymore. They're still struggling to supply 10nm anything to the server market. People are going to be on waiting lists no matter what.

While it’s a fact that Intel has lost a lot of server market share to AMD, the above statement is patently false.

Intel Q4 2021 Earnings Call: “Our Data Center Group had its best quarter ever as customers continued rebuilding their confidence in choosing Intel. Enabled by our IDM advantage, Ice Lake servers shipped more than 1 million units, equal to the amount we had shipped in the prior 3 quarters combined.”

Intel shipped 7.71 million total servers that quarter. In contrast, AMD shipped 1.13 million total servers (EPYC2 and EPYC3). One can conclude that Icelake volume was very likely greater than that of EPYC3.

Intel Q1 2022 Earnings Call: “Our third-generation Intel Scalable processor Ice Lake has now shipped almost 4 million units and Amazon Web Services recently announced general availability of its EC2 I4i instance designed for storage and I/O intensive workloads. This is the 48th AWS instance powered by Ice Lake.

I am also pleased to say that as committed, we began shipping initial SKUs of our fourth gen Intel Xeon scalable processor, Sapphire Rapids, to select customers in Q1.”

Intel shipped almost 2 million Icelake servers last quarter, double that of the previous quarter, and again likely outshipped AMD’s EPYC2 and EPYC3 combined. AMD does have a superior product right now that can command much higher ASPs. That’s an awesome achievement from AMD, but Intel will be stronger with Sapphire Rapids soon.

Edit: And for those who continue to falsely claim that Intel can’t yield big chips on 10nm, the two IceLake server chips are 470 and 628 mm² respectively.
 

jpiniero

Lifer
Oct 1, 2010
14,583
5,204
136
Edit: And for those who continue to falsely claim that Intel can’t yield big chips on 10nm, the two IceLake server chips are 470 and 628 mm² respectively.

You can make up somewhat for lousy yield by burning way way more wafers. That's the nice thing about being an IDM. Intel probably sells close to zero fully enabled HCC Icelake server chips and probably very little XCC in general. How far they have to cut the HCC to have any supply is something you can't really tell.
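
As a rough illustration of why die size matters so much here, the sketch below applies the classic Poisson yield model, Y = exp(-A * D0), to the two die areas quoted above. The defect densities are pure assumptions picked to show the trend; nothing here reflects Intel's actual 10nm numbers.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Expected fraction of completely defect-free dies (Poisson model)."""
    die_area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-die_area_cm2 * defects_per_cm2)


DIE_AREAS_MM2 = (470, 628)                  # die sizes quoted in the thread
ASSUMED_DEFECT_DENSITIES = (0.1, 0.2, 0.5)  # hypothetical defects per cm^2

for d0 in ASSUMED_DEFECT_DENSITIES:
    yields = ", ".join(
        f"{area} mm^2: {poisson_yield(area, d0):.0%}" for area in DIE_AREAS_MM2
    )
    print(f"D0 = {d0} defects/cm^2 -> {yields}")
```

The model only counts fully working dies. Selling cut-down parts with some cores disabled recovers many of the rest, which is why the fraction of fully enabled chips says little about total supply.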

From Dell's website you can see that they are still selling plenty of Cascade Lake.
 

Henry swagger

Senior member
Feb 9, 2022
363
236
86
While it’s a fact that Intel has lost a lot of server market share to AMD, the above statement is patently false.

Intel Q4 2021 Earnings Call: “Our Data Center Group had its best quarter ever as customers continued rebuilding their confidence in choosing Intel. Enabled by our IDM advantage, Ice Lake servers shipped more than 1 million units, equal to the amount we had shipped in the prior 3 quarters combined.”

Intel shipped 7.71 million total servers that quarter. In contrast, AMD shipped 1.13 million total servers (EPYC2 and EPYC3). One can conclude that Icelake volume was very likely greater than that of EPYC3.

Intel Q1 2022 Earnings Call: “Our third-generation Intel Scalable processor Ice Lake has now shipped almost 4 million units and Amazon Web Services recently announced general availability of its EC2 I4i instance designed for storage and I/O intensive workloads. This is the 48th AWS instance powered by Ice Lake.

I am also pleased to say that as committed, we began shipping initial SKUs of our fourth gen Intel Xeon scalable processor, Sapphire Rapids, to select customers in Q1.”

Intel shipped almost 2 million Icelake servers last quarter, double that of the previous quarter, and again likely outshipped AMD’s EPYC2 and EPYC3 combined. AMD does have a superior product right now that can command much higher ASPs. That’s an awesome achievement from AMD, but Intel will be stronger with Sapphire Rapids soon.

Edit: And for those who continue to falsely claim that Intel can’t yield big chips on 10nm, the two IceLake server chips are 470 and 628 mm² respectively.
Well said. Supply volume is what customers want the most.
 

ashFTW

Senior member
Sep 21, 2020
307
231
96
You can make up somewhat for lousy yield by burning way way more wafers. That's the nice thing about being an IDM. Intel probably sells close to zero fully enabled HCC Icelake server chips and probably very little XCC in general. How far they have to cut the HCC to have any supply is something you can't really tell.
Yields always drop with a newer process. Intel has shown such graphs forever, and these things are discussed during earnings calls. In the last call, Intel reported "Gross margin for the quarter was 53%, exceeding our guidance by 100 basis points on improved manufacturing yields and lower factory costs."
So things are getting better. Intel also reported the crossover from 14 to 10nm a while ago.

Intel probably sells close to zero fully enabled HCC Icelake server chips and probably very little XCC in general. How far they have to cut the HCC to have any supply is something you can't really tell.
I don't have any data on this. They designed these processors with a certain yield, recoverability, and product mix in mind. And they are sticking with big dies for at least their next two server releases - SPR and EMR. The future trend is definitely towards smaller chiplets and advanced packaging, which will ameliorate these concerns.

From Dell's website you can see that they are still selling plenty of Cascade Lake.
Icelake volume was still only about 1/7th of the total server volume in Q4; it's no surprise Cascade Lake outsells IceLake. Another reason IceLake has had a lower level of OEM interest, other than competitive pressure from AMD, is that IceLake is the sole Xeon release on the Whitley platform. That didn't give OEMs enough time to recoup their platform investments; they like at least 2 years/releases per platform. For the same reason, most if not all large OEMs didn't even engage with IceLake Xeon-W. Intel typically has done 2 releases per platform, but they were forced by AMD to quickly move to the next platform. Now that Granite is delayed, Intel has decided to do EMR as a follow-on to SPR on Eagle Stream.
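
The 1/7th figure is just arithmetic on the Q4 numbers quoted earlier in the thread. A quick sketch, treating "more than 1 million" Ice Lake units as a lower bound of exactly 1.0 million (that rounding is an assumption, and the variable names are made up):

```python
# Q4 2021 figures as quoted in this thread, in millions of units.
ICE_LAKE_UNITS_M = 1.0   # "more than 1 million", so treated as a lower bound
INTEL_TOTAL_M = 7.71     # total Intel server units that quarter
AMD_TOTAL_M = 1.13       # total AMD server units (EPYC2 and EPYC3)

ice_lake_share = ICE_LAKE_UNITS_M / INTEL_TOTAL_M
ice_lake_vs_amd = ICE_LAKE_UNITS_M / AMD_TOTAL_M

print(f"Ice Lake share of Intel volume: {ice_lake_share:.1%} "
      f"(about 1/{1 / ice_lake_share:.1f})")
print(f"Ice Lake units vs. all EPYC units: {ice_lake_vs_amd:.2f}x")
```

With the lower bound, Ice Lake comes out at roughly 13% of Intel's volume. It also comes out a little below AMD's combined EPYC total, so the claim that it beat EPYC3 alone depends on how AMD's volume split between EPYC2 and EPYC3.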
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
Actually, on some very demanding HPC applications Hyperthreading is turned off because the application performs better that way. Linpack, for example, performs better with HT off, since it saturates the FPU and the second thread has absolutely no room to improve performance.

SMT benefits performance by filling pipeline bubbles AND covering LLC misses. HPC applications LOVE memory bandwidth, the datasets are large, and the code fully saturates the execution units. Plus, even if HT were faster, the chip would have to clock lower because full AVX execution is very demanding anyway.
Great way to spin that around. Except OpenFOAM benefits from HT on. They load 250 iterations on a 120-thread CPU vs. a 224-thread one.

 

jpiniero

Lifer
Oct 1, 2010
14,583
5,204
136
Intel typically has done 2 releases per platform, but they were forced by AMD to quickly move to the next platform.

Intel had two releases, but the Cooper Lake one got cancelled. Probably because OEMs didn't want to bother because the benefit versus Cascade wasn't much and there was doubt that Intel would be able to do Icelake Server. IIRC that was when they did the Refresh/Price Cut of Cascade instead.

And Intel has done anything but move fast. All they've done is delay because of the yields. In 2019, Intel was talking about Sapphire coming out in 2021. Now it's the end of 2022. They had to fill the void left from 7 nm being behind schedule with Emerald.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
I have no idea which of these, if any, will feature in Royal Core; if anyone does, I would be glad if they shared information. @Exist50 seems to have reputable sources inside Intel, so it would be great to have his insight into this.
So, understand that most of my knowledge is, somewhat ironically, about people. Who's working on what, how they feel about what they're working on, how they feel about what others are working on, etc. I've had good luck extrapolating technical information from that, but for something like specific features in a future core? Forget it. Companies keep those details very well guarded, and there's no value in me just guessing.

For that matter, it's not like I have any contacts actually within the Royal core team itself. They've just apparently given a couple of high level presentations to various other business units. "Who we are. What we're trying to do.", stuff like that. And so word's gotten around for the basic stuff. The overwhelming consensus is unabashed optimism, and I extrapolate from there.

  • Value Prediction - One of the key academics who has pioneered research in this area is André Seznec. He recently accepted a position at Intel. Value Prediction is arguably the "Holy Grail" of ST Perf increases and is pretty much required at some point to increase performance. This is a question of not if, but when. There's an absolute ton of research around this and the opportunity is enormous.
Ok, I have no idea how on earth you found that name, but yes, I think Seznec is working on Royal. Just decided to browse his page, and there're some interesting things on there, including history on DEC's Alpha EV8, and inventing the TAGE branch predictor. Pretty impressive resume.


And a fun little snippet.
As a programmer I am fundamentally a sequential guy.. Therefore my past researches in computer architecture have essentially focussed on providing performance for sequential programs, mainly on a single processor.

Will need to look into that value prediction stuff, but sounds cool.
 

ashFTW

Senior member
Sep 21, 2020
307
231
96
And Intel has done anything but move fast. All they've done is delay because of the yields. In 2019, Intel was talking about Sapphire coming out in 2021. Now it's the end of 2022. They had to fill the void left from 7 nm being behind schedule with Emerald.
True that! But according to their Q1 call, they did start shipping some SPR SKUs last quarter to select customers. This list of customers must include Argonne National Labs, and I would think that they want only the highest core count SPR with HBM. I expect Aurora to feature in the next Top 500 list.

Has Intel said end of ’22 publicly for full release, or is it just rumors and/or speculation? If not, I would rather wait a month and see what they say on their Q2 earnings call.
 

ashFTW

Senior member
Sep 21, 2020
307
231
96
André Seznec: As a programmer I am fundamentally a sequential guy.. Therefore my past researches in computer architecture have essentially focussed on providing performance for sequential programs, mainly on a single processor.
I’m the opposite. I love to parallelize my workloads and run on as many threads as I can. It’s a joy to break up problems into smaller pieces and compose the sub-results back together. But I do have a big bias for multiprocessors vs multicomputers/clusters. I need to correct that bias in the future.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Great way to spin that around. Except OpenFOAM benefits from HT on. They load 250 iterations on a 120-thread CPU vs. a 224-thread one.


You have AMD's own marketing materials that have SMT off: https://www.amd.com/system/files/documents/amd-epyc-7Fx2-openfoam.pdf

This shows that in all applications tested, there are gains with lower core counts but not with higher ones: https://www.nas.nasa.gov/assets/nas/pdf/papers/NAS_Technical_Report_NAS-2015-05.pdf

You are talking about 6% gains in one specific scenario, with most cases seeing no gain or even a loss, and with recommendations to disable SMT.
 

jpiniero

Lifer
Oct 1, 2010
14,583
5,204
136
Has Intel said end of ’22 publicly for full release, or is it just rumors and/or speculation?

They can launch Sapphire whenever. I believe last they were still talking about a 1H launch but the clock is ticking on that.

But actually buying it will probably be like Icelake Server, where you won't be able to get it in any quantity until Q4.
 

ashFTW

Senior member
Sep 21, 2020
307
231
96
They can launch Sapphire whenever. I believe last they were still talking about a 1H launch but the clock is ticking on that.

But actually buying it will probably be like Icelake Server, where you won't be able to get it in any quantity until Q4.
A lot of the volume will go to the cloud vendors. The same is true of AMD. They are already receiving deliveries.
 

lyonwonder

Member
Dec 29, 2018
31
23
81
I wouldn't be surprised if Meteor Lake or any of its successors drops legacy hardware features like real mode and hardware X87 since new motherboards no longer have (or are supposed to no longer have since it's an Intel mandate) legacy BIOS support with CSM.

Booting in 32-bit mode may eventually go by the wayside too, with future CPUs only booting in long mode and only supporting 64-bit OSs, though I doubt Intel will remove hardware support for 32-bit from the next several CPU generations since 32-bit applications aren't going away any time soon.
 

uzzi38

Platinum Member
Oct 16, 2019
2,621
5,872
146
A small win for Intel. Could be due to the HBM version, don’t know if Genoa has one. Intel must have bribed them or given the processors away for free! /s
It has nothing to do with HBM. Intel just came in cheaper. That didn't matter last time, when Nvidia picked Rome, because Intel didn't have a PCIe Gen 4 platform back then. This time around they have a competent platform and were actually an option.