Discussion Intel current and future Lakes & Rapids thread


ashFTW

Senior member
Sep 21, 2020
310
235
96
I'm repeating myself a lot, but perhaps this helps...

We can break this discussion down to two questions. Let's answer the first one, and then answering the second becomes much easier.

Question 1. Intel has only publicly discussed the 400 mm2 XCC SPR tile with 15 cores each. Four of these are combined with EMIB to make the SPR chip with (up to) 60 cores. It's not currently publicly known how Intel plans to make lower core count (8/12/16, etc.) SPR chips. For example, how will Intel make something like the 16-core Ice Lake Xeon Silver 4314, with a full complement of PCIe lanes and memory channels (but reduced cache), which has an MSRP of only $750?

Option 1: Since only 1/4th of the I/O and memory is on each SPR tile, four XCC SPR tiles could be used. Note that these tiles can have defective/fused-off cores but no unrecoverable defects in the I/O, memory, and EMIB PHY areas. There are also 10 additional EMIB tiles (totaling 215 mm2 **), as well as the added costs of advanced packaging. Given that lower-end chips have a much larger volume and a low MSRP, using this as the only option is complete madness! Intel may be forced to use it to satisfy some portion of the low-core-count volume, or for SKUs with full L3 cache, but using it exclusively would be a colossal money-losing proposition.

Intel 7 yield issues have been brought up before to support this option. But Intel 7 is producing dies half this size in high volume for Alder/Raptor Lake parts with no problem. And with extensive block repair/recovery methods covering 74% of the chip, a large portion of the SPR tiles will be functional. Intel has also been selling 470 and 628 mm2 Ice Lake Xeon parts in high volume, but on a slightly older 10nmSF.

** Estimated from the figure below from IEEE ISSCC 2022. Also note that EMIB takes over 1/8th of the SPR tiles.

1654382793483.png
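
To put rough numbers on the repair/recovery argument, here is a minimal sketch using a simple Poisson yield model. The 400 mm2 tile size and the 74% recoverable share are the figures quoted above; the defect density is a made-up placeholder (not an Intel number), and the model assumes any defect landing in the recoverable area can always be fused off, which is optimistic.

```python
import math

TILE_AREA_MM2 = 400.0         # XCC tile size quoted above
RECOVERABLE_FRACTION = 0.74   # share of the tile covered by repair/recovery, quoted above
D0_PER_CM2 = 0.1              # ASSUMPTION: placeholder defect density, not an Intel number


def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Probability that an area of silicon has zero defects (Poisson model)."""
    return math.exp(-defects_per_cm2 * area_mm2 / 100.0)


# A tile with no defects anywhere (all 15 cores good).
perfect_tile = poisson_yield(TILE_AREA_MM2, D0_PER_CM2)

# A tile that is at least sellable: no defect in the non-recoverable 26%.
critical_area_mm2 = TILE_AREA_MM2 * (1.0 - RECOVERABLE_FRACTION)
usable_tile = poisson_yield(critical_area_mm2, D0_PER_CM2)

print(f"perfect tiles: {perfect_tile:.1%}")
print(f"usable tiles (possibly with fused-off cores): {usable_tile:.1%}")
```

Whatever defect density you plug in, the usable share stays well above the perfect share, because only about a quarter of the tile has to be defect-free.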

Option 2: Keep the 4-tile design, but make the tiles smaller with a reduced number of cores. Let's remove two rows of cores (total 8) per tile. Now we have 7 cores per tile, and 28-core parts with all cores functioning. We do have all the I/O and memory, but we also still have the complexity of the 4-tile design. The silicon savings are there, of course. I estimate that these tiles will be 250 mm2 or so. Two fewer EMIB tiles will be needed, but over 17% of the 1000 mm2 is now dedicated to die-to-die fabric.

Option 3: Build a monolithic die for lower core counts. For example, take one of the 15 core tiles and add 1-2 rows of cores. Use the area now dedicated to EMIB PHYs to add additional I/O and memory. Option 3 is superior to Option 2, because it's much simpler and cheaper to make, while still reusing large parts of the design. The size should be 450-500 mm2. This option also better covers the even lower core count (like 8 and 12) SKUs. With no multi-die fabric, the chip should perform better as well.
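
For a side-by-side view of the silicon involved, here is a rough tally of the three options. Tile counts, tile sizes, and the 215 mm2 EMIB total are the estimates given above; splitting the EMIB area evenly across the 10 bridges and taking 475 mm2 as the midpoint for the monolithic die are simplifications for this sketch, and packaging cost is not modelled at all.

```python
from dataclasses import dataclass

EMIB_TOTAL_MM2 = 215.0    # estimate for all 10 bridges, from the post
EMIB_BRIDGES_FULL = 10
# ASSUMPTION: all bridges are the same size (they may well differ).
EMIB_BRIDGE_MM2 = EMIB_TOTAL_MM2 / EMIB_BRIDGES_FULL


@dataclass(frozen=True)
class Option:
    name: str
    tiles: int
    tile_mm2: float
    emib_bridges: int

    @property
    def silicon_mm2(self) -> float:
        """Main-tile silicon plus EMIB bridge silicon."""
        return self.tiles * self.tile_mm2 + self.emib_bridges * EMIB_BRIDGE_MM2


options = [
    Option("Option 1: 4 x XCC tile", 4, 400.0, 10),
    Option("Option 2: 4 x smaller tile", 4, 250.0, 8),
    Option("Option 3: monolithic", 1, 475.0, 0),  # midpoint of 450-500 mm2
]

for opt in options:
    print(f"{opt.name:28s} {opt.silicon_mm2:7.1f} mm2")
```

With these inputs, Option 1 carries nearly four times the silicon of the monolithic die, before counting the extra packaging steps.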

I have no insider information, but to me Option 3 makes the most sense for Sapphire and Emerald Rapids. Granite Rapids and Falcon Shores are disaggregated with separate compute-only tiles, in which case the number of cores can be scaled by just adding more (preferred) or bigger compute tiles.

Question 2. How will Intel make SPR workstation chips?

For a high core count professional workstation (say with 48-60 cores) 4 x XCC is the only option.

For a low core count enthusiast workstation with 16-24 cores and half the I/O and memory channels, my answer would be to use the monolithic die from Option 3 above, though 2 x XCC would also be fine here, as @nicalandia has suggested. Or it could be a combination of the two, but this being a low-volume product, that's unlikely.

This is not a competition, let's wait and see what Intel does.
 

ashFTW

Senior member
Sep 21, 2020
310
235
96
Speculation: Granite Rapids and Sierra Forest with disaggregated design, on the same platform. A "4-stack" Foveros and co-EMIB base die with all the I/O, memory, and cache. The P and E core tiles may be assemblies of 2 or more smaller chiplets. I expect the max core count for Granite to be 128, and for Sierra to be 3-4 times more.

GNR_SRF.jpg

Edit: Fixed Typo (Sierra Rapids to Sierra Forest).
 

ashFTW

Senior member
Sep 21, 2020
310
235
96
Speculation: Falcon Shores built on the same platform as Granite Rapids and Sierra Forest, continuing the path to Zettascale.
FLR.jpg

Edit: Fixed Typo (Sierra Rapids to Sierra Forest).
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I'm repeating myself a lot, but perhaps this helps...

We can break this discussion down to two questions. Let's answer the first one, and then answering the second becomes much easier.

This is a well thought out post.

I agree, monolithic or 2-tile is the way to go. MLID was the first one to suggest one in public but I think there's a good chance he'll get it right.

The tile configuration brings Intel another advantage, especially for segmentation and marketing: Countless configurations.

Options for: UPI connections, core count, memory channels, PCI Express, accelerators, tile count, disabling cores/tiles, AVX and AMX configurations.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Yea I'm not sure if I buy the fantastic, 30%+ gains even for the Royal Cove project. We'll see when it happens. Yes I can believe amazing amount of effort and reorganization of teams would happen, but in terms of absolute numbers I am skeptical.
Of course you're skeptical, as would be any rational person. "Extraordinary claims require extraordinary evidence" and certainly I haven't presented any. So just take it on faith for now that I believe Royal is capable of even more than rumored, insane as that may sound. There will be plenty of time for I-told-you-sos later, haha.

Though one thing to keep in mind is that Royal is an independent effort and team. They might have picked up one or two of the Atom folk along the way just by proximity, but it's a completely separate project from Core and Atom. Think the original members largely came from Intel Labs, along with one or two high profile hires. Actually, success for Royal might very well kill Core, so that'll be interesting to watch play out. Might even coexist for a time.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Think the original members largely came from Intel Labs, along with one or two high profile hires. Actually, success for Royal might very well kill Core, so that'll be interesting to watch play out. Might even coexist for a time.

Interesting. I prefer to be conservative because marketing always hypes everything. And I don't mean just advertisements. Reviewers, and even so-called technical presentations count too.

You do see some things that exceed expectations, but it's pretty rare. Makes sense. The whole semiconductor industry already advances at an astonishing rate. Even the worst of them. A 10 year old computer fan is decades ahead of many non-computer ones. Heatsinks, fans, power delivery, circuit design, chemistry: all advance at a blistering pace to further the computer industry. It's amazing what they do.

I do GPU repair, and after looking at that, non-computer circuitry starts to look very doable, and sometimes trivial. It's fun.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Speculation: Falcon Shores built on the same platform as Granite Rapids and Sierra Forest, continuing the path to Zettascale.
View attachment 62607
Ok, something's been bugging me since the announcement the other day, and I finally figured out what it is. They've changed the graphic!

If you go back to February, they showed this:

falcon-shores-intel.fw_.png


Which looks vaguely similar to what they've shown for Granite Rapids:

intel-granite-rapids-chip-logo-854x438.jpg


But now they've shown this, which looks like nothing else they've shown before.

Intel-talked-about-promising-Falcon-As-Chips-that-would-allow-you-to-pick-the-ratio-between-CPU-and-GPU.jpeg


So I initially thought that like Meteor Lake, maybe one is for representative purposes only. Seems totally possible. But then why change it? So I've got two alternative theories.

a) Granite Rapids itself changed topologies when they delayed it to '24. Possible, but that seems like a huge risk.

b) What they're showing here is actually a sneak peek at a post-GNR topology, and Falcon Shores is more like '25 or '26.
 

ashFTW

Senior member
Sep 21, 2020
310
235
96
a) Granite Rapids itself changed topologies when they delayed it to '24. Possible, but that seems like a huge risk.
The switch was probably made towards the end of ’21, after Pat joined Intel. A more competitive product was needed (including a more advanced process) to counter advances from AMD, Nvidia, and ARM. This gave them 3 years to ’24. That pace is urgent, but perhaps not “break-neck” crazy.

Which looks vaguely similar to what they've shown for Granite Rapids:

intel-granite-rapids-chip-logo-854x438.jpg


But now they've shown this, which looks like nothing else they've shown before.
This looks more like a “2 stack” design, similar to Ponte Vecchio. Maybe they switched to “4 stack” design once they realized they needed to push the release by a year and move to Intel 3 to compete better, while also better aligning with the evolving Falcon Shores design. Smaller reticle size with High NA EUV use will also start to play out soon.

b) What they're showing here is actually a sneak peek at a post-GNR topology, and Falcon Shores is more like '25 or '26.
It could be ’25. Parts of Falcon Shores are very likely to be on Intel 20A, especially the Xe parts. The x86 parts can be GNR or SRF on Intel 3. Now that the designs are disaggregated, the various tiles can progress independently and still come together in yearly updates, as long as the interfaces are preserved.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I have doubts on Falcon Shores using the same platform as Granite Rapids. Maybe Granite Rapids HBM?

I see it as a spiritual successor to the -AP line, and further back, the socketed Xeon Phi.
 

ashFTW

Senior member
Sep 21, 2020
310
235
96
I have doubts on Falcon Shores using the same platform as Granite Rapids. Maybe Granite Rapids HBM?

I see it as a spiritual successor to the -AP line, and further back, the socketed Xeon Phi.
Sapphire Rapids with and without HBM use the same Eagle Stream platform/socket, don’t they? HBM concerns should be internal to the socket. Do all XCC Sapphire Rapids tiles not have the ability to interface with HBM2e?

Edit: Hmmm ... the ISSCC 2022 paper doesn't mention HBM explicitly. Interesting! So did Intel make a separate XCC tile with support for HBM, or is that just not labelled in the floorplan? What's gpio? And what's the yellow and pink stuff?
SPR_layout.jpg
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
@ashFTW Sapphire Rapids with HBM clearly uses a different package. And it's much larger too.

GPIO stands for "General Purpose I/O". Systems have tons of I/O buses and not all are fancy like UPI or DMI. GP in the GPIO means that it can be reconfigured for what's needed. But of course much less bandwidth.
 

ashFTW

Senior member
Sep 21, 2020
310
235
96
@ashFTW Sapphire Rapids with HBM clearly uses a different package. And it's much larger too.
Ok, I see that they have different size packages, but aren’t they both using LGA 4677, and the Eagle Stream PCH?
1654404441305.png

The HBM version package just has these weird “ears” sticking out, which makes HBM support look like an afterthought. Don’t know if every Eagle Stream motherboard will be able to accommodate both HBM and non-HBM versions of SPR.

1654404666198.jpeg


GPIO stands for "General Purpose I/O". Systems have tons of I/O buses and not all are fancy like UPI or DMI. GP in the GPIO means that it can be reconfigured for what's needed. But of course much less bandwidth.
Thanks!
 

mikk

Diamond Member
May 15, 2012
4,140
2,154
136
In the reddit leak "the biggest architectural change in CPU architecture since the Core architecture" was claimed for Nova Lake. There are rumors about Panther Lake after Lunar, I hope it doesn't mean it has been delayed one generation. If there is Panther Lake the Panther Cove core naming would make sense though.
 

Henry swagger

Senior member
Feb 9, 2022
367
239
86
Yea and 8+32 would take way less space and power. 8+32 = 16

8+64 equals 24P. Now tell me how that performs!

And there's an additional benefit where the P core can be P+, and be bigger and more performant than otherwise for even better ST performance and responsiveness. That's the whole point of hybrid. You get to specialize the cores way more than otherwise.

The real promise is this: Rather than doing 8+64 in place of 24P, you do 8 supercharged P cores + 32 E cores. Of course the P cores would be a lot larger. Let's say 30% faster per clock and twice the size.

Remember, this is in addition to whatever they would do normally. So I believe for risk mitigation it'll be spread out over few generations. So rather than new gen P being 18% faster, you have it being 24% faster for next 4-5 generations. And at the end, you have a very large P core and sea of E cores. Supercharging it for low and high thread.
Imagine an Intel console with 16 or 24 E-cores. Sony and Microsoft better call Intel for the PS6 or Xbox 😁
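
The core-count shorthand in the quoted post is easier to follow as area bookkeeping. A minimal sketch, assuming 4 E-cores take the area of 1 P-core (the ratio implied by "8+32 = 16" and "8+64 equals 24P") and using the post's "twice the size" figure for a supercharged P core; the function names are just for illustration.

```python
E_CORES_PER_P_AREA = 4  # ASSUMPTION: ratio implied by "8+32 = 16" in the quoted post


def area_in_p_cores(p_cores: int, e_cores: int, p_core_scale: float = 1.0) -> float:
    """Total core area, expressed in units of one standard P-core."""
    return p_cores * p_core_scale + e_cores / E_CORES_PER_P_AREA


print(area_in_p_cores(8, 32))                    # 8P + 32E
print(area_in_p_cores(8, 64))                    # 8P + 64E
print(area_in_p_cores(8, 32, p_core_scale=2.0))  # 8 double-size P + 32E


def extra_gain(generations: int, boosted: float = 0.24, baseline: float = 0.18) -> float:
    """Cumulative per-clock advantage of the boosted cadence over the baseline one."""
    return ((1.0 + boosted) / (1.0 + baseline)) ** generations


for gens in (4, 5):
    print(f"{gens} generations: {extra_gain(gens) - 1.0:.0%} ahead of the baseline cadence")
```

The three configurations come out at 16, 24, and 24 P-core areas, so 8 double-size P cores plus 32 E-cores fit in the same budget as 8+64 or 24P. Compounding 24% instead of 18% per generation gives roughly 22-28% extra over 4-5 generations, in the same ballpark as the "say 30% faster per clock" figure.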
 

Henry swagger

Senior member
Feb 9, 2022
367
239
86
In the reddit leak "the biggest architectural change in CPU architecture since the Core architecture" was claimed for Nova Lake. There are rumors about Panther Lake after Lunar, I hope it doesn't mean it has been delayed one generation. If there is Panther Lake the Panther Cove core naming would make sense though.
Maybe Intel wants to score 4,500 or 5,000 in single-core performance on Cinebench with Royal Core or Panther Cove.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Speculation: Emerald Rapids tile (~480 mm2) with an additional row of cores. Or, up to 76 cores for the chip.


View attachment 62601
Intel's (stylistic) block diagram from HotChips doesn't show any significant space for EMIB. The whole point of EMIB is that it is a small piece of silicon embedded in the package; it is not part of the main die itself. Those are 10 separate tiles.
1654424690981.png
 

CakeMonster

Golden Member
Nov 22, 2012
1,391
498
136
Might be a topic for another thread, but even though I always hope for over-specced consoles (given the nightmarish lack of RAM on 3rd gen and CPU power on 4th gen), 16 threads seem to be just 'fine' on this gen, with GPU power and features, if anything, holding it back. I don't pretend to know what will be the most important thing for next gen, but I kind of doubt it will be massive MT performance (although I would prefer a safety margin beyond 8c/16t; an AMD design with equal cores might make more sense then).
 

jpiniero

Lifer
Oct 1, 2010
14,591
5,214
136
So I initially thought that like Meteor Lake, maybe one is for representative purposes only. Seems totally possible. But then why change it? So I've got two alternative theories.

a) Granite Rapids itself changed topologies when they delayed it to '24. Possible, but that seems like a huge risk.

b) What they're showing here is actually a sneak peek at a post-GNR topology, and Falcon Shores is more like '25 or '26.

I'll say both. Granite's topology changed but to the one shown in that picture with the Sea of Cores.