Discussion Intel current and future Lakes & Rapids thread

Page 638

nicalandia

Platinum Member
Jan 10, 2019
2,654
4,095
106
Shortening the paths between CPU/GPU/memory is the thing. So nVidia, AMD, and Intel are going that way. For AMD it should be MI300 since it can operate in "an APU mode". So next gen.
AMD has been at this for much longer than Intel and Nvidia. Trento is an APU already.
 

nicalandia

Platinum Member
Jan 10, 2019
2,654
4,095
106
I know there is an HBM version of Sapphire Rapids. That's not the point. The point is that Intel is deliberately NOT showing non-HBM Sapphire Rapids on that slide. It's also not showing HBM Emerald Rapids. That makes no sense. Unless Intel is canning non-HBM Sapphire Rapids and leaving 4th gen to . . . IceLake-SP?
This one?

[Image: 1654030031525.png]
 

mikk

Diamond Member
May 15, 2012
3,682
1,560
136
Rialto Bridge looks a bit underwhelming based on "30% increased application performance", but I guess that depends on how good Ponte Vecchio is.

Also, Meteorlake-S in Q1 2024 and Arrowlake-S in the same year. See what I mean by most Meteorlake chips way later than middle of 2023?

You said Meteor Lake won't come before the end of 2023 no matter what model. The desktop models were always the last and it was (or still is) unclear when it comes to the desktop. If there is a desktop model it won't come before late 2023/early 2024 but this is hardly a surprise.
 

ashFTW

Senior member
Sep 21, 2020
252
178
96
Seems like Intel is staying with a 4 tile system design (ala SPR) with Falcon Shores. Each of these tiles, especially the Xe tiles (evolved from Ponte Vecchio), may be composed of sub chiplets. Given the flexible x86 to Xe ratios, is a Granite Rapids chip a Falcon Shores chip with all 4 tiles being x86? That would be a parsimonious design choice.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,518
3,567
136
You said Meteor Lake won't come before the end of 2023 no matter what model. The desktop models were always the last and it was (or still is) unclear when it comes to the desktop. If there is a desktop model it won't come before late 2023/early 2024 but this is hardly a surprise.
No I never said that! Where did you get that nonsense from? I said most Meteorlake models will NOT launch at Q2 2023, and that the M version will.

Most people expected desktop earlier than that. I was NOT one of them. Stop twisting things..
 

IntelUser2000

Elite Member
Oct 14, 2003
8,518
3,567
136
Seems like Intel is staying with a 4 tile system design (ala SPR) with Falcon Shores. Each of these tiles, especially the Xe tiles (evolved from Ponte Vecchio), may be composed of sub chiplets. Given the flexible x86 to Xe ratios, is a Granite Rapids chip a Falcon Shores chip with all 4 tiles being x86? That would be a parsimonious design choice.
Seems almost exactly like the MLID leak a while ago. There were no details, though, beyond that they wanted one of the tiles to be Xe.

Since Falcon Shores is shown on the roadmap as replacing both HBM Xeon and Rialto Bridge, it seems like a logical design choice. We haven't seen how they'll do with Granite Rapids though.

Where's that from? Don't see it in the slide deck.

Though either way, probably reading too much into it.
It's in the Anandtech article.

Anyway what I said about Meteorlake-S is from a well known leaker on twitter. Kope-something, I can't remember.
 

ashFTW

Senior member
Sep 21, 2020
252
178
96
Seems almost exactly like the MLID leak a while ago. There were no details, though, beyond that they wanted one of the tiles to be Xe.

Since Falcon Shores is shown on the roadmap as replacing both HBM Xeon and Rialto Bridge, it seems like a logical design choice. We haven't seen how they'll do with Granite Rapids though.
I’m assuming that with GNR and FLR, Xeon will be disaggregated such that all the memory and I/O (UPI, PCIe, CXL etc) will be on the Foveros base die. That way, using a varying number of x86 and Xe tiles still gets all the memory channels and I/O.

The GNR without HBM should also reuse the same FLR x86 tiles. Since I/O and memory are no longer on the x86 tiles, the GNR chips can now be made with 1 to 4 tiles to cover the range of core counts. It’s a much better system design than SPR/EMR.
 

Exist50

Golden Member
Aug 18, 2016
1,401
1,466
136
Seems like Intel is staying with a 4 tile system design (ala SPR) with Falcon Shores. Each of these tiles, especially the Xe tiles (evolved from Ponte Vecchio), may be composed of sub chiplets. Given the flexible x86 to Xe ratios, is a Granite Rapids chip a Falcon Shores chip with all 4 tiles being x86? That would be a parsimonious design choice.
I think it would have to be an unrepresentative illustration. We've seen various Granite Rapids illustrations, and it doesn't much look like what they've shown here. Maybe there's more flexibility than they've let on thus far, but I don't know...
 

ashFTW

Senior member
Sep 21, 2020
252
178
96
I think it would have to be an unrepresentative illustration. We've seen various Granite Rapids illustrations, and it doesn't much look like what they've shown here. Maybe there's more flexibility than they've let on thus far, but I don't know...
Remember GNR was delayed and moved from Intel 4 to Intel 3, probably changing the system design in the process. The most recent renders probably reflect that, but of course I’m speculating.

Intel needs to be parsimonious with their designs to compete well with fast moving competition (AMD and Server ARM vendors). One Foveros base tile, one x86 P core tile, one x86 E core tile, and one Xe tile should be enough to cover all of Granite Rapids, Sierra Rapids, and Falcon Shores chips.
 

Exist50

Golden Member
Aug 18, 2016
1,401
1,466
136
Remember GNR was delayed and moved from Intel 4 to Intel 3, probably changing the system design in the process. The most recent renders probably reflect that, but of course I’m speculating.
I guess that's possible, though it would be very concerning if so. Changing core, process, and soc architecture all at once? That's basically designing a completely different product. I think Intel needs to focus on the simplest thing that works, and get that out of the door before trying to pile on more changes. After all, clearly Granite Rapids was already very troubled, given that they had to delay it all the way to '24.

One Foveros base tile, one x86 P core tile, one x86 E core tile, and one Xe tile should be enough to cover all of Granite Rapids, Sierra Rapids, and Falcon Shores chips.
I don't think Foveros is a good match for server. You need reticle stitching to get an interposer large enough for everything at once. That, or you need both Foveros and EMIB, which is a lot of extra overhead for what gain? And either way, that's just plain old a lot of silicon, even if it's cheap and dumb.
 

ashFTW

Senior member
Sep 21, 2020
252
178
96
I don't think Foveros is a good match for server. You need reticle stitching to get an interposer large enough for everything at once. That, or you need both Foveros and EMIB, which is a lot of extra overhead for what gain? And either way, that's just plain old a lot of silicon, even if it's cheap and dumb.
Yes, both Foveros and EMIB will be required (as in PVC) to overcome reticle size challenges that will only get worse with High-NA EUV. EMIB only is not an option — see how much silicon it’s using up on SPR tiles.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,518
3,567
136
Raptorlake and DLVR mysteries unveiled:

I might be wrong on this, but my understanding of DLVR was more as a FIVR alternative? replacement? Something of the sort. This patent is kinda confusing in that regard, but maybe just a different use for the same tech? Wonder also how multiple DLVR would scale... Anyway, still don't really expect it for desktop, but maybe?
I should have taken a look at the patent first, and now I have.

According to the patent, there's no reason to not include FIVR. The new power system can work with FIVR.

-However, FIVR increases complexity, and the patent says it takes "large die space" or something, so DLVR does become a replacement in a sense.

-DLVR isn't the critical part; the way they are delivering power is. Right now, the mainboard VR is in series with FIVR or DLVR. The new system adds a secondary power delivery stage that uses the linear voltage regulators. They said the secondary stage only kicks in when the load is high and could cause a voltage droop. They also said that the DLVR being on results in lower efficiency.

-The power savings come from the fact that voltages are currently set for worst-case scenarios: an unexpected load can cause the voltage to drop, and the resulting voltage must still be higher than what the application requires. With a DLVR in parallel that kicks in only when required, the CPU does not have to use the worst-case voltages and can run closer to the "load voltage", saving power.

The part about losing gains above 40A is because their example uses 40A for VR1, and above that the DLVR kicks in. The efficiency gains drop because of the DLVR's reduced efficiency, not because there's an inherent limit of 70A.(1)(2)(3)

1. It's likely, though, that there will be a reduced gain for desktop chips that clock ridiculously high, because of their voltage requirements.
2. The higher the load, the lower the gain, because the greater load results in a voltage drop that gets closer to the minimum operating voltage. Therefore the margin shrinks. The lower gain is also due to the DLVR kicking in and being less efficient.
3. It also sounds mostly beneficial for mobile, where we are entirely at the system's mercy for regulating voltage. Overclocking and overvolting will throw the gains out the window.

So the funny thing is for the most efficient operation, the DLVR should always be off! It'll be a great gain for burst and low load scenarios.
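
A rough way to see why the gain shrinks above the VR1 threshold is to put toy numbers on it. The sketch below only illustrates the reasoning above and is not the patent's method: the 40 A threshold is the example figure mentioned in this post, while the voltages, guard bands, and regulator efficiencies are made-up assumptions, and the load is treated as a fixed current for simplicity.

```python
VR1_LIMIT_A = 40.0  # example threshold from the post; above it the DLVR supplies the rest

# Hypothetical values chosen only for illustration.
V_MIN = 1.00          # minimum voltage the workload needs (V)
WORST_GUARD_V = 0.10  # guard band sized for worst-case droop (V)
SMALL_GUARD_V = 0.03  # smaller guard band when the DLVR can catch droops (V)
ETA_VR = 0.90         # assumed efficiency of the main regulator path
ETA_DLVR = 0.80       # assumed (lower) efficiency of the linear regulator


def baseline_power(load_a: float) -> float:
    """Input power (W) when the core always runs at the worst-case voltage."""
    return load_a * (V_MIN + WORST_GUARD_V) / ETA_VR


def parallel_power(load_a: float) -> float:
    """Input power (W) with the DLVR in parallel, active only above VR1_LIMIT_A."""
    v_core = V_MIN + SMALL_GUARD_V
    main_a = min(load_a, VR1_LIMIT_A)        # current carried by the main path
    dlvr_a = max(0.0, load_a - VR1_LIMIT_A)  # excess current carried by the DLVR
    return main_a * v_core / ETA_VR + dlvr_a * v_core / ETA_DLVR


def saving(load_a: float) -> float:
    """Fractional input-power saving of the parallel scheme versus the baseline."""
    return 1.0 - parallel_power(load_a) / baseline_power(load_a)


if __name__ == "__main__":
    for amps in (10, 40, 55, 70):
        print(f"{amps:>3} A: saving = {saving(amps):.1%}")
```

With these assumed numbers the saving is flat up to 40 A and then shrinks as more of the current goes through the less efficient DLVR, which matches the shape described above. The exact percentages mean nothing.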
 

dullard

Elite Member
May 21, 2001
24,190
2,409
126
So the funny thing is for the most efficient operation, the DLVR should always be off! It'll be a great gain for burst and low load scenarios.
Thank you for finally reading it, unlike the others who just spout off what they want to believe.

The thing that wraps it all together is that the long term vision is that the P-cores are intended to have low load and handle the bursty needs while the E-cores do the grunt work and the 24-7 type work.
 

Exist50

Golden Member
Aug 18, 2016
1,401
1,466
136
Yes, both Foveros and EMIB will be required (as in PVC) to overcome reticle size challenges that will only get worse with High-NA EUV. EMIB only is not an option — see how much silicon it’s using up on SPR tiles.
I'd hardly think that Foveros + EMIB is better than EMIB alone in terms of overhead. And they have years before high-NA hits the market. Can probably fit in two server gens first. In the meantime, to amortize EMIB overhead, bigger tiles make the most sense.
 

ashFTW

Senior member
Sep 21, 2020
252
178
96
I'd hardly think that Foveros + EMIB is better than EMIB alone in terms of overhead. And they have years before high-NA hits the market. Can probably fit in two server gens first. In the meantime, to amortize EMIB overhead, bigger tiles make the most sense.
What overhead? Foveros (especially with Foveros Direct, which will be ready ‘23-‘24) is much denser (bump pitch wise) and far better in terms of energy per bit transferred compared to EMIB. Intel has had enough experience with this technology, and they will now scale it from low end clients (like Meteor Lake) to very high end server chips (like Falcon Shores). Note that Foveros is already able to handle 600W in PVC, and soon 800W in Rialto Bridge, so early power delivery and heat dissipation concerns have been worked out quite a bit for the technology to go mainstream.

Also note that 3D packaging saves space on the top layer for the chip’s core functions (CPU, GPU etc); you can only make a chip so big without affecting placement of other things on the board (like DIMMs and PCIe slots). For example: Sapphire Rapids is already at 1600 mm² (without counting EMIB tiles), with only 60 cores. If they had built it using Foveros, all the memory and I/O could have been moved to the base die. Then ~1600 mm² on the top layer could have easily fit twice the number of cores (at a much higher TDP of course). I guess that’s what Granite Rapids will look like. And Sierra Rapids, assuming that 4 new E-cores can still fit in the area of 1 new P-core, could touch 512 cores for the highest SKU. All these 2024 products have the added benefit of being on Intel 3, which is theoretically 2x denser compared to Intel 7.
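
The core-count guesses above are simple enough to write down. This is only back-of-the-envelope arithmetic using the figures from the post (60 cores, a 2x gain from freeing up the top layer, 4 E-cores per P-core). The function names are made up for illustration, and the Intel 3 density gain is deliberately left out since the post does not say how it would be spent.

```python
import math

SPR_CORES = 60   # Sapphire Rapids core count quoted in the post
AREA_GAIN = 2.0  # "twice the number of cores" once memory and I/O move to the base die
E_PER_P = 4.0    # the post's assumption: 4 E-cores fit in the area of 1 P-core


def p_core_slots(base_cores: int, area_gain: float = AREA_GAIN) -> int:
    """P-core-sized slots on the top layer after moving memory and I/O off it."""
    return math.floor(base_cores * area_gain)


def e_core_total(p_slots: int, e_per_p: float = E_PER_P) -> int:
    """E-cores that fit if every P-core slot is filled with E-cores instead."""
    return math.floor(p_slots * e_per_p)


if __name__ == "__main__":
    slots = p_core_slots(SPR_CORES)
    print("P-core slots:", slots)
    print("E-cores:", e_core_total(slots))
```

With the post's figures this gives 120 P-core slots and 480 E-cores, so reaching 512 would need 128 P-core-sized slots or some help from the denser process.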

Granite Rapids, Sierra Rapids, and Falcon Shores will all share the same platform and Xeon socket. My guess is 12 channel DDR6, PCIe6, CXL 2.0+, and (for some SKUs) HBM3. There will be more on-chip specialized accelerators as well, though they could also be moved to the base tile. Intra-chip and inter-chip interconnects will both need an overhaul. I don’t know if the current mesh and UPI will continue to scale. Maybe Xe link will be generalized to replace UPI. And the physical layer could be optical. That’s what late 2024 from Intel looks like, if they execute to their plans. Time will tell!! :)
 

Exist50

Golden Member
Aug 18, 2016
1,401
1,466
136
What overhead? Foveros (especially with Foveros direct, which will be ready ‘23-‘24) is much denser (bump pitch) and far better in terms of energy per bit transferred compared to EMIB
For a larger-than-reticle package, Foveros isn't going to work. You need EMIB for that, and going die<->Foveros<->EMIB<->Foveros<->die is strictly worse than die<->EMIB<->die. And Foveros Direct is probably not going to be ready to supply the entire server lineup for quite some time.

All these 2024 products have the added benefit of being on Intel 3, which is theoretically 2x denser compared to Intel 7.
We've seen Redwood Cove on Intel 4. It's more like 40% denser.

Granite Rapids, Sierra Rapids, and Falcon Shores will all share the same platform and Xeon socket. My guess is 12 channel DDR6, PCIe6, CXL 2.0+, and (for some SKUs) HBM3.
Absolutely not. Intel doesn't even have a platform with DDR5 or PCIe 5.0 today. CXL 2.0 and HBM3 might be possible, but DDR6, PCIe6, etc will probably need to wait till 2026, at least from Intel.
 

ashFTW

Senior member
Sep 21, 2020
252
178
96
For a larger-than-reticle package, Foveros isn't going to work. You need EMIB for that, and going die<->Foveros<->EMIB<->Foveros<->die is strictly worse than die<->EMIB<->die. And Foveros Direct is probably not going to be ready to supply the entire server lineup for quite some time.
Foveros works fine for PVC, and it’s quite a bit larger than reticle size. You are right about the use of EMIB to connect multiple base dies. But communication doesn’t always have to traverse this EMIB to reach another Foveros die; the “home base die” can handle a lot of memory and I/O requests. In that case it will be faster than always going via EMIB to another chiplet for memory and I/O, assuming these chips are disaggregated.

Foveros Direct is going mainstream in 2023.

We've seen Redwood Cove on Intel 4. It's more like 40% denser.
Intel 3 has denser high performance libraries still. I think I saw 9-10% somewhere.

Absolutely not. Intel doesn't even have a platform with DDR5 or PCIe 5.0 today. CXL 2.0 and HBM3 might be possible, but DDR6, PCIe6, etc will probably need to wait till 2026, at least from Intel.
You are probably right about DDR6, but as far as I know, the PCIe 6 spec has already been finalized. There is huge market pressure to make accelerators faster, and PCIe bandwidth is often a bottleneck. So I don’t doubt that both AMD and Intel will have PCIe6 by the end of ‘24.
 

ashFTW

Senior member
Sep 21, 2020
252
178
96
I'd hardly think that Foveros + EMIB is better than EMIB alone in terms of overhead. And they have years before high-NA hits the market. Can probably fit in two server gens first. In the meantime, to amortize EMIB overhead, bigger tiles make the most sense.
Intel can’t make more than, say, 90-core chips (60 SPR cores plus a 50% density advantage with Intel 3) using just EMIB (ala SPR). AMD will have 2-3x more cores by the end of ‘24. Foveros is the only option to stay competitive. And longer term, Intel also needs to redesign and shrink down their P-cores, while simultaneously making the E-cores more performant without too much bloat.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,518
3,567
136
And Sierra Rapids, assuming that 4 new E-cores can still fit in the area of 1 new P-core, could touch 512 cores for the highest SKU.
Relative to the P-core block, the Crestmont cluster is slightly smaller than in the previous generation. Core to core, without the L2 caches, the P-core to E-core area ratio is 3.48 in the Alderlake generation. It gets to 3.58 in Meteorlake.
 

uzzi38

Platinum Member
Oct 16, 2019
2,361
4,964
116
AMD has been at this for much longer than Intel and Nvidia. Trento is an APU already.
Eh, not really. Trento is just Milan with a new IOD; the MI250Xs that sit alongside it are very much their own separate boards. MI300 should be the first one from AMD with CPU + GPU on the same package, if rumours are to be believed.
 

mikk

Diamond Member
May 15, 2012
3,682
1,560
136
No I never said that! Where did you get that nonsense from? I said most Meteorlake models will NOT launch at Q2 2023, and that the M version will

Most people expected desktop earlier than that. I was NOT one of them. Stop twisting things..

Early this year you claimed Meteor Lake is nearly 2 years away and you were talking about the mobile version: https://forums.anandtech.com/threads/intel-current-and-future-lakes-rapids-thread.2509080/page-593#post-40668592
 

igor_kavinski

Diamond Member
Jul 27, 2020
6,115
3,771
106