Discussion Intel current and future Lakes & Rapids thread

Page 610

repoman27

Senior member
Dec 17, 2018
342
488
136
Battery life is a major concern for the mobile consumer space. Advantages in node power usage could play a very large part here, especially as 80mm is a lot of space in N3 terms. Apple just showed high end IGPs can be a big selling point.
Unfortunately, Apple also showed pretty clearly that they were Intel's only customer that actually wanted to pay for high-end IGPs.

I think this is converging in the right direction. Something like 2 TB4 + 4 or 8 lanes of PCIe for the M segment, maybe? And then you could make it longer to add more for P? I could see that. Definitely wouldn't want DDR on there, even if it'd fit. Too many hops.
That was just a size comparison; I never expected Intel would put a memory interface on its own tile. And looking over that program again, it appears that MTL-M/P and ARL-P might share a common SoC tile, but there is probably an IOE-M for MTL-M and a slightly different IOE-P shared by MTL-P and ARL-P. I know those dies are tiny, but it still boggles my mind that Intel would be willing to go through multiple tape outs, integrations, validations, etc. How can that possibly be cost effective? Also note that as of the final week of Q1 last year, B0 steppings of the MTL SoC-M/P and IOE-P were expected before the end of Q1'22.

The one thing I can't figure out is what ADM is referring to... Any ideas on that? It doesn't seem to be the base tile, but that isn't referenced anywhere else either, as far as I can see.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
The one thing I can't figure out is what ADM is referring to... Any ideas on that? It doesn't seem to be the base tile, but that isn't referenced anywhere else either, as far as I can see.
I think the prevailing theory is that ADM is some kind of extra cache. Perhaps on an active version of the base die? Something like the "In-Package Memory here"?

[Image: purpose_built_client-100854350-orig.jpg]
 

repoman27

Senior member
Dec 17, 2018
342
488
136
I think the prevailing theory is that ADM is some kind of extra cache. Perhaps on an active version of the base die? Something like the "In-Package Memory here"?
Yeah, it definitely feels like something to do with providing the GPU tile with more memory bandwidth. Maybe they'll try to bolt on a GDDR6 interface. Well, something else to ponder I suppose.

I kinda just scrubbed through the Investor Meeting presentation, but it seemed like they were talking about integrating their Movidius VPU IP (or something like it) into MTL and future platforms as an AI play. That looks like it will probably land in the SoC tile though.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
ADM might be ADvanced Memory?

@Exist50 Back to Intel release cadence.

So you said Meteorlake is a small gain right? And Arrowlake gets the big one? But that means they have two small gains, because the next is Raptorlake before Meteorlake. So we have two years of small gains?

If Granite Rapids is using the same cores as Arrowlake, again that's troubling. Because Pat himself said 10-12%. So if Raptorlake is small gain, and Meteorlake is, and Arrow is too, then Lunar is as well, meaning 4 years of 10% gains.

Either that's what's really happening, or Granite Rapids is using something that had to be modified from the Lion Cove core in Arrowlake.

And saying Lion Cove is a big gain when we'll have had two small gains is not particularly impressive, unless the gains are exceptional (40%, for example).

Not all bad news though. Because the process is being pulled in big time. Just as Pat said it would work out.

10-10-30 is pretty much same as 10-20-10. I am betting the plans have changed quite a bit. No longer Meteorlake using DG2. And Granite Rapids gets the big Lunar gains plus the 10-12%.

I'm starting to believe in their recovery plan. We've talked about needing a proper CEO, and there's no better one than Gelsinger. And I think he's proving it. We talk about how Intel was suffering because of brain drain and the incompetent and stupid management? But when the opposite happens it results in no change?

You are not going to see instantaneous huge changes like when Otellini was CEO and he introduced Core 2. He probably had some hand in making things better, but lots of things existed before.

Consistent execution over multiple years is what's needed, and yes I know that's boring. But that's what you need for leadership.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
If Granite Rapids is using the same cores as Arrowlake, again that's troubling. Because Pat himself said 10-12%. So if Raptorlake is small gain, and Meteorlake is, and Arrow is too, then Lunar is as well, meaning 4 years of 10% gains.

It seems Intel really loves high frequencies, but don't higher frequencies make chasing IPC gains more difficult? If your target clock speed is 5 GHz, it's probably going to be harder to hit a certain IPC gain than if the clock speed were 3.5 GHz. That said, high frequencies seem to make IPC gains more potent and vice versa, as they go hand in hand.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
So you said Meteorlake is a small gain right? And Arrowlake gets the big one? But that means they have two small gains, because the next is Raptorlake before Meteorlake. So we have two years of small gains?
Yes, that's exactly the problem, as I see it. It would have been ok, perhaps, if Arrow Lake were a 2023 product, but 2024? Not good enough.

If Granite Rapids is using the same cores as Arrowlake, again that's troubling. Because Pat himself said 10-12%. So if Raptorlake is small gain, and Meteorlake is, and Arrow is too, then Lunar is as well, meaning 4 years of 10% gains.
I've said it before, but I'm increasingly convinced that Arrow Lake and Lunar Lake are much more like contemporaries than successive generations. Good odds they both get branded as 15th gen.

That said, I don't think the small gains (from an IPC perspective) are 10% either. I'm thinking we're looking at something like 5-5-10 (maybe 5-5-15) rather than 10-10-30, and that is a problem. Process will help, surely, but those kinds of IPC gains just aren't enough to keep up.
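To put rough numbers on how far apart those scenarios end up, here is a minimal sketch that compounds each sequence of per-generation gains. It assumes the gains simply multiply from one generation to the next, which is a simplification; `compound_gain` is a made-up helper name, and the 10-20-10 sequence is the other one floated earlier in the thread.

```python
from math import prod


def compound_gain(percent_steps):
    """Cumulative gain (as a fraction) from successive percentage gains.

    Assumes each generation's gain multiplies on top of the previous one.
    """
    return prod(1 + p / 100 for p in percent_steps) - 1


# Per-generation IPC gain scenarios mentioned in the thread (percent).
scenarios = {
    "5-5-10": [5, 5, 10],
    "5-5-15": [5, 5, 15],
    "10-10-30": [10, 10, 30],
    "10-20-10": [10, 20, 10],
}

cumulative = {name: compound_gain(steps) for name, steps in scenarios.items()}

for name, gain in cumulative.items():
    print(f"{name}: {gain:.1%} cumulative")
```

Under that assumption, 5-5-10 compounds to roughly 21% over three generations while 10-10-30 compounds to roughly 57%, so the gap between the two readings is large.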

I am betting the plans have changed quite a bit. No longer Meteorlake using DG2.
Really don't think they're using Battlemage/Elasti IP for MTL. All the rumors/leaks say 12.7, and their GPU team doesn't seem to be on a yearly cadence. Plus, more churn is the last thing MTL needs.

And Granite Rapids gets the big Lunar gains plus the 10-12%.
See, I interpreted 10-12% as the gain from Lion Cove, which is why I'm concerned. I'm hoping that number is iso-power, iso-process or something, and thus we can expect somewhat higher IPC gains, but I'm a poor optimist.

We've talked about needing a proper CEO, and there's no better one than Gelsinger. And I think he's proving it. We talk about how Intel was suffering because of brain drain and the incompetent and stupid management? But when the opposite happens it results in no change?
It's not been enough time. Gelsinger can't fundamentally change the pieces Intel has to work with through '23, and likely '24 as well. At best, he can change what's done with those pieces in '24, and stop further bleeding/execution failures, but it'll be years yet before we can see an inflection from him.

You are not going to see instantaneous huge changes like when Otellini was CEO and he introduced Core 2.
When Royal comes out, thank Jim Keller and the team. No one else deserves it.
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
ADM might be ADvanced Memory?
Intel has used ADM in their Optane memory documents as App Direct Mode, basically letting the specifically written software directly access the Optane data as a form of cache. But, that seems odd for integrated graphics. So, I don't really think they mean App Direct Mode. I'm just tossing that out as a wildcard entry.
10-10-30 is pretty much same as 10-20-10. I am betting the plans have changed quite a bit. No longer Meteorlake using DG2. And Granite Rapids gets the big Lunar gains plus the 10-12%.
That said, I don't think the small gains (from an IPC perspective) are 10% either. I'm thinking we're looking at something like 5-5-10 (maybe 5-5-15) rather than 10-10-30, and that is a problem. Process will help, surely, but those kinds of IPC gains just aren't enough to keep up.
Intel has repeatedly stated (including recently in new slides at their Investor Day) that Intel 4 has a ~20% improvement in performance per watt, Intel 3 will be another 18% improvement, and Intel 20A will be another 15%. Now, I realize that you are talking about IPC not performance per watt. But, what gives you confidence that the actual results will be that much lower? Since we can safely assume the wattage limits won't drastically change, the difference between IPC and performance per watt is simply clock speed.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Intel has used ADM in their Optane memory documents as App Direct Mode, basically letting the specifically written software directly access the Optane data as a form of cache. But, that seems odd for integrated graphics. So, I don't really think they mean App Direct Mode. I'm just tossing that out as a wildcard entry.

Don't know where you got the info that it's like a cache. App Direct means it can use Optane in native 256 byte access mode, rather than having to translate that to 4KB block mode, so you get the advantages of near DRAM like read latency and persistence.

From the wording it doesn't sound like Optane either. ADvanced or ADaptive memory is my guess.

When Royal comes out, thank Jim Keller and the team. No one else deserves it.

Yes, but the CEO is the enabler. They'd be looking at worse now if Krzanich was still CEO.
 

Doug S

Platinum Member
Feb 8, 2020
2,261
3,513
136
It seems Intel really loves high frequencies, but don't higher frequencies make chasing IPC gains more difficult? If your target clock speed is 5 GHz, it's probably going to be harder to hit a certain IPC gain than if the clock speed were 3.5 GHz. That said, high frequencies seem to make IPC gains more potent and vice versa, as they go hand in hand.

If they hold frequency roughly constant, then the improvements that process delivers (which aren't much these days, but better than nothing) can be devoted to increasing IPC. Sure, if you get a 10% process bump and use it to go from 5 GHz to 5.5 GHz you won't see any IPC benefit, but Intel has basically been stuck at 5 GHz ish for the top end for many years. Not because they couldn't have gone to 6 GHz if they wanted, but presumably their modeling showed that overall performance (across all dimensions) is worse at that speed by whatever evaluation they use (i.e. however they weight raw performance vs performance per watt vs life expectancy of the chip, etc.)

On more than one occasion they've let frequency slide up and then done a new design with more IPC that lowers maximum frequency, then let it slide up again. I expect that to continue, unless new transistor types change the equations and make 6 GHz better than 5 GHz according to their weighting. Though it is also possible the equations could change in favor of 4 GHz (ish) being the new top end in the future (and if so I'd expect to see Apple designs retreat below 3 GHz, as they've got different weighting in their evaluation of "overall" performance than Intel does since they target a different market)
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Intel has repeatedly stated (including recently in new slides at their Investor Day) that Intel 4 has a ~20% improvement in performance per watt, Intel 3 will be another 18% improvement, and Intel 20A will be another 15%. Now, I realize that you are talking about IPC not performance per watt. But, what gives you confidence that the actual results will be that much lower? Since we can safely assume the wattage limits won't drastically change, the difference between IPC and performance per watt is simply clock speed.
Wait, what? Process perf/watt improvement is completely orthogonal to IPC. And the reason I'm focusing on IPC (or architecture in general) is that the competition can be presumed to be on an at least equivalent node for the foreseeable future. At least through 2024. That said, I haven't mentioned any intrinsic changes to frequency or power consumption that come with the architecture, but those matter too.

Yes, but the CEO is the enabler. They'd be looking at worse now if Krzanich was still CEO.
I guess, but in this case it seems more like him continuing something that started before he got there. I suppose that's worth something, but if we're going to give Gelsinger credit, it should be for things that he initiated and supported. Just my two cents.
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
Wait, what? Process perf/watt improvement is completely orthogonal to IPC.
Not quite. If you have more performance / watt, you can (A) increase clocks, (B) reduce power, (C) increase IPC, or (D) leave it on the table. Option (B) is not likely, power limits don't move that much from generation to generation. Option (D) is unlikely as Intel has been pushing every corner of performance regardless of the potential costs (such as pushing performance past what many of us would call the optimum for power usage). So you are left with increasing clocks or increasing IPC. Your numbers for increasing IPC therefore imply clock frequencies. I just want to know how you are getting your IPC estimates because that ties into what clocks can be utilized.

Otherwise you are arguing that they'd have a higher performing process and not increase performance.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Not quite. If you have more performance / watt, you can (A) increase clocks, (B) reduce power, (C) increase IPC, or (D) leave it on the table. Option (B) is not likely, power limits don't move that much from generation to generation. Option (D) is unlikely as Intel has been pushing every corner of performance regardless of the potential costs (such as pushing performance past what many of us would call the optimum for power usage). So you are left with increasing clocks or increasing IPC. Your numbers for increasing IPC therefore imply clock frequencies. I just want to know how you are getting your IPC estimates because that ties into what clocks can be utilized.

Otherwise you are arguing that they'd have a higher performing process and not increase performance.
You can increase IPC without a change in process at all, and by the same token, you can do a node shrink without changing IPC. By no twist of logic can you claim that "the difference between IPC and performance per watt is simply clock speed".
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
If they hold frequency roughly constant, then the improvements that process delivers (which aren't much these days, but better than nothing) can be devoted to increasing IPC. Sure, if you get a 10% process bump and use it to go from 5 GHz to 5.5 GHz you won't see any IPC benefit, but Intel has basically been stuck at 5 GHz ish for the top end for many years. Not because they couldn't have gone to 6 GHz if they wanted, but presumably their modeling showed that overall performance (across all dimensions) is worse at that speed by whatever evaluation they use (i.e. however they weight raw performance vs performance per watt vs life expectancy of the chip, etc.)

My point was that getting good performance out of high frequency CPUs necessitates other changes, like increasing bandwidth at the cache and system memory level, and lowering latency as much as possible. When I read the Chips and Cheese article for their Golden Cove deep dive, one of the recurrent themes was that Golden Cove is a bandwidth monster, but that seems to come at the cost of latency to some degree, which Intel tries to hide by utilizing various methods. I suppose the Intel architects thought that increasing bandwidth was more important for overall performance than latency?

When you compare how Golden Cove's performance scales with clock speed against Zen 3's, it's apparent that Golden Cove scales much better as the clock speed rises. This is something that AMD needs to rectify with Zen 4 if they want to compete with Raptor Lake at high clock speeds, i.e. 5 GHz and above.
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
You can increase IPC without a change in process at all, and by the same token, you can do a node shrink without changing IPC.
I agree 100%, but that has absolutely nothing to do with what I was talking about. I must not be writing in a way that conveys my message.
By no twist of logic can you claim that "the difference between IPC and performance per watt is simply clock speed".
Let me put it another way. Where is Intel putting the extra performance per Watt?
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
It seems Intel really loves high frequencies, and doesn't higher frequencies make chasing IPC gains more difficult? If your target clock speed is 5ghz, it's probably going to be harder to hit a certain IPC gain compared to if the clock speed was 3.5ghz. That said, high frequencies seems to make IPC gains more potent and vice versa as they go hand in hand.
If they hold frequency roughly constant, then the improvements that process delivers (which aren't much these days, but better than nothing) can be devoted to increasing IPC. Sure, if you get a 10% process bump and use it to go from 5 GHz to 5.5 GHz you won't see any IPC benefit, but Intel has basically been stuck at 5 GHz ish for the top end for many years. Not because they couldn't have gone to 6 GHz if they want, but presumably their modeling showed that overall performance (across all dimensions) is worse at that speed by whatever evaluation they use (i.e. however they weight raw performance vs performance per watt vs life expectancy of the chip, etc.)

On more than one occasion they've let frequency slide up and then do a new design with more IPC that lowers maximum frequency, then let it slide up again. I expect that to continue, unless new transistor types change the equations and make 6 GHz better than 5 GHz according to their weighting. Though it is also possible the equations could change in favor of 4 GHz (ish) being the new top end in the future (and if so I'd expect to see Apple designs retreat below 3 GHz, as they've got different weighting in their evaluation of "overall" performance than Intel does since they target a different market)
I agree 100%, but that has absolutely nothing to do with what I was talking about. I must not be writing in a way that conveys my message.

Let me put it another way. Where is Intel putting the extra performance per Watt?

Hopefully they use some of it to reduce PL2.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Let me put it another way. Where is Intel putting the extra performance per Watt?
Same as any other node shrink. Less power consumption at low to moderate loads, more performance at high loads. Maybe we can even hope they start getting boost power a little more under control.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
And how do they get more performance at high loads? Remember: you said IPC gains will be minimal and you said this has nothing to do with clock speeds.
I said that IPC has nothing to do with clock speed, but for that matter, the perf/watt figure does not directly translate to peak performance either.
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
I said that IPC has nothing to do with clock speed, but for that matter, the perf/watt figure does not directly translate to peak performance either.
So Intel will not be getting "more performance at high loads"?

Nothing you are saying is technically incorrect, but you seem to be missing the forest for the trees. Intel can get "boost power a little more under control" without IPC gains or without node gains. They can simply turn down the PL2 settings. That is more of a marketing feature and less of a technical one.

I'll ask again. Given that Intel claims that they are going to get more performance/watt, what do you think Intel will do with that? I personally think that wattage is already basically fixed. The next few nodes are likely to be set for close to 35 W, 65 W, and 125 W thermal design power for the desktop chips (especially for Raptor Lake that is supposed to be drop in replacement for Alder Lake, but probably true for the next generations too). They might vary slightly, but probably not much. Plus, the user can set power levels within reasonable limitations anyways, so Intel's default settings are not even set in stone.

So, assuming that those thermal design powers don't change, what will happen at a fixed 125 W across the generations in your opinion? I claim that if IPC isn't going up, then Intel will use that extra node performance to turn up the clock speeds. They don't have to, they could jam more cores in. But then you are getting more instructions per clock per core which seems to be against what you are saying.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
So Intel will not be getting "more performance at high loads"?

Nothing you are saying is technically incorrect, but you seem to be missing the forest for the trees. Intel can get "boost power a little more under control" without IPC gains or without node gains. They can simply turn down the PL2 settings. That is more of a marketing feature and less of a technical one.

I'll ask again, Intel claims that they are going to get more performance/watt. What do you think Intel will do with that? I personally think that wattage is already basically fixed. The next few nodes are likely to be set for close to 35 W, 65 W, and 125 W thermal design power for the desktop chips (especially for Raptor Lake that is supposed to be drop in replacement for Alder Lake, but probably true for the next generations too). They might vary slightly, but probably not much. Plus, the user can set power levels within reasonable limitations anyways, so Intel's default settings are not even fixed.

So, assuming that those thermal design powers don't change, what will happen at a fixed 125 W across the generations in your opinion? I claim that if IPC isn't going up, then Intel will use that extra node performance to turn up the clock.
Going back to the original comment, you were asking me why I expected IPC increases to be small despite the node improvements, right? Well I've answered that, no? I see no inherent connection between IPC and process shrinks, and think they'll just use the gains to more or less implement existing architectures better.
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
Going back to the original comment, you were asking me why I expected IPC increases to be small despite the node improvements, right? Well I've answered that, no? I see no inherent connection between IPC and process shrinks, and think they'll just use the gains to more or less implement existing architectures better.
No, you haven't answered it in the slightest. What does "implement existing architectures better" mean? The inherent connection is that clock speeds are allowed to change.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
No, you haven't answered it in the slightest. What does "implement existing architectures better" mean? The inherent connection is that clock speeds are allowed to change.
Yes. Changes in clock speed and power at a given speed. Not IPC. Does that clarify?
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
Yes. Changes in clock speed and power at a given speed. Not IPC. Does that clarify?
Let's try hypothetical numbers. I'll use 1 core to make it as easy as can be.
  • Suppose a hypothetical core calculates 1 billion instructions in 1 second at 1 GHz clock speed using 1 W of power. The IPC is (1 billion instructions) / (1 s) / (1 billion cycles / s) = 1 instruction / cycle.
  • Suppose performance / Watt increases 10%.
Now, you have several options, some of which include lowering the power. But in general, OEM builders like a set power limit so they don't have to completely redesign everything from the ground up. I said above that I assume the power is the same. Then since performance/Watt increased 10%, the chip is now performing 1.1 billion instructions in 1 s. There are various ways that this could be accomplished:

1A) That increase could come from IPC gains. It could now have an IPC of 1.1 instructions / cycle but keep clock speeds the same. Thus in 1 second and at 1 W, it calculates (1.1 instructions / cycle) * (1 s) * (1 billion cycles / second) = 1.1 billion instructions.

1B) That increase could come from clock gains. It could now have a clock rate of 1.1 GHz but keep IPC the same. Thus in 1 second and at 1 W, it calculates (1 instruction / cycle) * (1 s) * (1.1 billion cycles / second) = 1.1 billion instructions.

1C) It could be some combination in between. It could now have an IPC of 1.05 instructions / cycle and a clock rate of 1.0476 GHz. Thus in 1 second and at 1 W, it calculates (1.05 instructions / cycle) * (1 s) * (1.0476 billion cycles / second) = 1.1 billion instructions.

As you see, if the wattage is fixed, then IPC and clock speeds are directly linked.

Yes, there are other options, like lowering thermal design powers. But, that is a fundamental change that I was hoping you'd have explanations for if you think that clock speeds won't change and IPC changes minimally.
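The three options above can be checked with a short sketch. It only encodes the arithmetic of the hypothetical single core (throughput = IPC x clock, with power held at 1 W by assumption); the function names are made up for illustration and nothing here models a real chip.

```python
def instructions_per_second(ipc, clock_ghz):
    """Throughput in billions of instructions per second for one core."""
    return ipc * clock_ghz


def clock_needed(target_throughput, ipc):
    """Clock (GHz) required to hit a throughput target at a given IPC."""
    return target_throughput / ipc


# Baseline from the hypothetical: IPC of 1 at 1 GHz, using 1 W.
baseline = instructions_per_second(ipc=1.0, clock_ghz=1.0)

# Assumption: power stays at 1 W, so +10% perf/W means +10% throughput.
target = baseline * 1.10

# Each option fixes an IPC and solves for the clock that reaches the target.
options = {
    "1A": (1.10, clock_needed(target, 1.10)),  # all from IPC
    "1B": (1.00, clock_needed(target, 1.00)),  # all from clock
    "1C": (1.05, clock_needed(target, 1.05)),  # a mix of both
}

for label, (ipc, clock) in options.items():
    total = instructions_per_second(ipc, clock)
    print(f"{label}: IPC {ipc:.2f} x {clock:.4f} GHz = {total:.2f} billion instr/s")
```

With throughput and power both pinned, choosing an IPC determines the clock and vice versa, which is the link being argued for. It says nothing about whether a real design would keep power fixed.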
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Let's try hypothetical numbers. I'll use 1 core to make it as easy as can be.
  • Suppose a hypothetical core calculates 1 billion instructions in 1 second at 1 GHz clock speed using 1 W of power. The IPC is (1 billion instructions) / (1 s) / (1 billion cycles / s) = 1 instruction / cycle.
  • Suppose performance / Watt increases 10%.
Now, you have several options, some of which include lowering the power. But in general, OEM builders like a set power limit so they don't have to completely redesign everything from the ground up. I said above that I assume the power is the same. Then since performance/Watt increased 10%, the chip is now performing 1.1 billion instructions in 1 s. There are various ways that this could be accomplished:

1A) That increase could come from IPC gains keeping the clock speed the same. It could now have an IPC of 1.1 instructions / cycle but keep clock speeds the same. Thus in 1 second and at 1 W, it calculates (1.1 instructions / cycle) * (1 s) * (1 billion cycles / second) = 1.1 billion instructions.

1B) That increase could come from clock gains keeping the IPC the same. It could now have a clock rate of 1.1 GHz but keep IPC the same. Thus in 1 second and at 1 W, it calculates (1 instruction / cycle) * (1 s) * (1.1 billion cycles / second) = 1.1 billion instructions.

1C) It could be some combination in between. It could now have an IPC of 1.05 instructions / cycle and a clock rate of 1.0476 GHz. Thus in 1 second and at 1 W, it calculates (1.05 instructions / cycle) * (1 s) * (1.0476 billion cycles / second) = 1.1 billion instructions.

As you see, if the wattage is fixed, then IPC and clock speeds are directly linked.

Yes, there are other options, like lowering thermal design powers. But, that is a fundamental change that I was hoping you'd have explanations for if you think that clock speeds won't change and IPC changes minimally.
I think I might understand the confusion here. Take that 18% number for Intel 3. That's referring to intrinsic gain from the process, not from subsequent architectures that happen to use it. So architectural changes build on top of that number, not constitute it.
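As a rough illustration of "build on top of", here is a sketch that stacks a hypothetical architectural gain on the 18% process figure. It assumes the process gain is taken entirely as performance at fixed power and that the two gains multiply; both are simplifications, and `stacked_gain` is a made-up name. The 5% and 10% architecture figures are just the small-gain guesses from earlier in the thread.

```python
def stacked_gain(process_pct, arch_pct):
    """Combined gain (fraction) when an architectural gain is layered on a
    process gain, assuming the two multiply."""
    return (1 + process_pct / 100) * (1 + arch_pct / 100) - 1


# Intel's stated Intel 3 figure, treated here as the intrinsic process gain.
process_gain_pct = 18

# Hypothetical architectural (IPC) gains layered on top of it.
results = {arch: stacked_gain(process_gain_pct, arch) for arch in (0, 5, 10)}

for arch, total in results.items():
    print(f"process +{process_gain_pct}% with architecture +{arch}%: {total:.1%} total")
```

So under these assumptions a 5% architectural gain on top of the 18% process figure comes to roughly 24% combined, and the 18% itself is what you would get with no architectural change at all.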