Discussion Intel current and future Lakes & Rapids thread


Timmah!

Golden Member
Jul 24, 2010
1,419
630
136
Couldn't Intel use two 6+8 compute tiles on a single package for desktop?

IMO it would be a logical choice, and such a product (with 12P + 16E) could be competitive with a potential 24-core AMD Ryzen. 8 + 16 will not, as it seems it will more or less match the 16-core 7950X in MT performance. And 6+8 would obviously be significantly worse.
 

dullard

Elite Member
May 21, 2001
25,066
3,414
126
Couldn't Intel use two 6+8 compute tiles on a single package for desktop?
It is possible, yes. Advisable? I don't think so. Three main problems. (1) Intel is already probably stretched thin on their Intel 4 production capability. Doubling up on the CPU tile might not be possible due to that limitation. (2) There will be a lot of added latency between the tiles. That latency is possible to partly overcome but I don't think that is what Intel wants to see with this chip. (3) More P cores isn't advisable. If you had 28 cores in 125 W, then each core gets 4.5 W of power (and that assumes absolutely no power goes to the iGPU or the uncore). Maybe Intel could shift it so the P cores get ~6 W and the E cores get ~3 W, but even 6 W each would be a stretch. And at 6 W each, the E core is tied with or outperforms the P cores. So, why add more P cores that will perform worse?

 

coercitiv

Diamond Member
Jan 24, 2014
6,201
11,902
136
And at 6 W each, the E core is tied with or outperforms the P cores. So, why add more P cores that will perform worse?
That's 6W for a 4x cluster, meaning ~1.5W per core. At 4.5W/core the P core offers better perf/watt.
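As a rough check on the per-core figures traded in this exchange, here is a minimal Python sketch. It assumes only an even split of the 125 W package across 28 cores (no iGPU or uncore power, as in the quoted post) and a 6 W budget for one 4-core E cluster; the function name is made up for illustration.

```python
def per_core_watts(budget_watts: float, cores: int) -> float:
    """Split a power budget evenly across a number of cores."""
    return budget_watts / cores


# 12 P + 16 E cores sharing the full 125 W package evenly.
even_split = per_core_watts(125, 12 + 16)

# A 4-core E cluster running at 6 W in total.
e_core_in_6w_cluster = per_core_watts(6, 4)

print(f"Even split across 28 cores: {even_split:.2f} W per core")
print(f"E core in a 6 W cluster:    {e_core_in_6w_cluster:.2f} W per core")
```

The even split comes out slightly under the 4.5 W quoted above, and the 6 W cluster works out to 1.5 W per E core.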
 

Timmah!

Golden Member
Jul 24, 2010
1,419
630
136
Is it not coming in about a year? Why would you assume they are stretched thin, and why does it matter for a product most likely a year away?

Regarding latency, I presume they are aware that moving away from a monolithic toward a tiled approach means a latency increase, but since they chose to go that way anyway, they are probably OK with that. And so was AMD, right? Tiles/chiplets/stacked CPUs are the future.

I can see the power draw with more cores being an issue, but who knows, if they can make 8+16 on the current process, perhaps they can do 12+16 on Intel 4.
 

DrMrLordX

Lifer
Apr 27, 2000
21,631
10,844
136
It is possible, yes. Advisable? I don't think so. Three main problems. (1) Intel is already probably stretched thin on their Intel 4 production capability. Doubling up on the CPU tile might not be possible due to that limitation.

The alternative is taping out an entirely-separate 8P+16E tile that may yield worse than the 6P+8E tile. They're in trouble either way. I would say that Intel has the benefit of Granite Rapids being delayed and moved to a different node, though it's probably more the delay than the node shift that will help there.

(2) There will be a lot of added latency between the tiles.

That is maybe a problem, but if Intel is supposed to have all this wonderful packaging tech, don't you think they should have overcome that by now? Or compensated for it somehow? AMD has been doing this for years.

(3) More P cores isn't advisable. If you had 28 cores in 125 W, then each core gets 4.5 W of power (and that assumes absolutely no power goes to the iGPU or the uncore). Maybe Intel could shift it so the P cores get ~6 W and the E cores get ~3 W, but even 6 W each would be a stretch. And at 6 W each, the E core is tied with or outperforms the P cores. So, why add more P cores that will perform worse?

Intel should be able to control voltages and clockspeeds well enough that a 12P+16E part should perform just as well in a <=8 thread task as a hypothetical 8P+16E part, and overall better in a >= 24 thread task. Yes, interconnect power might go up when fully utilized, so that is a possible issue. In any case, as long as Redwood Cove is anything like Golden Cove or Raptor Cove, it should gain efficiency by reducing clocks and volts a little bit. Having more P cores is not a bad thing.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Intel would likely do far, FAR better doing two separate tiles: one for just P cores, with process tweaks for clock speed, and another with just E cores, with process tweaks for density and power usage improvements. Just doing a relaxed, high-speed P core tile with 8 cores would be enough there, then a high-density 32-core E core tile; maybe they could fit 40 E cores there in a density-optimized configuration. That would be 8 high-speed threads, 8 HT threads, and 32-40 threads on the E cores. That neatly matches the 48 threads that a 24-core AMD processor would have, with higher overall throughput assuming that everything else is done right.
 

Timmah!

Golden Member
Jul 24, 2010
1,419
630
136
Was Arrow Lake not rumored to be 8 + 32? Maybe it will be exactly the way you propose.
 

dullard

Elite Member
May 21, 2001
25,066
3,414
126
Was Arrow Lake not rumored to be 8 + 32? Maybe it will be exactly the way you propose.
Yes, that was the rumor. 8 P + 32 E would take roughly the same die area as 12 P + 16 E. But 8 P + 32 E (48 threads) would be faster in highly multi-threaded tasks than 12 P + 16 E (40 threads).

Let's give the P cores a benchmark that highlights the P-core strength and see what happens.
Rough math assumptions:
  • We'll use the Chips and Cheese libx264 Transcode chart since that favors P cores over E cores.
  • Assume a 125 W processor load.
  • Assume a 10 W uncore with all the various tiles (IO, iGPU, SOC, etc). This is low especially if the iGPU is used, but it gives the most advantage to the P core by giving it more power.
  • Assume that each P core is given double the power of each E core. (Feel free to change this assumption and recalculate as you wish).
12 P + 16 E Math:
  • Each block of 4 P cores uses 23 W and performs 7.7 frames/s according to the Chips and Cheese benchmark.
  • Each block of 4 E cores uses 11.5 W and performs 4.5 frames/s according to the Chips and Cheese benchmark.
  • Total power = 10 W (uncore) + 3 * 23 W (P cores) + 4 * 11.5 W (E cores) = 125 W
  • Total performance: 3 * 7.7 frames/s + 4 * 4.5 frames/s = 41.1 frames/s
8 P + 32 E Math:
  • Each block of 4 P cores uses 19.17 W and performs 6.8 frames/s according to the Chips and Cheese benchmark.
  • Each block of 4 E cores uses 9.58 W and performs 4.1 frames/s according to the Chips and Cheese benchmark.
  • Total power = 10 W (uncore) + 2 * 19.17 W (P cores) + 8 * 9.58 W (E cores) = 125 W
  • Total performance: 2 * 6.8 frames/s + 8 * 4.1 frames/s = 46.4 frames/s
The 8 P + 32 E wins over 12 P + 16 E on the benchmark that preferred P cores. Even if you subtract a few percent for Amdahl's law, 8 P + 32 E still wins.
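The arithmetic above can be reproduced with a short Python sketch. The package power, uncore power, and 2:1 P-to-E power split are the assumptions stated in the post; the frames/s per block are the values the post reads off the Chips and Cheese chart, hard-coded here rather than derived. The class and function names are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass

PACKAGE_W = 125.0          # assumed total processor load (from the post)
UNCORE_W = 10.0            # assumed uncore power (from the post)
P_TO_E_POWER_RATIO = 2.0   # each P core gets double the power of each E core


@dataclass
class Config:
    name: str
    p_blocks: int        # number of 4-core P blocks
    e_blocks: int        # number of 4-core E blocks
    p_block_fps: float   # frames/s per P block, as quoted in the post
    e_block_fps: float   # frames/s per E block, as quoted in the post


def block_power(cfg: Config) -> tuple[float, float]:
    """Return (P block watts, E block watts) under the stated power split."""
    core_budget = PACKAGE_W - UNCORE_W
    e_block_w = core_budget / (P_TO_E_POWER_RATIO * cfg.p_blocks + cfg.e_blocks)
    return P_TO_E_POWER_RATIO * e_block_w, e_block_w


def total_fps(cfg: Config) -> float:
    """Sum the quoted per-block throughput over all blocks."""
    return cfg.p_blocks * cfg.p_block_fps + cfg.e_blocks * cfg.e_block_fps


configs = [
    Config("12P+16E", p_blocks=3, e_blocks=4, p_block_fps=7.7, e_block_fps=4.5),
    Config("8P+32E", p_blocks=2, e_blocks=8, p_block_fps=6.8, e_block_fps=4.1),
]

for cfg in configs:
    p_w, e_w = block_power(cfg)
    print(f"{cfg.name}: P block {p_w:.2f} W, E block {e_w:.2f} W, "
          f"total {total_fps(cfg):.1f} frames/s")
```

Changing `P_TO_E_POWER_RATIO` recomputes the per-block power, but the frames/s values would then need to be re-read from the chart at the new power levels, so the totals are only valid for the 2:1 split used in the post.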
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
It is true that Intel tries for about a 12 month cadence. But, there are very notable exceptions. There are timeline slips and a few odd chips that break up that timeline.
Desktop Chip           Generation   Months As Top Generation
Nehalem                             14.2
Westmere                            12.1
Sandy Bridge           2            15.6
Ivy Bridge             3            13.1
Haswell                4            12.0
Haswell Refresh        4            12.0
Broadwell              5            2.1
Sky Lake               6            12.8
Kaby Lake              7            13.2
Coffee Lake            8            12.5
Coffee Lake Refresh    9            10.1
Comet Lake             10           19.3
Rocket Lake            11           7.2
Alder Lake             12           10.9? Estimated
Haswell Refresh wasn't even called a new generation, so you could argue there was a 24 month gap.
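To put a number on the "about a 12 month cadence" claim, the figures in the quoted table can be summarized with a few lines of Python. This is only a sketch: the values are copied from the table as posted, and the Alder Lake entry is the poster's own estimate.

```python
from statistics import mean, median

# Months each desktop generation spent as the top generation,
# copied from the quoted table (Alder Lake is an estimate).
months_as_top = {
    "Nehalem": 14.2,
    "Westmere": 12.1,
    "Sandy Bridge": 15.6,
    "Ivy Bridge": 13.1,
    "Haswell": 12.0,
    "Haswell Refresh": 12.0,
    "Broadwell": 2.1,
    "Sky Lake": 12.8,
    "Kaby Lake": 13.2,
    "Coffee Lake": 12.5,
    "Coffee Lake Refresh": 10.1,
    "Comet Lake": 19.3,
    "Rocket Lake": 7.2,
    "Alder Lake": 10.9,
}

values = list(months_as_top.values())
shortest = min(months_as_top, key=months_as_top.get)
longest = max(months_as_top, key=months_as_top.get)

print(f"Mean:   {mean(values):.1f} months")
print(f"Median: {median(values):.1f} months")
print(f"Shortest: {shortest} ({months_as_top[shortest]} months)")
print(f"Longest:  {longest} ({months_as_top[longest]} months)")
```

The mean and median both land close to 12 months, with Broadwell and Comet Lake as the obvious outliers.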

IPC-wise, there are no gains from Skylake to Comet Lake; both use the same uArch built on the same process. Intel kept adding speed and cores to an outdated design.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
Is it not kinda downgrade from 8 + 16 Raptor Lake?

Oh boy...where to start...is a 5 GHz P4 Extreme faster than a 4 GHz Core i3? (HINT: the answer is no, kid.) It all comes down to absolute performance, or rather, perf/watt when comparing fixed power and/or heating targets. Note that power and heat are two different things.

It sure looks like Raptor Lake is going to be about 30% better in perf/watt, and more like 50% for PL4 workloads (burst), compared to Alder Lake. At the same time AMD will have only a tiny perf/W advantage, with 230W PPT, compared to 253W PL2 for Raptor Lake, with Raptor Lake being a bit faster. AMD seems to have made a great processor, but not to Zen 3’s level, unfortunately. Had they given us 24-32 cores, it would have killed; as it stands, it is a bit underwhelming for those who don't need AVX-512. Going on rumor anyway. I suspect it's going to be quite a good desktop part.

Meteor Lake and beyond seems too speculative at this point, and maybe more focused on power efficiency for mobile.

I really want to know what you are smoking. You mention PL4 "burst" workloads and that sets off alarms. An Intel chip should never hit PL4. PL2 is "burst". PL4 is at the very limit of an Intel chip's capability to operate. If an Intel chip goes beyond PL4 (or even hits it), automatic shutdown will occur. Note that under normal operating circumstances, even when overclocked, Intel chips should never come close to PL4. Period.

Raptor Lake will be around 20-30% faster than Alder Lake in multicore workloads (depending on the workload), and a small percentage (under 10%) faster than Alder Lake in single-core workloads. Remember, you heard it here first. Don't bother arguing, just quote my post on launch if I'm wrong.

Zen 4, with 16 cores, will easily be able to deal with Raptor Lake, mind you, and according to what we know, the Zen 4 chip demoed by AMD did NOT use 230W, so the PPT for that chip would likely be lower, but alas, I want to keep this post about Intel.

One day Intel will realize what made the Core/Core 2 chips special: hint, it was perf/watt. Until they rediscover that, they won't be competitive with AMD, and according to the latest earnings reports, they have about 2-3 years until AMD technically becomes a larger company...and AMD is fabless.

Intel needs to stop mis-stepping and start stepping. I remember when Core came out (with little fanfare), then Core 2 came out. It was Intel's exit strategy from the power-hungry GHz race. A 2 GHz chip (sub-2 GHz, in some cases) quickly and easily beat out the hottest and fastest Pentiums. Until that moment, Intel kept releasing faster and hotter chips, while routinely making missteps, some of which required actual recalls. AMD consistently beat them with lower-cost, yet superior chips.

Intel needs another Core 2 in order to be competitive. Meteor Lake may be that chip. We will see. Raptor Lake is absolutely not that chip. It will provide a decent upgrade for those who buy Intel, but it won't change the status quo at all.

This message was NOT brought to you by an AMD fanboy. Just a guy who has been in tech since well before the 8088 existed.

Speaking of the Core 2 Duo (or similar AMD chips, for that matter) did you guys watch the Gamers Nexus livestream the other day?
 

pakotlar

Senior member
Aug 22, 2003
731
187
116
A rude way of asking “what is PL4”. It's the max power draw for a 10 ms duration. Ref: https://www.igorslab.de/en/power-co...ptor-lake-s-cpus-in-comparison-exclusively/3/

The internet brings out the worst in people.
 

coercitiv

Diamond Member
Jan 24, 2014
6,201
11,902
136
A rude way of asking “what is PL4”. It's the max power draw for a 10 ms duration.

The internet brings out the worst in people.
While @eek2121 over-reacted in his reply, his attitude is something most of us on the forum go through sooner or later considering how misinformation creeps in.

First, let's start with a better source for defining PL4 and PL3:
https://edc.intel.com/content/www/u...heet-volume-1-of-2/008/package-power-control/
  • Power Limit 3 (PL3): A threshold that if exceeded, the PL3 rapid power limiting algorithms will attempt to limit the duty cycle of spikes above PL3 by reactively limiting frequency. This is an optional setting
  • Power Limit 4 (PL4): A limit that will not be exceeded, the PL4 power limiting algorithms will preemptively limit frequency to prevent spikes above PL4.

So, just as @eek2121 warned you, PL4 is not intended to be used as an operating state but rather as a hard threshold at which the CPU must immediately lower clocks. There is no such thing as PL4 workloads and there is no point in attempting to define or discuss perf/watt during a PL4 excursion. AFAIK the relevance of PL3/PL4 is in how these spikes affect the motherboard VRM and the PSU.
 

Timmah!

Golden Member
Jul 24, 2010
1,419
630
136
Oh boy...where to start...is a 5 GHz P4 Extreme faster than a 4 GHz Core i3? (HINT: the answer is no, kid.) It all comes down to absolute performance, or rather, perf/watt when comparing fixed power and/or heating targets. Note that power and heat are two different things.

So in other words, you think that Raptor Lake -> Meteor Lake will be akin to Pentium 4 -> Nehalem?
 

DrMrLordX

Lifer
Apr 27, 2000
21,631
10,844
136
Dylan Patel/Skyjuice may be too kind to Intel, if Intel can't actually launch Sapphire Rapids until next year. Though that rumour is as-of-yet unconfirmed, I guess?
 

pakotlar

Senior member
Aug 22, 2003
731
187
116
While @eek2121 over-reacted in his reply, his attitude is something most of us on the forum go through sooner or later considering how misinformation creeps in.

First, let's start with a better source for defining PL4 and PL3:
https://edc.intel.com/content/www/u...heet-volume-1-of-2/008/package-power-control/


So, just as @eek2121 warned you, PL4 is not intended to be used as an operating state but rather as a hard threshold at which the CPU must immediately lower clocks. There is no such thing as PL4 workloads and there is no point in attempting to define or discuss perf/watt during a PL4 excursion. AFAIK the relevance of PL3/PL4 is in how these spikes affect the motherboard VRM and the PSU.

If you read the article and the Intel EDC link, PL4 has a duration. That duration is 10 ms. It is absolutely the burst power state on small time scales.

Jul 27, 2020
16,300
10,332
106
Regarding PL4, it is disabled by default. Even when enabled, it will work in 10ms bursts as long as the CPU isn't melting. So CPU boosts for 10ms and then how long before it can be cooled for PL4 boosting again? You would likely need a 360mm liquid cooler. Not something the average user is going to be using.
 

pakotlar

Senior member
Aug 22, 2003
731
187
116
Defaults are not respected on many mobos, but that was not my point. My point was that, conditional on PL4 being used, Raptor Lake will have something like 1.3/0.87 = 1.49x Alder Lake's perf/watt in that power state. At PL2 it's around 1.3/1.04 = 1.25x Alder Lake's perf/watt.
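The perf/watt ratios here are just a performance ratio divided by a power ratio, both relative to Alder Lake. A minimal Python sketch, taking the poster's inputs at face value (1.3x performance, 0.87x power at PL4, 1.04x power at PL2); the function name is hypothetical:

```python
def relative_perf_per_watt(perf_ratio: float, power_ratio: float) -> float:
    """Perf/watt of the new chip relative to the old one.

    Both inputs are ratios of new to old (e.g. 1.3 means 30% faster).
    """
    return perf_ratio / power_ratio


# Inputs as stated in the post, relative to Alder Lake.
pl4 = relative_perf_per_watt(perf_ratio=1.3, power_ratio=0.87)
pl2 = relative_perf_per_watt(perf_ratio=1.3, power_ratio=1.04)

print(f"PL4: {pl4:.2f}x Alder Lake perf/watt")
print(f"PL2: {pl2:.2f}x Alder Lake perf/watt")
```

With these inputs, dividing 1.3 by 0.87 gives about 1.49, and dividing 1.3 by 1.04 gives 1.25.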

None of that is controversial based on the current ES3 performance, which may be a bit worse than production. People seem to want to argue it anyway 🤷‍♀️
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
I think the sad part, depending on how you look at it, is that Intel doesn't need a new Conroe to beat AMD in efficiency. Imagine a chip with the same topology as AMD (one IO + two compute), but instead of 2x8 Zen 3 cores, you had 2x8 Golden Cove cores? Or one 8c GLC + one 64c GRT die? That chip could use a ton of power if you wanted to push it to the very limits, but would also be very efficient at lower power. Of course, one can argue that the 16+ core consumer market isn't very large today, and thus doesn't justify the effort, but while we're on the topic of making these top-to-top comparisons, might as well entertain the thought.

Now Intel's left fighting off 16x gen N+1 cores from AMD, with a full node advantage, but only dedicating 14 big cores worth of space. That may be a good product from an economic perspective, but is clearly going to be a stretch in terms of high end performance.

Ultimately, I do think we'll get a new Conroe out of Intel in the form of Royal. But it's interesting to imagine what would happen if that were to be preceded, or even accompanied, by a topology change.
 

moinmoin

Diamond Member
Jun 1, 2017
4,952
7,661
136
I think the sad part, depending how you look at it, is that Intel doesn't need a new Conroe to beat AMD in efficiency.
What kind of efficiency are you talking about? It doesn't seem to be power. Pat himself stated that Intel is behind TSMC in process perf/watt efficiency and intends to catch up by 2024.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
What kind of efficiency are you talking about? It doesn't seem to be power. Pat himself stated that Intel is behind TSMC in process perf/watt efficiency and intends to catch up by 2024.
Power, yes. But I was talking about from an iso-process kind of view. Obviously Raptor Lake will be less efficient than Raphael, but thinking longer-term, AMD has been lagging TSMC's latest by a year or two. That works quite well when Intel lags by the same or worse, but seems rather risky to rely on going forward.
 

Ajay

Lifer
Jan 8, 2001
15,451
7,862
136
My guess is never. Apple is more likely to do their own fab than AMD. It takes huge volume to make it worthwhile.
Well, it's still a 0% probability. A fab is many orders of magnitude more complex than putting together a top-notch silicon design team.