Discussion P vs E cores in Raptor Lake CPUs - determining optimal number of cores

Kocicak

Golden Member
Jan 17, 2019
1,167
1,223
136
Following the discussion of how many P and E cores a CPU should have, I measured the performance of these cores in an Intel 13700K at 3400 MHz, using Cinebench R23 at maximal load as the workload. Here are the results. I also estimated from a die shot that an E core has 1/4 of the area of a P core.

13700K PE core calcul.png

You can see that an E core has roughly half the performance of a P core in a quarter of the area. The conclusion is that to increase performance within a given area, you keep adding E cores.

Unfortunately, the E core is less efficient than the P core, so adding E cores increases power draw and decreases efficiency.

You can see in the table above that if we replaced all E cores in an 8+16 CPU with 4 P cores, we would lose 30% of the performance.
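As a rough check of this arithmetic, here is a minimal sketch that expresses throughput in P-core equivalents. The ratio `e_perf` (E-core performance relative to a P core) is an assumed parameter, not a value read from the table, and the function names are made up for illustration. It also assumes the workload scales perfectly across cores.

```python
def throughput(p_cores, e_cores, e_perf):
    """Multithreaded throughput in P-core equivalents (perfect scaling assumed)."""
    return p_cores + e_cores * e_perf


def loss_when_swapping(e_perf, p=8, e=16, e_per_p_area=4):
    """Fraction of throughput lost when all E cores are replaced by the
    P cores that fit in the same area (4 E cores per P core, per the
    die-shot estimate)."""
    hybrid = throughput(p, e, e_perf)
    all_p = throughput(p + e // e_per_p_area, 0, e_perf)
    return 1 - all_p / hybrid


def perf_per_area_ratio(e_perf, e_area=0.25):
    """E-core performance per area relative to a P core."""
    return e_perf / e_area


for ratio in (0.50, 0.57):
    print(f"E = {ratio:.2f} P: loss {loss_when_swapping(ratio):.1%}, "
          f"perf/area advantage {perf_per_area_ratio(ratio):.2f}x")
```

With an E core at exactly half a P core the loss comes out at 25%. The 30% figure corresponds to an E core at roughly 0.57 of a P core, which still fits "roughly half".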

Below you can see the performance, power draw and efficiency of all P and E core combinations that fit in the area of 16 P cores:

core combinations.png
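A table like this can be regenerated with a short script. The sketch below is only an illustration: the per-core performance and power figures are placeholders chosen to match the qualitative picture (E core at roughly half the performance and slightly less efficient), not the measured values from the screenshots.

```python
from dataclasses import dataclass

AREA_BUDGET = 16   # die area in P-core units, as in the post
E_AREA = 0.25      # E-core area relative to a P core (die-shot estimate)

# Placeholder per-core figures, NOT the measured values from the table.
P_PERF, P_WATTS = 1.00, 1.00
E_PERF, E_WATTS = 0.55, 0.60


@dataclass
class Combo:
    p: int
    e: int
    perf: float
    watts: float

    @property
    def efficiency(self):
        return self.perf / self.watts


def combos():
    """All P/E splits that exactly fill the area budget."""
    out = []
    for p in range(AREA_BUDGET + 1):
        e = round((AREA_BUDGET - p) / E_AREA)  # fill the rest with E cores
        out.append(Combo(p, e,
                         p * P_PERF + e * E_PERF,
                         p * P_WATTS + e * E_WATTS))
    return out


for c in combos():
    print(f"{c.p:2d}P + {c.e:2d}E  perf={c.perf:5.2f}  "
          f"power={c.watts:5.2f}  eff={c.efficiency:.3f}")
```

Swapping in measured per-core scores and package power for the four constants would reproduce the real table for one workload and one pair of frequencies.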

So now you have a decision problem. Given
  • Available area and physically possible core combinations
  • Particular workload
  • Combination of P and E core frequencies
you need to find the optimal combination of cores under one of these decision-making scenarios:
  1. Reach maximal performance within the allowed power draw and not care about efficiency at all.
  2. Reach maximal performance within the allowed power draw and still maintain some minimal efficiency requirement.
  3. Reach maximal efficiency while still maintaining the required performance.
In this example we get three core-count combinations, some of which may not be physically possible on the silicon:

Core count optimum2.png
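The three scenarios can be written as simple selection rules over the list of combinations. This is a minimal sketch with placeholder per-core figures (not the measured ones); the limits passed in at the bottom are arbitrary examples, and all names are hypothetical.

```python
def candidates(area=16, e_area=0.25, p=(1.0, 1.0), e=(0.55, 0.60)):
    """Yield (p_cores, e_cores, perf, watts) for every split of the area.
    The per-core (perf, watts) pairs are placeholders, not measured data."""
    for n_p in range(area + 1):
        n_e = round((area - n_p) / e_area)
        yield (n_p, n_e,
               n_p * p[0] + n_e * e[0],
               n_p * p[1] + n_e * e[1])


def eff(c):
    """Efficiency of a candidate: performance per watt."""
    return c[2] / c[3]


def max_perf(max_watts, min_eff=0.0):
    """Scenarios 1 and 2: fastest combination within the power limit,
    optionally with a minimum efficiency requirement."""
    ok = [c for c in candidates() if c[3] <= max_watts and eff(c) >= min_eff]
    return max(ok, key=lambda c: c[2], default=None)


def max_eff(min_perf):
    """Scenario 3: most efficient combination that still reaches min_perf."""
    ok = [c for c in candidates() if c[2] >= min_perf]
    return max(ok, key=eff, default=None)


print("scenario 1:", max_perf(31)[:2])                # (6, 40)
print("scenario 2:", max_perf(31, min_eff=0.95)[:2])  # (10, 24)
print("scenario 3:", max_eff(20)[:2])                 # (12, 16)
```

With these made-up numbers the three scenarios pick three different core counts. A real version would also need to filter out combinations that cannot be laid out on the die.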


And all this is valid for ONE WORKLOAD and ONE COMBINATION of P and E core frequencies.

I do not think that it is possible to easily say that they should have swapped this many P for E cores or vice versa. There are too many variables at play.

But it should be obvious that E cores are very useful.
 

coercitiv

Diamond Member
Jan 24, 2014
6,800
14,873
136
Your approach reminds me of an old joke.

An engineer decides to analyze a grasshopper: he claps his hands and observes the grasshopper jumps away from him. He then removes one pair of legs from the insect, claps his hands again to see the grasshopper jump a shorter distance. He proceeds to remove another pair of legs, and this time the grasshopper no longer jumps.

Conclusion: grasshoppers go progressively deaf without their legs.

What you did in this OP was evaluate E core scaling with a method that is readily available but of unknown relevance. Just like the engineer in my little story, you showed little interest in the mechanics of the system under observation, and proceeded to test from a convenient angle. In the end you concluded the E cores are "obviously very useful" even though there was no honest attempt to understand the computing needs of the consumer. You simply gave them more pairs of legs and concluded their hearing should improve.

Consumer workloads are notoriously NOT embarrassingly parallel. They scale very well with ST performance and then core count tends to be seen as a threshold, past which gains encounter big diminishing returns. This sabotages the E core approach as increasing thread count for the sake of area efficiency can rapidly tank scaling. Intel is making a bet that future software will adapt and accommodate this massive increase in threads, some of us beg to differ. For us the hybrid approach looks more like a solution in search of a problem, one that conveniently solves Intel's monolithic approach drawbacks on the consumer side.
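The diminishing returns described here are what Amdahl's law predicts. The sketch below uses that standard model, not anything measured in this thread: the parallel fractions are illustrative, and it treats all cores as identical, which a hybrid CPU is not.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work
    can be spread across cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


for f in (0.50, 0.90, 0.99):
    row = ", ".join(f"{n} cores: {amdahl_speedup(f, n):.2f}x"
                    for n in (8, 16, 24))
    print(f"parallel fraction {f:.2f} -> {row}")
```

For a workload that is only half parallel, going from 8 to 24 cores barely moves the result, while a renderer-like workload near 0.99 keeps scaling.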

IIRC you recently bought a 13900K, could you share with us how the 16E cores are helping you scale performance in your daily workloads? What kind of "consumer" software are you running that is able to efficiently drive more than 8 cores?
 

Kocicak

Golden Member
Jan 17, 2019
1,167
1,223
136
I have been thinking about it during a walk, and here are my other thoughts:

At the same frequency, E core has a large advantage in performance per area over P core, and very slight disadvantage in performance per watt.

Today P cores are clocked higher than E cores to provide the highest possible single-thread performance. It is quite possible that, because it is forced to run at a higher frequency, the P core loses its small advantage in performance per watt, and the E core beats it in BOTH PERFORMANCE PER AREA AND PERFORMANCE PER WATT.
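One way to see why this can happen is the usual approximation that dynamic power scales with V² · f while performance scales with f, so performance per watt falls as voltage rises with clock. The sketch below is a toy model under that assumption: the linear voltage curve and the per-core `ipc` and `capacitance` values are invented for illustration and are not measurements of Raptor Cove or Gracemont.

```python
def voltage(freq_ghz, v0=0.70, slope=0.12):
    """Hypothetical linear voltage/frequency curve, in volts."""
    return v0 + slope * freq_ghz


def perf_per_watt(freq_ghz, ipc=1.0, capacitance=1.0):
    """Performance ~ ipc * f, dynamic power ~ capacitance * V^2 * f."""
    perf = ipc * freq_ghz
    power = capacitance * voltage(freq_ghz) ** 2 * freq_ghz
    return perf / power


def p_core(freq_ghz):
    return perf_per_watt(freq_ghz)


def e_core(freq_ghz):
    # Assumed: about half the performance for 60% of the power at equal clocks.
    return perf_per_watt(freq_ghz, ipc=0.55, capacitance=0.60)


print(f"both at 3.4 GHz: P={p_core(3.4):.3f}  E={e_core(3.4):.3f}")
print(f"P at 5.5, E at 4.2 GHz: P={p_core(5.5):.3f}  E={e_core(4.2):.3f}")
```

In this toy model the P core is slightly ahead at equal clocks and falls behind once it is pushed to a much higher frequency than the E core.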

The need for P cores, and their number, would then be determined only by the workload: how many threads it needs and how the computation is organised across those threads.

I am starting to believe that the best solution (for some workloads) may be to have one or two giant superpowerful cores and the rest small ones.
 

Kocicak

Golden Member
Jan 17, 2019
1,167
1,223
136
Consumer workloads are notoriously NOT embarrassingly parallel. They scale very well with ST performance and then core count tends to be seen as a threshold, past which gains encounter big diminishing returns. This sabotages the E core approach as increasing thread count for the sake of area efficiency can rapidly tank scaling.

IIRC you recently bought a 13900K, could you share with us how the 16E cores are helping you scale performance in your daily workloads? What kind of "consumer" software are you running that is able to efficiently drive more than 8 cores?

I bought it because it was accidentally put on sale before launch; I do not need so many cores. I was thinking about getting a 13600K, but was worried about the quality of the silicon, and ended up buying a 13700K. It now runs its P cores at 5600+5500 MHz. My 13900K can run its P cores at 6000+5700 MHz, so I am still considering keeping it for the higher performance, possibly even with some E cores disabled.
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
Intel is making a bet that future software will adapt and accommodate this massive increase in threads, some of us beg to differ. For us the hybrid approach looks more like a solution in search of a problem, one that conveniently solves Intel's monolithic approach drawbacks on the consumer side.
I very much doubt that Intel made hybrid for the future. They just had to match AMD's MT performance, because in the global village people making fun of you can turn very ugly very fast, and this is the cheapest and easiest way for them to do that.

Intel has more than a decade of experience with Xeon Phi and they know extremely well that software has no intention, and also no realistic means, of going very threaded.

But the hybrid architecture does help them learn a bunch about how future clustered or decentralized workflows will be handled. The future won't be one CPU in your system but one main CPU in your system and all the smaller CPUs around your house in all of your devices combined.
 

Kocicak

Golden Member
Jan 17, 2019
1,167
1,223
136
I just did the same thing with P cores running at 5500 MHz and E cores at 4200 MHz, and the P cores took a huge efficiency hit. They are now significantly less efficient than the E cores, and anybody building a CPU out of cores running at these frequencies should be damn sure why he is putting in as many P cores as he does.

1PE core calcul 55P 42E.png

core comninations 55 42.png

Core combi table.png
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
I just did the same thing with P cores running at 5500 MHz and E cores at 4200 MHz, and P cores took a huge efficiency hit, they are now significantly less efficient than E cores and somebody building a CPU out of cores running at this frequency should be damn sure, why is he putting in as many P cores as he does.
Doing God's Work...!
 

Hulk

Diamond Member
Oct 9, 1999
4,814
3,212
136
Does your 1 core result for the P core include hyperthreading?

How did you isolate the E core score? I assume subtraction for P+E result?
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
Does your 1 core result for the P core include hyperthreading?

How did you isolate the E core score? I assume subtraction for P+E result?
We need Intel Grand Ridge (24C/24T Raptormont) to be released to fully comprehend the full extent of the E core capabilities.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,105
136
We need Intel Grand Ridge (24C/24T Raptormont) to be released to fully comprehend the full extent of the E core capabilities.
I'm pretty sure Grand Ridge uses Crestmont. Sierra Forest will likely be our first/best look at Atom for server, and likewise Alder Lake N for client, though tbh, I think we have plenty enough data to derive conclusions today.
 

Wolverine2349

Senior member
Oct 9, 2022
441
144
86
Your approach reminds me of an old joke.



What you did in this OP was evaluate E core scaling with a method that is readily available but of unknown relevance. Just like the engineer in my little story, you showed little interest in the mechanics of the system under observation, and proceeded to test from a convenient angle. In the end you concluded the E cores are "obviously very useful" even though there was no honest attempt to understand the computing needs of the consumer. You simply gave them more pairs of legs and concluded their hearing should improve.

Consumer workloads are notoriously NOT embarrassingly parallel. They scale very well with ST performance and then core count tends to be seen as a threshold, past which gains encounter big diminishing returns. This sabotages the E core approach as increasing thread count for the sake of area efficiency can rapidly tank scaling. Intel is making a bet that future software will adapt and accommodate this massive increase in threads, some of us beg to differ. For us the hybrid approach looks more like a solution in search of a problem, one that conveniently solves Intel's monolithic approach drawbacks on the consumer side.

IIRC you recently bought a 13900K, could you share with us how the 16E cores are helping you scale performance in your daily workloads? What kind of "consumer" software are you running that is able to efficiently drive more than 8 cores?


I agree. I'm not crazy about the E-core approach. E cores only work for workloads that are embarrassingly parallel, which most consumer workloads are not. So just throwing in more and more P-cores is not the idea either; we do not need more and more P cores, though we should probably have at least 8 and maybe a few more. The focus should be on making the P cores faster, with better IPC, while keeping the core count at 8-12, or maybe as high as 16 P cores, for the next 30-40 years.

It's hard for me to ever see software developers being able to make programs so parallel that performance doubles when the number of same-type cores doubles. That is almost impossible. Fewer, extremely fast P cores, like 4 to 12 or even 8-16, made faster and faster over time, is the better way to go in the consumer space.
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
7,876
2,846
146
I think the P cores are still better for gaming in general, correct? Do the new E cores on Raptor Lake make any difference? For gaming, wouldn't it make sense just to have 8 to 12 P cores and no E cores?
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,105
136
The Cache Subsystem listed by Intel(on the leaked presentation) is Identical to Raptormont and built on Intel 7

View attachment 69748
That leak is very old. Importantly, it predates the renaming of "10nm" to "Intel 7". So that process there "7nm HLL+" is what we now call Intel 4. I would also guess that, just like Meteor Lake, they updated the core from Gracemont to Crestmont.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
That leak is very old. Importantly, it predates the renaming of "10nm" to "Intel 7". So that process there "7nm HLL+" is what we now call Intel 4. I would also guess that, just like Meteor Lake, they updated the core from Gracemont to Crestmont.
It would seem that the design remains mostly unchanged. 4MiB L2$. Can Gracemont have Icelake IPC?
 

poke01

Platinum Member
Mar 8, 2022
2,784
3,654
106
I think the P cores are still better for gaming in general, correct? Do the new E cores on Raptor Lake make any difference? For gaming, wouldn't it make sense just to have 8 to 12 P cores and no E cores?
Not everyone who buys CPUs games. Just disable the E-cores in the BIOS.
 

Kocicak

Golden Member
Jan 17, 2019
1,167
1,223
136
Does your 1 core result for the P core include hyperthreading?

How did you isolate the E core score? I assume subtraction for P+E result?

Yes, the cores do all they can do, incl. hyperthreading.

One approach could be to let the P+E cores run, then let the P cores run alone, subtract the latter from the former, and get the performance of the E cores.

What I did instead: I let the P cores run alone and divided the score by 8 to get the 1 P core result, then let 8 E cores run with 1 P core, subtracted the 1 P core result from that score, and divided by 8 to get the 1 E core result.

The advantage of my approach was lower heat stress and power consumption of the CPU during testing. But that may not be so important, and the results of both approaches could be similar. Unfortunately, in the first 3400 MHz test I did not run all the cores together at 3400 MHz at all, and in the second test I just copied the results from the first table. If you add the P and E core results from my method you get 32135, which is higher than what I got earlier that day running all cores (31799). That could mean that the second method reveals the performance of the cores better?
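The per-core calculation described above can be sketched as follows. The function names and the example scores passed to `per_core_scores` are made up for illustration; only 32135 and 31799 come from the post.

```python
def per_core_scores(p_only_score, one_p_plus_e_score, p_cores=8, e_cores=8):
    """Per-core scores from two runs: all P cores alone, and
    1 P core together with all E cores."""
    one_p = p_only_score / p_cores
    one_e = (one_p_plus_e_score - one_p) / e_cores
    return one_p, one_e


def relative_gap(sum_of_parts, all_core_run):
    """How far the summed per-cluster results sit above the all-core run."""
    return (sum_of_parts - all_core_run) / all_core_run


# Hypothetical scores, only to show the calculation.
one_p, one_e = per_core_scores(20000, 12500)
print(f"1 P core: {one_p:.0f}, 1 E core: {one_e:.0f}")

# The two totals quoted in the post.
print(f"gap: {relative_gap(32135, 31799):.2%}")
```

The gap between the two totals is about 1%. The post does not report repeated runs, so it is hard to tell whether that is a real effect (for example less contention when fewer cores are loaded) or run-to-run variation.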
 

coercitiv

Diamond Member
Jan 24, 2014
6,800
14,873
136
I think the P cores are still better for gaming in general, correct? Do the new E cores on Raptor Lake make any difference? For gaming, wouldn't it make sense just to have 8 to 12 P cores and no E cores?
That's what the counter-argument is all about: consistency in workloads, and arguably even consistency in Intel's designs. Hybrids push theoretical performance way up, but when it comes to running today's software... it gets complicated.

This issue can also be deduced from Intel's approach to power/cost scaling. For mobile Alder Lake they start with 2+8 for low power, shift gears to 4+8 for the classic laptop chip, and then move to 6+8 for high performance desktop replacements. The P cores are scaling the performance across segments. On the desktop Raptor Lake they prioritize P cores even more with the value chips, only the flagship model favors E cores again.

So what can we conclude using Intel's own actions: favoring E cores is worth at very low power & area and at very high power & area. P cores are Priority cores in the consumer space, you always try to fit enough of them to make sure consumer workloads run as fast as possible for the form factor. Today that magical number seems to be 8 according to Intel. Tomorrow... who knows?

Today we have a value champ from Intel in the shape of the 13600K with a 6+8 config. I'd be very curious to see how people would react if they were offered a 13600KP with an 8+0 config for the same price (roughly same area). Would average consumers still choose CPU rendering performance over consistent and more future-proof gaming performance?
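The "roughly same area" part can be checked against the OP's die-shot estimate of an E core at 1/4 of a P core. This is a back-of-the-envelope sketch that counts core area only and ignores cache, ring and other uncore area; the `e_perf` ratio is an assumed value, not a measurement.

```python
E_AREA = 0.25  # E-core area relative to a P core (OP's die-shot estimate)


def core_area(p_cores, e_cores):
    """Core area in P-core units; cache and uncore are ignored."""
    return p_cores + e_cores * E_AREA


def mt_throughput(p_cores, e_cores, e_perf=0.55):
    """Perfectly parallel throughput in P-core equivalents (e_perf is assumed)."""
    return p_cores + e_cores * e_perf


for name, (p, e) in {"6+8": (6, 8), "8+0": (8, 0)}.items():
    print(f"{name}: area {core_area(p, e):.1f}, "
          f"MT throughput {mt_throughput(p, e):.1f}")
```

Under these assumptions both configs take the same core area, and the 6+8 one only pulls ahead in workloads that can actually load all 14 cores.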
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
That's what the counter-argument is all about: consistency in workloads, and arguably even consistency in Intel's designs. Hybrids push theoretical performance way up, but when it comes to running today's software... it gets complicated.

This issue can also be deduced from Intel's approach to power/cost scaling. For mobile Alder Lake they start with 2+8 for low power, shift gears to 4+8 for the classic laptop chip, and then move to 6+8 for high performance desktop replacements. The P cores are scaling the performance across segments. On the desktop Raptor Lake they prioritize P cores even more with the value chips, only the flagship model favors E cores again.

So what can we conclude using Intel's own actions: favoring E cores is worth at very low power & area and at very high power & area. P cores are Priority cores in the consumer space, you always try to fit enough of them to make sure consumer workloads run as fast as possible for the form factor. Today that magical number seems to be 8 according to Intel. Tomorrow... who knows?

Today we have a value champ from Intel in the shape of the 13600K with a 6+8 config. I'd be very curious to see how people would react if they were offered a 13600KP with an 8+0 config for the same price (roughly same area). Would average consumers still choose CPU rendering performance over consistent and more future-proof gaming performance?

Well, how well did the 12400/12600 sell vs the K chips last time? Those were 'only' 4+0/6+0 but also had much more manageable power draws than the K chips, which must surely rate as a nice win for mainstream customers.
(Definitely why I got a 12600. No Zen, as I wanted an iGPU as backup.)

8P cores is going to get quite power constrained, even without any E cores, I would have thought.
 

DrMrLordX

Lifer
Apr 27, 2000
22,286
12,054
136
I think the P cores are still better for gaming in general, correct? Do the new E cores on Raptor Lake make any difference? For gaming, wouldn't it make sense just to have 8 to 12 P cores and no E cores?

12 P cores would be better for a lot of workloads. As @coercitiv said, scaling gets more difficult the more cores you add. At least HT still lets you use most of the execution resources of the CPU even if the workload doesn't reach the max thread capacity of the CPU.

People who think that adding more e-cores is the better solution are fooling themselves, especially when Gracemont/Gracemont+ offer little more than area efficiency as an advantage over Raptor Cove.