Question Discussion over P-core and E-core vs AMDs regular vs C-core

igor_kavinski · Feb 10, 2024

Hulk said:
By creating the c cores AMD has validated Intel's hybrid approach.

I'm still not sure if Alder Lake was always supposed to launch with E-cores. They just seem hurriedly tacked on and the post-launch disabling of AVX-512 points to that possibility.

Also, Intel having to create a special Thread Director whereas AMD hasn't revealed anything of the sort in their design seems really odd to me. Like what was wrong with Intel's design that they needed a special Thread Director?

Hulk · Feb 10, 2024

igor_kavinski said:
I'm still not sure if Alder Lake was always supposed to launch with E-cores. They just seem hurriedly tacked on and the post-launch disabling of AVX-512 points to that possibility.

Also, Intel having to create a special Thread Director whereas AMD hasn't revealed anything of the sort in their design seems really odd to me. Like what was wrong with Intel's design that they needed a special Thread Director?

I know a lot of R&D goes into CPU design and lead times are generally measured in years, so I'm not sure what you mean by "tacked on." It implies a last-minute design change, but even a late change would probably have been made months, if not years, ahead of release.

As for the Thread Director, if it works for Intel hybrid CPUs, wouldn't it also work for AMD?

Khato · Feb 10, 2024

Hulk said:
You got half of it correct. AMD and Intel are using c and E cores for area efficiency, but they are not using them for power efficiency. P cores and normal-size Zen 4 cores are more power efficient at iso-performance than their smaller counterparts. Well, to be totally honest, this is definitely the case for Intel; for AMD, efficiency is probably about the same since the architecture is the same, but the c cores are smaller, so the only advantage of using them for AMD is area efficiency.

Due to the nonlinear nature of the v/f curve a big (in terms of die area), wide (in terms of architecture) CPU running at a low frequency will generally be more power efficient than a smaller physical core running at higher frequency to achieve the same performance. But of course die area is expensive so we have c and E cores.

The P-core/regular Zen 4 core only has higher performance per watt above a certain performance level. While the v/f curve, and hence dynamic power, favors the 'large' cores, that comes at the cost of higher static power. The Chips and Cheese article on Alder Lake has some nice graphs showing this: https://chipsandcheese.com/2022/01/28/alder-lakes-power-efficiency-a-complicated-picture/
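
A rough way to see the crossover described above is a toy model in Python: dynamic power scales roughly with C·V²·f, voltage rises with frequency, and the bigger core pays more static (leakage) power. Every number below (the IPC ratio, capacitance, leakage, and the linear V/f line) is a made-up assumption for illustration, not a measurement of Golden Cove, Gracemont, or Zen 4/4c.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    ipc: float       # relative work per clock (hypothetical)
    cap: float       # relative switched capacitance (hypothetical)
    static_w: float  # leakage power in watts, assumed constant

    def freq_for(self, perf: float) -> float:
        """Clock (GHz) needed to reach a given performance level."""
        return perf / self.ipc

    def power(self, perf: float) -> float:
        """Total power at iso-performance: dynamic (C * V^2 * f) plus static."""
        f = self.freq_for(perf)
        v = 0.6 + 0.15 * f  # hypothetical linear V/f curve
        return self.cap * v * v * f + self.static_w


# Hypothetical "big, wide" core versus a smaller, narrower core.
BIG = Core("big", ipc=1.6, cap=1.5, static_w=1.0)
SMALL = Core("small", ipc=1.0, cap=1.0, static_w=0.3)

if __name__ == "__main__":
    for perf in (1.0, 2.0, 3.0, 4.0):
        b, s = BIG.power(perf), SMALL.power(perf)
        winner = "big" if b < s else "small"
        print(f"perf={perf:.1f}  big={b:.2f} W  small={s:.2f} W  -> {winner}")
```

With these assumed parameters the small core wins at low performance targets, where static power dominates, and the big core wins once the small core has to climb its V/f curve to keep up. Where the crossover lands depends entirely on the parameters chosen.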

jpiniero · Feb 10, 2024

igor_kavinski said:
I'm still not sure if Alder Lake was always supposed to launch with E-cores. They just seem hurriedly tacked on and the post-launch disabling of AVX-512 points to that possibility.

No, it was always intended to be the case. Speculation was that Intel had originally planned a software-based solution to deal with AVX-512, but it was abandoned... far enough along that completely removing AVX-512 wasn't worth the effort, especially when they had probably originally intended Alder Lake Xeon E's to still support it.

igor_kavinski · Feb 10, 2024

Hulk said:
As for the Thread Director, if it works for Intel hybrid CPUs, wouldn't it also work for AMD?

What Is Intel® Thread Director?

Describes Intel® Thread Director and what it does.

www.intel.com

Built directly into the hardware, Intel Thread Director uses machine learning to schedule tasks on the right core at the right time (as opposed to relying on static rules).

So my question stands, why does Intel need to depend on this feature taking up die space whereas AMD manages to keep their P and c cores working without it?

igor_kavinski · Feb 10, 2024

jpiniero said:
Speculation was that Intel had originally planned a software-based solution to deal with AVX-512, but it was abandoned... far enough along that completely removing AVX-512 wasn't worth the effort

So at Alder Lake launch, they suddenly decided to abandon that software effort? Seems weird. Markfw had AVX-512 active on his Alder Lake and managed to avoid having it disabled by staying on Linux, where Intel or mobo vendors can't push microcode updates as easily as they do on Windows. This AVX-512 drama is why I think the E-cores were added to Alder Lake as a last-ditch effort when Intel realized that they would lag significantly behind AMD if they stuck to just P cores. Alder Lake's 12900K was probably originally intended as a 10C/20T part, just like the 10900K. At the last minute, they changed the floor plan and replaced two of the P-cores with E-core clusters.

jpiniero · Feb 10, 2024

igor_kavinski said:
So at Alder Lake launch, they suddenly decided to abandon that software effort? Seems weird.

Not suddenly. You have to remember that a normal chip design is like a 4-5 year effort... and with the 10 nm problems, Alder Lake was likely more than that.

igor_kavinski · Feb 10, 2024

Khato said:
https://chipsandcheese.com/2022/01/28/alder-lakes-power-efficiency-a-complicated-picture/

Juicy tidbits there:

Between 3 and 4 GHz, these P-Cores can give the E-Cores a run for their money. In an integer workload, Golden Cove consumes about the same amount of total energy while completing the task faster. With a vectorized load, Golden Cove finishes the task so much faster that it ends up using less total energy than Gracemont, even though Gracemont draws less power. That means running Gracemont above 3.2 GHz is pointless if energy efficiency is your primary concern. Running the E-Cores at 3.8 GHz basically makes them worse P-Cores. But that’s exactly what Alder Lake does by default.

Below 3 GHz, Gracemont shines. It’s able to maintain better performance at very low power targets (as we saw in the previous section) while consuming less energy. With an integer workload, it’s especially efficient.

I hope no one tells me that Intel didn't know this. They deliberately ran Alder Lake outside its power efficiency sweet spot to combat the threat of AMD eating their lunch and making them look like amateurs. So while Intel's architecture may not have been completely crap, their process node disadvantage really hurt them like hell. Like "fire ant bit me!" kind of hurt.
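
To make the energy argument in the quoted passage concrete: energy is power multiplied by time, so a core that draws more power can still use less total energy if it finishes sooner. A minimal Python sketch, using made-up wattages and runtimes (these are not the Chips and Cheese measurements):

```python
def energy_joules(power_w: float, seconds: float) -> float:
    """Energy used by a task: average power (W) times runtime (s)."""
    return power_w * seconds


# Hypothetical numbers for the same task on two cores.
fast_core = energy_joules(power_w=20.0, seconds=10.0)  # higher power, shorter run
slow_core = energy_joules(power_w=12.0, seconds=20.0)  # lower power, longer run

if __name__ == "__main__":
    print(f"fast core: {fast_core:.0f} J, slow core: {slow_core:.0f} J")
```

In this made-up case the faster core draws 8 W more but uses less energy overall because it finishes in half the time. That is the same shape of result the article reports for the vectorized load.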

jpiniero · Feb 10, 2024

igor_kavinski said:
I hope no one tells me that Intel didn't know this.

The E cores are for perf/area, not perf/watt.

igor_kavinski · Feb 10, 2024

jpiniero said:
The E cores are for perf/area, not perf/watt.

They could easily have been perf/watt as shown by ChipsandCheese but Intel was desperate not to lose face.

Markfw · Feb 10, 2024

igor_kavinski said:
They could easily have been perf/watt as shown by ChipsandCheese but Intel was desperate not to lose face.

Intel has lost face over so many issues over the last 7 years, it's not funny anymore.

Hulk · Feb 10, 2024

igor_kavinski said:
What Is Intel® Thread Director?

Describes Intel® Thread Director and what it does.

www.intel.com

So my question stands, why does Intel need to depend on this feature taking up die space whereas AMD manages to keep their P and c cores working without it?

I don't know. Maybe since AMD's c cores are the same cores as their "P" cores, just clocked lower, it's easier for the OS to direct threads to the cores that can clock higher? It's more complicated with Intel because the P's and E's have significantly different architectures as well as clocks. Intel's E's are more like better HT threads.

Windows has been directing threads to preferred (higher max frequency) cores for quite a few years now, so I think that is basically what AMD is utilizing. By that I mean the non-c cores are the "preferred cores."
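
As a rough illustration of the "preferred core" idea, here is a Python sketch of a placement rule that sends a new thread to the idle core with the highest maximum frequency. This is a simplified assumption for discussion, not how the Windows scheduler or Thread Director actually works; the `pick_core` function and the core frequencies are hypothetical.

```python
from typing import Dict, Optional, Set


def pick_core(max_freq_mhz: Dict[int, int], busy: Set[int]) -> Optional[int]:
    """Return the idle core with the highest max frequency, or None if all are busy.

    Ties are broken in favor of the lowest core id.
    """
    idle = [core for core in max_freq_mhz if core not in busy]
    if not idle:
        return None
    return max(idle, key=lambda core: (max_freq_mhz[core], -core))


# Hypothetical 2 regular + 4 c-core layout: same ISA, different max clocks.
CORES = {0: 5000, 1: 5000, 2: 3500, 3: 3500, 4: 3500, 5: 3500}

if __name__ == "__main__":
    busy: Set[int] = set()
    for thread in range(4):
        core = pick_core(CORES, busy)
        print(f"thread {thread} -> core {core}")
        if core is not None:
            busy.add(core)
```

A frequency ranking like this is only enough when every core runs the same instructions at the same IPC. When the cores differ in architecture, the best core for a thread depends on what the thread is doing, and max frequency alone doesn't tell you that.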

moinmoin · Feb 10, 2024

positivedoppler said:
Always wonder why they didn't continue using HDL for mobile

By HDL, do you mean high-density libraries? If so, AMD actually does just that with all Zen and many Radeon designs. But that doesn't mean the resulting design has a focus on density, just that the grid is smaller than with high-performance libraries.

Hulk said:
By creating the c cores AMD has validated Intel's hybrid approach.

If you want to think so.

In my opinion, AMD's focus was clearly to compete with the ARM server chips that have massive core counts. With those, high frequency is not the focus but perf/area is, and that's what Bergamo is for.

With PHX2, it wasn't known for a long time whether it actually contains Zen 4c cores; some people here and elsewhere claimed that it still isn't a hybrid design, and Ian Cutress said we won't see a hybrid design from AMD before next gen.

daniberry143 · Feb 10, 2024

AMD's approach seems more practical.

igor_kavinski · Feb 10, 2024

Hulk said:
Windows has been directing threads to preferred (higher max frequency) cores for quite a few years now, so I think that is basically what AMD is utilizing. By that I mean the non-c cores are the "preferred cores."

Plausible explanation

jpiniero · Feb 10, 2024

moinmoin said:
With PHX2, it wasn't known for a long time whether it actually contains Zen 4c cores; some people here and elsewhere claimed that it still isn't a hybrid design, and Ian Cutress said we won't see a hybrid design from AMD before next gen.

There are rumors that AMD is working on a new Cat core but it's some ways off.

igor_kavinski · Feb 10, 2024

TomCat core? 😛

Doug S · Feb 10, 2024

Hans Gruber said:
Intel is going from 10nm to 5nm with Arrow Lake. That means Intel has much more silicon real estate on their chips. They can be more flexible. Dumping hyperthreading indicates they have some fancy new technology moving forward.

I don't think it indicates any "fancy new technology", they likely believe it is easier to increase MT throughput with their little cores than the additional complexity hyperthreading brings.

Hulk · Feb 10, 2024

Doug S said:
I don't think it indicates any "fancy new technology", they likely believe it is easier to increase MT throughput with their little cores than the additional complexity hyperthreading brings.

I don't know about "fancy new technology" either but I do think that Intel is excising HT structures from the P cores so that they can design for maximum ST performance now that they can "go to" the E's for as much MT compute as needed. Need more MT compute? Add another E cluster. Or as I read somewhere the E's are "Cinebench accelerators."

Abwx · Feb 10, 2024

Hulk said:
, for AMD efficiency is probably about the same since the architecture is the same, but the c cores are smaller so the only advantage of using them for AMD is area efficiency.

No, the C cores are quite a bit more efficient at the same frequency, assuming it's low enough. AMD's comparison slide is at the same frequency for both regular and C cores. Besides, they have published a slide comparing 2P + 4C with 6P, and below a given frequency the former is more efficient.

Hulk · Feb 10, 2024

Abwx said:
No, the C cores are quite a bit more efficient at the same frequency, assuming it's low enough. AMD's comparison slide is at the same frequency for both regular and C cores. Besides, they have published a slide comparing 2P + 4C with 6P, and below a given frequency the former is more efficient.

Interesting. Two nearly identical architectures on the same node but different power for the same frequency and workload. I'd like to read more about this. Can you post that link?

The 2P+4C to 6P comparison isn't apples-to-apples because the 2P's could be running faster, 4C's lower, same compute output but more power for the 2+4.

Let's dig into this!

Abwx · Feb 10, 2024

Hulk said:
Interesting. Two nearly identical architectures on the same node but different power for the same frequency and workload. I'd like to read more about this. Can you post that link?

The 2P+4C to 6P comparison isn't apples-to-apples because the 2P's could be running faster, 4C's lower, same compute output but more power for the 2+4.

Let's dig into this!

The comparison slide I posted is at iso-frequency.

Question - Discussion over P-core and E-core vs AMDs regular vs C-core

First my take. Intel needed a way to be competitive with AMD in multicore benchmarks and performance scenarios but were limited by space and power due to foundry limitations, so they created E-cores. AMD wanted more cores with less power, so they reduced the speed to allow for more cores in a...

forums.anandtech.com

Now for the 2P + 4C vs 6P.

The 2P + 4C efficiency is somewhat lower than it should be at low frequency because of the 2 regular Zen cores. These increase the efficiency at higher power levels, but at lower power levels (below 15W) 6 C cores would do better than the asymmetrical 2P + 4C arrangement.


KompuKare · Feb 11, 2024

Hulk said:
AMD added the c cores for the exact same reason Intel did. Most software doesn't utilize more than 8 or so threads so there are decreasing gains to adding more full size cores. Instead of designing an entirely new architecture AMD reduced the physical size of Zen 4, which required a decrease in clocks. This way they got quite a bit of additional multicore performance for less die area than would have been required with "full speed" Zen 4 (larger) cores.

I think that is a very consumer-centric way to think about it.

Where Zen 4c excels is server chips and density.
Yes, they can repurpose it for desktop and, more importantly, laptops, but I do not think that was what got them to develop the c cores.

Intel is almost the polar opposite. They did not abandon their smaller cores (although Atom has long since stopped being a purposefully crippled part whose main purpose was to keep old fabs busy). But the E cores were not designed for servers. Yes, they will start using E cores in servers with their 'Forest' Xeons, but they are way late compared to Bergamo or the various ARM many-small-cores server chips.

Thunder 57 · Feb 11, 2024

jpiniero said:
There are rumors that AMD is working on a new Cat core but it's some ways off.

That's interesting. I always liked the cat cores. They probably saved AMD as we know it.

BorisTheBlade82 · Feb 12, 2024

Abwx said:
Slightly :

Hot Chips 2023: AMD reveals little more about "Siena" and causes confusion

After Genoa, Bergamo, and Genoa-X, Siena is still missing from AMD's family of Zen 4 CPUs for servers. At Hot Chips there are (confusing) details.

www.computerbase.de

Sadly, that slide is misleading. It already takes into account that more Zen 4c cores are used for the same MT performance and that they clock lower than a comparable Zen 4 SKU. Looking at the V/f curve, Zen 4c is quite disappointing IMHO.

I fully expect AMD to specialize their two cores more in the coming generations, where the c-cores show real efficiency improvements in terms of Work per Joule. This might only be the case for Integer.
All in all, in the client space the heterogeneous approach makes sense to me due to Amdahl's law. Server space is different, because there it does not apply in the same way. There, many smaller cores might become the standard while big cores might only be used for special workloads and applications that get licensed by core count.
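
For reference, Amdahl's law gives the speedup on n cores as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. A short Python sketch; the parallel fractions in the loop are arbitrary examples, not measurements of any real client or server workload:

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Speedup over one core when only `parallel_fraction` of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Arbitrary example fractions: a mostly serial task and a highly parallel one.
    for p in (0.5, 0.95):
        for n in (8, 16, 64):
            print(f"p={p:.2f}  n={n:3d}  speedup={amdahl_speedup(p, n):.2f}x")
```

When the serial part is large, extra cores add little and the speed of the core running the serial part dominates, which is the case for keeping a few fast cores in client chips. When nearly everything is parallel, core count matters more than per-core speed.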
