Discussion Intel current and future Lakes & Rapids thread

Page 373

Hulk

Diamond Member
Oct 9, 1999
4,225
2,015
136
What are the little cores going to be used for on the desktop? I can't imagine what benefit adding 4 little cores over 2 more legit ones could bring. This sounds like a laptop design to me.

I have been thinking about this as well. Here are a few theories. I'm not implying any of them are legit; they're just ideas to perhaps start some discussion. All speculation based on what I've learned so far.

Theory #1
Intel is under the assumption that anything over the top-of-the-line Alder Lake part is super HEDT, a very small niche market that can/will be served by another line of their processors. 8 cores is probably "enough" for 99% of the computer world. So Alder Lake is built primarily for mobile and to cover *most* desktop users. If one Gracemont core can run the whole shebang while watching video or surfing the web (a Skylake core could do this), wouldn't that result in some crazy impressive battery life for Intel to advertise? All the while the Golden Cove cores snooze away.

Theory #2
Perhaps some applications have interdependencies that can be well served by the Big/Little strategy. For example, say a certain application is running 16 threads (let's just consider physical processors). Perhaps only 6 or 7 of these threads are really compute heavy, but there are 6 more that could cause the Big cores to inefficiently switch among threads. The application would run faster if the Big cores handled the compute heavy threads and a bunch of Little cores ran the lighter load threads.

If the little cores are Skylake level compute we're not talking about old school Atom weaklings. Skylake level for the Little cores would be pretty impressive. Think about an 8 core Ice Lake running at 5GHz AND an 8 core non-HT Skylake running nearly as fast. That would be a very potent combination when it's essentially a 10700 plus 8 Golden Cove cores. And if what I'm thinking above about light/heavy thread loads is true, it could perform equal to or better than the 5950X since the Big cores would be faster than Zen 3.

Theory #3 (okay I'm really reaching on this one)
Some combination of theories #1 and #2, coupled with the possibility that some clever juxtaposition of the Big/Little cores allows for better heat transfer than all big cores. Could the die be laid out Big/Little/Big/Little, etc., so that at full bore the little cores (running slower) could absorb and dissipate some of the heat from the big ones? Basically I'm talking about arranging the die to avoid hot spots.
 

KompuKare

Golden Member
Jul 28, 2009
1,015
930
136
Or theory #4: Intel want to keep kernel developers on their toes!
Never mind the scheduler changes for Zen and its NUMA layout, moving and scheduling across totally different cores is going to be way harder.
Yes, ARM's big.LITTLE pioneered a lot of this, but didn't ARM always maintain the same ISA and instruction sets between the cores used for big.LITTLE?
While an unsupported instruction could cause an exception and either be emulated or forced onto the other type of core, this sounds like a lot more work.
Plus, whatever happened to AVX512 taking over the world?
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Or theory #4: Intel want to keep kernel developers on their toes!
Never mind the scheduler changes for Zen and its NUMA layout, moving and scheduling across totally different cores is going to be way harder.
Yes, ARM's big.LITTLE pioneered a lot of this, but didn't ARM always maintain the same ISA and instruction sets between the cores used for big.LITTLE?
While an unsupported instruction could cause an exception and either be emulated or forced onto the other type of core, this sounds like a lot more work.
Plus, whatever happened to AVX512 taking over the world?

The cores are limited to the least common denominator, at least in Alder Lake.
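
To make the "least common denominator" idea concrete, here is a minimal sketch: software can only rely on the instruction set extensions that every core type supports, which is just a set intersection. The feature lists below are illustrative assumptions for the sketch, not spec sheets, and `common_isa` is a made-up helper name.

```python
# Illustrative feature subsets only; assumptions for this sketch, not spec sheets.
BIG_CORE_FEATURES = {"sse4.2", "avx2", "avx512f"}
LITTLE_CORE_FEATURES = {"sse4.2", "avx2"}


def common_isa(*core_feature_sets):
    """Return the features every core type supports (what software can rely on)."""
    return set.intersection(*core_feature_sets)


exposed = common_isa(BIG_CORE_FEATURES, LITTLE_CORE_FEATURES)
print(sorted(exposed))
```

Under these assumed lists, anything only the big cores implement drops out of the exposed set, so a thread can migrate between core types without hitting an unsupported instruction.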
 

yuri69

Senior member
Jul 16, 2013
388
619
136
Theory #1 is correct - the power efficiency would be very nice for benchmarks like web browsing, video playback, or office loads. Of course, this requires the scheduler to put those processes on the LITTLE cores && not schedule anything on the big cores, thus letting them reach deep sleep.

The general strategy is to keep all those "hungry" cores in deep sleep and wake them only when it makes sense. This way it's possible to run high-priority CPU intensive threads on 1-2 big cores, while the low-prio threads occupy the LITTLE cores and the rest of the big cores stay in deep sleep.
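
A toy sketch of that placement strategy, just to illustrate the idea. This is not how any real OS scheduler works; the `Thread` class, the `high_priority` flag, and the core names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    high_priority: bool  # hypothetical flag; real schedulers use richer signals


def place(threads, big_cores, little_cores):
    """Toy policy: high-priority threads go to big cores, the rest to LITTLE cores.

    Returns (placement, sleeping): placement maps core name -> thread names,
    sleeping lists the big cores that received no work at all.
    """
    placement = {core: [] for core in big_cores + little_cores}
    hot = [t for t in threads if t.high_priority]
    cold = [t for t in threads if not t.high_priority]
    # Give each high-priority thread its own big core, wrapping around if needed.
    for i, t in enumerate(hot):
        placement[big_cores[i % len(big_cores)]].append(t.name)
    # Spread low-priority threads round-robin over the LITTLE cores.
    for i, t in enumerate(cold):
        placement[little_cores[i % len(little_cores)]].append(t.name)
    sleeping = [c for c in big_cores if not placement[c]]
    return placement, sleeping


threads = [
    Thread("game-render", True),
    Thread("game-audio", True),
    Thread("indexer", False),
    Thread("updater", False),
    Thread("av-scan", False),
]
big = [f"big{i}" for i in range(8)]
little = [f"little{i}" for i in range(8)]
placement, sleeping = place(threads, big, little)
print(len(sleeping), "big cores left asleep")
```

With two high-priority threads on an assumed 8+8 layout, only two big cores ever wake up; the background threads never touch them.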
 

Ajay

Lifer
Jan 8, 2001
15,451
7,861
136
The cores are limited to the least common denominator, at least in Alder Lake.
Gracemont has been updated quite a bit, so the LCD is pretty good if true. Good luck to MS in developing an improved scheduler that makes proper use of both sets of cores. At least they have examples from the Linux/Android world as a reference, e.g. HMP.
 

moonbogg

Lifer
Jan 8, 2011
10,635
3,095
136
Yeah, I still don't get it. I can't imagine any reason to put laptop cores in a desktop chip. It doesn't matter if they are as powerful as last gen stuff. Using chip area for weaker cores doesn't make sense to me when you can instead use the area for more of the newer and better cores. It sounds like it's a money saving strategy where they don't have to design a middle chip for consumer desktop, but instead just use the laptop design as someone else mentioned here. Maybe HEDT will make a return with high-end chips that have all real cores and no laptop cores in them. Then once again, Intel will charge $1800 for a 10 core CPU? One can only hope...
 

Hulk

Diamond Member
Oct 9, 1999
4,225
2,015
136
Really depends on how software utilizes cores? Do they all need to be Big to run the app quickly or is 8 big 8 little better than 12 big sometimes?
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
Really depends on how software utilizes cores? Do they all need to be Big to run the app quickly or is 8 big 8 little better than 12 big sometimes?
Yeah, of all the niche use cases, this is just really out there in a desktop environment. I think.
 

majord

Senior member
Jul 26, 2015
433
523
136
No matter how you spin it, it's pointless for desktop. Not even NUC-level machines will see any real-world benefit from the 1 or 2 watt savings. That's big for battery life, but nothing else. These modern HP cores are already not bad efficiency-wise if not being pegged at 5GHz.
 

Hulk

Diamond Member
Oct 9, 1999
4,225
2,015
136
No matter how you spin it, it's pointless for desktop. Not even NUC-level machines will see any real-world benefit from the 1 or 2 watt savings. That's big for battery life, but nothing else. These modern HP cores are already not bad efficiency-wise if not being pegged at 5GHz.

I'm not trying to spin it; I'm trying to think about it with an open mind, which I know can be dangerous on the internet.

I found an interesting portable program called processthreadview that allows you to see exactly what an application is doing process-wise. I'm still fooling around with it.

If I'm understanding you correctly, you are saying that there is no use for small cores on the desktop. In every application, at all times, for a given die area, big, fast cores will always beat a mix of big and small cores, right?

I don't claim to know the answer to this and am not challenging your assertion but I would like to know upon what tests, simulations, or other knowledge you have come to this conclusion? I'm truly here to learn. I freely admit when it comes to microprocessor design I'm a clueless mechanical engineer who enjoys following the microprocessor industry.
 

dr1337

Senior member
May 25, 2020
333
565
106
These modern HP cores are already not bad efficiency wise
Exactly, that's the whole point. It's rumored that the little cores are ~40% slower than the big cores IPC-wise, aka Skylake level. If Golden Cove is the big chonker Intel says it is, it should be decently more power hungry than Tiger Lake, let alone Skylake, even on 10nm. Also I don't think Intel of all companies could possibly overstate the importance of efficiency on the desktop. They just had to lop two cores off their incoming desktop flagship because the IPC gains increased power consumption so much.

Imo I don't see the little cores being all that bad for desktop as long as scheduling is done intelligently. 8 Golden Cove cores at 5GHz would be very compelling as they are. The 8 extra small-ish cores are just icing on the cake. And it's definitely the right move for Intel, as being able to sell '16' cores on mobile guarantees they at least don't fall behind Apple in marketing. And I'm really not convinced that the extra investment of a desktop specific chip would actually be worth it. I don't think a dedicated 12 big core chip would necessarily sell better than 8 big 8 little, if at all.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Something else to consider is that having separate, power optimized cores allows them to also use far more aggressive power management on them, having isolated voltage planes, spinning them down more quickly, etc. Even in some of the most heavily single threaded situations, OSes have stuff to do behind the scenes. It's far better to shuffle that stuff off to low power, low thermal cores to avoid interrupting the big cores and to better manage thermals.

Obviously, we're well within the territory of every few percent counts.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Where is all the talk of a mobile chip coming from? The 8+8 config is a desktop die. Mobile gets 6+8 or 2+8.
 

majord

Senior member
Jul 26, 2015
433
523
136
I'm not trying to spin it; I'm trying to think about it with an open mind, which I know can be dangerous on the internet.

I found an interesting portable program called processthreadview that allows you to see exactly what an application is doing process-wise. I'm still fooling around with it.

If I'm understanding you correctly, you are saying that there is no use for small cores on the desktop. In every application, at all times, for a given die area, big, fast cores will always beat a mix of big and small cores, right?

I don't claim to know the answer to this and am not challenging your assertion but I would like to know upon what tests, simulations, or other knowledge you have come to this conclusion? I'm truly here to learn. I freely admit when it comes to microprocessor design I'm a clueless mechanical engineer who enjoys following the microprocessor industry.

No I'm not saying under ALL circumstances. Even the worst of ideas have had circumstances where they made sense, but overall, yes. For the desktop/workstation/enthusiast, I don't see any benefits worth the trouble.

Ultimately there are two areas where it makes sense:

1. Low power mobile, with a moderate number of cores, where battery life can be greatly affected by saving 1-2W; even sub-watt savings are a big deal.

2. Many-core server/enterprise, where a large number of low power, high efficiency cores can actually be used en masse by workloads that scale to many cores, but where a small number of high ST performance cores are needed concurrently (and there are many such cases).

The desktop market, however, literally sits in a void between these two.

On the one hand, desktop is power sensitive too, but not down to this level, and not at low power states. Particularly in the, say, 35-125W market, where other system components are so power hungry, any power savings from this sort of scheme virtually fade into the background at a system power level. Desktop is more about peak power consumption, and designing cooling capability and power supply around that, not what's happening at idle or while browsing the web.

On the other hand, desktop workloads can scale to high thread counts, so you could argue the advantages there, but the problem is that desktop workloads, I think we all agree, don't scale high enough thread-wise to see the benefits. Not that this is of any relevance with this iteration, since it's only an 8+8, 24 thread config. That's lower than a 5950X, which consists of all high performance cores.

It will be interesting to see the outcome, but there's a good chance a Zen 3 core with SMT will have similar throughput and power consumption to 2x Gracemont, particularly if it's clocked high in order to maximise performance. Probably more to the point, how many Gracemont will it take to equal Golden Cove's throughput? That should be higher again than Zen 3 (which I'm only mentioning incidentally because it's by far the perf/watt benchmark for high performance x86 cores atm; ditto perf/area).
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Isn't ADL-S BGA for laptops too?

I suppose so, but we're talking ~55W "desktop replacement" devices. It's quite a small niche. Fundamentally, the design point is still for desktops.

Plus, I would expect at least some lag between ADL-S showing up in desktops and ADL-S BGA showing up in laptops. Might very well see ADL-P laptops first.
 

coercitiv

Diamond Member
Jan 24, 2014
6,199
11,895
136
Probably more to the point, how many Gracemont will it take to equal Golden Cove's throughput?
Gracemont cluster will look good when compared against Golden Cove when:
  • workload scales well to more threads
  • Golden Cove clocks are down in the all-core turbo range
The problem with hyping Gracemont via the "Skylake IPC" tag is nobody knows how high it will clock. From a purely subjective point of view the "Skylake IPC" reference point looks very good... until you start asking questions about absolute performance. Will it clock close to 5GHz? Does it make any sense at all to make it capable of 5GHz when its main purpose is increased throughput under heavy workloads? And finally, does Skylake IPC still sound good when Golden Cove may end up being 70-100% faster? (40% IPC advantage, 20-45% clock advantage)
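
For anyone who wants to check the arithmetic behind that 70-100% range: per-core performance scales roughly with IPC times clock, so the two advantages multiply. A minimal sketch, using only the speculative figures above (the `speedup` helper is just an illustrative name):

```python
def speedup(ipc_advantage: float, clock_advantage: float) -> float:
    """Fractional speedup from combined IPC and clock advantages (they multiply)."""
    return (1 + ipc_advantage) * (1 + clock_advantage) - 1


# 40% IPC advantage combined with a 20% or a 45% clock advantage.
low = speedup(0.40, 0.20)
high = speedup(0.40, 0.45)
print(f"{low:.0%} to {high:.0%} faster")
```

That works out to roughly 68% at the low end and 103% at the high end, which is where the rounded 70-100% comes from.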

Here's a big question for some of the people in this thread, what would you choose between the following:
  • Comet Lake 10+0 , 10c/20t Skylake cores
  • Cosmic Lake 8+8, 8c/16t Skylake cores + 8c/8t Nehalem cores
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
Something else to consider is that having separate, power optimized cores allows them to also use far more aggressive power management on them, having isolated voltage planes, spinning them down more quickly, etc. Even in some of the most heavily single threaded situations, OSes have stuff to do behind the scenes. It's far better to shuffle that stuff off to low power, low thermal cores to avoid interrupting the big cores and to better manage thermals.

Obviously, we're well within the territory of every few percent counts.
I don't think that single-threaded cores with Skylake IPC are an efficient solution to any combination of 'behind the scenes OS stuff'.
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
Gracemont cluster will look good when compared against Golden Cove when:
  • workload scales well to more threads
  • Golden Cove clocks are down in the all-core turbo range
The problem with hyping Gracemont via the "Skylake IPC" tag is nobody knows how high it will clock. From a purely subjective point of view the "Skylake IPC" reference point looks very good... until you start asking questions about absolute performance. Will it clock close to 5GHz? Does it make any sense at all to make it capable of 5GHz when its main purpose is increased throughput under heavy workloads? And finally, does Skylake IPC still sound good when Golden Cove may end up being 70-100% faster? (40% IPC advantage, 20-45% clock advantage)

Here's a big question for some of the people in this thread, what would you choose between the following:
  • Comet Lake 10+0 , 10c/20t Skylake cores
  • Cosmic Lake 8+8, 8c/16t Skylake cores + 8c/8t Nehalem cores
The second one, just for the name. I finally wanna see a cosmic lake!!
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
I'm also of the belief that they won't be clocking the small cores especially high. If they REALLY are supposed to be all about energy efficiency, why would they allow them to go beyond the inflection point in the power draw / clock speed graph where it starts taking a lot more power for each additional 100MHz? I tend to think that the small cores will be limited to the 3.5-3.8GHz range. Why? I think that the i3-1125G4 informs here, as it is based on the most similar current process tech in production. It seems to me that the i3-1125G4 is an otherwise largely functional Tiger Lake die that misses the power targets needed to be used as an i5/i7 product. They clock it down deep in the power/performance curve to enable it to clear the target wattage, but have to sell it as an i3. I'm assuming that 10ESF won't be dramatically more efficient in that area, so a clock speed in the 3.5-3.8GHz range seems to be where power efficiency starts taking a big hit for extra performance. If that's the target for the cores, they won't be pipeline optimized for much higher clock speeds. If there really was something that would stress them so badly that they would need to go faster, then it's better run on the big cores to begin with.
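
To illustrate what that inflection point looks like, here is a toy model using the usual dynamic power approximation, P ~ C * V^2 * f, with voltage held flat up to a knee frequency and rising linearly above it. Every constant is a made-up assumption chosen only to show the shape of the curve; none of it is a measurement of Gracemont or any real core.

```python
# Toy model of dynamic power: P ~ C * V^2 * f.
# All constants are made-up illustrative values, not measurements of any real core.
C = 1.0           # lumped switched capacitance, arbitrary units
V_MIN = 0.70      # volts; assumed voltage floor at low clocks
KNEE_GHZ = 3.0    # assumed frequency above which voltage has to rise
V_PER_GHZ = 0.25  # assumed extra volts needed per GHz above the knee


def voltage(f_ghz):
    """Voltage needed at a given clock: flat up to the knee, linear above it."""
    return V_MIN + max(0.0, f_ghz - KNEE_GHZ) * V_PER_GHZ


def power(f_ghz):
    """Dynamic power in arbitrary units at a given clock."""
    return C * voltage(f_ghz) ** 2 * f_ghz


def marginal_power(f_ghz, step=0.1):
    """Extra power for one more 100 MHz step starting at f_ghz."""
    return power(f_ghz + step) - power(f_ghz)


for f in (2.0, 3.0, 3.5, 4.0, 4.5):
    print(f"{f:.1f} GHz: {power(f):.2f} units, "
          f"+{marginal_power(f):.3f} per extra 100 MHz")
```

Below the assumed knee each extra 100 MHz costs the same small amount; above it the cost per step keeps climbing, which is the region an efficiency-focused core would presumably want to stay out of.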

So, working off those assumptions (~Skylake IPC, clocks around 3.7GHz), what we're looking at is a section of the processor that's going to behave a lot like the i7-6900K or a pair of i7-6700/6700T processors, but run at 15/35W. That's not a particularly bad place to be.

Again, that's as much a S.W.A.G. as anything.
 

mikk

Diamond Member
May 15, 2012
4,140
2,154
136
I'm also of the belief that they won't be clocking the small cores especially high. If they REALLY are supposed to be all about energy efficiency, why would they allow them to go beyond the inflection point in the power draw / clock speed graph where it starts taking a lot more power for each additional 100MHz? I tend to think that the small cores will be limited to the 3.5-3.8GHz range.


I'm expecting higher than this for several reasons (if you're referring to the desktop). First of all, back in 2018 Intel highlighted a frequency increase for Gracemont, unlike for Tremont or Golden Cove: https://images.anandtech.com/doci/13699/1-Roadmap.jpg

Furthermore, Jasper Lake SKUs (10W) can boost up to 3.3GHz on the poor Ice Lake 10nm. The difference from 10nm to SuperFin is big, and to the enhanced SuperFin even bigger; ADL will use enhanced SuperFin. Also, the fastest desktop SKUs are not energy optimized by nature: a performance optimized 125W SKU won't run at the most efficient clock speeds.