Discussion Intel current and future Lakes & Rapids thread


Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
Let's review. I don't think there is much debate that the Big/Little concept is useful for mobile, where efficiency is of the utmost importance. We don't have to look further than phones: since it is the norm there these days, it must be advantageous.

I think most of us would agree that Intel is going with a Big/Little design for Alder Lake because mobile is their first priority. The question to be asked is as follows:

Is the hybrid design a detriment to desktop performance, and Intel will just do the best they can with it?
Or have they found a way that for a given transistor budget they can achieve higher performance with Big/Little?

While there is little doubt Intel will stay in the optimum efficiency range of the shmoo plot for mobile, they have demonstrated that if enormous power is required for desktop to be competitive, they will use it. So I don't put much stock in the theory that for the desktop they will limit Big or Little core speed for power reasons, especially at the top of the stack where competition with AMD is paramount.

It seems to me that the elephant in the room here is whether, for a given application, all threads need the same amount of compute. I don't know. The easiest way to design is to assume they do and throw as many powerful cores at the application as you can, e.g. Threadripper.

A more elegant solution might be to have two "buckets" of cores, Big and Little, and assign them based on the compute needs of the various threads of the running process. Of course this can be tailored to a specific application, but can it work across the board? As I wrote above, I found a neat little app called "processthreadsview" that allows you to see the threads created by a specific application, the rate of context switching, and the user/kernel time for each thread.

I have only started to check it out, but it seems that the rate of context switching could be meaningful. As we know, every time the context is switched for a thread there is inefficiency, as one thread is flushed from the CPU and another one is started. The less context switching the better.

Here's a screenshot of Handbrake running the benchmark from this forum. Lots of threads are spawned and the first 12 are doing quite a bit of context switching. This is from my 4770k. It would be interesting to see what this looks like with 8/16 core, 12/24 cores, and 16/32 cores. Seems like 7 or 8 threads are getting hammered pretty good by the context switching rate, which I believe is switches/second. So those "main" threads are being cut away from to service other threads frequently.

Handbrake Context Switches.jpg

Here is a screen shot of Presonus Studio One playing back a relatively lightly loaded multitrack song. Here we have 7 threads with high context switching and 8 or so other threads which seem to be lightly loaded and being serviced during the context switches from the main threads.

Presonus Studio One context switches.jpg

Honestly I don't know what the heck this all really means but it seems it could be that these applications could do well with 8 dedicated Big cores and a bunch of Little cores to service the less heavily loaded threads. Or it could mean nothing because the context switching performance hit is negligible.

One more interesting observation is that the user time for Studio One is focused on 8 threads. For Handbrake it's 7 threads, plus one less loaded one. Is this because my 4770k has 8 logical processors, or is it because these apps were coded with 4/8 core processors in mind? If someone ran the Handbrake test on an 8 core rig we'd at least know that, right?

But I thought it was interesting enough to post.
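For anyone who wants to collect similar numbers without the screenshots, here is a minimal sketch of the underlying measurement: sample each thread's cumulative context-switch count twice and divide the difference by the interval. It is not how processthreadsview works internally (that is a Windows tool); as an assumption it reads the Linux `/proc/<pid>/task/<tid>/status` files instead, and the helper names (`parse_ctx_switches`, `snapshot`, `switch_rates`, `busiest`) are made up for illustration.

```python
import time
from pathlib import Path


def parse_ctx_switches(status_text: str) -> int:
    """Sum voluntary and nonvoluntary switches from a Linux task status file."""
    total = 0
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            total += int(value)
    return total


def snapshot(pid: str = "self") -> dict[int, int]:
    """Cumulative context-switch count per thread ID (assumes Linux /proc)."""
    counts = {}
    for task in Path(f"/proc/{pid}/task").iterdir():
        try:
            counts[int(task.name)] = parse_ctx_switches((task / "status").read_text())
        except OSError:
            continue  # thread exited between listing and reading
    return counts


def switch_rates(before: dict, after: dict, seconds: float) -> dict:
    """Switches per second for every thread present in both snapshots."""
    return {tid: (after[tid] - before[tid]) / seconds
            for tid in after if tid in before}


def busiest(rates: dict, n: int = 8) -> list:
    """The n threads with the highest switch rate, highest first."""
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__" and Path("/proc/self/task").is_dir():
    first = snapshot()
    time.sleep(1.0)
    second = snapshot()
    for tid, rate in busiest(switch_rates(first, second, 1.0)):
        print(f"thread {tid}: {rate:.0f} switches/s")
```

Passing the PID of an encoder or DAW process to `snapshot` (as a string) instead of `"self"` would give a rough Linux-side counterpart to the per-thread view in the screenshots.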
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
I'm expecting higher than this for several reasons (if you refer to the desktop). First of all back in 2018 Intel highlighted a frequency increase for Gracemont unlike Tremont or Golden Cove: https://images.anandtech.com/doci/13699/1-Roadmap.jpg

Furthermore, Jasper Lake SKUs (10W) can boost up to 3.3 GHz on the poor Ice Lake 10nm. The difference from 10nm to SuperFin is big and to the enhanced SuperFin even bigger, and ADL will use enhanced SuperFin. Also, the fastest desktop SKUs are not energy optimized by nature; a performance optimized 125W SKU won't run at the most efficient clock speeds.

They don't need to go after every last MHz on the small cores. 3.3 - 3.8 GHz max feels right for locked.
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
Where is all the talk of mobile chips coming from? The 8+8 config is a desktop die. Mobile gets 6+8 or 2+8.
Where are you getting this information?
With die salvaging, I would expect failed gracemont cores, hence something like 4+4 and whatnot.
 

blckgrffn

Diamond Member
May 1, 2003
9,111
3,029
136
www.teamjuchems.com
Where are you getting this information?
With die salvaging, I would expect failed gracemont cores, hence something like 4+4 and whatnot.

My $.02 is that 8 "little" cores seems fairly ridiculous if we are really trying to get efficient unless it's a marketing gimmick to pump core numbers up.

An octa-core of lower-power cores in mobile seems to occur only when they don't bother to include any Big cores.
 

gdansk

Golden Member
Feb 8, 2011
1,988
2,358
136
I think Intel is right to some extent here. Mix a few huge cores and a bunch of weaker cores that have better performance per watt. It could have higher single thread and higher multi-thread performance than designing one core that tries to mix both. I doubt Alder Lake will achieve both victories but in the future such designs could. ARM is trying a three tier approach (huge, big, small). But Intel is not going to design three different cores.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Where are you getting this information?
With die salvaging, I would expect failed gracemont cores, hence something like 4+4 and whatnot.


This has the potential core options that were leaked. Intel could of course change it before launch, but since the cut-down desktop die is 6 big and 0 small, it seems unlikely there would be any SKUs with small cores enabled and fewer than 6 big.
 

Asterox

Golden Member
May 15, 2012
1,026
1,775
136

Intel non-K CPUs were never loved and appreciated like the "mighty overclocking Intel K CPUs". One or two examples are more than enough: the i7 8700K vs the cheaper i7 8700, supported by Mindfactory CPU sales figures, and the i7 6700K vs the miserable i7 6700. So sad, the i7 6700 didn't even get a pat on the shoulder. :mask:



 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
Gracemont has been updated quite a bit, so the LCD is pretty good if true. Good luck to MS in developing an improved scheduler that makes proper use of both sets of cores. At least they have examples from the Linux/Android world as a reference; ex. HMP

What are you talking about? Windows has had such a scheduler for years in essentially any Windows ARM device.
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
What are you talking about? Windows has had such a scheduler for years in essentially any Windows ARM device.
I guess my initial search terms produced the wrong results. With ARM64 support being added in Windows 10, Big.Little support in scheduling was also added. So I stand corrected.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136

This has the potential core options that were leaked. Intel could of course change it before launch, but since the cut-down desktop die is 6 big and 0 small, it seems unlikely there would be any SKUs with small cores enabled and fewer than 6 big.

Sharkbay's leak on this is better. He covered the actual die configs.

In summary:
  • Alder Lake-S
    • 8 Big Cores + 8 Small Cores + GT1 GPU
    • 6 Big Cores + 0 Small Cores + GT1 GPU
  • Alder Lake-P
    • 2 Big Cores + 8 Small Cores + GT2 GPU
    • 6 Big Cores + 8 Small Cores + GT2 GPU
  • Alder Lake-M
    • 2 Big Cores + 8 Small Cores + GT2 GPU (possible reuse with P die?)

The key point being that the 8+8 die is only used in desktops and a couple desktop-class laptops, so it's not a directly mobile-driven decision.
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
A proper big.little design would allow you to browse the internet, watch movies, and do light gaming while having the system use only 5-15 watts. I am not saying Intel is taking this route.

Expect to see more advanced designs in the future.
 

dullard

Elite Member
May 21, 2001
24,998
3,327
126
It could have higher single thread and higher multi-thread performance than designing one core that tries to mix both.
I'll do a little back-of-the-napkin math with major assumptions.

Desktop big.little makes sense in a world that separates HEDT from just desktop. That means, if you need 16 or 32 cores, you buy a HEDT chip, and if you don't need 16 or 32 cores, then you buy a desktop chip. Let's assume for this discussion that the user does not really need HEDT and see what we can determine.

Assumptions:
  • User does not need HEDT chip.
  • User purchases typical OEM desktop.
  • Typical OEM desktop is limited to 125 W of power.
  • Uncore Power + iGPU Power = 20 W of power (I realize that this is a wild guess, feel free to update the math with better numbers)
  • Adding 8 little cores adds another 20 W (Again a wild guess, please update with your own numbers).
  • Chip performs in the same voltage/frequency curve as Comet Lake (again, another guess as I don't have Alder Lake data to work with). Chip Power with iGPU ~= 20 W + 0.163 * Number of Cores * Frequency ^ 3.182. Add the 8 little core power too as needed.
  • Ignore inefficiencies as you go down the CPU line (for example, the 10600K is not the same quality chip as the 10900K).
  • Little cores do half the work of big cores at the same frequency
  • Software can efficiently utilize the big and little cores
Now, what if we applied that math to Alder Lake and assume money was no object. Then you'd come up with these hypothetical possibilities:
  • Hypothetical, 16 Big + 8 Little, 3.0 GHz. MT work: (16 + 8/2) * 3.0 = 59.7 Units. ST: 1 * 3.0 = 3.0 Units
  • Hypothetical, 16 Big + No Little, 3.2 GHz. MT work: 16 * 3.2 = 51.1 Units. ST: 1 * 3.2 = 3.2 Units
  • Hypothetical, 8 Big + 8 Little, 3.7 GHz. MT work: (8 + 8/2) * 3.7 = 44.6 Units. ST: 1 * 3.7 = 3.7 Units
  • Hypothetical, 8 Big + No Little, 4.0 GHz. MT work: 8 * 4.0 = 31.8 Units. ST: 1 * 4.0 = 4.0 Units
So, looking purely at multi-threaded performance, we want as many cores as possible. But remember the very first assumption: that is not the workload of this user. Yes, 16 Big + 16 little would be great, but it just doesn't meet the user's need and would cost Intel a fortune in yield. It also would seriously hurt single thread performance.

The 8+8 is just a sweet spot, with a barely noticeable ~7.5% drop in single thread performance over 8+0 but a sizeable ~40% multithread performance boost over 8+0. And if Intel can fully turn off the little cores so that all power is available to the big cores (unsure if that is possible so I did not include it above), then single thread performance would not even be harmed.
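The napkin math above can be reproduced with a short script. This is only a sketch of the stated model: every constant is one of the post's own guesses, the function names are invented for illustration, and the linear scaling of little-core power for counts other than 8 is an extra assumption the post does not make. It solves the power formula for the frequency that fits in the 125 W budget, then multiplies by the effective core count.

```python
# Sketch of the post's power model; all constants are the post's guesses.
POWER_LIMIT_W = 125.0     # assumed typical OEM desktop limit
UNCORE_IGPU_W = 20.0      # assumed uncore + iGPU power
LITTLE_CLUSTER_W = 20.0   # assumed power of 8 little cores
COEFF = 0.163             # coefficient of the Comet Lake-style curve
EXPONENT = 3.182          # exponent of the same curve
LITTLE_WORK_RATIO = 0.5   # assumed: a little core does half a big core's work


def big_core_frequency(big_cores: int, little_cores: int) -> float:
    """Frequency (GHz) the big cores can hold inside the power limit."""
    budget = POWER_LIMIT_W - UNCORE_IGPU_W
    # The post only gives a figure for 8 little cores; scaling it
    # linearly to other counts is an additional assumption.
    budget -= LITTLE_CLUSTER_W * little_cores / 8
    return (budget / (COEFF * big_cores)) ** (1.0 / EXPONENT)


def work_units(big_cores: int, little_cores: int) -> tuple:
    """Return (frequency, multi-thread units, single-thread units)."""
    freq = big_core_frequency(big_cores, little_cores)
    mt = (big_cores + little_cores * LITTLE_WORK_RATIO) * freq
    return freq, mt, freq


for big, little in [(16, 8), (16, 0), (8, 8), (8, 0)]:
    freq, mt, st = work_units(big, little)
    print(f"{big}+{little}: {freq:.2f} GHz, MT {mt:.1f} units, ST {st:.2f} units")
```

The listed frequencies are rounded, which is why a line such as `8 * 4.0 = 31.8` does not multiply out exactly; the work units come from the unrounded frequency. Swapping in better guesses for the 20 W figures or the curve constants is a one-line change.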
 

Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
I would try this but I'm not good at quick changes in my BIOS.
Could someone try this?

Run the Handbrake benchmark with 1 CPU (HT on) locked at 4GHz.
Run the Handbrake benchmark with 4 CPUs (HT on) locked at 1GHz.

It would be useful to see how these "compute units" we are playing with translate to actual work.
A rough simulation of Big/Little.
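Both configurations are 4 "compute units" on paper (cores times GHz), so the test would show how much is lost when those units are spread over slower cores. As a rough expectation, here is a sketch using Amdahl's law, which is my assumption and not something the posts above rely on. It also assumes speed scales linearly with clock and ignores HT; `run_time` and `parallel_fraction` are illustrative names.

```python
def run_time(cores: int, ghz: float, parallel_fraction: float) -> float:
    """Relative run time of one unit of work under Amdahl's law.

    The serial part runs on one core; the parallel part is assumed to
    scale perfectly across all cores. Both scale linearly with clock.
    """
    serial = (1.0 - parallel_fraction) / ghz
    parallel = parallel_fraction / (ghz * cores)
    return serial + parallel


for p in (0.5, 0.9, 0.99, 1.0):
    fast = run_time(1, 4.0, p)   # 1 core locked at 4 GHz
    wide = run_time(4, 1.0, p)   # 4 cores locked at 1 GHz
    print(f"parallel fraction {p:.2f}: 4x1GHz takes {wide / fast:.2f}x as long as 1x4GHz")
```

Under these assumptions the four slow cores only match the single fast core when the workload is perfectly parallel, so the gap the benchmark shows would hint at how parallel the encode really is.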
 

ondma

Platinum Member
Mar 18, 2018
2,718
1,278
136
Desktop big.little makes sense in a world that separates HEDT from just desktop. That means, if you need 16 or 32 cores, you buy a HEDT chip, and if you don't need 16 or 32 cores, then you buy a desktop chip. Let's assume for this discussion that the user does not really need HEDT and see what we can determine.

Assumptions:
  • User does not need HEDT chip.
  • User purchases typical OEM desktop.
  • Typical OEM desktop is limited to 125 W of power.
  • Uncore Power + iGPU Power = 20 W of power (I realize that this is a wild guess, feel free to update the math with better numbers)
  • Adding 8 little cores adds another 20 W (Again a wild guess, please update with your own numbers).
  • Chip performs in the same voltage/frequency curve as Comet Lake (again, another guess as I don't have Alder Lake data to work with). Chip Power with iGPU ~= 20 W + 0.163 * Number of Cores * Frequency ^ 3.182. Add the 8 little core power too as needed.
  • Ignore inefficiencies as you go down the CPU line (for example, the 10600K is not the same quality chip as the 10900K).
  • Little cores do half the work of big cores at the same frequency
  • Software can efficiently utilize the big and little cores
Now, what if we applied that math to Alder Lake and assume money was no object. Then you'd come up with these hypothetical possibilities:
  • Hypothetical, 16 Big + 8 Little, 3.0 GHz. MT work: (16 + 8/2) * 3.0 = 59.7 Units. ST: 1 * 3.0 = 3.0 Units
  • Hypothetical, 16 Big + No Little, 3.2 GHz. MT work: 16 * 3.2 = 51.1 Units. ST: 1 * 3.2 = 3.2 Units
  • Hypothetical, 8 Big + 8 Little, 3.7 GHz. MT work: (8 + 8/2) * 3.7 = 44.6 Units. ST: 1 * 3.7 = 3.7 Units
  • Hypothetical, 8 Big + No Little, 4.0 GHz. MT work: 8 * 4.0 = 31.8 Units. ST: 1 * 4.0 = 4.0 Units
So, looking purely at multi-threaded performance, we want as many cores as possible. But remember the very first assumption: that is not the workload of this user. Yes, 16 Big + 16 little would be great, but it just doesn't meet the user's need and would cost Intel a fortune in yield. It also would seriously hurt single thread performance.

The 8+8 is just a sweet spot, with a barely noticeable ~7.5% drop in single thread performance over 8+0 but a sizeable ~40% multithread performance boost over 8+0. And if Intel can fully turn off the little cores so that all power is available to the big cores (unsure if that is possible so I did not include it above), then single thread performance would not even be harmed.
Lots of assumptions there. Looks like you are taking a product and positing a scenario to fit it, instead of taking what the customer needs and designing a product to fit that.
Besides, you don't have to sacrifice single core performance to get lots of cores. AMD has already shown this, since the 5900X has equal or better turbo clocks than lower tier products with fewer cores. It will be interesting to see Alder Lake power consumption and price. I have a feeling that, even with big/little for Intel, AMD will still be able to give better multicore performance with equal or less power consumption. If the rumors of improved IPC turn out to be true for Alder Lake, I would simplify it down to 2 basic scenarios.
1. The best single core/lightly threaded performance with good multi core as well: Alder Lake. Just give me 8+0, don't bother with the small cores.
2. Best multi threaded performance: Zen 3 or 4 or whatever is available.

Of course, this could be modified by pricing. I would anticipate AL 8 + 8 would compete well with 12 zen cores. It would be much more attractive if it were priced cheaper than the 5900x. Knowing Intel though, I seriously doubt that will be the case.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Of course, this could be modified by pricing. I would anticipate AL 8 + 8 would compete well with 12 zen cores. It would be much more attractive if it were priced cheaper than the 5900x. Knowing Intel though, I seriously doubt that will be the case.

That shouldn't be that unrealistic. The 10900K's MSRP is less than the 5900X.

There will for sure be 8+0 parts released but I doubt that it will also be an unlocked model.
 

dullard

Elite Member
May 21, 2001
24,998
3,327
126
Lots of assumptions there. Looks like you are taking a product and positing a scenario to fit it, instead of taking what the customer needs and designing a product to fit that.
Besides, you don't have to sacrifice single core performance to get lots of cores. AMD has already shown this, since the 5900X has equal or better turbo clocks than lower tier products with fewer cores.
Intel is going down the mix-and-match for your needs path. Alder Lake is just one of the early steps. It might look like it is ramming a square peg into a round hole. But I think the ultimate goal is the opposite: give every user the right mix for the user's needs. Some will need more big cores, some more little cores, some more GPU, some more AI, some low power, some a mix of everything, etc. Alder Lake isn't the final end-game. It is just a necessary small step towards eventually making products that much more closely match the user's needs.
1614104867132.png

For it all to work, Intel first needs to build the ecosystem of operating systems and software optimized for all the different combinations. They could go for the brass ring and hope to get it all perfect, but then if there isn't software for it, they'll lose. A small step of starting with a hybrid approach seems like a reasonable first step. It might fail miserably for Intel, but I think with that long-term vision it is a reasonable start.

As for the rest of your post:
  • 5950X, 16 cores, 3.4 GHz base
  • 5900X, 12 cores, 3.7 GHz base
  • 5800X, 8 cores, 3.8 GHz base
  • Once you go lower than that, you get fewer cores and lower base speed, but this is due to them being the chips that weren't quite as good as the top chips (see the "Ignore inefficiencies as you go down the CPU line" assumption)
Even AMD has lower base speeds with more cores. And AMD has the benefit of a lower power consuming process. Intel is rammed up into a power wall. More cores for Intel means lower base speed.
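To put the base clock list in the same "units" as the earlier napkin math, here is a small sketch that multiplies cores by base clock. It assumes identical per-clock performance across the three chips and uses only the figures listed above; `aggregate_base` is an illustrative name.

```python
# Core counts and base clocks as listed in the post above.
ZEN3_BASE = {"5950X": (16, 3.4), "5900X": (12, 3.7), "5800X": (8, 3.8)}


def aggregate_base(cores: int, base_ghz: float) -> float:
    """Cores times base clock: a crude multi-thread 'units' figure."""
    return cores * base_ghz


for name, (cores, ghz) in ZEN3_BASE.items():
    units = aggregate_base(cores, ghz)
    print(f"{name}: {cores} cores x {ghz} GHz base = {units:.1f} units")
```

The per-core base clock falls as cores are added while the aggregate figure keeps rising, which is the same trade-off the power formula produces.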
 

ondma

Platinum Member
Mar 18, 2018
2,718
1,278
136
5950x max turbo (what I was talking about, not all core turbo) = 4.9
5900x max turbo 4.8
5800x max turbo 4.7

Of course as you load more cores, the turbo speeds eventually get lower for the higher tier processors, but that is not what I was talking about.

Ultimately, though, the problem with Alder Lake is you simply can't get enough big cores. They are trying to make up for this by adding small cores, but that seems like just an inadequate compromise for the desktop.
 

dullard

Elite Member
May 21, 2001
24,998
3,327
126
5950x max turbo (what I was talking about, not all core turbo) = 4.9
5900x max turbo 4.8
5800x max turbo 4.7

Of course as you load more cores, the turbo speeds eventually get lower for the higher tier processors, but that is not what I was talking about.

Ultimately, though, the problem with Alder Lake is you simply can't get enough big cores. They are trying to make up for this by adding small cores, but that seems like just an inadequate compromise for the desktop.
For clarification, my whole post was on base clocks (including the power formula). Turbo clocks, especially for Intel, and doubly for OEM desktops are not guaranteed. Now, if we removed the 125 W OEM assumption from my post, then we can talk turbo speeds. But then again that really isn't Intel's audience for Alder Lake.

Too many people base their posts here on reviews that rip off the default cooler, replace it with a powerful cooler, put the chip into a case with many large fans, and then talk about turbo performance. I think that is great. But it loses sight of the fact that only a small percentage of people actually do that. Many people run far closer to base clock rates with crappy OEM cooling, covered in dust, inside their desk cabinet with inadequate vents.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Ultimately, though, the problem with Alder Lake is you simply can't get enough big cores. They are trying to make up for this by adding small cores, but that seems like just an inadequate compromise for the desktop.

As long as they stick with the Ring, they aren't going more than 10. So the options are 10 big or 8 big and 8 small. Plus you have the problem that Intel's cores are just so damn big.
 

majord

Senior member
Jul 26, 2015
433
523
136
For clarification, my whole post was on base clocks (including the power formula). Turbo clocks, especially for Intel, and doubly for OEM desktops are not guaranteed. Now, if we removed the 125 W OEM assumption from my post, then we can talk turbo speeds. But then again that really isn't Intel's audience for Alder Lake.

Too many people base their posts here on reviews that rip off the default cooler, replace it with a powerful cooler, put the chip into a case with many large fans, and then talk about turbo performance. I think that is great. But it loses sight of the fact that only a small percentage of people actually do that. Many people run far closer to base clock rates with crappy OEM cooling, covered in dust, inside their desk cabinet with inadequate vents.




Turbo clocks are more guaranteed than base clocks in reality, i.e. if for some twisted reason you wanted to see a CPU running at base clocks, you'd have to try real hard. So making those calculations off base clocks doesn't reflect reality. I mean, you even used base clocks for ST performance. Why?
 

cortexa99

Senior member
Jul 2, 2018
318
505
136
11900k(Retail-ver?)

geekbench

pre-order 2 weeks before the reviews online......

Press - Product information unveil, includes pre-order and advertising - Mar. 16th, 2021, at 08:00 AM PT time

Performance review, benchmark, sales - Mar. 30th, 2021, at 06:00 AM PT time
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,323
4,904
136
It's all going to come down to pricing. If Intel can deliver a good value, the cap at 8 cores and being on a power inefficient 14nm process can be overlooked. Backporting a new and better design versus Skylake++++ combined with their long-overdue leadership change makes me cautiously optimistic.

Unfortunately, the power part is a big factor. There were plenty of lower-end Z490 boards (*cough*, ASRock) that couldn't handle the 10c/20t parts for any extended full load. I've been window shopping Z490 and Z590 boards with the intent of buying a Rocket Lake chip, but the prices for anything with decent VRMs are a serious turn-off. They'll have to price their CPUs lower versus their AMD counterparts to convince me to spend more than double on my motherboard... especially when Alder Lake promises to launch not too long after.
 

blckgrffn

Diamond Member
May 1, 2003
9,111
3,029
136
www.teamjuchems.com
It's all going to come down to pricing. If Intel can deliver a good value, the cap at 8 cores and being on a power inefficient 14nm process can be overlooked. Backporting a new and better design versus Skylake++++ combined with their long-overdue leadership change makes me cautiously optimistic.

Unfortunately, the power part is a big factor. There were plenty of lower-end Z490 boards (*cough*, ASRock) that couldn't handle the 10c/20t parts for any extended full load. I've been window shopping Z490 and Z590 boards with the intent of buying a Rocket Lake chip, but the prices for anything with decent VRMs are a serious turn-off. They'll have to price their CPUs lower versus their AMD counterparts to convince me to spend more than double on my motherboard... especially when Alder Lake promises to launch not too long after.

I hear that. It's all about pricing.

It still seems to me there is going to have to be a solid 8 core offering at $300 against the 5600X, given *maybe* performance parity for games and the like. But the fact that the 5600X can be used with any decent ~$100 motherboard and ~$20 cooler with confidence (hey USB, I know), vs spending more like $150 and $50 minimum on board and cooler with Intel, means the all-in price is still premium.

I would like an Xe equipped CPU for a new Plex server, given there should be some solid upgrades that will last a long time on the hardware transcoding front. We'll see how that shakes out over the next few months.
 