[Ashraf] 10nm "Lakefield" SoC with Intel big + little cores


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
At one time I lost quite a bit of time trying to understand why my Toshiba laptop decided to idle at ~2-3W package power, with CPU cores in deep C state sleep but very low C2/C3 Package Residency.

Using PoP DRAM should also mean the 7W TDP figure is more like 5W for the SoC itself, as DRAM can reach 2-3W according to NBC's HWInfo monitoring, and being in the same package means the DRAM has to be counted in the cooling budget. It may also be able to idle the DRAM better, in addition to basically having an on-die PCH.

I know, I have a broken Clover Trail and a Bay Trail tablet. Just broken screens. I opened the Clover Trail one up for fun and it had nothing but a single DRAM IC. I later realized it was the PoP DRAM.

The 8-inch BT device lasts 6-8 hours with a 15WHr battery and the 10-inch CT device lasts 7-9 hours with a 24WHr battery. Neither really has you caring about screen brightness, or shutting down applications and optimizing as much as I do on laptops/ultrabooks, because it's just efficient.
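For a rough sense of what those runtimes imply, dividing battery capacity by hours gives the average whole-device draw. This is only a back-of-the-envelope Python sketch: it assumes the rated WHr is fully usable (real devices hold some back), and the function name is made up for illustration.

```python
def average_draw_watts(battery_wh, runtime_hours):
    """Average whole-device power implied by draining battery_wh over runtime_hours."""
    return battery_wh / runtime_hours


# Capacities and runtime ranges are the figures quoted in the post.
devices = {
    "8-inch Bay Trail (15 WHr)": (15.0, 6.0, 8.0),
    "10-inch Clover Trail (24 WHr)": (24.0, 7.0, 9.0),
}

for name, (capacity_wh, shortest_h, longest_h) in devices.items():
    highest = average_draw_watts(capacity_wh, shortest_h)  # shortest runtime = highest draw
    lowest = average_draw_watts(capacity_wh, longest_h)    # longest runtime = lowest draw
    print(f"{name}: about {lowest:.2f} to {highest:.2f} W average")
```

That works out to roughly 1.9 to 2.5 W for the 8-inch tablet and 2.7 to 3.4 W for the 10-inch one, screen included, which is the same ballpark as the 2-3W idle package power mentioned above for the laptop.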

There's a whole thread on the NBR forum about laptop optimization. Some report going from 3 to 8 hours after optimization. It involves registry changes, updating every driver (not just from the manufacturer but from obscure sources and driver utilities), shutting down almost every non-essential service, and playing with ThrottleStop to make sure it works.

A single misbehaving application and/or piece of hardware can triple the idle package power.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
@coercitiv I ran Witcher 3 on the system in my sig for a few mins. HWInfo64 says DRAM power peak is 1.8W with 1.5W average. It's DDR4-2400.

It looks like, at least at load, LPDDR3 isn't much more efficient.

Okay, maybe 8c Tremont would be out of the question given 10nm yields. But 4c Tremont should have been doable,

It's a bit off topic but,

Die is certainly not the limiter. 4C Tremont + L2 is only 5mm2 or so. The whole die is 82mm2 for Lakefield, meaning it's a tiny portion. If they could take the 4 core + L3 portion in Icelake out, you could fit 6x Tremont clusters for 24 cores.

Wikichip's Lakefield article with more info:
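To put numbers on "tiny portion", here is a small Python sketch of the area arithmetic. It uses only the two figures quoted above (about 5mm2 per 4C cluster, 82mm2 for the die); the area of the Icelake 4 core + L3 block is not given in the post, so the sketch only reports what six clusters would need, and the helper names are hypothetical.

```python
LAKEFIELD_DIE_MM2 = 82.0     # die size quoted in the post
TREMONT_CLUSTER_MM2 = 5.0    # 4C Tremont + L2, "or so" per the post
CORES_PER_CLUSTER = 4


def area_share(part_mm2, whole_mm2):
    """Fraction of the whole area taken by one part."""
    return part_mm2 / whole_mm2


def cluster_budget(clusters):
    """Return (core count, total area in mm2) for a number of Tremont clusters."""
    return clusters * CORES_PER_CLUSTER, clusters * TREMONT_CLUSTER_MM2


cores, area_mm2 = cluster_budget(6)
one_cluster = area_share(TREMONT_CLUSTER_MM2, LAKEFIELD_DIE_MM2)
print(f"one cluster is {one_cluster:.1%} of the die")
print(f"six clusters: {cores} cores in about {area_mm2:.0f} mm2")
```

So one cluster is about 6% of the die, and six of them would need around 30mm2, ignoring any extra fabric or routing that many clusters would require.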


This is quite fascinating.

 

RetroZombie

Senior member
Nov 5, 2019
464
386
96
I just have to say... reading your posts is like reading an Intel advertisement.
I think I have it figured out.
He's an AI bot that converts intel marketing slides into walls of text!

I don't know if I absolutely agree with that. But let's see whether or not Intel can get their Atom clocks back to where Goldmont+ was and what Tremont's successor brings to the table.
Not sure. How much software is optimized for a five core CPU? Or is aware of the five CPU cores and will use the correct one(s)?
2C4T(big)+4C4T(small) would be a more 'clear design'.
Even the old Y line will be close if not better in many tests.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
Not sure. How much software is optimized for a five core CPU? Or is aware of the five CPU cores and will use the correct one(s)?
2C4T(big)+4C4T(small) would be a more 'clear design'.
Even the old Y line will be close if not better in many tests.
We should not take our "mental heuristics" aka our mental shortcuts that apply in one domain and assume they are translatable to another domain.

With a Windows or Android device you are going to have the OS running and the app you designed, but also other things that are neither the OS nor your app. In such scenarios you do not target a specific number of cores or threads; you let the OS and the hardware determine how best to allocate threads.

But in other scenarios, such as video game console development, you are in complete control of what is being run: in theory, an OS whose demands are relatively static, with yours as the only other app running. Any untapped potential is your loss, so in this scenario you actually do design your software for a specific core count, whether 7 cores, 6 cores, 8 cores and so on. One core too many or too few matters.

Different scenarios have completely different rules and thus different mental shortcuts known as heuristics of how best to optimize.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
I think I have it figured out.
He's an AI bot that converts intel marketing slides into walls of text!
I just have to say... you sound like the AMD zombie out to get anybody who's not pro AMD. Sorry, couldn't resist lol.

Not sure. How much software is optimized for a five core CPU? Or is aware of the five CPU cores and will use the correct one(s)?
The OS scheduler knows, and that's what matters, I believe.
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Die is certainly not the limiter. 4C Tremont + L2 is only 5mm2 or so. The whole die is 82mm2 for Lakefield, meaning it's a tiny portion. If they could take the 4 core + L3 portion in Icelake out, you could fit 6x Tremont clusters for 24 cores.
Tremont certainly is tiny. So I can see your point and agree that it is unlikely that 10nm yield/defect rate problems would prevent them from producing an 8c SoC. For some reason Intel keeps delaying Tremont-based products though. I keep asking myself why, and I have no good answers for that.

If it's true that GB5 isn't utilizing the Sunny Cove core in MT at all and if it's true that the Tremont cores are running at only 1.8 GHz in that test, that means 4c Tremont has basically equalled 4c Goldmont Plus (J5005) at a deficit of 900 MHz (-33% clockspeed), which is really impressive. There are plenty of AiO/corporate desktops that could use Tremont!

How much software is optimized for a five core CPU?
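The -33% figure, and what it would imply per clock, can be checked with a few lines of Python. This is a sketch under the post's own assumptions: the 2.7 GHz reference clock is simply 1.8 GHz plus the quoted 900 MHz deficit (not a verified J5005 all-core clock), the two MT scores are treated as equal, and score is assumed to scale linearly with clock. The function names are made up for illustration.

```python
def clock_deficit(reference_ghz, tested_ghz):
    """Fractional clock deficit of the tested chip relative to the reference."""
    return (reference_ghz - tested_ghz) / reference_ghz


def per_clock_gain_at_equal_score(reference_ghz, tested_ghz):
    """Per-clock throughput gain needed to match the reference score at a lower clock."""
    return reference_ghz / tested_ghz - 1.0


TREMONT_GHZ = 1.8   # clock reported for the Tremont cores in the GB5 run
J5005_GHZ = 2.7     # implied by the quoted 900 MHz deficit, not measured here

print(f"clock deficit: {clock_deficit(J5005_GHZ, TREMONT_GHZ):.1%}")
print(f"implied per-clock gain: {per_clock_gain_at_equal_score(J5005_GHZ, TREMONT_GHZ):.1%}")
```

If both "if it's true" conditions hold, matching the score at two thirds of the clock would mean roughly 50% more work per clock in that particular test.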

That bit is mostly irrelevant. Modern software design encourages the developer to use as many cores as are available, workload permitting, unless there's some very good reason not to do that (such as on a console like the PS4 where certain hardware resources are set aside for OS/background tasks). The heterogeneous nature of Lakefield may confuse the scheduler somewhat, but software developers will have no problems loading up 5 cores.
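As a concrete illustration of "use as many cores as are available", here is a minimal Python sketch that sizes its work split from whatever the OS reports instead of hard-coding a core count. It is not how any particular application does it; `split` and `parallel_sum_of_squares` are made-up names. Note that CPython threads do not run CPU-bound code in parallel because of the GIL, so the point here is the sizing pattern, not the speedup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count():
    """Number of logical CPUs the OS reports, falling back to 1 if unknown."""
    return os.cpu_count() or 1


def split(items, parts):
    """Split items into up to `parts` contiguous chunks of near-equal size."""
    parts = max(1, min(parts, len(items) or 1))
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(values):
    """Sum of squares, with one chunk of work per reported CPU (5 on a 1+4 part)."""
    chunks = split(list(values), worker_count())
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = pool.map(lambda chunk: sum(v * v for v in chunk), chunks)
        return sum(partials)


print(worker_count(), parallel_sum_of_squares(range(10)))
```

Nothing in this code cares whether the answer is 4, 5 or 8, and nothing in it knows which cores are big or small. Where each worker actually runs is left to the scheduler.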
 

moinmoin

Diamond Member
Jun 1, 2017
4,952
7,661
136
Not sure. How much software is optimized for a five core CPU? Or is aware of the five CPU cores and will use the correct one(s)?
2C4T(big)+4C4T(small) would be a more 'clear design'.
Even the old Y line will be close if not better in many tests.
This particular job is completely in the realm of the OS' task schedulers to handle and optimize for, not for individual software.
 

Tabalan

Member
Feb 23, 2020
41
25
91
Not sure. How much software is optimized for a five core CPU? Or is aware of the five CPU cores and will use the correct one(s)?
2C4T(big)+4C4T(small) would be a more 'clear design'.
Even the old Y line will be close if not better in many tests.
In pure performance the old Y line might be ahead of Lakefield because of its higher PL2. However, when it comes to idle power consumption or battery life, Lakefield should be way ahead.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
For some reason Intel keeps delaying Tremont-based products though. I keep asking myself why, and I have no good answers for that.

If it's true that GB5 isn't utilizing the Sunny Cove core in MT at all and if it's true that the Tremont cores are running at only 1.8 GHz in that test, that means 4c Tremont has basically equalled 4c Goldmont Plus (J5005) at a deficit of 900 MHz (-33% clockspeed), which is really impressive. There are plenty of AiO/corporate desktops that could use Tremont!

We do not know the yields for the various 10nm or 14nm processes (whatever plus they are on), since that is a proprietary corporate secret kept for competitive reasons. What we do know is that both the 10nm and 14nm fabs are having supply issues, with more demand per month than the factories can output. Thus I assume that, if the yields are comparable, you would choose which chips to make according to whatever metric you consider most important. Usually that metric is making the most money, but sometimes it is keeping a specific customer happy (such as Apple), or gaining some form of future insight (like trying out new Intel modems, ASICs, or something else).

What I am saying here is that Tremont may be a fabulous chip, but other chips may have something Intel prioritizes more, such as more money. Until they have an abundance of supply, making more chips per month than total demand, Tremont is going to sit low in their priority stack.

This is different from, let's say, 2013's Bay Trail quad cores (with the 22nm Silvermont microarchitecture inside). Back then Intel was capable of producing more chips than there was demand for, so they were figuring out how to fill their fab capacity, because a fab sitting 10% idle is not making the maximum amount of money it is capable of.
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
@Roland00Address

You also have to consider that Intel is trying to launch the rather-massive 38c IceLake-SP unless they've given up on it. They may be sacrificing many, many 10nm wafers on that altar.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
@Roland00Address

You also have to consider that Intel is trying to launch the rather-massive 38c IceLake-SP unless they've given up on it. They may be sacrificing many, many 10nm wafers on that altar.
Yep and that is a gamble that may pay dividends, or it may be wasting silicon on that altar.

What is the silent prayer again?
[8 previous stanzas]
So as I pray, Ice Lake Server Chips!
 

RetroZombie

Senior member
Nov 5, 2019
464
386
96
That bit is mostly irrelevant. Modern software design encourages the developer to use as many cores as are available,
This particular job is completely in the realm of the OS' task schedulers to handle and optimize for, not for individual software.
That is mostly an OS task, as developers no longer "use cores"; they create async tasks (and threads based on the number of CPU cores available).
Ok you guys are right, maybe my mind was still in the Phenom X3 era, where some software would not use the third CPU core because it wasn't expecting it, just two or four.
But let me rephrase that: what software is aware of one powerful CPU core + 4 weak ones?
Will Microsoft fix this with a specific kernel or scheduler just for that CPU?
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
Ok you guys are right, maybe my mind was still in the Phenom X3 era, where some software would not use the third CPU core because it wasn't expecting it, just two or four.

That was bad software design in 2008, 12 years ago. I could explain the history of why it happened, and why it was rare yet still bad design even then, but it is not worth the words because the world has changed. OS scheduling is different now, and the OS is no longer frozen in place; we are not running Windows XP anymore.
 

podspi

Golden Member
Jan 11, 2011
1,965
71
91
Ok you guys are right, maybe my mind was still in the Phenom X3 era, where some software would not use the third CPU core because it wasn't expecting it, just two or four.
But let me rephrase that: what software is aware of one powerful CPU core + 4 weak ones?
Will Microsoft fix this with a specific kernel or scheduler just for that CPU?

This is also the OS' job, and a great question: How does Windows handle CPUs like Lakefield? I know Android has been dealing with this for years, and the Windows on ARM machines also have to deal with this, so it's not an unsolvable problem, but I wonder what the specific solution is here.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
We're seeing a tiny bit of this in the recent Windows scheduler improvements with respect to "favored core" and keeping threads from jumping from one CCX to the next. The Windows scheduler is gaining a modicum of intelligence by either actively profiling threads or taking cues from the software itself about what kind of load profile it represents. As one example, MS is quite familiar with the behavior of the various system services that are part of Windows and will likely keep them confined to the low power cores unless they need immediate attention. The trick is to keep low priority and non-interactive threads off the high performance core as much as possible so that it can idle as often as possible. It's also important to try to keep the low power cores either idle or at around 70% load as often as possible. Fully loaded likely takes them out of their frequency and power draw sweet spot, and running them at a lower load keeps sections of those cores needlessly powered up (even with per-section clock and power gating). There's a lot going on under the hood...
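The placement idea described above can be written down as a toy policy. To be clear, this is not how the Windows scheduler works; it is a hypothetical Python sketch of the heuristic in this post, and the priority scale, the threshold of 8 and the 10-point tolerance around the 70% figure are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    interactive: bool
    priority: int  # hypothetical scale: higher means more urgent


def pick_core_type(thread, urgent_priority=8):
    """Send interactive or urgent threads to the big core, everything else to small cores."""
    if thread.interactive or thread.priority >= urgent_priority:
        return "big"
    return "small"


def in_sweet_spot(small_core_load, target=0.70, tolerance=0.10):
    """True if the small cores are idle or near the ~70% load point from the post."""
    return small_core_load == 0.0 or abs(small_core_load - target) <= tolerance


threads = [
    Thread("ui-render", interactive=True, priority=5),
    Thread("indexer-service", interactive=False, priority=2),
    Thread("driver-interrupt-work", interactive=False, priority=9),
]
for t in threads:
    print(f"{t.name}: {pick_core_type(t)} core")
```

A real scheduler would also have to decide what to do when the small cores are saturated, which this toy ignores entirely.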
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
It seems the reason the Sunny Cove core clocks so low is to keep the power levels low.

I can just verify it running Cinebench on my i3 7100. It uses 16W on 1T at 3.9GHz. If I drop it to 3GHz the power usage drops to 8.5W. AFAIK Icelake mobile doesn't differ noticeably.
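Those two data points are enough to estimate how steeply power rises with clock on that chip. The Python sketch below fits a simple power law, P proportional to f^n, through the two measurements; that model is an assumption (package power also includes uncore and leakage, and voltage was not recorded), so treat the exponent as a rough indicator only.

```python
import math


def scaling_exponent(p_high_w, f_high_ghz, p_low_w, f_low_ghz):
    """Exponent n in P ~ f**n implied by two (power, frequency) measurements."""
    return math.log(p_high_w / p_low_w) / math.log(f_high_ghz / f_low_ghz)


def watts_per_ghz(power_w, freq_ghz):
    """Crude energy-per-work proxy: package watts per GHz of clock."""
    return power_w / freq_ghz


# i3 7100, Cinebench 1T, figures from the post
n = scaling_exponent(16.0, 3.9, 8.5, 3.0)
print(f"implied exponent: {n:.2f}")
print(f"W/GHz at 3.9 GHz: {watts_per_ghz(16.0, 3.9):.2f}")
print(f"W/GHz at 3.0 GHz: {watts_per_ghz(8.5, 3.0):.2f}")
```

The implied exponent comes out at roughly 2.4, meaning the last 30% of clock costs nearly twice the power, which fits the idea that the Sunny Cove core is held back to stay inside the power budget.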

If anything, a mobile U chip will have a harder time clocking high at the same voltage level. Notebook users report that the H series chips use less power at the bleeding edge of the frequency range than U chips such as Whiskey Lake.
 

jpiniero

Lifer
Oct 1, 2010
14,599
5,218
136

First benchmarks. It does manage to beat the 8cx in the Galaxy Book S in the browser tests but only barely in most of them.

Frequency is pretty low on the R15 run, only getting 2.4 GHz on the single core test and 1.9 GHz on the multi core test. That gives a score of 255 on the MT test, which is pretty low and about what an N5000 gives anyway.
 

NTMBK

Lifer
Nov 14, 2011
10,237
5,020
136
Well, that's pretty disappointing. I hope the battery life makes up for it.