Discussion Intel current and future Lakes & Rapids thread

Page 496

Racan

Golden Member
Sep 22, 2012
1,292
2,372
136
It's important to note that is for Golden Cove only. Intel is claiming larger gains for Gracemont. It will be interesting to see where everything lands.
Yeah, so far it looks like Gracemont is a far larger leap over its predecessor.
 
Feb 17, 2020
108
289
136
Would it be unreasonable to expect the usual 15% IPC gains? With V-cache on top of that I think it will take the gaming crown at least.

Yes. If Zen 4's a derivative, ~10% IPC is a more realistic expectation. Zen->Zen 2's 15% was abnormally high for a tick iteration. Zen 2->Zen 3 isn't relevant given that Zen 3 was a tock. As far as V-cache goes, you shouldn't expect it on anything other than a crazy-expensive halo SKU; I'm talking like $1500. Stacking is still in its infancy and the tools being used can't handle the volume to do anything else. Plus the extra L3 is mainly useful for compile workloads, so I could see any capacity being bought up by corporate customers.

Now with that said, it's not like Zen 4's immediately dead in the water or anything. It'll still have a commanding power and area lead over Golden Cove, even factoring in Golden Cove's assumed IPC lead. Being on N5 will also give it extra clocks and efficiency. Plus it looks like it'll have AVX-512, which we now know Alder Lake lacks. So even if there's a slight IPC disadvantage, Zen 4 should easily make that up in other areas. If Alder Lake was only Golden Cove, AMD wouldn't need to worry. They'd probably break even on desktop and dominate in mobile while offering AVX-512, but Alder Lake isn't just Golden Cove.

The real problem for AMD is going to be Gracemont (frankly, I've been saying this for a while). It probably won't help a ton in gaming since the schedulers will have some teething issues. But in mobile, AMD simply doesn't have an answer. Gracemont could also be used in the server. Imagine a Sapphire Rapids fork that uses 240 Gracemont cores instead of 60 Golden Cove cores (or maybe mix and match the die). That could be an extremely compelling product depending on your workload.
 
  • Like
Reactions: Exist50

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
EDIT: some predictions of what Intel is gonna do with Golden Cove, beyond traditional "larger" OoO resources and improved branch prediction:

Prediction check time, now that Intel has spilled the beans about GC.

1) 0 bubble BPU - pretty much obvious, even the best branch prediction in the world would be 2 cycles slower than AMD for predicted taken branches

Jury is still out about this since Intel is rather secretive about BPU, remains to be seen what actually has been done for predicted taken branches.

2) 6 uOps from decode OR more tricks in the decode stage, like what Apple does with op elimination before they even move further into the frontend

Yup, 6 wide decode and tricks are also in and allocation is 6-wide also.

3) One more ACTUAL execution port. Core went to 3 ports, Haswell increased to 4 and Intel has lived with 4 since then. Piling transistors on ports is not the same as having a 5th one. AMD has 8 independent pipes in Int/FP, and from Cinebench results I am almost certain we got a 5th port and a proper increase in dispatch / retirement

Finally a damn fifth port. Not much needs to be added.

4) 3 loads are very likely, since it has been 10+ years since Sandy Bridge was designed and even AMD has moved to 3 load ports now.

Yay, good news overall, and I am very happy with the 3x256bit OR 2x512bit part; it could have been done in more restrictive ways.



Overall good accuracy of predictions, and kudos to Intel for doing the right things.
 

dullard

Elite Member
May 21, 2001
25,913
4,500
126
Besides even if this would be the case, it would not change the scheduler at all
...
You have to understand that this feature (mapping certain threads to a subset of available cores) has been supported since literally forever, and is not a new feature.
The scheduler is already working perfectly fine, which you can see if you happen to have a Windows big.LITTLE machine (perhaps excluding Lakefield)
What are you talking about? Windows has had such a scheduler for years in essentially any Windows ARM device.
Sure, there could be many optimizations ... but not in the algorithm which determines the schedule in the above situation (aka the task of the scheduler). That's because the schedule is trivial, and even the most barebones scheduler would be able to determine the optimum schedule.
Even sophisticated performance counters will not give any insights to change the schedule in these cases, because the schedule is already optimal.
The heterogeneous Windows scheduler is already implemented and is used for every device which features heterogeneous core configurations.
Hmm. What information just came out? Oh yeah, a whole new thread scheduler. You know, the one that Thala denied was happening over and over and over again. https://www.anandtech.com/show/16881/a-deep-dive-into-intels-alder-lake-microarchitectures/2 Some tidbits from that article:
  • With Alder Lake it gets a bit more complex, and the company has built a technology called Thread Director.
  • This new technology is a combined hardware/software solution that Intel has engineered with Microsoft focused on Windows 11.
  • This fundamental change is one reason why Windows 11 exists.
  • the difference between Windows 10 and Windows 11 is how much information is available to the scheduler about what is running.
  • Intel’s Thread Director controller puts an embedded microcontroller inside the processor such that it can monitor what each thread is doing and what it needs out of its performance metrics
  • it will give hints to the OS as to which thread is best to move
  • Windows 11 will mean threads will move more often than in Windows 10
Wow, it is so odd that Intel added a new microcontroller and Microsoft added a whole new operating system on a problem that was already solved. If only they could have spoken to @Thala, they would have saved so much time, effort, and expense.
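
For what it's worth, the "hints to the OS" idea from those tidbits can be sketched in a few lines. This is only a toy model, not Intel's or Microsoft's actual logic: the `ThreadHint` class, the `big_core_benefit` score, and `place_threads` are all made-up names for illustration, and the real Thread Director interface is not described in this thread.

```python
from dataclasses import dataclass


@dataclass
class ThreadHint:
    name: str
    # Hypothetical score: how much this thread is estimated to gain
    # from running on a big core (higher means more benefit).
    big_core_benefit: float


def place_threads(hints, big_cores):
    """Toy policy: give the big cores to the threads with the highest hinted benefit."""
    ranked = sorted(hints, key=lambda h: h.big_core_benefit, reverse=True)
    placement = {}
    for i, hint in enumerate(ranked):
        placement[hint.name] = "big" if i < big_cores else "small"
    return placement


# Made-up threads and scores, purely for illustration.
hints = [
    ThreadHint("game_render", 0.9),
    ThreadHint("background_sync", 0.1),
    ThreadHint("video_encode", 0.7),
]
placement = place_threads(hints, big_cores=2)
print(placement)
```

The point of the sketch is only that the OS still makes the final placement decision; the hardware side just supplies better information to rank the threads.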
 

eek2121

Diamond Member
Aug 2, 2005
3,384
5,011
136
The real problem for AMD is going to be Gracemont (frankly, I've been saying this for a while). It probably won't help a ton in gaming since the schedulers will have some teething issues. But in mobile, AMD simply doesn't have an answer. Gracemont could also be used in the server. Imagine a Sapphire Rapids fork that uses 240 Gracemont cores instead of 60 Golden Cove cores (or maybe mix and match the die). That could be an extremely compelling product depending on your workload.

I kept telling people this. Gracemont and its successors are the cores to watch. Golden Cove is fast, but huge and power hungry. Gracemont is x86 done right. If the Atom team keeps this up, *cove will be dead.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
Hmm. What information just came out? Oh yeah, a whole new thread scheduler. You know, the one that Thala denied was happening over and over and over again. https://www.anandtech.com/show/16881/a-deep-dive-into-intels-alder-lake-microarchitectures/2 Some tidbits from that article:
  • With Alder Lake it gets a bit more complex, and the company has built a technology called Thread Director.
  • This new technology is a combined hardware/software solution that Intel has engineered with Microsoft focused on Windows 11.
  • This fundamental change is one reason why Windows 11 exists.
  • the difference between Windows 10 and Windows 11 is how much information is available to the scheduler about what is running.
  • Intel’s Thread Director controller puts an embedded microcontroller inside the processor such that it can monitor what each thread is doing and what it needs out of its performance metrics
  • it will give hints to the OS as to which thread is best to move
  • Windows 11 will mean threads will move more often than in Windows 10
Wow, it is so odd that Intel added a new microcontroller and Microsoft added a whole new operating system on a problem that was already solved. If only they could have spoken to @Thala, they would have saved so much time, effort, and expense.
I find how @coercitiv gets a pass in this scheduler talk a bit interesting lol
 

coercitiv

Diamond Member
Jan 24, 2014
7,225
16,982
136
Wow, it is so odd that Intel added a new microcontroller and Microsoft added a whole new operating system on a problem that was already solved. If only they could have spoken to @Thala, they would have saved so much time, effort, and expense.
Read the last tidbit?
Windows 10 does not get Thread Director, but relies on a more basic version of Intel’s Hardware Guided Scheduling (HGS).
So not only did Intel add a new microcontroller and Microsoft add a WHOLE new operating system for the problem, they also had someone travel back in time and add "basic" Hardware Guided Scheduling in Windows 10.

I find how @coercitiv gets a pass in this scheduler talk a bit interesting lol
Tell me more about how you feel.
 

dullard

Elite Member
May 21, 2001
25,913
4,500
126
I find how @coercitiv gets a pass in this scheduler talk a bit interesting lol
I never got into debates with coercitiv about it. But, we can also quote him whenever he claimed there would be no changes. Yes, schedulers already existed. But, this is a fundamental shift in the way they work.
 

eek2121

Diamond Member
Aug 2, 2005
3,384
5,011
136
I never got into debates with coercitiv about it. But, we can also quote him whenever he claimed there would be no changes. Yes, schedulers already existed. But, this is a fundamental shift in the way they work.

I am surprised it took Microsoft this long TBH.

Many Android devices have custom and/or tweaked schedulers.
 

Panino Manino

Golden Member
Jan 28, 2017
1,109
1,360
136
If AMD’s Zen 4 processors plan to support some form of AVX-512 as has been theorized, even as dual-issue AVX2 operations, we might be in some dystopian processor environment where AMD is the only consumer processor on the market to support AVX-512.

Like people say around here, "o mundo não gira, ele capota" ("the world doesn't turn, it flips over").
OK, but so what? Would AVX-512 on Zen 4 actually be useful for home consumers?
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
This "magical" scheduler talk needs to stop. The scheduler does not do "heavy lifting"; it has only one job, which is to minimize performance loss. In the case of benchmarks such as CB, GB, Passmark, the scheduler's job is extremely simple: put the ST test on a big core, put the MT test on all cores. It doesn't get more basic than that.

Moreover, it has been repeatedly explained in this thread that Microsoft already has a hybrid scheduler, which is also available in Win 10. Whether MS and Intel have worked to fine-tune this scheduler for Win 11 w/ Alder Lake to extract more efficiency in complex real-world scenarios that are not synthetic benchmarks with very scalable workloads remains to be seen, but even then don't imagine big jumps in efficiency, because the bulk of the effort was done years ago already.
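
The benchmark case described above (ST test on a big core, MT test on all cores) really is about this simple. A minimal sketch, assuming a made-up `schedule` helper and hypothetical core names; it says nothing about how the Windows scheduler is actually implemented:

```python
def schedule(num_threads, big_cores, small_cores):
    """Toy placement: fill big cores first, then spill onto small cores.

    Returns one core name per thread; threads beyond the total core count
    are simply not placed in this sketch.
    """
    cores = [f"big{i}" for i in range(big_cores)]
    cores += [f"small{i}" for i in range(small_cores)]
    return cores[:num_threads]


# Single-threaded benchmark: the one thread lands on a big core.
st_placement = schedule(1, big_cores=8, small_cores=8)

# Multi-threaded benchmark: one thread per core, big and small alike.
mt_placement = schedule(16, big_cores=8, small_cores=8)

print(st_placement)
print(mt_placement)
```

Where it stops being trivial is mixed real-world load, which is exactly the case the fine-tuning question above is about.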

Windows 10 Scheduler Aware of "Lakefield" Hybrid Topologies


Bolded.

For the love of-

Hardware-guided is just a fancy way of saying "we have more performance counters that Windows can now use to get an accurate understanding of what's taking place in each core, and thus can schedule better". You still rely on the Windows scheduler at the end of the day.

You can't schedule around big.LITTLE purely in hardware. Seriously - Apple tried with the A10 and their first foray into heterogeneous architectures. They dropped it after just a single generation of use.
Fancy performance counters, I guess.

Edit:
1629386555712.png
 

coercitiv

Diamond Member
Jan 24, 2014
7,225
16,982
136
OK, let's look at what I wrote 1 month ago
Moreover, it has been repeatedly explained in this thread that Microsoft already has a hybrid scheduler, which is also available in Win 10. Whether MS and Intel have worked to fine-tune this scheduler for Win 11 w/ Alder Lake to extract more efficiency in complex real-world scenarios that are not synthetic benchmarks with very scalable workloads remains to be seen, but even then don't imagine big jumps in efficiency, because the bulk of the effort was done years ago already.

and compare with what the article states today in relation to Win 10
Windows 10 does not get Thread Director, but relies on a more basic version of Intel’s Hardware Guided Scheduling (HGS). In our conversations with Intel, they were cagy to put any exact performance differential metrics between the two, however based on understanding of the technology, we should expect to see better frequency efficiency in Windows 11. Intel stated that even though the new technology in Windows 11 will mean threads will move more often than in Windows 10, potentially adding latency, in their testing it wasn’t in any way human perceivable.
Hmm, so Alder Lake does get hybrid scheduling support in Win 10, and it appears to be a basic/early version of the "hardware guided" scheduler as well. And notice how Intel reps aren't really beating the drums on some massive & magical performance jump from the new scheduler, but instead focus on describing improved efficiency and an "imperceptible" latency increase due to thread migration.

So, I ask you again, what is it you feel I'm getting a pass for?
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
I think the bit most missed is that the "Up to 2x Multithreaded Performance" slide lists the two key components that make this possible:
1. New Gracemont Cores
2. Hardware-Guided Scheduling

That's enough hint for me as to why that preliminary multithreaded result seemed baffling.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
OK, let's look at what I wrote 1 month ago


and compare with what the article states today in relation to Win 10

Hmm, so Alder Lake does get hybrid scheduling support in Win 10, and it appears to be a basic/early version of the "hardware guided" scheduler as well. And notice how Intel reps aren't really beating the drums on some massive & magical performance jump from the new scheduler, but instead focus on describing improved efficiency and an "imperceptible" latency increase due to thread migration.

So, I ask you again, what is it you feel I'm getting a pass for?
You were always quick to insist hybrid scheduling was supported and fully functional on Win 10. Note that you wrote what I quoted AFTER the Intel slide announcing HGS for Alder Lake. Besides, any prior presence of HGS in Win 10, in any form, is only software based. The slide explicitly said: "Hardware-guided scheduling." Why would you be running around with graphs of Lakefield on the Win 10 scheduler?
 

coercitiv

Diamond Member
Jan 24, 2014
7,225
16,982
136
Besides, any prior presence of HGS in Win 10, in any form, is only software based.
For the sake of logic, how can it be both hardware-guided and "only software based"?

Why would you be running around with graphs of Lakefield on Win 10 scheduler?
June 10, 2020
Intel Hybrid Processors: Uncompromised PC Experiences for Innovative Form Factors Like Foldables, Dual Screens
Today, Intel launched Intel® Core™ processors with Intel® Hybrid Technology, code-named “Lakefield.”
Hardware-guided OS scheduling: Enabling real-time communication between the CPU and the OS scheduler to run the right apps on the right cores, the hybrid CPU architecture helps deliver up to 24% better performance per SOC power³ and up to 12% faster single-threaded integer compute-intensive application performance⁴.

Do I need to start highlighting text in (red) before you acknowledge the basic fact that Intel already laid the foundation for HGS with Lakefield and Win 10?
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
For the sake of logic, how can it be both hardware-guided and "only software based"?


June 10, 2020
Intel Hybrid Processors: Uncompromised PC Experiences for Innovative Form Factors Like Foldables, Dual Screens



Do I need to start highlighting text in (red) before you acknowledge the basic fact that Intel already laid the foundation for HGS with Lakefield and Win 10?
Because there's a software and a hardware component to this new approach. Apparently, Win 10 covers part of the software component. Answer this:
1. Is Windows 11 different from Windows 10 in software scheduling?
2. Is there a microcontroller, aka "Thread Director", in Alder Lake to manage thread scheduling? Is this controller present in any other arch that you're aware of on the Intel platform, including Lakefield?
 

coercitiv

Diamond Member
Jan 24, 2014
7,225
16,982
136
Because there's a software and a hardware component to this new approach.
You just wrote 2 posts ago that Win 10 scheduler for Lakefield was "only software based". I showed you a direct quote from Intel that clearly stated Lakefield on Win 10 had HGS - Hardware Guided Scheduling. Not only were you misinformed, but now that the cat's out of the bag, you swiftly start the goalpost shifting... the scheduling is different... the hardware is different!

It ain't called hardware-guided if there's no hardware to guide it.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
You just wrote 2 posts ago that Win 10 scheduler for Lakefield was "only software based". I showed you a direct quote from Intel that clearly stated Lakefield on Win 10 had HGS - Hardware Guided Scheduling. Not only were you misinformed, but now that the cat's out of the bag, you swiftly start the goalpost shifting... the scheduling is different... the hardware is different!

It ain't called hardware-guided if there's no hardware to guide it.
So you agree HGS for Alder Lake, i.e. Thread Director, is not the same as whatever was on Win 10 + previous hybrid archs? You were quick to assume Win 10 was already fully sufficient. Don't try to muddy the waters, please.
 

CakeMonster

Golden Member
Nov 22, 2012
1,621
798
136
Performance on the small cores is going to be super interesting. I can't wait for side by side comparisons with AMD when games draw from many cores and start using the small ones, compared to cores 8-16 on the 5950X.
 

Abwx

Lifer
Apr 2, 2011
11,835
4,789
136
46.jpg


So 19% IPC improvement over Rocket Lake.

Sort of...

average (i.e., geomean) improvement of 19%, across a wide range of existing workloads.

Based on overall scores and individual subcomponent scores on: SYSmark 25, CrossMark, PCMark 10, SPEC CPU 2017, WebXPRT 3, Geekbench 5
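
Since the 19% is a geomean rather than a plain average, here is a quick sketch of how such a number is computed. The per-workload speedups below are invented for illustration; they are not Intel's data, and `geomean` is just a helper written for this example.

```python
import math


def geomean(ratios):
    """Geometric mean: the n-th root of the product, computed via logs."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-workload speedups over the previous generation
# (1.10 means 10% faster). These values are made up.
speedups = [1.10, 1.25, 1.19, 1.30, 1.12]

gm = geomean(speedups)
am = sum(speedups) / len(speedups)
print(f"geomean: {gm:.3f}, arithmetic mean: {am:.3f}")
```

The geomean is never higher than the arithmetic mean, so one outlier workload pulls it up less, which is why it is the usual choice for averaging ratios across benchmark suites.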


 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
In the last two weeks people discovered there's such a thing as PL4 for Intel CPUs. We are still in the process of acknowledging (as a group) the major difference between continuous power and peak power consumption for both modern PC power supplies and VRM stages.

As a general rule of thumb, as long as you see people arguing about power consumption in this thread it means there have been no further developments/leaks about launch timeline and/or performance. Get well soon.
FYI, incredibly high power lasers use that incredibly high power only for an incredibly short amount of time as well :rolleyes::D:D:D
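
To put numbers on the continuous vs. peak distinction, here is a small sketch with an invented power trace. The sample values are hypothetical and are not measurements of any Intel CPU or of PL4 behaviour.

```python
# Hypothetical package power samples in watts, taken at equal intervals.
# One short spike sits in an otherwise steady trace.
samples = [65, 65, 250, 70, 65, 65, 65, 65, 65, 65]

peak = max(samples)
average = sum(samples) / len(samples)

print(f"peak: {peak} W, average: {average} W")
```

The PSU and VRM have to survive the peak, but heat and the power bill follow the average, so quoting only one of the two numbers tells half the story.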
 

jpiniero

Lifer
Oct 1, 2010
16,493
6,984
136
Interesting... Intel didn't even use the Windows 11 beta for their performance comparison with Rocket Lake. They used 20H2.