Question Raptor Lake - Official Thread


Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
Since we already have the first Raptor Lake leak, I'm thinking it should have its own thread.
What do we know so far?
From Anandtech's Intel Process Roadmap articles from July:

Built on Intel 7 with upgraded FinFET
10-15% better performance per watt (PPW)
Last non-tiled consumer CPU as Meteor Lake will be tiled

I'm guessing this will be a minor update to ADL with just a few microarchitecture changes to the cores. The larger change will be the new process refinement allowing 8+16 at the top of the stack.

Will it work with current Z690 motherboards? If yes, then that could be a major selling point for people to move to ADL rather than wait.
 
  • Like
Reactions: vstar

dullard

Elite Member
May 21, 2001
24,998
3,327
126
It only supplements MT workloads like cinebench, while creating so much complication/complexity in software optimizations.
It is one line of code to specify the preferred core(s) when you create a thread. Is that your definition of "so much complication"?
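To make that concrete, here is a minimal sketch on Windows. The post doesn't name a specific API, so this assumes the Win32 priority call plus the Windows 11 power-throttling (EcoQoS) hint, which is one documented way to tell the scheduler a thread is fine on efficiency cores:

```cpp
// Minimal sketch (assumes a Windows 11 SDK). A latency-critical thread would
// instead get a higher priority and no throttling hint.
#include <windows.h>

void mark_thread_background_friendly(HANDLE thread)
{
    // Classic priority hint: this thread only needs to finish "eventually".
    SetThreadPriority(thread, THREAD_PRIORITY_BELOW_NORMAL);

    // Windows 11 EcoQoS hint: allow the scheduler to favor efficiency (E) cores.
    THREAD_POWER_THROTTLING_STATE state{};
    state.Version     = THREAD_POWER_THROTTLING_CURRENT_VERSION;
    state.ControlMask = THREAD_POWER_THROTTLING_EXECUTION_SPEED;
    state.StateMask   = THREAD_POWER_THROTTLING_EXECUTION_SPEED;  // throttled = efficiency-first
    SetThreadInformation(thread, ThreadPowerThrottling, &state, sizeof(state));
}
```

The hint is advisory; the thread director and the OS scheduler still make the final placement call.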
 

maddie

Diamond Member
Jul 18, 2010
4,723
4,628
136
It is one line of code to specify the preferred core(s) when you create a thread. Is that your definition of "so much complication"?
What happens when you have lots of threads? Surely deciding which thread goes on which core becomes a lot more complicated? How will the thread director determine priority values? I also assume that each user has their own set of priorities as to which software takes precedence.
 

FangBLade

Member
Apr 13, 2022
199
395
96
What if there is some part of the software that could benefit more from performance cores? Will the thread scheduler recognize it and immediately switch from little to big cores? Is it smart enough? Or what if some update breaks the algorithms and the application starts switching between big and little cores, adding latency/regression? There are thousands of combinations for how you could optimize software to use big/little; it is very complicated from the perspective of developers, so I hope they will bury the big/little idea once the MCM 3D-stacked core design is finished.
 

dullard

Elite Member
May 21, 2001
24,998
3,327
126
What happens when you have lots of threads? Surely deciding which thread goes on which core becomes a lot more complicated? How will the thread director determine priority values? I also assume that each user has their own set of priorities as to which software takes precedence.
Again, in the long term it is mostly NOT up to the thread director. It is up to the programmers, who intimately know the software and its needs, when they set the priorities and the suggested P or E cores to use. And in the long term just about every thread goes onto an E core once there are dozens of E cores. At that point the thread director will be useful for putting intense threads on the colder cores to spread out heat. The P cores will be reserved for the user interface and other high-priority single-threaded uses. The only real problem is short-term, because Intel released chips with too few E cores.

Users with different needs can already set priorities (either through Windows performance modes or, very clumsily, through Process Lasso-type software). Theoretically, this could get even more sophisticated if programmers want. They could let users set priorities in the options.

The same problems existed before. What if one program wants 10 threads and another wants 20 threads, but your processor only has 16 available? There is always a conflict. Now the only real change is that there are more options and software can state more precisely what it wants done.
 

ondma

Platinum Member
Mar 18, 2018
2,718
1,278
136
What if there is some part of the software that could benefit more from performance cores? Will the thread scheduler recognize it and immediately switch from little to big cores? Is it smart enough? Or what if some update breaks the algorithms and the application starts switching between big and little cores, adding latency/regression? There are thousands of combinations for how you could optimize software to use big/little; it is very complicated from the perspective of developers, so I hope they will bury the big/little idea once the MCM 3D-stacked core design is finished.
I am not a chip designer or software engineer, but just from a layman's point of view, scheduling would not seem to be an impossible task, since ARM has been doing it for years.
 

dullard

Elite Member
May 21, 2001
24,998
3,327
126
What if there is some part of the software that could benefit more from performance cores? Will the thread scheduler recognize it and immediately switch from little to big cores? Is it smart enough? Or what if some update breaks the algorithms and the application starts switching between big and little cores, adding latency/regression? There are thousands of combinations for how you could optimize software to use big/little; it is very complicated from the perspective of developers, so I hope they will bury the big/little idea once the MCM 3D-stacked core design is finished.
Thread director is not smart enough to do that perfectly (it just does an okay job). That is why there are now options for programmers to specify which type of core to run each thread on. As a programmer myself, it seems pathetically easy. Some tasks need to be done quickly; other tasks only need to be done eventually. I've never created a thread and been unable to decide how it needs to perform. Take gaming as an example. If I were to program a 3D shooter, I'd want as quick a response as possible to user inputs (as little lag as possible for key/button presses and mouse moves). Next, I'd want to emphasize smooth gameplay and graphics. Other threads would be much less important. The thread that updates the map can be slower. The threads spawning items can be slower. The thread running the explosions can be slower, as it is just a visual effect with no gameplay impact. Etc. I can't imagine thinking that the item inventory thread needs to be on a P core -- or not even knowing what type of core would be best for that thread.
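As a sketch of that split: the thread names and loop bodies below are hypothetical stand-ins, and the Win32 priority values are only illustrative. On Windows 11 an EcoQoS hint like the one shown earlier is what would actually steer the low-urgency threads toward E cores; here only the relative priorities are shown.

```cpp
#include <windows.h>
#include <thread>

// Hypothetical game threads illustrating the division of labor described above;
// the bodies are placeholders, only the priority hints matter for this example.
void input_loop()     { /* poll keys/mouse, forward to game logic */ }
void render_loop()    { /* build and submit frames */ }
void inventory_loop() { /* bookkeeping that only has to finish eventually */ }

int main()
{
    std::thread input(input_loop);
    std::thread render(render_loop);
    std::thread inventory(inventory_loop);

    // Lag-sensitive work asks for the most urgency...
    SetThreadPriority(input.native_handle(), THREAD_PRIORITY_HIGHEST);
    SetThreadPriority(render.native_handle(), THREAD_PRIORITY_ABOVE_NORMAL);
    // ...while the "can be slower" threads explicitly ask for less.
    SetThreadPriority(inventory.native_handle(), THREAD_PRIORITY_BELOW_NORMAL);

    input.join(); render.join(); inventory.join();
}
```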

Big/Little is an optimization. Certain tasks really need to be done quickly; other tasks do not. Thus, there is no need to hog precious resources on the lower-priority threads. Having little cores lets Windows do its background tasks, lets virus scanners run, lets encryption/decryption occur, lets printing tasks run, etc., all without taking up much CPU power. This gives more of the limited CPU power to what the user really wants done quickly.

As CPUs get more and more cores, you run into a problem. Suppose you have the typical 125 W CPU and 4 cores. Each core gets 31 W. That is more than necessary; all cores can get the full power that they want. But now suppose you have 16 cores -- each core gets 7.8 W. Now you have a problem. Performance starts to suffer, as most desktop CPU cores could go much faster if they had more than 7.8 W. And what if you have 64 cores? Each core only gets 1.9 W. That is the situation where desktop CPUs need to be massively slowed down to much lower frequencies, since the cores just don't have the power they need to perform well. For example, in CPU-Z, Alder Lake P-cores at 2 W only perform at 8% of their full speed. By not using big/little, you are actually ADDING the latency that you just claimed you don't want. But what if instead you split it? Let some cores get full power and other cores get lower power. Then you can let the lag/latency-sensitive tasks go at full speed (on high-power cores) and the rest at a diminished speed on low-power cores (but you still get massive parallelism benefits). Best of both worlds.
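The arithmetic behind those numbers, spelled out as a trivial sketch (the 125 W figure is just the package budget assumed above; real chips also spend part of it on the uncore, cache and I/O):

```cpp
#include <cstdio>

int main()
{
    const double package_watts = 125.0;  // assumed desktop power budget from the post
    for (int cores : {4, 16, 64}) {
        std::printf("%2d cores -> %5.2f W per core\n", cores, package_watts / cores);
    }
    // Prints roughly 31.25, 7.81 and 1.95 W per core -- the figures used above.
}
```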

Nothing about 3D stacking will eliminate the fact that as you add more cores, you must reduce the power to each core, thus reducing performance and adding lag.
 
Last edited:

desrever

Member
Nov 6, 2021
108
262
106
It is up to the programmers
Why would a programmer even know how to prioritize their threads? Without heuristics at the OS level, all any programmer would want is for their threads to finish ASAP. They might be able to prioritize threads within their own app, but they are not going to be able to prioritize their app's threads in relation to other apps. Also, nobody wants their threads to run on the small cores unless it's a background task, because they want maximum responsiveness. Background tasks are not going to stress the CPU; this is why Apple goes with 2 small cores and many large cores.
 

John Carmack

Member
Sep 10, 2016
153
246
116
It is one line of code to specify the preferred core(s) when you create a thread. Is that your definition of "so much complication"?

Thread director is not smart enough to do that perfectly (it just does an okay job). That is why there are now options for programmers to specify which type of core to run each thread on. As a programmer myself, it seems pathetically easy. Some tasks need to be done quickly; other tasks only need to be done eventually. I've never created a thread and been unable to decide how it needs to perform. Take gaming as an example. If I were to program a 3D shooter, I'd want as quick a response as possible to user inputs (as little lag as possible for key/button presses and mouse moves). Next, I'd want to emphasize smooth gameplay and graphics. Other threads would be much less important. The thread that updates the map can be slower. The threads spawning items can be slower. The thread running the explosions can be slower, as it is just a visual effect with no gameplay impact. Etc. I can't imagine thinking that the item inventory thread needs to be on a P core -- or not even knowing what type of core would be best for that thread.

Big/Little is an optimization. Certain tasks really need to be done quickly; other tasks do not. Thus, there is no need to hog precious resources on the lower-priority threads. Having little cores lets Windows do its background tasks, lets virus scanners run, lets encryption/decryption occur, lets printing tasks run, etc., all without taking up much CPU power. This gives more of the limited CPU power to what the user really wants done quickly.

R.I.P. to the hypebeasts who in the months before Alder Lake's launch came in here and techsplained to us how the "problems" with thread scheduling were fiction because the issue was already solved.
 

FangBLade

Member
Apr 13, 2022
199
395
96
We know how much developers are adapting, especially game developers, who are known to be lazy. Parallelization in game development is hard, especially optimizing it for big/little, so we can forget about that. Games that use more than 8 cores will cause trouble for Intel because of latency between big and little cores, but there won't be many of those, so Intel can sleep for now. As for other software, it will take years before the most popular and widely used applications use big/little properly. I would still choose an all-big-core design so I don't have to wait for proper optimization, and because each big core is much stronger than each little core you don't need as many of them, and they can be power efficient too: unused cores go to a sleep state and the problem is solved.
 

Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
While the TD behavior isn't perfect, after working with Windows 11 and a 12700K for 9 months I have come to like always having the 8 Golden Coves on the foreground app.
My biggest issue is that I don't have enough E cores to keep the background tasks moving along quickly enough. I'd be perfectly happy with it if I had 12 or 16 E's.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,483
14,434
136
While the TD behavior isn't perfect, after working with Windows 11 and a 12700K for 9 months I have come to like always having the 8 Golden Coves on the foreground app.
My biggest issue is that I don't have enough E cores to keep the background tasks moving along quickly enough. I'd be perfectly happy with it if I had 12 or 16 E's.
My question is, what if all those e-cores did not fix the problem, because of the thread director? I am not saying it will never work, just that in its infancy, I am not sure anybody will be happy with it. I guess time will tell. In the meantime, give me all P-cores. At a reasonable power level.
 

TheELF

Diamond Member
Dec 22, 2012
3,967
720
126
Why would a programmer even know how to prioritize their threads? Without heuristics at the OS level, all any programmer would want is for their threads to finish ASAP. They might be able to prioritize threads within their own app, but they are not going to be able to prioritize their app's threads in relation to other apps. Also, nobody wants their threads to run on the small cores unless it's a background task, because they want maximum responsiveness. Background tasks are not going to stress the CPU; this is why Apple goes with 2 small cores and many large cores.
It's not 1990 anymore, where we only had one core at 500 MHz or something.
Look at your clocks during normal operation of Windows: unless you have a locked all-core overclock, most of the time your clocks will be very low and everything is still completely responsive.
A programmer can put less needy threads on the e-cores and not lose anything.
Everybody is always all about power efficiency and optimization, but here the same people are all like "nah brah, need max power, thread go vroom".
Adjusting your threads to use only as many CPU cycles as they need to do what they need to do is what optimization is all about.
We know how much developers are adapting, especially game developers, who are known to be lazy. Parallelization in game development is hard, especially optimizing it for big/little, so we can forget about that. Games that use more than 8 cores will cause trouble for Intel because of latency between big and little cores, but there won't be many of those, so Intel can sleep for now. As for other software, it will take years before the most popular and widely used applications use big/little properly. I would still choose an all-big-core design so I don't have to wait for proper optimization, and because each big core is much stronger than each little core you don't need as many of them, and they can be power efficient too: unused cores go to a sleep state and the problem is solved.
There is no real latency between big and little like there was between the core complexes in Zen 1.
They are on the same bus; they just have different capabilities, so the same thread takes longer to finish on the e-cores. That's not latency: if one thread needs fewer compute cycles than another, then even if it is run on an e-core it will still finish in time for the other thread that ran on the p-core.
Also, it didn't hurt AMD, so why do you think Intel would have any problem?
I would still choose an all-big-core design so I don't have to wait for proper optimization, and because each big core is much stronger than each little core you don't need as many of them, and they can be power efficient too: unused cores go to a sleep state and the problem is solved.
The more cores run at once, the slower ALL OF THEM get, the slower your software runs, and the less responsive it gets...
If unused cores go to sleep, it takes a while for them to wake up again when they are needed. You are all against latency, but now you are OK with latency?
In the meantime, give me all P-cores. At a reasonable power level.
You only run DC; you are not the target audience for mainstream CPUs.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
I am not saying it will never work, just that in its infancy, I am not sure anybody will be happy with it.
It will work, but first Intel needs to stop doing classical Intel thinking and trying to reinvent the wheel with every feature they bring to the market. What Intel tried to achieve with TD was more than a simple dynamic allocation of resources, using some sort of profiling to decide where threads go. Sometimes it works nicely. Sometimes :)
 

TheELF

Diamond Member
Dec 22, 2012
3,967
720
126
Show me the target audience for E cores on the desktop, please!
Everybody else?!
Gamers, streamers, content creators.
Anybody that wants to run something heavy in the foreground while also wanting to run something less demanding in the background without much interference.
With the 12900 you get a faster 11900 and an additional amount of compute to spend any way you like or to just ignore.
Intel-Innovation-2021-Alder-Lake-9.jpg

Intel-Innovation-2021-Alder-Lake-13.jpg
 

Henry swagger

Senior member
Feb 9, 2022
356
235
86
Everybody else?!
Gamers, streamers, content creators.
Anybody that wants to run something heavy in the foreground while also wanting to run something less demanding in the background without much interference.
With the 12900 you get a faster 11900 and an additional amount of compute to spend any way you like or to just ignore.
Intel-Innovation-2021-Alder-Lake-9.jpg

Intel-Innovation-2021-Alder-Lake-13.jpg
Well said. Plus, e-cores are for the main PC market, which is laptops.
 

DrMrLordX

Lifer
Apr 27, 2000
21,583
10,785
136
When a programmer creates a thread, there are several things that are assigned. You must give the thread a name, you must assign the thread to be background or foreground, and you must assign a rough priority level (note: a programmer can ignore setting these, but then they just get default values).

How many programmers actually hand-select thread priority level? I can see setting a low priority on threads produced by, say, an AV scanner or something that's never really meant to have focus, but anything else . . . ? From what little I've seen, most programmers rely on the OS scheduler.
 

TheELF

Diamond Member
Dec 22, 2012
3,967
720
126
How many programmers actually hand-select thread priority level? I can see setting a low priority on threads produced by, say, an AV scanner or something that's never really meant to have focus, but anything else . . . ? From what little I've seen, most programmers rely on the OS scheduler.
They don't hand-select them, at least not game coders, but the development platform (compiler or whatnot) they use does it for them.
If you use a game engine, then that engine knows what has to run fast, what can take its time, and how to synchronize all of it. If Windows 11 gains a big enough install base for game engine developers to start caring, then they will add support for the new thread director, and new games made on the new version will automatically be optimized for the thread director.
And yes, sure, most software isn't even complex enough for the programmer to go to the trouble of doing anything more than letting the OS scheduler handle everything.
 

DrMrLordX

Lifer
Apr 27, 2000
21,583
10,785
136
If you use a game engine, then that engine knows what has to run fast, what can take its time, and how to synchronize all of it. If Windows 11 gains a big enough install base for game engine developers to start caring, then they will add support for the new thread director, and new games made on the new version will automatically be optimized for the thread director.

I'm sure they'll be thrilled at the added expense.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
Everybody else?!
Gamers, streamers, content creators.
Anybody that wants to run something heavy in the foreground while also wanting to run something less demanding in the background without much interference.
With the 12900 you get a faster 11900 and an additional amount of compute to spend any way you like or to just ignore.
You've just described the target audience for 12+ P cores. Anyone doing pro or semi-pro work and looking for a strong multi-core setup won't bother with the inconsistency that E cores come with atm.

Maybe they fix some of this with the RPL launch, but anyone looking at the 12900K today for demanding foreground & background usage had better not count on E cores, especially if we're talking video editing / transcoding.
 

FangBLade

Member
Apr 13, 2022
199
395
96
Everybody else?!
Gamers, streamers, content creators.
Anybody that wants to run something heavy in the foreground while also wanting to run something less demanding in the background without much interference.
With the 12900 you get a faster 11900 and an additional amount of compute to spend any way you like or to just ignore.
Intel-Innovation-2021-Alder-Lake-9.jpg

Intel-Innovation-2021-Alder-Lake-13.jpg
Literally all of these would prefer all big cores; just ask them to choose between 10+ P cores or 8P + 8E and come back here.
 
  • Like
Reactions: lobz

dullard

Elite Member
May 21, 2001
24,998
3,327
126
Why would a programmer even know how to prioritize their threads? Without heuristics at the OS level, all any programmer would want is for their threads to finish ASAP. They might be able to prioritize threads within their own app, but they are not going to be able to prioritize their app's threads in relation to other apps. Also, nobody wants their threads to run on the small cores unless it's a background task, because they want maximum responsiveness. Background tasks are not going to stress the CPU; this is why Apple goes with 2 small cores and many large cores.
Ever since threads were invented, they have always needed to be given priorities. That is the first and most fundamental part of working with multiple threads. You absolutely do NOT want all threads to finish ASAP, since you know there are limited resources. Threads have always been given priorities. For example, there are threads that handle interrupts. Interrupts are named interrupts because they interrupt your program and need to be done immediately, with the highest priority. Other tasks (say, feeding the printer with data) can go on in the background and be interrupted by the more important task. The fact that you don't know this means you probably should listen and/or read about it before posting on the subject.

Windows is not an RTOS (real-time operating system). In overly simple terms, an RTOS starts tasks at known fixed intervals (say, every 1 microsecond). As long as your thread's task completes in under 1 us, you are good to go; it will run again at the next 1 us interval. If your task takes more than 1 us, then you will have all kinds of problems. You as a programmer need to know this, or you need to know that you don't have an RTOS. Windows is not an RTOS and schedules things whenever it can. This eliminates wasted time and improves performance (if a task takes less than 1 us, why wait for the next 1 us interval?). Since Windows is not an RTOS, other apps can and will take priority over your app at times. If you do not program with that in mind, then you probably should find another career.
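A toy sketch of that fixed-interval structure: a real RTOS enforces the deadline, while a general-purpose OS like Windows is free to run the loop late whenever something else needs the CPU. The 1 ms period here stands in for the 1 us of the example above, since desktop timers can't hold 1 us, and the work function is just a placeholder.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Placeholder for the work that must complete within one interval.
static void do_periodic_work() { /* ... */ }

int main()
{
    using namespace std::chrono;
    const auto period = milliseconds(1);               // the "known fixed interval"
    auto next_tick = steady_clock::now() + period;

    for (int i = 0; i < 1000; ++i) {
        do_periodic_work();                             // must finish before next_tick
        if (steady_clock::now() > next_tick)
            std::puts("deadline missed");               // on a real RTOS this is a hard failure
        std::this_thread::sleep_until(next_tick);       // idle until the next interval
        next_tick += period;
    }
}
```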

I already gave many examples in the posts above where programmers want the E-cores. I see no reason to repeat these examples. You can refute specific ones if you wish to continue that line of discussion.

Finally, doesn't the Apple M2 have 4 big and 4 little cores?
 
Last edited:
  • Like
Reactions: Hulk