Question: How to set affinity to multiple CPUs?

Micrornd

Golden Member
Mar 2, 2013
1,278
178
106
Is there a way to set affinity to multiple CPUs in Windows 10?
I'd like to force an app to use multiple CPUs, but Windows 10 is determined to only allow the use of a single CPU.
When I try to set affinity to a second CPU, Windows 10 accepts it, and switches the app entirely to that CPU, and it no longer runs on the first CPU.
Any suggestions ???
 

Hitman928

Diamond Member
Apr 15, 2012
5,160
7,595
136
I'm going to make what I feel is a safe assumption and say that you mean CPU cores and don't have a multi-socket system.

With that assumption, if a program is not coded to be multi-threaded, Windows can't force it to be multi-threaded. I guess technically it could time-slice the program across cores, but the net result would be negative. In very basic terms, Windows will assign the number of cores/threads the program calls for. It sounds like your program is not multithreaded and therefore will only ever run on a single core.
 

Micrornd

I could quote you the old adage about "when you assume" :) ;)

But no, I did mean CPUs (not cores) and yes, it is a multi-socket system.
Any other ideas ???
 

Hitman928

Ok, this starts to get a bit beyond my area of expertise then, but how is the system configured for UMA/NUMA? Which version of Windows 10 are you using?
 

tamz_msc

Diamond Member
Jan 5, 2017
3,698
3,547
136
From what I've read it isn't possible to set affinity for a process across different NUMA nodes. You'll have to turn on node interleaving/UMA for that.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,225
9,987
126
Is there a parameter to the cmd.exe START command to specify a NUMA domain to launch an application on?

If not, is there anything on the SysInternals site that might help? A helper tool of some sort?
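On the START question: cmd.exe's START does accept `/NODE <NUMA node>` and `/AFFINITY <hex mask>` switches. Note that this places an app on one chosen node; it does not make it span several. A minimal sketch of building such a command line from Python follows. The helper name `build_start_command` and the program name `encoder.exe` are made up for illustration.

```python
def build_start_command(program, numa_node, affinity_mask=None):
    """Build a cmd.exe START invocation that launches `program` on one NUMA node.

    `program` is a hypothetical executable name; `affinity_mask`, if given,
    is passed to /AFFINITY as a hex mask.
    """
    # The empty string is START's window-title argument. Passing it explicitly
    # avoids a quoted program path being mistaken for the title.
    args = ["cmd", "/c", "start", "", "/NODE", str(numa_node)]
    if affinity_mask is not None:
        args += ["/AFFINITY", hex(affinity_mask)]
    args.append(program)
    return args


if __name__ == "__main__":
    # Only prints the command. On Windows it could be handed to subprocess.run().
    print(build_start_command("encoder.exe", numa_node=1, affinity_mask=0x3))
```

Run from a command prompt, the equivalent would be `start "" /NODE 1 /AFFINITY 0x3 encoder.exe`.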
 

Micrornd

Just as a follow-up, it took me a while to sort this out so that my old brain could understand it :rolleyes: -
It appears the programmer of an app has to make it "NUMA aware" for it to use all NUMA nodes (all cores on multi-processor boards).

For those few here that actually have multi-processor workstations and actually use them to work -
Windows has a concept called "Processor Groups".
Each group has a limit of 64 logical cores.
So on systems with a single processor of 64 logical cores or fewer, there is only one group and 1 NUMA node.

But on, say, a dual 2696v3 board, there are 72 logical cores.
So there will be two groups of 36 logical cores each, with each group holding the cores of one NUMA node.
It is 36 in each group and node because that is what each processor has in this example.
(The BIOS can also divide each processor into multiple NUMA nodes before Windows "sees" them. The Gigabyte board I used for dual 2696v3 Xeons let me run each processor either as a single NUMA node of 36 logical processors or as 2 NUMA nodes of 18 logical processors each. Divided that way, there were 4 NUMA nodes total for the 2 processors, each consisting of 9 "real" cores and their 9 "hyper" cores.)

If a dual (or more) processor board had say 72 logical cores in each processor, then each processor would have 2 groups and 2 nodes (4 Processor Groups and 4 NUMA nodes total for a dualie) because remember, Windows Processor Groups have a limit of 64 logical cores per group.
(The same basic principle applies to a single processor board with more than 64 logical processors)
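The arithmetic above can be sketched in a few lines of Python. This is only a simplified model of the description in this post, not the algorithm Windows actually uses: it assumes each socket is split into equal nodes of at most 64 logical processors, and that nodes are then packed into groups in order. The function names are made up for illustration.

```python
import math

MAX_GROUP_SIZE = 64  # limit on logical processors per Windows processor group


def split_socket(logical_per_socket, max_group_size=MAX_GROUP_SIZE):
    """Split one socket into near-equal nodes that each fit inside one group."""
    parts = math.ceil(logical_per_socket / max_group_size)
    base, extra = divmod(logical_per_socket, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def estimate_groups(sockets, logical_per_socket):
    """Return the estimated size of each processor group (assumed greedy packing)."""
    nodes = []
    for _ in range(sockets):
        nodes.extend(split_socket(logical_per_socket))

    groups = []
    for node in nodes:
        # Put the node in the current group if it still fits, else open a new group.
        if groups and groups[-1] + node <= MAX_GROUP_SIZE:
            groups[-1] += node
        else:
            groups.append(node)
    return groups


if __name__ == "__main__":
    print(estimate_groups(2, 36))  # dual 2696v3 example from this post
    print(estimate_groups(2, 72))  # hypothetical dual board, 72 logical cores each
```

Under those assumptions the dual 2696v3 case comes out as two groups of 36, and the hypothetical 72-per-processor dual board as four groups of 36, matching the examples in this post.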

But by default, programs are only able to use 1 group (in the above example 1 group of 36 logical cores), because that is the default in Windows.
In order to use multiple groups, the program needs to be specially programmed for it.
(MS does explain how to do that in the link above, and while I'm not a programmer, it does seem rather straightforward and makes one wonder why it isn't routinely done)
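For anyone curious what "programmed for it" can look like, here is a rough sketch, not a tested recipe. It assumes the Win32 calls `GetActiveProcessorCount` and `SetThreadGroupAffinity` from kernel32, reached through Python's ctypes. The helper names `full_group_affinity` and `pin_current_thread_to_group` are made up for illustration. A multi-group app would start at least one worker thread per group and have each worker pin itself to its own group.

```python
import ctypes
import sys


class GROUP_AFFINITY(ctypes.Structure):
    """Mirror of the Win32 GROUP_AFFINITY structure."""
    _fields_ = [
        ("Mask", ctypes.c_size_t),           # KAFFINITY: one bit per logical processor
        ("Group", ctypes.c_uint16),          # processor group number
        ("Reserved", ctypes.c_uint16 * 3),   # must stay zero
    ]


def full_group_affinity(group, processors_in_group):
    """Build an affinity covering every logical processor in one group."""
    mask = (1 << processors_in_group) - 1
    return GROUP_AFFINITY(Mask=mask, Group=group)


def pin_current_thread_to_group(group):
    """Move the calling thread onto all processors of `group` (Windows only)."""
    if sys.platform != "win32":
        raise OSError("processor groups exist only on Windows")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_uint16]
    kernel32.GetActiveProcessorCount.restype = ctypes.c_uint32
    kernel32.GetCurrentThread.restype = ctypes.c_void_p
    kernel32.SetThreadGroupAffinity.argtypes = [
        ctypes.c_void_p,
        ctypes.POINTER(GROUP_AFFINITY),
        ctypes.POINTER(GROUP_AFFINITY),
    ]
    kernel32.SetThreadGroupAffinity.restype = ctypes.c_int

    count = kernel32.GetActiveProcessorCount(group)
    affinity = full_group_affinity(group, count)
    ok = kernel32.SetThreadGroupAffinity(
        kernel32.GetCurrentThread(), ctypes.byref(affinity), None
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
```

Each worker thread would call `pin_current_thread_to_group(n)` with its own group number before starting work. A real application would more likely do this in C or C++, but the calls are the same.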

As a workaround - Process Lasso Pro can "force" apps to use multiple groups with its Processor Group Extender option.
I don't believe it is as efficient as an app properly programmed to run on multiple NUMA nodes/Processor Groups would be,
but I have found it does reduce my encode and transcode times by roughly 35%-40% (depending on the original container, of course) ;)

Hopefully that makes things a little clearer for anyone stumbling on this thread :)
 

DrMrLordX

Lifer
Apr 27, 2000
21,571
10,764
136
That's actually good information, thanks. Even though few people here mess with 2P+ systems, there are people here who have/will have 64c/128t systems (3990X) that will probably face similar problems running Windows.
 

Micrornd

They will, but just to be clear, there's no problem running Windows itself. It performs the same way whether all logical cores are used or not, and 95% of folks won't know that not all the logical cores are being used because the newer processors are so fast.
That is, they won't know until they actually see how fast they are really meant to be when all logical cores are used.
 