Multicore CPU load balance

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Hey all.
I was wondering what is up with the constant task shuffling between cores on my FX-6300 CPU.

Even if you run a single-threaded benchmark, Windows seems to think it is better to switch cores every couple of ms. The effect is this:
6egnid.jpg

This is SuperPi, a single-threaded benchmark. You can see the load is split between 4 CPU threads.

For some reason Windows thinks that switching the load (a single benchmark app) from the thread this app is already loading to 100% to one with 0% load will make it run faster :confused:

So, when I was fooling around with my unlocked CPU, I noticed that the "Smart profiles" tab in AMD OverDrive has a "Core affinity" checkbox.

So I thought to myself: doesn't switching the load from one core to another require additional memory and cache operations? Hell, doesn't FX suffer from slow cache performance? It does! Could locking the app to a single thread, and so disabling this load switching, improve performance?

Well, I tested it and here is my result:
2rcnc3p.jpg


Quite a bit of improvement. More than 0.5s faster.
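
For reference, this kind of core locking can also be done from code. Below is a minimal Python sketch, assuming Linux's `os.sched_setaffinity` or the Win32 `SetProcessAffinityMask` call reached through `ctypes`. The helper name `pin_current_process` is made up for illustration, and nothing here claims to be how AMD OverDrive or Process Lasso do it internally.

```python
import os
import sys


def pin_current_process(core: int) -> None:
    """Restrict the current process to a single logical CPU (hypothetical helper)."""
    if hasattr(os, "sched_setaffinity"):
        # Linux: pid 0 means "the calling process".
        os.sched_setaffinity(0, {core})
    elif sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetCurrentProcess.restype = wintypes.HANDLE
        kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
        kernel32.SetProcessAffinityMask.restype = wintypes.BOOL
        # The mask has one bit per logical CPU: bit N set means "may run on CPU N".
        ok = kernel32.SetProcessAffinityMask(kernel32.GetCurrentProcess(), 1 << core)
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
    else:
        raise NotImplementedError("no affinity API known for this platform")
```

Calling `pin_current_process(0)` before starting a CPU-bound loop keeps that loop on logical CPU 0, provided the OS allows the process to use that CPU.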

FX-6300 locked at 4.2GHz with overclocked memory:
ori7ar.jpg


I used the Process Lasso app to see if it can improve things a bit:
r8idmv.jpg


Seems to be on par with locking it to a single thread. :thumbsup:



Why does Windows do this? Can you disable this CPU load switching? How does Intel perform? Can you lock an app to a single thread with an Intel CPU?
 
Dec 30, 2004
12,553
2
76
Make sure you're running the two Microsoft kernel hotfixes for AMD processors:
https://forums.station.sony.com/eq/...nt-amd-fx-bulldozer-cpu-fix-windows-7.206565/

To answer your question: it helps distribute energy consumption, heat dissipation, and stress across the CPU.

Check out Process Lasso, it lets you control this.
Personally, last night I had the idea to use a profiling program that automatically modifies the core affinities of tasks on the fly and also modifies your overclock. For example, you could have 2 cores set to 5GHz and the rest to 4GHz, and always keep the task with current focus (Chrome, Firefox, etc.) on a 5GHz core. Everything else could do what it wants on the background cores. Also, only have the overclock on when under load, like when rendering a webpage.
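
The fast-core idea above can be sketched as a small planning function. This is only a toy model under stated assumptions: the function name `plan_affinities`, the process names, and the core numbering are all hypothetical, and actually applying the plan would still need an affinity API plus some way to detect which window has focus.

```python
from typing import Dict, Iterable, Set


def plan_affinities(
    processes: Iterable[str],
    focused: str,
    fast_cores: Set[int],
    slow_cores: Set[int],
) -> Dict[str, Set[int]]:
    """Give the focused process the fast cores and everything else the slow ones."""
    plan: Dict[str, Set[int]] = {}
    for name in processes:
        # Copy the sets so callers can modify one entry without affecting others.
        plan[name] = set(fast_cores) if name == focused else set(slow_cores)
    return plan


# Hypothetical 6-core setup: cores 0-1 run at the higher clock.
plan = plan_affinities(
    ["chrome", "backup", "indexer"],
    focused="chrome",
    fast_cores={0, 1},
    slow_cores={2, 3, 4, 5},
)
```

Re-running the function whenever focus changes would move the new foreground task onto the fast cores and push the old one back to the background set.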
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Win 8.1 already has those fixes, doesn't it?
I'll take a look at the Lasso thing.

I like the fast cores idea. Something like a "driver thread" could run on a faster core, while the background stuff could be kept on one slow core.
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Updated the OP with the Process Lasso result.

But this is all a single-threaded benchmark. I wonder how it impacts more heavily threaded applications, like games. Any ideas how to make a reliable test?
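
One hedged way to approach a reliable test is to repeat each configuration several times and compare the gap between averages against the run-to-run spread. The Python sketch below uses only the standard library. The function names are invented for illustration, and `clearly_different` is a crude rule of thumb, not a proper statistical test.

```python
import statistics
import time
from typing import Callable, List, Tuple


def time_runs(workload: Callable[[], object], runs: int = 5) -> List[float]:
    """Run the workload several times and return the wall-clock time of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation); needs at least two samples."""
    return statistics.mean(samples), statistics.stdev(samples)


def clearly_different(a: List[float], b: List[float]) -> bool:
    """Crude check: the gap between means exceeds the combined run-to-run spread."""
    mean_a, sd_a = summarize(a)
    mean_b, sd_b = summarize(b)
    return abs(mean_a - mean_b) > (sd_a + sd_b)
```

The same `summarize` and `clearly_different` helpers work on FPS averages typed in by hand from a game's built-in benchmark, as long as each setting gets a few runs.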
 

mikeymikec

Lifer
May 19, 2011
21,619
16,895
136
Same thing happens on Win7 with my Ph2 960T (6-core). I just ran Prime95 on a single core and the task gets thrown between cores.

Is there any performance advantage to tying a task to a particular core?
 

Flapdrol1337

Golden Member
May 21, 2014
1,677
93
91
I set the affinity to 1 core with an older version of SuperPi, running on a Pentium G3258.

2 cores affinity 1m 34s
1 core affinity 1m 35s

Guess it doesn't mean much for Haswell.
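
To put the two times above in proportion, here is a small Python sketch of the arithmetic. The helper names are made up for illustration; the inputs are the two timings quoted in this post.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a minutes/seconds timing to seconds."""
    return minutes * 60 + seconds


def percent_slower(baseline: float, other: float) -> float:
    """How much slower `other` is than `baseline`, in percent."""
    return (other - baseline) / baseline * 100.0


two_core = to_seconds(1, 34)  # 94 s with affinity set to 2 cores
one_core = to_seconds(1, 35)  # 95 s with affinity set to 1 core
print(f"{percent_slower(two_core, one_core):.2f}% slower")  # prints "1.06% slower"
```

A difference of about 1% on a single pair of runs is small enough that it could easily be run-to-run noise.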
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
So I did 3 Shadow of Mordor benchmark runs with default Windows settings and with the Process Lasso performance-optimized setting.

Default average: 52 FPS
Process Lasso performance average: 76 FPS

Something fishy in here...
 

WittyRemark

Member
Dec 7, 2014
118
0
0
Erenhardt said:
"So I did 3 Shadow of Mordor benchmark runs with default Windows settings and with the Process Lasso performance-optimized setting.

Default average: 52 FPS
Process Lasso performance average: 76 FPS

Something fishy in here..."

That's quite the jump you got there.
Try other games, and what are your PC specs, btw?
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
WittyRemark said:
"That's quite the jump you got there.
Try other games, and what are your PC specs, btw?"

There is no difference in gameplay when I lock it to 3 modules/3 threads (0-2-4). Even going to two threads seems to drop it from 58 to 56 fps.
I don't have any software to analyse the fps. I can run the built-in benchmark or rely on the FPS counter OSD.

FX6300@4,2Ghz
HD7870@1,1Ghz/5,0Ghz
DDR3 2x4GB 1600MHz@1866
MSI 760G-p43 FX
FullHD Samsung display
DeepCool Gammaxx S40
Win8.1
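
The "0-2-4" lock mentioned above corresponds to an affinity bitmask with one bit per logical CPU. Here is a minimal Python sketch of how such a mask is built, assuming logical CPUs 0/1, 2/3 and 4/5 pair up into the three modules. The helper name `affinity_mask` is hypothetical.

```python
from typing import Iterable


def affinity_mask(cpus: Iterable[int]) -> int:
    """Build a bitmask where bit N set means 'allowed to run on logical CPU N'."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return mask


one_per_module = affinity_mask([0, 2, 4])
print(hex(one_per_module))  # prints "0x15" (binary 10101)
```

A hex mask like this is the kind of value the Windows `start /affinity` command expects, so the same number can be reused when launching a game from a batch file.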
 

Abwx

Lifer
Apr 2, 2011
12,038
5,014
136
Erenhardt said:
"Why does Windows do this?"

Windows surely switches from one core to another for better heat distribution within the die: when a core reaches a given temperature, it moves to a slightly cooler core, and so on, so the temperature stays homogeneous across the surface with no hot spots.
 

WittyRemark

Member
Dec 7, 2014
118
0
0
Erenhardt said:
"There is no difference in gameplay when I lock it to 3 modules/3 threads (0-2-4). Even going to two threads seems to drop it from 58 to 56 fps.
I don't have any software to analyse the fps. I can run the built-in benchmark or rely on the FPS counter OSD.

FX6300@4,2Ghz
HD7870@1,1Ghz/5,0Ghz
DDR3 2x4GB 1600MHz@1866
MSI 760G-p43 FX
FullHD Samsung display
DeepCool Gammaxx S40
Win8.1"

Thanks for the reply.
Mordor isn't all that demanding on the CPU side, or so I've heard.
Is it the same for other games too?
You can use Fraps for benchmarking.
 
Dec 30, 2004
12,553
2
76
Erenhardt said:
"Win 8.1 already has those fixes, doesn't it?
I'll take a look at the Lasso thing.

I like the fast cores idea. Something like a "driver thread" could run on a faster core, while the background stuff could be kept on one slow core."

Yeah, actually, it does. Didn't look at your pics.
 

Dufus

Senior member
Sep 20, 2010
675
119
101
Erenhardt said:
"Why does Windows do this? Can you disable this CPU load switching? How does Intel perform? Can you lock an app to a single thread with an Intel CPU?"

It's just the way the dispatcher works. Threads are queued depending on priority. A thread with normal priority gets a time slice (quantum) to run, usually 15.6ms. If there are other threads in the queue, it is then bumped off and placed at the back of the queue until the others have had their turn. When it reaches the front of the queue again, something else may already be running on the CPU thread it previously used, so it gets put on a different one. It may also get preempted by higher-priority threads.
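
The queueing behaviour described above can be illustrated with a toy simulation. This is a deliberately simplified model and not the actual Windows dispatcher: it assumes a single FIFO queue, equal priorities, a fixed quantum, threads that are always runnable, and a thread preferring the core it last ran on. The function name `simulate` and the thread names are made up.

```python
from collections import deque
from typing import Dict, List, Optional


def simulate(threads: List[str], cores: int, quanta: int, watch: str) -> List[Optional[int]]:
    """Return, per quantum, the core `watch` ran on (None if it was waiting)."""
    queue = deque(threads)
    last_core: Dict[str, int] = {}
    history: List[Optional[int]] = []
    for _ in range(quanta):
        free = set(range(cores))
        ran_on: Optional[int] = None
        # Dispatch from the front of the queue until the cores run out.
        running = [queue.popleft() for _ in range(min(cores, len(queue)))]
        for name in running:
            preferred = last_core.get(name)
            # Reuse the previous core if it is still free, else take any free one.
            core = preferred if preferred in free else min(free)
            free.remove(core)
            last_core[name] = core
            if name == watch:
                ran_on = core
        # Quantum expired: everything that ran goes to the back of the queue.
        queue.extend(running)
        history.append(ran_on)
    return history


print(simulate(["superpi"], cores=4, quanta=4, watch="superpi"))
print(simulate(["superpi", "a", "b"], cores=2, quanta=4, watch="superpi"))
```

In this model a lone thread stays on one core, while adding competing threads makes the watched thread hop between cores and sometimes wait, which matches the qualitative picture in the explanation.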

Affinity can be set in the Windows Task Manager, but doing this may stall the process if something is already running on its assigned thread when its next turn to run comes around.

Here's SuperPi running on an Intel 4700MQ laptop CPU with normal priority and standard affinity (can run on any CPU thread). Time taken to complete: 10.0 seconds.
1lx0.png

As can be seen the process is distributed across several CPU threads.

Setting the process priority to realtime (via Task Manager) helps the process gain preference, and it may then be seen to run on just one thread unless it is kicked off by threads with the same or higher priority or by system interrupts. In this example, with realtime set, it runs on just 2 threads; it may also run on just one even though affinity is set for all cores.
mrtyqf.png


You have to be careful when setting affinity. In this shot other processes have been tagged for core 0 threads 0 and 1, while SuperPi runs on core 3 thread 0. With stepped turbos the single-thread bin of 36 may drop to 35 during a bench run, so some compromise will have to be made to achieve the best results.
s61839.png
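
As a rough idea of what a one-bin turbo drop costs, here is a small Python sketch. It assumes the "bin" is a clock multiplier and that the base clock is 100MHz. Neither is stated in the post, and the names are made up for illustration.

```python
BCLK_MHZ = 100.0  # assumed base clock, not taken from the post


def bin_to_mhz(multiplier: int, bclk_mhz: float = BCLK_MHZ) -> float:
    """Core frequency for a given turbo bin under the assumed base clock."""
    return multiplier * bclk_mhz


def percent_drop(high: float, low: float) -> float:
    """Relative frequency loss when going from `high` to `low`, in percent."""
    return (high - low) / high * 100.0


full = bin_to_mhz(36)     # 3600.0 MHz under the assumption above
dropped = bin_to_mhz(35)  # 3500.0 MHz
print(f"{percent_drop(full, dropped):.1f}% lower clock")  # prints "2.8% lower clock"
```

Under these assumptions, a bench run that spends part of its time at the lower bin loses at most about 2.8% of clock speed, which is the scale of compromise being weighed against a cleaner affinity setup.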