X2 affinity performance gains (a must read) UPDATED!!!

RFC Rudel

Member
Oct 19, 2005
49
1
66
I finished my research on CPU affinity performance for games on dual-core or dual-CPU systems.

I use Windows Server 2003 (w2k3) Enterprise because it is the only available OS that can run the Windows System Resource Manager (called WRM in this post). http://www.microsoft.com/windowsserver2003/downloads/wsrm.mspx

To check process affinity and CPU usage I use Sysinternals Process Explorer:
http://www.sysinternals.com/Utilities/ProcessExplorer.html


The WRM lets you create affinity policies for processes and services.

E.g.: you configure WRM to run the whole OS on CPU0 and the .exe of your games or other single-threaded apps on CPU1.
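
For anyone unfamiliar with how such a policy is expressed: on Windows, affinity is a bitmask in which bit n stands for logical CPU n, and that mask is what the Win32 SetProcessAffinityMask call takes. Below is a minimal sketch of the masks for the split described above. The helper names are made up for illustration, and nothing here touches a real process.

```python
def affinity_mask(cpus):
    """Build an affinity bitmask: bit n set means the process may run on CPU n."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


def cpus_in_mask(mask):
    """Return the sorted list of CPU indexes allowed by a bitmask."""
    return [n for n in range(mask.bit_length()) if mask & (1 << n)]


OS_MASK = affinity_mask([0])          # OS and background processes -> CPU0
GAME_MASK = affinity_mask([1])        # game .exe -> CPU1
DEFAULT_MASK = affinity_mask([0, 1])  # an untouched dual-core system: both cores

print(OS_MASK, GAME_MASK, DEFAULT_MASK)  # prints: 1 2 3
```

A process with mask 3 can be scheduled on either core, which is the default behaviour the rest of this post is trying to avoid for the game.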

The WRM runs as a service, so if you leave the service on manual your OS starts like any ordinary dual-core system, and all of your games and apps run on both cores or wherever the OS thread scheduler thinks they should run.

The WRM does not let you manage OS processes by default, but with a quick registry edit that clears the System Exclusion List you can make one of your cores totally empty and ready for your CPU-hungry single-threaded apps/games.

I chose to make CPU1 free because CPU0 is the default for several things.


Performance Gains

I used Pi calculations, Sandra, 3DMark 2003 and 2005, CPU-only benchmarks, etc.

On Pi calculations or any other single-threaded CPU benchmark, running it with the WRM properly configured gains 5% or less in performance, but your OS stays totally responsive because the benchmark's thread does not interfere with the OS and user processes. (The score of a single-threaded app is half the score of a dual-CPU-capable app.)

3DMark gives about 5% or less.

In games it is a totally different story.

If the game is mostly limited by the video card (like BF2), your gains come in the parts where the CPU becomes the limit. I get a higher max FPS, but my average FPS gain in BF2 is 8-10% at most.

In BF2 I went from a max of 80 FPS to 92 FPS. As I said before, in games that are seriously video-card limited the gains are limited, but they exist.

In HL2 my X800XL is not much of a bottleneck, so in normal mode (no affinity optimizations) I hold Dr. Freeman steady at 150 FPS, and when I start the WRM and HL2 runs alone on CPU1 my FPS jumps to 182!!!
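
To put those numbers on one scale, the relative gain is (after - before) / before. A quick sketch using only the FPS figures quoted above (the function name is just for illustration):

```python
def pct_gain(before, after):
    """Relative gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


bf2_max = pct_gain(80, 92)    # BF2 max FPS, without vs. with WRM
hl2 = pct_gain(150, 182)      # HL2 FPS, without vs. with WRM

print(f"BF2 max FPS: +{bf2_max:.1f}%")  # prints: BF2 max FPS: +15.0%
print(f"HL2 FPS:     +{hl2:.1f}%")      # prints: HL2 FPS:     +21.3%
```

Note that the 15% for BF2 applies to the max FPS only; the average gain quoted above is lower.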

I use a Creative X-Fi, so in theory the card offloads audio calculations from my CPU; I think that people with onboard audio should see bigger gains.

UPDATE
With WRM and my X-Fi, the core not running the game (CPU0) sits at 10% CPU usage. I uninstalled my X-Fi, switched to the onboard sound, and configured BF2 to use software audio at high quality.

The same core went to 15-20%, with spikes of 25% CPU usage!!!! And the audio quality has no impact on FPS when running with the WRM.

This shows that the X-Fi really works, and that redirecting all OS operations to CPU0 has a clear advantage for onboard audio users.


IRQ

I used a Resource Kit utility to redirect all IRQs to CPU0, but I could not get a tangible performance gain.


Application Needed


I modified the WRM .msi to run on XP, but it fails to install the WRM service.
The only way I found to manage OS process affinity in XP is Sysinternals Process Explorer, but it is not automatic.

I think an application is needed, because it is a pain to set up all the affinities by hand, not to mention that there are IRQ and thread/process priority performance gains that have not been deeply explored.
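
A rough sketch of what the rule-matching core of such a tool could look like. Everything here is hypothetical: the rule table, the process names and the function names are made up for illustration. The sketch only decides which mask each process should get. Actually applying the mask would need an OS call (on Windows, SetProcessAffinityMask), which is left out.

```python
CPU0 = 0b01  # bit 0: the OS and everything else
CPU1 = 0b10  # bit 1: reserved for the single-threaded game/app

# Hypothetical rule table: executable name (lowercase) -> affinity mask.
RULES = {
    "bf2.exe": CPU1,
    "hl2.exe": CPU1,
}


def mask_for(exe_name, rules=RULES, default=CPU0):
    """Pick the affinity mask for a process by executable name (case-insensitive)."""
    return rules.get(exe_name.lower(), default)


def plan(process_names, rules=RULES, default=CPU0):
    """Map each running process name to the mask it should be given."""
    return {name: mask_for(name, rules, default) for name in process_names}


running = ["explorer.exe", "BF2.exe", "svchost.exe"]  # made-up process snapshot
for name, mask in plan(running).items():
    print(f"{name}: mask {mask:#04b}")
```

Anything not listed in the rule table falls back to CPU0, which matches the idea of emptying CPU1 before the game runs there.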

I found the NCR SMP Utilization Manager very awkward, but it has some nice features.
http://www.ncr.com/support/pcfiles/Utility/NTTOOLS/SMPUT200.EXE


So far the WRM is the best tool for managing process affinity.


WE NEED SOMEONE TO MAKE AN APP LIKE THE WRM FOR XP.

If you want to reproduce this, please feel free to ask.


my machine
LANPARTY UT nF4 SLI-DR
AMD 3800 Dual core@2.55Ghz
2x1G Ram TCCC (OCZ ASAP)
ATI X800XL
Creative X-FI EM
3x sata2 hitachi 80GB Raid0
1 WD 80GB
Nec DVD-R
Samsung DVD-CDR combo
Thermaltake Armour
Thermaltake 680W PSU
Aerocool Coolpanel (Front Panel)
Windows 2003 enterprise
 

Geomagick

Golden Member
Dec 3, 1999
1,265
0
76
Awesome info there. Really good to see that you can really get good performance gains in single threaded applications with a dual core CPU.
 

RFC Rudel

Member
Oct 19, 2005
49
1
66
The MS load balancing sucks big time.
The thread scheduler balances the whole system, and your app/game has to share CPU time with other processes. That may be OK for undemanding single-threaded apps/games, but nothing beats an empty core to run your single-threaded app.

I use a second monitor to run Task Manager, Performance Monitor and Process Explorer to see how Windows does its suck balancing.

The point is not just to run the single-threaded app on a single core; you must empty that core before you run the app there.

Because if the core is empty, there are no threads from other processes competing for the CPU.



 

RFC Rudel

Member
Oct 19, 2005
49
1
66
Originally posted by: seiyafan
What's the improvement on super Pi score?


5% or less


Because if the single-threaded bench runs at high priority it gets more CPU time, there is not much difference when you run Pi on an empty core.
But when you run it on the empty core, the Pi calculations do not interfere with the OS and user processes, and Windows works as if nothing were running.

You may think that putting your game at high priority would give similar results, but if your game gets more CPU time you may lose your mouse or have serious sound/network problems, because the game does not let critical OS processes work correctly.

Running the app/game on an empty core is like running at real-time priority: its threads get all the CPU time, and they do not interfere with the OS and other processes because those run on CPU0.

Playing and monitoring shows that the core that does not run the game has 10% or more usage constantly; if the game ran on that core it would have 10% less CPU.



 

SGtheArtist

Senior member
Apr 5, 2001
508
0
0
If the user has to manually setup the load distribution between the two cores what is the point of going dual core?

If you run a game on an unused core and the increase in performance is only 5% or less then wouldn't it be better to buy a single core processor and spend the difference elsewhere?

Isn't the benefit in dual core the fact that the app (if properly load balanced) has the resources of 2 cores instead of 1? Say the entire unused 2GHz of core1 & 800MHz from core0? Your solution limits the app to 1 core.

Wouldn't it be worth the time to develop a program that combines the 2 cores processing power so the OS & apps treated it as a single 4GHz processor (2x2GHz)?
 

RFC Rudel

Member
Oct 19, 2005
49
1
66
Originally posted by: SGtheArtist
If the user has to manually setup the load distribution between the two cores what is the point of going dual core?

If you run a game on an unused core and the increase in performance is only 5% or less then wouldn't it be better to buy a single core processor and spend the difference elsewhere?

Isn't the benefit in dual core the fact that the app (if properly load balanced) has the resources of 2 cores instead of 1? Say the entire unused 2GHz of core1 & 800MHz from core0? Your solution limits the app to 1 core.

Wouldn't it be worth the time to develop a program that combines the 2 cores processing power so the OS & apps treated it as a single 4GHz processor (2x2GHz)?


5% or less on Pi or CPU benchmarks. In games that are not seriously video-card limited your gains go all the way to 10-20%, and the lowest FPS point goes up by 30%.


The ideal app/game is SMP capable (multithreaded) and uses both cores, but few games are multithreaded.


I am not offering a full solution, just a better use of dual cores when the game does not use both CPUs.
Price, the choice of dual core, and whether it is better to buy an FX-55 are beyond the scope of this thread.


Multithreaded games and apps are not easy to develop.


I will post some Fraps logs tomorrow.





 

SGtheArtist

Senior member
Apr 5, 2001
508
0
0
RFC Rudel,

Thank you for clarifying; I clearly understand your point now. I would have thought that even single-threaded games would benefit in performance if the OS balanced the load across the cores. This is apparently not the case.

Thank you for sharing your findings.
 

hooflung

Golden Member
Dec 31, 2004
1,190
1
0
The problem is that the Windows NT kernel has been developed, since 3.1, with SMP designed for multi-threaded applications that depend less on raw CPU power than on file and RAM I/O. Database applications, application serving, and media serving are where the Windows kernel shines at SMP. Games are a different breed of application. In theory SMP is just SMP, but the demands that games place on the CPU (calculation algorithms such as AI, physics and sound) aren't the same as database bubble sorts. AMD is not to blame here; I think it is a case of the CPUs demanding more than the OS can handle reliably. The same thing happened when we moved past 1 GHz and off of Windows 98/ME. The platform was not built to scale with the hardware for the types of apps being used, and system reliability was iffy at best. I wonder if Intel D chips would have the same problem if they were as fast and as demanding in games as Athlon 64s are?
 

redhatlinux

Senior member
Oct 6, 2001
493
0
0
Good post. Since reading the 3800+ review here at Anandtech I've wondered where all the lost cycles go. My experience with large mainframe multi-cpus has always been that if you can force affinity then you can get the best results. XP spends too much time juggling and not enough time dispatching. I have been wondering if anyone has tested any flavor of Linux on X2's. I feel that 1.6 times a single core should be possible even as high as 1.7.
 

RFC Rudel

Member
Oct 19, 2005
49
1
66
Originally posted by: redhatlinux
XP spends too much time juggling and not enough time dispatching.


I use w2k3 and I think it has a tuned thread scheduler and works better than XP.

But sometimes the OS really does balance: you see 100% CPU spikes on CPU0 and CPU1 and the games perform a little better. That kind of balancing never lasts more than a few minutes, though, and then the OS switches the game back to one core only.

 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
What an amazing first post...you are MOST welcome!!!
It's long been known that the load balancing of Windows is incredibly bad, which makes me wonder if there will be any improvement in Vista...
RFC, do you happen to have a copy of the Vista beta to test?
 

RFC Rudel

Member
Oct 19, 2005
49
1
66
Originally posted by: Viditor
What an amazing first post...you are MOST welcome!!!
It's long been known that the load balancing of Windows is incredibly bad, which makes me wonder if there will be any improvement in Vista...
RFC, do you happen to have a copy of the Vista beta to test?


The Vista kernel is based on w2k3, and the current beta does not support SATA drives!!! It is too early to test it, not to mention the lack of bug-free, tuned drivers.


I expect that some sort of WRM will come with Vista, or at least better tuning.


 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Good info...thanks, mate!
MS has made some cryptic remarks in the past as to Vista being more geared towards multicore chips, and I've had a suspicion that this was the reason. Please, when we get a more stable and usable version, if you are able to bench it, your results would be GREATLY appreciated!

Cheers!
 

biostud

Lifer
Feb 27, 2003
19,884
6,985
136
even though the process is manual, do you need to set affinity every time you reboot or can you save the settings?