Milkyway@Home - GPU & CPU performance stats wanted, any capable h/w, old or new!


Assimilator1

Elite Member
Nov 4, 1999
24,165
524
126
ok, so i had two dual-GPU machines crunching MW@H recently. one had dual Sapphire 7970s (not the same exact models though), and the other had dual Gigabyte 7970's (matching GPUs). the fact that the Sapphire 7970s didn't have matching model #'s is irrelevant though b/c they both had the same amount of memory (3GB) and were clocked the same. so, without further ado:

Sapphire 7970, core @ 950MHz (factory OCed), VRAM @ 800MHz (underclocked), BOINC v7.2.28, Catalyst 13.9, Win7 x64: ~67s

Gigabyte 7970, core @ 1000MHz (factory OCed), VRAM @ 850MHz (underclocked), BOINC v7.2.28, Catalyst 13.9, Win7 x64: ~69s

...now i know it doesn't make any sense that the average run time for the lower clocked Sapphire 7970 is slightly better than the average run time of the higher clocked Gigabyte 7970, particularly since the supporting hardware is all the same (both use the same CPU, mobo, type and amount of RAM, etc.)...but i suspect that the small sample size of 5 tasks might be the culprit. i used to have a back log of hundreds of valid tasks on each host, but those back logs have dwindled to ~60 valid tasks each since having pulled both hosts off the project 4 days ago...so unfortunately i can't simply add more tasks to the average without getting back on the project again for a few days...and i can't do that either b/c those two hosts have since been completely taken apart. at any rate, i suppose you should post the better of the two times i reported...
Yea that is odd, you sure you didn't get the hosts mixed up? ;)
Or maybe the Sapphire card has slightly lower latency RAM timings??
Btw why did you underclock the RAM? To save power? If so how much did it save? Did it hurt WU times much?

Also I was browsing through 1 of your hosts & saw that 213.76 WU times were 135-150s (ish), were your GPUs crunching 2 WUs at a time & your average times a halving of that?

Thanks for your scores guys :), will update.
 

Sunny129

Diamond Member
Nov 14, 2000
4,823
6
81
Yea that is odd, you sure you didn't get the hosts mixed up? ;)
Or maybe the Sapphire card has slightly lower latency RAM timings??
on second thought, even though both machines were identical as far as supporting hardware goes, the CPUs did not share the exact same work load (different combination of projects, tasks, etc.)...and so one CPU may have had more resources available for MW@H than the other...i suppose that might account for such a slight difference in average run times.

Also I was browsing through 1 of your hosts & saw that 213.76 WU times were 135-150s (ish), were your GPUs crunching 2 WUs at a time & your average times a halving of that?
yes, i calculated my average and then divided that number by 2 b/c i was running MW@H tasks 2 at a time.
 

Assimilator1

Hmm, I'm not sure that's terribly accurate. I just averaged 5 213.76 WUs from Mumak's 7950 rig (he runs 2 WUs at a time); that average was ~151s, giving in theory 75.5s for 1 WU. But he actually ran single WUs for a while for the benchmark & he got 81s.
That makes sense if running 2 WUs at a time does indeed increase ppd; going by his score, running 2 at a time cuts about 7% off the time of running them singly (interesting :)).

I think what I'll do is post your times for crunching 2 at a time & estimate your single WU times based on Mumak's performance difference of 2 vs 1 at a time, I'll label it so too. At least until/if you do run single WUs :).
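A rough sketch of that estimate in Python. The only real inputs are Mumak's 7950 figures quoted above (~151s average with 2 WUs at a time, 81s measured with 1 at a time). Carrying his ratio over to other cards is an assumption, and the function names are made up for illustration.

```python
# Figures quoted in the thread for Mumak's 7950 (213.76-credit WUs).
MUMAK_PAIR_AVG_S = 151.0  # average run time with 2 WUs at a time
MUMAK_SINGLE_S = 81.0     # measured run time with 1 WU at a time


def naive_single_time(pair_avg_s):
    """Halve the 2-at-a-time average to get an effective time per WU."""
    return pair_avg_s / 2


def estimated_single_time(pair_avg_s):
    """Scale the halved time by Mumak's single/halved ratio.

    Assumption: the ratio seen on Mumak's rig carries over to other cards.
    """
    ratio = MUMAK_SINGLE_S / naive_single_time(MUMAK_PAIR_AVG_S)
    return naive_single_time(pair_avg_s) * ratio


def cut_percent():
    """How much running 2 at a time cuts the per-WU time on Mumak's rig."""
    halved = naive_single_time(MUMAK_PAIR_AVG_S)
    return (MUMAK_SINGLE_S - halved) / MUMAK_SINGLE_S * 100


if __name__ == "__main__":
    print(f"halved 2-at-a-time average: {naive_single_time(MUMAK_PAIR_AVG_S):.1f}s")
    print(f"cut from running 2 at a time: {cut_percent():.1f}%")
    # A reported ~67s halved figure corresponds to a ~134s 2-at-a-time average.
    print(f"estimated single-WU time: {estimated_single_time(134.0):.1f}s")
```

The estimate is only as good as the assumption that other cards lose the same fraction of time per WU when running singly.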
 

Sunny129

yeah that seems to be a sensible way of estimating my average MW@H run time for running 1 task at a time. i won't be able to confirm that anytime soon b/c again, both of those machines have now been taken apart and reappropriated.

basically there are two schools of thought when it comes to running 1 MW@H task at a time vs running 2 at a time - some folks believe there is an advantage, while others do not. the belief that there is no advantage to running 2 MW@H tasks simultaneously stems from the fact that it is very easy to see that crunching 2 MW@H tasks simultaneously takes exactly twice as long as crunching only 1 MW@H task. what those folks probably don't realize is that the last few seconds worth of calculations of a MW@H GPU task is done entirely on the CPU, not the GPU. in fact, if you monitor GPU utilization w/ MSI Afterburner, GPU-Z, or a similar utility, you'll note that it drops to zero for the last few seconds of a running MW@H task before starting a new task and jumping back up to its typical load value (if you run 1 task at a time).

so, if a MW@H task takes approx. 100s to crunch on a particular video card, and the last ~7 seconds worth of crunching is done on the CPU only, that equates to ~7 seconds of idle/wasted GPU time. now suppose you allow MW@H to run 2 tasks at a time and you start running 1 MW@H task - if you allow a 2nd task to start running before the first task ends, that 2nd task will continue to utilize the GPU during the ~7 seconds that the GPU would have otherwise been idle had you been running only 1 task at a time. i hope this makes sense to everyone...
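The reasoning above can be put into a small, idealized model. This is a sketch only: it assumes the second task fills the CPU-only tail perfectly and ignores any contention between the two tasks, and the 100s and 7s figures are the illustrative ones from the post, not measurements. The function names are hypothetical.

```python
def effective_seconds_per_task(run_time_s, cpu_tail_s, overlap):
    """Seconds of wall-clock time each task costs in steady state.

    overlap=False: 1 task at a time, the GPU idles during the CPU-only tail.
    overlap=True: idealized 2 at a time, another task uses the GPU during
    the tail, so only the GPU-bound part limits throughput.
    """
    if not 0 <= cpu_tail_s <= run_time_s:
        raise ValueError("cpu_tail_s must be between 0 and run_time_s")
    return run_time_s - cpu_tail_s if overlap else run_time_s


def tasks_per_hour(seconds_per_task):
    """Convert a per-task time into hourly throughput."""
    return 3600 / seconds_per_task


if __name__ == "__main__":
    run_time_s, cpu_tail_s = 100, 7  # illustrative figures from the post
    single = effective_seconds_per_task(run_time_s, cpu_tail_s, overlap=False)
    paired = effective_seconds_per_task(run_time_s, cpu_tail_s, overlap=True)
    print(f"1 at a time: {single}s per task, {tasks_per_hour(single):.1f} tasks/hour")
    print(f"2 at a time: {paired}s per task, {tasks_per_hour(paired):.1f} tasks/hour")
```

Under these assumptions the gain equals the share of the run time spent in the CPU-only tail, which is why a ~7 second tail on a ~100 second task would show up as roughly a 7% improvement.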
 

Assimilator1

Err, so are they stock or o/ced then?

Sunny
Mumak posted this in the MW forum bench thread :-

I don't think it's a good idea to compare running TWO WUs/GPU.
That very depends how these are 'synchronized'. Moreover, it's hard to tell what the other WU running concurrently was, which might affect the final times too.
For me running 2 WUs/GPU ranges from 113 to 154 secs. If you'd still like to make a comparison, then an average from 5 results is absolutely insufficient, I'd say you'd need to average at least 100 results.


What do ya reckon?

I've just looked back through 200 modfit results, & his 213.76 WUs get done in a range from 112s to 155s, so he's certainly right about the variability on his rig, is this typical when running 2 WUs in parallel? If so I'll have to remove the 2 WU table & your est. times :(.
I was going to look back through your results but I just remembered you don't have many left to see (8 213.76s), the few that you do have left range from 122-136s for PC ID 496426 & for 479416 11 213.76s ranging from 126-146s, damn......
 

Sunny129

Sunny
Mumak posted this in the MW forum bench thread :-

I don't think it's a good idea to compare running TWO WUs/GPU.
That very depends how these are 'synchronized'. Moreover, it's hard to tell what the other WU running concurrently was, which might affect the final times too.
For me running 2 WUs/GPU ranges from 113 to 154 secs. If you'd still like to make a comparison, then an average from 5 results is absolutely insufficient, I'd say you'd need to average at least 100 results.

What do ya reckon?
well i hadn't thought of looking at it statistically, despite the fact that it really is the essence of this thread...so from a statistical point of view, i must agree with Mumak. while my variance, and thus my standard deviation, is less than his, i'm sure they're still far greater, and my run times far less consistent, than if i were running the tasks 1 at a time. so from a short term (run time) benchmarking perspective, i think it is best to do it running only 1 task at a time. from a long term (PPD) benchmarking perspective, i think its best to run 2 tasks at a time.
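To illustrate why 5 results say much less than 100, here is a minimal sketch of the standard error of a mean. The 10 second standard deviation is an assumed, made-up value (nobody in the thread reported one for 2-at-a-time runs), and the formula assumes the run times are independent samples.

```python
import math


def standard_error(stdev_s, n):
    """Standard error of the mean for n independent run times."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return stdev_s / math.sqrt(n)


if __name__ == "__main__":
    assumed_stdev_s = 10.0  # hypothetical spread of run times, in seconds
    for n in (5, 17, 100):
        se = standard_error(assumed_stdev_s, n)
        print(f"n={n:>3}: mean is uncertain by roughly +/- {se:.2f}s")
```

With the same spread, going from 5 to 100 results shrinks the uncertainty of the average by a factor of about 4.5, since it scales with the square root of the sample size.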

I've just looked back through 200 modfit results, & his 213.76 WUs get done in a range from 112s to 155s, so he's certainly right about the variability on his rig, is this typical when running 2 WUs in parallel? If so I'll have to remove the 2 WU table & your est. times :(.
I was going to look back through your results but I just remembered you don't have many left to see (8 213.76s), the few that you do have left range from 122-136s for PC ID 496426 & for 479416 11 213.76s ranging from 126-146s, damn......

i'll tell you what - i'll run MW@H tasks 1 at a time on one of my 7970s and report back as soon as i get the chance...i just have to get a machine put back together, which i should have done within the next day or two.
 

petrusbroder

Elite Member
Nov 28, 2004
13,348
1,155
126
Hi Mark, I would be glad to contribute data (having 2 ATI 5850 on the project) but for some reason I have only 159,86 points WUs (=MilkyWay@Home v1.02 (opencl_amd_ati)) and those are not the right ones. I am watching for the other ones ... ;)
 

Assimilator1

Doh! Weird, you're the 3rd person that's had that :confused:
I hope this isn't a longer term trend or we'll have to benchmark all over again!

Btw I'd keep an eye out myself on your WU times but you've hidden your computers, how come?

Thanks biodoc :)

i'll tell you what - i'll run MW@H tasks 1 at a time on one of my 7970s and report back as soon as i get the chance...i just have to get a machine put back together, which i should have done within the next day or two.

Cool, thx Sunny :)
LMK what MHz the GPU is when you do.
 

petrusbroder

OT:
Well, one of my mates here in Sweden had a farm, it was visible, some street smart gang looked at his BOINC projects, realised he had some really cool hardware, broke into his house one day and took all - including mice, keyboards, router, etc etc. but only computer equipment and the 47'' TV. And since then I have my comps hidden.
/OT.
 

Sunny129

Hi Mark, I would be glad to contribute data (having 2 ATI 5850 on the project) but for some reason I have only 159,86 points WUs (=MilkyWay@Home v1.02 (opencl_amd_ati)) and those are not the right ones. I am watching for the other ones ... ;)
Peter, you have to run the Milkyway@Home Modified Fit v1.28 (opencl_amd_ati) tasks to start seeing the ones that earn 213.76 points...and even then you'll get an assortment of 79.93-point tasks, 213.76-point tasks, and 320.63-point tasks...you'll just have to run Modified Fit for a while until you have a sufficient number of 213.76-point tasks to average. the regular Milkyway@Home v1.02 (opencl_amd_ati) application gives you 159.86-point tasks only.
 

petrusbroder

I checked my results just now and I have had a string of the 213.76-WUs.
The results are:
AMD ATI Radeon HD 5850 (1024MB, driver: 1.4.1741 OpenCL: 1.02, Windows 7 HP, SP1 all updates until last tuesday):
average (17 WUs): 257.57 seconds, median: 245.64 seconds, standard deviation: 19.82 seconds.

I used 17 WUs because one of 3 or 4 WUs gave a time of approx 270 - 280 seconds (compared to the other three which were in the 240 - 245-seconds range...). I have two of exactly the same GPUs running (same model#, different serial#, purchased at the same time) @800MHz. However it is possible that one of the cards runs slower because the intake of cool air is less due to the other card.

Therefore the use of the median is better in this case ...
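A small sketch of why the median holds up better here. The run times below are hypothetical values picked inside the two ranges mentioned above (240-245s and 270-280s); they are not the actual 17 results, and the 9/6 split between fast and slow WUs is an assumption made for illustration.

```python
import statistics

# Hypothetical run times in seconds, NOT the real results:
# a faster group in the 240-245s range and a slower group in the 270-280s range.
fast_gpu_times = [240, 241, 242, 242, 243, 243, 244, 244, 245]
slow_gpu_times = [270, 272, 274, 276, 278, 280]

all_times = fast_gpu_times + slow_gpu_times

mean_s = statistics.mean(all_times)      # pulled upward by the slow group
median_s = statistics.median(all_times)  # stays inside the fast group

if __name__ == "__main__":
    print(f"mean:   {mean_s:.1f}s")
    print(f"median: {median_s:.1f}s")
    print(f"fast GPU only, mean: {statistics.mean(fast_gpu_times):.1f}s")
```

When the slow results all come from one card, averaging each GPU separately is another option, since the combined mean describes neither card well.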
 

Assimilator1

Ok that confuses things a bit, so you're saying that 1 card may be throttling because it's drawing in warm air from the other card?
You also say 3-4 were 270-280s, & another 3 were 240-245s, what about the other 10-11 WUs? :confused: ;)

Do both GPUs have a dedicated CPU core? Because if not that'll slow them down a little & give more varying times, due to them not being at nr 100% load. CCC overdrive will show GPU load/util.
 

petrusbroder

No, I am saying 1 WU of 3-4 was 270-280 seconds, compared to the other 3-4 which were 240 - 245 seconds; i.e. 6 had times between 270 - 280 seconds, 9 had times between 240 - 245 seconds. The WUs with longer times were from one GPU, the ones with shorter times from the other. Both GPUs are supposed to run @800MHz, but I am not sure they do in reality due to heat. Both GPUs run @100% load ... dedicated CPU-core, although I think that is a waste of the computation power of 2 CPU-cores which, for a small penalty, could crunch 2 other WUs at the same time (i.e. 2 GPU-WUs + 2 CPU-WUs compared to 2 GPU-WUs only ...)
 

Assimilator1

I agree about the waste of CPU cores, but it's just for benchmarking purposes :).
Although I found with asteroids@h GPU load dropped to ~45% with all 4 cores used, whereas with F@H it was ~80%, which I was fine with. ~45% though I'm not, so now A@H runs on 3 cores only whilst running MW GPU.

1 WU of 3-4 what? Do you mean 1 group of 3-4 WUs? You're still not making sense, lol, I think there's a language issue here ;).
Nm anyway, I understood your 6 WUs @ 270-280 & 9 @ 240-245s :).

Is the group of 9 WUs @ 240-245s solely from the cooler running GPU? I'm just trying to isolate times from the cooler running (& presumably) full speed GPU.
 

petrusbroder

No I mean that in a group of 4 - 5 WUs one is running 270-280 seconds and the other 3 - 4 are running @ 240 - 245 seconds ...

And yes, the 240-245 sec WUs are from the cooler running GPU. I have removed the sidepanel of the computer and am getting 15 WUs with 240-245 seconds and 1 WU with 278 seconds - it is a heat issue ...
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,703
4,661
75
Corrected results from my GPU, as described above: 1,127.30s for an average of 6 WUs.
 

Assimilator1

New time for my HD 5850 now o/ced to 775 MHz, 231s for 6 213.76 WUs :).
Only didn't clock higher because of driver slider limits, will investigate further o/cing another day :).

salvordorhardin
Got ya ;)

Corrected results from my GPU, as described above: 1,127.30s for an average of 6 WUs.
So that's for the GTX 460 768MB OC@750, yea?

Thx for the results :)

Petrus
Re your hot GPU, weird, what temp is it running at?
GPU-Z shows no signs of it throttling?
 

Assimilator1

Lol, did you find out if your GPU was nr max load?

Table updated, 7970 time removed (wrong WUs!), 7950 o/c added.

Just remembered I took a screenshot of some WUs my old 4870 did just over a month ago, & by luck it's got 5 213.76 cred WUs in it! :), average is 503s, Cat 13.1, Win 7 ult 64bit, BOINC 7.2.33.
Although it didn't have a dedicated core I was running F@H at the time which plays nicely with MW, & IIRC any time I looked at GPU load it was showing nr 100%.
 

Sunny129

Table updated, 7970 time removed (wrong WUs!), 7950 o/c added.
well i've been having some issues getting a quad 7970 machine up and running, so i haven't gotten the chance to benchmark one of my 7970s w/ the 213.76 tasks yet...maybe i'll have it figured out by the time the weekend rolls around...
 

Assimilator1

No probs :), btw the time removed was from a guy from the MW forum. But yes it would be good to have a time from an o/ced 7970.