MilkyWay@H - Benchmark thread Winter 2016 on (updated 1-2019) - GPU & CPU times wanted for new WUs

StefanR5R

Platinum Member
Dec 10, 2016
2,258
343
106
Gridcoin, a method to draw more suckers into the pyramid.
About how many daily GRC can I expect to generate at maximum output from one card with your specs above?
From some older posts I was given the impression one 7970 should yield 50GRC daily, but that hasn't been my experience.
From what I gather from this thread here, running multiple tasks simultaneously on the same GPU gives a throughput uplift which is small on AMD GPUs but moderate on NVidia GPUs. I read the uplift can be large on some high-end NVidia server GPUs.

With this information you can take a GPU's known single-task PPD and make a guess about its multiple-task PPD, if you can't test it yourself yet.

Then, RAC is a bit less than PPD, or in the ideal steady-state case, equal to PPD.
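
A rough way to picture why RAC lags behind PPD: the sketch below models RAC as an exponentially weighted average of daily credit. The one-week half-life and the once-a-day update are assumptions made for illustration, and `rac_after` is a made-up helper name, not something taken from BOINC itself.

```python
# Sketch: RAC modeled as an exponentially weighted average of daily credit.
# ASSUMPTIONS: one-week half-life, one update per day, constant daily output.
HALF_LIFE_DAYS = 7.0  # assumed half-life of the average


def rac_after(days: int, ppd: float, start_rac: float = 0.0) -> float:
    """Return the modeled RAC after `days` of constant `ppd` output."""
    # Fraction of the old average that survives one day.
    decay = 0.5 ** (1.0 / HALF_LIFE_DAYS)
    rac = start_rac
    for _ in range(days):
        rac = rac * decay + (1.0 - decay) * ppd
    return rac


if __name__ == "__main__":
    for d in (7, 14, 28, 56):
        print(d, "days:", round(rac_after(d, 100_000.0)))
```

Under these assumptions a new host reaches half of its PPD after one week and only approaches PPD itself after several weeks of uninterrupted crunching.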

Last, the conversion from RAC to GRC is, AFAIU, variable over time and perhaps unpredictable, since it depends on the overall volume of participation in the scheme. IMO it's off-topic here.
 

ao_ika_red

Golden Member
Aug 11, 2016
1,219
57
106
Well then. This is mine.
AMD RX 560 2GB @1300MHz on Athlon x4 845.
It took 238.83s to finish a 227.23 credited task.


As per the op, if you read it :p ;), I need an average of 5 WUs, so as to reduce variation.
Hi @Assimilator1 , it seems I still have an obligation to fix my milkyway benchmark. So, here it goes:
I got 5 tasks with 227.26 points each. My average runtime is 241.02s.
This is achieved using AMD RX560 2GB @1300MHz with AMD Athlon x4 845 @3.8GHz.


edit: All of today's tasks consist of 227.26 and 227.28 pts tasks so I wonder if my benchmark above is still relevant.
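
As a rough sketch of how a benchmark time like the one above turns into points per day: the helper below (`points_per_day` is a hypothetical name) assumes the GPU crunches around the clock with no gaps between tasks and that every task earns the same credit.

```python
SECONDS_PER_DAY = 86_400


def points_per_day(credit_per_task: float, seconds_per_task: float,
                   concurrent_tasks: int = 1) -> float:
    """Estimate PPD from the average wall time of one task.

    ASSUMPTION: tasks run back to back all day; with concurrent tasks,
    `seconds_per_task` is the wall time of each task while sharing the GPU.
    """
    tasks_per_day = concurrent_tasks * SECONDS_PER_DAY / seconds_per_task
    return credit_per_task * tasks_per_day


if __name__ == "__main__":
    # RX 560 figures from the post above: 227.26 credits in 241.02 s.
    print(round(points_per_day(227.26, 241.02)))
```

For the RX 560 figures this comes out at roughly 81,000 PPD, as an upper bound for uninterrupted crunching.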
 

StefanR5R

Not 100 % on-topic in this thread, but probably interesting to some:
Energy-efficiency of CPU versus GPU

I measured with the current mix of WUs which give 227.62 or 203.92 credits per task. I used the same hosts on which I performed an energy-efficiency comparison at SETI@home recently. Power consumption is total host consumption at the wall, including VRM inefficiency, PSU inefficiency, peripherals, fans, etc.

Unfortunately I do not have any AMD GPUs, especially ones with good double precision performance. Those would be older 28 nm tech, compared to the 16 nm GPUs and 14 nm CPUs that I have; but their dedicated double precision hardware should make them more efficient.

CPU host: dual Broadwell-EP
2x E5-2696 v4, 2x 22C/44T @ 2.8 GHz
running "MilkyWay@Home v1.46" on x86-64 Linux
throughput .................. 244,000 PPD (average from 200 valid tasks)
power consumption ........ ≈340 W (interpolated from a Watt meter to which more hosts are attached)
power efficiency ................ ≈8.3 points/kJ

GPU host: dual Pascal
2x GTX 1080Ti, watercooled, i7-7700K is only feeding the GPUs (i.e. no CPU tasks)
running "MilkyWay@Home v1.46 (opencl_nvidia_101)" on x86-64 Linux, 3 simultaneous tasks per GPU
throughput .................. 445,000 PPD (average from 200 valid tasks)
power consumption .......... 380 W (averaged from several spot readings from a dedicated Watt meter)
power efficiency ................ 13.6 points/kJ

I.e., my most efficient CPUs achieve 60 % of the efficiency of the GPU host, going by total host consumption at the wall. Keep in mind that higher-clocked desktop CPUs, never mind overclocked CPUs, are far less power efficient than these server CPUs.
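
The efficiency figures above can be reproduced from PPD and wall power with a few lines. This is only a sketch: `points_per_kilojoule` is a hypothetical helper, and it assumes the power draw stays constant over the whole day.

```python
SECONDS_PER_DAY = 86_400


def points_per_kilojoule(ppd: float, watts: float) -> float:
    """Credit earned per kJ of wall energy, assuming constant power draw."""
    kilojoules_per_day = watts * SECONDS_PER_DAY / 1000.0
    return ppd / kilojoules_per_day


if __name__ == "__main__":
    cpu = points_per_kilojoule(244_000, 340)  # dual Broadwell-EP host
    gpu = points_per_kilojoule(445_000, 380)  # dual GTX 1080Ti host
    print(f"CPU host: {cpu:.1f} points/kJ")
    print(f"GPU host: {gpu:.1f} points/kJ")
    print(f"CPU/GPU efficiency ratio: {cpu / gpu:.0%}")
```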

--------

In addition, I ran the normal "MilkyWay@Home v1.46" application and the "MilkyWay@Home N-Body Simulation v1.70 (mt)" application alongside on two E5-2690 v4 based hosts:
  • N-Body gave 1/4 the PPD compared to the normal application.
  • N-Body was configured to 4 threads per task, and the host a little bit over-committed in order to achieve 100 % CPU utilization like on the host with the normal application.
  • I don't have power readings from the respective hosts. But given same utilization, and CPUs running at the non-AVX turbo clock in both hosts, I presume that power consumption is similar between N-Body and the normal application.
 

StefanR5R

Again hijacking this thread for an off-topic post:
AMD has announced two new server GPUs called MI60 and MI50, using the Vega architecture ported to a 7 nm process and revamped in the double precision department (among other things):

AMD MI60 ............................................... 7.4 TFLOPS theoretical peak FP64 throughput
AMD MI50 ............................................... 6.7 TFLOPS peak FP64 (edit)
AMD FirePro W9100 (Hawaii XT) .......... 2.6 TFLOPS peak FP64
AMD Radeon 280X (Tahiti XTL) ..... 0.8 - 1.0 TFLOPS peak FP64

Nvidia Tesla V100 ................................... 7.5 TFLOPS peak FP64
Nvidia Titan V ......................................... 6.9 TFLOPS peak FP64

Availability: Q4/2018 (MI60), Q1/2019 (MI50). Price: maybe not attractive to the starving DCer.
 

Assimilator1

Elite Member
Nov 4, 1999
22,942
19
91
Thanks for keeping my thread alive Stefan ;), & posting some interesting info, good to know that GPUs still rule the roost for MW@H.

Insane DP output from the MI60/MI50 & Nvidia cards! Someone in my MilkyWay forum benchmark thread has posted times for a V100; running concurrent WUs, he's getting between 4-5s per WU!! :D

And yea as you noticed back in September MW has changed the WUs again, it seems that 227.62 & 203.92 are the common WUs now, are you still seeing that?
I'm looking at starting a new table with the 227.62 WUs.

@ao_ika_red
Sorry for the late reply, yeah unfortunately you're right, that was a different credit WU so will have slightly different run times. And they've changed again!
I'm looking at starting a new table with the 227.62 credit WUs, if people concur that is indeed a new common WU (& it's close to the old benchmark WU) then I'll start a new table. And you can post your average time for that WU :).
Are you seeing plenty of 227.62s?
 

ao_ika_red

Mine also has plenty of 227.62 and 203.92 tasks and no 227.26 or 227.28 task anymore.
 

Assimilator1

A few other people in the MW forum thread have also confirmed a good number of 227.62 WUs, so I will go with them for the next table.
 

IEC

Super Moderator
Jun 10, 2004
13,485
213
136
RTX 2080 Ti is a bust for Milkyway, lol:
112.17s for 227.62
117.18s for 203.92
111.27s for 227.62
112.17s for 227.62
112.16s for 227.62
108.14s for 227.62
109.13s for 227.62
109.14s for 227.62

It also barely heats up the card, running at 55°C at 60% fan speed, compared to maxing out at 80°C @ 80% fan under F@H or similar full loads.

This is with an i7-8700K @ 4.7GHz, no AVX offset.
 

Assimilator1

Not necessarily, try running 3 or 4 WUs at once (maybe more). MW has had an ongoing problem with Nvidia cards not being able to load them up properly; running multiple WUs gets around that.
Doesn't fix the poor single WU benchmark times though! I will add a separate table for cards running multiple WUs.
Oh btw, can you provide me with an average time of your 227.62 WUs above? :p [edit] Nm, I did it, 110.6s from 7 WUs
And is your card at stock clocks?

Oh another note, I'm benchmarking my RX 580 atm.
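So my RX 580's first 227.62 results are in (columns: task and work unit IDs run together, sent, reported, status, run time in s, CPU time in s, credit, application) :-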

1161409011708247987 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 11:54:12 UTC Completed and validated 97.29 24.49 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161414191708245697 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 11:52:30 UTC Completed and validated 104.25 25.86 203.92 MilkyWay@Home v1.46 (opencl_ati_101)
1161414451708304662 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 12:01:05 UTC Completed and validated 97.23 24.10 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161412791708290730 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 12:06:05 UTC Completed and validated 97.24 24.15 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161412871708317912 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 11:49:10 UTC Completed and validated 98.46 24.59 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161410481708121871 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 12:02:41 UTC Completed and validated 96.19 24.48 227.62 MilkyWay@Home v1.46 (opencl_ati_101)

So from those 5 227.62 WUs the average is ~97.3s
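
For anyone who wants to average pasted result rows without a calculator, here is a minimal sketch. It assumes the three numbers after "Completed and validated" are run time, CPU time and credit, in that order; `mean_runtime_by_credit` is a hypothetical helper name.

```python
import re
from collections import defaultdict
from statistics import mean

# The six RX 580 rows from above, pasted as-is.
ROWS = """\
1161409011708247987 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 11:54:12 UTC Completed and validated 97.29 24.49 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161414191708245697 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 11:52:30 UTC Completed and validated 104.25 25.86 203.92 MilkyWay@Home v1.46 (opencl_ati_101)
1161414451708304662 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 12:01:05 UTC Completed and validated 97.23 24.10 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161412791708290730 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 12:06:05 UTC Completed and validated 97.24 24.15 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161412871708317912 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 11:49:10 UTC Completed and validated 98.46 24.59 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161410481708121871 12 Jan 2019, 11:47:21 UTC 12 Jan 2019, 12:02:41 UTC Completed and validated 96.19 24.48 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
"""

# ASSUMED column order after the status text: run time, CPU time, credit.
ROW_RE = re.compile(r"Completed and validated\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)")


def mean_runtime_by_credit(text: str) -> dict:
    """Group validated rows by credit value and average their run times."""
    runtimes = defaultdict(list)
    for line in text.splitlines():
        match = ROW_RE.search(line)
        if match:
            run_s, _cpu_s, credit = match.groups()
            # Keep credit as a string so 227.62 and 203.92 stay exact keys.
            runtimes[credit].append(float(run_s))
    return {credit: mean(times) for credit, times in runtimes.items()}


if __name__ == "__main__":
    for credit, avg in mean_runtime_by_credit(ROWS).items():
        print(f"{credit} credits: {avg:.1f} s average")
```

Grouping by credit keeps the lone 203.92 WU out of the 227.62 average.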

Results from my 2nd rig's HD 7870 XT 3GB DS :-

1161979141708336607 12 Jan 2019, 13:01:41 UTC 12 Jan 2019, 13:34:05 UTC Completed and validated 73.36 14.57 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161977161708347064 12 Jan 2019, 13:01:41 UTC 12 Jan 2019, 13:27:35 UTC Completed and validated 73.19 14.65 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161974771708324410 12 Jan 2019, 13:01:41 UTC 12 Jan 2019, 13:45:25 UTC Completed and validated 73.40 14.38 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161974991708159228 12 Jan 2019, 13:01:41 UTC 12 Jan 2019, 13:34:05 UTC Completed and validated 72.35 14.49 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
1161975081708335074 12 Jan 2019, 13:01:41 UTC 12 Jan 2019, 13:06:31 UTC Completed and validated 73.40 14.55 227.62 MilkyWay@Home v1.46 (opencl_ati_101)
Average is ~73.2s

Interesting to see times from the 203.92 credit WUs (which seem to be a bit more common atm.......)

1161979151708336822 12 Jan 2019, 13:01:41 UTC 12 Jan 2019, 13:40:33 UTC Completed and validated 75.43 18.67 203.92 MilkyWay@Home v1.46 (opencl_ati_101)
1161979171708340864 12 Jan 2019, 13:01:41 UTC 12 Jan 2019, 13:37:18 UTC Completed and validated 77.39 18.77 203.92 MilkyWay@Home v1.46 (opencl_ati_101)


**************************************************************************

New times & table added to the op, post your 227.62 credit scores! :)

Can any modern card de-throne the 7-year-old HD 7970? It's still king of Milky Way! ;)o_O (single GPU, single WUs).
 

IEC

I guess that depends on what the DP rate of the newly-announced Radeon VII will be. The corresponding MI50 part is 6.7 TFLOPs for FP64... :eek:

Edit: Per Ryan Smith of AT:
FP64 is not among the couple of features they dialed back for the consumer card. The only things AMD has turned down for Radeon VII are Instinct drivers (obviously), PCIe 4.0 support, and the external Infinity Fabric link. All other Vega 20 features are intact
Source:
This seems unlikely to me, but if true this is going to be the DP compute card to get...
 

StefanR5R

Can any modern card de-throne the 7 year old HD 7970?, still king of Milky Way! ;)o_O (single GPU, single WUs).
(...which would be 38.2 seconds.)
Let's see:
  • Hawaii based FirePros (W8100, W9100, S9150, S9170): OK, they are not modern. ;-) Also, I am not sure if they really beat the 7970 when running only one task at a time. Edit 2: Probably not.
  • What about Volta (Titan, Quadro, Tesla)? Here is a host with Titan V which takes 49.6...50.7 seconds for a 227.62 WU, but I don't know if this is with 1 at a time or several at a time.
Edit 1:
Tesla V100: 37 s when 7 tasks run at once (WU type not reported)
Tesla P100: 55 s when 6 tasks run at once (WU type not reported)
(source from April 2018)
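
To put multi-task times next to single-task times like the 38.2 s above, one rough approach is to divide the per-task wall time by the number of tasks running at once. The sketch assumes all concurrent tasks take the same wall time and are comparable WUs, which the source does not confirm; `effective_seconds_per_task` is a hypothetical name.

```python
def effective_seconds_per_task(wall_seconds: float, tasks_at_once: int) -> float:
    """Wall time per completed task when several tasks share one GPU.

    ASSUMPTION: every concurrent task needs `wall_seconds` to finish.
    """
    if tasks_at_once < 1:
        raise ValueError("tasks_at_once must be at least 1")
    return wall_seconds / tasks_at_once


if __name__ == "__main__":
    print(f"Tesla V100: {effective_seconds_per_task(37, 7):.1f} s per WU")
    print(f"Tesla P100: {effective_seconds_per_task(55, 6):.1f} s per WU")
```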
 

StefanR5R

Techgage (link) and apparently LTT (link) were told at CES that Radeon VII won't have full FP64 support, and Techgage also got this confirmed by AMD's director of product marketing when they asked directly. :-(
 

plonk420

Senior member
Feb 6, 2004
315
4
81
Average of at least 5 WU times
89.13 (well, some were 86 seconds, some 91 seconds)
overall computer. my default is to run 3 at a time, so that's the 210-odd second units
86 seconds: de_modfit_sim19fixed_bundle4_4s_NoContraintsWithDisk260_1_1545263705_11648016_0
91 seconds: de_modfit_sim19fixed_bundle6_2s_NoContraintsWithDisk140_3_1545263705_11629274_1

A dedicated physical CPU core for each GPU
i've stopped all other processing, save for browser and some other junk like irc, discord, steam

Please state what speed & type CPU you have
i7-4790 (not K!), stock 3.8 (4.0 turbo), Radeon Fury

Please state GPU clock speeds if overclocked (including factory overclocks) or state 'stock'.
stock

For CPU times please state whether Hyper Threading (or equivalent) is enabled or not, times for both states welcomed.
HT: yes

It would also be useful if you could state your BOINC & driver version, & OS, in case it does make any difference.
7.10.2 x64, Win10, Radeon 19.1.1
 

Assimilator1

Thanks for your post plonk :)
You linked a screenshot but it only shows 1 credited (validated) 227.62 WU, & your valid results page only shows 2 singly crunched 227.62 WUs (86s each). The 91s ones are the 203.92 credit WUs ;).

Techgage (link) and apparently LTT (link) were told at CES that Radeon VII won't have full FP64 support, and Techgage got this also confirmed from AMD's director of product marketing when they asked directly. :-(
Yea I'd just read that, pity :(.

Btw, whilst some of the high end Nvidia cards do indeed have more DP power than the HD 7970 they are unable to run single WUs efficiently & so have slower times, however when they run multiple WUs they do much better :).
 

Assimilator1

Err, ok so that works out to ~209 GFLOPs for that card? That sucks, if true :( .
 

StefanR5R

No; about 1.67 TFLOPS would be the peak DP spec, whereas 8x of that = 13.4 TFLOPS would be the peak SP spec (if clocked like the Instinct MI50 server card).
 

IEC

Err, ok so that works out to ~209 GFLOPs for that card? That sucks, if true :( .
I should have been clearer in my post, but the 1.67 TFLOPS is the expected peak FP64 rate post-gimping. Or 1/4 of the MI50 part.
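
The arithmetic behind those numbers, as a quick sketch. It takes the 6.7 TFLOPS MI50 figure quoted earlier in the thread, the 1/4 factor and the 8x SP:DP ratio from the posts above, and assumes the consumer card would be clocked like the MI50; the constant names are made up for illustration.

```python
MI50_FP64_TFLOPS = 6.7     # peak FP64 figure quoted earlier in the thread
FP64_CUT_FACTOR = 1 / 4    # expected FP64 rate relative to the MI50
SP_TO_DP_RATIO = 8         # SP throughput as a multiple of the cut-down DP rate

# Expected peak DP rate of the consumer card, if clocked like the MI50.
expected_fp64 = MI50_FP64_TFLOPS * FP64_CUT_FACTOR
# Peak SP rate implied by the 8x ratio.
implied_fp32 = expected_fp64 * SP_TO_DP_RATIO

if __name__ == "__main__":
    print(f"expected peak FP64: {expected_fp64:.3f} TFLOPS")
    print(f"implied peak FP32:  {implied_fp32:.1f} TFLOPS")
```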
 

plonk420

does ANY 227.xx points count? i've got 89.52 seconds averaged over 18 results worth 227.13-227.18 points

edit: also, running 3 simultaneously averages 227.45 seconds each over 11 results
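
A rough sketch of the throughput uplift implied by those two averages. It assumes both averages cover comparable 227.1x WUs and that the three concurrent tasks each take the quoted wall time; `throughput_uplift` is a hypothetical helper.

```python
def throughput_uplift(single_seconds: float, concurrent_seconds: float,
                      tasks_at_once: int) -> float:
    """Ratio of multi-task throughput to single-task throughput.

    ASSUMPTION: `concurrent_seconds` is the wall time of each task while
    `tasks_at_once` tasks share the GPU.
    """
    single_rate = 1.0 / single_seconds
    concurrent_rate = tasks_at_once / concurrent_seconds
    return concurrent_rate / single_rate


if __name__ == "__main__":
    uplift = throughput_uplift(89.52, 227.45, 3)
    print(f"uplift from running 3 at once: {uplift - 1:.0%}")
```

With these figures the Fury gains a bit under 20 % from running three tasks at once.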
 

ao_ika_red

@Assimilator1 I don't see 227.62 tasks anymore. Instead, mostly it consists of 227.1x tasks.
 

Assimilator1

*sigh* Yep, I see the same for my 2nd PC. If they keep changing WUs this fast it won't be possible to collate WU times.
Looking at the 227.1x results I have, they vary from 68.2-75.5s, which is quite a variation. This rig's 227.62 average time was 73.2s, so maybe they can be used, but I don't know yet; I'll have to see what times my rig gets, & those of anyone else who's posted a 227.62 time.

7 current results for my 2nd rig :-

1254450991712933764566187 21 Jan 2019, 21:40:58 UTC 28 Jan 2019, 20:47:12 UTC Completed and validated 71.63 10.90 227.17 MilkyWay@Home v1.46 (opencl_ati_101)
1250275071712742444566187 21 Jan 2019, 12:56:06 UTC 21 Jan 2019, 16:42:32 UTC Completed and validated 73.25 10.31 227.17 MilkyWay@Home v1.46 (opencl_ati_101)
1231788211711808478566187 19 Jan 2019, 14:36:44 UTC 19 Jan 2019, 14:57:56 UTC Completed and validated 68.20 7.22 227.13 MilkyWay@Home v1.46 (opencl_ati_101)
1231786721711808386566187 19 Jan 2019, 14:36:44 UTC 19 Jan 2019, 14:44:49 UTC Completed and validated 75.47 11.67 227.18 MilkyWay@Home v1.46 (opencl_ati_101)
1231786861711808400566187 19 Jan 2019, 14:36:43 UTC 19 Jan 2019, 14:39:57 UTC Completed and validated 72.53 11.68 227.18 MilkyWay@Home v1.46 (opencl_ati_101)
 

IEC

Managed to snag a XFX-branded Radeon VII at Newegg today.

I bought it mainly for the FP64 compute (3.46 TFLOPs i.e. 1/2 of MI50) and will update this thread when I get a chance to test it sometime next weekend.
 

