8th Annual BOINC Pentathlon


StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[City Run, WCG OpenZika]

Argh... My 2P boxes have already finished all their OpenZika work. Off to create additional bunkers.

Edit: The guide from OCN about multiple instances basically works. At WCG, however, it has the downside that client identity is not copied to the new instance, so WCG wants to see some valid results returned from the new instance before handing out larger numbers of WUs. -- For the future, I need to figure out how to copy client identity without copying the full client state.

There is indeed a limit of 1000 runnable tasks imposed by the client.
There is no similar limit for (trusted) clients imposed by WCG, from what I see so far.

Imagine a computer with 56 threads, doing jobs which take 65 minutes on average to complete.
A cache of 1000 jobs is work for 65000 / 56 = 1160 minutes = 19 hours. :mad:
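
(For anyone who wants to plug in their own numbers, here is that arithmetic as a tiny Python sketch; the 65-minute average is just what I see on my boxes.)

Code:
def bunker_hours(cached_tasks, threads, minutes_per_task):
    # wall-clock hours until the cache runs dry, assuming all threads stay busy
    return cached_tasks * minutes_per_task / threads / 60.0

print(round(bunker_hours(1000, 56, 65), 1))   # 19.3 -> the 2P box above
print(round(bunker_hours(1000, 8, 65), 1))    # 135.4 -> a plain 4C/8T desktop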

Fortunately I believe I learned how to copy client identity such that the new client instance is known to, and trusted by, WCG. That is, the new instance is able to fetch another 1000 WUs from WCG without having to send some valid results first.

Edit:
Hmm, I'm afraid I did not achieve what I wanted. When the new client fetches work, it emits log lines like:
Requesting new tasks for CPU
Scheduler request completed: got 15 new tasks
Resent lost task ZIKA_[long alphanumeric code]
Resent lost task ZIKA_[long alphanumeric code]
Resent lost task ZIKA_[long alphanumeric code]
etc..
These are tasks which are already present in the original client instance. :(
Hence, there is a lot of specific client state that would need to be copied in addition to client identity.

Edit:
Completing enough validated work to be eligible for more than min{n_cpus+2; 66} tasks*, and then downloading 1000 tasks**, takes about 3 hours here and now. (And, as noted above, that is a supply of work for a mere 19 hours.)

*) limit imposed by WCG on n00bs
**) limit imposed by the boinc client itself, afaik
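
(Expressed as code, the n00b limit from *) as I currently understand it; this is my own observation, not documented WCG policy.)

Code:
def initial_task_limit(logical_cpus):
    # WCG's allowance for a client with no validated OpenZika results yet,
    # as far as I can tell: n_cpus + 2, capped at 66
    return min(logical_cpus + 2, 66)

print(initial_task_limit(8))    # 10
print(initial_task_limit(56))   # 58 -- the 66 cap only bites above 64 threads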

Edit:
The saga of almost absurdly small OpenZika bunkers continues:

I now checked my 4C/8T Haswell. It had received 280 tasks on Monday, then was denied further tasks with "This computer has reached a limit on tasks in progress". (I guess this is a per-core limit which could be circumvented by the ncpus tag in cc_config.xml.) This little bunker will complete tomorrow.
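
(Untested on my side, but if somebody wants to try the <ncpus> route, it would look roughly like this. The data directory path is only an example; Linux packages often use /var/lib/boinc-client. Restart the client or use "Read config files" afterwards.)

Code:
from pathlib import Path

# Untested sketch: make the client pretend to have more CPUs via <ncpus>
# in cc_config.xml. Adjust the path to wherever your client keeps its files.
CC_CONFIG = """<cc_config>
  <options>
    <ncpus>64</ncpus>
  </options>
</cc_config>
"""

Path("/var/lib/boinc-client/cc_config.xml").write_text(CC_CONFIG)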

While I am shocked that not even a normal PC is allowed to queue up more than ~2 days' worth of OpenZika, I also find some solace in the fact that there is a high barrier against ~5-day bunkers not just for folks with bigger machines like mine, but also for the many, many users out there with more commonly sized machines. IOW, I presume that lots of participants won't have the time, patience, and knowledge required to build a 5-day OpenZika bunker.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
[City Run, WCG OpenZika]

The problem with starting a new VM seems to be that you need to get at least one WU validated before it will send you a bunch of work. I've sent in 5 so far, none validated yet. Maybe I'll stop at 8 and resume my other VM.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
Well that's a new message. "This computer has finished a quota of x tasks." And it won't send me any more.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,565
14,520
136
If somebody wants to tell me what to do to join this race, I could help. I saw something about Zika, I have over 200 tasks queued up for that one.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
If somebody wants to tell me what to do to join this race, I could help. I saw something about Zika, I have over 200 tasks queued up for that one.
Sounds like you're doing what you need to do. I'm doing the same, except I have close to 1000 tasks between two computers and two VMs.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[City Run, WCG OpenZika]

@Markfw,
I will try to make a summary of the race particulars later today, unless somebody else does...

For now, it is important to know about the "City Run" sub-race:
  • Only OpenZika points count. For the remaining time leading up to the race, and during the race, please go to www.worldcommunitygrid.org -> "My Contribution" -> "My Project", and deselect all subprojects, then select OpenZika.
    The next time your machines ask the server for more work, they will get OpenZika.
  • If you already downloaded tasks of other WCG subprojects, abort them unless you can complete them shortly.
  • Only OpenZika Points granted before end of the race count.
    End of the "City Run": May 10, 00:00 UTC
  • Only OpenZika Points granted after beginning of the race count.
    Beginning of the race is as soon as the organizers have posted initial stats at https://www.seti-germany.de/boinc_pentathlon/statistiken/pentathlon.php in the City Run column. This will happen shortly after May 5, 00:00 UTC.
  • Please try to retain completed work until after seti-germany has posted the initial stats! E.g. "suspend network activity" in boincmgr after you downloaded enough OpenZika tasks (for headless machines, see the boinccmd sketch after this list).
Exception to the last rule:
  • If you happen to have a client which never crunched OpenZika before, the WCG servers will let you have at most (logical cores + 2) WUs, capped at 66, at any moment. Only after the client has completed and uploaded a few WUs and had them validated at WCG (i.e. WCG received matching results for these WUs from another donor) will WCG let you have more WUs to store and crunch on for a while.
  • Unfortunately, many people are currently retaining their work until the race begins, or even longer. This means that the work which the new client submits now is less likely to be validated before the race. Hence the new client must crunch through a whole bunch of WUs and upload them before the first validation succeeds.
  • As a consolation, all of the WUs that you submit now but get validated only after the race, will be counted during the race and hence contribute to the team rank.
Again, this exception applies only to clients which have never returned any OpenZika before.
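
For the headless boxes among us, the "suspend network activity" step can also be scripted around boinccmd instead of boincmgr. A rough sketch in Python; I leave out the RPC password/host options, add whatever your setup needs:

Code:
import subprocess

def set_network_mode(mode):
    # "never" = hold completed results back (bunker), "always" = release them
    subprocess.run(["boinccmd", "--set_network_mode", mode], check=True)

set_network_mode("never")     # after the OpenZika queue is filled
# set_network_mode("always")  # once seti-germany has posted the initial stats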

Edit:
Clients which have submitted valid OpenZika work in the past will mostly receive WUs for which WCG lowered the "quorum" to 1, which means they are accepted as valid immediately, without a double-check by another submitter. WCG will only occasionally send OpenZika WUs that need validation or verification to veteran clients, for spot checks.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[City Run, WCG OpenZika]

The problem with starting a new VM seems to be that you need to get at least one WU validated before it will send you a bunch of work. I've sent in 5 so far, none validated yet. Maybe I'll stop at 8 and resume my other VM.

If the wingman of these WUs in limbo is a participant of the race, you will get these WUs validated during the race, and thus they will count for the race.

On my machines which cannot bunker more than 1 or 2 days, I simply let new client instances complete and send work for about 2 hours, until enough validations went through (a single one?). *) If there are some WUs "wasted" this way, they are still contributing to science...

On my Windows machines I don't have this problem because they are nicely loaded up with Cosmology in parallel.

*) Edit:
I now did this on a 4C/8T PC. One validated WU on its own is not enough.

This machine was allowed to download more tasks 1h58m after one WU was validated (so far, only this one WU was validated), and 3h7m after it downloaded its first WU.

So maybe the rule is: Get 1 WU validated and wait patiently for 2 additional hours.
 

Kiska

Golden Member
Apr 4, 2012
1,013
290
136
@StefanR5R Your efforts to copy and paste the client_state file will not succeed, as the WCG scheduler request includes a summary of what the computer in question is currently crunching. So on the next scheduler request, if the WCG servers don't see work being processed where it should be, they will either resend the workunit or mark it as abandoned.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
Well that's a new message. "This computer has finished a quota of x tasks." And it won't send me any more.

I received this message now too. There is a typo in this message.
"This computer has finished a daily quota of 24 tasks"​
CPU: 4C/8T, 198 tasks total, 2 valid, 15 pending validation, 181 in progress.
I.e. this message actually means
"This computer has not yet finished a daily quota of 24 tasks".​
After I reported 7 more tasks (exactly 24 finished), this download throttle went away.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,565
14,520
136
[City Run, WCG OpenZika]

@Markfw,
I will try to make a summary of the race particulars later today, unless somebody else does...

For now, it is important to know about the "City Run" sub-race:
  • Only OpenZika points count. For the remaining time leading up to the race, and during the race, please go to www.worldcommunitygrid.org -> "My Contribution" -> "My Project", and deselect all subprojects, then select OpenZika.
    The next time your machines ask the server for more work, they will get OpenZika.
  • If you already downloaded tasks of other WCG subprojects, abort them unless you can complete them shortly.
  • Only OpenZika Points granted before end of the race count.
    End of the "City Run": May 10, 00:00 UTC
  • Only OpenZika Points granted after beginning of the race count.
    Beginning of the race is as soon as the organizers have posted initial stats at https://www.seti-germany.de/boinc_pentathlon/statistiken/pentathlon.php in the City Run column. This will happen shortly after May 5, 00:00 UTC.
  • Please try to retain completed work until after seti-germany has posted the initial stats! E.g. "suspend network activity" in boincmgr after you downloaded enough OpenZika tasks.
Exception to the last rule:
  • If you happen to have a client which never crunched OpenZika before, the WCG servers will let you have at most (logical cores + 2) WUs, capped at 66, at any moment. Only after the client has completed and uploaded a few WUs and had them validated at WCG (i.e. WCG received matching results for these WUs from another donor) will WCG let you have more WUs to store and crunch on for a while.
  • Unfortunately, many people are currently retaining their work until the race begins, or even longer. This means that the work which the new client submits now is less likely to be validated before the race. Hence the new client must crunch through a whole bunch of WUs and upload them before the first validation succeeds.
  • As a consolation, all of the WUs that you submit now but get validated only after the race, will be counted during the race and hence contribute to the team rank.
Again, this exception applies only to clients which have never returned any OpenZika before.

Edit:
Clients which have submitted valid OpenZika work in the past will mostly receive WUs for which WCG lowered the "quorum" to 1, which means they are accepted as valid immediately, without a double-check by another submitter. WCG will only occasionally send OpenZika WUs that need validation or verification to veteran clients, for spot checks.
Done !! 144 cores changed
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
I received this message now too. There is a typo in this message.
"This computer has finished a daily quota of 24 tasks"​
CPU: 4C/8T, 198 tasks total, 2 valid, 15 pending validation, 181 in progress.
I.e. this message actually means
"This computer has not yet finished a daily quota of 24 tasks".​
After I reported 7 more tasks (exactly 24 finished), this download throttle went away.
Thanks! I thought I'd done something wrong when I released 3 WUs last night, and the quota went up from 13 to 17. o_O Now it makes sense.

I released 74 WUs today, to obtain 57 more, but some of those were probably quorum 2, so it's probably a wash.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
@StefanR5R Your efforts to copy and paste the client_state file will not succeed, as the WCG scheduler request includes a summary of what the computer in question is currently crunching. So on the next scheduler request, if the WCG servers don't see work being processed where it should be, they will either resend the workunit or mark it as abandoned.

Thanks. Meanwhile I also saw the risk of existing WUs getting detached discussed at the OcUK forum. From what I saw at my own WCG "results status" page though, I apparently got lucky and the original client still has all WUs in progress, none detached. (I tore down the new client and set up another one with a separate identity. OK, actually not 1 but 10 in total now besides the original ones, and that's just for 4 days of bunkering because I did not enter this business immediately after seti-ger's announcement.)

Fingers crossed that what I'm gonna upload will be accepted.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
On April 30, the first two Pentathlon projects were announced. The effect:
[six stats graphs]
Graphs taken from stats.free-dc.org.
 

Jondi

Junior Member
Apr 16, 2017
19
21
51
Thanks. Meanwhile I also saw the risk of existing WUs getting detached discussed at the OcUK forum. From what I saw at my own WCG "results status" page though, I apparently got lucky and the original client still has all WUs in progress, none detached. (I tore down the new client and set up another one with a separate identity. OK, actually not 1 but 10 in total now besides the original ones, and that's just for 4 days of bunkering because I did not enter this business immediately after seti-ger's announcement.)

Fingers crossed that what I'm gonna upload will be accepted.

Can I ask, Stefan: is this on Linux? If so, which distro, and which boinc client version?
One of our members has a theory that my troubles might be because I am using an old boinc client.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[Marathon, Cosmology@Home]

Some data about credits/hour (= PPD / 24).

Ivy Bridge-E, planck_param_sims v2.04 (vbox64_mt) windows_x86_64
constantly 50 credits/WU
runtime min/ max/ avg = 394/ 993/ 652 s (from 7 WUs)
min/ max/ avg 181/ 457/ 313 credits/hour (one 6-threaded task at a time)​

mobile Haswell, planck_param_sims v2.04 (vbox64_mt) windows_x86_64
constantly 50 credits/WU
runtime min/ max/ avg = 515/ 1,319/ 979 s (from 5 WUs)
min/ max/ avg 136/ 349/ 206 credits/hour (one 4-threaded task at a time)​

Broadwell-E, planck_param_sims v2.04 (vbox64_mt) windows_x86_64
constantly 50 credits/WU
runtime min/ max/ avg = 276/ 1,297/ 518 s (from 15 WUs)
min/ max/ avg 4 * 139/ 653/ 403 credits/hour (four 5-threaded tasks at a time)​

Broadwell-E, camb_boinc2docker v2.04 (vbox64_mt) windows_x86_64
min/ max/ avg = 83/ 92/ 89 credits/WU
runtime min/ max/ avg = 275/ 306/ 295 s (from 10 WUs)
constantly 4 * 1,082 credits/hour (four 5-threaded tasks at a time)​

Notes:
Some of the Broadwell-E runtimes are from tasks which I did not run in the 4x5 config, but I did not note which ones.
I don't have data from camb_legacy on any of these machines.
I also don't have data from camb_boinc2docker on the former two machines.​

Conclusion: camb_boinc2docker > planck_param_sims, credit-wise. YMMV.

(OTOH the project scientists prefer planck_param_sims to be worked on in the short term, to support a paper they are preparing right now.)

Did anybody upload results from one of the VM applications and from camb_legacy from one and the same machine, so that we can compare camb_legacy's yield directly?

Another conclusion: That Broadwell-E was obscenely expensive, but what's not to like about almost 8 times the throughput compared to the mobile Haswell?
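
For reference, the credits/hour lines above are nothing more than credits per WU scaled to an hour and multiplied by the number of concurrent tasks. A small sketch; the slight offset from the 1,082 figure is just rounding and how the averages are taken:

Code:
def credits_per_hour(credits_per_wu, runtime_s, parallel_tasks=1):
    # per-task rate scaled to an hour, times the number of concurrent tasks
    return credits_per_wu * 3600.0 / runtime_s * parallel_tasks

# Broadwell-E, camb_boinc2docker: avg 89 credits in avg 295 s, 4 tasks at a time
rate = credits_per_hour(89, 295, parallel_tasks=4)
print(round(rate), round(rate * 24))   # ~4,340 credits/hour, ~104,000 PPD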
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
Can I ask, Stefan: is this on Linux? If so, which distro, and which boinc client version?
One of our members has a theory that my troubles might be because I am using an old boinc client.

You can ask and I do answer, but can you trust this answer? ;)

So far I did this multiple-instance thing only on Linux machines.
1x OpenSuse Tumbleweed, boinc 7.6.33-2.1 from the default package repo
3x Gentoo, boinc 7.6.33-r3 from the default package repo

Edit: I followed OCN's guide. This basically means a fresh and virtually empty boinc data directory is created for each new instance. Each client got a different boinc computer ID assigned automagically.
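
Roughly, it boils down to something like this (paths and the RPC port are just examples, and whether you need --allow_multiple_clients may depend on how your distro starts the first client):

Code:
import os
import subprocess

def start_instance(data_dir, rpc_port):
    # each instance gets its own, initially empty data directory and its own
    # GUI RPC port; the project then assigns it a separate computer ID
    os.makedirs(data_dir, exist_ok=True)
    return subprocess.Popen([
        "boinc",
        "--allow_multiple_clients",
        "--dir", data_dir,
        "--gui_rpc_port", str(rpc_port),
        "--daemon",
    ])

start_instance("/home/boinc/instance2", 31418)

The new instance can then be attached the usual way, e.g. with boinccmd --host localhost:31418 and --project_attach, or from boincmgr pointed at that port.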
 

Jondi

Junior Member
Apr 16, 2017
19
21
51
Of course I trust your answer, we're all friends here :)
I'll have another go, but not until after the Pentathlon is over; I don't want any more accidents :(

Thanks.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[Marathon, Cosmology@Home]
camb_boinc2docker > planck_param_sims, credit-wise. YMMV.

(OTOH the project scientists prefer planck_param_sims to be worked on in the short term, to support a paper they are preparing right now.)

Right now, the server status page shows
500 available camb_legacy tasks,
3 available camb_boinc2docker tasks,
13733 available planck_param_sims tasks.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[Marathon, Cosmology@Home]
Did anybody upload results from one of the VM applications and from camb_legacy from one and the same machine, so that we can compare camb_legacy's yield directly?

No, but searching other users might answer the question. For example: https://www.cosmologyathome.org/results.php?hostid=289601&offset=0&show_names=0&state=4&appid=2

Nice, thank you. This host is a dual Xeon X5570 with 2x 4C/8T (16T total). Let's assume it is doing 16 camb_legacy tasks in parallel. What we don't know is how many camb_boinc2docker tasks it is doing in parallel. Perhaps 2, one on each socket.

camb_legacy
min/ max/ avg = 179/ 556/ 412 credits/WU (from 23 WUs)
runtime min/ max/ avg = 28,002/ 88,001/ 63,620 s
min/ max/ avg (16x?) 21.6/ 24.0/ 23.2 credits/hour (16 single-threaded tasks at a time?)​

camb_boinc2docker
min/ max/ avg = 23/ 38/ 29 credits/WU (from 57 WUs)
runtime min/ max/ avg = 365/ 475/ 415 s
CPU time min/ max/ avg = 2,146/ 2,976/ 2,349 s
hence CPU time = 5.66 * run time -> 8-threaded tasks?
min/ max/ avg (2x?) 230/ 287/ 251 credits/hour (2 octa-threaded tasks at a time?)​

If my guesses about parallelism on this host are right, then we have
camb_legacy = 371 credits/hour = 8,911 PPD (total from all threads),
camb_boinc2docker = 502 credits/hour = 12,037 PPD (total from all threads).​

Conclusion: PPD for camb_boinc2docker > camb_legacy > planck_param_sims,
circa at a ratio of 2.75 : 2 : 1.
(The first version of this post had camb_boinc2docker > planck_param_sims > camb_legacy; see Edit 2 below.)

Edit: This interpolation may be exaggerated for one reason or another. See the estimation for the i5-6200U below.

Edit 2: Oops, serious copy&paste error in my spreadsheet. camb_boinc2docker PPD were completely wrong; corrected now.
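
The corrected spreadsheet boils down to this (with my guessed parallelism baked in, so treat the absolute PPD with care; the small offsets versus 8,911 / 12,037 come from how the averages are taken):

Code:
def host_ppd(credits_per_wu, runtime_s, parallel_tasks):
    # per-task credits/hour times concurrent tasks, times 24 hours
    return credits_per_wu * 3600.0 / runtime_s * parallel_tasks * 24

legacy = host_ppd(412, 63620, 16)   # ~9,000 PPD  (16 single-threaded tasks?)
docker = host_ppd(29, 415, 2)       # ~12,100 PPD (2 octa-threaded tasks?)
print(round(legacy), round(docker), round(docker / legacy, 2))   # ratio ~1.35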
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
I had four legacy WUs upload. No VMs though.
About 24 credits per hour :( per core.

If you crunch 4 legacy tasks in parallel, that would be ~100 credits/h.

Your i5-6200U (mobile Skylake, configurable to 7.5, 15, or 25 W TDP) should give perhaps on the order of 2/3 the throughput of my i7-4900MQ (mobile Haswell, 47 W TDP, hyperthreading currently switched off). Could this be right? That could be ~140 credits/h for planck_param_sims, assuming that camb_boinc2docker tasks are hard to get.

Which means the PPD of Planck vs. legacy on your host wouldn't be 1:2 like I interpolated from my desktop and that Xeon (4:1 before the correction), but more like 3:2. :confused:

Edit:
Previous calculation for the Xeon was seriously wrong; corrected in the post further above.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
OK, my calculations and estimations above are a mess. I need to compute a few of those camb_legacy tasks myself on Friday or so.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
www.google.com
Somewhat unrelated, but related..... You may now witness the firepower of this fully armed and operational battle station, err, ThunderStrike.

28 cores/56 threads, and quad GTX 1080s ready for Pentathlon battle! Pics.

Fun fact: This quad 1080 machine pulls a matching 1080 watts at the wall. :)
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,565
14,520
136
Where is the 4th video card? And do I see PCIe risers on the middle card for cooling?