8th Annual BOINC Pentathlon


StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[City Run, WCG OpenZika]

Argh... My 2P boxes have already finished all their OpenZika work. Off to create additional bunkers.

Edit: The guide from OCN about multiple instances basically works. At WCG, however, it has the downside that client identity is not copied to the new instance, so WCG wants to see some valid results returned from the new instance before handing out larger numbers of WUs. -- For the future, I need to figure out how to copy client identity without copying the full client state.

There is indeed a limit of 1000 runnable tasks imposed by the client.
There is no similar limit for (trusted) clients imposed by WCG, from what I see so far.

Imagine a computer with 56 threads, doing jobs which take 65 minutes on average to complete.
A cache of 1000 jobs is work for 65000 / 56 = 1160 minutes = 19 hours. :mad:
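
(For anyone who wants to plug in their own numbers, here is that arithmetic as a tiny Python sketch; the 65-minute average is just what I see on my boxes.)

Code:
def bunker_hours(cached_tasks, threads, minutes_per_task):
    # wall-clock hours until the cache runs dry, assuming all threads stay busy
    return cached_tasks * minutes_per_task / threads / 60.0

print(round(bunker_hours(1000, 56, 65), 1))   # 19.3 -> the 2P box above
print(round(bunker_hours(1000, 8, 65), 1))    # 135.4 -> a plain 4C/8T desktop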

Fortunately I believe I learned how to copy client identity such that the new client instance is known to, and trusted by, WCG. That is, the new instance is able to fetch another 1000 WUs from WCG without having to send some valid results first.

Edit:
Hmm, I'm afraid I did not achieve what I wanted. When the new client fetches work, it emits log lines like:
Requesting new tasks for CPU
Scheduler request completed: got 15 new tasks
Resent lost task ZIKA_[long alphanumeric code]
Resent lost task ZIKA_[long alphanumeric code]
Resent lost task ZIKA_[long alphanumeric code]
etc..
These are tasks which are already present in the original client instance. :(
Hence, there is a lot of specific client state that would need to be copied in addition to client identity.

Edit:
Completing enough validated work to be eligible for more than min{n_cpus+2; 66} tasks*, and then downloading 1000 tasks**, takes about 3 hours here and now. (And, as noted above, that is a supply of work for a mere 19 hours.)

*) limit imposed by WCG on n00bs
**) limit imposed by the boinc client itself, afaik
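
(Expressed as code, the n00b limit from *) as I currently understand it; this is my own observation, not documented WCG policy.)

Code:
def initial_task_limit(logical_cpus):
    # WCG's allowance for a client with no validated OpenZika results yet,
    # as far as I can tell: n_cpus + 2, capped at 66
    return min(logical_cpus + 2, 66)

print(initial_task_limit(8))    # 10
print(initial_task_limit(56))   # 58 -- the 66 cap only bites above 64 threads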

Edit:
The saga of almost absurdly small OpenZika bunkers continues:

I now checked my 4C/8T Haswell. It had received 280 tasks on Monday, then was denied further tasks with "This computer has reached a limit on tasks in progress". (I guess this is a per-core limit which could be circumvented by the ncpus tag in cc_config.xml.) This little bunker will complete tomorrow.
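
(Untested on my side, but if somebody wants to try the <ncpus> route, it would look roughly like this. The data directory path is only an example; Linux packages often use /var/lib/boinc-client. Restart the client or use "Read config files" afterwards.)

Code:
from pathlib import Path

# Untested sketch: make the client pretend to have more CPUs via <ncpus>
# in cc_config.xml. Adjust the path to wherever your client keeps its files.
CC_CONFIG = """<cc_config>
  <options>
    <ncpus>64</ncpus>
  </options>
</cc_config>
"""

Path("/var/lib/boinc-client/cc_config.xml").write_text(CC_CONFIG)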

While I am shocked that not even a normal PC is allowed to queue up more than ~2 days' worth of OpenZika, I also find some solace in the fact that there is a high barrier against ~5-day bunkers not just for folks with bigger machines like mine, but also for the many, many users out there with more commonly sized machines. IOW, I presume that lots of participants won't have the time, patience, and knowledge required to build a 5-day OpenZika bunker.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
[City Run, WCG OpenZika]

The problem with starting a new VM seems to be that you need to get at least one WU validated before it will send you a bunch of work. I've sent in 5 so far, none validated yet. Maybe I'll stop at 8 and resume my other VM.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
Well that's a new message. "This computer has finished a quota of x tasks." And it won't send me any more.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,565
14,520
136
If somebody wants to tell me what to do to join this race, I could help. I saw something about Zika, I have over 200 tasks queued up for that one.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
If somebody wants to tell me what to do to join this race, I could help. I saw something about Zika, I have over 200 tasks queued up for that one.
Sounds like you're doing what you need to do. I'm doing the same, except I have close to 1000 tasks between two computers and two VMs.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[City Run, WCG OpenZika]

@Markfw,
I will try to make a summary of the race particulars later today, unless somebody else does...

For now, it is important to know about the "City Run" sub-race:
  • Only OpenZika points count. For the remaining time leading up to the race, and during the race, please go to www.worldcommunitygrid.org -> "My Contribution" -> "My Project", and deselect all subprojects, then select OpenZika.
    The next time your machines ask the server for more work, they will get OpenZika.
  • If you already downloaded tasks of other WCG subprojects, abort them unless you can complete them shortly.
  • Only OpenZika Points granted before end of the race count.
    End of the "City Run": May 10, 00:00 UTC
  • Only OpenZika Points granted after beginning of the race count.
    Beginning of the race is as soon as the organizers have posted initial stats at https://www.seti-germany.de/boinc_pentathlon/statistiken/pentathlon.php in the City Run column. This will happen shortly after May 5, 00:00 UTC.
  • Please try to retain completed work until after seti-germany has posted the initial stats! E.g. "suspend network activity" in boincmgr after you downloaded enough OpenZika tasks (for headless machines, see the boinccmd sketch after this list).
Exception to the last rule:
  • If you happen to have a client which never crunched OpenZika before, the WCG servers will let you have at most (logical cores + 2) WUs, capped at 66, at any moment. Only after the client has completed and uploaded a few WUs and had them validated at WCG (i.e. WCG received matching results for these WUs from another donor) will WCG let you have more WUs to store and crunch on for a while.
  • Unfortunately, many people are currently retaining their work until the race begins, or even longer. This means that the work which the new client submits now is less likely to be validated before the race. Hence the new client must crunch through a whole bunch of WUs and upload them before the first validation succeeds.
  • As a consolation, all of the WUs that you submit now but get validated only after the race, will be counted during the race and hence contribute to the team rank.
Again, this exception applies only to clients which have never returned any OpenZika before.
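
For the headless boxes among us, the "suspend network activity" step can also be scripted around boinccmd instead of boincmgr. A rough sketch in Python; I leave out the RPC password/host options, add whatever your setup needs:

Code:
import subprocess

def set_network_mode(mode):
    # "never" = hold completed results back (bunker), "always" = release them
    subprocess.run(["boinccmd", "--set_network_mode", mode], check=True)

set_network_mode("never")     # after the OpenZika queue is filled
# set_network_mode("always")  # once seti-germany has posted the initial stats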

Edit:
Clients which have submitted valid OpenZika work in the past will mostly receive WUs for which WCG lowered the "quorum" to 1, which means they are accepted as valid immediately, without a double-check by another submitter. WCG will only occasionally send OpenZika WUs that need validation or verification to veteran clients, for spot checks.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[City Run, WCG OpenZika]

The problem with starting a new VM seems to be that you need to get at least one WU validated before it will send you a bunch of work. I've sent in 5 so far, none validated yet. Maybe I'll stop at 8 and resume my other VM.

If the wingman of these WUs in limbo is a participant of the race, you will get these WUs validated during the race, and thus they will count for the race.

On my machines which cannot bunker more than 1 or 2 days, I simply let new client instances complete and send work for about 2 hours, until enough validations went through (a single one?). *) If there are some WUs "wasted" this way, they are still contributing to science...

On my Windows machines I don't have this problem because they are nicely loaded up with Cosmology in parallel.

*) Edit:
I now did this on a 4C/8T PC. One validated WU on its own is not enough.

This machine was allowed to download more tasks 1h58m after one WU was validated (so far, only this one WU was validated), and 3h7m after it downloaded its first WU.

So maybe the rule is: Get 1 WU validated and wait patiently for 2 additional hours.
 

Kiska

Golden Member
Apr 4, 2012
1,013
290
136
@StefanR5R Your efforts to copy and paste the client_state file will not succeed, as the WCG scheduler request includes a summary of what the computer in question is currently crunching. So on the next scheduler request, if the WCG servers don't see work being processed where it should be, they will either resend the workunit or mark it as abandoned.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
Well that's a new message. "This computer has finished a quota of x tasks." And it won't send me any more.

I received this message now too. There is a typo in this message.
"This computer has finished a daily quota of 24 tasks"​
CPU: 4C/8T, 198 tasks total, 2 valid, 15 pending validation, 181 in progress.
I.e. this message actually means
"This computer has not yet finished a daily quota of 24 tasks".​
After I reported 7 more tasks (exactly 24 finished), this download throttle went away.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,565
14,520
136
[City Run, WCG OpenZika]

@Markfw,
I will try to make a summary of the race particulars later today, unless somebody else does...

For now, it is important to know about the "City Run" sub-race:
  • Only OpenZika points count. For the remaining time leading up to the race, and during the race, please go to www.worldcommunitygrid.org -> "My Contribution" -> "My Project", and deselect all subprojects, then select OpenZika.
    The next time your machines ask the server for more work, they will get OpenZika.
  • If you already downloaded tasks of other WCG subprojects, abort them unless you can complete them shortly.
  • Only OpenZika Points granted before end of the race count.
    End of the "City Run": May 10, 00:00 UTC
  • Only OpenZika Points granted after beginning of the race count.
    Beginning of the race is as soon as the organizers have posted initial stats at https://www.seti-germany.de/boinc_pentathlon/statistiken/pentathlon.php in the City Run column. This will happen shortly after May 5, 00:00 UTC.
  • Please try to retain completed work until after seti-germany has posted the initial stats! E.g. "suspend network activity" in boincmgr after you downloaded enough OpenZika tasks.
Exception to the last rule:
  • If you happen to have a client which never crunched OpenZika before, the WCG servers will let you have at most (logical cores + 2) WUs, capped at 66, at any moment. Only after the client has completed and uploaded a few WUs and had them validated at WCG (i.e. WCG received matching results for these WUs from another donor) will WCG let you have more WUs to store and crunch on for a while.
  • Unfortunately, many people are currently retaining their work until the race begins, or even longer. This means that the work which the new client submits now is less likely to be validated before the race. Hence the new client must crunch through a whole bunch of WUs and upload them before the first validation succeeds.
  • As a consolation, all of the WUs that you submit now but get validated only after the race, will be counted during the race and hence contribute to the team rank.
Again, this exception applies only to clients which have never returned any OpenZika before.

Edit:
Clients which have submitted valid OpenZika work in the past will mostly receive WUs for which WCG lowered the "quorum" to 1, which means they are accepted as valid immediately, without a double-check by another submitter. WCG will only occasionally send OpenZika WUs that need validation or verification to veteran clients, for spot checks.
Done !! 144 cores changed
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
I received this message now too. There is a typo in this message.
"This computer has finished a daily quota of 24 tasks"​
CPU: 4C/8T, 198 tasks total, 2 valid, 15 pending validation, 181 in progress.
I.e. this message actually means
"This computer has not yet finished a daily quota of 24 tasks".​
After I reported 7 more tasks (exactly 24 finished), this download throttle went away.
Thanks! I thought I'd done something wrong when I released 3 WUs last night, and the quota went up from 13 to 17. o_O Now it makes sense.

I released 74 WUs today, to obtain 57 more, but some of those were probably quorum 2, so it's probably a wash.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
@StefanR5R Your efforts to copy and paste the client_state file will not succeed, as the WCG scheduler request includes a summary of what the computer in question is currently crunching. So on the next scheduler request, if the WCG servers don't see work being processed where it should be, they will either resend the workunit or mark it as abandoned.

Thanks. Meanwhile I also saw the risk of existing WUs getting detached discussed at the OcUK forum. From what I saw at my own WCG "results status" page though, I apparently got lucky and the original client still has all WUs in progress, none detached. (I tore down the new client and set up another one with a separate identity. OK, actually not 1 but 10 in total now besides the original ones, and that's just for 4 days of bunkering because I did not enter this business immediately after seti-ger's announcement.)

Fingers crossed that what I'm gonna upload will be accepted.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
On April 30, the first two Pentathlon projects were announced. The effect:
[six stats graphs]
Graphs taken from stats.free-dc.org.
 

Jondi

Junior Member
Apr 16, 2017
19
21
51
Thanks. Meanwhile I also saw the risk of existing WUs getting detached discussed at the OcUK forum. From what I saw at my own WCG "results status" page though, I apparently got lucky and the original client still has all WUs in progress, none detached. (I tore down the new client and set up another one with a separate identity. OK, actually not 1 but 10 in total now besides the original ones, and that's just for 4 days of bunkering because I did not enter this business immediately after seti-ger's announcement.)

Fingers crossed that what I'm gonna upload will be accepted.

Can I ask, Stefan: is this on Linux? If so, which distro, and which boinc client version?
One of our members has a theory that my troubles might be because I am using an old boinc client.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[Marathon, Cosmology@Home]

Some data about credits/hour (= PPD / 24).

Ivy Bridge-E, planck_param_sims v2.04 (vbox64_mt) windows_x86_64
constantly 50 credits/WU
runtime min/ max/ avg = 394/ 993/ 652 s (from 7 WUs)
min/ max/ avg 181/ 457/ 313 credits/hour (one 6-threaded task at a time)​

mobile Haswell, planck_param_sims v2.04 (vbox64_mt) windows_x86_64
constantly 50 credits/WU
runtime min/ max/ avg = 515/ 1,319/ 979 s (from 5 WUs)
min/ max/ avg 136/ 349/ 206 credits/hour (one 4-threaded task at a time)​

Broadwell-E, planck_param_sims v2.04 (vbox64_mt) windows_x86_64
constantly 50 credits/WU
runtime min/ max/ avg = 276/ 1,297/ 518 s (from 15 WUs)
min/ max/ avg 4 * 139/ 653/ 403 credits/hour (four 5-threaded tasks at a time)​

Broadwell-E, camb_boinc2docker v2.04 (vbox64_mt) windows_x86_64
min/ max/ avg = 83/ 92/ 89 credits/WU
runtime min/ max/ avg = 275/ 306/ 295 s (from 10 WUs)
constantly 4 * 1,082 credits/hour (four 5-threaded tasks at a time)​

Notes:
Some of the Broadwell-E runtimes are from tasks which I did not run in the 4x5 config, but I did not note which ones.
I don't have data from camb_legacy on any of these machines.
I also don't have data from camb_boinc2docker on the former two machines.​

Conclusion: camb_boinc2docker > planck_param_sims, credit-wise. YMMV.

(OTOH the project scientists prefer planck_param_sims to be worked on in the short term, to support a paper they are preparing right now.)

Did anybody upload results from one of the VM applications and from camb_legacy from one and the same machine, so that we can compare camb_legacy's yield directly?

Another conclusion: That Broadwell-E was obscenely expensive, but what's not to like about almost 8 times the throughput compared to the mobile Haswell?
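
For reference, the credits/hour lines above are nothing more than credits per WU scaled to an hour and multiplied by the number of concurrent tasks. A small sketch; the slight offset from the 1,082 figure is just rounding and how the averages are taken:

Code:
def credits_per_hour(credits_per_wu, runtime_s, parallel_tasks=1):
    # per-task rate scaled to an hour, times the number of concurrent tasks
    return credits_per_wu * 3600.0 / runtime_s * parallel_tasks

# Broadwell-E, camb_boinc2docker: avg 89 credits in avg 295 s, 4 tasks at a time
rate = credits_per_hour(89, 295, parallel_tasks=4)
print(round(rate), round(rate * 24))   # ~4,340 credits/hour, ~104,000 PPD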
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
Can I ask, Stefan: is this on Linux? If so, which distro, and which boinc client version?
One of our members has a theory that my troubles might be because I am using an old boinc client.

You can ask and I do answer, but can you trust this answer? ;)

So far I did this multiple-instance thing only on Linux machines.
1x OpenSuse Tumbleweed, boinc 7.6.33-2.1 from the default package repo
3x Gentoo, boinc 7.6.33-r3 from the default package repo

Edit: I followed OCN's guide. This basically means a fresh and virtually empty boinc data directory is created for each new instance. Each client got a different boinc computer ID assigned automagically.
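
Roughly, it boils down to something like this (paths and the RPC port are just examples, and whether you need --allow_multiple_clients may depend on how your distro starts the first client):

Code:
import os
import subprocess

def start_instance(data_dir, rpc_port):
    # each instance gets its own, initially empty data directory and its own
    # GUI RPC port; the project then assigns it a separate computer ID
    os.makedirs(data_dir, exist_ok=True)
    return subprocess.Popen([
        "boinc",
        "--allow_multiple_clients",
        "--dir", data_dir,
        "--gui_rpc_port", str(rpc_port),
        "--daemon",
    ])

start_instance("/home/boinc/instance2", 31418)

The new instance can then be attached the usual way, e.g. with boinccmd --host localhost:31418 and --project_attach, or from boincmgr pointed at that port.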
 

Jondi

Junior Member
Apr 16, 2017
19
21
51
Of course I trust your answer, we're all friends here :)
I'll have another go, but not until after the Pentathlon is over; I don't want any more accidents :(

Thanks.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[Marathon, Cosmology@Home]
camb_boinc2docker > planck_param_sims, credit-wise. YMMV.

(OTOH the project scientists prefer planck_param_sims to be worked on in the short term, to support a paper they are preparing right now.)

Right now, the server status page shows
500 available camb_legacy tasks,
3 available camb_boinc2docker tasks,
13733 available planck_param_sims tasks.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
[Marathon, Cosmology@Home]
Did anybody upload results from one of the VM applications and from camb_legacy from one and the same machine, so that we can compare camb_legacy's yield directly?

No, but searching other users might answer the question. For example: https://www.cosmologyathome.org/results.php?hostid=289601&offset=0&show_names=0&state=4&appid=2

Nice, thank you. This host is a dual Xeon X5570 with 2x 4C/8T (16T total). Let's assume it is doing 16 camb_legacy tasks in parallel. What we don't know is how many camb_boinc2docker tasks it is doing in parallel. Perhaps 2, one on each socket.

camb_legacy
min/ max/ avg = 179/ 556/ 412 credits/WU (from 23 WUs)
runtime min/ max/ avg = 28,002/ 88,001/ 63,620 s
min/ max/ avg (16x?) 21.6/ 24.0/ 23.2 credits/hour (16 single-threaded tasks at a time?)​

camb_boinc2docker
min/ max/ avg = 23/ 38/ 29 credits/WU (from 57 WUs)
runtime min/ max/ avg = 365/ 475/ 415 s
CPU time min/ max/ avg = 2,146/ 2,976/ 2,349 s
hence CPU time = 5.66 * run time -> 8-threaded tasks?
min/ max/ avg (2x?) 230/ 287/ 251 credits/hour (2 octa-threaded tasks at a time?)​

If my guesses about parallelism on this host are right, then we have
camb_legacy = 371 credits/hour = 8,911 PPD (total from all threads),
camb_boinc2docker = 502 credits/hour = 12,037 PPD (total from all threads).​

Conclusion: PPD for camb_boinc2docker > camb_legacy > planck_param_sims,
circa at a ratio of 2.75 : 2 : 1.
(The first version of this post had camb_boinc2docker > planck_param_sims > camb_legacy; see Edit 2 below.)

Edit: This interpolation may be exaggerated for one reason or another. See the estimation for the i5-6200U below.

Edit 2: Oops, serious copy&paste error in my spreadsheet. camb_boinc2docker PPD were completely wrong; corrected now.
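
The corrected spreadsheet boils down to this (with my guessed parallelism baked in, so treat the absolute PPD with care; the small offsets versus 8,911 / 12,037 come from how the averages are taken):

Code:
def host_ppd(credits_per_wu, runtime_s, parallel_tasks):
    # per-task credits/hour times concurrent tasks, times 24 hours
    return credits_per_wu * 3600.0 / runtime_s * parallel_tasks * 24

legacy = host_ppd(412, 63620, 16)   # ~9,000 PPD  (16 single-threaded tasks?)
docker = host_ppd(29, 415, 2)       # ~12,100 PPD (2 octa-threaded tasks?)
print(round(legacy), round(docker), round(docker / legacy, 2))   # ratio ~1.35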
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
I had four legacy WUs upload. No VMs though.
About 24 credits per hour :( per core.

If you crunch 4 legacy tasks in parallel, that would be ~100 credits/h.

Your i5-6200U (mobile Skylake, configurable to 7.5, 15, or 25 W TDP) should give perhaps on the order of 2/3 the throughput of my i7-4900MQ (mobile Haswell, 47 W TDP, hyperthreading currently switched off). Could this be right? That could be ~140 credits/h for planck_param_sims, assuming that camb_boinc2docker tasks are hard to get.

Which means the PPD of Planck vs. legacy on your host wouldn't be 1:2 like I interpolated from my desktop and that Xeon (4:1 before the correction), but more like 3:2. :confused:

Edit:
Previous calculation for the Xeon was seriously wrong; corrected in the post further above.
 

StefanR5R

Elite Member
Dec 10, 2016
5,517
7,826
136
OK, my calculations and estimations above are a mess. I need to compute a few of those camb_legacy tasks myself on Friday or so.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
www.google.com
Somewhat unrelated, but related..... You may now witness the firepower of this fully armed and operational battle station, err, ThunderStrike.

28 cores/56 threads, and quad GTX 1080s ready for Pentathlon battle! Pics.

Fun fact: This quad 1080 machine pulls a matching 1080 watts at the wall. :)
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,565
14,520
136
Where is the 4th video card? And do I see PCIe risers on the middle card for cooling?