8th Annual BOINC Pentathlon


StefanR5R

Elite Member
Dec 10, 2016
I might be mistaken about what was really going on at those spikes there...
[attached graph]

...but I see 8.5 more of those in our future.
 

TennesseeTony

Elite Member
Aug 2, 2003
I haven't had to reboot any of my Win10 machines for the hosts file changes to take effect. I've only had to wait a few minutes; it seems some system timer is involved.

My limited Einstein uploads were intentional, I knew I'd complete the bunker WAY before the contest started, and just wanted to make sure I maxed out my downloads before I suspended the project.

Now I'm ready to start running and bunkering those tasks, what is the web address we need to block in the host file for Einstein? I've tried searching, but, it's Cinco de Mayo and....uhm.... I've had beverages already.
 

iwajabitw

Senior member
Aug 19, 2014
Jim and some Yuengling here, Tony. Finally got my bunkers full, about 1200 tasks. Had to flush some I bunkered yesterday to get it to work.
 

GLeeM

Elite Member
Apr 2, 2004
No leakers: I sent WUs before knowing Einstein was the next race. It was the easiest project for getting enough WUs to bunker alongside WCG and Cosmology, and I was crunching it anyway.

The hosts file Einstein data is: " 127.0.0.1 einstein3.aei.uni-hannover.de "
I had tried a bunch before finding this one :)
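For anyone following along at home, here is a minimal sketch of adding that entry. It writes to a scratch copy rather than the real hosts file (which on Windows lives at C:\Windows\System32\drivers\etc\hosts and needs Administrator rights to edit); the scratch path is just for illustration.

```shell
# Append the blocking entry to a scratch copy of the hosts file, then count
# the matching lines to confirm it landed. On a real machine, edit
# C:\Windows\System32\drivers\etc\hosts instead (as Administrator).
hosts=./hosts.scratch
printf '127.0.0.1 einstein3.aei.uni-hannover.de\n' >> "$hosts"
grep -c 'einstein3\.aei\.uni-hannover\.de' "$hosts"
```

Once the entry is in place, uploads to that host fail and finished tasks simply queue up in the client until you remove the line again.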
 

StefanR5R

Elite Member
Dec 10, 2016
My only PC which is suitable for Einstein has got 3 GPUs, and in order to get enough tasks I need to do what @4thKor is doing.

But here are candidates for the hosts file:
Code:
$ cd /cygdrive/c/ProgramData/BOINC/
$ grep url client_state.xml | cut -d/ -f-3 | sort -u
    <download_url>http://einstein.phys.uwm.edu
    <download_url>http://einstein2.aei.uni-hannover.de
    <download_url>http://einstein3.aei.uni-hannover.de
    <download_url>http://einstein-dl.syr.edu
    <download_url>http://einstein-dl3.phys.uwm.edu
    <master_url>http://einstein.phys.uwm.edu
    <project_master_url>http://einstein.phys.uwm.edu
    <scheduler_url>https://scheduler.einsteinathome.org
    <upload_url>http://einstein3.aei.uni-hannover.de
This concurs with what @GLeeM said.
($ is the cygwin shell prompt.)
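The same file can be boiled down further to bare host names, ready to paste after 127.0.0.1 in the hosts file. A sketch, using a stand-in XML snippet instead of the real client_state.xml, and a regex that assumes URLs shaped like the ones above:

```shell
# Stand-in for C:\ProgramData\BOINC\client_state.xml; the real file is larger.
cat > client_state.sample.xml <<'EOF'
    <download_url>http://einstein.phys.uwm.edu/download/foo</download_url>
    <download_url>http://einstein3.aei.uni-hannover.de/download/bar</download_url>
    <upload_url>http://einstein3.aei.uni-hannover.de/upload/baz</upload_url>
EOF
# Keep only the host part of each download/upload URL, de-duplicated.
grep -oE '<(download|upload)_url>https?://[^/<]+' client_state.sample.xml \
    | sed -E 's#.*://##' | sort -u
```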
 

StefanR5R

Elite Member
Dec 10, 2016
The WCG/OpenZika City Run is ending on May 10, 00:00 UTC, whereas the Einstein Cross Country is beginning on May 9, 00:00 UTC. --- I figure that many participants in this race are now operating both WCG and Einstein in a single client, have loaded up on both and disabled networking until May 9, if they were successful in getting enough tasks for both.

If so, then a significant portion of the WCG production is currently not showing up in the stats, and team ranks in the City Run will change dramatically again on May 9.

(My own WCG and little Einstein productions are happening on different machines. Einstein shares a machine with Cosmology, but I run them in separate clients.)
 

TennesseeTony

Elite Member
Aug 2, 2003
Whew. I'm starting to feel like an old dog, with all these new tricks to learn. :)

...This one is always fun, and frustrating. ;) ...
...For the newer members, this is one challenging competition. ..
...It drives me crazy, but is so much fun! :D....

Those snippets are from the first post. And that was said BEFORE I found out about all the tricks of large scale bunkering, lol!

Thanks GLeeM and Stefan, I am officially starting my Einstein bunker now. Fortunately the weather has turned unseasonably cool for my area.

As for Cosmo vs WCG, I'm letting task manager battle it out as to which gets to run and when, for now.
 

StefanR5R

Elite Member
Dec 10, 2016
Some experiences from building Einstein bunkers for my 3-GPU PC:
  • To get a large enough queue, I want boincmgr's estimate of task duration to converge quickly to the proper value. The real value here is 11...16 minutes depending on the card; the initial estimate was 45 minutes, or even 3 hours. With those estimates, boincmgr assumes that the planned queue depth, a.k.a. "Store at least [...] days of work", is reached earlier than it really is.
  • If multiple boinc client instances run on the same PC, remember to suspend CPU activity of the older instance(s) before you start up a new instance for the first time. The new instance is going to run the CPU benchmark, and for some projects it is desirable to get this benchmark correct.
    I guess Einstein GPU WUs are not affected by CPU benchmark results, but I may be wrong. So, better safe than sorry.
  • It seems to me that when I start a new client and have it connected to Einstein, it may be better to let it run with "Store at least [ 0 ] days of work" for a while, in order to get the estimated task duration closer to the real task duration sooner. (See first bullet point.)
  • I am still used to controlling the boinc client with boincmgr, but I now use BoincTasks in addition. While loading up a bunker, I have BoincTasks sitting there with the "Tasks" tab open; that way I see immediately how many tasks have already been downloaded. In addition, I have boincmgr's log window open.
  • On my NVIDIA GPUs, running 2 tasks per GPU nearly saturates them, and 3 tasks per GPU saturate them entirely. Hence I think it is optimal to run 2 tasks per GPU. Compared to 3 tasks per GPU, I lose only very little throughput but leave more CPU resources for Cosmology (or WCG).
    However, during bunkering I run
    • 1 task/GPU initially to get a lower estimation of task duration (see first bullet point),
    • then 3 tasks/GPU to let boinc assume that 9 tasks will run in parallel on my 3-GPU PC for the rest of the time until "Store at least [...] days of work"+ "Store up to an additional [...] days of work".
      That way, I have better chances to run into a hard limit of circa 1000 runnable tasks per client, instead of a lower limit imposed by the boinc client's estimation of task duration.
And a word of warning: Today I had another crash of a suspended Einstein GPU task when I resumed it. It locked up the PC, necessitating a reboot. Therefore I need to remember never to suspend an already running Einstein task. I now employ the following routine for best card utilization when having 2 tasks/card:
  1. Set "No new tasks" on the Einstein project, or suspend network access.
  2. Wait for a point in time when running tasks are at least a minute away from completion, so that there is no danger that one GPU becomes ready to start a new task when the next steps are performed.
  3. Select all tasks that are "Ready to start". Suspend them.
  4. Wait for all running tasks to complete. (Or for all but one.)
  5. Resume one of the suspended tasks.
  6. Watch GPU utilization in GPU-Z climb to > 80 %. My CPU needs less than 15 seconds or so to set up the task before the GPU reaches maximum utilization.
  7. Wait another 30 seconds or so.
  8. Then start the next suspended task.
  9. Repeat 6.-8. until all GPUs run the desired number of tasks.
  10. Resume all remaining tasks.
  11. Undo step 1. when desired.
Now all GPU tasks run staggered in time, and their periods where they need CPU instead of GPU don't overlap, or at least not much.
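Steps 5.-9. could also be scripted with boinccmd instead of clicking in the manager. A dry-run sketch: the commands are only echoed, and the project URL and task names are placeholders to be filled in from the output of "boinccmd --get_tasks".

```shell
# Dry-run sketch of steps 5.-9.: resume suspended tasks one at a time, with a
# pause between resumes so the CPU-heavy startup phases don't overlap.
# Remove the "echo"s to actually issue the commands on a machine with BOINC.
PROJECT_URL='http://einstein.phys.uwm.edu/'   # placeholder project URL
DELAY=30                                      # pause between resumes (step 7.)
for task in TASK_1 TASK_2 TASK_3; do          # placeholder task names
    echo boinccmd --task "$PROJECT_URL" "$task" resume
    echo sleep "$DELAY"
done
```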
 

GLeeM

Elite Member
Apr 2, 2004
You can prevent the CPU benchmark from running; just make sure the present numbers are as you want them. I had noticed large differences in benchmark numbers depending on what was happening on the computer when the benchmark ran. Maybe because it runs at start-up, when so much else is also starting?

So if you add the first line to cc_config.xml the benchmark will not run unless you run it manually.
Looking at the second line I am wondering if it prevents fetch at other times?
The last line is handy during races so finished results send as soon as done (Edit: would need a "1" instead of "0").

Code:
<cc_config>
<options>
<skip_cpu_benchmarks>1</skip_cpu_benchmarks>
<fetch_on_update>1</fetch_on_update>
<use_all_gpus>1</use_all_gpus>
<allow_multiple_clients>1</allow_multiple_clients>
<report_results_immediately>0</report_results_immediately>
</options>
</cc_config>
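A sketch of putting that file into place: write it to a scratch location, sanity-check the flag, then copy it into the BOINC data directory (C:\ProgramData\BOINC on Windows) and restart the client, or have it re-read the file with "boinccmd --read_cc_config". The scratch file name is just for illustration, and only two of the options are shown.

```shell
# Write a minimal cc_config.xml to a scratch file and confirm the benchmark
# skip flag is set before copying the file into the BOINC data directory.
cat > cc_config.scratch.xml <<'EOF'
<cc_config>
  <options>
    <skip_cpu_benchmarks>1</skip_cpu_benchmarks>
    <report_results_immediately>1</report_results_immediately>
  </options>
</cc_config>
EOF
grep -c '<skip_cpu_benchmarks>1' cc_config.scratch.xml
```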
 

StefanR5R

Elite Member
Dec 10, 2016
For the newer members, this is one challenging competition.
I am glad that we had Einstein and WCG sprints in Formula BOINC recently. That way I was at least aware of quirks such as WCG's throttled task distribution to new clients, or Einstein's web config interface made for little Einsteins.
 

StefanR5R

Elite Member
Dec 10, 2016
So if you add the first line to cc_config.xml the benchmark will not run unless you run it manually.
https://boinc.berkeley.edu/wiki/Client_configuration says that it also disables manually triggered benchmarks.

Good tip; I definitely don't want periodic benchmarks during periods when two clients are doing stuff. IOW <allow_multiple_clients>1</allow_multiple_clients> should always be amended with <skip_cpu_benchmarks>1</skip_cpu_benchmarks> (once benchmark numbers are correct in a given client instance).
 

TennesseeTony

Elite Member
Aug 2, 2003
My "reserve fleet" is still bunkering one of the projects. When it completes its tasks, I won't be able to employ it any further until the GPU race is over. The bunkering fleet is pulling 2350 watts on its own, and now I've got 1950 watts of GPU going as well. Something has to give. :rolleyes:

Of course my 'normal' fleet will continue, uninterrupted. About 1200-1400 CPU watts there.

EDIT: Ah man. Someone broke Cosmology. Again. And it's the weekend. I wonder how long this will take...
 

StefanR5R

Elite Member
Dec 10, 2016
Ah man. Someone broke Cosmology. Again. And it's the weekend. I wonder how long this will take...
I suspect that it isn't actually broken, but merely overloaded. I read on seti-germany that Cosmo's validator is still behind with validations. The work generators are presumably running on the very same host.
 

StefanR5R

Elite Member
Dec 10, 2016
Or maybe the work generators aren't under-performing, but are automatically/intentionally throttled because validation/integration/cleanup isn't keeping up?
 

TennesseeTony

Elite Member
Aug 2, 2003
Cosmo was completely offline for a bit; I couldn't even access the homepage. Thankfully it only lasted about 5 minutes.

ThunderStrike is performing well considering the close proximity of the GPUs. Two concurrent Einstein tasks nearly max out the 1080s (utilization), but use only 70-80% of the power. F@H and PG will be the real thermal tests, I suppose.

[attached screenshot]
 

StefanR5R

Elite Member
Dec 10, 2016
Cosmo was completely off line for a bit, couldn't even access the homepage. Only lasted about 5 minutes thankfully.
WCG's web site was down earlier today too (500 internal server error), but surely not because of the Pentathlon.
ThunderStrike is performing well, regarding the close proximity of the GPUs.
I had been using two blower cards in a PC for a short while recently (for the first and the last time). From this brief experience I am certain that your staggered layout with riser cards makes a big difference.
 

TennesseeTony

Elite Member
Aug 2, 2003
Our team stats page indicates we have at least 2 teams who are quickly going to overtake us in Cosmo. Three of the five machines I have running it can't get any tasks, so task availability certainly is a factor in the Marathon, seemingly year after year. :rolleyes:
 

StefanR5R

Elite Member
Dec 10, 2016
My laptop still has 260 tasks from its initial download left to process. It must certainly be possible with some client_state.xml hackery to carry tasks from one client to another...
 

Ken g6

Programming Moderator, Elite Member
Dec 11, 1999
And a word of warning: Today I had another crash of a suspended Einstein GPU task when I resumed it.
Well, I guess it'll be interesting when I resume all the Einstein work I've suspended. At least I'm doing them two at a time, so if one freezes the other shouldn't. And I wouldn't expect a Linux task to freeze the entire system.

It must certainly be possible with some client_state.xml hackery to carry tasks from one client to another...
If you can figure that out, it would be a major breakthrough!
 

StefanR5R

Elite Member
Dec 10, 2016
I'm afraid that, since client state is also stored on the project server, it may not be possible to upload the completed WU directly from a secondary client. It may be necessary to transfer the completed WU back to the originating client for upload.
 

TennesseeTony

Elite Member
Aug 2, 2003
I personally haven't experienced any issues suspending Einstein (no linux here). In fact I always ensure my tasks (on the same GPU) have a staggered start time by using the suspend/resume feature, because they need a full thread of CPU at the beginning, and I like to minimize the times I'm at 100% CPU so as to not bog down other tasks.

Transferring tasks from one machine to the next indeed would be awesome and game changing! :D
 

TennesseeTony

Elite Member
Aug 2, 2003
One thought that needs to be shared before I forget....

Any bunkers for any of the projects should be turned in well before the end of the project, say, 4 hours early, maybe even 6 hours.

As fun as it would be to shock the competition with last hour uploads, giving them no time to counter-attack, many a bunker in times past has not been received in time. High server load/crashes, personal power failures, internet outages, all sorts of stuff that can ruin your day if you wait too long!
 

TennesseeTony

Elite Member
Aug 2, 2003
SETI.Germany also has daily commentary, if you're interested. They do a great job at hosting the race, with the stats and all, but I find the commentary hard to follow. :) Similar to Formula BOINC's leagues, the commentary is somewhat broken down into leagues.

Day one
Day two

Snippets:
TAAT news (as they call us) from day one:

  • A good start for TeAm AnandTech (#9). Last year the Marathon was their best discipline and there are signs this year that that might be repeated.
  • Incredible results (at least at the moment) returned by Overclock.net #4 and Overclockers UK (#7). ///Great job guys!/// Both will not find it easy to retain their current positions but judging by their current performances, both are hot contenders for a place in the Top10. Of course, RKN (#9) and Boinc.Italy (#11) want to be included. However, TAAT (also at #9) are currently not willing to give way. Gridcoin and [H]ard|OCP (#12) are not content to vacate their rank without an argument. What’s ailing Meisterkuehler.de (#16) will hopefully soon be revealed.
TAAT news from day two:

  • At the City Run, ..... However, [H]ard|OCP (#11) and TAAT (#12) should throw a few more energy bars into their furnace or they won't be getting RKN to break into a sweat otherwise.
  • Here they are at last. Meisterkuehler have meanwhile crunched to #10 followed by TAAT and RKN.
  • RKN complete the Top10 and TAAT (#11) are able to change that at any time,
Hope to see the commentator utterly shocked one day soon, to see TAAT giving the big boys a beating. :)