Info 11th BOINC Pentathlon 2020


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,401
14,364
136
What is a 7452? Vendor model number, or chip model?
It's an AMD EPYC server chip.

While I am here: it loaded Win 10, and I got 11,848 on Cinebench R20, but I could not load the NVIDIA drivers.

Installing Mint 19.2 right now.
 

Markfw

OK, BOINC is running Rosetta, and it's folding... Done for the day, but tomorrow I can add it to my monitoring software and check temps, etc.

95% of the CPUs for BOINC, 128 gig ram, 128 gig hard disk, 60-62 threads
3.4 million for F@H, 14 mil ppd total.
 

StefanR5R

Elite Member
Dec 10, 2016
5,426
7,634
136
Bunkering tip:
work at this contest can begin after Saturday May 2, 00:00 UTC (after Friday May 1, 20:00 EDT / 17:00 PDT) – i.e. download tasks, start computation but suspend network transfers.
Suspending network transfers in the client is the most common method to "bunker" results for deferred reporting. An alternative method is to block the computer from accessing the project server, either by means of a firewall, or by means of the resolver — the latter is also known as the /etc/hosts-method.

Most BOINC projects have one and the same server name for all of the various project server functions (master, downloads, uploads, scheduler, web interface, message boards). But Rosetta@home has got...
Scheduler server: srv4.bakerlab.org
(upload server and web server: boinc.bakerlab.org)
...a separate host name for the scheduler. For bunkering, it is sufficient to block only the scheduler (srv4.bakerlab.org). That way, uploading of result files keeps working, and you can still access the web site and forums (boinc.bakerlab.org).

After the start of the contest, precisely...
After the stats table at the contest site was initialized, which should be shortly after May 5, 00:00 UTC, enable networking again to let the boinc client upload and report the results, as well as fetch more work.
...after SETI.Germany's tables were initialized, remove the block to the Rosetta scheduler again, then trigger a project update in the boinc client to report the results.
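
For those who script their clients: the suspend, resume, and update steps can also be driven through boinccmd. Below is a minimal Python sketch, assuming boinccmd is on the PATH and is allowed to talk to the local client. The helper names are made up for illustration, and the default project URL is an assumption (the http URL under which most of us are still attached); pass your own if it differs.

```python
import subprocess

# Assumption: the master URL under which Rosetta@home is attached in the client.
ROSETTA_URL = "http://boinc.bakerlab.org/rosetta/"


def network_mode_cmd(mode):
    """Build the boinccmd call that sets the network mode."""
    if mode not in ("always", "auto", "never"):
        raise ValueError("mode must be 'always', 'auto' or 'never'")
    return ["boinccmd", "--set_network_mode", mode]


def project_update_cmd(project_url=ROSETTA_URL):
    """Build the boinccmd call that triggers a project update (scheduler request)."""
    return ["boinccmd", "--project", project_url, "update"]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(network_mode_cmd("never"))  # start bunkering: suspend network transfers
    run(network_mode_cmd("auto"))   # later: follow the normal network preferences again
    run(project_update_cmd())       # then report the bunkered results
```

With dry_run left at True the script only prints the commands, which is a harmless way to check them before the contest.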

In the /etc/hosts-method, the block is implemented as an entry like
127.0.0.1 srv4.bakerlab.org
in the file /etc/hosts on Linux, or C:\Windows\System32\drivers\etc\hosts on Windows. Caveat: Some systems use a caching resolver. Typically, such a resolver monitors the hosts file for changes and picks up your edit immediately. But some don't, and need to be triggered to reload (or the computer needs a reboot). If unsure, after you have edited the hosts file, use boincmgr's advanced view, click [Update] on Rosetta@home, and check the event log. The method succeeded if the message "Scheduler request failed: Couldn't connect to server" appears. (To be safe, you can perform this test before you start bunkering, or while you have network transfers suspended in the boinc client.)

If you prefer to block uploads too, add the line
127.0.0.1 boinc.bakerlab.org
in the hosts file. Of course then you will lose access to the Rosetta web site and message boards as a side effect.

To remove the block again later, just remove the above line(s) from the hosts file, or put a # comment sign in front of each of them. Again, most resolvers will pick up this reversal immediately, but some may require a reload trigger (or simpler: a host reboot).
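
If you would rather not hand-edit the file on many hosts, adding and commenting out the entry can be sketched in Python as below. This is only an illustration: the function names are hypothetical, the functions work on the text of the hosts file rather than on the file itself, and writing the result back needs root or administrator rights.

```python
def _active_names(line):
    """Host names of an uncommented hosts-file line (empty list for comments)."""
    fields = line.split("#", 1)[0].split()
    return fields[1:] if len(fields) >= 2 else []


def block_host(hosts_text, hostname, address="127.0.0.1"):
    """Return hosts_text with an 'address hostname' entry appended.

    If the host already has an active entry, the text is returned untouched.
    """
    lines = hosts_text.splitlines()
    if any(hostname in _active_names(line) for line in lines):
        return hosts_text
    lines.append(f"{address} {hostname}")
    return "\n".join(lines) + "\n"


def unblock_host(hosts_text, hostname):
    """Return hosts_text with every active entry for hostname commented out.

    Note: a line that lists several names is commented out as a whole.
    """
    out = []
    for line in hosts_text.splitlines():
        if hostname in _active_names(line):
            line = "# " + line
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n"
    blocked = block_host(sample, "srv4.bakerlab.org")
    print(blocked, end="")
    print(unblock_host(blocked, "srv4.bakerlab.org"), end="")
```

The caching-resolver caveat still applies: after writing the file, check the event log as described above.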

Note,
if you want to retain the ability to modify target CPU time on the go (see post #21) even while you are bunkering, then either use the "suspend networking method" (this blocks uploads but still enables user-requested scheduler requests), or use the /etc/hosts-method with a block of boinc.bakerlab.org and temporarily switch to the "suspend networking method" when you need to access the Rosetta@home web site.

PS,
a potential benefit of the firewall method and /etc/hosts-method over the "suspend networking method" is that the boinc client remains able to access other projects while you are bunkering for one or another project.
 

Assimilator1

Elite Member
Nov 4, 1999
24,118
507
126
I'm already running Rosetta, am I right in thinking I don't need to do anything more for my points to go towards the team in this?

Since current Rosetta-work has a 3-day deadline it's a waste of time to build a bunker at this time.
Hey! Good to see you mate! :)
How's things?
 

Markfw

I'll say one thing. This is the first time that I have had 3 different boxes have a problem with Rosetta. They just lock up. One was a Xeon, and the other 2 are 3900x's. They ran 24/7@100% on WCG, but on Rosetta won't even go 5 minutes. NOT overclocked, CPU or memory. So I lost a few boxes on Rosetta, but still have over 550 threads running it.

The last 2 power supplies I had were Corsair. Guess what? The ones in the 3 boxes I am having problems with? AX850, AX860 and RM1000, all Corsair. The 850 is the oldest, many years old; the 860 is suspect, as all the ones I had die recently were Corsair "i" series; and the RM1000? Not sure. But it's an idea.
 

Markfw

That is especially odd on Linux. I have one Xeon that likes to test my resolve on most any project if I combine it with a GPU project. Grrr.
All of these boxes have a GPU folding. 2 are 2060's and one was a 2080TI.
 

StefanR5R

Javelin Throw — NumberFields@home
So the first of the 5 Javelin throws has been announced. Number Fields. Do all Javelin throws get a 3 day notice? "Each day of Javelin Throw is announced three days in advance."
Yes. In theory, one could expend 3+1 days of computer time on each throw. (I haven't looked up the deadlines and other tech info yet; will do so later.) But if the organizers are mean, they could schedule two throws in close succession, leaving less bunker time for one of those days.

Edit:
The beginning of each contest is 00:00 UTC. But the announcements happen randomly at 00:00, 06:00, 12:00, 18:00 UTC. Hence, is the announcement period just 2d6h...3d, or is it 3d...3d18h? I shall find out.

Edit 2:
  • Announcement periods can be 4d6h...5d or 2d6h...3d depending on discipline.
  • I am not sure from pschoefer's response in the shoutbox whether Javelin Throw days are announced 2d6h...3d in advance, or strictly 3d in advance.
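
The arithmetic behind those ranges, as a small Python sketch. The assumption, which is exactly the open question above, is that an announcement can land in any of the four 6-hour slots of the calendar day that lies the nominal number of days before the contest day; the function names are mine.

```python
from datetime import timedelta

ANNOUNCEMENT_SLOTS_H = (0, 6, 12, 18)  # UTC hours at which announcements appear


def possible_lead_times(nominal_days):
    """Lead times up to the contest start at 00:00 UTC, shortest first.

    Assumes the announcement falls on the calendar day that lies
    nominal_days before the contest day.
    """
    return sorted(
        timedelta(days=nominal_days) - timedelta(hours=h)
        for h in ANNOUNCEMENT_SLOTS_H
    )


def fmt(td):
    """Format a timedelta as e.g. '2d6h'."""
    hours = td.days * 24 + td.seconds // 3600
    return f"{hours // 24}d{hours % 24}h"


for days in (3, 5):
    times = possible_lead_times(days)
    print(f"nominal {days}d: {fmt(times[0])} ... {fmt(times[-1])}")
```

Under that assumption a nominal 3 days gives 2d6h to 3d, and a nominal 5 days gives 4d6h to 5d. If the announcements are instead strictly 3d in advance, only the longest value applies.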

Marathon — Rosetta@home
This is the first time that I have had 3 different boxes have a problem with Rosetta. They just lock up. One was a Xeon, and the other 2 are 3900x's. They ran 24/7@100% on WCG, but on Rosetta won't even go 5 minutes.
Perhaps RAM got tight, and the computers began to swap. That could make them unresponsive for minutes or hours, which is very much like a real lock-up.

Otherwise, lock-ups on Linux only happen with unstable hardware or with unstable kernel-level drivers. The latter mainly concerns 3rd party drivers, e.g. VirtualBox. I don't know how good NVIDIA's drivers are; I have been using one and the same 3xx driver since our Brony@home Folding competition and it's been stable all the time.

Re Swapping: Maybe set "When computer is in use..." and "When computer is not in use, use at most ___ %" memory to no more than 85 %. That way, if tasks suddenly begin to use more memory than boinc-client anticipated, there is a better chance that the remaining RAM is sufficient to keep the computer responsive — including keeping boinc-client responsive and able to suspend excessive tasks.
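
On hosts without a GUI, the two memory percentages can also be set through a global_prefs_override.xml file in the BOINC data directory. Here is a minimal Python sketch that only generates the file content; the function name is hypothetical, the 85 % defaults are just the suggestion from above, and it deliberately leaves the swap setting alone.

```python
import xml.etree.ElementTree as ET


def memory_override_xml(busy_pct=85.0, idle_pct=85.0):
    """Return the text of a global_prefs_override.xml that caps BOINC's RAM use."""
    for pct in (busy_pct, idle_pct):
        if not 0 < pct <= 100:
            raise ValueError("percentages must be in (0, 100]")
    root = ET.Element("global_preferences")
    # "When computer is in use, use at most ___ %"
    ET.SubElement(root, "ram_max_used_busy_pct").text = f"{busy_pct:.1f}"
    # "When computer is not in use, use at most ___ %"
    ET.SubElement(root, "ram_max_used_idle_pct").text = f"{idle_pct:.1f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(memory_override_xml())
```

Careful: saving this as-is would replace an override file that already exists, so merge by hand if you have one. After writing the file, boinccmd --read_global_prefs_override (or a client restart) makes the client pick it up.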

Further, I am unsure about "Page/swap file: use at most __ %". I suspect this setting should be small. Does somebody know for sure?

While I am not sure if Rosetta's memory consumption is really the reason for the lockups which you got, there is this: 1. Memory footprint of Rosetta increased only recently. 2. If the same computers ran WCG and Rosetta together, then WCG's lower memory consumption would leave plenty of memory for Rosetta, making this issue moot.
 

StefanR5R

Marathon — Rosetta@home

@Markfw, one other thing: At a moment when several Rosetta tasks are started at once, boinc-client becomes unresponsive on its remote control port for a while. The reason is probably the heavy disk I/O when the task data are unzipped and written to the slot directories.

Boinc-client's not serving the remote control port during this time affects all control tools (boincmgr, boinctasks, boinccmd, boinctui). However, as soon as all the Rosetta tasks finished their startup preparations and are running normally, boinc-client and the remote control tools should become responsive again.

I found that boinc-client still listens to so-called signals during such startup periods though. That is, it can be shut down by the kill command. Such a shutdown will generally happen cleanly but take a short while on its own.
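
For illustration, this is what the kill command does by default, written as a Python sketch for Linux. The function names are mine, and the demo terminates a harmless stand-in process instead of a real client; for the real thing you would pass the PID of the running boinc process (e.g. as reported by pidof, assuming the binary is called boinc on your distribution).

```python
import os
import signal
import subprocess
import sys


def request_clean_shutdown(pid):
    """Send SIGTERM, the default signal of the kill command, to a process."""
    os.kill(pid, signal.SIGTERM)


def demo_on_stand_in_process():
    """Start a sleeping child process, terminate it, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    request_clean_shutdown(child.pid)
    return child.wait(timeout=10)


if __name__ == "__main__":
    status = demo_on_stand_in_process()
    print("child exit status:", status)
```

A negative exit status from Popen means the child was ended by that signal. The stand-in dies at once because it does not handle SIGTERM; a real client catches the signal and needs a moment to shut down, as said above.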

Of course, if the computer runs into a situation when RAM is exhausted (post #38), then the entire computer will be unresponsive, and the shutdown-by-kill-command won't be an option then. You could only reboot, but then need to take care somehow that the computer doesn't run into the same situation again right after reboot.

- - - - - - - - - - - - - - - - - - - - -

About the change of project URL from http to https:

Anybody who still has the old URL (most of us) can keep using it indefinitely.

To switch from the old to the new URL,
  • complete or abort any present Rosetta tasks,
  • upload and report completed Rosetta tasks (only after the contest started, of course),
    report aborted Rosetta tasks,
  • when no Rosetta tasks are on the computer anymore, Remove the project,
  • add a new project with URL = https://boinc.bakerlab.org/rosetta/ (needs to be manually edited in the URL box of boincmgr's Add-project dialogue). Enter your existing E-mail and Rosetta password as credentials.
    (Alternatively, add the project with boinccmd --project_attach, or by creating an account_boinc.bakerlab.org_rosetta.xml file from a backup of an old account file, edited to the new URL.)
Obviously, this change is better deferred until a convenient time, e.g. until after the Pentathlon.
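
The boinccmd route mentioned in the list could look roughly like the Python sketch below. It only builds the command lines; the helper names are hypothetical, and the old http URL in the example is an assumption about how your client is currently attached.

```python
NEW_ROSETTA_URL = "https://boinc.bakerlab.org/rosetta/"


def detach_cmd(old_url):
    """Remove the project attached under old_url (only once no tasks are left)."""
    return ["boinccmd", "--project", old_url, "detach"]


def lookup_account_cmd(email, password, url=NEW_ROSETTA_URL):
    """Ask the project for the account key that belongs to email/password."""
    return ["boinccmd", "--lookup_account", url, email, password]


def attach_cmd(account_key, url=NEW_ROSETTA_URL):
    """Attach the client to the project under the new URL."""
    return ["boinccmd", "--project_attach", url, account_key]


if __name__ == "__main__":
    # Placeholders only; replace them with your own values before running anything.
    steps = [
        detach_cmd("http://boinc.bakerlab.org/rosetta/"),
        lookup_account_cmd("you@example.org", "your-password"),
        attach_cmd("ACCOUNT_KEY_FROM_LOOKUP"),
    ]
    for step in steps:
        print(" ".join(step))
```

The account key printed by the lookup step is what the attach step expects. Keep in mind that the password then shows up in the shell history and process list.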
 

Assimilator1

I was just about to ask about the Rosetta URL; I was wondering whether tasks would have to be finished. Now I know :).

Rosetta states 'change when convenient', there isn't a convenient time! :rolleyes:
 

StefanR5R

It may be possible to change the URL without removing the project and even with tasks in progress. But that would involve some error-prone editing of config files and state files, and might go wrong.

Furthermore, it is possible to remove a project even while tasks are in progress. When you re-add the project with the new URL, the project server may notice that the tasks were "abandoned" and send replicas to other hosts. If the server doesn't notice, then these tasks will time out after the 3-day deadline, after which the server will send replicas to other hosts unless an admin cancels them in the meantime for unrelated reasons.

But in short, better do this after all Rosetta tasks were finished and reported, e.g. after Pentathlon.
 

Markfw

@StefanR5R, the 3900 has only 16 gig ram, that certainly could be it. And in every case the IO light was solid. Maybe double the ram to 32 gig and try again? The Xeon box, I swapped out to my other Xeon box, and doubled the ram to 32 gig. It's fine now.
 

Howdy

Senior member
Nov 12, 2017
572
480
136
@StefanR5R, the 3900 has only 16 gig ram, that certainly could be it. And in every case the IO light was solid. Maybe double the ram to 32 gig and try again? The Xeon box, I swapped out to my other Xeon box, and doubled the ram to 32 gig. It's fine now.
IIRC, I believe @Skillz posted a recommendation of 1g of ram (maybe 2g?) per CPU thread for Rosetta. With that said, my TR with 32g of ram has occasionally had a "waiting for memory" moment running all threads on the project.
 

StefanR5R

The average and maximum RAM requirements of the Rosetta application have been changing with application versions, and more so with batches, a lot lately. My last info from end of April is circa 1.5 GB max; maybe 800 MB typical. I'll check on it once more later today.

In addition, the MiniRosetta application with its low RAM requirement was retired recently. If you had the usual random mix of Rosetta and MiniRosetta tasks on your computer, the RAM pressure was less.

Normally, boinc-client does a good job of keeping all running tasks within the overall RAM limit which you configured for BOINC. But it can only react. It monitors the actual RAM usage, and starts or suspends tasks accordingly. But from what I understand, if tasks increase their RAM usage too quickly, or if something else on your computer suddenly begins to occupy lots of RAM, then the computer may run into a need to swap out from RAM to disk before boinc-client is able to suspend tasks. At that point, the computer is already so slow that boinc-client doesn't really have a chance anymore to get back in control.

Hence my recommendation to leave a good margin for the operating system etc. when you configure BOINC's RAM limit.
 

Markfw

OK, the Xeon that had trouble? I doubled the ram from 16 to 32 gig. It's now using 20 gig of ram, so yeah, it was page swapping itself to death: 20 gig for 14 cores/28 threads. It's only using 25 threads since it has a 2080 Ti, but 25 threads = 20 gig ram. So the 3900x needs 32 gig minimum.
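
The back-of-the-envelope math, as a Python sketch. The 20 gig and 25 threads are the numbers from this box; the 2 gig operating system reserve, the list of common RAM sizes, and the function names are assumptions for illustration only.

```python
def ram_per_thread_gb(used_gb, threads):
    """Observed RAM use divided by the number of running task threads."""
    return used_gb / threads


def estimated_need_gb(threads, per_thread_gb, os_reserve_gb=2.0):
    """RAM for all task threads plus an assumed reserve for the OS."""
    return threads * per_thread_gb + os_reserve_gb


def next_ram_size_gb(need_gb, sizes=(8, 16, 32, 64, 128)):
    """Smallest common RAM configuration that covers the need."""
    for size in sizes:
        if size >= need_gb:
            return size
    raise ValueError("need exceeds the largest listed size")


per_thread = ram_per_thread_gb(20, 25)    # Xeon: 20 gig across 25 threads
need = estimated_need_gb(24, per_thread)  # 3900x: 12 cores / 24 threads
print(f"{per_thread:.2f} GB per thread, about {need:.1f} GB needed, "
      f"so {next_ram_size_gb(need)} GB installed")
```

That works out to roughly 0.8 gig per thread, which lands a 24-thread 3900x above 16 gig and so at 32 gig as the next step up. Since Rosetta's memory use keeps changing with batches, treat this as a snapshot only.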

Problem solved.

And I never had this before, as my Threadrippers with 16 cores had 32 gig, and the EPYCs have plenty of ram. So it's the "new" Rosetta boxes with only 16 gig that are problematic; that's why it was fine before. Now to upgrade the ram before the race.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,199
3,630
136
Now to upgrade the ram before the race

Good luck with that. Newegg shipping is terrible right now, Amazon 2-day Prime is now 4-5 days shipping. Newegg is at like 8 days for me (if it comes from California). [ Hurry up UPS! My 3950X needs that MB/RAM for the big race! ]
 

Endgame124

Senior member
Feb 11, 2008
953
669
136
The average and maximum RAM requirements of the Rosetta application have been changing with application versions, and more so with batches, a lot lately. My last info from end of April is circa 1.5 GB max; maybe 800 MB typical. I'll check on it once more later today.

In addition, the MiniRosetta application with its low RAM requirement was retired recently. If you had the usual random mix of Rosetta and MiniRosetta tasks on your computer, the RAM pressure was less.

Normally, boinc-client does a good job of keeping all running tasks within the overall RAM limit which you configured for BOINC. But it can only react. It monitors the actual RAM usage, and starts or suspends tasks accordingly. But from what I understand, if tasks increase their RAM usage too quickly, or if something else on your computer suddenly begins to occupy lots of RAM, then the computer may run into a need to swap out from RAM to disk before boinc-client is able to suspend tasks. At that point, the computer is already so slow that boinc-client doesn't really have a chance anymore to get back in control.

Hence my recommendation to leave a good margin for the operating system etc. when you configure BOINC's RAM limit.
Note, just like when using a 1 GB Raspberry Pi to run Rosetta, you should be able to configure zram on Linux systems to create a dynamically sized, compressed swap partition in RAM, so if your system starts to swap, it just burns a little extra CPU instead of thrashing page files to death.
 

zzuupp

Lifer
Jul 6, 2008
14,863
2,319
126
Good luck with that. Newegg shipping is terrible right now, Amazon 2-day Prime is now 4-5 days shipping. Newegg is at like 8 days for me (if it comes from California). [ Hurry up UPS! My 3950X needs that MB/RAM for the big race! ]
Apparently, I got lucky then. I placed an order midweek. The delivery tracker claims that 2/3rds of it is in the area somewhere, and should be delivered today.
The other 1/3 appears to be on a choo-choo coming from California.
 

Markfw

Good luck with that. Newegg shipping is terrible right now, Amazon 2-day Prime is now 4-5 days shipping. Newegg is at like 8 days for me (if it comes from California). [ Hurry up UPS! My 3950X needs that MB/RAM for the big race! ]
All done... The three Threadrippers I have sold had all that extra memory. Now all 3 3900x's and the Xeon are at 32 gig!

And the 2 I did that were not locked up were using 14.5 gig of 16, so any minute they would have been a problem.

I have WAY too much hardware. I have 3 functional boxes not fired up, as I don't have the power to run them, including a 14 core 28 thread Xeon.

Edit, and I changed back the box that I had to switch to WCG, so now all 3 3900x's are on Rosetta, the 3950x, all Xeons and EPYC's, and a 2970wx and a 2990wx.