Announcement WCG update 9/9/2022 ** still messed up ***


StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
OK, almost done. So I have tried Rosetta, next to nothing happening. No WCG, so what other CPU medical project is there that has work?
Check out post #4. :-D


(Apart from these and F@H, there are no other active medical projects currently. OK, and GPUGrid, intermittently.)

Oh, one more edit: SiDock@home's project infrastructure is based in Russia, if that sort of thing matters to anybody. But their science is done in Slovenia, which is an EU member. (SiDock team page)
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,542
14,496
136
UPDATE: I added Rosetta (actually, just resumed it) on the three 64-core boxes, and they are running! Not all are maxed to 1,000 tasks, but a couple hundred are ready to start. So I added it to other machines that were idle. I am up to 588 running.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Yep, looks like there is a batch of classic Rosetta tasks available again. You could try to buffer some, but keep in mind that the reporting deadline is just 3 days.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,542
14,496
136
Yep, looks like there is a batch of classic Rosetta tasks available again. You could try to buffer some, but keep in mind that the reporting deadline is just 3 days.
The slowest boxes I have are the 64-core 2 GHz boxes, which will do these in 7 hours. I will NOT try to buffer anything. If it stops delivering, then so be it.

Progress: [screenshot attached]
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
The runtime of (classic) Rosetta tasks is mostly independent of CPU speed. It is configured at the website in your account -> Rosetta@home preferences -> Target CPU run time. I believe the default is 8 h currently, but I am not sure.

Within each task, the application runs several trial simulations of the same molecule in succession, for as long as the target runtime isn't exceeded. Faster CPUs get some more of these runs done, slower CPUs some fewer. (How many runs can be accomplished within the target runtime depends on the complexity of the workunit, besides CPU speed.)

After a task finishes, the number of runs which were accomplished is logged in stderr.txt of the task in a line like "This process generated ... decoys from ... attempts". After the result was reported to the server, this can be seen in an individual task's details in the user's or host's tasks tables on the website.

If you look at several results with similar batch names but from computers with different per-core speed, you will see distinctively different numbers of "decoys" reported in relation to runtime. Besides, if you e.g. browse top_hosts.php, you will see that some users set a different runtime than the default.

Currently distributed workunits end up with hundreds of "decoys" within each task, so I guess the currently researched models aren't very complex.
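If anyone wants to tally those decoy counts automatically from a pile of saved task outputs, a quick sketch along these lines should do it. (The directory name is made up, and the exact wording of the summary line might differ between application versions, so treat the pattern as a starting point.)

```python
import re
from pathlib import Path

# Hypothetical folder holding copies of stderr.txt from finished Rosetta tasks
# (copied from the BOINC slots/ directories or pasted from the website).
STDERR_DIR = Path("rosetta_stderr")

# Matches the summary line quoted above:
# "This process generated N decoys from M attempts"
PATTERN = re.compile(r"This process generated\s+(\d+)\s+decoys\s+from\s+(\d+)\s+attempts")

for path in sorted(STDERR_DIR.glob("*.txt")):
    text = path.read_text(errors="ignore")
    m = PATTERN.search(text)
    if m:
        decoys, attempts = map(int, m.groups())
        print(f"{path.name}: {decoys} decoys from {attempts} attempts")
    else:
        print(f"{path.name}: no decoy summary line found")
```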
 

cellarnoise

Senior member
Mar 22, 2017
711
394
136
I think this last post confirms Stefan is an A.I. Bot and we are all just doing "its" bidding. ;)
At least it appears that "Stefan" is on our side, for now...

Crap? The forum AI (much smarter than I) won't let me insert a photo on mobile? Only attach? We are doomed? ;)

Thanks Stefan! :)
 

Attachments

  • HAL9000_Case.svg.png

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
I know. I was just saying that it's happening. Instead of over 700 tasks running at once on my boxes, I am down to a couple hundred. They will run out in a couple of days. Time to find something to keep them busy.
My rig ran out sometime in the early hours (GMT) of Wednesday morning; you did well to keep your fleet going that long!

I hear you on the cold basement. I put my 5950 on MilkyWay - it seems silly to use it on projects that also use GPU, but there aren't many projects left that just use CPU.
There's LHC, although I find it sometimes causes GUI lag on my system now; for the moment I'm sticking with it.
With the problems Rosetta's Python WUs caused, I'm sadly staying away from them; I would've liked to rejoin it!

I tried to run Asteroids, and found out it was still down from last year! lol, what happened to them?
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
LHC lag is worse than I thought, plus getting many errored WUs and warnings from some WUs about needing to clean the VM environment is just doing my head in! So I'm going to stop the last VM-based LHC project and run SixTrack; I just hope I get enough WUs for that!
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
LHC@home's applications are all difficult to run. They are resource hungry to varying degrees (disk, network; some have VM overhead with big RAM requirements and the responsiveness issues which you mentioned) and are not very well integrated with BOINC. (E.g., they use LHC's distributed filesystem for most of their file transfers, not BOINC's own file download and upload mechanism.)

Even Sixtrack is difficult: Runtimes can vary wildly. It may happen that a host receives a large number of tasks with very short runtimes; the BOINC client would then reduce its runtime estimation for this application radically; and when the host started to receive normal Sixtrack tasks again, the client would abort them before they could finish because they took much longer than the client's newly adjusted timeout would permit.
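To illustrate that failure mode with made-up numbers (this is only a toy model; the real BOINC client has its own estimation and abort logic, and every constant below is an assumption):

```python
# Toy model: the client keeps a smoothed runtime estimate per application and
# aborts any task whose elapsed time exceeds that estimate by a safety factor.
estimate = 6.0       # hours; assumed initial Sixtrack estimate
alpha = 0.5          # assumed smoothing weight for newly observed runtimes
abort_factor = 10.0  # assumed safety margin before the client gives up

# A burst of very short tasks followed by normal-length ones.
runtimes = [0.05] * 20 + [6.0] * 5

for needed in runtimes:
    limit = estimate * abort_factor
    if needed > limit:
        print(f"task needing {needed:.2f} h aborted (limit was {limit:.2f} h)")
        continue
    # Only finished tasks update the estimate, so it collapses after the burst.
    estimate = (1 - alpha) * estimate + alpha * needed
    print(f"task finished in {needed:.2f} h, estimate now {estimate:.3f} h")
```

With these numbers, twenty 0.05 h tasks drag the estimate down to roughly 0.05 h, so the following 6 h tasks blow past the 10x limit and get aborted, which is the behaviour described above.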

On a more general note: Empirically, all applications of physics or chemistry DC projects — and a certain subset of astronomy projects — are difficult to run; some more than others. This is fundamentally the nature of numeric simulations in physics. Even if there is programmer talent and time available to get the application program in best shape, issues like SixTrack's bimodal runtime, or other LHC applications' large file data, or QuChemPedIA's frequent lack of convergence will remain.

So there will always be issues like these left for us DC volunteers to cope with. But we all have limited spare time that we can and want to dedicate to monitoring and controlling our computers. Sometimes there is just no alternative but to switch over to an easy and reliable project, of which there are a few in the math/number theory section of DC. Other times we might be able to invest some time upfront into implementing various workarounds which then allow us to keep a difficult project running with a lot less supervision.

Back to LHC@home: Personally, I like the subject matter of this project. But I haven't invested as much personal time and computer time on it as on other physics/astrophysics/chemistry related projects. LHC donated a sizable amount of their own datacenter capacity to medical research at some point, and are still doing so, and that left me wondering whether they really need volunteer computer time at all. (Perhaps at most during peak demand, but apparently not over the long term.)
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
I've disabled all but the SixTrack app now, and there are no WUs, lol.
So I've started Universe@Home, no problems there so far :).

I really like the subject matter of LHC too; their research and discoveries have always fascinated me. But I'm not going through that mammoth post of LHC's about how to fix the VM. I don't mind spending a bit of time tweaking a few settings, but I'm not spending hours going through that!
LHC@H has been more problematic (since about 1 yr ago) than any other DC project I have ever run. So I won't be doing much of that :( (bar any SixTrack work I get).
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,542
14,496
136
Well, WCG is totally dead, and Rosetta (without VirtualBox) is out of work. So, for the first time in years, I only have 8 video cards going, and I have had to close up the windows in the house! It's only 48°F outside.
 

cellarnoise

Senior member
Mar 22, 2017
711
394
136
TN-Grid also seems to be having problems accepting finished WUs, and is largely running out of work.

Maybe just run F@H on the CPU for a while?

I think I am going to suspend running SiDock WUs because of recent geopolitics... :(

Getting hard to find medical tasks.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,542
14,496
136
The runtime of (classic) Rosetta tasks is mostly independent of CPU speed. It is configured at the website in your account -> Rosetta@home preferences -> Target CPU run time. I believe the default is 8 h currently, but I am not sure.

Within each task, the application runs several trial simulations of the same molecule in succession, for as long as the target runtime isn't exceeded. Faster CPUs get some more of these runs done, slower CPUs some fewer. (How many runs can be accomplished within the target runtime depends on the complexity of the workunit, besides CPU speed.)

After a task finishes, the number of runs which were accomplished is logged in stderr.txt of the task in a line like "This process generated ... decoys from ... attempts". After the result was reported to the server, this can be seen in an individual task's details in the user's or host's tasks tables on the website.

If you look at several results with similar batch names but from computers with different per-core speed, you will see distinctively different numbers of "decoys" reported in relation to runtime. Besides, if you e.g. browse top_hosts.php, you will see that some users set a different runtime than the default.

Currently distributed workunits end up with hundreds of "decoys" within each task, so I guess the currently researched models aren't very complex.
I changed it from the default 3 to 12 hours. My PPD went down from 506k to 147k. I am back to 4 hours.
 

wayliff

Lifer
Nov 28, 2002
11,718
9
81
Hello fellow crunchers and anandtech members!

I have also been crunching WCG for many years... came to check what the alternatives are using BOINC. Thanks for the posts.
I used to have dedicated machines, but now I just do mostly overnight runs or while I am not working.

Registered: May 1, 2007
Run Time: 124:283 years:days
Points: 193,401,015
Results: 439,574
Last Result: February 21, 2022

:)
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
(Rosetta v4.20)
I changed it from the default 3 to 12 hours. My PPD went down from 506k to 147k. I am back to 4 hours.
Hmm. But how much of this was from the preferences change, and how much perhaps from "downtime" when Rosetta didn't send work?

PPH for individual tasks on three of your hosts:

Results from host 6179426 (Ryzen 9 5950X, 111 valid results when I collected these data):
10 results with 3.0 h duration, 54...62 points/hour
70 results with 5.5...6.6 h duration, 22...30 points/hour
30 results with 9.0...9.9 h duration, 25...28 points/hour
1 result with 12.0 h duration, 72 points/hour
---> That is, points/hour are all over the place.

Results from host 6179428 (EPYC 7452, 203 valid results when I looked):
6 results with 6.1...6.2 h duration, 36...41 points/hour
62 results with 6.5...7.2 h duration, 31...38 points/hour
56 results with 7.6...8.0 h duration, 34...43 points/hour
79 results with 11.6...12.1 h duration, 32...46 points/hour
---> That is, points/hour are consistent between 6h, 7h, 8h, and 12h. But we don't have 3h results on this host.

Results from host 6179463 (Ryzen 9 5950X, 88 valid results when I looked):
12 results with 3.0 h duration, 49...76 points/hour
8 results with 3.8...4.0 h duration, 47...72 points/hour
50 results with 8.0 h duration, 52...70 points/hour
4 results with 11.0...11.5 h duration, 8.3 points/hour *
14 results with 11.9...12.1 h duration, 50...67 points/hour
---> That is, points/hour are consistent between 3h, 4h, 8h, and 12h tasks (ignoring the four special tasks which obviously had issues).
________
*) stderr.txt of these tasks shows that only 10 decoys or less were generated within these tasks. Tasks with normal credit had an order of magnitude more decoys.
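(For anyone who wants to reproduce this kind of tally, a small sketch of the bookkeeping is below. It assumes you have copied runtime/credit pairs out of a host's tasks table on the website by hand; the sample numbers are invented.)

```python
from collections import defaultdict

# Invented sample data: (runtime in hours, granted credit) per valid result,
# as copied by hand from a host's tasks table on the Rosetta@home website.
results = [(3.0, 190.0), (6.2, 210.0), (8.0, 460.0), (8.1, 520.0), (12.0, 640.0)]

buckets = defaultdict(list)
for hours, credit in results:
    buckets[round(hours)].append(credit / hours)  # points per hour

for target in sorted(buckets):
    pph = buckets[target]
    print(f"~{target} h tasks: {min(pph):.1f}...{max(pph):.1f} points/hour "
          f"({len(pph)} results)")
```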


If Rosetta@home had a longer streak of Rosetta v4 work availability, then a "scientific" experiment would be to set up two, three, or four BOINC client instances on a single computer, configure a different target CPU run time for each client instance, and then have them work for a while on whatever random work batches come along. After all instances have amassed a good deal of valid results, compare the PPH of the differently configured tasks.
But due to the frequent gaps in work availability, such an experiment is harder to do. We would want to take data only from tasks which ran while the host was fully utilized. An only partially busy host would have higher per-core performance due to a higher power budget/higher thermal budget/lower cache pressure/higher memory bandwidth per busy core.

(Edit: An alternative to several client instances on a single host would be several hosts with same hardware. But it's hard to have same hardware: CPUs from a very similar V/F bin would be required, or the CPUs should be tweaked to run at a fixed clock.)
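A rough sketch of how such a multi-client setup could be launched is below. The directory layout is invented, and the --allow_multiple_clients and --gui_rpc_port options should be double-checked against your client version; each instance would also need its own venue or account on the website so that it can pick up a different target CPU run time.

```python
import subprocess
from pathlib import Path

# Invented layout: one BOINC data directory and one GUI RPC port per instance.
instances = {
    "rosetta_3h": 31420,
    "rosetta_8h": 31421,
    "rosetta_12h": 31422,
}

for name, rpc_port in instances.items():
    data_dir = Path.home() / "boinc_experiment" / name
    data_dir.mkdir(parents=True, exist_ok=True)
    # Assumed client options; verify with `boinc --help` before relying on them.
    subprocess.Popen(
        ["boinc",
         "--dir", str(data_dir),
         "--allow_multiple_clients",
         "--gui_rpc_port", str(rpc_port)]
    )
```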
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,542
14,496
136
Really odd. I guess I can't tell anything until Rosetta gets back to normal.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Latest update on the currently static World Community Grid site:
2022-04-13: Validation of the WCG Environment Enters Final Stages
Workunit management, website and forums, and APIs have been subjected to unit tests, integration testing, and manual inspection to validate previous WCG functionality. Over 80% of tests and checks have been completed successfully in the Krembil QA environment to date.
https://www.worldcommunitygrid.org/news
Lots of other short articles about the past and current WCG were posted there as well.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,243
3,831
75
Today would be a great day to resurrect WCG! Probably too much to hope for, though.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Ouch! AFAICT, May 9 is later than May 5, for instance. Still earlier than May 19 though. ;-)


PS, WCG's continuity and openness in public communication are exemplary, and a good sign for things to come even after the ownership change.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,542
14,496
136
The latest update: No ETA yet
 

cellarnoise

Senior member
Mar 22, 2017
711
394
136
I dream about Where we Could Go anymore ;). My mostly retrained and falling water is waiting... ;)?