Discussion World Community Grid : Africa Rainfall Project

ao_ika_red

Golden Member
Aug 11, 2016
1,679
715
136
This project will help precisely predict localised rainfall in sub-Saharan Africa, and hopefully will help African farmers increase their food production.

Be advised that this project's file sizes are larger than those of most WCG projects, so longer running times and higher RAM allocation are unavoidable.

I will try running this project after the FB Sprint and see what's what. I assume it will behave like CPDN and LHC@Home.

Full press release:
 
  • Like
Reactions: TennesseeTony

crashtech

Lifer
Jan 4, 2013
10,695
2,294
146
This project will help precisely predict localised rainfall in sub-Saharan Africa, and hopefully will help African farmers increase their food production.

Be advised that this project's file sizes are larger than those of most WCG projects, so longer running times and higher RAM allocation are unavoidable.

I will try running this project after the FB Sprint and see what's what. I assume it will behave like CPDN and LHC@Home.

Full press release:
Whoa, I hope it behaves better than that! I'd like to think that IBM adheres to a higher standard; I guess we'll see!
 
  • Haha
Reactions: emoga

StefanR5R

Elite Member
Dec 10, 2016
6,794
10,833
136
It sure behaves better than that.

The one task which I ran so far took 20 hours on a low-clocked Xeon with all HT threads busy. I haven't watched download size and upload size, but the result uploads took ~100 seconds at perhaps ~7 Mbit/s, which puts it at the order of ~80 MiB total size of the result files.
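(Back-of-the-envelope: 100 s × 7 Mbit/s = 700 Mbit ≈ 87 MB ≈ 83 MiB.)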

When I looked while it ran, it had >800 MB virtual/ >700 MB resident RAM allocated. According to the WCG forum, one should assume 1 GByte RAM per running task.

Also from the forum: There are only 8 checkpoints during the entire task runtime. I.e. you may lose hours of work if you suspend and resume a task, unless boinc-client is set to keep suspended tasks in RAM.
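If I'm not mistaken, the setting for that is the standard BOINC computing preference to leave tasks in memory while suspended; it can also be put into a global_prefs_override.xml in the client's data directory, roughly like this (just a sketch, I haven't tested it with ARP specifically):
Code:
<global_preferences>
    <leave_apps_in_memory>1</leave_apps_in_memory>
</global_preferences>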

For a run time of circa 12...24 hours, the deadline of 7 days isn't much. Work units represent time steps (and regions) within the entire simulation, hence future work units depend on the results of past work units. This may be a reason for their choice of such a moderate deadline. Furthermore, this dependence between work units could limit the total number of tasks in progress across all WCG contributors.

Current PPH (points per hour of "run time", actually CPU time), averaged over all hosts of all WCG contributors, and presumably averaged over the entire life time of the respective subproject:
ARP1 ....... 189 p/h
MIP1 ........ 200 p/h
SCC1 ....... 176 p/h
ZIKA ........ 198 p/h
HST1 ....... 182 p/h
FAHB ....... 187 p/h
MCM1 ..... 186 p/h
I.e. on average, it's a wash. But for individual hosts, there will be differences because of the different performance characteristics of the applications.

Oh, and on day 1 of this new project, 16 (sixteen) results were returned by all WCG contributors combined, according to project stats. :-)
 
Last edited:

ao_ika_red

Golden Member
Aug 11, 2016
1,679
715
136
Wow. Awesome initial report! Thanks a lot Stefan!
I haven't even touched it yet, but from the look of it, my little Athlon should run this project happily, unlike certain CPDN or LHC tasks.
I wish this project covered the entire world, because we had a record drought and rainforest fires this year.
 

StefanR5R

Elite Member
Dec 10, 2016
6,794
10,833
136
Current PPH (points per hour of "run time", actually CPU time), averaged over all hosts of all WCG contributors, and presumably averaged over the entire life time of the respective subproject:
ARP1 ....... 189 p/h
Oh, now it shows 156 p/h (and 126 results returned in total). This is certainly going to fluctuate for a while, until many more results have been returned.
 
  • Like
Reactions: ao_ika_red

StefanR5R

Elite Member
Dec 10, 2016
6,794
10,833
136
BTW, we are apparently simulating the period of July 2018...June 2019. Here is the stderr log of the task which I ran yesterday:
Code:
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[08:38:38] INFO: Checkpoint taken at 2018-07-01_06:00:00
[12:04:19] INFO: Checkpoint taken at 2018-07-01_12:00:00
[15:39:47] INFO: Checkpoint taken at 2018-07-01_18:00:00
[18:12:07] INFO: Checkpoint taken at 2018-07-02_00:00:00
[20:54:10] INFO: Checkpoint taken at 2018-07-02_06:00:00
[23:32:29] INFO: Checkpoint taken at 2018-07-02_12:00:00
[00:53:18] INFO: Checkpoint taken at 2018-07-02_18:00:00
[01:49:19] INFO: Checkpoint taken at 2018-07-03_00:00:00
INFO: Simulation complete compressing output.
01:50:18 (11048): called boinc_finish(0)

</stderr_txt>
]]>


Edit:
The download and upload sizes of ARP tasks are indeed larger than normal.
Research Project            | One-Time Download | Per Workunit Download | Per Workunit Upload
Africa Rainfall Project     | 100 MB            | 100 MB                | 60 MB
FightAIDS@Home - Phase 2    | 10 MB             | 0.2 MB                | 1 MB
Help Stop TB                | 30 MB             | 5 MB                  | 10 MB
Mapping Cancer Markers      | 40 MB             | 0.1 MB                | 3 MB
Microbiome Immunity Project | 100 MB            | 50 MB                 | 1.5 MB
OpenZika                    | 2 MB              | 0.2 MB                | 0.2 MB
Smash Childhood Cancer      | 2 MB              | 0.2 MB                | 0.1 MB
(from the help section)
 
Last edited:

StefanR5R

Elite Member
Dec 10, 2016
6,794
10,833
136
I don't have ARP enabled myself. But from what I read elsewhere, and from what WCG's own stats show,* the volume of ARP work in circulation remains comparatively low.

It may nevertheless be possible to occupy a computer solely with ARP if none of the other WCG subprojects is permitted onto the boinc client and the client is prompted to request new work more often than it would on its own. (I have not tested this theory, nor am I aware of anybody who has.) A viable alternative to disabling the other subprojects completely may be to limit their number of tasks in progress in the web prefs (while leaving ARP unlimited or set to a high limit), to set the local work buffer larger than the other projects' work in progress can fill, and to keep prompting the client to request more work regularly; see the sketch below.
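
For what it's worth, here is a rough sketch of the "keep prompting the client" part (untested; the project URL, the 15-minute interval, and running it as a Python loop are just my assumptions), using boinccmd:
Code:
#!/usr/bin/env python3
# Untested sketch: periodically ask the local boinc client to contact WCG,
# so that it keeps requesting work. Assumes boinccmd is in PATH; it may need
# to be run from the BOINC data directory (or given --passwd) for RPC auth.
import subprocess
import time

WCG_URL = "https://www.worldcommunitygrid.org/"   # assumed project URL
INTERVAL_S = 15 * 60                              # prompt every 15 minutes

while True:
    # "update" triggers a scheduler request; the client asks for more work
    # if its local buffer is below the configured level.
    subprocess.run(["boinccmd", "--project", WCG_URL, "update"], check=False)
    time.sleep(INTERVAL_S)

Whether this actually keeps a machine filled with ARP depends on how many tasks the WCG scheduler is willing to hand out, of course.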

--------
*) results returned yesterday/ CPU time reported yesterday/ fraction of total CPU time:
arp1 ............ 1,700 ...... 5.8 y ... 1.1 %
mip1 ....... 610,000 ..... 160 y ... 30 %
scc1 ................... 0 ......... 0 y ..... 0 %
zika .............. 1,200 ..... 0.4 y ... 0.1 %
hst1 ................ 740 ..... 1.3 y ... 0.2 %
fahb ......... 150,000 ...... 94 y ... 18 %
mcm1 ...... 580,000 ... 270 y .... 51 %

PS:
If you can, request and complete work with at least a 7-day deadline (ARP, MIP, MCM), but report the results only after November 16, 00:00 UTC.
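One way to do the "report only after" part (my own suggestion, untested): once your buffer is filled, suspend network activity in the client and re-enable it after the start, e.g.
Code:
boinccmd --set_network_mode never
# ... and again after November 16, 00:00 UTC:
boinccmd --set_network_mode auto

Keep in mind that while network activity is suspended, the client will not download further work either.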
 
Last edited:

ao_ika_red

Golden Member
Aug 11, 2016
1,679
715
136
It's been a week and I still haven't gotten any ARP WU. I can only think that the preliminary WUs were used as test WUs, and they're now ironing out the wrinkles after getting feedback.

I only turn my PC on at night, though, so that's probably another reason.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,367
3,839
136
www.google.com
The forum kept complaining about limited WUs; then one of the volunteers explained that this project is expected to consume an ENORMOUS amount of (redundant) storage at IBM/WCG, and that they are phasing it in slowly to allow the system to keep up. I think they said 1 petabyte of hard disk space?

Edit: I found it very interesting that African rainfall is so hard to predict because a rainstorm will often fall on just ONE FARM. Whoa. That is a rather small amount of rain indeed!
 

StefanR5R

Elite Member
Dec 10, 2016
6,794
10,833
136
It's been a week and I still haven't gotten any ARP WU. I can only think that the preliminary WUs were used as test WUs, and they're now ironing out the wrinkles after getting feedback.
No, the number of work units in circulation is slowly growing:
arp1 stats history (shows results returned, not tasks sent)

ARP went through a beta testing phase before release.

But it is still a small number of tasks compared to the other projects, as noted in #8. If you want some of them, my guess is that you simply need to send a correspondingly large number of work requests (and therefore need to either disable the other projects, or set a limit on their tasks in progress such that they cannot fill your client's work buffer).