WCG problems


crashtech

Lifer
Latest: WU Distribution Update

We are working towards resuming a consistent WU supply similar to what we had before the storage system failure. The recent sparsity of OPN1 WUs was caused by a batch that blocked the create-work process for all other projects. We have found and fixed the glitch, and the system is busy creating work for OPN1 right now. We still have an ARP1 backlog of unsent results (see the ARP project update), but we now have spare capacity for a larger backlog. After OPN1 work units are prepared, the system will prepare ARP1 work units.

On the back end, we still had to finalize the setup of the new storage, as a networking issue was preventing us from accessing the tape archive. The data center admins have helped us fix it, and the production system on the new storage is now being backed up.

We continue to investigate the errors in the BOINC system services, specifically assimilators and validators. Unfortunately, the application is written such that an unexpected error halts the service (which happened when our storage system failed). We are attempting to clear out the problematic data to allow the applications to continue processing other results, but BOINC doesn't seem to have an easy method of flushing specific workunits or results out of its system.
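For readers wondering what "flushing" a workunit would involve: cancelling a stuck workunit server-side generally comes down to editing the BOINC project database directly, which is roughly what BOINC's cancel-jobs admin operation does. Below is a minimal sketch of that idea, assuming MySQL access via pymysql, the standard BOINC server schema (workunit.error_mask, result.server_state, result.outcome), and placeholder connection settings and workunit IDs; the exact cleanup WCG needs may well differ.

```python
# Hedged sketch: cancel a list of problematic workunits directly in the BOINC
# project database, so the transitioner/assimilator can move past them.
# Assumptions: MySQL access via pymysql, standard BOINC server schema,
# placeholder connection settings and workunit IDs -- adjust for the real project.
import pymysql

WU_ERROR_CANCELLED = 16          # bit in workunit.error_mask (BOINC convention)
RESULT_SERVER_STATE_UNSENT = 2   # result not yet sent to a host
RESULT_SERVER_STATE_OVER = 5     # result retired
RESULT_OUTCOME_DIDNT_NEED = 5    # result was not needed

def cancel_workunits(conn, wu_ids):
    """Mark workunits as cancelled and retire their unsent results."""
    with conn.cursor() as cur:
        fmt = ",".join(["%s"] * len(wu_ids))
        # Flag the workunits themselves as cancelled.
        cur.execute(
            f"UPDATE workunit SET error_mask = error_mask | %s WHERE id IN ({fmt})",
            [WU_ERROR_CANCELLED, *wu_ids],
        )
        # Retire results that were never sent, so they don't go out to hosts.
        cur.execute(
            f"UPDATE result SET server_state = %s, outcome = %s "
            f"WHERE server_state = %s AND workunitid IN ({fmt})",
            [RESULT_SERVER_STATE_OVER, RESULT_OUTCOME_DIDNT_NEED,
             RESULT_SERVER_STATE_UNSENT, *wu_ids],
        )
    conn.commit()

if __name__ == "__main__":
    # Placeholder credentials and hypothetical workunit IDs.
    conn = pymysql.connect(host="localhost", user="boincadm",
                           password="...", database="wcg_project")
    cancel_workunits(conn, [123456, 123457])
```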

If you have any comments or questions, please leave them in this thread for us to answer. Thank you for your support, patience and understanding.

WCG team


StefanR5R

Elite Member
I received ARP1 work (it's the only WCG subproject I currently have selected), but its large result files are uploading very slowly, considerably slower than my own slowish internet uplink would allow. It's a combination of a generally low transfer rate and occasional transient HTTP errors.

It's a good thing that the client eventually stops requesting more work when there are too many uploads in progress (per project). Right now, though, this client-side stopper isn't even triggered, because the server-side ARP1 work supply appears to have dried up again already. (This subproject has always submitted work in waves rather than continuously.)
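If you want to keep an eye on how many uploads are actually pending at any moment, the client's transfer list can be polled from the command line. A rough sketch, assuming boinccmd is installed and allowed to talk to the local client, and that its text output contains one "direction: upload" line per pending upload (field names can differ slightly between client versions):

```python
# Hedged sketch: count in-progress uploads reported by the local BOINC client.
# Assumptions: boinccmd is on PATH, the client accepts RPC from this machine,
# and each pending upload is listed with a "direction: upload" line.
import subprocess

def count_pending_uploads() -> int:
    """Return the number of file transfers the client reports as uploads."""
    out = subprocess.run(
        ["boinccmd", "--get_file_transfers"],
        capture_output=True, text=True, check=True,
    ).stdout
    return sum(1 for line in out.splitlines()
               if line.strip().startswith("direction:") and "upload" in line)

if __name__ == "__main__":
    print(f"pending uploads: {count_pending_uploads()}")
```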