WCG problems


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,254
16,110
136
Wow, quick response on my latest email, and here it is using Gmail

1758834280780.png
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,254
16,110
136
I just reported (with pictures) that WCG was still broken, and I got a personal thank you from Igor!

1759407616229.png
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,695
4,657
75
Now getting "Another scheduler instance is running for this host". Which might be an improvement.
 

mmonnin03

Senior member
Nov 7, 2006
336
272
136
October 3, 2025
We are aware of the issue with the scheduler returning "Another scheduler instance is running for this host" and have identified the cause in the config.xml template we adapted for the new containerized environment. We will fix it once we have confirmed that the new event-driven validation and assimilation pipelines are working correctly.

Uploads are being processed normally. We've confirmed that the new architecture for the containerized file_upload_handler pool behind Apache is correctly producing to the per-application Kafka (Redpanda) topics, storing the event and result data in separate queues on the local brokers partition.

As a result, there will be at least one more weekend sprint. Tentatively, we expect to be producing new workunits next week for MCM1, ARP1, and MAM1 beta version 7.07. Validations should resume over the weekend; initial releases of batches will be intermittent.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,254
16,110
136
Igor's reply to my email to him today:

almost there, i hope:

scheduler fix/open, some downloads sent, new validations should be the
endpoint in a few hours, then Dylan will get to work trying to create new
workunits for MCM1 and MAM1, once MCM1 is steady and we have beta mams
going out to hopefully promote 7.07 as the first production MAM1 release.
 

Skillz

Golden Member
Feb 14, 2014
1,182
1,191
136
They should probably focus on fixing the scheduler issue before even thinking of trying to release a new app.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,254
16,110
136
Well, all the bad error messages went away, but no tasks. Here is the latest comment from Igor.


Igor Jurisica

Wed, Oct 8, 7:10 AM (1 day ago)

Thank you Mark - yes - no work has been going out yet - we wanted to
gracefully start, monitor - update, increase - hopefully - all will start
coming up shortly.
 

StefanR5R

Elite Member
Dec 10, 2016
6,686
10,591
136
November
11/06 - 11/09 (19:00 UTC) Boinc Games 2025 UCI Indoor Cycling World Championships sprint (project TBA, 3 days)
11/16 - 11/23 (00:00 UTC) World Community Grid 21st Birthday Challenge (7 days) — tentative; maybe not happening
11/16 - 11/23 (12:00 UTC) PrimeGrid UNESCO Anniversary Challenge (CUL/WOO-LLR, 7 days)
WCG Birthday Challenge status changed from "maybe not happening" to "most likely not happening".
Besides the recent technical issues, they point out that the submission rate of new work was lacking even during periods without other technical problems. (Reminds me of TN-Grid.) Of course, there is a slim possibility that it gets better after Krembil fixes what broke during the move to other infrastructure, but nobody is holding their breath.