Asteroids@home - new to me! Who's crunching it?

Sunny129 · Aug 28, 2013

Assimilator1 said:
I ran out of WUs last night, my cache change didn't have chance to get new WUs.
Also can't access their homepage.

it loaded for me, but it did so extremely slowly...they are definitely having server issues right now. all we can do is wait until it is sorted out, and perhaps keep an eye on the A@H message boards (so long as the server is up and running them).

zzuupp · Aug 28, 2013

I haven't tried the board yet, but I apparently ran out of tasks earlier. I have four waiting to report.

networkman · Aug 29, 2013

I can't even get to the homepage for the Asteroids project now, so I suspect they ran into bigger issues than anticipated.

After hearing the news about the new optimized apps, I had re-loaded my work units to take advantage of the faster times.

Apparently, my cache is set fairly high as I have 53 WUs completed, 4 in process, and 134 more waiting to go so I dare say I can handle another 2 or 3 days without issue.

The real problem (which I hope they consider) is the deadline for the work to be completed -- I'd be really upset if the work didn't get credited because it was past the deadline, but we couldn't upload because the servers were down!

networkman · Aug 29, 2013

Okay, I just happened to check and find the project is accepting work again and sending out new units!

Assimilator1 · Aug 29, 2013

Yea I've got WUs too now

petrusbroder · Aug 29, 2013

The site is up too - all seems OK then.

Sunny129 · Aug 29, 2013

EXPLANATION THREAD

Assimilator1 · Aug 30, 2013

I nearly ran out of WUs this morning, throttled back A@H to 25% & ran F@H alongside @75% til this evening. Got a bunch of WUs now.

Bradtech519 · Aug 30, 2013

I've been happy with the performance of my FX 8350 on the SSE3 app, running at stock 4.0 GHz. I've got seven workunits going for Asteroids and 1 WU for test4theory. Completion times have been around 2000-2200 seconds. The CPU is staying between 49-51°C at 100% load. I'll be glad when electric bills start falling once the summer heat is over.

Assimilator1 · Aug 31, 2013

My WUs complete in about 45 mins (+/- 2 mins), i.e. ~2700s. What clock speed is your FX at?

zzuupp · Sep 4, 2013

I was getting new tasks, but they were all 'download failed'.
Going back to LHC for a bit. So far, they've fixed their problem, and they are rarer tasks.

Sunny129 · Sep 4, 2013

zzuupp said:
I was getting new tasks, but they were all 'download failed'.
Going back to LHC for a bit. So far, they've fixed their problem, and they are rarer tasks.

not all of the tasks i'm getting are failing on download - i'd say approx. 25% of all A@H downloads are failing on my host. now granted, that's a lot of errors, but keep in mind that our CPUs aren't wasting any time trying to crunch these "download failed" tasks. and the large quantities of errors being generated on a multitude of hosts by this server-side issue seem to be recognized as such by our BOINC clients. that is to say, if our BOINC clients were mistakenly identifying the errors as having been caused by our hosts, BOINC would punish us by reducing our daily quotas (which only return to their respective maximums over time, after X number of tasks have been returned with an error rate less than Y, so to speak). yet despite all these "download failed" errors, i've been able to maintain a full cache all day.

long story short, these errors shouldn't hurt your production b/c your CPU isn't wasting any time on them (your RAC/PPD should be just as good now as it was before the "download failed" errors started to roll in). likewise, they shouldn't hurt the project b/c the server will continue to attempt to resend these tasks to other hosts until it succeeds in doing so.

zzuupp · Sep 5, 2013

It seemed like my failure rate was higher, or just bad timing on my part. Yesterday morning I watched a few dozen download, and then 'download failed' right before my eyes.

Switching projects was the easiest fix before I left for work.

GLeeM · Sep 5, 2013

Sunny129 said:
the server will continue to attempt to resend these tasks to other hosts until it succeeds in doing so.

I don't think they are going to succeed.

It sounds like they will have to fail to send to 20 different hosts before quitting!

So just be patient.

Assimilator1 · Sep 5, 2013

Well atm I've got a whole list of failed d/l WUs, my client had completely stalled, I've only got 2 crunching now & that's only because I did an update.

Same thing happened yesterday, 90%+ failed d/l WUs.
So I'm throttling A@H back to 1 CPU & firing up F@H until they can get themselves sorted.

I'm gonna see if there's an update in that server problem thread.
Yep :-

Kyong has finally found where the problem is and he is working on it. There is a mismatch between database and filesystem. The problem is caused by the server crash we experienced last week.

We are sorry for the problem.

Sunny129 · Sep 5, 2013

wow...my guess is that what isn't happening to me yet (as mentioned in my previous post) is happening to others...that is to say, the A@H project is stalling on some folks' hosts b/c they are returning too many errors, and their scheduler request back-offs are increasing accordingly, thus preventing the download of new work for ever increasing periods of time. i have a ton of A@H download errors on my host as well, but apparently i haven't had enough of them to cause BOINC to increase my scheduler request back-off times and prevent me from trying to download new work, b/c new tasks keep coming in from A@H.

at any rate, b/c there are no wasted CPU cycles or loss of production efficiency caused by these download errors, i will continue to crunch A@H. if/when the errors become numerous enough to start affecting my ability to get new A@H work, i'll put this project on the back burner and dedicate my CPU to another project (or projects). but it appears some of you have already gotten to that point...
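
To picture how the back-off described above can grow until a host runs dry, here is a rough Python sketch of a capped exponential back-off. This is only an illustration and not BOINC's actual formula: the function name, the 60-second base, the doubling per consecutive failure, and the 24-hour cap are all assumptions made up for the example.

```python
def backoff_seconds(consecutive_failures, base=60, cap=24 * 3600):
    """Illustrative delay before the next scheduler request.

    Hypothetical model: the delay doubles with every consecutive
    failure and is capped. BOINC's real constants may differ.
    """
    return min(cap, base * 2 ** consecutive_failures)


def hours_of_delay(max_failures):
    """Back-off in hours for 0..max_failures consecutive failures."""
    return [backoff_seconds(n) / 3600 for n in range(max_failures + 1)]
```

Under these made-up numbers the delay stays small for the first few errors and then climbs quickly, which would match a host being fine one day and stalled the next once the delay outlasts its cache.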

Assimilator1 · Sep 5, 2013

Yea sure have, but I am eager to crunch it!

Sunny129 · Sep 6, 2013

someone who posted in the "Server failures in last few days" thread on the A@H forums claimed that detaching and reattaching his hosts eliminated his download errors. but in a subsequent response, someone else tried the same thing and it didn't do anything for him. i won't be able to test either a project reset or a detach-reattach for another ~48 hours or so b/c that's how large my cache is, and i don't want to trash any of the "ready to start" A@H tasks that are currently in the queue.

in the meantime, my A@H host has also reached the point where a majority of my A@H tasks are now download errors. i also noticed my scheduler request back-off times are now in the neighborhood of 24 hours b/c of it. the only reason the A@H project hasn't stalled on this host is b/c i've had more than 24 hours' worth of tasks in my queue. nevertheless, i realize that the more download errors i return, the longer my scheduler request back-off times become, and sooner or later i'll run out of new work before the BOINC client is scheduled to contact the A@H server and request more work, thus stalling the project on my host...

...but i've come up with a workaround (for Windows hosts anyways). open up the simplest plain text editor you have, for instance notepad, and enter the following text:

boinccmd --project http://asteroidsathome.net/boinc/ update

save and close it. then change the file extension from txt to bat, and place the file in your BOINC installation directory (not your BOINC data directory). DO NOT create this batch file using a more powerful editor like MS Word, as they tend to add invisible characters to the file before saving. now go to start menu -> all programs -> accessories -> system tools -> task scheduler. create a scheduled task that executes the batch file you just created, and set it to run every X minutes or hours. mine is set to execute every 2 hours...so despite the fact that BOINC may not want to contact the A@H server again for another ~24 hours of its own accord (due to an insanely long scheduler request back-off time), the batch file will force BOINC to do so as if you had just manually updated the project. voila! babysitting without having to babysit! sure, your host will continue to return download errors (until they get the problem fixed server-side), but at least the project won't stall on your host because of it.

*EDIT* - btw, setting up a scheduled task in Win7 is a lot easier than it was in WinXP, but it can still be tricky...so if you can't get your scheduled task to execute exactly how and when it should, let me know and i'll do my best to help.
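
For hosts where Task Scheduler isn't an option, the same idea can be sketched in Python. This is a minimal sketch, not a tested replacement for the batch file: it assumes `boinccmd` is on the PATH and is allowed to talk to the running client, and the function names (`build_update_command`, `force_updates`) are hypothetical helpers invented for this example. Only the command line itself comes from the batch file above.

```python
import subprocess
import time

# Project URL taken from the one-line batch file above.
PROJECT_URL = "http://asteroidsathome.net/boinc/"


def build_update_command(project_url=PROJECT_URL, boinccmd="boinccmd"):
    """Return the same command the batch file runs, as an argument list."""
    return [boinccmd, "--project", project_url, "update"]


def force_updates(interval_hours=2.0, rounds=12,
                  run=subprocess.call, sleep=time.sleep):
    """Request a project update `rounds` times, `interval_hours` apart.

    `run` and `sleep` can be swapped out so the loop can be tried
    without BOINC installed. Returns the list of exit codes.
    """
    codes = []
    for i in range(rounds):
        codes.append(run(build_update_command()))
        if i < rounds - 1:
            # Wait before the next forced update (skipped after the last one).
            sleep(interval_hours * 3600)
    return codes
```

Calling `force_updates(interval_hours=2, rounds=12)` would mimic the 2-hour schedule for roughly a day; the script has to stay running for that whole time, which is the main drawback compared with a scheduled task.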

Sunny129 · Sep 6, 2013

*update*

well i was going to run my A@H cache dry so i could try a project reset or a project detach-reattach to see if it has a positive effect on the download error problem...but then someone posted again in the "Server failures in last few days" thread on the A@H forums, this time claiming that he had fewer download errors than before. so i set A@H to allow new tasks again and i got 59 new tasks, only 3 of which failed to download properly this time around...i'll keep accepting new work for a while to see whether this was just an isolated incident or the download error problem really is on the mend...i'll keep you all posted.

Assimilator1 · Sep 7, 2013

Nice workaround sunny, after my client was off for a day it just seemed to fix itself. Now I'm only getting the odd 1 or 2 'd/l failed' ........ spoke too soon! Just checked my client & it's nearly out of WUs! thx to d/l failed errors.

Sunny129 · Sep 7, 2013

i was just about to update, but you beat me to it - the number of download errors on my host shot back up again too...so it looks like you'll have to use my workaround for the time being if you want to remain on the project...

Assimilator1 · Sep 7, 2013

Shortly after I last posted I did a manual update & got a bunch of WUs, still some failed (8 out of 25) but a marked improvement, very up & down!

Assimilator1 · Sep 8, 2013

..... & then most failed, & then got some, & now about 1/2 are failing atm.

Think I'll just leave it crunching on 1 core until they sort themselves out!

NWM
Somehow I've still managed to pass you! Your clients stalled?

networkman · Sep 9, 2013

I'm not surprised that you've passed me as I only have that same single older Xeon machine on the project. I may be able to snag one or two older Xeon-based servers if I can get the Test Lab back up and running, which means resurrecting the router that I had to cannibalize for parts. :\

Assimilator1 · Sep 9, 2013

Which family Xeon is it? (P4? Core 2? etc) I know you gave me the numbers BOINC reports but they meant nothing to me, anyhow mine's an ageing Core 2 quad (as per my sig) so it probably wouldn't take too much to catch back up.
