SETI problem

Smoke · Oct 27, 2003

That sounds like you have uncovered the problem, Spacehead. 😉

I thought the problems were probably related.

IsOs · Oct 27, 2003

Originally posted by: Smokeball
That sounds like you have uncovered the problem, Spacehead. 😉

I thought the problems were probably related.

I wonder if they would respond if we told them that, because the user database crashed last week, we've submitted some 200,000 workunits but never got credited for them.

MGallik · Oct 27, 2003

I've lost 3 clients and a full day's crunching. I reinstalled all the clients and flushed my q's. So far everything looks ok again.

So far........

Robor · Oct 27, 2003

My 4 home clients are all idle right now and I've got who knows how many idle at work now. The result is I'm about 130 WU's off my normal production. I am not going to go around removing files and restarting clients. If they fix themselves fine, if not they'll sit until BOINC. I may un/reinstall my home fleet but I really don't feel like it at the moment. I'll wait and see if they've really fixed it. Oh well, at least my power bill will be lower...

RaySun2Be · Oct 27, 2003

dasm, this sucks. 🙁

OhioDude · Oct 28, 2003

This little problem does a number on the user_info.sah file. :disgust: I compared a good one with one on a client that stalled.

Good user_info.sah:
type=user info
id=71203
key=1989307882
email_addr=
name=George M. Fryberger
url=
country=
postal_code=
show_name=no
show_email=no
venue=0
register_time= 2451315.27295 (Sun May 16 18:33:02 1999)
last_wu_time= 2451696.25222 (Wed May 31 18:03:11 2000)
last_result_time= 2452940.93089 (Tue Oct 28 10:20:28 2003)
nwus=1314
nresults=34468
total_cpu=1048056251.518761
params_index=0

Bad user_info.sah:
type=user info
id=-1
key=0
email_addr=
name=
url=
country=
postal_code=
show_name=no
show_email=no
venue=0
register_time= 0.00000
last_wu_time= 0.00000
last_result_time= 0.00000
nwus=0
nresults=0
total_cpu=0.000000
params_index=-1

For those who have some stuck clients, you'll have to replace user_info.sah with a good one, or remove it altogether and restart the client from scratch to get the client running again.
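For anyone with a lot of clients to check, here is a rough Python sketch of the comparison above. It only relies on what the two dumps show: a wiped file has id=-1 and key=0. The function names parse_sah and looks_corrupted are made up for illustration, and treating a missing id or key line as bad is my own assumption, not something the dumps confirm.

```python
def parse_sah(text):
    """Parse the key=value lines of a .sah file into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def looks_corrupted(fields):
    """Return True if the parsed user_info.sah looks like the wiped example.

    Assumption: a missing id or key line is treated the same as id=-1 / key=0.
    """
    return fields.get("id", "-1") == "-1" or fields.get("key", "0") == "0"


if __name__ == "__main__":
    bad = "type=user info\nid=-1\nkey=0\nnwus=0\n"
    print(looks_corrupted(parse_sah(bad)))
```

Point it at the text of each client's user_info.sah; anything it flags is a candidate for the replace-or-remove fix described above.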

TofBnT · Oct 28, 2003

Hmmmmm, maybe that is why we are losing clients.

OhioDude · Oct 28, 2003

I just purged 400 wu's from my seti queue just in case there were any more nasty ones in there waiting to hose up the works.

Assimilator1 · Oct 28, 2003

Looking down Sonis stats this morning I noticed a lot of people's output was low 🙁, including mine, though AFAIK I've got no stalled clients 😕

[edit]
I might have spoken too soon! 🙁
Is this the error?

15:11: s@h_wrk Passthrough: Seti@home status: ErrorCode 0x00000064 100

Damn, that's been going on for a while but I've no idea which client it is!

[edit2] Well, that's a coincidence! One of my co-commanders just phoned me. Guess what he asked about? Yep, a client wouldn't transmit!
Thanks for the info Spacehead, found my stalled client 🙂

OhioDude · Oct 28, 2003

I might have spoken too soon!
Is this the error?

15:11: s@h_wrk Passthrough: Seti@home status: ErrorCode 0x00000064 100

Damn, that's been going on for a while but I've no idea which client it is!

That's the dude!

At the very least, you need to remove the result.sah file, replace the user_info.sah file with a good one, and restart the client.
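If you have more than a couple of clients to fix, the two file steps can be scripted. This is only a sketch in Python: repair_client is a made-up name, the paths are whatever your own setup uses, and it assumes the client is already stopped. It does not restart anything; you still have to do that yourself.

```python
import shutil
from pathlib import Path


def repair_client(client_dir, good_user_info):
    """Remove result.sah and overwrite user_info.sah with a known-good copy.

    Assumption: the S@H client in client_dir has been stopped before this runs.
    """
    client_dir = Path(client_dir)
    result = client_dir / "result.sah"
    if result.exists():
        result.unlink()  # drop the result that will not transmit
    # Copy the good file over the corrupted one (creates it if it is missing).
    shutil.copyfile(good_user_info, client_dir / "user_info.sah")
```

Keep the good user_info.sah somewhere outside the client folders so it cannot get wiped along with the rest.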

Robor · Oct 28, 2003

Well, I had 12 clients that I know were stalled (they had no WU pending). I copied a "good" user_info.sah file over the bad one but I can't remotely restart the service on the clients, or can I? If anyone knows how to do that please drop a quick reply. Oh, I do have domain admin rights.

Assimilator1 · Oct 28, 2003

You'll need to delete the result.sah too 🙁

OD
Already fixed, re: 2nd edit 😉

Smoke · Oct 28, 2003

Yeah, there is more than one of these threads but the problems remain. I don't know which one to post in, so I've posted the following in both. 😛 😀

I've just spent some time with my Q and this is what I found.

There seems to be an unusual number of PASSTHROUGH REQUESTS and these don't seem to make it to Berkeley and the Q times out. In an attempt to avoid these errors I have DISABLED both the OVERRIDE FOR PASS-THROUGH OPERATIONS and DOWNLOAD NEW WUs.

Upon a restart it appeared WUs were being sent in but then the following happened:

Queued passthrough request
Passing through request send_result_get_user_stats
Pass: Seti@home status: Error 0x00000064 100
Returning passing through request response

Then 3 or 4 completed WUs are sent in and the above entries in the LOG appear again.

I thought by disabling PASS-THROUGHs and not DOWNLOADING WUs, this would not happen? Any thoughts?

Assimilator1 · Oct 28, 2003

You've got a client that's stalled 🙁, now you've got to find out which one... or ones! Then delete the result.sah & user_info.sah for that client.
Btw, that's the same error I got earlier.

I just found out that my 2nd rig has been dead in space for nearly a day:|

PraetorianGuards · Oct 28, 2003

Sheesh. I had thought last night it was just another Berkeley outage. Oh well...

Problem is, I've gone through every WU on my home comp and I've yet to find one with a corrupted user info file. All of them look ok to me. Has this happened to anyone else? I'm almost tempted just to restart everything. Maybe I'll move into a queue too 🙂

IsOs · Oct 28, 2003

Originally posted by: PraetorianGuards
Sheesh. I had thought last night it was just another Berkeley outage. Oh well...

Problem is, I've gone through every WU on my home comp and I've yet to find one with a corrupted user info file. All of them look ok to me. Has this happened to anyone else? I'm almost tempted just to restart everything. Maybe I'll move into a queue too 🙂

I usually would delete the entire directory and do a reinstall. If you have lots of clients, you'll be spending a lot of time trying to find out which are the bad workunits. The easy way out is to simply delete, then reinstall. 🙂

Smoke · Oct 28, 2003

Aren't PASS-THROUGH REQUESTS when a NEW CLIENT is requesting a WU for the 1st time?

IsOs · Oct 28, 2003

Originally posted by: Smokeball
Aren't PASS-THROUGH REQUESTS when a NEW CLIENT is requesting a WU for the 1st time?

I think they are actually invoked when a workunit bearing a user_id that's not currently present in the Q is submitted. Since these bad workunits seem to have an unknown user_id, the Q will pass through the request to the main server. Of course, I can't be sure since I never read the source code for SETI Q. 🙂

Smoke · Oct 28, 2003

I think we said about the same thing.

I can't seem to figure out which NEW CLIENT is making the request. Is there any way to determine this?

Rattledagger · Oct 28, 2003

Pass-through?
If you're connecting with a new client that you've logged into with an email address, you get a pass-through.
If you're connecting with a new client or old client or anything that's got an id=1234567 & key=987654321, SetiQueue will automatically recognize this, and no pass-through is necessary. SetiQueue doesn't care if it's a result or a WU request; it automatically configures a new user/queue/client pair if necessary.

Smoke · Oct 28, 2003

Isn't that what I said?

OhioDude · Oct 28, 2003

I can't seem to figure out which NEW CLIENT is making the request. Is there any way to determine this?

The problem is that one (or more) of your clients has a corrupted user_info.sah file. That's why it looks like a new client. I was only able to determine which ones in my fleet were stalled by looking at my seti queue for clients that showed zero wu's pending. Also when looking at the stalled client's S@H folders, there was no state.sah file, either.

I don't know if that helps you or not, but that was my experience with this particular problem.
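The missing state.sah check is easy to automate if your clients sit in folders you can reach from one machine. A rough Python sketch follows, with the loud caveat that the layout is my assumption: one subfolder per client under a single root, each holding its own .sah files. find_stalled is a made-up name, and a missing state.sah is only a hint, so cross-check against the zero-pending list in the queue.

```python
from pathlib import Path


def find_stalled(root):
    """List client folders under root that have user_info.sah but no state.sah.

    Assumption: every client lives in its own subfolder directly under root.
    """
    stalled = []
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        has_user_info = (folder / "user_info.sah").exists()
        has_state = (folder / "state.sah").exists()
        if has_user_info and not has_state:
            stalled.append(folder.name)
    return stalled
```

Folders it reports are the ones worth opening first to look at user_info.sah.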

paf077 · Oct 28, 2003

I have had problems with 2 comps for the last 2 weeks: setidriver just closes on its own, and when I restart them they can't send. I deleted everything and installed it all new; they're ok for 1-2 days and then it starts again. Yesterday my Q reset itself??? I had over 900 WUs in there and when I checked last night I only had 130. What the H*** is going on?? I just might drop everything by the end of the week if this keeps up!

I have no time to go around and reinstall everything!!!! I'm getting my shoulder operation in 2 weeks, so I have to get all my work around the house done by then!

If it wasn't for the MTR I would just flush everything right now.

Paf.

Smoke · Oct 28, 2003

Originally posted by: OhioDude

I can't seem to figure out which NEW CLIENT is making the request. Is there any way to determine this?

The problem is that one (or more) of your clients has a corrupted user_info.sah file. That's why it looks like a new client. I was only able to determine which ones in my fleet were stalled by looking at my seti queue for clients that showed zero wu's pending. Also when looking at the stalled client's S@H folders, there was no state.sah file, either.

I don't know if that helps you or not, but that was my experience with this particular problem.

Using your tip I have discovered the following clients that fit the description:

adsl-67-64-145-244.dsl.rcsntx.swbell.net 4.53 0 0 2h41m 5d 6h35m 0 Client Name: lobadobadingdong
208-151-97-59-cdsl-rb1.fai.acsalaska.net 5.00 0 0 7h10m 3d 9h26m 0 Client Name: AK_Yanqui
208-151-110-117-cdsl-rb1.fai.acsalaska.net 0.11 0 0 7h09m 9d 2h01m 0 Client Name: AK_Yanqui
208-151-99-82-cdsl-rb1.fai.acsalaska.net 0.33 0 0 6h57m 9d 10h32m 0 Client Name: AK_Yanqui
208-151-110-224-cdsl-rb1.fai.acsalaska.net 1.10 0 0 7h07m 10d 9h03m Client Name: AK_Yanqui

I have no idea who "AK_Yanqui" may be. It is definitely someone in Alaska.

I'll contact lobadobadingdong and see if he's having any problems.

lobadobadingdong · Oct 28, 2003

My main cruncher seems to be missing some info at the bottom of the file. How do I fix it?

nm, found it a few threads up.
