
SETI WU that will not finish?

Robor

Elite Member
I've got a WinNT4 Workstation (SP6a) that is 99.997% complete and it will not finish the WU. I look in Task Manager and find that SetiLog and Seti are not listed. "System Idle Process" is 99% and there's nothing abnormal running. I restart the client and it just dies again. I rebooted the box and it does the same thing. I've got a bunch of similar workstations and have never seen this problem. The only thing I can think of is to kill the nearly completed WU and let it download another. Sucks to lose 9-10 hours of process time though.

Rob
 
Robor - almost the exact thing happened to me yesterday...

I run the win 3.03 CLI in WINE on my Red Hat 6.2 and had noticed late yesterday afternoon that this machine hadn't uploaded a result to my Setiqueue since very early that morning. I checked to see if the WU had been one of the low angle ranges (it wasn't). Since WINE runs in a terminal (and I can view any errors), I found that the process was stuck trying to connect to my queue. I killed it multiple times but it still refused to connect. I then stopped and restarted the queue multiple times which also didn't help. Finally I tried having it upload directly to SETI but it still hung trying to connect to them.

I did as you did, creating a new directory and was able to successfully download a new WU.

Yeah you're right... It IS tough to lose not only that approx 5.5 hrs of work (this is on a P3 600@950), but an estimated 2 more results while it sat idle, before I got it going again... 🙁

What made this worse was the fact that my alpha's NIC had some kind of hardware error yesterday that locked it up (I normally run it headless). Even attaching a keyboard and monitor to try to see what was happening and fix it was useless. I ended up having to hard boot it (it had been up for about 6 1/2 months continuously). So there were another 2 lost WUs from that machine too... 🙁

Oh well... must have been due to the full moon. 😉
 
I just blew the entire directory away and re-added the client. I restarted it and now SetiWatch shows it as "state.sah file is empty for \\COMPUTERNAME\Share\state.sah". Hopefully it will finish the WU and recreate the state.sah file properly on the next WU. Before I blew the directory away I checked the date/time on the files. They were last updated on 02/07/00 at 4:11PM (of course, right after I leave work). Looks like I lost about 2-3 WU's then. DOH! 🙁

It's just weird because I've never seen this happen before.

Rob
 
It's just weird because I've never seen this happen before.

It's that full moon babe! Wreaks havoc on people and machines!

😉
 
I had that happen the other day, but I was able to stop transmit, shutdown SetiDriver and restart the process. It then sent the WU.

I had a similar WU-delaying problem the other day. I came to work Monday morning and found SETI sitting idle and asking for my ID. It had finished the last WU on Sat. night. 🙁

I assume this is all related to the new version of SetiQ, but then could it be version 3.03 of S@H?

Either way it stinks, especially now that it takes so much longer to process one WU.
 
JWM - there was a similar thread that got started on Ars today about the same thing... People are now starting to report these types of problems - and in my case, I'm still using the old Setiqueue (0.78b).

I know that over the course of a year, I might have had maybe 1 WU refuse to upload to anywhere out of what became a total of 11 machines that were running SETI at one time before 3.03.

Since 3.03, I've already had a few weird/possibly corrupted ones so far... 🙁

[EDIT: Here is the Ars thread]
 
:| GRRRRRRR! :|

More problems. It's been 24 hours now and that problem client is still reporting that the state.sah file is empty. I haven't checked the box to see if it's actually running because there's a user working on it now. I *NEVER* had a problem like this with any of the previous clients. As much as I hate to have to upgrade 34 clients again, I sure hope they fix this worthless POS soon. I've spent considerable amounts of my own time setting up my work clients and SetiQ so it's very low maintenance. Until recently all I've ever had to do is monitor the fleet with SetiWatch and keep an eye on systems that are powered down or have a program that has hung up (never SETI though). That said, if this buggy client continues to give me problems I will do like rleach2 and pull the damned thing off my work boxes.

Rob
 
Okay, there is calm now. Sorry for the previous venting post. I'm just a little frustrated that this new client is not only slower than the previous one, it's also a little buggy. 🙁

I renamed the computer on the network, changed the SETI directory from \Program Files\Data to \Data, and used fresh copies of Setilog.exe and the SETI client. Started from scratch and now it's working again. SetiWatch shows my fleet humming along nicely. We'll see how it does over the weekend.

Rob
 
Holy crap Robor! You made my heart drop for a minute there! :Q

I'm glad it's sorted now 🙂, hey maybe I could send the babe over to you to soothe you a little? (& reward you for your work for TA SETI) 🙂

Hmm ,I can see the babe is going to be in great demand real soon!

Poof
And back to you 😉
 
I am home for the weekend and checked the SetiSpy log on my PIII-700/910 where I found 1 WU that took 97 hours!!! 🙁 The ones before and after took about 6.5 hours. I was running CLI 3.03 with SetiDriver. I did not have it going through SetiQ, so that is eliminated as the culprit.

I think that we have seen enough of this type of problem that it should be reported. Any ideas??
 
I've started a VLAR thread on Ars.

Here is the hypothesis that I'm trying to prove (although item #1 has pretty much been proven):

1.) The 3.x win CLI processes VLAR WUs (where AR = < 0.1) slower than the mids (where AR > 0.1 but less than 1). Based on the behavior of other clients, VLARs should process faster than mids.

2.) If you are SP or hotfix-less on your windoze, e.g., you did a fresh install and didn't get a chance, didn't want to bother, or were afraid (😉) to put a SP and/or some kind of update on it, you will find that processing times on your VLARs are EXCESSIVE and the reported CPU time in the result.sah file (or from Setiqueue or Setispy, etc.) does NOT match the actual time it took the WU to complete before it uploaded.
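For anyone who wants to check their own logs against the two items above, here is a rough Python sketch. It is only an illustration: the `WorkUnit` fields, the "high" bucket for AR of 1 or more, and the 1.5x threshold for "excessive" are all assumptions made for the example, and it does not read result.sah or any SetiSpy/Setiqueue log format. You would have to fill in the numbers by hand.

```python
from dataclasses import dataclass

# Angle-range cutoffs taken from the post above.
VLAR_MAX_AR = 0.1  # VLAR: AR below roughly 0.1
MID_MAX_AR = 1.0   # mid: AR above 0.1 but less than 1


@dataclass
class WorkUnit:
    """Hypothetical record for one WU; the field names are made up for this sketch."""
    angle_range: float
    cpu_hours: float   # CPU time as reported for the result
    wall_hours: float  # elapsed clock time from start until upload


def classify(angle_range: float) -> str:
    """Bucket a WU by angle range. The 'high' bucket is an assumption."""
    if angle_range < VLAR_MAX_AR:
        return "VLAR"
    if angle_range < MID_MAX_AR:
        return "mid"
    return "high"


def is_excessive(wu: WorkUnit, ratio_threshold: float = 1.5) -> bool:
    """Flag a WU whose clock time is far beyond its reported CPU time.

    The 1.5x threshold is arbitrary; pick whatever suits your fleet.
    """
    return wu.wall_hours > ratio_threshold * wu.cpu_hours


if __name__ == "__main__":
    sample = WorkUnit(angle_range=0.05, cpu_hours=15.0, wall_hours=24.0)
    print(classify(sample.angle_range), is_excessive(sample))
```

A WU that comes out as "VLAR" and also gets flagged by `is_excessive` is the kind of case item #2 is describing: the reported CPU time looks plausible but the clock time does not.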

Now I am completely SP-less and update-less (except for my NT 4 which is at SP 4) in my fleet because I'm basically a Linux user and personally don't give a sh*t about windoze, and right now, only 3 of the 11 machines are booted into windoze.

Those reporting in at Ars who indicate that they are NOT seeing excessive WU times for VLARs (but do see problem #1) are also reporting that they are pretty much "up to date" with the SPs and hotfixes/updates.

I'm trying to get a handle on people who are seeing the excessive WU times and whether they are in fact SP-less...

Anyone?
 
I've got a few boxes at work that take an excessive time (24hr - 32hr) on a P2-350/64MB. Pretty sad that it takes over a day to produce a single WU on these machines when they get one of those tough WU's. Almost all of my Win9x machines are running the install from the CD. We've got a pathetic modem pool for internet access and I don't have the time to download the updates through it. I thought about getting a CDROM and doing them but since we're moving to Win2000 Pro in the very near future I think that'd be a waste of time.

The good news is when we upgrade to Win2K many of the slower boxes will have to be replaced/upgraded. I've got a handful of P166-P233 boxes that I don't bother running the client on because they're too slow. Once they're replaced I should have a fleet of 50+ that will all be P2-300/128MB or better soon. Then, when the Corporate office moves to my building that could easily go up to 75+. I'm not sure how they will feel about the SETI client though. I'm going to have to work on that angle... 😉

EDIT: Assim1... Yes, please send over the babe. I need some soothing! 😱

Rob
 
Robor - your info about the excessive seti times from your old P2s is confirming my hypothesis! If you ever get a chance to track them a little more closely, you would probably note that although it might appear that they take like 32 hrs to complete, the actual time reported for the cpu_time is much less... That's because something is happening where the client is doing a lot of system calls and that takes away CPU time from the seti process.... 🙁

That is, if you run your 9x/ME/NT (not sure about 2K yet) SP-less, then the VLAR problem gets exaggerated even more! :|

As an FYI Robor - I did a bench on my PIII 450 katmai running NT4 SP5 and it completed a "mid" angle range in 12.22 hours. A VLAR would probably have taken just a little bit longer based on the "known" bug. My Xeon II 400 did the same bench in 11.16 hours. I would expect no more than maybe 15-16 hours for your P2s. BUT... although that 15 hour time would be reported in the results file, you would have found that it took almost a day to get that WU completed if you run without a SP!

There's something about the analysis routines that's very different in 3.x than in 2.x, and whatever it is, it's assuming and/or requiring that some "fix" be in place in your windoze... 🙁

I'm still trying to track down exactly what these SPs and hotfixes fix... sigh.

And you guys can do me a favor by smacking Hellburner for ribbing me about this whole thing too... 😉
 
Robor

One of my ships is a PII 350@392 with RAM at CAS 2 ,it runs v3.03 GUI and seems to be averaging 13-14 hrs/ WU.
[update] doh! forget those times ,I've just found out that he's upgraded the cpu on that PC!😱

With the kinda problems you're getting on those machines would it be worth changing that 1 to the GUI version?

BTW I look forward to your upgraded fleet 😀

Poof
ref HB ,eh?😕
 
I'm not going to worry too much about the older machines right now. We're very close to replacing most of them with new DELL's running Win2000 Professional. I believe I've got 4 coming in late next week. I'm going to install the client as a service on them and will probably do the same to my NT4 Workstation boxes. I've noticed no problems so far as far as VLAR WU's with the NT4 systems. They're slower than the 3.0 client but they're usually in the same range as far as completion times.

Rob
 
Hey i can't really answer your question...
but i was wondering...if the new seti was slower...cuz it's taking me 10hours each and before i changed it...it took 8 hours ... and that was before i oc'd my 950 to 1.1

...
 
I had 4 machines sitting idle on Thursday. They were waiting for me to enter my S@H id. 🙁 The log in SetiQ beta3 said they had sent results that were "empty or zero lenght" work units. Thursday was not the first time I'd had this happen, but never on more than one machine at a time. So, now I'm loading SetiDriver with more than 1 WU to see if this keeps the systems from pausing for my manual intervention. BTW: The "empty" WUs did get sent in and I did get credit for them. Any others have this??
 
Losty: Yes, the 3.03 client is significantly slower than previous versions. I've found that the time it takes to complete WU's is about 50% longer on average.

JWMiddleton: The more I read about the beta SetiQ the more I'm glad I've still got the older versions running. I know it leaves a DOS window open and it's rather crude but it gets the job done and it's rock solid as far as stability. Hopefully the new versions will improve in stability but from what I've seen so far they're not there yet.

Rob
 
Assim1 - Yes it's long and will hopefully get longer. We're trying to get to the bottom of this!! According to responses in that thread (particularly from Lawrence Kirby, one of the porters), they know about the problem but it's not really high on their priority list. Well guess what? That's tough. A bunch of us (and this is growing) are BOYCOTTING VLARs.

JWM - a number of people have reported (it happened to me too) problems with random WUs that seem to process fine but won't upload at all... This may be different from your problem but we're attributing it to instability in the 3.03 win CLI.

I have ntop (a network version of the *nix "top" program that monitors running processes) running here 24/7 and may take a look at what's happening when a 3.03 client communicates with and uploads/downloads to/from Setiqueue. I'm still running the old 0.78b version mainly because I haven't had time to putz with the new one and I prefer the simple text interface that once configured and working, is rock solid as Robor has said, and just runs... I have mine open next to my pproxy console window and both of them do what I need them to do - serve WUs/blocks/stubs.

Robor - can I assume that your NT 4s are all patched up with some SP? The reason I ask is because I am pretty much convinced based on a number of reports now, that if your windoze is patched (ie., not a fresh or un-updated install), the VLAR problem is reduced (although still very much there), as compared to un-patched or non-SP'd windoze installations, where you would experience what you're experiencing with the PIIs - excessive run times for VLAR WUs.
 
Poof: Just looked at the first part of your ARS thread and saw my name. So, I will get the data for you in the next few days. I am at the cabin now and am using a P133 laptop (running RC5). Will collect my logs when I get home and take them back to Shreveport tonight. On Friday I noted on one machine that I had two VLAR WU entries in the SetiSpy log. Don't know if the logs will give me clock time and CPU time, but I'll check.

As far as a VLAR boycott - that p!sses me off! :|:| Every time you guys discard or hold a VLAR WU, it gets resent to the rest of us in a week or so. That causes us to have even more! :|:| In a normal period I've noted that about 12% are VHAR, and about 7% are VLAR. But the number of VLAR WUs goes up when others don't process them. This happened back in December when the VLAR info first came to light. Since then 3.03 has been overshadowing the VLAR issue. Now that it is back in the limelight we will see it again if you guys continue. This says to me that the boycotters are more interested in STATS! :|:|
 