F@H completed WU didn't upload

trevinom

I was having problems with my machine just as it was finishing a Gromacs core. When I turned it back on, it said it couldn't find the work files and everything would be deleted. It then loaded a new Tinker core, and I haven't seen it send the completed Gromacs WU or give me credit for it. What do I do? Here is the log output:

...
[03:24:47] Project: 543 (Run 6, Clone 7, Gen 30)
[03:24:47]
[03:24:47] Assembly optimizations on if available.
[03:24:47] Entering M.D.
[03:25:07] (Starting from checkpoint)
[03:25:07] Protein: p543_BBA5_ext
[03:25:07] Writing local files
[03:25:07] Completed 500000 out of 500000 steps (100)
[03:25:08] Extra 3DNow boost OK.
[03:25:10] Writing final coordinates.
[03:25:10] Past main M.D. loop

Folding@home Client Shutdown.


--- Opening Log file [April 16 03:28:11]


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 4.00

http://folding.stanford.edu

###############################################################################
###############################################################################



[03:28:11] - Ask before connecting: No
[03:28:11] - User name: Martin_Trevino (Team 198)
[03:28:11] - User ID = 3FC9E9705D15ABBE
[03:28:11] - Machine ID: 1
[03:28:11]
[03:28:12] Loaded queue successfully.
[03:28:12] + Benchmarking ...
[03:28:15]
[03:28:15] + Processing work unit
[03:28:15] Core required: FahCore_78.exe
[03:28:15] Core found.
[03:28:15] Working on Unit 03 [April 16 03:28:15]
[03:28:15] + Working ...

Folding@home Client Shutdown.


--- Opening Log file [April 16 03:38:47]


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 4.00

http://folding.stanford.edu

###############################################################################
###############################################################################



[03:38:47] - Ask before connecting: No
[03:38:47] - User name: Martin_Trevino (Team 198)
[03:38:47] - User ID = 3FC9E9705D15ABBE
[03:38:47] - Machine ID: 1
[03:38:47]
[03:38:48] Loaded queue successfully.
[03:38:48] + Benchmarking ...
[03:38:50]
[03:38:50] + Processing work unit
[03:38:50] Core required: FahCore_78.exe
[03:38:50] Core found.
[03:38:50] Working on Unit 03 [April 16 03:38:50]
[03:38:50] + Working ...
[03:38:51]
[03:38:51] *------------------------------*
[03:38:51] Folding@home Gromacs Core
[03:38:51] Version 1.56 (February 2, 2004)
[03:38:51]
[03:38:51] Preparing to commence simulation
[03:38:51] - Looking at optimizations...
[03:38:51] - Created dyn
[03:38:51] - Files status OK
[03:38:51]
[03:38:51] Folding@home Core Shutdown: MISSING_WORK_FILES
[03:38:55] CoreStatus = 74 (116)
[03:38:55] The core could not find the work files specified. Removing from queue
[03:38:55] Deleting current work unit & continuing...
[03:38:59] - Preparing to get new work unit...
[03:38:59] + Attempting to get work packet
[03:38:59] - Connecting to assignment server
[03:39:00] - Successful: assigned to (171.64.122.143).
[03:39:00] + News From Folding@Home: Welcome to Folding@Home
[03:39:00] Loaded queue successfully.
[03:39:01] + Closed connections
[03:39:06]
[03:39:06] + Processing work unit
[03:39:06] Core required: FahCore_65.exe
[03:39:06] Core found.
[03:39:06] Working on Unit 04 [April 16 03:39:06]
[03:39:06] + Working ...
[03:39:08] Folding@home Client Core Version 2.52 (February 10, 2004)
[03:39:24]
[03:39:24] Proj: work/wudata_04
[03:39:24] Done: 22940 -> 142973 (decompressed 623.2 percent)
[03:39:24] nsteps: 5000000 dt: 2.000000 dt_dump: 250.000000 temperature: 298.000000
[03:39:24] xyzfile:
[03:39:24] " 393 p638_L939_K12M_ext
[03:39:24] 1 N 151.224627 -54.925552 -27.925702 20..."
[03:39:24] keyfile:
[03:39:24] "parameters ./proj638.prm
[03:39:24] NOVERSION
[03:39:24] ARCHIVE
[03:39:24]
[03:39:24] cutoff 16.0
[03:39:24] taper 12...."
[03:39:24]
[03:39:24] - Couldn't get size info for dyn file: work/wudata_04.dyn
[03:39:24] Starting from initial work packet
[03:39:24]
[03:39:24] Protein: p638_L939_K12M_ext
[03:39:24] - Run: 196 (Clone 0, Gen 36)
[03:39:24] - Frames Completed: 0, Remaining: 400
[03:39:24] - Dynamic steps required: 5000000


These are the files that I have in my work directory:

Volume in drive C is EXTRA
Volume Serial Number is 805C-26FA

Directory of C:\working\seti\work

04/16/2004 12:03 AM <DIR> .
04/16/2004 12:03 AM <DIR> ..
04/16/2004 12:03 AM 26,789 current.xyz
04/16/2004 12:03 AM 0 dir.txt
04/16/2004 12:03 AM 1,227 logfile_04.txt
02/06/2004 11:18 PM 174 wudata_02.bxv
03/05/2004 10:12 PM 0 wudata_02.goe
03/05/2004 10:12 PM 0 wudata_02.sas
02/06/2004 11:18 PM 634,224 wudata_02CP.arc
04/16/2004 12:03 AM 52,922 wudata_04.arc
04/16/2004 12:03 AM 72 wudata_04.chk
04/15/2004 11:39 PM 23,452 wudata_04.dat
04/16/2004 12:03 AM 126,230 wudata_04.dyn
04/15/2004 11:49 PM 117,010 wudata_04.key
04/16/2004 12:00 AM 112,640 wudata_04.log
04/15/2004 11:49 PM 116,710 wudata_04.prm
04/15/2004 11:49 PM 26,001 wudata_04.xyz
04/16/2004 12:03 AM 512 wuinfo_04.dat
04/15/2004 11:29 PM 1,106,972 wuresults_03.dat
17 File(s) 2,344,935 bytes
2 Dir(s) 99,334,077,952 bytes free

Can I recover?

This is what I get when I do -queueinfo:

--- Opening Log file [April 16 09:51:42]


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 4.00

http://folding.stanford.edu

###############################################################################
###############################################################################

Arguments: -queueinfo

[09:51:42] - Ask before connecting: No
[09:51:42] - User name: Martin_Trevino (Team 198)
[09:51:42] - User ID = 3FC9E9705D15ABBE
[09:51:42] - Machine ID: 1
[09:51:42]
[09:51:42] Loaded queue successfully.
[09:51:42] Printing Queue Information
CURRENT QUEUE:
00 EMPTY
01 EMPTY
02 EMPTY
03 EMPTY
04 *READY "Folding@Home" (65) 171.64.122.143:8080 April 16 03:39:01 | June 5 03:39
05 EMPTY
06 EMPTY
07 EMPTY
08 EMPTY
09 EMPTY

Folding@home Client Shutdown.


This is what I get when I do -send all :

--- Opening Log file [April 16 09:54:35]


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 4.00

http://folding.stanford.edu

###############################################################################
###############################################################################

Arguments: -send all

[09:54:35] - Ask before connecting: No
[09:54:35] - User name: Martin_Trevino (Team 198)
[09:54:35] - User ID = 3FC9E9705D15ABBE
[09:54:35] - Machine ID: 1
[09:54:35]
[09:54:35] Loaded queue successfully.
[09:54:35] Attempting to return result(s) to server...

Folding@home Client Shutdown.
 

ProviaFan

Try "fah4console -queueinfo" to see if it still lists the WU in the queue. If not, I'm afraid there's not much hope of recovery. :(
 

GLeeM

Something similar happened to me, and jliechty said to try the -queueinfo option in the shortcut to see if there are unsent results, which is what the wuresults_03.dat file is. He said to then try the -send all option, and it worked for me. I also had to use the -local option because I run two clients w/HT.

If this doesn't work, wait for somebody who knows this stuff well.

Good luck!
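
For anyone who wants to script that check, here is a rough Python sketch that compares a -queueinfo printout with the file names in the work directory and reports any wuresults_NN.dat file whose queue slot is listed as EMPTY. It assumes the printout looks like the one posted in this thread, and it assumes the NN in the file name is the queue slot number (that matches "Working on Unit 03" in the log, but treat it as a guess). The function names are made up for this example. It only reports a mismatch and does not recover anything.

```python
import re

def occupied_slots(queueinfo_text):
    """Slot numbers that the queue printout does not list as EMPTY."""
    slots = set()
    for line in queueinfo_text.splitlines():
        # Queue rows look like "03 EMPTY" or "04 *READY ..." (format assumed from this thread).
        m = re.match(r"\s*(\d{2})\s+(\S+)", line)
        if m and m.group(2) != "EMPTY":
            slots.add(int(m.group(1)))
    return slots

def result_slots(filenames):
    """Slot numbers that have a wuresults_NN.dat file in the work directory."""
    found = set()
    for name in filenames:
        m = re.fullmatch(r"wuresults_(\d{2})\.dat", name)
        if m:
            found.add(int(m.group(1)))
    return found

def orphaned_results(queueinfo_text, filenames):
    """Result files whose slot is EMPTY (or missing) in the queue printout."""
    return sorted(result_slots(filenames) - occupied_slots(queueinfo_text))

# Sample data copied from the first post.
SAMPLE_QUEUE = """CURRENT QUEUE:
02 EMPTY
03 EMPTY
04 *READY "Folding@Home" (65) 171.64.122.143:8080 April 16 03:39:01 | June 5 03:39
05 EMPTY
"""
SAMPLE_FILES = ["wudata_04.dat", "wuinfo_04.dat", "wuresults_03.dat"]

if __name__ == "__main__":
    print(orphaned_results(SAMPLE_QUEUE, SAMPLE_FILES))  # prints [3]
```

With the data from the first post, slot 03 has a results file but no queue entry, which would explain why -send all had nothing it knew how to send.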
 

r0tt3n1

Hate to say this, but that Gromacs WU is lost.... :(
From the logs we can see that the unit finished but was interrupted just as it was writing the final data to disk, thus corrupting the data. Stanford needs A-1 choice data from its finished WUs, so the client takes no chances and deletes any suspect WUs.
The queue info shows the new WU you downloaded; you can tell by the time it was received. I see no trace of the previous WU.
When you used the -send all flag, it probably didn't send anything? I ask because the log file seems truncated; there should have been a message saying there was nothing to send, or that something was in fact sent (I doubt it).
Sorry, but it looks like it's a goner. Just yesterday, one of my machines had a brain fart. It didn't lose the WU, but the damn thing started from the beginning again! Arrrrgg.
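
To make that reading of the log easier to repeat, here is a small Python sketch that pulls out the three lines that matter: the step counter reaching its total, the core shutdown reason, and the client deleting the unit. The message patterns are taken only from the log pasted in the first post, so other client or core versions may word them differently. The function name and event labels are invented for this example.

```python
import re

# A timestamped client log line, e.g. "[03:38:51] Folding@home Core Shutdown: MISSING_WORK_FILES"
LINE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\]\s*(.*)")

def key_events(log_text):
    """Return (timestamp, label) pairs for the lines that show what happened to a unit."""
    events = []
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # banners and "Client Shutdown" lines carry no timestamp
        stamp, msg = m.groups()
        done = re.match(r"Completed (\d+) out of (\d+) steps", msg)
        shutdown = re.match(r"Folding@home Core Shutdown: (\w+)", msg)
        if done and done.group(1) == done.group(2):
            events.append((stamp, "steps_finished"))
        elif shutdown:
            events.append((stamp, "core_shutdown:" + shutdown.group(1)))
        elif msg.startswith("Deleting current work unit"):
            events.append((stamp, "unit_deleted"))
    return events

# Lines copied from the log in the first post.
SAMPLE_LOG = """[03:25:07] Completed 500000 out of 500000 steps (100)
[03:25:10] Writing final coordinates.
Folding@home Client Shutdown.
[03:38:51] Folding@home Core Shutdown: MISSING_WORK_FILES
[03:38:55] Deleting current work unit & continuing...
"""

if __name__ == "__main__":
    for stamp, label in key_events(SAMPLE_LOG):
        print(stamp, label)
```

On the posted log this gives steps_finished at 03:25:07, then core_shutdown:MISSING_WORK_FILES at 03:38:51 and unit_deleted at 03:38:55, which is the sequence described above.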
 

trevinom


It did not send anything when I used the -send all flag. The log is not truncated; it ended when it disconnected. No message about anything to send appeared, just 'client shutdown', and that was it.

Bummer.

I guess it will just set me back a few hours in overtaking Scottv :)
It could have been worse. It could have been one of those 5 meg monsters.


Thanks for all your input, guys.