Originally posted by: BlackMountainCow
Yes, they will fail from their variables as well. But that'll give you a different error message. If they fail from variables, you'll see that they started over about 3 times from their last working checkpoint. If they can't continue with their model (let's say the world turned into an iceball or a fireball) they'll abort the current WU and load another one. The stress problem errors, though, will just say something like "computing error" or "unrecoverable error" in model xxxx. Also, the WU output file you can look at under "your tasks" on the CPDN site will be way different. How would I know? I've been there.
In theory, when there's a problem CPDN should first try the daily checkpoint, next the monthly, and lastly the yearly checkpoint. But even if you've got a "bad" model, it doesn't always try the checkpoints, and some of the error messages can just as easily mean a "bad model" as an "unstable computer"...
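To make the fallback order concrete, here is a minimal Python sketch of "daily, then monthly, then yearly". It is not CPDN's actual code: the names `FALLBACK_ORDER`, `resume_from_checkpoint` and the `try_restart` callback are all made up for illustration, and as said above the real client doesn't always go through the checkpoints like this.

```python
from typing import Callable, Optional, Sequence

# Order described above: most recent checkpoint first.
# (Hypothetical names; the real client's internals are not shown here.)
FALLBACK_ORDER: Sequence[str] = ("daily", "monthly", "yearly")


def resume_from_checkpoint(try_restart: Callable[[str], bool]) -> Optional[str]:
    """Return the first checkpoint kind that restarts cleanly, or None.

    try_restart is a stand-in for "reload this checkpoint and see whether
    the model runs again"; it returns True on success.
    """
    for kind in FALLBACK_ORDER:
        if try_restart(kind):
            return kind
    # Nothing worked: the model cannot continue from any checkpoint.
    return None
```

With a callback that only succeeds for the monthly checkpoint, the function tries "daily" first, fails, and then settles on "monthly".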
No idea how many of the CPDN WUs are "bad", but I wouldn't be surprised if the percentage is higher than in other projects like Folding@Home and SETI@home. So a single WU error, at least, doesn't necessarily mean there's anything wrong with your computer. But if you, for example, run 4 models simultaneously on a quad box and all 4 models crap out after maybe a day or so, bad hardware seems more likely.
If you're getting multiple model crashes, one way to test this is: when you've downloaded some WUs, stop and exit BOINC completely, and make a backup copy of the whole BOINC directory plus sub-directories. Now, in case one or more models crash within a day or so, try the backup copy...
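The backup step can be done by hand in a file manager, but as a rough sketch in Python it could look like this. The function names are made up, the paths are whatever your own install uses, and the script assumes BOINC has already been stopped and exited completely; it does not check that for you.

```python
import shutil
from pathlib import Path


def backup_boinc_dir(boinc_dir: Path, backup_dir: Path) -> Path:
    """Copy the whole BOINC directory tree, sub-directories included."""
    if backup_dir.exists():
        # Never overwrite an older backup by accident.
        raise FileExistsError(f"{backup_dir} already exists; pick a fresh name")
    shutil.copytree(boinc_dir, backup_dir)
    return backup_dir


def restore_boinc_dir(backup_dir: Path, boinc_dir: Path) -> None:
    """Replace the live directory with the backup (BOINC must be stopped)."""
    if not backup_dir.is_dir():
        # Check first, so the live directory is not deleted for nothing.
        raise FileNotFoundError(f"no backup found at {backup_dir}")
    if boinc_dir.exists():
        shutil.rmtree(boinc_dir)
    shutil.copytree(backup_dir, boinc_dir)
```

Restoring copies the backup rather than moving it, so the same backup can be re-run more than once.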
If the backup copy crashes on the same day or so, no problem, you've got a bad model. But if, for example, the same model crashes once in April, the next time in February, and then you revert to the same backup and get a crash in October, there's a good chance you've got an unstable computer...
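The rule of thumb above boils down to comparing where each run of the same model crashed. A small Python sketch, with the hypothetical function name `diagnose`; it uses an exact match for simplicity, whereas the "same day or so" above allows some slack:

```python
from typing import Sequence


def diagnose(crash_points: Sequence[str]) -> str:
    """Guess the likely cause from where each run of one model crashed.

    crash_points holds one entry per run of the same model (original run
    plus re-runs from the backup), e.g. the date at which it crashed.
    """
    if len(crash_points) < 2:
        # A single crash says nothing either way.
        return "inconclusive"
    if len(set(crash_points)) == 1:
        # Crashes at the same point every time: blame the model.
        return "bad model"
    # Crashes wander around between runs: suspect the hardware.
    return "unstable computer"
```

For the example above, `diagnose(["April", "February", "October"])` gives "unstable computer", while two crashes at the same point give "bad model". It's only a hint, not proof.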
Does finishing a CPDN model successfully mean you'll have no problems running any other DC project? No, different DC projects use CPU and memory differently, and running one DC project error-free for some months doesn't mean another won't find the "weakest spot" in a system and push it over the edge...
This holds regardless of "but I've run Folding@Home stable for N months" or "I've finished N CPDN models": another DC project can still show that the "stable" overclock really is "unstable". It's less common, but even non-overclocked computers can turn out to be "unstable"...
As for which of Folding@Home and CPDN is more "stressful", I'm not certain, but I'd guess that even with the 90% reduction, CPDN is still hitting the hard disk harder than Folding@Home, with the possible exception of when you've configured checkpointing to once per minute... As for memory and CPU, I'd guess this depends on the WU type...
In any case, an interesting experiment would be for everyone who runs FAH on a "stable" overclocked system to run CPDN on all cores for 2 weeks or so ("fast" models should finish in 2 weeks on a fast computer) and see whether their system is still "stable" (use the backup method and re-run in case of a model crash), while at the same time everyone who runs CPDN on a "stable" overclocked system runs FAH for two weeks...