Troubleshooting steps for "flaky" machines that don't seem to fail on the test bench?

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
I've got a client, with two machines (from different relatives) that are generically flaky. Both are branded OEM rigs (not the same), both are AMD AM2 DDR2 rigs though (IIRC).

One of them, I tested for several hours, and I couldn't get it to fail. So I replaced the PSU just in case, and sent it on its way. This one got fairly hot during stress-testing, but swapping the Al block heatsink for a fancy four-heatpipe model didn't help, surprisingly.

On the second one, temps skyrocketed during stress-testing, so I replaced the heatsink with a 4-heatpipe one, and temps were back under control. I figured that was the problem, and since I couldn't get it to crash during stress-testing after swapping heatsinks, I returned it.

I got reports after I returned it that the machine was still "shutting off". I got the machine back in a few months later, along with its LCD monitor. I thought that I had found the problem in the monitor's AC cable port, but it turned out that the brand-new cable I had pulled out of my stash to test with was itself defective.

I also got a few blue-screens, and finally pulled out 2x512MB of RAM and then it seemed more stable. I sent it back, with fresh cables for power for both the PC and monitor, and a brand-new VGA cable, in case it was the cables.

Client said it was ok for a while, but then it started acting up again.

I'm not really sure what to do at this point, as I may be getting both of these machines in to work on again soon.

I fear that they may simply have motherboards that are getting old and need replacement.

My current plan is this - to install a fresh HDD or SSD, perform a fresh OS install, and stress-test. If it fails the stress-test, then it would seem to be a hardware problem.

If so, then try a fresh PSU. If it still fails, then it is most likely the mobo, no? Or possibly the RAM. I have another system I know works that takes DDR2 RAM, that I could drop the RAM into and test that way.

If it does seem like the mobo, then I have some spare mobos too. But not exact OEM replacements.
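The elimination order in the plan above can be sketched as a small decision function. This is only an illustration in Python: the `Observations` fields and the `next_suspect` name are made up for this sketch, and a clean pass on the bench does not rule out a fault that only shows up at the client's site.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Hypothetical record of bench results; True means the stress test FAILED."""
    fails_with_fresh_os: bool             # fresh HDD/SSD plus fresh OS install
    fails_with_new_psu: bool              # same setup, but with a fresh PSU
    ram_fails_in_known_good_system: bool  # DDR2 moved into a known-working box


def next_suspect(obs: Observations) -> str:
    """Walk the plan in order and name the most likely remaining suspect."""
    if not obs.fails_with_fresh_os:
        # Hardware held up under a clean install, so look elsewhere first.
        return "software, original drive, or site conditions"
    if not obs.fails_with_new_psu:
        # The failure went away with the replacement PSU.
        return "original PSU"
    if obs.ram_fails_in_known_good_system:
        # The RAM misbehaves even in a machine known to be good.
        return "RAM"
    # Everything swappable has been cleared, which leaves the board (or CPU).
    return "motherboard (or CPU)"
```

For example, `next_suspect(Observations(True, True, False))` returns `"motherboard (or CPU)"`, which matches the reasoning in the post: still failing with a fresh OS and a fresh PSU, with RAM that tests fine elsewhere.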
 

Fardringle

Diamond Member
Oct 23, 2000
9,200
765
126
In my experience, if a system is totally stable at my test station, but not stable at the client's location, it's often a problem with something at their location (bad power, misbehaving appliance on the same circuit, user error, etc). Have them try plugging the computer in at a completely different location in the house for a few days to see if the problem continues on a different power circuit.
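One way to make the "different circuit" test less anecdotal is to have the client jot down how many days the PC spent at each spot and how many times it shut off there, then compare the rates. A rough Python sketch, where the function name and the log format (location, days, shutdowns) are assumptions made for illustration:

```python
def shutdowns_per_day(log):
    """Return the average shutdowns per day for each location.

    `log` is a list of (location, days_observed, shutdown_count) tuples.
    Entries for the same location are added together.
    """
    totals = {}
    for location, days, shutdowns in log:
        seen_days, seen_shutdowns = totals.get(location, (0, 0))
        totals[location] = (seen_days + days, seen_shutdowns + shutdowns)
    # Skip locations with zero observed days to avoid dividing by zero.
    return {
        location: count / days
        for location, (days, count) in totals.items()
        if days > 0
    }
```

A clearly higher rate at the original outlet than elsewhere in the house would point at the power circuit rather than the machine; similar rates everywhere would point back at the hardware.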
 

Smoove910

Golden Member
Aug 2, 2006
1,235
6
81
Since they were both AMD products, did you happen to go into the BIOS and see what the voltage was set at? Hopefully it wasn't set to 'auto'... this could be why they were running hot. Just a thought...
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
Smoove910 said:
Since they were both AMD products, did you happen to go into the BIOS and see what the voltage was set at? Hopefully it wasn't set to 'auto'... this could be why they were running hot. Just a thought...

What's wrong with an "AUTO" voltage setting, for running at stock speeds? These rigs weren't overclocked. In fact, since they are OEM boxes, I doubt very much that they even have a voltage setting at all.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
Fardringle said:
In my experience, if a system is totally stable at my test station, but not stable at the client's location, it's often a problem with something at their location (bad power, misbehaving appliance on the same circuit, user error, etc). Have them try plugging the computer in at a completely different location in the house for a few days to see if the problem continues on a different power circuit.

True. The most likely causes of crashes are power or temperature. I keep my place fairly cool; I wonder if that could affect things much?
 

Sleepingforest

Platinum Member
Nov 18, 2012
2,375
0
76
Perhaps, if you have a heat lamp or something similar, increase the ambient temperature near your PCs to see if heat really is the problem. If they fail that test, then just buy some cheap CPU coolers to mount onto the motherboard or try throwing some spare fans into the case.
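If you do run the heat-lamp test a few times at different room temperatures, a quick check like the following can tell you whether heat actually separates the good runs from the bad ones. This is a Python sketch; the function name and the (ambient temperature in °C, crashed) run format are assumptions for illustration, not a standard tool.

```python
def heat_explains_failures(runs):
    """Check whether every crashing run was hotter than every clean run.

    `runs` is a list of (ambient_c, crashed) pairs from repeated stress tests.
    Returns True if temperature cleanly separates crashes from passes,
    False if some crash happened at or below a temperature that also passed,
    and None if there are no crashes or no passes to compare.
    """
    passing = [ambient for ambient, crashed in runs if not crashed]
    failing = [ambient for ambient, crashed in runs if crashed]
    if not passing or not failing:
        return None  # not enough information either way
    return max(passing) < min(failing)
```

A `False` result means the machine also crashed while cool, so extra coolers or case fans alone are unlikely to fix it.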
 

denis280

Diamond Member
Jan 16, 2011
3,434
9
81
VirtualLarry said:
My current plan is this - to install a fresh HDD or SSD, perform a fresh OS install, and stress-test. If it fails the stress-test, then it would seem to be a hardware problem. If so, then try a fresh PSU. If it still fails, then it is most likely the mobo, no? Or possibly the RAM. I have another system I know works that takes DDR2 RAM, that I could drop the RAM into and test that way. If it does seem like the mobo, then I have some spare mobos too. But not exact OEM replacements.
You could try all of that. I just did about 2 weeks ago. Very frustrating. Anyway, it turned out I had to change the mobo.
 

lakedude

Platinum Member
Mar 14, 2009
2,778
529
126
VirtualLarry said:
Both are branded OEM rigs (not the same), both are AMD AM2 DDR2 rigs though (IIRC).
Those are pretty old machines at this point.

My AM2 DDR2 machine gets flaky. I just re-seat all the connections and it is fine for another few months...

It will get to where it will not even boot but nothing is really wrong with it, just old/bad connections.