How to troubleshoot random crashes?

antsct

Senior member
Sep 22, 2005
265
0
0
I have tried to provide as much info as possible, sorry for the long post.

I have an old Socket A system that I use as my spare PC. Specs are:

AMD OEM Sempron 3300+ (Barton, 400MHz FSB), 2.2GHz (exactly the same chip as the XP 3200+)
Custom heatsink/fan combo (very big; I can't remember exactly what it is, I think Thermaltake)
Abit AN7 motherboard with latest BIOS
2GB Kingston DDR400
256MB Sapphire 2600XT AGP
320GB SATA Seagate hard drive, 16MB cache
Windows XP Pro SP2

Here's my problem.

Up until recently, the system had a Mobile Athlon overclocked to 1.8GHz. It was 100% stable and never crashed on me once. I managed to get my hands on a brand new Sempron very cheaply. I was looking for a 3200+ but couldn't find any worth paying for, so I decided to go for this Sempron (a very rare processor, but with exactly the same specs as the 3200+: 2.2GHz, 400MHz FSB, 512KB L2 cache). I was a bit worried about getting this processor because it wasn't officially supported by the motherboard, although Sempron support was added in a later BIOS release (which I have flashed to). I read online that many people have got this CPU to work in this motherboard, except that it shows up as a 3200+, which is the same for me.

Anyway, I popped the CPU in, the computer POSTs perfectly, and I definitely noticed a speed increase in various tasks, games, etc.

There's just one problem. At any point, the system will just crash. My monitor display goes black and the monitor's LED turns from green to orange. I can still hear all the components running (such as the fans), but the HDD LED on the case stops as well.

I can't pinpoint what's causing the crash. It just happens out of nowhere. It first started when I played CS:Source. After about 10 mins I got a BSOD; the game ran fine before that. The second time I played CS:S I got another BSOD. There were no error codes, just a random memory reference that was different on each occasion. However, the BSOD is gone now, and all I get is a blank display.

My CPU temp is at around 42 - 48°C, depending on load. I used a stress test to run the CPU @ 100% usage for half an hour with no crashes. I was also getting crashes in Live For Speed. Once I opened Live For Speed and it crashed within 5 secs, yet today I let the game run for about an hour, and when I came back to the computer it was still running fine. By the way, Unreal Tournament 3 has never crashed.

However, it's not only games. I may be surfing the web for like 2 hours, then I'll just visit any random website (such as the Live For Speed forums) and boom, the system crashes: blank display, fans and lights still going (except the HDD light), and my monitor's LED turns orange. Just now, I've had the system running for around 3 hours or so, went to play LFS, was in the game for like 10 secs and then it crashed. Last night I managed to play LFS for around 30 mins without any problems.

I just can't seem to work out what it is, or how to fix it, and it just occurs out of nowhere. It's extremely frustrating.

By the way, I have virus scanned using AVG Free... no viruses found. As I said, this only started happening with the new CPU.

Any help would be GREATLY appreciated!
 
Nov 26, 2005
15,194
403
126
It could be the motherboard starting to die. Test the CPU again for longer (3-4 hrs) with Prime95 or OCCT. If it crashes, swap in the old CPU that you know works and test again; if that crashes too, it strongly points to the mobo or something other than the CPUs.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,260
16,117
136
The motherboard BIOS may not support that chip. Check the Abit site, and possibly update the BIOS. I had a similar problem with one of my setups, and a BIOS flash fixed it!
 

ther00kie16

Golden Member
Mar 28, 2008
1,573
0
0
Crashes whenever you play games? I'd say check the video card as well. Check the temps, as some 2600XT cards run warm. If you have a spare video card, try swapping that in and see if it's stable in games. It could be that the extra heat output of the CPU is choking the video card.
But yeah, the first thing to do would be to update the BIOS.
 

Foxery

Golden Member
Jan 24, 2008
1,709
0
0
My CPU temp is at around 42 - 48°C, depending on load. I used a stress test to run the CPU @ 100% usage for half an hour with no crashes.

Yep, you should run a program designed for stress testing; a game doesn't help you isolate the source of the problem. Half an hour is nowhere near enough time, either - OCCT's shortest setting is an hour, and the author suggests 4h.

I see you've checked for BIOS updates, but they can only do so much. Can you verify that this motherboard supports this CPU?
 

ShadowFlareX

Member
May 6, 2008
150
0
0
This sounds awfully similar to my past experience working in desktop support at a company that used Dell PCs, thousands of them. At one point those Dells started behaving like what you described. After a little investigation, it turned out that some capacitors were either blown or leaking. The signs are very visible: the top of the capacitor bulges as if it's about to blow up (but it doesn't!).

So check the capacitors all over your motherboard and see if any of them are bulging. If they are, there's nothing you can do about it; the motherboard is pretty much unusable. If you can't find any, then it must be something else.
 

nyker96

Diamond Member
Apr 19, 2005
5,630
2
81
Run Orthos + memtest before anything else and see what the result is, just to rule out the CPU/RAM. I had something like this as well: I built a machine last year, it started having random resets, and I found out it was the HSF going bad. The temp shot up and the CPU initiated an automatic shutdown. But my gut feeling is that your problem is different from mine, because your temps seem fine.

So, first things first: run these two to rule out CPU + memory so you have less to worry about. Also check the temp while Orthos is maxing out the CPU to see whether it gets crazy high or not.
 

antsct

Senior member
Sep 22, 2005
265
0
0
Thanks for the help so far!

I ran memtest86 and there were no errors.

I also ran Prime95 using the 'Small FFTs' test for 3 hours and had no errors. HOWEVER, when using the 'In-place large FFTs' test, after around 14 minutes I received the following error:

FATAL ERROR: Rounding was 0.4999990463, expected less than 0.4
Hardware failure detected, consult stress.txt

Does anyone know what this means?
 

betasub

Platinum Member
Mar 22, 2006
2,677
0
0
The error report is typical for a Prime95 failure, but it's the fact that you've passed memtest86 and Small FFT test that makes the Large FFT failure unusual.

You have a hardware fault, that's for sure, but you haven't yet narrowed it down to CPU, L2 or memory - you need to run the tests for far longer. You can try underclocking CPU/memory or increasing their voltage (Vcore/Vdimm) to seek stability.
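
As a rough illustration of what that rounding check is doing (my own sketch, not Prime95's actual code): the FFT-based multiplication produces floating-point values that should each land very close to a whole number, and the program flags a fault when any of them strays too far. The function name and the sample values are made up; the 0.4 limit and the 0.4999990463 figure come from the error message quoted above.

```python
def max_roundoff(values):
    """Largest distance from any value to its nearest integer (between 0.0 and 0.5)."""
    return max(abs(v - round(v)) for v in values)

LIMIT = 0.4  # threshold named in the Prime95 error message

# Made-up sample data: results that should all be (nearly) whole numbers.
healthy = [12.0000001, 7.9999998, 3.0000002]
faulty = [12.0000001, 7.5000009537, 3.0000002]  # one value almost exactly halfway

for name, values in (("healthy", healthy), ("faulty", faulty)):
    err = max_roundoff(values)
    status = "OK" if err < LIMIT else "hardware error suspected"
    print(f"{name}: max round-off {err:.10f} -> {status}")
```

A round-off near 0.5 means the result is about as far from a whole number as it can possibly be, which is why that error is treated as a hardware fault rather than normal floating-point noise.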
 
Oct 20, 2005
10,978
44
91
Put the original CPU back in (the one that worked with the 100% stable system) and run memtest for about an hour, then run Prime95 for a while (like overnight).

If it returns no errors, then it's almost guaranteed that the new Sempron is the culprit, provided you haven't changed any other components.
 

antsct

Senior member
Sep 22, 2005
265
0
0
Thanks again everyone!

I think I've made a new discovery...

When I was originally given the Mobile Athlon CPU and custom heatsink/fan, I was told it could easily manage (200 x 11) 2.2 - 2.4GHz (2.7GHz on watercooling). However, whenever I pushed it above 1.9GHz, the system would boot up fine, but no applications or games would open. Yet as soon as I bumped it down to 1.8GHz, everything opened fine and I could play games etc. without any problems.

So I was curious to see what would happen if I ran the Sempron 3300+ at 1.8GHz (166 x 11). So far, it hasn't crashed once. I'm going to try running it like this for a couple of days and see how it goes. I also ran Prime95 using the large FFTs test for around 40 mins (that's all the time I had) and it didn't produce any errors.

Later on I'm going to try running it at 1.9GHz - 2GHz and see if it crashes. However, with the multiplier locked at 11, that means using a non-standard FSB rather than 133/166/200. Is it safe to use an FSB of around 185MHz or so?
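
For reference, the core clock here is just FSB times multiplier, so with the multiplier locked at 11 the options work out as in this quick sketch. The helper name is only for illustration, and the FSB values are the nominal ones mentioned in this post (185 being the one I'm asking about).

```python
# Core clock (MHz) = FSB (MHz) x multiplier. Illustrative helper, not from any tool.
def core_clock_mhz(fsb_mhz, multiplier):
    return fsb_mhz * multiplier

MULTIPLIER = 11  # locked on this Sempron, as noted in the post

# Standard FSB settings plus the in-between 185 MHz value being considered.
for fsb in (133, 166, 185, 200):
    print(f"{fsb} MHz x {MULTIPLIER} = {core_clock_mhz(fsb, MULTIPLIER)} MHz")
```

So hitting roughly 2GHz means an FSB somewhere around 180 - 185MHz, which sits between the standard 166 and 200 settings.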

I have a very strong feeling that the problem is the motherboard and that it doesn't like CPUs clocked above 1.9GHz. For what reason, I have no idea; could it be showing its age? Thankfully I have another AN7 in the house, so I might even try the Sempron at its default 2.2GHz on that board and see how it goes.

Any other input would be great. Thanks.