Strange Lockup Problem

morphon

Junior Member
Feb 6, 2008
12
0
0
Howdy!

I'm trying to track down the culprit for a very strange lockup problem I have on this new build.

The parts:
FX-8350 (no OC)
ASUS M5A99FX Pro 2.0 MB
16GB Crucial Ballistix DDR-1600
GeForce 560ti (the sole carryover from my previous computer)
Cooler Master Seidon 120M Close-loop liquid cooling
500W 80+ P/S from OCZ (draw at the wall under load, WITH the 28" monitor, is 377W, which sounds like it's within the proper range)
Battery Backup
Win8 x64

Everything is run at stock speeds, everything in the BIOS set to AUTO (no auto OC or anything, just auto)

It runs great. I can run Prime95 all day without the core temps going above 40C. But... and here's the strange part... it will crash once every couple of days. It doesn't BSOD or reset. It just STOPS. I have to hit the reset button, but then it comes up just fine, and is good for the next day or two, and then it hangs again. There doesn't seem to be any pattern to it. Sometimes I'm watching a video, or typing in Word, or playing a game, though it's usually while gaming.

Ok, now for the troubleshooting tests:

1. Refresh Win8 so it starts with a clean hardware profile, then re-install drivers for video, mouse, etc...

2. PRIME95 for a long time. NP.

3. Memtest 86+ - CRASHES. But only when I run it in multicore mode. If I restrict it to one core the memory tests fine. But if it is in multicore it resets the computer. It's not a system hang like it normally does. The whole thing powers down, then powers back up. It's longer than a normal reset. It's like somebody turned it off from the wall, waited one second, then turned it on again. Hmmmm....

4. 3DMark - CRASHES. But only if I turn off VSYNC. If VSYNC is on, it finishes just fine. If VSYNC is OFF, then it will do a system hang during the first demo (and usually in the exact same place each time). I checked the whole HPET thing that they have on their site, but that was set correctly in BIOS, so I dunno what to say about that one.

So, at this point I'm trying to narrow this down. Usually when I'm having crashes that happen mostly within games I point at the video card, especially when I'm not getting errors with PRIME95. But it's a card I've used for over a year with no issues at all, so I hesitate on that one. That leaves memory, CPU, and MB.

Or maybe something else I'm overlooking. My next step is to pull one of the memory sticks and see if I can get memtest to run stable on all 8 cores, but with only one stick.

Anyone have any other things to try?
 

Bubbaleone

Golden Member
Nov 20, 2011
1,803
4
76
That really sounds like a PSU issue given that you're not getting any BSODs and the shutdown/restart scenario you're describing. Got another PSU you can test with?

 

Steltek

Diamond Member
Mar 29, 2001
3,486
1,244
136
I agree - that type of shutdown sequence just screams bad power supply. Not to mention, OCZ isn't the most reliable brand to begin with.

Have you tried resetting the CMOS memory? Also, are you running the most recent BIOS available on the board? You also didn't mention what type of main drive you are running - do you have an SSD or a regular hard drive? If you are running an SSD, is the firmware updated to the most recent available?
 

morphon

Tried resetting CMOS using the jumper, same issues. I'm using the very newest BIOS revision and drivers for chipset and video. I'm using plain jane 2TB 7200RPM drives.

Some new wrinkles here...

I don't have a different power supply to try out, but I suppose I could go buy one. It came from my last computer, so it's used to seeing this video card and drives. The older chip was an Athlon-II 630 (entry level quad at the time) paired with a MSI 890X, so I suppose it's possible that the load from this processor/MB/memory is so different that the PS is just not able to do its work reliably.

I've spent an interesting evening working with memory and Memtest. I upgraded to the newest (5.0) and the multicore crashing thing went away. I'm testing one stick right now, which so far is coming out pretty clean. If it finishes this second cycle I'm going to try the 3DMark test again and see if it crashes in that same spot, a different spot, or is crash free. Then I'll do the same thing with the other stick.

I certainly don't mind replacing things, as long as I know what to replace. :-/

Y'all's help in tracking this down is much appreciated. If the memory checks out, is the next step PS?
 

Bubbaleone

There's certainly no harm in testing other components but, if the PSU is headed south, how can you know if a memory or drive error that gets thrown isn't due to bad voltage? IMHO you'll be doing yourself a favor to invest in a quality 650W PSU like those offered by Silverstone, XFX, Seasonic, or Corsair for example.

 

morphon

Ok, I'll stop by MicroCenter tomorrow. BTW, I LOVE MicroCenter. <3 <3

Any of these look reasonable?

http://www.microcenter.com/product/404601/SuperNOVA_NEX650G_650_Watt_ATX_12V_Power_Supply

http://www.microcenter.com/product/...00_2013_Edition_600_Watt_ATX_12V_Power_Supply

http://www.microcenter.com/product/407832/ZU-600B_600W_Modular_ATX_Power_Supply
 

Bubbaleone

I think the GS600 would work well, especially if you need a lot of SATA connectors, but keep in mind that it's non-modular and neither SLI nor CrossFire ready. If you can afford a few more bucks you should really take a look at the SeaSonic SSR-650RM if you'd like to buy yourself some quality future-proofing.

 

morphon

I won't be running a multi-GPU setup any time soon. Given that I seem to wear out a power supply every 2 years, then, well, probably no need to future-proof. I'll get the modular version of that 600W Corsair.

I did some testing last night to see what voltages I'm getting. I used HWMonitor (and, yes, a multimeter at the molex would be better). Looks like the 12v fluctuates between 11.9 and 11.7. I ran Furmark, since it is windowed and will let me keep an eye on HWMonitor - 12v dropped to 11.6 and promptly crashed. Not exactly proof, but pretty strong confirmation.
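For context on those numbers, here is a minimal sketch that compares the three readings from the post against a tolerance band. It assumes the commonly cited +/-5% band for the +12V rail (11.4V to 12.6V); the function name and constants are made up for illustration.

```python
# Compare reported +12V readings against an assumed +/-5% tolerance band.
NOMINAL_12V = 12.0
TOLERANCE = 0.05  # assumption: +/-5% band commonly cited for the ATX +12V rail


def rail_status(reading_v, nominal_v=NOMINAL_12V, tolerance=TOLERANCE):
    """Return (deviation_percent, within_band) for one voltage reading."""
    deviation_pct = (reading_v - nominal_v) / nominal_v * 100.0
    within_band = abs(reading_v - nominal_v) <= nominal_v * tolerance
    return deviation_pct, within_band


readings = [11.9, 11.7, 11.6]  # values reported in the post (from HWMonitor)
for volts in readings:
    deviation, ok = rail_status(volts)
    label = "within" if ok else "outside"
    print(f"{volts:.1f} V: {deviation:+.1f}% -> {label} band")
```

Under that assumption all three readings still sit inside the band, so the numbers alone are not proof. Software sensors sample slowly and can miss short dips, which is one more reason a multimeter at the molex is the better check.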

So, yeah. Off to the store to get a PS. Will update the thread on results....
 

morphon

Ok, put in a Corsair CX600M. It's not high up on their awesome list, but it's replacing the OCZ 500, so more capacity and a better brand and NEWER! Yay!

Anyway, all the things that triggered the freeze EVERY TIME are now working like a charm. Ran Furmark a while (normally would freeze within 20 seconds), no problem. Same for 3DMark. I noticed that it is drawing more from the wall than before (20-30 watts more under load). Perhaps the 500 was just not delivering enough juice. <shrug>

Now for an extended Portal2 session...

I think the PS might have solved this one.


EDIT: Make that 50-60 watts more under load. Wow.
 

morphon

Triple post. Sorry.

Ok, so new PS seems to be running fine, except that I got another freeze. After a long BF4 session, lots of Furmark, etc... No problem. But then, just browsing several hours later, nothing really hammering the system, and got the freeze.

I'm kinda at a loss, since the new power supply definitely solved a lot of freezing issues that I was having earlier. All the stuff that would reliably trigger it is now working fine. Facebook and Anandtech in Chrome shouldn't trigger this. Grrrrr....

Anyway, one other thing. I tried running Furmark and Prime95 at the same time, and then looked at the load at the wall. The system is drawing over 500 watts with both of those going, spiking up to 530 a few times. Given that the power supply is a 600.... did I go too small? Should I return it and find myself a 750?

Any further thoughts about the freezing? I'm going to see if I can find another trigger. I can't browse for 6 hours hoping it will fail. /sigh
 

Steltek

No, the supply isn't too small. Corsair accurately rates their units on actual capacity (unlike many manufacturers), so you are still below capacity even pushing your system to the wall. It isn't likely that your system as currently spec'd will exceed the power supply capacity in the future without a significant upgrade.
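The arithmetic behind that answer can be sketched roughly as follows. A PSU's rating refers to DC output, while a wall meter reads AC input including conversion losses. The efficiency figures below are assumptions (a rough range for an 80+ unit), not measurements of this supply.

```python
def dc_load_watts(wall_watts, efficiency):
    """Estimate the DC-side load from a wall-meter reading."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return wall_watts * efficiency


PSU_RATING_W = 600  # rated DC output of the new supply
WALL_PEAK_W = 530   # highest wall reading reported in the thread
ASSUMED_EFFICIENCIES = (0.80, 0.85)  # assumption: rough range for an 80+ unit

for eff in ASSUMED_EFFICIENCIES:
    dc_watts = dc_load_watts(WALL_PEAK_W, eff)
    headroom = PSU_RATING_W - dc_watts
    print(f"eff={eff:.2f}: ~{dc_watts:.0f} W DC load, ~{headroom:.0f} W headroom")
```

With either assumed efficiency the estimated DC load stays well under 600W. If the monitor is plugged into the same meter, the PC's share of the wall reading is lower still.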

It might be a good idea to check for problems related to your prior freezes and forced restarts (i.e. corrupted system files, drivers, etc). Run SFC and CHKDSK sessions to check for system file and file system corruption. You might also revert to an earlier video driver revision to see if it could be a bug in the current driver.

Are you running the ASUS AISuite II software? I've found parts of it to be very buggy -- I eventually dumped most of the applications at my last Windows reinstall and found my system runs better without it (especially the fan controller software). If your system is configured with four memory modules, you might try dropping back to two to see if it could be a memory controller issue (I've seen AMD-based systems in the past that had issues with running four modules, but most of that was related to overclocking). It might also be worth checking the BIOS to ensure your memory modules are being properly recognized using the correct timings.

Finally, are there any errors showing in your system or application logs that might indicate a misbehaving application/driver?
 

morphon

No, the supply isn't too small. Corsair accurately rates their units on actual capacity (unlike many manufacturers), so you are still below capacity even pushing your system to the wall. It isn't likely that your system as currently spec'd will exceed the power supply capacity in the future without a significant upgrade.

It might be a good idea to check for problems related to your prior freezes and forced restarts (i.e. corrupted system files, drivers, etc). Run SFC and CHKDSK sessions to check for system file and file system corruption.

Now would also be a good time to check for driver and even BIOS updates on the ASUS website if you haven't done so (especially video drivers -- most web browsers now by default use hardware acceleration, so video driver bugs can cause all sorts of weird problems). BTW, are you running the ASUS AISuite II software? I've found parts of it to be very buggy -- I eventually dumped most of the applications at my last Windows reinstall and found my system runs better without it (especially the fan controller software). Is your system configured with two memory modules or four? If you are running four, you might try dropping back to two to see if it could be a memory controller issue.

Finally, are there any errors showing in your system or application logs that might indicate a misbehaving application/driver?

Giving SFC a go...

Newest BIOS updates, and newest chipset, peripheral, and Nvidia GPU drivers have been on the whole time.

No AISuite II. It was cool, but I tend to be a BIOS person more than a windows utility guy.

2x8GB DDR3-1600 memory. Passes Memtest 86+ 5.0 like a champ.

Logs show nothing other than the system started without a clean shutdown. There is no indication of what is causing the problem.

One thing that came to mind - the PS I'm using is a Corsair CX600M (I know, cheap). After doing a little research, I noticed that some reviewers complained about bad hold up time - something about the capacitors being too small to meet the ATX spec, but that this might not be a problem if your power coming from the wall is nice and stable. However - here in Houston the stuff coming from the wall is notoriously bad. My UPS will click on and off (engaging, then disengaging about 5 seconds later) about once a week, even though the lights don't dim or anything.

Thing is, the reviews (techpowerup for one) are showing a hold up time short enough (6ms) that I'm worried it is not enough time for the UPS to kick in (rated 4ms) in case the system is under load. It's less than half as long as the ATX minimum (15ms). Given how dirty the juice coming from the wall is: maybe I need to not cheap out here and get something better.
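The timing worry above can be laid out as a small calculation. This is only a sketch using the figures quoted in the post (6ms measured hold-up, 4ms rated UPS transfer, 15ms as the ATX minimum); the names are made up for illustration, and real transfer times can exceed the rated value.

```python
def holdup_margin_ms(holdup_ms, transfer_ms):
    """Time left over after the UPS switches to battery (negative = dropout)."""
    return holdup_ms - transfer_ms


MEASURED_HOLDUP_MS = 6.0  # hold-up figure quoted from the review
UPS_TRANSFER_MS = 4.0     # rated UPS transfer time quoted in the post
ATX_MINIMUM_MS = 15.0     # ATX minimum as stated in the post

margin = holdup_margin_ms(MEASURED_HOLDUP_MS, UPS_TRANSFER_MS)
fraction_of_spec = MEASURED_HOLDUP_MS / ATX_MINIMUM_MS

print(f"margin over rated transfer time: {margin:.1f} ms")
print(f"hold-up as a fraction of the stated minimum: {fraction_of_spec:.0%}")
```

On paper the supply clears the rated transfer time by only a couple of milliseconds, so a UPS that switches even slightly slower than its rating would leave no margin at all.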
 

Bubbaleone

The CX600M is a good PSU and a good choice for your rig. If your house or apartment line voltage is that bad, even a good PSU won't solve the problems, especially with a UPS. I think you would find that having stable 120V/60Hz voltage, for the PSU and UPS, would eliminate the major cause of the problems you've been experiencing. You might want to consider making a modest investment in a line conditioner. Here's one example: Tripp Lite LC1200

 

morphon

Well, I did some hold-up tests. If the system is totally idle, it will survive the UPS being unplugged. If I'm running anything, even if the total load is under 300W, the system restarts or freezes when the UPS is unplugged. So, I'm going to need a PS with bigger caps anyway.

I guess after that it's line conditioner time. :-/
 

Steltek

It might be worth contacting the power company to have them send a tech to check your lines. It might be that the transformer needs to be adjusted. Your UPS software may be able to log the power drops which can help in your dealings with them. If nothing else, you should be able to see how often you are having voltage drops and how the voltage is fluctuating.
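As a rough illustration of what could be done with such a log, the sketch below groups consecutive low-voltage samples into sag events. Everything here is hypothetical: the sample data is invented, the 108V threshold is just 10% under a 120V nominal, and real UPS software exports its own format that would need parsing first.

```python
from datetime import datetime

SAG_THRESHOLD_V = 108.0  # hypothetical: 10% below a 120 V nominal


def find_sags(samples, threshold_v=SAG_THRESHOLD_V):
    """Group consecutive below-threshold samples into sag events.

    samples: list of (datetime, volts) tuples, sorted by time.
    Returns a list of (start, end, min_volts) tuples.
    """
    events = []
    current = None  # [start, end, min_volts] of the sag in progress
    for timestamp, volts in samples:
        if volts < threshold_v:
            if current is None:
                current = [timestamp, timestamp, volts]
            else:
                current[1] = timestamp
                current[2] = min(current[2], volts)
        elif current is not None:
            events.append(tuple(current))
            current = None
    if current is not None:
        events.append(tuple(current))
    return events


# Invented samples, one per second, purely for demonstration.
demo = [
    (datetime(2013, 1, 1, 12, 0, 0), 119.5),
    (datetime(2013, 1, 1, 12, 0, 1), 104.0),
    (datetime(2013, 1, 1, 12, 0, 2), 101.5),
    (datetime(2013, 1, 1, 12, 0, 3), 120.0),
]
for start, end, low in find_sags(demo):
    print(f"sag from {start.time()} to {end.time()}, minimum {low:.1f} V")
```

A list like this, with timestamps and minimum voltages, is the kind of summary that would be useful to hand to the power company.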
 

morphon

Ok, put in http://www.microcenter.com/product/380192/Silencer_Mk_III_Series_600_Watt_ATX_Modular_Power_Supply

I did some research - it is a rebranded SeaSonic. Long warranty, bigger caps than the CX600M. A bit larger, and the modular connectors blocked the lower intake fan, so I turned it fan-side up. Apparently this makes almost no difference. Hmmm.

Anyway, no crashes so far. It will hold up while playing games, so it's a little less dependent on the wall than the Corsair. It doesn't survive the UPS being unplugged while running Furmark, but whatever.

I'll start the UPS logging. Might uncover something interesting.

Thanks for everyone's patience. Hopefully this did the trick.
 

Steltek

Out of curiosity, what model UPS do you have and how old are the batteries?
 

morphon

It's a CyberPower 850AVR. About 5 years old.

Given how often the power goes out here in Houston (probably 4-5 times a year), it has long paid for itself in avoiding lost productivity.
 

morphon

Just closing the loop on this one. Since I put in the PC Power & Cooling Silencer Mk III 600W, I have had ZERO problems. System runs like a dream. Dropped in a 7950 since they're bizarrely inexpensive at the moment (cheaper AND faster than the new 270X? Yes pls.)

Games are super smooth. Enough CPU power to capture full-frame gaming video to x264 without more than 4-5fps loss. No lockups, strange restarts, or frustrations.

Thanks to everyone for their help. I was totally stumped.