Hey AnandTech! Looking for some fellow enthusiasts to double-check my situation and confirm I am following a good methodology to solve a very strange crashing issue which, to me, feels like a PSU issue.
Setup:
OS: Win 8.1 x64
CPU: i7 4790k running stock
Mobo: Gigabyte Z97X-UD5H-BLK rev 1.0
Ram: 16 gigs running stock @ 2.4GHz
SSD: Intel 520 256GB
GPUs: 2xEVGA Titan SC in SLI
PSU: ~4 year old Corsair AX1200 (bought early after line was introduced)
Situation:
1) Came back from honeymoon super excited to play Civ: Beyond Earth - updated to the new game-ready Nvidia drivers, which gave me a BSOD, and the OS recovery tools could not fix the resulting inability to boot Windows. I figured a coincidental driver crash had corrupted my OS - fine, I thought - let's reinstall!
2) Reinstalled the OS as UEFI and kept it bare bones: Windows Update only, Nvidia driver, Civ: Beyond Earth
3) Had odd crashes here and there - one only ~20 mins into my first session of the game
4) Wanted to reinstall again - ended up in a loop, unable to install Windows due to UEFI issues - since I had been installed in legacy mode for the longest time with no issues, I reinstalled again completely in legacy mode
5) Event Viewer - can never see a faulting application causing the crash in either the Application or System log.
Testing results -
Given fresh reinstalls and crashes even with stable, older versions of the Nvidia driver (or no driver), I'm concluding I have a fault somewhere in the hardware chain.
1) DOS bootable tests:
- Memory - Can run the Windows memory tester and DOS-booted MemTest86 for 12+ hours with no errors reported and no crashes. Conclusion: Memory ok!
- CPU - when booted into a DOS tester using Linpack and a few other utilities, which I expect also run Linpack-like algorithms - NO PROBLEMS - can run for hours. Conclusion: CPU ok!
2) Windows tests:
- Prime95 - crash happens in <5 mins with no errors in the GUI and, on reboot, none found in the "results" file
- OCCT - intense GPU testing maxing out the cards can run for an hour with no issues. Power supply testing with the CPU running and the GPUs going, with a draw of ~800 watts (just under the capacity of my APC battery backup - I never hit this even when gaming intensely on the TITANs), ran for an hour until I stopped it manually
- OCCT CPU Linpack in Windows - can run for an hour, no errors, no issues
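For a rough sense of how hard that OCCT power supply test actually pushed the PSU, here is a back-of-the-envelope sketch in Python. It assumes the ~800 W figure is the draw at the wall/UPS, and the 90% efficiency value is my assumption rather than something I measured; the function name is made up purely for illustration.

```python
def dc_load_fraction(wall_watts, psu_rated_watts, efficiency=0.90):
    """Estimate what fraction of the PSU's rated DC output is in use.

    efficiency is an ASSUMED AC-to-DC conversion efficiency, not a measured one.
    """
    dc_watts = wall_watts * efficiency   # power actually delivered to components
    return dc_watts / psu_rated_watts


wall_draw = 800      # ~800 W seen during the OCCT power supply test
psu_rating = 1200    # Corsair AX1200 rated output in watts

fraction = dc_load_fraction(wall_draw, psu_rating)
print(f"Estimated PSU load: {fraction:.0%} of rating")
```

Under those assumptions the test stayed well below the AX1200's rating, which would suggest raw total wattage is not what Prime95 is tripping over.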
Conclusions + Next steps (where I need your help!)
1) Latest NVIDIA driver release is problematic - I'm seeing reports of issues in a lot of places, and it even affected 980s in SLI in a recent HardOCP article
2) Memory+CPU are likely ok - or I would see errors in the DOS-booted tests
3) Prime95 - I can see many people with similar setups running this test with no issues - the fact that it crashes my whole computer within a few mins makes me think there is a real HW fault somewhere
However, it is very strange that I get no errors in the Prime95 test (increasing confidence in my CPU?) and that the OCCT tests never generate errors or crashes.
Problem source candidates: PSU or Mobo - could have a flaky 12V rail or flaky delivery of that rail through the mobo
4) Don't have a backup PSU of sufficient wattage, so I will buy a new PSU next week and either keep it handy for troubleshooting if it turns out I don't have a PSU issue, or replace my Corsair AX1200 with it if it turns out the PSU is the problem.
5) If the new PSU also has the same issues, I will seek out another mobo and do a whole mobo replacement - REALLY don't want to do that given the task and how long it will take
Questions for group:
Am I on the right track here? Can anyone offer a rationale for how Prime95 could flush out an issue no other test can? Is it that it taxes the CPU in a way that's more intense than the Linpack-based tests, which could stress the 12V rail?
I guess I am confused because the OCCT "power supply" test should be drawing a lot from the 12V rail and I don't get a crash within mins - that thing ran for an hour with no problem!
There is something that Prime95 is exposing that nothing else is - though I did get that crash in-game with Civ after 20 mins and a few odd ones outta nowhere.