Hello! This is my first post here so first of all: nice to meet you!
-------------------
Here is a TL;DR, full post with details below:
System crashes (freeze & reboot) when I'm actively playing (not when the game is launched but idle). I tried a lot of solutions, including a GPU RMA, reformatting and changing my OS three times, buying a new PSU and CPU cooler, clearing the CMOS, and flashing the BIOS of both the motherboard and the GPU. The only thing that worked (but is NOT satisfactory): underclocking the memory and core clocks by 200 MHz.
------------------
I'm having a lot of issues with my new computer.
I have a history of building my own PCs, so I know a thing or two, but this is the first time I've had so much trouble with a problem that at first seems to have so many possible solutions.
Here it is: my system is really unstable. It crashes a lot, but only under specific circumstances: when I'm actively using it, especially for gaming.
Here are the components:
MB: ASRock Z97 Pro3
Proc: i5 4690
GC: GIGABYTE GV-N970IXOC-4GD GeForce GTX 970 4GB 256-B
PSU: EVGA 850 G2
Ram: Crucial Ballistix Sport 16GB Kit (8GBx2) DDR3 1600 MT/s (PC3-12800)
I also have two SSDs (my main one, new, for my gaming Windows system, and an older one I'm still using for dual boot on a Linux for my work) and a 3TB hard drive for all the data.
My tower is also very well cooled, with three Noctua 120mm fans (added to the two that came with the case, a Fractal Design Arctic White) and an NH-U12S CPU cooler.
I was using Windows 7 when the first crashes occurred (this January; I bought the above configuration as a Christmas gift). Since then I have switched to Windows 8.1 and now the Windows 10 Technical Preview. The crashes occurred on every OS, even on clean installations with no additional software except Steam and my current games (The Witcher 3 at the moment).
Here is a more detailed description of the symptoms: I start the game, load my save, and everything is fine. In game, as long as I don't do anything, it can stay up without crashing for several hours (at first I did that as a way to test whether everything was stable).
When I start playing, after 15 minutes to a couple of hours, I'll get a big freeze, an audio feedback loop and, if I wait enough time, a reboot.
At some point I copied the content of a minidump crash file (pasted below), but on my current Windows 10 install I haven't figured out how to get a memory dump when it crashes. It does not look like a BSOD either: I don't see anything on the screen, it just freezes and crashes.
*** WARNING: Unable to verify timestamp for nvlddmkm.sys
*** ERROR: Module load completed but symbols could not be loaded for nvlddmkm.sys
Probably caused by : nvlddmkm.sys ( nvlddmkm+7a2dc0 )
Followup: MachineOwner
---------
3: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
Arguments:
Arg1: fffffa8018a6b190, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff88005203dc0, The pointer into responsible device driver module (e.g. owner tag).
Arg3: ffffffffc000009a, Optional error code (NTSTATUS) of the last failed operation.
Arg4: 0000000000000004, Optional internal context dependent data.
Debugging Details:
------------------
FAULTING_IP:
nvlddmkm+7a2dc0
fffff880`05203dc0 48ff251996edff jmp qword ptr [nvlddmkm+0x67c3e0 (fffff880`050dd3e0)]
DEFAULT_BUCKET_ID: GRAPHICS_DRIVER_TDR_FAULT
CUSTOMER_CRASH_COUNT: 1
BUGCHECK_STR: 0x116
PROCESS_NAME: System
CURRENT_IRQL: 0
ANALYSIS_VERSION: 6.3.9600.17298 (debuggers(dbg).141024-1500) amd64fre
STACK_TEXT:
fffff880`05ec6a48 fffff880`054cc134 : 00000000`00000116 fffffa80`18a6b190 fffff880`05203dc0 ffffffff`c000009a : nt!KeBugCheckEx
fffff880`05ec6a50 fffff880`0549f867 : fffff880`05203dc0 fffffa80`106c9000 00000000`00000000 ffffffff`c000009a : dxgkrnl!TdrBugcheckOnTimeout+0xec
fffff880`05ec6a90 fffff880`054cbf43 : fffffa80`ffffd846 00000000`00000000 fffffa80`18a6b190 00000000`00000000 : dxgkrnl!DXGADAPTER::Reset+0x2a3
fffff880`05ec6b40 fffff880`0559c03d : fffffa80`1ad201e0 00000000`00000080 00000000`00000000 fffffa80`106be410 : dxgkrnl!TdrResetFromTimeout+0x23
fffff880`05ec6bc0 fffff800`0312f0ca : 00000000`0103906e fffffa80`103a7b50 fffffa80`0c6e19e0 fffffa80`103a7b50 : dxgmms1!VidSchiWorkerThread+0x101
fffff880`05ec6c00 fffff800`02e83be6 : fffff880`03165180 fffffa80`103a7b50 fffff880`0316ffc0 fffffa80`0f201da0 : nt!PspSystemThreadStartup+0x5a
fffff880`05ec6c40 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16
STACK_COMMAND: .bugcheck ; kb
FOLLOWUP_IP:
nvlddmkm+7a2dc0
fffff880`05203dc0 48ff251996edff jmp qword ptr [nvlddmkm+0x67c3e0 (fffff880`050dd3e0)]
SYMBOL_NAME: nvlddmkm+7a2dc0
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 54b0548e
FAILURE_BUCKET_ID: X64_0x116_IMAGE_nvlddmkm.sys
BUCKET_ID: X64_0x116_IMAGE_nvlddmkm.sys
ANALYSIS_SOURCE: KM
FAILURE_ID_HASH_STRING: km:x64_0x116_image_nvlddmkm.sys
FAILURE_ID_HASH: {1f9e0448-3238-5868-3678-c8e526bb1edc}
Followup: MachineOwner
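For anyone skimming the dump, the key fields can be pulled out mechanically. This is only a minimal sketch: the function name `summarize_dump` and the two small lookup tables are made up for illustration, and the tables cover only the codes that appear in this dump (0x116 and 0xC000009A, which is STATUS_INSUFFICIENT_RESOURCES).

```python
import re

# Illustrative lookup tables covering only the codes seen in this dump.
BUGCHECKS = {0x116: "VIDEO_TDR_FAILURE"}
NTSTATUS = {0xC000009A: "STATUS_INSUFFICIENT_RESOURCES"}

def summarize_dump(text):
    """Extract bugcheck code, faulting image and NTSTATUS from !analyze -v text."""
    code = re.search(r"^BUGCHECK_STR:\s*0x([0-9a-fA-F]+)", text, re.M)
    image = re.search(r"^IMAGE_NAME:\s*(\S+)", text, re.M)
    arg3 = re.search(r"^Arg3:\s*([0-9a-fA-F]+)", text, re.M)
    summary = {}
    if code:
        value = int(code.group(1), 16)
        summary["bugcheck"] = BUGCHECKS.get(value, hex(value))
    if image:
        summary["image"] = image.group(1)
    if arg3:
        # Arg3 is printed sign-extended to 64 bits; NTSTATUS is the low 32 bits.
        status = int(arg3.group(1), 16) & 0xFFFFFFFF
        summary["ntstatus"] = NTSTATUS.get(status, hex(status))
    return summary

# Three lines copied from the dump above.
sample = """\
Arg3: ffffffffc000009a, Optional error code (NTSTATUS) of the last failed operation.
BUGCHECK_STR: 0x116
IMAGE_NAME: nvlddmkm.sys
"""
print(summarize_dump(sample))
```

In short, the dump says the display driver (nvlddmkm.sys) timed out and Windows failed to reset it. By itself that does not say whether the card, the driver, or something else in the system is at fault.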
Here are the steps I took to try to fix it:
- Reformatted and tried with Windows 8.1.
- Sent my GPU in for RMA. It was processed and I got it back, but the problem was still there.
- Cleared the CMOS of my motherboard. Double-checked all the parameters in my BIOS (like memory voltage etc.). Flashed the BIOS to the newest version.
- Flashed my GPU BIOS to the newest version.
- Bought the PSU + CPU cooler listed above and redid all the cable management of my tower. Now the GPU never goes above 66°C even on high load with benchmarks like Valley (which are not causing any crashes).
- Checked my memory with memcheck.
There is only one thing that improved the problem: underclocking my GPU (-200 MHz on both memory and core clocks). In that case, I don't get the crashes anymore. But that's just not acceptable to me: I didn't buy all this hardware to get 5 to 10% lower performance, especially since I'd also like to be able to overclock at some point, and the GTX 970 is supposed to be a good card for that.
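Since the clocks seem to matter, one way to see what the card was doing right before a freeze would be to log its state to disk once per second. The following is a rough sketch, not something I have run on this machine. It assumes `nvidia-smi` is installed with the driver and reachable on the PATH, and that the card reports these query fields (some GeForce cards return "[Not Supported]" for a few of them). The names `parse_row`, `log_gpu` and `gpu_log.csv` are made up for illustration.

```python
import os
import subprocess
import time

# Query fields passed to nvidia-smi; the card may not support all of them.
FIELDS = ["timestamp", "temperature.gpu", "clocks.gr", "clocks.mem", "power.draw"]

def parse_row(line):
    """Turn one CSV line from nvidia-smi into a dict; unsupported values become None."""
    values = [v.strip() for v in line.split(",")]
    row = {}
    for name, value in zip(FIELDS, values):
        if name == "timestamp":
            row[name] = value
        else:
            try:
                row[name] = float(value)
            except ValueError:  # e.g. "[Not Supported]"
                row[name] = None
    return row

def log_gpu(path="gpu_log.csv", interval=1.0):
    """Append one sample per interval, forcing each line to disk so the
    last reading before a hard freeze is more likely to survive."""
    cmd = ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits"]
    with open(path, "a") as log:
        while True:
            out = subprocess.run(cmd, capture_output=True, text=True).stdout.strip()
            log.write(out + "\n")
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval)

# To use: call log_gpu() in a separate console before starting the game,
# then read the last lines of gpu_log.csv after the reboot.
```

The last few lines of the log could then be compared between stock and underclocked runs, to see whether the freezes line up with a particular boost clock or power draw.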
So here it is. Would you have, by any chance, any other ideas about how to fix this? Do you think it could be the motherboard? I really don't want to send it in for RMA: it would be very complicated for me to get by without a desktop since I mostly work from home on it, so I'd like to keep that at the bottom of the list of things to do. Are there any ways to test the stability/integrity of the motherboard?
Thanks!
