Question Computer rebooting semi-randomly under GPU load - am I taking the right steps?

JDA81

Junior Member
Jan 17, 2018
8
1
41
Now my build is over a year old, but I think this is the appropriate forum for this kind of problem...

tldr: Computer reboots under intensive GPU loads - regardless of how long the load has been running, and very predictably upon certain user interactions.

Examples:
Example 1: Doing something graphics intensive like Furmark. After a few minutes it just reboots.
Example 2: Playing a game level for 10 minutes, level ends so the level is just about to unload - instant reboot.
Example 3: Playing a turn based strategy game. Making many decisions. Click turn end - instant reboot. (doesn't happen every time)
Example 4: Using Furmark for several minutes, press alt-tab - instant reboot. (doesn't happen every time)
(Examples 2-4 can be reproduced pretty easily, but only account for ~40% of the total reboots I've seen)

When it occurs:
During intensive games: i.e. most games that can at least potentially raise the GPU temperature to 70 degrees C or above - even if the computer turning off doesn't necessarily happen when the GPU is that hot. I've seen it happen when Furmark had got the GPU to only 60 degrees C.

Data:
My CPU temps are cool (40-50 degrees C) under almost all conditions except Prime95.
My GPU temps are a bit high - 70 to 83 degrees C under sustained 100% load.
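Since the reboot is instant, whatever the GPU was doing in the last second is normally lost. A minimal sketch of one way to capture readings like the ones above, assuming `nvidia-smi` is on the PATH: it appends one sample per second to a file and forces each line to disk, so the last sample before a power cut should survive. The names `parse_sample`, `log_forever` and `gpu_log.csv` are made up for illustration.

```python
import os
import subprocess

# Fields requested from nvidia-smi, in the order parse_sample() expects.
QUERY = "timestamp,temperature.gpu,power.draw,utilization.gpu"


def parse_sample(line):
    """Turn one CSV line from nvidia-smi into a dict of typed values.

    Assumes every field is numeric; some cards report "[N/A]" for
    power.draw, which would raise ValueError here.
    """
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 4:
        raise ValueError(f"unexpected field count: {line!r}")
    return {
        "timestamp": parts[0],
        "temp_c": float(parts[1]),
        "power_w": float(parts[2]),
        "util_pct": float(parts[3]),
    }


def log_forever(path="gpu_log.csv", interval_s=1):
    """Append raw nvidia-smi samples to `path` until interrupted."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={QUERY}",
        "--format=csv,noheader,nounits",
        "-l", str(interval_s),  # repeat the query every interval_s seconds
    ]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc, \
            open(path, "a") as out:
        for line in proc.stdout:
            out.write(line)
            out.flush()
            os.fsync(out.fileno())  # push the line to disk before the next one
```

Running `log_forever()` in a separate terminal before starting Furmark, then reading the last line of the file with `parse_sample` after a reboot, would show the temperature and power draw at the moment power dropped.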

When it DOESN'T occur:
During web browsing, light games, when idle, or during intense CPU tasks like Prime95.
When the room is cool.

Timeline:
It seems to have only started happening often within the last few weeks, although it happened a single time a bit over a month ago. The computer is over a year old.
I made no recent hardware changes, although about 6 months ago I got a 144Hz Gsync display.
I update my Nvidia drivers frequently, and have updated them a few times since the problem first occurred. I've tried downgrading my drivers too.

What have I tried:
Disabled automatic reboot in Windows (so I could see a BSOD if there was one - there was none).
Checked Event Logs - only sudden power loss is noted.
Disabled BIOS overheat protection - system still rebooted under load.
I've tried running my graphics card fans at 100% (which kept GPU temp under 73 C) - no difference
I've tried running my CPU fans and case fans at higher speed - no difference
Since the problem started I updated my BIOS - no difference
Clamped down voltages in the BIOS (so they wouldn't fluctuate up under load) - no difference
I updated and downgraded graphics drivers - no difference
I unplugged peripheral devices - no difference
I've surveyed my motherboard somewhat and there doesn't appear to be any bulging or cracked capacitors.
Verified PSU is not in eco mode.
Verified all fans are actually running (including PSU).
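The Event Log check above can also be repeated from a script after each test run. A rough sketch, assuming the "sudden power loss" entries are the usual Kernel-Power Event ID 41 records in the System log (the log entries themselves aren't quoted here, so that is an assumption); it uses the built-in Windows `wevtutil` tool, and the function names are made up for illustration.

```python
import subprocess


def kernel_power_query(max_events=10):
    """Build a wevtutil command listing the newest Event ID 41 entries.

    Assumption: Event ID 41 in the System log is the unexpected-shutdown
    record written by Kernel-Power on the next boot.
    """
    return [
        "wevtutil", "qe", "System",
        "/q:*[System[(EventID=41)]]",  # XPath filter on the event ID
        "/f:text",                     # human-readable output
        f"/c:{max_events}",            # limit the number of events returned
        "/rd:true",                    # newest events first
    ]


def recent_unexpected_shutdowns(max_events=10):
    """Run the query (Windows only) and return wevtutil's text output."""
    result = subprocess.run(
        kernel_power_query(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Comparing the timestamps printed by `recent_unexpected_shutdowns()` against notes on what was running at the time gives a simple record of which tests ended in a reboot.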

Other:
There's a definite, very slight buzzing I can hear with the case open when the GPU is under load (not when only the CPU is under load, as with Prime95). It's not rhythmic. I guess it could be a leaky capacitor, but I can't tell where the sound is coming from. I don't think it's coil whine, but I haven't heard every type of coil whine...
I've tried downclocking my GPU quite a bit - this seemed to help, but if I ran Prime95 simultaneously and alt-tabbed a few times I was able to get it to reboot.
I've tried cooling down the room until it's chilly - after a lot of testing this seems to resolve all crashing behavior. Note: my GPU still runs up to 79 degrees C in this case. This definitely seems to be the largest controlling factor I can implement - making me think some component, possibly not the GPU nor CPU, has become more heat sensitive?

My plan:
I've ordered a replacement PSU, I'll try replacing that first. Since it is the component that would be most affected by cooling the room and least affected by fan speeds, AND since we're talking about power dropping out, I figure PSU is the most likely candidate.
If the PSU doesn't help, I'm not sure what I'll go after next. I have other GPUs, but they're significantly lower end. If swapping one in fixes the problem, the cause could still be any other component (PSU or mobo having trouble with high amounts of power), since a lower-end card draws less power; and if it doesn't fix the problem, it would again still be some other component. So swapping out the GPU won't actually tell me anything (it would be different if it were a more power-hungry GPU). Still, I'll give it a shot after the PSU.
Finally I'd go for a new mobo - avoiding this because of the pain of complete disassembly and reassembly.

Specs:
i9-9900k (never overclocked)
32GB DDR4-3000 RAM (tried lowering RAM voltage and frequency - no help)
ASRock Taichi Z390 mobo
Evga 2080 XC Ultra (not overclocked (tried underclocking, see notes)) - in uppermost slot
EVGA 850W P2 PSU (NOT in eco mode, never used eco mode)
Fractal Design R6 case


Anyone disagree with any of my conclusions or have any other insights or suggestions? I'm all ears. It'll be a while before the PSU is delivered.
 

Steltek

Diamond Member
Mar 29, 2001
3,042
753
136
I don't think I would do anything different than you are doing. Changing the PSU is always a good testing methodology anytime you have sudden reboots when running a higher draw GPU, especially when you have an unusual sound as well. You didn't say what you are using, but I presume you are using a NVMe or SSD drive? I presume the temps on the drive are good and there are no indications of problems there?

Does it make a difference if you open the front case panel to maximize airflow?

Also, are you running the GPU horizontally mounted in the motherboard, or do you have it mounted vertically using risers? Running it mounted vertically in that case can result in a huge increase in temperature (on the order of 15-20C or more over running it horizontally), depending upon the card, even with the front panel open.

If the new PSU doesn't fix the issue and the GPU is still under warranty, I'd RMA the GPU before I'd try to fix it myself. If it is no longer under warranty (and you have no credit card extended warranty feature or such), you could probably disassemble it and attempt to reapply thermal paste (or, even, convert to water cooling).
 

JDA81

Thanks for the response!

My drives include 1 SATA M.2 drive, and a full load of SATA non M.2 drives - SSD & HDD. The temps on the SATA M.2 are OK, 50's to 65 max C. It doesn't have a heat spreader or anything. No indication of problems.

I didn't test it super thoroughly for opening front panel, but I believe front panel being open didn't help. I also tried opening up all the panels but the room was getting significantly cooler by then. I closed it back up and proceeded to test for over an hour just fine with the cooler room. I did the cooler room test another day with everything closed up and it was stable then too. I'm curious to try testing with a warm room, but all panels open and fan speeds maxed.

GPU is a traditional mobo mount. It's a 2.75-slot card, so I'm not even sure I could do that new mounting style. Even so, it's tighter in there than I'd normally expect, especially with the Noctua NH-D14, which I failed to mention. There's about 3 slots' worth of space for the GPU fans to blow into before the PSU shroud.

Yeah GPU is under extended warranty so I would do a warranty exchange before bothering to buy a new one. Too bad the Nvidia 30x0 series aren't out yet - that'd be convenient.
 

Steltek

Based upon what you are describing, I think it will end up being either the PSU or the GPU, with a very slim possibility of a motherboard problem. If replacing the PSU doesn't fix the problem, I'd go back to your older graphics card. Run something to give the old card a heavy workout to get it hot, and see if the system shuts down. If it doesn't, I'd RMA the 2080 XC Ultra graphics card.
 

JDA81

Swapped out the PSU today - been testing a lot - 2 hours in Furmark, and plenty of games... The problem seems completely resolved.

Still very stable - today it was very hot in the room and the computer is still perfectly stable. Guess I'll be sending in the old PSU when I can (still under warranty for many years).
 