Vega 56 disappearing

djeyewater

Member
Apr 15, 2007
37
0
66
I've had this happen twice now: I'm playing a game and the system locks up. After rebooting the PC, the graphics card is no longer recognised in the BIOS (or Windows) and there's no output to the screen. If I remove the card, boot up, shut down, and insert the card again, it all works fine.

The card is an MSI Radeon RX Vega 56 Air Boost 8GB OC. Not flashed and running at stock settings.
Motherboard is an ASUS Prime X370-Pro on the latest BIOS (5220).
I also have an Nvidia Quadro M2000 as the 2nd GPU, no problems with this. My main monitor runs off the Vega 56, and the second off the Quadro.

The card came with the Division 2 game, and this is the only game I play on it, and only occasionally. The game's always been quite crashy: sometimes just the game crashes, other times the whole PC locks up. But it's not like it crashes every time I play it; maybe 10% of sessions I'll get a crash. I don't think I've ever had the system lock up when not playing the game.

The behaviour of the card disappearing has only happened within the past month or so. The things I've changed in the past few months are running 2 separate PCI-e power cables to the card (previously I used both connectors from a single cable) and updating the graphics drivers.

When I had the crash today I had only just started the game. The case has good ventilation. So I wouldn't think it's a temperature issue unless it's a spike causing it.

I'm downloading 3DMark with Fire Strike at the moment so I can test it with that and see if it causes any issues.

Anyone have any suggestions of things to check / try?

Cheers
 

Iron Woode

Elite Member
Super Moderator
Oct 10, 1999
30,880
12,386
136
You can run GPU-Z while running your benchmarks and see a graph of the temps.

I bought 3DMark from Steam last month when it was on sale for $5.00 CDN.

You can also run HWMonitor (from CPUID, the makers of CPU-Z) and check PSU voltages and wattages for any abnormalities.
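
If you let GPU-Z log its sensors to a file, something like the sketch below could flag temperature spikes afterwards. It's only a rough Python sketch: the column name, the 85 °C limit, and the sample rows are assumptions made up for illustration, so check them against the header line in your own log before trusting it.

```python
import csv
import io


def find_temp_spikes(log_text, column="GPU Temperature [°C]", limit=85.0):
    """Return (row_number, value) pairs where the reading exceeds `limit`.

    Assumes a comma-separated log whose first line is a header.
    The default column name and limit are illustrative guesses only.
    """
    reader = csv.reader(io.StringIO(log_text))
    # Header cells may be padded with spaces, so strip them before matching.
    header = [h.strip() for h in next(reader)]
    idx = header.index(column)

    spikes = []
    for row_number, row in enumerate(reader, start=1):
        if idx >= len(row):
            continue  # blank or truncated line
        try:
            value = float(row[idx].strip())
        except ValueError:
            continue  # cell is not a number, skip it
        if value > limit:
            spikes.append((row_number, value))
    return spikes


# Invented sample data, not taken from a real log.
sample = (
    "Date , GPU Temperature [°C] , Fan Speed (RPM) \n"
    "2020-03-01 10:00:00 , 62.0 , 1500 \n"
    "2020-03-01 10:00:01 , 91.0 , 2400 \n"
    "2020-03-01 10:00:02 , 64.0 , 1600 \n"
)
print(find_temp_spikes(sample))
```

For a real log you would read the file's contents into a string and pass that in instead of `sample`.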
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
28,486
20,574
146
I have read a fair number of users complain about that card running too hot. If that turns out to be the issue, undervolting, and if needed, a more aggressive fan curve, will help.
 

Iron Woode

Elite Member
Super Moderator
Oct 10, 1999
30,880
12,386
136
I have read a fair number of users complain about that card running too hot. If that turns out to be the issue, undervolting, and if needed, a more aggressive fan curve, will help.
Perhaps it needs a good cleaning and new thermal paste?
 

djeyewater

Member
Apr 15, 2007
37
0
66
Thanks, will try both of your suggestions.

About 5 minutes after my original post the system locked up with the exact same issue. So it's probably not related to gaming or load on the card. I've now changed the card back so it's only connected to a single PCI-e power cable (both 8 pins connected from a single cable). Also, although I removed the card, I didn't boot the system with the card removed, just put it back after changing the cable, and the system came back up with the card working. I didn't mention the PSU, which is a Corsair CX750M.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
28,486
20,574
146
You know, I cannot recall ever having used Nvidia and AMD, in the same system, at the same time, before. I wonder if there is a driver conflict involved?
 

Iron Woode

Elite Member
Super Moderator
Oct 10, 1999
30,880
12,386
136
You know, I cannot recall ever having used Nvidia and AMD, in the same system, at the same time, before. I wonder if there is a driver conflict involved?
Possible.

I tried that way back in the day and all I got was random BSODs.
 

Iron Woode

Elite Member
Super Moderator
Oct 10, 1999
30,880
12,386
136
Thanks, will try both of your suggestions.

About 5 minutes after my original post the system locked up with the exact same issue. So it's probably not related to gaming or load on the card. I've now changed the card back so it's only connected to a single PCI-e power cable (both 8 pins connected from a single cable). Also, although I removed the card, I didn't boot the system with the card removed, just put it back after changing the cable, and the system came back up with the card working. I didn't mention the PSU, which is a Corsair CX750M.
There is the distinct possibility the Vega card is dying.
 

Feld

Senior member
Aug 6, 2015
287
95
101
You know, I cannot recall ever having used Nvidia and AMD, in the same system, at the same time, before. I wonder if there is a driver conflict involved?
They run together just fine with no conflicts when it comes to mining, but I have no experience with a more normal machine.
 

DrMrLordX

Lifer
Apr 27, 2000
21,631
10,843
136
You sure it is a card problem and not a motherboard problem? I've had PCIe devices disappear on my X570 board before (it's always my NVMe drive in slot #2). Usually this happens after some kind of crash due to GPU overclocking or RAM overclocking. I have to cold boot to get the system to fix the problem.

I also had that problem from time to time on the X370 Taichi.
 

djeyewater

Member
Apr 15, 2007
37
0
66
Thanks for the suggestions everyone. The problem could be the Vega GPU itself, a conflict with the other GPU, the motherboard, or the PSU. Unfortunately the machine doesn't BSOD, it just locks up completely, so there are no minidumps to analyse.
The board does seem a bit flaky though. Today I was trying to swap out the NVMe drive for a larger one and boot off the current one in a USB caddy (the board only has 1 M.2 slot). Sometimes it would see the USB drive at boot / in the BIOS, other times it wouldn't. I've just gone back to the original drive mounted on the board for now, as I've got work I need to get done.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,340
10,044
126
You know, I cannot recall ever having used Nvidia and AMD, in the same system, at the same time, before. I wonder if there is a driver conflict involved?
I'm using an XFX reference blower RX 5700 8GB card as my primary (Windows desktop too), and an MSI GTX 1660 Ti 'Gaming X' as my secondary GPU. I'm hashing on both of them. Sometimes one or the other quits working in the mining app and the app needs to be restarted, or, rarely, I need to reboot the whole system. Though I will say, the last couple of Adrenalin 2020 drivers have been way more stable than they used to be. I rarely ever get a black screen or system freeze now. I think, though I'm not entirely sure, that I'm using the 441.41 drivers for the Nvidia card.
 

djeyewater

Member
Apr 15, 2007
37
0
66
So far, since switching the card back to a single PCI-e power cable, I haven't had any issues. Since the issue is intermittent that's no guarantee it's fixed, only time will tell. I checked the temps and they seemed fine. I also tried undervolting it, but the AMD drivers are (still) useless and stop using the manual settings.
 

linaaslt

Junior Member
Aug 8, 2013
20
6
81
I also tried undervolting it, but the AMD drivers are (still) useless and stop using the manual settings.
I'd disagree with you on the drivers being useless. I had a Vega 56 for a year and a half and never had any driver issues, even with undervolting. My card has been undervolted since I got it, and it's been rock solid. You might need to look for the problem somewhere else.
 

Feld

Senior member
Aug 6, 2015
287
95
101
Current drivers don't allow certain cards to change clocks. I'm using drivers from a year ago for my Radeon VII because anything newer won't allow me to overclock without delving into registry hacks. Not sure if the problem is just for Vega cards or everything older than Navi. But if anyone is having issues with manually changing clock speeds and voltages, the 19.5.2 drivers are the last ones that fully allowed it for Vega cards.
 

DrMrLordX

Lifer
Apr 27, 2000
21,631
10,843
136
@Feld

Part of the problem with Radeon VII is Radeon Settings/Radeon Software. The driver will allow overclocking, but you have to use external software like OverdriveNTool to make it stick.
 

Feld

Senior member
Aug 6, 2015
287
95
101
@Feld

Part of the problem with Radeon VII is Radeon Settings/Radeon Software. The driver will allow overclocking, but you have to use external software like OverdriveNTool to make it stick.
Yeah, I know it's possible with third-party tools. If I ever need to do so due to compatibility issues with the older driver I will, but that's added complexity that I don't need at the moment. The 19.5.2 drivers are rock solid and are currently giving me plenty of performance for mining and gaming (mostly Borderlands 3 right now), so until I'm either forced into making a change or AMD fixes overclocking in their own software, I'm content as is.