GPU Driver or Hardware Issue?

Enthyos

Junior Member
Jan 13, 2021
3
1
41
Hello folks,

Hopefully I am in the right section, as this issue appeared after my GPU upgrade from an RTX 2080 Ti to an RTX 3090.
I am encountering an issue with one of my most played games (World of Tanks): the client crashes to a black screen at random times (sound and other programs still run in the background; I can still use TeamSpeak, for example).
This issue only happens in World of Tanks and is reproducible while in battle; it does not happen while idle or just browsing around in the garage interface. No other game I have tried so far has crashed, whether DX11 or DX12.
*I did notice one weird thing that went away after the last driver update (v461.09): the in-game gamma settings were greyed out, so I thought I’d mention it anyway.
I am at a loss here, as I have run out of ideas and can’t figure out whether this is a hardware issue or somehow tied to the way the game works.
The crashes began in mid-November, and since then I have contacted the player and technical support teams at Wargaming (the company behind World of Tanks), creating multiple tickets that have led nowhere so far.
My rig config – pc part picker link here.
The hardware ranges in age from new (the GPU) to about a year and a half old for most of the system.
OS wise the system was installed around 08/19.

Here is a list of things I tried after reading around on the internet on various forums and asking all my tech-savvy friends for help:
§ Updated and rolled back the GPU drivers using DDU in Safe Mode (not connected to the internet, Windows driver updates off). Tried these versions (Studio and Game Ready): 442.74 (did not recognize my GPU, as expected), 456.38, 457.30, 460.97 (beta), and the latest drivers as well. Now running Game Ready 461.09.
§ Made sure my temps are low and there is no overheating: at an ambient temperature of roughly 23-24 °C, I reach a maximum of 71 °C on the GPU and 68-70 °C on the CPU during a stress test, and lower temps under intense gaming (CoD – Cold War / Metro Exodus, for example), with the CPU in the high 50s °C and the GPU in the low 60s °C.
§ Cleaned the registry using CCleaner.
§ Set the page file to a higher value, then reverted to the Windows default (32 GB of physical RAM should be more than enough, but hey, I gave it a shot anyway)
§ Updated to the latest chipset drivers from the AMD site.
§ Updated the GPU firmware and latest Aorus software.
§ Updated to the latest BIOS (now running F31j)
§ Disabled the XMP profile and used all stock settings in BIOS.
§ Updated to the latest Creative Sound Blaster sound card drivers
§ Used the windows default recommended sound settings (suggested by the Wargaming Support team)
§ Rebooted my router & network devices for a few minutes (suggested by the Wargaming Support team)
§ In Nvidia Control Panel set the game to use Maximum Performance mode as some user recommended in a similar issue with Apex Legends.
§ Disabled G-Sync and gave it a go with V-Sync both on and off (using a G-Sync monitor at 1440p, 165 Hz)
§ Downclocked my GPU using the Aorus tool to the reference RTX 3090 specs.
§ Enabled and tested Low Latency mode
§ Tried running the game with antialiasing disabled in-game and set to Off in NVCP as well.
§ Launched the game in Windowed - Windowed borderless - Fullscreen
§ Lowered the resolution from 1440p to 1080p and 720p
§ Tried to run the game with the lowest graphical settings in all the above resolutions
§ Since the game client has the option – tried both x32/x64 clients
§ Tried Compatibility mode Windows 7 & 8
§ Launched the game in safe mode
§ Used both HD and SD client (lower end machines version)
§ Ran the game integrity check (suggested by the Wargaming Support team)
§ Added the game as a firewall exception (suggested by the Wargaming Support team)
§ Removed the Appdata WoT folder that contains all the game settings (suggested by the Wargaming Support team)
§ Reinstalled the game multiple times
§ Made sure the DirectX Suite is up to date
§ Uninstalled Easy Tune Engine Service/ Team Viewer (as suggested by the Wargaming Support team)
§ Disabled & Uninstalled iCue / Aorus Software / RGB Fusion 2 (suggested by the Wargaming Support team)
§ Clean booted - Windows Default Only
§ Deleted the Nvidia shader cache
§ Disabled onboard sound in Device Manager
§ Used a TDR Manipulator and disabled / reenabled it.
§ Updated my Windows 10 to the latest build 20H2
§ Replaced my PSU fairly recently (end of 2019)
§ Ran a 3D Mark benchmark (Firestrike Demo) while logging with Hwinfo64 * Sensors Only – Log here (3D Mark Results + Hwinfo64 Log) (advised by the user Greybear on Nvidia Forums)
After the latest tests (here), which the user Greybear advised in my post on the Nvidia Forums, he pointed out an anomaly in the log file readings: you can see it in the images, in the 12V LOW column, as "0" entries.
Open the Google Drive link above, access the .csv file, and scroll across to columns LV to MD to see all the 0 entries. Notice that the 12V readings appear in the wrong column on some of the rows with a 0 reading in column LV.
He recommended that I set the RAIL performance to SINGLE RAIL on my PSU.
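
For anyone who would rather not scroll through the spreadsheet by hand, the zero-reading check described above could be sketched roughly like this in Python. This is only a minimal sketch: the column name "+12V [V]" and the sample rows are made up for illustration, and the real header in a HWiNFO64 log depends on the motherboard sensor, so it would need to be copied from the actual .csv file.

```python
import csv
import io


def find_zero_readings(csv_text, column):
    """Return (row_number, raw_value) pairs where `column` reads exactly zero."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row_number, row in enumerate(reader, start=1):
        # Missing column or short row yields None, which is treated as blank.
        raw = (row.get(column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            continue  # skip blanks and non-numeric cells
        if value == 0.0:
            hits.append((row_number, raw))
    return hits


# Made-up sample data, NOT taken from the actual log.
# "+12V [V]" is a hypothetical column header.
sample = (
    "Time,+12V [V]\n"
    "10:00:01,12.096\n"
    "10:00:02,0.000\n"
    "10:00:03,12.048\n"
    "10:00:04,0\n"
)
zero_rows = find_zero_readings(sample, "+12V [V]")
print(zero_rows)
```

With a real log, the file contents would be read into a string first and passed in place of `sample`. A zero in a sensor log can also be a logging glitch rather than a real voltage drop, so the result is only a hint about where to look.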

My knowledge of PSUs is quite limited. Any thoughts on that?

*I will redo the benchmarks and log the info as soon as I get home from work.*
After digging for leads in my crash logs, I found the following information in the game’s crash log and the Windows event log:

The game's Python log (here) for World of Tanks ends with this after a crash:
INFO: [Info] FATAL ERROR: RenderSystemDeviceAccess::genericInterfaceCallFailureHandler - device in unsupported state: The device has been removed.

The timestamp of this entry seems to match the report in the Windows event log, which points to nvlddmkm (the Nvidia driver) failing:

" Event id - 1062600691 in Source "nvlddmkm" cannot be found. The local computer may not have the necessary registry information or message DLL files to display the message, or you may not have permission to access them. The following information is part of the event: \Device\Video3 Graphics Exception on (GPC6, PPC 0)"
I have tried capturing a screenshot right after the black screen crash to see if it shows any information, but the screenshots are black.
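
The timestamp match between the game log and the Windows event log mentioned above could also be checked with a small script instead of by eye. Below is a rough Python sketch of that idea. The timestamps are made up, the 5-second tolerance is an arbitrary assumption, and parsing the two real log formats into `datetime` values is left out because their exact formats are not shown here.

```python
from datetime import datetime, timedelta


def match_events(game_crashes, driver_events, tolerance=timedelta(seconds=5)):
    """Pair each game crash time with every driver event within `tolerance`."""
    matches = []
    for crash in game_crashes:
        for event in driver_events:
            if abs(crash - event) <= tolerance:
                matches.append((crash, event))
    return matches


# Made-up timestamps for illustration only, NOT from the real logs.
game_crashes = [datetime(2021, 1, 12, 21, 14, 3)]
driver_events = [
    datetime(2021, 1, 12, 21, 14, 1),  # two seconds before the game crash
    datetime(2021, 1, 12, 19, 0, 0),   # unrelated earlier event
]
pairs = match_events(game_crashes, driver_events)
print(len(pairs))
```

If every game crash pairs up with an nvlddmkm event, that supports the idea that the driver reset is what takes the game down, rather than the other way around.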

I am aware that this is a long post, but I still hope there is a way to fix this.
Thank you very much for your time and patience.
Any help is highly appreciated.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Your error means the GPU driver has crashed, and one possible reason is the card crashing due to power delivery problems.

I'd do the following: download MSI Afterburner, which will also install RTSS. After the install and a reboot, go to RTSS and add your game (this can be done while WoT is running: shift-click on "Add" and choose the WoT executable).
Set the framerate limit to something quite low, like half of your monitor's refresh rate (144/2 in this case).

Play the game (it should now be limited to whatever FPS you set). If it still crashes, the problem is probably not power related.
 

Enthyos

Hello JoeRambo, thank you for your reply.

I have tried something similar by downclocking my GPU with Aorus Engine to the RTX 3090 FE clock level.
That seemed to help to some extent, but it does not stay crash free.
If it happens to be related to power problems, what would you recommend? A new PSU, and if so, any brand in mind?
I will be giving your suggestion a shot right now and come back with feedback.
 

Enthyos

Hey JoeRambo,

I know it has been a long time since my last post. Unfortunately the issue is not fixed, and the Gigabyte Support team redirected me to my local vendor, who offered me a refund but not a replacement due to low stock.

Meanwhile, after following your recommendations, loading RivaTuner Statistics Server, and doing even more testing over an extended period of time with Aorus Engine, I did end up with a stable GPU while playing World of Tanks.

With RTSS alone I only had partial success (weirdly enough, it was stable at 59 FPS), so I kept looking for a solution and ran different clocks and configurations.

Stable settings I am currently using with an alternate profile in Aorus Engine: GPU Clock 1625 / Memory Clock 19502 / Fan Speed Auto / Power Target 100 at a target temp of 83 degrees Celsius, limited to 120 FPS via Nvidia Control Panel.

The issue has occurred only once since then, while playing League of Legends with this custom World of Tanks profile rather than the standard Aorus Extreme OC (GPU Boost 1860 MHz / Memory Clock 19500), and it has not reproduced since.

No other games or workloads such as rendering have crashed my GPU, so I decided to keep the card a while longer, at least until the availability issues are resolved.

Hopefully this workaround will help somebody else out there.
 