Question Am I experiencing storage failure or something else?

MintyZen

Junior Member
Nov 2, 2011
8
0
66
A few days ago my main rig started experiencing random BSODs. Interestingly enough, they come in stages: first the application I'm using freezes, then the whole desktop, including the mouse pointer, and finally the blue screen cometh.
After that, the usual data collection process doesn't even begin - the percentage indicator stays at 0% and I have to manually shut down. Memory/kernel dumps are not created either.

This is, sadly, not the first time I'm dealing with random BSODs: around last June I had a very similar issue. The main difference was that the system would flash a blue screen before rebooting no matter what I set in Startup and Recovery, and still not generate dumps.

What fixed the issue back then was replacing the SSD, a Samsung 970 Evo which I bought in 2018 (same time as the rest of the build) with a 970 Evo Plus (due to the Evo no longer being available). Where it gets weird is that after RMA'ing the drive to Samsung (it was under warranty but the process would take a month and I needed my system back), they sent it back saying that they could not find any issue when they tested it.

As a last-ditch attempt I also tried wiping everything and reinstalling Windows: when doing so I usually clean the drive via DiskPart to have the installer operate on an unpartitioned drive, but this time the "clean" command would fail with an error message. I worked around the issue using the sanitize NVMe function of my motherboard's BIOS, which allowed me to install Windows again, only to get another BSOD within an hour.

So, once again to me the SSD looks to be the culprit, however it seems strange that
- The original SSD would come back with a clear bill of health from the manufacturer
- I'd get two bad drives in a row, from a pretty reputable manufacturer. Seems like exceedingly bad luck

But if the SSD is not the cause, then what? If it was the motherboard or RAM, swapping the drive last year would not have solved the issue. And it's not like I have swapped any other components recently.
 

JimKiler

Diamond Member
Oct 10, 2002
3,558
205
106
Is your PSU faulty and harming your SSD? But if the PSU is the issue I would expect the mobo to be affected more than the SSD.
 

Tech Junky

Diamond Member
Jan 27, 2022
3,407
1,142
106
SSDs in general don't exhibit symptoms; they just simply die. There may be some errors / blocks / etc. that get marked as unused, but generally they don't show the classic indicators of a failure coming.

Locking and freezing can be an assortment of things though. We all tend to think they're related to RAM / SSD / etc. and sometimes it's actually something dumb like a driver.

I had some random reboots happening on my laptop, and all I could glean from the logs was "kernel-power". I had gotten a different power brick shortly before this, so it sort of made sense that it could be a power issue. I still had the original brick as a standby, though. Well, after a few weeks of inconsistent reboots I finally dove in to figure out the issue. I tried both bricks, different things, and finally booted into Linux with a live image. Linux stayed booted for a couple of days w/o any reboots. This ruled out HW / power as the actual issue and pointed to something in the OS instead.

I eventually tracked it down to an Intel driver update causing all of this. Rolled back the driver and everything was stable again. Sometimes it's just a coincidence when things happen, and blaming one thing over the other makes for a long day of trying to figure it out.
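For that kind of log triage, here is a minimal Python sketch that counts unexpected restarts per day from an exported event list. It relies on the fact that Windows logs Kernel-Power event ID 41 when the system restarts without a clean shutdown. The CSV layout (columns "Date", "Source", "Event ID") and the sample rows are assumptions made up for illustration, not the format any particular tool exports.

```python
import csv
import io
from collections import Counter

KERNEL_POWER_SOURCE = "Microsoft-Windows-Kernel-Power"
UNEXPECTED_SHUTDOWN_ID = "41"  # logged when Windows restarts without a clean shutdown

# Hypothetical export: the column names and rows are invented for this example.
SAMPLE = """\
Date,Source,Event ID
2023-02-01T10:15:00,Microsoft-Windows-Kernel-Power,41
2023-02-01T18:40:00,Microsoft-Windows-Kernel-Power,41
2023-02-02T09:00:00,Microsoft-Windows-Kernel-General,12
2023-02-03T21:05:00,Microsoft-Windows-Kernel-Power,41
"""


def unexpected_shutdowns_per_day(csv_text):
    """Count Kernel-Power 41 events per calendar day.

    Assumes 'Date' starts with an ISO day such as 2023-02-01.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        is_kernel_power = row["Source"] == KERNEL_POWER_SOURCE
        if is_kernel_power and row["Event ID"] == UNEXPECTED_SHUTDOWN_ID:
            counts[row["Date"][:10]] += 1  # keep only the YYYY-MM-DD part
    return dict(counts)


if __name__ == "__main__":
    for day, count in sorted(unexpected_shutdowns_per_day(SAMPLE).items()):
        print(day, count)
```

Lining the per-day counts up against driver updates or hardware changes is one way to tell a coincidence from a pattern.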
 
Jul 27, 2020
16,165
10,240
106
My guess is PSU or mobo. What are their models? Are the PSU voltage rails exhibiting significant variations? You can check them in the health section of the BIOS if it has one.
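As a rough way to judge "significant variations": the ATX spec allows +/-5% on the +12V, +5V and +3.3V rails. The Python sketch below checks readings against that tolerance. The sample readings are made up, and BIOS or software sensor values can be imprecise, so treat an out-of-spec result as a hint to measure with a multimeter rather than as proof.

```python
# Nominal voltages for the positive ATX rails and the +/-5% tolerance
# the ATX spec allows on them.
ATX_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05


def check_rails(readings):
    """Return {rail: (deviation_percent, within_tolerance)} for each known rail."""
    report = {}
    for rail, measured in readings.items():
        nominal = ATX_RAILS.get(rail)
        if nominal is None:
            continue  # skip sensors we have no nominal value for
        deviation = (measured - nominal) / nominal
        report[rail] = (round(deviation * 100, 2), abs(deviation) <= TOLERANCE)
    return report


if __name__ == "__main__":
    # Hypothetical readings copied by hand from a BIOS health page.
    sample = {"+12V": 11.35, "+5V": 5.04, "+3.3V": 3.31}
    for rail, (pct, ok) in check_rails(sample).items():
        print(f"{rail}: {pct:+.2f}% {'OK' if ok else 'OUT OF SPEC'}")
```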
 
Jul 27, 2020
16,165
10,240
106
But if the SSD is not the cause, then what?
I faced a serious issue with an MX500 SSD on an old Dell Optiplex based on a 3rd Gen Core CPU. The system would just lock up upon boot. Had to give it back to the dealer and get a Samsung 850 EVO. No issues after more than a year. You could try going with a different brand of SSD.

Did have a bit of trouble convincing the dealer to replace the MX500 because when he checked it by installing Windows on it on his own spare machine, it worked flawlessly.
 

MintyZen

Junior Member
Nov 2, 2011
8
0
66
PSU should be a good one, actually: I have a Corsair SF600 Platinum. Rest of the system is Ryzen 7 3700X/Asrock Fatal1ty X470 Gaming-ITX-ac/Samsung 970 Evo Plus. I had already updated both Nvidia's drivers and the AMD chipset ones to the latest available.

I also tried a few more things in the last few days:
  • Ran the Windows memory test, most extended one. After 12+ hours I stopped it myself since it was not finding any errors. A subsequent standard run, which completed in 3 hours, also found nothing
  • Hooked up a SATA SSD to the system, secure erased it as well as the original NVME drive, reinstalled Windows to the SATA SSD while leaving the NVME drive connected but unpartitioned/uninitialized
  • Installed drivers, updates, and my usual software loadout (Windows, Office, Visual Studio and VS Code, SDKs, 30+ apps from the Microsoft Store) to the SATA SSD, set everything up as I usually do - this lasted about four hours and no blue screens
  • Installed a fairly intensive game (God of War), left it running for a couple of hours - no blue screens
  • Ran CrystalDiskInfo to check SMART status for the drives - all reported as in good condition
At this point I'm suspecting the motherboard is the culprit, either directly or because the NVME slot is in the back (Mini-ITX) and causes the drive to overheat. Does this sound reasonable?
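One way to test the overheating theory is to look at the drive's own NVMe health log rather than just the overall "good" verdict. The Python sketch below assumes a JSON report in the shape that smartctl (a different tool from the CrystalDiskInfo used above) writes with its JSON output option. The key names are what I understand that output to contain, so check them against a real report. The 70 C limit is a placeholder assumption, not a figure taken from the drive's datasheet.

```python
import json


def nvme_health_findings(report, temp_limit_c=70):
    """Return a list of human-readable warnings found in a parsed report.

    temp_limit_c is a placeholder; use the limit from the drive's datasheet.
    """
    log = report.get("nvme_smart_health_information_log", {})
    findings = []
    if not report.get("smart_status", {}).get("passed", True):
        findings.append("overall SMART status: FAILED")
    if log.get("critical_warning", 0) != 0:
        findings.append(f"critical warning flags set: {log['critical_warning']}")
    temp = log.get("temperature")
    if temp is not None and temp >= temp_limit_c:
        findings.append(f"temperature {temp} C at or above {temp_limit_c} C")
    if log.get("warning_temp_time", 0) > 0:
        findings.append(f"{log['warning_temp_time']} min spent above the warning temperature")
    if log.get("media_errors", 0) > 0:
        findings.append(f"{log['media_errors']} media errors recorded")
    return findings


if __name__ == "__main__":
    # Made-up report for illustration; not data from the system in this thread.
    sample = json.loads(
        '{"smart_status": {"passed": true},'
        ' "nvme_smart_health_information_log":'
        ' {"critical_warning": 0, "temperature": 74,'
        ' "warning_temp_time": 12, "media_errors": 0}}'
    )
    for finding in nvme_health_findings(sample):
        print(finding)
```

A non-zero time above the warning temperature would support the overheating theory even when the overall status still reads as good.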
 
Jul 27, 2020
16,165
10,240
106
At this point I'm suspecting the motherboard is the culprit, either directly or because the NVME slot is in the back (Mini-ITX) and causes the drive to overheat. Does this sound reasonable?
Very. Does the SSD have a heatsink? If not, consider these:

Amazon.com: NVMe M.2 SSD Cooler Heatsinks with 20mm Fan Powerful Cooling… : Electronics
Amazon.com: Sabrent M.2 2280 SSD Rocket Heatsink (SB-HTSK) : Everything Else

Crazy big one: Amazon.com: ElecGear M.2 2280 SSD Cooler, M11 Angle and Height Adjustable Heat Pipe + Solid Aluminum Heatsink with PWM Cooling Fan for 80mm PCIe NVMe and SATA SSD Internal Solid State Drive, Thermal Pads included : Electronics

ElecGear M.2 2280 SSD Cooler, Aluminum + Copper Heatsink with 4Pin PWM Cooling Fan for 80mm PCIe NVMe or SATA NGFF M2 SSD Internal Solid State Drive, Thermal Pads Included - Newegg.com

Amazon.com: ElecGear M.2 2280 SSD Heatsink, CM-01 Heat Pipe + Double Deck Solid Aluminum Heat Sink Cooler for 80mm PCIe NVMe and SATA M2 SSD Internal Solid State Drive Memory, Thermal Pads Included : Electronics

Another mad cooler: https://www.amazon.com/ElecGear-EL-80X-Aluminum-Heatsink-Internal/dp/B09G74HZFL/

Amazon.com: EZDIY-FAB M.2 SSD heatsink 2280, Double-Sided Heat Sink, High Performance SSD Radiator for PC / PS5 for PCIE NVME M.2 SSD or SATA M.2 SSD- Black : Electronics


One more big baddie: Amazon.com: ineo M.2 2280 SSD Rocket Heatsink Built-in Cooling Fan [M3] : Electronics
 