Discussion Warning about WD SN850x NVMe drives

repoman0

Diamond Member
Jun 17, 2010
4,473
3,312
136
I've gone through three of these drives now on two different AMD systems with two Windows installs. The first two drives I wrote off as compatibility issues with my Zen 3 X570 setup, something wrong with the NVMe slot, etc. But I'm on X670E now with a brand new 2TB SN850x drive with a fresh Windows install and the same exact thing keeps happening. 3/3 failure count.

The issue is a BSOD triggered by stornvme drive controller failures. I've had the Samsung PM981 that this drive replaced for years with no storage-related issues on my past system and it's about to go back in. Seems to be some kind of controller or firmware bug on these new WD drives. Stay away.
 

Tech Junky

Diamond Member
Jan 27, 2022
3,407
1,142
106
I have the non-X version and it's been fine for over a year now. I also picked up some 770s for TB4 testing and they've been fine for a few months as well. 3/3 across different systems isn't a fluke IMO, but it might just be a Windows issue rather than the drives. Have you tried Linux instead?

Also, WD has its own "Dashboard" app for monitoring the drives and updating the firmware.

 

repoman0

Diamond Member
Jun 17, 2010
4,473
3,312
136
Yeah, I installed the dashboard and all three drives have the latest firmware from the factory and are declared perfectly healthy by the app. I haven't tried Linux on this drive .. the PM981 was supposed to be my Linux drive but I just wiped it to clone Windows. I'd think the regular SN850 is perfectly fine after all this time but the X variant is a new/updated controller. Could very well be a Windows issue but stornvme is mature software and IMO it's up to WD to make their controller compatible.
 

Tech Junky

Diamond Member
Jan 27, 2022
3,407
1,142
106
Yeah, WD should have MSFT update the driver for the controller if support isn't already included, but it happens. Linux, on the other hand, as you know tends to be a little more cutting edge when it comes to drivers. That said, my onboard NIC is a Realtek (RTL-something) and I had to blacklist the default module to force it to use the correct one, as it wouldn't bring up the interface correctly.

I haven't bothered looking at the X version though, as I have more drives than I know what to do with at this point. I have 2 in the laptop (an 850 and a 770), the server has an 850, there's another 770 in the TB4 enclosure, and a couple of others are just sitting around after being superseded by the 850/770 upgrades.

I would test in Linux though to see what happens before returning them. Not ideal, but if it works well it buys time for the controller issue to get fixed if you later swap back to Windows or dual boot. I don't recall seeing many issues with WD though.
 

repoman0

Diamond Member
Jun 17, 2010
4,473
3,312
136
I hear you, it might be okay in Linux, but my patience for it is up. I’m going to just send it back while my return window is open. Spent too much time trying to get the drive to work as it is, and compatibility suspicions were part of what pushed me to upgrade to Zen 4.

The error also only happens around once per day or less, so super annoying to wait for it. Nothing obvious seems to trigger it. Usually the system is just idle .. maybe a power management bug, who knows.
 

R81Z3N1

Member
Jul 15, 2017
77
24
81
I run openSUSE Tumbleweed and have a 500GB WD Black SN850. At times all my drives seem to disconnect and I get errors saying program X can't write to the device. A reboot later all is fine. It happened this morning.

My take is that my motherboard and BIOS take a dump. It happens roughly every 3 months: the drives just unmount for no reason. Could be a hardware problem; I am running a B550 board, a Gigabyte Aorus Elite to be exact. Could be some weird software bug, but it looks like a hardware or BIOS bug of some kind.

The Samsung drive in my system has the firmware bug where it adds entries to the error log even though no significant errors are found. Both are Gen3 devices. According to the Samsung bug reports it needs a firmware fix, which is not going to happen anytime soon.

I am running a 3800X on the board, and it seems very stable with no overclocking, but like I said, not four or five nines of uptime. Also, at times my network card disconnects; it's a 10Gb card in the last x16 slot. Likewise, a reboot solves that problem most of the time. It just seems that my hardware is not recognized until a reboot. I have troubleshot the issue, and the system cannot find the card: it is not listed under lspci or in other logs until a reboot.

Could be buggy BIOS and firmware. Like others have suggested, I would say find a better brand. I had a post months ago about WD, and for the price I would go with a different brand, but that is just me.

Next time it happens I would say try a rescue CD. Could be a Windows/BIOS/firmware issue.

R81Z3N1
 

Tech Junky

Diamond Member
Jan 27, 2022
3,407
1,142
106
@R81Z3N1

Sounds more like an OS issue than hardware. I use Ubuntu and none of that happens.

I might do an update or something that knocks out bonds and bridges, but that's a simple network restart to bring them back online. The hardware doesn't disappear, just the logical interfaces due to the update. Drives should never disappear though.

If you have firmware bugs you should take the time to upgrade the firmware. Grab a DOS image and run the update.
 
  • Like
Reactions: igor_kavinski
Jul 27, 2020
16,165
10,240
106
Update the BIOS on your mobo. If that doesn't solve the issue, check if your PSU voltages are within +/- 5% in the BIOS/UEFI health section. Sometimes the PSU craps out and starts delivering slightly high voltages which then lead to unpredictable behavior.
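If you'd rather not eyeball the numbers, the ±5% check can be sketched in a few lines of Python. This is only an illustration: the nominal rail values are the standard ATX +12V/+5V/+3.3V figures, the readings are ones you type in by hand from the BIOS/UEFI health page, and the function names are made up for this example.

```python
# Nominal ATX rail voltages. The 5% tolerance is the figure from the post above.
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}


def rail_in_tolerance(nominal, measured, tolerance=0.05):
    """Return True if the measured voltage is within +/- tolerance of nominal."""
    return abs(measured - nominal) <= nominal * tolerance


def check_rails(readings, tolerance=0.05):
    """Map each rail name to True (in range) or False (out of range).

    `readings` is a dict of hand-entered values, e.g. {"+12V": 12.3}.
    """
    return {
        rail: rail_in_tolerance(NOMINAL_RAILS[rail], measured, tolerance)
        for rail, measured in readings.items()
    }


if __name__ == "__main__":
    # Hypothetical readings, not taken from anyone's system in this thread.
    for rail, ok in check_rails({"+12V": 12.3, "+5V": 5.3, "+3.3V": 3.3}).items():
        print(rail, "OK" if ok else "OUT OF RANGE")
```

Keep in mind that motherboard sensor readings can be imprecise, so a rail that looks slightly off here is a reason to measure properly rather than proof the PSU is bad.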
 

R81Z3N1

Member
Jul 15, 2017
77
24
81
My PSU is a Bronze unit, a little old but a good one: a Seasonic 650 Bronze. The network card is a Chelsio T420-CR. I wish Samsung had newer firmware for the drive, which is a 970 Evo 256 GB in an adapter card. Samsung has a bug tracker, but the comments say won't fix. The only thing the bug does is increment the error log count by 1; if you look at the log file, the count goes up but no actual error is reported.

The errors I describe above only happen once in a blue moon, like once every 3 months for the write issues. Other than the annoyance of the Samsung entries in the log files, everything looks good. I do use a UPS, so power spikes should be under control. But it could very well be an aging PSU acting up at times, plus buggy firmware.

R81Z3N1
 

R81Z3N1

Member
Jul 15, 2017
77
24
81
OK, just going to give you all the SMART status of the drives.

M.2 Drives Gen 3

Samsung 970 Evo 256 GB

Power on Hours 6,266
Available spare 100%
Error log entries 552

WDS500G3X0C

Power on Hours 13,568
Available Spare 100%
Error Log entries 1
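For anyone who wants to pull the same three fields without reading the full report, here is a rough Python sketch. It assumes smartmontools 7.0 or newer, whose `smartctl -a -j <device>` prints JSON, and it assumes the NVMe health fields sit under `nvme_smart_health_information_log` with the key names used below, so check those against your own output. The function name is made up for this example, and the sample is hand-built from the Samsung numbers above, not real smartctl output.

```python
import json


def summarize_nvme_health(smartctl_json):
    """Pick power-on hours, spare, and error-log count out of smartctl JSON.

    Key names are assumptions based on smartmontools' JSON output;
    any field that is missing comes back as None.
    """
    report = json.loads(smartctl_json)
    log = report.get("nvme_smart_health_information_log", {})
    return {
        "power_on_hours": log.get("power_on_hours"),
        "available_spare_pct": log.get("available_spare"),
        "error_log_entries": log.get("num_err_log_entries"),
    }


# Hand-built sample using the Samsung 970 Evo figures listed above.
sample_report = json.dumps({
    "nvme_smart_health_information_log": {
        "power_on_hours": 6266,
        "available_spare": 100,
        "num_err_log_entries": 552,
    }
})

if __name__ == "__main__":
    print(summarize_nvme_health(sample_report))
```

On a real system you would feed it the text captured from smartctl (which usually needs root) instead of the sample string.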

The Samsung drive is the oldest drive I have but only reports 6k hours, while my WD has twice that, so something a little strange is going on with the Samsung. If I put it back into a Windows machine I bet I would not get any more error log entries.

I will at some point retire the Samsung; it might make a good boot drive. But at the moment I have way too many computers and don't play that many games.

R81Z3N1
 

Justinus

Diamond Member
Oct 10, 2005
3,173
1,515
136
Sounds like an OS or firmware bug. I have a 4TB model and it has worked flawlessly in X570 and X670E. I haven't tried a 2TB model.
 

R81Z3N1

Member
Jul 15, 2017
77
24
81
The Samsung does have a firmware bug, and according to the bug tracker they won't fix it. It's on an adapter card and I have a P300A case with good airflow.

I am not impressed with Samsung's support for Linux, and for some products like monitors it seems a firmware update is the last thing on their minds. I guess the question going forward should be what uses a 256 GB drive has in today's systems.

As a boot drive it is a little slow and on the small side, and for gaming it is too small. At this point I feel it would still be the best option if it weren't for those stupid +1 error count entries.

R81Z3N1
 

repoman0

Diamond Member
Jun 17, 2010
4,473
3,312
136
I ended up with a 2TB Hynix P41 found on sale for $200. It’s been working perfectly with no lockups or crashes for over a week. So confirmed it was the drive, not a problem with the slot or PCI-e 4.0 on my board or something.