Samsung 850 PRO SSD Failure imminent?

chongov

Junior Member
Nov 13, 2017
3
0
1
About 2 years ago I put together a server for my wife's small business. Windows 2012 R2 on a Lenovo RS140 server, with 2 Hyper-V VMs running off of a Samsung 850 PRO SSD. I figured the 850 Pro would last forever (>> 2 years), but recently the server has been acting up so I ran Samsung Magician, with the following results:

[Screenshot: Samsung Magician results (image not available)]


Should I RMA the drive? Is it plausible that an 850 Pro would fail after just 2 years and 7TB of writes? Am I doing something wrong that would accelerate SSD failure? I fully expected this thing to last many years, based on the reviews and the positioning of that product.

Any advice is appreciated.

Thanks,
Mike
 

UsandThem

Elite Member
May 4, 2000
16,068
7,383
146
Just RMA it if you have issues with it. Just like any other electrical device out there, some will be defective or fail early. No manufacturer can guarantee that all their products are free from defects. But since it has a 10 year warranty, you get another unit.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,574
10,211
126
7TB of writes should be "nothing" to any competent MLC-based SSD, especially of sufficient size. I would not expect a NAND-wearout-derived failure that early in an SSD's life. That said, you may need to RMA, as it may have other issues. But disconcerting to be sure, for such a highly-rated and highly-touted SSD. Then again, it's still a "consumer" (albeit high-end) SSD, and not a "Server-class SSD", with HetMLC or eMLC. Which is what you should really be using.
 

ronbo613

Golden Member
Jan 9, 2010
1,237
45
91
When you make a few million of anything, pretty good chance a few will be bad. Send it back, get a new one.

Sooner or later happens to everyone.
 

chongov

Junior Member
Nov 13, 2017
3
0
1
Then again, it's still a "consumer" (albeit high-end) SSD, and not a "Server-class SSD", with HetMLC or eMLC. Which is what you should really be using.

I'm looking at replacing it with an Intel DC S4500 or S3520. I see these are not HET drives but would you regard them as "Server Class" and less likely to fail than the 850 Pro?

Thanks,
Mike
 

nk215

Senior member
Dec 4, 2008
403
2
81
I'm looking at replacing it with an Intel DC S4500 or S3520. I see these are not HET drives but would you regard them as "Server Class" and less likely to fail than the 850 Pro?

Thanks,
Mike

If you can, spring for an S3710. I have a stack of four 600GB S35xx drives on my desk, but when it came to the business server, I used an S3710 (400GB).

I had a consumer SSD go bad on a business server after around 2 years. I was lucky to be able to retrieve the data off of it (the backup was half a day old).

The strange thing was that after I baked the SSD in an oven for 30 minutes, it has been working fine as temp storage in a USB3 enclosure for home use ever since. This is an m5 Crucial drive.

I have 20+ SSDs at home and none ever went bad in a span of about 8-10 years, until the one at the office did.

Then again, my first power supply that went bad was also the one at the office. An old PSU that came with a Dell T110ii E3 ESXi machine. The darn machine kept on restarting under load. Considering it had all SSD storage, the PSU was never stressed that hard.
 

coercitiv

Diamond Member
Jan 24, 2014
7,252
17,092
136
I'm looking at replacing it with an Intel DC S4500 or S3520. I see these are not HET drives but would you regard them as "Server Class" and less likely to fail than the 850 Pro?
7TB of writes is not a problem for an 850 Pro in any shape or form. I have an 850 Pro and an 840 Pro, both over 7TB host writes, both over 3 years old, and both report being in top shape.

Whatever happened to that SSD (currently the screenshot is not being displayed properly), it was not due to NAND endurance problems. Samsung rates even their 128GB 850 Pro at 150TBW, and their internal testing found they can write up to 8000 TB on it. At worst the 256 GB model is good for 300 TB of writes.
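
As a rough back-of-the-envelope check of the figures in this thread, the sketch below works out what share of the rated endurance 7 TB of host writes represents, and how long it would take to reach the rating at the same pace. It assumes the 150 TBW rating quoted above and the 2 years of use from the original post; the function names are made up for illustration, and host writes ignore write amplification, so treat the result as an approximation only.

```python
def endurance_used(host_writes_tb: float, rated_tbw: float) -> float:
    """Fraction of the rated endurance consumed by host writes."""
    if rated_tbw <= 0:
        raise ValueError("rated_tbw must be positive")
    return host_writes_tb / rated_tbw


def years_to_rating(host_writes_tb: float, years_in_use: float, rated_tbw: float) -> float:
    """Years needed to reach the rating if the write rate stays constant."""
    if host_writes_tb <= 0 or years_in_use <= 0:
        raise ValueError("host_writes_tb and years_in_use must be positive")
    writes_per_year = host_writes_tb / years_in_use
    return rated_tbw / writes_per_year


# Figures taken from the thread: 7 TB written in 2 years, 150 TBW rating.
print(f"{endurance_used(7, 150):.1%} of rated endurance used")   # 4.7%
print(f"about {years_to_rating(7, 2, 150):.0f} years to reach the rating")  # about 43
```

On these numbers, wear-out from writes alone looks very unlikely to explain the failure.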
 

Elixer

Lifer
May 7, 2002
10,371
762
126
Should I RMA the drive? Is it plausible that an 850 Pro would fail after just 2 years and 7TB of writes? Am I doing something wrong that would accelerate SSD failure? I fully expected this thing to last many years, based on the reviews and the positioning of that product.

Any advice is appreciated.

Thanks,
Mike
Hard to say what is going on, since the pic isn't working.
 

coercitiv

Diamond Member
Jan 24, 2014
7,252
17,092
136
7TB of writes is not a problem for an 850 Pro in any shape or form. I have an 850 Pro and an 840 Pro, both over 7TB host writes, both over 3 years old, and both report being in top shape.
Screenshots to support the above:

[Screenshot: giYPsZL.png]
 

chongov

Junior Member
Nov 13, 2017
3
0
1
Hard to say what is going on, since the pic isn't working.

Sorry... basically I get failures of "Uncorrectable Error Count" and "ECC Error Count", both with RAW values of 5170.

From the sounds of it I just have a defective unit.

Mike
 

gsteele531

Junior Member
Feb 7, 2018
2
0
6
I have now had 3 Samsung 840/850 SSDs fail. The latest one, with 600 power cycles and 6000 power-on hours, comes up with the primary partition as RAW - i.e., unformatted. I find 6000 hours an unacceptably low life, given that the drive was installed in a home PC that does not see heavy I/O use, with hibernation turned off and the swapfile located on a separate, magnetic HDD. In each case, analysis with HDTune shows pronounced degradation in read performance, along with an erratic performance trace. Why an SSD would show such degraded and variable performance, in some cases with a read rate (not access time) far lower than a typical HDD's despite a full TRIM prior to the test, is beyond me. If it were a single drive, I'd chalk it up to a random defect. Three failed drives in a home environment employing only six or seven across towers and laptops seems more reflective of poor quality than anything else - particularly since the drives were purchased over a period of several years. Thoughts?
 

UsandThem

Elite Member
May 4, 2000
16,068
7,383
146
Three drives in a home environment employing only six or seven across towers and laptops seems reflective more of poor quality than anything else - particularly since the drives were purchased over a period of several years. Thoughts?

I'm the opposite. I have an 830 and an 840 that still work fine to this day, and six 850 EVOs, and I've yet to have a problem with any of them (or with any of my other SSDs, like Kingston and Micron). I do keep all my systems on backup UPS systems with AVR, so maybe that's the difference? I've seen several people here with a defective SSD over the years, but never anyone with that many defective drives in a home-use situation.
 

gsteele531

Junior Member
Feb 7, 2018
2
0
6
I'm the opposite. I have an 830 and an 840 that still work fine to this day, and six 850 EVOs, and I've yet to have a problem with any of them (or with any of my other SSDs, like Kingston and Micron). I do keep all my systems on backup UPS systems with AVR, so maybe that's the difference? I've seen several people here with a defective SSD over the years, but never anyone with that many defective drives in a home-use situation.

I agree - it's unusual, particularly because I have installed probably 10 other Samsung drives, and at least twice that number of other brands, with only a single failure among them. All my desktops are on UPSes, connected to the wall through surge protectors to extend the security and integrity of the UPS (I replace the surge protectors when there's a detected event). Actually, what I'm looking for is anecdotal evidence that Samsungs are not all they are cracked up to be. The failure mode is pretty severe - no SMART warning, no indication; just failure, and then shortly thereafter, doorstop status.

P.S. Have you ever run HDTune on any of them to assess their performance?
 

Puffnstuff

Lifer
Mar 9, 2005
16,187
4,871
136
My first Samsung SSD, an 840 Pro 256, died early with no warning and was replaced through their problematic RMA service. My 850 Pro 256 serves as my boot drive with 62.3 TB written to it, so you should send that drive in under RMA.
 

Glaring_Mistake

Senior member
Mar 2, 2015
310
117
126
I have now had 3 Samsung 840/850 SSDs fail. The latest one, with 600 power cycles and 6000 power-on hours, comes up with the primary partition as RAW - i.e., unformatted. I find 6000 hours an unacceptably low life, given that the drive was installed in a home PC that does not see heavy I/O use, with hibernation turned off and the swapfile located on a separate, magnetic HDD. In each case, analysis with HDTune shows pronounced degradation in read performance, along with an erratic performance trace. Why an SSD would show such degraded and variable performance, in some cases with a read rate (not access time) far lower than a typical HDD's despite a full TRIM prior to the test, is beyond me. If it were a single drive, I'd chalk it up to a random defect. Three failed drives in a home environment employing only six or seven across towers and laptops seems more reflective of poor quality than anything else - particularly since the drives were purchased over a period of several years. Thoughts?

Do you have any screenshots of the results with HD Tune?

In any case, it sounds a bit odd to me that either an 850 EVO or Pro would suffer from voltage drift without significant wear, unsuitable temperatures, or a long period powered off.
It's also possible that the HD Tune results are false positives; sometimes you'd see low read speeds with HD Tune even when, for example, SSD Read Speed Tester would show normal read speeds.
I have experienced that myself.

Either way, I don't know why they would have failed, whether or not they suffered from voltage drift.
 

Puffnstuff

Lifer
Mar 9, 2005
16,187
4,871
136
Sorry... basically I get failures of "Uncorrectable Error Count" and "ECC Error Count", both with RAW values of 5170.

From the sounds of it I just have a defective unit.

Mike
I just ran CDI on my 850 Pro and it said similar things, including an ECC error rate of 200 and a CRC error count of 99. It also shows my power-on hours at 15,556 and power-on count at 1,439.
 

Glaring_Mistake

Senior member
Mar 2, 2015
310
117
126
I just ran CDI on my 850 Pro and it said similar things, including an ECC error rate of 200 and a CRC error count of 99. It also shows my power-on hours at 15,556 and power-on count at 1,439.

Could you post a screenshot of that?

Because I think the ECC Error Rate value of 200 that you're talking about is not the RAW value.
If so, then you should worry about it dropping from 200, not the other way around.
Or, well, if it did increase then there is likely something wrong with SMART, but you get the point.

As for the CRC Error Count, it is most often caused by the cable, for example if it is in poor condition or has come a bit loose.
 

UsandThem

Elite Member
May 4, 2000
16,068
7,383
146
The screen shot posted above is what my 960 EVO reports with CrystalDiskInfo as well, however my CRC error count is "0" under RAW values. I think how Samsung drives report their data throws people off, like the airflow temperature. For example, your temperature is currently 33 (which is 100 - 67 in the current column, with the worst being 43 degrees).
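
To make the arithmetic explicit, here is a minimal sketch of that conversion. It assumes, as described in this post, that the drive reports the airflow temperature attribute as 100 minus the temperature in degrees Celsius; the function name is hypothetical, and other vendors may encode the attribute differently.

```python
def airflow_temp_from_normalized(normalized: int) -> int:
    """Temperature in degrees C, assuming normalized value = 100 - temperature."""
    if not 0 <= normalized <= 100:
        raise ValueError("normalized value expected in the range 0..100")
    return 100 - normalized


# Current column of 67 -> 33 degrees; a Worst column of 57 would mean 43 degrees.
print(airflow_temp_from_normalized(67))  # 33
print(airflow_temp_from_normalized(57))  # 43
```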
 

Puffnstuff

Lifer
Mar 9, 2005
16,187
4,871
136
I think how Samsung drives report their data throws people off, like the airflow temperature. For example, your temperature is currently 33 (which is 100 - 67 in the current column).
I believe that is the issue in a nutshell.:D
 

Puffnstuff

Lifer
Mar 9, 2005
16,187
4,871
136
Your Raw Values for ECC Error Rate shows that it hasn't had any (reported) ECC failures yet.
If it had then the RAW Value wouldn't be just a bunch of zeroes.
Right, and as UsandThem points out, the raw and the current don't even come close to matching, making this a reporting error with CDI.
 

Glaring_Mistake

Senior member
Mar 2, 2015
310
117
126
Right, and as UsandThem points out, the raw and the current don't even come close to matching, making this a reporting error with CDI.

No, as far as I know CDI is reporting accurately.

Think of it more like this:
Raw Values tell you how many times something has happened.
Current/Worst Values tell you if it has happened often enough that it is starting to get worrisome.
For example, the drive is unlikely to have a pool of exactly 100 spare blocks in reserve, yet it still reports a Current value of 100, meaning that it hasn't had to replace any so far.

That said: if the ECC Error Rate were to drop even just from 200 to 199, that would be cause for concern, since it would mean the drive may have suffered data loss.
So it's a bit of a simplified explanation but I hope you get the gist.
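
The distinction above can be sketched in a few lines of Python. This is an illustration only, not how CDI or any other tool is implemented: the class and function names are made up, the threshold values in the example are placeholders rather than real Samsung thresholds, and treating any non-zero raw value as "events recorded" only makes sense for error counters, not for attributes like temperature.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    current: int    # normalized value; higher is healthier
    worst: int      # lowest normalized value recorded so far
    threshold: int  # vendor-defined failure threshold (placeholder here)
    raw: int        # vendor-specific count of how often something happened


def assess(attr: SmartAttribute) -> str:
    """Classify an error-counter attribute from its normalized and raw values."""
    # Normalized value at or below the threshold: the attribute has failed.
    if attr.threshold > 0 and attr.current <= attr.threshold:
        return "failing"
    # Raw counter above zero: events occurred, but not enough to trip the threshold.
    if attr.raw > 0:
        return "events recorded"
    return "ok"


# Current of 200 with an all-zero raw value: no ECC errors reported so far.
print(assess(SmartAttribute("ECC Error Rate", 200, 200, 0, 0)))           # ok
# Raw value of 5170, as reported earlier in the thread (normalized values assumed).
print(assess(SmartAttribute("Uncorrectable Error Count", 99, 99, 0, 5170)))  # events recorded
```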