Is Evo Ssd fixed now?


SSBrain

Member
Nov 16, 2012
158
0
76
If I remember, I was planning on testing it myself.

I tried SSD Read Speed Tester again after performing several sequential read runs with HDDScan, it looks like performance improved marginally since last time, overall:

1VEsJuN.png

(compare with the previous screenshots)

I might try running the HDTune error check several more times later to see if it improves further. I'm also wondering whether running two or more instances of the program at the same time during its LBA scan would help.
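
For anyone who wants to run this kind of check without the tool, here is a minimal Python sketch of the general idea: read every file under a folder sequentially and list throughput by file age. This is not how SSD Read Speed Tester is implemented. The function names (`read_throughput_mb_s`, `speed_by_age`) are made up for illustration, and file modification time is only assumed to be a rough proxy for when the data was last written to the NAND.

```python
import os
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Read one file sequentially and return throughput in MB/s (10^6 bytes)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS file cache can still
    # serve the data, so results are only meaningful for uncached files.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0 or total == 0:
        return 0.0
    return total / elapsed / 1e6


def speed_by_age(root, now=None):
    """Return (age_days, mb_per_s, path) tuples for files under root, oldest first."""
    now = time.time() if now is None else now
    results = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                # Assumption: mtime approximates when the data was last written.
                age_days = (now - os.path.getmtime(path)) / 86400
                results.append((age_days, read_throughput_mb_s(path), path))
            except OSError:
                continue  # skip files that cannot be read
    results.sort(reverse=True)
    return results
```

Point `speed_by_age` at a folder with a mix of old and new files and compare the MB/s column. Run it right after a reboot, or on more data than you have RAM, otherwise the OS cache will hide any slowdown.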
 

coercitiv

Diamond Member
Jan 24, 2014
7,391
17,535
136
I tried SSD Read Speed Tester again after performing several sequential read runs with HDDScan, it looks like performance improved marginally since last time, overall:
Multiple reads may trigger a background cell refresh, meaning data gets moved around. The following quote is taken from an Intel doc describing High Endurance Technology used in some of their products:
By using a scheme called background data refresh, the SSD moves data around during periods of inactivity to re-allocate areas that have incurred heavy reads.
This is in addition to the wear-leveling scheme that already exists in other Intel® SSDs.
Most of the problems TLC NAND and high-endurance NAND encounter are similar, and so are some of the solutions applied. However, (repeatedly) reading all the cells in an SSD may have other side effects, so I'm not sure this is the only viable explanation.

People should also start differentiating between EVO SSDs left on a shelf for one month and ones kept powered on for one month, since power-off data retention and power-on data retention are not the same thing.
 

SSBrain

Member
Nov 16, 2012
158
0
76
Actually, after some tests shared on the 840 issue thread on Overclock.net, it appears that the amount of restored performance is temperature-dependent more than read-amount dependent. This is my working hypothesis on why this seems to happen (copy-pasted here from overclock.net):

SSBrain said:
It could be that indeed it is not adjusting its read voltage correctly as Samsung says and only after performing prolonged read operations for a certain period of time some kind of correction/detection algorithm kicks in and partially restores performance, and that the issue was there since the previous gen model without the company realizing it on their testbeds. Or it could also be reprogramming some of the cells, although I haven't noticed yet whether the SSD's Wear Leveling Count increased quicker because of this.

Since it takes some time to make it improve this way, I wonder if it hasn't something to do with drive temperature. Perhaps instead of doing multiple read script passes on the entire LBA range, just one is needed, but at an elevated temperature (like 60-65 °C)? This makes me wonder if it's mostly desktop users who are reporting this problem, since drive temperatures are usually much cooler on their PC.

If that's the case, speaking of temperature, then there is also a chance that the problem mainly occurs after the TLC NAND memory on these SSDs is programmed at too low a temperature. NAND retention/bit error rate gets worse the higher the ambient temperature is while the drive is powered off. However, many people don't realize that exactly the opposite happens while NAND memory is being written to, with the drive powered on.

We know that TLC memory has a lower endurance than MLC NAND; perhaps it is also more sensitive to temperature and somehow Samsung failed to take this into account? Or thought that, since these SSDs are usually used in laptops where they get hotter, keeping temperatures as low as possible would be a good idea for power consumption? Indeed, Samsung 840 drives are among the most power-efficient (coolest-running) SSDs. Perhaps this comes with some trade-off in certain cases?

[...] The (temporary?) speed improvements after doing several read passes of every LBA hint that some sort of temperature correction (mainly during sequential read operations) is occurring and that it might not be working correctly.

A further hypothesis of mine (to be tested) is that this is triggered when the affected data are written on the NAND at an excessively low temperature, which might explain why some users don't seem to be having this problem, especially after the firmware fix on the 840 EVO.

This implies that if one were to constantly use the drive at around 45-50°C (typical laptop operating temperature), performance on old data would remain more or less consistent over time.

TL;DR:

- Power-on data retention/bit error rate improves when programming temperature gets higher (the opposite of power-off data retention/bit error rate).

- TLC NAND memory might be more sensitive to the above issue than regular MLC NAND.

- Long term performance of old data on Samsung 840/840 EVO SSDs might be affected by the temperature of the drive at the time of writing.

- Firmware issues might be causing improper handling of old data written when the drive was at a low temperature.
 

Quad5Ny

Member
Feb 10, 2011
135
5
91
Source: http://www.virtium.com/wp-content/u...sidersations-for-Industrial-Embedded-SSDs.pdf

"Before continuing, a brief physics discussion is in order. The data value is determined by the number of bits per cell and the voltage level read by the SSD controller. The voltage level is determined by the number of electrons on the floating gate of the transistor. Over time, electrons on the floating gate can leak through the oxide layer back to the substrate. The more electrons leak, the more the voltage changes and the higher the chance of a bit error. Too many bit errors - more than the SSD controller can correct – results in uncorrectable, and eventually system, errors.

So the stronger the oxide layer, the better the data retention. Oxide strength is determined by two factors – endurance and temperature. The more program / erase cycles, the weaker the oxide layer becomes. In terms of temperature, think of the oxide layer as ice on a lake. When programming, electrons get injected from the substrate onto the floating gate. The colder the temperature, the more difficult it is to program. The hotter the temperature, the easier it is to program. The converse is also true. The colder the temperature, the more difficult it is for electrons to leak back into the substrate. The hotter the temperature, the more leakage can occur."
^^^ @SSBrain You're right about how temperature affects flash, but the drives should not be reacting this way when stored at room temperature (22°C-ish).
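
The quoted passage gives no formula. One common way to put rough numbers on "the hotter the temperature, the more leakage" is the Arrhenius model, which is a swapped-in illustration here, not something taken from the Virtium document or from Samsung. The sketch below assumes an activation energy of 1.1 eV (a value often used for NAND retention estimates; the real figure for these drives is unknown), and the function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_low_c, t_high_c, activation_energy_ev=1.1):
    """Estimate how much faster charge leaks at t_high_c than at t_low_c.

    Temperatures are in degrees Celsius. The default activation energy
    is an assumption, not a measured value for any specific drive.
    """
    t_low_k = t_low_c + 273.15
    t_high_k = t_high_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1 / t_low_k - 1 / t_high_k)
    return math.exp(exponent)
```

With those assumed values, `arrhenius_acceleration(25, 55)` comes out at roughly 50, i.e. retention loss at 55 °C would be modeled as about fifty times faster than at 25 °C. It only covers the leakage side of the quote; it says nothing about how programming temperature affects later reads.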

It takes more than a few user reports before it's worthwhile for us to contact the manufacturer and possibly post something. That said, I'm working with Samsung to get some answers.

From everyone who has an 840 EVO: thank you!
 

SSBrain

Member
Nov 16, 2012
158
0
76
You're certainly right that they shouldn't. All of this points to firmware bugs plus SSD/TLC NAND usage at unexpectedly (for Samsung engineers, at least) low temperatures.

If this is confirmed, a possible fix, besides sorting out the temperature-correction algorithms, could be making these SSDs operate at a higher temperature. If this were all done in firmware, it would mean a higher overall power consumption, though.

Anyway, for my part I tried insulating my Samsung 840 250GB (non-EVO) with polyurethane foam so that it runs at a higher operating temperature, especially under load. I'll check read performance regularly over the following weeks/months to see if this has an effect.
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
It's really disappointing the firmware fix hasn't worked. I'd expect something like that from OCZ, not Samsung.
 

SSBrain

Member
Nov 16, 2012
158
0
76
So, I performed the test again after insulating the SSD with polyurethane foam and loading it with read operations. The Samsung 840 (non-EVO) was at about 52°C when I started SSD Read Speed Tester: performance now looks almost completely normal. This seems to confirm at least that it depends on drive temperature. As for why it happens only on old data, it might be for the reasons I speculated about previously:

J5BugZ1.png
 

SSBrain

Member
Nov 16, 2012
158
0
76
I've made a gif animation which shows clearly how old data speed changes with drive temperature. For some reason on the fourth frame I had a speed spike which changed the program's graph scale, which I corrected afterwards with an image editing program:

5NjyAHS.gif


The test shown in the corrected frame was actually run after the one that started at 52°C, which further reinforces the temperature correlation.
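
To put a number on a correlation like this rather than eyeballing frames, one option is a plain Pearson coefficient over (starting temperature, average read speed) pairs. A minimal sketch follows; `pearson_r` is a hypothetical helper, and no measurements from this thread are included because the actual figures only exist in the screenshots.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("one of the series is constant")
    return cov / math.sqrt(var_x * var_y)
```

Pass it your own list of starting temperatures in °C and the matching average MB/s on old files. A value near +1 would support "warmer drive, faster old-data reads"; with only a handful of runs it is a sanity check, not proof.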
 

GlacierFreeze

Golden Member
May 23, 2005
1,125
1
0
So lower temps reduce performance? Or am I misreading?

Either way, thanks for compiling and comparing these findings and theories.
 

SSBrain

Member
Nov 16, 2012
158
0
76
Lower temperatures do reduce read performance on the data already affected by the speed problem, at least on my drive. Keep in mind that excessively high temperatures (>65-70°C) might trigger the drive's thermal throttling, which also reduces overall performance. I haven't reached those temperatures and don't intend to.


My current hypothesis is also that writing new data at too low a drive temperature will cause that data to develop speed issues over time. Since the drive is clearly having problems applying the correct temperature adjustments when reading data, the same could be happening when writing.
 

.vodka

Golden Member
Dec 5, 2014
1,203
1,538
136
Would you buy an 840 EVO with this in mind, considering the alternatives are among stuff like the current Kingston V300 which IIRC isn't representative of review samples due to having different NAND in later versions?
 
Feb 25, 2011
16,994
1,622
126
Would you buy an 840 EVO with this in mind, considering the alternatives are among stuff like the current Kingston V300 which IIRC isn't representative of review samples due to having different NAND in later versions?
MX100.