How bad will next-gen NAND P/E cycle counts be?


taltamir

Lifer
Mar 21, 2004
13,576
6
76
1. You can only say that there is nothing to worry about with typical desktop OS drive usage.
2. Improved algorithms have reduced write amplification from over 40x to under 2x on incompressible loads and under 1x on compressible ones. Such improvements cannot continue indefinitely, and that is why this matters (see the sketch at the end of this post).
3. NAND is hitting a wall on sustainable write cycles as it shrinks. Although, to be honest, that is less severe than:
4. NAND is getting more prone to errors as you shrink it (which, to be fair, also applies to spindle drives).
5. I did not state that 20nm NAND would be "unusable"; I merely pointed out a discrepancy in the marketing drivel and expressed a desire to know the real reliability.

While people have nothing to worry about under normal desktop usage with current tech, this is a real concern that must be addressed and cannot be pushed off forever. Alternative technologies will eventually be required.
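To put point 2 in concrete terms, here is a minimal back-of-envelope sketch. The 120GB capacity and 3,000-cycle rating are illustrative assumptions, not figures from this thread; the point is just that host writes before wear-out scale as capacity × rated P/E cycles ÷ write amplification.

```python
# Back-of-envelope endurance estimate: a sketch with assumed inputs.

def host_writes_before_wearout(capacity_gb, rated_pe_cycles, write_amplification):
    """Total host data (GB) the drive can absorb before its rated P/E
    limit, assuming ideal wear leveling."""
    return capacity_gb * rated_pe_cycles / write_amplification

for wa in (40.0, 2.0, 1.0):  # the write amplification figures from point 2
    tb = host_writes_before_wearout(120, 3000, wa) / 1000
    print(f"WA {wa:>4}x -> ~{tb:.0f} TB of host writes")
# 40x: ~9 TB, 2x: ~180 TB, 1x: ~360 TB -- why those algorithm gains mattered
```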
 

anikhtos

Senior member
May 1, 2011
289
1
0
An SSD with few cycles, let's say 500, but at capacities of a terabyte in a 3.5-inch format, would be nice for storing data.
Like your movies, your music, and generally data that will just be read from the drive.
So maybe they would be unusable for OS booting, but they would be great for storing data, definitely safer than a platter disk.
 

DirkGently1

Senior member
Mar 31, 2011
904
0
0
I don't know too much about SSDs, but am I wrong to be thinking that the results in that thread so far are pretty impressive? I mean, we have 30TB of writes, which is impressive in itself, but if the media wear-out indicator is accurate at 83, the drive still has a ways to go.

Pretty awesome, huh? Here's a paper (again sourced from the XS thread) discussing NAND endurance as it relates to recovery periods...

http://www.usenix.org/event/hotstorage10/tech/full_papers/Mohan.pdf


"...recovery periods of such durations can significantly boost endurance, allowing the blocks to undergo several millions of P/E cycles before reaching the endurance limit.
The amount of time required for reaching the endurance limit is much longer than the NAND flash retention period. Therefore, endurance is not a major flash reliability concern under realistic data center usage scenarios and a much wider array of I/O intensive applications can leverage the performance and power benefits of flash-based
SSDs than previously assumed."

 

IntelCeleron

Member
Dec 10, 2009
41
0
66
Pretty awesome, huh? Here's a paper (again sourced from the XS thread) discussing NAND endurance as it relates to recovery periods...

http://www.usenix.org/event/hotstorage10/tech/full_papers/Mohan.pdf


"...recovery periods of such durations can significantly boost endurance, allowing the blocks to undergo several millions of P/E cycles before reaching the endurance limit.
The amount of time required for reaching the endurance limit is much longer than the NAND flash retention period. Therefore, endurance is not a major flash reliability concern under realistic data center usage scenarios and a much wider array of I/O intensive applications can leverage the performance and power benefits of flash-based
SSDs than previously assumed."


Indeed. That sounds reassuring.
 

DirkGently1

Senior member
Mar 31, 2011
904
0
0
Latest update from the ongoing endurance testing, for those not following the thread.

Album&


Thread here:

http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm

That Crucial M4 has got some serious legs. It's gotta reach 1PB, surely?
 

MarkLuvsCS

Senior member
Jun 13, 2004
740
0
76
It has many meanings, but I believe in this instance it is used as a media wear indicator

That is correct. The drives will get to 1% and stay there (see the sketch at the end of this post); that is why the actual writes are also shown. The Samsung 470 has a black border because that drive has officially died. The graphs indicate the write amplification for the Samsung was around 5x vs 1.0-1.2x for most of the others, so let's hope the rest last nearly 5x as long :p. If you check out the thread, though, you can see the drives' power-on hours are starting to range into the months, so it's not as if the drives are writing tons of data with only a few days of actual use.

Most drives under test are the smallest-capacity models, because that is what people are willing to throw away for the knowledge XD.
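As a side note on how that wear indicator typically behaves, here is a hedged sketch of the usual scheme (the 3,000-cycle rating is an assumption for illustration; vendors differ): the attribute counts down from 100 as average erase cycles approach the rated limit, then clamps at 1.

```python
# Approximate media wear indicator behavior: a sketch, not any vendor's
# actual SMART implementation. The rated cycle count is assumed.

def media_wear_indicator(avg_erase_count, rated_pe_cycles=3000):
    remaining = max(0.0, 1.0 - avg_erase_count / rated_pe_cycles)
    return max(1, round(100 * remaining))

print(media_wear_indicator(510))    # 83, like the drive mentioned earlier
print(media_wear_indicator(99999))  # 1 -- "gets to 1% and stays there"
```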
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
People would freak out if they came to realize that the transistors in their CPUs have the same limited "number of times the xtor can be switched" lifespan dynamics.

The fact that they've never been conditioned to worry about that aspect of what is under the hood of their CPU has spared us threads like these in the CPU forum.

Lifetime of any electrical aspect of an IC comes down to tradeoffs. Endurance is one of them. You can make high-endurance NAND, if you are willing to spend more money in the development phase.

The fact that NAND endurance is decreasing is just evidence that it is not as high a priority as other metrics of development, which include the timeline and production cost.

Project Triangle: you can pick any two of the three, but never more than two.
[Image: the project-triangle diagram ("good / fast / cheap")]

^ In this graphic, "good" means reliability/endurance of the end product, "fast" means the development timeline, and "cheap" means the development expense (not to be confused with cheap production costs for the consumer, or fast NAND performance at the end-user level).

Whenever we start getting close to the actual endurance levels which will affect the reality of the end-user experience (it wasn't 25k P/E, it wasn't 10k P/E, it isn't 3k P/E), you can rest assured that there are legions of process development engineers (myself included) ready and waiting with solutions to implement to resolve the declining endurance dilemma.

But don't expect those solutions to be free unless you are willing to wait a while for them to be robustly implemented (not gonna happen on an 18-month node-cadence cycle).

I remember ~10yrs ago when the "gigahertz barrier" was about to be broken, like it was something akin to the sound barrier which really did require extraordinary efforts to get from 999MHz to 1GHz. There is nothing extraordinary about improving NAND endurance...unless you want those improvements done quickly and cheaply :p
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Latest update from the ongoing endurance testing, for those not following the thread.

A shame they are not doing any tests without TRIM. I have seen someone whose SSD wore out to nothing in about a year, and he didn't have TRIM. Based on my knowledge of how it works, I posited that without TRIM he could be getting a write amplification of 128x under certain conditions (needing to erase a 512KB block to write one single 4KB sector).

Now that I think about it, 512B sector emulation could make it much worse still.

So I would love to see durability testing without TRIM on some drives.
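As a rough sketch of that worst case (the 512KB erase block and 4KB page sizes are illustrative; real geometries vary by NAND generation): with no TRIM, the controller treats every block as full of valid data, so a small host write can force a full block rewrite.

```python
# Worst-case write amplification without TRIM: a sketch assuming a
# 512KB erase block; real block/page geometries vary by NAND generation.

BLOCK_BYTES = 512 * 1024

# Without TRIM, a 4KB update may force a read-modify-erase-write of a
# whole 512KB block full of "valid" stale data:
print(f"4KB writes:  {BLOCK_BYTES // (4 * 1024)}x")  # 128x
# 512B sector emulation makes each write 8x smaller, so 8x worse again:
print(f"512B writes: {BLOCK_BYTES // 512}x")         # 1024x
```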
 

razel

Platinum Member
May 14, 2002
2,337
90
101
Testing without TRIM seems relatively easy to do. Since you're so interested, why not contribute to their thread? You'd be joining a very honorable group making sacrifices to add to everyone's knowledge.
 

Yellowbeard

Golden Member
Sep 9, 2003
1,542
2
0
Purely out of curiosity: while testing without TRIM might be interesting, why would it be relevant? The number of users with a non-RAID SSD not using TRIM is likely very small, if not minuscule, and also likely shrinking each month. It seems like a fairly non-useful fact.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Testing without TRIM seems relatively easy to do. Since you're so interested, why not contribute to their thread? You'd be joining a very honorable group making sacrifices to add to everyone's knowledge.

I am not quite ready to pay money to be allowed to post in a forum.
I know how to do such tests, but they are time consuming and I haven't gotten around to it.

Purely out of curiosity: while testing without TRIM might be interesting, why would it be relevant? The number of users with a non-RAID SSD not using TRIM is likely very small, if not minuscule, and also likely shrinking each month. It seems like a fairly non-useful fact.

Because this could be more important than all the other reasons to use TRIM, and yet nobody even knows it exists. If people knew, they would be more likely to press for TRIM, and less likely to give up TRIM by running SSDs in RAID 0.
 

frostedflakes

Diamond Member
Mar 1, 2005
7,925
1
81
They did test write amplification of one drive with and without TRIM, which might give you an idea of how other drives will perform without TRIM. Obviously it will vary between controllers (and firmware revisions for the same controller), depending on how aggressive their idle garbage collection is. I don't know how the aggressiveness of the Vertex Turbo's garbage collection compares to other drives.

[Attached chart: Vertex Turbo write amplification with and without TRIM]
 

Yellowbeard

Golden Member
Sep 9, 2003
1,542
2
0
Because this could be more important than all the other reasons to use TRIM, and yet nobody even knows it exists. If people knew, they would be more likely to press for TRIM, and less likely to give up TRIM by running SSDs in RAID 0.

Good points. I was just curious. In fact, 2 identical systems with identical loads on 2 "virgin" drives might be a good way to illustrate that point. I may suggest that to a friend if you don't mind me stealing your idea?
 

DirkGently1

Senior member
Mar 31, 2011
904
0
0
but does the usable size decrease during these 136 years?

You get reallocated sectors, although each firmware will of course have its own way of handling it. The Samsung 470 was still writing across the entire drive when it gave up the ghost.

It's worth mentioning that the M4 in that test was still showing 0 reallocated sectors when that last graph was produced!

As far as the 470 goes, the only test now is to see how well it retains data as a read-only drive. It's supposed to be a minimum of 12 months, but only time will tell. I'd be interested to find out whether powering up the drive, or not, during that time has any effect on how well it retains the data.

Worth mentioning too that these tests are pretty much a worst-case scenario for writing data. It's thought that NAND lasts longer the more time it has to recover between writes, so with lighter loads the drives could last significantly longer than they already are.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Good points. I was just curious. In fact, 2 identical systems with identical loads on 2 "virgin" drives might be a good way to illustrate that point. I may suggest that to a friend if you don't mind me stealing your idea?

Go right ahead. I don't mind at all.

They did test write amplification of one drive with and without TRIM, which might give you an idea of how other drives will perform without TRIM. Obviously it will vary between controllers (and firmware revisions for the same controller), depending on how aggressive their idle garbage collection is. I don't know how the aggressiveness of the Vertex Turbo's garbage collection compares to other drives.

[Attached chart: Vertex Turbo write amplification with and without TRIM]

This shows the Vertex's write amplification doubled without TRIM.
You raised good points about things that make it vary, but missed one crucial and obvious one... data workload. I would imagine that if a drive is 90% full and is being given lots of tiny random writes (ex: log updates a few bytes in size, which still require writing a whole sector, either 512B or 4KB depending on how the drive is formatted), then it might reach that theoretical 128x+ write amplification.
Suddenly a 100-year expected lifespan becomes 1 year.
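Putting rough numbers on that claim (the baseline WA, capacity, rating, and workload below are all illustrative assumptions): lifespan scales inversely with write amplification, so going from ~1.1x to 128x is roughly a hundredfold cut.

```python
# Lifespan vs. write amplification: a sketch with assumed inputs.

CAPACITY_GB = 120      # illustrative
RATED_PE = 3000        # illustrative
HOST_GB_PER_DAY = 10   # assumed desktop-ish workload

def years_to_wearout(write_amplification):
    total_host_gb = CAPACITY_GB * RATED_PE / write_amplification
    return total_host_gb / HOST_GB_PER_DAY / 365

print(f"WA 1.1x: ~{years_to_wearout(1.1):.0f} years")  # ~90 years
print(f"WA 128x: ~{years_to_wearout(128):.2f} years")  # under a year
```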
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Go right ahead. I don't mind at all.



This shows the Vertex's write amplification doubled without TRIM.
You raised good points about things that make it vary, but missed one crucial and obvious one... data workload. I would imagine that if a drive is 90% full and is being given lots of tiny random writes (ex: log updates a few bytes in size, which still require writing a whole sector, either 512B or 4KB depending on how the drive is formatted), then it might reach that theoretical 128x+ write amplification.
Suddenly a 100-year expected lifespan becomes 1 year.

Actually in your case all you've done to the SSD is reduce the usable lifetime of 10% of the cells to 1yr.

The other 90% of the cells that sat static for the year are still fully functional with another 100yr life expectancy.

That said, I'd like to see what happens to the reliability of a spindle drive that is 90% full and has to endure continuous writes for a full year. I doubt you are going to see a 100yr lifetime from the remaining 90% of the drive, as the drive's mechanical components are going to be all the more worn out from the 1yr of writes.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Actually in your case all you've done to the SSD is reduce the usable lifetime of 10% of the cells to 1yr.

The other 90% of the cells that sat static for the year are still fully functional with another 100yr life expectancy.

That said, I'd like to see what happens to the reliability of a spindle drive that is 90% full and has to endure continuous writes for a full year. I doubt you are going to see a 100yr lifetime from the remaining 90% of the drive, as the drive's mechanical components are going to be all the more worn out from the 1yr of writes.

SSDs have wear leveling (see the sketch at the end of this post). The data in the "static" cells will be moved (causing more write amplification, though the increase should be negligible) in order to cycle them back into use and ensure even wear, once there is a large enough gap between them and the most-worn sectors. And the 100 years assumes that ALL the cells in the drive are used equally; if you are only using 10% of the drive at that write amplification, those 10% of the cells would only last a month. This also explains the case of the guy whose SSD ran out of writes in a single year (he posted here in the forum) and who was not using TRIM.

As for a spindle disk in such a situation: it wouldn't last 100 years; they never do, and I never claimed they would. But I doubt such a workload would have a significant impact on its lifespan. I was comparing a TRIM-less SSD to a TRIM-ed SSD, not a TRIM-less SSD to an HDD.
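For anyone wondering what that static wear leveling looks like, here is a simplified, hypothetical sketch (the threshold and the relocate hook are assumptions for illustration, not any vendor's firmware): when the erase-count gap between the most- and least-worn blocks grows too large, cold data is moved off the least-worn block so it can rejoin the write rotation.

```python
# Simplified static wear leveling: a hypothetical sketch, not any
# vendor's algorithm. Threshold and data model are assumptions.

WEAR_GAP_THRESHOLD = 100  # max allowed spread in erase counts (assumed)

def maybe_rebalance(erase_counts, relocate):
    """erase_counts: dict of block id -> erase count.
    relocate(src, dst): hook that copies the static data off the
    least-worn block so that block can be erased and reused."""
    coldest = min(erase_counts, key=erase_counts.get)
    hottest = max(erase_counts, key=erase_counts.get)
    if erase_counts[hottest] - erase_counts[coldest] > WEAR_GAP_THRESHOLD:
        relocate(coldest, hottest)   # one extra copy: small WA increase
        erase_counts[coldest] += 1   # freed block erased, back in rotation
```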
 

PandaBear

Golden Member
Aug 23, 2000
1,375
1
81
Assuming you use the same material and shrink, the cell will last fewer cycles. However, they usually find a more durable material to shrink with, and the result is a reduction in durability that isn't as bad as it otherwise would have been.

Wear leveling evens out the hot spots on a chip or a drive; if the drive is already wearing uniformly, it will not improve the life. However, if you buy a much bigger drive than you need and use only a small part of it, it will last longer: if you only need 20GB, a 120GB drive will last twice as long as a 60GB one, assuming both have wear leveling. This is what sets SSDs apart from hard drives, whose capacity has nothing to do with durability. (See the sketch at the end of this post.)

ECC helps with initial quality and yield, but not much with life. It is like mapping out bad sectors so you don't have to throw away the chip, but it doesn't mean the chip goes from lasting 3k cycles to 10k cycles.

Reducing write amplification would help, but the better way to do it, if possible, is to have a file system that is flash-aware and does not duplicate work the wear leveling already has to do.
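To illustrate the capacity point above with assumed numbers (a 3,000-cycle rating, a fixed 20GB/day workload, and ideal wear leveling spreading writes over all cells): doubling capacity halves how fast each cell's cycles are consumed.

```python
# Lifetime vs. capacity at a fixed workload: a sketch with assumed
# numbers and ideal wear leveling spreading writes over all cells.

RATED_PE = 3000
DAILY_WRITES_GB = 20

for capacity_gb in (60, 120, 240):
    cycles_per_day = DAILY_WRITES_GB / capacity_gb
    years = RATED_PE / cycles_per_day / 365
    print(f"{capacity_gb}GB drive: ~{years:.0f} years")  # ~25 / ~49 / ~99
```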
 

groberts101

Golden Member
Mar 17, 2011
1,390
0
0
Reducing write amplification would help, but the better way to do it, if possible, is to have a file system that is flash-aware and does not duplicate work the wear leveling already has to do.

Perfect Disk and Disk Keeper are already doing that with their drive management tools. Relatively speaking, of course, as they are not truly "flash aware".
 

DirkGently1

Senior member
Mar 31, 2011
904
0
0
So, from the same XS thread, it looks as if it's the Indilinx drive that will be first past the 1PB barrier! Not far off now...

http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm

Ignore the charts in the OP; they're horribly outdated. The Indilinx has well over 900TB of writes now and, after a brief scare, is going strong. Static data is still retained once written and passes MD5 checks, which was my main concern. After all, it's pointless having a drive capable of this if all the data went corrupt hundreds of GB ago.
 

=Wendy=

Senior member
Nov 7, 2009
263
1
76
www.myce.com
I have been following that thread over at XS, and it is quite remarkable how well ALL the SSDs in that thread are coping with the punishment they are taking.