SSDs, garbage collection and TRIM

mikeymikec

Lifer
May 19, 2011
20,375
15,059
136
I've been going through a steady cycle for a few years of reading up about SSDs, learning/remembering what garbage collection and TRIM are about, then at some point forgetting some of the important points and having to go through it all again. I find some of the articles I read a bit hard to digest, and sometimes I have to read two articles and put the pieces together from what I understand of each. It's tedious.

Today I decided that I've finally had enough of this tedious cycle, so I thought I'd read up on the topics and then write my own guide, perhaps purely for my own use, or if people appreciate it AND it is accurate, for public consumption.

Here is my guide. If I have made mistakes, please let me know (on the thread or PM), preferably with reputable sources to back up what you're saying. Also, be nice if you have any criticisms. Constructive criticism is good.

Mikeymikec's simple explanation of SSD garbage collection and TRIM

This explanation is written in a historical fashion so that the reader can see where the problem started and how/why solutions developed.

Jargon buster:

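Page: The smallest unit of data an SSD writes, assumed here to be 4KB (sizes vary between drives).
Block: A group of 64 pages (256KB with 4KB pages), and the smallest unit an SSD can erase. A storage device contains as many blocks as it needs to make up its total capacity.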

The problem

Unlike hard disks, SSDs can't overwrite pages that already contain data. A page has to be erased before new data can be written into it, but SSDs can only erase whole blocks. So before a block is erased, any data in it that is still needed has to be relocated, which imposes a significant performance penalty.
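To make the write/erase mismatch concrete, here is a minimal sketch of a single block. The `Block` class and its method names are made up for illustration, and the 64-page size is simply the figure from the jargon buster; real drives differ.

```python
PAGES_PER_BLOCK = 64  # 64 x 4KB pages = 256KB block, as in the jargon buster


class Block:
    """One erase block: pages are written individually, erased only together."""

    def __init__(self):
        # None means "erased and writable"; anything else is stored data.
        self.pages = [None] * PAGES_PER_BLOCK

    def write_page(self, index, data):
        if self.pages[index] is not None:
            raise ValueError("page already holds data; erase the block first")
        self.pages[index] = data

    def erase(self):
        # The only erase operation available wipes every page in the block.
        self.pages = [None] * PAGES_PER_BLOCK


block = Block()
block.write_page(0, "cv-v1")
try:
    block.write_page(0, "cv-v2")  # overwriting in place is not possible
    overwrite_allowed = True
except ValueError:
    overwrite_allowed = False
```

The second write fails because the page is already occupied. The only way to reuse it is `erase()`, which takes every other page in the block with it.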

Also, the storage cells in SSDs have a limited number of write cycles, so avoiding unnecessary writing is a good thing, especially to the same location, because that block will wear out sooner (reading doesn't have any effect in this respect).

Some helpful background information

When an OS deletes a piece of data (regardless of the storage device type), it isn't actually wiped from the storage device; it is simply marked for deletion. It's like taking a sheet of A4 with out-of-date information on it and putting a big red cross on it, so that it's obvious that it shouldn't be regarded as having useful data. The seemingly logical solution would be to wipe the data out straight away, but that imposes a performance penalty. It would be like insisting that every sheet of A4 has to be taken to the shredder before any real work can continue.

The workaround / solution

The first technique created in response to this problem is called 'garbage collection'. An overly-simple explanation of garbage collection is this:

When pages have been marked as garbage (no longer containing useful data), then during the SSD's garbage collection cycle, the blocks that contain those 'garbage pages' will have their contents written to a blank block, then the original block is erased so it can be re-used.
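The cycle described above can be sketched roughly as follows. This is a toy model, not how any particular controller works: the function name, the list-of-pages representation and the 4-page block are all assumptions chosen to keep the example readable.

```python
PAGES_PER_BLOCK = 4  # kept tiny for readability; the guide uses 64


def garbage_collect(block, garbage):
    """Copy live pages to a blank block, then erase the original.

    block   -- list of page contents (None = erased page)
    garbage -- set of page indexes the SSD has marked as garbage
    Returns (new_block, erased_block, pages_copied).
    """
    live = [data for i, data in enumerate(block)
            if data is not None and i not in garbage]
    new_block = live + [None] * (PAGES_PER_BLOCK - len(live))
    erased_block = [None] * PAGES_PER_BLOCK  # the whole block is erased at once
    return new_block, erased_block, len(live)


old = ["a", "b", "c", "d"]
new, erased, copied = garbage_collect(old, garbage={1, 3})
```

Only the two live pages get copied. The more pages the SSD knows to be garbage, the less it has to rewrite.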

Sounds like a solution? Ah, well, not quite. Older operating systems aren't aware of the foibles of SSDs and they don't communicate with them in quite the ideal way to suit the garbage collection feature. In some situations, it works fine.

For example, when the data in some pages needs to be altered (e.g. you've just updated your CV), the usual 'copy everything to a new block' cycle has to occur, and in that process the pages in the original block are marked as garbage.

However, when a file has been deleted, older operating systems don't inform the SSD of that change in the way that one might consider logical (how and why that is won't be explained here), so those pages of data won't get marked as garbage for the SSD's garbage collector to deal with.

This means that every time garbage collection is run, pages of valuable/current data are being moved/written as well as some useless pages containing data that (as far as the operating system is concerned) has been deleted. So more data is being written to the SSD unnecessarily, which is bad as far as the life of the SSD is concerned. This is what is known as write amplification.
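As a rough illustration, write amplification can be expressed as the number of pages physically written to flash per page the OS asked to write. The function name and the page counts below are hypothetical numbers picked for the example, not measurements from any drive.

```python
def write_amplification(host_pages_written, extra_pages_copied):
    """Pages physically written to flash per page the OS asked to write."""
    return (host_pages_written + extra_pages_copied) / host_pages_written


# Hypothetical case: the OS writes 16 pages, and garbage collection relocates
# 48 pages to make room, 32 of which belong to files the OS already deleted.
with_stale_pages = write_amplification(16, 48)

# The same workload if the SSD had known those 32 pages were garbage.
without_stale_pages = write_amplification(16, 48 - 32)
```

With these made-up figures the factor drops from 4.0 to 2.0 once the useless pages are no longer copied around.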

This also means that there's a discrepancy between what the OS believes is stored and what actually is stored - the OS believes that certain files have been deleted but their data still exists on the SSD. The space used by these pages will only be re-used when the OS requests to write to their location, which makes the SSD realise that those pages are not needed, and they get marked as garbage.

So, onto the next logical workaround / solution, known as TRIM.

I've been referring to “older operating systems” on a few occasions. My definition of an older operating system (as far as this document is concerned) is an OS that doesn't support TRIM.
The following operating systems support TRIM:

http://en.wikipedia.org/wiki/TRIM#Operating_system_support

So, a significant drawback of garbage collection that has been established so far in this document is the SSD's lack of awareness of when files have been deleted by the operating system/user. A logical next step is to make it so that when an OS requests a file deletion, it immediately tells the SSD specifically that the data pages used to store that file can be marked as garbage, so that the next time garbage collection is run (or when data in a block needs to be updated), it doesn't go writing unnecessary data, thus avoiding a certain amount of write amplification.

A TRIM-capable OS needs to inform the SSD by issuing it with a command called TRIM, and the SSD needs to know what that command means, so you need a modern OS and a TRIM-aware SSD (most of them from – I would guess – 2010 onwards support this command).
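Here is a small sketch of the difference TRIM makes, under heavy simplification. `TinySSD` and `delete_file` are invented names, and the model only tracks which pages the drive believes are live; it does not attempt to represent the real ATA command or any file system.

```python
class TinySSD:
    """Tracks only which logical pages the drive believes hold live data."""

    def __init__(self):
        self.live_pages = set()

    def write(self, pages):
        self.live_pages |= set(pages)

    def trim(self, pages):
        # TRIM: the OS tells the drive these pages no longer hold useful data.
        self.live_pages -= set(pages)

    def pages_gc_must_copy(self):
        return len(self.live_pages)


def delete_file(ssd, file_pages, os_supports_trim):
    # The file system always updates its own records (not modelled here);
    # only a TRIM-capable OS also notifies the drive.
    if os_supports_trim:
        ssd.trim(file_pages)


old_os_drive, new_os_drive = TinySSD(), TinySSD()
for ssd in (old_os_drive, new_os_drive):
    ssd.write(range(0, 10))   # a 10-page file
    ssd.write(range(10, 16))  # 6 pages of other data

delete_file(old_os_drive, range(0, 10), os_supports_trim=False)
delete_file(new_os_drive, range(0, 10), os_supports_trim=True)
```

After the delete, the drive behind the older OS still thinks all 16 pages are worth preserving, while the drive that received TRIM only has 6 pages left to carry through garbage collection.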

My sources:
http://thessdreview.com/daily-news/...ion-and-trim-in-ssds-explained-an-ssd-primer/
http://en.wikipedia.org/wiki/Write_amplification
 

BrightCandle

Diamond Member
Mar 15, 2007
4,762
0
76
I would change the wording of the pre-TRIM introduction, because it can be said more succinctly and more simply, with a bit more technical information:

Windows NTFS can simply be described as storing not only the file itself, but also an entry in the file system table saying where the file can be found. Before TRIM was introduced, the deletion of a file only resulted in a change to that table (the MFT, in NTFS's case), with the entry being removed. The original blocks of the file's contents would remain on the disk. The only time the SSD knew that the file had been deleted was when the operating system chose to write a different file over the pages that the original file was using, thus giving it the information that the contents of the file could now be reallocated as empty pages.

Trim changes that process so that the SSD is informed that the pages are no longer in use on the deletion of the file. This effectively gives the SSD much earlier information about pages that can be removed and that should reduce the write amplification and also allow more free blocks to be made available more quickly.

I would also change the explanation of the problem, as it can currently be misunderstood. The problem with the flash memory used in SSDs is that a page can only be written to once after it has been erased. SSDs can only erase an entire block and not individual pages.

Write amplification is better described as the process of writing more data than strictly provided to it by the OS because of partially filled blocks and the need to erase and rewrite the entire block with the one additional page when no free pages can be found on the SSD.
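A worked example of that worst case, as a sketch only: one new page arrives, no free pages exist, and a block holding 63 still-valid pages has to be erased and rewritten with the new page added. The function name and the 63-page scenario are assumptions for illustration.

```python
PAGES_PER_BLOCK = 64


def rewrite_block_with(valid_pages, new_page):
    """Erase-and-rewrite: returns (new block contents, pages physically written)."""
    contents = list(valid_pages) + [new_page]
    if len(contents) > PAGES_PER_BLOCK:
        raise ValueError("block cannot hold that many pages")
    return contents, len(contents)


valid = [f"old-{i}" for i in range(63)]  # pages that must survive the erase
contents, physical_writes = rewrite_block_with(valid, "new")

host_writes = 1  # the OS only asked for one page to be written
amplification = physical_writes / host_writes
```

In this scenario a single 4KB write from the OS turns into 64 page writes on the flash, which is the kind of situation free blocks (and early knowledge of garbage pages) help to avoid.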

SSDs don't write to blocks, they only erase at block level. It's pages that the SSD writes to in chunks, and it's empty pages the SSD is looking for when writing new data or updating existing data.

So currently I would say there are quite a few technical inaccuracies in this text that need working on.
 

mikeymikec

Updated the OP. I altered the first paragraph of 'The problem' and removed the first paragraph of 'The workaround/solution' in light of what you've said and double-checked with the primer mentioned in the sources.

I've also moved the write amplification sentence to the relevant paragraph in 'The workaround/solution'.

I've also added a paragraph to the end of 'The workaround/solution', mentioning the previously unmentioned point about deleted files not being marked as garbage until the OS requests write operations in their location.

I've also altered the final TRIM paragraph wrt when TRIM is issued time-wise and also mentioning its effect on write amplification.

I'm not sure why you mentioned NTFS and then seemingly talked about how (AFAIK) file systems work in general.

Thanks for your input, I appreciate it :)
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
"This also means that there's a discrepancy between what the OS believes is stored and what actually is stored - the OS believes that certain files have been deleted but their data still exists on the SSD."
Nope. Without TRIM, most SSDs act exactly like hard drives, so no such discrepancies exist. Samsung SSDs have created discrepancies, but in the opposite way from how you describe: by inspecting NTFS itself and then discarding free blocks, with or without TRIM.

Due to half-baked thinking when they came up with TRIM, discrepancies that matter will generally only exist in RAID+TRIM environments (another reason for ZFS and BTRFS to take over, over time, along with any other similar volman+FS systems).

Here's something interesting I saw the other day, which is somewhat related: http://www.xbitlabs.com/articles/storage/display/sandisk-ultra-plus_5.html#sect0

"It looks like the Samsung drives aren't doing any GC at all without TRIM; meanwhile the SF drives don't go back to full performance after TRIM."
Oh, they're still doing GC, just poorly, apparently, as far as performance goes. They have to be, to function. The Corsair Neutron definitely replaces the Crucial M4 as the solid, "will work pretty well no matter what you throw at it," drive, however (especially since M500 aren't staying in stock anywhere, these days), but the Plextors are still good buys, too, in 256GB capacity.

For a modern Windows desktop, the 840 and 840 Pro are both great--best bang/buck, and fastest--but I'm not sure they're what I'd get for Linux, a kiosk, a server, etc.