The skinny on TRIM

imagoon · Nov 2, 2012

taltamir said:
A spec is an article. Just one that specifies something.

It is a 496 page document, do you mind referring to a specific page number wherein this is stated?

That sounds highly unusual... can someone confirm or refute this?

Start on page xi, "table of contents" I gave you the exact term and it has 2 pages on op code 06h. I will leave you to click around on all the tables and references.

Working draft just means they will take suggestions on the specification. They then roll the suggestions in to the next ATA number and put it out as a working draft again. It does not mean the the actual specification is still in draft. It is a pretty common academic process.

taltamir · Nov 3, 2012

Cerb said:
If two drives in a RAID 1 both get told that a given set of blocks is unused, unless it is noted in array metadata somewhere, so that those blocks are ignored, they could decide when to return zeroes at different times, or even not do so at all until the block is re-used for something. Between those two points in time, the drives aren't mirrors.

Irrelevent, the only sectors that aren't perfectly mirrored are sectors which have been explicitly deleted and never written to since said deletion.

That is a problem in parity keeping raid as it will cause parity calculations to fail. But in RAID1 it merely means that you randomly can read one set of junk or another set of junk but both are junk, and should a rebuild be required no data will be lost nor will the rebuild report any parity errors

Although this might or might not be an issue on a 3+ drive RAID1 array rebuild depending on whether the controller simply copies the data from just one of the drives or whether it compares both drives beforehand. (I have no idea which is the typical behavior)

imagoon said:
Working draft just means they will take suggestions on the specification. They then roll the suggestions in to the next ATA number and put it out as a working draft again. It does not mean the the actual specification is still in draft. It is a pretty common academic process.

If that is the case, it would mean a working draft document from 2009 could contain suggestions which end up being removed or altered before the final release of the next spec number.

imagoon said:
Start on page xi, "table of contents" I gave you the exact term and it has 2 pages on op code 06h. I will leave you to click around on all the tables and references.

It has about 39 references to op code 06h actually spread across what is easily 20-30 pages.
"DATA SET MANAGEMENT" makes no mention of TRIM requiring to report success and "Write Log Ext Error" is not gonna help anyone with an error didn't occur.

A drive, driver, or controller using a driver that uses an ATA spec that predates trim will not generate such an error.

Cerb · Nov 3, 2012

taltamir said:
Irrelevent, the only sectors that aren't perfectly mirrored are sectors which have been explicitly deleted and never written to since said deletion.

In a file-system-aware RAID, the above is true. In a FS-agnostic RAID, the above is not true. RAID 1 has no more a concept of deletions than RAID 5/6/etc.. With a bog standard RAID 1, all data is made up of a range of valid sectors that need to be mirrored. Whether the data is garbage to the host software cannot be a concern for the array, since it cannot be discerned by the array controller.

A hardware RAID controller, that will work with any file system, must be able to validate the array just by reading its member's contents, which will include those of the hypothetically TRIMed sectors. If two sectors do not return the same results, whether they were garbage to the file system sitting on top of the array doesn't matter: the controller must consider the array broken, and have a means to decide which drive should be preferred to repair the array with. There is no deleted data, from the RAID controller's perspective.

Until TRIM has some specific time-limited guarantees, the only way around that will be to explicitly add in TRIM support, which may require significant firmware, minor driver, and minor on-disk metadata, changes. The controller must be able to accept the TRIM command, add readable metadata to the array about the now-garbage sectors, and then send the TRIM command to the drives, before it can reliably be considered to implement TRIM. Historically, there has been little to no use for doing anything special to deleted sectors, so it's likely not the most trivial undertaking to validate the correctness of such an implementation.

A software RAID array, OTOH, can prefer to treat the file system's expectations as having the highest precedence. In that case, it's a much easier proposition to implement TRIM, provided nothing fancy like LVM is being used, since the RAID controller has direct access to file system knowledge.

taltamir · Nov 3, 2012

Cerb said:
In a file-system-aware RAID, the above is true. In a FS-agnostic RAID, the above is not true. RAID 1 has no more a concept of deletions than RAID 5/6/etc.. With a bog standard RAID 1, all data is made up of a range of valid sectors that need to be mirrored. Whether the data is garbage to the host software cannot be a concern for the array, since it cannot be discerned by the array controller.

A hardware RAID controller, that will work with any file system, must be able to validate the array just by reading its member's contents, which will include those of the hypothetically TRIMed sectors. If two sectors do not return the same results, whether they were garbage to the file system sitting on top of the array doesn't matter: the controller must consider the array broken, and have a means to decide which drive should be preferred to repair the array with. There is no deleted data, from the RAID controller's perspective.

The fact that from the perspective of the hardware controller there is no such thing as deleted data is totally irrelevant to a RAID1 scheme since there is no parity to calculate.
No data is ever lost because the only sectors affected are ones containing deleted data. As such the USER never loses any ACTUAL DATA.

When does this crop up?
For RAID5?
During a parity raid (say, RAID5) rebuild operation: Rebuild aborts due to parity mismatch.

For RAID1?
1. During a rebuild operation of a degraded 2 drive RAID1 array: the remaining single drive will be used as a source, no actual data is lost, the controller will NEVER NOTICE the disparity , only garbage is garbled.

2. During a rebuild operation of 3+ drive RAID1 array where at least 2 drives have not failed: If the surviving drives are compared and found to be different (rather than just reading off of one of the surviving drives at random; I have no idea which of those two options is the typical behavior), an FALSE POSITIVE data corruption error/warning might be given to the user, as long as the recovery continues with data from EITHER then no actual data was lost since its just deleted junk that has mismatch.

3. During a validate operation for RAID1

A hardware RAID controller, that will work with any file system, must be able to validate the array just by reading its member's contents

None of the controllers I used had such a feature. But hypothetically if one DID do that, then at worst you get a FALSE POSITIVE warning about data corruption. Where no actual data was corrupted or lost. Because no matter which drive is chosen as a source for the "repair" they are both dealing only with junk sectors

4. If you know of ANY other scenario where this actually crops up in RAID1, please let me know.

Cerb · Nov 3, 2012

taltamir said:
1. During a rebuild operation of a degraded 2 drive RAID1 array: the remaining single drive will be used as a source, no actual data is lost, the controller will NEVER NOTICE the disparity , only garbage is garbled.

During a rebuild? The disparity could be the cause of the rebuild, and it would be a waste of resources to do the rebuild. It may be a false positive and pointless wrt to the user's data--I'm not arguing that--but it's still an error to be handled, that will not exist if TRIM is either specially implemented, or not implemented. The user, for some value of user, will notice errors, even if they aren't actually corrupting useful data. Any such error, however, should not be allowed to happen, and not supporting TRIM as it stands conveniently keeps it from being able to. Determinism is the cure-all, and luckily, it's on the way (even though it never should have been non-deterministic from the start).

You should not lose data from it, but it still has the potential to leave the array in an undetermined state, without some way to read a list of sectors to ignore on a boot-time check, possibly leading to a rebuild, or possibly-bogus errors during an intended rebuild (how are you going to separate a real data error from a free space error without spending a ton of time?). A false positive is a bad thing: we don't need RAID controllers that cry wolf. An error needs to be a real error, or not raised in the first place. If TRIM were to have a guaranteed time bound for response, the very chance would disappear, and implementation would become trivial (most controllers already assume that any ATA drive is lying about having actually just written data, and have been that way for many years, so a small time delay would likely not be an issue).

taltamir · Nov 3, 2012

Cerb said:
During a rebuild? The disparity could be the cause of the rebuild

That is a good point and a cause for concern as unnecessary and unwanted rebuilds are bad even if no data is lost...
However for this to be true it requires that the disparity between the drives be noticed in the first place. The only way that is happening is if the controller for some reason reads and compares the paired sectors on both drives. And as I said, I am not aware of any situation where this will occur on RAID1 except during a rebuild (caused by some other cause like a dead drive).

Read operations are always striped between the two drives such that each drive reads only half the data and they are never both reading the same data, nor is this data compared (this is why RAID1 of 2 drives doubles (nearly) the speed)

However, if there are controllers out there that routinely scan all drives in a RAID1 array for disparity between the two member drives, then in such a situation such a disparity would lead to unnecessary rebuilds (with no data loss) as you said.
If someone could name one such controller I would concede the point (I already stated non deterministic TRIM is an issue for parity doing RAID but not for RAID1, I will concede it is an issue for some RAID1 schemes as well but not all of them. Depending on the specific controller / software used to perform the RAID1)

Mark R · Nov 3, 2012

taltamir said:
That is a good point and a cause for concern as unnecessary and unwanted rebuilds are bad even if no data is lost...
However for this to be true it requires that the disparity between the drives be noticed in the first place. The only way that is happening is if the controller for some reason reads and compares the paired sectors on both drives. And as I said, I am not aware of any situation where this will occur on RAID1 except during a rebuild (caused by some other cause like a dead drive).

Data scrubbing (the act of the controller reading every sector from every drive, and validating that they are all consistent) is a normal option on any sensible RAID system; virtually all SAS RAID controllers offer this; indeed, regular scrubbing is an absolutely essential part of maintaining a RAID array, in order to minimize the risk of data loss.

The problem is that if the scrub finds inconsistencies, then you have the problem about what it actually does about them.

Some may switch from low-priority background (read-only) scrubbing to a high-priority full rebuild (read/write), if inconsistencies in the scrub are detected.

Read operations are always striped between the two drives such that each drive reads only half the data and they are never both reading the same data, nor is this data compared (this is why RAID1 of 2 drives doubles (nearly) the speed)

Not universally true. I've seen some RAID controllers which will ONLY read from drive 1 in RAID 1, unless drive 1 is unavailable. Some controllers interleave reads across the drives, some so in a naive way, which gives performance similar to a single drive; some are smarter.

I've also seen some RAID controllers which ALWAYS read all drives in RAID 1 and compares the data (I believe some versions of the Intel RAID do this in RAID 1); although what it actually does when it gets an inconsistency, I don't know.

taltamir · Nov 3, 2012

Mark R said:
Data scrubbing (the act of the controller reading every sector from every drive, and validating that they are all consistent) is a normal option on any sensible RAID system; indeed, regular scrubbing is an absolutely essential part of maintaining a RAID array, in order to minimize the risk of data loss.

I know what data scrubbing is. I routinely use scrubbing on my ZFS arrays/single drives (and for ZFS this will not be an issue).
But as I said I don't know of any hardware raid controllers that will scrub/verify raid1 arrays. Although to be fair my exposure is mostly cheap and crappy controllers.

Mark R said:
Not universally true. I've seen some RAID controllers which will ONLY read from drive 1 in RAID 1, unless drive 1 is unavailable. Some controllers interleave reads across the drives, some so in a naive way, which gives performance similar to a single drive; some are smarter.

I've also seen some RAID controllers which ALWAYS read all drives in RAID 1 and compares the data (I believe some versions of the Intel RAID do this in RAID 1); although what it actually does when it gets an inconsistency, I don't know.

That is interesting to know. In the latter situation non deterministic TRIM could indeed be an issue even on a RAID1 array (if it triggers a full rebuild).

Mark R · Nov 3, 2012

taltamir said:
I know what data scrubbing is. I routinely use scrubbing on my ZFS arrays/single drives (and for ZFS this will not be an issue).

But as I said I don't know any hardware raid controllers that will scrub/verify raid1 arrays. Although to be fair my exposure is mostly cheap and crappy controllers.

Standard feature on all enterprise level cards.

BonzaiDuck · Nov 3, 2012

Reading the posts by several luminaries here will be educational for sure, at least as far as tracing the logic and at least observing the points of consensus.

That being said, I only wanted to answer questions about TRIM in my ISRT SSD-caching usage -- which requires a RAID BIOS setting, but involves some sort of RAID0 configuration through a utility running under the OS ("IRST").

LC-Technology says that they can't implement their TRIMming "Solid State Doctor" software to work with a RAID setting in BIOS.

Intel tells me that they had fixed the problem of "TRIM-in-RAID" with a version 11.x.x.xxxx of IRST software, but that it will not provide TRIM for my Z68 chipset and requires Z77.

ASUS tells me that I would benefit to update the IRST 10.x.x.xxxx version to 11, but offers no other guarantees, assertions or promises.

Patriot, which makes my Pyro drive, has yet to respond.

Several forum posts seem to insinuate that TRIM works in RAID0, and my ISRT setup requires a RAID0 created with the Windows software -- it is a RAID0 for a drive configured to itself with no other physical drive involved.

I see that others who had used RAID0 for a long time with two disks do not perceive performance degradation. I don't see any performance degradation. IF I want to maintain the drive, I am considering a temporary connection to my Marvell controller which is configured for AHCI.

That's the intelligence I can offer. But I don't really know anything yet, except that "I can do certain stuff" which might improve a drive performance that I haven't perceived to degrade.

And if an SSD is used for ISRT caching, one must then ask about the proportion of read operations versus writes as opposed to a standalone HDD. OR "Is this worth the worry?" Or "Should I just buy a spare SSD for caching that I can switch back and forth?"

And I just recall reading here where someone called TRIM a sort of "bonus" to older SSD performance.

taltamir · Nov 3, 2012

Mark R said:
Standard feature on all enterprise level cards.

Alright then, so for RAID1 it depends on the controller.

Although, it just occured to me... The whole original question was whether implementing non deterministic TRIM would cause problems if implemented in RAID1

However, the 2 possible issues identified are ccontroller specific. It thus means that a controller which does not have either of those 2 issues could have non deterministic trim support for RAID1 added to it with no ill effects.

Of course, the fact that the controller must generate TRIM commands for the component drives based on the TRIM commands it receives for the entire array means that with TRIM tracking in the controller itself any and all RAID types could have TRIM implemented for them. But that would be expensive and complicated compared to using deterministic TRIM.

Old Hippie · Nov 4, 2012

Reading the posts by several luminaries here will be educational for sure

I'm glad some one is getting it.

It's waaaay over my head! LOL!

thepriceisright · Nov 4, 2012

Thank you all for the great input so far, you've all been incredibly cordial as well.

I was going to start testing question #3 (TRIM with dynamic disks) to see for myself but I have a question about the process.

Here is what I did *edit* I should clarify that this is a baseline test on a Samsung 830 connected to the Intel P67 SATA 6G in AHCI:

Placed a txt file on the root of C.

Opened WinHex as an admin, opened the C drive (F9, new snapshot) and found my txt file. I made note of the offset and sector.

Closed WinHex and deleted the txt file, then emptied recycle bin (does shift+del do the same thing with regards to trim?) and waited for a few minutes.

I reopened WinHex as admin, opened the C drive (F9, new snapshot) and searched for the offset. The sectors/offset are not zeroed out, in fact the first 7 hex characters are identical. The rest is gibberish and does not match the baseline hex. Since it is a txt file I can actually read the text in the baseline snap, not so with the second one.

Now my assumption is that the file was trimmed and all is well in the world but this is not what several of the guides I was following stated would happen.

I wanted to make sure my methodology is proper before I continue testing other scenarios.

taltamir · Nov 4, 2012

BonzaiDuck said:
Several forum posts seem to insinuate that TRIM works in RAID0

Using the latest intel driver only. Others do not yet support it

I see that others who had used RAID0 for a long time with two disks do not perceive performance degradation.

There is no performance degradation, there is a single one time drop followed by slight fluctuations up and down around it.
There is however an increase in write amplification which introduces more lifespan degradation. (not sure if significant or not)

IF I want to maintain the drive, I am considering a temporary connection to my Marvell controller

I see no point in doing that. It isn't gonna trim the drive when you plug it in, you would have to first write data and erase it... Although you could run a "manual trim" on it once plugged in I cannot justify doing that every 2 weeks just for a bit extra performance.

which is configured for AHCI.

AHCI is useful in that it provides NCQ which boosts performance. AHCI drivers were the first to be updated with TRIM support but you can get TRIM on ATA (IDE) as well.

OmegaSupreme · Nov 4, 2012

Now correct me if I'm wrong, doesn't TRIM find unused/irrelevant sectors and clear them? Wouldn't this effectively shorten SSD life as there are only a limited amount of read/writes?

Wouldn't it be actually MORE beneficial to disable it (if possible) for drive longevity? Even though you may take a slight hit in overall speed.

wirednuts · Nov 4, 2012

OmegaSupreme said:
Now correct me if I'm wrong, doesn't TRIM find unused/irrelevant sectors and clear them? Wouldn't this effectively shorten SSD life as there are only a limited amount of read/writes?

Wouldn't it be actually MORE beneficial to disable it (if possible) for drive longevity? Even though you may take a slight hit in overall speed.

no. when trim clears a sector, it does so using the secure erase command. basically, it sends an electrical short to that sector.

what you dont want to do is send a D-Ban type of erase to the drive... which is the old school type where it just writes over the unwanted sectors. that shortens the drives life and doesnt speed anything up because the sector still has stuff written on it.

BonzaiDuck · Nov 4, 2012

Taltimir's response about TRIM for RAID0:

taltamir said:
Using the latest intel driver only. Others do not yet support it

Well, I'm trying to nail this down for thel Z68 chipset -- before . . . I install the latest driver and IRST software.

There's always the possibility that some tech-rep is insufficiently informed to give perfectly accurate responses to specific questions. An Intel tech-rep, in e-mail a few days ago, told me that you couldn't get TRIM on the Z68, but only with the Z77 boards.

With these other sources of information -- some forum posts elsewhere I uncovered going back a year -- there was no reference to chipset.

I myself had misinterpreted Intel's online web-pages for the new software version of IRST: It said the software was compatible with chipsets that included the Z68. But that doesn't confirm that it provides TRIM support with the Z68.

All I can say is . . . . the Smart Response (ISRT) SSD caching is about the best thing since white bread. If it doesn't allow TRIM to occur, that's sort of a fly in the ointment . . . . or a fly on the peanut-butter sandwich with Barbara Ann . . . [if they still make that bread].

taltamir · Nov 4, 2012

BonzaiDuck said:
There's always the possibility that some tech-rep is insufficiently informed to give perfectly accurate responses to specific questions. An Intel tech-rep, in e-mail a few days ago, told me that you couldn't get TRIM on the Z68, but only with the Z77 boards.

I should note that the line you quoted me explicitly referred to RAID0...

Saying that a caching device is a "sorta RAID0 that does RAID0 of only one drive" makes no sense as by very definition you need more than one drive to do RAID0.

BonzaiDuck · Nov 4, 2012

taltamir said:
I should note that the line you quoted me explicitly referred to RAID0...

Saying that a caching device is a "sorta RAID0 that does RAID0 of only one drive" makes no sense as by very definition you need more than one drive to do RAID0.

Well, I'm sorry it doesn't make sense, but that's the way it works. Jus' a minute -- I'll get a screen capture of my ISRT status and post it here.

. . . Heck with it . . . I'll have to upload it to my web-pages first. . . .

OK . . . the 60GB SSD is represented as "Array 0000" showing a drive of 56GB as a rectangle knitted to a box labeled "Volume_0000 . . Type: RAID0 Cache Volume 56GB." The accelerated HDD is not part of this "SSD entity." One concludes from the diagram that the accelerated HDD isn't part of any RAID0 -- it's merely accelerated by the caching entity. Which . . . . makes sense, since you can "unhinge" the caching from the windows desktop, and then boot directly from the standalone HDD.

It's weird. I don't think I've come across any technical (understandable) explanation of it. And of course I have reservations, when the Intel tech-rep refers to two or more SSDs in RAID0, without any caveat for the ISRT caching.

OK . . . here's the screen-capture . . .

Cerb · Nov 4, 2012

OmegaSupreme said:
Now correct me if I'm wrong, doesn't TRIM find unused/irrelevant sectors and clear them? Wouldn't this effectively shorten SSD life as there are only a limited amount of read/writes?

No. When you write, it already has to clear and reprogram space on the flash--wear leveling is what allows an arbitrary amount of writing to the device to kill it, rather than a few thousand writes to a single logical sector killing it (also see write amplification). Most drives have either ~7% or ~14% of reserved space for this purpose. TRIM allows that to be increased by up to your free space, allowing it to maintain higher performance, similar to the performance before it was entirely written over.

imagoon · Nov 4, 2012

taltamir said:
It has about 39 references to op code 06h actually spread across what is easily 20-30 pages.
"DATA SET MANAGEMENT" makes no mention of TRIM requiring to report success and "Write Log Ext Error" is not gonna help anyone with an error didn't occur.

A drive, driver, or controller using a driver that uses an ATA spec that predates trim will not generate such an error.

It requires the command to be acknowledged and will generate an error if it is invalid such as "TRIM LBA range" that happens to be above the drive LBA count. TRIM does not require that something happen with it (non determinist part of the command) however the drive is require to either ACK or send a error frame back. As I am aware there is very few (if any at this time) commands that get sent to the device and then get no response back. ATA command channel is supposed to have acknowledges or errors for all commands received. What the drive does with the command may or may not matter.

It is pretty obvious that a device that predates TRIM and the DATA SET MANAGEMENT commands will not generate that error so I am wondering where you are going with that as your point really makes no sense. The drive would report the "Invalid / Unknown command" back to the system which has been defined since ATA1...

Dufus · Nov 5, 2012

While using Intel RST/e drivers at this time then trim support for arrays is only available for RAID0 and then only for 7 series chipsets (except X79) using 11+ drivers and OROM. When drives are seen as SCSI devices using Intel drivers then trim will also work for W7, if it would also work with the disks seen as ATA devices.

Trim only tells the SSD firmware to unmap translation of physical pages to logical LBA's, it does not erase blocks itself. When reading LBA'S that translate to unmapped pages the SSD firmware may generate the returned data on the fly or point to a special cache location. In this case it may return all zero's, all one's or something else but should not return the old data if the trim was successful and the SSD firmware conforms to standards.

Testing for trim success can be problematic, for instance the drivers may change a failed trim to successful in order to keep the OS "happy". The trim command sent from the OS is not the same as the ATA trim command sent directly to the SSD. Possibly the best chance is to physically check the logical LBA's before and after trim.

The skinny on TRIM

Diamond Member

Lifer

Elite Member

Lifer

Elite Member

Lifer

Diamond Member

Lifer

Diamond Member

Lifer

Lifer

Diamond Member

Member

Lifer

Member

Diamond Member

Lifer

Lifer

Lifer

Elite Member

Diamond Member

Senior member