WD Red or WD RE4 drives for ZFS NAS?

blastingcap · Sep 24, 2012

murphyc said:
Well the only reason why you'd get long error recovery on a green drive is due to persistent read errors (internal to the disk). The problem is that it's non-obvious when this is happening until it gets really bad. If you're looking at the full SMART attributes rather than just the single pass/fail health attribute, you can see these show up in either the raw error read rate, ECC errors, or pending sector errors. Some system logs might also show read delays, not just a disk that reports back a read error. So you kinda have to do some diagnostics periodically and most people don't do that. They like to pretend the state of a diskW drive is binary: working vs not working.

So my suggestion is that you initiate ATA Enhanced Secure Erase on the drives periodically which will remove persistently bad sectors from use. You could also just zero the disk with a zeroing utility: this is easier for most people, but it's much slower, and it only zero's addressable sectors rather than all sectors. But if there are any bad sectors, it's equally effective at compelling the firmware to remove them from use, and substitute reserve sectors instead.

Thanks, I'm new to diy nas. I am building a raidz2 on nas4free. Do I even need to do the ashift 12 thing anymore or is the latest nas4free able to handle it? Or at least freenas0.7?

murphyc · Sep 24, 2012

imagoon said:
IOEDC/IOECC is supported on both SAS and SATA. It is on the Red / Black / Blue / RE4 / WD Enterprise drives. The ECC portion of SAS / SATA is not interface specific. It is an HDA thing.

Here is an oldish 2008 article from SNIA that explains what is even a bigger difference today between consumer, nearline and enterprise disks. They are utterly different.

Most current controllers while rebuilding from a degraded status will mark any sectors that it can't rebuild as 'Bad' at the file system level and keep on rebuilding.

Consumer RAID? They are puking their guts out when this happens. The rebuild stops. The array comes unraveled, and users complain about data loss because they don't have backups. Because they think RAID is backup.

If only consumer RAID merely rebuilt the RAID, and a few files were corrupt. Consumer users would not be complaining nearly as much because they wouldn't even know (most likely) they'd been bitten. You would not see consumers irate with a handful of bad files.

RAID isn't a backup so this is acceptable.

This is not the perspective of most consumer users.

90 second recovery on consumer drives is reasonable because it isn't assumed there is another disk or group of disks available for consumer drives.

That is PRECISELY the situation you have with consumer RAID in a degraded state.

This whole TLER/SCT ERC thing is a NON-ISSUE in enterprise storage. The controller timeout is properly timing out for the drives error recovery, and the drives have vastly better error recovery from the outset and the RAID controllers will typically not implode on additional read errors when degraded. That sort of behavior costs money. Money consumers aren't spending for their RAID.

And it's why ZFS, btrfs, and ReFS are very cool because we get much more resilience with data while still being able to use cheaper disks.

The SATA and SAS disks use the same ECC correction technology.

They do not, see the SNIA presentation previously cited. Google is full of this stuff, some more current than that one, but that one has nice pretty diagrams very clearly describing the significant differences between SATA and SAS error detection and correction capabilities.

All those numbers for expected life and none of them are an order of magnitude better than any other.

That's because you're quoting MTBF. I expressly said unrecoverable error rate, or UER. That's a different thing.

The UER on Red is the same as it is for Green. 10^14. The UER on RE4 is 10^16. The difference is ~51 hours to reach a hard error at sustained transfer rate for consumer SATA, and ~2700 hours for enterprise SAS.

murphyc · Sep 24, 2012

tynopik said:
the point is that such things are not defined/required in the spec

Yes they are, go do some reading.

it's possible to make a SAS disk without such things and a SATA disk with such things

Not every advanced feature you find in SAS can be done in SATA. There's a reason why there are such things as nearline SATA and nearline SAS. But they are still different.

also, please explain how the hardware can be 'more accurate' with the same amount of parity data? either you have enough to recover or you don't

How it is originally encoded is different, how it is decoded is different. The sync marks are different allowing better separation between the ECC region on the disk and the user data region, etc.

murphyc · Sep 24, 2012

blastingcap said:
Do I even need to do the ashift 12 thing anymore or is the latest nas4free able to handle it?

That's a totally different thing, related to 512e AF disks and proper alignment. I wouldn't think that ZFS would be totally unaware of how to handle 4KB sectors since it's block size is 4KB, and it's typical to just format the bare disk without partitioning it - so alignment should be assured. So I'm not sure what the context is for needing ashift.

tynopik · Sep 24, 2012

murphyc said:
Yes they are, go do some reading.

I read the paper.

You're wrong.

murphyc · Sep 24, 2012

tynopik said:
I read the paper.

You read what paper?

You will find product spec after produce spec saying the same thing, Full IOEDC/IOECC is only on the SAS model.

http://www.seagate.com/docs/pdf/marketing/po_barracuda_es_2.pdf

murphyc · Sep 24, 2012

PI, formerly DIF, only in SAS. Hitachi End to End Data Protection.

Variable sector size. Only in SAS.

tynopik · Sep 24, 2012

murphyc said:
You will find product spec after produce spec saying the same thing, Full IOEDC/IOECC is only on the SAS model.

That's their implementation, not what is actually required in the SAS spec.

imagoon · Sep 24, 2012

This is not the perspective of most consumer users.

Just because it is not the perspective of most consumer users doesn't mean they are correct.

As for the UER, that number is an average and is based heavily on how the drive is tested. It has some value but most UER's indicated that a 25GB RAID 5 array simply can never rebuild. This is of course hogwash. The higher the UER the odds of a problem should be lower. It is always still a gamble.

Picking on the Red drive again (2TB): Most consumers are not likely to read the entire disk the 46 times it takes to hit a "100% chance of a read error." The RE4 takes 4656 total disk reads to hit 100%

That is just statistics and doesn't always show reality because Enterprise disks fail often enough when not read 4656 total times.

The UER is determined the same way the MTBF is. They test a batch and derive it from there.

Other things that are interesting to explain:

RE4 and RE SAS use the exact same HDA but one is rated at 10^16 and one is 10^14.
Same sector count per HDA, Same transfer rates. This indicated the sectors are the same size since a larger sector with more ECC would add additional overhead and slow the SAS drive.

murphyc · Sep 24, 2012

tynopik said:
That's their implementation, not what is actually required in the SAS spec.

You said defined or required. The SAS spec defines a LOT more related to error detection, correction, recovery, and integrity than SATA. And some of those things are required including the extra sync marking with SAS which SATA does not require.

tynopik · Sep 24, 2012

murphyc said:
PI, formerly DIF, only in SAS. Hitachi End to End Data Protection.

which has diddly squat to do with ensuring that the data on the disk itself is good.

that's about ensuring data integrity through the rest of the chain (cable, controller, etc), but if it's corrupt on the disk, that doesn't help you at all.

murphyc · Sep 24, 2012

tynopik said:
about ensuring data integrity through the rest of the chain (cable, controller, etc), but if it's corrupt on the disk, that doesn't help you at all.

It includes a superior ability to correct for corruption regardless of where in the chain it occurs. That's the whole point of it. And it's convenient how you completely ignore the part about sync marks being different between SAS and SATA which does directly help correct expressly for on-disk corruption as the result of surface defects.

murphyc · Sep 24, 2012

imagoon said:
Just because it is not the perspective of most consumer users doesn't mean they are correct.

My very point is that consumers routinely do things that are not in their best interest, let alone best practices.

As for the UER, that number is an average..

The POINT is that consumer SATA statistically has two full orders of magnitude greater probability of encountering hard errors than than enterprise SAS. This is not insignificant over the life of a disk. When it actually occurs in the real world with a specific disk is obviously unknown, it does not require discussion. We know that people are encountering disk failures (no surprise) in RAID, but we also know (a surprise for some usually consumers) that in a degraded state a read error causes arrays to fail. So my point with Red is that while it will reduce the likelihood of a disk being dropped from an array in the first place, because the UER is the same as a green, statistically you still have risk of RAID array failure when it's degraded if any other disk encounters read errors. Perhaps higher risk.

RE4 and RE SAS use the exact same HDA but one is rated at 10^16 and one is 10^14.

The data sheet I've got off the web site disagrees. 10^15 vs 10^16.
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771386.pdf
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701338.pdf

Same sector count per HDA, Same transfer rates.

For transfer rates, not the same for the 1TB.

1TB SAS = 139MB/s, 12.8 read/write watts, 9.9 idle watts
1TB SATA = 128MB/s 5.9 read/write watts, 5.9 idle watts

These are not the same drive, so I don't know what you mean by the HDAs being the same. While the 2TB disks are more similar in performance, the power consumption is still higher.

This indicated the sectors are the same size since a larger sector with more ECC would add additional overhead and slow the SAS drive.

That is an unsupported deduction given that they don't in fact have the same performance. And SAS and SATA command queuing are completely different by specification. And the UER very clearly indicates there's a difference in error handling capability of the two drives.

How in the world do you think they get an order of magnitude reduction in UER, and a slightly higher transfer rate, if otherwise the two disks are identical? Well they are in fact not identical, there's a difference between SAS and SATA error handling. Again the SNIA diagrams explain a lot of this.

tynopik · Sep 24, 2012

murphyc said:
How in the world do you think they get an order of magnitude reduction in UER, and a slightly higher transfer rate, if otherwise the two disks are identical? Well they are in fact not identical, there's a difference between SAS and SATA error handling. Again the SNIA diagrams explain a lot of this.

Oh look: http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701284.pdf

VelociRaptor (SATA) has the same rate as the RE SAS

But that's unpossible! You just said that SAS drives are better than SATA drives!

And it's convenient how you completely ignore the part about sync marks being different between SAS and SATA

Because it's minor. If you think that accounts for an order of magnitude difference, i've got a bridge to sell you.

murphyc · Sep 24, 2012

tynopik said:
The Red line has a NRRE rate of <1 in 10^14
the RE4 line has a NRRE rate of <1 in 10^15
How in the world do you think they get an order of magnitude reduction in NRRE WHEN THEY'RE BOTH SATA?

Better physical build of the RE4. Obviously. But you're claiming the RE4 and RE SAS are the same physical build, with no meaningful difference in SCSI error handling over SATA and yet the UER is an order of magnitude difference. You refuse to explain this, except via hand waiving.

Because it's minor. If you think that accounts for an order of magnitude difference, i've got a bridge to sell you.

Oh so now you're resorting to putting words in my mouth instead of explaining the ridiculousness of your assertions. I never said or implied SAS sync marks exclusively account for a 10^1 reduction in UER. It was an example of what is uniquely required with SAS drives, that you previously said there were no error detection or correction differences required by SAS that could not be implemented by SATA.

Here go read some more.

murphyc · Sep 24, 2012

tynopik said:
VelociRaptor (SATA) has the same rate as the RE SAS. But that's unpossible! You just said that SAS drives are better than SATA drives!

10,000 RPM vs 7200RPM. The RE SAS is NL-SAS. The VelociRaptor is about the best SATA disk in terms of performance and UER you're going to get and based on specs it's pretty remarkable.

WDC isn't quoting MTBF for either SATA drive you've mentioned. The RE SAS is 1.2 million. Go to a company like Hitachi, more well known for enterprise disks, and you'll see their offerings are 2 million MTBF for 10k drives.

Interface accounts for part of the advantage, as does build quality. If you combine both, you get enterprise SAS which no SATA drive can touch on performance or error handling.

tynopik · Sep 24, 2012

murphyc said:
But you're claiming the RE4 and RE SAS are the same physical build

not me

murphyc said:
Oh so now you're resorting to putting words in my mouth instead of explaining the ridiculousness of your assertions. I never said or implied SAS sync marks exclusively account for a 10^1 reduction in UER. It was an example of what is uniquely required with SAS drives

please give some more examples of where SAS drives actually do a better job of protecting the data on the disk.

murphyc said:
Here go read some more.

again, nothing about how the SAS standard helps ensure data integrity on the drive, just that enterprise drives tend to have better quality components

I'm beginning to think that your 'an example' is actually 'sole example'

tynopik · Sep 24, 2012

murphyc said:
RE SAS is clearly a NL-SAS disk. Hence UER of 10^15 rather than 10^16 like what you get with enterprise SAS. In effect the RE SAS is a SATA physical disk with a SAS interface. Interface accounts for part of the advantage, as does build quality. If you combine both, you get enterprise SAS which no SATA drive can touch on performance or error handling.

proof please

WD SATA drives range from 10^14 to 10^16, that's two orders of magnitude with the same interface. Are we supposed to believe that that last order of magnitude can ONLY be explained with the help of the interface?

imagoon · Sep 24, 2012

murphyc said:
My very point is that consumers routinely do things that are not in their best interest, let alone best practices.

The POINT is that consumer SATA statistically has two full orders of magnitude greater probability of encountering hard errors than than enterprise SAS. This is not insignificant over the life of a disk. When it actually occurs in the real world with a specific disk is obviously unknown, it does not require discussion. We know that people are encountering disk failures (no surprise) in RAID, but we also know (a surprise for some usually consumers) that in a degraded state a read error causes arrays to fail. So my point with Red is that while it will reduce the likelihood of a disk being dropped from an array in the first place, because the UER is the same as a green, statistically you still have risk of RAID array failure when it's degraded if any other disk encounters read errors. Perhaps higher risk.

The data sheet I've got off the web site disagrees. 10^15 vs 10^16.
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-771386.pdf
http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701338.pdf

For transfer rates, not the same for the 1TB.

1TB SAS = 139MB/s, 12.8 read/write watts, 9.9 idle watts
1TB SATA = 128MB/s 5.9 read/write watts, 5.9 idle watts

These are not the same drive, so I don't know what you mean by the HDAs being the same. While the 2TB disks are more similar in performance, the power consumption is still higher.

That is an unsupported deduction given that they don't in fact have the same performance. And SAS and SATA command queuing are completely different by specification. And the UER very clearly indicates there's a difference in error handling capability of the two drives.

How in the world do you think they get an order of magnitude reduction in UER, and a slightly higher transfer rate, if otherwise the two disks are identical? Well they are in fact not identical, there's a difference between SAS and SATA error handling. Again the SNIA diagrams explain a lot of this.

You own link shows 10^14 and 10^16. 1^15 is 10^14.

The HDA = Hard drive assembly. The metal box is the same between those drives. Power levels would obviously different with SAS controller board having dual controllers and the like. Extra ECC will slow a the same HDA because there would be extra bits per sector. However since the HDA is the same and the sector count is identical and the transfer rates are the same (2tb model), it infers it uses the same ECC encoding.

IE a 516 byte ECC would have .78% overhead per sector and 528 3.12%. That ~2.2% of extra transfer should show itself as at least a 3megabyte / sec loss.

murphyc · Sep 24, 2012

tynopik said:
proof please

I've already cited you numerous sources, including from SNIA, which is a trade organization. The diagrams fully demonstrate the difference in error handling between SAS and SATA in exhaustive detail and you keep hand waiving. The SCSI command set by specification itself has more error handling granularity than SATA does also.

WD SATA drives range from 10^14 to 10^16, that's two orders of magnitude with the same interface. Are we supposed to believe that that last order of magnitude can ONLY be explained with the help of the interface?

They only offer ONE drive that is SATA with that UER. And that's merely one metric for error handling capability.

imagoon · Sep 24, 2012

murphyc said:
I've already cited you numerous sources, including from SNIA, which is a trade organization. The diagrams fully demonstrate the difference in error handling between SAS and SATA in exhaustive detail and you keep hand waiving. The SCSI command set by specification itself has more error handling granularity than SATA does also.

They only offer ONE drive that is SATA with that UER. And that's merely one metric for error handling capability.

The part you keep missing is the interface itself has nothing to do with the ECC abilities of the drive. The controllers ram can have ECC. PCI-e definitely does. SAS has ECC as part of the correcting frames. So does SATA. That leaves the disk. There is nothing that says an SATA disk cannot have ECC RAM on the drive itself allowing end to end ECC. The specs even show that Seagate does that routinely on their SATA disks and the Western Digital appears to also on some of their SATA line. In addition SATA added ECC on commands like SAS.

murphyc · Sep 24, 2012

imagoon said:
You own link shows 10^14 and 10^16. 1^15 is 10^14.

Yes it is.

Extra ECC will slow a the same HDA because there would be extra bits per sector. However since the HDA is the same and the sector count is identical and the transfer rates are the same (2tb model), it infers it uses the same ECC encoding.

If you look at how standard and reverse ECC differ, they use the same space on disk but are computed differently.

IE a 516 byte ECC would have .78% overhead per sector and 528 3.12%. That ~2.2% of extra transfer should show itself as at least a 3megabyte / sec loss.

You're assuming identical processing hardware on disks with two different interfaces. You can't know this based on the published specs with what ASIC and other IC's are being used for ECC processing.

imagoon · Sep 24, 2012

murphyc said:
Yes it is.

If you look at how standard and reverse ECC differ, they use the same space on disk but are computed differently.

You're assuming identical processing hardware on disks with two different interfaces. You can't know this based on the published specs with what ASIC and other IC's are being used for ECC processing.

No I am not assuming. The disk spinning can only read so many bytes of data at the same time. No matter of ASICs are going to solve the fact that disk has the same angler velocity and reading 512 / 516 / 528 bytes of data will not be the same because the head has to read more bytes. If you need to read 528 bytes to get 512 bytes, the additional 14 bytes are over head and slows the transfer by the same amount of overhead.

In addition here is emulex's position on end to end error correction on SATA:

http://www.emulex.com/artifacts/ba0...eaa8e5bed206/elx_wp_all_emb_dataintegrity.pdf

murphyc · Sep 24, 2012

imagoon said:
The part you keep missing is the interface itself has nothing to do with the ECC abilities of the drive.

The command set is part of the error handling capability of the drive, and the SCSI command set has finer granularity in error handling than SATA.

There is nothing that says an SATA disk cannot have ECC RAM on the drive itself allowing end to end ECC.

Yes, there is. It's called reality. The people who want this and will pay for it also want SAS. They don't want SATA.

The specs even show that Seagate does that routinely on their SATA disks and the Western Digital appears to also on some of their SATA line. In addition SATA added ECC on commands like SAS.

I've already cited recent Seagate marketing materials where they clearly offer "full ECC and EDC" only on SAS disks. Now you can go on and on about how they could put this in SATA drives, but I don't understand what the whole point of a theoretical conversation is. The fact of the matter is there is a substantial difference in real world error handling between SAS and SATA. If you want more reliable disks, you're not using a resilient file system, the data is important, you don't want to wait for long error recovery, etc, then don't use consumer SATA! Very simple. There's a disk for you and it's enterprise SAS - that world is not busy pissing in their pants about TLER b.s. (which by the way is an exclusively WDC marketing term, the correct term is SCT ERC).

murphyc · Sep 24, 2012

imagoon said:
No I am not assuming. The disk spinning can only read so many bytes of data at the same time. No matter of ASICs are going to solve the fact that disk has the same angler velocity and reading 512 / 516 / 528 bytes of data will not be the same because the head has to read more bytes.

Which could still be processed faster with one ASIC than another.

Nevertheless, the disks that use variable byte sectors clearly say this in their specs. The RE SAS does not have variable byte sectors. It's clearly NL-SAS. Only enterprise SAS offer that particular characteristic, and not all of them, but many do. You can have differences in ECC, without it affecting the space taken up on-disk.

If you need to read 528 bytes to get 512 bytes, the additional 14 bytes are over head and slows the transfer by the same amount of overhead.

And yet the 520 byte PI drives are regularly faster. And in this case, two non PI drives, with 512 byte sectors, the SAS disk is faster than the SATA, when you claim they are physically identical, spinning at identical speed and have zero difference in overhead between them.

In addition here is emulex's position on end to end error correction on SATA:

http://www.emulex.com/artifacts/ba0...eaa8e5bed206/elx_wp_all_emb_dataintegrity.pdf

Old. 2006. Clearly there's nothing happening with PI on SATA. That there is a need doesn't mean it's going to happen. Consumers aren't going to pay for HBAs that can support PI. And the enterprise consumers who will overwhelmingly have chosen SAS. There are tons of reasons why SAS is better for enterprise applications, not just real world error handling.

WD Red or WD RE4 drives for ZFS NAS?

Diamond Member

Senior member

Senior member

Senior member

Diamond Member

Senior member

Senior member

Diamond Member

Diamond Member

Senior member

Diamond Member

Senior member

Senior member

Diamond Member

Senior member

Senior member

Diamond Member

Diamond Member

Diamond Member

Senior member

Diamond Member

Senior member

Diamond Member

Senior member

Senior member