Thank you all for the food for thought thus far.
I should probably have indicated that the main usage of the NAS is for media storage and consumption. It's also used for backups, projects, etc. I am definitely going to be using ECC memory for my build, since that is one of the final areas of vulnerability in my current setup.
My understanding was that the 1TB=1GB rule was for total raw drive capacity, not for zpool allocated space. Is that an incorrect assumption?
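For what it's worth, the rule of thumb is usually quoted against raw capacity, and it's a heuristic rather than a hard requirement. A quick back-of-the-envelope sketch (the function name and the 8 GB base figure are my own illustrative assumptions, not from any official sizing guide):

```python
# Rough RAM sizing using the commonly quoted heuristic:
# ~1 GB of RAM per 1 TB of *raw* drive capacity (not usable zpool space).
def recommended_ram_gb(drive_count, drive_size_tb, base_gb=8):
    """base_gb is an assumed allowance for the OS and baseline ARC."""
    return base_gb + drive_count * drive_size_tb

# e.g. 24 x 4 TB raw -> 8 + 96 = 104 GB as a ballpark
print(recommended_ram_gb(24, 4))  # -> 104
```

More ARC never hurts for a media workload, but nobody treats this number as a floor you must hit before the pool will work.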
I will potentially be enabling compression but I have no current plans for dedupe since it does require such massive quantities of RAM. My usage model also would not benefit from the space savings of deduplication. Even compression is iffy since most of my usage is already-compressed media.
I'd rather add more RAM to increase the ARC cache than add mirrored SSDs for L2ARC or ZIL at this time. I am definitely open to adding them down the line, however.
If you're doing parity RAID modes, checksumming is basically a requirement. Checksumming + parity is what allows bad block recovery during scrubs. Also, you cannot use mirrored SSDs for L2ARC. L2ARC requires cache devices to be single-device vdevs; however, you can attach multiple cache devices for L2ARC, and caching is spread across them using a fairly efficient and intelligent algorithm. ZIL devices should be mirrored, but can be non-mirrored. If you lose your ZIL, you're going to have a bad time.
1) Frankly it was something I hadn't considered, and you raise excellent points. I previously kept my 4 drive RAIDZ vdevs in separate pools because the chance of losing another drive in a vdev during a rebuild seemed plausible. I didn't want to lose 100% of my storage should that occur. Furthermore, I wanted to be sure that I could connect a pool to another system should my NAS hardware fail and I needed access to data.
In retrospect, the scenario is different in the new configuration. Most systems won't support 12 drives at a time without an HBA of some sort, so my portability case is just as moot with 12 drives as it would be with 24 drives in one pool. If you feel there is not significant risk of 4x RAIDZ2 vdevs of 6 drives each having a vdev fail and losing the pool all at once, I will probably go down that route. Striping across four vdevs instead of two would certainly increase performance.
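To put a rough shape on that risk trade-off: a pool dies if any one vdev dies, so more vdevs means more chances, but a raidz2 vdev individually dies far less often than a raidz1 one. A toy sketch assuming independent failures and a completely made-up per-rebuild disk-failure probability (the 2% figure is illustrative only, not a measured rate):

```python
# Toy pool-loss comparison. A raidz vdev is lost when more than `parity`
# of its disks fail; a pool is lost when ANY of its vdevs is lost.
# p_disk is an assumed, illustrative per-disk failure probability.
from math import comb

def vdev_loss_prob(disks, parity, p_disk=0.02):
    """P(more than `parity` of `disks` fail), assuming independence."""
    return sum(comb(disks, k) * p_disk**k * (1 - p_disk)**(disks - k)
               for k in range(parity + 1, disks + 1))

def pool_loss_prob(vdevs, disks, parity, p_disk=0.02):
    return 1 - (1 - vdev_loss_prob(disks, parity, p_disk)) ** vdevs

# One 4-disk raidz1 vdev vs. 4x 6-disk raidz2 in a single pool:
print(pool_loss_prob(1, 4, 1))  # raidz1: loses with any 2 failures
print(pool_loss_prob(4, 6, 2))  # raidz2 x4: needs 3 failures in one vdev
```

Under these toy numbers the 4x raidz2 pool comes out roughly an order of magnitude safer than even a single small raidz1 vdev, which is the intuition behind consolidating.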
ZFS doesn't allow restriping within its algorithms, so striping in the traditional sense is not used across vdevs in a pool. Most of the complexity is at the vdev level, including redundancy, so in a pool with 2 raidz2 vdevs, each vdev is essentially an independent unit. Pools expand via intelligent writes, which means you cannot guarantee a strict performance gain by adding additional vdevs to a pool. The worst case behaves like concatenation (e.g. 100% write bias to vdev1 until it reaches ~80% capacity, then 100% write bias to vdev2, etc.). The best case behaves somewhat like layered RAID0 [RAID60/50/10] (e.g. 50% write bias to vdev1, 50% write bias to vdev2, with balanced capacity targets). You have no control over this layer of the algorithm, so it's difficult to predict exactly how it will perform, but you are guaranteed performance no worse than a single vdev by itself, with a possibility of better performance.
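Those two extremes can be sketched with a toy allocator. This is an illustration of the behavior described above, not ZFS's actual write-placement code; the greedy "most free space wins" rule is my simplifying assumption:

```python
# Toy model of write placement across vdevs: bias each write toward the
# vdev with the most free space. With equally empty vdevs this balances
# like RAID0; with one nearly full vdev it degenerates to concatenation.
# Illustrative only -- not the real ZFS allocator.

def place_writes(vdev_free, blocks):
    """Greedy: each block goes to the vdev with the most free space."""
    placed = [0] * len(vdev_free)
    free = list(vdev_free)
    for _ in range(blocks):
        i = free.index(max(free))  # most-free vdev wins this block
        placed[i] += 1
        free[i] -= 1
    return placed

# Two equally empty vdevs: writes balance ~50/50 (RAID0-like best case)
print(place_writes([100, 100], 50))  # -> [25, 25]
# One nearly full vdev: everything lands on the empty one (concat-like)
print(place_writes([5, 100], 50))    # -> [0, 50]
```

The real allocator weighs more factors than free space, which is exactly why you can't predict where between these two extremes a given pool will land.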
What this means is that ZFS's pooling algorithm encourages larger vdev spindle counts when targeting performance. So with a target of 16 drives in my media server I am using 8x3TB raidz2 vdevs with 2 vdevs in the pool. With a target of 45 disks in the new media server plan I will be using 9x4TB raidz3 vdevs with 5 vdevs in the pool. Both of these configurations will more than meet my required write and read performance values.
Also, keep in mind spindle counts per vdev have strict targets for parity-RAID-based vdevs. In mirrored configurations spindle counts are inherently 2 per vdev, with no real limitations on vdevs per pool.
From the ZFS Best Practices Guide you get the following disk count/redundancy recommendations.
Code:
RAIDZ Configuration Requirements and Recommendations
A RAIDZ configuration with N disks of size X with P parity disks can hold approximately (N-P)*X bytes and can withstand P device(s) failing before data integrity is compromised.
Start a single-parity RAIDZ (raidz) configuration at 3 disks (2+1)
Start a double-parity RAIDZ (raidz2) configuration at 6 disks (4+2)
Start a triple-parity RAIDZ (raidz3) configuration at 9 disks (6+3)
(N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equals 2, 4, or 6
The recommended number of disks per group is between 3 and 9. If you have more disks, use multiple groups.
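The guide's (N-P)*X capacity formula is easy to sanity-check against the configurations discussed above:

```python
# Approximate usable capacity of a raidz vdev: (N - P) * X,
# where N = disks in the vdev, P = parity disks, X = per-disk size.
# (Real usable space is a bit lower due to metadata and padding.)
def raidz_usable_tb(n_disks, parity, disk_tb):
    return (n_disks - parity) * disk_tb

# 8x3TB raidz2, 2 vdevs in the pool:
print(2 * raidz_usable_tb(8, 2, 3))  # -> 36 (TB, approximate)
# 9x4TB raidz3, 5 vdevs in the pool:
print(5 * raidz_usable_tb(9, 3, 4))  # -> 120 (TB, approximate)
```

Note the 8-disk raidz2 layout slightly exceeds the guide's 3-to-9-disk recommendation band's even-N pattern (4+2, 6+3), which is a deliberate trade of resilver time for streaming performance.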
The other thing to understand is that resilver and scrub operations are performance limited/impacted by vdev spindle count/vdev size, not by pool size. When you have a pool with multiple vdevs and start a scrub operation on the pool, the scrub runs on the vdevs concurrently (AFAICT). When you require a resilver, it is done on an individual vdev only. Scrubs are effectively per vdev as well, and since the random I/O speed/IOPS of the vdev and the size of the vdev determine the duration, smaller vdevs with faster individual devices are obviously going to scrub/resilver faster.
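As a rough illustration of why vdev size dominates rebuild time (the 100 MB/s effective rate is a made-up assumption; real resilvers are throttled by random I/O, fragmentation, and concurrent pool activity):

```python
# Back-of-the-envelope resilver duration. The data read to rebuild a disk
# lives on the *vdev*, not the pool, so per-disk used space and member-disk
# speed dominate. The effective rate below is an illustrative assumption.
def resilver_hours(used_tb_per_disk, effective_mb_s=100):
    seconds = used_tb_per_disk * 1e12 / (effective_mb_s * 1e6)
    return seconds / 3600

# Lightly filled disk vs. a nearly full 4 TB disk:
print(round(resilver_hours(1.0), 1))  # -> 2.8 (hours)
print(round(resilver_hours(3.5), 1))  # -> 9.7 (hours)
```

Adding more vdevs to the pool changes neither number, which is the point: pool size is irrelevant to how long any one rebuild takes.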
I personally don't like FreeNAS specifically because it tries to hide all of this away from you. The thing is that ZFS is amazingly cool, but to really dig into the meat of it you need to use the command line and tune it yourself for your application. There are nice tuning profiles available in FreeBSD that you can use, and you can set custom tuning parameters for how it uses memory, which can impact performance in several regards. There are also specific tuning changes that can be made to increase resilver/scrub performance. Once you're into the underlying system, something like FreeNAS becomes extraneous, and in performance testing FreeNAS tends to use tuning profiles that are less performant than the FreeBSD defaults.
2) I am not using a ton of ZFS/FreeNAS functionality at the moment. I currently have scheduled snapshots twice a day (retained for a week at a time). Scrubs every 35 days (FreeNAS default). I am not using compression or dedupe (compression is a potential for the new system). I have SFTP and CIFS servers running as well as Transmission as a torrent client. I am not using NFS because of poor performance in FreeNAS 8. I have not tried it again yet since moving to 9 but it is a possibility, especially since CIFS is only one-thread per user.
Poor NFS performance is common, unfortunately. On FreeBSD all of the network connectivity to the filesystem (CIFS/NFS/iSCSI) is userland, whereas in the native implementation on Solaris it's built into the filesystem (at least NFS and iSCSI are). I think FreeNAS NFS performance is worse than what you might be able to get doing it manually, though.
3) I do not currently have SSDs for L2ARC or ZIL. I might look into adding an SSD mirror for such purposes if I am unhappy with system performance. My understanding is that with ZFS v28, losing the ZIL no longer causes a loss of the pool. Is that correct?
I don't know whether it causes loss of the pool, but losing the ZIL is guaranteed to cause data loss. I'm pretty sure it loses the pool as well, though. Long story short, you don't want it to happen.