Question: Medium-sized NVMe array setup suggestions?


_Rick_ (Diamond Member, joined Apr 20, 2012)
Just did some initial testing (parallel copies of the same large files from tmpfs to multiple targets on the candidate configuration), and reconfirmed an assumption I held to be true but which was recently challenged:
RAID 6 is _much_ slower (at least when writing) than RAID 10. I got 4-5 GB/s writing files from memory to both btrfs RAID 10 and XFS on RAID 10, versus 1.2 GB/s for XFS on RAID 6.
On top of that, the initial resync that was required for some reason took well over three hours.
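For reference, the test loop was roughly this (a sketch - file size, copy count and mount point are placeholders, not my exact setup):

    # Rough sketch of the parallel-copy test; sizes and paths are placeholders.
    dd if=/dev/urandom of=/dev/shm/testfile bs=1M count=8192   # ~8 GiB source file in tmpfs
    time (
        for i in 1 2 3 4; do
            cp /dev/shm/testfile /mnt/array/copy_$i &   # parallel writers
        done
        wait
        sync    # flush the page cache so the timing covers real writes
    )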

Still, the RAID 10 numbers feel like they're a factor of 2-4 off what I would expect (4 striped drives at 4 GB/s each).
This slowness is also making me wonder whether I should just go back to a single encrypted array, instead of an array of encrypted disks. At least right now, the crypto isn't putting on the brakes.
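For context, the two layouts look roughly like this (a sketch - device names are illustrative, and it assumes LUKS via cryptsetup, with mdadm for the classic-RAID variant):

    # Layout A: a single encrypted array - crypto sits on top of the RAID,
    # so there is one dm-crypt mapping for the whole device.
    mdadm --create /dev/md0 --level=10 --raid-devices=8 /dev/nvme[0-7]n1
    cryptsetup luksFormat /dev/md0
    cryptsetup open /dev/md0 cryptarray
    mkfs.xfs /dev/mapper/cryptarray

    # Layout B: an array of encrypted disks - one dm-crypt mapping per
    # drive, with btrfs handling the replication on top.
    for d in /dev/nvme{0..7}n1; do
        cryptsetup luksFormat "$d"
        cryptsetup open "$d" "crypt-${d##*/}"
    done
    mkfs.btrfs -d raid10 -m raid10 /dev/mapper/crypt-nvme*n1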

Btrfs looks like it has a single bottleneck thread, which might even be CPU-limiting the load.
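A quick way to check that theory (a sketch - btrfs kernel threads show up under names like btrfs-transaction):

    # Per-thread CPU view during a copy run; one btrfs kernel thread
    # pinned near 100% of a core would confirm the bottleneck.
    top -bH -n 1 -o %CPU | grep -i btrfs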
I'd also like a filesystem with snapshot capability, for better backups.

Edit: looks like xfsdump might have sufficient locking/CoW capability to handle this.
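If it pans out, it would be something like this (a sketch with placeholder paths and session labels):

    xfsdump -l 0 -L full0 -M disk0 -f /backup/array.0 /mnt/array   # level-0 (full) dump
    xfsdump -l 1 -L incr1 -M disk0 -f /backup/array.1 /mnt/array   # incremental since level 0
    xfsrestore -f /backup/array.0 /mnt/restore                      # restore the full dump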
 

Dana599 (Junior Member, joined Sep 22, 2025)
I'm taking the next step with my storage setup, and will soon buy 8 M.2 SSDs - DRAM-less 2 TB drives, ideally MLC, but potentially even QLC. I'm trying to get the best deal, and I can wait a while.

My intended use case is my home NAS, hosting home directories for my other machines - and potentially freeing up the SSDs installed in them mostly for Steam libraries. I expect the Steam client won't enjoy having to share the directory, so we'll see about that.
It will also likely host some game servers, ownCloud-style services and such - I'm hoping to run either single-node k8s or just plain docker/compose, and to avoid VMs.
Got 16 Zen 4c cores and 128 GB of RAM in the machine, but no plans to run databases with 100 concurrent users on the array - instead it will mostly be media. Worst case will be stuff like Windows AppData.
OS is Gentoo; I've been running btrfs without major issues for a while (although I got annoyed with it refusing to mount when degraded unless you explicitly ask for it, which broke booting off a single disk).
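For reference, the option in question (device name illustrative):

    # One-off mount of a btrfs volume with a device missing:
    mount -o degraded /dev/nvme0n1 /mnt/array
    # For a btrfs root filesystem, the same flag goes on the kernel
    # command line: rootflags=degraded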
I've heard the good, the bad and the ugly about ZFS - since I'm not chasing IOPS in any significant way, I doubt it will matter.
I will have a local spinning-rust backup, so if the worst happens I should be able to recover most of it. Bonus points if I can get snapshots of the volumes.

Main considerations:
Don't eat my data, please - and allow me something like RAID 5/6 so I don't need to throw away half the capacity for RAID 1/10.
Don't eat the SSDs: the drives' write amplification should not be multiplied significantly by the setup. It's bad enough as it is on these cheap drives.
Make drive swaps easy: the disks will be mounted in an externally accessible bay, and I'm looking into somewhat-hot-swapping NVMe drives if they break - if the setup then makes me faff around with more than two lines of shell, I'll be annoyed.
Don't get slower than HDDs - regardless of what I'm doing. I know SSDs already degrade horrendously once you eat through their cache; I need a storage setup where that condition remains the worst case.
Support TRIM - which probably rules out classic software RAID, as it gets much harder to track which blocks are in use if the filesystem needs to pass that information through (a quick way to verify discard actually reaches the drives is sketched below).
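On the TRIM point, this is how I'd verify that discards make it down the stack (a sketch - device names and mount points are placeholders; note dm-crypt has to be opened with --allow-discards for TRIM to pass through it):

    lsblk --discard /dev/nvme0n1          # non-zero DISC-GRAN/DISC-MAX means discard is supported
    fstrim -v /mnt/array                  # one-off trim of the mounted filesystem
    systemctl enable --now fstrim.timer   # periodic trim, instead of the 'discard' mount option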

My current default would be btrfs with "RAID 1", so that every file is replicated exactly once, without any parity overhead - and maybe I'll create a throw-away volume without replication for scratch data, if needed.
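The creation itself would be something like this (a sketch - device names are placeholders; note that btrfs profiles apply to the whole filesystem, so a no-replication scratch area would have to be its own filesystem):

    # Data and metadata each stored twice, spread across all eight drives:
    mkfs.btrfs -d raid1 -m raid1 /dev/nvme{0..7}n1
    mount /dev/nvme0n1 /mnt/array
    btrfs filesystem usage /mnt/array   # shows usable vs. raw capacity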
If you have any additional input on read-heavy "low-cost" NVMe arrays (the disks only cost about as much as it costs to get them wired and installed...), I'd be glad to hear it.

Why am I doing it? It's pretty cool, and I had too many HDDs die on me lately. And disk spin-ups need to go back into the 2000s :D

Sounds like a fun project! Given your goals, btrfs RAID1 for the important data plus a non-replicated scratch volume seems like a safe balance. ZFS is solid but heavier on resources and write amplification, so for a mostly read-heavy workload your plan looks well-suited.
 

_Rick_ (Diamond Member, joined Apr 20, 2012)
Ended up going with btrfs RAID 10 - across the whole pool, as changing the replication profile per subvolume is not supported. I wasn't of a mind to partition the drives, as partition resizing and volume management were not going to be on my menu.
Hot-plug worked pretty well at the device level - what the actual filesystem will do remains to be seen.
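The swap itself should stay within the two-lines-of-shell budget (a sketch - device names are illustrative; if the failed drive has vanished entirely, its btrfs devid can stand in for the source device):

    echo 1 > /sys/bus/pci/rescan                                # make the freshly inserted drive show up
    btrfs replace start /dev/nvme3n1 /dev/nvme8n1 /mnt/array    # rebuild onto the new drive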
Now it's time to figure out a proper automated, incremental backup that leverages the snapshots and lets me stop worrying. I had a brief scare when migrating my current drives: I messed up the encryption algorithm and thought I had lost the whole dataset, when I only had a partial backup. That was a dark couple of days!
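The plan is snapshot-based send/receive, roughly like this (a sketch - paths and snapshot names are placeholders):

    # Read-only snapshot of the live subvolume:
    btrfs subvolume snapshot -r /mnt/array/home /mnt/array/.snap/home-$(date +%F)
    # Send only the delta against the previous snapshot to the backup disk:
    btrfs send -p /mnt/array/.snap/home-PREV /mnt/array/.snap/home-$(date +%F) \
        | btrfs receive /mnt/backup/snaps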