40TB storage solution for home?


jwilliams4200

Senior member
Apr 10, 2009
532
0
0
It's in the forum, you can read about it.

Making up more FUD, I see. Do you talk about unsupported claims so much because you are fond of making them?

I have been quite specific in my claims and my support. You just ignore it for some reason.
 

Red Squirrel

No Lifer
May 24, 2003
70,176
13,576
126
www.anyf.ca
Short answer is yes you can do it on the fly.

Longer answer is:

Recent versions of mdadm can grow RAID 5 and RAID 6 while the file system is live in-use.

For many years it has been possible to add, remove, and migrate storage to/from LVM-managed volume groups while the file system above is live.

Note that in both cases, growing the array or growing the volume group does not increase the size of the file system; that is a separate step. XFS supports growing the file system online, so after you grow an array or logical volume, you can then grow the XFS file system.

Yeah, mdadm has been able to do this for a while now (it could on my FC9 machine). Another thing it can also do (which mine can't, since it's too old) is convert between different RAID levels. There are limitations, but from what I gather it can go from RAID 5 to RAID 6 and vice versa; I'm not sure about the other RAID levels, but it may be able to handle some of them too, like going from RAID 1 to RAID 10 or RAID 0 to RAID 10. And yes, you need to run a grow/extend command afterwards to expand the actual file system. Think of growing the RAID the same way as growing a vmdk file in a VM.
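
For anyone who wants the concrete steps, the sequence is roughly as follows (the device names, device count, and mount point here are placeholders for your own setup):

    mdadm --add /dev/md0 /dev/sdX            # add the new disk to the array as a spare
    mdadm --grow /dev/md0 --raid-devices=7   # reshape the live array to use it (can take many hours)
    xfs_growfs /mnt/storage                  # then grow the mounted XFS file system into the new space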

For that much storage I'd probably go RAID 6 right off the bat. Also, you will want a very good UPS for this and have it set up to do a proper shutdown should the power go out for too long. Disk arrays do not like hard shutdowns; it's almost a guarantee that something will get corrupted if that happens.
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
For that much storage I'd probably go RAID 6 right off the bat. Also, you will want a very good UPS for this and have it set up to do a proper shutdown should the power go out for too long. Disk arrays do not like hard shutdowns; it's almost a guarantee that something will get corrupted if that happens.

That is a good example of an advantage of snapshot RAID over conventional RAID 5 or RAID 6 that I forgot to mention before.

With snapshot RAID, each drive has its own filesystem. So even if your system crashes or loses power in the middle of a write and manages to corrupt the entire filesystem that was being written to, you can still recover your data, since only the filesystem on one drive got clobbered -- you can recover just as if the RAID had lost one drive. There is no straightforward way to do that with distributed-parity RAID, since there is generally only one filesystem for the entire RAID, and if it gets corrupted, the damage covers all the drives in the RAID. Of course, some filesystems (like ZFS) have safeguards to make such filesystem corruption less likely. Still, I'd rather have the flexibility of an independent filesystem on each drive.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
Making up more FUD, I see. Do you talk about unsupported claims so much because you are fond of making them?

Troll. You accuse me of what you're guilty of:

A lot of people here are giving bad advice with regards to ZFS or mdadm RAID.

You have not stated either vaguely or precisely who and/or what bad advice has been given with regard to ZFS or md-RAID. Immediately after your snotty comment, you proceeded exclusively to talk about snapshot RAID capabilities, from which one could in no way infer what "bad advice" others gave.

I have been quite specific in my claims and my support. You just ignore it for some reason.

No you've only supplied the general capabilities of SnapRAID and FlexRAID, which I think are helpful contributions to the conversation. However, you have not at all been specific with respect to what "bad advice" others have offered. Nor have you answered the question regarding what enterprise implementations exist of either snapshot RAID you've suggested.

You seem to be under the impression that you get to piss in people's faces on a public forum and not be called out on it. It is valuable to have people point out the errors/flaws of others' suggestions, but you have not done this. You've merely done a "your idea sucks donkey balls, mine is better, neener neener."
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
You have not stated either vaguely or precisely who and/or what bad advice has been given with regard to ZFS or md-RAID.

I am assuming people reading my posts are not morons. I listed the specific advantages of snapshot RAID. It only takes a little bit of thought to realize the implication -- that the other systems being recommended lack those advantages, and are therefore a poor choice for the usage described by the OP.

Please stop drawing this thread off-topic by posting nonsense and unsupported accusations. The people reading these posts are not morons, and do not need to have every obvious implication spelled out for them.
 
Last edited:

murphyc

Senior member
Apr 7, 2012
235
0
0
I am assuming people reading my posts are not morons. I listed the specific advantages of snapshot RAID. It only takes a little bit of thought to realize the implication -- that the other systems being recommended lack those advantages, and are therefore a poor choice for the usage described by the OP.

Except that you're basically calling everyone a moron by asking them to believe this ridiculous explanation. How is it that non-morons, in your view, give bad advice in the first place?

Please stop drawing this thread off-topic by posting nonsense and unsupported accusations. The people reading these posts are not morons, and do not need to have every obvious implication spelled out for them.

Yes, unsupported except for the fact that you actually wrote that a lot of people were giving bad advice. I guess in your worldview there are nefarious non-morons who aren't poorly informed but nevertheless give bad advice, for unknown reasons.

Your listed, highly generalized benefits of snapshot RAID:
mix and match any number and capacity of drives

You can do this with md-raid and LVM as well, not unique to snapshot RAID.

start with already filled drives

This appears to be a unique advantage of SnapRAID. But the reason it's possible is that SnapRAID is implemented above the file system level, and that is also why there absolutely will be an upper limit to scalable performance.

expand one drive at a time without completely rewriting the whole array

Can be done with ZFS. Can be done with mdraid. Can be done with LVM.

can lose more drives than you have redundancy (parity); you only lose the data on the dead drives, not the entire array.

Again because it's implemented above the file system, each disk retains its own file system.

Of your four listed benefits, two are valid and unique to SnapRAID.

Now, considering that your logic appears to be that you do not have to state the specific bad advice, or why it matters or may be a problem, but rather that all one must do is state a deficiency and the deficient implementation is to be summarily rejected, I choose as my arbitrary example: ZFS checksumming and scrubbing eliminates bit rot and silent data corruption.

By inference, therefore, ZFS is superior to SnapRAID, and further qualification is not necessary, per your rules of logic on this subject: I get to arbitrarily choose a feature, claim it to be important above all others, not qualify the claim in any way, and brand your advice as "bad", also without further qualification.

This logic is self-evidently absurd.

I invite you to recant your "bad advice" claim and state that you merely have a preference/bias for certain features of SnapRAID. Because that is all you have provided.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
SnapRAID does checksum files. Nevertheless the prior logic remains flawed.

A better example is that the redundancy of a SnapRAID array is not realtime; it is done upon request. So if you haven't made such a snapshot request recently, any data written since the last snapshot can be lost, unlike with ZFS or mdraid 1, 5, or 6.

Another example, for many-disk arrays, is to propose that triple parity is a most important feature, something ZFS has but neither mdraid nor SnapRAID offers.

And yet another example is the ability to export/clone an entire filesystem, something ZFS can do.

My point is that stating others are giving "bad advice" is poor form. Be specific. And account for alternatives.
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
mix and match any number and capacity of drives
You can do this with md-raid and LVM as well, not unique to snapshot RAID.

Wrong. When using drives of different capacities, an mdadm RAID-5 or RAID-6 will only use the first X amount of space on each drive, where X is equal to the capacity of the smallest drive. Of course, LVM can be used to concatenate different-capacity drives, but LVM does not provide RAID-5- or RAID-6-like redundancy. And there is no way to combine mdadm with LVM that will let you, for example, utilize the full capacity of different-size drives together with dual-parity redundancy.

expand one drive at a time without completely rewriting the whole array
Can be done with ZFS. Can be done with mdraid. Can be done with LVM.

Wrong again. ZFS cannot expand vdevs. You can only add more vdevs to a pool, but since a Z1 or Z2 vdev is 3 or 4 drives minimum, you can only expand by 3 or 4 drives at a time with ZFS.
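
A rough illustration (the pool and device names here are made up): you cannot grow an existing raidz2 vdev by one disk, but you can grow the pool by attaching another complete vdev, e.g.

    zpool add tank raidz2 sde sdf sdg sdh   # adds a whole new 4-disk raidz2 vdev to the pool "tank"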

mdadm can expand a RAID-5 or -6 one drive at a time, but it must rewrite the entire RAID (all 40TB or whatever) in order to add that one drive.

LVM can, of course, easily add a drive to a volume, but since LVM does not provide any redundancy, it would be foolish indeed to maintain a 40TB logical volume by itself. Bad advice.

ZFS checksumming and scrubbing eliminates bit rot and silent data corruption.

Wrong yet again (the implication being that that is an advantage for ZFS and not SnapRAID). You've made it a trifecta (or should we go with "three strikes, you're out"?). As I already stated, and as anyone who actually bothered to know what they were talking about before making erroneous claims would know, SnapRAID also does block checksums on all the data and can do the equivalent of a scrub (snapraid check) to look for any bit errors that may have developed.
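
For reference, the usual SnapRAID cycle looks something like the following; the paths in this config are examples only, not a recommendation:

    # /etc/snapraid.conf (example layout)
    parity /mnt/parity1/snapraid.parity
    content /mnt/disk1/snapraid.content
    content /mnt/disk2/snapraid.content
    disk d1 /mnt/disk1/
    disk d2 /mnt/disk2/

    # then, from the shell:
    snapraid sync    # recompute parity and checksums after files change
    snapraid check   # verify the checksums, i.e. the scrub-equivalent mentioned above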
 
Last edited:

murphyc

Senior member
Apr 7, 2012
235
0
0
Wrong. When using drives of different capacities, an mdadm RAID-5 or RAID-6 will only use the first X amount of space on each drive

Not wrong. Absolutely possible. Synology does it and calls it Hybrid RAID. There are multiple techniques depending on the drives. The easiest is to place like-sized drives in arrays, then add those arrays into a VG for aggregation. Another way is to partition the drives down to a lowest-common-denominator size, possibly using different RAID levels (e.g. if you end up with only two 500GB partitions on two drives, you'd mirror those partitions since you can't RAID 5 or 6 them).

mdraid can be applied to partitions, not just whole disks.
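
A rough sketch of that approach, with made-up drive sizes and device names (say two 2TB drives and two 3TB drives, with each 3TB drive partitioned into a 2TB piece and a 1TB piece):

    mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1   # RAID 6 over the four 2TB partitions
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc2 /dev/sdd2                        # mirror the leftover 1TB partitions
    pvcreate /dev/md0 /dev/md1
    vgcreate bigvg /dev/md0 /dev/md1         # aggregate both arrays into one volume group
    lvcreate -l 100%FREE -n storage bigvg
    mkfs.xfs /dev/bigvg/storage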

And there is no way to combine mdadm with LVM that will let you, for example, utilize the full capacity of different-size drives together with dual-parity redundancy.

Untrue. You can LVM-mirror two like-sized mdraid RAID 1 arrays, and everything else can be RAID 6. It's hardly more of a clusterf|ck than 30-some-odd disks each with their own file systems on them, plus a lot of metadata interspersed.

ZFS cannot expand vdevs. You can only add more vdevs to a pool, but since a Z1 or Z2 vdev is 3 or 4 drives minimum, you can only expand by 3 or 4 drives at a time with ZFS.

It's true adding a single disk is not possible, but that is an arbitrarily low addition for expanding such a large array. btrfs can do this, however.

mdadm can expand a RAID-5 or -6 one drive at a time, but it must rewrite the entire RAID (all 40TB or whatever) in order to add that one drive.

*shrug* it's a live rebuild.

LVM can, of course, easily add a drive to a volume, but since LVM does not provide any redundancy, it would be foolish indeed to maintain a 40TB logical volume by itself. Bad advice.

I suppose LVM mirroring in your book is not redundancy, but I suspect others will disagree.
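
For reference, a one-line sketch of what LVM mirroring looks like (the size and volume group name are arbitrary):

    lvcreate -m 1 -L 500G -n mirrored vg0   # creates a mirrored (two-copy) logical volume in VG "vg0"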
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
Absolutely possible. Synology does it and calls it Hybrid RAID.

mdraid can be applied to partitions, not just whole disks.

Now you are changing from incorrect to just bad advice. Yes, you can make multiple RAIDs in various ways to try to utilize all the space on different size drives. But this is inefficient compared to snapshot RAID, which can do a single RAID of all the different size drives, thus only "wasting" 1 or 2 drives for single- or dual-parity. The method you are talking about, to do dual-parity, would end up "wasting" 4 or 6 or more drives.

It's true adding a single disk is not possible, but that is an arbitrarily low addition for expanding such a large array. btrfs can do this, however.

Expanding by a single disk is not "arbitrary". It is fundamental. And it is an important feature for a home RAID like this where expansion tends to be haphazard.

btrfs cannot yet do RAID-5 or RAID-6, so I am not sure why you are bringing it up (it is also very bad advice, since btrfs is not yet reliable enough for something like this). mdadm can expand by a single drive, but as I already said, it needs to rewrite the entire RAID to do it.

I suppose LVM mirroring in your book is not redundancy, but I suspect others will disagree.

It is redundancy, but it is bad advice in this situation. For the amount of storage that he is talking about, mirroring is too inefficient and therefore too costly.
 
Last edited:

Viper GTS

Lifer
Oct 13, 1999
38,107
433
136
Christ you guys argue a lot.

On a lighter note here's a few pictures of my setup:

[photos: storage_1.jpg, storage_2.jpg, storage_3.png, storage_4.png]


Viper GTS
 

murphyc

Senior member
Apr 7, 2012
235
0
0
Now you are changing from incorrect to just bad advice. Yes, you can make multiple RAIDs in various ways to try to utilize all the space on different size drives.

Funny how you manage to contradict yourself and move the goalposts in back-to-back sentences. Your original assertion was that it cannot be done, yet you blame that on me, then call it bad advice, then change the standard from "can be done" to "can be done as efficiently as I subjectively assert, even if it's after the fact." You are welcome to take the concept up with the multitude of vendors and companies who utilize this strategy for primary storage arrays.

But this is inefficient compared to snapshot RAID, which can do a single RAID of all the different size drives, thus only "wasting" 1 or 2 drives for single- or dual-parity. The method you are talking about, to do dual-parity, would end up "wasting" 4 or 6 or more drives.

I like how you complain about inefficient space usage, while entirely ignoring the fact that snapshot RAID is not live redundancy and doesn't offer triple parity redundancy. For a 60TB array, that might be important to someone.

btrfs cannot yet do RAID-5 or RAID-6, so I am not sure why you are bringing it up

Another goal post shift. The context was an array that is growable by adding a single disk. And btrfs can do this. That you're now limiting the array to only RAID 5 or 6 is arbitrary as well as after the fact.

I'll tell you what, I'll play your game too: RAID 5 is bad advice because of the write hole problem, therefore you give bad advice.

(it is also very bad advice, since btrfs is not yet reliable enough for something like this).

I see, so Oracle gives very bad advice by including btrfs in Unbreakable Linux 2, and somehow their requirements and customers have pissant needs compared to home theater users. Gotcha.

It is redundancy, but it is bad advice in this situation. For the amount of storage that he is talking about, mirroring is too inefficient and therefore too costly.

It's interesting reading different perspectives. For example, the SnapRAID developers consider it a backup solution for an array rather than something designed for primary storage. I find SnapRAID's recurring emphasis on being a backup strategy vastly more compelling than your presentation of it as an alternative to truly, instantly redundant RAID.
 
Last edited:

murphyc

Senior member
Apr 7, 2012
235
0
0
Christ you guys argue a lot.

Yes, but per jwilliams4200, since your pictures are tacitly recommending Nexentastor as a solution for the OP, and thus you're recommending ZFS and not snapshot RAID, you're giving bad advice to people.
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
It's all in your head, murphyc. I have not contradicted myself or changed my claims. My original statements were clear and carefully worded, and they are correct. I do assume that people are not morons and that I don't need to spell out every crazy thing that is possible but which no experienced person would ever suggest for the OP to do. No one with any sense who is trying to make an inexpensive bulk fileserver for 40+TB is going to use mirroring and throw away 50% of their usable capacity. That really goes without saying, hence I did not say it in my earlier post.

And you are wrong yet again. Do you just randomly make claims until you find one that is not incorrect? "Snapshot RAID" is capable of triple parity (or quadruple, or even higher) if you use FlexRAID.

As for btrfs, if you followed the mailing list (as I do) you would see that there are still a large number of changes and bug fixes happening all the time. Also, if you ask on the mailing list, the developers themselves will tell you that it is not stable enough yet to be a good choice for a 40+TB file server. Most people that have been following btrfs think that Oracle jumped the gun with including it in their distro. Even the Fedora developers (Fedora is a cutting-edge distribution and not shy about including new code) are not planning to use btrfs as the default filesystem in the upcoming Fedora 18.

As for the SnapRAID developer (singular), Andrea is not a native English speaker. If you ask him, he will definitely not contend that SnapRAID is a replacement for a real backup if your data is irreplaceable. His few uses of the word "backup" on the web page do not mean what you seem to think he means. If you describe the OP's application to Andrea, I am confident that he will say that snapshot RAID is a far better choice for the OP than ZFS or mdadm/LVM.
 

Smoblikat

Diamond Member
Nov 19, 2011
5,184
107
106
Christ you guys argue a lot.

On a lighter note here's a few pictures of my setup:

[photos: storage_1.jpg, storage_2.jpg, storage_3.png, storage_4.png]


Viper GTS

That's one slick setup, man. What are all those cables in the middle for (where the graphics cards would go)? And what size PSU do you think you'd need to run this? Obviously it isn't a 1200W :p

EDIT - It's been one of those days :p Just realized they're the SATA cables; I'm not used to them being blue.
 

Viper GTS

Lifer
Oct 13, 1999
38,107
433
136
That's one slick setup, man. What are all those cables in the middle for (where the graphics cards would go)? And what size PSU do you think you'd need to run this? Obviously it isn't a 1200W :p

EDIT - It's been one of those days :p Just realized they're the SATA cables; I'm not used to them being blue.

While yes, they're carrying SATA data, they're not SATA cables like you're thinking of - they're SFF-8087 cables from the IBM BR10i cards to the backplanes on the Norco.

Much like this, but the particular ones I have are not sleeved and instead carry individual SATA channels in a separate package:

[photo: sleeved SFF-8087 breakout cable]


I'm not 100% sure what it's drawing from the wall, but it's not a lot. A few hundred watts probably? I've got a kill-a-watt if you really want to know. The reason for the AX1200 is:

1) I already had one in my desktop and I liked it
2) At the small fraction of its capability I needed it's basically silent
3) If there's anything I like it's overkill

From left to right the expansion cards are:

Areca ARC-1680
Quad port Intel gigabit NIC
IBM BR10i
PCI SSD bracket (basically a PCI card shaped piece of plastic that holds four 2.5" SSDs - These are the ESXi boot device and ZFS ZIL cache)
IBM BR10i

Viper GTS
 
Last edited:

Smoblikat

Diamond Member
Nov 19, 2011
5,184
107
106
While yes, they're carrying SATA data, they're not SATA cables like you're thinking of - they're SFF-8087 cables from the IBM BR10i cards to the backplanes on the Norco.

Much like this, but the particular ones I have are not sleeved and instead carry individual SATA channels in a separate package:

[photo: sleeved SFF-8087 breakout cable]


I'm not 100% sure what it's drawing from the wall, but it's not a lot. A few hundred watts probably? I've got a kill-a-watt if you really want to know. The reason for the AX1200 is:

1) I already had one in my desktop and I liked it
2) At the small fraction of its capability I needed it's basically silent
3) If there's anything I like it's overkill

From left to right the expansion cards are:

Areca ARC-1680
Quad port Intel gigabit NIC
IBM BR10i
PCI SSD bracket (basically a PCI card shaped piece of plastic that holds four 2.5" SSDs - These are the ESXi boot device and ZFS ZIL cache)
IBM BR10i

Viper GTS

Sweet, isn't that 8087 SAS? I'm assuming it's SAS 6Gb/s. Are you using the quad port NIC directly to your desktop? I know there's software that takes all the RJ-45 ports and basically combines them like RAID 0 to get faster speeds to a specific device. If it's not too much trouble, a kill-a-watt reading would be cool, though it's not something I need done; I've been looking into a real home server myself (right now I have a dual Xeon one but I want to expand it) and just wanted to get an idea of power consumption/noise. Also, what do you use to distribute the power to all of your HDDs and the SSDs that are back-mounted to the bracket?
 

murphyc

Senior member
Apr 7, 2012
235
0
0
It's all in your head, murphyc. I have not contradicted myself or changed my claims. My original statements were clear and carefully worded, and they are correct.

Like "LVM doesn't include any redundancy" but upon being corrected you turned into a belly aching whiner about the kind of redundancy it has? Like how you've libeled an entire forum of people as giving "bad advice" merely because their opinions differ from yours? You're roughly on your 15th helping of hubris.

As for btrfs, if you followed the mailing list (as I do) you would see that there are still a large number of changes and bug fixes happening all the time.

That's your metric for stability? ext4 and XFS continue to have many commits and changed/deleted lines of code.

Also, if you ask on the mailing list, the developers themselves will tell you that it is not stable enough yet to be a good choice for a 40+TB file server.

Interesting that someone did ask this regarding a 100TB array and the developers themselves did not say a thing. They have previously said it is stable on a stable system, however.

Most people that have been following btrfs think that Oracle jumped the gun with including it in their distro.

The lead btrfs developer worked for Oracle up until very recently, and you're essentially proposing that either he didn't have enough influence to prevent it from being included in the Unbreakable Kernel, or he himself jumped the gun. Why would anyone give a flying F what "most followers of btrfs" even think compared to the developers themselves? Where do you even get this polling information? It's certainly not expressed on the btrfs list.

Even the Fedora developers (Fedora is a cutting-edge distribution and not shy about including new code) are not planning to use btrfs as the default filesystem in the upcoming Fedora 18.

Provide a citation. [1]

As for the SnapRAID developer (singular), Andrea is not a native English speaker. If you ask him, he will definitely not contend that SnapRAID is a replacement for a real backup if your data is irreplaceable. His few uses of the word "backup" on the web page do not mean what you seem to think he means.

Ahh, the "foreigner is confused" argument. You're such a charmer. I'll let you ask him what he means.

In recent entries on the SnapRAID forums there are clear indicators of scalability problems and bottlenecks. Users see 100% core utilization with fewer than a dozen disks, and it's not yet multithreaded. The write performance will always be limited to that of a single disk because there is no striping. I also question the management of such a system when it comes to the recommended interval for 'sync' and 'check'. The redundancy is likewise distinctly manual and distinctly delayed compared to RAID 1, 5, or 6, or RAIDZ/RAIDZ2.
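
For context, that sync/check cycle is something the admin has to schedule themselves, e.g. via cron; a minimal sketch with arbitrary times:

    # /etc/cron.d/snapraid (example; the schedule is arbitrary)
    0 3 * * *   root   snapraid sync    # nightly parity update
    0 5 * * 0   root   snapraid check   # weekly verification pass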

And instead of a single file system, you're proposing somewhere around 15 - 60 independent file systems which will need fsck run on them periodically. If there's ever an inconsistency in even a minority of the file systems, say due to a power failure... what a huge PITA that would be for a home user to sort out. Hours to days of full traversal to make repairs, disk by disk.

I do not agree with your recommendation for storage of this size.


[1] The two missing pieces to get it into F17 were the lack of btrfsck and the lack of a UI for anaconda to properly configure a btrfs default installation, not stability concerns. Both features are essentially ready in F18, which by the way just branched from rawhide yesterday. In every past case, the decision on btrfs as the default for Fedora has come much later in the development of a release, not at or before branch.