40TB storage solution for home?

conleche

Junior Member
Aug 6, 2012
2
0
0
I have a lot of movie files (recorded TV programs, Blu-ray backups, and HD movies that I shot) and my storage is a mess. How would you store 30-40TB of movies now and prepare for, say, 20TB of expansion every year?

Current config:
Windows Home Server: 10TB using 7 drives (all internal)
Photo/Video workstation and PS3 Media Server: 28TB using 13 drives (10TB RAID5 internal + 2TB internal + 2-3TB external x6)

My line of thoughts:
Although most files will never be played back (e.g. hundreds of hours of HD dash cam footage), I would like to keep them accessible on my home network. I prefer that all my data reside on fault-tolerant storage, but I realize this would be difficult in terms of cost and expandability.
I know there are external enclosures, but those that hold >5 drives seem as expensive as a PC.
I could use old PCs to create a NAS, but old machines don't have SATA. Most SATA cards have only about 4 ports, and if you use multiple SATA cards you run out of HDD bays and are back to the external-enclosure dilemma. Using external drives is visually messy and space-consuming, in addition to eating USB/eSATA ports.
I suspect that the easiest and most economical way is to configure an old PC with about a dozen external drives via multiple USB cards, and get another PC like that as the storage requirement grows, but let's ask the AnandTech forum!
 

VirtualLarry

No Lifer
Aug 25, 2001
56,570
10,202
126
You need a Norco-4020. Check my FS thread.

20 hot-swap SATA drive bays, in a 4U rack-mount case.
Fill 'em all with 4TB Hitachi drives and boom, instant 80TB of space. Or fill only as many drive bays as you need now, and expand later with even larger SATA drives.

Another possibility is a Backblaze storage pod. You can get the plans, have someone machine the case for you, buy the components they list, and put it together. Maximum storage for minimum space/cost.
 

WhoBeDaPlaya

Diamond Member
Sep 15, 2000
7,414
402
126
I thought I was pretty balla with 60TB of online storage :eek:
Probably the best way is to go rackmount. I'd do that here, but I live in a studio apt :(
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
First thing to do is to separate your important, irreplaceable data from your data that is either replaceable or unimportant.

The important, irreplaceable data should be handled separately. For most people, this is a fairly small amount of data. For me, I keep this data on a RAID 1, but I also back it up once a month to an HDD that I keep in a fireproof media-safe, and I also have it automatically backed up over the network every night to a remote site.

For the rest of the data, I suggest a large hot-swap 19" rackmount chassis (you don't actually need a rack for the chassis if you do not want one). I have a Norco RPC-4224 that works well. It costs $400 and can hold 24 HDDs. I have a Supermicro X9SCM-iiF motherboard and 3 IBM M1015 HBAs (8 ports each), which I bought on eBay for $75 each. If you don't want to buy from eBay, you can find the equivalent LSI 9211-8i HBAs new for a little over $200 each.

My fileserver runs Linux, and I use mdadm (Linux software RAID) for the RAID 1 for my important, irreplaceable data -- just 2 drives. The rest of the drives contain my replaceable or unimportant data. They are managed with SnapRAID, which maintains dual-parity drive redundancy as well as block checksums to verify data integrity. The Linux server runs Samba to share the data with Windows computers on my LAN.
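For reference, a minimal snapraid.conf for a setup like that might look roughly like this (the paths and disk names are placeholders, and the dual-parity directive is "q-parity" in older releases, "2-parity" in newer ones):

Code:
parity /mnt/parity1/snapraid.parity      # first parity file, on its own drive
q-parity /mnt/parity2/snapraid.parity    # second parity file for dual redundancy
content /var/snapraid/snapraid.content   # checksum/metadata file (keep several copies)
content /mnt/data1/snapraid.content
disk d1 /mnt/data1/                      # data drives to protect
disk d2 /mnt/data2/
disk d3 /mnt/data3/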

Plenty of people have made a similar system (using the same hardware) that runs Windows. SnapRAID is also available for Windows. But I think Linux provides more flexibility, and besides, it is free.
 

Viper GTS

Lifer
Oct 13, 1999
38,107
433
136
Consumer grade USB storage is absolutely NOT what you want. For this level of storage you need to be looking at a dense, rack mount storage appliance type device.

Assuming you don't have many tens of thousands to spend on this, you're looking at a DIY ZFS machine. Plan to spend $5k+ minimum; this is not going to be a cheap project.

To do this right:

Norco 4224 (24 drives vs 20 in the 4020)
Server grade motherboard with registered ECC RAM (24-32 GB)
3 x IBM BR10i in IT mode will give you 24 SATA channels with no port multipliers
ZFS storage OS

Nexenta would normally be my recommendation, but my guess is you can't afford the license required to store the quantity you're talking about ($9,250 for the 64 TB Silver Edition). Since that's almost certainly out of the question, you're looking at OpenIndiana or similar (http://openindiana.org/).

If I had to guess, I would say you are probably vastly underestimating how much this is going to cost you. If you really, truly want to keep this data, it is going to be $$$ both in upfront cost and in long-term power/maintenance. And this is before you even consider doing off-site backups.

Realistically the data probably isn't that valuable.

Viper GTS
 

WhoBeDaPlaya

Diamond Member
Sep 15, 2000
7,414
402
126
Forgot to mention, you might want to check out the HardForum storage sub-forum. Lots of detailed builds there that sound right up your alley.
 

Red Squirrel

No Lifer
May 24, 2003
69,693
13,325
126
www.betteroff.ca
You need a Norco-4020. Check my FS thread.

20 hot-swap SATA drive bays, in a 4U rack-mount case.
Fill 'em all with 4TB Hitachi drives and boom, instant 80TB of space. Or fill only as many drive bays as you need now, and expand later with even larger SATA drives.

Another possibility is a Backblaze storage pod. You can get the plans, have someone machine the case for you, buy the components they list, and put it together. Maximum storage for minimum space/cost.

This, or if you have a bigger budget, look at Supermicro's equivalent. Hot-swap power supplies, and built better. The Norcos have a lot of issues and you may have to go through a couple of RMAs before you get a good one, at least based on reviews.

Also, for something with that much storage you do want redundant PSUs so you can plug into two separate feeds: perhaps two separate UPSes or two separate hydro legs, so that you are safe from a PSU, UPS, or cable failure. 40TB worth of storage dropping hard would SUCK!

And yes, definitely go rackmount... get a small rack or a big one, it's worth it. You can put a patch panel, a switch, servers, etc. in there... racks are awesome to have.
 

palladium

Senior member
Dec 24, 2007
539
2
81
Do consider tape drives. You mentioned that most of the files will never be played back, so you can archive those to tape and keep the "hot" data on the hard drives you already have. You sacrifice instant accessibility, but it is a lot easier, and the ongoing maintenance cost is going to be lower too.
 

Viper GTS

Lifer
Oct 13, 1999
38,107
433
136
The problem with the Supermicro is that they are really, REALLY loud - not home-suitable unless you can stuff them in a basement with major insulation between you and it.

The Norco can be made very quiet. For my 4224 I ordered the 120mm mid-plane bracket, fully populated it with Noctua fans (mid and back), and picked up a power supply that was vast overkill (Corsair AX1200). The end result is something that is nearly inaudible in my living room even fully populated with 24 7200 RPM hard drives.

It definitely heats up the room but it does so quietly :).

Viper GTS
 

conleche

Junior Member
Aug 6, 2012
2
0
0
High-density, high-quality info in a matter of minutes...thanks guys.
Norco sounds like what I've been hoping to find. Thanks for all the pointers, price info, and even hyperlinks!

Considering that my particular situation is casual but massive data (and predominantly video), I suppose I can use one of the common (non-ECC) motherboards that pop up at a deep discount from time to time?

How much CPU power should I have? Since the IBM M1015 that jwilliams4200 is using seems to be a hardware RAID controller, perhaps even a 1-2GHz Celeron would suffice? As this storage is primarily for archival and real-time streaming for living-room playback, speed is not a priority, and I will be using some 5400rpm drives. But if I use the card as a plain SATA interface and let the OS manage 24 assorted drives, perhaps it will require considerable CPU power? Then again, only a few video files will be accessed (played back) in a day. Hmm...

I'll do some reading in HardForum, too.
Thanks again.
 

Smoblikat

Diamond Member
Nov 19, 2011
5,184
107
106
High-density, high-quality info in a matter of minutes...thanks guys.
Norco sounds like what I've been hoping to find. Thanks for all the pointers, price info, and even hyperlinks!

Considering that my particular situation is casual but massive data (and predominantly video), I suppose I can use one of the common (non-ECC) motherboards that pop up at a deep discount from time to time?

How much CPU power should I have? Since the IBM M1015 that jwilliams4200 is using seems to be a hardware RAID controller, perhaps even a 1-2GHz Celeron would suffice? As this storage is primarily for archival and real-time streaming for living-room playback, speed is not a priority, and I will be using some 5400rpm drives. But if I use the card as a plain SATA interface and let the OS manage 24 assorted drives, perhaps it will require considerable CPU power? Then again, only a few video files will be accessed (played back) in a day. Hmm...

I'll do some reading in HardForum, too.
Thanks again.

An i3 would be perfect. You don't need a lot if it's mainly archival.

Also, @everyone else:
Where can I get tape drives?
 

_Rick_

Diamond Member
Apr 20, 2012
3,946
70
91
Considering the amount of money you'll be paying for storage, the cheapest AES-NI-capable 1155 Xeon on a small ECC-capable board with ECC RAM (2x4 GB should be plenty of cache) is probably the most cost-effective choice, CPU-wise.
ECC means fewer errors when using aggressive caching.
AES means you can just swap disks out without having to worry about the contents being divulged. Encryption also works without hardware acceleration, but it's much faster and more efficient with the acceleration.
Having a fast CPU also means you can run re-encode jobs on the machine to save a bit of space on those HD videos.

At those volumes, a Solaris/BSD box with ZFS is probably the way to go. That also means no hardware RAID. Hardware RAID is generally not convenient to have unless you really need the performance of its intermediate caches. With a decent CPU, you shouldn't be limited by the RAID calculations.

The previous advice about splitting disposable and non-disposable data is important at those volumes. Having a 40TB backup array is going to cost, and the same goes for any kind of automated tape system. I would probably recommend getting a small "table-top" rack (8-12U) for a cleaner and safer installation.
A UPS is convenient; if you don't have one, a small rack-mount unit looks nice, but a desktop unit will do just as well. Redundant power supplies: nice to have, I guess.

Optionally, you could consider non-RAID controllers; if you go with ZFS, you'll need Solaris/BSD drivers first and foremost. I'd say go cheap, even with six 4-port cards. Bad controllers can be a pain, but I've yet to run into one.
 

Viper GTS

Lifer
Oct 13, 1999
38,107
433
136
Given what you'll spend on hard drives, the extra for ECC RAM is trivial. ZFS error correction is worthless if you have no idea whether the data coming out of RAM is valid. Spend the extra on a proper server platform with ECC.

If you're not looking for performance you can cut the CPU and RAM down to minimal levels, but don't give up error correction for $300 of total system savings.

I will try to take some pictures of mine tonight.

System specs:

Norco 4224
Corsair AX1200 PSU
Tyan S7025 dual Xeon motherboard
2 x Xeon 5606
6 x 8 GB DDR3 RDIMM (moving soon to 16 GB)
2 x IBM BR10i cards in IT mode
1 x Areca 1680 w/4 GB cache
Quad port Intel NIC

Storage devices:

16 x Samsung Spinpoint F3 1 TB (on the IBM cards)
8 x Enterprise grade 1 TB 7200 RPM in RAID 10 on Areca
2 x Intel SSD for ZIL
1 x Samsung 830 for ESXi boot

Overall system architecture:

A licensed copy of ESXi runs on the Samsung 830 and uses the Areca RAID 10 set as VMFS storage
The three LSI-based controllers (2 add-in + 1 onboard) are passed through via VT-d into a Nexenta VM
Nexenta running 3-way mirrors + hot spare for ~5 TB of raw usable space

Nexenta is used solely for my personal data storage + the occasional NFS share/iSCSI volume for lab testing.
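For anyone wondering what that mirror layout looks like in practice, creating the pool on Nexenta/Solaris would go roughly like this (device names are placeholders for the 16 Spinpoints and the two ZIL SSDs):

Code:
# five 3-way mirror vdevs plus a hot spare = 16 drives, ~5 TB usable from 1 TB disks
zpool create tank \
  mirror c1t0d0 c1t1d0 c1t2d0 \
  mirror c1t3d0 c1t4d0 c1t5d0 \
  mirror c1t6d0 c1t7d0 c1t8d0 \
  mirror c1t9d0 c1t10d0 c1t11d0 \
  mirror c1t12d0 c1t13d0 c1t14d0 \
  spare c1t15d0 \
  log mirror c2t0d0 c2t1d0    # mirrored SSD ZIL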

Viper GTS
 

murphyc

Senior member
Apr 7, 2012
235
0
0
Do consider tape drives. You mentioned that most of the files will never be played back, so you can archive those to tape and keep the "hot" data on the hard drives you already have. You sacrifice instant accessibility, but it is a lot easier, and the ongoing maintenance cost is going to be lower too.

I agree. If you use LTFS-capable LTO-5 drives, you will save a ton of physical space, weight, and complexity, and get longer shelf life than either spinning or non-spinning disk. The pisser is that you'll have to come up with quite a good system for organizing all of this. That's ~40 tapes if the whole thing is archived to tape.

For the spinning-platter storage, I would look at Netflix's Open Connect and Facebook's Open Compute for hardware ideas. They are in the business of bulk, cheap, reliable storage. I would absolutely not screw around with proprietary stuff, because as soon as your 1-year warranty is over, you're stuck replacing parts with vendor-specific replacements in order to regain access to your data. IMO you should be able to move the array to completely rebuilt commodity hardware and have access to your data at any time.

I would consider a FreeBSD 9 ZFS-based solution, such as NAS4Free. There are some specific constraints on how you grow storage with ZFS, since you cannot grow a RAIDZ vdev. But you can add RAIDZs to a pool and increase the pool size.
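A rough sketch of that growth pattern on FreeBSD (device names are placeholders):

Code:
# initial pool: one 6-disk raidz2 vdev
zpool create tank raidz2 da0 da1 da2 da3 da4 da5
# a year later: grow the pool by adding a second raidz2 vdev
zpool add tank raidz2 da6 da7 da8 da9 da10 da11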

I'd also consider a straight Linux setup with CentOS (a free RHEL 6 clone), with XFS as the file system, on top of LVM for volume management, on top of md RAID. You can increase storage either by growing the RAID live or by adding storage to the volume group through LVM. It's probably faster to build and test a new 20TB array and add that to the volume group with LVM rather than reshape the existing array, even though that can be done live.
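Setting that stack up from scratch is only a handful of commands. A sketch, with the device and volume names made up for illustration:

Code:
mdadm --create /dev/md0 --level=6 --raid-devices=8 /dev/sd[b-i]   # md RAID 6 across 8 disks
pvcreate /dev/md0                                                 # LVM physical volume on the array
vgcreate vg_media /dev/md0                                        # volume group
lvcreate -l 100%FREE -n lv_media vg_media                         # one big logical volume
mkfs.xfs /dev/vg_media/lv_media                                   # XFS on top
mount /dev/vg_media/lv_media /mnt/media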

Screw the Windows stuff... there's nothing comparable to ZFS or LVM for logical volume management until Microsoft extends ReFS deployment downstream to the masses.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
I also agree with OpenIndiana as an OS for a ZFS-based NAS. And ECC is a no-brainer if you care about this data at all; the cost is negligible. I'd use ECC memory regardless of ZFS.

And if you're going to use consumer drives for this project instead of enterprise SAS drives, you really need to plan for a system that's going to drop drives regularly. I wouldn't get drives that are too big; they will take longer to rebuild. I'd also do a burn-in on each consumer drive, i.e. run ATA Secure Erase and the extended SMART test, and make sure smartd is enabled to run SMART tests regularly. I'd script this so the server does a diff on certain attributes, so that if you suddenly get a spike in bad sectors, read or write errors, CRC errors, etc., you get an email about that disk. As long as the drive's ECC keeps correcting successfully, all of these types of errors are considered NORMAL and the basic health of the drive will continue to be reported as "PASS". So if you want any advance warning at all about drive failures, you can't depend on the health status alone.
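Something like this cron-able sketch is what I mean; the attribute list, paths, and mail address are just examples to adjust:

Code:
#!/bin/sh
# Diff key SMART attributes against the previous run and mail on any change.
ATTRS='Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count'
mkdir -p /var/lib/smart-watch
for dev in /dev/sd?; do
    name=$(basename "$dev")
    new=/var/lib/smart-watch/$name.new
    old=/var/lib/smart-watch/$name.old
    smartctl -A "$dev" | egrep "$ATTRS" > "$new"
    if [ -f "$old" ] && ! diff -q "$old" "$new" >/dev/null; then
        diff "$old" "$new" | mail -s "SMART attributes changed on $dev" you@example.com
    fi
    mv "$new" "$old"
done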
 

Smoblikat

Diamond Member
Nov 19, 2011
5,184
107
106
I agree. If you use LTFS-capable LTO-5 drives, you will save a ton of physical space, weight, and complexity, and get longer shelf life than either spinning or non-spinning disk. The pisser is that you'll have to come up with quite a good system for organizing all of this. That's ~40 tapes if the whole thing is archived to tape.

For the spinning-platter storage, I would look at Netflix's Open Connect and Facebook's Open Compute for hardware ideas. They are in the business of bulk, cheap, reliable storage. I would absolutely not screw around with proprietary stuff, because as soon as your 1-year warranty is over, you're stuck replacing parts with vendor-specific replacements in order to regain access to your data. IMO you should be able to move the array to completely rebuilt commodity hardware and have access to your data at any time.

I would consider a FreeBSD 9 ZFS-based solution, such as NAS4Free. There are some specific constraints on how you grow storage with ZFS, since you cannot grow a RAIDZ vdev. But you can add RAIDZs to a pool and increase the pool size.

I'd also consider a straight Linux setup with CentOS (a free RHEL 6 clone), with XFS as the file system, on top of LVM for volume management, on top of md RAID. You can increase storage either by growing the RAID live or by adding storage to the volume group through LVM. It's probably faster to build and test a new 20TB array and add that to the volume group with LVM rather than reshape the existing array, even though that can be done live.

Screw the Windows stuff... there's nothing comparable to ZFS or LVM for logical volume management until Microsoft extends ReFS deployment downstream to the masses.

With that Linux idea, does the RAID grow on the fly or do I have to shut it down and reformat to get the RAID to grow?
 

Red Squirrel

No Lifer
May 24, 2003
69,693
13,325
126
www.betteroff.ca
I think with ZFS you can't grow the RAID, but if you want to be able to grow the RAID, mdadm will do that. ZFS is a "smarter" type of RAID, though, and has many features, while mdadm is simpler and may have fewer features, but it does have that capability. ZFS also has better performance, I think. That's usually why people go with it.


As for tapes... bad idea, financially. Have you ever looked at how much they cost? You could buy a yacht and build a mini data center inside it for the price of a drive, and each tape is about the price of a 1TB hard drive. Tapes ARE better for long-term storage and generally more reliable, but they are way more expensive.
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
A lot of people here are giving bad advice with regards to ZFS or mdadm RAID.

For a home system that needs the type of expansion that you talked about, snapshot RAID is the way to go. You can mix and match any number and capacity of drives, you can start with already filled drives, you can expand one drive at a time without completely rewriting the whole array, and if you lose more drives than you have redundancy (parity), you only lose the data on the dead drives, not the entire array.

With snapshot RAID, there are basically two choices. There is SnapRAID (which I already mentioned), which is free and open-source. It works well, but it is more of a do-it-yourself program, in that it works best if you have some understanding of how it works and think carefully about how to integrate it into your system. The other choice for snapshot RAID is FlexRAID, which comes with a GUI and bundles some other features (like pooling and email alerts), but it is neither free nor open-source. Most things that FlexRAID can do can be accomplished with SnapRAID plus additional (free) tools that you integrate yourself.
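Day-to-day use of SnapRAID is just a few commands, roughly:

Code:
snapraid sync    # update parity and checksums after files change
snapraid check   # verify the data against the stored checksums
snapraid fix     # after swapping in a replacement disk, rebuild its files from parity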
 

murphyc

Senior member
Apr 7, 2012
235
0
0
With that Linux idea, does the RAID grow on the fly or do I have to shut it down and reformat to get the RAID to grow?

Short answer is yes you can do it on the fly.

Longer answer is:

Recent versions of mdadm can grow RAID 5 and RAID 6 while the file system is live and in use.

For many years it's been possible to add, remove, and migrate storage to/from LVM-managed volume groups while the file system above them is live.

Note that in both cases, growing the array or growing the volume group does not increase the size of the file system; that is a separate step. XFS supports growing the file system online, so after you grow an array or logical volume, you can then grow the XFS file system.
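As a rough sketch, with the device and volume names made up for illustration, the whole sequence is:

Code:
mdadm --add /dev/md0 /dev/sdj                  # add the new disk as a spare
mdadm --grow /dev/md0 --raid-devices=9         # reshape the RAID 6 onto it (runs live)
pvresize /dev/md0                              # tell LVM the physical volume grew
lvextend -l +100%FREE /dev/vg_media/lv_media   # grow the logical volume
xfs_growfs /mnt/media                          # finally grow XFS (must be mounted)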
 

murphyc

Senior member
Apr 7, 2012
235
0
0
A lot of people here are giving bad advice with regards to ZFS or mdadm RAID.

You should be specific about the bad advice and directly correct it, rather than hand-waving and moving on with your own arguably questionable advice.

Have you read the SnapRAID forums regarding its performance scalability? If you had, you might not be suggesting it for what is clearly not your typical home theater storage setup. The OP is talking about enterprise storage requirements from the start, with an enterprise growth rate.

Please tell me what enterprise implementations are using snapshot RAID today. Enterprises are using mdraid, LVM, and XFS. They are using FreeBSD with ZFS or UFS+J. They are using OpenIndiana and ZFS.
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
Have you read the SnapRAID forums regarding its performance scalability? If you had, you might not be suggesting it for what is clearly not your typical home theater storage setup.

What are you talking about with "performance scalability"? Yes, I have read the SnapRAID forum. In addition, I have over 30TB of data managed by SnapRAID, and I have been using it for about a year now. I can get 1000+ MB/s throughput with a snapraid sync (which includes computing dual parity as well as block checksum data).

As for your FUD about enterprise, you really don't know what you are talking about when it comes to large, expandable home fileservers. I already gave specific examples of why snapshot RAID is better for such usage. You really need to pay attention before posting nonsense.
 

murphyc

Senior member
Apr 7, 2012
235
0
0
What are you talking about with "performance scalability"?

It's in the forum, you can read about it.

Yes, I have read the SnapRAID forum. In addition, I have over 30TB of data managed by SnapRAID, and I have been using it for about a year now. I can get 1000+ MB/s throughput with a snapraid sync (which includes computing dual parity as well as block checksum data).

I'm glad you're having a good experience. The support forums are not exactly active compared to other forums I'm used to.

As for your FUD about enterprise, you really don't know what you are talking about when it comes to large, expandable home fileservers. I already gave specific examples of why snapshot RAID is better for such usage.

Great non-response: Call something FUD. Don't answer the question.


You really need to pay attention before posting nonsense.

That's hilarious, coming from people such as yourself who wave their hands around in circles about others posting bad information while not bothering to be specific or to correct the misinformation. That is ill-mannered and unprofessional if you actually care about whether people get correct information. Whereas if you merely care about looking like you know what you're doing, the useless hand-waving approach thus far is working perfectly.
 