Huge file server project (24TB)

Gooberlx2

Lifer
May 4, 2001
15,381
6
91
My lab needs to either order or build a large, expandable file server for our projects.

To start with, it needs to begin at about 12TB and be expandable to at least 24TB over the next couple of years. Unfortunately, we'll need to start the system before any 2TB HDDs are released (IIRC, Seagate was going to start releasing them next year?).

I would prefer something like a RAID 5 or 6 array... or something with redundancy, but something that isn't too much of a hassle to dynamically expand as our data needs grow. Honestly, I'd love to be able to just pop in a new drive, run a minimal set of commands, and have it just work.

There is already an off-site backup solution in place (though that will have to be expanded in the future as well).

- I've researched running Linux LVM on top of software RAID 5 or 6 and using add/grow with mdadm (rough sketch of what I mean below).
- I've seen posts somewhere about using several Thecus N5200 Pros in a SAN (though it's not clear to me if they can network together as one large, RAIDed filesystem).
- I've read about unRAID
- There's always a rackmount chassis with 24 3.5" bays
- etc...
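From what I've gathered so far, the mdadm/LVM grow path looks roughly like the following (device, volume group, and LV names are just placeholders, and I haven't actually run this on anything big yet):

# add a new disk to the existing RAID 5/6 array and grow it by one member
mdadm /dev/md0 --add /dev/sdq
mdadm --grow /dev/md0 --raid-devices=9

# once the reshape finishes (watch /proc/mdstat), grow the LVM stack and filesystem on top
pvresize /dev/md0
lvextend -l +100%FREE /dev/vg_data/lv_data
resize2fs /dev/vg_data/lv_data

Is it really that simple in practice, or am I missing gotchas?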

So how would YOU accomplish this? Budget is anywhere from $5K to $15K, and I'd love to consider a wide range of options, from homebrew builds to enterprise solutions.

....ok....go!
:)
 

EarthwormJim

Diamond Member
Oct 15, 2003
3,239
0
76
RAID 6 would only increase your drive count by one, but give you extra peace of mind.

I'd stick with Seagate drives for their 5-year warranty. I'm sure you'll be hanging onto this much data for a while.

Card-wise, something like this would work: http://www.newegg.com/Product/...x?Item=N82E16816151042

You'd have to buy some expanders to fit that many drives.

As far as racks go, I'm not really sure what you'd want. In my desktop I have this rack: http://www.newegg.com/Product/...x?Item=N82E16817994028.

It fits 5 hard drives in only 3 5.25" spaces. Definitely a ton of options when it comes to racks, so look around for what will fit your needs (locks, adequate cooling, space constraints, integrated into a chassis).

I have no real experience with a huge server, so I'd wait for more experienced people to chime in.
 

Brovane

Diamond Member
Dec 18, 2001
6,540
2,678
136
In my opinion, if you are looking at this amount of storage you need to seriously take a look at SAN storage; however, it is probably out of your price range. You might want to look at a Dell AX4-5i since it can scale to 60TB and there is a lot of flexibility. We got one quoted for around $20k for the entry-level model fully loaded with 12TB, and you can add additional DAEs. Also, it is iSCSI, so you have a lot of flexibility and all the SAN features.
 

RebateMonger

Elite Member
Dec 24, 2005
11,586
0
0
Once you figure out the technology you want to use, be sure to ask about:

a) The cost of replacement or add-on parts. Additional drives for EMC SANs, for instance, cost a small fortune. Plus, the first few drives contain the SAN OS, which can only be installed by EMC. You'll likely have to have a service contract to get this service.
b) How are you going to back up your data?
c) How are you going to recover your data in case of a disaster?
d) How long will it take to perform simple array maintenance like defragmentation or a "chkdsk" equivalent, and how long will it take to scan the volume as part of an emergency data recovery operation? Something as simple as a chkdsk in Windows will disable a 1 TB RAID volume for at least 24 hours.
 

Gooberlx2

Lifer
May 4, 2001
15,381
6
91
Originally posted by: RebateMonger
Once you figure out the technology you want to use, be sure to ask about:

a) The cost of replacement or add-on parts. Additional drives for EMC SANs, for instance, cost a small fortune. Plus, the first few drives contain the SAN OS, which can only be installed by EMC. You'll likely have to have a service contract to get this service.
b) How are you going to back up your data?
c) How are you going to recover your data in case of a disaster?
d) How long will it take to perform simple array maintenance like defragmentation or a "chkdsk" equivalent, and how long will it take to scan the volume as part of an emergency data recovery operation? Something as simple as a chkdsk in Windows will disable a 1 TB RAID volume for at least 24 hours.

Good questions:

The sad thing here is that I'm kinda getting thrown into all this. I consider myself fairly savvy and a quick learner, but hell, I'm no network admin and I've never tinkered with anything like a rackmount server, blade server, SAN, EMC, etc, etc..

I shudder to think how long it would take to swap a faulty hard drive and rebuild a RAID 6 across 24TB.
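(Back of the envelope, assuming maybe 50-80 MB/s of sustained rebuild throughput per disk, which is probably optimistic: a single 1TB member is on the order of 1,000,000 MB / 60 MB/s ≈ 4-5 hours in the best case, and I'd guess a busy array could easily stretch that to a day or more.)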

Unfortunately, that budget I listed will likely be the upper limit for any considerations. As far as service and maintenance go, I'll probably be it. Our lab is grant-funded, so there's really no wiggle room in the budget, but a 24TB storage solution is something we MUST have. So that's why I'm trying to find some sort of hybrid solution: enterprise capability on an enthusiast budget.

Backup is somewhat (poorly) intrinsic to our process. We work on files stored on our file server, and all processing and data output is automatically uploaded to an offsite server that was built specifically for this proprietary Linux software we use. I would need to set up some sort of syncback for the data output files.
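I'm assuming the syncback piece could be as simple as a nightly rsync job over SSH; something like this, where the paths and hostname are made up:

#!/bin/sh
# nightly sync of the data-output tree to the offsite server
# (run from cron, e.g. "0 2 * * * /usr/local/bin/sync-output.sh")
rsync -az --delete /data/output/ backup@offsite.example.org:/backups/lab-output/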

Unfortunately, THAT server is only 10TB, with tape backup anyway, so it's going to have to be expanded in the near future as well (that'll be a fun conversation with management). All this was set up before I arrived, and before one of the new supervisors arrived. When we did the calculations on how much data is required per case, we found the original estimate was way, way off. Apparently they only calculated the image files we process, not the data output for their server needs.

So here we are...*sigh*
 

Brovane

Diamond Member
Dec 18, 2001
6,540
2,678
136
Originally posted by: Gooberlx2
Originally posted by: RebateMonger
Once you figure out the technology you want to use, be sure to ask about:

a) The cost of replacement or add-on parts. Additional drives for EMC SANs, for instance, cost a small fortune. Plus, the first few drives contain the SAN OS, which can only be installed by EMC. You'll likely have to have a service contract to get this service.
b) How are you going to back up your data?
c) How are you going to recover your data in case of a disaster?
d) How long will it take to perform simple array maintenance like defragmentation or a "chkdsk" equivalent, and how long will it take to scan the volume as part of an emergency data recovery operation? Something as simple as a chkdsk in Windows will disable a 1 TB RAID volume for at least 24 hours.

Good questions:

The sad thing here is that I'm kinda getting thrown into all this. I consider myself fairly savvy and a quick learner, but hell, I'm no network admin and I've never tinkered with anything like a rackmount server, blade server, SAN, EMC, etc, etc..

I shudder to think how long it would take to swap a faulty hard drive and rebuild a RAID 6 across 24TB.

Unfortunately, that budget I listed will likely be the upper limit for any considerations. As far as service and maintenance go, I'll probably be it. Our lab is grant-funded, so there's really no wiggle room in the budget, but a 24TB storage solution is something we MUST have. So that's why I'm trying to find some sort of hybrid solution: enterprise capability on an enthusiast budget.

Backup is somewhat (poorly) intrinsic to our process. We work on files stored on our file server, and all processing and data output is automatically uploaded to an offsite server that was built specifically for this proprietary Linux software we use. I would need to set up some sort of syncback for the data output files.

Unfortunately, THAT server is only 10TB, with tape backup anyway, so it's going to have to be expanded in the near future as well (that'll be a fun conversation with management). All this was set up before I arrived, and before one of the new supervisors arrived. When we did the calculations on how much data is required per case, we found the original estimate was way, way off. Apparently they only calculated the image files we process, not the data output for their server needs.

So here we are...*sigh*


Unfortunately, based on your budget, I would say you are not going to be able to get an enterprise solution with the requirements you need. It looks like you are stuck with a "jimmy-joe-jack" solution. I would present both solutions to your management, make sure they understand that because of the budget constraints this will not be an enterprise solution, and get their response in e-mail so you can save it for future reference if necessary.

For a SAN, Fibre Channel drives cost a lot, but the AX uses SATA drives, which brings the cost down considerably while capacity goes up a lot. The SAN OS is spread across the first 5 drives and is triple-mirrored. Even if you lose one of the first 5 drives, it will automatically grab your hot spare (if configured) and start the rebuild process; no need for a SAN OS reinstall. You are probably not going to be able to do RAID 6 across 24TB. Even on a SAN you can do a maximum of 14 disks in a single RAID group, last time I checked, so a single RAID 6 across 26 1TB disks is out of the question.

EMC's Navisphere SAN configuration tool is all GUI-based and not hard to use. If you are not going Fibre Channel then you don't have to do zoning, which gets a little tricky. iSCSI SANs are fairly easy to set up and configure.
 

NXIL

Senior member
Apr 14, 2005
774
0
0
Dear Goob,

ZFS?


Welcome to ZFS

ZFS is a new kind of file system that provides simple administration, transactional semantics, end-to-end data integrity, and immense scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management. We've blown away 20 years of obsolete assumptions, eliminated complexity at the source, and created a storage system that's actually a pleasure to use.


http://opensolaris.org/os/community/zfs/

In computing, ZFS is a file system designed by Sun Microsystems for the Solaris Operating System. The features of ZFS include support for high storage capacities, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, on-line integrity checking and repair, and RAID-Z. ZFS is implemented as open-source software, licensed under the Common Development and Distribution License (CDDL).


http://en.wikipedia.org/wiki/ZFS

Unlike traditional file systems, which reside on single devices and thus require a volume manager to use more than one device, ZFS filesystems are built on top of virtual storage pools called zpools. A zpool is constructed of virtual devices (vdevs), which are themselves constructed of block devices: files, hard drive partitions, or entire drives, with the last being the recommended usage.[6] Block devices within a vdev may be configured in different ways, depending on needs and space available: non-redundantly (similar to RAID 0), as a mirror (RAID 1) of two or more devices, as a RAID-Z group of three or more devices, or as a RAID-Z2 group of four or more devices.[7] The storage capacity of all vdevs is available to all of the file system instances in the zpool.


http://www.sun.com/2004-0914/feature/

ZFS, the dynamic new file system in Sun's Solaris 10 Operating System (Solaris OS), will make you forget everything you thought you knew about file systems. ZFS will be available on all Solaris 10 OS-supported platforms, and all existing applications will run with it. Moreover, ZFS complements Sun's storage management portfolio, including the Sun StorEdge QFS software, which is ideal for sharing business data.

"If you're willing to take on the entire software stack, there's a lot of innovation possible."

Jeff Bonwick
Distinguished Engineer
Chief Architect of ZFS
Sun Microsystems, Inc.


"We've rethought everything and rearchitected it," says Jeff Bonwick, Sun distinguished engineer and chief architect of ZFS. "We've thrown away 20 years of old technology that was based on assumptions no longer true today."

ZFS is supported on both SPARC and x86 platforms. More important, ZFS is endian-neutral. You can easily move disks from a SPARC server to an x86 server. Neither architecture pays a byte-swapping tax due to Sun's patent-pending "adaptive endian-ness" technology, which is unique to ZFS. And you don't have to worry about migration. Sun continues to support the UFS file system.


ZFS should be up and running on Macintosh OS X by the end of the year (it was supposed to be up and running already, but there have been some delays, for whatever reason).

Anyway, it seems perfect for what you want to do, and Sun systems interoperate well with other systems, e.g. Windows, Mac, Linux, etc.

Honestly, I'd love to be able to just pop in a new drive, run a minimal set of commands, and have it just work.

Supposedly that is what ZFS accomplishes....
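For what it's worth, setting up a pool really is just a couple of commands; a rough sketch with Solaris-style placeholder device names (not a tested config, so adjust for your hardware):

# create a double-parity (raidz2) pool named "tank" from 8 drives
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0

# carve out a filesystem and share it over NFS
zfs create tank/projects
zfs set sharenfs=on tank/projects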

HTH

NXIL
 

NXIL

Senior member
Apr 14, 2005
774
0
0
Dear Goob,

12 TB for about $15K:

http://shop.sun.com/is-bin/INT...un_Store_US-SunCatalog

That leaves 12 drive bays open.

http://shop.sun.com/is-bin/INT...&ShowAllProducts=false

Note: does not include upgraded on site warranty, etc....

1 TB hard drives run about $200.

12 x $200 = $2,400... hmm, that seems like a lot for the rest of the Sun hardware... and the 6TB version is only $8.5k... it seems like Sun can wheel and deal a bit on that pricing, unless they are using some sort of enterprise-class SATA drive.

Anyway, I would guess that Sun sales would salivate at being able to offer you some serious storage.

HTH

NXIL
 

Cr0nJ0b

Golden Member
Apr 13, 2004
1,141
29
91
In my opinion you need to sit down with your boss, or whoever has the purse strings, and define a realistic budget. Define what the value of the data is and what your downtime will cost (because you will have some), and add in the total cost including your time to administer and set up the system.

As a hobbyist, I would do this:

Get a case -- $150
Get 3 drive enclosures (5 hot-swap drives each) -- $100
Motherboard with 4 PCI-X slots -- $250
CPU -- $350
Memory -- $150
PSU -- $200
hard drives -- 34 x $200 = $6,800
Boot drives, probably mirrored IDE flash -- $300
RAID card ARC-1160 -- $1,000
RAID enclosure, 16 drives -- $1,500 x 2 = $3,000
Total so far -- $12,300 before tax and shipping
Load either FreeNAS, OpenFiler or Ubuntu

This will give you roughly (16 - 2) x 2 = 28 usable drives, or 28 x 1TB = 28TB usable

You will need something to rack it in, so throw in another $2,000 and we have your budget.

Now, you need to ask yourself... what happens if you lose all of that data? 28TB... is a... lot... of data.

Recover from what, an off-site backup?... in what... 3 years? Or you could buy a backup system for about $10K and be up and running in a couple of weeks or so.

You could always just build a second, duplicate NAS system and rsync between them... which would give you the best recovery.

If it were me, I would get a realistic budget of something like $30K-45K and get a commercial NAS system from EMC, NetApp, or someone else in the higher-quality range (they will all cost about the same when the dust settles). They will support the top end of your requirement and give you a lot more usability and function. Better support too.

If the data isn't all that important, and you don't really care if it's down or it "goes away", then maybe build your own and be shackled to it for the rest of your career.

that's my 2 cents.

 

NXIL

Senior member
Apr 14, 2005
774
0
0
Originally posted by: Cr0nJ0b
In my opinion you need to sit down with your boss, or whoever has the purse strings, and define a realistic budget. Define what the value of the data is and what your downtime will cost (because you will have some), and add in the total cost including your time to administer and set up the system.

...

that's my 2 cents.

QFT

I agree with Cr0nJ0b: with 12-24 TB (!) of data on tap, there's going to be some downtime, hardware issues, glitches, etc. You have no doubt heard folks on these forums recommend that friends and family members get a pre-fab Dell rather than putting together a system for them, only to become 24/7 tech support forever, not just for 3 years.

Recommend you get quotes from

Apple (you might be surprised; their high-end gear can be priced competitively, and they are adding ZFS to OS X)

Dell

HP

IBM

Sun

I think the 24/7 support option, plus maybe training for you if you are going to be the cat herder for this project, should definitely be factored in.

GL HTH

NXIL
 

Tamago808

Junior Member
Jul 22, 2008
4
0
0
I may be new to the forums, but this caught my attention since I'm building a homebrew file server by year's end, once I know what my (limited) budget is.

Anyway, I'm a technician at an integration facility that does assembly for various companies' needs: simple desktop solutions, network utilities, file servers and rack server storage, etc. I tend to deal with AIC-branded chassis for lower-class OEM builds, which require a lot of hands-on work.

http://www.aicipc.com/ProductDetail.aspx?ref=RSC-4ED2-0

This may or may not catch your eye as far as what this company can offer. I can confirm that backplane failures are quite rare. You can check the specs on the chassis as well. They also offer more versions of these rack file servers if you need something like a slim DVD-RW in the front and a 3.5" floppy drive. Just be forewarned, the versions that use the 3.5" bays have lousy cable clearance, so be ready to use a feeder-type stick for cabling if the bays are on the bottom of the chassis.

On the other end, I can recommend the newest 3ware PCI-E RAID cards that utilize the newer Intel RAID controller chips that allow RAID 6.

http://www.newegg.com/Product/...x?Item=N82E16816116046

I'd recommend spanning 2 or 3 cards instead of 1 if you want some redundancy in case a card were to fail. I'm in the process of looking up more data on how well the 8-port version performs for my file server, but I've been lazy about it. Hope this treats your brainstorming nicely.

 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Isn't the standard recommendation to use no more than 5 drives in RAID 5, and no more than 9 drives in RAID 6?

Since you only need 12TB today, two RAID 6 arrays of 8 drives each will give you 12TB of usable space with 4TB used for parity.
Does it have to be EXACTLY 12TB? In that case, throw in another drive to account for the 1024 conversion (a "1TB" drive is actually 1,000,000,000,000 bytes... but 1KB = 1024 bytes, 1MB = 1024KB, and so on).
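To put numbers on that: twelve "1TB" drives hold 12 x 10^12 bytes, which the OS will report as roughly 12 x 10^12 / 1024^4 ≈ 10.9 TiB, so you lose about 9% to the decimal/binary difference before parity even enters the picture.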

I don't see why you should be making it 24TB today like some suggested; prices will go down over the next couple of years, reliability will go up, speed will go up, and with fewer components there is less of a chance of something breaking.
There is no point in making it overly complicated today.

I am, though, all for the idea of building two separate servers on two sites and syncing the data between them.

Also, with ZFS you could increase the size of each data pool by adding more arrays. (It can pool the data between arrays... so if you have a pool containing 2 arrays of 6TB each, you could add a 12TB third array to the pool when 2TB drives hit the market.)
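As a rough example of what that looks like (the pool name and device names are hypothetical):

# add another raidz2 vdev to an existing pool named "tank"; the new capacity
# is available to every filesystem in the pool immediately, no resize step
zpool add tank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0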

HDDs... I am not sure if the Seagate ES is worth the $50 premium over the Barracuda line for this. Does anyone else have opinions on that one? WD is also good; research the individual drives for compatibility and other issues. When I looked into the 750GB lines back in the day I found issues with every single manufacturer, just different issues with each one (bugs with sleep mode, bugs with cache usage, bugs with dropping out of RAID arrays... etc.).
Buy 16 of them, put them in two RAID 6-type arrays (RAID 6, or raidz2 if ZFS), and you've got 12TB of space.

Homebrew:
Standard computer components: $800-1,000
2x drives in mirror mode for the OS: $100
HDDs: $190-230 x 16 = $3,040-3,680
OS: free
Total: under $5,000


I highly recommend ZFS (on Solaris only right now; it is not safe anywhere else at the moment!). With ZFS-style software RAID you are not dependent on any particular hardware. I have swapped out motherboards and operating systems (intentionally, as a test), and reacquiring the array was always as simple as typing "zpool import -f tank", tank being what I named the pool.
You don't have to buy RAID cards that cost thousands of dollars... and those RAID cards are single points of failure. If one fails several years later, when it is no longer being made, you have to purchase the exact same make and model to restore the array. With them being obsolete and no longer made... well, that is a pain.
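The Linux software RAID equivalent is similarly painless; roughly (assuming the md superblocks on the disks are intact, and note the config file path varies by distro):

# scan the attached disks for md superblocks and reassemble the arrays
mdadm --assemble --scan

# optionally record them so the arrays come up automatically on boot
mdadm --examine --scan >> /etc/mdadm/mdadm.conf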


Also, this is very VERY important... back up to optical media on occasion!
Backups should be targeted at specific types of data loss. Common losses and countermeasures:
1. Drive failure: use redundant arrays and backups.
2. Controller failure: use backups and software RAID (software RAID is not controller-dependent).
3. Write errors due to a faulty drive, CPU, memory, PSU, whatever: use ZFS or another checksumming filesystem (some are in development; Google has one which is a trade secret; ZFS is the only one on the market).
4. Fire, flood, etc.: offsite backups.
5. Theft: offsite backup or optical media backup (write-once optical media is worthless in and of itself).
6. Swatting (a competitor "tips" the FBI with some fake story and your computers are confiscated as evidence; if charges are pressed, forever, and if no charges are pressed, they will be returned to you in the timely manner of, oh, a mere 24 YEARS): use offsite backups.
 

live4spd

Member
Jul 6, 2000
112
0
71
Here's my idea:

We use Dell's MD1000 storage arrays here at work. They can take up to 15 SATA or SAS drives each. Buy 2 MD1000s off eBay (about $1,700 each), stuff them full of as many 1TB SATA drives as you like, and connect them to a PERC 5, PERC 6, or LSI Logic controller. You can chain up to 3 PowerVaults together on one channel. You can RAID 5 the PVs together or individually (I'd go 1 RAID 5 per PV in your case).

That should get you up to 28TB of online storage within your budget.