Recommended filesystem for file server?

Kakumba

Senior member
Mar 13, 2006
610
0
0
So, I am building a new file server for large files (movies and the like, mostly over 100MB each), and am wondering which filesystem to use. I will be running RAID 5 or 6, probably on a 3Ware 9650SE-12ML card, with several Seagate 1TB disks. The OS will be CentOS 5.2.

I was thinking XFS would be the best choice of filesystem (even though the XFS modules are not included by default in this OS). However, it seems that XFS on LVM2 (which is what I would be implementing) has issues when 4K stacks are enabled, as they are by default in 2.6-kernel Red Hat distros: under high I/O the stack can overflow and the system locks up. Obviously this is not an ideal situation, so what I would like to know is:

Has anyone experienced this? What is the likelihood of such an event happening? Am I being overly paranoid here?

Also, XFS on Linux does not allow block sizes greater than the page size, which is 4K by default. This seems a pointlessly small block size for a filesystem that does not need to deal with small files, and will have relatively few directories (a pretty flat file structure). Especially when you consider that I will be looking at a RAID 6 chunk size of 1MB, or maybe 512K (the bigger the better as far as I can tell). How hard is it to increase the page size for the kernel to allow bigger block sizes? Or am I making this a whole lot more complicated than I need to?
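
(For reference, this is the quick check I'm going by; /dev/sdX1 is just a placeholder and 4096 is simply the stock x86_64 default, not something I've confirmed on the new box yet:)

    # kernel page size, normally 4096 bytes on x86/x86_64
    getconf PAGESIZE
    # on Linux, mkfs.xfs won't accept -b size= larger than the page size
    mkfs.xfs -b size=4096 /dev/sdX1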

Next question:

How does one create a stride-aligned XFS filesystem? http://linux-raid.osdl.org/index.php/RAID_setup#Options_for_mke2fs is great for ext2, but surely we can do something similar for XFS? I have been reading on linux.com (http://www.linux.com/feature/140734) that this is a good thing to do, but there isn't much info there that I can see about how to do it.

Any other advice you would give for this?
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
I'm using XFS on LVM on dm-crypt and haven't had a single stack overflow issue.

Originally posted by: Kakumba
Also, XFS on Linux does not allow block sizes greater than the page size, which is 4K by default. This seems a pointlessly small block size for a filesystem that does not need to deal with small files, and will have relatively few directories (a pretty flat file structure). Especially when you consider that I will be looking at a RAID 6 chunk size of 1MB, or maybe 512K (the bigger the better as far as I can tell). How hard is it to increase the page size for the kernel to allow bigger block sizes? Or am I making this a whole lot more complicated than I need to?

I believe you are. There are probably some special cases where bigger block sizes will make a noticeable difference, but not in a home server.

Originally posted by: Kakumba
How does one create a stride-aligned XFS filesystem? http://linux-raid.osdl.org/index.php/RAID_setup#Options_for_mke2fs is great for ext2, but surely we can do something similar for XFS? I have been reading on linux.com (http://www.linux.com/feature/140734) that this is a good thing to do, but there isn't much info there that I can see about how to do it.

mkfs.xfs has RAID options but I never worried about them either. If you do want to use them you'll have to create the filesystem outside of the installation though since Anaconda doesn't let you touch the mkfs command that it uses.
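
If you do go down that road, it's roughly this (sketching from memory; the su/sw values are just examples for a 512K chunk with ten data disks, check them against your actual array):

    # align XFS to the array geometry at mkfs time (sketch, values are examples)
    # su = stripe unit (the controller's chunk size)
    # sw = stripe width (number of data-bearing disks, e.g. 12 drives - 2 parity for RAID 6)
    mkfs.xfs -d su=512k,sw=10 /dev/sdb1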
 

Kakumba

Senior member
Mar 13, 2006
610
0
0
Thanks for the replies; the stack issue is one less thing to worry about.

I was already planning on doing the XFS outside of the install, as I first want to ensure that the hardware RAID controller is happy before creating the filesystem on it.

Well, I think I will just go with what I already know, plus what little more I can learn before building this thing. Thanks.
 

skyking

Lifer
Nov 21, 2001
22,779
5,941
146
Originally posted by: Brazen
xfs on lvm on md raid

Md raid for the win. Hardware cards can and do fail, and then you will have fun getting back to anything at all.
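
The md version of the same array is only one command anyway (device names and chunk size below are just placeholders):

    # sketch: 12-disk RAID 6 with md, 512K chunk
    mdadm --create /dev/md0 --level=6 --raid-devices=12 --chunk=512 /dev/sd[b-m]1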
 

Kakumba

Senior member
Mar 13, 2006
610
0
0
Yup, there will be no Solaris here. Not for this, at least; Solaris has its uses, but this is not one of them for me. Anyway, hardware RAID FTW, because:

1. Can someone recommend an affordable motherboard with 14+ SATA ports? I.e., I am planning on using a Gigabyte P45-based board, so something in that price range. Didn't think so...
2. Software RAID has come a long way, but for RAID 6, hardware RAID is still faster. Not to mention more reliable for power failures (BBU FTW!)

I use md for most machines, but in this case, hardware RAID is required. So, XFS on LVM on hardware RAID 6 will be the choice. 3Ware 9650SE-12ML.
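
Roughly what I have in mind for the layering, for the record (the 3Ware unit should just show up as a plain SCSI disk; device names, extents and the su/sw figures are placeholders I'll confirm against the array):

    # sketch: LVM on the 3Ware-exported RAID 6 unit, then XFS on top
    pvcreate /dev/sdb
    vgcreate vg_media /dev/sdb
    lvcreate -l 100%FREE -n lv_movies vg_media
    # stripe alignment as per the mkfs.xfs note above; values are examples only
    mkfs.xfs -d su=512k,sw=10 /dev/vg_media/lv_movies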
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Originally posted by: Kakumba
2. Software RAID has come a long way, but for RAID 6, hardware RAID is still faster. Not to mention more reliable for power failures (BBU FTW!)

For a file server I really doubt you'll notice the speed difference and a good UPS will always win out over a battery on the RAID controller.
 

Brazen

Diamond Member
Jul 14, 2000
4,259
0
0
Originally posted by: Nothinman
2. Software RAID has come a long way, but for RAID 6, hardware RAID is still faster. Not to mention more reliable for power failures (BBU FTW!)

For a file server I really doubt you'll notice the speed difference and a good UPS will always win out over a battery on the RAID controller.

Not necessarily. If your UPS runs out, then your drives could still lose power in the middle of writing. A good quality hardware RAID card will use its battery time to flush the cache, finish any writes, and park the drives, which will greatly reduce the chance of filesystem corruption. Of course, if your UPS is set up properly, it should safely shut down your computer before it runs out, which is better overall.
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Originally posted by: Brazen
Not necessarily. If your UPS runs out, then your drives could still lose power in the middle of writing.

Which is why you have a daemon like NUT or apcupsd that watches the UPS and shuts down the server when the battery gets below a certain threshold.
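
With apcupsd it's only a couple of lines in /etc/apcupsd/apcupsd.conf (thresholds below are just an example, pick whatever margin you're comfortable with):

    # sketch: shut down once the battery hits 20% or 10 minutes of runtime remain
    UPSCABLE usb
    UPSTYPE usb
    DEVICE
    BATTERYLEVEL 20
    MINUTES 10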
 

Kakumba

Senior member
Mar 13, 2006
610
0
0
A UPS doesn't help if the motherboard or PSU fails... It is best to combine a UPS with a BBU; the BBU can keep the write cache intact for up to 72 hours.

Also, when transferring several terabytes from multiple clients, any extra speed is greatly appreciated. I will be using 4x gigabit interfaces teamed, so the network will not be such a bottleneck.
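
For the record, the teaming side on CentOS 5 is just the bonding driver; roughly this sort of config (mode, IP and interface names are placeholders, and 802.3ad needs a switch that supports it):

    # /etc/modprobe.conf (sketch)
    alias bond0 bonding
    options bonding mode=802.3ad miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0 (sketch)
    DEVICE=bond0
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for eth1 through eth3)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none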