Originally posted by: Matthias99
Originally posted by: drag
Well, my understanding is that Linux 'MD' software RAID 5 is generally faster than any hardware RAID.
This is because the controllers on RAID 5 cards use a fairly general-purpose processor (as opposed to something like your video card, which is very specialized) to do the RAID calculations, and that processor is much, much slower than even the low-end general-purpose CPUs in our computers. Think something like a 200-400MHz processor versus a 3.0GHz Pentium 4.
Back when we were all running 333MHz Pentiums, hardware RAID would significantly lower the CPU overhead for the box while increasing reliability and disk transfer speed... but it's not like that anymore. Modern CPUs are insanely fast.
Yes and no. Most hardware RAID5 cards use some sort of general-purpose processor to do the RAID calculations (rather than custom logic), but they are generally fast enough to push as much data as the interface can take. Could your CPU do it? Sure. But your CPU and system RAM have better things to do in most situations.
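To make the parity discussion concrete, here's a rough Python sketch of the XOR math any RAID5 implementation (hardware or software) has to do per stripe. This is just an illustration of the principle, not how md actually lays out stripes:

```python
# RAID5 parity is the byte-wise XOR of the data blocks in a stripe.
# That's why a fast general-purpose CPU handles it so easily.

def parity(blocks):
    """XOR all blocks together to produce the parity block."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

def reconstruct(surviving_blocks, parity_block):
    """Rebuild one missing block from the survivors plus parity."""
    return parity(surviving_blocks + [parity_block])

stripe = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]  # one stripe, four data disks
p = parity(stripe)

# Lose disk 2; its block comes back from the other three plus parity.
survivors = [stripe[0], stripe[1], stripe[3]]
assert reconstruct(survivors, p) == stripe[2]
```

The same XOR that produces the parity also recovers a lost block, which is the whole trick: XOR-ing everything (data plus parity) together yields zero, so any one missing term can be solved for.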
It really depends. If the PC is dedicated to that task, then the CPU overhead doesn't matter; you'd be using its CPU to take load off other systems doing more important stuff.
How much does a good hardware RAID card cost?
I can go out and buy a 400-dollar Dell 'server' PC, add a few PCI SATA controller cards, and stuff a terabyte of RAID'd disk space into it for under a thousand bucks.
It's flexible, and it's fast. I can share its disk space between a dozen 'servers'. I can use GFS and EVMS/LVM2 to manage shared disk access over the network so it behaves like a local file system. And there are other distributed file server options for high-speed network disk access... stuff that is much nicer and has less overhead than traditional file services like NFS or CIFS/SMB.
Those servers end up smaller, taking less rackspace, less energy, and so on and so forth.
See these benchmarks, where Linux software RAID is twice as fast as hardware RAID on the same machine with the same hard drives:
http://www.chemistry.wustl.edu/~gelb/castle_raid.html
I don't know enough about the benchmark they're running (i.e., was that write-only? read/write? what kind of hit ratios?), but the performance on that controller looks VERY low. It also looks like these are not caching controllers (that is, they do not have a substantial chunk of onboard memory), which can hurt performance significantly in RAID5 (since either the driver is buffering data in local memory, or you can only work in write-through mode).[/quote]
Look around and you'll find other benchmarks that show Linux software RAID is generally faster. This isn't the first benchmark I've found that shows this.
It's kind of a dirty little secret. It's easier for hardware vendors to just point to 'low overhead' and 'faster' rather than to tell customers that buying their products may not increase performance, though it does provide other tangible benefits.
And that's not the only place where I've seen results like that... Linux MD software RAID has performed better in other benchmarks too, even versus other software RAID solutions. To me it seems like one of the substantially better things Linux has going for it on the server side.
If you have CPU time and RAM to burn, software RAID (even RAID5) is just as good. If you have a server that needs all the CPU time it can get to interface with clients, hardware storage controllers will do a lot better overall than software RAID.[/quote]
I like the idea of having dedicated file servers sharing resources to other systems that do more speed-critical tasks. Basically a NAS/SAN, like you said.
With things like InfiniBand becoming affordable, LVM letting you manage storage pools of PCs the same way you manage pools of individual drives inside a server (kind of like software RAID), PCI Express alleviating PC bus limitations, etc., I can see PC clustering storage arrays replacing SANs for most people.
For instance, Archive.org doesn't use SANs; it doesn't even use RAID. It just uses lots and lots of Mini-ITX machines with a few disk drives apiece running Linux, which they mirror across each other to ensure high availability, reliability, and decent performance.
Last time I read about Google, they were using PC clusters in a similar way. They would have a load-sharing cluster of PCs that acted as one unit, a 'node', and then multiple of these nodes individually responding to search requests. The nodes were mirrored, with load-balancing and hot-spare capabilities. If a PC in a node blew up, they would shut down that node and a hot spare would take over automatically. Then the techs could attend to the hardware failure at their convenience with no loss of service to anybody. They wrote their own "GFS", the Google File System, to do all of that (which is different from Red Hat's GFS, the Global File System).
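The hot-spare takeover pattern is simple enough to sketch. This is a toy model of the idea described above; all the names and structures here are invented for illustration, and Google's real system is obviously far more involved:

```python
# Toy sketch of hot-spare failover: mirrored nodes serve requests, and
# when one fails it is retired and a spare is promoted in its place.

class Cluster:
    def __init__(self, active_nodes, hot_spares):
        self.active = list(active_nodes)
        self.spares = list(hot_spares)
        self.failed = []

    def report_failure(self, node):
        """A machine in `node` blew up: retire it, promote a spare."""
        self.active.remove(node)
        self.failed.append(node)      # techs repair at their convenience
        if self.spares:
            self.active.append(self.spares.pop(0))

    def serve(self, request):
        """Send a request to any active node (they're all mirrors)."""
        node = self.active[hash(request) % len(self.active)]
        return f"{node} handled {request!r}"

cluster = Cluster(["node-a", "node-b"], ["spare-1"])
cluster.report_failure("node-a")
# Service continues uninterrupted on node-b plus the promoted spare-1.
```

The point is that failure handling becomes a routine bookkeeping operation rather than an outage.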
As it is now... I can take five 500GB drives and stick them, with a couple of disk controllers, into a 400-600 dollar PC. Then build three more machines identical to that one, give them a couple of multi-port Ethernet cards, and combine them with nice switches for the 'storage fabric'.
On those dedicated storage machines I would set up GNBD, which provides direct block access to storage over a regular network. So basically those machines would appear as regular disk drives to their clients, and the clients in turn would be the actual servers: doing CIFS to Windows clients, OpenAFS or NFS to Linux workstations, or running Apache or Oracle or whatever.
Basically, GNBD makes the storage those PCs export show up as local block devices, just like /dev/sd* disks. I would use CLVM to manage those block devices the same way I would regular disk drives. CLVM is an extended version of regular Linux LVM2, with the ability to work over a network and deal with multiple machines accessing the storage pool at the same time, among other features.
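What the pooling amounts to, conceptually, is concatenating several (remote) block devices into one logical address space. Here's a toy model of that mapping; the real thing lives in the kernel's device-mapper layer, and the device names here are made up:

```python
# Toy model of CLVM-style pooling over GNBD devices: stitch several
# block devices into one logical range, then translate a logical block
# number to a (device, offset) pair.

def build_pool(devices):
    """devices: list of (name, size_in_blocks). Returns (extents, total)."""
    pool, start = [], 0
    for name, size in devices:
        pool.append((start, start + size, name))
        start += size
    return pool, start

def locate(pool, logical_block):
    """Map a logical block number to (device_name, device_block)."""
    for start, end, name in pool:
        if start <= logical_block < end:
            return name, logical_block - start
    raise ValueError("block beyond end of pool")

pool, total = build_pool([("gnbd0", 1000), ("gnbd1", 1000), ("gnbd2", 500)])
assert total == 2500
assert locate(pool, 1500) == ("gnbd1", 500)
```

Every read or write the file system issues against the logical volume gets translated like this and sent to whichever machine exports that device.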
So in actuality I could create 6+ terabytes of shared storage for my servers using commodity PCs and free software. It would be faster and have less overhead for my individual application servers than if I went with hardware RAID in each of them, and at a fraction of the cost of a SAN.
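The capacity figure checks out. Assuming (my assumption, for the sake of the arithmetic) that each box runs its five drives as a local RAID5 set:

```python
# Rough capacity check for the four-box setup above, assuming each box
# runs its five 500GB drives as RAID5 (one drive's worth lost to parity).
drives_per_box = 5
drive_gb = 500
boxes = 4  # the original machine plus three identical ones

usable_per_box = (drives_per_box - 1) * drive_gb  # RAID5: n-1 drives usable
total_gb = usable_per_box * boxes
assert total_gb == 8000  # 8TB usable, comfortably "6+ terabytes"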
Obviously it's inferior to a 'real' SAN, but it would work if I needed the storage capacity and lacked the budget.
GFS and friends also let you do this with SANs, which I suppose is what most people use them for nowadays.
So you can go:
SAN ---> SAN fabric ---> multiple servers with their file systems managed by GFS ---> regular Ethernet ---> end-user clients
or
SAN ---> SAN fabric ---> Linux GNBD servers ---> Ethernet storage backbone ---> multiple regular Linux servers ---> regular Ethernet network ---> end-user clients
(This lets you save costs and improve performance without having to extend the SAN across your entire network... the GNBD servers can be located and attached to the regular Ethernet network in different places to reduce traffic loads and isolate PC clients from one another, that sort of thing.)
or
Linux file server clusters running GNBD ---> Ethernet storage backbone ---> multiple regular Linux servers ---> regular Ethernet network ---> end-user clients
The last scenario is what I was talking about.
But there is also lots of other stuff in the works. It's not as mature as GFS and CLVM (which is still being extended), which people use in the enterprise right now.
For instance, there's ddraid... It lets you take network block devices and run actual software RAID over the network. That's definitely not something you can do with hardware RAID!
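To show what "RAID over the network" means in the simplest case, here's a conceptual sketch of RAID1-style mirroring across two remote block devices, modeled as plain dicts. This is my own illustration of the idea, not how ddraid is implemented:

```python
# Conceptual sketch: mirror every write across two "remote" block
# devices, survive the total loss of one, and rebuild it afterwards.

class NetMirror:
    def __init__(self, legs):
        self.legs = legs  # each leg: dict mapping block number -> data

    def write(self, block, data):
        for leg in self.legs:          # every write goes to both legs
            leg[block] = data

    def read(self, block):
        for leg in self.legs:          # read from any leg that has it
            if block in leg:
                return leg[block]
        raise IOError("block lost on all legs")

    def rebuild(self, dead_index):
        """Replace a failed leg by copying from the surviving one."""
        survivor = self.legs[1 - dead_index]
        self.legs[dead_index] = dict(survivor)

m = NetMirror([{}, {}])
m.write(0, b"data")
m.legs[0] = {}        # simulate one remote device failing completely
assert m.read(0) == b"data"   # still served from the surviving leg
m.rebuild(0)
```

ddraid's goal is the more interesting case of parity RAID across the network, but the failure/rebuild flow is the same shape.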
http://sources.redhat.com/cluster/ddraid/
There's OCFS2, Oracle's Cluster File System version 2, which is useful for general-purpose stuff despite its name.
http://oss.oracle.com/projects/ocfs2/
OCFS2 actually should make it into the vanilla kernel proper one of these days.
There is Lustre, a high-performance distributed file system used in many Linux clusters today... including on Top500 systems.
http://www.lustre.org/
For instance, hardware vendors are currently selling special-purpose Lustre solutions for scientific applications where very high I/O is needed. Using PCs (with hardware RAID, by the way) they are able to set up a parallel network-based file system that has been demonstrated at over 10Gb/s two-way file transfer performance on PC clusters. You can go out right now and buy pre-built systems using exotic interconnects like Quadrics that get sustained file transfer performance of around 2.5Gb/s.
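The trick behind those numbers is striping: a file is split round-robin across many object servers, so clients pull different stripes from different servers in parallel and the aggregate bandwidth scales with the server count. A toy sketch of that (the throughput numbers below are invented, and real Lustre striping is of course more sophisticated):

```python
# Round-robin a file's chunks across n servers, then reassemble them.
# Aggregate read bandwidth scales (ideally) with the server count,
# because each stripe can be fetched from a different server at once.

def stripe(data, n_servers, chunk):
    servers = [[] for _ in range(n_servers)]
    for idx in range(0, len(data), chunk):
        servers[(idx // chunk) % n_servers].append(data[idx:idx + chunk])
    return servers

def reassemble(servers):
    out, i = b"", 0
    queues = [list(s) for s in servers]
    while any(queues):                 # pull chunks back round-robin
        q = queues[i % len(queues)]
        if q:
            out += q.pop(0)
        i += 1
    return out

payload = bytes(range(25))
assert reassemble(stripe(payload, 4, 3)) == payload

per_server_MBps = 100                 # invented per-server streaming rate
n_servers = 8
aggregate_MBps = per_server_MBps * n_servers  # idealized parallel read
```

In practice the interconnect, not the disks, becomes the ceiling, which is exactly the Ethernet limitation mentioned below.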
http://www.taborcommunications.com/hpcwire/hpcwireWWW/04/1203/108916.html
All that file system stuff is open source right now.
Right now this sort of thing should be possible using regular PCs on a regular budget... but it isn't quite there. Ethernet is a big limitation, but as things like InfiniBand or Myrinet get cheaper it will become more and more practical.
If you need this stuff RIGHT NOW, obviously SANs are the only practical solution for the vast majority of people. But I don't think it will always be like that.