
best open source distributed file system?

alyarb

I'm looking for something that kind of resembles GoogleFS, in that dissimilar, commodity hardware can be added to a heterogeneous storage pool.

Gluster sounds cool, but after reading about it I'm not sure if it has the best topology. It supports distribution and replication, but in a really basic way. You have to set specific nodes to be stripes, and specific nodes to be mirrors, but not both.

I am looking for a system where placement is decided on a "per-chunk" basis: for each chunk of data, the system picks which nodes should hold its replicas. Something that provides a little of everything - a little parallelism for performance and just enough redundancy without wasting storage on every node like a perfect mirror would, because the whole point is that the nodes are heterogeneous and the pool is completely asymmetrical. We should be able to add standalone disks or large RAIDs to a node, and the file system "controller" decides how best to use them.
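To make the "per-chunk" idea concrete, here is a rough sketch of what I mean. It is not how GoogleFS or Gluster actually place data; it just uses weighted rendezvous hashing as a stand-in, so that bigger nodes end up holding proportionally more chunks. The node names, capacities, and the `place_chunk` helper are all made up for illustration.

```python
import hashlib
import math


def _score(chunk_id: str, node: str, capacity_tb: float) -> float:
    """Weighted rendezvous score: higher capacity tends to score higher."""
    digest = hashlib.sha256(f"{chunk_id}:{node}".encode()).digest()
    # Map the top 52 bits of the hash to a float strictly inside (0, 1).
    u = ((int.from_bytes(digest[:8], "big") >> 12) + 0.5) / 2**52
    return -capacity_tb / math.log(u)


def place_chunk(chunk_id: str, nodes: dict[str, float], replicas: int = 2) -> list[str]:
    """Pick `replicas` distinct nodes for one chunk.

    `nodes` maps a (hypothetical) node name to its usable capacity in TB.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    ranked = sorted(nodes, key=lambda n: _score(chunk_id, n, nodes[n]), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    # Hypothetical asymmetrical pool: two RAIDs and one standalone disk.
    pool = {"raid-a": 8.0, "raid-b": 4.0, "disk-c": 1.0}
    for i in range(5):
        print(f"chunk-{i}", place_chunk(f"chunk-{i}", pool))
```

Each chunk gets its own set of nodes, so no node is purely a "stripe" or purely a "mirror". Adding a node only moves the chunks for which the new node scores highest. A real system would also have to track free space and failure domains, which this sketch ignores.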

I'm not sure something like this even exists... I wish someone had an ISO or OVF I could just download and start messing with, but I can't seem to find anything, at least anything that is free. Looking forward to hearing your thoughts!

Thanks
 
ext4 on LVM? pretty easy to mirror disks and make backups. what specific purpose does it need to serve? is it part of a local cloud or file server? are you looking for something that scales up/down easily? what is your target OS?
 
thanks for responding. I'm not looking for a simple, straight answer because I know there isn't one. it would be more helpful if someone could just converse on the subject.

the target OS is probably Debian or CentOS. our main business is Ahsay OBS. We have about 500 TB in storage assets in Copenhagen and Orlando, and we are mostly running Windows Server 2008 R2, but we won't be sticking with Windows the way things are headed.

Each RAID has about 8 TB of usable storage, and of course it varies from user to user (assume 400 backup users per server), but those RAIDs with predominantly large numbers of small files are, I believe, pushing the practical limits of NTFS. We have workarounds for now, but they aren't really adequate. NTFS is too slow with this many files.

So we are definitely looking to move to a modern Linux file system that can remain agile with billions of files. We want to know if a file system exists where most of the RAIDs on most of our distantly separated servers can be pooled into a single logical unit, accessible by all members of the pool. For this, CentOS + Gluster seems like an obvious choice. It must be open source so we can install it on whatever we want. We don't buy appliances.

but what would be *really* cool is if we could do this WITHOUT RAID controllers or anything special... basically just installing high-end drives in commodity boxes.

from the GoogleFS abstract:

....It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients....

The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.

that was 10 years ago. are there any open source solutions that offer this same capability?
 
Hadoop dominated the results of my own Google searches, and I found it was not easy to understand, let alone configure.

Looking for a real file system that looks and behaves like a file system.
 