Originally posted by: thesix
Well, NFSv4 isn't widely used yet despite being available for some time now. A few years.
NFS has been used mostly by large businesses. Those are slow to migrate.
Full NFSv4 implementations (Sun, NetApp, IBM, Hummingbird etc.) came out less than two years ago, IIRC, which is not terribly long for a complex system like NFS.
Serious businesses don't take new technology lightly; it has to be reliable and serviceable.
It has several problems... for instance, it still depends on the RPC stuff, I believe.
More specifically, NFSv4 uses Sun ONC RPC. Is this a bad thing?
In practice, NFSv4 can use a single TCP connection with a well-defined destination TCP port (2049).
Even if it needs to "open a service port on the client", like the author suggests, is that a big deal?
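For what it's worth, here's a tiny Python sketch of what "one well-defined port" buys you. It just checks whether something answers on TCP 2049, nothing more; the hostname is made up and this obviously isn't a real NFS handshake:
[code]
import socket

# Minimal check that a host is listening on the single well-known NFSv4
# TCP port (2049). NFSv3 typically also needs portmapper/mountd/lockd ports.
# "nfs.example.com" is just a placeholder.
def nfs4_port_open(host, port=2049, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(nfs4_port_open("nfs.example.com"))
[/code]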
But the major problem, from my perspective, is that it's in its own little world.
Anything outside Windows is a "little world", if you must include PCs on small LANs.
The "big world" already has CIFS/SMB.
On the other hand, NFS is BIG in the business world.
It's not POSIX compliant.
POSIX is insufficient in this area; in fact, POSIX has abandoned its own ACL effort.
I'm not talking about ACLs in this case, just pure file system semantics. An application is going to behave differently depending on whether it's running on NFS or a native file system.
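A concrete example of what I mean (a little Python sketch, nothing authoritative; the filename is made up and you'd have to point it at an actual NFS mount from two machines to really see it): O_APPEND appends are atomic on a local POSIX filesystem, but classic NFS clients compute the append offset themselves, so concurrent appenders can clobber each other.
[code]
import os
import multiprocessing

# Hypothetical test file; point it at an NFS mount (and run from two
# machines) to compare against local-filesystem behavior.
PATH = "appendtest.log"

def writer(tag, count=1000):
    # O_APPEND: on a local POSIX fs each write lands atomically at EOF.
    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    for i in range(count):
        os.write(fd, f"{tag} {i}\n".encode())
    os.close(fd)

if __name__ == "__main__":
    if os.path.exists(PATH):
        os.remove(PATH)
    procs = [multiprocessing.Process(target=writer, args=(t,)) for t in "AB"]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    with open(PATH) as f:
        print(sum(1 for _ in f), "lines (a local fs should give 2000)")
[/code]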
It's not really all that compatible with Linux stuff.
What Linux stuff?
NFS is not designed for Linux; it's designed for Unix (in the generic sense, not the registered trademark), which is supposed to be vendor/OS neutral.
Nothing is vendor/OS neutral.
There are numerous problems with making the various NFS implementations work together well, especially with Linux. (Now, I don't know, but this seems an awful lot like the Linux developers' fault, though it's a problem with proprietary vendors also.)
Samba is nice because it was designed to deal with these issues from the outset. It's designed to be compatible with all the dozens of Microsoft variations, from early versions of DOS to handheld devices to Vista. Each time Microsoft introduces a new OS or product they introduce an ever-so-slightly incompatible version. Coming out of that environment, Samba is pretty much ultra compatible with itself.
So if you support CIFS you only have to deal with 2 vendors: Samba or Microsoft. There is nobody else that matters.
With NFS, everybody and their mom has a version and they don't always play well together. It's like POSIX: sounds good in theory, but in reality being able to compile cleanly with GCC sometimes means more for cross-platform support than POSIX does.
And it's not compatible with Windows stuff either.
Yes, I would like to see more cooperation between the two worlds, at the technology level.
That doesn't mean there's no NFSv4 implementation on Windows, despite the differences.
It supports extended access control lists, but those ACLs are not compatible with the ones people already use in Linux and other POSIX-style operating system file systems. The NFS people tried to follow Windows-style extended ACLs (much more widely used than Linux-style), but they followed Microsoft's documentation rather than actually examining how the ACLs really function, so they are incompatible.
ACLs in Unix have been a mess. More vendors will adopt Windows-style ACLs over time, I think, since POSIX ACLs seem to be dead.
Yeah, but the problem with NFSv4 is that it has its OWN UNIQUE extended ACL stuff. Sure, they tried to model it after Windows, but I've read an interview with a Samba guy making fun of them for actually following Microsoft's documentation. When you want to make something compatible with Microsoft's stuff you don't read the documentation; you observe how the stuff actually works. They took Microsoft's documentation literally, like an RFC. With MS, the actual software they end up shipping is the rule, not the documentation on how it should work.
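To show the shape of the problem, here's a rough Python sketch of the two ACL models side by side (the field names are just illustrative, not any real API): POSIX draft ACLs are a flat list of rwx entries, while NFSv4 ACEs are ordered ALLOW/DENY rules with a Windows-ish access mask, so round-tripping between them loses information.
[code]
from dataclasses import dataclass

# Illustrative only -- not a real ACL library.

@dataclass
class PosixAclEntry:
    tag: str        # "user", "group", "other", "mask"
    qualifier: str  # user/group name, or "" for the owner classes
    perms: str      # some subset of "rwx"

@dataclass
class Nfs4Ace:
    ace_type: str   # "ALLOW" or "DENY" -- evaluation order matters
    principal: str  # "alice@example.com", "OWNER@", "EVERYONE@", ...
    mask: set       # fine-grained bits: READ_DATA, WRITE_DATA, APPEND_DATA,
                    # READ_ACL, WRITE_ACL, DELETE_CHILD, and so on

# A single POSIX "user:alice:rw-" entry fans out into several NFSv4 mask
# bits, and NFSv4 DENY entries have no POSIX equivalent at all.
posix = PosixAclEntry("user", "alice", "rw-")
nfs4 = Nfs4Ace("ALLOW", "alice@example.com",
               {"READ_DATA", "WRITE_DATA", "APPEND_DATA"})
print(posix)
print(nfs4)
[/code]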
So the role of NFS is pretty limited to situations where:
- You have a secure network
NFSv4 can be secure; it doesn't require a secure network.
right.
- Need high speed shared storage
Not sure what you mean here.
I believe many NFS implementations are slow because the host filesystem performs badly under constant "commit" requests (there's a rough sketch of that write pattern below). NFS doesn't need "shared storage".
High network bandwidth helps NFS as much as anything else; actually, high bandwidth matters even more for true parallel filesystems.
NFS's bottleneck is at the server. It can only support as many clients as that one server can support. It's difficult to move beyond that.
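Here's the sketch I mentioned: a quick (and very unscientific) Python comparison of buffered writes versus an fsync() after every write. Run it on a local disk and then on an NFS mount; every fsync() on NFS forces the server to commit to stable storage before replying, which is exactly the pattern that hurts. The file name and sizes are made up.
[code]
import os
import time

def write_records(path, records, sync_each=False):
    """Write fixed-size records, optionally fsync()ing after each one."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    start = time.time()
    for _ in range(records):
        os.write(fd, b"x" * 4096)
        if sync_each:
            os.fsync(fd)   # on NFS this round-trips to the server (COMMIT)
    os.fsync(fd)
    os.close(fd)
    return time.time() - start

if __name__ == "__main__":
    print("buffered:        %.3f s" % write_records("bench.dat", 2000))
    print("fsync per write: %.3f s" % write_records("bench.dat", 2000, True))
[/code]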
- Are small enough that it doesn't warrant deploying a clustering file system and/or shared storage with Fibre Channel or iSCSI
A truly distributed/parallel filesystem is a different beast than NFS.
This has nothing to do with small or big. They address different requirements.
We have customers running NFS with thousands of clients.
Well, I mean 'big' in terms of the amount of data that needs to be moved, not so much the number of clients.
- Don't need compatibility with Windows machines.
Windows does have a commercial NFSv4 implementation:
http://connectivity.hummingbird.com/products/nc/nfs/index.html
Of course, in a mostly Windows environment, it doesn't make sense not to use the native protocol: CIFS/SAMBA.
Of course.
How compatible do you suppose that commercial NFSv4 is with Linux's NFSv4?
Personally I like the idea of distributed network file systems.
For instance, Lustre.
Well, I like GPFS too.
Again, different beast. Many ways to implement, not a standard.
Well, you have to understand a bit about Lustre's background to realize that it's not designed to be limited to a specific task like GPFS or PVFS2. It's designed to be useful in a much wider context than just HPC.
Now, I am no expert at this, but I have some ideas. In order to express them I have to make some definitions that may not be too applicable in the 'real world'.
So you have several types of file systems, i.e. different ways to distribute files over a network, so to speak.
Block Level network storage access protocols:
examples:
iSCSI - SCSI over TCP/IP (layer 3 networking for block storage access)
AoE - ATA commands over Ethernet (layer 2 networking for block-level access)
GNBD - Global/GFS Network Block Device (layer 3 again)
Fibre Channel - very expensive (layer 1/2, I assume)
So these export storage so you can access it directly at the block level. For instance, at home I have my desktop PXE boot over iSCSI onto a volume formatted with XFS (because I am a huge nerd).
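If it helps, here's a toy Python sketch of what a block-level protocol conceptually carries: nothing but "read/write these sectors at this address", with the client bringing its own filesystem on top. Purely illustrative, not real iSCSI/AoE code.
[code]
SECTOR = 512

class BlockTarget:
    """Roughly what an iSCSI/AoE/GNBD target exposes: a dumb array of sectors."""
    def __init__(self, sectors):
        self.store = bytearray(sectors * SECTOR)

    def read(self, lba, count):
        return bytes(self.store[lba * SECTOR:(lba + count) * SECTOR])

    def write(self, lba, data):
        self.store[lba * SECTOR:lba * SECTOR + len(data)] = data

# The initiator (client) treats it like a local disk: it lays down its own
# partition table and filesystem; the target never sees a file name.
disk = BlockTarget(sectors=2048)      # a ~1 MiB toy "LUN"
disk.write(0, b"fake superblock")
print(disk.read(0, 1)[:15])
[/code]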
Network File systems:
My own sort of definition..
examples:
NFSv3 - common network file system for Unix and Unix-like systems.
CIFS - a somewhat twisted attempt to standardize Microsoft's Server Message Block (SMB) protocol.
These export a file system. They use a sort of flat namespace:
servername:/path/to/stuff
which can be mounted as a C: drive or onto a directory or whatever.
They are limited by being locked to a single server. Users and computers need to know specifically which server to connect to and which paths on that server hold the data they want. Moving data around from server to server can be a pain because all those clients need to be told where the data moved to.
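Here's a toy Python sketch of that server-centric namespace (hostnames and paths are made up): every client effectively carries its own table of "this directory comes from that server", which is exactly why moving data means touching every client.
[code]
# What each client has to know -- more or less its fstab in dict form.
CLIENT_MOUNT_TABLE = {
    "/home":     "filesrv1:/export/home",
    "/projects": "filesrv2:/export/projects",
}

def resolve(path):
    """Map a local path to (server, path-on-server) using the mount table."""
    for mount_point, export in CLIENT_MOUNT_TABLE.items():
        if path.startswith(mount_point):
            server, export_path = export.split(":", 1)
            return server, export_path + path[len(mount_point):]
    raise FileNotFoundError(path)

print(resolve("/projects/render/frame001.dat"))

# If /export/projects moves to filesrv3, this table (and the same table on
# every other client) has to be updated by hand. Compare with the volume
# location sketch in the AFS section further down.
[/code]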
Clustering File Systems:
common definition.
GFS - Global File System. Open-sourced by Red Hat.
OCFS2 - second-generation Oracle file system, now in the Linux kernel.
Veritas CFS - proprietary whatchamacallit.
etc etc.
Commonly used to access shared storage over block-level network storage systems. You set up a SAN, set up servers to connect to it, and use GFS so that the servers can access the same data on the same file system without stepping on each other's toes. High speed, special purpose. Doesn't scale well to lots of clients.
People do fancy things... say you have a SAN connected to Linux servers via Fibre Channel. Then you export that access further over Ethernet with GNBD. This way you can leverage cheap switched gigabit Ethernet to extend your very expensive Fibre Channel/SAN arrangement. Then you set up GFS on those servers, and then export that further out to end clients as NFS or CIFS.
All those layers!
Distributed Network File systems:
My own specific definition, to differentiate these from the NFS-style stuff.
OpenAFS - open-sourced AFS file system. A classic.
DCE/DFS - IBM's POSIX-centric improvement over AFS.
NCP - NetWare Core Protocol. A distributed file system for use with eDirectory.
DFS - Microsoft's distributed file system. (Completely different from the IBM DFS stuff.)
Coda - now-dead research project based around modernizing AFS
Intermezzo - now-dead file system based on Coda-born concepts.
These systems are different from the NFS-style stuff because they usually implement a global namespace, strong security, advanced caching mechanisms, and advanced volume management features. That sort of thing.
For instance, with OpenAFS you have a global namespace. It goes like this:
/afs/domain/department/group/usr
Or however you want to divide it up. The last directory or whatnot would be a volume. A volume is an independent little nugget of a file system which can be moved around from server to server, mirrored, and that sort of thing. It supports quotas and stuff like that. You could have a hundred servers all over the internet serving out different pieces of your domain, with volumes being moved around from server to server on a regular basis, and the clients won't ever need to know anything about it. Everything remains constant in the directory tree.
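Here's a toy Python sketch of that idea (not real AFS code, names made up): the path only names a volume, and a separate location database maps each volume to whichever server currently holds it, so moving a volume is just a database update as far as clients are concerned.
[code]
# What the volume location database conceptually stores.
VOLUME_LOCATIONS = {
    "user.jsmith": "fileserver3.example.com",
    "proj.render": "fileserver7.example.com",
}

# Where volumes are mounted in the global namespace.
MOUNT_POINTS = {
    "/afs/example.com/usr/jsmith":  "user.jsmith",
    "/afs/example.com/proj/render": "proj.render",
}

def resolve(path):
    """Find which server currently serves the volume behind a path."""
    for prefix, volume in MOUNT_POINTS.items():
        if path.startswith(prefix):
            return VOLUME_LOCATIONS[volume], volume
    raise FileNotFoundError(path)

print(resolve("/afs/example.com/usr/jsmith/todo.txt"))

# "Moving" the volume is just a location-database update; the path the
# clients use never changes, which is the whole point.
VOLUME_LOCATIONS["user.jsmith"] = "fileserver9.example.com"
print(resolve("/afs/example.com/usr/jsmith/todo.txt"))
[/code]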
Also, it implements good caching mechanisms, so even though it's fairly slow on modern networks (where networks are getting faster than local storage), it is quite usable over the internet as a real read/write file system.
Of course, AFS is very old. It's not at all POSIX compliant and doesn't even support the "user/group/world", "read/write/execute" permissions that Unix has. It was born out of the Andrew Project at Carnegie Mellon, the same era as the Athena project that brought us Kerberos and X, so that goes to show how old it is.
Its permission model is pretty bizarre compared to what we are used to.
NFSv4 kind of fits into this group too, in some ways. But in other ways AFS is still more 'advanced' than any traditional 'network file system', despite being from the '80s.
Of course, for enterprise-level stuff they've moved on to bigger and better things with the various DFS systems that exist. For instance, Novell's NCP is pretty kick-ass when you look at the features it supports.
In the open source world, IBM opened up AFS and the OpenAFS project was born. It has a healthy development community, and AFS has been brought back to life as a stable, useful file system.
If you're using Debian you can pretty easily compile the required client modules with module-assistant and then install the OpenAFS client software. (If you're using stable, backport the Etch versions... they are much nicer.)
It's fun because there are a whole host of government, military, and education AFS sites out there on the web, and the Debian package includes their domains. So it's fun to just surf around in your /afs/ file system.

It's very el-neato.
From the AFS2 stuff came Coda, as a way to modernize the concept. They wanted to keep AFS's advantages, such as the global namespace and the authentication stuff, and get improved network performance by making Coda adaptable to different bandwidth situations. One of their whiz-bang features was a fully disconnected mode: you could connect a laptop to the network, grab what data you needed, and be able to work on it at home without having to copy anything. It would run entirely out of the cache and still fit into that global namespace.
However that died.
From that came the Intermezzo project, based around Coda concepts. It was designed to get good network performance, handle network and server outages in a sane manner, and do some failover, I believe. It was designed to leverage existing file systems as much as possible: instead of creating an all-new file system, it would take Ext3 (or whatever) and extend it out for network-level access. This way they could keep it thin and leverage existing features like journaling to avoid code duplication and performance issues.
However that died.
Now, some Intermezzo/Coda folks figured they weren't going to be able to develop a real 'next-gen' file system on the traditional Linux model of financing projects (namely: none). So they got together, formed Cluster File Systems, Inc., and started off with Lustre.
They sell GPL'd software to people doing HPC clustering in order to finance their work. With Beowulf-style clusters you have this tremendous advantage of being able to leverage large amounts of computation power at low cost. HOWEVER, PC-based clusters are very, very limited in the amount of I/O they are able to do. Even with large clusters, the typical network interconnect of choice is just plain switched gigabit Ethernet. So this limits the types of jobs they can do most effectively to workloads where you perform very complex computations on limited-size data sets. You still need to buy one of those Massively Parallel Supercomputers if you want to process huge amounts of data.
So Lustre is designed to provide very high-speed network object access at very low latencies. It's very thin, very fast. It scales to petabytes worth of data and many thousands of clients, and has been proven able to manage many tens of gigabytes per second with fancy hardware. And obviously it's designed for high availability and distribution, since with commodity-based clusters you're definitely going to have bits and pieces of the cluster winking in and out of existence due to the el-cheapo nature of the hardware. Since it's designed for this first, it's attractive to people who want to improve the I/O performance of their clusters, and they'll pay for it even if it's GPL'd software.
(Notice also that with HPC stuff they use the mainframe-originated convention of gigabytes per second rather than the PC/networking convention of gigabits per second.)
For instance, right now HP has their StorageWorks stuff that is based on SAN-style hardware and Lustre, with management tools and such. A 'total vendor solution' type of thing.
http://h20311.www2.hp.com/HPC/cache/276636-0-0-0-121.html
They say it can do 35 GB/s... which is FAST.
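Just to put that number in perspective (and to underline the gigabyte vs. gigabit convention from a couple of paragraphs up), a quick back-of-the-envelope calculation:
[code]
# 35 gigabytes/second (HPC convention) expressed in gigabits/second
# (networking convention), plus what one gigabit Ethernet link can move.
gbytes_per_s = 35
gbits_per_s = gbytes_per_s * 8
print(gbits_per_s, "Gb/s aggregate")             # 280 Gb/s

per_gige_link = 1 / 8                            # 1 Gb/s ~= 0.125 GB/s
print(per_gige_link, "GB/s per gigabit Ethernet link")
print(gbytes_per_s / per_gige_link, "saturated GigE links' worth of traffic")
[/code]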
Now it's currently used in HPC, but like other aspects of Linux clustering this stuff is filtering down to benefit normal users.
Like Coda/Intermezzo/AFS, it's designed to be useful to a much wider audience, eventually, as it matures. They are starting to 'fill in the blanks'.
Also, like with Intermezzo, they are heavily leveraging existing file systems. They put a lot of work into improving the Linux VFS stuff and improving Ext3. You'll see their company's name pop up quite a bit around Ext4 too. They work with Red Hat to get their improvements into the kernel.
See their roadmap:
http://www.clusterfs.com/roadmap.html
Currently the beta version of Lustre is 1.6.0 beta 4. The roadmap shows the features Lustre supports and the features they plan on implementing. From that you can tell it's more than just for high-performance computing.
It supports such nice features as:
- Scalable to 15,000 clients.
- Scalable up to 400+ OSSes (the storage server portion; they can be based on a SAN, or just standard servers, or whatever)
- 2 MDSes for high availability (the metadata servers, for locating data, controlling locking, permissions, etc.)
- Linux patch-less clients. You don't have to patch the kernel anymore; enough of Lustre is in the standard kernel already that you can just build a module.
- export as CIFS for those hippy Windows users.
- export as NFS for everybody else.
- POSIX I/O
- POSIX ACLs/extended ACLs
- Extended attributes
- Quotas
- RAID 0 style striping across the storage targets (see the sketch after this list).
- TCP networks
- Myrinet, InfiniBand, "zero-copy TCP"
- I expect GSS/Kerberos support by this time next year.
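And here's the striping sketch I promised in the list (toy Python, with made-up stripe settings): "RAID 0 style" just means a file is chopped into stripe-size chunks laid out round-robin across several object storage targets, so big sequential I/O gets the bandwidth of all of them at once.
[code]
STRIPE_SIZE = 1 * 1024 * 1024   # 1 MiB per stripe chunk (made-up value)
STRIPE_COUNT = 4                # number of OSTs the file is spread across

def locate(file_offset):
    """Map a file offset to (OST index, offset within that OST's object)."""
    chunk = file_offset // STRIPE_SIZE
    ost_index = chunk % STRIPE_COUNT
    ost_offset = (chunk // STRIPE_COUNT) * STRIPE_SIZE + file_offset % STRIPE_SIZE
    return ost_index, ost_offset

for off in (0, 512 * 1024, 5 * 1024 * 1024):
    ost, obj_off = locate(off)
    print(f"file offset {off:>8} -> OST {ost}, object offset {obj_off}")
[/code]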
Nummy. I expect that with the CIFS export support and the GSS stuff it should be able to tie into domain controllers and such, and be useful for even medium-size businesses.
Edit:
Here is an old Lustre whitepaper; if what I said made no sense, maybe this will help:
http://www.lustre.org/docs/whitepaper.pdf#search=%22Lustre%20white%20paper%22
Note: I am by no means an NFS expert. My comments above may be wrong. Don't quote me outside of this thread.
Sounds like you know a hell of a lot more about it than I do.
I just like reading up on network file system stuff for some bizarre reason.
Personally I've used OpenAFS, NFS, and CIFS, and have played around with iSCSI and AoE.
As far as iSCSI vs. AoE goes: I've used AoE's vblade (virtual blade) and that sucks. With iSCSI I've used Open-iSCSI, which is the current in-kernel software 'initiator' (basically the client), and the "iSCSI Enterprise Target", which is the software emulation of the 'server'. I want to try out GNBD. From my personal experience, software-emulated I/O with iSCSI and XFS or Ext3 is faster than NFS for most things except random seeks.

It's kinda fun to play around with.