NFS - how to force async globally?

Red Squirrel

No Lifer
I've been reading up more on how to tune NFS, as my NFS performance has always been terrible. I didn't know what the cause was at first, but I think I've narrowed it down to NFS itself and not the network or the RAID. I read that sync is very slow and async is much faster, with a risk of data corruption if a write does not finish. (Is a power loss the only way this can happen? I have 4 hours of UPS runtime, so I'm not worried about power.)

The only issue is that async seems to be an /etc/exports option that has to be set for each and every path AND host. So if you export a share to, say, 5 hosts, you have to add it for each one. That seems ridiculously messy, and there is always the risk of forgetting to add it when making new entries. Is there a way to make it global?
 

Fallen Kell

Diamond Member
Sorry, just saw this thread now...

Well, the easiest way to avoid having to update multiple systems is to set up a NIS or LDAP domain to propagate your automount maps. That way you only have to update one set of files, which covers all your systems.

But you are correct, the async option is a client-only option, and thus you need to set it on each and every client for each mountpoint or automount map. In terms of data corruption, you are at risk if the server fails before it completes a write operation. You are also at risk if you have multiple systems reading and writing the same file.
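For what it's worth, a per-client entry along those lines in /etc/fstab might look roughly like this (the server name and paths are made-up placeholders, and the option list just mirrors the ones mentioned elsewhere in this thread):

Code:
# hypothetical client-side NFS mount with async among the options
mynfsserver:/datastorage/homedirs/redsquirrel  /home/redsquirrel  nfs  rw,async,tcp,rsize=65536,wsize=65536,intr,noatime  0 0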
 

Red Squirrel

No Lifer
What is the risk, exactly: that the particular file in question can be corrupted, or that the whole filesystem can be corrupted? Also, can I still have one client writing and multiple clients reading, or is that still an issue? In general I don't think I'll be at risk either way, but if the risk is losing the entire filesystem, maybe it's not worth it just in case. If the risk is losing the one file, then it's a risk I'm willing to take, as I imagine the odds are very slim. I do need to look into a better backup routine though; I don't have enough versioning in my current scheme, so if I don't catch an issue early enough it could get overwritten in the backup. But that's another topic. I need to look into rdiff-backup, I think.

Also, I did not realize async was a client flag; I thought it was a server flag. So it has to be set on each client too, and not just at the server? I was going to write a script to generate the /etc/exports file if worst comes to worst, but I guess that won't work, as I would still need to specify it in the /etc/fstab of each client?
 

Fallen Kell

Diamond Member
The corruption risk is to the particular file. You are OK with one client writing and many reading, but if multiple clients are writing, the file locking mechanisms and dirty/clean caching tend not to play nicely with async.

Yes, async would need to be set on each client in the fstab in your case. It is also HIGHLY recommended that you do not put NFS mounts in your fstab: if for any reason the filesystem is not available at boot time, the client will hang on boot. For this reason it is recommended that you use the automounter, preferably creating proper automount maps for each item/group.

For instance, you would want to have your home areas as an automount map. You would edit your /etc/auto.master file and add a line for "/home /etc/auto.home --ghost tcp rsize=65536 wsize=65536 intr noatime async". You then make a new file "/etc/auto.home" and add entries like "redsquirrel mynfsserver:/datastorage/homedirs/redsquirrel" and "fallenkell mynfsserver:/datastorage2/homedirs/fallenkell". Then you restart autofs, and under /home you would now see /home/redsquirrel and /home/fallenkell. These only get mounted on demand (i.e. when you actually go and access something in them) and unmount after a period of no activity, reducing load on your NFS server by not taking up an active NFS mount thread.

You can also create one for all your software installations: add "/sw /etc/auto.sw --ghost <etc., etc., NFS client side options here>", then create the file "/etc/auto.sw" and make entries like "gcc4.8.2 mynfsserver:/datastorage/software/gcc4.8.2" or "JDK1.7.0 myserver2:/storage/JDK1.7.0".
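A minimal sketch of that home-directory setup, using the placeholder hostname and paths from above (here the NFS options are attached to each map entry with a leading dash, which is one valid way to write an indirect map):

Code:
# /etc/auto.master -- indirect map for home areas; --ghost pre-creates the directory names
/home   /etc/auto.home   --ghost

# /etc/auto.home -- format is: key [-options] server:/path
redsquirrel   -rw,async,tcp,rsize=65536,wsize=65536,noatime   mynfsserver:/datastorage/homedirs/redsquirrel
fallenkell    -rw,async,tcp,rsize=65536,wsize=65536,noatime   mynfsserver:/datastorage2/homedirs/fallenkell

Then restart autofs ("service autofs restart" or "systemctl restart autofs", depending on the distro) and /home/redsquirrel should mount on first access.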

You can also create what is called a direct map by editing the "/etc/auto.master" file and checking for or creating a line such as "/- /etc/auto.direct --ghost <etc., etc., for the client side options>". Then edit the "/etc/auto.direct" file and add the full path to the directory in question, such as "/usr/local/mysoftware mynfsserver:/datastorage/mysoftware", and it will then mount that area as well. Direct maps are considered bad practice, as they can create headaches when you don't realize an area is actually an NFS-mounted area. It is advised to create individual indirect maps like I described above for /home or /sw or whatever you want to name it.
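And a rough sketch of the direct-map variant, again with placeholder names:

Code:
# /etc/auto.master -- "/-" marks a direct map
/-   /etc/auto.direct   --ghost

# /etc/auto.direct -- the key is the full absolute path on the client
/usr/local/mysoftware   -rw,async,tcp   mynfsserver:/datastorage/mysoftware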

Again, if you have a bunch of clients, it can get tedious to update and push all the maps to all the different clients. You would really want to use NIS or LDAP to share the maps out, so that you only update one place and the clients all check the NIS or LDAP server for the information.
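As a rough sketch of the client side of that (this assumes a working NIS domain is already serving the auto.master/auto.home maps; an LDAP setup would use an ldap source in place of nis):

Code:
# /etc/nsswitch.conf on each client -- look up automount maps in local files, then NIS
automount: files nis

# /etc/auto.master on each client -- pull in the master map served by NIS
+auto.master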
 

Red Squirrel

No Lifer
So this is an old thread, but I was starting to get serious performance issues again (it comes and goes) and decided screw it, I'm doing async mode. It turns out you can do it in the exports file, so I didn't need to set it on every client workstation, only for each entry in the /etc/exports file. If you are exporting to multiple clients you do need to add it to each of those entries, but at least it's all in one file.

Ex:

Code:
/volumes/raid1/vms_lun1              borg.loc(rw,all_squash,anonuid=1046,anongid=1046,async) moria.loc(rw,all_squash,anonuid=1046,anongid=1046,async)
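
For anyone following along: after editing /etc/exports on a running server, something like the following should apply and verify the change without a reboot (standard nfs-utils commands; adjust for your distro):

Code:
exportfs -ra    # re-export everything listed in /etc/exports
exportfs -v     # list active exports and confirm the async flag is present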

That seems to have done the trick. I guess it's a high-risk setting, but my sanity has been restored, as so far the performance issues I was getting with NFS seem to be completely gone. I was never able to do more than 10MB/sec sustained, and I would get random lockups where the entire NFS system would lock up across the whole network, causing a lot of weird things like VMs crashing or misbehaving. I even had a backup script that somehow skipped a line (setting a variable) and started doing weird stuff. It was a mess.

All of that is gone now, and I can get copy speeds of over 100MB/sec. For a gigabit link, that's on par with what I would expect.

It's still early to tell, but it's been a few days now without any performance hiccup. Before, at least once a day everything would start to grind to a halt, and most of my VMs were filled with dmesg logs about hard disk I/O issues (kind of like what you would get if an HDD were failing, except these are VMs). I also have one VM running legacy code on FC9 that also handles mail; it was acting very badly after I virtualized it, and now that VM actually runs fine. This seems to have solved pretty much all my lingering issues throughout my network. I was a little worried I would have to reboot the VM server for the changes to take effect, but it seems just re-scanning the storage did it.

I just wish I had done this sooner... I was reluctant because of the risk. Of course, I will find out about the risk in the next few months or so... Now that my backup jobs are not going to completely kill my network, I can probably refine them and make sure I have all my bases covered as far as backups go.