Fixed a major problem with the computers at the office today. We currently have our user authentication handled by LDAP and home directories stored on an NFS server. This is a typical, tried-and-tested setup in many places.
Things had been working fine for a while, but over the past week or so we started to experience mysterious problems. If only one person was logged in, everything was fine. But the moment another person logged into another machine, the NFS clients would hang and the server would show about 75% disk write activity.
Searching around suggested several remedies, none of which worked. Many of them also applied only to NFS3. However, after reading the man page for NFS4, we discovered that the problem was with the client address. Quoting the man page for nfs:
clientaddr=n.n.n.n
Specifies a single IPv4 address (in dotted-quad form), or a non-link-local IPv6 address, that the NFS client advertises to allow servers to perform NFS version 4 callback requests against files on this mount point. If the server is unable to establish callback connections to clients, performance may degrade, or accesses to files may temporarily hang.
If this option is not specified, the mount(8) command attempts to discover an appropriate callback address automatically. The automatic discovery process is not perfect, however. In the presence of multiple client network interfaces, special routing policies, or atypical network topologies, the exact address to use for callbacks may be nontrivial to determine.
We do not yet know what changed recently. Previously, the NFS4 clients reported the correct client IP address to the server, but when we checked the current machines, they were all reporting 0.0.0.0.
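For anyone wanting to run the same check on their own clients, here is a rough sketch. It assumes a Linux client, where the mount options of each NFS4 mount (including clientaddr) show up in /proc/mounts. The function name and the sample line are made up for illustration; this is not the exact command we ran.

```python
def find_bad_clientaddr(mounts_text):
    """Return (mount_point, clientaddr) pairs for NFS mounts advertising 0.0.0.0."""
    bad = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # Turn "rw,vers=4.0,clientaddr=0.0.0.0" into a dict of option -> value
        options = dict(opt.partition("=")[::2] for opt in fields[3].split(","))
        if options.get("clientaddr") == "0.0.0.0":
            bad.append((fields[1], options["clientaddr"]))
    return bad


# Hypothetical sample in /proc/mounts format (the addresses are placeholders).
SAMPLE = """\
nfs.example.com:/home /home nfs4 rw,relatime,vers=4.0,clientaddr=0.0.0.0,addr=192.0.2.10 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

for mount_point, addr in find_bad_clientaddr(SAMPLE):
    print(f"{mount_point}: callback address is {addr}")
```

To check a live machine, pass the contents of /proc/mounts to the function instead of SAMPLE.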
Adding the correct static IP address to /etc/fstab by using the clientaddr option solved the problem. Now, things are back to normal again.
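To give an idea of what the change looks like, here is a small sketch that adds the clientaddr option to an existing /etc/fstab line. The server name, export path, mount options and IP address are placeholders and not our real configuration, and the helper name is made up. Only the clientaddr option itself comes from the man page.

```python
import ipaddress


def with_clientaddr(fstab_line, client_ip):
    """Return fstab_line with clientaddr=client_ip set in its options field."""
    ipaddress.ip_address(client_ip)  # raises ValueError if this is not a valid IP
    fields = fstab_line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 fstab fields")
    # Drop any existing clientaddr option so it is not listed twice.
    options = [o for o in fields[3].split(",") if not o.startswith("clientaddr=")]
    options.append(f"clientaddr={client_ip}")
    fields[3] = ",".join(options)
    return " ".join(fields)


# Hypothetical entry; each client needs its own static address here.
line = "nfs.example.com:/home /home nfs4 rw,hard 0 0"
print(with_clientaddr(line, "192.0.2.21"))
# prints: nfs.example.com:/home /home nfs4 rw,hard,clientaddr=192.0.2.21 0 0
```

Since the address differs on every client, the line cannot simply be copied from one machine's fstab to another.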
Update: It turns out that the problem is related to this bug report.
5 Comments
jamesjustjames · 2013-06-09 at 16:25
Interesting post. I was actually searching for more information on this option when I found this.
I don’t actually have any problems at the moment, but I’m curious whether you know more about the technical details of why the server needs a callback address, and what to do if the client isn’t directly routable. What ports need to be open on the client? Does it need to respond to “NEW” traffic, or only “ESTABLISHED”?
cheers
Shawn Tan · 2013-06-11 at 13:30
When we first used it, it could automatically detect the correct client address, but it stopped doing that later. The man page documents that NFS4's automatic discovery of the client address is not perfect. So, we had to enter the client address manually.
jamesjustjames · 2013-06-11 at 16:08
I’m curious whether you know more about the technical details of why the server needs a callback address, and what to do if the client isn’t directly routable. What ports need to be open on the client? Does it need to respond to “NEW” traffic, or only “ESTABLISHED”?
Shawn Tan · 2013-06-11 at 17:47
Maybe you should ask on an NFS4 forum or ask one of its developers. It’s supposed to be a new NFS4 feature.
Shawn Tan · 2013-06-15 at 00:18
This may help http://wiki.debian.org/SecuringNFS