Friday, 23 November 2012

NFS, Linux file permissions and IBM Installation Manager

In this post, I describe how I overcame an issue with IBM Installation Manager, observed when attempting to install IBM Operational Decision Manager v8 ( based on WebSphere Application Server v8 ) on Linux.

I had an interesting experience today - trying to install a WAS-based product using IBM Installation Manager, via a set of Unix shell scripts and response files. For ease of use, the scripts are located on a shared file system ( NFS ), mounted at /mnt/filesystem on each of FOUR Linux servers.

This allows me to re-use the same set of scripts, without needing to host them locally, and seemed like the most practical root (!).

The problem started when I executed one script which deletes and then repopulates the scripts from a central properties file.

When I ran this on server A, I was then suddenly unable to delete the file on server B. This pattern occurred on servers C and D as well.

In each case, I was using the same local Linux user ID - wasadmin - and, on the NFS server ( from which the file system was exported ), the user that "really" owned the files had changed the ownership to wasadmin ( group wasadmins ).

I checked, and rechecked, the file permissions, both as wasadmin and as root, using chown and chmod, and everything looked OK - RWX.

Thankfully, the Linux admin came to the rescue.

When he checked the /etc/passwd file on each of the four servers, he noticed that, on two of the boxes, the wasadmin user ID (UID) was set to 1003 and, on the other two boxes, the UID was 1004.

When he checked the "real" user on the NFS server, it had a UID of 1004.

That was the problem - although the NFS server had set the ownership of the files to wasadmin, what actually matters is the numeric UID of the account owning the files, not its name - and that UID was 1004.

So, on the two servers where wasadmin was unable to delete the files, we confirmed that the wasadmin user had the wrong UID - 1003.

Once we changed the UID of wasadmin on those two servers to 1004, all was well.
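The check that the Linux admin did by eye can be sketched in a few lines of Python. This is only an illustration, assuming that the relevant /etc/passwd content has already been collected from each box; the host names, GIDs and the split of UIDs across hosts in the sample data are hypothetical, as are the function names.

```python
from collections import defaultdict


def parse_uid(passwd_text, username):
    """Return the numeric UID for username in passwd-format text, or None."""
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # passwd format: name:password:UID:GID:GECOS:home:shell
        fields = line.split(":")
        if len(fields) >= 3 and fields[0] == username:
            return int(fields[2])
    return None


def group_hosts_by_uid(passwd_by_host, username):
    """Map each UID found for username to the sorted list of hosts using it."""
    groups = defaultdict(list)
    for host, text in passwd_by_host.items():
        groups[parse_uid(text, username)].append(host)
    return {uid: sorted(hosts) for uid, hosts in groups.items()}


# Hypothetical sample data: host names and GIDs are made up for illustration
sample = {
    "nfs-server": "wasadmin:x:1004:1004::/home/wasadmin:/bin/bash",
    "serverA": "wasadmin:x:1004:1004::/home/wasadmin:/bin/bash",
    "serverB": "wasadmin:x:1003:1003::/home/wasadmin:/bin/bash",
}

groups = group_hosts_by_uid(sample, "wasadmin")
if len(groups) > 1:
    # More than one distinct UID means the boxes disagree about who wasadmin is
    print("UID mismatch for wasadmin:", groups)
```

A key of None in the result would mean that the user does not exist at all on that host.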

Because local files created under the old UID keep that old numeric owner, I then needed to run, as root, the chown command against the files that the two failing users had previously used/touched, including: -

/opt/IBM/InstallationManager ....
/home/wasadmin/var
/home/wasadmin/etc

However, I found out, the hard way, that I also needed to do the same for IIM files in /tmp, such as /tmp/ciclogs_e1wasadmin.

Without this last step, I was seeing exceptions such as: -

java.io.FileNotFoundException: /tmp/ciclogs_e1wasadmin/sample.properties (Permission denied)
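To avoid missing stragglers like the ones in /tmp, it can help to search for anything still owned by the old UID rather than relying on a remembered list of directories. The following is a rough sketch of that idea, not what I actually ran; the function names are hypothetical, and it defaults to a dry run that only reports what it would change.

```python
import os


def files_owned_by(root, uid):
    """Yield root and every path beneath it whose owner UID equals uid."""
    if os.lstat(root).st_uid == uid:
        yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat inspects a symlink itself rather than its target
                if os.lstat(path).st_uid == uid:
                    yield path
            except FileNotFoundError:
                # The file vanished while we were walking (common in /tmp)
                continue


def rechown(root, old_uid, new_uid, dry_run=True):
    """Re-own paths under root from old_uid to new_uid; return affected paths."""
    affected = list(files_owned_by(root, old_uid))
    if not dry_run:
        for path in affected:
            # A gid of -1 leaves the group unchanged; changing owner needs root
            os.chown(path, new_uid, -1, follow_symlinks=False)
    return affected
```

For example, calling rechown("/tmp", 1003, 1004) would list everything under /tmp that still belongs to the old UID, and passing dry_run=False ( as root ) would then fix the ownership.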

So, in summary, IF you've got multiple Linux ( or any other *Nix ) users needing full RWX access to files on an NFS server, you need to ensure that the UID of the user is consistent across the boxes AND the NFS server.

One of my colleagues said that this is especially important when using Automated Peer Recovery for Transaction files on NFS - the UID for wasadmin MUST be the same for both servers, or WAS on server A cannot read the transaction logs for server B, meaning that peer recovery fails.
