Thursday, 11 October 2012

Getting CWMRI1106E and CWMRI1032E when attempting to install IBM Installation Manager from WebSphere Application Server's Job Manager

In this post, I describe how I hit and, after much trial and error, resolved an issue with the Job Manager function of WebSphere Application Server 8.5.

So I'm continuing to grow my understanding of WebSphere Application Server 8.5, and am now using the Job Manager to perform a remote installation of IBM Installation Manager onto a Red Hat Linux server.

This is the precursor to the next step, where I use Job Manager to drive the remote Installation Manager to install WebSphere Application Server from an HTTP-based repository, built using Package Manager.

The benefit of this approach is that one could use a single Job Manager server to perform the remote installation, configuration and management of a multi-server WebSphere configuration.

I'm accessing Job Manager via the browser UI, which is rather shiny.


In this particular case, I've configured Job Manager to connect to my target server ( rhel6.uk.ibm.com ) using a non-root user ( wasadmin ), in line with the good practice of not running WebSphere tasks as root wherever possible.

The Job Manager job ( installIM ) requires a small number of parameters: -

- Target host
- Credentials ( these can be configured directly for the target host, avoiding the need to re-enter them )
- Path to Installation Manager kit on the local Job Manager host - I'm using /opt/IBM/WebSphere/profiles/JobMgr01/IMKits/agent.installer.linux.gtk.x86_1.6.0.20120831_1216.zip
- Path to Agent Data location on remote host - I'm using /home/wasadmin/var/ibm/InstallationManager
- Path to Installation Manager location on remote host - I'm using /opt/IBM/InstallationManager
- License acceptance ( by checking "I accept the terms in the license agreements" )

So I submitted the job and, after a minute or so, it failed with: -

CWMRI1106E: installIM job failed on host rhel6.uk.ibm.com with installation kit agent.installer.linux.gtk.x86_1.6.0.20120831_1216.zip. The exception is "CWMRI1032E: An error occurred while creating directory null on remote host rhel6.uk.ibm.com.".

I tried and re-tried and re-tried again, varying the parameters, but to no avail.

I then ran through the same steps, but using root rather than wasadmin, which worked.

This suggested a file permissions problem, but I'd previously proven that wasadmin could create files and directories in /opt/IBM and in /home/wasadmin.

I then had a Eureka! moment, thinking back to other problems with non-root users and file permissions on Linux - where the non-root user is unable to write to /tmp.

I tested this hypothesis as follows: -

$ touch /tmp/foobar

touch: cannot touch `/tmp/foobar': Permission denied

and: -

$ ls -al /

total 120
dr-xr-xr-x.  26 root root  4096 Oct 11 09:11 .
dr-xr-xr-x.  26 root root  4096 Oct 11 09:11 ..
-rw-r--r--    1 root root     0 Oct 11 09:11 .autofsck
-rw-r--r--    1 root root     0 Jun  6 14:33 .autorelabel
dr-xr-xr-x.   2 root root  4096 Oct 10 10:47 bin
dr-xr-xr-x.   5 root root  3072 Aug 29 15:58 boot
drwxr-xr-x.  10 root root  4096 Feb 22  2012 cgroup
drwx------.   3 root root  4096 Jun  1 10:08 .dbus
drwxr-xr-x   18 root root  3740 Oct 11 09:11 dev
drwxr-xr-x. 126 root root 12288 Oct 11 10:53 etc
drwxr-xr-x.   4 root root  4096 Oct 10 17:01 home
dr-xr-xr-x.  13 root root  4096 Oct 10 10:46 lib
dr-xr-xr-x.  10 root root 12288 Oct 10 10:46 lib64
drwx------.   2 root root 16384 Jun  1 09:23 lost+found
drwxr-xr-x.   2 root root  4096 Jun 28  2011 media
drwxr-xr-x    2 root root     0 Oct 11 09:11 misc
drwxr-xr-x.   2 root root  4096 Jun 28  2011 mnt
drwxr-xr-x    2 root root     0 Oct 11 09:11 net
drwxr-xr-x.   2 root root  4096 Oct 11 11:05 opt
dr-xr-xr-x  113 root root     0 Oct 11 09:11 proc
dr-xr-x---.  31 root root  4096 Oct 10 16:53 root
dr-xr-xr-x.   2 root root 12288 Oct 10 10:47 sbin
drwxr-xr-x.   2 root root  4096 Jun  1 09:23 selinux
drwxr-xr-x.   2 root root  4096 Jun 28  2011 srv
drwxr-xr-x   13 root root     0 Oct 11 09:11 sys
drwxr-xr-x.   7 root root  4096 Oct 11 10:58 tmp
drwxr-xr-x.  13 root root  4096 Jun  1 09:24 usr
drwxr-xr-x.  23 root root  4096 Jun  6 14:59 var
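
The same check can be scripted rather than done by hand with touch. What follows is only a minimal sketch in Python, not anything Job Manager itself runs: the function name can_create_files is made up for illustration, and the list of paths is simply the directories mentioned in this post. It needs to be run on the target host as the user the job will connect as ( wasadmin here ), because os.access answers for whoever runs it, and root will pass regardless of the mode bits.

```python
import os


def can_create_files(directory):
    """Return True if the current user can create entries in `directory`.

    Creating a file needs both write and search (execute) permission on
    the directory itself. `can_create_files` is a hypothetical helper name.
    """
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Directories taken from this post; adjust for your own target host.
    for path in ("/tmp", "/opt/IBM", "/home/wasadmin"):
        print(path, "writable" if can_create_files(path) else "NOT writable")
```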


I've seen this problem, and blogged about it, before.



This required me to update the permissions ( as root ) of the /tmp directory to allow all users to read/write/execute there: -

$ chmod a+rwx /tmp

$ ls -al /

total 120
dr-xr-xr-x.  26 root root  4096 Oct 11 09:11 .
dr-xr-xr-x.  26 root root  4096 Oct 11 09:11 ..
-rw-r--r--    1 root root     0 Oct 11 09:11 .autofsck
-rw-r--r--    1 root root     0 Jun  6 14:33 .autorelabel
dr-xr-xr-x.   2 root root  4096 Oct 10 10:47 bin
dr-xr-xr-x.   5 root root  3072 Aug 29 15:58 boot
drwxr-xr-x.  10 root root  4096 Feb 22  2012 cgroup
drwx------.   3 root root  4096 Jun  1 10:08 .dbus
drwxr-xr-x   18 root root  3740 Oct 11 09:11 dev
drwxr-xr-x. 126 root root 12288 Oct 11 10:53 etc
drwxr-xr-x.   4 root root  4096 Oct 10 17:01 home
dr-xr-xr-x.  13 root root  4096 Oct 10 10:46 lib
dr-xr-xr-x.  10 root root 12288 Oct 10 10:46 lib64
drwx------.   2 root root 16384 Jun  1 09:23 lost+found
drwxr-xr-x.   2 root root  4096 Jun 28  2011 media
drwxr-xr-x    2 root root     0 Oct 11 09:11 misc
drwxr-xr-x.   2 root root  4096 Jun 28  2011 mnt
drwxr-xr-x    2 root root     0 Oct 11 09:11 net
drwxr-xr-x.   2 root root  4096 Oct 11 11:05 opt
dr-xr-xr-x  112 root root     0 Oct 11 09:11 proc
dr-xr-x---.  31 root root  4096 Oct 10 16:53 root
dr-xr-xr-x.   2 root root 12288 Oct 10 10:47 sbin
drwxr-xr-x.   2 root root  4096 Jun  1 09:23 selinux
drwxr-xr-x.   2 root root  4096 Jun 28  2011 srv
drwxr-xr-x   13 root root     0 Oct 11 09:11 sys
drwxrwxrwx.   7 root root  4096 Oct 11 10:58 tmp
drwxr-xr-x.  13 root root  4096 Jun  1 09:24 usr
drwxr-xr-x.  23 root root  4096 Jun  6 14:59 var
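
One thing worth noting about the listing above: chmod a+rwx leaves /tmp as drwxrwxrwx, whereas Linux distributions normally ship /tmp with mode 1777 ( shown by ls as drwxrwxrwt ). The extra sticky bit stops users deleting each other's files, so chmod 1777 /tmp is arguably the safer fix. The sketch below, in Python, is one way to see which of the two a directory has; describe_mode is a hypothetical helper name, and this is an illustration rather than part of the original procedure.

```python
import os
import stat


def describe_mode(path):
    """Return (ls-style mode string, sticky bit set?) for `path`.

    `describe_mode` is a hypothetical helper name used for illustration.
    """
    mode = os.stat(path).st_mode
    return stat.filemode(mode), bool(mode & stat.S_ISVTX)


if __name__ == "__main__":
    mode_string, sticky = describe_mode("/tmp")
    print(mode_string, "sticky bit set" if sticky else "no sticky bit")
```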

and validated the change as wasadmin: -

$ cd /tmp/
$ touch foo
$ rm foo 

I retried the installIM job, but it failed again, albeit after slightly longer ( and with more activity on the target box, including ssh and Java processes ).

This time, Job Manager showed a different, and more revealing message: -

CWMRI1103E: installIM job failed on host rhel6.uk.ibm.com with installation kit agent.installer.linux.gtk.x86_1.6.0.20120831_1216.zip. The standard error, standard output or logs of the command can be found in /opt/IBM/WebSphere/profiles/JobMgr01/config/temp/JobManager/134995125859408828/rhel6.uk.ibm.com/logs.

I checked the logs ( these are on the local Job Manager server ): -

$ cd /opt/IBM/WebSphere/profiles/JobMgr01/config/temp/JobManager/134995125859408828/rhel6.uk.ibm.com/logs
$ cat stdErr.txt

ERROR: Your user ID or group has insufficient permissions granted for the installation directory path: /opt/IBM/InstallationManager. Ensure that your user ID or group has at least write permission to the installation directory and execute permission to each directory preceding the installation directory.
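
The error message spells out two conditions: write permission for the installation directory, and execute permission on every directory above it. A rough pre-flight check along those lines might look like the Python sketch below. It is my own illustration, not how Installation Manager performs its check: check_install_path is a hypothetical name, and the sketch assumes that when the installation directory does not exist yet, what matters is write permission on the deepest directory that does exist. Run it on the target as the installing user; as root it will always pass.

```python
import os


def check_install_path(install_path):
    """Return a list of permission problems for `install_path` (empty if none).

    Checks execute (search) permission on every existing directory in the
    path, and write permission on the deepest one that already exists,
    since that is where any missing directories would have to be created.
    """
    path = os.path.abspath(install_path)

    # Build the chain of directories from the filesystem root downwards.
    chain = []
    while True:
        chain.append(path)
        parent = os.path.dirname(path)
        if parent == path:
            break
        path = parent
    chain.reverse()

    problems = []
    deepest_existing = chain[0]
    for directory in chain:
        if not os.path.isdir(directory):
            break  # everything from here down does not exist yet
        deepest_existing = directory
        if not os.access(directory, os.X_OK):
            problems.append("no execute permission on " + directory)

    if not os.access(deepest_existing, os.W_OK):
        problems.append("no write permission on " + deepest_existing)
    return problems


if __name__ == "__main__":
    for problem in check_install_path("/opt/IBM/InstallationManager"):
        print(problem)
```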

I validated this as wasadmin: -

$ cd /opt
$ mkdir foo

mkdir: cannot create directory `foo': Permission denied

I fixed this, again as root, as follows: -

$ mkdir /opt/IBM
$ chmod -R g+wr /opt/IBM/
$ chgrp -R wasadmins /opt/IBM/

In other words, I created the subdirectory - IBM - under the original /opt directory, and then allowed members of the group wasadmins ( which actually contains a single user - wasadmin ) to read and write within that subdirectory.
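
For anyone who would rather script this step, the chmod -R g+wr part can be approximated in Python as below. This is a sketch under assumptions: add_group_read_write is a hypothetical name, symbolic links are deliberately skipped, and the group change itself ( chgrp -R wasadmins ) is left out, although the standard library's shutil.chown(path, group="wasadmins") could be applied to each path in the same loop when run as root.

```python
import os
import stat


def add_group_read_write(root):
    """Add group read and write permission to `root` and everything below it.

    Roughly what `chmod -R g+wr` does. Returns the number of entries whose
    mode was changed. Symbolic links are skipped rather than followed.
    """
    paths = [root]
    for directory, subdirs, files in os.walk(root):
        paths.extend(os.path.join(directory, name) for name in subdirs + files)

    changed = 0
    for path in paths:
        if os.path.islink(path):
            continue
        mode = stat.S_IMODE(os.stat(path).st_mode)
        wanted = mode | stat.S_IRGRP | stat.S_IWGRP
        if wanted != mode:
            os.chmod(path, wanted)
            changed += 1
    return changed


if __name__ == "__main__":
    print(add_group_read_write("/opt/IBM"), "entries updated")
```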

I re-submitted the installIM job and, this time, it succeeded with: -

CWMRI1102I: installIM job completed on host rhel6.uk.ibm.com with installation kit agent.installer.linux.gtk.x86_1.6.0.20120831_1216.zip. The standard output of the command can be found in /opt/IBM/WebSphere/profiles/JobMgr01/config/temp/JobManager/134995245574408833/rhel6.uk.ibm.com/logs.

So, the moral of the story is: get your permissions right, on both /tmp and /opt.

PS It's also worth noting that I could've submitted the installIM job via wsadmin as follows: -

$ cd /opt/IBM/WebSphere/profiles/JobMgr01/bin
$ ./wsadmin.sh -lang jython -user wasadmin -password password

WASX7209I: Connected to process "jobmgr" on node was85JobMgr01 using SOAP connector;  The type of process is: JobManager
WASX7031I: For help, enter: "print Help.help()"


wsadmin> AdminTask.submitJob('[-jobType installIM -group redhat -description installIM -jobParams [ [skipPrereqCheck FALSE] [installType auto] [acceptLicense TRUE] [installPath /opt/IBM/InstallationManager] [dataPath /home/wasadmin/var/ibm/InstallationManager] [kitPath /opt/IBM/WebSphere/profiles/JobMgr01/IMKits/agent.installer.linux.gtk.x86_1.6.0.20120831_1216.zip] ]]')

which is much quicker than using the web UI :-)
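
The argument to AdminTask.submitJob is one long bracketed string, which is easy to mistype. Below is a small sketch of a helper that assembles it from a list of name/value pairs. The helper name build_submit_job_args is my own invention, the parameter names and values are exactly those used in the command above, and the sketch assumes none of the values contain spaces. I have only considered it as plain Python, so treat its use inside a wsadmin Jython session as untested.

```python
def build_submit_job_args(job_type, group, description, job_params):
    """Format the single string argument passed to AdminTask.submitJob.

    `job_params` is a list of (name, value) pairs so that order is kept.
    Values containing spaces are NOT handled by this sketch.
    """
    params = " ".join(["[%s %s]" % (name, value) for name, value in job_params])
    return "[-jobType %s -group %s -description %s -jobParams [ %s ]]" % (
        job_type, group, description, params)


if __name__ == "__main__":
    args = build_submit_job_args("installIM", "redhat", "installIM", [
        ("skipPrereqCheck", "FALSE"),
        ("installType", "auto"),
        ("acceptLicense", "TRUE"),
        ("installPath", "/opt/IBM/InstallationManager"),
        ("dataPath", "/home/wasadmin/var/ibm/InstallationManager"),
        ("kitPath", "/opt/IBM/WebSphere/profiles/JobMgr01/IMKits/"
                    "agent.installer.linux.gtk.x86_1.6.0.20120831_1216.zip"),
    ])
    print(args)
    # Inside a wsadmin session this string would be passed as:
    #   AdminTask.submitJob(args)
```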
