Friday, 28 May 2010

Lotus Connections 2.5.0.1 - Problems with Service Integration Bus after configuring Shared Message Store

During the final phase of configuring a two-node cluster for Lotus Connections 2.5, I was attempting to configure the Shared Message Store so that all LC features deployed as clusters across both nodes could see messages and logs.

The Information Centre covers this in depth.

In essence, I'd created a series of directories on an NFS server ( running NFS v3 on Red Hat Enterprise Linux ), as follows: -

mkdir /net/data/collaboration/messagestore
mkdir /net/data/collaboration/messagestore/Activities
mkdir /net/data/collaboration/messagestore/Blogs
mkdir /net/data/collaboration/messagestore/Communities
mkdir /net/data/collaboration/messagestore/Dogear
mkdir /net/data/collaboration/messagestore/Files
mkdir /net/data/collaboration/messagestore/Homepage
mkdir /net/data/collaboration/messagestore/Profiles
mkdir /net/data/collaboration/messagestore/Wikis

and then created eight new members of the WebSphere Service Integration Bus ( SIBus ), one for each of the clustered LC features.

Each bus member has two directories; one for logs and one for messages: -

/net/data/collaboration/messagestore/<clusterName>/log
/net/data/collaboration/messagestore/<clusterName>/store

these two subdirectories being created when the cluster is first started ( which, in turn, starts the bus ).
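
As an aside, the eight mkdir commands can be collapsed into a loop. This is only a sketch: the function name create_message_stores is my own invention, and the default base path is simply the one used in this post. It deliberately does not create the log and store subdirectories, since those appear when the cluster first starts.

```shell
#!/bin/sh
# Sketch: create one message store directory per clustered LC feature.
# create_message_stores is a hypothetical helper, not part of Connections.
create_message_stores() {
    # Default to the path used in this post; pass another base as $1.
    base="${1:-/net/data/collaboration/messagestore}"
    for feature in Activities Blogs Communities Dogear Files Homepage Profiles Wikis; do
        # -p also creates the base directory and is harmless on a re-run.
        mkdir -p "$base/$feature" || return 1
    done
}
```

Running create_message_stores with no argument on the NFS server should produce the same layout as the individual commands.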

So far, so good.

I'd already verified that I could write to, and read from, the NFS server by creating, editing, viewing and deleting files from both nodes on the shared NFS mount, which is mounted automatically via /etc/fstab when Linux starts.

However, whilst I could start my clusters after making these changes, the SIBus members never started completely, and merely showed as "Starting".

In order to diagnose the problem further, I stopped all of the clusters, stopped the node agents, cleared down the logs, started ONE node agent, and started one cluster ( Activities ), which meant that I only had one JVM on one node to play with.

I then monitored the logs and, voilà, I found these messages: -

[21/05/10 14:47:38:723 BST] 0000002d SibMessage    E   [ConnectionsBus:Activities.000-ConnectionsBus] CWSIS1592E: The file store has caught an unexpected io exception.
[21/05/10 14:47:38:724 BST] 0000002d SibMessage    I   [ConnectionsBus:Activities.000-ConnectionsBus] CWSIS1582I: The file store had a problem initialising its log file but will attempt to retry.
[21/05/10 14:47:43:731 BST] 0000002d SibMessage    I   [ConnectionsBus:Activities.000-ConnectionsBus] CWSIS1581I: The file store is attempting to initalise its log file: /net/data/collaboration/messagestore/Activities/log/Log
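
To pick messages like these out of a busy log, a filter on the CWSIS prefix is usually enough. The sketch below makes a few assumptions: the function name sib_filestore_messages is hypothetical, and the log path is whatever your profile uses ( on my systems it is SystemOut.log under the profile's logs/<serverName> directory, but check your own ).

```shell
#!/bin/sh
# Sketch: list SIBus message store ( CWSIS ) messages from a log file.
# sib_filestore_messages is a hypothetical helper name.
sib_filestore_messages() {
    # Match CWSISnnnn followed by a severity letter: E, W or I.
    grep -E 'CWSIS[0-9]{4}[EWI]:' "$1"
}
```

For example, sib_filestore_messages /path/to/SystemOut.log would print only the file store lines, which makes the retry loop easy to spot.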

When I checked the normal Linux error log, via the dmesg command, I also found: -

SELinux: initialized (dev 0:13, type nfs), uses genfs_contexts
lockd: server 192.168.113.97 not responding, still trying

Working with the networking specialists at the client site, it turned out that the iptables firewall on the NFS server was misconfigured, and was blocking me. However, the problem was more subtle than a blanket block, as my tests had proved that NFS writes and reads were working OK.

The problem, as seen from dmesg, was with the Lock Daemon ( lockd ), which was being blocked.
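
With hindsight, my read/write tests never asked for a file lock, so they could not have caught this. A test along the following lines might have. It is a sketch under a few assumptions: the flock utility ( from util-linux ) is installed on the client, the kernel is recent enough ( 2.6.12 or later, as far as I know ) for flock requests on NFS to be passed to the server as byte-range locks, and nfs_lock_check is just a name I made up.

```shell
#!/bin/sh
# Sketch: check that an exclusive lock can be taken on a file in a directory.
# nfs_lock_check is a hypothetical helper; it relies on flock from util-linux.
nfs_lock_check() {
    dir="${1:?usage: nfs_lock_check DIRECTORY}"
    file="$dir/.lockcheck.$$"
    (
        # Open the test file on descriptor 9 ...
        exec 9>"$file" || exit 2
        # ... then request an exclusive lock, giving up after 10 seconds.
        flock -x -w 10 9
    )
    status=$?
    rm -f "$file"
    return "$status"
}
```

Something like nfs_lock_check /net/data/collaboration/messagestore/Activities && echo "locking OK" on each node would exercise the lock path as well as plain file I/O. Be aware that, depending on the mount options, a blocked lockd may still cause the call to hang rather than time out.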

Using the NFS v3 protocol ( which is supported by Connections ), the ports that needed to be opened on the NFS server were: -

LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769


or, in other words: -

32803/tcp
32769/udp
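
For reference, LOCKD_TCPPORT and LOCKD_UDPPORT are the settings in /etc/sysconfig/nfs on Red Hat Enterprise Linux that pin lockd to fixed ports. The matching firewall rules might look like the output of the sketch below. I did not make the change myself, so treat this as an assumption: the function name lockd_firewall_rules is hypothetical, the INPUT chain may not be the right one on your server, and the function only prints the commands so that they can be reviewed before anyone runs them as root.

```shell
#!/bin/sh
# Sketch: print ( not run ) iptables rules that would admit lockd traffic.
# lockd_firewall_rules is a hypothetical helper; the chain name is an assumption.
lockd_firewall_rules() {
    # Defaults are the ports quoted in this post.
    tcp_port="${1:-32803}"
    udp_port="${2:-32769}"
    echo "iptables -I INPUT -p tcp --dport $tcp_port -j ACCEPT"
    echo "iptables -I INPUT -p udp --dport $udp_port -j ACCEPT"
}
```

Afterwards, running rpcinfo -p against the NFS server from one of the nodes should list nlockmgr on those ports.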

Once these changes were made, and the NFS server was rebooted, the SIBus burst into life and Connections started .... connecting.

The moral of the story - get to know and love your network specialist :-)

4 comments:

Sharon Bellamy said...

fantastic .. thanks for sharing this one Mr Hay sir

peter greaves said...

why did you go for NFSv3? i ask because....the IBM w3 connections team recommend NFSv4 because of locking improvements. this was for shared files - but is there a connection?

whenever i type NFS i want to type NSF...:)

Dave Hay said...

Peter, hmmm, good point - will check with my client - for some reason, I have NFS v3 on the brain, will update tomorrow, Dave

Dave Hay said...

Nope, we were definitely using NFS version 3, as my client's network team had never managed to get v4 working on Linux. It worked like a dream, regards, Dave