Tuesday, 2 December 2014

IBM BPM Advanced - "The socket name is already in use" and "UNABLE_TO_CREATE_SSL_SERVER_SOCKET"

Context

This relates to a two cell, two node, multi-cluster IBM BPM Advanced 8.5.0.1 environment, installed and running on AIX.

We have four WAS profiles: two Deployment Manager and two Application Server, each with its own profile root.

This provides a pair of completely isolated Process Server environments.

This is the first time both cells (aka Deployment Environments) have been running at the same time, hence the first time this problem has shown itself.

Symptoms

The SupCluster member on the second node (SupClusterMember2) fails to start properly, with the following in SystemOut.log: -

[02/12/14 12:12:55:644 GMT] 00000001 SecurityCompo A   JSAS0009I: IOR interceptor registered.
[02/12/14 12:12:56:051 GMT] 00000001 FfdcProvider  W com.ibm.ws.ffdc.impl.FfdcProvider logIncident FFDC1003I: FFDC Incident emitted on /opt/ibm/WebSphereProfiles/PSCell2AppSrv02/logs/ffdc/SupClusterMember2_5b51e880_14.12.02_12.12.56.0313859347108527407738.txt com.ibm.ws.security.orbssl.WSSSLServerSocketFactoryImpl.createSSLServerSocket 368
[02/12/14 12:12:56:062 GMT] 00000001 FfdcProvider  W com.ibm.ws.ffdc.impl.FfdcProvider logIncident FFDC1003I: FFDC Incident emitted on /opt/ibm/WebSphereProfiles/PSCell2AppSrv02/logs/ffdc/SupClusterMember2_5b51e880_14.12.02_12.12.56.0522391579609472805370.txt com.ibm.ws.security.orbssl.WSSSLServerSocketFactoryImpl.createSSLServerSocket 459
[02/12/14 12:12:56:163 GMT] 00000001 FfdcProvider  W com.ibm.ws.ffdc.impl.FfdcProvider logIncident FFDC1003I: FFDC Incident emitted on /opt/ibm/WebSphereProfiles/PSCell2AppSrv02/logs/ffdc/SupClusterMember2_5b51e880_14.12.02_12.12.56.0627177776977967831938.txt com.ibm.ws.orbimpl.transport.WSTransport.createServerSocket 1439
[02/12/14 12:12:56:170 GMT] 00000001 ORBRas        E com.ibm.ws.orbimpl.transport.WSTransport createServerSocket P=348892:O=0:CT ORBX0390E: Cannot create listener thread. Exception=[ org.omg.CORBA.INTERNAL: CAUGHT_EXCEPTION_WHILE_CONFIGURING_SSL_SERVER_SOCKET, Exception=org.omg.CORBA.INTERNAL: UNABLE_TO_CREATE_SSL_SERVER_SOCKET Exception=java.net.BindException: The socket name is already in use.  vmcid: 0x49421000  minor code: 76  completed: No  vmcid: 0x49421000  minor code: 77  completed: No - received while attempting to open server socket on port 9443 ].
[02/12/14 12:12:56:250 GMT] 00000001 FfdcProvider  W com.ibm.ws.ffdc.impl.FfdcProvider logIncident FFDC1003I: FFDC Incident emitted on /opt/ibm/WebSphereProfiles/PSCell2AppSrv02/logs/ffdc/SupClusterMember2_5b51e880_14.12.02_12.12.56.1728772487686020602763.txt com.ibm.ws.orbimpl.transport.WSTransport.startListening 805
[02/12/14 12:12:56:324 GMT] 00000001 FfdcProvider  W com.ibm.ws.ffdc.impl.FfdcProvider logIncident FFDC1003I: FFDC Incident emitted on /opt/ibm/WebSphereProfiles/PSCell2AppSrv02/logs/ffdc/SupClusterMember2_5b51e880_14.12.02_12.12.56.2501689457682466448304.txt com.ibm.ws.orbimpl.transport.WSTransport.createListener 724 
[02/12/14 12:12:56:325 GMT] 00000001 WsServerImpl  E   WSVR0009E: Error occurred during startup com.ibm.ws.exception.RuntimeError: org.omg.CORBA.INTERNAL: CREATE_LISTENER_FAILED_4  vmcid: 0x49421000  minor code: 56 completed: No 

The relevant details are: -

The socket name is already in use

received while attempting to open server socket on port 9443

CAUGHT_EXCEPTION_WHILE_CONFIGURING_SSL_SERVER_SOCKET

UNABLE_TO_CREATE_SSL_SERVER_SOCKET

In this context, port 9443 is the port assigned to the CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS endpoint of the failing server.
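A BindException of this kind simply means that something on the host already holds the port. As a quick sanity check before digging into configuration, the condition can be reproduced with a bind test. This is a minimal sketch, assuming Python 3 is available on the host; the function name port_is_free is made up for illustration and is not part of WAS or the original troubleshooting.

```python
import errno
import socket


def port_is_free(port, host=""):
    """Return True if a TCP socket can bind to (host, port) right now.

    host="" means all interfaces, the equivalent of host="*" in
    serverindex.xml. port_is_free is a hypothetical helper name.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False  # same condition the JVM reports as BindException
        raise  # anything else (e.g. permissions) is a different problem
    finally:
        sock.close()
    return True
```

Calling port_is_free(9443) while the other cell is running should return False if the port really is taken. The check only reflects the moment it runs, so it complements the configuration review rather than replacing it.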

Investigation

We need to find out what other WAS service is blocking this port: -

Cell1

/opt/ibm/WebSphereProfiles/PSCell1Dmgr01/config/cells/PSCell1/nodes/AppSrv01Node/serverindex.xml

...
    <specialEndpoints xmi:id="NamedEndPoint_1416833240651" endPointName="WC_defaulthost_secure">
      <endPoint xmi:id="EndPoint_1416833240651" host="*" port="9443"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416828930395" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416828930396" host="bpm101.uk.ibm.com" port="9201"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416833240645" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416833240645" host="bpm101.uk.ibm.com" port="9405"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416833246210" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416833246210" host="bpm101.uk.ibm.com" port="9408"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416833248761" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416833248761" host="bpm101.uk.ibm.com" port="9411"/>
...

/opt/ibm/WebSphereProfiles/PSCell1Dmgr01/config/cells/PSCell1/nodes/AppSrv02Node/serverindex.xml

...
    <specialEndpoints xmi:id="NamedEndPoint_1416829075313" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416829075314" host="bpm101.uk.ibm.com" port="9204"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416833268778" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416833268778" host="bpm101.uk.ibm.com" port="9414"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416833270982" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416833270982" host="bpm101.uk.ibm.com" port="9417"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416833273046" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416833273046" host="bpm101.uk.ibm.com" port="9421"/>
...

Cell2

/opt/ibm/WebSphereProfiles/PSCell2Dmgr01/config/cells/PSCell2/nodes/AppSrv01Node/serverindex.xml

...
    <specialEndpoints xmi:id="NamedEndPoint_1416908172330" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416908172332" host="bpm101.uk.ibm.com" port="9206"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416911237224" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416911237224" host="bpm101.uk.ibm.com" port="9428"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416911243482" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416911243482" host="bpm101.uk.ibm.com" port="9431"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416911246325" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416911246325" host="bpm101.uk.ibm.com" port="9434"/>
...

/opt/ibm/WebSphereProfiles/PSCell2Dmgr01/config/cells/PSCell2/nodes/AppSrv02Node/serverindex.xml

...
    <specialEndpoints xmi:id="NamedEndPoint_1416908394097" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416908394098" host="bpm101.uk.ibm.com" port="9208"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416911266247" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416911266247" host="bpm101.uk.ibm.com" port="9437"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416911268577" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416911268577" host="bpm101.uk.ibm.com" port="9440"/>
...
    <specialEndpoints xmi:id="NamedEndPoint_1416911270874" endPointName="CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS">
      <endPoint xmi:id="EndPoint_1416911270874" host="bpm101.uk.ibm.com" port="9443"/>
...

Investigation shows that 9443 is being used by: -

(a) WC_defaulthost_secure, in Cell1 on AppSrv01Node, listening on all interfaces (host="*")

and: -

(b) CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS, in Cell2 on AppSrv02Node
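Comparing the serverindex.xml files by eye is error-prone, so the same comparison can be scripted. The following is a rough sketch in Python 3, not an IBM-supplied tool. It assumes each specialEndpoints element sits inside an element that carries a serverName attribute (the excerpts above do not show that part of the file), and the function names are invented for illustration.

```python
import sys
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_endpoints(path):
    """Yield (port, server, endpoint_name, host) for one serverindex.xml."""
    root = ET.parse(path).getroot()
    # Map each element to its parent so the owning server can be looked up.
    parents = {child: parent for parent in root.iter() for child in parent}
    for special in root.iter("specialEndpoints"):
        parent = parents.get(special)
        # Assumption: the enclosing element carries a serverName attribute.
        server = parent.get("serverName", "?") if parent is not None else "?"
        name = special.get("endPointName", "?")
        for end_point in special.iter("endPoint"):
            port = end_point.get("port")
            if port is not None:
                yield int(port), server, name, end_point.get("host")


def find_conflicts(paths):
    """Map each port claimed more than once to the list of its claimants."""
    claims = defaultdict(list)
    for path in paths:
        for port, server, name, host in collect_endpoints(path):
            claims[port].append((path, server, name, host))
    return {port: users for port, users in claims.items() if len(users) > 1}


if __name__ == "__main__":
    # Pass every serverindex.xml from both cells on the command line.
    for port, users in sorted(find_conflicts(sys.argv[1:]).items()):
        print(port)
        for user in users:
            print("   ", *user)
```

A port listed by this script is only a candidate conflict: endpoints clash only when they end up on the same host, as they do here because both cells run on bpm101.uk.ibm.com.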

We need to move the CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS endpoint to a different, unique port. We ended up with 9943.

Finding a unique port was a matter of trial and error, checking each candidate against every serverindex.xml: -

cat /opt/ibm/WebSphereProfiles/PSCell1Dmgr01/config/cells/PSCell1/nodes/AppSrv01Node/serverindex.xml | grep 9943

cat /opt/ibm/WebSphereProfiles/PSCell1Dmgr01/config/cells/PSCell1/nodes/AppSrv02Node/serverindex.xml | grep 9943

cat /opt/ibm/WebSphereProfiles/PSCell2Dmgr01/config/cells/PSCell2/nodes/AppSrv01Node/serverindex.xml | grep 9943

cat /opt/ibm/WebSphereProfiles/PSCell2Dmgr01/config/cells/PSCell2/nodes/AppSrv02Node/serverindex.xml | grep 9943

cat /opt/ibm/WebSphereProfiles/PSCell1Dmgr01/config/cells/PSCell1/nodes/dmgr/serverindex.xml | grep 9943

cat /opt/ibm/WebSphereProfiles/PSCell2Dmgr01/config/cells/PSCell2/nodes/dmgr/serverindex.xml | grep 9943

These commands return no matches, which indicates that 9943 is not used anywhere in either cell's configuration.

Validation

cd /opt/ibm/WebSphereProfiles/PSCell1Dmgr01/config/cells/PSCell1
cat `find . -name serverindex.xml` | grep 9943

cd /opt/ibm/WebSphereProfiles/PSCell2Dmgr01/config/cells/PSCell2
cat `find . -name serverindex.xml` | grep 9943
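One caveat with a plain grep for 9943 is that it matches those digits anywhere on a line, including inside the timestamp-style xmi:id values. That can only produce false positives, so the check stays safe, but a match on the port attribute itself gives a cleaner answer. A small sketch in Python 3, where the helper name files_using_port is made up for illustration:

```python
import re


def files_using_port(paths, port):
    """Return the paths whose text contains an exact port="<port>" attribute.

    files_using_port is a hypothetical helper, not part of WAS.
    """
    # \b stops the pattern from matching attributes that merely end in "port".
    pattern = re.compile(r'\bport="%d"' % port)
    hits = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            if pattern.search(handle.read()):
                hits.append(path)
    return hits
```

Feeding it the list of serverindex.xml files from both cells and the candidate port should return an empty list when the port is unused.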

Execute Change

/opt/IBM/WebSphere/AppServer/profiles/Dmgr01/bin/wsadmin.sh -lang jython -host `hostname` -port 8879
AdminTask.modifyServerPort('SupClusterMember2', '[-nodeName AppSrv02Node -endPointName CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS -host bpm101.uk.ibm.com -port 9943 -modifyShared true]')
AdminConfig.save() 
AdminNodeManagement.syncActiveNodes() 

Additional

We also saw the same problem for CSIV2_SSL_MUTUALAUTH_LISTENER_ADDRESS == 9444

We made the same port change from 9444 to 9944: -

/opt/IBM/WebSphere/AppServer/profiles/Dmgr01/bin/wsadmin.sh -lang jython -host `hostname` -port 8879
AdminTask.modifyServerPort('SupClusterMember2', '[-nodeName AppSrv02Node -endPointName CSIV2_SSL_MUTUALAUTH_LISTENER_ADDRESS -host bpm101.uk.ibm.com -port 9944 -modifyShared true]')
AdminConfig.save() 
AdminNodeManagement.syncActiveNodes() 
