I'm just completing the build of an IBM BPM Advanced 8.5.5 environment, setting up Process Server on one VM to be available for online deployments from a Process Center on another VM.
All was looking good until I tried to start the third cluster, AppCluster, which contains the Business Process Model and Notation (BPMN) run-time.
The start process stalled, and I found: -
...
[25/08/14 18:41:30:302 BST] 0000011b wle_repocore_ W CWLLG4010W: The repository contact failed due to untrusted SSL certificate. Process Center's SSL certificate needs to be trusted by Process Server, please see: http://pic.dhe.ibm.com/infocenter/dmndhelp/v8r5m0/topic/com.ibm.wbpm.admin.doc/topics/tins_cnfg_ssl_nd.html
[25/08/14 18:41:40:350 BST] 0000011b wle_repocore_ W CWLLG4010W: The repository contact failed due to untrusted SSL certificate. Process Center's SSL certificate needs to be trusted by Process Server, please see: http://pic.dhe.ibm.com/infocenter/dmndhelp/v8r5m0/topic/com.ibm.wbpm.admin.doc/topics/tins_cnfg_ssl_nd.html
[25/08/14 18:41:50:378 BST] 0000011b wle_repocore_ W CWLLG4010W: The repository contact failed due to untrusted SSL certificate. Process Center's SSL certificate needs to be trusted by Process Server, please see: http://pic.dhe.ibm.com/infocenter/dmndhelp/v8r5m0/topic/com.ibm.wbpm.admin.doc/topics/tins_cnfg_ssl_nd.html
...
in SystemOut.log.
The warning points to the Information Center topic linked in the message, which really tells me what I already knew, i.e. that I need to retrieve the endpoint certificate from Process Center ( actually from IBM HTTP Server fronting PC ) into the Process Server's cell default trust-store.
This I did via a Jython script: -
/opt/IBM/WebSphere/AppServer/profiles/Dmgr01/bin/wsadmin.sh -lang jython -host `hostname` -port 8879
cellID=AdminControl.getCell()
AdminTask.retrieveSignerFromPort('[-keyStoreName CellDefaultTrustStore -keyStoreScope (cell):'+cellID+' -host rhel6.uk.ibm.com -port 8443 -certificateAlias ProcessCenter -sslConfigName CellDefaultSSLSettings -sslConfigScopeName (cell):'+cellID+' ]')
AdminConfig.save()
AdminNodeManagement.syncActiveNodes()
quit
( port 8443 is the HTTPS port on which IHS listens )
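The long argument string passed to AdminTask.retrieveSignerFromPort is easy to break when it is concatenated by hand. As a rough sketch only, a small helper could assemble it from its parts. The function name signer_from_port_args is hypothetical (it is not part of wsadmin), and the cell name bpm855Cell is taken from the configuration output later in this post:

```python
def signer_from_port_args(cell, host, port, alias):
    """Build the bracketed argument string for AdminTask.retrieveSignerFromPort.

    Hypothetical helper, not part of wsadmin: it only assembles the same
    options that the command in this post passes by hand.
    """
    scope = "(cell):" + cell
    parts = [
        "-keyStoreName CellDefaultTrustStore",
        "-keyStoreScope " + scope,
        "-host " + host,
        "-port " + str(port),
        "-certificateAlias " + alias,
        "-sslConfigName CellDefaultSSLSettings",
        "-sslConfigScopeName " + scope,
    ]
    # The post's command ends with " ]", so keep the same shape here.
    return "[" + " ".join(parts) + " ]"


args = signer_from_port_args("bpm855Cell", "rhel6.uk.ibm.com", 8443, "ProcessCenter")
```

Inside a wsadmin session the result would then be passed as AdminTask.retrieveSignerFromPort(args), followed by the same save and node sync as above.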
Once I did this, and restarted the cluster, it started up without problems but .....
I now see this: -
...
[25/08/14 18:56:57:511 BST] 000000ee wle_repocore_ W CWLLG0098W: Unable to connect to Process Center.
[25/08/14 18:57:07:536 BST] 000000ee wle_repocore_ W CWLLG0098W: Unable to connect to Process Center.
[25/08/14 18:57:17:554 BST] 000000ee wle_repocore_ W CWLLG0098W: Unable to connect to Process Center.
...
in SystemOut.log and this: -
...
[25/08/14 19:05:08:605 BST] FFDC Exception:java.net.ConnectException SourceId:com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat.callProcessCenter ProbeId:518 Reporter:com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat@b975d2a1
java.net.ConnectException: Connection refused
....
in the node's FFDC logs.
When I looked at the error_log for the IHS instance fronting Process Center, I can see this: -
...
[Mon Aug 25 18:46:57 2014] [error] [client 10.99.79.100] [7f1a780147b0] [7824] SSL0279E: SSL Handshake Failed due to fatal alert from client. Client sent fatal alert [level 2 (fatal), description 46 (certificate_unknown)] [10.99.79.100:44309 -> 10.99.79.101:8443] [18:46:57.000038538] 0ms
[Mon Aug 25 18:47:00 2014] [error] [client 10.99.79.100] [7f1a58004180] [6406] SSL0279E: SSL Handshake Failed due to fatal alert from client. Client sent fatal alert [level 2 (fatal), description 46 (certificate_unknown)] [10.99.79.100:40941 -> 10.99.79.101:8443] [18:47:00.000955120] 0ms
[Mon Aug 25 18:52:37 2014] [error] [client 10.99.79.100] [7f1a700147b0] [7824] SSL0279E: SSL Handshake Failed due to fatal alert from client. Client sent fatal alert [level 2 (fatal), description 46 (certificate_unknown)] [10.99.79.100:41584 -> 10.99.79.101:8443] [18:52:37.000689637] 0ms
...
whereas I see nothing in the corresponding SystemOut.log for the AppCluster cluster member on the Process Center box.
This suggests that IHS ( Process Center ) is receiving an HTTPS connection from the AppCluster JVM ( Process Server ), but doesn't have a corresponding SSL certificate to decrypt the connection.
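When the error_log is long, it can help to summarise the SSL0279E entries by client and alert type rather than reading them one by one. The following is a minimal sketch that assumes the line layout shown in the excerpt above. The regular expression and the function name summarise_alerts are my own, not anything shipped with IHS:

```python
import re
from collections import Counter

# Matches the client address and the TLS alert description in SSL0279E lines.
ALERT_RE = re.compile(
    r"\[client (?P<client>[\d.]+)\].*?SSL0279E:.*?"
    r"description (?P<code>\d+) \((?P<name>\w+)\)"
)


def summarise_alerts(lines):
    """Count SSL0279E entries per (client, alert code, alert name)."""
    counts = Counter()
    for line in lines:
        match = ALERT_RE.search(line)
        if match:
            key = (match.group("client"), int(match.group("code")), match.group("name"))
            counts[key] += 1
    return counts


SAMPLE = [
    "[Mon Aug 25 18:46:57 2014] [error] [client 10.99.79.100] [7f1a780147b0] [7824] SSL0279E: SSL Handshake Failed due to fatal alert from client. Client sent fatal alert [level 2 (fatal), description 46 (certificate_unknown)] [10.99.79.100:44309 -> 10.99.79.101:8443] [18:46:57.000038538] 0ms",
    "[Mon Aug 25 18:47:00 2014] [error] [client 10.99.79.100] [7f1a58004180] [6406] SSL0279E: SSL Handshake Failed due to fatal alert from client. Client sent fatal alert [level 2 (fatal), description 46 (certificate_unknown)] [10.99.79.100:40941 -> 10.99.79.101:8443] [18:47:00.000955120] 0ms",
]

summary = summarise_alerts(SAMPLE)
```

Note that the alert here is sent by the client ( Process Server ), and a certificate_unknown alert generally means the sender could not accept the certificate it was shown, which fits with the PPS at the end of this post.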
I fixed this by exporting the signer certificate from the cell-default trust store ( Process Server ): -
/opt/IBM/WebSphere/AppServer/profiles/Dmgr01/bin/wsadmin.sh -lang jython -host `hostname` -port 8879
cellID=AdminControl.getCell()
AdminTask.extractSignerCertificate('[-keyStoreName CellDefaultTrustStore -keyStoreScope (cell):'+cellID+' -certificateFilePath /tmp/rhel6.uk.ibm.com -base64Encoded false -certificateAlias root ]')
'/tmp/rhel6.uk.ibm.com'
quit
and imported it into the IHS key store: -
/opt/IBM/HTTPServer/bin/gskcapicmd -cert -add -db /opt/IBM/HTTPServer/ssl/keystore.kdb -pw passw0rd -file ~/rhel6.uk.ibm.com
and validated it as follows: -
/opt/IBM/HTTPServer/bin/gskcapicmd -cert -list -db /opt/IBM/HTTPServer/ssl/keystore.kdb -pw passw0rd
Whilst this resolved the IHS exception, I'm still seeing: -
...
[25/08/14 18:56:57:511 BST] 000000ee wle_repocore_ W CWLLG0098W: Unable to connect to Process Center.
...
This led me to check the Process Server configuration, in terms of what it knows about the Process Center box: -
/opt/IBM/WebSphere/AppServer/profiles/Dmgr01/bin/wsadmin.sh -lang jython -host `hostname` -port 8879
ps = AdminConfig.getid("/Cell:/ServerCluster:AppCluster/BPMClusterConfigExtension:/BPMProcessServer:/")
print AdminConfig.show(ps)
[authoringEnvironmentPortalPrefix portal]
[baseUrl teamworks/webservices]
[bpdTrackingEnabledDefault false]
[clientLink teamworks]
[coachDesignerXslUrl teamworks/coachdesigner/transform/CoachDesigner.xsl]
[commonPortalPrefix portal]
[consoleSections [root(cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMConsoleSection_1408983776561) console.lombardi.admin(cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMConsoleSection_1408983776562) console.user.management(cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMConsoleSection_1408983776563) console.monitoring(cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMConsoleSection_1408983776564) console.event.manager(cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMConsoleSection_1408983776565) console.admin.tools(cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMConsoleSection_1408983776566)]]
[defaultNamespaceUri schema/]
[heartBeatInterval 10]
[httpProtocolOnly true]
[imagePrefix teamworks]
[processAdminPrefix ProcessAdmin]
[processCenterInternalUrl https://rhel6.uk.ibm.com:8443/ProcessCenterInternal]
[processCenterUrl https://rhel6.uk.ibm.com:8443/ProcessCenter]
[processHelpWikiUrlEdit processhelp/en/Special:Edit?topic=%TITLE%&teamworksTitle=%TEAMWORKS_TITLE%]
[processHelpWikiUrlView processhelp/en/%TITLE%?teamworksTitle=%TEAMWORKS_TITLE%]
[repositoryPrefix ProcessCenter]
[security (cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMServerSecurity_1408983776561)]
[servletPrefix teamworks]
[teamworksWebappPrefix teamworks]
[useHTTPSURLPrefixes true]
[virtualHost (cells/bpm855Cell/clusters/AppCluster|cluster-bpm.xml#BPMVirtualHostInfo_1408983776561)]
[webapiPrefix webapi]
Whilst the processCenterInternalUrl and processCenterUrl entries were quite correct, the parameter httpProtocolOnly should've been set to FALSE.
I changed this during the same wsadmin session: -
AdminConfig.modify(ps, [['httpProtocolOnly','false']])
AdminConfig.save()
AdminNodeManagement.syncActiveNodes()
and validated the change: -
ps = AdminConfig.getid("/Cell:/ServerCluster:AppCluster/BPMClusterConfigExtension:/BPMProcessServer:/")
print AdminConfig.show(ps)
...
[httpProtocolOnly false]
...
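Eyeballing the AdminConfig.show output for one flag is error-prone, so here is a rough sketch of how the output could be checked outside wsadmin, for example after saving it to a file. The helper names parse_show and https_conflicts are hypothetical, and the rule being checked ( an https Process Center URL together with httpProtocolOnly set to true ) is simply the combination that caused trouble in this post, not a documented validation rule:

```python
def parse_show(output):
    """Turn AdminConfig.show-style '[key value]' lines into a dict of strings."""
    attrs = {}
    for line in output.splitlines():
        line = line.strip()
        if not (line.startswith("[") and line.endswith("]")):
            continue  # skip '...' separators and anything else
        key, _, value = line[1:-1].partition(" ")
        attrs[key] = value
    return attrs


def https_conflicts(attrs):
    """List URL attributes that use https while httpProtocolOnly is true."""
    if attrs.get("httpProtocolOnly") != "true":
        return []
    return sorted(
        key for key in ("processCenterUrl", "processCenterInternalUrl")
        if attrs.get(key, "").startswith("https://")
    )


BEFORE = """
[heartBeatInterval 10]
[httpProtocolOnly true]
[processCenterInternalUrl https://rhel6.uk.ibm.com:8443/ProcessCenterInternal]
[processCenterUrl https://rhel6.uk.ibm.com:8443/ProcessCenter]
"""
AFTER = BEFORE.replace("[httpProtocolOnly true]", "[httpProtocolOnly false]")

before_problems = https_conflicts(parse_show(BEFORE))
after_problems = https_conflicts(parse_show(AFTER))
```

With the values from this post, the BEFORE text flags both URL attributes and the AFTER text flags none.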
This got me further forward, but I'm still seeing: -
...
[25/08/14 20:52:33:059 BST] 000000ed wle_repocore_ W CWLLG0098W: Unable to connect to Process Center.
[25/08/14 20:52:43:078 BST] 000000ed wle_repocore_ W CWLLG0098W: Unable to connect to Process Center.
[25/08/14 20:52:53:107 BST] 000000ed wle_repocore_ W CWLLG0098W: Unable to connect to Process Center.
....
This led me to the FFDC logs ( /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/logs/ffdc ): -
[25/08/14 20:52:23:026 BST] FFDC Exception:javax.net.ssl.SSLException SourceId:com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat.callProcessCenter ProbeId:518 Reporter:com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat@4ebf502e
javax.net.ssl.SSLException: hostname in certificate didn't match: <rhel6.uk.ibm.com> != <"rhel6.uk.ibm.com>
at org.apache.commons.httpclient.protocol.AbstractVerifier.verify(Unknown Source)
at org.apache.commons.httpclient.protocol.BrowserCompatHostnameVerifier.verify(Unknown Source)
at org.apache.commons.httpclient.protocol.AbstractVerifier.verify(Unknown Source)
at org.apache.commons.httpclient.protocol.AbstractVerifier.verify(Unknown Source)
at org.apache.commons.httpclient.protocol.SSLProtocolSocketFactory.createSocket(Unknown Source)
at org.apache.commons.httpclient.protocol.SSLProtocolSocketFactory.createSocket(Unknown Source)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at com.lombardisoftware.server.util.HttpUtils.execute(HttpUtils.java:123)
at com.lombardisoftware.server.util.HttpUtils.executeWithBasicAuthenticationInternal(HttpUtils.java:109)
at com.lombardisoftware.server.util.HttpUtils.executeWithBasicAuthentication(HttpUtils.java:63)
at com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat.contactProcessCenterInternal(RepositoryHeartbeat.java:609)
at com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat.callProcessCenter(RepositoryHeartbeat.java:490)
at com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat.registerWithRepository(RepositoryHeartbeat.java:433)
at com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat.beat(RepositoryHeartbeat.java:366)
at com.lombardisoftware.servlet.heartbeat.RepositoryHeartbeat.run(RepositoryHeartbeat.java:169)
Interestingly, when I used "Retrieve from port" in the WAS Integrated Solutions Console (ISC), I also noticed that the Distinguished Name (DN) of the certificate was: -
rather than: -
In other words, the quotes ( " ) were creeping in somewhere .....
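To illustrate how a stray quote can end up in the name being compared, here is a minimal sketch of a hostname check that extracts the CN by splitting the subject DN on commas without honouring DN quoting. That splitting behaviour is an assumption about the verifier, and BAD_DN is a hypothetical example of a quoted DN, not the actual value from my certificate:

```python
def naive_cn(subject_dn):
    """Return the first CN value, splitting on commas and ignoring DN quoting."""
    for token in subject_dn.split(","):
        token = token.strip()
        if token.upper().startswith("CN="):
            return token[3:]
    return None


def hostname_matches(host, subject_dn):
    """Compare the requested host with the naively extracted CN."""
    cn = naive_cn(subject_dn)
    return cn is not None and cn.lower() == host.lower()


HOST = "rhel6.uk.ibm.com"
GOOD_DN = "CN=rhel6.uk.ibm.com,O=ibm,C=uk"
# Hypothetical DN in which the whole string ended up inside one quoted CN value.
BAD_DN = 'CN="rhel6.uk.ibm.com,o=ibm,c=uk"'

good_cn = naive_cn(GOOD_DN)
bad_cn = naive_cn(BAD_DN)
```

Under those assumptions the bad DN yields a CN with a leading quote, which would produce exactly the kind of mismatch reported in the FFDC exception above.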
Given that the "failing" SSL certificate was coming from IHS ( remember, Process Server "talks" to Process Center via IHS, rather than the WAS Web Container port ), I looked at the process that I had used to generate SSL certificates in IHS.
Lo and behold .....
This is the command that I was using to generate the self-signed certificate in IHS: -
/opt/IBM/HTTPServer/bin/gskcapicmd -cert -create -db /opt/IBM/HTTPServer/ssl/keystore.kdb -pw passw0rd -size 2048 -dn cn=rhel6.uk.ibm.com\\,o=ibm\\,c=uk -label "rhel6.uk.ibm.com" -default_cert yes
I strongly suspect that the addition of the \\ characters was somehow causing the problem.
I removed the certificate and then recreated it WITHOUT the \\ characters: -
/opt/IBM/HTTPServer/bin/gskcapicmd -cert -create -db /opt/IBM/HTTPServer/ssl/keystore.kdb -pw passw0rd -size 2048 -dn cn=rhel6.uk.ibm.com,o=ibm,c=uk -label "rhel6.uk.ibm.com" -default_cert yes
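To see what a POSIX shell actually hands to gskcapicmd as the -dn value with and without the \\ characters, Python's shlex module can be used to mimic the shell's word splitting. This is only a sketch: shlex approximates the shell, and how gskcapicmd then interprets the value is an assumption on my part. The helper name dn_argument is hypothetical:

```python
import shlex

WITH_BACKSLASHES = (
    r"/opt/IBM/HTTPServer/bin/gskcapicmd -cert -create "
    r"-db /opt/IBM/HTTPServer/ssl/keystore.kdb -pw passw0rd -size 2048 "
    r'-dn cn=rhel6.uk.ibm.com\\,o=ibm\\,c=uk -label "rhel6.uk.ibm.com" -default_cert yes'
)
WITHOUT_BACKSLASHES = WITH_BACKSLASHES.replace("\\\\", "")


def dn_argument(command_line):
    """Return the word following -dn after POSIX-style shell splitting."""
    argv = shlex.split(command_line)
    return argv[argv.index("-dn") + 1]


escaped_dn = dn_argument(WITH_BACKSLASHES)
plain_dn = dn_argument(WITHOUT_BACKSLASHES)
```

The shell collapses each \\ to a single backslash, so the first form passes cn=rhel6.uk.ibm.com\,o=ibm\,c=uk. In DN string syntax a backslash before a comma makes the comma part of the attribute value, so it seems plausible that the whole string was stored as one CN and displayed in quotes, which is where the stray quote came from.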
Having restarted IHS and re-imported the IHS certificate into the cell-default trust store for Process Server, things then burst into life.
The moral of the story ? SSL is wonderful and powerful and can really mix you up :-)
PS In conclusion, this is the way that the Cell Default Trust Stores look on Process Center and Process Server: -
Process Center has the root signer certificate for the Process Server cell
Process Server has the self-signed certificate from IHS on the Process Center cell
PPS The whole business with importing the Process Server root certificate into the Process Center IHS keystore .... probably best to forget that :-)