Sunday, 16 July 2017

WebSphere Application Server - Transaction Logs - getting it wrong

I do need to write a long-form article about this, but I've been on a voyage of discovery configuring AND testing WAS transaction recovery, by placing the transaction/compensation/partner logs in an Oracle 12c database.

This is in the context of an IBM Business Process Manager Advanced environment.

During the process, I saw this in the SupCluster logs ( specifically the second cluster member ) : -

SupClusterMember2/SystemOut.log:[16/07/17 11:53:47:332 BST] 00000001 WASSessionCor I SessionProperties shouldSetAndDoLogging SESN0169I: Session Manager found the custom property InvalidateOnUnauthorizedSessionRequestException with value true.
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:316 BST] 0000004d SQLMultiScope I   CWRLS0009E: Details of recovery log failure: Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:319 BST] 0000004d SQLMultiScope E   CWRLS0024E: Exception caught during recovery! Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:324 BST] 0000004d SQLMultiScope A   WTRN0107W: Caught non-SQLException Throwable when forcing SQL RecoveryLog tranlog for server PSCell1\Node2\SupClusterMember2 Throwable: Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:333 BST] 0000004d SQLMultiScope A   WTRN0100E: Cannot recover from SQLException when forcing SQL RecoveryLog tranlog for server PSCell1\Node2\SupClusterMember2 Exception: Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row


The problem was a PEBCAK, in that I'd obviously misconfigured things.

I validated my WAS configuration: -

cat /opt/ibm/WebSphereProfiles/Dmgr01/config/cells/PSCell1/nodes/Node1/serverindex.xml | grep -i recoveryLog

    <recoveryLog xmi:id="RecoveryLog_1500062274540" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App1" compensationLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App1" compensationLogFileSize="5"/>
    <recoveryLog xmi:id="RecoveryLog_1500062279192" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/SupCluster_Tranlogs,tablesuffix=Sup1"/>

cat /opt/ibm/WebSphereProfiles/Dmgr01/config/cells/PSCell1/nodes/Node2/serverindex.xml | grep -i recoveryLog

    <recoveryLog xmi:id="RecoveryLog_1500062274432" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App2" compensationLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App2" compensationLogFileSize="5"/>
    <recoveryLog xmi:id="RecoveryLog_1500062279152" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/SupCluster_Tranlogs,tablesuffix=Sup2"/>


to ensure that I : -

(a) was using the right datasources for the right clusters ( AppCluster has both Transaction and Compensation logs, whereas SupCluster only has Transaction logs )
(b) had suitably incremented the suffix - App1 or Sup1 for member 1, App2 or Sup2 for member 2

Finally, as this was a TEST environment, I dropped the tables: -

DROP TABLE CMNUSER.WAS_TRAN_LOGAPP1;
DROP TABLE CMNUSER.WAS_PARTNER_LOGAPP1;
DROP TABLE CMNUSER.WAS_COMP_LOGAPP1;
DROP TABLE CMNUSER.WAS_TRAN_LOGAPP2;
DROP TABLE CMNUSER.WAS_PARTNER_LOGAPP2;
DROP TABLE CMNUSER.WAS_COMP_LOGAPP2;
DROP TABLE CMNUSER.WAS_TRAN_LOGSUP1;
DROP TABLE CMNUSER.WAS_PARTNER_LOGSUP1;
DROP TABLE CMNUSER.WAS_COMP_LOGSUP1;
DROP TABLE CMNUSER.WAS_TRAN_LOGSUP2;
DROP TABLE CMNUSER.WAS_PARTNER_LOGSUP2;
DROP TABLE CMNUSER.WAS_COMP_LOGSUP2;


and recreated them: -

CREATE TABLE CMNUSER.WAS_TRAN_LOGAPP1(
 SERVER_NAME VARCHAR(128),
 SERVICE_ID SMALLINT,
 RU_ID NUMBER(19),
 RUSECTION_ID NUMBER(19),
 RUSECTION_DATA_INDEX SMALLINT,
 DATA BLOB);

CREATE TABLE CMNUSER.WAS_PARTNER_LOGAPP1(
 SERVER_NAME VARCHAR(128),
 SERVICE_ID SMALLINT,
 RU_ID NUMBER(19),
 RUSECTION_ID NUMBER(19),
 RUSECTION_DATA_INDEX SMALLINT,
 DATA BLOB);

CREATE TABLE CMNUSER.WAS_COMP_LOGAPP1(
 SERVER_NAME VARCHAR(128),
 SERVICE_ID SMALLINT,
 RU_ID NUMBER(19),
 RUSECTION_ID NUMBER(19),
 RUSECTION_DATA_INDEX SMALLINT,
 DATA BLOB);

 CREATE TABLE CMNUSER.WAS_TRAN_LOGAPP2(
 SERVER_NAME VARCHAR(128),
 SERVICE_ID SMALLINT,
 RU_ID NUMBER(19),
 RUSECTION_ID NUMBER(19),
 RUSECTION_DATA_INDEX SMALLINT,
 DATA BLOB);

CREATE TABLE CMNUSER.WAS_PARTNER_LOGAPP2(
 SERVER_NAME VARCHAR(128),
 SERVICE_ID SMALLINT,
 RU_ID NUMBER(19),
 RUSECTION_ID NUMBER(19),
 RUSECTION_DATA_INDEX SMALLINT,
 DATA BLOB);

CREATE TABLE CMNUSER.WAS_COMP_LOGAPP2(
 SERVER_NAME VARCHAR(128),
 SERVICE_ID SMALLINT,
 RU_ID NUMBER(19),
 RUSECTION_ID NUMBER(19),
 RUSECTION_DATA_INDEX SMALLINT,
 DATA BLOB);

 CREATE TABLE CMNUSER.WAS_TRAN_LOGSUP1(
  SERVER_NAME VARCHAR(128),
  SERVICE_ID SMALLINT,
  RU_ID NUMBER(19),
  RUSECTION_ID NUMBER(19),
  RUSECTION_DATA_INDEX SMALLINT,
  DATA BLOB);

 CREATE TABLE CMNUSER.WAS_PARTNER_LOGSUP1(
  SERVER_NAME VARCHAR(128),
  SERVICE_ID SMALLINT,
  RU_ID NUMBER(19),
  RUSECTION_ID NUMBER(19),
  RUSECTION_DATA_INDEX SMALLINT,
  DATA BLOB);

 CREATE TABLE CMNUSER.WAS_COMP_LOGSUP1(
  SERVER_NAME VARCHAR(128),
  SERVICE_ID SMALLINT,
  RU_ID NUMBER(19),
  RUSECTION_ID NUMBER(19),
  RUSECTION_DATA_INDEX SMALLINT,
  DATA BLOB);

  CREATE TABLE CMNUSER.WAS_TRAN_LOGSUP2(
   SERVER_NAME VARCHAR(128),
   SERVICE_ID SMALLINT,
   RU_ID NUMBER(19),
   RUSECTION_ID NUMBER(19),
   RUSECTION_DATA_INDEX SMALLINT,
   DATA BLOB);

  CREATE TABLE CMNUSER.WAS_PARTNER_LOGSUP2(
   SERVER_NAME VARCHAR(128),
   SERVICE_ID SMALLINT,
   RU_ID NUMBER(19),
   RUSECTION_ID NUMBER(19),
   RUSECTION_DATA_INDEX SMALLINT,
   DATA BLOB);

  CREATE TABLE CMNUSER.WAS_COMP_LOGSUP2(
   SERVER_NAME VARCHAR(128),
   SERVICE_ID SMALLINT,
   RU_ID NUMBER(19),
   RUSECTION_ID NUMBER(19),
   RUSECTION_DATA_INDEX SMALLINT,
   DATA BLOB);

And all is well.

From a testing perspective, I've created a SCA module which uses a JDBC Resource Adapter to create/update/read data from an Oracle database table.

Again, that's for a future long-form article ….

No comments:

Note to self - use kubectl to query images in a pod or deployment

In both cases, we use JSON ... For a deployment, we can do this: - kubectl get deployment foobar --namespace snafu --output jsonpath="{...