Wednesday, 2 September 2015

WebSphere Application Server 8.5.5 and DB2 10.5 - High Availability for Disaster Recovery - Like Manuel, I learn ....

I've been testing, experimenting and documenting my experiences with IBM DB2 10.5, specifically with the High Availability for Disaster Recovery (HADR) configuration, in the context of making a WebSphere Application Server (WAS) configuration more resilient.

To that end, I've been scripting the necessary configuration to allow WAS to use the DB2 Automatic Client Reroute Options: -


in order that WAS can connect to another DB2 server within the "cluster" when the primary server, to which the JDBC Data Source normally points, is unavailable: -


Now, in the first screenshot above, you'll notice that the Client reroute server list JNDI name is left blank.

This is by design :-)

Initially, I was populating that particular property, being unaware as to what it did :-)

That was a BAD move.

Basically, the property is ONLY used if one creates a second JDBC Data Source to point at the second, standby database server.

However, using the other two parameters, Alternate Server Names and Alternate Port Numbers, makes this unnecessary.
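As a rough illustration of the "alternate names and ports" approach, here is a minimal sketch in Jython-compatible Python. It only builds the property values; it is not my actual script. The property names are, as far as I know, the data source custom properties behind the console fields, and the standby host name is made up for the example.

```python
def reroute_properties(alternate_servers):
    """Build client reroute custom property values for a DB2 data source.

    alternate_servers is a list of (host, port) tuples, one per standby
    server. Host names and ports passed in are the caller's; the ones used
    in the example below are hypothetical.
    """
    hosts = ",".join([host for (host, port) in alternate_servers])
    ports = ",".join([str(port) for (host, port) in alternate_servers])
    return {
        "clientRerouteAlternateServerName": hosts,
        "clientRerouteAlternatePortNumber": ports,
        # Deliberately left empty: it is an alternative to the two
        # properties above, not a companion to them.
        "clientRerouteServerListJNDIName": "",
    }


# Hypothetical standby server, same port as the primary in this post.
props = reroute_properties([("standby.example.com", 60006)])

# Inside wsadmin, each value would then be applied to the matching
# J2EEResourceProperty in the data source's property set, for example via
# AdminConfig.modify(propertyId, [["value", props[name]]]).
```

Keeping the value-building separate from the wsadmin calls means the logic can be checked in any Python interpreter before it goes anywhere near a cell.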

This IBM Technote says it far better than I: -


<snip>
Note 1: If you choose to use Client reroute server list JNDI name, you do not need to supply Alternate server names and Alternate port numbers (and vice versa). 
Note 2: The Client reroute server list JNDI name should be a unique JNDI name, different from the JNDI Name used in the previous step.
</snip>

So, back to me, and this is what I was seeing when I attempted to test a JDBC Data Source whilst the primary (configured) DB2 server was down down down: -

The test connection operation failed for data source Monitor_Database on server dmgr at node Dmgr with the following exception: java.sql.SQLNonTransientException: [jcc][t4][2043][11550][4.11.69] Exception java.net.ConnectException: Error opening socket to server bpm856.uk.ibm.com/192.168.33.200 on port 60,006 with message: Connection refused. ERRORCODE=-4499, SQLSTATE=08001 DSRA0010E: SQL State = 08001, Error Code = -4,499. View JVM logs for further details.

with this in SystemErr.log: -

...
[02/09/15 15:18:41:143 BST] 00000107 SystemErr     R java.sql.SQLNonTransientException: [jcc][t4][2043][11550][4.11.69] Exception java.net.ConnectException: Error opening socket to server bpm856.uk.ibm.com/192.168.33.200 on port 60,006 with message: Connection refused. ERRORCODE=-4499, SQLSTATE=08001 DSRA0010E: SQL State = 08001, Error Code = -4,499
...
[02/09/15 15:18:41:150 BST] 00000107 SystemErr     R Caused by: java.net.ConnectException: Connection refused
...

which, when you think about it, makes PERFECT sense.

In other words, I've told WAS to, in the eventuality of losing a DB2 server, use an alternative JDBC Data Source which .... DOES NOT EXIST :-)
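A sanity check along these lines would have caught my mistake before the test connection did. This is a minimal sketch in Jython-compatible Python, not part of my actual script: the function name, the list of bound JNDI names and the example values are all assumptions made for illustration.

```python
def check_reroute_config(props, bound_jndi_names):
    """Return a list of problems found in client reroute property values.

    props maps custom property names to string values; bound_jndi_names is
    a list of JNDI names known to exist (how that list is gathered is left
    to the caller and is not shown here).
    """
    problems = []
    jndi = props.get("clientRerouteServerListJNDIName", "").strip()
    hosts = [h for h in props.get("clientRerouteAlternateServerName", "").split(",") if h.strip()]
    ports = [p for p in props.get("clientRerouteAlternatePortNumber", "").split(",") if p.strip()]

    # The mistake described in this post: a JNDI name that points at nothing.
    if jndi and jndi not in bound_jndi_names:
        problems.append("JNDI name '%s' is not bound to anything" % jndi)
    # Per the Technote, use the JNDI name or the alternate lists, not both.
    if jndi and (hosts or ports):
        problems.append("JNDI name and alternate servers are both set")
    # Each alternate server needs a matching port.
    if len(hosts) != len(ports):
        problems.append("alternate server and port counts differ")
    return problems


# Hypothetical values reproducing the broken setup.
broken = {
    "clientRerouteServerListJNDIName": "jdbc/doesNotExist",
    "clientRerouteAlternateServerName": "",
    "clientRerouteAlternatePortNumber": "",
}
print(check_reroute_config(broken, ["jdbc/Monitor_Database"]))
```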

Once I changed this and, more importantly, fixed my Jython script, it all became shiny again :-)

4 comments:

பிரேம்ஜி said...

Hi Dave,
Hope you are doing well. Thanks for the solution. I had the exact problem last week and I fixed it using this article. But I am using IBM BPM, which requires a restart of the JVM every time a failover happens on the database side. Is there any way we can avoid restarting the JVM whenever the DB failover happens? Kindly help me please.

Thanks,
Kumar.
Bentonville,AR

Dave Hay said...

Hi Kumar

Depending upon the version of IBM BPM that you're using, you shouldn't need to restart the JVMs. If needed, please check with IBM Support via a PMR, as the problem may be mitigatable by an iFix or fixpack.

Cheers, Dave

பிரேம்ஜி said...

Thanks a lot Dave. I am using the latest version of IBM BPM that is BPM 8.5.7.
Will setting custom properties on the data sources, such as enableSeamlessFailover or enableClientAffinitiesList, help me?

Anyway, I will raise a PMR while I check this.

Dave Hay said...

Hi Kumar

OK, so we're not making any use of enableSeamlessFailover or enableClientAffinitiesList, and WAS -> DB2 connectivity fails over as one would expect when HADR kicks in, i.e. when the takeover commands are run, either manually or via TSA.

We're using BPM 8.5.5 on WAS 8.5.5.8, and it all just works :-)

Good luck, Dave