Thursday, 13 June 2013

BusinessDataAliasCache performance and memory leak fix for IBM Business Process Manager

This came to light today: we were seeing java.lang.OutOfMemoryError exceptions whenever our BPM Standard 7.5.1.0 JVM (AppTarget cluster) ran out of heap.

Using a monitoring tool, we could see the heap growing towards its MaxHeap value of 6 GB, with the amount of used heap growing at the same pace - as soon as we ran out of heap, down came WAS.
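
For anyone without a monitoring tool to hand, here is a minimal sketch of sampling heap usage from inside a JVM using the standard java.lang.management API. The class and method names (HeapSampler, usedHeapFraction) are my own invention for illustration, not anything that ships with BPM or WAS.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports how full the heap is.
class HeapSampler {

    // Returns used heap as a fraction of the maximum heap,
    // or -1.0 if the JVM does not report a maximum.
    static double usedHeapFraction() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the maximum is undefined
        if (max <= 0) {
            return -1.0;
        }
        return (double) heap.getUsed() / max;
    }

    public static void main(String[] args) {
        System.out.printf("Heap used: %.1f%%%n", usedHeapFraction() * 100);
    }
}
```

Run on a schedule, a value that keeps climbing towards 100% without dropping back after garbage collection is the same pattern we saw before WAS fell over.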

Using the IBM Heap Analyzer tool, we could see that we had a single instance of com.lombardisoftware.server.ejb.persistence.versioning.BranchManager totalling 4 GB in size :-)

Yoicks !!!!

Thankfully, I then found this: -



<snip>

What is provided with this fix? 

JR42522 changes the way the BusinessDataAliasCache behaves on first login and optimizes interactions with the database associated with managing the cache. High memory consumption is avoided by placing an upper limit on the number of entries in the BranchManager cache.

Who should use it? 

Because this cache is associated with portal queries that utilize business data, users with a large number of snapshots / versions of business process definitions (BPD) are at risk for experiencing issues resolved by this fix.


Memory Consumption: 

Because there are no restrictions on when and how often the BusinessDataAliasCache is loaded, if a large number of unarchived snapshots are present, an OutOfMemory condition can result. The issue here specifically arises due to the amount of business data per snapshot consuming a large amount of heap. As a result, this fix prevents data from non-active snapshots from being read and provides an upper limit on the amount of heap consumed by business data per snapshot.  

In the heap dump that is generated from an OutOfMemory condition related to this issue, I would expect to see the following in Heap Analyzer output: 

One instance of "com.lombardisoftware.server.ejb.persistence.versioning.BranchManager" loaded by "com.ibm.oti.vm.BootstrapClassLoader @ 0x40f91a48" occupies 1,944,715,592 (90.52%) bytes. 

The memory is accumulated in one instance of "java.util.concurrent.ConcurrentHashMap$Segment[]" loaded by "com.ibm.oti.vm.BootstrapClassLoader @ 0x40f91a48". 

Keywords:
java.util.concurrent.ConcurrentHashMap$Segment[] 
com.lombardisoftware.server.ejb.persistence.versioning.BranchManager 
com.ibm.oti.vm.BootstrapClassLoader @ 0x40f91a48 

The key being a large portion of heap accumulated in "com.lombardisoftware.server.ejb.persistence.versioning.BranchManager."
</snip>
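
To illustrate the general idea of "placing an upper limit on the number of entries" in a cache, here is a minimal sketch of a size-bounded cache in Java. This is NOT the actual JR42522 or BranchManager code, which I have not seen; the class name BoundedCache and the least-recently-used eviction policy are my own assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache; not the real BranchManager implementation.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true keeps entries in least-recently-used order
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put: evict the least-recently-used entry
        // once the cap is exceeded, so the map can never grow unbounded.
        return size() > maxEntries;
    }
}
```

With a cap of 2, adding a third entry evicts whichever entry was used least recently, so heap consumption stays proportional to the cap rather than to the number of snapshots. Note that LinkedHashMap is not thread-safe; the heap dump above shows a ConcurrentHashMap, so a real fix would need a concurrent equivalent.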

Therefore, we have two things to do: -

(a) Look at removing unwanted snapshots - I'll blog more about that later
(b) Apply the iFix

Good times !!!
