Wednesday, 12 September 2018

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

Whilst restarting an IBM BPM Standard 8.5.6 environment, after a power outage (!), I hit the following exceptions in the MECluster SystemOut.log : -

...
[12/09/18 10:16:01:149 BST] 0000007f SibMessage    I   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSIS1593I: The messaging engine, ME_UUID=FC1DE993D20B5A10, INC_UUID=10F19927CD0C63BD, has failed to gain an initial lock on the data store.
[12/09/18 10:16:01:149 BST] 0000007f SibMessage    I   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSIS1538I: The messaging engine, ME_UUID=FC1DE993D20B5A10, INC_UUID=10F19927CD0C63BD, is attempting to obtain an exclusive lock on the data store.
[12/09/18 10:16:05:951 BST] 00000074 SibMessage    E   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSIS0002E: The messaging engine encountered an exception while starting. Exception: com.ibm.ws.sib.msgstore.PersistenceException: CWSIS1501E: The data source has produced an unexpected exception: com.ibm.ws.sib.msgstore.persistence.ConnectionUnavailableException: Connection cannot be provided as Datasource has been disabled!
[12/09/18 10:16:05:955 BST] 0000007f SibMessage    I   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSIS1593I: The messaging engine, ME_UUID=FC1DE993D20B5A10, INC_UUID=10F19927CD0C63BD, has failed to gain an initial lock on the data store.
[12/09/18 10:16:05:999 BST] 00000074 SibMessage    E   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSID0035E: Messaging engine MECluster.000-BPM.De1.Bus cannot be started; detected error reported during com.ibm.ws.sib.msgstore.impl.MessageStoreImpl start()
[12/09/18 10:16:06:000 BST] 00000074 SibMessage    E   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSID0027E: Messaging engine MECluster.000-BPM.De1.Bus cannot be restarted because a serious error has been reported.
[12/09/18 10:16:06:001 BST] 00000074 SibMessage    I   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSID0016I: Messaging engine MECluster.000-BPM.De1.Bus is in state Stopped.
[12/09/18 10:16:06:002 BST] 0000007f SibMessage    I   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSIS1538I: The messaging engine, ME_UUID=FC1DE993D20B5A10, INC_UUID=10F19927CD0C63BD, is attempting to obtain an exclusive lock on the data store.
[12/09/18 10:16:06:004 BST] 00000074 SibMessage    E   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSID0039E: HAManager-initiated activation has failed, messaging engine MECluster.000-BPM.De1.Bus will be disabled. Reason Refer to earlier messages
[12/09/18 10:16:06:005 BST] 00000074 HAGroupImpl   I   HMGR0124I: An activate or deactive request for the local member of group IBM_hc=MECluster,WSAF_SIB_BUS=BPM.De1.Bus,WSAF_SIB_MESSAGING_ENGINE=MECluster.000-BPM.De1.Bus,type=WSAF_SIB failed. The reason is < Messaging Engine MECluster.000-BPM.De1.Bus could not be activated: Refer to earlier messages > and the data is <null>
[12/09/18 10:16:06:006 BST] 00000074 HAGroupImpl   I   HMGR0129I: The local member of group IBM_hc=MECluster,WSAF_SIB_BUS=BPM.De1.Bus,WSAF_SIB_MESSAGING_ENGINE=MECluster.000-BPM.De1.Bus,type=WSAF_SIB has been disabled. The reason is disable called internally, the reason is < Messaging Engine MECluster.000-BPM.De1.Bus could not be activated: Refer to earlier messages >.
[12/09/18 10:16:06:006 BST] 00000074 SibMessage    I   [BPM.De1.Bus:MECluster.000-BPM.De1.Bus] CWSID0016I: Messaging engine MECluster.000-BPM.De1.Bus is in state Joined.
...

I've seen this before, and it usually means that the SIB tables are "locked' within Oracle.

My Q&D solution (!) is to simply identify and drop the SIB tables: -

SELECT owner, table_name FROM dba_tables where owner like '%CMNUSER%' and table_name like '%SIB%';

drop table CMNUSER.SIB000;
drop table CMNUSER.SIB001;
drop table CMNUSER.SIB002;
drop table CMNUSER.SIBCLASSMAP;
drop table CMNUSER.SIBKEYS;
drop table CMNUSER.SIBLISTING;
drop table CMNUSER.SIBOWNER;
drop table CMNUSER.SIBOWNERO;
drop table CMNUSER.SIBXACTS;

Sadly, one of these tables was locked by an existing session, even though I'd previously stopped the MECluster, so this command: -

drop table CMNUSER.SIBOWNER;

returned: -

drop table CMNUSER.SIBOWNER
                   *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired


Thankfully, the internet came to the rescue AGAIN : -


so I ran this query: -

select object_name, s.sid, s.serial#, p.spid 
from v$locked_object l, dba_objects o, v$session s, v$process p
where l.object_id = o.object_id and l.session_id = s.sid and s.paddr = p.addr;

OBJECT_NAME
--------------------------------------------------------------------------------
       SID    SERIAL# SPID
---------- ---------- ------------------------
SIBOWNER
55 50828 46179

SIBOWNER
42  2410 90896

and then used these commands to kill the sessions: _

alter system kill session '55,50828';
alter system kill session '42,2410';

and then validated that there were no further locks: -

select object_name, s.sid, s.serial#, p.spid 
from v$locked_object l, dba_objects o, v$session s, v$process p
where l.object_id = o.object_id and l.session_id = s.sid and s.paddr = p.addr;

no rows selected

and then dropped the table: -

drop table CMNUSER.SIBOWNER;

Table dropped.

No comments:

Note to self - use kubectl to query images in a pod or deployment

In both cases, we use JSON ... For a deployment, we can do this: - kubectl get deployment foobar --namespace snafu --output jsonpath="{...