
ORA-15137: The ASM cluster is in rolling patch state

SYMPTOM

While adding a new disk to an existing disk group, the following errors occurred:

ORA-15032: not all alterations performed
ORA-15137: The ASM cluster is in rolling patch state.

INVESTIGATION

1) On both nodes, the cluster state shows “In Rolling Patch”, yet the patch levels are identical.

SQL> SELECT SYS_CONTEXT('SYS_CLUSTER_PROPERTIES', 'CLUSTER_STATE') 
     FROM DUAL;

SYS_CONTEXT('SYS_CLUSTER_PROPERTIES','CLUSTER_STATE')
--------------------------------------------------------------------
In Rolling Patch

SQL> SELECT SYS_CONTEXT('SYS_CLUSTER_PROPERTIES', 'CURRENT_PATCHLVL') 
     FROM DUAL;

SYS_CONTEXT('SYS_CLUSTER_PROPERTIES','CURRENT_PATCHLVL')
--------------------------------------------------------------------
3628626982

$ asmcmd
ASMCMD> showclusterstate
In Rolling Patch

ASMCMD> showpatches
---------------
List of Patches
===============
26609817
26609966
26839277
27105253
27128906
27144050
27335416
27458609
27464465
27674384

ASMCMD> showversion
ASM version         : 12.2.0.1.0

2) “crsctl query crs softwarepatch” shows the same results on both nodes:

$ crsctl query crs softwarepatch
Oracle Clusterware patch level on node racnode1 is [3628626982].

$ crsctl query crs softwarepatch
Oracle Clusterware patch level on node racnode2 is [3628626982].

3) “crsctl query crs releasepatch” shows the same results on both nodes:

— racnode1:

$ crsctl query crs releasepatch
Oracle Clusterware release patch level is [3628626982] and the complete 
list of patches [26609817 26609966 26839277 27105253 27128906 27144050 
27335416 27458609 27464465 27674384 ] have been applied on the local node.

— racnode2:

$ crsctl query crs releasepatch
Oracle Clusterware release patch level is [3628626982] and the complete 
list of patches [26609817 26609966 26839277 27105253 27128906 27144050 
27335416 27458609 27464465 27674384 ] have been applied on the local node.

4) kfod command shows the same results on both nodes:

— racnode1:

$ $ORACLE_HOME/bin/kfod op=patchlvl
-------------------
Current Patch level
===================
3628626982

$ $ORACLE_HOME/bin/kfod op=patches
---------------
List of Patches
===============
26609817
26609966
26839277
27105253
27128906
27144050
27335416
27458609
27464465
27674384

— racnode2:

$ $ORACLE_HOME/bin/kfod op=patchlvl
-------------------
Current Patch level
===================
3628626982

$ $ORACLE_HOME/bin/kfod op=patches
---------------
List of Patches
===============
26609817
26609966
26839277
27105253
27128906
27144050
27335416
27458609
27464465
27674384

5) lsinventory shows the same results on both nodes:

$ $ORACLE_HOME/OPatch/opatch lsinventory | grep -i desc
ARU platform description:: Linux x86-64
Patch description: "Database Apr 2018 Release Update : 12.2.0.1.180417 (27674384)"
Patch description: "OCW APR 2018 RELEASE UPDATE 12.2.0.1.0(180129) (27464465)"
Patch description: "ACFS APR 2018 RELEASE UPDATE 12.2.0.1.0(180129) (27458609)"
Patch description: "Tomcat Release Update 12.2.0.1.0(ID:171023.0830) (27144050)"
Patch description: "DBWLM RELEASE UPDATE 12.2.0.1.0(ID:170913) (26839277)"

SOLUTIONS

$ crsctl stop rollingpatch
CRS-1161: The cluster was successfully patched to patch level [3628626982].

Rechecking with the commands above shows the cluster state has changed from “In Rolling Patch” to “Normal”:

SQL> SELECT SYS_CONTEXT('SYS_CLUSTER_PROPERTIES', 'CLUSTER_STATE') 
     FROM DUAL;

SYS_CONTEXT('SYS_CLUSTER_PROPERTIES','CLUSTER_STATE')
--------------------------------------------------------------------------------
Normal
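
As an additional check (not part of the original output), the upgrade state and active patch level can also be confirmed with crsctl; assuming the same patch level as above, output along these lines is expected:

$ crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [12.2.0.1.0]. The cluster 
upgrade state is [NORMAL]. The cluster active patch level is [3628626982].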

Generally speaking, for the “ORA-15137: The ASM cluster is in rolling patch state” issue, the steps below can be followed one after another until the issue is resolved:

a) Stop the rolling patch state, one node after another, from the ASM instance:

SQL> alter system stop rolling patch;

b) Stop rolling patch for the whole cluster:

$ crsctl stop rollingpatch

c) In case the postpatch step did not complete successfully for some reason, which can also cause this issue:

— As the root user:

 $GRID_HOME/crs/install/rootcrs.sh -prepatch 
 $GRID_HOME/crs/install/rootcrs.sh -postpatch

d) In case the OCR was not updated with the right patch level for some reason (run as the root user):

$GRID_HOME/crs/install/rootcrs.sh -prepatch 
$GRID_HOME/bin/clscfg -patch
$GRID_HOME/crs/install/rootcrs.sh -postpatch

e) In case patches show up in “opatch lsinventory” but are missing from the kfod output for some reason (see the sketch after these steps for finding the missing patch IDs):

-- as the root user
$GRID_HOME/crs/install/rootcrs.sh -prepatch

-- as the grid owner
$GRID_HOME/bin/patchgen commit -pi 12345678
$GRID_HOME/bin/patchgen commit -pi 23456789

-- as the root user
$GRID_HOME/crs/install/rootcrs.sh -postpatch
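
To work out which patch IDs need to be re-registered with patchgen in step e), the Grid home inventory can be compared against the kfod output. This is a minimal sketch, assuming bash and that $GRID_HOME points at the Grid Infrastructure home; it prints the IDs present in the inventory but missing from kfod:

-- as the grid owner
$ comm -23 <($GRID_HOME/OPatch/opatch lspatches | cut -d';' -f1 | sort) \
           <($GRID_HOME/bin/kfod op=patches | grep -E '^[0-9]+$' | sort)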

Search AD users and group in Windows Desktop

There are times when you want to search AD with no tools on hand like JXplorer, ApacheDS, or LdapAdmin. Here is the command that brings up the built-in DSQuery window for a simple user/group search:

%SystemRoot%\SYSTEM32\rundll32.exe dsquery,OpenQueryWindow

via Search AD users and group in Windows Desktop — Ran’s Identity and Access Management blog

ORA-02085: database link DBLINK_TEST connects to TESTDB

If the parameter global_names is set to TRUE, the database link name has to match the remote database's global name.

Try to query a remote table through a database link:

SQL> select count(*) from testuser.test@dblink_test;
select count(*) from testuser.test@dblink_test;
*
ERROR at line 1:
ORA-02085: database link DBLINK_TEST connects to TESTDB

REASON

The database parameter global_names is set to TRUE here. With global_names=TRUE, the database link name has to match the remote database's global name (TESTDB).

WORKAROUND

Turn off global_names at the session level, and the query works:

SQL> alter session set global_names=false;

Session altered.

SQL> select count(*) from testuser.test@dblink_test;

 COUNT(*)
----------
       100
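
The session-level setting only lasts for the current session. A longer-term alternative, sketched here with placeholder credentials and connect string, and assuming the remote global name is simply TESTDB with no domain suffix, is to recreate the link under a name that matches the remote global name:

-- on the remote database, confirm its global name
SQL> select * from global_name;

-- on the local database, recreate the link with a matching name
SQL> drop database link dblink_test;
SQL> create database link testdb connect to testuser identified by "secret" using 'TESTDB';
SQL> select count(*) from testuser.test@testdb;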

ORA-27140 ORA-27300 ORA-27301 ORA-27302 ORA-27303 When Starting Up or Shutting Down a RAC Instance

ORACLE_HOME is shared by RAC instance A and RAC instance B. Both instances are shut down, then ORACLE_HOME is patched. Just after the patching, the OS users, the ASM disks, and the ORACLE_HOME/bin/oracle binary look like this:

$ id oracle
uid=122(oracle) gid=202(oinstall) groups=202(oinstall),201(dba)

$ id grid
uid=518(grid) gid=202(oinstall) groups=202(oinstall),201(dba)

$ ls -ltr /dev/oracleasm/disks/
total 0
brw-rw---- 1 grid dba 253, 16 Jul 30 16:38 OCR_VOTE01
brw-rw---- 1 grid dba 253, 50 Jul 30 16:38 ASM_FRA01
brw-rw---- 1 grid dba 253, 57 Jul 30 16:38 ASM_DISK01
..
.

$ ls -ltr /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
-rwsr-s--x 1 oracle oinstall 228655023 May 17 11:11 /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle

Manually start up instance A using sqlplus instead of ‘srvctl start instance’, and check that the oracle binary group (oinstall) is still the same:

$ sqlplus / as sysdba

...
..
.
Connected to an idle instance.

SQL> startup
ORACLE instance started.
...
..
.
Database mounted.
Database opened.
SQL> exit
$
$ ls -ltr /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
-rwsr-s--x 1 oracle oinstall 228655023 May 17 11:11 /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle

Then start up instance B through Clusterware using ‘srvctl start’, and notice that Clusterware changed the group of the oracle binary to dba:

$ srvctl start database -d TESTB
$ ls -ltr /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
-rwsr-s--x 1 oracle dba 228655023 May 17 11:11 /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle

Instance A then reports the following errors in its trace files and alert log:

Alert.log:

Process m000 died, see its trace file
Wed Jul 25 16:37:42 2018
Process m000 died, see its trace file
..
.
Process q000 died, see its trace file
Wed Jul 25 16:41:54 2018
Process W000 died, see its trace file
Process W000 died, see its trace file

Trace file:

$ cat TESTA1_m000_7813.trc
..
.
/u01/app/oracle/diag/rdbms/testa/TESTA1/trace/TESTA1_m000_7813.trc
ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1

*** 2018-07-25 16:35:55.662
Died during process startup with error 27140 (seq=39770)
OPIRIP: Uncaught error 27140. Error stack:
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 202 (oinstall), 
current egid = 201 (dba)

The crsd_oraagent_oracle trace file shows that Clusterware changed the RAC $ORACLE_HOME/bin/oracle binary by running “setasmgidwrap” while instance B was started through ‘srvctl start’:

crsd_oraagent_oracle_42.trc:2018-07-25 16:34:31.584046 :CLSDYNAM:2785666816: [ora.testb.db]{1:44997:10712} [start] command = '/u01/app/12.1.0.2/grid/bin/setasmgidwrap oracle_binary_path=/u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle'
crsd_oraagent_oracle_42.trc:2018-07-25 16:34:31.584068 :CLSDYNAM:2785666816: [ora.testb.db]{1:44997:10712} [start] Utils:execCmd action = 1 flags = 6 ohome = /u01/app/12.1.0.2/grid cmdname = setasmgidwrap.
crsd_oraagent_oracle_42.trc:2018-07-25 16:43:38.726230 :CLSDYNAM:2128570112: [ora.testb.db]{1:44997:10835} [start] command = '/u01/app/12.1.0.2/grid/bin/setasmgidwrap oracle_binary_path=/u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle'
crsd_oraagent_oracle_42.trc:2018-07-25 16:43:38.726270 :CLSDYNAM:2128570112: [ora.testb.db]{1:44997:10835} [start] Utils:execCmd action = 1 flags = 6 ohome = /u01/app/12.1.0.2/grid cmdname = setasmgidwrap.
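
Based on the exact command visible in the trace above, the group on the oracle binary can also be restored manually, without going through srvctl. This is only a sketch using the paths from this environment; here the Clusterware agent ran the command as the oracle user:

$ /u01/app/12.1.0.2/grid/bin/setasmgidwrap oracle_binary_path=/u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
$ ls -l /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle
-rwsr-s--x 1 oracle dba 228655023 May 17 11:11 /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle

The expected result is that the group is dba again, matching the state after ‘srvctl start’ shown earlier.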

CONCLUSION:

Always shut down and start up RAC instances with ‘srvctl’ through Clusterware. Otherwise, you might get ORA-27140, ORA-27300, ORA-27301, ORA-27302, and ORA-27303 errors.

CRS-2412: The Cluster Time Synchronization Service detects that the local time is significantly different from the mean cluster time

In the CRS alert.log of 12.2.0.1 GI, the message below is seen every 30 minutes:

[OCTSSD(9932)]CRS-2412: The Cluster Time Synchronization Service 
detects that the local time is significantly different from the mean cluster time

CTSS is running in observer mode because NTP is running on the cluster nodes:

$ crsctl check ctss
CRS-4700: The Cluster Time Synchronization Service is in Observer mode.
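
Because CTSS is only observing, the CRS-2412 message points at a real clock offset between the nodes that NTP should be correcting. A quick check on each node, as a minimal sketch assuming classic ntpd (chrony-based systems would use ‘chronyc tracking’ instead):

$ ntpq -p

The peer marked with ‘*’ is the current sync source; its offset column is in milliseconds and should be small and roughly the same on every cluster node.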
