Ocrconfig Failed With Error PROT-30 PROC-50

Trying to add an OCR location on the new disk group ‘+OCR_VOTE’ returns the errors below:

# ./ocrconfig -add OCR_VOTE
PROT-30: The Oracle Cluster Registry location to be added is not usable
PROC-50: The Oracle Cluster Registry location to be added is inaccessible on nodes racnode1.

Checking the crsd.trc log, the error is obvious: “OCR file/disk OCR_VOTE … No such file or directory”.

2018-01-25 14:55:18.658112 : OCRSRV:3726690624: utstoragetypecommon:failed in stat OCR file/disk OCR_VOTE, errno=2, os 
err string=No such file or directory. Returns [8]
2018-01-25 14:55:18.658123 : OCRSRV:3726690624: proprstcheckstoarge: sproprutstoragetypecommon Return [8]. OS error [2]. 
Error detail [No such file or directory]
 OCRSRV:3726690624: proas_replace_dev: Failed to verify the OCR location [OCR_VOTE] on local node. Retval:[8]. Error: 
[No such file or directory]

Check ocrconfig log :

$/u01/app/grid/diag/crs/racnode1/crs/trace$ cat ocrconfig_83818.trc
Trace file /u01/app/grid/diag/crs/racnode1/crs/trace/ocrconfig_83818.trc
Oracle Database 12c Clusterware Release 12.1.0.2.0 - Production Copyright 1996, 2014 Oracle. All rights reserved.
2018-01-25 15:56:13.188061 : OCRCONF: ocrconfig starts...
 default: prgdevid: Failed to open the OCR location to be added [OCR_VOTE]. Retval[2]
 OCRCONF: chkid: Failed to get the OCR id from the new device to be added/replaced[OCR_VOTE]. Retval[26]
2018-01-25 15:56:13.270558 : OCRCONF: Failed to compare the OCR id in the device to be added/replaced [OCR_VOTE] and OCR 
id in repository the context is pointing to. Skiping the check.
2018-01-25 15:56:13.271684 : OCRCLI: proac_replace_dev:[OCR_VOTE]: Failed. Retval [50]
2018-01-25 15:56:13.271745 : OCRAPI: procr_replace_dev: failed to replace device (50)
2018-01-25 15:56:13.271783 : OCRCONF: Failed to replace the OCR device. OCR error:[PROC-50: The Oracle Cluster Registry 
location to be added is inaccessible on nodes racnode1.]
2018-01-25 15:56:13.271815 : OCRCONF: The new OCR device [OCR_VOTE] cannot be opened
2018-01-25 15:56:13.271840 : OCRCONF: Exiting [status=failed]...

The physical disks and ASM disks all checked out fine.

Finally, reviewing the command shows the ‘+’ prefix is missing from the disk group name.

# ./ocrconfig -add +OCR_VOTE
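A small guard can catch this mistake before the disk group name ever reaches ocrconfig. The function below is our own sketch, not an Oracle tool: without the ‘+’ prefix, ocrconfig treats the argument as a file path, which is exactly why it reported “No such file or directory”.

```shell
# Hypothetical pre-check: reject a disk group argument missing the ASM '+' prefix.
check_dg_name() {
  case "$1" in
    +*) echo "OK: $1 looks like an ASM disk group location" ;;
    *)  echo "ERROR: '$1' has no '+' prefix and will be treated as a file path" >&2
        return 1 ;;
  esac
}

check_dg_name OCR_VOTE || true   # prints the ERROR line (the mistake above)
check_dg_name +OCR_VOTE          # prints the OK line
```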

CRS-6706: Oracle Clusterware Release patch level (‘nnnnnn’) does not match Software patch level (‘nnnnnn’)

“opatchauto” failed midway; rerunning it produces the “CRS-6706” error.

# /u01/app/12.1.0.2/grid/OPatch/opatchauto apply /tmp/12.1.0.2/27010872 -oh /u01/app/12.1.0.2/grid
...
..
.
Using configuration parameter file: /u01/app/12.1.0.2/grid/OPatch/auto/dbtmp/bootstrap_racnode1/patchwork/crs/install/crsconfig_params
CRS-6706: Oracle Clusterware Release patch level ('173535486') does not match Software patch level ('2039526626'). Oracle Clusterware cannot be started.
CRS-4000: Command Start failed, or completed with errors.
2018/01/23 16:29:02 CLSRSC-117: Failed to start Oracle Clusterware stack


After fixing the cause of failure Run opatchauto resume

]
OPATCHAUTO-68061: The orchestration engine failed.
OPATCHAUTO-68061: The orchestration engine failed with return code 1
OPATCHAUTO-68061: Check the log for more details.
OPatchAuto failed.

OPatchauto session completed at Tue Jan 23 16:29:04 2018
Time taken to complete the session 1 minute, 39 seconds

opatchauto failed with error code 42

On racnode1:

$/u01/app/12.1.0.2/grid/bin/kfod op=patches
---------------
List of Patches
===============
19941482
19941477
19694308
19245012
26925218   <---- Does not exist on node2

$/u01/app/12.1.0.2/grid/bin/kfod op=patchlvl
2039526626

On racnode2:

$/u01/app/12.1.0.2/grid/bin/kfod op=patches
---------------
List of Patches
===============
19941482
19941477
19694308
19245012

$/u01/app/12.1.0.2/grid/bin/kfod op=patchlvl
2039526626

We can see patch 26925218 has been applied on racnode1 but not on racnode2. The solution is either:

1) Roll back this patch (26925218) and run “opatchauto” again to finish the patching successfully;

OR

2) Manually complete the remaining patches in the GI home.

In 12c, GI homes must have identical patches for the clusterware to start, except during rolling patching.
After the same patches were applied on all nodes, GI started fine.
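The node-by-node comparison above can be scripted: capture the “kfod op=patches” output from each node into a file, sort it, and diff the lists. A sketch using the patch numbers from this example (the file paths are arbitrary):

```shell
# Patch lists as reported by 'kfod op=patches' on each node (from the example above).
printf '%s\n' 19941482 19941477 19694308 19245012 26925218 | sort > /tmp/node1_patches.txt
printf '%s\n' 19941482 19941477 19694308 19245012          | sort > /tmp/node2_patches.txt

# Patches present on node1 but missing on node2: roll these back on node1,
# or apply them on node2, so the patch levels match.
comm -23 /tmp/node1_patches.txt /tmp/node2_patches.txt   # prints 26925218
```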

———-

Another situation you might meet: all nodes have the same patches, but ‘kfod’ and ‘opatch lsinventory’ show different patch levels:

For example, on racnode1:

$ /u01/app/12.1.0.2/grid/bin/kfod op=patches
---------------
List of Patches
===============
11111111
22222222
33333333

$ /u01/app/12.1.0.2/grid/bin/kfod op=patchlvl
-------------------
Current Patch level
===================
8888888888

On racnode2:

$ /u01/app/12.1.0.2/grid/bin/kfod op=patches
---------------
List of Patches
===============
11111111
22222222
33333333

$ /u01/app/12.1.0.2/grid/bin/kfod op=patchlvl
-------------------
Current Patch level
===================
9999999999

‘opatch lsinventory’ confirms the different patch levels:

Patch level status of Cluster nodes :

Patching Level Nodes
-------------- -----
8888888888 node1                    ====>> different patch level
9999999999 node2

For 12.1.0.1/12.1.0.2:

Execute “/u01/app/12.1.0.2/grid/crs/install/rootcrs.sh -patch” as the root user on the problematic node, and the patch level should be corrected.

For 12.2:

Execute “<GI_HOME>/crs/install/rootcrs.pl -prepatch” and “<GI_HOME>/crs/install/rootcrs.pl -postpatch” as the root user on the problematic node, and the patch level should be corrected.

REFERENCES:

CRS-6706: Oracle Clusterware Release patch level (‘nnn’) does not match Software patch level (‘mmm’) (Doc ID 1639285.1)

12c opatchauto: Prerequisite check “CheckApplicable” failed

It is good practice to copy and unzip GI patches as the grid user.

Using “opatchauto” to apply the Jan 2018 GI PSU onto a 12cR1 GI home produced “Prerequisite check ‘CheckApplicable’ failed” errors.

 # /u01/app/12.1.0.2/grid/OPatch/opatchauto apply /tmp/27010872 
   -oh /u01/app/12.1.0.2/grid
...
..
.
Bringing down CRS service on home /u01/app/12.1.0.2/grid
Prepatch operation log file location: /u01/app/12.1.0.2/grid/
 cfgtoollogs/crsconfig/crspatch_racnode1_2018-01-23_03-52-31PM.log
CRS service brought down successfully on home /u01/app/12.1.0.2/grid

Start applying binary patch on home /u01/app/12.1.0.2/grid
Failed while applying binary patches on home /u01/app/12.1.0.2/grid

Execution of [OPatchAutoBinaryAction] patch action failed, check log 
for more details. Failures:
Patch Target : racnode1->/u01/app/12.1.0.2/grid Type[crs]
Details: [
---------------------------Patching Failed--------------------------
Command execution failed during patching in home: /u01/app/12.1.0.2/
grid, host: racnode1.
Command failed: /u01/app/12.1.0.2/grid/OPatch/opatchauto apply 
 /tmp/27010872 -oh /u01/app/12.1.0.2/grid -target_type cluster 
 -binary -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -jre 
/u01/app/12.1.0.2/grid/OPatch/jre -persistresult /u01/app/12.1.0.2/
grid/OPatch/auto/dbsessioninfo/sessionresult_racnode1_crs.ser 
-analyzedresult /u01/app/12.1.0.2/grid/OPatch/auto/dbsessioninfo/
sessionresult_analyze_racnode1_crs.ser

Command failure output:
==Following patches FAILED in apply:

Patch: /tmp/27010872/26925218
Log: /u01/app/12.1.0.2/grid/cfgtoollogs/opatchauto/core/opatch/
opatch2018-01-23_15-53-43PM_1.log
Reason:Failed during Patching: oracle.opatch.opatchsdk.OPatchException:
Prerequisite check "CheckApplicable" failed.

After fixing the cause of failure Run opatchauto resume

]
OPATCHAUTO-68061: The orchestration engine failed.
OPATCHAUTO-68061: The orchestration engine failed with return code 1
OPATCHAUTO-68061: Check the log for more details.
OPatchAuto failed.

OPatchauto session completed at Tue Jan 23 15:56:55 2018
Time taken to complete the session 6 minutes, 39 seconds

Check the opatch logfile:

[Jan 23, 2018 3:56:55 PM] [INFO] Space Needed : 3191.113MB
[Jan 23, 2018 3:56:55 PM] [INFO] Prereq checkPatchApplicableOnCurrentPlatform 
Passed for patch : 26925218
[Jan 23, 2018 3:56:55 PM] [INFO] Patch 26925218:
 onewaycopyAction : Source File "/tmp/27010872/26925218/files/crs/install
/dropdb.pl" does not exists or is not readable
 'oracle.crs, 12.1.0.2.0': Cannot copy file from 'dropdb.pl' to 
'/u01/app/12.1.0.2/grid/crs/install/dropdb.pl'
[Jan 23, 2018 3:56:55 PM] [INFO] Prerequisite check "CheckApplicable" failed.
 The details are:

Patch 26925218:
 onewaycopyAction : Source File "/tmp/27010872/26925218/files/crs/
install/dropdb.pl" does not exists or is not readable
 'oracle.crs, 12.1.0.2.0': Cannot copy file from 'dropdb.pl' to 
'/u01/app/12.1.0.2/grid/crs/install/dropdb.pl'
[Jan 23, 2018 3:56:55 PM] [SEVERE] OUI-67073:UtilSession failed:
 Prerequisite check "CheckApplicable" failed.
[Jan 23, 2018 3:56:55 PM] [INFO] Finishing UtilSession at Tue Jan 23 15:56:55 AEDT 2018
[Jan 23, 2018 3:56:55 PM] [INFO] Log file location: /u01/app/12.1.0.2
/grid/cfgtoollogs/opatchauto/core/opatch/opatch2018-01-23_15-53-43PM_1.log

Check the “dropdb.pl” file from the unzipped patch; the owner is oracle.

# ls -ltr /tmp/27010872/26925218/files/crs/install/dropdb.pl
-rwx------ 1 oracle oinstall 3541 Jan 6 07:48 /tmp/27010872/26925218/
                                          files/crs/install/dropdb.pl

The patch was unzipped by the RAC user ‘oracle’ instead of the GI owner ‘grid’. Change the file owner to grid, and run “opatchauto resume” to continue the patching successfully.

# chown grid /tmp/27010872/26925218/files/crs/install/dropdb.pl
# /u01/app/12.1.0.2/grid/OPatch/opatchauto resume
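This class of problem can be caught up front by scanning the staged patch for files not owned by the GI owner before opatchauto runs at all. A sketch (the helper function is ours; the path and owner names come from this example):

```shell
# Print every path under a staged patch directory whose owner differs from
# the GI software owner; no output means ownership is consistent.
check_patch_owner() {
  find "$1" ! -user "$2"
}

# Against the staged patch from this note (run as root before opatchauto):
#   check_patch_owner /tmp/27010872 grid
# Fix everything in one pass rather than file by file:
#   chown -R grid:oinstall /tmp/27010872
```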

REFERENCES:

12c opatchauto: Prerequisite check “CheckApplicable” failed (Doc ID 1937982.1)

High CPU By /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 /sbin/ifconfig -a

SYMPTOM

In a 12.1.0.2 GI/RAC environment, several processes are consuming high CPU.

$ ps -ef|grep ifconfig
root 18941 1 0 06:25 ? 00:00:00 sh -c /bin/su -l grid -c "/usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a" 2>&1
root 18942 18941 99 06:25 ? 06:07:08 /bin/su -l grid -c /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a
grid 26928 23166 0 12:32 pts/1 00:00:00 grep ifconfig
root 62153 1 0 Jan23 ? 00:00:00 sh -c /bin/su -l grid -c "/usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a" 2>&1
root 62154 62153 99 Jan23 ? 14:29:31 /bin/su -l grid -c /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a
root 77170 1 0 10:30 ? 00:00:00 sh -c /bin/su -l grid -c "/usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a" 2>&1
root 77171 77170 99 10:30 ? 02:02:37 /bin/su -l grid -c /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a

$top
..
.
 PID   USER PR NI VIRT RES  SHR  S %CPU %MEM TIME+ COMMAND
 62154 root 25 0 98.8m 1392 1104 R 100.0 0.0 851:33.36 su
 18942 root 25 0 98.8m 1400 1104 R 99.9  0.0 349:10.86 su
 77171 root 25 0 98.8m 1404 1104 R 99.9  0.0 104:39.33 su
 ..
 .

CAUSES

Per Oracle Doc ID 2340905.1, this is Bug 24692439: LNX64-12.2-DIAGSNAP: AUXILIARY CMDS GENERATED BY DIAGSNAP WOULD HOG CPU FOREVER.

It is fixed in 18.1.

WORKAROUND

1) As the GI owner, disable diagsnap:

$ oclumon manage -disable diagsnap
Diagsnap option is successfully Disabled on RACTEST1
Diagsnap option is successfully Disabled on RACTEST2
Successfully Disabled diagsnap

2) Kill the existing “su” processes:

# kill -9 77170
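Rather than killing PIDs one at a time, the whole family of stuck helpers can be matched on their command line with pgrep (list first, kill only after reviewing). A sketch, assuming the command-line pattern from the ps output above:

```shell
# -f matches against the full command line of each process.
pids=$(pgrep -f 'StrictHostKeyChecking=no.*ifconfig -a' || true)
if [ -n "$pids" ]; then
  echo "Candidate PIDs: $pids"
  # After reviewing the list, as root:
  # kill -9 $pids
else
  echo "No runaway ifconfig helpers found"
fi
```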

....

datapatch -verbose Fails with Error “patch xxxxxxxx: Archived patch directory is empty”

Always keep the patches applied to ORACLE_HOME consistent in situations like cloning an ORACLE_HOME or switching database roles in Data Guard.

SYMPTOM

“datapatch -verbose” fails on a 12.1.0.2 database with the following errors:

$ ./datapatch -verbose -skip_upgrade_check
...
..
.
Error: prereq checks failed!
 patch 22139226: Archived patch directory is empty
Prereq check failed, exiting without installing any patches.
...
..
.

Checking $ORACLE_HOME/sqlpatch, there are no files for patch 22139226 to be used for rollback.

$ ls -ltr $ORACLE_HOME/sqlpatch|grep 22139226
$

CAUSE

  1. The database was just migrated from an old ORACLE_HOME to a new ORACLE_HOME.
  2. The database was just switched over or failed over, and the applied patches differ between the primary and standby ORACLE_HOMEs.

SOLUTION

Copy the missing patch files from the old ORACLE_HOME, or from the standby ORACLE_HOME, to the primary ORACLE_HOME.
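As a sketch, the copy can be done with cp -rp so permissions and timestamps survive. The old-home path below is hypothetical; the patch number is the one from the error above:

```shell
# Hypothetical old home that still holds the sqlpatch payload.
OLD_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_old
PATCH_ID=22139226

# Copy the missing payload into the current home, then re-run datapatch:
#   cp -rp "$OLD_HOME/sqlpatch/$PATCH_ID" "$ORACLE_HOME/sqlpatch/"
#   "$ORACLE_HOME/OPatch/datapatch" -verbose
echo "Would copy sqlpatch/$PATCH_ID from $OLD_HOME"
```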

REFERENCE

datapatch -verbose Fails with Error “Patch xxxxxx: Archived Patch Directory Is Empty” (Doc ID 2235541.1)