High CPU By /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 /sbin/ifconfig -a

SYMPTOM

In 12.1.0.2 GI/RAC environment, there are a couple of processes consuming high CPU.

$ ps -ef|grep ifconfig
root 18941 1 0 06:25 ? 00:00:00 sh -c /bin/su -l grid -c "/usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a" 2>&1
root 18942 18941 99 06:25 ? 06:07:08 /bin/su -l grid -c /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a
grid 26928 23166 0 12:32 pts/1 00:00:00 grep ifconfig
root 62153 1 0 Jan23 ? 00:00:00 sh -c /bin/su -l grid -c "/usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a" 2>&1
root 62154 62153 99 Jan23 ? 14:29:31 /bin/su -l grid -c /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a
root 77170 1 0 10:30 ? 00:00:00 sh -c /bin/su -l grid -c "/usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a" 2>&1
root 77171 77170 99 10:30 ? 02:02:37 /bin/su -l grid -c /usr/bin/ssh -o StrictHostKeyChecking=no -o BatchMode=yes -o ConnectTimeout=5 RACTEST2 /sbin/ifconfig -a

$top
..
.
 PID   USER PR NI VIRT RES  SHR  S %CPU %MEM TIME+ COMMAND
 62154 root 25 0 98.8m 1392 1104 R 100.0 0.0 851:33.36 su
 18942 root 25 0 98.8m 1400 1104 R 99.9  0.0 349:10.86 su
 77171 root 25 0 98.8m 1404 1104 R 99.9  0.0 104:39.33 su
 ..
 .

CAUSES

As per Oracle ID 2340905.1, it is a Bug 24692439 : LNX64-12.2-DIAGSNAP: AUXILIARY CMDS GENERATED BY DIAGSNAP WOULD HOG CPU FOREVER.

It is fixed in 18.1.

WORKAROUND

1)as GI owner:

$ oclumon manage -disable diagsnap
Diagsnap option is successfully Disabled on RACTEST1
Diagsnap option is successfully Disabled on RACTEST2
Successfully Disabled diagsnap

2) kill the existing “su” processes.

#kill -9 77170

....

datapatch -verbose Fails with Error “patch xxxxxxxx: Archived patch directory is empty”

Always keeps ORACLE_HOME applied patches consistent in situations like cloning ORALE_HOME, switching database role in DataGuard

SYMPTOM

“datapatch -verbose” failes on 12.1.0.2 database with following errors:

$ ./datapatch -verbose -skip_upgrade_check
...
..
.
Error: prereq checks failed!
 patch 22139226: Archived patch directory is empty
Prereq check failed, exiting without installing any patches.
...
..
.

Check $ORACLE_HOME/sqlpatch, there is no files for patch 22139226 to be used for rollback.

$ ls -ltr $ORACLE_HOME/sqlpatch|grep 22139226
$

CAUSE

  1. The database is just migrated from old ORACLE_HOME to a new ORACLE_HOME.
  2. The database is just switched over or failed over. The applied patches are different between ORACLE_HOME of primary and standby.

SOLUTION

Copy missing patch files from the old ORACLE_HOME, or from standby ORACLE_HOME to primary ORACLE_HOME.

REFERENCE

datapatch -verbose Fails with Error :” Patch xxxxxx: Archived Patch Directory Is Empty” (Doc ID 2235541.1).

The pluggable databases that need to be patched must be in upgrade

Normally “datapatch” for OJVM requires database in UPGRADE mode

After applied PSU and OJVM binary patches, and  run “datapatch”,  then get error “The pluggable databases that need to be patched must be in upgrade mode”.

$ ./datapatch -verbose
...
..
.
 The following patches will be rolled back:
 22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016))
 The following patches will be applied:
 27001733 (Database PSU 12.1.0.2.180116, Oracle JavaVM Component (JAN2018))
 26925311 (DATABASE PATCH SET UPDATE 12.1.0.2.180116)

Error: prereq checks failed!
 patch 22139226: The pluggable databases that need to be patched must be in upgrade mode
Prereq check failed, exiting without installing any patches.

...
..
.

The workaround is to add “skip_upgrade_check” option for “datapatch” command:

$ ./datapatch -verbose -skip_upgrade_check
SQL Patching tool version 12.1.0.2.0 Production on Fri Jan 19 17:33:18 2018
Copyright (c) 2012, 2016, Oracle. All rights reserved.

Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_74462_2018_01_19_17_33_18/sqlpatch_invocation.log

Connecting to database...OK
Note: Datapatch will only apply or rollback SQL fixes for PDBs
 that are in an open state, no patches will be applied to closed PDBs.
 Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
 (Doc ID 1585822.1)
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Patch 22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016)):
 Installed in RACTESTPDB only
Patch 27001733 (Database PSU 12.1.0.2.180116, Oracle JavaVM Component (JAN2018)):
 Installed in the binary registry and CDB$ROOT PDB$SEED
Bundle series PSU:
 ID 180116 in the binary registry and ID 180116 in PDB CDB$ROOT, ID 180116 in PDB PDB$SEED, ID 160419 in PDB RACTESTPDB

Adding patches to installation queue and performing prereq checks...
Installation queue:
 For the following PDBs: CDB$ROOT PDB$SEED
 Nothing to roll back
 Nothing to apply
 For the following PDBs: RACTESTPDB
 The following patches will be rolled back:
 22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016))
 The following patches will be applied:
 27001733 (Database PSU 12.1.0.2.180116, Oracle JavaVM Component (JAN2018))
 26925311 (DATABASE PATCH SET UPDATE 12.1.0.2.180116)

Installing patches...
Patch installation complete. Total patches installed: 3

Validating logfiles...
Patch 22139226 rollback (pdb RACTESTPDB): SUCCESS
 logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/22139226/19729684/22139226_rollback
Patch 27001733 apply (pdb RACTESTPDB): SUCCESS
 logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/27001733/21728084/27001733_apply_LG
Patch 26925311 apply (pdb RACTESTPDB): SUCCESS
 logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/26925311/21850549/26925311_apply_LG
SQL Patching tool complete on Fri Jan 19 17:37:33 2018


SQL> select PATCH_ID,VERSION,ACTION,STATUS,DESCRIPTION 
     from dba_registry_sqlpatch order by ACTION_TIME;
PATCH_ID  VERSION ACTION     STATUS  DESCRIPTION
---------- --------------    ------- --------------- ----------------------------------------------------------------------------------------------------
 22139226 12.1.0.2 APPLY     SUCCESS Database PSU 12.1.0.2.160119, 
                                     Oracle JavaVM Component (Jan2016)
 22291127 12.1.0.2 APPLY     SUCCESS Database Patch Set Update : 
                                     12.1.0.2.160419 (22291127)
 22139226 12.1.0.2 ROLLBACK  SUCCESS Database PSU 12.1.0.2.160119, 
                                     Oracle JavaVM Component (Jan2016)
 27001733 12.1.0.2 APPLY     SUCCESS Database PSU 12.1.0.2.180116, 
                                     Oracle JavaVM Component (JAN2018)
 26925311 12.1.0.2 APPLY     SUCCESS DATABASE PATCH SET UPDATE 
                                     12.1.0.2.180116

OPATCHAUTO-68021: The following argument(s) are required: [-wallet]

Always download and apply the latest version of Opatch utility

When applying “Patch 27100009 – Grid Infrastructure Jan 2018 Release Update (GI RU) 12.2.0.1.180116”, get errors from auto opatch:

# /u01/app/12.2.0.1/grid/OPatch/opatchauto apply  /PATCHES/12.2.0.1/27100009
System initialization log file is /u01/app/12.2.0.1/grid/cfgtoollogs/opatchautodb/systemconfig2018-01-18_10-59-37AM.log.
Session log file is /u01/app/12.2.0.1/grid/cfgtoollogs/opatchauto/opatchauto2018-01-18_11-02-03AM.log

OPATCHAUTO-68021: Missing required argument(s).
OPATCHAUTO-68021: The following argument(s) are required: [-wallet]
OPATCHAUTO-68021: Provide the required argument(s).
Oracle OPatchAuto Version 13.9.0.2.0
Copyright (c) 2016, Oracle Corporation.  All rights reserved.

DESCRIPTION
    This operation applies patch.
...
..
.

The workaround is to use latest version of Opatch (12.2.0.1.9 or higher).

OR

Manually create a wallet in 12.2 as per “Creation of opatchauto wallet in 12.2 in 12.2.0.1.8 (Doc ID 2270185.1)”.

OPatchauto Failed due to exit status 2

ALWAYS get latest OPATCH utility on all RAC nodes first before applying patches.

When applying 12.2.0.1 GI RU( 12.2.0.1.180116 ), there are a couple of issues resolved by :

  • Detach ORACLE_HOME which have no instances running against.
  • Remove databases which are non required , but still in OCR.
  • Use latest opatch  for “wallet” issue.

Finally get another issue when running opatchauto:

# /u01/app/12.2.0.1/grid/OPatch/opatchauto apply 
  /PATCHES/12.2.0.1/27100009 -oh /u01/app/12.2.0.1/grid

OPatchauto session is initiated at Thu Jan 18 11:56:18 2018
System initialization log file is /u01/app/12.2.0.1/grid/cfgtoollogs/
opatchautodb/systemconfig2018-01-18_11-56-29AM.log.
Session log file is /u01/app/12.2.0.1/grid/cfgtoollogs/opatchauto/
opatchauto2018-01-18_11-57-03AM.log
The id for this session is AJ8A

oracle.dbsysmodel.driver.sdk.productdriver.ProductDriverException: 
Execution failed for host rachost2 due to : com.oracle.cie.remote.RemoteConnectionException: 
Command execution of [/u01/app/12.2.0.1/grid//perl/bin/perl 
/u01/app/12.2.0.1/grid/OPatch/auto/database/bin/RemoteHostExecutor.pl  
-GRID_HOME=/u01/app/12.2.0.1/grid -OBJECTLOC=/u01/app/12.2.0.1/grid/
/cfgtoollogs/opatchautodb/hostdata.obj -CRS_ACTION=check 
-CLUSTERNODES=rachost1,rachost2 -JVM_HANDLER=oracle/dbsysmodel/driver
/sdk/productdriver/remote/RemoteDataCollector ] failed due to exit status 2.

OPatchAuto failed.

OPatchauto session completed at Thu Jan 18 11:57:11 2018

Time taken to complete the session 0 minute, 54 seconds

opatchauto failed with error code 42

The below command caused errors :

 /u01/app/12.2.0.1/grid//perl/bin/perl 
/u01/app/12.2.0.1/grid/OPatch/auto/database/bin/RemoteHostExecutor.pl  
-GRID_HOME=/u01/app/12.2.0.1/grid -OBJECTLOC=/u01/app/12.2.0.1/grid/
/cfgtoollogs/opatchautodb/hostdata.obj -CRS_ACTION=check 
-CLUSTERNODES=rachost1,rachost2 -JVM_HANDLER=oracle/dbsysmodel/driver
/sdk/productdriver/remote/RemoteDataCollector

Found “/u01/app/12.2.0.1/grid/OPatch/auto/database/bin/RemoteHostExecutor.pl” file does not exist on racnode2, because the OPatch on this node is still old version 12.2.0.1.6. But on racnode1 OPatch version is latest 12.2.0.1.11.

Copy the latest OPatch from node1 onto node2, then run the  opatchauto successfully.

ALWAYS get latest OPATCH on all RAC nodes first  before applying patches.