Disable chronyd and Switch CTSSD to Active Mode on Linux 7

As we know, when any Network Time Protocol (NTP) daemon such as ntpd or chronyd is running, Oracle Clusterware CTSS (Cluster Time Synchronization Service) runs in Observer mode.

$ crsctl check ctss
CRS-4701: The Cluster Time Synchronization Service is in Observer mode.
$ crsctl stat res -t -init
...
..
.
ora.ctssd
      1   ONLINE  ONLINE  racnode1    OBSERVER,STABLE
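For scripting, the mode can be picked out of that CRS-470x message; a minimal sh sketch (the `ctss_mode` helper is my own, not an Oracle tool):

```shell
#!/bin/sh
# Hypothetical helper: classify CTSS mode from `crsctl check ctss` output.
# Reads the message on stdin and prints observer/active/unknown.
ctss_mode() {
  out=$(cat)
  case $out in
    *"Observer mode"*) echo observer ;;
    *"Active mode"*)   echo active ;;
    *)                 echo unknown ;;
  esac
}

# Example against a captured message; on a live cluster you would pipe
# `crsctl check ctss` into the function instead:
echo "CRS-4701: The Cluster Time Synchronization Service is in Observer mode." | ctss_mode
# -> observer
```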

Now we disable chronyd and remove all chrony configuration files.

# systemctl stop chronyd
# systemctl disable chronyd
Removed symlink /etc/systemd/system/multi-user.target.wants/chronyd.service.
# yum remove chrony
Loaded plugins: ulninfo
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Resolving Dependencies
--> Running transaction check
---> Package chrony.x86_64 0:2.1.1-1.el7 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

Check that the configuration files are gone; otherwise CTSSD still thinks an NTP service is configured.

$ ls -ltr /etc/chro*
-rw-r-----. 1 root chrony 62 Nov 24 2015 /etc/chrony.keys.rpmsave

Checking again, we see CTSSD now running in Active mode.

$ crsctl check ctss
CRS-4701: The Cluster Time Synchronization Service is in Active mode.
CRS-4702: Offset (in msec): 0
$ crsctl stat res -t -init

ora.ctssd
     1 ONLINE ONLINE racnode1 ACTIVE:0,STABLE
$ cluvfy comp clocksync -n all -verbose

Verifying Clock Synchronization across the cluster nodes

Checking if Clusterware is installed on all nodes...
Oracle Clusterware is installed on all nodes.

Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
Node Name    Status
----------- ------------------------
racnode2    passed
racnode1    passed
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
Check: CTSS state
Node Name    State
----------- ------------------------
racnode2    Active
racnode1    Active
CTSS is in Active state. Proceeding with check of clock time offsets on all nodes...
Reference Time Offset Limit: 1000.0 msecs
Check: Reference Time Offset
Node Name     Time Offset Status
------------ ----------- ------------------------
racnode2     0.0         passed
racnode1     0.0         passed

Time offset is within the specified limits on the following set of nodes:
"[racnode2, racnode1]"
Result: Check of clock time offsets passed

Oracle Cluster Time Synchronization Services check passed

Verification of Clock Synchronization across the cluster nodes was successful.

CRS alert.log:

2019-09-08 18:46:55.004 [OCTSSD(22044)]CRS-2410: The Cluster Time 
         Synchronization Service on host racnode2 is in active mode.

octssd.trc on the master node (racnode2):

....
..
.
2019-09-08 19:31:56.380369 : CTSS:1714730752: sclsctss_ivsr2: default pid file not found
2019-09-08 19:31:56.380386 : CTSS:1714730752: sclsctss_ivsr2: default pid file not found
2019-09-08 19:31:56.380393 : CTSS:1714730752: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
...
..
.

octssd.trc on the non-master node (racnode1):

2019-09-08 19:39:07.441725 : CTSS:2003805952: ctsselect_msm: CTSS mode is [0xc4]
2019-09-08 19:39:07.441736 : CTSS:2003805952: ctssslave_swm1_2: Ready to initiate new time sync process.
2019-09-08 19:39:07.442805 : CTSS:2003805952: ctssslave_swm2_1: Waiting for time sync message from master. sync_state[2].
2019-09-08 19:39:07.447917 : CTSS:2008008448: ctssslave_msg_handler4_1: Waiting for slave_sync_with_master to finish sync process. sync_state[3].
2019-09-08 19:39:07.447926 : CTSS:2003805952: ctssslave_swm2_3: Received time sync message from master.
2019-09-08 19:39:07.447935 : CTSS:2003805952: ctssslave_swm15: The CTSS master is ahead this node. The local time offset [11975 usec] is being adjusted. Sync method [2]
2019-09-08 19:39:07.447938 : CTSS:2003805952: ctssslave_swm17: LT [1567935547sec 447908usec], MT [1567935547sec 139990164505707usec], Delta [6167usec]
2019-09-08 19:39:07.447940 : CTSS:2003805952: ctssslave_swm19: The offset is [-11975 usec] and sync interval set to [1]
2019-09-08 19:39:07.447943 : CTSS:2003805952: ctsselect_msm: Sync interval returned in [1]
2019-09-08 19:39:07.447950 : CTSS:2008008448: ctssslave_msg_handler4_3: slave_sync_with_master finished sync process. Exiting clsctssslave_msg_handler
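For monitoring, the adjusted offset can be scraped from those ctssslave_swm19 lines; a sed sketch (the `offset_usec` helper is my own, and the pattern is written against the trace format shown above):

```shell
#!/bin/sh
# Extract the microsecond offset from a ctssslave_swm19 trace line on stdin.
offset_usec() {
  sed -n 's/.*The offset is \[\(-\{0,1\}[0-9][0-9]*\) usec\].*/\1/p'
}

echo '2019-09-08 19:39:07.447940 : CTSS:2003805952: ctssslave_swm19: The offset is [-11975 usec] and sync interval set to [1]' | offset_usec
# -> -11975
```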

How to Check Clock Synchronization Between Oracle Cluster Nodes

USAGE:
cluvfy comp clocksync [-n <node_list>] [-noctss] [-verbose]
<node_list> is the comma-separated list of non-domain qualified node
names on which the test should be conducted. If "all" is specified, 
then all the nodes in the cluster will be used for verification.

-noctss does not check Oracle Cluster Synch service, but checks only
the platforms native clock synch service(such as NTP)

DESCRIPTION:
Checks Oracle Cluster Time Synchronization Service(CTSS) on all nodes
in the nodelist. 

If no '-n' option is provided, local node is used for this check.  
If the "-noctss" option is specified, then Oracle CTSS check is not 
performed, instead the platforms native Time Synchronization is 
checked.
$ cluvfy comp clocksync

Verifying Clock Synchronization ...
CTSS is in Observer state. Switching over to clock synchronization 
checks using NTP

Verifying Network Time Protocol (NTP) ...
Verifying '/etc/chrony.conf' ...PASSED
Verifying Daemon 'chronyd' ...PASSED
Verifying NTP daemon or service using UDP port 123 ...PASSED
Verifying chrony daemon is synchronized with at least one external 
                  time source ...PASSED
Verifying Network Time Protocol (NTP) ...PASSED
Verifying Clock Synchronization ...PASSED

Verification of Clock Synchronization across the cluster nodes was 
    successful.

CVU operation performed:Clock Synchronization across the cluster nodes
Date: 03/09/2018 3:31:04 PM
CVU home: /u01/app/12.2.0.1/grid/
User: grid
$ cluvfy comp clocksync -n all -verbose

Verifying Clock Synchronization ...
Node Name Status
--------- ------------------------
racnode1  passed
racnode2  passed

Node Name State
--------- ------------------------
racnode1 Observer
racnode2 Observer

CTSS is in Observer state. 
Switching over to clock synchronization checks using NTP

Verifying Network Time Protocol (NTP) ...
Verifying '/etc/chrony.conf' ...
Node Name File exists?
--------- ------------------------
racnode1 yes
racnode2 yes

Verifying '/etc/chrony.conf' ...PASSED
Verifying Daemon 'chronyd' ...
Node Name Running?
--------- ------------------------
racnode1 yes
racnode2 yes

Verifying Daemon 'chronyd' ...PASSED
Verifying NTP daemon or service using UDP port 123 ...
Node Name Port Open?
--------- ------------------------
racnode1 yes
racnode2 yes

Verifying NTP daemon or service using UDP port 123 ...PASSED
Verifying chrony daemon is synchronized with at least one external 
                                       time source ...PASSED
Verifying Network Time Protocol (NTP) ...PASSED
Verifying Clock Synchronization ...PASSED

Verification of Clock Synchronization across the cluster nodes 
was successful.

CVU operation performed:Clock Synchronization across the cluster nodes
Date: 03/09/2018 3:35:14 PM
CVU home: /u01/app/12.2.0.1/grid/
User: grid

Many “asmcmd daemon” Processes with High CPU Usage

There are many hung “asmcmd daemon” processes with high CPU usage.

$ top

PID    USER  PR  NI   VIRT RES  SHR   %CPU %MEM TIME+ COMMAND
71337 grid  20  0 5717400 5.2g  1044 43.0  0.7 1:07.71 asmcmd daemon
137157 grid  20  0 4751716 4.4g 14880 41.0  0.6 5:43.21 asmcmd daemon
76530 grid  20  0 5671444 5.2g  1044 40.4  0.7 2:46.47 asmcmd daemon
96115 grid  20  0 4750220 4.3g  1044 40.4  0.6 5:00.40 asmcmd daemon
81230 grid  20  0 5704468 5.2g  1036 39.7  0.7 3:58.48 asmcmd daemon
...
..
.
$ ps -eaf | grep -i asmcmd

grid 71337 1 40 Aug06 ? 13:01:13 asmcmd daemon
grid 76530 1 40 Aug06 ? 13:02:52 asmcmd daemon
grid 81230 1 40 Aug06 ? 13:04:04 asmcmd daemon
grid 96115 1 40 Aug06 ? 09:35:06 asmcmd daemon
grid 115047 81230 0 09:33 ? 00:00:00 sh -c /u01/app/12.2.0.1/grid/bin
/clsecho -t -o /u01/app/12.2.0.1/grid/log/diag/asmcmd/user_grid/
racnode1/alert/alert.log "ASMCMD Background (PID = 81230)
 2> /tmp/clsecho_stderr_file.txt
grid 115058 76530 0 09:33 ? 00:00:00 [asmcmd daemon]
grid 115060 126727 0 09:33 ? 00:00:00 [asmcmd daemon]
...
..
.

WORKAROUND

1) Kill the “asmcmd daemon” processes.

2) Upgrade Trace File Analyzer (TFA) to the latest version.
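Step 1 can be scripted; a sketch that extracts the PIDs from `ps -eaf`-style output (the helper and the kill one-liner are my own; review the PID list before killing anything, and note that the bracketed `[asmcmd daemon]` children match as well):

```shell
#!/bin/sh
# List PIDs (second field) of "asmcmd daemon" processes from
# `ps -eaf`-style output fed on stdin.
asmcmd_pids() {
  awk '$0 ~ /asmcmd daemon/ { print $2 }'
}

# Hypothetical live usage (check the list first!):
#   ps -eaf | asmcmd_pids | xargs -r kill
```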

CRS-2674 CRS-2632 ORA-01013 CRS-5017 PRCR-1079 PRCD-1084 Srvctl Start Service

One or two services fail to start, while all the other services can be started and stopped without problems.

$ srvctl start service -d TESTDB -s reports
PRCD-1084 : Failed to start service REPORTS
PRCR-1079 : Failed to start resource ora.testdb.reports.svc
CRS-5017: The resource action "ora.testdb.reports.svc start" 
          encountered the following error:
ORA-01013: user requested cancel of current operation
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/
                     racnode1/crs/trace/crsd_oraagent_oracle.trc".
CRS-2674: Start of 'ora.testdb.reports.svc' on 'racnode1' failed
CRS-2632: There are no more servers to try to place resource 
 'ora.testdb.reports.svc' on that would satisfy its placement policy

It seems the information about this database or service is inconsistent in the OCR. We can try the following steps one after another until the service starts successfully.

1) Remove the database from the OCR, then add it back again.

$ srvctl remove database ....

$ srvctl add database ....

$ srvctl add instance ....

$ srvctl start service -d TESTDB -s reports

2) If the previous step does not work, restart the whole Clusterware stack.

# crsctl stop crs

# crsctl start crs

$ srvctl start service -d TESTDB -s reports

3) If for some reason a database outage is not possible, the service can be started manually.

Start the service on instance 1. Users and applications can then connect to the database through this service, but srvctl still shows the service as not running.

SQL> exec DBMS_SERVICE.START_SERVICE('REPORTS','TESTDB1');

$ srvctl status service -d testdb -s reports 

Service REPORTS is not running.

Apply 12.1.0.2.190716 PSU on GI RAC and OJVM

1) Download GI PSU 12.1.0.2.190716 (Patch 29698592) and Oracle JavaVM Component Database PSU 12.1.0.2.190716 (Patch 29774383).

2) Download and install the latest p6880880_121010_Linux-x86-64.zip into both GI_HOME and RDBMS_HOME.

3) Set up PATH

# export PATH=$PATH:/u01/app/12.1.0/grid/OPatch

# opatch version
OPatch Version: 12.2.0.1.17

OPatch succeeded.
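Each PSU README states a minimum required OPatch version, so it is worth scripting that check; a small sketch (the `ver_ge` helper is my own, comparing dotted versions field by field with numeric sort):

```shell
#!/bin/sh
# ver_ge A B: succeed (exit 0) if dotted version A >= B, comparing each
# dot-separated field numerically.
ver_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" \
      | sort -t. -k1,1n -k2,2n -k3,3n -k4,4n -k5,5n \
      | tail -1)" = "$1" ]
}

# Example: is the installed 12.2.0.1.17 at least the (assumed) minimum?
ver_ge 12.2.0.1.17 12.2.0.1.12 && echo "OPatch is new enough"
```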

4) Apply the GI and DB PSU on the first node, then do the same on the second node.

As root user, execute the following command on each node of the cluster:

# /u01/app/12.1.0/grid/OPatch/opatchauto apply /tmp/29698592

OPatchauto session is initiated at Sun Jul 28 22:39:56 2019

System initialization log file is /u01/app/12.1.0/grid/cfgtoollogs/opatchautodb/systemconfig2019-07-28_10-49-47PM.log.

Session log file is /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/opatchauto2019-07-28_10-51-20PM.log
The id for this session is B1YJ

Executing OPatch prereq operations to verify patch applicability on home /u01/app/12.1.0/grid

Executing OPatch prereq operations to verify patch applicability on home /u01/app/oracle/product/12.1.0/dbhome_1
Patch applicability verified successfully on home /u01/app/oracle/product/12.1.0/dbhome_1

Patch applicability verified successfully on home /u01/app/12.1.0/grid
Verifying SQL patch applicability on home /u01/app/oracle/product/12.1.0/dbhome_1
SQL patch applicability verified successfully on home /u01/app/oracle/product/12.1.0/dbhome_1
Preparing to bring down database service on home /u01/app/oracle/product/12.1.0/dbhome_1
Successfully prepared home /u01/app/oracle/product/12.1.0/dbhome_1 to bring down database service
Bringing down CRS service on home /u01/app/12.1.0/grid
Prepatch operation log file location: /u01/app/12.1.0/grid/cfgtoollogs/crsconfig/crspatch_racnode1_2019-07-28_11-31-15PM.log
CRS service brought down successfully on home /u01/app/12.1.0/grid
Performing prepatch operation on home /u01/app/oracle/product/12.1.0/dbhome_1
Prepatch operation completed successfully on home /u01/app/oracle/product/12.1.0/dbhome_1
Start applying binary patch on home /u01/app/oracle/product/12.1.0/dbhome_1
Binary patch applied successfully on home /u01/app/oracle/product/12.1.0/dbhome_1
Performing postpatch operation on home /u01/app/oracle/product/12.1.0/dbhome_1
Postpatch operation completed successfully on home /u01/app/oracle/product/12.1.0/dbhome_1
Start applying binary patch on home /u01/app/12.1.0/grid
Binary patch applied successfully on home /u01/app/12.1.0/grid
Starting CRS service on home /u01/app/12.1.0/grid
Postpatch operation log file location: /u01/app/12.1.0/grid/cfgtoollogs/crsconfig/crspatch_racnode1_2019-07-29_01-30-59AM.log
CRS service started successfully on home /u01/app/12.1.0/grid
Preparing home /u01/app/oracle/product/12.1.0/dbhome_1 after database service restarted
No step execution required.........
Trying to apply SQL patch on home /u01/app/oracle/product/12.1.0/dbhome_1
SQL patch applied successfully on home /u01/app/oracle/product/12.1.0/dbhome_1

OPatchAuto successful.
--------------------------------Summary--------------------------------

Patching is completed successfully. Please find the summary as follows:

Host:racnode1
RAC Home:/u01/app/oracle/product/12.1.0/dbhome_1
Version:12.1.0.2.0
Summary:

==Following patches were SKIPPED:

Patch: /tmp/29698592/26983807
Reason: This patch is not applicable to this specified target type - "rac_database"

Patch: /tmp/29698592/29423125
Reason: This patch is not applicable to this specified target type - "rac_database"
==Following patches were SUCCESSFULLY applied:

Patch: /tmp/29698592/29494060
Log: /u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2019-07-28_23-34-47PM_1.log

Patch: /tmp/29698592/29509318
Log: /u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/opatchauto/core/opatch/opatch2019-07-28_23-34-47PM_1.log
Host:racnode1
CRS Home:/u01/app/12.1.0/grid
Version:12.1.0.2.0
Summary:

==Following patches were SUCCESSFULLY applied:

Patch: /tmp/29698592/26983807
Log: /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2019-07-29_00-16-53AM_1.log

Patch: /tmp/29698592/29423125
Log: /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2019-07-29_00-16-53AM_1.log

Patch: /tmp/29698592/29494060
Log: /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2019-07-29_00-16-53AM_1.log

Patch: /tmp/29698592/29509318
Log: /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2019-07-29_00-16-53AM_1.log

OPatchauto session completed at Mon Jul 29 02:12:00 2019
Time taken to complete the session 212 minutes, 4 seconds
[root@racnode1 ~]#

From the log, we can see that the GI management database (MGMTDB) was patched as well.

...
..
.
2019-07-29 01:53:43: Mgmtdb is running on node: racnode1; local node: racnode1
2019-07-29 01:53:43: Mgmtdb is running on the local node
2019-07-29 01:53:43: Starting to patch Mgmt DB ...
2019-07-29 01:53:43: Invoking "/u01/app/12.1.0/grid/sqlpatch/sqlpatch -db -MGMTDB"
2019-07-29 01:53:43: Running as user grid: /u01/app/12.1.0/grid/sqlpatch/sqlpatch -db -MGMTDB
2019-07-29 01:53:43: Invoking "/u01/app/12.1.0/grid/sqlpatch/sqlpatch -db -MGMTDB" as user "grid"
2019-07-29 01:53:43: Executing /bin/su grid -c "/u01/app/12.1.0/grid/sqlpatch/sqlpatch -db -MGMTDB"
2019-07-29 01:53:43: Executing cmd: /bin/su grid -c "/u01/app/12.1.0/grid/sqlpatch/sqlpatch -db -MGMTDB"

5) Apply Patch 29774383 – Oracle JavaVM Component 12.1.0.2.190716 Database PSU.

[oracle@racnode1 patches]$ cd 29774383
[oracle@racnode1 29774383]$
[oracle@racnode1 29774383]$ opatch prereq CheckConflictAgainstOHWithDetail -ph ./
Oracle Interim Patch Installer version 12.2.0.1.17
Copyright (c) 2019, Oracle Corporation. All rights reserved.

PREREQ session

Oracle Home : /u01/app/oracle/product/12.1.0/dbhome_1
Central Inventory : /u01/app/oraInventory
from : /u01/app/oracle/product/12.1.0/dbhome_1/oraInst.loc
OPatch version : 12.2.0.1.17
OUI version : 12.1.0.2.0
Log file location : /u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/opatch/opatch2019-07-31_20-53-44PM_1.log

Invoking prereq "checkconflictagainstohwithdetail"

Prereq "checkConflictAgainstOHWithDetail" passed.

OPatch succeeded.

[oracle@racnode1 29774383]$ opatch apply
-- on node 2

[oracle@racnode2 29774383]$ export PATH=$PATH:$ORACLE_HOME/OPatch
[oracle@racnode2 29774383]$ opatch apply

Loading Modified SQL Files Into the Database

[oracle@racnode2 29774383]$ sqlplus /nolog

SQL>  CONNECT / AS SYSDBA
Connected to an idle instance.

SQL> STARTUP
ORACLE instance started.
Database mounted.
Database opened.

SQL> alter system set cluster_database=false scope=spfile;
System altered.

[oracle@racnode2 29774383]$ srvctl stop database -d RACTEST
[oracle@racnode2 29774383]$ sqlplus /nolog

SQL> CONNECT / AS SYSDBA
Connected to an idle instance.

SQL> STARTUP UPGRADE
ORACLE instance started.
Database mounted.
Database opened.

SQL> alter pluggable database all open upgrade;

Pluggable database altered.
[oracle@racnode2 29774383]$ cd $ORACLE_HOME/OPatch
[oracle@racnode2 OPatch]$ ./datapatch -verbose
SQL Patching tool version 12.1.0.2.0 Production on Wed Jul 31 21:21:07 2019
Copyright (c) 2012, 2016, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/
                 sqlpatch_25124_2019_07_31_21_21_07/sqlpatch_invocation.log

Connecting to database...OK
Note:  Datapatch will only apply or rollback SQL fixes for PDBs
       that are in an open state, no patches will be applied to closed PDBs.
    Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
       (Doc ID 1585822.1)
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Patch 29774383 (Database PSU 12.1.0.2.190716, Oracle JavaVM Component (JUL2019)):
  Installed in the binary registry only
Bundle series PSU:
  ID 190716 in the binary registry and ID 190716 in PDB CDB$ROOT, ID 190716 in PDB PDB$SEED, ID 190716 in PDB PDB_A

Adding patches to installation queue and performing prereq checks...
Installation queue:
  For the following PDBs: CDB$ROOT PDB$SEED PDB_A
    Nothing to roll back
    The following patches will be applied:
      29774383 (Database PSU 12.1.0.2.190716, Oracle JavaVM Component (JUL2019))

Installing patches...
Patch installation complete.  Total patches installed: 3

Validating logfiles...
Patch 29774383 apply (pdb CDB$ROOT): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/29774383/22961858/29774383_apply_RACTEST_CDBROOT_2019Jul31_21_22_21.log (no errors)
Patch 29774383 apply (pdb PDB$SEED): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/29774383/22961858/29774383_apply_RACTEST_PDBSEED_2019Jul31_21_25_13.log (no errors)
Patch 29774383 apply (pdb PDB_A): SUCCESS
  logfile: /u01/app/oracle/cfgtoollogs/sqlpatch/29774383/22961858/29774383_apply_RACTEST_PDB_A_2019Jul31_21_25_13.log (no errors)
SQL Patching tool complete on Wed Jul 31 21:26:52 2019
[oracle@racnode2 OPatch]$ sqlplus /nolog

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jul 31 21:29:55 2019

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

SQL> Connect / as sysdba
Connected.
SQL> alter system set cluster_database=true scope=spfile;

System altered.

SQL> SHUTDOWN
Database closed.
Database dismounted.
ORACLE instance shut down.

[oracle@racnode2 OPatch]$ srvctl start  database -d RACTEST

[oracle@racnode2 OPatch]$ sqlplus / as sysdba

SQL> @ $ORACLE_HOME/rdbms/admin/utlrp.sql

Finally, check the applied patches on both nodes and in all databases.

$ opatch lsinventory
SQL> show con_name

CON_NAME
---------------
CDB$ROOT

SQL> select PATCH_ID, STATUS, VERSION, DESCRIPTION
       from dba_registry_sqlpatch;
  PATCH_ID STATUS     VERSION   DESCRIPTION
---------- ---------- --------- ------------------------------------------
  29494060 SUCCESS    12.1.0.2  DATABASE PATCH SET UPDATE 12.1.0.2.190716
  29774383 SUCCESS    12.1.0.2  Database PSU 12.1.0.2.190716, 
                                Oracle JavaVM Component (JUL2019)
SQL> alter session set container=PDB_A;

Session altered.

SQL> show con_name;

CON_NAME
------------------------------
PDB_A

SQL> select PATCH_ID, STATUS, VERSION, DESCRIPTION
      from dba_registry_sqlpatch;

PATCH_ID  STATUS  VERSION  DESCRIPTION
--------- ------- -------- --------------------------------------
29494060  SUCCESS 12.1.0.2 DATABASE PATCH SET UPDATE 12.1.0.2.190716
29774383  SUCCESS 12.1.0.2 Database PSU 12.1.0.2.190716, 
                           Oracle JavaVM Component (JUL2019)
SQL> show parameter instance_name

NAME           TYPE         VALUE
-------------- ----------- ------------------------
instance_name   string     -MGMTDB

SQL>  select PATCH_ID, STATUS, VERSION, DESCRIPTION  
        from dba_registry_sqlpatch;

  PATCH_ID STATUS  VERSION    DESCRIPTION
---------- ------- ---------- ------------------------------------------
  29494060 SUCCESS 12.1.0.2    DATABASE PATCH SET UPDATE 12.1.0.2.190716
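The `opatch lsinventory` side of this final check can also be scripted; a sketch (the `patch_applied` helper is my own, and the "Patch <id> : applied ..." line format is assumed from typical lsinventory output; verify it against yours):

```shell
#!/bin/sh
# Check whether a given patch ID appears in `opatch lsinventory` output
# fed on stdin; exits 0 if found.
patch_applied() {
  grep -q "Patch *$1"
}

# Hypothetical live usage:
#   opatch lsinventory | patch_applied 29494060 && echo applied
printf 'Patch  29494060     : applied on Sun Jul 28 23:40:11 EST 2019\n' \
  | patch_applied 29494060 && echo applied
# -> applied
```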