PRVF-7590 PRVG-1024 PRVF-5415 PRVF-9652 While Running cluvfy comp clocksync

$ cluvfy comp clocksync -n all -verbose

The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP configuration file "/etc/ntp.conf" existence check passed

Checking daemon liveness...

Check: Liveness for "ntpd"
Node Name   Running?
----------  ------------------------
racnode2    no
racnode1    no
PRVF-7590 : "ntpd" is not running on node "racnode2"
PRVF-7590 : "ntpd" is not running on node "racnode1"
PRVG-1024 : The NTP Daemon or Service was not running on any of the 
            cluster nodes.
PRVF-5415 : Check to see if NTP daemon or service is running failed
Result: Clock synchronization check using Network Time Protocol(NTP) 
        failed

PRVF-9652 : Cluster Time Synchronization Services check failed

Verification of Clock Synchronization across the cluster nodes was 
unsuccessful on all the specified nodes.

But ntpd demon process is running :

#systemctl status ntpd
● ntpd.service - Network Time Service
Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
Active: active (running) since Sun 2019-09-08 21:06:46 AEST; 58min ago
Process: 2755 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 2756 (ntpd)
CGroup: /system.slice/ntpd.service
└─2756 /usr/sbin/ntpd -u ntp:ntp -g

Debug “cluvfy” or “runcluvfy.sh”:

$ rm -rf /tmp/cvutrace
$ mkdir /tmp/cvutrace
$ export CV_TRACELOC=/tmp/cvutrace
$ export SRVM_TRACE=true
$ export SRVM_TRACE_LEVEL=1

$ cluvfy comp clocksync -n all -verbose

$ ls -ltr  /tmp/cvutrace
total 1960
-rw-r--r-- 1 grid oinstall       0 Sep  8 21:46 cvutrace.log.0.lck
-rw-r--r-- 1 grid oinstall       0 Sep  8 21:47 cvuhelper.log.0.lck
-rw-r--r-- 1 grid oinstall    1586 Sep  8 21:47 cvuhelper.log.0
-rw-r--r-- 1 grid oinstall 2000962 Sep  8 21:47 cvutrace.log.0

From the trace file, it complaints “file check failed”  for file “/var/run/ntpd.pid”.

$]$ tail -20  /tmp/cvutrace/cvutrace.log.0
[main] [ 2019-09-08 21:47:03.312 EST ] [VerificationLogData.traceLogData:251]  FINE: [Task.perform:514]
m_nodeList='racnode2,racnode1'
[main] [ 2019-09-08 21:47:03.312 EST ] [VerificationLogData.traceLogData:251]  INFO: [sVerificationUtil.getUniqueDistributionID:494]  DistributionID[0]:7.2
[main] [ 2019-09-08 21:47:03.312 EST ] [VerificationLogData.traceLogData:251]  INFO: [sVerificationUtil.getUniqueDistributionID:559]  ==== Distribution Id determined to be OL7
[main] [ 2019-09-08 21:47:03.312 EST ] [VerificationLogData.traceLogData:251]  FINE: [VerificationCommand.execute:297]
Output: '<CV_VRES>1</CV_VRES><CV_LOG>Exectask: file check failed</CV_LOG><CV_CMDLOG><CV_INITCMD>/tmp/CVU_12.1.0.2.0_grid/exectask -chkfile /var/run/ntpd.pid </CV_INITCMD><CV_CMD>access() /var/run/ntpd.pid F_OK</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>2</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>'
[main] [ 2019-09-08 21:47:03.313 EST ] [VerificationLogData.traceLogData:251]  FINE: [VerificationCommand.execute:297]
Output: '<CV_VRES>1</CV_VRES><CV_LOG>Exectask: file check failed</CV_LOG><CV_CMDLOG><CV_INITCMD>/tmp/CVU_12.1.0.2.0_grid/exectask -chkfile /var/run/ntpd.pid </CV_INITCMD><CV_CMD>access() /var/run/ntpd.pid F_OK</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>2</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>'
[main] [ 2019-09-08 21:47:03.313 EST ] [VerificationLogData.traceLogData:251]  ERROR: [Result.addErrorDescription:624]  PRVF-7590 : "ntpd" is not running on node "racnode2"
[main] [ 2019-09-08 21:47:03.313 EST ] [VerificationLogData.traceLogData:251]  ERROR: [Result.addErrorDescription:624]  PRVF-7590 : "ntpd" is not running on node "racnode1"
[main] [ 2019-09-08 21:47:03.313 EST ] [VerificationLogData.traceLogData:251]  FINE: [Task.perform:594]
TaskDaemonLiveliness:Daemon Liveness[TASKDAEMONLIVELINESS]:TASK_SUMMARY:FAILED:CRITICAL:VERIFICATION_FAILED
          ERRORMSG(racnode2): PRVF-7590 : "ntpd" is not running on node "racnode2"
          ERRORMSG(racnode1): PRVF-7590 : "ntpd" is not running on node "racnode1"
[main] [ 2019-09-08 21:47:03.313 EST ] [VerificationLogData.traceLogData:251]  ERROR: [ResultSet.addErrorDescription:1102]  PRVG-1024 : The NTP Daemon or Service was not running on any of the cluster nodes.
[main] [ 2019-09-08 21:47:03.313 EST ] [VerificationLogData.traceLogData:251]  ERROR: [ResultSet.addErrorDescription:1102]  PRVF-5415 : Check to see if NTP daemon or service is running failed
[main] [ 2019-09-08 21:47:03.313 EST ] [VerificationLogData.traceLogData:251]  FINE: [Task.perform:594]
TaskCTSSIntegrity:Clock Synchronization[TASKCTSSINTEGRITY]:TASK_SUMMARY:FAILED:CRITICAL:VERIFICATION_FAILED
          ERRORMSG(GLOBAL): PRVF-5415 : Check to see if NTP daemon or service is running failed
[main] [ 2019-09-08 21:47:03.313 EST ] [CluvfyDriver.main:360]  ==== cluvfy exiting normally.

SOLUTION

As per Oracle GI installation documentation, configure ntpd service to start with a pidfile . Edit “/etc/sysconfig/ntpd” and modify the below line

OPTIONS="-g"

to

OPTIONS="-g -p /var/run/ntpd.pid"
# systemctl restart ntpd
# ls -l /var/run/ntpd*
-rw-r--r-- 1 root root 4 Sep 8 22:21 /var/run/ntpd.pid

$ cluvfy comp clocksync -n all -verbose

Verifying Clock Synchronization across the cluster nodes

Checking if Clusterware is installed on all nodes...
Oracle Clusterware is installed on all nodes.

Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
Node Name  Status
---------  ------------------------
racnode1   passed
racnode2   passed
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
Check: CTSS state
Node Name    State
------------ ------------------------
racnode2     Observer
racnode1     Observer
CTSS is in Observer state. Switching over to clock synchronization 
     checks using NTP


Starting Clock synchronization checks using Network Time Protocol(NTP)

Checking existence of NTP configuration file "/etc/ntp.conf" across 
nodes

Node Name File exists?
--------- -----------------------
racnode2  yes
racnode1  yes
The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP configuration file "/etc/ntp.conf" existence check passed

Checking daemon liveness...

Check: Liveness for "ntpd"
Node Name   Running?
---------- ------------------------
racnode2    yes
racnode1    yes
Result: Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes

Checking whether NTP daemon or service is using UDP port 123 
on all nodes

Check for NTP daemon or service using UDP port 123
Node Name  Port Open?
---------- -----------------------
racnode2   yes
racnode1   yes
Check for synchronization of NTP daemon with at least one external 
time source passed on all nodes.

Result: Clock synchronization check using Network Time Protocol(NTP) 
        passed


Oracle Cluster Time Synchronization Services check passed

Verification of Clock Synchronization across the cluster nodes was 
successful.

Disable Chronyd and Enable CTSSD into Active Mode in Linux 7

As we know, if any Network Time Protocol (NTP)  demon ntpd or chronyd  running,  then Oracle cluster ware CTSS ( Cluster Time Synchronization Service ) will run in Observer mode.

$ crsctl check ctss
CRS-4701:The Cluster Time Synchronization Service is in Observer mode.
$crsctl stat res -t -init
...
..
.
ora.ctssd
      1   ONLINE  ONLINE  racnode1    BSERVER,STABLE

Now we disable Chronyd, and remove all Chrony configurations.

# systemctl stop chronyd
# systemctl disable chronyd
Removed symlink /etc/systemd/system/multi-user.target.wants/chronyd.service.
# yum remove chrony
Loaded plugins: ulninfo
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Resolving Dependencies
--> Running transaction check
---> Package chrony.x86_64 0:2.1.1-1.el7 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

Check configurations files are gone, otherwise ctssd still thinks NTP servers are running.

$ ls -ltr /etc/chro*
-rw-r-----. 1 root chrony 62 Nov 24 2015 /etc/chrony.keys.rpmsave

Check again, we see CTSSD running in ACTIVE mode now.

$ crsctl check ctss
CRS-4701: The Cluster Time Synchronization Service is in Active mode.
CRS-4702: Offset (in msec): 0
$ crsctl stat res -t -init

ora.ctssd
     1 ONLINE ONLINE racnode1 ACTIVE:0,STABLE
$ cluvfy comp clocksync -n all -verbose

Verifying Clock Synchronization across the cluster nodes

Checking if Clusterware is installed on all nodes...
Oracle Clusterware is installed on all nodes.

Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
Node Name    Status
----------- ------------------------
racnode2    passed
racnode1    passed
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
Check: CTSS state
Node Name    State
----------- ------------------------
racnode2    Active
racnode1    Active
CTSS is in Active state. Proceeding with check of clock time offsets on all nodes...
Reference Time Offset Limit: 1000.0 msecs
Check: Reference Time Offset
Node Name     Time Offset Status
------------ ----------- ------------------------
racnode2     0.0         passed
racnode1     0.0         passed

Time offset is within the specified limits on the following set of nodes:
"[racnode2, racnode1]"
Result: Check of clock time offsets passed

Oracle Cluster Time Synchronization Services check passed

Verification of Clock Synchronization across the cluster nodes was successful.

crs alert.log

2019-09-08 18:46:55.004 [OCTSSD(22044)]CRS-2410: The Cluster Time 
         Synchronization Service on host racnode2 is in active mode.

octssd.trc on master node ( racnode2 ):

....
..
.
2019-09-08 19:31:56.380369 : CTSS:1714730752: sclsctss_ivsr2: default pid file not found
2019-09-08 19:31:56.380386 : CTSS:1714730752: sclsctss_ivsr2: default pid file not found
2019-09-08 19:31:56.380393 : CTSS:1714730752: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
...
..
.

octssd.trc on non-master node ( racnode1 ):

2019-09-08 19:39:07.441725 : CTSS:2003805952: ctsselect_msm: CTSS mode is [0xc4]
2019-09-08 19:39:07.441736 : CTSS:2003805952: ctssslave_swm1_2: Ready to initiate new time sync process.
2019-09-08 19:39:07.442805 : CTSS:2003805952: ctssslave_swm2_1: Waiting for time sync message from master. sync_state[2].
2019-09-08 19:39:07.447917 : CTSS:2008008448: ctssslave_msg_handler4_1: Waiting for slave_sync_with_master to finish sync process. sync_state[3].
2019-09-08 19:39:07.447926 : CTSS:2003805952: ctssslave_swm2_3: Received time sync message from master.
2019-09-08 19:39:07.447935 : CTSS:2003805952: ctssslave_swm15: The CTSS master is ahead this node. The local time offset [11975 usec] is being adjusted. Sync method [2]
2019-09-08 19:39:07.447938 : CTSS:2003805952: ctssslave_swm17: LT [1567935547sec 447908usec], MT [1567935547sec 139990164505707usec], Delta [6167usec]
2019-09-08 19:39:07.447940 : CTSS:2003805952: ctssslave_swm19: The offset is [-11975 usec] and sync interval set to [1]
2019-09-08 19:39:07.447943 : CTSS:2003805952: ctsselect_msm: Sync interval returned in [1]
2019-09-08 19:39:07.447950 : CTSS:2008008448: ctssslave_msg_handler4_3: slave_sync_with_master finished sync process. Exiting clsctssslave_msg_handler