Autonomous Health Framework (AHF) Inventory Status STOPPED

This post demonstrates how to resolve an AHF Inventory Status of “STOPPED” without uninstalling and reinstalling AHF.

# ahfctl statusahf

.-------------------------------------------------------------------------------------------------.
| Host     | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+----------+---------------+--------+------+------------+----------------------+------------------+
| racnode1 | RUNNING       |  10052 | 5000 | 21.4.0.0.0 | 21400020211220074549 | STOPPED          |
| racnode2 | RUNNING       | 384602 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| racnode3 | RUNNING       |  20041 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| racnode4 | RUNNING       | 228081 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'----------+---------------+--------+------+------------+----------------------+------------------'

High CPU and Memory Usage by Autonomous Health Framework (AHF)

A client’s ExaCC platform experienced both CPU and memory pressure. After detailed investigation, we found that Autonomous Health Framework (AHF) processes were consuming large amounts of CPU and memory. The following are the diagnoses and solutions.

INVESTIGATIONS AND SOLUTIONS

Many “asmcmd daemon” Processes with High CPU Usage

There are many hung “asmcmd daemon” processes with high CPU usage.

$ top

PID    USER  PR  NI   VIRT RES  SHR   %CPU %MEM TIME+ COMMAND
71337 grid  20  0 5717400 5.2g  1044 43.0  0.7 1:07.71 asmcmd daemon
137157 grid  20  0 4751716 4.4g 14880 41.0  0.6 5:43.21 asmcmd daemon
76530 grid  20  0 5671444 5.2g  1044 40.4  0.7 2:46.47 asmcmd daemon
96115 grid  20  0 4750220 4.3g  1044 40.4  0.6 5:00.40 asmcmd daemon
81230 grid  20  0 5704468 5.2g  1036 39.7  0.7 3:58.48 asmcmd daemon
...
..
.
$ ps -eaf | grep -i asmcmd

grid 71337 1 40 Aug06 ? 13:01:13 asmcmd daemon
grid 76530 1 40 Aug06 ? 13:02:52 asmcmd daemon
grid 81230 1 40 Aug06 ? 13:04:04 asmcmd daemon
grid 96115 1 40 Aug06 ? 09:35:06 asmcmd daemon
grid 115047 81230 0 09:33 ? 00:00:00 sh -c /u01/app/12.2.0.1/grid/bin/clsecho -t -o /u01/app/12.2.0.1/grid/log/diag/asmcmd/user_grid/racnode1/alert/alert.log "ASMCMD Background (PID = 81230) 2> /tmp/clsecho_stderr_file.txt
grid 115058 76530 0 09:33 ? 00:00:00 [asmcmd daemon]
grid 115060 126727 0 09:33 ? 00:00:00 [asmcmd daemon]
...
..
.

WORKAROUND

1) Kill the “asmcmd daemon” processes (see the example below).

2) Upgrade Trace File Analyzer (TFA) to the latest version.
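
A minimal sketch of step 1, assuming the hung daemons are the ones reported by top above (the PIDs shown are illustrative; verify the process list on your own node before killing anything):

$ ps -eo pid,user,%cpu,cmd | grep '[a]smcmd daemon'
$ kill -9 71337 76530 81230 96115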

TFA – Oracle Trace File Analyzer

Uninstall TFA

1) Check TFA_HOME.

$ grep TFA_HOME= /etc/init.d/init.tfa
TFA_HOME=/u01/app/12.2.0.1/grid/tfa/racnode1/tfa_home

2) Uninstall TFA as the root user.

[root@racnode1 bin]# ./tfactl -h

Usage : /u01/app/12.2.0.1/grid/bin/tfactl <command> [options]
 commands:diagcollect|collection|analyze|ips|run|start|stop|enable|disable|status|print|access|purge|directory|host|receiver|set|toolstatus|uninstall|diagnosetfa
For detailed help on each command use:
 /u01/app/12.2.0.1/grid/bin/tfactl <command> -help

[root@racnode1 bin]# /u01/app/12.2.0.1/grid/bin/tfactl uninstall

TFA will be uninstalled on node racnode1 :

Removing TFA from racnode1 only
Please remove TFA locally on any other configured nodes

Notifying Other Nodes about TFA Uninstall...
TFA is not yet secured to run all commands
FAIL
Sleeping for 10 seconds...
Stopping TFA Support Tools...
Stopping TFA in racnode1...
Shutting down TFA
Removed symlink /etc/systemd/system/multi-user.target.wants/oracle-tfa.service.
Removed symlink /etc/systemd/system/graphical.target.wants/oracle-tfa.service.
. . . . .
. . .
Successfully shutdown TFA..

Deleting TFA support files on racnode1:
Removing /u01/app/grid/tfa/racnode1/database...
Removing /u01/app/grid/tfa/racnode1/log...
Removing /u01/app/grid/tfa/racnode1/output...
Removing /u01/app/grid/tfa/racnode1...
Removing /u01/app/grid/tfa...
Removing /etc/rc.d/rc0.d/K17init.tfa
Removing /etc/rc.d/rc1.d/K17init.tfa
Removing /etc/rc.d/rc2.d/K17init.tfa
Removing /etc/rc.d/rc4.d/K17init.tfa
Removing /etc/rc.d/rc6.d/K17init.tfa
Removing /etc/init.d/init.tfa...
Removing /u01/app/12.2.0.1/grid/bin/tfactl...
Removing /u01/app/12.2.0.1/grid/tfa/bin...
Removing /u01/app/12.2.0.1/grid/tfa/racnode1...
Removing /u01/app/12.2.0.1/grid/tfa...
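
As the output notes, this removes TFA from racnode1 only; the same uninstall is then run as root on each remaining node, for example (hostname illustrative):

[root@racnode2 ~]# /u01/app/12.2.0.1/grid/bin/tfactl uninstall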

Install TFA

1) Install TFA locally as the root user on each RAC node.

[root@racnode1 TFA]# ./installTFA-LINUX -tfabase /u01/app/grid -javahome /u01/app/12.2.0.1/grid/jdk
TFA Installation Log will be written to File : /tmp/tfa_install_49362_2017_08_31-11_35_54.log

Starting TFA installation

TFA Version: 122122 Build Date: 201707270831

Running Auto Setup for TFA as user root...

Would you like to do a [L]ocal only or [C]lusterwide installation ? [L|l|C|c] [C] : L
Installing TFA now...

Discovering Nodes and Oracle resources

Starting Discovery...

Getting list of nodes in cluster . . . . .

List of nodes in cluster:
racnode1
racnode2

CRS_HOME=/u01/app/12.2.0.1/grid

Searching for running databases...
1. RACDEV1
2. RACTEST1

Searching out ORACLE_HOME for selected databases...
Getting Oracle Inventory...
ORACLE INVENTORY: /u01/app/oraInventory
Discovery Complete...

TFA Will be Installed on racnode1...
Checking JAVA Status on all nodes ...
TFA will scan the following Directories
++++++++++++++++++++++++++++++++++++++++++++

.------------------------------------------------------------------.
|                            racnode1                              |
+-------------------------------------------------------+----------+
| Trace Directory                                       | Resource |
+-------------------------------------------------------+----------+
| /u01/app/12.2.0.1/grid/cfgtoollogs                    | CFGTOOLS |
| /u01/app/12.2.0.1/grid/crf/db/racnode1                | CRS      |
| /u01/app/12.2.0.1/grid/crs/log                        | CRS      |
| /u01/app/12.2.0.1/grid/css/log                        | CRS      |
| /u01/app/12.2.0.1/grid/cv/log                         | CRS      |
| /u01/app/12.2.0.1/grid/evm/admin/log                  | CRS      |
| /u01/app/12.2.0.1/grid/evm/admin/logger               | CRS      |
| /u01/app/12.2.0.1/grid/evm/log                        | CRS      |
| /u01/app/12.2.0.1/grid/install                        | INSTALL  |
| /u01/app/12.2.0.1/grid/inventory/ContentsXML          | INSTALL  |
| /u01/app/12.2.0.1/grid/log                            | CRS      |
| /u01/app/12.2.0.1/grid/network/log                    | CRS      |
| /u01/app/12.2.0.1/grid/opmn/logs                      | CRS      |
| /u01/app/12.2.0.1/grid/racg/log                       | CRS      |
| /u01/app/12.2.0.1/grid/rdbms/log                      | ASM      |
| /u01/app/12.2.0.1/grid/scheduler/log                  | CRS      |
| /u01/app/12.2.0.1/grid/srvm/log                       | CRS      |
| /u01/app/grid/cfgtoollogs                             | CFGTOOLS |
| /u01/app/grid/crsdata/racnode1/acfs                   | ACFS     |
| /u01/app/grid/crsdata/racnode1/afd                    | ASM      |
| /u01/app/grid/crsdata/racnode1/chad                   | CRS      |
| /u01/app/grid/crsdata/racnode1/core                   | CRS      |
| /u01/app/grid/crsdata/racnode1/crsconfig              | CRS      |
| /u01/app/grid/crsdata/racnode1/crsdiag                | CRS      |
| /u01/app/grid/crsdata/racnode1/cvu                    | CRS      |
| /u01/app/grid/crsdata/racnode1/evm                    | CRS      |
| /u01/app/grid/crsdata/racnode1/output                 | CRS      |
| /u01/app/grid/crsdata/racnode1/trace                  | CRS      |
| /u01/app/grid/diag/asm/+asm/+ASM1/cdump               | ASM      |
| /u01/app/grid/diag/crs/racnode1/crs/cdump             | CRS      |
| /u01/app/grid/diag/crs/racnode1/crs/trace             | CRS      |
| /u01/app/grid/diag/rdbms/_mgmtdb/-MGMTDB/cdump        | RDBMS    |
| /u01/app/grid/diag/tnslsnr/racnode1/listener/cdump    | TNS      |
...
..
.
| /u01/app/oraInventory/ContentsXML                     | INSTALL  |
| /u01/app/oraInventory/logs                            | INSTALL  |
...
..
.

Installing TFA on racnode1:
HOST: racnode1 TFA_HOME: /u01/app/grid/tfa/racnode1/tfa_home

.-----------------------------------------------------------------------------.
| Host     | Status of TFA | PID   | Port | Version    | Build ID             |
+----------+---------------+-------+------+------------+----------------------+
| racnode1 |    RUNNING    | 51354 | 5000 | 12.2.1.2.2 | 12212220170727083130 |
'----------+---------------+-------+------+------------+----------------------'

Running Inventory in All Nodes...
Enabling Access for Non-root Users on racnode1...
Adding default users to TFA Access list...

Summary of TFA Installation:
.-----------------------------------------------------------.
|                 racnode1                                  |
+---------------------+-------------------------------------+
| Parameter           |            Value                    |
+---------------------+-------------------------------------+
| Install location    | /u01/app/grid/tfa/racnode1/tfa_home |
| Repository location | /u01/app/grid/tfa/repository        |
| Repository usage    | 0 MB out of 10240 MB                |
'---------------------+-------------------------------------'

TFA is successfully installed...

Usage : /u01/app/12.2.0.1/grid/bin/tfactl <command> [options]
 commands:diagcollect|collection|analyze|ips|run|start|stop|enable|disable|status|print|access|purge|directory|host|receiver|set|toolstatus|uninstall|diagnosetfa|syncnodes
For detailed help on each command use:
 /u01/app/12.2.0.1/grid/bin/tfactl <command> -help

2) Restart OSWatcher with the gzip option (the arguments used below set a 15-second snapshot interval, 168 hours of archive retention, and gzip compression).

[root@racnode1 TFA]# /u01/app/12.2.0.1/grid/bin/tfactl
 tfactl> status oswbb
Check run status of TFA process
Usage : /u01/app/12.2.0.1/grid/bin/tfactl status

 tfactl> stop oswbb
Stopped OSWatcher

 tfactl> start oswbb 15 168 gzip
Starting OSWatcher

[root@racnode1 TFA]# ps -eaf | grep -i osw |grep -v grep
grid 19631 1 0 11:52 pts/3 00:00:00 /bin/sh ./OSWatcher.sh 15 168 gzip /u01/app/grid/tfa/repository/suptools/racnode1/oswbb/grid/archive
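
To confirm the gzip option took effect, the archive directory shown in the ps output above should start accumulating compressed snapshot files; a quick check (path taken from that output) could be:

$ find /u01/app/grid/tfa/repository/suptools/racnode1/oswbb/grid/archive -name '*.gz' | head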

Synchronize TFA between RAC nodes

[root@racnode1 ~]# /u01/app/12.2.0.1/grid/bin/tfactl syncnodes

Current Node List in TFA :
1. racnode1

Node List in Cluster :
1. racnode1
2. racnode2

Node List to sync TFA Certificates :
 1 racnode2
 
Do you want to update this node list? [Y|N] [N]: Y

Please Enter all the remote nodes you want to sync...

Enter Node List (seperated by space) : racnode1 racnode2

Node List to sync TFA Certificates :
 1 racnode2
 
Syncing TFA Certificates on racnode2 :

TFA_HOME on racnode2 : /u01/app/grid/tfa/racnode2/tfa_home

Copying TFA Certificates to racnode2...
root@racnode2's password:
Copying SSL Properties to racnode2...
root@racnode2's password:

Restarting TFA on racnode2...
root@racnode2's password:
Restarting TFA..
Killing TFA running with pid 17697
Waiting up to 120 seconds for TFA to be re-started..
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
Successfully re-started TFA..

.--------------------------------------------------------------------------------------------------------.
| Host     | Status of TFA | PID        | Port    | Version    | Build ID             | Inventory Status |
+----------+---------------+------------+---------+------------+----------------------+------------------+
| racnode1 | RUNNING       | 117654     | 5000    | 12.2.1.2.2 | 12212220170727083130 | COMPLETE         |
| racnode2 | RUNNING       | 114949     | 5000    | 12.2.1.2.2 | 12212220170727083130 | COMPLETE         |
'----------+---------------+------------+---------+------------+----------------------+------------------'