ORA-12516 from the “oclumon manage -repos” command

SYMPTOM

Running the oclumon command in 12.2 GI to check CHM retention fails with an ORA-12516 error:

$ oclumon manage -repos checkretentiontime 259200
Failed change retention. Error returned ORA-12516: TNS:listener could not find available handler with matching protocol stack

INVESTIGATION

Check that MGMTLSNR is enabled and running

$ srvctl status mgmtlsnr
Listener MGMTLSNR is enabled
Listener MGMTLSNR is running on node(s): racnode1

All services are registered with MGMTLSNR

Both the private IP and the private HAIP are registered with the listener.

$ lsnrctl status MGMTLSNR
...
..
.
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=MGMTLSNR)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.1.1.11)(PORT=1526)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=169.254.105.132)(PORT=1526)))
..
.
Service "_mgmtdb" has 1 instance(s).
Instance "-MGMTDB", status READY, has 1 handler(s) for this service…
Service "gimr_dscrep_10" has 1 instance(s).
Instance "-MGMTDB", status READY, has 1 handler(s) for this service…
The command completed successfully

MGMTDB is DISABLED

$ srvctl status mgmtdb
Database is disabled
Instance -MGMTDB is running on node racnode1

strace log

The strace log shows the connection using the private HAIP address (169.254.105.132), with recvfrom repeatedly returning EAGAIN.

66684 [00007f94efff1687] getsockname(50, {sa_family=AF_INET, sin_port=htons(48322), sin_addr=inet_addr("169.254.105.132")}, [16]) = 0
66684 [00007f94efff1657] getpeername(50, {sa_family=AF_INET, sin_port=htons(61021), sin_addr=inet_addr("169.254.105.132")}, [16]) = 0
66684 [00007f94f2603aeb] recvfrom(50, 0x1a63888, 10240, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
66684 [00007f94f2603aeb] recvfrom(50, 0x1a63888, 10240, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
66684 [00007f94f2603aeb] recvfrom(50, 0x1a63888, 10240, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
66684 [00007f94f2603aeb] recvfrom(50, 0x1a63888, 10240, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
66684 [00007f94f2603aeb] recvfrom(50, 0x1a63888, 10240, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)

LOCAL_LISTENER

local_listener is set without the private HAIP address.

SQL> show parameter local_listener

NAME                 TYPE     VALUE
-------------------  ------  -------------------------------
local_listener       string   (ADDRESS=(PROTOCOL=TCP)(HOST=
                                     10.1.1.11)(PORT=1526))

CAUSES

local_listener contains only the private IP (10.1.1.11); the private HAIP (169.254.105.132) is missing.
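The comparison behind this diagnosis can be sketched in Python. This is only an illustration, not an Oracle tool: the helper names (tcp_endpoints, missing_from_local_listener) are hypothetical, and the input strings are copied from the lsnrctl and "show parameter" output above.

```python
import re

# Matches (HOST=...)(PORT=...) pairs inside Oracle Net address strings.
# Whitespace is tolerated because SQL*Plus wraps long parameter values.
ADDRESS_RE = re.compile(
    r"\(HOST=\s*([^)\s]+)\s*\)\s*\(PORT=\s*(\d+)\s*\)", re.IGNORECASE
)

def tcp_endpoints(text):
    """Return the set of (host, port) pairs found in an address string."""
    return {(host, int(port)) for host, port in ADDRESS_RE.findall(text)}

def missing_from_local_listener(listener_endpoints, local_listener_value):
    """Endpoints the listener serves that local_listener does not name."""
    return tcp_endpoints(listener_endpoints) - tcp_endpoints(local_listener_value)

# TCP endpoints reported by "lsnrctl status MGMTLSNR" (the IPC entry has no HOST).
listener_endpoints = """
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=MGMTLSNR)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.1.1.11)(PORT=1526)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=169.254.105.132)(PORT=1526)))
"""

# Value reported by "show parameter local_listener" before the fix.
local_listener_value = "(ADDRESS=(PROTOCOL=TCP)(HOST=10.1.1.11)(PORT=1526))"

missing = missing_from_local_listener(listener_endpoints, local_listener_value)
print(sorted(missing))
```

With the values from this case, the only endpoint reported as missing is the private HAIP address on port 1526.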

SOLUTION

ENABLE MGMTDB

$ srvctl enable mgmtdb

$ srvctl status mgmtdb
Database is enabled
Instance -MGMTDB is running on node racnode1

Add the private HAIP to local_listener

SQL> alter system set local_listener='(ADDRESS=(PROTOCOL=TCP)(HOST=10.1.1.11)(PORT=1526))','(ADDRESS=(PROTOCOL=TCP)(HOST=169.254.105.132)(PORT=1526))' scope=both;

System altered.

SQL> show parameter local_listener

NAME          TYPE    VALUE
------------- ------- ------------------------------
local_listener string (ADDRESS=(PROTOCOL=TCP)(HOST=10.1.1.11)
                      (PORT=1526)), (ADDRESS=(PROTOCOL=TCP)
                      (HOST=169.254.105.132)(PORT=1526))
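When more than one address has to be listed, the statement can be generated rather than typed by hand. The following Python sketch simply reproduces the format of the ALTER SYSTEM statement used above; local_listener_statement is a hypothetical helper, and the hosts and port are the ones from this case.

```python
def local_listener_statement(hosts, port):
    """Build an ALTER SYSTEM statement with one quoted TCP address per host."""
    addresses = [
        f"'(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))'" for host in hosts
    ]
    return f"alter system set local_listener={','.join(addresses)} scope=both;"

# Private IP and private HAIP from this case, both on the MGMTLSNR port.
statement = local_listener_statement(["10.1.1.11", "169.254.105.132"], 1526)
print(statement)
```

Review the generated statement before running it in SQL*Plus; the sketch does no validation of the addresses.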

The oclumon command now runs successfully.

$ oclumon manage -repos checkretentiontime 259200
The Cluster Health Monitor repository can support the desired retention for 2 hosts

How To Manage the Cluster Health Monitor (CHM) Repository

Review the Cluster Health Monitor (CHM) repository size periodically so that it meets business needs and fits the space available on the OCR/voting disks.

Where is the Cluster Health Monitor (CHM) repository?

In 11.2, the CHM repository is stored in a Berkeley DB database. Its default location is $GI_HOME/crf/db.

In 12.1, the CHM repository is hosted in the Grid Infrastructure Management Repository (GIMR). By default, the GIMR is stored in the ASM disk group that holds the OCR and voting disk.

What is the recommended CHM data retention?

Oracle Support recommends sizing the CHM repository for 72 hours (three days, or 259,200 seconds) of data retention, for example enough to cover one weekend.

What is the minimum size of the CHM repository?

For 11.2 GI, one day of data retention requires approximately 867 MB per node, so the size of the CHM repository needed to retain 72 hours of data is:

~72 hours of CHM data retention = NumberOfNodes * 3 days * 867 MB

So for a 2-node cluster:

~72 hours of CHM data retention = 2 (nodes) * 3 (days) * 867 MB (per day per node) = 5202 MB

For 12.1, one day of data retention requires approximately 750 MB per node, so the size of the CHM repository needed to retain 72 hours of data is:

~72 hours of CHM data retention = NumberOfNodes * 3 days * 750 MB

So for a 2-node cluster:

~72 hours of CHM data retention = 2 (nodes) * 3 (days) * 750 MB (per day per node) = 4500 MB
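The sizing arithmetic above can be checked with a short Python sketch. The function name is hypothetical; the per-node daily figures (867 MB for 11.2, 750 MB for 12.1) are the approximate values quoted in this note.

```python
def chm_repository_size_mb(nodes, retention_days, mb_per_node_per_day):
    """Approximate repository size: nodes * days * MB per node per day."""
    return nodes * retention_days * mb_per_node_per_day

# Approximate per-node daily figures quoted in the text.
MB_PER_NODE_PER_DAY = {"11.2": 867, "12.1": 750}

# 2-node cluster, 3 days (72 hours) of retention.
sizes = {
    version: chm_repository_size_mb(2, 3, per_day)
    for version, per_day in MB_PER_NODE_PER_DAY.items()
}
for version, size in sizes.items():
    print(f"{version}: {size} MB")
```

For other cluster sizes, change the node count; the result is the value to pass as the repository size in MB when resizing.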

How to see the current CHM repository retention in seconds?

[grid@racnode1 ~]$ /u01/app/12.1.0/grid/bin/oclumon manage -get repsize

CHM Repository Size = 272580 seconds
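Since the value is reported in seconds, it helps to convert it to hours and compare it with the recommended 72 hours. A minimal Python sketch, with hypothetical function names and the value from the output above:

```python
RECOMMENDED_RETENTION_SECONDS = 72 * 3600  # 72 hours = 259200 seconds

def retention_hours(seconds):
    """Convert a retention value in seconds to hours."""
    return seconds / 3600

def meets_recommendation(seconds, minimum=RECOMMENDED_RETENTION_SECONDS):
    """True if the retention is at least the recommended 72 hours."""
    return seconds >= minimum

current = 272580  # value reported by "oclumon manage -get repsize" above
print(f"{retention_hours(current):.1f} hours")
print(meets_recommendation(current))
```

Here 272580 seconds is about 75.7 hours, which is just above the recommended three days.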

How to resize the CHM repository retention?

For 11.2 GI:

To determine the current location of the CHM repository:

$ oclumon manage -get reppath

To move and resize the CHM repository for 3 days of retention on a 2-node cluster:

$ oclumon manage -repos reploc path* -maxspace 5202

* where path = directory path for the new location of the CHM repository

For 12.1:

To resize the CHM repository with one command for 3 days of retention, e.g. on a 2-node cluster:

$ oclumon manage -repos changerepossize 4500

How to verify that the change in repository size meets the desired retention?

In 12.1.0.1:

$ oclumon manage -repos changeretentiontime 260000

This command does not make any changes. It is more of a “what-if” check: if I wanted to change the retention time, how much space would be required?

In 12.1.0.2 the syntax changed, and the command should be used as follows:

[grid@racnode1 ~]$ oclumon manage -repos checkretentiontime 260000

The Cluster Health Monitor repository can support the desired retention for 2 hosts