Tablespace SYSMGMTDATA Is Full in GIMR Database (-MGMTDB)

Cluster Health Monitor (CHM) tables can become fragmented in the MGMTDB database, which eventually fills up tablespace SYSMGMTDATA.

1) Check the datafile of tablespace SYSMGMTDATA in PDB:

SQL> select PDB_NAME from dba_pdbs;

PDB_NAME
-------------
RACTEST
PDB$SEED

SQL> alter session set container=RACTEST;
Session altered. 

SQL> select TABLESPACE_NAME,FILE_NAME,AUTOEXTENSIBLE,
            BYTES/1024/1024,MAXBYTES/1024/1024 
      from dba_data_files 
     where TABLESPACE_NAME='SYSMGMTDATA';

TABLESPACE_NAME  FILE_NAME                                                                              AUT  BYTES/1024/1024  MAXBYTES/1024/1024
---------------  -------------------------------------------------------------------------------------  ---  ---------------  ------------------
SYSMGMTDATA      +OCR_VOTE/_MGMTDB/FD9B43BF6A646F8CE043B6A9E80A2815/DATAFILE/sysmgmtdata.269.922884445  NO   2048             2048

2) Check the CHM data in the repository; the daily row counts show that purging is working:

SQL> select trunc(BEGINTIME), count(*) 
       from chm.CHMOS_PROCESS_INT_TBL 
   group by trunc(BEGINTIME) 
   order by trunc(BEGINTIME);

TRUNC(BEG  COUNT(*)
---------  ---------
21-FEB-17  1222178
22-FEB-17  3265667
23-FEB-17  1327167

3) Check the CHM repository path and its current retention in seconds:

$ oclumon manage -get reppath

CHM Repository Path = +OCR_VOTE/_MGMTDB/FD9B43BF6A646F8CE043B6A9E80A2815/DATAFILE/sysmgmtdata.269.922884445

$ oclumon manage -get repsize

CHM Repository Size = 136320 seconds
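As a quick sanity check, the retention value reported by `oclumon manage -get repsize` can be converted to hours and days with plain shell arithmetic. This is only an illustrative sketch; the 136320-second figure is taken from the output above:

```shell
#!/bin/sh
# Convert the CHM repository retention from seconds to whole hours and days.
# REPSIZE_SECS is the value reported by `oclumon manage -get repsize`.
REPSIZE_SECS=136320

HOURS=$((REPSIZE_SECS / 3600))     # 136320 / 3600 = 37 whole hours
DAYS=$((REPSIZE_SECS / 86400))     # 136320 / 86400 = 1 whole day

echo "Retention: ${REPSIZE_SECS} seconds ~ ${HOURS} hours (${DAYS} day(s))"
```

Here the retention is well under the recommended 72 hours (259,200 seconds), which is one indicator that the repository may be undersized.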

4) Check the large objects in the repository; they all belong to two partitioned tables, CHMOS_PROCESS_INT_TBL and CHMOS_DEVICE_INT_TBL.

SQL> select owner, segment_name,segment_type , bytes/1024/1024 
       from dba_segments 
      where owner='CHM' 
        and bytes/1024/1024>10;
OWNER      SEGMENT_NAME            SEGMENT_TYPE       BYTES/1024/1024
---------- ----------------------  ------------------ ---------------
CHM        CHMOS_PROCESS_INT_TBL    TABLE PARTITION   18
CHM        CHMOS_DEVICE_INT_TBL     TABLE PARTITION   18
CHM        CHMOS_PROCESS_INT_TBL    TABLE PARTITION   18
CHM        CHMOS_DEVICE_INT_TBL     TABLE PARTITION   18

Solution 1: Move the GIMR repository to a different disk group as per “How to Move 12c Grid Infrastructure Management Repository ( GIMR ) to Another Diskgroup“.

Solution 2: Manually truncate the large objects owned by CHM as per Doc ID 2177879.1. Normally the following two tables can be truncated:

SQL> truncate table CHM.CHMOS_PROCESS_INT_TBL;
SQL> truncate table CHM.CHMOS_DEVICE_INT_TBL;

Solution 3: Resize the datafile to make it larger. For example, increase the size from 2048M to 2560M:

SQL> alter database datafile '+OCR_VOTE/_MGMTDB/FD9B43BF6A646F8CE043B6A9E80A2815/DATAFILE/sysmgmtdata.269.922884445' resize 2560m;

Database altered.

How To Manage the Cluster Health Monitor (CHM) Repository

The Cluster Health Monitor (CHM) repository size should be reviewed periodically to meet business needs and OCR/VOTE disk space availability.

Where is the Cluster Health Monitor (CHM) repository?

In 11.2, the CHM repository is stored in a Berkeley DB database. The default location of the CHM repository is $GI_HOME/crf/db.

In 12.1, the CHM repository is hosted in the Grid Infrastructure Management Repository (GIMR). By default, the GIMR is stored in the same ASM disk group as the OCR and voting disks.

What is the recommended CHM data retention?

Oracle Support recommends sizing the CHM repository for 72 hours (259,200 seconds, i.e., three days) of data retention (e.g., one weekend's worth).

What is the minimum size of the CHM repository?

For 11.2 GI, each node requires around 867 MB per day of data retention. The size of the CHM repository needed to retain 72 hours of data is therefore:

~72 hours of CHM data retention = NumberOfNodes * 3Days * 867 MB

So for a 2-node cluster:

~72 hours of CHM data retention = 2 (nodes) * 3 (days) * 867 MB (per day per node) = 5202 MB

For 12.1, each node requires around 750 MB per day of data retention, so the size of the CHM repository needed to retain 72 hours of data is:

~72 hours of CHM data retention = NumberOfNodes * 3Days * 750 MB

So for a 2-node cluster:

~72 hours of CHM data retention = 2 (nodes) * 3 (days) * 750 MB (per day per node) = 4500 MB
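The two sizing formulas above can be expressed as a small shell helper. This is only a sketch for illustration; the function name `chm_repo_mb` is hypothetical, and the per-node-per-day figures (867 MB for 11.2, 750 MB for 12.1) come straight from the formulas above:

```shell
#!/bin/sh
# Estimate the CHM repository size (in MB) for 72 hours (3 days) of retention.
# Usage: chm_repo_mb <number_of_nodes> <mb_per_node_per_day>
#   11.2 GI: 867 MB per node per day
#   12.1 GI: 750 MB per node per day
chm_repo_mb() {
    nodes=$1
    mb_per_day=$2
    echo $((nodes * 3 * mb_per_day))   # 3 = days of retention
}

chm_repo_mb 2 867    # 11.2, 2-node cluster -> 5202
chm_repo_mb 2 750    # 12.1, 2-node cluster -> 4500
```

The results match the worked examples above: 5202 MB for 11.2 and 4500 MB for 12.1 on a 2-node cluster.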

How to see the current CHM repository retention in seconds?

[grid@racnode1 ~]$ /u01/app/12.1.0/grid/bin/oclumon manage -get repsize

CHM Repository Size = 272580 seconds

How to resize the CHM repository retention?

For 11.2 GI:

To determine the current location of the CHM repository:

$ oclumon manage -get reppath

To move and resize the CHM repository for 3 days of retention on a 2-node cluster:

$ oclumon manage -repos reploc path* -maxspace 5202


* where path = directory path for the new location of the CHM repository

For 12.1:

To resize the CHM repository to 3 days of retention with a single command, e.g., for a 2-node cluster:

$ oclumon manage -repos changerepossize 4500

How to verify the change in repository size has met the desired retention?

In 12.1.0.1:

$ oclumon manage -repos changeretentiontime 260000

This command does not make any changes; it is more of a “what-if”, i.e., if I wanted to change the retention time to this value, how much space would be required?

In 12.1.0.2, the syntax was changed and should be used as follows:

[grid@racnode1 ~]$ oclumon manage -repos checkretentiontime 260000

The Cluster Health Monitor repository can support the desired retention for 2 hosts