Oracle – Page 120 – Make DBA Life Easy

How to Move 12c Grid Infrastructure Management Repository ( GIMR ) to Another Diskgroup

Grid Infrastructure Management Repository ( GIMR ) should be relocated to a disk group other than default disk group OCR/VOTE.

First, create a disk group GIMR for storing new GIMR data. At the moment, it is stored in disk group OCR_VOTE. Please refer to section “Create New Disks and Create New Diskgroup FRA” for creating a disk group.

1. Stop and disable ora.crf resource on every node

[root@racnode1 bin]# pwd
/u01/app/12.1.0/grid/bin

[root@racnode1 bin]# ./crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'racnode1'
CRS-2677: Stop of 'ora.crf' on 'racnode1' succeeded

[root@racnode1 bin]# ./crsctl modify res ora.crf -attr ENABLED=0 -init
[root@racnode2 bin]#

Do not stop ora.mgmtlsnr or ora.mgmtdb resource, otherwise Step 2 will fail with the following:

Oracle Grid Management database is running on node “racnode1”. Run dbca on node “racnode1” to delete the database.

2. Use DBCA to delete the management database

As user grid, find on which node the mgmtdb is running on :

[grid@racnode1 ~]$ srvctl status mgmtdb
Database is enabled
Instance -MGMTDB is running on node racnode1

[grid@racnode1 ~]$ /u01/app/12.1.0/grid/bin/dbca -silent -deleteDatabase -sourceDB -MGMTDB
Connecting to database
4% complete
9% complete
14% complete
19% complete
23% complete
28% complete
47% complete
Updating network configuration files
48% complete
52% complete
Deleting instance and datafiles
76% complete
100% complete
Look at the log file "/u01/app/grid/cfgtoollogs/dbca/_mgmtdb.log" for further details.

[grid@racnode1 ~]$ cat "/u01/app/grid/cfgtoollogs/dbca/_mgmtdb.log"
The Database Configuration Assistant will delete the Oracle instance and datafiles for your database. All information in the database will be destroyed. Do you want to proceed?
Connecting to database
DBCA_PROGRESS : 4%
DBCA_PROGRESS : 9%
DBCA_PROGRESS : 14%
DBCA_PROGRESS : 19%
DBCA_PROGRESS : 23%
DBCA_PROGRESS : 28%
DBCA_PROGRESS : 47%
Updating network configuration files
DBCA_PROGRESS : 48%
DBCA_PROGRESS : 52%
Deleting instance and datafiles
DBCA_PROGRESS : 76%
DBCA_PROGRESS : 100%
Database deletion completed.
[grid@racnode1 ~]$

3. Recreate the 12.1.0.2 MGMTDB

a. As Grid User on any node execute the following DBCA command with the desired <DG Name>:

[grid@racnode1 ~]$ /u01/app/12.1.0/grid/bin/dbca -silent -createDatabase -sid -MGMTDB -createAsContainerDatabase true -templateName MGMTSeed_Database.dbc -gdbName _mgmtdb -storageType ASM -diskGroupName +GIMR -datafileJarLocation /u01/app/12.1.0/grid/assistants/dbca/templates -characterset AL32UTF8 -autoGeneratePasswords -skipUserTemplateCheck
Registering database with Oracle Grid Infrastructure
5% complete
Copying database files
7% complete
9% complete
16% complete
23% complete
30% complete
37% complete
41% complete
Creating and starting Oracle instance
43% complete
48% complete
49% complete
50% complete
55% complete
60% complete
61% complete
64% complete
Completing Database Creation
68% complete
79% complete
89% complete
100% complete
Look at the log file "/u01/app/grid/cfgtoollogs/dbca/_mgmtdb/_mgmtdb0.log" for further details.
[grid@racnode1 ~]$

b. Create a PDB within the MGMTDB by using DBCA.

As Grid User on any node execute the following DBCA command:

NOTE: The CLUSTER_NAME needs to have any hyphens (“-“) replaced with underscores (“_”)

[grid@racnode1 ~]$ /u01/app/12.1.0/grid/bin/dbca -silent -createPluggableDatabase -sourceDB -MGMTDB -pdbName RACNODE_CLUSTER -createPDBFrom RMANBACKUP -PDBBackUpfile /u01/app/12.1.0/grid/assistants/dbca/templates/mgmtseed_pdb.dfb -PDBMetadataFile /u01/app/12.1.0/grid/assistants/dbca/templates/mgmtseed_pdb.xml -createAsClone true
Creating Pluggable Database
4% complete
12% complete
21% complete
38% complete
55% complete
85% complete
Completing Pluggable Database Creation
100% complete
Look at the log file "/u01/app/grid/cfgtoollogs/dbca/_mgmtdb/RACNODE_CLUSTER/_mgmtdb.log" for further details.
[grid@racnode1 ~]$

4. Secure that the Management Database credential

[grid@racnode1 ~]$ srvctl status MGMTDB
Database is enabled
Instance -MGMTDB is running on node racnode1
[grid@racnode1 ~]$

[grid@racnode1 ~]$ /u01/app/12.1.0/grid/bin/mgmtca <--no output
[grid@racnode1 ~]$

5. Enable and start ora.crf resource on every node

[root@racnode1 ~]# /u01/app/12.1.0/grid/bin/crsctl modify res ora.crf -attr ENABLED=1 -init

[root@racnode1 ~]# /u01/app/12.1.0/grid/bin/crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'racnode1'
CRS-2676: Start of 'ora.crf' on 'racnode1' succeeded
[root@racnode1 ~]#

[root@racnode2 bin]# /u01/app/12.1.0/grid/bin/crsctl modify res ora.crf -attr ENABLED=1 -init

[root@racnode2 bin]# /u01/app/12.1.0/grid/bin/crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'racnode2'
CRS-2676: Start of 'ora.crf' on 'racnode2' succeeded

[root@racnode1 bin]# ./crsctl stat res -t -init
--------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------
ora.asm
 1 ONLINE ONLINE racnode1 Started,STABLE
ora.cluster_interconnect.haip
 1 ONLINE ONLINE racnode1 STABLE
ora.crf
 1 ONLINE ONLINE racnode1 STABLE
ora.crsd
 1 ONLINE ONLINE racnode1 STABLE
ora.cssd
 1 ONLINE ONLINE racnode1 STABLE
ora.cssdmonitor
 1 ONLINE ONLINE racnode1 STABLE
ora.ctssd
 1 ONLINE ONLINE racnode1 ACTIVE:0,STABLE
ora.diskmon
 1 OFFLINE OFFLINE STABLE
ora.evmd
 1 ONLINE ONLINE racnode1 STABLE
ora.gipcd
 1 ONLINE ONLINE racnode1 STABLE
ora.gpnpd
 1 ONLINE ONLINE racnode1 STABLE
ora.mdnsd
 1 ONLINE ONLINE racnode1 STABLE
ora.storage
 1 ONLINE ONLINE racnode1 STABLE
--------------------------------------------------------------------------------

6. Check management database and management listener

[grid@racnode1 ~]$ srvctl status MGMTDB
Database is enabled
Instance -MGMTDB is running on node racnode1

[grid@racnode1 ~]$ ps -eaf | grep tns
root 15 2 0 20:05 ? 00:00:00 [netns]
grid 1168 892 0 22:37 pts/0 00:00:00 grep --color=auto tns
grid 3974 1 0 20:06 ? 00:00:00 /u01/app/12.1.0/grid/bin/tnslsnr MGMTLSNR -no_crs_notify -inherit
grid 4064 1 0 20:06 ? 00:00:00 /u01/app/12.1.0/grid/bin/tnslsnr LISTENER -no_crs_notify -inherit
grid 4092 1 0 20:06 ? 00:00:00 /u01/app/12.1.0/grid/bin/tnslsnr LISTENER_SCAN2 -no_crs_notify -inherit
grid 4103 1 0 20:06 ? 00:00:00 /u01/app/12.1.0/grid/bin/tnslsnr LISTENER_SCAN3 -no_crs_notify -inherit

[grid@racnode1 ~]$ lsnrctl status MGMTLSNR
LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 14-MAR-2016 22:38:03

Copyright (c) 1991, 2014, Oracle. All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=MGMTLSNR)))
STATUS of the LISTENER
------------------------
Alias MGMTLSNR
Version TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date 14-MAR-2016 20:06:33
Uptime 0 days 2 hr. 31 min. 29 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/12.1.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/racnode1/mgmtlsnr/alert/log.xml
Listening Endpoints Summary...
 (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=MGMTLSNR)))
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.100.61)(PORT=1521)))
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=169.254.146.121)(PORT=1521)))
Services Summary...
Service "-MGMTDBXDB" has 1 instance(s).
 Instance "-MGMTDB", status READY, has 1 handler(s) for this service...
Service "_mgmtdb" has 1 instance(s).
 Instance "-MGMTDB", status READY, has 1 handler(s) for this service...
Service "racnode_cluster" has 1 instance(s).
 Instance "-MGMTDB", status READY, has 1 handler(s) for this service...
The command completed successfully

How To Manage the Cluster Health Monitor ( CHM ) Repository

Cluster Health Monitor ( CHM ) Repository size should be reviewed periodically to meet business needs and OCR/VOTE disk availability.

Where is Cluster Health Monitor (CHM) Repository ?

In 11.2, the CHM repository is stored in a Berkley Database . The default location of the CHM repository is $GI_HOME/crf/db.

In 12.1, the CHM repository is hosted in the Grid Infrastructure Management Repository (GIMR). The default location for GIMR is stored in the ASM diskgroup which stores the OCR and voting disk .

What is the recommended CHM data retention ?

Oracle Support recommends that the CHM repository be sized according to 72 hours ( 259,200 seconds )(three days) of data retention (e.g.., one weekend worth).

What is the minimum size of CHM repository ?

For 11.2 GI, one day of data retention for each node requires 867 MB around. So the size of the CHM repository needed to retain 72 hours of data would be as follows:

~72 hours of CHM data retention = NumberOfNodes * 3Days * 867 MB

So for a 2 nodes cluster :

~72 hours of CHM data retention = 2 ( nodes ) * 3 ( days ) * 867 ( per day per node )(5202 MB)

For 12.1, one day of data retention for each node requires 750 MB around, so the size of the CHM repository needed to retain 72 hours of data would be as follows:

~72 hours of CHM data retention = NumberOfNodes * 3Days * 750 MB

So for a 2 node cluster

~72 hours of CHM data retention = 2 ( nodes ) * 3( days ) * 750 ( per day per node ) (4500 MB)

How to see the current CHM repository retention in seconds ?

[grid@racnode1 ~]$ /u01/app/12.1.0/grid/bin/oclumon manage -get repsize

CHM Repository Size = 272580 seconds

How to resize the CHM Repository retention ?

For 11.2 GI:

To determine the current location of the CHM repository:

$oclumon manage -get reppath

To move and resize the CHM repository for 3 days retention for a 2 nodes cluster:

  $ oclumon manage -repos reploc path* -maxspace 5202

* where path = directory path for new location of the CHM repository

For 12.1:

To resize the CHM Repository with one command to result in 3 days retention, eg., for a 2 nodes cluster:

$ oclumon manage -repos changerepossize 4500

How to verify the change in repository size has met the desired retention ?

In 12.1.0.1

$ oclumon manage -repos changeretentiontime 260000

This command does not make any changes. It is more like a “what-if”, ie., what if I wanted to change the retention time, how much space would be required ?

In 12.1.0.2 the syntax was changed and should be used as follows :

[grid@racnode1 ~]$ oclumon manage -repos checkretentiontime 260000

The Cluster Health Monitor repository can support the desired retention for 2 hosts

Grid Infrastructure Management Repository (GIMR)

Since -MGMT DB uses OCR/Voting disk by default, It is strongly recommended to check MGMT database tablespace usage. In some GI versions, we see MGMT database used all OCR/Voting disk.

What is Management Repository?

Grid Infrastructure Management Repository ( GIMR ) is a single instance database managed by GI. It will be up and running on one node in the cluster. If the hosting node is down, the database will be automatically failed over to other node.

What’s the purpose of Management Database?

GIMR will be the central repository to store Cluster Health Monitor (aka CHM/OS, ora.crf) and other data in 12c.

Where does Management Database store it’s datafiles?

In 12R1, by default, Management database uses the same shared storage as OCR/Voting File.

Can Management Database be turned on/off when you want ?

In 12.1.0.1, GIMR is optional, if Management Database is not selected to be configured during installation/upgrade OUI, all features (Cluster Health Monitor (CHM/OS) etc) that depend on it will be disabled.

Note: there’s no supported procedure to enable Management Database once the GI stack is configured

This changed in 12.1.0.2, it’s mandatory to have GIMR and must not be turned off

What are the resources associated with Management Database?

The following resources from “crsctl stat res -t” are for Management Database:

ora.mgmtdb
1 ONLINE ONLINE racnode1 Open,STABLE
ora.MGMTLSNR
1 ONLINE ONLINE racnode1 169.254.146.121 172.16.100.61,STABLE

On OS level, the database “-MGMTDB” and listener MGMTLSNR are for Management Database:

[grid@racnode1 ~]$ ps -ef| grep pmon_-MGMTDB
grid 4210 1 0 12:07 ? 00:00:00 mdb_pmon_-MGMTDB

[grid@racnode1 ~]$ ps -ef| grep MGMTLSNR
grid 4015 1 0 12:07 ? 00:00:00 /u01/app/12.1.0/grid/bin/tnslsnr MGMTLSNR -no_crs_notify -inherit

How to start/stop Management Database ?

If Management Database is down for some reason, the following srvctl command can be used to start it:

Usage: srvctl start mgmtdb [-startoption <start_option>] [-node <node_name>]

Usage: srvctl start mgmtlsnr [-node <node_name>]

[grid@racnode1 ~]$ srvctl status MGMTDB
Database is enabled
Instance -MGMTDB is running on node racnode1

[grid@racnode1 ~]$ srvctl status MGMTLSNR
Listener MGMTLSNR is enabled
Listener MGMTLSNR is running on node(s): racnode1

[grid@racnode1 ~]$ srvctl stop MGMTDB
[grid@racnode1 ~]$ srvctl stop MGMTLSNR

[grid@racnode1 ~]$ srvctl status MGMTDB
Database is enabled
Database is not running.
[grid@racnode1 ~]$ srvctl status MGMTLSNR
Listener MGMTLSNR is enabled
Listener MGMTLSNR is not running

[grid@racnode1 ~]$ srvctl start MGMTDB
[grid@racnode1 ~]$ srvctl start MGMTLSNR
PRCC-1014 : MGMTLSNR was already running
PRCR-1004 : Resource ora.MGMTLSNR is already running
PRCR-1079 : Failed to start resource ora.MGMTLSNR
CRS-5702: Resource 'ora.MGMTLSNR' is already running on 'racnode1'

[grid@racnode1 ~]$ srvctl status MGMTDB
Database is enabled
Instance -MGMTDB is running on node racnode1

[grid@racnode1 ~]$ srvctl status MGMTLSNR
Listener MGMTLSNR is enabled
Listener MGMTLSNR is running on node(s): racnode1

How to access to Management Database trace file etc?

Since the database name starts with “-” sign, “./” needs to be specified to avoid error:

[grid@racnode1 grid]$ cd $ORACLE_BASE
[grid@racnode1 grid]$ cd diag
[grid@racnode1 rdbms]$ cd _mgmtdb
[grid@racnode1 _mgmtdb]$ ls -ltr
total 4
drwxr-x--- 16 grid oinstall 4096 Feb 16 21:59 -MGMTDB
-rw-r----- 1 grid oinstall 0 Feb 16 21:59 i_1.mif

[grid@racnode1 _mgmtdb]$ cd -MGMTDB
-bash: cd: -M: invalid option
cd: usage: cd [-L|[-P [-e]]] [dir]

[grid@racnode1 _mgmtdb]$ cd ./-MGMTDB
[grid@racnode1 trace]$ view -MGMTDB_mmon_26447.trc
VIM - Vi IMproved 7.4 (2013 Aug 10, compiled May 4 2014 20:16:04)
Unknown option argument: "-MGMTDB_mmon_26447.trc"
More info with: "vim -h"

[grid@racnode1 trace]$ view ./-MGMTDB_mmon_26447.trc

Is there any need to manually backup or tune Management Database?

As of now, there’s no such need.

How much (shared) disk space should be allocated for the Management Database?

For Oracle Cluster Registry (OCR) with external redundancy and the Grid Infrastructure Management Repository

Minimum: At least 5.2 GB for the OCR volume that contains the Grid Infrastructure Management Repository (4.5 GB + 300 MB voting files + 400 MB OCR), plus 500 MB for each node for clusters greater than four nodes. For example, a six-node cluster allocation should be 6.2 GB.

Reference: http://docs.oracle.com/database/121/CWLIN/storage.htm#CHDDCAHD

EMAGENT Stopped With “OutOfMemoryError”

Java paramemetes like ” -Xmx” and “-XX:MaxPermSize” might cause OEM agent up/down periodically.

SYMPTOMS

The agent fails to start, or keeps restarting itself, or becomes unresponsive and the following messages are seen in logs.

$AGENT_INST/bin/emctl start agent reports following error :

......
......
......
Starting agent .........................................
Consult emctl.log and emagent.nohup in: /db/oracle/product/agent12c/12.1.0.4/agent_inst/sysman/log

$AGENT_INST/sysman/log/emagent.nohup reports following error :

......
......
2016-03-06 20:03:08,691 [1:main] WARN - Missing filename for log handler 'opsscfg
Agent is going down due to an OutOfMemoryError
----- Sun Mar 6 20:03:28 2016::23451::Checking status of EMAgent : 30249 -----
----- Sun Mar 6 20:03:28 2016::23451::EMAgent exited at Sun Mar 6 20:03:28 2016 with return value 57. -----
----- Sun Mar 6 20:03:28 2016::23451::EMAgent will be restarted because of an Out of Memory Exception. 
......
......

$AGENT_INST/sysman/log/gcagent.log reports following error :

......
......
2016-03-06 20:02:52,549 [1:main] FATAL - Fatal error: Target Interaction Manager failed at Startup
java.lang.OutOfMemoryError: Java heap space
......
......
......
2016-03-06 20:02:52,552 [1:main] ERROR - agent main threw an error
java.lang.OutOfMemoryError: Java heap space

$GENT_INST/bin/emctl start agent reports following error :

Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.
Starting agent ............................................failed.
Fatal agent error: Target Interaction Manager failed at Startup
Fatal agent error: Target Interaction Manager failed at Startup
Fatal agent error: Target Interaction Manager failed at Startup
EMAgent is Thrashing. Exiting watchdog

$AGENT_INST/sysman/log/emagent.nohup reports following error :

......
......
2016-03-06 20:03:08,691 [1:main] WARN - Missing filename for log handler 'opsscfg
Agent is going down due to an OutOfMemoryError

$AGENT_INST/sysman/log/gcagent.log reports following error :

2016-03-06 20:02:52,549[1:3305B9] INFO - Invoking STARTUP_P1 (1) on Target Interaction Manager
2016-03-06 20:02:52,549[1:main] FATAL - Fatal error: Target Interaction Manager failed at Startup
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)

$AGENT_INST/sysman/log/gcagent.log reports following errors:

......
......
......
java.lang.OutOfMemoryError: PermGen space
......
......
......
java.lang.OutOfMemoryError: PermGen space
2016-03-06 20:02:52,660 [64:GC.Executor.8 (oracle_camm_manager:ADPManager_EMGC_ADPMANAGER2:Response) (oracle_camm_manager:ADPManager_EMGC_ADPMANAGER2:Response:Response)] ERROR - oracle_camm_manager:ADPManager_EMGC_ADPMANAGER2:Response:Response
java.lang.OutOfMemoryError: PermGen space

SOLUTION

1. Stop the agent first.

$ $AGENT_INST/bin/emctl stop agent

If agent does not shutdown gracefully, then kill all agent background processes by first grepping for agent perl and java processes only.

$ps -eaf | grep java | grep '/db/oracle/product/agent12c/12.1.0.4/agent_inst'

$ps -eaf | grep perl
$kill -9 <Process id>

2. Take backup and edit $AGENT_INST/sysman/config/emd.properties file.

Note : Tune -Xmx and -XX:MaxPermSize parameters , in the below example heap value is increased to 512M from 128M .

The default current value:

agentJavaDefines=-Xmx128M -XX:MaxPermSize=96M

The new value:

agentJavaDefines=-Xmx512M -XX:MaxPermSize=128M

For 13c, please refer to “EM 13c: Agent Tuning Parameters when the Agent is Monitoring Several Targets (Doc ID 2449197.1)“

3. Restart agent successfully.

$ $AGENT_INST/bin/emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.
Starting agent ............................................... started.

$AGENT_INST/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.4.0
OMS Version : 12.1.0.4.0
Protocol Version : 12.1.0.1.0
Agent Home : /db/oracle/product/agent12c/12.1.0.4/agent_inst
Agent Log Directory : /db/oracle/product/agent12c/12.1.0.4/agent_inst/sysman/log
Agent Binaries : /db/oracle/product/agent12c/12.1.0.4/core/12.1.0.4.0
Agent Process ID : 6532
Parent Process ID : 6208
......
......
......
Operating System : Linux version 2.6.32-358.6.1.el6.x86_64 (amd64)
Last Reload : (none)
Last successful upload : 2016-03-06 20:19:15
Last attempted upload : 2016-03-06 20:19:15
Total Megabytes of XML files uploaded so far : 0.03
Number of XML files pending upload : 0
Size of XML files pending upload(MB) : 0
Available disk space on upload filesystem : 41.27%
Collection Status : Collections enabled
Heartbeat Status : Ok
Last attempted heartbeat to OMS : 2016-03-06 20:19:14
Last successful heartbeat to OMS : 2016-03-06 20:19:14
Next scheduled heartbeat to OMS : 2016-03-06 20:20:14

---------------------------------------------------------------
Agent is Running and Ready