Check N+1 standby redo logs are created on both primary and standby
— on primary
SQL> SELECT thread#, group#, sequence#, bytes, archived, status
FROM v$standby_log
order by thread#, group#;
THREAD# GROUP# SEQUENCE# BYTES ARC STATUS
---------- ---------- ---------- ---------- --- ----------
1 11 0 1073741824 YES UNASSIGNED
1 12 0 1073741824 YES UNASSIGNED
1 13 0 1073741824 YES UNASSIGNED
2 21 0 1073741824 YES UNASSIGNED
2 22 0 1073741824 YES UNASSIGNED
2 23 0 1073741824 YES UNASSIGNED
6 rows selected.
— on standby
SQL> SELECT thread#, group#, sequence#, bytes, archived, status
FROM v$standby_log
order by thread#, group#;
THREAD# GROUP# SEQUENCE# BYTES ARC STATUS
---------- ---------- ---------- ---------- --- ----------
1 11 0 1073741824 NO UNASSIGNED
1 12 328 1073741824 YES ACTIVE
1 13 0 1073741824 YES UNASSIGNED
2 21 0 1073741824 NO UNASSIGNED
2 22 234 1073741824 YES ACTIVE
2 23 0 1073741824 YES UNASSIGNED
6 rows selected.
Check primary & standby database protection mode and switchover status.
— On Primary
SQL>SELECTNAME,OPEN_MODE,DATABASE_ROLE,PROTECTION_MODE,DB_UNIQUE_NAME,
SWITCHOVER_STATUS FROM V$DATABASE;
NAME OPEN_MODE DATABASE_R PROTECTION_MODE DB_UNIQUE_ SWITCHOVER_STATUS
--------- ------------ ---------- -------------------- ---------- --------------------
TESTDB READ WRITE PRIMARY MAXIMUM PERFORMANCE TESTDB TO STANDBY
— ON standby
SQL>SELECTNAME,OPEN_MODE,DATABASE_ROLE,PROTECTION_MODE,DB_UNIQUE_NAME,
SWITCHOVER_STATUS FROM V$DATABASE;
NAME OPEN_MODE DATABASE_R PROTECTION_MODE DB_UNIQUE_ SWITCHOVER_STATUS
--------- ---------- ---------- -------------------- ---------- --------------------
TESTDB MOUNTED PHYSICAL STANDBY MAXIMUM PERFORMANCE STBYDB NOT ALLOWED
Check maximum archive log sequences
— ON primary
SQL> select THREAD#,STATUS,ENABLED,GROUPS,INSTANCE, MAX(SEQUENCE#)
FROM gv$THREAD
group by THREAD#,STATUS,ENABLED,GROUPS,INSTANCE ;
THREAD# STATUS ENABLED GROUPS INSTANCE MAX(SEQUENCE#)
---------- ------ -------- ---------- ---------- --------------
2 OPEN PUBLIC 2 TESTDB2 234
1 OPEN PUBLIC 2 TESTDB1 328
— On standby
SQL> select THREAD#,STATUS,ENABLED,GROUPS,INSTANCE, MAX(SEQUENCE#)
FROM gv$THREAD
group by THREAD#,STATUS,ENABLED,GROUPS,INSTANCE ;
THREAD# STATUS ENABLED GROUPS INSTANCE MAX(SEQUENCE#)
---------- ------ -------- ---------- ---------- --------------
2 OPEN PUBLIC 2 TESTDB2 234
1 OPEN PUBLIC 2 TESTDB1 328
Check archived logs are applied or not until the maximum sequences
— On primary
SQL> SELECT THREAD#,DEST_ID,APPLIED,max(SEQUENCE#) FROM V$ARCHIVED_LOG where DEST_ID>1 group by THREAD#,APPLIED,DEST_ID;
THREAD# DEST_ID APPLIED MAX(SEQUENCE#)
---------- ---------- --------- --------------
1 2 YES 327
2 2 YES 233
On standby
SQL> SELECT THREAD#,DEST_ID,APPLIED,max(SEQUENCE#)
FROM V$ARCHIVED_LOG
where APPLIED='YES'
group by THREAD#,APPLIED,DEST_ID;
THREAD# DEST_ID APPLIED MAX(SEQUENCE#)
---------- ---------- --------- --------------
1 1 YES 327
2 1 YES 233
Check recovery progress with gv$managed_standby view.
— On primary
SQL> select INST_ID,process,pid,status,sequence# from gv$managed_standby order by 1,2;
INST_ID PROCESS PID STATUS SEQUENCE#
---------- --------- ------------------------ ------------ ----------
1 ARCH 70960 CLOSING 327
1 ARCH 70964 CLOSING 320
1 ARCH 70914 CLOSING 317
1 ARCH 70956 CLOSING 318
1 DGRD 70921 ALLOCATED 0
1 DGRD 70917 ALLOCATED 0
1 DGRD 9311 ALLOCATED 0
1 DGRD 87854 ALLOCATED 0
1 LNS 87745 WRITING 328
2 ARCH 45746 CLOSING 232
2 ARCH 45884 CLOSING 230
2 ARCH 45829 CLOSING 233
2 ARCH 45559 CLOSING 231
2 DGRD 123096 ALLOCATED 0
2 DGRD 45652 ALLOCATED 0
2 DGRD 45897 ALLOCATED 0
2 DGRD 45491 ALLOCATED 0
2 LNS 45891 WRITING 234
18 rows selected.
On standby
SQL> select INST_ID,process,pid,status,sequence# from gv$managed_standby order by 1,2;
INST_ID PROCESS PID STATUS SEQUENCE#
---------- --------- ------------------------ ------------ ----------
1 ARCH 19160 CLOSING 229
1 ARCH 19189 CLOSING 327
1 ARCH 19208 CLOSING 319
1 ARCH 19230 CLOSING 326
1 DGRD 19173 ALLOCATED 0
1 DGRD 19178 ALLOCATED 0
1 MRP0 28123 APPLYING_LOG 234
1 RFS 59847 RECEIVING 328
2 ARCH 74638 CLOSING 233
2 ARCH 74715 CLOSING 227
2 ARCH 74711 CLOSING 232
2 ARCH 74705 CLOSING 230
2 DGRD 74693 ALLOCATED 0
2 DGRD 74661 ALLOCATED 0
2 RFS 76404 IDLE 0
2 RFS 115608 RECEIVING 234
2 RFS 76410 IDLE 0
17 rows selected.
Check Dataguard status
— on standby only
SQL> select name, value from V$DATAGUARD_STATS;
NAME VALUE
-------------------------------- ------------------------------
transport lag +00 00:00:00
apply lag +00 00:00:00
apply finish time +00 00:00:00.000
estimated startup time 41
Check recovery process
— on standby only
SQL> select to_char(start_time, 'DD-MON-RR HH24:MI:SS') start_time,
item, round(sofar/1024,2) "MB/Sec"
from v$recovery_progress
order by start_time;
START_TIME ITEM MB/Sec
------------------ -------------------------------- ----------
28-FEB-25 22:07:39 Active Apply Rate 1.33
28-FEB-25 22:07:39 Active Time 34.95
28-FEB-25 22:07:39 Apply Time per Log .68
28-FEB-25 22:07:39 Average Apply Rate .12
28-FEB-25 22:07:39 Checkpoint Time per Log 0
28-FEB-25 22:07:39 Redo Applied 18
28-FEB-25 22:07:39 Last Applied Redo 0
28-FEB-25 22:07:39 Log Files .05
28-FEB-25 22:07:39 Maximum Apply Rate 69.75
28-FEB-25 22:07:39 Recovery ID 0
28-FEB-25 22:07:39 Elapsed Time 144.17
11 rows selected.
Check gap status between primary and standby database
SQL>select status, gap_status from v$archive_dest_status where dest_id = 2;
STATUS GAP_STATUS
--------- ------------------------
VALID NO GAP
Appendix – Monitoring Data Guard Configuration Health Using SQL
— on primary database
| Goal | Query | Expected Results |
| Check if any remote standby archive destination is getting errors. Check if all remote standby archive destinations is enabled or “VALID”. | select sysdate,status,error from gv$archive_dest_status where type=’PHYSICAL’ and status!=’VALID’ or error is not null; | Good behavior – no rows returned. If query returns rows, then raise an alert with the returned data. |
| Check if any nologging activity has occurred on the primary for the last day. | select file#, name, unrecoverable_change#, unrecoverable_time from v$datafile where unrecoverable_time>(sysdate – 1); | Good behavior – no rows returned. If query returns rows, then the standby is vulnerable and the subsequent file needs to be refreshed on the standby. |
| Detect gaps on the standby | select sysdate,database_mode,recovery_mode, gap_status from v$archive_dest_status where type=’PHYSICAL’ and gap_status!=’NO GAP’; | Good behavior – no rows returned. If query returns rows, then there’s an existing gap between the primary and the standby. Subsequent action is to run the same query on the standby, if the output is identical, then no action required. If the output on the standby does not match the primary then the datafile on the standby should be refreshed. |
| Assess if any severe Data Guard event occurred in the last day | SQL> select * from v$dataguard_status where severity in (‘Error’,’Fatal’) and TIMESTAMP > sysdate -1; | Good behavior – no rows returned. If query returns rows then the result should raise an alert with the returned output. |
| FOR SYNC ENVIRONMENTS ONLY: Assess if running in Maximum Availability mode and configuration is in sync. | select sysdate,protection_mode, synchronized, synchronization_status from v$archive_dest_status where type=’PHYSICAL’ and synchronization_status !=’OK’; | Good behavior – no rows returned. If query returns rows then the result should raise an alert with the returned output. |
— on standby
| Goal | Query | Expected Results |
| Determine if there is a transport lag | select name,value,time_computed,datum_time from v$dataguard_stats where name=’transport lag’ and value > ‘+00 00:01:00’; | Good behavior – no rows are returned. If no rows are returned then this implies that there is no transport lag |
| Determine if there is an apply lag | select name,value,time_computed,datum_time from v$dataguard_stats where name=’apply lag’ and value > ‘+00 00:01:00’; | Good behavior – no rows are returned. If no rows are returned then this implies that there is no apply lag |
| Standby data file check (offline files or files that are not accessible) | select * from v$datafile_header where status =’OFFLINE’ or ERROR is not null; | Good behavior – no rows returned. Any rows returned will show which file(s) are having IO or recovery issues. |
| Verify that the Media Recovery Process is currently running. | select * from v$managed_standby where process like ‘MRP%’; | Good behavior – rows returned. If no rows are returned then the MRP process is not running. |
| Assess if any severe Data Guard event occurred in the last day | select * from v$dataguard_status where severity in (‘Error’,’Fatal’) and timestamp > (sysdate -1); | Good behavior – no rows returned. If query returns rows, then the result should raise an alert with the returned output. |