Physical Standby Data Guard Health Check

Check N+1 standby redo logs are created on both primary and standby

— on primary

SQL> SELECT thread#, group#, sequence#, bytes, archived, status 
     FROM v$standby_log 
     order by thread#, group#;


   THREAD#     GROUP#  SEQUENCE#      BYTES ARC STATUS
---------- ---------- ---------- ---------- --- ----------
         1         11          0 1073741824 YES UNASSIGNED
         1         12          0 1073741824 YES UNASSIGNED
         1         13          0 1073741824 YES UNASSIGNED
         2         21          0 1073741824 YES UNASSIGNED
         2         22          0 1073741824 YES UNASSIGNED
         2         23          0 1073741824 YES UNASSIGNED

6 rows selected.

— on standby

SQL> SELECT thread#, group#, sequence#, bytes, archived, status 
     FROM v$standby_log 
     order by thread#, group#;

   THREAD#     GROUP#  SEQUENCE#      BYTES ARC STATUS
---------- ---------- ---------- ---------- --- ----------
         1         11          0 1073741824 NO  UNASSIGNED
         1         12        328 1073741824 YES ACTIVE
         1         13          0 1073741824 YES UNASSIGNED
         2         21          0 1073741824 NO  UNASSIGNED
         2         22        234 1073741824 YES ACTIVE
         2         23          0 1073741824 YES UNASSIGNED

6 rows selected.

Check primary & standby database protection mode and switchover status.

— On Primary

SQL>SELECTNAME,OPEN_MODE,DATABASE_ROLE,PROTECTION_MODE,DB_UNIQUE_NAME,
    SWITCHOVER_STATUS FROM V$DATABASE;

NAME      OPEN_MODE    DATABASE_R PROTECTION_MODE      DB_UNIQUE_ SWITCHOVER_STATUS
--------- ------------ ---------- -------------------- ---------- --------------------
TESTDB    READ WRITE   PRIMARY    MAXIMUM PERFORMANCE  TESTDB     TO STANDBY

— ON standby

SQL>SELECTNAME,OPEN_MODE,DATABASE_ROLE,PROTECTION_MODE,DB_UNIQUE_NAME,
    SWITCHOVER_STATUS FROM V$DATABASE;

NAME      OPEN_MODE  DATABASE_R PROTECTION_MODE      DB_UNIQUE_ SWITCHOVER_STATUS
--------- ---------- ---------- -------------------- ---------- --------------------
TESTDB    MOUNTED   PHYSICAL STANDBY    MAXIMUM PERFORMANCE  STBYDB     NOT ALLOWED

Check maximum archive log sequences

— ON primary

SQL> select  THREAD#,STATUS,ENABLED,GROUPS,INSTANCE, MAX(SEQUENCE#) 
     FROM gv$THREAD 
     group by THREAD#,STATUS,ENABLED,GROUPS,INSTANCE ;

   THREAD# STATUS ENABLED      GROUPS INSTANCE   MAX(SEQUENCE#)
---------- ------ -------- ---------- ---------- --------------
         2 OPEN   PUBLIC            2 TESTDB2             234
         1 OPEN   PUBLIC            2 TESTDB1             328

— On standby

SQL> select  THREAD#,STATUS,ENABLED,GROUPS,INSTANCE, MAX(SEQUENCE#) 
     FROM gv$THREAD 
     group by THREAD#,STATUS,ENABLED,GROUPS,INSTANCE ;

   THREAD# STATUS ENABLED      GROUPS INSTANCE   MAX(SEQUENCE#)
---------- ------ -------- ---------- ---------- --------------
         2 OPEN   PUBLIC            2 TESTDB2             234
         1 OPEN   PUBLIC            2 TESTDB1             328

Check archived logs are applied or not until the maximum sequences

— On primary

SQL>  SELECT THREAD#,DEST_ID,APPLIED,max(SEQUENCE#)  FROM V$ARCHIVED_LOG where DEST_ID>1 group by THREAD#,APPLIED,DEST_ID;

   THREAD#    DEST_ID APPLIED   MAX(SEQUENCE#)
---------- ---------- --------- --------------
         1          2 YES                  327
         2          2 YES                  233

On standby

SQL> SELECT THREAD#,DEST_ID,APPLIED,max(SEQUENCE#)  
     FROM V$ARCHIVED_LOG 
     where APPLIED='YES' 
     group by THREAD#,APPLIED,DEST_ID;

   THREAD#    DEST_ID APPLIED   MAX(SEQUENCE#)
---------- ---------- --------- --------------
         1          1 YES                  327
         2          1 YES                  233

Check recovery progress with gv$managed_standby view.

— On primary

SQL>  select INST_ID,process,pid,status,sequence# from gv$managed_standby order by 1,2;

   INST_ID PROCESS   PID                      STATUS        SEQUENCE#
---------- --------- ------------------------ ------------ ----------
         1 ARCH      70960                    CLOSING             327
         1 ARCH      70964                    CLOSING             320
         1 ARCH      70914                    CLOSING             317
         1 ARCH      70956                    CLOSING             318
         1 DGRD      70921                    ALLOCATED             0
         1 DGRD      70917                    ALLOCATED             0
         1 DGRD      9311                     ALLOCATED             0
         1 DGRD      87854                    ALLOCATED             0
         1 LNS       87745                    WRITING             328
         2 ARCH      45746                    CLOSING             232
         2 ARCH      45884                    CLOSING             230
         2 ARCH      45829                    CLOSING             233
         2 ARCH      45559                    CLOSING             231
         2 DGRD      123096                   ALLOCATED             0
         2 DGRD      45652                    ALLOCATED             0
         2 DGRD      45897                    ALLOCATED             0
         2 DGRD      45491                    ALLOCATED             0
         2 LNS       45891                    WRITING             234

18 rows selected.

On standby

SQL>  select INST_ID,process,pid,status,sequence# from gv$managed_standby order by 1,2;

   INST_ID PROCESS   PID                      STATUS        SEQUENCE#
---------- --------- ------------------------ ------------ ----------
         1 ARCH      19160                    CLOSING             229
         1 ARCH      19189                    CLOSING             327
         1 ARCH      19208                    CLOSING             319
         1 ARCH      19230                    CLOSING             326
         1 DGRD      19173                    ALLOCATED             0
         1 DGRD      19178                    ALLOCATED             0
         1 MRP0      28123                    APPLYING_LOG        234
         1 RFS       59847                    RECEIVING           328
         2 ARCH      74638                    CLOSING             233
         2 ARCH      74715                    CLOSING             227
         2 ARCH      74711                    CLOSING             232
         2 ARCH      74705                    CLOSING             230
         2 DGRD      74693                    ALLOCATED             0
         2 DGRD      74661                    ALLOCATED             0
         2 RFS       76404                    IDLE                  0
         2 RFS       115608                   RECEIVING           234
         2 RFS       76410                    IDLE                  0

17 rows selected.

Check Dataguard status

— on standby only

SQL> select name, value from V$DATAGUARD_STATS;

NAME                             VALUE
-------------------------------- ------------------------------
transport lag                    +00 00:00:00
apply lag                        +00 00:00:00
apply finish time                +00 00:00:00.000
estimated startup time           41

Check recovery process

— on standby only

SQL>  select to_char(start_time, 'DD-MON-RR HH24:MI:SS') start_time,
item, round(sofar/1024,2) "MB/Sec"
from v$recovery_progress 
order by start_time;

START_TIME         ITEM                                 MB/Sec
------------------ -------------------------------- ----------
28-FEB-25 22:07:39 Active Apply Rate                      1.33
28-FEB-25 22:07:39 Active Time                           34.95
28-FEB-25 22:07:39 Apply Time per Log                      .68
28-FEB-25 22:07:39 Average Apply Rate                      .12
28-FEB-25 22:07:39 Checkpoint Time per Log                   0
28-FEB-25 22:07:39 Redo Applied                             18
28-FEB-25 22:07:39 Last Applied Redo                         0
28-FEB-25 22:07:39 Log Files                               .05
28-FEB-25 22:07:39 Maximum Apply Rate                    69.75
28-FEB-25 22:07:39 Recovery ID                               0
28-FEB-25 22:07:39 Elapsed Time                         144.17

11 rows selected.

Check gap status between primary and standby database

SQL>select status, gap_status from v$archive_dest_status where dest_id = 2;

STATUS    GAP_STATUS
--------- ------------------------
VALID     NO GAP

Appendix – Monitoring Data Guard Configuration Health Using SQL

— on primary database

Goal	Query	Expected Results
Check if any remote standby archive destination is getting errors. Check if all remote standby archive destinations is enabled or “VALID”.	select sysdate,status,error from gv$archive_dest_status where type=’PHYSICAL’ and status!=’VALID’ or error is not null;	Good behavior – no rows returned. If query returns rows, then raise an alert with the returned data.
Check if any nologging activity has occurred on the primary for the last day.	select file#, name, unrecoverable_change#, unrecoverable_time from v$datafile where unrecoverable_time>(sysdate – 1);	Good behavior – no rows returned. If query returns rows, then the standby is vulnerable and the subsequent file needs to be refreshed on the standby.
Detect gaps on the standby	select sysdate,database_mode,recovery_mode, gap_status from v$archive_dest_status where type=’PHYSICAL’ and gap_status!=’NO GAP’;	Good behavior – no rows returned. If query returns rows, then there’s an existing gap between the primary and the standby. Subsequent action is to run the same query on the standby, if the output is identical, then no action required. If the output on the standby does not match the primary then the datafile on the standby should be refreshed.
Assess if any severe Data Guard event occurred in the last day	SQL> select * from v$dataguard_status where severity in (‘Error’,’Fatal’) and TIMESTAMP > sysdate -1;	Good behavior – no rows returned. If query returns rows then the result should raise an alert with the returned output.
FOR SYNC ENVIRONMENTS ONLY: Assess if running in Maximum Availability mode and configuration is in sync.	select sysdate,protection_mode, synchronized, synchronization_status from v$archive_dest_status where type=’PHYSICAL’ and synchronization_status !=’OK’;	Good behavior – no rows returned. If query returns rows then the result should raise an alert with the returned output.

— on standby

Goal	Query	Expected Results
Determine if there is a transport lag	select name,value,time_computed,datum_time from v$dataguard_stats where name=’transport lag’ and value > ‘+00 00:01:00’;	Good behavior – no rows are returned. If no rows are returned then this implies that there is no transport lag
Determine if there is an apply lag	select name,value,time_computed,datum_time from v$dataguard_stats where name=’apply lag’ and value > ‘+00 00:01:00’;	Good behavior – no rows are returned. If no rows are returned then this implies that there is no apply lag
Standby data file check (offline files or files that are not accessible)	select * from v$datafile_header where status =’OFFLINE’ or ERROR is not null;	Good behavior – no rows returned. Any rows returned will show which file(s) are having IO or recovery issues.
Verify that the Media Recovery Process is currently running.	select * from v$managed_standby where process like ‘MRP%’;	Good behavior – rows returned. If no rows are returned then the MRP process is not running.
Assess if any severe Data Guard event occurred in the last day	select * from v$dataguard_status where severity in (‘Error’,’Fatal’) and timestamp > (sysdate -1);	Good behavior – no rows returned. If query returns rows, then the result should raise an alert with the returned output.