ORA-16857: standby disconnected from redo source for longer than specified threshold

One single 11.2.0.4 instance Oracle database alert log shows below information;

 RFS[1]: No standby redo logfiles available for thread 1

Data Guard shows ORA-16857 error:

DGMGRL> show database "TESTSTY";

Database - TESTSTY

Role: PHYSICAL STANDBY
 Intended State: APPLY-ON
 Transport Lag: 10 minutes 32 seconds (computed 48 seconds ago)
 Apply Lag: 10 minutes 32 seconds (computed 48 seconds ago)
 Apply Rate: 39.97 MByte/s
 Real Time Query: OFF
 Instance(s):
 TESTSTY

Database Warning(s):
 ORA-16857: standby disconnected from redo source for longer than 
specified threshold

Database Status:
WARNING

Checked both primary and standby database, the  standby online redo logs have been created. But the size of the standby online redo logs are different from the database redo logs for both primary and secondary database.

— on standby :

SQL> select GROUP#,THREAD# ,BYTES/1024/1024 from v$standby_log;

GROUP#     THREAD#    BYTES/1024/1024
---------- ---------- ---------------
 4         1           50
 5         1           50
 6         1           50
 7         1           50

SQL> select GROUP#,THREAD#,BYTES/1024/1024 from v$log;

GROUP#     THREAD#    BYTES/1024/1024
---------- ---------- ---------------
 1         1          100
 3         1          100
 2         1          100

— On Primary

SQL> select GROUP#,THREAD# ,BYTES/1024/1024 from v$standby_log;

GROUP#     THREAD#    BYTES/1024/1024
---------- ---------- ---------------
 4         1          50
 5         1          50
 6         1          50
 7         1          50

SQL> select GROUP#,THREAD#,BYTES/1024/1024 from v$log;

GROUP#     THREAD#    BYTES/1024/1024
---------- ---------- ---------------
 1         1          100
 2         1          100
 3         1          100

Drop all standby online redo logs on both primary and standby databases, and recreate them again with same size as redo logfiles.

-- for standby db which is under recovery, recovery needs to be stopped first
SQL>alter database recover managed standby database cancel;

SQL>alter database add standby logfile thread 1 group 4 size 100m;

It will create standby online redo logs for one under +FRA, and another one under from “db_create_file_dest” parameter, if “db_create_online_log_dest_x” are not defined.

Finally restart the recovery process, then everything is fine.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE 
THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE;

Database altered.

alert.log:

RFS[1]: Selected log 4 for thread 1 sequence 41436 dbid 1345227970 branch 816878594
Mon Feb 19 14:38:25 2018
..
.
Recovery of Online Redo Log: Thread 1 Group 4 Seq 41436 Reading mem 0
..
.

DGMGRL of DataGuard:

DGMGRL> show database 'TESTSTY';

Database - TESTSTY

Role: PHYSICAL STANDBY
 Intended State: APPLY-ON
 Transport Lag: 0 seconds (computed 0 seconds ago)
 Apply Lag: 0 seconds (computed 0 seconds ago)
 Apply Rate: 208.00 KByte/s
 Real Time Query: OFF
 Instance(s):
 TESTSTY

Database Status:
SUCCESS

Warning: ORA-16857: standby disconnected from redo source for longer than specified threshold

One single 11.2.0.4 instance Oracle database alert log shows following information;

 RFS[1]: No standby redo logfiles available for thread 1

Data Guard shows ORA-16857 error:

DGMGRL> show database "TESTSTY";

Database - TESTSTY

Role: PHYSICAL STANDBY
 Intended State: APPLY-ON
 Transport Lag: 10 minutes 32 seconds (computed 48 seconds ago)
 Apply Lag: 10 minutes 32 seconds (computed 48 seconds ago)
 Apply Rate: 39.97 MByte/s
 Real Time Query: OFF
 Instance(s):
 TESTSTY

Database Warning(s):
 ORA-16857: standby disconnected from redo source for longer than 
specified threshold

Database Status:
WARNING

Checked both primary and standby database, the  standby online redo logs have been created. It is strange to see some of the thread id is different between primary and secondary database.

— on standby :

SQL> select GROUP#,THREAD#,BYTES/1024/1024 ,ARCHIVED,STATUS
       from v$standby_log; 2

GROUP#     THREAD#    BYTES/1024/1024 ARC STATUS
---------- ---------- --------------- --- ----------
 4         1          50              NO UNASSIGNED
 5         0          50              YES UNASSIGNED
 6         0          50              YES UNASSIGNED
 7         0          50              YES UNASSIGNED

— On Primary

SQL> select GROUP#,THREAD#,BYTES/1024/1024 ,ARCHIVED,STATUS 
       from v$standby_log;

GROUP#     THREAD#    BYTES/1024/1024 ARC STATUS
---------- ---------- --------------- --- ----------
 4         0          50              YES UNASSIGNED
 5         0          50              YES UNASSIGNED
 6         0          50              YES UNASSIGNED
 7         0          50              YES UNASSIGNED

Drop all standby online redo logs on both primary and standby databases, and recreate then again by specify “thread 1” explicitly.

-- for standby db which is under recovery, recovery needs to be stopped first
SQL>alter database recover managed standby database cancel;

SQL>alter database add standby logfile thread 1 group 4 size 50m;

It will create standby online redo logs for one under +FRA, and another one under from “db_create_file_dest” parameter, if “db_create_online_log_dest_x” are not defined.

Finally restart the recovery process, everything is fine.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE 
THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE;

Database altered.

“ORA-16644: apply instance not available” While DGMGRL Switchover

For some reason, DG broker does not recognise the applying instance of a standby database.

DGMGRL> switchover to STDBYDB;
Performing switchover NOW, please wait...
Operation requires a connection to instance "STDBYDB2" on database
 "STDBYDB"
Connecting to instance "STDBYDB2"...
Connected as SYSDBA.
Error: ORA-16644: apply instance not available

Failed.
Unable to switchover, primary database is still "PRIMDB"
DGMGRL>

DIAGNOSIS

a) Check and find one of the instance STDBYDB1 is the recovery instance. Stop the standby recovery process, and then start it again.  Data Guard still cannot find it.

b) Stop whole standby database, and restart it again. Check standby applying is undergoing. Data Guard still cannot find it.

RESOLUTION

DGMGRL> edit database STDBYDB set property 'PreferredApplyInstance'
='STDBYDB1';
Property "PreferredApplyInstance" updated

$srvctl stop database -d STDBYDB;
$srvctl start database -d STDBYDB;

DGMGRL> edit database STDBYDB set property 'PreferredApplyInstance'='';
Property "PreferredApplyInstance" updated