Fix Oracle Database Corrupt Blocks by Using RMAN

The following errors appear in alert log:

Thu Sep 23 20:06:34 2021
Hex dump of (file 23, block 774506) in trace file /u01/app/oracle/diag/rdbms/testdb/TETSDB/trace/TETSDB_ora_7558.trc
Corrupt block relative dba: 0x000bd16a (file 23, block 774506)
Bad check value found during backing up datafile
Data in bad block:
 type: 6 format: 2 rdba: 0x000bd16a
 last change scn: 0x071e.81e8b798 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xb7980602
 check value in block header: 0xe202
 computed block checksum: 0x82fe
Reread of blocknum=774506, file=/u05/oradata/TETSDB/cdr_01.dbf. found same corrupt data
Reread of blocknum=774506, file=/u05/oradata/TETSDB/cdr_01.dbf. found same corrupt data
Reread of blocknum=774506, file=/u05/oradata/TETSDB/cdr_01.dbf. found same corrupt data
Reread of blocknum=774506, file=/u05/oradata/TETSDB/ctl_01.dbf. found same corrupt data
Reread of blocknum=774506, file=/u05/oradata/TETSDB/cdr_01.dbf. found same corrupt data

SOLUTION

The corrupted block can be fixed online without offline datafile or tablespace required.

double confirm the data corrution

Run dbverify utility

$dbv blocksize=8192 file=/u05/oradata/TETSDB/cdr_01.dbf
...
..
.
Corrupt block relative dba: 0x000bd16a (file 23, block 774506)
...
..
.
Total Pages Marked Corrupt : 1

Run RMAN utility

RMAN> connect target /
RMAN> run {
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate channel c3 type disk;
allocate channel c4 type disk;
backup validate check logical database;
close channel c1;
close channel c2;
close channel c3;
close channel c4
}

OR

Just check specified file and blocks:

RMAN> validate check logical datafile 23 block 774505 to 774507;


Check block corruption from database view

SQL> select * from  v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
        23     774506          1                  0 CHECKSUM

Check the impacted object

  SQL> select 
              relative_fno, owner, segment_name, segment_type
         from
              dba_extents
        where
              file_id = 23
         and
              774506 between block_id and block_id + blocks - 1;
  
RELATIVE_FNO OWNER       SEGMENT_NAME       SEGMENT_TYPE
------------ ----------- ------------------ ------------------
        1024  TEST       CDR_IDX1        INDEX

fix the corruption

If it is an object like index, just rebuilt it even online rebuilding the index.

SQL> alter index TEST.CDR_IDX1 rebuild online parallel 8;

Index altered.

SQL>  alter index TEST.CDR_IDX1 parallel 1;

Index altered.
$ rman target / catalog rman/passwd@rman
connected to target database: TESTDB (DBID=1234567890)
connected to recovery catalog database

RMAN> run {
allocate channel ch1 TYPE 'SBT_TAPE';
SEND 'NSR_ENV=(NSR_SERVER=BKP_SERVER,NSR_CLIENT=DB_SERVER)';
BLOCKRECOVER DATAFILE 23 block 774506;
}
...
..
.
archived log for thread 1 with sequence 51992 is already on disk as file /fra/TESTDB/archivelog/2021_09_23/o1_mf_1_51992_jnrmdprf_.arc
archived log for thread 1 with sequence 51993 is already on disk as file /fra/TESTDB/archivelog/2021_09_23/o1_mf_1_51993_jnrpkppq_.arc
media recovery complete, elapsed time: 00:05:08
Finished recover at 23-SEP-21
released channel: ch1

RMAN>

Check the progressing:

SQL> select 
           username,sid, opname, target, sofar, totalwork, units, 
           to_char(start_time,'YYYYMMDD-HH24:MI:SS') StartTime, 
           time_remaining, message  
      from 
           V$SESSION_LONGOPS 
      where 
           time_remaining>60;

Finally check and confirm the corrupted block(s) have been fixed.

SQL> select * from  v$database_block_corruption;

no rows selected                                            

For some database versions, the record in v$database_block_corruption still exists , even there is no corrupted blocks.

To clear v$database_block_corruption orphan records:

RMAN> validate check logical datafile 23 block 774505to 774507;

Starting validate at 04-OCT-21
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=413 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00023 name=/u05/oradata/TESTDB/CDR_01.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
23   OK     0              0            3               8013155974853
  File Name: /u05/oradata/TESTDB/CDR_01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              0
  Index      0              3
  Other      0              0

Finished validate at 04-OCT-21

The check v$database_block_corruption is cleared successfully.

SQL> select * from  v$database_block_corruption;

no rows selected

ORA-01578 ORA-01110 ORA-26040: Data block was loaded using the NOLOGGING option

Alert log shows following ORA- errors:

ORA-01578:ORACLE data block corrupted (file # 13, block # 652351)
ORA-01110:data file 13:'/media/sf_software/oradata/OEMREP/EMPDBREPOS/mgmt.dbf'
ORA-26040:Data block was loaded using the NOLOGGING option

Find out the segment impacted, and confirm this is an empty table.

SQL>select OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME 
    from dba_extents 
   where file_id =13 and 652351 between block_id AND block_id+blocks-1;

OWNER    SEGMENT_NAME           SEGMENT_TYPE   TABLESPACE_NAME
-------- ---------------------  -------------- ----------------
SYSMAN   MOS_PA_EVAL_TARGETS_E  TABLE          MGMT_TABLESPACE
SQL> select count(*) from SYSMAN.MOS_PA_EVAL_TARGETS_E;

COUNT(*)
----------
0
SQL>select FILE#, BLOCK#, BLOCKS,scn_to_timestamp(NONLOGGED_START_CHANGE#), 
           scn_to_timestamp( NONLOGGED_END_CHANGE#), OBJECT# 
    from v$nonlogged_block;

FILE# BLOCK#   BLOCKS  SCN_TO_TIMESTAMP(NONLOGGED_START_CHANGE#)  SCN_TO_TIMESTAMP(NONLOGGED_END_CHANGE#)  OBJECT#
---------------------- -----------------------------------------  ---------------------------------------  -------
13   652351   1        12-MAY-20 11.58.27.000000000 AM           12-MAY-20 10.59.29.000000000 PM           91461

Truncate or move the table. 

SQL>truncate table SYSMAN.MOS_PA_EVAL_TARGETS_E;

or

SQL> alter table SYSMAN.MOS_PA_EVAL_TARGETS_E move tablespace MGMT_TABLESPACE;

luckily, the table is empty. it is easy to deal with. If table is not empty, then we can skip corrupted blocks to avoid DML ORA- errors.

Skip corrupt blocks 

SQL> select * from SYSMAN.STG_MOS_PA_EVAL_TARGETS_E;

select * from SYSMAN.STG_MOS_PA_EVAL_TARGETS_E
                      *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 13, block # 652351)
ORA-26040: Data block was loaded using the NOLOGGING option
BEGIN
  DBMS_REPAIR.skip_corrupt_blocks (
    schema_name => 'SYSMAN',
    object_name => 'MOS_PA_EVAL_TARGETS_E',
    object_type => DBMS_REPAIR.table_object,
    flags       => DBMS_REPAIR.skip_flag);
END;
/
 
SQL> select owner, table_name, SKIP_CORRUPT  from dba_tables 
      where table_name='MOS_PA_EVAL_TARGETS_E'

OWNER      TABLE_NAME                     SKIP_COR
---------- ------------------------------ --------
SYSMAN     MOS_PA_EVAL_TARGETS_E          ENABLED

Backup the table data, truncate the table , and then restore the table data

SQL> create table SYSMAN.STG_MOS_PA_EVAL_TARGETS_E as 
            select * from SYSMAN.MOS_PA_EVAL_TARGETS_E;

SQL> Truncate table SYSMAN.MOS_PA_EVAL_TARGETS_E;

SQL>Insert into SYSMAN.MOS_PA_EVAL_TARGETS_E 
                select * from SYSMAN.STG_MOS_PA_EVAL_TARGETS_E ;

The nologging corrupt block is returned to free space list

SQL> Select BYTES from dba_free_space 
     where file_id=13 and 652351 between block_id and block_id + blocks -1;

BYTES
----------
65536

SQL> select * from dba_extents 
     where file_id =13 and 652351 between block_id AND block_id+blocks-1;

no rows selected

After a little while, the nologging corrupt block will be reused

SQL> Select BYTES from dba_free_space
where file_id=13 and 652351 between block_id and block_id + blocks -1; 

no rows selected

SQL> select count(*) from dba_extents
where file_id =13 and 652351 between block_id AND block_id+blocks-1; 

COUNT(*)
----------
1

How To Format Empty Failing (Index) Pages Marked by DBVERIFY

Formatting empty failing pages to make clean DBVERIFY results.

ISSUE

Following another post “ORA-00600 internal error code arguments [ktbdchk1: bad dscn] after DG switchover”, there is another smaller file number 8 with 22 indexes pages identified by dbverify as failing pages as below:

Total Pages Failing   (Index): 22

Here we are going to demonstrate how to format those failing index pages.

SOLUTION

Identify the index segments and rebuild them online accordingly.

From the dbverify log below and run the query to get index segments owner and names for rebuilding:

itl[20] has higher commit scn(0x0001.ff7137bc) than block scn (0x0001.0651314d)
Page 15795600 failed with check code 6056

SQL>SELECT tablespace_name, segment_type, owner, segment_name
FROM dba_extents
WHERE file_id =8 and 15795600 between block_id AND block_id + blocks-1;

After rebuilt all the impacted indexes and run dbverify and sql query above, all the failing pages(blocks) are returned to freelist queue. They can be verified by query against dba_free_space view as well.

SQL> select name, bytes/1024/1024/1024 from v$datafile where file#=8;

NAME                                          BYTES/1024/1024/1024
--------------------------------------------  --------------------
+DG1/testdb/datafile/users_tbl.583.845563563  188

Create a table to use all free space, big pctfree to speed up inserting to make use of all available free space.

SQL> create table EMPTY( n number ) tablespace users_tbl pctfree 99;

Table created.

To get min/max free block size.

SQL> select TABLESPACE_NAME,FILE_ID,BLOCKS,count(*)
from dba_free_space
where TABLESPACE_NAME='USERS_TBL'
group by TABLESPACE_NAME,FILE_ID,BLOCKS 
order by 2;

TABLESPACE_NAME BLOCKS COUNT(*)
-------------- ------- ----------
USERS_TBL         8       3
USERS_TBL        16       3
USERS_TBL        32       2
USERS_TBL        40       1
USERS_TBL        48       2
.........
..
USERS_TBL     75904       1

21 rows selected.

To get how many times of minimum free space blocks in total available free space.

SQL> select sum(BLOCKS)/8  
       from dba_free_space
      where TABLESPACE_NAME='USERS_TBL';

SUM(BLOCKS)/8
-----------
66944

Turn datafile autoextend off.

SQL> alter database datafile '+DG1/testdb/datafile/USERS_TBL.583.845563563' autoextend off;

Database altered.

Allocate all available free space to table EMPTY with smallest freespace blcoks.

SQL>BEGIN
      for i in 1..66944 loop
        EXECUTE IMMEDIATE 'alter table EMPTY allocate extent ( size 64K) ';
      end loop;
    END;
    /

Insert data into table EMPTY until encountering unable  to allocate space ORA- error.

SQL> Begin
       FOR i IN 1..100000000 loop
         for j IN 1..10000 loop
           Insert into EMPTY VALUES(i+j);
        end loop;
        commit;
      END LOOP;
   END;
  /

Now drop table to release the free space.

SQL> drop table EMPTY;

Table dropped.

Turn datafile autoextend on.

SQL> alter database datafile '+DG1/tetsdb/datafile/users_tbl.583.845563563' autoextend on;

Database altered.

Run DBVEIFY to confirm all pages are all right now with ZERO pages failing.

DBVERIFY - Verification starting : FILE = +DG1/testdb/datafile/users_tbl.583.845563563

DBVERIFY - Verification complete
Total Pages Examined : 26869760
Total Pages Processed (Data) : 4391230
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 682945
Total Pages Failing (Index): 0
Total Pages Processed (Lob) : 20668189
Total Pages Failing (Lob) : 0
Total Pages Processed (Other): 49949
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 1077447
Total Pages Marked Corrupt : 0