CRS-2883: Resource ‘ora.cluster_interconnect.haip’ failed during Clusterware stack start

When starting CRS on the second node of a 12.1.0.2 RAC cluster, it failed with a “CRS-2883” error.

[root@racnode2]# /u01/app/12.1.0/grid/bin/crsctl start crs -wait
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'racnode2'
CRS-2672: Attempting to start 'ora.evmd' on 'racnode2'
CRS-2676: Start of 'ora.mdnsd' on 'racnode2' succeeded
CRS-2676: Start of 'ora.evmd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'racnode2'
CRS-2676: Start of 'ora.gpnpd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'racnode2'
CRS-2676: Start of 'ora.gipcd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode2'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'racnode2'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode2'
CRS-2676: Start of 'ora.diskmon' on 'racnode2' succeeded
CRS-2676: Start of 'ora.cssd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode2'
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode2'
CRS-2883: Resource 'ora.cluster_interconnect.haip' failed during Clusterware stack start.
CRS-4406: Oracle High Availability Services synchronous start failed.
CRS-4000: Command Start failed, or completed with errors.

Check the ohasd_orarootagent_root.trc file:

...
..
.
2019-07-29 19:41:59.985295 : USRTHRD:3605423872: {0:9:3} HAIP: to set HAIP
2019-07-29 19:42:00.036464 : USRTHRD:3605423872: {0:9:3} HAIP: number of inf from clsinet -- 1
2019-07-29 19:42:00.037488 : CSSCLNT:3605423872: clssgsgrppridata: buffer too small - bufsize(4) < datasize(8)
2019-07-29 19:42:00.037795 : USRTHRD:3605423872: {0:9:3} CssGroup::getPrivateGroupData clssgsgrppridata() error, rc = 13
2019-07-29 19:42:00.037868 : USRTHRD:3605423872: {0:9:3} [NetHAMain] thread hit exception CssGroup::getPrivateGroupData clssgsgrppridata() error
2019-07-29 19:42:00.037881 : USRTHRD:3605423872: {0:9:3} [NetHAMain] thread stopping
...
..
.

CAUSE

Patch 29698592 (Grid Infrastructure Patch Set Update 12.1.0.2.190716) had been applied to the first node, but not to the second node.

RESOLUTION

All nodes should be patched with the same GI patches. Since CRS cannot be started on the second node, opatchauto cannot be used there. All GI patches must be applied manually on the second node.

On second node:

1) Kill all the cluster processes manually.
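
One way to find the leftover clusterware processes before killing them (the daemon names below are the usual suspects; verify the list against your own environment before killing anything):

[root@racnode2]# ps -ef | egrep 'ohasd|oraagent|orarootagent|gipcd|gpnpd|mdnsd|evmd|cssd|ctssd|crsd|ologgerd|osysmond' | grep -v grep
[root@racnode2]# kill -9 <pid> ...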

2) Make the GI_HOME writable by the GI owner “grid”.

[root@racnode2]# chmod -R 775 $GI_HOME

3) Manually apply GI patches.

--OCW PSU
[grid@racnode2]$ $GI_HOME/OPatch/opatch apply -oh $GI_HOME -local /tmp/29698592/29509318

--ACFS PSU
[grid@racnode2]$ $GI_HOME/OPatch/opatch apply -oh $GI_HOME -local /tmp/29698592/29423125

--DBWLM PSU
[grid@racnode2]$ $GI_HOME/OPatch/opatch apply -oh $GI_HOME -local /tmp/29698592/26983807

--DB PSU
[grid@racnode2]$ $GI_HOME/OPatch/opatch apply -oh $GI_HOME -local /tmp/29698592/29494060
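
Before proceeding, it is worth confirming that all four patches are now registered in the local inventory (opatch lspatches is a standard OPatch command):

[grid@racnode2]$ $GI_HOME/OPatch/opatch lspatches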

4) An attempt to start CRS still fails, now due to a patch-level mismatch.

[root@racnode2]# $GI_HOME/bin/crsctl start crs -wait
CRS-6706: Oracle Clusterware Release patch level ('3536172590') does 
not match Software patch level ('0'). Oracle Clusterware cannot be started.
CRS-4000: Command Start failed, or completed with errors.

Still on node 2.

For 12.2, execute “<GI_HOME>/crs/install/rootcrs.pl -prepatch” and “<GI_HOME>/crs/install/rootcrs.pl -postpatch” as the root user to correct the patch level.

For 12.1:

[root@racnode2 ]# $GI_HOME/crs/install/rootcrs.sh -patch
Using configuration parameter file: /u01/app/12.1.0/grid/crs/install/crsconfig_params
2019/07/29 22:26:41 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.

2019/07/29 22:28:40 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.

2019/07/29 22:28:56 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service'

CRS-4123: Oracle High Availability Services has been started.
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'racnode2'
CRS-2672: Attempting to start 'ora.evmd' on 'racnode2'
CRS-2676: Start of 'ora.mdnsd' on 'racnode2' succeeded
CRS-2676: Start of 'ora.evmd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'racnode2'
CRS-2676: Start of 'ora.gpnpd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'racnode2'
CRS-2676: Start of 'ora.gipcd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode2'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'racnode2'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode2'
CRS-2676: Start of 'ora.diskmon' on 'racnode2' succeeded
CRS-2676: Start of 'ora.cssd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'racnode2'
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode2'
CRS-2676: Start of 'ora.ctssd' on 'racnode2' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'racnode2' succeeded
CRS-2679: Attempting to clean 'ora.asm' on 'racnode2'
CRS-2681: Clean of 'ora.asm' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'racnode2'
CRS-2676: Start of 'ora.asm' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'racnode2'
CRS-2676: Start of 'ora.storage' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'racnode2'
CRS-2676: Start of 'ora.crf' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'racnode2'
CRS-2676: Start of 'ora.crsd' on 'racnode2' succeeded
CRS-6017: Processing resource auto-start for servers: racnode2
CRS-2672: Attempting to start 'ora.net1.network' on 'racnode2'
CRS-2676: Start of 'ora.net1.network' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.ons' on 'racnode2'
CRS-2673: Attempting to stop 'ora.racnode2.vip' on 'racnode1'
CRS-2677: Stop of 'ora.racnode2.vip' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.racnode2.vip' on 'racnode2'
CRS-2676: Start of 'ora.ons' on 'racnode2' succeeded
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'racnode1'
CRS-2676: Start of 'ora.racnode2.vip' on 'racnode2' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'racnode1'
CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'racnode2'
CRS-2677: Stop of 'ora.scan1.vip' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'racnode2'
CRS-2676: Start of 'ora.scan1.vip' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'racnode2'
CRS-2676: Start of 'ora.LISTENER.lsnr' on 'racnode2' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.ractest.db' on 'racnode2'
CRS-2676: Start of 'ora.ractest.db' on 'racnode2' succeeded
CRS-6016: Resource auto-start has completed for server racnode2
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.
Oracle Clusterware active version on the cluster is [12.1.0.2.0]. 
The cluster upgrade state is [NORMAL]. 
The cluster active patch level is [3536172590].
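
The patch level can also be verified directly with crsctl (standard 12.1 commands); the reported software patch level should now be 3536172590, matching the first node:

[root@racnode2]# $GI_HOME/bin/crsctl query crs activeversion -f
[root@racnode2]# $GI_HOME/bin/crsctl query crs softwarepatch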

Finally, use opatchauto to apply the remaining non-GI patches; the GI patches that are already in place will be skipped automatically by opatchauto.

[root@racnode2]# $GI_HOME/OPatch/opatchauto apply /tmp/29698592

SUMMARY

  • All nodes should have the same patches applied, using the latest OPatch (patch 6880880).
  • opatchauto requires that CRS can be started and shut down on the node. If CRS cannot be started, apply the GI patches manually first, then use “rootcrs.sh” (12.1) or “rootcrs.pl” (12.2) to correct the patch level on the problematic node. After that, CRS can be started on all nodes.

Fast Split Partition

Trying to split a yearly partition into 12 monthly partitions took more than 7 hours and finally failed while splitting the January partition: a regular split physically copies the rows and, with UPDATE INDEXES, maintains every index as it goes, which can exhaust the undo tablespace.

SQL>alter table CDR split partition PAR_2018 at ( 
    TO_DATE(' 2018-02-01 00:00:00','sYYYY-MM-DD HH24:MI:SS') ) 
    into ( partition par_201801, partition PAR_2018 ) update indexes;
*
ERROR at line 1:
ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDOTBS4'

Elapsed: 07:19:47.81

For 11gR2, here is a faster way to split the partition.

1) Mark the indexes unusable.

SQL>alter index CDR_IDX1 unusable;
SQL>alter index CDR_IDX2 unusable;
SQL>alter index CDR_IDX3 unusable;
SQL>alter index CDR_IDX4 unusable;
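
The index status can be confirmed from the data dictionary (a quick check, assuming the indexes are owned by the connected schema):

SQL> select index_name, status from user_indexes where table_name = 'CDR';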

2) Split the partition with the parallel option.

SQL>alter table CDR split partition PAR_2018 at ( 
    TO_DATE(' 2018-02-01 00:00:00','sYYYY-MM-DD HH24:MI:SS') ) 
    into ( partition par_201801, partition PAR_2018 )
    parallel 8;

Table altered.

Elapsed: 00:00:43.75

3) Rebuild the indexes with the parallel option.

SQL> alter index CDR_IDX1 rebuild parallel 16;

Index altered.

Elapsed: 00:01:26.70

SQL> alter index CDR_IDX2 rebuild parallel 16;

Index altered.

Elapsed: 00:00:42.96

SQL> alter index CDR_IDX3 rebuild parallel 16;

Index altered.

Elapsed: 00:00:41.34

SQL> alter index CDR_IDX4 rebuild parallel 16;

Index altered.

Elapsed: 00:00:47.16



4) Reset the index degree of parallelism back to 1; the parallel rebuild leaves each index with DEGREE 16.

SQL> alter index CDR_IDX1 parallel 1;

Index altered.

Elapsed: 00:00:00.01
SQL>  alter index CDR_IDX2 parallel 1;

Index altered.

Elapsed: 00:00:00.03
SQL>  alter index CDR_IDX3 parallel 1;

Index altered.

Elapsed: 00:00:00.01
SQL>  alter index CDR_IDX4  parallel 1;

Index altered.

Elapsed: 00:00:00.01
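
To confirm the degree of parallelism has been reset (again assuming the connected schema owns the indexes):

SQL> select index_name, degree from user_indexes where table_name = 'CDR';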

For 12c onward, a partition can be split into multiple partitions in a single statement:

SQL>alter table CDR split partition PAR_2018 into (
partition par_201801 values less than (TO_DATE(' 2018-02-01 00:00:00','sYYYY-MM-DD HH24:MI:SS')),
partition par_201802 values less than (TO_DATE(' 2018-03-01 00:00:00','sYYYY-MM-DD HH24:MI:SS')),
partition par_201803 values less than (TO_DATE(' 2018-04-01 00:00:00','sYYYY-MM-DD HH24:MI:SS')),
...
..
.
partition par_201811 values less than (TO_DATE(' 2018-12-01 00:00:00','sYYYY-MM-DD HH24:MI:SS')),
partition par_201812
);

Fast Split Partition and Split Subpartition

Fast split partition is very useful in production environments where an outage is not an option, because it has no impact on the partitioned table. Please refer to the details in Fast SPLIT PARTITION and SPLIT SUBPARTITION.

Here is a good example of using the fast split partition operation. Table CDR is partitioned by transaction date and, for some reason, invalid CDR records were created with future dates, which caused future partitions to be created.

SQL> select sysdate from dual;

SYSDATE
---------
24-JUL-19

TABLE_NAME    PARTITIONING_TYPE  INTERVAL
------------- ------------------ ---------------------------
CDR           RANGE              NUMTODSINTERVAL(1,'DAY')

TABLE_NAME PARTITION_NAME HIGH_VALUE                       NUM_ROWS
---------- -------------- ------------------------------- ---------
CDR        P_FIRST        TIMESTAMP' 1900-01-01 00:00:00'         0
CDR        SYS_P707668    TIMESTAMP' 2019-09-04 00:00:00' 296956241
CDR        SYS_P708049    TIMESTAMP' 2019-09-05 00:00:00'         0
CDR        SYS_P708687    TIMESTAMP' 2019-09-06 00:00:00'         0
CDR        SYS_P709207    TIMESTAMP' 2019-09-07 00:00:00'         0
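
The layout above can be pulled from the data dictionary; a sketch of the queries, assuming the CDR table is owned by the connected user:

SQL> select table_name, partitioning_type, interval
       from user_part_tables
      where table_name = 'CDR';

SQL> select table_name, partition_name, high_value, num_rows
       from user_tab_partitions
      where table_name = 'CDR'
      order by partition_position;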

After the invalid CDR records were cleaned up, the future partitions should be split off and dropped.

SQL> alter table CDR set interval ();

Table altered.


SQL> alter table CDR drop partition SYS_P709207 update indexes;

Table altered.


SQL> alter table CDR drop partition SYS_P708687 update indexes;

Table altered.

SQL> alter table CDR drop partition SYS_P708049  update indexes;

Table altered.

SQL> alter table CDR split partition SYS_P707668 at 
     ( TO_DATE(' 2019-07-25 00:00:00','sYYYY-MM-DD HH24:MI:SS') ) 
into ( partition par_20190724, partition SYS_P707668 ) update indexes;

Table altered.


SQL> select count(*) from CDR partition ( SYS_P707668);

COUNT(*)
----------
0

SQL> alter table CDR drop partition SYS_P707668 update indexes;

Table altered.

SQL> alter table CDR set interval (NUMTODSINTERVAL(1,'DAY'));

Table altered.


TABLE_NAME  PARTITION_NAME  HIGH_VALUE                       NUM_ROWS
----------- --------------- ------------------------------- ---------
CDR         P_FIRST         TIMESTAMP' 1900-01-01 00:00:00'        0
CDR         PAR_20190724    TIMESTAMP' 2019-07-25 00:00:00' 297116561
CDR         SYS_P709208     TIMESTAMP' 2019-07-26 00:00:00'    123456
CDR         SYS_P709923     TIMESTAMP' 2019-07-27 00:00:00'    234567
...
..
.

The partition split is very fast because all the records fall on one side of the split point and the other partition is empty, so Oracle can perform a fast split instead of physically moving rows.

ORA-16751: failed to switchover to physical standby database and Switchover Ends with Two Physical Standby Databases

While trying to switch over from the primary database to the physical standby database, we ran into an issue where both databases ended up as physical standbys.

DGMGRL> switchover to "STDBYDB"
Performing switchover NOW, please wait...
Operation requires a connection to instance "STDBYDB" on database "STDBYDB"

Connecting to instance "STDBYDB"...
Connected.

Error: ORA-16751: failed to switchover to physical standby database

Failed.

Unable to switchover, primary database is still "PRIMDB"

DGMGRL> 

--
-- Check database role
--

NAME       OPEN_MODE            DATABASE_ROLE    
--------- -------------------- ---------------- 
PRIMDB    CLOSED BY SWITCHOVER PHYSICAL STANDBY  
STDBYDB   MOUNTED              PHYSICAL STANDBY

Data Guard Broker log:

07/19/2019 08:07:55
SQL Execution error=604, 
sql=[ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN]. 
See error stack below.
ORA-00604: error occurred at recursive SQL level 1
ORA-00344: unable to re-create online log '/u05/oralog/PRIMDB/redo1a.log'
ORA-27040: file create error, unable to create file
Linux-x86_64 Error: 2: No such file or directory
Additional information:
switchover to primary command failed
Database Resource SetState Error (16751)
Command SWITCHOVER TO STDBYDB completed with error ORA-16751

Data Guard could not create the online redo logs on the standby side. The reason is that the following two parameters were not set up correctly.

The parameters on the standby side should look like this:

DB_FILE_NAME_CONVERT  = 'PRIMDB','STDBYDB'
LOG_FILE_NAME_CONVERT = 'PRIMDB','STDBYDB'
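
Both parameters are static, so on the standby they can be set in the spfile and take effect at the next restart (a sketch using the same values as above; adjust the patterns to your own directory layout):

SQL> alter system set db_file_name_convert='PRIMDB','STDBYDB' scope=spfile;
SQL> alter system set log_file_name_convert='PRIMDB','STDBYDB' scope=spfile;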

SOLUTION

Log on to the standby database server.

SQL>select database_role from v$database;

DATABASE_ROLE
----------------
PHYSICAL STANDBY

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup mount
ORACLE instance started.
...
..
.
Database mounted.


SQL> alter database recover managed standby database cancel;

Database altered.

SQL> alter database recover managed standby database finish;

Database altered.

SQL> ALTER database commit to switchover to primary with session shutdown;

Database altered.

SQL> alter database open;

Database altered.

SQL>  select database_role from v$database;

DATABASE_ROLE
----------------
PRIMARY

Finally, rebuild the Data Guard Broker configuration, as sketched below.
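
A sketch of the rebuild, with STDBYDB now in the primary role; the configuration name 'dgconf' is arbitrary:

DGMGRL> remove configuration;
DGMGRL> create configuration 'dgconf' as primary database is 'STDBYDB' connect identifier is STDBYDB;
DGMGRL> add database 'PRIMDB' as connect identifier is PRIMDB maintained as physical;
DGMGRL> enable configuration;
DGMGRL> show configuration;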

ORA-00932: inconsistent datatypes from Heterogeneous Service Database Link

There is an Oracle database link between an Oracle database and a PostgreSQL database through Heterogeneous Services (HS).
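
For reference, such a link is typically created on the Oracle side along these lines (a sketch; the HS agent, initPGLINK.ora, and listener setup are assumed to be in place, and the credentials are illustrative):

SQL> create database link PGLINK
     connect to "testuser" identified by "password"
     using 'PGLINK';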

The SQL query works fine in the PostgreSQL database:

testdb=> \d test_tbl
Table "testuser.test_tbl"
Column  | Type                    | Collation | Nullable | Default
--------+-------------------------+-----------+----------+---------
id      | double precision        |           |          |
s10     | character varying(10)   |           |          |
s20     | character varying(20)   |           |          |
s30     | character varying(2048) |           |          |


testdb=> select * from test_tbl;
 id |  s10  |  s20  |      s30
----+-------+-------+---------------
  1 | test1 | test2 | Test Record 1

(1 row)

testdb=> select "id", substr("s10",1,5),
                substr("s20",1,5),substr("s30",1,10) 
         from test_tbl;

id | substr | substr | substr
---+--------+--------+------------
1  | test1  | test2  | Test Recor

(1 row)

When running the same query in the Oracle database through the Heterogeneous Services (HS) database link, we get an “ORA-00932: inconsistent datatypes” error. Sometimes you might get “ORA-02070: database does not support TO_CHAR in this context” instead.

SQL> select "id", substr("s10",1,5),substr("s20",1,5),substr("s30",1,10) 
      from "test_tbl"@PGLINK;
select "id", substr("s10",1,5),substr("s20",1,5), substr("s30",1,10) 
from "test_tbl"@PGLINK
*
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected CHAR got LONG
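
A quick way to see how HS has mapped the remote columns is to describe the table over the link; if the wide "s30" column is reported as LONG, that matches the error above:

SQL> desc "test_tbl"@PGLINK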
