Blog

The agent is overloaded [current requests: 128]

The cause was a Java layer deadlock in the agent (“Dead Lock detected!!”). After bouncing the agent, everything worked fine again.

SITUATION

The following alert was received from racnode1: “The agent is overloaded [current requests: 128]”

From: oracle 
Sent: Friday, 4 August 2017 7:07 PM
Cc: 
Subject: EM Event: Warning: racnode1 - Agent Unreachable (REASON = The agent is overloaded [current requests: 128]). Host is reachable.

...
..
.
Categories=Availability 
Message=Agent Unreachable (REASON = The agent is overloaded [current requests: 128]). Host is reachable. 
Severity=Warning 
Event reported time=Aug 4, 2017 7:06:27 PM AEST
...
..
.

INVESTIGATING

1) Check agent status

Agent is running
Agent upload is not working
Agent reload is not working
OMS heartbeat is not working

$ emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : 13.2.0.0.0
Protocol Version : 12.1.0.1.0
..
.
Last Reload : 2017-08-04 11:28:59
Last successful upload : 2017-08-04 14:51:03  <--- 5 hours ago
Last attempted upload : 2017-08-04 14:51:03
..
.
Last attempted heartbeat to OMS : 2017-08-04 14:50:23
Last successful heartbeat to OMS : 2017-08-04 14:50:23
Next scheduled heartbeat to OMS : 2017-08-04 14:51:23
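
The gap between the last successful upload and the current time is the key signal in this output. As a minimal sketch, assuming the status text has already been captured to a string, the timestamp lines can be parsed and the age of the last upload computed. The function names and the 30-minute threshold are illustrative assumptions, not part of emctl.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as "Last successful upload : 2017-08-04 14:51:03"
FIELD_RE = re.compile(
    r"^(?P<name>[A-Za-z ]+?)\s*:\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def parse_status_timestamps(status_text):
    """Return {field name: datetime} for the timestamp lines in the status text."""
    fields = {}
    for line in status_text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            # Collapse repeated spaces so "heartbeat to  OMS" and "heartbeat to OMS" agree
            name = " ".join(match.group("name").split())
            fields[name] = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
    return fields


def upload_age(status_text, now):
    """Time elapsed between the last successful upload and `now`."""
    return now - parse_status_timestamps(status_text)["Last successful upload"]


def upload_is_stale(status_text, now, max_age=timedelta(minutes=30)):
    """True when the last successful upload is older than max_age (assumed threshold)."""
    return upload_age(status_text, now) > max_age
```

Passing `now` explicitly keeps the check reproducible; in a real monitor it would be `datetime.now()` on the agent host, so that both sides use the same time zone.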

2) Upload agent
$ emctl upload agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload error:The agent is overloaded [current requests: 128]
3) Reload agent
$ emctl reload agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD reload error:The agent is overloaded [current requests: 128]
4) The “emagent_perl.trc” file has not been updated since the agent was last restarted
5) Check “gcagent.log”

Java layer deadlock: “Dead Lock detected!!”

2017-08-04 19:28:59,071 [43:GCThread-13] ERROR -
Dead Lock detected!!
Participating threads:Thread Info Dump:
=================
"HTTP Listener-3592 - /emd/main/ (~Task-free~ OMS.pbs@16398@omsnode=>[150183756670001])" tid=3592 WAITING
 > Accumulated wait time (msec): 1372208 (1 times)

"HTTP Listener-2141 - /emd/main/ (~Task-free~ OMS.pbs@13103@omsnode=>[150182243190001])" tid=2141 BLOCKED
 > Accumulated wait time (msec): 11036289 (76 times)
 > Accumulated blocked time (msec): 16506994 (4 times)

"oracle.dfw.impl.incident.DiagnosticsDataExtractorImpl - Incident Dump Executor (created: Fri Aug 04 14:51:06 EST 2017)" tid=3088 BLOCKED
 > Accumulated blocked time (msec): 16672145 (7 times)

"HTTP Listener-1022 - /emd/main/ (~Task-free~ OMS.pbs@16398@omsnode=>[150181021899001])" tid=1022 WAITING
 > Accumulated wait time (msec): 28746227 (37 times)
 > Accumulated blocked time (msec): 133 (12 times)

"HTTP Listener-1078 - /emd/main/ (DispatchRequests OMS.console@16398@omsnode=>[150181015881006])" tid=1078 WAITING
 > Accumulated wait time (msec): 28719225 (44 times)

=================
Thread Info Dump:
=================
"HTTP Listener-3592 - /emd/main/ (~Task-free~ OMS.pbs@16398@omsnode=>[150183756670001])" tid=3592 WAITING
 sun.misc.Unsafe.park(Native Method)
 - waiting on <0x149717ec> (a java.util.concurrent.locks.ReentrantLock$NonfairSync), which is owned by "HTTP Listener-2141 - /emd/main/ (~Task-free~ OMS.pbs@13103@omsnode=>[150182243190001])" (tid=2141)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
 java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
...
..
.
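
To see who is waiting on whom in a dump like this, the “waiting on ... which is owned by ... (tid=N)” lines can be turned into a wait-for graph and checked for a cycle. The sketch below is an illustration only: it assumes the line layout shown in the excerpt above, and the function names are made up for this post.

```python
import re

# Thread header, e.g.  "HTTP Listener-3592 - ..." tid=3592 WAITING
HEADER_RE = re.compile(r'^".*" tid=(\d+) \w+')
# Lock owner, e.g.  ... which is owned by "HTTP Listener-2141 - ..." (tid=2141)
OWNER_RE = re.compile(r'which is owned by ".*" \(tid=(\d+)\)')


def wait_for_edges(dump_text):
    """Map each waiting thread id to the id of the thread owning the lock it wants."""
    edges = {}
    current = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        owner = OWNER_RE.search(line)
        if owner and current is not None:
            edges[current] = int(owner.group(1))
    return edges


def find_cycle(edges):
    """Return the thread ids that form a wait-for cycle, or [] if there is none."""
    for start in edges:
        path, seen, node = [], set(), start
        while node in edges and node not in seen:
            seen.add(node)
            path.append(node)
            node = edges[node]
        if node in seen:
            # Drop any lead-in threads that wait on the cycle without being part of it
            return path[path.index(node):]
    return []
```

On the excerpt above this yields a single edge, 3592 waiting on 2141, because the rest of the dump is elided. A cycle only appears once the stack sections of the other participating threads are included.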

SOLUTION

1) Stop agent
$ emctl stop agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Stopping agent ...
 stopped.
2) Start agent
$ emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Starting agent ............................................ started.
3) Upload agent successfully
$ emctl upload agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload completed successfully
4) Reload agent successfully
$ emctl reload agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD reload completed successfully
5) Check agent status successfully
$ emctl status agent
...
..
Last attempted heartbeat to  OMS : 2017-08-04 19:53:31
Last successful heartbeat to OMS : 2017-08-04 19:53:31
Next scheduled heartbeat to  OMS : 2017-08-04 19:54:32

---------------------------------------------------------------
Agent is Running and Ready
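
The restart sequence above can be scripted. This is a rough sketch, assuming `emctl` is on the PATH of the agent owner. It does not rely on emctl exit codes; it looks only for the message strings seen in this post. `bounce_agent` and `run_emctl` are illustrative names.

```python
import subprocess

# The same commands as steps 1) to 5), in order
BOUNCE_STEPS = [
    ["emctl", "stop", "agent"],
    ["emctl", "start", "agent"],
    ["emctl", "upload", "agent"],
    ["emctl", "reload", "agent"],
    ["emctl", "status", "agent"],
]


def run_emctl(cmd):
    """Run one emctl command and return stdout and stderr combined as text."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True
    )
    return result.stdout


def bounce_agent(run=run_emctl):
    """Run the steps in order and return (ok, last command, last output).

    Stops at the first step whose output reports an overload or an EMD error.
    """
    cmd, output = None, ""
    for cmd in BOUNCE_STEPS:
        output = run(cmd)
        if "The agent is overloaded" in output or "error:" in output:
            return False, cmd, output
    # The final step is the status check; success means the ready banner is present
    return "Agent is Running and Ready" in output, cmd, output
```

The `run` parameter exists so the sequence can be dry-run with a stub before it is pointed at a real agent.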

High Swap Usage On Oracle Database Server

SITUATION

While investigating a client’s Oracle database performance issue, we found that swap space usage was consistently very high on this Linux server.

OS: RHEL 7.3
DB: Oracle 12.2.0.1

FINDINGS

1) top
Tasks: 352 total, 2 running, 350 sleeping, 0 stopped, 0 zombie
Cpu(s): 13.4%us, 4.1%sy, 0.0%ni, 79.3%id, 2.2%wa, 0.3%hi, 0.8%si, 0.0%st
Mem: 32172820k total, 32015956k used, 156864k free, 14528k buffers
Swap: 16777208k total, 7435428k used, 9341780k free, 11129844k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
137049 oracle 15 0 16.2g 5.3g 5.3g S 20.6 17.4 10:14.17 oracle
 72457 oracle 15 0 16.2g 4.7g 4.7g S 15.3 15.3 10:50.15 oracle
...
..
.
2) pmap
$ pmap -x 137049
137049: oracleRACTEST1 (LOCAL=NO)
Address          Kbytes   RSS     Dirty Mode Mapping
0000000000400000 96356    11704   0     r-x-- oracle
0000000006419000 444      140     4     rwx-- oracle
0000000006488000 148      100     80    rwx-- [ anon ]
000000001966e000 532      176     92    rwx-- [ anon ]
0000000060000000 16779264 5444888 1485768 rwxs- [ shmid=0x670005 ]
00000032b6a00000 112      108     0     r-x-- ld-2.5.so
00000032b6c1c000 4        0       0     r-x-- ld-2.5.so
...
..
.
00007fff4c504000 160      136     132     rwx-- [ stack ]
00007fff4c5d2000 12       4       0       r-x-- [ anon ]
ffffffffff600000 8192     0       0       ----- [ anon ]
---------------- -------- ------- -------
total kB         16954204 5477952 1490448
3) swappiness
$ cat /proc/sys/vm/swappiness
10
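
To put the figures from steps 1) and 2) into proportion, here is a small sketch that works out the share of swap in use and how much of the process RSS sits in the shared memory segment. The sample strings are copied from the output above and the helper names are illustrative. Reading the `shmid` mapping as the SGA is an assumption, although it is the usual case for an Oracle server process.

```python
import re

TOP_SWAP = "Swap: 16777208k total, 7435428k used, 9341780k free, 11129844k cached"

PMAP_SAMPLE = """\
0000000000400000 96356    11704   0     r-x-- oracle
0000000060000000 16779264 5444888 1485768 rwxs- [ shmid=0x670005 ]
total kB 16954204 5477952 1490448
"""


def parse_top_kb(line):
    """Parse a top summary line into {label: kB}, e.g. {'total': ..., 'used': ...}."""
    return {label: int(value) for value, label in re.findall(r"(\d+)k (\w+)", line)}


def used_percent(stats):
    """Percentage of 'total' that is 'used'."""
    return 100.0 * stats["used"] / stats["total"]


def parse_pmap(text):
    """Return (rows, total_rss_kb); each row is (kbytes, rss, dirty, mapping)."""
    rows, total_rss = [], None
    for line in text.splitlines():
        parts = line.split(None, 5)
        if line.startswith("total kB") and len(parts) >= 4:
            total_rss = int(parts[3])  # columns: total, kB, Kbytes, RSS, Dirty
        elif len(parts) == 6 and all(p.isdigit() for p in parts[1:4]):
            rows.append((int(parts[1]), int(parts[2]), int(parts[3]), parts[5].strip()))
    return rows, total_rss


def shared_memory_share(text):
    """Percentage of the process RSS that belongs to System V shared memory mappings."""
    rows, total_rss = parse_pmap(text)
    shm_rss = sum(rss for _, rss, _, mapping in rows if "shmid" in mapping)
    return 100.0 * shm_rss / total_rss


if __name__ == "__main__":
    print(round(used_percent(parse_top_kb(TOP_SWAP)), 1))
    print(round(shared_memory_share(PMAP_SAMPLE), 1))
```

With these numbers, roughly 44% of swap is in use, and about 99% of the 5.3g resident size of PID 137049 is the shared memory segment rather than private process memory.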

ohasd failed to start

SITUATION

1) Two 11.2.0.4 RAC nodes (racnode1 and racnode2). Node racnode2 was deleted from the cluster, its OS was upgraded from RHEL 4 to RHEL 7, and we then tried to add racnode2 back into the cluster.

2) After export IGNORE_PREADDNODE_CHECKS=Y, we ran addnode.sh -> orainstRoot.sh -> root.sh, and then got the “ohasd failed to start” error.

...
..
.
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2017-07-24 16:11:23.098:
[client(11401)]CRS-2101:The OLR was formatted using version 3.
2017-07-24 16:11:24.001:
[client(11424)]CRS-1001:The OCR was formatted using version 3.

ohasd failed to start at /u01/app/11.2.0.4/grid/crs/install/roothas.pl line 377, line 4.

SOLUTION

Copy Oracle Home Binary from One RAC Node to Another RAC Node

For some reason, the RAC Oracle home binaries were corrupted on racnode2, so they need to be copied from racnode1 and reconfigured.

Move OCR, Voting Disk File, ASM SPFILE to New Diskgroup

It is recommended to put the OCR, voting file and ASM SPFILE onto a dedicated diskgroup.

This exercise moves everything from the old OCR/voting diskgroup OCR_VOTE to a new diskgroup OCR_VOTE2 in 11.2.0.4.
