ORA-02800: Requests timed out from Oracle Scheduler Job

Get errors from running an Oracle scheduler job:

SQL> exec dbms_scheduler.run_job(job_name=>'BATCH_JOB', 
                                 use_current_session=>false);
BEGIN dbms_scheduler.run_job(job_name=>'BATCH_JOB', 
                                 use_current_session=>false); END;

*
ERROR at line 1:
ORA-02800: Requests timed out
ORA-06512: at "SYS.DBMS_ISCHED", line 196
ORA-06512: at "SYS.DBMS_SCHEDULER", line 486
ORA-06512: at line 1

WORKAROUND

1)check parameter to make sure not zero.

SQL> show parameter job_queue_processes

NAME                                 TYPE     VALUE
-------------------------------- ------------ ------------------
job_queue_processes                integer    100

2) Bounce the database due to the bugs.

3) Rerun the job successfully.

SQL> exec dbms_scheduler.run_job(job_name=>'BATCH_JOB', 
use_current_session=>false);

PL/SQL procedure successfully completed.

Too many AQ$_PLSQL_NTFN_XXXXX jobs In Oracle Database

SYMPTOM

There are more than one million  AQ$_PLSQL_NTFN_xxxxxxxx scheduler jobs in oracle database.  

sql hangs for querying dba_scheduler_jobs.

Please note there are a couple of bugs on this issue, the steps are not necessarily working for every database versions.

SQL>select count(*) from dba_scheduler_jobs;

WORKAROUND

1)set job_queue_processes =0

SQL>alter system set job_queue_processes=0 scope=memory;

2)Stop and Drop all those AQ$_PLSQL_NTFN scheduler jobs:

— add rownum<100000 into below sqls, if the sql hangs.

SQL>select 'EXEC DBMS_SCHEDULER.STOP_JOB ('''|| job_name ||''', force => TRUE);' 
    from dba_scheduler_jobs 
    where job_name like 'AQ$_PLSQL_NTFN_%'
    and state='RUNNING';


SQL> select ' exec dbms_scheduler.drop_job('''||job_name ||''');' 
     from dba_scheduler_jobs 
     where job_name like 'AQ$_PLSQL_NTFN_%';

3)

SQL>select count(*) from sys.aq$AQ_SRVNTFN_TABLE_1 
    where queue='AQ$_AQ_SRVNTFN_TABLE_1_E';

DECLARE 
po dbms_aqadm.aq$_purge_options_t; 
BEGIN 
po.block := TRUE; 
DBMS_AQADM.PURGE_QUEUE_TABLE( 
queue_table => 'SYS.AQ_SRVNTFN_TABLE_1', 
purge_condition => 'qtview.queue = ''AQ$_AQ_SRVNTFN_TABLE_1_E''', 
purge_options => po); 
END; 
/

select count(*) from sys.aq$AQ_SRVNTFN_TABLE_1 
where queue='AQ$_AQ_SRVNTFN_TABLE_1_E';

4) Collect stats.

SQL> exec dbms_stats.gather_fixed_objects_stats;

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_schema_stats ('SYSTEM');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_schema_stats ('SYS');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_dictionary_stats;

PL/SQL procedure successfully completed.

5) Rebuild indexes of DBA_SCHEDULER_JOBS

6) Bounce the instance.

SQL>shutdown immediate ;
SQL>startup;
SQL>alter system set job_queue_processes=1000 scope=memory;

ORA-04021: timeout occurred while waiting to lock object

Application job failed with trace file contents:

*** 2018-06-16 11:03:17.044
*** SESSION ID:(2437.25769) 2018-06-16 11:03:17.044
*** CLIENT ID:() 2018-06-16 11:03:17.044
*** SERVICE NAME:(BATCH) 2018-06-16 11:03:17.044
*** MODULE NAME:(DBMS_SCHEDULER) 2018-06-16 11:03:17.044
*** ACTION NAME:(IMMEDIATE16_06_2018_10_10_01) 2018-06-16 11:03:17.044

ORA-12012: error on auto execute of job "BATCH"."IMMEDIATE16_06_2018_10_10_01"
ORA-04021: timeout occurred while waiting to lock object
ORA-06512: at "BATCH.AP_DATA_PROCESS", line 652
ORA-06512: at line 1

Check the session details from ASH history:

set pagesize 100
set linesize 250
col INSTANCE_NUMBER format 99
col SESSION_ID format 99999
col BLOCKING_INST_ID format 99
col event format a20
col WAIT_CLASS format a15
col PROGRAM format a20
col MODULE format a16
col p3 format 99999999999999999999
select sql_id,to_char(SAMPLE_TIME,'YYYYMMDD-HH24:MI:SS'),
       INSTANCE_NUMBER,SESSION_ID,PROGRAM,MODULE,EVENT,
       CURRENT_OBJ#,WAIT_CLASS,BLOCKING_INST_ID,
       BLOCKING_SESSION, p3
from dba_hist_active_sess_history
where INSTANCE_NUMBER= 3 
and SESSION_ID = 2437 
and sample_time between to_date('16-06-18 11:00:00','DD-MM-YY HH24:MI:SS')
and to_date('16-06-18 11:04:00','DD-MM-YY HH24:MI:SS')
and event ='library cache lock'
order by sample_id;

SQL_ID TO_CHAR(SAMPLE_TI INSTANCE_NUMBER SESSION_ID PROGRAM MODULE EVENT CURRENT_OBJ# WAIT_CLASS BLOCKING_INST_ID BLOCKING_SESSION P3
------------- ----------------- --------------- ---------- -------------------- ---------------- -------------------- ------------ --------------- ---------------- ---------------- ---------------------
...
..
6cp7dp2xgxr6m 20180616-11:02:52 3 2437 oracle@racnode3 (J001) DBMS_SCHEDULER library cache lock 27307391 Concurrency 2 5205 76355447552016387
6cp7dp2xgxr6m 20180616-11:03:02 3 2437 oracle@racnode3 (J001) DBMS_SCHEDULER library cache lock 27307391 Concurrency 2 5205 76355447552016387
6cp7dp2xgxr6m 20180616-11:03:12 3 2437 oracle@rcanode3 (J001) DBMS_SCHEDULER library cache lock 27307391 Concurrency 2 5205 76355447552016387

20 rows selected.

The OBJ# matches the DDL command:

ALTER INDEX BATCH.CDR_IND REBUILD SUBPARTITION CDR_IND_SUB_213;
SQL>select OWNER,OBJECT_NAME,SUBOBJECT_NAME,
           OBJECT_TYPE,STATUS,LAST_DDL_TIME 
     from  dba_objects 
    where  OBJECT_ID=27307391;

OWNER OBJECT_NAME SUBOBJECT_NAME  OBJECT_TYPE        STATUS LAST_DDL_
----- ----------- --------------  ------------------ ------ --------
BATCH CDR_IND     CDR_IND_SUB_213 INDEX SUBPARTITION VALID  20180616-12:39:22

The P3 shows the waiting object details, we can see the session failed because of waiting for object CDR,  which is holding by blocking session 5205 on node 2.

SQL> select lpad('Library cache P3 value: ',50,'.')||76355447552016387 from dual;

..........................Library cache P3 value: 76355447552016387
SQL> select lpad('Library cache P3 value HEX: ',50,'.')||to_char(76355447552016387,'xxxxxxxxxxxxxxxxxxxxx') from dual;

.....................Library cache P3 value HEX: 10f44e000010003
SQL> select lpad('Object ID: ',50,'.')||to_number(substr('10f44e000010003',1,length('10f44e000010003')-8 ),'xxxxxxxxxxxxxx') from dual;

...........Object ID: 17777888
SQL> select lpad('Namespace: ',50,'.')||to_number(substr('10f44e000010003',-8,4),'xxxxxxxxxxxxxx') from dual;

.......................................Namespace: 1
SQL> select lpad('RequestMode: ',50,'.')||decode(to_number(substr('10f44e000010003',-4),'xxxxxxxxxxxxxx'),3,'exclusive mode',2,'share mode',1,'null mode',0,'no lock/pin held','unknown') from dual;

.....................................RequestMode: exclusive mode
SQL> SELECT indx,kglstdsc FROM x$kglst WHERE kglsttyp='NAMESPACE' and indx=1;

1 TABLE/PROCEDURE
SQL>select OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_TYPE,NAMESPACE 
from dba_objects where object_id=17777888;

OWNER     OBJECT_NAME   SUBOBJECT_NAME      OBJECT_TYPE     NAMESPACE
---------------- -----------------------  ---------------------------------  ------------------------  -----------------
BATCH        CDR                                                               TABLE                   1

Check blocking session details on racnode2, and found it is running DBMS_STATS against waiting object CDR from racnode3 .

b6usrg82hwsa3 20180616-11:02:56 2 5205 oracle@racnode2 (J032) DBMS_SCHEDULER row cache lock 18 Concurrency 0 2999959 GLOBAL
b6usrg82hwsa3 20180616-11:03:06 2 5205 oracle@racnode2 (J032) DBMS_SCHEDULER row cache lock 18 Concurrency 0 0       GLOBAL
b6usrg82hwsa3 20180616-11:03:16 2 5205 oracle@racnode2 (J032) DBMS_SCHEDULER row cache lock 18 Concurrency 0 1953712 GLOBAL
4m477zx25xrgq 20180616-11:03:26 2 5205 oracle@racnode2 (J032) DBMS_SCHEDULER                27307236       321 0     NOT IN WAIT
4m477zx25xrgq 20180616-11:03:36 2 5205 oracle@racnode2 (J032) DBMS_SCHEDULER                27307242       1582 0    NOT IN WAIT
4m477zx25xrgq 20180616-11:03:46 2 5205 oracle@racnode2 (J032) DBMS_SCHEDULER                27307219       1537 0    NOT IN WAIT


SQL> select sql_text from v$sql where sql_id='b6usrg82hwsa3';

SQL_TEXT
-------------------------------------------------------------------
call dbms_stats.gather_database_stats_job_proc ( )

SQL>se long 10000
SQL> select SQL_FULLTEXT from v$sql where sql_id='4m477zx25xrgq';
SQL_FULLTEXT
--------------------------------------------------------------------
/* SQL Analyze(1) */ select /*+ full(t) no_parallel(t) no_parallel_index(t)
dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monito
....
from "BATCH"."CDR" t
 where TBL$OR$IDX$PART$NUM("BATCH"."CDR",0,3,0,"ROWID") = :objn /*
 NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL
,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV
,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,ACL,NIL,NIL*/

The blocking session ID 5205 on node 2 is running a weekend maintenance job –“Optimizer Statistics Gathering”. it is analysing partition table  CDR.

There are a lot of discussions about “library cache lock” from DBMS_STATS gather of stats for a subpartition.

20180204002507

19790972 – “library cache lock” waits due to DBMS_STATS gather of stats for a sub partition

Workaround

Under the approval of Oracle Support,  you can try to change below parameter value from 15 mins to longer time.

SQL>alter system set "_kgl_time_to_wait_for_locks'=60 scope=spfile;

SQL>shutdown immediate;

SQL>startup

ORMB Batch Scheduling Job Hangs At F1-FLUSH with “PL/SQL Lock Timer” Wait Event

Overnight ORMB Batch Scheduling Job has been running for more than twenty hours, user wants to know why ?

1) Check which job at what step is still running and hanging.

SQL> select OWNER,JOB_NAME,JOB_SUBNAME,SESSION_ID,ELAPSED_TIME,CPU_USED 
     from dba_scheduler_running_jobs;

OWNER   JOB_NAME   JOB_SUBNAME  SESSION_ID ELAPSED_TIME  CPU_USED
------- ---------- ------------ ---------- ----------- ---------------
SCHADM  DAILY_JOBS F1_FLUSH_JOB 1006       +000 21:47:00.08 +00000:00:22.15

2) Check session status.

SQL> select USERNAME,STATUS,EVENT,sql_id,LAST_CALL_ET 
      from gv$session where sid=1006;

USERNAME  STATUS EVENT               SQL_ID           LAST_CALL_ET
--------- ------ ------------------  ---------------- -----------
SCHADM    ACTIVE PL/SQL lock timer   fnfa105xssacx    78949

3) Check which sql is the running with ‘PL/SQL lock timer’.

SQL> set long 20000
SQL> select SQL_FULLTEXT from v$sql where sql_id='fnfa105xssacx';

SQL_FULLTEXT
---------------------------------------------------------------------
DECLARE job BINARY_INTEGER := :job; next_date TIMESTAMP WITH TIME ZONE := :myda
te; broken BOOLEAN := FALSE; job_name VARCHAR2(30) := :job_name; job_subname
VARCHAR2(30) := :job_subname; job_owner VARCHAR2(30) := :job_owner; job_start
TIMESTAMP WITH TIME ZONE := :job_start; job_scheduled_start TIMESTAMP WITH TIME
 ZONE := :job_scheduled_start; window_start TIMESTAMP WITH TIME ZONE := :window
_start; window_end TIMESTAMP WITH TIME ZONE := :window_end; chain_id VARCHAR2(
14) := :chainid; credential_owner varchar2(30) := :credown; credential_name
varchar2(30) := :crednam; destination_owner varchar2(30) := :destown; destinat
ion_name varchar2(30) := :destnam; job_dest_id varchar2(14) := :jdestid; log_i
d number := :log_id; BEGIN DECLARE
 BATCH_CODE CHAR(8);
 USER_ID CHAR(8);
 THREAD_COUNT NUMBER;
 BUSINESS_DATE DATE;
 MAX_TIMEOUT_MINS NUMBER;
 RERUN_NUMBER NUMBER;
 THREAD_POOL VARCHAR2(254);
 MAIL_ID VARCHAR2(254);
 MIN_RUN_STATUS CHAR(2);
 RAISE_ERROR BOOLEAN;
 THREAD_NOTIFICATIONS BOOLEAN;
 SOFT_PARM_MAP CISADM.OUAF_BATCH.PARM_MAP_T;
 SOFT_PARM_LIST CISADM.OUAF_BATCH.PARM_LIST_T;
BEGIN
 BATCH_CODE := 'F1-FLUSH';
 USER_ID := 'JAMES';
 --THREAD_COUNT := 0;
 --BUSINESS_DATE := to_date('2016-01-31','yyyy-MM-dd');
 MAX_TIMEOUT_MINS := 0;
 RERUN_NUMBER := 0;
 THREAD_POOL := 'ORMB_BATCH';
 --MIN_RUN_STATUS := NULL;
 --RAISE_ERROR := NULL;
 -- THREAD_NOTIFICATIONS := true;
 --SOFT_PARM_MAP := NULL;
 --SOFT_PARM_LIST := CISADM.OUAF_BATCH.PARM_LIST_T('DIST-THD-POOL','');
 OUAF_BATCH.SUBMIT_JOB(
 BATCH_CODE => BATCH_CODE,
 USER_ID => USER_ID,
 THREAD_COUNT => THREAD_COUNT,
 BUSINESS_DATE => BUSINESS_DATE,
 MAX_TIMEOUT_MINS => MAX_TIMEOUT_MINS,
 RERUN_NUMBER => RERUN_NUMBER,
 THREAD_POOL => THREAD_POOL,
 MIN_RUN_STATUS => MIN_RUN_STATUS,
 RAISE_ERROR => RAISE_ERROR,
 THREAD_NOTIFICATIONS => THREAD_NOTIFICATIONS,
 SOFT_PARM_MAP => SOFT_PARM_MAP,
 SOFT_PARM_LIST => SOFT_PARM_LIST
 );
END; :mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0; END IF; END;

4) ‘PL/SQL Lock Timer’ event ( Doc ID 1476088.1 )

"PL/SQL Lock Timer"

This event is called through the DBMSLOCK.SLEEP procedure or USERLOCK.SLEEP procedure. This event is most likely to come from user written stored procedures. The ‘idle’ event ‘PL/SQL lock timer’ is worth watching because it points us in the direction of application response, throughput, and possible coding issues. The PL/SQL lock timer event is, as the command that issues it states, a sleep mandated by the application code. The application is idle and doing nothing. This means that if the application sleeps for a combined interval of 5 minutes, the response to the user or complete run of the application will take at least 5 minutes longer.
While this event does not require resources and thus is not an Oracle resource contention issue, it is our job to provide the quickest response to users and push applications through our system to increase the user’s perceived performance. Most often, sleeps are put into an application for serialization of transactional events or spinning on queues until something happens. When the event occurs or a queue is populated, then the application continues its work. We should ask ourselves why an application is sleeping for any amount of time and provide alternatives if necessary to reduce this idle event. You will gain quicker responses if you trigger an action instead of the action waiting for an event.

This event is called through the DBMSLOCK.SLEEP procedure or 
USERLOCK.SLEEP procedure. This event is most likely to come from 
user written stored procedures.

5) Sent the details to vendor why F1-Flush has “PL/SQL lock timer’  wait event forever.

How to send email notifications for dbms_scheduler jobs

SITUATION

Jobs can be scheduled  from either OEM ( Oracle Enterprise Manager 11g Grid and 12c Cloud ) or database DBMS_SCHEDULER. The following two pictures  show the boxes ticked will trigger notifications when the event happens.

Environment:

                 OEM 12c :       12.1.0.4
                 Database:       11.2.0.4

(picture 1 ) — OEM JOB

OEM Job Email Notification
( picture 2 )– DBMS_SCHEDULER JOB

DBMS_SCHEDULER Job Notification

Email notification for OEM jobs works fine, but we could not receive email notifications from DBMS_SCHEDULER  jobs.

WORKAROUND

Email notification needs to be configured explicitly for database DBMS_SCHEDULER jobs to send email notifications.

  • Ask system administrator to check or configure SMTP server availability from database server.
  • Check and make sure ACL is configured for the database user who runs the DBMS_SCHEDULER job as per “Configure ACL”
  • Set up the SMTP server and the email address
Begin
dbms_scheduler.set_scheduler_attribute('email_server','smtp.mydomain.com:25');
dbms_scheduler.set_scheduler_attribute('email_sender','job_runner@mydomain.com');
End;
variable v_value VARCHAR2(1000);
BEGIN
    DBMS_SCHEDULER.get_scheduler_attribute (
      attribute => '&1',
      value => :v_value);
END;
/
Enter value for 1: email_server
old 3: attribute => '&1',
new 3: attribute => 'email_server',

PL/SQL procedure successfully completed.

SQL> print v_value

V_VALUE
---------------------------------------------------
smtp.mydomain.com:25

SQL> /
Enter value for 1: email_sender
old 3: attribute => '&1',
new 3: attribute => 'email_sender',

PL/SQL procedure successfully completed.

SQL> print v_value

V_VALUE
---------------------------------------------
job_runner@mydomain.com
  • Configure the job with notification as the job owner to run :
 BEGIN
 DBMS_SCHEDULER.ADD_JOB_EMAIL_NOTIFICATION (
 job_name => 'TEST_JOB',
 recipients => 'oracle_dba@mydomain.com',
 sender => 'do_not_reply@mydomain.com',
 subject=>'Scheduler Job Notification-%job_owner%.%job_name%-%event_type%',
 body => '%event_type% occurred at %event_timestamp%. %error_message%',
 events => 'JOB_FAILED, JOB_BROKEN, JOB_DISABLED, OB_SCH_LIM_REACHED');
END;
/

PL/SQL procedure successfully completed.
A notification can be generated for the following events:

 

  • job_all_events
  • job_broken
  • job_chain_stalled
  • job_completed
  • job_disabled
  • job_failed
  • job_over_max_dur
  • job_run_completed
  • job_sch_lim_reached
  • job_started
  • job_stopped
  • job_succeeded
SQL> SELECT JOB_NAME, RECIPIENT, EVENT FROM USER_SCHEDULER_NOTIFICATIONS;
JOB_NAMER   RECIPIENT                   EVENT
----------  --------------------------- -----------------------------
TEST_JOB    oracle_dba@mydomain.com     JOB_FAILED
TEST_JOB    oracle_dba@mydomain.com     JOB_BROKEN
TEST_JOB    oracle_dba@mydomain.com     JOB_SCH_LIM_REACHED
TEST_JOB    oracle_dba@mydomain.com     JOB_DISABLED
 
PLEASE NOTE:  don’t tick the box of email notification for database dbms_scheduler  jobs ( ( picture 2 )– DBMS_SCHEDULER JOB ) , otherwise it will not work again. Hopefully this will be fixed in future. It is not working for OEM 12.1.0.4.
 
  • To remove notifications:
BEGIN
 DBMS_SCHEDULER.REMOVE_JOB_EMAIL_NOTIFICATION (
   job_name => 'TEST_JOB',
 recipients => 'oracle_dba@mydomain.com',
     events => 'JOB_FAILED, JOB_BROKEN, JOB_DISABLED, OB_SCH_LIM_REACHED');
END;
/