Concurrent Manager Troubleshooting : Multiple Concurrent Requests Struck without Processing


Scenario: Concurrent Requests are in Running Normal state for longer time than normal duration of the Program.

Analysis: 

Step 1: Navigate to System Administrator Responsibility : Concurrent Manager –> Administer

Identify the Concurrent Requests which are in Running Status.

Get the Database Session Details of the currently running concurrent requests using the below Query:

SELECT DISTINCT  a.request_id,C.INST_ID, d.sid, d.serial# ,d.osuser,d.process , c.SPID ,d.inst_id
FROM apps.fnd_concurrent_requests a,
apps.fnd_concurrent_processes b,
gv$process c,
gv$session d
WHERE a.controlling_manager = b.concurrent_process_id
AND c.pid = b.oracle_process_id
AND b.session_id=d.audsid
AND a.phase_code = ‘R’ and a.status_coDe=’R’;

Step 2: Verify if the database sessions are active/Inactive at the database level using the below query.

select inst_id,sid,serial#,program,module,status,last_call_et,sql_id from gv$session where sid=&sid;

— sid value to be taken from output of Sql Query in Step 1

If Database session is INACTIVE And Running no sql for more than an Hour, we can Terminate the Concurrent Requests.
If the Database session is ACTIVE and has an SQL_ID attached with it, Need to check on tuning the sql being run by the database session.

I will cover more details about SQL Tuning in another post.

How to use coe_xfr_sql_profile.sql to Improve performance of a sql query?

Oracle SQL Performance tuning is one of the frequently reported issues in a Oracle Database administrator’s daily job.
Very often we notice that sql queries or concurrent Programs which were running within few minutes earlier are now taking hours to be completed.
Underlying reason is due to the plan_hash_value used by the query has been changed.
coe_xfr_sql_profile.sql is the script developed by Oracle Support which helps us to identify the plan_hash_value which will resolve the performance issue.


Steps to use the script coe_xfr_sql_profile.sql


1. Download the script to a temporary directory on the database server.
Script is also available in Oracle Metalink.


2. Before Running the coe_xfr_sql_profile.sql, we need to know the sql_id of the sql query which is taking longer time to complete than usual.


To find the sql_id use below query


2.1 If you know the database SID of the long running query.
select inst_id,sid,serial#,status,last_Call_et,sql_id from gv$session where sid= ‘&sessionid’;


2.2 If you know the query test, Use gv$sql and gv$sqlarea views to get the sql_id.


3. Login to sqlplus as sysdba and run coe_xfr_sql_profile.sql at the SQL prompt.


$ sqlplus “/as sysdba”
SQL> @coe_xfr_sql_profile.sql
Parameter 1:
SQL_ID (required)


Enter value for 1: 0a3f7vuks8d7y   (–this is the sql_id of long running sql)
PLAN_HASH_VALUE AVG_ET_SECS
————— ———–
1484137450 15.6
3622468234 4560.76


[Output shows the list of available plan hash values, which can be forced on to the sql query. We need to select the plan_hash_value with low ETA to resolve the performance issue]


Parameter 2:
PLAN_HASH_VALUE (required)


Enter value for 2: 1484137450 




++Based on the inputs provided, it generates a sql script with naming convention coe_xfr_sql_profile_0a3f7vuks8d7y_1484137450.sql as output


coe_xfr_sql_profile_<sql_id>_<plan_hash_value>.sql


4. Run the script as sysdba


SQL>coe_xfr_sql_profile_0a3f7vuks8d7y_1484137450.sql 


5. Verify that Performance issue is resolved by Re-running the sql.

How to Use Logon Trigger to trace sessions connected to Oracle database?

To start tracing:


create or replace trigger user_logon_trg
after logon on database
begin
if USER = ‘xxxx’ then
execute immediate
‘Alter session set events ”10046 trace name context forever, level 8”’;
end if;
end;
/


/* Login a new session as User ‘xxxx’ and execute your selects to be traced */




NOTE: The following syntax can also be used within the if logic to also get the user name or to add more granularity by specifying a host name where the connection originated.


IF SYS_CONTEXT(‘USERENV’,’SESSION_USER’) = ‘<USER_NM>’ AND SYS_CONTEXT(‘USERENV’,’HOST’) = ‘<HOST_NM>’ THEN


To stop tracing: via LogOff Trigger (needs to be created before logging off)


create or replace trigger user_logoff_trg
before logoff on database
begin
if USER = ‘xxxx’ then
execute immediate
‘Alter session set events ”10046 trace name context off”’;
end if;
end;
/


Concurrent Manager Troubleshooting : Multiple Concurrent Requests Struck without Processing



Scenario: Concurrent Requests are in Running Normal state for longer time than normal duration of the Program.


Analysis: 


Step 1: Navigate to System Administrator Responsibility : Concurrent Manager –> Administer 


Identify the Concurrent Requests which are in Running Status.


Get the Database Session Details of the currently running concurrent requests using the below Query:


SELECT DISTINCT  a.request_id,C.INST_ID, d.sid, d.serial# ,d.osuser,d.process , c.SPID ,d.inst_id
FROM apps.fnd_concurrent_requests a,
apps.fnd_concurrent_processes b,
gv$process c,
gv$session d
WHERE a.controlling_manager = b.concurrent_process_id
AND c.pid = b.oracle_process_id
AND b.session_id=d.audsid
AND a.phase_code = ‘R’ and a.status_coDe=’R’;




Step 2: Verify if the database sessions are active/Inactive at the database level using the below query.




select inst_id,sid,serial#,program,module,status,last_call_et,sql_id from gv$session where sid=&sid;


— sid value to be taken from output of Sql Query in Step 1




If Database session is INACTIVE And Running no sql for more than an Hour, we can Terminate the Concurrent Requests.
If the Database session is ACTIVE and has an SQL_ID attached with it, Need to check on tuning the sql being run by the database session.


I will cover more details about SQL Tuning in another post.

SQL Profile and SQL Plan Baseline

What is SQL Profile?


A SQL profile contains corrections for poor optimizer estimates discovered by the SQL Tuning advisor.A  SQL profile is to a SQL statement what statistics are to a table or index.
Implementing sql profiles allows us to replace bad sql execution plan with good execution plan.





What is SQL Plan Baseline?


A SQL plan baseline for a SQL statement consists of a set of accepted plans. When the statement is parsed, the optimizer will only select the best plan from among this set. If a different plan is found using the normal cost-based selection process, the optimizer will add it to the plan history but this plan will not be used until it is verified to perform better than the existing accepted plan and is evolved.

Clusterware Compatibility (Destructive) Testing

Destructive tests include forced failures by software and
hardware while the system is running with either minimal or high workload.
Oracle software – one or more of Oracle background processes is killed manually.
OS software – one or more of the cluster daemons is killed manually or the
system is forced to reboot. Hardware: Manual removal of network or disk
connectivity or power supply.
There are two major categories of cluster compatibility
tests:
 Clusterware (Destructive):
Starting with Oracle Database 10g, the certification and
validation process has been enhanced to include hardware destructive tests
executed under high system load.
   Cluster File System:
  Starting with Oracle
Database 11g, the certification and validation process has been further
enhanced to include a set of destructive and high availability tests, designed
to verify the use of cluster file system to support the various Oracle
Clusterware and Real Application Clusters components.

SQL Tuning Advisor (10g and 11g)

SQL Tuning Advisor is used for tuning SQL statements.
This is run to implement a ADDM performance finding.
Additionally, you can run the SQL Tuning Advisor on the most resource-intensive SQL statements, referred to as top SQL, from the cursor cache or the AWR, as well as on a user-defined SQL workload.

To run the SQL Tuning Advisor do the following:

1. On the Home Page, under Related Links, click Advisor Central, then click SQL Tuning Advisor. The SQL Tuning Advisor Links page appears.
2. The advisor can be run on one of the following sources:
Top SQL—These consist of recently active top SQL statements from the cursor cache (Spot SQL) or historical top SQL from the AWR (Period SQL).
SQL Tuning Sets—These consist of SQL statements you provide. An STS can be created from SQL statements captured by AWR snapshots or from any SQL workload.
3. For example, you can select Top SQL. The Top SQL page appears. This page has two tabs, Spot SQL and Period SQL. Spot lists recent top SQL from the cursor cache, while Period SQL lists historical top SQL captured in the AWR. You must select an interval to analyze by dragging the shaded box over the period. You then select one or more SQL statements to analyze during the selected period.
4. Click Run SQL Tuning Advisor. The SQL Tuning Options page appears showing the SQL statements in the interval. Give your task a name and description, select the scope for the analysis (Comprehensive or Limited), and select a start time for the task. Click OK.
5. Navigate back to the Advisor Central page. The status of Advisor Tasks is listed under this heading in the results section. You must wait until your task status is COMPLETED. You can check the status by clicking your browser’s Refresh button. Then, select your task and click View Result.
6. The SQL Tuning Result page appears. To view recommendations, select the SQL statement and click View Recommendations. The recommendation can include one or more of the following:
Create an index to offer alternate, faster access paths to the query optimizer.
Accept SQL profile, which contains additional SQL statistics specific to the statement that enables the query optimizer to generate a significantly better execution plan.
Gather optimizer statistics on objects with stale or no statistics.
Advice on how to rewrite a query for better performance.

SQL Advisor in Oracle 10g


Another great feature of Oracle 10G that allow you to tune SQL. Now you don’t need to tune SQL statement manually. This new feature
does it for you.

SQL Tuning Advisor using DBMS_SQLTUNE package and very simple to use.

The example below shows how to use SQL advisor.

1. Grant following access to the user that is going to run this new tool. In the example below SCOTT is the owner of the schema.

GRANT ADVISOR TO USER1;
GRANT SELECT_CATALOG_ROLE TO USER1;
GRANT EXECUTE ON DBMS_SQLTUNE TO USER1;
2. Create the tuning task

DECLARE
task_name_var VARCHAR2(30);
sqltext_var CLOB;
BEGIN
sqltext_var := ‘SELECT * from TAB1 where empno = 1200’;
task_name_var := DBMS_SQLTUNE.CREATE_TUNING_TASK(
sql_text => sqltext_var,
user_name => ‘USER1’,
scope => ‘COMPREHENSIVE’,
time_limit => 60,
task_name => ‘sql_tuning_task_test1’,
description => ‘This is a  tuning task on TAB1 table’);
END;
/
Some time you may have queries that might take longer than the time that you have specified in the “time_limit” parameter. If this is the case then remove this parameter.

NOTE: You can not create more than one task with the same name. If this is the case then drop the existing task or use a different name.

2.1 To view the existing task for the user run the following statement.

select task_name from dba_advisor_log where owner = ‘SCOTT’;
3. Execute the tuning task

Execute dbms_sqltune.Execute_tuning_task (task_name => ‘sql_tuning_task_test1’);
3.1 You can check the status of the task using following query.

select status from dba_advisor_log where task_name=’sql_tuning_task_test1′;

4. Now view the Recommendation

set linesize 100
set long 1000
set longchunksize 1000

SQL> select dbms_sqltune.report_tuning_task(‘sql_tuning_task_test1’) from dual;

DBMS_SQLTUNE.REPORT_TUNING_TASK(‘SQL_TUNING_TASK_TEST1’)
—————————————————————————————————-
GENERAL INFORMATION SECTION
——————————————————————————-
Tuning Task Name : sql_tuning_task_test1
Scope : COMPREHENSIVE
Time Limit(seconds): 60
Completion Status : COMPLETED
Started at : 06/22/2006 15:33:13
Completed at : 06/22/2006 15:33:14

——————————————————————————-
SQL ID : ad1489724nqpn
SQL Text: SELECT * from TAB1 where VALUE=18902;

——————————————————————————-
FINDINGS SECTION (1 finding)
——————————————————————————-

1- Statistics Finding
———————
Table “USER1”.”TAB1″ was not analyzed.

Recommendation
————–
Consider collecting optimizer statistics for this table.
execute dbms_stats.gather_table_stats(ownname => ‘USER1′, tabname =>’TAB1’, estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE);

Based on this information, you can decide what actions are necessary to tune the SQL.

11g AUTOMATIC SQL TUNING

As part of Automatic SQL Tuning, Oracle 11g automatically runs the SQL Tuning Advisor against high impact SQL statements during maintenance windows. This process involves the following steps:
AWR statistics are used to compile an ordered list of the SQL statements with the greatest performance impact on the system, where the impact is the sum of the CPU and I/O times for the statement during the past week. The list excludes statements that are inherently less tunable, such as recently (within a month) tuned recursive statements, parallel queries, DML, DDL and SQL statements whose performance problems are caused by concurrency issues.
The SQL tuning advisor is run against each statement in turn. The outcome may include both SQL profiles and other recommendations.
Suggested SQL profiles are performance tested, and those that result in at least a threefold improvement are accepted if the ACCEPT_SQL_PROFILES parameter is set to TRUE, or reported if it is set to FALSE.
The accepted SQL profiles are optionally implemented . Several factors many prevent SQL profiles from being implemented automatically, including stale optimizer statistics of dependent objects. The TYPE column of the DBA_SQL_PROFILES view indicates if SQL profiles are created manually (MANUAL) or automatically (AUTO-TUNE).
The ENABLE and DISABLE procedures of the DBMS_AUTO_TASK_ADMIN package control whether automatic SQL tuning is included in the automated maintenance tasks.
— Enable
BEGIN
  DBMS_AUTO_TASK_ADMIN.enable(
    client_name => ‘sql tuning advisor’,
    operation   => NULL,
    window_name => NULL);
END;
/

— Disable
BEGIN
  DBMS_AUTO_TASK_ADMIN.disable(
    client_name => ‘sql tuning advisor’,
    operation   => NULL,
    window_name => NULL);
END;
/
It is also indirectly disabled by setting the STATISTICS_LEVEL parameter to BASIC, as this stops automatic statistics gathering by the AWR.

The SET_TUNING_TASK_PARAMETER procedure of the DBMS_SQLTUNE package controls the behavior of the SQL tuning advisor. The parameters specifically for the automatic runs include:
ACCEPT_SQL_PROFILES – Automatically accept SQL profiles (default FALSE).
MAX_SQL_PROFILES_PER_EXEC – The maximum number of SQL profiles automatically implemented per run (default 20).
MAX_AUTO_SQL_PROFILES – The maximum number of automatic SQL profiles allowed on the system (default 10000).
The current parameter values are displayed using the %_ADVISOR_PARAMETERS views.
COLUMN parameter_value FORMAT A30

SELECT parameter_name, parameter_value
FROM   dba_advisor_parameters
WHERE  task_name = ‘SYS_AUTO_SQL_TUNING_TASK’
AND    parameter_name IN (‘ACCEPT_SQL_PROFILES’,
                          ‘MAX_SQL_PROFILES_PER_EXEC’,
                          ‘MAX_AUTO_SQL_PROFILES’);

PARAMETER_NAME                 PARAMETER_VALUE
—————————— ——————————
ACCEPT_SQL_PROFILES            FALSE
MAX_SQL_PROFILES_PER_EXEC      20
MAX_AUTO_SQL_PROFILES          10000

3 rows selected.

SQL>
The following code shows how the SET_TUNING_TASK_PARAMETER procedure is used to turn on acceptance of automatically generated SQL profiles.
BEGIN
  DBMS_SQLTUNE.set_tuning_task_parameter(
    task_name => ‘SYS_AUTO_SQL_TUNING_TASK’,
    parameter => ‘ACCEPT_SQL_PROFILES’,
    value     => ‘TRUE’);
END;
/
The REPORT_AUTO_TUNING_TASK function of the DBMS_SQLTUNE package returns a CLOB containing a report from the specified automatic tuning task. Setting the BEGIN_EXEC and END_EXEC parameters to NULL produces a report from the most recent execution.
VARIABLE l_report CLOB;
BEGIN
  :l_report := DBMS_SQLTUNE.report_auto_tuning_task(
    begin_exec   => NULL,
    end_exec     => NULL,
    type         => DBMS_SQLTUNE.type_text,     — ‘TEXT’
    level        => DBMS_SQLTUNE.level_typical, — ‘TYPICAL’
    section      => DBMS_SQLTUNE.section_all,   — ‘ALL’
    object_id    => NULL,
    result_limit => NULL);
END;
/

SET LONG 1000000
PRINT :l_report

L_REPORT
——————————————————————————–
GENERAL INFORMATION SECTION
——————————————————————————-
Tuning Task Name                        : SYS_AUTO_SQL_TUNING_TASK
Tuning Task Owner                       : SYS
Workload Type                           : Automatic High-Load SQL Workload
Execution Count                         : 31
Current Execution                       : EXEC_1_25
Execution Type                          : TUNE SQL
Scope                                   : COMPREHENSIVE
Global Time Limit(seconds)              : 3600
Per-SQL Time Limit(seconds)             : 1200
Completion Status                       : COMPLETED
Started at                              : 01/16/2008 22:00:06
Completed at                            : 01/16/2008 22:00:46
Number of Candidate SQLs                : 0
Cumulative Elapsed Time of SQL (s)      : 0

——————————————————————————-

11g RAC Performance Analysis

Monitoring Performance by Analyzing GCS and GES Statistics


In order to determine the amount of work and cost related to inter-instance messaging and contention, examine block transfer rates, remote requests made by each transaction, the number and time waited for global cache events as described under the following headings:


Analyzing Cache Fusion Impact in Real Application Clusters
Analyzing Performance Using GCS and GES Statistics
Analyzing Cache Fusion Impact in Real Application Clusters
The effect of accessing blocks in the global cache and maintaining coherency is represented by
The Global Cache Service statistics for current and cr blocks, for example, gc current blocks received, gc cr blocks received, and so on)
The Global Cache Service wait events, for gc current block 3-way, gc cr grant 2-way, and so on.
The response time for cache fusion transfers is determined by the messaging and processing times imposed by the physical interconnect components, the IPC protocol and the GCS protocol. It is not affected by disk I/O factors other than occasional log writes. The cache fusion protocol does not require I/O to data files in order to guarantee cache coherency and RAC inherently does not cause any more I/O to disk than a non-clustered instance.


Analyzing Performance Using GCS and GES Statistics


Monitor Global Cache Service performance by identifying data blocks and objects which are frequently used (“hot”) by all instances. High concurrency on certain blocks may be identified by Global Cache Service wait events and times.


The following wait events indicate that the access to cached data blocks was held up because they were busy either in the remote or the local cache, respectively:
gc current block busy
gc current block 2-way busy
gc current block 3-way busy
gc cr block 2-way busy
gc cr block 3-way busy
This means that the blocks were pinned or held up by sessions or delayed by a log write on a remote instance (for example, gc current, cr 2-way busy, or cr 3-way busy), or that a session on the same instance is already accessing a block which is in transition between instances and the current session needs to wait behind it (for example, gc current block busy).
The V$SESSION_WAIT view to identify objects and data blocks with contention. The gc wait events contain the file and block number for a block request in p1 and p2, respectively.
An additional segment statistic, gc buffer busy, has been added to quickly determine the “busy” objects without recourse to the query on V$SESSION_WAIT mentioned earlier.
The AWR infrastructure provides a view of active session history which can also be used to trace recent wait events and their arguments. It is therefore useful for hot block analysis.
Most of the reporting facilities used by AWR and Statspack contain the object statistics and cluster wait class category, so that sampling of the views mentioned earlier is largely unnecessary.
It is advisable to run ADDM on the snapshot data collected by the AWR infrastructure to obtain an overall evaluation of the impact of the global cache. The advisory will also identify the busy objects and SQL highest cluster wait time.


Analyzing Cache Fusion Transfer Impact Using GCS Statistics


Monitor Global Cache Service performance by identifying objects read and modified frequently and the service times imposed by the remote access. Waiting for blocks to arrive may constitute a significant portion of the response time, in the same way that reading from disk could increase the block access delays, only that cache fusion transfers in most cases are faster than disk access latencies.
The following wait events indicate that the remotely cached blocks were shipped to the local instance without having been busy, pinned or requiring a log flush:
gc current block 2-way
gc current block 3-way
gc cr block 2-way
gc cr block 3-way
The object statistics for gc current blocks received and gc cr blocks received enable quick identification of the indexes and tables which are shared by the active instances. As mentioned earlier, creating an ADDM analysis will, in most cases, point you to the SQL statements and database objects that could be impacted by inter-instance contention.
Note:
You must run Statspack at level 7 to collect statistics related to block contention and segment block waits.
Any increases in the average wait times for the events mentioned earlier could be caused by the following:
High load: CPU shortages, long run queues, scheduling delays
Misconfiguration: using public instead of private interconnect for message and block traffic
If the average wait times are acceptable and no interconnect or load issues can be diagnosed, then the accumulated time waited can usually be attributed to a few SQL statements which need to be tuned to minimize the number of blocks accessed.
The column CLUSTER_WAIT_TIME in V$SQLAREA represents the wait time incurred by individual SQL statements for global cache events and will identify the SQL which may need to be tuned.


Analyzing Response Times Based on Wait Events


Most global cache wait events that show a high total time as reported in the AWR and Statspack reports or in the dynamic performance views are normal and may present themselves as the top database time consumers without actually indicating a problem. This section describes the most important and frequent wait events that you should be aware of when interpreting performance data.
If user response times increases and a high proportion of time waited is for global cache (gc), then the cause should be determined. Most reports include a breakdown of events sorted by percentage of the total time.
It is useful to start with an ADDM report, which would analyze the routinely collected performance statistics with respect to their impact and point to the objects and SQL contributing most to the time waited, and then move on to the more detailed reports produced by AWR and Statspack.
The most important wait events for RAC include various categories, such as:
Block-oriented
gc current block 2-way
gc current block 3-way
gc cr block 2-way
gc cr block 3-way
Message-oriented
gc current grant 2-way
gc cr grant 2-way
Contention-oriented
gc current block busy
gc cr block busy
gc current buffer busy
Load-oriented
gc current block congested
gc cr block congested
The block-oriented wait event statistics indicate that a block was received as either the result of a 2-way or a 3-way message, that is, the block was sent from either the resource master requiring 1 message and 1 transfer, or was forwarded to a third node from which it was sent, requiring 2 messages and 1 block transfer.


These events are usually the most frequent in the absence of block contention and the length of the wait is determined by the time it takes on the physical network, the time to process the request in the serving instances and the time it takes for the requesting process to wake up after the block arrives.
The average wait time and the total wait time should be considered when being alerted to performance issues where these particular waits have a high impact. Usually, either interconnect or load issues or SQL execution against a large shared working set can be found to be the root cause.
The message-oriented wait event statistics indicate that no block was received because it was not cached in any instance. Instead a global grant was given, allowing the requesting instance to read the block from disk or modify it.


If the time consumed by these events is high, then it may be assumed that the frequently executed SQL causes a lot of disk I/O (in the event of the cr grant) or that the workload inserts a lot of data and needs to find and format new blocks frequently (in the event of the current grant).
The contention-oriented wait event statistics indicate that a block was received which was pinned by a session on another node, was deferred because a change had not yet been flushed to disk or because of high concurrency, and therefore could not be shipped immediately. A buffer may also be busy locally when a session has already initiated a cache fusion operation and is waiting for its completion when another session on the same node is trying to read or modify the same data. High service times for blocks exchanged in the global cache may exacerbate the contention, which can be caused by frequent concurrent read and write accesses to the same data.
The load-oriented wait events indicate that a delay in processing has occurred in the GCS, which is usually caused by high load, CPU saturation and would have to be solved by additional CPUs, load-balancing, offloading processing to different times or a new cluster node.For the events mentioned, the wait time encompasses the entire round trip from the time a session starts to wait after initiating a block request until the block arrives.