What causes Node eviction in Oracle RAC?

What is node eviction?

Node eviction in Oracle RAC occurs when a missed heartbeat indicates that a node is not responding. The evicted node is then restarted so that it can rejoin the cluster.


Causes for RAC node eviction:


Node eviction in an Oracle RAC environment can be caused by any of the following:

– A failure of any of the major hardware components (CPU, RAM, network interconnect).

– A server that is experiencing RAM swapping.

– Interrupted communication with the voting disk, which causes the disconnected node to be evicted and rebooted.

– Database or ASM hang condition.


The following log files are the most important ones to review after a node eviction:

– Clusterware alert log

– Database alert log

– CSSD agent logs

– CSSD monitor logs

– System Message logs (/var/log/messages)

What is Flashback feature in Oracle Database?

Oracle Flashback Database and restore points are related data protection features that enable you to rewind data back in time to correct any problems caused by logical data corruption or user errors within a designated time window.

Points to Remember

– The FLASHBACK DATABASE command can be run either from SQL*Plus or from RMAN.

– The FLASHBACK DATABASE command can rewind the database to a target time, SCN, or log sequence number.

– Flashback command works by undoing the changes made to the data files that exist when you run the command.

– Flashback can fix only logical failures, not physical failures.

– If the database control file is restored from backup or re-created, then all existing flashback log information is discarded.

– Avoid using FLASHBACK DATABASE with a target time or SCN that coincides with a NOLOGGING operation, because it can cause block corruption.




How to enable Flashback Database feature in Oracle?

 Prerequisites for Flashback Database and Guaranteed Restore Points

Flashback Database

Configure the following database settings before enabling Flashback Database:

– Your database must be running in ARCHIVELOG mode.

– You must have a fast recovery area enabled.

– For Oracle Real Application Clusters (Oracle RAC) databases, the fast recovery area must be in a clustered file system or in ASM.

Guaranteed Restore Points

To use guaranteed restore points, the COMPATIBLE initialization parameter must be set to 10.2.0 or greater.


Steps to enable Flashback Database:

1. Connect to SQL*Plus as SYSDBA and set the desired Flashback retention target using the command below.


SQL>ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET=4320;

Here, the Flashback retention target is set to a window of 3 days (4320 minutes). The default value is 1 day (1440 minutes).
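
The retention target is expressed in minutes, so a window given in days has to be converted first. The following is a minimal Python sketch of that arithmetic. The helper names `retention_minutes` and `alter_statement` are illustrative only and are not part of any Oracle tooling.

```python
MINUTES_PER_DAY = 24 * 60  # 1440, which is also the default retention target


def retention_minutes(days):
    """Convert a retention window given in days to minutes."""
    return days * MINUTES_PER_DAY


def alter_statement(days):
    """Build the ALTER SYSTEM statement for a retention window in days."""
    return (
        "ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET="
        f"{retention_minutes(days)};"
    )
```

For example, `alter_statement(3)` produces the same statement as the one used in step 1.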


2. Enable the Flashback Database feature for the whole database using the following command:

SQL>ALTER DATABASE FLASHBACK ON;


3. Use the following command to check if Flashback Database is enabled for your target database:

SQL>SELECT FLASHBACK_ON FROM V$DATABASE;



Oracle EBS 12.2.X Login fails with “Unable to create anonymous session. ICX_SESSION_CREATION_FAILED”

Issue:

In a 12.2.6 EBS environment, login fails with the following error:

Unable to create anonymous session. ICX_SESSION_CREATION_FAILED (userid=6) exception oracle.apps.fnd.common.PoolException: Exception creating new Poolable object. Encountered a java exception with the message Exception creating new Poolable object. Cause:java.lang.RuntimeException: java.lang.RuntimeException: java.sql.SQLException: java.lang.reflect.InvocationTargetException.

Analysis:

The error indicates that there are not enough resources to create a new JDBC session. oacore_server.log contains the following messages:



####<Aug 08, 2017 3:43:05 PM EST> <Info> <Common> <test.domain.com> <oacore_server1> <[ACTIVE] ExecuteThread: ‘5’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <<anonymous>> <> <005dy^V9G5e9xW6RJMU4TB0000xv001lSB> <1591994585553> <BEA-000627> <Reached maximum capacity of pool “EBSDataSource”, making “0” new resource instances instead of “1”.>
####<Aug 08, 2017 3:43:07 PM EST> <Info> <Common> <test.domain.com> <oacore_server1> <pool-2-thread-1> <<anonymous>> <> <*******************> <887561> <BEA-000627> <Reached maximum capacity of pool “EBSDataSource”, making “0” new resource instances instead of “1”.>
####<Aug 08, 2017 3:43:15 PM EST> <Info> <Common> <test.domain.com> <oacore_server1> <[ACTIVE] ExecuteThread: ‘5’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <<anonymous>> <> <005dy^V9G5e9xW6RJMU4TB0000xv001lSB> <1591994595562> <BEA-000627> <Reached maximum capacity of pool “EBSDataSource”, making “0” new resource instances instead of “1”.>
####<Aug 08, 2017 3:43:15 PM EST> <Info> <Common> <test.domain.com> <oacore_server1> <pool-2-thread-1> <<anonymous>> <> <*******************> <955123> <BEA-000627> <Reached maximum capacity of pool “EBSDataSource”, making “0” new resource instances instead of “1”.>

The connection pool of the data source “EBSDataSource” has reached its maximum capacity, so no new connections can be created.
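
To see how often the pool limit is being hit, the BEA-000627 messages can be counted per pool. The sketch below is a rough Python example. It assumes the message text has the same shape as the log lines above, and the function name `count_pool_exhaustion` is hypothetical.

```python
import re
from collections import Counter

# Matches the BEA-000627 message and captures the pool name.
# Both straight and typographic quotes are accepted around the name.
POOL_EXHAUSTED = re.compile(
    r'<BEA-000627> <Reached maximum capacity of pool ["“]([^"”]+)["”]'
)


def count_pool_exhaustion(lines):
    """Return a Counter mapping pool name to number of BEA-000627 messages."""
    counts = Counter()
    for line in lines:
        match = POOL_EXHAUSTED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A file object can be passed directly, for example `count_pool_exhaustion(open(path_to_log))`, where `path_to_log` is the location of oacore_server.log in your environment.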

Solution:

1. Increase the maximum capacity of the EBSDataSource connection pool

Log in to the WebLogic console, navigate to Data Sources, and click EBSDataSource.

Click Lock & Edit, go to the Connection Pool tab, and increase the value of the Maximum Capacity field.

Click on Activate Changes.

This change does not require any restart of services.

(or)


Alternate Solution:

2. Bounce oacore_server to release any inactive sessions and free up the resources

cd $ADMIN_SCRIPTS_HOME

admanagedsrvctl.sh stop oacore_server1

admanagedsrvctl.sh start oacore_server1

Query to check installed modules in Oracle EBS


The query below lists the Installed/Not Installed/Shared product modules in an Oracle E-Business Suite (EBS) application.

set pages 20000;
col application_id for 9999;
col application_name for A50;
col status for A1;
col application_short_name for A10;
select fa.application_id,
       fa.application_short_name,
       fpi.status,
       fatl.application_name
from   fnd_product_installations fpi,
       fnd_application fa,
       fnd_application_tl fatl
where  fa.application_id = fpi.application_id
and    fa.application_id = fatl.application_id
and    fatl.language = 'US'
order by fa.application_short_name;

Check the values in the Status column of the output, which are usually I, N, or S.

I – Installed
S – Shared
N – Not Licensed

Query to check AD and TXK Code level in an Oracle E-Business Suite environment


While preparing a patch analysis for EBS environments, we sometimes find in the README file that AD/TXK must be at a certain code level before the patch can be applied. We can use the query below to check the AD and TXK code level.

select  ABBREVIATION, NAME, codelevel FROM AD_TRACKABLE_ENTITIES where abbreviation in ('txk','ad');


Sample Output

$sqlplus apps

SQL>  select  ABBREVIATION, NAME, codelevel FROM AD_TRACKABLE_ENTITIES where abbreviation in ('txk','ad');

ABBREVIATION                   NAME                                     CODELEVEL
------------------------------ ---------------------------------------- ----------
ad                             Applications DBA                         C.11
txk                            Oracle Applications Technology Stack     C.11
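
When comparing the reported code level against the minimum stated in a patch README, a plain string comparison is misleading because "C.11" sorts before "C.9" as text. The following Python sketch compares the numeric part instead. It assumes code levels always have the letter-dot-number shape seen in the sample output, and the function names are illustrative.

```python
import re

# Assumed shape: one uppercase letter, a dot, then a number (e.g. "C.11").
CODELEVEL = re.compile(r"^([A-Z])\.(\d+)$")


def parse_codelevel(value):
    """Split a code level such as 'C.11' into ('C', 11)."""
    match = CODELEVEL.match(value.strip())
    if match is None:
        raise ValueError(f"unexpected code level format: {value!r}")
    return match.group(1), int(match.group(2))


def meets_minimum(current, required):
    """Return True if `current` is at or above `required`."""
    # Tuples compare the letter first, then the number as an integer.
    return parse_codelevel(current) >= parse_codelevel(required)
```

With this, `meets_minimum("C.11", "C.9")` is true, whereas comparing the raw strings would suggest the opposite.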

Connecting to a Database from SQLDeveloper fails with IO Error: The Network Adapter Could Not Establish The Connection

Issue: New Users were trying to access a 3-node RAC Oracle Database system using SQLDeveloper


Below is the connection string provided to the users for the Oracle Database connection. The RAC virtual IP (VIP) addresses are used to connect to the database.




TESTDB01=
       (DESCRIPTION=
               (ADDRESS=(PROTOCOL=tcp)(HOST=test001-vip1.domain.com)(PORT=1521))
               (ADDRESS=(PROTOCOL=tcp)(HOST=test002-vip1.domain.com)(PORT=1521))
               (ADDRESS=(PROTOCOL=tcp)(HOST=test003-vip1.domain.com)(PORT=1521))
           (CONNECT_DATA=
               (SERVICE_NAME=TESTDB01)
               (INSTANCE_NAME=TESTDB01)
           )
       )




The user tried to connect using one of the VIPs and got the following error in SQLDeveloper:
Status : Failure -Test failed: IO Error: The Network Adapter Could Not Establish The Connection.


Analysis:


Verified that the Oracle Database services and listener services are running fine.
Other users were able to connect from SQLDeveloper using the same connection string.


Solution:


The user does not have network access to port 1521 on the Oracle RAC virtual IPs. This can be confirmed from the user's machine with ping and telnet:


ping  test001-vip1.domain.com
ping  test002-vip1.domain.com
ping  test003-vip1.domain.com


telnet test001-vip1.domain.com 1521
telnet test002-vip1.domain.com 1521
telnet test003-vip1.domain.com 1521


Ping to the VIP addresses works fine, but telnet to port 1521 on the VIP addresses fails with a “Connect Failed” error.


Once network access to port 1521 is opened on all the Oracle RAC virtual IP addresses, the user can connect to the Oracle Database using SQLDeveloper.
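
Where telnet is not installed on the client machine, the same port check can be done with a few lines of Python. This is a minimal sketch using only the standard library. The function names `can_connect` and `check_hosts` are illustrative, and the 5-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_hosts(hosts, port=1521):
    """Map each host name to the result of can_connect() on the given port."""
    return {host: can_connect(host, port) for host in hosts}
```

Calling `check_hosts` with the three VIP host names from the connection string returns a dictionary showing which listeners are reachable from the client. A host that answers ping but maps to False here points to a blocked port rather than a host that is down.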

Troubleshooting concurrent requests stuck at Post Processing phase

Sometimes, even though the database sessions related to the concurrent programs are in INACTIVE state, a concurrent request cannot be terminated and fails with the error “Could not Lock Request”. The likely cause is that the requests are stuck in the post-processing phase and the Output Post Processor is holding a lock on them.


Follow the steps below to perform a clean shutdown of the Oracle Concurrent Manager.


a) Put pending concurrent requests on hold using the SQL statements below.


SQL> create table apps.conc_req_on_hold as select * from fnd_concurrent_requests where phase_code='P' and hold_flag='N';
SQL> select count(*) from apps.conc_req_on_hold;
SQL> update fnd_concurrent_requests set hold_flag='Y' where phase_code='P' and hold_flag='N' and request_id in (select request_id from apps.conc_req_on_hold);


NOTE: Commit only if the number of rows updated matches the count returned by the select. Otherwise, roll back and try again until the numbers match.


SQL> commit;
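
The commit-only-if-the-counts-match rule can be scripted instead of checked by eye. The sketch below illustrates the pattern in Python, using an in-memory SQLite database as a stand-in for Oracle so that it runs anywhere. The three-column table is a simplified assumption and not the real fnd_concurrent_requests definition, and the function names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in table with two pending requests."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fnd_concurrent_requests "
        "(request_id INTEGER, phase_code TEXT, hold_flag TEXT)"
    )
    conn.executemany(
        "INSERT INTO fnd_concurrent_requests VALUES (?, ?, ?)",
        [(1, "P", "N"), (2, "P", "N"), (3, "R", "N")],
    )
    conn.commit()
    return conn


def hold_pending_requests(conn):
    """Put pending requests on hold; commit only if the counts agree."""
    cur = conn.cursor()
    # Snapshot the pending requests that are not yet on hold.
    cur.execute(
        "CREATE TABLE conc_req_on_hold AS "
        "SELECT * FROM fnd_concurrent_requests "
        "WHERE phase_code = 'P' AND hold_flag = 'N'"
    )
    conn.commit()  # Oracle commits DDL implicitly; done explicitly here.
    cur.execute("SELECT COUNT(*) FROM conc_req_on_hold")
    snapshot_count = cur.fetchone()[0]
    cur.execute(
        "UPDATE fnd_concurrent_requests SET hold_flag = 'Y' "
        "WHERE phase_code = 'P' AND hold_flag = 'N' "
        "AND request_id IN (SELECT request_id FROM conc_req_on_hold)"
    )
    if cur.rowcount == snapshot_count:
        conn.commit()
        return True
    # New pending requests changed state in between: undo the update.
    conn.rollback()
    return False
```

Against a real EBS database the same logic would go through an Oracle driver instead of `sqlite3`. The comparison of the update's row count with the snapshot count is the part that carries over.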


More details about putting pending concurrent requests on hold are available here:
http://www.appsdbadiaries.com/2016/01/concurrent-requests-on-hold.html








b) Bring down the Concurrent Manager using adcmctl.sh stop


c) Update the status of the stuck concurrent requests to “Terminated”


SQL> update fnd_concurrent_requests set status_code='X', phase_code='C' where status_code='R' and phase_code='R';


SQL> commit;




Use the Concurrent Manager Recovery Wizard in OAM to clear the database sessions associated with the cancelled requests.


d) Start the Concurrent Manager using adcmctl.sh start


e) Remove the hold on the concurrent requests


SQL> update fnd_concurrent_requests set hold_flag='N' where request_id in (select request_id from apps.conc_req_on_hold);


Commit the changes


SQL> commit;