Saturday, December 7, 2013

How to Recover and open the database if the archive log required for recovery is missing?

Recovery scenario: we had to recover one of our development databases from an old backup.
As part of the recovery process, the restore went fine and we were able to re-create the control file. During recovery, Oracle asked for archive logs. We checked with our UNIX team and found that they no longer had the required archive logs.

Error:

SQL> recover database until cancel using backup controlfile;
ORA-00279: change 9867098396261 generated at 03/21/2008 13:37:44 needed for
thread 1
ORA-00289: suggestion : /arcredo/XSCLFY/log1_648355446_2093.arc
ORA-00280: change 9867098396261 for thread 1 is in sequence #2093

Specify log: {=suggested | filename | AUTO | CANCEL}
cancel

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01195: online backup of file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u100/oradata/XSCLFY/SYSTEM01_SCLFY.dbf'
ORA-01112: media recovery not started

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01195: online backup of file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u100/oradata/XSCLFY/SYSTEM01_SCLFY.dbf'

After some research, I found that a hidden parameter (_ALLOW_RESETLOGS_CORRUPTION=TRUE) allows the database to be opened even though it has not been properly recovered.

We forced the database open by setting _ALLOW_RESETLOGS_CORRUPTION=TRUE. The database opened, but the instance crashed immediately afterwards. I checked the alert.log file and found that the undo tablespace was corrupted.

The alert log shows the following errors:
Errors in file /u01/XSCLFYDB/admin/XSCLFY/udump/xsclfy_ora_9225.trc:
ORA-00600: internal error code, arguments: [4194], [17], [9], [], [], [], [], []
Tue Mar 25 12:45:55 2008
Errors in file /u01/XSCLFYDB/admin/XSCLFY/bdump/xsclfy_smon_24975.trc:
ORA-00600: internal error code, arguments: [4193], [53085], [50433], [], [], [], [], []
Doing block recovery for file 433 block 13525
Block recovery from logseq 2, block 31 to scn 9867098416340

To resolve the undo corruption, I changed undo_management to MANUAL in init.ora. The database then opened successfully. Once it was up and running, I created a new undo tablespace and dropped the old, corrupted one. I then changed undo_management back to AUTO and set undo_tablespace to the new tablespace ("NewUndoTablespace").

This resolved our issue, and the database was up and running without any problems.
Note that _ALLOW_RESETLOGS_CORRUPTION=TRUE opens the database without consistency checks. This may result in a corrupted database, so the database should be recreated.

As per Oracle Metalink, there is no guarantee that setting _ALLOW_RESETLOGS_CORRUPTION=TRUE will open the database. If the database does open, we must rebuild it immediately.

A database rebuild means doing the following:
(1) perform a full database export,
(2) create a brand new and separate database, and finally
(3) import the recent export dump.

This option can be tedious and time consuming, but once the new database opens successfully, we expect minimal or perhaps no data loss. Before you try this option, ensure that you have a good and valid backup of the current database.

Solution:
1) Set _ALLOW_RESETLOGS_CORRUPTION=TRUE in the init.ora file.
2) Startup mount.
3) Recover the database.
4) Alter database open resetlogs.
5) Set undo_management to MANUAL in the init.ora file.
6) Start up the database.
7) Create a new undo tablespace and drop the old, corrupted one.
8) Change undo_management to AUTO and undo_tablespace to the new undo tablespace.
9) Bounce the database.
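
The undo-related steps can be sketched as a small shell helper. This is only an illustration: it prints the SQL for review and does not connect to the database. The function name, the tablespace names, the datafile name and the size are placeholders, not values from the incident described above.

```shell
#!/bin/sh
# Print (not run) the SQL for replacing a corrupted undo tablespace.
# All arguments are illustrative placeholders chosen by the caller.
print_undo_rebuild_sql() {
    new_ts=$1      # name for the new undo tablespace
    old_ts=$2      # name of the corrupted undo tablespace
    datafile=$3    # full path for the new undo datafile
    size=$4        # datafile size, e.g. 500M
    cat <<EOF
CREATE UNDO TABLESPACE ${new_ts} DATAFILE '${datafile}' SIZE ${size};
DROP TABLESPACE ${old_ts} INCLUDING CONTENTS AND DATAFILES;
-- Then edit init.ora and restart the instance:
--   undo_management=AUTO
--   undo_tablespace=${new_ts}
EOF
}

# Example with made-up names; review the output before running it in SQL*Plus.
print_undo_rebuild_sql UNDOTBS2 UNDOTBS1 /u100/oradata/XSCLFY/undotbs2_01.dbf 500M
```

Printing the statements instead of executing them keeps the sketch safe to run anywhere and leaves the final decision to the DBA.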

Sunday, October 13, 2013

ORA-01503: CREATE CONTROLFILE failed ORA-12720: operation requires database is in EXCLUSIVE mode

PROBLEM:


Creating a controlfile for a RAC database from a controlfile trace script throws the following error:

SQL> @create_control.sql

CREATE CONTROLFILE REUSE SET DATABASE "DBNAME" RESETLOGS NOARCHIVELOG

ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode




SOLUTION:


Change the parameter cluster_database=TRUE to FALSE in the parameter file

or 

SQL> alter system set cluster_database=false scope=spfile; 


System altered.


SQL> shutdown immediate;

SQL> startup nomount

SQL> @create_control.sql

Control file created.


SQL> shutdown immediate;

Once the controlfile is created and the database has been opened, revert the parameter to cluster_database=TRUE in the parameter file and restart.
 
 
SQL>startup
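
Before running the create-controlfile script, it can help to confirm what the text parameter file actually says. The helper below is a minimal sketch that assumes a plain-text pfile with name=value lines (optionally prefixed with "*."); the function name is hypothetical and it does not read an spfile.

```shell
#!/bin/sh
# Report the cluster_database setting found in a text parameter file (pfile).
# Prints TRUE, FALSE, or UNSET. Comments starting with # are ignored, and
# cluster_database_instances is deliberately not matched.
cluster_database_setting() {
    pfile=$1
    value=$(sed 's/#.*//' "$pfile" \
            | grep -i '^[[:space:]]*\(\*\.\)\{0,1\}cluster_database[[:space:]]*=' \
            | tail -n 1 | cut -d= -f2 \
            | tr -d "[:space:]'\"" | tr '[:lower:]' '[:upper:]')
    echo "${value:-UNSET}"
}

# Usage (path is a placeholder): cluster_database_setting /path/to/initDBNAME.ora
```

If the function prints TRUE, CREATE CONTROLFILE can be expected to fail with ORA-12720 until the parameter is changed and the instance restarted.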


===============================================================================
- Reference

Recreating my controlfile [ID 735106.1]

Monday, November 19, 2012

Oracle Backup Types

A database backup is a copy of data that can be used to reconstruct the data.
Oracle backup types can be divided into the following categories.

1.1. Physical and Logical Backups
  • Physical backups.       
    Backup of physical files such as data-files, control-files and archive-log files
  • Logical backups.         
    Backup of logical data such as tables or stored procedures exported from DB

1.2. Consistent and Inconsistent Backups
A physical backup can be classified by being a consistent or an inconsistent backup.
Consistent backup
  • All datafiles have the same SCN; in other words, all changes in the redo logs have been applied to the datafiles.
  • It is rare that an open database backup can be considered consistent.
  • Consistent backups are taken when the database is shut down normally or in a MOUNT state.
Inconsistent backup
  • Is performed while the database is open and users are accessing the database.
  • The SCNs of the datafiles typically do not match when an inconsistent backup is taken. A recovery operation that uses an inconsistent backup must therefore rely on both archived and online redo log files to bring the database into a consistent state before it is opened.
  • Database must be in ARCHIVELOG mode to use an inconsistent backup method.

1.3. Full and Incremental Backups
Full backups
  • Include all blocks of every datafile within a tablespace or a database;
  • It is essentially a bit-for-bit copy of one or more datafiles in the database.
  • Can be created with RMAN or with OS-level file copy commands
Incremental backups
A backup of the block-level changes made to the database since a previous incremental or full backup. An incremental backup can be level 0 or level 1.
  • Level 0 incremental backup - backs up all blocks in the database. Its content is equivalent to a full backup.
  • Level 1 incremental backup - backs up the database blocks changed since a previous incremental backup. If there is no level 0 incremental backup when you run a level 1 incremental backup, RMAN automatically makes a level 0 incremental backup. A level 1 backup is either cumulative or differential:
    1. Cumulative incremental backup - a level 1 incremental backup that includes all blocks changed since the most recent level 0 incremental backup. Cumulative backups may take more time and space than non-cumulative ones, but only the most recent cumulative backup is needed on top of the level 0 backup for recovery.
    2. Differential incremental backup - a level 1 incremental backup that includes only the blocks changed since the most recent incremental backup. By default, incremental backups are differential.
  • A distinct advantage to using an incremental backup in a recovery strategy is that archived and online redo log files may not be necessary to restore a database or tablespace to a consistent state
  • The incremental backups may have some or all of the blocks needed.
  • Incremental backups can only be performed within RMAN.
  • Incremental backups can only be performed on datafiles, and not on control files or archived redo logs.
  • If the INCREMENTAL keyword is not included, Recovery Manager performs a full backup by default.
  • During recovery, incremental backups are chosen by Recovery Manager and applied automatically.
  • In the case that archived redo logs were not successfully backed up (or were corrupted), incremental backups offer a way of making the datafile newer without needing this redo. Conversely, if the tape that an incremental backup resides on is corrupt, archived redo logs can be used to roll forward.
  • Incremental backups can be applied in parallel to multiple datafiles concurrently.
  • Much less redo is applied during recovery.
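
As an illustration of how level 0, cumulative and differential backups can be combined, the sketch below prints the RMAN command for a given weekday. The weekly schedule and the function name are assumptions made for this example only; the RMAN commands themselves are the standard incremental backup forms.

```shell
#!/bin/sh
# Print the RMAN command for a given day of the week (1=Monday ... 7=Sunday).
# Hypothetical schedule: level 0 on Sunday, cumulative level 1 on Wednesday,
# differential level 1 (the default kind) on the remaining days.
rman_backup_command() {
    case $1 in
        7) echo "BACKUP INCREMENTAL LEVEL 0 DATABASE;" ;;
        3) echo "BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;" ;;
        1|2|4|5|6) echo "BACKUP INCREMENTAL LEVEL 1 DATABASE;" ;;
        *) echo "usage: rman_backup_command <1-7>" >&2; return 1 ;;
    esac
}

# Show the command for today; date +%u prints the ISO weekday number.
rman_backup_command "$(date +%u)"
```

With this schedule, a recovery late in the week would need the Sunday level 0, the Wednesday cumulative backup, and the differential backups taken after Wednesday.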

1.4. Online (Hot) and Offline (cold) backups
·         Online Database Backup (HOT BACKUP)
An online backup, also known as an open or hot backup, is a backup in which the read-write datafiles and control files have not all been checkpointed with respect to the same SCN. For example, one read-write datafile header may contain an SCN of 100 while other read-write datafile headers contain an SCN of 95 or 90. Oracle cannot open the database until all of these header SCNs are consistent, that is, until all changes recorded in the online redo logs have been saved to the datafiles on disk. If the database must be up and running 24 hours a day, 7 days a week, you have no choice but to perform online backups of a whole database that is in ARCHIVELOG mode.

·         Offline Database Backup (COLD BACKUP)

A backup taken while the Oracle database is shut down, or mounted but not open.

In this backup, all datafiles and control files are consistent to the same point in time - consistent with respect to the same SCN, for example. The only tablespaces in a consistent backup that are allowed to have older SCNs are read-only and offline-normal tablespaces, which are consistent with the other datafiles in the backup. This type of backup allows the user to open the set of files created by the backup without applying redo logs, since the data is already consistent. The only way to perform this type of backup is to shut down the database cleanly and make the backup while the database is closed. A consistent whole database backup is the only valid backup option for databases running in NOARCHIVELOG mode.


1.5. Image Copies
  • Image copy is a bit-for-bit identical copy of a DB file
  • Image copies are full backups created by operating system commands or by RMAN BACKUP AS COPY commands.
  • Image copies are not the default backup format in RMAN; unless AS COPY is specified or configured, RMAN creates backup sets.
  • When image copies of the whole database are created with RMAN, all datafiles are automatically included in the backup.

1.6. Backupsets and Backup Pieces
  • Backup set is a collection of files called backup pieces, each of which may contain the backup of one or several database files
  • Backupsets can be created and restored only with RMAN.
  • Each backup piece belongs to only one backupset.
  • All backupsets and pieces are recorded in the RMAN repository.

1.7. Compressed Backups
  • Compression is available to reduce the amount of disk space or tape needed to store the backup.
  • Compressed backups are usable only by RMAN.
  • They need no special processing when used in a recovery operation; RMAN automatically decompresses the backup.
  • Creating a compressed backup is as easy as adding AS COMPRESSED BACKUPSET to the RMAN BACKUP command.

1.8. Whole Database Backup
  • The most common type of backup, a whole database backup contains the control file along with all database files that belong to a database. If operating in ARCHIVELOG mode, the DBA also has the option of backing up different parts of the database over a period of time, thereby constructing a whole database backup piece by piece.


1.9. Tablespace Backups
  • A tablespace backup is a subset of the database. Tablespace backups are only valid if the database is operating in ARCHIVELOG mode. The only time a tablespace backup is valid for a database running in NOARCHIVELOG mode is when that tablespace is read-only or offline-normal.


1.10. Datafile Backups
  • A datafile backup is a backup of a single datafile. Datafile backups are not as common as tablespace backups and are only valid if the database runs in ARCHIVELOG mode. The only time a datafile backup is valid for a database running in NOARCHIVELOG mode is if that datafile is the only file in a tablespace, that is, when the backup is effectively a tablespace backup of a single-file tablespace that is read-only or offline-normal.


1.11. Control File Backups
  • A control file backup is a backup of a database's control file. If a database is open, the user can create a valid backup by issuing the following SQL statement: ALTER DATABASE BACKUP CONTROLFILE to 'location'; or use Recovery Manager (RMAN).


1.12. Archived Redo Log Backups
  • Archived redo logs are the key to successful media recovery. Depending on the disk space available and the number of transactions executed on the database, keep as many days of archive logs on disk as possible and back them up regularly to ensure a more complete recovery.


1.13. Configuration Files Backups
  • Configuration files may consist of the spfile or init.ora, the password file, tnsnames.ora, and sqlnet.ora. Since these files do not change often, they require a less frequent backup schedule. A lost configuration file can be recreated manually, but when restore time is at a premium, it is faster to restore a backup of the file than to recreate a file with a specific format by hand.


Thursday, August 9, 2012

Oracle RAC Interview Questions and Answers


What is RAC?

RAC stands for Real Application Clusters. It is a clustering solution from Oracle Corporation that ensures high availability of databases by providing instance failover and media failover features.

How many nodes are supported in a RAC Database?

Oracle 10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database.

What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change their configuration if you add or remove nodes in the cluster.

Mention the Oracle RAC software components:-

·       A RAC environment is one that supports two or more database instances.
·       The instances are composed of memory structures and background processes.
·       Oracle RAC instances use two processes, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable cache fusion.
·       Oracle RAC instances are composed of following background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor

What is GRD?

GRD stands for Global Resource Directory. The GES and GCS maintain records of the status of each datafile and each cached block using global resource directory. This process is referred to as cache fusion and helps in data integrity.

Cache Fusion in Detail:-

Oracle RAC is composed of two or more instances. When an instance within the cluster has read a block of data from a datafile and another instance needs the same block, it is faster to get the block image from the instance that has the block in its SGA than to read it from disk. To enable inter-instance communication, Oracle RAC makes use of interconnects. The Global Enqueue Service (GES) monitor and the instance enqueue process manage cache fusion.

What are the Oracle database background processes specific to RAC?

•LMS—Global Cache Service Process

•LMD—Global Enqueue Service Daemon

•LMON—Global Enqueue Service Monitor

•LCK0—Instance Enqueue Process

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.

RAC Background Processes in Detail.

ACMS in Detail:-

ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures distributed SGA memory updates are either globally committed on success or globally aborted in the event of a failure.

GTX0-j in Detail:-

The process provides transparent support for XA global transactions in a RAC environment. The database auto tunes the number of these processes based on the workload of XA global transactions.

LMON in Detail:-

This process, called the Global Enqueue Service Monitor, monitors global enqueues and resources across the cluster and performs global enqueue recovery operations.

LMD in Detail:-

This process is called the Global Enqueue Service Daemon. It manages incoming remote resource requests within each instance.

LMS in Detail:-

This process is called the Global Cache Service process. It maintains the status of datafiles and each cached block by recording information in the Global Resource Directory (GRD). It also controls the flow of messages to remote instances, manages global data block access, and transmits block images between the buffer caches of different instances. This processing is part of the cache fusion feature.

LCK0 in Detail:-

This process is called the instance enqueue process. It manages non-cache-fusion resource requests such as library and row cache requests.

RMSn in Detail:-

These processes are called the Oracle RAC management processes. They perform manageability tasks for Oracle RAC, including the creation of resources related to Oracle RAC when new instances are added to the cluster.

RSMN in Detail:-

This process is called the Remote Slave Monitor. It manages background slave process creation and communication on remote instances. These background slave processes perform tasks on behalf of a coordinating process running in another instance.

What are the Oracle Clusterware processes for 10g on Unix and Linux?

Cluster Synchronization Services (ocssd) Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.

Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user.

Event manager daemon (evmd) —A background process that publishes events that crs creates.

Process Monitor Daemon (OPROCD) —This process monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.

RACG (racgmain, racgimon) —Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
 
What are the Oracle Clusterware components?

Voting Disk — Oracle RAC uses the voting disk to manage cluster membership by way of a health check and to arbitrate cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR) — Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

What components in RAC must reside in shared storage?

All datafiles, controlfiles, SPFILEs, and redo log files must reside on cluster-aware shared storage.

What is the significance of using cluster-aware shared storage in an Oracle RAC environment?

All instances of an Oracle RAC database can access all the datafiles, controlfiles, SPFILEs, and redo log files when these files are hosted on cluster-aware shared storage, which is a group of shared disks.

Give few examples for solutions that support cluster storage:-

·       ASM (automatic storage management),
·       raw disk devices, 
·       network file system (NFS),
·       OCFS2 and
·       OCFS (Oracle Cluster File System).

What is an interconnect network?

An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses a switch/multiple switches that only the nodes in the cluster can access.

How can we configure the cluster interconnect?

·       Configure User Datagram Protocol (UDP) on Gigabit Ethernet for cluster interconnects.
·       On UNIX and Linux systems, Oracle Clusterware uses the UDP and RDS (Reliable Datagram Sockets) protocols.
·       Windows clusters use the TCP protocol.

Can we use crossover cables with Oracle Clusterware interconnect?

No, crossover cables are not supported with Oracle Clusterware interconnects.

What is the use of cluster interconnect?

Cluster interconnect is used by the Cache fusion for inter instance communication.

What is the purpose of the private interconnect?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.

How do users connect to the database in an Oracle RAC environment?

Users can access a RAC database using a client/server configuration or through one or more middle tiers, with or without connection pooling. Users can use the Oracle services feature to connect to the database.

What is the use of a service in Oracle RAC environment?

Applications should use the services feature to connect to the Oracle database. Services enable us to define rules and characteristics to control how users and applications connect to database instances.

What are the characteristics controlled by Oracle services feature?

The characteristics include a unique name, workload balancing, failover options, and high availability.

Which enables the load balancing of applications in RAC?

Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.

What is a virtual IP address or VIP?

A virtual IP address or VIP is an alternate IP address that the client connections use instead of the standard public IP address. To configure VIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?

If a node fails, the node's VIP address fails over to another node, where the VIP address can accept TCP connections but cannot accept Oracle connections.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don't really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to some other node, and the new node re-ARPs the world, indicating a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

Give situations under which VIP address failover happens:-

VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.

What is the significance of VIP address failover?

When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid connection refused error. They don't have to wait for TCP connection timeout messages.

What are the administrative tools used for Oracle RAC environments?

An Oracle RAC cluster can be administered as a single image using the following tools:
·       OEM (Enterprise Manager),
·       SQL*PLUS,
·       Server control (SRVCTL),
·       Cluster Verification Utility (CLUVFY),
·       DBCA,
·       NETCA

How do we verify that RAC instances are running?

Issue the following query from any one node, connecting through SQL*Plus.
SQL>connect sys/sys as sysdba
SQL>select * from V$ACTIVE_INSTANCES;
The query returns the instance number in the INST_NUMBER column and the host and instance name in the INST_NAME column.

What is FAN?

Fast Application Notification (FAN) relates to events concerning instances, services and nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about configuration and service level information, including service status changes such as UP or DOWN events. Applications can respond to FAN events and take immediate action.

Where can we apply FAN UP and DOWN events?

FAN UP and FAN DOWN events can be applied to instances, services and nodes.

State the use of FAN events in case of a cluster configuration change?

During times of cluster configuration changes, Oracle RAC high availability framework publishes a FAN event immediately when a state change occurs in the cluster. So applications can receive FAN events and react immediately. This prevents applications from polling database and detecting a problem after such a state change.

Why should we have separate homes for ASM instance?

It is a good practice to have ASM home separate from the database home (ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database software independent of each other. Also, we can deinstall the Oracle database software independent of the ASM instance.

What is the advantage of using ASM?

ASM is the Oracle-recommended storage option for RAC databases because it maximizes performance by managing the storage configuration across the disks. ASM does this by distributing the database files across all of the available storage within the cluster database environment.

What is rolling upgrade?

It is a new ASM feature in Oracle Database 11g. ASM instances in Oracle Database 11g releases (from 11.1) can be upgraded or patched using the rolling upgrade feature. This enables us to patch or upgrade ASM nodes in a clustered environment without affecting database availability. During a rolling upgrade we can maintain a functional cluster while one or more of the nodes in the cluster are running different software versions.

Can rolling upgrade be used to upgrade from 10g to 11g database?

No, it can be used only for Oracle database 11g releases (from 11.1).

State the initialization parameters that must have same value for every instance in an Oracle RAC database:-

Some initialization parameters are critical at database creation time and must have the same values. Their values must be specified in the SPFILE or PFILE for every instance. The list of parameters that must be identical on every instance is given below:

·       ACTIVE_INSTANCE_COUNT
·       ARCHIVE_LAG_TARGET
·       COMPATIBLE
·       CLUSTER_DATABASE
·       CLUSTER_DATABASE_INSTANCES
·       CONTROL_FILES
·       DB_BLOCK_SIZE
·       DB_DOMAIN
·       DB_FILES
·       DB_NAME
·       DB_RECOVERY_FILE_DEST
·       DB_RECOVERY_FILE_DEST_SIZE
·       DB_UNIQUE_NAME
·       INSTANCE_TYPE (RDBMS or ASM)
·       PARALLEL_MAX_SERVERS
·       REMOTE_LOGIN_PASSWORD_FILE
·       UNDO_MANAGEMENT

Must DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?

These parameters must be identical on all instances only if their values are set to zero.

What two parameters must be set at the time of starting up an ASM instance in a RAC environment?

The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

Mention the components of Oracle Clusterware:-

Oracle Clusterware is made up of components like voting disk and Oracle Cluster Registry (OCR).

What is a CRS resource?

Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources are database, an instance, a service, a listener, a VIP address, an application process etc.

What is the use of OCR?

The OCR (Oracle Cluster Registry) stores the configuration information of CRS resources, which Oracle Clusterware uses to manage those resources.

How does an Oracle Clusterware manage CRS resources?

Oracle Clusterware manages CRS resources based on the configuration information of CRS resources stored in OCR (Oracle Cluster Registry).

Name some Oracle Clusterware tools and their uses?

·       OIFCFG - allocating and deallocating network interfaces.
·       OCRCONFIG - Command-line tool for managing Oracle Cluster Registry.
·       OCRDUMP - Dumps the contents of the OCR, which can be used to identify the interconnect being used.
·       CVU - Cluster verification utility to get status of CRS resources

What are the modes of deleting instances from Oracle Real Application cluster Databases?

We can delete instances using silent mode or interactive mode using DBCA (Database Configuration Assistant).

How do we remove ASM from an Oracle RAC environment?

We need to stop and delete the instance in the node first in interactive or silent mode. After that ASM can be removed using srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an instance?

Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat

How do we verify an existing current backup of OCR?

We can verify the current backup of OCR using the following command : ocrconfig -showbackup

What are the performance views in an Oracle RAC environment?

We have V$ views that are instance specific. In addition, we have GV$ views, called global views, that have an INST_ID column of numeric data type. GV$ views obtain information from the individual V$ views.

What are the types of connection load-balancing?

There are two types of connection load-balancing: server-side load balancing and client-side load balancing.

What is the difference between server-side and client-side connection load balancing?

Client-side load balancing happens on the client, which distributes its connection requests across the available listeners. With server-side load balancing, the listener uses the load-balancing advisory to redirect connections to the instance providing the best service.

Give the usage of srvctl:-
·       srvctl start instance -d db_name -i "inst_name_list"  [-o start_options]
·       srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
·       srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
·       srvctl start database -d name [-o start_options]
·       srvctl stop database -d name [-o stop_options]
·       srvctl start database -d orcl -o mount
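
To show how the instance list and the stop option fit together, here is a minimal sketch that assembles a srvctl stop instance command line from its parts. It only prints the command; the function name is hypothetical, and the database and instance names are the ones used in the examples above.

```shell
#!/bin/sh
# Build (and only print) a "srvctl stop instance" command line.
# Usage: srvctl_stop_instances <db_name> <stop_option> <instance>...
srvctl_stop_instances() {
    if [ "$#" -lt 3 ]; then
        echo "usage: srvctl_stop_instances <db_name> <stop_option> <instance>..." >&2
        return 1
    fi
    db=$1
    opt=$2
    shift 2
    # Join the remaining arguments with commas: orcl3 orcl4 -> orcl3,orcl4
    list=$(IFS=,; echo "$*")
    echo "srvctl stop instance -d $db -i \"$list\" -o $opt"
}

srvctl_stop_instances orcl immediate orcl3 orcl4
```

Printing the command first makes it easy to review the instance list before anything is stopped.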

How do you troubleshoot a node reboot?

Please check Metalink:

Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions.

How do you back up the OCR?

There is an automatic backup mechanism for the OCR. The default location is: $ORA_CRS_HOME/cdata/"clustername"/

To display backups:
#ocrconfig -showbackup
To restore a backup:
#ocrconfig -restore

With Oracle RAC 10g Release 2 or later, you can also use the export command:
#ocrconfig -export -s online
and use the -import option to restore the contents.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the command:
#ocrconfig -manualbackup

How do you back up the voting disk?

#dd if=voting_disk_name of=backup_file_name
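
The dd command above can be wrapped so that the copy is checked afterward. This is only a minimal sketch: the function name backup_voting_disk, the 4k block size and the example paths are assumptions made for illustration, not Oracle-supplied tooling.

```shell
#!/bin/sh
# Hypothetical helper: copy a voting disk to a backup file with dd,
# then compare source and copy byte for byte.
backup_voting_disk() {
    src=$1    # voting disk path, e.g. as reported by: crsctl query css votedisk
    dest=$2   # backup file to create
    # bs=4k is an assumed block size; adjust it for your platform.
    dd if="$src" of="$dest" bs=4k 2>/dev/null || return 1
    # cmp -s exits non-zero if the copy differs from the source.
    cmp -s "$src" "$dest"
}
```

For example, backup_voting_disk /dev/raw/raw2 /backup/votedisk.bak (both paths are placeholders) returns success only when the copy matches. If the clusterware writes to the disk while the copy is running, the comparison may fail even though dd itself succeeded.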

How do I identify the voting disk location?

#crsctl query css votedisk

How do I identify the OCR file location?

Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depending on the platform)
or
#ocrcheck
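
Reading the location out of the ocr.loc file can be sketched as below. The sketch assumes the file holds key=value lines with the OCR path under the key ocrconfig_loc; the function names ocr_location and find_ocr_location are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the OCR location recorded in one ocr.loc file.
# Assumes a line of the form: ocrconfig_loc=/path/to/ocr
ocr_location() {
    sed -n 's/^ocrconfig_loc=//p' "$1"
}

# Try each candidate ocr.loc path given as an argument and use the first
# one that is readable.
find_ocr_location() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            ocr_location "$f"
            return 0
        fi
    done
    echo "no readable ocr.loc found; try ocrcheck instead" >&2
    return 1
}
```

Usage: find_ocr_location /var/opt/oracle/ocr.loc, or pass several candidate paths when the platform is not known in advance.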

Is ssh required for normal Oracle RAC operation?

ssh is not required for normal Oracle RAC operation. However, ssh should be enabled for Oracle RAC and patch set installation.

What do you do if you see GC CR BLOCK LOST in the top 5 Timed Events in an AWR report?

This is most likely due to a fault in the interconnect network.
Check the output of netstat -s. If you see "fragments dropped" or "packet reassemblies failed", work with your system administrator to find the fault in the network.
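
The netstat check can be scripted roughly as follows. This sketch assumes a Linux-style netstat -s layout, where each counter is the first field on its line. The exact wording of the reassembly counter varies between platforms, so the pattern is kept loose. The function name check_reassembly is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "netstat -s" output on stdin and print the
# fragment/reassembly counters whose value is greater than zero.
# Exit status is 0 when at least one such counter was found, 1 otherwise.
check_reassembly() {
    awk '/fragments dropped|packet reassembl[a-z]* failed/ &&
         $1 ~ /^[0-9]+$/ && $1 + 0 > 0 { print; found = 1 }
         END { exit !found }'
}
```

Usage: netstat -s | check_reassembly && echo "possible interconnect fault". These counters are cumulative since boot, so comparing two runs taken a few minutes apart says more than a single non-zero value.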

What is the purpose of the ONS daemon?

The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware as part of the nodeapps. One ONS daemon is started per cluster node.
The ONS daemon receives a subset of the published clusterware events via the local evmd and racgimon clusterware daemons and forwards those events to application subscribers and to the local listeners.

This is done in order to facilitate:

a. FAN (Fast Application Notification), the feature that allows applications to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across the RAC nodes depending on the load on each node. The RDBMS MMON process creates an advisory for the distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners and applications.

srvctl cannot start an instance and I get the errors PRKP-1001 and CRS-0215, but sqlplus can start it on both nodes. How do you identify the problem?

Set the environment variable SRVM_TRACE to true and start the instance with srvctl again. You will now get a detailed error stack.
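
A small wrapper makes it easier to keep the trace output for later analysis. This is a sketch only: the function name run_with_srvm_trace and the log path are assumptions, and the variable is set just for the one command rather than for the whole session.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with SRVM_TRACE=true in its environment,
# save everything it prints (stdout and stderr) to a log file, echo the log,
# and return the command's own exit status.
run_with_srvm_trace() {
    log=$1
    shift
    SRVM_TRACE=true "$@" > "$log" 2>&1
    status=$?
    cat "$log"
    return "$status"
}
```

Usage: run_with_srvm_trace /tmp/srvctl_trace.log srvctl start instance -d orcl -i orcl3 (the database and instance names are examples).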


What is the use of a Virtual IP (VIP) in Oracle Real Application Clusters (RAC)?

When installing Oracle 10g/11g R1 RAC, three network interfaces (IPs) are required for each node in the RAC cluster, they are:
  • Public Interface:  Used for normal network communications to the node
  • Private Interface:  Used as the cluster interconnect
  • Virtual (Public) Interface: Used for failover and RAC management
When installing Oracle 11g R2 RAC, one more network name is required, this time for the cluster as a whole rather than for each node:
  • SCAN Interface (IP):  Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature, which provides a single name for clients to access an Oracle Database running in a cluster. The benefit is clients using SCAN do not need to change if you add or remove nodes in the cluster.
When a client connects to a tns-alias, it uses a TCP connection to an IP address defined in the tnsnames.ora file. When using RAC, we define multiple addresses in our tns-alias so that the client can fail over when an IP address, listener or instance is unavailable. TCP timeouts can differ from platform to platform or implementation to implementation, which makes it difficult to predict the failover time.

Oracle 10g Cluster Ready Services enables databases to use a virtual IP address on which to configure the listener. This feature ensures that Oracle clients fail over quickly when a node fails. In Oracle Database 10g RAC, the use of a virtual IP address to mask the individual IP addresses of the clustered nodes is required. The virtual IP addresses simplify failover and are automatically managed by CRS.

To create a Virtual IP (VIP) address, the Virtual IP Configuration Assistant (VIPCA) is called from the root.sh script of a RAC install. It then configures the virtual IP addresses for each node specified during the installation process. To run VIPCA, there must be unused public IP addresses available for each node, configured in the /etc/hosts file.
Each node needs one public IP address to use as its Virtual IP address for client connections and for connection failover. This IP address is in addition to the public host IP address that the operating system already manages for the node. The requirements for the VIP are:
  • It must be associated with the same interface name on every node that is part of the cluster.
  • The VIP addresses of all nodes in the cluster must be on the same subnet, which must also be the subnet of the public host network addresses.
  • The host names for the VIP addresses must be registered with the domain name server (DNS), so that each VIP is an unused and resolvable IP address.
  • It must not be in use at the time of the installation, because Oracle manages it internally to the RAC processes.
  • It does not require a separate NIC.
Using a virtual IP avoids the TCP/IP timeout problem, because the Oracle Notification Service (ONS) maintains communication between the nodes and listeners. Once ONS finds that a listener or node is down, it notifies the other nodes and listeners. When a new connection tries to reach the failed node or listener, the virtual IP of the failed node is automatically diverted to a surviving node, and the session is established on that surviving node. This process does not wait for a TCP/IP timeout, so new connections are established faster on the surviving nodes and listeners.
The Virtual IP (VIP) is for fast connection establishment in a failover situation. We can still use the physical IP address in the listener in Oracle 10g if failover time is not a concern. We can also reduce the default TCP/IP timeout using operating system utilities or commands. Still, taking advantage of the VIP in an Oracle 10g RAC database is advisable.

**************************************************************************************