Newsletters Archive - All

If you would like to know more about the iDevelopment.info Newsletter, please email me.


  Recover Corrupt/Missing Oracle Cluster Registry (OCR) with No Backup - (Oracle 10g) — (12-October-2009)

It happens. Not very often, but it can happen. You are faced with a corrupt or 
missing Oracle Cluster Registry (OCR) and have no backup to recover from. So, 
how can something like this occur? We know that the CRSD process is responsible 
for creating backup copies of the OCR every 4 hours from the master node in the 
CRS_home/cdata directory. These backups exist so that a lost or corrupt OCR can 
be restored with the ocrconfig -restore command, so how 
is it possible to be in a situation where the OCR needs to be recovered and you 
have no viable backup?
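
For reference, the normal restore path mentioned above can be sketched roughly 
as follows. This is only an illustration: the run and restore_ocr helpers and 
the DRY_RUN variable are my own names, not part of any Oracle tool, and the 
backup file path is whatever ocrconfig -showbackup reports on your system. By 
default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch only: print (or run) the commands for restoring the OCR from one of
# the automatic backups. DRY_RUN=1 (the default here) prints the commands
# instead of executing them. Run as root when executing for real.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 is the path of a backup file listed by -showbackup.
# Oracle Clusterware is normally stopped on all nodes before a restore; check
# the Oracle documentation for your release before running this for real.
restore_ocr() {
    backup_file="$1"
    run ocrconfig -showbackup             # list the available automatic backups
    run ocrconfig -restore "$backup_file" # restore the OCR from the chosen backup
    run ocrcheck                          # verify the OCR afterwards
}
```

Calling restore_ocr with a backup path in dry-run mode prints the three 
commands in order, which is a cheap way to review them before doing anything 
to a live cluster.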

Well, consider a scenario where you add a node to the cluster and, before the 
next automatic backup runs (that is, within 4 hours), you find the OCR has been 
corrupted. You may have forgotten to create a logical export of the OCR before 
adding the new node or, worse yet, the logical export you took is also corrupt. 
In either case, you are 
left with a corrupt OCR and no recent backup. Talk about a bad day! Another 
possible scenario could be a shell script that wrongly deletes all available 
backups. Talk about an even worse day.
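
The logical export mentioned above is a single command, so it is easy to make 
it a habit before any cluster change. A minimal sketch, assuming a dry-run 
wrapper of my own (the run and export_ocr helpers and the DRY_RUN variable are 
hypothetical names) and an export file location of your choosing:

```shell
#!/bin/sh
# Sketch only: take a logical export of the OCR before a change such as
# adding a node. DRY_RUN=1 (the default here) prints the commands instead of
# executing them. Run as root when executing for real.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 is the file the export should be written to.
export_ocr() {
    export_file="$1"
    run ocrcheck                         # confirm the OCR is healthy first
    run ocrconfig -export "$export_file" # write the logical export
}
```

An export taken this way is brought back with ocrconfig -import. Keeping the 
export file somewhere outside the CRS home protects it from the kind of 
cleanup script described above.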

In the event the OCR is corrupt on one node and all options to recover it have 
failed, one safe way to re-create the OCR (and consequently the voting disk) is 
to reinstall the Oracle Clusterware software. To accomplish this, a 
complete outage of the entire cluster is required for the duration of 
the reinstall. The Oracle Clusterware software will need to be fully removed, 
the OCR and voting disks reformatted, all virtual IP addresses (VIPs) 
de-installed, and a complete reinstall of the Oracle Clusterware software will 
need to be performed. It should also be noted that any patches that were 
applied to the original clusterware install will need to be re-applied. As you 
can see, having a backup of the OCR and voting disk can dramatically simplify 
the recovery of your system!

A second and much more efficient method used to re-create the OCR (and 
consequently the voting disk as well) is to re-run the root.sh script from the 
primary node in the cluster. This is described in Doc ID: 399482.1 on the
My Oracle Support web site. The procedure actually calls for running the 
rootdelete.sh and rootdeinstall.sh scripts on all nodes in the cluster before 
running root.sh. In my opinion, this method is quicker and much less intrusive 
than reinstalling Oracle Clusterware, and it is the one described in the 
following article: 

Recover Corrupt/Missing OCR with No Backup - (Oracle 10g)
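
To make the ordering of the second method concrete, here is a small sketch 
that only prints the sequence described above and executes nothing. The 
print_recreate_plan function is a hypothetical helper of my own, and I am 
assuming that rootdelete.sh and rootdeinstall.sh live under the install 
directory of the Clusterware home. Doc ID 399482.1 and the article remain the 
authority on exactly which nodes each script must run on and on any follow-up 
steps.

```shell
#!/bin/sh
# Sketch only: print the order of scripts for re-creating the OCR and voting
# disk. Nothing is executed; each printed line is meant to be run by hand as
# root on the named node.
# Arguments: the Clusterware home, then the node names with the primary
# node listed first.
# Assumption: rootdelete.sh and rootdeinstall.sh are under <crs_home>/install.
print_recreate_plan() {
    crs_home="$1"
    shift
    primary="$1"
    for node in "$@"; do
        echo "$node: $crs_home/install/rootdelete.sh"
    done
    for node in "$@"; do
        echo "$node: $crs_home/install/rootdeinstall.sh"
    done
    # root.sh is re-run from the primary node, as described in the text.
    echo "$primary: $crs_home/root.sh"
}
```

For example, print_recreate_plan /u01/app/crs node1 node2 lists both cleanup 
scripts for each node and ends with root.sh on node1. The home path and node 
names here are made up for illustration.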

----------------------------
Jeffrey M. Hunter, OCP
Sr. Database Administrator
jhunter@idevelopment.info
http://www.idevelopment.info
----------------------------