DBA Tips Archive for Oracle

  


[an error occurred while processing this directive]

No Title

[an error occurred while processing this directive]

by Michael New, MichaelNew@earthlink.net, Gradation LLC

Contents

Introduction

With sufficient preparation and adequate testing, performing a failover using Oracle Data Guard is a fairly trivial operation. After performing the failover, however, the next challenge is what to do with the original primary database. Regardless of which type of standby database you failed over to, physical or logical, if this was the only standby database in the Data Guard configuration, then the database environment will be running unprotected until a new standby database can be setup and synchronized with the current primary.

Depending on the size of the database, rebuilding the old primary from scratch and entering it back into the Data Guard configuration could be an expensive operation as it would potentially involve a lot of time and resources. However, prior to Oracle Database 10g, this was the only option available. Starting with Oracle Database 10g and the introduction of Flashback Database, it is now possible to rewind the old primary database back in time to the point just before where the failure occurred and from there, bring it back into the configuration as a standby. After the Flashback Database operation, the roles can be changed using switchover, thus bringing the primary database (and the Data Guard configuration) back to its original state. The only catch here is that you must have already enabled Flashback Database on the original primary database before the primary failover in order to rewind it back to a point in time (before the primary failover). If Flashback Database was not enabled on the original primary database before the failover, your only choice would be to remove the original primary and re-create it. The following guides can be used to create the original primary database as a standby database:

 

A word of caution before starting the old primary machine. Make certain to not allow the old primary database to come online in read/write mode after failing over to a standby. Some application servers may still be configured with connection information to the old primary database. The applications would then be able to make a connection to the stale database and begin generating transactions during the same time clients are connected to the real primary database. This type of "split-brain" scenario can be very complex to clean up and recover from. It is best to never allow the failed primary database to go beyond the MOUNT state after a failover to a standby database.

If Flashback Database was enabled on the original primary database before the failover and the original primary database is still available after the original primary machine comes back online, it is possible to restore the original primary database back into the Data Guard configuration as a standby database using the steps outlined in this guide. This process is known as reinstatement.

Flashing Back a Failed Primary Database into a Physical Standby Database

The steps outlined in this section assume the user has already performed a failover involving a physical standby database and Flashback Database was enabled on the old primary database before the failover:


SQL> select flashback_on from v$database; FLASHBACK_ON ------------- YES

The procedures documented in this section brings the old primary database back into the Data Guard configuration as a new physical standby database.

  1. Determine the SCN at which the old standby database became the primary database.

    The first step in the process is to determine the point in which the standby database became the primary database. Given the fact that all redo is applied by a SCN, we need to determine the failover SCN from the new primary. After a failover (when the physical standby becomes the primary), Oracle writes the failover SCN to the control file and makes the value available through the STANDBY_BECAME_PRIMARY_SCN column of the V$DATABASE fixed view.

    On the new primary database, issue the following query to determine the SCN at which the old standby database became the new primary database:


    SQL> select to_char(standby_became_primary_scn) from v$database; TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) ---------------------------------------- 3272186

  2. Flash back the failed primary database.

    To create a new physical standby database, shut down the old primary database (if necessary), mount it, and flash it back to the value for STANDBY_BECAME_PRIMARY_SCN that was determined in the previous step.


    SQL> shutdown immediate; <old primary database was not started> SQL> startup mount; ORACLE instance started. Total System Global Area 1241513984 bytes Fixed Size 1273420 bytes Variable Size 318767540 bytes Database Buffers 905969664 bytes Redo Buffers 15503360 bytes Database mounted. SQL> flashback database to scn 3272186; Flashback complete.

  3. Convert the database to a physical standby database.

    Perform the following steps on the old primary database:

    1. Issue the following statement on the old primary database:


      SQL> alter database convert to physical standby; Database altered.

      This statement will dismount the database after successfully converting the control file to a standby control file.

    2. Shut down and restart the old primary database:


      SQL> shutdown immediate; ORA-01507: database not mounted ORACLE instance shut down. SQL> startup mount; ORACLE instance started. Total System Global Area 1241513984 bytes Fixed Size 1273420 bytes Variable Size 318767540 bytes Database Buffers 905969664 bytes Redo Buffers 15503360 bytes Database mounted.

  4. Restart transporting redo to the new physical standby database.

    Before the new standby database was created, the new primary database probably stopped transmitting redo to the remote destination (due to a timeout for example). To restart redo transport services, perform the following steps on the new primary database. The primary / physical standby database used in this guide is using LOG_ARCHIVE_DEST_2.

    1. Issue the following query to see the current state of the archive destinations:


      SQL> select dest_name, status, protection_mode, destination, error 2 from v$archive_dest_status; DEST_NAME STATUS PROTECTION_MODE DESTINATION ERROR -------------------- --------- -------------------- --------------------------- -------------- LOG_ARCHIVE_DEST_1 VALID MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_2 ERROR MAXIMUM PERFORMANCE turlock.idevelopment.info ORA-03113: end-of-file on communication channel LOG_ARCHIVE_DEST_3 VALID MAXIMUM PERFORMANCE /u04/oracle/oraarch/MODESTO LOG_ARCHIVE_DEST_4 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_5 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_6 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_7 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_8 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_9 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_10 INACTIVE MAXIMUM PERFORMANCE

    2. If necessary, enable the destination from the new primary primary database.


      SQL> alter system set log_archive_dest_state_2='enable'; System altered.

    3. Perform a log switch from the new primary to ensure the standby database begins receiving redo data from the new primary database, and verify it was sent successfully. At the SQL prompt, enter the following statements:


      SQL> alter system switch logfile; System altered. SQL> select dest_name, status, protection_mode, destination, error 2 from v$archive_dest_status; DEST_NAME STATUS PROTECTION_MODE DESTINATION ERROR -------------------- --------- -------------------- --------------------------- -------------- LOG_ARCHIVE_DEST_1 VALID MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_2 VALID MAXIMUM PERFORMANCE turlock.idevelopment.info LOG_ARCHIVE_DEST_3 VALID MAXIMUM PERFORMANCE /u04/oracle/oraarch/MODESTO LOG_ARCHIVE_DEST_4 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_5 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_6 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_7 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_8 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_9 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_10 INACTIVE MAXIMUM PERFORMANCE

      On the new standby database, you may also need to change the LOG_ARCHIVE_DEST_n initialization parameters so that redo transport services do not transmit redo data to other databases. This step can be skipped if both the primary and standby database roles were set up with the VALID_FOR attribute in the server parameter file (SPFILE). By doing this, the Data Guard configuration operates properly through role transitions.

  5. Start Redo Apply.

    Start Redo Apply or real-time apply on the new physical standby database:

    Once the failed primary database is restored and is running in the standby role, you can optionally perform a switchover to transition the databases back to their original (pre-failure) roles. See "Switchovers Involving a Physical Standby Database" for more information.

  6. Restart Redo Apply on the new physical standby database after recovering the EOR marker.

    The new physical standby database is now running at SCN 3272186, and the MRP will start to look for redo from that point on. As part of this process, the MRP will need to recover the redo that was contained in log sequence number 6666 since this is the exact point we were at on the original primary when the failover occurred (see alert.log information below). If you remember back to the actual failover process, we needed to perform terminal recovery on the target standby. After entering the FINISH command, Oracle wrote a special End Of Redo (EOR) marker in the current log file. This is unlike a role transition involving a switchover which forces a log switch followed by the EOR in the header of the next redo log. After the new physical standby database reads in the EOR marker, the MRP stops. But notice in the alert.log that it is telling us that is knows there is more redo beyond the current archived redo log containing the EOR marker:


    Media Recovery Log /u04/oracle/oraarch/MODESTO/1_6666_736458474.dbf Identified End-Of-Redo for thread 1 sequence 6666 Fri Feb 18 09:39:25 EST 2011 Media Recovery End-Of-Redo indicator encountered Fri Feb 18 09:39:25 EST 2011 Media Recovery Applied until change 3272188 Fri Feb 18 09:39:25 EST 2011 MRP0: Media Recovery Complete: End-Of-REDO (modesto) Begin: Standby Redo Logfile archival End: Standby Redo Logfile archival Media Recovery archivelogs detected beyond End-Of-REDO Resetting standby activation ID 0 (0x0) Fri Feb 18 09:39:27 EST 2011 MRP0: Background Media Recovery process shutdown (modesto)

    After the new physical standby database reads in the EOR marker and the MRP stops, all we have to do is restart Redo Apply once again.


    SQL> alter database recover managed standby database using current logfile disconnect; Database altered.

    Something interesting here that you may have noticed and is worth mentioning now. Similar to an OPEN RESET LOGS, the failover operation will reset the log sequence numbers for each thread back to 1. In Oracle9i, this would have broke the standby database and rendering it unusable. Not anymore with Oracle Database 10g. Since 10g, Data Guard is now able to process a reset like this and apply only what it needs from the old redo log stream and then switch over to the new redo log stream which began at 1 again.


    Fri Feb 18 10:37:12 EST 2011 alter database recover managed standby database using current logfile disconnect Fri Feb 18 10:37:12 EST 2011 Attempt to start background Managed Standby Recovery process (modesto) MRP0 started with pid=24, OS id=30863 Fri Feb 18 10:37:12 EST 2011 MRP0: Background Managed Standby Recovery process started (modesto) Managed Standby Recovery starting Real Time Apply Media Recovery apply resetlogs offline range for datafile 1, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 2, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 3, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 4, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 5, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 6, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 7, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 8, incarnation : 0 Media Recovery apply resetlogs offline range for datafile 9, incarnation : 0 parallel recovery started with 2 processes Fri Feb 18 10:37:18 EST 2011 Waiting for all non-current ORLs to be archived... Media Recovery Log /u04/oracle/oraarch/MODESTO/1_1_743418834.dbf Fri Feb 18 10:37:19 EST 2011 Completed: alter database recover managed standby database using current logfile disconnect Fri Feb 18 10:37:20 EST 2011 Media Recovery Log /u04/oracle/oraarch/MODESTO/1_2_743418834.dbf Media Recovery Log /u04/oracle/oraarch/MODESTO/1_3_743418834.dbf Media Recovery Log /u04/oracle/oraarch/MODESTO/1_4_743418834.dbf Media Recovery Log /u04/oracle/oraarch/MODESTO/1_5_743418834.dbf Media Recovery Log /u04/oracle/oraarch/MODESTO/1_6_743418834.dbf Media Recovery Log /u04/oracle/oraarch/MODESTO/1_7_743418834.dbf Media Recovery Waiting for thread 1 sequence 8 Fri Feb 18 10:39:38 EST 2011 RFS[2]: Successfully opened standby log 4: '/u02/oradata/MODESTO/onlinelog/o1_mf_4_6hvg3qk9_.log' Fri Feb 18 10:39:41 EST 2011 Media Recovery Log /u04/oracle/oraarch/MODESTO/1_8_743418834.dbf Media Recovery Waiting for thread 1 sequence 9

Data Guard will now continue to resolve any gaps in the redo stream and apply any missing redo to the new physical standby database. Once the failed primary database is restored and is running in the standby role, you can optionally perform a switchover to transition the databases back to their original (pre-failure) roles. See "Switchovers Involving a Physical Standby Database" for more information.

Flashing Back a Failed Primary Database into a Logical Standby Database

After failing over to a logical standby database, the only way to reinstate the old primary database is as a logical standby. It is not possible to transition the old primary to a physical standby since it is not an exact block-for-block copy of the new primary database.

The steps outlined in this section assume the user has already performed a failover involving a logical standby database and Flashback Database was enabled on the old primary database before the failover:


SQL> select flashback_on from v$database; FLASHBACK_ON ------------- YES

The procedures documented in this section brings the old primary database back into the Data Guard configuration as a new logical standby database, without having to formally re-instantiate it from the new primary database.

  1. Determine the SCN to which to flash back the failed primary database.

    On the new primary database, issue the following query to determine the SCN to which you want to flash back the failed primary database:


    SQL> select applied_scn as flashback_scn from v$logstdby_progress; FLASHBACK_SCN ------------- 3369533

    The above SCN value will be used in this section to determine which log files must be copied to the failed primary and to flash back the failed primary.

  2. Determine the log files that must be copied to the failed primary database for Flashback Database.

    On the new primary database, issue the following query to determine the log files that must be copied to the failed primary database for Flashback Database to reach a consistent state.


    select file_name from dba_logstdby_log where next_change# > ( select value from dba_logstdby_parameters where name = 'STANDBY_BECAME_PRIMARY_SCN' ) and first_change# <= (FLASHBACK_SCN from step 1);

    For example:


    SQL> select file_name 2 from dba_logstdby_log 3 where next_change# > ( select value 4 from dba_logstdby_parameters 5 where name = 'STANDBY_BECAME_PRIMARY_SCN' 6 ) 7 and first_change# <= (3369533); FILE_NAME --------------------------------------------------- /u04/oracle/oraarch/TURLOCK/1_80_743541307.dbf

  3. Flash back the failed primary database.

    To create a new logical standby database, shut down the database (if necessary), mount the failed primary database, flash it back to the FLASHBACK_SCN determined in step 1, and enable the database guard.


    SQL> shutdown immediate; <old primary database was not started> SQL> startup mount; ORACLE instance started. Total System Global Area 1241513984 bytes Fixed Size 1273420 bytes Variable Size 318767540 bytes Database Buffers 905969664 bytes Redo Buffers 15503360 bytes Database mounted. SQL> flashback database to scn 3369533; Flashback complete. SQL> alter database guard all; Database altered.

  4. Open the database with the RESETLOGS option.


    SQL> alter database open resetlogs; Database altered.

  5. Create a database link to the new primary database and start SQL Apply.


    SQL> create database link turlock 2 connect to <username> identified by <password> using 'turlock.idevelopment.info'; Database link created. SQL> alter database start logical standby apply new primary turlock; Database altered.

    The database user account specified in the CREATE DATABASE LINK statement must have the SELECT_CATALOG_ROLE role granted to it on the primary database.

  6. Restart transporting redo to the new logical standby database.

    Before the new standby database was reinstated, the new primary database probably stopped transmitting redo to the remote destination (due to a timeout for example). To restart redo transport services, perform the following steps on the new primary database. The primary / logical standby database used in this guide is using LOG_ARCHIVE_DEST_2.

    1. If necessary, enable the destination from the new primary primary database.


      SQL> alter system set log_archive_dest_state_2='enable'; System altered.

    2. Perform a log switch from the new primary ensure the standby database begins receiving redo data from the new primary database, and verify it was sent successfully. At the SQL prompt, enter the following statements:


      SQL> alter system switch logfile; System altered. SQL> select dest_name, status, protection_mode, destination, error 2 from v$archive_dest_status; DEST_NAME STATUS PROTECTION_MODE DESTINATION ERROR -------------------- --------- -------------------- --------------------------- -------------- LOG_ARCHIVE_DEST_1 VALID MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_2 VALID MAXIMUM PERFORMANCE modesto.idevelopment.info LOG_ARCHIVE_DEST_3 VALID MAXIMUM PERFORMANCE /u04/oracle/oraarch/TURLOCK LOG_ARCHIVE_DEST_4 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_5 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_6 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_7 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_8 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_9 INACTIVE MAXIMUM PERFORMANCE LOG_ARCHIVE_DEST_10 INACTIVE MAXIMUM PERFORMANCE

      On the new standby database, you may also need to change the LOG_ARCHIVE_DEST_n initialization parameters so that redo transport services do not transmit redo data to other databases. This step can be skipped if both the primary and standby database roles were set up with the VALID_FOR attribute in the server parameter file (SPFILE). By doing this, the Data Guard configuration operates properly through role transitions.

  7. Enable SQL Apply GUARD

    After reinstating the new logical standby database, the SQL Apply GUARD may be set to NONE which means the database GUARD is disabled and all modifications are allowed.


    SQL> select guard_status from v$database; GUARD_STATUS -------------- NONE

    Click here for further information on protecting replicated tables on a logical standby from user modifications.

The role reversal is now complete.

Once the failed primary database has been restored and is running in the standby role, you can optionally perform a switchover to transition the databases back to their original (pre-failure) roles. See "Switchovers Involving a Logical Standby Database" for more information.

Further Reading

Additional information on Oracle 10g Release 2 Data Guard can be found in the Data Guard Concepts and Administration. This guide is available from the Oracle Documentation Library website located at the following address http://download.oracle.com/docs/cd/B19306_01/server.102/b14239.pdf.

About the Author

Jeffrey Hunter is an Oracle Certified Professional, Java Development Certified Professional, Author, and an Oracle ACE. Jeff currently works as a Senior Database Administrator for The DBA Zone, Inc. located in Pittsburgh, Pennsylvania. His work includes advanced performance tuning, Java and PL/SQL programming, developing high availability solutions, capacity planning, database security, and physical / logical database design in a UNIX / Linux server environment. Jeff's other interests include mathematical encryption theory, tutoring advanced mathematics, programming language processors (compilers and interpreters) in Java and C, LDAP, writing web-based database administration tools, and of course Linux. He has been a Sr. Database Administrator and Software Engineer for over 20 years and maintains his own website site at: http://www.iDevelopment.info. Jeff graduated from Stanislaus State University in Turlock, California, with a Bachelor's degree in Computer Science and Mathematics.



Copyright (c) 1998-2017 Jeffrey M. Hunter. All rights reserved.

All articles, scripts and material located at the Internet address of http://www.idevelopment.info is the copyright of Jeffrey M. Hunter and is protected under copyright laws of the United States. This document may not be hosted on any other site without my express, prior, written permission. Application to host any of the material elsewhere can be made by contacting me at jhunter@idevelopment.info.

I have made every effort and taken great care in making sure that the material included on my web site is technically accurate, but I disclaim any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on it. I will in no case be liable for any monetary damages arising from such loss, damage or destruction.

Last modified on
Thursday, 19-Jan-2012 01:07:10 EST
Page Count: 13908