DBA Tips Archive for Oracle

  


Add a Node to an Existing Oracle RAC 10g R2 Cluster on Linux - (RHEL 5.3)

by Jeff Hunter, Sr. Database Administrator


Contents

  1. Overview
  2. Hardware and Costs
  3. Install the Linux Operating System
  4. Install Required Linux Packages for Oracle RAC
  5. Network Configuration
  6. Configure Network Security on the Openfiler Storage Server
  7. Configure the iSCSI Initiator
  8. Create "oracle" User and Directories
  9. Configure the Linux Server for Oracle
  10. Configure the "hangcheck-timer" Kernel Module
  11. Configure RAC Nodes for Remote Access using SSH
  12. All Startup Commands for New Oracle RAC Node
  13. Install and Configure Oracle Cluster File System (OCFS2)
  14. Install and Configure Automatic Storage Management (ASMLib 2.0)
  15. Pre-Installation Tasks for Oracle10g Release 2
  16. Extend Oracle Clusterware Software to the New Node
  17. Extend Oracle Database Software to the New Node
  18. Add Listener to New Node
  19. Add Database Instance to the New Node
  20. About the Author



Overview

As your organization grows, so too does your need for more application and database resources to support the company's IT systems. Oracle RAC 10g provides a scalable framework which allows DBAs to effortlessly extend the database tier to support this increased demand. As the number of users and transactions increases, additional Oracle instances can be added to the Oracle database cluster to distribute the extra load.

This document is an extension to my article "Building an Inexpensive Oracle RAC 10g Release 2 on Linux - (CentOS 5.3 / iSCSI)". Contained in this new article are the steps required to add a single node to an already running and configured two-node Oracle RAC 10g Release 2 environment on the CentOS 5 (x86) platform. Although this article was written and tested on CentOS 5.3 Linux, it should work unchanged with Red Hat Enterprise Linux 5 Update 3.

This article assumes the following:

The reader has already built and configured a two-node Oracle RAC 10g Release 2 environment using the article "Building an Inexpensive Oracle RAC 10g Release 2 on Linux - (CentOS 5.3 / iSCSI)". That article provides comprehensive instructions for building a two-node RAC cluster, each node with a single processor, running CentOS 5.3 (x86), Oracle RAC 10g Release 2 for Linux x86, OCFS2, and ASMLib 2.0. The current two-node RAC environment actually consists of three machines — two named linux1 and linux2, each running an Oracle10g instance, and a third machine named openfiler1 which runs the network storage server.

Note: The current two-node Oracle RAC environment has been upgraded from its base release (10.2.0.1.0) to Oracle RAC 10g Release 2 (10.2.0.4) Patch Set 3 for Linux x86 by applying the 6810189 patchset (p6810189_10204_Linux-x86.zip). The patchset was applied to Oracle Clusterware and the Oracle Database software. The procedures for installing patch sets are not included in any of the parent article(s).

To maintain the current naming convention, the new Oracle RAC node to be added to the existing cluster will be named linux3 (running a new instance named racdb3) making it a three-node cluster.

The new Oracle RAC node should have the same operating system version and installed patches as the current two-node cluster.

Each node in the existing Oracle RAC cluster has a copy of the Oracle Clusterware and Oracle Database software installed on its local disks. The current two-node Oracle RAC environment does not use shared Oracle homes for the Clusterware or Database software.

The software owner for the Oracle Clusterware and Oracle Database installs will be "oracle". It is important that the UID and GID of the oracle user account on the new node be identical to that of the existing RAC nodes. For the purpose of this example, the oracle user account will be defined as follows:
[oracle@linux1 ~]$ id oracle
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper)
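
On the new node, a matching account could be created with something along these lines — a minimal sketch using the UID/GIDs shown above; the complete user and directory setup is covered in the "Create "oracle" User and Directories" section:

[root@linux3 ~]# groupadd -g 501 oinstall
[root@linux3 ~]# groupadd -g 502 dba
[root@linux3 ~]# groupadd -g 503 oper
[root@linux3 ~]# useradd -m -u 501 -g oinstall -G dba,oper oracle
[root@linux3 ~]# passwd oracle
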
The existing Oracle RAC 10g environment makes use of a clustered file system (OCFS2) to store the two files required to be shared by Oracle Clusterware; namely the Oracle Cluster Registry (OCR) file and the Voting Disk. Instructions for installing and adding the new Oracle RAC node to the "live" OCFS2 file system will be included.

Automatic Storage Management (ASM) is being used as the file system and volume manager for all Oracle physical database files (data, online redo logs, control files, archived redo logs) and a Flash Recovery Area. In addition to ASM, we will also be configuring ASMLib on the new Oracle RAC node.

To add instances to an existing RAC database, Oracle Corporation recommends using the Oracle cloning procedures described in the Oracle Universal Installer and OPatch User's Guide. This article, however, uses manual procedures to add nodes and instances to the existing Oracle RAC cluster. The manual method described in this article involves extending the RAC database by first extending the Oracle Clusterware home to the new Oracle RAC node and then extending the Oracle Database home. In other words, you extend the software onto the new node in the same order as you installed the Clusterware and Oracle Database software components on the existing two-node RAC.

During the creation of the existing two-node cluster, the installation of Oracle Clusterware and the Oracle Database software were only performed from one node in the RAC cluster — namely from linux1 as the oracle user account. The Oracle Universal Installer (OUI) on that particular node would then use the ssh and scp commands to run remote commands on and copy the Oracle software to all other nodes within the RAC cluster. The oracle user account on the node running the OUI (runInstaller) had to be trusted by all other nodes in the RAC cluster. This meant that the oracle user account had to run the secure shell commands (ssh or scp) on the Linux server executing the OUI (linux1) against all other Linux servers in the cluster without being prompted for a password. The same security requirements hold true for this article. User equivalence will be configured so that the Oracle Clusterware and Oracle Database software will be securely copied from linux1 to the new Oracle RAC node (linux3) using ssh and scp without being prompted for a password.
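
As a quick sanity check once user equivalence is in place (it is configured in the "Configure RAC Nodes for Remote Access using SSH" section), commands like the following, run as oracle from linux1, should return the date and host name of the new node without prompting for a password (a sketch):

[oracle@linux1 ~]$ ssh linux3 "date;hostname"
[oracle@linux1 ~]$ ssh linux3-priv "date;hostname"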

All shared disk storage for the existing Oracle RAC is based on iSCSI using a Network Storage Server; namely Openfiler Release 2.3 (Final) x86_64.

Powered by rPath Linux, Openfiler is a free browser-based network storage management utility that delivers file-based Network Attached Storage (NAS) and block-based Storage Area Networking (SAN) in a single framework. The entire software stack interfaces with open source applications such as Apache, Samba, LVM2, ext3, Linux NFS and iSCSI Enterprise Target. Openfiler combines these ubiquitous technologies into a small, easy to manage solution fronted by a powerful web-based management interface.

These articles provide a low cost alternative for those who want to become familiar with Oracle RAC 10g using commercial off the shelf components and downloadable software. Bear in mind that these articles are provided for educational purposes only, so the setup is kept simple to demonstrate ideas and concepts. For example, the disk mirroring configured in this article will be set up on one physical disk only, while in practice it should be performed on multiple physical drives. In addition, each Linux node will only be configured with two network cards — one for the public network (eth0) and one for the private cluster interconnect "and" network storage server for shared iSCSI access (eth1). For a production RAC implementation, the private interconnect should be at least Gigabit (or more) and "only" be used by Oracle to transfer Cluster Manager and Cache Fusion related data. A third dedicated network interface (i.e. eth2) should be configured on another Gigabit network for access to the network storage server (Openfiler).

The following is a conceptual look at what the environment will look like after adding the third Oracle RAC node (linux3) to the cluster. Click on the graphic below to enlarge the image:

Oracle RAC 10g Release 2 Environment

Figure 1: Adding linux3 to the current Oracle RAC 10g Release 2 Environment


  While this article provides comprehensive instructions for successfully adding a node to an existing Oracle RAC 10g system, it is by no means a substitute for the official Oracle documentation. In addition to this article, users should also consult the following Oracle documents to gain a full understanding of alternative configuration options, installation, and administration with Oracle RAC 10g. Oracle's official documentation site is docs.oracle.com.

  Oracle Clusterware and Oracle Real Application Clusters Installation Guide - 10g Release 2 (10.2) for Linux
  Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide - 10g Release 2 (10.2)
  2 Day + Real Application Clusters Guide - 10g Release 2 (10.2)



Hardware and Costs

The hardware used in this article to build the third node (linux3) consists of a Linux workstation and components which can be purchased at many local computer stores or over the Internet (i.e. Stallard Technologies, Inc.).

Oracle RAC Node 3 - (linux3)

  Dell Dimension 3000 Series - US$300

     - Intel(R) Pentium(R) 4 Processor at 2.80GHz
     - 2GB DDR SDRAM (at 333MHz)
     - 60GB 7200 RPM Internal Hard Drive
     - Integrated Intel 3D AGP Graphics
     - Integrated 10/100 Ethernet - (Broadcom BCM4401)
     - CDROM (48X Max Variable)
     - 3.5" Floppy
     - No Keyboard, Monitor, or Mouse - (Connected to KVM Switch)

  1 - Ethernet LAN Card - US$35

Each Linux server for Oracle RAC should contain two NIC adapters. The Dell Dimension includes an integrated 10/100 Ethernet adapter that will be used to connect to the public network. The second NIC adapter will be used for the private network (RAC interconnect and Openfiler networked storage). Select the appropriate NIC adapter that is compatible with the maximum data transmission speed of the network switch to be used for the private network. For the purpose of this article, I used a Gigabit Ethernet switch (and 1Gb Ethernet cards) for the private network.

Used for RAC interconnect to linux1, linux2 and Openfiler networked storage.

     Gigabit Ethernet

       Intel 10/100/1000Mbps PCI Desktop Adapter - (PWLA8391GT)

  2 - Network Cables - US$10 each

       Category 6 patch cable - (Connect linux3 to public network)
       Category 6 patch cable - (Connect linux3 to interconnect Ethernet switch)

  Total - US$355


We are about to start the installation process. As we start to go into the details of the installation, it should be noted that most of the tasks within this document will need to be performed on the new Oracle RAC node (linux3). I will indicate at the beginning of each section whether or not the task(s) should be performed on the new Oracle RAC node, the current Oracle RAC node(s), or on the network storage server (openfiler1).



Install the Linux Operating System


  Perform the following installation on the new Oracle RAC node!

After procuring the required hardware, it is time to start the configuration process. The first task we need to perform is to install the Linux operating system. As already mentioned, this article will use CentOS 5.3 (x86) and follows Oracle's suggestion of performing a "default RPMs" installation type to ensure all expected Linux O/S packages are present for a successful Oracle RDBMS installation.


Downloading CentOS

  CentOS.org

Download and burn the following ISO images to CD/DVD for CentOS Release 5 Update 3 for either x86 or x86_64 depending on your hardware architecture.

32-bit (x86) Installations

If the Linux RAC nodes have a DVD drive installed, you may find it more convenient to make use of the single DVD image:

64-bit (x86_64) Installations

If the Linux RAC nodes have a DVD drive installed, you may find it more convenient to make use of the single DVD image:

  If you are downloading the above ISO files to a MS Windows machine, there are many options for burning these images (ISO files) to a CD. You may already be familiar with and have the proper software to burn images to CD. If you are not familiar with this process and do not have the required software to burn images to CD, here are just two (of many) software packages that can be used:

  UltraISO
  Magic ISO Maker


Installing CentOS

This section provides a summary of the screens used to install CentOS. For more detailed installation instructions, you can refer to the Red Hat Linux manuals at http://www.redhat.com/docs/manuals/. I would suggest, however, that the instructions I have provided below be used for this Oracle RAC 10g configuration.

  Before installing the Linux operating system on the new Oracle RAC node, you should have the two NIC interfaces (cards) installed.

After downloading and burning the CentOS images (ISO files) to CD/DVD, insert CentOS Disk #1 into the new Oracle RAC server (linux3 in this example), power it on, and answer the installation screen prompts as noted below.

Boot Screen

The first screen is the CentOS boot screen. At the boot: prompt, hit [Enter] to start the installation process.

Media Test

When asked to test the CD media, tab over to [Skip] and hit [Enter]. If there were any errors, the media burning software would have warned us. After several seconds, the installer should then detect the video card, monitor, and mouse. The installer then goes into GUI mode.

Welcome to CentOS

At the welcome screen, click [Next] to continue.

Language / Keyboard Selection

The next two screens prompt you for the Language and Keyboard settings. In almost all cases, you can accept the defaults. Make the appropriate selection for your configuration and click [Next] to continue.

Disk Partitioning Setup

Keep the default selection to [Remove linux partitions on selected drives and create default layout] and check the option to [Review and modify partitioning layout]. Click [Next] to continue.

You will then be prompted with a dialog window asking if you really want to remove all Linux partitions. Click [Yes] to acknowledge this warning.

Partitioning

The installer will then allow you to view (and modify if needed) the disk partitions it automatically selected. For most automatic layouts, the installer will choose 100MB for /boot, double the amount of RAM (systems with <= 2,048MB RAM) or an amount equal to RAM (systems with > 2,048MB RAM) for swap, and the rest going to the root (/) partition. Starting with RHEL 4, the installer will create the same disk configuration as just noted but will create them using the Logical Volume Manager (LVM). For example, it will partition the first hard drive (/dev/hda for my configuration) into two partitions — one for the /boot partition (/dev/hda1) and the remainder of the disk dedicated to an LVM volume group named VolGroup00 (/dev/hda2). The LVM Volume Group (VolGroup00) is then partitioned into two LVM logical volumes - one for the root filesystem (/) and another for swap.

The main concern during the partitioning phase is to ensure enough swap space is allocated as required by Oracle (which is a multiple of the available RAM). The following is Oracle's requirement for swap space:

  Available RAM                    Swap Space Required
  Between 1,024MB and 2,048MB      1.5 times the size of RAM
  Between 2,049MB and 8,192MB      Equal to the size of RAM
  More than 8,192MB                0.75 times the size of RAM

For the purpose of this install, I will accept all automatically preferred sizes (including 4,096MB for swap since I have 2,048MB of RAM installed).

If for any reason, the automatic layout does not configure an adequate amount of swap space, you can easily change that from this screen. To increase the size of the swap partition, [Edit] the volume group VolGroup00. This will bring up the "Edit LVM Volume Group: VolGroup00" dialog. First, [Edit] and decrease the size of the root file system (/) by the amount you want to add to the swap partition. For example, to add another 512MB to swap, you would decrease the size of the root file system by 512MB (i.e. 36,032MB - 512MB = 35,520MB). Now add the space you decreased from the root file system (512MB) to the swap partition. When completed, click [OK] on the "Edit LVM Volume Group: VolGroup00" dialog.

Once you are satisfied with the disk layout, click [Next] to continue.
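
Before moving on, note that once the new node is up and running, the installed RAM and allocated swap can be double-checked against Oracle's requirements with, for example:

[root@linux3 ~]# grep MemTotal /proc/meminfo
[root@linux3 ~]# grep SwapTotal /proc/meminfo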

Boot Loader Configuration

The installer will use the GRUB boot loader by default. To use the GRUB boot loader, accept all default values and click [Next] to continue.

Network Configuration

I made sure to install both NIC interfaces (cards) in the new Linux machine before starting the operating system installation. This screen should have successfully detected each of the network devices. Since we will be using this machine to host an Oracle instance, there will be several changes that need to be made to the network configuration. The settings you make here will, of course, depend on your network configuration. The key point to make is that the machine should never be configured with DHCP since it will be used to host an Oracle instance. You will need to configure the machine with static IP addresses. You will also need to configure the server with a real host name.

First, make sure that each of the network devices is checked to [Active on boot]. The installer may choose to not activate eth1 by default.

Second, [Edit] both eth0 and eth1 as follows. Verify that the option "Enable IPv4 support" is selected. Click off the option for "Use dynamic IP configuration (DHCP)" and configure a static IP address and Netmask for your environment. Click off the option to "Enable IPv6 support". You may choose to use different IP addresses for eth0 and eth1 than the ones I have documented in this guide, and that is OK. Put eth1 (the interconnect) on a different subnet than eth0 (the public network):

eth0:
- Check ON the option to [Enable IPv4 support]
- Check OFF the option to [Use dynamic IP configuration (DHCP)] - (select Manual configuration)
   IPv4 Address: 192.168.1.107
   Prefix (Netmask): 255.255.255.0
- Check OFF the option to [Enable IPv6 support]

eth1:
- Check ON the option to [Enable IPv4 support]
- Check OFF the option to [Use dynamic IP configuration (DHCP)] - (select Manual configuration)
   IPv4 Address: 192.168.2.107
   Prefix (Netmask): 255.255.255.0
- Check OFF the option to [Enable IPv6 support]

Continue by setting your hostname manually. I used "linux3" for this new Oracle RAC node. Finish this dialog off by supplying your gateway and DNS servers.
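
For reference, the settings made on this screen end up in the interface configuration files under /etc/sysconfig/network-scripts. A rough sketch of what ifcfg-eth0 would contain for this example (your HWADDR will differ, and the installer may place the GATEWAY entry in /etc/sysconfig/network instead):

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.107
NETMASK=255.255.255.0
GATEWAY=192.168.1.1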

Time Zone Selection

Select the appropriate time zone for your environment and click [Next] to continue.

Set Root Password

Select a root password and click [Next] to continue.

Package Installation Defaults

By default, CentOS Linux installs most of the software required for a typical server. There are several other packages (RPMs), however, that are required to successfully install the Oracle Database software. The installer includes a "Customize software" selection that allows the addition of RPM groupings such as "Development Libraries" or "Legacy Library Support". The ADDITION of such RPM groupings is NOT an issue. De-selecting any "default RPM" groupings or individual RPMs, however, can result in failed Oracle Clusterware and Oracle Database installation attempts.

For the purpose of this article, select the radio button [Customize now] and click [Next] to continue.

This is where you pick the packages to install. Most of the packages required for the Oracle software are grouped into "Package Groups" (i.e. Applications -> Editors). Since this node will be hosting the Oracle Clusterware and Oracle RAC software, verify that at least the following package groups are selected for install. For many of the Linux package groups, not all of the packages associated with that group get selected for installation. (Note the "Optional packages" button after selecting a package group.) So although the package group gets selected for install, some of the packages required by Oracle do not get installed. In fact, there are some packages that are required by Oracle that do not belong to any of the available package groups (i.e. libaio-devel). Not to worry. A complete list of required packages for Oracle Clusterware 10g and Oracle RAC 10g for CentOS 5 will be provided in the next section. These packages will need to be manually installed from the CentOS CDs after the operating system install. For now, install the following package groups:

  • Desktop Environments
    • GNOME Desktop Environment
  • Applications
    • Editors
    • Graphical Internet
    • Text-based Internet
  • Development
    • Development Libraries
    • Development Tools
    • Legacy Software Development
  • Servers
    • Server Configuration Tools
  • Base System
    • Administration Tools
    • Base
    • Java
    • Legacy Software Support
    • System Tools
    • X Window System

In addition to the above packages, select any additional packages you wish to install for this node keeping in mind to NOT de-select any of the "default" RPM packages. After selecting the packages to install click [Next] to continue.

About to Install

This screen is basically a confirmation screen. Click [Next] to start the installation. If you are installing CentOS using CDs, you will be asked to switch CDs during the installation process depending on which packages you selected.

Congratulations

And that's it. You have successfully installed CentOS on the new Oracle RAC server (linux3). The installer will eject the CD/DVD from the CD-ROM drive. Take out the CD/DVD and click [Reboot] to reboot the system.

Post Installation Wizard Welcome Screen

When the system boots into CentOS Linux for the first time, it will prompt you with another Welcome screen for the "Post Installation Wizard". The post installation wizard allows you to make final O/S configuration settings. On the "Welcome" screen, click [Forward] to continue.

Firewall

On this screen, make sure to select the [Disabled] option and click [Forward] to continue.

You will be prompted with a warning dialog about not setting the firewall. When this occurs, click [Yes] to continue.

SELinux

On the SELinux screen, choose the [Disabled] option and click [Forward] to continue.

You will be prompted with a warning dialog indicating that changing the SELinux setting will require rebooting the system so the entire file system can be relabeled. When this occurs, click [Yes] to acknowledge that a reboot of the system will occur after firstboot (Post Installation Wizard) is completed.

Kdump

Accept the default setting on the Kdump screen (disabled) and click [Forward] to continue.

Date and Time Settings

Adjust the date and time settings if necessary and click [Forward] to continue.

Create User

Create any additional (non-oracle) operating system user accounts if desired and click [Forward] to continue. For the purpose of this article, I will not be creating any additional operating system accounts. I will be creating the "oracle" user account during the Oracle database installation later in this guide.

If you chose not to define any additional operating system user accounts, click [Continue] to acknowledge the warning dialog.

Sound Card

This screen will only appear if the wizard detects a sound card. On the sound card screen click [Forward] to continue.

Additional CDs

On the "Additional CDs" screen click [Finish] to continue.

Reboot System

Given we changed the SELinux option (to disabled), we are prompted to reboot the system. Click [OK] to reboot the system for normal use.

Login Screen

After rebooting the machine, you are presented with the login screen. Login using the "root" user account and the password you provided during the installation.



Install Required Linux Packages for Oracle RAC


  Install the following required Linux packages on the new Oracle RAC node!

After installing CentOS Linux, the next step is to verify and install all packages (RPMs) required by both Oracle Clusterware and Oracle RAC.

Although many of the required packages for Oracle were installed in the section Install the Linux Operating System, several will be missing either because they were considered optional within the package group or simply didn't exist in any package group!

The packages installed in this section (or later versions) are required for Oracle Clusterware 10g Release 2 and Oracle RAC 10g Release 2 running on the CentOS 5 or Red Hat Enterprise Linux 5 platform.

32-bit (x86) Installations

Note that the openmotif RPM packages are required to install Oracle demos. This article does not cover the installation of Oracle demos.

The required packages (which are exactly the ones installed by the rpm commands below) can be found on CD #1 through CD #4 of the CentOS 5 (x86) CD set. While it is possible to query each individual package to determine which ones are missing and need to be installed, an easier method is to run the rpm -Uvh PackageName command from the four CDs as follows. For packages that already exist and are up to date, the rpm command will simply ignore the install and print a warning message to the console that the package is already installed.
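
If you would like to check up front which of the required packages are missing, a quick query along the following lines can be used (a sketch; query by package name only, no version):

[root@linux3 ~]# rpm -q binutils compat-libstdc++-296 compat-libstdc++-33 elfutils-libelf \
    elfutils-libelf-devel gcc gcc-c++ glibc glibc-common glibc-devel glibc-headers \
    libaio libaio-devel libgcc libstdc++ libstdc++-devel libXp make openmotif \
    sysstat unixODBC unixODBC-devel | grep "not installed"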

# From CentOS 5.3 (x86)- [CD #1]
mkdir -p /media/cdrom
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh binutils-2.*
rpm -Uvh elfutils-libelf-0.*
rpm -Uvh glibc-2.*
rpm -Uvh glibc-common-2.*
rpm -Uvh libaio-0.*
rpm -Uvh libgcc-4.*
rpm -Uvh libstdc++-4.*
rpm -Uvh make-3.*
cd /
eject


# From CentOS 5.3 (x86) - [CD #2]
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh elfutils-libelf-devel-0.*
rpm -Uvh glibc-devel-2.*
rpm -Uvh glibc-headers-2.*
rpm -Uvh gcc-4.*
rpm -Uvh gcc-c++-4.*
rpm -Uvh libstdc++-devel-4.*
rpm -Uvh unixODBC-2.*
cd /
eject


# From CentOS 5.3 (x86) - [CD #3]
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh compat-libstdc++-296*
rpm -Uvh compat-libstdc++-33*
rpm -Uvh libaio-devel-0.*
rpm -Uvh libXp-1.*
rpm -Uvh openmotif-2.*
rpm -Uvh unixODBC-devel-2.*
cd /
eject


# From CentOS 5.3 (x86) - [CD #4]
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh sysstat-7.*
cd /
eject


64-bit (x86_64) Installations

Note that the openmotif RPM packages are required to install Oracle demos. This article does not cover the installation of Oracle demos.

The required packages (which are exactly the ones installed by the rpm commands below) can be found on CD #1 through CD #4 of the CentOS 5 (x86_64) CD set. While it is possible to query each individual package to determine which ones are missing and need to be installed, an easier method is to run the rpm -Uvh PackageName command from the four CDs as follows. For packages that already exist and are up to date, the rpm command will simply ignore the install and print a warning message to the console that the package is already installed.

# From CentOS 5.3 (x86_64)- [CD #1]
mkdir -p /media/cdrom
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh binutils-2.*
rpm -Uvh elfutils-libelf-0.*
rpm -Uvh glibc-2.*
rpm -Uvh glibc-common-2.*
rpm -Uvh libaio-0.*
rpm -Uvh libgcc-4.*
rpm -Uvh libstdc++-4.*
rpm -Uvh make-3.*
cd /
eject


# From CentOS 5.3 (x86_64) - [CD #2]
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh elfutils-libelf-devel-0.*
rpm -Uvh glibc-devel-2.*
rpm -Uvh glibc-headers-2.*
rpm -Uvh gcc-4.*
rpm -Uvh gcc-c++-4.*
rpm -Uvh libstdc++-devel-4.*
rpm -Uvh unixODBC-2.*
cd /
eject

# From CentOS 5.3 (x86_64) - [CD #3]
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh compat-libstdc++-296*
rpm -Uvh compat-libstdc++-33*
rpm -Uvh libaio-devel-0.*
rpm -Uvh libXp-1.*
rpm -Uvh openmotif-2.*
rpm -Uvh unixODBC-devel-2.*
cd /
eject


# From CentOS 5.3 (x86_64) - [CD #4]
mount -r /dev/cdrom /media/cdrom
cd /media/cdrom/CentOS
rpm -Uvh sysstat-7.*
cd /
eject



Network Configuration


  Perform the following network configuration tasks on the new Oracle RAC node!


Introduction to Network Settings

Although we configured several of the network settings during the installation of CentOS, it is important to not skip this section as it contains critical steps that are required for a successful RAC environment.

During the Linux O/S install we already configured the IP address and host name for the new Oracle RAC node. We now need to configure the /etc/hosts file as well as adjust several of the network settings for the interconnect.

All nodes in the RAC cluster should have one static IP address for the public network and one static IP address for the private cluster interconnect. Do not use DHCP naming for the public IP address or the interconnects; you need static IP addresses! The private interconnect should only be used by Oracle to transfer Cluster Manager and Cache Fusion related data along with data for the network storage server (Openfiler). Note that Oracle does not support using the public network interface for the interconnect. You must have one network interface for the public network and another network interface for the private interconnect. For a production RAC implementation, the interconnect should be at least Gigabit (or more) and only be used by Oracle as well as having the network storage server (Openfiler) on a separate Gigabit network.


Configuring Public and Private Network

With the new Oracle RAC node, we need to configure the network for access to the public network as well as the private interconnect.

The easiest way to configure network settings in Red Hat Linux is with the program Network Configuration. This application can be started from the command-line as the "root" user account as follows:

[root@linux3 ~]# /usr/bin/system-config-network &

  Do not use DHCP naming for the public IP address or the interconnects - we need static IP addresses!

Using the Network Configuration application, you need to configure both NIC devices as well as the /etc/hosts file on all nodes in the RAC cluster. Both of these tasks can be completed using the Network Configuration GUI. Notice that the /etc/hosts settings should be the same for all nodes and that I removed any entry that has to do with IPv6 (for example, ::1 localhost6.localdomain6 localhost6).

Please note that for the purpose of this example configuration the /etc/hosts entries will be the same for all three Oracle RAC nodes (linux1, linux2, and linux3) as well as the network storage server (openfiler1):

Our example configuration will use the following settings for all nodes:

Oracle RAC Node 3 - (linux3)
  Device   IP Address      Subnet          Gateway       Purpose
  eth0     192.168.1.107   255.255.255.0   192.168.1.1   Connects linux3 to the public network
  eth1     192.168.2.107   255.255.255.0                 Connects linux3 (interconnect) to linux1/linux2 (linux1-priv/linux2-priv)
/etc/hosts
127.0.0.1        localhost.localdomain localhost

# Public Network - (eth0)
192.168.1.100    linux1
192.168.1.101    linux2
192.168.1.107    linux3

# Private Interconnect - (eth1)
192.168.2.100    linux1-priv
192.168.2.101    linux2-priv
192.168.2.107    linux3-priv

# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200    linux1-vip
192.168.1.201    linux2-vip
192.168.1.207    linux3-vip

# Private Storage Network for Openfiler - (eth1)
192.168.1.195    openfiler1
192.168.2.195    openfiler1-priv

Oracle RAC Node 2 - (linux2)
  Device   IP Address      Subnet          Gateway       Purpose
  eth0     192.168.1.101   255.255.255.0   192.168.1.1   Connects linux2 to the public network
  eth1     192.168.2.101   255.255.255.0                 Connects linux2 (interconnect) to linux1/linux3 (linux1-priv/linux3-priv)
/etc/hosts
127.0.0.1        localhost.localdomain localhost

# Public Network - (eth0)
192.168.1.100    linux1
192.168.1.101    linux2
192.168.1.107    linux3

# Private Interconnect - (eth1)
192.168.2.100    linux1-priv
192.168.2.101    linux2-priv
192.168.2.107    linux3-priv

# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200    linux1-vip
192.168.1.201    linux2-vip
192.168.1.207    linux3-vip

# Private Storage Network for Openfiler - (eth1)
192.168.1.195    openfiler1
192.168.2.195    openfiler1-priv

Oracle RAC Node 1 - (linux1)
  Device   IP Address      Subnet          Gateway       Purpose
  eth0     192.168.1.100   255.255.255.0   192.168.1.1   Connects linux1 to the public network
  eth1     192.168.2.100   255.255.255.0                 Connects linux1 (interconnect) to linux2/linux3 (linux2-priv/linux3-priv)
/etc/hosts
127.0.0.1        localhost.localdomain localhost

# Public Network - (eth0)
192.168.1.100    linux1
192.168.1.101    linux2
192.168.1.107    linux3

# Private Interconnect - (eth1)
192.168.2.100    linux1-priv
192.168.2.101    linux2-priv
192.168.2.107    linux3-priv

# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200    linux1-vip
192.168.1.201    linux2-vip
192.168.1.207    linux3-vip

# Private Storage Network for Openfiler - (eth1)
192.168.1.195    openfiler1
192.168.2.195    openfiler1-priv


In the screen shots below, only the new Oracle RAC node (linux3) is shown. Ensure that the /etc/hosts file is updated on all participating nodes to access the new Oracle RAC node!
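
Since the /etc/hosts entries are identical for all machines in this configuration, one simple way to keep them in sync is to copy the updated file from linux3 to the other machines (a sketch; this assumes you are comfortable overwriting the existing files as root):

[root@linux3 ~]# scp /etc/hosts root@linux1:/etc/hosts
[root@linux3 ~]# scp /etc/hosts root@linux2:/etc/hosts
[root@linux3 ~]# scp /etc/hosts root@openfiler1:/etc/hosts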



Figure 2: Network Configuration Screen - Node 3 (linux3)



Figure 3: Ethernet Device Screen - eth0 (linux3)



Figure 4: Ethernet Device Screen - eth1 (linux3)



Figure 5: Network Configuration Screen - /etc/hosts (linux3)


Once the network is configured, you can use the ifconfig command to verify everything is working. The following example is from the new Oracle RAC node linux3:

[root@linux3 ~]# /sbin/ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:1E:2A:37:6B:9E
          inet addr:192.168.1.107  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::21e:2aff:fe37:6b9e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1167677 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1842517 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:576629131 (549.9 MiB)  TX bytes:2143836310 (1.9 GiB)
          Interrupt:209 Base address:0xef00

eth1      Link encap:Ethernet  HWaddr 00:0E:0C:C0:78:64
          inet addr:192.168.2.107  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::20e:cff:fec0:7864/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48 errors:0 dropped:0 overruns:0 frame:0
          TX packets:59 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4782 (4.6 KiB)  TX bytes:5564 (5.4 KiB)
          Base address:0xdd80 Memory:fe9c0000-fe9e0000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:2034 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2034 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2386686 (2.2 MiB)  TX bytes:2386686 (2.2 MiB)

sit0      Link encap:IPv6-in-IPv4
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)


Verify Network Access to All Nodes

Verify that the new Oracle RAC node has access to the public and private network for all current nodes. From linux3:
[root@linux3 ~]# ping -c 1 linux1 | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms

[root@linux3 ~]# ping -c 1 linux1-priv | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms

[root@linux3 ~]# ping -c 1 linux2 | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms

[root@linux3 ~]# ping -c 1 linux2-priv | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms

[root@linux3 ~]# ping -c 1 openfiler1 | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms

[root@linux3 ~]# ping -c 1 openfiler1-priv | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms


Confirm the RAC Node Name is Not Listed in Loopback Address

Ensure that the new Oracle RAC node (linux3) is not included for the loopback address in the /etc/hosts file. If the machine name is listed in the loopback address entry as shown below:

    127.0.0.1        linux3 localhost.localdomain localhost
it will need to be removed as shown below:
    127.0.0.1        localhost.localdomain localhost
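
A quick way to verify the loopback entry on the new node (the node name linux3 should not appear in the output):

[root@linux3 ~]# grep "^127\.0\.0\.1" /etc/hosts
127.0.0.1        localhost.localdomain localhost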

 

If the RAC node name is listed for the loopback address, you will receive the following error during the RAC installation:

ORA-00603: ORACLE server session terminated by fatal error
or
ORA-29702: error occurred in Cluster Group Service operation


Confirm localhost is defined in the /etc/hosts file for the loopback address

Ensure that entries for localhost.localdomain and localhost are included for the loopback address in the /etc/hosts file on the new Oracle RAC node:

    127.0.0.1        localhost.localdomain localhost

 

If an entry does not exist for localhost in the /etc/hosts file, Oracle Clusterware will be unable to start the application resources — notably the ONS process. The error would indicate "Failed to get IP for localhost" and will be written to the log file for ONS. For example:

CRS-0215 could not start resource 'ora.linux3.ons'. Check log file
"/u01/app/crs/log/linux3/racg/ora.linux3.ons.log"
for more details.

The ONS log file will contain lines similar to the following:

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2007-04-14 13:10:02.729: [ RACG][3086871296][13316][3086871296][ora.linux3.ons]: Failed to get IP for localhost (1)
Failed to get IP for localhost (1)
Failed to get IP for localhost (1)
onsctl: ons failed to start
...


Adjusting Network Settings

With Oracle 9.2.0.1 and later, Oracle makes use of UDP as the default protocol on Linux for inter-process communication (IPC), such as Cache Fusion and Cluster Manager buffer transfers between instances within the RAC cluster.

Oracle strongly suggests adjusting the default and maximum receive buffer size (SO_RCVBUF socket option) to 1024KB and the default and maximum send buffer size (SO_SNDBUF socket option) to 256KB.

The receive buffers are used by TCP and UDP to hold received data until it is read by the application. With TCP, the receive buffer cannot overflow because the peer is not allowed to send data beyond the advertised window. UDP, however, has no such flow control: datagrams that do not fit in the socket receive buffer are discarded, which allows a fast sender to overwhelm a receiver whose buffers are too small.

 

The default and maximum window size can be changed in the /proc file system without reboot:

[root@linux3 ~]# sysctl -w net.core.rmem_default=1048576
net.core.rmem_default = 1048576

[root@linux3 ~]# sysctl -w net.core.rmem_max=1048576
net.core.rmem_max = 1048576

[root@linux3 ~]# sysctl -w net.core.wmem_default=262144
net.core.wmem_default = 262144

[root@linux3 ~]# sysctl -w net.core.wmem_max=262144
net.core.wmem_max = 262144

The above commands made the changes to the already running OS. You should now make the above changes permanent (for each reboot) by adding the following lines to the /etc/sysctl.conf file on the new Oracle RAC node:

# +---------------------------------------------------------+
# | ADJUSTING NETWORK SETTINGS                              |
# +---------------------------------------------------------+
# | With Oracle 9.2.0.1 and onwards, Oracle now makes use   |
# | of UDP as the default protocol on Linux for             |
# | inter-process communication (IPC), such as Cache Fusion |
# | and Cluster Manager buffer transfers between instances  |
# | within the RAC cluster. Oracle strongly suggests to     |
# | adjust the default and maximum receive buffer size      |
# | (SO_RCVBUF socket option) to 1024KB, and the default    |
# | and maximum send buffer size (SO_SNDBUF socket option)  |
# | to 256KB. The receive buffers are used by TCP and UDP   |
# | to hold received data until it is read by the           |
# | application. The receive buffer cannot overflow because |
# | the peer is not allowed to send data beyond the buffer  |
# | size window. This means that datagrams will be          |
# | discarded if they don't fit in the socket receive       |
# | buffer. This could cause the sender to overwhelm the    |
# | receiver.                                               |
# +---------------------------------------------------------+

# +---------------------------------------------------------+
# | Default setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option.  |
# +---------------------------------------------------------+
net.core.rmem_default=1048576

# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option.  |
# +---------------------------------------------------------+
net.core.rmem_max=1048576

# +---------------------------------------------------------+
# | Default setting in bytes of the socket "send" buffer    |
# | which may be set by using the SO_SNDBUF socket option.  |
# +---------------------------------------------------------+
net.core.wmem_default=262144

# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "send" buffer    |
# | which may be set by using the SO_SNDBUF socket option.  |
# +---------------------------------------------------------+
net.core.wmem_max=262144
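
After adding these entries, the values in /etc/sysctl.conf can be loaded into the running kernel (and verified) without a reboot by running, for example:

[root@linux3 ~]# sysctl -p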


Check and turn off UDP ICMP rejections:

During the Linux installation process, I indicated to not configure the firewall option. By default the option to configure a firewall is selected by the installer. This has burned me several times so I like to do a double-check that the firewall option is not configured and to ensure udp ICMP filtering is turned off.

If UDP ICMP is blocked or rejected by the firewall, the Oracle Clusterware software will crash after several minutes of running. When the Oracle Clusterware process fails, you will have something similar to the following in the <machine_name>_evmocr.log file:

08/29/2005 22:17:19
oac_init:2: Could not connect to server, clsc retcode = 9
08/29/2005 22:17:19
a_init:12!: Client init unsuccessful : [32]
ibctx:1:ERROR: INVALID FORMAT
proprinit:problem reading the bootblock or superbloc 22

When experiencing this type of error, the solution is to remove the udp ICMP (iptables) rejection rule - or to simply have the firewall option turned off. The Oracle Clusterware software will then start to operate normally and not crash. The following commands should be executed as the root user account:

  1. Check to ensure that the firewall option is turned off. If the firewall option is stopped (like it is in my example below) you do not have to proceed with the following steps.
    [root@linux3 ~]# /etc/rc.d/init.d/iptables status
    Firewall is stopped.
  2. If the firewall option is operating you will need to first manually disable UDP ICMP rejections:
    [root@linux3 ~]# /etc/rc.d/init.d/iptables stop
    
    Flushing firewall rules: [  OK  ]
    Setting chains to policy ACCEPT: filter [  OK  ]
    Unloading iptables modules: [  OK  ]
  3. Then, to turn UDP ICMP rejections off for the next server reboot (which should always be turned off):
    [root@linux3 ~]# chkconfig iptables off 
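
To confirm the firewall will remain off across reboots, the runlevel configuration can also be checked; all runlevels should report "off" (for example):

[root@linux3 ~]# chkconfig --list iptables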



Configure Network Security on the Openfiler Storage Server


  Perform the following configuration tasks on the network storage server (openfiler1)!

With the network now setup, the next step is to configure network access in Openfiler so that the new Oracle RAC node (linux3) has permissions to the shared iSCSI volumes used in the current Oracle RAC 10g environment. For the purpose of this example, all iSCSI traffic will use the private network interface eth1 which in this article is on the 192.168.2.0 network.

Openfiler administration is performed using the Openfiler Storage Control Center — a browser based tool over an https connection on port 446. For example:

https://openfiler1.idevelopment.info:446/

From the Openfiler Storage Control Center home page, log in as an administrator. The default administration login credentials for Openfiler are username "openfiler" and password "password".

The first page the administrator sees is the [Status] / [System Information] screen.

Services

This article assumes that the current Oracle RAC 10g environment is operational and therefore the iSCSI services should already be enabled within Openfiler.

To verify the iSCSI services are running, use the Openfiler Storage Control Center and navigate to [Services] / [Manage Services]:


Figure 6: Verify iSCSI Services are Enabled

Another method is to SSH into the Openfiler server and verify the iscsi-target service is running:

[root@openfiler1 ~]# service iscsi-target status
ietd (pid 3961) is running...


Network Access Configuration

The next step is to configure network access in Openfiler to identify the new Oracle RAC node (linux3) which will need to access the iSCSI volumes used in the current Oracle RAC 10g environment. Note that this step does not actually grant the appropriate permissions to the iSCSI volumes required by the new Oracle RAC node. That will be accomplished later in this section by updating the ACL for each of the current iSCSI logical volumes.

As in the previous section, configuring network access is accomplished using the Openfiler Storage Control Center by navigating to [System] / [Network Setup]. The "Network Access Configuration" section (at the bottom of the page) allows an administrator to set up networks and/or hosts that will be allowed to access resources exported by the Openfiler appliance. For the purpose of this article, we will want to add the new Oracle RAC node individually rather than allowing the entire 192.168.2.0 network to have access to Openfiler resources.

When entering the new Oracle RAC node, note that the 'Name' field is just a logical name used for reference only. As a convention when entering nodes, I simply use the node name defined for that IP address. Next, when entering the actual node in the 'Network/Host' field, always use its IP address even though its host name may already be defined in your /etc/hosts file or DNS. Lastly, when entering actual hosts in our Class C network, use a subnet mask of 255.255.255.255.

It is important to remember that you will be entering the IP address of the private network (eth1) for the new Oracle RAC node.

The following image shows the results of adding the new Oracle RAC node linux3 to the local network configuration:


Figure 7: Configure Openfiler Network Access for new Oracle RAC Node


Current Logical iSCSI Volumes

The current Openfiler configuration contains five logical iSCSI volumes in a single volume group named rac1.

iSCSI / Logical Volumes
  Volume Name   Volume Description            Required Space (MB)   Filesystem Type
  racdb-crs     racdb - Oracle Clusterware    2,048                 iSCSI
  racdb-asm1    racdb - ASM Volume 1          16,984                iSCSI
  racdb-asm2    racdb - ASM Volume 2          16,984                iSCSI
  racdb-asm3    racdb - ASM Volume 3          16,984                iSCSI
  racdb-asm4    racdb - ASM Volume 4          16,984                iSCSI

To view the available iSCSI volumes from within the Openfiler Storage Control Center, navigate to [Volumes] / [Manage Volumes]. There we will see all five logical volumes within the volume group rac1:


Figure 8: Current Logical (iSCSI) Volumes


Update Network ACL

Before the new Oracle RAC node can have access to the current iSCSI targets, it needs to be granted the appropriate permissions. Earlier in this section, we configured network access in Openfiler for the new Oracle RAC node, which will access the current iSCSI targets through the storage (private) network. We now grant the new Oracle RAC node access to each of the current iSCSI targets.

The current Openfiler configuration contains five iSCSI targets, each identified by a Target IQN, to which the iSCSI logical volumes listed in the previous section are mapped:

iSCSI Target / Logical Volume Mappings
  Target IQN                              iSCSI Volume Name   Volume Description
  iqn.2006-01.com.openfiler:racdb.crs     racdb-crs           racdb - Oracle Clusterware
  iqn.2006-01.com.openfiler:racdb.asm1    racdb-asm1          racdb - ASM Volume 1
  iqn.2006-01.com.openfiler:racdb.asm2    racdb-asm2          racdb - ASM Volume 2
  iqn.2006-01.com.openfiler:racdb.asm3    racdb-asm3          racdb - ASM Volume 3
  iqn.2006-01.com.openfiler:racdb.asm4    racdb-asm4          racdb - ASM Volume 4

To start the process of allowing access to the current iSCSI logical volumes from the new Oracle RAC node, navigate to [Volumes] / [iSCSI Targets]. Then click on the grey sub-tab named "Target Configuration". On this page is a section named "Select iSCSI Target" where users can select which of the iSCSI targets to display and/or edit:


Figure 9: Select iSCSI Target

Select the first iSCSI target (which in this example is iqn.2006-01.com.openfiler:racdb.crs) and click the [Change] button. Next, click the grey sub-tab named "Network ACL" (next to "LUN Mapping" sub-tab). As you can see, all nodes in the cluster will have their "Access" set to 'Allow' with the exception of the new Oracle RAC node. For the current iSCSI target, change the "Access" for the new Oracle RAC node from 'Deny' to 'Allow' and click the 'Update' button:


Figure 10: Update Network ACL

After updating the Network ACL for the first iSCSI Target, click on the grey sub-tab named "Target Configuration" and select the next iSCSI target in the "Select iSCSI Target" section. Click the [Change] button and continue to update the Network ACL for this iSCSI target (changing the "Access" for the new Oracle RAC node from 'Deny' to 'Allow' under the "Network ACL" grey sub-tab). Continue this process until access has been granted to the new Oracle RAC node to all five iSCSI targets.



Configure the iSCSI Initiator


  Configure the iSCSI initiator on the new Oracle RAC node!

An iSCSI client can be any system (Linux, Unix, MS Windows, Apple Mac, etc.) for which iSCSI support (a driver) is available. In our case, the clients are the three Oracle RAC nodes (linux1, linux2, and linux3), running CentOS 5.

In this section we will be configuring the iSCSI software initiator on the new Oracle RAC node linux3. CentOS 5.3 includes the Open-iSCSI iSCSI software initiator which can be found in the iscsi-initiator-utils RPM. This is a change from previous versions of CentOS (4.x) which included the Linux iscsi-sfnet software driver developed as part of the Linux-iSCSI Project. All iSCSI management tasks like discovery and logins will use the command-line interface iscsiadm which is included with Open-iSCSI.

The iSCSI software initiator will be configured to automatically login to the network storage server (openfiler1) and discover the iSCSI volumes listed in the previous section. We will then go through the steps of creating persistent local SCSI device names (i.e. /dev/iscsi/asm1) for each of the iSCSI target names discovered using udev. Having a consistent local SCSI device name, and knowing which iSCSI target it maps to, is required in order to know which volume (device) is to be used for OCFS2 and which volumes belong to ASM. Before we can do any of this, however, we must first install the iSCSI initiator software on the new Oracle RAC node.


Installing the iSCSI (initiator) service

With CentOS 5.3, the Open-iSCSI iSCSI software initiator does not get installed by default. The software is included in the iscsi-initiator-utils package which can be found on CD #1. To determine if this package is installed (which in most cases, it will not be), perform the following on the new Oracle RAC node:

[root@linux3 ~]# rpm -qa | grep iscsi-initiator-utils

If the iscsi-initiator-utils package is not installed, load CD #1 into the new Oracle RAC node and perform the following:

[root@linux3 ~]# mount -r /dev/cdrom /media/cdrom
[root@linux3 ~]# cd /media/cdrom/CentOS
[root@linux3 ~]# rpm -Uvh iscsi-initiator-utils-6.2.0.868-0.18.el5.i386.rpm
[root@linux3 ~]# cd /
[root@linux3 ~]# eject


Configure the iSCSI (initiator) service

After verifying that the iscsi-initiator-utils package is installed on the new Oracle RAC node, start the iscsid service and enable it to automatically start when the system boots. We will also configure the iscsi service to start automatically, which logs in to the iSCSI targets needed at system startup.

[root@linux3 ~]# service iscsid start
Turning off network shutdown. Starting iSCSI daemon: [  OK  ]
[  OK  ]

[root@linux3 ~]# chkconfig iscsid on
[root@linux3 ~]# chkconfig iscsi on

Now that the iSCSI service is started, use the iscsiadm command-line interface to discover all available targets on the network storage server:

[root@linux3 ~]# iscsiadm -m discovery -t sendtargets -p openfiler1-priv
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm1
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm2
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm3
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.asm4
192.168.2.195:3260,1 iqn.2006-01.com.openfiler:racdb.crs


Manually Login to iSCSI Targets

At this point the iSCSI initiator service has been started and the new Oracle RAC node was able to discover the available targets from the network storage server. The next step is to manually login to each of the available targets which can be done using the iscsiadm command-line interface. Note that I had to specify the IP address and not the host name of the network storage server (openfiler1-priv) - I believe this is required given the discovery (above) shows the targets using the IP address.

[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm1 -p 192.168.2.195 -l
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm2 -p 192.168.2.195 -l
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm3 -p 192.168.2.195 -l
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm4 -p 192.168.2.195 -l
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.crs -p 192.168.2.195 -l
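
To confirm that all five sessions were established, list the active iSCSI sessions; each of the five target IQNs from the discovery above should appear in the output (for example):

[root@linux3 ~]# iscsiadm -m session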


Configure Automatic Login

The next step is to ensure the client will automatically login to each of the targets listed above when the machine is booted (or the iSCSI initiator service is started/restarted):

[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm1 -p 192.168.2.195 --op update -n node.startup -v automatic
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm2 -p 192.168.2.195 --op update -n node.startup -v automatic
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm3 -p 192.168.2.195 --op update -n node.startup -v automatic
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.asm4 -p 192.168.2.195 --op update -n node.startup -v automatic
[root@linux3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.crs -p 192.168.2.195 --op update -n node.startup -v automatic


Create Persistent Local SCSI Device Names

In this section, we will go through the steps to create persistent local SCSI device names for each of the iSCSI target names. This will be done using udev. Having a consistent local SCSI device name, and knowing which iSCSI target it maps to, is required in order to know which volume (device) is to be used for OCFS2 and which volumes belong to ASM.

When any of the Oracle RAC nodes boot and the iSCSI initiator service is started, it will automatically log in to each of the configured targets in a random fashion and map them to the next available local SCSI device name. For example, the target iqn.2006-01.com.openfiler:racdb.asm1 may get mapped to /dev/sda. I can actually determine the current mappings for all targets by looking at the /dev/disk/by-path directory:

[root@linux3 ~]# (cd /dev/disk/by-path; ls -l *openfiler* | awk '{FS=" "; print $9 " " $10 " " $11}')
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm1-lun-0 -> ../../sda
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm1-lun-0-part1 -> ../../sda1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm2-lun-0 -> ../../sdb
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm2-lun-0-part1 -> ../../sdb1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm3-lun-0 -> ../../sdc
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm3-lun-0-part1 -> ../../sdc1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm4-lun-0 -> ../../sdd
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.asm4-lun-0-part1 -> ../../sdd1
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs-lun-0 -> ../../sde
ip-192.168.2.195:3260-iscsi-iqn.2006-01.com.openfiler:racdb.crs-lun-0-part1 -> ../../sde1

Using the output from the above listing, we can establish the following current mappings:

Current iSCSI Target Name to Local SCSI Device Name Mappings

  iSCSI Target Name                      SCSI Device Name
  iqn.2006-01.com.openfiler:racdb.asm1   /dev/sda
  iqn.2006-01.com.openfiler:racdb.asm2   /dev/sdb
  iqn.2006-01.com.openfiler:racdb.asm3   /dev/sdc
  iqn.2006-01.com.openfiler:racdb.asm4   /dev/sdd
  iqn.2006-01.com.openfiler:racdb.crs    /dev/sde

This mapping, however, may change every time the Oracle RAC node is rebooted. For example, after a reboot it may be determined that the iSCSI target iqn.2006-01.com.openfiler:racdb.asm1 gets mapped to the local SCSI device /dev/sdd. It is therefore impractical to rely on using the local SCSI device name given there is no way to predict the iSCSI target mappings after a reboot.

What we need is a consistent device name we can reference (i.e. /dev/iscsi/asm1) that will always point to the appropriate iSCSI target through reboots. This is where the Dynamic Device Management tool named udev comes in. udev provides a dynamic device directory using symbolic links that point to the actual device using a configurable set of rules. When udev receives a device event (for example, the client logging in to an iSCSI target), it matches its configured rules against the available device attributes provided in sysfs to identify the device. Rules that match may provide additional device information or specify a device node name and multiple symlink names and instruct udev to run additional programs (a SHELL script for example) as part of the device event handling process.

The first step is to create a new rules file. The file will be named /etc/udev/rules.d/55-openiscsi.rules and will contain a single rule (a set of name=value pairs) that matches the device events we are interested in. It also defines a call-out SHELL script (/etc/udev/scripts/iscsidev.sh) to handle the event.

Create the following rules file /etc/udev/rules.d/55-openiscsi.rules on the new Oracle RAC node:

/etc/udev/rules.d/55-openiscsi.rules
# /etc/udev/rules.d/55-openiscsi.rules
KERNEL=="sd*", BUS=="scsi", PROGRAM="/etc/udev/scripts/iscsidev.sh %b",SYMLINK+="iscsi/%c/part%n"

We now need to create the UNIX SHELL script that will be called when this event is received. Let's first create a separate directory on the new Oracle RAC node where udev scripts can be stored:

[root@linux3 ~]# mkdir -p /etc/udev/scripts

Next, create the UNIX shell script /etc/udev/scripts/iscsidev.sh on the new Oracle RAC node:

/etc/udev/scripts/iscsidev.sh
#!/bin/sh

# FILE: /etc/udev/scripts/iscsidev.sh

BUS=${1}
HOST=${BUS%%:*}

[ -e /sys/class/iscsi_host ] || exit 1

file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"

target_name=$(cat ${file})

# This is not an Open-iSCSI device
if [ -z "${target_name}" ]; then
   exit 1
fi

# Check if QNAP drive
check_qnap_target_name=${target_name%%:*}
if [ $check_qnap_target_name = "iqn.2004-04.com.qnap" ]; then
    target_name=`echo "${target_name%.*}"`
fi

echo "${target_name##*.}"

After creating the UNIX SHELL script, make it executable:

[root@linux3 ~]# chmod 755 /etc/udev/scripts/iscsidev.sh

Now that udev is configured, restart the iSCSI service on the new Oracle RAC node:

[root@linux3 ~]# service iscsi stop
Logging out of session [sid: 1, target: iqn.2006-01.com.openfiler:racdb.asm1, portal: 192.168.2.195,3260]
Logging out of session [sid: 2, target: iqn.2006-01.com.openfiler:racdb.asm2, portal: 192.168.2.195,3260]
Logging out of session [sid: 3, target: iqn.2006-01.com.openfiler:racdb.asm3, portal: 192.168.2.195,3260]
Logging out of session [sid: 4, target: iqn.2006-01.com.openfiler:racdb.asm4, portal: 192.168.2.195,3260]
Logging out of session [sid: 5, target: iqn.2006-01.com.openfiler:racdb.crs, portal: 192.168.2.195,3260]
Logout of [sid: 1, target: iqn.2006-01.com.openfiler:racdb.asm1, portal: 192.168.2.195,3260]: successful
Logout of [sid: 2, target: iqn.2006-01.com.openfiler:racdb.asm2, portal: 192.168.2.195,3260]: successful
Logout of [sid: 3, target: iqn.2006-01.com.openfiler:racdb.asm3, portal: 192.168.2.195,3260]: successful
Logout of [sid: 4, target: iqn.2006-01.com.openfiler:racdb.asm4, portal: 192.168.2.195,3260]: successful
Logout of [sid: 5, target: iqn.2006-01.com.openfiler:racdb.crs, portal: 192.168.2.195,3260]: successful
Stopping iSCSI daemon:

[root@linux3 ~]# service iscsi start
iscsid dead but pid file exists
Turning off network shutdown. Starting iSCSI daemon: [  OK  ]
[  OK  ]
Setting up iSCSI targets: Logging in to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm2, portal: 192.168.2.195,3260]
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm1, portal: 192.168.2.195,3260]
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:racdb.crs, portal: 192.168.2.195,3260]
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm4, portal: 192.168.2.195,3260]
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm3, portal: 192.168.2.195,3260]
Login to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm2, portal: 192.168.2.195,3260]: successful
Login to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm1, portal: 192.168.2.195,3260]: successful
Login to [iface: default, target: iqn.2006-01.com.openfiler:racdb.crs, portal: 192.168.2.195,3260]: successful
Login to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm4, portal: 192.168.2.195,3260]: successful
Login to [iface: default, target: iqn.2006-01.com.openfiler:racdb.asm3, portal: 192.168.2.195,3260]: successful
[  OK  ]

Let's see if our hard work paid off:

[root@linux3 ~]# ls -l /dev/iscsi/*
/dev/iscsi/asm1:
total 0
lrwxrwxrwx 1 root root  9 Sep  2 22:29 part -> ../../sda
lrwxrwxrwx 1 root root 10 Sep  2 22:29 part1 -> ../../sda1

/dev/iscsi/asm2:
total 0
lrwxrwxrwx 1 root root  9 Sep  2 22:29 part -> ../../sdc
lrwxrwxrwx 1 root root 10 Sep  2 22:29 part1 -> ../../sdc1

/dev/iscsi/asm3:
total 0
lrwxrwxrwx 1 root root  9 Sep  2 22:29 part -> ../../sdb
lrwxrwxrwx 1 root root 10 Sep  2 22:29 part1 -> ../../sdb1

/dev/iscsi/asm4:
total 0
lrwxrwxrwx 1 root root  9 Sep  2 22:29 part -> ../../sde
lrwxrwxrwx 1 root root 10 Sep  2 22:29 part1 -> ../../sde1

/dev/iscsi/crs:
total 0
lrwxrwxrwx 1 root root  9 Sep  2 22:29 part -> ../../sdd
lrwxrwxrwx 1 root root 10 Sep  2 22:29 part1 -> ../../sdd1

The listing above shows that udev did the job it was supposed to do! We now have a consistent set of local device names that can be used to reference the iSCSI targets. For example, we can safely assume that the device name /dev/iscsi/asm1/part will always reference the iSCSI target iqn.2006-01.com.openfiler:racdb.asm1. The resulting iSCSI target name to local device name mapping is described in the following table:

iSCSI Target Name to Local Device Name Mappings

  iSCSI Target Name                      Local Device Name
  iqn.2006-01.com.openfiler:racdb.asm1   /dev/iscsi/asm1/part
  iqn.2006-01.com.openfiler:racdb.asm2   /dev/iscsi/asm2/part
  iqn.2006-01.com.openfiler:racdb.asm3   /dev/iscsi/asm3/part
  iqn.2006-01.com.openfiler:racdb.asm4   /dev/iscsi/asm4/part
  iqn.2006-01.com.openfiler:racdb.crs    /dev/iscsi/crs/part
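
To double-check which physical SCSI device each udev-managed name currently points to, the symbolic links can be resolved with readlink. This is just a quick sanity test; the /dev/sdX values shown are taken from the listing above and will likely differ after each reboot:

[root@linux3 ~]# for link in /dev/iscsi/*/part1; do
>     echo "${link} -> $(readlink -f ${link})"
> done
/dev/iscsi/asm1/part1 -> /dev/sda1
/dev/iscsi/asm2/part1 -> /dev/sdc1
/dev/iscsi/asm3/part1 -> /dev/sdb1
/dev/iscsi/asm4/part1 -> /dev/sde1
/dev/iscsi/crs/part1 -> /dev/sdd1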



Create "oracle" User and Directories


  Perform the following tasks on the new Oracle RAC node!

In this section we will create the oracle UNIX user account, recommended O/S groups, and all required directories. The following O/S groups will be created:

  Description                           Oracle Privilege   Oracle Group Name   UNIX Group Name
  Oracle Inventory and Software Owner                                          oinstall
  Database Administrator                SYSDBA             OSDBA               dba
  Database Operator                     SYSOPER            OSOPER              oper

We will be using the Oracle Cluster File System, Release 2 (OCFS2) to store the files required to be shared for the Oracle Clusterware software. When using OCFS2, the UID of the UNIX user "oracle" and GID of the UNIX group "oinstall" must be the same on all Oracle RAC nodes in the cluster. If either the UID or GID is different, the files on the OCFS2 file system will show up as "unowned" or may even be owned by a different user. For this example, I will use 501 for the "oracle" UID and 501 for the "oinstall" GID.
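
Before creating the account on linux3, it is worth confirming the UID and GID actually in use on the existing RAC nodes so the same values can be reused. The output below assumes the 501/501 values used throughout this article:

[root@linux1 ~]# id oracle
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper)

[root@linux2 ~]# id oracle
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper)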

Note that members of the UNIX group oinstall are considered the "owners" of the Oracle software. Members of the dba group can administer Oracle databases, for example starting up and shutting down databases. Members of the optional group oper have a limited set of database administrative privileges such as managing and running backups. The default name for this group is oper. To use this group, choose the "Custom" installation type to install the Oracle database software. In this article, we are creating the oracle user account to have all responsibilities!

  This guide adheres to the Optimal Flexible Architecture (OFA) for naming conventions used in creating the directory structure.


Create Group and User for Oracle

Let's start this section by creating the UNIX oinstall, dba, and oper groups and the oracle user account:

[root@linux3 ~]# groupadd -g 501 oinstall
[root@linux3 ~]# groupadd -g 502 dba
[root@linux3 ~]# groupadd -g 503 oper
[root@linux3 ~]# useradd -m -u 501 -g oinstall -G dba,oper -d /home/oracle -s /bin/bash -c "Oracle Software Owner" oracle

[root@linux3 ~]# id oracle
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper)

Set the password for the oracle account:

[root@linux3 ~]# passwd oracle
Changing password for user oracle.
New UNIX password: xxxxxxxxxxx
Retype new UNIX password: xxxxxxxxxxx
passwd: all authentication tokens updated successfully.


Verify That the User nobody Exists

Before installing the Oracle software, complete the following procedure to verify that the user nobody exists on the system:

  1. To determine if the user exists, enter the following command:
    [root@linux3 ~]# id nobody
    uid=99(nobody) gid=99(nobody) groups=99(nobody)
    If this command displays information about the nobody user, then you do not have to create that user.

  2. If the user nobody does not exist, then enter the following command to create it:
    [root@linux3 ~]# /usr/sbin/useradd nobody


Create the Oracle Base Directory

The next step is to create a new directory that will be used to store the Oracle Database software. When configuring the oracle user's environment (later in this section) we will be assigning the location of this directory to the $ORACLE_BASE environment variable.

The following assumes that the directories are being created in the root file system. Please note that this is being done for the sake of simplicity and is not recommended as a general practice. Normally, these directories would be created on a separate file system.

After the directory is created, you must then specify the correct owner, group, and permissions for it. Perform the following on the new Oracle RAC node:

[root@linux3 ~]# mkdir -p /u01/app/oracle
[root@linux3 ~]# chown -R oracle:oinstall /u01/app/oracle
[root@linux3 ~]# chmod -R 775 /u01/app/oracle

At the end of this procedure, you will have the /u01/app/oracle directory (the Oracle base directory) owned by oracle:oinstall with 775 permissions.


Create the Oracle Clusterware Home Directory

Next, create a new directory that will be used to store the Oracle Clusterware software. When configuring the oracle user's environment (later in this section) we will be assigning the location of this directory to the $ORA_CRS_HOME environment variable.

As noted in the previous section, the following assumes that the directories are being created in the root file system. This is being done for the sake of simplicity and is not recommended as a general practice. Normally, these directories would be created on a separate file system.

After the directory is created, you must then specify the correct owner, group, and permissions for it. Perform the following on the new Oracle RAC node:

[root@linux3 ~]# mkdir -p /u01/app/crs
[root@linux3 ~]# chown -R oracle:oinstall /u01/app/crs
[root@linux3 ~]# chmod -R 775 /u01/app/crs

At the end of this procedure, you will have the /u01/app/crs directory (the Oracle Clusterware home) owned by oracle:oinstall with 775 permissions.


Create Mount Point for OCFS2 / Clusterware

Let's now create the mount point for the Oracle Cluster File System, Release 2 (OCFS2) that will be used to store the two Oracle Clusterware shared files.

Perform the following on the new Oracle RAC node:

[root@linux3 ~]# mkdir -p /u02
[root@linux3 ~]# chown -R oracle:oinstall /u02
[root@linux3 ~]# chmod -R 775 /u02


Create Login Script for oracle User Account

To ensure that the environment is setup correctly for the "oracle" UNIX userid on the new Oracle RAC node, use the following .bash_profile:

  When you are setting the Oracle environment variables for each Oracle RAC node, ensure to assign each RAC node a unique Oracle SID! For this example, I used:


  • linux1 : ORACLE_SID=racdb1
  • linux2 : ORACLE_SID=racdb2
  • linux3 : ORACLE_SID=racdb3

Login to the new Oracle RAC node as the oracle user account:

[root@linux3 ~]# su - oracle
.bash_profile for Oracle User
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
      . ~/.bashrc
fi

alias ls="ls -FA"
alias s="screen -DRRS iPad -t iPad"

export JAVA_HOME=/usr/local/java

# User specific environment and startup programs
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=/u01/app/crs
export ORACLE_PATH=$ORACLE_BASE/dba_scripts/sql:.:$ORACLE_HOME/rdbms/admin
export CV_JDKHOME=/usr/local/java

# Each RAC node must have a unique ORACLE_SID. (i.e. racdb1, racdb2, racdb3...)
export ORACLE_SID=racdb3

export PATH=.:${JAVA_HOME}/bin:$JAVA_HOME/db/bin:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export PATH=${PATH}:$ORACLE_BASE/dba_scripts/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS10=$ORACLE_HOME/nls/data
export NLS_DATE_FORMAT="DD-MON-YYYY HH24:MI:SS"
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/oc4j/ant/lib/ant.jar
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/oc4j/ant/lib/ant-launcher.jar
export CLASSPATH=${CLASSPATH}:$JAVA_HOME/db/lib/derby.jar
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
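
After logging in as oracle with the new .bash_profile in place, a quick spot check of a few key variables confirms the environment was picked up. The values shown simply reflect the profile above:

[oracle@linux3 ~]$ echo $ORACLE_SID
racdb3
[oracle@linux3 ~]$ echo $ORACLE_BASE
/u01/app/oracle
[oracle@linux3 ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/10.2.0/db_1
[oracle@linux3 ~]$ echo $ORA_CRS_HOME
/u01/app/crs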



Configure the Linux Server for Oracle


  Perform the following tasks on the new Oracle RAC node!

 

The kernel parameters and shell limits discussed in this section will need to be defined on the new Oracle RAC node every time the machine is booted. This section provides information about setting those kernel parameters required for Oracle. Instructions for placing them in a startup script (/etc/sysctl.conf) are included in the section "All Startup Commands for New Oracle RAC Node".


Overview

This section focuses on configuring the new Oracle RAC Linux server - getting it prepared for the Oracle RAC 10g installation. This includes verifying enough swap space, setting shared memory and semaphores, setting the maximum number of file handles, setting the IP local port range, setting shell limits for the oracle user, activating all kernel parameters for the system, and finally verifying the correct date and time on all nodes in the cluster.

There are several different ways to configure (set) these parameters. For the purpose of this article, I will be making all changes permanent (through reboots) by placing all commands in the /etc/sysctl.conf file.


Swap Space Considerations

Installing Oracle Database 10g Release 2 on RHEL/OL 5 requires a minimum of 1024MB of memory. (Note: An inadequate amount of swap during the installation will cause the Oracle Universal Installer to either "hang" or "die")

To check the amount of memory you have, type:

[root@linux3 ~]# cat /proc/meminfo | grep MemTotal
MemTotal: 2074068 kB

To check the amount of swap you have allocated, type:

[root@linux3 ~]# cat /proc/meminfo | grep SwapTotal
SwapTotal: 4128760 kB

 

If you have less than 2048MB of memory (between your RAM and SWAP), you can add temporary swap space by creating a temporary swap file. This way you do not have to use a raw device or even more drastic, rebuild your system.

As root, make a file that will act as additional swap space, let's say about 500MB:

[root@linux3 ~]# dd if=/dev/zero of=tempswap bs=1k count=500000

Now we should change the file permissions:

[root@linux3 ~]# chmod 600 tempswap

Finally, format the file as swap and add it to the swap space (there is no need to create a file system on it first):

[root@linux3 ~]# mkswap tempswap
[root@linux3 ~]# swapon tempswap
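
To confirm the additional swap space is active, list the swap areas currently in use; the tempswap file should now appear alongside the existing swap partition:

[root@linux3 ~]# swapon -s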


Configuring Kernel Parameters and Shell Limits

The kernel parameters and shell limits presented in this section are recommended values only as documented by Oracle. For production database systems, Oracle recommends that you tune these values to optimize the performance of the system.

On the new Oracle RAC node, verify that the kernel parameters described in this section are set to values greater than or equal to the recommended values. Also note that when setting the four semaphore values that all four values need to be entered on one line.


Configuring Kernel Parameters

Oracle Database 10g Release 2 on RHEL/OL 5 requires the kernel parameter settings shown below. The values given are minimums, so if your system uses a larger value, do not change it:

kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default=1048576
net.core.rmem_max=1048576
net.core.wmem_default=262144
net.core.wmem_max=262144

RHEL/OL 5 already comes configured with default values defined for the following kernel parameters:

kernel.shmall
kernel.shmmax

Use the default values if they are the same or larger than the required values.

This article assumes a fresh new install of CentOS 5 and as such, many of the required kernel parameters are already set (see above). This being the case, you can simply copy / paste the following to the new Oracle RAC node while logged in as root:

[root@linux3 ~]# cat >> /etc/sysctl.conf <<EOF
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
EOF

The above command persisted the required kernel parameters through reboots by inserting them in the /etc/sysctl.conf startup file. Linux allows modification of these kernel parameters to the current system while it is up and running, so there's no need to reboot the system after making kernel parameter changes. To activate the new kernel parameter values for the currently running system, run the following as root on the new Oracle RAC node:

[root@linux3 ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 4294967295
kernel.shmall = 268435456
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000

Verify the new kernel parameter values by running the following on the new Oracle RAC node:

[root@linux3 ~]# /sbin/sysctl -a | grep shm
vm.hugetlb_shm_group = 0
kernel.shmmni = 4096
kernel.shmall = 268435456
kernel.shmmax = 4294967295

[root@linux3 ~]# /sbin/sysctl -a | grep sem
kernel.sem = 250        32000   100     128

[root@linux3 ~]# /sbin/sysctl -a | grep file-max
fs.file-max = 65536

[root@linux3 ~]# /sbin/sysctl -a | grep ip_local_port_range
net.ipv4.ip_local_port_range = 1024     65000

[root@linux3 ~]# /sbin/sysctl -a | grep 'core\.[rw]mem'
net.core.rmem_default = 1048576
net.core.wmem_default = 262144
net.core.rmem_max = 1048576
net.core.wmem_max = 262144


Setting Shell Limits for the oracle User

To improve the performance of the software on Linux systems, Oracle recommends you increase the following shell limits for the oracle user:

  Shell Limit                                               Item in limits.conf   Hard Limit
  Maximum number of open file descriptors                   nofile                65536
  Maximum number of processes available to a single user    nproc                 16384

To make these changes, run the following as root:

[root@linux3 ~]# cat >> /etc/security/limits.conf <<EOF
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
EOF

[root@linux3 ~]# cat >> /etc/pam.d/login <<EOF
session required /lib/security/pam_limits.so
EOF

Update the default shell startup file for the "oracle" UNIX account.

  • For the Bourne, Bash, or Korn shell, add the following lines to the /etc/profile file by running the following command:
    [root@linux3 ~]# cat >> /etc/profile <<EOF
    if [ \$USER = "oracle" ]; then 
        if [ \$SHELL = "/bin/ksh" ]; then
            ulimit -p 16384
            ulimit -n 65536
        else
            ulimit -u 16384 -n 65536
        fi
        umask 022
    fi
    EOF
  • For the C shell (csh or tcsh), add the following lines to the /etc/csh.login file by running the following command:
    [root@linux3 ~]# cat >> /etc/csh.login <<EOF
    if ( \$USER == "oracle" ) then
        limit maxproc 16384
        limit descriptors 65536
    endif
    EOF
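
Once the limits.conf, PAM, and shell startup changes are in place, the effective limits for a fresh oracle login can be spot checked. Assuming the settings above took effect, the nofile values should report 65536 and the nproc values 16384:

[root@linux3 ~]# su - oracle -c "ulimit -Sn; ulimit -Hn; ulimit -Su; ulimit -Hu"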


Setting the Correct Date and Time on the new Oracle RAC Node

When adding the new Oracle RAC node to the cluster, the Oracle Universal Installer (OUI) copies the Oracle Clusterware and Oracle Database software from the source RAC node (linux1 in this article) to the new node in the cluster (linux3). During the remote copy process, the OUI will execute the UNIX "tar" command on the remote node (linux3) to extract the files that were archived and copied over. If the date and time on the node performing the install is greater than that of the node it is copying to, the OUI will throw an error from the "tar" command indicating it is attempting to extract files stamped with a time in the future:

Error while copying directory 
    /u01/app/crs with exclude file list 'null' to nodes 'linux3'.
[PRKC-1002 : All the submitted commands did not execute successfully]
---------------------------------------------
linux3:
   /bin/tar: ./bin/lsnodes: time stamp 2009-09-02 23:07:04 is 735 s in the future
   /bin/tar: ./bin/olsnodes: time stamp 2009-09-02 23:07:04 is 735 s in the future
   ...(more errors on this node)

Please note that although this would seem like a severe error from the OUI, it can safely be disregarded as a warning. The "tar" command DOES actually extract the files; however, when you perform a listing of the files (using ls -l) on the remote node (the new Oracle RAC node), they will be missing the time field until the time on the remote server is greater than the timestamp of the file.

Before attempting to add the new node, ensure that all nodes in the cluster are set as closely as possible to the same date and time. Oracle strongly recommends using the Network Time Protocol feature of most operating systems for this purpose, with all nodes using the same reference Network Time Protocol server.

Accessing a Network Time Protocol server, however, may not always be an option. In this case, when manually setting the date and time for the nodes in the cluster, ensure that the date and time of the node you are performing the software installations from (linux1) is less than the new node being added to the cluster (linux3). I generally use a 20 second difference as shown in the following example:

Show the date and time from linux1:

[root@linux1 ~]# date
Wed Sep  2 23:09:00 EDT 2009

Setting the date and time on the new Oracle RAC node linux3:

[root@linux3 ~]# date -s "9/2/2009 23:09:20"

The RAC configuration described in this article does not make use of a Network Time Protocol server.
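
If an NTP server is reachable from your network, a one-time synchronization on each node is an easy alternative to setting dates by hand. The server name below is only an example; substitute any NTP server you have access to:

[root@linux3 ~]# ntpdate -u pool.ntp.org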



Configure the "hangcheck-timer" Kernel Module


  Perform the following tasks on the new Oracle RAC node!

Oracle 9.0.1 and 9.2.0.1 used a userspace watchdog daemon called watchdogd to monitor the health of the cluster and to restart a RAC node in case of a failure. Starting with Oracle 9.2.0.2 (and still available in Oracle10g Release 2), the watchdog daemon has been replaced by a Linux kernel module named hangcheck-timer which addresses availability and reliability problems much better. The hangcheck timer is loaded into the Linux kernel and checks whether the system hangs: it sets a timer and checks it after a configurable amount of time. If the measured delay exceeds a configurable threshold, the machine is rebooted. Although the hangcheck-timer module is not required for Oracle Clusterware (Cluster Manager) operation, it is highly recommended by Oracle.


The hangcheck-timer.ko Module

The hangcheck-timer module uses a kernel-based timer that periodically checks the system task scheduler to catch delays in order to determine the health of the system. If the system hangs or pauses, the timer resets the node. The hangcheck-timer module uses the Time Stamp Counter (TSC) CPU register which is a counter that is incremented at each clock signal. The TSC offers much more accurate time measurements since this register is updated by the hardware automatically.

Much more information about the hangcheck-timer project can be found in the project's documentation.


Installing the hangcheck-timer.ko Module

The hangcheck-timer module was normally shipped only by Oracle; however, it is now included with Red Hat Linux AS starting with kernel versions 2.4.9-e.12 and higher, so it should already be present. Use the following to verify that the module is included:

[root@linux3 ~]# find /lib/modules -name "hangcheck-timer.ko"
/lib/modules/2.6.18-128.el5/kernel/drivers/char/hangcheck-timer.ko

In the above output, we care about the hangcheck timer object (hangcheck-timer.ko) in the /lib/modules/2.6.18-128.el5/kernel/drivers/char directory.


Configuring and Loading the hangcheck-timer Module

There are two key parameters to the hangcheck-timer module:

  • hangcheck_tick : how often, in seconds, the hangcheck-timer checks the node for hangs. This article uses the value 30.
  • hangcheck_margin : how long a delay, in seconds, is tolerated before the hangcheck-timer resets the node. This article uses the value 180.

  The two hangcheck-timer module parameters indicate how long a RAC node must hang before it will reset the system. A node reset will occur when the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)

With the values used in this article, a node would therefore be reset after hanging for more than 30 + 180 = 210 seconds.


Configuring Hangcheck Kernel Module Parameters

Each time the hangcheck-timer kernel module is loaded (manually or by Oracle) it needs to know what value to use for each of the two parameters just discussed: hangcheck_tick and hangcheck_margin.

These values need to be available after each reboot of the Linux server. To do this, make an entry with the correct values to the /etc/modprobe.conf file as follows:

[root@linux3 ~]# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modprobe.conf

Each time the hangcheck-timer kernel module gets loaded, it will use the values defined by the entry I made in the /etc/modprobe.conf file.


Manually Loading the Hangcheck Kernel Module for Testing

Oracle is responsible for loading the hangcheck-timer kernel module when required. It is for this reason that it is not required to perform a modprobe or insmod of the hangcheck-timer kernel module in any of the startup files (i.e. /etc/rc.local).

It is only out of pure habit that I continue to include a modprobe of the hangcheck-timer kernel module in the /etc/rc.local file. Someday I will get over it, but realize that it does not hurt to include a modprobe of the hangcheck-timer kernel module during startup.

So to keep myself sane and able to sleep at night, I always configure the loading of the hangcheck-timer kernel module on each startup as follows:

[root@linux3 ~]# echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local

  You don't have to manually load the hangcheck-timer kernel module using modprobe or insmod after each reboot. The hangcheck-timer module will be loaded by Oracle (automatically) when needed.

Now, to test the hangcheck-timer kernel module to verify it is picking up the correct parameters we defined in the /etc/modprobe.conf file, use the modprobe command. Although you could load the hangcheck-timer kernel module by passing it the appropriate parameters (e.g. insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180), we want to verify that it is picking up the options we set in the /etc/modprobe.conf file.

To manually load the hangcheck-timer kernel module and verify it is using the correct values defined in the /etc/modprobe.conf file, run the following command:

[root@linux3 ~]# modprobe hangcheck-timer
[root@linux3 ~]# grep Hangcheck /var/log/messages | tail -2
Sep  2 23:32:00 linux3 kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 30 seconds, margin is 180 seconds).
Sep  2 23:32:00 linux3 kernel: Hangcheck: Using get_cycles().
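
You can also confirm that the module is now resident in the kernel. Note that the dash in the module name appears as an underscore in the lsmod output; if nothing is returned, re-check the modprobe step above:

[root@linux3 ~]# lsmod | grep hangcheck_timer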



Configure RAC Nodes for Remote Access using SSH


  Perform the following configuration procedures on linux1 and the new Oracle RAC node!

During the creation of the existing two-node cluster, the installation of Oracle Clusterware and the Oracle Database software were only performed from one node in the RAC cluster — namely from linux1 as the oracle user account. The Oracle Universal Installer (OUI) on that particular node would then use the ssh and scp commands to run remote commands on and copy files (the Oracle software) to all other nodes within the RAC cluster. The oracle user account on the node running the OUI (runInstaller) had to be trusted by all other nodes in the RAC cluster. This meant that the oracle user account had to run the secure shell commands (ssh or scp) on the Linux server executing the OUI against all other Linux servers in the cluster without being prompted for a password. The same security requirements hold true for this article. User equivalence will be configured so that the Oracle Clusterware and Oracle Database software will be securely copied from linux1 to the new Oracle RAC node (linux3) using ssh and scp without being prompted for a password.

As was the case when configuring the existing two-node cluster, this article assumes the Oracle software installation to the new Oracle RAC node will be performed from linux1. This section provides the methods required for configuring SSH, creating an RSA key, and enabling user equivalence for the new Oracle RAC node.


Configuring the Secure Shell

To determine if SSH is installed and running on the new Oracle RAC node, enter the following command:

[root@linux3 ~]# pgrep sshd
3695

If SSH is running, then the response to this command is a list of process ID number(s).


Creating the RSA Keys on the new Oracle RAC Node

The first step in configuring SSH is to create an RSA public/private key pair on the new Oracle RAC node. An RSA public/private key should already exist on both of the two nodes in the current two-node cluster. The command to do this will create a public and private key for RSA (for a total of two keys per node). The content of the RSA public keys will then be copied into an authorized key file on linux1 which is then distributed to all other Oracle RAC nodes in the cluster.

Use the following steps to create the RSA key pair from the new Oracle RAC node (linux3):

  1. Log on as the "oracle" UNIX user account.
    [root@linux3 ~]# su - oracle

  2. If necessary, create the .ssh directory in the "oracle" user's home directory and set the correct permissions on it:
    [oracle@linux3 ~]$ mkdir -p ~/.ssh
    [oracle@linux3 ~]$ chmod 700 ~/.ssh

  3. Enter the following command to generate an RSA key pair (public and private key) for the SSH protocol:
    [oracle@linux3 ~]$ /usr/bin/ssh-keygen -t rsa
    At the prompts:
    • Accept the default location for the key files.
    • Enter and confirm a pass phrase. This should be different from the "oracle" UNIX user account password; however, it is not a requirement.

    This command will write the public key to the ~/.ssh/id_rsa.pub file and the private key to the ~/.ssh/id_rsa file. Note that you should never distribute the private key to anyone!


Updating and Distributing the "authorized key file" from linux1

Now that the new Oracle RAC node contains a public and private key for RSA, you will need to update the authorized key file on linux1 to add (append) the new RSA public key from linux3. An authorized key file is nothing more than a single file that contains a copy of everyone's (every node's) RSA public key. Once the authorized key file contains all of the public keys, it is then distributed to all other nodes in the cluster.

Complete the following steps on linux1 to update and then distribute the authorized key file to all nodes in the Oracle RAC cluster.

  1. As the "oracle" UNIX user account, verify that an authorized key file currently exists on the node linux1 (~/.ssh/authorized_keys). The authorized key file should already exist from the initial installation of Oracle RAC.
    [oracle@linux1 ~]$ cd ~/.ssh
    [oracle@linux1 ~]$ ls -l *.pub
    -rw-r--r-- 1 oracle oinstall 223 Sep  2 01:18 id_rsa.pub

  2. In this step, use SCP (Secure Copy) or SFTP (Secure FTP) to copy the content of the ~/.ssh/id_rsa.pub public key from the new Oracle RAC node to the authorized key file on linux1. You will be prompted for the oracle UNIX user account password for the new Oracle RAC node.

    Again, this task will be performed from linux1.

    [oracle@linux1 ~]$ ssh linux3 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    The authenticity of host 'linux3 (192.168.1.107)' can't be established.
    RSA key fingerprint is f5:38:37:e8:84:4e:bd:6d:6b:25:f7:94:58:e8:b2:7a.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'linux3,192.168.1.107' (RSA) to the list of known hosts.
    oracle@linux3's password: xxxxx

      The first time you use SSH to connect to a node from a particular system, you will see a message similar to the following:
    The authenticity of host 'linux3 (192.168.1.107)' can't be established.
    RSA key fingerprint is f5:38:37:e8:84:4e:bd:6d:6b:25:f7:94:58:e8:b2:7a.
    Are you sure you want to continue connecting (yes/no)? yes
    Enter yes at the prompt to continue. You should not see this message again when you connect from this system to the same node.

  3. At this point, we have the RSA public key from every node in the cluster (including the new Oracle RAC node) in the authorized key file on linux1. We now need to copy it to all other nodes in the cluster. Use the scp command from linux1 to copy the authorized key file to all remaining nodes in the RAC cluster:
    [oracle@linux1 ~]$ scp ~/.ssh/authorized_keys linux2:.ssh/authorized_keys
    Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
    authorized_keys                             100% 1191     1.2KB/s   00:00
    
    [oracle@linux1 ~]$ scp ~/.ssh/authorized_keys linux3:.ssh/authorized_keys
    oracle@linux3's password: xxxxx
    authorized_keys                             100% 1191     1.2KB/s   00:00

  4. Change the permission of the authorized key file on all Oracle RAC nodes in the cluster by logging into the node and running the following:
    [oracle@linux1 ~]$ chmod 600 ~/.ssh/authorized_keys
    [oracle@linux2 ~]$ chmod 600 ~/.ssh/authorized_keys
    [oracle@linux3 ~]$ chmod 600 ~/.ssh/authorized_keys

  5. At this point, if you use ssh to log in to or run a command on the new Oracle RAC node, you are prompted for the pass phrase that you specified when you created the RSA key. For example, test the following from linux1:
    [oracle@linux1 ~]$ ssh linux3 hostname
    Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
    linux3

      If you see any other messages or text, apart from the host name, then the Oracle installation can fail. Make any changes required to ensure that only the host name is displayed when you enter these commands. You should ensure that any part of a login script that generates output, or asks any questions, is modified so that it acts only when the shell is an interactive shell.


Enabling SSH User Equivalency for the Current Shell Session

When running the addNode.sh script from linux1 (which runs the OUI), it will need to run the secure shell tool commands (ssh and scp) on the new Oracle RAC node without being prompted for a pass phrase. Even though SSH is now configured on all Oracle RAC nodes in the cluster, using the secure shell tool commands will still prompt for a pass phrase. Before running the addNode.sh script, you need to enable user equivalence for the terminal session you plan to run the script from. For the purpose of this article, the addNode.sh script will be run from linux1.

User equivalence will need to be enabled on any new terminal shell session on linux1 before attempting to run the addNode.sh script. If you log out and log back in to the node you will be performing the Oracle installation from, you must enable user equivalence for the terminal shell session as this is not done by default.

To enable user equivalence for the current terminal shell session, perform the following steps:

  1. Log on to the node where you want to run the addNode.sh script from (linux1) as the "oracle" UNIX user account.
    [root@linux1 ~]# su - oracle

  2. Enter the following commands:
    [oracle@linux1 ~]$ exec /usr/bin/ssh-agent $SHELL
    [oracle@linux1 ~]$ /usr/bin/ssh-add
    Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
    Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
    At the prompt, enter the pass phrase for each key that you generated.

  3. If SSH is configured correctly, you will be able to use the ssh and scp commands without being prompted for a password or pass phrase from this terminal session:
    [oracle@linux1 ~]$ ssh linux1 "date;hostname"
    Thu Sep  3 00:02:34 EDT 2009
    linux1
    
    [oracle@linux1 ~]$ ssh linux2 "date;hostname"
    Thu Sep  3 00:03:04 EDT 2009
    linux2
    
    [oracle@linux1 ~]$ ssh linux3 "date;hostname"
    Thu Sep  3 00:03:25 EDT 2009
    linux3

      The commands above should display the date set on each Oracle RAC node along with its hostname. If any of the nodes prompt for a password or pass phrase then verify that the ~/.ssh/authorized_keys file on that node contains the correct public keys.

    Also, if you see any other messages or text, apart from the date and hostname, then the Oracle installation can fail. Make any changes required to ensure that only the date and hostname are displayed when you enter these commands. You should ensure that any part of a login script that generates output, or asks any questions, is modified so that it acts only when the shell is an interactive shell.

  4. The Oracle Universal Installer is a GUI interface and requires the use of an X Server. From the terminal session enabled for user equivalence (the node you will be running the addNode.sh script from), set the environment variable DISPLAY to a valid X Windows display:

    Bourne, Korn, and Bash shells:

    [oracle@linux1 ~]$ DISPLAY=<Any X-Windows Host>:0
    [oracle@linux1 ~]$ export DISPLAY
    C shell:
    [oracle@linux1 ~]$ setenv DISPLAY <Any X-Windows Host>:0
    After setting the DISPLAY variable to a valid X Windows display, you should perform another test of the current terminal session to ensure that X11 forwarding is not enabled:
    [oracle@linux1 ~]$ ssh linux1 hostname
    linux1
    
    [oracle@linux1 ~]$ ssh linux2 hostname
    linux2
    
    [oracle@linux1 ~]$ ssh linux3 hostname
    linux3

  5. You must run the addNode.sh script from this terminal session or remember to repeat the steps to enable user equivalence (steps 2, 3, and 4 from this section) before you start the Oracle Universal Installer from a different terminal session.

For more information on configuring SSH and user equivalence in an Oracle RAC 10g environment, see the section "Configure RAC Nodes for Remote Access using SSH" in the parent article.
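
With user equivalence enabled, the round-trip test from step 3 can also be condensed into a single loop from linux1, which makes a convenient final check before launching addNode.sh:

[oracle@linux1 ~]$ for host in linux1 linux2 linux3; do ssh $host hostname; done
linux1
linux2
linux3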



All Startup Commands for New Oracle RAC Node


  Verify that the following startup commands are included on the new Oracle RAC node!

Up to this point, we have talked in great detail about the parameters and resources that need to be configured on the new Oracle RAC node for the Oracle RAC 10g configuration. This section will review those parameters, commands, and entries (in previous sections of this document) that need to occur on the new Oracle RAC node when it is booted.

In this section, I provide all of the commands, parameters, and entries that have been discussed so far and that will need to be included in the startup scripts for the new Oracle RAC node. For each of the startup files below, I indicate the entries that should be included in order to end up with a working RAC node.


/etc/modprobe.conf

All parameters and values to be used by kernel modules.

/etc/modprobe.conf
alias eth0 r8169
alias eth1 e1000
alias scsi_hostadapter ata_piix
alias snd-card-0 snd-intel8x0
options snd-card-0 index=0
options snd-intel8x0 index=0
remove snd-intel8x0 { /usr/sbin/alsactl store 0 >/dev/null 2>&1 || : ; }; /sbin/modprobe -r --ignore-remove snd-intel8x0
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180


/etc/sysctl.conf

We wanted to adjust the default and maximum send buffer size as well as the default and maximum receive buffer size for the interconnect. This file also contains those parameters responsible for configuring shared memory, semaphores, file handles, and local IP range for use by the Oracle instance.

/etc/sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Controls the maximum size of a message, in bytes
kernel.msgmnb = 65536

# Controls the default maximum size of a message queue
kernel.msgmax = 65536

# +---------------------------------------------------------+
# | ADJUSTING NETWORK SETTINGS                              |
# +---------------------------------------------------------+
# | With Oracle 9.2.0.1 and onwards, Oracle now makes use   |
# | of UDP as the default protocol on Linux for             |
# | inter-process communication (IPC), such as Cache Fusion |
# | and Cluster Manager buffer transfers between instances  |
# | within the RAC cluster. Oracle strongly suggests to     |
# | adjust the default and maximum receive buffer size      |
# | (SO_RCVBUF socket option) to 1MB, and the default       |
# | and maximum send buffer size (SO_SNDBUF socket option)  |
# | to 256KB. The receive buffers are used by TCP and UDP   |
# | to hold received data until it is read by the           |
# | application. The receive buffer cannot overflow because |
# | the peer is not allowed to send data beyond the buffer  |
# | size window. This means that datagrams will be          |
# | discarded if they don't fit in the socket receive       |
# | buffer. This could cause the sender to overwhelm the    |
# | receiver.                                               |
# +---------------------------------------------------------+

# +---------------------------------------------------------+
# | Default setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option.  |
# +---------------------------------------------------------+
net.core.rmem_default=1048576

# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option.  |
# +---------------------------------------------------------+
net.core.rmem_max=1048576

# +---------------------------------------------------------+
# | Default setting in bytes of the socket "send" buffer    |
# | which may be set by using the SO_SNDBUF socket option.  |
# +---------------------------------------------------------+
net.core.wmem_default=262144

# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "send" buffer    |
# | which may be set by using the SO_SNDBUF socket option.  |
# +---------------------------------------------------------+
net.core.wmem_max=262144

# +---------------------------------------------------------+
# | ADJUSTING ADDITIONAL KERNEL PARAMETERS FOR ORACLE       |
# +---------------------------------------------------------+
# | Configure the kernel parameters for all Oracle Linux    |
# | servers by setting shared memory and semaphores,        |
# | setting the maximum amount of file handles, and setting |
# | the IP local port range.                                |
# +---------------------------------------------------------+

# +---------------------------------------------------------+
# | SHARED MEMORY                                           |
# +---------------------------------------------------------+
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 4294967295

# Controls the total amount of shared memory, in pages, that can be used system wide
kernel.shmall = 268435456

# Controls the maximum number of shared memory segments system wide
kernel.shmmni = 4096

# +---------------------------------------------------------+
# | SEMAPHORES                                              |
# | ----------                                              |
# |                                                         |
# | SEMMSL_value  SEMMNS_value  SEMOPM_value  SEMMNI_value  |
# |                                                         |
# +---------------------------------------------------------+
kernel.sem=250 32000 100 128

# +---------------------------------------------------------+
# | FILE HANDLES                                            |
# ----------------------------------------------------------+
fs.file-max=65536

# +---------------------------------------------------------+
# | LOCAL IP RANGE                                          |
# ----------------------------------------------------------+
net.ipv4.ip_local_port_range=1024 65000

 

Verify that each of the required kernel parameters (above) is configured in the /etc/sysctl.conf file. Then, ensure that each of these parameters is truly in effect by running the following command on the new Oracle RAC node:

[root@linux3 ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000


/etc/hosts

All machine/IP entries for nodes in the RAC cluster.

/etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.

127.0.0.1        localhost.localdomain   localhost

# Public Network - (eth0)
192.168.1.100    linux1
192.168.1.101    linux2
192.168.1.107    linux3

# Private Interconnect - (eth1)
192.168.2.100    linux1-priv
192.168.2.101    linux2-priv
192.168.2.107    linux3-priv

# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200    linux1-vip
192.168.1.201    linux2-vip
192.168.1.207    linux3-vip

# Private Storage Network for Openfiler - (eth1)
192.168.1.195    openfiler1
192.168.2.195    openfiler1-priv

# Miscellaneous Nodes
192.168.1.1      router
192.168.1.102    alex
192.168.1.103    nascar
192.168.1.105    packmule
192.168.1.106    melody
192.168.1.120    cartman
192.168.1.121    domo
192.168.1.122    switch1
192.168.1.190    george
192.168.1.245    accesspoint


/etc/udev/rules.d/55-openiscsi.rules

/etc/udev/rules.d/55-openiscsi.rules
# /etc/udev/rules.d/55-openiscsi.rules
KERNEL=="sd*", BUS=="scsi", PROGRAM="/etc/udev/scripts/iscsidev.sh %b",SYMLINK+="iscsi/%c/part%n"


/etc/udev/scripts/iscsidev.sh

/etc/udev/scripts/iscsidev.sh
#!/bin/sh

# FILE: /etc/udev/scripts/iscsidev.sh

BUS=${1}
HOST=${BUS%%:*}

[ -e /sys/class/iscsi_host ] || exit 1

file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"

target_name=$(cat ${file})

# This is not an Open-iSCSI device
if [ -z "${target_name}" ]; then
   exit 1
fi

echo "${target_name##*.}"


/etc/rc.local

Loading the hangcheck-timer kernel module.

/etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# +---------------------------------------------------------+
# | HANGCHECK TIMER                                         |
# | (I do not believe this is required, but doesn't hurt)   |
# +---------------------------------------------------------+

/sbin/modprobe hangcheck-timer



Install and Configure Oracle Cluster File System (OCFS2)


  Perform the following tasks on the new Oracle RAC node!


Overview

The current two-node Oracle RAC database makes use of the Oracle Cluster File System, Release 2 (OCFS2) to store the two sets of files that are required to be shared by the Oracle Clusterware software. Note that the OCR is mirrored and the voting disk is kept in three copies, making five files in total:

  • Oracle Cluster Registry (OCR)

    • File 1 : /u02/oradata/racdb/OCRFile
    • File 2 : /u02/oradata/racdb/OCRFile_mirror

  • CRS Voting Disk

    • File 1 : /u02/oradata/racdb/CSSFile
    • File 2 : /u02/oradata/racdb/CSSFile_mirror1
    • File 3 : /u02/oradata/racdb/CSSFile_mirror2

Along with these two groups of files (the OCR and Voting disk), we also used this space to store the shared ASM SPFILE for all Oracle RAC instances.

In this section, we will download and install the release of OCFS2 used for the current two-node cluster on the new Oracle RAC node.

See the following page for more information on OCFS2 (including Installation Notes) for Linux:

  OCFS2 Project Documentation


Download OCFS2

First, let's download the same OCFS2 distribution used for the current two-node RAC. The OCFS2 distribution consists of two sets of RPMs; namely, the kernel module and the tools. The latest kernel module is available for download from http://oss.oracle.com/projects/ocfs2/files/ and the tools from http://oss.oracle.com/projects/ocfs2-tools/files/.

Download the appropriate RPMs starting with the same OCFS2 kernel module (the driver) used for the current two-node RAC. With CentOS 5.3, I am using kernel release 2.6.18-128.el5. The appropriate OCFS2 kernel module was found in the latest release of OCFS2 available at the time of this writing (OCFS2 Release 1.4.2-1).

The available OCFS2 kernel modules for Linux kernel 2.6.18-128.el5 are listed below. Always download the OCFS2 kernel module that matches the distribution, platform, kernel version and the kernel flavor (smp, hugemem, psmp, etc).

32-bit (x86) Installations

  ocfs2-2.6.18-128.el5-1.4.2-1.el5.i686.rpm - (Package for default kernel)
  ocfs2-2.6.18-128.el5PAE-1.4.2-1.el5.i686.rpm - (Package for PAE kernel)
  ocfs2-2.6.18-128.el5xen-1.4.2-1.el5.i686.rpm - (Package for xen kernel)

Next, download both the OCFS2 tools and the OCFS2 console applications:

  ocfs2-tools-1.4.2-1.el5.i386.rpm - (OCFS2 tools)
  ocfs2console-1.4.2-1.el5.i386.rpm - (OCFS2 console)

64-bit (x86_64) Installations

  ocfs2-2.6.18-128.el5-1.4.2-1.el5.x86_64.rpm - (Package for default kernel)
  ocfs2-2.6.18-128.el5xen-1.4.2-1.el5.x86_64.rpm - (Package for xen kernel)

Next, download both the OCFS2 tools and the OCFS2 console applications:

  ocfs2-tools-1.4.2-1.el5.x86_64.rpm - (OCFS2 tools)
  ocfs2console-1.4.2-1.el5.x86_64.rpm - (OCFS2 console)

 

The OCFS2 Console is optional but highly recommended. The ocfs2console application requires e2fsprogs, glib2 2.2.3 or later, vte 0.11.10 or later, pygtk2 (EL4) or python-gtk (SLES9) 1.99.16 or later, python 2.3 or later and ocfs2-tools.

 

To determine which OCFS2 driver release you need, use the OCFS2 release that matches your kernel version. To determine your kernel release:

[root@linux3 ~]# uname -a
Linux linux3 2.6.18-128.el5 #1 SMP Wed Jan 21 10:44:23 EST 2009 i686 i686 i386 GNU/Linux


Install OCFS2

I will be installing the OCFS2 files onto a single processor x86 machine. The installation process is simply a matter of running the following command on the new Oracle RAC node as the root user account:
[root@linux3 ~]# rpm -Uvh ocfs2-2.6.18-128.el5-1.4.2-1.el5.i686.rpm \
ocfs2console-1.4.2-1.el5.i386.rpm \
ocfs2-tools-1.4.2-1.el5.i386.rpm
Preparing...                ########################################### [100%]
   1:ocfs2-tools            ########################################### [ 33%]
   2:ocfs2-2.6.18-128.el5   ########################################### [ 67%]
   3:ocfs2console           ########################################### [100%]


Disable SELinux (RHEL4 U2 and higher)

Users of RHEL4 U2 and higher (CentOS 5.3 is based on RHEL5 U3) are advised that OCFS2 currently does not work with SELinux enabled. If you are using RHEL4 U2 or higher (which includes us since we are using CentOS 5.3), you will need to disable SELinux (using the tool system-config-securitylevel) to get the O2CB service to execute.

During the installation of CentOS, we disabled SELinux on the SELinux screen. If, however, you did not disable SELinux during the installation phase, you can use the tool system-config-securitylevel to disable it.

If you did not follow the instructions to disable the SELinux option during the installation of CentOS (or if you simply want to verify it is truly disabled), run the "Security Level Configuration" GUI utility:

[root@linux3 ~]# /usr/bin/system-config-securitylevel &

This will bring up the following screen:


Figure 11: Security Level Configuration Opening Screen / Firewall Disabled

Now, click the SELinux tab and select the "Disabled" option. After clicking [OK], you will be presented with a warning dialog. Simply acknowledge this warning by clicking "Yes". Your screen should now look like the following after disabling the SELinux option:


Figure 12: SELinux Disabled

If you needed to disable SELinux in this section on the new Oracle RAC node, it will need to be rebooted to implement the change. SELinux must be disabled before you can continue with configuring OCFS2!


Configure OCFS2

The next step is to generate and configure the /etc/ocfs2/cluster.conf file on the new Oracle RAC node. The easiest way to accomplish this is to run the GUI tool ocfs2console. The /etc/ocfs2/cluster.conf file will contain hostnames and IP addresses for "all" nodes in the cluster. After creating the /etc/ocfs2/cluster.conf on the new Oracle RAC node, these changes will then be distributed to the other two current RAC nodes using the o2cb_ctl command-line utility.

In this section, we will not only create and configure the /etc/ocfs2/cluster.conf file using ocfs2console, but will also create and start the cluster stack O2CB. When the /etc/ocfs2/cluster.conf file is not present, (as will be the case in our example), the ocfs2console tool will create this file along with a new cluster stack service (O2CB) with a default cluster name of ocfs2.

 

OCFS2 will be configured to use the private network (192.168.2.0) for all of its network traffic as recommended by Oracle. While OCFS2 does not take much bandwidth, it does require the nodes to be alive on the network and sends regular keepalive packets to ensure that they are. To avoid a network delay being interpreted as a node disappearing on the net, which could lead to node self-fencing, a private interconnect is recommended. It is safe to use the same private interconnect for both Oracle RAC and OCFS2.

A popular question then is what node name should be used when adding nodes to an OCFS2 cluster and should it be related to the IP address? When adding nodes to an OCFS2 cluster, the node name being entered must match the hostname of the machine or the OCFS2 console will throw an error. The IP address, however, need not be the one associated with that hostname. In other words, any valid IP address on that node can be used. OCFS2 will not attempt to match the node name (hostname) with the specified IP address.

As the root user account on the new Oracle RAC node, start the ocfs2console GUI tool:

[root@linux3 ~]# ocfs2console &

This will bring up the GUI as shown below:


Figure 13: ocfs2console Screen


Using the ocfs2console GUI tool, perform the following steps:

  1. Select [Cluster] -> [Configure Nodes...]. This will start the OCFS2 Cluster Stack (Figure 14). Acknowledge this Information dialog box by clicking [Close]. You will then be presented with the "Node Configuration" dialog.
  2. On the "Node Configuration" dialog, click the [Add] button.
    • This will bring up the "Add Node" dialog.
    • In the "Add Node" dialog, enter the Host name and IP address for the first node in the cluster. Leave the IP Port set to its default value of 7777. In my example, I added all three nodes using linux1 / 192.168.2.100 for the first node, linux2 / 192.168.2.101 for the second node and linux3 / 192.168.2.107 for the third node.
      Note: The node name you enter "must" match the hostname of the machine and the IP addresses will use the private interconnect.
    • Click [Apply] on the "Node Configuration" dialog - All nodes should now be "Active" as shown in Figure 15.
    • Click [Close] on the "Node Configuration" dialog.
  3. After verifying all values are correct, exit the application using [File] -> [Quit]. These steps need only be performed on the new Oracle RAC node.



Figure 14: Starting the OCFS2 Cluster Stack


The following dialog shows the OCFS2 settings I used when configuring the new Oracle RAC node:


Figure 15: Configuring Nodes for OCFS2


After exiting the ocfs2console, you will have a /etc/ocfs2/cluster.conf similar to the following. In the next section, this file (along with other changes) will be distributed to the current two RAC nodes:

/etc/ocfs2/cluster.conf
node:
        ip_port = 7777
        ip_address = 192.168.2.100
        number = 0
        name = linux1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.2.101
        number = 1
        name = linux2
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.2.107
        number = 2
        name = linux3
        cluster = ocfs2

cluster:
        node_count = 3
        name = ocfs2


Add New Oracle RAC Node to the OCFS2 Cluster

The next step is to add the new Oracle RAC node (linux3) to the current "live" OCFS2 cluster. This entails running the o2cb_ctl command-line utility from the current two RAC nodes linux1 and linux2.

As root, run the following from linux1 and then linux2:

[root@linux1 ~]# o2cb_ctl -C -i -n linux3 -t node -a number=2 -a ip_address=192.168.2.107 -a ip_port=7777 -a cluster=ocfs2
Node linux3 created

[root@linux2 ~]# o2cb_ctl -C -i -n linux3 -t node -a number=2 -a ip_address=192.168.2.107 -a ip_port=7777 -a cluster=ocfs2
Node linux3 created

o2cb_ctl parameters:

-C  : Create an object in the OCFS2 Cluster Configuration.
-i  : Valid only with -C. When creating something (node or cluster),
      it will also install it in the live cluster (/config). If this
      parameter is not specified, then only the /etc/ocfs2/cluster.conf
      file is updated.
-n  : Object name, which is usually the node name or cluster name.
-t  : Type can be cluster, node or heartbeat.
-a  : Attribute in the format "parameter=value" which will be set in
      the /etc/ocfs2/cluster.conf file. Since nodes are numbered
      starting with zero, the third node in the OCFS2 cluster will
      be "number=2". Set the IP address, which in this example will
      be the private interconnect "ip_address=192.168.2.107". The
      port number used in the current two-node cluster is
      "ip_port=7777". Finally, identify which OCFS2 cluster to use,
      which in our case is named "cluster=ocfs2".
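
To confirm that the new node was added to both the on-disk configuration and the live cluster on the existing nodes, a quick check such as the following can be run. This is a sketch; the /sys/kernel/config path assumes configfs is mounted at /sys/kernel/config on the existing nodes, which it will be if O2CB is already running there:

[root@linux1 ~]# grep -B3 'name = linux3' /etc/ocfs2/cluster.conf
[root@linux1 ~]# ls /sys/kernel/config/cluster/ocfs2/node/
linux1  linux2  linux3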


Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold

Next, configure the on-boot properties of the O2CB driver on the new Oracle RAC node so that the cluster stack services will start on each boot. You will also be adjusting the OCFS2 Heartbeat Threshold from its default setting of 31 to 61.

Set the on-boot properties as follows:

[root@linux3 ~]# /etc/init.d/o2cb offline ocfs2
[root@linux3 ~]# /etc/init.d/o2cb unload
[root@linux3 ~]# /etc/init.d/o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot.  The current values will be shown in brackets ('[]').  Hitting
<ENTER> without typing an answer will keep that current value.  Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [n]: y
Cluster stack backing O2CB [o2cb]: o2cb
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Specify heartbeat dead threshold (>=7) [31]: 61
Specify network idle timeout in ms (>=5000) [30000]: 30000
Specify network keepalive delay in ms (>=1000) [2000]: 2000
Specify network reconnect delay in ms (>=2000) [2000]: 2000
Writing O2CB configuration: OK
Loading filesystem "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading filesystem "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK
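
Before moving on, it is worth double-checking that the cluster stack is online on the new node and that the higher heartbeat dead threshold was persisted. The O2CB_HEARTBEAT_THRESHOLD variable name below is what my version of the ocfs2-tools init script writes to /etc/sysconfig/o2cb; treat it as an assumption if your release differs:

[root@linux3 ~]# /etc/init.d/o2cb status
[root@linux3 ~]# grep O2CB_HEARTBEAT_THRESHOLD /etc/sysconfig/o2cb
O2CB_HEARTBEAT_THRESHOLD=61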


Mount the OCFS2 File System

Since the clustered file system already exists, the next step is to simply mount it on the new Oracle RAC node. Let's first do it using the command-line, then I'll show how to include it in the /etc/fstab to have it mount on each boot. The current OCFS2 file system was created with the label oracrsfiles which will be used when mounting.

First, here is how to manually mount the OCFS2 file system from the command-line. Remember that this needs to be performed as the root user account on the new Oracle RAC node:

[root@linux3 ~]# mount -t ocfs2 -o datavolume,nointr -L "oracrsfiles" /u02

If the mount was successful, you will simply get your prompt back. We should, however, run the following checks to ensure the file system is mounted correctly. Let's use the mount command to ensure that the clustered file system is really mounted:

[root@linux3 ~]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
domo-san:Public on /domo type nfs (rw,addr=192.168.2.121)
configfs on /sys/kernel/config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sdc1 on /u02 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)

 

Please take note of the datavolume option I am using to mount the new file system. Oracle database users must mount any volume that will contain the Voting Disk file, Cluster Registry (OCR), data files, redo logs, archive logs and control files with the datavolume mount option so as to ensure that the Oracle processes open the files with the O_DIRECT flag. The nointr option ensures that the I/Os are not interrupted by signals.

Any other type of volume, including an Oracle home (which I will not be using for this article), should not be mounted with this mount option.

  Why does it take so much time to mount the volume? It takes around 5 seconds for a volume to mount. It does so as to let the heartbeat thread stabilize. In a later release, Oracle plans to add support for a global heartbeat, which will make most mounts instant.


Configure OCFS2 to Mount Automatically at Startup

This section provides the steps necessary to mount the OCFS2 file system each time the new Oracle RAC node is booted using its label.

We start by adding the following line to the /etc/fstab file on the new Oracle RAC node:

LABEL=oracrsfiles     /u02           ocfs2   _netdev,datavolume,nointr     0 0

  Notice the "_netdev" option for mounting this file system. The _netdev mount option is a must for OCFS2 volumes. This mount option indicates that the volume is to be mounted after the network is started and dismounted before the network is shutdown.
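
If you would like to validate the new /etc/fstab entry without waiting for a reboot (and assuming nothing on the new node is using /u02 yet), you can unmount the volume and remount it by mount point only, which forces mount to resolve the label and options from /etc/fstab. A sketch:

[root@linux3 ~]# umount /u02
[root@linux3 ~]# mount /u02
[root@linux3 ~]# mount | grep /u02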

Now, let's make sure that the ocfs2.ko kernel module is being loaded and that the file system will be mounted during the boot process.

If you have been following along with the examples in this article, the actions to load the kernel module and mount the OCFS2 file system should already be enabled. However, we should still check those options by running the following as the root user account on the new Oracle RAC node:

[root@linux3 ~]# chkconfig --list o2cb
o2cb            0:off   1:off   2:on    3:on    4:on    5:on    6:off

The flags for runlevels 2, 3, 4, and 5 should be set to "on" as shown above.
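
If any of those runlevels report "off", they can be switched back on with chkconfig. The o2cb service loads the cluster stack, while the ocfs2 service (also installed by ocfs2-tools) mounts any OCFS2 entries found in /etc/fstab at boot:

[root@linux3 ~]# chkconfig o2cb on
[root@linux3 ~]# chkconfig ocfs2 on
[root@linux3 ~]# chkconfig --list ocfs2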


Check Permissions on OCFS2 File System

From the new Oracle RAC node, use the ls command to check ownership. The permissions should be set to 0775 with owner "oracle" and group "oinstall".

[root@linux3 ~]# ls -ld /u02
drwxrwxr-x 6 oracle oinstall 3896 Aug 26 23:41 /u02
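
If the ownership or mode does not match, it can be corrected from the new Oracle RAC node. Since /u02 is a shared OCFS2 volume, the change is immediately visible to all nodes in the cluster:

[root@linux3 ~]# chown oracle:oinstall /u02
[root@linux3 ~]# chmod 775 /u02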


Verify Access to the Shared Clusterware Files

From the new Oracle RAC node as the oracle user account, use the ls command to verify access to the Oracle Clusterware shared files (OCR file and Voting Disk):

[oracle@linux3 ~]$ ls -l /u02/oradata/racdb
total 14820
-rw-r--r--  1 oracle oinstall 10240000 Aug 26 22:43 CSSFile
-rw-r--r--  1 oracle oinstall 10240000 Aug 26 22:43 CSSFile_mirror1
-rw-r--r--  1 oracle oinstall 10240000 Aug 26 22:43 CSSFile_mirror2
drwxr-x---  2 oracle oinstall     3896 Aug 26 23:45 dbs/
-rw-r-----  1 root   oinstall  5074944 Sep  2 14:18 OCRFile
-rw-r-----  1 root   oinstall  5074944 Sep  2 14:18 OCRFile_mirror



Install and Configure Automatic Storage Management (ASMLib 2.0)


  Perform the following tasks on the new Oracle RAC node!


Introduction

The current two-node Oracle RAC database uses Automatic Storage Management (ASM) as the file system and volume manager for all Oracle physical database files (data, online redo logs, control files, archived redo logs) and the Flash Recovery Area.

In this section, we will download, install, and configure ASMLib on the new Oracle RAC node. Using this method, Oracle database files will be stored on raw block devices managed by ASM using ASMLib calls. RAW devices are not required with this method as ASMLib works with block devices.


Download the ASMLib 2.0 Packages

The ASMLib distribution consists of two sets of RPMs: the kernel module and the ASMLib tools.

We start this section by downloading the same ASMLib distribution used for the current two-node RAC - (2.0.5-1). Like the Oracle Cluster File System, we need to download the appropriate version of the ASMLib driver for the Linux kernel, which in my case is kernel 2.6.18-128.el5 running on a single processor x86 machine:

[root@linux3 ~]# uname -a
Linux linux3 2.6.18-128.el5 #1 SMP Wed Jan 21 10:44:23 EST 2009 i686 i686 i386 GNU/Linux

  If you do not currently have an account with Oracle OTN, you will need to create one. This is a FREE account!


  Oracle ASMLib Downloads for Red Hat Enterprise Linux Server 5

32-bit (x86) Installations

  oracleasm-2.6.18-128.el5-2.0.5-1.el5.i686.rpm - (Package for default kernel)
  oracleasm-2.6.18-128.el5PAE-2.0.5-1.el5.i686.rpm - (Package for PAE kernel)
  oracleasm-2.6.18-128.el5xen-2.0.5-1.el5.i686.rpm - (Package for xen kernel)

Next, download the following ASMLib tools:

  oracleasm-support-2.1.3-1.el5.i386.rpm - (Driver support files)
  oracleasmlib-2.0.4-1.el5.i386.rpm - (Userspace library)

64-bit (x86_64) Installations

  oracleasm-2.6.18-128.el5-2.0.5-1.el5.x86_64.rpm - (Package for default kernel)
  oracleasm-2.6.18-128.el5xen-2.0.5-1.el5.x86_64.rpm - (Package for xen kernel)

Next, download the following ASMLib tools:

  oracleasm-support-2.1.3-1.el5.x86_64.rpm - (Driver support files)
  oracleasmlib-2.0.4-1.el5.x86_64.rpm - (Userspace library)


Install ASMLib 2.0 Packages

I will be installing the ASMLib files onto the new Oracle RAC node (linux3) which is a single processor machine. The installation process is simply a matter of running the following command on the new Oracle RAC node as the root user account:
[root@linux3 ~]# rpm -Uvh oracleasm-2.6.18-128.el5-2.0.5-1.el5.i686.rpm \
oracleasmlib-2.0.4-1.el5.i386.rpm \
oracleasm-support-2.1.3-1.el5.i386.rpm
Preparing...                ########################################### [100%]
   1:oracleasm-support      ########################################### [ 33%]
   2:oracleasm-2.6.18-128.el########################################### [ 67%]
   3:oracleasmlib           ########################################### [100%]
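
As a quick sanity check, verify that all three ASMLib packages registered with the RPM database on the new node:

[root@linux3 ~]# rpm -qa | grep '^oracleasm' | sort
oracleasm-2.6.18-128.el5-2.0.5-1.el5
oracleasm-support-2.1.3-1.el5
oracleasmlib-2.0.4-1.el5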


Configure and Load the ASMLib 2.0 Packages

After downloading and installing the ASMLib 2.0 Packages for Linux, we now need to configure and load the ASM kernel module. Run the following as root on the new Oracle RAC node:
[root@linux3 ~]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library
driver.  The following questions will determine whether the driver is
loaded on boot and what permissions it will have.  The current values
will be shown in brackets ('[]').  Hitting <ENTER> without typing an
answer will keep that current value.  Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: oinstall
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: done
Initializing the Oracle ASMLib driver: [  OK  ]
Scanning the system for Oracle ASMLib disks: [  OK  ]


Scan for ASM Disks

From the new Oracle RAC node, you can now perform a scandisk to recognize the current volumes. Even though the above configuration automatically ran the scandisk utility, I still like to manually perform this step!
[root@linux3 ~]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [  OK  ]

We can now test that the ASM disks were successfully identified using the following command as the root user account:

[root@linux3 ~]# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
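
Optionally, each volume can be queried individually to confirm that ASMLib recognizes it as a marked disk and to show the underlying block device. The querydisk option is part of the oracleasm init script shipped with oracleasm-support:

[root@linux3 ~]# for v in VOL1 VOL2 VOL3 VOL4; do /etc/init.d/oracleasm querydisk $v; done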



Pre-Installation Tasks for Oracle10g Release 2


  Perform the following checks on the new Oracle RAC node and run the Oracle Cluster Verification Utility (CVU) from linux1!

Before propagating the Oracle Clusterware and Oracle RAC software to the new Oracle RAC node, it is highly recommended to run the Cluster Verification Utility (CVU) against all Oracle RAC nodes (which will include the new Oracle RAC node) to verify the hardware and software configuration. CVU is a command-line utility provided on the Oracle Clusterware installation media. It is responsible for performing various system checks to assist you with confirming the Oracle RAC nodes are properly configured for Oracle Clusterware and Oracle Real Application Clusters installation. The CVU only needs to be run from the node you will be performing the Oracle installations from (linux1 in this article).

Prerequisites for Using Cluster Verification Utility

Install cvuqdisk RPM (RHEL Users Only)

The first prerequisite for running the CVU pertains to users running Oracle Linux, Red Hat Linux, CentOS, and SuSE. If you are using any of these operating systems, you must download and install the cvuqdisk package on the new Oracle RAC node (linux3). Without cvuqdisk, CVU will be unable to discover shared disks and you will receive the error message "Package cvuqdisk not installed" when you run CVU.

The cvuqdisk RPM can be found on the Oracle Clusterware installation media in the rpm directory. For the purpose of this article, the Oracle Clusterware media was extracted to the /home/oracle/orainstall/clusterware directory on linux1. Note that before installing the cvuqdisk RPM, we need to set an environment variable named CVUQDISK_GRP to point to the group that will own the cvuqdisk utility. The default group is oinstall which is the group we are using for the oracle UNIX user account in this article.

Locate and copy the cvuqdisk RPM from linux1 to linux3 as the "oracle" user account:

[oracle@linux1 ~]$ ssh linux3 "mkdir -p /home/oracle/orainstall/clusterware/rpm"
[oracle@linux1 ~]$ scp /home/oracle/orainstall/clusterware/rpm/cvuqdisk-1.0.1-1.rpm linux3:/home/oracle/orainstall/clusterware/rpm

Perform the following steps as the "root" user account on the new Oracle RAC node to install the cvuqdisk RPM:

[root@linux3 ~]# cd /home/oracle/orainstall/clusterware/rpm
[root@linux3 ~]# CVUQDISK_GRP=oinstall; export CVUQDISK_GRP

[root@linux3 ~]# rpm -iv cvuqdisk-1.0.1-1.rpm
Preparing packages for installation...
cvuqdisk-1.0.1-1

[root@linux3 ~]# ls -l /usr/sbin/cvuqdisk
-rwsr-x--- 1 root oinstall 4168 Jun  2  2005 /usr/sbin/cvuqdisk

Verify Remote Access / User Equivalence

The CVU should be run from linux1 — the node we will be extending the Oracle software from. Before running CVU, login as the oracle user account and verify remote access / user equivalence is configured to all nodes in the cluster. When using the secure shell method, user equivalence will need to be enabled for the terminal shell session before attempting to run the CVU. To enable user equivalence for the current terminal shell session, perform the following steps remembering to enter the pass phrase for each key that you generated when prompted:

[oracle@linux1 ~]$ exec /usr/bin/ssh-agent $SHELL
[oracle@linux1 ~]$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)

Checking Pre-Installation Tasks for CRS with CVU

Once all prerequisites for running the CVU utility have been met, we can now check that all pre-installation tasks for Oracle Clusterware are completed by executing the following command as the "oracle" UNIX user account (with user equivalence enabled) from linux1:

[oracle@linux1 ~]$ cd /home/oracle/orainstall/clusterware/cluvfy
[oracle@linux1 ~]$ mkdir -p jdk14
[oracle@linux1 ~]$ unzip jrepack.zip -d jdk14
[oracle@linux1 ~]$ CV_HOME=/home/oracle/orainstall/clusterware/cluvfy; export CV_HOME
[oracle@linux1 ~]$ CV_JDKHOME=/home/oracle/orainstall/clusterware/cluvfy/jdk14; export CV_JDKHOME
[oracle@linux1 ~]$ ./runcluvfy.sh stage -pre crsinst -n linux1,linux2,linux3 -verbose

Review the CVU report. Note that there are several errors you may ignore in this report.

If your system only has 1GB of RAM, you may receive an error during the "Total memory" check:

Check: Total memory
  Node Name     Available                 Required                  Comment
  ------------  ------------------------  ------------------------  ----------
  linux3        1009.65MB (1033880KB)     1GB (1048576KB)           failed
  linux2        1009.65MB (1033880KB)     1GB (1048576KB)           failed
  linux1        1009.65MB (1033880KB)     1GB (1048576KB)           failed
Result: Total memory check failed.

As you can see from the output above, the requirement is for 1GB of memory (1048576 KB). Although each Oracle RAC node may have 1GB of physical memory installed, the Linux kernel reports it as 1033880 KB, which comes up 14696 KB short. This can be considered close enough and it is safe to continue with the installation. As I mentioned earlier in this article, I highly recommend that all Oracle RAC nodes have 2GB of RAM or more for performance reasons.
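
If you want to see the exact figure the kernel reports on each node (the number CVU compares against the 1048576 KB requirement), check /proc/meminfo; for example, from linux1 with user equivalence enabled:

[oracle@linux1 ~]$ for node in linux1 linux2 linux3; do ssh $node "grep MemTotal /proc/meminfo"; done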

The next error concerns finding a suitable set of interfaces for VIPs and can be safely ignored. This is a bug documented in Metalink Note 338924.1:

Suitable interfaces for the private interconnect on subnet "192.168.2.0":
linux3 eth1:192.168.2.107
linux2 eth1:192.168.2.101
linux1 eth1:192.168.2.100

ERROR:
Could not find a suitable set of interfaces for VIPs.

Result: Node connectivity check failed.

As documented in the note, this error can be safely ignored.

The last set of errors that can be ignored deal with specific RPM package versions that are not required with CentOS 5. While these packages may be listed as missing in the CVU report, please ensure that the correct versions of the compat-* packages are installed on the new Oracle RAC node (in this case, the same versions already present on linux1 and linux2).
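
A simple way to see which compat-* packages are actually installed on the new node (to compare against the versions reported by the CVU) is to query the RPM database:

[root@linux3 ~]# rpm -qa | grep '^compat-' | sort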

Checking the Hardware and Operating System Setup with CVU

The next CVU check to run will verify the hardware and operating system setup. Again, run the following as the "oracle" UNIX user account from linux1:

[oracle@linux1 ~]$ cd /home/oracle/orainstall/clusterware/cluvfy
[oracle@linux1 ~]$ ./runcluvfy.sh stage -post hwos -n linux1,linux2,linux3 -verbose

Review the CVU report. As with the previous check (pre-installation tasks for CRS), the check for finding a suitable set of interfaces for VIPs will fail and can be safely ignored.

Also note you may receive warnings in the "Checking shared storage accessibility..." portion of the report:

Checking shared storage accessibility...

WARNING:
Unable to determine the sharedness of /dev/sde on nodes:
        linux3,linux3,linux3,linux3,linux3,linux2,linux2,linux2,linux2,linux2,linux1,linux1,linux1,linux1,linux1


Shared storage check failed on nodes "linux3,linux2,linux1".

If this occurs, this too can be safely ignored. While we know the disks are visible and shared from all of our Oracle RAC nodes in the cluster, the check itself may fail. Several reasons for this have been documented. The first came from Metalink indicating that cluvfy currently does not work with devices other than SCSI devices. This would include devices like EMC PowerPath and volume groups like those from Openfiler. At the time of this writing, no workaround exists other than to use manual methods for detecting shared devices. Another reason for this error was documented by Bane Radulovic at Oracle Corporation. His research shows that CVU calls smartctl on Linux, and the problem is that smartctl does not return the serial number from our iSCSI devices. For example, a check against /dev/sde shows:

[root@linux3 ~]# /usr/sbin/smartctl -i /dev/sde
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: Openfile Virtual disk     Version: 0
Serial number:
Device type: disk
Local Time is: Mon Sep  3 02:02:53 2007 EDT
Device supports SMART and is Disabled
Temperature Warning Disabled or Not Supported

At the time of this writing, it is unknown if the Openfiler developers have plans to fix this.
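
Given that the CVU sharedness check is unreliable with this storage configuration, a reasonable manual sanity check is simply to confirm that every Oracle RAC node is logged into the same iSCSI targets and sees the same OCFS2 and ASM volumes. A sketch using the open-iscsi administration utility and the checks already performed in this article:

[root@linux3 ~]# iscsiadm -m session
[root@linux3 ~]# mount | grep ocfs2
[root@linux3 ~]# /etc/init.d/oracleasm listdisks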



Extend Oracle Clusterware Software to the New Node


  Extend the Oracle Clusterware software to the new Oracle RAC node from linux1!


Overview

In this section, we will extend the current Oracle RAC database by adding the new Oracle RAC node linux3. The new node will need to be added to the cluster at the clusterware layer so that the other nodes in the RAC cluster consider it to be part of the cluster.

When you extend an Oracle RAC database, you must first extend the Oracle Clusterware home to the new node and then extend the Oracle Database home. In other words, you extend the software onto the new node in the same order as you installed the clusterware and Oracle database components on the existing nodes.

Oracle Clusterware is already installed on the cluster. The task in this section is to add the new Oracle RAC node to the clustered configuration. This is done by executing the Oracle provided utility addNode.sh from one of the existing nodes in the cluster; namely linux1. This script is located in the Oracle Clusterware's home oui/bin directory (/u01/app/crs/oui/bin). During the add node process, the shared Oracle Cluster Registry file and Voting Disk will be updated with the information regarding the new node.


Verifying Terminal Shell Environment

Before starting the Oracle Universal Installer, you should first verify that you are logged onto the server you will be running the installer from (i.e. linux1), and then run the xhost command as root from the console to allow X Server connections. Next, login as the oracle user account. If you are using a remote client to connect to the node performing the installation (SSH / Telnet to linux1 from a workstation configured with an X Server), you will need to set the DISPLAY variable to point to your local workstation. Finally, verify remote access / user equivalence to all nodes in the cluster:

Verify Server and Enable X Server Access

[root@linux1 ~]# hostname
linux1

[root@linux1 ~]# xhost +
access control disabled, clients can connect from any host

Login as the oracle User Account and Set DISPLAY (if necessary)

[root@linux1 ~]# su - oracle

[oracle@linux1 ~]$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
[oracle@linux1 ~]$ # NODE PERFORMING THE INSTALL
[oracle@linux1 ~]$ DISPLAY=<your local workstation>:0.0
[oracle@linux1 ~]$ export DISPLAY

Verify Remote Access / User Equivalence

Verify you are able to run the Secure Shell commands (ssh or scp) on the Linux server you will be running the Oracle Universal Installer from against the new Oracle RAC node without being prompted for a password. When using the secure shell method, user equivalence will need to be enabled on any new terminal shell session before attempting to run the OUI. To enable user equivalence for the current terminal shell session, perform the following steps remembering to enter the pass phrase for the RSA key you generated when prompted:

[oracle@linux1 ~]$ exec /usr/bin/ssh-agent $SHELL
[oracle@linux1 ~]$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)

[oracle@linux1 ~]$ ssh linux1 "date;hostname"
Thu Sep  3 13:08:19 EDT 2009
linux1

[oracle@linux1 ~]$ ssh linux3 "date;hostname"
Thu Sep  3 13:08:42 EDT 2009
linux3


Configure Oracle Clusterware on the New Node

The next step is to configure Oracle Clusterware on the new Oracle RAC node linux3. As previously mentioned, this is performed by executing the new addNode.sh utility located in the Oracle Clusterware's home oui/bin directory (/u01/app/crs/oui/bin) from linux1:

[oracle@linux1 ~]$ hostname
linux1

[oracle@linux1 ~]$ id -a
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper)

[oracle@linux1 ~]$ cd /u01/app/crs/oui/bin
[oracle@linux1 ~]$ ./addNode.sh

Screen Name / Response:

   Welcome Screen: Click Next.

   Specify Cluster Nodes to Add to Installation: In this screen, the OUI lists all existing nodes in the top portion labeled "Existing Nodes". On the bottom half of the screen labeled "Specify New Nodes", enter the information for the new node in the appropriate fields:

       Public Node Name: linux3
       Private Node Name: linux3-priv
       Virtual Node Name: linux3-vip

   Click Next to continue.

   Cluster Node Addition Summary: Verify the new Oracle RAC node is listed under the "New Nodes" drill-down. Click Install to start the installation!

   Execute Configuration Scripts: Once all of the required Oracle Clusterware components have been copied from linux1 to linux3, the OUI prompts to execute three files as described in the following sections.


From linux3

Navigate to the /u01/app/oracle/oraInventory directory on linux3 and run orainstRoot.sh as the "root" user account.


From linux1

Important: As documented in Metalink Note 392415.1, the rootaddnode.sh script (which is run in this section) may error out at the end with "Connection refused" (PRKC-1044) when trying to add a new node to the cluster. This error occurs because the "oracle" user account on the node running the rootaddnode.sh script is set up with SSH for remote access to the new node and has a non-empty SSH passphrase. Note that for obvious security reasons, the "oracle" user account is typically set up with a non-empty passphrase for its SSH keys and would thus be susceptible to this error. The rootaddnode.sh script uses SSH to check remote node connectivity from linux1 to linux3. If it receives any prompt, it considers SSH to be improperly configured and will use rsh instead. If rsh is not configured, it will then error out with "Connection refused". If you are using SSH for user equivalence (as I am in this article), you will need to temporarily define an empty RSA passphrase for the "oracle" user account on linux1 as follows:

[oracle@linux1 ~]$ ssh-keygen -p
Enter file in which the key is (/home/oracle/.ssh/id_rsa):
Enter old passphrase: [OLD PASSPHRASE]
Key has comment '/home/oracle/.ssh/id_rsa'
Enter new passphrase (empty for no passphrase): [JUST HIT ENTER WITHOUT ENTERING A PASSPHRASE]
Enter same passphrase again: [JUST HIT ENTER WITHOUT ENTERING A PASSPHRASE]
Your identification has been saved with the new passphrase.

After temporarily defining an empty rsa passphrase for the "oracle" user account, navigate to the /u01/app/crs/install directory on linux1 and run rootaddnode.sh as the "root" user account. The rootaddnode.sh script will add the new node information to the Oracle Cluster Registry (OCR) file using the srvctl utility.

After running the rootaddnode.sh script from linux1, you can set your passphrase back to the old passphrase using the same "ssh-keygen -p" command.


From linux3

Finally, navigate to the /u01/app/crs directory on linux3 and run root.sh as the "root" user account.

If the Oracle Clusterware home directory is a subdirectory of the ORACLE_BASE directory (which should never be!), you will receive several warnings regarding permissions while running the root.sh script on the new node. These warnings can be safely ignored.

The root.sh script may take a while to run. With Oracle Clusterware version 10.2.0.4, the root.sh script should complete successfully.

With Oracle version 10.2.0.1, you may receive critical errors when running root.sh on linux3. Use the troubleshooting methods described in the parent article:

Running root.sh on the Last Node will Fail!

If the vipca failed to run (10.2.0.1 users), re-run vipca (GUI) manually as root from linux3 (the node where the error occurred). Please keep in mind that vipca is a GUI and you will need to set your DISPLAY variable to point to your X server:

[root@linux3 ~]# $ORA_CRS_HOME/bin/vipca

When the "VIP Configuration Assistant" appears, this is how I answered the screen prompts:

   Welcome: Click Next
   Network interfaces: Select only the public interface - eth0
   Virtual IPs for cluster nodes:
       Node Name: linux1
       IP Alias Name: linux1-vip
       IP Address: 192.168.1.200
       Subnet Mask: 255.255.255.0

       Node Name: linux2
       IP Alias Name: linux2-vip
       IP Address: 192.168.1.201
       Subnet Mask: 255.255.255.0

       Node Name: linux3
       IP Alias Name: linux3-vip
       IP Address: 192.168.1.207
       Subnet Mask: 255.255.255.0

   Summary: Click Finish
   Configuration Assistant Progress Dialog: Click OK after configuration is complete.
   Configuration Results: Click Exit

Go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.

End of Installation: At the end of the installation, exit from the OUI.


Verify Oracle Clusterware Installation

After extending Oracle Clusterware to the new node, we can run through several tests to verify the install was successful. Run the following commands on the new Oracle RAC node (linux3) as the "oracle" user account:

Check Cluster Nodes

[oracle@linux3 ~]$ $ORA_CRS_HOME/bin/olsnodes -n
linux1  1
linux2  2
linux3  3
Confirm Oracle Clusterware Function
[oracle@linux3 ~]$ $ORA_CRS_HOME/bin/crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.racdb.db   application    0/0    0/1    ONLINE    ONLINE    linux2
ora....b1.inst application    0/5    0/0    ONLINE    ONLINE    linux1
ora....b2.inst application    0/5    0/0    ONLINE    ONLINE    linux2
ora....srvc.cs application    0/0    0/1    ONLINE    ONLINE    linux1
ora....db1.srv application    0/0    0/0    ONLINE    ONLINE    linux1
ora....db2.srv application    0/0    0/0    ONLINE    ONLINE    linux2
ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    linux1
ora....E1.lsnr application    0/5    0/0    ONLINE    ONLINE    linux1
ora....de1.gsd application    0/5    0/0    ONLINE    ONLINE    linux1
ora....de1.ons application    0/3    0/0    ONLINE    ONLINE    linux1
ora....de1.vip application    0/0    0/0    ONLINE    ONLINE    linux1
ora....SM2.asm application    0/5    0/0    ONLINE    ONLINE    linux2
ora....E2.lsnr application    0/5    0/0    ONLINE    ONLINE    linux2
ora....de2.gsd application    0/5    0/0    ONLINE    ONLINE    linux2
ora....de2.ons application    0/3    0/0    ONLINE    ONLINE    linux2
ora....de2.vip application    0/0    0/0    ONLINE    ONLINE    linux2
ora....de3.gsd application    0/5    0/0    ONLINE    ONLINE    linux3
ora....de3.ons application    0/3    0/0    ONLINE    ONLINE    linux3
ora....de3.vip application    0/0    0/0    ONLINE    ONLINE    linux3
Check CRS Status
[oracle@linux3 ~]$ $ORA_CRS_HOME/bin/crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
Check Oracle Clusterware Auto-Start Scripts on New Node (linux3)
[oracle@linux3 ~]$ ls -l /etc/init.d/init.*
-rwxr-xr-x 1 root root  2236 Sep  3 13:19 /etc/init.d/init.crs*
-rwxr-xr-x 1 root root  4926 Sep  3 13:19 /etc/init.d/init.crsd*
-rwxr-xr-x 1 root root 53446 Sep  3 13:19 /etc/init.d/init.cssd*
-rwxr-xr-x 1 root root  3208 Sep  3 13:19 /etc/init.d/init.evmd*
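
You can also confirm that the node applications (VIP, GSD, and ONS) registered for the new node are online using srvctl from any node in the cluster:

[oracle@linux3 ~]$ srvctl status nodeapps -n linux3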



Extend Oracle Database Software to the New Node


  Extend the Oracle Database software to the new Oracle RAC node from linux1!


Overview

After copying and configuring the Oracle Clusterware software to the new node, we now need to copy the Oracle Database software from one of the existing nodes to linux3. This is done by executing the Oracle provided utility addNode.sh from one of the existing nodes in the cluster; namely linux1. This script is located in the $ORACLE_HOME/oui/bin directory (/u01/app/oracle/product/10.2.0/db_1/oui/bin).


Verifying Terminal Shell Environment

As discussed in the previous section, the terminal shell environment needs to be configured for remote access and user equivalence to the new Oracle RAC node before running the Oracle Universal Installer. Note that you can utilize the same terminal shell session used in the previous section, in which case you do not have to perform any of the actions described below with regard to setting up remote access and the DISPLAY variable:

Login as the oracle User Account and Set DISPLAY (if necessary)

[root@linux1 ~]# su - oracle

[oracle@linux1 ~]$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
[oracle@linux1 ~]$ # NODE PERFORMING THE INSTALL
[oracle@linux1 ~]$ DISPLAY=<your local workstation>:0.0
[oracle@linux1 ~]$ export DISPLAY

Verify Remote Access / User Equivalence

Verify you are able to run the Secure Shell commands (ssh or scp) on the Linux server you will be running the Oracle Universal Installer from against the new Oracle RAC node without being prompted for a password. When using the secure shell method, user equivalence will need to be enabled on any new terminal shell session before attempting to run the OUI. To enable user equivalence for the current terminal shell session, perform the following steps remembering to enter the pass phrase for the RSA key you generated when prompted:

[oracle@linux1 ~]$ exec /usr/bin/ssh-agent $SHELL
[oracle@linux1 ~]$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)

[oracle@linux1 ~]$ ssh linux1 "date;hostname"
Thu Sep  3 14:19:17 EDT 2009
linux1

[oracle@linux1 ~]$ ssh linux3 "date;hostname"
Thu Sep  3 14:19:43 EDT 2009
linux3


Install Oracle Database Software on the New Node

Copy the Oracle Database software to the new Oracle RAC node linux3. As previously mentioned, this is performed by executing the new addNode.sh utility located in the $ORACLE_HOME/oui/bin directory from linux1:
[oracle@linux1 ~]$ hostname
linux1

[oracle@linux1 ~]$ id -a
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper)

[oracle@linux1 ~]$ cd /u01/app/oracle/product/10.2.0/db_1/oui/bin
[oracle@linux1 ~]$ ./addNode.sh

Screen Name / Response:

   Welcome Screen: Click Next.

   Specify Cluster Nodes to Add to Installation: In this screen, the OUI lists all of the nodes already part of the installation in the top portion labeled "Existing Nodes". On the bottom half of the screen labeled "Specify New Nodes" is a list of new nodes which can be added. By default linux3 is selected. Verify linux3 is selected (checked) and click Next to continue.

   Cluster Node Addition Summary: Verify the new Oracle RAC node is listed under the "New Nodes" drill-down. Click Install to start the installation!

   Execute Configuration Scripts: Once all of the required Oracle Database components have been copied from linux1 to linux3, the OUI prompts to execute root.sh on the new Oracle RAC node. Navigate to the /u01/app/oracle/product/10.2.0/db_1 directory on linux3 and run root.sh as the "root" user account.

   After running the root.sh script on the new Oracle RAC node, go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.

   End of Installation: At the end of the installation, exit from the OUI.



Add Listener to New Node


  Perform the following configuration procedures from only one of the Oracle RAC nodes in the cluster (linux1)! The Network Configuration Assistant (NETCA) will setup the TNS listener in a clustered configuration to include the new node in the cluster.


Overview

In this section, you will use the Network Configuration Assistant (NETCA) to setup the TNS listener in a clustered configuration to include the new Oracle RAC node. The NETCA program will be run from linux1 with user equivalence enabled to all nodes in the cluster.


Verifying Terminal Shell Environment

As discussed in the previous section, the terminal shell environment needs to be configured for remote access and user equivalence to the new Oracle RAC node before running the NETCA. Note that you can utilize the same terminal shell session used in the previous section, in which case you do not have to perform any of the actions described below with regard to setting up remote access and the DISPLAY variable:

Login as the oracle User Account and Set DISPLAY (if necessary)

[root@linux1 ~]# su - oracle

[oracle@linux1 ~]$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
[oracle@linux1 ~]$ # NODE PERFORMING THE INSTALL
[oracle@linux1 ~]$ DISPLAY=<your local workstation>:0.0
[oracle@linux1 ~]$ export DISPLAY

Verify Remote Access / User Equivalence

Verify you are able to run the Secure Shell commands (ssh or scp) on the Linux server you will be running the NETCA from against all other Linux servers in the cluster without being prompted for a password. When using the secure shell method, user equivalence will need to be enabled on any new terminal shell session before attempting to run the NETCA. To enable user equivalence for the current terminal shell session, perform the following steps remembering to enter the pass phrase for the RSA key you generated when prompted:
[oracle@linux1 ~]$ exec /usr/bin/ssh-agent $SHELL
[oracle@linux1 ~]$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)

[oracle@linux1 ~]$ ssh linux1 "date;hostname"
Thu Sep  3 14:19:17 EDT 2009
linux1

[oracle@linux1 ~]$ ssh linux3 "date;hostname"
Thu Sep  3 14:19:43 EDT 2009
linux3


Run the Network Configuration Assistant

To start the NETCA, run the following from linux1:
[oracle@linux1 ~]$ netca &

The following table walks you through the process of reconfiguring the TNS listeners in a clustered configuration to include the new node.

Screen Name / Response:

   Select the Type of Oracle Net Services Configuration: Select Cluster configuration.

   Select the nodes to configure: Only select the new Oracle RAC node: linux3.

   Type of Configuration: Select Listener configuration.

   Listener Configuration - Next 6 Screens: The following screens are now like any other normal listener configuration. You can simply accept the default parameters for the next six screens:

       What do you want to do: Add
       Listener name: LISTENER
       Selected protocols: TCP
       Port number: 1521
       Configure another listener: No
       Listener configuration complete! [ Next ]

   You will be returned to this Welcome (Type of Configuration) screen.

   Type of Configuration: Select Naming Methods configuration.

   Naming Methods Configuration: The following screens are:

       Selected Naming Methods: Local Naming
       Naming Methods configuration complete! [ Next ]

   You will be returned to this Welcome (Type of Configuration) screen.

   Type of Configuration: Click Finish to exit the NETCA.


Verify TNS Listener Configuration

The Oracle TNS listener process should now be running on all three nodes in the RAC cluster:
[oracle@linux1 ~]$ hostname
linux1

[oracle@linux1 ~]$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX1

[oracle@linux1 ~]$ $ORA_CRS_HOME/bin/crs_stat ora.linux1.LISTENER_LINUX1.lsnr
NAME=ora.linux1.LISTENER_LINUX1.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on linux1

=====================

[oracle@linux2 ~]$ hostname
linux2

[oracle@linux2 ~]$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX2

[oracle@linux2 ~]$ $ORA_CRS_HOME/bin/crs_stat ora.linux2.LISTENER_LINUX2.lsnr
NAME=ora.linux2.LISTENER_LINUX2.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on linux2

=====================

[oracle@linux3 ~]$ hostname
linux3

[oracle@linux3 ~]$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX3

[oracle@linux3 ~]$ $ORA_CRS_HOME/bin/crs_stat ora.linux3.LISTENER_LINUX3.lsnr
NAME=ora.linux3.LISTENER_LINUX3.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on linux3
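
For a more detailed view of the new listener (the endpoints it is listening on and the services it knows about), lsnrctl can be queried directly on linux3:

[oracle@linux3 ~]$ lsnrctl status LISTENER_LINUX3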



Add Database Instance to the New Node


  Add the new Oracle instance to the new Oracle RAC node using DBCA!


Overview

The final step in extending the Oracle RAC database is to add a new database instance to the new Oracle RAC node. The database instance will be named racdb3 and hosted on the new node linux3. This process can be performed using either Enterprise Manager or the Database Configuration Assistant (DBCA). For the purpose of this article, I am opting to use the DBCA.

Before executing the DBCA, make certain that $ORACLE_HOME and $PATH are set appropriately for the $ORACLE_BASE/product/10.2.0/db_1 environment.

You should also verify that all services we have installed up to this point (Oracle TNS listener, Oracle Clusterware processes, etc.) are running before attempting to start the clustered database creation process.

The DBCA program will be run from linux1 with user equivalence enabled to all nodes in the cluster.


Verifying Terminal Shell Environment

As discussed in the previous section, the terminal shell environment needs to be configured for remote access and user equivalence to the new Oracle RAC node before running the DBCA. Note that you can utilize the same terminal shell session used in the previous section, in which case you do not have to perform any of the actions described below with regard to setting up remote access and the DISPLAY variable:

Login as the oracle User Account and Set DISPLAY (if necessary)

[root@linux1 ~]# su - oracle

[oracle@linux1 ~]$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
[oracle@linux1 ~]$ # NODE PERFORMING THE INSTALL
[oracle@linux1 ~]$ DISPLAY=<your local workstation>:0.0
[oracle@linux1 ~]$ export DISPLAY

Verify Remote Access / User Equivalence

Verify you are able to run the Secure Shell commands (ssh or scp) on the Linux server you will be running the DBCA from against all other Linux servers in the cluster without being prompted for a password. When using the secure shell method, user equivalence will need to be enabled on any new terminal shell session before attempting to run the DBCA. To enable user equivalence for the current terminal shell session, perform the following steps remembering to enter the pass phrase for the RSA key you generated when prompted:
[oracle@linux1 ~]$ exec /usr/bin/ssh-agent $SHELL
[oracle@linux1 ~]$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)

[oracle@linux1 ~]$ ssh linux1 "date;hostname"
Thu Sep  3 14:19:17 EDT 2009
linux1

[oracle@linux1 ~]$ ssh linux3 "date;hostname"
Thu Sep  3 14:19:43 EDT 2009
linux3


Add Database Instance to New Node

To start the database instance creation process for the new Oracle RAC node, run the following from linux1:

[oracle@linux1 ~]$ dbca &
Screen Name / Response:

   Welcome Screen: Select Oracle Real Application Clusters database.

   Operations: Select Instance Management.

   Instance Management: Select Add an instance.

   List of cluster databases: Provides a list of clustered databases running on the node. For the purpose of this example, the clustered database running on node linux1 is racdb. Select this clustered database.

   At the bottom of this screen, the DBCA requests you to "Specify a user with SYSDBA system privileges":

       Username: sys
       Password: <sys_password>

   Click Next to continue.

   List of cluster database instances: This screen provides a list of all instances currently available on the cluster, their status, and which node they reside on.

   Verify this list is correct and click Next to continue.

   Instance naming and node selection: This screen lists the next instance name in the series and requests the node on which to add the instance. In this example, the next instance name is racdb3 and the node to create it on is linux3. For this example, the default values are correct (instance name "racdb3" to be added to node "linux3"). After verifying these values, click Next to continue.

   After clicking Next, there will be a small pause before the next screen appears as the DBCA determines the current state of the new node and what services (if any) are configured on the existing nodes.

   Database Services: If the current clustered database has any database services defined, the next screen allows the DBA to configure those database services for the new instance. In this example, the existing clustered database has one service defined named racdb_srvc. With the "racdb_srvc" database service selected, set the details to Preferred for the new instance (racdb3) and the "TAF Policy" to Basic.

   Instance Storage: By default, the DBCA does a good job of determining the instance-specific files such as an UNDO tablespace (UNDOTBS3), database files for this tablespace, and two redo log groups. Verify the storage options and click Finish to add the instance.

   Database Configuration Assistant - Summary: After verifying the instance creation options in the summary dialog, click OK to begin the instance management process.

   Extend ASM: During the add instance step, the DBCA verifies the new node and then checks to determine if ASM is present on the existing cluster (which in this example, ASM is configured). The DBCA presents a dialog box indicating that "ASM is present on the cluster but needs to be extended to the following nodes: [linux3]. Do you want ASM to be extended?" Click Yes to add the ASM instance to the new node.

   NOTE: In the previous section (Add Listener to New Node), I provided instructions to setup the TNS listener in a clustered configuration to include the new Oracle RAC node using NETCA. If the listener is not yet configured on the new Oracle RAC node, the DBCA will prompt the user with a dialog asking to configure a new listener using port 1521 and listener name "LISTENER_LINUX3". The TNS listener must be present and started on the new Oracle RAC node in order to create and start the ASM instance on the new node.

   Database Configuration Assistant Progress Screen: A progress bar is displayed while the new instance is being configured. Once the instance management process is complete, the DBCA prompts the user with a dialog and the message "Do you want to perform another operation?" Click No to end and exit the DBCA utility.

   Start New Database Services: The DBCA will automatically start the new instance (racdb3) on the node linux3. If any services were configured during the instance management process, however, they are left in an offline state. For the purpose of this example, I had to manually start the "racdb_srvc" service for the database:

$ srvctl start service -s racdb_srvc -d racdb -i racdb3

When the Oracle Database Configuration Assistant has completed, you will have successfully extended the current Oracle RAC database!


Verify New Database Environment

Check Cluster Services

[oracle@linux1 ~]$ $ORA_CRS_HOME/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.racdb.db   application    ONLINE    ONLINE    linux2
ora....b1.inst application    ONLINE    ONLINE    linux1
ora....b2.inst application    ONLINE    ONLINE    linux2
ora....b3.inst application    ONLINE    ONLINE    linux3
ora....srvc.cs application    ONLINE    ONLINE    linux1
ora....db1.srv application    ONLINE    ONLINE    linux1
ora....db2.srv application    ONLINE    ONLINE    linux2
ora....db3.srv application    ONLINE    ONLINE    linux3
ora....SM1.asm application    ONLINE    ONLINE    linux1
ora....E1.lsnr application    ONLINE    ONLINE    linux1
ora....de1.gsd application    ONLINE    ONLINE    linux1
ora....de1.ons application    ONLINE    ONLINE    linux1
ora....de1.vip application    ONLINE    ONLINE    linux1
ora....SM2.asm application    ONLINE    ONLINE    linux2
ora....E2.lsnr application    ONLINE    ONLINE    linux2
ora....de2.gsd application    ONLINE    ONLINE    linux2
ora....de2.ons application    ONLINE    ONLINE    linux2
ora....de2.vip application    ONLINE    ONLINE    linux2
ora....SM3.asm application    ONLINE    ONLINE    linux3
ora....E3.lsnr application    ONLINE    ONLINE    linux3
ora....de3.gsd application    ONLINE    ONLINE    linux3
ora....de3.ons application    ONLINE    ONLINE    linux3
ora....de3.vip application    ONLINE    ONLINE    linux3

- or -

[oracle@linux1 ~]$ rac_crs_stat
HA Resource                                   Target     State
-----------                                   ------     -----
ora.racdb.db                                  ONLINE     ONLINE on linux2
ora.racdb.racdb1.inst                         ONLINE     ONLINE on linux1
ora.racdb.racdb2.inst                         ONLINE     ONLINE on linux2
ora.racdb.racdb3.inst                         ONLINE     ONLINE on linux3
ora.racdb.racdb_srvc.cs                       ONLINE     ONLINE on linux1
ora.racdb.racdb_srvc.racdb1.srv               ONLINE     ONLINE on linux1
ora.racdb.racdb_srvc.racdb2.srv               ONLINE     ONLINE on linux2
ora.racdb.racdb_srvc.racdb3.srv               ONLINE     ONLINE on linux3
ora.linux1.ASM1.asm                           ONLINE     ONLINE on linux1
ora.linux1.LISTENER_linux1.lsnr               ONLINE     ONLINE on linux1
ora.linux1.gsd                                ONLINE     ONLINE on linux1
ora.linux1.ons                                ONLINE     ONLINE on linux1
ora.linux1.vip                                ONLINE     ONLINE on linux1
ora.linux2.ASM2.asm                           ONLINE     ONLINE on linux2
ora.linux2.LISTENER_linux2.lsnr               ONLINE     ONLINE on linux2
ora.linux2.gsd                                ONLINE     ONLINE on linux2
ora.linux2.ons                                ONLINE     ONLINE on linux2
ora.linux2.vip                                ONLINE     ONLINE on linux2
ora.linux3.ASM3.asm                           ONLINE     ONLINE on linux3
ora.linux3.LISTENER_linux3.lsnr               ONLINE     ONLINE on linux3
ora.linux3.gsd                                ONLINE     ONLINE on linux3
ora.linux3.ons                                ONLINE     ONLINE on linux3
ora.linux3.vip                                ONLINE     ONLINE on linux3


Verify New Instance

Login to one of the instances and query the gv$instance view:
SQL> select inst_id, instance_name, status, to_char(startup_time, 'DD-MON-YYYY HH24:MI:SS')
  2  from gv$instance order by inst_id;

   INST_ID INSTANCE_NAME    STATUS       TO_CHAR(STARTUP_TIME
---------- ---------------- ------------ --------------------
         1 racdb1           OPEN         02-SEP-2009 17:27:52
         2 racdb2           OPEN         02-SEP-2009 17:28:57
         3 racdb3           OPEN         03-SEP-2009 14:47:57
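
As a final check, srvctl can be used to confirm that the racdb_srvc service is now running on all three instances, including the new racdb3 instance:

[oracle@linux1 ~]$ srvctl status service -d racdb -s racdb_srvc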


Update TNSNAMES

Login to all machines that will be accessing the new instance and update the tnsnames.ora file (if necessary).


Verify Enterprise Manager - Database Control

The DBCA should have updated and added the new node(s) to EM Database Control. Bring up a web browser and navigate to:

http://linux3:1158/em



About the Author

Jeffrey Hunter is an Oracle Certified Professional, Java Development Certified Professional, Author, and an Oracle ACE. Jeff currently works as a Senior Database Administrator for The DBA Zone, Inc. located in Pittsburgh, Pennsylvania. His work includes advanced performance tuning, Java and PL/SQL programming, developing high availability solutions, capacity planning, database security, and physical / logical database design in a UNIX, Linux, and Windows server environment. Jeff's other interests include mathematical encryption theory, programming language processors (compilers and interpreters) in Java and C, LDAP, writing web-based database administration tools, and of course Linux. He has been a Sr. Database Administrator and Software Engineer for over 18 years and maintains his own web site at: http://www.iDevelopment.info. Jeff graduated from Stanislaus State University in Turlock, California, with a Bachelor's degree in Computer Science.



Copyright (c) 1998-2014 Jeffrey M. Hunter. All rights reserved.

All articles, scripts and material located at the Internet address of http://www.idevelopment.info is the copyright of Jeffrey M. Hunter and is protected under copyright laws of the United States. This document may not be hosted on any other site without my express, prior, written permission. Application to host any of the material elsewhere can be made by contacting me at jhunter@idevelopment.info.

I have made every effort and taken great care in making sure that the material included on my web site is technically accurate, but I disclaim any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on it. I will in no case be liable for any monetary damages arising from such loss, damage or destruction.
