Oracle DBA Tips Corner
Add a Node to an Existing Oracle RAC 10g Release 2 Cluster on Linux - (CentOS 4.5 / iSCSI)
by Jeff Hunter, Sr. Database Administrator
Overview
This document is an extension to my article
"Building an Inexpensive Oracle RAC 10g Release 2 on Linux - (CentOS 4.5 / iSCSI)".
Contained in this new article are the steps required to add a single node
to an already running and configured
two-node Oracle RAC 10g Release 2 environment on the
CentOS 32-bit (x86) platform. Although this article was written and
tested on CentOS 4.5 Linux, it should work unchanged with
Red Hat Enterprise Linux 4 Update 5.
This article assumes the following:
Note: The current two-node Oracle RAC
environment has been upgraded from its base release (10.2.0.1.0) to
version 10.2.0.3.0 by applying the 5337014 patchset (p5337014_10203_LINUX.zip).
The patchset was applied to Oracle Clusterware and the Oracle Database
software. I also applied the one-off patchset - "BUNDLE Patch for Base Bug 6000740"
(MLR7 ON TOP OF 10.2.0.3) to the Oracle Clusterware and Oracle
Database software. The procedures for installing both patchsets are
not included in any of the parent article(s).
The following is a conceptual look at what the environment will look like
after adding the third Oracle RAC node (linux3) to the cluster:
Figure 1: Adding linux3 to the current Oracle RAC 10g Release 2 Environment
- Intel(R) Pentium(R) 4 Processor at 2.80GHz
Each Linux server for Oracle RAC should contain two NIC adapters.
The Dell Dimension includes an integrated 10/100 Ethernet adapter
that will be used to connect to the public network. The second NIC adapter
will be used for the private network (RAC interconnect and
Openfiler networked storage). Select the appropriate NIC adapter
that is compatible with the maximum data transmission speed of the
network switch to be used for the private network.
For the purpose of this article, I used a Gigabit Ethernet switch (and 1Gb Ethernet cards)
for the private network.
Used for RAC interconnect to linux1, linux2 and Openfiler networked storage.
Gigabit Ethernet
Install the Linux Operating System
After procuring the required hardware, it is time to start the configuration
process. The first task we need to perform is to install the
Linux operating system. As already mentioned, this article will use CentOS 4.5.
Although I have used Red Hat Fedora in the past, I wanted to switch
to a Linux environment that would guarantee all of the functionality
required by Oracle. This is where CentOS comes in.
The CentOS project takes the Red Hat Enterprise Linux 4 source RPMs and compiles
them into a free clone of the Red Hat Enterprise Linux 4 product. This provides
a free and stable version of the Red Hat Enterprise Linux 4 (AS/ES) operating environment that
I can now use for
testing different Oracle configurations. I have moved away from Fedora because I need a
stable environment that is not only
free, but also as close as possible to an operating system that Oracle actually supports.
While CentOS is not the only project that does this, I
tend to stick with it because it is stable and responds quickly to updates from Red Hat.
After downloading and burning the CentOS images (ISO files) to CD,
insert CentOS Disk #1 into the new Oracle RAC server (linux3 in this example), power it on,
and answer the installation screen prompts as noted below.
Boot Screen
If there was
a previous installation of Linux on this machine, the next screen
will ask if you want to "remove" or "keep" old partitions. Select the option
to [Remove all partitions on this system]. Also, ensure that the
[hda] drive is selected for this installation. I also keep the
checkbox [Review (and modify if needed) the partitions created] selected.
Click [Next] to continue.
You will then be prompted with a dialog window asking if you really
want to remove all partitions. Click [Yes] to acknowledge this warning.
The main concern during
the partitioning phase is to ensure enough swap space is allocated as
required by Oracle (which is a multiple of the available RAM).
The following is Oracle's requirement for swap space:
For the purpose of this install, I will accept
all automatically preferred sizes. (Including 2GB for swap since I have
2GB of RAM installed.)
If, for any reason, the automatic layout does not
configure an adequate amount of swap space, you can easily change that from this screen.
To increase the size of the swap partition, [Edit] the volume group VolGroup00.
This will bring up the "Edit LVM Volume Group: VolGroup00" dialog.
First, [Edit] and decrease the size of the root file system (/) by the amount you
want to add to the swap partition. For example, to add another 512MB to swap, you would
decrease the size of the root file system by 512MB (i.e. 36,032MB - 512MB = 35,520MB).
Now add the space you decreased from the root file system (512MB) to the swap
partition. When completed, click [OK] on the "Edit LVM Volume Group: VolGroup00"
dialog.
Once you are satisfied with the disk layout, click [Next] to continue.
First, make sure that each of the network devices is checked
[Active on boot]. The installer may choose
not to activate eth1 by default.
Second, [Edit] both eth0 and eth1 as follows. You may choose
to use different IP addresses for both eth0 and eth1 and that
is OK. Configure eth1 (the interconnect and storage network) on
a different subnet than eth0 (the public network):
eth0:
eth1:
Continue by setting your hostname manually. I used
"linux3" for this new Oracle RAC node.
Finish this dialog off by supplying your gateway and
DNS servers.
You will be prompted
with a warning dialog about not setting the firewall.
If this occurs, simply hit [Proceed] to continue.
Please note that the installation of Oracle does not require
all Linux packages to be installed. My decision to install
all packages was for the sake of brevity.
Please see section
"Pre-Installation Tasks for Oracle10g Release 2"
for a more detailed
look at the critical packages required for a successful
Oracle installation.
Also note that with some RHEL 4 distributions, you will not get the "Package Group Selection"
screen by default. There, you are asked to simply "Install default
software packages" or "Customize software packages to be installed".
Select the option to "Customize software packages to be installed"
and click [Next] to continue. This will then bring up the
"Package Group Selection" screen. Now, scroll down to the bottom of this screen and select
[Everything] under the "Miscellaneous" section. Click
[Next] to continue.
Note that with CentOS 4.5, the installer would ask to switch
to Disk #2, Disk #3, Disk #4, Disk #1, and then back to Disk #4.
When the system boots into Linux for the first time, it will prompt
you with another Welcome screen. The following wizard allows you to
configure the date and time, add any additional users, test the
sound card, and to install any additional CDs. The only screen I care
about is the time and date (and if you are using CentOS 4.x, the
monitor/display settings). As for the others, simply run through
them as there is nothing additional that needs to be installed (at this point
anyways!). If everything was successful, you should now be presented with the
login screen.
During the Linux O/S install we already configured the IP address and
host name for the new Oracle RAC node.
We now need to configure
the /etc/hosts file and adjust several of the
network settings for the interconnect.
All nodes in the RAC cluster should have one static IP address for the public network
and one static IP address for the private cluster interconnect. The
private interconnect should only be used by Oracle to transfer
Cluster Manager and Cache Fusion related data along with data for the
network storage server (Openfiler). Note that Oracle does not support
using the public network interface for the interconnect. You must
have one network interface for the public network and another network
interface for the private interconnect. For a production RAC implementation,
the interconnect should be at least Gigabit Ethernet and be used only by Oracle,
and the network storage server (Openfiler) should sit on a separate gigabit network.
The easiest way to configure network settings in Red Hat Linux is with the program
Network Configuration. This application can be started from the command-line
as the "root" user account as follows:
Using the Network Configuration application, we will need to configure
both NIC devices as well as the /etc/hosts file.
Please note that for the purpose of this example configuration
the
Our example configuration will use the following settings for all nodes:
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
Oracle strongly recommends adjusting the default and maximum send buffer size
(SO_SNDBUF socket option) to 256 KB, and the default and maximum receive
buffer size (SO_RCVBUF socket option) to 256 KB.
The receive buffers are used by TCP and UDP to hold received data until it is read by
the application. With TCP, the receive buffer cannot overflow because the peer is not allowed to
send data beyond the advertised window. UDP has no such flow control, so datagrams are
discarded if they do not fit in the socket receive buffer, which allows a fast sender to overwhelm
the receiver.
The default and maximum window size can be changed without a reboot.
Add the following entries to the /etc/sysctl.conf file
on the new Oracle RAC node:
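As a minimal sketch of those entries, the shell snippet below prints the four net.core settings at 256 KB (262144 bytes), matching the sizes described above. The helper name print_net_buffer_settings is hypothetical and is not part of the original listing.

```shell
# Print the socket buffer settings described above (256 KB each).
# Helper name is illustrative only.
print_net_buffer_settings() {
    buffer_bytes=$((256 * 1024))   # 256 KB expressed in bytes
    for key in rmem_default rmem_max wmem_default wmem_max; do
        echo "net.core.${key} = ${buffer_bytes}"
    done
}

print_net_buffer_settings
```

To apply the settings, the output could be appended to /etc/sysctl.conf as root (for example, print_net_buffer_settings >> /etc/sysctl.conf) and then loaded with sysctl -p.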
If UDP ICMP is blocked or rejected by the firewall, the Oracle Clusterware software will crash after several minutes
of running. When the Oracle Clusterware process fails, you will have something similar to the following
in the <machine_name>_evmocr.log file:
Configure Network Security on the Openfiler Storage Server
With the network now set up, the next step is to configure
network access in Openfiler so that the new Oracle RAC node
(linux3) has permissions to the shared iSCSI volumes
used in the current Oracle RAC 10g environment. For
the purpose of this example, all iSCSI traffic will use the
private network interface eth1 which in this article
is on the 192.168.2.0 network.
Openfiler administration is performed using the Openfiler Storage Control Center,
a browser-based tool accessed over an https connection on port 446. For example:
The first page the administrator sees is the
[Accounts] / [Authentication] screen. Configuring user accounts
and groups is not necessary for this article and will
therefore not be discussed.
To verify the iSCSI services are running, use the Openfiler Storage Control Center and
navigate to [Services] / [Enable/Disable]:
Another method is to SSH into the Openfiler server and
verify the iscsi-target service is running:
Again, this task can be completed using the Openfiler Storage Control Center
by navigating to [General] / [Local Networks]. The Local Networks screen allows
an administrator to set up networks and/or hosts that will be allowed to
access resources exported by the Openfiler appliance. For the purpose of this
article, we will want to add the new Oracle RAC node individually rather than allowing
the entire 192.168.2.0 network to have access to Openfiler resources.
When entering the new Oracle RAC node, note that the 'Name' field
is just a logical name used for reference only. As a convention when entering
nodes, I simply use the node name defined for that IP address.
Next, when entering the actual node in the 'Network/Host' field, always use its
IP address even though its host name may already be defined in your /etc/hosts
file or DNS. Lastly, when entering actual hosts in our Class C network, use
a subnet mask of 255.255.255.255.
It is important to remember that you will be entering the IP address
of the private network (eth1) for the new Oracle RAC node.
The following image shows the results of adding the new Oracle RAC node linux3
to the local network configuration:
To view the available iSCSI volumes from within the Openfiler Storage Control Center,
navigate to [Volumes] / [List of Existing Volumes]. There we will see all five
logical volumes within the volume group rac1:
From the Openfiler Storage Control Center, navigate to
[Volumes] / [List of Existing Volumes]. This will present the screen shown in the
previous section. For each of the five logical volumes, click on the 'Edit' link (under
the Properties column). This will bring up the 'Edit properties' screen for that
volume. Scroll to the bottom of this screen; change the access for host linux3-priv from
'Deny' to 'Allow' and click the 'Update' button. Perform this task for all five logical volumes.
An iSCSI client can be any system (Linux, Unix, MS Windows, Apple Mac, etc.) for which iSCSI
support (a driver) is available. In our case, the clients are the three Oracle RAC nodes
(linux1, linux2, and linux3), running CentOS 4.5.
In this section we will be configuring the iSCSI initiator on the new Oracle RAC node linux3.
This involves configuring the /etc/iscsi.conf file on the new Oracle
RAC node with the name of the network storage server (openfiler1) so it can
discover the current iSCSI volumes.
Use the following command to install the iscsi-initiator-utils RPM package if not present:
After verifying that the iscsi-initiator-utils RPM is installed, the only
configuration step required on the new Oracle RAC node (iSCSI client) is to specify the
network storage server (iSCSI server) in the /etc/iscsi.conf file.
Edit the /etc/iscsi.conf file and include an entry for
DiscoveryAddress which specifies the hostname of the Openfiler network storage server.
In our case that was:
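The entry itself is a single line. As a sketch, assuming the storage server is reached through a private host name such as openfiler1-priv (an assumed name; use whatever resolves to the Openfiler address on eth1), the snippet below builds a throwaway sample file and lists the discovery addresses it contains. The helper discovery_addresses and the sample file are illustrative only.

```shell
# List the DiscoveryAddress values found in an iscsi.conf-style file.
discovery_addresses() {
    sed -n 's/^[[:space:]]*DiscoveryAddress[[:space:]]*=[[:space:]]*//p' "$1"
}

# Demonstration against a temporary sample file rather than /etc/iscsi.conf.
sample_conf="$(mktemp)"
cat > "$sample_conf" <<'EOF'
# Openfiler network storage server (host name is an assumption)
DiscoveryAddress=openfiler1-priv
EOF

discovery_addresses "$sample_conf"
rm -f "$sample_conf"
```

On the real node, running discovery_addresses /etc/iscsi.conf after editing the file is a quick way to confirm the entry was saved before restarting the iSCSI initiator service.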
In this section, we simply want to verify that the new Oracle RAC node
was able to successfully discover the five logical iSCSI volumes
on the Openfiler server.
When the Openfiler server publishes available iSCSI targets,
configured clients get the message that new iSCSI disks are now available.
This happens when the iscsi-target service gets started/restarted on the Openfiler server
or when the iSCSI initiator service is started/restarted on the client.
We would see something like this in the client's /var/log/messages file:
Another method not
only checks for the existence of the iSCSI volumes, but also displays how the local SCSI device
names map to iSCSI targets' host IDs and LUNs. Use the following
script which was provided by
Martin Jones
to display these mappings:
Example run:
Create "oracle" User and Directories
I will be using the Oracle Cluster File System, Release 2 (OCFS2) to store the files required to be shared
for the Oracle Clusterware software.
When using OCFS2, the UID of the UNIX user "oracle" and GID of the UNIX group "oinstall" must be the same
on all of the Oracle RAC nodes in the cluster. If either the UID or GID is different, the files on the OCFS2 file system
will show up as "unowned" or may even be owned by a different user. For this article and its parent
article, I will use 501 for the "oracle" UID and 501 for the "oinstall" GID.
Note that members of the UNIX group oinstall are considered the "owners" of
the Oracle software.
Members of the dba group can administer Oracle databases, for example starting up and shutting
down databases. In this article, we are creating the oracle user account to have both
responsibilities!
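One quick way to confirm that the IDs match is to compare the oracle entry from the password database on each node. The sketch below extracts the UID and GID from a passwd-format line; the helper name uid_gid_of and the sample line (including its home directory and shell) are illustrative, while 501/501 are the values used in this article.

```shell
# Print "UID:GID" from a passwd-format line (fields 3 and 4).
uid_gid_of() {
    echo "$1" | awk -F: '{ print $3 ":" $4 }'
}

# Sample line for illustration; on a real node use:
#   uid_gid_of "$(getent passwd oracle)"
sample='oracle:x:501:501::/home/oracle:/bin/bash'
if [ "$(uid_gid_of "$sample")" = "501:501" ]; then
    echo "oracle UID/GID match the expected 501:501"
else
    echo "oracle UID/GID differ from 501:501" >&2
fi
```

Running the same check on linux1, linux2, and linux3 should print identical values on every node.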
The following assumes that the directories are being created in the root
file system. Please note that this is being done for the sake of
simplicity and is not recommended as a general practice. Normally, these
directories would be created on a separate file system.
After the directory is created, you must then specify the correct
owner, group, and permissions for it. Perform the following on the new Oracle
RAC node:
At the end of this procedure, you will have the following:
As noted in the previous section, the following assumes that the directories are being
created in the root file system. This is being done for the sake of simplicity
and is not recommended as a general practice. Normally, these directories would
be created on a separate file system.
After the directory is created, you must then specify the correct
owner, group, and permissions for it. Perform the following on the new Oracle
RAC node:
At the end of this procedure, you will have the following:
Perform the following on the new Oracle RAC node:
For this example, I used:
Login to the new Oracle RAC node as the oracle user account:
Configure the Linux Server for Oracle
The kernel parameters and shell limits discussed in this section will need to be defined on the new Oracle RAC node
every time the machine is booted. This section will not go into great depth in explaining the purpose
of those kernel parameters that are required by Oracle (These parameters are described in detail
in the
parent
to this article). Provided in this section, however, are instructions on how to set
all required kernel parameters for Oracle and how to have them enabled when the node boots.
Further instructions for configuring kernel parameters in
a startup script (/etc/sysctl.conf) are included in the section
"All Startup Commands for New Oracle RAC Node".
As root, make a file that will act as additional swap space, let's say about 500MB:
Now we should change the file permissions:
Finally we format the "partition" as swap and add it to the swap space:
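The three steps can be sketched as follows. The file path /tmp/tempswap and the helper names are assumptions for illustration. Because mkswap and swapon need root, they are wrapped in a function that is defined but not invoked here.

```shell
# Create a zero-filled file of the given size (in MB), readable only by its owner.
create_zero_file() {
    path="$1"; size_mb="$2"
    dd if=/dev/zero of="$path" bs=1024 count=$((size_mb * 1024)) 2>/dev/null
    chmod 600 "$path"
}

# Format the file as swap and enable it (run as root).
enable_swap_file() {
    mkswap "$1" && swapon "$1"
}

# Example, as root (path is hypothetical):
#   create_zero_file /tmp/tempswap 500 && enable_swap_file /tmp/tempswap
#   swapon -s    # list the active swap areas
```

Swap enabled this way lasts only until the next reboot unless the file is also listed in /etc/fstab.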
On the new Oracle RAC node, verify that the kernel parameters described in this section are
set to values greater than or equal to the recommended values. Also note that when
setting the four semaphore values, all four values need to be entered on one line.
To make these changes, run the following as root:
We could reboot at this point to ensure all of these parameters are set in the
kernel or we could simply "run" the /etc/sysctl.conf file by running the
following command as root on the new Oracle RAC node:
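The command in question is normally sysctl -p, which reloads /etc/sysctl.conf without a reboot. To illustrate the "greater than or equal" check, including a multi-value setting such as kernel.sem, here is a small sketch. The helper name meets_minimum is hypothetical, and the semaphore figures (250 32000 100 128) are the minimums as I recall them from Oracle's 10g Release 2 installation guide, so confirm them against your documentation.

```shell
# Succeed when every field of "current" is >= the matching field of "required".
# Works for single values and for multi-value settings such as kernel.sem.
meets_minimum() {
    echo "$1 | $2" | awk -F'|' '{
        n = split($1, cur, " "); m = split($2, req, " ")
        if (n != m) exit 1
        for (i = 1; i <= n; i++) if (cur[i] + 0 < req[i] + 0) exit 1
    }'
}

# Assumed minimums for semmsl, semmns, semopm, semmni (all four on one line).
required_sem="250 32000 100 128"
current_sem="$(cat /proc/sys/kernel/sem 2>/dev/null)"
if meets_minimum "$current_sem" "$required_sem"; then
    echo "kernel.sem OK: $current_sem"
else
    echo "kernel.sem below minimum (or unavailable): '$current_sem'"
fi

# After editing /etc/sysctl.conf, load it without a reboot (as root):
#   sysctl -p
```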
Please note that although this would seem like a severe error from the OUI, it can
safely be disregarded as a warning. The "tar" command DOES actually
extract the files; however, when you perform a listing of the files (using ls -l) on the remote node
(the new Oracle RAC node),
they will be missing the time field until the time on the remote server is greater than the
timestamp of the file.
Before attempting to add the new node, ensure that all nodes in the cluster
are set as closely as possible to the same date and time. Oracle strongly recommends using
the Network Time Protocol feature of most operating systems for this purpose,
with all nodes using the same reference Network Time Protocol server.
Accessing a Network Time Protocol server, however, may not always be an option.
In this case, when manually setting the date and time for the nodes in the
cluster, ensure that the date and time of the node you are performing the software
installations from (linux1) is earlier than that of the new node being added to the cluster (linux3).
I generally use a 20 second difference as shown in the following example:
Show the date and time from linux1:
Setting the date and time on the new Oracle RAC node linux3:
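As a rough way to confirm the ordering of the two clocks, the sketch below computes how far the new node is ahead of the install node from two epoch timestamps. The helper name clock_lead_seconds and the fixed sample values are illustrative. On the cluster, the inputs could come from date +%s run on each node, bearing in mind that SSH latency makes the result approximate.

```shell
# Print how many seconds the new node's clock is ahead of the install node.
clock_lead_seconds() {
    new_node_epoch="$1"; install_node_epoch="$2"
    echo $((new_node_epoch - install_node_epoch))
}

# Illustration with fixed epoch values; on linux1 the inputs could be
#   "$(ssh linux3 date +%s)" and "$(date +%s)"
lead="$(clock_lead_seconds 1190000020 1190000000)"
if [ "$lead" -ge 0 ]; then
    echo "linux3 is ${lead}s ahead of linux1"
else
    echo "linux3 is behind linux1 by $((-lead))s; adjust before running the installer" >&2
fi
```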
The RAC configuration described in this article does not make use
of a Network Time Protocol server.
Configure the "hangcheck-timer" Kernel Module
Oracle 9.0.1 and 9.2.0.1 used a userspace watchdog daemon called
watchdogd to monitor the health of the cluster and to
restart a RAC node in case of a failure. Starting with Oracle 9.2.0.2
(and still available in Oracle10g Release 2), the
watchdog daemon has been replaced by a Linux kernel module named
hangcheck-timer, which addresses availability and reliability
problems much better. The hangcheck-timer module is loaded into the
Linux kernel and checks whether the system hangs. It sets a timer and checks the
timer after a certain amount of time. There is a configurable threshold
that, if exceeded, will reboot the machine. Although
the hangcheck-timer module is not required for Oracle Clusterware (Cluster Manager)
operation, it is highly recommended by Oracle.
These values need to be available after each reboot of the Linux server. To do this, make
an entry with the correct values to the /etc/modprobe.conf file as follows:
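A minimal sketch of that entry, using the tick and margin values this article passes to insmod further on (30 and 180). The ensure_line helper is hypothetical; it appends a line only when it is not already present, so the snippet can be re-run safely. MODPROBE_CONF defaults to a temporary file here and would be set to /etc/modprobe.conf (as root) on the real node.

```shell
# Append a line to a file only if that exact line is not already present.
ensure_line() {
    file="$1"; line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Defaults to a scratch file; export MODPROBE_CONF=/etc/modprobe.conf as root.
MODPROBE_CONF="${MODPROBE_CONF:-$(mktemp)}"
ensure_line "$MODPROBE_CONF" "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180"
cat "$MODPROBE_CONF"
```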
It is only out of pure habit that I continue to include a modprobe
of the hangcheck-timer kernel module in the /etc/rc.local file. Someday I will get
over it, but realize that it does not hurt to include a modprobe of
the hangcheck-timer kernel module during startup.
So to keep myself sane and able to
sleep at night, I always configure the loading of the hangcheck-timer kernel module on
each startup as follows:
Now, to test the hangcheck-timer kernel module to verify it is picking up the
correct parameters we defined in the /etc/modprobe.conf file, use the modprobe
command. Although you could load
the hangcheck-timer kernel module by passing it the appropriate parameters
(e.g. insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180),
we want to verify that it is picking up the options we set in the
/etc/modprobe.conf file.
To manually load the hangcheck-timer kernel module and verify it is using the
correct values defined in the /etc/modprobe.conf file, run the following command:
Configure RAC Nodes for Remote Access using SSH
As was the case when configuring the existing two-node cluster,
this article assumes the Oracle software installation to the new Oracle RAC node
will be performed from linux1. This section provides the
methods required for configuring SSH1, an RSA key, and user equivalence for
the new Oracle RAC node.
Use the following steps to create the RSA key pair on the new Oracle RAC node (linux3):
This command will write the public key to the ~/.ssh/id_rsa.pub
file and the private key to the ~/.ssh/id_rsa file.
Note that you should never distribute the private key to anyone!
Complete the following steps on linux1 to
update and then distribute the authorized key file to all nodes
in the Oracle RAC cluster.
Again, this task will be performed from linux1.
User equivalence will need to be enabled on any new terminal shell session
on linux1
before attempting to run the addNode.sh script. If you log out and log back in to the
node you will be performing the Oracle installation from, you must
enable user equivalence for the terminal shell session as this is not
done by default.
To enable user equivalence for the current terminal shell session, perform
the following steps:
Also, if you see any other messages or text, apart from the date and hostname,
then the Oracle installation can fail. Make any changes required to
ensure that only the date and hostname are displayed when you enter these commands.
You should ensure that any parts of login scripts that generate
output, or ask questions, are modified so that they act only when
the shell is an interactive shell.
Bourne, Korn, and Bash shells:
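For these shells, one common guard is to test whether standard input is a terminal before doing anything that prints or prompts. The sketch below uses a hypothetical login_greeting function to stand in for whatever a login script emits. It illustrates the idea and is not the article's original listing.

```shell
# Emit login-time output only when stdin is attached to a terminal.
login_greeting() {
    if [ -t 0 ]; then
        echo "Logged in to $(hostname)"
    fi
}

# Silent when run non-interactively, as happens for remote commands over ssh.
login_greeting </dev/null
```

Wrapping stty calls, banners, and prompts in the same [ -t 0 ] test keeps remote commands issued by the installer free of unexpected output.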
All Startup Commands for New Oracle RAC Node
/etc/modprobe.conf
/etc/sysctl.conf
/etc/hosts
/etc/rc.local
Install and Configure Oracle Cluster File System (OCFS2)
Along with these two groups of files (the OCR and Voting disk), we also used
this space to store the shared ASM SPFILE for all Oracle RAC instances.
In this section, we will download and install the release of OCFS2 used for
the current two-node cluster on the new Oracle RAC node -
(OCFS2 Release 1.2.5-6).
See the following page for more information on OCFS2
(including Installation Notes) for Linux:
Download the same OCFS2 distribution used for the current two-node RAC
starting with the OCFS2 kernel module (the driver).
With CentOS 4.5, I am using kernel release 2.6.9-55.EL.
The available OCFS2 kernel modules for Linux kernel 2.6.9-55.EL are listed below.
Always download the OCFS2 kernel module that matches the distribution, platform, kernel version and
the kernel flavor (smp, hugemem, psmp, etc).
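To make the matching concrete, the sketch below derives the expected kernel-module package name from a kernel release. The naming pattern ocfs2-<kernel release>-<OCFS2 version>.<arch>.rpm is my understanding of how Oracle names these packages and should be checked against the download page. The helper name is hypothetical, and i686 is an assumption for the 32-bit x86 platform.

```shell
# Build the expected OCFS2 kernel-module RPM name for a given kernel release.
ocfs2_module_rpm() {
    kernel_release="$1"; ocfs2_version="$2"; arch="$3"
    echo "ocfs2-${kernel_release}-${ocfs2_version}.${arch}.rpm"
}

# Kernel release and OCFS2 version from this article; arch is an assumption.
ocfs2_module_rpm "2.6.9-55.EL" "1.2.5-6" "i686"

# On the node itself, the running kernel can be used directly:
#   ocfs2_module_rpm "$(uname -r)" "1.2.5-6" "$(uname -m)"
```

Because uname -r includes any flavor suffix (for example, a release ending in ELsmp), building the name from the running kernel helps avoid downloading a module for the wrong flavor.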
Install OCFS2
If you followed the installation instructions I provided for the CentOS operating system,
the SELinux option should already be disabled in which case this section can
be skipped. During the CentOS installation, the SELinux option was disabled in the
Firewall section.
If you did not follow the instructions to disable the SELinux option during the installation
of CentOS (or if you simply
want to verify it is truly disabled),
run the "Security Level Configuration" GUI utility:
In this section, we will not only create and configure the /etc/ocfs2/cluster.conf
file using ocfs2console, but will also create and
start the cluster stack O2CB. When the /etc/ocfs2/cluster.conf
file is not present, (as will be the case in our example), the ocfs2console
tool will create this file along with a new cluster stack service
(O2CB) with a default cluster name of ocfs2.
A popular question then is what node name should be used and should it
be related to the IP address? The node name needs to match the hostname of
the machine.
The IP address need not be the one associated with that hostname. In other
words, any valid IP address on that node can be used. OCFS2 will not attempt
to match the node name (hostname) with the specified IP address.
As root, run the following from linux1 and then linux2:
o2cb_ctl parameters:
Set the on-boot properties as follows:
First, here is how to manually mount the OCFS2 file system from the
command-line. Remember that this needs to be performed as the
root user account on the new Oracle RAC node:
Any other type of volume, including an Oracle home (which I will not be using for this article),
should not be mounted with this mount option.
We start by adding the following line to the /etc/fstab file
on the new Oracle RAC node:
Now, let's make sure that the ocfs2.ko kernel module is being loaded
and that the file system will be mounted during the boot process.
If you have been following along with the
examples in this article, the actions to load the kernel module and mount
the OCFS2 file system should already be enabled. However, we should still
check those options by running the following as the root user account
on the new Oracle RAC node:
Install and Configure Automatic Storage Management (ASMLib 2.0)
In this section, we will download, install,
and configure ASMLib on the new Oracle RAC node. Using this method,
Oracle database files will be stored on raw block devices managed by
ASM using ASMLib calls. RAW devices are not required with this method
as ASMLib works with block devices.
Download the same ASMLib distribution used for the current two-node RAC
starting with the ASMLib kernel module (the driver).
With CentOS 4.5, I am using kernel release 2.6.9-55.EL while the
ASMLib kernel driver used in the current two-node RAC environment is version
2.0.3-1.
Like the Oracle Cluster File System, we need to download the version
for the Linux kernel and number of processors on the new Oracle RAC node. We are using
kernel 2.6.9-55.EL #1 while the machine I am using only has a single processor:
We can now test that the ASM disks were successfully identified
using the following command as the root user account:
Pre-Installation Tasks for Oracle10g Release 2
The next pre-installation step is to run the
Cluster Verification Utility (CVU) from linux1 against
the new Oracle RAC node. CVU is a command-line
utility provided on the Oracle Clusterware installation media. It is
responsible for performing various system checks to assist you with
confirming the Oracle RAC nodes are properly
configured for Oracle Clusterware and Oracle Real Application Clusters
installation. The CVU only needs to be run
from the node you will be performing the Oracle installations from (linux1
in this article).
To query package information (gcc and glibc-devel for example),
use the "rpm -q <PackageName> [, <PackageName>]" command
as follows:
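As a small sketch of such a query, the helper below (a hypothetical name, missing_packages) runs rpm -q for each package and prints only the names that are not installed. Note that on the command line the package names are separated by spaces.

```shell
# Print the names of packages that "rpm -q" reports as not installed.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# gcc and glibc-devel are the two examples named in the text above.
missing_packages gcc glibc-devel
```

No output means every listed package is present. To see the installed versions instead, run rpm -q gcc glibc-devel directly.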
JDK 1.4.2
Install cvuqdisk RPM (RHEL Users Only)
The cvuqdisk RPM can be found on the Oracle Clusterware installation
media in the rpm directory. For the purpose of this article, the Oracle Clusterware
media was extracted to the ~oracle/orainstall/clusterware
directory on linux1.
Note that before installing the cvuqdisk RPM, we need to set
an environment variable named CVUQDISK_GRP to point to the group that will own
the cvuqdisk utility. The default group is oinstall which is the primary group
we are using for the oracle UNIX user account in this article. If you are using
a different primary group (i.e. dba), you will need to set CVUQDISK_GRP=<YOUR_GROUP> before
attempting to install the cvuqdisk RPM.
Locate and copy the cvuqdisk RPM from linux1 to the new Oracle RAC node (linux3).
Next, perform the following steps as the root user account from linux3 to install:
Verify Remote Access / User Equivalence
The first error is with regards
to finding a suitable set of interfaces for VIPs which can be safely ignored. This is a bug
documented in Metalink Note
338924.1:
As documented in the note, this error can be safely ignored.
The last set of errors that can be ignored deal with specific
RPM package versions that do not exist in RHEL4 Update 5. For
example:
While these specific packages are listed as missing
in the CVU report, please ensure that the correct versions
of the compat-* packages are installed on the new
Oracle RAC node. For example, in RHEL4 Update 5,
these would be:
Also note that the
check for shared storage accessibility will fail.
Extend Oracle Clusterware Software to the New Node
When you extend an Oracle RAC database, you must first extend the Oracle Clusterware home to the
new node and then extend the Oracle Database home. In other words, you extend the software onto
the new node in the same order as you installed the clusterware and Oracle database
components on the existing nodes.
Oracle Clusterware is already installed on the cluster.
The task in this section is to add the new Oracle RAC node to the clustered configuration.
This is done by executing the Oracle provided utility addNode.sh from one of the
existing nodes in the cluster; namely linux1. This script
is located in the Oracle Clusterware's home oui/bin directory (/u01/app/crs/oui/bin).
During the add node process, the shared Oracle Cluster Registry file and Voting Disk
will be updated with the information regarding the new node.
Verify Server and Enable X Server Access
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell method,
user equivalence
will need to be enabled on any new terminal shell session
before attempting to run the OUI. To enable
user equivalence for the current terminal shell session, perform the
following steps remembering to enter the pass phrase for the RSA key
you generated when prompted:
Click Next to continue.
From linux3
Navigate to the /u01/app/oracle/oraInventory directory
on linux3 and run orainstRoot.sh as the "root" user account.
From linux1
Important: As documented in Metalink Note 392415.1,
the rootaddnode.sh script (which is run in this section) may error out at the end with
"Connection refused" (PRKC-1044) when trying to add a new node
to the cluster. The reason this error occurs is because the "oracle" user account
on the node running the rootaddnode.sh script is setup with SSH for remote access
to the new node and has a non-empty SSH passphrase. Note that for obvious security
reasons, the "oracle" user account
is typically setup with a non-empty pass phrase for SSH keys and would thus succumb to
this error. The rootaddnode.sh script uses SSH to check remote node connectivity from
linux1 to linux3. If SSH issues any prompt, the script assumes SSH is
not configured properly. The script will then use rsh instead. If rsh is not configured,
then it will error out with "Connection refused". If you are using SSH for user
equivalency (as I am in this article), you will need to temporarily define an
empty rsa passphrase for the "oracle" user account on linux1 as follows:
[oracle@linux1 ~]$ ssh-keygen -p
After temporarily defining an empty rsa passphrase for the "oracle" user account,
navigate to the /u01/app/crs/install directory
on linux1 and run rootaddnode.sh as the "root" user account.
The rootaddnode.sh script will add the new node information to the
Oracle Cluster Registry (OCR) file using the srvctl utility.
After running the rootaddnode.sh script from linux1, you can set your
passphrase back to the old passphrase using the same "ssh-keygen -p" command.
From linux3
Finally, navigate to the /u01/app/crs directory
on linux3 and run root.sh as the "root" user account.
If the Oracle Clusterware home directory is a subdirectory of the ORACLE_BASE directory (which should never be the case!),
you will receive several warnings regarding permissions while running the root.sh script
on the new node. These warnings can be safely ignored.
The root.sh script may take a while to run. With Oracle version 10.2.0.1, when running root.sh
on linux3, you will receive a critical error and the output should look like:
This issue is specific to Oracle 10.2.0.1
(noted in Metalink article 338924.1)
and needs to be
resolved before continuing. The easiest workaround is to re-run
vipca (GUI) manually as root from linux3 (the node where the error occurred).
Please keep in mind that vipca is a GUI, so you will need to set your DISPLAY
variable according to your X server:
# $ORA_CRS_HOME/bin/vipca
When the "VIP Configuration Assistant" appears, this is how I
answered the screen prompts:
Welcome: Click Next
Node Name: linux2
Node Name: linux3
Summary: Click Finish
Go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.
Check Cluster Nodes
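One way to confirm that the new node is registered with Oracle Clusterware is to inspect the output of the olsnodes utility from the Clusterware home. The sketch below assumes the usual "olsnodes -n" layout of one line per node (node name followed by node number); the function name is my own.

```shell
#!/bin/sh
# node_in_cluster NODE: read "olsnodes -n" style output on stdin and
# succeed only if NODE appears in the first column.
# (Hypothetical helper, not an Oracle-supplied command.)
node_in_cluster() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Example, using the Clusterware home from this article:
#   /u01/app/crs/bin/olsnodes -n | node_in_cluster linux3 \
#       && echo "linux3 is a member of the cluster"
```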
Extend Oracle Database Software to the New Node
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell method,
user equivalence
will need to be enabled on any new terminal shell session
before attempting to run the OUI. To enable
user equivalence for the current terminal shell session, perform the
following steps remembering to enter the pass phrase for the RSA key
you generated when prompted:
After running the root.sh script on the new Oracle RAC node,
go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell method,
user equivalence
will need to be enabled on any new terminal shell session
before attempting to run the NETCA. To enable
user equivalence for the current terminal shell session, perform the
following steps remembering to enter the pass phrase for the RSA key
you generated when prompted:
The following table walks you through the process of reconfiguring the TNS
listeners in a clustered configuration to include the new node.
Add Database Instance to the New Node
Before executing the DBCA, make certain that
$ORACLE_HOME and $PATH are set appropriately for the
$ORACLE_BASE/product/10.2.0/db_1 environment.
You should also verify that all services we have installed up to this point
(Oracle TNS listener, Oracle Clusterware processes, etc.) are running
before attempting to start the clustered database creation process.
The DBCA program will be run from linux1 with user equivalence enabled to
all nodes in the cluster.
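A quick way to perform that check is to scan the output of "crs_stat -t" for anything that is not online. The sketch below assumes the usual 10g column layout (Name, Type, Target, State, Host, preceded by a header line and a dashed line); the function name is hypothetical.

```shell
#!/bin/sh
# offline_resources: read "crs_stat -t" style output on stdin and print
# the name of every resource whose State column is not ONLINE.
# The first two lines (column header and dashed rule) are skipped.
offline_resources() {
    awk 'NR > 2 && NF >= 4 && $4 != "ONLINE" { print $1 }'
}

# Example, using the Clusterware home from this article:
#   /u01/app/crs/bin/crs_stat -t | offline_resources
# No output suggests that every registered resource is running.
```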
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell method,
user equivalence
will need to be enabled on any new terminal shell session
before attempting to run the DBCA. To enable
user equivalence for the current terminal shell session, perform the
following steps remembering to enter the pass phrase for the RSA key
you generated when prompted:
At the bottom of this screen, the DBCA requests you to "Specify a user with SYSDBA
system privileges":
Username: sys
Click Next to continue.
Verify this list is correct and Click Next to continue.
After clicking Next, there will be a small pause before the next screen appears
as the DBCA determines the current state of the new node and what services (if any)
are configured on the existing nodes.
NOTE: In the previous section
(Add Listener to New Node),
I provided instructions
to set up the TNS listener in a clustered configuration to include the new Oracle RAC node using NETCA.
If the listener is not yet configured on the new Oracle RAC node, the DBCA will prompt the
user with a dialog asking to configure a new listener using port 1521 and listener name
"LISTENER_LINUX3". The TNS listener must be present and started on the new Oracle RAC node
in order to create and start the ASM instance on the new node.
$ srvctl start service -s orcl_taf -d orcl -i orcl3
When the Oracle Database Configuration Assistant has completed, you will
have successfully extended the current Oracle RAC database!
Check Cluster Services
http://linux3:1158/em
All articles, scripts and material located at the Internet address of http://www.idevelopment.info is the copyright of Jeffrey M. Hunter
and is protected under copyright laws of the United States. This document may not be hosted on any other site without my express,
prior, written permission. Application to host any of the material elsewhere can be made by contacting me at jhunter@idevelopment.info.
I have made every effort and taken great care in making sure that the material included on my web site is technically accurate,
but I disclaim any and all responsibility for any loss, damage or destruction of data or any other property which may arise from
relying on it. I will in no case be liable for any monetary damages arising from such loss, damage or destruction.
As your organization grows, so too does your need for more application and database resources
to support the company's IT systems. Oracle RAC 10g provides a scalable framework which allows DBAs
to effortlessly extend the database tier to support this increased demand. As the number of
users and transactions increases, additional Oracle instances can be added to the Oracle database
cluster to distribute the extra load.

The reader has already built and configured a two-node
Oracle RAC 10g Release 2 environment using the article
"Building an Inexpensive Oracle RAC 10g Release 2 on Linux - (CentOS 4.5 / iSCSI)".
The article provides comprehensive instructions for building a
two-node RAC cluster, each with a single processor
running CentOS 4.5, Oracle RAC 10g Release 2,
OCFS2, and ASMLib 2.0. The current two-node RAC environment
actually consists of three machines: two named
linux1 and linux2, which each run
an Oracle10g instance, and a third named openfiler1, which
runs the network storage server.

To maintain the current naming convention, the new Oracle RAC node to be added to the existing cluster will
be named linux3 (running a new instance named orcl3) making it a three-node cluster.

The new Oracle RAC node should have the same operating system version and installed patches
as the current two-node cluster.

Each node in the existing Oracle RAC cluster has a copy of the Oracle Clusterware and Oracle Database software
installed on their local disks. The current two-node Oracle RAC environment does not
use shared Oracle homes for the Clusterware or Database software.

The software owner for the Oracle Clusterware and Oracle Database installs will be
"oracle". It is important that the UID and GID of the oracle user
account be identical to that of the existing RAC nodes. For the purpose of this example,
the oracle user account will be defined as follows:
[oracle@linux1 ~]$ id oracle
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba)
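To compare the account on the new node against an existing node, the numeric UID and GID can be pulled out of the "id" output and compared. This is only a sketch; the helper names are my own and the remote command assumes SSH access to linux3 is already working.

```shell
#!/bin/sh
# numeric_ids: read one line of "id" output on stdin and print "UID GID".
numeric_ids() {
    sed -n 's/^uid=\([0-9][0-9]*\)([^)]*) gid=\([0-9][0-9]*\)(.*$/\1 \2/p'
}

# same_ids ID_OUTPUT_A ID_OUTPUT_B: succeed only if both lines parse and
# carry the same numeric UID and primary GID.
same_ids() {
    local a b
    a=$(printf '%s\n' "$1" | numeric_ids)
    b=$(printf '%s\n' "$2" | numeric_ids)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (from linux1):
#   same_ids "$(id oracle)" "$(ssh linux3 id oracle)" \
#       && echo "oracle UID/GID match on linux1 and linux3"
```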

The existing Oracle RAC 10

Automatic Storage Management (ASM) is being used as the file system and volume manager
for all Oracle physical database files (data, online redo logs, control files, archived redo logs)
and a Flash Recovery Area. In addition to ASM, we will also be configuring ASMLib
on the new Oracle RAC node.

To add instances to an existing RAC database, Oracle Corporation recommends
using the Oracle cloning procedures, which are described
in the Oracle Universal Installer and OPatch User's Guide. This article, however, uses manual procedures
to add nodes and instances to the existing Oracle RAC cluster. The manual
method described in this article involves extending the RAC database by first
extending the Oracle Clusterware home to the new Oracle RAC node and then extending
the Oracle Database home. In other words, you extend the software onto the new node
in the same order as you installed the Clusterware and Oracle Database software
components on the existing two-node RAC.

During the creation of the existing two-node cluster, the installation of Oracle
Clusterware and the Oracle Database software was performed from only one node in the
RAC cluster, namely from linux1 as the oracle user account.
The Oracle Universal Installer (OUI) on that particular node would then use the
ssh and scp commands to run remote commands on and
copy files (the Oracle software) to all other nodes within the RAC cluster.
The oracle user account on the node running the OUI
(runInstaller) had to be trusted by all other nodes in the
RAC cluster. This meant that the oracle user account had to
run the secure shell commands (ssh or scp) on the
Linux server executing the OUI (linux1) against all other Linux servers in
the cluster without being prompted for a password. The same security requirements hold
true for this article. User equivalence will be configured so that the
Oracle Clusterware and Oracle Database
software will be securely copied from linux1 to the new Oracle RAC node
(linux3) using ssh and scp without being prompted for a password.

All shared disk storage for the existing Oracle RAC is based on
iSCSI
using a Network Storage Server; namely
Openfiler Release 2.2 (respin 2). Powered by
rPath Linux,
Openfiler
is a free browser-based network storage management utility that delivers file-based
Network Attached Storage (NAS) and block-based Storage Area Networking (SAN) in a single framework.
Openfiler supports CIFS, NFS, HTTP/DAV, and FTP; however, we will only be making use of its
iSCSI capabilities to implement an inexpensive SAN for the shared storage components
required by Oracle RAC 10g.
This solution offers a low-cost alternative to fibre channel
for testing and educational purposes, but given the
low-end hardware being used, it should not be used in a production environment.

These articles provide a low cost alternative for those who want
to become familiar with Oracle RAC 10g using commercial off
the shelf components and downloadable software. Bear in mind that
these articles are provided for educational purposes only so the
setup is kept simple to demonstrate ideas and concepts. For example,
the disk mirroring configured in this article will be set up on one
physical disk only, while in practice that should be performed on
multiple physical drives. In addition, each Linux node will
only be configured with two network cards: one for
the public network (eth0) and one shared by the private cluster interconnect
"and" the network storage server for shared iSCSI access (eth1).
For a production RAC implementation, the private interconnect should be at
least gigabit and "only" be used by Oracle
to transfer Cluster Manager and Cache Fusion related data. A third
dedicated network interface (i.e. eth2) should be configured on another
gigabit network for access to the network storage server (Openfiler).
While this article provides comprehensive instructions for successfully adding a node to
an existing Oracle RAC 10g system, it is by no means a substitute for the official
Oracle documentation. In addition to this article, users should also consult the following
Oracle documents to gain a full understanding of alternative configuration options, installation, and
administration with Oracle RAC 10g. Oracle's official documentation site is
docs.oracle.com.
Oracle Clusterware and Oracle Real Application Clusters Installation Guide - 10g Release 2 (10.2) for Linux
Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide - 10g Release 2 (10.2)
2 Day + Real Application Clusters Guide - 10g Release 2 (10.2)
The hardware used in this article to build the third node (linux3)
consists of a Linux workstation and components
which can be purchased at many local computer stores or over the Internet.
Oracle RAC Node 3 - (linux3)
Dell Dimension 3000 Series - US$300
- 2GB DDR SDRAM (at 333MHz)
- 60GB 7200 RPM Internal Hard Drive
- Integrated Intel 3D AGP Graphics
- Integrated 10/100 Ethernet - (Broadcom BCM4401)
- CDROM (48X Max Variable)
- 3.5" Floppy
- No Keyboard, Monitor, or Mouse - (Connected to KVM Switch)
1 - Ethernet LAN Card
- Intel 10/100/1000Mbps PCI Desktop Adapter - (PWLA8391GT) - US$35
2 - Network Cables
- Category 5e patch cable - (Connect linux3 to public network) - US$5
- Category 5e patch cable - (Connect linux3 to interconnect ethernet switch) - US$5
Total - US$345
We are about to start the installation process.
As we start to go into the details of the
installation, it should be noted that most of the tasks within this
document will need to be performed on the new Oracle RAC node (linux3).
I will indicate at the beginning of each section whether or not the task(s)
should be performed on the new Oracle RAC node, the current Oracle RAC node(s),
or on the network storage server (openfiler1).
Perform the following installation on the new Oracle RAC node!
Downloading CentOS
Use the links (below) to download CentOS 4.5. After
downloading CentOS, you will then want to burn each of the ISO images
to CD.
If you are downloading the above ISO files to a MS Windows machine,
there are many options for burning these images (ISO files) to a CD. You
may already be familiar with and have the proper software
to burn images to CD. If you are not familiar with this process
and do not have the required software to burn images to CD, here are just
two (of many) software packages that can be used:
Installing CentOS
This section provides a summary of the screens used to install
CentOS. For more detailed installation instructions, it
is possible to use the manuals from Red Hat Linux
http://www.redhat.com/docs/manuals/.
I would suggest, however, that the instructions I have provided
below be used for this Oracle RAC 10g configuration.
Before installing the Linux operating system on the new Oracle RAC node,
you should have the two NIC interfaces (cards) installed.
The first screen is the CentOS boot screen.
At the boot: prompt, hit [Enter] to start the installation process.
Media Test
When asked to test the CD media, tab over to [Skip] and hit
[Enter]. If there
were any errors, the media burning software would have warned us. After several
seconds, the installer should then detect the video card, monitor, and mouse.
The installer then goes into GUI mode.
Welcome to CentOS
At the welcome screen, click [Next] to continue.
Language / Keyboard Selection
The next two screens prompt you for the Language and Keyboard
settings. In almost all cases, you can accept the defaults.
Make the appropriate selection for your configuration and click [Next] to continue.
Installation Type
Choose the [Custom] option and click [Next] to continue.
Disk Partitioning Setup
Select [Automatically partition] and click [Next] to continue.
Partitioning
The installer will then allow you to view (and modify if needed) the
disk partitions it automatically selected. For most automatic layouts, the
installer will choose 100MB for /boot, double the amount of RAM (systems with < 2GB RAM)
or an amount equal to RAM (systems with > 2GB RAM) for swap, and assign the rest to the
root (/) partition. Starting with EL 4, the installer
will create the same disk configuration as just noted but will create
it using the Logical Volume Manager (LVM). For example, it will
partition the first hard drive (/dev/hda for my configuration) into two
partitions: one for the /boot partition (/dev/hda1) and the
remainder of the disk dedicated to an LVM volume group named VolGroup00 (/dev/hda2).
The LVM Volume Group (VolGroup00) is then partitioned into two LVM
logical volumes: one for the root filesystem (/) and another for swap.
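The swap space guidance for this step (1.5 times RAM between 1 GB and 2 GB, equal to RAM between 2 GB and 8 GB, and .75 times RAM above 8 GB) can be sketched as a small calculation. The function name is hypothetical, and treating exactly 2 GB and exactly 8 GB as belonging to the lower band is my assumption, since the guidance does not say which band the boundaries fall in.

```shell
#!/bin/sh
# required_swap_mb RAM_MB: print the suggested swap size in MB.
# Assumption: boundary values (2048 MB, 8192 MB) fall in the lower band.
# Systems below 1 GB are not covered by the guidance; this sketch simply
# applies the 1.5x rule to them as well.
required_swap_mb() {
    local ram=$1
    if [ "$ram" -le 2048 ]; then
        echo $(( $ram * 3 / 2 ))
    elif [ "$ram" -le 8192 ]; then
        echo "$ram"
    else
        echo $(( $ram * 3 / 4 ))
    fi
}

# Example for the 2GB machine used as linux3:
#   required_swap_mb 2048
```

The installed RAM and configured swap can be read from the MemTotal and SwapTotal lines of /proc/meminfo for comparison.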
Available RAM | Swap Space Required
Between 1 GB and 2 GB | 1.5 times the size of RAM
Between 2 GB and 8 GB | Equal to the size of RAM
More than 8 GB | .75 times the size of RAM
Boot Loader Configuration
The installer will use the GRUB boot loader by default.
To use the GRUB boot loader, accept all default values and click [Next] to continue.
Network Configuration
I made sure to install both NIC interfaces (cards) in the
new Linux machine before starting the operating system installation.
This screen should have successfully detected each of the network
devices.
eth0:
- Check OFF the option to [Configure using DHCP]
- Leave the [Activate on boot] checked ON
- IP Address: 192.168.1.107
- Netmask: 255.255.255.0
eth1:
- Check OFF the option to [Configure using DHCP]
- Leave the [Activate on boot] checked ON
- IP Address: 192.168.2.107
- Netmask: 255.255.255.0
Firewall
On this screen, make sure to select [No firewall].
Also under the option to "Enable SELinux?",
select [Disabled] and click [Next] to continue.
Additional Language Support / Time Zone
The next two screens allow you to select additional language support
and time zone information.
In almost all cases, you can accept the defaults.
Make the appropriate selection for your configuration and click [Next] to continue.
Set Root Password
Select a root password and click [Next] to continue.
Package Group Selection
Scroll down to the bottom of this screen and select
[Everything] under the "Miscellaneous" section. Click
[Next] to continue.
About to Install
This screen is basically a confirmation screen. Click [Next]
on this screen and then the [Continue] button on the dialog box
to start the installation. During the installation process,
you will be asked to switch disks to Disk #2, Disk #3, and then Disk #4.
Graphical Interface (X) Configuration
With most RHEL 4 distributions (not the case with CentOS 4.5), when the installation
is complete, the installer will attempt to detect
your video hardware. Ensure that the installer has detected
and selected the correct video hardware (graphics card and monitor) to
properly use the X Windows server. You will continue with the X
configuration in the next several screens.
Congratulations
And that's it. You have successfully installed CentOS
on the new Oracle RAC node (linux3). The installer will eject the CD
from the CD-ROM drive. Take out the CD and click [Reboot] to reboot
the system.
Perform the following network configuration tasks on the new Oracle RAC node!
Introduction to Network Settings
Although we configured several of the network
settings during the installation of CentOS, it is important
to not skip this section as it contains critical
steps that are required for a successful RAC environment.
Configuring Public and Private Network
With the new Oracle RAC node, we need to configure the network
for access to the public network as well as the private interconnect.
# su -
# /usr/bin/system-config-network &
Do not use DHCP naming for the public IP address or the interconnects - we need static IP addresses!
The /etc/hosts file also needs to be updated on all nodes
in the RAC cluster. Both of these tasks can
be completed using the Network Configuration GUI.
The /etc/hosts entries will be the same for all three
Oracle RAC nodes
(linux1, linux2, and linux3)
as well as the network storage server (openfiler1):
Oracle RAC Node 3 - (linux3)
Device | IP Address | Subnet | Gateway | Purpose
eth0 | 192.168.1.107 | 255.255.255.0 | 192.168.1.1 | Connects linux3 to the public network
eth1 | 192.168.2.107 | 255.255.255.0 | | Connects linux3 (interconnect) to linux1/linux2 (linux1-priv/linux2-priv)
/etc/hosts
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
192.168.1.107 linux3
# Private Interconnect - (eth1)
192.168.2.100 linux1-priv
192.168.2.101 linux2-priv
192.168.2.107 linux3-priv
# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip
192.168.1.207 linux3-vip
# Private Storage Network for Openfiler
192.168.1.195 openfiler1
192.168.2.195 openfiler1-priv
Oracle RAC Node 2 - (linux2)
Device | IP Address | Subnet | Gateway | Purpose
eth0 | 192.168.1.101 | 255.255.255.0 | 192.168.1.1 | Connects linux2 to the public network
eth1 | 192.168.2.101 | 255.255.255.0 | | Connects linux2 (interconnect) to linux1/linux3 (linux1-priv/linux3-priv)
/etc/hosts
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
192.168.1.107 linux3
# Private Interconnect - (eth1)
192.168.2.100 linux1-priv
192.168.2.101 linux2-priv
192.168.2.107 linux3-priv
# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip
192.168.1.207 linux3-vip
# Private Storage Network for Openfiler
192.168.1.195 openfiler1
192.168.2.195 openfiler1-priv
Oracle RAC Node 1 - (linux1)
Device | IP Address | Subnet | Gateway | Purpose
eth0 | 192.168.1.100 | 255.255.255.0 | 192.168.1.1 | Connects linux1 to the public network
eth1 | 192.168.2.100 | 255.255.255.0 | | Connects linux1 (interconnect) to linux2/linux3 (linux2-priv/linux3-priv)
/etc/hosts
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
192.168.1.107 linux3
# Private Interconnect - (eth1)
192.168.2.100 linux1-priv
192.168.2.101 linux2-priv
192.168.2.107 linux3-priv
# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip
192.168.1.207 linux3-vip
# Private Storage Network for Openfiler
192.168.1.195 openfiler1
192.168.2.195 openfiler1-priv
In the screen shots below, only the new Oracle RAC node (linux3) is shown. Ensure
that the /etc/hosts file is updated on all participating nodes to access the
new Oracle RAC node!
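Since the same set of names has to resolve on every machine, it can help to check each /etc/hosts file for missing entries. The following is a minimal sketch; the function name is my own, and the list of names in the example is taken from the /etc/hosts listings in this article.

```shell
#!/bin/sh
# missing_hosts FILE NAME...: print each NAME that has no entry in FILE.
# Comments (anything after '#') are ignored, and a name must match a
# whole field, so "linux3" is not satisfied by "linux3-priv".
missing_hosts() {
    local file=$1 name
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$file" || echo "$name"
    done
}

# Example (run on each node; no output means nothing is missing):
#   missing_hosts /etc/hosts linux1 linux2 linux3 \
#       linux1-priv linux2-priv linux3-priv \
#       linux1-vip linux2-vip linux3-vip openfiler1 openfiler1-priv
```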
Figure 2: Network Configuration Screen - Node 3 (linux3)
Figure 3: Ethernet Device Screen - eth0 (linux3)
Figure 4: Ethernet Device Screen - eth1 (linux3)
Figure 5: Network Configuration Screen - /etc/hosts (linux3)
Once the network is configured, you can use the ifconfig
command to verify everything is working. The following example
is from the new Oracle RAC node linux3:
# /sbin/ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:1E:2A:37:6B:9E
inet addr:192.168.1.107 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::21e:2aff:fe37:6b9e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1167677 errors:0 dropped:0 overruns:0 frame:0
TX packets:1842517 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:576629131 (549.9 MiB) TX bytes:2143836310 (1.9 GiB)
Interrupt:209 Base address:0xef00
eth1 Link encap:Ethernet HWaddr 00:0E:0C:C0:78:64
inet addr:192.168.2.107 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::20e:cff:fec0:7864/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:48 errors:0 dropped:0 overruns:0 frame:0
TX packets:59 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4782 (4.6 KiB) TX bytes:5564 (5.4 KiB)
Base address:0xdd80 Memory:fe9c0000-fe9e0000
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:2034 errors:0 dropped:0 overruns:0 frame:0
TX packets:2034 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2386686 (2.2 MiB) TX bytes:2386686 (2.2 MiB)
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Verify Network Access to All Nodes
Verify that the new Oracle RAC node has access to the public and
private network for all current nodes. From linux3:
# ping -c 1 linux1 | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms
# ping -c 1 linux1-priv | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms
# ping -c 1 linux2 | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms
# ping -c 1 linux2-priv | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms
# ping -c 1 openfiler1 | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms
# ping -c 1 openfiler1-priv | grep '1 packets transmitted'
1 packets transmitted, 1 received, 0% packet loss, time 0ms
Confirm the RAC Node Name is Not Listed in Loopback Address
Ensure that the new Oracle RAC node (linux3) is
not included for the loopback address in the /etc/hosts file.
If the machine name is listed in the loopback address entry as below:
127.0.0.1 linux3 localhost.localdomain localhost
it will need to be removed as shown below:
127.0.0.1 localhost.localdomain localhost
If the RAC node name is listed for the loopback address, you will
receive the following error during the RAC installation:
ORA-00603: ORACLE server session terminated by fatal error
or
ORA-29702: error occurred in Cluster Group Service operation
Confirm localhost is defined in the /etc/hosts file for the loopback address
Ensure that entries for localhost.localdomain and localhost are
included for the loopback address in the /etc/hosts file for the new Oracle RAC node:
127.0.0.1 localhost.localdomain localhost
If an entry does not exist for localhost in the /etc/hosts
file, Oracle Clusterware will be unable to start the application resources, notably the ONS process.
The error will indicate "Failed to get IP for localhost"
and will be written to the log file for ONS. For example:
CRS-0215 could not start resource 'ora.linux3.ons'. Check log file
"/u01/app/crs/log/linux3/racg/ora.linux3.ons.log"
for more details.
The ONS log file will contain lines similar to the following:
2007-04-14 13:10:02.729: [ RACG][3086871296][13316][3086871296][ora.linux3.ons]: Failed to get IP for localhost (1)
Failed to get IP for localhost (1)
Failed to get IP for localhost (1)
onsctl: ons failed to start
...
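Both loopback requirements described above (the node name must not be on the 127.0.0.1 entry, and localhost must be) can be checked together. This is a sketch only, with hypothetical function names, and it looks at IPv4 loopback entries only.

```shell
#!/bin/sh
# loopback_names FILE: print every name assigned to 127.0.0.1 in FILE,
# one per line, ignoring comments.
loopback_names() {
    awk '{ sub(/#.*/, "") } $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# check_loopback FILE NODE: succeed only if localhost is mapped to
# 127.0.0.1 and NODE is not.
check_loopback() {
    local names
    names=$(loopback_names "$1")
    if ! printf '%s\n' "$names" | grep -qxF 'localhost'; then
        echo "localhost is missing from the loopback entry"
        return 1
    fi
    if printf '%s\n' "$names" | grep -qxF "$2"; then
        echo "$2 must not be listed on the loopback entry"
        return 1
    fi
    return 0
}

# Example (on the new node):
#   check_loopback /etc/hosts linux3 && echo "loopback entry looks fine"
```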
Adjusting Network Settings
With Oracle 9.2.0.1 and onwards, Oracle now makes use of UDP as the default protocol
on Linux for inter-process communication (IPC), such as Cache Fusion
and Cluster Manager buffer transfers
between instances within the RAC cluster.
# +---------------------------------------------------------+
# | ADJUSTING NETWORK SETTINGS |
# +---------------------------------------------------------+
# | With Oracle 9.2.0.1 and onwards, Oracle now makes use |
# | of UDP as the default protocol on Linux for |
# | inter-process communication (IPC), such as Cache Fusion |
# | and Cluster Manager buffer transfers between instances |
# | within the RAC cluster. Oracle strongly suggests |
# | adjusting the default and maximum receive buffer size |
# | (SO_RCVBUF socket option) to 256 KB, and the default |
# | and maximum send buffer size (SO_SNDBUF socket option) |
# | to 256 KB. The receive buffers are used by TCP and UDP |
# | to hold received data until it is read by the |
# | application. With TCP, the receive buffer cannot |
# | overflow because the peer is not allowed to send data |
# | beyond the advertised window. UDP has no such flow |
# | control, so datagrams are discarded if they do not fit |
# | in the socket receive buffer, which allows a fast |
# | sender to overwhelm the receiver. |
# +---------------------------------------------------------+
# +---------------------------------------------------------+
# | Default setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option. |
# +---------------------------------------------------------+
net.core.rmem_default=262144
# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option. |
# +---------------------------------------------------------+
net.core.rmem_max=262144
# +---------------------------------------------------------+
# | Default setting in bytes of the socket "send" buffer |
# | which may be set by using the SO_SNDBUF socket option. |
# +---------------------------------------------------------+
net.core.wmem_default=262144
# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "send" buffer |
# | which may be set by using the SO_SNDBUF socket option. |
# +---------------------------------------------------------+
net.core.wmem_max=262144
Then, ensure that each of these parameters is truly in
effect by running the following command on the new Oracle RAC node:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
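The value 262144 is simply 256 KB expressed in bytes (256 x 1024). To check the four socket buffer parameters without reading the output by eye, something along the following lines could be used. The function name is hypothetical; it only parses "key = value" lines such as those printed by sysctl.

```shell
#!/bin/sh
# check_buffer_settings [MIN_BYTES]: read "key = value" lines on stdin
# and report any of the four socket buffer parameters that is missing
# or below MIN_BYTES (default 262144, i.e. 256 * 1024).
check_buffer_settings() {
    awk -F'[ \t]*=[ \t]*' -v min="${1:-262144}" '
        BEGIN {
            split("net.core.rmem_default net.core.rmem_max net.core.wmem_default net.core.wmem_max", keys, " ")
        }
        { val[$1] = $2 }
        END {
            bad = 0
            for (i = 1; i <= 4; i++) {
                k = keys[i]
                if (!(k in val))           { print k ": not set"; bad = 1 }
                else if (val[k] + 0 < min) { print k ": " val[k] " is below " min; bad = 1 }
            }
            exit bad
        }
    '
}

# Example (as root on the new node):
#   sysctl -a 2>/dev/null | check_buffer_settings && echo "buffer sizes OK"
```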
Check and turn off UDP ICMP rejections:
During the Linux installation process, I chose not to configure the
firewall option. (By default the option to configure a firewall is selected
by the installer.)
This has burned me several times, so I like to double-check that the firewall
option is not configured and that UDP ICMP filtering is turned off.
08/29/2005 22:17:19
oac_init:2: Could not connect to server, clsc retcode = 9
08/29/2005 22:17:19
a_init:12!: Client init unsuccessful : [32]
ibctx:1:ERROR: INVALID FORMAT
proprinit:problem reading the bootblock or superbloc 22
When experiencing this type of error, the solution is to remove the udp ICMP (iptables)
rejection rule - or to simply have the firewall option turned off.
The Oracle Clusterware software will then start to operate normally and not crash. The following commands
should be executed as the root user account:
# /etc/rc.d/init.d/iptables status
Firewall is stopped.
# /etc/rc.d/init.d/iptables stop
Flushing firewall rules: [ OK ]
Setting chains to policy ACCEPT: filter [ OK ]
Unloading iptables modules: [ OK ]
# chkconfig iptables off
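To confirm that the firewall will also stay off after a reboot, the runlevel settings reported by "chkconfig --list iptables" can be inspected. The sketch below assumes the usual one-line output of the form "iptables 0:off 1:off 2:on ..."; the function name is my own.

```shell
#!/bin/sh
# enabled_runlevels: read one line of "chkconfig --list SERVICE" output
# on stdin and print the runlevels in which the service is switched on,
# separated by spaces. An empty line means the service is off everywhere.
enabled_runlevels() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i ~ /^[0-6]:on$/)
                out = out (out == "" ? "" : " ") substr($i, 1, 1)
    } END { print out }'
}

# Example (as root on the new node):
#   chkconfig --list iptables | enabled_runlevels
```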
Perform the following configuration tasks on the network storage server (openfiler1)!
https://openfiler1:446/
From the Openfiler Storage Control Center home page, login as
an administrator. The default administration login credentials
for Openfiler are:
Services
This article assumes that the current Oracle RAC 10g environment is
operational and therefore the iSCSI services should already be enabled within
Openfiler.
Figure 6: Verify iSCSI Services are Enabled
[root@openfiler1 ~]# service iscsi-target status
ietd (pid 3784) is running...
Network Access Restriction
The next step is to configure network access in Openfiler so
that the new Oracle RAC node (linux3) has permissions to the shared iSCSI volumes
used in the current Oracle RAC 10g environment.
Figure 7: Configure Openfiler Host Access for new Oracle RAC Node
Current Logical iSCSI Volumes
The current Openfiler configuration contains five logical iSCSI volumes
in a single volume group named rac1.
iSCSI / Logical Volumes in Volume Group rac1
Volume Name | Volume Description | Required Space (MB) | Filesystem Type
crs | Oracle Clusterware | 2,048 | iSCSI
asm1 | Oracle ASM Volume 1 | 118,720 | iSCSI
asm2 | Oracle ASM Volume 2 | 118,720 | iSCSI
asm3 | Oracle ASM Volume 3 | 118,720 | iSCSI
asm4 | Oracle ASM Volume 4 | 118,720 | iSCSI
Figure 8: Current Logical (iSCSI) Volumes
Grant Access Rights to New Logical Volumes
Before an iSCSI client can have access to any of the iSCSI volumes, it needs
to be granted the appropriate permissions. In this section, we need to grant
the new Oracle RAC node linux3 access to each of the five logical iSCSI volumes.
Figure 9: Grant Host Access to Logical (iSCSI) Volumes
Configure the iSCSI initiator on the new Oracle RAC node!
iSCSI (initiator) service
On the new Oracle RAC node, we have to make sure the iSCSI (initiator) service is
up and running.
The new Oracle RAC node must have the iscsi-initiator-utils RPM
(i.e. iscsi-initiator-utils-4.0.3.0-5.i386.rpm) installed. To determine
if this package is installed, perform the following:
# rpm -qa | grep iscsi
iscsi-initiator-utils-4.0.3.0-5
If not installed, the iscsi-initiator-utils RPM package can be found on disk 3 of 4
of the RHEL4 Update 5 distribution or downloaded from one of the Internet RPM resources.
# rpm -Uvh iscsi-initiator-utils-4.0.3.0-5.i386.rpm
warning: iscsi-initiator-utils-4.0.3.0-5.i386.rpm:
V3 DSA signature: NOKEY, key ID 443e1821
Preparing... ########################################### [100%]
1:iscsi-initiator-utils ########################################### [100%]
Next, edit the /etc/iscsi.conf file on the new Oracle RAC node and set
DiscoveryAddress to the network name of the iSCSI storage server (openfiler1-priv):
...
DiscoveryAddress=openfiler1-priv
...
After making the change to the /etc/iscsi.conf file on the new Oracle RAC node,
we can start (or restart) the iscsi initiator service on that node:
# service iscsi restart
Searching for iscsi-based multipath maps
Found 0 maps
Stopping iscsid: iscsid not running
Checking iscsi config: [ OK ]
Loading iscsi driver: [ OK ]
Starting iscsid: [ OK ]
We should also configure the iSCSI service to be active across machine reboots for the new Oracle RAC node.
The Linux command chkconfig can be used to achieve that as follows:
# chkconfig --level 345 iscsi on
The iSCSI initiator service should now be configured and started on
the new Oracle RAC node. In the parent to this article
("Building an Inexpensive Oracle RAC 10g Release 2 on Linux - (CentOS 4.5 / iSCSI)"),
we needed to go through the arduous task of mapping the iSCSI target
names discovered from Openfiler to the local SCSI device name on one
of the Oracle RAC nodes. Given that all five logical iSCSI volumes were
partitioned and formatted with labels in that article, we
don't have to perform that task again. Note that one of the iSCSI volumes
was formatted and labeled using OCFS2 while the other four were labeled
for use by ASM.
...
Jan 21 16:41:29 linux3 iscsi: iscsid startup succeeded
Jan 21 16:41:29 linux3 iscsid[13822]: Connected to Discovery Address 192.168.2.195
Jan 21 16:41:29 linux3 kernel: iscsi-sfnet:host0: Session established
Jan 21 16:41:29 linux3 kernel: iscsi-sfnet:host2: Session established
Jan 21 16:41:29 linux3 kernel: iscsi-sfnet:host1: Session established
Jan 21 16:41:29 linux3 kernel: scsi0 : SFNet iSCSI driver
Jan 21 16:41:29 linux3 kernel: scsi2 : SFNet iSCSI driver
Jan 21 16:41:29 linux3 kernel: scsi1 : SFNet iSCSI driver
Jan 21 16:41:29 linux3 kernel: Vendor: Openfile Model: Virtual disk Rev: 0
Jan 21 16:41:29 linux3 kernel: Type: Direct-Access ANSI SCSI revision: 04
Jan 21 16:41:29 linux3 kernel: SCSI device sda: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:29 linux3 kernel: SCSI device sda: drive cache: write through
Jan 21 16:41:29 linux3 kernel: Vendor: Openfile Model: Virtual disk Rev: 0
Jan 21 16:41:29 linux3 kernel: Type: Direct-Access ANSI SCSI revision: 04
Jan 21 16:41:29 linux3 kernel: SCSI device sda: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:29 linux3 kernel: iscsi-sfnet:host3: Session established
Jan 21 16:41:29 linux3 kernel: iscsi-sfnet:host4: Session established
Jan 21 16:41:29 linux3 kernel: scsi3 : SFNet iSCSI driver
Jan 21 16:41:29 linux3 kernel: SCSI device sda: drive cache: write through
Jan 21 16:41:29 linux3 kernel: sda: unknown partition table
Jan 21 16:41:29 linux3 kernel: Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Jan 21 16:41:29 linux3 kernel: Vendor: Openfile Model: Virtual disk Rev: 0
Jan 21 16:41:29 linux3 scsi.agent[13934]: disk at /devices/platform/host0/target0:0:0/0:0:0:0
Jan 21 16:41:29 linux3 kernel: Type: Direct-Access ANSI SCSI revision: 04
Jan 21 16:41:29 linux3 kernel: Vendor: Openfile Model: Virtual disk Rev: 0
Jan 21 16:41:29 linux3 kernel: Type: Direct-Access ANSI SCSI revision: 04
Jan 21 16:41:29 linux3 kernel: scsi4 : SFNet iSCSI driver
Jan 21 16:41:29 linux3 kernel: SCSI device sdb: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:29 linux3 kernel: Vendor: Openfile Model: Virtual disk Rev: 0
Jan 21 16:41:29 linux3 kernel: Type: Direct-Access ANSI SCSI revision: 04
Jan 21 16:41:29 linux3 kernel: SCSI device sdb: drive cache: write through
Jan 21 16:41:29 linux3 scsi.agent[13983]: disk at /devices/platform/host2/target2:0:0/2:0:0:0
Jan 21 16:41:29 linux3 scsi.agent[13996]: disk at /devices/platform/host3/target3:0:0/3:0:0:0
Jan 21 16:41:30 linux3 kernel: SCSI device sdb: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:30 linux3 kernel: SCSI device sdb: drive cache: write through
Jan 21 16:41:30 linux3 kernel: sdb: unknown partition table
Jan 21 16:41:30 linux3 kernel: Attached scsi disk sdb at scsi2, channel 0, id 0, lun 0
Jan 21 16:41:30 linux3 kernel: SCSI device sdc: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:30 linux3 kernel: SCSI device sdc: drive cache: write through
Jan 21 16:41:30 linux3 kernel: SCSI device sdc: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:30 linux3 kernel: SCSI device sdc: drive cache: write through
Jan 21 16:41:30 linux3 kernel: sdc: unknown partition table
Jan 21 16:41:30 linux3 kernel: Attached scsi disk sdc at scsi3, channel 0, id 0, lun 0
Jan 21 16:41:30 linux3 kernel: SCSI device sdd: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:30 linux3 kernel: SCSI device sdd: drive cache: write through
Jan 21 16:41:30 linux3 kernel: SCSI device sdd: 243138560 512-byte hdwr sectors (124487 MB)
Jan 21 16:41:30 linux3 kernel: SCSI device sdd: drive cache: write through
Jan 21 16:41:30 linux3 kernel: sdd: unknown partition table
Jan 21 16:41:30 linux3 kernel: Attached scsi disk sdd at scsi1, channel 0, id 0, lun 0
Jan 21 16:41:30 linux3 kernel: SCSI device sde: 4194304 512-byte hdwr sectors (2147 MB)
Jan 21 16:41:30 linux3 scsi.agent[14032]: disk at /devices/platform/host4/target4:0:0/4:0:0:0
Jan 21 16:41:30 linux3 scsi.agent[14045]: disk at /devices/platform/host1/target1:0:0/1:0:0:0
Jan 21 16:41:30 linux3 kernel: SCSI device sde: drive cache: write through
Jan 21 16:41:30 linux3 kernel: SCSI device sde: 4194304 512-byte hdwr sectors (2147 MB)
Jan 21 16:41:30 linux3 kernel: SCSI device sde: drive cache: write through
Jan 21 16:41:30 linux3 kernel: sde: unknown partition table
Jan 21 16:41:30 linux3 kernel: Attached scsi disk sde at scsi4, channel 0, id 0, lun 0
...
The above entries show that the client (linux3) was able to establish
the iSCSI sessions with the iSCSI storage server (openfiler1-priv at 192.168.2.195).
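Since five logical volumes were granted to linux3, we would expect five "Session established" entries. The following is only a rough sketch of how that could be checked; the function name count_iscsi_sessions is hypothetical (not part of any iSCSI tool) and the match pattern is an assumption based on the log format shown above:

```shell
# Hypothetical helper: count iSCSI sessions reported in a syslog-style file.
count_iscsi_sessions() {
    # $1 = log file to scan (for example /var/log/messages)
    grep -c "iscsi-sfnet:host[0-9]*: Session established" "$1"
}

# Demo with a small sample in the same format as the entries above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Jan 21 16:41:29 linux3 kernel: iscsi-sfnet:host0: Session established
Jan 21 16:41:29 linux3 kernel: iscsi-sfnet:host2: Session established
Jan 21 16:41:29 linux3 kernel: scsi0 : SFNet iSCSI driver
EOF
echo "sessions found: $(count_iscsi_sessions "$sample")"
rm -f "$sample"
```

Keep in mind that /var/log/messages may also hold entries from earlier restarts of the iscsi service, so a count higher than five does not necessarily indicate a problem.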
iscsi-ls-map.sh
# ---------------------
# FILE: iscsi-ls-map.sh
# ---------------------
RUN_USERID=root
export RUN_USERID
RUID=`id | awk -F\( '{print $2}'|awk -F\) '{print $1}'`
if [[ ${RUID} != "$RUN_USERID" ]];then
echo " "
echo "You must be logged in as $RUN_USERID to run this script."
echo "Exiting script."
echo " "
exit 1
fi
dmesg | grep "^Attach" \
| awk -F" " '{ print "/dev/"$4 " " $6 }' \
| sed -e 's/,//' | sed -e 's/scsi//' \
| sort -n -k2 \
| sed -e '/disk1/d' > /tmp/tmp_scsi_dev
iscsi-ls | egrep -e "TARGET NAME" -e "HOST ID" \
| awk -F" " '{ if ($0 ~ /^TARGET.*/) printf $4; if ( $0 ~ /^HOST/) printf " %s\n",$4}' \
| sort -n -k2 \
| cut -d':' -f2- \
| cut -d'.' -f2- > /tmp/tmp_scsi_targets
join -t" " -1 2 -2 2 /tmp/tmp_scsi_dev /tmp/tmp_scsi_targets > MAP
echo "Host / SCSI ID SCSI Device Name iSCSI Target Name"
echo "---------------- ----------------------- -----------------"
cat MAP | sed -e 's/ / /g'
rm -f MAP
# ./iscsi-ls-map.sh
Host / SCSI ID SCSI Device Name iSCSI Target Name
---------------- ------------------------ -----------------
0 /dev/sda asm4
1 /dev/sdd asm3
2 /dev/sdb asm2
3 /dev/sdc asm1
4 /dev/sde crs
Perform the following tasks on the new Oracle RAC node!
This guide adheres
to the Optimal Flexible Architecture (OFA) for naming conventions used
in creating the directory structure.
Create Group and User for Oracle
Let's start this section by creating the UNIX
oinstall and dba groups
and the oracle user account:
# groupadd -g 501 oinstall
# groupadd -g 502 dba
# useradd -m -u 501 -g oinstall -G dba -d /home/oracle -s /bin/bash -c "Oracle Software Owner" oracle
# id oracle
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba)
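Oracle RAC generally expects the oracle user and its groups to carry the same numeric IDs on every node in the cluster. As a minimal sketch, the identity string can be compared against the one shown above. The function name same_oracle_identity is hypothetical, and the expected string assumes the existing nodes were built with the same IDs used in this article:

```shell
# Expected identity string, taken from the "id oracle" output above.
EXPECTED_ID='uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba)'

# Hypothetical helper: succeed only if the supplied "id" output matches.
same_oracle_identity() {
    # $1 = output of "id oracle" captured on some node
    [ "$1" = "$EXPECTED_ID" ]
}

# Demo: compare against the local account if it exists.
if id oracle >/dev/null 2>&1; then
    if same_oracle_identity "$(id oracle)"; then
        echo "oracle account matches"
    else
        echo "oracle account differs: $(id oracle)"
    fi
else
    echo "oracle account not found on this host"
fi
```

If the output of "id oracle" on linux1 or linux2 differs from the string above, adjust EXPECTED_ID (or the groupadd / useradd values) so that all nodes agree.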
Set the password for the oracle account:
# passwd oracle
Changing password for user oracle.
New UNIX password: xxxxxxxxxxx
Retype new UNIX password: xxxxxxxxxxx
passwd: all authentication tokens updated successfully.
Verify That the User nobody Exists
Before installing the Oracle software, complete the following procedure to verify that the user
nobody exists on the system:
# id nobody
uid=99(nobody) gid=99(nobody) groups=99(nobody)
If this command displays information about the nobody user, then you do not
have to create that user. If the user nobody does not exist, create it with
the following command:
# /usr/sbin/useradd nobody
Create the Oracle Base Directory
The next step is to create a new directory that will be used to store
the Oracle Database software. When configuring the
oracle user's environment (later in this section) we will be assigning
the location of this directory to the $ORACLE_BASE environment variable.
# mkdir -p /u01/app/oracle
# chown -R oracle:oinstall /u01/app/oracle
# chmod -R 775 /u01/app/oracle
Create the Oracle Clusterware Home Directory
Next, create a new directory that will be used to store
the Oracle Clusterware software. When configuring the
oracle user's environment (later in this section) we will be assigning
the location of this directory to the $ORA_CRS_HOME environment variable.
# mkdir -p /u01/app/crs
# chown -R oracle:oinstall /u01/app/crs
# chmod -R 775 /u01/app/crs
Create Mount Point for OCFS2 / Clusterware
Let's now create the mount point for the Oracle Cluster File System, Release 2 (OCFS2)
that will be used to store the two Oracle Clusterware shared files.
# mkdir -p /u02
# chown -R oracle:oinstall /u02
# chmod -R 775 /u02
Create Login Script for oracle User Account
To ensure that the environment is set up correctly for the "oracle" UNIX
userid on the new Oracle RAC node, use the following .bash_profile:
When you are setting the Oracle environment variables for each Oracle RAC node, be sure to
assign each RAC node a unique Oracle SID!
# su - oracle
.bash_profile for Oracle User
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
alias ls="ls -FA"
export JAVA_HOME=/usr/local/java
# User specific environment and startup programs
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=/u01/app/crs
export ORACLE_PATH=$ORACLE_BASE/common/oracle/sql:.:$ORACLE_HOME/rdbms/admin
export CV_JDKHOME=/usr/local/java
# Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2, orcl3,...)
export ORACLE_SID=orcl3
export PATH=.:${JAVA_HOME}/bin:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export PATH=${PATH}:$ORACLE_BASE/common/oracle/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS10=$ORACLE_HOME/nls/data
export NLS_DATE_FORMAT="DD-MON-YYYY HH24:MI:SS"
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
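Because the only node-specific value in this login script is ORACLE_SID, one option is to derive it from the host name so the same file can be copied to every node. This is only a sketch under the assumption that node names end in the instance number (linux1, linux2, linux3) and that the instance prefix is "orcl"; the function sid_for_host is a hypothetical helper, not an Oracle utility:

```shell
# Hypothetical helper: map a host name such as "linux3" to an instance name
# such as "orcl3" by reusing the trailing digits of the host name.
sid_for_host() {
    # $1 = short host name, $2 = instance name prefix (assumed to be "orcl")
    node_number=$(printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\)$/\1/p')
    [ -n "$node_number" ] || return 1
    printf '%s%s\n' "${2:-orcl}" "$node_number"
}

sid_for_host linux3        # prints orcl3
```

In a .bash_profile this could be used as export ORACLE_SID=$(sid_for_host "$(hostname -s)"). Hard-coding the value, as done above, is just as valid and arguably easier to audit.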
Perform the following tasks on the new Oracle RAC node!
Swap Space Considerations
Note: An inadequate amount of swap during the installation
will cause the Oracle Universal Installer to either "hang" or "die".
# cat /proc/meminfo | grep MemTotal
MemTotal: 2074428 kB
# cat /proc/meminfo | grep SwapTotal
SwapTotal: 2031608 kB
# dd if=/dev/zero of=tempswap bs=1k count=500000
# chmod 600 tempswap
# mke2fs tempswap
# mkswap tempswap
# swapon tempswap
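To see at a glance how much memory and swap the node has in total, the two values above can be added together. The following is a minimal sketch; the function name mem_plus_swap_mb is hypothetical and it assumes the usual /proc/meminfo layout where both values are reported in kB:

```shell
# Hypothetical helper: sum MemTotal and SwapTotal (both in kB) from a
# /proc/meminfo-style file and print the result in MB (rounded down).
mem_plus_swap_mb() {
    # $1 = file to read (defaults to /proc/meminfo)
    awk '/^MemTotal:|^SwapTotal:/ { kb += $2 }
         END { printf "%d\n", kb / 1024 }' "${1:-/proc/meminfo}"
}

echo "RAM + swap on this host: $(mem_plus_swap_mb) MB"
```

With the values shown above for linux3 (2074428 kB of RAM and 2031608 kB of swap), this works out to 4009 MB.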
Configuring Kernel Parameters and Shell Limits
The kernel parameters and shell limits presented in this section are recommended values
only as documented by Oracle. For production database systems, Oracle recommends that
you tune these values to optimize the performance of the system.
Setting Shared Memory / Semaphores / File Handles / Local IP Range
Set the following kernel parameters in the
/etc/sysctl.conf file on the new Oracle RAC node.
cat >> /etc/sysctl.conf <<EOF
# +---------------------------------------------------------+
# | ADJUSTING ADDITIONAL KERNEL PARAMETERS FOR ORACLE |
# +---------------------------------------------------------+
# | Configure the kernel parameters for all Oracle Linux |
# | servers by setting shared memory and semaphores, |
# | setting the maximum amount of file handles, and setting |
# | the IP local port range. |
# +---------------------------------------------------------+
# +---------------------------------------------------------+
# | SHARED MEMORY |
# +---------------------------------------------------------+
kernel.shmmax=2147483648
# +---------------------------------------------------------+
# | SEMAPHORES |
# | ---------- |
# | |
# | SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value |
# | |
# +---------------------------------------------------------+
kernel.sem=250 32000 100 128
# +---------------------------------------------------------+
# | FILE HANDLES |
# +---------------------------------------------------------+
fs.file-max=65536
# +---------------------------------------------------------+
# | LOCAL IP RANGE |
# +---------------------------------------------------------+
net.ipv4.ip_local_port_range=1024 65000
EOF
Setting Shell Limits for the oracle User
To improve the performance of the software on Linux systems, Oracle recommends you increase the
following shell limits for the oracle user:
Shell Limit                                              Item in limits.conf   Hard Limit
------------------------------------------------------   -------------------   ----------
Maximum number of open file descriptors                  nofile                65536
Maximum number of processes available to a single user   nproc                 16384
cat >> /etc/security/limits.conf <<EOF
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
EOF
cat >> /etc/pam.d/login <<EOF
session required /lib/security/pam_limits.so
EOF
Update the default shell startup file for the "oracle" UNIX account.
cat >> /etc/profile <<EOF
if [ \$USER = "oracle" ]; then
if [ \$SHELL = "/bin/ksh" ]; then
ulimit -p 16384
ulimit -n 65536
else
ulimit -u 16384 -n 65536
fi
umask 022
fi
EOF
cat >> /etc/csh.login <<EOF
if ( \$USER == "oracle" ) then
limit maxproc 16384
limit descriptors 65536
endif
EOF
Activating All Kernel Parameters for the System
At this point, we have covered all of the required Linux kernel parameters needed
for a successful Oracle installation and configuration. The sections above
configured the Linux system to persist each of the kernel parameters through reboots on system
startup by placing them all in the /etc/sysctl.conf file. To activate them
immediately without rebooting, run the following command as root:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
kernel.shmmax = 2147483648
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
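Because the entries were appended with "cat >>", it is easy to lose track of whether a given node's /etc/sysctl.conf actually contains all four Oracle settings. Here is a rough sketch of a check; the function name missing_sysctl_settings is hypothetical and the list of expected values is simply copied from this article:

```shell
# Hypothetical helper: report Oracle-related settings that are missing from a
# sysctl.conf-style file. Spaces around "=" are ignored.
missing_sysctl_settings() {
    # $1 = file to check (defaults to /etc/sysctl.conf)
    conf=${1:-/etc/sysctl.conf}
    while IFS= read -r want; do
        [ -n "$want" ] || continue
        if ! sed 's/[[:space:]]*=[[:space:]]*/=/' "$conf" 2>/dev/null \
                | grep -qxF "$want"; then
            echo "$want"
        fi
    done <<'EOF'
kernel.shmmax=2147483648
kernel.sem=250 32000 100 128
fs.file-max=65536
net.ipv4.ip_local_port_range=1024 65000
EOF
}

missing=$(missing_sysctl_settings /etc/sysctl.conf)
if [ -z "$missing" ]; then
    echo "all Oracle kernel settings are present"
else
    echo "missing from /etc/sysctl.conf:"
    echo "$missing"
fi
```

Note that this only inspects the file. The "sysctl -p" output above remains the way to confirm what the running kernel is using.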
Setting the Correct Date and Time on the new Oracle RAC Node
When adding the new Oracle RAC node to the cluster, the Oracle Universal Installer (OUI) copies
the Oracle Clusterware and Oracle Database software from the source RAC node
(linux1 in this article) to the new node in the cluster
(linux3). During the remote copy process, the OUI will execute the UNIX
"tar" command on the remote node (linux3) to extract the files that were archived and
copied over. If the date and time on the node performing the install is later than
that of the node it is copying to, the OUI will throw an error from the
"tar" command indicating it is attempting to extract files stamped with a time in the future:
Error while copying directory
/u01/app/crs with exclude file list 'null' to nodes 'linux3'.
[PRKC-1002 : All the submitted commands did not execute successfully]
---------------------------------------------
linux3:
/bin/tar: ./bin/lsnodes: time stamp 2008-02-13 09:21:34 is 735 s in the future
/bin/tar: ./bin/olsnodes: time stamp 2008-02-13 09:21:34 is 735 s in the future
...(more errors on this node)
# date
Thu Feb 14 00:17:00 EST 2008
# date -s "2/14/2008 00:17:20"
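The comparison behind this step can be sketched as follows. The idea is that the new node's clock should not be behind the clock of the node running the OUI. The function name clock_is_behind is hypothetical, and the demo uses a placeholder for the remote time since user equivalence may not be configured yet:

```shell
# Hypothetical helper: given two epoch timestamps, report whether the target
# node's clock is behind the source node's clock.
clock_is_behind() {
    # $1 = source node time (seconds since epoch), $2 = target node time
    [ "$2" -lt "$1" ]
}

source_now=$(date +%s)
target_now=$source_now   # placeholder; on a real cluster: $(ssh linux3 date +%s)
if clock_is_behind "$source_now" "$target_now"; then
    echo "target is behind by $((source_now - target_now)) s; advance its clock"
else
    echo "target clock is not behind the source"
fi
```

In the error shown above, linux3 was 735 seconds behind linux1, which is the situation this check is meant to catch before starting the OUI.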
Perform the following tasks on the new Oracle RAC node!
The hangcheck-timer.ko Module
The hangcheck-timer module uses a kernel-based timer that
periodically checks the system task scheduler to catch delays in order to
determine the health of the system. If the system hangs or pauses, the timer
resets the node. The hangcheck-timer module uses the Time Stamp Counter
(TSC) CPU register which is a counter that is incremented at each clock signal.
The TSC offers much more accurate time measurements since this register
is updated by the hardware automatically.
Installing the hangcheck-timer.ko Module
The hangcheck-timer was originally shipped only by Oracle; however, this
module is now included with Red Hat Linux AS starting with kernel versions
2.4.9-e.12 and higher, so it should already be present on the new node.
Use the following to ensure that you have the module included:
# find /lib/modules -name "hangcheck-timer.ko"
/lib/modules/2.6.9-55.EL/kernel/drivers/char/hangcheck-timer.ko
In the above output, we care about the hangcheck timer object
(hangcheck-timer.ko) in the
/lib/modules/2.6.9-55.EL/kernel/drivers/char directory.
Configuring and Loading the hangcheck-timer Module
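There are two key parameters to the hangcheck-timer module: hangcheck_tick,
the interval (in seconds) between system health checks, and hangcheck_margin,
the maximum hang delay (in seconds) that is tolerated before the node is reset.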
The two hangcheck-timer module parameters indicate how long a RAC node
must hang before it will reset the system. A node reset will occur when
the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)
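As a quick worked example of the formula, here is the arithmetic using the values this article configures for the module (hangcheck_tick=30 and hangcheck_margin=180). Other values would simply change the sum:

```shell
hangcheck_tick=30      # seconds between checks (value used in this article)
hangcheck_margin=180   # seconds of tolerated delay (value used in this article)

# A node reset occurs when: system hang time > (tick + margin)
reset_threshold=$((hangcheck_tick + hangcheck_margin))
echo "node resets after a hang longer than ${reset_threshold} seconds"
```

With these settings the node would have to hang for more than 210 seconds before the module resets it.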
Configuring Hangcheck Kernel Module Parameters
Each time the hangcheck-timer kernel module is loaded (manually or by Oracle) it needs to know what value to use for
each of the two parameters we just discussed (hangcheck_tick and hangcheck_margin).
# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modprobe.conf
Each time the hangcheck-timer kernel module gets loaded, it will use the values
defined by the entry I made in the /etc/modprobe.conf file.
Manually Loading the Hangcheck Kernel Module for Testing
Oracle is responsible for loading the hangcheck-timer kernel module when required. For this
reason, it is not required to perform a modprobe or insmod of the
hangcheck-timer kernel module in any of the startup files (i.e. /etc/rc.local).
Although I do not believe it is required, it does not hurt to load the module
at startup, so I add it to /etc/rc.local as follows:
# echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local
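You don't have to manually load the hangcheck-timer kernel module using
modprobe or insmod after each reboot.
The hangcheck-timer module will be loaded by Oracle (automatically) when needed.
To test that the module picks up the values defined in /etc/modprobe.conf,
load it manually and check the system log: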
# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages | tail -2
Feb 14 01:22:52 linux3 kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 30 seconds, margin is 180 seconds).
Feb 14 01:22:52 linux3 kernel: Hangcheck: Using monotonic_clock().
Perform the following configuration procedures on linux1 and the new Oracle RAC node!
During the creation of the existing two-node cluster, the installation of Oracle
Clusterware and the Oracle Database software was performed from only one node in the
RAC cluster, namely linux1, as the oracle user account.
The Oracle Universal Installer (OUI) on that particular node then would use the
ssh and scp commands to run remote commands on and
copy files (the Oracle software) to all other nodes within the RAC cluster.
The oracle user account on the node running the OUI
(runInstaller) had to be trusted by all other nodes in the
RAC cluster. This meant that the oracle user account had to
run the secure shell commands (ssh or scp) on the
Linux server executing the OUI against all other Linux servers in
the cluster without being prompted for a password. The same security requirements hold
true for this article. User equivalence will be configured so that the
Oracle Clusterware and Oracle Database
software will be securely copied from linux1 to the new Oracle RAC node
(linux3) using ssh and scp without being prompted for a password.
To determine if SSH is installed and running on the new Oracle RAC node,
enter the following command:
# pgrep sshd
3695
If SSH is running, then the response to this command is a list of process ID
number(s).
Creating the RSA Keys on the new Oracle RAC Node
The first step in configuring SSH is to create an RSA public/private key pair on
the new Oracle RAC node. An RSA public/private key pair should already exist on both
nodes in the current two-node cluster. The ssh-keygen command below creates a public
and private key for RSA (for a total of two keys per
node). The content of the RSA public keys will then be
copied into an authorized key file on linux1, which is then
distributed to all other Oracle RAC nodes in the cluster.
# su - oracle
$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
$ /usr/bin/ssh-keygen -t rsa
At the prompts:
Updating and Distributing the "authorized key file" from linux1
Now that the new Oracle RAC node contains a public and private key for RSA,
you will need to update the authorized key file on
linux1 to add (append) the new RSA public key from linux3.
An authorized key file is nothing more than a single
file that contains a copy of everyone's (every node's) RSA public key.
Once the authorized key file contains all of the public
keys, it is then distributed to all other nodes in the cluster.
$ cd ~/.ssh
$ ls -l *.pub
-rw-r--r-- 1 oracle oinstall 223 Sep 2 01:18 id_rsa.pub
$ ssh linux3 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'linux3 (192.168.1.107)' can't be established.
RSA key fingerprint is f5:38:37:e8:84:4e:bd:6d:6b:25:f7:94:58:e8:b2:7a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'linux3,192.168.1.107' (RSA) to the list of known hosts.
oracle@linux3's password: xxxxx
The first time you use SSH to
connect to a node from a particular system, you will see a message
similar to the following:
The authenticity of host 'linux3 (192.168.1.107)' can't be established.
RSA key fingerprint is f5:38:37:e8:84:4e:bd:6d:6b:25:f7:94:58:e8:b2:7a.
Are you sure you want to continue connecting (yes/no)? yes
Enter yes at the prompt to continue. You should
not see this message again when you connect from this system
to the same node.
$ scp ~/.ssh/authorized_keys linux2:.ssh/authorized_keys
Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
authorized_keys 100% 669 0.7KB/s 00:00
$ scp ~/.ssh/authorized_keys linux3:.ssh/authorized_keys
oracle@linux3's password: xxxxx
authorized_keys 100% 669 0.7KB/s 00:00
$ chmod 600 ~/.ssh/authorized_keys
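Once the file has been distributed, each node's authorized key file should hold one RSA public key per node, three in total for this cluster. A minimal sketch of that check follows; the function name count_rsa_keys is hypothetical, and it assumes each entry starts with "ssh-rsa" (entries that carry leading key options would not be counted):

```shell
# Hypothetical helper: count RSA public keys in an authorized_keys-style file.
count_rsa_keys() {
    # $1 = file to inspect (defaults to ~/.ssh/authorized_keys)
    grep -c '^ssh-rsa ' "${1:-$HOME/.ssh/authorized_keys}" 2>/dev/null
}

keys=$(count_rsa_keys)
echo "RSA keys found: ${keys:-0} (expected 3 for linux1, linux2 and linux3)"
```

This is consistent with the scp output above: the single public key is 223 bytes and the distributed file is 669 bytes, or three keys.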
$ ssh linux3 hostname
Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
linux3
If you see any other messages or text, apart from the host name,
then the Oracle installation can fail. Make any changes required
to ensure that only the host name is displayed when you enter
these commands. You should ensure that any parts of login scripts
that generate output or ask questions are modified so that
they act only when the shell is an interactive shell.
Enabling SSH User Equivalency for the Current Shell Session
When running the addNode.sh script from linux1 (which runs the OUI), it will need to
run the secure shell tool commands (ssh and scp) on the new Oracle RAC node
without being prompted for a pass phrase. Even though SSH is now configured
on all Oracle RAC nodes in the cluster, using the secure shell tool commands will
still prompt for a pass phrase. Before running the addNode.sh script, you need to
enable user equivalence for the terminal session you plan to run the
script from. For the purpose of this article, the addNode.sh script
will be run from linux1.
For more information on configuring SSH and user equivalence in an Oracle RAC 10g
environment, see the section
"Configure RAC Nodes for Remote Access using SSH"
in the parent article.
# su - oracle
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
At the prompt, enter the pass phrase for each key that you generated.
$ ssh linux1 "date;hostname"
Fri Feb 22 12:13:57 EST 2008
linux1
$ ssh linux2 "date;hostname"
Fri Feb 22 12:14:43 EST 2008
linux2
$ ssh linux3 "date;hostname"
Fri Feb 22 12:13:16 EST 2008
linux3
The commands above should display the date set on each Oracle RAC node along with its hostname.
If any of the nodes prompt for a password or pass phrase then verify
that the ~/.ssh/authorized_keys file on that node
contains the correct public keys.
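The three manual tests above can also be rolled into a loop. The following is only a sketch: the function check_user_equivalence and the SSH_CMD override are hypothetical names, while BatchMode and ConnectTimeout are standard OpenSSH client options (BatchMode=yes makes ssh fail rather than prompt, which is the behavior the OUI needs):

```shell
NODES="linux1 linux2 linux3"

# Hypothetical helper: try a non-interactive ssh to every node in $NODES.
# SSH_CMD may be set to substitute another command (useful for dry runs).
check_user_equivalence() {
    failed=0
    for node in $NODES; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=5} \
                "$node" hostname >/dev/null 2>&1; then
            echo "$node: OK"
        else
            echo "$node: FAILED (no password-less login, or node unreachable)"
            failed=1
        fi
    done
    return "$failed"
}
```

Run check_user_equivalence as the oracle user on linux1 from the same terminal session in which ssh-agent and ssh-add were executed. A non-zero return status means at least one node still needs attention.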
Bourne, Korn, and Bash shells:
$ DISPLAY=<Any X-Windows Host>:0
$ export DISPLAY
C shell:
$ setenv DISPLAY <Any X-Windows Host>:0
After setting the DISPLAY variable to a valid X Windows
display, you should perform another test of the current terminal
session to ensure that X11 forwarding is not enabled:
$ ssh linux1 hostname
linux1
$ ssh linux2 hostname
linux2
$ ssh linux3 hostname
linux3
Verify that the following startup commands are
included on the new Oracle RAC node!
This section will recap all of the parameters, commands, and entries that
were covered in previous sections of this document
that need to happen on the new Oracle RAC node when the machine is booted.
For each of the startup files below,
I indicate in blue the entries that should be included
in each of the startup files on the new Oracle RAC node.
All parameters and values to be used by kernel modules.
/etc/modprobe.conf
alias eth0 b44
alias eth1 e1000
alias scsi_hostadapter ata_piix
alias usb-controller ehci-hcd
alias usb-controller1 uhci-hcd
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
We wanted to adjust the default and maximum send buffer size as well as the
default and maximum receive buffer size for the interconnect. This file also contains
those parameters responsible for configuring shared memory,
semaphores, file handles, and local IP range for use by the Oracle instance.
/etc/sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# +---------------------------------------------------------+
# | ADJUSTING NETWORK SETTINGS |
# +---------------------------------------------------------+
# | With Oracle 9.2.0.1 and onwards, Oracle now makes use |
# | of UDP as the default protocol on Linux for |
# | inter-process communication (IPC), such as Cache Fusion |
# | and Cluster Manager buffer transfers between instances |
# | within the RAC cluster. Oracle strongly suggests to |
# | adjust the default and maximum receive buffer size |
# | (SO_RCVBUF socket option) to 256 KB, and the default |
# | and maximum send buffer size (SO_SNDBUF socket option) |
# | to 256 KB. The receive buffers are used by TCP and UDP |
# | to hold received data until it is read by the |
# | application. The receive buffer cannot overflow because |
# | the peer is not allowed to send data beyond the buffer |
# | size window. This means that datagrams will be |
# | discarded if they don't fit in the socket receive |
# | buffer. This could cause the sender to overwhelm the |
# | receiver. |
# +---------------------------------------------------------+
# +---------------------------------------------------------+
# | Default setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option. |
# +---------------------------------------------------------+
net.core.rmem_default=262144
# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option. |
# +---------------------------------------------------------+
net.core.rmem_max=262144
# +---------------------------------------------------------+
# | Default setting in bytes of the socket "send" buffer |
# | which may be set by using the SO_SNDBUF socket option. |
# +---------------------------------------------------------+
net.core.wmem_default=262144
# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "send" buffer |
# | which may be set by using the SO_SNDBUF socket option. |
# +---------------------------------------------------------+
net.core.wmem_max=262144
# +---------------------------------------------------------+
# | ADJUSTING ADDITIONAL KERNEL PARAMETERS FOR ORACLE |
# +---------------------------------------------------------+
# | Configure the kernel parameters for all Oracle Linux |
# | servers by setting shared memory and semaphores, |
# | setting the maximum amount of file handles, and setting |
# | the IP local port range. |
# +---------------------------------------------------------+
# +---------------------------------------------------------+
# | SHARED MEMORY |
# +---------------------------------------------------------+
kernel.shmmax=2147483648
# +---------------------------------------------------------+
# | SEMAPHORES |
# | ---------- |
# | |
# | SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value |
# | |
# +---------------------------------------------------------+
kernel.sem=250 32000 100 128
# +---------------------------------------------------------+
# | FILE HANDLES |
# +---------------------------------------------------------+
fs.file-max=65536
# +---------------------------------------------------------+
# | LOCAL IP RANGE |
# +---------------------------------------------------------+
net.ipv4.ip_local_port_range=1024 65000
Verify that each of the
required kernel parameters (above) is configured in the
/etc/sysctl.conf file. Then, ensure that each of these parameters is truly in
effect by running the following command on the new Oracle RAC node:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
kernel.shmmax = 2147483648
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
All machine/IP entries for nodes in the RAC cluster.
/etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
192.168.1.107 linux3
# Private Interconnect - (eth1)
192.168.2.100 linux1-priv
192.168.2.101 linux2-priv
192.168.2.107 linux3-priv
# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip
192.168.1.207 linux3-vip
# Private Storage Network for Openfiler - (eth1)
192.168.1.195 openfiler1
192.168.2.195 openfiler1-priv
192.168.1.106 melody
192.168.1.102 alex
192.168.1.105 bartman
192.168.1.120 cartman
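Each RAC node needs three names to resolve: its public name, its private interconnect name (-priv) and its virtual IP name (-vip). As a minimal sketch, the following checks a hosts file for all three. The function name missing_host_entries is hypothetical, and the -priv / -vip suffixes are the naming convention used in this article:

```shell
# Hypothetical helper: list the names a node needs that are missing from a
# hosts-style file (public name, -priv and -vip). Comment lines are skipped.
missing_host_entries() {
    # $1 = node name (for example linux3), $2 = hosts file (default /etc/hosts)
    for name in "$1" "$1-priv" "$1-vip"; do
        if ! awk -v n="$name" '
                /^[ \t]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit(found ? 0 : 1) }' "${2:-/etc/hosts}"; then
            echo "$name"
        fi
    done
}

for entry in $(missing_host_entries linux3); do
    echo "missing from /etc/hosts: $entry"
done
```

The same check is worth repeating for linux1 and linux2 on the new node, and for linux3 on the two existing nodes.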
Loading the hangcheck-timer kernel module.
/etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
# +---------------------------------------------------------+
# | HANGCHECK TIMER |
# | (I do not believe this is required, but doesn't hurt) |
# +---------------------------------------------------------+
/sbin/modprobe hangcheck-timer
Perform the following tasks on the new Oracle RAC node!
Overview
The current two-node Oracle RAC database makes use of the Oracle
Cluster File System, Release 2 (OCFS2) to store the two files that
are required to be shared by the Oracle Clusterware software: the Oracle
Cluster Registry (OCR) file and the Voting Disk. Note that mirrored copies
of both shared files were also created (one mirror of the OCR file and two
mirrors of the Voting Disk), making for five files in total.
OCFS2 Project Documentation
Download OCFS2
The OCFS2 distribution consists of two sets of RPMs; namely, the kernel module and the
tools.
Next, download the OCFS2 tools and the OCFS2 console applications.
ocfs2-2.6.9-55.EL-1.2.5-6.i686.rpm - (for single processor)
ocfs2-2.6.9-55.ELsmp-1.2.5-6.i686.rpm - (for multiple processors)
ocfs2-2.6.9-55.ELhugemem-1.2.5-6.i686.rpm - (for hugemem)
ocfs2-tools-1.2.4-1.i386.rpm - (OCFS2 tools)
ocfs2console-1.2.4-1.i386.rpm - (OCFS2 console)
The OCFS2 Console is optional but highly recommended. The ocfs2console
application requires e2fsprogs, glib2 2.2.3 or later, vte 0.11.10 or
later, pygtk2 (EL4) or python-gtk (SLES9) 1.99.16 or later, python 2.3 or later and ocfs2-tools.
If you were curious as to which OCFS2 driver release you need, use the
OCFS2 release that matches your kernel version. To determine your kernel release:
$ uname -a
Linux linux3 2.6.9-55.EL #1 Wed May 2 13:52:16 EDT 2007 i686 i686 i386 GNU/Linux
In the absence of the string "smp" after the string "EL", we are running a single processor (Uniprocessor)
machine. If the string "smp" were to appear, then you would be running on a multi-processor machine.
I will be installing the OCFS2 files onto the new Oracle RAC node (linux3) which is
a single processor machine.
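The mapping from kernel release to kernel module RPM can be sketched as follows. This assumes the file naming pattern of the three OCFS2 1.2.5-6 packages listed above and the 32-bit (i686) platform used in this article; kernel_flavor and ocfs2_module_rpm are hypothetical helper names:

```shell
#!/bin/bash
# kernel_flavor RELEASE: classify a kernel release string from "uname -r".
kernel_flavor() {
    case "$1" in
        *smp)     echo "multiple processors" ;;
        *hugemem) echo "hugemem" ;;
        *)        echo "single processor" ;;
    esac
}

# ocfs2_module_rpm RELEASE: name of the matching OCFS2 1.2.5-6 kernel module RPM,
# following the naming pattern of the three files listed above.
ocfs2_module_rpm() {
    echo "ocfs2-$1-1.2.5-6.i686.rpm"
}

release=$(uname -r)
echo "Kernel $release ($(kernel_flavor "$release")): $(ocfs2_module_rpm "$release")"
```

On linux3, where uname reports 2.6.9-55.EL, this points to the single processor package used in the installation below.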
The installation process is simply a matter of running the following
command on the new Oracle RAC node as the root user account:
$ su -
# rpm -Uvh ocfs2-2.6.9-55.EL-1.2.5-6.i686.rpm \
ocfs2console-1.2.4-1.i386.rpm \
ocfs2-tools-1.2.4-1.i386.rpm
Preparing... ########################################### [100%]
1:ocfs2-tools ########################################### [ 33%]
2:ocfs2-2.6.9-55.EL ########################################### [ 67%]
3:ocfs2console ########################################### [100%]
Disable SELinux (RHEL4 U2 and higher)
Users of RHEL4 U2 and higher (CentOS 4.5 is based on RHEL4 U5) are advised that OCFS2
currently does not work with SELinux enabled. If you are using RHEL4 U2 or higher
(which includes us since we are using CentOS 4.5), you will need to
disable SELinux (using the system-config-securitylevel tool) to get
the O2CB service to execute.
A ticket has been logged with Red Hat on this issue.
# /usr/bin/system-config-securitylevel &
This will bring up the following screen:

Figure 10: Security Level Configuration Opening Screen
Now, click the SELinux tab and uncheck the "Enabled" checkbox.
After clicking [OK], you will be presented with a warning dialog.
Simply acknowledge this warning by clicking "Yes".
Your screen should now look like the following after disabling the SELinux option:

Figure 11: SELinux Disabled
If modifications were made to disable SELinux on the new Oracle RAC node, it will
need to be rebooted to implement the change. SELinux must be disabled
before you can continue with configuring OCFS2!
# init 6
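After the reboot, the setting can be confirmed from the command line. The sketch below assumes that the GUI tool records its choice as a SELINUX= line in /etc/selinux/config, which is the usual location on Red Hat based systems but is not stated in this article; selinux_config_mode is a hypothetical helper name:

```shell
#!/bin/bash
# selinux_config_mode FILE: print the SELINUX= value from an SELinux config file.
selinux_config_mode() {
    awk -F= '/^SELINUX=/ { print $2; exit }' "$1"
}

config=/etc/selinux/config      # assumed location of the SELinux configuration file
if [ -r "$config" ]; then
    mode=$(selinux_config_mode "$config")
    if [ "$mode" = "disabled" ]; then
        echo "SELinux is disabled in $config"
    else
        echo "SELinux is set to '$mode' in $config; O2CB may fail to start"
    fi
fi
```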
Configure OCFS2
The next step is to generate and configure the /etc/ocfs2/cluster.conf
file on the new Oracle RAC node. The easiest way to accomplish this is
to run the GUI tool ocfs2console. The /etc/ocfs2/cluster.conf
file will contain hostnames and IP addresses for "all" nodes in the cluster.
After creating the /etc/ocfs2/cluster.conf on the new Oracle RAC node,
these changes will then be distributed to the other two current RAC nodes
using the o2cb_ctl command-line utility.
Note that OCFS2 will be configured to use the private network (192.168.2.0)
for all of its network traffic as recommended by Oracle. While OCFS2
does not take much bandwidth, it does require the nodes to be alive on the
network and sends regular keepalive packets to ensure that they are. To
avoid a network delay being interpreted as a node disappearing on the net
which could lead to a node-self-fencing, a private interconnect is recommended.
It is safe to use the same private interconnect for both Oracle RAC and OCFS2.
$ su -
# ocfs2console &
This will bring up the GUI as shown below:

Figure 12: ocfs2console Screen
Using the ocfs2console GUI tool, perform the following steps:
Note: The node name you enter "must" match the hostname of the machine
and the IP addresses will use the private interconnect.

Figure 13: Starting the OCFS2 Cluster Stack
The following dialog shows the OCFS2 settings I used when configuring the new Oracle RAC node:

Figure 14: Configuring Nodes for OCFS2
After exiting the ocfs2console, you will have a /etc/ocfs2/cluster.conf
similar to the following. In the next section, this file (along with other changes)
will be distributed to the current two RAC nodes:
/etc/ocfs2/cluster.conf
node:
ip_port = 7777
ip_address = 192.168.2.100
number = 0
name = linux1
cluster = ocfs2
node:
ip_port = 7777
ip_address = 192.168.2.101
number = 1
name = linux2
cluster = ocfs2
node:
ip_port = 7777
ip_address = 192.168.2.107
number = 2
name = linux3
cluster = ocfs2
cluster:
node_count = 3
name = ocfs2
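A simple consistency check on this file is to confirm that the number of node: stanzas agrees with the node_count value. The following is a sketch only; cluster_conf_check is a hypothetical helper and it assumes the "key = value" layout shown above:

```shell
#!/bin/bash
# cluster_conf_check FILE: compare the number of "node:" stanzas with node_count.
cluster_conf_check() {
    local stanzas declared
    stanzas=$(grep -c '^node:' "$1" || true)
    declared=$(awk '$1 == "node_count" { print $3 }' "$1")
    echo "node stanzas: $stanzas, node_count: $declared"
    [ "$stanzas" = "$declared" ]
}

conf=/etc/ocfs2/cluster.conf
if [ -r "$conf" ]; then
    cluster_conf_check "$conf" || echo "WARNING: node_count does not match in $conf"
fi
```

For the file above, both values should be 3 once linux3 has been added.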
Add New Oracle RAC Node to the OCFS2 Cluster
The next step is to add the new Oracle RAC node (linux3) to the
current "live" OCFS2 cluster. This entails running the o2cb_ctl command-line
utility from the current two RAC nodes linux1 and linux2.
[root@linux1 ~]# o2cb_ctl -C -i -n linux3 -t node -a number=2 -a ip_address=192.168.2.107 -a ip_port=7777 -a cluster=ocfs2
Node linux3 created
[root@linux2 ~]# o2cb_ctl -C -i -n linux3 -t node -a number=2 -a ip_address=192.168.2.107 -a ip_port=7777 -a cluster=ocfs2
Node linux3 created
-C : Create an object in the OCFS2 Cluster Configuration.
-i : Valid only with -C. When creating something (node or cluster),
it will also install it in the live cluster (/config). If the
parameter is not specified, then only update the
/etc/ocfs2/cluster.conf.
-n : Object name which is usually the node name or cluster name.
-t : Type can be cluster, node or heartbeat.
-a : Attribute in the format "parameter=value" which will be set in
the file /etc/ocfs2/cluster.conf file. Since nodes are numbered
starting with zero, the third node in the OCFS2 cluster will
be "number=2". Set the IP address which in this
example will be the private interconnect "ip_address=192.168.2.107".
The port number used in the current two-node cluster is
"ip_port=7777". Finally, identify which OCFS2 cluster to use which
in our case is named "cluster=ocfs2".
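To make the relationship between the options explicit, the command line can be assembled from its parts. This sketch only prints the command and does not run it; o2cb_add_node_cmd is a hypothetical helper, and the defaults for port and cluster name are simply the values used in this article:

```shell
#!/bin/bash
# o2cb_add_node_cmd NAME ORDINAL IP [PORT] [CLUSTER]
# ORDINAL is the 1-based position of the node in the cluster; OCFS2 node
# numbers start at zero, so the third node becomes number=2.
o2cb_add_node_cmd() {
    local name=$1 ordinal=$2 ip=$3 port=${4:-7777} cluster=${5:-ocfs2}
    echo "o2cb_ctl -C -i -n $name -t node -a number=$((ordinal - 1))" \
         "-a ip_address=$ip -a ip_port=$port -a cluster=$cluster"
}

# Print (do not run) the command for the third node.
o2cb_add_node_cmd linux3 3 192.168.2.107
```

The printed line matches the command that was run on linux1 and linux2 above.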
Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold
Next, configure the on-boot properties of the O2CB driver on the new Oracle RAC node so
that the cluster stack services will start on each boot. We will also be adjusting
the OCFS2 Heartbeat Threshold from its default setting of 7 to 61.
# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.
Load O2CB driver on boot (y/n) [n]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Specify heartbeat dead threshold (>=7) [7]: 61
Specify network idle timeout in ms (>=5000) [10000]: 10000
Specify network keepalive delay in ms (>=1000) [5000]: 5000
Specify network reconnect delay in ms (>=2000) [2000]: 2000
Writing O2CB configuration: OK
Loading module "configfs": OK
Mounting configfs filesystem at /config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK
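To put the threshold values into perspective: the OCFS2 documentation describes the heartbeat dead threshold as a count of two-second heartbeat intervals, with the timeout in seconds given by (threshold - 1) * 2. That formula is taken from the OCFS2 FAQ rather than from this article, so treat it as an assumption. A small sketch of the arithmetic, using hypothetical helper names:

```shell
#!/bin/bash
# hb_threshold_to_seconds THRESHOLD: timeout in seconds, assuming the OCFS2 FAQ
# formula (threshold - 1) * 2.
hb_threshold_to_seconds() {
    echo $(( ($1 - 1) * 2 ))
}

# seconds_to_hb_threshold SECONDS: inverse of the formula above.
seconds_to_hb_threshold() {
    echo $(( $1 / 2 + 1 ))
}

echo "threshold 7  -> $(hb_threshold_to_seconds 7) seconds"
echo "threshold 61 -> $(hb_threshold_to_seconds 61) seconds"
```

Under that assumption, raising the threshold from 7 to 61 extends the timeout from 12 seconds to 120 seconds.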
Mount the OCFS2 File System
Since the clustered file system already exists, the next step
is to simply mount it on the new Oracle RAC node.
Let's first do it using the command-line, then I'll show
how to include it in the /etc/fstab to have it mount
on each boot. The current OCFS2 file system was created
with the label oracrsfiles which will be used
when mounting.
$ su -
# mount -t ocfs2 -o datavolume,nointr -L "oracrsfiles" /u02
If the mount was successful, you will simply
get your prompt back. We should, however, run the following
checks to ensure the file system is mounted correctly.
Let's use the mount command to ensure that
the clustered file system is really mounted:
# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/hda1 on /boot type ext3 (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
cartman:SHARE2 on /cartman type nfs (rw,addr=192.168.1.120)
configfs on /config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sdc1 on /u02 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
Please take note of the datavolume option I am using to mount
the clustered file system. Oracle database users must mount any volume
that will contain the Voting Disk file, Cluster Registry (OCR), data files,
redo logs, archive logs, and control files with the datavolume mount
option so as to ensure that the Oracle processes open the files with the O_DIRECT
flag. The nointr option ensures that I/O is not interrupted by signals.
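Checking a line of mount output for the required options can be sketched as follows. The helper name mount_has_opts is hypothetical, and the sample line is the /u02 entry from the mount output above:

```shell
#!/bin/bash
# mount_has_opts LINE OPT...: succeed when every OPT appears in the parenthesized
# option list at the end of a line of "mount" output.
mount_has_opts() {
    local line=$1 opts opt
    shift
    opts=${line##*\(}        # keep the text after the last "("
    opts=",${opts%\)*},"     # drop the closing ")" and wrap the list in commas
    for opt in "$@"; do
        case "$opts" in
            *",$opt,"*) ;;   # option present, keep checking
            *) return 1 ;;   # option missing
        esac
    done
}

sample='/dev/sdc1 on /u02 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)'
if mount_has_opts "$sample" datavolume nointr; then
    echo "/u02 is mounted with datavolume and nointr"
else
    echo "/u02 is missing a required mount option"
fi
```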
Configure OCFS2 to Mount Automatically at Startup
This section provides the steps necessary to mount the OCFS2 file system by its label each time
the new Oracle RAC node is booted. Start by adding the following line to the /etc/fstab file:
LABEL=oracrsfiles /u02 ocfs2 _netdev,datavolume,nointr 0 0
Notice the "_netdev" option for mounting this file system.
The _netdev mount option is a must for OCFS2 volumes. This
mount option indicates that the volume is to be mounted after the
network is started and dismounted before the network is
shutdown.
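Because a missing _netdev option only shows up at the next boot, it can be worth scanning /etc/fstab for it. This is a sketch that assumes the standard six-column fstab layout; ocfs2_missing_netdev is a hypothetical helper name:

```shell
#!/bin/bash
# ocfs2_missing_netdev FSTAB: print the mount point of every ocfs2 entry whose
# option list does not include _netdev.
ocfs2_missing_netdev() {
    awk '
        /^[[:space:]]*#/ { next }
        $3 == "ocfs2" && ("," $4 ",") !~ /,_netdev,/ { print $2 }
    ' "$1"
}

fstab=/etc/fstab
if [ -r "$fstab" ]; then
    missing=$(ocfs2_missing_netdev "$fstab")
    if [ -n "$missing" ]; then
        echo "ocfs2 entries without _netdev: $missing"
    else
        echo "all ocfs2 entries in $fstab include _netdev"
    fi
fi
```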
$ su -
# chkconfig --list o2cb
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off
The flags for runlevels 2 through 5 (shown as "on" above) should be set to "on".
Check Permissions on OCFS2 File System
From the new Oracle RAC node, use the ls command to check ownership.
The permissions should be set to 0775 with owner "oracle" and
group "oinstall".
# ls -ld /u02
drwxrwxr-x 4 oracle oinstall 4096 Dec 20 01:53 /u02
Verify Access to the Shared Clusterware Files
From the new Oracle RAC node as the oracle user account, use the
ls command to verify access to the Oracle Clusterware shared files
(OCR file and Voting Disk):
[oracle@linux3 ~]$ ls -l /u02/oradata/orcl
total 14820
-rw-r--r-- 1 oracle oinstall 10240000 Dec 20 02:33 CSSFile
-rw-r--r-- 1 oracle oinstall 10240000 Dec 20 02:33 CSSFile_mirror1
-rw-r--r-- 1 oracle oinstall 10240000 Dec 20 02:33 CSSFile_mirror2
drwxr-x--- 2 oracle oinstall 4096 Dec 20 15:34 dbs/
-rw-r----- 1 root oinstall 4931584 Dec 20 14:49 OCRFile
-rw-r----- 1 root oinstall 4931584 Dec 20 14:49 OCRFile_mirror
How to Determine OCFS2 Version
To determine which version of OCFS2 is running, use:
# cat /proc/fs/ocfs2/version
OCFS2 1.2.5 Mon Jul 30 13:22:57 PDT 2007 (build 4d201e17b1bc7db76d96570e328927c7)
Perform the following tasks on the new Oracle RAC node!
Introduction
The current two-node Oracle RAC database makes use of Automatic
Storage Management (ASM) to be used as the file system and volume manager
for all Oracle physical database files (data, online redo logs, control files, archived redo logs)
and a Flash Recovery Area.
Download the ASMLib 2.0 Packages
The ASMLib distribution consists of two sets of RPMs; namely, the kernel module and the
ASMLib tools.
Oracle ASMLib Downloads for Red Hat Enterprise Linux 4 AS
You will also need to download the following ASMLib tools:
oracleasm-2.6.9-55.EL-2.0.3-1.i686.rpm - (for single processor)
oracleasm-2.6.9-55.ELsmp-2.0.3-1.i686.rpm - (for multiple processors)
oracleasm-2.6.9-55.ELhugemem-2.0.3-1.i686.rpm - (for hugemem)
oracleasmlib-2.0.2-1.i386.rpm - (Userspace library)
oracleasm-support-2.0.3-1.i386.rpm - (Driver support files)
Install ASMLib 2.0 Packages
I will be installing the ASMLib files onto the new Oracle RAC node (linux3)
which is a single processor machine. The installation process is simply a matter
of running the following command on the new Oracle RAC node as the root user account:
$ su -
# rpm -Uvh oracleasm-2.6.9-55.EL-2.0.3-1.i686.rpm \
oracleasmlib-2.0.2-1.i386.rpm \
oracleasm-support-2.0.3-1.i386.rpm
Preparing... ########################################### [100%]
1:oracleasm-support ########################################### [ 33%]
2:oracleasm-2.6.9-55.EL ########################################### [ 67%]
3:oracleasmlib ########################################### [100%]
Configuring and Loading the ASMLib 2.0 Packages
After downloading and installing the ASMLib 2.0 Packages for Linux,
we now need to configure and load the ASM kernel module. Run the following
as root on the new Oracle RAC node:
$ su -
# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.
Default user to own the driver interface []: oracle
Default group to own the driver interface []: oinstall
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: [ OK ]
Creating /dev/oracleasm mount point: [ OK ]
Loading module "oracleasm": [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]
Scan for ASM Disks
From the new Oracle RAC node, you can now run
scandisks to recognize the current volumes. Even though
the above configuration automatically scanned for ASM disks, I still
like to perform this step manually!
# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [ OK ]
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
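Comparing the listdisks output against the volumes you expect can be sketched as follows. The helper name missing_asm_disks is hypothetical, and the sample input mirrors the four volumes shown above:

```shell
#!/bin/bash
# missing_asm_disks EXPECTED...: read "listdisks" output on stdin and print every
# expected volume label that is absent from it.
missing_asm_disks() {
    local found want
    found=$(cat)
    for want in "$@"; do
        echo "$found" | grep -qx "$want" || echo "$want"
    done
}

# Sample input copied from the listdisks output above; nothing is printed
# by the helper when all four volumes are present.
missing=$(printf 'VOL1\nVOL2\nVOL3\nVOL4\n' | missing_asm_disks VOL1 VOL2 VOL3 VOL4)
if [ -z "$missing" ]; then
    echo "all expected ASM volumes are visible"
else
    echo "missing ASM volumes: $missing"
fi
```

On the new node, the live output can be piped in instead, for example: /etc/init.d/oracleasm listdisks | missing_asm_disks VOL1 VOL2 VOL3 VOL4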
Perform the following checks on the new Oracle RAC node and run
the Oracle Cluster Verification Utility (CVU) from linux1!
When installing the Linux O/S (CentOS or Red Hat Enterprise Linux 4), you should verify
that all required RPMs for Oracle are installed. If you followed the instructions
I used for installing Linux, you would have installed Everything,
in which case you will have all of the required RPM packages. However,
if you performed another installation type (i.e. "Advanced Server"), you
may have some packages missing and will need to install them. All
of the required RPMs are on the Linux CDs/ISOs.
The following packages must be installed on the new Oracle RAC node. Note
that the version number for your Linux distribution may vary slightly.
Prerequisites for Using Cluster Verification Utility
binutils-2.15.92.0.2-21
compat-db-4.1.25-9
compat-gcc-32-3.2.3-47.3
compat-gcc-32-c++-3.2.3-47.3
compat-libstdc++-33-3.2.3-47.3
compat-libgcc-296-2.96-132.7.2
control-center-2.8.0-12.rhel4.5
cpp-3.4.6-3
gcc-3.4.6-3
gcc-c++-3.4.6-3
glibc-2.3.4-2.25
glibc-common-2.3.4-2.25
glibc-devel-2.3.4-2.25
glibc-headers-2.3.4-2.25
glibc-kernheaders-2.4-9.1.98.EL
gnome-libs-1.4.1.2.90-44.1
libaio-0.3.105-2
libstdc++-3.4.6-3
libstdc++-devel-3.4.6-3
make-3.80-6.EL4
openmotif-2.2.3-10.RHEL4.5
openmotif21-2.1.30-11.RHEL4.6
pdksh-5.2.14-30.3
setarch-1.6-1
sysstat-5.0.5-11.rhel4
xscreensaver-4.18-5.rhel4.11
Note that the openmotif RPM packages are only required to install
Oracle demos. This article does not cover the installation of Oracle demos.
# rpm -q gcc glibc-devel
gcc-3.4.6-3
glibc-devel-2.3.4-2.25
If you need to install any of the above packages (which you should not have
to if you installed Everything), use the "rpm -Uvh <PackageName.rpm>" command.
For example, to install the GCC gcc-3.4.6-3 package, use:
# rpm -Uvh gcc-3.4.6-3.i386.rpm
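Querying the whole list one package at a time is tedious, so the check can be looped. The sketch below strips the version and release from entries written as they appear in the list above and queries each name with rpm -q; pkg_name is a hypothetical helper, and only a few entries are included for brevity:

```shell
#!/bin/bash
# pkg_name NAME-VERSION-RELEASE: strip the trailing version and release fields.
pkg_name() {
    echo "${1%-*-*}"
}

# A few entries copied from the package list above; extend as needed.
required="binutils-2.15.92.0.2-21 compat-gcc-32-c++-3.2.3-47.3 libaio-0.3.105-2 make-3.80-6.EL4"

for full in $required; do
    name=$(pkg_name "$full")
    if command -v rpm >/dev/null 2>&1 && rpm -q "$name" >/dev/null 2>&1; then
        echo "installed: $name"
    else
        echo "check:     $name (listed as $full)"
    fi
done
```

Querying by name rather than by the full string allows for the slight version differences mentioned above.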
Checking Pre-Installation Tasks for CRS with CVU
You must have JDK 1.4.2 installed on your system before you can run CVU.
If you do not have JDK 1.4.2 installed on your system, and you attempt to run CVU,
you will receive an error message similar to the following:
ERROR. Either CV_JDKHOME environment variable should be set
or /stagepath/cluvfy/jrepack.zip should exist.
Note that during the creation of the current two-node Oracle RAC,
the JDK was installed on to linux1 and the appropriate
environment variable (CV_JDKHOME=/usr/local/java) was
defined in the .bash_profile login script for the
oracle user account. No action needs to be performed in this
section.
The second pre-requisite for running the CVU is for Red Hat Linux users.
If you are using Red Hat Linux, then you must download and install the Red Hat
operating system package cvuqdisk on the new Oracle RAC node.
Without cvuqdisk, CVU will be
unable to discover shared disks, and you will receive the error message
"Package cvuqdisk not installed" when you run CVU.
# -- IF YOU ARE USING A PRIMARY GROUP OTHER THAN oinstall
# CVUQDISK_GRP=<YOUR_GROUP>; export CVUQDISK_GRP
# cd ~oracle/orainstall/clusterware/rpm
# rpm -iv cvuqdisk-1.0.1-1.rpm
Preparing packages for installation...
cvuqdisk-1.0.1-1
# ls -l /usr/sbin/cvuqdisk
-rwsr-x--- 1 root oinstall 4168 Jun 2 2005 /usr/sbin/cvuqdisk
The CVU should be run from linux1, the node we will be extending the
Oracle software from. Before running CVU, log in as the
oracle user account and verify remote access / user equivalence is configured
to all nodes in the cluster. When using the secure shell method, user equivalence
will need to be enabled for the terminal shell session before attempting to run the CVU.
To enable user equivalence for the current terminal shell session, perform the
following steps, remembering to enter the passphrase for each key that you generated when prompted:
# su - oracle
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
Once all prerequisites for using CVU have been met, we can start by
checking that all pre-installation tasks for Oracle Clusterware (CRS)
are completed by executing the following command as the "oracle" UNIX user account
(with user equivalence enabled) from linux1:
$ cd ~oracle/orainstall/clusterware/cluvfy
$ ./runcluvfy.sh stage -pre crsinst -n linux1,linux2,linux3 -verbose
Review the CVU report. Note that there are several errors you may
ignore in this report.
Suitable interfaces for the private interconnect on subnet "192.168.2.0":
linux3 eth1:192.168.2.107
linux2 eth1:192.168.2.101
linux1 eth1:192.168.2.100
ERROR:
Could not find a suitable set of interfaces for VIPs.
Result: Node connectivity check failed.
Checking the Hardware and Operating System Setup with CVU
The next CVU check to run will verify the hardware and operating system setup.
Again, run the following as the "oracle" UNIX user account from linux1:
$ cd ~oracle/orainstall/clusterware/cluvfy
$ ./runcluvfy.sh stage -post hwos -n linux1,linux2,linux3 -verbose
Review the CVU report. As with the previous check
(pre-installation tasks for CRS),
the check for finding a suitable set of interfaces for VIPs will
fail and can be safely ignored.
The check for shared storage accessibility will also fail:
Checking shared storage accessibility...
WARNING:
Unable to determine the sharedness of /dev/sdc on nodes:
linux3,linux3,linux3,linux2,linux2,linux2,linux1,linux1,linux1
Shared storage check failed on nodes "linux3,linux2,linux1".
This too can be safely ignored.
While we know the disks are visible and shared from all Oracle RAC nodes in the
cluster, the check itself fails. Several reasons for this have been
documented. The first came from Metalink indicating that
cluvfy currently does not work with devices other than SCSI
devices. This would include devices like EMC PowerPath and volume groups like
those from Openfiler. At the time of this writing, no workaround exists
other than to use manual methods for detecting shared devices. Another
reason for this error was documented by Bane Radulovic at Oracle Corporation.
His research shows that CVU calls smartctl on Linux, and the problem
is that smartctl does not return the serial number from our iSCSI devices.
For example, a check against /dev/sde shows:
# /usr/sbin/smartctl -i /dev/sde
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: Openfile Virtual disk Version: 0
Serial number:
Device type: disk
Local Time is: Mon Sep 3 02:02:53 2007 EDT
Device supports SMART and is Disabled
Temperature Warning Disabled or Not Supported
At the time of this writing, it is unknown if the Openfiler developers have
plans to fix this.
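Because the verbose CVU reports are long and contain failures that are expected in this configuration, it can help to list only the error and warning messages. The following is a sketch that assumes each ERROR: or WARNING: marker sits on its own line and is followed by the message text, as in the excerpts above; cvu_messages is a hypothetical helper name:

```shell
#!/bin/bash
# cvu_messages: read a CVU report on stdin and print each ERROR:/WARNING: marker
# together with the message line that follows it.
cvu_messages() {
    awk '
        /^(ERROR|WARNING):/ { tag = $1; want = 1; next }
        want                { print tag, $0; want = 0 }
    '
}

# Sample input assembled from the report excerpts shown in this section.
cvu_messages <<'EOF'
Suitable interfaces for the private interconnect on subnet "192.168.2.0":
linux3 eth1:192.168.2.107
ERROR:
Could not find a suitable set of interfaces for VIPs.
Result: Node connectivity check failed.
WARNING:
Unable to determine the sharedness of /dev/sdc on nodes:
EOF
```

With a live report, the output of runcluvfy.sh can be piped into cvu_messages, and any message other than the two discussed above deserves a closer look.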
Extend the Oracle Clusterware software to the new Oracle RAC node from linux1!
Overview
In this section, we will extend the current Oracle RAC database by adding
the new Oracle RAC node linux3. The new node will need to be added to the cluster at
the clusterware layer so that the other nodes in the RAC cluster consider it to be part of the cluster.
Verifying Terminal Shell Environment
Before starting the Oracle Universal Installer, you should first verify
you are logged onto the server you will be running the installer from
(i.e. linux1) then run
the xhost command as root from the console to
allow X Server connections. Next, login as the oracle user account.
If you are using a remote client to connect to the node performing the
installation (SSH / Telnet to linux1 from a workstation configured with
an X Server), you will need to set the DISPLAY variable to point to your
local workstation.
Finally, verify remote access / user equivalence to all nodes in the cluster:
# hostname
linux1
# xhost +
access control disabled, clients can connect from any host
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify that you can run the Secure Shell commands (ssh or scp)
from the Linux server that will run the Oracle Universal Installer
against the new Oracle RAC node without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
$ ssh linux1 "date;hostname"
Sat Feb 23 16:48:05 EST 2008
linux1
$ ssh linux3 "date;hostname"
Sat Feb 23 16:46:51 EST 2008
linux3
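The same verification can be looped over several nodes. In this sketch, the BatchMode=yes option makes ssh fail instead of prompting when user equivalence is not working; check_user_equivalence is a hypothetical helper name and is meant to be run in the session where ssh-agent and ssh-add were executed:

```shell
#!/bin/bash
# check_user_equivalence NODE...: run a trivial remote command on each node with
# BatchMode=yes so that ssh fails instead of prompting for a password.
check_user_equivalence() {
    local node failed=0
    for node in "$@"; do
        if ssh -o BatchMode=yes "$node" "date;hostname" </dev/null >/dev/null 2>&1; then
            echo "OK     $node"
        else
            echo "FAILED $node"
            failed=1
        fi
    done
    return "$failed"
}

# Example (run as the oracle user on linux1):
#   check_user_equivalence linux1 linux2 linux3
```

A FAILED line means the installer would be prompted for a password or passphrase on that node, which should be resolved before starting the OUI.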
Configure Oracle Clusterware on the New Node
The next step is to configure Oracle Clusterware on the new Oracle RAC node linux3.
As previously mentioned, this is performed by executing the new addNode.sh utility located
in the Oracle Clusterware's home oui/bin directory (/u01/app/crs/oui/bin) from linux1:
$ hostname
linux1
$ id -a
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba)
$ cd /u01/app/crs/oui/bin
$ ./addNode.sh
Screen Name: Response
Welcome Screen: Click Next.
Specify Cluster Nodes to Add to Installation: In this screen, the OUI lists all existing nodes
in the top portion labeled "Existing Nodes". On the bottom half of the screen
labeled "Specify New Nodes", enter the information for the new node
in the appropriate fields:
Public Node Name: linux3
Private Node Name: linux3-priv
Virtual Node Name: linux3-vip
Cluster Node Additional Summary: Verify the new Oracle RAC node is listed under the "New Nodes" drilldown.
Click Install to start the installation!
Execute Configuration Scripts: Once all of the required Oracle Clusterware components have been copied from linux1
to linux3, the OUI prompts to execute three files as described in the following
sections.
Enter file in which the key is (/home/oracle/.ssh/id_rsa):
Enter old passphrase: [OLD PASSPHRASE]
Key has comment '/home/oracle/.ssh/id_rsa'
Enter new passphrase (empty for no passphrase): [JUST HIT ENTER WITHOUT ENTERING A PASSPHRASE]
Enter same passphrase again: [JUST HIT ENTER WITHOUT ENTERING A PASSPHRASE]
Your identification has been saved with the new passphrase.
...
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
linux1
linux2
linux3
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
The given interface(s), "eth0" is not public. Public interfaces should be used to configure virtual IPs.
Network interfaces: Select only the public interface - eth0
Virtual IPs for cluster nodes:
Node Name: linux1
IP Alias Name: linux1-vip
IP Address: 192.168.1.200
Subnet Mask: 255.255.255.0
Node Name: linux2
IP Alias Name: linux2-vip
IP Address: 192.168.1.201
Subnet Mask: 255.255.255.0
Node Name: linux3
IP Alias Name: linux3-vip
IP Address: 192.168.1.207
Subnet Mask: 255.255.255.0
Configuration Assistant Progress Dialog: Click OK after configuration is complete.
Configuration Results: Click Exit
End of installation
At the end of the installation, exit from the OUI.
Verify Oracle Clusterware Installation
After extending Oracle Clusterware to the new node, we can run through several
tests to verify the install was successful. Run the following commands on
the new Oracle RAC node (linux3):
Confirm Oracle Clusterware Function
$ $ORA_CRS_HOME/bin/olsnodes -n
linux1 1
linux2 2
linux3 3
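The olsnodes check can also be scripted. This sketch assumes the two-column "name number" layout shown above; node_number is a hypothetical helper name:

```shell
#!/bin/bash
# node_number NAME: read "olsnodes -n" output on stdin and print NAME's node
# number; exit non-zero when NAME is not listed.
node_number() {
    awk -v name="$1" '$1 == name { print $2; found = 1 } END { exit !found }'
}

# Sample input copied from the olsnodes output above.
if num=$(printf 'linux1 1\nlinux2 2\nlinux3 3\n' | node_number linux3); then
    echo "linux3 is cluster node number $num"
else
    echo "linux3 is not registered with Oracle Clusterware"
fi
```

On the new node, the live output can be used instead: $ORA_CRS_HOME/bin/olsnodes -n | node_number linux3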
Check CRS Status
$ $ORA_CRS_HOME/bin/crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora....SM1.asm application 0/5 0/0 ONLINE ONLINE linux1
ora....X1.lsnr application 0/5 0/0 ONLINE ONLINE linux1
ora.linux1.gsd application 0/5 0/0 ONLINE ONLINE linux1
ora.linux1.ons application 0/3 0/0 ONLINE ONLINE linux1
ora.linux1.vip application 0/0 0/0 ONLINE ONLINE linux1
ora....SM2.asm application 0/5 0/0 ONLINE ONLINE linux2
ora....X2.lsnr application 0/5 0/0 ONLINE ONLINE linux2
ora.linux2.gsd application 0/5 0/0 ONLINE ONLINE linux2
ora.linux2.ons application 0/3 0/0 ONLINE ONLINE linux2
ora.linux2.vip application 0/0 0/0 ONLINE ONLINE linux2
ora.linux3.gsd application 0/5 0/0 ONLINE ONLINE linux3
ora.linux3.ons application 0/3 0/0 ONLINE ONLINE linux3
ora.linux3.vip application 0/0 0/0 ONLINE ONLINE linux3
ora.orcl.db application 0/1 0/1 ONLINE ONLINE linux2
ora....l1.inst application 0/5 0/0 ONLINE ONLINE linux1
ora....l2.inst application 0/5 0/0 ONLINE ONLINE linux2
ora...._taf.cs application 0/1 0/1 ONLINE ONLINE linux2
ora....cl1.srv application 0/1 0/0 ONLINE ONLINE linux1
ora....cl2.srv application 0/1 0/0 ONLINE ONLINE linux2
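With this many resources, scanning the Target and State columns by eye is error-prone. The sketch below filters the tabular output for anything that is not ONLINE; it assumes the seven-column layout shown above, and crs_not_online is a hypothetical helper name:

```shell
#!/bin/bash
# crs_not_online: read "crs_stat -t -v" output on stdin and print the name and
# state of every resource whose Target or State column is not ONLINE.
crs_not_online() {
    awk '$2 == "application" && ($5 != "ONLINE" || $6 != "ONLINE") { print $1, $6 }'
}

# Sample rows in the same layout as the output above; the second row is a
# made-up OFFLINE example for illustration.
crs_not_online <<'EOF'
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.linux3.gsd application    0/5    0/0    ONLINE    ONLINE    linux3
ora.linux3.ons application    0/3    0/0    ONLINE    OFFLINE
ora.linux3.vip application    0/0    0/0    ONLINE    ONLINE    linux3
EOF
```

Against the real output shown above, the filter prints nothing, since every resource is ONLINE.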
Check Oracle Clusterware Auto-Start Scripts on New Node (linux3)
$ $ORA_CRS_HOME/bin/crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
$ ls -l /etc/init.d/init.*
-rwxr-xr-x 1 root root 2236 Feb 23 19:16 /etc/init.d/init.crs
-rwxr-xr-x 1 root root 5252 Feb 23 19:16 /etc/init.d/init.crsd
-rwxr-xr-x 1 root root 44589 Feb 23 19:16 /etc/init.d/init.cssd
-rwxr-xr-x 1 root root 3669 Feb 23 19:16 /etc/init.d/init.evmd
Extend the Oracle Database software to the new Oracle RAC node from linux1!
Overview
After copying and configuring the Oracle Clusterware software to the new
node, we now need to copy the Oracle Database software from one of the existing nodes to linux3.
This is done by executing the Oracle provided utility addNode.sh from one of the
existing nodes in the cluster; namely linux1. This script
is located in the $ORACLE_HOME/oui/bin directory
(/u01/app/oracle/product/10.2.0/db_1/oui/bin).
Verifying Terminal Shell Environment
As discussed in the previous section, the terminal shell environment needs to
be configured for remote access and user equivalence to the new Oracle RAC node
before running the Oracle Universal Installer. Note
that you can use the same terminal shell session used in the previous section,
in which case you do not have to perform any of the actions described below
with regard to setting up remote
access and the DISPLAY variable:
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify that you can run the Secure Shell commands (ssh or scp)
from the Linux server that will run the Oracle Universal Installer
against all other Linux servers
in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
$ ssh linux1 "date;hostname"
Sat Feb 23 16:48:05 EST 2008
linux1
$ ssh linux3 "date;hostname"
Sat Feb 23 16:46:51 EST 2008
linux3
Install Oracle Database Software on the New Node
Copy the Oracle Database software to the new Oracle RAC node linux3.
As previously mentioned, this is performed by executing the new addNode.sh utility located
in the $ORACLE_HOME/oui/bin directory from linux1:
$ hostname
linux1
$ id -a
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba)
$ cd /u01/app/oracle/product/10.2.0/db_1/oui/bin
$ ./addNode.sh
Screen Name: Response
Welcome Screen: Click Next.
Specify Cluster Nodes to Add to Installation: In this screen, the OUI lists all of the nodes already part of
the installation in the top portion labeled "Existing Nodes". On the bottom half of the screen
labeled "Specify New Nodes" is a list of new nodes which can be added.
By default linux3 is selected. Verify linux3 is selected (checked) and
click Next to continue.
Cluster Node Additional Summary: Verify the new Oracle RAC node is listed under the "New Nodes" drilldown.
Click Install to start the installation!
Execute Configuration Scripts: Once all of the required Oracle Database components have been copied from linux1
to linux3, the OUI prompts to execute the root.sh
on the new Oracle RAC node.
Navigate to the /u01/app/oracle/product/10.2.0/db_1 directory
on linux3 and run root.sh as the "root" user account.
End of installation: At the end of the installation, exit from the OUI.
Perform the following configuration procedures from
only one of the Oracle RAC nodes in the cluster (linux1)!
The Network Configuration Assistant (NETCA) will set up the TNS
listener in a clustered configuration to include the new node in the cluster.
Overview
In this section, you will use the Network Configuration Assistant (NETCA)
to set up the TNS listener in a clustered configuration to include the new Oracle RAC node.
The NETCA program will be run from linux1 with user equivalence enabled to
all nodes in the cluster.
Verifying Terminal Shell Environment
As discussed in the previous section, the terminal shell environment needs to
be configured for remote access and user equivalence to the new Oracle RAC node
before running the NETCA. Note
that you can use the same terminal shell session used in the previous section,
in which case you do not have to perform any of the actions described below
with regard to setting up remote
access and the DISPLAY variable:
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify that you can run the Secure Shell commands (ssh or scp)
from the Linux server that will run the NETCA
against all other Linux servers
in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
$ ssh linux1 "date;hostname"
Sat Feb 23 16:48:05 EST 2008
linux1
$ ssh linux3 "date;hostname"
Sat Feb 23 16:46:51 EST 2008
linux3
Run the Network Configuration Assistant
To start the NETCA, run the following from linux1:
$ netca &
Screen Name: Select the Type of Oracle Net Services Configuration
Response: Select Cluster configuration.
Screen Name: Select the nodes to configure
Response: Only select the new Oracle RAC node: linux3.
Screen Name: Type of Configuration
Response: Select Listener configuration.
Screen Name: Listener Configuration (Next 6 Screens)
Response: The following screens are now like any other normal listener configuration. You can simply
accept the default parameters for the next six screens:
    What do you want to do: Add
    Listener name: LISTENER
    Selected protocols: TCP
    Port number: 1521
    Configure another listener: No
    Listener configuration complete! [ Next ]
You will be returned to this Welcome (Type of Configuration) Screen.
Screen Name: Type of Configuration
Response: Select Naming Methods configuration.
Screen Name: Naming Methods Configuration
Response: The following screens are:
    Selected Naming Methods: Local Naming
    Naming Methods configuration complete! [ Next ]
You will be returned to this Welcome (Type of Configuration) Screen.
Screen Name: Type of Configuration
Response: Click Finish to exit the NETCA.
Verify TNS Listener Configuration
The Oracle TNS listener process should now be running on all three nodes in the RAC
cluster:
$ hostname
linux1
$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX1
$ $ORA_CRS_HOME/bin/crs_stat ora.linux1.LISTENER_LINUX1.lsnr
NAME=ora.linux1.LISTENER_LINUX1.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on linux1
=====================
$ hostname
linux2
$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX2
$ $ORA_CRS_HOME/bin/crs_stat ora.linux2.LISTENER_LINUX2.lsnr
NAME=ora.linux2.LISTENER_LINUX2.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on linux2
=====================
$ hostname
linux3
$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_LINUX3
$ $ORA_CRS_HOME/bin/crs_stat ora.linux3.LISTENER_LINUX3.lsnr
NAME=ora.linux3.LISTENER_LINUX3.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on linux3
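The three crs_stat checks above can be collapsed into a single pass that only prints something when a listener is not online. The following is a minimal sketch under the assumption that crs_stat emits the NAME=/TYPE=/TARGET=/STATE= records shown above; the function name offline_resources is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: reads crs_stat-style records on stdin and prints
# the name of every resource whose STATE does not begin with ONLINE.
# Exits non-zero if at least one such resource was found.
offline_resources() {
    awk -F= '
        BEGIN         { bad = 0 }
        $1 == "NAME"  { name = $2 }
        $1 == "STATE" { if ($2 !~ /^ONLINE/) { print name; bad = 1 } }
        END           { exit bad }
    '
}

# Usage from any node (resource names as shown in this article):
#   for n in 1 2 3; do
#       $ORA_CRS_HOME/bin/crs_stat ora.linux$n.LISTENER_LINUX$n.lsnr
#   done | offline_resources && echo "all listeners ONLINE"
```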
Add the new Oracle instance to the new Oracle RAC node using DBCA!
Overview
The final step in extending the Oracle RAC database is to add a new
database instance to the new Oracle RAC node. The database instance will
be named orcl3 and hosted on the new node linux3.
This process can be performed using
either Enterprise Manager or the Database Configuration Assistant (DBCA). For
the purpose of this article, I am opting to use the DBCA.
Verifying Terminal Shell Environment
As discussed in the previous section, the terminal shell environment needs to
be configured for remote access and user equivalence to the new Oracle RAC node
before running the DBCA. Note that you can reuse the terminal shell
session from the previous section, in which case you do not have
to perform any of the actions described below for setting up
remote access and the DISPLAY variable:
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify that, from the Linux server on which you will run the DBCA,
you can run the Secure Shell commands (ssh or scp) against all other
Linux servers in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
$ ssh linux1 "date;hostname"
Sat Feb 23 16:48:05 EST 2008
linux1
$ ssh linux3 "date;hostname"
Sat Feb 23 16:46:51 EST 2008
linux3
Add Database Instance to New Node
To start the database instance creation process for the new Oracle RAC node,
run the following from linux1:
$ dbca &
Screen Name: Welcome Screen
Response: Select Oracle Real Application Clusters database.
Screen Name: Operations
Response: Select Instance Management.
Screen Name: Instance Management
Response: Select Add an instance.
Screen Name: List of cluster databases
Response: Provides a list of clustered databases running on the node. For the purpose of this
example, the clustered database running on node linux1 is orcl.
Select this clustered database.
    Password: <sys_password>
Screen Name: List of cluster database instances
Response: This screen provides a list of all instances currently available on the cluster, their status,
and which node they reside on.
Screen Name: Instance naming and node selection
Response: This screen lists the next instance name in the series and asks for the node
on which to add the instance. In this example, the next instance name is
orcl3 and the node name to create it on is
linux3. For this example, the default
values are correct (instance name "orcl3" to be added to node "linux3"). After
verifying these values, click Next to continue.
Screen Name: Database Services
Response: If the current clustered database has any database services defined, the next
screen allows the DBA to configure those database services for the new
instance. In this example, the existing clustered database has one service
defined named orcl_taf.
With the "orcl_taf" database service selected, change the details to
Preferred for the
new instance (orcl3) and set the "TAF Policy" to Basic.
Screen Name: Instance Storage
Response: By default, the DBCA does a good job of determining the instance-specific
files such as an UNDO tablespace (UNDOTBS3), database files for this tablespace,
and two redo log groups. Verify the storage options and
click Finish to add the instance.
Screen Name: Database Configuration Assistant: Summary
Response: After verifying the instance creation options in the summary dialog,
click OK to begin the instance management process.
Screen Name: Extend ASM
Response: During the add instance step, the DBCA verifies the new node and then checks
to determine if ASM is present on the existing cluster (in this example, ASM is
configured). The DBCA presents a dialog box indicating that "ASM is present on the
cluster but needs to be extended to the following nodes: [linux3]. Do you want ASM to be
extended?" Click Yes to add the ASM instance to the new node.
Screen Name: Database Configuration Assistant Progress Screen
Response: A progress bar is displayed while the new instance is being configured. Once the instance management
process is complete, the DBCA prompts the user with a dialog and the message "Do you want to perform
another operation?" Click No to end and exit the DBCA utility.
Start New Database Services
The DBCA will automatically start the new instance (orcl3) on the node linux3. If any services were
configured during the instance management process, however, they are left in an offline state.
For the purpose of this example, I had to manually start the "orcl_taf" service for the database:
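A service on a single instance is normally started with srvctl. The following is a minimal sketch, assuming the 10g srvctl syntax (-d database, -s service, -i instance) and the names used in this example (database orcl, service orcl_taf, new instance orcl3). It only prints the commands so they can be reviewed before being run as the oracle user.

```shell
#!/bin/sh
# Names taken from this example; adjust them for your environment.
DB_NAME=orcl
SERVICE_NAME=orcl_taf
NEW_INSTANCE=orcl3

# Build the commands as strings so they can be reviewed before use.
start_cmd="srvctl start service -d $DB_NAME -s $SERVICE_NAME -i $NEW_INSTANCE"
status_cmd="srvctl status service -d $DB_NAME -s $SERVICE_NAME"

echo "$start_cmd"
echo "$status_cmd"

# Once the printed commands look right, run them from any node:
#   $start_cmd && $status_cmd
```

Limiting the start to the new instance with -i leaves the service untouched on orcl1 and orcl2, where it is already running.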
Verify New Database Environment
$ $ORA_CRS_HOME/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE linux1
ora....X1.lsnr application ONLINE ONLINE linux1
ora.linux1.gsd application ONLINE ONLINE linux1
ora.linux1.ons application ONLINE ONLINE linux1
ora.linux1.vip application ONLINE ONLINE linux1
ora....SM2.asm application ONLINE ONLINE linux2
ora....X2.lsnr application ONLINE ONLINE linux2
ora.linux2.gsd application ONLINE ONLINE linux2
ora.linux2.ons application ONLINE ONLINE linux2
ora.linux2.vip application ONLINE ONLINE linux2
ora....SM3.asm application ONLINE ONLINE linux3
ora....X3.lsnr application ONLINE ONLINE linux3
ora.linux3.gsd application ONLINE ONLINE linux3
ora.linux3.ons application ONLINE ONLINE linux3
ora.linux3.vip application ONLINE ONLINE linux3
ora.orcl.db application ONLINE ONLINE linux2
ora....l1.inst application ONLINE ONLINE linux1
ora....l2.inst application ONLINE ONLINE linux2
ora....l3.inst application ONLINE ONLINE linux3
ora...._taf.cs application ONLINE ONLINE linux2
ora....cl1.srv application ONLINE ONLINE linux1
ora....cl2.srv application ONLINE ONLINE linux2
ora....cl3.srv application ONLINE ONLINE linux3
- or -
$ rac_crs_stat
HA Resource Target State
----------- ------ -----
ora.linux1.ASM1.asm ONLINE ONLINE on linux1
ora.linux1.LISTENER_LINUX1.lsnr ONLINE ONLINE on linux1
ora.linux1.gsd ONLINE ONLINE on linux1
ora.linux1.ons ONLINE ONLINE on linux1
ora.linux1.vip ONLINE ONLINE on linux1
ora.linux2.ASM2.asm ONLINE ONLINE on linux2
ora.linux2.LISTENER_LINUX2.lsnr ONLINE ONLINE on linux2
ora.linux2.gsd ONLINE ONLINE on linux2
ora.linux2.ons ONLINE ONLINE on linux2
ora.linux2.vip ONLINE ONLINE on linux2
ora.linux3.ASM3.asm ONLINE ONLINE on linux3
ora.linux3.LISTENER_LINUX3.lsnr ONLINE ONLINE on linux3
ora.linux3.gsd ONLINE ONLINE on linux3
ora.linux3.ons ONLINE ONLINE on linux3
ora.linux3.vip ONLINE ONLINE on linux3
ora.orcl.db ONLINE ONLINE on linux2
ora.orcl.orcl1.inst ONLINE ONLINE on linux1
ora.orcl.orcl2.inst ONLINE ONLINE on linux2
ora.orcl.orcl3.inst ONLINE ONLINE on linux3
ora.orcl.orcl_taf.cs ONLINE ONLINE on linux2
ora.orcl.orcl_taf.orcl1.srv ONLINE ONLINE on linux1
ora.orcl.orcl_taf.orcl2.srv ONLINE ONLINE on linux2
ora.orcl.orcl_taf.orcl3.srv ONLINE ONLINE on linux3
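The crs_stat -t listing truncates long resource names, which is why a wider listing such as the rac_crs_stat output above is easier to read. That script is not reproduced in this section. The following is a minimal sketch of how a comparable listing could be produced, assuming crs_stat run without arguments prints one NAME=/TYPE=/TARGET=/STATE= record per resource; the function name crs_wide and the column widths are hypothetical and this is not necessarily how rac_crs_stat itself is written.

```shell
#!/bin/sh
# Hypothetical formatter: turns crs_stat NAME=/TARGET=/STATE= records
# read from stdin into one untruncated line per resource.
crs_wide() {
    awk -F= '
        BEGIN {
            printf "%-40s %-8s %s\n", "HA Resource", "Target", "State"
            printf "%-40s %-8s %s\n", "-----------", "------", "-----"
        }
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        # STATE is the last field of each record, so print the row here.
        $1 == "STATE"  { printf "%-40s %-8s %s\n", name, target, $2 }
    '
}

# Usage:
#   $ORA_CRS_HOME/bin/crs_stat | crs_wide
```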
Verify New Instance
Log in to one of the instances and query the gv$instance view:
SQL> select inst_id, instance_name, status, to_char(startup_time, 'DD-MON-YYYY HH24:MI:SS')
2 from gv$instance order by inst_id;
INST_ID INSTANCE_NAME STATUS TO_CHAR(STARTUP_TIME
---------- ---------------- ---------- --------------------
1 orcl1 OPEN 23-FEB-2008 00:10:16
2 orcl2 OPEN 23-FEB-2008 00:10:47
3 orcl3 OPEN 26-FEB-2008 22:52:51
Update TNSNAMES
Log in to all machines that will be accessing the new instance
and update the tnsnames.ora file (if necessary).
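For clients that connect to a specific instance, the update usually amounts to adding one more alias. The following is a minimal sketch that prints such an entry. Every value passed to it is an assumption to be replaced with your own: the alias ORCL3, the host name linux3-vip, port 1521, and the service name orcl are illustrative and should be copied from the existing orcl1 and orcl2 entries in your tnsnames.ora file.

```shell
#!/bin/sh
# Hypothetical generator for a tnsnames.ora entry.
# Arguments: alias, host, port, service name, instance name.
tns_entry() {
    alias_name=$1 host=$2 port=$3 service=$4 instance=$5
    cat <<EOF
$alias_name =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = $host)(PORT = $port))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = $service)
      (INSTANCE_NAME = $instance)
    )
  )
EOF
}

# Print an entry for the new instance; review it before appending it
# to the tnsnames.ora file on each client.
tns_entry ORCL3 linux3-vip 1521 orcl orcl3
```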
Verify Enterprise Manager - Database Control
The DBCA should have updated and added the new node(s) to EM Database Control. Bring up
a web browser and navigate to the Database Control URL for the cluster to confirm that
the new node and instance are listed.
Jeffrey Hunter is an Oracle Certified Professional, Java Development Certified Professional, Author,
and an Oracle ACE.
Jeff currently works as a Senior Database Administrator for
The DBA Zone, Inc. located in Pittsburgh, Pennsylvania.
His work includes advanced performance tuning, Java and PL/SQL programming, capacity
planning, database security, and physical / logical database design in a UNIX,
Linux, and Windows server environment. Jeff's other interests include mathematical
encryption theory, programming language processors (compilers and interpreters)
in Java and C, LDAP, writing web-based database administration tools, and of
course Linux. He has been a Sr. Database Administrator and Software Engineer
for over 16 years and maintains his own website at:
http://www.iDevelopment.info.
Jeff graduated from Stanislaus State University in Turlock,
California, with a Bachelor's degree in Computer Science.
Thursday, 03-Sep-2009 15:59:39 EDT