Oracle DBA Tips Corner
Create an Oracle RAC 10g Release 2 using VMware Workstation 5 - (CentOS 4.4)
by Jeff Hunter, Sr. Database Administrator
Overview
For those who simply want to become familiar with Oracle10g RAC, this article
provides a very low cost way to configure an Oracle10g RAC system using
my laptop (which is running Windows XP) and VMware Workstation 5.
We will use VMware Workstation 5.5.3 to create and configure two virtual
machines running CentOS 4.4 Enterprise Linux. The two virtual machines
(to be named vmlinux1 and vmlinux2) will then be used
for an Oracle10g RAC dual node configuration.
This article was written and tested using:
Keep in mind that with VMware Workstation 5.x, once either of the virtual
machines is configured to cluster the shared disks, the VMware console will
display the following error when starting
the virtual machine:
This is a warning that can be safely ignored for our 10g RAC installation and configuration.
Each of the virtual machines will virtualize all of the hardware components required
for an Oracle10g two node RAC configuration. For example, each of the virtual machines will
be configured with two network interfaces - one for the public network and a second (running on
a separate subnet) for the interconnect. With VMware, creating additional hardware components
to virtualize (e.g. network interfaces) is effortless. The shared storage component, however, will be
a bit more tricky. As with the network interfaces, we will use VMware to virtualize several hard disks
to be used for Oracle's physical database files (data, online redo logs, control files, archived redo logs).
The new hard drives will be created using VMware on the first virtual node (vmlinux1) while the
second virtual node (vmlinux2) will be configured to share them.
When we are done, we will have a dual node cluster (each virtual machine will have a single processor),
both running Linux (CentOS 4.4 or Red Hat Enterprise Linux 4), Oracle10g Release 2, OCFS2, ASMLib 2.0
with shared disk storage being virtualized through VMware.
This article does assume some familiarity with installing and creating virtual machines
using the VMware Workstation 5 software as well as installing and configuring Oracle10g RAC.
Details for the installation and virtual machine creation
using VMware Workstation will not be provided in this document. I do, however, provide links (next section) to other articles of
mine that do provide these detailed instructions.
Install VMware Workstation 5
In this article, I will not be providing details for installing VMware Workstation 5
on the Windows XP Platform since it is like installing any type
of software for Windows. If you would, however, like to see the
step-by-step details for this type of install, I do have a separate article that provides
all of the tasks (including screenshots) for a successful install at
Installing VMware Workstation 5.0 - (Windows XP).
Host Machine and Virtual Machine Configuration Overview
Note that I have a 300GB external hard drive connected to my
laptop. While the VMware Workstation software will be installed on the internal hard drive,
(C:), I will be using an external hard drive, (M:), for
all virtual machines and disks to be used for the shared storage.
ASM SPFILE Volume will be mounted on /u02/oradata/orcl
The following table describes the virtual names and IP addresses
that I will be using during the installation of Oracle10g
Clusterware:
Create Two Virtual Machines - (CentOS Enterprise Linux 4.4)
Optional Virtual Machine Configuration Steps
I generally remove the floppy drive and sound card.
For each virtual machine, select [Edit virtual machine settings]
and navigate to the device you want to remove. The following screen shot shows how to remove the
audio device:
Configure Second Network Interface
For each of the new virtual machines, we need to create a second network interface for the interconnect.
VMware allows you to effortlessly create (or virtualize) a second network interface.
For each virtual machine, select [Edit virtual machine settings]
and click the [Add] button. This will bring up the "Add Hardware Wizard". The
following table identifies the values to be used to configure the second network interface:
Install CentOS to New Virtual Machines
Use the links (below) to download CentOS Enterprise Linux 4.4. After
downloading CentOS, you will then want to burn each of the ISO images
to CD.
To start, insert Disk #1 of CentOS Enterprise Linux into the physical CD-ROM
drive and then power up the first virtual machine (vmlinux1). There are several ways
to power up the virtual machine:
The following table describes the values I provided to install CentOS Enterprise Linux 4.4
to each virtual machine.
After checking your media CDs (or, if you are like me, skipping this step),
the installer starts to probe for your video device, monitor, and mouse.
The installer should determine that the video driver to use is VMware.
It will detect the monitor as Unknown (which is OK). It then probes for and finds
the mouse. Once this process is done, it will start the X Server.
Starting with RHEL 4, the installer will create the same disk configuration
as just noted
but will create it using the Logical Volume Manager (LVM). For example,
it will partition the first hard drive (/dev/sda for my configuration)
into two partitions - one for the /boot partition (/dev/sda1) and the
remainder of the disk dedicated to an LVM volume group named VolGroup00 (/dev/sda2).
The LVM Volume Group (VolGroup00) is then divided into two LVM logical volumes -
one for the root file system (/) and another for swap. I basically check
that it created at least 1GB of swap. Since I configured the virtual machine
to take 748MB of RAM, the installer created 1,496MB of swap (twice the RAM).
First, make sure that each of the network devices is checked
as [Active on boot]. The installer may choose
not to activate eth1.
Second, [Edit] both eth0 and eth1 as follows. You may choose
to use different IP addresses for both eth0 and eth1 and that
is OK. If possible, try to put eth1 (the interconnect) on
a different subnet than eth0 (the public network):
eth0:
eth1:
Continue by setting your hostname manually. I used
"vmlinux1" for the first node and "vmlinux2" for the second.
Finish this dialog off by supplying your gateway and
DNS servers.
This is where you pick the packages to install. If you simply scroll down
to the [Miscellaneous] section and select [Everything], the installer will
install all packages. To simplify the installation, this is the option I typically choose.
Doing this,
you will get everything required for Oracle, but you will also get many packages that
are not necessary for Oracle to install. Having these unwanted packages does not keep
me up at night.
If you don't want to install everything, you can choose just those packages that
are needed for Oracle. First, ensure that the [Kernel Development Libraries]
and the [Development Tools] package are selected. You must have these packages for Oracle to
install.
If you will be installing Oracle9i or Oracle10g, then you will need to select the
[Legacy Software Development Libraries]. Oracle9i and Oracle10g need the
older versions of gcc to compile, and these are included in the
legacy package.
During the installation process,
you will be asked to switch disks to Disk #2, Disk #3, and then Disk #4.
Click [Continue] to start the installation process.
Note that with CentOS 4.2 and CentOS 4.4, the installer will ask to switch
to Disk #2, Disk #3, Disk #4, Disk #1, and then back to Disk #4.
Install VMware Tools - (optional)
Installing VMware Tools within X
NOTE: In some Linux distributions, the VMware Tools CD icon may fail to
appear when you install VMware Tools within an X windows session on a guest.
In this case, you should continue installing VMware Tools as described in
"Installing VMware Tools from the Command Line with the Tar Installer", beginning with step 3.
NOTE: Be sure to respond [yes] if the installer offers to run the configuration program.
NOTE: Some Linux distributions
automatically mount CD-ROMs. If your distribution uses automounting,
do not use the mount and umount commands described in this section.
You still must untar
the VMware Tools installer to /tmp.
NOTE: If you have a previous installation, delete
the previous vmware-tools-distrib directory before installing. The default location of this directory is
/tmp/vmware-tools-distrib.
NOTE: If you attempt to install a tar installation
over an rpm installation - or the reverse - the installer detects the previous
installation and must convert the installer database format before continuing.
Network Configuration
Each node should have one static IP address for the public network
and one static IP address for the private cluster interconnect. The
private interconnect should only be used by Oracle to transfer
Cluster Manager and Cache Fusion related data. Note that Oracle
does not support using the public network interface for the interconnect.
You must have one network interface for the public network and
another network interface for the private interconnect. For a production RAC implementation,
the interconnect should be at least gigabit and should be used only by Oracle.
Starting with the Red Hat Enterprise Linux 3.0 release (and in CentOS Enterprise
Linux), the FTP server (wu-ftpd) is no longer
available with
Configure Telnet for root logins
Simply edit the file
Configure FTP for root logins
Edit the files
The easiest way to configure network settings in Red Hat Linux is with the program
Network Configuration. This application can be started from the command-line
as the "root" user account as follows:
Using the Network Configuration application, you need to configure
both NIC devices as well as the
Our example configuration will use the following settings:
It's all about availability of the application.
When a node fails, the VIP associated with it is supposed to be automatically
failed over to some other node. When this occurs, two things happen.
This means that when the client issues SQL to the node that is
now down, or traverses the address list while connecting, rather than
waiting on a very long TCP/IP time-out (~10 minutes), the client receives
a TCP reset. In the case of SQL, this is ORA-3113. In the case of
connect, the next address in tnsnames is used.
Going one step further is making use of Transparent Application Failover (TAF).
With TAF successfully configured, it is possible to completely avoid ORA-3113
errors altogether!
Without using VIPs, clients connected to a node that died will often wait
a 10 minute TCP timeout period before getting an error.
As a result, you don't really have a good HA solution without using VIPs.
Source - Metalink: "RAC Frequently Asked Questions" (Note:220970.1)
Oracle strongly suggests adjusting the default and maximum send buffer size
(SO_SNDBUF socket option) to 256 KB, and the default and maximum receive
buffer size (SO_RCVBUF socket option) to 256 KB.
The receive buffers are used by TCP and UDP to hold received data until it is read by
the application. With TCP, the receive buffer cannot overflow because the peer is not allowed to
send data beyond the buffer size window. UDP has no such flow control, so datagrams will be discarded if
they don't fit in the socket receive buffer. This allows a fast sender to overwhelm
the receiver.
The above commands made the changes to the already running O/S.
You should now make the above changes permanent (for each reboot) by adding the following lines
to the /etc/sysctl.conf file for each node in your RAC cluster:
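As a minimal sketch of the settings described above, the following prints the four buffer parameters in "key=value" form. It assumes the standard Linux net.core.* sysctl keys and takes 256 KB as 262144 bytes; the function name print_socket_buffer_settings is hypothetical and used only for illustration.

```shell
# Print the four socket buffer settings in sysctl.conf "key=value" form.
# 256 KB = 256 * 1024 = 262144 bytes.
print_socket_buffer_settings() {
    size=$((256 * 1024))
    for key in rmem_default rmem_max wmem_default wmem_max; do
        echo "net.core.${key}=${size}"
    done
}

print_socket_buffer_settings
```

As root, each printed line can be applied to the running kernel with "sysctl -w <line>" and appended to /etc/sysctl.conf so that it survives a reboot.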
If UDP ICMP is blocked or rejected by the firewall, the Oracle Clusterware software will crash after several minutes
of running. When the Oracle Clusterware process fails, you will have something similar to the following
in the <machine_name>_evmocr.log file:
Configure Virtual Shared Storage
First, ensure that both virtual machines are powered down. If either of the
virtual machines is running,
click in the virtual machine and select [Start] - [Turn off computer] - [Turn off].
The following table identifies the values to be used to configure the first virtual (SCSI) hard drive using
the "Add Hardware Wizard". Using vmlinux1,
select the [Edit virtual machine settings]
option in the VMware software console. Then click the [Add] button to bring up the "Add Hardware Wizard".
This process will need to be repeated four additional times to create all five virtual (SCSI) hard drives
listed above:
Each of the two virtual machines we created has a "VMware Configuration File" named
"rhel4.vmx". For my environment, these files are located at:
This configuration data
should be inserted in the VMware Configuration File (Red Hat Enterprise Linux 4.vmx) for both
virtual machines. I generally append this text to the end of the configuration file for both virtual
machines:
With VMware Workstation 5.5 and higher, you will receive the error
"Clustering is not supported for VMware Workstation. This setting will be ignored."
upon starting each virtual machine:
This warning can be safely ignored. Acknowledge the dialog by clicking [OK].
During the Linux boot
process (for all versions of VMware Workstation), the O/S will detect the new SCSI adaptor as an
"LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI". When prompted, hit any key to continue.
This will bring up the [Hardware Added] screen for the LSI Logic Ultra320 SCSI adaptor.
Select the [Configure] button to register the new SCSI adaptor.
Create oracle User and Directories
For this example, I used:
Create Partitions on the Shared Storage Devices
The following table lists the partition that will
be created on each of the shared virtual disks and what files
will be contained on them.
Create Partition on Each Shared Virtual Disk
Disk 1
Disk 2
Disk 3
Disk 4
Disk 5
Verify New Partitions
Configure the Linux Servers for Oracle
Throughout this section you will notice that there are several different ways to
configure (set) these parameters. For the purpose of this article, I will
be making all changes permanent (through reboots) by placing all commands
in the /etc/sysctl.conf file.
As root, make a file that will act as additional swap space, let's say about 300MB:
Now we should change the file permissions:
Finally we format the "partition" as swap and add it to the swap space:
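The three steps above might be sketched as follows. The helper name make_swap_file and the path /tempswap are assumptions for illustration only; the mkswap and swapon steps are shown as comments because they must be run as root.

```shell
# Create a zero-filled file of the given size (in MB) and
# restrict access to its owner.
make_swap_file() {
    swap_path=$1
    size_mb=$2
    dd if=/dev/zero of="$swap_path" bs=1024 count=$((size_mb * 1024)) 2>/dev/null || return 1
    chmod 600 "$swap_path"
}

# As root (the path /tempswap is a hypothetical choice):
#   make_swap_file /tempswap 300
#   mkswap /tempswap      # format the file as swap
#   swapon /tempswap      # add it to the active swap space
```

Keeping the file creation in a small function makes it easy to repeat the same steps on both nodes with the same size.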
Oracle makes use of shared memory for its System Global Area (SGA) which is an area of
memory that is shared by all Oracle background and foreground processes. Adequate sizing of
the SGA is critical to Oracle performance since it is responsible for holding the database
buffer cache, shared SQL, access paths, and so much more.
To determine all shared memory limits, use the following:
Setting SHMMAX
You can determine the value of SHMMAX by performing the following:
Setting SHMMNI
You can determine the value of SHMMNI by performing the following:
Setting SHMALL
To determine all semaphore limits, use the following:
Setting SEMMSL
Oracle recommends setting SEMMSL to the largest PROCESSES instance parameter
setting in the init.ora file for all databases on the Linux system plus 10.
Also, Oracle recommends setting the SEMMSL to a value of no less than 100.
Setting SEMMNI
Oracle recommends setting the SEMMNI to a value of no less than 100.
Setting SEMMNS
Oracle recommends setting the SEMMNS to the sum of the PROCESSES
instance parameter setting for each database on the system, adding the largest
PROCESSES twice, and then finally adding 10 for each Oracle database on the system.
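The SEMMNS recommendation above can be worked through with a little shell arithmetic. This is only a sketch: the function name calc_semmns is hypothetical, and the PROCESSES values of 150 and 300 are made-up examples for two databases.

```shell
# Compute the recommended SEMMNS from the PROCESSES setting of each
# database on the system (one argument per database):
#   sum of all PROCESSES + 2 * largest PROCESSES + 10 per database
calc_semmns() {
    sum=0
    largest=0
    count=0
    for p in "$@"; do
        sum=$((sum + p))
        if [ "$p" -gt "$largest" ]; then
            largest=$p
        fi
        count=$((count + 1))
    done
    echo $((sum + 2 * largest + 10 * count))
}

calc_semmns 150 300    # 450 + 600 + 20, prints 1070
```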
Use the following calculation to determine the maximum number of semaphores
that can be allocated on a Linux system. It will be the lesser of:
Setting SEMOPM
The semop system call (function) provides the ability to perform operations on multiple
semaphores with one semop system call. Since a semaphore set can hold at most
SEMMSL semaphores, it is recommended
to set SEMOPM equal to SEMMSL.
Oracle recommends setting the SEMOPM to a value of no less than 100.
Setting Semaphore Kernel Parameters
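All four semaphore values are set together through the single kernel.sem parameter, which the kernel reads in the order SEMMSL, SEMMNS, SEMOPM, SEMMNI. The sketch below builds that line and also computes the effective semaphore ceiling, assuming the commonly documented rule that it is the lesser of SEMMNS and SEMMSL x SEMMNI. The function names are hypothetical, and the values 250 32000 100 128 are the figures usually quoted for Oracle10g, used here as an example rather than as this article's settings.

```shell
# Build the kernel.sem line. The kernel expects the values in this order:
#   SEMMSL SEMMNS SEMOPM SEMMNI
format_kernel_sem() {
    semmsl=$1; semmns=$2; semopm=$3; semmni=$4
    echo "kernel.sem=${semmsl} ${semmns} ${semopm} ${semmni}"
}

# Effective system-wide ceiling: the lesser of SEMMNS and SEMMSL * SEMMNI.
max_semaphores() {
    semmsl=$1; semmns=$2; semmni=$3
    by_sets=$((semmsl * semmni))
    if [ "$semmns" -lt "$by_sets" ]; then
        echo "$semmns"
    else
        echo "$by_sets"
    fi
}

format_kernel_sem 250 32000 100 128
max_semaphores 250 32000 128
```

The printed kernel.sem line is in the form accepted by /etc/sysctl.conf.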
Use the following command to determine the maximum number of file handles
for the entire system:
Oracle recommends that the file handles for the entire system be set to at least 65536.
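A quick way to compare the current limit against this recommendation is sketched below. It assumes the limit is exposed at /proc/sys/fs/file-max, as on standard Linux kernels; the function name check_file_max and its optional second argument are hypothetical conveniences.

```shell
# Report whether the system-wide file handle limit meets a minimum.
# The optional second argument (the file to read) exists only so the check
# can be tried against a sample file; it defaults to the real kernel value.
check_file_max() {
    minimum=$1
    source_file=${2:-/proc/sys/fs/file-max}
    current=$(cat "$source_file") || return 2
    if [ "$current" -ge "$minimum" ]; then
        echo "OK: fs.file-max is $current"
    else
        echo "LOW: fs.file-max is $current, want at least $minimum"
        return 1
    fi
}

check_file_max 65536 || echo "Consider adding fs.file-max=65536 to /etc/sysctl.conf"
```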
Use the following command to determine the value of ip_local_port_range:
To make these changes, run the following as root:
We could reboot at this point to ensure all of these parameters are set in the
kernel or we could simply "run" the /etc/sysctl.conf file by running the
following command as root. Perform this on both Oracle RAC nodes in the cluster!
Please note that although this would seem like a severe error from the OUI, it can
safely be disregarded as a warning. The "tar" command DOES actually
extract the files; however, when you perform a listing of the files (using ls -l) on the remote node,
they will be missing the time field until the time on the server is greater than the
timestamp of the file.
Before starting any of the above noted installations, ensure that each member node of the cluster
is set as closely as possible to the same date and time. Oracle strongly recommends using
the Network Time Protocol feature of most operating systems for this purpose,
with all nodes using the same reference Network Time Protocol server.
Accessing a Network Time Protocol server, however, may not always be an option.
In this case, when manually setting the date and time for the nodes in the
cluster, ensure that the date and time of the node you are performing the software
installations from (vmlinux1) is less than all other nodes in the cluster (vmlinux2).
I generally use a 20 second difference as shown in the following example:
Setting the date and time from vmlinux1:
Setting the date and time from vmlinux2:
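One way to produce the two timestamps with a fixed 20 second gap is sketched here. The helper name offset_time and the base time are hypothetical, the times are handled in UTC for simplicity, and the sketch assumes a date command that accepts -d with both a timestamp and an "@epoch" value (as GNU date does).

```shell
# Print a timestamp a given number of seconds after a base time,
# in a format that "date -s" accepts.
offset_time() {
    base=$1
    offset_seconds=$2
    epoch=$(date -u -d "$base" +%s) || return 1
    date -u -d "@$((epoch + offset_seconds))" '+%m/%d/%Y %H:%M:%S'
}

base="2006-12-20 23:00:00"    # hypothetical base time
offset_time "$base" 0         # value to set on vmlinux1
offset_time "$base" 20        # value to set on vmlinux2 (20 seconds ahead)
```

Each printed value would then be applied on the matching node, as root, with date -s "<value>".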
The two-node RAC configuration described in this article does not make use
of a Network Time Protocol server.
Configure the "hangcheck-timer" Kernel Module
Oracle 9.0.1 and 9.2.0.1 used a userspace watchdog daemon called
watchdogd to monitor the health of the cluster and to
restart a RAC node in case of a failure. Starting with Oracle 9.2.0.2
(and still available in Oracle10g Release 2), the
watchdog daemon has been replaced by a Linux kernel module named
hangcheck-timer which addresses availability and reliability
problems much better. The hangcheck-timer module is loaded into the
Linux kernel and checks if the system hangs. It will set a timer and check the
timer after a certain amount of time. There is a configurable threshold
that, if exceeded, will cause the module to reboot the machine. Although
the hangcheck-timer module is not required for Oracle Clusterware (Cluster Manager)
operation, it is highly recommended by Oracle.
Much more information about the
hangcheck-timer project
can be found
here.
These values need to be available after each reboot of the Linux server. To do this, make
an entry with the correct values to the /etc/modprobe.conf file as follows:
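A small sketch of adding that entry without duplicating it on repeated runs. The helper add_line_once is hypothetical, and the hangcheck_tick=30 and hangcheck_margin=180 values are the ones used in the insmod example in this section.

```shell
# Append a line to a file only if that exact line is not already present.
add_line_once() {
    target=$1
    line=$2
    grep -qxF -- "$line" "$target" 2>/dev/null || echo "$line" >> "$target"
}

# As root, on each node:
#   add_line_once /etc/modprobe.conf \
#       "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180"
```

Because the helper checks for the exact line first, it is safe to run again after a reinstall or when scripting both nodes.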
It is only out of pure habit that I continue to include a modprobe
of the hangcheck-timer kernel module in the /etc/rc.local file. Someday I will get
over it, but realize that it does not hurt to include a modprobe of
the hangcheck-timer kernel module during startup.
So to keep myself sane and able to
sleep at night, I always configure the loading of the hangcheck-timer kernel module on
each startup as follows:
Now, to test the hangcheck-timer kernel module to verify it is picking up the
correct parameters we defined in the /etc/modprobe.conf file, use the modprobe
command. Although you could load
the hangcheck-timer kernel module by passing it the appropriate parameters
(e.g. insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180),
we want to verify that it is picking up the options we set in the
/etc/modprobe.conf file.
To manually load the hangcheck-timer kernel module and verify it is using the
correct values defined in the /etc/modprobe.conf file, run the following command:
Configure RAC Nodes for Remote Access
So, why do we have to set up user equivalence? Installing Oracle Clusterware and the Oracle
Database software is only performed from one node in a RAC cluster. When running the Oracle
Universal Installer (OUI) on that particular node, it will use the ssh and scp
commands (or rsh and rcp
commands if using remote shell) to run remote commands on and copy files (the Oracle software)
to all other nodes within the RAC cluster. The "oracle" UNIX user account on the node running the
OUI (runInstaller) must be trusted by all other nodes in your RAC cluster. This means that you
must be able to run the secure shell commands (ssh or scp) or the remote shell commands
(rsh and rcp) on the Linux server you will be running the OUI from
against all other Linux servers in the RAC cluster without being prompted for a password.
The first step is to decide which method of remote access to use - secure shell or remote shell.
Both of them have their pros and cons. Remote shell, for example, is extremely easy to set up and
configure. It takes fewer steps to set up and is always available in the terminal session when logging
on to the trusted node (the node you will be performing the install from). The connection to the
remote nodes, however, is not secure during the installation and any patching process. Secure
shell, on the other hand, does provide a secure connection when installing and patching but does
require a greater number of steps. It also needs to be enabled in the terminal session each time the oracle
user logs in to the trusted node. The official Oracle documentation only describes the steps for
setting up secure shell, which is considered the preferred method.
Both methods for configuring user equivalence are described in the following two sections:
To determine if SSH is installed and running, enter the following command:
Use the following steps to create the RSA and DSA key pairs. Please note that
these steps will need to be completed on both Oracle RAC nodes in the cluster:
This command will write the public key to the ~/.ssh/id_rsa.pub
file and the private key to the ~/.ssh/id_rsa file.
Note that you should never distribute the private key to anyone!
This command will write the public key to the ~/.ssh/id_dsa.pub
file and the private key to the ~/.ssh/id_dsa file. Note that
you should never distribute the private key to anyone!
Now that both Oracle RAC nodes contain a public and private key for both RSA and DSA,
you will need to create an authorized key file on one of the nodes. An
authorized key file is nothing more than a single file that contains a copy
of everyone's (every node's) RSA and DSA public key. Once the authorized key
file contains all of the public keys, it is then distributed to all other
nodes in the RAC cluster.
Complete the following steps on one of the nodes in the cluster to create
and then distribute the authorized key file. For the purpose of this
article, I am using vmlinux1:
The following example is being run from vmlinux1 and assumes a
two-node cluster, with nodes vmlinux1 and vmlinux2:
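The authorized key file is simply a concatenation of every node's public keys, which might be sketched as follows. The function name build_authorized_keys and the "*.vmlinux2" file names are assumptions for illustration, not the exact commands used in the original listing.

```shell
# Concatenate public key files into a single authorized key file
# and restrict its permissions.
build_authorized_keys() {
    out=$1
    shift
    : > "$out"                      # start from an empty file
    for pub in "$@"; do
        cat "$pub" >> "$out" || return 1
    done
    chmod 600 "$out"
}

# On vmlinux1, as the oracle user, after copying vmlinux2's public keys over
# (the *.vmlinux2 file names are hypothetical):
#   cd ~/.ssh
#   build_authorized_keys authorized_keys id_rsa.pub id_dsa.pub \
#       id_rsa.pub.vmlinux2 id_dsa.pub.vmlinux2
#   scp authorized_keys vmlinux2:.ssh/
```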
User equivalence will need to be enabled on any new terminal shell session before attempting
to run the OUI. If you log out and log back in to the node you will be performing the
Oracle installation from, you must enable user equivalence for the terminal shell session
as this is not done by default.
To enable user equivalence for the current terminal shell session,
perform the following steps:
Also, if you see any other messages or text, apart from the date and hostname,
then the Oracle installation can fail. Make any changes required to
ensure that only the date and hostname are displayed when you enter these commands.
You should ensure that any parts of the login scripts that generate
output, or ask questions, are modified so that they act only when
the shell is an interactive shell.
Bourne, Korn, and Bash shells:
To avoid this problem, you must modify these files to suppress all output on
STDERR as in the following examples:
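For Bourne-style shells, a common approach is to run terminal-only commands such as stty only when standard input is attached to a terminal. The sketch below wraps that test in a hypothetical helper named run_if_terminal; a login script could equally use the "[ -t 0 ]" test inline.

```shell
# Run a command only when standard input is a terminal.
# A non-interactive ssh or rsh session has no terminal, so the command
# is skipped and nothing is written to the session.
run_if_terminal() {
    if [ -t 0 ]; then
        "$@"
    fi
}

run_if_terminal stty intr '^C'
```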
The rsh daemon
validates users using the /etc/hosts.equiv file or the .rhosts
file found in the user's (oracle's) home directory.
First, let's make sure that we have the rsh RPMs installed on
both of the Oracle RAC nodes in the cluster:
To enable the "rsh" and "rlogin" services, the "disable" attribute in the /etc/xinetd.d/rsh
file must be set to "no" and xinetd must be reloaded.
This can be done by running the following commands on both Oracle RAC nodes in the cluster:
I will typically rename the Kerberos version of rsh so that the normal
rsh command will be used. Use the following:
You should now test your connections and run the rsh command from the node that will be performing
the Oracle Clusterware and 10g RAC installation. I will be using the node vmlinux1 to perform all installs so this
is where I will run the following commands from:
All Startup Commands for Each RAC Node
In this section, I provide all of the commands, parameters, and entries that
have been discussed so far that will need to be included in the startup scripts
for each Linux node in the RAC cluster. For each of the startup files below,
I indicate in blue the entries that should be included
in each of the startup files in order to provide a successful RAC node.
/etc/modprobe.conf
/etc/sysctl.conf
/etc/hosts
/etc/hosts.equiv
Allow logins to both Oracle RAC nodes as the oracle user account without
the need for a password when using the remote shell method for enabling user equivalency.
/etc/rc.local
Install and Configure Oracle Cluster File System (OCFS2)
OCFS (Release 1) was released in December 2002 to enable Oracle Real
Application Cluster (RAC) users to run the clustered database without
having to deal with RAW devices. The file system was designed to store
database related files, such as data files, control files, redo logs,
archive logs, etc. OCFS2 is the next generation of the Oracle Cluster
File System. It has been designed to be a general purpose cluster file
system. With it, one can store not only database related files on a shared
disk, but also store Oracle binaries and configuration files (shared Oracle Home)
making management of RAC even easier.
In this article, I will be using the latest release of OCFS2
(OCFS2 Release 1.2.4-2
at the time of this writing)
to store
the two files that are required to be shared by the Oracle Clusterware software.
Along with these two files, I will also be using this space to store the shared
ASM SPFILE for all Oracle RAC instances.
See the following page for more information on OCFS2
(including Installation Notes) for Linux:
Download the appropriate RPMs starting with the latest OCFS2 kernel module (the driver).
With CentOS 4.4 Enterprise Linux, I am using kernel release 2.6.9-42.EL. The appropriate
OCFS2 kernel module was found in the latest release of OCFS2 at the time of this writing
(OCFS2 Release 1.2.4-2).
The available OCFS2 kernel modules for Linux kernel 2.6.9-42.EL are listed below.
Always download the latest OCFS2 kernel module that matches the distribution, platform, kernel version and
the kernel flavor (smp, hugemem, psmp, etc).
Install OCFS2
To disable SELinux, run the "Security Level Configuration" GUI utility:
All of the above cluster services have been packaged in the o2cb system service
(/etc/init.d/o2cb). Here is a short listing of some of the more useful
commands and options for the o2cb system service.
Set the on-boot properties as follows:
We can now start to make use of the partitions we created in
the section
Create Partitions on the Shared Storage Devices.
Well, at least the first partition!
If the O2CB cluster is offline, start it. The format operation needs the cluster to be online, as it
needs to ensure that the volume is not mounted on some node in the cluster.
Earlier in this document, we created
the directory /u02/oradata/orcl
under the section Create Mount Point for OCFS / Clusterware.
This section contains the commands to create and mount the file system to be used
for the Cluster Manager - /u02/oradata/orcl.
See the instructions
below on how to create the OCFS2 file
system using the command-line tool mkfs.ocfs2.
To create the file system, we can use the Oracle
executable mkfs.ocfs2. For the purpose
of this example, I run the following command only from
vmlinux1 as the root user account using
the local SCSI volume
for crs /dev/sdb1. Also note that I specified
a label named "oracrsfiles" which will be
referred to when mounting or un-mounting the volume:
First, here is how to manually mount the OCFS2 file system from the
command-line. Remember that this needs
to be performed as the root user account:
Any other type of volume, including an Oracle home (which I will not be using for this article),
should not be mounted with this mount option.
We start by adding the following line to the /etc/fstab file on both nodes in the RAC cluster:
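As a hedged sketch, an entry of this kind can be generated from the volume label and mount point. The function name ocfs2_fstab_line is hypothetical, and the option list (_netdev,datavolume,nointr) is an assumption based on the OCFS2 1.2 guidance for volumes holding Oracle Clusterware files; verify it against the OCFS2 documentation for your release.

```shell
# Print an /etc/fstab entry that mounts an OCFS2 volume by label.
ocfs2_fstab_line() {
    label=$1
    mount_point=$2
    echo "LABEL=${label} ${mount_point} ocfs2 _netdev,datavolume,nointr 0 0"
}

ocfs2_fstab_line oracrsfiles /u02/oradata/orcl
```

Mounting by label rather than by device name keeps the entry valid even if the SCSI device ordering differs between the two nodes.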
Now, let's make sure that the ocfs2.ko kernel module is being loaded
and that the file system will be mounted during the boot process.
If you have been following along with the
examples in this article, the actions to load the kernel module and mount
the OCFS2 file system should already be enabled. However, we should still
check those options by running the following on both nodes in the
RAC cluster as the root user account:
Let's first check the permissions:
Install and Configure Automatic Storage Management (ASMLib 2.0)
ASM was introduced in Oracle10g Release 1 and is used to relieve the DBA of
having to manage individual files and drives. ASM is built into the Oracle kernel and
provides the DBA with a way to manage thousands of disk drives 24x7 for both
single and clustered instances of Oracle. All of the files and directories to be used
for Oracle will be contained in a disk group. ASM automatically performs
load balancing in parallel across all available disk drives to prevent hot spots and
maximize performance, even with rapidly changing data usage patterns.
There are two
different methods to configure ASM on Linux:
In this article, I will be using the "ASM with ASMLib I/O" method. Oracle states
(in Metalink Note 275315.1) that "ASMLib was provided to enable ASM I/O to Linux
disks without the limitations of the standard UNIX I/O API". I plan on performing
several tests in the future to identify the performance gains in using ASMLib. Those
performance metrics and testing details are out of scope of this article and therefore
will not be discussed.
We start this section by first downloading the ASMLib drivers (ASMLib Release 2.0) specific to
our Linux kernel. We will then install and configure the ASMLib 2.0 drivers while finishing off the
section with a demonstration of how to create the ASM disks.
If you would like to learn more about Oracle ASMLib 2.0, visit
http://www.oracle.com/technology/tech/linux/asmlib/
In the section
"Create Partitions on the Shared Storage Devices",
we created four Linux partitions to be used for storing Oracle database
files like online redo logs, database files, control files, archived redo log files, and a flash recovery area.
Here is a list of those partitions we created for use by ASM:
The last task in this section is to create the ASM Disks.
To create the ASM disks using the SCSI device names (above), type the following:
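The sketch below prints one createdisk command per partition so the list can be reviewed before anything is run as root. The function name, the VOL1..VOL4 disk names, and the /dev/sdc1 through /dev/sdf1 partition names are all assumptions; substitute the four partitions actually created for ASM.

```shell
# Print one "oracleasm createdisk" command per partition, naming the
# ASM disks VOL1, VOL2, ... in the order the partitions are given.
print_createdisk_commands() {
    n=1
    for part in "$@"; do
        echo "/etc/init.d/oracleasm createdisk VOL${n} ${part}"
        n=$((n + 1))
    done
}

# Hypothetical partition names; substitute the four partitions created earlier.
print_createdisk_commands /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
```

The printed commands are meant to be run as root on one node only, since the disks are shared.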
Download Oracle10g RAC Software
In this section, we will be downloading and extracting the required software from Oracle
to only one of the Linux nodes in the RAC cluster - namely vmlinux1. This is the machine
from which I will be performing all of the Oracle installs. The Oracle installer will copy the required
software packages to all other nodes in the RAC configuration using the remote access method we
set up in the section "Configure RAC Nodes for Remote Access".
Log in to the node from which you will perform all of the Oracle installations
as the "oracle" user account.
In this example, I will be downloading the required Oracle software to vmlinux1 and
saving it to "/u01/app/oracle/orainstall".
Extract the Clusterware package as follows:
Then extract the Oracle10g Database Software:
Finally, extract the Oracle10g Companion CD Software:
Pre-Installation Tasks for Oracle10g Release 2
The next pre-installation step is to run the
Cluster Verification Utility (CVU). CVU is a command-line
utility provided on the Oracle Clusterware installation media. It is
responsible for performing various system checks to assist you with
confirming the Oracle RAC nodes are properly
configured for Oracle Clusterware and Oracle Real Application Clusters
installation. The CVU only needs to be run
from the node you will be performing the Oracle installations from (vmlinux1
in this article).
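The sketch below builds the CVU pre-installation check for Oracle Clusterware. It assumes runcluvfy.sh sits in the cluvfy directory under the extracted Clusterware media (the extraction path is the one used in this article); the function name cvu_precheck_cmd is hypothetical. It only prints the command so it can be reviewed before being run.

```shell
# Build the CVU pre-installation check command for Oracle Clusterware.
# CLUSTERWARE_DIR is where the article extracted the Clusterware media;
# the cluvfy/ subdirectory is an assumption about the media layout.
CLUSTERWARE_DIR=/u01/app/oracle/orainstall/clusterware
NODES=vmlinux1,vmlinux2

cvu_precheck_cmd() {
    echo "${CLUSTERWARE_DIR}/cluvfy/runcluvfy.sh stage -pre crsinst -n ${NODES} -verbose"
}

# Print the command; run it as the oracle user on vmlinux1 once reviewed.
cvu_precheck_cmd
```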
To query package information (gcc and glibc-devel for example),
use the "rpm -q <PackageName> [<PackageName> ...]" command
as follows:
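For checking several packages at once, a small wrapper around "rpm -q" can be handy. This is only a sketch; the function name check_packages is hypothetical, and it reports each package as installed or missing based on the exit status of "rpm -q".

```shell
# Report which of the given packages are installed.
# Returns non-zero if at least one package is missing.
check_packages() {
    missing=0
    for pkg in "$@"; do
        if rpm -q "$pkg" >/dev/null 2>&1; then
            echo "installed: $pkg"
        else
            echo "MISSING: $pkg"
            missing=1
        fi
    done
    return $missing
}
```

For the two packages named above, the call would be "check_packages gcc glibc-devel", run on each Oracle RAC node.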
JDK 1.4.2
If you do have JDK 1.4.2 installed, then you must define the user environment variable
CV_JDKHOME for the path to the JDK. For example, if JDK 1.4.2 is installed in
/usr/local/j2re1.4.2_08, then log in as the user that you plan to use to run
CVU, and enter the following commands:
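A minimal example for a Bourne-compatible shell, using the JDK path from the paragraph above (adjust the path if your JDK lives elsewhere):

```shell
# Point CVU at the JDK named in the text above.
CV_JDKHOME=/usr/local/j2re1.4.2_08
export CV_JDKHOME
echo "CV_JDKHOME=${CV_JDKHOME}"
```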
Install cvuqdisk RPM (RHEL Users Only)
The cvuqdisk RPM can be found on the Oracle Clusterware installation
media in the rpm directory. For the purpose of this article, the Oracle Clusterware
media was extracted to the /u01/app/oracle/orainstall/clusterware
directory on vmlinux1.
Note that before installing the cvuqdisk RPM, we need to set
an environment variable named CVUQDISK_GRP to point to the group that will own
the cvuqdisk utility. The default group is oinstall which is not the group
we are using for the oracle UNIX user account in this article. Since
we are using the dba group, we will need to set CVUQDISK_GRP=dba before
attempting to install the cvuqdisk RPM.
Locate and copy the cvuqdisk RPM from vmlinux1 to vmlinux2
then perform the following steps on both Oracle RAC nodes to install:
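The steps can be sketched as follows, run as root from the directory holding the RPM. The exact RPM file name depends on your media, so the sketch matches it with a wildcard rather than assuming a version; that wildcard is my own convenience and not part of the Oracle instructions.

```shell
# Set the owning group for cvuqdisk before installing the RPM (run as root).
CVUQDISK_GRP=dba
export CVUQDISK_GRP

# Install whichever cvuqdisk RPM is present in the current directory.
# Nothing is installed when no matching file exists.
for rpmfile in cvuqdisk-*.rpm; do
    if [ -f "$rpmfile" ]; then
        rpm -iv "$rpmfile"
    else
        echo "no cvuqdisk RPM found in $(pwd)"
    fi
done
```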
Verify Remote Access / User Equivalence
The first error is with regards to
membership of the user "oracle" in group "oinstall" [as Primary]. For
the purpose of this article, the "oracle" user account will only
be assigned to the "dba" group so this error can be safely ignored.
The second error is with regards
to finding a suitable set of interfaces for VIPs. This is a bug
documented in Metalink Note
338924.1:
As documented in the note, this error can be safely ignored.
The last set of errors that can be ignored deal with specific
RPM package versions that do not exist in RHEL4 Update 4. For
example:
While these specific packages are listed as missing
in the CVU report, please ensure that the correct versions
of the compat-* packages are installed on both of
the Oracle RAC nodes in the cluster. For example, in RHEL4 Update 4,
these would be:
Also note that the
check for shared storage accessibility will fail.
Install Oracle10g Clusterware Software
So, what exactly is the Oracle Clusterware responsible for?
It contains all of the cluster and database configuration metadata along
with several system management features for RAC. It allows the DBA to
register and invite an Oracle instance (or instances) to the cluster. During normal
operation, Oracle Clusterware will send messages (via a special ping operation) to all nodes
configured in the cluster - often called the heartbeat. If the heartbeat fails
for any of the nodes, it checks with the Oracle Clusterware configuration files (on the shared disk)
to distinguish between a real node failure and a network failure.
After installing Oracle Clusterware, the Oracle Universal Installer (OUI)
used to install the Oracle10g database software (next section) will automatically
recognize these nodes. Like the Oracle Clusterware install we will be performing in this section,
the Oracle10g database software installation only needs to be run from one node. The OUI will
copy the software packages to all nodes configured in the RAC cluster.
Verify Server and Enable X Server Access
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell
method, user equivalence will need to be enabled on any new terminal shell session
before attempting to run the OUI. To enable user equivalence for the current
terminal shell session, perform the following steps remembering to enter the pass
phrase for each key that you generated when prompted:
When using the remote shell
method, user equivalence is generally defined in the /etc/hosts.equiv file
for the oracle user account and is enabled on all new terminal shell
sessions:
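Whichever method is in use, a quick way to confirm user equivalence is to run a harmless command on every node and check that no password or pass phrase prompt appears. The sketch below does this over ssh; the function name verify_equivalence is hypothetical, and BatchMode=yes is used so that ssh fails outright instead of prompting.

```shell
# Check that each node answers a non-interactive ssh command.
# BatchMode=yes makes ssh fail instead of prompting for a password.
verify_equivalence() {
    rc=0
    for node in "$@"; do
        if ssh -o BatchMode=yes "$node" "date; hostname"; then
            echo "OK: $node"
        else
            echo "FAILED: $node"
            rc=1
        fi
    done
    return $rc
}
```

With the secure shell method, you would first start an agent and load your keys in the new terminal session ("exec /usr/bin/ssh-agent $SHELL" followed by "/usr/bin/ssh-add"), and then call "verify_equivalence vmlinux1 vmlinux2".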
For my installation, I had only one of the checks fail:
Simply click the check-box for "Checking physical memory requirements..." then click Next to continue.
Specify OCR Location: /u02/oradata/orcl/OCRFile
Voting Disk Location: /u02/oradata/orcl/CSSFile
Navigate to the /u01/app/oracle/oraInventory directory
and run orainstRoot.sh ON BOTH NODES
in the RAC cluster.
Within the same new console window on each node in the RAC cluster, (starting with the
node you are performing the install from), stay logged in as the "root" user account.
Navigate to the /u01/app/oracle/product/crs directory
and locate the root.sh file for both of the Oracle RAC nodes in the cluster -
(starting with the node you are performing the install from).
Run the root.sh file ON BOTH NODES
in the RAC cluster ONE AT A TIME.
You will receive several warnings while running the root.sh script
on both nodes. These warnings can be safely ignored.
The root.sh script may take a while to run. When running root.sh
on the last node, you will receive a critical error
and the output should look like:
This issue is specific to Oracle 10.2.0.1
(noted in Metalink article 338924.1)
and needs to be
resolved before continuing. The easiest workaround is to re-run
vipca (GUI) manually as root from the last node on which the error occurred. Please
keep in mind that vipca is a GUI, so you will need to set your DISPLAY
variable according to your X server:
# $ORA_CRS_HOME/bin/vipca
When the "VIP Configuration Assistant" appears, this is how I
answered the screen prompts:
Welcome: Click Next
Node Name: vmlinux2
Summary: Click Finish
Go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.
Check cluster nodes
Install Oracle10g Database Software
Like the Oracle Clusterware install (previous section),
the Oracle10g database software installation only needs to be run from one node. The OUI will
copy the software packages to all nodes configured in the RAC cluster.
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell
method, user equivalence will need to be enabled on any new terminal shell session
before attempting to run the OUI. To enable user equivalence for the current
terminal shell session, perform the following steps remembering to enter the pass
phrase for each key that you generated when prompted:
When using the remote shell
method, user equivalence is generally defined in the /etc/hosts.equiv file
for the oracle user account and is enabled on all new terminal shell
sessions:
For my installation, I had one check fail:
"Checking physical memory requirements...".
Click the check-box for "Checking physical memory requirements..." to acknowledge it.
Click Next to continue.
Remember that we will create the clustered database
as a separate step using dbca.
First, open a new console window
on the node you are installing the Oracle10g database
software from as the root user account. For me, this was "vmlinux1".
Navigate to the /u01/app/oracle/product/10.2.0/db_1 directory
and run root.sh.
After running the root.sh script on all nodes in the RAC cluster,
go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.
Install Oracle10g Companion CD Software
Please keep in mind that this is an optional step. For the purpose of this article,
my testing database will often make use of the
Java Virtual Machine (Java VM) and Oracle interMedia and therefore will require
the installation of the Oracle Database 10g Companion CD. The type of
installation to perform will be the Oracle Database 10g Products installation type.
This installation type includes the Natively Compiled Java Libraries (NCOMP) files to
improve Java performance. If you do not install the NCOMP files,
the "ORA-29558:JAccelerator (NCOMP) not installed" error occurs when a database
that uses Java VM is upgraded to the patch release.
Like the Oracle Clusterware and Database installs (previous sections),
the Oracle10g companion software installation only needs to be run from one node. The OUI will
copy the software packages to all nodes configured in the RAC cluster.
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell
method, user equivalence will need to be enabled on any new terminal shell session
before attempting to run the OUI. To enable user equivalence for the current
terminal shell session, perform the following steps remembering to enter the pass
phrase for each key that you generated when prompted:
When using the remote shell
method, user equivalence is generally defined in the /etc/hosts.equiv file
for the oracle user account and is enabled on all new terminal shell
sessions:
For my installation, I had only one of the checks fail:
Simply click the check-box for "Checking physical memory requirements..." then click Next to continue.
Create TNS Listener Process
The process of creating the TNS listener only needs to be performed from
one of the nodes in the RAC cluster. All changes will be made and replicated to both Oracle RAC
nodes in the cluster. On one of the nodes (I will be using vmlinux1)
bring up the Network Configuration Assistant (NETCA) and run through the
process of creating a new TNS listener process and to also configure the node for
local access.
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell
method, user equivalence will need to be enabled on any new terminal shell session
before attempting to run the OUI. To enable user equivalence for the current
terminal shell session, perform the following steps remembering to enter the pass
phrase for each key that you generated when prompted:
When using the remote shell
method, user equivalence is generally defined in the /etc/hosts.equiv file
for the oracle user account and is enabled on all new terminal shell
sessions:
The following screenshots walk you through the process of creating a new
Oracle listener for our RAC environment.
Create the Oracle Cluster Database
Before executing the Database Configuration Assistant, make sure that
$ORACLE_HOME and $PATH are set appropriately for the
$ORACLE_BASE/product/10.2.0/db_1 environment.
You should also verify that all services we have installed up to this point
(Oracle TNS listener, Oracle Clusterware processes, etc.) are running
before attempting to start the clustered database creation process.
Login as the oracle User Account and Set DISPLAY (if necessary)
Verify Remote Access / User Equivalence
When using the secure shell
method, user equivalence will need to be enabled on any new terminal shell session
before attempting to run the OUI. To enable user equivalence for the current
terminal shell session, perform the following steps remembering to enter the pass
phrase for each key that you generated when prompted:
When using the remote shell
method, user equivalence is generally defined in the /etc/hosts.equiv file
for the oracle user account and is enabled on all new terminal shell
sessions:
You will then be prompted with a dialog box asking if you want to create and start the
ASM instance. Select the OK button to acknowledge this dialog.
The DBCA will now create and start the ASM instance on all nodes in the RAC cluster.
If the volumes we created earlier in this article do not show up in the
"Select Member Disks" window:
(ORCL:VOL1,
ORCL:VOL2,
ORCL:VOL3, and ORCL:VOL4)
then click on the "Change Disk Discovery Path" button and input "ORCL:VOL*".
For the first "Disk Group Name", I used the string ORCL_DATA1.
Select the first two ASM volumes
(ORCL:VOL1 and ORCL:VOL2)
in the "Select Member Disks" window.
Keep the "Redundancy" setting to Normal.
After verifying all values in this window are correct, click the OK button.
This will present the "ASM Disk Group Creation" dialog. When the ASM Disk Group Creation
process is finished, you will be returned to the "ASM Disk Groups" window.
Click the Create New button again.
For the second "Disk Group Name", I used the string FLASH_RECOVERY_AREA.
Select the last two ASM volumes
(ORCL:VOL3 and ORCL:VOL4) in the "Select Member Disks" window.
Keep the "Redundancy" setting to Normal.
After verifying all values in this window are correct, click the OK button.
This will present the "ASM Disk Group Creation" dialog.
When the ASM Disk Group Creation process is finished, you will be returned to
the "ASM Disk Groups" window with two disk groups created and selected.
Select only one of the disk groups by using the checkbox next to the newly created
Disk Group Name ORCL_DATA1 (ensure that the
disk group for FLASH_RECOVERY_AREA is not selected) and click Next to continue.
For the Flash Recovery Area, click the [Browse]
button and select the disk group name +FLASH_RECOVERY_AREA.
My disk group has
a size of about 12GB. I used a Flash Recovery Area Size
of 12284 MB.
Click OK on the "Summary" screen.
When exiting the DBCA, another dialog will come up indicating that
it is starting all Oracle instances and HA service "orcltest". This
may take several minutes to complete. When finished, all windows and
dialog boxes will disappear.
When the Oracle Database Configuration Assistant has completed, you will
have a fully functional Oracle RAC cluster running!
Use the following to verify the orcltest service was
successfully added:
If the only service defined was for orcl.idevelopment.info, then
you will need to manually add the service to both instances:
Verify TNS Networking Files
For clarity, I included a copy of the listener.ora file from my
node vmlinux1:
You can include any of these entries on other client machines
that need access to the clustered database.
Then try to connect to the clustered database using all available service names defined
in the tnsnames.ora file:
Create / Alter Tablespaces
This section provides several optional SQL commands I used to modify and
create all tablespaces for my testing database. Please keep in
mind that the database file names (OMF files) I used in this example may
differ from what Oracle creates for your environment. The following query
can be used to determine the file names for your environment:
Here is a snapshot of the tablespaces I have defined for my test database environment:
Verify the RAC Cluster and Database Configuration
Starting / Stopping the Cluster
With all of the work we have done up to this point, a popular question
might be, "How do we start and stop services?". If you have followed
the instructions in this article, all services should start automatically
on each reboot of the Linux nodes. This would include Oracle Clusterware, all Oracle instances,
Enterprise Manager Database Console, etc.
There are times, however, when you might want to shutdown a node and manually
start it back up. Or you may find that Enterprise Manager is not running
and need to start it. This section provides the commands (using SRVCTL)
responsible for starting and stopping the cluster environment.
Ensure that you are logged in as the "oracle" UNIX user. I will
be running all of the commands in this section from vmlinux1:
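The sketch below prints a typical stop and start sequence for one node so the order can be reviewed before anything is run. The database name orcl and the instance name orcl1 are assumptions for illustration, as are the function names; the srvctl and emctl subcommands are the standard 10g ones. Stopping goes from the Database Console down to the node applications, and starting is the reverse.

```shell
# Print the stop sequence for one node:
# console, instance, ASM, then node applications.
# DB_NAME and the instance name are assumptions for illustration.
DB_NAME=orcl

stop_node_cmds() {
    node=$1
    inst=$2
    echo "emctl stop dbconsole"
    echo "srvctl stop instance -d ${DB_NAME} -i ${inst}"
    echo "srvctl stop asm -n ${node}"
    echo "srvctl stop nodeapps -n ${node}"
}

# Starting is the reverse order.
start_node_cmds() {
    node=$1
    inst=$2
    echo "srvctl start nodeapps -n ${node}"
    echo "srvctl start asm -n ${node}"
    echo "srvctl start instance -d ${DB_NAME} -i ${inst}"
    echo "emctl start dbconsole"
}

stop_node_cmds vmlinux1 orcl1
start_node_cmds vmlinux1 orcl1
```

The emctl commands expect ORACLE_SID to be set to the local instance name in the current shell.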
Test the New Oracle RAC Configuration
Most of the queries in this section make use of gv$ views. These are
known as Global versions of the same v$ views. In most cases,
the global views (gv$ views) will query the standard v$
views for each instance and then UNION the result sets.
Before attempting to configure the on-boot properties:
After looking through the trace files for OCFS2, it was apparent that access to the
voting disk was too slow (exceeding the O2CB heartbeat threshold) and causing the
Oracle Clusterware software (and the node) to crash. On the console would be a message
similar to the following:
The solution I used was to increase the O2CB heartbeat threshold from its
default value of 7, to 601. Some setups may require an even higher setting. This is
a configurable parameter that is used to compute the time it takes for a node
to "fence" itself. During the installation and configuration of OCFS2, we adjusted
this value in the section
"Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold".
If you encounter a kernel panic from OCFS2 and need to increase the heartbeat threshold, use
the same procedures described in the section
"Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold".
If you are using an earlier version of OCFS2 tools (prior to ocfs2-tools release 1.2.2-1), the following
describes how to manually adjust the O2CB heartbeat threshold.
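To put the threshold values in perspective, here is a small sketch that converts a threshold into the time a node waits before fencing itself. It assumes the relation given in the OCFS2 documentation, fence time in seconds = (threshold - 1) * 2; the function name fence_time_secs is hypothetical.

```shell
# Fence time in seconds for a given O2CB heartbeat threshold,
# assuming the relation (threshold - 1) * 2.
fence_time_secs() {
    echo $(( ($1 - 1) * 2 ))
}

fence_time_secs 7     # default threshold
fence_time_secs 601   # value used in this article
```

Under that assumption, the default of 7 gives 12 seconds, while 601 gives 1200 seconds (20 minutes), which is generous enough for slow virtualized disks.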
First, let's see how to determine what the O2CB heartbeat threshold is currently set to.
This can be done by querying the /proc file system as follows:
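A guarded version of that query is sketched below. The /proc path is the one I understand the OCFS2 1.2 node manager to expose; treat it as an assumption if you are on a different release. The file only exists while the O2CB modules are loaded.

```shell
# Path exposed by the OCFS2 node manager (assumed for OCFS2 1.2).
HB_FILE=/proc/fs/ocfs2_nodemanager/hb_dead_threshold

if [ -r "$HB_FILE" ]; then
    cat "$HB_FILE"
else
    echo "O2CB not loaded: $HB_FILE not found"
fi
```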
Let's see now how to manually increase the O2CB heartbeat threshold from
7 to 601. This task will need to be performed on all Oracle RAC nodes in the cluster.
We first need to modify the file /etc/sysconfig/o2cb
and set O2CB_HEARTBEAT_THRESHOLD to 601:
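If you prefer to script the change rather than edit the file by hand, something like the following would do it. This is a sketch only; the function name set_hb_threshold is hypothetical. It rewrites an existing O2CB_HEARTBEAT_THRESHOLD line, or appends one if the key is not present, and leaves every other line alone.

```shell
# Set O2CB_HEARTBEAT_THRESHOLD in an o2cb sysconfig file.
# Replaces an existing assignment, or appends one when the key is absent.
set_hb_threshold() {
    file=$1
    value=$2
    if grep -q '^O2CB_HEARTBEAT_THRESHOLD=' "$file"; then
        # Write to a temporary copy, then copy the contents back so the
        # original file keeps its ownership and permissions.
        sed "s/^O2CB_HEARTBEAT_THRESHOLD=.*/O2CB_HEARTBEAT_THRESHOLD=${value}/" "$file" > "${file}.new" &&
            cat "${file}.new" > "$file" &&
            rm -f "${file}.new"
    else
        echo "O2CB_HEARTBEAT_THRESHOLD=${value}" >> "$file"
    fi
}
```

As root, and after taking a backup copy of the file, the call would be "set_hb_threshold /etc/sysconfig/o2cb 601" on each Oracle RAC node.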
After modifying the file /etc/sysconfig/o2cb, we need to
alter the o2cb configuration. Again, this should be performed
on all Oracle RAC nodes in the cluster.
About the Author
All articles, scripts and material located at the Internet address of http://www.idevelopment.info is the copyright of Jeffrey M. Hunter
and is protected under copyright laws of the United States. This document may not be hosted on any other site without my express,
prior, written permission. Application to host any of the material elsewhere can be made by contacting me at jhunter@idevelopment.info.
I have made every effort and taken great care in making sure that the material included on my web site is technically accurate,
but I disclaim any and all responsibility for any loss, damage or destruction of data or any other property which may arise from
relying on it. I will in no case be liable for any monetary damages arising from such loss, damage or destruction.
One of the most efficient ways to become familiar with Oracle10g Real
Application Cluster (RAC) technology is to have access to an actual
Oracle10g RAC cluster. In learning this new technology, you will soon start
to realize the benefits Oracle10g RAC has to offer like fault tolerance,
new levels of security, load balancing, and the ease of upgrading capacity. The problem though is the
price of the hardware required for a typical production RAC configuration.
A small two node cluster, for example, could run anywhere from $10,000
to well over $20,000. This would not even include the heart of a production
RAC environment - the shared storage. In most cases, this would be a Storage
Area Network (SAN), which generally starts at $8,000.
Please keep in mind that we will not be configuring the Oracle RAC environment to use the operating system
on the laptop directly (the host environment), but
rather utilizing two virtual machines that will be hosted on this laptop. The virtual machines
will be created using a product named VMware Workstation (release 5.5.3 for this article) and will host two
Red Hat Enterprise Linux operating environments (actually CentOS 4.4) that will be used for our RAC configuration.
It is imperative to note that this configuration should never be run in a production environment
and that it is not supported by Oracle or any other vendor. In a production environment,
fiber channel is the technology of choice, since it is the high-speed serial-transfer
interface that can connect systems and storage devices in either point-to-point or switched topologies.
It is also important to mention the significance of having all Oracle instances
hosted on multiple physical machines. Having all Oracle instances hosted on multiple machines
provides for failover. Access to your data will still be possible if one of the nodes in
the cluster fails. All surviving nodes will continue to service all client requests. The next
benefit is speed. Performance of applications may be enhanced by the ability to break
jobs up into separate tasks that can be serviced by multiple Oracle instances running on
different nodes. The final benefit is scalability. By adding more nodes to the Oracle RAC cluster,
it is possible to complete more jobs that may have to be run simultaneously on separate servers.
Although I wrote this article and performed all testing using CentOS Enterprise Linux 4.4,
these instructions should work with minor modifications for
Windows 2000, Windows 2003, Solaris 9 (x86 Platform Edition), and Solaris 10 (x86 Platform Edition)
with VMware Workstation 5.
As we start to go into the details of the installation, it should be
noted that most of the tasks within this
document will need to be performed on both servers. I will indicate at the beginning
of each section whether the task(s) should be performed on both
nodes.
VMware Workstation 5 can be obtained directly from the VMware website -
http://www.vmware.com/download/ws/.
A 30 day evaluation copy is available for download directly from the
website. If you decide to purchase VMware Workstation, you can purchase it
directly from VMware for US$189.
For what this product can do, it is well worth the price.
Before diving into the instructions for creating the two new virtual machines,
let's first talk about the host machine and operating system that I have
VMware Workstation installed on. In the table below is the configuration I
will be using for the new virtual machines we will be creating in this
article to support Oracle10g RAC.
Host Machine
Host Machine Name: melody.idevelopment.info - (192.168.1.106)
Host Operating Environment: Windows XP Professional
VMware Version: VMware Workstation - Release 5.5.3 (Build 34685)
Host Machine: Dell Inspiron 8600 Laptop
Memory: 2GB Installed (Each virtual machine will take 748MB from this 2GB)
Internal Hard Drive: 60GB
External Hard Drive: 300GB
Processor: 2.0 GHz
File System: NTFS
Guest Machine
Virtual Machine Configuration #1
Guest Operating Environment: CentOS Enterprise Linux 4.4
Guest Machine Name: vmlinux1
Public Name/IP - (eth0): vmlinux1.idevelopment.info - (192.168.1.111)
Interconnect Name/IP - (eth1): vmlinux1-priv.idevelopment.info - (192.168.2.111)
Memory: 748MB
Hard Drive: 25GB
Virtual Machine Location: M:\My Virtual Machines\Workstation 5.5.3\vmlinux1
Guest Machine
Virtual Machine Configuration #2
Guest Operating Environment: CentOS Enterprise Linux 4.4
Guest Machine Name: vmlinux2
Public Name/IP - (eth0): vmlinux2.idevelopment.info - (192.168.1.112)
Interconnect Name/IP - (eth1): vmlinux2-priv.idevelopment.info - (192.168.2.112)
Memory: 748MB
Hard Drive: 25GB
Virtual Machine Location: M:\My Virtual Machines\Workstation 5.5.3\vmlinux2
Guest Machine
Virtual Storage for Database / Clusterware Files
OCFS2 - Oracle Clusterware Files (Oracle Cluster Registry (OCR) File, Voting Disk, Shared SPFILE for ASM instances)
  Virtual Disk: M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk1.vmdk
  Size: 2GB
ASM Volume - Database Files - ORCL:VOL1 (+ORCL_DATA1)
  Virtual Disk: M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk2.vmdk
  Size: 12GB
ASM Volume - Database Files - ORCL:VOL2 (+ORCL_DATA1)
  Virtual Disk: M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk3.vmdk
  Size: 12GB
ASM Volume - Flash Recovery Area - ORCL:VOL3 (+FLASH_RECOVERY_AREA)
  Virtual Disk: M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk4.vmdk
  Size: 12GB
ASM Volume - Flash Recovery Area - ORCL:VOL4 (+FLASH_RECOVERY_AREA)
  Virtual Disk: M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk5.vmdk
  Size: 12GB
In addition to the two IP addresses (virtualized by VMware) that will be configured
for each virtual machine, Oracle10g RAC will configure
a virtual IP address for each of the virtual machines during
the Oracle10g Clusterware installation process. You will not need
to create a third network interface for the virtual IP address as Oracle
will bind it to the public network interface (eth0) for each
virtual machine.
Oracle10g Public Virtual IP (VIP) addresses for eth0
VMware Virtual Machine: vmlinux1.idevelopment.info
  Oracle10g Public Virtual Machine Name: vmlinux1-vip.idevelopment.info
  Oracle10g Public Virtual IP Address: 192.168.1.211
VMware Virtual Machine: vmlinux2.idevelopment.info
  Oracle10g Public Virtual Machine Name: vmlinux2-vip.idevelopment.info
  Oracle10g Public Virtual IP Address: 192.168.1.212
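Pulling the public, interconnect, and virtual IP addresses from the tables above together, the name resolution for both nodes could be sketched as the /etc/hosts style entries printed below. The short aliases in the last column (vmlinux1, vmlinux1-priv, and so on) and the function name print_rac_hosts are my own additions for illustration.

```shell
# Print /etc/hosts style entries for both nodes, using the addresses
# from the configuration tables in this article.
# The short aliases in the last column are assumptions.
print_rac_hosts() {
    cat <<'EOF'
# Public network (eth0)
192.168.1.111  vmlinux1.idevelopment.info       vmlinux1
192.168.1.112  vmlinux2.idevelopment.info       vmlinux2
# Private interconnect (eth1)
192.168.2.111  vmlinux1-priv.idevelopment.info  vmlinux1-priv
192.168.2.112  vmlinux2-priv.idevelopment.info  vmlinux2-priv
# Public virtual IPs (bound to eth0 by Oracle)
192.168.1.211  vmlinux1-vip.idevelopment.info   vmlinux1-vip
192.168.1.212  vmlinux2-vip.idevelopment.info   vmlinux2-vip
EOF
}

print_rac_hosts
```

Note that the virtual IP addresses sit on the same subnet as the public interface, which is what allows Oracle to bind them to eth0.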
After successfully installing the VMware Workstation software, you should now create
two virtual machines to host CentOS Enterprise Linux.
The following table describes the values I provided in the
"New Virtual Machine Wizard" in order to create the new
CentOS Enterprise Linux virtual machine(s). To begin
the "New Virtual Machine Wizard", start the VMware Workstation console
and choose "[File] -> [New] -> [Virtual Machine]".
If you are looking for an article that provides step-by-step details (including screenshots)
for creating a new CentOS Enterprise Linux virtual
machine, visit
Creating a New Virtual Machine - (CentOS Enterprise Linux 4.2).
Guest Machine
Virtual Machine #1
Screen
Value
Welcome
Click [Next].
Select Appropriate Configuration
Select [Custom].
Select a Virtual Machine Format
Select [New - Workstation 5].
Select a Guest Operating System
Select [Linux] / [Red Hat Enterprise Linux 4].
Name of Virtual Machine
Set the virtual machine to [vmlinux1]. Also note that I am
creating the new virtual machine on my external hard drive
using the directory
"M:\My Virtual Machines\Workstation 5.5.3\vmlinux1".
Memory for the Virtual Machine
Oracle10g requires a minimum of 512MB of RAM, although more memory
is always better for performance. In my case, I do have the memory to spare
on my laptop (2GB)
and will be giving each virtual machine 748MB of memory.
Network Type
Select [Use bridged networking].
Select I/O Adaptor Types
Always select the default option chosen by the VMware installer.
For my installation, this was [ATAPI] / [LSI Logic].
Select a Disk
Select [Create a new virtual disk].
Select a Disk Type
Always select the default option chosen by the VMware installer.
For my installation, this was [SCSI].
Specify Disk Capacity
Use a size of 25GB. I also checked the box to [Allocate all disk space now].
Specify Disk File
You can use any filename here. I typically use the name Disk0.vmdk as in
"M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk0.vmdk".
Guest Machine
Virtual Machine #2
To configure the second virtual machine, follow the same steps provided
in creating the first virtual machine (above) while substituting
the machine name (and appropriate files / directories) from vmlinux1 to vmlinux2.
Since the new virtual machines will only be used to host Oracle,
there are several devices that can be safely removed from the virtual machine.
Having the virtual machine virtualize these unnecessary hardware components
is a waste of resources that would be better spent running Oracle.
Figure 1: VMware Virtual Machine Settings
When configuring an Oracle RAC environment, each node should include one network
interface for the public network and another network interface
for private use by Oracle RAC (the interconnect). As mentioned already
in the
Host Machine and Virtual Machine Configuration Overview section,
the public network (eth0) will be 192.168.1.0 while the private network (eth1 / interconnect) will be 192.168.2.0.
When creating both virtual machines, the VMware wizard will create eth0 - the public network.
Add Hardware Wizard - (Network Adapter)
Virtual Machine #1
Screen
Value
Welcome
Click [Next].
Hardware Type
Select [Ethernet Adapter] from the list of hardware types.
Network Type
Select [Bridged: Connected directly to the physical network].
Verify that the option for "Connect at power on" is checked and then click
the [Finish] button to complete the wizard.
Add Hardware Wizard - (Network Adapter)
Virtual Machine #2
Use the same steps above to create a second network adapter for the second virtual machine.
Now that we have our two new virtual machines, the next step is to
install CentOS Enterprise Linux to each of them. The CentOS Enterprise Linux project
takes the Red Hat Enterprise Linux 4 source RPMs, and compiles
them into a free clone of the Red Hat Enterprise Server 4 product. This provides
a free and stable version of the Red Hat Enterprise Linux 4 (AS/ES) operating environment that
I can now use for testing different Oracle configurations. CentOS Enterprise Linux
comes on four CDs.
If you are downloading the above ISO files to a MS Windows machine,
there are many options for burning these images (ISO files) to a CD. You
may already be familiar with and have the proper software
to burn images to CD. If you are not familiar with this process
and do not have the required software to burn images to CD, here are just
two (of many) software packages that can be used:
in the toolbar.
If you are looking for an article that provides step-by-step details (including screenshots)
for creating a new CentOS Enterprise Linux virtual
machine, visit
Creating a New Virtual Machine - (CentOS Enterprise Linux 4.2).
Guest Machine
Virtual Machine #1
Screen
Value
Start Installation
Insert Disk #1 of CentOS Enterprise Linux into the physical CD-ROM. If any
autostart windows show up on your Windows workstation/laptop, close them out.
Now, power up the first virtual machine (vmlinux1).
The CentOS Enterprise Linux installation should start
within the virtual machine.
Boot Screen
The first screen is the boot screen. At this point, you can add any type
of boot options, but in most cases, all you need to do is press [Enter]
to continue.
Test CD Media
You can choose to verify the CD media in this screen. I know that the ISOs
that I burnt to CD were OK, so I typically choose to
[Skip] the media check.
Welcome
After the installer starts the X Server, you should have the Welcome screen.
Click [Next] to continue.
Language Selection
The installer should choose the correct language by default.
Keyboard
The installer should choose the correct keyboard by default.
Installation Type
Select [Custom].
Disk Partitioning Setup
Select [Automatic Partitioning]. When
prompted with a dialog asking,
"Would you like to Initialize this drive, erasing ALL DATA",
answer [Yes].
Automatic Partitioning
Select [Remove all partitions on this system].
Answer [Yes] when prompted with a warning dialog asking to confirm
the delete operation.
Partitioning
For most automatic layouts, the defaults should be fine. For example,
the space allocated for /boot is always OK at 100MB.
The installer will make the Swap space equal to twice the amount of
RAM configured for this virtual machine. For my example, this would
be 748MB x 2 = 1,496MB. This is more than enough for the Oracle install.
The remainder is left for the root file system. So for me, this is a
nice layout and I will accept the defaults.
Boot Loader Configuration
Keep the default option to use the GRUB boot loader.
Network Configuration
I made sure to install (or better yet, virtualize)
both NIC interfaces (cards) in each of the
Linux machines before starting the operating system
installation.
This screen should have successfully detected each of the
network devices.
eth0:
- Check OFF the option to [Configure using DHCP]
- Leave the [Activate on boot] checked ON
- IP Address: 192.168.1.111
- Netmask: 255.255.255.0
eth1:
- Check OFF the option to [Configure using DHCP]
- Leave the [Activate on boot] checked ON
- IP Address: 192.168.2.111
- Netmask: 255.255.255.0
Firewall Configuration
Make sure to select [No firewall]. You may be prompted
with a warning dialog about not setting the firewall.
If this occurs, simply hit [Proceed] to continue.
Additional Language Support
Nothing should need to be changed here.
Time Zone Selection
Select your time zone.
Set Root Password
Set your root password.
Package Group Selection
NOTE: With some RHEL 4 distributions,
you will not get the [Package Group Selection]
screen by default. There, you are asked to simply [Install default
software packages] or [Customize software packages to be installed].
Select the option to [Customize software packages to be installed].
This will then bring up the
[Package Group Selection] screen.
About to Install
We are now ready to start the installation process. Click the [Next]
button to start the installation.
Installation Complete
At this point, the installation is complete. The CD will be ejected from the CD-ROM
and you are asked to [Exit] and reboot the system.
Post Installation Wizard
After the virtual machine is rebooted, you will be presented with a post installation
wizard that allows you to make final configuration settings. Nothing really exciting
here other than setting the Date/Time and Display settings.
Guest Machine
Virtual Machine #2
To configure the second virtual machine, follow the same steps provided
to install CentOS Enterprise Linux (above) while substituting the following:
Although this is an optional step, you really should install
the VMware Tools for each new virtual machine.
On a Linux guest, you can install VMware Tools within X or from the
command line. Both options will be described in this section.
To install VMware Tools from X with the RPM installer:
Installing VMware Tools from the Command Line with the Tar Installer
# vmware-config-tools.pl
Respond to any questions the installer displays on the screen. At the end of
the configuration process, the program asks for the new screen resolution. You should pick
the same screen resolution you selected during the CentOS Enterprise Linux
install. I used option 2 ("800x600").
# init 6
The first steps are performed on the host, within Workstation menus:
# cd /tmp
# mount -r /dev/cdrom /mnt
# tar -zxf /mnt/VMwareTools-5.5.3-34685.tar.gz
# cd /tmp/vmware-tools-distrib
# umount /mnt
# cd /tmp/vmware-tools-distrib
# ./vmware-install.pl
Respond to the configuration questions on the screen.
When the installation process begins, you can simply accept
the default values for the first nine questions.
Press [Enter] to accept the default value.
At the end of
the configuration process, the program asks for the new screen resolution. You should pick
the same screen resolution you selected during the CentOS Enterprise Linux
install. I used option 2 ("800x600").
# init 6
Perform the following network configuration on all nodes in the cluster!
Although we configured several of the network
settings during the installation of CentOS Enterprise Linux, it is important
to not skip this section as it contains critical
steps that are required for a successful RAC environment.
Introduction to Network Settings
During the Linux O/S install we already configured the IP address and
host name for each of the nodes.
We now need to configure
the /etc/hosts file as well as adjusting several of the
network settings for the interconnect. I also include instructions
for enabling Telnet and FTP services.
I even provide instructions
on how to enable root logins for both Telnet and FTP. This is an
optional step. Enabling root logins for Telnet and FTP should never be
configured for a production environment!
Enabling Telnet and FTP Services
Linux is configured to run the Telnet and FTP server, but by default,
these services are not enabled.
To enable the telnet service, login to the
server as the root user account and run the following commands:
# chkconfig telnet on
# service xinetd reload
Reloading configuration: [ OK ]
The FTP server is not managed through xinetd. It has been replaced with vsftpd
and can be started from /etc/init.d/vsftpd as in the following:
# /etc/init.d/vsftpd start
Starting vsftpd for vsftpd: [ OK ]
If you want the vsftpd service to start and stop when recycling (rebooting) the machine,
you can create the following symbolic links:
# ln -s /etc/init.d/vsftpd /etc/rc3.d/S56vsftpd
# ln -s /etc/init.d/vsftpd /etc/rc4.d/S56vsftpd
# ln -s /etc/init.d/vsftpd /etc/rc5.d/S56vsftpd
Allowing Root Logins to Telnet and FTP Services
Now before getting into the details of how to configure Red Hat Linux for
root logins, keep in mind that this is VERY BAD security. Make sure that you
NEVER configure your production servers for this type of login.
To allow root logins over Telnet, edit the file /etc/securetty and add the
following to the end of the file:
pts/0
pts/1
pts/2
pts/3
pts/4
pts/5
pts/6
pts/7
pts/8
pts/9
This will allow up to 10 telnet sessions to the server
as root.
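The ten entries above can also be appended with a short loop. This is only a sketch: add_pts_entries is a hypothetical helper name, and it takes the target file as an argument so that it can be tried on a scratch copy before it is pointed at /etc/securetty.

```shell
# Hypothetical helper: append pts/0 .. pts/9 to a securetty-style file,
# skipping any entry that is already present.
add_pts_entries() {
    file=$1
    i=0
    while [ "$i" -le 9 ]; do
        # -x matches the whole line, -q suppresses output
        grep -qx "pts/$i" "$file" 2>/dev/null || echo "pts/$i" >> "$file"
        i=$((i + 1))
    done
}

# Try it on a scratch file first; running it twice adds nothing new.
scratch=$(mktemp)
add_pts_entries "$scratch"
add_pts_entries "$scratch"
pts_count=$(grep -c '^pts/' "$scratch")
echo "$pts_count entries written"
rm -f "$scratch"
```

Because existing entries are skipped, the helper can be re-run safely on a file that already lists some of the pseudo terminals.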
To allow root logins over FTP, edit the files /etc/vsftpd.ftpusers and /etc/vsftpd.user_list
and remove the 'root' line from each file.
Configuring Public and Private Network
In our two node example, we need to configure the network on both nodes
for access to the public network as well as their private interconnect.
# su -
# /usr/bin/system-config-network &
Do not use DHCP naming for the public IP address or the interconnects - we need static IP addresses!
Configure both NIC devices as well as the /etc/hosts file. Both of these tasks can
be completed using the Network Configuration GUI. Notice that the /etc/hosts
entries are the same for both nodes.
Oracle RAC Node 1 - (vmlinux1)
Device  IP Address     Subnet         Gateway      Purpose
eth0    192.168.1.111  255.255.255.0  192.168.1.1  Connects vmlinux1 to the public network
eth1    192.168.2.111  255.255.255.0  -            Connects vmlinux1 (interconnect) to vmlinux2 (vmlinux2-priv)
/etc/hosts
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.111 vmlinux1
192.168.1.112 vmlinux2
# Private Interconnect - (eth1)
192.168.2.111 vmlinux1-priv
192.168.2.112 vmlinux2-priv
# Public Virtual IP (VIP) addresses for - (eth0)
192.168.1.211 vmlinux1-vip
192.168.1.212 vmlinux2-vip
Oracle RAC Node 2 - (vmlinux2)
Device  IP Address     Subnet         Gateway      Purpose
eth0    192.168.1.112  255.255.255.0  192.168.1.1  Connects vmlinux2 to the public network
eth1    192.168.2.112  255.255.255.0  -            Connects vmlinux2 (interconnect) to vmlinux1 (vmlinux1-priv)
/etc/hosts
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.111 vmlinux1
192.168.1.112 vmlinux2
# Private Interconnect - (eth1)
192.168.2.111 vmlinux1-priv
192.168.2.112 vmlinux2-priv
# Public Virtual IP (VIP) addresses for - (eth0)
192.168.1.211 vmlinux1-vip
192.168.1.212 vmlinux2-vip
Note that the virtual IP addresses only need to be defined in the /etc/hosts file (or your DNS)
for both Oracle RAC
nodes. The public virtual IP addresses will be configured automatically by Oracle when you run the
Oracle Universal Installer, which starts Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA).
All virtual IP addresses will be activated when the srvctl start nodeapps -n <node_name> command
is run. Although I am getting ahead of myself, this is the Host Name/IP Address that will be
configured in the client(s) tnsnames.ora file for each Oracle Net Service Name.
All of this will be explained much later in this article!
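Since both nodes must carry the same six names, a quick check of the hosts file can catch a missed entry before the installer does. The following is a minimal sketch, not part of the original procedure; check_rac_hosts is a hypothetical helper, and the host names are the ones used in this article.

```shell
# Hypothetical helper: report any expected RAC host name that has no
# entry in a hosts-format file. Prints the missing names and returns 1
# if at least one is missing.
check_rac_hosts() {
    hosts_file=$1
    missing=0
    for name in vmlinux1 vmlinux2 vmlinux1-priv vmlinux2-priv \
                vmlinux1-vip vmlinux2-vip; do
        # Skip comment lines, then look for the name in columns 2..NF
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit (found ? 0 : 1) }' "$hosts_file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it on each node as check_rac_hosts /etc/hosts; no output and a zero exit status mean all six names are present.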
In the screen shots below, only node 1 (vmlinux1) is shown. Be sure to apply
the same network settings on both nodes!
Figure 2: Network Configuration Screen - Node 1 (vmlinux1)
Figure 3: Ethernet Device Screen - eth0 (vmlinux1)
Figure 4: Ethernet Device Screen - eth1 (vmlinux1)
Figure 5: Network Configuration Screen - /etc/hosts (vmlinux1)
Once the network is configured, you can use the ifconfig
command to verify everything is working. The following example
is from vmlinux1:
$ /sbin/ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:0C:29:07:E6:0B
inet addr:192.168.1.111 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe07:e60b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:170 errors:0 dropped:0 overruns:0 frame:0
TX packets:146 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14360 (14.0 KiB) TX bytes:11875 (11.5 KiB)
Interrupt:185 Base address:0x1400
eth1 Link encap:Ethernet HWaddr 00:0C:29:07:E6:15
inet addr:192.168.2.111 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe07:e615/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:197 errors:0 dropped:0 overruns:0 frame:0
TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14618 (14.2 KiB) TX bytes:1386 (1.3 KiB)
Interrupt:169 Base address:0x1480
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1962 errors:0 dropped:0 overruns:0 frame:0
TX packets:1962 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3226318 (3.0 MiB) TX bytes:3226318 (3.0 MiB)
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
About Virtual IP
Why do we have a Virtual IP (VIP) in 10g?
Why does it just return a dead connection when its primary node fails?
Make sure RAC node name is not listed in loopback address
Ensure that the node names (vmlinux1 or vmlinux2) are
not included for the loopback address in the /etc/hosts file.
If the machine name is listed in the loopback address entry as below:
127.0.0.1 vmlinux1 localhost.localdomain localhost
it will need to be removed as shown below:
127.0.0.1 localhost.localdomain localhost
If the RAC node name is listed for the loopback address, you will
receive the following error during the RAC installation:
ORA-00603: ORACLE server session terminated by fatal error
or
ORA-29702: error occurred in Cluster Group Service operation
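The loopback check described above is easy to automate. This is a small sketch under the assumption that the hosts file uses the usual whitespace-separated layout; node_on_loopback is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) if the given node name appears on
# the 127.0.0.1 line of a hosts-format file, which is the misconfiguration
# described above.
node_on_loopback() {
    hosts_file=$1
    node=$2
    awk -v n="$node" '
        $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == n) bad = 1 }
        END { exit (bad ? 0 : 1) }' "$hosts_file"
}
```

For example, node_on_loopback /etc/hosts vmlinux1 && echo "remove vmlinux1 from the loopback entry" prints the warning only when the entry needs to be fixed.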
Adjusting Network Settings
Starting with Oracle 9.2.0.1, Oracle uses UDP as the default protocol
on Linux for inter-process communication (IPC), such as Cache Fusion
and Cluster Manager buffer transfers
between instances within the RAC cluster.
The default and maximum window size can be changed in the /proc file system
without reboot:
# su - root
# sysctl -w net.core.rmem_default=262144
net.core.rmem_default = 262144
# sysctl -w net.core.wmem_default=262144
net.core.wmem_default = 262144
# sysctl -w net.core.rmem_max=262144
net.core.rmem_max = 262144
# sysctl -w net.core.wmem_max=262144
net.core.wmem_max = 262144
# Default setting in bytes of the socket receive buffer
net.core.rmem_default=262144
# Default setting in bytes of the socket send buffer
net.core.wmem_default=262144
# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
net.core.rmem_max=262144
# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
net.core.wmem_max=262144
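To confirm that all four buffer settings made it into the configuration file, the values can be read back and compared against 262144. This is a sketch only; conf_value and check_udp_buffers are hypothetical helpers that parse a sysctl.conf-style file passed as an argument.

```shell
# Hypothetical helper: print the value assigned to a key in a
# sysctl.conf-style file (last assignment wins, spaces around "=" allowed).
conf_value() {
    awk -F= -v k="$2" '
        /^[[:space:]]*#/ { next }
        { key = $1; gsub(/[[:space:]]/, "", key) }
        key == k { val = $2; gsub(/^[[:space:]]+|[[:space:]]+$/, "", val) }
        END { print val }' "$1"
}

# Report any of the four socket buffer settings that is not 262144.
check_udp_buffers() {
    rc=0
    for key in net.core.rmem_default net.core.wmem_default \
               net.core.rmem_max net.core.wmem_max; do
        got=$(conf_value "$1" "$key")
        if [ "$got" != "262144" ]; then
            echo "$key is '${got}', expected 262144"
            rc=1
        fi
    done
    return "$rc"
}
```

Calling check_udp_buffers /etc/sysctl.conf on each node prints nothing when the file matches the settings listed above.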
Check and turn off UDP ICMP rejections:
During the Linux installation process, I indicated to not configure the
firewall option. (By default the option to configure a firewall is selected
by the installer.)
This has burned me several times so I like to do a double-check that the firewall
option is not configured and to ensure udp ICMP filtering is turned off.
08/29/2005 22:17:19
oac_init:2: Could not connect to server, clsc retcode = 9
08/29/2005 22:17:19
a_init:12!: Client init unsuccessful : [32]
ibctx:1:ERROR: INVALID FORMAT
proprinit:problem reading the bootblock or superbloc 22
When experiencing this type of error, the solution was to remove the udp ICMP (iptables)
rejection rule - or to simply have the firewall option turned off.
The Oracle Clusterware software will then start to operate normally and not crash. The following commands
should be executed as the root user account:
# /etc/rc.d/init.d/iptables status
Firewall is stopped.
# /etc/rc.d/init.d/iptables stop
Flushing firewall rules: [ OK ]
Setting chains to policy ACCEPT: filter [ OK ]
Unloading iptables modules: [ OK ]
# chkconfig iptables off
At this point, we have two virtual machines configured for Linux for our
two node Oracle RAC environment. In this section we now get to take care
of one of the most essential tasks in this article - configuring shared
storage for the RAC instances.
Create Initial Virtual Disks - (from first node)
After both virtual machines are powered down, we start with the first virtual machine (vmlinux1)
and use the "Add Hardware Wizard" to create five virtual SCSI hard drives.
We will be using this option to Add five new virtual (SCSI) hard drives:
New Virtual Hard Drives
Virtual Machine #1
Virtual Hard Drive                                            Size
M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk1.vmdk  2GB
M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk2.vmdk  12GB
M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk3.vmdk  12GB
M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk4.vmdk  12GB
M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk5.vmdk  12GB
Add Hardware Wizard - (SCSI Hard Disk)
Virtual Machine #1
Screen
Value
Welcome
Click [Next].
Hardware Type
Select [Hard Disk] from the list of hardware types.
Select a Disk
Select [Create a new virtual disk].
Select a Disk Type
Select the [SCSI] option for the virtual disk type even if it is
not identified as the "Recommended" drive by the VMware installer.
Specify Disk Capacity
Specify [2GB] for the disk size. Also, check the option to [Allocate all disk space now].
Specify Disk File
For the first disk, use the file name [Disk1.vmdk]. For my configuration, the file
will be created in "M:\My Virtual Machines\Workstation 5.5.3\vmlinux1" by default.
Modify VMware Configuration File
In the above section, we created five (SCSI) virtual hard drives to be used as shared storage for
our Oracle RAC environment. In this section, we will accomplish the following two tasks:
M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Red Hat Enterprise Linux 4.vmx
M:\My Virtual Machines\Workstation 5.5.3\vmlinux2\Red Hat Enterprise Linux 4.vmx
With VMware Workstation 5.5 and higher, the name of the VMware configuration file has changed
from rhel4.vmx to Red Hat Enterprise Linux 4.vmx.
The configuration file for vmlinux1 will already contain configuration
information for the five new SCSI virtual hard disks:
...
scsi0:1.present = "TRUE"
scsi0:1.fileName = "Disk1.vmdk"
scsi0:2.present = "TRUE"
scsi0:2.fileName = "Disk2.vmdk"
scsi0:3.present = "TRUE"
scsi0:3.fileName = "Disk3.vmdk"
scsi0:4.present = "TRUE"
scsi0:4.fileName = "Disk4.vmdk"
scsi0:5.present = "TRUE"
scsi0:5.fileName = "Disk5.vmdk"
...
(vmlinux2 obviously will not at this time!)
The configuration information for
the five new hard disks (on vmlinux1) should be
removed and replaced with the configuration information in the table below.
Modify VMware Configuration File to Configure Disk Sharing
Virtual Machine #1 and Virtual Machine #2
#
# ----------------------------------------------------------------
# SHARED DISK SECTION - (BEGIN)
# ----------------------------------------------------------------
# - The goal in meeting the hardware requirements is to have a
# shared storage for the two nodes. The way to achieve this in
# VMware is the creation of a NEW SCSI BUS. It has to be of
# type "virtual" and we must have the disk.locking = "false"
# option.
# - Just dataCacheMaxSize = "0" should be sufficient with the
# diskLib.* parameters, although I include all parameters for
# documentation purposes.
# - maxUnsyncedWrites should matter for sparse disks only, and
# I certainly do not recommend using sparse disks for
# clustering.
# - dataCacheMaxSize=0 should disable cache size completely, so
# other three dataCache options should do nothing (no harm,
# but nothing good either).
# ----------------------------------------------------------------
#
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
disk.locking = "false"
# ----------------------------------------------------------------
# Create one HBA
# ----------------------------------------------------------------
scsi1.present = "TRUE"
scsi1.sharedBus = "virtual"
scsi1.virtualDev = "lsilogic"
# ----------------------------------------------------------------
# Create virtual SCSI disks on single HBA
# ----------------------------------------------------------------
scsi1:0.present = "TRUE"
scsi1:0.fileName = "M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk1.vmdk"
scsi1:0.redo = ""
scsi1:0.mode = "independent-persistent"
scsi1:0.deviceType = "disk"
scsi1:1.present = "TRUE"
scsi1:1.fileName = "M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk2.vmdk"
scsi1:1.redo = ""
scsi1:1.mode = "independent-persistent"
scsi1:1.deviceType = "disk"
scsi1:2.present = "TRUE"
scsi1:2.fileName = "M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk3.vmdk"
scsi1:2.redo = ""
scsi1:2.mode = "independent-persistent"
scsi1:2.deviceType = "disk"
scsi1:3.present = "TRUE"
scsi1:3.fileName = "M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk4.vmdk"
scsi1:3.redo = ""
scsi1:3.mode = "independent-persistent"
scsi1:3.deviceType = "disk"
scsi1:4.present = "TRUE"
scsi1:4.fileName = "M:\My Virtual Machines\Workstation 5.5.3\vmlinux1\Disk5.vmdk"
scsi1:4.redo = ""
scsi1:4.mode = "independent-persistent"
scsi1:4.deviceType = "disk"
#
# ----------------------------------------------------------------
# SHARED DISK SECTION - (END)
# ----------------------------------------------------------------
#
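The five per-disk stanzas in the shared disk section differ only in the unit number and the file name, so they can be generated rather than typed. The following is a sketch; print_shared_disks is a hypothetical helper, and it assumes the disks are named Disk1.vmdk, Disk2.vmdk, ... and attached in order starting at scsi1:0, as in the listing above.

```shell
# Hypothetical generator for the per-disk stanzas shown above.
# $1 = directory holding the .vmdk files, $2 = number of disks.
# Disk N is attached at scsi1:(N-1).
print_shared_disks() {
    dir=$1
    count=$2
    n=1
    while [ "$n" -le "$count" ]; do
        unit=$((n - 1))
        # Passing the path through %s keeps its backslashes literal
        printf 'scsi1:%s.present = "TRUE"\n' "$unit"
        printf 'scsi1:%s.fileName = "%s\\Disk%s.vmdk"\n' "$unit" "$dir" "$n"
        printf 'scsi1:%s.redo = ""\n' "$unit"
        printf 'scsi1:%s.mode = "independent-persistent"\n' "$unit"
        printf 'scsi1:%s.deviceType = "disk"\n' "$unit"
        n=$((n + 1))
    done
}

print_shared_disks 'M:\My Virtual Machines\Workstation 5.5.3\vmlinux1' 5
```

The output can be pasted into both .vmx files; the diskLib.*, disk.locking and scsi1.* bus settings still have to be added by hand.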
Power On Both Virtual Machines
After making the above changes, exit from the VMware Console (File -> Exit)
then power on both of the virtual machines
one at a time starting with vmlinux1.
Perform the following tasks on all nodes in the cluster!
I will be using the Oracle Cluster File System, Release 2 (OCFS2) to store the files required to be shared
for the Oracle Clusterware software.
When using OCFS2, the UID of the UNIX user "oracle" and GID of the UNIX group "dba" must be the same
on all machines in the cluster. If either the UID or GID is different, the files on the OCFS2 file system
will show up as "unowned" or may even be owned by a different user. For this article, I will use
175 for the "oracle" UID and 115 for the "dba" GID.
Create Group and User for Oracle
Let's continue this example by creating the UNIX dba group
and oracle user account along with all appropriate directories.
# mkdir -p /u01/app
# groupadd -g 115 dba
# groupadd -g 116 oinstall
# useradd -u 175 -g 115 -d /u01/app/oracle -s /bin/bash -c "Oracle Software Owner" -p oracle oracle
# chown -R oracle:dba /u01
# passwd oracle
# su - oracle
When you are setting the Oracle environment variables for each RAC node, be sure to
assign each RAC node a unique Oracle SID!
Create Login Script for oracle User Account
After creating the "oracle" UNIX user account on both nodes, make sure
that you are logged in as the oracle user and
verify that the environment is set up correctly by using the
following .bash_profile:
.bash_profile for Oracle User
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
alias ls="ls -FA"
export JAVA_HOME=/usr/local/java
# User specific environment and startup programs
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=$ORACLE_BASE/product/crs
export ORACLE_PATH=$ORACLE_BASE/common/oracle/sql:.:$ORACLE_HOME/rdbms/admin
# Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2,...)
export ORACLE_SID=orcl1
export PATH=.:${JAVA_HOME}/bin:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export PATH=${PATH}:$ORACLE_BASE/common/oracle/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS10=$ORACLE_HOME/nls/data
export NLS_DATE_FORMAT="DD-MON-YYYY HH24:MI:SS"
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
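Because ORACLE_SID is the one line of this file that must differ between nodes, one option is to derive it from the host name so the same .bash_profile can be copied to every node. This is a sketch and not part of the original setup: sid_for_host is a hypothetical helper, and it assumes short host names that end in the node number (vmlinux1, vmlinux2).

```shell
# Hypothetical helper: build an instance name from a database name and the
# trailing digits of the host name (vmlinux1 -> orcl1, vmlinux2 -> orcl2).
sid_for_host() {
    db_name=$1
    host=$2
    # Strip everything up to and including the last non-digit character
    node_num=$(echo "$host" | sed 's/.*[^0-9]//')
    echo "${db_name}${node_num}"
}

sid_for_host orcl vmlinux1    # prints orcl1
```

In the login script this could replace the hard-coded value with export ORACLE_SID=$(sid_for_host orcl "$(hostname -s)"). A fully qualified host name would not end in digits, so pass the short name.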
Create Mount Point for OCFS2 / Clusterware
Finally, let's create the mount point for the Oracle Cluster File System, Release 2 (OCFS2)
that will be used to store the two Oracle Clusterware shared files.
These commands will need to be run as the "root" user account:
$ su -
# mkdir -p /u02/oradata/orcl
# chown -R oracle:dba /u02
Create the following partitions from only one node in the cluster!
Overview
The next step is to create a single partition on each of the five shared virtual drives. As mentioned
earlier in this article, I will be using Oracle's Cluster File System, Release 2 (OCFS2) to
store the two files to be shared for Oracle's Clusterware software. We will then be
using Automatic Storage Management (ASM) to create four ASM volumes; two for all physical database files (data/index files,
online redo log files, and control files) and two for the Flash Recovery Area (RMAN backups and archived redo log files).
The four ASM volumes will be used to create two ASM disk groups
(+ORCL_DATA1 and +FLASH_RECOVERY_AREA) using NORMAL redundancy.
Oracle Shared Drive Configuration
File System Type  Partition  Size   Mount Point        ASM Diskgroup Name    File Types
OCFS2             /dev/sdb1  2 GB   /u02/oradata/orcl  -                     Oracle Cluster Registry (OCR) File - (~100 MB), Voting Disk - (~20MB)
ASM               /dev/sdc1  12 GB  ORCL:VOL1          +ORCL_DATA1           Oracle Database Files
ASM               /dev/sdd1  12 GB  ORCL:VOL2          +ORCL_DATA1           Oracle Database Files
ASM               /dev/sde1  12 GB  ORCL:VOL3          +FLASH_RECOVERY_AREA  Oracle Flash Recovery Area
ASM               /dev/sdf1  12 GB  ORCL:VOL4          +FLASH_RECOVERY_AREA  Oracle Flash Recovery Area
Total                        50 GB
The fdisk command is used in Linux for creating (and removing) partitions.
For this configuration, I will be creating a single partition on each of the five shared
virtual disks:
When attempting to partition the new virtual disks, it is safe to ignore any messages that indicate:
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.
The number of cylinders for this disk is set to 1566.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
# fdisk /dev/sdb
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-261, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-261, default 261): 261
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
# fdisk /dev/sdc
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1566, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-1566, default 1566): 1566
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
# fdisk /dev/sdd
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1566, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-1566, default 1566): 1566
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
# fdisk /dev/sde
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1566, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-1566, default 1566): 1566
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
# fdisk /dev/sdf
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1566, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-1566, default 1566): 1566
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
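The five fdisk sessions above give the same answers each time, so they could be replayed from a loop. The following is only a sketch, assuming fdisk reads its answers from standard input; it presses Enter to accept the default first and last cylinder instead of typing them. partition_disk and DRY_RUN are hypothetical names, and with DRY_RUN left at 1 the script only prints what it would do.

```shell
# Sketch only: replay the interactive answers used above
# (n, p, 1, default first cylinder, default last cylinder, w).
# DRY_RUN defaults to 1 so nothing is written unless you opt in.
DRY_RUN=${DRY_RUN:-1}

partition_disk() {
    dev=$1
    if [ "$DRY_RUN" = "1" ]; then
        echo "would partition $dev (answers: n, p, 1, <Enter>, <Enter>, w)"
    else
        # Each \n is one press of the Enter key
        printf 'n\np\n1\n\n\nw\n' | fdisk "$dev"
    fi
}

for dev in /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf; do
    partition_disk "$dev"
done
```

Run it as root from one node only, and set DRY_RUN=0 only after confirming that the device names match the shared disks, since writing a new partition table destroys whatever was on the disk.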
From the first node (vmlinux1), use the fdisk -l
command to verify the new partitions:
# fdisk -l
Disk /dev/sda: 26.8 GB, 26843545600 bytes
255 heads, 63 sectors/track, 3263 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 3263 26105625 8e Linux LVM
Disk /dev/sdb: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 261 2096451 83 Linux
Disk /dev/sdc: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 1566 12578863+ 83 Linux
Disk /dev/sdd: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdd1 1 1566 12578863+ 83 Linux
Disk /dev/sde: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sde1 1 1566 12578863+ 83 Linux
Disk /dev/sdf: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdf1 1 1566 12578863+ 83 Linux
From the second node (vmlinux2), inform the kernel of the partition
changes using partprobe and then verify the new partitions:
# partprobe
# fdisk -l
Disk /dev/sda: 26.8 GB, 26843545600 bytes
255 heads, 63 sectors/track, 3263 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 3263 26105625 8e Linux LVM
Disk /dev/sdb: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 261 2096451 83 Linux
Disk /dev/sdc: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 1566 12578863+ 83 Linux
Disk /dev/sdd: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdd1 1 1566 12578863+ 83 Linux
Disk /dev/sde: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sde1 1 1566 12578863+ 83 Linux
Disk /dev/sdf: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdf1 1 1566 12578863+ 83 Linux
Perform the following configuration procedures on all nodes in the cluster!
Several of the commands within this section will need to be performed on every node within the cluster
every time the machine is booted. This section provides very detailed information about setting shared memory,
semaphores, and file handle limits. Instructions for placing them in
a startup script (/etc/sysctl.conf) are included in section
"All Startup Commands for Each RAC Node".
Overview
This section focuses on configuring both Linux servers -
getting each one prepared for the Oracle10g RAC installation. This includes
verifying enough swap space, setting shared memory and semaphores, and finally how to
set the maximum number of file handles for the O/S.
Swap Space Considerations
(An inadequate amount of swap during the installation
will cause the Oracle Universal Installer to either "hang" or "die")
# cat /proc/meminfo | grep MemTotal
MemTotal: 755284 kB
# cat /proc/meminfo | grep SwapTotal
SwapTotal: 1540088 kB
# dd if=/dev/zero of=tempswap bs=1k count=300000
# chmod 600 tempswap
# mke2fs tempswap
# mkswap tempswap
# swapon tempswap
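The two /proc/meminfo queries above can be combined to show swap as a percentage of physical memory. This is a small sketch; swap_percent_of_ram is a hypothetical helper that reads any file in /proc/meminfo format and defaults to /proc/meminfo itself.

```shell
# Hypothetical helper: print swap as a whole-number percentage of physical
# memory, reading a file in /proc/meminfo format.
swap_percent_of_ram() {
    awk '
        $1 == "MemTotal:"  { mem = $2 }
        $1 == "SwapTotal:" { swap = $2 }
        END { if (mem > 0) printf "%d\n", swap * 100 / mem }' "${1:-/proc/meminfo}"
}
```

With the values shown above (MemTotal 755284 kB, SwapTotal 1540088 kB) it prints 203, which matches the swap of roughly twice the RAM that the installer allocated.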
Setting Shared Memory
Shared memory allows processes to access common structures and data by placing
them in a shared memory segment. This is the fastest form of Inter-Process Communications
(IPC) available - mainly due to the fact that no kernel involvement occurs when data is
being passed between the processes. Data does not need to be copied between processes.
# ipcs -lm
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1
The SHMMAX parameter defines the maximum size (in bytes) for
a shared memory segment. The Oracle SGA is comprised of shared memory and
it is possible that incorrectly setting SHMMAX could limit the
size of the SGA. When setting SHMMAX, keep in mind that the size of the
SGA should fit within one shared memory segment. An inadequate SHMMAX setting
could result in the following:
ORA-27123: unable to attach to shared memory segment
# cat /proc/sys/kernel/shmmax
33554432
The default value for SHMMAX is 32MB. This is often too small
to configure the Oracle SGA. I generally set the SHMMAX parameter to 2GB
using the following methods:
# sysctl -w kernel.shmmax=2147483648
# echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
We now look at the SHMMNI parameter. This kernel parameter is used
to set the maximum number of shared memory segments system wide. The default value
for this parameter is 4096.
# cat /proc/sys/kernel/shmmni
4096
The default setting for SHMMNI should be adequate for our Oracle10g Release 2 RAC installation.
Finally, we look at the SHMALL shared memory kernel parameter. This parameter
controls the total amount of shared memory (in pages) that can be used at one time on the
system. In short, the value of this parameter should always be at least:
ceil(SHMMAX/PAGE_SIZE)
The default size of SHMALL is 2097152 and can be queried using the following command:
# cat /proc/sys/kernel/shmall
2097152
The default setting for SHMALL should be adequate for our Oracle10g Release 2 RAC installation.
The page size in Red Hat Linux on the i386 platform is 4096 bytes. You can, however,
use bigpages which supports the configuration of larger memory page sizes.
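As a worked check of the ceil(SHMMAX/PAGE_SIZE) rule, the arithmetic can be done in the shell using the figures quoted in this section: a SHMMAX of 2GB, a 4096 byte page and the default SHMALL of 2097152 pages. The variable names are illustrative only.

```shell
# ceil(SHMMAX / PAGE_SIZE) using integer arithmetic
shmmax=2147483648       # 2GB, the value set above
page_size=4096          # i386 page size quoted above
shmall_default=2097152  # default SHMALL, in pages

min_pages=$(( (shmmax + page_size - 1) / page_size ))
echo "SHMMAX needs $min_pages pages; SHMALL allows $shmall_default"
if [ "$shmall_default" -ge "$min_pages" ]; then
    echo "default SHMALL is large enough"
fi
```

A 2GB segment needs 524288 pages, well under the default of 2097152, which is why SHMALL can be left alone here.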
Setting Semaphores
Now that we have configured our shared memory settings, it is time to take
care of configuring our semaphores. The best way to describe a semaphore is as
a counter that is used to provide synchronization between processes (or threads within a process)
for shared resources like shared memory. Semaphore sets are supported in System V where each
one is a counting semaphore. When an application requests semaphores, it does so using "sets".
# ipcs -ls
------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
You can also use the following command:
# cat /proc/sys/kernel/sem
250 32000 32 128
The SEMMSL kernel parameter is used to control the
maximum number of semaphores per semaphore set.
The SEMMNI kernel parameter is used to control the
maximum number of semaphore sets in the entire Linux system.
The SEMMNS kernel parameter is used to control the
maximum number of semaphores (not semaphore sets) in the entire Linux system.
The maximum number of semaphores that can be allocated on a Linux system is the lesser of:
SEMMNS -or- (SEMMSL * SEMMNI)
The SEMOPM kernel parameter is used to control the
number of semaphore operations that can be performed per semop system call.
Finally, we see how to set all semaphore parameters. In the following,
the only parameter I care about changing (raising) is SEMOPM. All other default
settings should be sufficient for our example installation.
# sysctl -w kernel.sem="250 32000 100 128"
# echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf
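The four numbers in kernel.sem are easy to mix up, so it can help to print them with their names and to compare SEMMNS with SEMMSL * SEMMNI. This is a sketch; sem_summary is a hypothetical helper, and it relies on the field order SEMMSL, SEMMNS, SEMOPM, SEMMNI used by /proc/sys/kernel/sem.

```shell
# Fields of kernel.sem, in order: SEMMSL SEMMNS SEMOPM SEMMNI
sem_summary() {
    set -- $1                  # split the value on whitespace
    semmsl=$1 semmns=$2 semopm=$3 semmni=$4
    echo "SEMMSL=$semmsl SEMMNS=$semmns SEMOPM=$semopm SEMMNI=$semmni"
    echo "SEMMSL * SEMMNI = $((semmsl * semmni))"
}

sem_summary "250 32000 100 128"
```

For the values set above, SEMMSL * SEMMNI is 250 * 128 = 32000, the same as SEMMNS. To inspect a running system, pass the live value with sem_summary "$(cat /proc/sys/kernel/sem)".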
Setting File Handles
When configuring the Red Hat Linux server, it is critical to ensure that the maximum number
of file handles is large enough. The setting for file handles denotes the number of open
files that you can have on the Linux system.
# cat /proc/sys/fs/file-max
102563
# sysctl -w fs.file-max=65536
# echo "fs.file-max=65536" >> /etc/sysctl.conf
You can query the current usage of file handles by using the following:
# cat /proc/sys/fs/file-nr
825 0 65536
The file-nr file displays three parameters:
If you need to increase the value in
/proc/sys/fs/file-max,
then make sure that the ulimit is set properly. Usually for Linux 2.4 and 2.6 it is set to
unlimited. Verify the ulimit setting by issuing the ulimit command:
# ulimit
unlimited
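To see how close the system is to the file handle ceiling, the file-nr output can be reduced to a single number. This is a minimal sketch, assuming the first field of file-nr is the number of allocated file handles and the third field is the maximum (fs.file-max); the function name file_handle_headroom is my own.

```shell
# Remaining file handles before fs.file-max is reached.
# Usage: file_handle_headroom "ALLOCATED SECOND_FIELD MAX"
file_handle_headroom() {
    set -- $1             # split into $1 (allocated), $2, $3 (maximum)
    echo $(( $3 - $1 ))
}

# The sample file-nr output shown above:
file_handle_headroom "825 0 65536"   # prints 64711
```

On a running node, file_handle_headroom "$(cat /proc/sys/fs/file-nr)" would report the current headroom.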
Setting IP Local Port Range
Configure the system to allow a local port range of 1024 through 65000.
# cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000
The default value for ip_local_port_range is ports 32768 through 61000.
Oracle recommends a local port range of 1024 to 65000.
# sysctl -w net.ipv4.ip_local_port_range="1024 65000"
# echo "net.ipv4.ip_local_port_range = 1024 65000" >> /etc/sysctl.conf
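A quick way to confirm that a node's port range covers the recommended 1024 through 65000 is sketched below. The function name port_range_ok is an assumption of mine; its input is the two-number string printed by /proc/sys/net/ipv4/ip_local_port_range.

```shell
# Succeeds (exit status 0) when the range covers ports 1024 through 65000.
# Usage: port_range_ok "LOW HIGH"
port_range_ok() {
    set -- $1
    [ "$1" -le 1024 ] && [ "$2" -ge 65000 ]
}

# The default range fails the check:
if port_range_ok "32768 61000"; then
    echo "ok"
else
    echo "too narrow"
fi
```

After applying the sysctl change above, port_range_ok "$(cat /proc/sys/net/ipv4/ip_local_port_range)" should succeed.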
Setting Shell Limits for the oracle User
To improve the performance of the software on Linux systems, Oracle recommends you increase the
following shell limits for the oracle user:
Shell Limit                                              Item in limits.conf   Hard Limit
Maximum number of open file descriptors                  nofile                65536
Maximum number of processes available to a single user   nproc                 16384
cat >> /etc/security/limits.conf <<EOF
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
EOF
cat >> /etc/pam.d/login <<EOF
session required /lib/security/pam_limits.so
EOF
Update the default shell startup file for the "oracle" UNIX account.
cat >> /etc/profile <<EOF
if [ \$USER = "oracle" ]; then
    if [ \$SHELL = "/bin/ksh" ]; then
        ulimit -p 16384
        ulimit -n 65536
    else
        ulimit -u 16384 -n 65536
    fi
    umask 022
fi
EOF
cat >> /etc/csh.login <<EOF
if ( \$USER == "oracle" ) then
    limit maxproc 16384
    limit descriptors 65536
endif
EOF
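Once the entries have been appended to /etc/security/limits.conf, it can be handy to read a single value back to confirm it landed. The helper below is only a sketch (limit_value is a hypothetical name). It assumes the simple four-column layout used above, and if several lines match it reports the last one, which is an arbitrary choice for this example.

```shell
# Print the value of one limits.conf entry.
# Usage: limit_value FILE USER TYPE ITEM     (TYPE is "soft" or "hard")
limit_value() {
    awk -v u="$2" -v t="$3" -v i="$4" '
        $1 == u && $2 == t && $3 == i { v = $4 }   # remember the last match
        END { if (v != "") print v }
    ' "$1"
}

# Example (run as root on a configured node):
#   limit_value /etc/security/limits.conf oracle hard nofile
```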
Activating All Kernel Parameters for the System
At this point, we have covered all of the required Linux kernel parameters needed
for a successful Oracle installation and configuration. Within each section above, we
configured the Linux system to persist each of the kernel parameters on system
startup by placing them all in the /etc/sysctl.conf file.
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_max = 262144
kernel.shmmax = 2147483648
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
Setting the Correct Date and Time on Both Oracle RAC Nodes
During the installation of Oracle Clusterware, the Database, and the Companion CD,
the Oracle Universal Installer (OUI) first installs the software to the local
node running the installer (i.e. vmlinux1). The software is then copied remotely to all of the remaining
nodes in the cluster (i.e. vmlinux2). During the remote copy process, the OUI will execute the UNIX
"tar" command on each of the remote nodes to extract the files that were archived and
copied over. If the date and time on the node performing the install is later than
that of the node it is copying to, the OUI will throw an error from the
"tar" command indicating it is attempting to extract files stamped with a time in the future:
Error while copying directory
/u01/app/oracle/product/crs with exclude file list 'null' to nodes 'vmlinux2'.
[PRKC-1002 : All the submitted commands did not execute successfully]
---------------------------------------------
vmlinux2:
/bin/tar: ./bin/lsnodes: time stamp 2007-02-19 09:21:34 is 735 s in the future
/bin/tar: ./bin/olsnodes: time stamp 2007-02-19 09:21:34 is 735 s in the future
...(more errors on this node)
# date -s "2/19/2007 23:00:00"
# date -s "2/19/2007 23:00:20"
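The tar error appears whenever the clock on the installing node is ahead of the clock on a remote node (735 seconds ahead in the output above). A minimal sketch of that comparison follows, working on epoch seconds; the function name remote_clock_ok is my own invention.

```shell
# Succeeds when the remote clock is equal to or ahead of the local clock,
# which is the condition that avoids tar's "in the future" complaint.
# Usage: remote_clock_ok LOCAL_EPOCH_SECONDS REMOTE_EPOCH_SECONDS
remote_clock_ok() {
    [ "$2" -ge "$1" ]
}

# A remote node set 20 seconds ahead of the installing node passes:
if remote_clock_ok 1000 1020; then
    echo "remote clock is not behind"
fi
```

Once user equivalence is in place, something like remote_clock_ok "$(date +%s)" "$(ssh vmlinux2 date +%s)" could feed it real values, assuming a date command that supports +%s.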
Perform the following configuration procedures on all nodes in the cluster!
The hangcheck-timer.ko Module
The hangcheck-timer module uses a kernel-based timer that
periodically checks the system task scheduler to catch delays in order to
determine the health of the system. If the system hangs or pauses, the timer
resets the node. The hangcheck-timer module uses the Time Stamp Counter
(TSC) CPU register which is a counter that is incremented at each clock signal.
The TSC offers much more accurate time measurements since this register
is updated by the hardware automatically.
Installing the hangcheck-timer.ko Module
The hangcheck-timer was normally shipped only by Oracle; however, this
module is now included with Red Hat Linux AS starting with kernel versions
2.4.9-e.12 and higher. The hangcheck-timer should already be included.
Use the following to ensure that you have the module included:
# find /lib/modules -name "hangcheck-timer.ko"
/lib/modules/2.6.9-42.EL/kernel/drivers/char/hangcheck-timer.ko
In the above output, we care about the hangcheck timer object
(hangcheck-timer.ko) in the
/lib/modules/2.6.9-42.EL/kernel/drivers/char directory.
Configuring and Loading the hangcheck-timer Module
There are two key parameters to the hangcheck-timer module:
The two hangcheck-timer module parameters indicate how long a RAC node
must hang before it will reset the system. A node reset will occur when
the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)
Configuring Hangcheck Kernel Module Parameters
Each time the hangcheck-timer kernel module is loaded (manually or by Oracle), it needs to know what value to use for
each of the two parameters we just discussed (hangcheck_tick and hangcheck_margin).
# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modprobe.conf
Each time the hangcheck-timer kernel module gets loaded, it will use the values
defined by the entry I made in the /etc/modprobe.conf file.
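With hangcheck_tick=30 and hangcheck_margin=180, the reset condition works out to a hang longer than 210 seconds. The sketch below simply restates the inequality in shell; the function name hangcheck_would_reset is hypothetical.

```shell
# Succeeds when a hang of HANG_SECONDS would trigger a node reset, i.e.
# when HANG_SECONDS > (hangcheck_tick + hangcheck_margin).
# Usage: hangcheck_would_reset HANG_SECONDS TICK MARGIN
hangcheck_would_reset() {
    threshold=$(( $2 + $3 ))
    [ "$1" -gt "$threshold" ]
}

# With the values written to /etc/modprobe.conf in this article:
if hangcheck_would_reset 240 30 180; then
    echo "a 240 second hang exceeds the 210 second threshold: node reset"
fi
```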
Manually Loading the Hangcheck Kernel Module for Testing
Oracle is responsible for loading the hangcheck-timer kernel module when required. It is for this
reason that it is not required to perform a modprobe or insmod of the
hangcheck-timer kernel module in any of the startup files (i.e. /etc/rc.local).
# echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local
You don't have to manually load the hangcheck-timer kernel module using
modprobe or insmod after each reboot.
The hangcheck-timer module will be loaded by Oracle (automatically) when needed.
# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages | tail -2
Feb 19 13:04:40 vmlinux2 kernel: Hangcheck: starting hangcheck timer 0.5.0 (tick is 30 seconds, margin is 180 seconds)
Perform the following configuration procedures on both Oracle RAC nodes in the cluster!
Before you can install and use Oracle Real Application Clusters, you must configure either
secure shell (SSH) or remote shell (RSH) for the "oracle" UNIX user account on both
of the Oracle RAC nodes in the cluster. The goal here is to setup user equivalence for the
"oracle" UNIX user account. User equivalence enables the "oracle" UNIX user account to
access all other nodes in the cluster (running commands and copying files) without the need
for a password. This can be configured using either SSH or RSH where SSH is the preferred
method. Oracle added support in 10g Release 1 for using the SSH tool suite for setting up
user equivalence. Before Oracle10g, user equivalence had to be configured using remote shell.
If the Oracle Universal Installer in 10g does not detect the presence of the
secure shell tools (ssh and scp), it will attempt to use the remote
shell tools instead (rsh and rcp).
The use of secure shell or remote shell is not required for normal RAC operation.
This configuration, however, must be enabled for RAC and patchset installations as well
as creating the clustered database.
Using the Secure Shell Method
Using the Remote Shell Method
This section describes how to configure OpenSSH version 3.
# pgrep sshd
3797
If SSH is running, then the response to this command is a list of process ID number(s).
Please run this command on both of the Oracle RAC nodes in the cluster to verify the SSH
daemons are installed and running!
To find out more about SSH, refer to the man page:
# man ssh
Creating RSA and DSA Keys on Both Oracle RAC Nodes
The first step in configuring SSH is to create RSA and DSA key pairs on both Oracle RAC nodes
in the cluster. The command to do this will create a public and private key for
both RSA and DSA (for a total of four keys per node). The content of the RSA and
DSA public keys will then need to be copied into an authorized key file which is
then distributed to both of the Oracle RAC nodes in the cluster.
# su - oracle
$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
$ /usr/bin/ssh-keygen -t rsa
At the prompts:
$ /usr/bin/ssh-keygen -t dsa
At the prompts:
$ touch ~/.ssh/authorized_keys
$ cd ~/.ssh
$ ls -l *.pub
-rw-r--r-- 1 oracle dba 605 Feb 19 18:21 id_dsa.pub
-rw-r--r-- 1 oracle dba 225 Feb 19 18:21 id_rsa.pub
NOTE: The listing above
should show the id_rsa.pub and id_dsa.pub
public keys created in the previous section.
$ ssh vmlinux1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'vmlinux1 (192.168.1.111)' can't be established.
RSA key fingerprint is de:07:ad:c7:95:45:3b:e0:e9:78:13:3e:d1:29:33:bd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'vmlinux1,192.168.1.111' (RSA) to the list of known hosts.
oracle@vmlinux1's password: xxxxx
$ ssh vmlinux1 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Enter passphrase for key '/u01/app/oracle/.ssh/id_rsa': xxxxx
$ ssh vmlinux2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'vmlinux2 (192.168.1.112)' can't be established.
RSA key fingerprint is ab:ec:a9:50:24:11:b4:84:2d:fc:5f:f2:15:69:03:6f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'vmlinux2,192.168.1.112' (RSA) to the list of known hosts.
oracle@vmlinux2's password: xxxxx
$ ssh vmlinux2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
oracle@vmlinux2's password: xxxxx
The first time you use SSH to connect to a node from a particular
system, you may see a message similar to the following:
The authenticity of host 'vmlinux1 (192.168.1.111)' can't be established.
RSA key fingerprint is de:07:ad:c7:95:45:3b:e0:e9:78:13:3e:d1:29:33:bd.
Are you sure you want to continue connecting (yes/no)? yes
Enter yes at the prompt to continue. You should
not see this message again when you connect from this system
to the same node.
$ scp ~/.ssh/authorized_keys vmlinux2:.ssh/authorized_keys
oracle@vmlinux2's password: xxxxx
authorized_keys 100% 1652 1.6KB/s 00:00
$ chmod 600 ~/.ssh/authorized_keys
$ ssh vmlinux1 hostname
Enter passphrase for key '/u01/app/oracle/.ssh/id_rsa': xxxxx
vmlinux1
$ ssh vmlinux2 hostname
Enter passphrase for key '/u01/app/oracle/.ssh/id_rsa': xxxxx
vmlinux2
If you see any other messages or text, apart from the host name,
then the Oracle installation can fail. Make any changes required
to ensure that only the host name is displayed when you enter
these commands. You should ensure that any parts of login scripts
that generate output or ask questions are modified so that
they act only when the shell is an interactive shell.
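One common way to make a Bourne-style login script act only in interactive shells is to test the shell option flags held in $-, which contain the letter "i" for an interactive shell. The fragment below is a sketch of that guard; the helper name is_interactive_flags and the welcome message are illustrative assumptions.

```shell
# Succeeds when the given shell option flags describe an interactive shell.
# Usage: is_interactive_flags "$-"
is_interactive_flags() {
    case $1 in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# In a startup file, guard anything that prints or prompts. During a
# non-interactive "ssh vmlinux2 hostname" this branch is skipped, so only
# the host name reaches the caller.
if is_interactive_flags "$-"; then
    echo "Welcome to $(hostname)"
fi
```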
Enabling SSH User Equivalency for the Current Shell Session
When the OUI runs, it will need to execute the
secure shell tool commands (ssh and scp) without being prompted for a
pass phrase. Even though SSH is configured on both Oracle RAC nodes in the cluster, using the
secure shell tool commands will still prompt for a pass phrase. Before running the OUI,
you need to enable user equivalence for the terminal session you plan to run the OUI from.
For the purpose of this article, all Oracle installations will be performed from vmlinux1.
# su - oracle
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /u01/app/oracle/.ssh/id_rsa: xxxxx
Identity added: /u01/app/oracle/.ssh/id_rsa (/u01/app/oracle/.ssh/id_rsa)
Identity added: /u01/app/oracle/.ssh/id_dsa (/u01/app/oracle/.ssh/id_dsa)
At the prompts, enter the pass phrase for each key that you generated.
$ ssh vmlinux1 "date;hostname"
Mon Feb 19 18:25:31 EST 2007
vmlinux1
$ ssh vmlinux2 "date;hostname"
Mon Feb 19 18:26:09 EST 2007
vmlinux2
The commands above should display the date set on each Oracle RAC node along with its hostname.
If any of the nodes prompt for a password or pass phrase then verify
that the ~/.ssh/authorized_keys file on that node
contains the correct public keys.
$ DISPLAY=<Any X-Windows Host>:0
$ export DISPLAY
C shell:
$ setenv DISPLAY <Any X-Windows Host>:0
After setting the DISPLAY variable to a valid X Windows
display, you should perform another test of the current terminal
session to ensure that X11 forwarding is not enabled:
$ ssh vmlinux1 hostname
vmlinux1
$ ssh vmlinux2 hostname
vmlinux2
If you are using a remote client to connect to the node performing the
installation, and you see a message similar to:
"Warning: No xauth data; using fake authentication data for X11 forwarding."
then this means that your authorized keys file is configured correctly;
however, your SSH configuration has X11 forwarding enabled. For example:
$ export DISPLAY=melody:0
$ ssh vmlinux2 hostname
Warning: No xauth data; using fake authentication data for X11 forwarding.
vmlinux2
Note that having X11 Forwarding enabled will cause the Oracle installation to
fail. To correct this problem, create a user-level SSH client configuration
file for the "oracle" UNIX user account that disables X11 Forwarding:
Host *
ForwardX11 no
Remove any stty Commands
When installing the Oracle software, any hidden files on the system
(i.e. .bashrc, .cshrc, .profile) will cause the
installation process to fail if they contain stty commands.
if [ -t 0 ]; then
stty intr ^C
fi
test -t 0
if ($status == 0) then
stty intr ^C
endif
If there are hidden files that contain stty commands that
are loaded by the remote shell, then OUI indicates an error and stops the installation.
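Before starting the installer, it may help to list which startup files mention stty at all. The sketch below is deliberately crude: it flags any occurrence of the string, including commented or already guarded ones, so each hit still needs a manual look. The function name files_with_stty and the exact list of file names (which adds .bash_profile and .login to those named above) are my assumptions.

```shell
# Print the startup files under DIR that contain the string "stty".
# Usage: files_with_stty DIR
files_with_stty() {
    for f in .bashrc .bash_profile .profile .cshrc .login; do
        if [ -f "$1/$f" ] && grep -q 'stty' "$1/$f"; then
            echo "$f"
        fi
    done
}

# Example (as the oracle user on each node):
#   files_with_stty "$HOME"
```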
Using the Remote Shell Method
The services provided by remote shell are disabled by default on most Linux systems.
This section describes the tasks required for enabling and configuring user equivalence
for use by the Oracle Universal Installer when commands should be run and files copied
to the remote nodes in the cluster using the remote shell tools. The goal is
to enable the Oracle Universal Installer to use rsh and rcp
to run commands and copy files to a remote node without being prompted for
a password. Please note that using
the remote shell method for configuring user equivalence is not secure.
# rpm -q rsh rsh-server
rsh-0.17-25.4
rsh-server-0.17-25.4
From the above, we can see that we have the rsh and rsh-server packages installed.
If rsh is not installed, run the following command from the
CD where the RPM is located:
# su -
# rpm -ivh rsh-0.17-25.4.i386.rpm rsh-server-0.17-25.4.i386.rpm
# su -
# chkconfig rsh on
# chkconfig rlogin on
# service xinetd reload
Reloading configuration: [ OK ]
To allow the "oracle" UNIX user account to be trusted among the RAC nodes,
create the /etc/hosts.equiv file on both Oracle RAC nodes in the cluster:
# su -
# touch /etc/hosts.equiv
# chmod 600 /etc/hosts.equiv
# chown root.root /etc/hosts.equiv
Now add all RAC nodes to the /etc/hosts.equiv file
similar to the following example for both Oracle RAC nodes in the cluster:
# cat /etc/hosts.equiv
+vmlinux1 oracle
+vmlinux2 oracle
+vmlinux1-priv oracle
+vmlinux2-priv oracle
In the above example, the second field permits only the oracle user account to run
rsh commands on the specified nodes. For security reasons,
the /etc/hosts.equiv file should be owned by root and the permissions
should be set to 600. In fact, some systems will only honor the content of
this file if the owner of this file is root and the permissions are set to 600.
Before attempting to test your rsh command, ensure
that you are using the correct version of rsh. By default, Red Hat Linux
puts /usr/kerberos/bin at the head of the $PATH variable. This
will cause the Kerberos version of rsh to be executed.
# su -
# which rsh
/usr/kerberos/bin/rsh
# mv /usr/kerberos/bin/rsh /usr/kerberos/bin/rsh.original
# mv /usr/kerberos/bin/rcp /usr/kerberos/bin/rcp.original
# mv /usr/kerberos/bin/rlogin /usr/kerberos/bin/rlogin.original
# which rsh
/usr/bin/rsh
# su - oracle
$ rsh vmlinux1 ls -l /etc/hosts.equiv
-rw------- 1 root root 78 Feb 19 18:28 /etc/hosts.equiv
$ rsh vmlinux1-priv ls -l /etc/hosts.equiv
-rw------- 1 root root 78 Feb 19 18:28 /etc/hosts.equiv
$ rsh vmlinux2 ls -l /etc/hosts.equiv
-rw------- 1 root root 78 Feb 19 18:28 /etc/hosts.equiv
$ rsh vmlinux2-priv ls -l /etc/hosts.equiv
-rw------- 1 root root 78 Feb 19 18:28 /etc/hosts.equiv
Unlike when using secure shell, no other actions or commands are needed to enable user
equivalence using the remote shell. User equivalence will be enabled for the "oracle"
UNIX user account after successfully logging in to a terminal session.
Verify that the following startup commands are
included on all nodes in the cluster!
Up to this point, we have talked in great detail about the parameters and resources that
need to be configured on all nodes for the Oracle10g RAC configuration. This section
recaps those parameters, commands, and entries (from previous sections of this document)
that need to take effect on each node when the machine is booted.
All parameters and values to be used by kernel modules.
/etc/modprobe.conf
alias eth0 vmnics
alias eth1 vmnics
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptscsi
alias scsi_hostadapter2 mptfc
alias scsi_hostadapter3 mptspi
alias scsi_hostadapter4 mptsas
alias scsi_hostadapter5 mptscsih
alias usb-controller uhci-hcd
# Added by VMware Tools
install vmnics /sbin/modprobe vmxnet; /sbin/modprobe pcnet32; /bin/true
alias char-major-14 sb
options sb io=0x220 irq=5 dma=1 dma16=5 mpu_io=0x330
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
We wanted to adjust the default and maximum send buffer size as well as the
default and maximum receive buffer size for the interconnect. This file also contains
those parameters responsible for configuring shared memory,
semaphores, file handles, and local IP range for use by the Oracle instance.
First, verify that each of the required kernel parameters is configured in the
/etc/sysctl.conf file. Then, ensure that each of these parameters is truly in
effect by running the following command on both Oracle RAC nodes in the cluster:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_max = 262144
kernel.shmmax = 2147483648
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
/etc/sysctl.conf
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Default setting in bytes of the socket receive buffer
net.core.rmem_default=262144
# Default setting in bytes of the socket send buffer
net.core.wmem_default=262144
# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
net.core.rmem_max=262144
# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
net.core.wmem_max=262144
# +---------------------------------------------------------+
# | SHARED MEMORY |
# +---------------------------------------------------------+
kernel.shmmax=2147483648
# +---------------------------------------------------------+
# | SEMAPHORES |
# | ---------- |
# | |
# | SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value |
# | |
# +---------------------------------------------------------+
kernel.sem=250 32000 100 128
# +---------------------------------------------------------+
# | FILE HANDLES |
# ----------------------------------------------------------+
fs.file-max=65536
# +---------------------------------------------------------+
# | LOCAL IP RANGE |
# ----------------------------------------------------------+
net.ipv4.ip_local_port_range=1024 65000
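To double-check that none of the parameters added in this article were missed on a node, the file can be scanned for each key. This is only a sketch: the function name missing_sysctl_keys is hypothetical, and the key list is simply the set of entries added above.

```shell
# Print every required key that has no active (uncommented) entry in FILE.
# Usage: missing_sysctl_keys FILE
missing_sysctl_keys() {
    for key in net.core.rmem_default net.core.wmem_default \
               net.core.rmem_max net.core.wmem_max \
               kernel.shmmax kernel.sem fs.file-max \
               net.ipv4.ip_local_port_range; do
        # Accept "key=value" or "key = value" at the start of a line.
        # The dots in the key act as regex wildcards, which is loose
        # but sufficient for this check.
        grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$1" || echo "$key"
    done
}

# Example: no output means every key is present.
#   missing_sysctl_keys /etc/sysctl.conf
```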
All machine/IP entries for nodes in the RAC cluster.
/etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.111 vmlinux1
192.168.1.112 vmlinux2
# Private Interconnect - (eth1)
192.168.2.111 vmlinux1-priv
192.168.2.112 vmlinux2-priv
# Public Virtual IP (VIP) addresses - (eth0)
192.168.1.211 vmlinux1-vip
192.168.1.212 vmlinux2-vip
192.168.1.106 melody
192.168.1.102 alex
192.168.1.105 bartman
192.168.1.120 cartman
The /etc/hosts.equiv file is only required when using the remote
shell method to establish remote access and user equivalency.
/etc/hosts.equiv
+vmlinux1 oracle
+vmlinux2 oracle
+vmlinux1-priv oracle
+vmlinux2-priv oracle
Loading the hangcheck-timer kernel module.
/etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
# +---------------------------------------------------------+
# | HANGCHECK TIMER |
# | (I do not believe this is required, but doesn't hurt) |
# ----------------------------------------------------------+
/sbin/modprobe hangcheck-timer
Most of the configuration procedures in this section
should be performed on all nodes in the cluster!
Creating the OCFS2 filesystem, however, should only be executed on one of
the nodes in the RAC cluster.
Overview
It is now time to install the Oracle Cluster File System, Release 2 (OCFS2).
OCFS2, developed by Oracle Corporation, is a Cluster File System which allows all nodes in a cluster
to concurrently access a device via the standard file system interface.
This allows for easy management of applications that need to run across a
cluster.
OCFS2 Project Documentation
Download OCFS2
First, let's download the latest OCFS2 distribution. The OCFS2 distribution
comprises two sets of RPMs; namely, the kernel module and the
tools. The latest kernel module is available for download from
http://oss.oracle.com/projects/ocfs2/files/
and the tools from
http://oss.oracle.com/projects/ocfs2-tools/files/.
For the tools,
simply match the platform and distribution.
You should download both the OCFS2 tools and the OCFS2 console applications.
ocfs2-2.6.9-42.EL-1.2.4-2.i686.rpm - (for single processor)
ocfs2-2.6.9-42.ELsmp-1.2.4-2.i686.rpm - (for multiple processors)
ocfs2-2.6.9-42.ELhugemem-1.2.4-2.i686.rpm - (for hugemem)
ocfs2-tools-1.2.2-1.i386.rpm - (OCFS2 tools)
ocfs2console-1.2.2-1.i386.rpm - (OCFS2 console)
The OCFS2 Console is optional but highly recommended. The ocfs2console
application requires e2fsprogs, glib2 2.2.3 or later, vte 0.11.10 or
later, pygtk2 (EL4) or python-gtk (SLES9) 1.99.16 or later, python 2.3 or later and ocfs2-tools.
If you were curious as to which OCFS2 driver release you need, use the
OCFS2 release that matches your kernel version. To determine your kernel release:
$ uname -a
Linux vmlinux1 2.6.9-42.EL #1 Sat Aug 12 09:17:58 CDT 2006 i686 i686 i386 GNU/Linux
In the absence of the string "smp" after the string "EL", we are running a single processor (Uniprocessor)
machine. If the string "smp" were to appear, then you would be running on a multi-processor machine.
I will be installing the OCFS2 files onto two single-processor machines.
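The "smp" test described above can be written as a small case statement on the kernel release string. This is a sketch only; the function name kernel_flavor is mine, and it recognizes just the three RHEL4-style suffixes that appear in the OCFS2 package names listed earlier.

```shell
# Classify a RHEL4-style kernel release string.
# Usage: kernel_flavor KERNEL_RELEASE     (as printed by "uname -r")
kernel_flavor() {
    case $1 in
        *ELsmp)     echo "smp" ;;
        *ELhugemem) echo "hugemem" ;;
        *EL)        echo "uniprocessor" ;;
        *)          echo "unknown" ;;
    esac
}

# The kernel used in this article:
kernel_flavor 2.6.9-42.EL    # prints uniprocessor
```

On a node, kernel_flavor "$(uname -r)" tells you which of the three kernel module RPMs to pick.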
The installation process is simply a matter of running the following
command on both Oracle RAC nodes in the cluster as the root user account:
$ su -
# rpm -Uvh ocfs2-2.6.9-42.EL-1.2.4-2.i686.rpm \
ocfs2console-1.2.2-1.i386.rpm \
ocfs2-tools-1.2.2-1.i386.rpm
Preparing... ########################################### [100%]
1:ocfs2-tools ########################################### [ 33%]
2:ocfs2-2.6.9-42.EL ########################################### [ 67%]
3:ocfs2console ########################################### [100%]
Disable SELinux (RHEL4 U2 and higher)
Users of RHEL4 U2 and higher (CentOS 4.4 is based on RHEL4 U4) are advised that OCFS2
currently does not work with SELinux enabled. If you are using RHEL4 U2 or higher
(which includes us since we are using CentOS 4.4) you will need to
disable SELinux (using the system-config-securitylevel tool) to get
the O2CB service to execute.
A ticket has been logged with Red Hat on this issue.
# /usr/bin/system-config-securitylevel &
This will bring up the following screen:

Figure 6: Security Level Configuration Opening Screen
Now, click the SELinux tab and uncheck the "Enabled" checkbox.
After clicking [OK], you will be presented with a warning dialog.
Simply acknowledge this warning by clicking "Yes".
Your screen should now look like the following after disabling the SELinux option:

Figure 7: SELinux Disabled
After making this change on both nodes in the cluster, each
node will need to be rebooted to implement the change. SELinux must be disabled
before you can continue with configuring OCFS2!
# init 6
Configure OCFS2
The next step is to generate and configure the /etc/ocfs2/cluster.conf file
on both Oracle RAC nodes in the cluster. The
easiest way to accomplish this is to run the GUI tool ocfs2console.
In this section, we will not only create and configure the /etc/ocfs2/cluster.conf file
using ocfs2console, but will also create and start the cluster stack O2CB. When
the /etc/ocfs2/cluster.conf file is not present, (as will be the case in our example),
the ocfs2console tool will create this file along with a new cluster stack service (O2CB)
with a default cluster name of ocfs2.
This will need to be done on both Oracle RAC nodes in the cluster as the root user account:
$ su -
# ocfs2console &
This will bring up the GUI as shown below:

Figure 8: ocfs2console Screen
Using the ocfs2console GUI tool, perform the following steps:

Figure 9: Starting the OCFS2 Cluster Stack
The following dialog shows the OCFS2 settings I used for the nodes vmlinux1 and vmlinux2:

Figure 10: Configuring Nodes for OCFS2
After exiting the ocfs2console, you will have a /etc/ocfs2/cluster.conf
similar to the following. This process needs to be completed on both Oracle RAC nodes in the cluster
and the OCFS2 configuration file should be exactly the same for both of the nodes:
/etc/ocfs2/cluster.conf
node:
	ip_port = 7777
	ip_address = 192.168.1.111
	number = 0
	name = vmlinux1
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.1.112
	number = 1
	name = vmlinux2
	cluster = ocfs2

cluster:
	node_count = 2
	name = ocfs2
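Because the file must be identical and internally consistent on both nodes, a quick sanity check is to confirm that node_count matches the number of node: stanzas. The following is a minimal sketch (cluster_conf_consistent is a hypothetical helper, not part of ocfs2-tools) and it checks nothing else about the file.

```shell
# Succeeds when node_count equals the number of "node:" stanzas in FILE.
# Usage: cluster_conf_consistent FILE
cluster_conf_consistent() {
    awk '
        /^node:/           { nodes++ }          # count node stanzas
        $1 == "node_count" { declared = $3 }    # value after "node_count ="
        END {
            if (nodes > 0 && nodes == declared) exit 0
            exit 1
        }
    ' "$1"
}

# Example (run on each node):
#   cluster_conf_consistent /etc/ocfs2/cluster.conf && echo "node_count matches"
```

Comparing the file between nodes (for example with a checksum) is a separate step that this sketch does not cover.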
O2CB Cluster Service
Before we can do anything with OCFS2 like formatting or mounting
the file system, we need to first have OCFS2's cluster stack,
O2CB, running (which it will be as a result of the configuration process performed
above).
The stack includes the following services:
The following commands are for demonstration purposes only and should not
be run when installing and configuring OCFS2!
Module "configfs": Not loaded
Filesystem "configfs": Not mounted
Module "ocfs2_nodemanager": Not loaded
Module "ocfs2_dlm": Not loaded
Module "ocfs2_dlmfs": Not loaded
Filesystem "ocfs2_dlmfs": Not mounted
Note that with this example, all of the services
are not loaded. I did an "unload" right before executing
the "status" option. If you were to check the status
of the o2cb service immediately after configuring OCFS2
using ocfs2console utility, they would all be loaded.
Loading module "configfs": OK
Mounting configfs filesystem at /config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Loads all OCFS2 modules.
Starting cluster ocfs2: OK
The above command will online the cluster we created, ocfs2.
Unmounting ocfs2_dlmfs filesystem: OK
Unloading module "ocfs2_dlmfs": OK
Unmounting configfs filesystem: OK
Unloading module "configfs": OK
The above command will offline the cluster we created, ocfs2.
Cleaning heartbeat on ocfs2: OK
Stopping cluster ocfs2: OK
The above command will unload all OCFS2 modules.
Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold
We would now like to configure the on-boot properties of the O2CB driver so
that the cluster stack services will start on each boot. We will also be adjusting
the OCFS2 Heartbeat Threshold from its default setting of 7 to 601. All of the tasks
within this section will need to be performed on both nodes in the cluster.
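To put the threshold values in terms of time: as I understand the OCFS2 1.2 documentation, a node is fenced after (threshold - 1) * 2 seconds without a successful heartbeat. Treat that formula as an assumption to verify against the release you install; the function name below is my own.

```shell
# Seconds before a node is fenced, assuming the OCFS2 1.2 relationship
#   timeout = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2
# Usage: heartbeat_timeout_secs THRESHOLD
heartbeat_timeout_secs() {
    echo $(( ($1 - 1) * 2 ))
}

heartbeat_timeout_secs 7      # default threshold: prints 12
heartbeat_timeout_secs 601    # value used in this article: prints 1200
```

Under that assumption, raising the threshold to 601 stretches the tolerance from 12 seconds to 20 minutes, which suits slow virtualized disks.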
With releases of OCFS2 prior to 1.2.1, a bug existed where the
driver would not get loaded on each boot even after configuring the on-boot properties
to do so. This bug was fixed in release 1.2.1 of OCFS2 and does not
need to be addressed in this article.
If however you are using a release of OCFS2 prior to 1.2.1, please see the
Troubleshooting
section for a workaround to this bug.
# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
Format the OCFS2 File System
Unlike the other tasks in this section, creating the OCFS2 file system should only be executed on one of
the nodes in the RAC cluster. I will be executing all commands in this section from vmlinux1 only.
Note that it is possible to create and mount the OCFS2 file
system using either the GUI tool ocfs2console
or the command-line tool mkfs.ocfs2.
From the ocfs2console utility,
use the menu [Tasks] - [Format].
$ su -
# mkfs.ocfs2 -b 4K -C 32K -N 4 -L oracrsfiles /dev/sdb1
mkfs.ocfs2 1.2.2
Filesystem label=oracrsfiles
Block size=4096 (bits=12)
Cluster size=32768 (bits=15)
Volume size=2146762752 (65514 clusters) (524112 blocks)
3 cluster groups (tail covers 1002 clusters, rest cover 32256 clusters)
Journal size=67108864
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Formatting Journals: done
Writing lost+found: done
mkfs.ocfs2 successful
Mount the OCFS2 File System
Now that the file system is created, we can mount it.
Let's first do it using the command-line, then I'll show
how to include it in the /etc/fstab to have it mount
on each boot.
Mounting the file system will need to be performed
on both nodes in the Oracle RAC cluster as the root
user account using the OCFS2 label oracrsfiles!
$ su -
# mount -t ocfs2 -o datavolume,nointr -L "oracrsfiles" /u02/oradata/orcl
If the mount was successful, you will simply
get your prompt back. We should, however, run the following
checks to ensure the file system is mounted correctly.
Let's use the mount command to ensure that
the new file system is really mounted. This should be performed
on both nodes in the RAC cluster:
# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/sda1 on /boot type ext3 (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
configfs on /config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sdb1 on /u02/oradata/orcl type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
Please take note of the datavolume option I am using to mount
the new file system. Oracle database users must mount any volume
that will contain the Voting Disk file, Cluster Registry (OCR), data files,
redo logs, archive logs and control files with the datavolume mount
option so as to ensure that the Oracle processes open the files with the O_DIRECT
flag. The nointr option ensures that I/O is not interrupted by signals.
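If you would rather not read the mount listing by eye on each node, a small filter can confirm that the volume is mounted as ocfs2 with the options discussed above. This is a sketch that assumes the "device on mountpoint type fstype (options)" layout shown in the listing; has_ocfs2_options is a name I made up for illustration.

```shell
# Read `mount` output on stdin and succeed only if the given mount point is an
# OCFS2 file system mounted with every option named after the mount point.
# (has_ocfs2_options is a hypothetical helper, not part of any package.)
has_ocfs2_options() {
    mount_point=$1
    shift
    wanted="$*"
    awk -v mp="$mount_point" -v wanted="$wanted" '
        $2 == "on" && $3 == mp && $5 == "ocfs2" {
            opts = $6
            gsub(/[()]/, "", opts)            # strip the surrounding parentheses
            n = split(opts, have, ",")
            m = split(wanted, want, " ")
            ok = 1
            for (i = 1; i <= m; i++) {
                found = 0
                for (j = 1; j <= n; j++) if (have[j] == want[i]) found = 1
                if (!found) ok = 0
            }
            matched = 1
        }
        END { if (matched && ok) exit 0; exit 1 }
    '
}

# Example usage on a live node:
#   mount | has_ocfs2_options /u02/oradata/orcl datavolume nointr && echo "mount looks right"
```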
Why does it take so much time to mount the volume? It takes around 5
seconds for a volume to mount. It does so as
to let the heartbeat thread stabilize. In a later release, Oracle
plans to add support for a global heartbeat, which will make most
mounts instant.
Configure OCFS2 to Mount Automatically at Startup
Let's take a look at what we have done so far. We downloaded and installed the Oracle Cluster File System,
Release 2 (OCFS2), which will be used to store the files needed by Cluster Manager. After going
through the install, we loaded the OCFS2 module into the kernel and then formatted the clustered file system.
Finally, we mounted the newly created file system using the OCFS2 label "oracrsfiles".
This section walks through the steps responsible for mounting the new OCFS2 file system, by its label, each time
the machines are booted. Start by adding the following line to the /etc/fstab file on both nodes:
LABEL=oracrsfiles /u02/oradata/orcl ocfs2 _netdev,datavolume,nointr 0 0
Notice the "_netdev" option for mounting this file system.
The _netdev mount option is a must for OCFS2 volumes. This
mount option indicates that the volume is to be mounted after the
network is started and dismounted before the network is
shutdown.
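Since a missing _netdev is easy to overlook, here is a minimal sketch that scans fstab-formatted text and reports any OCFS2 entry without it. The function name ocfs2_missing_netdev is hypothetical, and the check only looks at the standard whitespace-separated fstab fields.

```shell
# Read fstab-formatted text on stdin and print the mount point of every
# OCFS2 entry whose option field lacks _netdev.
# (ocfs2_missing_netdev is a hypothetical helper name.)
ocfs2_missing_netdev() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }         # skip comments and short lines
        $3 == "ocfs2" {
            n = split($4, opts, ",")
            found = 0
            for (i = 1; i <= n; i++) if (opts[i] == "_netdev") found = 1
            if (!found) print $2
        }
    '
}

# Example usage on a live node (no output means every OCFS2 entry is fine):
#   ocfs2_missing_netdev < /etc/fstab
```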
$ su -
# chkconfig --list o2cb
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off
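The o2cb service should show "on" for the run levels your nodes boot into (2 through 5 in the output above), so that the cluster stack is running before the OCFS2 volume is mounted.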
Check Permissions on New OCFS2 File System
Use the ls command to check ownership. The permissions should be
set to 0775 with owner "oracle" and group "dba".
# ls -ld /u02/oradata/orcl
drwxr-xr-x 3 root root 4096 Feb 19 19:06 /u02/oradata/orcl
As we can see from the listing above, the oracle user account
(and the dba group) will not be able to write to this directory.
Let's fix that:
# chown oracle:dba /u02/oradata/orcl
# chmod 775 /u02/oradata/orcl
Let's now go back and re-check that the permissions are correct for both Oracle RAC nodes
in the cluster:
# ls -ld /u02/oradata/orcl
drwxrwxr-x 3 oracle dba 4096 Feb 19 19:06 /u02/oradata/orcl
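The same check can be scripted so it is easy to repeat on both nodes. The sketch below compares one line of ls -ld output against the expected mode string, owner and group; ls_line_matches is a name I chose for illustration, and it assumes the plain ls -ld layout shown above (on systems that append "." or "+" to the mode string, the expected mode would need adjusting).

```shell
# Read one line of `ls -ld` output on stdin and succeed only if the mode
# string, owner and group match the expected values.
# (ls_line_matches is a hypothetical helper name.)
ls_line_matches() {
    awk -v mode="$1" -v owner="$2" -v group="$3" '
        { ok = ($1 == mode && $3 == owner && $4 == group) }
        END { if (ok) exit 0; exit 1 }
    '
}

# Example usage on a live node (drwxrwxr-x corresponds to mode 0775):
#   ls -ld /u02/oradata/orcl | ls_line_matches drwxrwxr-x oracle dba || echo "fix ownership/permissions"
```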
Reboot Both Nodes
Before starting the next section, this would be a good
place to reboot both of the nodes in the RAC cluster. When the
machines come up, ensure that the cluster stack services are being
loaded and the new OCFS2 file system is being mounted:
# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/sda1 on /boot type ext3 (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
configfs on /config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sdb1 on /u02/oradata/orcl type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
If you modified the O2CB heartbeat threshold, you should verify that it is
set correctly:
# cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
601
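To put the threshold into perspective, it can be converted to seconds. The sketch below assumes the relationship described in the OCFS2 documentation as I understand it, where the time before a node is considered dead is (threshold - 1) * 2 seconds; treat the formula as an assumption and check it against the documentation for your OCFS2 release.

```shell
# Convert an O2CB heartbeat dead threshold into an approximate timeout.
# Assumption: timeout in seconds = (threshold - 1) * 2, as described in the
# OCFS2 documentation for the 1.2 series. o2cb_timeout_seconds is a
# hypothetical helper name.
o2cb_timeout_seconds() {
    echo $(( ($1 - 1) * 2 ))
}

# The threshold reported above:
o2cb_timeout_seconds 601
```

Under that assumption, a threshold of 601 corresponds to 1200 seconds, which is very generous and suits slow virtualized storage like ours.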
How to Determine OCFS2 Version
To determine which version of OCFS2 is running, use:
# cat /proc/fs/ocfs2/version
OCFS2 1.2.4 Thu Feb 1 15:04:31 PST 2007 (build a821ad1645e42e94b93ec4904c40dd10)
Introduction
In this section, we will configure Automatic Storage Management (ASM)
to be used as the file system / volume manager for
all Oracle physical database files (data, online redo logs, control files,
archived redo logs) and a Flash Recovery Area.
Download the ASMLib 2.0 Packages
We start this section by downloading the latest ASMLib 2.0 libraries and the driver from OTN.
At the time of this writing, the latest release of the ASMLib driver was
2.0.3-1.
As with the Oracle Cluster File System, we need to download the version
that matches the Linux kernel and the number of processors on the machine. We are using
kernel 2.6.9-42.EL, and both of my machines are single processor
machines:
# uname -a
Linux vmlinux1 2.6.9-42.EL #1 Sat Aug 12 09:17:58 CDT 2006 i686 i686 i386 GNU/Linux
If you do not currently have an account with Oracle OTN, you
will need to create one. This is a FREE account!
Oracle ASMLib Downloads for Red Hat Enterprise Linux 4 AS
You will also need to download the following ASMLib tools:
oracleasm-2.6.9-42.EL-2.0.3-1.i686.rpm - (for single processor)
oracleasm-2.6.9-42.ELsmp-2.0.3-1.i686.rpm - (for multiple processors)
oracleasm-2.6.9-42.ELhugemem-2.0.3-1.i686.rpm - (for hugemem)
oracleasmlib-2.0.2-1.i386.rpm - (Userspace library)
oracleasm-support-2.0.3-1.i386.rpm - (Driver support files)
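The driver package name follows directly from the kernel release, which makes it easy to work out which file you need. The sketch below simply assembles the name using the pattern visible in the list above; asm_driver_rpm is a hypothetical helper, and the driver version and architecture are passed in rather than discovered.

```shell
# Build the ASMLib kernel driver RPM file name for a given kernel release.
# Arguments: kernel release (as printed by `uname -r`), driver version, architecture.
# The naming pattern is inferred from the package list above.
# (asm_driver_rpm is a hypothetical helper name.)
asm_driver_rpm() {
    kernel_release=$1
    driver_version=$2
    arch=$3
    echo "oracleasm-${kernel_release}-${driver_version}.${arch}.rpm"
}

# Example usage on a live node:
#   asm_driver_rpm "$(uname -r)" 2.0.3-1 i686
```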
Install ASMLib 2.0 Packages
This installation needs to be performed on both nodes in the RAC cluster
as the root user account:
$ su -
# rpm -Uvh oracleasm-2.6.9-42.EL-2.0.3-1.i686.rpm \
oracleasmlib-2.0.2-1.i386.rpm \
oracleasm-support-2.0.3-1.i386.rpm
Preparing... ########################################### [100%]
1:oracleasm-support ########################################### [ 33%]
2:oracleasm-2.6.9-42.EL ########################################### [ 67%]
3:oracleasmlib ########################################### [100%]
Configuring and Loading the ASMLib 2.0 Packages
Now that we have downloaded and installed the ASMLib 2.0 packages for Linux,
we need to configure and load the ASM kernel module. This task
needs to be run on both nodes in the RAC cluster as the
root user account:
$ su -
# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.
Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: [ OK ]
Creating /dev/oracleasm mount point: [ OK ]
Loading module "oracleasm": [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]
Create ASM Disks for Oracle
Creating the ASM disks only needs to be done on one node in the
RAC cluster as the root user account. I will be running these commands on
vmlinux1. On the other Oracle RAC node, you will need to
run oracleasm scandisks to recognize the new volumes. When
that is complete, you should then run the
oracleasm listdisks command on both Oracle RAC nodes
to verify that all ASM disks were created and are
available.
Oracle ASM Partitions Created
File System Type   Partition   Size    Mount Point   File Types
----------------   ---------   -----   -----------   ---------------------
ASM                /dev/sdc1   12 GB   ORCL:VOL1     Oracle Database Files
ASM                /dev/sdd1   12 GB   ORCL:VOL2     Oracle Database Files
ASM                /dev/sde1   12 GB   ORCL:VOL3     Flash Recovery Area
ASM                /dev/sdf1   12 GB   ORCL:VOL4     Flash Recovery Area
Total                          48 GB
If you are repeating this article using the same hardware (actually, the same shared logical drives),
you may get a failure when attempting to create the ASM disks. If you
do receive a failure, try listing all ASM disks using:
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
As you can see, the results show that I have four ASM volumes
already defined. If you have the four volumes already defined
from a previous run, go ahead and remove them using the
following commands. After removing the previously created volumes, use the "oracleasm createdisk" commands (below)
to create the new volumes.
# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk "VOL1" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL2
Removing ASM disk "VOL2" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL3
Removing ASM disk "VOL3" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL4
Removing ASM disk "VOL4" [ OK ]
$ su -
# /etc/init.d/oracleasm createdisk VOL1 /dev/sdc1
Marking disk "/dev/sdc1" as an ASM disk [ OK ]
# /etc/init.d/oracleasm createdisk VOL2 /dev/sdd1
Marking disk "/dev/sdd1" as an ASM disk [ OK ]
# /etc/init.d/oracleasm createdisk VOL3 /dev/sde1
Marking disk "/dev/sde1" as an ASM disk [ OK ]
# /etc/init.d/oracleasm createdisk VOL4 /dev/sdf1
Marking disk "/dev/sdf1" as an ASM disk [ OK ]
On all other nodes in the RAC cluster, you must perform
a scandisk to recognize the new volumes:
# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [ OK ]
We can now test that the ASM disks were successfully created by
using the following command on both nodes in the RAC cluster
as the root user account:
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
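When comparing the listing on both nodes, it can help to have the shell point out any volume that is missing. This is a sketch only; missing_asm_disks is a name I made up, and it just compares the text printed by listdisks against the names you expect.

```shell
# Read `oracleasm listdisks` output on stdin and print each expected volume
# name (given as arguments) that does not appear in it.
# (missing_asm_disks is a hypothetical helper name.)
missing_asm_disks() {
    listed=$(cat)
    for vol in "$@"; do
        # -F: fixed string, -x: whole line must match, -q: quiet
        printf '%s\n' "$listed" | grep -Fqx "$vol" || echo "$vol"
    done
}

# Example usage on a live node (no output means all four volumes are visible):
#   /etc/init.d/oracleasm listdisks | missing_asm_disks VOL1 VOL2 VOL3 VOL4
```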
The following download procedures only need to be performed on
one node in the cluster!
Overview
The next logical step is to install Oracle Clusterware Release 2 (10.2.0.1.0),
Oracle Database 10g Release 2 (10.2.0.1.0), and finally the Oracle Database 10g
Companion CD Release 2 (10.2.0.1.0) for Linux x86 software. However, we must first download
and extract the required Oracle software packages from the Oracle Technology Network (OTN).
If you do not currently have an account with Oracle OTN, you
will need to create one. This is a FREE account!
Oracle Clusterware Release 2 (10.2.0.1.0) for Linux x86
First, download the Oracle Clusterware Release 2 for Linux x86.
Oracle Clusterware Release 2 (10.2.0.1.0)
Oracle Database 10g Release 2 (10.2.0.1.0) for Linux x86
Next, we need to download the Oracle Database 10g Release 2 (10.2.0.1.0) Software for Linux x86.
This can be downloaded from the same page used to download the
Oracle Clusterware Release 2 software:
Oracle Database 10g Release 2 (10.2.0.1.0)
Oracle Database 10g Companion CD Release 2 (10.2.0.1.0) for Linux x86
Finally, we should download the Oracle Database 10g Companion CD for Linux x86.
This can be downloaded from the same page used to download the
Oracle Clusterware Release 2 software:
Oracle Database 10g Companion CD Release 2 (10.2.0.1.0)
As the "oracle" user account, extract the three packages you downloaded to
a temporary directory. In this example, I will use "/u01/app/oracle/orainstall".
# su - oracle
$ cd ~oracle/orainstall
$ unzip 10201_clusterware_linux32.zip
$ cd ~oracle/orainstall
$ unzip 10201_database_linux32.zip
$ cd ~oracle/orainstall
$ unzip 10201_companion_linux32.zip
Perform the following checks on both Oracle RAC nodes in the cluster!
After installing the Linux O/S (CentOS Enterprise Linux or Red Hat Enterprise Linux 4), you should verify
that all required RPMs for Oracle are installed. If you followed the instructions
I used for installing Linux, you would have installed Everything,
in which case you will have all of the required RPM packages. However,
if you performed another installation type (e.g. "Advanced Server"), you
may have some packages missing and will need to install them. All
of the required RPMs are on the Linux CDs/ISOs.
The following packages (keeping in mind that the version number for your Linux distribution
may vary slightly) must be installed:
Prerequisites for Using Cluster Verification Utility
binutils-2.15.92.0.2-21
compat-db-4.1.25-9
compat-gcc-32-3.2.3-47.3
compat-gcc-32-c++-3.2.3-47.3
compat-libstdc++-33-3.2.3-47.3
compat-libgcc-296-2.96-132.7.2
control-center-2.8.0-12.rhel4.5
cpp-3.4.6-3
gcc-3.4.6-3
gcc-c++-3.4.6-3
glibc-2.3.4-2.25
glibc-common-2.3.4-2.25
glibc-devel-2.3.4-2.25
glibc-headers-2.3.4-2.25
glibc-kernheaders-2.4-9.1.98.EL
gnome-libs-1.4.1.2.90-44.1
libaio-0.3.105-2
libstdc++-3.4.6-3
libstdc++-devel-3.4.6-3
make-3.80-6.EL4
openmotif-2.2.3-10.RHEL4.5
openmotif21-2.1.30-11.RHEL4.6
pdksh-5.2.14-30.3
setarch-1.6-1
sysstat-5.0.5-11.rhel4
xscreensaver-4.18-5.rhel4.11
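Note that the openmotif RPM packages are only required to install
Oracle demos. This article does not cover the installation of Oracle demos.
To check whether specific packages are installed (gcc and glibc-devel, for example), query them with "rpm -q":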
# rpm -q gcc glibc-devel
gcc-3.4.6-3
glibc-devel-2.3.4-2.25
If you need to install any of the above packages (which you should not have
to if you installed Everything), use the "rpm -Uvh <PackageName.rpm>" command.
For example, to install the GCC gcc-3.4.6-3 package, use:
# rpm -Uvh gcc-3.4.6-3.i386.rpm
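Querying the whole list one package at a time gets tedious, so here is a small sketch that picks out only the missing ones. It relies on the "package NAME is not installed" message that rpm -q prints for a package it cannot find; missing_rpms is a hypothetical helper name.

```shell
# Read `rpm -q <names...>` output on stdin and print the name of every
# package that rpm reports as not installed.
# (missing_rpms is a hypothetical helper name.)
missing_rpms() {
    sed -n 's/^package \(.*\) is not installed$/\1/p'
}

# Example usage on a live node (package names without versions):
#   rpm -q binutils compat-db gcc gcc-c++ glibc-devel libaio make sysstat | missing_rpms
```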
Checking Pre-Installation Tasks for CRS with CVU
You must have JDK 1.4.2 installed on your system before you can run CVU.
If you do not have JDK 1.4.2 installed on your system, and you attempt to run CVU,
you will receive an error message similar to the following:
ERROR. Either CV_JDKHOME environment variable should be set
or /stagepath/cluvfy/jrepack.zip should exist.
If you do not have JDK 1.4.2 installed, then download it from the Sun Web site,
and use the Sun instructions to install it. JDK 1.4.2 is available as a
download from the following Web site:
http://www.sun.com/java.
Once it is installed, set the CV_JDKHOME environment variable to its location. For example:
CV_JDKHOME=/usr/local/j2re1.4.2_08
export CV_JDKHOME
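Before running CVU, it is worth confirming that CV_JDKHOME really points at a Java installation. The sketch below only checks for an executable bin/java under the directory, which is my own assumption about what makes a usable Java home; valid_jdk_home is a hypothetical helper name.

```shell
# Succeed only if the given directory looks like a Java home, i.e. it
# contains an executable bin/java. This is a heuristic, not the check CVU
# itself performs. (valid_jdk_home is a hypothetical helper name.)
valid_jdk_home() {
    [ -n "$1" ] && [ -d "$1" ] && [ -x "$1/bin/java" ]
}

# Example usage:
#   valid_jdk_home "$CV_JDKHOME" || echo "CV_JDKHOME does not point to a Java installation"
```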
The second prerequisite for running the CVU is for Red Hat Linux users.
If you are using Red Hat Linux, then you must download and install the Red Hat
operating system package cvuqdisk on both of the Oracle RAC nodes
in the cluster. This means you will need to install the cvuqdisk RPM
on both vmlinux1 and vmlinux2. Without cvuqdisk, CVU will be
unable to discover shared disks, and you will receive the error message
"Package cvuqdisk not installed" when you run CVU.
$ su -
# cd /u01/app/oracle/orainstall/clusterware/rpm
# CVUQDISK_GRP=dba; export CVUQDISK_GRP
# rpm -iv cvuqdisk-1.0.1-1.rpm
Preparing packages for installation...
cvuqdisk-1.0.1-1
# ls -l /usr/sbin/cvuqdisk
-rwsr-x--- 1 root dba 4168 Jun 2 2005 /usr/sbin/cvuqdisk
The CVU should be run from vmlinux1, the node we will be performing
all of the Oracle installations from. Before running CVU, log in as the
oracle user account and verify that remote access / user equivalence is configured
to all nodes in the cluster. When using the secure shell method, user equivalence
will need to be enabled for the terminal shell session before attempting to run the CVU.
To enable user equivalence for the current terminal shell session, perform the
following steps, remembering to enter the pass
phrase for each key that you generated when prompted:
# su - oracle
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /u01/app/oracle/.ssh/id_rsa: xxxxx
Identity added: /u01/app/oracle/.ssh/id_rsa (/u01/app/oracle/.ssh/id_rsa)
Identity added: /u01/app/oracle/.ssh/id_dsa (/u01/app/oracle/.ssh/id_dsa)
When using the remote shell method, user equivalence is generally defined in the
/etc/hosts.equiv file for the oracle user account and is enabled on all new
terminal shell sessions.
Once all prerequisites for using CVU have been met, we can start by
checking that all pre-installation tasks for Oracle Clusterware (CRS)
are completed by executing the following command as the "oracle" UNIX user account
(with user equivalence enabled) from vmlinux1:
$ cd /u01/app/oracle/orainstall/clusterware/cluvfy
$ ./runcluvfy.sh stage -pre crsinst -n vmlinux1,vmlinux2 -verbose
Review the CVU report. Note that there are several errors you may
ignore in this report. One of them is the failure to find a suitable set of interfaces for VIPs:
Suitable interfaces for the private interconnect on subnet "192.168.1.0":
vmlinux2 eth0:192.168.1.112
vmlinux1 eth0:192.168.1.111
Suitable interfaces for the private interconnect on subnet "192.168.2.0":
vmlinux2 eth1:192.168.2.112
vmlinux1 eth1:192.168.2.111
ERROR:
Could not find a suitable set of interfaces for VIPs.
Result: Node connectivity check failed.
Checking the Hardware and Operating System Setup with CVU
The next CVU check to run will verify the hardware and operating system setup.
Again, run the following as the "oracle" UNIX user account from vmlinux1:
$ cd /u01/app/oracle/orainstall/clusterware/cluvfy
$ ./runcluvfy.sh stage -post hwos -n vmlinux1,vmlinux2 -verbose
Review the CVU report. As with the previous check
(pre-installation tasks for CRS),
the check for finding a suitable set of interfaces for VIPs will
fail and can be safely ignored.
The check for shared storage accessibility will also fail:
Checking shared storage accessibility...
WARNING:
Unable to determine the sharedness of /dev/sdf on nodes:
vmlinux2,vmlinux2,vmlinux2,vmlinux2,vmlinux2,vmlinux1,vmlinux1,vmlinux1,vmlinux1,vmlinux1
Shared storage check failed on nodes "vmlinux2,vmlinux1".
This too can be safely ignored.
While we know the disks are visible and shared from both of our Oracle RAC nodes in the
cluster, the check itself fails. Several reasons for this have been
documented. The first came from Metalink indicating that
cluvfy currently does not work with devices other than SCSI
devices. This would include devices like EMC PowerPath and volume groups like
those from Openfiler. At the time of this writing, no workaround exists
other than to use manual methods for detecting shared devices. Another
reason for this error was documented by Bane Radulovic at Oracle Corporation.
His research shows that CVU calls smartctl on Linux, and the problem
is that smartctl does not return the serial number from our virtual SCSI devices.
The virtual SCSI devices from VMware do not support SMART.
For example, a check against /dev/sdc shows:
# /usr/sbin/smartctl -i /dev/sdc
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: VMware, VMware Virtual S Version: 1.0
Device type: disk
Local Time is: Mon Feb 19 19:48:15 2007 EST
Device does not support SMART
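The verbose CVU reports are long, so it can be handy to pull out just the failed results and compare them against the failures this article says can be ignored. The sketch below assumes the report marks each outcome with a line of the form "Result: ... failed.", as in the node connectivity excerpt in this article; cvu_failed_results is a hypothetical helper name.

```shell
# Read a CVU report on stdin and print each "Result:" line that reports a
# failure. Assumption: outcomes appear as "Result: ... failed." lines, as in
# the excerpt shown in this article.
# (cvu_failed_results is a hypothetical helper name.)
cvu_failed_results() {
    grep -i 'Result:.*failed'
}

# Example usage from the cluvfy directory:
#   ./runcluvfy.sh stage -post hwos -n vmlinux1,vmlinux2 -verbose | cvu_failed_results
```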
Perform the following installation procedures from only one of the Oracle RAC nodes in the cluster (vmlinux1)!
The Oracle Clusterware software will be installed to both of the Oracle RAC nodes in the
cluster by the Oracle Universal Installer.
Overview
We are ready to install the Cluster part of the environment -
the Oracle Clusterware. In a previous section, we
downloaded and extracted the install files for Oracle Clusterware to vmlinux1 in
the directory /u01/app/oracle/orainstall/clusterware. This is the only
node we need to perform the install from. During the installation of Oracle Clusterware, you
will be asked which nodes to include and configure in the RAC cluster.
Once the actual installation starts, it will copy the required software to
all nodes using the remote access we configured in the section
"Configure RAC Nodes for Remote Access".
Oracle Clusterware Shared Files
The two shared files used by Oracle Clusterware will be stored on the Oracle Cluster
File System, Release 2 (OCFS2) we created earlier. The two shared Oracle Clusterware files are
the Oracle Cluster Registry (OCR) and the CRS Voting Disk.
It is not possible to use
Automatic Storage Management (ASM) for either of these two files. The problem is
that these files need to be in place and accessible BEFORE any Oracle instances can be started. For
ASM to be available, the ASM instance would need to be running first.
The two shared files could be stored on OCFS2, shared RAW devices, or another vendor's clustered file system.
Verifying Terminal Shell Environment
Before starting the Oracle Universal Installer, you should first verify
you are logged onto the server you will be running the installer from
(i.e. vmlinux1) then run
the xhost command as root from the console to
allow X Server connections. Next, login as the oracle user account.
If you are using a remote client to connect to the node performing the
installation (SSH / Telnet to vmlinux1 from a workstation configured with
an X Server), you will need to set the DISPLAY variable to point to your
local workstation.
Finally, verify remote access / user equivalence to all nodes in the cluster:
# hostname
vmlinux1
# xhost +
access control disabled, clients can connect from any host
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify you are able to run the Secure Shell
commands (ssh or scp) or
the Remote Shell commands
(rsh and rcp) on the Linux server
you will be running the Oracle Universal Installer from against all other Linux servers
in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /u01/app/oracle/.ssh/id_rsa: xxxxx
Identity added: /u01/app/oracle/.ssh/id_rsa (/u01/app/oracle/.ssh/id_rsa)
Identity added: /u01/app/oracle/.ssh/id_dsa (/u01/app/oracle/.ssh/id_dsa)
$ ssh vmlinux1 "date;hostname"
Mon Feb 19 19:51:46 EST 2007
vmlinux1
$ ssh vmlinux2 "date;hostname"
Mon Feb 19 20:53:07 EST 2007
vmlinux2
$ rsh vmlinux1 "date;hostname"
Mon Feb 19 19:52:48 EST 2007
vmlinux1
$ rsh vmlinux2 "date;hostname"
Mon Feb 19 20:53:28 EST 2007
vmlinux2
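The same verification can be wrapped in a loop so that a password prompt shows up as a clear failure instead of hanging the session. This sketch uses the OpenSSH BatchMode option, which disables interactive prompts; check_user_equivalence and the SSH_CMD override are names I introduced for illustration.

```shell
# Try a non-interactive login to each node and report any node that would
# prompt for a password. SSH_CMD can be overridden, for example to test the
# loop without a real cluster.
# (check_user_equivalence and SSH_CMD are hypothetical names.)
SSH_CMD=${SSH_CMD:-ssh}

check_user_equivalence() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if "$SSH_CMD" -o BatchMode=yes "$node" true >/dev/null 2>&1; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example usage as the oracle user, after running ssh-agent and ssh-add:
#   check_user_equivalence vmlinux1 vmlinux2
```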
Installing Oracle Clusterware
The following tasks are used to install the Oracle Clusterware:
$ cd ~oracle
$ /u01/app/oracle/orainstall/clusterware/runInstaller -ignoreSysPrereqs
Screen Name
Response
Welcome Screen
Click Next
Specify Inventory directory and credentials
Accept the default values:
Inventory directory: /u01/app/oracle/oraInventory
Operating System group name: dba
Specify Home Details
Set the destination for the $ORACLE_HOME (actually the $ORA_CRS_HOME that
I will be using in this article) name and location
as follows:
Name: OraCrs10g_home
Path: /u01/app/oracle/product/crs
Product-Specific Prerequisite Checks
The installer will run through a series of checks to determine
if the node meets the minimum requirements for installing
and configuring the Oracle Clusterware software. If any of the
checks fail, you will need to manually verify the check that
failed by clicking on the checkbox.
"Checking physical memory requirements ...
Expected result: 922MB
Actual Result: 736MB
Check complete. The overall result of this check is: Failed <<<<"
Specify Cluster Configuration
Cluster Name: crs
Public Node Name   Private Node Name   Virtual Node Name
vmlinux1           vmlinux1-priv       vmlinux1-vip
vmlinux2           vmlinux2-priv       vmlinux2-vip
Specify Network Interface Usage
Interface Name   Subnet        Interface Type
eth0             192.168.1.0   Public
eth1             192.168.2.0   Private
Specify Oracle Cluster Registry (OCR) Location
Starting with Oracle Database 10g Release 2 (10.2) with RAC,
Oracle Clusterware provides for the creation of a mirrored
Oracle Cluster Registry (OCR) file, enhancing cluster reliability.
For the purpose of this example, I did choose to mirror
the OCR file by keeping the default option of "Normal Redundancy":
Specify OCR Mirror Location: /u02/oradata/orcl/OCRFile_mirror
Specify Voting Disk Location
Starting with Oracle Database 10g Release 2 (10.2) with RAC,
CSS has been modified to allow you to configure CSS with multiple
voting disks. In 10g Release 1 (10.1), you could configure only one
voting disk. By enabling multiple voting disk configuration, the
redundant voting disks allow you to configure a RAC database with multiple
voting disks on independent shared physical disks. This option facilitates
the use of the iSCSI network protocol, and other Network Attached Storage
(NAS) storage solutions. Note that to take advantage of the benefits of multiple
voting disks, you must configure at least three voting disks.
For the purpose of this example, I did choose to mirror
the voting disk by keeping the default option of "Normal Redundancy":
Additional Voting Disk 1 Location: /u02/oradata/orcl/CSSFile_mirror1
Additional Voting Disk 2 Location: /u02/oradata/orcl/CSSFile_mirror2
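The reason three is the minimum becomes clear with a little arithmetic. The sketch below assumes, as I understand the Oracle Clusterware behavior, that a node must be able to access a strict majority of the configured voting disks; tolerated_voting_disk_failures is a hypothetical helper name.

```shell
# Number of voting disks that can be lost while a strict majority survives.
# Assumption: a node needs access to more than half of the voting disks.
# (tolerated_voting_disk_failures is a hypothetical helper name.)
tolerated_voting_disk_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3; do
    echo "$n voting disk(s): tolerates $(tolerated_voting_disk_failures "$n") failure(s)"
done
```

Under that assumption, two voting disks tolerate no more failures than one, while three tolerate the loss of a single disk.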
Summary
Click Install to start the installation!
Execute Configuration Scripts
After the installation has completed, you will be prompted
to run the orainstRoot.sh and root.sh scripts.
Open a new console window on each node in the RAC cluster (starting with the
node you are performing the install from) as the "root" user account.
...
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
vmlinux1
vmlinux2
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
The given interface(s), "eth0" is not public. Public interfaces should be used to configure virtual IPs.
Network interfaces: Select only the public interface - eth0
Virtual IPs for cluster nodes:
Node Name: vmlinux1
IP Alias Name: vmlinux1-vip
IP Address: 192.168.1.211
Subnet Mask: 255.255.255.0
Node Name: vmlinux2
IP Alias Name: vmlinux2-vip
IP Address: 192.168.1.212
Subnet Mask: 255.255.255.0
Configuration Assistant Progress Dialog: Click OK after configuration is complete.
Configuration Results: Click Exit
End of installation
At the end of the installation, exit from the OUI.
Verify Oracle Clusterware Installation
After the installation of Oracle Clusterware, we can run through several tests to
verify the install was successful. Run the following commands
on all nodes in the RAC cluster.
$ /u01/app/oracle/product/crs/bin/olsnodes -n
vmlinux1 1
vmlinux2 2
Check Oracle Clusterware Auto-Start Scripts
$ ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root 1951 Feb 19 20:05 /etc/init.d/init.crs
-r-xr-xr-x 1 root root 4714 Feb 19 20:05 /etc/init.d/init.crsd
-r-xr-xr-x 1 root root 35394 Feb 19 20:05 /etc/init.d/init.cssd
-r-xr-xr-x 1 root root 3190 Feb 19 20:05 /etc/init.d/init.evmd
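To run the same check quickly on every node, the four script names from the listing above can be tested in a loop. This is only a sketch; missing_crs_init_scripts is a hypothetical helper name, and the directory argument exists so the function can be pointed somewhere other than /etc/init.d.

```shell
# Print the name of each expected Clusterware init script that is missing
# or not executable in the given directory (default: /etc/init.d).
# The four names come from the listing above.
# (missing_crs_init_scripts is a hypothetical helper name.)
missing_crs_init_scripts() {
    dir=${1:-/etc/init.d}
    for name in init.crs init.crsd init.cssd init.evmd; do
        [ -x "$dir/$name" ] || echo "$name"
    done
}

# Example usage on a live node (no output means all four scripts are present):
#   missing_crs_init_scripts
```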
Perform the following installation procedures from only one of the Oracle RAC nodes in the cluster (vmlinux1)!
The Oracle Database software will be installed to both of the Oracle RAC nodes in the
cluster by the Oracle Universal Installer.
Overview
After successfully installing the Oracle Clusterware software, the next
step is to install the Oracle10g Release 2 Database Software (10.2.0.1.0)
with Real Application Clusters (RAC).
For the purpose of this example, we will forgo
the "Create Database" option when installing the Oracle10g Release 2 software. We will,
instead, create the database using the Database Configuration Assistant (DBCA) after the
Oracle10g Database Software install.
Verifying Terminal Shell Environment
As discussed in the previous section (Install Oracle10g Clusterware Software),
the terminal shell environment needs to
be configured for remote access and user equivalence to all nodes in the cluster
before running the Oracle Universal Installer. Note
that you can utilize the same terminal shell session used in the previous section,
in which case you do not
have to take any of the actions described below with regards to setting up remote
access and the DISPLAY variable:
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify you are able to run the Secure Shell
commands (ssh or scp) or
the Remote Shell commands
(rsh and rcp) on the Linux server
you will be running the Oracle Universal Installer from against all other Linux servers
in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /u01/app/oracle/.ssh/id_rsa: xxxxx
Identity added: /u01/app/oracle/.ssh/id_rsa (/u01/app/oracle/.ssh/id_rsa)
Identity added: /u01/app/oracle/.ssh/id_dsa (/u01/app/oracle/.ssh/id_dsa)
$ ssh vmlinux1 "date;hostname"
Mon Feb 19 19:51:46 EST 2007
vmlinux1
$ ssh vmlinux2 "date;hostname"
Mon Feb 19 20:53:07 EST 2007
vmlinux2
$ rsh vmlinux1 "date;hostname"
Mon Feb 19 19:52:48 EST 2007
vmlinux1
$ rsh vmlinux2 "date;hostname"
Mon Feb 19 20:53:28 EST 2007
vmlinux2
Run the Oracle Cluster Verification Utility
Before installing the Oracle Database Software, we should run
the following database pre-installation check using the
Cluster Verification Utility (CVU).
Instructions for configuring CVU can be found in the section
"Prerequisites for Using Cluster Verification Utility"
discussed earlier in this article.
$ cd /u01/app/oracle/orainstall/clusterware/cluvfy
$ ./runcluvfy.sh stage -pre dbinst -n vmlinux1,vmlinux2 -r 10gR2 -verbose
Review the CVU report. Note that this report will contain
the same errors we received when checking pre-installation tasks
for CRS: the failure to find a suitable set of interfaces for VIPs
and the failure to find specific RPM packages that do not exist in RHEL4 Update.
These two errors can be safely ignored.
Install Oracle10g Release 2 Database Software
The following tasks are used to install the Oracle10g Release 2 Database Software:
$ cd ~oracle
$ /u01/app/oracle/orainstall/database/runInstaller -ignoreSysPrereqs
Screen Name
Response
Welcome Screen
Click Next
Select Installation Type
I selected the Enterprise Edition option.
If you need other components like Oracle Label Security or if you want to
simply customize the environment, select Custom.
Specify Home Details
Set the destination for the ORACLE_HOME name and location
as follows:
Name: OraDb10g_home1
Location: /u01/app/oracle/product/10.2.0/db_1
Specify Hardware Cluster Installation Mode
Select the Cluster Installation option
then select all nodes available.
Click Select All to select all servers: vmlinux1 and vmlinux2.
If the installation stops here and the status of any of the
RAC nodes is "Node not reachable", perform the following checks:
Product-Specific Prerequisite Checks
The installer will run through a series of checks to determine
if the node meets the minimum requirements for installing
and configuring the Oracle database software. If any of the
checks fail, you will need to manually verify the check that
failed by clicking on the checkbox.
"Checking physical memory requirements ...
Expected result: 922MB
Actual Result: 736MB
Check complete. The overall result of this check is: Failed <<<<"
Select Database Configuration
Select the option to Install database Software only.
Summary
Click Install to start the installation!
Root Script Window - Run root.sh
After the installation has completed, you will be prompted
to run the root.sh script. It is important to keep in mind
that the root.sh script will need to be run ON ALL NODES
in the RAC cluster ONE AT A TIME starting
with the node you are running the database installation from.
End of installation
At the end of the installation, exit from the OUI.
Perform the following installation procedures from only one of the Oracle RAC nodes in the cluster (vmlinux1)!
The Oracle10g Companion CD software will be installed to both of the Oracle RAC nodes in the
cluster by the Oracle Universal Installer.
Overview
After successfully installing the Oracle Database software, the next
step is to install the Oracle10g Release 2 Companion CD software (10.2.0.1.0).
Verifying Terminal Shell Environment
As discussed in the previous section (Install Oracle10g Database Software),
the terminal shell environment needs to
be configured for remote access and user equivalence to all nodes in the cluster
before running the Oracle Universal Installer. Note
that you can utilize the same terminal shell session used in the previous section,
in which case you do not
have to take any of the actions described below with regards to setting up remote
access and the DISPLAY variable:
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify you are able to run the Secure Shell
commands (ssh or scp) or
the Remote Shell commands
(rsh and rcp) on the Linux server
you will be running the Oracle Universal Installer from against all other Linux servers
in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /u01/app/oracle/.ssh/id_rsa: xxxxx
Identity added: /u01/app/oracle/.ssh/id_rsa (/u01/app/oracle/.ssh/id_rsa)
Identity added: /u01/app/oracle/.ssh/id_dsa (/u01/app/oracle/.ssh/id_dsa)
$ ssh vmlinux1 "date;hostname"
Mon Feb 19 19:51:46 EST 2007
vmlinux1
$ ssh vmlinux2 "date;hostname"
Mon Feb 19 20:53:07 EST 2007
vmlinux2
$ rsh vmlinux1 "date;hostname"
Mon Feb 19 19:52:48 EST 2007
vmlinux1
$ rsh vmlinux2 "date;hostname"
Mon Feb 19 20:53:28 EST 2007
vmlinux2
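The manual checks above can be wrapped in a small loop. The following is only a sketch: the function names check_equivalence and ssh_batch are hypothetical (they are not part of the Oracle tooling), and it assumes ssh is the remote shell in use. The BatchMode option makes ssh fail instead of prompting for a password, which is exactly the condition we want to detect.

```shell
#!/bin/sh
# check_equivalence RUNNER NODE...
# Runs "hostname" on each node through RUNNER and reports the nodes that fail.
# RUNNER is any command or function that accepts "<node> <command...>".
# (Hypothetical helper for illustration; not part of the Oracle software.)
check_equivalence() {
    runner=$1
    shift
    failed=0
    for target in "$@"; do
        if "$runner" "$target" hostname >/dev/null 2>&1; then
            echo "$target: OK"
        else
            echo "$target: FAILED"
            failed=1
        fi
    done
    return $failed
}

# Wrapper for real use: BatchMode=yes makes ssh exit with an error
# rather than prompt for a password or passphrase.
ssh_batch() {
    remote_host=$1
    shift
    ssh -o BatchMode=yes "$remote_host" "$@"
}
```

To use it from the node running the installer, call: check_equivalence ssh_batch vmlinux1 vmlinux2. A non-zero exit status means at least one node still prompts for a password or cannot be reached.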
Install Oracle10g Companion CD Software
The following tasks are used to install the Oracle10g Companion CD Software:
$ cd ~oracle
$ /u01/app/oracle/orainstall/companion/runInstaller -ignoreSysPrereqs
Screen Name
Response
Welcome Screen
Click Next
Select a Product to Install
Select the Oracle Database 10g Products 10.2.0.1.0 option.
Specify Home Details
Set the destination for the ORACLE_HOME name and location to that of the previous
Oracle10g Database software install as follows:
Name: OraDb10g_home1
Location: /u01/app/oracle/product/10.2.0/db_1
Specify Hardware Cluster Installation Mode
The Cluster Installation option
will be selected along with all of the available nodes in the cluster
by default.
Stay with these default options and click Next
to continue.
If the installation stops here and the status of any of the
RAC nodes is "Node not reachable", perform the following checks:
Product-Specific Prerequisite Checks
The installer will run through a series of checks to determine
if the node meets the minimum requirements for installing
and configuring the Oracle10g Companion CD Software. If any of the
checks fail, you will need to manually verify the check that
failed by clicking on the checkbox.
"Checking physical memory requirements ...
Expected result: 922MB
Actual Result: 736MB
Check complete. The overall result of this check is: Failed <<<<"
Summary
On the Summary screen, click Install to start the installation!
End of installation
At the end of the installation, exit from the OUI.
Perform the following configuration procedures
from only one of the Oracle RAC nodes in the cluster (vmlinux1)!
The Network Configuration Assistant (NETCA) will set up the TNS
listener in a clustered configuration on both Oracle RAC nodes in the cluster.
Overview
The Database Configuration Assistant (DBCA) requires the Oracle TNS Listener
process to be configured and running on all nodes in the RAC cluster before it can
create the clustered database.
Verifying Terminal Shell Environment
As discussed in the previous section (Install Oracle10g Companion CD Software),
the terminal shell environment needs to
be configured for remote access and user equivalence to all nodes in the cluster
before running the Network Configuration Assistant (NETCA). Note
that you can reuse the terminal shell session from the previous section,
in which case you do not
have to take any of the actions described below with regard to setting up remote
access and the DISPLAY variable:
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify you are able to run the Secure Shell
commands (ssh or scp) or
the Remote Shell commands
(rsh and rcp) from the Linux server
running the Network Configuration Assistant against all other Linux servers
in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /u01/app/oracle/.ssh/id_rsa: xxxxx
Identity added: /u01/app/oracle/.ssh/id_rsa (/u01/app/oracle/.ssh/id_rsa)
Identity added: /u01/app/oracle/.ssh/id_dsa (/u01/app/oracle/.ssh/id_dsa)
$ ssh vmlinux1 "date;hostname"
Mon Feb 19 19:51:46 EST 2007
vmlinux1
$ ssh vmlinux2 "date;hostname"
Mon Feb 19 20:53:07 EST 2007
vmlinux2
$ rsh vmlinux1 "date;hostname"
Mon Feb 19 19:52:48 EST 2007
vmlinux1
$ rsh vmlinux2 "date;hostname"
Mon Feb 19 20:53:28 EST 2007
vmlinux2
Run the Network Configuration Assistant
To start the NETCA, run the following:
$ netca &
Screen Name
Response
Select the Type of Oracle
Net Services Configuration
Select Cluster configuration
Select the nodes to configure
Select all of the nodes: vmlinux1 and vmlinux2.
Type of Configuration
Select Listener configuration.
Listener Configuration - Next 6 Screens
The following screens are now like any other normal listener configuration. You can simply
accept the default parameters for the next six screens:
What do you want to do: Add
Listener name: LISTENER
Selected protocols: TCP
Port number: 1521
Configure another listener: No
Listener configuration complete! [ Next ]
You will be returned to this Welcome (Type of Configuration) Screen.
Type of Configuration
Select Naming Methods configuration.
Naming Methods Configuration
The following screens are:
Selected Naming Methods: Local Naming
Naming Methods configuration complete! [ Next ]
You will be returned to this Welcome (Type of Configuration) Screen.
Type of Configuration
Click Finish to exit the NETCA.
Verify TNS Listener Configuration
The Oracle TNS listener process should now be running on both nodes in the RAC
cluster:
$ hostname
vmlinux1
$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_VMLINUX1
=====================
$ hostname
vmlinux2
$ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
LISTENER_VMLINUX2
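The ps pipeline above can be reduced to a single awk filter. This is a minimal sketch, assuming the usual "ps -ef" layout in which the command starts at field 8; the function name listener_names is hypothetical. Matching on the command field means the grep process itself never appears, so the "grep -v" steps are not needed.

```shell
#!/bin/sh
# Reads "ps -ef" output on stdin and prints the name of each TNS listener.
# Field 8 is the command (the tnslsnr binary) and field 9 is its first
# argument, which matches the awk '{print $9}' used in the article.
listener_names() {
    awk '$8 ~ /tnslsnr$/ { print $9 }'
}
```

Run it on each node as: ps -ef | listener_names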
The database creation process should only be performed
from one of the Oracle RAC nodes in the cluster (vmlinux1)!
Overview
We will be using the Oracle Database Configuration Assistant (DBCA) to create
the clustered database.
Verifying Terminal Shell Environment
As discussed in the previous section (Create TNS Listener Process),
the terminal shell environment needs to
be configured for remote access and user equivalence to all nodes in the cluster
before running the Database Configuration Assistant (DBCA). Note
that you can reuse the terminal shell session from the previous section,
in which case you do not
have to take any of the actions described below with regard to setting up remote
access and the DISPLAY variable:
# su - oracle
$ # IF YOU ARE USING A REMOTE CLIENT TO CONNECT TO THE
$ # NODE PERFORMING THE INSTALL
$ DISPLAY=<your local workstation>:0.0
$ export DISPLAY
Verify you are able to run the Secure Shell
commands (ssh or scp) or
the Remote Shell commands
(rsh and rcp) from the Linux server
running the Database Configuration Assistant against all other Linux servers
in the cluster without being prompted for a password.
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /u01/app/oracle/.ssh/id_rsa: xxxxx
Identity added: /u01/app/oracle/.ssh/id_rsa (/u01/app/oracle/.ssh/id_rsa)
Identity added: /u01/app/oracle/.ssh/id_dsa (/u01/app/oracle/.ssh/id_dsa)
$ ssh vmlinux1 "date;hostname"
Mon Feb 19 19:51:46 EST 2007
vmlinux1
$ ssh vmlinux2 "date;hostname"
Mon Feb 19 20:53:07 EST 2007
vmlinux2
$ rsh vmlinux1 "date;hostname"
Mon Feb 19 19:52:48 EST 2007
vmlinux1
$ rsh vmlinux2 "date;hostname"
Mon Feb 19 20:53:28 EST 2007
vmlinux2
Run the Oracle Cluster Verification Utility
Before creating the Oracle clustered database, we should run
the following database configuration check using the
Cluster Verification Utility (CVU).
Instructions for configuring CVU can be found in the section
"Prerequisites for Using Cluster Verification Utility"
discussed earlier in this article.
$ cd /u01/app/oracle/orainstall/clusterware/cluvfy
$ ./runcluvfy.sh stage -pre dbcfg -n vmlinux1,vmlinux2 -d ${ORACLE_HOME} -verbose
Review the CVU report. Note that this report will contain
the same error we received when checking pre-installation tasks
for CRS: failure to find a suitable set of interfaces for VIPs.
This error can be safely ignored.
Create the Clustered Database
To start the database creation process, run the following:
$ dbca &
Screen Name
Response
Welcome Screen
Select Oracle Real Application Clusters database.
Operations
Select Create a Database.
Node Selection
Click the Select All button to
select all servers: vmlinux1 and vmlinux2.
Database Templates
Select Custom Database
Database Identification
Select:
Global Database Name: orcl.idevelopment.info
SID Prefix: orcl
I used idevelopment.info for the database domain. You
may use any domain. Keep in mind that this domain does not have to be
a valid DNS domain.
Management Option
Leave the default option selected here, which is to
Configure the Database with Enterprise Manager /
Use Database Control for Database Management
Database Credentials
I selected to Use the Same Password for All Accounts.
Enter the password (twice) and make sure the password does not start with a digit.
Storage Options
For this article, we will select to use Automatic Storage Management (ASM).
Create ASM Instance
Supply the SYS password to use for the new ASM instance.
Also, starting with Oracle10g Release 2, the ASM instance server parameter
file (SPFILE) needs to be on a shared disk. You will need to modify the default
entry for "Create server parameter file (SPFILE)" to reside on the OCFS2 partition as
follows: /u02/oradata/orcl/dbs/spfile+ASM.ora.
All other options can stay at their defaults.
ASM Disk Groups
To start, click the Create New button.
This will bring up the "Create Disk Group" window with the four volumes
we configured earlier using ASMLib.
Database File Locations
I selected to use the default which is Use Oracle-Managed Files:
Database Area: +ORCL_DATA1
Recovery Configuration
Check the option for Specify Flash Recovery Area.
Database Content
I left all of the Database Components (and destination tablespaces) set to their default
value, although it is perfectly OK to select the Example Schemas.
This option is available since we installed the Oracle Companion CD software.
Database Services
For this test configuration, click Add,
and enter the Service Name: orcltest.
Leave both instances set to Preferred and for
the "TAF Policy" select Basic.
Initialization Parameters
Change any parameters for your environment. I left them all at their default settings.
Database Storage
Change any parameters for your environment. I left them all at their default settings.
Creation Options
Keep the default option Create Database selected and click
Finish to start the database creation process.
End of Database Creation
At the end of the database creation, exit from the DBCA.
Create the orcltest Service
During the creation of the Oracle clustered database, we added a service
named "orcltest" that will be used to connect to the database
with TAF enabled. During several of my installs, the service was added to the
tnsnames.ora, but was never updated as a service for each
Oracle instance.
SQL> show parameter service
NAME TYPE VALUE
-------------------- ----------- --------------------------------
service_names string orcl.idevelopment.info, orcltest
SQL> show parameter service
NAME TYPE VALUE
-------------------- ----------- --------------------------
service_names string orcl.idevelopment.info
SQL> alter system set service_names =
2 'orcl.idevelopment.info, orcltest.idevelopment.info' scope=both;
Ensure that the TNS networking files are configured on
both Oracle RAC nodes in the cluster!
listener.ora
We already covered how to create a TNS listener configuration file (listener.ora)
for a clustered environment in the section
Create TNS Listener Process.
The listener.ora file should be properly configured and no modifications should
be needed.
listener.ora
# listener.ora.vmlinux1 Network Configuration File:
# /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora.vmlinux1
# Generated by Oracle configuration tools.
LISTENER_VMLINUX1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux1-vip)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.111)(PORT = 1521)(IP = FIRST))
    )
  )

SID_LIST_LISTENER_VMLINUX1 =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1)
      (PROGRAM = extproc)
    )
  )
tnsnames.ora
Here is a copy of my tnsnames.ora file that was
configured by Oracle and can be used for testing the Transparent Application Failover (TAF).
This file should already be configured on each node in the
RAC cluster.
tnsnames.ora
# tnsnames.ora Network Configuration File:
# /u01/app/oracle/product/10.2.0/db_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.
LISTENERS_ORCL =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux2-vip)(PORT = 1521))
  )

ORCL2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux2-vip)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl.idevelopment.info)
      (INSTANCE_NAME = orcl2)
    )
  )

ORCL1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux1-vip)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl.idevelopment.info)
      (INSTANCE_NAME = orcl1)
    )
  )

ORCLTEST =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux2-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcltest.idevelopment.info)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )

ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vmlinux2-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl.idevelopment.info)
    )
  )

EXTPROC_CONNECTION_DATA =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
    (CONNECT_DATA =
      (SID = PLSExtProc)
      (PRESENTATION = RO)
    )
  )
Connecting to Clustered Database From an External Client
This is an optional step, but I like to perform it in order to verify that
my TNS files are configured correctly. Use another machine (e.g. a Windows machine connected to
the network) that has Oracle
installed (either 9i or 10g) and add the TNS entries
(in the tnsnames.ora) from either of the nodes in the cluster that were created
for the clustered database.
C:\> sqlplus system/manager@orcl2
C:\> sqlplus system/manager@orcl1
C:\> sqlplus system/manager@orcltest
C:\> sqlplus system/manager@orcl
When creating the clustered database, we left all tablespaces set
to their default size. Since I am using a large drive for the
shared storage, I like to make a sizable testing database.
SQL> select tablespace_name, file_name
2 from dba_data_files
3 union
4 select tablespace_name, file_name
5 from dba_temp_files;
TABLESPACE_NAME FILE_NAME
---------------- --------------------------------------------------
EXAMPLE +ORCL_DATA1/orcl/datafile/example.263.578676853
SYSAUX +ORCL_DATA1/orcl/datafile/sysaux.261.578676829
SYSTEM +ORCL_DATA1/orcl/datafile/system.259.578676767
TEMP +ORCL_DATA1/orcl/tempfile/temp.262.578676841
UNDOTBS1 +ORCL_DATA1/orcl/datafile/undotbs1.260.578676809
UNDOTBS2 +ORCL_DATA1/orcl/datafile/undotbs2.264.578676867
USERS +ORCL_DATA1/orcl/datafile/users.265.578676887
$ sqlplus "/ as sysdba"
SQL> create user scott identified by tiger default tablespace users;
SQL> grant dba, resource, connect to scott;
SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/users.265.578676887' resize 500m;
SQL> alter tablespace users add datafile '+ORCL_DATA1' size 250m autoextend off;
SQL> create tablespace indx datafile '+ORCL_DATA1' size 250m
2 autoextend on next 50m maxsize unlimited
3 extent management local autoallocate
4 segment space management auto;
SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/system.259.578676767' resize 800m;
SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/sysaux.261.578676829' resize 500m;
SQL> alter tablespace undotbs1 add datafile '+ORCL_DATA1' size 250m
2 autoextend on next 50m maxsize 2048m;
SQL> alter tablespace undotbs2 add datafile '+ORCL_DATA1' size 250m
2 autoextend on next 50m maxsize 2048m;
SQL> alter database tempfile '+ORCL_DATA1/orcl/tempfile/temp.262.578676841' resize 1024m;
Status    Tablespace Name TS Type      Ext. Mgt.  Seg. Mgt.    Tablespace Size    Used (in bytes) Pct. Used
--------- --------------- ------------ ---------- --------- ------------------ ------------------ ---------
ONLINE    UNDOTBS1        UNDO         LOCAL      MANUAL           471,859,200         62,586,880        13
ONLINE    SYSAUX          PERMANENT    LOCAL      AUTO             524,288,000        278,724,608        53
ONLINE    USERS           PERMANENT    LOCAL      AUTO             786,432,000            131,072         0
ONLINE    SYSTEM          PERMANENT    LOCAL      MANUAL           838,860,800        502,726,656        60
ONLINE    EXAMPLE         PERMANENT    LOCAL      AUTO             157,286,400         83,820,544        53
ONLINE    INDX            PERMANENT    LOCAL      AUTO             262,144,000             65,536         0
ONLINE    UNDOTBS2        UNDO         LOCAL      MANUAL           471,859,200          1,835,008         0
ONLINE    TEMP            TEMPORARY    LOCAL      MANUAL         1,073,741,824         27,262,976         3
                                                            ------------------ ------------------ ---------
avg                                                                                                      23
sum                                                              4,586,471,424        957,153,280
8 rows selected.
The following RAC verification checks should be performed
on both Oracle RAC nodes in the cluster! For this article, however, I will only
be performing checks from vmlinux1.
Overview
This section provides several srvctl commands and SQL queries that can be used
to validate your Oracle10g RAC configuration.
There are five node-level tasks defined for SRVCTL:
Status of all instances and services
$ srvctl status database -d orcl
Instance orcl1 is running on node vmlinux1
Instance orcl2 is running on node vmlinux2
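For scripted health checks, the srvctl output above can be tested rather than read by eye. The following is a rough sketch: the function name all_instances_running is hypothetical, and the "is not running" wording for a stopped instance is an assumption about the srvctl message format, so verify it against your own output before relying on it.

```shell
#!/bin/sh
# Reads the output of "srvctl status database -d <db>" on stdin.
# Succeeds only if at least one instance is listed and none of them
# is reported as not running.
# ASSUMPTION: a stopped instance is reported as
#   "Instance <name> is not running on node <node>"
all_instances_running() {
    awk '
        /^Instance /                  { total++ }
        /^Instance .* is not running/ { down++ }
        END { exit !(total > 0 && down == 0) }
    '
}
```

Example usage: srvctl status database -d orcl | all_instances_running && echo "all instances up"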
Status of a single instance
$ srvctl status instance -d orcl -i orcl2
Instance orcl2 is running on node vmlinux2
Status of a named service globally across the database
$ srvctl status service -d orcl -s orcltest
Service orcltest is running on instance(s) orcl2, orcl1
Status of node applications on a particular node
$ srvctl status nodeapps -n vmlinux1
VIP is running on node: vmlinux1
GSD is running on node: vmlinux1
Listener is running on node: vmlinux1
ONS daemon is running on node: vmlinux1
Status of an ASM instance
$ srvctl status asm -n vmlinux1
ASM instance +ASM1 is running on node vmlinux1.
List all configured databases
$ srvctl config database
orcl
Display configuration for our RAC database
$ srvctl config database -d orcl
vmlinux1 orcl1 /u01/app/oracle/product/10.2.0/db_1
vmlinux2 orcl2 /u01/app/oracle/product/10.2.0/db_1
Display all services for the specified cluster database
$ srvctl config service -d orcl
orcltest PREF: orcl2 orcl1 AVAIL:
Display the configuration for node applications - (VIP, GSD, ONS, Listener)
$ srvctl config nodeapps -n vmlinux1 -a -g -s -l
VIP exists.: /vmlinux1-vip/192.168.1.211/255.255.255.0/eth0
GSD exists.
ONS daemon exists.
Listener exists.
Display the configuration for the ASM instance(s)
$ srvctl config asm -n vmlinux1
+ASM1 /u01/app/oracle/product/10.2.0/db_1
All running instances in the cluster
SELECT
inst_id
, instance_number inst_no
, instance_name inst_name
, parallel
, status
, database_status db_status
, active_state state
, host_name host
FROM gv$instance
ORDER BY inst_id;
INST_ID INST_NO INST_NAME PAR STATUS DB_STATUS STATE HOST
-------- -------- ---------- --- ------- ------------ --------- --------
1 1 orcl1 YES OPEN ACTIVE NORMAL vmlinux1
2 2 orcl2 YES OPEN ACTIVE NORMAL vmlinux2
All data files which are in the disk group
select name from v$datafile
union
select member from v$logfile
union
select name from v$controlfile
union
select name from v$tempfile;
NAME
-------------------------------------------
+FLASH_RECOVERY_AREA/orcl/controlfile/current.256.578676737
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_1.257.578676745
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_2.258.578676759
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_3.259.578682963
+FLASH_RECOVERY_AREA/orcl/onlinelog/group_4.260.578682987
+ORCL_DATA1/orcl/controlfile/current.256.578676735
+ORCL_DATA1/orcl/datafile/example.263.578676853
+ORCL_DATA1/orcl/datafile/indx.270.578685723
+ORCL_DATA1/orcl/datafile/sysaux.261.578676829
+ORCL_DATA1/orcl/datafile/system.259.578676767
+ORCL_DATA1/orcl/datafile/undotbs1.260.578676809
+ORCL_DATA1/orcl/datafile/undotbs1.271.578685941
+ORCL_DATA1/orcl/datafile/undotbs2.264.578676867
+ORCL_DATA1/orcl/datafile/undotbs2.272.578685977
+ORCL_DATA1/orcl/datafile/users.265.578676887
+ORCL_DATA1/orcl/datafile/users.269.578685653
+ORCL_DATA1/orcl/onlinelog/group_1.257.578676739
+ORCL_DATA1/orcl/onlinelog/group_2.258.578676753
+ORCL_DATA1/orcl/onlinelog/group_3.266.578682951
+ORCL_DATA1/orcl/onlinelog/group_4.267.578682977
+ORCL_DATA1/orcl/tempfile/temp.262.578676841
21 rows selected.
All ASM disks that belong to the 'ORCL_DATA1' disk group
SELECT path
FROM v$asm_disk
WHERE group_number IN (select group_number
from v$asm_diskgroup
where name = 'ORCL_DATA1');
PATH
----------------------------------
ORCL:VOL1
ORCL:VOL2
At this point, everything has been installed and configured
for Oracle10g RAC. We have all of the required software
installed and configured plus we have a fully functional
clustered database.
# su - oracle
$ hostname
vmlinux1
Stopping the Oracle10g RAC Environment
The first step is to stop the Oracle instance. Once the instance (and related services)
is down, bring down the ASM instance. Finally, shut down
the node applications (Virtual IP, GSD, TNS Listener, and ONS).
$ export ORACLE_SID=orcl1
$ emctl stop dbconsole
$ srvctl stop instance -d orcl -i orcl1
$ srvctl stop asm -n vmlinux1
$ srvctl stop nodeapps -n vmlinux1
Starting the Oracle10g RAC Environment
The first step is to start the node applications (Virtual IP, GSD, TNS Listener, and ONS).
Once the node applications are successfully started, then bring up the ASM instance.
Finally, bring up the Oracle instance (and related services) and the
Enterprise Manager Database console.
$ export ORACLE_SID=orcl1
$ srvctl start nodeapps -n vmlinux1
$ srvctl start asm -n vmlinux1
$ srvctl start instance -d orcl -i orcl1
$ emctl start dbconsole
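Because the stop and start sequences above are mirror images of each other, it can help to keep them in one place. The following is a sketch only: the function name rac_commands is hypothetical, and it prints the commands rather than running them so that the order can be reviewed first. It assumes ORACLE_SID is already exported for the emctl calls, as in the examples above.

```shell
#!/bin/sh
# rac_commands ACTION DB INSTANCE NODE
# Prints (does not run) the per-node commands in the order described in the
# article: start goes nodeapps -> ASM -> instance -> dbconsole, and stop is
# the reverse. (Hypothetical helper for illustration.)
rac_commands() {
    action=$1 db=$2 inst=$3 node=$4
    case $action in
        start)
            echo "srvctl start nodeapps -n $node"
            echo "srvctl start asm -n $node"
            echo "srvctl start instance -d $db -i $inst"
            echo "emctl start dbconsole"
            ;;
        stop)
            echo "emctl stop dbconsole"
            echo "srvctl stop instance -d $db -i $inst"
            echo "srvctl stop asm -n $node"
            echo "srvctl stop nodeapps -n $node"
            ;;
        *)
            echo "usage: rac_commands start|stop DB INSTANCE NODE" >&2
            return 1
            ;;
    esac
}
```

For example, "rac_commands stop orcl orcl1 vmlinux1" prints the four stop commands for the first node. Once the order looks right, the output can be piped to sh to execute it.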
Start / Stop All Instances with SRVCTL
Start / Stop all of the instances and their enabled services. I just included
this for fun as a way to bring down all instances!
$ srvctl start database -d orcl
$ srvctl stop database -d orcl
Now that we have both Oracle instances running and registered, let's start performing
some tests with the new environment. This section contains several
tests that can be run against your new RAC environment.
How Many Instances are Running?
Let's start by performing a simple query on all instances:
SQL> select instance_number, instance_name from gv$instance order by 1;
INSTANCE_NUMBER INSTANCE_NAME
--------------- ----------------
1 orcl1
2 orcl2
Which Instance am I Logged In To?
To answer this question, we can simply query the normal v$instance view:
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
orcl1
What Are the Names of All GV$ Views?
Simply query from DBA_OBJECTS as follows:
SQL> select object_name from dba_objects where object_name like 'GV$%' order by 1;
OBJECT_NAME
---------------------------------------
GV$ACCESS
GV$ACTIVE_INSTANCES
GV$ACTIVE_SERVICES
GV$ACTIVE_SESSION_HISTORY
GV$ACTIVE_SESS_POOL_MTH
GV$ADVISOR_PROGRESS
GV$ALERT_TYPES
GV$AQ
GV$AQ1
GV$ARCHIVE
GV$ARCHIVED_LOG
GV$ARCHIVE_DEST
GV$ARCHIVE_DEST_STATUS
GV$ARCHIVE_GAP
GV$ARCHIVE_PROCESSES
... < SNIP > ...
GV$TRANSACTION
GV$TRANSACTION_ENQUEUE
GV$TRANSPORTABLE_PLATFORM
GV$TSM_SESSIONS
GV$TYPE_SIZE
GV$UNDOSTAT
GV$VERSION
GV$VPD_POLICY
GV$WAITCLASSMETRIC
GV$WAITCLASSMETRIC_HISTORY
GV$WAITSTAT
GV$WALLET
GV$XML_AUDIT_TRAIL
GV$_LOCK
384 rows selected.
OCFS2 - Configure O2CB to Start on Boot
In releases of OCFS2 prior to 1.2.1, a bug prevents the
driver from being loaded on each boot even after configuring the on-boot properties
to do so. After configuring the on-boot properties
according to the official OCFS2 documentation, you will still get the following error
on each boot:
...
Mounting other filesystems:
mount.ocfs2: Unable to access cluster service
Cannot initialize cluster mount.ocfs2:
Unable to access cluster service Cannot initialize cluster [FAILED]
...
Red Hat changed the way the service is registered between chkconfig-1.3.11.2-1 and
chkconfig-1.3.13.2-1. The O2CB script used to work with the former.
After resolving the bug I listed above, we can now
continue to set the on-boot properties as follows:
### BEGIN INIT INFO
# Provides: o2cb
# Required-Start:
# Should-Start:
# Required-Stop:
# Default-Start: 2 3 5
# Default-Stop:
# Description: Load O2CB cluster services at system boot.
### END INIT INFO
# chkconfig --del o2cb
# chkconfig --add o2cb
# chkconfig --list o2cb
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off
# ll /etc/rc3.d/*o2cb*
lrwxrwxrwx 1 root root 14 Sep 29 11:56 /etc/rc3.d/S24o2cb -> ../init.d/o2cb
The service should be S24o2cb in the default runlevel.
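The runlevel check above can also be scripted. This is a minimal sketch with a hypothetical function name (runlevel_on); it only parses the text printed by "chkconfig --list" and makes no changes to the system.

```shell
#!/bin/sh
# runlevel_on LEVEL
# Reads one line of "chkconfig --list <service>" output on stdin and
# succeeds if the service is "on" in the given runlevel.
runlevel_on() {
    awk -v want="$1:on" '
        { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    '
}
```

Example usage: chkconfig --list o2cb | runlevel_on 3 && echo "o2cb starts in runlevel 3"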
# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
OCFS2 - Adjusting the O2CB Heartbeat Threshold
With previous versions of this article, I was able to install and configure OCFS2, format the new
volume, and finally install Oracle Clusterware (with its two required shared files,
the voting disk and OCR file, located on the new OCFS2 volume). Although I was able to
install Oracle Clusterware and see the shared virtual drive, I was
receiving many lock-ups and hangs after about 15 minutes when the Clusterware software
was running on both nodes. The node that hung varied (either
vmlinux1 or vmlinux2 in my example). It also didn't matter whether there
was a high I/O load or none at all for it to crash (hang).
...
Index 0: took 0 ms to do submit_bio for read
Index 1: took 3 ms to do waiting for read completion
Index 2: took 0 ms to do bio alloc write
Index 3: took 0 ms to do bio add page write
Index 4: took 0 ms to do submit_bio for write
Index 5: took 0 ms to do checking slots
Index 6: took 4 ms to do waiting for write completion
Index 7: took 1993 ms to do msleep
Index 8: took 0 ms to do allocating bios for read
Index 9: took 0 ms to do bio alloc read
Index 10: took 0 ms to do bio add page read
Index 11: took 0 ms to do submit_bio for read
Index 12: took 10006 ms to do waiting for read completion
(13,3):o2hb_stop_all_regions:1888 ERROR: stopping heartbeat on all active regions.
Kernel panic - not syncing: ocfs2 is very sorry to be fencing this system by panicing
# cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
7
We see that the value is 7, but what does this value represent? Well,
it is used in the formula below to determine the fence time (in seconds):
[fence time in seconds] = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2
So, with an O2CB heartbeat threshold of 7, we would have a fence time of:
(7 - 1) * 2 = 12 seconds
If we want a larger threshold (say 1,200 seconds), we would need
to adjust O2CB_HEARTBEAT_THRESHOLD to 601 as shown below:
(601 - 1) * 2 = 1,200 seconds
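The arithmetic above is easy to script when trying different values. The following is a small sketch using the formula exactly as stated in this article; the function names fence_time and threshold_for are hypothetical, and threshold_for assumes the desired fence time is an even number of seconds.

```shell
#!/bin/sh
# Fence time in seconds for a given O2CB_HEARTBEAT_THRESHOLD,
# using the formula from the article: (threshold - 1) * 2.
fence_time() {
    echo $(( ($1 - 1) * 2 ))
}

# Inverse: the threshold needed for a desired fence time in seconds.
# Assumes an even number of seconds (integer division is used).
threshold_for() {
    echo $(( $1 / 2 + 1 ))
}

fence_time 7         # prints 12
fence_time 601       # prints 1200
threshold_for 1200   # prints 601
```

The result of threshold_for is the value to place in O2CB_HEARTBEAT_THRESHOLD.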
/etc/sysconfig/o2cb
# O2CB_ENABELED: 'true' means to load the driver on boot.
O2CB_ENABLED=true
# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2
# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=601
# umount /u02/oradata/orcl/
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Load O2CB driver on boot (y/n) [y]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Writing O2CB configuration: OK
Loading module "configfs": OK
Mounting configfs filesystem at /config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting cluster ocfs2: OK
We can now check again to make sure the settings took effect
for the o2cb cluster stack:
# cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
601
It is important to note that the value of 601 I used for the
O2CB heartbeat threshold will not work for all configurations.
In some cases, the O2CB heartbeat
threshold value had to be increased to as high as 901 in order to prevent
OCFS2 from panicking the kernel.
Jeffrey Hunter is an Oracle Certified Professional, Java Development Certified Professional, Author,
and an Oracle ACE.
Jeff currently works as a Senior Database Administrator for
The DBA Zone, Inc. located in Pittsburgh, Pennsylvania.
His work includes advanced performance tuning, Java and PL/SQL programming, capacity
planning, database security, and physical / logical database design in a UNIX,
Linux, and Windows server environment. Jeff's other interests include mathematical
encryption theory, programming language processors (compilers and interpreters)
in Java and C, LDAP, writing web-based database administration tools, and of
course Linux. He has been a Sr. Database Administrator and Software Engineer
for over 16 years and maintains his own website at:
http://www.iDevelopment.info.
Jeff graduated from Stanislaus State University in Turlock,
California, with a Bachelor's degree in Computer Science.
Wednesday, 22-Apr-2009 18:53:28 EDT