Virtualizing Oracle RAC with Solaris Containers
Oracle Solaris Containers
An integral part of the Oracle Solaris 10 Operating System, Oracle Solaris Containers isolate software applications and services using flexible, software-defined boundaries and allow many private execution environments to be created within a single instance of Oracle Solaris 10. Each environment has its own identity, separate from the underlying hardware, and each behaves as if it is running on its own operating system, making consolidation simple, safe, and secure. Containers can all share CPU resources, can each have dedicated CPU resources, or can each specify a guaranteed minimum amount of resources as well as a maximum. Memory can be shared among all Oracle Solaris Containers, or each can have a specified memory cap. Physical I/O resources such as disk and network can be dedicated to individual Oracle Solaris Containers, shared by some, or shared by all. Regardless of what is shared or dedicated, each virtualized environment has isolated access to its local file system and networking, as well as to its system and user processes. This makes Containers ideal for consolidating a number of applications on a single server: the cost and complexity of managing numerous machines make it advantageous to consolidate several applications on larger, more scalable servers. Containers also enable more efficient resource utilization. Dynamic resource reallocation permits unused resources to be shifted to other Oracle Solaris Containers as needed, and fault and security isolation mean that poorly behaved applications do not require a dedicated and under-utilized system.
There are two ways to create an Oracle Solaris Container. The first is the 'sparse root' Container, where the root file system of the Container is mounted read-only from the global zone; it occupies less space on the file system and is quick to create. The second is the 'whole root' Container, where the root file system is mounted read-write and all the packages required by the Container are installed inside it; a whole root Container is like a typical standalone system.
There are also two types of networking available for Containers. With 'shared-IP', the NIC is shared with the global zone; an IP address cannot be plumbed from within the Container, and only the IP addresses configured for the Container are brought up at boot time. With 'exclusive-IP', a dedicated NIC is assigned to the Container, and the IP addresses on that NIC can be configured from within the Container.
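As a minimal sketch, a whole root, exclusive-IP Container could be configured with zonecfg roughly as below; the zone name, zonepath, and NIC name are placeholders and not part of the setup described later in this document.
root@host:/# zonecfg -z myzone
zonecfg:myzone> create -b
zonecfg:myzone> set zonepath=/zones/myzone
zonecfg:myzone> set ip-type=exclusive
zonecfg:myzone> add net
zonecfg:myzone:net> set physical=e1000g2
zonecfg:myzone:net> end
zonecfg:myzone> verify
zonecfg:myzone> commit
zonecfg:myzone> exit
Using plain 'create' instead of 'create -b' starts from the default template, which on Oracle Solaris 10 produces a sparse root zone.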
Oracle RAC database
Oracle RAC is a highly available and scalable RDBMS. It can adjust dynamically to resource changes in its environment. Oracle RAC's data files are hosted on shared storage, either NAS or SAN based, connected to all the participating systems. A low-latency, high-throughput private interconnect is recommended for Oracle RAC inter-node communication. A minimum of 2 physical systems is required to provide high availability, to mitigate the failure of one node; for increased scalability and availability, add further physical servers or nodes.
To host Oracle RAC in an Oracle Solaris Containers environment, the following points should be considered:
• Provision one Oracle Solaris Container per node or server.
o This achieves high availability at the instance or node level: if one node goes down, the instances on the other nodes remain available.
o Multiple such Oracle RAC databases can be consolidated onto one set of physical servers.
o In addition, because an Oracle Solaris Container is a highly secure and isolated environment, these environments can be handed over as independent nodes to different DBAs managing different Oracle RAC database installations, each with root access to their own Oracle Solaris Container.
• Name space isolation allows Containers belonging to different Oracle RAC databases to be configured with different time zones on a given server; in practice, though, the same time zone is configured across all the nodes hosting a particular Oracle RAC database.
o Name space isolation also allows these contained environments to be configured as independent systems, each pointing to a different name server, be it DNS, NIS, etc.
o An independent process tree per Container provides process-level isolation, so processes running in one Container have no impact on the other Containers; if such a process crashes and reboots its environment, the other Containers are untouched.
• Use a ZFS file system to host the Oracle RAC database binaries, or create the Container environment itself on ZFS and host the database binaries on it (see the ZFS cloning sketch after this list). This gives advantages such as:
o Faster deployment of the Oracle binaries and of the Container environment, by cloning the ZFS file system hosting them.
o The file system size can be increased dynamically and transparently when the file system is nearly full.
• Leverage ASM to host the Oracle RAC database data files (see the disk group sketch after this list), as it offers:
o Greater availability of the LUNs through redundancy; ASM can mirror disks or LUNs across controllers, thereby increasing availability.
o An optimal way to store and retrieve the data, since Oracle has a better understanding of its own data.
• The host administrator can flexibly change the CPU resources assigned to a Container to accommodate workload needs.
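The ZFS cloning mentioned above could look roughly like the following; the pool and file system names are illustrative placeholders.
root@port01:/# zfs snapshot zonespool/orabin@golden
root@port01:/# zfs clone zonespool/orabin@golden zonespool/cont10g02-orabin
root@port01:/# zfs set quota=20G zonespool/cont10g02-orabin
The snapshot is taken once from a prepared Oracle binaries file system; each clone is created almost instantly and initially consumes no extra space, and the quota can later be raised transparently if the file system fills up.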
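Similarly, the ASM normal redundancy disk group mentioned above could be created roughly as follows from SQL*Plus on the ASM instance; the disk group and failure group names are illustrative, and the two LUN paths are reused from the sample configuration later in this document, assuming one LUN from each storage array.
SQL> CREATE DISKGROUP DATA NORMAL REDUNDANCY
  2  FAILGROUP storage1 DISK '/dev/rdsk/c5t600A0B800011FC3E00000E074BBE32EAd0s6'
  3  FAILGROUP storage2 DISK '/dev/rdsk/c5t600A0B800011FC3E00000E194BBE3514d0s6';
With normal redundancy, ASM mirrors the data between the two failure groups, so the loss of an entire array leaves a complete copy on the surviving one.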
Deploy Oracle RAC without Virtualization
To host Oracle RAC without virtualization, the following applies:
• A minimum of 2 to 3 nodes is needed for high availability, along with shared storage to host the data files and the required private and public networking for the Oracle Cluster environment.
• However, there is no scope for hosting another version or a different patch level of Oracle RAC on the same set of systems.
• Even though the overall CPU and other resource utilization might be just 20 to 25%, hosting another setup with a different version requires another set of systems to be configured, installed, and managed.
Drawing 1 below shows the following:
• There are 4 Sun SPARC Enterprise T5220 servers configured in the RAC environment, hosting the Oracle 10gR2 RAC software with 2 different databases in the global zone, that is, in the host's operating system environment.
• There are 2 Sun StorageTek 6140 arrays with redundant controllers. Each redundant controller is connected to redundant HBAs on the systems. On the systems, multipath I/O (MPxIO) is used to provide highly available connectivity to the LUNs on the disk arrays.
• Two color codes are used in the drawing to show the 2 different controllers.
• From each storage controller, say storage1, cables are connected to port1 on slot4 and slot5. The same is done for storage2, except that port2 is used.
• MPxIO provides HA across slot4 and slot5, and a single disk name is presented instead of two (see the MPxIO sketch after this list).
• An Oracle Automatic Storage Management (ASM) normal redundancy disk group is used across the 2 LUNs of port1 and port2; it provides HA even against the failure of an entire storage array.
• Public network cables are connected to switch1 and switch2 from e1000g0 and e1000g1; an IPMP group ipmp0 provides HA for the public network, hosting the host IP and VIP addresses.
• These switches are connected to the public network; in a lab environment they can have an uplink port configured.
• Private network cables are connected to switch3 and switch4 from e1000g2 and e1000g3; an IPMP group ipmp1 provides HA for the private network on the host. The private IP address is configured and brought up as the host comes up, and the ipmp1 group comes up along with it.
• Oracle RAC leverages this IP address as its private IP address when the Oracle clusterware comes up.
• These switches have an uplink port configured as a trunk port to pass the traffic; if a component of one switch fails, the link fails over transparently to the other switch.
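The MPxIO configuration referred to above can be enabled and verified with the stmsboot and mpathadm utilities; a rough sketch follows, with the host prompt being illustrative.
root@port01:/# stmsboot -e
(enables MPxIO on the supported fibre channel controller ports and asks for a reboot)
root@port01:/# stmsboot -L
(after the reboot, lists the mapping between the original per-path device names and the single MPxIO device name)
root@port01:/# mpathadm list lu
(lists each multipathed LUN together with its operational path count)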
Oracle RAC Deployment with Virtualization
The virtualized environment is Oracle Solaris Containers, the built-in technology explained earlier. To deploy Oracle RAC inside Oracle Solaris Containers, extend the learning from the previous chapter, as the high-level deployment inside this virtual environment is essentially the same. Consider the following points while deploying Oracle RAC inside Oracle Solaris Containers.
Drawing 2 below is a typical deployment scenario and shows the following:
• There are 4 nodes, each hosting 2 Container environments, running 2 different versions of the Oracle RAC database, 10gR2 and 11gR1, each version on its own Oracle Clusterware inside the Containers.
• One Container is created per node for a given Oracle Clusterware environment, thereby achieving HA at the instance level.
• If a node fails, the other Container environment on that node also goes down; this has no major impact, because only one Container, or virtual node, per cluster becomes unavailable.
• The Containers cont10g01 to cont10g04 and cont11g01 to cont11g04 are hosted on the physical nodes port01 to port04, respectively.
• Two cores (16 threads) are assigned per Container.
• The public and private networking shares the same set of physical NICs between the 2 Oracle Clusterware environments without either impacting the other, by using VLAN-tagged NICs (see the naming sketch after this list). Since Oracle Clusterware plumbs the VIP, exclusive-IP Containers are created and VLAN-tagged NICs are assigned instead of physical NICs.
• NICs e1000g0 and e1000g1 are connected to switch1 and switch2.
o Ports of these switches are configured as trunk ports to allow VLAN traffic with VLAN tags 131 and 132 for the Oracle 10gR2 and Oracle 11gR1 environments.
o These are the public NICs on which the VIP is hosted; it is plumbed by the VIP service of Oracle clusterware.
o For example, inside the cont10g01 Container these NICs appear as e1000g131000 and e1000g131001, and an IPMP group ipmp0 is created to provide HA at this network layer.
• NICs e1000g2 and e1000g3 are connected to switch3 and switch4.
o Ports of these switches are configured as trunk ports to allow VLAN traffic with VLAN tags 111, 112, and 113 for the Oracle 10gR2, Oracle 11gR1, and Oracle 11gR2 environments.
o The private IPs are configured by the Container and by Oracle clusterware.
o For example, inside the cont10g01 Container these NICs appear as e1000g111002 and e1000g111003, and an IPMP group ipmp1 is created to provide HA at this network layer.
• For simplicity, the same IPMP group names can be used across all the Container environments of a given Clusterware.
• A set of dedicated storage LUNs is assigned per Oracle Clusterware environment, and the same set of LUNs is configured for the corresponding Containers across all the nodes. The physical connectivity is the same as explained in the previous chapter.
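The VLAN-tagged interface names used above follow the Solaris 10 naming convention, in which the interface instance number is the VLAN ID multiplied by 1000 plus the physical instance number. For the interfaces in this setup:
VLAN 131 on e1000g0  ->  e1000g131000   (131 x 1000 + 0, public, 10gR2)
VLAN 131 on e1000g1  ->  e1000g131001   (131 x 1000 + 1, public, 10gR2)
VLAN 111 on e1000g2  ->  e1000g111002   (111 x 1000 + 2, private, 10gR2)
VLAN 111 on e1000g3  ->  e1000g111003   (111 x 1000 + 3, private, 10gR2)
In the global zone such an interface could be brought up simply by plumbing it, for example 'ifconfig e1000g131000 plumb'; for the exclusive-IP Containers used here, the VLAN-tagged name is instead assigned with 'set physical=...' in the zone configuration, as shown in the sample configuration file at the end of this document.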
Conclusion
• Oracle Solaris Containers are the best choice for consolidating various Oracle RAC databases with different versions or patch levels, which can all be hosted on the same operating system kernel.
• Oracle Solaris Containers enable end users to manage resources well, pooling resources from under-utilized Containers and provisioning them to the Containers that need them.
• In addition, the accounting feature of Solaris Resource Manager (SRM) can be leveraged to bill end users based on the CPUs consumed; a grid environment, for example, could use this feature for billing based on resource utilization.
• Dynamic resource management, such as changing memory and CPU allocations, offers greater flexibility.
• VLAN-tagged NICs overcome the hard limit on the number of physical NICs in a server. To support this, the network switches must also support VLAN tagging.
• IPMP takes care of availability at the network stack, in addition to Oracle RAC's own monitoring and timeouts for detecting network failures.
• MPxIO offers seamless availability by presenting a single disk name for the different paths to the same disk/LUN. It manages the availability of the different paths to the LUNs/disks from the storage arrays to the hosts.
• ASM offers the benefits of a cluster volume manager: disks/LUNs from different storage arrays can be mirrored to provide HA across two different storage arrays.
Sample Container configuration
Hardware:
Sun SPARC Enterprise T5220 (an M-series or x86 based system could also be used to host the Container virtual environment).
4 Sun SPARC Enterprise T5220 servers, each configured with a 1.6 GHz processor (8 cores, 64 threads) and 64 GB RAM.
Storage:
Sun StorageTek 6140 array with dual controllers.
OS:
Solaris 10 10/09 SPARC with kernel patch 142900-14 and its dependent patch 143055-01.
Oracle database:
• Oracle RAC 10gR2 10.2.0.4 with latest patch set 9352164
• Oracle RAC 11gR1 11.1.0.7 with latest patch set 9207257 + 9352179
Configuration files
Host system configuration files:
1. Limit the ZFS cache to 1 GB.
Edit /etc/system and add the following line (0x3E800000 is roughly 1 GB):
set zfs:zfs_arc_max = 0x3E800000
2. By default on Solaris 10, shared memory is limited system wide to 1/4 of physical memory. To override that limit, the shminfo_shmmax tunable needs to be configured in /etc/system. Edit /etc/system and set shminfo_shmmax to a value that suits the requirement, for example:
set shminfo_shmmax = 0x600000000
3. Enable MPxIO.
Edit /kernel/drv/fp.conf and make sure it contains the following entry (add it if it is not already present) to enable MPxIO:
mpxio-disable="no";
For the /etc/system values to take effect, reboot the node.
Configuration files inside the Containers:
1. Oracle shared memory configuration using the SRM facility /etc/project:
root@cont11g01:/# cat /etc/project
system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::
user.oracle:100:Oracle project:::process.max-sem-nsems=(privileged,4096,deny);project.max-shm-memory=(privileged,16106127360,deny)
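Instead of editing /etc/project by hand, the same entry could be created with the project administration commands, roughly as below; the project ID and limits match the entry shown above.
root@cont11g01:/# projadd -p 100 -c "Oracle project" user.oracle
root@cont11g01:/# projmod -sK "process.max-sem-nsems=(privileged,4096,deny)" user.oracle
root@cont11g01:/# projmod -sK "project.max-shm-memory=(privileged,16106127360,deny)" user.oracle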
2. IPMP configuration
a. Public network
root@cont10g01:/# cat /etc/hostname.e1000g131000
cont10g01 group pub_ipmp0
root@cont10g01:/# cat /etc/hostname.e1000g131001
group pub_ipmp0 up
root@cont10g01:/#
b. Private network
root@cont10g01:/# cat /etc/hostname.e1000g111002
cont10g01-priv group priv_ipmp0
root@cont10g01:/# cat /etc/hostname.e1000g111003
group priv_ipmp0 up
root@cont10g01:/#
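Once the Container has been rebooted with these files in place, the group membership can be checked from inside it; each interface that joined a group reports a groupname line (pub_ipmp0 or priv_ipmp0 here).
root@cont10g01:/# ifconfig -a | grep groupname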
Container configuration file:
Create a file with the following content to create the Containers.
# save the below content as: config_template_to_create_cont10g01.cfg
create -b
set zonepath=/zonespool/cont10g01
set autoboot=false
set limitpriv=default,proc_priocntl,proc_lock_memory
set scheduling-class=TS,RT,FX
set ip-type=exclusive
add net
set physical=e1000g111002
end
add net
set physical=e1000g111003
end
add net
set physical=e1000g131000
end
add net
set physical=e1000g131001
end
add capped-memory
set physical=24G
end
add dedicated-cpu
set ncpus=16
end
add rctl
set name=zone.max-swap
add value (priv=privileged,limit=25769803776,action=deny)
end
add rctl
set name=zone.max-locked-memory
add value (priv=privileged,limit=12884901888,action=deny)
end
add device
set match=/dev/rdsk/c5t600A0B800011FC3E00000E074BBE32EAd0s6
end
add device
set match=/dev/dsk/c5t600A0B800011FC3E00000E074BBE32EAd0s6
end
add device
set match=/dev/rdsk/c5t600A0B800011FC3E00000E194BBE3514d0s6
end
add device
set match=/dev/dsk/c5t600A0B800011FC3E00000E194BBE3514d0s6
end
add device
set match=/dev/rdsk/c5t600A0B800011FC3E00000E234BBE720Cd0s6
end
add device
set match=/dev/dsk/c5t600A0B800011FC3E00000E234BBE720Cd0s6
end
# Copy and paste the above content, changing the disk paths and NIC names to suit your configuration.
# This will create the zone with the name "cont10g01".
# Once installation is complete, boot the Container and configure it.
# Log in at the Container console to configure it for the first time: set the root password, configure networking, configure the time zone, etc. The zone/Container will then reboot.
# Follow the steps on the screen to configure the zone.
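As a rough sketch, the steps described in the comments above correspond to the following commands, run from the global zone; the zone and file names match the template above, and the host prompt is illustrative.
root@port01:/# zonecfg -z cont10g01 -f config_template_to_create_cont10g01.cfg
root@port01:/# zoneadm -z cont10g01 install
root@port01:/# zoneadm -z cont10g01 boot
root@port01:/# zlogin -C cont10g01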