Deploying VMware High Availability & Fault Tolerance
Copyright © 2014 Storageflex Inc. All rights reserved. Storageflex is a registered trademark of Storageflex. All other marks
and names mentioned herein may be trademarks of their respective owners. The information contained herein is subject to
change without notice. Content provided as is, without express or implied warranties of any kind.
Deploying VMware High Availability & Fault Tolerance cluster on HA3969U (NFS)
Table of Contents
High Availability
Fault Tolerance
VMware provides two solutions that aim to keep your virtual machines available
at close to 100% uptime. The first is High Availability (HA), which
automatically restarts VMs on a secondary ESXi server if the primary one fails.
Although fully automatic, VMware HA cannot completely meet the needs of
business-critical operations that require 100% uptime, because the VMs are
unavailable while they restart. That is where Fault Tolerance (FT) comes in. FT
keeps an up-to-date shadow copy of the original VM on a second ESXi server,
which eliminates service downtime during the switch between the original and
secondary VMs. Both solutions are described below.
High Availability
VMware gathers all virtual machines into a shared resource pool, or "cluster".
After HA is enabled for a cluster, it starts to monitor the availability of the
ESXi servers. If one of the servers fails, its VMs are restarted on the other
servers.
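The restart behavior described above can be sketched as a toy model. This is only an illustration, not how vSphere HA is implemented: the function name is hypothetical, the placement of "ubuntu" on 172.24.110.35 is assumed for the example, and the "fewest VMs" placement rule is a simplification.

```python
from typing import Dict, List


def restart_vms_after_host_failure(
    placement: Dict[str, List[str]], failed_host: str
) -> Dict[str, List[str]]:
    """Return a new host -> VMs mapping after `failed_host` goes down.

    Each VM from the failed host is restarted on the surviving host that
    currently runs the fewest VMs. This rule is an assumption made for the
    sketch; it is not the placement logic of vSphere HA.
    """
    # Copy the VM lists so the caller's mapping is left untouched.
    survivors = {h: list(vms) for h, vms in placement.items() if h != failed_host}
    if not survivors:
        raise RuntimeError("no surviving host left to restart VMs on")
    for vm in placement.get(failed_host, []):
        target = min(survivors, key=lambda h: len(survivors[h]))
        survivors[target].append(vm)
    return survivors


# Illustrative two-host cluster (VM placement assumed for the example).
cluster = {
    "172.24.110.35": ["ubuntu"],
    "172.24.110.53": ["linux_vm"],
}
after = restart_vms_after_host_failure(cluster, "172.24.110.35")
print(after)
```

The point of the model is that the VMs are restarted, so they are briefly unavailable; nothing is kept running in parallel as it is with FT.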
Fault Tolerance
Fault Tolerance (FT) provides continuous availability for VMs in case of ESXi
server failure. FT builds on existing vSphere HA clusters and uses vLockstep
technology to protect specific business-critical VMs by keeping identical
copies of them (or "shadow VMs") on secondary ESX/ESXi servers. When the
primary server fails, the shadow VMs take over instantly, which avoids downtime
and the loss of data, transactions, or connections.
To prepare the test environment for this application note, the following minimum
hardware is required:
• At least two VMware compatible hardware servers running ESXi server
• At least one Storageflex HA3969U storage system
• At least one LAN switch/router
http://pubs.vmware.com/vsphere-55/index.jsp#com.vmware.vsphere.avail.doc/GUID-BA85FEC4-A37C-45BA-938D-37B309010D93.html
In order to protect business-critical VMs with FT, the following additional
requirements should be met:
• There should be at least two hosts running the same FT version/host build
number
• Protected VMs must be stored in virtual RDM or virtual machine disk
(VMDK) files that are thick provisioned
The full list of FT requirements can be found in the VMware vSphere 5.5
Documentation Center:
http://pubs.vmware.com/vsphere-55/index.jsp#com.vmware.vsphere.avail.doc/GUID-83FE5A45-8260-436B-A603-B8CBD2A1A611.html
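As a rough illustration, the two FT requirements listed above can be written as a pre-check. This is a minimal sketch, not a VMware tool: the `Host` and `Disk` types, the host names, and the build numbers are all hypothetical, and the sketch reads "thick provisioned" as applying to VMDK files only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    build: str  # host build number (placeholder values in this sketch)


@dataclass
class Disk:
    kind: str  # "vmdk" or "virtual_rdm"
    thick_provisioned: bool


def ft_precheck(hosts: List[Host], disks: List[Disk]) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    builds = [h.build for h in hosts]
    # Requirement 1: at least two hosts share the same build number.
    if not any(builds.count(b) >= 2 for b in builds):
        problems.append("need at least two hosts with the same build number")
    # Requirement 2: disks are virtual RDMs or thick-provisioned VMDKs.
    for i, d in enumerate(disks):
        if d.kind not in ("vmdk", "virtual_rdm"):
            problems.append(f"disk {i}: unsupported type {d.kind!r}")
        elif d.kind == "vmdk" and not d.thick_provisioned:
            problems.append(f"disk {i}: VMDK is not thick provisioned")
    return problems


hosts = [Host("esxi-a", "1000001"), Host("esxi-b", "1000001")]
print(ft_precheck(hosts, [Disk("vmdk", True)]))
print(ft_precheck(hosts, [Disk("vmdk", False)]))
```

The sketch covers only the two requirements named in this note; the full list in the Documentation Center contains more.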
In order to create an HA-protected cluster, both ESXi hosts should have at
least two shared datastores (as per VMware requirements).
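One way to mount the shared NFS datastores from the command line is the `esxcli storage nfs add` command, run on each ESXi host. The sketch below only builds the command lines; it does not run them. The NFS server address and export paths are hypothetical placeholders, and the datastore names follow the example used in this note.

```python
from typing import List


def nfs_add_command(nfs_server: str, export_path: str, datastore: str) -> List[str]:
    """Build the argument list that mounts one NFS export as a datastore."""
    return [
        "esxcli", "storage", "nfs", "add",
        f"--host={nfs_server}",
        f"--share={export_path}",
        f"--volume-name={datastore}",
    ]


# Hypothetical values: replace with the address and exports of your storage.
NFS_SERVER = "192.0.2.10"
EXPORTS = {"nfs_db": "/export/nfs_db", "nfs_db2": "/export/nfs_db2"}

commands = [nfs_add_command(NFS_SERVER, path, name) for name, path in EXPORTS.items()]
for cmd in commands:
    print(" ".join(cmd))
```

The same two commands would be repeated on every host in the cluster, so that each datastore is shared by all of them.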
After the two datastores (e.g. "nfs_db" and "nfs_db2") are successfully added to
both of the ESXi hosts, you can open vSphere Client and check their information
in the Status column of the Storage panel.
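If fewer than two shared datastores are available to a host, vSphere shows a
warning such as one of the following:
The number of heartbeat datastores for host is 1, which is less than required: 2
The number of heartbeat datastores for host is 0, which is less than required: 2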
If you still want to use just one shared datastore without seeing the above
message, you will need to add the "das.ignoreInsufficientHbDatastore" entry to
the cluster settings. Refer to the VMware knowledge base article for further
information:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004739
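The relationship between the datastore count, the warning, and the override can be sketched as follows. This is a model for illustration only, not vSphere code: the function name is hypothetical, and it assumes the option suppresses the warning when its value is "true".

```python
from typing import Dict, Optional

REQUIRED_HEARTBEAT_DATASTORES = 2


def heartbeat_warning(
    shared_datastores: int, advanced_options: Dict[str, str]
) -> Optional[str]:
    """Return the warning text, or None when no warning would be shown."""
    # Assumption: the value "true" switches the warning off.
    ignore = advanced_options.get("das.ignoreInsufficientHbDatastore", "false")
    if ignore.lower() == "true":
        return None
    if shared_datastores >= REQUIRED_HEARTBEAT_DATASTORES:
        return None
    return (
        f"The number of heartbeat datastores for host is {shared_datastores}, "
        f"which is less than required: {REQUIRED_HEARTBEAT_DATASTORES}"
    )


print(heartbeat_warning(1, {}))
print(heartbeat_warning(1, {"das.ignoreInsufficientHbDatastore": "true"}))
```

Note that the override only hides the message; with a single datastore the cluster still has no second heartbeat path.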
Open vSphere Web Client or vSphere Client, navigate to the datacenter where
you want to create the cluster, and then click New Cluster.
Keep in mind that if you enable HA before ESXi hosts are added to the cluster,
the cluster will not be fully functional until hosts join it.
You can also choose whether to enable vSphere DRS (Distributed Resource
Scheduler, which is responsible for automated load balancing). DRS is
compatible with vSphere HA and is therefore enabled in our example. For more
information about vSphere DRS, refer to the VMware vSphere features overview
web page:
http://www.vmware.com/products/vsphere/features/drs-dpm.html
In vSphere Web Client or vSphere Client, navigate to the cluster you just
created and select Add Host. You will need to provide the host name, user name,
and password for each host.
If you decide to enable vSphere HA after adding hosts to the cluster, from the
details panel you will see the status of HA agent installation for the newly added
hosts.
In our example (shown below), you can see the process of configuring vSphere
HA after two ESXi hosts are added to the cluster.
vSphere HA is now successfully configured for cluster1, which has two ESXi
hosts (172.24.110.35 and 172.24.110.53).
After an HA-protected cluster is successfully created, you can take VMware High
Availability features to the next level by protecting business-critical VMs
with vSphere Fault Tolerance.
Navigate to the virtual machine you want to protect with Fault Tolerance (e.g.
"linux_vm"), right-click on it, and then select either of the following:
• For vSphere Web Client: All vCenter Actions > Fault Tolerance > Turn On
Fault Tolerance.
After the task is complete and FT is successfully enabled for the VM, the
status of the task turns green.
FT information for a host can be seen in the Summary tab. In our example, the
ESXi host inside cluster1 is the master host configured for HA and FT, and it
hosts one primary VM.
The slave host of cluster1 is also HA protected and configured for FT, but the
VM it hosts is a secondary VM (populated from the primary one).
At this point, if you check the VMs for cluster1, you will notice that a second
instance of the FT-protected VM has appeared in the list. The original
FT-protected VM ("linux_vm") is running on ESXi host 172.24.110.53, and the
secondary VM (or "shadow copy") of the original VM is running on ESXi host
172.24.110.42.
You can also check the vSphere map to see the primary and secondary VMs.
After the entire configuration is done, we can start testing the efficiency of
the solution. To do this, we will simulate a vSphere High Availability failover
and then a vSphere Fault Tolerance failover.
Keep in mind that the VM in this example (called "ubuntu") is not FT protected.
Now, right-click on the primary host and choose Shut Down. In a few seconds
you will see the host become unavailable.
Downtime for this VM is a little longer than 1 minute, which is NOT acceptable
for many mission-critical applications.
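If you want to measure the downtime yourself, one option is to probe the VM at a fixed interval (for example with ping) and compute the longest gap between successful probes. The sketch below assumes the probe results are already available as (timestamp, reachable) pairs; the function name is hypothetical and the sample data is synthetic, not a measurement from this test.

```python
from typing import List, Optional, Tuple


def longest_outage(samples: List[Tuple[float, bool]]) -> float:
    """Longest gap, in seconds, between two successful probes that
    have at least one failed probe between them.

    `samples` holds (timestamp_seconds, reachable) pairs in time order.
    An outage that has not ended by the last sample is not counted.
    """
    longest = 0.0
    last_ok: Optional[float] = None
    in_outage = False
    for timestamp, reachable in samples:
        if reachable:
            if in_outage and last_ok is not None:
                longest = max(longest, timestamp - last_ok)
            last_ok = timestamp
            in_outage = False
        else:
            in_outage = True
    return longest


# Synthetic one-second probes: failures from t=2 through t=65.
samples = [(float(t), not (2 <= t <= 65)) for t in range(70)]
print(longest_outage(samples))
```

Because the gap is measured from the last successful probe to the first one after recovery, the result is an upper bound whose precision depends on the probe interval.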
Among all the above scenarios, we are going to use vCenter Server to test the
FT capability by right-clicking an FT-protected VM and selecting Fault
Tolerance > Test Failover.
In the Recent Tasks panel of vSphere Web Client or vSphere Client, you can see
the process finish in just a few seconds.
The protected VM remains available during this test without any interruption,
although there might be a slight delay depending on the network latency.
According to the VMware Fault Tolerance FAQ, this delay is usually less than
1 millisecond (ms).
Refer to the VMware Fault Tolerance FAQ for additional information:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1013428