
Disaster Recovery (Formerly Leap) 5.20

Leap Administration Guide

October 12, 2022
Contents

Leap Overview.............................................................................................................. 4
Leap Terminology...................................................................................................................................................... 8
Nutanix Disaster Recovery Solutions................................................................................................................ 11
Leap Deployment Workflow................................................................................................................................ 12
On-Prem Hardware Resource Requirements................................................................................................ 14

Protection and DR between On-Prem Sites (Leap)...................................... 17


Leap Requirements..................................................................................................................................................18
Leap Limitations...................................................................................................................................................... 22
vGPU Enabled Guest VMs....................................................................................................................... 23
Leap Configuration Maximums.......................................................................................................................... 25
Leap Recommendations....................................................................................................................................... 25
Leap Service-Level Agreements (SLAs)........................................................................................................ 27
Leap Views................................................................................................................................................................. 27
Availability Zones View.............................................................................................................................27
Protection Policies View...........................................................................................................................29
Recovery Plans View................................................................................................................................. 30
Dashboard Widgets.....................................................................................................................................31
Enabling Leap for On-Prem Site.......................................................................................................................32
Pairing Availability Zones (Leap)......................................................................................................................33
Protection and Automated DR..........................................................................................................................36
Protection with Asynchronous Replication Schedule and DR (Leap)....................................36
Protection with NearSync Replication Schedule and DR (Leap).............................................96
Protection with Synchronous Replication Schedule (0 RPO) and DR...................................112
Protection Policy Management............................................................................................................128
Recovery Plan Management................................................................................................................. 134
Manual Disaster Recovery (Leap)................................................................................................................... 137
Creating Recovery Points Manually (Out-of-Band Snapshots)............................................... 137
Replicating Recovery Points Manually.............................................................................................. 137
Recovering a Guest VM from a Recovery Point Manually (Clone).........................................138
Entity Synchronization Between Paired Availability Zones.................................................................. 138
Entity Synchronization Recommendations (Leap)...................................................................... 139
Forcing Entity Synchronization (Leap)............................................................................................ 140

Protection and DR between On-Prem Site and Xi Cloud Service (Xi Leap)......................141
Xi Leap Requirements........................................................................................................................................... 142
Xi Leap Limitations................................................................................................................................................146
Xi Leap Configuration Maximums....................................................................................................................148
Xi Leap Recommendations.................................................................................................................................148
Xi Leap Service-Level Agreements (SLAs).................................................................................................. 148
Xi Leap Views.......................................................................................................................................................... 149
Availability Zones View in Xi Cloud Services.................................................................................149
Protection Policies View in Xi Cloud Services...............................................................................150
Recovery Plans View in Xi Cloud Services......................................................................................152
Dashboard Widgets in Xi Cloud Services........................................................................................153
Enabling Leap in the On-Prem Site............................................................................................................... 153

Xi Leap Environment Setup.............................................................................................................................. 153
Pairing Availability Zones (Xi Leap).................................................................................................. 154
VPN Configuration (On-prem and Xi Cloud Services)............................................................... 154
Nutanix Virtual Networks....................................................................................................................... 174
Xi Leap RPO Sizer.................................................................................................................................................176
Protection and Automated DR (Xi Leap)....................................................................................................180
Protection with Asynchronous Replication and DR (Xi Leap)................................................ 180
Protection with NearSync Replication and DR (Xi Leap)..........................................................212
Protection Policy Management............................................................................................................ 221
Recovery Plan Management.................................................................................................................225
Manual Disaster Recovery (Xi Leap).............................................................................................................228
Creating Recovery Points Manually (Out-of-Band Snapshots).............................................. 228
Replicating Recovery Points Manually............................................................................................. 228
Recovering a Guest VM from a Recovery Point Manually (Clone)........................................229
Entity Synchronization Between Paired Availability Zones................................................................. 229
Entity Synchronization Recommendations (Xi Leap)................................................................. 231
Forcing Entity Synchronization (Xi Leap)....................................................................................... 231

Migrating Guest VMs from a Protection Domain to a Protection Policy........................ 232

Copyright.................................................................................................................... 233

LEAP OVERVIEW
Legacy disaster recovery (DR) configurations use protection domains (PDs) and third-party
integrations to protect your applications. These DR configurations replicate data between
on-prem Nutanix clusters. Protection domains provide limited flexibility in supporting complex
operations (for example, VM boot order and network mapping). With protection domains, you
must perform manual tasks to protect new guest VMs as your application scales up.
Leap offers an entity-centric automated approach to protect and recover applications. It
uses categories to group the guest VMs and automate the protection of the guest VMs
as the application scales. Application recovery is more flexible with network mappings, an
enforceable VM start sequence, and inter-stage delays. Application recovery can also be
validated and tested without affecting your production workloads. Asynchronous, NearSync,
and Synchronous replication schedules ensure that an application and its configuration details
synchronize to one or more recovery locations for a smoother recovery.
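To illustrate the orchestration concepts above (an enforceable VM start sequence with inter-stage delays), the following Python sketch models a recovery plan as an ordered list of stages. The data shape and function are illustrative assumptions for this guide, not the Leap implementation or its API.

```python
import time

def run_recovery_stages(stages, power_on, sleep=time.sleep):
    """Power on VMs stage by stage, honoring inter-stage delays.

    `stages` is a list of dicts: {"vms": [...], "delay_after_secs": int}.
    This models the enforceable VM start sequence of a recovery plan;
    it is an illustrative sketch, not the Leap implementation.
    """
    started = []
    for i, stage in enumerate(stages):
        for vm in stage["vms"]:
            power_on(vm)          # issue the power-on action for this VM
            started.append(vm)
        if i < len(stages) - 1:   # wait before starting the next stage
            sleep(stage.get("delay_after_secs", 0))
    return started
```

For example, a database stage can be given a delay so that dependent application VMs start only after the database has had time to come up.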

Note: You can protect a guest VM either with the legacy DR solution (protection domain-based)
or with Leap, but not with both. For the various Nutanix DR solutions, see Nutanix Disaster
Recovery Solutions on page 11.

Leap works with sets of physically isolated locations called availability zones. An instance of
Prism Central represents an availability zone. One availability zone serves as the primary site for
an application while one or more paired availability zones serve as the recovery sites.

Figure 1: A primary on-prem AZ and one recovery on-prem AZ

Disaster Recovery (Formerly Leap) | Leap Overview | 4


Figure 2: A primary on-prem AZ and two recovery on-prem AZs



Figure 3: A primary on-prem AZ and two recovery AZs: one on-prem recovery AZ and one
recovery AZ in Cloud (Xi Cloud Services)

Figure 4: A primary on-prem AZ and one recovery AZ at Xi Cloud Services



Figure 5: A primary Nutanix cluster and at most two recovery Nutanix clusters at the same
on-prem AZ

Figure 6: A primary site at Xi Cloud Services and recovery on-prem site

When paired, the primary site replicates the entities (protection policies, recovery plans, and
recovery points) to the recovery sites at the specified time intervals (RPO). This approach
enables application recovery at any of the recovery sites when there is a service disruption at
the primary site (for example, a natural disaster or scheduled maintenance). When the primary
site is up and running again, the entities replicate back to it to ensure high availability of
applications. The entities you create or update synchronize continuously between the primary
and recovery sites. This reverse synchronization enables you to create or update entities
(protection policies, recovery plans, or guest VMs) at either the primary or the recovery sites.
This guide is primarily divided into the following two parts.

• Protection and DR between On-Prem Sites (Leap) on page 17


The section walks you through the procedure of application protection and DR to other
Nutanix clusters at the same or different on-prem sites. The procedure also applies to
protection and DR to other Nutanix clusters in supported public cloud.



• Protection and DR between On-Prem Site and Xi Cloud Service (Xi Leap) on page 141
Xi Leap is essentially an extension of Leap to Xi Cloud Services. You can protect applications
and perform DR to Xi Cloud Services or from Xi Cloud Services to an on-prem availability
zone. The section describes application protection and DR from Xi Cloud Services to an
on-prem Nutanix cluster. For application protection and DR to Xi Cloud Services, see the
supported capabilities in Protection and DR between On-Prem Sites (Leap) on page 17
because the protection procedure remains the same when the primary site is an on-prem
availability zone.
Configuration tasks and DR workflows are largely the same regardless of the type of recovery
site. For more information about the protection and DR workflow, see Leap Deployment
Workflow on page 12.

Leap Terminology
The following section describes the terms and concepts used throughout the guide. Nutanix
recommends gaining familiarity with these terms before you begin configuring protection and
Leap or Xi Leap disaster recovery (DR).

Availability Zone
A zone comprising one or more independent data centers interconnected by low-latency
links. An availability zone can either be on your office premises (on-prem) or in Xi Cloud
Services. Availability zones are physically isolated from each other to ensure that a disaster
at one availability zone does not affect another. An instance of Prism Central represents an
on-prem availability zone.

Note: An availability zone is referred to as a site throughout this document.

On-Prem Availability Zone


An availability zone (site) in your premises.

Xi Cloud Services
A site in the Nutanix Enterprise Cloud Platform (Xi Cloud Services).

Primary Availability Zone


A site that initially hosts guest VMs you want to protect.

Recovery Availability Zone


A site where you can recover the protected guest VMs when a planned or unplanned event
causes downtime at the primary site. You can configure at most two recovery sites for a
guest VM.

Nutanix Cluster
A cluster running AHV or ESXi nodes on an on-prem availability zone, Xi Cloud Services, or any
supported public cloud. Leap does not support guest VMs from Hyper-V clusters.

Prism Element
The GUI that provides you the ability to configure, manage, and monitor a single Nutanix
cluster. It is a service built into the platform for every Nutanix cluster deployed.



Prism Central
The GUI that allows you to monitor and manage many Nutanix clusters (Prism Element running
on those clusters). Prism Starter, Prism Pro, and Prism Ultimate are the three flavors of Prism
Central. For more information about the features available with these licenses, see Software
Options.
Prism Central is essentially a VM that you deploy (host) in a Nutanix cluster (Prism Element).
For more information about Prism Central, see the Prism Central Guide. You can set up the
following configurations of the Prism Central VM.
Small Prism Central
A Prism Central VM with a configuration of 8 vCPUs and 32 GB of memory or less.
The VM hot-adds an extra 4 GB of memory when you enable Leap and an extra 1 GB
when you enable Flow.
Small Prism Central (Single node)
A small Prism Central deployed in a single VM.
Small Prism Central (Scaleout)
Three small Prism Centrals deployed in three VMs in the same availability zone (site).
Large Prism Central
A Prism Central VM with a configuration of more than 8 vCPUs and 32 GB of memory.
The VM hot-adds an extra 8 GB of memory when you enable Leap and an extra 1 GB
when you enable Flow.
Large Prism Central (Single node)
A large Prism Central deployed in a single VM.
Large Prism Central (Scaleout)
Three large Prism Centrals deployed in three VMs in the same availability zone (site).

Note: A scaleout Prism Central works like a single node Prism Central in the availability zone
(AZ). You can upgrade a single node Prism Central to scaleout Prism Central to increase the
capacity, resiliency, and redundancy of Prism Central VM. For detailed information about the
available configurations of Prism Central, see Prism Central Scalability in Prism Central Release
Notes.
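The sizing thresholds and hot-add amounts described above can be expressed as a small Python sketch. The function names and the idea of computing post-enablement memory are illustrative assumptions for this guide, not a Nutanix tool.

```python
def pc_size(vcpus, memory_gb):
    """Classify a Prism Central VM as "small" or "large" per the guide:
    small is 8 vCPUs and 32 GB of memory or less; anything more is large."""
    return "small" if vcpus <= 8 and memory_gb <= 32 else "large"

def memory_after_enable(vcpus, memory_gb, leap=False, flow=False):
    """Memory (GB) after the hot-adds described above: +4 GB (small PC)
    or +8 GB (large PC) when Leap is enabled, and +1 GB when Flow is
    enabled. Illustrative arithmetic only."""
    total = memory_gb
    if leap:
        total += 4 if pc_size(vcpus, memory_gb) == "small" else 8
    if flow:
        total += 1
    return total
```

For example, a small Prism Central VM (8 vCPUs, 32 GB) with both Leap and Flow enabled ends up with 37 GB of memory.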

Virtual Private Cloud (VPC)


A logically isolated network service in Xi Cloud Services. A VPC provides the complete IP
address space for hosting user-configured VPNs. A VPC allows you to create workloads
manually or by failover from a paired primary site.
The following VPCs are available in each Xi Cloud Services account. You cannot create
additional VPCs in Xi Cloud Services.
Production VPC
Used to host production workloads.
Test VPC
Used to test failover from a paired site.

Source Virtual Network


The virtual network from which guest VMs migrate during a failover or failback.

Recovery Virtual Network


The virtual network to which guest VMs migrate during a failover or failback operation.



Network Mapping
A mapping between two virtual networks in paired sites. A network mapping specifies a
recovery network for all guest VMs of the source network. When you perform a failover or
failback, the guest VMs in the source network recover in the corresponding (mapped) recovery
network.
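The lookup a network mapping performs can be sketched in Python as follows. Modeling the mappings as a dictionary is an illustrative assumption, not the actual Leap data model.

```python
def recovery_network(network_mappings, source_network):
    """Return the mapped recovery network for a guest VM's source network.

    `network_mappings` is a dict {source_network: recovery_network}; a
    guest VM in the source network recovers in the mapped network.
    Illustrative sketch only.
    """
    try:
        return network_mappings[source_network]
    except KeyError:
        raise ValueError(f"no network mapping for {source_network!r}")
```

A guest VM in an unmapped source network has no defined recovery network, which the sketch surfaces as an error.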

Category
A VM category is a key-value pair that groups similar guest VMs. Associating a protection
policy with a VM category ensures that the protection policy applies to all the guest VMs in the
group regardless of how the group scales with time. For example, you can associate a group of
guest VMs with the Department: Marketing category, where Department is a category that includes
a value Marketing along with other values such as Engineering and Sales.
VM categories work the same way on on-prem sites and in Xi Cloud Services. For more
information about VM categories, see Category Management in the Prism Central Guide.
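To make the grouping concrete, the following Python sketch models guest VMs and their categories as plain dictionaries (an illustrative data shape, not the Prism Central API) and selects the VMs carrying a given key-value pair, the same way a protection policy associated with a category picks up every VM in the group.

```python
def vms_in_category(vms, key, value):
    """Return the names of guest VMs carrying the given category
    key-value pair (e.g. Department: Marketing). Each VM is modeled as
    {"name": ..., "categories": {key: value, ...}}; this shape is an
    assumption for illustration.
    """
    return [vm["name"] for vm in vms
            if vm.get("categories", {}).get(key) == value]
```

A newly created VM tagged with the same category value is matched automatically on the next evaluation, which is how category-based protection follows application scale-out.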

Recovery Point
A copy of the state of a system at a particular point in time.
Crash-consistent Snapshots
A snapshot is crash-consistent if it captures all of the data components (write-order
consistent) at a single instant, as at the moment of a crash. VM snapshots are
crash-consistent by default, which means that the vDisks that the snapshot captures are
consistent with a single point in time. Crash-consistent snapshots are better suited to
non-database operating systems and applications that may not support quiescence
(freezing) and unquiescence (thawing), such as file servers, DHCP servers, and print
servers.
Application-consistent Snapshots
A snapshot is application-consistent if, in addition to capturing all of the data
components (write-order consistent) at a single instant, the running applications
have completed all their operations and flushed their buffers to disk (in other words,
the applications are quiesced). Application-consistent snapshots capture the same data as
crash-consistent snapshots, with the addition of all data in memory and all transactions in
process. Therefore, application-consistent snapshots may take longer to complete.
Application-consistent snapshots are better suited to systems and applications that
can be quiesced and unquiesced (thawed), such as database operating systems and
applications like SQL Server, Oracle, and Exchange.

Recoverable Entity
A guest VM that you can recover from a recovery point.

Protection Policy
A configurable policy that takes recovery points of the protected guest VMs in equal time
intervals, and replicates those recovery points to the recovery sites.

Recovery Plan
A configurable policy that orchestrates the recovery of protected guest VMs at the recovery
site.

Recovery Point Objective (RPO)


The maximum acceptable amount of data loss, expressed as a time interval, if there is a
failure. For example, if the RPO is 1 hour, the system creates a recovery point every hour. On
recovery, you can recover the guest VMs with data as of up to 1 hour ago. Take Snapshot
Every in the Create Protection Policy GUI represents the RPO.
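The RPO arithmetic above can be made concrete with a short Python sketch. The function names and the assumption that recovery points are taken on a fixed schedule starting from a known time are illustrative, not part of the Leap implementation.

```python
from datetime import datetime, timedelta

def worst_case_data_loss(rpo: timedelta) -> timedelta:
    """With recovery points taken every `rpo`, the data lost on failure
    is at most one full interval (the failure happens just before the
    next recovery point would have been taken)."""
    return rpo

def latest_recovery_point(start: datetime, failure: datetime,
                          rpo: timedelta) -> datetime:
    """Time of the newest recovery point available at `failure`, assuming
    points are taken at start, start+rpo, start+2*rpo, and so on."""
    intervals = int((failure - start) / rpo)
    return start + intervals * rpo
```

With a 1-hour RPO and a failure at 03:30, the newest available recovery point is the 03:00 one, so the recovered VMs carry data as of 30 minutes before the failure.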

Recovery Time Objective (RTO)


The time period from the failure event to restored service. For example, an RTO of 30 minutes
means the protected guest VMs are back up and running within 30 minutes of the failure
event.

Nutanix Disaster Recovery Solutions


The following flowchart provides a detailed representation of the Nutanix disaster recovery
(DR) solutions. The decision tree covers both DR solutions, protection domain-based DR and
Leap, to help you decide quickly which DR strategy best suits your environment.

Figure 7: Decision Tree for Nutanix DR Solutions

For information about protection domain-based (legacy) DR, see the Data Protection and
Recovery with Prism Element guide. With Leap, you can protect your guest VMs and perform
DR to on-prem availability zones (sites) or to Xi Cloud Services. A Leap deployment for DR
between Xi Cloud Services and an on-prem Nutanix cluster is called Xi Leap. Detailed
information about Leap and Xi Leap DR configuration is available in the following sections of
this guide.
Protection and DR between On-Prem Sites (Leap) on page 17

• For information about protection with Asynchronous replication schedule and DR, see
Protection with Asynchronous Replication Schedule and DR (Leap) on page 36.
• For information about protection with NearSync replication schedule and DR, see Protection
with NearSync Replication Schedule and DR (Leap) on page 96.



• For information about protection with Synchronous replication schedule and DR, see
Protection with Synchronous Replication Schedule (0 RPO) and DR on page 112.
Protection and DR between On-Prem Site and Xi Cloud Service (Xi Leap) on page 141

• For information about protection with Asynchronous replication schedule and DR, see
Protection with Asynchronous Replication and DR (Xi Leap) on page 180.
• For information about protection with NearSync replication schedule and DR, see Protection
with NearSync Replication and DR (Xi Leap) on page 212.

Leap Deployment Workflow


The workflow for entity-centric protection and disaster recovery (DR) configuration is as
follows. The workflow is largely the same for both Leap and Xi Leap, except for a few extra
steps that you must perform while configuring Xi Leap.

Procedure

1. Enable Leap at the primary and recovery on-prem availability zones (Prism Central).
Enable Leap at the on-prem availability zones (sites) only. For more information about
enabling Leap, see Enabling Leap for On-Prem Site on page 32.

2. Pair the primary and recovery sites with each other.

A site is listed as a recovery site when you configure protection policies and recovery
plans (see step 6 and step 7) only after you pair it. For more information about pairing the
sites, see Pairing Availability Zones (Leap) on page 33.

3. (only for Xi Leap configuration) Set up your environment to proceed with replicating to Xi
Cloud Services.
For more information about environment setup, see Xi Leap Environment Setup on
page 153.

4. (only for Xi Leap configuration) Reserve floating IP addresses.


For more information about floating IP addresses, see Floating IP Address Management in
Xi Infrastructure Service Management Guide.

5. Create production and test virtual networks at the primary and recovery sites.
Create production and test virtual networks only at the on-prem sites. Xi Cloud Services
creates production and test virtual networks dynamically for you. However, Xi Cloud
Services provides floating IP addresses (step 4), a feature that is not available for on-prem
sites. For more information about production and test virtual networks, see Nutanix Virtual
Networks on page 174.



6. Create a protection policy with replication schedules at the primary site.
A protection policy can replicate recovery points to at most two other Nutanix clusters
at the same or different sites. To replicate the recovery points, add a replication schedule
between the primary site and each recovery site.

• To create a protection policy with an Asynchronous replication schedule, see:

  • Creating a Protection Policy with an Asynchronous Replication Schedule (Leap) on page 38
  • Creating a Protection Policy with Asynchronous Replication Schedule (Xi Leap) on page 182

• To create a protection policy with a NearSync replication schedule, see:

  • Creating a Protection Policy with a NearSync Replication Schedule (Leap) on page 100
  • Creating a Protection Policy with NearSync Replication Schedule (Xi Leap) on page 217

  Note: To maintain the efficiency of minutely replication, protection policies allow you to
  add a NearSync replication schedule between the primary site and only one recovery site.

• To create a protection policy with the Synchronous replication schedule, see Creating a
  Protection Policy with the Synchronous Replication Schedule (Leap) on page 115.

  Note: To maintain the efficiency of synchronous replication, protection policies allow
  you to add only one recovery site when you add a Synchronous replication schedule. If
  you already have an Asynchronous or a NearSync replication schedule in the protection
  policy, you cannot add another recovery site to protect the guest VMs with a Synchronous
  replication schedule.

You can also create a protection policy at a recovery site. Protection policies you create or
update at a recovery site synchronize back to the primary site. The reverse synchronization
helps when you protect more guest VMs in the same protection policy at the recovery site.
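The schedule constraints described in this step can be checked with a small Python sketch. The schedule data shape ({"recovery_site": ..., "type": ...}) and the validator itself are illustrative assumptions, not the validation Prism Central performs.

```python
def validate_replication_schedules(schedules):
    """Check a protection policy's replication schedules against the
    constraints described above: at most two recovery sites, only one
    NearSync schedule, and only one recovery site when a Synchronous
    schedule is present. Each schedule is a dict
    {"recovery_site": ..., "type": "async" | "nearsync" | "sync"}.
    Returns a list of violation messages (empty when valid)."""
    violations = []
    sites = {s["recovery_site"] for s in schedules}
    if len(sites) > 2:
        violations.append("a policy can replicate to at most two recovery sites")
    if sum(1 for s in schedules if s["type"] == "nearsync") > 1:
        violations.append("only one NearSync replication schedule is allowed")
    if any(s["type"] == "sync" for s in schedules) and len(sites) > 1:
        violations.append("a Synchronous schedule allows only one recovery site")
    return violations
```

For example, a policy with a Synchronous schedule to one site and an Asynchronous schedule to a second site fails the last check, matching the note above.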

7. Create a recovery plan at the primary site.


A recovery plan orchestrates the failover of the protected guest VMs (step 6) to a recovery
site. For two recovery sites, create two discrete recovery plans at the primary site—one for
DR to each recovery site.

• To create a recovery plan for DR to another Nutanix cluster at the same or different
  on-prem sites, see Creating a Recovery Plan (Leap) on page 56.
• To create a recovery plan for DR to Xi Cloud Services, see Creating a Recovery Plan (Xi
  Leap) on page 193.
You can also create a recovery plan at a recovery site. The recovery plan you create
or update at a recovery site synchronizes back to the primary site. The reverse
synchronization helps in scenarios where you add more guest VMs to the same recovery
plan at the recovery site.



8. Validate or test the recovery plan you create in step 7.
To test a recovery plan, perform a test failover to a recovery site.

• To perform test failover to another Nutanix cluster at the same or different on-prem
sites, see Performing a Test Failover (Leap) on page 73.
• To perform test failover to Xi Cloud Services, see Failover and Failback Operations (Xi
Leap) on page 200.

9. (only for Xi Leap configuration) After the failover to the recovery site, enable external
connectivity. To enable external connectivity, perform the following.

1. After a planned failover, shut down the VLAN interface on the on-prem Top-of-Rack
   (TOR) switch.
2. To access the Internet from Xi Cloud Services, create both inbound and outbound
   policy-based routing (PBR) policies on the virtual private cloud (VPC). For more
   information, see Policy Configuration in the Xi Infrastructure Service Administration Guide.

10. (only for Xi Leap configuration) Perform the following procedure to access the recovered
guest VMs through the Internet.

1. Assign a floating IP address to the guest VMs failed over to Xi Cloud Services. For
   more information, see Floating IP Address Management in the Xi Infrastructure Service
   Administration Guide.
2. Create PBR policies and specify the internal or private IP address of the guest
   VMs. For more information, see Policy Configuration in the Xi Infrastructure Service
   Administration Guide.

Note: If a guest VM (that hosts a publicly accessible website) fails over, update the
authoritative DNS server (for example, Amazon Route 53, GoDaddy, DNS Made Easy) with
the primary failover record (on-prem public IP address) and the secondary failover record
(Xi floating IP address). For example, if your authoritative DNS server is Amazon Route 53,
configure the primary and the secondary failover records. Amazon Route 53 performs
health checks on the primary failover record and returns the secondary failover record when
the primary is down.

On-Prem Hardware Resource Requirements


For DR solutions with Asynchronous, NearSync, and Synchronous replication schedules
to succeed, the nodes in the on-prem availability zones (AZs or sites) must meet certain
resource requirements. This section provides information about the node, disk, and Foundation
configurations necessary to support the RPO-based recovery point frequencies.

• The conditions and configurations provided in this section apply to Local and Remote
recovery points.
• Any node configuration with two or more SSDs, each SSD being 1.2 TB or greater capacity,
supports recovery point frequency for NearSync.
• Any node configuration that supports recovery point frequency of six (6) hours also
supports AHV-based Synchronous replication schedules because a protection policy with
Synchronous replication schedule takes recovery points of the protected VMs every 6 hours.
See Protection with Synchronous Replication Schedule (0 RPO) and DR on page 112 for
more details about Synchronous replication.
• Both the primary cluster and replication target cluster must fulfill the same minimum
resource requirements.



• Ensure that any new node or disk additions made to the on-prem sites (Availability Zones)
meet the minimum requirements.
• Features such as Deduplication and RF3 may require additional memory depending on the
DR schedules and other workloads run on the cluster.

Note: For on-prem deployments, the default minimum recovery point frequency with the
default Foundation configuration is 6 hours. To increase the recovery point frequency, you
must also modify the Foundation configuration (SSD and CVM) accordingly. For example, an
all-flash setup with a capacity between 48 TB and 92 TB has a default recovery point
frequency of 6 hours. If you want to decrease the recovery point interval to one (1) hour, you
must modify the default Foundation configuration to:

• 14 vCPUs for the CVM
• 40 GB of memory for the CVM

The following table lists the supported frequency for recovery points across various hardware
configurations.

Table 1: Recovery Point Frequency

Hybrid nodes

• Total HDD tier capacity of 32 TB or lower; total capacity (HDD + SSD) of 40 TB or lower.
  Minimum recovery point frequency: NearSync, Async (Hourly).
  Foundation configuration (SSD and CVM requirements): No change required (default
  Foundation configuration). 2 x SSDs; each SSD must be a minimum of 1.2 TB for NearSync.

• Total HDD tier capacity between 32-64 TB; total capacity (HDD + SSD) of 92 TB or lower
  (up to 64 TB HDD, up to 32 TB SSD, for example 4 x 7.68 TB SSDs).
  Minimum recovery point frequency: NearSync, Async (Hourly).
  Foundation configuration: Modify the Foundation configuration to a minimum of 4 x SSDs
  (each SSD must be a minimum of 1.2 TB for NearSync), 14 vCPUs for the CVM, and 40 GB
  of memory for the CVM.

• Total HDD tier capacity between 32-64 TB; total capacity (HDD + SSD) of 92 TB or lower
  (up to 64 TB HDD, up to 32 TB SSD).
  Minimum recovery point frequency: Async (every 6 hours).
  Foundation configuration: No change required (default Foundation configuration).

• Total HDD tier capacity between 64-80 TB; total capacity (HDD + SSD) of 96 TB or lower.
  Minimum recovery point frequency: Async (every 6 hours).
  Foundation configuration: No change required (default Foundation configuration).

• Total HDD tier capacity greater than 80 TB; total capacity (HDD + SSD) of 136 TB or lower.
  Minimum recovery point frequency: Async (every 6 hours).
  Foundation configuration: Modify the Foundation configuration to a minimum of 12 vCPUs
  for the CVM and 36 GB of memory for the CVM.

All-flash nodes

• Total capacity of 48 TB or lower.
  Minimum recovery point frequency: NearSync, Async (Hourly).
  Foundation configuration: No change required (default Foundation configuration).

• Total capacity between 48-92 TB.
  Minimum recovery point frequency: NearSync, Async (Hourly).
  Foundation configuration: Modify the Foundation configuration to a minimum of 14 vCPUs
  for the CVM and 40 GB of memory for the CVM.

• Total capacity between 48-92 TB.
  Minimum recovery point frequency: Async (every 6 hours).
  Foundation configuration: No change required (default Foundation configuration).

• Total capacity greater than 92 TB.
  Minimum recovery point frequency: Async (every 6 hours).
  Foundation configuration: Modify the Foundation configuration to a minimum of 12 vCPUs
  for the CVM and 36 GB of memory for the CVM.



PROTECTION AND DR BETWEEN ON-PREM SITES (LEAP)
Leap protects your guest VMs and orchestrates their disaster recovery (DR) to other Nutanix
clusters when events causing service disruption occur at the primary availability zone (site).
For protection of your guest VMs, protection policies with Asynchronous, NearSync, or
Synchronous replication schedules generate and replicate recovery points to other on-prem
availability zones (sites). Recovery plans orchestrate DR from the replicated recovery points to
other Nutanix clusters at the same or different on-prem sites.
Protection policies create a recovery point—and set its expiry time—in every iteration of the
specified time period (RPO). For example, the policy creates a recovery point every 1 hour
for an RPO schedule of 1 hour. The recovery point expires at its designated expiry time based
on the retention policy—see step 3 in Creating a Protection Policy with an Asynchronous
Replication Schedule (Leap) on page 38. If there is a prolonged outage at a site, the Nutanix
cluster retains the last recovery point to ensure you do not lose all the recovery points. For
NearSync replication (lightweight snapshot), the Nutanix cluster retains the last full hourly
snapshot. During the outage, the Nutanix cluster does not clean up the recovery points due to
expiry. When the Nutanix cluster comes online, it cleans up the recovery points that are past
expiry immediately.
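The interaction between the RPO schedule and recovery point expiry described above can be sketched as a simple schedule calculation. The following Python sketch is purely illustrative: the function name and the expiry arithmetic (retention count multiplied by the RPO) are assumptions for the example, not the exact Nutanix retention logic.

```python
from datetime import datetime, timedelta

def recovery_point_times(start, rpo_hours, retention_count):
    """Sketch of how an RPO schedule produces recovery points.

    A recovery point is taken every `rpo_hours`; in this simplified model
    each point expires once `retention_count` newer points exist. In
    practice the last recovery point is always retained, even past its
    expiry, so a prolonged outage never leaves zero recovery points.
    """
    points = []
    for i in range(retention_count + 2):  # a few iterations for illustration
        taken = start + timedelta(hours=i * rpo_hours)
        expires = taken + timedelta(hours=retention_count * rpo_hours)
        points.append((taken, expires))
    return points

# With a 1-hour RPO and a retention of 3, points are taken every hour and
# each expires 3 hours after creation.
pts = recovery_point_times(datetime(2022, 10, 12, 0, 0), 1, 3)
```

This mirrors the example in the text: an RPO schedule of 1 hour creates a recovery point every hour, and the retention policy determines each point's expiry time.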
For High Availability of a guest VM, Leap enables replication of its recovery points to one or
more on-prem sites. A protection policy can replicate recovery points to a maximum of two on-
prem sites. For replication, you must add a replication schedule between the sites. You can set up
the on-prem sites for protection and DR in the following arrangements.

Figure 8: The Primary and recovery Nutanix clusters at the different on-prem AZs

Figure 9: The Primary and recovery Nutanix clusters at the same on-prem AZ

Replication to multiple sites enables DR to Nutanix clusters at all the sites where the
recovery points exist. To perform DR to a Nutanix cluster at the same or a different site
(recovery site), you must create a recovery plan. To perform DR to two different Nutanix
clusters at the same or different recovery sites, you must create two discrete recovery plans,
one for each recovery site. In addition to performing DR to Nutanix clusters running the same
hypervisor type, you can also perform cross-hypervisor disaster recovery (CHDR): DR from AHV
clusters to ESXi clusters, or from ESXi clusters to AHV clusters.
The protection policies and recovery plans you create or update synchronize continuously
between the primary and recovery on-prem sites. The reverse synchronization enables you
to create or update entities (protection policies, recovery plans, and guest VMs) at either the
primary or the recovery sites.
The following section describes protection of your guest VMs and DR to a Nutanix cluster at the
same or different on-prem sites. The workflow is the same for protection and DR to a Nutanix
cluster in supported public cloud platforms. For information about protection of your guest
VMs and DR from Xi Cloud Services to an on-prem Nutanix cluster (Xi Leap), see Protection
and DR between On-Prem Site and Xi Cloud Service (Xi Leap) on page 141.

Leap Requirements
The following are the general requirements of Leap. Along with the general requirements, there
are specific requirements for protection with the following supported replication schedules.

• For information about the on-prem node, disk and Foundation configurations required to
support Asynchronous, NearSync, and Synchronous replication schedules, see On-Prem
Hardware Resource Requirements on page 14.
• For specific requirements of protection with Asynchronous replication schedule (1 hour or
greater RPO), see Asynchronous Replication Requirements (Leap) on page 36.
• For specific requirements of protection with NearSync replication schedule (1–15 minutes
RPO), see NearSync Replication Requirements (Leap) on page 98.
• For specific requirements of protection with Synchronous replication schedule (0 RPO), see
Synchronous Replication Requirements on page 113.

License Requirements
The AOS license required depends on the features that you want to use. For information about
the features that are available with an AOS license, see Software Options.

Hypervisor Requirements
The underlying hypervisor requirements differ across the supported replication schedules. For
more information about the hypervisor requirements for the supported replication schedules,
see:

• Asynchronous Replication Requirements (Leap) on page 36


• NearSync Replication Requirements (Leap) on page 98
• Synchronous Replication Requirements on page 113

Nutanix Software Requirements

• Each on-prem availability zone (site) must have a Leap enabled Prism Central instance. To
enable Leap in Prism Central, see Enabling Leap for On-Prem Site on page 32.

Note: If you are using ESXi, register at least one vCenter Server to Prism Central. You can also
register two vCenter Servers, each to a Prism Central instance at a different site. If you register
both Prism Central instances to a single vCenter Server, ensure that each ESXi cluster is part of
a different datacenter object in vCenter.

• The primary and recovery Prism Central and Prism Element on the Nutanix clusters must be
running on the supported AOS versions. For more information about the required versions
for the supported replication schedules, see:

• Asynchronous Replication Requirements (Leap) on page 36


• NearSync Replication Requirements (Leap) on page 98
• Synchronous Replication Requirements on page 113

Tip:
Nutanix supports replication between all the latest supported LTS and STS AOS
releases. To check the list of the latest supported AOS versions, see
KB-5505. To determine whether the AOS versions currently running on your clusters are
EOL, see the EOL document.
Upgrade the AOS version to the next available supported LTS/STS release. To
determine whether an upgrade path is supported, check the Upgrade Paths page before
you upgrade AOS.

Note: If both clusters are running different AOS versions that are EOL, upgrade the
cluster with the lower AOS version to match the cluster with the higher AOS version,
and then perform the upgrade to the next supported LTS version.

For example, the clusters are running AOS versions 5.5.x and 5.10.x respectively.
Upgrade the cluster on 5.5.x to 5.10.x. After both the clusters are on 5.10.x, proceed
to upgrade each cluster to 5.15.x (supported LTS). Once both clusters are on 5.15.x
you can upgrade the clusters to 5.20.x or newer.
Nutanix recommends that both the primary and the replication clusters or sites run
the same AOS version.

User Requirements
You must have one of the following roles in Prism Central.

• User admin
• Prism Central admin
• Prism Self Service admin
• Xi admin
To view the available roles or create a role, click the hamburger icon at the top-left corner of
the window and go to Administration > Roles in the left pane.

Firewall Port Requirements


To allow two-way replication between Nutanix clusters at the same or different sites, you must
enable certain ports in your external firewall. To know about the required ports, see Disaster
Recovery - Leap in Port Reference.

Networking Requirements
Requirements for static IP address preservation after failover
You can preserve one IP address of a guest VM (with static IP address) for its failover
(DR) to an IPAM network. After the failover, the other IP addresses of the guest VM have
to be reconfigured manually. To preserve an IP address of a guest VM (with static IP
address), ensure that:

Caution: By default, you cannot preserve statically assigned DNS IP addresses after
failover (DR) of guest VMs. However, you can create custom in-guest scripts to preserve
the statically assigned DNS IP addresses. For more information, see Creating a Recovery
Plan (Leap) on page 56.

• Both the primary and the recovery Nutanix clusters run AOS 5.11 or newer.
• The protected guest VMs have Nutanix Guest Tools (NGT) version 1.5 or newer
installed.
For information about installing NGT, see Nutanix Guest Tools in Prism Web Console
Guide.
• The protected guest VMs have at least one empty CD-ROM slot.
The empty CD-ROM is required for mounting NGT at the recovery site.
• The protected guest VMs can reach the Controller VM from both the sites.
• The protected guest VMs have NetworkManager command-line tool (nmcli) version
0.9.10.0 or newer installed.
Also, NetworkManager must manage the networks on Linux VMs. To enable
NetworkManager on a Linux VM, set the value of the NM_CONTROLLED field to yes in the
interface configuration file. After setting the field, restart the network service on the VM.

Tip: In CentOS, the interface configuration file is /etc/sysconfig/network-scripts/ifcfg-eth0.
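The NM_CONTROLLED change described above can be scripted. The following Python sketch is a hypothetical helper (not a Nutanix tool) that rewrites the text of an ifcfg-style configuration file; the file path and interface name in the usage comment are the CentOS defaults mentioned in the tip.

```python
import re

def set_nm_controlled(ifcfg_text: str) -> str:
    """Return ifcfg file content with NM_CONTROLLED set to yes.

    If the field is already present it is rewritten in place; otherwise
    it is appended, so NetworkManager manages the interface after the
    network service restarts.
    """
    if re.search(r"^NM_CONTROLLED=", ifcfg_text, flags=re.MULTILINE):
        return re.sub(r"^NM_CONTROLLED=.*$", "NM_CONTROLLED=yes",
                      ifcfg_text, flags=re.MULTILINE)
    return ifcfg_text.rstrip("\n") + "\nNM_CONTROLLED=yes\n"

# Typical use on the VM (requires root):
#   path = "/etc/sysconfig/network-scripts/ifcfg-eth0"
#   with open(path) as f: text = f.read()
#   with open(path, "w") as f: f.write(set_nm_controlled(text))
# Then restart the network service so the change takes effect.
```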

Requirements for static IP address mapping of guest VMs between source and target
virtual networks
You can explicitly define IP addresses for guest VMs that have static IP addresses on the
primary site. On recovery, such guest VMs retain the explicitly defined IP address. To map
static IP addresses of guest VMs between source and target virtual networks, ensure that:

• Both the primary and the recovery Nutanix clusters run AOS 5.17 or newer.

• The protected guest VMs have static IP addresses at the primary site.
• The protected guest VMs have Nutanix Guest Tools (NGT) version 1.5 or newer
installed.
For information about installing NGT, see Nutanix Guest Tools in Prism Web Console
Guide.
• The protected guest VMs have at least one empty CD-ROM slot.
The empty CD-ROM is required for mounting NGT at the recovery site.
• The protected guest VMs can reach the Controller VM from both the sites.
• The recovery plan selected for failover has VM-level IP address mapping configured.
Virtual network design requirements
You can design the virtual subnets that you plan to use for DR to the recovery site so
that they can accommodate the guest VMs running in the source virtual network.

• Maintain a uniform network configuration for all the virtual LANs (VLANs) with the
same VLAN ID and network range in all the Nutanix clusters at a site. All such VLANs
must have the same subnet name, IP address range, and IP address prefix length
(Gateway IP/Prefix Length).
For example, if you have a VLAN with ID 0 and network 10.45.128.0/17, and three
clusters PE1, PE2, and PE3 at the site AZ1, all the clusters must maintain the same
name, IP address range, and IP address prefix length (Gateway IP/Prefix Length) for the
VLAN with ID 0.
• To use a virtual network as a recovery virtual network, ensure that the virtual network
meets the following requirements.

• The network prefix is the same as the network prefix of the source virtual network.
For example, if the source network address is 192.0.2.0/24, the network prefix of
the recovery virtual network must also be 24.
• The gateway IP address is the same as the gateway IP address in the source
network. For example, if the gateway IP address in the source virtual network
192.0.2.0/24 is 192.0.2.10, the last octet of the gateway IP address in the recovery
virtual network must also be 10.
• To use a single Nutanix cluster as a target for DR from multiple primary Nutanix
clusters, ensure that the number of virtual networks on the recovery cluster is equal to
the sum of the number of virtual networks on the individual primary Nutanix clusters.
For example, if there are two primary Nutanix clusters, with one cluster having m
networks and the other cluster having n networks, ensure that the recovery cluster has
m + n networks. Such a design ensures that all recovered VMs attach to a network.
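The prefix-length and gateway rules above can be checked programmatically. The following sketch is a hypothetical helper (not a Nutanix utility) that applies the two recovery-network rules using Python's standard ipaddress module:

```python
import ipaddress

def is_valid_recovery_network(src_net, src_gw, dst_net, dst_gw):
    """Check whether a virtual network can serve as a recovery target
    for a source virtual network, per the two rules above:
      1. the network prefix length is the same as the source's, and
      2. the gateway occupies the same host offset within its subnet
         (e.g. last octet .10 in a source /24 maps to .10 in the
         recovery /24).
    """
    src = ipaddress.ip_network(src_net)
    dst = ipaddress.ip_network(dst_net)
    if src.prefixlen != dst.prefixlen:
        return False
    src_offset = int(ipaddress.ip_address(src_gw)) - int(src.network_address)
    dst_offset = int(ipaddress.ip_address(dst_gw)) - int(dst.network_address)
    return src_offset == dst_offset

# 192.0.2.0/24 with gateway .10 maps to 198.51.100.0/24 with gateway .10:
assert is_valid_recovery_network("192.0.2.0/24", "192.0.2.10",
                                 "198.51.100.0/24", "198.51.100.10")
```

A mismatched prefix length (for example, a /25 recovery subnet for a /24 source) or a gateway at a different offset would make the function return False, mirroring the requirements above.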

Additional Requirements

• Both the primary and recovery Nutanix clusters must have an external IP address.
• Both the primary and recovery Prism Centrals and Nutanix clusters must have a data
services IP address.

• The Nutanix cluster that hosts the Prism Central must meet the following requirements.

• The Nutanix cluster must be registered to the Prism Central instance.


• The Nutanix cluster must have an iSCSI data services IP address configured on it.
• The Nutanix cluster must also have sufficient memory to support a hot add of memory
to all Prism Central nodes when you enable Leap. A small Prism Central instance (4
vCPUs, 16 GB memory) requires a hot add of 4 GB, and a large Prism Central instance
(8 vCPUs, 32 GB memory) requires a hot add of 8 GB. If you enable Nutanix Flow, each
Prism Central instance requires an extra hot-add of 1 GB.
• Each node in a scaled-out Prism Central instance must have a minimum of 4 vCPUs and 16
GB memory.
For more information about the scaled-out deployments of a Prism Central, see Leap
Terminology on page 8.
• The protected guest VMs must have Nutanix VM mobility drivers installed.
Nutanix VM mobility drivers are required for accessing the guest VMs after failover. Without
Nutanix VM mobility drivers, the guest VMs become inaccessible after a failover.
• Maintain a uniform network configuration for all the virtual LANs (VLANs) with the same
VLAN ID and network range in all the clusters at an availability zone (site). All such VLANs
must have the same subnet name, IP address range, and IP address prefix length (Gateway
IP/Prefix Length).
For example, if you have a VLAN with ID 0 and network 10.45.128.0/17, and three clusters PE1,
PE2, and PE3 at the site AZ1, all the clusters must maintain the same name, IP address range,
and IP address prefix length (Gateway IP/Prefix Length) for the VLAN with ID 0.
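The Prism Central memory hot-add figures listed in the requirements above can be expressed as a small calculation. This sketch uses only the numbers stated in this section (4 GB for a small instance, 8 GB for a large instance, plus 1 GB per instance with Nutanix Flow); the function name is hypothetical.

```python
def leap_hot_add_gb(pc_size: str, flow_enabled: bool = False) -> int:
    """Memory (GB) hot-added to each Prism Central instance when Leap
    is enabled: 4 GB for a small PC (4 vCPUs, 16 GB memory), 8 GB for
    a large PC (8 vCPUs, 32 GB memory), plus an extra 1 GB per
    instance if Nutanix Flow is enabled."""
    base = {"small": 4, "large": 8}[pc_size]
    return base + (1 if flow_enabled else 0)

# A large Prism Central instance with Flow enabled needs a 9 GB hot add:
print(leap_hot_add_gb("large", flow_enabled=True))  # 9
```

The hosting Nutanix cluster must have enough free memory to cover this hot add for every Prism Central node, including each node of a scaled-out deployment.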

Leap Limitations
Consider the following general limitations before configuring protection and disaster recovery
(DR) with Leap. Along with the general limitations, there are specific protection limitations with
the following supported replication schedules.

• For specific limitations of protection with Asynchronous replication schedule (1 hour or


greater RPO), see Asynchronous Replication Limitations (Leap) on page 38.
• For specific limitations of protection with NearSync replication schedule (1–15 minutes RPO),
see NearSync Replication Limitations (Leap) on page 99.
• For specific limitations of protection with Synchronous replication schedule (0 RPO), see
Synchronous Replication Limitations on page 115.

Virtual Machine Limitations


You cannot do the following.

• Deploy witness VMs.


• Protect multiple guest VMs that use disk sharing (for example, multi-writer sharing, Microsoft
Failover Clusters, Oracle RAC).

• Protect VMware fault tolerance enabled guest VMs.

• Recover vGPU console enabled guest VMs efficiently.
When you perform DR of vGPU console-enabled guest VMs, the VMs recover with the
default VGA console (without any alert) instead of vGPU console. The guest VMs fail to
recover when you perform cross-hypervisor disaster recovery (CHDR). For more information
about DR and backup behavior of guest VMs with vGPU, see vGPU Enabled Guest VMs on
page 23.
• Configure NICs for a guest VM across both the virtual private clouds (VPC).
You can configure NICs for a guest VM associated with either production or test VPC.

Volume Groups Limitation


You cannot protect volume groups.

Network Segmentation Limitation


You cannot apply network segmentation for management traffic (any traffic not on the
backplane network) in Leap.
You get an error when you try to enable network segmentation for management traffic on
a Leap-enabled Nutanix cluster, or when you try to enable Leap on a network segmentation-
enabled Nutanix cluster. For more information about network segmentation, see Securing
Traffic Through Network Segmentation in the Security Guide.

Note: You can, however, apply network segmentation for backplane traffic at the primary and
recovery clusters. Nutanix does not recommend this, because when you perform a planned
failover of guest VMs with network segmentation for backplane enabled, the guest VMs fail to
recover and the guest VMs at the primary AZ are removed.

Virtual Network Limitation


Although there is no limit to the number of VLANs that you can create, only the first 500
VLANs are listed in the Network Settings drop-down when you create a recovery plan. For more
information about VLANs in the recovery plan, see Nutanix Virtual Networks on page 174.

Nutanix to vSphere Cluster Mapping Limitation


Due to the way the Nutanix architecture distributes data, there is limited support for mapping
a Nutanix cluster to multiple vSphere clusters. If a Nutanix cluster is split into multiple vSphere
clusters, migration and recovery operations fail.

Failover Limitation
After the failover, the recovered guest VMs do not retain their associated labels.

Tip: Assign categories to the guest VMs instead of labels because VM categories are retained
after the failover.

vGPU Enabled Guest VMs


The following table lists the behavior of vGPU-enabled guest VMs in disaster recovery (DR) and
backup deployments.

Table 2: DR and Backup Behavior of vGPU-Enabled Guest VMs

• Primary cluster AHV, recovery cluster AHV, Nutanix Disaster Recovery:
  Identical vGPU models: Supported operations are recovery point creation, replication, restore, migrate, VM start, and failover and failback.
  Unidentical vGPU models or no vGPU: Supported operations are recovery point creation, replication, restore, and migrate. Unsupported operations are VM start and failover and failback.
  Note: For Synchronous replication only, protection of the guest VMs fails.

• Primary cluster AHV, recovery cluster AHV, Backup (HYCU):
  Identical vGPU models: Guest VMs with vGPU fail to recover.
  Unidentical vGPU models or no vGPU: Guest VMs with vGPU fail to recover.

• Primary cluster AHV, recovery cluster AHV, Backup (Veeam):
  Identical vGPU models: Guest VMs with vGPU fail to recover.
  Unidentical vGPU models or no vGPU: Guest VMs with vGPU recover, but with the older vGPU, or recover but do not start.
  Tip: The VMs start when you disable vGPU on the guest VM.

• Primary cluster ESXi, recovery cluster ESXi, Nutanix Disaster Recovery: Guest VMs with vGPU cannot be protected (with identical or unidentical vGPU models).

• Primary cluster ESXi, recovery cluster ESXi, Backup: Guest VMs with vGPU cannot be protected (with identical or unidentical vGPU models).

• Primary cluster AHV, recovery cluster ESXi, Nutanix Disaster Recovery: vGPU is disabled after failover of guest VMs with vGPU (with identical or unidentical vGPU models).

• Primary cluster ESXi, recovery cluster AHV, Nutanix Disaster Recovery: Guest VMs with vGPU cannot be protected (with identical or unidentical vGPU models).

Leap Configuration Maximums


For the maximum number of entities you can configure with different replication schedules and
failover (disaster recovery), see Nutanix Configuration Maximums. These limits have been
tested for Leap production deployments. Nutanix does not guarantee that the system can
operate beyond these limits.

Tip: Upgrade your NCC version to 3.10.1 to get configuration alerts.

Leap Recommendations
Nutanix recommends the following best practices for configuring protection and disaster
recovery (DR) with Leap.

General Recommendations

• Create all entities (protection policies, recovery plans, and VM categories) at the primary
availability zone (AZ).
• Upgrade Prism Central before upgrading Prism Element on the Nutanix clusters registered to
it. For more information about upgrading Prism Central, see Upgrading Prism Central in the
Acropolis Upgrade Guide.
• Do not include the guest VMs protected with Asynchronous, NearSync, and Synchronous
replication schedules in the same recovery plan. You can include guest VMs protected with
Asynchronous or NearSync replication schedules in the same recovery plan. However, if
you combine these guest VMs with the guest VMs protected by Synchronous replication
schedules in a recovery plan, the recovery fails.
• Disable Synchronous replication before unpairing the AZs.
If you unpair the AZs while the guest VMs in the Nutanix clusters are still in synchronization,
the Nutanix cluster becomes unstable. For more information about disabling Synchronous
replication, see Synchronous Replication Management on page 121.

Recommendation for Migrating Protection Domains to Protection Policies
You can protect a guest VM either with legacy DR solution (protection domain-based) or with
Leap. To protect a legacy DR-protected guest VM with Leap, you must migrate the guest
VM from protection domain to a protection policy. During the migration, do not delete the
guest VM snapshots in the protection domain. Nutanix recommends keeping the guest VM
snapshots in the protection domain until the first recovery point for the guest VM is available on
Prism Central. For more information, see Migrating Guest VMs from a Protection Domain to a
Protection Policy on page 232.

Recommendation for DR to Nutanix Clusters at the Same On-Prem Availability Zone


If the single Prism Central that you use for protection and DR to Nutanix clusters at the same
availability zone (site) becomes inactive, you cannot perform a failover when required. To avoid
the single point of failure in such deployments, Nutanix recommends installing the single Prism
Central at a different site (different fault domain).

Recommendation for Virtual Networks

• Map the networks while creating a recovery plan in Prism Central.


• Recovery plans do not support overlapping subnets in a network-mapping configuration. Do
not create virtual networks that have the same name or overlapping IP address ranges.

Recommendation for Container Mapping


Create storage containers with the same name on both the primary and recovery Nutanix
clusters.
Leap automatically maps the storage containers during the first replication (seeding) of a guest
VM. If a storage container with the same name exists on both the primary and recovery Nutanix
clusters, the recovery points replicate to the storage container with the same name. For example,
if your protected guest VMs are in the SelfServiceContainer on the primary Nutanix cluster,
and the recovery Nutanix cluster also has a SelfServiceContainer, the recovery points replicate
to SelfServiceContainer. If a storage container with the same name does not exist at the
recovery AZ, the recovery points replicate to a random storage container at the recovery AZ.
For more information about creating storage containers on the Nutanix clusters, see Creating a
Storage Container in Prism Web Console Guide.


Leap Service-Level Agreements (SLAs)
Leap enables protection of your guest VMs and disaster recovery (DR) to one or more Nutanix
clusters at the same or different on-prem sites. A Nutanix cluster is essentially an AHV or an
ESXi cluster running AOS. In addition to performing DR to Nutanix clusters running the same
hypervisor type, you can also perform cross-hypervisor disaster recovery (CHDR)—DR from
AHV clusters to ESXi clusters, or from ESXi clusters to AHV clusters.
Leap supports DR (and CHDR) to a maximum of two different Nutanix clusters at the same
or different availability zones (sites). You can protect your guest VMs with the following
replication schedules.

• Asynchronous replication schedule (1 hour or greater RPO). For information about protection
with Asynchronous replication schedule, see Protection with Asynchronous Replication
Schedule and DR (Leap) on page 36.
• NearSync replication schedule (1–15 minute RPO). For information about protection with
NearSync replication schedule, see Protection with NearSync Replication Schedule and DR
(Leap) on page 96.
• Synchronous replication schedule (0 RPO). For information about protection with
Synchronous replication schedule, see Protection with Synchronous Replication Schedule (0
RPO) and DR on page 112.
To maintain efficiency in protection and DR, Leap allows you to protect a guest VM with
a Synchronous replication schedule to only one AHV cluster, at a different on-prem
availability zone.

Leap Views
The disaster recovery (DR) views enable you to perform CRUD operations on the following
types of Leap entities.

• Configured entities (for example, availability zones, protection policies, and recovery plans)
• Created entities (for example, guest VMs, and recovery points)
This chapter describes the views of Prism Central (on-prem site).

Availability Zones View


The Availability Zones view under the hamburger icon > Administration lists all of your paired
availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you
can perform in this view.

Figure 10: AZs View

Table 3: Availability Zones View Fields

Field Description
Name Name of the availability zone.
Region Region to which the availability zone belongs.
Type Type of availability zone. Availability zones
that are backed by on-prem Prism Central
instances are shown to be of type physical.
The availability zone that you are logged in to
is shown as a local availability zone.
Connectivity Status Status of connectivity between the local
availability zone and the paired availability
zone.

Table 4: Workflows Available in the Availability Zones View

Workflow Description
Connect to Availability Zone (on-prem Prism Connect to an on-prem Prism Central or to a
Central only) Xi Cloud Services for data replication.

Table 5: Actions Available in the Actions Menu

Action Description
Disconnect Disconnect the remote availability zone. When
you disconnect an availability zone, the pairing
is removed.

Protection Policies View
The Protection Policies view under the hamburger icon > Data Protection lists all configured
protection policies from all the paired availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you
can perform in this view.

Figure 11: Protection Policies View

Table 6: Protection Policies View Fields

Field Description
Policy Name Name of the protection policy.
Schedules Number of schedules configured in the
protection policy. If the protection policy
has multiple schedules, a drop-down icon is
displayed. Click the drop-down icon to see
the primary location:primary Nutanix cluster,
recovery location:recovery Nutanix cluster,
and RPO of the schedules in the protection
policy.
Alerts Number of alerts issued for the protection
policy.

Table 7: Workflows Available in the Protection Policies View

Workflow Description
Create protection policy Create a protection policy.

Table 8: Actions Available in the Actions Menu

Action Description
Update Update the protection policy.
Clone Clone the protection policy.
Delete Delete the protection policy.

Recovery Plans View


The Recovery Plans view under the hamburger icon > Data Protection lists all configured
recovery plans from all the paired availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you
can perform in this view.

Figure 12: Recovery Plans View

Table 9: Recovery Plans View Fields

Field Description
Name Name of the recovery plan.
Primary Location Replication source site for the recovery plan.
Recovery Location Replication target site for the recovery plan.
Entities Sum of the following VMs:

• Number of local, live VMs that are specified in the recovery plan.
• Number of remote VMs that the recovery plan can recover at this site.

Last Validation Status Status of the most recent validation of the recovery plan.

Field Description
Last Test Status Status of the most recent test performed on
the recovery plan.
Last Failover Status Status of the most recent failover performed
on the recovery plan.

Table 10: Workflows Available in the Recovery Plans View

Workflow Description
Create Recovery Plan Create a recovery plan.

Table 11: Actions Available in the Actions Menu

Action Description
Validate Validates the recovery plan to ensure that
the VMs in the recovery plan have a valid
configuration and can be recovered.
Test Tests the recovery plan.
Clean-up test VMs Cleans up the VMs failed over as a result of
testing recovery plan.
Update Updates the recovery plan.
Failover Performs a failover.
Delete Deletes the recovery plan.

Dashboard Widgets
The dashboard includes widgets that display the statuses of configured protection policies and
recovery plans. If you have not configured these entities, the widgets display a summary of the
steps required to get started with Leap.
To view these widgets, click the Dashboard tab.
The following figure is a sample view of the dashboard widgets.

Figure 13: Dashboard Widgets for Leap

Enabling Leap for On-Prem Site


To perform disaster recovery (DR) to Nutanix clusters at different on-prem availability zones
(sites), enable Leap at both the primary and recovery sites (Prism Central). Without enabling
Leap, you can configure protection policies and recovery plans that synchronize to the paired
sites, but you cannot perform failover and failback operations. To perform DR to different
Nutanix clusters at the same site, enable Leap in the single Prism Central.

About this task


To enable Leap, perform the following procedure.

Note: You cannot disable Leap once you have enabled it.

Procedure

1. Log on to the Prism Central web console.

2. Click the settings button (gear icon) at the top-right corner of the window.

3. Click Enable Leap in the Setup section on the left pane.

Figure 14: Enabling Leap

The Leap dialog box runs prechecks. If any precheck fails, resolve the issue that is causing the
failure and click Check Again.

4. Click Enable after all the prechecks pass.


Leap is enabled after a short delay of about 10 seconds.

Pairing Availability Zones (Leap)


To replicate entities (protection policies, recovery plans, and recovery points) to different on-
prem availability zones (sites) bidirectionally, pair the sites with each other. To replicate entities
to different Nutanix clusters at the same site bidirectionally, you need not pair the sites because
the primary and the recovery Nutanix clusters are registered to the same site (Prism Central).
Without pairing the sites, you cannot perform DR to a different site.

About this task


To pair an on-prem AZ with another on-prem AZ, perform the following procedure at either of
the on-prem AZs.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Administration >
Availability Zones in the left pane.

Figure 15: Pairing Availability Zone

3. Click Connect to Availability Zone.
Specify the following information in the Connect to Availability Zone window.

Figure 16: Connect to Availability Zone

a. Availability Zone Type: Select Physical Location from the drop-down list.
A physical location is an on-prem availability zone (site). To pair the on-prem site with
Xi Cloud Services, select XI from the drop-down list, and enter the credentials of your Xi
Cloud Services account in steps c and d.
b. IP Address for Remote PC: Enter the IP address of the recovery site Prism Central.
c. Username: Enter the username of your recovery site Prism Central.
d. Password: Enter the password of your recovery site Prism Central.

4. Click Connect.
The two on-prem AZs are now paired with each other.
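Pairing succeeds only when each Prism Central can reach the other over the network. Before pairing, you can sanity-check that the remote Prism Central answers on TCP port 9440 (the Prism web console port). The following is a minimal sketch, not a Nutanix tool:

```python
import socket

def prism_reachable(host, port=9440, timeout=3.0):
    """Return True if a TCP connection to the remote Prism Central succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run it with the remote PC IP address you plan to enter in step 3.b; a False result points to a firewall or routing problem rather than a credentials problem.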

Protection and Automated DR
Automated disaster recovery (DR) configurations use protection policies to protect your guest
VMs, and recovery plans to orchestrate the recovery of those guest VMs to different Nutanix
clusters at the same or different availability zones (sites). You can automate protection of your
guest VMs with the following supported replication schedules in Leap.

• Asynchronous replication schedule (1 hour or greater RPO). For information about protection
with Asynchronous replication schedule, see Protection with Asynchronous Replication
Schedule and DR (Leap) on page 36.
• NearSync replication schedule (1–15 minute RPO). For information about protection with
NearSync replication schedule, see Protection with NearSync Replication Schedule and DR
(Leap) on page 96.
• Synchronous replication schedule (0 RPO). For information about protection with
Synchronous replication schedule, see Protection with Synchronous Replication Schedule (0
RPO) and DR on page 112.
To maintain efficiency in protection and DR, Leap allows you to protect a guest VM with a
Synchronous replication schedule to only one AHV cluster, at a different on-prem
availability zone.
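The RPO ranges listed above map directly to a schedule type. The following is a minimal sketch of that mapping; the function name and return values are illustrative, not product identifiers:

```python
def schedule_type_for_rpo(rpo_minutes):
    """Map a desired RPO in minutes to the Leap replication schedule type,
    per the supported ranges documented above."""
    if rpo_minutes == 0:
        return "Synchronous"       # 0 RPO
    if 1 <= rpo_minutes <= 15:
        return "NearSync"          # 1-15 minute RPO
    if rpo_minutes >= 60:
        return "Asynchronous"      # 1 hour or greater RPO
    raise ValueError("No Leap schedule supports an RPO of %d minutes" % rpo_minutes)
```

Note that RPO values between 16 and 59 minutes fall outside all three documented ranges.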

Protection with Asynchronous Replication Schedule and DR (Leap)


Asynchronous replication schedules enable you to protect your guest VMs with an RPO of
1 hour or greater. A protection policy with an Asynchronous replication schedule creates a
recovery point at the specified hourly interval and replicates it to the recovery availability
zones (sites) for High Availability. For guest VMs protected with an Asynchronous replication
schedule, you can perform disaster recovery (DR) to different Nutanix clusters at the same or
different sites. In addition to performing DR to Nutanix clusters running the same hypervisor
type, you can also perform cross-hypervisor disaster recovery (CHDR)—DR from AHV clusters
to ESXi clusters, or from ESXi clusters to AHV clusters.

Note: Nutanix provides multiple DR solutions to protect your environment. See Nutanix
Disaster Recovery Solutions on page 11 for the detailed representation of the DR offerings of
Nutanix.

Asynchronous Replication Requirements (Leap)


The following are the specific requirements for protecting your guest VMs with an
Asynchronous replication schedule. Ensure that you meet these requirements in addition to
the general requirements of Leap.
For information about the general requirements of Leap, see Leap Requirements on page 18.
For information about node, disk and Foundation configurations required to support
Asynchronous replication schedules, see On-Prem Hardware Resource Requirements on
page 14.

Hypervisor Requirements
AHV or ESXi

• The AHV clusters must be running on AHV versions that come bundled with the supported
version of AOS.
• The ESXi clusters must be running on version ESXi 6.5 GA or newer.

Nutanix Software Requirements
Each on-prem site must have a Leap enabled Prism Central instance.
The primary and recovery Prism Central and Prism Element on the Nutanix clusters must be
running the following versions of AOS.

• AHV clusters

• AOS 5.17 or newer for DR to different Nutanix clusters at the same site.
• AOS 5.10 or newer for DR to Nutanix clusters at the different sites.
• ESXi clusters

• AOS 5.17 or newer for DR to different Nutanix clusters at the same site.
• AOS 5.11 or newer for DR to Nutanix clusters at the different sites.

Cross Hypervisor Disaster Recovery (CHDR) Requirements


Guest VMs protected with an Asynchronous replication schedule support cross-hypervisor
disaster recovery. You can perform failover (DR) to recover guest VMs from AHV clusters to
ESXi clusters, or from ESXi clusters to AHV clusters, provided you meet the following
requirements.

• Both the primary and the recovery Nutanix clusters must be running AOS 5.17 or newer for
CHDR to Nutanix clusters at the same availability zone (site).
• Both the primary and the recovery Nutanix clusters must be running AOS 5.11.2 or newer for
CHDR to Nutanix clusters at different availability zones (sites).
• Install and configure Nutanix Guest Tools (NGT) on all the guest VMs. For more information,
see Enabling and Mounting Nutanix Guest Tools in Prism Web Console Guide.
NGT configures the guest VMs with all the required drivers for VM portability. For more
information about general NGT requirements, see Nutanix Guest Tools Requirements and
Limitations in Prism Web Console Guide.
• CHDR supports guest VMs with flat files only.
• CHDR supports IDE/SCSI and SATA disks only.
• For all the non-boot SCSI disks of Windows guest VMs, set the SAN policy to OnlineAll so
that they come online automatically.
• In vSphere 6.7, guest VMs are configured with UEFI secure boot by default. Upon CHDR to
an AHV cluster, these guest VMs do not start if the host does not support the UEFI secure
boot feature. For more information about supportability of UEFI secure boot on Nutanix
clusters, see Compatibility Matrix.

• For information about operating systems that support UEFI and Secure Boot, see UEFI and
Secure Boot Support for CHDR on page 211.
• Nutanix does not support vSphere inventory mapping (for example, VM folder and resource
pools) when protecting workloads between VMware clusters.

• Nutanix does not support vSphere snapshots or delta disk files.


If you have delta disks attached to a guest VM and you proceed with failover, you get
a validation warning and the guest VM does not recover. Contact Nutanix Support for
assistance.
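For the SAN policy requirement above, one common approach is the Windows diskpart utility, whose `san policy=OnlineAll` command sets the policy. Below is a minimal sketch that only builds the diskpart script text; running it on each Windows guest (for example, with `diskpart /s <file>`) is up to you:

```python
def diskpart_san_script():
    """Build a diskpart script that sets the SAN policy to OnlineAll so
    non-boot SCSI disks come online automatically after recovery."""
    return "san policy=OnlineAll\nexit\n"
```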

Table 12: Operating Systems Supported for CHDR (Asynchronous Replication)

Operating System Version Requirements and limitations

Windows
• Windows 2008 R2 or newer versions
• Windows 7 or newer versions
Requirements and limitations: Only 64-bit operating systems are supported.

Linux
• CentOS 6.5 and 7.0
• RHEL 6.5 or newer and RHEL 7.0 or newer
• Oracle Linux 6.5 and 7.0
• Ubuntu 14.04
Requirements and limitations: SLES operating system is not supported.

Additional Requirement
The storage container name of the protected guest VMs must be the same on both the primary
and recovery clusters. Therefore, a storage container must exist on the recovery cluster with
the same name as the one on the primary cluster. For example, if the protected VMs are
in the SelfServiceContainer storage container on the primary cluster, there must also be a
SelfServiceContainer storage container on the recovery cluster.
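You can verify the container-name requirement with a simple set comparison. The following is a minimal sketch, assuming you have already collected the storage container names from each cluster (for example, through Prism):

```python
def missing_containers(primary_names, recovery_names):
    """Return the storage container names present on the primary cluster
    but absent (by name) on the recovery cluster."""
    return set(primary_names) - set(recovery_names)
```

An empty result means every protected VM's container has a same-named counterpart on the recovery cluster.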

Asynchronous Replication Limitations (Leap)


Consider the following specific limitations before protecting your guest VMs with an
Asynchronous replication schedule. These limitations are in addition to the general limitations of Leap.
For information about the general limitations of Leap, see Leap Limitations on page 22.

• You cannot restore guest VMs with incompatible GPUs at the recovery Nutanix cluster.
• You cannot protect guest VMs configured as part of a network function chain.
• You cannot retain hypervisor-specific properties after cross hypervisor disaster recovery
(CHDR).
CHDR does not preserve hypervisor-specific properties (for example, multi-writer flags,
independent persistent and non-persistent disks, changed block tracking (CBT), PVSCSI disk
configurations).

Creating a Protection Policy with an Asynchronous Replication Schedule (Leap)


To protect guest VMs with an hourly replication schedule, configure an Asynchronous
replication schedule while creating the protection policy. The policy takes recovery points of
those guest VMs at the specified (hourly) time intervals and replicates them to the recovery
availability zones (sites) for High Availability. The protection policy allows you to configure
Asynchronous replication schedules to at most two recovery sites, at the same or different
locations, with a unique replication schedule for each recovery site. The policy synchronizes
continuously to the recovery sites bidirectionally.

Before you begin


See Asynchronous Replication Requirements (Leap) on page 36 and Asynchronous
Replication Limitations (Leap) on page 38 before you start.

About this task
To create a protection policy with an Asynchronous replication schedule, do the following at
the primary site. You can also create a protection policy at the recovery site. Protection policies
you create or update at a recovery site synchronize back to the primary site.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Protection
Policies in the left pane.

Figure 17: Protection Policy Configuration: Protection Policies

3. Click Create Protection Policy.
Specify the following information in the Create Protection Policy window.

Figure 18: Protection Policy Configuration: Select Primary Location

a. Policy name: Enter a name for the protection policy.

Caution: The name can contain only alphanumeric, dot, dash, and underscore characters.

b. In the Primary Location pane, specify the following information.

1. Location: From the drop-down list, check the availability zone (site) that hosts the
guest VMs to protect.
The drop-down lists all the sites paired with the local site. Local AZ represents the
local site (Prism Central). For your primary site, you can check either the local site or
a non-local site.
2. Cluster: From the drop-down list, check the Nutanix cluster that hosts the guest
VMs to protect.
The drop-down lists all the Nutanix clusters registered to Prism Central representing
the selected site. If you want to protect the guest VMs from multiple Nutanix
clusters in the same protection policy, check the clusters that host those guest
VMs. All Clusters protects the guest VMs of all Nutanix clusters registered to Prism
Central.
3. Click Save.
Clicking Save activates the Recovery Location pane. After saving the primary
site configuration, you can optionally add a local schedule (step iv) to retain the
recovery points at the primary site.
4. Click + Add Local Schedule if you want to retain recovery points locally in addition
to retaining recovery points in a replication schedule (step d.iv). For example, you
can create a local schedule to retain 15 minute recovery points locally and also

an hourly replication schedule to retain recovery points and replicate them to a
recovery site every 2 hours. The two schedules apply differently on the guest VMs.
Specify the following information in the Add Schedule window.

Figure 19: Protection Policy Configuration: Add Local Schedule

1. Take Snapshot Every: Specify the frequency in minutes, hours, days, or weeks at
which you want the recovery points to be taken locally.
2. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at the local site. If you set the
retention number to n, the local site retains the n recent recovery points.
When you enter the frequency in minutes, the system selects the Roll-up
retention type by default because minutely recovery points do not support
Linear retention types.
• Roll-up: Rolls up the recovery points into a single recovery point at the local
site.
For more information about the roll-up recovery points, see step d.iii.
3. Retention on Local AZ:PE_A3_AHV: Specify the retention number for the local
site.
4. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Irrespective of the local or replication schedules, the recovery points are of the
specified type. If you check Take App-Consistent Recovery Point, the recovery
points generated are application-consistent and if you do not check Take App-
Consistent Recovery Point, the recovery points generated are crash-consistent.

If the time in the local schedule and the replication schedule match, the single
recovery point generated is application-consistent.

Note: See Application-consistent Recovery Point Conditions and Limitations on page 51
before you take an application-consistent snapshot.
5. Click Save Schedule.
c. In the Recovery Location pane, specify the following information.

Figure 20: Protection Policy Configuration: Select Recovery Location

1. Location: From the drop-down list, select the availability zone (site) where you want
to replicate the recovery points.
The drop-down lists all the sites paired with the local site. Local AZ represents the
local site (Prism Central). Select Local AZ if you want to configure DR to a different
Nutanix cluster at the same site.
If you do not select a site, local recovery points that are created by the protection
policy do not replicate automatically. You can, however, replicate the recovery
points manually and use recovery plans to recover the guest VMs. For more
information, see Manual Disaster Recovery (Leap) on page 137.
2. Cluster: From the drop-down list, select the Nutanix cluster where you want to
replicate the recovery points.
The drop-down lists all the Nutanix clusters registered to Prism Central representing
the selected site. You can select one cluster at the recovery site. If you want to
replicate the recovery points to more clusters at the same or different sites, add

another recovery site with a replication schedule. For more information to add
another recovery site with a replication schedule, see step e.

Note: Selecting auto-select from the drop-down list replicates the recovery points
to any available cluster at the recovery site. Select auto-select from the drop-down
list only if all the clusters at the recovery site are up and running.

Caution: If the primary Nutanix cluster contains an IBM POWER Systems server, you
can replicate recovery points to an on-prem site only if that on-prem site contains an
IBM Power Systems server.

3. Click Save.
Clicking Save activates the + Add Schedule button between the primary and the
recovery site. After saving the recovery site configuration, you can optionally add a
local schedule to retain the recovery points at the recovery site.
4. Click + Add Local Schedule if you want to retain recovery points locally in addition
to retaining recovery points in a replication schedule (step d.iv). For example,
you can create a local schedule that retains hourly recovery points locally to
supplement the hourly replication schedule. The two schedules apply differently on
the guest VMs after failover, when the recovery points replicate back to the primary
site.
Specify the following information in the Add Schedule window.

Figure 21: Protection Policy Configuration: Add Local Schedule

1. Take Snapshot Every: Specify the frequency in minutes, hours, days, or weeks at
which you want the recovery points to be taken locally.

2. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at the local site. If you set the
retention number to n, the local site retains the n recent recovery points.
When you enter the frequency in minutes, the system selects the Roll-up
retention type by default because minutely recovery points do not support
Linear retention types.
• Roll-up: Rolls up the recovery points into a single recovery point at the local
site.
For more information about the roll-up recovery points, see step d.iii.
3. Retention on 10.xx.xx.xxx:PE_C1_AHV: Specify the retention number for the local
site.
4. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Irrespective of the local or replication schedules, the recovery points are of the
specified type. If you check Take App-Consistent Recovery Point, the recovery
points generated are application-consistent and if you do not check Take App-
Consistent Recovery Point, the recovery points generated are crash-consistent.

If the time in the local schedule and the replication schedule match, the single
recovery point generated is application-consistent.

Note: See Application-consistent Recovery Point Conditions and Limitations on page 51
before you take an application-consistent snapshot.
5. Click Save Schedule.
d. Click + Add Schedule to add a replication schedule between the primary and the recovery
site.
Specify the following information in the Add Schedule window. The window auto-
populates the Primary Location and Recovery Location that you have selected in step b
and step c.

Figure 22: Protection Policy Configuration: Add Schedule (Asynchronous)

1. Protection Type: Click Asynchronous.


2. Take Snapshot Every: Specify the frequency in hours, days, or weeks at which you
want the recovery points to be taken.
The specified frequency is the RPO. For more information about RPO, see Leap
Terminology on page 8.
3. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at both the primary (local) and the
recovery (remote) site. If you set the retention number for a given site to n, that

site retains the n recent recovery points. For example, if the RPO is 1 hour, and
the retention number for the local site is 48, the local site retains 48 hours (48 X 1
hour) of recovery points at any given time.

Tip: Use linear retention policies for small RPO windows with shorter retention
periods or in cases where you always want to recover to a specific RPO window.

• Roll-up: Rolls up the recovery points as per the RPO and retention period into a
single recovery point at a site. For example, if you set the RPO to 1 hour, and the
retention time to 5 days, the 24 oldest hourly recovery points roll up into a single
daily recovery point (one recovery point = 24 hourly recovery points) after every
24 hours. The system keeps one day (of rolled-up hourly recovery points) and 4
days of daily recovery points.

Note:

• If the retention period is n days, the system keeps 1 day of RPO


(rolled-up hourly recovery points) and n-1 days of daily recovery
points.
• If the retention period is n weeks, the system keeps 1 day of RPO, 1
week of daily and n-1 weeks of weekly recovery points.
• If the retention period is n months, the system keeps 1 day of RPO, 1
week of daily, 1 month of weekly, and n-1 months of monthly recovery
points.
• If the retention period is n years, the system keeps 1 day of RPO, 1
week of daily, 1 month of weekly, and n-1 months of monthly recovery
points.

Note: The recovery points that are used to create a rolled-up recovery point are
discarded.

Tip: Use roll-up retention policies for anything with a longer retention period.
Roll-up policies are more flexible and automatically handle recovery point aging/
pruning while still providing granular RPOs for the first day.

4. To specify the retention number for the primary and recovery sites, do the following.

• Retention on Local AZ: PE_A3_AHV: Specify the retention number for the
primary site.
This field is unavailable if you do not specify a recovery location.
• Retention on 10.xx.xx.xxx:PE_C1_AHV: Specify the retention number for the
recovery site.
If you select linear retention, the remote and local retention count represents
the number of recovery points to retain at any given time. If you select roll-up
retention, these numbers specify the retention period.
5. If you want to enable reverse retention of the recovery points, check Reverse
retention for VMs on recovery location.

Note: Reverse retention for VMs on recovery location is available only when the
retention numbers on the primary and recovery sites are different.

Reverse retention maintains the retention numbers of recovery points even after
failover to a recovery site in the same or different availability zones. For example, if

you retain two recovery points at the primary site and three recovery points at the
recovery site, and you enable reverse retention, a failover event does not change the
initial retention numbers when the recovery points replicate back to the primary site.
The recovery site still retains two recovery points while the primary site retains three
recovery points. If you do not enable reverse retention, a failover event changes the
initial retention numbers when the recovery points replicate back to the primary site.
The recovery site retains three recovery points while the primary site retains two
recovery points.
Maintaining the same retention numbers at a recovery site is required if you want to
retain a particular number of recovery points, irrespective of where the guest VM is
after its failover.
6. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Application-consistent recovery points ensure that application consistency is
maintained in the replicated recovery points. For application-consistent recovery
points, install NGT on the guest VMs running on AHV clusters. For guest VMs
running on ESXi clusters, you can take application-consistent recovery points
without installing NGT, but the recovery points are hypervisor-based and lead to
VM stuns (temporarily unresponsive VMs) after failover to the recovery sites.

Note: See Application-consistent Recovery Point Conditions and Limitations on page 51
before you take an application-consistent snapshot.

Caution: Application-consistent recovery points fail for EFI boot-enabled Windows 2019
VMs running on ESXi when NGT is not installed. Nutanix recommends installing NGT on
guest VMs running on ESXi as well.

7. Click Save Schedule.
e. Click + Add Recovery Location at the top-right if you want to add an additional recovery
site for the guest VMs in the protection policy.

» To add an on-prem site for recovery, see Protection and DR between On-Prem Sites
(Leap) on page 17
» To add Xi Cloud Services for recovery, see Protection and DR between On-Prem Site
and Xi Cloud Service (Xi Leap) on page 141.

Figure 23: Protection Policy Configuration: Additional Recovery Location


f. Click + Add Schedule to add a replication schedule between the primary site and the
additional recovery site you specified in step e.
Perform step d again in the Add Schedule window to add the replication schedule. The
window auto-populates the Primary Location and the additional Recovery Location that
you have selected in step b and step c.
By default, recovery point creation begins immediately after you create the protection
policy. If you want to specify when recovery point creation must begin, click Immediately
at the top-right corner, and then, in the Start Time dialog box, do the following.

1. Click Start protection at specific point in time.


2. Specify the time at which you want to start taking recovery points.
3. Click Save.
g. Click Next.
Clicking Next shows a list of VM categories where you can optionally check one or more
VM categories to protect in the protection policy. DR configurations using Leap allow
you to protect a guest VM with only one protection policy. Therefore, VM categories
specified in another protection policy are not in the list. If you protect a guest VM in
another protection policy by specifying the VM category of the guest VM (category-
based inclusion), and you then protect the guest VM from the VMs page in this policy
(individual inclusion), the individual inclusion supersedes the category-based inclusion.

Effectively, only the protection policy that protected the individual guest VM protects the
guest VM.
For example, the guest VM VM_SherlockH is in the category Department:Admin, and
you add this category to the protection policy named PP_AdminVMs. Now, if you add
VM_SherlockH from the VMs page to another protection policy named PP_VMs_UK,
VM_SherlockH is protected in PP_VMs_UK and unprotected from PP_AdminVMs.
h. If you want to protect the guest VMs category-wise, check the VM categories that you
want to protect from the list and click Add.

Figure 24: Protection Policy Configuration: Add VM Categories

Prism Central includes built-in VM categories for frequently encountered applications (for
example, MS Exchange and Oracle). If the VM category or value you want is not available,
first create the category with the required values, or update an existing category so
that it has the values you require. Doing so ensures that the VM categories and values
are available for selection. You can add VMs to the category either before or after you
configure the protection policy. If the guest VMs have a common characteristic, such as
belonging to a specific application or location, create a VM category and add the guest
VMs into the category.
If you do not want to protect the guest VMs category-wise, proceed to the next step
without checking VM categories. You can add the guest VMs individually to the protection
policy later from the VMs page (see Adding Guest VMs individually to a Protection Policy
on page 128).
i. Click Create.
The protection policy with an Asynchronous replication schedule is created. To verify the
protection policy, see the Protection Policies page. If you checked VM categories in step
h, the protection policy starts generating recovery points of the guest VMs in those VM
categories. To see the generated recovery points, click the hamburger icon at the top-
left corner of the window and go to VM Recovery Points. Click a recovery point for its
information. You can see the time estimated for the very first replication (seeding) to the
recovery sites.

Figure 25: Recovery Points Overview
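The linear and roll-up retention rules described in step d reduce to simple arithmetic. This sketch restates the counting rules from this guide; the function names are illustrative:

```python
def linear_coverage_hours(rpo_hours, retention_count):
    """Linear retention keeps the n most recent recovery points, so a site
    covers rpo_hours * n hours of history at any given time."""
    return rpo_hours * retention_count

def rollup_daily_points(retention_days):
    """Roll-up retention for an n-day period keeps 1 day of RPO-frequency
    recovery points plus (n - 1) daily rolled-up recovery points."""
    return {"days_of_rpo_points": 1, "daily_points": retention_days - 1}
```

For example, a 1-hour RPO with a local linear retention number of 48 covers 48 hours of recovery points, and a 5-day roll-up keeps 1 day of hourly points plus 4 daily points.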

Application-consistent Recovery Point Conditions and Limitations

This topic describes the conditions and limitations for application-consistent recovery points
that you can generate through a protection policy. For information about the operating
systems that support the AOS version you have deployed, see the Compatibility Matrix.

• Before taking an application-consistent recovery point, consider the workload type of your
guest VM.
Applications running in your guest VM must be able to quiesce I/O operations. For example,
you can quiesce I/O operations for database applications and similar workload types.
• Before taking an application-consistent recovery point, install and enable Nutanix Guest
Tools (NGT) on your guest VM.
For installing and enabling NGT, see Nutanix Guest Tools in the Prism Web Console Guide.
For guest VMs running on ESXi, consider these points.
• Install and enable NGT on guest VMs running on ESXi also. Application-consistent recovery
points fail for EFI boot-enabled Windows 2019 VMs running on ESXi without installing NGT.

• (vSphere) If you do not enable NGT and then try to take an application-consistent recovery
point, the system creates a Nutanix native recovery point with a single vSphere host-based
recovery point. The system deletes the vSphere host-based recovery point. If you enable
NGT and then try to take application-consistent recovery point, the system directly captures
a Nutanix native recovery point.
• Do not delete the .snapshot folder in the vCenter.

• The following table lists the operating systems that support application-consistent recovery
points with NGT installed.

Table 13: Supported Operating Systems (NGT Installed)

Operating system Version

Windows
• Windows 2008 R2 through Windows 2019

Linux
• CentOS 6.5 through 6.9 and 7.0 through 7.3
• Red Hat Enterprise Linux (RHEL) 6.5 through 6.9
and 7.0 through 7.3.
• Oracle Linux 6.5 and 7.0
• SUSE Linux Enterprise Server (SLES) 11 SP1
through 11 SP4 and 12 SP1 through 12 SP3
• Ubuntu 14.04

Application-consistent Recovery Points with Microsoft Volume Shadow Copy Service (VSS)

• To take application-consistent recovery points on Windows guest VMs, enable Microsoft VSS services.
When you configure a protection policy and select Take App-Consistent Recovery Point, the Nutanix cluster transparently invokes VSS (also known as Shadow Copy or Volume Snapshot Service).

Note: This option is available for ESXi and AHV only. However, you can use third-party backup products to invoke VSS for Hyper-V.

• To take application-consistent recovery points on guest VMs that use VSS, the system invokes the Nutanix native in-guest VmQuiesced Snapshot Service (VSS) agent. The VSS framework takes application-consistent recovery points without causing VM stuns (temporarily unresponsive VMs).
• The VSS framework enables third-party backup providers such as Commvault and Rubrik to take application-consistent snapshots on the Nutanix platform in a hypervisor-agnostic manner.
• The default and only backup type for Nutanix VSS snapshots is VSS_BT_COPY (copy backup). Third-party backup products can choose between the VSS_BT_FULL (full backup) and VSS_BT_COPY (copy backup) backup types.
• Guest VMs with delta, SATA, and IDE disks do not support Nutanix VSS recovery points.
• Guest VMs with iSCSI attachments (LUNs) do not support Nutanix VSS recovery points. Nutanix VSS recovery points fail for such guest VMs.
• Do not take Nutanix application-consistent recovery points while using VSS snapshots enabled by a third-party backup provider (for example, Veeam).
Pre-freeze and Post-thaw Scripts

• You can take application-consistent recovery points on NGT and Volume Shadow Copy Service (VSS) enabled guest VMs. However, some applications require more steps before or after the VSS operations to fully quiesce the guest VM to an appropriate restore point or state in which the system can capture a recovery point. Such applications need pre-freeze and post-thaw scripts to run the necessary extra steps.
• Any operation that the system must perform on a guest VM before replication or a recovery point capture is a pre-freeze operation. For example, if a guest VM hosts a database, you can enable hot backup of the database before replication using a pre-freeze script. Similarly, any operation that the system must perform on a guest VM after replication or a recovery point capture is a post-thaw operation.

Tip: Vendors such as Commvault provide pre-freeze and post-thaw scripts. You can also write your own pre-freeze and post-thaw scripts.
Script Requirements

• For Windows VMs, you must be an administrator and have read, write, and execute permissions on the scripts.
• For Linux VMs, you must have root ownership and root access with 700 permissions on the scripts.
• For completion of any operation before or after replication or recovery point capture, you must have both the pre_freeze and post_thaw scripts for the operation.
• The timeout for both scripts is 60 seconds.
• A script must return 0 to indicate a successful run. A non-zero return value implies that the script execution failed. The necessary log entries are available in the NGT logs.

Tip: (AHV) For a non-zero return value from the pre-freeze script, the system captures a non-application-consistent snapshot and raises an alert on the Prism web console. Similarly, for a non-zero return value from the post-thaw script, the system attempts to capture an application-consistent snapshot once again. If the attempt fails, the system captures a non-application-consistent snapshot and raises an alert on the Prism web console.

• Irrespective of whether the pre-freeze script execution is successful, the corresponding post-thaw script runs.
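The exit-code and timeout requirements above can be sketched as a minimal pre_freeze shell script. This is only an illustrative skeleton, not the NGT-shipped implementation: quiesce_app is a hypothetical placeholder for your application-specific quiesce command, and LOG defaults to /tmp/pre_freeze.log here so the sketch can run outside a guest VM.

```shell
#!/bin/sh
# Minimal pre_freeze sketch: quiesce the application, then report the
# result through the exit status that NGT checks (0 = success, non-zero
# = failure). The whole script must finish within the 60-second timeout.
# quiesce_app is a hypothetical placeholder for a real quiesce step.

LOG="${LOG:-/tmp/pre_freeze.log}"

quiesce_app() {
    # A real script would run an application-specific command here,
    # for example a database "flush tables with read lock".
    true
}

date >> "$LOG"
if quiesce_app >> "$LOG" 2>&1; then
    echo "application quiesced" >> "$LOG"
    status=0   # success: application-consistent capture proceeds
else
    echo "quiesce failed" >> "$LOG"
    status=1   # failure: NGT logs the error and the capture falls back
fi
echo "pre_freeze finished with status $status"
# The installed script would end with: exit "$status"
```

A matching post_thaw script would undo the quiesce step and follow the same exit-code convention.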
Script Location

You can define Python or shell scripts, or any executable or batch files, at the following locations in Linux or Windows VMs. The scripts can contain commands and routines necessary to run specific operations on one or more applications.

• In Windows VMs,

  • Batch script file path for pre_freeze scripts:
    C:\Program Files\Nutanix\Scripts\pre_freeze.bat
  • Batch script file path for post_thaw scripts:
    C:\Program Files\Nutanix\Scripts\post_thaw.bat
• In Linux VMs,

  • Shell script file path for pre_freeze scripts:
    /usr/local/sbin/pre_freeze
  • Shell script file path for post_thaw scripts:
    /usr/local/sbin/post_thaw

  The script files are named pre_freeze and post_thaw, without an extension.

Note: The scripts must have root ownership and root access with 700 permissions.
Script Sample

Note: The following are only sample scripts and must be modified to fit your deployment.

• For Linux VMs

#!/bin/sh
#pre_freeze-script
date >> '/scripts/pre_root.log'
echo -e "\n attempting to run pre_freeze script for MySQL as root user\n" >> /scripts/pre_root.log
if [ "$(id -u)" -eq "0" ]; then
  python '/scripts/quiesce.py' &
  echo -e "\n executing query flush tables with read lock to quiesce the database\n" >> /scripts/pre_freeze.log
  echo -e "\n Database is in quiesce mode now\n" >> /scripts/pre_freeze.log
else
  date >> '/scripts/pre_root.log'
  echo -e "not root user\n" >> '/scripts/pre_root.log'
fi

#!/bin/sh
#post_thaw-script
date >> '/scripts/post_root.log'
echo -e "\n attempting to run post_thaw script for MySQL as root user\n" >> /scripts/post_root.log
if [ "$(id -u)" -eq "0" ]; then
  python '/scripts/unquiesce.py'
else
  date >> '/scripts/post_root.log'
  echo -e "not root user\n" >> '/scripts/post_root.log'
fi
• For Windows VMs

@echo off
echo Running pre_freeze script >C:\Progra~1\Nutanix\script\pre_freeze_log.txt

@echo off
echo Running post_thaw script >C:\Progra~1\Nutanix\script\post_thaw_log.txt

Note: If any of these scripts prints excessive output to the console session, the script freezes. To avoid a script freeze, do the following.

• Add @echo off to your scripts.
• Redirect the script output to a log file.
If you receive a non-zero return code from the pre-freeze script, the system captures a non-application-consistent recovery point and raises an alert on the Prism web console. If you receive a non-zero return code from the post-thaw script, the system attempts to capture an application-consistent snapshot once again. If that attempt fails, the system captures a non-application-consistent snapshot and raises an alert on the Prism web console.

Applications supporting application-consistent recovery points without scripts

Only the following applications support application-consistent recovery points without pre-freeze and post-thaw scripts.

• Microsoft SQL Server 2008, 2012, 2016, and 2019


• Microsoft Exchange 2010
• Microsoft Exchange 2013
• Microsoft Exchange 2016

• Nutanix does not support application-consistent recovery points on Windows VMs that have
mounted VHDX disks.
• The system captures hypervisor-based recovery points only when you have VMware Tools
running on the guest VM and the guest VM does not have any independent disks attached to
it.
If these requirements are not met, the system captures crash-consistent snapshots.

• The following table provides detailed information on whether a recovery point is application-consistent or not, depending on the operating systems and hypervisors running in your environment.

Note: Installed and active means that the guest VM has the following.

• NGT installed.
• VSS capability enabled.
• Powered on.
• Actively communicating with the CVM.
Table 14: Application-consistent Recovery Points

Microsoft Windows Server edition
• NGT installed and active, with pre-freeze and post-thaw scripts present:
  ESXi: Nutanix script-based VSS snapshots. AHV: Nutanix script-based VSS snapshots.
• NGT installed and active:
  ESXi: Nutanix VSS-enabled snapshots. AHV: Nutanix VSS-enabled snapshots.
• NGT not enabled:
  ESXi: hypervisor-based application-consistent or crash-consistent snapshots. AHV: crash-consistent snapshots.

Microsoft Windows Client edition
• NGT installed and active, with pre-freeze and post-thaw scripts present:
  ESXi: Nutanix script-based VSS snapshots. AHV: Nutanix script-based VSS snapshots.
• NGT not enabled:
  ESXi: hypervisor-based or crash-consistent snapshots. AHV: crash-consistent snapshots.

Linux VMs
• NGT installed and active, with pre-freeze and post-thaw scripts present:
  ESXi: Nutanix script-based VSS snapshots. AHV: Nutanix script-based VSS snapshots.
• NGT not enabled:
  ESXi: hypervisor-based or crash-consistent snapshots. AHV: crash-consistent snapshots.
Creating a Recovery Plan (Leap)

To orchestrate the failover (disaster recovery) of the protected guest VMs to the recovery site, create a recovery plan. After a failover, the recovery plan recovers the protected guest VMs at the recovery site. If you have configured two on-prem recovery sites in a protection policy, create two recovery plans for DR—one for recovery to each recovery site. The recovery plan synchronizes continuously to the recovery site in a bidirectional way.

About this task

To create a recovery plan, do the following at the primary site. You can also create a recovery plan at a recovery site; the recovery plan you create or update at a recovery site synchronizes back to the primary site.
Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery Plans in the left pane.

Figure 26: Recovery Plan Configuration: Recovery Plans
3. Click Create Recovery Plan.
Specify the following information in the Create Recovery Plan window.

Figure 27: Recovery Plan Configuration: General

4. In the General tab, enter the Recovery Plan Name, Recovery Plan Description, Primary Location, and Recovery Location, and click Next.
From the Primary Location and Recovery Location drop-down lists, you can select either the local availability zone (site) or a non-local site to serve as your primary and recovery sites respectively. Local AZ represents the local site (Prism Central). If you are configuring a recovery plan to recover the protected guest VMs to another Nutanix cluster at the same site, select Local AZ from both the Primary Location and Recovery Location drop-down lists.
5. In the Power On Sequence tab, click + Add Entities to add the guest VMs to the start sequence.

Figure 28: Recovery Plan Configuration: Add Entities

a. From the Search Entities by drop-down list, select VM Name to specify guest VMs by name.
b. From the Search Entities by drop-down list, select Category to specify guest VMs by category.
c. To add the guest VMs or VM categories to the stage, select the VMs or VM categories from the list.

Note: The VMs listed in the search result are in the active state of replication.

d. Click Add.
The selected guest VMs are added to the start sequence in a single stage by default. You can also create multiple stages to add guest VMs and define the order of their power-on sequence. For more information about stages, see Stage Management on page 64.

Caution: Do not include guest VMs protected with Asynchronous, NearSync, and Synchronous replication schedules in the same recovery plan. You can include guest VMs protected with Asynchronous or NearSync replication schedules in the same recovery plan. However, if you combine these guest VMs with guest VMs protected by Synchronous replication schedules in a recovery plan, the recovery fails.
e. To automate in-guest script execution on the guest VMs during recovery, select the individual guest VMs or VM categories in the stage and click Manage Scripts.

Note: In-guest scripts allow you to automate various task executions upon recovery of the guest VMs. For example, in-guest scripts can help automate the tasks in the following scenarios.

• After recovery, the guest VMs must use new DNS IP addresses and also connect to a new database server that is already running at the recovery site. Traditionally, to achieve this new configuration, you would manually log on to the recovered VM and modify the relevant files. With in-guest scripts, you write a script to automate the required steps and enable the script when you configure a recovery plan. The recovery plan execution automatically invokes the script, reassigns the DNS IP address, and reconnects to the database server at the recovery site.
• If guest VMs are part of domain controller siteA.com at the primary site AZ1, and after the guest VMs recover at the site AZ2, you want to add the recovered guest VMs to the domain controller siteB.com. Traditionally, to reconfigure, you would manually log on to the VM, remove the VM from the existing domain controller, and then add the VM to a new domain controller. With in-guest scripts, you can automate the task of changing the domain controller.

Note: In-guest script execution requires NGT version 1.9 or newer installed on the VM. The in-guest scripts run as a part of the recovery plan only if they have executable permissions for the following.

• Administrator user (Windows)
• Root user (Linux)
Note: You can define a batch or shell script that executes automatically in the guest VMs after their disaster recovery. Place two scripts, one for production failover and the other for test failover, at the following locations in the guest VMs with the specified name.

• In Windows VMs,
  • Batch script file path for production failover:
    C:\Program Files\Nutanix\scripts\production\vm_recovery
  • Batch script file path for test failover:
    C:\Program Files\Nutanix\scripts\test\vm_recovery
• In Linux VMs,
  • Shell script file path for production failover:
    /usr/local/sbin/production_vm_recovery
  • Shell script file path for test failover:
    /usr/local/sbin/test_vm_recovery

Note: When an in-guest script runs successfully, it returns code 0. Any non-zero return code signifies that the execution of the in-guest script was unsuccessful.
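As an illustration of the Linux production-failover path above, an in-guest script might repoint a recovered VM at a DNS server that exists only at the recovery site. Everything here is a hypothetical sketch: RESOLV_CONF defaults to /tmp/resolv.conf so the example can run anywhere, and 10.1.1.53 is a placeholder address. A real /usr/local/sbin/production_vm_recovery would write /etc/resolv.conf and end with an explicit exit code.

```shell
#!/bin/sh
# Hypothetical production_vm_recovery sketch: after recovery, point the
# guest VM at the DNS server available at the recovery site.
# RESOLV_CONF and the 10.1.1.53 nameserver are placeholder values.

RESOLV_CONF="${RESOLV_CONF:-/tmp/resolv.conf}"

if echo "nameserver 10.1.1.53" > "$RESOLV_CONF"; then
    status=0   # 0: the recovery plan records the script run as successful
else
    status=1   # non-zero: the recovery plan reports the script run as failed
fi
echo "production_vm_recovery finished with status $status"
# The installed script would end with: exit "$status"
```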
Figure 29: Recovery Plan Configuration: In-guest Script Execution

1. To enable script execution, click Enable.
A command prompt icon appears against the guest VMs or VM categories to indicate that in-guest script execution is enabled on those guest VMs or VM categories.
2. To disable script execution, click Disable.
6. In the Network Settings tab, map networks in the primary cluster to networks at the recovery cluster.

Figure 30: Recovery Plan Configuration: Network Settings

Network mapping enables replicating the network configuration of the primary Nutanix cluster to the recovery Nutanix cluster, and recovering guest VMs into the same subnet at the recovery Nutanix cluster. For example, if a guest VM is in the vlan0 subnet at the primary Nutanix cluster, you can configure the network mapping to recover that guest VM in the same vlan0 subnet at the recovery Nutanix cluster. To specify the source (primary Nutanix cluster) and destination (recovery Nutanix cluster) network information for network mapping, do the following in the Local AZ (Primary) and PC 10.xx.xx.xxx (Recovery) panes.

a. Under Production, in the Virtual Network or Port Group drop-down list, select the production subnet that contains the protected guest VMs. (Optional) If the virtual network is a non-IPAM network, specify the gateway IP address and prefix length in the Gateway IP/Prefix Length field.
b. Under Test Failback, in the Virtual Network or Port Group drop-down list, select the test subnet that you want to use for testing failback from the recovery Nutanix cluster. (Optional) If the virtual network is a non-IPAM network, specify the gateway IP address and prefix length in the Gateway IP/Prefix Length field.
c. To add more network mappings, click Add Networks at the top-right corner of the page, and then repeat steps 6.a-6.b.

Note: The primary and recovery Nutanix clusters must have identical gateway IP addresses and prefix lengths. Therefore, you cannot use a test failover network for two or more network mappings in the same recovery plan.

d. Click Done.

Note: For ESXi, you can configure network mapping for both standard and distributed (DVS) port groups. For more information about DVS, see the VMware documentation.

Caution: Leap does not support VMware NSX-T datacenters. For more information about NSX-T datacenters, see the VMware documentation.
7. To perform VM-level static IP address mapping between the primary and the recovery sites, click Advanced Settings, click Custom IP Mapping, and then do the following.

Note: Custom IP Mapping shows all the guest VMs that have a static IP address configured, NGT installed, and a VNIC in the source subnet specified in the network mapping.

a. To locate the guest VM, type the name of the guest VM in the filter field.
A guest VM that has multiple NICs is listed in multiple rows, allowing you to specify an IP address mapping for each VNIC. All the fields auto-populate with the IP addresses generated based on the offset IP address-mapping scheme.
b. Edit the IP addresses in the Test Failback field for the local site, the Production field for the remote (recovery) site, and the Test Failover field for the remote site.
Perform this step for all the IP addresses that you want to map.

Caution: Do not edit the IP address assigned to the VNIC in the local site. If you do not want to map static IP addresses for a particular VNIC, you can proceed with the default entries.

c. Click Save.
d. If you want to edit one or more VM-level static IP address mappings, click Edit, and then change the IP address mapping.

8. If VM-level static IP address mapping is configured between the primary and the recovery Nutanix clusters and you want to use the default, offset-based IP address-mapping scheme, click Reset to Matching IP Offset.
9. Click Done.
The recovery plan is created. To verify the recovery plan, see the Recovery Plans page. You can modify the recovery plan to change the recovery location or to add or remove the protected guest VMs. For information about the various operations that you can perform on a recovery plan, see Recovery Plan Management on page 134.
Stage Management

A stage defines the order in which the protected guest VMs start at the recovery cluster. You can create multiple stages to prioritize the start sequence of the guest VMs. In the Power On Sequence, the VMs in a preceding stage start before the VMs in the succeeding stages. On recovery, it is desirable to start some VMs before others. For example, database VMs must start before the application VMs: in the Power On Sequence, place all the database VMs in a stage before the stage containing the application VMs.
Figure 31: Recovery Plan Configuration: Power On Sequence

To Add a Stage in the Power-On Sequence and Add Guest VMs to It, Do the Following.
1. Click +Add New Stage.
2. Click +Add Entities.
3. To add guest VMs to the current stage in the power-on sequence, do the following.
   a. From the Search Entities by drop-down list, select VM Name to specify guest VMs by name.
   b. From the Search Entities by drop-down list, select Category to specify guest VMs by category.
   c. To add the guest VMs or VM categories to the stage, select the guest VMs or VM categories from the list.

   Note: The VMs listed in the search result are in the active state of replication.

4. Click Add.
To Remove a Stage from the Power-On Sequence, Do the Following.

Click Actions > Remove Stage.

Note: You see Actions in a stage only when none of the VMs in the stage are selected. When one or more VMs in the stage are selected, you see More Actions.

To Change the Position of a Stage in the Power-On Sequence, Do the Following.

• To move a stage up or down in the power-on sequence, click the up arrow or down arrow respectively, at the top-right corner of the stage.
• To expand or collapse a stage, click + or - respectively, at the top-right corner of the stage.
• To move VMs to a different stage, select the VMs, and do the following.
  1. Click More Actions > Move.
  2. Select the target stage from the list.

Note: You see Move in More Actions only when you have defined two or more stages.
To Set a Delay Between the Power-On Sequence of Two Stages, Do the Following.
1. Click +Add Delay.
2. Enter the time in seconds.
3. Click Add.

To Add Guest VMs to an Existing Stage, Do the Following.

1. Click Actions > Add Entities.

Note: You see Actions in a stage only when none of the VMs in the stage are selected. When one or more VMs in the stage are selected, you see More Actions.

2. To add VMs to the current stage in the power-on sequence, do the following.
   a. From the Search Entities by drop-down list, select VM Name to specify guest VMs by name.
   b. From the Search Entities by drop-down list, select Category to specify guest VMs by category.
   c. To add the guest VMs or VM categories to the stage, select the guest VMs or VM categories from the list.

   Note: The VMs listed in the search result are in the active state of replication.

3. Click Add.

To Remove Guest VMs from an Existing Stage, Do the Following.

1. Select the VMs from the stage.
2. Click More Actions > Remove.

Note: You see More Actions in a stage only when one or more VMs in the stage are selected. When none of the VMs in the stage are selected, you see Actions.

To Move Guest VMs to a Different Stage, Do the Following.

1. Select the VMs from the stage.
2. Click More Actions > Move.
3. Select the target stage from the list.

Note: You see More Actions in a stage only when one or more VMs in the stage are selected. When none of the VMs in the stage are selected, you see Actions.
Failover and Failback Management

You perform a failover of the protected guest VMs when unplanned failure events (for example, natural disasters) or planned events (for example, scheduled maintenance) happen at the primary availability zone (site) or the primary cluster. The protected guest VMs migrate to the recovery site, where you perform the failover operations. On recovery, the protected guest VMs start in the Nutanix cluster you specify in the recovery plan that orchestrates the failover.
The following are the types of failover operations.
Test failover
To ensure that the protected guest VMs fail over efficiently to the recovery site, you perform a test failover. When you perform a test failover, the guest VMs recover in the virtual network designated for testing purposes at the recovery site. However, the guest VMs at the primary site are not affected. Test failovers rely on the presence of VM recovery points at the recovery sites.
Planned failover
To ensure VM availability when you foresee service disruption at the primary site, you
perform a planned failover to the recovery site. For a planned failover to succeed, the
guest VMs must be available at the primary site. When you perform a planned failover,
the recovery plan first creates a recovery point of the protected guest VM, replicates the
recovery point to the recovery site, and then starts the guest VM at the recovery site.
The recovery point used for migration is retained indefinitely. After a planned failover, the
guest VMs no longer run at the primary site.
Unplanned failover
To ensure VM availability when a disaster causing service disruption occurs at the
primary site, you perform an unplanned failover to the recovery site. In an unplanned
failover, you can expect some data loss to occur. The maximum data loss possible
is equal to the least RPO you specify in the protection policy, or the data that was
generated after the last manual recovery point for a given guest VM. In an unplanned
failover, by default, the protected guest VMs recover from the most recent recovery
point. However, you can recover from an earlier recovery point by selecting a date and
time of the recovery point.
At the recovery site, the guest VMs can recover using the recovery points replicated from
the primary site only. The guest VMs cannot recover using the local recovery points. For
example, if you perform an unplanned failover from the primary site AZ1 to the recovery
site AZ2, the guest VMs recover at AZ2 using the recovery points replicated from AZ1 to
AZ2.
You can perform a planned or an unplanned failover in different scenarios of network failure.
For more information about network failure scenarios, see Leap and Xi Leap Failover Scenarios
on page 67.
At the recovery site after a failover, the recovery plan creates only the VM category that was used to include the guest VM in the recovery plan. Manually create the remaining VM categories at the recovery site and associate the guest VMs with those categories.
The recovered guest VMs continue to generate recovery points as per the replication schedule that protects them. The recovery points replicate back to the primary site when the primary site starts functioning. This reverse replication enables you to perform failover of the guest VMs from the recovery site back to the primary site (failback). The same recovery plan applies to both the failover and the failback operations. The difference is that for failover, you perform the failover operations on the recovery plan at the recovery site, while for failback, you perform them on the recovery plan at the primary site. For example, if a guest VM fails over from AZ1 (Local) to AZ2, the failback fails over the same VM from AZ2 (Local) back to AZ1.
Leap and Xi Leap Failover Scenarios

You have the flexibility to perform a real or simulated failover for full and partial workloads (with or without networking). The term virtual network is used differently on on-prem clusters and in Xi Cloud Services. In Xi Cloud Services, the term virtual network describes the two built-in virtual networks—production and test. Virtual networks on the on-prem clusters are virtual subnets bound to a single VLAN. Manually create these virtual subnets, and create separate virtual subnets for production and test purposes. Create these virtual subnets before you configure recovery plans. When configuring a recovery plan, you map the virtual subnets at the primary site to the virtual subnets at the recovery site.
Figure 32: Failover in Network Mapping

The following are the various scenarios that you can encounter in Leap configurations for disaster recovery (DR) to an on-prem availability zone (site) or to Xi Cloud (Xi Leap). Each scenario is explained with the required network-mapping configuration for Xi Leap. However, the configuration remains the same irrespective of whether you perform disaster recovery (DR) using Leap or Xi Leap. You can either create a recovery plan with the following network mappings (see Creating a Recovery Plan (Leap) on page 56) or update an existing recovery plan with the following network mappings (see Updating a Recovery Plan on page 136).
Scenario 1: Leap Failover (Full Network Failover)

Full network failure is the most common scenario. In this case, it is desirable to bring up the whole primary site in the Xi Cloud. All the subnets must fail over, and the WAN IP address must change from the on-prem IP address to the Xi WAN IP address. Floating IP addresses can be assigned to individual guest VMs; otherwise, everything uses Xi network address translation (NAT) for external communication.
Perform the failover when the on-prem subnets are down and the jump host is available on the public Internet through the floating IP address of the Xi production network.
Figure 33: Full Network Failover

To set up the recovery plan that orchestrates the full network failover, perform the following.
1. Open the Network Settings page to configure network mappings in a recovery plan.
2. Select the Local AZ > Production > Virtual Network or Port Group.
The selection auto-populates the Xi production and test failover subnets.
3. Select the Outbound Internet Access switch to allow the use of Xi NAT for Internet access.
4. Dynamically assign the floating IP addresses to the guest VMs you select in the recovery plan.
Perform steps 1–4 for every subnet.

Figure 34: Recovery Plan Configuration: Network Settings
Scenario 2: Xi Network Failover (Partial Network Failover)

You want to fail over one or more subnets from the primary site to Xi Cloud. The communications between the sites happen through the VPN or using the external NAT or floating IP addresses. A use case for this type of scenario is that the primary site needs maintenance, but some of its subnets must see no downtime.
Perform a partial failover when some subnets are active in the production networks at both on-prem and Xi Cloud, and the jump host is available on the public Internet through the floating IP address of the Xi production network.
On-prem guest VMs can connect to the guest VMs on the Xi Cloud Services.

Figure 35: Partial Network Failover

To set up the recovery plan that orchestrates the partial network failover, perform the following.
1. Open the Network Settings page to configure network mappings in a recovery plan.
2. Select the Local AZ > Production > Virtual Network or Port Group.
The selection auto-populates the Xi production and test failover subnets.
3. Select the Outbound Internet Access switch to allow the use of Xi NAT for Internet access.
4. Dynamically assign the floating IP addresses to the guest VMs you select in the recovery plan.
Perform steps 1–4 for one or more subnets based on the maintenance plan.

Figure 36: Recovery Plan Configuration: Network Settings
Scenario 3: Xi Network Failover (Partial Subnet Network Failover)
You want to failover some guest VMs to Xi Cloud, while keeping the other guest VMs up and
running at the on-prem cluster (primary site). A use case of this type of scenario is that the
primary site needs maintenance, but some of its guest VMs must see no downtime.
This scenario requires changing IP addresses for the guest VMs running at Xi Cloud. Since
you cannot have the subnet active on both the sites, create a subnet to host the failed over
guest VMs. Jump the host available on the public Internet through the floating IP address of Xi
production network.
On-prem guest VMs can connect to the guest VMs on the Xi Cloud Services.

Figure 37: Partial Subnet Network Failover

To set up the recovery plan that orchestrates the partial subnet network failover, perform the
following.
1. Open the Network Settings page to configure network mappings in a recovery plan.
2. Select the Local AZ > Production > Virtual Network or Port Group.
The selection auto-populates the Xi production and test failover subnets for a full subnet
failover.

Note: In this case, you have also created subnets on Xi Cloud Services. Choose those
subnets to avoid a full subnet failover (scenario 1).

3. Select the Outbound Internet Access switch to allow the guest VMs to use the Xi NAT for Internet access.
4. Dynamically assign the floating IP addresses to the guest VMs you select in the recovery
plan.
Perform steps 1–4 for one or more subnets based on the maintenance plan.

Figure 38: Recovery Plan Configuration: Network Settings

Scenario 4: Xi Network Failover (Test Failover and Failback)


You want to test all the preceding three scenarios by creating an isolated test network so that
no routing or IP address conflicts occur. All the guest VMs are cloned from a local recovery
point and brought up to test the failover operations. Test the failover when all on-prem subnets
are active and on-prem guest VMs can connect to the guest VMs at Xi Cloud. Reach the jump
host available on the public Internet through the floating IP address of the Xi production network.

Figure 39: Test Failover & Failback

In this case, focus on the test failover section when creating the recovery plan. When you select
a local AZ production subnet, the selection is copied to the test network. You can go one step
further and create a test subnet at Xi Cloud.

Figure 40: Recovery Plan Configuration: Network Settings

After the test failover of the guest VMs to Xi Cloud, you can perform a test failback to the primary site.

Note: Make a test subnet in advance for the failback to the on-prem cluster.

Figure 41: Recovery Plan Configuration: Network Settings

Failover and Failback Operations

You can perform test, planned, and unplanned failovers of the guest VMs protected with an
Asynchronous replication schedule across different Nutanix clusters at the same or different
on-prem availability zones (sites). The steps to perform test, planned, and unplanned failovers
are largely the same irrespective of the replication schedules that protect the guest VMs.

Performing a Test Failover (Leap)

After you create a recovery plan, you can run a test failover periodically to ensure that the
failover occurs smoothly when required. To perform a test failover, do the following procedure
at the recovery site. If you have two recovery sites for DR, perform the test at the site where
you want to recover the guest VMs.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Select the recovery plan that you want to test.

4. Click Test from the Actions drop-down menu.

Figure 42: Test Failover (Drop-down)

The Test Recovery Plan window appears. The window auto-populates the Failover From and
Failover To locations from the recovery plan you selected in step 3. The Failover To location is
Local AZ by default and is unavailable for editing.

Figure 43: Test Recovery Plan

5. Click + Add target clusters if you want to fail over to specific Nutanix clusters at the recovery
site.
If you do not add target clusters, the recovery plan recovers the guest VMs to any eligible
cluster at the recovery site.

6. Click Test.
The Test Recovery Plan dialog box lists the errors and warnings, if any, and allows you to
stop or continue the test operation. If there are no errors, or after you resolve the errors in
step 7, the guest VMs fail over to the recovery cluster.

7. If you see errors, do the following.

• To review errors or warnings, click View Details in the description.


Resolve the error conditions and then restart the test procedure.
• Select one of the following.

• To stop the failover operation, click Abort.


• To continue the failover operation despite the warnings, click Execute Anyway.

Note: You cannot continue the failover operation when the validation fails with errors.

Cleaning up Test VMs (Leap)

After testing a recovery plan, you can remove the test VMs that the recovery plan creates in the
recovery test network. To clean up the test VMs, do the following at the recovery site where the
test failover created the test VMs.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Select the recovery plans whose test VMs you want to remove.

4. Click Clean Up Test VMs from the Actions drop-down menu.


The Clean Up Test VMs dialog box appears with the name of the recovery plan you selected in
step 3.

5. Click Clean.

Figure 44: Clean Up Test VMs

Performing a Planned Failover (Leap)

If there is a planned event (for example, scheduled maintenance of guest VMs) at the primary
availability zone (site), perform a planned failover to the recovery site. To perform a planned
failover, do the following procedure at the recovery site. If you have two recovery sites for DR,
perform the failover at the site where you want to recover the guest VMs.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Select a recovery plan for the failover operation.

4. Click Failover from the Actions drop-down menu.

Note: If you select more than one recovery plan in step 3, the Failover action is available only
when the selected recovery plans have the same primary and recovery locations.

Specify the following information in the Failover from Recovery Plan window. The window
auto-populates the Failover From and Failover To locations from the recovery plan you
select in step 3.

Figure 45: Planned Failover

a. Failover Type: Click Planned Failover.

Warning: Do not check Live Migrate VMs. Live migration works only for the planned
failover of the guest VMs protected in a Synchronous replication schedule. If you check
Live Migrate VMs for the planned failover of the guest VMs protected in an Asynchronous or
NearSync replication schedule, the failover task fails.

b. Click + Add target clusters if you want to fail over to specific Nutanix clusters at the
recovery site.

Figure 46: Planned Failover: Select Recovery Cluster

If you do not add target clusters, the recovery plan recovers the guest VMs to any eligible
cluster at the recovery site.

5. Click Failover.
The Failover from Recovery Plan dialog box lists the errors and warnings, if any, and allows
you to stop or continue the failover operation. If there are no errors, or after you resolve the
errors in step 6, the guest VMs fail over to the recovery Nutanix cluster.

6. If you see errors, do the following.

• To review errors or warnings, click View Details in the description.


Resolve the error conditions and then restart the failover procedure.
• Select one of the following.

• To stop the failover operation, click Abort.


• To continue the failover operation despite the warnings, click Execute Anyway.

Note: You cannot continue the failover operation when the validation fails with errors.

Note:

The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if
their files conflict with the existing files on the recovery ESXi cluster. For example,
there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already
has a VM (VM1) in the same container.
However, the entities recover at a different path with the VmRecoveredAtAlternatePath
alert only if the following conditions are met.

• Prism Element running on both the primary and the recovery Nutanix clusters is
version 5.17 or newer.
• A path for the entity recovery is not defined while initiating the failover operation.
• The protected entities do not have shared disks.
If these conditions are not satisfied, the failover operation fails.

Performing an Unplanned Failover (Leap)

If there is an unplanned event (for example, a natural disaster or network failure) at the primary
availability zone (site), perform an unplanned failover to the recovery site. To perform an
unplanned failover, do the following procedure at the recovery site. If you have two recovery
sites for DR, perform the failover at the site where you want to recover the guest VMs.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Select a recovery plan for the failover operation.

4. Click Failover from the Actions drop-down menu.

Note: If you select more than one recovery plan in step 3, the Failover action is available only
when the selected recovery plans have the same primary and recovery locations.

Specify the following information in the Failover from Recovery Plan window. The window
auto-populates the Failover From and Failover To locations from the recovery plan you
select in step 3.

Figure 47: Unplanned Failover

a. Failover Type: Click Unplanned Failover and do one of the following.

» Click Recover from latest Recovery Point to use the latest recovery point for recovery.
» Click Recover from specific point in time to use a recovery point taken at a specific
point in time for recovery.

Note: If you click Recover from specific point in time, select a Nutanix cluster that
hosts the specific point-in-time recovery point (step 4.b). If you do not select a cluster,
or select multiple clusters where the same recovery points exist, the guest VMs fail to
recover efficiently because the system encounters more than one recovery point at the
recovery site. For example, if a primary site AZ1 replicates the same recovery points to
two clusters CLA and CLB at site AZ2, select either the cluster CLA or the cluster CLB
as the target cluster when you click to recover from a specific point in time. If you select
both CLA and CLB, the guest VMs fail to recover.

b. Click + Add target clusters if you want to fail over to specific Nutanix clusters at the
recovery site.

Figure 48: Unplanned Failover: Select Recovery Cluster

If you do not add target clusters, the recovery plan recovers the guest VMs to any eligible
cluster at the recovery site.

Note: If recovery plans contain VM categories, the VMs from those categories recover in the
same category after an unplanned failover to the recovery site. Also, recovery points keep
generating at the recovery site for those recovered VMs. Because the VM count represents
the number of recoverable VMs (calculated from recovery points), the replicated recovery
points and the newly generated recovery points add up, giving double the count of the
originally recovered VMs on the recovery plans page. If some VMs belonging to the given
category at the primary or recovery site are deleted, the VM count at both sites still stays the
same until the recovery points of the deleted VMs expire. For example, when two VMs have
failed over, the recovery plans page at the recovery site shows four VMs (two replicated
recovery points from the source and two newly generated recovery points). The page shows
four VMs even if the VMs are deleted from the primary or recovery site. The VM count
synchronizes and becomes consistent in the subsequent RPO cycle, conforming to the
retention policy set in the protection policy (due to the expiration of recovery points).

5. Click Failover.
The Failover from Recovery Plan dialog box lists the errors and warnings, if any, and allows
you to stop or continue the failover operation. If there are no errors, or after you resolve the
errors in step 6, the guest VMs fail over to the recovery Nutanix cluster.

6. If you see errors, do the following.

• To review errors or warnings, click View Details in the description.


Resolve the error conditions and then restart the failover procedure.
• Select one of the following.

• To stop the failover operation, click Abort.


• To continue the failover operation despite the warnings, click Execute Anyway.

Note: You cannot continue the failover operation when the validation fails with errors.

Note: To avoid conflicts when the primary site becomes active after the failover, shut down
the guest VMs associated with this recovery plan. Manually power off the guest VMs on either
primary or recovery site after the failover is complete. You can also block the guest VMs
associated with this recovery plan through the firewall.

Note:

The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if
their files conflict with the existing files on the recovery ESXi cluster. For example,
there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already
has a VM (VM1) in the same container.
However, the entities recover at a different path with the VmRecoveredAtAlternatePath
alert only if the following conditions are met.

• Prism Element running on both the primary and the recovery Nutanix clusters is
version 5.17 or newer.
• A path for the entity recovery is not defined while initiating the failover operation.
• The protected entities do not have shared disks.
If these conditions are not satisfied, the failover operation fails.

Performing Failback (Leap)

A failback is a failover of the guest VMs from the recovery availability zone (site) back to the
primary site. The same recovery plan applies to both the failover and the failback operations.
The only difference is that for a failover, you perform the failover operations on the recovery
plan at the recovery site, while for a failback, you perform them on the recovery plan at the
primary site.

About this task


To perform a failback, do the following procedure at the primary site.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Select a recovery plan for the failover operation.

4. Click Failover from the Actions drop-down menu.

Note: If you select more than one recovery plan in step 3, the Failover action is available only
when the selected recovery plans have the same primary and recovery locations.

Specify the following information in the Failover from Recovery Plan window. The window
auto-populates the Failover From and Failover To locations from the recovery plan you
select in step 3.

Figure 49: Unplanned Failover

a. Failover Type: Click Unplanned Failover and do one of the following.

Tip: You can also click Planned Failover to perform the planned failover procedure for a
failback.

» Click Recover from latest Recovery Point to use the latest recovery point for recovery.
» Click Recover from specific point in time to use a recovery point taken at a specific
point in time for recovery.
b. Click + Add target clusters if you want to fail over to specific Nutanix clusters at the
primary site.
If you do not add target clusters, the recovery plan recovers the guest VMs to any eligible
cluster at the primary site.

Note: If recovery plans contain VM categories, the VMs from those categories recover in the
same category after an unplanned failover to the recovery site. Also, recovery points keep
generating at the recovery site for those recovered VMs. Because the VM count represents
the number of recoverable VMs (calculated from recovery points), the replicated recovery
points and the newly generated recovery points add up, giving double the count of the
originally recovered VMs on the recovery plans page. If some VMs belonging to the given
category at the primary or recovery site are deleted, the VM count at both sites still stays the
same until the recovery points of the deleted VMs expire. For example, when two VMs have
failed over, the recovery plans page at the recovery site shows four VMs (two replicated
recovery points from the source and two newly generated recovery points). The page shows
four VMs even if the VMs are deleted from the primary or recovery site. The VM count
synchronizes and becomes consistent in the subsequent RPO cycle, conforming to the
retention policy set in the protection policy (due to the expiration of recovery points).

5. Click Failover.
The Failover from Recovery Plan dialog box lists the errors and warnings, if any, and allows
you to stop or continue the failover operation. If there are no errors, or after you resolve the
errors in step 6, the guest VMs fail over to the recovery cluster.

6. If you see errors, do the following.

• To review errors or warnings, click View Details in the description.


Resolve the error conditions and then restart the failover procedure.
• Select one of the following.

• To stop the failover operation, click Abort.


• To continue the failover operation despite the warnings, click Execute Anyway.

Note: You cannot continue the failover operation when the validation fails with errors.

Note: To avoid conflicts when the primary site becomes active after the failover, shut down
the guest VMs associated with this recovery plan. Manually power off the guest VMs on either
primary or recovery site after the failover is complete. You can also block the guest VMs
associated with this recovery plan through the firewall.

Note:

The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if
their files conflict with the existing files on the recovery ESXi cluster. For example,
there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already
has a VM (VM1) in the same container.
However, the entities recover at a different path with the VmRecoveredAtAlternatePath
alert only if the following conditions are met.

• Prism Element running on both the primary and the recovery Nutanix clusters is
version 5.17 or newer.
• A path for the entity recovery is not defined while initiating the failover operation.
• The protected entities do not have shared disks.
If these conditions are not satisfied, the failover operation fails.

Monitoring a Failover Operation (Leap)

After you trigger a failover operation, you can monitor failover-related tasks. To monitor a
failover, perform the following procedure at the recovery site. If you have two recovery sites for
DR, perform the procedure at the site where you trigger the failover.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Click the name of the recovery plan for which you triggered failover.

4. Click the Tasks tab.


The left pane displays the overall status. The table in the details pane lists all the running
tasks and their individual statuses.

Self-Service Restore
The self-service restore (also known as file-level restore) feature allows you to perform
self-service data recovery from the Nutanix data protection recovery points with minimal
intervention. You can perform self-service data recovery on both on-prem clusters and Xi
Cloud Services.
You must deploy NGT 2.0 or newer on guest VMs to enable self-service restore from Prism
Central. For more information about enabling and mounting NGT, see Enabling and Mounting
Nutanix Guest Tools in the Prism Web Console Guide. When you enable self-service restore
and attach a disk by logging in to the VM, you can recover files within the guest OS. If you do
not detach the disk from the VM, the disk is detached automatically after 24 hours.

Note:

• You can enable self-service restore for a guest VM through a web interface or nCLI.
• NGT performs the in-guest actions. For more information about in-guest actions, see
Nutanix Guest Tools in the Prism Web Console Guide.
• Self-service restore supports only full snapshots generated from Asynchronous and
NearSync replication schedules.

Self-Service Restore Requirements

The requirements of self-service restore of Windows and Linux VMs are as follows.

General Requirements of Self-Service Restore


The following are the general requirements of self-service restore. Ensure that you meet the
requirements before configuring self-service restore for guest VMs.

License Requirements
AOS Ultimate. For more information about the features available with AOS Starter license, see
Software Options.

Hypervisor Requirements
Two AHV or ESXi clusters, each registered to the same or different Prism Centrals.

• AHV (running AOS 5.18 or newer)


The on-prem clusters must be running the version of AHV that comes bundled with the
supported version of AOS.

• ESXi (running AOS 5.18 or newer)
The on-prem clusters must be running on version ESXi 6.5 GA or newer.

Nutanix Software Requirements


Prism Centrals and their registered on-prem clusters (Prism Elements) must be running the
following versions of AOS.

• AOS 5.18 or newer with AHV.


• AOS 5.18 or newer with ESXi.

• You have installed NGT 2.0 or newer. For more information about enabling and mounting
NGT, see Enabling and Mounting Nutanix Guest Tools in the Prism Web Console Guide.
• You have set disk.enableUUID=true in the .vmx file for the guest VMs running on ESXi.
• You have configured Nutanix recovery points by adding guest VM to an Asynchronous
protection policy.
• You have attached an IDE/SCSI or SATA disk.
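
For ESXi guest VMs, the disk.enableUUID setting noted above lives in the VM's .vmx
configuration file (you can also set it through the vSphere Client as an advanced
configuration parameter). A minimal sketch of the entry, in the quoted key-value style
that .vmx files use:

```
disk.enableUUID = "TRUE"
```

Power off the VM before editing the .vmx file directly, and power it back on for the
setting to take effect.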

Requirements for Guest VMs Running Windows OS


The following are the specific requirements of self-service restore for guest VMs running
Windows OS. Ensure that you meet the requirements before proceeding.

• You have enough logical drive letters to bring the disk online.
• You have one of the following Windows OS as the guest OS.

• Windows Server 2008 R2 or newer


• Windows 7 through Windows 10

Requirements for Guest VMs Running Linux OS


The following are the specific requirements of self-service restore for guest VMs running Linux
OS. Ensure that you meet the requirements before proceeding.

• You have appropriate file systems to recover. Self-service restore supports only extended
file systems (ext2, ext3, and ext4) and XFS file systems.
• Logical Volume Manager (LVM) disks are mounted only when the volume group
corresponds to a single physical disk.
• You have one of the following Linux OS as the guest OS.

• CentOS 6.5 through 6.9 and 7.0 through 7.3


• Red Hat Enterprise Linux (RHEL) 6.5 through 6.9 and 7.0 through 7.3
• Oracle Linux 6.5 and 7.0
• SUSE Linux Enterprise Server (SLES) 11 SP1 through 11 SP4 and 12 SP1 through 12 SP3
• Ubuntu 14.04 for both AHV and ESXi
• Ubuntu 16.10 for AHV only
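
Before enabling self-service restore on a Linux VM, you can verify that its filesystems
are among the supported types (ext2, ext3, ext4, and XFS). The following shell sketch is
illustrative; the helper function name is hypothetical and not part of NGT:

```shell
# Hypothetical helper: succeeds only for filesystem types that
# self-service restore supports (ext2, ext3, ext4, and XFS).
is_ssr_supported_fs() {
  case "$1" in
    ext2|ext3|ext4|xfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: inspect the filesystem type of the root mount and report it.
fstype=$(df --output=fstype / | tail -n 1)
if is_ssr_supported_fs "$fstype"; then
  echo "supported by self-service restore: $fstype"
else
  echo "NOT supported by self-service restore: $fstype"
fi
```

Run the same check against each mount point whose files you expect to restore.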

Self-Service Restore Limitations

The limitations of self-service restore of Windows and Linux VMs are as follows.

General Limitations of Self-Service Restore


The following are the general limitations of self-service restore.

• Volume groups are not supported.


• Only snapshots created in AOS 4.5 or later releases are supported.
• PCI and delta disks are not supported.

Limitations of VMs Running Windows OS


The following are the specific limitations of self-service restore for guest VMs running Windows
OS.

• File systems. Self-service restore does not support dynamic disks consisting of NTFS on
simple volumes, spanned volumes, striped volumes, mirrored volumes, and RAID-5 volumes.
• Only 64-bit OSes are supported.
• Self-service restore does not support disks created as Microsoft Storage Space devices by
using Microsoft Windows Server 2016 or newer.

Limitations of VMs Running Linux OS


Whenever the snapshot disk has an inconsistent filesystem (as indicated by the fsck check), the
disk is only attached and not mounted.

Enabling Self-Service Restore

After enabling NGT for a guest VM, you can enable the self-service restore for that guest VM.
Also, you can enable the self-service restore for a guest VM while you are installing NGT on that
guest VM.

Before you begin


Ensure that you have installed and enabled NGT 2.0 or newer on the guest VM. For more
information, see Enabling and Mounting Nutanix Guest Tools in the Prism Web Console
Guide.

About this task


To enable self-service restore, perform the following procedure.

Procedure

1. Log on to Prism Central.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs in the left pane.

3. Select the guest VM where you want to enable self-service restore.

4. Click Manage NGT Applications from the Actions drop-down menu.

Figure 50: Enabling Self-Service Restore

Note: If the guest VM does not have NGT installed, click Install NGT from the Actions drop-
down menu and select to enable Self Service Restore (SSR).

5. Click Enable below the Self Service Restore (SSR) panel.

6. Click Confirm.
The self-service restore feature is enabled on the guest VM. You can now restore the desired
files from the VM.

Self-Service Restore for Windows VMs

You can restore the desired files from the VM through the self-service restore web interface
or by using the ngtcli utility.

Restoring a File through Web Interface (Windows VM)

After you install NGT in the Windows guest VM, you can restore the desired files from the VM
through the web interface.

Before you begin


Ensure that you have configured your Windows VM to use NGT. For more information, see
Installing NGT on Windows Machines in the Prism Web Console Guide.

About this task


To restore a file in Windows guest VMs by using web interface, perform the following.

Procedure

1. Log in to the guest Windows VM by using administrator credentials.

2. Click the Nutanix SSR icon on the desktop.

3. Type the administrator credentials of the VM.

Note: If you use:

• The NETBIOS domain name in the username field (for example, domain\username),
you can log on to SSR only if your account is explicitly added to the Administrators
group on the server. If the username is added to a domain group, which is then added
to the Administrators group, the logon fails. Also, you must type the NETBIOS domain
name in capital letters (the domain name has to be written the same way as you see it
in the output of the command net localgroup administrators).
• The FQDN in the username (for example, domain.com\username), you can log on only
if the user is a member of the domain admins group.

Note: The snapshots taken that day are displayed. You also have the option to select the
snapshots for the week, month, or year. You can also define a custom range of dates and
select a snapshot.

Figure 51: Snapshot Selection

4. Select the appropriate tab: This Week, This Month, or This Year.
You can also customize the selection by clicking the Custom Range tab and selecting the
date range in the From and To fields.

5. Select the check box of the disks that you want to attach from the snapshot.

6. Select Mount from the Disk Action drop-down menu.

Figure 52: Mounting of Disks

The selected disk or disks are mounted and the relevant disk label is displayed.

7. Go to the attached disk label drive in the VM and restore the desired files.

8. To view the list of all the mounted snapshots, select Mounted Snapshots.
This page displays the original snapshot drive letters and their corresponding current drive
letters. The original drive letters are the letters assigned to the disk at the time of the
snapshot. The mounted drive letters are the letters on which the snapshotted disk is
currently mounted.

Figure 53: List of Mounted Snapshots

a. To detach a disk, click the disk label and click Unmount.


You can unmount all the disks at once by clicking Select All and then clicking Unmount.

9. To detach a disk, select the check box of the disk that you want to unmount and then from
the Disk Action drop-down menu, select Unmount.

Restoring a File through Ngtcli (Windows VM)

After you install NGT in the Windows guest VM, you can restore the desired files from the VM
through the ngtcli utility.

Before you begin


Ensure that you have configured your Windows VM to use NGT. For more information, see
Installing NGT on Windows Machines in the Prism Web Console Guide.

About this task


To restore a file in Windows guest VMs by using ngtcli, perform the following.

Procedure

1. Log in to the guest Windows VM by using administrator credentials.

2. Open the command prompt as an administrator.

3. Go to the ngtcli directory in Program Files > Nutanix.


> cd c:\Program Files\Nutanix\ngtcli

Tip: Running python ngtcli.py creates a terminal with auto-complete.

4. Run the ngtcli.cmd command.

5. List the snapshots and virtual disks that are present for the guest VM.
ngtcli> ssr ls-snaps

The snapshot ID, disk labels, logical drives, and creation time of each snapshot are displayed.
You can use this information to decide the relevant snapshot from which to restore the files.
To list a specific number of snapshots, run the following command.
ngtcli> ssr ls-snaps snapshot-count=count_value

Replace count_value with the number of snapshots that you want to list.

6. Attach the disk from the snapshots.


ngtcli> ssr attach-disk disk-label=disk_label snapshot-id=snap_id

Replace disk_label with the name of the disk that you want to attach.
Replace snap_id with the snapshot ID of the disk that you want to attach.
For example, to attach a disk with snapshot ID 16353 and disk label scsi0:1, type the
following command.
ngtcli> ssr attach-disk snapshot-id=16353 disk-label=scsi0:1

After the command runs successfully, a new disk with the specified label is attached to the
guest VM.

Note: If sufficient logical drive letters are not present, the action of bringing the disks online
fails. In this case, detach the current disk, create enough free slots by detaching other
self-service disks, and then reattach the disk.

7. Go to the attached disk label drive and restore the desired files.

8. Detach a disk.
ngtcli> ssr detach-disk attached-disk-label=attached_disk_label

Replace attached_disk_label with the name of the disk that you want to detach.

Note: If the disk is not removed by the guest VM administrator, the disk is automatically
removed after 24 hours.

9. View all the disks attached to the VM.


ngtcli> ssr list-attached-disks
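
Putting the preceding steps together, an end-to-end ngtcli session might look like the
following. The snapshot ID and disk labels are illustrative values; substitute the ones that
ssr ls-snaps reports for your VM.

```
ngtcli> ssr ls-snaps snapshot-count=5
ngtcli> ssr attach-disk snapshot-id=16353 disk-label=scsi0:1
ngtcli> ssr list-attached-disks
(restore the desired files from the newly attached drive, then detach it)
ngtcli> ssr detach-disk attached-disk-label=scsi0:1
```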

Self-Service Restore for Linux VMs

The Linux guest VM user with sudo privileges can restore the desired files from the VM through
the web interface or by using the ngtcli utility.

Restoring a File through Web Interface (Linux VM)

After you install NGT in the Linux guest VM, you can restore the desired files from the VM
through the web interface.

Before you begin

• Mount NGT for a VM. For more information about enabling NGT, see Enabling and Mounting
Nutanix Guest Tools in the Prism Web Console Guide.
• Ensure that you have configured your Linux VM to use NGT.

About this task


To restore a file in Linux guest VMs by using the web interface, perform the following.

Procedure

1. Log in to the guest Linux VM as a user with sudo privileges.

2. Click the Nutanix SSR icon on the desktop.

3. Type the root or sudo user credentials of the VM.


The snapshots taken for the current day are displayed. You can also select the snapshots
for the week, the month, or the year, or define a custom range of dates and select a
snapshot from that range. For example, in the following figure, the snapshots taken in the
current month are displayed.

Figure 54: Snapshot Selection

4. Select the appropriate tab: This Week, This Month, or This Year.
You can also customize the selection by clicking Custom Range tab and selecting the date
range in the From and To fields.

5. Select the check box of the disks that you want to attach from the snapshot.

6. Select Mount from the Disk Action drop-down menu.
The selected disks are mounted, and the relevant disk labels are displayed.

Figure 55: Mounting of Disks

7. Go to the attached disk label partitions in the VM and restore the desired files.

Note: If the disk gets updated between the snapshots, the restore process might not work
as expected. If this scenario occurs, contact Nutanix Support for help with the restore
process.

8. To view the list of all the mounted snapshots, select Mounted Snapshots.
This page displays the original snapshot drive letters and their corresponding current drive
letters. The original drive letters are the letters assigned to the disks at the time of the
snapshot. The mounted drive letters are the letters on which the snapshotted disks are
currently mounted.

Figure 56: List of Mounted Snapshots

a. To detach a disk, click the disk label and click Unmount.


You can unmount all the disks at once by clicking Select All and then clicking Unmount.

9. To detach a disk, select the check box of the disk that you want to unmount and then from
the Disk Action drop-down menu, select Unmount.
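
The Custom Range selection in step 4 is essentially a date filter over the available snapshots. The following Python sketch shows the same filtering logic; it is illustrative only, and the snapshot records and field names are assumptions for the example, not actual NGT data structures.

```python
from datetime import date

# Assumed example data: snapshot IDs and creation dates (not NGT output).
snapshots = [
    {"id": 16353, "created": date(2022, 10, 3)},
    {"id": 16401, "created": date(2022, 10, 9)},
    {"id": 16477, "created": date(2022, 10, 11)},
]

def snapshots_in_range(snaps, start, end):
    """Return the snapshots taken between start and end, inclusive."""
    return [s for s in snaps if start <= s["created"] <= end]

selected = snapshots_in_range(snapshots, date(2022, 10, 8), date(2022, 10, 12))
print([s["id"] for s in selected])  # [16401, 16477]
```

Only the snapshots whose creation dates fall within the From and To dates remain selectable, which is what the web interface presents after you apply a custom range.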

Restoring a File through Ngtcli (Linux VM)

After you install NGT in the Linux guest VM, you can restore the desired files from the VM
through the ngtcli utility.

Before you begin

• Mount NGT for a VM. For more information about enabling NGT, see Enabling and Mounting
Nutanix Guest Tools in the Prism Web Console Guide.

• Ensure that you have configured your Linux VM to use NGT.

About this task


To restore a file in Linux guest VMs by using ngtcli, perform the following.

Procedure

1. Log in to the guest Linux VM with sudo or root user credentials.

2. Go to the ngtcli directory.


> cd /usr/local/nutanix/ngt/ngtcli

3. Run the python ngtcli.py command.

Tip: This command creates a terminal with auto-complete.

4. List the snapshots and virtual disks that are present for the guest VM.
ngtcli> ssr ls-snaps

The snapshot ID, disk labels, logical drives, and creation time of each snapshot are displayed.
Use this information to decide which snapshot contains the data that you want to restore.
To list a specific number of snapshots, run the following command.
ngtcli> ssr ls-snaps snapshot-count=count_value

Replace count_value with the number of snapshots that you want to list.

5. Attach the disk from the snapshots.


ngtcli> ssr attach-disk disk-label=disk_label snapshot-id=snap_id

Replace disk_label with the name of the disk that you want to attach.
Replace snap_id with the snapshot ID of the disk that you want to attach.
For example, to attach a disk with snapshot ID 1343 and disk label scsi0:2, type the
following command.
ngtcli> ssr attach-disk snapshot-id=1343 disk-label=scsi0:2

After the command runs successfully, a new disk with a new label is attached to the guest
VM.

6. Go to the attached disk label partition and restore the desired files.

Note: If the disk gets updated between the snapshots, the restore process might not work
as expected. If this scenario occurs, contact Nutanix Support for help with the restore
process.

7. Detach a disk.
ngtcli> ssr detach-disk attached-disk-label=attached_disk_label

Replace attached_disk_label with the name of the disk that you want to detach.
For example, to remove the disk with disk label scsi0:3, type the following command.
ngtcli> ssr detach-disk attached-disk-label=scsi0:3

Note: If the disk is not removed by the guest VM administrator, the disk is automatically
removed after 24 hours.

8. View all the disks attached to the VM.
ngtcli> ssr list-attached-disks

Protection with NearSync Replication Schedule and DR (Leap)


NearSync replication enables you to protect your guest VMs with an RPO as low as 1 minute.
A protection policy with a NearSync replication schedule creates recovery points at a minutely
interval (between 1–15 minutes) and replicates them to the recovery availability zones (sites) for
High Availability. For guest VMs protected with a NearSync replication schedule, you can perform
disaster recovery (DR) to a different Nutanix cluster at the same or a different site. In addition to DR
to Nutanix clusters of the same hypervisor type, you can also perform cross-hypervisor disaster
recovery (CHDR)—disaster recovery from AHV clusters to ESXi clusters, or from ESXi clusters
to AHV clusters.

Note: Nutanix provides multiple DR solutions to secure your environment. See Nutanix Disaster
Recovery Solutions on page 11 for the detailed representation of the DR offerings of Nutanix.

The following are the advantages of protecting your guest VMs with a NearSync replication
schedule.

• Protection for mission-critical applications, securing your data with minimal data loss
if there is a disaster and providing you with more granular control during the recovery
process.
• No minimum network latency or distance requirements.
• Low stun time for guest VMs with heavy I/O applications.
Stun time is the time of application freeze when the recovery point is taken.
• Allows resolution to a disaster event in minutes.
To implement the NearSync feature, Nutanix has introduced a technology called lightweight
snapshots (LWSs). LWS recovery points are created at the metadata level only, and they
continuously replicate incoming data generated by workloads running on the active clusters.
LWS recovery points are stored in the LWS store, which is allocated on the SSD tier. When you
configure a protection policy with a NearSync replication schedule, the system allocates the
LWS store automatically.

Note: The maximum LWS store allocation for each node is 360 GB. For the hybrid systems, it is
7% of the SSD capacity on that node.
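
The note above can be expressed as simple arithmetic. The following sketch is illustrative only (the function and capacity values are assumptions for the example, not Nutanix code): it estimates the per-node LWS store allocation on a hybrid node as 7% of the node's SSD capacity, capped at the 360 GB per-node maximum.

```python
# Illustrative arithmetic for the LWS store sizing note above: on a hybrid
# node, the allocation is 7% of the node's SSD capacity, capped at 360 GB.
def lws_store_allocation_gb(node_ssd_capacity_gb: float) -> float:
    return min(360.0, 0.07 * node_ssd_capacity_gb)

print(lws_store_allocation_gb(1920))   # hybrid node with 1.92 TB of SSD
print(lws_store_allocation_gb(7680))   # large SSD tier, capped at 360.0
```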

Transitioning in and out of NearSync


When you create a NearSync replication schedule, the schedule remains an hourly schedule
until its transition into a minutely schedule is complete.
To transition into the NearSync (minutely) replication schedule, the system first seeds the
recovery site with data: recovery points are taken on an hourly basis and replicated to the
recovery site. After the system determines that the recovery points containing the seeding
data have replicated within a specified amount of time (an hour by default), it automatically
transitions the replication schedule into the NearSync schedule, depending on the bandwidth
and the change rate. After the transition into the NearSync replication schedule, you can see
the configured minutely recovery points in the web interface.
The following are the characteristics of the process.

• Until the transition into the NearSync replication schedule completes, you can see only the
hourly recovery points in Prism Central.

• If for any reason a guest VM transitions out of the NearSync replication schedule, the system
raises alerts in the Alerts dashboard, and the minutely replication schedule falls back to
the hourly replication schedule. The system continuously tries to return to the minutely
replication schedule that you configured. If it succeeds, the replication schedule
automatically transitions back into NearSync, and alerts specific to this condition are
raised in the Alerts dashboard.
To transition out of the NearSync replication schedule, you can do one of the following.

• Delete the NearSync replication schedule that you have configured.


• Update the NearSync replication schedule to use an hourly RPO.
• Unprotect the guest VMs.

Note: The schedule does not transition out of NearSync when you add or delete a guest
VM.

Repeated transitioning in and out of NearSync replication schedule can occur because of the
following reasons.

• LWS store usage is high.


• The change rate of data is high for the available bandwidth between the primary and the
recovery sites.
• Internal processing of LWS recovery points is taking more time because the system is
overloaded.

Retention Policy
Depending on the RPO (1–15 minutes), the system retains the recovery points for a specific time
period. For a NearSync replication schedule, you can configure the retention policy in days,
weeks, or months on both the primary and recovery sites instead of defining the number of
recovery points that you want to retain. For example, if you want an RPO of 1 minute and want to
retain the recovery points for 5 days, the retention policy works in the following way.

• For every 1 minute, a recovery point is created and retained for a maximum of 15 minutes.

Note: Only the 15 most recent recovery points are visible in Prism Central and available for
the recovery operation.

• For every hour, a recovery point is created and retained for 6 hours.
• One daily recovery point is created and retained for 5 days.
You can also define recovery point retention in weeks or months. For example, if you configure
a 3-month schedule, the retention policy works in the following way.

• For every 1 minute, a recovery point is created and retained for 15 minutes.
• For every hour, a recovery point is created and retained for 6 hours.
• One daily recovery point is created and retained for 7 days.
• One weekly recovery point is created and retained for 4 weeks.
• One monthly recovery point is created and retained for 3 months.

Note:

• You can define different retention policies on the primary and recovery sites.
• The system retains subhourly and hourly recovery points for 15 minutes and 6 hours
respectively. Maximum retention time for days, weeks, and months is 7 days, 4
weeks, and 12 months respectively.
• If you change the replication schedule from an hourly schedule to a minutely
schedule (Asynchronous to NearSync), the first recovery point is not created
according to the new schedule. The recovery points are created according to
the start time of the old hourly schedule (Asynchronous). If you want to get the
maximum retention for the first recovery point after modifying the schedule, update
the start time accordingly for NearSync.
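
The retention tiers described above can be summarized programmatically. The following sketch hard-codes the documented tier durations for a minutely RPO; it is an illustration of the scheme (function name and structure are assumptions), not Nutanix code.

```python
# Illustrative sketch of the documented roll-up retention tiers for a
# minutely (NearSync) RPO. Given a retention period, list how long each
# recovery-point granularity is kept.
def rollup_retention_tiers(amount: int, unit: str):
    tiers = [("minutely", "15 minutes"), ("hourly", "6 hours")]
    if unit == "days":
        tiers.append(("daily", f"{amount} days"))
    elif unit == "weeks":
        tiers += [("daily", "7 days"), ("weekly", f"{amount} weeks")]
    elif unit == "months":
        tiers += [("daily", "7 days"), ("weekly", "4 weeks"),
                  ("monthly", f"{amount} months")]
    return tiers

# The two worked examples from the text: a 5-day and a 3-month schedule.
print(rollup_retention_tiers(5, "days"))
print(rollup_retention_tiers(3, "months"))
```

Running the function for the two examples reproduces the bullet lists above: minutely points kept 15 minutes, hourly points kept 6 hours, and daily, weekly, and monthly points kept per the configured period.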

NearSync Replication Requirements (Leap)


The following are the specific requirements for protecting your guest VMs with a NearSync
replication schedule. Ensure that you meet these requirements in addition to the general
requirements of Leap.
For more information about the general requirements of Leap, see Leap Requirements on
page 18.
For information about node, disk and Foundation configurations required to support NearSync
replication schedules, see On-Prem Hardware Resource Requirements on page 14.

Hypervisor Requirements
AHV or ESXi

• The AHV clusters must be running on version 20190916.189 or newer.


• The ESXi clusters must be running on version ESXi 6.5 GA or newer.

Nutanix Software Requirements


Each on-prem site must have a Leap-enabled Prism Central instance.
The primary and recovery Prism Centrals and their registered Nutanix clusters must be running
the following versions of AOS.

• AOS 5.17.1 or newer for DR to different Nutanix clusters at the same site.
• AOS 5.17 or newer for DR to Nutanix clusters at different sites.

Cross Hypervisor Disaster Recovery (CHDR) Requirements


Guest VMs protected with a NearSync replication schedule support cross-hypervisor disaster
recovery. You can perform failover (DR) to recover guest VMs from AHV clusters to ESXi
clusters, or from ESXi clusters to AHV clusters, provided that you meet the following
requirements.

• Both the primary and the recovery Nutanix clusters must be running AOS 5.18 or newer.
• Install and configure Nutanix Guest Tools (NGT) on all the guest VMs. For more information,
see Enabling and Mounting Nutanix Guest Tools in Prism Web Console Guide.
NGT configures the guest VMs with all the required drivers for VM portability. For more
information about general NGT requirements, see Nutanix Guest Tools Requirements and
Limitations in Prism Web Console Guide.
• CHDR supports guest VMs with flat files only.

• CHDR supports IDE/SCSI disks only.

Tip: From AOS 5.19.1, CHDR supports SATA disks also.

• For all the non-boot SCSI disks of Windows guest VMs, set the SAN policy to OnlineAll so
that they come online automatically.
• In vSphere 6.7, guest VMs are configured with UEFI secure boot by default. Upon CHDR to
an AHV cluster, these guest VMs do not start if the host does not support the UEFI secure
boot feature. For more information about supportability of UEFI secure boot on Nutanix
clusters, see the Compatibility Matrix.

• For information about operating systems that support UEFI and Secure Boot, see UEFI and
Secure Boot Support for CHDR on page 211.
• Nutanix does not support vSphere inventory mapping (for example, VM folder and resource
pools) when protecting workloads between VMware clusters.

• Nutanix does not support vSphere snapshots or delta disk files. If you have delta disks
attached to a VM and you proceed with failover, you get a validation warning and the VM
does not recover. Contact Nutanix Support for assistance.

Table 15: Operating Systems Supported for CHDR (Asynchronous Replication)

Operating System   Versions                            Requirements and limitations

Windows            • Windows 2008 R2 or newer          Only 64-bit operating systems
                   • Windows 7 or newer                are supported.

Linux              • CentOS 6.5 and 7.0                SLES operating system is not
                   • RHEL 6.5 or newer and             supported.
                     RHEL 7.0 or newer
                   • Oracle Linux 6.5 and 7.0
                   • Ubuntu 14.04

Additional Requirements

• Both the primary and the recovery Nutanix clusters must have a minimum of three nodes.
• The recovery site container must have as much space as the working set size of the VMs
protected at the primary site. For example, if you are protecting a VM that uses 30 GB of
space on the container at the primary site, the same amount of space is required on the
recovery site container.

NearSync Replication Limitations (Leap)


Consider the following specific limitations before protecting your guest VMs with a NearSync
replication schedule. These limitations are in addition to the general limitations of Leap.
For information about the general limitations of Leap, see Leap Limitations on page 22.

• All files associated with the VMs running on ESXi must be located in the same folder as the
VMX configuration file. The files not located in the same folder as the VMX configuration file
might not recover on a recovery cluster. On recovery, the guest VM with such files fails to
start with the following error message. Operation failed: InternalTaskCreationFailure:
Error creating host specific VM change power state task. Error: NoCompatibleHost:
No host is compatible with the virtual machine

• Deduplication enabled on storage containers that have guest VMs protected with a
NearSync replication schedule lowers the replication speed.
• Cross hypervisor disaster recovery (CHDR) does not preserve hypervisor-specific properties
(for example, multi-writer flags, independent persistent and non-persistent disks, changed
block tracking (CBT), PVSCSI disk configurations).
• On CHDR, NearSync replication schedules do not support retrieving recovery points from
the recovery sites.
For example, if you have 1-day retention at the primary site and 5-day retention at the
recovery site, and you want to go back to a recovery point from 5 days ago, the NearSync
replication schedule does not support replicating the 5-day retention back from the
recovery site to the primary site.

Creating a Protection Policy with a NearSync Replication Schedule (Leap)


To protect the guest VMs in a minutely replication schedule, configure a NearSync replication
schedule while creating the protection policy. The policy takes recovery points of the protected
guest VMs at the specified time interval (1–15 minutes) and replicates them to the recovery
availability zone (site) for High Availability. To maintain the efficiency of minutely replication,
the protection policy allows you to configure a NearSync replication schedule to only one
recovery site. When creating a protection policy, you can specify only VM categories. If you
want to include VMs individually, you must first create the protection policy (which can also
include VM categories) and then include the VMs individually in the protection policy from the
VMs page.

Before you begin


Ensure that the primary and the recovery AHV or ESXi clusters at the same or different sites are
NearSync capable. A cluster is NearSync capable if the capacity of each SSD in the cluster is at
least 1.2 TB.
See NearSync Replication Requirements (Leap) on page 98 and NearSync Replication
Limitations (Leap) on page 99 before you start.
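
As a quick pre-check of the SSD sizing rule above, you can verify that every SSD in each candidate cluster meets the 1.2 TB minimum before configuring the policy. The following sketch is illustrative only; the function and the SSD inventories are assumptions for the example, not data read from an actual cluster.

```python
# Illustrative pre-check of the "NearSync capable" rule stated above:
# every SSD in the cluster must have a capacity of at least 1.2 TB.
def is_nearsync_capable(ssd_capacities_tb):
    return all(capacity >= 1.2 for capacity in ssd_capacities_tb)

print(is_nearsync_capable([1.92, 1.92, 3.84]))  # True: all SSDs >= 1.2 TB
print(is_nearsync_capable([0.96, 1.92, 1.92]))  # False: one undersized SSD
```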

About this task


To create a protection policy with a NearSync replication schedule, do the following at the
primary site. You can also create a protection policy at the recovery site. Protection policies you
create or update at a recovery site synchronize back to the primary site.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Protection
Policies in the left pane.

Figure 57: Protection Policy Configuration: Protection Policies

3. Click Create Protection Policy.
Specify the following information in the Create Protection Policy window.

Figure 58: Protection Policy Configuration: Select Primary Location

a. Policy name: Enter a name for the protection policy.

Caution: The name can contain only alphanumeric characters, dots, dashes, and underscores.

b. In the Primary Location pane, specify the following information.

• 1. Location: From the drop-down list, select an availability zone (site) that hosts the
guest VMs to protect.
The drop-down lists all the sites paired with the local site. Local AZ represents the
local site (Prism Central). For your primary site, you can select either the local site or
a non-local site.
2. Cluster: From the drop-down list, select the cluster that hosts the guest VMs to
protect.
The drop-down lists all the Nutanix clusters registered to Prism Central representing
the selected site. If you want to protect the guest VMs from multiple Nutanix
clusters in the same protection policy, select the clusters that host those guest
VMs. All Clusters protects the guest VMs of all Nutanix clusters registered to Prism
Central.
3. Click Save.
Clicking Save activates the Recovery Location pane. After saving the primary site
configuration, you can optionally add a local schedule to retain the recovery points
at the primary site.
4. Click + Add Local Schedule if you want to retain recovery points locally in addition
to retaining recovery points in a replication schedule (step d.iv). For example, you
can create a local schedule to retain 15 minute recovery points locally and also

an hourly replication schedule to retain recovery points and replicate them to a
recovery site every 2 hours. The two schedules apply differently on the guest VMs.
Specify the following information in the Add Schedule window.

Figure 59: Protection Policy Configuration: Add Local Schedule

1. Take Snapshot Every: Specify the frequency in minutes, hours, days, or weeks at
which you want the recovery points to be taken locally.
2. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at the local site. If you set the
retention number to n, the local site retains the n recent recovery points.
When you enter the frequency in minutes, the system selects the Roll-up
retention type by default because minutely recovery points do not support
Linear retention types.
• Roll-up: Rolls up the recovery points into a single recovery point at the local
site.
For more information about the roll-up recovery points, see step d.iii.
3. Retention on Local AZ:PE_A3_AHV: Specify the retention number for the local
site.
4. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Irrespective of the local or replication schedules, the recovery points are of the
specified type. If you check Take App-Consistent Recovery Point, the recovery
points generated are application-consistent and if you do not check Take App-
Consistent Recovery Point, the recovery points generated are crash-consistent.

If the time in the local schedule and the replication schedule match, the single
recovery point generated is application-consistent.

Note: See Application-consistent Recovery Point Conditions and Limitations on page 51
before you take an application-consistent snapshot.
5. Click Save Schedule.
c. In the Recovery Location pane, specify the following information.

Figure 60: Protection Policy Configuration: Select Recovery Location

• 1. Location: From the drop-down list, select the availability zone (site) where you want
to replicate the recovery points.
The drop-down lists all the sites paired with the local site. Local AZ represents the
local site (Prism Central). Select Local AZ if you want to configure DR to a different
Nutanix cluster at the same site.
If you do not select a site, local recovery points that are created by the protection
policy do not replicate automatically. You can, however, replicate the recovery
points manually and use recovery plans to recover the guest VMs. For more
information, see Manual Disaster Recovery (Leap) on page 137.
2. Cluster: From the drop-down list, select the cluster where you want to replicate the
recovery points.
The drop-down lists all the Nutanix clusters registered to Prism Central representing
the selected site. You can select one cluster at the recovery site. To maintain the
efficiency of minutely replication, a protection policy allows you to configure only
one recovery site for a NearSync replication schedule. However, you can add
another Asynchronous replication schedule for replicating recovery points to the

same or different sites. For more information to add another recovery site with a
replication schedule, see step e.

Note: Selecting auto-select from the drop-down list replicates the recovery points
to any available cluster at the recovery site. Select auto-select from the drop-down
list only if all the clusters at the recovery site are NearSync capable and are up and
running. A cluster is NearSync capable if the capacity of each SSD in the cluster is at
least 1.2 TB. All-flash clusters do not have any specific SSD sizing requirements.

Caution: If the primary Nutanix cluster contains an IBM POWER Systems server, you
can replicate recovery points to an on-prem site only if that on-prem site contains an
IBM Power Systems server.

3. Click Save.
Clicking Save activates the + Add Schedule button between the primary and the
recovery site. After saving the recovery site configuration, you can optionally add a
local schedule to retain the recovery points at the recovery site.
4. Click + Add Local Schedule if you want to retain recovery points locally in addition
to retaining recovery points in a replication schedule (step d.iv). For example, you
can create a local schedule to retain hourly recovery points locally to supplement
the hourly replication schedule. The two schedules apply differently on the guest
VMs after failover, when the recovery points replicate back to the primary site.
Specify the following information in the Add Schedule window.

Figure 61: Protection Policy Configuration: Add Local Schedule

1. Take Snapshot Every: Specify the frequency in minutes, hours, days, or weeks at
which you want the recovery points to be taken locally.

2. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at the local site. If you set the
retention number to n, the local site retains the n recent recovery points.
When you enter the frequency in minutes, the system selects the Roll-up
retention type by default because minutely recovery points do not support
Linear retention types.
• Roll-up: Rolls up the recovery points into a single recovery point at the local
site.
For more information about the roll-up recovery points, see step d.iii.
3. Retention on 10.xx.xx.xxx:PE_C1_AHV: Specify the retention number for the local
site.
4. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Irrespective of the local or replication schedules, the recovery points are of the
specified type. If you check Take App-Consistent Recovery Point, the recovery
points generated are application-consistent and if you do not check Take App-
Consistent Recovery Point, the recovery points generated are crash-consistent.

If the time in the local schedule and the replication schedule match, the single
recovery point generated is application-consistent.
5. Click Save Schedule.
d. Click + Add Schedule to add a replication schedule between the primary and the recovery
site.
Specify the following information in the Add Schedule window. The window auto-
populates the Primary Location and Recovery Location that you have selected in step b
and step c.

Figure 62: Protection Policy Configuration: Add Schedule (NearSync)

• 1. Protection Type: Click Asynchronous.
2. Take Snapshot Every: Specify the frequency in minutes (anywhere between 1-15
minutes) at which you want the recovery points to be taken.
The specified frequency is the RPO. For more information about RPO, see Leap
Terminology on page 8.
3. Retention Type: When you enter the frequency in minutes in step ii, the system
selects the Roll-up retention type by default because NearSync replication
schedules do not support Linear retention types.
Roll-up retention type rolls up the recovery points as per the RPO and retention
period into a single recovery point at a site. For example, if you set the RPO to 1
hour, and the retention time to 5 days, the 24 oldest hourly recovery points roll up
into a single daily recovery point (one recovery point = 24 hourly recovery points)
after every 24 hours. The system keeps one day (of rolled-up hourly recovery
points) and 4 days of daily recovery points.

Note:

• If the retention period is n days, the system keeps 1 day of RPO (rolled-up
hourly recovery points) and n-1 days of daily recovery points.
• If the retention period is n weeks, the system keeps 1 day of RPO, 1 week
of daily and n-1 weeks of weekly recovery points.
• If the retention period is n months, the system keeps 1 day of RPO, 1 week
of daily, 1 month of weekly, and n-1 months of monthly recovery points.
• If the retention period is n years, the system keeps 1 day of RPO, 1 week
of daily, 1 month of weekly, and n-1 months of monthly recovery points.

Note: The recovery points that are used to create a rolled-up recovery point are
discarded.

Tip: Use roll-up retention policies for anything with a longer retention period. Roll-
up policies are more flexible and automatically handle recovery point aging/pruning
while still providing granular RPOs for the first day.

4. To specify the retention number for the primary and recovery sites, do the following.

• Retention on Local AZ: PE_A3_AHV: Specify the retention number for the
primary site.
This field is unavailable if you do not specify a recovery location.
• Retention on 10.xx.xx.xxx:PE_C1_AHV: Specify the retention number for the
recovery site.
5. If you want to enable reverse retention of the recovery points, check Reverse
retention for VMs on recovery location.
Reverse retention maintains the retention numbers of recovery points even after
failover to a recovery site in the same or different availability zones. For example, if
you retain two recovery points at the primary site and three recovery points at the
recovery site, and you enable reverse retention, a failover event does not change the
initial retention numbers when the recovery points replicate back to the primary site.
The recovery site still retains two recovery points while the primary site retains three
recovery points. If you do not enable reverse retention, a failover event changes the

initial retention numbers when the recovery points replicate back to the primary site.
The recovery site retains three recovery points while the primary site retains two
recovery points.
Maintaining the same retention numbers at a recovery site is required if you want to
retain a particular number of recovery points, irrespective of where the guest VM is
after its failover.
6. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Application-consistent recovery points ensure that application consistency is
maintained in the replicated recovery points. For application-consistent recovery
points, install NGT on the guest VMs running on AHV clusters. For guest VMs
running on ESXi clusters, you can take application-consistent recovery points
without installing NGT, but the recovery points are hypervisor-based, which leads to
VM stuns (temporarily unresponsive VMs) after failover to the recovery sites.

Caution: Application-consistent recovery points fail for EFI-boot enabled Windows
2019 VMs running on ESXi when NGT is not installed. Nutanix recommends installing
NGT on guest VMs running on ESXi as well.

7. Click Save Schedule.
e. Click + Add Recovery Location if you want to add an additional recovery site for the
guest VMs in the protection policy.

» To add an on-prem site for recovery, see Protection and DR between On-Prem Sites
(Leap) on page 17
» To add Xi Cloud Services for recovery, see Protection and DR between On-Prem Site
and Xi Cloud Service (Xi Leap) on page 141.

Figure 63: Protection Policy Configuration: Additional Recovery Location


f. Click + Add Schedule to add a replication schedule between the primary site and the
additional recovery site you specified in step e.
The Add Schedule window appears with the Primary Location and the additional
Recovery Location auto-populated. Perform step d again to add the replication schedule.
By default, recovery point creation begins immediately after you create the protection
policy. If you want to specify when recovery point creation must begin, click Immediately
at the top-right corner, and then, in the Start Time dialog box, do the following.

1. Click Start protection at specific point in time.
2. Specify the time at which you want to start taking recovery points.
3. Click Save.
g. Click Next.
Clicking Next shows a list of VM categories where you can optionally check one or more
VM categories to protect in the protection policy. DR configurations using Leap allow
you to protect a guest VM by using only one protection policy. Therefore, VM categories
specified in another protection policy are not in the list. If you protect a guest VM in
another protection policy by specifying the VM category of the guest VM (category-
based inclusion), and if you protect the guest VM from the VMs page in this policy
(individual inclusion), the individual inclusion supersedes the category-based inclusion.

Effectively, only the protection policy that protected the individual guest VM protects the
guest VM.
For example, the guest VM VM_SherlockH is in the category Department:Admin, and
you add this category to the protection policy named PP_AdminVMs. Now, if you add
VM_SherlockH from the VMs page to another protection policy named PP_VMs_UK,
VM_SherlockH is protected in PP_VMs_UK and unprotected from PP_AdminVMs.
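The precedence rule in the example above can be sketched as a toy shell function. This is purely illustrative; the function and its arguments are not part of any Nutanix CLI, and the policy names come from the example.

```shell
# Toy sketch of the inclusion-precedence rule: an individual (VMs page)
# inclusion supersedes a category-based one. Illustrative only.
effective_policy() {
  individual_policy=$1   # policy protecting the VM individually, if any
  category_policy=$2     # policy protecting the VM through its category
  if [ -n "${individual_policy}" ]; then
    echo "${individual_policy}"
  else
    echo "${category_policy}"
  fi
}

# VM_SherlockH: category Department:Admin -> PP_AdminVMs, then added
# individually to PP_VMs_UK. The individual inclusion wins.
effective_policy "PP_VMs_UK" "PP_AdminVMs"   # prints PP_VMs_UK
```

If no policy protects the guest VM individually, the category-based policy applies.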
h. If you want to protect the guest VMs by category, check the VM categories that you
want to protect from the list and click Add.

Figure 64: Protection Policy Configuration: Add VM Categories

Prism Central includes built-in VM categories for frequently encountered applications (for
example, MS Exchange and Oracle). If the VM category or value you want is not available,
first create the category with the required values, or update an existing category so
that it has the values you require. Doing so ensures that the VM categories and values
are available for selection. You can add VMs to the category either before or after you
configure the protection policy. If the guest VMs have a common characteristic, such as
belonging to a specific application or location, create a VM category and add the guest
VMs into the category.
If you do not want to protect the guest VMs by category, proceed to the next step
without checking VM categories. You can add the guest VMs individually to the protection
policy later from the VMs page (see Adding Guest VMs individually to a Protection Policy
on page 128).
i. Click Create.
The protection policy with a NearSync replication schedule is created. To verify the
protection policy, see the Protection Policies page. If you checked VM categories in step
h, the protection policy starts generating recovery points of the guest VMs in those VM
categories. To see the generated recovery points, click the hamburger icon at the top-
left corner of the window and go to VM Recovery Points. Click a recovery point for its

information. You can see the time estimated for the very first replication (seeding) to the
recovery sites.

Figure 65: Recovery Points Overview

Tip: DR using Leap with a NearSync replication schedule also allows you to recover the data
of the minute just before the unplanned failover. For example, with a protection policy that
has a 10-minute RPO, you can use the internal lightweight snapshots (LWS) to recover the
data of the ninth minute when there is an unplanned failover.

Creating a Recovery Plan (Leap)


To orchestrate the failover of the protected guest VMs to the recovery site, create a recovery
plan. After a failover, a recovery plan recovers the protected guest VMs to the recovery site. If
you have configured two recovery sites in a protection policy, create two recovery plans for DR
—one for recovery to each recovery site. The recovery plan synchronizes continuously to the
recovery site in a bidirectional way.
For more information about creating a recovery plan, see Creating a Recovery Plan (Leap) on
page 56.

Failover and Failback Operations (Leap)


You can perform test failover, planned failover, and unplanned failover of the guest VMs
protected with NearSync replication schedule across different Nutanix clusters at the same or
different on-prem availability zone (site). The steps to perform test, planned, and unplanned
failover are largely the same irrespective of the replication schedules that protect the guest
VMs.
See Failover and Failback Management on page 66 for test, planned, and unplanned
failover procedures.

Protection with Synchronous Replication Schedule (0 RPO) and DR


Synchronous replication enables you to protect your guest VMs with a zero recovery point
objective (0 RPO). A protection policy with Synchronous replication schedule replicates all the
writes on the protected guest VMs synchronously to the recovery availability zone (site) for
High Availability. The policy also takes recovery points of those protected VMs every 6 hours—
the first snapshot is taken immediately—for raw node (HDD+SSD) size up to 120 TB. Since the
replication is synchronous, the recovery points are crash-consistent only. For guest VMs (AHV)
protected with Synchronous replication schedule, you can perform DR only to an AHV cluster

at the same or different site. Replicating writes synchronously and also generating recovery
points helps to eliminate data losses due to:

• Unplanned failure events (for example, natural disasters and network failure).
• Planned failover events (for example, scheduled maintenance).
Nutanix recommends that the round-trip latency (RTT) between AHV clusters be less than 5 ms
for optimal performance of Synchronous replication schedules. Maintain adequate bandwidth to
accommodate peak writes and have a redundant physical network between the clusters.
To perform the replications synchronously yet efficiently, the protection policy limits you
to configure only one recovery site if you add a Synchronous replication schedule. If you
configure Synchronous replication schedule for a guest VM, you cannot add an Asynchronous
or NearSync schedule to the same guest VM. Similarly, if you configure an Asynchronous or a
NearSync replication schedule, you cannot add a Synchronous schedule to the same guest VM.
If you unpair the sites while the guest VMs in the Nutanix clusters are still in synchronization, the
Nutanix cluster becomes unstable. Therefore, disable Synchronous replication and clear stale
stretch parameters, if any, on both the primary and recovery Prism Element before unpairing
the sites. For more information about disabling Synchronous replication, see Synchronous
Replication Management on page 121.

Note: Nutanix provides multiple disaster recovery (DR) solutions to secure your environment.
See Nutanix Disaster Recovery Solutions on page 11 for the detailed representation of the DR
offerings of Nutanix.

Synchronous Replication Requirements


The following are the specific requirements for protecting your AHV guest VMs with
Synchronous replication schedule. Ensure that you meet the following requirements in addition
to the general requirements of Leap.
For information about the general requirements of Leap, see Leap Requirements on page 18.
For information about node, disk and Foundation configurations required to support
Synchronous replication schedules, see On-Prem Hardware Resource Requirements on page 14.

Hypervisor Requirements
AHV
The AHV clusters must be running on version 20190916.189 or newer.

Note: Synchronous replication schedules support only AHV.

Nutanix Software Requirements

• Each on-prem availability zone (AZ) must have a Leap enabled Prism Central instance.
The primary and recovery Nutanix clusters can be registered with a single Prism Central
instance, or each can be registered with a different Prism Central instance.

• The primary and recovery Prism Central and Prism Element on the registered Nutanix
clusters must be running on the same AOS version.

• AOS 5.17 or newer.


• AOS 5.17.1 or newer to support Synchronous replications of UEFI secure boot enabled
guest VMs.

• AOS 5.19.2 or newer for DR to an AHV cluster in the same AZ (registered to the same
Prism Central). For DR to an AHV cluster in the same AZ, Prism Central must be running
version 2021.3 or newer.

Additional Requirements

• For optimal performance, maintain the round trip latency (RTT) between Nutanix clusters to
less than 5 ms. Also, maintain adequate bandwidth to accommodate peak writes and have a
redundant physical network between the clusters.
• The storage container name of the protected guest VMs must be the same on both the
primary and recovery clusters. Therefore, a storage container must exist on the recovery
cluster with the same name as the one on the primary cluster. For example, if the protected
guest VMs are in the SelfServiceContainer storage container on the primary cluster, there
must also be a SelfServiceContainer storage container on the recovery cluster.
• For hardware and Foundation configurations required to support Synchronous replication
schedules, see On-Prem Hardware Resource Requirements on page 14.
• The clusters on the primary site and the recovery site communicate over the ports 2030,
2036, 2073, and 2090. Ensure that these ports have open access between both the primary
and the recovery clusters (Prism Element). For the complete list of required ports, see Port
Reference.

• If the primary and the recovery clusters (Prism Element) are in different subnets, open the
ports manually for communication.

Tip: If the primary and the recovery clusters (Prism Element) are in the same subnet, you
need not open the ports manually.

• To open the ports for communication to the recovery cluster, run the following command
on all CVMs of the primary cluster.

nutanix@cvm$ allssh 'modify_firewall -f -r remote_cvm_ip,remote_virtual_ip -p 2030,2036,2073,2090 -i eth0'

Replace remote_cvm_ip with the IP address of the recovery cluster CVM. If there are
multiple CVMs, replace remote_cvm_ip with the IP addresses of the CVMs separated by
commas.
Replace remote_virtual_ip with the virtual IP address of the recovery cluster.
• To open the ports for communication to the primary cluster, run the following command
on all CVMs of the recovery cluster.

nutanix@cvm$ allssh 'modify_firewall -f -r source_cvm_ip,source_virtual_ip -p 2030,2036,2073,2090 -i eth0'

Replace source_cvm_ip with the IP address of the primary cluster CVM. If there are multiple
CVMs, replace source_cvm_ip with the IP addresses of the CVMs separated by commas.
Replace source_virtual_ip with the virtual IP address of the primary cluster.

Note: Use the eth0 interface only. eth0 is the default CVM interface that shows up when you
install AOS.
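When a cluster has many CVMs, assembling the comma-separated IP list for the commands above by hand is error prone. The sketch below only builds and prints the command it would run; the IP addresses are placeholders, and the helper itself is not a Nutanix tool.

```shell
# Build (but do not run) the modify_firewall command for a recovery
# cluster, given its CVM IPs and virtual IP. All IPs are placeholders.
remote_cvm_ips="10.0.0.11 10.0.0.12 10.0.0.13"
remote_virtual_ip="10.0.0.10"

# modify_firewall expects a single comma-separated list for -r.
ip_list=$(echo ${remote_cvm_ips} ${remote_virtual_ip} | tr ' ' ',')
cmd="allssh 'modify_firewall -f -r ${ip_list} -p 2030,2036,2073,2090 -i eth0'"
echo "${cmd}"
```

Review the printed command, then paste it into a CVM shell on the primary cluster; build the reverse command on the recovery cluster the same way with the source IPs.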

Synchronous Replication Limitations


Consider the following specific limitations before protecting your guest VMs with Synchronous
replication schedule. These limitations are in addition to the general limitations of Leap.
For information about the general limitations of Leap, see Leap Limitations on page 22.

• You cannot restore guest VMs with incompatible GPUs at the recovery cluster.
• You cannot protect guest VMs configured as part of a network function chain.
• You cannot protect guest VMs with affinity policies.
• You cannot resize a guest VM disk while the guest VM is in replication. See KB-9986 for
more information.

Creating a Protection Policy with the Synchronous Replication Schedule (Leap)


To protect your guest VMs with zero RPO, configure a Synchronous
replication schedule while creating the protection policy. The policy replicates all the writes
on the protected guest VMs synchronously to the recovery availability zone (site) for High
Availability. For a raw node (HDD+SSD) size up to 120 TB, the policy also takes crash-consistent
recovery points of those guest VMs every 6 hours and replicates them to the recovery site—
the first snapshot is taken immediately. To maintain the efficiency of synchronous replication,
the protection policy allows you to add only one recovery site for the protected VMs. When
creating a protection policy, you can specify only VM categories. If you want to protect guest
VMs individually, you must first create the protection policy (which can also include VM
categories) and then include the guest VMs individually in the protection policy from the VMs
page.

Before you begin
See Synchronous Replication Requirements on page 113 and Synchronous Replication
Limitations on page 115 before you start.

About this task


To create a protection policy with the Synchronous replication schedule, do the following at the
primary site. You can also create a protection policy at the recovery site. Protection policies you
create or update at a recovery site synchronize back to the primary site.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Protection
Policies in the left pane.

Figure 66: Protection Policy Configuration: Protection Policies

3. Click Create Protection Policy.
Specify the following information in the Create Protection Policy window.

Figure 67: Protection Policy Configuration: Select Primary Location

a. Policy name: Enter a name for the protection policy.

Caution: The name can contain only alphanumeric characters, dots, dashes, and underscores.

b. In the Primary Location pane, specify the following information.

1. Location: From the drop-down list, select an availability zone (site) that hosts the
guest VMs to protect.
The drop-down lists all the sites paired with the local site. Local AZ represents the
local site (Prism Central). For your primary site, you can select either the local site
or a non-local site.
2. Cluster: From the drop-down list, select the AHV cluster that hosts the VMs to
protect.
The drop-down lists all the Nutanix clusters registered to Prism Central representing
the selected site. If you want to protect the guest VMs from multiple AHV clusters
in the same protection policy, select the AHV clusters that host those guest VMs. All
Clusters protects the guest VMs of all Nutanix clusters registered to Prism Central.
Select All Clusters only if all the clusters are running AHV.
3. Click Save.
Clicking Save activates the Recovery Location pane. Do not add a local schedule
to retain the recovery points locally. To maintain the replication efficiency,

Synchronous replication allows only the replication schedule. If you add a local
schedule, you cannot click Synchronous in step d.
c. In the Recovery Location pane, specify the following information.

Figure 68: Protection Policy Configuration: Select Recovery Location

1. Location: From the drop-down list, select the availability zone (site) where you want
to replicate the recovery points.
The drop-down lists all the sites paired with the local site. Local AZ represents the
local site (Prism Central). Select Local AZ if you want to configure DR to a different
AHV cluster at the same site.
If you do not select a site, local recovery points that are created by the protection
policy do not replicate automatically. You can, however, replicate the recovery
points manually and use recovery plans to recover the guest VMs. For more
information, see Manual Disaster Recovery (Leap) on page 137.
2. Cluster: From the drop-down list, select the AHV cluster where you want to
replicate the guest VM writes synchronously and recovery points.
The drop-down lists all the Nutanix clusters registered to Prism Central representing
the selected site. You can select one AHV cluster at the recovery site. Do not select
an ESXi cluster because DR configurations using Leap support only AHV clusters.
If you select an ESXi cluster and configure a Synchronous replication schedule,
replications fail.

Note: Selecting auto-select from the drop-down menu replicates the recovery
points to any available cluster at the recovery site. Select auto-select from the drop-
down list only if all the clusters at the recovery site run AHV and are up.

3. Click Save.
Clicking Save activates the + Add Schedule button between the primary and the
recovery site. Do not add a local schedule to retain the recovery points locally.

To maintain the replication efficiency, Synchronous replication allows only the
replication schedule. If you add a local schedule, you cannot click Synchronous in
step d.
d. Click + Add Schedule to add a replication schedule between the primary and the recovery
site.
Specify the following information in the Add Schedule window. The window auto-
populates the Primary Location and Recovery Location that you have selected in step b
and step c.

Figure 69: Protection Policy Configuration: Add Schedule (Synchronous)

1. Protection Type: Click Synchronous.

2. Failure Handling: Select one of the following options to specify how failures are
handled, for example, when the connection between the primary and the recovery
sites breaks and VM writes on the primary cluster stop replicating.

• Manual: Select this option if you want to resume the VM writes on the primary
site only when you manually disable Synchronous replication.
• Automatic: Select this option to resume VM writes on the primary site
automatically after the specified Timeout after seconds.

Note: The minimum timeout is 10 seconds.

3. Click Save Schedule.


Clicking Save Schedule disables the + Add Recovery Location button at the top-
right corner. To maintain the efficiency of synchronous replication, the policy allows
you to add only one recovery site.
By default, recovery point creation begins immediately after you create the protection
policy. If you want to specify when recovery point creation must begin, click Immediately
at the top-right corner, and then, in the Start Time dialog box, do the following.

1. Click Start protection at specific point in time.
2. Specify the time at which you want to start taking recovery points.
3. Click Save.
e. Click Next.
Clicking Next shows a list of VM categories where you can optionally check one or more
VM categories to protect in the protection policy. DR configurations using Leap allow
you to protect a guest VM by using only one protection policy. Therefore, VM categories
specified in another protection policy are not in the list. If you protect a guest VM in
another protection policy by specifying the VM category of the guest VM (category-
based inclusion), and if you protect the guest VM from the VMs page in this policy
(individual inclusion), the individual inclusion supersedes the category-based inclusion.
Effectively, only the protection policy that protected the individual guest VM protects the
guest VM.
For example, the guest VM VM_SherlockH is in the category Department:Admin, and
you add this category to the protection policy named PP_AdminVMs. Now, if you add
VM_SherlockH from the VMs page to another protection policy named PP_VMs_UK,
VM_SherlockH is protected in PP_VMs_UK and unprotected from PP_AdminVMs.
f. If you want to protect the guest VMs by category, check the VM categories that you
want to protect from the list and click Add.
Prism Central includes built-in VM categories for frequently encountered applications (for
example, MS Exchange and Oracle). If the VM category or value you want is not available,
first create the category with the required values, or update an existing category so
that it has the values you require. Doing so ensures that the VM categories and values
are available for selection. You can add VMs to the category either before or after you
configure the protection policy. If the guest VMs have a common characteristic, such as
belonging to a specific application or location, create a VM category and add the guest
VMs into the category.
If you do not want to protect the guest VMs by category, proceed to the next step
without checking VM categories. You can add the guest VMs individually to the protection

policy later from the VMs page (see Adding Guest VMs individually to a Protection Policy
on page 128).
g. Click Create.
The protection policy with Synchronous replication schedule is created. To verify the
protection policy, see the Protection Policies page. If you checked VM categories in step
f, the protection policy starts generating recovery points of the guest VMs in those VM
categories. To see the generated recovery points, click the hamburger icon at the top-
left corner of the window and go to VM Recovery Points. Click a recovery point for its
information. You can see the time estimated for the very first replication (seeding) to the
recovery sites.

Figure 70: Recovery Points Overview

Creating a Recovery Plan (Leap)


To orchestrate the failover of the protected guest VMs to the recovery site, create a recovery
plan. After a failover, a recovery plan recovers the protected guest VMs to the recovery site. If
you have configured two recovery sites in a protection policy, create two recovery plans for DR
—one for recovery to each recovery site. The recovery plan synchronizes continuously to the
recovery site in a bidirectional way.
For more information about creating a recovery plan, see Creating a Recovery Plan (Leap) on
page 56.

Synchronous Replication Management


Synchronous replication instantly replicates all writes on the protected guest VMs to the
recovery cluster. Replication starts when you configure a protection policy and add the guest
VMs to protect. You can manage the replication by enabling, disabling, pausing, or resuming the
Synchronous replication on the protected guest VMs from Prism Central.

Enabling Synchronous Replication

When you configure a protection policy with Synchronous replication schedule and add
guest VMs to protect, the replication is enabled by default. However, if you have disabled the
Synchronous replication on a guest VM, you have to enable it to start replication.

About this task
To enable Synchronous replication on a guest VM, perform the following procedure at the
primary availability zone (site). You can also perform the following procedure at the recovery
site. The operations you perform at a recovery site synchronize back to the primary site.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs > List in the left pane.

3. Select the guest VMs on which you want to enable Synchronous replication.

4. Click Protect from the Actions drop-down menu.

5. Select the protection policy in the table to include the guest VMs in the protection policy.

6. Click Protect.

Pausing Synchronous Replication

The protected guest VMs on the primary cluster stop responding when the recovery cluster
is disconnected abruptly (for example, due to network outage or internal service crash). To
come out of the unresponsive state, you can pause Synchronous replication on the guest VMs.
Pausing Synchronous replication temporarily suspends the replication state of the guest VMs
without completely disabling the replication relationship.

About this task


To pause Synchronous replication on a guest VM, perform the following procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs > List in the left pane.

3. Select the guest VMs on which you want to pause Synchronous replication.

4. Click Pause Synchronous Replication from the Actions drop-down menu.

Resuming Synchronous Replication

You can resume the Synchronous replication that you had paused to come out of the
unresponsive state of the primary cluster. Resuming Synchronous replication restores the
replication status and reconciles the state of the guest VMs. To resume Synchronous replication
on a guest VM, perform the following procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs > List in the left pane.

3. Select the guest VMs on which you want to resume Synchronous replication.

4. Click Resume Synchronous Replication from the Actions drop-down menu.

Failover and Failback Operations (Leap)


You can perform test failover, planned failover, and unplanned failover of the guest VMs
protected with Synchronous replication schedule across AHV clusters at the same or
different on-prem availability zone (site). The steps to perform test, planned, and unplanned
failover are largely the same irrespective of the replication schedules that protect the guest
VMs. Additionally, a planned failover of the guest VMs protected with Synchronous replication
schedule also allows for live migration of the protected guest VMs.
See Failover and Failback Management on page 66 for test, planned, and unplanned
failover procedures.

Cross-Cluster Live Migration


Planned failover of the guest VMs protected with Synchronous replication schedule supports
live migration to another AHV cluster. Live migration offers zero downtime for your applications
during a planned failover event to the recovery cluster (for example, during scheduled
maintenance).

Cross-Cluster Live Migration Requirements

The following are the specific requirements for successful live migration of your guest VMs.
Ensure that you meet the following requirements in addition to the requirements of
Synchronous replication schedule (Synchronous Replication Requirements on page 113) and
general requirements of Leap (Leap Requirements on page 18).

• Stretch L2 networks across the primary and recovery sites.
A stretched L2 network spans your network across different sites and retains the IP
addresses of guest VMs after their live migration to the recovery site.
• Both the primary and recovery Nutanix clusters must have identical CPU types with the
same CPU feature set (set of CPU flags). If the CPU feature sets are not identical, live
migration fails.
• Both the primary and recovery Nutanix clusters must run on the same AHV version.
• If the primary and the recovery Nutanix clusters (Prism Element) are in different subnets,
open the ports 49250–49260 for communication. For the complete list of required ports, see
Port Reference.
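Before a planned failover, you may want to confirm that the live-migration port range is reachable. The sketch below is an illustrative TCP probe, not a Nutanix tool; run it from a host that can reach the recovery cluster, and treat the example host name as a placeholder.

```shell
# Probe a TCP port range on a remote host; prints "open" or "closed"
# per port. Requires bash (for /dev/tcp) and coreutils timeout.
check_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

probe_range() {
  host=$1; start=$2; end=$3
  for port in $(seq "${start}" "${end}"); do
    if check_port "${host}" "${port}"; then
      echo "${port} open"
    else
      echo "${port} closed"
    fi
  done
}

# Example with a placeholder host name:
# probe_range recovery-cluster.example.com 49250 49260
```

Any port reported closed must be opened between the clusters before attempting live migration.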

Cross-Cluster Live Migration Limitations

Consider the following limitation in addition to the limitations of Synchronous replication
schedule (Synchronous Replication Limitations on page 115) and general limitations of Leap
(Leap Limitations on page 22) before performing live migration of your guest VMs.

• Live migration of guest VMs fails if the guest VMs are part of Flow security policies.

Tip: To enable the guest VMs to retain the Flow security policies after the failover (live
migration), revoke the policies on the guest VMs and Export them to the recovery site. At
the recovery site, Import the policies. The guest VMs read the policies automatically after
recovery.

Performing Cross-Cluster Live Migration

If, due to a planned event (for example, scheduled maintenance of guest VMs) at the primary
availability zone (site), you want to migrate your applications to another AHV cluster without
downtime, perform a planned failover with live migration to the recovery site.

About this task


To live migrate the guest VMs, perform the following procedure at the recovery site.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Select a recovery plan for the failover operation.

Caution: The Recovery Plans page can display many recovery plans. Select the recovery plan
that has stretch networks. If you select a recovery plan that has non-stretch networks, the
migration fails. For more information about the selection of stretch and non-stretch networks,
see Creating a Recovery Plan (Leap) on page 56.

4. Click Failover from the Actions drop-down menu.
Specify the following information in the Failover from Recovery Plan window. The window
auto-populates the Failover From and Failover To locations from the recovery plan you
select in step 3.

Note: If you select more than one recovery plan in step 3, the Failover action is available only
when the selected recovery plans have the same primary and recovery locations.

Figure 71: Planned Failover

a. Failover Type: Click Planned Failover and check Live Migrate VMs.
b. Click + Add target clusters if you want to failover to specific clusters at the recovery site.
If you do not add target clusters, the recovery plan migrates the guest VMs to any AHV
cluster at the recovery site.

5. Click Failover.
The Failover from Recovery Plan dialog box lists the errors and warnings, if any, and allows
you to stop or continue the failover operation. If there are no errors or you resolve the errors in
step 6, the guest VMs migrate and start at the recovery cluster. The migration might show
a network latency of 300-600 ms. You cannot see the migrated guest VMs on the primary
cluster because those VMs come up at the recovery cluster after the migration.

6. If you see errors, do the following.

• To review errors or warnings, click View Details in the description.


Resolve the error conditions and then restart the failover procedure.
• Select one of the following.

• To stop the failover operation, click Abort.


• To continue the failover operation despite the warnings, click Execute Anyway.

Note: You cannot continue the failover operation when the validation fails with errors.

Converting a Multi-AZ Deployment to Single-AZ


To use disaster recovery (DR) features that support only single Prism Central (AZ) managed
deployments, you can convert your multi-AZ deployment to single-AZ deployment. For
example, in a two-AZ deployment where each Prism Central instance (Prism Central A, Prism
Central B) manages one Prism Element cluster (Prism Element A, Prism Element B), you can
perform the following procedure to convert to a single-AZ deployment (Prism Central A
managing both Prism Element A and Prism Element B).

Before you begin


This procedure converts deployments protected with Synchronous replication schedules. See
Synchronous Replication Requirements on page 113 for the supported Prism Central and
AOS versions. To avoid a single point of failure in such deployments, Nutanix recommends
installing the single Prism Central at a different AZ (different fault domain).
You can also perform this procedure to convert deployments protected with Asynchronous
and NearSync replication schedules. The conversion procedure for deployments protected
with Asynchronous and NearSync replication schedules is identical, except that the protection
status (step 2 in the described procedure) of Asynchronous and NearSync replication
schedules is available only in Focus > Data Protection.

Figure 72: Focus

Procedure

1. Log on to the web console of Prism Central A.

2. Modify all the protection policies and recovery plans that refer to Prism Element B and Prism
Central B.

a. Modify the protection policies to either remove all the references to Prism Element B and
Prism Central B or remove all the guest VMs from the policy.
For more information about updating a protection policy, see Updating a Protection
Policy on page 132.
b. Modify the recovery plans to remove all the references to Prism Element B and Prism
Central B.

Note: If you do not modify the recovery plans, the recovery plans become invalid after
you unregister Prism Element B from Prism Central B in step 5. For more information
about updating a recovery plan, see Updating a Recovery Plan on page 136.

c. Ensure that there are no issues (in Alerts) with the modified protection policies and
recovery plans.

Note: Before unregistering Prism Element B from Prism Central B in step 5, ensure that no
guest VM is protected to or from Prism Element B.

3. Unprotect all the guest VMs replicating to and from Prism Element B and Prism Central B.

Note: If the guest VMs are protected by VM categories, update or delete the VM categories
from the protection policies and recovery plans.

To see the unprotect status of the guest VMs, click Focus > Data Protection.

4. Ensure that the guest VMs are completely unprotected.

• To verify that all the stretch states are deleted, log on to Prism Element B through SSH as
the nutanix user and run the following command.
nutanix@cvm$ stretch_params_printer

An empty response indicates that all stretch states are deleted.

• To verify that all the stretch states between Prism Central B and Prism Element B are
deleted, log on to Prism Central B through SSH as the nutanix user and run the following
commands.
pcvm$ mcli
mcli> mcli dr_coordinator.list

An empty response indicates that all stretch states are deleted.
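Both checks above reduce to the same condition: the command must print nothing. The following sketch is a hypothetical helper (not part of the Nutanix tooling) that wraps any command and succeeds only when its output is empty, which can be useful when scripting the verification.

```shell
# Hypothetical helper (not Nutanix tooling): succeed only when the wrapped
# command prints nothing, which here indicates all stretch states are deleted.
check_no_output() {
  local out
  out="$("$@" 2>/dev/null)"
  [ -z "$out" ]
}

# Example (run on Prism Element B as the nutanix user):
#   check_no_output stretch_params_printer && echo "all stretch states deleted"
```

The helper captures stdout of the wrapped command and tests it with `[ -z ... ]`, so any stray output (a remaining stretch state) causes a non-zero exit status.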

5. Unregister Prism Element B from Prism Central B.


After unregistering Prism Element B from Prism Central B, the system deletes all Prism
Central B attributes and policies applied to guest VMs on Prism Element B (for example, VM
categories).

6. Register Prism Element B to Prism Central A.
After registering Prism Element B to Prism Central A, reconfigure all Prism Central B
attributes and policies applied to entities on the Prism Element B (for example, VM
categories).

7. Modify the protection policies and recovery plans to refer to Prism Central A and Prism
Element B.

8. Unpair Prism Central B.

To ensure that all the stretch states between Prism Central A and Prism Central B are deleted,
log on to both Prism Central A and Prism Central B through SSH and run the following
commands.
pcvm$ mcli
mcli> mcli dr_coordinator.list

An empty response indicates that all stretch states are deleted.


The multi-AZ deployment is converted to a single-AZ deployment. Prism Element A and Prism
Element B are now registered to a single Prism Central instance (Prism Central A).

Protection Policy Management


A protection policy automates the creation and replication of recovery points. When creating a
protection policy, you specify replication schedules, retention policies for the recovery points,
and the guest VMs you want to protect. You also specify recovery availability zones (sites), to a
maximum of two, if you want to automate recovery point replication to those sites.
When you create, update, or delete a protection policy, the change synchronizes to the recovery
sites; synchronization works bidirectionally. The recovery points generated at the recovery sites
replicate back to the primary site when the primary site resumes functioning. For information
about how Leap determines the list of sites for synchronization, see Entity Synchronization
Between Paired Availability Zones on page 138.

Adding Guest VMs individually to a Protection Policy


You can also protect guest VMs individually in a protection policy from the VMs page, without
the use of a VM category. To protect guest VMs individually in a protection policy, perform the
following procedure.

About this task

Note: If you protect a guest VM individually, you can remove the guest VM from the protection
policy only by using the procedure in Removing Guest VMs individually from a Protection
Policy on page 130.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Compute & Storage >
VMs in the left pane.

Figure 73: Virtual Infrastructure: VMs

3. Select the guest VMs that you want to add to a protection policy.

4. Click Protect from the Actions drop-down menu.

Figure 74: Protect Guest VMs Individually: Actions

5. Select the protection policy in the table to protect the selected guest VMs.

Figure 75: Protect Guest VMs Individually: Protection Policy Selection

6. Click Protect.

Removing Guest VMs individually from a Protection Policy


You can remove guest VMs individually from a protection policy from the VMs page. To remove
guest VMs individually from a protection policy, perform the following procedure.

About this task

Note: If a guest VM is protected under a VM category, you cannot remove the guest VM from
the protection policy by the following procedure. You can remove the guest VM from the
protection policy only by dissociating the guest VM from the VM category.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Compute & Storage >
VMs in the left pane.

Figure 76: Virtual Infrastructure: VMs

3. Select the guest VMs that you want to remove from a protection policy.

4. Click UnProtect from the Actions drop-down menu.

Cloning a Protection Policy


If the requirements of the protection policy that you want to create are similar to an existing
protection policy, you can clone the existing protection policy and update the clone. To clone a
protection policy, perform the following procedure.

About this task

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Data Protection >
Protection Policies in the left pane.

Figure 77: Protection Policy Configuration: Protection Policies

3. Select the protection policy that you want to clone.

4. Click Clone from the Actions drop-down menu.

5. Make the required changes on the Clone Protection Policy page.


For information about the fields on the page, see:

• Creating a Protection Policy with an Asynchronous Replication Schedule (Leap) on


page 38
• Creating a Protection Policy with a NearSync Replication Schedule (Leap) on page 100
• Creating a Protection Policy with the Synchronous Replication Schedule (Leap) on
page 115

6. Click Save.

Updating a Protection Policy


You can modify an existing protection policy in Prism Central. To update an existing protection
policy, perform the following procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Data Protection >
Protection Policies in the left pane.

Figure 78: Protection Policy Configuration: Protection Policies

3. Select the protection policy that you want to update.

4. Click Update from the Actions drop-down menu.

5. Make the required changes on the Update Protection Policy page.


For information about the fields on the page, see:

• Creating a Protection Policy with an Asynchronous Replication Schedule (Leap) on


page 38
• Creating a Protection Policy with a NearSync Replication Schedule (Leap) on page 100
• Creating a Protection Policy with the Synchronous Replication Schedule (Leap) on
page 115

6. Click Save.

Finding the Protection Policy of a Guest VM


You can use the data protection focus on the VMs page to determine the protection policies
to which a guest VM belongs. To determine the protection policy, perform the following
procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs > List in the left pane.

3. Click Data Protection from the Focus menu at the top-right corner.
The Protection Policy column that is displayed shows the protection policy to which the
guest VMs belong.

Figure 79: Focus

Recovery Plan Management


A recovery plan orchestrates the recovery of protected VMs at the recovery site. Recovery
plans are predefined procedures (runbooks) that use stages to enforce VM power-on sequence.
You can also specify the inter-stage delays to recover applications.
When you create, update, or delete a recovery plan, it synchronizes to the recovery sites
and works bidirectionally. For information about how Leap determines the list of sites for
synchronization, see Entity Synchronization Between Paired Availability Zones on page 138.
After a failover from the primary site to a recovery site, you can failback to the primary site by
using the same recovery plan.
Recovery plans are independent of protection policies and do not reference protection policies
in their configuration information. Also, they do not create recovery points. While the process of
planned failover includes the creation of a recovery point so that the latest data can be used for
recovery, unplanned and test failovers rely on the availability of the required recovery points at
the recovery site. A recovery plan therefore requires the guest VMs in the recovery plan to also
be associated with a protection policy.

Adding Guest VMs individually to a Recovery Plan


You can also add guest VMs individually to a recovery plan from the VMs page, without the use
of a VM category. To add VMs individually to a recovery plan, perform the following procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Compute & Storage >
VMs in the left pane.

Figure 80: Virtual Infrastructure: VMs

3. Select the guest VMs that you want to add to a recovery plan.

4. Click Add to Recovery Plan from the Actions drop-down menu.

5. Select the recovery plan where you want to add the guest VMs in the Add to Recovery Plan
page.

Tip: Click +Create New if you want to create a new recovery plan for the selected guest
VMs. For more information about creating a recovery plan, see Creating a Recovery Plan
(Leap) on page 56.

6. Click Add.
The Update Recovery Plan page appears. Make the required changes to the recovery plan.
For information about the various fields and options, see Creating a Recovery Plan (Leap) on
page 56.

Removing Guest VMs individually from a Recovery Plan


You can also remove guest VMs individually from a recovery plan. To remove guest VMs
individually from a recovery plan, perform the following procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Data Protection >
Recovery Plans in the left pane.

3. Select the recovery plan from which you want to remove guest VMs.

4. Click Update from the Actions drop-down menu.


The Update Recovery Plan page appears. Make the required changes to the recovery plan.
For information about the various fields and options, see Creating a Recovery Plan (Leap) on
page 56.

Updating a Recovery Plan


You can update an existing recovery plan. To update a recovery plan, perform the following
procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Data Protection >
Recovery Plans in the left pane.

3. Select the recovery plan that you want to update.

4. Click Update from the Actions drop-down menu.


The Update Recovery Plan dialog box appears. Make the required changes to the recovery
plan. For information about the various fields and options, see Creating a Recovery Plan
(Leap) on page 56.

Validating a Recovery Plan


You can validate a recovery plan from the recovery site. Recovery plan validation does not
perform a failover like the test failover does, but reports warnings and errors. To validate a
recovery plan, perform the following procedure.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Data Protection >
Recovery Plans in the left pane.

3. Select the recovery plan that you want to validate.

4. Click Validate from the Actions drop-down menu.

5. In the Validate Recovery Plan page, do the following.

a. In Primary Location, select the primary location.


b. In Recovery Location, select the recovery location.
c. Click Proceed.
The validation process lists any warnings and errors.

6. Click Back.
A summary of the validation is displayed. You can close the dialog box.

7. To return to the detailed results of the validation, click the link in the Validation Errors
column.
The selected recovery plan is validated for its correct configuration. The updated recovery
plan starts synchronizing to the recovery Prism Central.

Manual Disaster Recovery (Leap)
Manual data protection involves manually creating recovery points, manually replicating
recovery points, and manually recovering the VMs at the recovery site. You can also automate
some of these tasks. For example, the last step—that of manually recovering VMs at the
recovery site—can be performed by a recovery plan while the underlying recovery point
creation and replication can be performed by protection policies. Conversely, you can configure
protection policies to automate recovery point creation and replication and recover VMs at the
recovery site manually.

Creating Recovery Points Manually (Out-of-Band Snapshots)

About this task


To create recovery points manually, do the following.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs > List in the left pane.

3. Select the guest VMs for which you want to create a recovery point.

4. Click Create Recovery Point from the Actions drop-down menu.

5. To verify that the recovery point is created, click the name of the VM, click the Recovery
Points tab, and verify that a recovery point is created.

Replicating Recovery Points Manually


You can manually replicate recovery points only from the availability zone (site) where the
recovery points exist.

About this task


To replicate recovery points manually, do the following.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs > List in the left pane.

3. Click the guest VM whose recovery points you want to replicate, and then click Recovery
Points in the left pane.
The Recovery Points view lists all the recovery points of the VM.

4. Select the recovery points that you want to replicate.

5. Click Replicate from the Actions drop-down menu.

6. In the Replicate dialog box, do the following.

a. In Recovery Location, select the location where you want to replicate the recovery point.
b. In Target Cluster, select the cluster where you want to replicate the recovery point.
c. Click Replicate Recovery Point.

Recovering a Guest VM from a Recovery Point Manually (Clone)


You can recover a guest VM by cloning the guest VM from a recovery point.

About this task


To recover a guest VM from a recovery point, do the following.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Virtual Infrastructure >
VMs > List in the left pane.

3. Click the VM whose recovery point you want to restore, and then click Recovery Points in
the left pane.
The Recovery Points view lists all the recovery points of the VM.

4. Select the recovery point from which you want to recover the VM.

5. Click Restore from the Actions drop-down menu.

6. In the Restore dialog box, do the following.

a. In the text box provided for specifying a name for the VM, specify a new name or do
nothing to use the automatically generated name.
b. Click Restore.

Warning: The following are the limitations of manually recovered VMs (VMs recovered
without the use of a recovery plan).

• The VMs recover without a VNIC if the recovery is performed at the remote site.
• VM categories are not applied.
• NGT needs to be reconfigured.

Entity Synchronization Between Paired Availability Zones


When paired with each other, availability zones (sites) synchronize disaster recovery (DR)
configuration entities. Paired sites synchronize the following DR configuration entities.
Protection Policies
A protection policy is synchronized whenever you create, update, or delete the
protection policy.
Recovery Plans
A recovery plan is synchronized whenever you create, update, or delete the recovery
plan. The list of availability zones (sites) to which the on-prem site must synchronize a
recovery plan is derived from the guest VMs that are included in the recovery plan: both
the VM categories and the individually added guest VMs.
If you specify VM categories in a recovery plan, Leap determines which protection
policies use those VM categories, and then synchronizes the recovery plans to the
availability zones specified in those protection policies.
If you include guest VMs individually (without VM categories) in a recovery plan,
Leap uses the recovery points of those guest VMs to determine which protection
policies created those recovery points, and then synchronizes the recovery plans to the
availability zones (sites) specified in those protection policies. If you create a recovery
plan for VM categories or guest VMs that are not associated with a protection policy,
Leap cannot determine the availability zone list and therefore cannot synchronize the
recovery plan. If a recovery plan includes only individually added guest VMs and a
protection policy associated with a guest VM has not yet created guest VM recovery
points, Leap cannot synchronize the recovery plan to the availability zone specified in
that protection policy. However, recovery plans are monitored every 15 minutes for the
availability of recovery points that can help derive availability zone information. When
recovery points become available, the paired on-prem site derives the availability zone by
the process described earlier and synchronizes the recovery plan to the availability zone.
VM Categories used in Protection Policies and Recovery Plans
A VM category is synchronized when you specify the VM category in a protection policy
or recovery plan.
Issues such as a loss of network connectivity between paired availability zones or user actions
such as unpairing of availability zones followed by repairing of those availability zones can
affect VM synchronization.

Tip: Nutanix recommends unprotecting all the VMs on an availability zone before unpairing
it, to avoid a state where the entities have stale configurations after repairing the
availability zones.

If you update guest VMs in either or both availability zones before such issues are resolved or
before unpaired availability zones are paired again, VM synchronization is not possible. Also,
during VM synchronization, if a guest VM cannot be synchronized because of an update failure
or conflict (for example, you updated the same VM in both availability zones during a network
connectivity issue), no further VMs are synchronized. Entity synchronization can resume only
after you resolve the error or conflict. To resolve a conflict, use the Entity Sync option, which
is available in the web console. Force synchronization from the availability zone that has the
desired configuration. Forced synchronization overwrites conflicting configurations in the
paired availability zone.

Note: Forced synchronization cannot resolve errors arising from conflicting values in guest VM
specifications (for example, the paired availability zone already has a VM with the same name).

If you do not update entities before a connectivity issue is resolved or before you pair
the availability zones again, the synchronization behavior described earlier resumes. Also,
pairing previously unpaired availability zones triggers an automatic synchronization event. For
recommendations on avoiding such issues, see Entity Synchronization Recommendations
(Leap) on page 139.

Entity Synchronization Recommendations (Leap)


Consider the following recommendations to avoid inconsistencies and the resulting
synchronization issues.

• During network connectivity issues, do not update entities at both the availability zones
(sites) in a pair. You can safely make updates at any one site. After the connectivity issue is
resolved, force synchronization from the site in which you made updates. Failure to adhere
to this recommendation results in synchronization failures.
You can safely create entities at either or both the sites as long as you do not assign
the same name to entities at the two sites. After the connectivity issue is resolved, force
synchronization from the site where you created entities.
• If one of the sites becomes unavailable, or if any service in the paired site is down, perform
force synchronization from the paired availability zone after the issue is resolved.

Forcing Entity Synchronization (Leap)


Entity synchronization, when forced from an availability zone (site), overwrites the
corresponding entities in paired sites. Forced synchronization also creates, updates, and
removes those entities from paired sites.

About this task


The availability zone (site) to which a particular entity is forcefully synchronized depends
on which site requires the entity (see Entity Synchronization Between Paired Availability
Zones on page 138). To avoid inadvertently overwriting required entities, ensure that you
force entity synchronization from the site in which the entities have the desired configuration.
If a site is paired with two or more availability zones (sites), you cannot select specific sites
with which to synchronize entities.
To force entity synchronization, do the following.

Procedure

1. Log on to the Prism Central web console.

2. Click the settings button (gear icon) at the top-right corner of the window.

3. Click Entity Sync in the left pane.

4. In the Entity Sync dialog box, review the message at the top of the dialog box, and then do
the following.

a. To review the list of entities that will be synchronized to an AVAILABILITY ZONE, click the
number of ENTITIES adjacent to an availability zone.
b. After you review the list of entities, click Back.

5. Click Sync Entities.

PROTECTION AND DR BETWEEN ON-
PREM SITE AND XI CLOUD SERVICE (XI
LEAP)
Xi Leap is essentially an extension of Leap to Xi Cloud Services. Xi Leap protects your guest
VMs and orchestrates their disaster recovery (DR) to Xi Cloud Services when events causing
service disruption occur at the primary availability zone (site). For protection of your guest
VMs, protection policies with Asynchronous and NearSync replication schedules generate
and replicate recovery points to Xi Cloud Services. Recovery plans orchestrate DR from the
replicated recovery points to Xi Cloud Services.
Protection policies create a recovery point—and set its expiry time—in every iteration of the
specified time period (RPO). For example, the policy creates a recovery point every 1 hour for
an RPO schedule of 1 hour. The recovery point expires at its designated expiry time based on
the retention policy—see step 3 in Creating a Protection Policy with Asynchronous Replication
Schedule (Xi Leap) on page 182. If there is a prolonged outage at a site, the Nutanix cluster
retains the last recovery point to ensure you do not lose all the recovery points. For NearSync
replication (lightweight snapshot), the Nutanix cluster retains the last full hourly snapshot.
During the outage, the Nutanix cluster does not clean up the recovery points due to expiry.
When the Nutanix cluster comes online, it cleans up the recovery points that are past expiry
immediately.
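The schedule arithmetic described above can be illustrated with a short calculation. In this sketch (illustrative only; the retention count and iteration number are assumptions, not values from this guide), a 1-hour RPO produces one recovery point per hour, and each point expires once the assumed retention window of 24 points has elapsed.

```shell
# Illustrative arithmetic only (assumed retention policy; not Nutanix code).
rpo_minutes=60        # 1-hour RPO: one recovery point per hour
retention_count=24    # assume the policy retains 24 recovery points
iteration=5           # look at the 5th recovery point in the schedule

create_minute=$(( iteration * rpo_minutes ))
expiry_minute=$(( create_minute + retention_count * rpo_minutes ))
echo "recovery point created at minute $create_minute, expires at minute $expiry_minute"
```

During an outage the expiry times still pass, but as noted above, the cluster defers the cleanup and removes the past-expiry points only after it comes back online.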
If you remove a guest VM from a protection policy, delete all the recovery points associated
with the guest VM. If you do not delete the recovery points explicitly, they adhere to the
expiration period set in the protection policy and continue to incur charges until expiry. To
stop the charges immediately, log on to Xi Cloud Services and delete all of the recovery points
explicitly.
For high availability of a guest VM, Leap can replicate recovery points to one or more sites.
A protection policy can replicate recovery points to a maximum of two sites. One of the
two sites can be in the cloud (Xi Cloud Services). For replication to Xi Cloud Services, you must
add a replication schedule between the on-prem site and Xi Cloud Services. You can set up the
on-prem site and Xi Cloud Services in the following arrangements.

Figure 81: The Primary Nutanix Cluster at on-prem site and recovery Xi Cloud Services

Disaster Recovery (Formerly Leap) | Protection and DR between On-Prem Site and Xi Cloud Service
(Xi Leap) | 141
Figure 82: The Primary Xi Cloud Services and recovery Nutanix Cluster at on-prem site

The replication schedule between an on-prem site and Xi Cloud Services enables DR to Xi
Cloud Services. To enable performing DR to Xi Cloud Services, you must create a recovery
plan. In addition to performing DR from AHV clusters to Xi Cloud Services (only AHV), you can
also perform cross-hypervisor disaster recovery (CHDR)—DR from ESXi clusters to Xi Cloud
Services.
The protection policies and recovery plans you create or update synchronize continuously
between the on-prem site and Xi Cloud Services. The reverse synchronization enables you
to create or update entities (protection policies, recovery plans, and guest VMs) at either the
primary or the recovery site.
This section describes protection of your guest VMs and DR from Xi Cloud Services to a
Nutanix cluster at the on-prem site. In Xi Cloud Services, you can protect your guest VMs and
DR to a Nutanix cluster at only one on-prem site. For information about protection of your
guest VMs and DR to Xi Cloud Services, see Protection and DR between On-Prem Sites (Leap)
on page 17.

Xi Leap Requirements
The following are the general requirements of Xi Leap. Along with the general requirements,
there are specific requirements for protection with the following supported replication
schedules.

• For information about the on-prem node, disk and Foundation configurations required
to support Asynchronous and NearSync replication schedules, see On-Prem Hardware
Resource Requirements on page 14.
• For specific requirements of protection with Asynchronous replication schedule (1 hour or
greater RPO), see Asynchronous Replication Requirements (Xi Leap) on page 180.
• For specific requirements of protection with NearSync replication schedule (1–15 minutes
RPO), see NearSync Replication Requirements (Xi Leap) on page 215.

License Requirements
The AOS license required depends on the features that you want to use. For information about
the features that are available with an AOS license, see Software Options.

Hypervisor Requirements
The underlying hypervisors required differ in all the supported replication schedules. For more
information about underlying hypervisor requirements for the supported replication schedules,
see:

• Asynchronous Replication Requirements (Xi Leap) on page 180


• NearSync Replication Requirements (Xi Leap) on page 215

Nutanix Software Requirements

• Each on-prem availability zone (site) must have a Leap enabled Prism Central instance. To
enable Leap in Prism Central, see Enabling Leap for On-Prem Site on page 32.

Note: If you are using ESXi, register at least one vCenter Server to Prism Central. You can also
register two vCenter Servers, each to a Prism Central instance at a different site. If you register
both Prism Central instances to a single vCenter Server, ensure that each ESXi cluster is part of a
different datacenter object in vCenter.

• The on-prem Prism Central and its registered Nutanix clusters (Prism Element) must be
running on the supported AOS versions. For more information about the required versions
for the supported replication schedules, see:

• Asynchronous Replication Requirements (Xi Leap) on page 180


• NearSync Replication Requirements (Xi Leap) on page 215

Tip:
Nutanix supports replication between all the latest supported LTS and STS
AOS releases. To check the list of the latest supported AOS versions, see
KB-5505. To determine if the AOS versions currently running on your clusters are
EOL, see the EOL document.
Upgrade the AOS version to the next available supported LTS/STS release. To
determine if an upgrade path is supported, check the Upgrade Paths page before
you upgrade the AOS.

Note: If both clusters have different AOS versions that are EOL, upgrade the cluster
with lower AOS version to match the cluster with higher AOS version and then
perform the upgrade to the next supported LTS version.

For example, the clusters are running AOS versions 5.5.x and 5.10.x respectively.
Upgrade the cluster on 5.5.x to 5.10.x. After both the clusters are on 5.10.x, proceed
to upgrade each cluster to 5.15.x (supported LTS). Once both clusters are on 5.15.x,
you can upgrade the clusters to 5.20.x or newer.
Nutanix recommends that both the primary and the replication clusters or sites run
the same AOS version.

User Requirements
You must have one of the following roles in Xi Cloud Services.

• User admin
• Prism Central admin

• Prism Self Service admin
• Xi admin

Firewall Port Requirements


To allow two-way replication between an on-prem Nutanix cluster and Xi Cloud Services,
you must enable certain ports in your external firewall. For the required ports, see
Disaster Recovery - Leap in Port Reference.

Networking Requirements
Requirements for static IP address preservation after failover
You can preserve one IP address of a guest VM (with static IP address) for its failover
(DR) to an IPAM network. After the failover, the other IP addresses of the guest VM have
to be reconfigured manually. To preserve an IP address of a guest VM (with static IP
address), ensure that:

Caution: By default, you cannot preserve statically assigned DNS IP addresses after
failover (DR) of guest VMs. However, you can create custom in-guest scripts to preserve
the statically assigned DNS IP addresses. For more information, see Creating a Recovery
Plan (Xi Leap) on page 193.

• Both the primary and the recovery Nutanix clusters run AOS 5.11 or newer.
• The protected guest VMs have Nutanix Guest Tools (NGT) version 1.5 or newer
installed.
For information about installing NGT, see Nutanix Guest Tools in Prism Web Console
Guide.
• The protected guest VMs have at least one empty CD-ROM slot.
The empty CD-ROM is required for mounting NGT at the recovery site.
• The protected guest VMs have NetworkManager command-line tool (nmcli) version
0.9.10.0 or newer installed.
Also, the NetworkManager must manage the networks on Linux VMs. To enable
NetworkManager on a Linux VM, in the interface configuration file, set the value of the
NM_CONTROLLED field to yes. After setting the field, restart the network service on the VM.

Tip: In CentOS, the interface configuration file is /etc/sysconfig/network-scripts/ifcfg-eth0.
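The NM_CONTROLLED change described above can be scripted. The following sketch is a hypothetical helper (not part of any Nutanix or NetworkManager tooling) that sets the field in the contents of an ifcfg file:

```python
import re

def enable_network_manager(ifcfg_text: str) -> str:
    """Return ifcfg content with NM_CONTROLLED=yes set (updated or appended)."""
    if re.search(r"^NM_CONTROLLED=", ifcfg_text, flags=re.MULTILINE):
        return re.sub(r"^NM_CONTROLLED=.*$", "NM_CONTROLLED=yes",
                      ifcfg_text, flags=re.MULTILINE)
    return ifcfg_text.rstrip("\n") + "\nNM_CONTROLLED=yes\n"

# Example usage against the CentOS path from the tip:
#   text = open("/etc/sysconfig/network-scripts/ifcfg-eth0").read()
#   open("/etc/sysconfig/network-scripts/ifcfg-eth0", "w").write(
#       enable_network_manager(text))
# After editing, restart the network service on the VM.
```
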

Requirements for static IP address mapping of guest VMs between source and target
virtual networks
You can explicitly define IP addresses for protected guest VMs that have static IP
addresses at the primary site. On recovery, such guest VMs retain the explicitly defined
IP address. To map static IP addresses of guest VMs between source and target virtual
networks, ensure that:

• Both the primary and the recovery Nutanix clusters run AOS 5.17 or newer.
• The protected guest VMs have static IP addresses at the primary site.

• The protected guest VMs have Nutanix Guest Tools (NGT) version 1.5 or newer
installed.
For information about installing NGT, see Nutanix Guest Tools in Prism Web Console
Guide.
• The protected guest VMs have at least one empty CD-ROM slot.
The empty CD-ROM is required for mounting NGT at the recovery site.
• The protected guest VMs can reach the Controller VM from both the sites.
• The recovery plan selected for failover has VM-level IP address mapping configured.
Virtual Network Design Requirements
You can design the virtual subnets that you plan to use for DR to the recovery site so
that they can accommodate the guest VMs running in the source virtual network.

• To use a virtual network as a recovery virtual network, ensure that the virtual network
meets the following requirements.

• The network prefix is the same as the network prefix of the source virtual network.
For example, if the source network address is 192.0.2.0/24, the network prefix of
the recovery virtual network must also be 24.
• The gateway IP address is the same as the gateway IP address in the source
network. For example, if the gateway IP address in the source virtual network
192.0.2.0/24 is 192.0.2.10, the last octet of the gateway IP address in the recovery
virtual network must also be 10.
• To use a single Nutanix cluster as a target for DR from multiple primary Nutanix
clusters, ensure that the number of virtual networks on the recovery cluster is equal to
the sum of the number of virtual networks on the individual primary Nutanix clusters.
For example, if there are two primary Nutanix clusters, with one cluster having m
networks and the other cluster having n networks, ensure that the recovery cluster has
m + n networks. Such a design ensures that all recovered VMs attach to a network.

• After the recovery of guest VMs to Xi Cloud Services, ensure that the router in your
primary site stops advertising the subnet that hosted the guest VMs.
• The protected guest VMs and Prism Central VM must be on different networks.
If protected guest VMs and Prism Central VM are on the same network, the Prism
Central VM becomes inaccessible when the route to the network is removed after
failover.

• Xi Cloud Services supports the following third-party VPN gateway solutions.

• CheckPoint
• Cisco ASA
• Palo Alto

Note: If you are using the Palo Alto VPN gateway solution, set the MTU value to
1356 in the Tunnel Interface settings. Replication fails with the default MTU value
(1427).

• Juniper SRX
• Fortinet
• SonicWall
• VyOS
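The network-prefix and gateway requirements described above can be checked programmatically. The following sketch (a hypothetical helper using only the Python standard library, not a Nutanix API) tests whether a candidate recovery network is compatible with a source network:

```python
import ipaddress

def is_valid_recovery_network(source_cidr: str, source_gw: str,
                              recovery_cidr: str, recovery_gw: str) -> bool:
    """Check the two requirements: same network prefix length, and the
    gateway keeps the same host part (e.g. the same last octet for a /24)."""
    src = ipaddress.ip_network(source_cidr)
    rec = ipaddress.ip_network(recovery_cidr)
    if src.prefixlen != rec.prefixlen:
        return False
    # Host portion of the gateway address must match between the two networks.
    src_host = int(ipaddress.ip_address(source_gw)) - int(src.network_address)
    rec_host = int(ipaddress.ip_address(recovery_gw)) - int(rec.network_address)
    return src_host == rec_host

# 192.0.2.0/24 with gateway .10 is compatible with 198.51.100.0/24, gateway .10
print(is_valid_recovery_network("192.0.2.0/24", "192.0.2.10",
                                "198.51.100.0/24", "198.51.100.10"))  # True
```
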

Additional Requirements

• Both the primary and recovery Nutanix clusters must have an external IP address.
• Both the primary and recovery Prism Centrals and Nutanix clusters must have a data
services IP address.
• The Nutanix cluster that hosts the Prism Centrals must meet the following requirements.

• The Nutanix cluster must be registered to the Prism Central instance.


• The Nutanix cluster must have an iSCSI data services IP address configured on it.
• The Nutanix cluster must also have sufficient memory to support a hot add of memory
to all Prism Central nodes when you enable Leap. A small Prism Central instance (4
vCPUs, 16 GB memory) requires a hot add of 4 GB, and a large Prism Central instance
(8 vCPUs, 32 GB memory) requires a hot add of 8 GB. If you enable Nutanix Flow, each
Prism Central instance requires an extra hot-add of 1 GB.
• Each node in a scaled-out Prism Central instance must have a minimum of 4 vCPUs and 16
GB memory.
For more information about the scaled-out deployments of a Prism Central, see Leap
Terminology on page 8.
• The protected guest VMs must have Nutanix VM mobility drivers installed.
Nutanix VM mobility drivers are required for accessing the guest VMs after failover. Without
Nutanix VM mobility drivers, the guest VMs become inaccessible after a failover.
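The Prism Central memory hot-add sizing described above can be expressed as a small calculation (a sketch of the stated rule, not an official Nutanix sizing tool):

```python
def required_hot_add_gb(pc_size: str, flow_enabled: bool = False) -> int:
    """Memory hot add (GB) per Prism Central instance when enabling Leap:
    4 GB for a small PC (4 vCPUs, 16 GB), 8 GB for a large PC (8 vCPUs, 32 GB),
    plus 1 GB if Nutanix Flow is enabled."""
    base = {"small": 4, "large": 8}[pc_size]
    return base + (1 if flow_enabled else 0)

print(required_hot_add_gb("small"))        # 4
print(required_hot_add_gb("large", True))  # 9
```
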

Xi Leap Limitations
Consider the following general limitations before configuring protection and disaster recovery
(DR) with Xi Leap. Along with the general limitations, there are specific limitations of protection
with the following supported replication schedules.

• For specific limitations of protection with Asynchronous replication schedule (1 hour or
greater RPO), see Asynchronous Replication Limitations (Xi Leap) on page 182.
• For specific limitations of protection with NearSync replication schedule (1–15 minutes RPO),
see NearSync Replication Limitations (Xi Leap) on page 217.

Virtual Machine Limitations

• You cannot start or replicate the following guest VMs at Xi Cloud Services.

• VMs configured with a GPU resource.


• VMs configured with four or more vNUMA sockets.
• VMs configured with more than 24 vCPUs.
• VMs configured with more than 128 GB memory.

• You cannot deploy witness VMs.


• You cannot protect multiple guest VMs that use disk sharing (for example, multi-writer
sharing, Microsoft Failover Clusters, Oracle RAC).

• You cannot protect VMware fault tolerance enabled guest VMs.

• You cannot recover vGPU console-enabled guest VMs efficiently.
When you perform DR of vGPU console-enabled guest VMs, the VMs recover with the
default VGA console (without any alert) instead of the vGPU console. The guest VMs fail to
recover when you perform cross-hypervisor disaster recovery (CHDR).
• You cannot recover guest VMs with vGPU.
However, you can manually restore guest VMs with vGPU.


• You cannot configure NICs for a guest VM across both the virtual private clouds (VPC).
You can configure NICs for a guest VM associated with either production or test VPC.

Volume Groups Limitation


You cannot protect volume groups.

Network Segmentation Limitation


You cannot apply network segmentation for management traffic (any traffic not on the
backplane network) in Xi Leap.
You get an error when you try to enable network segmentation for management traffic on
a Leap enabled Nutanix Cluster or enable Leap in a network segmentation enabled Nutanix
cluster. For more information about network segmentation, see Securing Traffic Through
Network Segmentation in the Security Guide.

Note: However, you can apply network segmentation for backplane traffic at the primary and
recovery clusters. Nutanix does not recommend this because when you perform a planned
failover of guest VMs having network segmentation for backplane enabled, the guest VMs fail to
recover and the guest VMs at the primary AZ are removed.

Self-Service Restore
You cannot perform self-service restore.

Virtual Network Limitations
Although there is no limit to the number of VLANs that you can create, only the first 500
VLANs are listed in the Network Settings drop-down while creating a recovery plan. For more
information about VLANs in the recovery plan, see Nutanix Virtual Networks on page 174.

Xi Leap Configuration Maximums


For the maximum number of entities you can configure with different replication schedules and
perform failover (disaster recovery) on, see Nutanix Configuration Maximums. These limits have
been tested for Xi Leap production deployments. Nutanix does not guarantee that the system
can operate beyond these limits.

Tip: Upgrade your NCC version to 3.10.1 to get configuration alerts.

Xi Leap Recommendations
Nutanix recommends the following best practices for configuring protection and disaster
recovery (DR) with Xi Leap.

Recommendation for Migrating Protection Domains to Protection Policies


You can protect a guest VM either with legacy DR solution (protection domain-based) or with
Leap. To protect a legacy DR-protected guest VM with Leap, you must migrate the guest
VM from protection domain to a protection policy. During the migration, do not delete the
guest VM snapshots in the protection domain. Nutanix recommends keeping the guest VM
snapshots in the protection domain until the first recovery point for the guest VM is available on
Prism Central. For more information, see Migrating Guest VMs from a Protection Domain to a
Protection Policy on page 232.

Recommendation for Virtual Networks

• Map the networks while creating a recovery plan in Prism Central.


• Recovery plans do not support overlapping subnets in a network-mapping configuration. Do
not create virtual networks that have the same name or overlapping IP address ranges.

General Recommendations

• Create all entities (protection policies, recovery plans, and VM categories) at the primary
availability zone (site).
• Upgrade Prism Central before upgrading the Nutanix clusters (Prism Elements) registered to
it.
• Do not include the guest VMs protected with Asynchronous, NearSync, and Synchronous
replication schedules in the same recovery plan. You can include guest VMs protected with
Asynchronous or NearSync replication schedules in the same recovery plan. However, if
you combine these guest VMs with the guest VMs protected by Synchronous replication
schedules in a recovery plan, the recovery fails.

Xi Leap Service-Level Agreements (SLAs)


Xi Leap is essentially an extension of Leap to Xi Cloud Services. Xi Leap enables protection of
your guest VMs and disaster recovery (DR) to Xi Cloud Services. With reverse synchronization,
Xi Leap can protect your guest VMs and enable DR from Xi Cloud Services to a Nutanix cluster
at an on-prem availability zone (site). A Nutanix cluster is essentially an AHV or an ESXi cluster
running AOS. In addition to performing DR from AHV clusters to Xi Cloud Services (only AHV),

you can also perform cross-hypervisor disaster recovery (CHDR)—DR from ESXi clusters to Xi
Cloud Services.
You can protect your guest VMs with the following replication schedules.

• Asynchronous (1 hour or greater RPO). For information about protection with Asynchronous
replication in Xi Leap, see Protection with Asynchronous Replication and DR (Xi Leap) on
page 180.
• NearSync (1–15 minute RPO). For information about protection with NearSync replication in
Xi Leap, see Protection with NearSync Replication and DR (Xi Leap) on page 212.

Xi Leap Views
The disaster recovery views enable you to perform CRUD operations on the following types of
Leap entities.

• Configured entities (for example, availability zones, protection policies, and recovery plans)
• Created entities (for example, VMs and recovery points)
Some views available in the Xi Cloud Services differ from the corresponding view in on-prem
Prism Central. For example, the option to connect to an availability zone is on the Availability
Zones page in an on-prem Prism Central, but not on the Availability Zones page in Xi Cloud
Services. However, the views of both user interfaces are largely the same. This chapter
describes the views of Xi Cloud Services.

Availability Zones View in Xi Cloud Services


The Availability Zones view lists all of your paired availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you
can perform in this view.

Figure 83: AZs View

Table 16: Availability Zones View Fields

Field Description

Name: Name of the availability zone.
Region: Region to which the availability zone belongs.
Type: Type of availability zone. Availability zones in Xi Cloud Services are shown as type Xi.
Availability zones that are backed by on-prem Prism Central instances are shown as type
physical. The availability zone that you are logged in to is shown as a local availability zone.
Connectivity Status: Status of connectivity between the local availability zone and the paired
availability zone.

Table 17: Workflows Available in the Availability Zones View

Workflow Description

Connect to Availability Zone (on-prem Prism Central only): Connect to an on-prem Prism
Central or to Xi Cloud Services for data replication.

Table 18: Actions Available in the Actions Menu

Action Description

Disconnect: Disconnect the remote availability zone. When you disconnect an availability zone,
the pairing is removed.

Protection Policies View in Xi Cloud Services


The Protection Policies view lists all configured protection policies from all availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you
can perform in this view.

Figure 84: Protection Policies View

Table 19: Protection Policies View Fields

Field Description

Name: Name of the protection policy.
Primary Location: Replication source site for the protection policy.
Recovery Location: Replication target site for the protection policy.
RPO: Recovery point objective for the protection policy.
Remote Retention: Number of retention points at the remote site.
Local Retention: Number of retention points at the local site.

Table 20: Workflows Available in the Protection Policies View

Workflow Description

Create protection policy: Create a protection policy.

Table 21: Actions Available in the Actions Menu

Action Description

Update: Update the protection policy.
Clone: Clone the protection policy.
Delete: Delete the protection policy.

Recovery Plans View in Xi Cloud Services
The Recovery Plans view lists all configured recovery plans from all availability zones.
The following figure is a sample view, and the tables describe the fields and the actions that you
can perform in this view.

Figure 85: Recovery Plans View

Table 22: Recovery Plans View Fields

Field Description

Name: Name of the recovery plan.
Source: Replication source site for the recovery plan.
Destination: Replication target site for the recovery plan.
Entities: Sum of the following VMs:

• Number of local, live VMs that are specified in the recovery plan.
• Number of remote VMs that the recovery plan can recover at this site.

Last Validation Status: Status of the most recent validation of the recovery plan.
Last Test Status: Status of the most recent test performed on the recovery plan.

Table 23: Workflows Available in the Recovery Plans View

Workflow Description

Create Recovery Plan: Create a recovery plan.

Table 24: Actions Available in the Actions Menu

Action Description

Validate: Validates the recovery plan to ensure that the VMs in the recovery plan have a valid
configuration and can be recovered.
Test: Test the recovery plan.
Update: Update the recovery plan.
Failover: Perform a failover.
Delete: Delete the recovery plan.

Dashboard Widgets in Xi Cloud Services


The Xi Cloud Services dashboard includes widgets that display the statuses of configured
protection policies and recovery plans. If you have not configured these entities, the widgets
display a summary of the steps required to get started with Leap.
To view these widgets, click the Dashboard tab.
The following figure is a sample view of the dashboard widgets.

Figure 86: Dashboard Widgets for Xi Leap

Enabling Leap in the On-Prem Site


To perform disaster recovery (DR) from Xi Cloud Services to a Nutanix cluster at an on-
prem availability zone (site), enable Leap at the on-prem site (Prism Central) only. You need
not enable Leap in the Xi Cloud Services portal; Xi Cloud Services does that by default for
you. Without enabling Leap, you can configure protection policies and recovery plans that
synchronize to the on-prem site but you cannot perform failover and failback operations.
To enable Leap at the on-prem site, see Enabling Leap for On-Prem Site on page 32.

Xi Leap Environment Setup


You can set up a secure environment to enable replication between an on-prem site and Xi
Cloud Services with virtual private network (VPN). To configure the required environment,
perform the following steps.

1. Pair your on-prem availability zone (site) with Xi Cloud Services. For more information about
pairing, see Pairing Availability Zones (Xi Leap) on page 154.
2. Set up an on-prem VPN solution.
3. Enable VPN on the production virtual private cloud by using the Xi Cloud Services portal.
4. Set up a VPN client as a VM in Xi Cloud Services to enable connectivity to the applications
that have failed over to the Xi Cloud Services.
5. Configure policy-based routing (PBR) rules for the VPN to successfully work with the Xi
Cloud Services. If you have a firewall in the Xi Cloud Services and a floating IP address is
assigned to the firewall, create a PBR policy in the Xi Cloud Services to configure the firewall
as the gateway to the Internet. For example, specify 10.0.0.2/32 (private IP address of the
firewall) in the Subnet IP. For more information, see Policy Configuration in Xi Infrastructure
Service Administration Guide .
6. Configure the custom DNS in your virtual private cloud in the Xi Cloud Services. For more
information, see Virtual Private Cloud Management in Xi Infrastructure Service Administration
Guide .

Note: For more information about Xi Cloud Services, see Xi Infrastructure Service
Administration Guide.

Pairing Availability Zones (Xi Leap)


To perform disaster recovery (DR) from Xi Cloud Services to a Nutanix cluster at an on-prem
availability zone (site), pair only the on-prem site (Prism Central) to Xi Cloud Services. For
reverse synchronization, you need not pair again from the Xi Cloud Services portal; Xi Cloud
Services captures the pairing configuration from the on-prem site that pairs with it.
To pair an on-prem site with Xi Cloud Services, see Pairing Availability Zones (Leap) on
page 33.

VPN Configuration (On-prem and Xi Cloud Services)


Xi Cloud Services enables you to set up a secure VPN connection between your on-prem sites
and Xi Cloud Services to enable end-to-end disaster recovery services of Leap. A VPN solution
between your on-prem site and Xi Cloud Services enables secure communication between
your on-prem Prism Central instance and the production virtual private cloud (VPC) in Xi Cloud
Services. If your workload fails over to Xi Cloud Services, the communication between the on-
prem resources and failed over resources in Xi Cloud Services takes place over an IPSec tunnel
established by the VPN solution.

Note: Set up the VPN connection before data replication begins.

You can connect multiple on-prem sites to Xi Cloud Services. If you have multiple remote
sites, you can set up secure VPN connectivity between each of your remote sites and Xi Cloud
Services. With this configuration, you do not need to force the traffic from your remote site
through your main site to Xi Cloud Services.
A VPN solution to connect to Xi Cloud Services includes a VPN gateway appliance in the Xi
Cloud and a VPN gateway appliance (remote peer VPN appliance) in your on-prem site. A VPN
gateway appliance learns about the local routes, establishes an IPSec tunnel with its remote
peer, exchanges routes with its peer, and directs network traffic through the VPN tunnel.
After you complete the VPN configuration in the Xi Cloud Services portal, Nutanix creates a
virtual VPN gateway appliance in the Xi Cloud. To set up a remote peer VPN gateway appliance
in your on-prem site, you can either use the On Prem - Nutanix VPN solution (provided by
Nutanix) or use a third-party VPN solution:

• On Prem - Nutanix (recommended): If you select this option, Nutanix creates a VPN gateway
VM (remote peer appliance) on your on-prem cluster, connects the appliance to your

network, and establishes an IPsec tunnel with the VPN gateway that is running in the Xi
Cloud.
The Nutanix VPN controller runs as a service in the Xi Cloud and on the on-prem Nutanix
cluster and is responsible for the creation, setup, and lifecycle maintenance of the VPN
gateway appliance (in the Xi Cloud and on-prem). The VPN controller deploys the virtual
VPN gateway appliance in the Xi Cloud after you complete the VPN configuration in the
Xi Cloud Services portal. The on-prem VPN controller deploys the virtual VPN gateway
appliance on the on-prem cluster in the subnet you specify when you configure a VPN
gateway in the Xi Cloud Services portal.
The virtual VPN gateway appliance in the Xi Cloud and VPN gateway VM (peer appliance) in
your on-prem cluster each consume 1 physical core, 4 GB RAM, and 10 GB storage.
• On Prem - Third Party: If you select this option, you must manually set up a VPN solution as
an on-prem VPN gateway (peer appliance) that can establish an IPsec tunnel with the VPN
gateway VM in the Xi Cloud. The on-prem VPN gateway (peer appliance) can be a virtual
or hardware appliance. See On-Prem - Third-Party VPN Solution on page 163 for a list of
supported third-party VPN solutions.

VPN Configuration Entities


To set up a secure VPN connection between your on-prem sites and Xi Cloud Services,
configure the following entities in the Xi Cloud Services portal:

• VPN Gateway: Represents the gateway of your VPN appliances.


VPN gateways are of the following types:

• Xi Gateway: Represents the Xi VPN gateway appliance


• On Prem - Nutanix Gateway: Represents the VPN gateway appliance at your on-prem site
if you are using the on-prem Nutanix VPN solution.
• On Prem - Third Party Gateway: Represents the VPN gateway appliance at your on-prem
site if you are using your own VPN solution (provided by a third-party vendor).
• VPN Connection: Represents the VPN IPSec tunnel established between a VPN gateway in
the Xi Cloud and VPN gateway in your on-prem site. When you create a VPN connection,
you select a Xi gateway and on-prem gateway between which you want to create the VPN
connection.
You configure a VPN gateway in the Xi Cloud and at each of the on-prem sites you want to
connect to the Xi Cloud. You then configure a VPN connection between a VPN gateway in the
Xi Cloud and VPN gateway in your on-prem site.

Single-Site Connection
If you want to connect only one on-prem site to the Xi Cloud, configure the following entities in
the Xi Cloud Services portal:
1. One Xi gateway to represent the Xi VPN gateway appliance
2. One on-prem gateway (On-prem - Nutanix Gateway or on-prem - third-party Gateway) to
represent the VPN gateway appliance at your on-prem site
3. One VPN connection to connect the two VPN gateways

Figure 87: Single-Site Connection

Multi-Site Connection
If you want to connect multiple on-prem sites to the Xi Cloud, configure the following entities in
the Xi Cloud Services portal:
1. One Xi gateway to represent the Xi VPN gateway appliance
2. On-prem gateways (On-prem - Nutanix Gateway or on-prem - third-party Gateway) for each
on-prem site
3. VPN connections to connect the Xi gateway and the on-prem gateway at each on-prem site
For example, if you want to connect two on-prem sites to the Xi Cloud, configure the following:
1. One Xi gateway
2. Two on-prem gateways for the two on-prem sites
3. Two VPN connections

Figure 88: Multi-Site Connection for Less Than 1 Gbps Bandwidth

One Xi VPN gateway provides 1 Gbps of aggregate bandwidth for IPSec traffic. Therefore,
connect only as many on-prem VPN gateways to one Xi VPN gateway as its 1 Gbps of
aggregate bandwidth can accommodate.
If you require an aggregate bandwidth of more than 1 Gbps, configure multiple Xi VPN
gateways.
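The sizing rule above reduces to a simple ceiling calculation (illustrative only, not an official sizing tool):

```python
import math

XI_GATEWAY_BANDWIDTH_GBPS = 1  # aggregate IPSec bandwidth per Xi VPN gateway

def xi_gateways_needed(total_bandwidth_gbps: float) -> int:
    """Minimum number of Xi VPN gateways for the required aggregate bandwidth."""
    return max(1, math.ceil(total_bandwidth_gbps / XI_GATEWAY_BANDWIDTH_GBPS))

print(xi_gateways_needed(0.5))  # 1
print(xi_gateways_needed(2.5))  # 3
```
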

Figure 89: Multi-Site Connection for More Than 1 Gbps Bandwidth

On-Prem - Nutanix VPN Solution


You can use the on-prem - Nutanix VPN solution to set up VPN between your on-prem site and
Xi Cloud Services. If you select this option, you are using an end-to-end VPN solution provided
by Nutanix and you do not need to use your own VPN solution to connect to Xi Cloud Services.
After you complete the VPN configuration in the Xi Cloud Services portal, Nutanix creates a
virtual VPN gateway appliance in the Xi Cloud. The On Prem - Nutanix VPN solution creates a
VPN gateway VM (remote peer appliance) on your on-prem cluster, connects the appliance to
your network, and establishes an IPsec tunnel with the VPN gateway VM that is running in the
Xi Cloud.
Following is the workflow if you choose the On Prem - Nutanix VPN solution to set up a VPN
connection between your on-prem site and Xi Cloud Services.
1. Create one or more Xi VPN gateways.
2. The VPN controller running in Xi Cloud Services creates a VPN gateway VM in the Xi Cloud.
The Xi VPN gateway VM runs in your (tenant) overlay network.
3. Create one or more on-prem VPN gateways.
Create a VPN gateway for each on-prem site that you want to connect to the Xi Cloud.
4. Create one or more VPN connections.
Create a VPN connection between each on-prem site (on-prem VPN gateway) and Xi Cloud
(Xi gateway).
5. The VPN controller creates a VPN gateway VM on the on-prem cluster in the subnet you
specify when you create an on-prem VPN gateway. The VPN gateway VM becomes the peer
appliance to the VPN gateway VM in the Xi Cloud.

6. Both VPN appliances are now configured and proceed to perform the following:
1. An on-prem router communicates the on-prem routes to the on-prem VPN gateway by
using iBGP or OSPF.
2. The Xi VPN controller communicates the Xi subnets to the Xi VPN gateway VM.
3. The on-prem VPN gateway VM then establishes a VPN IPsec tunnel with the Xi VPN
gateway VM. Both appliances establish an eBGP peering session over the IPsec tunnel and
exchange routes.
4. The on-prem VPN gateway VM publishes the Xi subnet routes to the on-prem router by
using iBGP or OSPF.

Nutanix VPN Solution Requirements

In your on-prem site, ensure the following before you configure VPN on Xi Cloud Services:
1. The Prism Central instance and cluster are running AOS 5.11 or newer for AHV and AOS 5.19
or newer for ESXi.
2. A router with iBGP, OSPF, or Static support to communicate the on-prem routes to the on-
prem VPN gateway VM.
3. Depending on whether you are using iBGP or OSPF, ensure that you have one of the
following:

• Peer IP (for iBGP): The IP address of the on-prem router to exchange routes with the VPN
gateway VM.
• Area ID (for OSPF): The OSPF area ID for the VPN gateway in the IP address format.
4. Determine the following details for the deployment of the on-prem VPN gateway VM.

• Subnet UUID: The UUID of the subnet of the on-prem cluster in which you want to install
the on-prem VPN gateway VM. Log on to your on-prem Prism Central web console to
determine the UUID of the subnet.
• Public IP address of the VPN Gateway Device: A public WAN IP address that you want
the on-prem gateway to use to communicate with the Xi VPN gateway appliance.
• VPN VM IP Address: A static IP address that you want to allocate to the on-prem VPN
gateway VM.
• IP Prefix Length: The subnet mask in CIDR format of the subnet on which you want to
install the on-prem VPN gateway VM.
• Default Gateway IP: The gateway IP address for the on-prem VPN gateway appliance.
• On Prem Gateway ASN: The ASN must not be the same as any of your on-prem BGP ASNs.
If you already have a BGP environment in your on-prem site, the customer gateway ASN is
the ASN for your organization. If you do not have a BGP environment in your on-prem site,
you can choose any number, for example, a number in the 65000 range.
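The deployment details above can be gathered into a structure such as the following. The field names are illustrative, not the actual Xi Cloud Services schema; the ASN check simply mirrors the stated rule that the gateway ASN must not collide with any existing on-prem BGP ASN:

```python
from dataclasses import dataclass

@dataclass
class OnPremVpnGatewayConfig:
    subnet_uuid: str        # subnet of the on-prem cluster hosting the VPN VM
    public_ip: str          # public WAN IP used to reach the Xi VPN gateway
    vpn_vm_ip: str          # static IP allocated to the on-prem VPN gateway VM
    ip_prefix_length: int   # subnet mask in CIDR format
    default_gateway_ip: str
    gateway_asn: int        # must differ from all existing on-prem BGP ASNs

def validate_asn(gateway_asn: int, existing_onprem_asns: set) -> bool:
    """The on-prem gateway ASN must not match any existing on-prem BGP ASN."""
    return gateway_asn not in existing_onprem_asns

print(validate_asn(65001, {65000, 64512}))  # True
print(validate_asn(65000, {65000, 64512}))  # False
```
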

On-Prem Site Firewall Port Requirements

Configure rules for ports in your on-prem firewall depending on your deployment scenario.

On-Prem Behind a Network Address Translation or Firewall Device


In this scenario, the IPSec tunnel terminates behind a network address translation (NAT) or
firewall device. For NAT to work, open UDP ports 500 and 4500 in both directions. Ports 1024–
1034 are ephemeral ports used by the CVMs.

Enable the on-prem VPN gateway to allow the traffic according to the rules described in the
Port Rules table.

IPSec Terminates on the Firewall Device


In this scenario, you do not need to open the ports for NAT (500 and 4500), but enable the on-
prem VPN gateway to allow the traffic according to the rules described in the Port Rules table.
In the following table, PC subnet refers to the subnet where your on-prem Prism Central is
running. The Xi infrastructure load balancer route is the route that carries traffic for the Xi CVMs
and Prism Central. You receive this information when you begin using Xi Cloud Services.

Table 25: Port Rules

| Source address | Destination address | Source port | Destination port |
|----------------|---------------------|-------------|------------------|
| PC subnet | Load balancer route advertised | Any | 1024–1034 |
| Xi infrastructure load balancer route | PC and CVM subnet | Any | 2020, 2009, 9440 |

The following port requirements are applicable only if you are using the Nutanix VPN solution.

| Source address | Destination address | Source port | Destination port |
|----------------|---------------------|-------------|------------------|
| Nutanix VPN VM | 8.8.8.8 and 8.8.4.4 (IP addresses of the DNS server) | VPN VM DNS | UDP port 53 |
| Nutanix VPN VM | time.google.com, 0.pool.ntp.org, 1.pool.ntp.org, 2.pool.ntp.org (NTP servers) | VPN VM NTP | UDP port 123 |
| Nutanix VPN VM | ICMP ping to NTP servers | NA | NA |
| CVM IP address in AHV clusters | HTTPS request to the Internet | AHV hosts | HTTPS port 443 |
| CVM IP address in ESXi clusters | HTTPS and FTP requests to the Internet | ESXi hosts | HTTPS port 443 and FTP port 21 |
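For the NAT scenario described above, the required openings can also be expressed as firewall rules. The following is a hedged sketch that prints iptables-style commands rather than applying them; PC_SUBNET and LB_ROUTE are placeholders for your PC subnet and the Xi infrastructure load balancer route, and the exact rules depend on your firewall product.

```shell
# Sketch: print (do not apply) iptables-style rules for the NAT scenario.
# PC_SUBNET and LB_ROUTE are placeholders for your PC subnet and the
# Xi infrastructure load balancer route.
emit_nat_rules() {
  # IPSec over NAT needs UDP 500 (IKE) and 4500 (NAT-T) in both directions.
  for port in 500 4500; do
    echo "iptables -A INPUT  -p udp --dport $port -j ACCEPT"
    echo "iptables -A OUTPUT -p udp --dport $port -j ACCEPT"
  done
  # CVM replication traffic from the PC subnet to the load balancer route.
  echo "iptables -A FORWARD -p tcp -s PC_SUBNET -d LB_ROUTE --dport 1024:1034 -j ACCEPT"
}

emit_nat_rules
```

Review the printed rules against the Port Rules table before applying them on your firewall.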

Creating a Xi VPN Gateway

Create a VPN gateway to represent the Xi VPN gateway appliance.

About this task


Perform the following to create a Xi VPN gateway.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Click Create VPN Gateway.

Figure 90: Create a Xi Gateway

The Create VPN Gateway window appears.

3. Do the following in the indicated fields.

a. Name: Enter a name for the VPN gateway.


b. VPC: Select the production VPC.
c. Type: Select Xi Gateway.
d. Routing Protocol: Select eBGP or Static to set up a routing protocol between the Xi and
on-prem gateways.
e. (eBGP only): Select this option if you want to set up the eBGP routing protocol between
the Xi and on-prem gateways. Do the following in the indicated fields.

• In the ASN field, set an ASN for the Xi gateway. Ensure that the Xi gateway ASN is
different from the on-prem gateway ASN.
• In the eBGP Password field, set up a password for the eBGP session that is established
between the on-prem VPN gateway and Xi VPN gateway. The eBGP password can be
any string, preferably alphanumeric.
f. (Static only) If you select this option, manually set up static routes between the Xi and on-
prem gateways.
For more information, see Adding a Static Route in Xi Infrastructure Service
Administration Guide.

4. Click Save.
The Xi gateway you create is displayed in the VPN Gateways page.

Creating an On-Prem VPN Gateway (Nutanix)

Create a VPN gateway to represent the on-prem VPN gateway appliance.

About this task


Perform the following to create an on-prem VPN gateway.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Click Create VPN Gateway.

Figure 91: Create an On-Prem Gateway

The Create VPN Gateway window appears.

3. Do the following in the indicated fields.

a. Name: Enter a name for the VPN gateway.


b. Type: Select On Prem - Nutanix.
c. Automatically add route in PC and PE CVMs to enable replication: Select this option to
automatically enable traffic between the on-prem CVMs and the CVMs in Xi Cloud Services.
If you select this option, a route to the CVMs in Xi Cloud Services is added with the on-
prem VPN gateway as the next-hop. Therefore, even if you choose to have static routes
between your on-prem router and the on-prem gateway, you do not need to manually
add those static routes (see step g).

Note: This option is only for the CVM-to-CVM (on-prem CVM and Xi Cloud CVMs) traffic.

d. Under Routing Protocol (between Xi Gateway and On Prem Nutanix Gateway), do the
following to set up the eBGP routing protocol between the Xi and on-prem gateways:

• In the ASN field, enter the ASN for your on-prem gateway. If you do not have a BGP
environment in your on-prem site, you can choose any number. For example, you can
choose a number in the 65000 range. Ensure that the Xi gateway ASN and on-prem
gateway ASN are not the same.

• In the eBGP Password field, enter the same eBGP password as the Xi gateway.
e. Subnet UUID: Enter the UUID of the subnet of the on-prem cluster in which you want to
install the on-prem VPN gateway VM. Log on to your on-prem Prism Central web console
to determine the UUID of the subnet.
f. Under IP Address Information, do the following in the indicated fields:

• Public IP Address of the On Premises VPN Gateway Device: Enter a public WAN IP
address for the on-prem VPN gateway VM.
• VPN VM IP Address: Enter a static IP address that you want to allocate to the on-prem
VPN gateway VM.
• IP Prefix Length: Enter the prefix length (for example, 24) of the subnet on which you
want to install the on-prem VPN gateway VM.
• Default Gateway IP: Enter the gateway IP address of the subnet on which you want to
install the on-prem VPN gateway VM.
g. Under Routing Protocol Configuration, do the following in the indicated fields:

• In the Routing Protocol drop-down list, select the routing protocol (OSPF, iBGP, or
Static) to use between the on-prem router and the on-prem gateway.
• (Static only) If you select Static, manually add these routes in Xi Cloud Services.
For more information, see Adding a Static Route in Xi Infrastructure Service
Administration Guide.

Note: You do not need to add static routes for CVM-to-CVM traffic (see step c).

• (OSPF only) If you select OSPF, in the Area ID field, type the OSPF area ID for the VPN
gateway in the IP address format. In the Password Type field, select MD5 and type a
password for the OSPF session.
• (iBGP only) If you select iBGP, in the Peer IP field, type the IP address of the on-prem
router to exchange routes with the VPN gateway VM. In the Password field, type the
password for the iBGP session.

4. Click Save.
The on-prem gateway you create is displayed in the VPN Gateways page.

Creating a VPN Connection

Create a VPN connection to establish an IPSec tunnel between a VPN gateway in the Xi
Cloud and a VPN gateway in your on-prem site. Select the Xi gateway and on-prem gateway
between which you want to create the VPN connection.

About this task


Perform the following to create a VPN connection.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Connections.

2. Click Create VPN Connection.

Figure 92: Create a VPN Connection

The Create VPN Connection window appears.

3. Do the following in the indicated fields:

a. Name: Enter a name for the VPN connection.


b. Description: Enter a description for the VPN connection.
c. IPSec Secret: Enter an alphanumeric string as the IPSec secret for the VPN connection.
d. Xi Gateway: Select the Xi gateway for which you want to create this VPN connection.
e. On Premises Gateway: Select the on-prem gateway for which you want to create this
VPN connection.
f. Dynamic Route Priority: This field is optional. Set it if you have multiple routes to the
same destination. For example, if you have VPN connection 1 and VPN connection 2 and
you want VPN connection 1 to take precedence over VPN connection 2, set a higher
priority for VPN connection 1 than for VPN connection 2. The higher the priority number,
the higher the precedence of that connection. You can set a priority number from 10
through 1000.
For more information, see the Routes Precedence section in Routes Management in the Xi
Infrastructure Service Administration Guide.

4. Click Save.
The VPN connection you create is displayed in the VPN Connections page.
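The priority semantics described in step 3f can be illustrated with a small helper: given "name:priority" pairs, the connection with the highest number takes precedence. This is an illustrative sketch only; the connection names are hypothetical.

```shell
# Sketch: pick the preferred VPN connection from "name:priority" pairs.
# The higher priority number wins (valid range is 10-1000).
pick_preferred() {
  printf '%s\n' "$@" | sort -t: -k2 -n | tail -n 1 | cut -d: -f1
}

pick_preferred "vpn-connection-1:200" "vpn-connection-2:100"   # prints vpn-connection-1
```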

On-Prem - Third-Party VPN Solution


You can use your own VPN solution to connect your on-prem site to Xi Cloud Services. If you
select this option, you must manually set up a VPN solution by using a supported third-party
VPN solution as an on-prem VPN gateway (peer appliance) that can establish an IPsec tunnel
with the VPN gateway VM in the Xi Cloud.
Following is the workflow if you want to use a third-party VPN solution to set up a VPN
connection between your on-prem site and Xi Cloud Services.
1. Create one or more Xi VPN gateways.
2. The VPN controller running in Xi Cloud Services creates a VPN gateway VM in the Xi Cloud.
The Xi VPN gateway VM runs in your (tenant) overlay network.
3. Create one or more on-prem VPN gateways.
Create a VPN gateway for each on-prem site that you want to connect to the Xi Cloud.

4. Create one or more VPN connections.
Create a VPN connection to create an IPSec tunnel between each on-prem site (on-prem
VPN gateway) and Xi Cloud (Xi gateway).
5. Configure a peer VPN gateway appliance (hardware or virtual) in your on-prem site.
Depending upon your VPN solution, you can download detailed instructions about how to
configure your on-prem VPN gateway appliance. For more information, see Downloading the
On-Prem VPN Appliance Configuration on page 169.
Xi Cloud Services supports the following third-party VPN gateway solutions.

• CheckPoint
• Cisco ASA
• PaloAlto

Note: If you are using the Palo Alto VPN gateway solution, set the MTU value to 1356 in
the Tunnel Interface settings. The replication fails for the default MTU value (1427).

• Juniper SRX
• Fortinet
• SonicWall
• VyOS

Third-Party VPN Solution Requirements

Ensure the following in your on-prem site before you configure VPN in Xi Cloud Services.
1. A third-party VPN solution in your on-prem site that functions as an on-prem VPN gateway
(peer appliance).
2. The on-prem VPN gateway appliance supports the following.

• IPSec IKEv2
• Tunnel interfaces
• External Border Gateway Protocol (eBGP)
3. Note the following details of the on-prem VPN gateway appliance.

• On Prem Gateway ASN: Assign an ASN for your on-prem gateway. If you already have
a BGP environment in your on-prem site, the customer gateway is the ASN for your
organization. If you do not have a BGP environment in your on-prem site, you can choose
any number. For example, you can choose a number in the 65000 range.
• Xi Gateway ASN: Assign an ASN for the Xi gateway. The Xi gateway ASN must not be the
same as the on-prem gateway ASN.
• eBGP Password: The eBGP password is the shared password between the Xi gateway and
on-prem gateway. Set the same password for both the gateways.
• Public IP address of the VPN Gateway Device: Ensure that the public IP address of the
on-prem VPN gateway appliance can reach the public IP address of Xi Cloud Services.
4. The on-prem VPN gateway appliance can route the traffic from the on-prem CVM subnets to
the established VPN tunnel.

5. Ensure that the following ports are open in your on-prem VPN gateway appliance.

• IKEv2: Port number 500 of the payload type UDP.


• IPSec: Port number 4500 of the payload type UDP.
• BGP: Port number 179 of the payload type TCP.
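Before configuring the VPN connection, you can check these ports from a host with a route to the appliance. The following sketch prints nc probe commands rather than running them; GATEWAY_IP is a placeholder for the public IP of your on-prem VPN appliance, and UDP probes with nc are indicative only (a lack of response does not prove the port is closed).

```shell
# Sketch: print nc probe commands for the ports the third-party appliance must allow.
# GATEWAY_IP is a placeholder for the public IP of your on-prem VPN appliance.
emit_port_checks() {
  gw=$1
  echo "nc -z -v -u $gw 500   # IKEv2 (UDP)"
  echo "nc -z -v -u $gw 4500  # IPSec NAT-T (UDP)"
  echo "nc -z -v $gw 179      # eBGP (TCP)"
}

emit_port_checks GATEWAY_IP
```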

On-Prem Site Firewall Port Requirements

Configure rules for ports in your on-prem firewall depending on your deployment scenario.

On-Prem Behind a Network Address Translation or Firewall Device


In this scenario, the IPSec tunnel terminates behind a network address translation (NAT) or
firewall device. For NAT to work, open UDP ports 500 and 4500 in both directions. Ports 1024–
1034 are ephemeral ports used by the CVMs.
Enable the on-prem VPN gateway to allow the traffic according to the rules described in the
Port Rules table.

IPSec Terminates on the Firewall Device


In this scenario, you do not need to open the NAT ports (500 and 4500), but you must enable
the on-prem VPN gateway to allow the traffic according to the rules described in the Port
Rules table. In the following table, PC subnet refers to the subnet where your on-prem Prism
Central is running. The Xi infrastructure load balancer route is the route over which the traffic
for the Xi CVMs and PC flows. You receive this information when you begin using Xi Cloud
Services.

Table 26: Port Rules

| Source address | Destination address | Source port | Destination port |
|----------------|---------------------|-------------|------------------|
| PC subnet | Load balancer route advertised | Any | 1024–1034 |
| Xi infrastructure load balancer route | PC and CVM subnet | Any | 2020, 2009, 9440 |

The following port requirements are applicable only if you are using the Nutanix VPN solution.

| Source address | Destination address | Source port | Destination port |
|----------------|---------------------|-------------|------------------|
| Nutanix VPN VM | 8.8.8.8 and 8.8.4.4 (IP addresses of the DNS server) | VPN VM DNS | UDP port 53 |
| Nutanix VPN VM | time.google.com, 0.pool.ntp.org, 1.pool.ntp.org, 2.pool.ntp.org (NTP servers) | VPN VM NTP | UDP port 123 |
| Nutanix VPN VM | ICMP ping to NTP servers | NA | NA |
| CVM IP address in AHV clusters | HTTPS request to the Internet | AHV hosts | HTTPS port 443 |
| CVM IP address in ESXi clusters | HTTPS and FTP requests to the Internet | ESXi hosts | HTTPS port 443 and FTP port 21 |

Creating a Xi VPN Gateway

Create a VPN gateway to represent the Xi VPN gateway appliance.

About this task


Perform the following to create a Xi VPN gateway.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Click Create VPN Gateway.

Figure 93: Create a Xi Gateway

The Create VPN Gateway window appears.

3. Do the following in the indicated fields.

a. Name: Enter a name for the VPN gateway.


b. VPC: Select the production VPC.
c. Type: Select Xi Gateway.
d. Routing Protocol: Select eBGP or Static to set up a routing protocol between the Xi and
on-prem gateways.
e. (eBGP only): Select this option if you want to set up the eBGP routing protocol between
the Xi and on-prem gateways. Do the following in the indicated fields.

• In the ASN field, set an ASN for the Xi gateway. Ensure that the Xi gateway ASN is
different from the on-prem gateway ASN.
• In the eBGP Password field, set up a password for the eBGP session that is established
between the on-prem VPN gateway and Xi VPN gateway. The eBGP password can be
any string, preferably alphanumeric.
f. (Static only) If you select this option, manually set up static routes between the Xi and on-
prem gateways.
For more information, see Adding a Static Route in Xi Infrastructure Service
Administration Guide.

4. Click Save.
The Xi gateway you create is displayed in the VPN Gateways page.

Creating an On-Prem VPN Gateway (Third-Party)

Create a VPN gateway to represent the on-prem VPN gateway appliance.

Before you begin


Ensure that you have all the details about your on-prem VPN appliance as described in Third-
Party VPN Solution Requirements on page 164.

About this task


Perform the following to create an on-prem VPN gateway.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Click Create VPN Gateway.

Figure 94: Create an On Prem Gateway

3. Do the following in the indicated fields.

a. Name: Enter a name for the VPN gateway.


b. Type: Select On Prem - Third Party.
c. IP Address of your Firewall or Router Device performing VPN: Enter the IP address of the
on-prem VPN appliance.
d. Routing Protocol: Select eBGP or Static to set up a routing protocol between the Xi and
on-prem gateways.
e. (eBGP only) If you select eBGP, do the following:

• In the ASN field, enter the ASN for your on-prem gateway. If you do not have a BGP
environment in your on-prem site, you can choose any number. For example, you can
choose a number in the 65000 range. Ensure that the Xi gateway ASN and on-prem
gateway ASN are not the same.
• In the eBGP Password field, enter the same eBGP password as the Xi gateway.
f. (Static only) If you select Static, manually set up static routes between the Xi and on-
prem gateways.
For more information, see Adding a Static Route in Xi Infrastructure Service
Administration Guide.

Creating a VPN Connection

Create a VPN connection to establish an IPSec tunnel between a VPN gateway in the Xi
Cloud and a VPN gateway in your on-prem site. Select the Xi gateway and on-prem gateway
between which you want to create the VPN connection.

About this task


Perform the following to create a VPN connection.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Connections.

2. Click Create VPN Connection.

Figure 95: Create a VPN Connection

The Create VPN Connection window appears.

3. Do the following in the indicated fields:

a. Name: Enter a name for the VPN connection.


b. Description: Enter a description for the VPN connection.
c. IPSec Secret: Enter an alphanumeric string as the IPSec secret for the VPN connection.
d. Xi Gateway: Select the Xi gateway for which you want to create this VPN connection.
e. On Premises Gateway: Select the on-prem gateway for which you want to create this
VPN connection.
f. Dynamic Route Priority: This field is optional. Set it if you have multiple routes to the
same destination. For example, if you have VPN connection 1 and VPN connection 2 and
you want VPN connection 1 to take precedence over VPN connection 2, set a higher
priority for VPN connection 1 than for VPN connection 2. The higher the priority number,
the higher the precedence of that connection. You can set a priority number from 10
through 1000.
For more information, see the Routes Precedence section in Routes Management in the Xi
Infrastructure Service Administration Guide.

4. Click Save.
The VPN connection you create is displayed in the VPN Connections page.

Downloading the On-Prem VPN Appliance Configuration

Depending upon your VPN solution, you can download detailed instructions about how to
configure your on-prem VPN gateway appliance.

About this task


Perform the following to download the instructions to configure your on-prem VPN gateway
appliance.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Click an on-prem VPN gateway.

3. In the details page, click On Prem Gateway Configuration.

Figure 96: On-prem VPN Gateway Appliance Configuration

4. Select the type and version of your on-prem VPN gateway appliance and click Download.

5. Follow the instructions in the downloaded file to configure the on-prem VPN gateway
appliance.

VPN Gateway Management
You can see the details of each VPN gateway, update the gateway, or delete the gateway.
All your VPN gateways are displayed in the VPN Gateways page.

Displaying the Details of a VPN Gateway

You can display the details such as the type of gateway, VPC, IP addresses, protocols, and
connections associated with the gateways.

About this task


Perform the following to display the details of a VPN gateway.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.
A list of all your VPN gateways is displayed. The VPN gateways table displays details such
as the name, type, VPC, status, public IP address, and VPN connections associated with each
VPN gateway.

Figure 97: VPN Gateways List

2. Click the name of a VPN gateway to display additional details of that VPN gateway.

3. In the details page, click the name of a VPN connection to display the details of that VPN
connection associated with the gateway.

Updating a VPN Gateway

The details that you can update in a VPN gateway depend on the type of gateway (Xi gateway
or On Prem gateway).

About this task


Perform the following to update a VPN gateway.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Do one of the following:

• Select the checkbox next to the name of the VPN gateway and, in the Actions drop-down
list, click Update.

Figure 98: Use the Actions drop-down list


• Click the name of the VPN gateway and, in the details page that appears, click Update.
The Update VPN Gateway dialog box appears.

3. Update the details as required.


The fields are similar to the Create VPN Gateway dialog box. For more information, see
Creating a Xi VPN Gateway on page 159, Creating an On-Prem VPN Gateway (Nutanix) on
page 160, or Creating an On-Prem VPN Gateway (Third-Party) on page 167 depending
on the type of gateway you are updating.

4. Click Save.

Deleting a VPN Gateway

If you want to delete a VPN gateway, you must first delete all the VPN connections associated
with the gateway; only then can you delete the VPN gateway.

About this task


Perform the following to delete a VPN gateway.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Do one of the following:

» Select the checkbox next to the name of the VPN gateway and, in the Actions drop-down
list, click Delete.
» Click the name of the VPN gateway and, in the details page that appears, click Delete.

3. Click OK in the confirmation message that appears to delete the VPN gateway.

VPN Connection Management


You can see the details of each VPN connection, update the connection, or delete the
connection.
All your VPN connections are displayed in the VPN Connections page.

Displaying the Details of a VPN Connection

You can display details such as the gateways associated with the connection, protocol details,
Xi gateway routes, throughput of the connection, and logs of the IPSec and eBGP sessions for
troubleshooting purposes.

About this task


Perform the following to display the details of a VPN connection.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Connections.
A list of all your VPN connections is displayed. The VPN connections table displays details
such as the name, IPSec and eBGP status, dynamic route priority, and the VPC and gateways
associated with each VPN connection.

Figure 99: VPN Connections List

2. Click the name of a VPN connection to display more details of that VPN connection.
The details page displays the following tabs:

• Summary: Displays details of each gateway, protocol, and Xi gateway routes associated
with the connection.
• Throughput: Displays a graph for throughput of the VPN connection.
• IPSec Logging: Displays logs of the IPSec sessions of the VPN connection. You can see
these logs to troubleshoot any issues with the VPN connection.
• EBGP Logging: Displays logs of the eBGP sessions of the VPN connection. You can see
these logs to troubleshoot any issues with the VPN connection.
Click the name of the tab to display the details in that tab. For example, click the Summary
tab to display the details.

Figure 100: VPN Connection Summary Tab

Updating a VPN Connection

You can update the name, description, IPSec secret, and dynamic route priority of the VPN
connection.

About this task


Perform the following to update a VPN connection.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Connections.

2. Do one of the following:

• Select the checkbox next to the name of the VPN connection and, in the Actions drop-
down list, click Update.

Figure 101: Use the Actions drop-down list


• Click the name of the VPN connection and, in the details page that appears, click Update.

Figure 102: Click the name of the VPN connection


The Update VPN Connection dialog box appears.

3. Update the details as required.


The fields are similar to the Create VPN Connection dialog box. See Creating a VPN
Connection on page 162 for more information.

4. Click Save.

Deleting a VPN Connection

About this task


Perform the following to delete a VPN connection.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Connections.

2. Do one of the following:

» Select the checkbox next to the name of the VPN connection and, in the Actions drop-
down list, click Delete.
» Click the name of the VPN connection and, in the details page that appears, click Delete.

3. Click OK in the confirmation message that appears to delete the VPN connection.

Upgrading the VPN Gateway Appliances


You can upgrade the VPN gateway VM in the Xi Cloud and on-prem VPN gateway VM in
your on-prem site if you are using the On Prem - Nutanix VPN solution by using the Xi Cloud
Services portal. If you are using a third-party VPN solution, you can upgrade only the VPN
gateway VM running in the Xi Cloud by using the Xi Cloud Services portal. To upgrade the on-
prem VPN gateway appliance provided by a third-party vendor, see the documentation of that
vendor for instructions about how to upgrade the VPN appliance.

About this task

Note: The VPN gateway VM restarts after the upgrade is complete. Therefore, perform the
upgrade during a scheduled maintenance window.

Perform the following to upgrade your VPN gateway appliances.

Procedure

1. In the Xi Cloud Services portal, go to Explore -> Networking -> VPN Gateways.

2. Click the name of the VPN gateway.


To upgrade the VPN gateway VM running in the Xi Cloud, select a Xi gateway.
To upgrade the VPN gateway VM running in your on-prem site, select the on-prem gateway
associated with that on-prem VPN gateway VM.

3. In the details page of the gateway, click the link in the Version row.
The VPN Version dialog box appears.
If you are using the latest version of the VPN gateway VM, the VPN Version dialog box
displays a message that your VPN gateway VM is up to date.
If your VPN gateway VM is not up to date, the VPN Version dialog box displays the Upgrade
option.

4. In the VPN Version dialog box, click Upgrade to upgrade your VPN gateway VM to the latest
version.
The VPN gateway VM restarts after the upgrade is complete and starts with the latest
version.

Nutanix Virtual Networks


A planned or an unplanned failover of production workloads requires production virtual
networks in both the primary and the recovery site. To ensure that a failover operation,
whenever necessary, goes as expected, you also need test virtual networks in both sites
for testing your recovery configuration in both directions (failover and failback). To isolate

production and test workflows, a recovery plan in Leap uses four separate virtual networks,
which are as follows.
Two Production Networks
A production virtual network in the primary site is mapped to a production network in
the recovery site. Production failover and failback are confined to these virtual networks.
Two Test Networks
The production virtual network in each site is mapped to a test virtual network in the
paired site. Test failover and failback are confined to these virtual networks.
The following figures show the source and target networks for planned, unplanned, and test
failovers.

Figure 103: Virtual Network Mapping

Figure 104: Virtual Network Mapping (On-Prem to Xi Cloud Services)

Virtual networks on on-prem Nutanix clusters are virtual subnets bound to a single VLAN. At
on-prem sites (including the recovery site), you must manually create the production and test
virtual networks before you create your first recovery plan.
The virtual networks required in Xi Cloud Services are contained within virtual private clouds
(VPCs). Virtual networks required for production workloads are contained within a virtual
private cloud named production. Virtual networks required for testing failover from on-prem
sites are contained within a virtual private cloud named Test. Creating virtual networks in the
VPCs in Xi Cloud Services is optional. If you do not create a virtual network in a VPC, Leap
dynamically creates the virtual networks for you when a failover operation is in progress. Leap
cleans up dynamically created virtual networks when they are no longer required (after
failback).

Note: You cannot create more VPCs in Xi Cloud Services. However, you can update the VPCs
to specify settings such as DNS and DHCP, and you can configure policies to secure the virtual
networks.

Virtual Subnet Configuration in On-Prem Site


You can use your on-prem Prism Central instance to create, modify, and remove virtual
networks. For information about how to perform these procedures by using Prism Central, see
the Prism Central Guide.

Virtual Subnet Configuration in Xi Cloud Services


You can create virtual subnets in the production and test virtual networks. This is an optional
task. You must perform these procedures in Xi Cloud Services. For more information, see the Xi
Infrastructure Services Guide.

Xi Leap RPO Sizer


Nutanix offers standard service level agreements (SLAs) for data replication from your on-
prem AHV clusters to Xi Cloud Services based on RPO and RTO. The replication to Xi Cloud
Services occurs over the public Internet (VPN or DirectConnect), so the network bandwidth
available for replication to Xi Cloud Services cannot be controlled. Unstable network
bandwidth and the lack of network information affect the amount of data that can be
replicated in a given time frame. You can test your RPO objectives by setting up a real
protection policy or use the Xi Leap RPO Sizer utility to simulate the protection plan (without

replicating data to Xi Cloud Services). Xi Leap RPO Sizer provides the information required
to determine whether the RPO SLAs are achievable. The utility reports your network
bandwidth, estimates performance, calculates the actual change rate, and calculates the
feasible RPO for your data protection plan.
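The feasibility question the Sizer answers can be sketched with back-of-envelope arithmetic (this is an illustration, not the Sizer's actual algorithm): an RPO is only achievable if the data that changes in one RPO interval can be replicated within that interval at your usable bandwidth.

```shell
# Sketch: back-of-envelope RPO feasibility check (integer arithmetic).
#   change_gb   - data changed per RPO interval, in GB
#   bw_mbps     - usable replication bandwidth, in Mbit/s
#   rpo_seconds - the RPO interval, in seconds
rpo_feasible() {
  change_gb=$1
  bw_mbps=$2
  rpo_seconds=$3
  # 1 GB = 8192 Mbit, so transfer time (s) = change_gb * 8192 / bw_mbps
  needed=$(( change_gb * 8192 / bw_mbps ))
  if [ "$needed" -le "$rpo_seconds" ]; then
    echo "feasible: needs ${needed}s of the ${rpo_seconds}s interval"
  else
    echo "not feasible: needs ${needed}s, interval is only ${rpo_seconds}s"
  fi
}

rpo_feasible 10 100 3600   # 10 GB changed per hour over a 100 Mbit/s link
```

In practice, run the Sizer itself: it measures the real change rate and bandwidth instead of relying on estimates like these.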

About this task


See Xi Leap Service-Level Agreements (SLAs) on page 148 for more information about
Nutanix SLAs for data replication to Xi Cloud Services. To use the Xi Leap RPO Sizer utility,
perform the following steps.

Procedure

1. Log on to the My Nutanix portal with your account credentials.

2. Click Launch in Xi Leap RPO Sizer widget.

3. (optional) Download the bundle (rpo_sizer.tar) using the hyperlink given in the instructions.

Tip: You can also download the bundle directly (using wget command in CLI) into the
directory after step 4.a.

4. Log on to any on-prem guest VM through an SSH session and do the following.

Note: The guest VM must have connectivity to the Prism Central VM and CVMs.

a. Create a separate directory to ensure that all the downloaded and extracted files inside
the downloaded bundle remain in one place.
nutanix@cvm$ mkdir dir_name

Replace dir_name with an identifiable name. For example, rpo_sizer.


b. (optional) Copy the downloaded bundle into the directory created in the previous step.
nutanix@cvm$ cp download_bundle_path/rpo_sizer.tar ./dir_name/

Replace download_bundle_path with the path to the downloaded bundle.


Replace dir_name with the directory name created in the previous step.

Tip: If you download the bundle directly (using wget command in CLI) from the directory,
you can skip this step.

c. Go to the directory where the bundle is stored and extract the bundle.
nutanix@cvm$ cd ./dir_name

Replace dir_name with the directory name created in the step 4.a.
nutanix@cvm$ tar -xvf rpo_sizer.tar

This command generates rpo_sizer.sh and rposizer.tar in the same directory.


d. Change the permissions to make the extracted shell file executable.
nutanix@cvm$ chmod +x rpo_sizer.sh

e. Run the shell script in the bundle.


nutanix@cvm$ ./rpo_sizer.sh

Note: If you ran the Xi Leap RPO Sizer previously on the Prism Central VM, clean
up the script before you run the shell script again by running ./rpo_sizer.sh
delete. If you do not clean up the script, you get an error similar to:
The container name "/rpo_sizer" is already in use by container "xxxx". You
have to remove (or rename) that container to be able to reuse that name
(where xxxx is the container identifier).
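Steps 4.a through 4.e can be sketched end to end as follows. Because the real download link comes from the My Nutanix portal, this sketch builds a local stand-in for rpo_sizer.tar; only the tar, chmod, and execution steps mirror the actual procedure.

```shell
# Self-contained sketch of steps 4.a-4.e. The stand-in bundle contents are
# hypothetical; the real rpo_sizer.tar comes from the portal download link.
mkdir -p rpo_sizer && cd rpo_sizer                     # 4.a: working directory
printf '#!/bin/sh\necho "rpo_sizer placeholder"\n' > rpo_sizer.sh
tar -cf rpo_sizer.tar rpo_sizer.sh && rm rpo_sizer.sh  # stand-in for the download
tar -xvf rpo_sizer.tar                                 # 4.c: extract the bundle
chmod +x rpo_sizer.sh                                  # 4.d: make it executable
./rpo_sizer.sh                                         # 4.e: run the script
```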

5. Open a web browser and go to http://Prism_Central_IP_address:8001/ to run the RPO test.


Replace Prism_Central_IP_address with the virtual IP address of your Prism Central
deployment.

Note: If you have set up a firewall on Prism Central, ensure that port 8001 is open.
nutanix@cvm$ modify_firewall -p 8001 -o open -i eth0 -a
Close the port after running the RPO test.
nutanix@cvm$ modify_firewall -p 8001 -o close -i eth0 -a

6. Click Configure and execute test and specify the following information in the Configuration
Wizard.

Note: If you are launching the Xi Leap RPO Sizer utility for the first time, generate an API key
pair. To generate an API key pair, see Creating an API Key in the Nutanix Licensing Guide.

a. In the API Key and PC Credentials tab, specify the following information.

1. API Key: Enter the API key that you generated.
2. Key ID: Enter the key ID that you generated.
3. PC IP: Enter the IP address of the Prism Central VM.
4. Username: Enter the username of your Prism Central deployment.
5. Password: Enter the password of your Prism Central deployment.
6. Click Next.
b. In the Select Desired RPO and Entities tab, select the desired RPO from the drop-down
list, select the VM Categories or individual VMs, and click +. If you want to add more
RPOs and entities to the test, enter the information again. Click Next.

Note: You can select the VM Categories or individual VMs on which to test the
selected RPO only after you select the desired RPO.

The system discovers Prism Element automatically based on the VM Categories and the
individual VMs you choose.
c. In the Enter PE credentials tab, enter the SSH password or SSH key of Prism Element
(the "nutanix" user) running on the AHV cluster, and click Next.
d. In the Network Configuration tab, specify the following information.

1. Select region: From the drop-down list, select the region closest to the Xi Leap
datacenter where the workloads should be copied.
2. Select availability zone: Select an availability zone (site) from the drop-down list.
3. NAT Gateway IPs: Enter the public-facing IP address of Prism Element running on
your AHV cluster.

Note: To find the NAT gateway IP address of Prism Element running on your AHV
cluster, log on to Prism Element through an SSH session (as the "nutanix" user) and
run the curl ifconfig.me command.

Note: Do not turn on the Configure Advanced Options switch unless advised by
Nutanix Support.

4. Click Next.
e. In the View Configuration tab, review the RPO, entity, and network configuration, the
estimated test duration, and click Submit.
The new window shows the ongoing test status in a progress bar.

7. When the RPO test completes, click Upload result to upload the test result. To view the
detailed and intuitive report of the test, click View Report. To abort the test, click X.

Note: If a test is in progress, a new test cannot be triggered.

Protection and Automated DR (Xi Leap)
Automated disaster recovery (DR) configurations use protection policies to protect the guest VMs,
and recovery plans to orchestrate the recovery of those guest VMs to Xi Cloud Services. With
reverse synchronization, you can protect guest VMs and enable DR from Xi Cloud Services to
a Nutanix cluster at an on-prem availability zone (site). You can automate protection of your
guest VMs with the following supported replication schedules in Xi Leap.

• Asynchronous replication schedule (1 hour or greater RPO). For information about protection
with Asynchronous replication schedule, see Protection with Asynchronous Replication and
DR (Xi Leap) on page 180.
• NearSync replication schedule (1–15 minute RPO). For information about protection with
NearSync replication schedule, see Protection with NearSync Replication and DR (Xi Leap)
on page 212.

Protection with Asynchronous Replication and DR (Xi Leap)


Asynchronous replication schedules enable you to protect your guest VMs with an RPO of
1 hour or greater. A protection policy with an Asynchronous replication schedule creates
recovery points at the specified (hourly or longer) interval and replicates them to Xi Cloud
Services for High Availability. For guest VMs protected with an Asynchronous replication
schedule, you can perform
disaster recovery (DR) to Xi Cloud Services. With reverse synchronization, you can perform
DR from Xi Cloud Services to a Nutanix cluster at an on-prem site. In addition to performing
DR from AHV clusters to Xi Cloud Services (only AHV), you can also perform cross-hypervisor
disaster recovery (CHDR)—DR from ESXi clusters to Xi Cloud Services.

Note: Nutanix provides multiple disaster recovery (DR) solutions to secure your environment.
See Nutanix Disaster Recovery Solutions on page 11 for the detailed representation of the DR
offerings of Nutanix.

Asynchronous Replication Requirements (Xi Leap)


The following are the specific requirements for protecting your guest VMs with Asynchronous
replication schedule. Ensure that you meet the following requirements in addition to the general
requirements of Xi Leap.
For information about the general requirements of Xi Leap, see Xi Leap Requirements on
page 142.
For information about the on-prem node, disk and Foundation configurations required to
support Asynchronous replication schedules, see On-Prem Hardware Resource Requirements
on page 14.

Hypervisor Requirements
AHV or ESXi

• The AHV clusters must be running on AHV versions that come bundled with the latest
version of AOS.
• The ESXi clusters must be running on version ESXi 6.5 GA or newer.

Nutanix Software Requirements


The on-prem Prism Central and their registered clusters (Prism Elements) must be running the
following versions of AOS.

• AOS 5.10 or newer with AHV.

• AOS 5.11 or newer with ESXi.
Xi Cloud Services runs the latest versions of AOS.

Cross Hypervisor Disaster Recovery (CHDR) Requirements


Guest VMs protected with an Asynchronous replication schedule support cross-hypervisor
disaster recovery. You can perform failover (DR) to recover guest VMs from ESXi clusters to
AHV clusters (Xi Cloud Services), provided you meet the following requirements.

• The on-prem Nutanix clusters must be running AOS 5.11 or newer.


• Install and configure Nutanix Guest Tools (NGT) on all the guest VMs. For more information,
see Enabling and Mounting Nutanix Guest Tools in Prism Web Console Guide.
NGT configures the guest VMs with all the required drivers for VM portability. For more
information about general NGT requirements, see Nutanix Guest Tools Requirements and
Limitations in Prism Web Console Guide.
• CHDR supports guest VMs with flat files only.
• CHDR supports IDE/SCSI and SATA disks only.
• For all the non-boot SCSI disks of Windows guest VMs, set the SAN policy to OnlineAll so
that they come online automatically.
• In vSphere 6.7, guest VMs are configured with UEFI secure boot by default. Upon CHDR to
an AHV cluster, these guest VMs do not start if the host does not support the UEFI secure
boot feature. For more information about supportability of UEFI secure boot on Nutanix
clusters, see Compatibility Matrix.
For operating systems that support UEFI and Secure Boot, see UEFI and Secure Boot
Support for CHDR on page 211.
• Nutanix does not support vSphere inventory mapping (for example, VM folder and resource
pools) when protecting workloads between VMware clusters.

• Nutanix does not support vSphere snapshots or delta disk files.


If you have delta disks attached to a guest VM and you proceed with failover, you get
a validation warning and the guest VM does not recover. Contact Nutanix Support for
assistance.

Table 27: Operating Systems Supported for CHDR

Windows

• Versions: Windows 2008 R2 or newer; Windows 7 or newer.
• Requirements and limitations: Only 64-bit operating systems are supported.

Linux

• Versions: CentOS 6.5 and 7.0; RHEL 6.5 or newer and RHEL 7.0 or newer; Oracle
Linux 6.5 and 7.0; Ubuntu 14.04.
• Requirements and limitations: SLES operating system is not supported.

Additional Requirement
The storage container name of the protected guest VMs must be the same on both the primary
and recovery clusters. Therefore, a storage container must exist on the recovery cluster with
the same name as the one on the primary cluster. For example, if the protected VMs are
in the SelfServiceContainer storage container on the primary cluster, there must also be a
SelfServiceContainer storage container on the recovery cluster.
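
As a quick sketch of this requirement, you can compare the container name lists of the two clusters before protecting the VMs. The lists below are hypothetical stand-ins; in practice you would collect them from each cluster (for example, with ncli).

```shell
# Hypothetical container name lists for the primary and recovery clusters.
primary_containers="default-container SelfServiceContainer"
recovery_containers="default-container SelfServiceContainer"

# Flag any primary container that has no same-named container on the
# recovery cluster.
for ctr in $primary_containers; do
  case " $recovery_containers " in
    *" $ctr "*) echo "$ctr: present on recovery cluster" ;;
    *)          echo "$ctr: MISSING on recovery cluster" ;;
  esac
done
```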

Asynchronous Replication Limitations (Xi Leap)


Consider the following specific limitations before protecting your guest VMs with Asynchronous
replication schedule. These limitations are in addition to the general limitations of Xi Leap.
For information about the general limitations of Xi Leap, see Xi Leap Limitations on page 146.

• You cannot restore guest VMs with incompatible GPUs at the recovery site.
• You cannot protect guest VMs configured as part of a network function chain.
• You cannot retain hypervisor-specific properties after cross hypervisor disaster recovery
(CHDR).
Cross hypervisor disaster recovery (CHDR) does not preserve hypervisor-specific properties
(for example, multi-writer flags, independent persistent and non-persistent disks, changed
block tracking (CBT), PVSCSI disk configurations).

Creating a Protection Policy with Asynchronous Replication Schedule (Xi Leap)


To protect the guest VMs in an hourly replication schedule, configure an Asynchronous
replication schedule while creating the protection policy. The policy takes recovery points
of those guest VMs in the specified time intervals (hourly) and replicates them to Xi Cloud
Services for High Availability. With reverse synchronization, you can create the policy at Xi
Cloud Services and replicate to an on-prem availability zone (site). For protection from Xi Cloud
Services to an on-prem site, the protection policy allows you to add only one Asynchronous
replication schedule.

Before you begin


See Asynchronous Replication Requirements (Xi Leap) on page 180 and Asynchronous
Replication Limitations (Xi Leap) on page 182 before you start.

About this task


To create a protection policy with an Asynchronous replication schedule, perform the following
procedure at Xi Cloud Services. You can also create a protection policy at the on-prem site.
Protection policies you create or update at the on-prem site synchronize back to Xi Cloud
Service.

Procedure

1. Log on to Xi Cloud Services.

2. Click the hamburger icon at the top-left corner of the window. Go to Data Protection &
Recovery > Protection Policies in the left pane.

3. Click Create Protection Policy.


Specify the following information in the Create Protection Policy window.

Figure 105: Protection Policy Configuration: Asynchronous

a. Policy name: Enter a name for the protection policy.

Caution: The name can contain only alphanumeric characters, dots, dashes, and underscores.

b. In the Primary Location pane, specify the following information.

1. Location: From the drop-down list, check the Xi Cloud Services availability zone
(site) that hosts the guest VMs to protect.
The drop-down lists all the sites paired with the local site. Local AZ represents the
local site (Prism Central). For your primary site, you can check either the local site or
a non-local site.
2. Cluster: Xi Cloud Services automatically selects the cluster for you. Therefore, the
only available option is Auto.
3. Click Save.
Clicking Save activates the Recovery Location pane. After saving the primary
site configuration, you can optionally add a local schedule (step iv) to retain the
recovery points at the primary site.
4. Click + Add Local Schedule if you want to retain recovery points locally in addition
to retaining recovery points in a replication schedule (step d.iv). For example, you
can create a local schedule to retain 15-minute recovery points locally and also
an hourly replication schedule to retain recovery points and replicate them to a
recovery site every 2 hours. The two schedules apply differently to the guest VMs.
Specify the following information in the Add Schedule window.

Figure 106: Protection Policy Configuration: Add Local Schedule

1. Take Snapshot Every: Specify the frequency in minutes, hours, days, or weeks at
which you want the recovery points to be taken locally.
2. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at the local site. If you set the
retention number to n, the local site retains the n recent recovery points.
When you enter the frequency in minutes, the system selects the Roll-up
retention type by default because minutely recovery points do not support
Linear retention types.
• Roll-up: Rolls up the recovery points into a single recovery point at the local
site.
For more information about the roll-up recovery points, see step d.iii.
3. Retention on XI-US-EAST-1A-PPD : Auto: Specify the retention number for the
local site.
4. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Irrespective of the local or replication schedules, the recovery points are of the
specified type. If you check Take App-Consistent Recovery Point, the recovery
points generated are application-consistent and if you do not check Take App-
Consistent Recovery Point, the recovery points generated are crash-consistent.

If the time in the local schedule and the replication schedule match, the single
recovery point generated is application-consistent.

Note: See Application-consistent Recovery Point Conditions and Limitations on
page 51 before you take an application-consistent snapshot.
5. Click Save Schedule.
c. In the Recovery Location pane, specify the following information.

Figure 107: Protection Policy Configuration: Select Recovery Location

1. Location: From the drop-down list, select the availability zone (site) where you want
to replicate the recovery points.
The drop-down lists all the sites paired with Xi Cloud Services. XI-US-EAST-1A-
PPD : Auto represents the local site (Prism Central). Do not select XI-US-EAST-1A-
PPD : Auto because a duplicate location is not supported in Xi Cloud Services.
If you do not select a site, local recovery points that are created by the protection
policy do not replicate automatically. You can, however, replicate the recovery
points manually and use recovery plans to recover the guest VMs. For more
information, see Manual Disaster Recovery (Leap) on page 137.
2. Cluster: Xi Cloud Services automatically selects the cluster for you. Therefore, the
only available option is Auto.
3. Click Save.
Clicking Save activates the + Add Schedule button between the primary and the
recovery site. After saving the recovery site configuration, you can optionally add a
local schedule to retain the recovery points at the recovery site.
4. Click + Add Local Schedule if you want to retain recovery points locally in addition
to retaining recovery points in a replication schedule (step d.iv). For example,
you can create a local schedule to retain hourly recovery points locally to
supplement the hourly replication schedule. The two schedules apply differently to
the guest VMs after failover, when the recovery points replicate back to the primary
site.
Specify the following information in the Add Schedule window.

Figure 108: Protection Policy Configuration: Add Local Schedule

1. Take Snapshot Every: Specify the frequency in minutes, hours, days, or weeks at
which you want the recovery points to be taken locally.

2. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at the local site. If you set the
retention number to n, the local site retains the n recent recovery points.
When you enter the frequency in minutes, the system selects the Roll-up
retention type by default because minutely recovery points do not support
Linear retention types.
• Roll-up: Rolls up the recovery points into a single recovery point at the local
site.
For more information about the roll-up recovery points, see step d.iii.
3. Retention on PC_xx.xx.xxx:PE_yyy: Specify the retention number for the local
site.
4. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Irrespective of the local or replication schedules, the recovery points are of the
specified type. If you check Take App-Consistent Recovery Point, the recovery
points generated are application-consistent and if you do not check Take App-
Consistent Recovery Point, the recovery points generated are crash-consistent.

If the time in the local schedule and the replication schedule match, the single
recovery point generated is application-consistent.

Note: See Application-consistent Recovery Point Conditions and Limitations on
page 51 before you take an application-consistent snapshot.
5. Click Save Schedule.
d. Click + Add Schedule to add a replication schedule between the primary and the recovery
site.
Specify the following information in the Add Schedule window. The window auto-
populates the Primary Location and Recovery Location that you have selected in step b
and step c.

Figure 109: Protection Policy Configuration: Add Schedule (Asynchronous)

1. Protection Type: Click Asynchronous.

2. Take Snapshot Every: Specify the frequency in hours, days, or weeks at which you
want the recovery points to be taken.
The specified frequency is the RPO. For more information about RPO, see Leap
Terminology on page 8.
3. Retention Type: Specify one of the following two types of retention policy.

• Linear: Implements a simple retention scheme at both the primary (local) and the
recovery (remote) site. If you set the retention number for a given site to n, that
site retains the n recent recovery points. For example, if the RPO is 1 hour, and
the retention number for the local site is 48, the local site retains 48 hours (48 X 1
hour) of recovery points at any given time.

Tip: Use linear retention policies for small RPO windows with shorter retention
periods or in cases where you always want to recover to a specific RPO window.

• Roll-up: Rolls up the recovery points as per the RPO and retention period into a
single recovery point at a site. For example, if you set the RPO to 1 hour, and the
retention time to 5 days, the 24 oldest hourly recovery points roll up into a single
daily recovery point (one recovery point = 24 hourly recovery points) after every
24 hours. The system keeps one day (of rolled-up hourly recovery points) and 4
days of daily recovery points.

Note:

• If the retention period is n days, the system keeps 1 day of RPO
(rolled-up hourly recovery points) and n-1 days of daily recovery points.
• If the retention period is n weeks, the system keeps 1 day of RPO, 1
week of daily and n-1 weeks of weekly recovery points.
• If the retention period is n months, the system keeps 1 day of RPO, 1
week of daily, 1 month of weekly, and n-1 months of monthly recovery
points.
• If the retention period is n years, the system keeps 1 day of RPO, 1
week of daily, 1 month of weekly, and n-1 months of monthly recovery
points.

Note: The recovery points that are used to create a rolled-up recovery point are
discarded.

Tip: Use roll-up retention policies for anything with a longer retention period.
Roll-up policies are more flexible and automatically handle recovery point aging/
pruning while still providing granular RPOs for the first day.
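
The retention arithmetic described above can be sketched numerically; the RPO and retention values below are the illustrative ones from the text.

```shell
# Worked example of the two retention types (values are illustrative).
rpo_hours=1

# Linear: a retention number of 48 keeps the 48 most recent recovery
# points, i.e. a 48-hour window at a 1-hour RPO.
linear_retention=48
echo "linear window: $(( linear_retention * rpo_hours )) hours"

# Roll-up: a 5-day retention period keeps 1 day of rolled-up hourly
# recovery points plus n-1 = 4 days of daily recovery points.
retention_days=5
echo "roll-up: $(( 24 / rpo_hours )) hourly + $(( retention_days - 1 )) daily recovery points"
```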

4. To specify the retention number for the primary and recovery sites, do the following.

• Retention on XI-US-EAST-1A-PPD : Auto: Specify the retention number for the
primary site.
This field is unavailable if you do not specify a recovery location.
• Retention on PC_xx.xx.xx.xxx:PE_yyy: Specify the retention number for the
recovery site.
If you select linear retention, the remote and local retention count represents
the number of recovery points to retain at any given time. If you select roll-up
retention, these numbers specify the retention period.
5. If you want to enable reverse retention of the recovery points, check Reverse
retention for VMs on recovery location.

Note: Reverse retention for VMs on recovery location is available only when the
retention numbers on the primary and recovery sites are different.

Reverse retention maintains the retention numbers of recovery points even after
failover to a recovery site in the same or a different availability zone. For example, if
you retain two recovery points at the primary site and three recovery points at the
recovery site, and you enable reverse retention, a failover event does not change the
initial retention numbers when the recovery points replicate back to the primary site.
The recovery site still retains three recovery points while the primary site retains two
recovery points. If you do not enable reverse retention, a failover event changes the
initial retention numbers when the recovery points replicate back to the primary site:
the recovery site then retains two recovery points while the primary site retains three
recovery points.
Maintaining the same retention numbers at a recovery site is required if you want to
retain a particular number of recovery points, irrespective of where the guest VM is
after its failover.
6. If you want to take application-consistent recovery points, check Take App-
Consistent Recovery Point.
Application-consistent recovery points ensure that application consistency is
maintained in the replicated recovery points. For application-consistent recovery
points, install NGT on the guest VMs running on AHV clusters. For guest VMs
running on ESXi clusters, you can take application-consistent recovery points
without installing NGT, but the recovery points are hypervisor-based, which leads to
VM stuns (temporarily unresponsive VMs) after failover to the recovery sites.

Note: See Application-consistent Recovery Point Conditions and Limitations on
page 51 before you take an application-consistent snapshot.

Caution: Application-consistent recovery points fail for EFI-boot enabled Windows
2019 VMs running on ESXi when NGT is not installed. Nutanix recommends installing
NGT on guest VMs running on ESXi as well.

7. Click Save Schedule.


e. Click Next.
Clicking Next shows a list of VM categories where you can optionally check one or more
VM categories to protect in the protection policy. DR configurations using Leap allow
you to protect a guest VM by using only one protection policy. Therefore, VM categories
specified in another protection policy are not in the list. If you protect a guest VM in
another protection policy by specifying the VM category of the guest VM (category-
based inclusion), and you protect the guest VM from the VMs page in this policy
(individual inclusion), the individual inclusion supersedes the category-based inclusion.
Effectively, only the protection policy that protected the individual guest VM protects the
guest VM.
For example, the guest VM VM_SherlockH is in the category Department:Admin, and
you add this category to the protection policy named PP_AdminVMs. Now, if you add
VM_SherlockH from the VMs page to another protection policy named PP_VMs_UK,
VM_SherlockH is protected in PP_VMs_UK and unprotected from PP_AdminVMs.
f. If you want to protect the guest VMs category-wise, check the VM categories that you
want to protect from the list and click Add.

Figure 110: Protection Policy Configuration: Add VM Categories

Prism Central includes built-in VM categories for frequently encountered applications (for
example, MS Exchange and Oracle). If the VM category or value you want is not available,
first create the category with the required values, or update an existing category so
that it has the values you require. Doing so ensures that the VM categories and values
are available for selection. You can add VMs to the category either before or after you
configure the protection policy. If the guest VMs have a common characteristic, such as
belonging to a specific application or location, create a VM category and add the guest
VMs into the category.
If you do not want to protect the guest VMs category-wise, proceed to the next step
without checking VM categories. You can add the guest VMs individually to the protection
policy later from the VMs page (see Adding Guest VMs individually to a Protection Policy
on page 128).
g. Click Create.
The protection policy with an Asynchronous replication schedule is created. To verify
the protection policy, see the Protection Policies page. You can add VMs individually
(without VM categories) to the protection policy or remove VMs from the protection
policy. For information about the operations that you can perform on a protection policy,
see Protection Policy Management on page 221.

Creating a Recovery Plan (Xi Leap)
To orchestrate the failover (disaster recovery) of the protected guest VMs to the recovery site,
create a recovery plan. After a failover, a recovery plan recovers the protected guest VMs to
the recovery availability zone (site). To create a recovery plan, perform the following procedure
at Xi Cloud Services. You can also create a recovery plan at the on-prem site. The recovery plan
you create or update at the on-prem site synchronizes back to Xi Cloud Service.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Click Create Recovery Plan.


Specify the following information in the Create Recovery Plan window.

a. Primary Location: Select the primary site that hosts the guest VMs to protect. This list
displays the Local AZ by default and is unavailable for editing.
b. Recovery Location: Select the on-prem site where you want to replicate the recovery
points.
c. Click Proceed.

Tip: After you create the recovery plan, you cannot change the Recovery Location from the
Recovery Plans page. To change the recovery location on an existing recovery plan, do the
following.

• Update the protection policy to point to the new recovery location. For more
information, see Updating a Protection Policy on page 224.
• Configure the network mapping. For more information, see Nutanix Virtual
Networks on page 174.

Caution: If all the VMs in the recovery plan do not point to the new recovery location, you
get an availability zone conflict alert.

4. In the General tab, enter the Recovery Plan Name and Recovery Plan Description, and click Next.

Figure 111: Recovery Plan Configuration: General

5. In the Power On Sequence tab, click + Add Entities to add VMs to the sequence and do the
following.

Figure 112: Recovery Plan Configuration: Add Entities

a. In the Search Entities by drop-down list, select VM Name to specify VMs by name.
b. In the Search Entities by drop-down list, select Category to specify VMs by category.
c. To add the VMs or VM categories to the stage, select the VMs or VM categories from the
list.

Note: The VMs listed in the search result are in the active state of replication.

d. Click Add.
The selected VMs are added to the sequence. You can also create multiple stages and
add VMs to those stages to define their power-on sequence. For more information about
stages, see Stage Management on page 64.

Caution: Do not include the guest VMs protected with Asynchronous, NearSync, and
Synchronous replication schedules in the same recovery plan. You can include guest VMs
protected with Asynchronous or NearSync replication schedules in the same recovery plan.
However, if you combine these guest VMs with the guest VMs protected by Synchronous
replication schedules in a recovery plan, the recovery fails.

6. To manage in-guest script execution on guest VMs during recovery, select the individual
VMs or VM categories in the stage. Click Manage Scripts and then do the following.

Note: In-guest scripts allow you to automate various task executions upon recovery of the
VMs. For example, in-guest scripts can help automate the tasks in the following scenarios.

• After recovery, the VMs must use new DNS IP addresses and also connect to a
new database server that is already running at the recovery site.
Traditionally, to achieve this new configuration, you would manually log on to
the recovered VM and modify the relevant files. With in-guest scripts, you have
to write a script to automate the required steps and enable the script when you
configure a recovery plan. The recovery plan execution automatically invokes
the script and performs the reassigning of DNS IP address and reconnection to
the database server at the recovery site.
• If VMs are part of domain controller siteA.com at the primary site AZ1, and after
the VMs recover on the site AZ2, you want to add the recovered VMs to the
domain controller siteB.com.
Traditionally, to reconfigure, you would manually log on to the VM, remove the
VM from an existing domain controller, and then add the VM to a new domain

controller. With in-guest scripts, you can automate the task of changing the
domain controller.

Note: In-guest script execution requires NGT version 1.9 or newer installed on the VM. The
in-guest scripts run as a part of the recovery plan only if they have executable permissions
for the following.

• Administrator user (Windows)


• Root user (Linux)

Note: You can have only two in-guest batch or shell scripts: one for production (planned
and unplanned) failover and one for test failover. One script, however, can invoke
other scripts. Place the scripts at the following locations in the VMs.

• In Windows VMs,

• Batch script file path for production failover:


C:\Program Files\Nutanix\scripts\production\vm_recovery

• Batch script file path for test failover:


C:\Program Files\Nutanix\scripts\test\vm_recovery

• In Linux VMs,

• Shell script file path for production failover:


/usr/local/sbin/production_vm_recovery

• Shell script file path for test failover:


/usr/local/sbin/test_vm_recovery

Note: When an in-guest script runs successfully, it returns code 0. Error code 1 signifies that
the execution of the in-guest script was unsuccessful.
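As an illustrative sketch, a minimal Linux production failover script that follows the return-code convention above might look like the following. The resolver address and the staging file path are assumptions made up for this example; a real script placed at /usr/local/sbin/production_vm_recovery would typically write /etc/resolv.conf (or use your distribution's network tooling) and then reconnect to the recovery-site database server.

```shell
#!/bin/sh
# Hypothetical sketch of a production failover in-guest script.
# Assumptions: the recovery-site DNS resolver is 10.51.100.53, and the
# result is staged at /tmp/resolv.conf.staged for illustration only
# (a real script would write /etc/resolv.conf).
NEW_DNS="${NEW_DNS:-10.51.100.53}"
RESOLV_CONF="${RESOLV_CONF:-/tmp/resolv.conf.staged}"

reassign_dns() {
    # Point the guest at the recovery-site DNS resolver.
    printf 'nameserver %s\n' "$NEW_DNS" > "$RESOLV_CONF" || return 1
}

# The recovery plan treats exit code 0 as success and 1 as failure.
if reassign_dns; then
    echo "DNS reassigned to $NEW_DNS"
else
    exit 1
fi
```

Because the recovery plan only checks the exit code, any additional reconfiguration (domain join, database reconnection) can be chained inside the same script as long as a failure path ends with `exit 1`.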

Figure 113: Recovery Plan Configuration: In-guest Script Execution

a. To enable script execution, click Enable.


A command prompt icon appears against the VMs or VM categories to indicate that in-
guest script execution is enabled on those VMs or VM categories.
b. To disable script execution, click Disable.

7. In the Network Settings tab, map networks in the primary cluster to networks at the
recovery cluster.

Figure 114: Recovery Plan Configuration: Network Settings

Network mapping enables replicating the network configurations of the primary clusters
to the recovery clusters, and recover VMs into the same subnet at the recovery cluster.
For example, if a VM is in the vlan0 subnet at the primary cluster, you can configure the
network mapping to recover that VM in the same vlan0 subnet at the recovery cluster.
To specify the source and destination network information for a network mapping, do the
following in Local AZ (Primary) and PC 10.51.1xx.xxx (Recovery).

a. Under Production in Virtual Network or Port Group, select the production subnet that
contains the protected VMs for which you are configuring a recovery plan. (Optional)
If the virtual network is a non-IPAM network, specify the gateway IP address and prefix
length in the Gateway IP/Prefix Length field.
b. Under Test Failback in Virtual Network or Port Group, select the test subnet that you
want to use for testing failback from the recovery cluster. (Optional) If the virtual
network is a non-IPAM network, specify the gateway IP address and prefix length in the
Gateway IP/Prefix Length field.
c. To add a network mapping, click Add Networks at the top-right corner of the page, and
then repeat steps 7.a–7.b.

Note: The primary and recovery Nutanix clusters must have identical gateway IP
addresses and prefix lengths. Therefore, you cannot use a test failover network for two or
more network mappings in the same recovery plan.

d. Click Done.

Note: For ESXi, you can configure network mapping for both standard and distributed
(DVS) port groups. For more information about DVS, see VMware documentation.

Caution: Leap does not support VMware NSX-T datacenters. For more information about
NSX-T datacenters, see VMware documentation.

8. If you want to enable the VMs in the production VPC to access the Internet, enable
Outbound Internet Access.

9. To assign floating IP addresses to the VMs when they are running in Xi Cloud Services, click
+ Floating IPs in Floating IPs section and do the following.

Figure 115: Recovery Plan Configuration: Assign Floating IP Address

a. In NUMBER OF FLOATING IPS, enter the number of floating IP addresses that you need
to assign to VMs.
b. In ASSIGN FLOATING IPS TO VMS (OPTIONAL), enter the names of the VMs and
select an IP address for each.
c. In Actions, click Save.
d. To assign a floating IP address to another VM, click + Assign Floating IP, and then repeat
the steps for assigning a floating IP address.

10. Click Done.


The recovery plan is created. To verify the recovery plan, see the Recovery Plans page.
You can modify the recovery plan to change the recovery location, or to add or remove
protected guest VMs. For information about various operations that you can perform on a
recovery plan, see Recovery Plan Management on page 134.

Failover and Failback Operations (Xi Leap)

You perform failover of the protected guest VMs when unplanned failure events (for example,
natural disasters) or planned events (for example, scheduled maintenance) happen at the
primary availability zone (site) or the primary cluster. The protected guest VMs migrate to the
recovery site where you perform the failover operations. On recovery, the protected guest
VMs start in the Xi Cloud Services region you specify in the recovery plan that orchestrates the
failover.
The following are the types of failover operations in Xi Leap.
Test Failover
To ensure that the protected guest VMs failover efficiently to the recovery site, you
perform a test failover. When you perform a test failover, the guest VMs recover in the
virtual network designated for testing purposes at the recovery site (a manually created

virtual subnet in the test VPC in Xi Cloud Services). However, the guest VMs at the
primary site are not affected. Test failovers rely on the presence of VM recovery points at
the recovery sites.
Planned Failover
To ensure VM availability when you foresee service disruption at the primary site, you
perform a planned failover to the recovery site. For a planned failover to succeed, the
guest VMs must be available at the primary site. When you perform a planned failover,
the recovery plan first creates a recovery point of the protected guest VM, replicates the
recovery point to the recovery site, and then starts the guest VM at the recovery site.
The recovery point used for migration is retained indefinitely. After a planned failover, the
guest VMs no longer run at the primary site.
Unplanned Failover
To ensure VM availability when a disaster causing service disruption occurs at the
primary site, you perform an unplanned failover to the recovery site. In an unplanned
failover, you can expect some data loss to occur. The maximum data loss possible
is equal to the least RPO you specify in the protection policy, or the data that was
generated after the last manual recovery point for a given guest VM. In an unplanned
failover, by default, the protected guest VMs recover from the most recent recovery
point. However, you can recover from an earlier recovery point by selecting a date and
time of the recovery point.
After the failover, replication begins in the reverse direction. You can perform an
unplanned failover operation only if recovery points have replicated to the recovery
cluster. At the recovery site, failover operations cannot use recovery points that were
created locally in the past. For example, if you perform an unplanned failover from
the primary site AZ1 to recovery site AZ2 in Xi Cloud Services and then attempt an
unplanned failover (failback) from AZ2 to AZ1, the recovery succeeds at AZ1 only if the
recovery points are replicated from AZ2 to AZ1 after the unplanned failover operation.
The unplanned failover operation cannot perform recovery based on the recovery points
that were created locally when the VMs were running in AZ1.
The procedure for performing a planned failover is the same as the procedure for performing
an unplanned failover. You can perform a failover even in different scenarios of network failure.
For more information about network failure scenarios, see Leap and Xi Leap Failover Scenarios
on page 67.

Performing a Test Failover (Xi Leap)

After you create a recovery plan, you can run a test failover periodically to ensure that the
failover occurs smoothly when required. You can perform the test failover from Xi Cloud
Services.

About this task


To perform a test failover to Xi Cloud Services, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Select the recovery plan that you want to test.

4. Click Test from the Actions drop-down menu.

5. In the Test Recovery Plan dialog box, do the following.

a. In Primary Location, select the primary availability zone (site).


b. In Recovery Location, select the recovery availability zone.
c. Click Test.
If you get errors or warnings, see the failure report that is displayed. Click the report to
review the errors and warnings. Resolve the error conditions and then restart the test
procedure.

6. Click Close.

Cleaning up Test VMs (Xi Leap)

After testing a recovery plan, you can remove the test VMs that the recovery plan created in
the recovery test network on Xi Cloud Services. To clean up the test VMs created when you test
a recovery plan, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Click the recovery plan whose test VMs you want to remove.

4. Click Clean Up Test VMs from the Actions drop-down menu.

5. In the Clean Up Test VMs dialog box, click Clean.


Test VMs are deleted. If you get errors or warnings, see the failure report that is displayed.
Click the report to review the errors and warnings. Resolve the error conditions and then
restart the test procedure.

Performing a Planned Failover (Xi Leap)

Perform a planned failover at the recovery site. To perform a planned failover to Xi Cloud
Services, do the following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

Figure 116: Planned Failover

3. Select a recovery plan for the failover operation.

4. Click Failover from the Actions drop-down menu. Specify the following information in the
Failover from Recovery Plan dialog box.

Note: The Failover action is available only when all the selected recovery plans have the same
primary and recovery locations.

Figure 117: Planned Failover

a. Failover Type: Click Planned Failover.


b. Failover From (Primary): Select the protected primary cluster.
c. Failover To (Recovery): Select the recovery cluster where you want the VMs to failover.
This list displays Local AZ by default and is unavailable for editing.

Note: Click + to add more combinations of primary and recovery clusters. You can add as
many primary clusters as there are in the selected recovery plan.

5. Click Failover.
The Failover from Recovery Plan dialog box lists the errors and warnings, if any, and allows
you to stop or continue the failover operation.

6. If you see errors, do the following.

a. To review errors or warnings, click View Details in the description.


b. Click Cancel to return to the Failover from Recovery Plan dialog box.
c. Select one of the following.

» To stop the failover operation, click Abort.


» To continue the failover operation despite the warnings, click Execute Anyway.
You cannot continue the failover operation when the validation fails with errors.

Note:

The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if
their files conflict with the existing files on the recovery ESXi cluster. For example,
there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already
has a VM (VM1) in the same container.
However, the entities recover at a different path with VmRecoveredAtAlternatePath
alert only if the following conditions are met.

• Both the primary and the recovery clusters (Prism Elements) are of version 5.17 or
newer.
• A path for the entity recovery is not defined while initiating the failover operation.
• The protected entities do not have shared disks.
If these conditions are not satisfied, the failover operation fails.
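For automation, the planned failover that the steps above trigger through the UI can also be initiated against the Prism Central v3 REST API. The endpoint name, payload shape, and action_type value below are assumptions based on v3 API conventions, and the host and UUID are placeholders; verify all of them against the API reference for your version before scripting. This sketch only builds and prints the request body, leaving the actual submission commented out.

```shell
# Hedged sketch: scripting a planned failover as a recovery plan job.
# PC_HOST and PLAN_UUID are placeholder values (assumptions).
PC_HOST="${PC_HOST:-pc.example.com}"
PLAN_UUID="${PLAN_UUID:-00000000-0000-0000-0000-000000000000}"

# Build the request body for a recovery plan job (shape is an assumption).
REQUEST=$(cat <<EOF
{
  "spec": {
    "name": "planned-failover",
    "resources": {
      "recovery_plan_reference": { "kind": "recovery_plan", "uuid": "$PLAN_UUID" },
      "execution_parameters": { "action_type": "MIGRATE" }
    }
  }
}
EOF
)
echo "$REQUEST"
# Submit the job (uncomment and supply credentials to run):
# curl -k -u admin -X POST \
#   "https://$PC_HOST:9440/api/nutanix/v3/recovery_plan_jobs" \
#   -H 'Content-Type: application/json' -d "$REQUEST"
```

Scripting the same operation is useful for scheduled DR drills, but the UI validation described in step 5 still applies: inspect the job status for errors and warnings after submission.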

Performing an Unplanned Failover (Xi Leap)

Perform an unplanned failover at the recovery site. To perform an unplanned failover to Xi


Cloud Services, do the following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Select a recovery plan for the failover operation.

4. Click Failover from the Actions drop-down menu. Specify the following information in the
Failover from Recovery Plan dialog box.

Note: The Failover action is available only when all the selected recovery plans have the same
primary and recovery locations.

Figure 118: Unplanned Failover

a. Failover Type: Click Unplanned Failover and do one of the following.

» Click Recover from latest Recovery Point to use the latest recovery point for recovery.
» Click Recover from specific point in time to use a recovery point taken at a specific
point in time for recovery.
b. Failover From (Primary): Select the protected primary cluster.
c. Failover To (Recovery): Select the recovery cluster where you want the VMs to failover.
This list displays Local AZ by default and is unavailable for editing.

Note: Click + to add more combinations of primary and recovery clusters. You can add as
many primary clusters as there are in the selected recovery plan.

Note: If recovery plans contain VM categories, the VMs from those categories recover in the
same category after an unplanned failover to the recovery site. Recovery points also continue
to generate at the recovery site for those recovered VMs. Because the VM count represents
the number of recoverable VMs (calculated from recovery points), the replicated recovery
points and the newly generated recovery points add up, which doubles the count of the
originally recovered VMs on the recovery plans page. If some VMs belonging to the given
category at the primary or recovery site are deleted, the VM count at both sites stays the
same until the recovery points of the deleted VMs expire. For example, when two VMs have
failed over, the recovery plans page at the recovery site shows four VMs (two replicated
recovery points from the source and two newly generated recovery points). The page shows
four VMs even if the VMs are deleted from the primary or recovery site. The VM count
synchronizes and becomes consistent in the subsequent RPO cycle, conforming to the
retention policy set in the protection policy (due to the expiration of recovery points).

5. Click Failover.
The Failover from Recovery Plan dialog box lists the errors and warnings, if any, and allows
you to stop or continue the failover operation.

6. If you see errors, do the following.

a. To review errors or warnings, click View Details in the description.


b. Click Cancel to return to the Failover from Recovery Plan dialog box.
c. Select one of the following.

» To stop the failover operation, click Abort.


» To continue the failover operation despite the warnings, click Execute Anyway.

Note: You cannot continue the failover operation when the validation fails with errors.

Note:

The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters
if their files conflict with the existing files on the recovery ESXi cluster. For
example, there is a file name conflict if a VM (VM1) migrates to a recovery cluster
that already has a VM (VM1) in the same container.
However, the entities recover at a different path with
VmRecoveredAtAlternatePath alert only if the following conditions are met.

• Both the primary and the recovery clusters (Prism Elements) are of version 5.17
or newer.
• A path for the entity recovery is not defined while initiating the failover
operation.
• The protected entities do not have shared disks.
If these conditions are not satisfied, the failover operation fails.

Note: To avoid conflicts when the primary site becomes active after the failover, shut down
the guest VMs associated with this recovery plan. Manually power off the guest VMs on either
primary or recovery site after the failover is complete. You can also block the guest VMs
associated with this recovery plan through the firewall.

Performing Failback (Xi Leap)

A failback is similar to a failover, but in the reverse direction. The same recovery plan applies to both
the failover and the failback operations. Therefore, how you perform a failback is identical
to how you perform a failover. Log on to the site where you want the VMs to failback, and
then perform a failover. For example, if you failed over VMs from an on-prem site to Xi Cloud
Services, to failback to the on-prem site, perform the failover from the on-prem site.

About this task


To perform a failback, do the following procedure at the primary site.

Procedure

1. Log on to the Prism Central web console.

2. Click the hamburger icon at the top-left corner of the window. Go to Policies > Recovery
Plans in the left pane.

3. Select a recovery plan for the failover operation.

4. Click Failover from the Actions drop-down menu.

Note: If you select more than one recovery plan in step 3, the Failover action is available only
when the selected recovery plans have the same primary and recovery locations.

Specify the following information in the Failover from Recovery Plan window. The window
auto-populates the Failover From and Failover To locations from the recovery plan you
select in step 3.

Figure 119: Unplanned Failover

a. Failover Type: Click Unplanned Failover and do one of the following.

Tip: You can also click Planned Failover to perform the planned failover procedure for a
failback.

» Click Recover from latest Recovery Point to use the latest recovery point for recovery.
» Click Recover from specific point in time to use a recovery point taken at a specific
point in time for recovery.
b. Click + Add target clusters if you want to failover to specific Nutanix clusters at the
primary site.
If you do not add target clusters, the recovery plan recovers the guest VMs to any eligible
cluster at the primary site.

Note: If recovery plans contain VM categories, the VMs from those categories recover in the
same category after an unplanned failover to the recovery site. Recovery points also continue
to generate at the recovery site for those recovered VMs. Because the VM count represents
the number of recoverable VMs (calculated from recovery points), the replicated recovery
points and the newly generated recovery points add up, which doubles the count of the
originally recovered VMs on the recovery plans page. If some VMs belonging to the given
category at the primary or recovery site are deleted, the VM count at both sites stays the
same until the recovery points of the deleted VMs expire. For example, when two VMs have
failed over, the recovery plans page at the recovery site shows four VMs (two replicated
recovery points from the source and two newly generated recovery points). The page shows
four VMs even if the VMs are deleted from the primary or recovery site. The VM count
synchronizes and becomes consistent in the subsequent RPO cycle, conforming to the
retention policy set in the protection policy (due to the expiration of recovery points).

5. Click Failover.
The Failover from Recovery Plan dialog box lists the errors and warnings, if any, and allows
you to stop or continue the failover operation. If there are no errors or you resolve the errors
in step 6, the guest VMs failover to the recovery cluster.

6. If you see errors, do the following.

• To review errors or warnings, click View Details in the description.


Resolve the error conditions and then restart the failover procedure.
• Select one of the following.

• To stop the failover operation, click Abort.


• To continue the failover operation despite the warnings, click Execute Anyway.

Note: You cannot continue the failover operation when the validation fails with errors.

Note:

The entities of AHV/ESXi clusters recover at a different path on the ESXi clusters if
their files conflict with the existing files on the recovery ESXi cluster. For example,
there is a file name conflict if a VM (VM1) migrates to a recovery cluster that already
has a VM (VM1) in the same container.
However, the entities recover at a different path with VmRecoveredAtAlternatePath
alert only if the following conditions are met.

• Prism Element on both the primary and the recovery Nutanix clusters must be
version 5.17 or newer.
• A path for the entity recovery is not defined while initiating the failover operation.
• The protected entities do not have shared disks.
If these conditions are not satisfied, the failover operation fails.

Note: To avoid conflicts when the primary site becomes active after the failover, shut down
the guest VMs associated with this recovery plan. Manually power off the guest VMs on either
primary or recovery site after the failover is complete. You can also block the guest VMs
associated with this recovery plan through the firewall.

Monitoring a Failover Operation (Xi Leap)

After you trigger a failover operation, you can monitor failover-related tasks. To monitor a
failover, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Click the name of the recovery plan for which you triggered failover.

4. Click the Tasks tab.


The left pane displays the overall status. The table in the details pane lists all the running
tasks and their individual statuses.

UEFI and Secure Boot Support for CHDR


Nutanix supports CHDR migrations of guest VMs having UEFI and Secure Boot.

Table 28: Nutanix Software - Minimum Requirements

• Minimum AOS: 5.19.1
• Minimum Prism Central (PC): pc.2021.1
• Minimum NGT: 2.1.1

Table 29: Applications and Operating Systems Requirements - UEFI

Microsoft Windows
• Microsoft Windows 10
• Microsoft Windows Server 2016
• Microsoft Windows Server 2019

Linux
• CentOS Linux 7.3
• Ubuntu 18.04
• Red Hat Enterprise Linux Server versions 7.1 and 7.7

Table 30: Applications and Operating Systems Requirements - Secure Boot

Microsoft Windows
• Microsoft Windows Server 2016
• Microsoft Windows Server 2019

Linux
• CentOS Linux 7.3
• Red Hat Enterprise Linux Server version 7.7

Table 31: Recovery Limitations

Microsoft Windows Defender Credential Guard
• VMs that have Credential Guard enabled cannot be recovered with the CHDR recovery
solution.

IDE + Secure Boot
• VMs on ESXi that have IDE disks or a CD-ROM and Secure Boot enabled cannot be
recovered on AHV.

UEFI VMs on CentOS 7.4, CentOS 7.5, and Ubuntu 16.04 may fail to boot after CHDR
migration.
• CentOS 7.4, CentOS 7.5, and Ubuntu 16.04 UEFI VMs do not boot after cross-hypervisor
disaster recovery migrations. See KB-10633 for more information about this limitation.
Contact Nutanix Support for assistance with this limitation.

UEFI VM may fail to boot after failback.
• When a UEFI VM is booted on AHV for the first time, the UEFI firmware settings of the
VM are initialized. The next step is to perform a guest reboot or guest shutdown to fully
flush the settings into persistent storage in the NVRAM. If this UEFI VM is failed over to
an ESXi host without performing the guest reboot or shutdown, the UEFI settings of the
VM remain partial. Although the VM boots on ESXi, it fails to boot on AHV when a failback
is performed. See KB-10631 for more information about this limitation. Contact Nutanix
Support for assistance with this limitation.

Protection with NearSync Replication and DR (Xi Leap)


NearSync replication enables you to protect your data with an RPO of as low as 1 minute.
You can configure a protection policy with NearSync replication by defining the VMs or VM
categories. The policy creates a recovery point of the VMs in minutes (1–15 minutes) and
replicates it to Xi Cloud Services. You can configure disaster recovery with NearSync
replication between on-prem AHV or ESXi clusters and Xi Cloud Services. You can also
perform cross-hypervisor disaster recovery (CHDR): disaster recovery of VMs from AHV
clusters to ESXi clusters, or of VMs from ESXi clusters to AHV clusters.

Note: Nutanix provides multiple disaster recovery (DR) solutions to secure your environment.
See Nutanix Disaster Recovery Solutions on page 11 for the detailed representation of the DR
offerings of Nutanix.

The following are the advantages of NearSync replication.

• Protection for mission-critical applications: secures your data with minimal data loss
if there is a disaster, and provides more granular control during the recovery process.
• No minimum network latency or distance requirements.

Note: However, a maximum of 75 ms network latency is allowed for replication between an
AHV cluster and Xi Cloud Services.

• Low stun time for VMs with heavy I/O applications.
Stun time is the duration for which an application freezes while the recovery point is taken.
• Allows resolution of a disaster event in minutes.
To implement the NearSync feature, Nutanix has introduced a technology called lightweight
snapshots (LWSs). LWS recovery points are created at the metadata level only, and they
continuously replicate incoming data generated by workloads running on the active clusters.
LWS recovery points are stored in the LWS store, which is allocated on the SSD tier. When you
configure a protection policy with NearSync replication, the system allocates the LWS store
automatically.

Note: The maximum LWS store allocation for each node is 360 GB. For the hybrid systems, it is
7% of the SSD capacity on that node.
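The sizing rule above can be expressed as a quick calculation. The SSD capacity below is a made-up example value, and the interpretation that the 7% figure for hybrid systems is capped at the 360 GB per-node maximum is an assumption for this sketch.

```shell
# Sketch: LWS store allocation for a hybrid node (all values in GB).
SSD_CAPACITY_GB=4000                       # example value, not from the guide
LWS_CAP_GB=360                             # per-node maximum from the guide
LWS_GB=$(( SSD_CAPACITY_GB * 7 / 100 ))    # 7% of the SSD tier on the node
if [ "$LWS_GB" -gt "$LWS_CAP_GB" ]; then
    LWS_GB=$LWS_CAP_GB                     # assumed cap at the per-node maximum
fi
echo "LWS store: ${LWS_GB} GB"
```

For a node with 4,000 GB of SSD capacity this gives 280 GB; a node with more than about 5,142 GB of SSD would hit the 360 GB cap.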

Transitioning in and out of NearSync


When you configure a protection policy with NearSync replication, the policy remains in an
hourly schedule until its transition into NearSync is complete.
To transition into NearSync, initial seeding of the recovery site with the data is performed,
the recovery points are taken on an hourly basis, and replicated to the recovery site. After
the system determines that the recovery points containing the seeding data have replicated
within a specified amount of time (default is an hour), the system automatically transitions
the protection policy into NearSync depending on the bandwidth and the change rate. After
you transition into NearSync, you can see the configured NearSync recovery points in the web
interface.
The following are the characteristics of the process.

• Until you are transitioned into NearSync, you can see only the hourly recovery points in
Prism Central.
• If for any reason, a VM transitions out of NearSync, the system raises alerts in the Alerts
dashboard, and the protection policy transitions out to the hourly schedule. The system
continuously tries to get to the NearSync schedule that you have configured. If the transition
is successful, the protection policy automatically transitions back into NearSync, and alerts
specific to this condition are raised in the Alerts dashboard.
To transition out of NearSync, you can do one of the following.

• Delete the protection policy with NearSync replication that you have configured.
• Update the protection policy with NearSync replication to use an hourly RPO.

• Unprotect the VMs.

Note: Adding or deleting a VM does not transition the protection policy with NearSync
replication out of NearSync.

Repeated transitioning in and out of NearSync can occur because of the following reasons.

• LWS store usage is high.


• The change rate of data is high for the available bandwidth between the primary and the
recovery sites.
• Internal processing of LWS recovery points is taking more time because the system is
overloaded.

Retention Policy
Depending on the RPO (1–15 minutes), the system retains the recovery points for a specific
amount of time. For protection policy with NearSync replication, you can configure the
retention policy for days, weeks, or months on both the primary and recovery sites instead of
defining the number of recovery points you want to retain. For example, if you desire an RPO
of 1 minute and want to retain the recovery points for 5 days, the following retention policy is
applied.

• For every 1 minute, a recovery point is created and retained for a maximum of 15 minutes.

Note: Only the 15 most recent recovery points are visible in Prism Central and available
for the restore operation.

• For every hour, a recovery point is created and retained for 6 hours.
• One daily recovery point is created and retained for 5 days.
You can also define recovery point retention in weeks or months. For example, if you configure
a 3-month schedule, the following retention policy is applied.

• For every 1 minute, a recovery point is created and retained for 15 minutes.
• For every hour, a recovery point is created and retained for 6 hours.
• One daily recovery point is created and retained for 7 days.
• One weekly recovery point is created and retained for 4 weeks.
• One monthly recovery point is created and retained for 3 months.

Note:

• You can define different retention policies on the primary and recovery sites.
• The system retains subhourly and hourly recovery points for 15 minutes and 6 hours
respectively. Maximum retention time for days, weeks, and months is 7 days, 4
weeks, and 12 months respectively.
• If you change the protection policy configuration from hourly schedule to minutely
schedule (Asynchronous to NearSync), the first recovery point is not created
according to the new schedule. The recovery points are created according to
the start time of the old hourly schedule (Asynchronous). If you want to get the
maximum retention for the first recovery point after modifying the schedule, update
the start time accordingly for NearSync.

Disaster Recovery (Formerly Leap) | Protection and DR between On-Prem Site and Xi Cloud Service
(Xi Leap) | 214
NearSync Replication Requirements (Xi Leap)
The following are the specific requirements for configuring protection policies with a NearSync
replication schedule in Xi Leap. Ensure that you meet these requirements in addition to
the general requirements of Xi Leap.
For more information about the general requirements of Xi Leap, see Xi Leap Requirements on
page 142.
For information about the on-prem node, disk and Foundation configurations required to
support NearSync replication schedules, see On-Prem Hardware Resource Requirements on
page 14.

Hypervisor Requirements
AHV or ESXi clusters running AOS 5.17 or newer, each registered to a different Prism Central.

• The on-prem AHV clusters must be running AHV version 20190916.189 or newer.
• The on-prem ESXi clusters must be running ESXi 6.5 GA or newer.

Nutanix Software Requirements


The on-prem Prism Central and its registered clusters (Prism Elements) must be running the
following versions of AOS.

• AOS 5.17 or newer with AHV.


• AOS 5.17 or newer with ESXi.

Cross Hypervisor Disaster Recovery (CHDR) Requirements


Data protection with NearSync replication supports cross-hypervisor disaster recovery (CHDR).
You can configure disaster recovery to recover VMs from AHV clusters to ESXi clusters, or VMs
from ESXi clusters to AHV clusters, provided you meet the following CHDR requirements.

• The on-prem clusters are running AOS 5.18 or newer.


• Install and configure Nutanix Guest Tools (NGT) on all the guest VMs. For more information,
see Enabling and Mounting Nutanix Guest Tools in Prism Web Console Guide.
NGT configures the guest VMs with all the required drivers for VM portability. For more
information about general NGT requirements, see Nutanix Guest Tools Requirements and
Limitations in Prism Web Console Guide.
• CHDR supports guest VMs with flat files only.
• CHDR supports IDE/SCSI disks only.

Tip: From AOS 5.19.1, CHDR supports SATA disks also.

• For all the non-boot SCSI disks of Windows guest VMs, set the SAN policy to OnlineAll so
that they come online automatically.
• In vSphere 6.7, guest VMs are configured with UEFI secure boot by default. Upon CHDR to
an AHV cluster, these guest VMs do not start if the host does not support the UEFI secure
boot feature. For more information about supportability of UEFI secure boot on Nutanix
clusters, see Compatibility Matrix.

• For information about operating systems that support UEFI and Secure Boot, see UEFI and
Secure Boot Support for CHDR on page 211.

• Nutanix does not support vSphere inventory mapping (for example, VM folder and resource
pools) when protecting workloads between VMware clusters.

• Nutanix does not support vSphere snapshots or delta disk files. If you have delta disks
attached to a VM and you proceed with failover, you get a validation warning and the VM
does not recover. Contact Nutanix Support for assistance.

Note: CHDR does not preserve hypervisor-specific properties (for example, multi-writer flags,
independent persistent and non-persistent disks, changed block tracking (CBT), PVSCSI disk
configurations).

Note: In vSphere 6.7, guest VMs are configured with UEFI secure boot by default. Upon CHDR to
AHV, these guest VMs do not start if the host does not support the UEFI secure boot feature.
For more information about the support of UEFI secure boot on Nutanix clusters, see
https://portal.nutanix.com/page/documents/compatibility-interoperability-matrix/guestos.

Table 32: Operating Systems Supported for CHDR

Windows (only 64-bit operating systems are supported):

• Windows 2008 R2 or newer
• Windows 7 or newer

Linux (SLES operating system is not supported):

• CentOS 6.5 and 7.0
• RHEL 6.5 or newer and RHEL 7.0 or newer
• Oracle Linux 6.5 and 7.0
• Ubuntu 14.04

Additional Requirements

• Both the primary and the recovery clusters must have a minimum of three nodes.
• See On-Prem Hardware Resource Requirements on page 14 for the on-prem hardware and
Foundation configurations required to support NearSync replication schedules.
• Set the virtual IP address and the data services IP address in the primary and the recovery
clusters.
• The recovery site container must have at least as much space as the working set size of the
protected VMs on the primary site. For example, if you are protecting a VM that uses 30 GB
of space on the container of the primary site, the same amount of space is required on the
recovery site container.
• The bandwidth between the two sites must be approximately equal to or higher than the
change rate of the protected VMs (maximum change rate is 20 MBps).
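As a rough sizing aid, the bandwidth requirement above can be checked with a quick calculation. This is an illustrative sketch (the function and sample values are hypothetical, not part of any Nutanix tooling):

```python
# Quick sizing check for a NearSync link (illustrative only).
# Requirement above: inter-site bandwidth must meet or exceed the
# aggregate change rate of the protected VMs, capped at 20 MBps.

MAX_SUPPORTED_CHANGE_RATE_MBPS = 20  # per the requirement above

def link_supports_nearsync(link_bandwidth_mbps, vm_change_rates_mbps):
    """Return True if the link can sustain the aggregate change rate.

    Both values are in megabytes per second; vm_change_rates_mbps is a
    list of per-VM average change rates.
    """
    total = sum(vm_change_rates_mbps)
    if total > MAX_SUPPORTED_CHANGE_RATE_MBPS:
        raise ValueError("aggregate change rate exceeds the 20 MBps NearSync maximum")
    return link_bandwidth_mbps >= total

# A 100 Mbps WAN link is 12.5 MBps; three VMs changing 4 MBps each
# (12 MBps total) fit, but a fourth VM at the same rate would not.
print(link_supports_nearsync(12.5, [4, 4, 4]))      # True
print(link_supports_nearsync(12.5, [4, 4, 4, 4]))   # False
```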

NearSync Replication Limitations (Xi Leap)
The following are the specific limitations of data protection with NearSync replication in Xi
Leap. These limitations are in addition to the general limitations of Xi Leap.
For information about the general limitations of Xi Leap, see Xi Leap Limitations on page 146.

• Deduplication enabled on storage containers having VMs protected with NearSync lowers
the replication speed.
• All files associated with the VMs running on ESXi must be located in the same folder as the
VMX configuration file. Files not located in the same folder as the VMX configuration file
might not recover on a recovery cluster. On recovery, a VM with such files fails to start
with the following error message:

  Operation failed: InternalTaskCreationFailure: Error creating host specific VM change power
  state task. Error: NoCompatibleHost: No host is compatible with the virtual machine

• In CHDR, NearSync replication does not support retrieving recovery points from the
recovery sites.
For example, if you have 1 day of retention at the primary site and 5 days of retention at the
recovery site, and you want to go back to a recovery point from 5 days ago, NearSync does
not support replicating that recovery point back from the recovery site to the primary site.

Creating a Protection Policy with NearSync Replication Schedule (Xi Leap)


Create a NearSync protection policy in the primary site Prism Central. The policy schedules
recovery points of the protected VMs as per the set RPO and replicates them to Xi Cloud
Services for availability. When creating a protection policy, you can specify only VM categories.
To include VMs individually, first create the protection policy (which can also include VM
categories), and then add the VMs individually to the protection policy from the VMs page.

Before you begin


Ensure that the AHV or ESXi clusters on both the primary and recovery site are NearSync
capable. A cluster is NearSync capable if the capacity of each SSD in the cluster is at least 1.2
TB.
See NearSync Replication Requirements (Xi Leap) on page 215 and NearSync Replication
Limitations (Xi Leap) on page 217 before you start.
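The NearSync capability rule stated above can be expressed as a one-line check. This is an illustrative sketch with a hypothetical function name, not Nutanix code:

```python
# Per the prerequisite above, a cluster is NearSync capable only if
# every SSD in the cluster has a capacity of at least 1.2 TB
# (illustrative check, not Nutanix code).

def cluster_is_nearsync_capable(ssd_capacities_tb):
    """Return True if every SSD capacity (in TB) is at least 1.2 TB."""
    return bool(ssd_capacities_tb) and all(cap >= 1.2 for cap in ssd_capacities_tb)

print(cluster_is_nearsync_capable([1.92, 1.92, 3.84]))  # True
print(cluster_is_nearsync_capable([1.92, 0.96]))        # False
```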

About this task


To create a protection policy with NearSync replication in Xi Cloud Services, perform the
following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Protection Policies in the left pane.

3. Click Create Protection Policy.
Specify the following information in the Create Protection Policy window.

Figure 120: Protection Policy Configuration: NearSync

a. Name: Enter a name for the policy.

Caution: The name can contain only alphanumeric, dot, dash, and underscore characters.

b. Primary Location: Select the primary availability zone that hosts the VMs to protect. This
list displays the Local AZ by default and is unavailable for editing.
c. Primary Cluster(s): Select the cluster that hosts the VMs to protect.
d. Recovery Location: Select the recovery availability zone where you want to replicate the
recovery points.
If you do not select a recovery location, the local recovery points that are created by this
protection policy do not replicate automatically. You can, however, replicate recovery
points manually and use recovery plans to recover the VMs. For more information, see
Manual Disaster Recovery (Xi Leap) on page 228.
e. Target Cluster: Select the NearSync capable cluster where you want to replicate the
recovery points.
This field becomes available only if the recovery location is a physical remote site. If the
specified recovery location is an availability zone in Xi Cloud Services, the Target Cluster
field becomes unavailable because Xi Cloud Services selects a cluster for you. If the
specified recovery location is a physical location, you can select a cluster of your choice.

Caution: If the primary cluster contains an IBM Power Systems server, you cannot replicate
recovery points to Xi Cloud Services. However, you can replicate recovery points to the

on-prem target cluster if the target on-prem cluster also contains an IBM Power Systems
server.

Caution: Select auto-select from the drop-down list only if all the clusters at the recovery
site are NearSync capable.

f. Policy Type: Click Asynchronous.


g. Recovery Point Objective: Specify the frequency in minutes (anywhere between 1 and 15
minutes) at which you want recovery points to be taken.
By default, recovery point creation begins immediately after you create the protection
policy. If you want to specify when recovery point creation must begin, click Change, and
then, in the Start Time dialog box, do the following.

• Click Start from specific point in time.
• In the time picker, specify the time at which you want to start taking recovery points.
• Click Save.

Tip: NearSync also allows you to recover the data of the minute just before the unplanned
failover. For example, on a protection policy with 10 minute RPO, you can use the internal
lightweight snapshots (LWS) to recover the data of the 9th minute when there is an
unplanned failover.

h. Retention Policy: Specify the type of retention policy.

Figure 121: Roll-up Retention Policy

» Roll-up: Rolls up the recovery points as per the RPO and retention period into a
single recovery point at a site. For example, if you set the RPO to 1 hour, and the
retention time to 5 days, the 24 oldest hourly recovery points roll up into a single daily
recovery point (one recovery point = 24 hourly recovery points) after every 24 hours.

The system keeps one day (of rolled-up hourly recovery points) and 4 days of daily
recovery points.

Note:

• If the retention period is n days, the system keeps 1 day of RPO (rolled-up
hourly recovery points) and n-1 days of daily recovery points.
• If the retention period is n weeks, the system keeps 1 day of RPO, 1 week of
daily and n-1 weeks of weekly recovery points.
• If the retention period is n months, the system keeps 1 day of RPO, 1 week of
daily, 1 month of weekly, and n-1 months of monthly recovery points.
• If the retention period is n years, the system keeps 1 day of RPO, 1 week of
daily, 1 month of weekly, and n-1 months of monthly recovery points.

Note: The recovery points that are used to create a rolled-up recovery point are
discarded.

Tip: Use roll-up retention policies for anything with a longer retention period. Roll-up
policies are more flexible and automatically handle recovery point aging/pruning while
still providing granular RPOs for the first day.

Note: NearSync does not support Linear retention policies. When you enter a minutely
time unit in the Recovery Point Objective, the Roll-up retention policy is automatically
selected.

4. To specify the retention number for the sites, do the following.

a. Remote Retention: Specify the retention number for the remote site.
This field is unavailable if you do not specify a recovery location.
b. Local Retention: Specify the retention number for the local site.
If you select linear retention, the remote and local retention count represents the number
of recovery points to retain at any given time. If you select roll-up retention, these
numbers specify the retention period.

5. If you want to take application-consistent recovery points, select Take App-Consistent
Recovery Point.
Application-consistent recovery points ensure that application consistency is maintained in
the replicated recovery points. For application-consistent recovery points, install NGT on the
VMs running on AHV. For VMs running on ESXi, you can take application-consistent recovery
points without installing NGT, but the recovery points are hypervisor-based and lead to VM
stuns (temporarily unresponsive VMs).

Caution: Application-consistent recovery points fail for EFI-boot enabled Windows 2019
VMs running on ESXi when NGT is not installed. Nutanix recommends installing NGT on VMs
running on ESXi as well.

6. Associated Categories: To protect categories of VMs, perform the following.

Tip: Before associating VM categories to a protection policy, determine how you want to
identify the VMs you want to protect. If they have a common characteristic (for example,
the VMs belong to a specific application or location), check the Categories page to ensure
that both the category and the required value are available. Prism Central includes built-in
categories for frequently encountered applications such as MS Exchange and Oracle. You can
also create your custom categories. If the category or value you want is not available, first
create the category with the required values, or update an existing category so that it has
the values that you require. Doing so ensures that the categories and values are available for
selection when creating the protection policy. You can add VMs to the category either before
or after you configure the protection policy. For more information about VM categories, see
Category Management in the Prism Central Guide.

a. Click Add Categories.


b. Select the VM categories from the list to add to the protection policy.

Note:
You cannot protect a VM by using two or more protection policies. Therefore, VM
categories specified in another protection policy are not listed here. Also, if you
included a VM in another protection policy by specifying the category to which it
belongs (category-based inclusion), and if you add the VM to this policy by using
its name (individual inclusion), the individual inclusion supersedes the category-
based inclusion. Effectively, the VM is protected only by this protection policy and
not by the protection policy in which its category is specified.
For example, the guest VM VM_SherlockH is in the category Department:Admin,
and you add this category to the protection policy named PP_AdminVMs. Now,
if you add VM_SherlockH from the VMs page to another protection policy named
PP_VMs_UK, VM_SherlockH is protected in PP_VMs_UK and unprotected from
PP_AdminVMs.

c. Click Save.

Tip: To add or remove categories from the existing protection policy, click Update.

7. Click Save.
You have successfully created a protection policy with NearSync replication in Xi Leap. You
can add VMs individually (without VM categories) to the protection policy or remove VMs
from the protection policy. For information about the operations that you can perform on a
protection policy, see Protection Policy Management on page 221.
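The roll-up retention rule described in the Retention Policy step above can be sketched as a small lookup. This models only the n-1 rule quoted in that step's note for days, weeks, and months (the years case is omitted); it is an illustrative model, not Nutanix code.

```python
# Illustrative sketch of the roll-up retention rule (not Nutanix code).
# Given a retention period, it returns how long each granularity of
# rolled-up recovery point is kept, per the note in the procedure above.

def rollup_tiers(n, unit):
    """Tiers kept for a roll-up policy with a retention period of
    n 'days', 'weeks', or 'months'."""
    if unit == "days":
        return {"rpo": "1 day", "daily": f"{n - 1} days"}
    if unit == "weeks":
        return {"rpo": "1 day", "daily": "1 week", "weekly": f"{n - 1} weeks"}
    if unit == "months":
        return {"rpo": "1 day", "daily": "1 week",
                "weekly": "1 month", "monthly": f"{n - 1} months"}
    raise ValueError("unit must be days, weeks, or months")

# Example from the procedure: a 1-hour RPO retained for 5 days keeps
# one day of hourly (RPO) points plus 4 days of daily points.
print(rollup_tiers(5, "days"))  # {'rpo': '1 day', 'daily': '4 days'}
```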

Creating a Recovery Plan (Xi Leap)


Create a recovery plan in the primary Prism Central. The procedure for creating a recovery plan
is the same for all the data protection strategies in Xi Leap.
For more information about creating a recovery plan in Xi Leap, see Creating a Recovery Plan
(Xi Leap) on page 193.

Protection Policy Management


A protection policy automates the creation and replication of recovery points. When
configuring a protection policy for creating local recovery points, you specify the RPO,
retention policy, and the VMs that you want to protect. You also specify the recovery location if
you want to automate recovery point replication to Xi Cloud Services.
When you create, update, or delete a protection policy, it synchronizes to the paired Xi Cloud
Services. The recovery points automatically start replicating in the reverse direction after
you perform a failover at the recovery Xi Cloud Services. For information about how Xi Leap
determines the list of availability zones for synchronization, see Entity Synchronization Between
Paired Availability Zones on page 229.

Note: A VM cannot be simultaneously protected by a protection domain and a protection policy.


If you want to use a protection policy to protect a VM that is part of a protection domain, first
remove the VM from the protection domain, and then include it in the protection policy. For
information, see Migrating Guest VMs from a Protection Domain to a Protection Policy on
page 232.

Adding Guest VMs Individually to a Protection Policy


You can also add VMs directly to a protection policy from the VMs page, without the use of
a VM category. To add VMs directly to a protection policy in Xi Cloud Services, perform the
following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to VMs in the left pane.

3. Click Protect from the Actions drop-down menu.

Figure 122: Protect VMs Individually

4. Select the protection policy in the table to include the VMs in a protection policy.

Figure 123: Protection Policy Selection

5. Click Protect.
The VMs are added to the selected protection policy. The updated protection policy starts
synchronizing to the recovery Prism Central.

Removing Guest VMs Individually from a Protection Policy


You can remove guest VMs directly from a protection policy on the VMs page. To remove
guest VMs from a protection policy in Xi Cloud Services, perform the following procedure.

About this task

Note: If a guest VM is protected individually (not through VM categories), you can remove it
from the protection policy only by using this individual removal procedure.

Note: If a guest VM is protected under a VM category, you cannot remove the guest VM from
the protection policy with this procedure. You can remove the guest VM from the protection
policy only by dissociating the guest VM from the category.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to VMs in the left pane.

3. Select the guest VMs that you want to remove from a protection policy.

4. Click UnProtect from the Actions drop-down menu.


The selected guest VMs are removed from the protection policy. The updated protection
policy starts synchronizing to the recovery Prism Central.

Note: Delete all the recovery points associated with the guest VM to avoid incurring
subscription charges. The recovery points adhere to the expiration period set in the protection
policy and, unless deleted individually, continue to incur charges until they expire.

Cloning a Protection Policy
If the requirements of the protection policy that you want to create are similar to an existing
protection policy in Xi Cloud Services, you can clone the existing protection policy and update
the clone.

About this task


To clone a protection policy from Xi Cloud Services, perform the following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Protection Policies in the left pane.

3. Select the protection policy that you want to clone.

4. Click Clone from the Actions drop-down menu.

5. Make the required changes on the Clone Protection Policy page. For information about the
fields on the page, see:

• Creating a Protection Policy with Asynchronous Replication Schedule (Xi Leap) on page 182
• Creating a Protection Policy with NearSync Replication Schedule (Xi Leap) on page 217

6. Click Save.
The selected protection policy is cloned. The updated protection policy starts synchronizing
to the recovery Prism Central.

Updating a Protection Policy


You can modify an existing protection policy in the Xi Cloud Services. To update an existing
protection policy in Xi Cloud Services, perform the following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Protection Policies in the left pane.

3. Select the protection policy that you want to update.

4. Click Update from the Actions drop-down menu.

5. Make the required changes on the Update Protection Policy page. For information about the
fields on the page, see:

• Creating a Protection Policy with Asynchronous Replication Schedule (Xi Leap) on page 182
• Creating a Protection Policy with NearSync Replication Schedule (Xi Leap) on page 217

6. Click Save.
The selected protection policy is updated. The updated protection policy starts
synchronizing to the recovery Prism Central.

Finding the Protection Policy of a Guest VM
You can use the Data Protection focus on the VMs page to determine the protection policy to
which a VM belongs. To find the protection policy of a VM in Xi Cloud Services, do the
following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to VMs in the left pane.

3. Click Data Protection from the Focus menu at the top-right corner.
The Protection Policy column that is displayed shows the protection policy to which the VMs
belong.

Figure 124: Focus

4. After you review the information, remove the Data Protection focus filter from the filter text
box to return the VMs page to the previous view.

Recovery Plan Management


A recovery plan orchestrates the recovery of protected VMs at a recovery site. Recovery plans
are predefined procedures (runbooks) that use stages to enforce VM power-on sequence. You
can also configure the inter-stage delays to recover applications gracefully. Recovery plans that
recover applications in Xi Cloud Services are also capable of creating the required networks
during failover and can assign public-facing IP addresses to VMs.
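The stage-based power-on sequencing with inter-stage delays described above can be sketched as a simple runbook loop. This is an illustrative model with hypothetical names, not Leap code:

```python
# Minimal sketch of a runbook-style power-on sequence with inter-stage
# delays, as described above (illustrative only; not Leap code).
import time

def run_power_on_sequence(stages, power_on, delay_fn=time.sleep):
    """Power on VMs stage by stage.

    `stages` is a list of (vm_names, delay_after_seconds) tuples;
    `power_on` starts a single VM by name.
    """
    started = []
    for vm_names, delay_after in stages:
        for name in vm_names:      # all VMs in a stage start together
            power_on(name)
            started.append(name)
        if delay_after:            # wait before the next stage, e.g. to
            delay_fn(delay_after)  # let a database tier come up first
    return started

# Example: bring up the database tier, wait 60 s, then the app tier.
order = run_power_on_sequence(
    [(["db-01"], 60), (["app-01", "app-02"], 0)],
    power_on=lambda vm: print(f"powering on {vm}"),
    delay_fn=lambda seconds: None,  # skip the real sleep in this demo
)
print(order)  # ['db-01', 'app-01', 'app-02']
```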
A recovery plan created in one availability zone (site) replicates to the paired availability zone
and works bidirectionally. After a failover from the primary site to a recovery site, you can
failback to the primary site by using the same recovery plan.
After you create a recovery plan, you can validate or test it to ensure that recovery goes
through smoothly when failover becomes necessary. Xi Cloud Services includes a built-in VPC
for validating or testing failover.
Recovery plans are independent of protection policies and do not reference protection policies
in their configuration information. Also, they do not create recovery points. While the process of
planned failover includes the creation of a recovery point so that the latest data can be used for
recovery, unplanned and test failovers rely on the availability of the required recovery points at
the designated recovery site. A recovery plan therefore requires the VMs in the recovery plan to
also be associated with a protection policy.
Recovery plans are synchronized to one or more paired sites when they are created, updated,
or deleted. For information about how Leap determines the list of availability zones (sites) for
synchronization, see Entity Synchronization Between Paired Availability Zones on page 229.

Adding Guest VMs Individually to a Recovery Plan


You can also add VMs directly to a recovery plan from the VMs page, without the use of a VM
category. To add VMs directly to a recovery plan in Xi Cloud Services, perform the following
procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to VMs in the left pane.

3. Select the VMs that you want to add to a recovery plan.

4. Click Add to Recovery Plan from the Actions drop-down menu.


The Add to Recovery Plan dialog box appears.

5. Select the recovery plan where you want to add the VMs in the Add to Recovery Plan
dialog box.

6. Click Add.
The Update Recovery Plan dialog box appears.

7. In the General tab, verify the Recovery Plan Name and Recovery Plan Description, and click Next.

8. In the Power On Sequence tab, add the VMs to a stage. For more information, see Stage
Management on page 64.

9. Click Next.

10. In the Network Settings tab, update the network settings as required for the newly added
VMs. For more information, see Creating a Recovery Plan (Xi Leap) on page 193.

11. Click Done.


The VMs are added to the recovery plan.

Removing Guest VMs Individually from a Recovery Plan


You can also remove VMs directly from a recovery plan in Xi Cloud Services. To remove VMs
directly from a recovery plan, perform the following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Select the recovery plan from which you want to remove VMs.

4. Click Update from the Actions drop-down menu.


The Update Recovery Plan dialog box appears.

5. In the General tab, verify the Recovery Plan Name and Recovery Plan Description. Click Next.

6. In the Power On Sequence tab, select the VMs and click More Actions > Remove.

Note: You see More Actions in a stage only when one or more VMs in the stage are selected.
When none of the VMs in the stage are selected, you see Actions.

7. Click Next.

8. In the Network Settings tab, update the network settings as required. For more information,
see Stage Management on page 64.

9. Click Done.
The VMs are removed from the selected recovery plan.

Updating a Recovery Plan


You can update an existing recovery plan in Xi Cloud Services. To update a recovery plan,
perform the following procedure.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Select the recovery plan that you want to update.

4. Click Update from the Actions drop-down menu.


The Update Recovery Plan dialog box appears.

5. Make the required changes to the recovery plan. For information about the various fields and
options, see Creating a Recovery Plan (Xi Leap) on page 193.

6. Click Done.
The selected recovery plan is updated.

Validating a Recovery Plan


You can validate a recovery plan from the recovery site. For example, if you perform the
validation in Xi Cloud Services (with the primary site being an on-prem site), Leap validates
failover from the on-prem site to Xi Cloud Services. Recovery plan validation only reports
warnings and errors; no failover is performed. In this procedure, you specify which of the two
paired sites to treat as the primary location, and the other site becomes the recovery location.

About this task


To validate a recovery plan, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to Recovery Plans in the left pane.

3. Select the recovery plan that you want to validate.

4. Click Validate from the Actions drop-down menu.

5. In the Validate Recovery Plan dialog box, do the following.

a. In Primary Location, select the primary location.


b. In Recovery Location, select the recovery location.
c. Click Proceed.
The validation process lists any warnings and errors.

6. Click Back.
A summary of the validation is displayed. You can close the dialog box.

7. To return to the detailed results of the validation, click the link in the Validation Errors
column.
The selected recovery plan is validated for its correct configuration.

Manual Disaster Recovery (Xi Leap)


Manual data protection involves manually creating recovery points, manually replicating
recovery points, and manually recovering the VMs at the recovery site. You can also automate
some of these tasks. For example, the last step—that of manually recovering VMs at the
recovery site—can be performed by a recovery plan while the underlying recovery point
creation and replication can be performed by protection policies. Conversely, you can configure
protection policies to automate recovery point creation and replication and recover VMs at the
recovery site manually.

Creating Recovery Points Manually (Out-of-Band Snapshots)

About this task


To create recovery points manually in Xi Cloud Services, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to VMs in the left pane.

3. Select the VMs for which you want to create a recovery point.

4. Click Create Recovery Point from the Actions drop-down menu.

5. To verify that the recovery point was created, click the name of the VM, and then click the
Recovery Points tab.

Replicating Recovery Points Manually


You can manually replicate recovery points only from the site where the recovery points exist.

About this task


To replicate recovery points manually from Xi Cloud Service, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to VMs in the left pane.

3. Click the VM whose recovery point you want to replicate, and then click Recovery Points on
the left.
The Recovery Points view lists all the recovery points of the VM.

4. Select the recovery points that you want to replicate.

5. Click Replicate from the Actions drop-down menu.

6. In the Replicate dialog box, do the following.

a. In Recovery Location, select the location where you want to replicate the recovery point.
b. In Target Cluster, select the cluster where you want to replicate the recovery point.
c. Click Replicate Recovery Point.

Recovering a Guest VM from a Recovery Point Manually (Clone)


You can recover a VM by cloning a VM from a recovery point.

About this task


To recover a VM from a recovery point at Xi Cloud Services, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click Explore and go to VMs in the left pane.

3. Click the VM that you want to recover, and then click Recovery Points on the left.
The Recovery Points view lists all the recovery points of the VM.

4. Select the recovery point from which you want to recover the VM.

5. Click Restore from the Actions drop-down menu.

6. In the Restore dialog box, do the following.

a. In the text box provided for specifying a name for the VM, specify a new name, or keep
the automatically generated name.
b. Click Restore.

Warning: The following are the limitations of the manually recovered VMs (VMs recovered
without the use of a recovery plan).

• The VMs recover without a VNIC if the recovery is performed at the remote site.
• VM categories are not applied.
• NGT must be reconfigured.

Entity Synchronization Between Paired Availability Zones


When paired with each other, availability zones (sites) synchronize disaster recovery
configuration entities. Paired sites synchronize the following disaster recovery configuration
entities.

Protection Policies
A protection policy is synchronized whenever you create, update, or delete the
protection policy.
Recovery Plans
A recovery plan is synchronized whenever you create, update, or delete the recovery
plan. The list of availability zones (sites) to which Xi Leap must synchronize a recovery
plan is derived from the VMs included in the plan, both through VM categories and as
individually added VMs.
If you specify VM categories in a recovery plan, Leap determines which protection
policies use those VM categories, and then synchronizes the recovery plan to the
availability zones specified in those protection policies.
If you include VMs individually in a recovery plan, Leap uses the recovery points of those
VMs to determine which protection policies created those recovery points, and then
synchronizes the recovery plan to the availability zones specified in those protection
policies.

If you create a recovery plan for VM categories or VMs that are not associated
with a protection policy, Leap cannot determine the availability zone list and therefore
cannot synchronize the recovery plan. If a recovery plan includes only individually added
VMs and a protection policy associated with a VM has not yet created VM recovery
points, Leap cannot synchronize the recovery plan to the availability zone specified in
that protection policy.

However, recovery plans are monitored every 15 minutes for the
availability of recovery points that can help derive availability zone information. When
recovery points become available, Xi Leap derives the availability zone by the process
described earlier and synchronizes the recovery plan to that availability zone.
VM Categories used in Protection Policies and Recovery Plans
A VM category is synchronized when you specify the VM category in a protection policy
or recovery plan.
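The availability-zone derivation described above for recovery plans can be sketched in a few lines. This is an illustrative model only; the data structures are assumptions, not Leap internals.

```python
# Illustrative model of the availability-zone derivation described
# above; the data structures are assumptions, not Leap internals.

def derive_sync_targets(recovery_plan, protection_policies, recovery_points):
    """Derive the availability zones to which a recovery plan must be
    synchronized, from its VM categories and individually added VMs."""
    targets = set()
    # VM categories: find protection policies that use the same categories.
    for policy in protection_policies:
        if set(policy["categories"]) & set(recovery_plan["categories"]):
            targets.update(policy["availability_zones"])
    # Individually added VMs: find the policy via existing recovery points.
    for vm in recovery_plan["vms"]:
        for rp in recovery_points:
            if rp["vm"] == vm:
                targets.update(rp["policy"]["availability_zones"])
    return targets  # empty set: the plan cannot be synchronized yet
```

An empty result models the case described above, where a plan references only VMs or categories that no protection policy (or no recovery point) is associated with yet.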
Issues such as a loss of network connectivity between paired availability zones, or user actions
such as unpairing availability zones and then pairing them again, can affect entity
synchronization.

Tip: Nutanix recommends unprotecting all the VMs on an availability zone before unpairing
it, to avoid a state where entities have stale configurations after the availability zones are
paired again.

If you update VMs in either or both availability zones before such issues are resolved or before
unpaired availability zones are paired again, VM synchronization is not possible. Also, during VM
synchronization, if a VM cannot be synchronized because of an update failure or conflict (for
example, you updated the same VM in both availability zones during a network connectivity
issue), no further VMs are synchronized. Entity synchronization can resume only after you
resolve the error or conflict. To resolve a conflict, use the Entity Sync option, which is available
in the web console. Force synchronization from the availability zone that has the desired
configuration. Forced synchronization overwrites conflicting configurations in the paired
availability zone.

Note: Forced synchronization cannot resolve errors arising from conflicting values in VM
specifications (for example, the paired availability zone already has a VM with the same name).

If you do not update entities before a connectivity issue is resolved or before you pair
the availability zones again, the synchronization behavior described earlier resumes. Also,
pairing previously unpaired availability zones triggers an automatic synchronization event. For
recommendations that help you avoid such issues, see Entity Synchronization Recommendations (Xi
Leap) on page 231.
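The stop-on-conflict behavior described above (one conflicting entity blocks all further synchronization until it is resolved) can be modeled as a simple loop. The entity shapes here are illustrative assumptions.

```python
# Sketch of the stop-on-conflict behavior described above; the entity
# shapes and conflict test are illustrative assumptions.

def synchronize(entities, has_conflict):
    """Synchronize entities in order; stop at the first conflict so
    that no further entities are synchronized until it is resolved."""
    synced, pending = [], []
    for entity in entities:
        if pending or has_conflict(entity):
            pending.append(entity)  # this conflict (or an earlier one) blocks the entity
        else:
            synced.append(entity)
    return synced, pending

synchronize(["plan-a", "plan-b", "plan-c"], lambda e: e == "plan-b")
# → (['plan-a'], ['plan-b', 'plan-c'])
```

This is why resolving a single conflict (for example, with forced synchronization) can unblock every remaining entity at once.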

Entity Synchronization Recommendations (Xi Leap)
Consider the following recommendations to avoid inconsistencies and the resulting
synchronization issues.

• During network connectivity issues, do not update entities at both the availability zones
(sites) in a pair. You can safely make updates at any one site. After the connectivity issue is
resolved, force synchronization from the site in which you made updates. Failure to adhere
to this recommendation results in synchronization failures.
You can safely create entities at either or both the sites as long as you do not assign
the same name to entities at the two sites. After the connectivity issue is resolved, force
synchronization from the site where you created entities.
• If one of the sites becomes unavailable, or if any service in the paired site is down, perform
forced synchronization from the paired availability zone after the issue is resolved.

Forcing Entity Synchronization (Xi Leap)


Entity synchronization, when forced from an availability zone (site), overwrites the
corresponding entities in the paired sites. Forced synchronization also creates, updates, and
removes entities in the paired sites as required.
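The overwrite semantics described above can be summarized as "the source site's configuration wins". A minimal sketch follows; representing each site's entities as a dict keyed by entity name is an assumption for illustration.

```python
# Sketch of forced-synchronization semantics: the source site's
# configuration wins. Entity dicts keyed by name are an assumption.

def force_sync(source_entities, target_entities):
    """Return the paired site's entity set after a forced sync: an
    exact copy of the source site. Conflicting entities are
    overwritten, missing ones created, and extra ones removed."""
    del target_entities  # the paired site's conflicting state does not survive
    return dict(source_entities)

site_a = {"policy-1": {"rpo": "1h"}, "plan-1": {"stage": 2}}
site_b = {"policy-1": {"rpo": "24h"}, "old-plan": {"stage": 1}}
force_sync(site_a, site_b)
# → copy of site_a: 'policy-1' overwritten, 'plan-1' created, 'old-plan' removed
```

This is why you must force synchronization only from the site that holds the desired configuration: anything that exists only on the paired site is discarded.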

About this task


The availability zone (site) to which a particular entity is forcefully synchronized depends on
which site requires the entity (see Entity Synchronization Between Paired Availability Zones
on page 229). To avoid inadvertently overwriting required entities, make sure that you force
entity synchronization from the site in which the entities have the desired configuration.
If a site is paired with two or more availability zones (sites), you cannot select individual sites
with which to synchronize entities.
To force entity synchronization from Xi Cloud Services, do the following.

Procedure

1. Log on to Xi Cloud Services.

2. Click the settings button (gear icon) at the top-right corner of the window.

3. Click Entity Sync in the menu.

4. In the Entity Sync dialog box, review the message at the top of the dialog box, and then do
the following.

a. To review the list of entities that will be synchronized to an availability zone, click the
number of entities adjacent to that availability zone.
b. After you review the list of entities, click Back.

5. Click Sync Entities.

MIGRATING GUEST VMS FROM
A PROTECTION DOMAIN TO A
PROTECTION POLICY
You can protect a guest VM either with a protection domain in Prism Element or with a
protection policy in Prism Central. If you have guest VMs in protection domains, migrate those
guest VMs to protection policies to orchestrate their disaster recovery using Leap.

Before you begin


Migration from protection domains to protection policies is a disruptive process. For a successful
migration,

• Ensure that the guest VMs have no ongoing replication.
• Ensure that the guest VMs do not have volume groups.
• Ensure that the guest VMs are not in consistency groups.

About this task


To migrate a guest VM from a protection domain to a protection policy manually, perform the
following procedure.

Tip: To automate the migration using a script, see KB 10323.

Procedure

1. Unprotect the guest VM from the protection domain.

Caution: Do not delete the guest VM snapshots in the protection domain. Prism Central reads
those guest VM snapshots to generate new recovery points without full replication between
the primary and recovery Nutanix clusters. If you delete the guest VM snapshots, the VM data
replicates afresh (full replication). Nutanix recommends keeping the VM snapshots in the
protection domain until the first recovery point for the guest VM is available on Prism Central.

Caution: Use the automated script for migrating guest VMs from a large protection domain. A
large protection domain consists of more than 500 guest VMs. If you migrate the guest VMs
manually from a large protection domain, the VM data replicates afresh (full replication).

2. Log on to Prism Central and protect the guest VMs with protection policies individually (see
Adding Guest VMs individually to a Protection Policy on page 128) or through VM categories.
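Before starting step 1, the "Before you begin" prerequisites can be checked per VM. The sketch below models that pre-flight check; the VM record fields are illustrative assumptions, not a Nutanix API.

```python
# Pre-flight sketch for the "Before you begin" prerequisites above;
# the VM record fields are illustrative assumptions, not a Nutanix API.

def migration_blockers(vm):
    """Return the reasons a protection-domain VM is not yet safe to
    migrate to a protection policy."""
    blockers = []
    if vm.get("replication_in_progress"):
        blockers.append("ongoing replication")
    if vm.get("volume_groups"):
        blockers.append("attached volume groups")
    if vm.get("consistency_group"):
        blockers.append("member of a consistency group")
    return blockers

migration_blockers({"volume_groups": ["vg1"]})
# → ['attached volume groups']
```

Only VMs for which this check returns an empty list are candidates for the manual unprotect-and-reprotect procedure described in steps 1 and 2.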

COPYRIGHT
Copyright 2022 Nutanix, Inc.
Nutanix, Inc.
1740 Technology Drive, Suite 150
San Jose, CA 95110
All rights reserved. This product is protected by U.S. and international copyright and intellectual
property laws. Nutanix and the Nutanix logo are registered trademarks of Nutanix, Inc. in the
United States and/or other jurisdictions. All other brand and product names mentioned herein
are for identification purposes only and may be trademarks of their respective holders.

