White Paper

Site Recovery: DR Automation and Orchestration

Dangers of Disaster for Modern Businesses


Today’s information-driven business world operates according to the “always-on” principle:
the uninterrupted delivery of goods or services is what separates companies that
dominate the market from those that lag behind. Continuous availability not only helps
companies prosper but also helps them survive: 94% of companies that suffer significant
data loss are unable to fully recover; 43% never resume business operations, and 51% close
within the next two years [1].

[Figure: Outcomes after significant data loss. 6% of companies successfully recover, 43% never reopen, and 51% close within 2 years.]

The larger the company, the costlier the downtime and, thus, the greater the importance
of preparedness for unforeseen circumstances ranging from natural disasters to simple
hardware failures. Knowing how to minimize (or even prevent) the negative impacts of such
disaster scenarios lays the foundation for a resilient and successful company.

This White Paper explores disaster recovery planning as a means of mitigating the negative
consequences of disasters, with a special focus on how NAKIVO’s advanced Site Recovery
functionality can assist in this area.

Defining the Needs of the Always-On Business


The goals of increasing profit and the customer base stay relevant at all times. However, to
become the always-on business and keep up with modern standards, the company needs to
focus on ensuring availability and maintaining business continuity.

[1] “Management information systems for the information age,” Haag, Cummings & McCubbrey (2005)


Ensuring Availability
Information technology has been instrumental in shaping the modern business landscape. Every
part of a modern company’s infrastructure is connected with IT in one way or another, and
constant access to the World Wide Web is essential. Email, VoIP, CRM, and instant messaging
must be online at all times just for the business to survive, let alone prosper. That is why
traditional high availability (i.e., 90% availability) is no longer enough today. A 90% availability
rate might seem high, but it would mean that your customers cannot use your services for
36.5 days per year.
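
The 36.5-day figure follows directly from the availability percentage. As a minimal sketch (the function name and the sample percentages other than 90% are illustrative assumptions, not figures from this paper), the conversion can be written as:

```python
HOURS_PER_YEAR = 365 * 24  # non-leap year, matching the 36.5-day figure in the text


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the expected hours of downtime per year for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100 - availability_percent) / 100 * HOURS_PER_YEAR


# 90% comes from the text; the other levels are shown only for comparison.
for pct in (90.0, 99.0, 99.9, 99.99):
    hours = annual_downtime_hours(pct)
    print(f"{pct}% available: {hours / 24:.2f} days ({hours:.1f} hours) of downtime per year")
```

Each additional "nine" of availability cuts the yearly downtime by a factor of ten, which is why availability targets are usually stated that way.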

At the same time, many companies operate across several locations, with data centers spread
all over the world, which makes ensuring the availability of the entire infrastructure extremely
difficult and costly. Simply ensuring the protection of data, such as files or directories, is not
enough. Complex applications responsible for business-critical processes are now running on
virtual machines (VMs). If those applications are disrupted for any reason and you are unable
to restore them, then your business may be at risk.

As such, ensuring high availability for the always-on business is a considerable challenge from
the IT perspective. However, even though constant availability might be difficult to achieve in
itself, there is still another thing to consider – business continuity.

Maintaining Business Continuity


A company with a business continuity plan has a set of processes and procedures in place
that allow critical services, such as customer support, sales, or accounting, to operate
without disruptions through a disaster scenario.

While “availability” is mostly data- and IT-centric, continuity is distinctly business-centric.
High availability entails having a resilient infrastructure in place to ensure there are no
interruptions to business processes; business continuity is about making sure that the
company can still operate after and even during the disaster.

Unfortunately, far fewer companies are concerned about possible disasters than one would
expect, given that they can occur at any moment and threaten the company’s prosperity.
This emphasizes the importance of understanding the consequences that a disaster
scenario could have on your business.

Key terms

Availability in business is the amount of time that a service or resource is fully available
for its intended use.

Business continuity is the strategic capability of the organization to plan for and respond
to incidents that cause business disruptions. It is a measure of their ability to continue
business operations at an acceptable level.


The Cost and Consequences of a Disaster


Depending on its scale, a disaster could have devastating effects for a company. These can be
split into three main categories: loss of revenue, loss of productivity, and loss of customers.

Revenue Loss
Heavy dependence on IT infrastructure and technologies certainly has the potential to
give companies a competitive advantage, but this reliance can also destroy the business
outright. Consider the following statistics, drawn from Aberdeen Group [2]:
• For small businesses, an hour of downtime costs around $8,580 in lost revenue,
all things considered.
• For medium-sized companies, the losses are greater; they can amount to $215,637 per
hour.
• Large enterprises can experience revenue loss that reaches $686,250 per hour of
downtime.

[Figure: Revenue loss per hour of downtime. Small companies: $8,580; medium-sized companies: $215,637; larger enterprises: $686,250.]

The length of an average outage is around 18.5 hours. Using the estimates above, this would
mean around $159,000 to $3,989,000 in lost revenue for SMBs and more than $12.6 million for
enterprises.
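
The per-outage totals are simply the hourly figures multiplied by the average outage length. A short sketch of that arithmetic, using the hourly costs and the 18.5-hour duration quoted above (the dictionary keys and function name are illustrative):

```python
# Hourly revenue loss figures quoted from Aberdeen Group, in US dollars.
HOURLY_COST_USD = {"small": 8_580, "medium": 215_637, "large": 686_250}
AVERAGE_OUTAGE_HOURS = 18.5


def outage_cost(hourly_cost: float, hours: float = AVERAGE_OUTAGE_HOURS) -> float:
    """Estimate the revenue lost over one outage of the given length."""
    return hourly_cost * hours


for size, hourly in HOURLY_COST_USD.items():
    print(f"{size}: ${outage_cost(hourly):,.0f} per average outage")
```

Substituting your own hourly revenue and a realistic outage length gives a first-order estimate of what a DR plan needs to protect.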

[2] “Building a Fast Lane to Better Data Center Performance,” Aberdeen Group (2016)


Productivity Loss
In addition to the catastrophic losses in terms of revenue, certain disasters can also impact
productivity. If your entire business workflow has stopped as a result of a disaster, then your
employees cannot do their jobs, which might mean that critical business operations cannot
be performed. In the US alone, productivity drops caused by failures translate into over $250
billion in losses annually [3].

Loss of Customers
Customers expect your company to deliver the necessary goods and services at any time
and under any circumstances. High levels of competition on the market have driven down
the prices on services, improved their quality, and, most importantly, vastly increased the
expectations that a customer has. When a company is facing a disaster, the customer won’t
wait for that company to recover; they’ll simply move on and choose one of the competitors.

The situation gets even worse in cases where the customers are losing their own money
because of the downtime. Companies that plan to stay in the game abide by the age-old
insight: “Retaining customers may be costly, but re-acquiring them can be tremendously more
expensive.”

Customers choose to use a company if they can reliably get the necessary services or goods
provided to them whenever and wherever they need. Things such as service outages can be
either prevented or swiftly recovered from. Even hurricanes and volcanos can be worked
around if the company sets its mind to providing truly continuous delivery of services.
Therefore, these circumstances should never be an excuse for not providing the services the
customer is paying for.

This is where companies face an important challenge – choosing the appropriate disaster
recovery (DR) solution.

[3] “Health and productivity among U.S. workers,” Commonwealth Fund (2005)


Choosing a DR Solution That Works


The two main DR planning approaches available to businesses are using Disaster Recovery
as a Service (DRaaS) and in-house DR planning. In the former case, you’ll need to engage a
third party providing such services; in the latter you may want to weigh the possible costs and
consider virtualizing your infrastructure.

Relying on DRaaS
This option places the protection of your IT infrastructure in the hands of a company that
specializes in DR. One of the main benefits of this type of DR is that you avoid needing to
create and maintain your own DR environment. This option costs much less than organizing
DR by yourself. Furthermore, your backups can be stored in the cloud, which increases
the probability of successfully restoring the necessary data in case of disaster.

Two of the biggest concerns associated with DRaaS are the security and privacy of the
company’s critical data when it is stored on the cloud servers of the third-party service
provider. You may never know if unauthorized personnel can gain access to your financial
records, important documents, etc.

To avoid these risks, companies often choose to manage their own DR. This involves having
a DR location (preferably geographically distant from the production site) and purchasing
a reliable DR software solution.

In-house DR Planning
Before choosing your DR solution, you must ensure that your company has taken all the
necessary steps to allow for the seamless DR process. These may include identifying key
business services potentially affected by disasters, performing a risk assessment and impact
analysis, determining RTOs and RPOs, designating a DR site for network/data failover, and
performing regular testing. Performing these steps takes a considerable amount of time and
resources. However, it may benefit the whole process in the long run and help you calculate
the total costs of the DR later on.

Understanding DR Costs
If you choose to handle disaster recovery by yourself, you will incur corresponding costs. As
mentioned, you can choose to put your DR operations in the hands of a DRaaS provider. If you
understand and choose to accept the risks mentioned above (or if you find a DRaaS provider
you can trust wholeheartedly), then the cost of DR can be made significantly lower.

On the other hand, if you choose the more reliable and personal approach of managing DR by
yourself, the following should be accounted for:

• Separate DR location. For maximal effectiveness, your DR location should be in a
different place (geographically speaking) than your production site. If it is not far enough
removed, the same disaster that takes out your primary site could also affect the DR site.
The hefty space requirements for your new hardware can get rather costly, especially
if the location is used exclusively for DR. Don’t forget about the bills and the need for
regular maintenance at your DR site.
• Setting up new hardware. The cost of this secondary infrastructure can vary depending
on the scale of operations (the number of servers, laptops, etc. you need).
• Maintaining the network. Besides an internet connection, you need to maintain
the stability of the network between the DR site and the production site.

Among the many approaches, there is one strategy particularly well suited to decreasing
the costs of your DR plan and increasing its effectiveness: virtualization.

Advantages of Virtualization
The virtualization practice has garnered much attention in recent years due to its efficiency as
well as its cost-effectiveness. One of the key benefits of virtualization is that you can significantly
reduce the amount of additional hardware needed by making efficient use of your existing
hardware. Indeed, by relying on virtualization one can reduce both server energy consumption
and floor space requirements by over 80% [4]. Additionally, backing up and restoring virtualized data
is much easier, since the server files are encapsulated in a single image file. This is especially useful
for DR, as your entire virtualized environment can be quickly restored at an off-site location.

Choosing to proceed with virtualization can make the DR process significantly easier.
However, you must still choose an appropriate DR solution that not only fits your budget, but
is versatile and reliable enough to work for any type of disaster.

Accounting for DR Complexity When Choosing a Solution


Before choosing your DR solution, you must ensure that it is scalable enough to accommodate
your business needs. Creating a DR plan that encompasses a handful of VMs located on a single
host is easy enough for an SMB. However, if you are an enterprise-level company housing
thousands of VMs on hundreds of hosts, you need to make sure that the software solution you
choose for disaster recovery is highly scalable and can accommodate all of your data reliably.

Furthermore, the DR software chosen must be flexible enough to orchestrate a disaster recovery
plan of any complexity at any time. The solution should also allow you to constantly keep track of,
update, and test the DR plan whenever you like without disrupting the production servers.

NAKIVO Backup & Replication can accommodate the DR needs of any business with its
advanced Site Recovery functionality.

[4] “Gartner Outlines Seven Practical Ways to Save Costs in the Data Center,” Pettey and Meulen (2009)


How Site Recovery Functionality Is Reinventing DR

NAKIVO Backup & Replication offers the benefits of powerful backup and replication
software combined with advanced Site Recovery functionality. Not only can the product
make sure your company’s critical data remains protected at all times, but it can enable you
to quickly resume business operations after a disaster. The solution is extremely scalable
and has enough versatility to accommodate the needs of any type of business, from SMBs
to enterprises. Site Recovery can be used for multiple purposes, including disaster recovery,
planned migration, etc.

How Site Recovery Works


NAKIVO Backup & Replication’s Site Recovery functionality allows you to construct a recovery
workflow (i.e., a Site Recovery job) from the available actions, such as:
• Starting or stopping VMware and Hyper-V VMs, as well as EC2 instances
• Performing Failover or Failback for VMware and Hyper-V VMs, as well as EC2 instances
• Running or stopping specific data protection jobs created in NAKIVO Backup & Replication
• Running specific scripts
• Attaching or detaching repositories
• Sending an email
• Waiting for a specified period of time

Tailor Site Recovery Jobs for Any Use Case or Scope


There are three main points to consider when exploring NAKIVO Backup & Replication’s Site
Recovery functionality: simplicity, reliability, and versatility. While some DR solutions require
you to spend hours upon hours configuring your virtual environment in preparation for
both actual DR and testing, Site Recovery works completely out of the box. Minimal setup is
required. Moreover, while perfectly suitable for a multitude of use cases, the functionality is
incredibly simple to use.

To name just a few examples:

• You could create a Site Recovery job to check the availability of your VMs every hour and
to inform you if any problems are detected.
• You could create a Site Recovery job that is simple enough and suitable for a small-scale
disaster. This job could include a simple Failover job to temporarily transfer key workloads
to the DR location. After that, the job can check whether the process has been completed
successfully and send you a report by email.
• You could create a multi-layered Site Recovery job for complex situations when you have
to deal with the consequences of a major disaster. This job could, for example, start or
stop specific jobs (e.g. replication jobs), run specific scripts when you need to fine-tune the
process, attach specific repositories for archival purposes, and even launch a different Site
Recovery job if necessary.

There are several things you may want to consider when it comes to Site Recovery
orchestration. Planning should be the first step as it may have a direct influence on the
complexity of your recovery workflow and testing procedure.

Planning the DR Process


There are a few basic steps to take before building the first Site Recovery workflow:
• Define the scope. Identify the apps, services, and VMs housing them that have higher
priority in terms of data protection planning.
• Define your RPOs. The RPO essentially defines how much data (measured in time) your
business could afford to lose as a result of a disaster.
• Define your RTO. The RTO determines the period of downtime your company can
“tolerate” before the data is recovered and operations are restored. Site Recovery in
NAKIVO Backup & Replication allows you to set an RTO and see if your target can be met
with the Site Recovery job, or whether it needs to be refined or readjusted.
• Define the recovery infrastructure and recovery resources. Performing a failover
requires a significant amount of RAM and CPU resources. Thus, besides identifying which
servers are going to be housing which VM replicas, etc., you need to make sure that there
won’t be any issues during the procedure hardware-wise.
• Decide upon your testing schedule. This step is crucial and may influence the creation
of Site Recovery jobs later on, by requiring you to make them more or less complex. As an
example, if you have a complex recovery workflow in place, you may require a significant
amount of CPU and RAM resources for testing. This may be detrimental to your business
processes. Thus, you may consider creating a less elaborate workflow to accommodate
the required testing schedule.
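
Two of the planning steps above lend themselves to a quick sanity check before any job is built. The following is a minimal sketch under simplifying assumptions: worst-case data loss is taken to be one full replication interval (ignoring the time a replication run itself takes), and DR host capacity is compared without CPU or memory overcommit. The function names and dictionary fields are hypothetical and not part of NAKIVO Backup & Replication.

```python
from datetime import timedelta


def meets_rpo(replication_interval: timedelta, rpo: timedelta) -> bool:
    """Check a replication schedule against the RPO.

    Assumption: if disaster strikes just before the next run, roughly one
    full interval of data is lost, so the interval must not exceed the RPO.
    """
    return replication_interval <= rpo


def fits_on_dr_host(replicas: list[dict], host_cpu_cores: int, host_ram_gb: int) -> bool:
    """Check whether all replicas can run on the DR host at the same time."""
    needed_cpu = sum(r["cpu_cores"] for r in replicas)
    needed_ram = sum(r["ram_gb"] for r in replicas)
    return needed_cpu <= host_cpu_cores and needed_ram <= host_ram_gb


# Hypothetical inventory used only to illustrate the checks.
replicas = [
    {"name": "crm", "cpu_cores": 4, "ram_gb": 16},
    {"name": "mail", "cpu_cores": 2, "ram_gb": 8},
]
print(meets_rpo(timedelta(hours=1), rpo=timedelta(hours=4)))
print(fits_on_dr_host(replicas, host_cpu_cores=8, host_ram_gb=32))
```

If either check fails, the options are to replicate more often, relax the RPO, shrink the recovery scope, or add capacity at the DR site.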

A comprehensive emergency Site Recovery job should include Automated Failover – a step
that transfers the workloads in your production environment to your VM replicas at the DR
location. However, in order for this to work, those replicas must be created beforehand.
Only after you have replicated the VMs that are going to be involved in the DR process, can
you move on to creating a Site Recovery workflow.

Creating a VM Replication Job


NAKIVO Backup & Replication allows you to create as many VM replicas as you require.
Considering which VMs require Failover (and thus replication) and which ones can make do
with a simple backup is a must.


The replication job wizard can guide you through the job creation process, asking you to
specify the necessary details along the way. While creating the replication job(s), please
consider the following:
• The container and the datastore for the replicas. These are crucial as these replicas
are going to run the workloads after Failover and should therefore be at a location
separate from the production site.
• The Network Mapping and Re-IP rules. Source and target network parameters often differ.
Network Mapping can ensure that VMs are connected to the right network upon failover.
The Re-IP feature can automatically assign new IPs to replicas at the DR location, following a
simple set of rules that you input. You can also create a virtual isolated network for testing the
Site Recovery job. Note that Network Mapping and Re-IP can also be configured as part of Site
Recovery job creation if the job has the Failover action. In case of any conflicts, the rules for
the Site Recovery job overrule the individual rules for a replication job.
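
To make the idea of a Re-IP rule concrete, here is a hedged sketch of one plausible rule shape: a replica keeps its host offset but moves from the source subnet to a target subnet of the same size. This is an illustration only; the rule format, function names, and addresses are assumptions and do not describe how NAKIVO Backup & Replication stores or applies its rules. The second function mirrors the precedence stated above, where Site Recovery job rules win over replication job rules on conflict.

```python
import ipaddress


def re_ip(address: str, source_net: str, target_net: str) -> str:
    """Map an address from the source subnet to the same offset in the target subnet."""
    src = ipaddress.ip_network(source_net)
    dst = ipaddress.ip_network(target_net)
    ip = ipaddress.ip_address(address)
    if ip not in src:
        raise ValueError(f"{address} is not in {source_net}")
    if src.prefixlen != dst.prefixlen:
        raise ValueError("source and target subnets must be the same size")
    offset = int(ip) - int(src.network_address)
    return str(dst.network_address + offset)


def effective_rules(replication_rules: dict, site_recovery_rules: dict) -> dict:
    """Merge rule sets; on conflict the Site Recovery job rule overrides."""
    return {**replication_rules, **site_recovery_rules}


# Hypothetical subnets: production uses 192.168.1.0/24, the DR site 10.30.30.0/24.
print(re_ip("192.168.1.25", "192.168.1.0/24", "10.30.30.0/24"))
```

Keeping the host part of the address unchanged makes it easier to recognize a replica at the DR site, although any deterministic mapping would serve the same purpose.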


Creating a Site Recovery Job


Creating a Site Recovery job is an easy, streamlined process. As mentioned earlier, you can
create multiple Site Recovery jobs for different purposes, varying in scale and complexity. By
choosing which actions to incorporate into specific jobs, you can create either a small Site
Recovery job solely for utility purposes or a multi-layered one to be used for complex situations.

For example, you can create a small Site Recovery job using two actions: Check condition
and Send email. By scheduling this job to run every 5 minutes, you can tell the solution to
periodically scan your virtual environment to check if your VMs are reachable and send you
an e-mail in case there are any problems so that you can launch your primary Site Recovery
job with Failover.

Also, by including specific actions in your Site Recovery jobs, you can account for
situations that other DR solutions would not. To illustrate, you can include a Stop VMs action
in the job to stop all of the unimportant VMs at the DR location, thus freeing up
valuable CPU and RAM resources for the upcoming Failover.

To help connect all these actions together, Site Recovery functionality uses Action Options.
These options are present when configuring every single step of your Site Recovery job. They
allow you to decide how NAKIVO Backup & Replication should act in different scenarios:
• Run this action in: Here, you can determine if the solution should run this action in
production mode only, in testing mode only, or in both modes. This allows you to fine-tune
your jobs for specific purposes or make them more general.
• Waiting behavior: You can decide whether NAKIVO Backup & Replication should wait
for the step to complete before proceeding or move on to the next step immediately after
initiating the action.
• Error handling: This is where you determine how the solution should handle any error
that can arise during the set action. You can have the product either stop and fail the job
if there are any issues or proceed to the next step despite them.
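
The way these options shape a run can be sketched with a small, purely illustrative executor. It models the "Run this action in" and "Error handling" options only; waiting behavior is not modeled, since every step here runs synchronously. The `Action` class, `run_job` function, and mode names are hypothetical and are not the product's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    name: str
    run: Callable[[], None]
    # Default mirrors the text: an action runs in both modes unless restricted.
    modes: frozenset = frozenset({"production", "test"})
    stop_on_error: bool = True  # False means "proceed to the next step despite errors"


def run_job(actions: list, mode: str) -> list:
    """Run actions in order for the given mode and return a (name, status) log."""
    if mode not in ("production", "test"):
        raise ValueError(f"unknown mode: {mode}")
    log = []
    for action in actions:
        if mode not in action.modes:
            log.append((action.name, "skipped"))
            continue
        try:
            action.run()
            log.append((action.name, "ok"))
        except Exception as exc:
            log.append((action.name, f"failed: {exc}"))
            if action.stop_on_error:
                break  # stop and fail the job
    return log


job = [
    Action("stop unimportant VMs", lambda: None),
    Action("send email", lambda: None, modes=frozenset({"production"})),
]
print(run_job(job, "test"))
```

Returning a log rather than raising on the first failure keeps the outcome of every step visible, which is the kind of information a test report would need.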


Keep in mind that if your Site Recovery job includes Failover or Failback actions, you may
want to configure Network Mapping and Re-IP options to allow for seamless transition of
workflows to VM replicas at the DR site.

At the end of the process you can configure test scheduling to make the whole testing procedure
fully automated or allow it to run only on demand. Additionally, in the Options section you can
set the RTO goal for the Site Recovery job, which will be useful for testing later on.

Testing a Site Recovery Job


Failover itself is a rare occurrence that only happens during emergencies. Meanwhile, virtual
environments change over time, as do network settings, internet service
providers, etc. With this in mind, Site Recovery jobs must be tested and updated regularly,
well before an actual Failover is needed. With Site Recovery functionality, you can
create jobs that are flexible enough to be used in both testing and production modes.


As mentioned earlier, each separate action can be configured to run in production mode, in
testing mode, or in both. If you have left the default options for the actions that you created,
all actions will run in both modes, so your entire Site Recovery job can be run in testing or in
production mode.

You can initiate a test run by selecting Test site recovery job in the small menu that pops up
when you run the Site Recovery job. Additionally, you can set a different RTO
for the test run if needed.

NAKIVO Backup & Replication is going to send you a comprehensive report on the test run of
the Site Recovery job if you have enabled this option in the job settings earlier. Analyzing the
report is critical to see if all actions in the job were completed properly, how long they took,
and if you were able to meet your RTO goals with this test run. In case there are problems,
you can update the job at any time to accommodate your needs.
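
The analysis described above boils down to comparing measured step durations with the RTO goal. A minimal sketch, assuming the steps run one after another and that their durations have been read from the test report by hand (the function name, result fields, and sample numbers are hypothetical):

```python
from datetime import timedelta


def evaluate_test_run(step_durations: dict[str, timedelta], rto: timedelta) -> dict:
    """Summarize a test run: total time, whether the RTO was met, and the slowest step."""
    if not step_durations:
        raise ValueError("no steps to evaluate")
    total = sum(step_durations.values(), timedelta())
    slowest = max(step_durations, key=step_durations.get)
    return {"total": total, "rto_met": total <= rto, "slowest_step": slowest}


# Hypothetical durations for a three-step job with a 30-minute RTO goal.
report = evaluate_test_run(
    {
        "stop unimportant VMs": timedelta(minutes=2),
        "failover": timedelta(minutes=21),
        "send email": timedelta(seconds=5),
    },
    rto=timedelta(minutes=30),
)
print(report)
```

When the RTO is missed, the slowest step is usually the first place to look for something to simplify or run in parallel.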


Also, note that if any Failover took place during the test, the corresponding Failback is
automatically carried out after testing is complete, and all workloads return to their original
location. Testing mode was designed to be non-disruptive and does not affect the production
IT infrastructure, provided it is run in an isolated network.

Running a Site Recovery Job in Production Mode


A Site Recovery job can be started like any other NAKIVO Backup & Replication job by
selecting Run Job from the main UI. However, if your Site Recovery job includes the Failover
action, then the solution is going to prompt you to select which type of Failover you are
planning to perform.

You have the following options:

• Planned Failover. This option should be selected when you still have time to perform one
final data sync before switching workloads over to the replicas at the DR location. As an
example, you should use this method if you know that there could be a power outage in
30 minutes and you want to optimize the failover process.
• Emergency Failover. With this option, the solution transfers your workloads to the replicas
immediately. You should select this option when you need to perform the Site Recovery
job urgently because of an unexpected disaster, and have no time to spare for a final data
sync.

The main difference in running a Site Recovery job with the Failover action, in production
mode as opposed to testing, is that there is going to be no automated Failback after the job
has finished running. To perform the Failback later on you need to create a different Site
Recovery job for this purpose.
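
The differences between the two Failover types and the two run modes can be summarized in a few lines. This is an illustrative sketch only: the step labels and function are hypothetical, and the zero data-at-risk value for a planned Failover assumes the final sync completes before the switch.

```python
from datetime import timedelta


def failover_plan(failover_type: str, mode: str, since_last_replication: timedelta):
    """Return (ordered steps, data at risk) for a Failover run."""
    if failover_type not in ("planned", "emergency"):
        raise ValueError(f"unknown failover type: {failover_type}")
    if mode not in ("production", "test"):
        raise ValueError(f"unknown mode: {mode}")
    steps = []
    if failover_type == "planned":
        steps.append("final data sync")
        data_at_risk = timedelta(0)  # assumes the final sync finishes
    else:
        # No final sync: changes made since the last replication run are lost.
        data_at_risk = since_last_replication
    steps.append("switch workloads to replicas")
    if mode == "test":
        steps.append("automatic failback")  # production runs need a separate Failback job
    return steps, data_at_risk


print(failover_plan("emergency", "production", timedelta(minutes=30)))
```

The data-at-risk value is the quantity to hold against the RPO when deciding how often to replicate.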

Performing Failback
Failback is the process of moving the workloads back from the replicas to their source
VMs. This action can be included in a Site Recovery job and can be used whenever
necessary.

When you configure the Failback action, you can choose the location of the Failback.
By default, this is the original source location (e.g., your main office or data center).
However, should the production site still be unavailable (e.g., as a result of a fire burning down
the whole office), you can transfer the workloads to a new long-term location instead.

Before you proceed with Failback, however, you may want to consider performing reverse
replication. This basically means creating a replication job and selecting the VM replicas at
the DR location as your source VMs. Such a procedure may be crucial for syncing the data
between VMs before the Failback.

Conclusion
In today’s fast-paced business world, successful companies operate on an “always-on” basis.
Ensuring availability as well as business continuity is crucial for any company that wants to
retain its customers and avoid losing revenue. Considering that a disaster, whether a natural
disaster or a hardware failure, can occur at any point in time, virtualizing your infrastructure
and creating a strong disaster recovery plan are integral to the survival of your business.

NAKIVO Backup & Replication v8.0 introduces the advanced Site Recovery functionality
that redefines traditional DR with its versatility and reliability. Site Recovery is extremely
scalable and flexible, letting you have multiple Site Recovery jobs active to accommodate
every conceivable scenario. You can perform non-disruptive testing and carry out planned
migration, as well as build a Site Recovery job that you activate for fast recovery after a
disaster. NAKIVO Backup & Replication with Site Recovery functionality can be your personal
solution for achieving the “always-on” status your business needs to stay competitive.


NAKIVO Backup & Replication at a Glance


NAKIVO Backup & Replication is a fast, reliable, and affordable VM backup solution.
The product protects VMware, Hyper-V, and AWS EC2 environments. NAKIVO Backup &
Replication offers advanced features that increase backup performance, improve reliability,
and speed up recovery.

Deploy in under 1 minute
Pre-configured VMware VA and AWS AMI; 1-click deployment on ASUSTOR, QNAP,
Synology, WD NAS, and NETGEAR; 1-click Windows installer, 1-command Linux installer

Reduce backup size
Incremental backups with CBT/RCT, LAN-free data transfer, network acceleration; up to 2X
performance when installed on NAS

Protect VMs
Native, agentless, image-based, application-aware backup and replication
for VMware, Hyper-V VMs, as well as AWS EC2 instances

Decrease recovery time
Instant recovery of VMs, files, Exchange objects, SQL objects,
Active Directory objects; automated Site Recovery

Increase backup speed
Exclusion of swap files and partitions, global backup deduplication, adjustable
backup compression

Ensure recoverability
Instant backup verification with screenshots of test-recovered VMs;
backup copy offsite/to the cloud

About NAKIVO
The winner of a “Best of VMworld 2018” and the Gold Award for Data Protection, NAKIVO is a US
corporation dedicated to developing the ultimate VM backup and site recovery solution. With 20
consecutive quarters of double-digit growth, 5-star online community reviews, 97.3% customer
satisfaction with support, and more than 10,000 deployments worldwide, NAKIVO delivers an
unprecedented level of protection for VMware, Hyper-V, and Amazon EC2 environments.

As a unique feature, NAKIVO Backup & Replication runs natively on leading storage systems
including QNAP, Synology, ASUSTOR, Western Digital, and NETGEAR to deliver up to 2X
performance advantage. The product also offers support for high-end deduplication appliances
including Dell/EMC Data Domain and NEC HYDRAstor. Being one of the fastest-growing data
protection software vendors in the industry, NAKIVO provides a data protection solution for
major companies such as Coca-Cola, Honda, and China Airlines, and works with over
3,000 channel partners in 137 countries worldwide. Learn more at www.nakivo.com

© 2018 NAKIVO, Inc. All rights reserved. All trademarks are the property of their respective owners.
