003 Technical Deep Dive HX
Lenovo HX is a hyperconverged solution, packed with compute nodes, a hypervisor, storage and
all the advanced features: seamless scalability, non-disruptive upgrades, replication, tiering,
management, and monitoring with self-healing capabilities and one-click operation for the majority of
daily tasks. It's a software-defined IaaS. Because it's built with software on top of hardware,
it decouples software from hardware and consequently allows users to scale to
hybrid cloud, public cloud and private cloud. When everything is in software, we have the
flexibility to scale like the big, well-known internet companies such as Google or Facebook…
2
Web-scale is a set of architectural principles and technology concepts. Infrastructure built
around these principles is called web-scale infrastructure.
You will have your own way of introducing web-scale. I like to talk about Google, Facebook
etc. and how they have no SAN storage. When they want storage they buy servers; when
they want routers they buy servers… Everything is done in software.
3
Lenovo HX line of products, powered by the Nutanix Xtreme Computing Platform, consists of two
major layers, Acropolis and Prism:
- Acropolis offers IT professionals the flexibility to choose the best application platform
technology for their organization – whether it is traditional hypervisors, emerging
hypervisors or containers. With Nutanix Acropolis, infrastructure decisions can be made
based on application performance, scalability and economic considerations, while
allowing workloads to move seamlessly without penalty should requirements change.
Nutanix Acropolis comprises three foundational components:
- Distributed Storage Fabric (DSF) – Built on the Nutanix Distributed File System, it enables
common web-scale services across multiple storage protocols. It supports mounting
volumes through iSCSI for applications with specific protocol requirements, and it
includes a powerful erasure coding algorithm, EC-X, which is unique in hyperconverged
environments
- App Mobility Fabric – An open environment, capable of delivering powerful virtual
machine (VM) placement, VM migration, and VM conversion, as well as
cross-hypervisor high availability and integrated disaster recovery. It supports most
virtualized applications, and will provide a more seamless path to containers and
hybrid cloud computing.
- Acropolis Hypervisor (AHV) – While the Distributed Storage Fabric fully supports
traditional hypervisors such as VMware vSphere and Microsoft Hyper-V, Acropolis
also includes a native hypervisor based on the proven Linux KVM hypervisor. With
enhanced security, self-healing capabilities based on SaltStack and enterprise-grade
VM management, Acropolis Hypervisor delivers a better overall user experience at a
lower TCO, and will be the first hypervisor to plug into the App Mobility Fabric.
- Prism – Prism is an enhanced solution from Nutanix that brings simplicity to
infrastructure management. Prism features innovative One-Click technology that
streamlines time-consuming IT tasks, and includes one-click software upgrades for
more efficient maintenance, one-click insight for detailed capacity trend analysis
and planning, and one-click troubleshooting for rapid issue identification and
resolution. Nutanix Prism delivers better value to IT administrators as a result of:
- Convergence of storage, compute and virtualization resources into a unified
system to provide an end-to-end view of all workflows – something difficult
to achieve with legacy three-tier architectures;
- Advanced machine learning technology with built-in heuristics and business
intelligence to easily and quickly mine large volumes of system data and
generate actionable insights for enhancing all aspects of infrastructure
performance; and
- True consumer-grade user experience with sophisticated search technology
that makes management tasks elegantly simple and intuitive, with no need
for specialized training.
- And also important – APIs! – RESTful API support for automation (upward
integration…)
4
Lenovo HX appliances – suited for every need!
Special and unique is the Express offering – it scales from three to four nodes with
very attractive pricing and the majority of Acropolis and Prism functionality.
Never forget – for best connectivity within Lenovo HX clusters we need to use high-speed
10Gb Ethernet switches with low latency. Choose from the Lenovo offering, where we have
switches that can provide latencies below 600 ns!
5
Current offering – mind the pre-defined HW configurations.
Basically, we have a node for every load. They can be mixed within the cluster. If you need a VDI
solution, you can choose several nodes with dedicated GPUs. There is one for storage; there
are nodes with more drives, more RAM, more cores…
6
Acropolis and Prism are built on open-source and proven components
- Zeus is the Nutanix library that all other components use to access the cluster configuration,
which is currently implemented using Apache Zookeeper.
- Zookeeper runs on either three or five nodes, depending on the redundancy factor that is
applied to the cluster. Of these nodes, one Zookeeper node is elected as the leader.
- Medusa is a Nutanix abstraction layer that sits in front of the database that holds this
metadata. The database is distributed across all nodes in the cluster, using a modified form
of Apache Cassandra.
- Cassandra is a distributed, high-performance, scalable database that stores all metadata
about the guest VM data stored in a Nutanix datastore. Cassandra runs on all nodes of the
cluster. Cassandra depends on Zeus to gather information about the cluster configuration.
- From the perspective of the hypervisor, Stargate is the main point of contact for the Nutanix
cluster. All read and write requests are sent across vSwitchNutanix to the Stargate process
running on that node. Stargate depends on Medusa to gather metadata and Zeus to gather
cluster configuration data.
- A Curator master node periodically scans the metadata database and identifies cleanup and
optimization tasks that Stargate or other components should perform. The analysis of the
metadata is shared across the other Curator nodes using a MapReduce algorithm. Curator
depends on Zeus to learn which nodes are available, and on Medusa to gather metadata.
Based on that analysis, it sends commands to Stargate.
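To make the Curator flow more concrete, here is a minimal MapReduce-style sketch in Python. It is purely illustrative: the extent records, thresholds and task names are invented, not Nutanix internals.

```python
# Minimal MapReduce-style sketch of a metadata scan (illustrative only;
# the record format and task names are invented, not Nutanix internals).
from collections import defaultdict

# Each Curator node "maps" over its shard of metadata rows and emits
# (task, item) pairs for work it identifies.
def map_phase(metadata_shard):
    for extent in metadata_shard:
        if extent["refcount"] == 0:
            yield ("garbage_collect", extent["id"])
        elif extent["tier"] == "ssd" and extent["days_cold"] > 6:
            yield ("tier_down", extent["id"])

# The master "reduces" the emitted pairs into per-task work lists
# that would then be handed to Stargate for execution.
def reduce_phase(emitted_pairs):
    tasks = defaultdict(list)
    for task, extent_id in emitted_pairs:
        tasks[task].append(extent_id)
    return tasks

shard = [
    {"id": "e1", "refcount": 0, "tier": "hdd", "days_cold": 30},
    {"id": "e2", "refcount": 2, "tier": "ssd", "days_cold": 10},
]
print(dict(reduce_phase(map_phase(shard))))
# {'garbage_collect': ['e1'], 'tier_down': ['e2']}
```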
7
A minimum Lenovo HX cluster (powered by Nutanix) begins with three nodes. The recommendation
is that the first three nodes are of the same type.
On every node there is a Controller VM (CVM) that, regardless of hypervisor (ESXi, Hyper-V or
AHV), provides the complete Nutanix stack: storage across nodes is handled with DSF, Prism
runs in the CVMs, and the CVM handles data locality (meaning that data used by VMs is always on the node
where the VMs run). The software within the CVM also provides data path redundancy, allowing the cluster,
together with the DSF replication factor (RF), to sustain the loss of a CVM (or its temporary absence
due to upgrades), the loss of a full node (with RF2), or the loss of two nodes (with RF3).
Because a lot is going on within the cluster and because the CVMs communicate constantly
with each other, we need low-latency connectivity between them.
8
Storage is not handled in a traditional way (RAID arrays across nodes etc.) but with DSF. DSF
includes ILM functionality like tiering and high availability with replication factor 2 or 3 (with
RF2, we store two copies of data – one copy on the local node to preserve data locality and one
copy within the cluster on another, most appropriate node – while with RF3 we store three
copies).
Since RF2 requires double the amount of raw storage and RF3 even triple the
amount, we use the power of web-scale infrastructure – the power of SDS (software-defined
storage) and of Intel CPUs – to reclaim as much effective space as possible. We can use:
- Deduplication
- Compression
- EC-X (erasure coding) – data older than 6 days is re-striped across nodes with parity, similar to
RAID 5 – more on the next two slides, and see the capacity sketch below
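As a back-of-the-envelope illustration, here is the capacity arithmetic for RF versus EC-X, assuming 100 TB of raw capacity and the 4+1 / 4+2 stripe widths described a couple of slides further on:

```python
# Back-of-the-envelope usable capacity (illustrative arithmetic only).
raw_tb = 100.0  # assumed total raw capacity in the cluster

# Replication Factor: N full copies of every piece of data.
rf2_usable = raw_tb / 2          # 50 TB
rf3_usable = raw_tb / 3          # ~33 TB

# EC-X on cold data: a 4+1 stripe (RF2 clusters) stores 4 data blocks
# plus 1 parity block; a 4+2 stripe (RF3 clusters) adds 2 parities.
ecx_4p1_usable = raw_tb * 4 / 5  # 80 TB
ecx_4p2_usable = raw_tb * 4 / 6  # ~67 TB

print(f"RF2: {rf2_usable:.0f} TB, EC-X 4+1: {ecx_4p1_usable:.0f} TB")
print(f"RF3: {rf3_usable:.0f} TB, EC-X 4+2: {ecx_4p2_usable:.0f} TB")
```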
9
Once RF2 or RF3 is not needed (for cold data), we switch to EC-X. Think of EC-X like RAID, but
because we do not work with drives but rather with blocks within DSF, we can refer to it as
RAIN (Redundant Array of Independent Nodes). It's powered by software and runs in the
background of the distributed Nutanix stack.
10
http://www.nutanix.com/2015/06/17/erasure-coding-x-ec-x-predictably-increase-usable-storage-capacity/
Replication Factor (RF) is a quick and efficient technique of creating multiple (2 or 3) data
copies across the cluster, making the cluster highly resilient with the ability to tolerate up to
two simultaneous node (server) failures without downtime or data loss. Replication factor is
also the king of rebuilds. If a node or disk fails, the cluster rebuilds back to the desired
resiliency level using the power of all spare resources in the cluster, far quicker than any
traditional method. However, all these benefits come at a cost. Replication Factor, as the name
implies, creates data copies and therefore consumes more capacity than traditional RAID 5 or
6 schemes.
This is where Nutanix EC-X fits in.
EC-X overcomes the capacity cost of Replication Factor without taking away any of the
benefits. EC-X is an implementation of Erasure Coding and therefore works by creating a
mathematical function around a data set such that if a member of the data set is lost, the lost
data can be recovered easily from the rest of the members of the set.
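The principle can be demonstrated with the simplest possible erasure code, plain XOR parity. This is a generic illustration of the recovery idea, not the actual EC-X algorithm:

```python
# Simplest erasure-coding illustration: XOR parity (generic, not EC-X itself).
data_blocks = [b"\x01\x02", b"\x0a\x0b", b"\xff\x00"]

# Parity = XOR of all data blocks, byte by byte.
parity = bytes(a ^ b ^ c for a, b, c in zip(*data_blocks))

# Lose one block; recover it by XOR-ing the survivors with the parity.
lost = data_blocks[1]
recovered = bytes(a ^ c ^ p for a, c, p in
                  zip(data_blocks[0], data_blocks[2], parity))
assert recovered == lost  # the "lost" block is reconstructed exactly
```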
11
The minimum number of nodes to enable EC-X is three when using RF2, and four when using RF3.
The maximum stripe EC-X will create is four data stripes + one parity for RF2, and four data
stripes + two parities for RF3.
12
Lenovo HX is a unified solution when it comes to storage. It utilizes the Nutanix Distributed File
System, which is capable of managing billions of files, so in later versions a File Server service was
easily added to the stack.
13
Unlimited local snapshots on primary cluster with Time Stream
Back-up to public cloud
Quick restore and state recovery
WAN-optimized replication for DR
Works with ESXi and Hyper-V
14
How snapshots are handled – in Nutanix, snapshots have byte-level resolution for maximum
granularity.
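A toy sketch of what byte-level granularity buys us: on each overwrite, only the exact byte range being changed needs to be preserved for the snapshot. This illustrates the concept only; it is not the Nutanix on-disk format, and the class names are invented.

```python
# Toy snapshot with byte-level granularity (illustrative only; not the
# Nutanix on-disk format).
class Vdisk:
    def __init__(self, data: bytearray):
        self.data = data
        self.snapshots = []  # each snapshot: {offset: original_bytes}

    def snapshot(self):
        self.snapshots.append({})

    def write(self, offset: int, new: bytes):
        # Preserve only the exact byte range being overwritten.
        if self.snapshots:
            snap = self.snapshots[-1]
            snap.setdefault(offset, bytes(self.data[offset:offset + len(new)]))
        self.data[offset:offset + len(new)] = new

disk = Vdisk(bytearray(b"hello world"))
disk.snapshot()
disk.write(6, b"there")
print(disk.data)            # bytearray(b'hello there')
print(disk.snapshots[-1])   # {6: b'world'} -- only the changed bytes kept
```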
15
How snapshots are handled – in Nutanix, snapshots have byte-level resolution for maximum
granularity.
16
Use Cases:
• Protection against VM corruption/deletion
• Protection against complete site failure(s)
• Backup
Points of differentiation
• Multiple DR topologies
• Multiple retention policies
• WAN optimized replication
• REST API-driven (custom DR runbooks)
• Usability
• VM and application level consistency
• Distributed
17
Use Cases:
• Protection against VM corruption/deletion
• Protection against complete site failure(s)
• Backup
Points of differentiation
• Multiple DR topologies
• Multiple retention policies
• WAN optimized replication
• REST API-driven (custom DR runbooks)
• Usability
• VM and application level consistency
• Distributed
18
Use Cases:
• Protection against complete site failure(s)
• No data loss in case of a site failure
• No Backup
Points of differentiation
• Easy to set up and manage
• Interop with async replication (3rd site)
• REST API-driven (custom DR runbooks)
19
Use Cases:
• Protection against complete site failure(s)
• No data loss in case of a site failure
• No Backup
Points of differentiation
• Easy to set up and manage
• Interop with async replication (3rd site)
• REST API-driven (custom DR runbooks)
20
Use Cases:
• Archiving
• Backup
Points of differentiation
• Easy to set up and manage
• WAN optimized replication
• Interop with Nutanix DR portfolio
21
Use Cases:
• Archiving
• Backup
Points of differentiation
• Easy to set up and manage
• WAN optimized replication
• Interop with Nutanix DR portfolio
22
The better way involves enabling the Admin to give control of a VM’s snapshots to the end
user.
This way an authorized end user can see only his/her snapshots within the guest, mount
the appropriate snapshot to the guest (where it shows up as a drive letter) and then copy over
lost files.
23
Cross Hypervisor DR: What is it?
Disaster Recovery
• Native two-way VM configuration conversion when migrating or failing over
• Native One-click migration between sites
• Scripted Workflow customization
24
The main window of Prism. This is the starting point where the majority of one-click
management is done.
Prism is always available, since it runs within the distributed Nutanix cluster. From three nodes to
thousands of nodes, Prism is reachable through a single cluster management IP via a web browser.
It is a true HTML5 application.
It gives the user the ability to provision machines (when using AHV), create and manage
replication schedules, schedule VM snapshots, and manage storage. It is also a "single pane of
glass" over the Lenovo HX / Nutanix cluster, with monitoring capabilities, historical data analysis,
HW and SW maintenance and error logging, and it can help you plan future growth based on
trends in current performance data.
Because Prism is part of the system, it enables the user to manage the whole environment: the HW
that Nutanix runs on (monitoring of HW components like HDDs, RAM, CPUs, PSUs…),
maintenance (firmware upgrades, SW component upgrades), provisioning of storage, alerting
and much, much more!
25
Prism helps us to plan future upgrades of the system based on consumption of resources over
time.
26
Prism central: Manage many clusters across many locations from a single pane of glass
27
Prism comes in several flavors:
- Express is only available with the Lenovo Express offering and cannot be upgraded to the other
types
- Starter, Pro and Ultimate are priced according to the features they offer.
A customer can upgrade from one to the other when in need of additional functionality
28
Some of the questions/topics when preparing for sizing.
Nutanix offers a very granular and precise sizing tool for Lenovo HX – available to Lenovo and
Nutanix experts.
For users and partners, Nutanix and Lenovo have prepared a detailed questionnaire where data
like this and more can be entered. The more details the customer can provide, the better the
sized solution will be.
29
Some of the questions/topics when preparing for sizing.
Nutanix offers a very granular and precise sizing tool for Lenovo HX – available to Lenovo and
Nutanix experts.
For users and partners, Nutanix and Lenovo have prepared a detailed questionnaire where data
like this and more can be entered. The more details the customer can provide, the better the
sized solution will be.
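To give a feel for the kind of arithmetic behind the sizer, here is a deliberately simplified sketch. All inputs and node specs are invented placeholders; the real sizer models far more (N+1 failover capacity, CVM overhead, SSD tier sizing and so on).

```python
# Deliberately simplified sizing arithmetic (all inputs are invented
# placeholders; the real sizer models far more, e.g. N+1, CVM overhead).
import math

vms            = 200        # number of VMs
vcpu_per_vm    = 2
ram_gb_per_vm  = 8
disk_gb_per_vm = 100
rf             = 2          # replication factor

# Assumed per-node resources for a hypothetical HX node.
node_cores, node_ram_gb, node_raw_tb = 24, 512, 20
vcpu_ratio = 4              # assumed vCPU:pCore overcommit

nodes_cpu  = math.ceil(vms * vcpu_per_vm / (node_cores * vcpu_ratio))
nodes_ram  = math.ceil(vms * ram_gb_per_vm / node_ram_gb)
nodes_disk = math.ceil(vms * disk_gb_per_vm * rf / (node_raw_tb * 1000))

print(max(nodes_cpu, nodes_ram, nodes_disk, 3))  # cluster minimum is 3 nodes
```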
30
Template for collecting data for sizer
31
Screenshot of Lenovo HX sizer
32
Since Lenovo HX powered by Nutanix also incorporates an SDS component, it is only fair to compare
the traditional approach with Nutanix. And this is it!
33
34
35
ROBO – for VDI and business apps
36
For SMB – VDI and Business apps
37
Dense with more storage – for SMB – VDI and Business apps
38
For Compute-heavy loads – for VDI and Business apps
39
All Flash model – For VDI and Business apps
40
GPU model – Graphically intense VDI and Business apps
41
Dense with more storage – VDI and Business apps
42
Dense All-Flash – VDI and Business apps
43
For storage-heavy workloads – BigData platform apps and Enterprise apps
44
For storage-heavy workloads – NO HYPERVISOR SUPPORT – only AHV for running the CVM!
45
For high-performance workloads – Enterprise apps like SAP, Exchange…
46
Hyperconvergence solves this issue and a lot more. We still have servers with direct-attached
storage. But what we do is pool the storage from all the servers in a cluster into a shared storage
pool, so that storage from all the independent nodes is available to all the nodes in the
cluster and the associated VMs.
You can start off with as little as 3 nodes in a cluster and incrementally grow from there. When
you need additional capacity, you just add a node – either compute-heavy or storage-heavy,
depending on what you need.
All the enterprise storage capabilities that shared storage solutions such as NetApp and EMC
provide, Nutanix provides too.
Now, addressing the bottleneck/hotspot problem in shared storage: the storage controller is
virtualized and is present on every node in the system. When you can virtualize mission-critical
workloads such as Oracle and SQL, why not virtualize storage as well? To us, storage is an app
too, and should be the first app to be virtualized.
Every time a node is added, a CVM gets added. All the requests from a user VM are
handled by the CVM that sits on the same node, so requests typically don't have to go through
the network.
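A purely illustrative sketch of the data-locality idea: the first read of a remote extent goes over the network once, after which the data is kept local so subsequent reads stay on the node. Class and field names are invented.

```python
# Purely illustrative sketch of data locality: a VM's I/O is served by the
# CVM on its own node; a remote read migrates the data back to the local node.
class CVM:
    def __init__(self):
        self.local_extents = {}  # extent_id -> bytes

cluster = {n: CVM() for n in ("node1", "node2", "node3")}
cluster["node2"].local_extents["e42"] = b"hello"  # replica lives remotely

def read(vm_node, extent_id):
    local = cluster[vm_node]
    if extent_id in local.local_extents:            # fast path: no network hop
        return local.local_extents[extent_id]
    for cvm in cluster.values():                    # slow path: remote fetch
        if extent_id in cvm.local_extents:
            data = cvm.local_extents[extent_id]
            local.local_extents[extent_id] = data   # localize for next time
            return data

print(read("node1", "e42"))                     # first read is remote...
print("e42" in cluster["node1"].local_extents)  # ...then kept locally: True
```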
47
Replication factors 2 and 3 are container-based.
48
Loss of a CVM – no problem – the data path will find its way to a neighboring CVM.
Loss of a node – no problem – machines will be restarted on other nodes and the data is still
around because of RF2/RF3 and EC-X.
49
Local SSD is used for read cache, alongside the memory cache (RAM cache). When data gets colder, it is
migrated to the slower tier. SSDs are mainly used for random I/O, so the user can decide to
configure sequential workloads to bypass the SSD tier.
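The tier-down pass can be sketched in a few lines. The coldness threshold and record format here are invented placeholders, not the actual ILM policy.

```python
# Illustrative tier-down pass: extents not touched for a while move from the
# SSD tier to the HDD tier (threshold and record format are invented).
import time

COLD_AFTER_S = 6 * 24 * 3600  # assumed coldness threshold

def tier_down(extents, now=None):
    now = now or time.time()
    for e in extents:
        if e["tier"] == "ssd" and now - e["last_access"] > COLD_AFTER_S:
            e["tier"] = "hdd"  # in reality the data is copied, then remapped
    return extents

extents = [{"id": "e1", "tier": "ssd", "last_access": 0}]
print(tier_down(extents, now=10 * 24 * 3600))  # e1 is now on the hdd tier
```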
50
Compression happens in-line.
51
Compression flow
52
Dedupe happens in-line.
Fingerprints are calculated at write I/O and stored in metadata, but only if they are new.
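A minimal sketch of the fingerprint-on-write idea. SHA-1 is used here as a typical dedupe fingerprint and the store layout is invented; treat the details as illustrative rather than the Nutanix implementation.

```python
# Illustrative inline-dedupe sketch: fingerprint each chunk at write time and
# store the data only when the fingerprint is new.
import hashlib

store = {}  # fingerprint -> chunk (stand-in for the metadata + data maps)

def write_chunk(chunk: bytes) -> str:
    fp = hashlib.sha1(chunk).hexdigest()
    if fp not in store:        # new fingerprint: keep the data
        store[fp] = chunk
    return fp                  # duplicates only cost a metadata reference

write_chunk(b"same data")
write_chunk(b"same data")      # second write stores nothing new
print(len(store))              # 1
```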
53
EC-X – erasure coding – eliminates the RF2 and RF3 copies for older data but still preserves HA!
54
DSF can act as storage – after all, it is SDS
55
When using AHV as the hypervisor. This is like VMware Tools in an ESX environment. When installed
in a VM, this is what we gain:
56
Empowering the users!
57
58
AHV is built from the KVM source by Nutanix. It is not KVM.
It works out of the box and does not require an additional license – no hidden costs – and it is the
only one of the three supported hypervisors (the others being ESXi and Hyper-V) that is integrated in Prism: you can
create a VM, clone a VM, delete a VM, snapshot a VM, open a console to a VM, replicate a VM…
directly from the Prism GUI.
59
Things to consider when planning a VMware to Nutanix AHV disaster recovery solution
60
The RESTful API is the new standard in software integration – through the REST API you can integrate
Prism components into enterprise monitoring, integrate provisioning tasks into your own GUI,
etc.
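A minimal sketch of driving Prism from Python via its REST API. The v2.0 endpoint path and response fields follow the commonly documented Prism API, but verify them against your Prism version; the cluster address and credentials are placeholders.

```python
# Minimal sketch of calling the Prism REST API with Python's requests library.
# Endpoint path per the commonly documented v2.0 API on port 9440; verify the
# exact version and path against your Prism release.
import requests

CLUSTER = "https://prism.example.com:9440"   # placeholder cluster VIP

resp = requests.get(
    f"{CLUSTER}/PrismGateway/services/rest/v2.0/vms",
    auth=("admin", "password"),              # placeholder credentials
    verify=False,                            # only for lab self-signed certs
)
resp.raise_for_status()

for vm in resp.json().get("entities", []):
    print(vm.get("name"), vm.get("power_state"))
```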
61
Key sizing parameters
62
63
Management Ease with Prism
• HTML 5, RESTful APIs
• No SPOF
• Manage the entire cluster from here
• Hardware, VM, Storage stats etc.
• No more ‘islands’ of management
• No more ‘finger pointing’
64
The usual suspects when it comes to problems after the order…
65
66