Unit 2 - Final
Virtual Machines and Virtualization of Clusters and Data Centres: Implementation Levels of
Virtualization, Virtualization Structures/Tools and mechanisms, Virtualization of CPU, Memory
and I/O Devices, Virtual Clusters and Resource Management, Virtualization for Data Centre
Automation.
A virtual machine (VM) is a digital version of a physical computer system. A virtual machine can
run programs and operating systems, store data, connect to networks, and perform other
computing functions, and it requires maintenance such as updates and system monitoring.
A VM is a virtualized instance of a computer that can perform almost all of the same functions
as a computer, including running applications and operating systems.
Virtual machines run on a physical machine and access computing resources of that physical
machine with the help of software called a hypervisor.
A virtual machine is a software-based computer that exists within the operating system of another
computer. In simpler terms, it is a virtualization of an actual computer, except that it exists on
another system.
So with VMs, multiple OS environments can exist simultaneously on the same machine,
and one or more virtual “guest” machines can run on a single physical “host” machine.
The purpose of a VM is to enhance resource sharing by many users and improve computer
performance in terms of resource utilization and application flexibility.
What is Virtualization?
Virtualization can be defined as a process that enables the creation of a virtual version of a
desktop, operating system, network resources, or server. Virtualization plays a key and
dominant role in cloud computing.
It is also defined as a creation of a virtual version of a server, a desktop, a storage device, an
operating system, or network resources. It is essentially a technique or method that allows the
sharing of a single physical instance of a resource or an application amongst multiple
organizations or customers.
The machine on which the virtual machine is built is called the Host Machine and the virtual
machine is known as the guest machine.
In cloud computing, this virtualization facilitates the creation of virtual machines and ensures
the smooth functioning of multiple operating systems. It also helps create a virtual ecosystem
for server operating systems and multiple storage devices.
The idea of VMs dates back to the 1960s.
Hardware resources (CPU, memory, I/O devices, etc.) or software resources (operating system
and software libraries) can be virtualized in various functional layers.
The following figure shows the computer before and after virtualization
A traditional computer runs with a host OS specially designed for its hardware architecture.
After virtualization, different user applications managed by their own OSs can run on
the same hardware, independent of the host OS.
This is done by adding an additional layer between the physical hardware and the host OS.
This virtualization layer is known as the hypervisor or VMM (Virtual Machine Monitor).
The main function of this virtualization layer is to virtualize the physical hardware of host
machine into virtual resources to be used by VMs.
The hypervisor creates an abstraction of VMs by placing the virtualization layer at various
operational levels of the computer system.
These levels include the instruction set architecture (ISA) level, hardware level, OS level, library
level and application level.
Instruction set architecture (ISA) level: This level defines a way in which a microprocessor is
Hardware abstraction layer (HAL) level: This approach generates a virtual hardware
environment for a VM. The idea is to virtualize the resources of a computer, such as processors,
memory, and I/O devices. This is done directly on top of the bare hardware. The goal of this level
is to enhance hardware utilization by enabling concurrent system usage among multiple users.
This is done by creating a virtual hardware environment for the actual machine and managing the
hardware through virtualization. More recently, the Xen hypervisor has been applied to virtualize
x86-based machines to run Linux or other guest OS applications.
OS level: It refers to an abstraction layer between the traditional OS and user applications. OS-level
virtualization creates isolated containers on a single physical server, together with OS instances, to
utilize the hardware and software in data centres. The containers behave like real servers. Using this
level of virtualization, a virtual platform can be created to assign hardware resources to different
users who do not trust each other.
Library support level: Virtualization at the library level can be done simply by managing the APIs
associated with application systems. Most applications use APIs exported by user-level libraries
rather than making lengthy system calls to the OS. Since most systems provide well-documented
APIs, such an interface becomes another level for virtualization. Virtualization with library
interfaces is possible by controlling the communication link between applications and the rest of a
system.
User-Application level: Virtualization at this level virtualizes an application as a VM. The process
involves wrapping the application in a layer that is isolated from the host OS and other applications.
It is also known as process-level virtualization because the OS considers each application as a process.
The most popular approach is to deploy high-level language (HLL) VMs. Any program written in the
HLL and compiled for this VM will be able to run on it. The JVM and Microsoft .NET CLR are two good
examples of this class of VM.
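The idea of an HLL VM can be sketched with a toy stack machine. This is a minimal illustration only: the opcode names (PUSH, ADD, MUL) and the run function are made up for this example and are not the JVM or CLR instruction set. A program compiled to this toy bytecode runs wherever the interpreter runs, independent of the host hardware.

```python
# Toy stack-based VM. The opcodes are hypothetical and chosen only for
# illustration; real HLL VMs such as the JVM have far richer instruction sets.
def run(program):
    """Execute a list of (opcode, operand) pairs and return the final stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# (2 + 3) * 4, compiled by hand into the toy bytecode
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(program)[-1])
```

The same program list would produce the same result on any host that provides this interpreter, which is the portability property the notes attribute to HLL VMs.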
A virtualization system partitions a single physical system into multiple VMs.
The VMM is the solution for creating and deploying VMs and services on physical servers
in a virtualization environment.
The software that provides virtualization is often called a VMM or hypervisor.
The VMM therefore manages the hardware resources of a host computing system.
There are three requirements for a VMM:
1. The VMM should provide an environment for programs which is essentially identical to the
original machine.
2. Programs run in this environment should show, at worst, only minor decreases in speed.
3. The VMM should be in complete control of the system resources.
The complete control of computing resources by the VMM includes the following aspects:
1. The VMM is responsible for allocating hardware resources for programs.
All OS-level VMs on the same physical machine share a single operating system kernel.
The virtualization layer can be designed in a way that allows processes in VMs to access as
many resources of the host machine as possible, but never to modify them.
Disadvantages of OS Extensions:
The main disadvantage of OS extensions is that all the OS-level VMs on a
single container must have the same kind of guest operating system.
The hypervisor supports hardware-level virtualization on bare-metal devices like the CPU,
memory, disk and network interfaces.
The hypervisor software sits directly between the physical hardware and its OS and is able to
convert physical devices into virtual resources dedicated for the deployed VMs to use.
This virtualization layer is referred to as the VMM or the hypervisor.
Depending on functionality, a hypervisor can have a micro-kernel architecture like
Microsoft Hyper-V or a monolithic hypervisor architecture like VMware ESX for
server virtualization.
A micro-kernel hypervisor includes only the basic and unchanging functions, i.e., device drivers
and other changeable components are outside the hypervisor. A monolithic
hypervisor, however, implements all such functions, including the device drivers.
Therefore, a micro-kernel hypervisor is smaller than a monolithic hypervisor.
Top hypervisor tools are Microsoft Hyper-V, VMware, KVM, Oracle’s VirtualBox
Hardware virtualization provides the architectural support to build a virtual machine
manager that can run a guest OS in isolation.
Depending on the implementation technology, hardware virtualization can be classified
into two categories: full virtualization and host-based virtualization.
Full virtualization:
Disadvantages:
2. The hypervisor does not contain the device drivers, and it might be difficult for new device
drivers to be installed by users.
Host-based Virtualization:
1. The user can install this VM architecture without modifying the host OS. The VMM can rely on
the host OS to provide device drivers and other low-level services.
2. This approach appeals to many host machine configurations.
Para-virtualization needs to modify the guest OSs, i.e., a para-virtualized VM provides
special APIs requiring substantial OS modifications in user applications.
The virtualization layer can be inserted at different positions in a machine software stack.
The guest operating systems are para-virtualized. They are assisted by an intelligent
compiler that replaces the nonvirtualizable OS instructions with hypercalls.
The guest OS kernel is modified to replace privileged and sensitive instructions with
hypercalls to the hypervisor or VMM. So the guest OS may not be able to run them itself;
they are implemented by the hypervisor.
Therefore, para-virtualization replaces non-virtualizable instructions with hypercalls that
communicate directly with the hypervisor or VMM.
Once the guest OS kernel is modified for virtualization, it can no longer run on hardware
directly.
Unlike the full virtualization architecture, which interprets and emulates privileged and sensitive
instructions at runtime, para-virtualization handles them at compile time.
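The hypercall mechanism described above can be sketched as a toy simulation. Everything here is hypothetical: the class names ToyHypervisor and ParaVirtGuestKernel, the hypercall names update_page_table and yield_cpu, and the page numbers are invented for illustration and do not correspond to the interface of Xen or any other real hypervisor. The point is only that the modified guest kernel asks the hypervisor to do the privileged work instead of doing it directly.

```python
class ToyHypervisor:
    """Hypothetical hypervisor that services a small table of hypercalls."""

    def __init__(self):
        self.page_tables = {}  # vm_id -> {guest page: machine frame}
        self.calls = []        # record of (vm_id, hypercall name)

    def hypercall(self, vm_id, name, *args):
        handlers = {
            "update_page_table": self._update_page_table,
            "yield_cpu": self._yield_cpu,
        }
        if name not in handlers:
            raise ValueError(f"unknown hypercall: {name}")
        self.calls.append((vm_id, name))
        return handlers[name](vm_id, *args)

    def _update_page_table(self, vm_id, guest_page, machine_frame):
        # The hypervisor, not the guest, performs the privileged update.
        self.page_tables.setdefault(vm_id, {})[guest_page] = machine_frame
        return True

    def _yield_cpu(self, vm_id):
        return True


class ParaVirtGuestKernel:
    """Guest kernel whose privileged operations were replaced by hypercalls."""

    def __init__(self, vm_id, hypervisor):
        self.vm_id = vm_id
        self.hypervisor = hypervisor

    def map_page(self, guest_page, machine_frame):
        # An unmodified kernel would write the page table itself here.
        return self.hypervisor.hypercall(
            self.vm_id, "update_page_table", guest_page, machine_frame
        )


hv = ToyHypervisor()
guest = ParaVirtGuestKernel("vm1", hv)
guest.map_page(5, 42)
print(hv.page_tables)
```

Because map_page depends on the hypervisor object, this guest kernel cannot run without one, which mirrors the statement that a para-virtualized kernel can no longer run on hardware directly.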
The traditional x86 processor offers four instruction execution rings: Rings 0, 1, 2, and 3.
The OS is responsible for managing the hardware, and its privileged instructions execute
at Ring 0, while user-level applications run at Ring 3.
However, para-virtualization attempts to reduce the virtualization overhead, and thus
improve performance by modifying only the guest OS kernel.
Unlike full virtualization, the guest OSs are aware that they are running in a virtualized environment.
The following figures illustrate the concept of para-virtualized VM architecture.
The lower the ring number, the higher the privilege of the instruction being executed.
Compared with full virtualization, para-virtualization is relatively easy and more practical.
The main problem with full virtualization is the low performance of binary translation, and
speeding up binary translation is difficult.
Therefore, many virtualization products employ the para-virtualization architecture.
The popular Xen, KVM and VMware ESX are some good examples of this type.
KVM is a hardware-assisted para-virtualization tool, which improves performance and
supports unmodified guest OSs such as Windows, Linux, Solaris etc.
It is a Linux para-virtualization system that is part of the Linux version 2.6.20 kernel.
Memory management and scheduling activities are carried out by the existing Linux kernel.
KVM does the rest, which makes it simpler than a hypervisor
that controls the entire machine.
Advantages of para-virtualization:
Para virtualization requires the guest OS to be modified in order to interact with para
virtualization interfaces.
It raises significant support and maintainability issues in production environments.
2.3 Virtualization of CPU, Memory and I/O devices.
To support virtualization, processors can employ a special running mode and instructions,
known as hardware-assisted virtualization. The VMM and guest OS then run in different modes.
The components to consider when selecting virtualization hardware include the CPU, memory
and network I/O devices.
These are all critical for workload consolidation. The issues with the CPU pertain to
either clock speed or the number of cores in the CPU.
Hardware virtualization allows several OSs to run on a single machine. This is made possible by
specific software called the “Virtual Machine Monitor/Manager (VMM)”.
Hardware virtualization involves two entities: the host machine and the guest machine.
The software that creates a VM on the host hardware is called a hypervisor or VMM.
Modern OSs and processors permit multiple processes to run simultaneously.
If there is no protection mechanism in a processor, all instructions from different processes
will access the hardware directly and cause a system crash.
Therefore, all processors have at least two modes, user mode and supervisor mode, to ensure
controlled access to critical hardware.
Instructions running in supervisor mode are called privileged instructions. Other instructions
are unprivileged instructions.
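The distinction between the two modes can be sketched as a simple check. This is an illustration only: the instruction names in PRIVILEGED, the execute function and the PrivilegeViolation exception are hypothetical and do not model any particular processor. A privileged instruction issued in user mode is refused (on real hardware it would trap) instead of reaching the hardware.

```python
# Hypothetical set of privileged instruction names, for illustration only.
PRIVILEGED = {"HLT", "LOAD_PAGE_TABLE", "OUT"}


class PrivilegeViolation(Exception):
    """Raised when a privileged instruction is issued outside supervisor mode."""


def execute(instruction, mode):
    """Run an instruction if the current mode is allowed to execute it."""
    if mode not in ("user", "supervisor"):
        raise ValueError(f"unknown mode: {mode}")
    if instruction in PRIVILEGED and mode != "supervisor":
        raise PrivilegeViolation(f"{instruction} requires supervisor mode")
    return f"{instruction} executed in {mode} mode"


print(execute("ADD", "user"))
print(execute("HLT", "supervisor"))
try:
    execute("HLT", "user")
except PrivilegeViolation as err:
    print("trap:", err)
```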
In a virtualized environment it is more difficult to make OSs and applications run correctly
because there are more layers in the machine stack.
The following figure shows hardware support for virtualization in the Intel x86 processor.
The guest OS continues to control the mapping of virtual addresses to physical memory
addresses of VMs. However the guest OS can’t directly access the actual machine memory.
The VMM is responsible for mapping the guest physical memory to the actual machine
memory.
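The two mapping stages described above can be sketched as two lookup tables. The table contents, the 4 KB page size and the names guest_page_table, vmm_p2m and translate are assumptions made for this example. The guest OS owns the first table (guest virtual page to guest physical page) and the VMM owns the second (guest physical page to machine page).

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

# Maintained by the guest OS: guest virtual page -> guest physical page
guest_page_table = {0: 7, 1: 3}

# Maintained by the VMM: guest physical page -> actual machine page
vmm_p2m = {7: 120, 3: 45}


def translate(guest_virtual_addr):
    """Translate a guest virtual address to a machine address in two stages."""
    vpn, offset = divmod(guest_virtual_addr, PAGE_SIZE)
    gpn = guest_page_table[vpn]  # stage 1: controlled by the guest OS
    mpn = vmm_p2m[gpn]           # stage 2: controlled by the VMM
    return mpn * PAGE_SIZE + offset


addr = 1 * PAGE_SIZE + 10  # offset 10 within guest virtual page 1
print(translate(addr))
```

The guest never sees vmm_p2m, which is why it cannot access the actual machine memory directly.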
Through I/O virtualization, a single hardware device can be shared by multiple VMs that run
concurrently.
I/O virtualization involves managing the routing of I/O requests between virtual devices
and the shared physical hardware.
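The routing of I/O requests can be sketched as follows. This is a simplified model under stated assumptions: the classes SharedDisk and IOVirtualizer, the fixed 100-block region per VM and the request format are all invented for illustration. Each VM sees its own virtual disk starting at block 0, and the virtualization layer maps each request onto a separate region of the one physical disk.

```python
from collections import deque


class SharedDisk:
    """Stands in for the single physical device."""

    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data

    def read(self, block):
        return self.blocks.get(block)


class IOVirtualizer:
    """Hypothetical VMM component that routes virtual-device requests."""

    def __init__(self, disk, blocks_per_vm=100):
        self.disk = disk
        self.blocks_per_vm = blocks_per_vm
        self.base = {}        # vm_id -> first physical block of its region
        self.queue = deque()  # pending requests from all VMs

    def attach(self, vm_id):
        self.base[vm_id] = len(self.base) * self.blocks_per_vm

    def submit(self, vm_id, op, vblock, data=None):
        if not 0 <= vblock < self.blocks_per_vm:
            raise ValueError("virtual block out of range")
        self.queue.append((vm_id, op, vblock, data))

    def run(self):
        """Forward queued requests to the physical disk in arrival order."""
        results = []
        while self.queue:
            vm_id, op, vblock, data = self.queue.popleft()
            pblock = self.base[vm_id] + vblock  # virtual -> physical block
            if op == "write":
                self.disk.write(pblock, data)
                results.append((vm_id, "ok"))
            else:
                results.append((vm_id, self.disk.read(pblock)))
        return results


disk = SharedDisk()
iov = IOVirtualizer(disk)
iov.attach("vm1")
iov.attach("vm2")
iov.submit("vm1", "write", 0, "data from vm1")
iov.submit("vm2", "write", 0, "data from vm2")
iov.submit("vm1", "read", 0)
iov.submit("vm2", "read", 0)
results = iov.run()
print(results)
```

Both VMs write to their own "block 0", yet neither overwrites the other, because the requests are routed to different physical blocks.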
A physical processor core is a physical unit of the CPU. A virtual processor core, also called a
vCPU or virtual processor, represents a physical CPU that is assigned to a VM.
The multi-core virtualization method allows hardware designers to get an abstraction of the
low-level details of the processor cores. This alleviates the
burden and inefficiency of managing hardware resources by software. It is illustrated in the
following figure.
Virtual Hierarchy:
Instead of supporting time-sharing jobs on one or a few cores, we can use the cores in a space-sharing
manner, where single-threaded or multithreaded jobs are assigned to separate groups of cores for long
time intervals.
Virtual hierarchies can be created to overlay a coherence and caching hierarchy onto a physical
processor.
Unlike a fixed physical hierarchy, a virtual hierarchy is a cache hierarchy that can adapt to fit the
workload or mix of workloads.
The first level of the hierarchy locates data blocks close to the cores needing them for faster access,
establishes a shared-cache domain, and establishes a point of coherence for faster
communication.
The provisioning of VMs to a virtual cluster is done dynamically to have the following interesting
properties:
The virtual cluster nodes can be either physical or virtual machines. Multiple VMs running
with different OSs can be deployed on the same physical node.
A VM runs with a guest OS, which is often different from the host OS that manages the resources of
the physical machine where the VM is implemented.
Live migration refers to the process of moving a running virtual machine or application between
different physical machines without disconnecting the client or application.
The memory, storage, and network connectivity of the virtual machine are transferred from the original
guest machine to the destination.
The live migration of VMs allows the workload of one node to be transferred to another node. However,
it does not guarantee that VMs can randomly migrate among themselves.
Network Migration:
It involves moving data and programs from one network to another as an upgrade or add-on to
a network system.
Lightweight Directory Access Protocol (LDAP) is a set of open protocols used to access and
modify centrally stored information over a network.
Dynamic Host Configuration Protocol (DHCP) is a protocol that provides quick, automatic,
and central management for the distribution of IP addresses within a network.
In data centres, a large number of heterogeneous workloads can run on servers at various times.
These workloads can be roughly divided into 2 categories.
1. Chatty workloads and
2. Non interactive workloads
2. Non-interactive workloads: These workloads don’t require people’s effort to make progress after
they are submitted. High-performance computing is a typical example. At various stages,
the resource requirements of these workloads are dramatically different. To guarantee that a workload
will always be able to cope with all demand levels, the workload is statically allocated enough
resources so that peak demand is satisfied.
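The cost of statically allocating for peak demand can be illustrated with a small calculation. The demand figures and the function name static_allocation_utilization are made up for this example. If capacity is fixed at the peak, average utilization is the total demand divided by the peak times the number of intervals.

```python
def static_allocation_utilization(demand):
    """Average utilization when capacity is fixed at the peak of `demand`."""
    peak = max(demand)
    return sum(demand) / (peak * len(demand))


# Made-up demand (in CPU cores) over eight intervals, with one short peak
demand = [4, 4, 4, 16, 4, 4, 4, 8]
utilization = static_allocation_utilization(demand)
print(f"{utilization:.1%}")
```

With this demand pattern, more than half of the statically reserved capacity sits idle on average, which is one way to see why servers sized for peak load end up underutilized.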
It is common that most servers in data centres are underutilized. A large amount of hardware,
space, power and management cost of these servers is wasted.
Server consolidation is an approach to improve the low utilization ratio of hardware resources by
reducing the number of physical servers.
Among several server consolidation techniques, such as centralized and physical consolidation,
virtualization-based server consolidation is the most powerful. Data centres need to optimize their
resource management.
Consolidation enhances hardware utilization. Many underutilized servers are consolidated into
fewer servers to enhance resource utilization. Consolidation also facilitates backup services
and disaster recovery.
This approach enables more agile provisioning and deployment of resources. In a virtual
environment, the images of guest OSs and their applications are readily cloned and reused.
The total cost of ownership is reduced. In this sense, server virtualization leads to deferred
purchases of new servers, a smaller data centre footprint, lower maintenance costs and lower
power, cooling and cabling requirements.
This approach improves availability and business continuity. The crash of a guest OS has no
effect on the host OS or any other guest OS. It becomes easier to transfer a VM from one
server to another because virtual servers are unaware of the underlying hardware.
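The consolidation idea can be sketched as a packing problem. The notes do not prescribe a placement algorithm; the first-fit decreasing heuristic used here is an assumption chosen for simplicity, and the load figures, the 80% capacity limit and the function name consolidate are made up for the example. Each workload that used to occupy its own underutilized server becomes a VM placed on the first host that still has room.

```python
def consolidate(vm_loads, server_capacity):
    """Place VM loads on hosts using the first-fit decreasing heuristic."""
    servers = []  # each entry is the list of loads placed on one host
    for load in sorted(vm_loads, reverse=True):
        if load > server_capacity:
            raise ValueError("a single VM exceeds the server capacity")
        for placed in servers:
            if sum(placed) + load <= server_capacity:
                placed.append(load)
                break
        else:
            # No existing host has room, so power on another one.
            servers.append([load])
    return servers


# Made-up CPU loads (percent) of eight underutilized physical servers
loads = [20, 10, 15, 30, 5, 25, 10, 15]
placement = consolidate(loads, server_capacity=80)
print(len(loads), "servers consolidated onto", len(placement))
```

First-fit decreasing is a heuristic and does not always find the minimum number of hosts; a production placement would also have to consider memory, I/O and failure domains.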
In system virtualization, virtual storage includes the storage managed by VMMs and guest
OSs. Generally the data stored in this environment can be classified into two categories,
1. VM Images and
2. Application Data
The VM images are specific to the virtual environment.
The application data includes all other data, which is the same as the data in a traditional OS
environment.
The most important aspects of system virtualization are encapsulation and isolation.
Traditional OSs and the applications running on them can be encapsulated in VMs. Only one OS
runs in a VM, while many applications run in that OS. System virtualization allows
multiple VMs to run on a physical machine, and the VMs are completely isolated.
To achieve encapsulation and isolation, both the system software and the hardware platform, such as
the CPU and chipset, are rapidly updated. However, storage is lagging. The storage systems
become the main bottleneck of VM deployment.
These VI managers and OSs are specially tailored for virtualizing data centres which own a
large number of servers in clusters.
Nimbus, Eucalyptus and OpenNebula are all open source software available to the general
public. Only vSphere 4 is a proprietary OS for cloud resource virtualization and management
over data centres.
A VMM changes the computer architecture. It provides a layer of software between OS and
system hardware to create one or more VMs on a single physical platform.
The VMM can provide secure isolation, and a VM accesses hardware resources only through the
control of the VMM, so the VMM is the base of the security of a virtual system. Normally one
VM is taken as the management VM and given privileges such as creating, suspending,
resuming or deleting a VM.
Once a hacker successfully enters the VMM or management VM, the whole system is in
danger.
Intrusions are unauthorised access to certain computer systems from local or network users.
Intrusion detection is used to recognize the unauthorised access.
An Intrusion Detection System (IDS) is built on OSs and is based on the characteristics of
intrusion actions
A typical IDS can be classified as a host based IDS (HIDS) or a network based IDS (NIDS),
depending on data source.
A HIDS can be implemented on the monitored system. When the monitored system is attacked by
hackers, the HIDS also faces the risk of being attacked. A NIDS is based on the flow of network
traffic, which can’t detect fake actions.
A VM-based IDS contains a policy engine and a policy module. The policy framework can
monitor events in different guest VMs.
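The split between a policy module and a policy engine can be sketched as follows. This is a loose illustration and not the design of any specific VM-based IDS: the PolicyEngine class, the two rules and the event fields (vm, type, authorised, path) are all hypothetical. The policy module supplies the rules, and the engine checks events collected from different guest VMs against them.

```python
class PolicyEngine:
    """Evaluates events against rules supplied by a policy module."""

    def __init__(self, rules):
        self.rules = rules  # list of (rule name, predicate) pairs

    def check(self, event):
        """Return the names of all rules that the event violates."""
        return [name for name, violates in self.rules if violates(event)]


# Hypothetical policy module: two made-up rules over made-up event fields.
policy_module = [
    ("unauthorised_login",
     lambda e: e["type"] == "login" and not e.get("authorised", False)),
    ("write_to_system_file",
     lambda e: e["type"] == "file_write" and e.get("path", "").startswith("/etc/")),
]

engine = PolicyEngine(policy_module)

# Events as they might be reported from three different guest VMs
events = [
    {"vm": "guest-1", "type": "login", "authorised": True},
    {"vm": "guest-2", "type": "login", "authorised": False},
    {"vm": "guest-3", "type": "file_write", "path": "/etc/passwd"},
]

alerts = [(e["vm"], rule) for e in events for rule in engine.check(e)]
print(alerts)
```

Because the engine runs outside the monitored guests in this sketch, compromising one guest does not by itself disable the checks applied to its events.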