Cloud Computing- MCA III Sem
Structure:
1.0 Learning Objectives
1.1 Definition
1.2 Historical Developments
1.3 Enabling Technology
1.4 Vision
1.5 Essential Characteristics of Cloud Computing
1.6 Components of Cloud Computing
1.7 Approaches of migration into Cloud
1.8 Challenges and solutions of migration into cloud
1.9 Cloud Applications
1.10 Benefits
1.11 Advantages and Disadvantages
1.12 Unit End Questions
1.0 LEARNING OBJECTIVES
1.1 DEFINITION
What is Cloud?
The term Cloud refers to a network or the Internet. In other words, we can say that the Cloud is something that is present at a remote location.
Cloud Computing
Cloud Computing is the delivery of computing services such as servers, storage, databases,
networking, software, analytics, intelligence, and more, over the Cloud (Internet).
Cloud Computing provides an alternative to the on-premises datacentre. With an on-premises
datacentre, we have to manage everything, such as purchasing and installing hardware,
virtualization, installing the operating system, and any other required applications, setting up the
network, configuring the firewall, and setting up storage for data. After doing all the set-up, we
become responsible for maintaining it through its entire lifecycle.
But if we choose Cloud Computing, a cloud vendor is responsible for the hardware purchase and maintenance. Cloud vendors also provide a wide variety of software and platforms as a service. We can rent any required services, and the cloud computing services are charged based on usage.
Cloud Computing refers to manipulating, configuring, and accessing hardware and software resources remotely. It offers online data storage, infrastructure, and applications.
Cloud computing offers platform independence, as the software is not required to be installed locally on the PC. Hence, cloud computing makes our business applications mobile and collaborative.
Cloud computing means storing and accessing data and programs on remote servers that are hosted on the internet instead of on the computer’s hard drive or a local server. Cloud computing is also referred to as Internet-based computing; it is a technology in which resources are provided as a service to the user through the Internet. The stored data can be files, images, documents, or any other storable content.
Some operations which can be performed with cloud computing are –
• Storage, backup, and recovery of data (a short sketch follows this list)
• Delivery of software on demand
• Development of new applications and services
• Streaming videos and audio
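To make the first operation above (storage, backup, and recovery of data) a little more concrete, the following is a minimal sketch using the AWS SDK for Python (boto3). It assumes that AWS credentials are already configured and that the bucket and file names are placeholders rather than values taken from this unit.

```python
# Minimal sketch: storing and recovering a file in cloud object storage.
# Assumes the boto3 package is installed and AWS credentials are configured;
# the bucket and file names below are placeholders.
import boto3

s3 = boto3.client("s3")

# Back up a local file to cloud storage.
s3.upload_file("report.docx", "example-backup-bucket", "backups/report.docx")

# Later, recover the same file from the cloud onto any machine.
s3.download_file("example-backup-bucket", "backups/report.docx", "restored_report.docx")
```

Equivalent SDK calls exist for other providers, such as Azure Blob Storage and Google Cloud Storage.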
1.2 HISTORICAL DEVELOPMENTS
Before cloud computing came into existence, client-server architecture was used, in which all the data and control resided on the server side. If a user wanted to access some data, the user first had to connect to the server and only then obtained the appropriate access. This model had many disadvantages. So, after client-server computing, distributed computing came into existence; in this type of computing, all computers are networked together, with the help of which users can share their resources when needed. It also has certain limitations. So, in order to remove the limitations faced in distributed systems, cloud computing emerged.
Distributed Computing
Distributed computing refers to a system where processing and data storage is distributed across
multiple devices or systems, rather than being handled by a single central device. In a distributed
system, each device or system has its own processing capabilities and may also store and manage
its own data. These devices or systems work together to perform tasks and share resources, with
no single device serving as the central hub. It is difficult to provide adequate security in
distributed systems because the nodes as well as the connections need to be secured. Some
messages and data can be lost in the network while moving from one node to another. The
database connected to the distributed systems is quite complicated and difficult to handle as
compared to a single user system. Overloading may occur in the network if all the nodes of the
distributed system try to send data at once. These limitations led to the emergence of cloud computing.
1.3 ENABLING TECHNOLOGY
The technologies through which cloud computing does its job (providing software, computing power, storage, and so on), in short the technical things that make cloud computing possible, are called the enabling technologies of cloud computing.
Cloud computing relies on a range of enabling technologies that have evolved over time to make
its infrastructure and services possible. These technologies play a crucial role in the scalability,
flexibility, and accessibility of cloud computing.
Here are the key enabling technologies of cloud computing:
3. User: It is a computer which uses the resources on the network.
Mainly, grid computing is used in ATMs, back-end infrastructures, and marketing research.
• Utility Computing: Utility computing is the most trending IT service model. It provides
on-demand computing resources (computation, storage, and programming services via
API) and infrastructure based on the pay per use method. It minimizes the associated
costs and maximizes the efficient use of resources. The advantages of utility computing are that it reduces IT costs, provides greater flexibility, and is easier to manage.
Large organizations such as Google and Amazon have established their own utility services for computing, storage, and applications.
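To illustrate the pay-per-use idea behind utility computing, here is a small sketch that meters resource consumption and computes a monthly bill from per-unit rates. The rates and usage figures are invented purely for the example.

```python
# Illustrative sketch of utility (pay-per-use) billing.
# All rates and usage numbers are made up for the example.

RATES = {
    "compute_hours": 0.05,      # currency units per VM-hour
    "storage_gb_month": 0.02,   # per GB stored for the month
    "api_calls": 0.0001,        # per API request
}

def monthly_bill(usage):
    """Charge only for what was actually consumed."""
    return sum(RATES[item] * amount for item, amount in usage.items())

usage = {"compute_hours": 720, "storage_gb_month": 150, "api_calls": 200000}
print(f"Amount due: {monthly_bill(usage):.2f}")  # 36.00 + 3.00 + 20.00 = 59.00
```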
1.4 VISION
There are many characteristics of Cloud Computing; according to NIST, there are five essential characteristics. The major characteristics are as follows:
following their needs and specifications. They are charged at the end of the billing cycle
based on how much they use the services provided by the cloud service providers.
2. Broad Network Access: The cloud is accessible to any device from any location because
of widespread network access. A cloud provider must offer its clients numerous network
access options. Otherwise, only a few systems would be able to use the cloud service.
Broad network access contains configuration for secure remote access, paying special
attention to mobile cloud computing, regulating the data that broad access network
providers have collected, enforcing role-based access control, etc. As a result, cloud
computing removes obstacles and borders because it operates across numerous regions.
4. Resource Pooling: Resource pooling is one of the core components of cloud computing.
A cloud service provider can provide each client with different services based on their
demands by employing resource pooling to divide resources across many clients.
Resource pooling is a multi-client approach for location independence, network
infrastructure pooling, storage systems, etc. The process of real-time resource
assignment does not affect the client's experience. This approach is also used in wireless technologies such as radio transmission (a small sketch of resource pooling follows this list of characteristics).
6. Security: Users of cloud computing are particularly concerned about data security. Cloud
service providers store users' encrypted data and offer additional security features like
user authentication and protection against breaches and other threats.
User authentication entails identifying and verifying a user's authorization. Access is
denied to the user if they do not have permission. Data servers are physically protected.
These servers are usually kept in a secure, isolated location to prevent unauthorized
access or disruption.
8. Budget Friendly: Businesses can reduce their IT expenses by utilizing this aspect of the
cloud. In cloud computing, the client is responsible for paying the administration for any
space they use. There are no additional fees or hidden costs to be paid.
The payment structure is crucial since it reduces expenses. Due to the extra functionality,
cloud computing choices have a wide range of pricing. The payment option is simple and
helps consumers save time when making frequent payments.
9. Flexibility: Cloud computing users can access data or services with internet-enabled
devices like smartphones and laptops. You can instantly access anything you want in the
cloud with just a click, making working with data and sharing it simple.
Many businesses prefer to store their work on cloud systems because it facilitates
collaboration and saves money and resources. Its expansion is also being sped up by the
number of features analytic tools offer.
10. Resilience: Resilience in cloud computing refers to a service's capacity to quickly recover
from any disruption. The speed at which a cloud's servers, databases, and network system
restart and recover from damage or harm is a measure of its resilience.
Cloud computing offers vital services because it guarantees constant server uptime. This
enables service recovery in the event of a disruption, and the cloud service provider plans
to enhance disaster management by maintaining backup cloud nodes.
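The resource pooling characteristic described in the list above can be pictured with the toy model below: a single shared pool of virtual machines serves several clients, allocating capacity on demand and reclaiming it on release. It is a teaching sketch, not a real scheduler.

```python
# Toy model of resource pooling: one shared pool serves many clients.

class ResourcePool:
    def __init__(self, total_vms):
        self.free = list(range(total_vms))   # VM ids available for any client
        self.allocated = {}                  # vm_id -> client name

    def acquire(self, client):
        if not self.free:
            raise RuntimeError("pool exhausted; the provider would add capacity")
        vm = self.free.pop()
        self.allocated[vm] = client
        return vm

    def release(self, vm):
        self.allocated.pop(vm, None)
        self.free.append(vm)

pool = ResourcePool(total_vms=4)
a = pool.acquire("client-A")   # different clients share the same underlying pool
b = pool.acquire("client-B")
pool.release(a)                # freed capacity can immediately serve another client
c = pool.acquire("client-C")
print(pool.allocated)
```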
1.6 COMPONENTS OF CLOUD COMPUTING
Cloud computing is a complex ecosystem that consists of various components working together to provide scalable, on-demand services over the internet. The main components of cloud computing are:
1. User Side Components: Users need some devices with which they interact with the cloud; these devices are called clients. They can be further classified as:
• Mobile devices, including phones and tablets
• Thin clients, computers without an internal hard drive
• Thick clients, regular computers
• Web browser
• Access to Internet
2. Cloud Side Components: The services through which the user can connect to the cloud are
called the cloud side components. They are as follows:
• Servers
• Networking
• Virtualization Software
• Storage
• Measurement Software
Cloud computing components correspond to platforms such as the front end, the back end, the cloud-based delivery model, and the network used. So, a cloud computing framework is broadly categorized into three parts: clients, distributed servers, and the datacentre.
For the operation of this computing model, the following three components play a major role, and their responsibilities can be elucidated clearly as below:
1. Clients: Clients in cloud computing are, in general, similar to the clients in the operation of Local Area Networks (LANs). They are typically the desktops that have their place on our desks, but they might also take the form of laptops, mobiles, or tablets to enhance mobility. Clients hold the responsibility of interaction, which drives the management of data on cloud servers.
2. Datacentre: The datacentre is the collection of servers on which the subscribed applications and data are hosted. It could be a large room full of servers located anywhere in the world, which clients access over the Internet.
3. Distributed Servers: These are servers that are housed in other locations. The physical servers might not all be housed in the same location; even though the distributed servers and the physical servers appear to be in different locations, they perform as if they are right next to each other.
Another component is Cloud Applications, which can be defined as cloud computing in the form of software architecture. Cloud applications are delivered as a service and operate over both the hardware and the software architecture.
Further, cloud computing has many other components, which mainly fall under the following classifications; these components are the service models of cloud computing and can be described as follows:
4. Infrastructure as a Service (IaaS): This is the fundamental classification of cloud computing services. This service allows renting servers, virtual machines, networks, storage, and other IT infrastructure. It avoids the complication of acquiring and administering your own physical servers and infrastructure. A few of the business benefits offered by IaaS are:
• Economical web hosting services
• Supports application and web servers and manages networking resources
• Increased performance on computing
• Assists in big data analysis
• Maintains huge storage, backup, and recovery
IaaS providers
• Amazon Elastic Compute Cloud (EC2)
Each instance provides 1-20 processors, up to 16 GB RAM, 1.69TB
storage
• RackSpace Hosting
Each instance provides 4 core CPU, up to 8 GB RAM, 480 GB storage
• Joyent Cloud
Each instance provides 8 CPUs, up to 32 GB RAM, 48 GB storage
• Go Grid
Each instance provides 1-6 processors, up to 15 GB RAM, 1.69TB storage
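As a hedged illustration of how IaaS resources are rented programmatically, the sketch below requests a single virtual server from Amazon EC2 using the boto3 SDK. The AMI ID, instance type, and key pair name are placeholders; actual values depend on your account and region, and credentials must already be configured.

```python
# Minimal sketch: provisioning one virtual server on an IaaS platform (AWS EC2).
# The AMI ID and key pair name below are placeholders, not real values.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder machine image
    InstanceType="t2.micro",          # small, pay-per-use instance size
    MinCount=1,
    MaxCount=1,
    KeyName="example-keypair",        # placeholder SSH key pair
)

instance_id = response["Instances"][0]["InstanceId"]
print("Requested instance:", instance_id)
```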
5. Platform as a Service (PaaS): This is the service offering an on-demand environment for the development, testing, and deployment of software applications. It serves as a cloud deployment environment that maintains the servers, and this enables the delivery of both simple cloud applications and complex enterprise applications. A few of the business aspects offered by PaaS are:
• Stands as a platform for the development and customization of cloud-based
applications.
• PaaS tools allow you to analyze and mine your information, finding deeper insights to deliver better outcomes.
• Offers services for enhanced protection, workflow, directory, and scheduling.
PaaS providers
• Google App Engine
Python, Java, Eclipse
• Microsoft Azure
.Net, Visual Studio
• Sales Force
Apex, Web wizard
• TIBCO
• VMware
• Zoho
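From the developer's point of view, using a PaaS usually means writing only the application code and letting the platform manage servers, runtimes, and scaling. The sketch below is a minimal Python web application, written with the Flask framework, of the kind that a platform such as Google App Engine or Microsoft Azure App Service can host; the provider-specific deployment configuration (for example, App Engine's app.yaml) is omitted.

```python
# Minimal sketch of a web app a PaaS platform could host and scale.
# Uses the Flask framework; deployment configuration is provider-specific
# and omitted here.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    # The PaaS provider manages the servers, runtime, and scaling;
    # the developer only supplies this application code.
    return "Hello from a PaaS-hosted application!"

if __name__ == "__main__":
    # Local test run; on the platform, the provider's runtime starts the app.
    app.run(host="0.0.0.0", port=8080)
```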
6. Software as a Service (SaaS): This is the approach of delivering software applications over the Internet, either on demand or on a subscription basis. It allows individuals to connect to and utilize cloud applications through the Internet. SaaS offers the feature of enhancing and operating applications at a reduced cost. A few of the business aspects offered by SaaS are:
• Provides simple accessibility to complex applications
• Allows client software to be used freely
• Mobilizes the workforce
• Provides access to application information from any location
So, the other services/components of cloud computing are:
• Cloud Clients
• Cloud Services
• Cloud Applications
• Cloud Platform
• Cloud Storage
• Cloud Infrastructure
SaaS providers
• Google’s Gmail, Docs, Talk etc
• Microsoft’s Hotmail
• Sharepoint
• SalesForce
• Yahoo
• Facebook
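SaaS applications are consumed over the Internet, typically through a browser or a web API. The sketch below shows the general pattern of calling a SaaS provider's REST API with the requests library; the URL, endpoint, and token are hypothetical placeholders and do not belong to any of the products listed above.

```python
# Generic sketch of consuming a SaaS application through its web API.
# The URL and token are hypothetical placeholders, not any real product's API.
import requests

API_BASE = "https://api.example-saas.com/v1"   # hypothetical SaaS endpoint
TOKEN = "replace-with-a-real-access-token"

response = requests.get(
    f"{API_BASE}/contacts",                    # assume the endpoint returns a JSON list
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
response.raise_for_status()

for contact in response.json():
    print(contact.get("name"), contact.get("email"))
```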
1.7 APPROACHES OF MIGRATION INTO CLOUD
Cloud migration is the procedure of transferring applications, data, and other types of business components to a cloud computing platform. There are three main types of cloud migration you can perform: on-premises to cloud, cloud to cloud, or cloud to on-premises.
1. On-premise to cloud migration: When data and applications from your servers (on-
premise) are transferred to a cloud infrastructure. It's where the cloud providers host and
manage them. On-premise refers to managing your IT resources in your data centre.
2. Cloud to cloud migration: Moving the content that exists in one cloud storage service to another cloud drive. This could be moving files from Google Drive to Dropbox, Google Drive to Google Drive, or Dropbox to OneDrive. These are all simple examples of cloud file migration. Cloud-to-cloud migration allows users to switch cloud storage services without first transferring data to a local device, which is convenient for users.
3. Cloud to on-premise migration: Moving data and applications from a cloud environment back to your own on-premise data centre, for example when cost, control, or compliance considerations favour running them locally.
When performing any of these three migration types, there are five methods and strategies you can use. The strategies were first defined in the Gartner “5 Rs” model in 2011. These strategies are (a small planning sketch follows the list):
▪ Rehosting—moving applications to the cloud as-is. This is also referred to as Lift and
Shift.
▪ Refactor—modifying applications to better support the cloud environment.
▪ Replatform—moving applications to the cloud without major changes, but taking
advantage of benefits of the cloud environment.
▪ Rebuild—rewrite the application from scratch.
▪ Replace—retire the application and replace it with a new cloud-native application.
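The planning sketch referred to above maps a few application attributes to one of the five strategies. The attribute names and decision rules are simplified assumptions made for teaching purposes, not an official part of the Gartner model.

```python
# Illustrative helper: pick one of the "5 Rs" for an application.
# The attributes and rules are simplified assumptions for teaching purposes.

def choose_strategy(app):
    if app.get("retire_planned"):
        return "Replace"          # retire and adopt a cloud-native alternative
    if app.get("needs_redesign"):
        return "Rebuild"          # rewrite from scratch on the cloud platform
    if app.get("wants_cloud_native_features"):
        return "Refactor"         # modify the code for the cloud environment
    if app.get("minor_changes_acceptable"):
        return "Replatform"       # small optimizations, no major changes
    return "Rehost"               # lift and shift as-is

apps = [
    {"name": "payroll", "minor_changes_acceptable": True},
    {"name": "legacy-crm", "retire_planned": True},
    {"name": "intranet"},
]
for app in apps:
    print(app["name"], "->", choose_strategy(app))
```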
1. Rehost:
Advantages:
• No code or architecture changes—applications are rehosted to the cloud with
no significant application or infrastructure changes, eliminating costly
development and testing.
• Migrate core services easily—you can move critical core services like Active
Directory quickly and directly. This presents minimal risk and disruption to
business activity.
• Easier compliance and security management—since applications aren’t changing, security and compliance properties also stay largely the same and just need to be mapped to new resources.
Disadvantages:
• Does not take full advantage of the cloud—legacy applications are not scalable
and do not allow for distributed workloads like cloud-native applications do.
• Latency and performance—on-premise applications might suffer from latency
or performance issues after migration, because they were not optimized or
modified to suit the cloud environment.
• Increased risk—migrating an application with known problems may result in
increased risks after migration.
• Migration failures—the migration process might fail if the organization doesn't
accurately map application requirements to the corresponding cloud
configuration.
2. Refactor:
Advantages:
• Long-term cost savings—can reduce costs by matching actual resource
requirements with cloud infrastructure. The ability to scale as needed reduces
resource consumption and provides long lasting ROI of your refactoring efforts.
• Adapting to changing requirements—cloud-native and microservices
architectures allow applications to rapidly change to adapt to new customer
requirements, by adding new features or modifying existing functionality.
• Increased resilience—by decoupling application components and wiring together managed solutions that provide high availability, the application inherits the durability of the cloud.
Disadvantages:
• Vendor lock-in—the more cloud-native your application is, the more cloud
features it is likely to consume. This makes applications tightly coupled to the
public cloud you are using.
• Time—refactoring is resource-intensive and much more complex than a lift-and-
shift migration, meaning projects take much longer to start showing business
value.
• Skills—refactoring isn't for beginners. It requires advanced coding, automation,
and DevOps skills.
• Getting it wrong—refactoring means changing many aspects of the application,
so there is a high risk of errors at the code, configuration, and infrastructure level.
Each mistake can cause delays, cost escalations, and possible outages.
3. Replatform: This migration strategy is a good option for enterprises that are not ready for configuration and expansion, or those enterprises that wish to improve trust inside the cloud.
There are some common modifications that are typically performed during re-
platforming. For example:
• Changing the way the program interacts with the database to benefit from
automation and an elastic database infrastructure.
• Enabling better scaling and leveraging reserved resources in the cloud
environment with minimal code changes.
Advantages:
• Cost-efficient—this approach is cost effective and does not require a major
development project.
• Start small and scale as needed—re-platforming lets you move some workloads
to the cloud, experiment with the cloud environment, learn lessons and then move
on to other workloads, without committing to a large migration effort.
• Cloud-native functionality—re-platforming allows applications to leverage
cloud capabilities like auto-scaling, managed storage and data processing
services, infrastructure as code (IaC), and more.
Disadvantages:
• Work scope can grow— “scope creep” can turn a re-platforming project into a
full-blown refactoring project. Managing scope and preventing unnecessary
changes is essential to mitigate this problem.
• Aggressive changes—to minimize work, you need to stick to common, well
known cloud components. Specialized components often require dramatic
changes to the application, and may not be worthwhile unless they provide high
business value or use is unavoidable.
• Automation is required—re-platforming is extremely limited if the resulting
workload in the cloud is managed manually. This means you are required to invest
in basic automation that provides some level of flexibility when operating the
system in the cloud.
4. Rebuild: A “Rebuild” migration strategy means that you completely redevelop the
application on a PaaS infrastructure – unlike with Rearchitect, where you only modify
parts of the application.
Rebuild involves removing existing code and redesigning the application in the Cloud,
after which you can utilize innovative features on the Cloud provider’s platform. A Cloud-
native application is cheap to use and highly scalable. A complete rebuild of your
application comes with a price tag. Rebuild is a PaaS-based solution; it requires good familiarity with the existing application and business processes as well as with cloud services. The major challenge could be working with application consumers who are forced to use a new application, which could make or break things.
Advantages:
• Cloud computing benefits, such as cost savings and pay as you go model.
• Modernization of an application (e.g. Monolithic to Microservices design).
• Developer productivity, opportunity to update skillset in an organization.
Disadvantages:
• Requires familiarity with existing application and business processes.
• Consumer would be forced to switch to the new application.
• Redefine SLAs.
5. Replace:
Advantages:
• Less administrative overhead.
• Mobile, better features, flexibility, and support.
• Good value, cost-effective, best of the breed.
Disadvantages:
• Limitation of customization.
• Might need to change the business process according to an application.
• Limited data access, vendor lock-in.
1.8 CHALLENGES AND SOLUTIONS OF MIGRATION INTO CLOUD
1. Data Security and Compliance Risks: Data safety and compliance risks are still among organizations’ top cloud migration security challenges. These risks can emanate from data breaches during the migration or weak access control over sensitive enterprise data.
Solution: While cloud services are inherently secure, it will help if you pass your data
through a secure path inside the firewall when migrating it. You can add an extra layer of
assurance by encrypting the data and ensuring that your strategy follows industry
compliance standards.
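As one possible way of adding the extra layer of assurance mentioned in the solution above, the sketch below encrypts a file before it leaves the on-premise environment, using symmetric (Fernet) encryption from the third-party cryptography package. Key management and the actual transfer step are out of scope and would need a proper solution in practice.

```python
# Sketch: encrypt sensitive data before it is moved to the cloud during migration.
# Requires the third-party "cryptography" package; key management is out of scope.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice, keep this in a key vault
cipher = Fernet(key)

plaintext = b"customer_id,name,balance\n1001,A. Kumar,5400.00\n"  # sample records

ciphertext = cipher.encrypt(plaintext)

with open("customer_records.enc", "wb") as f:
    f.write(ciphertext)              # only the encrypted file leaves the premises

# After migration, the target environment uses the same key to recover the data.
restored = cipher.decrypt(ciphertext)
assert restored == plaintext
```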
Solution: You can prevail over this uncertainty by planning carefully with your service
providers and factoring in the hidden costs of migration in the budget. Overall, you are
looking at the following cloud migration costs:
Pre-migration
• Consultations
• On-premise data management
• Database upgrade
Migration
• The project itself
• Refactoring
• Application and code changes
Post-migration
• Monthly or yearly license
• System maintenance
3. Wrong Cloud Migration Strategy: You may encounter various cloud migration
challenges if your team doesn’t deliberate on a well-crafted plan. Many get it wrong by assuming that planning is all about anticipating migration bottlenecks and creating remediation contingencies, while forgetting to focus on understanding the current infrastructure and the target cloud.
Solution: With that in mind, you would want to devise a well-thought strategy that covers
all intricate approaches to successful migration, such as app modernization and platform
refactoring. This means conducting a thorough assessment of the current infrastructure
and making adjustments whenever possible for optimized performance in the cloud.
Solution: You can overcome this barrier by hiring an experienced cloud migration
company to assist with the process and upskill your team. The experts will figure out the
right strategy that complements a smooth transition process with everyone on board.
Alternatively, you can hire fresh talents to fill the new IT positions.
Solution: To overcome this challenge, you need to clearly assess your current infrastructure
to identify any compatibility issues or dependencies that could impact your migration,
create a roadmap that outlines the migration process, and choose the right cloud provider
that can meet your specific business needs.
6. Low Application Performance After Migration: One of the most overlooked post-
cloud migration problems is low performance after migration. Legacy systems often
feature several components hosted on multiple servers which lowers their flexibility
when migrated to the cloud. At the same time, on-demand computing solutions come with varying storage options and services, so it is easy for legacy systems to face performance issues.
7. Network Bandwidth: You can also experience database migration to cloud challenges
if you don’t specifically define the network bandwidth requirements of cloud-powered
applications. Typically, these applications should match or exceed the performance of on-
premise deployments. Unless you define the bandwidth requirements accurately, the
migrated applications will experience latency, resulting in poor user experience and
unmet business requirements.
Solution: You can solve this problem by factoring in network constraints before
migration, with the help of network technology modelling tools. This will help you
evaluate current application performance to determine areas that need improvement
before the migration process.
8. Cloud Environment Adoption Resistance: Employees may feel that cloud adoption
will reduce their control over data and processes, leading them to resist the change.
Resistance toward change is common, and switching to a cloud computing environment
means more change and disruption around processes, systems, or leadership.
Solution: Train your employees on the new cloud environment and the tools and
processes that will be used. To ensure that everyone is okay and ready for the new IT
strategy, develop a change management plan and implement it before, during, and after
the migration. Also, involve them in the cloud adoption process and address their concerns. This should smooth out difficulties.
o Resource Availability: The process of migration may require taking in-house servers offline temporarily. But downtime can be harmful to application performance when not supported by an accurate plan for the recovery process.
o Resource Management: Yet, not all IT professionals trust the cloud. If your team has been used to handling physical servers, they may require education on the newer infrastructure or retraining for newly defined roles.
1.9 CLOUD APPLICATIONS
Cloud services can be used at both the personal and organizational levels. We normally have a
large amount of data to store at the organizational level, and as the firm grows, so does its data.
As a result, any company would want more storage options, or an expansion of storage services,
at some point.
The number of employees will also grow over time, resulting in an increase in the capacity of on-
premise data centres. Such data centres are expensive to build and maintain. Using a cloud is an
excellent way to get out of this situation. Cloud computing uses are increasing every day. Here are
some applications of cloud computing in different spheres-
data sharing, ensuring that healthcare providers have access to the most up-to-
date patient data when delivering care. Cloud-based HIE systems improve care
coordination, reduce duplicate testing, and enhance patient safety.
Adopting cloud healthcare computing reduces the overhead of tasks like upgrading
records. Cloud technology in the healthcare industry helps develop a proper plan for
monitoring patients remotely. Additionally, cloud technology in healthcare provides new
and valuable insights for different healthcare management solutions.
• CRM
Customer relationship management (CRM) is a technology for managing all your
company’s relationships and interactions with customers and potential customers. The
goal is simple: Improve business relationships. A CRM system helps companies stay
connected to customers, streamline processes, and improve profitability. Customer
relationship management ensures the smooth storage of customer data such as demographics, purchase behaviour, patterns, history, etc., and of every interaction with a customer, to build strong relations and increase the sales and profits of an organization.
The top three CRM-industry leaders currently include Microsoft Dynamics 365, Oracle
CRM, and Salesforce Sales Cloud.
• The diverse Microsoft Dynamics 365 combines 200 apps in a cloud-based CRM
with an ERP. Its custom workflows and automated smart email responses to
customer actions rank among its other features.
• Oracle focuses on the three-prong approach of loyalty, marketing, and sales.
Marketers love this option for its simplicity in creating mass marketing mailings
and e-mailing campaigns. It also automates sales processes and can provide
accurate sales forecasts.
• Salesforce Sales Cloud ranks as the most popular web-based CRM. It works in the
cloud, so your entire team can access information from any location with Internet.
Its top feature is customer monitoring and information tracking that helps
develop long and short sales strategies.
• ERP
ERP is an abbreviation for Enterprise Resource Planning. It is software similar to CRM that is hosted on cloud servers and helps enterprises manage and manipulate their business data as per their needs and user requirements. ERP software follows a pay-per-use payment model; that is, at the end of the month, the enterprise pays an amount based on the cloud resources it utilized. There are various ERP vendors available, such as Oracle, SAP, Epicor, SAGE, Microsoft Dynamics, Lawson Software, and many more.
• High mobility
• Increase in productivity
• No security issues
• Scalable and efficient
2. SAP: SAP has been serving the industry and businesses of various sizes and scales for
over four decades. It has a user-friendly system that manages the entire operations. Their
solutions and integrated processes enable enhanced efficiencies across the organization.
3. Microsoft: Just like Oracle and SAP, Microsoft’s cloud application has been around for
years and needs no introduction. They are always at the forefront of creating new data
centres to support the advanced capabilities of the cloud. Dynamics 365 suite is one of
their flagship offerings.
5. Infor CloudSuite: Infor’s cloud ERP services help organizations meet ever-changing
business demands. They help businesses overcome traditional challenges related to
outdated and legacy systems. Infor’s solutions help unlock the hidden capabilities that
businesses can use to optimize their operations.
ii. Twitter: Twitter is a social networking site. It is a microblogging system. It allows
users to follow high profile celebrities, friends, relatives, and receive news. It sends and
receives real-time short posts called tweets. Cloud based servers and distributed
systems allow Twitter to handle high volumes of tweets, user interactions and trending
topics. By leveraging cloud infrastructure, Twitter can scale its service to handle surges
in user activity during peak periods.
iv. LinkedIn: LinkedIn is a social network platform, which utilizes cloud applications
to manage user profiles, job postings, and connections. Cloud-based databases and
storage enable efficient data management, while cloud computing resources support
the processing of large-scale user interactions and content updates. The cloud
infrastructure also allows LinkedIn to integrate with other enterprise applications and
services.
ii. Music Streaming Services: Cloud-based music streaming platforms like Spotify, Apple Music, and Google Play Music have become popular due to their vast catalogues of songs and the ability to access music on various devices. These services utilize cloud infrastructure to store their music libraries, provide seamless playback across multiple devices, and personalize music recommendations based on user preferences.
1.10 BENEFITS
The most important reason why cloud computing is growing rapidly is the various benefits it
offers. It saves businesses the time and resources required to set up full-fledged physical IT
infrastructure.
It offers benefits in the following sectors:
ii. Market Expansion: Cloud computing enables businesses to expand their reach and
target new markets more easily. With cloud infrastructure and services, businesses
can quickly establish a presence in different geographical locations, serving
customers globally without the need for significant physical infrastructure
investments.
iii. Cost Efficiency: Cloud computing allows businesses to reduce capital expenses by
eliminating the need for upfront infrastructure investments. The pay-as-you go
pricing model enables businesses to scale resources based on demand, optimizing
cost efficiency and aligning expenses with actual usage.
iv. Rapid Innovation: Cloud computing provides a platform for rapid innovation and
experimentation. Businesses can leverage cloud-based development tools,
infrastructure and managed services to quickly develop and test new products and
services. This accelerates time-to-market and gives businesses a competitive edge
in an evolving market landscape.
v. Collaboration and Partnership Opportunities: Cloud computing facilitates
collaboration among businesses by providing a common platform for sharing data,
resources and services. It enables businesses to form partnerships, integrate their
systems, and collaborate on joint initiatives, leading to improved efficiency,
innovation, and shared success in the market.
v. Focus on Core Competencies: By offloading IT infrastructure management to
cloud service providers, enterprises can focus on their core competencies and
strategic initiatives. Cloud computing frees up resources and allows enterprises to
redirect their efforts towards innovation, customer satisfaction, and business
growth.
other media content on various devices, providing personalized entertainment
experiences.
viii. Social Networking and Communication: Cloud-based networking platforms
and communication tools enable users to connect, communicate and share
content with friends, family and colleagues across the globe. Users can engage in
real-time conversations, videocalls, and social interactions through cloud-based
messaging, social media, and video-conferencing applications.
ix. Personalization and Recommendation: Cloud-based services leverage data
analytics and machine learning algorithms to provide personalized experiences
and recommendations. Users benefit from tailored content, product
recommendations and personalized advertisements based on their preferences,
behaviour and usage patterns.
1.11 ADVANTAGES AND DISADVANTAGES
It is obvious that businesses and organizations are getting various benefits because of cloud computing. However, every coin has two faces, so there are several disadvantages of cloud computing too.
The advantages and disadvantages of cloud computing are as follows:
Advantages:
i. Data security: Data security is one of the biggest advantages of cloud computing. Cloud
offers many advanced features related to security and ensures that data is securely stored
and handled.
iii. Services in the pay-per-use model: Cloud computing allows you flexibility because you
have to pay only for what you use as a service.
iv. Cost Effective: The major reason companies shift towards cloud computing is that it costs less. The business does not need to build its own IT infrastructure or purchase hardware or equipment; the costs avoided include physical hardware for data storage, such as hard drives, solid-state drives, disks, etc.
v. Accessibility: Cloud computing allows you to quickly and easily store, access, and
manipulate information on the cloud. An internet cloud infrastructure increases
organization productivity and efficiency by ensuring that our data is always accessible.
vi. Backup and restore data: As the data is stored in the cloud, it is a lot easier to get the
backup and recovery of that data with just a few clicks; otherwise, manually, it is a very
time-consuming process on-premise.
vii. Unlimited storage capacity: Cloud offers us a huge amount of storing capacity for
storing our important data such as documents, images, audio, video, etc. in one place. No
storage capacity is predefined, so you can increase or decrease storage capacity according
to your needs at any time.
viii. Automatic Software Integrations: Cloud computing allows you to set automation of
software updates and upgrades. So as soon as a newer version of any software is released,
it will automatically integrate into the services you are using.
ix. Multi-tenancy: It means shared hosting, where server resources are shared among several clients. When a single software instance or computer system serves multiple end users or groups of users, it is referred to as multi-tenancy. It reduces the usage of physical devices and thus saves power consumption and cooling costs.
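A common way to implement multi-tenancy is to keep the rows of all tenants in one shared table and tag each row with a tenant identifier, so that a single software instance serves many clients while keeping their data separated. The sketch below shows this pattern with an in-memory SQLite table; it is a simplified illustration, not a production design.

```python
# Sketch of shared-schema multi-tenancy: one table, one software instance,
# many tenants separated by a tenant_id column. Simplified illustration only.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoices (tenant_id TEXT, number TEXT, amount REAL)")
db.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [("tenant-a", "INV-1", 120.0), ("tenant-b", "INV-1", 75.5), ("tenant-a", "INV-2", 40.0)],
)

def invoices_for(tenant_id):
    # Every query is scoped to the calling tenant, so tenants never see
    # each other's rows even though they share the same table and server.
    rows = db.execute(
        "SELECT number, amount FROM invoices WHERE tenant_id = ?", (tenant_id,)
    )
    return rows.fetchall()

print(invoices_for("tenant-a"))   # [('INV-1', 120.0), ('INV-2', 40.0)]
print(invoices_for("tenant-b"))   # [('INV-1', 75.5)]
```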
Disadvantages:
ii. Vendor lock-in: Vendor lock-in is the biggest disadvantage of cloud computing.
Organizations may face problems when transferring their services from one vendor to
another. As different vendors provide different platforms, that can cause difficulty moving
from one cloud to another.
iii. Downtime: We can't access the data if there is downtime (internet loss at the cloud
provider's end). Other than this, downtime also includes cloud providers that may face
power loss, service maintenance, etc.
iv. Limited Bandwidth: As the Cloud provider provides limited bandwidth to all its users,
you have to pay significantly higher costs if your organization surpasses that limit.
v. Security: Even though the cloud providers store information very securely, we still must not forget that data is vulnerable to cyber-attacks when stored in the cloud.
Many organizations and companies have suffered from security breaches and their
potential risks in the cloud.
vi. Limited Control and Flexibility: The cloud infrastructure is completely owned,
managed, and monitored by the cloud providers. So, businesses using cloud computing
have limited control over their data, applications, and services. It makes it hard for
companies to have the level of control they want over the different services they use. The
customer may not have access to key administrative services.
vii. Technical issues: Due to frequent version releases of some applications, you have to
constantly upgrade your systems to meet a market need; in between these updates, there
is a chance that you may be stuck on some technical problems.
viii. Lack of support staff: Some cloud companies do not provide proper support to their
clients; then, you have to only depend on FAQs or online help.
ix. May not get all the features: Each cloud service provider is unique from the others.
Users may occasionally only be able to use the basic kinds of cloud storage that cloud
providers provide. As a result, one cannot modify certain features or take advantage of all
of their benefits.
x. Varied Performances: When you are working in a cloud environment, your application
is running on the server which simultaneously provides resources to other businesses.
Any greedy behaviour or DDoS attack on a co-tenant could affect the performance of your shared resource.
1.12 UNIT END QUESTIONS
13. What is scalability?
14. What is Disaster recovery?
15. Explain cloud-to-cloud migration.
Short Questions
Long Questions
UNIT 2 CLOUD COMPUTING ARCHITECTURE
Structure:
2.1 Introduction
2.2 Cloud Reference Model
2.3 Cloud Architecture
2.4 Infrastructure/ Hardware as a Service
2.5 Platform as a Service
2.6 Software as a Service
2.7 Types of Clouds
2.8 Economics of the Cloud
2.9 Open Challenges
2.10 Cloud Interoperability and Standards
2.11 Scalability and fault Tolerance
2.12 Parallel and Distributed Computing
2.13 Map Reduce
2.14 Hadoop
2.15 High Level Language of Cloud
2.16 Service oriented Computing
2.1 INTRODUCTION
With the increase in popularity of Cloud Computing, many definitions, applications, structures,
technologies and concepts have evolved around it. Hence, there are varied perceptions of using this service, which may differ across clients and locations. Therefore, there is a need for a reference model that can act as an abstract model to standardize the functions, parameters, and vendors of cloud computing spread all over the world using different technologies, so that they can communicate with each other with ease.
In this Unit we will discuss the universally accepted Cloud Reference Model proposed by The
Information Technology Laboratory (ITL) at the National Institute of Standards and Technology
(NIST).
2.2 CLOUD REFERENCE MODEL
The Cloud Computing Reference Model provides a conceptual framework for understanding and
categorizing the various components and functions of cloud computing. It helps define the
relationships and interactions between different cloud computing elements. The most widely
recognized and used reference model is the NIST (National Institute of Standards and
Technology) Cloud Computing Reference Architecture. It serves as a common language for
discussing and designing cloud-based solutions, enabling interoperability and facilitating the
adoption of cloud computing technologies.
The model is not tied to any specific vendor products, services, or reference implementation, nor
does it define prescriptive solutions that inhibit innovation. It defines a set of actors, activities,
and functions that can be used in the process of developing cloud computing architectures, and
relates to a companion cloud computing taxonomy. It contains a set of views and descriptions
that are the basis for discussing the characteristics, uses, and standards for cloud computing.
There are five major actors in NIST cloud computing reference architecture. They are:
1. Cloud Consumer
2. Cloud Provider
3. Cloud Carrier
4. Cloud Auditor
5. Cloud Broker
Each actor is an entity that participates in the process and/or completes duties in cloud
computing. This entity could be a person or an organization.
1. Cloud Consumer
The end user that the cloud computing service is designed to support is the cloud consumer. An
individual or corporation with a working relationship with a cloud provider and utilizing its
services is referred to as a cloud consumer. A cloud customer peruses a cloud provider's service
catalogue, makes the proper service request, enters into a service agreement with the cloud
provider, and then utilizes the service. The cloud customer may be charged for the service
provided, in which case payment arrangements must be made. They need to have a cloud Service
Level Agreement (SLA). SLAs can cover terms regarding the quality of service, security, remedies
for performance failures. A cloud provider may also list in the SLAs a set of promises explicitly
not made to consumers, i.e., limitations, and obligations that cloud consumers must accept. A
cloud consumer can freely choose a cloud provider with better pricing and more favourable
terms.
2. Cloud Provider
A cloud provider is a person or an organization; it is the entity responsible for making a service available to interested parties. A Cloud Provider acquires and manages the computing infrastructure required for providing the services, runs the cloud software that provides the services, and makes arrangements to deliver the cloud services to the Cloud Consumers through
network access.
i. Service Orchestration: Service Orchestration refers to the composition of system
components to support the Cloud Provider’s activities in the arrangement, coordination and
management of computing resources in order to provide cloud services to Cloud
Consumers. There are three components in Service Orchestration.
1. Service Layer: The top is the service layer; this is where Cloud Providers define
interfaces for Cloud Consumers to access the computing services. Access interfaces of
each of the three service models are provided in this layer.
• For SaaS, the cloud provider deploys, configures, maintains and updates the operation of
the software applications on a cloud infrastructure so that the services are provisioned at
the expected service levels to cloud consumers.
• For PaaS, the Cloud Provider manages the computing infrastructure for the platform and
runs the cloud software that provides the components of the platform, such as runtime
software execution stack, databases, and other middleware components. The PaaS Cloud
Provider typically also supports the development, deployment and management process
of the PaaS Cloud Consumer by providing tools such as integrated development
environments (IDEs), development version of cloud software, software development kits
(SDKs), deployment and management tools.
• For IaaS, the Cloud Provider acquires the physical computing resources underlying the
service, including the servers, networks, storage and hosting infrastructure. The Cloud
Provider runs the cloud software necessary to make computing resources available to
the IaaS Cloud Consumer through a set of service interfaces and computing resource
abstractions, such as virtual machines and virtual network interfaces.
2. Resource Abstraction and Control Layer: The middle layer in the model is the resource
abstraction and control layer. This layer contains the system components that Cloud
Providers use to provide and manage access to the physical computing resources through
software abstraction. Examples of resource abstraction components include software
elements such as hypervisors, virtual machines, virtual data storage, and other computing
resource abstractions. The resource abstraction needs to ensure efficient, secure, and
reliable usage of the underlying physical resources. The control aspect of this layer refers
to the software components that are responsible for resource allocation, access control,
and usage monitoring. This is the software fabric that ties together the numerous
underlying physical resources and their software abstractions to enable resource pooling,
dynamic allocation, and measured service.
3. Physical Resource Layer: The lowest layer in the stack is the physical resource layer,
which includes all the physical computing resources. This layer includes hardware
resources, such as computers (CPU and memory), networks (routers, firewalls, switches,
network links and interfaces), storage components (hard disks) and other physical
computing infrastructure elements. It also includes facility resources, such as heating,
ventilation and air conditioning (HVAC), power, communications, and other aspects of the
physical plant.
ii. Cloud Service Management: Cloud Service Management includes all of the service-related
functions that are necessary for the management and operation of those services required
by or proposed to cloud consumers. Cloud service management can be described from the
perspective of business support, provisioning and configuration, and from the perspective
of portability and interoperability requirements.
1. Business Support: Business Support entails the set of business-related services dealing
with clients and supporting processes. It includes the components used to run business
operations that are client-facing.
• Customer management: Manage customer accounts, open/close/terminate
accounts, manage user profiles, manage customer relationships by providing
points-of-contact and resolving customer issues and problems, etc.
• Contract management: Manage service contracts,
setup/negotiate/close/terminate contract, etc.
• Inventory Management: Set up and manage service catalogs, etc.
• Accounting and Billing: Manage customer billing information, send billing
statements, process received payments, track invoices, etc.
• Reporting and Auditing: Monitor user operations, generate reports, etc.
• Pricing and Rating: Evaluate cloud services and determine prices, handle
promotions and pricing rules based on a user's profile, etc.
iii. Security: It is critical to recognize that security is a cross-cutting aspect of the architecture
that spans across all layers of the reference model, ranging from physical security to
application security. Therefore, security concerns in cloud computing architecture are not solely under the purview of the Cloud Providers, but also of Cloud Consumers and other
relevant actors. Cloud-based systems still need to address security requirements such as
authentication, authorization, availability, confidentiality, identity management, integrity,
audit, security monitoring, incident response, and security policy management.
iv. Privacy: Cloud providers should protect the assured, proper, and consistent collection,
processing, communication, use and disposition of personal information (PI) and personally
identifiable information (PII) in the cloud. PII is the information that can be used to
distinguish or trace an individual’s identity, such as their name, social security number,
biometric records, etc. alone, or when combined with other personal or identifying
information that is linked or linkable to a specific individual, such as date and place of birth,
mother’s maiden name, etc. Though cloud computing provides a flexible solution for shared
resources, software and information, it also poses additional privacy challenges to
consumers using the clouds.
3. Cloud Auditor: A cloud auditor is a party that can perform an independent examination of
cloud service controls with the intent to express an opinion thereon. Audits are performed to
verify conformance to standards through review of objective evidence. A cloud auditor can
evaluate the services provided by a cloud provider in terms of security controls, privacy impact,
performance, etc.
• For security auditing, a cloud auditor can make an assessment of the security controls in
the information system to determine the extent to which the controls are implemented
correctly, operating as intended, and producing the desired outcome with respect to the
security requirements for the system. The security auditing should also include the
verification of the compliance with regulation and security policy.
• A privacy impact audit can help Federal agencies comply with applicable privacy laws and
regulations governing an individual’s privacy, and to ensure confidentiality, integrity, and
availability of an individual’s personal information at every stage of development and
operation.
4. Cloud Broker: As cloud computing evolves, the integration of cloud services can be too
complex for cloud consumers to manage. A cloud consumer may request cloud services from a
cloud broker, instead of contacting a cloud provider directly. A cloud broker is an entity that
manages the use, performance and delivery of cloud services and negotiates relationships
between cloud providers and cloud consumers.
In general, a cloud broker can provide services in three categories (a small arbitrage sketch follows the list):
• Service Intermediation: A cloud broker enhances a given service by improving some
specific capability and providing value-added services to cloud consumers. The
improvement can be managing access to cloud services, identity management,
performance reporting, enhanced security, etc.
• Service Aggregation: A cloud broker combines and integrates multiple services into one
or more new services. The broker provides data integration and ensures the secure data
movement between the cloud consumer and multiple cloud providers.
• Service Arbitrage: Service arbitrage is similar to service aggregation except that the
services being aggregated are not fixed. Service arbitrage means a broker has the
flexibility to choose services from multiple agencies. The cloud broker, for example, can
use a credit-scoring service to measure and select an agency with the best score.
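To make service arbitrage slightly more concrete, the sketch below shows a broker-style helper that scores several provider offers and selects the best one. The provider names, prices, and scoring weights are invented for the illustration.

```python
# Toy sketch of service arbitrage: a broker picks the best provider offer.
# Provider names, prices, and weights are invented for illustration.

offers = [
    {"provider": "provider-a", "price_per_hour": 0.12, "reliability": 0.995},
    {"provider": "provider-b", "price_per_hour": 0.10, "reliability": 0.990},
    {"provider": "provider-c", "price_per_hour": 0.15, "reliability": 0.999},
]

def score(offer, price_weight=0.5, reliability_weight=0.5):
    # Lower price and higher reliability both raise the score.
    return reliability_weight * offer["reliability"] - price_weight * offer["price_per_hour"]

best = max(offers, key=score)
print("Broker selects:", best["provider"])
```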
5. Cloud Carrier: A cloud carrier acts as an intermediary that provides connectivity and transport
of cloud services between cloud consumers and cloud providers. Cloud carriers provide access to
consumers through network, telecommunication and other access devices. For example, cloud
consumers can obtain cloud services through network access devices, such as computers, laptops, mobile
phones, mobile Internet devices (MIDs), etc. The distribution of cloud services is normally
provided by network and telecommunication carriers or a transport agent, where a transport
agent refers to a business organization that provides physical transport of storage media such as
high-capacity hard drives. Note that a cloud provider will set up SLAs with a cloud carrier to
provide services consistent with the level of SLAs offered to cloud consumers, and may require
the cloud carrier to provide dedicated and secure connections between cloud consumers and
cloud providers.
2.3 CLOUD ARCHITECTURE
Cloud Computing Architecture is divided into two parts, i.e., front-end and back-end. Front-end
and back-end communicate via a network or internet.
The front end is the client of such architecture and communicates with the backend through a
network or internet connection. In cloud computing architecture, the client-side or front-end
becomes visible to other entities whereas the backend remains hidden from contact with anyone
on the outside, yet it is able to communicate directly with its client through a predetermined
protocol.
The backend of cloud architecture helps protect vital information from the demand of client-
facing technology. It receives queries about your data and responds appropriately. The backend
is an important aspect of your overall computer system that makes up a big part of the entire
cloud concept.
1. Front-End: The frontend of the cloud architecture refers to the client side of the cloud computing system. It contains all the user interfaces and applications which are used by the
client to access the cloud computing services/resources. For example, use of a web
browser to access the cloud platform.
• Client Infrastructure – Client Infrastructure is a part of the frontend component.
It contains the applications and user interfaces which are required to access the
cloud platform. Cloud infrastructure consists of hardware and software
components such as data storage, server, virtualization software, etc.
• It also provides a Graphical User Interface to the end-users to perform respective
tasks.
• The front end includes web browsers (such as Chrome, Firefox, Internet Explorer, etc.), thin & fat clients, tablets, and mobile devices.
2. Back-End: Backend refers to the cloud itself which is used by the service provider. It
contains the resources as well as manages the resources and provides security
mechanisms. Along with this, it includes huge storage, virtual applications, virtual
machines, traffic control mechanisms, deployment models, etc.
It is responsible for monitoring all the programs that run the application on the front-end. It has a large number of data storage systems and servers.
The components of the back-end cloud architecture are mentioned below.
i. Application- The application in the backend refers to the software or platform which the client accesses. Depending upon the client’s requirement, the application provides the result to the end-user (with resources) in the back end.
ii. Service- A cloud service manages which type of service you access according to the client’s requirement. Cloud computing offers the following three types of services:
a) Software as a Service (SaaS) – It is also known as cloud application services.
Mostly, SaaS applications run directly through the web browser, which means we do not need to download and install these applications.
Example: Google Apps, Salesforce, Dropbox, Slack, HubSpot, Cisco WebEx.
iii. Runtime Cloud- Runtime cloud in backend provides the execution and Runtime
platform/environment to the Virtual machine. It makes use of technology such as
virtualization and allows people to access countless networked servers at any given time.
iv. Storage- Storage is one of the most important components of cloud computing. It
provides a huge amount of storage capacity in the cloud to store and manage data. It
stores and maintains data like files, videos, documents, etc. over the internet. Some of the
popular examples of storage services are Amazon S3, Oracle Cloud-Storage, Microsoft
Azure Storage, etc. Its capacity varies depending upon the service providers available in
the market.
it implements security management to the cloud server with virtual firewalls which
results in preventing data loss.
vii. Infrastructure- It provides services on the host level, application level, and network
level. Cloud infrastructure includes hardware and software components such as servers,
storage, network devices, virtualization software, and other storage resources that are
needed to support the cloud computing model.
3. Deployment Software: It helps to deploy and integrate the application on the cloud. It
consists of all the mandatory installations and configurations required to run a cloud
service. Every deployment of cloud services is performed using a deployment software.
The three different models which can be deployed are the following:
• SaaS - Software as a Service hosts and manages the applications of the end-user.
Example: Gmail
• PaaS - Platform as a service helps developers to build, create, and manage
applications. Example: Microsoft Azure
• IaaS - Infrastructure as a service provides services on a pay-as-you-go pricing
model.
4. Network: It is the key component of cloud infrastructure. It allows cloud services to be connected over the Internet. It is also possible to deliver the network as a utility over the Internet, which means the customer can customize the network route and protocol. It connects the front-end and back-end and allows every user to access cloud resources. It helps users to connect to the cloud and to customize routes and protocols, and it is highly flexible, secure, and cost-effective.
5. Server: The server handles resource sharing and offers other services such as resource allocation and de-allocation, resource monitoring, and security. It is a virtual server delivered over the internet, built and hosted on the cloud computing platform. It can be accessed remotely and has all the characteristics of an on-premises server.
6. Storage: Storage forms an important part of the cloud, as all user-generated data can be stored in the cloud and accessed from any location. It reduces the resources spent on on-premises physical storage and also brings flexibility to storage requirements. Storage is essentially a back-end component. The cloud keeps multiple replicas of stored data; if one storage resource fails, the data can be retrieved from another replica, which makes cloud computing more reliable.
IaaS is also known as Hardware as a Service (HaaS). It is one of the layers of the cloud computing platform. It allows customers to outsource their IT infrastructure, such as servers, networking, processing, storage, virtual machines, and other resources. Customers access these resources over the Internet using a pay-as-per-use model.
In traditional hosting services, IT infrastructure was rented out for a specific period of time, with
pre-determined hardware configuration. The client paid for the configuration and time,
regardless of the actual use. With the help of the IaaS cloud computing platform layer, clients can
dynamically scale the configuration to meet changing requirements and are billed only for the
services actually used. IaaS cloud computing platform layer eliminates the need for every
organization to maintain the IT infrastructure.
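As a simple illustration of this billing difference, the short sketch below compares a fixed monthly rental with pay-per-use billing; the prices and usage figures are hypothetical assumptions, not any vendor's actual rates.

# Hypothetical comparison of traditional fixed hosting rental vs. IaaS pay-per-use billing.
# All prices and usage figures are illustrative assumptions, not real vendor rates.

FIXED_MONTHLY_RENTAL = 700.0   # pre-configured server rented for the whole month
HOURLY_RATE = 0.90             # per-hour rate for a comparable IaaS instance

def iaas_cost(hours_used, hourly_rate=HOURLY_RATE):
    """Pay-as-per-use: the client is billed only for the hours actually used."""
    return hours_used * hourly_rate

for hours in (100, 400, 720):  # 720 hours is roughly a full month of continuous use
    print(f"{hours:>4} h  IaaS: {iaas_cost(hours):7.2f}   fixed rental: {FIXED_MONTHLY_RENTAL:7.2f}")

With light usage the pay-per-use bill stays well below the fixed rental, while continuous use approaches it, which is exactly the trade-off described above.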
All of the above resources are made available to the end user via server virtualization. Moreover, these resources are accessed by the customers as if they owned them.
Disadvantages of IaaS cloud computing layer
1. Security: Security is one of the biggest issues in IaaS. Most IaaS providers are not able to provide 100% security.
2. Maintenance & Upgrade: Although IaaS service providers maintain the software, they do not upgrade the software for some organizations.
3. Interoperability issues: It is difficult to migrate a VM from one IaaS provider to another, so customers might face problems related to vendor lock-in.
Top IaaS providers offering the IaaS cloud computing platform
IaaS Vendor – IaaS Solution – Details
• Amazon Web Services – Elastic Compute Cloud (EC2), Elastic MapReduce, Route 53, Virtual Private Cloud, etc. – The cloud computing platform pioneer; Amazon offers auto scaling, cloud monitoring, and load balancing features as part of its portfolio.
• Netmagic Solutions – Netmagic IaaS Cloud – Netmagic runs from data centers in Mumbai, Chennai, and Bangalore, and a virtual data center in the United States. Plans are underway to extend services to West Asia.
• Rackspace – Cloud Servers, Cloud Files, Cloud Sites, etc. – The cloud computing platform vendor focuses primarily on enterprise-level hosting services.
• Reliance Communications – Reliance Internet Data Center (RIDC) – RIDC supports both traditional hosting and cloud services, with data centers in Mumbai, Bangalore, Hyderabad, and Chennai. The cloud services offered by RIDC include IaaS and SaaS.
PaaS includes infrastructure (servers, storage, and networking) and platform (middleware, development tools, database management systems, business intelligence, and more) to support the web application life cycle.
Google App Engine and Force.com are examples of PaaS vendors. Developers may log on to these websites and use the built-in APIs to create web-based applications.
The disadvantage of using PaaS is that the developer gets locked in with a particular vendor. For example, an application written in Python against Google's APIs and using Google App Engine is likely to work only in that environment.
PaaS providers provide programming languages, application frameworks, databases, and other tools:
1. Programming languages: PaaS providers provide various programming languages for
the developers to develop the applications. Some popular programming languages
provided by PaaS providers are Java, PHP, Ruby, Perl, and Go.
4. Other tools: PaaS providers provide various other tools that are required to develop, test,
and deploy the applications.
Advantages of PaaS
There are the following advantages of PaaS -
1. Simplified Development- PaaS allows developers to focus on development and
innovation without worrying about infrastructure management.
2. Lower risk- No need for up-front investment in hardware and software. Developers only
need a PC and an internet connection to start building applications.
3. Prebuilt business functionality- Some PaaS vendors also provide predefined business functionality, so users can avoid building everything from scratch and can start their projects directly.
4. Instant community- PaaS vendors frequently provide online communities where developers can share ideas and experiences and seek advice from others.
5. Scalability- Applications deployed can scale from one to thousands of users without any
changes to the applications.
Disadvantages of PaaS
1. Vendor lock-in- One has to write applications according to the platform provided by the PaaS vendor, so migrating an application to another PaaS vendor would be a problem.
2. Data Privacy- Corporate data, whether critical or not, is private; if it is not located within the walls of the company, there can be a risk in terms of data privacy.
3. Integration with the rest of the system's applications- Some applications may be local and some in the cloud, so there can be increased complexity when we want to use data that is in the cloud together with local data.
3. Social Networks - As we all know, social networking sites are used by the general public,
so social networking service providers use SaaS for their convenience and handle the
general public's information.
4. Mail Services - To handle the unpredictable number of users and the load on e-mail services, many e-mail providers offer their services using SaaS.
Provider – Services
• Salesforce.com – On-demand CRM solutions
• Microsoft Office 365 – Online office suite
• Google Apps – Gmail, Google Calendar, Docs, and Sites
• NetSuite – ERP, accounting, order management, CRM, Professional Services Automation (PSA), and e-commerce applications
• GoToMeeting – Online meeting and video-conferencing software
• Constant Contact – E-mail marketing, online surveys, and event marketing
• Oracle CRM – CRM applications
• Workday, Inc. – Human capital management, payroll, and financial management
o Shared Infrastructure: Several users share the infrastructure in public cloud settings.
Cost reductions and effective resource use are made possible by this.
o Scalability: By using the public cloud, users can easily adjust the resources they need
based on their requirements, allowing for quick scaling up or down.
o Pay-per-Usage: When using the public cloud, payment is based on usage, so users only
pay for the resources they actually use. This helps optimize costs and eliminates the need
for upfront investments.
o Managed by Service Providers: Cloud service providers manage and maintain public
cloud infrastructure. They handle hardware maintenance, software updates, and security
tasks, relieving users of these responsibilities.
o Reliability and Redundancy: Public cloud providers ensure high reliability by
implementing redundant systems and multiple data centers. By doing this, the probability
of losing data and experiencing service disruptions is reduced.
o Security Measures: Public cloud providers implement robust security measures to
protect user data. These include encryption, access controls, and regular security audits.
• Potential for unexpected costs with usage-based pricing models.
• Lack of customization options and flexibility compared to private or hybrid cloud
environments.
• Reliance on the cloud provider's support and responsiveness for issue resolution.
2. Private cloud: A private cloud is also known as an internal cloud or corporate cloud. A private cloud allows systems and services to be accessible within an organization. It is used by organizations to build and manage their own data centers, either internally or through a third party. It can be deployed using open-source tools such as OpenStack and Eucalyptus.
Instead of a pay-as-you-go model, private clouds may use other schemes that manage the usage of the cloud and proportionally bill the different departments or sections of an enterprise. Private cloud providers include HP Data Centers, Ubuntu, Elastic-Private cloud, Microsoft, etc.
Based on location and management, the National Institute of Standards and Technology (NIST) divides private cloud into the following two parts-
from the expertise and resources of the service provider, alleviating the burden
of infrastructure management. The outsourced private cloud model offers
scalability, as the provider can adjust resources based on the organization's
needs. Due to its flexibility, it is a desirable choice for businesses that want the advantages of a private cloud deployment without the initial capital outlay and ongoing maintenance expenses involved in an on-premise implementation.
Compared to public cloud options, both on-premise and external private clouds give
businesses more control over their data, apps, and security. Private clouds are
particularly suitable for organizations with strict compliance requirements, sensitive
data, or specialized workloads that demand high levels of customization and security.
• Hybrid Cloud Integration: Private cloud can be integrated with public cloud services,
forming a hybrid cloud infrastructure. This integration allows organizations to leverage
the benefits of both private and public clouds.
3. Hybrid cloud: The term “hybrid cloud” refers to a system that is a combination of both
public and private clouds, i.e., public cloud + private cloud = hybrid cloud. Non-critical
activities are performed using public cloud while the critical activities are performed
using private cloud. A hybrid cloud is mostly employed in finance, healthcare, and higher
education.
Example: Google Application Suite (Gmail, Google Apps, and Google Drive), Office 365
(MS Office on the Web and One Drive), Amazon Web Services.
Characteristics of Hybrid Cloud
• Integration of Public and Private Clouds: Hybrid cloud seamlessly integrates public
and private clouds, allowing organizations to leverage both advantages. It provides a
unified platform where workloads and data can be deployed and managed across both
environments.
• Flexibility and Scalability: Hybrid cloud offers resource allocation and scalability
flexibility. Organizations can dynamically scale their infrastructure by utilizing additional
resources from the public cloud while maintaining control over critical workloads on the
private cloud.
• Enhanced Security and Control: Hybrid cloud allows organizations to maintain higher
security and control over their sensitive data and critical applications. Private cloud
components provide a secure and dedicated environment, while public cloud resources
can be used for non-sensitive tasks, ensuring a balanced approach to data protection.
• Cost Optimization: Hybrid cloud enables organizations to optimize costs by utilizing the cost-effective public cloud for non-sensitive workloads while keeping mission-critical applications and data on the private cloud. This approach allows for efficient resource allocation and cost management.
• Data and Application Portability: Organizations can move workloads and data between
public and private clouds as needed with a hybrid cloud. This portability offers agility and
the ability to adapt to changing business requirements, ensuring optimal performance
and responsiveness.
• Compliance and Regulatory Requirements: Hybrid cloud helps organizations address compliance and regulatory requirements more effectively. Sensitive data and applications can be kept within the private cloud, ensuring compliance with industry-specific regulations, while the public cloud is leveraged for other non-sensitive operations.
• Disaster Recovery and Business Continuity: Hybrid cloud facilitates robust disaster
recovery and business continuity strategies. Organizations can replicate critical data and
applications between the private and public clouds, ensuring redundancy and minimizing
the risk of data loss or service disruptions.
Advantages of Hybrid Cloud
There are the following advantages of Hybrid Cloud -
• Hybrid cloud is suitable for organizations that require more security than the public
cloud.
• Hybrid cloud helps you to deliver new products and services more quickly.
• Hybrid cloud provides an excellent way to reduce the risk.
• Hybrid cloud offers flexible resources because of the public cloud and secure resources
because of the private cloud.
• Hybrid facilitates seamless integration between on-premises infrastructure and cloud
environments.
• Hybrid provides greater control over sensitive data and compliance requirements.
• Hybrid enables efficient workload distribution based on specific needs and performance
requirements.
• Hybrid offers cost optimization by allowing organizations to choose the most suitable
cloud platform for different workloads.
• Hybrid enhances business continuity and disaster recovery capabilities with private and
public cloud resources.
• Hybrid cloud supports flexible architectures, allowing applications and data to be deployed across multiple cloud environments based on their unique requirements.
cloud infrastructure. This infrastructure allows them to access shared services,
applications, and data relevant to their community.
Example: Health Care community cloud
exploitation. This encourages creativity, education, and effectiveness within the community.
• Scalability and Flexibility: Community cloud enables organizations to scale up or
reduce their resources in response to demand. This allows the community to adjust to
shifting computing requirements and efficiently use cloud resources as needed.
5. Multi cloud: Multi-cloud is a strategy in cloud computing where companies utilize more than one cloud service provider or platform to meet their computing needs. It involves distributing workloads, applications, and data throughout numerous cloud environments consisting of public, private, and hybrid clouds.
Adopting a multi-cloud approach allows businesses to select and leverage the most appropriate cloud services from different providers based on their specific necessities. This allows them to harness each provider's distinctive capabilities and services, mitigating the risk of relying solely on one vendor while benefiting from competitive pricing models.
Examples: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)
Characteristics of Multi-cloud
• Multiple Cloud Providers: The key characteristic of multi-cloud is the utilization of multiple cloud service providers. Organizations can leverage the offerings of different providers, such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), and others, to access a wide range of services and capabilities.
• Diversification and Risk Reduction: Thanks to multi-cloud, organizations may
distribute workloads, apps, and data across several cloud environments. This
diversification decreases the danger of vendor lock-in, and the effects of any service
interruptions or outages from a single cloud provider are lessened.
• Flexibility and Vendor Independence: Businesses using multi-cloud can choose the best cloud services from various providers per their requirements. This approach enables companies to leverage each provider's unique benefits and avoids depending solely on a single supplier for all their cloud computing requirements.
• Optimization of Services and Costs: Organizations may optimize their services and costs by using a multi-cloud strategy and choosing the most affordable and appropriate cloud provider for each workload or application. They can use specialized services from many sources to meet certain demands, taking advantage of competitive pricing structures.
• Enhanced Reliability and Performance: Multi-cloud enhances reliability and
performance by utilizing multiple cloud environments. By utilizing the infrastructure and
resources of various providers, organizations can achieve high availability, scalability,
and enhanced performance for their applications and services.
• Data Sovereignty and Compliance: Multi-cloud allows organizations to address data
sovereignty and compliance requirements by choosing cloud providers with data centers
in specific regions or jurisdictions. It provides flexibility in managing data residency and
regulatory compliance obligations.
• Interoperability and Integration: Multi-cloud necessitates interoperability and
integration between different cloud platforms. Organizations must ensure seamless data
exchange, application compatibility, and integration of services across the various cloud
environments they utilize.
Advantages of Multi-Cloud:
There are the following advantages of multi-Cloud -
• It allows organizations to choose the most suitable cloud services from different
providers based on their specific requirements.
• Distributing workloads and data across multiple cloud environments enhances reliability
and ensures resilience in case of service disruptions or downtime.
• By utilizing multiple providers, organizations can avoid dependency on a single vendor and mitigate the risks associated with vendor lock-in.
• Organizations can optimize services and costs by selecting the most cost-effective and
suitable cloud provider for each workload or application.
• Leveraging the infrastructure and resources of different cloud providers allows
organizations to achieve high availability, scalability, and improved performance.
• It enables organizations to select cloud providers with data centers in specific regions,
addressing data sovereignty and compliance requirements.
• Access to specialized services and capabilities from different providers promotes
innovation and allows organizations to leverage the best-in-class offerings in the market.
• Distributing workloads across multiple clouds reduces the risk of data loss or service
disruptions, providing enhanced disaster recovery capabilities.
Disadvantages of Multi-Cloud:
• Increased complexity in managing multiple cloud environments.
• Potential for higher costs due to multiple subscriptions and data transfer fees.
• Challenges in ensuring data governance and compliance across multiple clouds.
• Integration difficulties and compatibility issues between different cloud providers.
• Potential for increased management overhead and resource requirements.
• Risk of vendor dependencies and interoperability challenges.
2.8 ECONOMICS OF CLOUD COMPUTING
Cloud computing economics is the study of the costs and benefits of cloud computing and the economic principles behind them. It examines the key financial questions a business faces when considering the cloud. Some of the topics that are considered are:
• What will be the return on investment of migrating to cloud computing?
• Benefits of switching to a different cloud provider
• Comparing the cost of cloud computing to the traditional setup of computing power
The total cost of ownership: The organization first estimates its expenditure if it deploys an on-premises system. Based on that, the cost estimation of the cloud computing setup is done.
Cost of the current data center: If an organization is moving from an existing data center to
cloud computing, then it is necessary to include the maintenance costs, IT hardware costs,
software licenses, supplies, spare parts, and everything else that the organization needs to pay to
keep the operations up and running.
Cost of estimated cloud infrastructure: After assessing the current infrastructure, the
organization compares it to the pricing of the cloud computing setup. The pricing of cloud
computing varies widely depending on the service provider and many other factors.
Cost of cloud migration execution: The cost incurred to migrate the operations from the
current infrastructure to cloud computing is considered. These include the consulting fees,
software licensing costs, software testing, integration costs, etc.
Additional post-migration cost: There are certain costs associated with maintenance of the
cloud computing infrastructure like skilled labor force, improving the cloud environment,
administration, etc. These costs are also taken into consideration.
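Taken together, the cost heads above give a rough total-cost-of-ownership comparison. The sketch below uses made-up placeholder figures purely for illustration; real estimates depend on the provider's pricing, the workload, and the planning horizon.

# Rough TCO comparison between keeping the current data center and migrating to the cloud.
# Every figure is a hypothetical placeholder for the cost heads described above.

def on_premises_tco(maintenance, hardware, licenses, supplies, years):
    """Annual running costs of the existing data center over the planning horizon."""
    return (maintenance + hardware + licenses + supplies) * years

def cloud_tco(annual_cloud_bill, migration_cost, annual_post_migration, years):
    """Estimated cloud bill plus one-time migration and ongoing post-migration costs."""
    return migration_cost + (annual_cloud_bill + annual_post_migration) * years

years = 3
on_prem = on_premises_tco(maintenance=50_000, hardware=80_000,
                          licenses=30_000, supplies=10_000, years=years)
cloud = cloud_tco(annual_cloud_bill=90_000, migration_cost=40_000,
                  annual_post_migration=20_000, years=years)
print(f"On-premises TCO over {years} years: {on_prem}")
print(f"Cloud TCO over {years} years: {cloud}")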
Economic Characteristics of cloud
Cloud computing is an economically feasible option for computing resources. New start-ups often don't have enough funding and resources in the initial stages to set up an in-house computing infrastructure, and cloud computing provides a practical and economical option for organizations like these. Some of the economic characteristics of cloud computing are discussed below.
• Scalability: Cloud computing offers a flexible option to expand operations as and when required. It allows access to virtually unlimited computing resources without large up-front commitments, although making good use of this feature requires good planning.
• Low Entry Barrier: Users can access good computing resources based on their needs
starting at a low price point. They don't need to invest a massive amount of money in
beginning their basic operations.
• Utility: Cloud service providers offer pricing models that match needs and resources to the pricing. This allows the client to pay only for the resources they need, which eliminates wastage of resources and, in turn, reduces the end-user cost of services and products.
• Flexibility: The users can resize their resources based on their needs. They offer high
economic flexibility.
Cloud services can be used at both the personal and organizational levels. We normally have a
large amount of data to store at the organizational level, and as the firm grows, so does its data.
As a result, any company would want more storage options, or an expansion of storage services,
at some point.
The number of employees will also grow over time, resulting in an increase in the capacity of on-
premise data centers. Such data centers are expensive to build and maintain. Using a cloud is an
excellent way to get out of this situation. Cloud computing uses are increasing every day. Here are
some applications of cloud computing in different spheres-
2.9 CLOUD COMPUTING CHALLENGES
Cloud computing is being implemented in almost all companies, as companies need to store data. A tremendous amount of data is generated and stored by companies, so they face many security issues. Cloud computing, an emerging technology, poses many challenges in different aspects of data and information handling.
1) Security and Privacy: Security and privacy of information is the biggest challenge to cloud computing. The data stored in the cloud must be secured and kept fully confidential. Customers rely heavily on the cloud provider, which means the cloud provider should take the necessary security measures to secure the customers' data. Security is also the responsibility of the customer: they should use a strong password, not share the password with anyone, and change the password regularly. If the data is outside the firewall, there may be issues that the cloud provider can mitigate.
Hacking and malware are also major problems, as they can affect multiple customers. Hacking can lead to data loss, disruption of the encrypted file system, and many other problems. Security and privacy issues can be addressed by employing encryption, security hardware, and security applications.
2) Portability: Another challenge to cloud computing is that applications should be easily migrated from one cloud provider to another, without vendor lock-in. However, this is not yet possible because each cloud provider uses different standard languages for its platform.
5) Reliability and Flexibility: Reliability and flexibility are also challenges for cloud customers. They can be addressed by ensuring that data provided to the cloud does not leak and that the host provides reliability to the customers. To address this challenge, the services provided by third parties should be monitored, with supervision of performance, robustness, and business dependency.
This challenge can be addressed by training employees, using the proper tools, and doing research.
7) Cost: Cloud computing is affordable, but tailoring the cloud to the customer's demands can sometimes be expensive. This can be a hindrance to small-scale organizations, since modifying the cloud as per their demands can sometimes cost more. In addition, transferring data from the cloud to the premises can also be costly.
9) Lack of resources: Lack of resources and expertise is also one of the major challenges faced by the cloud industry, and many companies hope to overcome it by hiring more experienced workers. These workers will not only help to overcome the companies' challenges but will also train existing staff to benefit the company. Today many IT workers are working to build their cloud computing expertise, while company leadership finds this difficult because workers are not yet sufficiently skilled. It is believed that workers with knowledge of the latest developments and related technologies will become more valuable in business.
Portability and interoperability are related to the ability to create systems that function
together "out of the box" from interchangeable components.
Interoperability:
It is defined as the capacity of at least two systems or applications to exchange data and utilize it. Cloud interoperability, in turn, is the extent to which one cloud service can connect with another by exchanging data as per the defined strategy to obtain results.
The two crucial components of cloud interoperability are usability and connectivity, which are further divided into multiple layers:
1. Behaviour
2. Policy
3. Semantic
4. Syntactic
5. Transport
Portability:
It is the process of transferring data or an application from one framework to another while keeping it executable and usable. Portability can be separated into two types: cloud data portability and cloud application portability.
• Cloud data portability – It is the capability of moving information from one cloud service to another without needing to re-enter the data.
1. Data Portability – Data portability, also termed cloud portability, refers to the transfer of data from one source to another or from one service to another, i.e., from one application to another application or from one cloud service to another cloud service, with the aim of providing a better service to the customer without affecting usability. Moreover, it makes the cloud migration process easier.
3. Platform Portability – There are two types of platform portability: platform source portability and machine image portability. In platform source portability, for example, the UNIX OS, which is mostly written in C, can be ported by re-compiling it on different hardware and re-writing the hardware-dependent sections that are not coded in C. Machine image portability binds the application with the platform by porting the resulting bundle, which requires a standard program representation.
5. Platform Interoperability – It is the interoperability between deployed components of
platforms deployed in a system. It is an important aspect, as application interoperability
can’t be achieved without platform interoperability.
6. Management Interoperability – Here, cloud services like SaaS, PaaS, or IaaS and self-service applications are assessed. This is becoming predominant, as cloud services allow enterprises to work in-house and reduce dependency on third parties.
Major scenarios where interoperability and portability are required: The Cloud Standards Customer Council (CSCC) has identified some basic scenarios where portability and interoperability are required.
• Switching between cloud service providers – The customer wants to transfer data or
applications from Cloud 1 to Cloud 2.
• Using multiple cloud service providers- The client may subscribe to the same or
different services e.g., Cloud 1 and 2.
• Directly linked cloud services- The customer can use the service by linking to Cloud 1
and Cloud 3.
• Hybrid Cloud configuration- Here the customer connects a legacy system kept in a private cloud (Cloud 1), rather than a public cloud, to public cloud services, i.e., Cloud 3.
• Cloud Migration- Clients migrate one or more in-house applications to Cloud 1.
• The degree of data mobility can also act as an obstacle. When moving data from one cloud to another, the capability of moving workloads from one host to another should also be assessed.
• Interoperability should not be left out, otherwise data migration can be highly affected. So, the functioning of all components and applications should be ensured.
• As data is highly important in business, the safety of customer’s data should be ensured.
Cloud interoperability reduces complexity by providing custom interfaces. Moving from one framework to another is possible with a container service, which also improves scalability. Despite a few hurdles, adaptability to changes in service providers and better assistance for cloud clients will drive the improvement of cloud interoperability.
Interoperability Standards:
Standards are important in cloud computing for a variety of reasons. Standards for
interoperability and data and application portability can ensure an open competitive market in
cloud computing because customers are not locked-in to cloud providers and can easily transfer
data or applications between cloud providers.
Vendor lock-in can prevent a customer from switching to a competitor's solution. If switching is possible, it happens at considerable conversion cost and requires a significant amount of time. Switching may happen because the customer wants a solution better suited to their needs, or because the vendor cannot provide the required service. The presence of standards that are actually implemented and adopted in the cloud computing community therefore enables interoperability and lessens the risks resulting from vendor lock-in.
Open Standards:
The development and adoption of open standards play a vital role in achieving cloud interoperability. Initiatives such as OpenStack, Cloud Foundry, and Kubernetes provide open-source frameworks and APIs that enable seamless integration and collaboration across different cloud platforms.
Intercloud Standards:
Intercloud standards focus on establishing protocols and frameworks for interconnecting
multiple cloud environments. Efforts like CloudEthernet Forum, Cloud Standards Customer
Council (CSCC), and Distributed Management Task Force (DMTF) work towards defining
interoperability guidelines and best practices for hybrid and multi-cloud environments.
2.11 SCALABILITY AND FAULT TOLERANCE
Scalability:
Cloud scalability in cloud computing refers to increasing or decreasing IT resources as needed to
meet changing demand. Scalability is one of the hallmarks of the cloud and the primary driver of
its explosive popularity with businesses. Data storage capacity, processing power, and
networking can all be increased by using existing cloud computing infrastructure. Scaling can be
done quickly and easily, usually without any disruption or downtime.
Third-party cloud providers already have the entire infrastructure in place. In the past, when scaling up with on-premises physical infrastructure, the process could take weeks or months and require exorbitant expenses. This is one of the most popular and beneficial features of cloud computing, as businesses can scale up or down to meet demand depending on the season, projects, development, etc. By implementing cloud scalability, you enable your resources to grow
as your traffic or organization grows and vice versa. If your business needs more data storage
capacity or processing power, you'll want a system that scales easily and quickly. Cloud
computing solutions can do just that, which is why the market has grown so much. Using existing
cloud infrastructure, third-party cloud vendors can scale with minimal disruption.
Types of scaling
1) Vertical Scalability (Scale-up) – In this type of scalability, we increase the power of
existing resources in the working environment in an upward direction. Vertical
scalability, also known as scale-up, refers to the ability to add more resources to an
existing instance. For example, if you need more computing power, you can add CPU, RAM,
or storage to an existing server. This type of scalability is often used for applications that
require more processing power than can be handled by a single instance. One of the
primary benefits of vertical scalability is that it allows you to optimize your existing
resources, which can help you save costs and reduce waste.
2) Horizontal Scalability: In this kind of scaling, the resources are added in a horizontal
row. Horizontal scalability, also known as scale-out, refers to the ability to add more
instances of the same resource to a cloud environment. For example, you can add more
servers to your environment if you need more computing power. This type of scalability
is often used to handle large-scale web traffic or data processing needs. One of the
primary benefits of horizontal scalability is that it allows you to achieve greater
processing power and performance by distributing workloads across multiple resources.
3) Diagonal Scalability – It is a mixture of both horizontal and vertical scalability, where the resources are added both vertically and horizontally. Hybrid scalability, also known as diagonal scaling, combines both horizontal and vertical scalability to provide a flexible and scalable cloud environment. This type of scalability allows you to add more instances or resources as needed while also optimizing your existing resources to achieve maximum efficiency. Hybrid scalability is often used for complex applications that require a combination of processing power, storage, and bandwidth (a small scaling-decision sketch follows this list).
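The scaling-decision sketch referred to above is given here. It shows, under stated assumptions, how a horizontal (scale-out/scale-in) rule might be expressed; the CPU thresholds and instance limits are illustrative choices, not any provider's actual autoscaling API.

# Minimal horizontal-scaling decision rule. Thresholds and the metric source are
# hypothetical; real cloud platforms expose similar policies through their
# autoscaling services.

def desired_instance_count(current_instances, avg_cpu_percent,
                           scale_out_at=70.0, scale_in_at=30.0,
                           min_instances=1, max_instances=10):
    """Add an instance under high load, remove one under low load."""
    if avg_cpu_percent > scale_out_at and current_instances < max_instances:
        return current_instances + 1   # horizontal scale-out
    if avg_cpu_percent < scale_in_at and current_instances > min_instances:
        return current_instances - 1   # horizontal scale-in
    return current_instances           # no change needed

print(desired_instance_count(current_instances=3, avg_cpu_percent=85.0))  # -> 4
print(desired_instance_count(current_instances=3, avg_cpu_percent=20.0))  # -> 2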
Fault Tolerance
Fault tolerance refers to the ability of a system to keep functioning even if a software or hardware failure occurs or a component goes down. It is a critical aspect of improving the reliability of a system and keeping it useful for the user under all circumstances. Cloud computing enables a system to have a good fault-tolerant environment by providing on-demand services and access to a pool of configurable resources that can be utilized easily with the least management effort.
Types of faults
The overall faults can be categorized into different types based on the domain that they affect and
the system's aspect that is influenced by it.
• Aging-related faults – for example, a full disk or denial of service.
Reasons for fault occurrence
• System failure: The hardware or the software of the system crashes, causing the process
of the system to abort. Hardware failure can occur due to insufficient maintenance, and
software failure can occur due to stack overflow resulting in the system crashing or
hanging.
• Security breach: The servers of the system are hacked by an outside party, resulting in data exposure and server damage. There can be different types of malicious attacks, including viruses and ransomware.
Fault tolerance in the cloud is supported by measures such as:
• Highly automated management systems – track the performance of the cloud services and initiate backup operations if any fault is detected.
• Geographic spread across multiple regions – ensures the system remains functional during regional outages and disasters.
• Use of AI for proactive maintenance – anticipates and fixes defects beforehand to avoid serious failures and downtime.
Fault tolerance through virtualization
The main purpose behind virtualization is to allow multiple users to access the same resources
simultaneously. To make the virtual machines fault tolerant, the virtual machine is replicated
across two separate physical servers. Hence, if one server fails, the other server takes over and
keeps the virtual machine running, consequently ensuring that the services are available at all
times. In addition, there is strong isolation between the virtual machines; therefore, if one virtual machine faces a failure, the other virtual machines are unaffected.
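The replication idea above can be sketched as a simple primary/standby failover check. The helper below is a hypothetical stand-in for a real health check (heartbeat, ping, hypervisor status), used only to illustrate the behaviour.

# Toy illustration of fault tolerance through replication: if the primary replica
# of a virtual machine stops responding, requests are switched to the standby
# replica on a separate physical server. Health checks are simulated randomly.

import random

def is_healthy(server):
    """Stand-in for a real health check; here the server is 'up' 80% of the time."""
    return random.random() > 0.2

def route_request(primary, standby):
    """Return the replica that should serve the next request."""
    if is_healthy(primary):
        return primary
    # Primary failed: the standby replica takes over, so the VM stays available.
    return standby

for _ in range(5):
    print("serving from:", route_request("vm-replica-on-server-A",
                                         "vm-replica-on-server-B"))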
2.12 PARALLEL AND DISTRIBUTED COMPUTING
Parallel computing is the method of dividing multiple tasks among several processors to perform
them simultaneously. These parallel systems can either share memory between processors or
distribute tasks across them. Parallel computing is beneficial as it helps reduce costs and increase
efficiency.
2. Bit-level parallelism: It is the form of parallel computing based on increasing the processor's word size. It reduces the number of instructions that the system must execute in order to perform a task on large-sized data. Example: Consider a scenario where an 8-bit processor must compute the sum of two 16-bit integers. It must first sum the 8 lower-order bits and then the 8 higher-order bits, thus requiring two instructions to perform the operation. A 16-bit processor can perform the operation with a single instruction.
3. Instruction-level parallelism: Without instruction-level parallelism, a processor can issue only one instruction in each clock cycle. However, instructions can be re-ordered and grouped so that they are executed concurrently without affecting the result of the program; this is called instruction-level parallelism.
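As a small, self-contained illustration of dividing one task among several processors, the sketch below sums a large list in parallel using Python's standard multiprocessing module; the data and the chunking scheme are arbitrary choices made for the example.

# Dividing one task (summing a large list) among several worker processes so the
# chunks are computed simultaneously, then combining the partial results.

from multiprocessing import Pool

def chunk_sum(chunk):
    """Work done independently by each worker process."""
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    workers = 4
    size = len(data) // workers
    chunks = [data[i * size:(i + 1) * size] for i in range(workers - 1)]
    chunks.append(data[(workers - 1) * size:])      # last chunk takes the remainder

    with Pool(processes=workers) as pool:
        partial_sums = pool.map(chunk_sum, chunks)  # chunks processed in parallel

    print(sum(partial_sums))   # same result as sum(data), computed in parallel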
• Increased Scalability: Parallel computing offers scalability by adding more processors
or resources as needed, making it suitable for handling larger workloads or increasing
data sizes. This flexibility allows parallel systems to adapt to changing computing
requirements, making them suitable for small and large-scale applications.
• Improved Efficiency: Parallel computing reduces processing time and improves overall
efficiency by distributing tasks across multiple processors. This can lead to cost savings
as more work is accomplished in less time, making it an attractive option for time-
sensitive or resource-intensive applications.
• Handling Big Data: Parallel computing is well-suited for processing big data as it
efficiently handles large and complex datasets. By dividing data into smaller chunks and
processing them in parallel, parallel computing accelerates data processing, analysis, and
decision-making tasks.
• Scientific and Technical Computing: Parallel computing is widely used in scientific and
technical computing, including simulations, modeling, and data-intensive computations.
It enables efficient processing of large amounts of data and complex computations,
leading to faster results and advancements in various fields.
• Limited Applicability: Parallel computing may not be suitable for all types of
applications. Some algorithms and tasks may not be easily parallelizable, and the
overhead of parallelism may outweigh the benefits. Additionally, not all software or tools
may be optimized for parallel computing, limiting its applicability in certain domains or
industries.
• Debugging and Testing Challenges: Debugging and testing parallel programs can be
challenging due to the concurrent and distributed nature of the computations. Identifying
and fixing issues in parallel code may require additional effort and expertise compared to
serial code, leading to increased development and maintenance costs.
• In a distributed system, resources such as printers can be shared with multiple nodes
instead of being limited to just one.
• Cost Savings: Distributed computing can potentially result in cost savings by utilizing
existing resources more efficiently and reducing the need for costly hardware upgrades
or infrastructure investments.
• Scalable Data Storage: Distributed computing allows for distributed storage of data
across multiple nodes, providing scalability and redundancy. This enables efficient
management of large and complex datasets.
• Flexibility and Modularity: Distributed computing allows for flexible and modular
system design, where components can be added or replaced easily without disrupting the
entire system. This enables easier system maintenance, upgrades, and adaptability to
changing business requirements.
• Resource Sharing: Distributed computing enables sharing of resources such as
computing power, storage, and bandwidth among multiple nodes, maximizing resource
utilization and reducing resource wastage.
• Data Consistency and Integrity: Ensuring consistency and integrity of data across
distributed nodes can be challenging, as updates or changes to data may need to be
propagated across multiple nodes. Maintaining data consistency and integrity can require
additional coordination and synchronization mechanisms.
• Security Risks: Distributed computing systems may be vulnerable to security risks, such
as unauthorized access, data breaches, and attacks on communication channels. Ensuring
security across distributed nodes requires robust authentication, authorization,
encryption, and other security mechanisms.
impact system performance and availability, and may require additional mechanisms for
network resilience and optimization.
2.13 MAP REDUCE
MapReduce is a data processing tool used to process data in parallel in a distributed form. It was introduced in 2004 in the paper titled "MapReduce: Simplified Data Processing on Large Clusters," published by Google.
MapReduce is a paradigm with two phases: the mapper phase and the reducer phase. In the Mapper, the input is given in the form of key-value pairs. The output of the Mapper is fed to the Reducer as input. The Reducer runs only after the Mapper has finished. The Reducer also takes input in key-value format, and the output of the Reducer is the final output.
Hadoop divides the job into tasks. There are two types of tasks:
1. Map tasks (Splits & Mapping)
2. Reduce tasks (Shuffling, Reducing)
The complete execution process (execution of both Map and Reduce tasks) is controlled by two types of entities:
1. Jobtracker: Acts like a master (responsible for the complete execution of the submitted job).
2. Multiple Task Trackers: Act like slaves, each of them performing part of the job.
For every job submitted for execution in the system, there is one Jobtracker that resides on the Namenode and there are multiple Tasktrackers which reside on the Datanodes.
• A job is divided into multiple tasks which are then run on multiple data nodes in a cluster.
• It is the responsibility of the job tracker to coordinate the activity by scheduling tasks to run on different data nodes.
• Execution of each individual task is then looked after by the task tracker, which resides on every data node executing its part of the job.
• The task tracker's responsibility is to send progress reports to the job tracker.
• In addition, the task tracker periodically sends a 'heartbeat' signal to the Jobtracker so as to notify it of the current state of the system.
• Thus job tracker keeps track of the overall progress of each job. In the event of task
failure, the job tracker can reschedule it on a different task tracker.
1. Map: As the name suggests, its main use is to map the input data into key-value pairs. The input to the map may itself be a key-value pair, where the key can be an id or address of some kind and the value is the actual data it holds. The Map() function is executed in its memory repository on each of these input key-value pairs and generates the intermediate key-value pairs, which work as input for the Reducer or Reduce() function.
2. Reduce: The intermediate key-value pairs that work as input for the Reducer are shuffled, sorted, and sent to the Reduce() function. The Reducer aggregates or groups the data based on the key-value pairs as per the reducer algorithm written by the developer.
How Job tracker and the task tracker deal with MapReduce:
1. Job Tracker: The work of Job tracker is to manage all the resources and all the jobs across
the cluster and also to schedule each map on the Task Tracker running on the same data node
since there can be hundreds of data nodes available in the cluster.
2. Task Tracker: The Task Tracker can be considered as the actual slaves that are working on
the instruction given by the Job Tracker. This Task Tracker is deployed on each of the nodes
available in the cluster that executes the Map and Reduce task as instructed by Job Tracker.
There is also one important component of the MapReduce architecture known as the Job History Server. The Job History Server is a daemon process that saves and stores historical information about tasks or applications; for example, the logs generated during or after job execution are stored on the Job History Server.
MapReduce Architecture in Big Data explained with Example
The whole process goes through four phases of execution namely, splitting, mapping, shuffling,
and reducing.
Consider that you have the following input data for your MapReduce in Big Data program:
Welcome to Hadoop Class
Hadoop is good
Hadoop is bad
The data goes through the following phases of MapReduce in Big Data
Input Splits: An input to a MapReduce in Big Data job is divided into fixed-size pieces called input splits. An input split is a chunk of the input that is consumed by a single map.
Mapping: This is the very first phase in the execution of a map-reduce program. In this phase, data in each split is passed to a mapping function to produce output values. In our example, the job of the mapping phase is to count the number of occurrences of each word from the input splits and prepare a list in the form of <word, frequency>.
Shuffling: This phase consumes the output of the Mapping phase. Its task is to consolidate the relevant records from the Mapping phase output. In our example, the same words are clubbed together along with their respective frequencies.
Reducing: In this phase, output values from the Shuffling phase are aggregated. This phase
combines values from Shuffling phase and returns a single output value. In short, this phase
summarizes the complete dataset.
In our example, this phase aggregates the values from Shuffling phase i.e., calculates total
occurrences of each word.
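The four phases above can be simulated locally in a few lines of plain Python. This is only an illustration of the word-count logic on the example input, not Hadoop code.

# Local simulation of the splitting, mapping, shuffling, and reducing phases for
# the word-count example above. Real MapReduce runs these phases across a cluster.

from collections import defaultdict

lines = ["Welcome to Hadoop Class",   # input splits: one line per split
         "Hadoop is good",
         "Hadoop is bad"]

# Mapping: each split produces <word, 1> pairs.
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffling: pairs with the same key (word) are clubbed together.
shuffled = defaultdict(list)
for word, count in mapped:
    shuffled[word].append(count)

# Reducing: the values for each key are aggregated into a single output value.
reduced = {word: sum(counts) for word, counts in shuffled.items()}

print(reduced)   # {'Welcome': 1, 'to': 1, 'Hadoop': 3, 'Class': 1, 'is': 2, 'good': 1, 'bad': 1}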
2.14 HADOOP
Hadoop is an open-source framework from Apache that is used to store, process, and analyze data of very large volume. Hadoop is written in Java and is not OLAP (online analytical processing). It is used for batch/offline processing. It is used by Facebook, Yahoo, Google, Twitter, LinkedIn, and many more. Moreover, it can be scaled up just by adding nodes to the cluster.
Modules of Hadoop
1. HDFS: Hadoop Distributed File System. Google published its GFS paper, and HDFS was developed on the basis of it. It states that the files will be broken into blocks and stored on nodes over the distributed architecture.
2. YARN: Yet Another Resource Negotiator is used for job scheduling and for managing the cluster.
3. Map Reduce: This is a framework that helps Java programs do parallel computation on data using key-value pairs. The Map task takes input data and converts it into a data set which can be computed as key-value pairs. The output of the Map task is consumed by the Reduce task, and then the output of the Reducer gives the desired result.
4. Hadoop Common: These Java libraries are used to start Hadoop and are used by the other Hadoop modules.
Hadoop Architecture
The Hadoop architecture is a package of the file system, MapReduce engine and the HDFS
(Hadoop Distributed File System). The MapReduce engine can be MapReduce/MR1 or
YARN/MR2.
A Hadoop cluster consists of a single master and multiple slave nodes. The master node includes
Job Tracker, Task Tracker, NameNode, and DataNode whereas the slave node includes DataNode
and TaskTracker.
Hadoop Distributed File System
HDFS (Hadoop Distributed File System) is utilized as the storage layer. It is mainly designed for working on commodity hardware devices (inexpensive devices) and follows a distributed file system design. HDFS is designed in such a way that it prefers storing data in large blocks rather than storing many small data blocks.
HDFS provides fault tolerance and high availability to the storage layer and the other devices present in the Hadoop cluster. The data storage nodes in HDFS are:
• NameNode (Master)
• DataNode (Slave)
NameNode: NameNode works as a Master in a Hadoop cluster that guides the Datanode (Slaves).
Namenode is mainly used for storing the Metadata i.e. the data about the data. Meta Data can be
the transaction logs that keep track of the user’s activity in a Hadoop cluster.
Metadata can also be the name of the file, its size, and information about the location (block number, block IDs) of the DataNodes, which the NameNode stores to find the closest DataNode for faster communication. The NameNode instructs the DataNodes with operations like delete, create, replicate, etc.
DataNode: DataNodes work as slaves. DataNodes are mainly utilized for storing data in a Hadoop cluster; the number of DataNodes can range from 1 to 500 or even more. The greater the number of DataNodes, the more data the Hadoop cluster will be able to store. So it is advised that DataNodes should have high storage capacity to store a large number of file blocks.
File Block in HDFS: Data in HDFS is always stored in terms of blocks. A single file is divided into multiple blocks of 128 MB each by default, and this size can also be changed manually.
Let's understand this concept of breaking a file into blocks with an example. Suppose you upload a file of 400 MB to HDFS; this file gets divided into blocks of 128 MB + 128 MB + 128 MB + 16 MB = 400 MB. This means 4 blocks are created, each of 128 MB except the last one. Hadoop doesn't know or care what data is stored in these blocks, so it simply treats the final, smaller block as a partial block without any idea of its contents.
In the Linux file system, the size of a file block is about 4 KB, which is very much less than the default size of file blocks in the Hadoop file system. As we all know, Hadoop is mainly configured for storing large data sets measured in petabytes; this is what makes the Hadoop file system different from other file systems, as it can be scaled. Nowadays, file blocks of 128 MB to 256 MB are used in Hadoop.
Replication in HDFS: Replication ensures the availability of the data. Replication is making a copy of something, and the number of copies made of that particular thing is expressed as its Replication Factor. As we have seen, HDFS stores data in the form of blocks, and Hadoop is also configured to make copies of those file blocks.
By default, the Replication Factor for Hadoop is set to 3, which can be configured, meaning you can change it manually as per your requirements. In the above example, we made 4 file blocks, which means 3 replicas or copies of each file block are made, so a total of 4 × 3 = 12 blocks are stored for backup purposes.
This is because, for running Hadoop, we use commodity hardware (inexpensive system hardware) which can crash at any time; we are not using a supercomputer for our Hadoop setup. That is why we need a feature in HDFS that can make copies of the file blocks for backup purposes; this is known as fault tolerance.
One thing to note is that making so many replicas of our file blocks consumes a lot of storage, but for large organizations the data is far more important than the storage, so the extra storage is acceptable. You can configure the replication factor in your hdfs-site.xml file.
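The block and replica arithmetic above can be checked with a short calculation; the block size and replication factor used here are the defaults mentioned in the text (128 MB and 3, both configurable, e.g. via dfs.blocksize and dfs.replication in hdfs-site.xml).

# Number of HDFS blocks and stored replicas for a file of a given size, using the
# default 128 MB block size and replication factor 3 described above.

import math

def hdfs_blocks(file_size_mb, block_size_mb=128):
    """A file is split into fixed-size blocks; the last block may be partial."""
    return math.ceil(file_size_mb / block_size_mb)

def total_stored_blocks(file_size_mb, replication_factor=3):
    """Each block is replicated, so stored blocks = blocks x replication factor."""
    return hdfs_blocks(file_size_mb) * replication_factor

print(hdfs_blocks(400))           # 4 blocks: 128 + 128 + 128 + 16 MB
print(total_stored_blocks(400))   # 4 x 3 = 12 blocks stored across the cluster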
Rack Awareness: A rack is nothing but a physical collection of nodes in our Hadoop cluster (maybe 30 to 40). A large Hadoop cluster consists of many racks. With the help of this rack information, the NameNode chooses the closest DataNode to achieve maximum performance while performing read/write operations, which reduces network traffic.
HDFS Architecture
YARN (Yet Another Resource Negotiator) handles job scheduling and resource management. The Job Scheduler divides a big task into small jobs so that each job can be assigned to various slaves in a Hadoop cluster and processing can be maximized. The Job Scheduler also keeps track of which job is important, which job has more priority, dependencies between the jobs, and all the other information like job timing, etc. The Resource Manager is used to manage all the resources that are made available for running a Hadoop cluster.
Features of YARN
• Multi-Tenancy
• Scalability
• Cluster-Utilization
• Compatibility
MapReduce
MapReduce nothing but just like an Algorithm or a data structure that is based on the YARN
framework. The major feature of MapReduce is to perform the distributed processing in parallel
in a Hadoop cluster which Makes Hadoop working so fast. When you are dealing with Big Data,
serial processing is no more of any use. MapReduce has mainly 2 tasks which are divided phase-
wise:
In first phase, Map is utilized and in next phase Reduce is utilized.
Here, we can see that the input is provided to the Map() function; its output is then used as input to the Reduce() function, and after that we receive our final output. Let's understand what Map() and Reduce() do.
As we can see, an input is provided to Map(); since we are using Big Data, the input is a set of data. The Map() function breaks these data blocks into tuples, which are nothing but key-value pairs. These key-value pairs are then sent as input to Reduce(). The Reduce() function combines these tuples or key-value pairs based on their key, forms a set of tuples, and performs operations like sorting or summation, and the result is then sent to the final output node. Finally, the output is obtained.
The data processing is always done in the Reducer, depending upon the business requirements of that industry. This is how first Map() and then Reduce() are utilized, one after the other.
Map Task:
• RecordReader: The purpose of the RecordReader is to read and break up the records. It is responsible for providing key-value pairs to the Map() function. The key is the record's locational information and the value is the data associated with it.
• Map: A map is nothing but a user-defined function whose work is to process the tuples obtained from the RecordReader. The Map() function may generate no key-value pair at all or may generate multiple pairs of these tuples.
• Combiner: The Combiner is used for grouping the data in the Map workflow. It is similar to a local reducer. The intermediate key-value pairs that are generated in the Map are combined with the help of this Combiner. Using a Combiner is optional.
• Partitioner: The Partitioner is responsible for fetching the key-value pairs generated in the Mapper phase. The Partitioner generates the shards corresponding to each reducer. The hashcode of each key is fetched, and the Partitioner then computes the hashcode modulo the number of reducers (key.hashCode() % numberOfReducers), as shown in the sketch after this list.
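The sketch below shows the partitioner's hash-and-modulus rule in Python, purely for illustration; Hadoop itself implements this in Java using key.hashCode().

# Minimal illustration of how a partitioner assigns an intermediate key to a
# reducer: hash the key, then take the result modulo the number of reducers.
# Python's built-in hash() is used here only as a stand-in for key.hashCode().

def partition(key, num_reducers):
    """Return the index of the reducer that will receive this key."""
    return hash(key) % num_reducers

num_reducers = 3
for word in ["Hadoop", "is", "good", "bad", "Welcome"]:
    print(word, "-> reducer", partition(word, num_reducers))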
Reduce Task
• Shuffle and Sort: The task of the Reducer starts with this step. The process in which the Mapper generates the intermediate key-value pairs and transfers them to the Reducer task is known as shuffling. Using the shuffling process, the system can sort the data by its key value.
Shuffling begins once some of the Map tasks are done, which is why it is a faster process: it does not wait for the completion of all the tasks performed by the Mapper.
• Reduce: The main task of the Reduce step is to gather the tuples generated from the Map and then perform sorting and aggregation on those key-value pairs depending on their key element.
• OutputFormat: Once all the operations are performed, the key-value pairs are written into a file with the help of a record writer, with each record on a new line and the key and value separated by a space.
Hadoop Common or Common Utilities
Hadoop Common, or the Common Utilities, is nothing but the Java library and Java files needed by
all the other components present in a Hadoop cluster. These utilities are used by HDFS, YARN, and
MapReduce for running the cluster. Hadoop Common assumes that hardware failure in a Hadoop
cluster is common, so failures need to be handled automatically, in software, by the Hadoop
framework.
Advantages of Hadoop
o Fast: In HDFS the data is distributed over the cluster and mapped, which helps in faster
retrieval. Even the tools that process the data are often on the same servers, thus reducing
the processing time. Hadoop is able to process terabytes of data in minutes and petabytes
in hours.
o Scalable: A Hadoop cluster can be extended by just adding nodes to the cluster.
o Cost Effective: Hadoop is open source and uses commodity hardware to store data, so it
is really cost effective compared to a traditional relational database management system.
o Resilient to failure: HDFS has the property of replicating data over the network, so if one
node is down or some other network failure happens, Hadoop takes the other copy of the
data and uses it. Normally, data is replicated three times, but the replication factor is
configurable.
• Use Cases in Cloud Computing: JavaScript, especially with Node.js, is commonly
employed for serverless functions, API development, and building scalable, real-
time applications. It is integral to the development of serverless applications and
can be utilized with cloud services like AWS Lambda, Azure Functions, and Google
Cloud Functions.
3. Java:
• Features: Java is an object-oriented, platform-independent language with a
strong emphasis on reliability and portability. It is known for its performance,
scalability, and extensive ecosystem. Java's "write once, run anywhere"
philosophy makes it suitable for large-scale enterprise applications.
• Use Cases in Cloud Computing: Java is widely used for building enterprise-level
cloud applications, especially in microservices architecture. It is suitable for
developing backend services, distributed systems, and applications that require
high scalability. Many cloud services provide strong support for Java, including
AWS, Azure, and Google Cloud.
4. C#:
• Features: C# is a modern, object-oriented language developed by Microsoft. It is
known for its integration with the .NET framework, providing a comprehensive
set of libraries and tools for application development. C# emphasizes
productivity, type safety, and ease of use.
• Use Cases in Cloud Computing: C# is commonly used in the Microsoft
ecosystem, making it a natural choice for cloud applications on Azure. It is utilized
for building web applications using ASP.NET, microservices, and backend
services. C# integrates seamlessly with Azure services, facilitating the
development of robust cloud applications.
5. Go (Golang):
• Features: Go, or Golang, is designed for simplicity, efficiency, and concurrency. It
has a concise syntax, fast compilation, and built-in support for concurrent
programming. Go is statically typed, which helps catch errors at compile time.
• Use Cases in Cloud Computing: Go is gaining popularity for cloud-native
development due to its efficiency and simplicity. It is used in microservices
architecture, container-based applications (e.g., Kubernetes), and serverless
computing. Go's performance makes it suitable for building scalable and
performant cloud applications.
6. Ruby:
• Features: Ruby is known for its developer-friendly syntax, expressiveness, and
elegant design. It follows the principle of "convention over configuration,"
emphasizing simplicity and ease of use.
• Use Cases in Cloud Computing: Ruby is used in cloud computing for web
application development, often with the Ruby on Rails framework. Ruby on Rails
simplifies the development of cloud-based applications, providing features such
as an ORM (Object-Relational Mapping) system, routing, and templating.
7. PHP:
• Features: PHP is a server-side scripting language designed for web development.
It is known for its simplicity, ease of integration with web servers, and dynamic
typing.
• Use Cases in Cloud Computing: PHP is commonly used for building server-side
components of web applications in the cloud. Frameworks like Laravel enhance
PHP's capabilities, offering features such as an expressive syntax, an ORM
(Eloquent), and a powerful routing system.
8. Swift (for iOS apps):
• Features: Swift is a modern, statically typed language developed by Apple for iOS,
macOS, watchOS, and tvOS app development. It is designed for safety,
performance, and expressiveness.
• Use Cases in Cloud Computing: Swift is primarily used for iOS app development,
but it can also be employed in the cloud for building backend services and APIs
that interact with cloud services. Swift is known for its performance and safety
features.
9. TypeScript:
• Features: TypeScript is a superset of JavaScript that introduces static typing,
making it easier to catch errors during development. It supports the latest
ECMAScript features and is designed for large-scale application development.
• Use Cases in Cloud Computing: TypeScript is often used in cloud development,
especially with modern frontend frameworks like Angular. It brings enhanced
code quality, better tooling support, and improved maintainability to cloud
applications. TypeScript can be employed for both frontend and backend
development.
What is Service?
A service is a well-defined, self-contained function that represents a unit of functionality. A
service can exchange information with another service, and it is not dependent on the state of
another service. It uses a loosely coupled, message-based communication model to communicate
with applications and other services.
Service Connections
Service consumer sends a service request to the service provider, and the service provider sends
the service response to the service consumer. The service connection is understandable to both
the service consumer and service provider.
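The request/response contract between a service consumer and a service provider can be sketched with plain HTTP, one common transport used in service-oriented systems. The Python sketch below is only an illustration using the standard library; the port, path, and JSON payload are assumptions, not part of any particular SOA product.

import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Service provider: exposes one well-defined, self-contained function over HTTP.
class PriceService(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"item": "book", "price": 250}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet

def run_provider():
    HTTPServer(("localhost", 8080), PriceService).serve_forever()

# Service consumer: sends a service request and reads the service response.
threading.Thread(target=run_provider, daemon=True).start()
time.sleep(0.5)  # give the provider a moment to start listening
with urllib.request.urlopen("http://localhost:8080/price") as resp:
    print("Service response:", json.loads(resp.read()))

Note how the consumer depends only on the request/response contract (a GET returning JSON), not on how the provider is implemented.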
In this architecture, services are provided to form applications, through a network call over the
internet. It uses common communication standards to speed up and streamline the service
integrations in applications. Each service in SOA is a complete business function in itself. The
services are published in such a way that it makes it easy for the developers to assemble their
apps using those services. Note that SOA is different from microservice architecture.
• SOA allows users to combine a large number of facilities from existing services to form
applications.
• SOA encompasses a set of design principles that structure system development and
provide means for integrating components into a coherent and decentralized system.
• SOA-based computing packages functionalities into a set of interoperable services, which
can be integrated into different software systems belonging to separate business
domains.
Services might aggregate information and data retrieved from other services or create workflows
of services to satisfy the request of a given service consumer. This practice is known as service
orchestration. Another important interaction pattern is service choreography, which is the
coordinated interaction of services without a single point of control.
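As a rough illustration of service orchestration, the sketch below shows a single orchestrator composing the results of two independent services to satisfy one consumer request. The service functions and the data they return are hypothetical; in a real SOA they would be remote services reached over the network.

# Two independent services (local stand-ins for remote SOA services).
def customer_service(customer_id):
    return {"id": customer_id, "name": "Asha"}

def order_service(customer_id):
    return [{"order": 101, "total": 999}, {"order": 102, "total": 450}]

# The orchestrator is the single point of control: it calls the services
# in sequence and aggregates their answers into one response.
def account_summary(customer_id):
    customer = customer_service(customer_id)
    orders = order_service(customer_id)
    return {"customer": customer,
            "order_count": len(orders),
            "total_spent": sum(o["total"] for o in orders)}

print(account_summary(42))

In service choreography, by contrast, there would be no single account_summary() controller; each service would react to messages from the others according to an agreed protocol.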
Service-Oriented Terminologies
o Services - The services are the logical entities defined by one or more published
interfaces.
o Service provider - It is a software entity that implements a service specification.
o Service consumer - It can be called a requestor or client that calls a service provider. A
service consumer can be another service or an end-user application.
o Service locator - It is a service provider that acts as a registry. It is responsible for
examining service provider interfaces and service locations.
o Service broker - It is a service provider that passes service requests to one or more
additional service providers.
Components of SOA:
The service-oriented architecture stack can be categorized into two parts - functional aspects and
quality of service aspects.
Functional aspects
The functional aspect contains:
• Transport - It transports the service requests from the service consumer to the service
provider and service responses from the service provider to the service consumer.
• Service Communication Protocol - It allows the service provider and the service
consumer to communicate with each other.
• Service Description - It describes the service and data required to invoke it.
• Service - It is an actual service.
• Business Process - It represents the group of services called in a particular sequence
associated with the particular rules to meet the business requirements.
• Service Registry - It contains the description of data which is used by service providers
to publish their services.
5. Autonomy: Services have control over the logic they encapsulate and, from a service
consumer point of view, there is no need to know about their implementation.
7. Composability: Using services as building blocks, sophisticated and complex operations
can be implemented. Service orchestration and choreography provide a solid support for
composing services and achieving business goals.
Advantages of SOA:
• Service reusability: In SOA, applications are made from existing services. Thus, services
can be reused to make many applications.
• Easy maintenance: As services are independent of each other they can be updated and
modified easily without affecting other services.
• Platform independent: SOA allows making a complex application by combining services
picked from different sources, independent of the platform.
• Availability: SOA facilities are easily available to anyone on request.
• Reliability: SOA applications are more reliable because it is easier to debug small
services than huge code bases.
• Scalability: Services can run on different servers within an environment; this increases
scalability.
Disadvantages of SOA:
• High overhead: Validation of the input parameters of services is done whenever services
interact; this decreases performance because it increases load and response time.
• High investment: A huge initial investment is required for SOA.
• Complex service management: When services interact, they exchange messages to
perform tasks; the number of messages may run into the millions, and handling such a
large number of messages becomes a cumbersome task.
Practical applications of SOA: SOA is used in many ways around us whether it is mentioned or
not.
1. SOA infrastructure is used by many armies and air forces to deploy situational awareness
systems.
2. SOA is used to improve healthcare delivery.
3. Nowadays many mobile apps and games use inbuilt functions of the device to run. For
example, an app might need GPS, so it uses the inbuilt GPS function of the device. This is
SOA in mobile solutions.
4. SOA helps museums maintain a virtualized storage pool for their information and content.
UNIT 3 VIRTUALIZATION
Structure:
3.1 Introduction
3.2 Architecture of Virtualization
3.3 Hypervisor
3.4 Characteristics of Virtualized Environment
3.5 Types of Virtualizations
3.6 Taxonomy of Virtualization Techniques
3.7 Virtualization and Cloud Computing
3.8 Virtualization of CPU, Memory, I/O Devices, Server, Desktop, Network and Data-
Center
3.9 Pros and Cons of Virtualization
3.10 Technology Examples - VMware, Microsoft Hyper-V, KVM, Xen
3.1 INTRODUCTION
Virtualization is the "creation of a virtual (rather than actual) version of something, such as a
server, a desktop, a storage device, an operating system or network resources".
In other words, virtualization is a technique which allows sharing a single physical instance of a
resource or an application among multiple customers and organizations. It does so by assigning a
logical name to a physical resource and providing a pointer to that physical resource when
demanded. With the help of virtualization, multiple operating systems and applications can run
on the same machine and the same hardware at the same time, increasing the utilization and
flexibility of the hardware.
The machine on which the virtual machine is created is known as the Host Machine, and that
virtual machine is referred to as the Guest Machine. The virtual machine is managed by software
or firmware known as the hypervisor.
In a virtualized environment there are three major components: guest, host, and virtualization
layer. The guest represents the system component that interacts with the virtualization layer
rather than with the host, as would normally happen. The host represents the original
environment where the guest is supposed to be managed. The virtualization layer is responsible
for recreating the same or a different environment where the guest will operate.
3.2 ARCHITECTURE OF VIRTUALIZATION
The architecture of virtualization is a complex system of hardware and software components that
work together to create virtual instances of computing resources. The virtualization architecture
includes several layers of abstraction, each responsible for a specific aspect of the virtualization
process.
1. Physical Hardware Layer: The physical hardware layer consists of the physical servers,
storage devices, and networking equipment that form the underlying infrastructure of the
virtualization environment. This layer includes the CPUs, memory, disk drives, and
network adapters that are used to provide the computing resources for virtual machines.
2. Hypervisor Layer: The hypervisor layer, also known as the virtual machine monitor
(VMM), is responsible for managing the virtualization layer and allocating resources to
each virtual machine (VM). The hypervisor runs directly on the physical hardware and
creates virtual machines by partitioning the physical resources into logical units.
4. Guest Operating System Layer: The guest operating system layer consists of the
operating systems and applications that run inside the virtual machines. Each virtual
machine has its own copy of an operating system, which runs on top of the hypervisor
layer.
5. Application Layer: The application layer includes the applications and workloads that
run inside the guest operating system. Each virtual machine can run multiple applications,
depending on the resources allocated to it.
The virtualization architecture provides a layer of abstraction between the physical hardware
and the software layer, enabling multiple virtual instances of an operating system or application
to run on a single physical machine. The hypervisor layer is the key component of the
virtualization architecture, responsible for managing the virtualization layer and allocating
resources to each virtual machine.
The virtualization management layer provides the tools necessary to create and manage virtual
machines and other virtual resources, such as virtual networks and virtual storage devices. The
guest operating system layer and the application layer run inside the virtual machines, providing
the environment in which applications and workloads can be executed.
3.3 HYPERVISOR
A hypervisor is a form of virtualization software used in Cloud hosting to divide and allocate the
resources on various pieces of hardware. The program which provides partitioning, isolation, or
abstraction is called a virtualization hypervisor. The hypervisor is a hardware virtualization
technique that allows multiple guest operating systems (OS) to run on a single host system at the
same time. A hypervisor is sometimes also called a virtual machine manager (VMM).
Types of Hypervisors –
TYPE-1 Hypervisor:
The hypervisor runs directly on the underlying host system. It is also known as a “Native
Hypervisor” or “Bare metal hypervisor”. It does not require any base server operating system. It
has direct access to hardware resources. Examples of Type 1 hypervisors include VMware ESXi,
Citrix XenServer, and Microsoft Hyper-V hypervisor.
Type 1 Hypervisor
Pros & Cons of Type-1 Hypervisor:
Pros: Such hypervisors are very efficient because they have direct access to the physical hardware
resources (CPU, memory, network, and physical storage). This also strengthens security, because
there is no third-party layer in between that an attacker could compromise.
Cons: One problem with Type-1 hypervisors is that they usually need a dedicated, separate
machine to perform their operation, to run the different VMs, and to control the host hardware
resources.
TYPE-2 Hypervisor:
A host operating system runs on the underlying host system. This type is also known as a "Hosted
Hypervisor". Such hypervisors do not run directly on the underlying hardware; rather, they run as
an application on a host operating system (physical machine). Basically, the software is installed
on an operating system, and the hypervisor asks the operating system to make hardware calls.
Examples of Type 2 hypervisors include VMware Player and Parallels Desktop. Hosted hypervisors
are often found on endpoints like PCs. The Type-2 hypervisor is very useful for engineers and
security analysts (for checking malware, malicious source code, and newly developed
applications).
Type 2 Hypervisor
Pros & Cons of Type-2 Hypervisor:
Pros: Such hypervisors allow quick and easy access to a guest operating system alongside the
running host machine. These hypervisors usually come with additional useful features for guest
machines; such tools enhance the coordination between the host machine and the guest machine.
Cons: There is no direct access to the physical hardware resources, so these hypervisors lag
behind Type-1 hypervisors in performance. There are also potential security risks: if an attacker
compromises the host operating system through a security weakness, the attacker can also access
the guest operating systems.
3.4 CHARACTERISTICS OF VIRTUALIZED ENVIRONMENT
The virtual environment is often referred to as a guest machine or virtual machine. A virtual
machine (VM) is a virtual environment that functions as a virtual computer system with its own
CPU, memory, network interface, and storage, created on a physical hardware system (located
off- or on-premises).
1. Increased security: The ability to control the execution of a guest in a completely
transparent manner opens new possibilities for delivering a secure, controlled execution
environment.
2. Managed execution: Virtualization of the execution environment not only allows
increased security, but a wider range of features also can be implemented. In particular,
sharing, aggregation, emulation, and isolation are the most relevant features.
2.2 Aggregation- While sharing involves multiple computing environments sharing one
host, the opposite is also possible in virtualization. A group of separate hosts can be
consolidated together to appear as one single host to the user. This is called
aggregation of resources, and each consolidated group is called a ‘cluster’.
2.3 Emulation – Emulation means using a program or a device to imitate the working of
another program or device. A program/device completely different to the host can be
emulated on the host device through virtualization.
2.4 Isolation - Virtualization enables the virtual machine (VM) or the guest application
to be completely separate from the host machine. An application called the Virtual
Machine Manager, also called the hypervisor, acts as the middleman between the guest
and host machines. This prevents harmful actions from either the guest or the host from
affecting the other.
3. Portability: The virtualization resources are portable, meaning they can be copied and
moved from one system to another, and the same functionality can be expected. This
allows the users to create and reuse the configuration instead of repeating it.
Network Virtualization
This is among the types of virtualizations in cloud computing that enables us to combine the
various available resources by judiciously separating the available bandwidth into different
channels. This lets us run multiple virtual networks at the same time, with each channel having
different systems communicating.
Here, to the client, the different physical server networks that lie on the internet aren’t visible.
This enables the client to view the entire network as one big system.
Desktop Virtualization
This is among the types of virtualizations in cloud computing that allows the user’s operating
system to be accessible from a remote server in a far-off data center. It allows the user to access
the OS from any system virtually without having to store any data on one client system. A huge
benefit of this is that the user can run many different operating systems at the same time without
having to reboot or change system settings.
Storage Virtualization
This enables the data given as input and generated as output to be stored on a vast array of
servers at a remote data center/s. All the memory storage systems are controlled by the virtual
storage systems. The data from multiple sources are stored as a common repository.
Data can be stored on multiple physical storage servers but appears to the user to be stored on
one single system.
Server Virtualization
This enables one central server to be restructured into multiple smaller virtual servers that mimic
the functioning of the central server. Each virtual server has its own OS and works in an isolated
manner.
With server virtualization, it is now possible to process multiple requests at the same time
without overloading the main server with requests. This also increases the availability of the
server resource as one server failing won’t have an effect on the functioning of the others.
The data coming from the user can be processed by multiple different servers, but to the user, it
appears as if one single server is processing all its requests.
Data Virtualization
This enables the various devices to receive and process data without having to know when and
how the data was collected. The data is formatted logically so that a virtual view of the data can
be accessed by the various stakeholders without them having to see the background processes
that took place to produce that data.
The data virtualization allows the data processors and data consumers to be separate. This lets
the consumers receive data from the sources without having to know when, where or how the
data was processed. This significantly increases security as the client doesn’t know where the
data is coming from.
The first classification discriminates among the services or entities that are being emulated.
Virtualization is mainly used to emulate execution environments, storage, and networks.
Among these categories, execution virtualization constitutes the oldest, most popular, and most
developed area. Therefore, it deserves major investigation and a further categorization.
We can divide these execution virtualization techniques into two major categories by considering
the type of host they require.
• Process-level: These techniques are implemented on top of an existing operating system,
which has full control of the hardware.
• System-level: These techniques are implemented directly on hardware and do not
require - or require a minimum of support from - an existing operating system
Within these two categories we can list various techniques that offer the guest a different type of
virtual computation environment:
• bare hardware
• operating system resources
• low-level programming language
• application libraries
Execution virtualization: Execution virtualization includes all techniques that aim to emulate
an execution environment that is separate from the one hosting the virtualization layer.
All these techniques concentrate their interest on providing support for the execution of
programs, whether these are the operating system, a binary specification of a program compiled
against an abstract machine model, or an application. Therefore, execution virtualization can be
implemented directly on top of the hardware by the operating system, an application, or libraries
dynamically or statically linked to an application image
Type II: These hypervisors require the support of an operating system to provide virtualization
services. This means that they are programs managed by the operating system, which interacts
with the hardware on behalf of the guest operating systems. This type of hypervisor is also called
a hosted virtual machine, since it is hosted within an operating system.
Hardware Virtualization Techniques:
Full virtualization: Full virtualization refers to the ability to run a program, most likely an
operating system, directly on top of a virtual machine and without any modification, as though it
were run on the raw hardware. To make this possible, the virtual machine manager is required to
provide a complete emulation of the entire underlying hardware.
Application-level virtualization:
The application-level virtualization is used when there is a desire to virtualize only one
application. Application virtualization software allows users to access and use an application
from a separate computer than the one on which the application is installed.
Storage virtualization: It is a system administration practice that allows decoupling the physical
organization of the hardware from its logical representation. Using this technique, users do not
have to worry about the specific location of their data, which can be identified using a logical
path. Storage virtualization allows us to harness a wide range of storage facilities and represent
them under a single logical filesystem.
Virtualization is the process of creating a virtual version of a resource, such as a server, operating
system, or network, to maximize resource utilization and improve efficiency. There are several
virtualization techniques, each with its own characteristics and use cases. Here is a taxonomy of
virtualization techniques along with a brief explanation:
1. Full Virtualization: Full virtualization is virtualization in which the guest operating
system is unaware that it is running in a virtualized environment; the hardware is
virtualized by the host operating system so that the guest can issue commands to what it
thinks is actual hardware, although these are really just simulated hardware devices
created by the host. Full virtualization is done with a hardware emulation tool and
processor-based virtualization support that allow you to run unmodified guest kernels
that are not aware they are being virtualized. The trade-off is that you give up some
performance on these platforms. Windows, NetWare, and most closed-source OSs require
full virtualization. Many of these guests do have PV drivers available, though, which allow
devices like disks, network cards, etc., to run with improved performance. Full
virtualization is called "full" because the entire system's resources are abstracted by the
virtualization software layer. Full virtualization has proven highly successful for:
• sharing a computer system among multiple users
• isolating users from each other (and from the control program)
• Emulating new hardware to achieve improved reliability, security and
productivity.
3. Hardware-assisted Virtualization: Hardware-assisted virtualization leverages special
features provided by modern CPUs, such as Intel VT-x and AMD-V, to improve
virtualization performance. These features allow the hypervisor to run in a privileged
mode, reducing the need for software emulation and improving overall efficiency.
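On a Linux host you can check whether the CPU advertises these extensions by looking for the vmx (Intel VT-x) or svm (AMD-V) flag in /proc/cpuinfo. The short Python sketch below performs that check; it assumes a Linux system, since /proc/cpuinfo does not exist elsewhere, and it is only a convenience wrapper around that file.

def hw_virtualization_support(cpuinfo_path="/proc/cpuinfo"):
    # Report which hardware virtualization extension the CPU advertises, if any.
    try:
        with open(cpuinfo_path) as f:
            flags = f.read()
    except FileNotFoundError:
        return "unknown (not a Linux host)"
    if " vmx" in flags:
        return "Intel VT-x"
    if " svm" in flags:
        return "AMD-V"
    return "no hardware-assisted virtualization flags found"

print(hw_virtualization_support())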
Virtualization:
Virtualization is the foundation of cloud computing. It is this technology that makes it possible to
generate resources continuously from a single physical device or framework. Here the role of the
hypervisor is essential: it is directly associated with the hardware in order to create several
virtual machines from it. These virtual machines work distinctly and independently and do not
interfere with one another. In the case of disaster recovery, virtualization relies on a single
peripheral device, since a single piece of dedicated hardware does a good job of it.
Virtualization exists in different classes which are:
9. Cloud computing provides unlimited storage space, while in virtualization storage space
depends on the physical server capacity.
10. Cloud computing is of two types: public cloud and private cloud. Virtualization is of two
types: hardware virtualization and application virtualization.
11. In cloud computing, configuration is image based; in virtualization, configuration is
template based.
12. In cloud computing, we utilize the entire server capacity and the servers are consolidated;
in virtualization, servers are provided on demand.
13. In cloud computing, the pricing follows a pay-as-you-go model and consumption is the
metric on which billing is done; in virtualization, the pricing is totally dependent on
infrastructure costs.
TYPES OF VIRTUALIZATIONS
There are many variants or types available under virtualization technology as listed below:
Server Virtualization:
Server virtualization is a technology that allows multiple virtual instances (virtual machines or
VMs) to run on a single physical server. Each VM operates as an independent server with its own
operating system, applications, and resources. This enables more efficient use of server
resources, improved flexibility, and easier management.
Working of Server Virtualization:
1. Hypervisor Installation:
• A hypervisor, also known as a Virtual Machine Monitor (VMM), is installed
directly on the physical server hardware.
• The hypervisor manages and allocates resources to multiple VMs while isolating
them from each other.
2. Creation of Virtual Machines:
• Once the hypervisor is installed, it allows the creation of multiple virtual machines
on the physical server.
• Each VM is an isolated environment with its own virtual CPU, memory, storage,
and network interfaces.
3. Resource Allocation:
• The hypervisor allocates physical resources, such as CPU cycles, memory, and
storage, to each virtual machine.
• Resource allocation is typically dynamic, allowing VMs to scale resources up or
down based on demand.
4. Guest Operating Systems:
• Each virtual machine runs its own guest operating system, which can be different
from the host operating system installed on the physical server.
• The hypervisor manages interactions between the guest operating systems and
the underlying hardware.
5. Isolation:
• VMs are isolated from each other, providing a level of security and preventing one
VM from impacting the others.
• If one VM crashes or experiences issues, it does not affect the operation of other
VMs.
6. Hardware Independence:
• Virtualization abstracts the underlying hardware, making VMs hardware-
independent.
• This abstraction allows for greater flexibility and portability of virtual machines
across different physical servers.
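When a KVM/QEMU hypervisor is already running on a Linux host, the virtual machines it manages can also be listed programmatically. The sketch below uses the libvirt Python bindings; it assumes the libvirt-python package is installed and that a local qemu:///system connection is available, so treat it as an illustration rather than a required setup.

import libvirt  # pip install libvirt-python

# Connect to the local QEMU/KVM hypervisor (read-only is enough for listing).
conn = libvirt.openReadOnly("qemu:///system")

# Each "domain" is one virtual machine managed by the hypervisor.
for dom in conn.listAllDomains():
    state, _ = dom.state()
    running = "running" if state == libvirt.VIR_DOMAIN_RUNNING else "not running"
    print("VM", dom.name(), "-", running)

conn.close()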
Methods of Server Virtualization:
1. Full Virtualization:
• In full virtualization, the hypervisor creates a complete virtual replica of the
underlying physical hardware.
• Guest operating systems run unmodified, believing they are interacting with real
hardware.
• Examples include VMware ESXi, Microsoft Hyper-V, and KVM.
2. Paravirtualization:
• Paravirtualization requires modifications to the guest operating system to make
it aware that it is running in a virtualized environment.
• This allows for more efficient communication between the guest OS and the
hypervisor.
• Xen is a commonly used paravirtualization hypervisor.
3. Hardware-Assisted Virtualization (HVM):
• HVM relies on hardware support from the CPU to assist in virtualization.
• CPUs with virtualization extensions (e.g., Intel VT-x, AMD-V) provide hardware
support for creating and managing virtual machines.
• HVM is often used in conjunction with full virtualization techniques.
4. Container-based Virtualization:
• Containers offer a lightweight form of virtualization, where applications and their
dependencies are packaged together.
• Containers share the host operating system's kernel, making them more resource-
efficient than traditional VMs.
• Docker and Kubernetes are popular containerization technologies.
Advantages of Server Virtualization:
• Improved Resource Utilization: Multiple VMs can run on a single physical server,
maximizing resource utilization.
• Cost Savings: Reduces hardware and energy costs by consolidating multiple servers into
a single physical machine.
• Scalability: Allows for easy scaling by adding or removing virtual machines based on
demand.
• Isolation: Provides isolation between virtual machines for improved security and
stability.
• Disaster Recovery: Facilitates easier backup and recovery of virtual machines.
Disadvantages of Server Virtualization:
• Performance Overhead: Introduces a slight performance overhead due to the
virtualization layer.
• Initial Setup Complexity: Setting up virtualization infrastructure can be complex.
• Resource Contention: In highly dynamic environments, contention for resources may
occur.
• Dependency on Hypervisor: There is a dependency on the hypervisor, and if it fails, all
virtual machines on the host may be impacted.
Server virtualization plays a crucial role in cloud computing, enabling efficient resource
utilization and flexibility in deploying and managing applications and services. The choice of
virtualization method depends on factors such as performance requirements, hardware
capabilities, and management preferences.
Storage Virtualization:
Storage virtualization is a technology that abstracts and pools physical storage resources from
multiple storage devices into a unified, centralized management layer. This abstraction allows for
more efficient utilization of storage resources, simplified management, and improved flexibility
in storage provisioning and data migration.
Working of Storage Virtualization:
1. Abstraction of Storage Resources:
• Storage virtualization abstracts the underlying physical storage devices, such as
hard drives, RAID arrays, or SAN (Storage Area Network) systems.
• A virtualization layer is introduced to create a logical representation of the entire
storage infrastructure.
2. Creation of a Storage Pool:
• Physical storage devices are aggregated into a single pool of storage resources.
• This pooled storage can be dynamically allocated to different applications,
servers, or users as needed.
3. Centralized Management:
• The virtualization layer provides a centralized management interface for
configuring, provisioning, and monitoring storage resources.
• Administrators can manage the entire storage infrastructure from a single
console.
4. Dynamic Provisioning:
• Storage can be dynamically provisioned to applications or servers based on
demand.
• Virtualized storage resources can be allocated or expanded without affecting the
underlying physical devices.
5. Data Migration:
• Storage virtualization facilitates seamless data migration between different
storage devices within the virtualized pool.
• This allows for non-disruptive storage upgrades or changes without impacting
applications.
6. Improving Data Access:
• Storage virtualization can optimize data access by implementing caching, tiering,
or other performance-enhancing techniques.
• This ensures that frequently accessed data is stored in the most efficient manner.
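A minimal way to picture steps 1 to 4 above is a software object that aggregates several physical devices into one pool and hands out capacity on demand. The Python sketch below is purely illustrative: the device names and sizes are invented, and real storage virtualization works at the block or file level rather than on simple capacity counters.

class StoragePool:
    # Aggregate physical devices into one logical pool of capacity (in GB).
    def __init__(self, devices):
        self.capacity = sum(devices.values())  # devices: name -> capacity in GB
        self.allocated = {}

    def provision(self, volume_name, size_gb):
        # Dynamically carve a logical volume out of the pooled capacity.
        used = sum(self.allocated.values())
        if used + size_gb > self.capacity:
            raise RuntimeError("pool exhausted")
        self.allocated[volume_name] = size_gb
        return volume_name

pool = StoragePool({"disk-a": 500, "disk-b": 500, "san-lun-0": 1000})
pool.provision("vm01-data", 250)
pool.provision("db-archive", 800)
print(pool.capacity, pool.allocated)

The consumers of the pool never see which physical disk backs their volume; they only see the logical capacity, which is the essence of the abstraction described above.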
Types of Storage Virtualization:
1. File-level Storage Virtualization:
• In file-level storage virtualization, the virtualization layer operates at the file level.
• It abstracts and manages individual files and directories, providing a unified
namespace for users or applications.
2. Block-level Storage Virtualization:
• Block-level storage virtualization operates at the lower level, dealing with storage
blocks or chunks.
• It abstracts and manages individual blocks of data, allowing for more granular
control over storage allocation.
3. Object Storage Virtualization:
• Object storage virtualization abstracts and manages data as objects, each with its
own unique identifier.
• This type is commonly used in cloud storage environments, providing a scalable
and flexible storage architecture.
Methods of Storage Virtualization:
1. In-band Storage Virtualization:
• In-band storage virtualization processes data within the data path, directly in the
I/O (Input/Output) path.
• The virtualization layer is actively involved in data transfer, making real-time
decisions on data access and storage.
2. Out-of-band Storage Virtualization:
• Out-of-band storage virtualization separates the data path from the control path.
• The virtualization layer is not directly involved in the data transfer; instead, it
handles management functions and provides instructions to the storage devices.
3. Appliance-based Storage Virtualization:
• Appliance-based storage virtualization involves deploying a dedicated hardware
appliance that acts as the virtualization layer.
• This appliance sits between the servers and the storage devices, managing storage
resources.
4. Host-based Storage Virtualization:
• Host-based storage virtualization involves installing software on the server that
manages storage resources.
• The server itself becomes the virtualization layer, handling storage abstraction
and management functions.
Advantages of Storage Virtualization:
• Simplified Management: Centralized management interface for all storage resources.
• Improved Utilization: Efficient utilization of storage resources through pooling.
• Flexibility: Dynamic provisioning and easy scalability.
• Data Mobility: Seamless data migration and non-disruptive upgrades.
• Cost Savings: Reduced hardware dependency and improved efficiency.
Disadvantages of Storage Virtualization:
• Complexity: Implementing storage virtualization can be complex.
• Performance Overhead: Introduces some performance overhead.
• Dependency on Virtualization Layer: Relies on the reliability of the virtualization layer.
Storage virtualization is a crucial component in modern data centers and cloud environments,
providing the necessary abstraction and flexibility to meet the demands of evolving storage
needs. The choice of storage virtualization type and method depends on specific use cases,
infrastructure requirements, and organizational preferences.
Application Virtualization:
Application virtualization is a technology that abstracts an application from the underlying
operating system and encapsulates it in a virtual environment. This allows the application to run
independently of the local system configuration, dependencies, and potential conflicts with other
applications. The primary goal is to improve application compatibility, reduce conflicts, and
enhance the ease of application management and deployment.
Working of Application Virtualization:
1. Isolation from the Operating System:
• Application virtualization creates a virtual environment for an application,
isolating it from the underlying operating system.
• The virtual environment includes necessary components, libraries, and settings
required for the application to run.
2. Package Creation:
• The application and its dependencies are packaged into a virtual container or
package.
• This container encapsulates the application and ensures that it runs consistently
across different environments.
3. Decoupling from the OS Registry:
• Application virtualization decouples the application from the system registry,
preventing conflicts with other applications that might share the same registry
keys.
4. On-Demand Loading:
• The virtualized application is not installed in the traditional sense. Instead, it is
loaded on-demand when needed.
• The application container is streamed or cached locally to the user's system,
minimizing the need for traditional installation procedures.
5. Portability:
• Application virtualization enhances portability as virtualized applications can be
run on different machines and operating systems without modification.
• This portability simplifies application deployment in diverse environments.
6. Run-time Isolation:
• The virtualized application operates in a runtime environment that is separate
from the host operating system and other applications.
• This isolation helps prevent conflicts, allowing multiple versions of the same
application or different applications to run simultaneously.
Types of Application Virtualization:
1. Full Virtualization:
• Full virtualization involves creating a complete, virtualized instance of the
operating system and running the application within that virtualized OS.
• This approach provides strong isolation but may have higher resource overhead.
2. Containerization:
• Containerization virtualizes the application along with its dependencies in
lightweight containers.
• Containers share the host operating system's kernel, resulting in lower resource
overhead compared to full virtualization.
• Docker is a popular containerization platform used for application virtualization.
3. Streaming-based Virtualization:
• In streaming-based virtualization, the application is delivered to the user's system
in a streaming manner.
• Components are downloaded on demand or cached locally, reducing the need for
a full installation.
Methods of Application Virtualization:
1. Process Virtualization:
• Process virtualization involves encapsulating the application's processes and
runtime components, isolating them from the host operating system.
• This method allows applications to run independently, avoiding conflicts with
other applications.
2. Presentation Virtualization:
• Presentation virtualization, also known as application streaming, involves
delivering only the user interface and inputs to the client system.
• The application's core processes run on a remote server, and only the user
interface is presented on the user's machine.
3. Desktop Virtualization (VDI):
• Desktop virtualization involves virtualizing the entire desktop environment,
including the operating system and applications.
• Users access their virtual desktops remotely, providing a complete, isolated
workspace.
Advantages of Application Virtualization:
• Compatibility: Mitigates compatibility issues by encapsulating application
dependencies.
• Isolation: Prevents conflicts between applications, allowing them to run independently.
• Portability: Applications can be easily deployed on different systems without
modification.
• Simplified Management: Streamlines application deployment and updates.
Disadvantages of Application Virtualization:
• Performance Overhead: Some performance overhead may be incurred due to the
virtualization layer.
• Dependency on Virtualization Layer: Applications depend on the reliability of the
virtualization layer.
• Not Suitable for All Applications: Some applications with complex dependencies may
not be suitable for virtualization.
Application virtualization is particularly valuable in environments where maintaining application
compatibility, reducing conflicts, and simplifying software deployment and management are
critical considerations. The choice of application virtualization method depends on specific use
cases, requirements, and the nature of the applications being virtualized.
Desktop Virtualization:
Desktop virtualization, often referred to as Virtual Desktop Infrastructure (VDI), is a technology
that separates the desktop environment and applications from the physical client device. Instead
of running applications and storing data locally on a user's device, desktop virtualization
centralizes these resources in a data center or cloud environment. Users access their virtual
desktops remotely, providing flexibility, scalability, and enhanced security.
Working of Desktop Virtualization:
1. Hypervisor or Virtualization Server:
• A hypervisor or virtualization server is set up in a data center or cloud
environment.
• The hypervisor creates and manages virtual machines (VMs), each representing a
virtual desktop.
2. Creation of Virtual Desktops:
• Virtual desktops are created within the VMs, and each virtual desktop functions
as an independent instance of the operating system and applications.
• The hypervisor allocates resources dynamically based on user demand.
3. User Access:
• Users access their virtual desktops using thin clients, desktop computers, laptops,
or even mobile devices.
• Remote display protocols transmit the user interface, allowing users to interact
with their virtual desktops as if they were using a local machine.
4. Centralized Management:
• Desktop virtualization centralizes desktop management tasks, including software
updates, security patches, and application installations.
• Administrators can manage and configure virtual desktops from a central console.
5. User Isolation:
• Each user's virtual desktop operates independently of others, providing isolation
and security.
• Users cannot impact each other's sessions, and changes made on one virtual
desktop do not affect others.
6. Data Security:
• Data is stored centrally in the data center or cloud, reducing the risk of data loss
or theft from individual client devices.
• Administrators can implement robust security measures at the data center level.
• Ideal for scenarios where users require specific configurations or
applications.
5. Client Hypervisors:
• Description: Client hypervisors are installed directly on user devices (such as
laptops or desktops). Users can run virtual desktops locally on their devices,
providing flexibility and offline access.
• Characteristics:
• Virtual desktops run directly on the user's device.
• Enables offline access to virtual desktop instances.
6. Application Virtualization:
• Description: While not strictly desktop virtualization, application virtualization
focuses on isolating and running applications independent of the underlying
operating system. It can complement desktop virtualization solutions.
• Characteristics:
• Applications run in isolated environments, reducing conflicts.
• Enhances flexibility and compatibility by isolating application
dependencies.
7. Remote Desktop Services (RDS):
• Description: RDS is a Microsoft technology that allows multiple users to access a
shared Windows desktop environment simultaneously. It is often used in
conjunction with virtual desktop infrastructure (VDI).
• Characteristics:
• Users connect to a shared Windows desktop session.
• Suitable for scenarios where users need access to a shared desktop
environment.
8. VDI with Persistent and Non-Persistent Desktops:
• Description: In VDI, virtual desktops can be classified as persistent or non-
persistent. Persistent desktops provide each user with a dedicated virtual
machine that retains personalization settings. Non-persistent desktops are reset
to a clean state after each session.
• Characteristics:
• Persistent desktops offer a consistent experience for users.
• Non-persistent desktops are ideal for scenarios where users don't need to
retain changes between sessions.
Methods of Desktop Virtualization:
1. Hardware Virtualization:
• Hardware virtualization involves using a hypervisor to create and manage virtual
machines on physical servers.
• Each virtual machine represents a virtual desktop, and multiple virtual desktops
can run on a single physical server.
2. Software-based Virtualization:
• Software-based virtualization utilizes software solutions to create and manage
virtual desktops.
• This approach may involve using software that runs on top of an existing
operating system to create virtual desktop instances.
3. Client Hypervisors:
• Client hypervisors are installed directly on user devices (such as laptops or
desktops).
• Users can run virtual desktops locally on their devices, providing flexibility and
offline access.
Advantages of Desktop Virtualization:
• Flexibility: Users can access their desktops from various devices and locations.
• Centralized Management: Easier management of software updates, security, and
configurations.
• Resource Efficiency: Better utilization of computing resources with dynamic allocation.
• Enhanced Security: Data is centralized, reducing the risk of data loss from local devices.
• Isolation: Each user operates in an isolated environment, preventing interference with
other users.
Disadvantages of Desktop Virtualization:
• Infrastructure Costs: Setting up and maintaining the required infrastructure can be
expensive.
• Network Dependency: Relies on a robust and high-bandwidth network for optimal
performance.
• User Experience: The user experience may be impacted by network latency or
bandwidth limitations.
• Complex Implementation: Implementing desktop virtualization can be complex,
requiring careful planning and configuration.
Desktop virtualization is beneficial in scenarios where organizations require centralized control,
improved security, and flexibility in delivering desktop environments to users. The choice of
desktop virtualization type and method depends on specific use cases, organizational
requirements, and user needs.
CPU Virtualization:
CPU virtualization is a technology that allows multiple virtual machines (VMs) to run on a single
physical CPU. It enables the efficient sharing of CPU resources among multiple operating systems
and applications, creating isolated environments for each virtual machine.
Working of CPU Virtualization:
The working of CPU virtualization involves the use of a hypervisor or Virtual Machine Monitor
(VMM) to create and manage multiple virtual machines (VMs) on a single physical CPU. The goal
is to efficiently share the CPU resources among different operating systems and applications,
allowing them to run independently in isolated environments. Here is a step-by-step explanation
of how CPU virtualization works:
1. Hypervisor Installation:
• A hypervisor is installed directly on the physical server hardware. The hypervisor
is responsible for creating and managing virtual machines.
• Examples of hypervisors include VMware ESXi, Microsoft Hyper-V, KVM (Kernel-
based Virtual Machine), Xen, and others.
2. Virtual Machine Creation:
• The hypervisor creates multiple virtual machines, each of which functions as an
independent and isolated instance of an operating system.
• The number of virtual machines that can run simultaneously depends on the
available CPU, memory, and other resources on the physical server.
3. CPU Allocation:
• The hypervisor allocates portions of the physical CPU's processing power to each
virtual machine.
• The allocation is dynamic and can be adjusted based on the workload
requirements of each VM. This allows for efficient utilization of CPU resources.
4. Instruction Translation:
• When a virtual machine issues instructions, the hypervisor intercepts and
translates these instructions to ensure compatibility with the underlying physical
CPU architecture.
• This translation ensures that the instructions from the virtual machine can be
executed on the real hardware.
5. Direct Execution and Paravirtualization:
• Modern CPUs often support hardware-assisted virtualization, allowing for more
efficient execution of virtualized workloads.
• In direct execution, the CPU can execute some instructions directly without the
need for translation by the hypervisor.
• Paravirtualization is an alternative approach where the guest operating system is
modified to be aware of the virtualization layer, resulting in more efficient
communication.
6. Isolation:
• Each virtual machine operates in complete isolation from other virtual machines.
This isolation ensures that one virtual machine cannot interfere with the
operation of others.
• Isolation extends to memory, storage, and other resources, providing security and
stability.
7. Hardware-Assisted Virtualization (HVM):
• CPUs with virtualization extensions, such as Intel VT-x and AMD-V, provide
hardware support for creating and managing virtual machines.
• Hardware-assisted virtualization enhances performance by offloading certain
virtualization tasks to dedicated CPU instructions.
8. Resource Management:
• The hypervisor monitors resource usage and dynamically allocates CPU resources
based on the demands of each virtual machine.
• This dynamic resource management allows for efficient scaling and optimization
of workloads.
Types of CPU Virtualization:
1. Full Virtualization:
• Description: In full virtualization, the virtual machine (VM) simulates the entire
hardware environment, including the CPU. The guest operating system runs
without modification.
• Characteristics:
• Guest OS runs unmodified.
• Examples include VMware, Microsoft Hyper-V, and KVM.
2. Paravirtualization:
• Description: Paravirtualization requires modifying the guest operating system to
be aware of the virtualization layer. This allows for more efficient communication
between the guest OS and the hypervisor.
• Characteristics:
• Guest OS is aware of the virtualization layer.
• Examples include Xen and Virtuozzo.
3. Hardware-Assisted Virtualization (HVM):
• Description: HVM relies on hardware support from the CPU to assist in
virtualization. CPUs with virtualization extensions (e.g., Intel VT-x, AMD-V)
provide hardware support for creating and managing virtual machines.
• Characteristics:
• Hardware extensions aid virtualization.
• Often used in conjunction with full virtualization techniques.
4. Binary Translation:
• Description: Binary translation involves the dynamic translation of instructions
from the guest operating system to the host CPU's instruction set. It allows guest
OSes to run unmodified.
• Characteristics:
• Instructions are translated during runtime.
• Examples include VMware's ESXi.
5. Nested Virtualization:
• Description: Nested virtualization allows a hypervisor to run as a virtual machine
on another hypervisor. This is useful for testing and development scenarios
where virtualization is used within virtual machines.
• Characteristics:
• Enables running a hypervisor within a virtual machine.
• Useful for creating nested virtualization environments.
6. CPU Pinning and Affinity:
• Description: CPU pinning and affinity involve dedicating specific physical CPU
cores to virtual machines. This can enhance performance and reduce contention
for CPU resources.
• Characteristics:
• Assigns specific CPU cores to virtual machines.
• Improves performance by reducing contention.
7. Thread-Level Virtualization:
• Description: Thread-level virtualization allows multiple virtual threads to run
concurrently on a single physical core. It enhances CPU utilization by leveraging
simultaneous multithreading (SMT) capabilities.
• Characteristics:
• Multiple virtual threads run on a single physical core.
• Improves CPU efficiency.
8. GPU Virtualization:
• Description: GPU virtualization involves sharing graphics processing units
(GPUs) among multiple virtual machines. It is crucial for scenarios requiring
virtualized graphics workloads.
• Characteristics:
• Enables multiple VMs to share GPU resources.
• Important for graphical workloads in virtualized environments.
Advantages of CPU Virtualization:
• Efficient Resource Utilization: Multiple VMs share the same physical CPU, optimizing
resource usage.
• Isolation: Provides isolation between virtual machines, enhancing security and stability.
• Flexibility: Enables running multiple operating systems and applications on a single
physical server.
• Scalability: Allows for easy scaling by adding or removing virtual machines based on
demand.
Disadvantages of CPU Virtualization:
• Performance Overhead: Introduces a slight performance overhead due to the
virtualization layer.
• Complexity: Managing virtualized environments can be complex.
• Dependency on Hardware Support: Certain virtualization features depend on CPU
hardware support.
CPU virtualization is a fundamental technology in the world of server virtualization, enabling the
consolidation of workloads, improved efficiency, and flexibility in deploying and managing
applications and services. The choice of virtualization type and method depends on specific use
cases, hardware capabilities, and management preferences.
Memory Virtualization:
Memory virtualization is a technique that abstracts and manages the physical memory resources
of a computer system, allowing multiple virtual machines (VMs) or processes to share and access
memory independently. It provides a layer of abstraction between the physical RAM (Random
Access Memory) and the software, enabling efficient utilization of memory resources and
improved flexibility in managing workloads.
Working of Memory Virtualization:
1. Memory Abstraction:
• Memory virtualization starts by abstracting the physical memory resources into
a virtualized layer.
• Each VM or process interacts with this virtualized memory layer, believing it has
its dedicated memory space.
2. Memory Allocation:
• The memory virtualization layer, often implemented by a hypervisor or the
operating system's memory manager, allocates portions of the physical memory
to different VMs or processes.
• Allocation can be dynamic, adjusting based on the changing demands of each VM
or process.
3. Isolation:
• Memory virtualization ensures isolation between different VMs or processes.
Each VM or process operates within its own virtual memory space, preventing
interference with others.
4. Address Translation:
• When a VM or process accesses its virtual memory, the memory virtualization
layer translates the virtual addresses to corresponding physical addresses in the
actual RAM.
• This translation is transparent to the VM or process and is handled by the memory
management unit (MMU) or similar mechanisms.
5. Page Tables:
• Page tables are used to maintain the mapping between virtual and physical
memory addresses.
• When a VM or process requests memory, the memory virtualization layer consults the appropriate page table to determine the physical location of the data (a simplified lookup is sketched after this list).
6. Dynamic Memory Balancing:
• In virtualized environments, memory usage can vary among different VMs or
processes.
• Memory virtualization allows for dynamic memory balancing, adjusting the
allocation of physical memory to meet the changing needs of each VM or process.
7. Overcommitment and Page Sharing:
• Memory virtualization may support overcommitment, allowing the allocation of
more virtual memory than physically available.
• Techniques such as page sharing identify identical memory pages across VMs and
share them to reduce redundancy and save memory.
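The address-translation and page-table steps (points 4 and 5 above) can be pictured with a deliberately simplified sketch. Real systems use multi-level hardware page tables and handle page faults in the hypervisor or OS; the dictionary-based page table below is only a teaching toy, and all names in it are invented.
```python
# Simplified model of virtual-to-physical address translation via a page table.
PAGE_SIZE = 4096                      # 4 KiB pages

# Per-VM page table: virtual page number -> physical frame number
page_table = {0: 7, 1: 3, 2: 12}

def translate(virtual_address: int) -> int:
    vpn = virtual_address // PAGE_SIZE           # virtual page number
    offset = virtual_address % PAGE_SIZE         # offset within the page
    if vpn not in page_table:
        raise MemoryError(f"page fault: virtual page {vpn} is not mapped")
    return page_table[vpn] * PAGE_SIZE + offset  # physical address

print(hex(translate(0x1234)))   # virtual page 1 -> frame 3 -> 0x3234
```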
Types of Memory Virtualization:
1. Full Virtualization:
• In full virtualization, each VM has its complete virtual address space, which is then
mapped to the underlying physical memory.
• Guest operating systems run unmodified, and the virtualization layer handles
address translation.
• Examples include VMware, Hyper-V, and KVM.
2. Paravirtualization:
• Paravirtualization involves modifying the guest operating system to be aware of
the virtualization layer.
• The guest OS cooperates with the virtualization layer to achieve better
performance and efficiency in memory management.
• Examples include Xen.
3. Memory Overcommitment:
• Memory overcommitment allows allocating more virtual memory to VMs than the
physical memory available.
• Techniques like demand paging and memory balloon drivers help manage
memory usage efficiently.
Methods of Memory Virtualization:
1. Memory Ballooning:
• Memory ballooning involves adjusting the memory allocation of VMs dynamically.
• A memory balloon driver inside the VM communicates with the hypervisor to
request more or less memory based on demand.
2. Memory Compression:
• Memory compression is a technique where data in memory is compressed to save
space.
• Compressed memory pages can be stored more efficiently, allowing for better
memory utilization.
3. Memory Deduplication:
• Memory deduplication identifies identical memory pages across VMs and shares
them.
• This reduces redundancy and saves physical memory (a toy illustration follows this list).
4. Memory Tiering:
• Memory tiering involves categorizing memory into different tiers based on access
patterns and priorities.
• Frequently accessed data can be stored in faster memory tiers, while less
frequently used data can be in slower tiers.
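As a rough illustration of the deduplication idea (method 3 above), the sketch below hashes page contents and keeps a single copy of identical pages across VMs. Production implementations such as the Linux kernel's KSM work very differently; this is purely a conceptual toy with made-up data.
```python
# Conceptual sketch of memory deduplication: identical pages across VMs
# are detected by hashing their contents and stored only once.
import hashlib

vm_pages = {
    "vm1": [b"\x00" * 4096, b"kernel code page", b"unique-to-vm1"],
    "vm2": [b"\x00" * 4096, b"kernel code page", b"unique-to-vm2"],
}

store = {}       # content hash -> single shared copy of the page
mappings = {}    # (vm, page index) -> content hash

for vm, pages in vm_pages.items():
    for idx, page in enumerate(pages):
        digest = hashlib.sha256(page).hexdigest()
        store.setdefault(digest, page)           # keep one copy per distinct content
        mappings[(vm, idx)] = digest

total = sum(len(p) for pages in vm_pages.values() for p in pages)
kept = sum(len(p) for p in store.values())
print(f"distinct pages stored: {len(store)} of {sum(map(len, vm_pages.values()))}, "
      f"bytes kept: {kept} of {total}")
```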
Advantages of Memory Virtualization:
• Efficient Resource Utilization: Enables multiple VMs to share physical memory
resources.
• Isolation: Provides isolation between different VMs or processes.
• Flexibility: Dynamic allocation and adjustment of memory resources based on demand.
• Overcommitment: Supports allocating more virtual memory than physically available.
Disadvantages of Memory Virtualization:
• Performance Overhead: Introduces a slight performance overhead due to address
translation and management.
• Complexity: Memory management in virtualized environments can be complex.
• Dependency on Virtualization Layer: Relies on the reliability of the memory
virtualization layer.
Memory virtualization is a critical component in modern virtualized environments, allowing for
efficient utilization of physical memory resources and providing flexibility in managing diverse
workloads. The choice of memory virtualization type and method depends on specific use cases,
organizational requirements, and the nature of the applications being virtualized.
I/O Device Virtualization:
I/O (Input/Output) device virtualization is a technology that abstracts and manages the physical
I/O devices of a computer system, allowing multiple virtual machines (VMs) or processes to share
and access these devices independently. The goal is to efficiently utilize I/O resources, enable
isolation between VMs or processes, and provide a flexible environment for managing diverse
workloads.
Working of I/O Device Virtualization:
1. I/O Device Abstraction:
• I/O device virtualization begins by abstracting the physical I/O devices into a
virtualized layer.
• Each VM or process interacts with this virtualized I/O layer, believing it has its
dedicated I/O devices.
2. Device Allocation:
• The I/O virtualization layer, often implemented by a hypervisor or the operating
system's I/O manager, allocates portions of the physical I/O devices to different
VMs or processes.
• This allocation can be dynamic, adjusting based on the changing demands of each
VM or process.
3. Device Emulation or Passthrough:
• I/O device virtualization can use different methods, including emulation or
passthrough.
• In emulation, the virtualization layer emulates I/O devices, presenting virtual
devices to the VMs.
• In passthrough, the VMs have direct access to physical I/O devices, bypassing the
virtualization layer.
4. Interrupt and DMA Virtualization:
• I/O operations often involve interrupts and Direct Memory Access (DMA) for data
transfer.
• The I/O virtualization layer manages interrupt routing and DMA operations to
ensure that each VM or process operates independently.
5. I/O Address Translation:
• When a VM or process issues I/O requests, the I/O virtualization layer translates
the virtual I/O addresses to corresponding physical addresses for the actual
devices.
• This translation ensures that the I/O requests from the VMs or processes are
directed to the correct physical devices.
6. I/O Queues and Buffering:
• The I/O virtualization layer may use queues and buffering mechanisms to
efficiently handle I/O requests from multiple VMs or processes.
• This helps in managing contention for I/O resources and optimizing data transfer.
Types of I/O Device Virtualization:
1. Emulated Devices:
• In emulated device virtualization, the virtualization layer provides emulated
versions of I/O devices to VMs.
• VMs interact with these emulated devices, and the virtualization layer translates
their requests to the actual physical devices.
2. Passthrough Devices:
• Passthrough device virtualization allows VMs to have direct access to physical I/O
devices.
• This is achieved by bypassing the virtualization layer, granting VMs exclusive
control over specific I/O devices.
3. Para-virtualized Devices:
• Para-virtualized device virtualization involves modifying the guest operating
system to use special drivers that communicate efficiently with the virtualization
layer.
• This cooperation between the guest OS and the virtualization layer enhances I/O
performance.
Methods of I/O Device Virtualization:
1. Interrupt Virtualization:
• Interrupt virtualization manages the routing of interrupts from I/O devices to the
appropriate VMs.
• Each VM is assigned its interrupt vectors, ensuring that interrupts are handled
independently.
2. DMA Virtualization:
• DMA virtualization involves managing Direct Memory Access operations for data
transfer between I/O devices and memory.
• The virtualization layer ensures proper coordination and isolation of DMA
activities among VMs.
3. I/O Address Translation:
• I/O address translation is performed to map virtual I/O addresses used by VMs to
the corresponding physical addresses of the actual I/O devices.
• This translation is transparent to VMs and ensures correct communication with
the physical devices.
4. I/O Request Queues:
• I/O request queues are used to manage and prioritize I/O requests from multiple
VMs.
• This helps prevent contention for I/O resources and optimizes the order of data
transfer.
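The request-queue idea can be pictured with a small sketch: each VM has its own queue, and the virtualization layer services the queues in round-robin order so that no single VM monopolizes the shared device. Real virtual I/O stacks (for example, virtio rings) are far more involved; the names below are invented for illustration.
```python
# Toy model of per-VM I/O request queues serviced in round-robin order,
# so that no single VM can monopolize the shared physical device.
from collections import deque

queues = {
    "vm1": deque(["read blk 10", "write blk 11"]),
    "vm2": deque(["read blk 99"]),
    "vm3": deque(["write blk 5", "read blk 6", "read blk 7"]),
}

def dispatch_round_robin(queues):
    """Yield one pending request per VM per round until all queues drain."""
    while any(queues.values()):
        for vm, q in queues.items():
            if q:
                yield vm, q.popleft()

for vm, request in dispatch_round_robin(queues):
    print(f"device handles {request!r} on behalf of {vm}")
```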
Advantages of I/O Device Virtualization:
• Efficient Resource Utilization: Enables multiple VMs to share physical I/O devices.
• Isolation: Provides isolation between different VMs or processes.
• Flexibility: Dynamic allocation and adjustment of I/O resources based on demand.
• Device Sharing: Allows multiple VMs to use the same physical I/O device without
conflicts.
Disadvantages of I/O Device Virtualization:
• Performance Overhead: Introduces a slight performance overhead due to additional
layers of abstraction and management.
• Device Compatibility: Certain I/O devices may have limited virtualization support.
• Complexity: Managing I/O devices in virtualized environments can be complex.
I/O device virtualization is crucial in creating efficient and flexible virtualized environments,
especially in scenarios where multiple VMs or processes need to share and access I/O resources
simultaneously. The choice of virtualization type and method depends on specific use cases,
organizational requirements, and the nature of the applications being virtualized.
Network Virtualization:
Network virtualization is a technology that abstracts and decouples the physical network
infrastructure from the applications and services that use it. It enables the creation of multiple
virtual networks on top of a shared physical network, allowing for improved resource utilization,
isolation, and flexibility in network management.
Working of Network Virtualization:
1. Abstraction of Physical Network:
• Network virtualization begins by abstracting the physical network infrastructure,
including routers, switches, and other networking devices.
• This abstraction creates a virtualized layer that sits on top of the physical
network.
2. Virtual Network Creation:
• Virtual networks are created on demand by allocating segments of the physical
network infrastructure to each virtual network.
• Each virtual network operates as an independent entity, unaware of the existence
of other virtual networks.
3. Network Isolation:
• Network virtualization provides isolation between different virtual networks.
Traffic within one virtual network is isolated from traffic in other virtual
networks.
• Isolation is achieved by creating separate routing tables, subnets, and addressing
spaces for each virtual network.
4. Overlay Networks:
• Overlay networks are commonly used in network virtualization. They involve
encapsulating and tunneling virtual network traffic over the physical network
infrastructure.
• Encapsulation helps in maintaining isolation and allows virtual networks to operate independently (a toy encapsulation sketch follows this list).
5. Virtual Switching and Routing:
• Virtual switches and routers are introduced in the virtualized layer to handle
traffic within each virtual network.
• These virtual devices are responsible for making forwarding decisions based on
the virtual network's configuration.
6. Address Translation:
• Network virtualization may involve address translation to map virtual network
addresses to physical addresses.
• This translation ensures that virtual networks can communicate seamlessly over
the shared physical network.
7. Dynamic Resource Allocation:
• Network virtualization allows for dynamic allocation and adjustment of network
resources based on the changing needs of each virtual network.
• This flexibility enables efficient use of network resources.
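The encapsulation performed by an overlay network (point 4 above) can be sketched as an inner virtual-network frame wrapped in an outer packet that carries a VXLAN-style Virtual Network Identifier (VNI) across the physical underlay. The field names and structures below are heavily simplified and purely illustrative.
```python
# Simplified picture of overlay encapsulation: a virtual-network frame is
# wrapped in an outer packet carrying a VXLAN-style Virtual Network Identifier.
from dataclasses import dataclass

@dataclass
class InnerFrame:
    src_vm_mac: str
    dst_vm_mac: str
    payload: bytes

@dataclass
class OverlayPacket:
    outer_src_host: str      # physical (underlay) address of the source host
    outer_dst_host: str      # physical address of the destination host
    vni: int                 # identifies which virtual network the frame belongs to
    inner: InnerFrame

def encapsulate(frame: InnerFrame, vni: int, src_host: str, dst_host: str) -> OverlayPacket:
    return OverlayPacket(src_host, dst_host, vni, frame)

pkt = encapsulate(InnerFrame("aa:aa", "bb:bb", b"hello"), vni=5001,
                  src_host="10.0.0.1", dst_host="10.0.0.2")
print(pkt.vni, pkt.inner.payload)   # the underlay only sees the outer header
```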
Types of Network Virtualization:
1. Software-Defined Networking (SDN):
• SDN is an architecture that separates the control plane from the data plane in
network devices.
• It provides a centralized control plane, allowing administrators to dynamically
manage and configure the network.
2. Network Function Virtualization (NFV):
• NFV involves virtualizing network functions, such as firewalls, load balancers, and
routers.
• Instead of relying on physical appliances, these network functions are
implemented as software and can be dynamically deployed as needed.
3. Virtual LANs (VLANs):
• VLANs are a form of network virtualization that divides a physical LAN into
multiple logical LANs.
• Each VLAN operates as a separate broadcast domain, providing segmentation and
isolation.
Methods of Network Virtualization:
1. Overlay Networks:
• Overlay networks create virtual networks on top of the physical infrastructure.
Technologies like VXLAN (Virtual Extensible LAN) and GRE (Generic Routing
Encapsulation) are used for encapsulation and tunneling.
2. Virtual Routing and Forwarding (VRF):
• VRF is a method that enables multiple instances of a routing table to coexist within
the same router. Each VRF operates as an independent router, allowing for
network segmentation.
3. Network Hypervisors:
• Network hypervisors manage the creation and operation of virtual networks.
They provide the necessary abstraction to enable the deployment and
management of virtual networks.
4. SDN Controllers:
• SDN controllers centralize the control plane in software. They communicate with
network devices and make dynamic decisions to optimize the network based on
application needs.
5. NFV Orchestrators:
• NFV orchestrators manage the deployment, configuration, and scaling of
virtualized network functions. They automate the lifecycle management of
network functions.
Advantages of Network Virtualization:
• Efficient Resource Utilization: Enables multiple virtual networks to share the same
physical infrastructure.
• Isolation: Provides isolation between different virtual networks, enhancing security.
• Flexibility: Allows for dynamic allocation and adjustment of network resources.
• Ease of Management: Simplifies network configuration and management through
abstraction.
Disadvantages of Network Virtualization:
• Complexity: Implementing and managing virtualized networks can be complex.
• Performance Overhead: Introduces a slight performance overhead due to additional
layers of abstraction.
• Dependency on Virtualization Layer: Relies on the reliability of the network
virtualization layer.
Network virtualization is a critical component in modern data center and cloud environments,
providing the agility and scalability required to meet the demands of diverse applications and
services. The choice of network virtualization type and method depends on specific use cases,
organizational requirements, and the nature of the network infrastructure.
Data Center Virtualization:
4. Resource Allocation:
• The hypervisor allocates portions of the virtualized pool of resources to each VM
based on its requirements.
• Resources include CPU, memory, storage, and network bandwidth.
5. Storage Virtualization:
• Storage virtualization abstracts physical storage devices into virtualized storage
pools.
• Virtualization layers or software-defined storage solutions enable efficient
storage management and allocation to VMs.
6. Network Virtualization:
• Network virtualization abstracts and decouples the physical network
infrastructure.
• Virtual networks are created to enable communication between VMs and other
devices within the data center.
7. Dynamic Resource Allocation:
• Data center virtualization allows for dynamic allocation and adjustment of
resources based on the changing needs of applications and services.
• This flexibility enables efficient use of resources and accommodates varying
workloads.
Types of Data Center Virtualization:
1. Server Virtualization:
• Server virtualization involves abstracting physical servers into virtual machines.
• Hypervisors create and manage multiple VMs on a single physical server,
optimizing server utilization.
2. Storage Virtualization:
• Storage virtualization abstracts physical storage devices into virtualized storage
pools.
• This allows for centralized management, efficient allocation, and dynamic scaling
of storage resources.
3. Network Virtualization:
• Network virtualization abstracts and isolates the physical network infrastructure.
• Virtual networks are created to provide communication channels between VMs
and other devices in the data center.
4. Desktop Virtualization:
• Desktop virtualization abstracts and centralizes the management of desktop
computing resources.
• Virtual desktops can be created and managed in a data center, providing users
with remote access to their desktop environments.
5. Application Virtualization:
• Application virtualization isolates and encapsulates applications from the
underlying operating system.
• Applications can run independently, reducing conflicts and compatibility issues.
Methods of Data Center Virtualization:
1. Hypervisor-Based Virtualization:
• Hypervisor-based virtualization, also known as server virtualization, involves the
use of hypervisors to create and manage virtual machines.
• Examples include VMware ESXi, Microsoft Hyper-V, and KVM.
2. Software-Defined Storage (SDS):
• SDS abstracts storage resources and allows for centralized management through
software.
• Storage virtualization solutions, like VMware vSAN or Ceph, fall under this
category.
3. Software-Defined Networking (SDN):
• SDN abstracts and centralizes the control of network resources.
• Virtual networks are created and managed through software controllers, allowing
for dynamic network configuration.
4. Containerization:
• Containerization abstracts applications and their dependencies into containers.
• Technologies like Docker and Kubernetes enable the deployment and management of containerized applications (a minimal example follows this list).
5. Desktop Virtualization Platforms:
• Desktop virtualization platforms, such as VMware Horizon or Citrix Virtual Apps
and Desktops, abstract and manage desktop computing resources.
• Virtual desktops are created and delivered to users based on demand.
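As a minimal illustration of containerization (method 4 above), the snippet below drives the Docker command-line tool from Python to run a short-lived container. It assumes Docker is installed and the alpine image can be pulled; the wrapper function name is invented for this example.
```python
# Minimal sketch of containerization: run a command inside an isolated
# container that shares the host kernel (requires a local Docker installation).
import subprocess

def run_in_container(image: str, command: list) -> str:
    """Start a throwaway container, run `command`, and return its output."""
    result = subprocess.run(
        ["docker", "run", "--rm", image] + command,
        capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(run_in_container("alpine", ["uname", "-a"]))  # the kernel shown is the host's
```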
Advantages of Data Center Virtualization:
• Resource Optimization: Improves resource utilization by consolidating and sharing
physical resources.
• Flexibility: Enables dynamic allocation and scaling of resources based on workload
demands.
• Isolation: Ensures isolation between different applications, services, and users.
• Cost Savings: Reduces hardware costs and operational expenses through efficient
resource usage.
Disadvantages of Data Center Virtualization:
• Complexity: Implementing and managing virtualized environments can be complex.
• Performance Overhead: Introduces a slight performance overhead due to additional
layers of abstraction.
• Dependency on Virtualization Layer: Relies on the reliability of the virtualization layer.
Data center virtualization is a fundamental concept in modern IT infrastructure, providing the
foundation for cloud computing, scalability, and efficient resource utilization. The choice of
virtualization types and methods depends on specific use cases, organizational requirements, and
the nature of the applications and services being virtualized.
Operating System Virtualization:
Unlike hypervisor-based virtualization, where each virtual machine runs a full OS, OS-level virtualization shares the host OS kernel among multiple containers or virtual environments. This approach provides lightweight, efficient, and fast virtualization.
Working of Operating System Virtualization:
The working of operating system (OS) virtualization involves creating and managing multiple
isolated instances of operating systems on a single physical machine. The primary goal is to
provide a way to efficiently share and utilize resources while maintaining isolation between
different virtualized environments. The specific workings may vary depending on the
virtualization technique used, such as hypervisor-based virtualization, paravirtualization, or
containerization. Here, we'll provide a general overview based on hypervisor-based
virtualization:
Hypervisor-Based Virtualization:
1. Hypervisor Installation:
• A hypervisor, also known as a Virtual Machine Monitor (VMM), is installed
directly on the physical hardware or on top of the host operating system.
• The hypervisor abstracts and manages the physical hardware resources,
including CPU, memory, storage, and network.
2. Creation of Virtual Machines (VMs):
• Multiple virtual machines are created, each representing a complete instance of
an operating system.
• Each VM has its own virtual hardware, including a virtual CPU, virtual memory,
virtual disks, and virtual network interfaces.
3. Guest Operating System Installation:
• Guest operating systems are installed on each VM, just like on a physical machine.
The guest OS interacts with the virtual hardware provided by the hypervisor.
• The guest OS can be the same or different from the host operating system.
4. Hypervisor Management:
• The hypervisor manages the allocation of physical resources to each virtual
machine.
• It schedules the execution of VMs on the physical CPU, assigns memory, and
controls access to I/O devices.
5. Resource Isolation:
• Each VM is isolated from other VMs. The hypervisor ensures that the execution of
one VM does not interfere with others.
• Memory addresses, I/O operations, and CPU states are virtualized and managed
by the hypervisor to maintain isolation.
6. Hardware-Assisted Virtualization:
• Modern CPUs often come with virtualization extensions (e.g., Intel VT-x, AMD-V)
that enhance the performance of virtualization.
• These extensions provide hardware support for tasks like efficient memory
management, direct execution of certain instructions, and nested virtualization.
7. I/O Virtualization:
• Hypervisors manage I/O operations to ensure that each VM can communicate
with storage devices, network interfaces, and other peripherals.
• Virtual I/O devices are presented to VMs, and the hypervisor translates these
virtual operations into actual physical operations.
8. Migration and Live Migration:
• Some hypervisors support migration, allowing VMs to be moved between physical
hosts without interruption.
• Live migration enables moving a running VM from one host to another without
downtime for applications.
9. Snapshot and Cloning:
• Hypervisors often provide features like snapshotting and cloning, allowing
administrators to capture the state of a VM at a specific point in time or create
identical copies of VMs.
10. Integration Tools:
• Guest OS integration tools or drivers may be installed within VMs to enhance
communication with the hypervisor. These tools improve performance, enable
features like dynamic resource allocation, and facilitate interaction between the
host and guest OS.
The exact details of the working may differ based on the specific hypervisor and virtualization
technology used. For example, paravirtualization modifies the guest OS to enhance
communication, while containerization involves creating lightweight, isolated environments that
share the host OS kernel.
Techniques of OS Virtualization
Operating system (OS) virtualization employs various techniques to create and manage multiple
isolated instances of operating systems on a single physical machine. The primary techniques
used in OS virtualization include:
1. Hypervisor-Based Virtualization:
• Description: Hypervisor, also known as a Virtual Machine Monitor (VMM), is a
layer of software that runs directly on the hardware and allows multiple virtual
machines (VMs) to run on the same physical machine. Each VM has its own
complete OS instance.
• Key Characteristics:
• Full OS instances run in each VM.
• Hypervisor manages the allocation of physical resources (CPU, memory,
etc.) to VMs.
• Examples: VMware ESXi, Microsoft Hyper-V, KVM, Xen.
2. Paravirtualization:
• Description: Paravirtualization involves modifying the guest OS to make it aware
of the virtualization layer. This collaboration between the guest OS and the
hypervisor enhances performance by allowing more efficient communication and
coordination.
• Key Characteristics:
• Guest OS is modified for better virtualization support.
• Improved performance compared to full virtualization.
• Examples: Xen; Virtuozzo is sometimes listed here, though it is better described as OS-level (container) virtualization.
3. Containerization:
• Description: Containers provide lightweight, isolated environments for
applications. Instead of virtualizing an entire OS, containers share the host OS
kernel but run in isolated user spaces.
• Key Characteristics:
• Lightweight and fast.
• Share the host OS kernel for efficiency.
• Examples: Docker, Podman, containerd.
4. Kernel-Based Virtual Machine (KVM):
• Description: KVM is a Linux kernel module that turns the host OS into a
hypervisor. It allows the host OS to act as a hypervisor for running multiple VMs.
• Key Characteristics:
• Uses hardware-assisted virtualization features.
• Can run both fully virtualized and paravirtualized VMs.
• Examples: KVM/QEMU.
5. Virtualization Extensions (Intel VT-x, AMD-V):
• Description: Modern CPUs come with virtualization extensions (e.g., Intel VT-x,
AMD-V) that provide hardware support for virtualization. These extensions
enhance the performance and efficiency of virtualization.
• Key Characteristics:
• Hardware support for virtualization tasks.
• Accelerate virtual machine operations.
• Examples: Intel VT-x, AMD-V.
6. Nested Virtualization:
• Description: Nested virtualization allows running a hypervisor inside a virtual
machine. This is useful for testing, development, and scenarios where
virtualization needs to be extended to multiple layers.
• Key Characteristics:
• Hypervisor runs inside a VM.
• Enables virtualization within virtualization.
• Examples: Running VMware ESXi inside a VM.
7. Hardware-Assisted Memory Virtualization:
• Description: Memory virtualization involves techniques to efficiently manage
and allocate memory resources in a virtualized environment. Hardware support,
such as Extended Page Tables (EPT) and Rapid Virtualization Indexing (RVI),
accelerates memory virtualization.
• Key Characteristics:
• Hardware support for efficient memory management.
• Enhances translation of virtual to physical memory addresses.
• Examples: EPT (Intel), RVI (AMD).
8. Direct Execution (Intel VT-x, AMD-V):
• Description: Direct execution allows certain instructions from virtual machines
to be executed directly on the physical CPU without the need for translation by
the hypervisor. This feature improves the efficiency of virtualization.
• Key Characteristics:
• Allows certain instructions to be executed directly.
• Reduces the overhead of instruction translation.
• Examples: Intel VT-x, AMD-V.
These techniques cater to different virtualization requirements, offering a range of options from
full OS virtualization with hypervisors to lightweight containerization. The choice of technique
depends on factors such as isolation needs, performance considerations, and the nature of the
applications being virtualized.
Advantages of Virtualization:
2. Isolation:
• Advantage: Virtualization provides isolation between different virtual machines
or containers. This isolation enhances security by preventing one virtualized
environment from impacting others.
• Explanation: In the event of a security breach or a failure in one virtualized
instance, others remain unaffected due to the inherent isolation provided by the
virtualization layer.
3. Flexibility and Scalability:
• Advantage: Virtualization allows for dynamic allocation and scaling of resources
based on changing workloads. This flexibility is crucial for handling varying
demands and optimizing resource usage.
• Explanation: Virtual machines and containers can be easily provisioned or
decommissioned to adapt to changing requirements, ensuring that resources are
allocated efficiently.
4. Hardware Independence:
• Advantage: Virtualization abstracts the underlying hardware, providing a layer of
independence for virtualized instances. This abstraction makes it easier to
migrate virtual machines between different physical servers.
• Explanation: The ability to move virtual machines across servers (live migration)
allows for hardware maintenance, upgrades, and optimizations without affecting
running services.
5. Cost Savings:
• Advantage: Virtualization leads to significant cost savings by reducing the need
for a large number of physical servers, lowering energy consumption, and
streamlining management and maintenance.
• Explanation: By optimizing resource usage and consolidating workloads,
organizations can achieve cost efficiencies in terms of hardware procurement,
power consumption, and operational expenses.
6. Backup and Disaster Recovery:
• Advantage: Virtualization simplifies backup and disaster recovery processes.
Virtual machines can be snapshotted for quick backups, and recovery solutions can
be implemented more efficiently.
• Explanation: Snapshotting allows organizations to capture the state of a virtual
machine at a specific point in time, facilitating quick recovery in case of data loss
or system failures.
7. Testing and Development:
• Advantage: Virtualization is valuable for testing and development purposes. It
enables the creation of isolated environments for software testing, development,
and debugging.
• Explanation: Developers can create virtualized instances with different
configurations for testing and debugging software without impacting the
production environment.
Disadvantages of Virtualization:
1. Performance Overhead:
• Disadvantage: Virtualization introduces a performance overhead due to the
additional layer of abstraction and the need for resource sharing among virtual
machines or containers.
• Explanation: While advances in virtualization technologies aim to minimize
overhead, certain workloads may experience a performance impact compared to
running on bare-metal hardware.
2. Complexity and Learning Curve:
• Disadvantage: Implementing and managing virtualized environments can be
complex, requiring knowledge of virtualization technologies, hypervisors, and
orchestration tools.
• Explanation: The complexity increases with the scale of virtualized infrastructure,
necessitating skilled administrators and potentially leading to a learning curve for
organizations adopting virtualization.
3. Resource Contention:
• Disadvantage: In a shared virtualized environment, resource contention can occur
when multiple virtual machines or containers compete for the same resources.
• Explanation: If not properly managed, resource contention can lead to
performance degradation and unpredictable behavior, especially during peak
usage periods.
4. Dependency on Virtualization Layer:
• Disadvantage: Organizations become dependent on the stability and reliability of
the virtualization layer. Failures or vulnerabilities in the virtualization layer can
impact multiple virtualized instances.
• Explanation: Virtualized environments rely on the underlying hypervisor or
container runtime, and any issues with these components can affect the overall
stability and security of the virtualized infrastructure.
5. Licensing Costs:
• Disadvantage: Some virtualization solutions may involve licensing costs,
especially for enterprise-level features or advanced management tools.
• Explanation: While many open-source virtualization solutions exist, enterprises
may opt for commercial solutions that come with additional features and support,
leading to licensing expenses.
6. Limited OS Compatibility:
• Disadvantage: Certain operating systems may have limited virtualization support,
particularly when using specific hypervisors.
• Explanation: While popular operating systems are well-supported, less common
or specialized operating systems may face challenges in virtualized environments,
affecting application compatibility.
7. I/O Performance:
• Disadvantage: Input/output (I/O) performance can be a challenge in virtualized
environments, especially for applications that rely heavily on disk or network I/O.
• Explanation: Virtualized environments may introduce latency in I/O operations,
impacting the performance of storage-intensive or network-intensive workloads.
It's important to note that while virtualization offers numerous advantages, organizations should
carefully consider their specific needs, workloads, and available resources to determine the most
appropriate virtualization strategy. Addressing potential disadvantages involves proper
planning, optimization, and ongoing management practices.
Working of VMware
The working of VMware involves virtualizing computing resources to create a flexible and
efficient IT infrastructure. The core concept is to abstract physical hardware and present it as
virtual resources, enabling the simultaneous operation of multiple virtual machines on a single
physical server. VMware achieves this through its suite of products, primarily VMware vSphere.
Below is an overview of the working of VMware:
1. Hypervisor Installation:
• VMware vSphere utilizes a hypervisor, known as ESXi, which is installed directly on the
physical hardware. ESXi is a bare-metal hypervisor, meaning it runs directly on the server
without the need for an underlying operating system.
2. Virtual Machine Creation:
• After installing ESXi, virtual machines (VMs) are created. Each VM represents a complete
virtualized environment, including a virtual CPU, memory, storage, and network
interfaces.
3. Centralized Management with vCenter Server:
• VMware vCenter Server is used for centralized management of the virtualized
environment. It allows administrators to configure, monitor, and manage multiple ESXi hosts and VMs from a single interface (a small API-based sketch follows this list).
4. Resource Allocation and Management:
• vCenter Server manages the allocation of physical resources (CPU, memory, storage, and
network) to individual VMs. It ensures efficient utilization of resources and enables
dynamic allocation based on workload demands.
5. Hypervisor-Level Isolation:
• ESXi provides isolation between VMs at the hypervisor level. Each VM operates
independently, with its own operating system and applications. This isolation prevents
one VM from impacting others running on the same host.
6. vMotion for Live Migration:
• VMware vSphere includes a feature called vMotion, which allows for live migration of
running VMs between different ESXi hosts without any downtime. This feature is useful
for load balancing, maintenance, and improving resource utilization.
7. Storage Virtualization with vSAN:
• VMware's vSAN (Virtual Storage Area Network) is a software-defined storage solution
that aggregates local storage resources from ESXi hosts to create a shared storage pool. It
eliminates the need for external storage and provides scalable, distributed storage.
8. Network Virtualization with NSX:
• NSX is VMware's network virtualization and security platform. It enables the creation of
virtual networks, allowing for flexible and scalable networking within the virtualized
environment. Micro-segmentation enhances security by implementing fine-grained
access controls.
9. Automation with vRealize Suite:
• VMware offers the vRealize Suite for cloud management, automation, and analytics.
vRealize Automation automates the deployment and management of applications,
vRealize Operations provides performance monitoring and capacity management, and
vRealize Log Insight offers log analytics.
10. Integration with Cloud:
• VMware Cloud Foundation is an integrated cloud infrastructure platform that extends
virtualization to the cloud. It provides a consistent infrastructure across on-premises data
centers and public cloud environments.
11. Desktop and Application Virtualization with Horizon:
• VMware Horizon provides solutions for desktop and application virtualization. It allows
organizations to deliver virtual desktops and applications to end-users, providing
flexibility, centralized management, and support for remote access.
12. Backup and Disaster Recovery:
• VMware environments often integrate with backup and disaster recovery solutions to
ensure the protection of virtualized workloads. Snapshot capabilities and replication
features are commonly used for data protection.
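Much of the day-to-day management described above can also be scripted against the vSphere API. The sketch below uses the community pyVmomi SDK (installed separately, for example with pip install pyvmomi) to list the virtual machines known to vCenter; the host name and credentials are placeholders, and certificate verification is disabled only to keep this lab example short.
```python
# Hedged sketch: list VMs registered in vCenter using the pyVmomi SDK.
# Connection details below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()        # lab use only; verify certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme",
                  sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        print(vm.name, vm.runtime.powerState)  # e.g. "web01 poweredOn"
finally:
    Disconnect(si)
```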
Here are some key components and technologies associated with VMware:
1. VMware vSphere:
• Description: VMware vSphere is a comprehensive virtualization platform that
includes a hypervisor (ESXi), centralized management (vCenter Server), and
various features for creating and managing virtualized data centers.
• Key Components:
• ESXi Hypervisor: A bare-metal hypervisor that runs directly on physical
hardware.
• vCenter Server: Centralized management platform for vSphere
environments.
• vMotion: Live migration of virtual machines between hosts.
• vSAN: Software-defined storage solution for hyper-converged
infrastructure.
2. VMware ESXi:
• Description: ESXi is a bare-metal hypervisor that enables the virtualization of
servers. It runs directly on the hardware without the need for a host operating
system, providing a lightweight and optimized platform for hosting virtual
machines.
• Key Features:
• Hardware-assisted virtualization support.
• Secure isolation between virtual machines.
• Efficient resource utilization.
3. VMware NSX:
• Description: VMware NSX is a network virtualization and security platform that
allows organizations to create virtual networks and implement micro-
segmentation. It provides flexibility, scalability, and enhanced security for
networking in virtualized environments.
• Key Features:
• Network virtualization for agile and scalable networking.
• Micro-segmentation for fine-grained security controls.
• Integration with third-party security solutions.
4. VMware vSAN:
• Description: VMware vSAN is a software-defined storage solution integrated into
the vSphere platform. It aggregates local storage devices from ESXi hosts to create
a shared and distributed datastore, eliminating the need for traditional external
storage.
• Key Features:
• Hyper-converged infrastructure for storage and compute.
• Automatic scaling and distributed storage architecture.
• Enhanced performance and availability.
5. VMware Horizon:
• Description: VMware Horizon is a desktop and application virtualization solution.
It allows organizations to deliver virtual desktops and applications to end-users,
providing flexibility, centralized management, and support for remote access.
• Key Features:
• Virtual Desktop Infrastructure (VDI) for delivering virtual desktops.
• Application virtualization for centralized application management.
• Support for mobile and remote access.
6. VMware Cloud Foundation:
• Description: VMware Cloud Foundation is an integrated cloud infrastructure
platform that combines compute, storage, networking, and cloud management
services. It provides a consistent and unified architecture for both private and
hybrid cloud environments.
• Key Features:
• Integrated stack for private and hybrid cloud deployments.
• Automated lifecycle management of the entire cloud infrastructure.
• Consistent infrastructure across on-premises and cloud environments.
7. VMware vRealize Suite:
• Description: The vRealize Suite is a set of cloud management products for
managing hybrid cloud environments. It includes tools for automation, operations
management, and log analytics.
• Key Components:
• vRealize Automation: Automates the delivery and management of
applications.
• vRealize Operations: Provides performance monitoring and capacity
management.
• vRealize Log Insight: Offers log analytics and troubleshooting
capabilities.
VMware technologies are widely adopted in enterprise environments to build and manage
virtualized infrastructures. They play a crucial role in enhancing operational efficiency,
optimizing resource usage, and providing scalable and secure IT solutions for various business
needs.
Pros of VMware:
1. Comprehensive Virtualization Platform:
• VMware offers a comprehensive virtualization platform that includes various
products such as vSphere, ESXi, and vCenter Server, providing a complete solution
for server virtualization.
2. Performance and Stability:
• VMware is known for its performance and stability. It has been widely adopted in
enterprise environments where reliability is crucial for mission-critical
workloads.
3. Advanced Management Tools:
• VMware provides advanced management tools like vCenter Server, vRealize Suite,
and vSphere Client, offering centralized management and monitoring capabilities
for virtualized environments.
4. Broad Hypervisor Support:
• VMware supports a wide range of operating systems and guest VMs, making it
versatile and suitable for various application workloads.
5. High Availability and Fault Tolerance:
• VMware includes features like High Availability (HA) and Fault Tolerance (FT),
enhancing the availability of virtualized applications and ensuring continuous
operation in case of hardware failures.
6. Resource Optimization:
• VMware's Distributed Resource Scheduler (DRS) helps optimize resource utilization by automatically balancing workloads across host servers, and Distributed Power Management (DPM) can consolidate workloads to power down idle hosts.
7. Security Features:
• VMware offers security features such as VM encryption, vShield, and secure boot
options, contributing to the overall security of virtualized environments.
8. Ecosystem and Integration:
• VMware has a broad ecosystem and integrates with various third-party solutions,
making it compatible with a wide range of storage, networking, and backup
solutions.
9. Community and Support:
• VMware has a large and active community, and there is a wealth of documentation
and community support available. Additionally, VMware provides professional
support for its products.
Cons of VMware:
1. Cost:
• VMware solutions, especially in enterprise editions, can be relatively expensive.
Licensing costs, along with hardware requirements, can contribute to a significant
investment.
2. Resource Intensive:
• VMware's hypervisor and management tools may be resource-intensive
compared to alternative virtualization solutions. This can be a consideration for
environments with limited hardware resources.
3. Vendor Lock-In:
• Using VMware may lead to vendor lock-in as migrating VMs to a different
virtualization platform can be complex. Compatibility issues may arise when
transitioning away from VMware.
4. Complexity for Small Environments:
• In smaller environments, the extensive feature set of VMware may be seen as
complex and unnecessary, especially when compared to simpler and more
lightweight solutions.
5. Learning Curve:
• Implementing and managing VMware solutions might have a steeper learning
curve for administrators who are new to virtualization or are accustomed to other
virtualization platforms.
6. Licensing Complexity:
• VMware's licensing model can be complex, with various editions and features
requiring different license levels. This complexity can be challenging for
organizations to navigate.
7. Limited Free Version:
• While VMware offers a free edition of its hypervisor (ESXi), some advanced
features and management tools are only available in paid editions, potentially
limiting functionality for users of the free version.
Microsoft Hyper-V
Microsoft Hyper-V is a virtualization platform developed by Microsoft. It allows users to create
and manage virtual machines (VMs) on Windows-based servers. Hyper-V is part of the Windows
Server operating system and is also available as a standalone product known as Hyper-V Server.
Hyper-V is a Type 1 hypervisor: once the Hyper-V role is enabled, the hypervisor runs directly on the hardware and the Windows installation becomes the management (parent) partition rather than a traditional host operating system. Here's an overview of how Hyper-V works:
1. Hypervisor Installation:
• Hyper-V is installed on a physical server as a role in the Windows Server operating system
or as a standalone product called Hyper-V Server. Once installed, it becomes the
hypervisor layer that allows for the creation and management of virtualized instances.
2. Virtual Machine Creation:
• Hyper-V enables the creation of virtual machines. A virtual machine is a software
emulation of a physical computer that runs an operating system and applications. Each
VM has its own virtual hardware, including CPU, memory, storage, and network
interfaces.
3. Hyper-V Manager:
• Administrators use Hyper-V Manager, a graphical user interface (GUI) tool, to create,
configure, and manage virtual machines. Hyper-V Manager provides a central point for
monitoring and controlling the virtualized environment.
4. Hypervisor Resource Management:
• Hyper-V manages the allocation of physical resources to virtual machines. This includes
CPU cycles, memory, storage space, and network bandwidth. Hyper-V ensures efficient
resource utilization by dynamically allocating resources based on the needs of running
VMs.
5. Integration Services:
• Integration Services are software components that enhance the interaction between the
host operating system and virtual machines. They improve communication, coordination,
and performance between the host and guest operating systems.
6. Virtual Machine State:
• Each virtual machine has its own operating system and runs applications as if it were a
physical machine. The hypervisor ensures that the VM's state is independent of other VMs
running on the same host, providing isolation.
7. Live Migration:
• Hyper-V supports Live Migration, allowing administrators to move running virtual
machines from one Hyper-V host to another with minimal downtime. This feature is
essential for load balancing, hardware maintenance, and improving resource utilization.
8. Dynamic Memory:
• Hyper-V includes Dynamic Memory, a feature that allows virtual machines to dynamically
adjust their memory allocation based on workload demands. This helps optimize memory
usage across VMs.
9. Hyper-V Replica:
• Hyper-V Replica provides asynchronous replication of virtual machines to a secondary
Hyper-V host. In the event of a primary site failure, administrators can failover to the
replicated VMs on the secondary host to minimize downtime.
10. Checkpoints:
• Hyper-V Checkpoints, formerly known as snapshots, allow administrators to capture the
state of a virtual machine at a specific point in time. This feature is useful for testing, development, and creating recovery points (a short scripted example follows this list).
11. Hyper-V Containers:
• Hyper-V Containers provide lightweight and isolated environments for running
applications. They use Windows Server Containers as the underlying technology, allowing
for efficient deployment and scaling of containerized applications.
12. Nested Virtualization:
• Hyper-V supports nested virtualization, allowing users to run Hyper-V within a virtual
machine. This feature is valuable for testing and development scenarios that involve
multiple layers of virtualization.
13. System Center Integration:
• Microsoft System Center Virtual Machine Manager (SCVMM) is an additional management
tool that integrates with Hyper-V, providing advanced features for large-scale virtualized
environments, such as automation, self-service provisioning, and monitoring.
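In practice, much of the workflow above is automated with the Hyper-V PowerShell module. The hedged sketch below simply calls two standard cmdlets, Get-VM and Checkpoint-VM, from Python; it assumes a Windows host with the Hyper-V role and module installed, and the VM name is a placeholder.
```python
# Hedged sketch: query VMs and take a checkpoint by calling the Hyper-V
# PowerShell module from Python (Windows host with Hyper-V required).
import subprocess

def powershell(command: str) -> str:
    """Run a PowerShell command and return its standard output."""
    result = subprocess.run(["powershell", "-NoProfile", "-Command", command],
                            capture_output=True, text=True, check=True)
    return result.stdout

print(powershell("Get-VM | Select-Object Name, State"))               # list VMs
powershell('Checkpoint-VM -Name "TestVM" -SnapshotName "before-upgrade"')
```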
Pros of Hyper-V:
1. Integration with Microsoft Ecosystem:
• Hyper-V seamlessly integrates with other Microsoft products and technologies,
making it a preferred choice for organizations already invested in the Microsoft
ecosystem.
2. Cost-Effective for Windows Environments:
• Hyper-V is included as a feature in Windows Server editions, making it a cost-
effective solution for organizations already using Windows servers.
3. Windows-Based Management Tools:
• Hyper-V can be managed using familiar Windows-based tools such as Hyper-V
Manager and System Center Virtual Machine Manager (SCVMM), providing a
consistent experience for Windows administrators.
4. Scalability:
• Hyper-V supports scalability with features like failover clustering, live migration,
and support for large numbers of virtual CPUs and memory. It is suitable for both
small businesses and large enterprises.
5. Dynamic Memory Allocation:
• Hyper-V offers dynamic memory allocation, allowing virtual machines to adjust
their memory usage based on demand. This helps optimize resource utilization.
6. Snapshot and Cloning:
• Hyper-V provides snapshot and cloning features, allowing administrators to
create point-in-time copies of virtual machines for backup or testing purposes.
7. Live Migration:
• Hyper-V supports live migration, enabling the movement of virtual machines
between Hyper-V hosts with minimal downtime. This is useful for load balancing
and maintenance.
8. Windows Containers Support:
• Hyper-V includes support for Windows Containers, facilitating the deployment
and management of containerized applications alongside traditional virtual
machines.
9. Replication:
• Hyper-V offers replication features, allowing organizations to replicate virtual
machines to a secondary site for disaster recovery purposes.
10. Nested Virtualization:
• Hyper-V supports nested virtualization, allowing virtualization inside virtual
machines. This is useful for testing and development scenarios.
Cons of Hyper-V:
1. Limited Hypervisor Support:
• Hyper-V primarily supports Windows-based operating systems. While it does
provide some support for Linux and other OSes, it may not be as versatile as other
hypervisors in mixed environments.
2. Resource Overhead:
• Some users report a relatively higher resource overhead compared to other
hypervisors, which might affect the overall performance of the host system.
3. Complexity for Linux Environments:
• Administrators accustomed to Linux-based virtualization solutions may find
Hyper-V more complex, leading to a steeper learning curve.
4. Management Tools Limitations:
• While Hyper-V Manager is suitable for basic management tasks, advanced
features may require the use of System Center Virtual Machine Manager
(SCVMM), which is a separate product and may add to the overall cost.
5. Vendor Lock-In:
• Organizations heavily invested in the Microsoft ecosystem may experience
vendor lock-in, making it challenging to switch to alternative virtualization
platforms.
6. Free Edition Limitations:
• The free edition (Hyper-V Server) provides only a minimal, command-line-based environment without the local graphical tools or the guest Windows Server licensing included with paid Windows Server editions, so it is typically managed remotely.
7. Third-Party Ecosystem:
• The third-party ecosystem around Hyper-V might be smaller compared to some
other virtualization platforms, limiting the availability of specialized tools and
integrations.
8. Linux Performance:
• While Hyper-V has made strides in supporting Linux distributions, some users
report better performance for Linux VMs on alternative hypervisors.
KVM
The Kernel-based Virtual Machine (KVM) is a virtualization infrastructure for the Linux kernel
that enables the running of multiple virtual machines (VMs) on a single physical host. KVM
leverages hardware virtualization extensions, such as Intel VT-x and AMD-V, to provide efficient
and secure virtualization. Here's an overview of how KVM works:
1. Kernel Module:
• KVM is implemented as a kernel module that is part of the Linux kernel. When KVM is
loaded as a module, it turns the host operating system into a hypervisor, allowing it to
manage virtual machines.
2. Hardware Virtualization Extensions:
• KVM relies on hardware virtualization extensions provided by modern processors (e.g.,
Intel VT-x, AMD-V). These extensions enhance the performance of virtualization by
offloading certain tasks to the hardware, such as running guest operating systems in a
more isolated and efficient manner.
3. Creation of Virtual Machines:
• Users can create virtual machines using KVM, each running its own guest operating
system. KVM supports various guest operating systems, including Linux distributions,
Windows, and others. Virtual machines are isolated from each other and run as separate
processes on the host.
4. QEMU Integration:
• KVM often works in conjunction with QEMU (Quick Emulator), a user-level emulator that
provides device emulation for virtual machines. QEMU handles tasks such as disk and
network I/O emulation, while KVM accelerates the CPU-related aspects of virtualization.
This combination is known as QEMU-KVM.
5. Device Emulation:
• QEMU provides device emulation for virtual machines, allowing them to interact with
emulated hardware components such as virtual disks, network interfaces, and graphics
adapters. This helps abstract the hardware differences between the host and guest
systems.
6. VirtIO Drivers:
• VirtIO is a set of paravirtualized drivers that improve the performance of I/O operations
between the guest and host systems. VirtIO drivers are commonly used with KVM to
enhance storage and network performance in virtualized environments.
7. Memory Management:
• KVM manages memory resources for virtual machines, allocating and deallocating
memory as needed. It utilizes the host's memory management mechanisms to provide
efficient memory isolation between virtual machines.
8. CPU Scheduling:
• KVM handles CPU scheduling for virtual machines, allowing them to share the physical
CPU cores of the host. It uses the host operating system's scheduler to allocate CPU time
to each virtual machine.
9. Bridged Networking:
• KVM allows virtual machines to connect to the host's physical network through bridged
networking. This enables virtual machines to communicate with other devices on the
network as if they were physical machines.
10. Live Migration:
• KVM supports live migration, allowing virtual machines to be moved from one physical
host to another without significant downtime. This feature is beneficial for load balancing,
maintenance, and achieving high availability.
11. Management Tools:
• Various management tools exist for working with KVM, including command-line utilities
like virsh, graphical user interfaces like virt-manager, and orchestration tools. These
tools provide administrators with ways to create, manage, and monitor virtual machines.
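For example, a small script can wrap the virsh command-line tool mentioned above (part of libvirt) to inspect guests. It assumes libvirt and KVM are installed and that a domain with the given name has been defined; the domain name below is a placeholder.
```python
# Hedged sketch: basic KVM guest inspection by wrapping the `virsh` CLI
# (requires libvirt; the domain name is a placeholder).
import subprocess

def virsh(*args: str) -> str:
    """Run a virsh subcommand and return its standard output."""
    result = subprocess.run(["virsh", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout

print(virsh("list", "--all"))          # show all defined guests and their state
print(virsh("dominfo", "demo-guest"))  # vCPUs, memory, and state for one guest
```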
Pros of KVM
1. Open Source:
• KVM is open-source software, meaning it is freely available, and its source code
can be modified and distributed. This fosters a collaborative community and
allows for transparency.
2. Integration with Linux Kernel:
• As KVM is integrated into the Linux kernel, it benefits from ongoing kernel
advancements and optimizations. This integration ensures close alignment with
the Linux ecosystem.
3. Performance:
• KVM is known for its high performance, especially when leveraging hardware
virtualization extensions. It allows for near-native performance for virtual
machines in many scenarios.
4. Support for Hardware Virtualization Extensions:
• KVM takes advantage of hardware virtualization extensions (Intel VT-x and AMD-
V) to improve performance and efficiency in running virtual machines.
5. Versatility:
• KVM supports a wide range of guest operating systems, including Linux
distributions, Windows, and other operating systems, making it versatile for
different use cases.
6. Live Migration:
• KVM supports live migration, allowing virtual machines to be moved from one
host to another without downtime. This is useful for load balancing and
maintenance purposes.
7. Snapshot and Cloning:
• KVM provides snapshot and cloning features, allowing administrators to create
point-in-time copies of virtual machines for backup or testing purposes.
8. Management Tools:
• KVM can be managed using various tools, including libvirt, Virt-manager, and
oVirt. These tools provide a graphical user interface and a programmatic interface
for managing virtualized environments.
9. Security:
• KVM benefits from the security features of the Linux kernel, and its development
community actively addresses security vulnerabilities. Security features like
SELinux can be used to enhance security in virtualized environments.
10. Nested Virtualization:
• KVM supports nested virtualization, allowing virtualization inside virtual
machines. This is useful for testing and development scenarios.
Cons of KVM:
1. Learning Curve:
• Setting up and configuring KVM may have a steeper learning curve, especially for
administrators who are new to virtualization or are accustomed to other
hypervisors.
2. Graphical User Interface (GUI):
• While tools like Virt-manager provide a GUI for managing KVM, some users may
find the GUI less polished or feature-rich compared to those of commercial
solutions.
3. Ecosystem Size:
• The third-party ecosystem around KVM, in terms of management tools and
integrations, may be smaller compared to some commercial hypervisors.
4. Windows Guest Integration:
• While KVM supports Windows as a guest operating system, the integration may
not be as seamless as with some other hypervisors. Specific tools and drivers may
be required for optimal Windows performance.
5. Commercial Support:
• While there are community-driven forums and support options, commercial
support for KVM may not be as extensive as that available for some commercial
virtualization solutions.
Xen
Xen is an open-source, type-1 hypervisor that provides virtualization capabilities for running
multiple virtual machines (VMs) on a single physical host. As a type-1 hypervisor, Xen runs
directly on the hardware without the need for a host operating system, offering a lightweight and
efficient virtualization solution. Here are detailed aspects of Xen virtualization:
1. Hypervisor Architecture:
• Xen is a bare-metal hypervisor that operates directly on the hardware. It consists of a
small, privileged component called the hypervisor, which is responsible for managing
virtual machines and their access to physical resources.
2. Domain Model:
• Xen employs a domain-based model. The first domain, Domain 0 (Dom0), is privileged
and runs a modified Linux kernel. Dom0 is responsible for managing other unprivileged
domains (DomU) that run guest operating systems. Dom0 has direct access to hardware
and serves as the administrative domain.
3. Paravirtualization:
• Xen introduced the concept of paravirtualization, where guest operating systems are
modified to be aware of the hypervisor. This modification enhances communication
between the hypervisor and guest OS, improving performance and resource utilization.
4. Hardware Virtualization Extensions:
• Xen supports both paravirtualization and hardware-assisted virtualization using
technologies like Intel VT-x and AMD-V. This enables Xen to run unmodified guest
operating systems alongside paravirtualized ones, providing flexibility and compatibility.
5. Resource Allocation and Scheduling:
• Xen uses a credit-based scheduler for CPU resource allocation. Each virtual machine is
assigned a credit value, and the scheduler allocates CPU time based on these credits. This
ensures fair distribution of resources among VMs.
6. Memory Management:
• Xen manages memory resources for virtual machines. It provides isolation between VMs,
and each VM runs in its own memory space. Xen employs techniques like ballooning,
where memory can be dynamically adjusted between VMs.
7. Device Emulation and Pass-Through:
• Xen provides device emulation for virtual machines, allowing them to interact with
emulated hardware components. It also supports hardware pass-through, enabling direct
access to physical devices from VMs for improved performance.
8. Live Migration:
• Xen supports live migration, allowing a running virtual machine to be moved from one
physical host to another without downtime. Live migration facilitates load balancing,
hardware maintenance, and achieving high availability.
9. XenStore:
• XenStore is a shared storage system used for communication between Dom0 and DomU.
It stores configuration information, runtime data, and other parameters essential for the
interaction between the domains.
10. Virtual Networking:
• Xen provides various networking configurations, including bridged networking and
network address translation (NAT). Virtual machines can communicate with each other
and the external network using these networking options.
11. Community and Ecosystem:
• The Xen Project is an open-source community governed by the Linux Foundation. It has a
diverse ecosystem with contributions from organizations and individuals worldwide,
ensuring continuous development and support.
12. Xen Management Tools:
• Xen can be managed using command-line tools like xl (Xen Light), graphical user interfaces like virt-manager, and web-based management tools like Xen Orchestra (a short libvirt-based example follows after this list).
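As a small illustration of the management tools described in point 12, the following hedged Python sketch uses libvirt's Xen (libxl) driver to list Dom0 and the DomU guests on a Xen host. The xen:///system connection URI is the conventional one for the libxl driver and is an assumption about the environment.

# Minimal sketch: query a Xen host through libvirt's libxl driver and report
# Dom0 plus the DomU guests. The xen:///system URI is the conventional one for
# the libxl driver; adjust it to match your environment.
import libvirt

def report_xen_domains(uri="xen:///system"):
    conn = libvirt.openReadOnly(uri)
    try:
        for dom in conn.listAllDomains():
            # Domain ID 0 is the privileged control domain (Dom0)
            role = "Dom0 (control domain)" if dom.ID() == 0 else "DomU (guest)"
            print(f"id={dom.ID():3d} {dom.name():20s} {role}")
    finally:
        conn.close()

if __name__ == "__main__":
    report_xen_domains()

The same libvirt API also works against KVM hosts, which is one reason tools such as virt-manager can manage both hypervisors.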
Pros of Xen:
1. Open Source and Community Support:
• Xen is open-source software, and its development is backed by a community of
contributors. This fosters innovation, transparency, and a collaborative approach
to improving the hypervisor.
2. Paravirtualization:
• Xen introduced paravirtualization, where guest operating systems are modified
to be aware of the hypervisor. This can result in improved performance and
efficiency compared to full virtualization.
3. Strong Isolation:
• Xen provides strong isolation between virtual machines, ensuring that one VM
cannot access the memory or resources of another without proper authorization.
4. Versatility:
• Xen is versatile and can run on various hardware architectures. It supports both
paravirtualization and hardware-assisted virtualization, providing flexibility in
deployment.
5. Resource Efficiency:
• Xen is known for its resource efficiency, making optimal use of hardware
resources. The hypervisor is designed to minimize overhead and maximize
performance.
6. Live Migration:
• Xen supports live migration, allowing a running virtual machine to be moved from
one physical host to another without downtime. This is useful for load balancing,
maintenance, and achieving high availability.
7. Active Development:
• The Xen Project has an active development community, ensuring that the
hypervisor continues to evolve with updates, security patches, and new features.
8. Hardware Virtualization Support:
• Xen leverages hardware virtualization extensions (e.g., Intel VT-x and AMD-V) for
enhanced performance and compatibility with unmodified guest operating
systems.
9. Credit-Based Scheduler:
• Xen uses a credit-based scheduler to allocate CPU resources among virtual machines based on their configured weights, which helps ensure fair distribution of resources (a simplified sketch of this idea follows after this list).
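To clarify how a credit-based scheduler distributes CPU time by weight, here is a deliberately simplified conceptual sketch. It is not Xen's actual scheduler code; it only shows the core idea that credits are replenished in proportion to each VM's weight and that the VM holding the most credit runs next.

# Conceptual sketch of weight-proportional, credit-based CPU scheduling.
# A teaching simplification, not Xen's real scheduler: credits are replenished
# in proportion to each VM's weight, and the VM with the most remaining credit
# is chosen for the next time slice and debited for the time it consumes.
TIME_SLICE_MS = 30            # length of one scheduling quantum
REPLENISH_TOTAL = 300         # credits handed out per accounting period

def replenish(vms):
    total_weight = sum(vm["weight"] for vm in vms)
    for vm in vms:
        vm["credit"] += REPLENISH_TOTAL * vm["weight"] // total_weight

def run_period(vms, slices_per_period=10):
    schedule = []
    for _ in range(slices_per_period):
        runnable = [vm for vm in vms if vm["credit"] > 0]
        if not runnable:              # everyone is out of credit: start a new period
            replenish(vms)
            runnable = vms
        nxt = max(runnable, key=lambda vm: vm["credit"])
        nxt["credit"] -= TIME_SLICE_MS
        schedule.append(nxt["name"])
    return schedule

if __name__ == "__main__":
    vms = [{"name": "vm-a", "weight": 2, "credit": 0},
           {"name": "vm-b", "weight": 1, "credit": 0}]
    replenish(vms)
    print(run_period(vms))

With the weights shown, vm-a ends up with roughly twice as many time slices as vm-b, mirroring the weight-proportional behaviour described above.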
Cons of Xen:
1. Learning Curve:
• Xen may have a steeper learning curve for administrators who are new to
virtualization or are accustomed to other hypervisors. Configuring and managing
Xen requires understanding its specific concepts and terminology.
2. Less User-Friendly Interface:
• Xen's management interfaces, while functional, may be considered less user-
friendly compared to some commercial hypervisors. The absence of a centralized
graphical user interface may require more command-line interaction.
3. Sparse Documentation for Newer Features:
• While Xen has extensive documentation, some newer features or niche
configurations may have less comprehensive documentation compared to more
widely used hypervisors.
4. Limited Ecosystem:
• The third-party ecosystem around Xen may be smaller compared to some
commercial hypervisors, limiting the availability of specialized tools and
integrations.
5. Potential Compatibility Issues:
• While Xen supports a wide range of guest operating systems, there may be occasional compatibility issues, especially with newer or less common distributions.
6. Integration Challenges:
• Integrating Xen into existing environments may pose challenges, particularly if an
organization is migrating from a different virtualization solution.
Backup and Recovery: Compatible with various third-party backup solutions | Integrates with DPM (Data Protection Manager) and third-party solutions
Market Share: Historically dominant in the virtualization market | Gaining market share, especially in Windows-centric environments