CCS335-Cloud-Computing-QB - Unit 3, 4 & 5
EnggTree.com
UNIT III
VIRTUALIZATION INFRASTRUCTURE AND DOCKER
SYLLABUS: Desktop Virtualization – Network Virtualization – Storage
Virtualization – System-level of Operating Virtualization – Application Virtualization
– Virtual clusters and Resource Management – Containers vs. Virtual Machines –
Introduction to Docker – Docker Components – Docker Container – Docker Images
and Repositories.
PART A
2 Marks
The guest can share the network interface of the host and use Network Address
Translation (NAT) to access the network; the virtual machine manager can emulate,
and install on the host, an additional network device together with its driver; or the
guest can have a private network only with the guest.
2. What is hardware-level virtualization? BTL1
3. Define hypervisor. BTL1
There are different techniques for storage virtualization, one of the most popular
being network-based virtualization by means of storage area networks (SANs). SANs
use a network-accessible device through a large-bandwidth connection to provide
storage facilities.
technique, users do not have to be worried about the specific location of their data,
which can be identified using a logical path.
Network Migration
REGULATION 2021
ACADEMIC YEAR 2023-2024
Containers and virtual machines are two types of virtualization technologies that share
many similarities.
Virtualization is a process that allows a single resource, such as RAM, CPU, Disk, or
Networking, to be virtualized and represented as multiple resources.
However, the main difference between containers and virtual machines is that virtual
machines virtualize the entire machine, including the hardware layer, while containers
only virtualize software layers above the operating system level.
Bridge: This is the default network driver and is suitable when containers on the
same Docker host need to communicate with each other.
Host: This driver is used when there is no need for network isolation between the
container and the host.
Overlay: This driver allows swarm services to communicate with each other.
None: This driver disables all networking.
Macvlan: This driver assigns a Media Access Control (MAC) address to a container,
so that the container appears as a physical device on the network.
13. What is the purpose of Docker Hub? BTL1
Docker Hub is a cloud-based repository service where users can push their
Docker container images and access them from anywhere via the internet. It offers the
option to keep images private or public and is primarily used by DevOps teams.
Docker Hub is a hosted service that can be used from any operating system. It
functions as a storage system for Docker images and allows users to pull the required
images when needed.
PART B
13 Marks
environment accessible from everywhere.
Although the term desktop virtualization strictly refers to the ability to remotely
access a desktop environment, generally the desktop environment is stored in a
remote server or a data center that provides a high availability infrastructure and
ensures the accessibility and persistence of the data.
In this scenario, an infrastructure supporting hardware virtualization is fundamental
to provide access to multiple desktop environments hosted on the same server. A
specific desktop environment is stored in a virtual machine image that is loaded and
started on demand when a client connects to the desktop environment. This is a typical
cloud computing scenario in which the user leverages the virtual infrastructure for
performing daily tasks on their computer. The advantages of desktop virtualization
are high availability, persistence, accessibility, and ease of management.
The basic services for remotely accessing a desktop environment are implemented in
software components such as Windows Remote Services, VNC, and X Server.
Infrastructures for desktop virtualization based on cloud computing solutions include
Sun Virtual Desktop Infrastructure (VDI), Parallels Virtual Desktop Infrastructure
(VDI), Citrix XenDesktop, and others.
Network virtualization combines hardware appliances and specific software for the
creation and management of a virtual network. Network virtualization can aggregate
different physical networks into a single logical network (external network
virtualization) or provide network-like functionality to an operating system partition
(internal network virtualization). The result of external network virtualization is
generally a virtual LAN (VLAN).
A VLAN is an aggregation of hosts that communicate with each other as though they
were located under the same broadcasting domain. Internal network virtualization is
generally applied together with hardware and operating system-level virtualization, in
which the guests obtain a virtual network interface to communicate with.
There are several options for implementing internal network virtualization:
1. The guest can share the same network interface of the host and use Network
Address Translation (NAT) to access the network.
2. The virtual machine manager can emulate, and install on the host, an additional
network device, together with the driver.
3. The guest can have a private network only with the guest.
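The NAT option can be illustrated with a toy translation table. This is only a sketch under stated assumptions: the class name NatGateway, the starting port 40000, and the example addresses are hypothetical and do not describe how any particular virtual machine manager implements NAT.

```python
from dataclasses import dataclass, field


@dataclass
class NatGateway:
    """Toy NAT: all guests share the host's single external address."""
    host_ip: str
    next_port: int = 40000  # hypothetical first port used for translated flows
    # (guest_ip, guest_port) -> host port chosen for the outgoing flow
    outbound: dict = field(default_factory=dict)
    # host port -> (guest_ip, guest_port), used to route replies back
    inbound: dict = field(default_factory=dict)

    def translate_out(self, guest_ip, guest_port):
        """Rewrite a guest source address to the shared host address."""
        key = (guest_ip, guest_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.host_ip, self.outbound[key]

    def translate_in(self, host_port):
        """Find which guest a reply arriving on host_port belongs to."""
        return self.inbound.get(host_port)


nat = NatGateway(host_ip="203.0.113.10")
print(nat.translate_out("10.0.2.15", 5000))  # guest traffic leaves with the host address
print(nat.translate_in(40000))               # the reply is mapped back to the guest
```

The point of the sketch is that remote systems only ever see the host address, while the table lets replies reach the correct guest.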
Using this technique, users do not have to be worried about the specific location of
their data, which can be identified using a logical path.
There are different techniques for storage virtualization, one of the most popular
being network-based virtualization by means of storage area networks (SANs).
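The idea that data is identified by a logical path, independent of its physical location, can be sketched as a simple lookup table. The class StorageVirtualizer, the device names, and the block numbers below are hypothetical illustrations, not part of any SAN product.

```python
class StorageVirtualizer:
    """Maps logical paths to physical locations (device, block address)."""

    def __init__(self):
        self._mapping = {}

    def place(self, logical_path, device, block):
        """Record where the data for a logical path physically lives."""
        self._mapping[logical_path] = (device, block)

    def resolve(self, logical_path):
        """Return the current physical location for a logical path."""
        return self._mapping[logical_path]

    def migrate(self, logical_path, new_device, new_block):
        """Move the data; the logical path seen by the user does not change."""
        self._mapping[logical_path] = (new_device, new_block)


pool = StorageVirtualizer()
pool.place("/projects/report.doc", device="san-array-1", block=2048)
pool.migrate("/projects/report.doc", new_device="san-array-2", new_block=64)
print(pool.resolve("/projects/report.doc"))
```

The user keeps using the same path before and after the migration; only the mapping held by the virtualization layer changes.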
Windows platform. Wine features a software application acting as a container for the
guest application and a set of libraries, called Winelib, that developers can use
to compile applications to be ported on Unix systems.
Wine takes its inspiration from a similar product from Sun, Windows Application
Binary Interface (WABI), which implements the Win 16
hosting operating system.
Virtual clusters are built using virtual machines installed across one or more physical
clusters, logically interconnected by a virtual network across several physical
networks.
Physical node failures may disable some virtual machines, but virtual machine
failures will not affect the host system
The system should be capable of quick deployment, which involves creating and
distributing software stacks (including the OS, libraries, and applications) to physical
nodes within clusters, as well as rapidly switching runtime environments between
virtual clusters for different users.
When a user is finished using their system, the corresponding virtual cluster should be
quickly shut down or suspended to free up resources for other users. The concept of
"green computing" has gained attention recently, which focuses on reducing energy
costs by applying energy-efficient techniques across clusters of homogeneous
workstations and specific applications. Live migration of VMs allows workloads to be
transferred from one node to another, but ...
Virtual clustering provides a flexible solution for building clusters consisting of both
physical and virtual machines.
It is widely used in various computing systems such as cloud platforms,
high-performance computing systems, and computational grids.
Virtual clustering enables the rapid deployment of resources upon user demand or in
response to node failures. There are four different ways to manage virtual clusters:
having the cluster manager reside on the guest systems, having it reside on the host
systems, using independent cluster managers, or using an integrated cluster manager
designed to distinguish between virtualized and physical resources.
In the event of a VM failure, another VM running with the same guest OS can replace
it on a different node. During migration, the VM state file is copied from the storage
area to the host machine
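The replacement rule stated above (same guest OS, different node) can be sketched as a small selection function. The dictionary keys name, guest_os, and node are hypothetical; a real cluster manager would also consider load, memory, and storage access.

```python
def pick_replacement(failed_vm, candidates):
    """Return the first candidate with the same guest OS on a different node."""
    for vm in candidates:
        same_os = vm["guest_os"] == failed_vm["guest_os"]
        other_node = vm["node"] != failed_vm["node"]
        if same_os and other_node:
            return vm
    return None  # no suitable standby VM is available


failed = {"name": "vm-a", "guest_os": "linux", "node": "node-1"}
standby = [
    {"name": "vm-b", "guest_os": "linux", "node": "node-1"},    # same node, skipped
    {"name": "vm-c", "guest_os": "windows", "node": "node-2"},  # different OS, skipped
    {"name": "vm-d", "guest_os": "linux", "node": "node-3"},    # matches both conditions
]
print(pick_replacement(failed, standby)["name"])
```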
Memory Migration
Network Migration
4.1 Memory Migration
One crucial aspect of VM migration is memory migration, which involves moving the
memory instance of a VM from one physical host to another.
The efficiency of this process depends on the characteristics of the
application/workloads supported by the guest OS. In today's systems, memory
migration can range from a few hundred megabytes to several gigabytes.
The Internet Suspend-Resume (ISR) technique takes advantage of temporal locality,
where memory states are likely to have significant overlap between the suspended and
resumed instances of a VM. The ISR technique represents each file in the file system
as a tree of subfiles, with a copy existing in both the suspended and resumed VM
instances.
By caching only the changed files, this approach minimizes transmission overhead.
However, the ISR technique is not suitable for situations where live machine
migration is necessary, as it results in high downtime compared to other techniques.
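The "transfer only what changed" idea behind ISR can be sketched by comparing content digests. This is a simplification under stated assumptions: files are modelled as flat name-to-bytes dictionaries rather than the tree of subfiles described above, and the function names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content fingerprint used to detect whether a file has changed."""
    return hashlib.sha256(data).hexdigest()


def files_to_transfer(suspended: dict, cached: dict) -> list:
    """Names of files in the suspended VM that differ from the cached copy."""
    changed = []
    for name, data in suspended.items():
        old = cached.get(name)
        if old is None or digest(old) != digest(data):
            changed.append(name)
    return sorted(changed)


suspended_state = {"kernel.img": b"v1", "home/notes.txt": b"edited", "swap.bin": b"new"}
cached_at_resume_site = {"kernel.img": b"v1", "home/notes.txt": b"original"}
print(files_to_transfer(suspended_state, cached_at_resume_site))
```

Unchanged files (here kernel.img) are never sent, which is where the reduction in transmission overhead comes from.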
For a system to support VM migration, it must ensure that each VM has a consistent
and location-independent view of the file system that is available on all hosts.
One possible approach is to assign each VM its own virtual disk and map the file
system to it.
However, due to the increasing capacity of disks, it's not feasible to transfer the entire
contents of a disk over a network during migration.
Another alternative is to implement a global file system that is accessible across all
machines, where a VM can be located without the need to copy files between
machines.
Network Migration
When a VM is migrated to a new physical host, it is important that any open network
connections are maintained without relying on forwarding mechanisms or support
from mobility or redirection mechanisms on the original host.
To ensure remote systems can locate and communicate with the migrated VM, it must
be assigned a virtual IP address that is known to other entities.
This virtual IP address can be different from the IP address of the host machine where
the VM is currently located.
Additionally, each VM can have its own virtual MAC address, and the VMM
maintains a mapping of these virtual IP and MAC addresses to their corresponding
VMs in an ARP table.
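The mapping kept by the VMM can be sketched as a table keyed by virtual IP address. The class VmmAddressTable and the addresses are hypothetical; the sketch only shows that the virtual IP and MAC stay fixed while the physical host changes.

```python
class VmmAddressTable:
    """Toy table mapping virtual IP/MAC addresses to VMs and their hosts."""

    def __init__(self):
        self._by_ip = {}

    def register(self, vm, virtual_ip, virtual_mac, host):
        """Record a VM's virtual addresses and the host it runs on."""
        self._by_ip[virtual_ip] = {"vm": vm, "mac": virtual_mac, "host": host}

    def migrate(self, virtual_ip, new_host):
        """Only the physical host changes; the virtual addresses are kept."""
        self._by_ip[virtual_ip]["host"] = new_host

    def lookup(self, virtual_ip):
        """Return a copy of the entry that remote systems would resolve."""
        return dict(self._by_ip[virtual_ip])


table = VmmAddressTable()
table.register("web-vm", "10.1.0.5", "02:00:00:aa:bb:01", host="host-a")
table.migrate("10.1.0.5", new_host="host-b")
print(table.lookup("10.1.0.5"))
```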
At Duke University, COD was developed to enable dynamic resource allocation with
a virtual cluster management system, and at Purdue University, the VIOLIN cluster
was constructed to demonstrate the benefits of dynamic adaptation using multiple VM
clustering.
with each other through well-defined channels.
Unlike virtual machines, all containers share a single operating system kernel, which
results in lower resource consumption.
They rely on the underlying OS kernel, which makes them lightweight, and they share
resources with other containers on the same host OS while providing OS-level process
isolation. On the other hand, virtual machines run on hypervisors, which allow
multiple VMs to run on a single machine, each with its own operating system.
Each VM has its own copy of an operating system along with the application and
necessary binaries, making it significantly larger and requiring more resources.
VMs provide hardware-level process isolation, but they are slow to boot.
Key Terminologies
A Docker Image is a file containing multiple layers of instructions used to create and
run a Docker container. It provides a portable and reproducible way to package and
distribute applications.
A Docker Container is a lightweight and isolated runtime environment created from
an image. It encapsulates an application and its dependencies, providing a consistent
and predictable environment for running the application.
A Dockerfile is a text file that contains a set of instructions
to build a Docker Image.
It defines the base image, application code, dependencies, and configuration needed
to create a custom Docker Image.
Docker Engine is the software that enables the creation and management of Docker
containers. It consists of three main components: the Docker daemon (server), a
REST API, and the command-line client.
Docker Hub is a cloud-based registry that provides a centralized platform for storing,
sharing, and discovering Docker Images. It offers a vast collection of pre-built Docker
Images that developers can use to build, test, and deploy their applications.
Features of Docker
Open-source platform
An easy, lightweight, and consistent way of delivering applications
Fast and efficient development life cycle
Segregation of duties
Service-oriented architecture
Security
Scalability
Reduction in size
Image management
Networking
Volume management
Docker implements a client-server model where the Docker client communicates with
the Docker daemon to create, manage, and distribute containers.
The Docker client can be installed on the same system as the daemon or connected
remotely.
Communication between the client and daemon occurs through a REST API, either
over a UNIX socket or a network.
The Docker daemon is responsible for managing various Docker services and
communicates with other daemons to do so. Using Docker's API requests, the daemon
manages Docker objects such as images, containers, networks, and volumes.
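The client-daemon exchange over a UNIX socket can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions: the helper names UnixHTTPConnection and daemon_get are hypothetical, the socket path /var/run/docker.sock is the usual default on Linux but may differ on a given system, and the example assumes the daemon answers GET /version with JSON.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that runs over a UNIX domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the default TCP connect with a UNIX socket connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def daemon_get(path, socket_path="/var/run/docker.sock"):
    """Send one GET request to the daemon and decode the JSON reply."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()


# Usage on a machine where the daemon is running and the socket is readable:
#   status, info = daemon_get("/version")
```

The docker command-line client performs the same kind of request on the user's behalf; the sketch only makes the REST layer visible.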
Docker Client
The Docker client allows users to interact with Docker and utilize its functionalities. It
communicates with the Docker daemon using the Docker API.
The Docker client has the capability to communicate with multiple daemons. When a
user runs a Docker command on the terminal, the instructions are sent to the daemon.
The Docker daemon receives these instructions in the form of commands and REST
API requests from the Docker client.
The primary purpose of the Docker client is to facilitate actions such as pulling
images from the Docker registry and running them on the Docker host.
Commonly used commands by Docker clients include docker build, docker pull, and
docker run.
Docker Host
Docker Registry
Docker images are stored in the Docker registry, which can either be a public registry
like Docker Hub, or a private registry that can be set up.
To obtain required images from a configured registry, the 'docker run' or 'docker pull'
commands can be used. Conversely, to push images into a configured registry, the
'docker push' command can be used.
Docker Objects
When working with Docker, various objects such as images, containers, volumes, and
networks are created and utilized.
Docker Images
A Docker image is a set of instructions used to create a container, serving as a
read-only template that can store and transport applications.
Images play a critical role in the Docker ecosystem by enabling collaboration among
developers in ways that were previously impossible.
Docker Storage
Docker storage is responsible for storing data within the writable layer of the
container, and this function is carried out by a storage driver. The storage driver is
responsible for managing and controlling the images and containers on the Docker
host. There are several types of Docker storage.
o Data Volumes, which can be mounted directly into the container's filesystem, are
essentially directories or files on the Docker Host filesystem.
o Volume Container is used to maintain the state of the containers' data produced by
the running container, where Docker volumes file systems are mounted on Docker
containers. These volumes are stored on the host, making it easy for users to exchange
file systems among containers and backup data.
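The relationship between read-only image layers, the container's writable layer, and a host-backed volume can be modelled in a few lines. This is a conceptual sketch only: the Container class and the paths are hypothetical, and real storage drivers work on filesystems, not dictionaries.

```python
from collections import ChainMap


class Container:
    """Toy model: read-only image layers + one writable layer + volumes."""

    def __init__(self, image_layers, volumes=None):
        # Lookups go top layer first; writes land only in the first (writable) dict.
        self._fs = ChainMap({}, *reversed(image_layers))
        self._volumes = volumes or {}  # mount point -> host-side store

    def _volume_for(self, path):
        for mount, store in self._volumes.items():
            if path.startswith(mount.rstrip("/") + "/"):
                return store
        return None

    def write(self, path, data):
        store = self._volume_for(path)
        if store is not None:
            store[path] = data      # persists on the host
        else:
            self._fs[path] = data   # lives only in this container's writable layer

    def read(self, path):
        store = self._volume_for(path)
        return store[path] if store is not None else self._fs[path]


image = [{"/etc/os-release": "base"}, {"/app/main.py": "v1"}]
host_volume = {}
first = Container(image, volumes={"/data": host_volume})
first.write("/app/main.py", "patched")   # does not modify the image
first.write("/data/db.txt", "rows")      # stored in the volume
second = Container(image, volumes={"/data": host_volume})
print(second.read("/app/main.py"), second.read("/data/db.txt"))
```

A second container started from the same image sees the original file but still finds the data written to the volume, which is why volumes are used for state that must outlive a container.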
Docker networking
Docker networking provides complete isolation for containers, allowing users to link
them to multiple networks with minimal OS instances required to run workloads.
There are different types of Docker networks available, including:
o Bridge: This is the default network driver and is suitable when containers on the
same Docker host need to communicate with each other.
o Host: This network is used when there is no need for isolation between the
container and the host.
o Overlay: This network allows swarm services to communicate with each other.
o None: This network disables all networking.
By default, a container is isolated from other containers and its host machine. It is
possible to control the level of isolation for a container's network, storage or other
underlying subsystems from other containers or from the host machine.
Any changes made to a container's state that are not stored in persistent storage will
be lost once the container is removed.
Docker provides a consistent environment for running applications from design and
development to production and maintenance, which eliminates production issues and
allows developers to focus on introducing quality features instead of debugging errors
and resolving configuration/compatibility issues.
Docker also allows for instant creation and deployment of containers for every
process, without needing to boot the OS, which saves time and increases agility.
Creating, destroying, stopping or starting a container can be done with ease, and
YAML configuration files can be used to automate deployment and scale the
infrastructure.
Docker enables significant infrastructure cost reduction, with minimal costs for
running applications when compared with VMs and other technologies. This can lead
to increased ROI and operational cost savings with smaller engineering teams.
PART C
15 Marks
history and originally was used in 1966 for the implementation of Basic Combined
Programming Language (BCPL), a language for writing compilers and one of the
ancestors of the C programming language.
Other important examples of the use of this technology have been the
UCSD Pascal and Smalltalk
The Java virtual machine was originally designed for the execution of programs
written in the Java language, but other languages such as
The ability to support multiple programming languages has been one of the
key elements of the Common Language Infrastructure (CLI), which is the
specification behind .NET Framework
This is a particular form of virtualization and serves the same purpose of storage
virtualization by providing a better quality of service rather than emulating a different
environment.
3.6.3 Virtualization Support and Disaster Recovery
fundamental infrastructure are virtualized. The user will not care about the
computing resources that are used for providing the services.
Cloud users do not need to know and have no way to discover physical
resources that are involved while processing a service request. In addition,
application developers do not care about some infrastructure issues such as
scalability and fault tolerance. Application developers focus on service logic. In
many cloud computing systems, virtualization software is used to virtualize the
hardware. System virtualization software is a special kind of software which
simulates the execution of hardware and runs even unmodified
operating systems.
Cloud computing systems use virtualization software as the running
environment for legacy software such as old operating systems and unusual
applications.
3. Hardware Virtualization
Virtualization software is also used as the platform for developing new cloud
applications that enable developers to use any operating systems and programming
environments they like.
The development environment and deployment environment can now be the same,
which eliminates some runtime problems.
VMs provide flexible runtime services to free users from worrying about the system
environment.
An environment that meets one user's requirements often cannot satisfy another user.
Virtualization allows us to have full privileges while keeping them separate.
Users have full access to their own VMs, which are completely separate from other
users' VMs.
Multiple VMs can be mounted on the same physical server. Different VMs
may run with different OSes.
These managers handle loads, resources, security, data, and provisioning functions.
Figure 3.2 shows two VM platforms.
Each platform carries out a virtual solution to a user job. All cloud services
are managed in the boxes at the top.
AWS provides extreme flexibility (VMs) for users to execute their own applications.
GAE provides limited application-level virtualization for users to build applications
only based on the services that are created by Google.
The Microsoft tools are used on PCs and some special servers.
This has enabled users to create customized environments atop physical infrastructure
for cloud computing.
Use of VMs in clouds has the following distinct benefits:
o VMs have the ability to run legacy code without interfering with other APIs
o VMs can be used to improve security through creation of sandboxes for running
applications with questionable reliability
o Virtualized cloud platforms can apply performance isolation, letting providers offer
some guarantees and better QoS to customer applications
Containers are software packages that are lightweight and self-contained, and they
comprise all the necessary dependencies to run an application.
The dependencies include external third-party code packages, system libraries, and
other operating system-level applications.
These dependencies are organized in stack levels that are higher than the operating
system.
Advantages:
One advantage of using containers is their fast iteration speed. Due to their
lightweight nature and focus on high-level software, containers can be quickly
modified and updated.
for development teams.
Disadvantages:
o As containers share the same hardware system beneath the operating system layer,
an exploit in one container can potentially break out of the container and affect the
underlying hardware.
o Although many container runtimes offer public repositories of pre-built containers,
there is a security risk associated with using these containers, as they may contain
exploits or be susceptible to hijacking by malicious actors.
Examples:
o Docker is the most widely used container runtime. It offers Docker Hub, a public
repository of containerized applications that can be easily deployed to a local Docker
runtime.
o CRI-O, on the other hand, is a lightweight alternative to using Docker as the runtime
for Kubernetes, implementing the Kubernetes Container Runtime Interface (CRI) to
support Open Container Initiative (OCI)-compatible runtimes.
Virtual Machines
Virtual machines are software packages that contain a complete emulation of
low-level hardware devices, such as CPU, disk, and networking devices. They may also
include a complementary software stack that can run on the emulated hardware.
Together, these hardware and software packages create a functional snapshot of a
computational system.
Advantages:
o Virtual machines provide full isolation security since they operate as standalone
systems, which means that they are protected from any interference or exploits from
other virtual machines on the same host.
o Though a virtual machine can still be hijacked by an exploit, the affected virtual
machine will be isolated and cannot contaminate other adjacent virtual machines.
o One can manually install software to the virtual machine and snapshot the virtual
machine to capture the present configuration state.
o The virtual machine snapshots can then be utilized to restore the virtual machine to
that particular point in time or create additional virtual machines with that
configuration.
Disadvantages:
o Virtual machines are known for their slow iteration speed due to the fact that they
involve a complete system stack.
o Any changes made to a virtual machine snapshot can take a considerable amount of
time to rebuild and validate that they function correctly.
o Another issue with virtual machines is that they can occupy a significant amount of
storage space, often several gigabytes in size.
o This can lead to disk space constraints on the host machine where the virtual
machines are stored.
Examples:
o VirtualBox is an open source emulation system that emulates x86 architecture, and is
owned by Oracle. It is widely used and has a set of additional tools to help develop
and distribute virtual machine images.
o VMware is a publicly traded company that provides a hypervisor along with its
virtual machine platform, which allows deployment and management of multiple
virtual machines. VMware offers robust UI for managing virtual machines, and is a
popular enterprise virtual machine solution with support.
o QEMU is a powerful virtual machine option that can emulate any generic hardware
architecture. However, it lacks a graphical user interface for configuration or
execution, and is a command line only utility. As a result, QEMU is one of the fastest
virtual machine options available.
Docker Hub is a cloud-based repository service where users can push their
Docker container images and access them from anywhere via the internet. It offers the
option to keep images private or public and is primarily used by DevOps teams.
Docker Hub is a hosted service that can be used from any operating system. It
functions as a storage system for Docker images and allows users to pull the required
images when needed. However, it is necessary to have a basic knowledge of Docker to
push or pull images from Docker Hub. If a developer team wants to share a
project along with its dependencies for testing, they can publish it through Docker Hub.
To do this, the developer must create images and push them to Docker Hub. The
testing team can then pull the same image from Docker Hub without needing any
additional files, software, or plugins, as the developer has already shared the image
with all dependencies.
Features of Docker Hub
Docker Hub simplifies the storage, management, and sharing of images with others. It
provides security checks for images and generates comprehensive reports on any
security issues.
Additionally, Docker Hub can automate processes like Continuous Deployment and
Continuous Testing by triggering webhooks when a new image is uploaded.
Through Docker Hub, users can manage permissions for teams, users, and
organizations.
Moreover, Docker Hub can be integrated with tools like GitHub and Jenkins,
streamlining workflows.
Advantages of Docker Hub
Docker container images have a lightweight design, which enables us to push images
in a matter of minutes using a simple command.
This method is secure and offers the option of pushing private or public images.
Making code, software or any type of file available to the public can be done easily by
publishing the images on Docker Hub as public.
UNIT IV
CLOUD DEPLOYMENT ENVIRONMENT
SYLLABUS: Google App Engine – Amazon AWS – Microsoft Azure; Cloud
Software Environments – Eucalyptus – OpenStack.
PART A
2 Marks
Google App Engine (GAE) offers a PaaS platform supporting various cloud
and web applications. This platform specializes in supporting scalable (elastic) web
applications. GAE enables users to run their applications on a large number of data
centers associated with Google's search engine operations.
Blob, Queue, File, and Disk Storage, Data Lake Store, Backup, and Site Recovery.
Well-known GAE applications include the Google Search Engine, Google Docs,
Google Earth, and Gmail. These applications can support large numbers of users
simultaneously. Users can interact with Google applications via the web interface
provided by each application. Third-party application providers can use GAE to build
cloud applications for providing services.
Mainframe
Client-Server
Cloud Computing
Mobile Computing
Grid Computing
AWS can present a challenge due to its vast array of services and functionalities,
which may be hard to comprehend and utilize, particularly for inexperienced
users. The cost of AWS can be high, particularly for high-traffic applications or when
operating multiple services.
PART B
13 Marks
Google has the world's largest search engine facilities. The company has extensive
experience in massive data processing that has led to new insights into data-center
design and novel programming models that scale to incredible sizes.
The Google platform is based on its search engine expertise. Google has hundreds of
data centers and has installed more than 460,000 servers worldwide.
For example, 200 Google data centers are used at one time for a number of cloud
applications.
Data items are stored in text, images, and video and are replicated to tolerate faults or
failures.
Google App Engine (GAE) offers a PaaS platform supporting various cloud
and web applications. Google has pioneered cloud development by leveraging the
large number of data centers it operates.
For example, Google pioneered cloud services in Gmail, Google Docs, and Google
Earth, among other applications. These applications can support a large number of
users simultaneously with high availability (HA).
Notable technology achievements include the Google File System (GFS),
MapReduce, BigTable, and Chubby. In 2008, Google announced the GAE web
application platform, which is becoming a common platform for many small cloud
service providers. This platform specializes in supporting scalable (elastic) web
applications. GAE enables users to run their applications on a large number of data
centers associated with Google's search engine operations.
GAE Architecture
GFS is used for storing large amounts of data.
MapReduce is for use in application program development.
Chubby is used for distributed application lock services.
BigTable offers a storage service for accessing structured data.
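The MapReduce programming model named above can be illustrated with a single-process word count. This is only a sketch of the map, shuffle, and reduce phases; the function names are hypothetical, and Google's implementation distributes these phases across many machines, which is not shown here.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as done between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["cloud data", "cloud cloud"]))
```

The application developer writes only the map and reduce functions; grouping by key is the framework's job.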
Users can interact with Google applications via the web interface provided by each
application.
Third-party application providers can use GAE to build cloud applications for
providing services.
The applications all run in data centers under tight management by Google engineers.
Inside each data center, there are thousands of servers forming different clusters
Figure 4.1 shows the overall architecture of the Google cloud infrastructure.
A typical cluster configuration can run the Google File System, MapReduce jobs and
BigTable servers for structured data.
• Extra services such as Chubby for distributed locks can also run in the clusters.
• GAE runs the user program on Google's infrastructure. As it is a platform running
third-party programs, application developers now do not need to worry about the
maintenance of servers.
GAE can be thought of as the combination of several software components. The
frontend is an application framework which is similar to other web application
frameworks such as ASP, J2EE and JSP. At the time of this writing, GAE supports
Python and Java programming environments. The applications can run similar to web
application containers. The frontend can be used as the dynamic web serving
infrastructure which can provide the full support of common technologies.
The GAE platform comprises the following five major components. The GAE is not
an infrastructure platform, but rather an application development platform for
users.
o The datastore offers object-oriented, distributed, structured data storage services
based on BigTable techniques. The datastore secures data management operations.
o The application runtime environment offers a platform for scalable web programming
and execution. It supports two development languages: Python and Java.
o The software development kit (SDK) is used for local application development. The
SDK allows users to execute test runs of local applications and upload application
code.
o The GAE web service infrastructure provides special interfaces to guarantee flexible
use and management of storage and network resources by GAE.
Google offers essentially free GAE services to all Gmail account owners. The user can
register for a GAE account or use a Gmail account name to sign up for the service. The
service is free within a quota. If the user exceeds the quota, the page instructs how to
pay for the service. Then the user can download the SDK and read the Python or Java
guide to get started.
Note that GAE only accepts Python, Ruby and Java programming languages.
The platform does not provide any IaaS services, unlike Amazon, which offers IaaS
and PaaS.
This model allows the user to deploy user-built applications on top of the cloud
infrastructure that are built using the programming languages and software tools
supported by the provider (e.g., Java, Python).
Azure does this similarly for underlying cloud infrastructure. The cloud provider
facilitates support of application development, testing, and operation support on a
well-defined service platform.
Best-known GAE applications include the Google Search Engine, Google Docs,
Google Earth and Gmail. These applications can support large numbers of users
simultaneously. Users can interact with Google applications via the web interface
provided by each application. Third party application providers can use GAE to build
cloud applications for providing services. The applications are all run in the Google
data centers. Inside each data center, there might be thousands of server nodes to form
different clusters. Each cluster can run multipurpose servers.
GAE supports many web applications.
One is a storage service to store application specific data in the Google infrastructure.
The data can be persistently stored in the backend storage server while still providing
the facility for queries, sorting and even transactions similar to traditional database
systems.
GAE also provides Google specific services, such as the Gmail account service. This
can eliminate the tedious work of building customized user management components
in web applications.
a NoSQL data management system for entities that can be, at most, 1 MB in size and
are labeled by a set of schema-less properties. Queries can retrieve entities of a given
kind filtered and sorted by the values of the properties. Java offers Java Data Object
(JDO) and Java Persistence API (JPA) interfaces implemented by the open source
DataNucleus Access Platform, while Python has a SQL-like query language called
GQL. The data store is strongly consistent and uses optimistic concurrency control.
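Optimistic concurrency control, mentioned above, can be illustrated with a small sketch. This is not the GAE datastore API: the class and function names (VersionedStore, ConflictError, increment) are hypothetical, and the store is a plain in-memory dictionary. The idea shown is that a writer does not lock an entity; it remembers the version it read and the write is rejected if the version has changed in the meantime.

```python
class ConflictError(Exception):
    """Raised when an entity changed between our read and our write."""


class VersionedStore:
    """Toy in-memory store (hypothetical); each key maps to (version, value)."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # Missing keys read as version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise ConflictError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def increment(store, key, retries=3):
    """Read-modify-write with retry, the usual optimistic pattern."""
    for _ in range(retries):
        version, value = store.read(key)
        try:
            return store.write(key, version, (value or 0) + 1)
        except ConflictError:
            continue  # another writer committed first; re-read and try again
    raise ConflictError(f"{key}: gave up after {retries} attempts")
```

A write that carries a stale version number fails instead of silently overwriting newer data, which is why no locks are needed on the read path.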
with your app.
GAE provides the ability to manipulate image data using a dedicated Images service
which can resize, rotate, flip, crop and enhance images. An application can perform
tasks outside of responding to web requests. A GAE application is configured to
consume resources up to certain limits or quotas. With quotas, GAE ensures that your
application would not exceed your budget and that other applications running on GAE
would not impact the performance of your app. In particular, GAE use is free up to
certain quotas.
GFS was built primarily as the fundamental storage service for Google's search
engine. As the size of the web data that was crawled and saved was
quite substantial, Google needed a distributed file system to redundantly store
massive amounts of data on cheap and unreliable computers.
In addition, GFS was designed for Google applications and Google applications were
built for GFS.
In traditional file system design, such a philosophy is not attractive, as there should be
a clear interface between applications and the file system such as a POSIX interface.
GFS typically will hold a large number of huge files, each 100 MB or larger, with
files that are multiple GB in size quite common. Thus, Google has chosen its file data
block size to be 64 MB instead of the 4 KB in typical traditional file systems. The I/O
pattern in Google applications is also special. Files are typically written once, and
the write operations often append data blocks to the end of files.
Multiple appending operations might be concurrent.
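The effect of the 64 MB block size can be checked with simple arithmetic. The sketch below only counts how many blocks a file of a given size occupies under the two block sizes quoted above; the function name blocks_needed is an illustrative choice, not part of GFS.

```python
GFS_BLOCK_SIZE = 64 * 1024 * 1024   # 64 MB, the GFS block size quoted in the text
FS_BLOCK_SIZE = 4 * 1024            # 4 KB, typical traditional file system block


def blocks_needed(file_size_bytes, block_size):
    """Number of blocks needed to hold the file (integer ceiling division)."""
    return -(-file_size_bytes // block_size)


one_gb = 1024 ** 3
gfs_blocks = blocks_needed(one_gb, GFS_BLOCK_SIZE)
small_blocks = blocks_needed(one_gb, FS_BLOCK_SIZE)
print(gfs_blocks, small_blocks)
```

For a 1 GB file this gives 16 blocks of 64 MB against 262,144 blocks of 4 KB, so far less per-block bookkeeping is needed for the large files GFS is built for.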
BigTable was designed to provide a service for storing and retrieving structured and
semi structured data. BigTable applications include storage of web pages, per-user
data, and geographic locations.
This is one reason to rebuild the data management system and the resultant system
can be applied across many projects for a low incremental cost.
The other motivation for rebuilding the data management system is performance.
Low level storage optimizations help increase performance significantly which is
much harder to do when running on top of a traditional database layer. The design and
implementation of the BigTable system has the following goals.
The applications want asynchronous processes to be continuously updating different
pieces of data and want access to the most current data at all times. The database needs
to support very high read/write rates and the scale might be millions of operations per
second. The application may need to examine data changes over time. Thus,
BigTable can be viewed as a distributed multilevel map. It provides a fault tolerant
and persistent database as in a storage service.
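The "multilevel map" view can be sketched as nested dictionaries: a row key maps to columns, and each column keeps timestamped versions so that changes over time can be examined. This is a single-process illustration only; the class TinyTable and its methods are hypothetical and do not reflect the real BigTable interface or its distribution across servers.

```python
from collections import defaultdict


class TinyTable:
    """Toy map of row -> column -> [(timestamp, value), ...] (hypothetical)."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, timestamp, value):
        versions = self._rows[row][column]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0])  # keep oldest first

    def history(self, row, column):
        # Use .get so that lookups do not create empty rows or columns.
        return list(self._rows.get(row, {}).get(column, []))

    def get(self, row, column):
        """Most recent value, or None if the cell was never written."""
        versions = self.history(row, column)
        return versions[-1][1] if versions else None


table = TinyTable()
table.put("com.example/index", "contents", 1, "<html>v1</html>")
table.put("com.example/index", "contents", 2, "<html>v2</html>")
print(table.get("com.example/index", "contents"))
```

Reading with get returns the most current data, while history exposes every stored version of the same cell.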
The BigTable system is scalable, which means the system has thousands of servers,
terabytes of in-memory data, petabytes of disk-based data, millions of reads/writes
per second and efficient scans. BigTable is a self-managing system (i.e., servers can be
added/removed dynamically and it features automatic load balancing).
Chubby, Google's Distributed Lock Service
Chubby is intended to provide a coarse-grained locking service.
It can store small files inside Chubby storage which provides a simple namespace as
a file system tree. The files stored in Chubby are quite small compared to the huge
files in GFS.
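A coarse-grained lock service with a file-tree namespace can be sketched as follows. This is an in-memory illustration under assumed names (TinyLockService, the path "/ls/cell/master"), not Chubby's actual interface. It shows a common use of such a service: several servers try to take the same lock, and the winner records its address in a small file that the others can read.

```python
class TinyLockService:
    """Toy lock service: one owner per path, plus a small file per path."""

    def __init__(self):
        self._files = {}    # path -> small bytes payload
        self._owners = {}   # path -> client currently holding the lock

    def try_acquire(self, path, client):
        # The first client to ask becomes the holder; later ones are refused.
        holder = self._owners.setdefault(path, client)
        return holder == client

    def release(self, path, client):
        if self._owners.get(path) == client:
            del self._owners[path]

    def write(self, path, client, data):
        if self._owners.get(path) != client:
            raise PermissionError(f"{client} does not hold the lock on {path}")
        self._files[path] = data

    def read(self, path):
        return self._files.get(path)


service = TinyLockService()
path = "/ls/cell/master"              # hypothetical node in the namespace tree
if service.try_acquire(path, "server-A"):
    service.write(path, "server-A", b"server-A:8080")
print(service.try_acquire(path, "server-B"), service.read(path))
```

The lock is held for long periods (for example, while a server acts as master), which is what "coarse-grained" refers to.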
Amazon EC2:
Amazon Elastic Compute Cloud (Amazon EC2) is a cloud-based web service that
offers a secure and scalable computing capacity. It allows organizations to customize
virtual compute capacity in the cloud, with the flexibility to choose from a range of
operating systems and resource configurations such as CPU, memory, and
storage. Amazon EC2 falls under the category of Infrastructure as a
Service (IaaS) and provides reliable, cost-effective compute and high-performance
infrastructure to meet the demands of businesses.
AWS Lambda:
AWS Lambda is a serverless, event-driven compute service that enables code
execution without server management. Compute time consumption is the only factor
for payment, and there is no charge when code is not running. AWS Lambda offers the
ability to run code for any application type with no need for administration.
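A minimal sketch of what such event-driven code looks like in Python is given below. A Lambda function written in Python is a handler that receives an event and a context object; the event field "name" and the response shape used here are assumptions chosen for illustration, since the real event layout depends on the service that triggers the function.

```python
import json


def lambda_handler(event, context):
    """Runs once per incoming event; no server is provisioned by the developer."""
    name = event.get("name", "world")   # "name" is an assumed field of the event
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local call for testing; in the cloud the platform invokes the handler.
print(lambda_handler({"name": "cloud"}, None))
```

Because the function only runs while an event is being handled, billing follows the compute time actually consumed.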
Amazon S3
Amazon S3 (Simple Storage Service) is a web service interface for object storage that
enables you to store and retrieve any amount of data from any location on the web. It
is designed to provide limitless storage with a 99.999999999% durability
guarantee. Amazon S3 can be used as the primary storage solution for cloud-native
applications, as well as for backup and recovery and disaster recovery purposes. It
delivers unmatched scalability, data availability, security, and performance.
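The durability figure can be turned into an expected loss rate with a short calculation. The sketch assumes the 99.999999999% figure is an annual, per-object durability, so the chance of losing a given object in a year is 1 in 10^11; exact fractions are used to avoid floating-point rounding. The function names are illustrative.

```python
from fractions import Fraction

# Assumption: durability is per object, per year, so loss probability = 1 - 0.99999999999
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_losses_per_year(object_count):
    """Average number of objects lost per year for a store of this size."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count):
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count)


print(years_per_expected_loss(10_000_000))
```

Under this assumption, a store of 10 million objects would on average lose one object every 10,000 years.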
Amazon Glacier:
Amazon Glacier is a highly secure and cost-effective storage service designed for
long-term backup and data archiving. It offers reliable durability and ensures the
safety of your data. However, since data retrieval may take several hours, Amazon
Glacier is primarily intended for archiving purposes.
Amazon RDS
Amazon Relational Database Service (Amazon RDS) simplifies the process of setting
up, managing, and scaling a relational database in the cloud. Additionally, it offers
resizable and cost-effective capacity and is available on multiple database instance
types that are optimized for memory, performance, or I/O. Amazon RDS offers a choice
of six popular database engines: Amazon Aurora, PostgreSQL, MySQL,
MariaDB, Oracle, and Microsoft SQL Server.
Amazon DynamoDB
Amazon DynamoDB is a NoSQL database service that offers fast and flexible storage
for applications requiring consistent, low-latency access at any scale. It's fully
managed and supports both document and key-value data models. Its versatile data
model and dependable performance make it well-suited
for various applications such as mobile, web, gaming, Internet of Things (IoT), and
more.
Azure is a cost-effective platform with simple pricing based on the "Pay As You Go"
model, which means the user only pays for the resources the user uses.
This makes it a convenient option for setting up large servers without requiring
significant investments, effort, or physical space.
History
Windows Azure was announced by Microsoft in October 2008 and became available
in February 2010. In 2014, Microsoft renamed it as Microsoft Azure. It offered a
platform for various services including .NET services, SQL Services, and Live
Services. However, some people were uncertain about using cloud
technology. Nevertheless, Microsoft Azure is constantly evolving, with new tools and
functionalities being added. The platform has two releases: v1 and v2. The earlier
version was JSON script-oriented, while the newer version features an interactive UI
for easier learning and simplification. Microsoft Azure v2 is still in the preview stage.
Advantages of Azure
Azure offers a cost-effective solution as it eliminates the need for expensive hardware
investments. With a pay-as-you-go subscription model, the user can manage their
Setting up an Azure account is a simple process through the Azure Portal, where you
can choose the desired subscription and begin using the platform.
One of the major advantages of Azure is its low operational cost. Since it operates on
dedicated servers specifically designed for cloud functionality, it provides greater
reliability compared to on-site servers. By utilizing Azure, the user can eliminate the
need for hiring a dedicated technical support team to monitor and troubleshoot
servers. This results in significant cost savings for an organization. Azure provides
easy backup and recovery options for valuable data. In the event of a disaster, the user
can quickly recover the data with a single click, minimizing any impact on end user
business. Cloud-based backup and recovery solutions offer convenience, avoid upfront
investments, and provide expertise from third-party providers. Implementing the
business models in Azure is straightforward, with intuitive features and user-friendly
interfaces. Additionally, there are numerous tutorials available to expedite the learning
and deployment process.
Azure offers robust security measures, ensuring the protection of your critical data
and business applications. Even in the face of natural disasters, Azure serves as a
reliable safeguard for the resources. The cloud infrastructure remains operational,
providing continuous protection.
Azure services
Azure offers a wide range of services and tools for different needs. These include
Compute, which includes Virtual Machines, Virtual Machine Scale Sets, Functions
for serverless computing, Batch for containerized batch workloads, Service Fabric for
microservices and container orchestration, and Cloud Services for building cloud-based
apps and APIs. The Networking tools in Azure offer several options like the
Virtual Network, Load Balancer, Application Gateway, VPN Gateway, Azure DNS for
domain hosting, Content Delivery Network, Traffic Manager, ExpressRoute
dedicated private network fiber connections, and Network Watcher monitoring and
diagnostics. The Storage tools available in Azure include Blob, Queue, File, and Disk
Storage, Data Lake Store, Backup, and Site Recovery, among others. Web + Mobile
services make it easy to create and deploy web and mobile applications. Azure also
includes tools for Containers, Databases, Data + Analytics, AI + Cognitive Services,
Internet of Things, Security + Identity, and Developer Tools, such as Visual Studio
Team Services, Azure DevTest Labs, HockeyApp mobile app deployment and
monitoring, and Xamarin cross-platform mobile development.
Client-Server: In this environment, client devices access resources and services from
a central server, facilitating the sharing of data and processing capabilities.
Cloud Computing: Cloud computing leverages the Internet to provide resources and
services that can be accessed through web browsers or client software. It offers
scalability, flexibility, and on-demand availability.
Grid Computing: Grid computing involves the sharing of computing resources and
services across multiple computers, enabling large-scale computational tasks and data
processing.
Components:
Eucalyptus has various components that work together to provide efficient cloud
computing services.
The Node Controller manages the lifecycle of instances and interacts with the
operating system, hypervisor, and Cluster Controller. The Cluster Controller, in turn,
manages multiple Node Controllers and communicates with the Cloud Controller, which
acts as the front-end for the entire architecture.
The Storage Controller allows the creation of snapshots of volumes and provides
persistent block storage for VM instances. Walrus is the separate storage service
that offers Amazon S3-compatible object storage.
Eucalyptus operates in different modes, each with its own set of features. In Managed
Mode, users are assigned security groups that are isolated by VLAN between the
Cluster Controller and Node Controller. In Managed (No VLAN) mode,
however, the root user on the virtual machine can snoop into other virtual machines
running on the same network layer. The System Mode is the simplest mode with the
least number of features, where a MAC address is assigned to a virtual machine
instance and attached to the Node Controller's bridge Ethernet device. Finally, the
Static Mode is similar to System Mode but provides more control over the assignment
of IP addresses, as a MAC address/IP address pair is mapped to a static entry within
the DHCP server.
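The static MAC/IP mapping of Static Mode amounts to a fixed lookup table that the DHCP server consults. The sketch below shows only that idea; the addresses, the table name STATIC_LEASES and the function lease_for are made up for illustration and are not taken from Eucalyptus configuration.

```python
# Hypothetical static entries: one fixed IP address per instance MAC address.
STATIC_LEASES = {
    "02:00:00:00:00:01": "192.168.1.10",
    "02:00:00:00:00:02": "192.168.1.11",
}


def lease_for(mac, leases=STATIC_LEASES):
    """Return the fixed IP for this MAC, or None if the MAC is not registered."""
    return leases.get(mac.lower())


print(lease_for("02:00:00:00:00:01"))
```

An instance therefore always receives the same address, which is the extra control Static Mode offers over System Mode.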
Features of Eucalyptus
The networking component can be divided into three modes: Static mode, which
allocates IP addresses to instances, System mode, which assigns a MAC address and
connects the instance's network interface to the physical network via NC, and
Managed mode, which creates a local network of instances. Access control is used to
limit user permissions. Elastic Block Storage provides block-level storage volumes
that can be attached to instances. Auto-scaling and load balancing are used to create or
remove instances or services based on demand.
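How auto-scaling decides to create or remove instances can be sketched with a simple target-tracking rule. This is a generic illustration, not Eucalyptus's policy engine: the function name, the 60% CPU target and the instance limits are assumptions. The rule scales the instance count in proportion to how far the observed load is from the target.

```python
import math


def desired_instances(current, avg_cpu_percent, target_cpu_percent=60,
                      minimum=1, maximum=10):
    """Instance count that would bring average CPU back to the target."""
    if current < 1 or target_cpu_percent <= 0:
        raise ValueError("need at least one instance and a positive target")
    wanted = math.ceil(current * avg_cpu_percent / target_cpu_percent)
    # Clamp so the group never shrinks to zero or grows without bound.
    return max(minimum, min(maximum, wanted))


print(desired_instances(4, 90))   # load above target: scale out
print(desired_instances(4, 30))   # load below target: scale in
```

With four instances at 90% CPU the rule asks for six; at 30% CPU it asks for two.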
Advantages of Eucalyptus
Eucalyptus is a versatile solution that can be used for both private and public cloud
computing.
Users can easily run Amazon or Eucalyptus machine images on either type of cloud.
Additionally, its API is fully compatible with all Amazon Web Services, making it
easy to integrate with other tools like Chef and Puppet for DevOps.
Although it is not as widely known as other cloud computing solutions like
OpenStack and CloudStack, Eucalyptus has the potential to become a viable
alternative. It enables hybrid cloud computing, allowing users to combine public and
private clouds for their needs. With Eucalyptus, users can easily transform their data
centers into private clouds and extend their services to other organizations.
PART C
15 Marks
costs. AWS provides flexibility by allowing users to pay only for the services they
need, helping enterprises reduce their capital expenditure on building private IT
infrastructure. AWS has a physical fiber network that connects availability zones,
regions, and edge locations, with maintenance costs borne by AWS. While security of
the cloud is AWS's responsibility, security in the cloud is the responsibility of the
customer. Performance efficiency in the cloud has four main areas: selection, review,
monitoring, and tradeoff.
Advantages of AWS
AWS provides the convenience of easily adjusting resource usage based on your
changing needs, resulting in cost savings and ensuring that your application always
has sufficient resources.
With multiple data centers and a commitment to 99.99% availability for many of its
services, AWS offers a reliable and secure infrastructure.
Its flexible platform includes a variety of services and tools that can be combined to
build and deploy various applications. Additionally, AWS's pay-as-you-go pricing
model means users only pay for the resources they use, eliminating upfront costs and
long-term commitments.
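The 99.99 figure quoted above can be converted into permitted downtime, assuming it is an availability percentage measured over a 365-day year. The function name is an illustrative choice.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Downtime budget implied by an availability percentage over the period."""
    return (100 - availability_percent) / 100 * period_minutes


print(round(allowed_downtime_minutes(99.99), 2))   # "four nines"
print(round(allowed_downtime_minutes(99.9), 2))    # "three nines"
```

Under that assumption, 99.99% availability leaves a budget of about 52.56 minutes of downtime per year, compared with about 525.6 minutes at 99.9%.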
Disadvantages:
AWS can present a challenge due to its vast array of services and functionalities,
which may be hard to comprehend and utilize, particularly for inexperienced
users. The cost of AWS can be high, particularly for high-traffic applications or when
operating multiple services. Furthermore, service expenses can escalate over time,
necessitating frequent expense monitoring. AWS's management of various
infrastructure elements may limit authority over certain parts of your environment and
application.
Global infrastructure
The AWS infrastructure spans across the globe and consists of geographical regions,
each with multiple availability zones that are physically isolated from each other.
When selecting a region, factors such as latency optimization, cost reduction, and
government regulations are considered. In case of a failure in one zone, the
infrastructure in other availability zones remains operational, ensuring business
continuity. AWS's largest region, Northern Virginia, has six availability zones that are
connected by high-speed fiber-optic networking.
To further optimize content delivery, AWS has over 100 edge locations worldwide
that support the CloudFront content delivery network. This network caches frequently
accessed content, such as images and videos, at these edge locations and distributes
them globally for faster delivery and lower latency for end-users. Additionally,
CloudFront offers protection against DDoS attacks.
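Edge caching can be illustrated with a small least-recently-used cache placed in front of an origin server. This is a conceptual sketch only; the class EdgeCache, its capacity and the origin function are hypothetical and do not describe CloudFront's actual eviction policy. It shows why repeated requests for popular content are served without contacting the origin.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy edge location: serves cached content, fetches from origin on a miss."""

    def __init__(self, fetch_from_origin, capacity=2):
        self._fetch = fetch_from_origin
        self._capacity = capacity
        self._items = OrderedDict()   # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._items:
            self.hits += 1
            self._items.move_to_end(path)       # mark as recently used
            return self._items[path]
        self.misses += 1
        content = self._fetch(path)             # slow trip to the origin
        self._items[path] = content
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)     # evict least recently used
        return content


origin_calls = []


def origin(path):
    origin_calls.append(path)
    return f"content of {path}"


cache = EdgeCache(origin, capacity=2)
cache.get("/logo.png")
cache.get("/logo.png")    # second request is served from the edge
print(cache.hits, cache.misses, origin_calls)
```

Only the first request for /logo.png reaches the origin; the second is a cache hit at the edge, which is where the latency saving comes from.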
dedicated computer hardware. It provides a high degree of flexibility and management
control over IT resources. Examples of IaaS services on AWS include VPC, EC2, and
EBS.
Platform as a Service (PaaS): In this service model, AWS manages the underlying
infrastructure, including the operating system and hardware. This allows developers to
be more efficient and focus on deploying and managing applications rather than
managing infrastructure. Examples of PaaS services on AWS include RDS, EMR, and
ElasticSearch.
Software as a Service (SaaS): This service model provides complete end-user
applications that typically run on a browser. The service provider runs and manages
the software, so end-users only need to worry about using the software that suits their
needs. Examples of SaaS applications on AWS include Salesforce.com, web-based
email, and Office 365.
The OpenStack project is an open source cloud computing platform for all types of
clouds, which aims to be simple to implement, massively scalable and feature
rich. Developers and cloud computing technologists from around the world create the
OpenStack project. OpenStack provides an Infrastructure as a Service (IaaS) solution
through a set of interrelated services. Each service offers an application programming
interface (API) that facilitates this integration. Depending on their needs, administrators
can install some or all services.
OpenStack began in 2010 as a joint project of Rackspace Hosting and NASA. As of
2012, it is managed by the OpenStack Foundation, a non-profit corporate entity
established in September 2012 to promote OpenStack software and its
community. More than 500 companies have now joined the project.
The OpenStack system consists of several key services that are separately installed.
These services work together depending on the user cloud needs and include the
Compute, Identity, Networking, Image, Block Storage, Object Storage, Telemetry,
Orchestration, and Database services.
The administrator can install any of these projects separately and configure them
standalone or as connected entities.
Figure 4.4 shows the relationships among the OpenStack services:
To design, deploy, and configure OpenStack, administrators must understand the
logical architecture. OpenStack consists of several independent parts, named the
OpenStack services. All services authenticate through a common Identity
service. Individual services interact with each other through public APIs, except where
privileged administrator commands are necessary.
Internally, OpenStack services are composed of several processes.
All services have at least one API process, which listens for API requests,
preprocesses them and passes them on to other parts of the service. With the exception
of the Identity service, the actual work is done by distinct processes. For
communication between the processes of one service, an AMQP message broker is
used. The service's state is stored in a database. When deploying and configuring the
OpenStack cloud, administrators can choose among several message broker and
database solutions, such as RabbitMQ, MySQL, MariaDB, and SQLite.
Users can access OpenStack via the web-based user interface implemented by the Horizon
Dashboard, via command-line clients and by issuing API requests through tools like
browser plug-ins or curl. For applications, several SDKs are available. Ultimately, all
these access methods issue REST API calls to the various OpenStack services.
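As an illustration of such a REST call, the sketch below builds (but does not send) a password authentication request to the Identity service, using the Identity API v3 token endpoint. The host name controller, port 5000, and the demo user, password and project are placeholder assumptions; a real deployment supplies its own values.

```python
import json
import urllib.request


def build_token_request(base_url, username, password, project, domain="Default"):
    """Prepare a POST to /v3/auth/tokens asking for a project-scoped token."""
    body = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"name": domain},
                        "password": password,
                    }
                },
            },
            "scope": {"project": {"name": project, "domain": {"name": domain}}},
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v3/auth/tokens",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder endpoint and credentials; nothing is sent over the network here.
request = build_token_request("http://controller:5000", "demo", "secret", "demo")
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would perform the call; on success the Identity service returns the token in the X-Subject-Token response header, and that token is then presented to the other services.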
The controller node runs the Identity service, Image service, Placement service,
management portions of Compute, management portion of Networking, various
Networking agents, and the Dashboard. It also includes supporting services such as an
SQL database, message queue, and NTP.
Optionally, the controller node runs portions of the Block Storage, Object Storage,
Orchestration, and Telemetry services. The controller node requires a minimum of two
network interfaces.
The compute node runs the hypervisor portion of Compute that
operates instances. By default, Compute uses the KVM hypervisor. The compute node
also runs a Networking service agent that connects instances to virtual networks and
provides firewalling services to instances via security groups.
Administrators can deploy more than one compute node. Each node requires a
minimum of two network interfaces.
The optional Block Storage node contains the
disks that the Block Storage and Shared File System services provision for instances.
For simplicity, service traffic between compute nodes and this node uses the
management network.
Production environments should implement a separate storage network to increase
performance and security. Administrators can deploy more than one block storage
node. Each node requires a minimum of one network interface.
The optional Object Storage node contains the disks that the Object Storage service
uses for storing accounts, containers, and objects. For simplicity, service traffic
between compute nodes and this node uses the management network. Production
environments should implement a separate storage network to increase performance
and security. This service requires two nodes. Each node requires a minimum of one
network interface. Administrators can deploy more than two object storage nodes.
The provider networks
option deploys the OpenStack Networking service in the simplest way possible with
primarily layer 2 (bridging/switching) services and VLAN segmentation of networks.
Essentially, it bridges virtual networks to physical networks and relies on physical
network infrastructure for layer-3 (routing) services. Additionally, a DHCP service
provides IP address information to instances.
UNIT V
CLOUD SECURITY
SYLLABUS:Virtualization System-Specific Attacks: Guest hopping – VM migration
attack – hyperjacking. Data Security and Storage; Identity and Access Management
(IAM) - IAM Challenges - IAM Architecture and Practice.
PART A
2 Marks
1. What is a virtualization attack?BTL1
Virtualization attacks: One of the top cloud computing threats involves one of its core
enabling technologies: virtualization. In virtual environments, the attacker can take
control of virtual machines installed by compromising the lower layer hypervisor.
3. What is guest hopping?BTL1
Guest-hopping attack: In this type of attack, an attacker
will try to get access to one virtual machine by penetrating another virtual machine
hosted in the same hardware.
One possible mitigation of a guest-hopping attack is to use forensics and VM
debugging tools to observe the security of the cloud.
4. What is a hyperjacking attack?BTL1
Hyperjacking is an attack in which a hacker takes malicious control over the
hypervisor that creates the virtual environment within a virtual machine (VM) host.
5. How does a hyperjacking attack work?BTL1
Hyperjacking is an attack in which an adversary takes malicious control over the
hypervisor that creates the virtual environment within a virtual machine (VM) host.
6. What is data security and storage in cloud computing?BTL1
Cloud data security is the practice of protecting data and other digital information
assets from security threats, human error, and insider threats. It leverages technology,
policies, and processes to keep your data confidential and still accessible to those who
need it in cloud-based environments
7. What are the 5 components of data security in cloud computing?BTL1
Visibility.
Exposure Management.
Prevention Controls.
Detection.
Response
IAM roles are of 4 types, primarily differentiated by who or what can assume the role:
Service Role. Service-Linked Role. Role for Cross-Account Access.
Management. For each category a general description of goals is provided, followed
by a list of specific requirements that will help ensure goals will be met
PART B
13 Marks
1. What are VM migration attacks?BTL1
(Definition:2 marks,Concept explanation:11 marks)
- Guest-hopping attack: one possible mitigation of a guest-hopping attack is to use
forensics and VM debugging tools to observe any attempt to compromise a VM.
Another possible mitigation is using a High Assurance Platform (HAP), which provides
a high degree of isolation between virtual machines.
- SQL injection: to mitigate an SQL injection attack, remove all stored procedures that
are rarely used. Also, assign the least possible privileges to users who have
permission to access the database.
- Side channel attack: as a countermeasure, it might be preferable to ensure
that none of the legitimate user VMs resides on the same hardware as other users' VMs.
This completely eliminates the risk of side-channel attacks in a virtualized
cloud environment.
- Malicious insider: strict privilege planning and security auditing can
minimize this security threat.
- Data storage security: ensure data integrity and confidentiality, and ensure limited
access to the users' data by the CSP employees.
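The side-channel countermeasure in the list above (never placing VMs of different users on the same hardware) can be sketched as a placement rule. This is a simplified illustration with hypothetical names; it ignores host capacity and treats each host as dedicated to at most one tenant.

```python
def place_vm(hosts, tenant):
    """Pick a host for the tenant's new VM without sharing hardware.

    hosts maps a host name to the set of tenants already running on it.
    """
    # Prefer a host that this tenant already occupies on its own.
    for name in sorted(hosts):
        if hosts[name] == {tenant}:
            return name
    # Otherwise claim an empty host for this tenant.
    for name in sorted(hosts):
        if not hosts[name]:
            hosts[name].add(tenant)
            return name
    raise RuntimeError("no host available without co-residency")


hosts = {"host-1": set(), "host-2": set()}
print(place_vm(hosts, "alice"), place_vm(hosts, "bob"), place_vm(hosts, "alice"))
```

A third tenant is refused rather than co-located, which shows the cost of this countermeasure: dedicated hardware reduces how densely the provider can pack VMs.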
What Is Hyperjacking?
Virtual Machine
A virtual machine is just that: a non-physical machine that uses virtualization
software instead of hardware to function.
Though virtual machines must exist on a piece of hardware, they operate using virtual
components (such as a virtual CPU).
Hypervisors form the backbone of virtual machines. These are software programs that
are responsible for creating, running, and managing VMs. A single hypervisor can
host multiple virtual machines, or multiple guest operating systems, at one time,
which also gives it the alternative name of virtual machine manager (VMM).
There are two kinds of hypervisors. The first is known as a "bare metal" or "native"
hypervisor, with the second being a "host" hypervisor. What you should note is that it
is the hypervisors of virtual machines that are the targets of hyperjacking attacks
(hence the term "hyper-jacking").
Origins of Hyperjacking
In the mid-2000s, researchers found that hyperjacking was a possibility. At the time,
hyperjacking attacks were entirely theoretical, but the threat of one being carried out
was always there. As technology advances and cybercriminals become more inventive,
the risk of hyperjacking attacks increases by the year.
In fact, in September 2022, warnings of real hyperjacking attacks began to arise.
Both Mandiant and VMware published warnings stating that they found malicious
actors using malware to conduct hyperjacking attacks in the wild via a harmful
version of VMware software. In this venture, the threat actors inserted their own
malicious code within victims' hypervisors while bypassing the target devices'
security measures (similarly to a rootkit).
Through this exploit, the hackers in question were able to run commands on the
virtual machines' host devices without detection.
Hypervisors are the key target of hyperjacking attacks. In a typical attack, the original
hypervisor will be replaced via the installation of a rogue, malicious hypervisor that
the threat actor has control of. By installing a rogue hypervisor under the original, the
attacker can therefore gain control of the legitimate hypervisor and exploit the VM.
By having control over the hypervisor of a virtual machine, the attacker can, in turn,
gain control of the entire VM server. This means that they can manipulate anything in
the virtual machine. In the aforementioned hyperjacking attack announced in
September 2022, it was found that hackers were using hyperjacking to spy on victims.
Compared to other hugely popular cybercrime tactics like phishing and ransomware,
hyperjacking isn't very common at the moment. But with the first confirmed use of
this method, it's important that you know how to keep your devices, and your data,
safe.
Cloud data security is the practice of protecting data and other digital information
assets from security threats, human error, and insider threats. It leverages technology,
policies, and processes to keep your data confidential and still accessible to those who
need it in cloud-based environments. Cloud computing delivers many benefits,
allowing you to access data from any device via an internet connection to reduce the
chance of data loss during outages or incidents and improve scalability and agility. At
the same time, many organizations remain hesitant to migrate sensitive data to the
cloud as they struggle to understand their security options and meet regulatory
demands.
Understanding how to secure cloud data remains one of the biggest obstacles to
overcome as organizations transition from building and managing on-premises data
centers. So, what is data security in the cloud? How is your data protected? And what
cloud data security best practices should you follow to ensure cloud-based data assets
are secure and protected?
Cloud data security protects data that is stored (at rest) or moving in and out of the
cloud (in motion) from security threats, unauthorized access, theft, and corruption. It
relies on physical security, technology tools, access management and controls, and
organizational policies.
Why companies need cloud security
Today, we’re living in the era of big data, with companies generating, collecting, and
storing vast amounts of data by the second, ranging from highly confidential business
or personal customer data to less sensitive data like behavioral and marketing
analytics.
Beyond the growing volumes of data that companies need to be able to access,
manage, and analyze, organizations are adopting cloud services to help them achieve
more agility and faster times to market, and to support increasingly remote or hybrid
workforces. The traditional network perimeter is fast disappearing, and security teams
are realizing that they need to rethink current and past approaches when it comes to
securing cloud data. With data and applications no longer living inside your data
center and more people than ever working outside a physical office, companies must
solve how to protect data and manage access to that data as it moves across and
through multiple environments.
As more data and applications move out of a central data center and away from
traditional security mechanisms and infrastructure, the higher the risk of exposure
becomes. While many of the foundational elements of on-premises data security
remain, they must be adapted to the cloud.
• Lack of visibility. Companies don’t know where all their data and
applications live and what assets are in their inventory.
• Less control. Since data and apps are hosted on third-party infrastructure, they
have less control over how data is accessed and shared.
• Confusion over shared responsibility. Companies and cloud providers share
cloud security responsibilities, which can lead to gaps in coverage if duties and tasks
are not well understood or defined.
• Inconsistent coverage. Many businesses are finding multicloud and hybrid
cloud to better suit their business needs, but different providers offer varying levels of
coverage and capabilities that can deliver inconsistent protection.
• Growing cybersecurity threats. Cloud databases and cloud data storage
make ideal targets for online criminals looking for a big payday, especially as
companies are still educating themselves about data handling and management in the
cloud.
• Strict compliance requirements. Organizations are under pressure to comply
with stringent data protection and privacy regulations, which require enforcing
security policies across multiple environments and demonstrating strong data
governance.
• Distributed data storage. Storing data on international servers can deliver
lower latency and more flexibility. Still, it can also raise data sovereignty issues that
might not be problematic if you were operating in your
own data center.
Greater visibility
Strong cloud data security measures allow you to maintain visibility into the inner
workings of your cloud, namely what data assets you have and where they live, who
is using your cloud services, and the kind of data they are accessing.
Easy backups and recovery
Cloud data security can offer a number of solutions and features to help automate
and standardize backups, freeing your teams from monitoring manual backups and
troubleshooting problems. Cloud-based disaster recovery also lets you restore and
recover data and applications in minutes.
Cloud data compliance
Robust cloud data security programs are designed to meet compliance obligations,
including knowing where data is stored, who can access it, how it’s processed, and
how it’s protected. Cloud data loss prevention (DLP) can help you easily discover,
classify, and de-identify sensitive data to reduce the risk of violations.
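The discover, classify, and de-identify steps mentioned above can be illustrated with a minimal sketch. The pattern names and regular expressions below are hypothetical and chosen only for illustration; they are not the detectors used by any particular cloud DLP service, which are far more thorough.

```python
import re

# Hypothetical detectors for illustration only; real DLP services use
# much richer detection than two regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD_NUMBER": re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b"),
}


def classify(text):
    """Discover and classify: return the sorted names of the patterns found."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def de_identify(text):
    """De-identify: replace every match with a placeholder such as [EMAIL]."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[{name}]", text)
    return text


sample = "Contact alice@example.com, card 4111 1111 1111 1111."
print(classify(sample))
print(de_identify(sample))
```

Masking with a fixed placeholder is only one way to de-identify data; it is used here because it keeps the sketch short.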
Data encryption
Organizations need to be able to protect sensitive data whenever and wherever it
goes. Cloud service providers help you tackle secure cloud data transfer, storage, and
sharing by implementing several layers of advanced encryption for securing cloud
data, both in transit and at rest.
Lower costs
Cloud data security reduces total cost of ownership (TCO) and the administrative
and management burden of securing cloud data. In addition, cloud providers offer the
latest security features and tools, making it easier for security professionals to do
their jobs with automation, streamlined integration, and continuous alerting.
Advanced incident detection and response
IAM Challenges
One critical challenge of IAM concerns managing access for diverse
user populations (employees, contractors, partners, etc.) accessing internal and
externally hosted services. IT is constantly challenged to rapidly provision appropriate
access to the users whose roles and responsibilities often change for business reasons.
Another issue is the turnover of users within the organization. Turnover varies by
industry and function (seasonal staffing fluctuations in finance departments, for
example) and can also arise from changes in the business, such as mergers and
acquisitions, new product and service releases, business process outsourcing, and
changing responsibilities. As a result, sustaining IAM processes can turn into a
persistent challenge. Access policies for information are seldom centrally and
consistently applied. Organizations can contain disparate directories, creating
complex webs of user identities, access rights, and procedures. This has led to
inefficiencies in user and access management processes while exposing these
organizations to significant security, regulatory compliance, and reputation risks. To
address these challenges and risks, many companies have sought technology solutions
to enable centralized and automated user access management. Many of these
initiatives are entered into with high expectations, which is not surprising given that
the problem is often large and complex. Most often those initiatives to improve IAM
can span several years and incur considerable cost. Hence, organizations should
approach their IAM strategy and architecture with both business and IT drivers that
address the core inefficiency issues while preserving the control’s efficacy (related to
access control). Only then will the organizations have a higher likelihood of success
and return on investment.
PART C
15 Marks
1. Explain in detail about IAM architecture?BTL4
(Definition:2 marks,Diagram:4 marks,Concept explanation:9 marks)
Use IAM tools to apply appropriate permissions. Analyze access patterns and
review permissions.
The Architecture of Identity Access Management
User Management:- It consists of activities for the control and management over the
identity of users.
Monitoring and Auditing:- Based on the defined policies, the monitoring, auditing,
and reporting are done by the users regarding their access to resources within the
organization.
Operational Activities of IAM:- In provisioning, we onboard new users onto the
organization's systems and applications and provide them with the necessary access
to services and data. Deprovisioning works in the opposite way: we delete or
deactivate the identity of the user and revoke all the privileges of the user.
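The onboarding and deprovisioning activities described above can be sketched with a small in-memory model. The `Identity` and `Directory` classes and the privilege names are hypothetical and exist only for this illustration; a real IAM system would persist identities and integrate with directories and applications.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    username: str
    active: bool = True
    privileges: set = field(default_factory=set)


class Directory:
    """Hypothetical in-memory identity store, for illustration only."""

    def __init__(self):
        self.identities = {}

    def provision(self, username, privileges):
        # Onboard a new user and grant the necessary access.
        identity = Identity(username, privileges=set(privileges))
        self.identities[username] = identity
        return identity

    def deprovision(self, username):
        # Deactivate the identity and revoke every privilege it held.
        identity = self.identities[username]
        identity.active = False
        identity.privileges.clear()
        return identity

    def can_access(self, username, privilege):
        identity = self.identities.get(username)
        return bool(identity and identity.active and privilege in identity.privileges)


directory = Directory()
directory.provision("asha", ["email", "crm"])
print(directory.can_access("asha", "crm"))
directory.deprovision("asha")
print(directory.can_access("asha", "crm"))
```

The sketch deactivates rather than deletes the identity, which is one of the two options the text mentions; keeping the record can help later auditing.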
order to build custom authentication and authorization features into their application,
it also promotes a loosely coupled architecture.
The maturity model takes into account the dynamic nature of IAM users, systems, and
applications in the cloud and addresses the four key components of the IAM
automation process:
• User Management, New Users
• User Management, User Modifications
• Authentication Management
• Authorization Management
Table 5-3 defines the maturity levels as they relate to the four key components.
By matching the model’s descriptions of various maturity levels with the cloud
services delivery model’s (SaaS, PaaS, IaaS) current state of IAM, a clear picture
emerges of IAM maturity across the four IAM components. If, for example, the
service delivery model (SPI) is “immature” in one area but “capable” or “aware” in all
others, the IAM maturity model can help focus attention on the area most in need of
attention.
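The idea of using the maturity model to find the area most in need of attention can be sketched as a simple lookup. The relative ordering of the three level names quoted in the text is an assumption of this sketch, and the sample assessment is invented for illustration.

```python
# Assumed ordering of maturity levels, lowest first. Only these three labels
# are quoted in the text; their relative order here is an assumption.
LEVELS = ["immature", "aware", "capable"]


def weakest_components(assessment):
    """Return the IAM components rated at the lowest maturity level present."""
    lowest = min(LEVELS.index(level) for level in assessment.values())
    return sorted(
        component
        for component, level in assessment.items()
        if LEVELS.index(level) == lowest
    )


# Invented assessment of the four IAM components for one service.
assessment = {
    "User Management, New Users": "capable",
    "User Management, User Modifications": "aware",
    "Authentication Management": "immature",
    "Authorization Management": "capable",
}
print(weakest_components(assessment))
```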
• Federation or SSO
• Authorization management
• Compliance management
Here are the specific pros and cons of this approach:
Pros
Organizations can leverage the existing investment in their IAM infrastructure and
extend the practices to the cloud. For example, organizations that have implemented
SSO for applications within their data center exhibit the following benefits:
• They are consistent with internal policies, processes, and access management
frameworks.
• They have direct oversight of the service-level agreement (SLA) and security of the
IdP.
• They have an incremental investment in enhancing the existing identity architecture
to support federation.
Cons
By not changing the infrastructure to support federation, new inefficiencies can result
due to the addition of life cycle management for non-employees such as customers.
Most organizations will likely continue to manage employee and long-term contractor
identities using organically developed IAM infrastructures and practices. But they
seem to prefer to outsource the management of partner and consumer identities to a
trusted cloud-based identity provider as a service partner.
Identity management-as-a-service
In this architecture, cloud services can delegate authentication to an identity
management-as-a-service (IDaaS) provider. In this model, organizations outsource the
federated identity management technology and user management processes to a
third-party service provider, such as Ping Identity, TriCipher’s Myonelogin.com, or
Symplified.com. When federating identities to the cloud, organizations may need to
manage the identity life cycle using their IAM system and processes. However, the
organization might benefit from an outsourced multiprotocol federation gateway
(identity federation service) if it has to interface with many different partners and
cloud service federation schemes. For example, as of this writing, Salesforce.com
supports SAML
1.1 and Google Apps supports SAML 2.0. Enterprises accessing Google Apps and
Salesforce.com may benefit from a multiprotocol federation gateway hosted by an
identity management CSP such as Symplified or TriCipher. In cases where
credentialing is difficult and costly, an enterprise might also outsource credential
issuance (and background investigations) to a service provider, such as the GSA
Managed Service Organization (MSO) that issues personal identity verification (PIV)
cards and, optionally, the certificates on the cards. The GSA MSO is offering the
USAccess management end-to-end solution as a shared service to federal civilian
agencies. In essence, this is a SaaS model for identity management, where the SaaS
IdP stores identities in a “trusted identity store” and acts as a proxy for the
organization’s users accessing cloud services, as illustrated in Figure 5-8.
The identity store in the cloud is kept in sync with the corporate directory through a
provider-proprietary scheme (e.g., agents running on the customer’s premises
synchronizing a subset of an organization’s identity store to the identity store in the
cloud using SSL VPNs). Once the IdP is established in the cloud, the organization
should work with the CSP to delegate authentication to the cloud identity service
provider. The cloud IdP will authenticate the cloud users prior to them accessing any
cloud services (this is done via browser SSO techniques that involve standard HTTP
redirection techniques). Here are the specific pros and cons of this approach:
Pros
Delegating certain authentication use cases to the cloud identity management service
hides the complexity of integrating with various CSPs supporting different federation
standards. Case in point: Salesforce.com and Google support delegated authentication
using SAML. However, as of this writing, they support two different versions of
SAML: Google Apps supports only SAML 2.0, and Salesforce.com supports only
SAML 1.1. Cloud-based identity management services that support both SAML
standards (multiprotocol federation gateways) can hide this integration complexity
from organizations adopting cloud services. Another benefit is that there is little need
for architectural changes to support this model. Once identity synchronization
between the organization directory or trusted system of record and the identity service
directory in the cloud is set up, users can sign on to cloud services using corporate
identity, credentials (both static and dynamic), and authentication policies.
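How a multiprotocol federation gateway hides the difference in SAML versions can be sketched as a lookup keyed by service. The protocol versions come from the text ("as of this writing"); the function name, the dictionary layout, and the returned structure are hypothetical, and the sketch does not build real SAML messages.

```python
# Protocol versions as stated in the text; everything else here is a
# hypothetical illustration, not a real SAML implementation.
SERVICE_PROTOCOLS = {
    "Salesforce.com": "SAML 1.1",
    "Google Apps": "SAML 2.0",
}


def federate(user, service):
    """Return a simplified stand-in for the assertion the gateway would issue."""
    try:
        protocol = SERVICE_PROTOCOLS[service]
    except KeyError:
        raise ValueError(f"no federation scheme registered for {service!r}") from None
    # The organization only supplies the user; the gateway picks the protocol.
    return {"subject": user, "audience": service, "protocol": protocol}


print(federate("asha@example.com", "Google Apps"))
print(federate("asha@example.com", "Salesforce.com"))
```

The calling organization never names a protocol, which is the integration complexity the gateway is meant to hide.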
Cons
When you rely on a third party for an identity management service, you may have less
visibility into the service, including implementation and architecture details. Hence,
the availability and authentication performance of cloud applications hinges on the
identity management service provider’s SLA, performance management, and
availability. It is important to understand the identity management service provider’s
service level, architecture, service redundancy, and performance guarantees. Another
drawback to this approach is that you may not be able to generate custom reports to
meet internal compliance requirements. In addition, identity
attribute management can also become complex when identity attributes are not
properly defined and associated with identities (e.g., definitions of attributes, both
mandatory and optional). New governance processes may be required to authorize
various operations (add/modify/remove attributes) to govern user attributes that move
outside the organization’s trust boundary. Identity attributes will change through the
life cycle of the identity itself and may get out of sync. Although both approaches
enable the identification and authentication of users to cloud services, various features
and integration nuances are specific to the service delivery model (SaaS, PaaS, and
IaaS), as we will discuss in the next section.