
Unit-3 PPT Updated

The document outlines the concepts of cloud storage and cloud computing, detailing their functionalities, benefits, and types. It discusses cloud provisioning, data management, and the advantages of using cloud storage, including scalability, cost savings, and improved data security. Additionally, it covers data-intensive technologies and distributed data storage, highlighting specific systems like Amazon Dynamo, CouchDB, and ThruDB.

“Data Storage and Cloud Computing”
Prepared By
Prof. Dinesh Banabakode
(Assistant Professor)
AI Department

Syllabus
Cloud Storage: Data Management, Provisioning Cloud
storage, Data Intensive Technologies for Cloud Computing.
Cloud Storage from LANs to WANs: Cloud
Characteristics, Distributed Data Storage.
File System in Cloud
A file system in the cloud is a hierarchical storage system that provides shared
access to file data. Users can create, delete, modify, read, and write files and can
organize them logically in directory trees for intuitive access.
In practice, a vendor builds a file system that offers traditional file
protocols such as NFS or SMB to cloud-hosted applications. The vendor
provides an instance of its file system, and the organization deploys it
on the cloud provider of its choice. The organization then allocates the
appropriate compute performance and storage I/O.

The goal of these file systems is to speed the migration of
applications to the cloud. By using a file system in the cloud, the
organization does not need to rewrite the storage I/O components of
the application.
Cloud Data Stores
Cloud storage is a cloud computing model that stores data on the
Internet through a cloud computing provider who manages and
operates data storage as a service.
In this fast-moving world, it has become necessary to store data in
cloud storage. The biggest advantage of cloud storage is that we
can store any type of data in digital form on the cloud. Another
advantage is that we can access data from anywhere, at any time,
on any device. There are many cloud storage providers, such as
Google Drive, Dropbox, OneDrive, and iCloud. They provide
free service for limited storage, but to store beyond the
limit, you have to pay.
How Does Cloud Storage Work?
Cloud storage is purchased from a third-party cloud vendor who
owns and operates data storage capacity and delivers it over the
Internet in a pay-as-you-go model. These cloud storage vendors
manage capacity, security and durability to make data accessible to
your applications all around the world.
Applications access cloud storage through traditional storage
protocols or directly via an API. Many vendors offer complementary
services designed to help collect, manage, secure and analyze data at
massive scale.
Cloud Vs Grid Computing
Cloud Computing                                      | Grid Computing
Follows a client-server computing architecture.      | Follows a distributed computing architecture.
Scalability is high.                                 | Scalability is normal.
More flexible than grid computing.                   | Less flexible than cloud computing.
Operates as a centralized management system.         | Operates as a decentralized management system.
Cloud servers are owned by infrastructure providers. | Grids are owned and managed by the organization.
Uses services like IaaS, PaaS, and SaaS.             | Uses systems like distributed computing, distributed information, and distributed pervasive.
Service-oriented.                                    | Application-oriented.
Accessible through standard web protocols.           | Accessible through grid middleware.


Cloud Storage
Cloud storage is a data deposit model in which digital information such as
documents, photos, videos and other forms of media is stored on virtual or
cloud servers hosted by third parties. It allows you to transfer data to an
offsite storage system and access it whenever needed.

Cloud storage is a cloud computing model that allows users to save
important data or media files on remote, third-party servers. Users can access
these servers at any time over the internet. Also known as utility storage,
cloud storage is maintained and operated by a cloud-based service provider.
How Does Cloud Storage Work?
Cloud storage works as a virtual data center. It offers end users
and applications virtual storage infrastructure that can be scaled to
the application's requirements. It generally operates via a web-based
API that is implemented remotely and interacts with the provider's
in-house cloud storage infrastructure.
Cloud storage includes at least one data server to which a user can
connect via the internet. The user sends files to the data server,
which forwards the data to multiple servers, manually or in an
automated manner, over the internet. The stored data can then be
accessed via a web-based interface.
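The forwarding step described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class names `StorageNode` and `DataServer`, the `copies` parameter, and the CRC-based placement rule are all hypothetical, and in-memory dictionaries stand in for real servers and network calls.

```python
import zlib


class StorageNode:
    """One storage server; a dict stands in for its disks (assumption)."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def store(self, filename, data):
        self.files[filename] = data


class DataServer:
    """Entry point that accepts an upload and forwards it to several nodes."""

    def __init__(self, nodes, copies=2):
        if copies > len(nodes):
            raise ValueError("cannot keep more copies than there are nodes")
        self.nodes = nodes
        self.copies = copies
        self.locations = {}  # filename -> list of nodes holding a copy

    def upload(self, filename, data):
        # Pick a deterministic starting node, then take the next ones in order.
        start = zlib.crc32(filename.encode("utf-8")) % len(self.nodes)
        chosen = [self.nodes[(start + i) % len(self.nodes)]
                  for i in range(self.copies)]
        for node in chosen:
            node.store(filename, data)
        self.locations[filename] = chosen

    def download(self, filename):
        # Return the first copy that is still present on some node.
        for node in self.locations[filename]:
            if filename in node.files:
                return node.files[filename]
        raise KeyError(filename)
```

Because every file lives on more than one node, `download` can still succeed after one copy is lost, which is the point of forwarding the upload to multiple servers.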
Types of Cloud Storage
Note: The theory part was already covered in Unit-1,
i.e. Public, Private, Hybrid and Community Cloud.

Benefits of Cloud Storage:

1. Cost Saving
2. Data Redundancy and Replication
3. Scalability
4. Speed
Data Management in Cloud Storage
Cloud data management is the practice of storing a company’s
data at an offsite data center that is typically owned and overseen
by a vendor who specializes in public cloud infrastructure, such as
AWS or Microsoft Azure. Managing data in the cloud provides an
automated backup strategy, professional support, and ease of
access from any location.
Benefits of Cloud Data Management
1. Security: Modern cloud data management often delivers better data
protection than on-premises solutions. In fact, 94% of cloud adopters report
security improvements. Why? First, cloud data management reduces
the risk of data loss due to device damage or hardware failure. Second,
companies specializing in cloud hosting and data management employ more
advanced security measures and practices to protect sensitive data than
companies that keep their data on premises.
2. Scalability and savings: Cloud data management lets users scale services
up or down as needed. More storage or compute power can be added
to accommodate changing workloads. Companies can then scale back
after the completion of a big project to avoid paying for services they don't
need.
3. Governed access: With improved security comes greater peace of mind
regarding governed data access. Cloud storage means team members can
access the data they need from wherever they are. This access also supports a
collaborative work culture, as employees can work together on a dataset,
easily share insights, and more.

4. Automated backups and disaster recovery: The cloud storage vendor
can manage and automate data backups so that the company can focus its
attention on other things, and can rest assured that its data is safe. Having an
up-to-date backup at all times also speeds up the process of disaster recovery
after emergencies, and can help mitigate the effects of ransomware attacks.
5. Improved data quality: An integrated, well-governed cloud data
management solution helps companies tear down data silos and create a
single source of truth for every data point. Data remains clean, consistent,
up-to-date, and accessible for every use case, from real-time data analytics to
advanced machine learning applications to external sharing via APIs.

6. Automated updates: Cloud data management providers are committed to
providing the best services and capabilities. When applications need
updating, cloud providers run these updates automatically. That means your
team doesn't need to pause work while they wait for IT to update everyone's
system.
Cloud Provisioning
Cloud provisioning means allocating a cloud service provider’s resources to
a customer. It is a key feature of cloud computing. It refers to how a client
gets cloud services and resources from a provider. The cloud services that
customers can subscribe to include infrastructure-as-a-service (IaaS),
software-as-a-service (SaaS), and platform-as-a-service (PaaS) in public or
private environments.
Benefits of Cloud Provisioning
Scalability: Under the conventional IT provisioning model, a company makes a huge
investment in its on-site infrastructure. This requires extensive preparation and
forecasting of infrastructure needs. In the cloud provisioning model, however, cloud
resources can scale up and down based on short-term usage, so the organization
does not have to plan capacity far in advance.

Speed: Speed is another benefit of cloud provisioning. The organization's developers
can schedule jobs themselves, which removes the need for an administrator to
provision and manage resources.

Cost Savings: This is another potential benefit of cloud provisioning. Traditional
technology can incur huge costs for organizations, while cloud providers allow
customers to pay only for what they consume. This is another major reason why cloud
provisioning is preferred.
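The scalability and cost-saving points can be illustrated with a small Python sketch. It is not any provider's actual autoscaling algorithm: the function names, the 60% target utilization, and the instance limits are assumptions chosen for the example, and prices are expressed in whole cents to keep the arithmetic exact.

```python
def desired_instances(current, utilization_pct, target_pct=60,
                      min_instances=1, max_instances=20):
    """Scale the instance count so average utilization moves toward the target.

    All default values are illustrative assumptions.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = (current * utilization_pct + target_pct - 1) // target_pct
    return max(min_instances, min(max_instances, wanted))


def pay_per_use_cost(instances_per_hour, cents_per_instance_hour):
    """Cloud model: pay only for the instances actually running each hour."""
    return sum(instances_per_hour) * cents_per_instance_hour


def fixed_capacity_cost(instances_per_hour, cents_per_instance_hour):
    """Conventional model: capacity for the peak hour is paid for every hour."""
    peak = max(instances_per_hour)
    return peak * len(instances_per_hour) * cents_per_instance_hour
```

For a hypothetical demand of `[2, 2, 6, 2]` instances over four hours at 10 cents per instance-hour, the pay-per-use cost is 120 cents, while sizing for the peak costs 240 cents.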
Types of Cloud Provisioning
Network Provisioning: In the telecom industry, network provisioning refers to
providing telecommunications services to a client.

Server Provisioning: Setting up the data center's physical infrastructure, installing
and configuring the software, and linking it to middleware, networks, and storage.

User Provisioning: A method of identity management that keeps a check on access
rights and authorization privileges. Provisioning is characterized by
artifacts such as equipment, suppliers, etc.

Service Provisioning: It requires setting up a service and handling its related data.
Tools and Software Used in Cloud Provisioning
Enterprises can provision services and resources manually as needed, whereas
public cloud providers offer tools to provision various resources and
services, such as:

1. IBM Cloud Orchestrator
2. CloudBolt
3. Morpheus Data
4. Flexera
5. CloudSphere
6. Scalr
7. Google Cloud Deployment Manager
Data Intensive Technology in Cloud Computing
Data Intensive Computing is a class of parallel computing which uses data parallelism
in order to process large volumes of data. The size of this data is typically in terabytes
or petabytes. This large amount of data is generated each day and is referred to as Big
Data.

Data intensive computing has some characteristics which are different from other
forms of computing. They are:
1. In order to achieve high performance in data intensive computing, it is necessary to
minimize the movement of data. This reduces system overhead and increases
performance by allowing the algorithms to execute on the node where the data
resides.
2. The data intensive computing system utilizes a machine independent approach
where the run time system controls the scheduling, execution, load balancing,
communications and the movement of programs.
3. Data intensive computing focuses heavily on reliability and availability of data.
Traditional large scale systems may be susceptible to hardware failures,
communication errors and software bugs, and data intensive computing is designed
to overcome these challenges.

4. Data intensive computing is designed for scalability so that it can accommodate any
amount of data and meet time-critical requirements. Scalability of the
hardware as well as the software architecture is one of the biggest advantages of
data intensive computing.
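The first characteristic, running the algorithm on the node where the data resides, can be sketched as a word count. This is a minimal illustration, not a real framework: the `Node` class and `word_count` function are hypothetical names, and a plain loop stands in for nodes that would work in parallel.

```python
from collections import Counter


class Node:
    """A machine that stores one partition of the data set (assumption)."""

    def __init__(self, name, lines):
        self.name = name
        self.lines = lines  # the partition held locally on this node

    def count_words_locally(self):
        # The computation runs where the data lives; the raw lines never move.
        counts = Counter()
        for line in self.lines:
            counts.update(line.lower().split())
        return counts


def word_count(nodes):
    """Combine the small per-node summaries into the final result."""
    total = Counter()
    for node in nodes:  # in a real system these calls would run in parallel
        total.update(node.count_words_locally())
    return total
```

Only the compact per-node counters travel between machines, so the volume of data moved is much smaller than the data set itself.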
Cloud Storage from LANs to WANs:
Characteristics:
1. Compute power is elastic, but only when the workload can be performed in
parallel. In general, applications designed to run on top of a shared-nothing
architecture are well suited to such an environment. Some cloud computing
products, for example Google's App Engine, supply not only a cloud computing
infrastructure but also an entire software stack with a restricted API, so that
software developers are compelled to write programs that can run in a
shared-nothing environment and therefore support elastic scaling.
2. Data is stored at an unknown host server. In general, letting go of data
exposes it to many security risks, so suitable precautions should be taken.
The very name 'cloud computing' suggests that the computing and storage resources
are operated from a celestial location. In reality, the data is
physically stored in a specific host country and is subject to local laws and
regulations. Since most cloud computing vendors give their customers little
control over where data is stored, customers have no alternative but to assume
the worst: unless the data is encrypted using a key that is not available at the host, the
data may be accessed by a third party without the customer's knowledge.
3. Data is replicated, often across distant locations. Data availability and
durability are paramount for cloud storage providers, as data loss or
unavailability can damage both the business and the organization's reputation. Data
availability and durability are normally achieved through behind-the-scenes
replication. Large cloud computing providers with data centers spread
throughout the world are able to provide high levels of fault
tolerance by replicating data at distant locations across continents. Amazon's
S3 cloud storage service replicates data across 'regions' and 'availability zones' so
that data and applications can survive even when an entire location fails.
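Two of the characteristics above, shared-nothing partitioning and replication across distant locations, can be combined in a short Python sketch. It is a toy model under stated assumptions: `Region` and `GeoReplicatedStore` are hypothetical names, dictionaries stand in for independent servers, and a region that is down during a write simply misses it (there is no catch-up logic).

```python
import zlib


class Region:
    """One location holding its own set of independent shards."""

    def __init__(self, name, shard_count=4):
        self.name = name
        self.available = True
        # Shared-nothing: each shard has its own private dict, nothing is shared.
        self.shards = [dict() for _ in range(shard_count)]

    def _shard(self, key):
        # Hash partitioning: every key belongs to exactly one shard.
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def put(self, key, value):
        self._shard(key)[key] = value

    def get(self, key):
        return self._shard(key).get(key)


class GeoReplicatedStore:
    """Writes go to every available region; reads use the first one that is up."""

    def __init__(self, regions):
        self.regions = regions

    def put(self, key, value):
        for region in self.regions:
            if region.available:
                region.put(key, value)

    def get(self, key):
        for region in self.regions:
            if region.available:
                return region.get(key)
        raise RuntimeError("no region available")
```

Because shards share nothing, more shards can be added to spread the load, and because each region holds a full copy, a read still succeeds when one region is marked unavailable. Plain modulo hashing would remap most keys when the shard count changes; production systems typically use schemes such as consistent hashing to limit that movement.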
Distributed Data Storage:
Distributed storage systems are evolving from existing data storage
practices to serve the new generation of web applications at organizations like
Google, Amazon and Yahoo. Distributed storage systems are favoured over
traditional relational database systems for several reasons, including
scalability, availability and performance. The new generation of applications
requires processing data on the order of terabytes and even petabytes. This is
accomplished by distributed services, and distributed services mean distributed
data. This is a distinctly different beast from traditional relational database systems.
Several studies have suggested that this is the end of an architectural era and
that relational database systems have to be rewritten. Emerging solutions are Amazon
Dynamo, CouchDB and ThruDB.
13.3.1 Amazon Dynamo
Amazon Dynamo is a widely used key-value store. It is one of the main
components of Amazon.com, one of the biggest e-commerce stores in the world. It has
a primary-key-only interface. This means that data is stored as key-value
pairs, and the only way to access data is by specifying the key. Values
are expected to be small (less than 1 MB).

Dynamo is said to be highly available for writes as opposed to reads,
since a failed write inconveniences the end user of the application.
Therefore, any data conflicts are resolved at the time of reading rather than
writing.
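The idea of accepting every write and resolving conflicts at read time can be sketched as follows. This is a heavily simplified, single-process illustration and not Dynamo's implementation: Dynamo tracks causality with vector clocks across many replicas, whereas this sketch assumes a single integer version per key, and the class name `ReconcilingStore` and the shopping-cart usage are hypothetical.

```python
class ReconcilingStore:
    """Key-value store that never rejects a write and reconciles on read."""

    def __init__(self, merge):
        self._merge = merge   # application-supplied function: (a, b) -> merged
        self._siblings = {}   # key -> list of conflicting values
        self._version = {}    # key -> counter bumped on every write

    def put(self, key, value, context=None):
        """Always succeeds. `context` is the version returned by an earlier get."""
        current = self._version.get(key, 0)
        if context == current:
            # The writer saw everything stored so far, so its value replaces it.
            self._siblings[key] = [value]
        else:
            # Stale or blind write: keep it alongside the others as a sibling.
            self._siblings.setdefault(key, []).append(value)
        self._version[key] = current + 1

    def get(self, key):
        """Return (value, context); conflicting siblings are merged here."""
        values = self._siblings.get(key, [])
        if not values:
            return None, self._version.get(key, 0)
        merged = values[0]
        for other in values[1:]:
            merged = self._merge(merged, other)
        self._siblings[key] = [merged]
        return merged, self._version[key]
```

With `merge=lambda a, b: a | b` and sets of cart items as values, two clients that both add an item starting from the same cart end up with a cart containing both items on the next read, and neither write was rejected.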
13.3.2 CouchDB
CouchDB is a document-oriented database server, accessible through REST APIs.
Couch is an acronym for 'Cluster Of Unreliable Commodity Hardware',
emphasizing the distributed nature of the database. CouchDB is designed
for document-oriented applications, for example, forums, bug tracking, wikis,
Internet notes, etc. CouchDB is ad-hoc and schema-free with a flat address space.

CouchDB aims to satisfy the Four Pillars of Data Management by these
methods:

Save: ACID compliant, saves efficiently
See: Easy retrieval, straightforward reporting methods, full-text search
Secure: Strong compartmentalization, ACL, connections over SSL
Share: Distributed operation
The storage model is a Multiversion Concurrency Control (MVCC) system with
optimistic locking. A client sees a snapshot of the data and works with it even
if it is altered at the same time by a different client.
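Optimistic locking can be sketched with a small in-memory document store. CouchDB itself tracks a `_rev` value on each document and rejects an update that names a stale revision; the sketch below only mimics that behaviour with integer revisions, and the names `DocumentStore` and `ConflictError` are hypothetical.

```python
class ConflictError(Exception):
    """Raised when a writer's revision is no longer the current one."""


class DocumentStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (revision, body)

    def read(self, doc_id):
        revision, body = self._docs[doc_id]
        # Return a copy so the client works on its own snapshot.
        return revision, dict(body)

    def write(self, doc_id, body, expected_rev=None):
        current_rev = self._docs[doc_id][0] if doc_id in self._docs else None
        if expected_rev != current_rev:
            # Someone else updated the document after this client read it.
            raise ConflictError(f"{doc_id}: expected revision {expected_rev}, "
                                f"found {current_rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev
```

No lock is held while a client edits its snapshot. The check happens only at write time, so readers never block and a losing writer must re-read the document and retry.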

CouchDB has no separate authentication system; authentication is built in.
Replication is distributed. A server can update others after it has been
offline and its data has changed. If there are conflicts, CouchDB will choose a
winner and keep that as the latest revision. Users can manually override this winning
choice later. Importantly, conflict resolution yields identical results
on every server, so offline revisions are reconciled consistently. There are also plans to
write a storage engine for MySQL based on CouchDB.
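What makes the choice of winner safe is that it is deterministic: every server that sees the same set of conflicting revisions picks the same one without coordinating. The sketch below assumes the rule described in CouchDB's documentation (the revision with the longest history wins, and ties are broken by comparing revision IDs) and ignores deleted revisions; the function name and the tuple layout are hypothetical.

```python
def pick_winner(revisions):
    """Choose one revision from a list of (history_length, revision_id) tuples.

    Tuples compare element by element, so the longest history wins and the
    revision ID breaks ties. The result does not depend on input order.
    """
    return max(revisions)
```

Since `max` depends only on the values and not on the order in which revisions arrived, two servers that replicated the same conflicting edits in a different order still agree on the winner. The losing revisions are kept, which is what allows a user to override the choice later.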
13.3.3 ThruDB
ThruDB aims to be a universal layer that simplifies the management of the
modern web data layer (indexing, caching, replication, backup) by providing a
consistent set of services:

Thrucene for indexing
Throxy for partitioning and load balancing
Thrudoc for document storage

ThruDB builds on top of several open source projects: Thrift, Lucene (indexing),
Spread (message bus), Memcached (caching), Brackup (backup to disk/S3), and
also uses Amazon S3.
Thrift is a framework for efficient cross-language data serialization, RPC and
server programming. Thrift is a software library and set of code-generation
tools designed to expedite the development and implementation of efficient and
scalable backend services. Its primary goal is to enable efficient and reliable
communication across programming languages. This is done by abstracting the
portions of each language that tend to require the most customization into a
common library that is implemented in each language. Specifically, Thrift allows
developers to define data types and service interfaces in a single
language-neutral file and generate all the code necessary to build RPC
clients and servers.
Thrudoc comes with several data storage engines: Disk and S3. In the combined
implementation, the data is persisted on local disk, which
gives incredible throughput capacity, while a background (slave) thread quietly
replays all of the commands to the S3 backend as well, giving a
worry-free persistence and recovery model for virtual environments, for
example, EC2.
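The disk-plus-S3 arrangement is a write-behind pattern: writes finish against fast local storage, and a background thread replays them to the slower durable backend. The sketch below is an illustration of that pattern only, not Thrudoc's code. `WriteBehindStore` is a hypothetical name, and plain dictionaries stand in for the local disk and for S3.

```python
import queue
import threading


class WriteBehindStore:
    def __init__(self, remote):
        self.local = {}        # stands in for the fast local disk
        self.remote = remote   # stands in for the S3 backend
        self._log = queue.Queue()
        worker = threading.Thread(target=self._replay, daemon=True)
        worker.start()

    def put(self, key, value):
        # The caller only waits for the local write.
        self.local[key] = value
        self._log.put((key, value))

    def get(self, key):
        return self.local.get(key)

    def _replay(self):
        # Background thread: replay every logged command to the backend.
        while True:
            key, value = self._log.get()
            self.remote[key] = value
            self._log.task_done()

    def flush(self):
        """Block until every logged write has reached the backend."""
        self._log.join()

    @classmethod
    def recover(cls, remote):
        """Rebuild a fresh instance (for example a new VM) from the backend."""
        store = cls(remote)
        store.local = dict(remote)
        return store
```

The trade-off is a short window in which a write exists only locally. If the machine is lost before the replay catches up, those last writes are lost, which is the price paid for local-disk throughput.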
THANK YOU!!!
