MC5501 Cloud Computing
Presented By
Prof. S. Sajithabanu, M.Tech., (Ph.D.)
UNIT I - CLOUD ARCHITECTURE AND MODEL
• Processor speed growth is plotted in the upper curve of Figure 1.4 across
generations of microprocessors and CMPs. Speed grew from 1 MIPS for the
VAX 780 in 1978 to 1,800 MIPS for the Intel Pentium 4 in 2002, and on to a peak
of 22,000 MIPS for the Sun Niagara 2 in 2008.
• As the figure shows, Moore’s law has proven fairly accurate in this case. The
clock rate of these processors increased from 10 MHz for the Intel 286 to 4
GHz for the Pentium 4 over roughly 30 years.
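As a rough sanity check on the Moore's law claim, the MIPS figures quoted above imply an average performance-doubling period that can be computed directly (this arithmetic is an illustration, not from the slides):

```python
import math

def doubling_period_years(perf_start, perf_end, years):
    """Average time for performance to double, given two endpoint measurements."""
    doublings = math.log2(perf_end / perf_start)
    return years / doublings

# 1 MIPS (VAX 780, 1978) -> 1,800 MIPS (Pentium 4, 2002)
p4 = doubling_period_years(1, 1800, 2002 - 1978)
# 1 MIPS (VAX 780, 1978) -> 22,000 MIPS (Sun Niagara 2, 2008)
niagara = doubling_period_years(1, 22000, 2008 - 1978)

print(round(p4, 2), round(niagara, 2))  # → 2.22 2.08
```

Both intervals come out close to the roughly two-year doubling period associated with Moore's law, which is why the figure tracks it well.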
System Models for Distributed and Cloud Computing
Classification of Distributed Computing Systems
• Distributed computing systems can be classified into four groups: clusters,
peer-to-peer (P2P) networks, grids, and clouds.
• A computing cluster consists of interconnected stand-alone computers that work
cooperatively as a single integrated computing resource. The compute nodes are
connected by a LAN or SAN, are typically homogeneous, and run Unix/Linux under
distributed control. Clusters are well suited to high-performance computing (HPC).
Peer-to-peer (P2P) Networks
• In a P2P network, every node (peer) acts as both a client and server. Peers act
autonomously to join or leave the network. No central coordination or central database is
needed. No peer machine has a global view of the entire P2P system. The system is self-
organizing with distributed control.
• Unlike a cluster or grid, a P2P network does not use a dedicated interconnection
network. A typical application is distributed file sharing: content distribution of MP3
music, video, etc., e.g., Gnutella, Napster, BitTorrent.
Computing Grids
• Like an electric utility power grid, a computing grid offers an infrastructure that
couples computers, software/middleware, people, and sensors together.
• The computers used in a grid include servers, clusters, and supercomputers. PCs,
laptops, and mobile devices can be used to access a grid system.
Clouds
• A Cloud is a pool of virtualized computer resources. A cloud can host a variety of
different workloads, including batch-style backend jobs and interactive and user-
facing applications.
• Workloads can be deployed and scaled out quickly through rapid provisioning of
VMs. Virtualization of server resources has made cloud systems cost-effective,
lowering costs for both users and providers.
• A cloud system should be able to monitor resource usage in real time to enable
rebalancing of allocations when needed.
REST vs. SOAP
• REST supports many data formats, whereas SOAP allows only XML.
• REST supports JSON, a smaller data format that parses faster than the XML used
by SOAP.
• REST offers superior performance, particularly through caching of information
that is static and rarely altered.
• REST is used most often by major services such as Amazon and Twitter.
• REST is generally faster and uses less bandwidth.
• SOAP provides robust security through WS-Security, which makes it useful for
enterprise applications such as banking and financial apps; REST relies only on SSL/TLS.
• SOAP offers built-in retry logic to compensate for failed communications.
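The payload-size point above can be made concrete with a small sketch that serializes the same record as JSON and as XML using only the standard library (the record and its field names are illustrative assumptions):

```python
import json
import xml.etree.ElementTree as ET

record = {"id": 42, "user": "alice", "balance": 1050.75}

# JSON: compact key/value serialization, as a REST response would carry it
json_payload = json.dumps(record)

# XML: the same record wrapped in paired open/close tags, as a SOAP body would carry it
root = ET.Element("record")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
xml_payload = ET.tostring(root, encoding="unicode")

print(len(json_payload), len(xml_payload))  # the XML encoding is noticeably larger
```

The XML version repeats every field name twice (opening and closing tag), which is one reason JSON payloads are smaller and cheaper to parse.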
Performance Metrics and Scalability Analysis
• Performance Metrics:
• CPU speed: MHz or GHz, SPEC benchmarks like SPECINT
• Network Bandwidth: Mbps or Gbps
• System throughput: MIPS, TFlops (tera floating-point operations per second), TPS
(transactions per second), IOPS (IO operations per second)
• Other metrics: Response time, network latency, system availability
• Scalability:
• Scalability is the ability of a system to handle a growing amount of work
efficiently, or its ability to be enlarged to accommodate that growth.
• For example, it can refer to the capability of a system to increase total throughput
under an increased load when resources (typically hardware) are added.
Scalability
• One form of scalability for parallel and distributed systems is:
• Size Scalability
This refers to achieving higher performance or more functionality by increasing the
machine size. Size in this case refers to adding processors, cache, memory, storage, or
I/O channels.
• Scale Horizontally and Vertically
Methods of adding more resources for a particular application fall into two broad
categories:
Scale Horizontally
To scale horizontally (or scale out) means to add more nodes to a system, such as
adding a new computer to a distributed software application. An example might be
scaling out from one Web server system to three.
The scale-out model has created an increased demand for shared data storage with
very high I/O performance, especially where processing of large amounts of data is
required.
Scale Vertically
To scale vertically (or scale up) means to add resources to a single node in a system,
typically involving the addition of CPUs or memory to a single computer.
Tradeoffs
There are tradeoffs between the two models. A larger number of computers means
increased management complexity, as well as a more complex programming model
and issues such as throughput and latency between nodes.
Also, some applications do not lend themselves to a distributed computing model.
In the past, the price difference between the two models favored "scale up"
computing for applications that fit its paradigm, but recent advances in
virtualization technology have blurred that advantage, since deploying a new virtual
system/server over a hypervisor is almost always less expensive than buying
and installing a physical one.
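The scale-out idea can be sketched minimally as a load balancer spreading requests across interchangeable nodes; the server names and the round-robin policy below are illustrative assumptions, not from the slides:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distributes incoming requests across a pool of identical nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._ring = cycle(self.nodes)

    def route(self):
        # Each request goes to the next node in the ring
        return next(self._ring)

    def scale_out(self, node):
        # Horizontal scaling: add another node and rebuild the ring
        self.nodes.append(node)
        self._ring = cycle(self.nodes)

lb = RoundRobinBalancer(["web1"])
assignments = [lb.route() for _ in range(3)]  # a single server takes every request
lb.scale_out("web2")
lb.scale_out("web3")                          # "one Web server system to three"
assignments += [lb.route() for _ in range(3)]
print(assignments)  # → ['web1', 'web1', 'web1', 'web1', 'web2', 'web3']
```

Note that the nodes share no state here; real scale-out deployments push shared state into the high-I/O storage tier mentioned above, which is exactly why that tier becomes the bottleneck.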
Amdahl’s Law
It is typically cheaper to add a new node to a system in order to achieve improved
performance than to perform performance tuning to improve the capacity that each
node can handle. But this approach can have diminishing returns as indicated by
Amdahl’s Law.
Assume that a fraction α of the code must be executed sequentially, called the
sequential block. The remaining (1 − α) of the code can then be compiled for parallel
execution by n processors. The total execution time of the program is:
α T + (1 − α) T / n
where the first term is the sequential execution time on a single processor and the
second term is the parallel execution time on n processing nodes.
The speedup over a single processor is therefore S = T / [α T + (1 − α) T / n]
= 1 / [α + (1 − α)/n], which approaches 1/α no matter how many processors are added.
All system and communication overhead is ignored here. The I/O and exception
handling time is also not included in the speedup analysis.
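The diminishing returns predicted by Amdahl's law can be seen numerically. A minimal sketch of the speedup S = 1 / (α + (1 − α)/n) implied by the execution-time formula above:

```python
def amdahl_speedup(alpha, n):
    """Speedup on n processors when a fraction alpha of the code is sequential."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

# With alpha = 0.25, adding processors yields diminishing returns:
for n in (1, 2, 4, 16, 1024):
    print(n, round(amdahl_speedup(0.25, n), 2))
# → 1 1.0 / 2 1.6 / 4 2.29 / 16 3.37 / 1024 3.99
# Even with 1024 processors the speedup stays below the 1/alpha = 4 ceiling.
```

This is why simply adding nodes is cheaper than per-node tuning only up to a point: once n is large, the sequential fraction α dominates the total execution time.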