REPLICATION


INTRODUCTION
EXAMPLES OF REPLICATION
RELATIONS TO OTHER DISTRIBUTED SYSTEM TOPICS
REASONS FOR REPLICATION
OBJECT REPLICATION
REPLICATION AS SCALING TECHNIQUE
FAULT TOLERANT SERVICE
HIGH AVAILABILITY SERVICE
TRANSACTION WITH REPLICATED DATA
RELATIONS TO OTHER DISTRIBUTED SYSTEM TOPICS

• Fault tolerance: several nodes hold the same data, so if one node goes down the other nodes keep serving requests for resources.
• Concurrency: many clients may request the same resource at once, e.g. bids in an auction or forex trading, so access to the replicas must be coordinated.
• Scalability: more nodes or components can be integrated into, or patched into, the existing system.
• Security: write operations such as transactions need to be locked while they are in progress so that consistency is preserved.
• Interprocess communication: asynchronous activity.
• Latency: delays in message transport.
• Other design principles for distributed systems.
INTRODUCTION

• Replication is the copying of data or state from one component to another, e.g. bank account details or transactions.
• Its main goals are fault tolerance, availability and load balancing.
• Synchronous replication ensures all replicas are up to date before an update is acknowledged, e.g. banking transactions.
• Asynchronous replication allows replicas to lag behind for a while, e.g. Google Drive.
• Replication is a way to scale distributed systems and make them reliable.
Examples of Replication

• Social networks: Facebook, Twitter
• E-commerce: Amazon, eBay
• Banking systems: Visa, Mastercard
• Cloud platforms: Oracle Cloud, Microsoft
[Diagram: full replication process in a distributed system]
Reasons for replication

• Increased reliability: if one replica is unavailable or crashes, the system uses another.
• Improved performance: a client can access the nearest or least loaded replica, minimizing network latency and demand on resources.
• The key requirement is to maintain the consistency of the replicated data.

Object Replication

• Object replication is a fundamental technique used in distributed systems to enhance availability, fault tolerance and performance.
• It involves creating and maintaining multiple copies of an object or data item across different nodes or servers in a distributed system.
• Distributed databases often use replication to improve availability and fault tolerance, e.g. Apache Cassandra.
• Peer-to-peer networks such as BitTorrent replicate files across multiple peers.
• Blockchain technology uses object replication to create a distributed ledger that maintains a consistent, replicated copy of transactions across multiple nodes.
• The specific implementation and replication strategy can vary depending on the system requirements, consistency guarantees and performance objectives.
[Diagram: object replication]
Replication as Scaling Technique

• Replication helps with scaling in two main ways:
1. Load balancing and performance improvement: by distributing read operations across multiple replicas, replication eases the load on any single server. This prevents bottlenecks and improves overall system performance. For example, replicating a website's data across multiple servers lets users fetch content from the closest or least busy server, improving responsiveness.
2. Scalability by geographic distribution: data replication can be used strategically to place copies of data closer to where they are most needed. This reduces access latency, especially for geographically dispersed users. For instance, a multinational company might replicate customer data in regional data centers to provide faster access for local customers.
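
The two ideas above (nearest replica, then least busy replica) can be sketched in a few lines of Python. This is only an illustration: the `Replica` class, the `region` labels and the `active_requests` counter are assumptions made for the example and do not come from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical description of one replica server.
    name: str
    region: str
    active_requests: int = 0


def pick_replica(replicas, client_region):
    """Prefer replicas in the client's region, then take the least loaded one."""
    local = [r for r in replicas if r.region == client_region]
    candidates = local or replicas  # fall back to every replica if none is local
    return min(candidates, key=lambda r: r.active_requests)


replicas = [
    Replica("eu-1", "eu", active_requests=7),
    Replica("eu-2", "eu", active_requests=2),
    Replica("us-1", "us", active_requests=0),
]
chosen = pick_replica(replicas, "eu")
```

A client in region "eu" is routed to the less busy of the two European replicas even though a replica elsewhere is idle, which is the trade-off between latency and load that the slide describes.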
Common replication challenges

1. Synchronization overhead: keeping the replicas in sync can be a resource-intensive task, especially for large datasets.
2. Data consistency: ensuring data consistency across replicas can be tricky, especially if the system is distributed across multiple geographical locations.
3. Replication lag: there can be a delay between when data is updated on one replica and when it is updated on the other replicas.
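
Replication lag is easiest to see by comparing a synchronous write with an asynchronous one. The sketch below is a single-process toy, not a real replication protocol: the `Primary` and `Replica` classes and the explicit `flush` step are assumptions introduced for illustration.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}


class Primary:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # updates accepted but not yet sent to replicas

    def write_sync(self, key, value):
        """Synchronous: every replica is updated before the write returns."""
        self.data[key] = value
        for replica in self.replicas:
            replica.data[key] = value

    def write_async(self, key, value):
        """Asynchronous: the write returns at once; replicas catch up later."""
        self.data[key] = value
        self.pending.append((key, value))

    def flush(self):
        """Ship queued updates to the replicas (stands in for background sync)."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


replica = Replica()
primary = Primary([replica])
primary.write_sync("balance", 100)   # replica sees 100 immediately
primary.write_async("balance", 80)   # replica still holds the old value here
```

Until `primary.flush()` runs, a client reading from the replica gets the stale value. That window is the replication lag, and it is the price paid for the faster asynchronous write.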
Fault Tolerant Service

How replication achieves fault tolerance:

• Failover: when a node or replica fails, another replica can take over so that the service remains operational. Clients can be directed to the available replicas.
• Data durability: keeping multiple copies of the data reduces the risk of losing it to hardware failures, network failures or other localized failures.
• Load balancing: when one node becomes overwhelmed by a sudden change in demand, the load is distributed among the remaining replicas.
• Geographical redundancy: additional fault tolerance against events such as natural disasters.
• Fast recovery: the system recovers by switching to a replica rather than restoring the service from data backups, so downtime is kept to a minimum.
High Availability Services

• Highly available services are systems designed to provide continuous and uninterrupted access to services, even in the presence of failures or disruptions.
• The goal is to minimize downtime and ensure that services remain accessible and operational for users.
• Techniques to achieve HA:
• Redundancy: replicating data across multiple nodes or machines. Redundancy ensures that the system remains available even in the event of failures.
• Load balancing: distributing the workload evenly across multiple replicas. This helps prevent any replica from becoming overwhelmed with requests and improves performance and resource utilization.
• Failover and failback: failover is the process of switching to a backup replica when the primary fails. Failback is the process of returning to the primary replica after it has recovered. These mechanisms ensure the system can quickly recover from failures and restore normal operations.
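
Failover and failback can be sketched as a routing decision based on a health flag. The `Node` and `FailoverRouter` names are hypothetical, and the `healthy` attribute stands in for whatever health check a real deployment would use.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.healthy = True  # assumed to be maintained by some health check


class FailoverRouter:
    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def active(self):
        """Return the node that should currently receive requests."""
        if self.primary.healthy:
            return self.primary  # normal operation, or failback after recovery
        if self.backup.healthy:
            return self.backup   # failover: the primary is down
        raise RuntimeError("no healthy replica available")


primary = Node("primary")
backup = Node("backup")
router = FailoverRouter(primary, backup)
```

Setting `primary.healthy = False` makes `router.active()` return the backup (failover). Setting it back to `True` returns traffic to the primary (failback).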
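TRANSACTION WITH REPLICATED DATA
Replication Schemes

• Read one, write all
  - Cannot handle network partitions.
  - Network partitions are network failures that cut groups of servers off from one another, e.g. through a configuration error or an overloaded or unavailable link.
  - Two-phase commit keeps the replicas consistent when every server can be reached, but it does not handle network partitions by itself: an update blocks until all servers respond.
• Schemes that can handle network partitions
  - Available copies with validation
  - Quorum consensus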
Read one, write all

• The effect of transactions on replicated objects should be the same as if they had been performed one at a time on a single set of objects.
• This property is called one-copy serializability.
• Each write operation sets a write lock at every replica manager, e.g. in FX trading.
• Each read operation sets a read lock at one replica manager.
• This happens in multi-master replication.
• The idea is that each server must act as if it were the only server writing to the database, with the other servers just reading from it. Every update to the database must be replicated to all other servers before being committed, using an algorithm such as two-phase commit.
• Changes are not immediately visible to other servers; they are first written to log files, which act as a record of the change.
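
The locking rule for read one, write all (a read lock at one replica manager, a write lock at every replica manager) can be sketched as follows. This is a single-threaded toy under stated assumptions: the `ReplicaManager` class and the helper functions are invented for the example, and values are written in place rather than staged in a log until commit.

```python
class ReplicaManager:
    """One copy of the object plus a very small read/write lock table."""

    def __init__(self):
        self.value = None
        self.readers = set()  # transactions holding a read lock here
        self.writer = None    # transaction holding the write lock here

    def can_read(self, txn):
        return self.writer in (None, txn)

    def can_write(self, txn):
        return self.writer in (None, txn) and not (self.readers - {txn})


def rowa_read(managers, txn):
    """Read one: take a read lock at a single replica manager and read it."""
    rm = managers[0]
    if not rm.can_read(txn):
        raise RuntimeError("read blocked by a write lock")
    rm.readers.add(txn)
    return rm.value


def rowa_write(managers, txn, value):
    """Write all: proceed only if every replica manager grants the write lock."""
    if not all(rm.can_write(txn) for rm in managers):
        return False
    for rm in managers:
        rm.writer = txn
        rm.value = value
    return True


def release(managers, txn):
    """Drop all locks held by a transaction (called at commit or abort)."""
    for rm in managers:
        rm.readers.discard(txn)
        if rm.writer == txn:
            rm.writer = None


managers = [ReplicaManager() for _ in range(3)]
```

Because a write must lock every replica and a read locks at least one, a read and a write on the same object always conflict at some replica manager. That overlap is what gives the scheme one-copy serializability.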

Two-phase commit

• A method of achieving one-copy serializability in multi-master replication.
• It has two phases: prepare and commit.
• Prepare phase: a server that wants to change the database sends a message to the other servers and waits for a positive acknowledgement from each. If all of them acknowledge, it proceeds to the commit phase; if any refuses, the change is aborted.
• Commit phase: the server initiating the change sends a commit message to the other servers. The change is considered complete once all of them have received the commit message and responded with a positive acknowledgement.
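
The prepare and commit phases can be sketched as below. This is a minimal in-memory illustration, assuming reliable message delivery and no crashes; the `Participant` class and its `will_accept` flag are hypothetical stand-ins for a server's real vote.

```python
class Participant:
    def __init__(self, name, will_accept=True):
        self.name = name
        self.will_accept = will_accept  # assumed outcome of this server's vote
        self.staged = None              # change received in the prepare phase
        self.committed = None           # change made permanent by commit

    def prepare(self, change):
        """Phase 1: stage the change and vote yes, or vote no."""
        if self.will_accept:
            self.staged = change
        return self.will_accept

    def commit(self):
        """Phase 2: make the staged change permanent and acknowledge."""
        self.committed = self.staged
        self.staged = None
        return True

    def abort(self):
        self.staged = None


def two_phase_commit(participants, change):
    # Prepare phase: collect a vote from every participant.
    votes = [p.prepare(change) for p in participants]
    if not all(votes):
        for p in participants:
            p.abort()
        return False
    # Commit phase: every participant must acknowledge the commit.
    acks = [p.commit() for p in participants]
    return all(acks)


servers = [Participant("a"), Participant("b"), Participant("c")]
ok = two_phase_commit(servers, "x=1")
```

If even one participant votes no in the prepare phase, nothing is committed anywhere, which keeps all copies identical.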

Available copies with validation

• Each server keeps track of the number of copies of the database that are currently available.
• When a change is requested, the number of available copies is checked. If there are not enough, the change is delayed until there are.
• This ensures data consistency.
• The required number of copies is based on a quorum factor.

Quorum consensus

• Consensus means that the servers come to an agreement.
• It ensures that all servers have the same view of the database.
• A server keeps track of the other servers it has communicated with and uses this information to form a quorum: the minimum number of servers that must agree to a set of updates before they can be applied to the database.
• If agreement is not reached, the servers continue to communicate with each other until it is.
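
A quorum can be sketched with version numbers and separate read and write quorum sizes. This goes slightly beyond the slide: the version counter and the condition that the read quorum plus the write quorum exceed the number of servers are standard assumptions for quorum schemes, and the `Server` class and function names are invented for the example.

```python
class Server:
    def __init__(self):
        self.version = 0       # version number of the stored value
        self.value = None
        self.reachable = True  # False simulates a crashed or partitioned server


def quorum_write(servers, value, write_quorum):
    """Apply the update only if at least write_quorum servers can take part."""
    reachable = [s for s in servers if s.reachable]
    if len(reachable) < write_quorum:
        return False  # no quorum: the update is not applied anywhere
    new_version = max(s.version for s in reachable) + 1
    for s in reachable:
        s.version, s.value = new_version, value
    return True


def quorum_read(servers, read_quorum):
    """Contact read_quorum servers and return the value with the newest version."""
    reachable = [s for s in servers if s.reachable]
    if len(reachable) < read_quorum:
        raise RuntimeError("read quorum not available")
    newest = max(reachable[:read_quorum], key=lambda s: s.version)
    return newest.value


# Five servers with read and write quorums of three, so 3 + 3 > 5 and
# every read quorum overlaps every write quorum in at least one server.
servers = [Server() for _ in range(5)]
```

With these sizes a write still succeeds when two servers are unreachable, and a later read that contacts any three servers is guaranteed to include at least one that saw the write.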

Virtual Partition

• Concerned with how nodes reach consensus in the event of virtual partitions.
• A virtual partition is a situation in which the servers involved cannot reach consensus even though they are connected to the network.
• In short, communication is lost because of misconfiguration.
• Related: the Paxos algorithm.
Case study: forex trading

• In live trading, market data is constantly being generated and updated.
• This data is replicated to a large number of servers so that multiple clients can access it at the same time.
• This supports data consistency and availability: even when some nodes are down, clients continue to receive service without noticing that part of the distributed system is down.
Questions

• What is the purpose of replication?
• What are the benefits of replication?
• How does replication work in a distributed system?
• What are the design considerations of a replication system?
• What happens when a node is added to or removed from the replication system?
• How do you enforce security on a replication model?
