
7. Consistency Replication

The document discusses the importance of data replication for reliability and performance in distributed systems, highlighting the trade-offs involved. It explains various consistency models, including strong, sequential, causal, and eventual consistency, along with their respective advantages and disadvantages. Additionally, it covers concepts such as the CAP theorem, monotonic reads and writes, and different replication protocols.

Uploaded by

Hoang do

Consistency And Replication

Reasons for Replication


• Data are replicated to increase the reliability of a system.
• Replication for performance
▪ Scaling in numbers
▪ Scaling in geographical area
• Caveat
▪ Gain in performance
▪ Cost of increased bandwidth for maintaining replication
Data-centric Consistency Models
• The general organization of a logical data store, physically
distributed and replicated across multiple processes.
CAP Theorem
• The CAP Theorem states that a
distributed system can guarantee at
most two of the following three
properties:
• Consistency (C),
• Availability (A),
• Partition Tolerance (P).
• In practice, this means choosing
between consistency and availability
during network partitions.
Consistency models
• Consistency models describe how distributed systems manage data
across multiple nodes.
• They provide a contract between the system and the programmer, defining
how data should behave when read, written or updated
• Strong consistency guarantees any read reflects the most recent write,
regardless of which node is accessed
• Eventual consistency guarantees all nodes will converge to the same state
over time but allows temporary discrepancies between them
• Four commonly used consistency models: Strong Consistency,
Sequential Consistency, Causal Consistency, and Eventual
Consistency.
• Strong consistency: also known as linearizability, guarantees that all clients see the
latest data immediately after a write.
• Every operation occurs instantaneously at some point between its invocation and
completion, behaving as if there were a single copy of the data.
• This approach guarantees a unified, real-time view of data across all nodes, making it
easier to reason about correctness in distributed systems.
• Examples:
• Financial transactions and ledgers: real-time correctness and strict ordering of operations are
critical to prevent anomalies such as double-spending or inconsistent balances.
• Distributed coordination: Systems like leader election and distributed locks require strict
synchronization to ensure tasks or resources are managed consistently across nodes.
• Configuration and metadata management: Distributed systems often rely on shared configuration
or metadata (e.g. system settings, quotas, or state information). Strong consistency ensures updates
to these values are immediately synchronized across all nodes, preventing conflicting states.
Consistency models
Strong consistency: Trade-offs
• Pros
• Guarantees correctness by eliminating stale reads and preventing anomalies like
double-spending.
• Simplifies application logic with consistent data views.
• Cons
• High latency due to quorum-based operations required for synchronization.
• Reduced availability during network partitions, as writes may block until consensus is
achieved.
Consistency models
• Sequential consistency: ensures that all processes observe all operations
in the same single order.
• The execution results are as if all operations were executed one at a time
in some sequential order that preserves the order in which each client
issued its own operations.
• Sequential consistency does not enforce real-time ordering, allowing for
performance optimizations while preserving predictable behavior.
• Examples:
• Gaming systems: Ensures actions happen in the correct order for all players, such
as moves in turn-based games, even if operations aren’t synchronized in real-time.
• Collaborative editing: Guarantees ordered application of updates in shared
workspaces, like document editing platforms, ensuring all collaborators’ edits appear
in the correct sequence.
Consistency models
Sequential consistency: Trade-offs
• Pros
• Lower synchronization overhead compared to strong consistency, as
strict real-time guarantees are not required.
• Well-suited for scenarios where the sequence of operations matters
more than immediate visibility.
• Cons
• Slower than eventual consistency due to global ordering constraints.
• Lack of real-time guarantees can cause anomalies in time-sensitive
applications.
Consistency models
• Causal consistency: ensures that operations with a cause-and-effect
relationship are seen in the correct order across nodes.
• This model balances consistency and performance, making it ideal for
collaborative applications or systems that don’t require strict real-time
consistency.
• Examples:
• Collaborative platforms: Tools like Google Docs or shared workspaces, where
causally related updates (e.g., editing text and applying formatting) must occur in the
correct order.
• Messaging apps: Ensures messages are delivered as they were sent, maintaining
logical coherence in conversations without requiring global synchronization.
Consistency models
Causal consistency: Trade-offs
• Pros
• Intuitive behavior for collaborative or real-time systems.
• Balanced trade-offs between consistency and performance.
• Cons
• More complexity than eventual consistency.
• Slightly higher latency due to dependency tracking.
• Total ordering is not enforced across the entire system.
Consistency models
• Eventual consistency: allows for temporary inconsistencies between nodes,
with a guarantee that all nodes will eventually converge to the same state.
• This model sacrifices immediate consistency to ensure the system remains
responsive during network failures or partitions.
• In practice, eventual consistency means that updates propagate asynchronously
across replicas, leading to temporary discrepancies. However, the system
guarantees that all nodes eventually synchronize to the same state.
• Examples:
• Web caching: Ensures high-speed data access with tolerable temporary inconsistencies,
such as serving stale session data or cached web pages.
• Product recommendations: Provides personalized suggestions without requiring real-
time consistency, allowing recommendation systems to sync across nodes gradually.
• Inventory counts: Tracks stock across distributed warehouses or regions, tolerating brief
inconsistencies while ensuring eventual convergence to accurate totals.
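The asynchronous propagation described above can be sketched as repeated pairwise merging. This is a minimal illustration, not a production protocol: it assumes a last-writer-wins rule with unique timestamps, and the names `merge` and `anti_entropy` as well as the sample data are hypothetical.

```python
def merge(local, remote):
    """Return local state updated with any newer (timestamp, value) from remote."""
    merged = dict(local)
    for key, (ts, value) in remote.items():
        # Assumption: timestamps are unique, so "newer" is unambiguous.
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


def anti_entropy(replicas):
    """Exchange state between all pairs until no replica changes any more."""
    rounds = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(replicas)):
            for j in range(len(replicas)):
                if i != j:
                    merged = merge(replicas[i], replicas[j])
                    if merged != replicas[i]:
                        replicas[i] = merged
                        changed = True
        rounds += 1
    return rounds


# Three replicas that temporarily disagree.
replicas = [
    {"stock": (1, 10)},
    {"stock": (3, 7)},
    {"price": (2, 99)},
]
anti_entropy(replicas)
converged = all(r == replicas[0] for r in replicas)
```

Until the exchange finishes, a read at the first replica can still return the stale `stock` value, which is the temporary discrepancy the model tolerates.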
Consistency models
Eventual consistency: Trade-offs
• Pros
• High availability and low latency, even during partitions.
• Scales well for high-demand, read-heavy systems.
• Cons
• Temporary inconsistencies may lead to stale or conflicting reads.
• Application logic must handle potential data conflicts.
Continuous Consistency

• A has 3 pending operations
-> order deviation = 3
• A missed 1 operation from B
-> max diff is 5 units, so numerical deviation = (1, 5)
• B has 2 pending operations
-> order deviation = 2
• B missed 3 operations from A
-> max diff is 6 units, so numerical deviation = (3, 6)
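A minimal sketch of how the two deviations could be computed. It assumes each replica keeps a list of tentative (not yet committed) local operations and a set of operations it has already applied, and it approximates the size of the missed changes by summing per-operation weights, which is a simplification. `Op`, `Replica` and the sample weights are hypothetical, chosen only to reproduce the counts on this slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Op:
    origin: str   # replica that issued the operation
    seq: int      # per-origin sequence number, keeps operations distinct
    weight: int   # magnitude of the change, in hypothetical units


@dataclass
class Replica:
    name: str
    tentative: list = field(default_factory=list)  # local ops not yet committed
    seen: set = field(default_factory=set)         # ops already applied here

    def order_deviation(self):
        # Number of local operations that may still be reordered.
        return len(self.tentative)

    def numerical_deviation(self, remote_ops):
        # (number of remote ops not seen, combined weight of those ops)
        missed = [op for op in remote_ops if op not in self.seen]
        return (len(missed), sum(op.weight for op in missed))


a_ops = [Op("A", 1, 2), Op("A", 2, 1), Op("A", 3, 3)]
b_ops = [Op("B", 1, 2), Op("B", 2, 5)]

# A has seen B's first operation only; B has seen nothing from A.
a = Replica("A", tentative=list(a_ops), seen=set(a_ops) | {b_ops[0]})
b = Replica("B", tentative=list(b_ops), seen=set(b_ops))

a_dev = (a.order_deviation(), a.numerical_deviation(b_ops))
b_dev = (b.order_deviation(), b.numerical_deviation(a_ops))
```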
Continuous Consistency
• Choosing the appropriate granularity for a consistency unit (conit)
(a) Two updates lead to update propagation.
Continuous Consistency
• Choosing the appropriate granularity for a conit.
(b) No update propagation is needed (yet).
Sequential Consistency
• Behavior of two processes operating on the same data item. The
horizontal axis is time.
Sequential Consistency
A data store is sequentially consistent when:
• The result of any execution is the same as if the (read and
write) operations by all processes on the data store were
executed in some sequential order and the operations of
each individual process appear
▪ in this sequence
▪ in the order specified by its program.
Sequential Consistency
• (a) A sequentially consistent data store.
(b) A data store that is not sequentially consistent.
Sequential Consistency

• Three concurrently executing processes.
Sequential Consistency
• Four valid execution sequences for the three concurrently executing
processes. The vertical axis is time.
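The set of valid sequences can be explored by brute force. The sketch below assumes the usual textbook programs, which are not reproduced on the slide: each process sets its own variable to 1 and then prints the other two. `PROGRAMS`, `interleavings` and `run` are illustrative names.

```python
from itertools import permutations

# Assumed programs: write own variable, then print the other two.
PROGRAMS = {
    "P1": [("write", "x"), ("print", ("y", "z"))],
    "P2": [("write", "y"), ("print", ("x", "z"))],
    "P3": [("write", "z"), ("print", ("x", "y"))],
}


def interleavings():
    """Return every global order that keeps each process's program order."""
    # One slot per statement; the k-th slot of a process runs its k-th statement,
    # so program order is preserved by construction.
    slots = [p for p, prog in PROGRAMS.items() for _ in prog]
    return set(permutations(slots))


def run(order):
    """Execute one interleaving and return the concatenated output of P1, P2, P3."""
    mem = {"x": 0, "y": 0, "z": 0}
    pc = {p: 0 for p in PROGRAMS}
    out = {p: "" for p in PROGRAMS}
    for p in order:
        kind, arg = PROGRAMS[p][pc[p]]
        pc[p] += 1
        if kind == "write":
            mem[arg] = 1
        else:
            out[p] = "".join(str(mem[v]) for v in arg)
    return out["P1"] + out["P2"] + out["P3"]


orders = interleavings()
signatures = {run(order) for order in orders}
```

Under these assumptions there are 90 interleavings, and the output `000000` never occurs: it would require every print to run before the writes of the other two processes, which contradicts program order.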
Causal Consistency

For a data store to be considered causally consistent, it is
necessary that the store obeys the following condition:
• Writes that are potentially causally related must be seen by all
processes in the same order.
• Concurrent writes may be seen in a different order on different
machines.
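Whether two writes are potentially causally related or concurrent has to be decided somehow. Vector timestamps are one common way to do it; the slides do not name a mechanism, so this is an assumed technique, and `happened_before` and `concurrent` are illustrative names.

```python
def happened_before(a, b):
    """True if vector timestamp a causally precedes vector timestamp b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """True if neither timestamp causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


# Hypothetical timestamps for three processes (P1, P2, P3).
w1 = (1, 0, 0)  # P1 writes x
w2 = (1, 1, 0)  # P2 reads P1's value, then writes x: causally after w1
w3 = (0, 0, 1)  # P3 writes x without having seen anything

related = happened_before(w1, w2)
unrelated = concurrent(w2, w3)
```

Here every process must see `w1` before `w2`, while `w2` and `w3` may be seen in either order on different machines.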
Causal Consistency

• This sequence is allowed with a causally-consistent store, but not
with a sequentially consistent store.
Causal Consistency
• (a) A violation of a causally-consistent store.
Causal Consistency
• (b) A correct sequence of events in a causally-consistent store.
Grouping Operations
Necessary criteria for correct synchronization:
• An acquire access of a synchronization variable is not allowed to
perform with respect to a process until all updates to the guarded
shared data have been performed with respect to that process.
• Before an exclusive mode access to a synchronization variable by a
process is allowed to perform with respect to that process, no other
process may hold the synchronization variable, not even in
nonexclusive mode.
• After an exclusive mode access to a synchronization variable has been
performed, any other process’ next nonexclusive mode access to that
synchronization variable may not be performed until it has performed
with respect to that variable’s owner.
Grouping Operations

• A valid event sequence for entry consistency.


Eventual Consistency
• The principle of a mobile user accessing different replicas of a
distributed database.
Monotonic Reads
A data store is said to provide monotonic-read consistency if the
following condition holds:
• If a process reads the value of a data item x, any successive
read operation on x by that process will always return that
same value or a more recent value.
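One way a client library might enforce this condition is to remember the newest version it has read and refuse older ones. This is a sketch under the assumption that every replica stores a version number next to each value; `Session` and `StaleReplicaError` are hypothetical names.

```python
class StaleReplicaError(Exception):
    """Raised when a replica is behind what this client has already read."""


class Session:
    def __init__(self):
        self.last_seen = {}  # item -> highest version this client has read

    def read(self, replica, item):
        version, value = replica[item]
        if version < self.last_seen.get(item, 0):
            # Returning this value would move the client backwards in time.
            raise StaleReplicaError(f"{item}: replica has version {version}")
        self.last_seen[item] = version
        return value


# Two local copies of the same data store; L2 has not yet received the update.
L1 = {"x": (2, "new")}
L2 = {"x": (1, "old")}

session = Session()
first = session.read(L1, "x")
try:
    session.read(L2, "x")
    rejected = False
except StaleReplicaError:
    rejected = True
```

In practice the client could retry at another replica or wait for the update to arrive instead of failing.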
Monotonic Reads
• The read operations performed by a single process P at two
different local copies of the same data store.
(a) A monotonic-read consistent data store.
Monotonic Reads
• The read operations performed by a single process P at two
different local copies of the same data store.
(b) A data store that does not provide monotonic reads.
Monotonic Writes
In a monotonic-write consistent store, the following condition
holds:
• A write operation by a process on a data item x is completed
before any successive write operation on x by the same
process.
Monotonic Writes
• The write operations performed by a single process P at two
different local copies of the same data store.
(a) A monotonic-write consistent data store.
Monotonic Writes

• The write operations performed by a single process P at two
different local copies of the same data store.
(b) A data store that does not provide monotonic-write consistency.
Read Your Writes
A data store is said to provide read-your-writes consistency, if the
following condition holds:
• The effect of a write operation by a process on data item x will
always be seen by a successive read operation on x by the
same process.
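A small sketch of the idea, assuming versioned replicas and a client that remembers the version produced by its own last write. The helper names `write` and `read_your_writes` are made up for this example.

```python
def write(replica, item, value):
    """Apply a write locally and return the version it produced."""
    version = replica.get(item, (0, None))[0] + 1
    replica[item] = (version, value)
    return version


def read_your_writes(replicas, item, min_version):
    """Read from the first replica that already reflects the client's own write."""
    for replica in replicas:
        version, value = replica.get(item, (0, None))
        if version >= min_version:
            return value
    raise LookupError("no replica has caught up with this client's write yet")


L1, L2 = {}, {}
my_version = write(L1, "x", "updated")

# L2 has not received the write, so the read is served by L1.
value = read_your_writes([L2, L1], "x", my_version)

try:
    read_your_writes([L2], "x", my_version)
    served_stale = True
except LookupError:
    served_stale = False
```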
Read Your Writes
• (a) A data store that provides read-your-writes consistency.
Read Your Writes
• (b) A data store that does not.
Writes Follow Reads
A data store is said to provide writes-follow-reads consistency, if
the following holds:
• A write operation by a process on a data item x following a
previous read operation on x by the same process is
guaranteed to take place on the same or a more recent value
of x that was read.
Writes Follow Reads
• (a) A writes-follow-reads consistent data store.
Writes Follow Reads
• (b) A data store that does not provide writes-follow-reads
consistency.
Replica-Server Placement
• Choosing a proper cell size for server placement.
Content Replication and Placement
• The logical organization of different kinds of copies of a data store
into three concentric rings.
Server-Initiated Replicas
• Counting access requests from different clients.
State versus Operations
Possibilities for what is to be propagated:
1. Propagate only a notification of an update.
2. Transfer data from one copy to another.
3. Propagate the update operation to other copies.
Pull versus Push Protocols
• A comparison between push-based and pull-based protocols in
the case of multiple-client, single-server systems.
Remote-Write Protocols
• The principle of a primary-backup protocol.
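The write path of such a protocol might look roughly like the sketch below. It assumes a blocking variant in which the primary acknowledges a write only after every backup has applied it; the class and function names are illustrative, and failure handling is left out.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.store = {}

    def apply(self, item, value):
        self.store[item] = value
        return True  # acknowledgement

    def read(self, item):
        # Reads are served from the local copy.
        return self.store.get(item)


class Primary(Replica):
    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups

    def write(self, item, value):
        self.apply(item, value)
        # Assumption: wait for every backup before acknowledging the client.
        acks = [backup.apply(item, value) for backup in self.backups]
        return all(acks)


def client_write(entry_replica, primary, item, value):
    """A write arriving at any replica is forwarded to the primary."""
    return primary.write(item, value)


b1, b2 = Replica("backup-1"), Replica("backup-2")
primary = Primary("primary", [b1, b2])
acked = client_write(b1, primary, "x", 42)
```

Because the primary orders all writes, every backup applies them in the same sequence, at the cost of the client waiting for the full round of updates.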
Local-Write Protocols

• Primary-backup protocol in which the primary migrates to the
process wanting to perform an update.
Quorum-Based Protocols
• Three examples of the voting algorithm.
(a) A correct choice of read and write set.
(b) A choice that may lead to write-write conflicts.
(c) A correct choice, known as ROWA (read one, write all).
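The three cases can be checked against the two standard quorum constraints: with N replicas, a read quorum N_R and a write quorum N_W must satisfy N_R + N_W > N and N_W > N/2. The figure's numbers are not reproduced in the text, so the values below (N = 12) are assumed for illustration, and `check_quorum` is a hypothetical helper.

```python
def check_quorum(n, n_r, n_w):
    """Evaluate the two quorum constraints for n replicas."""
    return {
        # Every read quorum overlaps every write quorum.
        "no_read_write_conflict": n_r + n_w > n,
        # Any two write quorums overlap.
        "no_write_write_conflict": n_w > n / 2,
    }


N = 12
case_a = check_quorum(N, n_r=3, n_w=10)   # correct choice
case_b = check_quorum(N, n_r=7, n_w=6)    # two disjoint write sets are possible
case_c = check_quorum(N, n_r=1, n_w=12)   # ROWA: read one, write all
```

In case (b) two writers can each collect 6 of the 12 votes without sharing a replica, which is how write-write conflicts arise.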
