Lecture 8 Chapter 5 Part 4 Big Data Storage Concepts
• However, as data has grown tremendously in both size and variety, the strong
consistency and predefined schemas of traditional relational databases have limited their
ability to handle large-scale, semi-structured, and unstructured data
• As a result, a new generation of highly scalable, more flexible data stores
has emerged to challenge the dominance of relational databases
• These systems are collectively called NoSQL (Not only SQL) systems
• The principle underlying NoSQL systems is a trade-off among the CAP properties
of distributed storage systems: Consistency, Availability, and Partition tolerance
SQL vs. NoSQL Cont.
• SQL databases are well suited to structured data, that is, data
with defined relationships between its variables and entities
• Relational database management systems (RDBMSs), which use SQL,
are expected to provide four transactional properties, known by
the acronym ACID
• NoSQL systems allow a dynamic schema for unstructured data,
so there’s less need to pre-plan and pre-organize data, and it’s
easier to make modifications
• NoSQL systems instead typically aim for BASE properties
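The contrast between a predefined schema and a dynamic schema can be sketched in a few lines. This is only an illustration, not how any particular NoSQL product works: SQLite stands in for a relational database, a plain Python dict stands in for a document collection, and the `users` table and its fields are made-up names.

```python
import json
import sqlite3

# Fixed schema: columns are declared up front (table and column names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")

# Storing a new attribute fails until the schema itself is changed.
try:
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (2, 'Bob', 'bob@example.com')"
    )
    sql_insert_ok = True
except sqlite3.OperationalError:
    sql_insert_ok = False  # the table has no column named "email"

# Dynamic schema: a dict stands in for a document collection (an assumption for
# illustration). Each record may carry its own set of fields.
documents = {}
documents[1] = {"name": "Ada"}
documents[2] = {"name": "Bob", "email": "bob@example.com"}  # new field, no migration

print(sql_insert_ok)
print(json.dumps(documents[2]))
```

In the relational case the extra field would first require an `ALTER TABLE`; in the document-style case the application simply writes a record with a different shape, which is the flexibility the bullets above describe.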
SQL vs. NoSQL Cont.
• Traditional RDBMSs (SQL) normally provide a strong consistency
model based on their transaction model, while NoSQL systems
typically relax consistency to some extent in exchange for higher
availability or better partition tolerance
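A toy model may help make the consistency trade-off concrete. The sketch below assumes a hypothetical store with one primary and one secondary replica, where writes are acknowledged before they are replicated; the class and method names are invented for illustration and do not correspond to any real system.

```python
class Replica:
    """A single copy of the data."""

    def __init__(self):
        self.data = {}


class EventualStore:
    """Toy store: writes go to the primary and are queued for the secondary."""

    def __init__(self):
        self.primary = Replica()
        self.secondary = Replica()
        self.pending = []  # replication log not yet applied to the secondary

    def write(self, key, value):
        # Acknowledge the write as soon as the primary has it (favors availability).
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read_secondary(self, key):
        # May return stale data until sync() has run.
        return self.secondary.data.get(key)

    def sync(self):
        # Apply all queued writes; afterwards the replicas agree.
        for key, value in self.pending:
            self.secondary.data[key] = value
        self.pending.clear()


store = EventualStore()
store.write("x", 1)
stale = store.read_secondary("x")  # secondary has not seen the write yet
store.sync()
fresh = store.read_secondary("x")  # replicas have converged

print(stale, fresh)
```

A strongly consistent design would instead replicate inside `write` before acknowledging, so a read from either replica always sees the latest value, at the cost of the write being unable to complete while the secondary is unreachable.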
ACID
• ACID stands for Atomicity, Consistency, Isolation, and Durability
• Atomicity: Each transaction must either succeed completely or fail completely;
it cannot be left partially complete, even in the case of system failure
• Consistency: Guarantees that data meets predefined integrity constraints
and business rules. Even when multiple users perform operations
simultaneously, the data remains consistent for all of them
• Isolation: Ensures that concurrent transactions do not interfere with each
other. A new transaction that accesses a record being modified by an earlier,
unfinished transaction waits until that transaction finishes, which maintains
the illusion that transactions execute serially
• Durability: Ensures that the database retains all committed records, even if
the system experiences failure. Once an ACID transaction is committed, its
changes are permanent and unaffected by subsequent system failures
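Atomicity and consistency can be observed with a small experiment. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `accounts` table, the account names, and the `transfer` helper are assumptions made for illustration. Durability is not shown, since an in-memory database does not survive a restart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the integrity rule: balances may never go negative.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was rolled back too.
        return False


ok = transfer(conn, "alice", "bob", 30)       # valid transfer
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits bob before the debit fails, yet bob's balance is unchanged afterwards: the partially applied transaction is undone as a whole (atomicity), and the database never stores a state that breaks the constraint (consistency).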
BASE