
Bda Module 3

Uploaded by

Neha.Kale K
Copyright
© All Rights Reserved

Introduction to NoSQL

NoSQL databases (also known as "not only SQL") store data differently
from relational tables. They come in several types based on their data
model; the main types are document, key-value, wide-column, and graph.
They provide flexible schemas and scale easily with large volumes of
data and high user loads.
RDBMS: Relational Database Management System
OLAP: Online Analytical Processing
NoSQL: Not only SQL (term first used in 1998)
History of NoSQL Databases
• 1998 - Carlo Strozzi uses the term NoSQL for his lightweight,
open-source relational database.
• 2000 - Graph database Neo4j is launched.
• 2004 - Google BigTable is launched.
• 2005 - CouchDB is launched.
• 2007 - The research paper on Amazon Dynamo is released.
• 2008 - Facebook open-sources the Cassandra project.
• 2009 - The term NoSQL is reintroduced.
WHY NoSQL
• A NoSQL database is best for handling indeterminate, unrelated, or
rapidly changing data.
• It is intuitive for developers when the application dictates the
database schema.
• It suits applications that need flexible schemas to enable faster,
more iterative development.
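
The flexible-schema point can be illustrated with a minimal sketch that uses plain Python dictionaries as stand-ins for documents. The `users` collection, the field names, and the helper functions are hypothetical and do not reflect any particular NoSQL product's API.

```python
import json

# A hypothetical "collection": each record is a self-describing document.
users = [
    {"id": 1, "name": "Asha"},
    {"id": 2, "name": "Ravi", "email": "ravi@example.com"},
]

# A new requirement arrives: store interests. No table alteration is
# needed; new documents simply carry the extra field.
users.append({"id": 3, "name": "Mei", "interests": ["graphs", "caching"]})


def emails(collection):
    """Return the email of every document that has one."""
    return [doc["email"] for doc in collection if "email" in doc]


def field_names(collection):
    """Return the set of all field names used across the collection."""
    names = set()
    for doc in collection:
        names.update(doc)
    return names


if __name__ == "__main__":
    print(json.dumps(users, indent=2))
    print(emails(users))
```

The trade-off is that readers of the data, such as `emails` here, must cope with fields that are absent from some documents.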
CAP THEOREM
• The CAP theorem, originally introduced as the CAP principle, can be
used to explain some of the competing requirements in a distributed
system with replication. It is a tool used to make system designers
aware of the trade-offs while designing networked shared-data
systems.
• The three letters in CAP refer to three desirable properties of
distributed systems with replicated data: consistency (among
replicated copies), availability (of the system for read and write
operations) and partition tolerance (in the face of the nodes in the
system being partitioned by a network fault).
• The CAP theorem states that it is not possible to guarantee all three of
the desirable properties – consistency, availability, and partition
tolerance at the same time in a distributed system with data replication.

• The theorem states that networked shared-data systems can only
strongly support two of the following three properties:
• Consistency –
• Consistency means that all nodes see the same copy of a replicated
data item across transactions: every node in a distributed cluster
returns the same, most recent successful write, so every client has the
same view of the data. There are various consistency models. Consistency
in CAP refers to linearizability (atomic consistency), a very strong
form of consistency.
• Availability –
• Availability means that each read or write request for a data item will either
be processed successfully or will receive a message that the operation
cannot be completed. Every non-failing node returns a response for all the
read and write requests in a reasonable amount of time. The key word here
is “every”. In simple terms, every node (on either side of a network
partition) must be able to respond in a reasonable amount of time.
• Partition Tolerance –
• Partition tolerance means that the system can continue operating even if
the network connecting the nodes has a fault that results in two or more
partitions, where the nodes in each partition can communicate only among
themselves. That is, the system continues to function and upholds its
consistency guarantees in spite of network partitions. Network partitions
are a fact of life. Distributed systems that guarantee partition tolerance
can recover gracefully once the partition heals.
• CA (Consistency and Availability) –
• The system responds with the latest data and stays available, but only
as long as no network partition occurs. It is not designed to tolerate
partitions.
• Example databases: single-site relational databases are the usual
example.
• AP (Availability and Partition Tolerance) –
• The system prioritizes availability over consistency and can respond
with possibly stale data.
• The system can be distributed across multiple nodes and is
designed to operate reliably even in the face of network partitions.
• Example databases commonly placed here: Cassandra, CouchDB, Riak,
Voldemort, Amazon DynamoDB.
• CP (Consistency and Partition Tolerance) –
• The system prioritizes consistency over availability and
responds with the latest updated data.
• The system can be distributed across multiple nodes and is
designed to operate reliably even in the face of network partitions.
• Example databases commonly placed here: Apache HBase, MongoDB, Redis,
Google Cloud Spanner.
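
The difference between the AP and CP choices can be sketched with a toy simulation of two replicas and a network link that can be cut. This is only an illustration under simplifying assumptions: the `Replica` class, the `"CP"`/`"AP"` mode labels, and the helper functions are hypothetical, and a real CP system would usually let the majority side keep working rather than reject requests on both sides.

```python
class Unavailable(Exception):
    """Raised when a replica refuses a request to protect consistency."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical labels)
        self.value = None
        self.peer = None
        self.partitioned = False  # True while the link to the peer is down

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Cannot reach the peer, so refuse rather than diverge.
                raise Unavailable("write rejected during partition")
            # AP: accept locally; the peer keeps its older value for now.
            self.value = value
            return
        # Healthy network: update both copies.
        self.value = value
        self.peer.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("read rejected during partition")
        return self.value


def make_pair(mode):
    """Create two replicas of the same mode that know about each other."""
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b


def set_partition(a, b, down):
    """Cut (down=True) or heal (down=False) the link between a and b."""
    a.partitioned = b.partitioned = down


if __name__ == "__main__":
    a, b = make_pair("AP")
    a.write("v1")
    set_partition(a, b, True)
    a.write("v2")
    print(a.read(), b.read())  # the two sides now disagree
```

In AP mode the second replica keeps answering with the older value while the link is down, which is the "possibly stale data" case. In CP mode the same sequence raises `Unavailable` instead.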
Features of NoSQL

• Schema free / relaxed schemas: no schema definition required;
heterogeneous structures are allowed.
• Non-relational: no tables with flat, fixed-column records; works with
self-contained aggregates; does not require object-relational mapping,
data normalization, or ACID.
• Simple API: easy interface; text-based protocols; no standards-based
query language; web-enabled databases running as services.
• Open source: no expensive licensing or hardware requirements.
• Distributed: multiple databases can run together; auto-scaling and
fail-over capabilities; often provides only eventual consistency.
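
Eventual consistency, listed among the distributed features, can be sketched as follows: a write lands on one node immediately and reaches the other nodes later through a replication queue. The `Node` and `Cluster` classes are hypothetical, and the single global version counter with last-write-wins merging is a simplifying assumption rather than how any specific database resolves conflicts.

```python
from collections import deque


class Node:
    """A replica holding key -> (version, value) pairs."""

    def __init__(self):
        self.data = {}

    def apply(self, key, version, value):
        # Last-write-wins: keep the entry with the highest version.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def get(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


class Cluster:
    """Writes land on one node and reach the others asynchronously."""

    def __init__(self, size):
        self.nodes = [Node() for _ in range(size)]
        self.pending = deque()   # replication messages not yet delivered
        self.clock = 0           # simple global version counter

    def write(self, node_index, key, value):
        self.clock += 1
        self.nodes[node_index].apply(key, self.clock, value)
        for i in range(len(self.nodes)):
            if i != node_index:
                self.pending.append((i, key, self.clock, value))

    def replicate(self):
        """Deliver all queued messages (the 'eventual' part)."""
        while self.pending:
            i, key, version, value = self.pending.popleft()
            self.nodes[i].apply(key, version, value)


if __name__ == "__main__":
    cluster = Cluster(3)
    cluster.write(0, "cart", ["book"])
    print([n.get("cart") for n in cluster.nodes])  # nodes disagree
    cluster.replicate()
    print([n.get("cart") for n in cluster.nodes])  # nodes agree
```

Between `write` and `replicate`, a read from another node may return an older value or nothing at all; once the queue drains, all nodes converge on the same value.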
Advantages and disadvantages of NoSQL

Relational Database | NoSQL
It is used to handle data coming in at low velocity. | It is used to handle data coming in at high velocity.
It gives only read scalability. | It gives both read and write scalability.
It manages structured data. | It manages all types of data.
Data arrives from one or a few locations. | Data arrives from many locations.
It supports complex transactions. | It supports simple transactions.
It has a single point of failure. | No single point of failure.
It handles data in less volume. | It handles data in high volume.
Transactions are written in one location. | Transactions are written in many locations.
Supports ACID properties compliance. | Doesn't support ACID properties.
It is difficult to make changes to the database once it is defined. | Enables easy and frequent changes to the database.
A schema is mandatory to store the data. | Schema design is not required.
NoSQL Business Drivers
NoSQL Data Architecture Patterns
• An architecture pattern is a logical way of categorizing the data that will be stored in the
database. NoSQL is a type of database that supports operations on big data and stores it in a
suitable format. It is widely used because of its flexibility and its wide variety of services.

• Architecture Patterns of NoSQL:
• Data in NoSQL is stored using one of the following four data architecture patterns:
1. Key-Value Store Database
2. Column Store Database
3. Document Database
4. Graph Database
• These are explained below.
1. Key-Value Store Database:
• This model is one of the most basic NoSQL models. As the name suggests, data is stored as
key-value pairs. The key is usually a string, an integer, or a sequence of characters, but it
can also be a more advanced data type. The value is associated with its key and can be of any
type (JSON, BLOB (Binary Large Object), strings, etc.). Key-value databases generally store
data as a hash table in which each key is unique. This pattern is commonly used in shopping
websites and e-commerce applications.

• Advantages:
• Can handle large amounts of data and heavy load.
• Easy retrieval of data by key.
• Limitations:
• Complex queries may involve multiple key-value pairs, which can slow performance.
• Data involving many-to-many relationships is hard to model and may lead to collisions.
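
The hash-table model and its main limitation can be sketched with a toy in-memory store. The `KeyValueStore` class, its method names, and the `cart:42` style keys are hypothetical and only loosely modelled on the shopping-cart use case mentioned above; real key-value databases add persistence, distribution, and expiry on top of this idea.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        # Values are opaque to the store; here they are kept as JSON text.
        self._table[key] = json.dumps(value)

    def get(self, key, default=None):
        raw = self._table.get(key)
        return default if raw is None else json.loads(raw)

    def delete(self, key):
        self._table.pop(key, None)

    def find_by_field(self, field, wanted):
        """Query by value: no index exists, so every pair is scanned."""
        matches = []
        for key, raw in self._table.items():
            value = json.loads(raw)
            if isinstance(value, dict) and value.get(field) == wanted:
                matches.append(key)
        return matches


if __name__ == "__main__":
    store = KeyValueStore()
    store.put("cart:42", {"user": "asha", "items": ["book"]})
    store.put("cart:43", {"user": "ravi", "items": []})
    print(store.get("cart:42"))
    print(store.find_by_field("user", "ravi"))
```

Lookups by key are a single hash-table access, while `find_by_field` has to decode and inspect every stored value, which is why complex queries over many pairs tend to be slow in this model.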
