NOSQL
MongoDB is a NoSQL database that follows a document-based model and supports a schema-less structure.
This means:
Documents in the same collection can have different fields, data types, and structures.
Let’s create a students collection and insert multiple documents with different structures.
// Document 1: Basic student info
db.students.insertOne({ name: "Alice", age: 21, course: "Computer Science" });
// Document 2: Different structure (additional fields)
db.students.insertOne({ name: "Bob", email: "bob@example.com", skills: ["Java", "Python"], isActive: true });
// Document 3: Nested structure
db.students.insertOne({ name: "Charlie", address: { city: "New York", zip: "10001" }, age: 22 });
db.students.find().pretty();
Output:
{ "_id": ObjectId("..."), "name": "Alice", "age": 21, "course": "Computer Science" }
{ "_id": ObjectId("..."), "name": "Bob", "email": "bob@example.com", "skills": ["Java", "Python"], "isActive": true }
{ "_id": ObjectId("..."), "name": "Charlie", "address": { "city": "New York", "zip": "10001" }, "age": 22 }
Conclusion
MongoDB allows inserting documents with different field names, data types, and nested objects.
There's no need to define a rigid schema up front like in traditional SQL databases.
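The schema-less behaviour described above can be imitated in plain Python. This is only a sketch: a list of dictionaries stands in for the students collection, and the helper name fields_used is hypothetical, not a MongoDB API.

```python
# A plain-Python stand-in for a schema-less collection (not a MongoDB API).
students = [
    {"name": "Alice", "age": 21, "course": "Computer Science"},
    {"name": "Bob", "email": "bob@example.com",
     "skills": ["Java", "Python"], "isActive": True},
    {"name": "Charlie", "address": {"city": "New York", "zip": "10001"}, "age": 22},
]


def fields_used(collection):
    """Return the set of top-level field names seen across all documents."""
    names = set()
    for doc in collection:
        names.update(doc.keys())
    return names


# No schema forces "age" to exist, so documents without it are simply skipped.
with_age = [doc["name"] for doc in students if "age" in doc]

print(sorted(fields_used(students)))
print(with_age)
```

Each document carries its own structure, so code that reads the collection has to check whether a field is present instead of assuming it.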
Changes in structure (like adding a new column) require schema alteration, which can be complex and time-consuming.
Primarily support vertical scaling (adding more power to a single server).
Performing multiple joins across large tables can be slow and resource-intensive.
Implementing replication, sharding, and high availability requires additional setup and complexity.
A column-oriented database stores data by columns rather than by rows. This means that data from the same
column across multiple rows is stored together.
Comparison with Row-Oriented Databases:
| Feature | Description |
| Performance | Faster for analytical queries like SUM, AVG, COUNT on large datasets. |
| OLAP Focus | Suited for read-heavy workloads and analytical processing (OLAP). |
Use Cases
Data Warehousing
Business Intelligence
Real-time Analytics
| Amazon | Amazon Redshift | Cloud-based data warehouse built on PostgreSQL and optimized for OLAP with columnar storage. |
| Google | BigQuery | Fully-managed serverless data warehouse with columnar storage and SQL-like querying. |
| Apache | Apache Parquet | Columnar file format used by big data tools like Hive, Spark. Not a DB itself, but a core part of columnar processing. |
| Vertica | Vertica | Commercial column-store DB optimized for big data analytics with extremely fast performance. |
| Apache | Apache Kudu | Hybrid between HDFS and traditional column stores, optimized for fast analytics. |
| SAP | SAP HANA | In-memory, columnar DB for real-time data processing and analytics. |
| InfluxData | InfluxDB | Purpose-built time-series database using columnar concepts for high-speed ingestion and queries. |
ID Name Age
1 John 25
2 Jane 28
3 Max 30
Row Store: stores as [(1, John, 25), (2, Jane, 28), (3, Max, 30)]
Column Store: stores as [1, 2, 3], [John, Jane, Max], [25, 28, 30]
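The two layouts above can be compared with a short Python sketch. It uses the same three rows; the function names are hypothetical, and real engines add compression and on-disk formats that are not modelled here.

```python
# Row store: each record is kept together as one tuple (id, name, age).
rows = [(1, "John", 25), (2, "Jane", 28), (3, "Max", 30)]

# Column store: the values of each column are kept together.
columns = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "age": [r[2] for r in rows],
}


def avg_age_row_store(table):
    """Walk every full record even though only the age field is needed."""
    return sum(record[2] for record in table) / len(table)


def avg_age_column_store(table):
    """Read only the age column; the other columns are never touched."""
    ages = table["age"]
    return sum(ages) / len(ages)


print(avg_age_row_store(rows))
print(avg_age_column_store(columns))
```

Both functions return the same average, but the column version reads a single contiguous list, which is why analytical queries such as AVG tend to favour columnar storage.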
Summary
Pros Cons
A document-oriented database is a type of NoSQL database designed to store, retrieve, and manage semi-
structured data as documents, typically in JSON, BSON, or XML formats.
| Feature | Description |
| Document | A self-contained data unit resembling a JSON object (key-value pairs, arrays, nested structures). |
| Collection | A group of related documents (similar to a table in RDBMS but without a fixed schema). |
| Schemaless | Documents in the same collection can have different structures and fields. |
| Embedded/Nested Data | Supports nested objects and arrays for modeling complex relationships within a single document. |
{ "_id": 1, "name": "Alice", "email": "alice@example.com", "skills": ["Java", "Python"], "address": { "city":
"Bangalore", "zip": "560001" } }
Data can be queried using a rich query language similar to JSON syntax.
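To show what a JSON-like query against a nested document involves, here is a minimal Python sketch. The helpers get_path and matches are hypothetical and only imitate dot-path lookup and array membership; they are not how any particular database implements its query language.

```python
document = {
    "_id": 1,
    "name": "Alice",
    "email": "alice@example.com",
    "skills": ["Java", "Python"],
    "address": {"city": "Bangalore", "zip": "560001"},
}


def get_path(doc, path):
    """Follow a dotted path such as 'address.city'; return None if absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(doc, criteria):
    """True if every path in criteria equals, or is contained in, the stored value."""
    for path, expected in criteria.items():
        actual = get_path(doc, path)
        if isinstance(actual, list):
            # An array field matches when it contains the value or equals it.
            if expected not in actual and expected != actual:
                return False
        elif actual != expected:
            return False
    return True


print(matches(document, {"address.city": "Bangalore", "skills": "Python"}))
print(matches(document, {"address.city": "Mumbai"}))
```

The criteria themselves are written as a dictionary, which mirrors how document databases let a query look like the data it searches.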
| Advantage | Description |
| Flexible Schema | Easily modify structure without altering schema (ideal for agile development). |
| Natural Data Mapping | Ideal for applications using JSON (e.g., JavaScript, web apps). |
| MongoDB Inc. | MongoDB | Most widely-used open-source document DB, stores data in BSON, supports rich queries, indexing, and aggregation. |
| Couchbase Inc. | Couchbase | Enterprise NoSQL database with JSON storage, in-memory caching, SQL-like querying (N1QL), and mobile sync. |
| Apache | CouchDB | Open-source document store using JSON documents, HTTP REST API, and MVCC for versioning and conflict resolution. |
| Microsoft | Azure Cosmos DB | Globally distributed NoSQL database supporting multiple models, including document model with MongoDB API support. |
| MarkLogic | MarkLogic Server | Multi-model DB with strong enterprise features, including document, semantic, and search capabilities. |
| RavenDB | RavenDB | Open-source document database with ACID guarantees and built-in full-text search and replication. |
| OrientDB | OrientDB | A multi-model DB that supports document, graph, object, and key/value models. |
| Challenge | Description |
| Complex Relationships | Not ideal for deeply relational data (joins are limited or inefficient). |
| Consistency | Eventual consistency in distributed setups may not suit all applications. |
| Indexing | Improper indexing can lead to slow queries, especially with dynamic schemas. |
Use Cases
Product catalogs
Real-time analytics
Summary
| Feature | Description |
Data distribution refers to how data is stored, accessed, and synchronized across multiple nodes or locations
in a distributed system.
1. Centralized Model
2. Replicated Model
Description: Copies of the same data are stored on multiple nodes.
Types:
Master-Slave Replication: One node is the master, others are read-only slaves.
Multi-Master Replication: All nodes can read/write, requiring conflict resolution.
Description: Dataset is split into parts (shards) and distributed across nodes.
Description: Combines replication and partitioning (e.g., each shard is replicated).
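The combination of partitioning and replication described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the node names, the replica count, and the hash-modulo placement rule. Production systems typically use more elaborate schemes such as consistent hashing.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members
REPLICAS = 2                            # assumed number of copies per key


def shard_for(key, num_shards=len(NODES)):
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def nodes_for(key):
    """Return the primary node for a key followed by its replica node(s)."""
    primary = shard_for(key)
    return [NODES[(primary + i) % len(NODES)] for i in range(REPLICAS)]


for user in ["alice", "bob", "charlie"]:
    print(user, "->", nodes_for(user))
```

The first node in each list plays the role of the shard owner and the second holds a copy, so losing one node does not lose the data.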
Summary Table
Advantages of NoSQL
NoSQL databases offer several benefits over traditional relational databases, especially for modern
applications that demand scalability, flexibility, and performance.
2. Horizontal Scalability
Supports large-scale web apps, real-time analytics, and IoT data.
3. High Performance
Suitable for high-throughput applications like messaging systems, recommendation engines, etc.
Ideal for storing logs, social media data, sensor data, and product catalogs.
Built to store and process massive volumes of data with low latency.
Widely used in analytics, machine learning pipelines, and event processing.
Many NoSQL solutions (like Firebase, Cosmos DB, DynamoDB) offer built-in support for multi-region
replication, failover, and cloud deployment.
NoSQL databases store data in formats (e.g., JSON) that map more naturally to objects in programming
languages, simplifying development.
While not always ACID-compliant, NoSQL databases often support eventual consistency models that
are acceptable and more performant for many use cases.
10. Cost-Effective
Summary Table
| Advantage | Benefit |
| Big Data Support | Designed for large volumes and real-time data |
| Multi-Model Support | Fits diverse use cases (docs, key-value, graphs) |
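The eventual consistency mentioned among the advantages above can be illustrated with a toy model. This is a sketch under stated assumptions: the Cluster and Replica classes are hypothetical, writes go to a single primary, and replicas catch up only when sync is called.

```python
class Replica:
    """A copy of the data that lags behind the primary until it syncs."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already applied


class Cluster:
    def __init__(self, replica_count=2):
        self.primary = {}
        self.log = []  # ordered list of writes accepted by the primary
        self.replicas = [Replica() for _ in range(replica_count)]

    def write(self, key, value):
        """Acknowledge the write as soon as the primary has it."""
        self.primary[key] = value
        self.log.append((key, value))

    def read_from_replica(self, index, key):
        """May return stale data if the replica has not synced yet."""
        return self.replicas[index].data.get(key)

    def sync(self, index):
        """Apply the writes this replica has not seen yet."""
        replica = self.replicas[index]
        for key, value in self.log[replica.applied:]:
            replica.data[key] = value
        replica.applied = len(self.log)


cluster = Cluster()
cluster.write("cart:42", ["book"])
print(cluster.read_from_replica(0, "cart:42"))  # stale read before sync
cluster.sync(0)
print(cluster.read_from_replica(0, "cart:42"))  # replica has caught up
```

The write returns without waiting for the replicas, which is where the performance gain comes from; the cost is the window in which a replica can serve an older value.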
NoSQL databases are classified based on how they store and model data. Each model is suited for specific use
cases and types of queries.
1. Document-Oriented Model
Key Feature: Flexible schema, nested data, easy to map with app objects.
Examples:
MongoDB
CouchDB
Firebase Firestore
Amazon DocumentDB
Example Document:
2. Key-Value Model
Examples:
Redis
Amazon DynamoDB
Riak
Berkeley DB
Example:
Structure: Stores data in tables, rows, and columns, but columns are grouped into families.
Use Case: Time-series data, analytics, IoT, big data workloads.
Key Feature: Fast for queries over large datasets; highly scalable.
Examples:
Apache Cassandra
HBase
ScyllaDB
Google Bigtable
Example:
| Alice | alice@example.com | 25 |
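A column-family layout can be pictured as nested dictionaries: row key, then column family, then column. The sketch below uses that picture; the family names profile and stats, the row keys, and the second row are assumptions added for illustration.

```python
# row key -> column family -> column name -> value
users = {
    "user:alice": {
        "profile": {"name": "Alice", "email": "alice@example.com"},
        "stats": {"age": 25},
    },
    # Hypothetical second row: rows need not populate every family or column.
    "user:bob": {
        "profile": {"name": "Bob"},
    },
}


def read_family(table, row_key, family):
    """Fetch one column family of one row; missing data yields an empty dict."""
    return table.get(row_key, {}).get(family, {})


print(read_family(users, "user:alice", "profile"))
print(read_family(users, "user:bob", "stats"))
```

Grouping columns into families lets a read fetch only the family it needs, and sparse rows cost nothing for the columns they leave out.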
4. Graph Model
Structure: Data is represented as nodes (entities) and edges (relationships).
Examples:
Neo4j
ArangoDB
OrientDB
Amazon Neptune
Example:
Use Case: Complex applications requiring multiple data representations.
Examples:
Summary Table
| Document Store | JSON, BSON | CMS, user data, catalogs | MongoDB, CouchDB |
| Key-Value Store | Key => Value | Caching, sessions, fast lookup | Redis, DynamoDB |
| Graph DB | Nodes + Edges | Social, fraud, recommendations | Neo4j, Amazon Neptune |
| Multi-Model | Mixed formats | Complex & diverse use cases | ArangoDB, OrientDB |
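Demonstrate the creation and retrieval of car models in a Cassandra database. Use a keyspace and a column family.
-- Assumption: the keyspace definition is missing from the original; a single-node setup is assumed.
CREATE KEYSPACE IF NOT EXISTS car_dealership WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE car_dealership;
CREATE TABLE car_models (
  model_id UUID PRIMARY KEY,
  brand TEXT,
  model_name TEXT,
  year INT,
  fuel_type TEXT,
  price DECIMAL
);
INSERT INTO car_models (model_id, brand, model_name, year, fuel_type, price)
VALUES (uuid(), 'Toyota', 'Corolla', 2021, 'Petrol', 1500000.00);
INSERT INTO car_models (model_id, brand, model_name, year, fuel_type, price)
VALUES (uuid(), 'Tesla', 'Model 3', 2022, 'Electric', 3500000.00);
INSERT INTO car_models (model_id, brand, model_name, year, fuel_type, price)
VALUES (uuid(), 'Hyundai', 'Creta', 2023, 'Diesel', 1800000.00);
-- Assumption: the retrieval query is missing from the original; a full-table read is shown.
SELECT * FROM car_models;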
Sample Output
Notes
Cassandra does not support joins or complex transactions, so data is usually denormalized.
Use ALLOW FILTERING with caution; it's inefficient on large datasets.
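The warning about ALLOW FILTERING comes down to the difference between a lookup by key and a scan of every row. The Python sketch below imitates that difference with a dictionary keyed by model_id; it is an analogy only, and the function names are hypothetical, not Cassandra driver calls.

```python
import uuid

# Rows indexed by their primary key, as in the car_models table above.
car_models = {}


def insert_car(brand, model_name, year, fuel_type, price):
    """Store a row under a freshly generated UUID and return that key."""
    model_id = uuid.uuid4()
    car_models[model_id] = {
        "brand": brand, "model_name": model_name,
        "year": year, "fuel_type": fuel_type, "price": price,
    }
    return model_id


def get_by_id(model_id):
    """Comparable to WHERE model_id = ?: one direct lookup."""
    return car_models.get(model_id)


def filter_by_brand(brand):
    """Comparable to filtering on a non-key column: every row is inspected."""
    return [row for row in car_models.values() if row["brand"] == brand]


corolla_id = insert_car("Toyota", "Corolla", 2021, "Petrol", 1500000)
insert_car("Tesla", "Model 3", 2022, "Electric", 3500000)
insert_car("Hyundai", "Creta", 2023, "Diesel", 1800000)

print(get_by_id(corolla_id)["model_name"])
print([row["model_name"] for row in filter_by_brand("Tesla")])
```

If queries by brand are common, the usual remedy is a second table keyed by brand, which is the denormalization the notes above refer to.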
| Concept | Description |
| Schema-less | Each document can have a different structure, offering high flexibility |
| Nested Data | Documents can contain nested fields, arrays, and objects |
Example Document
{ "_id": "u123", "name": "Alice", "email": "alice@example.com", "skills": ["Java", "MongoDB"], "address": {
"city": "Bangalore", "zip": "560001" } }
Self-contained
Human-readable
| MongoDB Inc. | MongoDB | Open-source, most popular document DB using BSON format. Offers indexing, aggregation, and full CRUD support. |
| Apache | CouchDB | Uses JSON to store documents, features built-in replication and RESTful interface. |
| Amazon AWS | Amazon DocumentDB | Fully managed MongoDB-compatible document DB, integrated with AWS ecosystem. |
| Google (Firebase) | Cloud Firestore | Real-time, cloud-hosted NoSQL database optimized for mobile/web apps. |
| Couchbase Inc. | Couchbase | Combines document store with key-value access and in-memory caching. Supports SQL-like queries via N1QL. |
| Microsoft Azure | Cosmos DB | Globally distributed, multi-model DB service supporting MongoDB API, SQL API, Gremlin, etc. |
| RavenDB | RavenDB | Open-source document DB with ACID support, full-text search, and clustering. |
| OrientDB | OrientDB | Multi-model database (document + graph + key-value + object). Open source. |
| Query Language | MongoQL | REST API | Firebase SDK | SQL-like, MongoDB API |
| Best Use Case | General purpose | Offline-first apps | Mobile/web apps | Distributed apps |
Use Cases
E-commerce catalogs
Summary
Flexibility Schema-less
A graph database is a type of NoSQL database designed to store and process data with complex
relationships using graph structures. It consists of:
Traditional relational databases struggle with deep joins and relationship-intensive queries, while graph
databases are designed to efficiently model and query connected data.
| Component | Description |
[LIKES]
(Product A)
This models:
| Advantage | Description |
| Fast Relationship Queries | Optimized for traversing relationships quickly |
| Neo4j Inc. | Neo4j | Industry-leading graph DB, uses Cypher query language. Great for real-time recommendation engines, fraud detection. |
| Apache | TinkerPop (Gremlin) | Framework for graph processing, supports multiple backend engines like JanusGraph. |
| Amazon AWS | Amazon Neptune | Fully managed graph DB supporting both property graphs (Gremlin) and RDF (SPARQL). |
| Microsoft | Azure Cosmos DB (Gremlin API) | Supports graph storage via Gremlin; part of multi-model Cosmos DB. |
| OrientDB | OrientDB | Multi-model DB that supports graph + document + object models. |
| ArangoDB | ArangoDB | Multi-model DB (graph, document, key-value), supports AQL and graph traversals. |
| TigerGraph | TigerGraph | High-performance native parallel graph DB for big data analytics. |
| TerminusDB | TerminusDB | Git-like graph database focused on collaboration and versioned data. |
| Best Use Case | Real-time traversal | Knowledge Graphs | Hybrid apps | Graph + JSON apps |
Sample Query (Neo4j – Cypher Language)
// Create two nodes and a relationship
CREATE (a:Person {name: "Alice"})
CREATE (b:Person {name: "Bob"})
CREATE (a)-[:FRIEND_OF]->(b);
// Query all friends of Alice
MATCH (a:Person {name: "Alice"})-[:FRIEND_OF]->(friends)
RETURN friends.name;
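The traversal that the Cypher query performs can be approximated in Python with an adjacency list. This is a sketch, not how Neo4j stores or walks a graph; the person Carol and both function names are hypothetical additions used to show a two-hop traversal.

```python
from collections import deque

# Directed FRIEND_OF edges: person -> people they are a friend of.
edges = {
    "Alice": ["Bob"],
    "Bob": ["Carol"],  # Carol is a hypothetical extra node
    "Carol": [],
}


def friends(person):
    """Direct neighbours, like MATCH (a)-[:FRIEND_OF]->(friends)."""
    return list(edges.get(person, []))


def reachable(start, max_hops):
    """Breadth-first traversal returning everyone within max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in edges.get(person, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, hops + 1))
    return found


print(friends("Alice"))
print(reachable("Alice", 2))
```

Each hop follows stored edges directly instead of joining tables, which is why relationship-heavy queries suit this model.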
Summary
A Key-Value Database is the simplest form of NoSQL database. It stores data as a collection of key-value
pairs, where:
The value can be any data (string, number, JSON, blob, etc.).
Key Concepts
| Component | Description |
| Key | A unique identifier (like a primary key) |
| Value | The actual data associated with the key (can be simple or complex) |
| Schema-less | There is no fixed schema; values can vary in type and structure |
Example
Each record is accessed via its unique key, and the value can be any serializable object or blob.
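A minimal in-memory key-value store can make the access pattern concrete. The sketch below is an assumption-laden illustration: the KeyValueStore class is hypothetical, values are serialized with JSON, and the optional expiry only loosely resembles the TTL feature offered by stores such as Redis.

```python
import json
import time


class KeyValueStore:
    """Toy key-value store: opaque serialized values looked up by unique key."""

    def __init__(self):
        self._data = {}  # key -> (serialized value, expiry time or None)

    def put(self, key, value, ttl_seconds=None):
        """Serialize the value and store it, optionally with a time to live."""
        expires_at = None
        if ttl_seconds is not None:
            expires_at = time.monotonic() + ttl_seconds
        self._data[key] = (json.dumps(value), expires_at)

    def get(self, key, default=None):
        """Return the deserialized value, or default if missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return default
        payload, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily drop expired entries on read
            return default
        return json.loads(payload)

    def delete(self, key):
        """Remove a key; report whether it was present."""
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("user:1", {"name": "Alice", "skills": ["Java", "Python"]})
store.put("counter", 42)
print(store.get("user:1"))
print(store.get("missing", default="not found"))
```

The store never inspects the value, so the only supported lookup is by key; that restriction is what keeps reads and writes simple and fast.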
| Advantage | Description |
| Ideal for Caching | Great for session data, shopping carts, etc. |
| Redis Labs | Redis | In-memory key-value store. Fast, supports strings, hashes, lists, sets, pub/sub, and TTL. Widely used for caching, queues, and real-time analytics. |
| Amazon AWS | Amazon DynamoDB | Fully managed key-value and document DB with automatic scaling and multi-region replication. |
| Microsoft Azure | Azure Table Storage | Key-value database for structured NoSQL data, integrated with Azure cloud. |
| Riak | Riak KV | Highly available, fault-tolerant distributed key-value database. Ideal for IoT and large-scale systems. |
| Etcd | etcd | Lightweight, distributed key-value store for configuration and service discovery (used in Kubernetes). |
| LevelDB | LevelDB | High-performance embedded key-value store by Google. Used in browsers and low-level systems. |
Use Cases
Summary
MongoDB Drivers
A MongoDB driver is a library or package that allows applications to connect, interact, and communicate with
a MongoDB server using language-specific APIs.
Connection management
Query building
| PHP | mongodb (extension + library) | pecl install mongodb + Composer for mongodb/mongodb |
Summary
| Feature | Description |
You can store data using the MongoDB shell, Compass GUI, or programmatically via drivers.
// MongoDB shell: insert multiple documents
db.car_models.insertMany([
  { brand: "Tesla", model: "Model 3", year: 2022, fuel_type: "Electric", price: 3500000 },
  { brand: "Hyundai", model: "Creta", year: 2021, fuel_type: "Diesel", price: 1800000 }
]);

# Python (PyMongo). The connection setup is not in the original;
# the URI and the database name below are assumptions.
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017")
collection = client["car_dealership"]["car_models"]
# Insert one document
collection.insert_one({"brand": "Honda", "model": "Civic", "year": 2023, "fuel_type": "Petrol", "price": 2200000})
# Insert multiple documents
collection.insert_many([
    {"brand": "Ford", "model": "Mustang", "year": 2021, "fuel_type": "Petrol", "price": 5000000},
    {"brand": "Mahindra", "model": "Thar", "year": 2023, "fuel_type": "Diesel", "price": 1600000},
])
# Find documents with a filter
results = collection.find({"fuel_type": "Petrol"})
for car in results:
    print(car)
Update Example
// Update price of a Tesla
db.car_models.updateOne({ brand: "Tesla" }, { $set: { price: 3600000 } });
Delete Example
Summary Table
Querying MongoDB
db.car_models.find();
Output: Show only model and year of Toyota cars (hide _id).
4. Comparison Operators
5. Logical Operators
6. Sorting Results
If documents have:
| Limit/Skip | db.car_models.find().limit(5).skip(5) |
| Aggregation | db.car_models.aggregate([...]) |
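The comparison operators, logical operators, sorting, and limit/skip listed above can be imitated over a plain Python list to show what each step does. This is a simplified sketch: the functions matches and find are hypothetical, only a handful of operators are covered, and MongoDB's real matching rules (for arrays, missing fields, and embedded documents) are richer than this.

```python
cars = [
    {"brand": "Tesla", "model": "Model 3", "year": 2022, "fuel_type": "Electric", "price": 3500000},
    {"brand": "Hyundai", "model": "Creta", "year": 2021, "fuel_type": "Diesel", "price": 1800000},
    {"brand": "Honda", "model": "Civic", "year": 2023, "fuel_type": "Petrol", "price": 2200000},
    {"brand": "Ford", "model": "Mustang", "year": 2021, "fuel_type": "Petrol", "price": 5000000},
    {"brand": "Mahindra", "model": "Thar", "year": 2023, "fuel_type": "Diesel", "price": 1600000},
]

COMPARATORS = {
    "$eq": lambda value, arg: value == arg,
    "$ne": lambda value, arg: value != arg,
    "$gt": lambda value, arg: value is not None and value > arg,
    "$gte": lambda value, arg: value is not None and value >= arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$lte": lambda value, arg: value is not None and value <= arg,
    "$in": lambda value, arg: value in arg,
}


def matches(doc, query):
    """Evaluate a small subset of MongoDB-style filters against one document."""
    for key, cond in query.items():
        if key == "$and":
            ok = all(matches(doc, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(doc, sub) for sub in cond)
        elif isinstance(cond, dict):
            # Dict conditions are assumed to be operator expressions like {"$gt": 5}.
            value = doc.get(key)
            ok = all(COMPARATORS[op](value, arg) for op, arg in cond.items())
        else:
            ok = doc.get(key) == cond
        if not ok:
            return False
    return True


def find(collection, query=None, sort_key=None, descending=False, skip=0, limit=None):
    """Filter, then sort, then apply skip and limit, in that order."""
    results = [doc for doc in collection if matches(doc, query or {})]
    if sort_key is not None:
        # Assumes every matching document has sort_key.
        results.sort(key=lambda doc: doc[sort_key], reverse=descending)
    end = None if limit is None else skip + limit
    return results[skip:end]


expensive = find(cars, {"price": {"$gt": 2000000}}, sort_key="price")
print([car["brand"] for car in expensive])

recent_or_electric = find(cars, {"$or": [{"fuel_type": "Electric"}, {"year": {"$gte": 2023}}]})
print([car["brand"] for car in recent_or_electric])

second_page = find(cars, sort_key="price", skip=1, limit=2)
print([car["brand"] for car in second_page])
```

Sorting before skip and limit matters for paging: without a sort, the order of results is not guaranteed, so consecutive pages could overlap or miss documents.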