The document provides a comprehensive overview of SQL and NoSQL databases, detailing their structures, properties, and use cases. It discusses SQL's ACID properties, normalization, performance optimization techniques, and various types of SQL databases, alongside NoSQL's flexibility, scalability, and different types like document-oriented and key-value stores. Additionally, it compares SQL and NoSQL performance, outlines when to use each, and addresses security considerations for both database types.

Uploaded by

ferchichimanel24
Copyright
© © All Rights Reserved

SQL, NoSQL

1. SQL Databases

In BI, we need visualisation and dashboarding, where we can select, retrieve,
filter, etc. So we use SQL queries to create views from the available tables that
satisfy our visualisation needs.
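As a minimal sketch of that idea, the snippet below builds a view that a dashboard could read from. It uses Python's built-in `sqlite3` module with an in-memory database; the `sales` table, its columns, and the `sales_by_region` view are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical sales table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (
        id      INTEGER PRIMARY KEY,
        region  TEXT NOT NULL,
        amount  REAL NOT NULL
    );
    INSERT INTO sales (region, amount) VALUES
        ('North', 100.0), ('North', 50.0), ('South', 70.0);

    -- The view pre-selects and aggregates what a dashboard would display.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(amount) AS total_amount
        FROM sales
        GROUP BY region;
""")

# A BI tool would query the view like any other table.
rows = conn.execute(
    "SELECT region, total_amount FROM sales_by_region ORDER BY region"
).fetchall()
print(rows)
```

The view stores only the query, not the data, so it reflects new rows in `sales` the next time it is read.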

○ "Relational Database Management Systems (RDBMS)"

A relational database is a collection of highly structured tables, in which each row
represents a data entity and each column defines a specific information field.
Relational databases use Structured Query Language (SQL) to create,
store, update, and retrieve data.

So SQL is the standard interface for querying relational databases.

A relational database = tables

A table = a set of tuples (rows). Each tuple has the same structure of attributes (the
detail fields) and an ID (the primary key), and may have a foreign key: the ID of a row in
another table, which creates a relation between the two tables.
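The relation between two tables can be sketched as follows. This is an illustrative example with hypothetical `customers` and `orders` tables, run through Python's `sqlite3`; note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),  -- foreign key
        total       REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Asha');
    INSERT INTO orders (customer_id, total) VALUES (1, 250.0);
""")

def try_orphan_order(conn):
    """Try to insert an order for a customer that does not exist."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)", (99, 10.0)
        )
        return True
    except sqlite3.IntegrityError:
        return False

orphan_accepted = try_orphan_order(conn)

# The foreign key is what lets us join the two tables back together.
joined = conn.execute("""
    SELECT c.name, o.total
    FROM orders o JOIN customers c ON c.id = o.customer_id
""").fetchall()
print(orphan_accepted, joined)
```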

○ "ACID properties in SQL databases"

The ACID properties (Atomicity, Consistency, Isolation, and Durability) ensure reliable transactions in databases.

1. Atomicity
A transaction must be executed completely or not at all.
Example:
○ If you transfer ₹500 to A, your account should be debited, and A’s account should be credited.
○ If a failure occurs in between, the transaction is rolled back, ensuring no partial updates.

2. Consistency
The database remains in a valid state before and after a transaction.
Example:
○ Before: You have ₹1500, A has ₹2500.
○ After transferring ₹500, you should have ₹1000, and A should have ₹3000.
○ The total balance remains the same (₹4000), maintaining consistency.

3. Isolation
Multiple transactions run independently without affecting each other.
Example:
○ If two transactions read your balance as ₹2500 and both try to deduct ₹500, they might overwrite each other’s updates incorrectly (leaving ₹2000 instead of ₹1500).
○ Isolation prevents this by ensuring transactions do not interfere.

4. Durability
Once a transaction is committed, it remains in the database permanently.
Example:
○ If you transfer ₹500 and the system crashes afterward, your updated balance should be saved when the system restarts.
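The transfer example can be sketched in code. The following is an illustration only, using Python's `sqlite3` with a hypothetical `accounts` table; the `CHECK (balance >= 0)` rule is an assumption added here so that an overdraft fails halfway through and shows the rollback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES ('you', 1500), ('A', 2500);
""")

def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK, the credit above is undone too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "you", "A", 500)       # valid transfer
after_ok = balances(conn)
failed = transfer(conn, "you", "A", 5000)  # would overdraw, so nothing changes
after_failed = balances(conn)
print(ok, after_ok, failed, after_failed)
```

The credit is deliberately issued before the debit so that the failing case leaves a partial update for the rollback to undo.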

○ "Normalization and Denormalization in SQL"

Normalization organizes data to reduce redundancy, while denormalization adds redundancy for performance.

1. Normalization involves dividing large tables into smaller, related tables and
defining relationships between them. Its goals:
● Reduce data redundancy
● Maintain data integrity
● Optimize storage
● Improve or speed up queries

First Normal Form (1NF) – Ensures that all columns contain atomic (indivisible) values and each row is unique.

Second Normal Form (2NF) – Requires 1NF compliance and ensures that all non-key attributes are fully dependent on the whole primary key (no dependency on only part of a composite key).

Third Normal Form (3NF) – Requires 2NF compliance and removes transitive dependencies (i.e., no non-key column should depend on another non-key column).

Boyce-Codd Normal Form (BCNF) – A stricter version of 3NF, ensuring that every determinant is a candidate key.

Fourth Normal Form (4NF) and Fifth Normal Form (5NF) – Used for handling multi-valued dependencies and join dependencies (complex relationships).

2. Denormalization is the opposite: it keeps or reintroduces redundant data in fewer, wider tables, like the starting data we had in the example.

● Query Performance
○ Fewer joins needed
○ Faster read operations
○ Simpler queries
○ Better response time

● Reporting Efficiency
○ Direct access to data
○ Simpler report queries
○ Faster aggregations
○ Better user experience

3. When to Normalize
a. OLTP systems
b. Data entry applications
c. When data consistency is critical
d. When storage is a concern

OLTP (Online Transaction Processing) is a type of database system designed to handle a large number of short, fast, real-time transactions.

4. When to Denormalize
a. Data warehouses
b. Reporting systems
c. Read-heavy applications (Frequent Reads, Fewer Writes)
d. When performance is critical

5. Design Guidelines
a. Start with normalized design
b. Denormalize strategically
c. Document design decisions
d. Consider data volume

6. Things to Remember
a. Balance needs carefully
b. Consider maintenance costs
c. Plan for growth
d. Monitor performance

● Normalize for consistency
● Denormalize for performance
● Balance based on needs
● Document your choices
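To make the trade-off concrete, here is a small sketch that starts from a denormalized table, normalizes it into two tables, and then rebuilds the flat shape with a join. The table and column names (`orders_flat`, `customers`, `orders`) and the sample rows are hypothetical; customer names are assumed to be unique for the sake of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized: customer details are repeated on every order row.
    CREATE TABLE orders_flat (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT,
        customer_city TEXT,
        total         REAL
    );
    INSERT INTO orders_flat VALUES
        (1, 'Asha', 'Pune',  250.0),
        (2, 'Asha', 'Pune',  100.0),
        (3, 'Ben',  'Tunis',  75.0);

    -- Normalized: each customer is stored once; orders reference it by key.
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT UNIQUE,
        city TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total       REAL
    );
    INSERT INTO customers (name, city)
        SELECT DISTINCT customer_name, customer_city FROM orders_flat;
    INSERT INTO orders (order_id, customer_id, total)
        SELECT f.order_id, c.id, f.total
        FROM orders_flat f JOIN customers c ON c.name = f.customer_name;
""")

flat = conn.execute(
    "SELECT order_id, customer_name, customer_city, total "
    "FROM orders_flat ORDER BY order_id"
).fetchall()

# Reading the normalized form back costs one join.
rebuilt = conn.execute("""
    SELECT o.order_id, c.name, c.city, o.total
    FROM orders o JOIN customers c ON c.id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(flat == rebuilt, customer_count)
```

In the normalized form a change of city touches one row in `customers`; in the flat form it must be applied to every order row for that customer, which is the redundancy that normalization removes and denormalization accepts in exchange for join-free reads.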

○ "SQL Performance Optimization Techniques"

1. Indexing for Faster Query Performance
○ Use indexes on frequently queried columns.
○ Avoid excessive indexing to prevent slow inserts/updates.
2. Optimize SELECT Queries
○ Retrieve only the necessary columns instead of SELECT *.
○ Consider EXISTS instead of IN for subqueries; many optimizers treat the two alike, so check the query plan.
3. Use Joins Efficiently
○ Prefer INNER JOIN over OUTER JOIN when possible.
○ Ensure JOIN conditions use indexed columns.
4. Avoid Unnecessary Calculations in WHERE Clause
○ Avoid functions on indexed columns in WHERE clauses.
○ Use direct comparisons instead of applying transformations.
5. Use Proper Data Types
○ Use appropriate data types for efficiency (e.g., INTEGER instead of VARCHAR for IDs).
○ Limit string sizes with VARCHAR(N) instead of TEXT.
6. Optimize Pagination
○ Use indexed WHERE conditions (keyset pagination) instead of high OFFSET values.
7. Denormalization and Partitioning
○ Reduce JOIN operations by denormalizing frequently accessed data.
○ Use partitioning for large tables to improve query performance.
8. Use Query Caching
○ Enable query caching if supported by the database.
○ Optimize frequently executed queries for cache efficiency.
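Two of the techniques above (indexes versus functions in WHERE, and pagination without OFFSET) can be checked with a short sketch. It uses SQLite through Python's `sqlite3`; the `users` table and the helper names `plan` and `page_after` are hypothetical, and the exact wording of the query plan differs between SQLite versions, so only its first word is inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1, 101)],
)
conn.execute("CREATE INDEX idx_users_email ON users(email)")

def plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

# Direct comparison on the indexed column: the index can be used.
direct = plan(
    "SELECT id, name FROM users WHERE email = ?", ("user7@example.com",)
)
# Function applied to the indexed column: the plain index cannot be used.
wrapped = plan(
    "SELECT id, name FROM users WHERE lower(email) = ?", ("user7@example.com",)
)

def page_after(last_id, size):
    """Keyset pagination: seek past the last seen id instead of using OFFSET."""
    return conn.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size),
    ).fetchall()

first = page_after(0, 10)
second = page_after(first[-1][0], 10)
print(direct)
print(wrapped)
print(second[0])
```

With keyset pagination the database seeks straight to `id > last_id` through the primary key, whereas a large OFFSET forces it to read and discard all the skipped rows first.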

○ Some notions:
1. Different databases:
2. Joins in SQL:
3. Indexing, Transactions, and Constraints

● Indexing – Speeds up data retrieval by creating a searchable structure (e.g., B-Trees, Hash indexes).
● Transactions – A sequence of operations treated as a single unit, following ACID properties.
● Constraints – Rules to enforce data integrity (e.g., NOT NULL, UNIQUE, CHECK, DEFAULT).
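The four constraints named above can be seen in action in the following sketch. It is an illustration with a hypothetical `products` table, run through Python's `sqlite3`; the `rejected` helper is introduced here only to report whether a statement was refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        id    INTEGER PRIMARY KEY,
        sku   TEXT NOT NULL UNIQUE,             -- required and unique
        price REAL NOT NULL CHECK (price >= 0), -- must not be negative
        stock INTEGER NOT NULL DEFAULT 0        -- filled in when omitted
    )
""")
conn.execute("INSERT INTO products (sku, price) VALUES ('A-1', 9.5)")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "duplicate_sku": rejected("INSERT INTO products (sku, price) VALUES ('A-1', 3.0)"),
    "negative_price": rejected("INSERT INTO products (sku, price) VALUES ('B-2', -1.0)"),
    "missing_sku": rejected("INSERT INTO products (price) VALUES (1.0)"),
}
default_stock = conn.execute(
    "SELECT stock FROM products WHERE sku = 'A-1'"
).fetchone()[0]
print(results, default_stock)
```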

4. Data Integrity & Keys

● Data Integrity – Ensures accuracy, consistency, and reliability of data.
● Primary Key – Uniquely identifies each row in a table (e.g., id INT PRIMARY KEY).
● Foreign Key – Links two tables by referencing a primary key in another table.

5. OLTP (Online Transaction Processing)

● Handles real-time, fast, and frequent transactions.
● Used in banking, e-commerce, and airline reservation systems.
● Ensures data consistency and concurrency control.



○ Scalability and Replication in SQL Databases:

1. Vertical Scaling (Scaling Up)

● Involves upgrading the existing database server (e.g., increasing CPU, RAM, or storage).
● Does not require changes in SQL queries, and is often combined with optimizing indexes and caching.

Example: Adding an index to improve performance.

CREATE INDEX idx_users_email ON users(email);

2. Horizontal Scaling (Sharding)

● Involves distributing data across multiple servers.
● Requires partitioning data, often through sharding.

3. Data Replication and Partitioning

● Replication ensures high availability and fault tolerance by copying data from a master (primary) to one or more slave (replica) databases.
● Horizontal Partitioning (Divides the rows of a table across multiple tables)
● Vertical Partitioning (Stores different columns in separate tables)
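Sharding needs a rule that decides which server holds which row. A minimal sketch of one common rule, hash-based routing, is shown below; the shards are plain in-memory dictionaries standing in for separate servers, and `shard_for`, `put`, and `get` are hypothetical helper names.

```python
import hashlib

NUM_SHARDS = 4

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard number using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Each dict stands in for one database server.
shards = {i: {} for i in range(NUM_SHARDS)}

def put(user_id: str, record: dict) -> None:
    shards[shard_for(user_id)][user_id] = record

def get(user_id: str):
    # The same hash sends reads to the shard that received the write.
    return shards[shard_for(user_id)].get(user_id)

for n in range(20):
    put(f"user-{n}", {"email": f"user{n}@example.com"})

print(get("user-7"), {i: len(s) for i, s in shards.items()})
```

A simple modulo rule like this moves most keys to a different shard when the number of shards changes; schemes such as consistent hashing are often used to limit that reshuffling.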


○​ Popular SQL Databases: MySQL, PostgreSQL, Oracle, SQL Server

1. MySQL

● Open-source, widely used for web applications.
● Known for simplicity, speed, and ease of use.
● Supports replication, partitioning, and JSON data types.
● Uses InnoDB as the default storage engine with ACID compliance.

2. PostgreSQL

● Open-source, highly extensible and standards-compliant.
● Known for advanced features like CTEs, JSONB, full-text search, and MVCC.
● Supports strong ACID compliance and concurrency control.
● Used in enterprise applications requiring complex queries.

3. Oracle Database

● Enterprise-grade database with high performance and security.
● Offers features like PL/SQL, advanced partitioning, and RAC (Real Application Clusters).
● Known for high availability, scalability, and deep analytics capabilities.
● Used in banking, finance, and large enterprise systems.

4. SQL Server (Microsoft SQL Server)

● Developed by Microsoft, used in enterprise and business applications.
● Supports T-SQL, advanced security, and BI tools like SSIS, SSRS, and SSAS.
● Offers seamless integration with Microsoft products (Azure, Power BI, etc.).
● Provides Always On availability groups for high availability.
2.​ NoSQL Databases

1. Non-Relational Databases

●​ Also called NoSQL databases, they do not follow the traditional relational model.
●​ Designed for scalability, flexibility, and handling diverse data structures.
●​ Common types: Document-oriented, Key-value, Column-family, Graph
databases.
2. Document-Oriented Databases (e.g., MongoDB)

● Store data in JSON-like documents (BSON in MongoDB).
● Flexible schema allows dynamic fields.
● Used for content management, catalogs, real-time analytics.
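The idea of documents with dynamic fields can be sketched with plain Python dictionaries. This is not the MongoDB API, only an illustration of the data model; the `products` collection and the `find` and `with_field` helpers are hypothetical.

```python
import json

# Three documents in one collection, each with a different set of fields.
products = [
    {"_id": 1, "name": "Laptop", "price": 900, "specs": {"ram_gb": 16}},
    {"_id": 2, "name": "T-shirt", "price": 15, "sizes": ["S", "M", "L"]},
    {"_id": 3, "name": "E-book", "price": 5},
]

def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

def with_field(collection, field):
    """Return the ids of documents that carry the given optional field."""
    return [doc["_id"] for doc in collection if field in doc]

ebooks = find(products, name="E-book")
sized = with_field(products, "sizes")

# Documents serialize naturally to JSON, nested fields included.
as_json = json.dumps(products[0])
print(ebooks, sized, as_json)
```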

3. Key-Value Stores (e.g., Redis, DynamoDB)

● Store key-value pairs for ultra-fast read/write access.
● Highly optimized for caching, session storage, and real-time applications.
● Redis supports in-memory storage, persistence, and pub/sub messaging.
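Session storage and caching usually rely on keys that expire. The sketch below imitates that behaviour with an in-memory class; `TTLStore` is a hypothetical name and this is not the Redis API, only an illustration of key-value access with a time-to-live. The clock is injectable so the example runs without waiting.

```python
import time

class TTLStore:
    """Tiny in-memory key-value store with optional expiry per key."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return default
        return value

# A fake clock makes the expiry visible immediately.
now = [0.0]
store = TTLStore(clock=lambda: now[0])
store.set("session:1", "alice", ttl=30)
store.set("config:theme", "dark")          # no expiry

before = store.get("session:1")
now[0] = 31.0                              # 31 seconds later
after = store.get("session:1")
print(before, after, store.get("config:theme"))
```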

4. Column-Family Stores (e.g., Cassandra, HBase)

● Store each row under a row key, with its columns grouped into column families; rows can be sparse and need not share the same columns, which suits wide tables and large datasets.
● Optimized for distributed, high-volume writes.
● Used in log processing, IoT, and recommendation systems.

5. Graph Databases (e.g., Neo4j, ArangoDB)

● Use nodes, edges, and properties to store and query relationships efficiently.
● Best suited for social networks, fraud detection, recommendation engines.
● Support complex queries like shortest path, recommendations, and pattern matching.
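A shortest-path query can be illustrated with a breadth-first search over an adjacency list. This is a sketch of the underlying idea rather than how any particular graph database implements it; the `follows` graph and its names are made up for the example.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Return the shortest list of nodes from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}   # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue       # already reached by a path at least as short
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the chain of predecessors back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Nodes are people, edges are "follows" relationships (directed).
follows = {
    "ana": ["ben", "cy"],
    "ben": ["dia"],
    "cy": ["dia", "eli"],
    "dia": ["eli"],
    "eli": [],
}

print(shortest_path(follows, "ana", "eli"))
print(shortest_path(follows, "eli", "ana"))
```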

6. CAP Theorem (Consistency, Availability, Partition Tolerance)

● A distributed system can only guarantee two out of three at the same time:
○ Consistency → Every node sees the same data at the same time.
○ Availability → Every request gets a response, even if nodes fail.
○ Partition Tolerance → System continues to function despite network partitions.
● Since network partitions cannot be ruled out in practice, the real choice during a partition is between consistency and availability.
● Example (typical default behavior; many systems let you tune this):
○ CP (MongoDB, HBase) → Prioritizes Consistency & Partition Tolerance.
○ AP (Cassandra, DynamoDB) → Prioritizes Availability & Partition Tolerance.

7. BASE Model (Basically Available, Soft state, Eventual consistency)

● Alternative to ACID for NoSQL databases.
○ Basically Available → System always responds, even if stale data is returned.
○ Soft State → System state can change even without new input.
○ Eventual Consistency → Data will eventually become consistent across nodes.
● Used in high-scalability applications like social media, content delivery networks (CDNs), and distributed databases.
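Eventual consistency can be sketched with two replicas that answer reads immediately and reconcile later. The example below assumes a simple last-write-wins rule based on an integer version number, which is a simplification chosen for illustration; real systems use mechanisms such as timestamps or vector clocks. `Replica` and `sync` are hypothetical names.

```python
class Replica:
    """One node holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        current = self.data.get(key)
        # Last-write-wins: keep the entry with the highest version.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

def sync(a, b):
    """Exchange entries so both replicas end up with the newest versions."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)

r1, r2 = Replica("r1"), Replica("r2")
r1.write("likes", 10, version=1)
sync(r1, r2)

r1.write("likes", 11, version=2)   # the update reaches only r1 at first
stale = r2.read("likes")           # r2 still answers, but with old data
sync(r1, r2)                       # background reconciliation
fresh = r2.read("likes")
print(stale, fresh)
```

The read that returns the old value corresponds to "Basically Available", the change in `r2` during `sync` without any new client write corresponds to "Soft State", and the final agreement corresponds to "Eventual Consistency".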

8. Schema-Less Design

● No fixed schema, unlike relational databases.
● Data can have variable structures, making it more flexible.
● Used in big data applications, dynamic web content, and IoT systems.

9. Horizontal Scaling & Sharding

● Horizontal Scaling (Scale-Out) → Adding more servers (nodes) to distribute load.
● Sharding → Splitting data across multiple databases (shards) for better performance.
● Essential for handling large-scale web applications and distributed databases.

10. Big Data, Unstructured Data, Semi-Structured Data

●​ Big Data → Massive volumes of data that require distributed storage & processing
(e.g., Hadoop, Spark).
●​ Unstructured Data → No predefined format (e.g., images, videos, logs).
●​ Semi-Structured Data → Partially organized (e.g., JSON, XML, NoSQL documents).
SQL vs NoSQL Performance Comparison

"SQL vs NoSQL: A Comparative Study"

● SQL (Relational Databases): Optimized for complex queries, transactions (ACID compliance), and structured data. Best for consistency and integrity.
● NoSQL (Non-Relational Databases): Optimized for scalability, high-speed reads/writes, and unstructured data. Best for distributed architectures and large-scale applications.
● Performance Factors: SQL performs better for JOIN-heavy operations, while NoSQL is superior for high-speed writes, horizontal scaling, and distributed workloads.

When to Use SQL vs NoSQL

✅ Use SQL when:

● Data requires strong consistency and integrity (e.g., banking, ERP, CRM).
● Complex queries and relationships are frequent (e.g., reporting, analytics).
● Transactions must follow ACID principles (e.g., financial applications).

✅ Use NoSQL when:

● The application requires high scalability and performance (e.g., social media, IoT).
● Data is semi-structured or unstructured (e.g., JSON, logs, real-time feeds).
● You need high availability with eventual consistency (e.g., content delivery, caching).
Big Data and NoSQL Databases

●​ Big Data Challenges: Handling large volumes, high velocity, and variety of data.
●​ NoSQL for Big Data:
○​ MongoDB (Document storage for flexibility).
○​ Cassandra (Scalable column-family storage).
○ HBase (Real-time random read/write access on top of Hadoop).
○​ Elasticsearch (Full-text search on large datasets).
●​ Key Advantage: Distributed architecture allows handling petabytes of data efficiently.

Hybrid Database Approaches

●​ Hybrid Databases combine SQL and NoSQL features (e.g., PostgreSQL with JSON
support, Azure Cosmos DB).
●​ Use Cases:
○​ Store structured financial data in SQL, but customer interactions in NoSQL.
○​ Use NoSQL for real-time analytics while keeping SQL for long-term reporting.
○​ Combine graph and relational databases for better relationship analysis.

Data Warehousing with SQL and NoSQL

● SQL for Data Warehousing:
○ Examples: Snowflake, Amazon Redshift, Google BigQuery.
○ Structured data storage, optimized for BI and reporting.
○ Supports ETL (Extract, Transform, Load) processes.
● NoSQL for Data Warehousing:
○ Examples: Hadoop, Apache Drill (big-data tools rather than NoSQL databases in the strict sense; Drill is a SQL query engine over schema-free data).
○ Handles semi-structured/unstructured data (logs, JSON, XML).
○ Works well for real-time data ingestion and analytics.

Security Considerations in SQL and NoSQL

✅ SQL Security Risks & Solutions:

● SQL Injection → Use prepared statements and input validation.
● Access Control → Implement role-based access (RBAC) and encryption.
● Data Integrity → ACID transactions ensure consistency.
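The difference a prepared (parameterized) statement makes against SQL injection can be shown in a few lines. The sketch uses Python's `sqlite3` with a hypothetical `users` table; `find_unsafe` is deliberately vulnerable and exists only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT,
        is_admin INTEGER
    );
    INSERT INTO users (username, is_admin) VALUES ('alice', 0), ('root', 1);
""")

# Input crafted to turn the WHERE clause into an always-true condition.
malicious = "nobody' OR '1'='1"

def find_unsafe(username):
    # Vulnerable: user input is concatenated into the SQL text.
    sql = "SELECT username FROM users WHERE username = '" + username + "'"
    return conn.execute(sql).fetchall()

def find_safe(username):
    # Parameterized: the driver passes the input as data, never as SQL.
    return conn.execute(
        "SELECT username FROM users WHERE username = ?", (username,)
    ).fetchall()

leaked = find_unsafe(malicious)   # returns every user
blocked = find_safe(malicious)    # no user has that literal name
print(leaked, blocked, find_safe("alice"))
```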

✅ NoSQL Security Risks & Solutions:

● Lack of Standardized Authentication → Use database-specific security mechanisms (e.g., MongoDB authentication).
● Eventual Consistency Issues → Use strong consistency settings where necessary.
● Data Encryption → Secure sensitive data in transit and at rest.
