DBMS Finals Last Min Notes Draft 1.
encompasses the core facilities
into one or more tables (or "relations")
Relational Algebra is considered a procedural query language. It has a group of operators that work on relations or tables. It takes relations as input and gives a relation as output.

of two relations with X and Y rows will have X*Y rows.

Union (∪) - It eliminates the duplicate

rename the attribute ‘B’ of relation R by ‘A’.

Natural Join (⋈) - If there are two relations A and B, then the natural join between A and B is the set of all combinations of tuples from A and B that have equal values on their common attributes.

Intersection (∩) - If there are two relations A and B, then the output of A ∩ B will be the set of tuples that are
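
A minimal Python sketch of the operators described above, assuming a relation is modelled as a list of dicts (one dict per tuple, no duplicates). The function names and the sample relations `students`, `grades`, and `courses` are hypothetical and only there for illustration.

```python
from itertools import product

def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s (attribute names assumed disjoint)."""
    return [{**a, **b} for a, b in product(r, s)]

def natural_join(r, s):
    """Keep the combinations of tuples that agree on every common attribute."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return [{**a, **b} for a, b in product(r, s)
            if all(a[c] == b[c] for c in common)]

def rename(r, old, new):
    """Rename attribute `old` to `new` in every tuple of r."""
    return [{(new if k == old else k): v for k, v in row.items()} for row in r]

def intersection(r, s):
    """Tuples present in both r and s (same schema assumed)."""
    return [row for row in r if row in s]

# Hypothetical sample relations.
students = [{"sid": 1, "name": "Asha"}, {"sid": 2, "name": "Ben"}]
grades = [{"sid": 1, "course": "DBMS"}, {"sid": 1, "course": "OS"},
          {"sid": 3, "course": "DBMS"}]
courses = [{"title": "DBMS"}, {"title": "OS"}, {"title": "CN"}]

pairs = cartesian_product(students, courses)   # 2 * 3 rows
joined = natural_join(students, grades)        # matches on the shared "sid"
renamed = rename(students, "name", "student_name")
both = intersection(students, [{"sid": 2, "name": "Ben"}])
```

Here the Cartesian product of a 2-row and a 3-row relation has 2*3 = 6 rows, and the natural join keeps only the tuples whose `sid` values match.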
Super Key - Adding any number of attributes to a candidate key gives a super key.

the update of the data does not cause any anomalies.
a tuple. The value of the primary key in
only one primary key in a table.
keys except the primary key are called

form if it doesn’t contain any
Second Normal Form (2NF) - For a
attribute should be functionally
for every trivial functional dependency X→Y, Y is a subset of X (Y⊆X). This form is stronger than 3NF.

two tables are combined and selected in the joined table which has the same value for the common column.
Joins are used in relational databases to combine data from multiple tables

the right table ‘B’ and the matching rows from the left table ‘A’ in the join.
● Entity-Relationship (ER) model: Represents entities, attributes, and relationships between entities.
● Example: In an ER model for a

be "places an order" and "contains."

Enhanced Entity Relationships (EER):
● EER Model: Extends the ER
model to tables, columns, and relationships in a logical schema.
● Example: Mapping an ER diagram for a library database to a logical schema with tables like "Books," "Authors," and "Borrowers" with

have multiple grades for different courses.

Transaction Processing and Concurrency Control:
● Transaction: A unit of work that
----
- Data Layer:
To summarize:
- Conceptual schema describes the overall logical structure of the database.
- Internal schema deals with the physical storage and implementation details.

- 3-tier architecture separates application components based on functionality.

Here are the key differences between super key, primary key, candidate key, and surrogate key:
Candidate Key:
- A candidate key is a minimal super key that can uniquely identify tuples in a relation.
- It satisfies the uniqueness and minimality requirements.

- A foreign key is an attribute or set of attributes in one relation that refers to the primary key of another relation.
- It establishes a relationship between two tables.
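
A small sketch of how primary and foreign keys behave in practice, using Python's built-in `sqlite3` module. The `Departments` table, the `DeptID` column, and the helper `try_insert` are hypothetical names chosen for illustration. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# DeptID is the primary key of Departments; Employees.DeptID refers to it.
conn.execute("""CREATE TABLE Departments (
    DeptID INTEGER PRIMARY KEY,
    DeptName TEXT NOT NULL UNIQUE)""")
conn.execute("""CREATE TABLE Employees (
    EmployeeID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    DeptID INTEGER NOT NULL REFERENCES Departments(DeptID))""")

conn.execute("INSERT INTO Departments VALUES (1, 'Sales')")
conn.execute("INSERT INTO Employees VALUES (1, 'John Doe', 1)")

def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Foreign key violation: department 99 does not exist.
orphan = try_insert("INSERT INTO Employees VALUES (2, 'Jane Roe', 99)")
# Primary key violation: EmployeeID 1 is already taken.
duplicate = try_insert("INSERT INTO Employees VALUES (1, 'Copy', 1)")
valid = try_insert("INSERT INTO Employees VALUES (2, 'Jane Roe', 1)")
```

The primary key keeps each tuple unique, and the foreign key ensures every employee row points at an existing department row.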
Summary:
primary keys are chosen super keys
that uniquely identify tuples.
- **DML** stands for Data Manipulation Language, and it is used to manipulate the data within the database.
- DML statements are used to retrieve, insert, update, and delete data in tables.
- Examples of DML statements include SELECT, INSERT, UPDATE, and DELETE.

- **DCL** stands for Data Control Language, and it is used to manage user access and permissions within the database.
- DCL statements are used to grant or revoke privileges and permissions to users.
- Examples of DCL statements include GRANT and REVOKE.

representing logical expressions in boolean algebra.

- **BCNF** stands for Boyce-Codd Normal Form, and it is a higher level of normalization in database design.
- BCNF requires that for every non-trivial functional dependency X→Y, X is a super key.
- It reduces redundancy and update anomalies in the database design.

Closure Property and Candidate Key Notes:
- The **closure** of a set of attributes X (written X+) is the set of all attributes that X functionally determines.
- To find a **candidate key**, start with a set of attributes and compute its closure. If the closure contains every attribute of the relation and no proper subset has the same property, the set is a candidate key.
- For example, in a table of students, if the set of attributes {StudentID, Name} can determine all other attributes (such as Address and Phone), then it is a super key. It is a candidate key only if no proper subset (such as {StudentID} alone) also determines all attributes.

Example of DML:
INSERT INTO Employees (EmployeeID, Name, Department, Salary)
VALUES (1, 'John Doe', 'Sales', 5000);
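
The closure and candidate-key check from the notes above can be sketched in Python as follows. This is a minimal illustration, not a textbook-exact algorithm. The functional dependency `StudentID → Name, Address, Phone` is an assumption made up for the student example, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return attrs+ : every attribute functionally determined by attrs.

    fds is a list of (lhs, rhs) pairs, each side an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_super_key(attrs, all_attrs, fds):
    """A super key's closure covers every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)

def is_candidate_key(attrs, all_attrs, fds):
    """A candidate key is a super key that stops being one if any attribute is dropped."""
    attrs = set(attrs)
    if not is_super_key(attrs, all_attrs, fds):
        return False
    return not any(is_super_key(attrs - {a}, all_attrs, fds) for a in attrs)

# Assumed schema and dependency for the student example.
all_attrs = {"StudentID", "Name", "Address", "Phone"}
fds = [({"StudentID"}, {"Name", "Address", "Phone"})]

super_only = is_super_key({"StudentID", "Name"}, all_attrs, fds)
not_minimal = is_candidate_key({"StudentID", "Name"}, all_attrs, fds)
minimal = is_candidate_key({"StudentID"}, all_attrs, fds)
```

Under that assumed dependency, {StudentID, Name} is a super key but not a candidate key, because {StudentID} alone already determines everything.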
- EmployeeID
Salary DECIMAL(10,2)
1. Introduction to Big Data:
- Big data refers to a large volume of data that is too complex and exceeds the processing capacity of traditional database systems.
- It is characterized by the 3Vs: Volume, Velocity, and Variety.
- Big data sources include social media, sensors, devices, and transactional systems.

3. NoSQL Databases:
- NoSQL (Not Only SQL) databases are designed to handle unstructured and semi-structured data.
- They provide high scalability and performance for big data applications.
- Examples of NoSQL databases include MongoDB, Cassandra, and Redis.
4. Data Processing and Analytics:

6. Data Visualization:
- Data visualization helps to represent complex big data in a visual format, making it easier to understand and analyze.

8. Real-world Applications:
- Big data is used in various domains, including finance, healthcare, marketing, and social media analysis.
- Examples of big data applications include fraud detection, personalized recommendations, predictive maintenance, and sentiment analysis.

Inner Join:
Combines rows from two or more tables based on a related column (key). Returns only the matching rows between the tables.
Example query:
SELECT Orders.OrderID, Customers.CustomerName

Returns all rows from the left table and the matching rows from the right table.
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

Right Join (or Right Outer Join):
Returns all rows from the right table and the matching rows from the left table.
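
To see the join types above side by side, here is a small sketch using Python's built-in `sqlite3` module. The sample rows are made up for illustration. Since older SQLite versions do not support RIGHT JOIN, the right join is written here as a LEFT JOIN with the tables swapped, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT)")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER)")
# Hypothetical data: Ben has no orders, order 12 has no matching customer.
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 3)])

# Inner join: only rows that match in both tables.
inner = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
    ORDER BY Orders.OrderID""").fetchall()

# Left join: every customer, with NULL (None) where no order matches.
left = conn.execute("""
    SELECT Customers.CustomerName, Orders.OrderID
    FROM Customers
    LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Customers.CustomerID, Orders.OrderID""").fetchall()

# Right join of Customers with Orders, expressed as a left join from Orders:
# every order, with NULL (None) where no customer matches.
right = conn.execute("""
    SELECT Customers.CustomerName, Orders.OrderID
    FROM Orders
    LEFT JOIN Customers ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Orders.OrderID""").fetchall()
```

The inner join drops both Ben and order 12, the left join keeps Ben with an empty order, and the right join keeps order 12 with an empty customer name.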
SELECT Customers.CustomerName, Orders.OrderID

Selects rows from a relation that satisfy a given condition.
Relational Modelling:
3. Primary Key and Foreign Key:
2. Normalization:
1. Query Processing Steps:
- Parsing: Analyzing the query and checking its syntax.
- Semantic Analysis: Checking the query's semantics and verifying table and column existence.
- Query Optimization: Finding the most efficient execution plan for the query.
- Query Execution: Retrieving and processing the data based on the optimized plan.

2. Indexing:
- Indexes improve query performance by allowing faster data retrieval.
- Common types include B-tree indexes and hash indexes.
- Example: Creating an index on the "ProductID" column to speed up searches for specific products.

- Cost-Based Optimization: Evaluates different query plans and chooses the most efficient based on estimated costs.
- Join Optimization: Determines the order in which tables are joined to minimize the overall cost.
- Query Rewriting: Modifies the original query to generate equivalent but more efficient queries.
- Example: Reordering joins to process smaller result sets first to reduce intermediate data.

4. Caching:
- Caching stores frequently accessed data in memory for faster retrieval.
- Reduces the need to access data from disk repeatedly.
- Example: Caching query results or frequently used lookup tables in memory for quick access.
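
The "ProductID" indexing example above can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the `Products` table layout, the index name `idx_products_productid`, and the helper `plan` are assumptions for illustration, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ProductID is deliberately not a primary key, so no index exists at first.
conn.execute("CREATE TABLE Products (ProductID INTEGER, Name TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [(i, f"item-{i}", i * 1.5) for i in range(1000)],
)

def plan(sql):
    """Return the optimizer's chosen plan as text (4th column of EXPLAIN QUERY PLAN)."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT Name FROM Products WHERE ProductID = 42"

plan_before = plan(query)  # full table scan
conn.execute("CREATE INDEX idx_products_productid ON Products (ProductID)")
plan_after = plan(query)   # lookup through the new index

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists, the optimizer has to scan the whole table. Afterwards the plan mentions the index, which is the query optimization step picking a cheaper execution plan.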
- Breaks down a query into smaller tasks that can be processed concurrently.

Introduction to Transaction Processing:
- Transaction processing involves executing a set of operations as a

- Example: A banking transaction that transfers funds from one account to another.
- Atomicity: A transaction should be all or nothing. It should either execute completely or not at all.
- Consistency: A transaction should
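
The banking transfer and the all-or-nothing (atomicity) idea above can be sketched with Python's built-in `sqlite3` module. The `Accounts` table, the CHECK constraint on the balance, and the function names are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Accounts (
    AccountID INTEGER PRIMARY KEY,
    Balance INTEGER NOT NULL CHECK (Balance >= 0))""")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE AccountID = ?",
                (amount, dst))
            # If this debit would overdraw the account, the CHECK constraint
            # raises and the credit above is rolled back as well.
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE AccountID = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {AccountID: Balance} for all accounts."""
    return dict(conn.execute(
        "SELECT AccountID, Balance FROM Accounts ORDER BY AccountID"))

ok = transfer(conn, 1, 2, 30)        # succeeds
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)   # debit fails, so the credit is undone too
after_failed = balances(conn)
```

The second transfer fails halfway through, yet the balances are unchanged afterwards: either both updates happen or neither does.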
Summary:
- Transaction processing ensures consistency, integrity, and concurrency control in database systems.

- Process of organizing data in a database to eliminate redundancy and dependency issues.
- Reduces data anomalies and improves data integrity and efficiency.
- Eliminates repeating groups and ensures data is stored in a tabular format.

- Example: Splitting a "Customer" table into separate "Customer" and "Address" tables.
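
The "Customer" and "Address" split above can be illustrated with a few lines of Python. The column names and sample rows are hypothetical. The point is that the customer's name is stored once after the split, and the original rows can still be rebuilt by joining on `CustomerID`.

```python
# Hypothetical unnormalized rows: the customer's name repeats for every address.
flat_rows = [
    {"CustomerID": 1, "Name": "Asha", "Street": "1 Main St", "City": "Pune"},
    {"CustomerID": 1, "Name": "Asha", "Street": "9 Lake Rd", "City": "Mumbai"},
    {"CustomerID": 2, "Name": "Ben", "Street": "4 Hill Ave", "City": "Delhi"},
]

def split_customer_address(rows):
    """Split flat rows into a Customer table and an Address table."""
    customers = {}
    addresses = []
    for row in rows:
        # One entry per customer, however many addresses they have.
        customers[row["CustomerID"]] = {
            "CustomerID": row["CustomerID"], "Name": row["Name"]}
        # The address keeps CustomerID as a foreign key back to Customer.
        addresses.append({
            "CustomerID": row["CustomerID"],
            "Street": row["Street"], "City": row["City"]})
    return list(customers.values()), addresses

def rejoin(customers, addresses):
    """Rebuild the flat rows by joining Address to Customer on CustomerID."""
    names = {c["CustomerID"]: c["Name"] for c in customers}
    return [{"CustomerID": a["CustomerID"], "Name": names[a["CustomerID"]],
             "Street": a["Street"], "City": a["City"]} for a in addresses]

customers, addresses = split_customer_address(flat_rows)
```

Changing a customer's name now touches one row instead of one row per address, which is the kind of update anomaly normalization is meant to avoid.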
- It involves handling a large number of short and simple transactions in real-time.

RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
3. Variety: The different types and formats of data, including structured, semi-structured, and unstructured data.
4. Veracity: The reliability and accuracy of data, considering data quality and inconsistencies.

- It involves the 7 V's: Volume, Velocity, Variety, Veracity, Value, Variability, and Visualization.
- Big Data technologies, such as Hadoop and Spark, enable distributed storage, processing, and analysis of large datasets.
----
which means attribute A determines attribute B.
Starting with the set A, we will find the
A+)

process, we obtained the closure A+.
cardinality.
5. Connect entities with relationship
lines.
----