Basic Database Concepts
🔹 What is Data?
Definition:
Data is a collection of raw facts and figures that have no meaning on
their own.
Example:
"Raheem", "CS101", "95" – These are data values.
🔹 What is Information?
Definition:
Information is processed data that is meaningful and useful.
Example:
If we combine the above data:
“Raheem scored 95 marks in course CS101” — this is information.
🔹 What is a Database?
Definition:
A database is an organized collection of data that can be easily accessed,
managed, and updated.
Think of a database as an electronic filing cabinet.
🔸 Examples of Databases:
Application          Data Stored
School Management    Students, Teachers, Courses
Bank System          Accounts, Transactions, Customers
E-commerce Website   Products, Orders, Users
🔹 What is a Database Management System (DBMS)?
Definition:
A DBMS is software that interacts with users, applications, and the
database itself to capture and analyze data.
It allows users to create, read, update, and delete data in a database
(CRUD operations).
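💡 A minimal sketch of the four CRUD operations, using Python's built-in sqlite3 module with an in-memory database. The Student table and its values are assumptions made for illustration; any DBMS from the list below would accept similar statements.

```python
import sqlite3

# In-memory SQLite database; the table and its rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Name TEXT, Marks INTEGER)")

# Create: insert a new row.
conn.execute("INSERT INTO Student VALUES (?, ?, ?)", (1, "Raheem", 95))

# Read: fetch the row back.
row_after_insert = conn.execute("SELECT Name, Marks FROM Student WHERE RollNo = 1").fetchone()

# Update: change a value in place.
conn.execute("UPDATE Student SET Marks = 97 WHERE RollNo = 1")
row_after_update = conn.execute("SELECT Name, Marks FROM Student WHERE RollNo = 1").fetchone()

# Delete: remove the row.
conn.execute("DELETE FROM Student WHERE RollNo = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

print(row_after_insert, row_after_update, remaining)
```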
🔸 Examples of DBMS:
Microsoft Access
MySQL
Oracle
PostgreSQL
SQLite
SQL Server
🔹 Functions of a DBMS:
Function             Description
Data Storage         Stores data efficiently
Data Retrieval       Finds data quickly using queries
Data Manipulation    Adds, updates, and deletes data
Security             Controls access using permissions and roles
Backup and Recovery  Prevents data loss in case of system failure
Concurrency Control  Allows multiple users to access data safely
🔹 Advantages of DBMS:
1. Reduced Data Redundancy
2. Improved Data Integrity and Accuracy
3. Data Security
4. Data Consistency
5. Backup and Recovery Features
6. Easier Data Access and Sharing
🔹 Disadvantages of DBMS:
1. High Cost of Software and Hardware
2. Complexity in Setup and Maintenance
3. Requires Trained Personnel
4. Risk of System Failure (if not managed properly)
🔹 Real-Life Analogy
Imagine a library:
Books = Data
Librarian = DBMS
🔹 1. File-Based System
📘 Definition
A file-based system stores data in individual files. Each application
creates and manages its own data files directly, without a centralized
system.
It was commonly used before DBMS became popular.
📁 Example
Imagine a school system:
students.txt — stores student data
🔹 2. Database Approach
📗 Definition
In the database approach, data is stored in a centralized database
managed by a Database Management System (DBMS). Multiple
applications can access the same database in a controlled and efficient
way.
It reduces redundancy, improves consistency, and supports data sharing.
🗂️ Example
In a school DBMS, data for students, courses, and marks is stored in
related tables that are managed by a single DBMS such as MySQL or
Oracle.
Multi-user access
Improved security
🔹 Three-Level Schema Architecture
📘 Definition
The Three-Level Schema Architecture is a model proposed by
ANSI/SPARC to separate the user's view, the logical structure, and the
physical storage of data.
It enables data abstraction and data independence.
Three Levels of the Architecture
+------------------------------+
|        External Level        |
|   (User Views - Multiple)    |
+------------------------------+
               ↓
+------------------------------+
|       Conceptual Level       |
|  (Entire logical structure)  |
+------------------------------+
               ↓
+------------------------------+
|        Internal Level        |
|  (Physical storage format)   |
+------------------------------+
Deals with:
o Record placement
o Access methods
Includes:
💡 For example, a student can see their marks; admin can see all.
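💡 One common way to realize an external-level view is a SQL view defined over the conceptual-level table. The sketch below uses Python's sqlite3 module; the Marks table, the MyMarks view, and the sample rows are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one logical table holding every student's marks.
conn.execute("CREATE TABLE Marks (RollNo INTEGER, Course TEXT, Score INTEGER)")
conn.executemany("INSERT INTO Marks VALUES (?, ?, ?)",
                 [(101, "CS101", 95), (102, "CS101", 88)])

# External level: a view restricted to one student (RollNo 101 is an assumed example).
conn.execute("CREATE VIEW MyMarks AS SELECT Course, Score FROM Marks WHERE RollNo = 101")

student_view = conn.execute("SELECT * FROM MyMarks").fetchall()
admin_view = conn.execute("SELECT * FROM Marks ORDER BY RollNo").fetchall()

print(student_view)  # only the student's own rows
print(admin_view)    # the full table, as an admin would see it
```

The internal level (how SQLite lays out pages on disk) never appears in either query, which is the point of the separation.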
✅ Advantages
1. Data Abstraction:
Users don’t need to know how data is stored physically.
2. Data Independence:
o Logical Data Independence: You can change the schema at
🔹 Data Independence
📘 Definition
Data Independence is the capacity of a database to change the schema
(structure) at one level of the database system without requiring changes
at the next higher level.
It enables easy schema modifications without affecting applications.
Changing indexes
Physical data independence: "You can change how data is stored without
affecting the tables or relationships."
Logical data independence: "You can change the design without affecting
how users view or access the data."
🔸 Why is Data Independence Important?
Reason              Explanation
Easier Maintenance  Schema changes don't break programs
Security            Control what users see without exposing schema
Efficiency          Improve performance at internal level anytime
Summary Table:
Type                   Schema Changed  Schema Unaffected     Example
Physical Independence  Internal        Conceptual, External  Change index structure
Logical Independence   Conceptual      External              Add a column to a table
🔹 Relational Data Model
📗 Definition
The Relational Data Model is a method of organizing data into tables
(called relations). It was introduced by E.F. Codd in 1970.
Data is stored in rows and columns (like spreadsheets).
🔷 Key Terminology
Term         Meaning
Relation     A table with rows and columns
Tuple        A row in a table (represents one record)
Attribute    A column in a table (represents one field)
Domain       Set of allowed values for an attribute
Degree       Number of attributes (columns) in a relation
Cardinality  Number of tuples (rows) in a relation
📊 Example of a Relation (Table)
RollNo Name Department
101 Ali CS
102 Sana IT
103 Raheem SE
Relation: Student
Tuples: 3
⚠️ Disadvantages
Performance overhead for complex joins
🔹 1. Attributes
📘 Definition
An attribute is a column in a database table that represents a specific
property or characteristic of the entity represented by the table.
💡 In simple words, an attribute = a field = a column in the table.
📊 Example
For a Student table:
RollNo Name Department Age
101 Ali CS 20
RollNo, Name, Department, and Age are attributes.
📂 2. Schemas
📘 Definition
A schema defines the structure of a database, including table names,
attributes, and their data types.
Think of schema as the blueprint of a database.
🔸 Types of Schemas
Schema Type      Description
Database Schema  Structure of the entire database
Table Schema     Structure of one table (relation)
View Schema      Structure of a view (virtual table)
Example: Table Schema
CREATE TABLE Student (
    RollNo     INT PRIMARY KEY,
    Name       VARCHAR(50),
    Department VARCHAR(10),
    Age        INT
);
Here, the CREATE TABLE statement gives the schema of the Student table. It defines:
Table name
Attributes
Data types
Constraints
3. Tuples
📘 Definition
A tuple is a row in a table. It represents a single record or instance of
an entity.
One row = one tuple = one record
📊 Example
In the Student table:
RollNo Name Department Age
101 Ali CS 20
102 Sana IT 19
Each row is a tuple:
Tuple 1 = (101, Ali, CS, 20)
🔹 Additional Terms
The degree of a relation = number of attributes
The cardinality of a relation = number of tuples
Example:
If a table has 4 columns and 10 rows → degree = 4, cardinality = 10
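💡 As a quick check, degree and cardinality can be read off a query result. This sketch uses Python's sqlite3 module and the two Student rows shown above; the in-memory database is only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (RollNo INTEGER, Name TEXT, Department TEXT, Age INTEGER)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                 [(101, "Ali", "CS", 20), (102, "Sana", "IT", 19)])

cursor = conn.execute("SELECT * FROM Student")
rows = cursor.fetchall()

degree = len(cursor.description)  # number of attributes (columns)
cardinality = len(rows)           # number of tuples (rows)

print(degree, cardinality)
```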
🎯 4. Domains
📘 Definition
A domain is the set of valid values that an attribute can take.
Every attribute in a table is defined over a domain.
✅ Example Domains
Attribute  Domain
Age        Integers from 10 to 100
Gender     {‘Male’, ‘Female’, ‘Other’}
Email      Any valid email format
RollNo     Positive integers only
Summary Table
Concept    Description                           Example
Attribute  Column in a table                     Name, Age
Tuple      Row in a table (record)               (101, Ali, CS, 20)
Schema     Table definition with structure       CREATE TABLE Student (...)
Domain     Set of valid values for an attribute  Age: integers 10–100
🔁 1. Relation Instances
📘 Definition
A relation instance refers to the actual set of rows (tuples) present in a
relation (table) at a particular point in time.
It is the current snapshot of the data in the table.
changes
🔹 2. Keys
📘 Definition
Keys are attributes (or sets of attributes) that are used to identify tuples
(rows) uniquely in a relation.
Keys enforce uniqueness, integrity, and help in data retrieval and
relationships.
🔹 1. Super Key
A set of one or more attributes that can uniquely identify a tuple
Example:
In Student(RollNo, Name, Email),
{RollNo}, {RollNo, Name}, {RollNo, Email} — all are
super keys
🔹 2. Candidate Key
A minimal super key — no extra attribute
Example:
In Student(RollNo, CNIC, Name):
{RollNo} and {CNIC} could both be candidate keys
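💡 The difference between a super key and a candidate key can be sketched in a few lines of Python. The rows below are made-up sample data for Student(RollNo, CNIC, Name). Note the assumption: a key is a property of the schema, so checking one instance can only show that an attribute set is not a key; it cannot prove that it always will be one.

```python
from itertools import combinations

# Illustrative instance of Student(RollNo, CNIC, Name); the values are made up.
rows = [
    {"RollNo": 101, "CNIC": "A-1", "Name": "Ali"},
    {"RollNo": 102, "CNIC": "B-2", "Name": "Sana"},
    {"RollNo": 103, "CNIC": "C-3", "Name": "Ali"},
]

def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal attribute sets that are unique in this instance."""
    attributes = sorted(rows[0])
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip any set that contains an already-found (smaller) key.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_super_key(attrs, rows):
                found.append(attrs)
    return found

print(is_super_key(("Name",), rows))           # Name repeats, so not a super key
print(is_super_key(("RollNo", "Name"), rows))  # super key, but not minimal
print(candidate_keys(rows))
```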
🔹 3. Primary Key
One chosen candidate key to uniquely identify each row
Example:
If both RollNo and CNIC are candidate keys, we choose RollNo as
the primary key
🔹 4. Foreign Key
An attribute that refers to the primary key of another table
Example:
Student(RollNo, Name, DeptID)
Department(DeptID, DeptName)
🎯 Important Rules
A primary key must be:
o Unique
o Not NULL
A foreign key must:
Table 2: Student
RollNo Name DeptID
101 Ali 1
102 Sana 2
Primary Key: RollNo
Foreign Key: DeptID references Department.DeptID
Summary Table
Term               Meaning
Relation Instance  The actual table data (rows/tuples)
Super Key          Any set of attributes that uniquely identify a tuple
Candidate Key      Minimal super key
Primary Key        Main unique identifier for the table
Foreign Key        Links one table to another
✅ 1. Integrity Constraints in DBMS
📘 Definition
Integrity Constraints are rules that ensure the correctness, validity,
and consistency of data in a relational database.
Constraints prevent invalid data from being inserted or updated.
🔹 A. Domain Constraint
Ensures that attribute values fall within a predefined domain (set
of valid values).
📌 Example:
Age INT CHECK (Age >= 18 AND Age <= 60)
Only values between 18 and 60 are allowed in the Age column.
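💡 A small sketch of the DBMS rejecting a value outside the domain, using Python's sqlite3 module. The Employee table and its rows are hypothetical; the CHECK clause is the one from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, Age INTEGER CHECK (Age >= 18 AND Age <= 60))")

conn.execute("INSERT INTO Employee VALUES ('Ali', 25)")       # inside the domain

try:
    conn.execute("INSERT INTO Employee VALUES ('Sana', 15)")  # outside the domain
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(rejected, row_count)
```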
🔹 B. Entity Integrity Constraint
Ensures that the primary key value of every row:
o Is unique
o Cannot be NULL
📌 Example:
CREATE TABLE Student (
RollNo INT PRIMARY KEY,
Name VARCHAR(50)
);
You cannot insert NULL or duplicate RollNo values.
🔹 C. Referential Integrity Constraint
Ensures that a foreign key value must either:
o Match an existing primary key value in the referenced table
o Or be NULL
📌 Example:
CREATE TABLE Department (
DeptID INT PRIMARY KEY
);
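💡 A minimal sketch of referential integrity in action, using Python's sqlite3 module. The Student table with a DeptID foreign key is an assumption added to complete the picture. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE Department (DeptID INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Student (
        RollNo INTEGER PRIMARY KEY,
        DeptID INTEGER REFERENCES Department(DeptID)
    )
""")
conn.execute("INSERT INTO Department VALUES (1)")
conn.execute("INSERT INTO Student VALUES (101, 1)")     # matches Department.DeptID
conn.execute("INSERT INTO Student VALUES (102, NULL)")  # NULL is allowed

try:
    conn.execute("INSERT INTO Student VALUES (103, 99)")  # no such department
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(rejected, student_count)
```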
🔹 D. Unique Constraint
Ensures that the column values are all different (like Primary Key,
🔹 E. NOT NULL Constraint
Ensures that a column cannot contain NULL values.
📌 Example:
Name VARCHAR(50) NOT NULL
➕ 2. Relational Algebra in DBMS
📘 Definition
Relational Algebra is a procedural query language used to retrieve
and manipulate data from a relational database using mathematical
operations.
It forms the theoretical foundation of SQL.
🔹 A. Selection (σ)
Retrieves rows that satisfy a condition
📌 Notation:
σ<condition>(Relation)
📌 Example:
σDept = 'CS'(Student)
→ selects students in CS department
🔹 B. Projection (π)
Retrieves specific columns
📌 Notation:
π<columns>(Relation)
📌 Example:
πName,Dept(Student)
→ displays only Name and Department
🔹 C. Union ( ∪ )
Combines two relations with same attributes
Removes duplicates
📌 Example:
Student ∪ Alumni
🔹 D. Set Difference ( − )
Returns tuples in one relation but not in the other
📌 Example:
Student − Alumni
🔹 E. Cartesian Product ( × )
Combines every tuple of one relation with every tuple of another
📌 Example:
Student × Department
(Not commonly used alone — used in joins)
🔹 F. Rename (ρ)
Renames a relation or attributes for clarity
📌 Example:
ρ(S ← Student)
🔹 G. Natural Join (⨝)
Joins two relations on their common attributes (same-named
columns)
📌 Example:
Student ⨝ Department
🔹 H. Theta Join (⨝ condition)
Joins based on a given condition (can use =, <, >, etc.)
📌 Example:
Student ⨝ (Student.DeptID = Department.DeptID) Department
🎓 Example Dataset
Student Table:
RollNo Name DeptID
1 Ali 101
2 Sana 102
Department Table:
DeptID DeptName
101 CS
102 IT
🔍 Sample Queries
1. Get names of students in CS department:
πName(σDeptName = 'CS'(Student ⨝ Department))
2. List all department names:
πDeptName(Department)
3. Students not in IT department:
σDeptName ≠ 'IT'(Student ⨝ Department)
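💡 The three sample queries can be traced with a small Python sketch that treats each relation as a list of dicts. The functions select, project, and natural_join are hypothetical helpers written for this illustration, not part of any library. A projection on Name is applied in the first query so that only names are returned.

```python
# Example dataset from above, as lists of dicts (one dict per tuple).
student = [
    {"RollNo": 1, "Name": "Ali", "DeptID": 101},
    {"RollNo": 2, "Name": "Sana", "DeptID": 102},
]
department = [
    {"DeptID": 101, "DeptName": "CS"},
    {"DeptID": 102, "DeptName": "IT"},
]

def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """π: keep only the named attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        reduced = {a: t[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result

def natural_join(r, s):
    """⨝: pair tuples that agree on every attribute name they share."""
    joined = []
    for t in r:
        for u in s:
            common = set(t) & set(u)
            if all(t[a] == u[a] for a in common):
                joined.append({**t, **u})
    return joined

joined = natural_join(student, department)
cs_names = project(select(joined, lambda t: t["DeptName"] == "CS"), ["Name"])  # query 1
dept_names = project(department, ["DeptName"])                                 # query 2
not_it = select(joined, lambda t: t["DeptName"] != "IT")                       # query 3

print(cs_names)
print(dept_names)
print(not_it)
```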
✅ Summary Table
Operator    Symbol  Description
Selection   σ       Filters rows
Projection  π       Filters columns
Union       ∪       Combines relations (no duplicates)
Difference  −       Rows in A but not in B
Product     ×       Combines all tuples
Rename      ρ       Renames attributes/tables
Join        ⨝       Combines related rows
🔍 1. Selection (σ)
📘 Definition:
Selection is a relational algebra operation that selects rows (tuples) from
a relation based on a condition.
Think of it as applying a WHERE clause in SQL.
Notation:
σ<condition>(Relation)
📊 Example:
Consider the table Student:
RollNo Name Dept Age
1 Ali CS 19
2 Sana IT 21
3 Amir CS 20
Query:
σDept = 'CS'(Student)
Result:
RollNo Name Dept Age
1 Ali CS 19
3 Amir CS 20
📋 2. Projection (π)
📘 Definition:
Projection retrieves specific columns (attributes) from a relation.
Duplicate rows are removed from the result.
Equivalent to SELECT DISTINCT column1, column2 in SQL (a plain SELECT
keeps duplicates).
Notation:
π<attribute1, attribute2, ...>(Relation)
📊 Example:
πName, Dept(Student)
Result:
Name Dept
Ali CS
Sana IT
Amir CS
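💡 Projecting a single column makes the duplicate-removal rule visible. The sketch below runs the Student table from the selection example through Python's sqlite3 module and compares a plain SELECT with SELECT DISTINCT; only the latter matches the set semantics of π.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (RollNo INTEGER, Name TEXT, Dept TEXT, Age INTEGER)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)",
                 [(1, "Ali", "CS", 19), (2, "Sana", "IT", 21), (3, "Amir", "CS", 20)])

# Plain SELECT keeps duplicate values of Dept.
with_duplicates = conn.execute("SELECT Dept FROM Student ORDER BY RollNo").fetchall()

# SELECT DISTINCT matches the set semantics of projection.
projected = conn.execute("SELECT DISTINCT Dept FROM Student ORDER BY Dept").fetchall()

print(with_duplicates)
print(projected)
```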
🔹 3. Cartesian Product (×)
📘 Definition:
Cartesian Product returns a combination of every tuple in one relation
with every tuple in another.
Also called the cross join in SQL.
Notation:
Relation1 × Relation2
⚠️ Result Size:
If R has m rows and S has n rows, then R × S has m × n rows.
📊 Example:
Student
RollNo  Name
1       Ali
2       Sana
Department
DeptID  DeptName
101     CS
102     IT
Query:
Student × Department
Result:
RollNo Name DeptID DeptName
1 Ali 101 CS
1 Ali 102 IT
2 Sana 101 CS
2 Sana 102 IT
Used in joins, often followed by a selection to match keys.
A. Theta Join
Joins two relations using a condition (can use =, <, >, etc.)
📘 Notation:
R ⨝<condition> S
📊 Example:
Student ⨝ (Student.DeptID = Department.DeptID) Department
B. Equi Join
A special case of Theta Join where the condition uses only equality ( = )
📘 Notation:
R ⨝<R.a = S.b> S
📊 Example:
Student(RollNo, Name, DeptID)
Department(DeptID, DeptName)
Student ⨝ Department
Result:
RollNo Name DeptID DeptName
1 Ali 101 CS
If no match → NULL
Summary Table
Operation          Symbol     Description
Selection          σ          Filters rows based on condition
Projection         π          Selects specific columns
Cartesian Product  ×          Combines all tuples from two tables
Theta Join         ⨝          Join using condition
Equi Join          ⨝ (a = b)  Theta join using = only
Natural Join       ⨝          Join on same-named attributes
✅ 1. Normalization in DBMS
📘 Definition:
Normalization is the process of organizing data in a database to:
Eliminate redundancy
🎯 Goal of Normalization:
Minimize duplicate data
🔹 First Normal Form (1NF)
✅ Rule:
Every attribute must contain atomic (indivisible) values
🔹 Second Normal Form (2NF)
✅ Rules:
Must be in 1NF
No partial dependency (no non-prime attribute depends on only part
of a composite key)
📊 Example:
Relation:
Enrollment(RollNo, CourseCode, StudentName, CourseName)
Key: (RollNo, CourseCode)
FDs:
RollNo → StudentName
CourseCode → CourseName
🚫 Partial dependencies:
StudentName depends on only part of the key (RollNo)
CourseName depends on only part of the key (CourseCode)
✅ Convert to 2NF:
1. Student(RollNo, StudentName)
2. Course(CourseCode, CourseName)
3. Enrollment(RollNo, CourseCode)
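💡 Partial dependencies can be found mechanically with an attribute closure. The sketch below is one possible way to do it in Python; the closure function and the (left side, right side) encoding of the FDs are assumptions made for this illustration, and the composite key (RollNo, CourseCode) is taken as given.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Functional dependencies of Enrollment, as (left side, right side) pairs.
fds = [
    (("RollNo",), ("StudentName",)),
    (("CourseCode",), ("CourseName",)),
]
key = ("RollNo", "CourseCode")
non_prime = {"StudentName", "CourseName"}

# A partial dependency exists when one part of the composite key
# already determines a non-prime attribute on its own.
partial = {}
for part in key:
    determined = (closure((part,), fds) - {part}) & non_prime
    if determined:
        partial[part] = determined

print(closure(key, fds))  # the full key determines every attribute
print(partial)            # each part of the key determines one non-prime attribute
```

Each entry in partial corresponds to one of the tables produced by the 2NF decomposition above.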
🔹 Third Normal Form (3NF)
✅ Rules:
Must be in 2NF
No transitive dependency (no non-prime attribute depends on another
non-prime attribute)
📊 Example:
Relation:
Student(RollNo, Name, DeptID, DeptName)
FDs:
RollNo → Name, DeptID
DeptID → DeptName
🚫 Transitive dependency:
DeptName depends on DeptID, which depends on RollNo
✅ Convert to 3NF:
1. Student(RollNo, Name, DeptID)
2. Department(DeptID, DeptName)
🔹 Boyce-Codd Normal Form (BCNF)
✅ Rules:
Must be in 3NF
For every functional dependency X → Y, X must be a super key
📊 Example (Violation):
Relation:
R(A, B, C)
FDs:
A → B
B → A
C → A
Quick Recap
Normalization improves efficiency and data integrity
better schemas
📐 1. Entity-Relationship (ER) Model
✅ Definition:
The Entity-Relationship (ER) Model is a high-level data model used
in database design to visually represent the entities, relationships, and
attributes in a system.
💡 It provides a blueprint of the data structure using diagrams (ER
diagrams).
🎯 Purpose:
To model real-world objects and relationships in a database.
2. Entity Sets
✅ Definition:
An Entity Set is a collection of similar entities that share the same
attributes.
💡 Think of it as a table, where each row is an entity, and columns are
attributes.
📘 Types of Entities:
Type           Description                                            Example
Strong Entity  Exists independently                                   Student, Employee
Weak Entity    Depends on another entity (no primary key of its own)  Dependent, OrderItem
🔐 Key Attribute:
Every strong entity set has a key attribute that uniquely identifies each
entity.
Example: RollNo in Student entity set
3. Attributes in ER Model
✅ Definition:
Attributes are the properties or characteristics of an entity or
relationship.
💡 Attributes become columns when the ER model is converted into a
relational model.
🔍 Types of Attributes:
🔸 B. Composite Attribute
Can be divided into smaller sub-parts.
🔸 C. Derived Attribute
Value is calculated from other attributes.
🔸 D. Multivalued Attribute
Can have more than one value for a single entity.
🔸 E. Key Attribute
Uniquely identifies an entity.
Underlined in ER diagram.
🔗 1. Relationships in ER Model
✅ Definition:
A relationship in an ER model represents an association among two or
more entity sets.
💡 It answers: "How are entities connected?"
📊 Example:
A Student enrolls in a Course.
Entities: Student, Course
Relationship: Enrolls
🔹 Degree of a Relationship
The number of entity sets that participate in a relationship.
Degree   Type            Example
Binary   Two entities    Student - Enrolls - Course
Ternary  Three entities  Doctor - Treats - Patient - Disease
📘 Example Table:
Student(StuID, Name)
Course(CourseID, Title)
Enrolls(StuID, CourseID, Grade)
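💡 A sketch of how these three relations might be declared and queried, using Python's sqlite3 module. The column types, the composite primary key on Enrolls, and the sample rows are assumptions; the table and column names come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Student (StuID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Course  (CourseID TEXT PRIMARY KEY, Title TEXT);
    -- The relationship becomes its own table; Grade is the relationship attribute.
    CREATE TABLE Enrolls (
        StuID    INTEGER REFERENCES Student(StuID),
        CourseID TEXT    REFERENCES Course(CourseID),
        Grade    TEXT,
        PRIMARY KEY (StuID, CourseID)
    );
    INSERT INTO Student VALUES (1, 'Ali'), (2, 'Sana');
    INSERT INTO Course  VALUES ('CS101', 'Databases');
    INSERT INTO Enrolls VALUES (1, 'CS101', 'A'), (2, 'CS101', 'B');
""")

rows = conn.execute("""
    SELECT s.Name, c.Title, e.Grade
    FROM Enrolls e
    JOIN Student s ON s.StuID = e.StuID
    JOIN Course  c ON c.CourseID = e.CourseID
    ORDER BY s.StuID
""").fetchall()

print(rows)
```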
Participation Constraints
Constraint             Description
Total Participation    Every entity in the set must participate in the relationship (shown with double line)
Partial Participation  Some entities may not participate (shown with single line)
🔐 Key Constraints
Ensures that a relationship is properly defined with primary/foreign keys.
🔹 ER Diagrams
✅ Definition:
An ER Diagram is a graphical representation of entities, attributes,
and relationships.
It helps visualize how data is connected in the system.
🎨 ER Diagram Symbols
Component         Symbol            Description
Entity            Rectangle         Represents an entity set
Weak Entity       Double Rectangle  Depends on another entity
Attribute         Oval              Property of entity or relationship
Key Attribute     Underlined Oval   Uniquely identifies entity
Relationship      Diamond           Association between entities
Multivalued Attr  Double Oval       Attribute with multiple values
Derived Attr      Dashed Oval       Computed attribute (e.g., Age)
Example: ER Diagram (Text Representation)
+----------+                +----------+
| Student  |                | Course   |
+----------+                +----------+
| StuID    |                | CourseID |
| Name     |                | Title    |
+----------+                +----------+
      \                          /
       \                        /
        \     +---------+      /
         ---->| Enrolls |<-----
              +---------+
              | Grade   |
              +---------+
Enrolls is a relationship with an attribute Grade.
✅ Summary Table
Concept        Description
Relationship   Link between two or more entities
Cardinality    1:1, 1:N, M:N (number of entity associations)
Participation  Total (double line), Partial (single line)
ER Diagram     Visual tool to design database structure
Symbols        Rectangle (Entity), Diamond (Relationship), Oval (Attribute)
1. Structured Query Language (SQL)
✅ Definition:
SQL (Structured Query Language) is the standard language used to
define, query, and manipulate data in relational databases.
It allows creation, modification, retrieval, and control of data.
🔧 SQL Categories:
Type  Purpose                       Examples
DDL   Data Definition Language      CREATE, ALTER, DROP
DML   Data Manipulation Language    SELECT, INSERT, UPDATE, DELETE
DCL   Data Control Language         GRANT, REVOKE
TCL   Transaction Control Language  COMMIT, ROLLBACK, SAVEPOINT
📘 Examples:
Create Table:
CREATE TABLE Student (
    RollNo INT PRIMARY KEY,
    Name   VARCHAR(50),
    Age    INT
);
Insert Data:
INSERT INTO Student VALUES (1, 'Ali', 20);
Select Data:
SELECT * FROM Student;
🔗 2. Joins in SQL
✅ Definition:
JOIN is used to combine rows from two or more tables based on a
related column.
🔀 Types of Joins:
Join Type        Description
INNER JOIN       Returns only rows that have a match in both tables
LEFT JOIN        All rows from left table + matching rows from right
RIGHT JOIN       All rows from right table + matching rows from left
FULL OUTER JOIN  All rows from both tables, unmatched = NULL
CROSS JOIN       Cartesian product (every combination)
SELF JOIN        Joins a table with itself
📊 Example Tables:
Student:
RollNo Name DeptID
1 Ali 10
2 Sana 11
Department:
DeptID DeptName
10 CS
11 IT
📘 INNER JOIN:
SELECT Name, DeptName
FROM Student
INNER JOIN Department ON Student.DeptID = Department.DeptID;
Result:
Name DeptName
Ali CS
Sana IT
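💡 To see how INNER JOIN and LEFT JOIN differ, the sketch below adds one student with no department to the example tables. That extra row (Amir, DeptID NULL) is an assumption for illustration; the rest follows the tables above. It uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (DeptID INTEGER PRIMARY KEY, DeptName TEXT);
    CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Name TEXT, DeptID INTEGER);
    INSERT INTO Department VALUES (10, 'CS'), (11, 'IT');
    -- Amir (DeptID NULL) is an extra, assumed row with no matching department.
    INSERT INTO Student VALUES (1, 'Ali', 10), (2, 'Sana', 11), (3, 'Amir', NULL);
""")

inner = conn.execute("""
    SELECT Name, DeptName FROM Student
    INNER JOIN Department ON Student.DeptID = Department.DeptID
    ORDER BY RollNo
""").fetchall()

left = conn.execute("""
    SELECT Name, DeptName FROM Student
    LEFT JOIN Department ON Student.DeptID = Department.DeptID
    ORDER BY RollNo
""").fetchall()

print(inner)  # Amir is dropped: no matching department
print(left)   # Amir is kept, with DeptName = NULL (None in Python)
```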
3. Subqueries in SQL
✅ Definition:
A subquery is a query inside another query, often used for filtering,
comparison, or as derived data.
🔄 Types of Subqueries:
Type        Description
Scalar      Returns a single value
Row         Returns a row of values
Table       Returns multiple rows & columns
Correlated  Refers to outer query’s column
Common usage areas: nested in the SELECT, FROM, or WHERE clause
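💡 Two of these types in practice: a scalar subquery and a correlated subquery, run through Python's sqlite3 module. The Student rows (including the Age column and the third student) are assumed sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Name TEXT, DeptID INTEGER, Age INTEGER);
    INSERT INTO Student VALUES (1, 'Ali', 10, 19), (2, 'Sana', 11, 21), (3, 'Amir', 10, 20);
""")

# Scalar subquery: the inner query returns one value (the average age).
older_than_average = conn.execute("""
    SELECT Name FROM Student
    WHERE Age > (SELECT AVG(Age) FROM Student)
""").fetchall()

# Correlated subquery: the inner query refers to the outer row (s.DeptID).
oldest_per_dept = conn.execute("""
    SELECT Name FROM Student s
    WHERE Age = (SELECT MAX(Age) FROM Student WHERE DeptID = s.DeptID)
    ORDER BY RollNo
""").fetchall()

print(older_than_average)
print(oldest_per_dept)
```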
🔹 4. Grouping and Aggregate Functions
✅ Definition:
Grouping allows you to organize data into groups.
Aggregate functions perform calculations on groups of rows.
Aggregate Functions:
Function  Purpose
SUM()     Total value
AVG()     Average
MIN()     Minimum
MAX()     Maximum
COUNT()   Row count
📘 GROUP BY Example:
SELECT DeptID, COUNT(*) AS TotalStudents
FROM Student
GROUP BY DeptID;
Result:
DeptID TotalStudents
10 1
11 1
📘 HAVING Clause:
Used to filter groups (like WHERE but for aggregated data):
SELECT DeptID, COUNT(*) AS Total
FROM Student
GROUP BY DeptID
HAVING COUNT(*) > 1;
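💡 With the two-student example table, every department has exactly one student, so the HAVING query returns no rows. The sketch below shows that, then adds one assumed extra student to department 10 and runs the same query again. It uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Name TEXT, DeptID INTEGER);
    INSERT INTO Student VALUES (1, 'Ali', 10), (2, 'Sana', 11);
""")

having_sql = """
    SELECT DeptID, COUNT(*) AS Total
    FROM Student
    GROUP BY DeptID
    HAVING COUNT(*) > 1
"""
before = conn.execute(having_sql).fetchall()

# Add a second student to department 10 (an assumed extra row).
conn.execute("INSERT INTO Student VALUES (3, 'Amir', 10)")
after = conn.execute(having_sql).fetchall()

print(before)  # no group has more than one row yet
print(after)   # department 10 now passes the filter
```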
🔒 5. Concurrency Control in DBMS
✅ Definition:
Concurrency control ensures correct execution of transactions when
multiple users access the database simultaneously.
It prevents data inconsistency and lost updates, and detects or avoids deadlocks.
Durability)
2. Timestamp Ordering:
o Assigns timestamps to transactions
🔄 Types of Failures:
Type                 Description
System crash         Power failure, OS crash
Transaction failure  Incomplete or aborted transaction
Media failure        Disk or hardware damage
Application error    Logical bugs or wrong queries
📦 Types of Backups:
Type             Description
Full Backup      Copies the entire database
Incremental      Backs up only data changed since last backup
Differential     Backs up all data changed since the last full backup
Logical Backup   Backs up data as SQL statements (e.g., mysqldump)
Physical Backup  Backs up data files (binary copy of data blocks)
Backup Example in SQL Server:
BACKUP DATABASE University
TO DISK = 'C:\Backups\University.bak';
🔁 Recovery Techniques:
1. Rollback: Undo a transaction (using logs)
2. Rollforward: Apply logs to bring DB to latest state
3. Checkpoints: Save consistent state periodically
4. Shadow Paging: Keeps a copy of unchanged pages
5. Log-Based Recovery:
o Maintain transaction logs
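💡 Rollback can be observed directly with Python's sqlite3 module. In this sketch a transfer between two accounts fails halfway, and ROLLBACK undoes the part that had already run. The Account table, balances, and CHECK constraint are hypothetical. SQLite's internal journal plays the role of the transaction log here.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that
# BEGIN / COMMIT / ROLLBACK can be issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Account (ID INTEGER PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))")
conn.execute("INSERT INTO Account VALUES (1, 100), (2, 50)")

try:
    conn.execute("BEGIN")
    conn.execute("UPDATE Account SET Balance = Balance + 500 WHERE ID = 2")
    # This debit violates the CHECK constraint, so the transfer must not survive.
    conn.execute("UPDATE Account SET Balance = Balance - 500 WHERE ID = 1")
    conn.execute("COMMIT")
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK")  # undo the credit that already ran

balances = conn.execute("SELECT ID, Balance FROM Account ORDER BY ID").fetchall()
print(balances)  # both accounts are back to their starting balances
```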
🔍 2. Indexes in DBMS
✅ Definition:
An index is a data structure that speeds up data retrieval from a table
without scanning every row.
Similar to an index in a book — quick lookup of contents.
⚠️ Index Drawbacks:
Takes extra storage space
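💡 A sketch of an index changing how a query is executed, using Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN. The table contents and the index name idx_student_dept are assumptions. The exact wording of the plan differs between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (RollNo INTEGER PRIMARY KEY, Name TEXT, DeptID INTEGER)")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [(i, f"Student{i}", i % 5) for i in range(1, 1001)])

query = "SELECT Name FROM Student WHERE DeptID = 3"

def plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_without_index = plan(query)  # expected to scan the whole table
conn.execute("CREATE INDEX idx_student_dept ON Student (DeptID)")
plan_with_index = plan(query)     # expected to search using the new index

print(plan_without_index)
print(plan_with_index)
```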
🌐 3. NoSQL Systems
✅ Definition:
NoSQL stands for "Not Only SQL". It refers to non-relational
databases designed for:
High scalability
Flexible schema
Big data and real-time apps