Tutorial on Database - Chapter 1-4
Database Systems
By: Aklilu Thomas (MSc.)
April 10, 2025
Chapter One
Introduction to Database Systems
• Data is known facts that can be recorded and that have implicit
meaning.
• Database is a collection of related data; can vary in size and
complexity (personal vs. enterprise).
• For example, consider the names, telephone numbers, and addresses
of the people you know.
• DBMS (Database Management System) is software for creating and
managing databases (e.g., MySQL, MS Access).
1.2. Database System vs. File System
• Traditional file systems suffer from problems such as:
• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation and integrity problems
1.3. Characteristics of the Database Approach
• Self-describing nature: Includes metadata about the database structure.
• Data abstraction: Hides irrelevant details from users.
• Support for multiple views: Different views for different users.
• Sharing and multiuser transaction processing: Allows multiple users to
access data simultaneously.
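The self-describing nature can be observed directly in a small DBMS. The sketch below is only an illustration: it assumes SQLite (through Python's built-in sqlite3 module), and the table name contact and its columns are made up for the example. It shows that the DBMS stores a description of the table (metadata) alongside the data.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, phone TEXT, address TEXT)")

# SQLite keeps its catalog (metadata) in the sqlite_master table.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(contact)")]

print(catalog)   # table name and the statement that defined it
print(columns)   # column names read back from the catalog
```

Other DBMSs expose their catalog differently, so the catalog table name used here applies to SQLite only.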
[Figure: Database system environment]
1.4. Actors on the Scene
• The people whose jobs involve the day-to-day use of a large database
are called actors on the scene.
• Actors: Users who interact with the database.
• DBA (Database Administrator): Manages database access,
coordinates usage, and acquires resources.
1.5. Historical Development of Database Technology
• Early Database Applications (The Hierarchical and Network
Models)
• Relational Model based Systems (RDBMS)
• Object-oriented and emerging applications (OODBMS)
• Data on the Web and E-commerce Applications (Web
contains data in HTML)
Chapter Two
Database System Architecture
2.1 Data Models
• A data model is a collection of concepts that can be used to describe
the structure of a database and provides the necessary means to
achieve data abstraction.
• Types of Data Models:
• High-level (Conceptual) models: Concepts close to user perception (e.g., ER
model).
• Low-level (Physical) models: Details on data storage on physical media (easily
understood by computer specialists).
• Representational (implementation) models: User-friendly, not too far from
physical storage (easily understood by end users).
2.2 Schemas and Instances
• Schema is the structure of the database, defined during design. It
includes descriptions of the database structure, data types, and the
constraints on the database.
The database schema changes very infrequently, if at all.
• Instance is the current state of the database at a given time. It is the
actual data stored in a database at a particular moment in time. This
includes the collection of all the data in the database.
Also called database state (or occurrence or snapshot).
The database state changes every time the database is updated.
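The difference between schema and instance can be sketched in a few lines. This is a minimal illustration assuming SQLite via Python's sqlite3 module; the student table and the sample row are hypothetical. An insert changes the database state, while the schema stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE student (ssn TEXT PRIMARY KEY, name TEXT)")

def read_schema(connection):
    """Return the stored definition of the student table (the schema)."""
    return connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'student'"
    ).fetchone()[0]

def read_state(connection):
    """Return all rows currently in the student table (the instance)."""
    return connection.execute(
        "SELECT ssn, name FROM student ORDER BY ssn"
    ).fetchall()

schema_before, state_before = read_schema(conn), read_state(conn)
conn.execute("INSERT INTO student VALUES ('111', 'Abebe')")  # update the database
schema_after, state_after = read_schema(conn), read_state(conn)

print(schema_before == schema_after)  # schema is unchanged by the insert
print(state_before, state_after)      # state differs before and after
```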
[Figure: Database Schema (design) and Database Instance (state)]
2.3 Three-schema Architecture
• The goal of the three-schema architecture is to separate the user
applications from the physical database. In this architecture, there are
three levels:
1) Internal Level: Physical storage structure.
2) Conceptual Level: Structure of the whole database.
3) External Level: User views of the database.
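One rough way to picture the three levels in a relational DBMS is to treat a table definition as the conceptual level, an index as part of the internal level, and a view as an external level. The sketch below assumes SQLite via Python's sqlite3 module; the employee table, the index name, and the view name are illustrative assumptions, and the mapping to the three levels is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the structure of the whole database (one table here).
conn.execute(
    "CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, salary REAL)"
)

# Internal level: a physical access path chosen for performance.
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")

# External level: a user view that hides the salary column.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT ssn, name FROM employee"
)

conn.execute("INSERT INTO employee VALUES ('123', 'John Smith', 30000.0)")

# A user of the view sees only the columns the view exposes.
directory = conn.execute("SELECT * FROM employee_directory").fetchall()
print(directory)
```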
2.4 Data Independence
• Data Independence: The ability to change the schema at one level of the
database system without having to change the schema at the next higher
level. There are two types: logical and physical.
• Logical Data Independence:
• The capacity to change the conceptual schema without having to change the
external schemas.
• Physical Data Independence:
• The capacity to change the internal schema without having to change the
conceptual schema.
• For example, the internal schema may be changed when certain file
structures are reorganized or new indexes are created to improve database
performance.
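Both kinds of data independence can be sketched with a view and an index. The example assumes SQLite via Python's sqlite3 module; the table, view, and index names are hypothetical. The same query against the view is run three times: before any change, after an internal change (a new index), and after a conceptual change (a new column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE VIEW employee_names AS SELECT name FROM employee")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("123", "John Smith"), ("456", "Alicia Zelaya")],
)

query = "SELECT name FROM employee_names ORDER BY name"
before = conn.execute(query).fetchall()

# Physical data independence: add an index (internal schema change).
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")
after_index = conn.execute(query).fetchall()

# Logical data independence: add a column (conceptual schema change).
conn.execute("ALTER TABLE employee ADD COLUMN address TEXT")
after_alter = conn.execute(query).fetchall()

# The external view and the query against it did not have to change.
print(before == after_index == after_alter)
```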
2.5 DBMS Languages
• Data Definition Language (DDL): Used by the DBA and database
designers to specify the conceptual schema of a database (SQL
Commands such as: Create, Alter, Drop, Rename).
• Data Manipulation Language (DML): Used to specify database
retrievals and updates (SQL Commands such as: Select, Insert, Delete,
Update)
• High Level or Non-procedural Language: Specifies what data to retrieve
rather than how to retrieve it. For example, the SQL relational language.
• Low Level or Procedural Language: Retrieves data one record at a time.
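The DDL and DML commands listed above can be tried out as follows. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the course table and its rows are made up for illustration. It also contrasts a non-procedural query with a record-at-a-time loop written in the host language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and change the schema.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")

# DML: insert, update, and retrieve data.
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [("DB101", "Database Systems", 5), ("CS100", "Intro to Computing", 3)],
)
conn.execute("UPDATE course SET credits = 6 WHERE code = 'DB101'")

# Non-procedural: one statement says WHAT is wanted.
heavy = conn.execute(
    "SELECT code FROM course WHERE credits >= 5 ORDER BY code"
).fetchall()

# Procedural style: fetch records one at a time and filter in a loop.
heavy_loop = []
for code, credits in conn.execute("SELECT code, credits FROM course ORDER BY code"):
    if credits >= 5:
        heavy_loop.append((code,))

# DML delete, then DDL drop.
conn.execute("DELETE FROM course WHERE code = 'CS100'")
remaining = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
conn.execute("DROP TABLE course")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(heavy, heavy_loop, remaining, tables_left)
```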
Chapter Three
Database Modeling
3.1 Phases of Database Design
• The goal of database design is to identify the information gap and
propose a database solution to the existing problem. The phases are:
a) Requirement Analysis: Identify business needs and data
requirements.
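b) Conceptual Database Design: Describe the data requirements using a
high-level data model (e.g., an ER diagram).
c) Logical Database Design: Map the conceptual design to the
implementation data model of the chosen DBMS (e.g., relational tables).
d) Physical Design: Define storage structures and access paths (data
files, indexes).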
3.2 ERD (Entity Relationship Diagram)
• High-level conceptual data models help to present data similarly to
user perception (e.g., Entity-Relationship model).
• An ER Diagram is a graphical representation of a database schema. It
contains three basic components: entities, relationships, and
attributes.
• Entities are specific objects or things in the mini-world that are represented in the
database. For example, the EMPLOYEE John Smith.
• Attributes are properties used to describe an entity. For example, an EMPLOYEE
entity may have the attributes Name, SSN, Address, Sex, BirthDate.
[Figure: ER diagram for Company database]
… ERD (Attributes)
• Weak entities are entity types that do not have key attributes of their own.
• Strong entities are the regular entity types that have a key attribute.
• Composite attributes can be divided into smaller subparts, which
represent more basic attributes with independent meanings. For
example: Address(House#, Street, City, State, Country)
• Simple or atomic attributes are attributes that are not divisible.
• Single-valued attributes have a single value for a particular entity. For
example, Age of a person.
• Multivalued Attributes have more than one value for an entity. For
example, College_degrees of a person.
… ERD (Key)
• Derived Attributes are derived from other attributes. For example,
the Age attribute is derivable from the Birth_date attribute.
• A key attribute is used to identify each entity uniquely.
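The attribute types above can be sketched as plain Python classes. This is only an analogy, not ER notation: the class names, field names, and sample values are assumptions made for the example. Address plays the role of a composite attribute, college_degrees of a multivalued attribute, ssn of a key attribute, and age is derived from birth_date instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Address:
    """Composite attribute: made of smaller parts with independent meaning."""
    house_no: str
    street: str
    city: str
    state: str
    country: str

@dataclass
class Person:
    ssn: str                      # key attribute: identifies the entity
    name: str                     # simple, single-valued attribute
    birth_date: date              # stored attribute
    address: Address              # composite attribute
    college_degrees: List[str] = field(default_factory=list)  # multivalued

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)

person = Person(
    ssn="123456789",
    name="John Smith",
    birth_date=date(2000, 6, 15),
    address=Address("12", "Main St", "Houston", "TX", "USA"),
    college_degrees=["BSc", "MSc"],
)
print(person.age(date(2025, 4, 10)))
```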
3.3 UML class diagrams
• UML class diagrams represent classes (similar to entity types) as
boxes with three sections:
• Top section includes the entity type (class) name
• Second section includes attributes
• Third section includes class operations (operations are not in the
basic ER model)
• Relationships (called associations) are represented as lines
connecting the classes
3.4 Mapping ER Models to Relational Tables
• Create relations for strong entities.
• Create relations for weak entities, including the primary key of the
owner entity as a foreign key.
• Define mappings for relationships (1:1, 1:N, M:N).
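The mapping steps can be sketched as table definitions. The example assumes SQLite via Python's sqlite3 module, and the entity and column names (department, employee, dependent, project, works_on) are illustrative assumptions rather than a schema given in this tutorial. A 1:N relationship becomes a foreign key on the N side, a weak entity takes its owner's key into its own primary key, and an M:N relationship becomes a separate table.

```python
import sqlite3

schema_sql = """
-- Strong entities: one relation each, with their own primary key.
CREATE TABLE department (
    dnumber INTEGER PRIMARY KEY,
    dname   TEXT NOT NULL
);
CREATE TABLE project (
    pnumber INTEGER PRIMARY KEY,
    pname   TEXT NOT NULL
);
-- 1:N relationship (department has many employees): foreign key on the N side.
CREATE TABLE employee (
    ssn  TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    dno  INTEGER REFERENCES department(dnumber)
);
-- Weak entity: the owner's key is a foreign key and part of the primary key.
CREATE TABLE dependent (
    essn           TEXT NOT NULL REFERENCES employee(ssn),
    dependent_name TEXT NOT NULL,
    PRIMARY KEY (essn, dependent_name)
);
-- M:N relationship: a separate relation holding both foreign keys.
CREATE TABLE works_on (
    essn  TEXT NOT NULL REFERENCES employee(ssn),
    pno   INTEGER NOT NULL REFERENCES project(pnumber),
    hours REAL,
    PRIMARY KEY (essn, pno)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema_sql)

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```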
3.5 Enhanced Entity Relationship (EER) Model
• Extends ER model to handle complex applications.
• Introduces concepts like subclasses and inheritance.
Chapter Four
Relational Database Model
4.1 The Relational Database Model
• The relational model represents data as a collection of relations
(tables).
• Each table consists of rows (tuples) and columns (attributes).
• Attribute: Column in a table.
• Tuple: Row in a table.
• Relation Schema: Name of the relation with its attributes.
• Relational Constraints are conditions that must hold for a valid
relation (domain constraints, key constraints, referential integrity).
[Figure: Example of a relation STUDENT]
… Relation
• Key of a Relation:
• Each row has a value of a data item (or set of items) that uniquely
identifies that row in the table
• Called the key
• In the STUDENT table, SSN is the key
• Example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
• CUSTOMER is the relation name
• Defined over the four attributes: Cust-id, Cust-name, Address, Phone#
… Relation
• Formally,
• Given R(A1, A2, …, An)
• r(R) ⊆ dom(A1) × dom(A2) × … × dom(An)
• R(A1, A2, …, An) is the schema of the relation
• R is the name of the relation
• A1, A2, …, An are the attributes of the relation
• r(R): a specific state (or "value" or "population") of relation R; this is a set of
tuples (rows)
• r(R) = {t1, t2, …, tm} where each ti is an n-tuple
• ti = <v1, v2, …, vn> where each vj ∈ dom(Aj)
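The formal definition can be checked on a tiny example. In the sketch below, the attribute names and domain values are made up for illustration. A relation state is modeled as a Python set of tuples, and it is valid with respect to the domains exactly when it is a subset of their Cartesian product.

```python
from itertools import product

# Hypothetical attributes and domains for a two-attribute relation.
attributes = ("Name", "Year")
dom = {
    "Name": {"Abel", "Sara"},
    "Year": {1, 2},
}

# dom(A1) x dom(A2) x ... x dom(An): every possible n-tuple.
cartesian = set(product(*(dom[a] for a in attributes)))

# A relation state r(R) is a set of tuples drawn from that product.
r = {("Abel", 1), ("Sara", 2)}
is_valid_state = r <= cartesian          # subset test

# A tuple with a value outside its domain is not allowed.
bad = {("Abel", 3)}
is_bad_valid = bad <= cartesian

print(len(cartesian), is_valid_state, is_bad_valid)
```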
4.2 Relational Integrity Constraints
• Constraints are conditions that must hold on all valid relation states.
• There are three main types of constraints in the relational model:
• Key constraints
• Entity integrity constraints
• Referential integrity constraints
• Superkey of R:
• Is a set of attributes SK of R with the following condition:
• No two tuples in any valid relation state r(R) will have the same value for SK
• That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK]
• This condition must hold in any valid state r(R)
… Key constraints
• In general:
• Any key is a superkey (but not vice versa)
• Any set of attributes that includes a key is a superkey
• A minimal superkey is also a key
• If a relation has several candidate keys, one is chosen arbitrarily to be the
primary key.
• The primary key attributes are underlined.
• Example: Consider the CAR relation schema:
• CAR(State, Reg#, SerialNo, Make, Model, Year)
• We chose SerialNo as the primary key
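The superkey and key conditions can be tested against a sample relation state. The helper functions and the CAR rows below are hypothetical and written only for this illustration. Note the limitation: the definition requires the condition to hold in every valid state, so a check on one state can show that a set of attributes is not a superkey, but it cannot prove that it is one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows in this state share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """True if attrs is a superkey here and no proper subset of it is."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

# Made-up sample state of CAR(State, Reg#, SerialNo, Make, Model, Year).
cars = [
    {"State": "TX", "Reg#": "ABC-739", "SerialNo": "A69352", "Make": "Ford", "Model": "Mustang", "Year": 2002},
    {"State": "FL", "Reg#": "TVP-347", "SerialNo": "B43696", "Make": "Oldsmobile", "Model": "Cutlass", "Year": 2005},
    {"State": "NY", "Reg#": "ABC-739", "SerialNo": "X83554", "Make": "Ford", "Model": "Escort", "Year": 2002},
    {"State": "TX", "Reg#": "XYZ-111", "SerialNo": "C43742", "Make": "Toyota", "Model": "Camry", "Year": 2004},
]

print(is_key(cars, ["SerialNo"]))               # a candidate key in this state
print(is_key(cars, ["State", "Reg#"]))          # another candidate key
print(is_superkey(cars, ["SerialNo", "Make"]))  # superkey, but not minimal
print(is_key(cars, ["SerialNo", "Make"]))
```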
… Entity integrity constraints
• Entity Integrity:
• The primary key attributes PK of each relation schema R in S cannot have null
values in any tuple of r(R).
• This is because primary key values are used to identify the individual tuples.
• t[PK] ≠ null for any tuple t in r(R)
• If PK has several attributes, null is not allowed in any of these attributes
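A DBMS can enforce entity integrity when the table is declared. The sketch below assumes SQLite via Python's sqlite3 module; the car table and its values are illustrative. NOT NULL is written explicitly because SQLite, unlike most SQL systems, accepts NULL in a non-integer PRIMARY KEY column unless it is declared NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is stated explicitly; see the note above about SQLite.
conn.execute(
    "CREATE TABLE car (serial_no TEXT NOT NULL PRIMARY KEY, make TEXT)"
)
conn.execute("INSERT INTO car VALUES ('A69352', 'Ford')")

# Entity integrity: a null primary key value must be rejected.
try:
    conn.execute("INSERT INTO car VALUES (NULL, 'Toyota')")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

# Key constraint: a duplicate primary key value must be rejected as well.
try:
    conn.execute("INSERT INTO car VALUES ('A69352', 'Honda')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
print(null_rejected, duplicate_rejected, row_count)
```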
… Referential Integrity
• Referential Integrity is a constraint involving two relations
• The previous constraints involve a single relation.
• Used to specify a relationship among tuples in two relations:
• The referencing relation and the referenced relation.
• Tuples in the referencing relation R1 have attributes FK (called
foreign key attributes) that reference the primary key attributes PK of
the referenced relation R2.
• A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
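Referential integrity can be demonstrated with two small tables. The sketch assumes SQLite via Python's sqlite3 module; the department and employee tables and their rows are illustrative. Here employee is the referencing relation R1 with foreign key dno, and department is the referenced relation R2 with primary key dnumber. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# Referenced relation R2 with primary key dnumber.
conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT)")
# Referencing relation R1 with foreign key dno.
conn.execute(
    "CREATE TABLE employee ("
    " ssn TEXT PRIMARY KEY,"
    " name TEXT,"
    " dno INTEGER REFERENCES department(dnumber))"
)

conn.execute("INSERT INTO department VALUES (5, 'Research')")
# t1[FK] = t2[PK]: this employee references an existing department.
conn.execute("INSERT INTO employee VALUES ('123', 'John Smith', 5)")

# A foreign key value with no matching primary key must be rejected.
try:
    conn.execute("INSERT INTO employee VALUES ('456', 'Jane Doe', 9)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

employees = conn.execute("SELECT ssn, dno FROM employee").fetchall()
print(dangling_rejected, employees)
```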