TM351 P9 10 Summr2021
Caveat
This summary DOES NOT replace the course learning materials.
Exams WILL BE derived from the full set of course learning materials.
What is Data (D), Database (DB), Database Management System (DBMS), Relational Database (RDB), and
SQL?
Database
Data
DB types
Centralized: located, stored, and maintained in a single location.
Distributed: stored across different physical locations.
Cloud: a database that typically runs on a cloud computing platform.
Relational: based on the relational model of data.
NoSQL: non-tabular; stores data differently from relational tables.
OO: information is represented in the form of objects.
Operational: used to update data in real time.
Graph: uses graph structures with nodes, edges, and properties to represent and store data.
Popular DB
PostgreSQL
MongoDB
SQL Server
MySQL
Oracle
MS Access
SQL
DDL (Data Definition Language): Create, Alter, Drop
DML (Data Manipulation Language): Select, Insert, Update, Delete
DCL (Data Control Language): Grant, Revoke, Deny
The main teaching materials
the online book: Database Design and Development: An Essential Guide for IT Professionals by
Paulraj Ponniah (2003).
http://onlinelibrary.wiley.com/book/10.1002/0471728993
PostgreSQL documentation by The PostgreSQL Global Development Group (2015).
https://www.postgresql.org/docs/
As PostgreSQL will be used for the practical SQL activities covering relational databases, you
should also use the PostgreSQL documentation as a guide and reference to using SQL.
Language Types
Declarative Language (What To Do)
Imperative Language (How To Do)
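The contrast can be illustrated with a minimal sketch. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for PostgreSQL; the patient table, its age column and the sample values are invented for illustration.

```python
import sqlite3

# Hypothetical data: (patient_id, age) pairs, invented for illustration.
patients = [("p001", 72), ("p002", 45), ("p003", 68)]

# Imperative: spell out HOW to find the patients over 60, step by step.
over_60_imperative = []
for patient_id, age in patients:
    if age > 60:
        over_60_imperative.append(patient_id)

# Declarative: state WHAT is wanted and let the DBMS decide how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patient_id TEXT PRIMARY KEY, age INTEGER)")
conn.executemany("INSERT INTO patient VALUES (?, ?)", patients)
rows = conn.execute(
    "SELECT patient_id FROM patient WHERE age > 60 ORDER BY patient_id"
).fetchall()
over_60_declarative = [row[0] for row in rows]

print(over_60_imperative, over_60_declarative)
```

Both approaches return the same patients; only the SQL version leaves the choice of access strategy to the DBMS.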
Aggregation
SQL aggregate functions compute a single value from a set of values.
Aggregation functions:
o COUNT, SUM, AVG, MAX, MIN
The following example calculates some summary statistics about the contents of the patient
table:
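A sketch of such a query is given here under stated assumptions: the height_cm and weight_kg columns and all values are hypothetical, and an in-memory SQLite database stands in for PostgreSQL. The aggregate functions themselves are written the same way in both systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical patient table; column names and values are assumptions.
conn.execute(
    "CREATE TABLE patient ("
    "patient_id TEXT PRIMARY KEY, height_cm REAL, weight_kg REAL)"
)
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [("p001", 170.0, 70.0), ("p002", 160.0, 60.0), ("p003", 180.0, 80.0)],
)

# Each aggregate function collapses the whole set of rows into one value.
summary = conn.execute(
    """
    SELECT COUNT(*)       AS number_of_patients,
           MIN(height_cm) AS shortest,
           MAX(height_cm) AS tallest,
           AVG(weight_kg) AS average_weight,
           SUM(weight_kg) AS total_weight
    FROM patient
    """
).fetchone()

print(summary)
```

The query returns a single row however many patients the table holds.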
View advantages:
o can represent a subset of the data contained in a table
o Security. Each user can be given permission to access the database only through a small
set of views that contain the specific data the user is authorized to see, thus restricting
the user's access to stored data.
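The two advantages above can be sketched as follows. The table, the view name patient_contact and the data are hypothetical, and SQLite stands in for PostgreSQL. SQLite has no permission system, so the final step (in PostgreSQL, granting SELECT on the view rather than on the table) is only noted in a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical base table holding a sensitive column (diagnosis).
    CREATE TABLE patient (
        patient_id   TEXT PRIMARY KEY,
        patient_name TEXT,
        diagnosis    TEXT
    );
    INSERT INTO patient VALUES
        ('p001', 'Thornton', 'confidential A'),
        ('p002', 'Tolman',   'confidential B');

    -- The view exposes only a subset of the columns.
    CREATE VIEW patient_contact AS
        SELECT patient_id, patient_name FROM patient;
    """
)
# In PostgreSQL a user could then be given access to the view only, e.g.
# GRANT SELECT ON patient_contact TO some_role;  (some_role is hypothetical)

cur = conn.execute("SELECT * FROM patient_contact ORDER BY patient_id")
view_columns = [col[0] for col in cur.description]
view_rows = cur.fetchall()

print(view_columns, view_rows)
```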
PART 10: Normalization
Normalization
It is a technique of dividing the data into multiple tables to reduce data redundancy and
inconsistency and to achieve data integrity.
It is a technique for organizing the data in a DB.
It is a multi-step process that puts data into tabular form and removes duplicated data from the relational tables.
It improves data integrity.
There are two approaches to creating a set of tables that represent real-world information:
o conceptual data modelling (or conceptual data analysis), not considered in this module,
o normalization (or relational data analysis), considered in this module.
This part is divided into two sections:
o Normalization
o Representing relationships between tables
Un-normalized Form (UNF)
Also known as non-first normal form (NF²).
It lacks the efficiency of normalized relations.
The first step is to represent all the data in a tabular form where each data item is represented
by a column.
We usually exclude derived (computed) data such as totals to minimize data redundancy.
We then select one or more attributes (columns) to act as the primary key.
The sample patient records above, listing the drugs each patient has been prescribed, can be
represented in tabular form as shown in Figure 10.2.
page 64
Moving to First Normal Form (1NF) - ATOMIC (a value that cannot be divided)
You can also follow the normalization process described via Notebook 10.1
o Normalization-drugs prescribed example.
A relation is in First Normal Form (1NF) if each attribute contains only atomic values, that is, it
has no repeating groups of values.
To represent the data in 1NF we:
o Remove any repeating groups of data to separate relations
o Create a separate table for each set of related data
o Choose a primary key for each new relation
In the un-normalized data above (Figure 10.2), there are several values for the date, drug_code,
drug_name, dosage and duration attributes (columns) for each patient.
For example, patient p001 has been prescribed Tramadol, Omeprazole, Simvastatin and
Amitriptyline.
These items form a repeating group and are moved to a separate relation (Figure 10.5) using
the relational algebra project operation.
The new relation has a primary key comprising the patient_id, date and drug_code attributes, because:
o a patient may be prescribed several drugs on the same day
o or may be prescribed the same drug on different days.
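The move to 1NF described above can be sketched as follows. The drug names for patient p001 come from the text; the patient name, dates and drug codes are invented, dosage and duration are left out for brevity, and SQLite stands in for PostgreSQL.

```python
import sqlite3

# Un-normalized record: one patient with a repeating group of prescriptions.
# Drug names come from the text; name, dates and codes are hypothetical.
unf = {
    "patient_id": "p001",
    "patient_name": "Example Patient",
    "prescriptions": [
        ("2021-05-01", "d01", "Tramadol"),
        ("2021-05-01", "d02", "Omeprazole"),
        ("2021-06-10", "d03", "Simvastatin"),
        ("2021-06-10", "d04", "Amitriptyline"),
    ],
}

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (
        patient_id   TEXT PRIMARY KEY,
        patient_name TEXT
    );
    -- The repeating group becomes its own relation with a composite key.
    CREATE TABLE prescription (
        patient_id TEXT,
        date       TEXT,
        drug_code  TEXT,
        drug_name  TEXT,
        PRIMARY KEY (patient_id, date, drug_code)
    );
    """
)
conn.execute(
    "INSERT INTO patient VALUES (?, ?)",
    (unf["patient_id"], unf["patient_name"]),
)
conn.executemany(
    "INSERT INTO prescription VALUES (?, ?, ?, ?)",
    [(unf["patient_id"], d, c, n) for d, c, n in unf["prescriptions"]],
)
prescription_count = conn.execute(
    "SELECT COUNT(*) FROM prescription"
).fetchone()[0]

# The composite key rejects the same drug for the same patient on the same day.
try:
    conn.execute(
        "INSERT INTO prescription VALUES ('p001', '2021-05-01', 'd01', 'Tramadol')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(prescription_count, duplicate_rejected)
```

Every column now holds a single atomic value per row, and each prescription is identified by the combination of patient, date and drug.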
page 67
Moving to Second Normal Form (2NF) - Partial dependency (a non-prime attribute is functionally
dependent on part of a candidate key)
A relation is in Second Normal Form (2NF) if it is in 1NF and every non-primary-key attribute of
the relation is dependent on the whole primary key, that is, there are no partial key dependencies.
To represent the data in 2NF we remove any attributes that only depend on part of the primary
key to separate relations, and choose a primary key for each new relation. (See Ponniah (2003)
‘Second Normal Form’, pp.312–14.)
This step only applies to relations that have a composite primary key. We have to decide
whether any attributes in such relations are functionally dependent on only part of the
composite primary key.
Functional dependencies
A functional dependency is a relationship between two attributes, typically between the PK and
the non-key attributes within a table.
For any two attributes A and B, A is functionally dependent on B if and only if:
o For a given value of B there is precisely one associated value of A at any one time.
o For example, patient_name is functionally dependent on patient_id because each patient is
given a unique patient identifier.
o This can be written as B → A, read as "B determines A" or "A is determined by B".
o B is the determinant and A is the dependent attribute.
Another way of describing this is to say that:
o Attribute B determines attribute A.
For example, patient_id determines patient_name.
But, the opposite is not true:
patient_name does not determine patient_id, as there may be several patients with the same
name.
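The definition can be tried out on sample rows with a small helper. This is an illustrative sketch: the function name determines and the data are hypothetical. A check like this can only refute a dependency on the rows it sees; whether a dependency really holds follows from the meaning of the data, not from one sample.

```python
def determines(rows, b, a):
    """Return True if, in these rows, each value of b has exactly one value of a."""
    seen = {}
    for row in rows:
        # setdefault records the first a-value met for this b-value.
        if seen.setdefault(row[b], row[a]) != row[a]:
            return False  # the same b-value maps to two different a-values
    return True


# Hypothetical rows: two different patients share the same name.
rows = [
    {"patient_id": "p001", "patient_name": "Smith"},
    {"patient_id": "p002", "patient_name": "Smith"},
    {"patient_id": "p003", "patient_name": "Jones"},
]

id_determines_name = determines(rows, "patient_id", "patient_name")
name_determines_id = determines(rows, "patient_name", "patient_id")

print(id_determines_name, name_determines_id)
```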
page 71
Example
To normalize the relation into 2NF:
o drug_name is removed from the relation (Figure 10.7), and
o drug_code and drug_name form a new relation (Figure 10.8), with drug_code as the
primary key.
page 73
Example
Remarks:
o The original relation can be recreated from these relations by performing a join
operation on the common attribute: drug_code.
o As the second of the two 1NF relations shown above (Figure 10.6) has a non-composite
primary key, patient_id, it is already in 2NF.
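The 2NF step and the first remark above can be sketched together: drug_name moves to its own relation keyed on drug_code, and an inner join on drug_code recreates the original rows. Table names, dates and drug codes are hypothetical, and SQLite stands in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- 1NF relation: drug_name depends only on drug_code (part of the key).
    CREATE TABLE prescription_1nf (
        patient_id TEXT, date TEXT, drug_code TEXT, drug_name TEXT,
        PRIMARY KEY (patient_id, date, drug_code)
    );
    INSERT INTO prescription_1nf VALUES
        ('p001', '2021-05-01', 'd01', 'Tramadol'),
        ('p001', '2021-06-10', 'd02', 'Omeprazole'),
        ('p002', '2021-05-01', 'd01', 'Tramadol');

    -- 2NF: the partially dependent attribute moves to a new relation.
    CREATE TABLE drug (drug_code TEXT PRIMARY KEY, drug_name TEXT);
    INSERT INTO drug
        SELECT DISTINCT drug_code, drug_name FROM prescription_1nf;

    CREATE TABLE prescription (
        patient_id TEXT, date TEXT, drug_code TEXT,
        PRIMARY KEY (patient_id, date, drug_code)
    );
    INSERT INTO prescription
        SELECT patient_id, date, drug_code FROM prescription_1nf;
    """
)

original = conn.execute(
    "SELECT patient_id, date, drug_code, drug_name "
    "FROM prescription_1nf ORDER BY 1, 2, 3"
).fetchall()

# Joining on the common attribute drug_code recreates the original relation.
rejoined = conn.execute(
    """
    SELECT p.patient_id, p.date, p.drug_code, d.drug_name
    FROM prescription AS p
    INNER JOIN drug AS d ON p.drug_code = d.drug_code
    ORDER BY 1, 2, 3
    """
).fetchall()

drug_count = conn.execute("SELECT COUNT(*) FROM drug").fetchone()[0]
print(rejoined == original, drug_count)
```

Each drug name is now stored once in drug, instead of once per prescription.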
page 76
Normalized relations
The final set of normalized relations is shown in Figure 10.11.
page 77
page 81
Referential integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
Its purpose is to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that, whenever a FK value is used, it must reference a valid,
existing PK in the parent table.
The referential integrity constraint enforces the integrity of primary keys and foreign keys:
the value of a foreign key in the referencing table must either be null or be one of the values of
the primary key in the referenced table.
It is enforced by the DBMS in the following cases:
o when a row containing an invalid foreign key value is inserted in the referencing table
o when a foreign key in the referencing table is updated to an invalid value
o when a row with a referenced primary key is deleted from the referenced table
o when a referenced primary key is updated in the referenced table
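The four cases above can be sketched with SQLite standing in for PostgreSQL. The doctor and patient tables, the helper violates and all values are hypothetical. The PRAGMA line is SQLite-specific, because SQLite does not check foreign keys unless asked to; PostgreSQL enforces declared foreign keys by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript(
    """
    CREATE TABLE doctor (doctor_id TEXT PRIMARY KEY, doctor_name TEXT);
    CREATE TABLE patient (
        patient_id TEXT PRIMARY KEY,
        doctor_id  TEXT REFERENCES doctor (doctor_id)
    );
    INSERT INTO doctor  VALUES ('d01', 'Gibson');
    INSERT INTO patient VALUES ('p001', 'd01');
    """
)


def violates(sql):
    """Run one statement; return True if the DBMS rejects it as a violation."""
    try:
        conn.execute(sql)
        conn.commit()
        return False
    except sqlite3.IntegrityError:
        conn.rollback()
        return True


# 1. Insert a row with a foreign key that matches no doctor.
bad_insert = violates("INSERT INTO patient VALUES ('p003', 'd99')")
# 2. Update a foreign key to a value that matches no doctor.
bad_update = violates(
    "UPDATE patient SET doctor_id = 'd99' WHERE patient_id = 'p001'"
)
# 3. Delete a doctor who is still referenced by a patient.
bad_delete = violates("DELETE FROM doctor WHERE doctor_id = 'd01'")
# 4. Change a primary key that is still referenced by a patient.
bad_key_update = violates(
    "UPDATE doctor SET doctor_id = 'd02' WHERE doctor_id = 'd01'"
)
# A null foreign key is allowed: the patient simply has no doctor.
null_fk_allowed = not violates("INSERT INTO patient VALUES ('p004', NULL)")

print(bad_insert, bad_update, bad_delete, bad_key_update, null_fk_allowed)
```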
Joins
Inner join and outer join operations allow us to:
o Realize relationships between tables (e.g. which doctor is responsible for which
patients), and to
o Identify the absence of a relationship between specific rows of the tables (e.g. which
doctors are not responsible for any patients, and which patients are not under the care
of a doctor).
o Watch the animation in Activity 10.7
Types:
Example:
o SELECT column_name(s) FROM table1 INNER JOIN table2
ON table1.column_name = table2.column_name;
o The INNER JOIN keyword selects rows from both tables only where there is a match
between the columns. For example, if there are records in an "Orders" table that do not have
matches in a "Customers" table, these orders will not be shown.
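The doctor and patient questions mentioned earlier in this section can be sketched as follows. The tables and data are hypothetical and SQLite stands in for PostgreSQL. The inner join realizes the relationship, while a left outer join filtered on NULL identifies rows with no partner on the other side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE doctor  (doctor_id TEXT PRIMARY KEY, doctor_name TEXT);
    CREATE TABLE patient (patient_id TEXT PRIMARY KEY, doctor_id TEXT);
    INSERT INTO doctor  VALUES ('d01', 'Gibson'), ('d02', 'Paxton');
    INSERT INTO patient VALUES ('p001', 'd01'), ('p002', 'd01'), ('p003', NULL);
    """
)

# INNER JOIN: which doctor is responsible for which patients.
responsible = conn.execute(
    """
    SELECT d.doctor_id, p.patient_id
    FROM doctor AS d
    INNER JOIN patient AS p ON d.doctor_id = p.doctor_id
    ORDER BY d.doctor_id, p.patient_id
    """
).fetchall()

# LEFT OUTER JOIN keeps every doctor; unmatched doctors get NULL patient columns.
doctors_without_patients = conn.execute(
    """
    SELECT d.doctor_id
    FROM doctor AS d
    LEFT OUTER JOIN patient AS p ON d.doctor_id = p.doctor_id
    WHERE p.patient_id IS NULL
    """
).fetchall()

# The same idea from the other side: patients not under the care of a doctor.
patients_without_doctor = conn.execute(
    """
    SELECT p.patient_id
    FROM patient AS p
    LEFT OUTER JOIN doctor AS d ON p.doctor_id = d.doctor_id
    WHERE d.doctor_id IS NULL
    """
).fetchall()

print(responsible, doctors_without_patients, patients_without_doctor)
```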