Part 1
I. DATABASES
What is a database?
A very general definition could be:
Regardless of the medium used to collect and store data (paper, files, etc.), when data is collected
and stored in an organized manner for a specific purpose, it is called a database.
More precisely, a database is a structured and organized set of data that allows large quantities
of information to be stored and used easily (adding, updating, and searching for data). In this
course we are interested in computerized databases.
Computerized database
The data in a computerized database is described using a data model, a formal tool used to
understand the logical organization of data.
DATABASE IMPLEMENTATION
Management and access to a database are ensured by a set of programs which constitute the
Database Management System (DBMS).
II. CONSTRAINTS
Database constraints are a key feature of database management systems. They ensure that rules
defined at data model creation are enforced when the data is manipulated (inserted, updated, or
deleted) in a database.
Constraints allow us to rely on the database to ensure integrity, accuracy, and reliability of the
data stored in it. They are distinct from the validations or controls we define at the application or
presentation layers, and they do not depend on the experience or knowledge of the users interacting
with the system.
We will briefly explain how to define the following types of constraint and their usage:
• DEFAULT
• CHECK
• NOT NULL
• UNIQUE KEY
• PRIMARY KEY
• FOREIGN KEY
Constraints can be defined when we create a table or can be added later. They can be
explicitly named when created (thus allowing us to identify them easily), or they can have system-
generated names if an explicit name is omitted.
C. Constraint Types
We will start from the most basic then move on to the more complex.
1. DEFAULT
This type of constraint allows us to define a value to be used for a given column when no data
is provided at insert time. If a column with a DEFAULT constraint is omitted in
the INSERT statement, then the database will automatically use the defined value and assign it to
the column (if there is no DEFAULT defined and the column is omitted, the database will assign
a NULL value for it).
Once the DEFAULT is created for a column, we can insert a row in our table without specifying
the column:
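A minimal sketch of such an insert, written in Python with the built-in sqlite3 module standing in for the DBMS. The customer table and its columns are hypothetical, and the exact syntax may differ slightly in other systems.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a full DBMS.
conn = sqlite3.connect(":memory:")

# Hypothetical table: status falls back to 'active' when it is omitted.
conn.execute("""
    CREATE TABLE customer (
        id     INTEGER,
        name   TEXT,
        status TEXT DEFAULT 'active'
    )
""")

# The INSERT statement does not mention the status column.
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Alice')")

default_row = conn.execute("SELECT id, name, status FROM customer").fetchone()
print(default_row)  # (1, 'Alice', 'active')
```

The database filled in the status column by itself, without any help from the application.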
2. CHECK
CHECK constraints allow us to define a logical condition that will generate an error if it
returns FALSE. Every time a row is inserted or updated, the condition is automatically checked,
and an error is generated if the condition is false. The condition can be an expression evaluating
one or more columns.
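The behaviour described above can be sketched as follows, again using Python's sqlite3 module as a stand-in DBMS. The product table and the rule relating price and sale_price are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the CHECK condition involves two columns.
conn.execute("""
    CREATE TABLE product (
        id         INTEGER,
        price      REAL,
        sale_price REAL,
        CHECK (price > 0 AND sale_price <= price)
    )
""")

# This row satisfies the condition, so it is accepted.
conn.execute("INSERT INTO product VALUES (1, 10.0, 8.0)")

# This row violates the condition (sale_price > price), so an error is raised.
try:
    conn.execute("INSERT INTO product VALUES (2, 10.0, 12.0)")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(check_rejected, row_count)  # True 1
```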
3. UNIQUE KEY
Unique keys are defined at table level and can include one or more columns. They guarantee
that the combination of values in the key columns of one row is not repeated in any other row. You
can create as many unique keys as you need in each table to ensure that all business rules
associated with uniqueness are enforced.
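As an illustration, the sketch below defines two unique keys on the same table, one on a single column and one on a pair of columns. It uses Python's sqlite3 module as a stand-in DBMS, and the employee table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with two unique keys: one single-column, one composite.
conn.execute("""
    CREATE TABLE employee (
        id         INTEGER,
        email      TEXT,
        department TEXT,
        badge_no   INTEGER,
        UNIQUE (email),
        UNIQUE (department, badge_no)
    )
""")

conn.execute("INSERT INTO employee VALUES (1, 'ann@example.com', 'IT', 100)")
# Same badge number in another department: the combination is still unique.
conn.execute("INSERT INTO employee VALUES (2, 'bob@example.com', 'HR', 100)")

# This row repeats an existing email, so the unique key rejects it.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'ann@example.com', 'HR', 200)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_rejected, employee_count)  # True 2
```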
4. NOT NULL
By default, all columns in a table accept NULL values. A NOT NULL constraint prevents a
column from accepting NULL values.
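A short sketch of the difference between a nullable column and a NOT NULL column, using Python's sqlite3 module as a stand-in DBMS. The city table is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: name is mandatory, zip is optional.
conn.execute("""
    CREATE TABLE city (
        id   INTEGER,
        name TEXT NOT NULL,
        zip  TEXT
    )
""")

# zip has no NOT NULL constraint, so NULL is accepted there.
conn.execute("INSERT INTO city (id, name, zip) VALUES (1, 'Paris', NULL)")

# name is declared NOT NULL, so a NULL value is refused.
try:
    conn.execute("INSERT INTO city (id, name) VALUES (2, NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

print(null_rejected)  # True
```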
5. PRIMARY KEY
A primary key is a constraint defined at table level and can be composed of one or more
columns. Each table can have only one primary key defined, which guarantees two things at row
level:
• The combination of the values of the columns that are part of the primary key is unique.
• All the columns that are part of the primary key have non-null values.
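The two guarantees can be observed with a composite primary key, as in the sketch below (Python's sqlite3 module as a stand-in DBMS, hypothetical order_line table). One assumption specific to SQLite: for historical reasons it needs an explicit NOT NULL on primary key columns, whereas most other systems imply it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with a two-column primary key.
# NOT NULL is spelled out because SQLite does not imply it for primary keys.
conn.execute("""
    CREATE TABLE order_line (
        order_id INTEGER NOT NULL,
        line_no  INTEGER NOT NULL,
        quantity INTEGER,
        PRIMARY KEY (order_id, line_no)
    )
""")

def try_insert(values):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO order_line VALUES (?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, 1, 5)),     # accepted
    try_insert((1, 2, 3)),     # accepted: same order, different line
    try_insert((1, 1, 9)),     # rejected: duplicate key combination
    try_insert((2, None, 1)),  # rejected: NULL in a key column
]
print(results)  # [True, True, False, False]
```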
6. FOREIGN KEY
Foreign keys are vital to maintaining referential integrity in a database. Foreign keys are
created in child tables, and they “reference” a parent table. To be able to reference a table, a
constraint that ensures uniqueness (either a UNIQUE or PRIMARY KEY) must exist for the
referenced columns of the parent table.
When a foreign key is defined, the two tables become related, and the database engine will ensure
that:
• Every value or combination of values entered at INSERT or UPDATE in the columns that
are part of a foreign key exists exactly once in the parent table. This means that we cannot
insert or update a row in the Order table with a reference to a product that does not exist in
the Product table.
• Every time we try to DELETE a row in the parent table, the database will verify that it does
not have child rows associated; the DELETE will fail if it does. This means that we would
not be able to remove a row in Product if it has one or more related rows in the Order table.
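Both checks can be tried out with the Product and Order tables mentioned above. The sketch below uses Python's sqlite3 module as a stand-in DBMS; the column names are assumptions, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: its primary key ensures uniqueness of the referenced column.
conn.execute("CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT)")
# Child table. ORDER is a reserved word in SQL, hence the quotes.
conn.execute("""
    CREATE TABLE "Order" (
        id         INTEGER PRIMARY KEY,
        product_id INTEGER,
        FOREIGN KEY (product_id) REFERENCES Product (id)
    )
""")

conn.execute("INSERT INTO Product VALUES (1, 'Keyboard')")
conn.execute('INSERT INTO "Order" VALUES (10, 1)')

def rejected(sql):
    """Return True if the statement fails on an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Product 99 does not exist, so the child row is refused.
insert_rejected = rejected('INSERT INTO "Order" VALUES (11, 99)')
# Product 1 still has an order, so it cannot be deleted.
delete_rejected = rejected("DELETE FROM Product WHERE id = 1")
print(insert_rejected, delete_rejected)  # True True
```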
All constraint types we have reviewed can be defined at column level as long as they involve only
a single column (the column that is being defined). All constraint types except NOT NULL can
also be defined at table level, and this is mandatory when a constraint involves more than one
column (like complex CHECK conditions and multiple-column unique, primary, or foreign
keys). DEFAULT constraints can involve only one column, but they can be defined at either level.
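The sketch below mixes the two levels in one table definition and gives the table-level constraints explicit names. It uses Python's sqlite3 module as a stand-in DBMS; the booking table and the constraint names (pk_booking, uq_booking_room_day, ck_booking_days) are hypothetical, and DEFAULT is kept at column level here because that is the form SQLite accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute("""
    CREATE TABLE booking (
        -- Column-level constraints: each involves only its own column.
        id       INTEGER NOT NULL,
        room     TEXT    NOT NULL,
        day_from INTEGER NOT NULL,
        day_to   INTEGER NOT NULL,
        status   TEXT    DEFAULT 'pending',
        -- Table-level constraints, explicitly named.
        CONSTRAINT pk_booking PRIMARY KEY (id),
        CONSTRAINT uq_booking_room_day UNIQUE (room, day_from),
        CONSTRAINT ck_booking_days CHECK (day_to >= day_from)
    )
""")

# day_to is earlier than day_from, so the multi-column CHECK fails.
try:
    conn.execute(
        "INSERT INTO booking (id, room, day_from, day_to) VALUES (1, 'A', 5, 3)"
    )
    error_message = ""
except sqlite3.IntegrityError as exc:
    error_message = str(exc)

print(error_message)
```

Naming a constraint pays off when it is violated: the error message usually mentions the name, which makes the broken rule easier to identify than a system-generated name would.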
III. JOINs
MySQL databases usually store large amounts of data. To analyze that data efficiently, analysts
and DBAs have a constant need to extract records from two or more tables based on certain
conditions. That is where JOINs help. JOINs are used to retrieve data from multiple
tables in a single query. For JOINs to work, the tables need to be related to each other by a
common key value. JOIN clauses are used in SELECT, UPDATE, and DELETE statements.
INNER JOINs are used to fetch only matching records. The INNER JOIN clause
retrieves only those records from Table A and Table B that meet the join condition. It is
the most widely used type of JOIN.
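A small sketch of an INNER JOIN, run through Python's sqlite3 module as a stand-in for MySQL. The tables table_a and table_b and their contents are assumptions chosen so that only ids 2 and 3 appear in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: ids 2 and 3 exist in both tables, 1 and 4 in only one.
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, label TEXT);
    CREATE TABLE table_b (id INTEGER, note  TEXT);
    INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table_b VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

inner_rows = conn.execute("""
    SELECT table_a.id, table_a.label, table_b.note
    FROM table_a
    INNER JOIN table_b ON table_a.id = table_b.id
    ORDER BY table_a.id
""").fetchall()

print(inner_rows)  # [(2, 'a2', 'b2'), (3, 'a3', 'b3')]
```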
In contrast to INNER JOINs, OUTER JOINs return not only matching rows but non-matching
ones as well. Where a row has no match in the joined table, NULL values are shown for the
columns of that table.
There are two types of OUTER JOIN in MySQL: LEFT JOIN and RIGHT JOIN.
LEFT JOINs retrieve all records from Table A, along with those records from
Table B for which the join condition is met. For the records from Table A that have no match in
Table B, NULL values are displayed in the Table B columns.
Accordingly, RIGHT JOINs retrieve all records from Table B, along with those
records from Table A for which the join condition is met. For the records from Table B that have no
match in Table A, NULL values are displayed in the Table A columns.
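The sketch below shows both directions on two small hypothetical tables, using Python's sqlite3 module as a stand-in for MySQL. Since only recent SQLite versions accept the RIGHT JOIN keyword, the right join is written here as a LEFT JOIN with the tables swapped, which gives the same rows; in MySQL you could write RIGHT JOIN directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: id 1 exists only in table_a, id 4 only in table_b.
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, label TEXT);
    CREATE TABLE table_b (id INTEGER, note  TEXT);
    INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table_b VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

# LEFT JOIN: every row of table_a, NULL (None) where table_b has no match.
left_rows = conn.execute("""
    SELECT a.id, a.label, b.note
    FROM table_a AS a
    LEFT JOIN table_b AS b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

# Equivalent of "table_a RIGHT JOIN table_b": every row of table_b is kept.
right_rows = conn.execute("""
    SELECT a.label, b.id, b.note
    FROM table_b AS b
    LEFT JOIN table_a AS a ON a.id = b.id
    ORDER BY b.id
""").fetchall()

print(left_rows)   # [(1, 'a1', None), (2, 'a2', 'b2'), (3, 'a3', 'b3')]
print(right_rows)  # [('a2', 2, 'b2'), ('a3', 3, 'b3'), (None, 4, 'b4')]
```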
MySQL CROSS JOIN, also known as a Cartesian join, retrieves all combinations of rows from
the two tables. If no additional condition is introduced, the result set pairs each row of table A
with every row of table B.
When might you need this type of JOIN? Suppose that you have to find all combinations of a
product and a color. In that case, a CROSS JOIN would be highly advantageous.
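The product and color case can be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. The two tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: 2 products and 3 colors.
conn.executescript("""
    CREATE TABLE product (name TEXT);
    CREATE TABLE color   (name TEXT);
    INSERT INTO product VALUES ('shirt'), ('cap');
    INSERT INTO color   VALUES ('red'), ('blue'), ('green');
""")

# No join condition: every product is paired with every color.
combos = conn.execute("""
    SELECT product.name, color.name
    FROM product
    CROSS JOIN color
    ORDER BY product.name, color.name
""").fetchall()

print(len(combos))  # 6, that is 2 products x 3 colors
print(combos)
```

The size of the result grows as the product of the two table sizes, so a CROSS JOIN on large tables should be used with care.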
Unlike SQL Server, MySQL does not support FULL OUTER JOIN as a separate JOIN
type. However, to get the same results as a FULL OUTER JOIN, you can combine a LEFT OUTER
JOIN and a RIGHT OUTER JOIN with UNION.
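One common way to combine the two is a UNION of both outer joins, sketched below with Python's sqlite3 module as a stand-in for MySQL. The tables are hypothetical, and the right outer join is again written as a LEFT JOIN with the tables swapped so that the example also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: id 1 exists only in table_a, id 4 only in table_b.
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, label TEXT);
    CREATE TABLE table_b (id INTEGER, note  TEXT);
    INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO table_b VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

full_rows = conn.execute("""
    -- All rows of table_a, matched or not.
    SELECT a.id, a.label, b.id, b.note
    FROM table_a AS a
    LEFT JOIN table_b AS b ON a.id = b.id
    UNION
    -- All rows of table_b, matched or not (a right join in MySQL).
    SELECT a.id, a.label, b.id, b.note
    FROM table_b AS b
    LEFT JOIN table_a AS a ON a.id = b.id
""").fetchall()

for row in full_rows:
    print(row)
```

UNION removes the matching rows that both halves return, which is what we want here. Note that it would also collapse rows that are genuinely duplicated in the source tables.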
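Using MySQL JOINs, you can also join more than two tables, for example by chaining one JOIN
clause per additional table:
SELECT *
FROM tableA
JOIN tableB ON tableA.id = tableB.id
JOIN tableC ON tableC.id = tableA.id;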
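A runnable sketch of a three-table join built on the same ON conditions, using Python's sqlite3 module as a stand-in for MySQL. The column names other than id, the sample rows, and the choice of an inner JOIN are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: only id 3 is present in all three tables.
conn.executescript("""
    CREATE TABLE tableA (id INTEGER, label TEXT);
    CREATE TABLE tableB (id INTEGER, note  TEXT);
    CREATE TABLE tableC (id INTEGER, extra TEXT);
    INSERT INTO tableA VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO tableB VALUES (2, 'b2'), (3, 'b3');
    INSERT INTO tableC VALUES (3, 'c3'), (4, 'c4');
""")

# Each additional table gets its own JOIN clause with its own ON condition.
three_way = conn.execute("""
    SELECT tableA.id, tableA.label, tableB.note, tableC.extra
    FROM tableA
    JOIN tableB ON tableA.id = tableB.id
    JOIN tableC ON tableC.id = tableA.id
""").fetchall()

print(three_way)  # [(3, 'a3', 'b3', 'c3')]
```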
a. Operating principles
Management and access to a database are ensured by a set of programs which constitute the
Database Management System (DBMS). A DBMS must allow the addition, modification and
search of data. A database management system generally hosts several databases, which are
intended for different software or purposes.
Currently, most DBMS operate in a client/server mode. The server (meaning the machine that
stores the data) receives requests from several clients concurrently. The server analyzes each
request, processes it and returns the result to the client. The client/server model is quite often
implemented using the sockets interface over a network such as the Internet.
1. Hierarchical model
In a database organized in a hierarchical structure, the data is collected in a tree-like form. This
model represents some of the links in the actual world, such as recipes for food, sitemaps for
websites, etc. A hierarchical model has the following characteristics:
• One-to-many relationship: The data is organized in a tree-like structure in which one
record can be linked to many records below it.
• Parent-child relationship: Although a parent node might have more than one child
node, every child node has exactly one parent node.
• Deletion problem: When a parent node is deleted, all of its child nodes are deleted with it.
• Pointers: Pointers connect the parent and child nodes and are used to navigate between
the stored data.
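These characteristics can be illustrated with a small tree kept in memory. The sketch below is plain Python, not an actual hierarchical DBMS: the Node class, the helper functions, and the sitemap data are all hypothetical.

```python
class Node:
    """A record in the hierarchy: one parent pointer, many child pointers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # every child has exactly one parent
        self.children = []     # a parent may have several children
        if parent is not None:
            parent.children.append(self)


def delete(node):
    """Detach node from its parent; its whole subtree disappears with it."""
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None


def names(node):
    """List the names reachable from node, parents before children."""
    result = [node.name]
    for child in node.children:
        result.extend(names(child))
    return result


# Hypothetical sitemap: home -> (blog -> post-1, post-2), about
site = Node("home")
blog = Node("blog", parent=site)
Node("post-1", parent=blog)
Node("post-2", parent=blog)
Node("about", parent=site)

before = names(site)
delete(blog)  # deleting the parent also removes post-1 and post-2
after = names(site)

print(before)  # ['home', 'blog', 'post-1', 'post-2', 'about']
print(after)   # ['home', 'about']
```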
2. Relational model
One of the most frequently used data models is the relational model. The data in this model is
kept as a two-dimensional table. The data storage takes the shape of rows and columns. Tables are
a relational model’s fundamental building block. In the relational paradigm, the tables are also
referred to as relations. The key traits of the relational model are as follows:
• Tuples: The table’s rows are referred to as tuples. All the information about any
object instance is contained in a row.
• Attribute or field: Each column of a table or relation describes one property and is
called an attribute. All of an attribute’s values must come from the same domain.
3. Object-oriented model
• Linked circular list: The circular linked list performs operations on the network
model.
C. DBMS Examples
You can use multiple database management systems or DBMS software for information
storage, organization, and data analysis. Some of the top options include:
1. Microsoft Access
2. MySQL
3. Oracle Database
4. MongoDB
5. IBM Db2
The world’s top database specialists created IBM Db2, which gives developers, DBAs, and
enterprise architects the tools they need to execute real-time analytics and low-latency transactions
for even the most demanding workloads. Db2 is the tried-and-true hybrid database that offers
extreme availability, sophisticated integrated security, seamless scalability, and intelligent
automation for systems that run the world from microservices to AI workloads.
Virtually all of your data is now accessible across hybrid cloud or multi-cloud settings to
power your AI applications, thanks to the majority of the Db2 family being made available on the
IBM Cloud Pak for Data platform, either as an add-on or an included data source service.
6. Amazon RDS
The managed SQL database service known as Amazon Relational Database Service (RDS)
is offered by Amazon Web Services (AWS). Amazon RDS supports a variety of database engines
for data storage and organization. Additionally, it supports activities related to relational database
maintenance, including data migration, backup, recovery, and patching. Amazon RDS makes it
easy to set up, run, and scale a relational database in the cloud.
While automating time-consuming administrative activities like hardware provisioning,
database setup, patching, and backups, it offers affordable and expandable capacity. It gives you
more time to concentrate on your applications, giving them the quick response, high availability,
security, and compatibility required.
7. PostgreSQL
8. Apache Cassandra
With no single point of failure and the ability to handle massive volumes of data across
numerous commodity servers, Apache Cassandra is a distributed database that is highly scalable
and highly functional. It belongs to the NoSQL database family. The capacity to manage
structured, semi-structured and unstructured data is a strength of Apache Cassandra.
Initially created by Facebook, Cassandra was made available to the public in 2008 before
becoming one of the top-level Apache projects in 2010. Major corporations particularly benefit
from its capacity to handle enormous volumes of data. Because of this, many significant
corporations, like Apple, Facebook, and Instagram, are currently using it.