
8.1_8.2 Databases

The document discusses the principles of relational databases, highlighting the advantages of a database approach over a file-based approach, such as reduced redundancy and improved data consistency. It explains key concepts including tables, relationships, primary and foreign keys, and normalization, which is the process of organizing data to minimize redundancy and ensure data integrity. The document also outlines the steps to achieve different normal forms in database design, ultimately leading to a fully normalized database structure.

Uploaded by

avinashnapaul

A Level Computer Science P1 Databases N.

DATABASES
Relational database: a method of creating a database using tables of related data, with relationships
between the tables.

A file-based approach is limited because:

• storage space is wasted when data items are duplicated by the separate applications and some data
is redundant
• data can be altered by one application and not by another; it then becomes inconsistent
• enquiries available can depend on the structure of the data and the software used so the data is not
independent.

A database approach is beneficial because:

• storage space is not wasted as data items are only stored once, meaning little or no redundant data
• data altered in one application is available in another application, so the data is consistent
• enquiries available are not dependent on the structure of the data and the software used, so the data
is independent.

Table: a method for implementing an entity and its attributes as a group of related data.

A table is a group of similar data, in a database, with rows for each instance of an entity and columns for each
attribute. A record is a row in a table in a database. A field is a column in a table in a database.

A tuple is one instance of an entity which is represented by a row in a table.

Relationships

A relationship is formed when one table in a database has a foreign key that refers to a primary key in another
table in the database.

Referential integrity: makes sure that if data is changed in one place, the change is reflected in all related
records.

In order to ensure referential integrity, the database must not contain any values of a foreign key that are not
matched to the corresponding primary key.

In other words, it ensures that every foreign key has a corresponding primary key.
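The rule above can be tried out with a minimal sketch in Python, using the standard sqlite3 module. The two cut-down tables and their column names are assumptions made for illustration only; note that SQLite enforces foreign keys only after the PRAGMA shown has been issued.

```python
import sqlite3

# Hypothetical two-table schema, reduced to the columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE download ("
    " download_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Mary Jones')")
conn.execute("INSERT INTO download VALUES (100, 1)")  # accepted: customer 1 exists


def try_statement(sql):
    """Run one statement; return True if accepted, False if it breaks integrity."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A download for a customer that does not exist has no matching primary key.
orphan_accepted = try_statement("INSERT INTO download VALUES (101, 99)")
# Deleting a customer who still has downloads would leave an unmatched foreign key.
delete_accepted = try_statement("DELETE FROM customer WHERE customer_id = 1")
```

Both attempts are refused, so every foreign key value in the download table still has a corresponding primary key in the customer table.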


Relationships can take several forms:

» one-to-one, 1:1

» one-to-many, 1:m

» many-to-one, m:1

» many-to-many, m:m.

If you were to set up a database for a movie download site, you may create one table to store customer data,
one to store data about the movies and one to store download details:

• The CUSTOMER table contains data all of which are related to the customer, for example, their name
and address.
• The MOVIE table contains data related to the movie, for example, the title and genre.

• The DOWNLOAD table contains data related to the actual download itself, for example, when it was
downloaded, how much the customer paid, what file type was downloaded.

There are some real-world relationships between these three tables. For example:

• one-to-many: one customer may have many downloads
• many-to-many: one customer could download many movies and one movie could be downloaded by
many customers.

Entity: an object about which data will be stored.

In our movie example, a customer may be an entity that has attributes such as name and address.

One of the first tasks when creating a relational database is to decide on how many tables are needed to solve
the problem. To do this, you must use a technique called normalisation, which ensures that databases are
truly relational and are organised effectively.

In view of this, a further table called MOVIEFORMAT is added to the database. The reasons for this are
explained in more detail later in this chapter in the section on normalisation.

Attribute: a characteristic or piece of information about an entity, which would be stored as a field in a
relational database.

With the movie download database example, we will store different items of data relating to each entity in a
table. Possible attributes, only some of which we will use, include:
CUSTOMER: Customer Name, Address, Phone Number, Date of Birth
MOVIE: Movie Title, Age Classification, Genre
DOWNLOAD: Date of Download, Price, Method of Payment
MOVIEFORMAT: File type


Entity relationship (ER) diagram: a visual method of describing relationships between entities.

There are three types, or degrees, of relationship, two of which exist in the movie database:

• One-to-many: One customer will have many downloads.


• Many-to-many: Many customers could download many movies.

Notice that the name of the entity is shown in the box with the lines indicating the nature of the relationship.
Labels are usually added above the lines to clarify the relationship.

The nature of relationships is sometimes hard to define. You should choose the one that best describes the
relationship in logical terms.

In our example you could say that the relationship between CUSTOMER and DOWNLOAD could be any one of
the three:

• One customer has one download.
• One customer has many downloads.
• Many customers have many downloads.

However, the most accurate way to describe it is that one customer could have many downloads because this
best describes the nature of the relationship in a real-life context.

When creating a relational database, you should replace any many-to-many relationships with one-to-many
relationships.

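In the example, we replace the many-to-many relationship between customers and movies by setting up the
DOWNLOAD entity as a link between CUSTOMER and MOVIE, so that each of them has a one-to-many
relationship with DOWNLOAD.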
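A minimal sketch of a link table, assuming cut-down versions of the three tables. The column names, customer names and movie titles are invented sample data, not taken from the original tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, title TEXT);
-- DOWNLOAD is the link entity: each row pairs one customer with one movie.
CREATE TABLE download (
    customer_id INTEGER REFERENCES customer(customer_id),
    movie_id    INTEGER REFERENCES movie(movie_id)
);
INSERT INTO customer VALUES (1, 'Customer A'), (2, 'Customer B');
INSERT INTO movie VALUES (10, 'Movie X'), (20, 'Movie Y');
INSERT INTO download VALUES (1, 10), (1, 20), (2, 10);
""")


def movies_for(customer_id):
    """Titles downloaded by one customer (one customer, many downloads)."""
    rows = conn.execute(
        "SELECT m.title FROM download d JOIN movie m ON m.movie_id = d.movie_id"
        " WHERE d.customer_id = ? ORDER BY m.title",
        (customer_id,),
    )
    return [title for (title,) in rows]


def customers_for(movie_id):
    """Names of customers who downloaded one movie (one movie, many downloads)."""
    rows = conn.execute(
        "SELECT c.name FROM download d JOIN customer c ON c.customer_id = d.customer_id"
        " WHERE d.movie_id = ? ORDER BY c.name",
        (movie_id,),
    )
    return [name for (name,) in rows]
```

Here `movies_for(1)` follows the link from one customer to many movies, and `customers_for(10)` follows it the other way, which is how two one-to-many relationships stand in for the original many-to-many one.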


The third type of relationship, which does not exist in the movie database, is a one-to-one relationship. In a
school, if a teacher only taught in one classroom and that classroom was only used by the one teacher, then
this would be a one-to-one relationship.

Primary key: an attribute that can be used to uniquely identify every record within a table.

There must be a way of ensuring that every record in an entity table can be identified individually, otherwise
the relationships between the tables cannot be made.

There are three possible solutions:

• Use a unique attribute.

For example, if you were storing personal details you could use the National Insurance number as this
is unique to every person in the country.

• Create a unique attribute: We could invent a unique code or identifier (ID) for each customer.

• Use a composite key: Two or more attributes could be used in combination.

For example, using name and address as a composite key may ensure that each record is unique as
it is unlikely that you will have two customers with the same name at the same address. However, it is
still possible, for example, a father and son who are both called John Smith who live at the same
address.

Composite Key: a set of attributes that form a primary key to provide a unique identifier for a table.
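The weakness of a name-and-address composite key can be demonstrated with a short sketch. The table layout, addresses and phone numbers below are made-up sample values; only the John Smith scenario comes from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (name TEXT, address TEXT, phone TEXT,"
    " PRIMARY KEY (name, address))"  # composite key: the pair must be unique
)
conn.execute("INSERT INTO customer VALUES ('John Smith', '1 High Street', '555-0100')")
# The same name at a different address is fine: the combination differs.
conn.execute("INSERT INTO customer VALUES ('John Smith', '2 Low Road', '555-0101')")

try:
    # A second John Smith at the same address (father and son) is rejected.
    conn.execute("INSERT INTO customer VALUES ('John Smith', '1 High Street', '555-0102')")
    second_john_stored = True
except sqlite3.IntegrityError:
    second_john_stored = False

stored_rows = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

The third insert fails, so the son could not be stored at all. This is one reason an invented unique ID is often preferred to a composite key built from real-world attributes.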

Foreign key: an attribute in a table that is a primary key in another table and is used to link tables together.

For example, if one customer can have more than one download, how do we create the one-to-many
relationship between the CUSTOMER table and the DOWNLOAD table?

The answer is to put the CustomerID in the DOWNLOAD table as a foreign key.


Primary keys have been added for each entity in the form of unique IDs.

• CustomerID appears on the CUSTOMER table as the primary key and on the DOWNLOAD table as a
foreign key.
• MovieID appears on the MOVIE table as the primary key and on the DOWNLOAD table as a foreign
key.
• FormatID appears on the MOVIEFORMAT table as the primary key and on the MOVIE table as the
foreign key.

Now that the relationships have been created, the four tables become one database.

It is common practice to write out the details of relational databases in standard database notation as
shown:

CUSTOMER (CustomerID, CustomerName, Address, PhoneNumber, DateOfBirth)
MOVIE (MovieID, MovieTitle, AgeClassification, Genre, FormatID)
DOWNLOAD (DownloadID, DateOfDownload, Price, MethodOfPayment, CustomerID, MovieID)
MOVIEFORMAT (FormatID, FileType)

Note:
• the name of the table is shown in capitals
• all the attributes are placed between brackets
• primary keys are underlined (in the notation above, the first attribute listed for each table).
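As a rough sketch, the four tables in the standard notation could be declared in SQL as follows and created through Python's sqlite3 module. The attribute names follow the notation; the data types (INTEGER, TEXT, REAL) are assumptions, since the notes do not specify them.

```python
import sqlite3

SCHEMA = """
CREATE TABLE MOVIEFORMAT (
    FormatID INTEGER PRIMARY KEY,
    FileType TEXT
);
CREATE TABLE CUSTOMER (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT,
    Address      TEXT,
    PhoneNumber  TEXT,
    DateOfBirth  TEXT
);
CREATE TABLE MOVIE (
    MovieID           INTEGER PRIMARY KEY,
    MovieTitle        TEXT,
    AgeClassification TEXT,
    Genre             TEXT,
    FormatID          INTEGER REFERENCES MOVIEFORMAT(FormatID)
);
CREATE TABLE DOWNLOAD (
    DownloadID      INTEGER PRIMARY KEY,
    DateOfDownload  TEXT,
    Price           REAL,
    MethodOfPayment TEXT,
    CustomerID      INTEGER REFERENCES CUSTOMER(CustomerID),
    MovieID         INTEGER REFERENCES MOVIE(MovieID)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)


def referenced_tables(table):
    """List the tables that `table` points to through its foreign keys."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return sorted(row[2] for row in rows)  # column 2 is the referenced table name
```

Calling `referenced_tables("DOWNLOAD")` lists CUSTOMER and MOVIE, matching the two foreign keys that DOWNLOAD carries in the notation.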

Candidate key: an attribute or smallest set of attributes in a table where no tuple has the same value.

A candidate key can be any column, or combination of columns, that qualifies as a unique key in the table. There can be
multiple candidate keys in one table, and each candidate key can qualify as the primary key.
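The definition can be illustrated with a small search for minimal unique attribute sets. This is only a sketch: the sample rows are invented, and uniqueness in a handful of rows suggests a candidate key rather than proving one, because a real candidate key must stay unique for all data the table could ever hold.

```python
from itertools import combinations


def candidate_keys(rows, attributes):
    """Return the minimal attribute sets whose values are unique across rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(key) <= set(combo) for key in keys):
                continue
            values = [tuple(row[a] for a in combo) for row in rows]
            if len(set(values)) == len(values):  # no two tuples share these values
                keys.append(combo)
    return keys


# Hypothetical CUSTOMER rows; two customers share a name.
customers = [
    {"CustomerID": 1, "CustomerName": "Mary Jones", "PhoneNumber": "555-0100"},
    {"CustomerID": 2, "CustomerName": "John Smith", "PhoneNumber": "555-0101"},
    {"CustomerID": 3, "CustomerName": "John Smith", "PhoneNumber": "555-0102"},
]

found = candidate_keys(customers, ["CustomerID", "CustomerName", "PhoneNumber"])
```

In this sample both CustomerID and PhoneNumber come out as candidate keys, while CustomerName does not because two rows share it. The designer would then pick one candidate, most likely CustomerID, as the primary key.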

An index can be created in a table to find data more quickly and efficiently.

In “TABLE”, the records are added in the sequence shown by “Location” field. If the “Data” field is indexed,
the entries will be shown as depicted in “INDEX”.
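The idea of a table stored in arrival order with a separate sorted index can be sketched in a few lines. The field values are invented, and a real DBMS would typically use a more elaborate structure such as a B-tree; a sorted list with binary search is used here only to show the principle.

```python
from bisect import bisect_left

# Hypothetical TABLE contents: location -> data, in the order records were added.
table = {1: "Mango", 2: "Apple", 3: "Pear", 4: "Banana"}

# The INDEX holds (data, location) pairs sorted by the indexed Data field.
index = sorted((data, location) for location, data in table.items())
keys = [data for data, _ in index]


def find_location(value):
    """Binary-search the index; return the record's location, or None if absent."""
    pos = bisect_left(keys, value)
    if pos < len(keys) and keys[pos] == value:
        return index[pos][1]
    return None
```

Looking up a value through the index needs only a few comparisons even for a large table, whereas scanning the unsorted table would examine records one by one.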


Normalisation: the process of ensuring that a relational database is structured efficiently, thereby minimising
data redundancy.

In simple terms, a database is normalised when there is no redundant data and when each item of data is
stored in the correct table and at an atomic level.

• Redundant data occurs when the same field is unnecessarily duplicated in two or more tables.
For example, many different customers may download the same movie. If all the movie details were
stored every time it was downloaded, much of the data would be redundant as we only actually need
to store the movie details once and then link those details to each customer who downloads it.

• Storing the same data multiple times can also lead to the problem of data inconsistency, for example
we might store the same customer’s details several times but the telephone number stored might
differ. How would we know which was correct?

• Storing data at an atomic level means that each item of data cannot be decomposed any further.

For example, a table may contain an attribute called Address that stores the full address of the
customer. At an atomic level, this could be decomposed into several attributes, for example:
HouseNumber, Street, Town, County, or AddressLine1, AddressLine2 etc.

When a database is constructed according to these rules it is said to be in normal form.

First normal form (1NF)

First normal form is achieved by ensuring that a table does not contain repeating attributes or repeating
groups and that all of the data in the table is atomic.

For example, a first attempt at creating a database for the movie download system might produce the
following table:

• This is not in first normal form (1NF) because there are repeating groups, which are shown in the
DateOfDownload, MovieID, MovieTitle, Genre, Format and FileType columns.

A repeating group is when a group of values is stored in a particular row/column intersection in a
database table instead of a single value.

➢ To satisfy first normal form, the repeating groups should be replaced by creating one record for each
download as shown in the table below:
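Removing a repeating group can be sketched as a simple flattening step. The records below are invented sample data (only a few of the columns are included), with the repeating group held as a list inside each customer record.

```python
# Hypothetical unnormalised records: "Downloads" is a repeating group of
# (DateOfDownload, MovieID) pairs stored inside a single customer record.
unnormalised = [
    {"CustomerID": 1, "CustomerName": "Mary Jones",
     "Downloads": [("2024-01-05", 10), ("2024-01-09", 20), ("2024-02-01", 30)]},
    {"CustomerID": 2, "CustomerName": "John Smith",
     "Downloads": [("2024-01-07", 10)]},
]


def to_first_normal_form(records):
    """Create one flat record per download, so every field holds a single value."""
    flat = []
    for record in records:
        for date_of_download, movie_id in record["Downloads"]:
            flat.append({
                "CustomerID": record["CustomerID"],
                "CustomerName": record["CustomerName"],
                "DateOfDownload": date_of_download,
                "MovieID": movie_id,
            })
    return flat


first_normal_form = to_first_normal_form(unnormalised)
```

The customer details are now repeated on every download record. That redundancy is expected at this stage and is what the later normal forms remove.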

• If a customer downloads more than one movie then there will be multiple records for the same
customer.

In the case of Mary Jones, she has three records in the table as she has downloaded three movies.

• Each download can now be uniquely identified with a composite key made up of CustomerID and
MovieID, so this could be made into the primary key.

It is possible that one customer could download the same movie again, in which case this primary key
would not be adequate, but for this example we will assume that a customer will only download the
same movie once.

Second normal form (2NF)

Second normal form is achieved by ensuring the database is in first normal form and then removing
attributes that depend upon part but not all of the primary key by creating additional tables.

The non-key attributes are all the other attributes apart from the primary key.

For example, Address, DateOfDownload, Genre and FileType are non-key attributes.

➢ To be in second normal form, any non-key attributes that depend upon part but not all of the primary
key should be removed to another table.

For example, the Address of the customer is dependent on the CustomerID, but not on the MovieID.

Similarly, the Genre is dependent on the MovieID but not on the CustomerID.

So, both Address and Genre are examples of attributes that depend on part but not all of the primary key.

This means that separate tables are needed to store the customer data and the movie data.
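A partial dependency can be looked for in sample data with a short check. This is a sketch under stated assumptions: the rows are invented, and a dependency observed in a few rows is only evidence, because a true dependency is a rule about the real world rather than about one data set.

```python
def depends_on(rows, determinant, attribute):
    """True if each combination of determinant values maps to one attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[d] for d in determinant)
        # setdefault stores the first value seen for this key, then returns it.
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


def partial_dependencies(rows, primary_key, non_key):
    """Pairs (attribute, key part) where a non-key attribute depends on part of the key."""
    found = []
    for attribute in non_key:
        for part in primary_key:
            if depends_on(rows, [part], attribute):
                found.append((attribute, part))
    return found


# Hypothetical 1NF rows with the composite key (CustomerID, MovieID).
rows = [
    {"CustomerID": 1, "MovieID": 10, "Address": "1 High Street",
     "Genre": "Comedy", "DateOfDownload": "2024-01-05"},
    {"CustomerID": 1, "MovieID": 20, "Address": "1 High Street",
     "Genre": "Drama", "DateOfDownload": "2024-01-09"},
    {"CustomerID": 2, "MovieID": 10, "Address": "2 Low Road",
     "Genre": "Comedy", "DateOfDownload": "2024-01-07"},
]

partial = partial_dependencies(
    rows, ["CustomerID", "MovieID"], ["Address", "Genre", "DateOfDownload"]
)
```

The check reports Address against CustomerID and Genre against MovieID, while DateOfDownload is not reported because it needs both parts of the key.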


To start with, we will separate the information about customer downloads and movies into two tables:
CUSTOMER-DOWNLOAD

MOVIE

Notice that when we split the initial table up into two tables, we have kept an attribute which is common to
both tables (MovieID) so that we can link the information in the two tables together.

Each movie in the MOVIE table is now identified by the primary key MovieID. Every non-key attribute in the
MOVIE table depends on the whole of this primary key, so this table is now in second normal form. The
MovieID field also exists in the CUSTOMER-DOWNLOAD table, as a foreign key.

The CUSTOMER-DOWNLOAD table is more problematic as the primary key for this table would be a composite
key made up of the CustomerID and the MovieID. Together, these two fields form a primary key as they
would be unique to each record because we have assumed that a particular customer will only download the
same movie once.

It is still the case that this table is not in second normal form as it contains attributes that depend upon part,
but not all, of the primary key.

For example, CustomerName depends upon the CustomerID but not the MovieID. The solution is to split
this table up further into a CUSTOMER table and a DOWNLOAD table:

CUSTOMER

DOWNLOAD

The CUSTOMER table now has the attribute CustomerID as the primary key. The two other attributes in this
table depend on the whole of the primary key so this table is now in second normal form.

As we have assumed that each customer will download a movie only once, the DOWNLOAD table can have a
composite key made up of CustomerID and MovieID. The only other attribute in the table,
DateOfDownload, depends on both of these, so this table is also in second normal form.
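The split into CUSTOMER and DOWNLOAD amounts to taking two projections of the combined table and discarding duplicate rows. A minimal sketch, with invented rows standing in for the CUSTOMER-DOWNLOAD table:

```python
def project(rows, attributes):
    """Keep only the given attributes and drop duplicate rows, preserving order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Hypothetical CUSTOMER-DOWNLOAD rows; customer details repeat on every download.
customer_download = [
    {"CustomerID": 1, "CustomerName": "Mary Jones", "Address": "1 High Street",
     "MovieID": 10, "DateOfDownload": "2024-01-05"},
    {"CustomerID": 1, "CustomerName": "Mary Jones", "Address": "1 High Street",
     "MovieID": 20, "DateOfDownload": "2024-01-09"},
    {"CustomerID": 1, "CustomerName": "Mary Jones", "Address": "1 High Street",
     "MovieID": 30, "DateOfDownload": "2024-02-01"},
    {"CustomerID": 2, "CustomerName": "John Smith", "Address": "2 Low Road",
     "MovieID": 10, "DateOfDownload": "2024-01-07"},
]

customer = project(customer_download, ["CustomerID", "CustomerName", "Address"])
download = project(customer_download, ["CustomerID", "MovieID", "DateOfDownload"])
```

Each customer's details now appear once in `customer`, and `download` keeps CustomerID as the attribute that links the two tables.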

Third normal form (3NF)


Third normal form is achieved by ensuring the database is in second normal form and then removing
non-key attributes that depend upon other non-key attributes by creating additional tables.

If we look at each of the three tables in turn:

➢ MOVIE: It can be noted that the FileType depends upon the Format. All LowRes films are
recorded in MPEG-2 format and all HiRes films are recorded in MPEG-4 format. Therefore, the
non-key attribute FileType depends upon the non-key attribute Format so we can split the format
information off from the movie information to create two new tables:

MOVIE

MOVIEFORMAT


Both of these tables are now in third normal form. Table MOVIE has MovieID as the primary key and all the
non-key attributes in this table depend upon MovieID and no other non-key attributes.

Table MOVIEFORMAT has Format as the primary key and only contains one other attribute, FileType. As
there is only one non-key attribute in this table, it must be in third normal form as it is not possible for a non-
key attribute to depend on another non-key attribute.
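The split of the format information can be sketched in SQL through Python's sqlite3 module. The table name movie_2nf, the movie titles and the genres are invented for the illustration; the LowRes/MPEG-2 and HiRes/MPEG-4 pairings come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical MOVIE table in 2NF: FileType depends on Format, not on MovieID.
CREATE TABLE movie_2nf (MovieID INTEGER PRIMARY KEY, MovieTitle TEXT, Genre TEXT,
                        Format TEXT, FileType TEXT);
INSERT INTO movie_2nf VALUES
    (10, 'Movie X', 'Comedy', 'LowRes', 'MPEG-2'),
    (20, 'Movie Y', 'Drama',  'HiRes',  'MPEG-4'),
    (30, 'Movie Z', 'Comedy', 'HiRes',  'MPEG-4');

-- Move the Format -> FileType dependency into its own table.
CREATE TABLE MOVIEFORMAT (Format TEXT PRIMARY KEY, FileType TEXT);
INSERT INTO MOVIEFORMAT SELECT DISTINCT Format, FileType FROM movie_2nf;

-- MOVIE keeps Format only as a foreign key.
CREATE TABLE MOVIE (MovieID INTEGER PRIMARY KEY, MovieTitle TEXT, Genre TEXT,
                    Format TEXT REFERENCES MOVIEFORMAT(Format));
INSERT INTO MOVIE SELECT MovieID, MovieTitle, Genre, Format FROM movie_2nf;
""")

formats = conn.execute(
    "SELECT Format, FileType FROM MOVIEFORMAT ORDER BY Format").fetchall()
rejoined = conn.execute(
    "SELECT m.MovieID, m.MovieTitle, m.Genre, m.Format, f.FileType"
    " FROM MOVIE m JOIN MOVIEFORMAT f ON f.Format = m.Format"
    " ORDER BY m.MovieID").fetchall()
original = conn.execute("SELECT * FROM movie_2nf ORDER BY MovieID").fetchall()
```

Each format is now stored once in MOVIEFORMAT, and joining the two new tables reproduces the original rows, so no information has been lost by the split.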

• Table CUSTOMER: The Address and Name both depend on the CustomerID and not on each other,
so this table is already in third normal form.
• Table DOWNLOAD: There is only one non-key attribute, so this cannot possibly depend upon another
non-key attribute, so this table must already be in third normal form.

The final, fully normalised design of the database is as follows:


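CUSTOMER (CustomerID, CustomerName, Address)
MOVIE (MovieID, MovieTitle, Genre, Format)
MOVIEFORMAT (Format, FileType)
DOWNLOAD (CustomerID, MovieID, DateOfDownload)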


• The CustomerID is the primary key of the CUSTOMER table and a foreign key in the DOWNLOAD table.
• The MovieID is the primary key of the MOVIE table and a foreign key in the DOWNLOAD table.
• The Format is the primary key of the MOVIEFORMAT table and a foreign key in the MOVIE table.
• The DOWNLOAD table has a composite key made up of CustomerID and MovieID.
• At this point, a database designer might choose to add an additional field, DownloadID, to the
DOWNLOAD table which would be unique for each download. This would mean that the composite key
of CustomerID+MovieID could be replaced by a primary key that was just one field. This might be
considered to be an improvement, but it is not required for normalisation.

In summary, the characteristics that a relational database design must have to be fully normalised are:

• All of the data must be atomic / there must be no repeating groups / no repeating attributes. (1NF)
• There should be no partial dependencies, where a non-key attribute depends upon part but not all of
the primary key. (2NF)
• There should be no non-key dependencies, where a non-key attribute depends upon another
non-key attribute. (3NF)

Database management systems (DBMS)
Database management system (DBMS) – systems software for the definition, creation and manipulation of
a database.

Data management – the organisation and maintenance of data in a database to provide the information
required.

How a DBMS addresses the limitations of a file-based approach


• Data redundancy issue
o This is solved by storing data in separate linked tables, which reduces the duplication of data
as most items of data are only stored once.
o Items of data used to link tables by the use of foreign keys are stored more than once.
o The DBMS will flag any possible errors when any attempt is made to accidentally delete this
type of item.
• Data inconsistency issue
o This is also solved by storing most items of data only once, allowing updated items to be seen
by all applications.
o As data is not inconsistent, the integrity of the data stored is improved.
o Consistent data is easier to maintain as an item of data will only be changed once, not
multiple times, by different applications.
• Data dependency issue
o Data is independent of the applications using the database, so changes made to the structure
of the data will be managed by the DBMS and have little or no effect on the applications using
the database.
o Any fields or tables added to or removed from the database will not affect the applications that
do not use those fields/tables, as each application only has access to the fields/tables it
requires.


Data dictionary – a set of data that contains metadata (data about other data) for a database.

• A DBMS uses a data dictionary to store the metadata, including the definition of tables, attributes,
relationships between tables and any indexing.
• The data dictionary can also define the validation rules used for the entry of data and contain data
about the physical storage of the data.
• The use of a data dictionary improves the integrity of the data stored, helping to ensure that it is
accurate, complete and consistent.
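As one concrete illustration, SQLite keeps this kind of metadata in a built-in table called sqlite_master, and the table_info pragma describes a table's columns. The sketch below assumes a cut-down MOVIEFORMAT table; the index name idx_filetype is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MOVIEFORMAT (Format TEXT PRIMARY KEY, FileType TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_filetype ON MOVIEFORMAT (FileType)")

# sqlite_master is SQLite's catalogue of tables and indexes (data about the data).
# Internal objects, whose names start with "sqlite_", are left out here.
catalogue = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master"
    " WHERE name NOT LIKE 'sqlite_%' ORDER BY type"
).fetchall()

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, primary-key flag)
columns = conn.execute("PRAGMA table_info(MOVIEFORMAT)").fetchall()
column_names = [column[1] for column in columns]
primary_key_columns = [column[1] for column in columns if column[5] > 0]
```

Nothing here touches the stored movie formats themselves; the queries read only the definitions that the DBMS recorded when the table and index were created.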

Data modelling – the analysis and definition of the data structures required in a database, in order to produce a
data model.

• An E-R diagram is an example of a data model.


• A logical schema is a data model for a specific database that is independent of the DBMS used to
build the database.

Security measures taken by a DBMS can include:

• using usernames and passwords to prevent unauthorised access to the database
• using access rights to manage the actions authorised users can take, for example, users could
read/write/delete, or read only, or append only
• using access rights to manage the parts of the database they have access to, for example, the
provision of different views of the data for different users to allow only certain users access to some
tables
• automatic creation and scheduling of regular back-ups
• encryption of the data stored
• automatic creation of an audit trail or activity log to record the actions taken by users of the database.

Access rights (database) – the permissions given to database users to access, modify or delete data.
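The way access rights and an audit trail work together can be modelled in a few lines. This is a toy sketch, not how any particular DBMS implements security; the role names and user names are hypothetical.

```python
# Hypothetical roles mapped to the actions each one is permitted to take.
ACCESS_RIGHTS = {
    "administrator": {"read", "write", "delete"},
    "analyst": {"read"},         # read only
    "data_entry": {"append"},    # append only
}

audit_trail = []  # every request is logged, whether it was granted or not


def request(user, role, action):
    """Check the role's access rights and record the outcome in the audit trail."""
    allowed = action in ACCESS_RIGHTS.get(role, set())
    audit_trail.append((user, action, "granted" if allowed else "denied"))
    return allowed


request("asha", "administrator", "delete")
request("ben", "analyst", "write")
request("cara", "data_entry", "append")
request("dev", "visitor", "read")  # unknown role: no rights at all
```

Recording denied requests as well as granted ones is a design choice that makes attempted misuse visible in the log.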

The use and purpose of DBMS software tools


r.

Developer interface

• The developer interface allows a developer to write queries in structured query language (SQL) rather
than using query-by-example.


• These queries are then processed and executed by the query processor.
• This allows the construction of more complex queries to interrogate the database.

Query processor

• The query processor takes a query written in SQL and processes it.
• The query processor includes a DDL interpreter, a DML compiler and a query evaluation engine.
• Any DDL statements are interpreted and recorded in the database’s data dictionary.
• DML statements are compiled into low level instructions that are executed by the query evaluation
engine.
• The DML compiler will also optimise the query.
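The difference between DDL and DML statements, and the effect of optimisation, can be observed in SQLite as a rough illustration. The index name and sample rows are invented, and the exact wording of the query plan varies between SQLite versions, so the sketch only inspects it for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: these statements define structure rather than store data.
conn.execute("CREATE TABLE DOWNLOAD (CustomerID INTEGER, MovieID INTEGER, DateOfDownload TEXT)")
conn.execute("CREATE INDEX idx_download_customer ON DOWNLOAD (CustomerID)")

# DML: these statements manipulate the data held in that structure.
conn.executemany(
    "INSERT INTO DOWNLOAD VALUES (?, ?, ?)",
    [(1, 10, "2024-01-05"), (1, 20, "2024-01-09"), (2, 10, "2024-01-07")],
)

query = "SELECT MovieID FROM DOWNLOAD WHERE CustomerID = ?"

# EXPLAIN QUERY PLAN asks SQLite how it intends to run the query;
# the last column of each returned row is a human-readable description.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (1,)).fetchall()
plan_text = " ".join(row[-1] for row in plan)
uses_index = "idx_download_customer" in plan_text

result = sorted(movie_id for (movie_id,) in conn.execute(query, (1,)))
```

The plan mentions the index, which indicates that the query was optimised to search the index rather than scan every row of the table.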
References: Cambridge A Level Computer Science (Watson & Williams); AQA A Level Computer Science (Reeves)
