Unit-4
DATABASE MANAGEMENT SYSTEMS (DBMS)
Structure
4.0 Objectives
4.1 Introduction
4.2 Concept and Definition of DBMS
4.3 Structure of DBMS
4.4 Types of DBMS
4.5 Database Organization and Development
4.6 Relational Database Management System
4.7 Summary
4.8 Answer to Self Check Exercises
4.9 Keywords
4.10 References and Further Reading
4.0 OBJECTIVES
After reading this unit, you will be able to:
• understand the concept of database management system;
• know the structure and types of databases; and
• comprehend the database organization.
4.1 INTRODUCTION
The non-technical dictionary meaning of database is “a store of a large amount of
information, especially in a form that can be handled by a computer”. A database is
a collection or set of related data, arranged logically in a structured form and designed
to meet the information requirements for non-redundant operational data that are
sharable between different application systems. The advantage of a database is
that the data remain independent of the application programmes that use them. Further,
the data are accessible to any programme with a legitimate need for them, regardless
of where the data are physically located and regardless of the language in which the
programme is written. Data are not duplicated in different locations.
A database basically comprises data elements or fields, each of which contains a data
value about an attribute of a particular entity. A set of similarly constructed records
constitutes a file, which contains data records about an entity type. Ultimately, a set
of related files stored together in a logical fashion makes up a database. Data are raw
facts or intangible ideas about something and include numbers, words, symbols, ideas,
concepts and oral verbalisation.
The purpose of information systems is to collect, process and store large quantities of
data to obtain the information needed for effective decision making, planning and
control. Moreover, in most organizations, for reasons of volume, complexity, timing and
computational demand, the collected data must be organised in a manner that serves
a variety of users' information requests.
Data stored in a database generally fall into a data hierarchy made up of four
categories. From highest to lowest, these are:
Database
Files
Records
Fields
At the lowest level, data is organized into fields. A field represents a single data
item; for example, in bibliographic data for books we have an Author field, a Title
field, etc. Fields are made up of individual letters, numbers or symbols. Once
characters are joined into a field, the field is treated as a unit.
e.g. Ashok Kumar Singh = Author field
A record is made up of a set of related fields. A record can contain either one field or
a number of fields; for example, a record for a book may contain an Author field, Title
field, Imprint field, Edition field, Collation field, etc. So for each book we may
have a separate record.
A file is a set of related records. A file may contain as few as one or as many as
millions of records.
The database is the highest level in the hierarchy. A database is a collection or set
of related files.
Historically, files came before databases. Though databases represent a step forward
from files, they are in fact constructed from files.
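The hierarchy described above can be sketched in a few lines of Python. This is only an illustration of how fields, records, files and a database nest inside one another, not a description of how any real DBMS stores data. The class names (Record, File, Database), the file name "books" and the sample title and edition are assumptions made for the example; only the author name is taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A record: a set of related fields, stored as field name -> value."""
    fields: dict[str, str]


@dataclass
class File:
    """A file: a set of related records about one entity type."""
    name: str
    records: list[Record] = field(default_factory=list)


@dataclass
class Database:
    """A database: a set of related files."""
    files: dict[str, File] = field(default_factory=dict)

    def add_file(self, file: File) -> None:
        self.files[file.name] = file


# The title and edition values are hypothetical sample data.
book = Record({"Author": "Ashok Kumar Singh", "Title": "Sample Title", "Edition": "2nd"})
books = File("books", [book])
library = Database()
library.add_file(books)

# Navigate down the hierarchy: database -> file -> record -> field.
author = library.files["books"].records[0].fields["Author"]
print(author)
```

Each level simply contains the level below it, which is why a database can be said to be constructed from files.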
Because of the intrinsic advantages of databases mentioned earlier, database organisers
are willing to pay the cost of creating and maintaining a database. An organization
that uses a database rather than files can save time and money in developing
application programmes. It can also exploit the data more efficiently, since they are
easier to get at.
In a database, the data is organized into logical records and physical records.
A logical record consists of more than one data element or field among which
some logical connection exists. The contents of the fields represent a set of qualities
or attributes of the particular real-world entity represented by the record, and the
logical connection among the data elements or fields is maintained by their physical
representation or the data structure within the storage medium. The physical record
is the basic unit of data that is read from or written to the storage medium by a
single input/output command to the computer. One physical record often contains
multiple logical records or segments. A data file is a collection of one type of
stored records or interrelated data that are treated as a unit and kept on a secondary
computer storage device. A set of similarly constructed records comprises a file.
A physical file is a storage area containing data, which could be a program or a group
of data or records, managed as a single unit by the operating system of a computer.
A given physical file can be accessed in a wide variety of ways.
The records can be located in a storage medium randomly or directly, i.e., independently
of the location of any other record in the file. This type of organization is called direct
physical file organization, or a random access file. When a file is maintained on a
sequential or random storage device in sequential access mode, it is called a sequential
file.
Searching for data or data records is generally done through an index. The entry that
each record contributes to the index file is therefore very important. The operation of
creating an index file from a master file is referred to as inversion, and files with
indexes are often called inverted files.
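The idea of inversion can be illustrated with a short Python sketch. It assumes a small in-memory master file held as a list of records; the field names, the sample records and the function name invert are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical master file: each record is a dict of fields.
master_file = [
    {"id": 1, "author": "Singh", "title": "Library Automation"},
    {"id": 2, "author": "Kumar", "title": "Database Systems"},
    {"id": 3, "author": "Singh", "title": "Information Systems"},
]


def invert(records, key_field):
    """Build an index mapping each value of key_field to record positions."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[key_field]].append(position)
    return dict(index)


author_index = invert(master_file, "author")

# Direct access: use the index to jump straight to the matching records
# instead of reading the file sequentially from the beginning.
singh_titles = [master_file[pos]["title"] for pos in author_index["Singh"]]
print(singh_titles)
```

A search on author now consults the index first and reads only the records it points to, which is the practical benefit of an inverted file.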
The database concept evolved from file-processing or file-management systems.
A database is a collection of related and cross-referenced files designed and created
to minimise the repetition of data. These integrated files are part of the overall
database system, which includes the specialized software called the database
management system (DBMS). The DBMS allows data records to be created, accessed,
updated, deleted and retrieved. With the evolution of database systems and concepts,
the same physical data could be viewed in different logical ways by different
applications. The database management system bridges the gap between the logical file
description and the physical organisation of the database.
c) Query language, for users of the database who need answers to their questions.
The database approach to library automation can be explained with the help of a
diagram:
Advantages of a DBMS
A complete database management system separates the definition of data from the
programmes that access it. This concept of data independence is one of the key
advantages of a database management system. The lack of data independence in
traditional approaches to programming also creates a significant maintenance
problem. As programmes are changed to reflect changing conditions or requests
from users, all the programmes in a system that access the files have to be altered.
At a minimum, the record descriptions in the programmes will have to be changed. It
may also be necessary to modify the programmes themselves to process added data.
With a database management system, only programmes that access the fields actually
altered are generally affected by a change. Programmes that do not use the altered
fields do not usually have to be changed. As a result, we gain some independence
between the data and the programmes that access those data.
This kind of data independence is the essence of the database concept.
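Data independence can be demonstrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for a DBMS; the table name books, its columns and the function list_titles are hypothetical. The point is only that a programme which names the fields it needs keeps working when an unrelated field is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (author TEXT, title TEXT)")
conn.execute("INSERT INTO books VALUES ('Ashok Kumar Singh', 'Sample Title')")


def list_titles(connection):
    """A 'programme' that names only the field it needs."""
    return [row[0] for row in connection.execute("SELECT title FROM books")]


before = list_titles(conn)

# The data definition changes: a new field is added to the table.
conn.execute("ALTER TABLE books ADD COLUMN edition TEXT")

# list_titles does not use the new field, so it runs unchanged.
after = list_titles(conn)
print(before == after)
```

In a traditional file-based programme, the record description embedded in every programme reading the file would have had to be edited after such a change.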
The main functions, design objectives and benefits of a database management
system are: (i) integration, (ii) data independence, (iii) data retrieval, analysis,
modification and storage, (iv) privacy, (v) integrity controls and recovery methods,
(vi) compatibility, (vii) concurrency support, and (viii) support for complex file
structures and access paths.
An overview of the database management system is given below:
[Figure: overview of the database management system. Surviving labels: Data Dictionary, Entity description, Data definition language, Data relationship, DBMS, Application, User language (retrieve, query, update), Conceptual Schema, Selection, DBMS Software.]
Schema easy for the DBA. In terms of schema, the DBMS has a three-level
architecture. The first level of the DBMS structure is the external level, which refers
to the way users view the data. The external level is also called the sub-schema. A user
may be interested in a small portion of the database, which forms his or her external
view. The architecture of a DBMS is illustrated in figure 2.
The second level, the conceptual schema, represents the total information content of
the database. It gives a global or integrated view of the database. The third level of
the architecture is the internal schema. The internal view describes how the data is
actually stored and managed on the storage media. It specifies what indexes exist,
how fields are represented and the physical sequence of the stored records. The
internal schema is built up using the internal DDL (data definition language). The
conceptual/internal mapping statements ensure physical data independence. The mapping
component between the internal schema and the secondary storage device, i.e., the
direct access storage device (DASD), is called the access method.
Thus the design of efficient and effective database structures has led to the
identification of three distinct planes or levels of data abstraction and description.
These three levels are:
i) External or user's view level.
ii) Conceptual schema level.
iii) Internal or physical level.
The conceptual schema is a description of the total database that is independent of
the machine and of the application software. The term schema is used to mean an overall
chart of the entire database. The conceptual schema might be regarded as an overall
logical database description. The conceptual schema, or model base, has to be as stable
as possible.
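The three levels can be loosely illustrated with Python's built-in sqlite3 module. This is a rough analogy under stated assumptions, not a full three-schema implementation: a base table stands in for the conceptual schema, an index for an internal-level storage decision, and a view for one user's external sub-schema. The table, index and view names and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the overall logical description of the data.
conn.execute(
    "CREATE TABLE orders ("
    "order_no INTEGER PRIMARY KEY, supplier TEXT, title TEXT, order_date TEXT)"
)
conn.execute(
    "INSERT INTO orders VALUES (1, 'Supplier A', 'Sample Title', '2020-01-15')"
)

# Internal level: a storage-oriented choice (an index) that does not
# change what the table means to its users.
conn.execute("CREATE INDEX idx_orders_supplier ON orders (supplier)")

# External level: one user's view, exposing only part of the database.
conn.execute("CREATE VIEW supplier_titles AS SELECT supplier, title FROM orders")

view_rows = conn.execute("SELECT * FROM supplier_titles").fetchall()
print(view_rows)
```

Dropping or adding the index would leave both the table definition and the view untouched, which is the kind of physical data independence the mapping between levels is meant to provide.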
Data Models
In a database, the data is usually logically and physically organised according to
some data model. A data model is a collection of conceptual tools for describing
data, data relationships, data semantics and data constraints.
Databases generally structure their data on the basis of one of the following four
options:
i) Relational data model
ii) Network data model
iii) Hierarchical data model
iv) A combination of these three, or some subset of them.
The hierarchical and network data models have been in use since the early 1960s,
whereas the relational model was proposed as a modelling structure in the early
1970s. The difference between these three data models lies in the way they represent
the relationships of entities or their attributes. These models make use of three
database constraints: simple sequence relationship, hierarchical relationship and
network relationship.
Internal Schema
The physical organization and layout of the database on the storage device is called
the internal view. The internal view is represented by means of the internal schema,
which not only defines the physical structure of the stored database, but also specifies
the methods that may be used to locate the logically related data records, insert new
records and delete records.
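[Figure: tree diagram with top-level nodes A, B and C; second-level nodes A1, A2, B1, B2, C1, C2; and third-level nodes A11, A12.]
[Figure: diagram with top-level nodes A, B and C and second-level nodes A1, A2, B1, C1.]
[Table: columns Name, City, Profession, Income (monthly).]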
Since different users see different sets of data and different relationships among
them, it is necessary to extract subsets of the table columns for some users and to
join tables together to form larger tables for others. Mathematics provides the
basis for extracting some columns from the tables and for joining columns from
various tables. This capability to manipulate relations provides a flexibility not
normally available in a hierarchical or network structure.
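The two operations mentioned here, extracting columns and joining tables, correspond to projection and join in relational algebra. The sketch below implements both in plain Python over lists of rows. The function names project and natural_join and the profession and income values are hypothetical; the names and cities are borrowed from the employee example later in this unit.

```python
# Hypothetical relations, each held as a list of rows (dicts).
people = [
    {"name": "Ashok", "city": "Varanasi", "profession": "Librarian"},
    {"name": "Vinay", "city": "New Delhi", "profession": "Teacher"},
]
incomes = [
    {"name": "Ashok", "income": 20000},
    {"name": "Vinay", "income": 25000},
]


def project(relation, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    result = []
    for row in relation:
        reduced = {col: row[col] for col in columns}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right, on):
    """Join: combine rows from two relations that agree on column `on`."""
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in right
        if l_row[on] == r_row[on]
    ]


cities = project(people, ["name", "city"])
combined = natural_join(people, incomes, "name")
print(cities)
print(combined)
```

One user can be given only the projection, another the joined table, while the stored relations stay the same.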
The relational database management system has many advantages. Most DBMSs are
based on the relational model because it is relatively easy for the user to
understand.
The above table contains information about books: order no, supplier, title, date of
order. While constructing the tables of the relational model one has to fulfil the
following requirements:
— Each table must be given a unique heading
— Each attribute must be given a name, referred to as keys
— Each entity record or tuple corresponding to a row in the table must have an
attribute or combination of attributes which serves as unique identifier referred
to as a primary key.
Within one table, we consider a set of attributes which defines the entity type or a
particular facet of the entity type with which we may be concerned in a given context.
Each row or tuple in the table represents data for a particular order. An attribute B
is said to be functionally dependent on the attribute A if the value of attribute A
always determines the value of the attribute B.
A → B, i.e., A is the determinant of B
A functional dependency may involve more than two attributes.
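A functional dependency can be tested against sample data with a short Python function. This is a sketch only: the function name holds and the sample rows are hypothetical, and passing the check on one set of rows does not prove the dependency in general, since a functional dependency is a rule about all valid data. The determinant and the dependent are given as tuples so that more than two attributes can be involved.

```python
def holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` holds in the given rows.

    `determinant` and `dependent` are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[b] for b in dependent)
        # setdefault stores the first value seen for this key; any later
        # row with the same key must agree with it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
rows = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 1, "B": "x", "C": "q"},
    {"A": 2, "B": "y", "C": "p"},
]

print(holds(rows, ("A",), ("B",)))  # A determines B in these rows
print(holds(rows, ("A",), ("C",)))  # A = 1 appears with two different C values
```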
Normalisation
The normalisation technique is concerned with translating a conceptual design into
a set of well-designed relational tables. Normalisation is a major task in designing
a relational database. The process of normalisation ensures that there will be no
problems in updating the database and that operations on the various relations will
not lead to inconsistent or incorrect data. During the normalisation process, the
designer first makes sure that the relations are in first normal form, then checks
for second normal form and finally for third. The first normal form requires
that all occurrences of a record type contain the same number of fields. It may be
noted that normalisation is primarily aimed at preventing or reducing data maintenance
problems rather than improving retrieval efficiency. Normalisation of relations removes
anomalies in the database.
The second and third normal forms require the designer to examine the relationship
between key fields and other fields in the record. To conform to second and third
normal forms, each non-key field must give us information about the entire key and
nothing but the key.
e.g. suppose that one has a relation as follows:
Order No.   Author   Title   Supplier   Data
If author and title form a composite key, this relation is not in second normal
form. Note that the author and title would be repeated in every record that stores
information about an order for that book. If the author changed, every record for
each such order number would have to be updated. And what would happen if there were
no order number relating to an author or book? The database might still need to keep
track of the author, but it could not, since there would be no record containing that
author. The relation can be made to conform to second normal form by splitting it into
two relations.
Order No.   Author   Title   Supplier   Data
Author   Title   Supplier   Data
Order number and author could be the combined key for the first relation, and author
can be the key for the second.
Thus normalisation is a systematic process of transforming an initial conceptual design
first into a set of relational tables in first normal form (1NF), by assigning a unique
key to each entity type table and removing repeating groups from the tables by
splitting each into two or more new entity types or relation tables. The set of
relational tables in 1NF may be changed into a set of tables in second normal form (2NF)
by removing partial functional dependencies within tables. This is accomplished by
splitting these dependencies off into new, separate tables. The set of tables in 2NF may
be converted into a new set of tables in third normal form (3NF) by eliminating
transitive dependencies.
The normal forms are first normal form (1NF), second normal form (2NF), third normal
form (3NF), fourth normal form (4NF) and fifth normal form (5NF); the highest normal
form is called domain/key normal form (DK/NF).
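Removing a partial functional dependency, the step from 1NF to 2NF, can be sketched in Python. The relation below is hypothetical and is not the unit's own order example: it assumes a composite key (order_no, book_id) in which book_title depends on book_id alone, while quantity depends on the whole key. The function name split_partial_dependency is likewise an assumption.

```python
# Hypothetical relation in 1NF with composite key (order_no, book_id).
# book_title depends on book_id alone: a partial dependency.
order_lines = [
    {"order_no": 1, "book_id": "B1", "book_title": "Title One", "quantity": 2},
    {"order_no": 2, "book_id": "B1", "book_title": "Title One", "quantity": 1},
    {"order_no": 2, "book_id": "B2", "book_title": "Title Two", "quantity": 5},
]


def split_partial_dependency(rows, part_key, dependent):
    """Move `dependent` into its own table keyed by `part_key`."""
    lookup = {}
    remaining = []
    for row in rows:
        lookup[row[part_key]] = row[dependent]
        remaining.append({k: v for k, v in row.items() if k != dependent})
    new_table = [{part_key: k, dependent: v} for k, v in lookup.items()]
    return remaining, new_table


lines_2nf, books = split_partial_dependency(order_lines, "book_id", "book_title")
print(lines_2nf)
print(books)
```

After the split, each book title is stored once, so changing a title means updating a single row rather than every order line that mentions the book.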
Example of Normalisation Process
Employee
Employee name   Place of work   Child
                                Child name   Date of birth   Sex
Ashok           Varanasi        Suraj        12-7-1985       M
Vinay           New Delhi       Arpita       14-7-1986       F
Surendra        Allahabad       Ashish       15-10-1988      M
Representation of employee
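The employee example can be taken one step further in a short Python sketch that removes the repeating Child group, which is the first step of normalisation. The data values are those of the table; everything else is an assumption made for illustration, in particular the function name remove_repeating_group and the use of the employee name as the key linking the two tables (a real design would more likely use an employee number).

```python
# Unnormalised records: "children" is a repeating group inside each employee.
employees = [
    {"employee": "Ashok", "place_of_work": "Varanasi",
     "children": [{"child_name": "Suraj", "date_of_birth": "12-7-1985", "sex": "M"}]},
    {"employee": "Vinay", "place_of_work": "New Delhi",
     "children": [{"child_name": "Arpita", "date_of_birth": "14-7-1986", "sex": "F"}]},
    {"employee": "Surendra", "place_of_work": "Allahabad",
     "children": [{"child_name": "Ashish", "date_of_birth": "15-10-1988", "sex": "M"}]},
]


def remove_repeating_group(records):
    """Split nested child data into a separate table linked by employee."""
    employee_table = []
    child_table = []
    for rec in records:
        employee_table.append(
            {"employee": rec["employee"], "place_of_work": rec["place_of_work"]}
        )
        for child in rec["children"]:
            # Each child row carries the employee name as a linking key.
            child_table.append({"employee": rec["employee"], **child})
    return employee_table, child_table


employee_table, child_table = remove_repeating_group(employees)
print(employee_table)
print(child_table)
```

Every row in both resulting tables now has the same number of fields, however many children an employee has.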
4.7 SUMMARY
The use of database management systems represents one of the most significant
trends in the field of computer-based information systems. A database management
system (DBMS) is a collection of software for processing a collection of interrelated
data known as a ‘database’. The objective of a database management system is to
facilitate the creation of data structures and relieve the programmer of the problem
of setting up complicated files. A complete DBMS separates the definition of data
from the programmes that access it. With a DBMS, it is possible to design file
structures much more easily and to set up a database that can be used by a number
of different application programmes. Today, relational database management systems
(RDBMS) are being developed and used more and more.