DBMS Research Paper
RAHUL SUTHAR
Database System : Concepts and Design
Synopsis:
1. Introduction to Database
1.1 Meaning and Definition of Database
1.2 Functions of Database
1.3 Types of Databases
1.3.1 Bibliographic Database
1.3.2 Knowledge Database
1.3.3 Graphic-Oriented Database
1.3.4 Decision-making Database
1.4 Concept of Data Structure
1.4.1 List Structure
1.4.2 Tree / Hierarchical Structure
1.4.3 Network Structure
3. Database Design
3.1 Goals of Database Design
3.2 Logical and Physical View of Database
3.3 View of Data / Architecture of Database System
3.3.1 Data Abstraction
3.3.2 Instances and Schemas
3.3.3 Data Independence
3.3.4 Database Languages
3.4 Storage Structures
1. Introduction to Database :
An organization must have accurate and reliable data for effective decision making. To
this end, the organization maintains records on its various facets and on the relationships
among them. Such related data are called a database. A database system is an integrated
collection of related files, along with details of the interpretation of the data contained therein.
Basically, a database system is nothing more than a computer-based record-keeping system, i.e. a
system whose overall purpose is to record and maintain information/data.
A database management system (DBMS) is a software system that allows access to
data contained in a database. The objective of the DBMS is to provide a convenient and
effective method of defining, storing and retrieving the information contained in the database.
The DBMS interfaces with the application programs, so that the data contained in the database
can be used by multiple applications and users. In addition, the DBMS exerts centralized control
of the database, prevents fraudulent or unauthorized users from accessing the data, and ensures
the privacy of the data.
Generally, a database is an organized collection of related information. The organized
information or database serves as a base from which desired information can be retrieved or
decisions made by further reorganizing or processing the data. People use several databases in
their day-to-day life. Dictionaries, telephone directories, library catalogues, etc. are examples of
databases in which the entries are arranged in alphabetical or classified order.
The term 'DATA' can be defined as the value of an attribute of an entity. Any collection
of related data items of entities having the same attributes may be referred to as a 'DATABASE'.
Mere collection of data does not make it a database; the way it is organized for effective and
efficient use makes it a database.
Database technology has been described as "one of the most rapidly growing areas of
computer and information science". It emerged in the late 1960s as a result of a combination of
various circumstances. There was a growing demand among users for more information to be
provided by the computer relating to the day-to-day running of the organization, as well as
information for planning and control purposes. The technology that emerged to process data of
various kinds is broadly termed 'DATABASE MANAGEMENT TECHNOLOGY', and the
resulting software is known as a 'DATABASE MANAGEMENT SYSTEM' (DBMS), which
manages a computer-stored database or collection of data.
1.1 Meaning and Definition of Database :
An entity may be concrete, such as a person or a book, or it may be abstract, such as a loan, a
holiday or a concept. Entities are the basic units of objects, which can have concrete existence or
constitute ideas or concepts. An entity set is a set of entities of the same type that share the same
properties or attributes.
An entity is represented by a set of attributes. An attribute is also referred to as a data item,
data element, data field, etc. Attributes are descriptive properties possessed by each member of
an entity set. A grouping of related entities becomes an entity set.
For example, in a library environment :
Entity Set - Catalogue
Entity - Books, Journals, AV-Materials, etc.
Attributes - Author, Title, Imprint, Accn. No., ISBN, etc.
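The library example can also be sketched in code. The following is only an illustrative sketch in Python; the class name CatalogueEntry and all the values (authors, titles, accession numbers, ISBN strings) are placeholders assumed for illustration and do not describe real documents.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CatalogueEntry:
    """One entity in the catalogue; every field is an attribute."""
    author: str
    title: str
    imprint: str
    accession_no: str  # placeholder values below, not real accession numbers
    isbn: str          # placeholder values below, not real ISBNs


# The entity set: a grouping of entities that share the same attributes.
catalogue = [
    CatalogueEntry("Author A", "Title One", "Publisher X, 1996", "ACC-001", "ISBN-PLACEHOLDER-1"),
    CatalogueEntry("Author B", "Title Two", "Publisher Y, 2000", "ACC-002", "ISBN-PLACEHOLDER-2"),
]

# The attribute names are the same for every member of the entity set.
attribute_names = [f.name for f in fields(CatalogueEntry)]

# A data item is the value of one attribute of one entity.
first_title = catalogue[0].title
```

Here each CatalogueEntry object plays the role of an entity, the list plays the role of the entity set, and a single field value such as first_title is a data item.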
The word 'DATA' means a fact or, more specifically, a value of an attribute of an entity. An
entity, in general, may be an object, idea, event, condition or situation. A set of attributes
describes an entity. Information in a raw form which can be processed by a computer is called
data. Data are the raw material of information.
The term 'BASE' means the support, foundation or key ingredient of anything.
Therefore the base supports the data.
A 'DATABASE' can be conceived as a system whose base, whose key concept, is simply
a particular way of handling data. In other words, a database is nothing more than a
computer-based record-keeping system. The objective of a database is to record and maintain
information. The primary function of the database is the service and support of an information
system which satisfies cost.
In short, "A database is an organized collection of related information stored with
minimum redundancy, in a manner that makes it accessible for multiple applications".
Definition :
1. Prakash Naveen : "Database is a mechanized, shared, formally defined and central collection
of data used in an organization".
2. J.M.Martin : "Database is a collection of inter-related data stored together without harmful
or unnecessary redundancy to serve multiple applications".
3. Mac-Millan Dictionary of Information Technology : defines a database as "a collection
of inter-related data stored so that it may be accessed by authorized users with simple
user-friendly dialogues".
1.2 Functions of Database :
The general theme behind a database is to handle information as an integrated whole. The
general objective is to make information access easy, quick, inexpensive and flexible for the
user.
Controlled redundancy : Redundant data occupies space and therefore is wasteful. By
controlled redundancy, system performance is improved.
User-friendly (i.e. ease of learning and use) : A major feature of a user-friendly database
package is how easy it is to learn and use.
Data independence : allows changes at one level of the database without
affecting the other levels, i.e. changing hardware and storage procedures or adding new data
without having to rewrite application programs.
Economy (i.e. more information at low cost) : Using, storing and modifying data at low cost
are important.
Accuracy and integrity : Even if redundancy is eliminated, the database may still
contain incorrect data. Centralized control of the database helps in avoiding such situations.
The accuracy of a database ensures that data quality and content remain constant. Integrity
controls detect data inaccuracies where they occur.
Recovery from failure : With multi-user access to a database, the system must recover
quickly after it is down with no loss of transactions. It helps to maintain data accuracy and
integrity.
Privacy and Security : For data to remain private, security measures must be taken to
prevent unauthorized access i.e. complete jurisdiction over the operational data. DBMS
ensures proper security through centralized control.
Performance : It emphasizes a response time to inquiries that is suitable to the use of the data;
what is suitable depends on the nature of the user-database dialogue.
Database retrieval, analysis, storage : It facilitates database retrieval, analysis and
storage.
Compatibility : Usefulness, i.e. the hardware/software can work with different
computers.
Concurrency control : is a feature that allows simultaneous access to a database, while
preserving data integrity.
Support : Support of complex file structure and access path. Ex : MARC
Data Sharing : A database allows sharing of data under its control by any number of users.
Standards can be enforced : Standardizing stored data formats is particularly desirable as
an aid to data interchange between systems.
1.3 Types of Databases :
A database is considered a central pool of data which can be shared by a community of
users. There are three yardsticks to determine the nature of the data we deal with. They are :
a. Whether data is free of format or whether it is formatted.
b. Whether definition of data is of the same size as data itself.
c. Whether the data is active or passive.
When these yardsticks are applied to data, we can classify databases into four kinds, which
are :
1.3.1 Bibliographic Databases
1.3.2 Knowledge Databases
1.3.3 Graphic-Oriented Databases
1.3.4 Decision-making Databases
1.3.1 Bibliographic Databases : have data which is free of format (unformatted data). They are
composed of textual data which, by its very nature, displays little or no format. Such databases
are often used in library and information systems. Here data could be composed of abstracts of
books and similar documents, with key words and key phrases. Through the abstract, one can
determine whether the document is of interest or not. A bibliographic database contains descriptive
information about documents: titles, authors, journal name, volume and number, date,
keywords, abstract, etc.
1.3.2 Knowledge Databases : are used in Artificial Intelligence applications. The data
contained in these is discrete and formatted. In these there are typically many kinds of data, with
only a very few occurrences of each kind. In such databases, the data is as large as
the definition of the data.
1.3.3 Graphic-Oriented Databases : could possibly be used in Computer-Aided Design (CAD).
The data in such a database is characterized as being active. This means that the data is a procedure
capable of being executed. Any modification can be made to the data, whereas the data in kinds 1 and 2
above (bibliographic and knowledge databases) cannot be executed by a computer.
Ex : Computer-Aided Design (CAD)
Computer-Aided Learning (CAL)
Computer-Aided Instruction (CAI)
1.3.4 Decision-making Databases : are used in corporate management and allied
administrative tasks. Using the data contained in these databases, one could handle problems like
resource planning and sales forecasting. These databases are characterized by the fact that their
data contents are :
a. Formatted
b. Far larger than their description
c. Passive
These decision-making databases are often referred to as just databases. Depending upon
the kind of database being handled, Database Management Systems (DBMS) can be classified as,
for example, Bibliographic Database Management Systems, Knowledge Database Management
Systems and so on.
1.4 Concept of Data Structure :
Data are structured according to the data model. A data structure is a group of data elements
handled as a unit. Ex : Book details is a data structure consisting of the data elements Author name,
Title, Publisher's name, ISBN and Quantity.
There are several different approaches to analyzing the logical structure of data in
complex databases. Although all DBMSs have a common approach to data management, they
differ in the way they structure the data.
There are three types of data structure, viz
1.4.1 List Structure
1.4.2 Tree / Hierarchical Structure
1.4.3 Network Structure
1.4.1 List Structure : A list is nothing more than a special data structure made up of data records
where the Nth record is related to the (N-1)th and (N-2)th records simply because of positioning. This
brings a one-to-one relationship. This structure is illustrated below :
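As a minimal sketch of the idea that records in a list are related only by position, the following Python fragment may help. The record names and the helper predecessor are hypothetical and are used only for illustration; the sketch shows the relationship to the immediately preceding record only.

```python
# A list structure: records are related only through their position.
records = ["record 1", "record 2", "record 3", "record 4"]


def predecessor(position):
    """Return the record stored just before the given 0-based position, or None."""
    if position <= 0 or position >= len(records):
        return None
    return records[position - 1]


# Every record except the first has exactly one predecessor (one-to-one).
pairs = [(records[i], predecessor(i)) for i in range(len(records))]
```

No link or pointer is stored with any record; the relationship follows from the ordering alone, which is why inserting or removing a record changes the neighbours of the records around it.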
1.4.3 Network Structure : The network structure is a generalized form of the hierarchical structure.
In this view, as in the hierarchical approach, the data is represented by records and links. However, a
network is a more general structure than a hierarchy.
A network structure allows relationships among entities. Here the user views the
database as a number of individual record occurrences in which a given node may have any
number of subordinate nodes. A network structure is equivalent to a graph structure. This
brings a many-to-many relationship. The relationships between the different items are called sets.
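A network structure can be sketched as a graph of records and links. The fragment below is an illustrative Python sketch only; the class name Network, its method link and the author/book names are assumptions made for the example and are not taken from any particular DBMS.

```python
from collections import defaultdict


class Network:
    """Records joined by links; a node may have many subordinates and many owners."""

    def __init__(self):
        self.subordinates = defaultdict(set)  # owner record -> member records
        self.owners = defaultdict(set)        # member record -> owner records

    def link(self, owner, member):
        """Add one link between an owner record and a member record."""
        self.subordinates[owner].add(member)
        self.owners[member].add(owner)


net = Network()
net.link("Author A", "Book 1")
net.link("Author A", "Book 2")
net.link("Author B", "Book 2")  # Book 2 now has two owners: many-to-many

books_of_a = sorted(net.subordinates["Author A"])
authors_of_2 = sorted(net.owners["Book 2"])
```

In a strict hierarchy, "Book 2" could have only one parent; allowing a second owner is what makes the structure a network.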
3.1 Goals of Database Design :
Database design normally involves defining the logical attributes of the database and
designing the layout of the database file structure.
The main objectives of database design are :
1. To satisfy the information content requirement of the specified user and application.
2. To provide a natural and easy-to-understand structuring of the information.
3. To support processing requirements and any performance objectives such as
i. Response time
ii. Processing time
iii. Storage space
The main objective of the database design is to ensure that the database meets the
reporting and information requirements of the users efficiently. The database should be designed
in such a way that :
i. It eliminates or minimizes data redundancy.
ii. It maintains the integrity and independence of the data.
3.2 Logical and Physical View of Database :
In database design, several views of data must be considered along with the
persons who use them. There are three views :
1. The overall logical view
2. The program logical view
3. Physical view
The logical view is what the data look like, regardless of how they are stored, whereas the
physical view is the way data exist in physical storage; it deals with how data are stored,
accessed or related to other data in storage.
Taken in more detail, there are four views of data : three logical views and one physical view.
The logical views are the user's view, the programmer's view and the overall logical view
(schema).
The overall logical view (schema) helps the DBMS to decide what data in storage it
should act upon as required by the application program.
3.3 View of Data / Architecture of Database System :
A DBMS is a collection of interrelated files and a set of programs that allow users to
access and modify these files. A major purpose of a database system is to provide users with an
abstract view of the data i.e. the system hides certain details of how the data are stored and
maintained.
3.3.1 Data Abstraction :
a. Internal / Physical level : The internal level is the one closest to physical storage, i.e. the one
concerned with the way in which the data is actually stored. It is the lowest level of abstraction and
describes how the data are actually stored. At the physical level, complex low-level data
structures are described in detail.
b. Conceptual / Logical level : is a "level of indirection" between the internal and external levels.
This next higher level of abstraction describes what data are stored in the database, and what
relationships exist among those data. The entire database is thus described in terms of a small
number of relatively simple structures. This level is used by Database Administrators (DBA),
who must decide what information is to be kept in the database.
c. External / View level : The external level is the one closest to the users, i.e. the one concerned
with the way in which the data is viewed by individual users. It is the highest level of abstraction and
describes only part of the entire database. Despite the use of simpler structures at the logical
level, some complexity remains because of the large size of the database. Many users of the
database system will not be concerned with all this information. Instead, such users need to
access only a part of the database. So that their interaction with the system is simplified, the view
level of abstraction is defined. The system may provide many views for the same database.
If the external level is concerned with the individual user views, the conceptual level may
be thought of as defining a community user view. In other words, there will be many "external
views," each consisting of a more or less abstract representation of some portion of the database,
and there will be a single "conceptual view," consisting of a similarly abstract representation of
the database in its entirety. Likewise, there will be a single "internal view," representing the total
database as actually stored.
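The idea of many external views over a single conceptual view can be sketched with Python's standard sqlite3 module. SQLite is an assumption made for this example and is not a system named in this paper; the table, view names and rows are likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one community-wide description of the whole database.
conn.execute(
    "CREATE TABLE book (accession_no INTEGER PRIMARY KEY, author TEXT, title TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?, ?)",
    [(1, "Author A", "Title One", 250.0), (2, "Author B", "Title Two", 400.0)],
)

# External level: each group of users sees only the part it needs.
conn.execute("CREATE VIEW reader_view AS SELECT author, title FROM book")
conn.execute("CREATE VIEW accounts_view AS SELECT accession_no, price FROM book")

reader_rows = conn.execute("SELECT * FROM reader_view ORDER BY title").fetchall()
accounts_rows = conn.execute("SELECT * FROM accounts_view ORDER BY accession_no").fetchall()
```

The internal level does not appear in the code at all: how the rows are laid out in storage is left to the database engine, which is the point of the abstraction.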
3.3.2 Instances and schemas : Databases change over time as information is inserted or deleted. The
collection of information stored in the database at a particular moment is called an instance of
the database. The overall design of the database is called the database schema. Schemas are
changed infrequently, if at all.
The view at each of these levels is described by a Schema. A schema is an outline or a
plan that describes the records and relationships existing in the view. The word schemas is used in
the database literature for the plural instead of schemata, the grammatically correct word. The
schema also describes the way in which entities at one level of abstraction can be mapped to the
next level.
Database systems have several schemas, partitioned according to the levels of
abstraction discussed above. At the lowest level is the physical schema; at the intermediate
level is the logical schema; and at the highest level is a subschema. In general, database systems
support one physical schema, one logical schema and several subschemas.
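The difference between a schema and an instance can be sketched as follows. This is a minimal illustration using Python's sqlite3 module, which is an assumption for the example; the table member, the helper functions and the names inserted are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (member_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema():
    """Return the stored definition of the member table (the design)."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'member'"
    ).fetchone()
    return row[0]


def instance():
    """Return the rows stored in the member table at this moment."""
    return conn.execute("SELECT member_id, name FROM member ORDER BY member_id").fetchall()


schema_before, instance_before = schema(), instance()

conn.execute("INSERT INTO member (name) VALUES ('Asha')")
conn.execute("INSERT INTO member (name) VALUES ('Ravi')")

schema_after, instance_after = schema(), instance()
```

After the two insertions the instance has changed from an empty collection to two rows, while the schema text is exactly what it was before.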
3.3.3 Data independence : The ability to modify a schema definition in one level without
affecting a schema definition in the next higher level is called data independence. There are two
levels of data independence, viz.
a. Physical data independence : is the ability to modify the physical schema without causing
application programs to be rewritten. Modifications at the physical level are occasionally
necessary to improve performance.
b. Logical data independence : is the ability to modify the logical schema without causing
application programs to be rewritten. Modifications at the logical level are occasionally
necessary whenever the logical structure of the database is altered.
Logical data independence is more difficult to achieve than is physical data
independence, since the application programs are heavily dependent on the logical structure of
the data that they access.
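Physical data independence can be illustrated with a small sketch: an index is added to improve performance, and the application's query is left untouched. The example assumes Python's sqlite3 module; the table, the index name idx_book_author and the helper titles_by are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (accession_no INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [
        (1, "Author A", "Title One"),
        (2, "Author B", "Title Two"),
        (3, "Author A", "Title Three"),
    ],
)

# The "application program": it names only logical things (table, columns).
QUERY = "SELECT title FROM book WHERE author = ? ORDER BY title"


def titles_by(author):
    return [row[0] for row in conn.execute(QUERY, (author,))]


before = titles_by("Author A")

# A change at the physical level: add an access path to speed up the search.
conn.execute("CREATE INDEX idx_book_author ON book (author)")

after = titles_by("Author A")
```

The query text and its result are the same before and after the index is created; only the way the data is reached in storage may differ.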
3.3.4 Database languages : A Data Sublanguage (DSL) is the subset of the total language that is
concerned with the database objects and operations. A DSL is a user / query language which is
embedded in a host language. In principle, any given DSL is really a combination of two
languages :
a. Data Definition Language (DDL) : is one which specifies the database schema. A database
schema is specified by a set of definitions. These definitions include all the entities and their
associated attributes as well as the relationships among the entities. The result of compilation of
DDL statements is a set of tables that is stored in a special file called the data dictionary or data
directory, which contains metadata, i.e. data about data. This file is consulted before actual data
are read or modified in the database system.
The storage structure and access methods used by the database system are specified by a
set of definitions in a special type of DDL called a data storage and definition language.
b. Data Manipulation Language (DML) : is one which is used to express data queries and
updates i.e. manipulate data in the database. DML helps in
- the retrieval of information stored in the database
- the insertion of new information into the database
- the deletion of information from the database
- the modification of information stored in the existing database
A DML is a language that enables users to access or manipulate data as organized by the
appropriate data model. There are basically two types :
i. Procedural DMLs : require a user to specify what data are needed and how to get those data.
ii. Non-Procedural DMLs : require a user to specify what data are needed without
specifying how to get those data.
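The two sublanguages can be sketched with SQL embedded in a host language. The example below assumes Python as the host language and SQLite (through the standard sqlite3 module) as the database; neither is prescribed by this paper, and the table and values are hypothetical. The final loop is only an approximation of the procedural style, written in the host language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: specify the schema.
conn.execute("CREATE TABLE book (accession_no INTEGER PRIMARY KEY, title TEXT, copies INTEGER)")

# The definition is recorded as metadata (SQLite's counterpart of a data dictionary).
dictionary = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

# DML: insertion of new information.
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(1, "Database Systems", 3), (2, "Information Retrieval", 1), (3, "Library Automation", 0)],
)
# DML: modification of stored information.
conn.execute("UPDATE book SET copies = copies + 1 WHERE accession_no = 2")
# DML: deletion of information.
conn.execute("DELETE FROM book WHERE copies = 0")

# DML: retrieval, non-procedural style (state WHAT is wanted).
non_procedural = conn.execute(
    "SELECT title FROM book WHERE copies >= 2 ORDER BY title"
).fetchall()

# Retrieval, procedural style (state HOW to get it: visit each record and test it).
all_rows = conn.execute("SELECT accession_no, title, copies FROM book ORDER BY title").fetchall()
procedural = []
for accession_no, title, copies in all_rows:
    if copies >= 2:
        procedural.append((title,))
```

Both retrievals give the same answer; the difference lies in who decides how the records are visited, the DBMS in the first case and the user's program in the second.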
Mapping : There are two levels of mapping :
i. one between the external and conceptual levels of the system; and
ii. the other between the conceptual and internal levels.
The Conceptual/Internal mapping defines the correspondence between the conceptual
view and the stored database. The External/Conceptual mapping defines the correspondence
between a particular external view and the conceptual view.
The DBMS is the software that handles all access to the database. Conceptually what
happens is the following :
1. A user issues an access request, using some particular Data Manipulation
Language(DML);
2. the DBMS intercepts the request and interprets it;
3. the DBMS inspects, in turn the external schema, the external/conceptual mapping, the
conceptual schema, the conceptual/internal mapping, and the storage structure definition; and
4. the DBMS performs the necessary operations on the stored database.
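The two mappings and the steps above can be sketched as a toy example in Python. Everything here is hypothetical (the field names, the dictionaries and the function handle_request); a real DBMS is far more elaborate, and the sketch is meant only to show how a request passes through both mappings.

```python
# All names below are hypothetical and exist only for this sketch.

# External/conceptual mapping: what one user calls a field -> its conceptual name.
EXTERNAL_TO_CONCEPTUAL = {"writer": "author", "book_name": "title"}

# Conceptual/internal mapping: conceptual field -> position in the stored record.
CONCEPTUAL_TO_INTERNAL = {"author": 0, "title": 1, "accession_no": 2}

# The stored database: records as they are actually held.
STORED_RECORDS = [
    ("Author A", "Title One", 1),
    ("Author B", "Title Two", 2),
]


def handle_request(external_field):
    """Resolve a user's field name through both mappings and read the stored data."""
    conceptual_field = EXTERNAL_TO_CONCEPTUAL[external_field]  # external -> conceptual
    position = CONCEPTUAL_TO_INTERNAL[conceptual_field]        # conceptual -> internal
    return [record[position] for record in STORED_RECORDS]     # operate on stored records


writers = handle_request("writer")
```

If the stored record layout changed, only CONCEPTUAL_TO_INTERNAL would need to be updated; the user's request for "writer" would stay the same.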
3.4 Storage Structures :
Storage structures describe the way in which data may be organized in secondary
storage, i.e. direct access media such as disk packs, drums and so on.
User operations are expressed (via the DML) in terms of external records, and must be
converted by the DBMS into corresponding operations on internal or stored records. These latter
operations must be converted in turn to operations at the actual hardware level, i.e. to operations
on physical records or blocks. The component responsible for this internal/physical conversion is
called an access method. Its function is to conceal all device-dependent details from the DBMS
and to present the DBMS with a stored record interface. The stored record interface thus corresponds
to the internal level, just as the user interface corresponds to the external level. The physical
record interface corresponds to the actual hardware level.
The stored record interface permits the DBMS to view the storage structure as a
collection of stored files, each one consisting of all occurrences of one type of stored record (see
architecture of DBMS). Specifically, the DBMS knows (a) what stored files exist, and, for each
one, (b) the structure of the corresponding stored record, (c) the stored field(s), if any, on which
it is sequenced, and (d) the stored field(s), if any, that can be used as search arguments for direct
access. This information will all be specified as part of the storage structure definition.
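The four items (a) to (d) can be sketched as a small in-memory model. This is an illustrative sketch in Python only; the class StoredFile, its methods and the sample records are hypothetical, and nothing here deals with real disk blocks or devices.

```python
class StoredFile:
    """A stored file: all occurrences of one type of stored record."""

    def __init__(self, record_fields, sequence_field, search_fields):
        self.record_fields = tuple(record_fields)   # (b) structure of the stored record
        self.sequence_field = sequence_field        # (c) field on which the file is sequenced
        self.search_fields = tuple(search_fields)   # (d) fields usable for direct access
        self.records = []
        self.indexes = {field: {} for field in self.search_fields}

    def insert(self, record):
        if set(record) != set(self.record_fields):
            raise ValueError("record does not match the stored record structure")
        self.records.append(record)
        # Keep the file in sequence on the sequencing field.
        self.records.sort(key=lambda r: r[self.sequence_field])
        for field in self.search_fields:
            self.indexes[field].setdefault(record[field], []).append(record)

    def find(self, field, value):
        """Direct access through an index, without scanning the whole file."""
        if field not in self.indexes:
            raise KeyError(f"{field!r} cannot be used as a search argument")
        return list(self.indexes[field].get(value, []))


# (a) the stored files that exist: here a single file of book records.
books = StoredFile(
    record_fields=("accession_no", "author", "title"),
    sequence_field="accession_no",
    search_fields=("author",),
)
books.insert({"accession_no": 3, "author": "Author B", "title": "Title Three"})
books.insert({"accession_no": 1, "author": "Author A", "title": "Title One"})
books.insert({"accession_no": 2, "author": "Author A", "title": "Title Two"})

sequence = [r["accession_no"] for r in books.records]
by_author = [r["title"] for r in books.find("author", "Author A")]
```

A search on "title" is refused in this sketch because no index was declared for it, which mirrors the statement that only certain stored fields can serve as search arguments for direct access.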
4. Conclusion :
The field of information technology is growing at a very fast rate in India. Recently, new
types of requirements in database processing capabilities have been increasing in several areas of
application. At the same time, a variety of sophisticated techniques and
powerful modeling capabilities have been developed.
The database development process includes information gathering, selection of quality
information, computation and consolidation or abstracting in the case of a bibliographic database,
coding, structuring the compiled data into database format, data entry and editing, updating,
quality control at all levels, and maintenance.
As such, database expresses a concept which has evolved and changed gradually over the years
since the term was coined. Implementation of the concept has been made possible by improving
hardware and software technology, and the data made available is increasingly regarded as a vital
corporate resource.
India is a large country with vast natural resources. Still, information is scarce. It is not
that information is not generated, but that it gets locked on paper and put in files in the custody of
various government organizations and research institutions. India needs databases in view of
the liberalization of the Indian economy and the globalization of business. The increasing international
interaction requires the formation of relevant and viable databases.
In addition to the databases within an organization, a vast new demand is growing for
database services. These services have developed tremendously over time to support the changing
world's needs and the control and communication philosophies within organizations as well as
outside, as seen by the users of these services.
5. References :
1. Rob, Peter and Coronel, Carlos : Database Systems: Design, Implementation and
Management - 4th ed. Cambridge: Course Technology, 2000 (p 1-55, 286-321)
2. Date, C J : An Introduction to Database Systems - 3rd ed. Vol. 1. New
Delhi: Narosa, 1996 (p 3-32, 33-61, 63-80)
3. Silberschatz, Abraham and others : Database System Concepts - 3rd ed. New
Delhi: McGraw Hill, 1996 (p 1-21)
4. Desai, Bipin C : An Introduction to Database Systems.
New Delhi: Galgotia, 1996 (p 2-33)