DBMS Unit 1 Mca 201 RGPV

Introduction to DBMS (Database Management System)

A Database Management System (DBMS) is a software solution designed to efficiently manage, organize, and retrieve data in a structured manner. It serves as a critical component in modern computing, enabling organizations to store, manipulate, and secure their data effectively. From small applications to enterprise systems, a DBMS plays a vital role in supporting data-driven decision-making and operational efficiency.
In this article, we will explain the key concepts, benefits, and types of Database Management Systems (DBMS). We'll also cover how DBMS solutions work, why they're important for modern applications, and what features they offer to ensure data integrity, security, and efficient retrieval.

What is a DBMS?
A DBMS is a system that allows users to create, modify, and query
databases while ensuring data integrity, security, and efficient data
access. Unlike traditional file systems, DBMS minimizes data
redundancy, prevents inconsistencies, and simplifies data
management with features like concurrent access and backup
mechanisms. It organizes data into tables, views, schemas, and
reports, providing a structured approach to data management.
Example:
A university database can store and manage student information,
faculty records, and administrative data, allowing seamless retrieval,
insertion, and deletion of information as required.

Key Features of DBMS

1. Data Modeling: Tools to create and modify data models, defining the structure and relationships within the database.
2. Data Storage and Retrieval: Efficient mechanisms for storing data and executing queries to retrieve it quickly.
3. Concurrency Control: Ensures multiple users can access the database simultaneously without conflicts.
4. Data Integrity and Security: Enforces rules to maintain accurate and secure data, including access controls and encryption.
5. Backup and Recovery: Protects data with regular backups and enables recovery in case of system failures.

Types of DBMS
There are several types of Database Management Systems (DBMS),
each tailored to different data structures, scalability requirements, and
application needs. The most common types are as follows:

1. Relational Database Management System (RDBMS)

RDBMS organizes data into tables (relations) composed of rows and columns. It uses primary keys to uniquely identify rows and foreign keys to establish relationships between tables. Queries are written in SQL (Structured Query Language), which allows for efficient data manipulation and retrieval.
Examples: MySQL, Oracle, Microsoft SQL Server, and PostgreSQL.
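As a minimal sketch (the table and column names here are assumptions, not taken from the text), standard SQL relates two tables through a primary key and a foreign key:

CREATE TABLE students (
    student_id   INT PRIMARY KEY,          -- primary key: uniquely identifies each row
    student_name VARCHAR(100) NOT NULL,
    phone        VARCHAR(15)
);

CREATE TABLE enrollments (
    enrollment_id INT PRIMARY KEY,
    student_id    INT NOT NULL,
    course_name   VARCHAR(100),
    FOREIGN KEY (student_id) REFERENCES students(student_id)   -- foreign key: links back to students
);

SELECT s.student_name, e.course_name       -- SQL joins the two relations through the shared key
FROM students s
JOIN enrollments e ON s.student_id = e.student_id;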

2. NoSQL DBMS
NoSQL systems are designed to handle large-scale data and provide
high performance for scenarios where relational models might be
restrictive. They store data in various non-relational formats, such as
key-value pairs, documents, graphs, or columns. These flexible
data models enable rapid scaling and are well-suited for unstructured
or semi-structured data.
Examples: MongoDB, Cassandra, DynamoDB and Redis.

3. Object-Oriented DBMS (OODBMS)

OODBMS integrates object-oriented programming concepts into the database environment, allowing data to be stored as objects. This approach supports complex data types and relationships, making it ideal for applications requiring advanced data modeling and real-world simulations.
Examples: ObjectDB, db4o.
Database Languages
Database languages are specialized sets of commands and
instructions used to define, manipulate, and control data within a
database. Each language type plays a distinct role in database
management, ensuring efficient storage, retrieval, and security of
data. The primary database languages include:

1. Data Definition Language (DDL)

DDL stands for Data Definition Language. It deals with database schemas and descriptions of how the data should reside in the database (a short example follows the list below).
● CREATE: creates a database and its objects, such as tables, indexes, views, stored procedures, functions, and triggers
● ALTER: alters the structure of an existing database object
● DROP: deletes objects from the database
● TRUNCATE: removes all records from a table, including the space allocated for those records
● COMMENT: adds comments to the data dictionary
● RENAME: renames an object
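As a small, hedged sketch (the student table and its columns are assumptions used only for illustration), the main DDL commands look like this in SQL:

CREATE TABLE student (        -- CREATE: defines a new table object
    roll_no INT PRIMARY KEY,
    name    VARCHAR(100),
    phone   VARCHAR(15)
);
ALTER TABLE student ADD email VARCHAR(100);   -- ALTER: changes the structure of an existing object
TRUNCATE TABLE student;                       -- TRUNCATE: removes every row but keeps the table definition
DROP TABLE student;                           -- DROP: removes the table itself from the database
-- COMMENT and RENAME syntax differs between systems, e.g. in Oracle:
-- COMMENT ON TABLE student IS 'Basic student master data';
-- RENAME student TO learner;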

2. Data Manipulation Language (DML)

DML focuses on manipulating the data stored in the database, enabling users to retrieve, add, update, and delete data (see the sketch after this list).
● SELECT: retrieves data from a database
● INSERT: inserts data into a table
● UPDATE: updates existing data within a table
● DELETE: deletes records from a table
● MERGE: UPSERT operation (insert or update)
● CALL: calls a PL/SQL or Java subprogram
● EXPLAIN PLAN: shows the data access path the database will use
● LOCK TABLE: used for concurrency control
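Continuing the illustrative student table from the DDL sketch above, typical DML statements are:

INSERT INTO student (roll_no, name, phone) VALUES (1, 'Rahul', '9876543210');   -- add a row
SELECT name, phone FROM student WHERE roll_no = 1;                              -- retrieve data
UPDATE student SET phone = '9123456780' WHERE roll_no = 1;                      -- change existing data
DELETE FROM student WHERE roll_no = 1;                                          -- remove matching rows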

3. Data Control Language (DCL)

DCL commands manage access permissions, ensuring data security by controlling who can perform certain actions on the database (an example follows the list).
● GRANT: provides specific privileges to a user (e.g., SELECT, INSERT).
● REVOKE: removes previously granted permissions from a user.
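For example (the exam_cell user and the student table are assumptions used only for illustration), access rights could be managed as follows:

GRANT SELECT, INSERT ON student TO exam_cell;   -- exam_cell may now read and add student rows
REVOKE INSERT ON student FROM exam_cell;        -- later, the insert privilege is withdrawn again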

4. Transaction Control Language (TCL)

TCL commands oversee transactional data to maintain consistency, reliability, and atomicity (a short example follows the list).
● ROLLBACK: undoes changes made during a transaction.
● COMMIT: saves all changes made during a transaction.
● SAVEPOINT: sets a point within a transaction to which one can later roll back.
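A hedged sketch of the TCL commands, assuming a hypothetical accounts table:

UPDATE accounts SET balance = balance - 50 WHERE account_id = 'A';
SAVEPOINT after_debit;                     -- a point we can return to without losing the debit
UPDATE accounts SET balance = balance + 50 WHERE account_id = 'B';
ROLLBACK TO SAVEPOINT after_debit;         -- undo only the work done after the savepoint
COMMIT;                                    -- make whatever remains of the transaction permanent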

5. Data Query Language (DQL)

DQL is a subset of DML, specifically focused on data retrieval.
● SELECT: the primary DQL command, used to query data from the database without altering its structure or contents (illustrated below).
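For instance, using the illustrative student table from earlier, a pure DQL query reads data without modifying anything:

SELECT name, phone
FROM student
WHERE roll_no BETWEEN 1 AND 100
ORDER BY name;   -- reads and sorts data; the table itself is left untouched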

Paradigm Shift from File System to DBMS


Before the advent of modern Database Management Systems
(DBMS), data was managed using basic file systems on hard drives.
While this approach allowed users to store, retrieve, and update files
as needed, it came with numerous challenges.
A typical example can be seen in a file-based university management
system, where data was stored in separate sections such as
Departments, Academics, Results, Accounts, and Hostels. Certain
information like student names and phone numbers was repeated
across multiple files, leading to the following issues:
1. Redundancy of data

When the same data exists in multiple places, any update must be
manually repeated everywhere. For instance, if a student changes
their phone number, it must be updated across all sections. Failure to
do so leads to unnecessary duplication and wasted storage.

2. Inconsistency of Data

Data is said to be inconsistent if multiple copies of the same data do not match each other. If the phone number is different in the Accounts section and the Academics section, it will be inconsistent. Inconsistency may be because of typing errors or not updating all copies of the same data.

3. Complex Data Access

A user must know the exact location of a file to access its data, which makes the process cumbersome and tedious. Imagine how difficult it would be to find a student's hostel allotment number among 10,000 unsorted student records.

4. Lack of Security

File systems provided limited control over who could access certain
data. A student who gained access to a file with grades might easily
alter it without proper authorization, compromising data integrity.

5. No Concurrent Access

File systems were not designed for multiple users working at the same
time. If one user was editing a file, others had to wait, which hindered
collaboration and slowed down workflows.
6. No Backup and Recovery

File systems lacked built-in mechanisms for creating backups or recovering data after a loss. If a file was accidentally deleted or corrupted, there was no easy way to restore it, potentially causing permanent data loss.
Advantages of DBMS
1. Data organization: A DBMS allows for the organization and storage of data in a structured manner, making it easy to retrieve and query the data as needed.
2. Data integrity: A DBMS provides mechanisms for enforcing data integrity constraints, such as constraints on the values of data and access controls that restrict who can access the data.
3. Concurrent access: A DBMS provides mechanisms for controlling concurrent access to the database, to ensure that multiple users can access the data without conflicting with each other.
4. Data security: A DBMS provides tools for managing the security of the data, such as controlling access to the data and encrypting sensitive data.
5. Backup and recovery: A DBMS provides mechanisms for backing up and recovering the data in the event of a system failure.
6. Data sharing: A DBMS allows multiple users to access and share the same data, which can be useful in a collaborative work environment.

Disadvantages of DBMS
1. Complexity: A DBMS can be complex to set up and maintain, requiring specialized knowledge and skills.
2. Performance overhead: The use of a DBMS can add overhead to the performance of an application, especially in cases where high levels of concurrency are required.
3. Scalability: The use of a DBMS can limit the scalability of an application, since it requires the use of locking and other synchronization mechanisms to ensure data consistency.
4. Cost: The cost of purchasing, maintaining, and upgrading a DBMS can be high, especially for large or complex systems.
5. Limited Use Cases: Not all use cases are suitable for a DBMS; some solutions don't need high reliability, consistency, or security and may be better served by other types of data storage.

Applications of DBMS
1. Enterprise Information: Sales, accounting, human resources, manufacturing, online retailers.
2. Banking and Finance Sector: Banks maintain customer details, accounts, loans, banking transactions, and credit card transactions. Finance: storing information about sales, holdings, and purchases of financial stocks and bonds.
3. University: Maintaining information about student course enrollment, student grades, and staff roles.
4. Airlines: Reservations and schedules.
5. Telecommunications: Prepaid and postpaid bill maintenance.

Advantages of Database Management System

A Database Management System (DBMS) is a collection of interrelated data and a set of software tools/programs that access, process, and manipulate that data. It allows access, retrieval, and use of that data by considering appropriate security measures. The Database Management System (DBMS) is really useful for better data integration and security.

The advantages of database management systems are:
1. Data Security: The more accessible and usable the database, the more prone it is to security issues. As the number of users increases, the rate of data transfer and data sharing also increases, raising the risk to data security. In the corporate world, companies invest large amounts of money, time, and effort to ensure data is secure and used properly. A DBMS provides a better platform for data privacy and security policies, thus helping companies to improve data security.
2. Data integration: With a DBMS we have access to well-managed and synchronized data, which makes data handling very easy, gives an integrated view of how a particular organization is working, and helps to keep track of how one segment of the company affects another segment.
3. Data abstraction: The major purpose of a database system is to provide users with an abstract view of the data. The complex algorithms that developers use to increase the efficiency of the database are hidden from users through the various levels of data abstraction, allowing users to easily interact with the system.
4. Reduction in data redundancy: When working with a structured database, a DBMS provides features to prevent the input of duplicate items into the database. For example, if the same student appears in two different rows, one of the duplicates is removed.
5. Data sharing: A DBMS provides a platform for sharing data across multiple applications and users, which can increase productivity and collaboration.
6. Data consistency and accuracy: A DBMS ensures that data is consistent and accurate by enforcing data integrity constraints and preventing data duplication. This helps to eliminate the discrepancies and errors that can occur when data is stored and managed manually.
7. Data organization: A DBMS provides a systematic approach to organizing data in a structured way, which makes it easier to retrieve and manage data efficiently.
8. Efficient data access and retrieval: A DBMS allows efficient data access and retrieval by providing indexing and query optimization techniques that speed up data retrieval. This reduces the time required to process large volumes of data and increases the overall performance of the system.
9. Concurrency and maintained atomicity: If an operation is performed on one particular table of the database, the change must be reflected across the entire database. The DBMS allows concurrent access by multiple users using synchronization techniques.
10. Scalability and flexibility: DBMS is highly scalable and
can easily accommodate changes in data volumes and user
requirements. DBMS can easily handle large volumes of data,
and can scale up or down depending on the needs of the
organization. It provides flexibility in data storage, retrieval,
and manipulation, allowing users to easily modify the structure
and content of the database as needed.
DBMS offers numerous advantages, including data security, integrity, and reduced redundancy.
Advantages of Database Management System over Traditional
File System
1. Better Data Security: DBMS provides a centralized approach to data management that ensures data integrity and security. To prevent illegal access, alteration, or theft, database management systems include a number of security features such as encryption, authentication, and authorization. Sensitive data is thus safeguarded against both internal and external attacks.
2. Reduced Data Redundancy: DBMS reduces data redundancy by storing data centrally in a structured way and by offering methods for sharing and reusing it across different applications and users, reducing the need for duplicating data. As a result, less data storage is needed and data consistency is increased.
3. Improved Data Consistency: DBMS allows defining constraints and rules to ensure that data is consistent and accurate. By enforcing these data validation rules and constraints, the DBMS keeps data accurate and consistent across different applications and users.
4. Improved Data Access and Availability: DBMS provides efficient data access and retrieval mechanisms that enable quick and easy data access. It allows multiple users to access the data simultaneously, ensuring data availability.
5. Improved Data Sharing: DBMS provides a platform for sharing data across different applications and users. It allows data to be shared between different departments and systems within an organization, improving collaboration and decision-making. By enabling numerous people to view and edit the same data at once without conflicts or data loss, it promotes teamwork and enhances data uniformity throughout the company.
6. Improved Data Integration: DBMS allows integrating data from different sources, systems, and platforms, providing a comprehensive view of the data and improving the quality of data analysis. To avoid data mistakes and inconsistencies, database management systems apply data integrity requirements including referential integrity, entity integrity, and domain integrity. This guarantees the consistency, accuracy, and completeness of the data.
7. Improved Data Backup and Recovery: DBMS provides backup and recovery mechanisms that ensure data is not lost in case of a system failure, and allows restoring data to a specific point in time, ensuring data consistency. These features let businesses swiftly and effectively restore lost or damaged data, which guarantees business continuity and lowers the chance of data loss.
8. Data independence: By separating the logical and physical views of data, database management systems enable users to work with data without being aware of its exact location or structure. This offers adaptability and lowers the possibility of data damage as a result of modifications to the underlying hardware or software.

Data Abstraction and Data Independence

Database systems comprise complex data structures. To make the system efficient in terms of data retrieval and to reduce complexity for its users, developers use abstraction, i.e., they hide irrelevant details from the users. This approach simplifies database design.

Levels of Abstraction in a DBMS

There are mainly 3 levels of data abstraction:
● Physical or Internal Level
● Logical or Conceptual Level
● View or External Level

Physical or Internal Level

This is the lowest level of data abstraction. It tells us how the data is actually stored in memory. Access methods like sequential or random access and file organization methods like B+ trees and hashing are used at this level. Usability, the size of memory, and the number of times the records are accessed are factors we need to consider while designing the database.
Suppose we need to store the details of an employee. The blocks of storage and the amount of memory used for this purpose are kept hidden from the user.

Logical or Conceptual Level

This level comprises the information that is actually stored in the database in the form of tables. It also stores the relationships among the data entities in relatively simple structures. At this level, what information will be made available to the user at the view level is not a concern.
For example, we can store the various attributes of an employee, and relationships such as the one with the manager can also be stored.

The logical level thus describes the entire database in terms of a small
number of relatively simple structures. Although implementation of the simple
structures at the logical level may involve complex physical-level structures,
the user of the logical level does not need to be aware of this complexity. This
is referred to as physical data independence. Database administrators, who
must decide what information to keep in the database, use the logical level of
abstraction.

View or External Level

This is the highest level of abstraction. Only a part of the actual database is
viewed by the users. This level exists to ease the accessibility of the database
by an individual user. Users view data in the form of rows and columns. Tables
and relations are used to store data. Multiple views of the same database may
exist. Users can just view the data and interact with the database, storage and
implementation details are hidden from them. Even though the logical level uses
simpler structures, complexity remains because of the variety of information
stored in a large database. Many users of the database system do not need all
this information; instead, they need to access only a part of the database. The
view level of abstraction exists to simplify their interaction with the system

Example: In the case of storing customer data,
● Physical level – contains the blocks of storage (bytes, GB, TB, etc.) where the data resides.
● Logical level – contains the fields and attributes of the data.
● View level – works through CLI or GUI access to the database (a view sketch follows below).
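A database view is one concrete way the external level is realised. As an illustrative sketch (the customer table and its columns are assumptions), the view below exposes only part of the stored data to a particular class of users:

CREATE VIEW customer_contacts AS
SELECT customer_id, customer_name, city
FROM customer;            -- other stored columns and all storage details stay hidden from this view's users

SELECT * FROM customer_contacts WHERE city = 'Bhopal';   -- users query the view like an ordinary table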

Data Abstraction

The main purpose of data abstraction is to achieve data independence, in order to save the time and cost required when the database is modified or altered.

What is Data Independence in DBMS?

Data independence is a property of a database management system by which we can change the database schema at one level of the database system without changing the database schema at the next higher level. Below we look at data independence in more detail, along with its types.

In other words, data independence is the feature that allows the schema of one layer of the database system to be changed without any impact on the schema of the next higher level of the database system. Through data independence we can build an environment in which data is independent of all programs, and the three-schema architecture makes this idea easier to understand.

Data independence can be summed up as a kind of immunity of user applications to changes in the way data is represented and stored. Separate applications should not be forced to deal with data representation and storage specifics, because that reduces quality and flexibility; the DBMS instead lets them see the data in a generalized way. In short, the ability to change the structure of a lower-level schema without having to change the upper-level schema is called data independence.

Types of Data Independence

There are two types of data independence:
● Logical data independence
● Physical data independence
Logical Data Independence

● Changing the logical schema (conceptual level) without changing the external schema (view level) is called logical data independence.
● It is used to keep the external schema separate from the logical schema.
● If we make any changes at the conceptual level of data, it does not affect the view level.
● This happens at the user interface level.
● For example, it is possible to add or delete new entities and attributes in the conceptual schema without making any changes to the external schema.

Physical Data Independence

● Making changes to the physical schema without changing the logical schema is called physical data independence.
● If we change the storage size of the database system server, it will not affect the conceptual structure of the database.
● It is used to keep the conceptual level separate from the internal level.
● This happens at the logical interface level.
● Example – changing the location of the database from the C drive to the D drive (a small SQL illustration follows).
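As a rough SQL illustration (all object names are assumptions): a physical-level change such as adding an index alters only how data is stored and searched, while a logical-level change such as adding a column alters the conceptual schema; a well-designed external view can remain unchanged in both cases.

CREATE INDEX idx_student_phone ON student(phone);
-- physical-level change: storage and access paths change, the logical schema (tables, attributes) is untouched

ALTER TABLE student ADD hostel_no VARCHAR(10);
-- logical-level change: the conceptual schema gains an attribute; external views that do not
-- reference hostel_no keep working unchanged (exact ALTER syntax varies slightly by DBMS)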
Difference Between Physical and Logical Data Independence

● Concern: Physical data independence mainly concerns how the data is stored in the system, while logical data independence concerns changes to the structure or data definition.
● Difficulty: Physical data independence is easier to achieve; logical data independence is difficult to achieve in comparison.
● Application impact: To make changes at the physical level we generally do not require changes at the application program level; to make changes at the logical level we usually need to make changes at the application level.
● Schema involved: Physical data independence relates to the internal schema; logical data independence relates to the conceptual schema.
● Need for change: There may or may not be a need for changes at the internal level to improve the structure, whereas whenever the logical structure of the database has to be changed, the changes made at the logical level are important.
● Examples: Physical – change in compression technology, hashing algorithm, storage device, etc. Logical – adding, modifying, or deleting an attribute.

Database Schemas

A database schema defines the structure and organization of data within a database. It outlines how data is logically stored, including the relationships between different tables and other database objects. The schema serves as a blueprint for how data is stored, accessed, and manipulated, ensuring consistency and integrity throughout the system. In this article, we will explore the concept of a database schema, its types, and the crucial role it plays in designing efficient and scalable databases.

What is Schema?

A schema is the blueprint or structure that defines how data is organized and
stored in a database. It outlines the tables, fields, relationships, views, indexes,
and other elements within the database. The schema defines the logical view
of the entire database and specifies the rules that govern the data, including its
types, constraints, and relationships.

Database Schema
A database schema is the design or structure of a database that defines how data is organized and how different data elements relate to each other. It acts as a blueprint, outlining tables, fields, relationships, and rules that govern the data.

Key points about a database schema:

● It defines how data is logically organized, including tables, fields,

and relationships.

● It outlines the relationships between entities, such as primary and

foreign keys.

● It helps resolve issues with unstructured data by organizing it in a

clear, structured way.

● Database schemas guide how data is accessed, modified, and

maintained.

In simple terms, the schema provides the framework that makes it easier to

understand, manage, and use data in a database. It’s created by database

designers to ensure the data is consistent and efficiently organized.


Types of Database Schemas

Physical Database Schema

● A physical schema defines how data is stored in the storage system,

including the arrangement of files, indices and other storage

structures. It specifies the actual code and syntax needed to create

the database structure. Essentially, it determines where and how

the data is stored in the physical storage medium.

● The database administrator decides the storage locations and

organization of data within the storage blocks. This schema

represents the lowest level of abstraction


Logical Database Schema

● A logical database schema defines the logical structure of the data,

including tables, views, relationships, and integrity constraints. It

describes how data is organized in tables and how the attributes of

these tables are connected. The logical schema ensures that the

data is stored in an organized manner, while maintaining data

integrity.

● Using Entity-Relationship (ER) modeling, the logical schema

outlines the relationships between different data components. It

also defines integrity constraints to ensure the quality of data during

insertion and updates.

● This schema represents a higher level of abstraction compared to

the physical schema, focusing on logical constraints and how the

data is structured, without dealing with the physical storage details.

View Database Schema

● The view schema is the highest level of abstraction in a database,

focusing on how users interact with the database. It defines the

interface through which users can access and manipulate data,

without needing to understand the underlying storage mechanisms.


● A database can have multiple view schemas, also known as

subschemas, each providing a different perspective of the data.

These schemas describe only a part of the database.

Creating Database Schema


For creating a schema, the statement “CREATE SCHEMA” is used in every

database. But different databases have different meanings for this. Below

we’ll be looking at some statements for creating a database schema in

different database systems:

1. MySQL: In MySQL, we use the "CREATE SCHEMA" statement for creating a database, because in MySQL the CREATE SCHEMA and CREATE DATABASE statements are equivalent.

2. SQL Server: In SQL Server, we use the “CREATE SCHEMA” statement for

creating a new schema.

3. Oracle Database: In Oracle Database, we use “CREATE USER” for creating

a new schema, because in the Oracle database, a schema is already created

with each database user. The statement “CREATE SCHEMA” does not create

a schema, instead, it populates the schema with tables & views and also

allows one to access those objects without needing multiple SQL statements

for multiple transactions.
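As a hedged sketch of the statements mentioned above (the university schema and student table are illustrative names), in MySQL or SQL Server one might write:

CREATE SCHEMA university;           -- in MySQL this is the same as CREATE DATABASE university

CREATE TABLE university.student (   -- objects are then created inside (qualified by) that schema
    roll_no INT PRIMARY KEY,
    name    VARCHAR(100)
);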

Database Schema Designs


There are many ways to structure a database and we should use the best-

suited schema design for creating our database because ineffective schema

designs are difficult to manage & consume extra memory and resources.

Schema design mostly depends on the application’s requirements. Here we

have some effective schema designs to create our applications, let’s take a look

at the schema designs:

1. Flat Model

2. Hierarchical Model

3. Network Model

4. Relational Model

5. Star Schema

6. Snowflake Schema

Flat Model
A flat model schema is a 2-D array in which every column contains the same type of data/information and the elements in a row are related to each other. It is just like a table or a spreadsheet. This schema is better for small applications that do not contain complex data.


Hierarchical Model
Data is arranged using parent-child relationships and a tree-like structure in

the Hierarchical Database Model. Because each record consists of several

children and one parent, it can be used to illustrate one-to-many relationships

in diagrams such as organizational charts. A hierarchical database structure is

great for storing nested data.


Network Model
The network model is similar to the hierarchical model in that it represents

data using nodes (entities) and edges (relationships). However, unlike the

hierarchical model, which enforces a strict parent-child relationship, the

network model allows for more flexible many-to-many relationships. This

flexibility means that a node can have multiple parent nodes and child nodes,

making the structure more dynamic.

The network model can contain cycles which is a situation where a path exists

that allows you to start and end at the same node. These cycles enable more

complex relationships and allow for greater data interconnectivity.


Relational Model
The relational model is mainly used for relational databases, where the data

is stored as relations of the table. This relational model schema is better for

object-oriented programming.
Star Schema
The star schema is better for storing and analyzing large amounts of data. It has a fact table at its center and multiple dimension tables connected to it, just like a star. The fact table contains the numerical data that runs business processes, while the dimension tables contain data related to dimensions such as product, time, and people; in other words, they describe the fact table. The star schema allows us to structure the data of an RDBMS.

Snowflake Schema
Just like star schema, the snowflake schema also has a fact table at its center
and multiple dimension tables connected to it, but the main difference in both
models is that in snowflake schema – dimension tables are further normalized
into multiple related tables. The snowflake schema is used for analyzing large
amounts of data.
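A minimal star-schema sketch (all table and column names are hypothetical): a central fact table holds numeric measures and references the dimension tables through foreign keys. In a snowflake schema, dim_product would be further normalized, for example into separate product and product_category tables.

CREATE TABLE dim_product (product_id INT PRIMARY KEY, product_name VARCHAR(100));
CREATE TABLE dim_date    (date_id    INT PRIMARY KEY, full_date DATE);

CREATE TABLE fact_sales (                 -- fact table at the centre of the star
    sale_id    INT PRIMARY KEY,
    product_id INT,
    date_id    INT,
    quantity   INT,
    amount     DECIMAL(10,2),             -- numeric measures that drive the analysis
    FOREIGN KEY (product_id) REFERENCES dim_product(product_id),
    FOREIGN KEY (date_id)    REFERENCES dim_date(date_id)
);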
Difference between Logical and Physical Database Schema

● The physical schema describes the way data is stored on disk, whereas the logical schema provides the conceptual view that defines the relationships between the data entities.
● The physical schema has a low level of abstraction, whereas the logical schema has a high level of abstraction.
● The physical design must work with a specific database management system or hardware platform, whereas the logical design is independent of any database management system.
● Changes in the physical schema normally do not affect the logical schema, whereas changes made in the logical schema may require corresponding changes at the physical level.
● The physical schema does not include attributes, whereas the logical schema includes the attributes and their data types.
● Examples of the physical schema: data definition language (DDL), storage structures, indexes. Examples of the logical schema: entity-relationship diagram, Unified Modeling Language class diagram.

Advantages of Database Schema


● Providing Consistency of data: Database schema ensures the data

consistency and prevents the duplicates.

● Maintaining Scalability: A well-designed database schema makes it easy to add new tables to the database and helps in handling large amounts of data as tables grow.

● Performance Improvement: Database schema helps in faster data

retrieval which is able to reduce operation time on the database

tables.

● Easy Maintenance: A database schema makes it possible to maintain or change one part of the database without affecting the rest of the database.

● Security of Data: Database schema helps in storing the sensitive

data and allows only authorized access to the database.

Database Instance
A database instance is a snapshot of a database at a specific moment in time, containing all the properties described by the database schema as actual data values. Unlike database schemas, which are considered the "blueprint" of a database, instances change over time, whereas it is much harder to modify the schema because the schema represents the fundamental structure of the database. The schema itself does not hold any of the saved data; the data values belong to the instance.

Database schema versus database instance

● Definition: The schema is the blueprint or design of the database structure; the instance is the actual data stored in the database at a given time.
● Nature: The schema is static (does not change frequently); the instance is dynamic (changes with every data modification).
● Represents: The schema represents the structure (tables, columns, data types, relationships); the instance represents the state of the data in the database.
● Example: Schema – table definitions, data types, constraints. Instance – the actual rows of data in the tables.
● Change frequency: The schema changes infrequently (e.g., during schema design changes); the instance changes frequently with transactions.

What are the Data Models in DBMS?


The Data Model gives us an idea of how the final system would look after it has been
fully implemented. It specifies the data items as well as the relationships between them.
In a database management system, data models are often used to show how data is
connected, stored, accessed, and changed. We portray the information using a set of
symbols and language so that members of an organisation may understand and
comprehend it and then communicate.

Types of Data Models in DBMS


Though there are other data models in use today, the Relational model is the most
used. Aside from the relational model, there are a variety of different data models that
we shall discuss in-depth in this article. Data Models in DBMS include the following:

Hierarchical Model

This concept uses a hierarchical tree structure to organise the data. The hierarchy begins at the root, which contains root data, and then grows into a tree as child nodes are added to the parent node.

Network Model

The main difference between this model and the hierarchical model is that any record can have several parents in the network model. It uses a graph instead of a hierarchical tree.

Entity-Relationship Model

The real-world problem is depicted in visual form in this model to make it easier for stakeholders to comprehend. The ER diagram also makes it very simple for developers to comprehend the system.

Relational Model

The data in this model is kept in the form of a table that is two-dimensional. All of the data is kept in the form of rows and columns. Tables are the foundation of the relational paradigm.

Object-Oriented Data Model

Both the data and the relationship are contained in a single structure that is known as an object in this model. We can now store audio, video, pictures, and other types of data in databases, which was previously impossible with the relational approach (although you can store video and audio in a relational DB, it is advised not to store them there).

Object-Relational Data Model

It is a hybrid of the relational and object-oriented models. This model was developed to bridge the gap between the object-oriented and relational models.

Flat Data Model

It’s a straightforward model in which the DB is depicted as a table with rows and
columns. If you wish to learn more about the flat data model, click here.

Semi-Structured Data Model

The relational model has evolved into the semi-structured model. In this model, we can't tell the difference between data and schema.

Associative Data Model

It is a model in which the data is separated into two sections. Everything that has its own existence is referred to as an entity, and the relationships between these entities are referred to as associations. The data is thus divided into two components: items and links.

Context Data Model


The Context Data Model is made up of various models. This includes models such as network models and relational models, among others.

Data Model Definition


The term “data model” refers to the way data is organized,

documented, and defined within a database.

A data model in DBMS is a set of concepts and rules that are used to

describe and organize the data in a database.

It defines the structure, relationships, and constraints of the data, and

provides a way to access and manipulate the data.


Types of Data Models

Different data models are used to represent different types of data and relationships, and each has its own set of advantages and disadvantages. A data model defines how data is stored, arranged, and accessed in a database system. The main components of a data model include entities, attributes, relationships, and constraints.

Key Components of a Data Model:


● Entities: Objects or concepts in the real world that are

represented in the database. For example, in a university

database, entities could be students, courses, professors, etc.

● Attributes: Characteristics or properties of entities. For

instance, a student entity may have attributes such as student

ID, name, date of birth, etc.

● Relationships: Associations between entities. These define

how entities are related to each other. For example, a student

entity can be related to a course entity through an enrollment

relationship.

● Constraints: Rules that define the valid values and

relationships for data elements. Constraints ensure data

integrity and enforce rules such as uniqueness (e.g., unique

student IDs) and referential integrity (e.g., ensuring that

every course enrollment is associated with an existing student

and course).
Advantages or Benefits of Data Models:
Data models are essential in database design and management for

several reasons:

1. Clarity and Understanding: They provide a clear and

structured way to understand the organization and

relationships of data within a database system. This clarity

helps both developers and users comprehend how data is

structured and accessed.

2. Database Design: Data models serve as a blueprint for

designing databases. They guide the creation of database

schemas, tables, relationships, and constraints based on the

requirements of the application or business domain.


3. Communication: Data models facilitate communication

among stakeholders such as developers, database

administrators, and business analysts. They provide a

common language and visual representation to discuss and

refine data requirements and structures.

4. Data Integrity and Consistency: By defining constraints

and relationships, data models ensure that data stored in the

database is accurate, consistent, and reliable. This helps in

maintaining data quality over time.

5. Scalability and Performance: Well-designed data models

contribute to efficient data retrieval and manipulation

operations. They optimize database performance by

organizing data in a manner that aligns with the typical

queries and operations performed on the data.

6. Flexibility and Adaptability: Data models can evolve over

time to accommodate changes in business requirements or

technological advancements. They provide a foundation that


can be modified or extended as needed without disrupting

existing data structures and applications.

Data in DBMS:
Data in a DBMS refers to the raw facts and figures that are processed

to produce meaningful information. These can be numbers, text,

images, videos, or any other form of information that can be stored

and processed by a computer system. In a relational database, data is

typically organized into tables, rows, and columns.

Example: Consider a university database system. Data could include

student names, IDs, courses taken, grades, and faculty details. Each

piece of information (e.g., “John Doe”, “202345”, “Database

Management”, “A”, “Dr. Smith”) represents a unit of data within the

system.

Types of Data Models in DBMS:

1. Hierarchical Model:

● Description: In a hierarchical data model, data is organized

in a tree-like structure where each record has a single parent

record and potentially multiple child records.

● Example-1: An organizational chart where each employee

reports to one manager and can have several subordinates.

● Example-2: The university administration hierarchy, where

each department (node) reports to a dean (parent node), and

each faculty member (child node) belongs to a department.


In a hierarchical data model, data is organized in a tree-like structure with a one-to-many relationship between two different types of data. For example, one department can have many courses, teachers, and students.

2. Network Model:

The Network Model in DBMS is a model that is used to represent many-to-many relationships among the data in the database.
It is a simple and easy-to-construct database model.

● Description: Similar to the hierarchical model but allows

for more complex relationships where each record can have

multiple parent and child records.

● Example-1: A network of interconnected computers where

each node (computer) can communicate with multiple other

nodes.

● Example-2: A research database where a publication can

have multiple authors (nodes) & each author can contribute

to multiple publications, forming a network of interconnected

relationships.

Let us take a basic example to visualize the structure of a network

model in DBMS.
Suppose we are designing the network model for a Students database. The Subject entity has a relationship with both the Student entity and the Degree entity, so there is an edge connecting the Subject entity with both Student and Degree. The Subject entity therefore has two parents, and each of the other two entities has one child entity.

3. Relational Model:

The relational model for database management is an approach to

logically represent and manage the data stored in a database. In this


model, the data is organized into a collection of two-dimensional

inter-related tables, also known as relations. Each relation is a

collection of columns and rows, where the column represents the

attributes of an entity and the rows (or tuples) represents the records.

● Description: Data is organized into tables (relations) where

relationships between data elements are represented by

values in common columns (attributes).

● Example-1: A customer relationship management (CRM)

database where there are tables for customers, orders,

products, etc., and relationships are defined using foreign

keys.

● Example-2: A student information system where student

data (name, ID, major) is stored in a table, and related tables

store course enrollments, grades, and faculty information

linked via keys.


Relational Model stores the data into tables (relations)

Any given row of the relation indicates a student i.e., the row of the

table describes a real-world entity.

The columns of the table indicate the attributes related to the entity.

In this case, the roll number, CGPA, and the name of the student.

4. Entity-Relationship (ER) Model:

An Entity Relationship Diagram (ER Diagram) pictorially explains the

relationship between entities to be stored in a database.


● Description: Represents entities (objects) and their

relationships in a diagrammatic form. Entities have attributes

that describe their properties.

● Example-1: A social media platform where entities could be

users, posts, comments, and relationships like “likes”,

“follows”, etc.

● Example-2: Designing a library management system where

entities like books, members, and transactions are modeled

along with relationships like “borrowed by”, “author of”, and

“category of”.
(Figure: entity-relationship diagram of the Student and Course entities.)

The diagram showcases two entities — Student and Course, and their

relationship. The relationship described between student and course is

many-to-many, as a course can be opted by several students, and a

student can opt for more than one course. Student entity possesses

attributes — Stu_Id, Stu_Name & Stu_Age. The course entity has

attributes such as Cou_ID & Cou_Name.
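In a relational implementation of this ER diagram, the many-to-many relationship is usually realised with a junction table. A hedged sketch using the attribute names from the diagram:

CREATE TABLE student (Stu_Id INT PRIMARY KEY, Stu_Name VARCHAR(100), Stu_Age INT);
CREATE TABLE course  (Cou_ID INT PRIMARY KEY, Cou_Name VARCHAR(100));

CREATE TABLE enrollment (                   -- junction table for the many-to-many relationship
    Stu_Id INT,
    Cou_ID INT,
    PRIMARY KEY (Stu_Id, Cou_ID),           -- a student can opt for many courses and vice versa
    FOREIGN KEY (Stu_Id) REFERENCES student(Stu_Id),
    FOREIGN KEY (Cou_ID) REFERENCES course(Cou_ID)
);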

5. Object-Oriented Model:

Increasingly complex real-world problems demonstrated a need for a

data model that more closely represented the real world. In the object
oriented data model (OODM), both data and their relationships are

contained in a single structure known as an object.

● Description: Data is represented as objects, similar to

object-oriented programming concepts. Objects have

attributes and methods (functions) that operate on the data.

● Example: A multimedia database storing images, videos,

and text, where each object (e.g., an image) has attributes

(size, format) and methods (display, edit).

6. Document Model:
Documents are the most common way for storing, retrieving, and

managing semi-structured data. Unlike the traditional relational data

model, the document data model is not restricted to a rigid schema of

rows and columns.

● Description: Stores data in flexible, semi-structured

documents (e.g., JSON, XML) rather than rigidly structured

tables.

● Example: A content management system where each

document (web page, article) is stored as a JSON object with

attributes (title, content).


Document data models are best fit for use cases requiring a flexible

schema and fast data access. E.g. nested documents enable

applications to store related pieces of information in the same

database record in a denormalized manner. As a result, applications

can issue fewer queries and updates to complete common operations.

Transaction Management
Transactions are a set of operations used to perform a logical set of work. A
transaction usually means that the data in the database has changed. One of
the major uses of DBMS is to protect the user data from system failures. It is
done by ensuring that all the data is restored to a consistent state when the
computer is restarted after a crash. The transaction is any one execution of the
user program in a DBMS. One of the important properties of the transaction is
that it contains a finite number of steps. Executing the same program multiple
times will generate multiple transactions.

Example: Consider the following example of transaction operations performed to withdraw cash from an ATM vestibule.

Steps for ATM Transaction

1. Transaction Start.
2. Insert your ATM card.
3. Select a language for your transaction.
4. Select the Savings Account option.
5. Enter the amount you want to withdraw.
6. Enter your secret pin.
7. Wait for some time for processing.
8. Collect your Cash.
9. Transaction Completed.

A transaction can include the following basic database access operations.

● Read/Access data (R): Accessing a database item from disk (where the database stores data) into a memory variable.
● Write/Change data (W): Writing the data item from the memory variable back to disk.
● Commit: Commit is a transaction control statement used to permanently save the changes made in a transaction.
Example: Transfer of 50₹ from Account A to Account B. Initially A = 500₹, B = 800₹. This data is brought into RAM from the hard disk.

R(A) -- 500   // A is read from disk into RAM.

A = A - 50    // Deducting 50₹ from A.

W(A) -- 450   // Updated value of A in RAM.

R(B) -- 800   // B is read from disk into RAM.

B = B + 50    // 50₹ is added to B's account.

W(B) -- 850   // Updated value of B in RAM.

commit        // The data in RAM is written back to the hard disk.

(Figure: stages of the transaction.)

Note: The updated value of Account A = 450₹ and Account B = 850₹.

All instructions before committing come under a partially committed state and
are stored in RAM. When the commit is read the data is fully accepted and is
stored on a Hard Disk.
If the transaction fails anywhere before committing, we have to go back and start from the beginning; we cannot continue from the same state. This is known as Roll Back.
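The same transfer can be written as a single SQL transaction; the accounts table below is an assumption used only for illustration:

START TRANSACTION;                                                   -- BEGIN in some systems
UPDATE accounts SET balance = balance - 50 WHERE account_id = 'A';   -- A: 500 -> 450
UPDATE accounts SET balance = balance + 50 WHERE account_id = 'B';   -- B: 800 -> 850
COMMIT;                                                              -- changes become permanent on disk
-- If anything fails before COMMIT, issuing ROLLBACK undoes both updates,
-- which is exactly the Roll Back described above.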

Desirable Properties of Transaction (ACID Properties)

Transaction management in a Database Management System (DBMS)


ensures that database transactions are executed reliably and follow ACID
properties: Atomicity, Consistency, Isolation, and Durability. These
principles help maintain data integrity, even during failures or concurrent user
interactions, ensuring that all transactions are either fully completed or rolled
back if errors occur.

For a transaction to be performed in DBMS, it must possess several properties


often called ACID properties.

● A – Atomicity
● C – Consistency
● I – Isolation
● D – Durability

Transaction States

Transactions can be implemented using SQL queries and servers. (Figure: diagram of transaction states: active, partially committed, committed, failed, and aborted.)

The transaction has four properties. These are used to maintain consistency in
a database, before and after the transaction.
Property of Transaction:

● Atomicity
● Consistency
● Isolation
● Durability

Atomicity

● States that all operations of the transaction take place as a single unit; if not, the transaction is aborted.
● There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated as one unit and either runs to completion or is not executed at all.
● Atomicity involves the following two operations:
● Abort: If a transaction stops or fails, none of the changes it made will be saved or visible.
● Commit: If a transaction completes successfully, all the changes it made will be saved and visible.

Consistency

● The rules (integrity constraints) that keep the database accurate and consistent are satisfied before and after a transaction.
● When a transaction is completed, it leaves the database either as it was before or in a new stable state.
● This property means every transaction works with a reliable and consistent version of the database.
● A transaction transforms the database from one consistent state to another consistent state.

Isolation

● The data being used during the execution of one transaction cannot be used by a second transaction until the first one is completed.
● In isolation, if transaction T1 is being executed and is using the data item X, then that data item cannot be accessed by any other transaction T2 until transaction T1 ends.
● The concurrency control subsystem of the DBMS enforces the isolation property.

Durability

● The durability property refers to the permanence of the database's consistent state. It states that once a transaction completes, the changes it made are permanent.
● These changes cannot be lost due to the erroneous operation of a faulty transaction or a system failure. When a transaction is completed, the database reaches a consistent state, and that state cannot be lost even in the event of a system failure.
● The recovery subsystem of the DBMS is responsible for the durability property.

Implementing Atomicity and Durability

The recovery-management component of a database system can support atomicity and durability through a variety of schemes, e.g., the shadow-database scheme:

Shadow copy

● In the shadow-copy scheme, a transaction that wants to update the database first creates a complete copy of the database.
● All updates are done on the new copy, leaving the original copy, the shadow copy, untouched. If at any point the transaction has to be aborted, the system merely deletes the new copy; the old copy of the database is unaffected.
● This scheme is based on making copies of the database, called shadow copies, and assumes that only one transaction is active at a time.
● The scheme also assumes that the database is simply a file on disk. A pointer called db-pointer is maintained on disk; it points to the current copy of the database.

Transaction Isolation Levels in DBMS

If a transaction fails, other transactions may already have used values produced by it, so those transactions may also have to be rolled back. To control how much concurrently executing transactions can see of each other's work, the SQL standard defines four isolation levels:

● Read Uncommitted: The lowest isolation level. At this level, one transaction may read not-yet-committed changes made by another transaction, thereby allowing dirty reads. Transactions are not isolated from each other.
● Read Committed: This isolation level guarantees that any data read is committed at the moment it is read, so it does not allow dirty reads. The transaction holds a read or write lock on the current row, preventing other transactions from reading, updating, or deleting it.
● Repeatable Read: A more restrictive isolation level. The transaction holds read locks on all rows it references and write locks on all rows it inserts, updates, or deletes. Since other transactions cannot read, update, or delete these rows, non-repeatable reads are avoided.
● Serializable: The highest isolation level. A serializable execution is guaranteed to be equivalent to some serial execution, i.e., concurrently executing transactions appear to be executing serially.
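In standard SQL the isolation level is selected per transaction, for example as follows (exact syntax and default levels differ slightly between systems; the accounts table is the illustrative one used earlier):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;   -- or READ UNCOMMITTED / READ COMMITTED / SERIALIZABLE
START TRANSACTION;
SELECT balance FROM accounts WHERE account_id = 'A';   -- re-reading this row inside the transaction returns the same value
COMMIT;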

Failure Classification

To find that where the problem has occurred, we generalize a failure into the
following categories:

● Transaction failure
● System crash
● Disk failure

1. Transaction failure

A transaction failure occurs when a transaction fails to execute or reaches a point from which it cannot proceed any further. If only a few transactions or processes are affected, this is called a transaction failure.

Reasons for a transaction failure include:

1. Logical errors: If a transaction cannot complete due to a code error or an internal error condition, a logical error occurs.
2. System errors: These occur when the DBMS itself terminates an active transaction because the database system is not able to execute it. For example, the system aborts an active transaction in case of deadlock or resource unavailability.

2. System Crash

System failure can occur due to power failure or other hardware or software
failure. Example: Operating system error.

● Fail-stop assumption: In the system crash, non-volatile storage is


assumed not to be corrupted.

3. Disk Failure

● Disk failures occur when hard-disk drives or storage drives fail. This was a common problem in the early days of technology evolution.
● A disk failure can be caused by the formation of bad sectors, a disk head crash, unreachability of the disk, or any other failure that destroys all or part of disk storage.

Serializability

Serializability is an important aspect of transactions. In simple terms, serializability is a way to check whether transactions working concurrently on a database maintain database consistency.
It is of two types:

1. Conflict Serializability
2. View Serializability

Schedule

A schedule, as the name suggests, is a process of lining up transactions and
executing them one by one. When multiple transactions run concurrently and the
order of operations needs to be set so that the operations do not overlap each
other, scheduling is brought into play and the transactions are timed accordingly.

It is of two types:

1. Serial Schedule
2. Non-Serial Schedule

Uses of Transaction Management

● The DBMS is used to schedule concurrent access to data. This
means that multiple users can access data in the database
without interfering with each other. Transactions are used to
manage this concurrency.
● It is also used to satisfy the ACID properties.
● It is used to solve read/write conflicts.
● It is used to implement Recoverability, Serializability, and
Cascading (of rollbacks).
● Transaction Management is also used for Concurrency Control
Protocols and the locking of data. A simple transaction example
in SQL follows this list.
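
As a minimal sketch of a transaction grouping several operations into one unit, assume a hypothetical account table with columns acc_no and balance (transaction keywords such as START TRANSACTION/BEGIN differ slightly across systems):

    START TRANSACTION;                                     -- one logical unit of work
    UPDATE account SET balance = balance - 15000 WHERE acc_no = 'A';
    UPDATE account SET balance = balance + 15000 WHERE acc_no = 'B';
    COMMIT;                                                -- both updates become durable together
    -- If any step fails, issuing ROLLBACK undoes every change made so far,
    -- which is how atomicity and consistency are preserved.

Either both accounts are updated or neither is, which is exactly the guarantee the ACID properties describe.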

Advantages of using a Transaction

● Maintains a consistent and valid database after each transaction.


● Makes certain that updates to the database do not affect its
dependability or accuracy.
● Enables simultaneous use by numerous users without sacrificing data
consistency.

Disadvantages of using a Transaction

● It may be difficult for end-users to change information within the
transactional database directly.
● On failure, we always need to roll back and start from the beginning rather than
continue from the previous state.

Conclusion

In DBMSs, transaction management is crucial to preserving data integrity. To
guarantee dependable operations, it upholds the ACID (Atomicity, Consistency,
Isolation, Durability) properties. A key component of reliable database systems,
transactions enable the grouping of several operations into a single unit while
providing data consistency and protection against the problems of concurrent access.

Database Users
The people who use the system are called users. Users can be categorized into those who actually
use and manage the data (referred to as "Actors on the Scene") and those who make it
possible to create the database and the DBMS software (referred to as "Workers Behind the
Scene").

Users are put into different groups based on how they want to use the system. There are four
different kinds of database users:

1. Naive Users:
Naive users are people who don't know much about computers and use the system by
invoking one of the application programs that have already been written. For example, a
bank teller uses a program called "initiate_transfer" to move Rs. 15,000 from account A to
account B. This program asks the teller for the amount of money that needs to be moved,
the account that the money is coming from, and the account that the money is going to.
2. Application programmers:
These are people who work in computers and write programs for applications. Application
programmers have a lot of tools to choose from when making user interfaces. E.g., RAD
(Rapid Application Development) tools let an application programmer make forms and
reports without having to write a program.
3. Sophisticated users:
These people know how to use the system without writing programs. They write their
requests in a language for talking to databases. They send each query to a query
processor, whose job is to turn DML statements into instructions that the storage
manager can understand. This group is made up of analysts who use queries to look at
data in the database.
For example, Online Analytical Processing (OLAP) tools make analysts' jobs easier by
letting them see data summaries in different ways. For example, analysts can see total
products by categories, by no. of sales, or by a combination of categories and no. of
sales.
4. Specialized users:
Specialized users are advanced users who write database programs that don't fit into the
traditional way of processing data. Among these applications are computer-aided design
systems, knowledge base and expert systems, systems that store data with complex data
types (like audio/visual data), and systems that model the environment.

Database Administrators
One of the main reasons to use DBMS is to have centralized control over both the data and the
programs that access the data. The person in charge of the system as a whole is termed the
database administrator (DBA). The following are some of the duties of the DBA:

1. Defining Schema:
The DBA makes the original database schema by writing a set of definitions. The DDL
compiler translates these definitions into a set of tables, which are then stored in the data
dictionary.
2. Defining Storage Structure and Access Methods:
The DBA makes the right storage structures and access methods by writing a set of
definitions that the data-storage and data-definition language compiler translates.
3. Modification of Schema and Physical Organization:
The DBA makes changes to the schema and physical organization to reflect how the
organization's needs change or to change the physical organization to make it run better.
4. Giving Permission to Access Data:
The database administrator can control which parts of the database different users can
access by granting different types of permission (see the GRANT example after this list).
The information about who has permission to access the data is kept in a special system
structure that is checked by the database system every time someone tries to access the data.
5. Fulfilling Integrity-Constraint Requirements:
The values of the data stored in the database must meet certain consistency
requirements. The database administrator must tell the database about this constraint.
The integrity constraints are kept in a special structure that the database system checks
every time it makes a change.
6. Regular Maintenance Tasks:
DBA regularly needs to back up the database, either to tapes or to other servers, in order
to avoid data loss in the event of emergencies. They also need to ensure that there is
sufficient free disk space for standard operations and need to update disk space as
necessary. DBA should also monitor database jobs to make sure performance is not
harmed by extremely expensive tasks supplied by some users.
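
As a sketch of duty 4 above (giving permission to access data), most SQL systems provide GRANT and REVOKE statements; the user name teller1 and the account table here are assumed for illustration:

    -- Allow teller1 to read and insert rows in the account table,
    -- but not to update or delete existing rows.
    GRANT SELECT, INSERT ON account TO teller1;

    -- Withdraw the INSERT privilege later if it is no longer required.
    REVOKE INSERT ON account FROM teller1;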

Data Dictionary in DBMS


Introduction

A data dictionary in a relational database stores metadata: information about tables, their
columns, and their relationships. It helps users organize data effectively and reduces redundancy.

What is a data dictionary in a database?

Just as an ordinary dictionary is a collection of words and their definitions, a data
dictionary is a collection of the names, attributes, and definitions of the data
elements used in a database. The data dictionary stores information such as what is
in the database, who is allowed to use it, and so on.
Why Use a Data Dictionary?

A data dictionary works as a catalogue of data which provides data-related information,


like which data is used in a particular database management system (DBMS). It also
includes information about the data's meaning, format, and usage. Data dictionaries are
important because they help in managing data quality, ensuring data consistency, and
facilitating data sharing.

Sample Example

Let's take an example of a data dictionary in DBMS. Suppose we have a table with the
name Person, which has the following attributes: PersonID, FirstName, LastName, and City.
The data dictionary entry for this table lists the complete details of the Person table
under columns such as Field, Type, Null, Key, and Default.
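
For instance, in MySQL the dictionary entry for the table can be viewed with DESCRIBE; the data types shown in the commented output below are assumed purely for illustration:

    DESCRIBE Person;
    -- Field     | Type        | Null | Key | Default
    -- PersonID  | int         | NO   | PRI | NULL
    -- FirstName | varchar(50) | YES  |     | NULL
    -- LastName  | varchar(50) | YES  |     | NULL
    -- City      | varchar(50) | YES  |     | NULL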
Types of Data Dictionary

There are mainly two types of Data Dictionary:

Integrated Data Dictionary

Each relational database contains an integrated data dictionary (system catalogue) that is part of the
DBMS. The system catalogue is maintained by the DBMS itself as the database is used.
This integrated data dictionary is divided into two sub-types:

● Active: The active dictionary updates itself automatically.

● Passive: The passive dictionary needs to be updated manually.

Stand Alone Data Dictionary

A stand-alone data dictionary is maintained separately from the DBMS and permits the
administrator to manage all the required metadata. The elements it commonly records are as follows:

● Tables

● Index

● Programs

● Admin and End-users

● Data elements
● Relation between data elements

Need of Data Dictionary

Let's discuss the need of the data dictionary in points:

● The data dictionary is essential for proper knowledge of the database's content,
as the information provided by data models alone is insufficient or offers fewer
details.

● It helps the user to analyze the data easily.

● It is also helpful to maintain consistency if you work on multiple projects at a


time.

● It also provides information related to attributes, relationships, and entities.

How to Create a Data Dictionary?

A data dictionary is a crucial component of any data management process. It serves as


a comprehensive guide that defines the structure, organization, and characteristics of
data within a dataset or database. Creating a data dictionary involves several key steps:

● Identify Data Elements: Begin by identifying all the data elements within
your dataset or database. These can include fields, columns, tables, or any
other relevant units of data.
● Document Data Attributes: For each data element, document its attributes
such as name, description, data type, length, format, and any constraints or
validations applied to it.
● Define Relationships: If your dataset contains multiple tables or entities,
define the relationships between them. Document how different data
elements relate to each other, whether through foreign keys, primary keys,
or other linkages.
● Document Business Rules: Record any business rules or logic that govern
the use and interpretation of the data. This can include data validation rules,
calculations, transformations, and any other business-specific guidelines.
● Include Metadata: Incorporate metadata such as creation date, last updated
timestamp, owner information, and any other relevant administrative details.
● Organize and Format: Organize the data dictionary in a clear and intuitive
manner, making it easy to navigate and reference. You can use tables,
diagrams, or other visual aids to enhance readability.
● Review and Validate: Review the data dictionary thoroughly to ensure
accuracy and completeness. Validate the information with stakeholders,
subject matter experts, and other relevant parties to verify its correctness.
● Maintain Documentation: Data dictionaries are living documents that
should be updated regularly to reflect any changes or additions to the data
structure. Establish a process for ongoing maintenance and version control.

Notations of Data Dictionary

There are many notations that help the user to create the Data dictionary. Let's discuss
them.

● The first notation is "=". It stands for "is composed of".

● The next notation is "+". It is used to denote the Sequence and stands for
"AND".

● Selection is the next notation denoted by "[ | ]". It stands for "OR".
● The next data construct is Parentheses, denoted by "()". It is used to
represent optional data.

● Repetition is denoted by {}n and known as "n repetitions".

● The last notation is Comment, denoted by "*…*" and stands for "to define a
comment".

Challenges with Data Dictionary

There are many challenges with the Data dictionary. Let's discuss them.

● Data dictionaries become outdated very quickly if the data in the database
keeps on changing frequently.

● Data dictionaries are complex and time-consuming to maintain, especially if


we have a large database containing a lot of data.

● Data dictionaries are difficult to use, especially for users who are not familiar
with them.

● Data dictionaries may contain some sensitive information about the data,
and if they are not properly secured, they can be accessed by unauthorized users.

Advantages of using a Data Dictionary

The pros of the data dictionary are as follows:


● It reduces data redundancy.

● It helps the user to maintain data integrity even if they are working on
multiple projects.

● It offers information about the relationships between different database
tables.

● It helps the user to understand the structure of the system's data requirements easily.

● It is helpful while creating the naming convention of models.

Disadvantages of using a Data Dictionary

The cons of the data dictionary are as follows:

● There is only a small amount of functional detail present in the data dictionary.

● The non-tech user might face difficulties while using the data dictionary for
the first time.

● Most of the time, relational diagrams of the data dictionary do not look
visually appealing.

Databases
Data is generated with every digital activity on a daily basis. A database is a system for

storing, managing, and retrieving data. It organises information in a way that makes it
easy to access and use for various applications. For this article, we need to understand

two types of databases:

Centralised Databases
In a centralised database, all data is stored in a single location, such as a server. Users

can access this database through a network. Because everything is in one place,

centralised databases are easier to manage and secure. However, if the server fails, the

entire system may stop working. One example is a bank storing all of its customer data

on a central server.

Decentralised Databases
A decentralised database stores data across multiple locations or servers. Each server

holds part of the data and works independently. This system is more reliable because

even if one server fails, others continue to function. However, managing decentralised

databases can be more complex. A good example is a blockchain network where data

is distributed across many nodes.

What is DBMS Architecture


The DBMS architecture refers to the structural design and interconnected components

that manage and maintain databases efficiently. One widely used approach is the
client-server architecture, where client and server components are separated to

streamline data handling, application logic, and user interactions.

At the heart of any database system structure are several essential components:
Disk Storage
Disk storage is a critical component of a database management system (DBMS). It

stores data permanently, ensuring it's accessible even after a system shutdown.

Efficient use of disk storage is key to managing, retrieving, and organising large

datasets. Disk storage in DBMS involves:

1. Data Files: Data files hold the actual user data in a structured format. These files
store all the records, transactions, and information required by the application.
They are designed for optimised storage and fast retrieval.
2. Data Dictionary: The data dictionary is also known as metadata storage and
contains detailed information about the database structure. It records table
names, column types, constraints, and relationships. This acts as a reference for
the DBMS so it can understand how data is organised and used.
3. Indices: Indices are like shortcuts for data access. They create pointers for
specific rows or records, which enables the DBMS to locate data quickly without
scanning the entire database. Indices are necessary to greatly improve query
performance and are especially needed for large datasets.
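
A brief sketch of how an index is declared and then used implicitly by queries; the customer table and its columns are assumed for illustration:

    -- Create an index on the column that is searched most often.
    CREATE INDEX idx_customer_city ON customer (city);

    -- This query can now be answered through the index instead of
    -- scanning every row of the customer table.
    SELECT cust_id, cust_name FROM customer WHERE city = 'Bhopal';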

Query Processor
The query processor is an important component of a DBMS. It interprets user queries

and executes them precisely. The query process involves several specialised

components that handle different tasks to convert high-level queries into low-level

instructions.

1. DML Compiler: The Data Manipulation Language (DML) compiler processes


user commands like SELECT, INSERT, UPDATE, and DELETE. It converts
these high-level queries into low-level instructions that the DBMS can understand
and execute.
2. DDL Interpreter: The Data Definition Language (DDL) interpreter handles
statements like CREATE, ALTER, and DROP. It processes commands that
define or modify the structure of the database, such as creating tables or
changing column properties.
3. Embedded DML Pre-compiler: It integrates DML statements embedded in programming
languages like C, Java, or Python. It translates embedded SQL queries into a
format that the DBMS can process, which is necessary for proper
communication between applications and the database.
4. Query Optimizer: The query optimiser is responsible for improving query
performance. It evaluates multiple ways to execute a query and selects the most
efficient one. This reduces processing time, so the query execution is done
quickly.

Storage Manager
The storage manager is an essential part of a database management system. It handles

how data is stored, retrieved, and managed on physical storage devices like disks. The

storage manager is responsible for data integrity, security, and efficient access. It

includes the following key components:

1. Authorisation Manager: It’s a check that only lets authorised users access or
modify the database. It verifies user credentials and permissions to prevent
unauthorised access, so the database is secure.
2. Integrity Manager: The integrity manager enforces rules that maintain the
correctness of the data. For example, it ensures no duplicate primary keys exist
and that foreign key constraints are followed. This prevents invalid or inconsistent
data from entering the database (see the constraint example after this list).
3. Transaction Manager: The transaction manager oversees all database transactions. It ensures that each
transaction is completed fully or not at all, which is needed for maintaining the
database's reliability. This includes handling operations like rollbacks in case of
failures and maintaining data consistency.
4. File Manager: The file manager organises and manages data on physical
storage devices. It handles how data files are created, read, and written. This
component is necessary for efficient storage utilisation and supports large-scale
data management.
5. Buffer Manager: The buffer manager controls the flow of data between main
memory (RAM) and disk storage. It stores frequently accessed data temporarily
in memory to speed up data retrieval and reduce disk access time.
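
A small sketch of the declarative rules that the integrity manager enforces, using assumed table and column names:

    CREATE TABLE department (
        dept_id   INT PRIMARY KEY,                   -- no duplicate or NULL department ids
        dept_name VARCHAR(50) NOT NULL
    );

    CREATE TABLE employee (
        emp_id   INT PRIMARY KEY,
        emp_name VARCHAR(50) NOT NULL,
        salary   DECIMAL(10,2) CHECK (salary >= 0),  -- rule on permitted values
        dept_id  INT REFERENCES department(dept_id)  -- foreign-key rule between tables
    );

Any INSERT or UPDATE that violates one of these constraints is rejected before it can make the database inconsistent.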

Concurrency control
Concurrency control enables multiple users to access a database at the same time

without causing errors or data issues. It prevents problems like data loss,

inconsistencies, or system crashes by managing how transactions are handled. A

transaction, which is a single logical unit of work that retrieves or modifies data, can
cause delays if transactions are executed one at a time. This increases the waiting time for other

transactions and slows overall performance. To improve throughput and reduce delays,

transactions are executed concurrently. It allows the database to handle multiple tasks

efficiently.

Together with communication interfaces and concurrency control mechanisms, these

components form the backbone of relational database architecture. They work in

harmony to ensure seamless user interactions, secure data access, and efficient

database operations.
Levels of DBMS Architecture
A Database Management System (DBMS) is organised into three main levels: the

Internal Level, the Conceptual Level, and the External Level.

1. Internal Level
This level handles the physical storage of data within the database. It focuses on
how data is stored and retrieved from storage devices, such as hard drives or
solid-state drives. The internal level is responsible for low-level operations like
data compression, indexing, and managing storage allocation.
2. Conceptual Level:
The conceptual level represents the logical organisation of the data. It defines the
structure and relationships between data elements, including tables, attributes,
and their links. This level is independent of any specific DBMS and ensures that
the data schema can be used across different systems without affecting the
underlying database implementation.
3. External Level:
This level represents how users interact with the database. It defines the user’s
view of the data and presents it in a way that is meaningful to them. Users can
access the database through customised views or interfaces that focus on their
specific needs. They can do all this without being concerned about the
database's internal structure or logic.
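
As a hedged sketch of the external level, a view can expose only the part of the conceptual schema that a particular user group needs; the student table and its columns are assumed here:

    -- A simplified view for office staff that hides fee and contact details
    -- stored in the underlying student table.
    CREATE VIEW student_basic AS
    SELECT roll_no, name, course
    FROM student;

    SELECT * FROM student_basic;   -- users query the view, not the base table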

Database Architecture vs. Tier Architecture


Database architecture refers to the overall design and structure of a database system. It

involves how data is organised, stored, managed, and accessed within the system. The

important elements of the architecture of database are data models, components like
the DBMS, and how different layers, such as the storage manager, query processor,

etc., interact to provide data management and access.

On the other hand, tier architecture refers to the way a database system is structured.

This structure is defined by how its components are distributed across different layers.

In tier architecture, the database is divided into multiple layers, such as single-tier,
two-tier, or three-tier systems. These tiers create separation of the user interface,

application logic, and database to improve scalability, security, and maintenance.

The main difference to remember between the architecture of database and tier is that

database architecture focuses on the structure and components of the database system

itself. The tiered architecture deals with how these components are distributed across

different layers.

Types of Tier DBMS Architecture


Database management systems (DBMS) are designed with multiple levels of

abstraction to ensure efficient functioning. These layers help define the structure and
operation of the DBMS architecture.

Since users and applications do not always interact directly with the database, various
architectures of a database system are used based on how users are connected.

These architectures are tier-based, meaning the architecture of DBMS is categorised by

the number of layers within its structure.

For instance, in an n-tier DBMS architecture, the system is divided into n

independent layers:
● 1-tier architecture in DBMS has a single layer where the database and user
interface coexist.
● Two-tier DBMS architecture separates the system into two layers, typically a
client and a server.
● Three-tier architecture introduces a middle layer, such as an application server,
for additional processing.

As the number of layers increases, the level of abstraction also increases, enhancing

security and adding complexity to the database system structure. Each layer in the

application architecture in DBMS operates independently. Therefore, modifications to

one layer do not affect the others, ensuring flexibility and maintainability in the system

design. Let’s look at each one in more detail:

1. Single Tier Architecture in DBMS


In single-tier architecture in DBMS, the database is directly accessible to the user.

The user interacts with the DBMS architecture directly, making changes that

immediately reflect on the database. This architecture does not offer user-friendly tools

for end-users and is ideal for local application development where programmers need

quick, direct access to the database.

This type of architecture of a database system is best suited for scenarios where:

● Data is rarely modified.


● Only a single user is accessing the database.
● A straightforward way to interact with or modify the database is required.

Example:

When learning Structured Query Language (SQL), developers often install an SQL

server and set up a database on their local machine. This allows direct execution of
SQL queries without a network connection. This setup is a 1-tier architecture in

DBMS.
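
A minimal sketch of such a local single-tier session, where the developer issues SQL statements directly against the locally installed server (the database and table names are illustrative; some systems need a USE statement to switch to the new database):

    CREATE DATABASE practice_db;          -- everything lives on the developer's machine
    USE practice_db;                      -- MySQL-style; other systems reconnect instead

    CREATE TABLE student (
        roll_no INT PRIMARY KEY,
        name    VARCHAR(50)
    );

    INSERT INTO student VALUES (1, 'Asha');
    SELECT * FROM student;                -- results come straight back, with no middle layer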

Advantages of Single-Tier Architecture:

1. Simplicity: Easy to set up since it only requires a single machine.


2. Cost-Effective: No need for additional hardware, making it budget-friendly.
3. Easy Implementation: Quick deployment, making it ideal for small projects or
development environments.

2. Two-Tier Architecture in DBMS


In two-tier architecture, the system follows a basic client-server architecture. The
client-side applications directly communicate with the server-side database using APIs like

ODBC (Open Database Connectivity) or JDBC (Java Database Connectivity). The user

interface and application logic run on the client side, while the server handles tasks like

query processing and transaction management. In this DBMS architecture, the client

application establishes a connection to the server to interact with the database.

API Call
APIs in the architecture of database act as intermediaries which allow the client to

send requests to the server for tasks like getting, updating, or removing data. They

convert simple commands from the client into instructions the database can understand.

APIs make it easy to work with different databases through standardisation and also
keep the connection secure by including features for authentication and encryption.
Example:

Imagine withdrawing cash at a bank. The banker enters your account details and

withdrawal amount into the system. The client application (banker’s interface) sends a

request to the server-side database to check your balance and process the transaction.

This setup is a classic example of two-tier DBMS architecture.

Advantages of Two-Tier Architecture:

1. Supports Multiple Users: Suitable for organisations as multiple users can


access the database simultaneously.
2. High Processing Power: The server handles database functions, improving
performance.
3. Faster Access: Direct connection between client and server ensures quick data
retrieval.
4. Easy Maintenance: With two distinct layers, updates and maintenance become
simpler.

3. Three-Tier Architecture in DBMS :


In a three-tier architecture, there is an additional layer between the client and the server,

known as the application server. The client does not directly communicate with the

database. Instead, client-side applications interact with the application server, which

then communicates with the database server. This layer of separation ensures that the

end-user is unaware of the database's internal details, and the database remains

insulated from direct client interactions.

Working of Application Layer


The application layer acts as an intermediary between the client and the database. It

processes requests from the client, then performs necessary business logic, and finally

sends the appropriate queries to the database. This separation means the client interacts
only with the application layer, which keeps the database secure and independent of
direct user interactions.


Example:

Consider an online shopping platform like Amazon. When a user places an order, the

client-side interface (website or app) sends the request to an application server. The

application server processes the order, verifies stock availability, and updates the

database accordingly. The client never communicates directly with the database; the

application server handles all interactions.

Advantages of Three-Tier Architecture:

1. Scalability: The application server can manage load balancing, allowing support
for numerous clients without impacting database performance.
2. Data Integrity: The application layer filters and validates client requests,
reducing the risk of data corruption or erroneous queries.
3. Security: By removing direct access to the database, the architecture minimises
the chance of unauthorised access and enhances security.
Structure of Database Management System
A Database Management System (DBMS) is software that allows users to

define, store, maintain, and manage data in a structured and efficient manner.

It acts as an intermediary between data and users, allowing disparate data

from different applications to be managed. A DBMS simplifies the complexity

of data processing by providing tools to organize data, ensure its integrity, and

prevent unauthorized access or loss of data.

In today’s data-driven world, DBMS are essential for applications such as

banking systems, e-commerce platforms, education, and medical systems.

They not only store and manage large amounts of data, but also provide
functionality that ensures performance, security, and scalability for multiple
users with multiple access levels.

It also allows access to data stored in a database and provides an easy and

effective method of:

● Defining the information.

● Storing the information.

● Manipulating the information.

● Protecting the information from system crashes or data theft.

● Differentiating access permissions for different users.


Understanding Data Theft in DBMS

Data theft means the illicit extraction or manipulation of sensitive information

stored in databases, servers, and other storage systems. This is further

defined, in DBMS, as improper access to confidential or sensitive data by

unauthorized persons.

This may include information such as personal data, financial records,

intellectual property, or trade secrets. As digital data storage has grown, so
has the threat of data theft; it is now a primary concern with serious
impacts on organizations worldwide.

Data theft can be carried out by, among others:

● Hacking and exploiting: Attackers can use DBMS security gaps to
gain unauthorized access to sensitive data.

● Insider threats: Employees or contractors misuse their privileged
access to information.

● Phishing and social engineering: These techniques trick authorized
users into revealing their login credentials, enabling intrusion.

● Malware and ransomware attacks: Malicious software can exploit
database security vulnerabilities, giving attackers access to steal
data or to lock it down until some amount of ransom is paid.
Data theft prevention matters not only for protecting sensitive information but
also for building trust between businesses and clients. Access controls, periodic
audits, and real-time monitoring of database activity are effective measures to
reduce the risk. Also, following cyber-security protocols and regularly updating
database systems will remove most of the vulnerabilities.

Database Architecture vs. Tier Architecture

Structure of Database Management System is also referred to as Overall

System Structure or Database Architecture but it is different from the Tier

architecture of Database.

● Database Architecture refers to the internal components of the

DBMS, including the Query Processor, Storage Manager, and Disk

Storage. It also defines the interaction of these components.

● Tier Architecture typically refers to the multi-layered setup in an

application where DBMS serves as the data layer, but it is distinct

from Database Architecture, which refers to the internal structure

and levels (internal, conceptual, and external) of the DBMS.

Components of a Database System

A database system has three main components: the Query Processor, the Storage
Manager, and Disk Storage. These are explained below.
Architecture of DBMS

1. Query Processor

It interprets the requests (queries) received from end users via an application
program into instructions. It also executes the user requests received
from the DML compiler. The Query Processor contains the following components:


● DML Compiler: It processes the DML statements into low level

instruction (machine language), so that they can be executed.

● DDL Interpreter: It processes the DDL statements into a set of table

containing meta data (data about data).

● Embedded DML Pre-compiler: It processes DML statements

embedded in an application program into procedural calls.

● Query Optimizer: The Query Optimizer takes the instructions
generated by the DML Compiler and improves query execution
efficiency by choosing the best query plan, considering factors such
as indexing, join order, and available system resources. For instance,
if a query involves joining two large tables, the optimizer will select
the best join order to minimize query execution time.

2. Storage Manager

Storage Manager is an interface between the data stored in the database and

the queries received. It is also known as Database Control System. It maintains

the consistency and integrity of the database by applying the constraints and

executing the DCL statements. It is responsible for updating, storing, deleting,

and retrieving data in the database. It contains the following components:

● Authorization Manager: It ensures role-based access control, i.e.,
it checks whether a particular user is privileged to perform the
requested operation or not.

● Integrity Manager: It checks the integrity constraints when the

database is modified.

● Transaction Manager: It controls concurrent access by performing

the operations in a scheduled way that it receives the transaction.

Thus, it ensures that the database remains in the consistent state

before and after the execution of a transaction.

● File Manager: It manages the file space and the data structure used

to represent information in the database.

● Buffer Manager: It is responsible for cache memory and the

transfer of data between the secondary storage and main memory.

3. Disk Storage

It contains the following essential components:

● Data Files: It stores the actual data in the database.

● Data Dictionary: It contains information about the structure of
database objects such as tables, constraints, and relationships. It is
the repository of the database's metadata.


● Indices: Provides faster data retrieval by allowing the DBMS to find

records quickly, improving query performance.

Levels of DBMS Architecture

The structure of a Database Management System (DBMS) can be divided into

three main components: the Internal Level, the Conceptual Level, and the

External Level.

1. Internal Level

This level represents the physical storage of data in the database. It is

responsible for storing and retrieving data from the storage devices, such as

hard drives or solid-state drives. It deals with low-level implementation

details such as data compression, indexing, and storage allocation.

2. Conceptual Level

This level represents the logical view of the database. It deals with the overall

organization of data in the database and the relationships between them. It

defines the data schema, which includes tables, attributes, and their

relationships. The conceptual level is independent of any specific DBMS and

can be implemented using different DBMSs.


3. External Level

This level represents the user’s view of the database. It deals with how users

access the data in the database. It allows users to view data in a way that

makes sense to them, without worrying about the underlying implementation

details. The external level provides a set of views or interfaces to the database,

which are tailored to meet the needs of specific user groups.

Schema Mapping in DBMS

The three levels are connected via schema mapping, ensuring that changes at

one level (e.g., the conceptual level) are accurately reflected in the others. This

process maintains data independence, allowing changes in physical storage

(internal level) without affecting the logical or user views.

Role of Database Administrator (DBA)

In addition to these three levels, a DBMS also includes a Database

Administrator (DBA) component, which is responsible for managing the

database system. The DBA performs critical tasks such as:

● Database design and architecture.


● Security management: Implementing role-based access control

(RBAC), encryption, and ensuring strong authentication measures

such as multi-factor authentication (MFA).

● Backup and recovery: Regularly creating backups and preparing

recovery plans in case of data loss.

● Performance tuning: Optimizing database performance, including

query optimization, indexing, and resource management to ensure

the DBMS runs efficiently.

Introduction of ER Model
The Entity Relationship Model is a model for identifying entities (like student,

car or company) to be represented in the database and representation of how

those entities are related. The ER data model specifies enterprise schema that

represents the overall logical structure of a database graphically.

We typically follow the below steps for designing a database for an

application.

● Gather the requirements (functional and data) by asking questions

to the database users.


● Create a logical or conceptual design of the database. This is where

ER model plays a role. It is the most used graphical representation

of the conceptual design of a database.

● After this, focus on Physical Database Design (like indexing) and

external design (like views)

Why Use ER Diagrams In DBMS

● ER diagrams represent the E-R model in a database, making them

easy to convert into relations (tables).

● ER diagrams serve the purpose of modeling real-world objects,
which makes them immensely useful.

● ER diagrams require no technical knowledge of the underlying

DBMS used.

● They give a standard solution for visualizing the data logically.

Symbols Used in ER Model

ER Model is used to model the logical view of the system from a data

perspective which consists of these symbols:

● Rectangles: Rectangles represent entities in the ER Model.

● Ellipses: Ellipses represent attributes in the ER Model.

● Diamond: Diamonds represent relationships among Entities.


● Lines: Lines connect attributes to entity types and entity sets to
their relationship types.

● Double Ellipse: Double ellipses represent multi-valued Attributes.

● Double Rectangle: Double rectangle represents a weak entity.

Symbols used in ER Diagram

Components of ER Diagram

ER Model consists of Entities, Attributes, and Relationships among Entities in

a Database System.

Components of ER Diagram

What is an Entity

An Entity may be an object with a physical existence: a particular person, car,

house, or employee or it may be an object with a conceptual existence – a

company, a job, or a university course.

What is an Entity Set


An entity refers to an individual object of an entity type, and the collection of

all entities of a particular type is called an entity set. For example, E1 is an

entity that belongs to the entity type “Student,” and the group of all students

forms the entity set. In the ER diagram below, the entity type is represented

as:

Entity Set

We can represent an entity set in an ER diagram but cannot represent an individual
entity, because an entity corresponds to a particular row in a relation, whereas an
ER diagram is a graphical representation of the overall structure of the data.


Types of Entity

There are two types of entity:

1. Strong Entity

A Strong Entity is a type of entity that has a key attribute. A strong entity does
not depend on any other entity in the schema. It has a primary key that helps
identify it uniquely, and it is represented by a rectangle. These are called
Strong Entity Types.

2. Weak Entity

Usually, an entity type has a key attribute that uniquely identifies each entity in the
entity set. However, some entity types exist for which a key attribute cannot be defined.
These are called Weak Entity Types.

For Example, A company may store the information of dependents (Parents,

Children, Spouse) of an Employee. But the dependents can’t exist without the

employee. So dependent will be a Weak Entity Type and Employee will be

identifying entity type for dependent, which means it is Strong Entity Type.

A weak entity type is represented by a double rectangle. The participation of

weak entity types is always total. The relationship between the weak entity
type and its identifying strong entity type is called identifying relationship and

it is represented by a double diamond.

Strong Entity and Weak Entity
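
When such a design is mapped to tables, the key of the weak entity is formed by combining its partial key with the key of its identifying strong entity. A hedged sketch with assumed column names:

    CREATE TABLE employee (
        emp_id   INT PRIMARY KEY,
        emp_name VARCHAR(50)
    );

    CREATE TABLE dependent (
        emp_id        INT,                    -- key of the identifying (strong) entity
        dep_name      VARCHAR(50),            -- partial key of the weak entity
        relation_type VARCHAR(20),
        PRIMARY KEY (emp_id, dep_name),       -- identified only together with the employee
        FOREIGN KEY (emp_id) REFERENCES employee(emp_id)
            ON DELETE CASCADE                 -- dependents cannot exist without the employee
    );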

What are Attributes

Attributes are the properties that define the entity type. For example, Roll_No,

Name, DOB, Age, Address, and Mobile_No are the attributes that define entity

type Student. In ER diagram, the attribute is represented by an oval.

Attribute

Types of Attributes

1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is called

the key attribute. For example, Roll_No will be unique for each student. In ER

diagram, the key attribute is represented by an oval with underlying lines.

Key Attribute

2. Composite Attribute

An attribute composed of many other attributes is called a composite

attribute. For example, the Address attribute of the student Entity type

consists of Street, City, State, and Country. In ER diagram, the composite

attribute is represented by an oval comprising of ovals.

Composite Attribute
3. Multivalued Attribute

An attribute consisting of more than one value for a given entity. For example,

Phone_No (can be more than one for a given student). In ER diagram, a

multivalued attribute is represented by a double oval.

Multivalued Attribute

4. Derived Attribute

An attribute that can be derived from other attributes of the entity type is

known as a derived attribute. e.g.; Age (can be derived from DOB). In ER

diagram, the derived attribute is represented by a dashed oval.

Derived Attribute

The Complete Entity Type Student with its Attributes can be represented as:
Entity and Attributes
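
A hedged sketch of how these attribute types are commonly mapped to a relational schema (column names assumed): composite attributes are flattened into their component columns, a multivalued attribute is moved to its own table, and a derived attribute such as Age is usually computed from DOB rather than stored.

    CREATE TABLE student (
        roll_no INT PRIMARY KEY,          -- key attribute
        name    VARCHAR(50),
        dob     DATE,                     -- Age is derived from dob, so it is not stored
        street  VARCHAR(50),              -- parts of the composite attribute Address
        city    VARCHAR(30),
        state   VARCHAR(30),
        country VARCHAR(30)
    );

    CREATE TABLE student_phone (          -- multivalued attribute Phone_No
        roll_no  INT REFERENCES student(roll_no),
        phone_no VARCHAR(15),
        PRIMARY KEY (roll_no, phone_no)
    );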

Relationship Type and Relationship Set

A Relationship Type represents the association between entity types. For

example, ‘Enrolled in’ is a relationship type that exists between entity type

Student and Course. In ER diagram, the relationship type is represented by a

diamond and connecting the entities with lines.


Entity-Relationship Set

A set of relationships of the same type is known as a relationship set. The

following relationship set depicts S1 as enrolled in C2, S2 as enrolled in C1,
and S3 as enrolled in C3.

Relationship Set

Degree of a Relationship Set

The number of different entity sets participating in a relationship set is called

the degree of a relationship set.

1. Unary Relationship: When there is only ONE entity set participating in a

relation, the relationship is called a unary relationship. For example, one

person is married to only one person.


Unary Relationship

2. Binary Relationship: When there are TWO entity sets participating in a

relationship, the relationship is called a binary relationship. For example, a

Student is enrolled in a Course.

Binary Relationship

3. Ternary Relationship: When there are three entity sets participating in a

relationship, the relationship is called a ternary relationship.

4. N-ary Relationship: When there are n entity sets participating in a

relationship, the relationship is called an n-ary relationship.

What is Cardinality
The maximum number of times an entity of an entity set participates in a

relationship set is known as cardinality . Cardinality can be of different types:

1. One-to-One: When each entity in each entity set can take part only once in

the relationship, the cardinality is one-to-one. Let us assume that a male can

marry one female and a female can marry one male. So the relationship will

be one-to-one.

One to One Cardinality

Using Sets, it can be represented as:


Set Representation of One-to-One

2. One-to-Many: In a one-to-many mapping, an entity on one side can be
related to more than one entity on the other side, while each entity on the
other side is related to only one entity. Let us assume that one surgery
department can accommodate many doctors. So the cardinality will be 1 to M;
it means one department has many doctors (a relational sketch follows the set
representation below).

one to many cardinality

Using sets, one-to-many cardinality can be represented as:


Set Representation of One-to-Many
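
In relational terms, a one-to-many (or many-to-one) cardinality is usually implemented by placing a foreign key on the "many" side; the table and column names below are assumed for illustration:

    CREATE TABLE department (
        dept_id   INT PRIMARY KEY,
        dept_name VARCHAR(50)
    );

    CREATE TABLE doctor (
        doc_id   INT PRIMARY KEY,
        doc_name VARCHAR(50),
        dept_id  INT REFERENCES department(dept_id)  -- many doctors point to one department
    );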

3. Many-to-One: When entities in one entity set can take part only once in the

relationship set and entities in other entity sets can take part more than once

in the relationship set, cardinality is many to one. Let us assume that a student

can take only one course but one course can be taken by many students. So

the cardinality will be n to 1. It means that for one course there can be n

students but for one student, there will be only one course.

many to one cardinality

Using Sets, it can be represented as:


Set Representation of Many-to-One

In this case, each student is taking only 1 course but 1 course has been taken

by many students.

4. Many-to-Many: When entities in all entity sets can take part more than once

in the relationship cardinality is many to many. Let us assume that a student

can take more than one course and one course can be taken by many students.

So the relationship will be many to many.


many to many cardinality

Using Sets, it can be represented as:

Many-to-Many Set Representation

In this example, student S1 is enrolled in C1 and C3, and course C3 is taken
by S1, S3, and S4. So it is a many-to-many relationship.

Participation Constraint
Participation Constraint is applied to the entity participating in the relationship

set.

1. Total Participation: Each entity in the entity set must participate in the

relationship. If each student must enroll in a course, the participation of

students will be total. Total participation is shown by a double line in the ER

diagram.

2. Partial Participation: The entity in the entity set may or may NOT participate

in the relationship. If some courses are not enrolled by any of the students, the

participation in the course will be partial.

The diagram depicts the ‘Enrolled in’ relationship set with Student Entity set

having total participation and Course Entity set having partial participation.

Total Participation and Partial Participation

Using Set, it can be represented as,


Set representation of Total Participation and Partial Participation

Every student in the Student Entity set participates in a relationship but there

exists a course C4 that is not taking part in the relationship.

How to Draw an ER Diagram

● The very first step is to identify all the Entities

● Represent these entities in a Rectangle and label them accordingly.

● The next step is to identify the relationship between them and

represent them accordingly using the Diamond shape. Ensure that

relationships are not directly connected to each other.

● Attach attributes to the entities by using ovals. Each entity can have

multiple attributes (such as name, age, etc.), which are connected to

the respective entity.

● Assign primary keys to each entity. These are unique identifiers that

help distinguish each instance of the entity. Represent them with

underlined attributes.
● Remove any unnecessary or repetitive entities and relationships

● Review the diagram to make sure it is clear and effectively conveys the
relationships between the entities.

DBMS – ER Design Issues


We have already covered the ER model above. In this section, we will discuss the
various issues that can arise while designing an ER diagram.

Here are some of the issues that can occur during the ER diagram design process:

1. Choosing Entity Set vs Attributes


Here we will discuss how choosing an entity set vs. an attribute can change the
whole ER design semantics. To understand this, let's take an example: say we
have an entity set Student with attributes such as student-name and student-id.
Alternatively, student-id itself could be modelled as an entity with attributes like
student-class and student-section.

Now if we compare the two cases discussed above: in the first case, a student can
have only one student id; however, in the second case, where we chose student id
as an entity, it implies that a student can have more than one student id.

2. Choosing Entity Set vs. Relationship Sets


It is hard to decide whether an object is best represented by an entity set or a
relationship set. To make the right choice between the two (entity vs. relationship),
the user needs to consider whether the object would need new relationships if a
requirement arises in the future; if this is the case, then it is better to choose an
entity set rather than a relationship set.

Let's take an example to understand it better: a person takes a loan from a bank.
Here we have two entities, person and bank, and their relationship is loan. This is
fine until there is a need to disburse a joint loan; in such a case, a new relationship
needs to be created to define the relationship between the two individuals who
have taken the joint loan. In this scenario, it is better to model loan as an entity set
rather than a relationship set.

3. Choosing Binary vs n-ary Relationship Sets


In most cases, the relationships described in ER diagrams are binary. N-ary
relationships are those in which more than two entity sets participate; if only two
entity sets participate, the relationship is termed a binary relationship.

N-ary relationships can make the ER design complex; however, the good news is
that we can convert and represent any n-ary relationship using multiple binary
relationships.

This may sound confusing, so let's take an example to understand how we can
convert an n-ary relationship into multiple binary relationships. Say we have to
describe a relationship between four family members: father, mother, son, and
daughter. This can easily be represented in the form of multiple binary
relationships: the father-mother relationship as "spouse", the son-daughter
relationship as "sibling", and the relationship of the father and mother with each
child as "child".

4. Placing Relationship Attributes


The cardinality ratio in DBMS can help us determine where to place relationship
attributes. The attributes of one-to-one or one-to-many relationship sets can be
represented with one of the participating entity sets rather than with the
relationship set.

The attributes of many-to-many relationship sets, however, cannot be moved to
either participating entity set; in such cases it is better to associate these
attributes with the relationship set itself.

What is Mapping Cardinalities in ER Diagrams

Mapping Cardinalities in a Database Management System (DBMS) defines

how entities (tables) interact and relate to each other, establishing crucial

constraints for data relationships. By specifying the number of instances of

one entity that can link to instances of another, they enable the design of

efficient, logical, and real-world representations of data models. This

article shows the types of cardinalities, such as one-to-one, one-to-many,


and many-to-many, and explains how they maintain data integrity and

guide database design.

What is Mapping Cardinalities?

In a Database Management System (DBMS), mapping cardinalities explains

how entities (tables) in a database schema relate to one another. They

provide the number of instances of one entity that can be linked to the

number of instances of another entity via a certain kind of relationship, such

as one-to-one, one-to-many, many-to-one, or many-to-many. Designing

exact links between tables, preserving data integrity, and simulating actual

business operations in a database all depend on the mapping of

cardinalities. They are a cornerstone of database design's entity-relationship paradigm.

Whenever an attribute of one entity type refers to another entity type, then

some relationship exists between them.

Example:
Example of Mapping Cardinalities

● The attribute Manager of the department refers to an employee

who manages the department.

● In the ER model, these references are represented as

relationships.

Relationship

The relationship in the ER model is represented using a diamond-shaped

box.

In the Entity-Relationship (ER) model, relationships represent how two or

more entities interact with each other. These relationships are often

depicted as a diamond-shaped box connecting the related entities. For

instance, a customer buying products is a common business relationship

where the customer and the product are two entities, and the act of buying

forms the relationship.

Each relationship can have attributes, such as a timestamp recording when

a customer buys a product. Thus, relationships can have their own unique

characteristics that further define the interaction between entities.

Example:
Example of Relationship in DBMS

‘Buys’ is a relationship between customer entity and products. This

relationship can be read as ‘A customer buys a product/products.

Therefore, a relationship is a way to connect multiple entities.

● When a customer buys a product, there is a timestamp

associated with it, so the attribute “Time” will be an attribute of

‘Buys’.

● All the database concepts can be easily understood from the

concepts of sets and relations.

● According to the Set-theoretic perspective, it will be represented

as
Example

By interpreting this, we can understand that many customers can buy the

same type of product and many products can be bought by many customers.

And there are some products which are not bought by any customer and

there are some customers who do not buy any product.

According to the Relation/table perspective or relational model:

It can be represented as

Many to many Relationship

As the relationship between customer and product is many to many (M:N), we
require a separate table/relation for 'buys'. In the buys relation, Cust_id and
Prod_id are foreign keys to the customer and product relations (a relational sketch
follows below).
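
A hedged sketch of this separate buys relation, assuming customer and product tables keyed by cust_id and prod_id already exist, and storing the relationship attribute Time on the relationship itself:

    CREATE TABLE buys (
        cust_id  INT REFERENCES customer(cust_id),  -- foreign key to customer
        prod_id  INT REFERENCES product(prod_id),   -- foreign key to product
        buy_time TIMESTAMP,                         -- attribute of the relationship itself
        PRIMARY KEY (cust_id, prod_id, buy_time)    -- one row per purchase instance
    );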

Mapping Cardinality/Cardinality Ratio

Mapping cardinality is the maximum number of relationship instances in

which an entity can participate.

Example:

Entity type employee is related to department entity type by works_for relationship

Mathematically, here (e1, e2,e3…) are instances of an entity set Employee

and (d1,d2, d3 ….) are the instances of entity type department and (r1, r2,

r3 …) are relationship instances of relationship type.

Each instance ri(where i = 1,2,3,….) in R, is an association of entities, and the

association includes exactly one entity from each participating entity type.

Each such relationship instance, ri represents that the entities participating


in ri are related in some way by any constraint/condition provided by the

user to a designer.

● In the works_for binary relationship type, Employee : Department has
cardinality N : 1; this means each department can be related to
any number of employees, but an employee can be related to
(works for) only one department.

● The possible cardinality ratios of binary relationship types are

(1:1, 1:N, N:1, N:M).

Participation or existence constraint:

It represents the minimum number of relationship instances that each entity

can participate in and it is also called the minimum cardinality constraint.

There are two types of participation constraints, which are total and partial.

Example:

● In the above example, if the company policy is that every

employee should work for a department. Then all the employees

in the employee entity set must be related to the department by a

works_for relationship. Therefore, the participation of the

employee entity type is total in the relationship type. The total

participation is also called existence dependency.

● And if there is a constraint that a new department need not have

employees, then some entity in the employee entity set is not


related to the department entity by works_for relationship.

Therefore, the participation of employee entity in this relationship

(works_for) is partial.

● In the ER diagram, the total participation is represented using a

double line connecting the participating entity type to the

relationship, and a single line is used for partial participation.

The cardinality ratio and participation constraint together is called

structural constraint of the relationship type.

All possible cardinality ratios for binary relationships are explained below

with an example.

1. One to one relationship (1:1)

It is represented using arrows (⇢, ⇠); many different notations
are possible for an ER diagram.

Example:

One to One relationship


In this ER diagram, both the Customer and Driving License entities have an
arrow, which means each entity participates in the relationship "has a" in a
one-to-one fashion. It can be read as: each customer has exactly one driving
license, and every driving license is associated with exactly one customer.

The set-theoretic perspective of the ER diagram is

ER diagram

There may be customers who do not have a credit card, but every credit
card is associated with exactly one customer. Therefore, the credit card entity
has total participation in the relationship, while the customer entity participates
only partially.

2. One to many relationship (1:M)

Example:
one to many relationship

This relationship is one-to-many because an employee (manager) can manage more than one team, while each team is managed by only one manager.

The set-theoretic perspective of the ER diagram:
(Figure: set-theoretic view of the one-to-many relationship)
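A rough sketch of the manager–team example (names assumed): the foreign key goes on the "many" side (Team), so several teams can reference the same managing employee while each team has exactly one manager.

CREATE TABLE Employee (
    Emp_id   INT PRIMARY KEY,
    Emp_name VARCHAR(50)
);

-- Each team stores the one employee who manages it;
-- the same Emp_id may appear in many teams (1:M).
CREATE TABLE Team (
    Team_id    INT PRIMARY KEY,
    Team_name  VARCHAR(50),
    Mgr_emp_id INT NOT NULL,
    FOREIGN KEY (Mgr_emp_id) REFERENCES Employee(Emp_id)
);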

3. Many to one relationship (M:1)

Example:
(Figure: Many-to-one relationship)

It is similar to a one-to-many relationship; the difference is only in the perspective. Any number of credit cards can belong to a customer, and there may be some customers who do not have any credit card, but every credit card in the system has to be associated with a customer (i.e. the Credit Card entity has total participation), while a single credit card cannot belong to multiple customers.

The set-theoretic perspective of the ER diagram:
(Figure: set-theoretic view of the many-to-one relationship)
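A sketch of the credit-card example (names assumed): the foreign key sits in CreditCard and is NOT NULL, reflecting that every card belongs to exactly one customer (total participation) while a customer may own zero or more cards.

CREATE TABLE Customer (
    Cust_id   INT PRIMARY KEY,
    Cust_name VARCHAR(50)
);

-- Every credit card must reference exactly one customer (M:1);
-- customers with no card simply have no matching rows here.
CREATE TABLE CreditCard (
    Card_no BIGINT PRIMARY KEY,
    Cust_id INT NOT NULL,
    FOREIGN KEY (Cust_id) REFERENCES Customer(Cust_id)
);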
4. Many to many relationship (M:N)

Example:

A customer can buy any number of products and a product can be bought

by many customers.

(Figure: Many-to-many relationship)

The set-theoretic perspective of the ER diagram:
(Figure: set-theoretic view of the many-to-many relationship)
For any of the four cardinality ratios of a binary relationship, both sides may be partial, both may be total, or one side may be total and the other partial, depending on the constraints specified in the user requirements.

Generalization, Specialization and Aggregation in ER Model

Using the basic ER model for large amounts of data creates a lot of complexity while designing a database model, so Generalization, Specialization, and Aggregation were introduced into the ER model to minimize this complexity. They are data abstraction mechanisms used to hide the details of a set of objects. In this article, we cover the concepts of Generalization, Specialization, and Aggregation with examples.

Generalization

Generalization is the process of extracting common properties from a set of

entities and creating a generalized entity from it. It is a bottom-up approach

in which two or more entities can be generalized to a higher-level entity if

they have some attributes in common. For Example, STUDENT and

FACULTY can be generalized to a higher-level entity called PERSON as

shown in Figure 1. In this case, common attributes like P_NAME and P_ADD become part of the higher-level entity (PERSON), while specialized attributes like S_FEE remain with the specialized entity (STUDENT). Generalization is therefore also called the 'bottom-up approach'.
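One common way to lay out this generalization as tables is sketched below; P_NAME, P_ADD and S_FEE come from the text, while the key P_ID and the FACULTY attribute F_SAL are assumptions added for illustration.

-- Common attributes live in the generalized PERSON table.
CREATE TABLE PERSON (
    P_ID   INT PRIMARY KEY,
    P_NAME VARCHAR(50),
    P_ADD  VARCHAR(100)
);

-- Specialized entities keep only their own attributes and
-- reference the generalized PERSON entity through P_ID.
CREATE TABLE STUDENT (
    P_ID  INT PRIMARY KEY,
    S_FEE DECIMAL(10,2),
    FOREIGN KEY (P_ID) REFERENCES PERSON(P_ID)
);

CREATE TABLE FACULTY (
    P_ID  INT PRIMARY KEY,
    F_SAL DECIMAL(10,2),            -- assumed specialized attribute
    FOREIGN KEY (P_ID) REFERENCES PERSON(P_ID)
);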

Specialization

In specialization, an entity is divided into sub-entities based on its

characteristics. It is a top-down approach where the higher-level entity is

specialized into two or more lower-level entities. For Example, an

EMPLOYEE entity in an Employee management system can be specialized

into DEVELOPER, TESTER, etc. as shown in Figure 2. In this case, common

attributes like E_NAME, E_SAL, etc. become part of a higher entity

(EMPLOYEE), and specialized attributes like TES_TYPE become part of a

specialized entity (TESTER).

Specialization is also called the 'top-down approach'.
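Assuming the specialized entities are stored the same way (an EMPLOYEE table holding E_NAME and E_SAL, a TESTER table holding TES_TYPE, and a shared key E_ID, which is an assumed name), the full details of a tester can be retrieved with a simple join:

-- Fetch common attributes from EMPLOYEE and specialized ones from TESTER.
SELECT e.E_NAME,
       e.E_SAL,
       t.TES_TYPE
FROM   EMPLOYEE e
JOIN   TESTER   t ON t.E_ID = e.E_ID;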

Inheritance: It is an important feature of generalization and specialization

● Attribute inheritance: It allows lower-level entities to inherit the attributes of higher-level entities. In the diagram, the Car entity inherits from the Vehicle entity, so Car can acquire the attributes of Vehicle; for example, Car acquires the Model attribute of Vehicle.

● Participation inheritance: Participation inheritance in ER modeling refers to the inheritance of participation constraints from a higher-level entity (superclass) to a lower-level entity (subclass). It ensures that subclasses adhere to the same participation rules in relationships, although attributes and relationships themselves are inherited differently. In the diagram, the Vehicle entity has a relationship with the Cycle entity, but a subclass of Vehicle would not automatically acquire that relationship itself. Participation inheritance only refers to the inheritance of participation constraints, not the actual relationships between entities.
(Figure: Example of Relation)

Aggregation

An ER diagram is not capable of representing the relationship between an

entity and a relationship which may be required in some scenarios. In those

cases, a relationship with its corresponding entities is aggregated into a

higher-level entity. Aggregation is an abstraction through which we can

represent relationships as higher-level entity sets.

For example, an employee working on a project may require some machinery, so a REQUIRE relationship is needed between the WORKS_FOR relationship and the MACHINERY entity. Using aggregation, the WORKS_FOR relationship together with its entities EMPLOYEE and PROJECT is aggregated into a single higher-level entity, and the REQUIRE relationship is created between this aggregated entity and MACHINERY.

Representing Aggregation Via Schema

To represent aggregation, create a schema containing the following (a sketch follows this list):
● The primary key of the aggregated relationship
● The primary key of the associated entity set
● Descriptive attributes, if any exist
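A hedged sketch of these schemas for the WORKS_FOR/REQUIRE example is given below; it assumes EMPLOYEE, PROJECT and MACHINERY tables with the keys named here, and Quantity is an assumed descriptive attribute.

-- WORKS_FOR is stored as a relation whose key is (Emp_id, Proj_id).
CREATE TABLE WORKS_FOR (
    Emp_id  INT,
    Proj_id INT,
    PRIMARY KEY (Emp_id, Proj_id),
    FOREIGN KEY (Emp_id)  REFERENCES EMPLOYEE(Emp_id),
    FOREIGN KEY (Proj_id) REFERENCES PROJECT(Proj_id)
);

-- REQUIRE relates the aggregated (Emp_id, Proj_id) pair to MACHINERY.
CREATE TABLE REQUIRE (
    Emp_id     INT,
    Proj_id    INT,
    Machine_id INT,
    Quantity   INT,                    -- descriptive attribute, if it exists
    PRIMARY KEY (Emp_id, Proj_id, Machine_id),
    FOREIGN KEY (Emp_id, Proj_id) REFERENCES WORKS_FOR(Emp_id, Proj_id),
    FOREIGN KEY (Machine_id)      REFERENCES MACHINERY(Machine_id)
);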

Inheritance Hierarchies in DBMS


Inheritance hierarchies are crucial to building a structured and well-organized database. The concept is comparable to inheritance in object-oriented programming languages. This article covers the main ideas of inheritance hierarchies in database management systems (DBMS), along with definitions, procedures, and examples to aid understanding.

What are Inheritance Hierarchies in DBMS?

In database management systems, inheritance hierarchies help create

linkages between tables or entities, much like inheritance helps create links
between classes in object-oriented programming. They describe how a newly created table, often known as a child table, can inherit specific attributes and capabilities from an existing parent table. This lowers redundancy and improves data integrity by creating a hierarchical structure in the database.

Key Terminologies

● Superclass: The class or table whose methods and attributes are

inherited is called the superclass or base class. Another name for

it is the parent class.

● Subclass/Derived Class: In an inheritance structure, a subclass is

a class or table that receives some methods and attributes from

another class. Another name for it is the child class.

Types of Inheritance Hierarchies

There are mainly three types of inheritance hierarchies in DBMS, as follows (a brief sketch of the first approach is given after the list):

● Single Table Inheritance

● Class Table Inheritance

● Concrete Table Inheritance
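As a brief, hedged illustration of the first approach, single table inheritance keeps all subclasses in one table with a discriminator column (the names below are assumptions); class table and concrete table inheritance instead use separate tables per class.

-- Single table inheritance: one table holds every vehicle type,
-- and Vehicle_type tells us which subclass a row belongs to.
CREATE TABLE Vehicle (
    Vehicle_id   INT PRIMARY KEY,
    Vehicle_type VARCHAR(10),   -- e.g. 'CAR' or 'CYCLE' (discriminator)
    Model        VARCHAR(50),   -- common attribute
    Num_doors    INT,           -- applies only to cars (NULL for cycles)
    Gear_count   INT            -- applies only to cycles (NULL for cars)
);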

Reduction of ER Schema to Relational Schema

Reducing an Entity-Relationship (ER) schema to a relational database schema involves converting entities into tables, attributes into columns, and relationships into foreign keys or separate tables. Strong entities become individual tables, while weak entities are represented with foreign keys linking to their identifying strong entities. Relationships, especially many-to-many ones, often require separate tables.

Here's a more detailed breakdown:

1. Strong Entities:

● Each strong entity is transformed into a separate table.



● Entity names become table names.

● Attributes of the entity become columns in the corresponding table.

● The primary key of the entity becomes the primary key of the table.

● Composite attributes are separated into their simple component attributes, and each simple attribute becomes a column in the table (see the sketch after this list).
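For instance, a hedged sketch (names assumed): a strong entity Student with a composite attribute Name(First_name, Last_name) becomes one table whose columns are the simple components, with the entity's primary key as the table's primary key.

CREATE TABLE Student (
    Roll_no    INT PRIMARY KEY,   -- primary key of the entity
    First_name VARCHAR(30),       -- component of the composite attribute Name
    Last_name  VARCHAR(30),       -- component of the composite attribute Name
    City       VARCHAR(30)        -- simple attribute
);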

2. Weak Entities:

● Weak entities are also converted to tables.


● A foreign key is added to the weak entity table, referencing the primary key
of its identifying (strong) entity.
● All attributes of the weak entity are included as columns in the table (a sketch follows this list).
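A hedged example of the weak-entity rule (names assumed, and an Employee table with primary key Emp_id is presumed to exist): a Dependent weak entity gets a foreign key to Employee, and, following common practice, its primary key combines that foreign key with the dependent's partial key.

CREATE TABLE Dependent (
    Emp_id         INT,            -- foreign key to the identifying entity
    Dependent_name VARCHAR(50),    -- partial (discriminator) key
    Relationship   VARCHAR(20),
    PRIMARY KEY (Emp_id, Dependent_name),
    FOREIGN KEY (Emp_id) REFERENCES Employee(Emp_id)
);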

3. Relationships:

One-to-one:
The foreign key of one entity can be added to the table of the other entity. If
both have total participation (every instance of one entity is related to an
instance of the other), foreign keys in both tables can be made non-null.
One-to-many:
The foreign key of the "one" entity is added to the table of the "many" entity.
Many-to-many:
A separate table is created to represent the relationship. This table will
include foreign keys referencing the primary keys of both entities involved in
the relationship.
Multivalued attributes:
Create a separate table for each multivalued attribute, with a foreign key referencing the primary key of the entity to which the multivalued attribute belongs (see the sketch after this list).
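For the multivalued-attribute rule, a small sketch (names assumed, with a Customer table presumed to exist): a multivalued Phone attribute of Customer becomes its own table with one row per phone number.

CREATE TABLE Customer_phone (
    Cust_id INT,
    Phone   VARCHAR(15),
    PRIMARY KEY (Cust_id, Phone),
    FOREIGN KEY (Cust_id) REFERENCES Customer(Cust_id)
);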

4. Additional Considerations:

● Derived attributes: These can be ignored or calculated on the fly during


queries.

● N-ary relationships: These are converted to separate tables, with foreign keys referencing the primary keys of all participating entities (sketched below).
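For an n-ary relationship the same idea extends to all participants; for example, a hypothetical ternary SUPPLY relationship among Supplier, Part and Project (those three tables are presumed to exist with the keys named here) becomes:

CREATE TABLE Supply (
    Supplier_id INT,
    Part_id     INT,
    Project_id  INT,
    Quantity    INT,   -- descriptive attribute (assumed)
    PRIMARY KEY (Supplier_id, Part_id, Project_id),
    FOREIGN KEY (Supplier_id) REFERENCES Supplier(Supplier_id),
    FOREIGN KEY (Part_id)     REFERENCES Part(Part_id),
    FOREIGN KEY (Project_id)  REFERENCES Project(Project_id)
);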
