Unit-2 Introduction to Database (1)
DATABASE
UNIT 2
File System
1. Query Processor
The query processor handles query processing: it executes the user's queries and so makes
data access simple for the database system. Its primary duty is to execute each query
successfully. The query processor translates (or interprets) the requests that come from the
user's application program into instructions that a computer can understand.
Components of the Query Processor
• DDL Interpreter:
DDL stands for Data Definition Language. The DDL Interpreter interprets DDL statements,
such as the CREATE and DROP statements used in schema definitions.
STRUCTURE OF DBMS
• DML Compiler:
DML stands for Data Manipulation Language. The DML Compiler converts DML statements
such as SELECT, UPDATE, and DELETE into low-level instructions (machine-readable object
code) so that they can be executed.
• Embedded DML Pre-compiler:
Before the query evaluation, the embedded DML commands in the application program (such as
SELECT, FROM, etc., in SQL) must be pre-compiled into standard procedural calls (program
instructions that the host language can understand).
• Query Optimizer:
It analyzes each query and chooses an efficient way to evaluate it. The query evaluation
engine then runs the object code that the DML Compiler produces, accessing the database's
contents and returning the result of the query.
2. Storage Manager:
The Storage Manager is a program that acts as an interface between the queries made and the
data kept in the database. Another name for it is Database Control System. It is in charge of
retrieving, storing, updating, and removing data from the database.
Components of Storage Manager
• Integrity Manager:
Whenever there is any change in the database, the Integrity manager will manage the
integrity constraints.
• Authorization Manager:
The Authorization Manager verifies that the user is valid and authorized for the specific
query or request.
• File Manager:
All the files and data structure of the database are managed by this component.
• Transaction Manager:
It is responsible for making the database consistent before and after the
transactions. Concurrent processes are generally controlled by this component.
• Buffer Manager:
The buffer manager transfers data between secondary storage (disk) and main memory and
manages the cache memory.
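The consistency guarantee described for the Transaction Manager can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the account table, its CHECK rule, and the transfer helper are hypothetical names that do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a balance may never become negative.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        # "with conn" commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The debit would make account 1 negative, so the credit is undone as well.
ok = transfer(conn, 1, 2, 500)
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

Because both updates belong to one transaction, the database is in a consistent state before and after the call, whether the transfer succeeds or fails.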
3. Disk Storage
A DBMS can use various kinds of Data Structures as a part of physical system implementation
in the form of disk storage. It is the space where data is stored.
Components of Disk Storage
• Data Dictionary:
It contains the metadata (data about data): information about the structure of each object
of the database. It thus serves as a repository that holds the details of the structure of
the database objects.
• Data Files:
This component stores the data in the files.
• Indices:
These indices are used to access and retrieve the data in a very fast and efficient way.
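As a rough sketch of how an index changes data access, the example below asks SQLite (through Python's sqlite3 module) for its query plan before and after an index is created. The employee table, its columns, the index name idx_employee_dept, and the plan helper are assumptions made for illustration. The exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (name, dept_id) VALUES (?, ?)",
    [(f"emp{i}", i % 10) for i in range(1000)],
)
conn.commit()

query = "SELECT name FROM employee WHERE dept_id = ?"


def plan(conn, sql, params):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn, query, (3,))   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept_id)")
plan_after = plan(conn, query, (3,))    # the plan now refers to the index

print(plan_before)
print(plan_after)
```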
People who Deal with Database
The Network Model: The main difference between this model and the
hierarchical model is that any record can have several parents. It uses a
graph instead of a hierarchical tree. This model can consist of multiple
parent segments grouped into levels, with a logical association between
the segments belonging to any level. Mostly, there is a many-to-many
logical association between any two segments.
Types of Data Models in DBMS
The Relational Model: E.F. Codd proposed the relational model to model data in
the form of relations or tables. The relational model represents how data is
stored in relational databases. A relational database consists of a collection of
tables, each of which is assigned a unique name. The relational model uses a
collection of tables to represent both data and the relationships among those
data. Each table has multiple columns, and each column has a unique name.
Tables are also known as relations, and each row is called a tuple. After designing
the conceptual model of the database using an ER diagram, we need to convert it
into a relational model, which can be implemented using any RDBMS such as
Oracle or MySQL.
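The idea of a relation as a named table with named columns, whose rows are tuples, can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The student table and its sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation: a uniquely named table with uniquely named columns.
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Ravi", 21)],
)

cursor = conn.execute("SELECT roll_no, name, age FROM student ORDER BY roll_no")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()  # each row comes back as a Python tuple

print(columns)
for row in rows:
    print(row)
```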
The Object-Oriented Model (OODM) is the data model in which data is
stored in the form of objects. This model is used to represent real-world entities. The data and
the relationships among the data are stored together in a single entity known as an object.
• Here Transport, Bus, Ship, and Plane are objects.
• Bus has Road Transport as the attribute.
• Ship has Water Transport as the attribute.
• Plane has Air Transport as the attribute.
• The Transport object is the base object, and the Bus, Ship, and Plane objects
derive from it.
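The Transport example can be sketched as a small class hierarchy. The class names follow the bullets above, while the mode attribute, the describe method, and the sample object names are assumptions added for illustration.

```python
class Transport:
    """Base object; every kind of transport derives from it."""

    mode = "Unspecified"

    def __init__(self, name):
        self.name = name

    def describe(self):
        # Data (name, mode) and behaviour live together in one object.
        return f"{self.name}: {self.mode}"


class Bus(Transport):
    mode = "Road Transport"


class Ship(Transport):
    mode = "Water Transport"


class Plane(Transport):
    mode = "Air Transport"


fleet = [Bus("City bus"), Ship("Ferry"), Plane("Airliner")]
descriptions = [vehicle.describe() for vehicle in fleet]
print(descriptions)
```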
The ER model was created to provide a simple and understandable model for
representing the structure and logic of databases. Peter Chen developed the ER diagram
in 1976. The Entity-Relationship (ER) Model identifies the entities to be represented
in the database and shows how those entities are related. ER models are
used to model real-world objects like a person, a car, or a company and the relationships
between these objects. In short, the ER diagram is the structural format of the
database.
What is Normalization?
Super Key
A super key is a set of attributes that can uniquely identify a
tuple. A super key is a superset of a candidate key.
For example: (EMPLOYEE_ID, EMPLOYEE_NAME). The names of two
employees can be the same, but their EMPLOYEE_ID can't be the same.
Hence, this combination can also be a key.
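One way to make the super key definition concrete is to test it against sample data. The is_super_key helper, the CITY column, and the sample rows below are hypothetical. Note that a check against sample rows only shows whether the data is consistent with a key; it does not prove the key holds for every possible row.

```python
def is_super_key(rows, attributes):
    """Return True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[attribute] for attribute in attributes)
        if value in seen:
            return False  # two tuples share this value, so it cannot identify them
        seen.add(value)
    return True


employees = [
    {"EMPLOYEE_ID": 1, "EMPLOYEE_NAME": "Asha", "CITY": "Pune"},
    {"EMPLOYEE_ID": 2, "EMPLOYEE_NAME": "Asha", "CITY": "Delhi"},
    {"EMPLOYEE_ID": 3, "EMPLOYEE_NAME": "Ravi", "CITY": "Pune"},
]

id_and_name = is_super_key(employees, ["EMPLOYEE_ID", "EMPLOYEE_NAME"])
name_only = is_super_key(employees, ["EMPLOYEE_NAME"])
id_only = is_super_key(employees, ["EMPLOYEE_ID"])
```

Here the pair (EMPLOYEE_ID, EMPLOYEE_NAME) identifies every row, EMPLOYEE_NAME alone does not because two employees share a name, and EMPLOYEE_ID alone is already enough.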
Types of keys:
Candidate key
• A candidate key is an attribute or a minimal set of attributes that can uniquely
identify a tuple.
• The primary key is chosen from the candidate keys; the remaining ones are still
candidate keys and are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary
key. The other uniquely identifying attributes, like SSN, Passport_Number,
License_Number, etc., are also candidate keys.
Primary key
• It is the key chosen to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys, as we saw in the PERSON
table. For each entity, the choice of primary key depends on the
requirements and on the developers.
Alternate key
There may be one or more attributes or a combination of attributes that
uniquely identify each tuple in a relation. These attributes or
combinations of the attributes are called the candidate keys. One key is
chosen as the primary key from these candidate keys, and the
remaining candidate keys, if any exist, are termed alternate keys. In other
words, the total number of alternate keys is the total number of
candidate keys minus one (the primary key).
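In SQL, one common way to declare alternate keys is with UNIQUE constraints next to the primary key. The sketch below uses Python's built-in sqlite3 module; the employee table, its columns, the sample values, and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id              INTEGER PRIMARY KEY,   -- candidate key chosen as primary key
        ssn             TEXT NOT NULL UNIQUE,  -- alternate key
        passport_number TEXT NOT NULL UNIQUE,  -- alternate key
        name            TEXT NOT NULL          -- not a key: names may repeat
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'S-001', 'P-001', 'Asha')")
conn.commit()


def try_insert(row):
    """Insert one row; return False if a key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ssn = try_insert((2, "S-001", "P-002", "Ravi"))  # reuses an alternate key value
same_name = try_insert((3, "S-003", "P-003", "Asha"))      # only the name repeats
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

With three candidate keys (id, ssn, passport_number) and one of them chosen as the primary key, two alternate keys remain.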
Foreign key
• A foreign key is a column (or set of columns) of a table that points to the primary
key of another table.
• Every employee works in a specific department in a company, and
employee and department are two different entities. So we should not store
the department's information in the employee table. Instead, we link
the two tables through the primary key of the department table, which is
stored in the employee table as a foreign key.
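The employee and department link can be sketched with Python's built-in sqlite3 module. The table and column names and the sample rows are assumptions made for illustration. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department (dept_id)  -- foreign key
    )
    """
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")
conn.commit()

# Department 99 does not exist, so this employee row is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    orphan_rejected = True

# Department details stay in one table and are reached through the key.
joined = conn.execute(
    "SELECT e.emp_name, d.dept_name"
    " FROM employee AS e JOIN department AS d ON d.dept_id = e.dept_id"
).fetchall()
```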
Composite key
Whenever a primary key consists of more than one attribute, it is known
as a composite key. This key is also known as Concatenated Key.
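A composite key can be sketched as a primary key declared over two columns. The enrollment table, its columns, and the sample rows are hypothetical; the example uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course_id  TEXT NOT NULL,
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)  -- composite key: both columns together
    )
    """
)
# Neither column is unique on its own, but each pair occurs only once.
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [(1, "DB101", "A"), (1, "OS201", "B"), (2, "DB101", "B")],
)
conn.commit()

# Repeating an existing (student_id, course_id) pair violates the key.
try:
    conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'C')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
```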
Anomalies
• First Normal Form (1NF): Each field in a table holds a single, atomic value, and there
are no repeating groups. For example, in an employee list, each record would contain only
one birthdate field.
• Second Normal Form (2NF): The table is in 1NF, and every non-key field depends on the
whole primary key, not on just a part of it.
• Third Normal Form (3NF): The table is in 2NF, and no non-key field depends on another
non-key field (no transitive dependency). This removes duplicate information: a fact is
stored in one table only, and other tables refer to it through a key, so any change to it
is automatically reflected in all tables that link to it.
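As a small illustration of the first normal form, read here in the usual textbook sense of one atomic value per field, the sketch below splits a multi-valued field into one row per value. The to_first_normal_form helper, the emp_id and phones fields, and the sample records are hypothetical.

```python
def to_first_normal_form(records):
    """Split a comma-separated 'phones' field into one row per phone number."""
    rows = []
    for record in records:
        for phone in record["phones"].split(","):
            rows.append({"emp_id": record["emp_id"], "phone": phone.strip()})
    return rows


# Not in 1NF: the phones field of employee 1 holds two values at once.
unnormalized = [
    {"emp_id": 1, "phones": "555-0100, 555-0101"},
    {"emp_id": 2, "phones": "555-0200"},
]

normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```

After the split, every field holds exactly one value, and the pair (emp_id, phone) identifies each row.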
Database Design
The main objective of database design is to produce logical and physical
design models of the proposed database system.
The logical model concentrates on the data requirements
and the data to be stored independent of physical considerations.
It does not concern itself with how the data will be stored or
where it will be stored physically.
The physical data design model involves translating the logical
design of the database onto physical media using hardware
resources and software systems such as database management
systems (DBMS).
Entity
An entity is a real-world object, either animate or inanimate,
that is easily identifiable. For example, in a school database,
students, teachers, classes, and courses offered can be considered entities.
All these entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain
entities whose attributes share similar values. For example, a Students set may
contain all the students of a school; likewise a Teachers set may contain all the
teachers of a school from all faculties. Entity sets need not be disjoint.
Attributes
Entities are represented by means of their properties,
called attributes. All attributes have values. For example, a student
entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to
attributes. For example, a student’s name cannot be a numeric
value. It has to be alphabetic. A student’s age cannot be negative,
etc.
Types of Attributes
• Simple attribute: Simple attributes are atomic values, which cannot be divided further.
For example, a student’s phone number is an atomic value of 10 digits.
• Composite attribute: Composite attributes are made of more than one simple attribute.
For example, a student’s complete name may have first_name and last_name.
• Derived attribute: Derived attributes do not exist in the physical
database; their values are derived from other attributes present in the database. For
example, average_salary in a department should not be saved directly in the database;
instead, it can be derived. As another example, age can be derived from date_of_birth.
• Single-value attribute: Single-value attributes contain a single value. For example,
Social_Security_Number.
• Multi-value attribute: Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email_address, etc.
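The attribute types listed above can be sketched in a single Python class. The Student class, its field names, and the sample values are assumptions made for illustration and do not describe any particular DBMS.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Student:
    roll_no: int          # simple, single-value attribute
    first_name: str       # first_name and last_name together form
    last_name: str        # the composite attribute "name"
    date_of_birth: date   # stored attribute
    phone_numbers: list = field(default_factory=list)  # multi-value attribute

    @property
    def full_name(self):
        """The composite attribute, assembled from its simple parts."""
        return f"{self.first_name} {self.last_name}"

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return years if had_birthday else years - 1


student = Student(1, "Asha", "Rao", date(2004, 6, 15), ["555-0100", "555-0101"])
print(student.full_name, student.age_on(date(2024, 6, 15)))
```

Because the age is computed on demand, it can never disagree with the stored date of birth.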
Relationship
Relationship Set
Integrity Constraints
• Integrity constraints are a set of rules used to maintain the
quality of information.
• Integrity constraints ensure that data insertion, updating, and
other processes are performed in such a way that data
integrity is not affected.
• Thus, integrity constraints are used to guard against accidental
damage to the database.
Types of Integrity Constraint
1. Domain constraints
• Domain constraints can be defined as the definition of a valid set of values for an attribute.
• The data type of a domain includes string, character, integer, time, date, currency, etc.
• The value of the attribute must be available in the corresponding domain.
Example:
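A domain constraint can be sketched with a column type plus a CHECK rule, here through Python's built-in sqlite3 module. The student table, the rule that age is a non-negative integer, the sample rows, and the try_insert helper are assumptions made for illustration. The typeof test is included because SQLite on its own would store a text value in an INTEGER column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        -- domain of age: integers that are zero or greater
        age     INTEGER NOT NULL CHECK (typeof(age) = 'integer' AND age >= 0)
    )
    """
)


def try_insert(row):
    """Insert one row; return False if the value lies outside the domain."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, "Asha", 20)),   # inside the domain
    try_insert((2, "Ravi", -3)),   # negative age
    try_insert((3, "Mina", "A")),  # not an integer
]
```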
2. Entity integrity constraints
• The entity integrity constraint states that a primary key value can't be null.
• This is because the primary key value is used to identify individual rows in a relation,
and if the primary key has a null value, then we can't identify those rows.
• A table can contain null values in fields other than the primary key field.
Example:
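The entity integrity rule can be sketched with Python's built-in sqlite3 module. The student table and its rows are hypothetical. NOT NULL is written out on the key column because SQLite, for historical reasons, otherwise accepts NULL in a primary key column that is not an INTEGER PRIMARY KEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        roll_no TEXT PRIMARY KEY NOT NULL,  -- the key may never be null
        name    TEXT,
        age     INTEGER
    )
    """
)

# A null value outside the primary key is allowed.
conn.execute("INSERT INTO student VALUES ('S1', 'Asha', NULL)")
conn.commit()

# A null primary key is rejected: the row could not be identified.
try:
    conn.execute("INSERT INTO student VALUES (NULL, 'Ravi', 21)")
    null_key_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    null_key_rejected = True

rows = conn.execute("SELECT roll_no, name, age FROM student").fetchall()
```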
4. Key constraints
• Keys are the attributes that are used to identify an entity within its
entity set uniquely.
• An entity set can have multiple keys, one of which will
be the primary key. A primary key must contain unique values and cannot contain
a null value in the relational table.
DDL and DML Commands
DML statements are SQL statements that manipulate data. DML stands for
Data Manipulation Language. The SQL statements that are in the DML class
are INSERT, UPDATE and DELETE. Some people also lump the SELECT statement
in the DML classification.
Data Definition Language (DDL) statements are used to define the database structure.
Any CREATE, DROP and ALTER commands are examples of DDL SQL statements.
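The split between DDL and DML can be seen in a short session run through Python's built-in sqlite3 module. The student table and its rows are hypothetical; the statements themselves are the ones named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and change the structure of the database.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")

# DML: manipulate the data held in that structure.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 20)")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 21)")
conn.execute("UPDATE student SET age = 22 WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")

# SELECT reads the data; some classifications also count it as DML.
remaining = conn.execute("SELECT roll_no, name, age FROM student").fetchall()
conn.commit()

# DDL again: remove the structure itself.
conn.execute("DROP TABLE student")
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
```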
Aggregate Functions