Dbms Notes

The document outlines the architecture tiers of Database Management Systems (DBMS), detailing 2-Tier and 3-Tier architectures, along with the three levels of abstraction in databases. It explains data independence, entity and attribute types, relationship sets, cardinality, keys, normalization processes, and functional dependencies. Additionally, it discusses normal forms, lossless decomposition, and the equivalence of functional dependencies in relational databases.

DBMS

Architecture Tiers :
● 2-Tier :
- Basic client-server architecture.
- The application at the client end communicates directly with the database at the server side. APIs like ODBC and JDBC are used for this interaction.

- Advantages :
- Easy to maintain and understand.
- Disadvantages :
- Clients send requests directly to the server, which increases the database load and poses a security risk for the data.
- Poor scalability and security.

● 3-Tier :

- No direct interaction between the client and the server; an application layer sits between them.
- Reduces the load on the database, and is more secure and more scalable than 2-tier.
- Application Layer : acts as a connection between the end user and the database. This tier holds the application server along with the programs that access the database.
Three Levels of Abstraction (Three-Schema Architecture) :

View Level -> Conceptual Schema -> Physical Schema -> Database

- The physical schema describes how and where the data is actually stored.
- The conceptual schema is a blueprint where we specify which tables to store, the size of the tables, etc., such as tables, relationships, and ER models.
- On the hard disk the data is stored as files, but as users we are able to see the data as tables.
- The logical (conceptual) level sits between the view level and the physical level; it represents data in the form of the relational model.

Data Independence
Any change in the physical schema does not affect the conceptual schema, and any change in the conceptual schema does not affect the view level.
Physical data independence : if we change the location or the indexes of the tables of a database, the conceptual schema and the user-level view are not affected.
Conceptual (logical) data independence : a change in the conceptual schema, such as adding or deleting attributes, does not affect the user schema, i.e. the view level. This is practically harder to achieve than physical data independence.

Entity : anything or object which has some physical significance, like a student, teacher, or course.

Attribute : Attributes are the properties which define the entity type. For
example, Roll_No, Name, DOB, Age, Address, Mobile_No are the
attributes which define entity type Student. In the ER diagram, the attribute
is represented by an oval.

1. Key Attribute : unique, like a registration number; underlined in the ER diagram.

2. Multivalued Attribute : can hold more than one value, like phone number; represented by a double oval.

3. Composite Attribute : composed of several other attributes, like Address.

4. Derived Attribute : e.g. Age (can be derived from DOB); represented by a dotted oval.

Degree of a relationship set : the number of different entity sets participating in a relationship set is called the degree of the relationship set.

1. Unary : a single entity set is involved, e.g. a student is a friend of another student.

2. Binary : two entity sets are involved.

3. N-ary : n entity sets are involved in the relationship.


Cardinality :
The number of times an entity of an entity set participates in a relationship set is known as cardinality. Cardinality can be of different types:

1. One to One (1:1) : each entity of either entity set can take part only once in the relationship.

2. One to Many (1:M) : an entity in one entity set can take part only once in the relationship set, while entities in the other entity set can take part more than once.

3. Many to Many (M:N) : entities in all entity sets can take part more than once in the relationship.

Participation Constraint : applied on the entities participating in the relationship set.
1. Total Participation - Each entity in the entity set must participate in the relationship.
Example 1: If each student must attend a course, the participation of student is total. Total participation is shown by a double line in the ER diagram.

2. Partial Participation -
The entity in the entity set may or may NOT participate in the relationship. If
some courses are not enrolled by any of the students, the participation of course
will be partial. The diagram depicts the ‘Enrolled in’ relationship set with Student
Entity set having total participation and Course Entity set having partial
participation.

Weak Entity Type and Identifying Relationship :

As discussed before, an entity type has a key attribute that uniquely identifies
each entity in the entity set. But there exists some entity type for which key
attribute can’t be defined. These are called the Weak Entity type. A weak
entity type is represented by a double rectangle. The participation of a weak
entity type is always total. The relationship between a weak entity type and its
identifying strong entity type is called an identifying relationship and it is
represented by a double diamond.
Example 1: a school might have multiple classes and each class might have multiple sections. A section cannot be identified uniquely on its own, and hence it does not have a primary key. A class, however, can be identified uniquely, and the combination of class and section is required to identify each section uniquely. Therefore section is a weak entity, and it has total participation with class.
Relational Model represents how data is stored in Relational Databases. A
relational database stores data in the form of relations (tables). Consider a
relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE and
AGE shown in Table.

The properties that define a relation, or simply its columns, are called attributes.

Each row or record in a relation is called a tuple.

Foreign keys maintain referential integrity.

The referenced table is the one in which the primary key is present.
The referencing table is the one that refers to the referenced table.

Operation    Primary key table (referenced)          Foreign key table (referencing)
INSERT       No violation                            May cause violation
DELETE       May cause an integrity problem.         No violation
             ON DELETE CASCADE : all tables
             referencing this table will be
             updated accordingly.
UPDATE       May cause integrity problems.           May cause violation
             ON UPDATE CASCADE : all tables
             referencing this table will be
             updated.
What are the different types of Keys in DBMS?
1. Candidate Key
● A super key with no redundant attributes is called a candidate key.
● The minimal set of attributes that can uniquely identify a record.
● It must contain unique values; unlike a primary key, it may contain NULL values.
● Every table must have at least one candidate key.
● A table can have multiple candidate keys but only one primary key.

2. Primary Key :

● It is a unique key and cannot be NULL.
● It identifies exactly one tuple (record) at a time.
● A primary key need not be a single column; a combination of columns can also serve as the primary key of a table.

3. Super Key :

● Adding zero or more attributes to a candidate key generates a super key.
● Every candidate key is a super key, but the converse is not true.
● Super key values may also be NULL.

4. Alternate Key :

● All the candidate keys which are not chosen as the primary key are called alternate keys.
● An alternate key is also known as a secondary key; like any candidate key, it can uniquely identify records.
5. Foreign Key :
● A key that is the primary key in one table and acts as a reference in another table.
● It links two or more relations (tables) at a time.
● Foreign keys act as a cross-reference between the tables.
● A foreign key can be NULL and may contain duplicate values, i.e. it need not follow the uniqueness constraint.

6. Composite Key :
● A key made up of more than one column; it can act as the primary key when no single column uniquely identifies a record.
● A composite key can also be made from a combination of candidate keys.
● A composite key cannot be NULL.

Normalization :
- Simply put, normalization is a technique to remove or reduce redundancy in tables.
Database normalization is the process of organizing the attributes of the
database to reduce or eliminate data redundancy (having the same data but at
different places) .
Problems because of data redundancy: Data redundancy unnecessarily
increases the size of the database as the same data is repeated in many places.
Inconsistency problems also arise during insert, delete and update operations.

Functional Dependencies :
An FD is a database constraint that describes the relationship between attributes (columns) in a table. It says that the value of one attribute determines the value of another.
X -> Y : X determines Y, or Y is determined by X.

Trivial FD : X -> Y where Y is a subset of X (hence LHS ⋂ RHS != Ø).
Eg. Sid,Sname -> Sid ; the intersection is Sid.

Non-Trivial FD : X -> Y where Y is not a subset of X; if LHS ⋂ RHS = Ø, it is completely non-trivial.

Properties of FD :
●​ Reflexivity : If Y is subset of X then X->Y
●​ Augmentation : if X->Y then XZ -> YZ
●​ Transitive : if X->Y and Y->Z then X-> Z
●​ Union : if X->Y and X -> Z then X-> YZ
●​ Decomposition : if X->YZ then X->Y and X->Z
●​ Pseudo Transitivity : if X->Y and WY -> Z then WX->Z
If X->Y and Z-> W then XZ-> YW
Q 1 If X->Z and Y->Z then XY->Z ===> TRUE OR FALSE
Q2 if XY->Z then X->Y and Y->Z ===> TRUE OR FALSE

Closure Method :
R(ABCD)
FD {A->B , B->C , C->D}
Closure of A ≣ ABCD (A gives B, B gives C, C gives D), so A can be a candidate key, since all the columns are included in this closure.
Closure of B ≣ BCD, not a CK
Closure of C ≣ CD, not a CK
Closure of AB ≣ ABCD, not a CK (not minimal) but a super key
So CK = {A}
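The closure method above can be sketched in a few lines of Python (a minimal illustration; the function name and the (LHS, RHS) pair representation of FDs are our own choices, not from the notes):

```python
# Attribute closure under a set of functional dependencies.
def closure(attrs, fds):
    """Closure of a set of attributes under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # if the whole LHS is already inside the closure, pull in the RHS
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# R(ABCD) with FDs A->B, B->C, C->D, as in the example above
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C"}, {"D"})]
print(closure({"A"}, fds))   # {'A','B','C','D'} -> A is a candidate key
print(closure({"B"}, fds))   # {'B','C','D'}     -> not a candidate key
```

Closure of AB is also ABCD, but AB is only a super key since the smaller set A already works.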

Rules for being a Prime Attribute :

- An attribute used in forming a CK.
- So in the above example A is a PA (prime attribute) and the rest, i.e. B, C, D, are NPAs (non-prime attributes).

Rules for a Partial Dependency (X -> Y) :

- The LHS is a proper subset of a CK.
- The RHS is an NPA.

Rules for avoiding a Transitive Dependency : an FD X -> Y is not a (harmful) transitive dependency if

- the LHS is a CK or super key, or
- the RHS consists of prime attributes.

1st Normal Form :


Table should not contain any multivalued attribute .
2nd Normal Form :

-​ Table or relation must be in 1st Normal Form .


-​ There must not be any partial dependency in the relation .
-​ Eg.
R(ABCDEF)
FD(C->F , E->A , EC->D , A->B)
Closure of EC ≣ ECABDF, i.e. all the attributes are present, so EC is a CK.
Now that we have one CK, for the remaining candidate keys we check whether any FD has E or C (or both) in its RHS.
There is no such case here, so:
CK = {EC}
PA = {E,C}
NPA = {A,B,D,F}
Now check each FD for partial dependency:
C->F : the LHS C is a proper subset of the CK and the RHS F is an NPA, so this is a partial dependency (E->A is another one). Hence the relation is not in 2nd Normal Form.
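The partial-dependency check for this example can be mechanized (a sketch; the helper and variable names are illustrative, and note that E->A turns out to be a partial dependency too, alongside the C->F case worked above):

```python
# Detecting partial dependencies for R(ABCDEF), FDs C->F, E->A, EC->D, A->B.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

R = set("ABCDEF")
fds = [(set("C"), set("F")), (set("E"), set("A")),
       (set("EC"), set("D")), (set("A"), set("B"))]
ck = set("EC")            # closure(EC) == R, so EC is the candidate key
nonprime = R - ck         # {A, B, D, F}

# X -> Y is a partial dependency when X is a proper subset of the CK
# and Y contains a non-prime attribute
partial = [(x, y) for x, y in fds if x < ck and y & nonprime]
print(partial)            # C->F and E->A are partial, so R is not in 2NF
```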

3rd Normal Form :

- The table or relation must be in 2nd Normal Form.
- There must not be any transitive dependency present in the table.
- 3rd Normal Form always allows a "Dependency Preserving Decomposition":
- while decomposing tables, dependencies can always be preserved in 1st, 2nd and 3rd Normal Form, but not always in BCNF.
- Ensures lossless decomposition.

Eg.
R(ABCD)
FD = {AB->CD , D->A}
Closure of AB ≣ ABCD, so AB is a CK.
Now A is present in the RHS of D->A, so we can replace A with D and check DB:
Closure of DB ≣ ABCD, so DB is also a CK.

CK = {AB , DB}
PA = {A,B,D}
NPA = {C}
Now check for transitive dependency:
AB->CD : the LHS AB is a CK, which satisfies the condition (LHS must be a CK or SK).
D->A : D is not a CK, but the RHS A is a prime attribute.
So neither FD is a transitive dependency.
Hence this table is in 3rd Normal Form.

Boyce-Codd Normal Form :

- The table must be in 3rd Normal Form.
- The LHS of every FD must be a CK or super key.
- BCNF doesn't ensure a "Dependency Preserving Decomposition".
- Ensures lossless decomposition.

Eg.
R(ABCD) , FD = {AB->CD , D->A}
Closure of AB = ABCD, so AB is a CK.
Closure of DB = ABCD, so DB is also a CK.
So CK = {AB , DB}
PA = {A,B,D}
NPA = {C}

First check 3rd Normal Form:

AB->CD : the LHS AB is a CK. D->A : the RHS A is a prime attribute.
Hence there is no transitive dependency, so the relation is in 3rd Normal Form.
Now we just need to check that the LHS of every FD is a CK (or SK).
Here AB is a CK but D is not.
Hence this relation is not in BCNF.
Exercise 1: Find the highest normal form of R(A, B, C, D, E) under the functional dependencies ABC -> D and CD -> AE.
Important points for solving this type of question:
1) It is always a good idea to start checking from BCNF, then 3NF, and so on.
2) If a functional dependency satisfies a normal form, there is no need to check it for the lower normal forms. For example, ABC -> D is in BCNF (note that ABC is a super key), so there is no need to check this dependency for lower normal forms.
The candidate keys of the relation are {ABC, BCD}.
BCNF: ABC -> D is in BCNF. Now check CD -> AE: CD is not a super key, so this dependency is not in BCNF. So R is not in BCNF.
3NF: ABC -> D does not need to be checked as it already satisfies BCNF. Consider CD -> AE: since E is not a prime attribute, the relation is not in 3NF.
2NF: In 2NF we need to check for partial dependencies. CD is a proper subset of the candidate key BCD and it determines E, which is a non-prime attribute. So the given relation is not in 2NF either.
So the highest normal form is 1NF.
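The candidate keys and the conclusions of Exercise 1 can be re-checked by brute force (a sketch; the helper names are our own, and enumerating all attribute subsets is only feasible for tiny schemas like this one):

```python
# Brute-force candidate keys of R(ABCDE) under ABC->D, CD->AE.
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

R = set("ABCDE")
fds = [(set("ABC"), set("D")), (set("CD"), set("AE"))]

keys = []   # minimal attribute sets whose closure is all of R
for k in range(1, len(R) + 1):
    for combo in combinations(sorted(R), k):
        s = set(combo)
        if closure(s, fds) == R and not any(key < s for key in keys):
            keys.append(s)

print(sorted(map(sorted, keys)))     # [['A','B','C'], ['B','C','D']]
prime = set().union(*keys)           # {A, B, C, D}; E is non-prime
print(closure(set("CD"), fds) == R)  # False: CD is not a super key,
                                     # so CD->AE breaks BCNF (and 3NF, 2NF)
```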

Lossless Decomposition

A B C

1 2 1

2 2 2

3 3 2

⬇ ↘
R1(AB) R2(BC)
At least one common column should be kept while decomposing, so that if we want to rejoin the tables in future we can use that common column.

We choose the common column based on some rules:

- The common attribute should be a CK or SK of R1 or R2 (or both).
- In the above example A should be the common column, since B and C contain duplicate values.
- R1(AC) , R2(AB)
- This gives a lossless decomposition.
- Conditions: R1 U R2 = R
- R1 ⋂ R2 != Ø
- The common attribute must be a CK of R1 or R2 (or both).
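The rule can be replayed on the sample rows above by simulating the projections and the natural join in plain Python (a sketch; the tuples are the (A, B, C) rows from the example table):

```python
# Rows of R from the example, columns (A, B, C).
R = [(1, 2, 1), (2, 2, 2), (3, 3, 2)]

# Good decomposition: common attribute A has unique values (a key).
R1 = {(a, b) for a, b, c in R}                     # R1(AB)
R2 = {(a, c) for a, b, c in R}                     # R2(AC)
rejoined = {(a, b, c) for a, b in R1 for a2, c in R2 if a == a2}
print(rejoined == set(R))    # True: lossless

# Bad decomposition on B, whose values contain duplicates.
S1 = {(a, b) for a, b, c in R}                     # S1(AB)
S2 = {(b, c) for a, b, c in R}                     # S2(BC)
rejoined_b = {(a, b, c) for a, b in S1 for b2, c in S2 if b == b2}
print(len(rejoined_b))       # 5: spurious tuples appeared, so the join is lossy
```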

Equivalence of Functional Dependencies :

Consider two sets of functional dependencies F and G. If F+ = G+, that is, if every functional dependency of F is in G+ and every functional dependency of G is in F+, then the two sets of functional dependencies are equivalent.

Eg. FD1 = {A->B , B->C} , FD2 = {A->B , B->C , A->C}

These are equivalent if:

- FD2 covers FD1: for each FD of FD1, compute the closure of its LHS under FD2 and check that the FD follows.
- FD1 covers FD2: likewise in the other direction.

Checking FD2 covers FD1:
- Closure of A under FD2 = ABC and closure of B under FD2 = BC, so A->B and B->C both follow from FD2.
- Hence FD2 covers FD1.
Checking FD1 covers FD2:
- Closure of A under FD1 = ABC and closure of B under FD1 = BC, so A->B, B->C and A->C all follow from FD1.
- So FD1 covers FD2, and therefore FD1 is equivalent to FD2.
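This "mutual cover" check is mechanical once attribute closure is available (a sketch; `covers` is our own helper name):

```python
# FD-set equivalence via closures: F ≡ G iff F covers G and G covers F.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def covers(F, G):
    """True if every FD of G is implied by F (i.e. G is contained in F+)."""
    return all(rhs <= closure(lhs, F) for lhs, rhs in G)

FD1 = [(set("A"), set("B")), (set("B"), set("C"))]
FD2 = [(set("A"), set("B")), (set("B"), set("C")), (set("A"), set("C"))]
print(covers(FD2, FD1))                       # True
print(covers(FD1, FD2))                       # True
print(covers(FD1, FD2) and covers(FD2, FD1))  # True: FD1 ≡ FD2
```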
SQL :

Data Definition Language (DDL) is used to define the database structure or schema. The storage structure and access methods used by the database system are specified by a set of statements in a special type of DDL called a data storage and definition language. These statements define the implementation details of the database schema, which are usually hidden from the users. The data values stored in the database must satisfy certain consistency constraints. For example, suppose the university requires that the account balance of a department must never be negative. The DDL provides facilities to specify such constraints, and the database system checks them every time the database is updated. The database system implements integrity constraints that can be tested with minimal overhead.
Domain Constraints : A domain of possible values must be associated with
every attribute (for example, integer types, character types, date/time types).
Declaring an attribute to be of a particular domain acts as the constraints on the
values that it can take.
Referential Integrity : a value appearing in one table must also appear in the table it references; enforced through foreign keys (FK).
Assertions : An assertion is any condition that the database must always satisfy.
Domain constraints and Integrity constraints are special forms of assertions.
Authorization : The differentiation of users is expressed in terms of authorization. The most common forms are: read authorization, which allows reading but not modification of data; insert authorization, which allows insertion of new data but not modification of existing data; and update authorization, which allows modification but not deletion.
DML (Data Manipulation Language) : DML statements are used for managing
data within schema objects. DML are of two types -
Procedural DMLs : require a user to specify what data are needed and how to
get that data.
Declarative DMLs (also referred as Non-procedural DMLs) : require a user to
specify what data is needed without specifying how to get that data. Declarative
DMLs are usually easier to learn and use than procedural DMLs. However, since
a user does not have to specify how to get the data, the database system has to
figure out an efficient means of accessing data.

TCL (Transaction Control Language) :

Transaction Control Language commands are used to manage transactions in the database, i.e. the changes made by DML statements. TCL also allows statements to be grouped together into logical transactions. Examples of TCL commands -
COMMIT: Commit command is used to permanently save any transaction into
the database.
ROLLBACK: This command restores the database to the last committed state. It
is also used with the savepoint command to jump to a savepoint in a transaction.
SAVEPOINT: Savepoint command is used to temporarily save a transaction so
that you can rollback to that point whenever necessary.
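The three TCL commands can be seen together in one small transaction. This sketch uses SQLite through Python's `sqlite3` module as a convenient engine; the `accounts` table and the savepoint name `sp1` are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None          # let us issue BEGIN/COMMIT ourselves
cur = conn.cursor()
cur.execute("CREATE TABLE accounts(id INTEGER PRIMARY KEY, balance INTEGER)")

cur.execute("BEGIN")
cur.execute("INSERT INTO accounts VALUES (1, 100)")
cur.execute("SAVEPOINT sp1")                        # temporary save point
cur.execute("INSERT INTO accounts VALUES (2, 200)")
cur.execute("ROLLBACK TO sp1")                      # undoes only the 2nd insert
cur.execute("COMMIT")                               # permanently saves the 1st

print(cur.execute("SELECT COUNT(*) FROM accounts").fetchone()[0])  # 1
```

Only the row inserted before the savepoint survives: `ROLLBACK TO` rewinds to the savepoint without ending the transaction, and `COMMIT` then makes the remaining work permanent.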

DCL (Data Control Language) :


A Data Control Language is a syntax similar to a computer programming
language used to control access to data stored in a database
(Authorization). In particular, it is a component of Structured Query Language
(SQL). Examples of DCL commands :
GRANT: allow specific users to perform specified tasks.
REVOKE: cancel previously granted or denied permissions. The operations for
which privileges may be granted to or revoked from a user or role apply to both
the Data definition language (DDL) and the Data manipulation language (DML),
and may include CONNECT, SELECT, INSERT, UPDATE, DELETE, EXECUTE
and USAGE.
A role is created to ease setup and maintenance of the security model. It is a
named group of related privileges that can be granted to the user. When there
are many users in a database it becomes difficult to grant or revoke privileges to
users. Therefore, if you define roles: You can grant or revoke privileges to
users, thereby automatically granting or revoking privileges. You can either
create Roles or use the system roles pre-defined.

Creating and Assigning a Role - First, the (Database Administrator)DBA must


create the role. Then the DBA can assign privileges to the role and users to the
role.
Syntax - ​ CREATE ROLE manager;
Role created.
In the syntax: 'manager' is the name of the role to be created. Now that the role is
created, the DBA can use the GRANT statement to assign users to the role as
well as assign privileges to the role. It's easier to GRANT or REVOKE privileges
to the users through a role rather than assigning a privilege directly to every user.
If a role is identified by a password, then GRANT or REVOKE privileges have to
be identified by the password.
Grant privileges to a role -
GRANT create table, create view
TO manager;
Grant succeeded.
Grant a role to users
GRANT manager TO SAM, STARK;
Grant succeeded.
Revoke privilege from a Role :
REVOKE create table FROM manager;
Drop a Role :
DROP ROLE manager;

Explanation - Firstly it creates a manager role and then allows managers to


create tables and views. It then grants Sam and Stark the role of managers. Now
Sam and Stark can create tables and views. If users have multiple roles granted
to them, they receive all of the privileges associated with all of the roles. Then
create table privilege is removed from the role 'manager' using Revoke.The role
is dropped from the database using drop.
Joins :

Cartesian / Cross Join : the cross product of the records of the two tables.

SELECT Student.NAME, Student.AGE, StudentCourse.COURSE_ID
FROM Student CROSS JOIN StudentCourse;

Self join : it is a join between two copies of the same table.

SELECT a.ROLL_NO , b.NAME FROM Student a, Student b
WHERE a.ROLL_NO < b.ROLL_NO;

EQUI JOIN creates a JOIN for equality or matching column(s) values of the
relative tables. EQUI JOIN also creates JOIN by using JOIN with ON and then
providing the names of the columns with their relative tables to check equality
using equal sign (=).

SELECT student.name, student.id, record.class, record.city
FROM student, record
WHERE student.city = record.city;

Inner Join :

SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE
FROM Student INNER JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;
Left Join :

SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE
FROM Student LEFT JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;

Right Join :

SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE
FROM Student RIGHT JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;

Full Join :

SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE
FROM Student FULL JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;
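The difference between INNER and LEFT join can be seen on a toy pair of tables. This sketch runs the queries through SQLite via Python's `sqlite3`; the sample rows are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Student(ROLL_NO INTEGER, NAME TEXT, AGE INTEGER)")
cur.execute("CREATE TABLE StudentCourse(ROLL_NO INTEGER, COURSE_ID INTEGER)")
cur.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                [(1, "RAM", 18), (2, "SHYAM", 19), (3, "MOHAN", 20)])
cur.executemany("INSERT INTO StudentCourse VALUES (?, ?)",
                [(1, 101), (2, 102)])

inner = cur.execute("""SELECT s.NAME, c.COURSE_ID FROM Student s
                       INNER JOIN StudentCourse c
                       ON s.ROLL_NO = c.ROLL_NO""").fetchall()
left = cur.execute("""SELECT s.NAME, c.COURSE_ID FROM Student s
                      LEFT JOIN StudentCourse c
                      ON s.ROLL_NO = c.ROLL_NO""").fetchall()
print(inner)   # matched rows only
print(left)    # also MOHAN, with COURSE_ID = NULL (None in Python)
```

MOHAN has no course, so the INNER join drops him while the LEFT join keeps him with a NULL on the right-hand side.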

3. Does SQL support programming language features?

Ans : SQL is a language, but it is not a general-purpose programming language; it is a command (query) language. Standard SQL has no control-flow constructs such as for loops or if..else; we only have commands which we can use to query, update, delete, etc. data in the database. SQL allows us to manipulate data in a database.
7. What is the difference between CHAR and VARCHAR2 datatypes in SQL?
Ans : Both of these data types are used for characters, but varchar2 is used for
character strings of variable length, whereas char is used for character
strings of fixed length. For example, if we specify the type as char(5) then we
will not be allowed to store a string of any other length in this variable, but if we
specify the type of this variable as varchar2(5) then we will be allowed to store
strings of variable length. We can store a string of length 3 or 4 or 2 in this
variable.

8. Name different types of case manipulation functions available in SQL.


Ans : There are three types of case manipulation functions available in SQL.
LOWER: The purpose of this function is to return the string in lowercase. It takes
a string as an argument and returns the string by converting it into lower case.
Syntax: LOWER('string')
UPPER: The purpose of this function is to return the string in uppercase. It takes
a string as an argument and returns the string by converting it into uppercase.
Syntax: UPPER('string')
INITCAP: The purpose of this function is to return the string with the first letter in
uppercase and the rest of the letters in lowercase. Syntax: INITCAP('string')

15. What is an index?


Ans : A database index is a data structure that improves the speed of data
retrieval operations on a database table at the cost of additional writes and the
use of more storage space to maintain the extra copy of data. Data can be stored
only in one order on a disk. To support faster access according to different
values, a faster search like binary search for different values is desired. For this
purpose, indexes are created on tables. These indexes need extra space on the
disk, but they allow faster search according to different frequently searched
values

Indexes are special lookup tables that are used by database search engine for
faster retrieval of the data. Simply put, an index is a pointer to data in a table. It is
like an Index page of a book
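The effect of an index on query planning can be observed directly. This sketch uses SQLite via Python's `sqlite3`; the `orders` table and `idx_city` index names are illustrative, and the exact wording of the query plan is SQLite-version dependent:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders(id INTEGER, city TEXT)")
cur.executemany("INSERT INTO orders VALUES (?, ?)",
                [(i, "city%d" % (i % 100)) for i in range(1000)])

query = "SELECT * FROM orders WHERE city = 'city7'"
before = cur.execute("EXPLAIN QUERY PLAN " + query).fetchall()
cur.execute("CREATE INDEX idx_city ON orders(city)")
after = cur.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(before)   # a SCAN of the whole table
print(after)    # a SEARCH ... USING INDEX idx_city
```

Before the index exists, every row must be scanned; afterwards the planner searches the index, exactly the "pointer to data" behaviour described above.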
Q. Difference between DELETE and TRUNCATE

DELETE :
- The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row.
- The DELETE command is slower than the TRUNCATE command.
- To use DELETE you need DELETE permission on the table.
- The identity of the column is retained after using a DELETE statement on the table.
- DELETE can be used with indexed views.

TRUNCATE :
- TRUNCATE TABLE removes the data by deallocating the data pages used to store the table data, and records only the page deallocations in the transaction log.
- The TRUNCATE command is faster than the DELETE command.
- To use TRUNCATE on a table we need at least ALTER permission on the table.
- The identity of the column is reset to its seed value if the table contains an identity column.
- TRUNCATE cannot be used with indexed views.

Q. Difference between DROP and TRUNCATE

DROP :
- The DROP command removes the table definition and its contents.
- Table space is freed from memory.
- DROP is a DDL (Data Definition Language) command.
- After DROP, no view of the table exists.
- Integrity constraints are removed.
- Undo space is not used.
- Quick to perform, but gives rise to complications.

TRUNCATE :
- The TRUNCATE command deletes all the rows from the table.
- It does not free the table space from memory.
- TRUNCATE is also a DDL command.
- A view (the structure) of the table still exists.
- Integrity constraints are not removed.
- Undo space is used, but less than with DELETE.
- Faster than DROP.

INDEXING
Without indexing:
Q. Consider a hard disk in which the block size = 1000 bytes and each record is of size = 250 bytes. If the total number of records is 10000 and the data is stored on the hard disk in no particular order (unordered), what is the average time complexity of a search on the hard disk?

Ans :

No. of records in each block = 1000/250 = 4
No. of blocks required for 10000 records = 10000/4 = 2500

Best case: we find the result in the 1st block = 1 block access.
Worst case: we find the result in the last block = 2500 block accesses.
So the average = 2500/2 = 1250 block accesses.
This is O(N) time, essentially a linear search.
If the data had been ordered, we could have used binary search, and the cost would be O(log N) = log2(2500) ≈ 12 block accesses.
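The arithmetic above can be re-checked in a few lines (the numbers are those from the question):

```python
import math

block_size, record_size, n_records = 1000, 250, 10000
records_per_block = block_size // record_size      # 4 records per block
n_blocks = n_records // records_per_block          # 2500 blocks
avg_linear = n_blocks / 2                          # 1250 block accesses on average
binary = math.ceil(math.log2(n_blocks))            # ~12 accesses if blocks were ordered
print(records_per_block, n_blocks, avg_linear, binary)
```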

Q. Same hard disk (block size = 1000 bytes, record size = 250 bytes, 10000 unordered records). What is the average time complexity of a search through an index table if each index entry is 20 B (key 10 B + pointer 10 B)?

Ans :

No. of records in each block = 1000/250 = 4
No. of blocks required for 10000 records = 10000/4 = 2500

No. of index entries in each block = 1000/20 = 50

Sparse index: one entry per data block, i.e. 2500 entries, which need 2500/50 = 50 index blocks
= log2(50) + 1 block accesses.

Dense index: one entry per record, i.e. 10000 entries, which need 10000/50 = 200 index blocks
= log2(200) + 1 block accesses.
Here we can use binary search on the index.
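The sparse-versus-dense cost comparison can likewise be re-checked (the "+1" accounts for the final access to the data block itself):

```python
import math

n_records, n_blocks = 10000, 2500
entries_per_index_block = 1000 // 20               # 50 entries of 20 B per 1000 B block

sparse_entries = n_blocks                          # one entry per data block
sparse_index_blocks = sparse_entries // entries_per_index_block   # 50
dense_entries = n_records                          # one entry per record
dense_index_blocks = dense_entries // entries_per_index_block     # 200

# binary search over the index blocks, plus one access to the data block
cost_sparse = math.ceil(math.log2(sparse_index_blocks)) + 1       # 7
cost_dense = math.ceil(math.log2(dense_index_blocks)) + 1         # 9
print(sparse_index_blocks, dense_index_blocks, cost_sparse, cost_dense)
```

The sparse index wins on accesses and storage here, which matches the observation later that sparse indexes trade a little search speed for much less space.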

Types of Index :
- Primary Index
- Secondary Index
- Clustered Index

                 Key                 Non-key (may contain duplicates)
Ordered file     Primary Index       Clustered Index
Unordered file   Secondary Index     Secondary Index

An index entry consists of a key and a pointer that stores the address of the data block.

The DBMS uses a hard disk to store all the records in the above database.
As we know that the access time of the hard disk is very slow, searching for anything in
such huge databases could cost performance issues.
Moreover, searching for a repeated item in the database could lead to a greater
consumption of time as this will require searching for all the items in every block.
Suppose there are 100 rows in each block, so when a customer id is searched for in the
database, it will take too much time. The hard disk does not store the data in a particular
order.
One solution to this problem is to arrange the indexes in a database in sorted order so
that any looked up item can be found easily using Binary Search. This creation of orders
to store the indexes is called clustered indexing.

When the primary key is used to order the data in a heap, it is called Primary Indexing.
Sequential File Organization or Ordered Index File: In this, the indices are based on
a sorted ordering of the values. These are generally fast and a more traditional type of
storing mechanism. These Ordered or Sequential file organizations might store the data
in a dense or sparse format:

-​ Dense Index: For every search key value in the data file, there is an index
record. This record contains the search key and also a reference to the first data
record with that search key value.


-​ Sparse Index: The index record appears only for a few items in the data file.
Each item points to a block as shown. To locate a record, we find the index
record with the largest search key value less than or equal to the search key
value we are looking for. We start at that record pointed to by the index record,
and proceed along with the pointers in the file (that is, sequentially) until we find
the desired record.
Clustered Indexing:
-​ Clustering index is defined on an ordered data file. The data file is
ordered on a non-key field.
-​ In some cases, the index is created on non-primary key columns which
may not be unique for each record.
-​ In such cases, in order to identify the records faster, we will group two or
more columns together to get the unique values and create an index out of
them. This method is known as the clustering index.
-​ Basically, records with similar characteristics are grouped together and
indexes are created for these groups.

Primary Indexing:
-​ This is a type of Clustered Indexing wherein the data is sorted according
to the search key and the primary key of the database table is used to
create the index.
-​ It is a default format of indexing where it induces sequential file
organization.
-​ As primary keys are unique and are stored in a sorted manner, the
performance of the searching operation is quite efficient.

Non-clustered or Secondary Indexing


-​ A non-clustered index just tells us where the data lies, i.e. it gives us a list
of virtual pointers or references to the location where the data is actually
stored.
-​ Data is not physically stored in the order of the index. Instead, data is
present in leaf nodes. For eg. the contents page of a book.
-​ Each entry gives us the page number or location of the information stored.
The actual data here(information on each page of the book) is not
organized but we have an ordered reference(contents page) to where the
data points actually lie.
-​ We can have only dense ordering in the non-clustered index as sparse
ordering is not possible because data is not physically organized
accordingly. It requires more time as compared to the clustered index
because some amount of extra work is done in order to extract the data by
further following the pointer. In the case of a clustered index, data is
directly present in front of the index. Let's look at the same order table and
see how the data is arranged in a non-clustered way in an index file.
A sparse index is slower to search than a dense index, but it has the advantage of needing
less storage space and less maintenance overhead when records are inserted or deleted.
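To make the dense-vs-sparse trade-off concrete, here is a small Python sketch of a sparse-index lookup: binary-search the block anchors for the largest key less than or equal to the search key, then scan that block sequentially. All block contents and keys here are made-up illustration data, not a real storage engine.

```python
from bisect import bisect_right

# Hypothetical data file split into blocks; records in each block are
# sorted on the search key.
blocks = [
    [(10, "a"), (20, "b"), (30, "c")],   # block 0
    [(40, "d"), (50, "e"), (60, "f")],   # block 1
    [(70, "g"), (80, "h"), (90, "i")],   # block 2
]
# Sparse index: one (anchor key, block number) entry per block,
# using the first key of each block as the anchor.
sparse_index = [(blk[0][0], i) for i, blk in enumerate(blocks)]

def lookup(key):
    # Step 1: find the last index entry whose anchor key is <= key.
    anchors = [k for k, _ in sparse_index]
    pos = bisect_right(anchors, key) - 1
    if pos < 0:
        return None  # key smaller than every anchor
    # Step 2: scan that block sequentially for the exact key.
    for k, v in blocks[sparse_index[pos][1]]:
        if k == key:
            return v
    return None

print(lookup(50))  # -> e
print(lookup(55))  # -> None (not stored)
```

The index holds only three entries for nine records, which is the storage saving; the price is the extra sequential scan inside the chosen block.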

B- Tree :

Properties of B-Tree:
-​ All leaves are at the same level.
-​ B-Tree is defined by the term minimum degree 't'. The value of t depends upon disk
block size.
-​ Every node except the root must contain at least t-1 keys. The root may contain as few
as 1 key.
-​ All nodes (including root) may contain at most 2*t – 1 keys.
-​ Number of children of a node is equal to the number of keys in it plus 1.
-​ All keys of a node are sorted in increasing order.
-​ The child between two keys k1 and k2 contains all keys in the range from k1 to k2.
-​ B-Tree grows and shrinks from the root which is unlike Binary Search Tree.
-​ Binary Search Trees grow downward and also shrink from downward.
-​ Like other balanced Binary Search Trees, time complexity to search, insert and delete is
O(log n).
-​ Insertion of a Node in B-Tree happens only at Leaf Node.
-​ An inorder traversal of a B-Tree visits the keys in sorted order.
-​ Searching works the same way as in a binary search tree, generalised to many keys per node.
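The search procedure described in the last point can be sketched in a few lines of Python. This is an illustrative toy (search only, no insertion or rebalancing), with a hand-built tree of minimum degree t = 2:

```python
class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []           # keys, kept in sorted order
        self.children = children or []   # internal node: len(children) == len(keys) + 1

def btree_search(node, key):
    """Binary-search-tree search generalised to many keys per node."""
    i = 0
    # Find the first key >= the search key within this node.
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if i < len(node.keys) and node.keys[i] == key:
        return node                      # found in this node
    if not node.children:
        return None                      # reached a leaf without finding it
    # Descend into the child between keys[i-1] and keys[i].
    return btree_search(node.children[i], key)

# A small B-Tree with minimum degree t = 2:
#           [30]
#          /    \
#    [10, 20]  [40, 50]
root = BTreeNode([30], [BTreeNode([10, 20]), BTreeNode([40, 50])])
print(btree_search(root, 40) is not None)  # True
print(btree_search(root, 35) is not None)  # False
```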

The minimum height of a B-Tree storing n keys, where m is the maximum number of children a
node can have, is:

h_min = ⌈ log_m(n + 1) ⌉ − 1

The maximum height of a B-Tree storing n keys, where t is the minimum number of children
that a non-root node can have (the minimum degree), is:

h_max = ⌊ log_t((n + 1) / 2) ⌋
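As a quick sanity check, the two height bounds above can be computed directly. A small Python helper, where n counts keys:

```python
import math

def min_height(n, m):
    # Best case: every node is packed full with fan-out m.
    # h_min = ceil(log_m(n + 1)) - 1
    return math.ceil(math.log(n + 1, m)) - 1

def max_height(n, t):
    # Worst case: every non-root node holds the minimum t - 1 keys.
    # h_max = floor(log_t((n + 1) / 2))
    return math.floor(math.log((n + 1) / 2, t))

print(min_height(100, 5))  # -> 2
print(max_height(100, 2))  # -> 5
```

So 100 keys with fan-out 5 need a tree of height at least 2, while a badly filled tree of minimum degree 2 could be as tall as 5.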

Applications of B-Trees:
-​ It is used in large databases to access data stored in the disk
-​ Searching of data in a data set can be achieved in significantly less time using B tree
-​ With the indexing feature multilevel indexing can be achieved.
-​ Most of the servers also use the B-tree approach.

Disadvantage of B- Tree
-​ Because every node also stores data pointers, fewer keys fit per node, so the tree needs
more levels, thereby increasing the search time.

This disadvantage is reduced in the B+ Tree, which stores data pointers only at the leaf
nodes.

Application of B+ Trees:
-​ Multilevel Indexing
-​ Faster operations on the tree (insertion, deletion, search)
-​ Database indexing

Advantage -
-​ A B+ tree with 'l' levels can store more entries in its internal nodes than a B-tree with
the same 'l' levels.
-​ This significantly improves the search time for any given key. Having fewer levels,
together with the Pnext pointers linking the leaves, makes B+ trees very quick and
efficient at accessing records from disk.
RDBMS VS DBMS

Q . What is the main difference between UNION and UNION ALL?

Ans : UNION and UNION ALL both combine the results of two or more queries. UNION removes
duplicate rows and returns only the distinct rows of the combined result, whereas UNION ALL
keeps every row from the combined result, duplicates included.
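The difference is easy to demonstrate with Python's built-in sqlite3 module. The tables and city names below are made up for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t1(city TEXT);
    CREATE TABLE t2(city TEXT);
    INSERT INTO t1 VALUES ('Delhi'), ('Mumbai');
    INSERT INTO t2 VALUES ('Mumbai'), ('Pune');
""")
# UNION removes duplicates; UNION ALL keeps every row.
union = con.execute("SELECT city FROM t1 UNION SELECT city FROM t2").fetchall()
union_all = con.execute("SELECT city FROM t1 UNION ALL SELECT city FROM t2").fetchall()
print(len(union))      # 3 rows: 'Mumbai' appears once
print(len(union_all))  # 4 rows: 'Mumbai' appears twice
```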
Q . How is the pattern matching done in the SQL?

Answer: Pattern matching in SQL is done with the LIKE operator. '%' matches zero or more
characters, and '_' matches exactly one character.
Ex .
SELECT * from Emp WHERE name like 'b%';
SELECT * from Emp WHERE name like 'hans_';
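A runnable version of those two queries, again via sqlite3 with an invented Emp table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Emp(name TEXT)")
con.executemany("INSERT INTO Emp VALUES (?)",
                [("bob",), ("bella",), ("hans",), ("hansi",), ("hanson",)])

# '%' matches zero or more characters, '_' matches exactly one.
starts_with_b = con.execute("SELECT name FROM Emp WHERE name LIKE 'b%'").fetchall()
hans_plus_one = con.execute("SELECT name FROM Emp WHERE name LIKE 'hans_'").fetchall()

print(starts_with_b)   # bob and bella
print(hans_plus_one)   # only hansi: 'hans' is too short, 'hanson' too long
```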

Q . What is the main goal of RAID technology?

Ans : RAID stands for Redundant Array of Inexpensive (or sometimes "Independent") Disks.

RAID is a method of combining several hard disk drives into one logical unit (two or more disks
grouped together to appear as a single device to the host system). RAID technology was
developed to address the fault-tolerance and performance limitations of conventional disk
storage. It can offer fault tolerance and higher throughput levels than a single hard drive or
group of independent hard drives. While arrays were once considered complex and relatively
specialized storage solutions, today they are easy to use and essential for a broad spectrum of
client/server applications.

Q. Difference between CTE and Views ?


Ans : Views can be indexed, but CTEs can't. This is one important difference.

CTEs work excellently on tree hierarchies, i.e. recursive queries.

Also, consider views when dealing with complex queries. A view is a named object in the
database (though it does not store data physically) and can be reused across multiple
queries, providing flexibility and a centralized definition. A CTE, on the other hand, is
temporary and exists only for the query in which it is defined; that is why CTEs are
sometimes called inline views.
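The recursive strength of CTEs can be shown with a toy employee hierarchy in sqlite3 (table name, rows, and the 'dev' starting point are all invented for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp(id INTEGER, name TEXT, manager_id INTEGER);
    INSERT INTO emp VALUES (1, 'ceo', NULL), (2, 'vp', 1), (3, 'dev', 2);
""")
# Walk the management chain upwards from 'dev' using a recursive CTE.
rows = con.execute("""
    WITH RECURSIVE chain(id, name, manager_id) AS (
        SELECT id, name, manager_id FROM emp WHERE name = 'dev'
        UNION ALL
        SELECT e.id, e.name, e.manager_id
        FROM emp e JOIN chain c ON e.id = c.manager_id
    )
    SELECT name FROM chain
""").fetchall()
print([r[0] for r in rows])  # ['dev', 'vp', 'ceo']
```

The recursion stops automatically when the join finds no manager (manager_id is NULL at the top).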

Q. DIFFERENCE BETWEEN ALTER AND UPDATE ?


ANS : ALTER is a DDL command that changes the structure (schema) of a database object, such
as adding, dropping, or modifying a column of a table. UPDATE is a DML command that changes
the data stored in the existing rows of a table.
Q. DIFFERENCE BETWEEN PARTITION BY AND GROUP BY ?

ANS : GROUP BY collapses rows: the result contains one row per group, with aggregates
computed over each group. PARTITION BY is used with window functions: aggregates are
computed over each partition, but every input row is kept in the output.
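The contrast between the two can be illustrated with sqlite3 (window functions need SQLite 3.25 or newer; the sales table and values are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales(dept TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('A', 10), ('A', 20), ('B', 5);
""")
# GROUP BY collapses rows: one output row per department.
grouped = con.execute(
    "SELECT dept, SUM(amount) FROM sales GROUP BY dept ORDER BY dept").fetchall()
# PARTITION BY keeps every row and attaches the per-department total to each.
windowed = con.execute(
    "SELECT dept, amount, SUM(amount) OVER (PARTITION BY dept) FROM sales").fetchall()

print(grouped)   # [('A', 30), ('B', 5)]  -> 2 rows
print(windowed)  # 3 rows, each row carrying its department's total
```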

Q . DIFFERENCE BETWEEN STORED PROCEDURE , CURSOR , TRIGGER .

ANS : A stored procedure is a named, precompiled batch of SQL statements that is invoked
explicitly by a user or application. A cursor is a database object used to traverse a
result set row by row. A trigger is a special stored procedure that runs automatically when
a specified event (such as INSERT, UPDATE, or DELETE) occurs on a table.


Subquery: A query that is one level nested inside another query.

●​ It’s a single "inner query" used within an "outer query."

Nested Query: A more general term that refers to queries within queries and can involve
multiple levels of nesting, where one query contains another, and that inner query may itself
contain another query.
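A minimal subquery example in sqlite3 (invented emp table; the inner query computes the average salary, the outer query filters against it):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp(name TEXT, salary INTEGER);
    INSERT INTO emp VALUES ('a', 100), ('b', 200), ('c', 300);
""")
# Inner query runs first conceptually; outer query uses its result.
rows = con.execute("""
    SELECT name FROM emp
    WHERE salary > (SELECT AVG(salary) FROM emp)
""").fetchall()
print([r[0] for r in rows])  # ['c'] -- the average is 200
```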

Q . What is a trigger?
Ans : A trigger is a special type of stored procedure that automatically runs when an event
occurs in the database server. There are 3 types of triggers
a) DDL (Data Definition Language) triggers: We can create triggers on DDL statements (like
CREATE, ALTER, and DROP) and certain system-defined stored procedures that perform
DDL-like operations. DDL triggers can be used to observe and control actions performed on the
server, and to audit these operations.
b) DML (Data Manipulation Language) triggers: In SQL Server we can create triggers on DML
statements (like INSERT, UPDATE, and DELETE) and on stored procedures that perform DML-like
operations. These triggers are of two types:
After Trigger: This type of trigger fires after SQL Server has successfully finished
executing the action that fired it. If you insert a row into a table, the trigger
associated with the insert event on that table fires only after the row passes all
constraints, such as the primary key constraint. If the row insertion fails, SQL Server
will not fire the After trigger.
Instead of Trigger: An INSTEAD OF trigger is a trigger that allows you to skip an
INSERT, DELETE, or UPDATE statement to a table or a view and execute other
statements defined in the trigger instead.
c) Logon Triggers: Logon triggers are a special type of trigger that fire when LOGON event of
SQL Server is raised. We can use these triggers to audit and to track login activity or limit the
number of sessions for a specific login.
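An AFTER trigger can be demonstrated in SQLite as well (SQLite supports DML triggers, though not the DDL or logon triggers described above, which are SQL Server features). The emp and audit tables are invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp(name TEXT);
    CREATE TABLE audit(action TEXT, name TEXT);
    -- AFTER trigger: fires only once the INSERT has succeeded.
    CREATE TRIGGER emp_audit AFTER INSERT ON emp
    BEGIN
        INSERT INTO audit VALUES ('insert', NEW.name);
    END;
""")
con.execute("INSERT INTO emp VALUES ('alice')")
# The trigger ran automatically; no explicit call was made.
print(con.execute("SELECT * FROM audit").fetchall())  # [('insert', 'alice')]
```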
