
Mai Idris Alooma Polytechnic, Geidam

Department of Computer Science


Com 322 lecture note (Database Design II)

Step 1
Understand object-oriented data models and the concept of object-oriented languages.

Data Model
A data model gives us an idea of how the final system will look after its complete implementation. It defines the data elements and the relationships between the data elements. Data models are used to show how data is stored, connected, accessed, and updated in a database management system. Here, we use a set of symbols and text to represent the information so that members of the organization can communicate and understand it. Though many data models are in use nowadays, the relational model is the most widely used. Apart from the relational model, there are many other types of data models, which we will study in detail below. Some of the data models in DBMS are:

1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model
7. Flat Data Model
8. Semi-Structured Data Model
9. Associative Data Model
10. Context Data Model

Hierarchical Model

The hierarchical model was the first DBMS model. This model organizes the data in a hierarchical tree structure. The hierarchy starts from the root, which holds the root data, and expands in the form of a tree by adding child nodes to parent nodes. Refer to COM 312.
Advantages of Hierarchical Model

 It is very simple and fast to traverse through a tree-like structure.


 Any change in the parent node is automatically reflected in the child node, so the integrity of the data is maintained.
Disadvantages of Hierarchical Model

 Complex relationships are not supported.

COM 322 (DATABASE DESIGN II)

BABA SALEH 1
 As a child node cannot have more than one parent, a complex relationship in which a child node needs two parent nodes cannot be represented in this model.
 If a parent node is deleted then the child node is automatically deleted.

Network Model
This model is an extension of the hierarchical model. It was the most popular model before the relational model. It is the same as the hierarchical model except that a record can have more than one parent: it replaces the hierarchical tree with a graph. Example: a Student node can have two parents, e.g. the CSE Department and the Library, which was not possible in the hierarchical model. Refer to COM 312.
Advantages of Network Model

 The data can be accessed faster as compared to the hierarchical model. This is because
the data is more related in the network model and there can be more than one path to
reach a particular node. So the data can be accessed in many ways.
 As there is a parent-child relationship, data integrity is maintained: any change in a parent record is reflected in its child records.
Disadvantages of Network Model
 As more and more relationships need to be handled, the system can become complex, so a user must have detailed knowledge of the model to work with it.
 Any change, such as an insertion, update, or deletion, is very complex.

Entity-Relationship Model
Entity-Relationship Model or simply ER Model is a high-level data model diagram. In this
model, we represent the real-world problem in the pictorial form to make it easy for the
stakeholders to understand. It is also very easy for the developers to understand the system by
just looking at the ER diagram. We use the ER diagram as a visual tool to represent an ER
Model. ER diagram has the following three components:
Entities: Entity is a real-world thing. It can be a person, place, or even a
concept. Example: Teachers, Students, Course, Building, Department, etc are some of the
entities of a School Management System.
Attributes: An entity has real-world properties called attributes, and an attribute is a characteristic of that entity. Example: the entity Teacher has properties like teacher ID, salary, age, etc.
Relationship: A relationship tells how two entities are related to each other.

Relational Model
The relational model is the most widely used model. In this model, the data is maintained in the form of two-dimensional tables, and all information is stored as rows and columns. The basic structure of the relational model is the table, so tables are also called relations in the relational model. Example: in this example, we have an Employee table.
Advantages of Relational Model

 Simple: This model is simpler than the network and hierarchical models.
 Scalable: This model can be easily scaled, as we can add as many rows and columns as we want.
 Structural Independence: We can make changes to the database structure without changing the way the data is accessed. When changes to the database structure do not affect the DBMS's capability to access the data, structural independence has been achieved.
Disadvantages of Relational Model

 Hardware Overheads: To hide the complexities and make things easier for the user, this model requires more powerful hardware and data storage devices.
 Bad Design: Because the relational model is very easy to design and use, users do not need to know how the data is stored in order to access it. This ease of design can lead to a poorly designed database that slows down as it grows.
But all these disadvantages are minor as compared to the advantages of the relational model.
These problems can be avoided with the help of proper implementation and organization.

Object-Oriented Data Model


Real-world problems are represented more closely through the object-oriented data model. In this model, both the data and the relationships are present in a single structure known as an object. We can store audio, video, images, etc. in the database, which was not possible in the relational model (although audio and video can be stored in a relational database, it is advised not to). In this model, two or more objects are connected through links, and we use these links to relate one object to another. This can be understood from the example given below.

In the above example, we have two objects Employee and Department. All the data and
relationships of each object are contained as a single unit. The attributes like Name, Job_title
of the employee and the methods which will be performed by that object are stored as a single
object. The two objects are connected through a common attribute, i.e. the Department_id, and the communication between the two is done with the help of this common ID.

Object-Relational Model
As the name suggests, it is a combination of the relational model and the object-oriented model. This model was built to fill the gap between the object-oriented model and the relational model. It provides advanced features; for example, we can build complex data types according to our requirements using the existing data types. The problem with this model is that it can become complex and difficult to handle, so a proper understanding of it is required.

Flat Data Model


It is a simple model in which the database is represented as a table consisting of rows and columns. To access any data, the computer has to read the entire table, which makes the model slow and inefficient.

Semi-Structured Model
The semi-structured model is an evolved form of the relational model. We cannot differentiate between data and schema in this model. Example: web-based data sources, where the schema and the data of the website cannot be separated. In this model, some entities may have missing attributes while others may have extra attributes. This model gives flexibility in storing the data, and it also gives flexibility to the attributes. Example: if we store a value in an attribute, that value can be either an atomic value or a collection of values.

Associative Data Model


The associative data model is a model in which the data is divided into two parts. Everything that has an independent existence is called an entity, and the relationships among these entities are called associations. The two parts into which the data is divided are called items and links.

The concept of object oriented languages (Refer to COM 313 OOP Using C++)

Step 2&3
Design Forms, Reports, and Triggers
Forms allow you to both add data to tables and view data that already exists.
Reports present data from tables and also from queries, which then search for and analyze data
within these same tables. Reports also display your data, but on paper. Unlike Forms, Reports
don’t allow you to edit the data – they are designed to be static. After all, once you’ve printed
your data on paper (or as a PDF) it’s going to be pretty static, so Reports reflect that.
Triggers A trigger is a special type of stored procedure that automatically runs when an event
occurs in the database server.
DML triggers run when a user tries to modify data through a data manipulation language (DML) event. DML events are INSERT, UPDATE, or DELETE statements on a table or view.
A database trigger is a set of instructions that automatically executes in response to certain
events on a particular table or view in a database. These events typically include actions such as
INSERT, UPDATE, or DELETE operations. Triggers are used to enforce business rules, validate
data, maintain audit trails, and synchronize tables.
Here are some key points about database triggers:
1. Automatic Execution: Triggers execute automatically when a specified event occurs.
2. Event-Driven: They are activated by events like INSERT, UPDATE, or DELETE.
3. Types of Triggers:
o Before Triggers: Execute before the triggering event.
o After Triggers: Execute after the triggering event.
o Instead Of Triggers: Replace the triggering event, typically used with views.
4. Scope: Can be defined at the row level (for each affected row) or statement level (once
per statement execution).
5. Uses:
o Enforcing complex business rules and constraints.
o Maintaining historical data (audit logs).
o Synchronizing data between tables.
o Preventing invalid transactions.
Example: a sample trigger is given in the practical manual.
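Although triggers live inside the database server, the event-driven idea described above can be sketched in ordinary application code. The following Python sketch is purely illustrative (the Table class, event names, and trigger functions are invented for this example, not a real DBMS API); it shows a BEFORE trigger enforcing a business rule and an AFTER trigger maintaining an audit trail.

```python
# Illustrative sketch of trigger behaviour; not a real DBMS API.
class Table:
    def __init__(self):
        self.rows = []
        self.audit_log = []
        self.triggers = {"before_insert": [], "after_insert": []}

    def create_trigger(self, event, fn):
        # Register a function to run automatically on the given event
        self.triggers[event].append(fn)

    def insert(self, row):
        for fn in self.triggers["before_insert"]:
            fn(self, row)          # BEFORE triggers run first (validation)
        self.rows.append(row)
        for fn in self.triggers["after_insert"]:
            fn(self, row)          # AFTER triggers run once the row is stored

def check_salary(table, row):
    # BEFORE trigger: enforce a business rule before the row is stored
    if row["salary"] < 0:
        raise ValueError("salary must be non-negative")

def log_insert(table, row):
    # AFTER trigger: maintain an audit trail
    table.audit_log.append(("INSERT", row["name"]))

emp = Table()
emp.create_trigger("before_insert", check_salary)
emp.create_trigger("after_insert", log_insert)
emp.insert({"name": "Aisha", "salary": 50000})
```

Inserting a row with a negative salary would raise an error in the BEFORE trigger, so the row is never stored, which mirrors how a real trigger can prevent invalid transactions.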

Explanation and demonstration of how to design forms, reports, and triggers in object-oriented databases will come in the practical manual (see practical manual).

Step 4&5
Physical storage media and tertiary storage devices

1. Several types of data storage exist in most computer systems. They vary in
speed of access, cost per unit of data, and reliability.
o Cache: the most costly and fastest form of storage. Usually very small, and managed by the operating system.
o Main Memory (MM): the storage area for data available to be operated
on.
 General-purpose machine instructions operate on main memory.
 Contents of main memory are usually lost in a power failure or
``crash''.
 Usually too small (even with megabytes) and too expensive to store the entire database.
o Flash memory: EEPROM (electrically erasable programmable read-
only memory).
 Data in flash memory survives power failures.
 Reading data from flash memory takes about 10 nanoseconds (roughly as fast as from main memory); writing data into flash memory is more complicated: a write takes about 4-10 microseconds.

 To overwrite what has been written, one has to first erase the entire bank of memory, and flash memory may support only a limited number of erase cycles.
 It has found popularity as a replacement for disks for storing small volumes of data (5-10 megabytes).
o Magnetic-disk storage: primary medium for long-term storage.
 Typically the entire database is stored on disk.
 Data must be moved from disk to main memory in order for the
data to be operated on.
 After operations are performed, data must be copied back to disk
if any changes were made.
 Disk storage is called direct access storage as it is possible to read
data on the disk in any order (unlike sequential access).
 Disk storage usually survives power failures and system crashes.
o Optical storage: CD-ROM (compact-disk read-only memory), WORM
(write-once read-many) disk (for archival storage of data), and Juke
box (containing a few drives and numerous disks loaded on demand).
o Tape Storage: used primarily for backup and archival data.
 Cheaper, but much slower access, since tape must be read
sequentially from the beginning.
 Used as protection from disk failures!
2. The storage device hierarchy is presented in Figure, where the higher levels are
expensive (cost per bit), fast (access time), but the capacity is smaller.

Figure Storage-device hierarchy

3. Another classification: Primary, Secondary, and Tertiary storage.

1. Primary storage: the fastest storage media, such as cache and main memory.
2. Secondary (or on-line) storage: the next level of the hierarchy, e.g., magnetic disks.
3. Tertiary (or off-line) storage: magnetic tapes and optical-disk jukeboxes.
4. Volatility of storage. Volatile storage loses its contents when the power is
removed. Without power backup, data in the volatile storage (the part of the
hierarchy from main memory up) must be written to nonvolatile storage for
safekeeping.

Access and organization of records, and the data dictionary


A data dictionary is like a bill of materials for a database; it lists all database components, including
reports, tables, field names and field types. Such information helps audit databases for objects you no
longer need. ... To create a table from the report, export it to Excel, and then back into Access. (see
practical manual for organization of record)

Storage structure of object oriented databases


Object-Relational Database Systems. ORDB systems can be thought of as an attempt to extend relational database systems with the functionality necessary to support a broader class of application domains, and to provide a bridge between the relational and object-oriented paradigms. This approach attempts to get the best of both.
Object and Class
A conceptual entity is anything that exists and can be distinctly identified, e.g. a person, an employee, a car, a part.
In an OO system, all conceptual entities are modelled as objects. An object has structural properties defined by a finite set of attributes and behavioral properties defined by a finite set of methods. All objects with the same set of attributes and methods are grouped into a class and form instances of that class.
E.g. a class EMPLOYEE can have a class attribute called NO_of_EMPLOYEES, which holds a count of the number of employee instances in the class, and NEXT_ENO, which holds the employee number of the next new employee.
The class EMPLOYEE can have a class method called NEW, which is used to construct new instances of the class.

The basic concepts of indexing and hashing


Data is stored in the form of records, and every record has a key field, which helps it to be recognized uniquely. Indexing is a data structure technique to efficiently retrieve records from the database based on the attributes on which the indexing has been done. Indexing in a database is similar to the index we see in books.
Indexing in DBMS
Indexing is used to optimize the performance of a database by minimizing the number of disk
accesses required when a query is processed.
The index is a type of data structure. It is used to locate and access the data in a database table
quickly.
It is defined based on the indexing attribute. Index structure: indexes can be created using some database columns.

Figure
The first column of the database is the search key that contains a copy of the primary key or
candidate key of the table. The values of the primary key are stored in sorted order so that the
corresponding data can be accessed easily.
The second column of the database is the data reference. It contains a set of pointers holding the
address of the disk block where the value of the particular key can be found.

Ordered indices
The indices are usually sorted to make searching faster. The indices which are sorted are known
as ordered indices.
Example: Suppose we have an employee table with thousands of records, each of which is 10 bytes long. If the IDs start with 1, 2, 3, and so on, and we have to search for the record with ID 543:
In the case of a database with no index, we have to scan the disk blocks from the start until we reach 543. The DBMS will find the record after reading 543*10 = 5430 bytes.
In the case of an index, we search using the index, and the DBMS will find the record after reading 542*2 = 1084 bytes, which is far less than in the previous case.
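The ordered-index lookup above can be sketched as a binary search over a sorted list of (search key, record offset) pairs. This is a minimal sketch, assuming 10-byte records and invented sequential IDs as in the example:

```python
import bisect

# Ordered index: sorted (search_key, record_offset) pairs.
# Each record is assumed to be 10 bytes long, as in the example above.
RECORD_SIZE = 10
index = [(emp_id, (emp_id - 1) * RECORD_SIZE) for emp_id in range(1, 1001)]
keys = [k for k, _ in index]

def lookup(emp_id):
    # Binary search over the sorted keys: O(log n) instead of a full scan
    i = bisect.bisect_left(keys, emp_id)
    if i < len(keys) and keys[i] == emp_id:
        return index[i][1]          # byte offset of the record on disk
    return None

offset = lookup(543)                # record 543 starts at byte 5420
```

Because the keys are kept sorted, finding ID 543 touches only about log2(1000) ≈ 10 index entries instead of scanning hundreds of records.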

Understand B+ and B-tree index files


B+ tree
A B+ tree is used to store records in secondary memory. If the records are stored using this structure, those files are called B+ tree index files. Since this tree is balanced and sorted, all the leaf nodes are at the same distance from the root, and only the leaf nodes hold the actual values, which makes searching for any record easy and quick in B+ tree index files.
Even insertion and deletion in a B+ tree do not take much time. Hence the B+ tree forms an efficient method for storing records.
Searching, inserting, and deleting a record are done in the same way we have seen above. Since it is a balanced tree, it searches for the position of the record in the file and then fetches/inserts/deletes the record. If it finds that the tree would become unbalanced because of an insert, delete, or update, it rearranges the nodes so that the definition of the B+ tree is not violated.
Below is a simple example of how student details are stored in B+ tree index files.

What is Hashing in DBMS?

In a DBMS, hashing is a technique to directly compute the location of desired data on the disk without using an index structure. The hashing method is used to index and retrieve items in a database because it is faster to search for a specific item using the shorter hashed key than using its original value. Data is stored in the form of data blocks whose addresses are generated by applying a hash function; the memory location where these records are stored is known as a data block or data bucket.

Why do we need Hashing?

Here are the situations in a DBMS where you need to apply the hashing method:

 For a huge database structure, it is tough to search all the index values through all their levels and then reach the destination data block to get the desired data.
 Hashing method is used to index and retrieve items in a database as it is faster to search
that specific item using the shorter hashed key instead of using its original value.
 Hashing is an ideal method to calculate the direct location of a data record on the disk
without using index structure.
 It is also a helpful technique for implementing dictionaries.

Important Terminology Used in Hashing

Here are the important terms used in hashing:

 Data bucket – Data buckets are the memory locations where the records are stored. A data bucket is also known as a unit of storage.
 Hash function – A hash function is a mapping function that maps the set of all search keys to the addresses where the actual records are placed. It can range from a simple mathematical function to a complex one.
 Hash index – The address of the data block produced by the hash function.

 Double hashing – Double hashing is a method used in hash tables to resolve collisions.
 Bucket overflow – The condition of bucket overflow is called a collision. This is a fatal state for any static hash function.

There are mainly two types of hashing methods:

1. Static Hashing
2. Dynamic Hashing

Static Hashing

In the static hashing, the resultant data bucket address will always remain the same.

Therefore, if you generate an address for say Student_ID = 10 using hashing function mod(3),
the resultant bucket address will always be 1. So, you will not see any change in the bucket
address.

Therefore, in this static hashing method, the number of data buckets in memory always remains
constant.
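A minimal sketch of static hashing, using the mod(3) hash function and Student_ID = 10 from the example above (the bucket contents are illustrative):

```python
NUM_BUCKETS = 3                      # fixed: static hashing never resizes

def h(key):
    # Static hash function: the bucket address for a key never changes
    return key % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(key, record):
    buckets[h(key)].append((key, record))

def search(key):
    # Look only in the one bucket the hash function points at
    for k, record in buckets[h(key)]:
        if k == key:
            return record
    return None

insert(10, "record for Student_ID 10")   # 10 mod 3 = 1, so bucket 1
```

No matter how many records arrive, the number of buckets stays at 3, which is exactly the limitation that dynamic hashing (below) removes.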
Static Hash Functions

 Inserting a record
 Searching
 Delete a record

Dynamic Hashing
Dynamic hashing offers a mechanism in which data buckets are added and removed dynamically
and on demand. In this hashing, the hash function helps you to create a large number of values.
What is Collision?
A hash collision is a state in which the hashes of two or more items in the data set map to the same place in the hash table.
How to deal with Hashing Collision?
There are two techniques you can use to deal with a hash collision:
1. Rehashing: This method invokes a secondary hash function, which is applied repeatedly until an empty slot is found where the record can be placed.
2. Chaining: This method builds a linked list of items whose keys hash to the same value. It requires an extra link field in each table position.
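Chaining can be sketched in a few lines: each table position holds a list of all (key, value) pairs whose keys hash to that index. The table size and keys here are chosen purely for illustration:

```python
class ChainedHashTable:
    # Chaining: each slot holds a chain (here a Python list) of all
    # (key, value) pairs whose keys hash to that slot.
    def __init__(self, size=5):
        self.size = size
        self.slots = [[] for _ in range(size)]

    def _h(self, key):
        return key % self.size

    def put(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:             # key already present: update in place
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def get(self, key):
        # Hash to the slot, then search only that slot's chain
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.put(7, "A")
t.put(12, "B")    # 12 % 5 == 7 % 5 == 2: collision, same chain
```

Both keys survive the collision because they simply share the chain at slot 2.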

Difference between Static and Dynamic Hashing


The main difference between static and dynamic hashing is that, in static hashing, the
resultant data bucket address is always the same while, in dynamic hashing, the data
buckets grow or shrink according to the increase and decrease of records.

It is not possible to search all the indexes to find the data in a large database. Hashing provides
an alternative to this issue. Furthermore, it allows calculating the direct location of data on the
disk without using indexes. Hashing uses mathematical functions called hash functions to
generate addresses of data records. In addition, the memory locations that store data are called
data buckets. There are two types of hashing called static and dynamic hashing.

What is hashing in a database, and what are the basic techniques for hashing?
Hashing in databases is a technique used to efficiently retrieve and store data. It involves
converting a key (such as a database record’s unique identifier) into a fixed-size value, often
called a "hash value" or "hash code," using a mathematical function known as a hash function.
This hash value is then used as an index in a hash table, where the actual data is stored.
Purpose of Hashing
The main purpose of hashing is to achieve quick data retrieval. When you want to access data,
instead of searching through the entire dataset, you can compute the hash of the key and directly
go to the location in the hash table where the data is stored. This makes data retrieval operations
(like lookups, inserts, and deletes) faster, often achieving near constant time complexity, O(1).
Basic Techniques for Hashing
1. Direct Hashing:
o Simple Hash Function: A simple mathematical function is used to convert the
key into an index. For example, if the key is a number, the hash function could be
something like h(key) = key % table_size.
o Characteristics: It is straightforward but can lead to collisions if two keys hash to
the same index.
2. Collision Handling:
o Chaining: When a collision occurs (i.e., two keys hash to the same index), all the
keys that hash to that index are stored in a linked list (or any other data structure)
at that index. When you need to retrieve an item, you compute its hash to find the
index and then search within the linked list.
o Open Addressing: Instead of storing multiple keys in the same index, open
addressing finds another open slot within the hash table to store the key.
Techniques include:
 Linear Probing: If the desired slot is taken, you check the next slot (index
+ 1) until an open slot is found.
 Quadratic Probing: Similar to linear probing but checks slots by a
quadratic function, e.g., (index + 1^2), (index + 2^2), etc., to reduce
clustering.
 Double Hashing: A secondary hash function is used to determine the step
size in case of a collision, helping to spread out the keys more uniformly.
3. Rehashing:
o Rehashing: If the hash table becomes too full, or if too many collisions occur,
rehashing is performed. This involves creating a new, larger hash table and
rehashing all existing keys into the new table with a new hash function.
4. Perfect Hashing:
o Perfect Hashing: This technique is used when there is a fixed set of keys known
in advance. It constructs a hash function that maps each key to a unique index,
ensuring no collisions occur. This is more complex but provides very efficient
retrieval.
5. Hash Functions:
o Division Method: The key is divided by the size of the hash table, and the
remainder is used as the hash value, e.g., h(key) = key % table_size.
o Multiplication Method: A multiplication constant (often irrational) is multiplied
by the key, and the fractional part is multiplied by the table size to determine the
index.
o Universal Hashing: A family of hash functions is chosen randomly for each run,
ensuring that the chance of collisions is minimized.
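The techniques above can be illustrated with open addressing via linear probing, using the division-method hash function. This is a minimal sketch; the table size and keys are chosen for illustration:

```python
TABLE_SIZE = 7
table = [None] * TABLE_SIZE

def h(key):
    # Division method: remainder of the key by the table size
    return key % TABLE_SIZE

def insert(key):
    # Linear probing: if the slot is taken, try (index + 1), (index + 2), ...
    for step in range(TABLE_SIZE):
        i = (h(key) + step) % TABLE_SIZE
        if table[i] is None:
            table[i] = key
            return i
    raise RuntimeError("hash table is full; rehashing would be needed")

insert(10)   # 10 % 7 == 3, so slot 3 is used
insert(17)   # 17 % 7 == 3 too: collision, probing moves it to slot 4
```

When the table fills up (the RuntimeError above), a real system would rehash: allocate a larger table and re-insert every key with a new hash function.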
Key Points to Remember
 Hash Function Quality: The efficiency of hashing largely depends on the quality of the
hash function. A good hash function distributes keys uniformly across the hash table.
 Load Factor: The ratio of the number of stored entries to the table size is critical. A
higher load factor increases the chances of collisions.
Hashing is fundamental to database indexing, as it allows faster access to records than other data structures such as binary search trees or linked lists.

Multiple Key Accesses


To handle queries that filter on several columns at once, multiple key access is introduced. Here, we build an index on a combination of two or more columns that are frequently queried together. For example, both DEPT_ID and SALARY can be clubbed into one index and stored in the index file. Now what happens when we fire such a query? It filters on both DEPT_ID = 20 and SALARY = 5000 in one shot and returns the result.
This type of indexing works well when all the columns used in the index are involved in the query. In the example above, we have an index on (DEPT_ID, SALARY) and both columns are used in the query, so the result is returned quickly. But what happens when only one of the columns is used in the query? The index cannot be used to fetch the record, and the query becomes slower.
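The composite-index behaviour can be sketched with a Python dictionary keyed on (DEPT_ID, SALARY); the employee rows below are invented for illustration:

```python
# Illustrative rows; column names follow the (DEPT_ID, SALARY) example.
employees = [
    {"emp_id": 1, "dept_id": 20, "salary": 5000},
    {"emp_id": 2, "dept_id": 20, "salary": 7000},
    {"emp_id": 3, "dept_id": 30, "salary": 5000},
]

# Composite index on (dept_id, salary): one entry per key combination
index = {}
for row in employees:
    index.setdefault((row["dept_id"], row["salary"]), []).append(row["emp_id"])

# Query using BOTH indexed columns: a single direct lookup
hits = index.get((20, 5000), [])

# Query on salary alone cannot use the composite key directly:
# it must scan every index entry, which is the slow path
salary_only = [ids for (dept, sal), ids in index.items() if sal == 5000]
```

The single lookup for (20, 5000) is constant time, while the salary-only query degrades to a scan over all index entries, matching the limitation described above.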

JOINS
The join operator is one of the most useful operators in relational algebra and is used to combine information from two or more relations. Joins are of the following types:
1. Conditional join
2. Equi join
3. Natural join
CONDITIONAL JOIN
In conditional joins, the given relations are combined with respect to some conditions.
For e.g. S1 ⋈ S1.SID <= R1.SID R1 (the join condition is S1.SID <= R1.SID)

EQUI JOIN
An equi join is a special case of the join operator in which the join condition is an equality between columns. For e.g. S1 ⋈ S1.SID = R1.SID R1

NATURAL JOIN
A natural join is a special case of the join operator in which equality is specified on all the fields with the same name; it is basically an equi join over all commonly named fields. A natural join can be represented as below:
S1 ⋈ R1
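The three join types can be sketched over two small relations; the rows and column names below are invented for illustration:

```python
S1 = [{"sid": 1, "sname": "Ali"}, {"sid": 2, "sname": "Musa"}]
R1 = [{"sid": 1, "bid": 101}, {"sid": 3, "bid": 102}]

def conditional_join(left, right, cond):
    # Conditional join: keep every pair of rows that satisfies cond
    return [{**l, **r} for l in left for r in right if cond(l, r)]

# Equi join: the condition is an equality, here S1.sid = R1.sid
equi = conditional_join(S1, R1, lambda l, r: l["sid"] == r["sid"])

def natural_join(left, right):
    # Natural join: equality on ALL commonly named fields
    common = set(left[0]) & set(right[0])
    return conditional_join(
        left, right,
        lambda l, r: all(l[c] == r[c] for c in common))

natural = natural_join(S1, R1)
```

Here only the rows with sid = 1 pair up, so both the equi join and the natural join produce a single combined row.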

Sorting
Sorting a database means arranging the records in a specific way to make the reported data more usable. You sort records by choosing a specific field (or fields) within a record by which to sort. For example, an alphabetical sort on the last-name field will arrange text data in ascending alphabetical (A-Z) order. If specified, the text field can also be sorted in descending order. (This will be discussed during the practical.)
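Sorting on a chosen field can be sketched with Python's built-in sorted (the records are invented for illustration):

```python
records = [
    {"last_name": "Bello", "age": 30},
    {"last_name": "Abdullahi", "age": 25},
    {"last_name": "Sani", "age": 28},
]

# Ascending (A-Z) sort on the last_name field
ascending = sorted(records, key=lambda r: r["last_name"])

# Descending (Z-A) sort, when the order is explicitly reversed
descending = sorted(records, key=lambda r: r["last_name"], reverse=True)
```

The key function plays the role of the chosen sort field, and reverse=True corresponds to specifying descending order.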

Understand transaction, transaction state, atomicity and durability


Transaction Introduction
 A set of operations that forms a single logical unit of work is called a transaction.
 A transaction is a unit of program execution that accesses and possibly updates various
data items.
 E.g. transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)

 read(X), which transfers the data item X from the database to a variable, also called X, in
a buffer in main memory belonging to the transaction that executed the read operation.
 write(X), which transfers the value in the variable X in the main-memory buffer of the
transaction that executed the write to the data item X in the database.

Transaction Properties
A transaction has following properties:
1. Atomicity
2. Consistency
3. Isolation
4. Durability
ATOMICITY
 If the transaction fails after step 3 and before step 6, money will be “lost” leading
to an inconsistent database state
 Failure could be due to software or hardware
 The system should ensure that updates of a partially executed transaction are not
reflected in the database
 All or nothing, regarding the execution of the transaction
CONSISTENCY
 The sum of A and B is unchanged by the execution of the transaction.
 Execution of a (single) transaction preserves the consistency of the database.
 A transaction must see a consistent database and must leave a consistent database
 During transaction execution the database may be temporarily inconsistent.
 Constraints to be verified only at the end of the transaction
ISOLATION
 Concurrent execution of two or more transactions should be consistent; this property is called isolation.
 If between steps 3 and 6, another transaction T2 is allowed to access the partially updated
database, it will see an inconsistent database (the sum A + B will be less than it should
be).

T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
DURABILITY
 After a transaction completes successfully, the changes it has made to the database
persist, even if there are system failures.
 Database should be recoverable under any type of crash.
Example of Fund Transfer
 Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
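The six-step transfer can be sketched so that either all of its writes take effect or none do, which is the atomicity property discussed above. This is a minimal sketch (the $500/$200 balances match the figures used in the schedule example below):

```python
db = {"A": 500, "B": 200}

def transfer(db, amount):
    a = db["A"]            # 1. read(A)
    a = a - amount         # 2. A := A - amount
    b = db["B"]            # 4. read(B)
    b = b + amount         # 5. B := B + amount
    if a < 0:
        # Failure between the reads and the writes: nothing has been
        # written yet, so the partial transaction leaves no trace
        raise ValueError("insufficient funds in A")
    db["A"] = a            # write(A) and write(B) are applied together,
    db["B"] = b            # so no partial update is ever visible

transfer(db, 50)           # db is now {"A": 450, "B": 250}
```

Note that in the sketch write(A) is deferred until after the check, so a failed transfer changes nothing; a real DBMS achieves the same all-or-nothing effect with logging and rollback rather than by reordering writes.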
TRANSACTION STATES

 Active – the initial state; the transaction stays in this state while it is executing.
 Partially committed – after all the operations of the transaction have completed successfully, it enters the start-to-commit state, where it instructs the DBMS to reflect the changes it made into the database.
 Failed – after the system determines that the transaction can no longer proceed with its normal execution because of hardware or logical errors.
 Aborted – after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. There are two options after a transaction has been aborted:
 restart the transaction (after a hardware or software error)
 kill the transaction (after an internal logical error)
 Committed – after successful completion. Once a transaction has committed, we cannot undo its effects by aborting it.
CONCURRENT EXECUTION
 Transaction processing systems usually allow multiple transactions to run in parallel.
 Allowing multiple transactions to update data concurrently causes several complications for the consistency of the data.
 Because of that, one might think that the best way of executing transactions is serial execution.
 However, executing multiple transactions at the same time has its own benefits.
ADVANTAGES OF CONCURRENT EXECUTION
 Improved throughput and resource utilization:
 Throughput: the number of transactions executed in a given amount of time.
 I/O activity and CPU activity can overlap: one transaction can perform I/O while another uses CPU cycles for execution.
 Reduced waiting time:
 In serial scheduling, a short transaction may have to wait for a preceding long transaction to complete.

 Waiting time: the time for a transaction to be completed after it has been
submitted for execution.
SCHEDULE
 A schedule is the execution sequence of transactions: the time-ordered sequence of the operations of two or more transactions.
 It represents the chronological order in which instructions are executed in the system.
 For example, read(A) must appear before write(A) in any transaction that performs both.
TYPES OF SCHEDULES
 Serial
 Parallel (Concurrent)
SERIAL SCHEDULE
 After the commit of the first transaction, the second transaction begins.
 There is NO inconsistency in serial schedules.
 It has poor throughput.
 It has poor resource utilization.
 E.g. T1 has W(A), W(B) operations and T2 has R(A), R(B).

EXAMPLE SCHEDULE 1
 Let T1 transfer $50 from A ($500) to B($200), and T2 transfer 10% of the balance from A
to B.

PARALLEL/CONCURRENT SCHEDULE
IRRECOVERABLE SCHEDULES
 Irrecoverable schedule — a schedule in which a transaction Tj reads a data item previously written by a transaction Ti, and the commit operation of Tj appears before the commit operation of Ti.
The following schedule is not recoverable if T9 commits immediately after the read.

 If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence, the database must ensure that schedules are recoverable.
RECOVERABLE SCHEDULES
 Recoverable schedule — if a transaction Tj reads a data item previously written by a transaction Ti, then the commit operation of Ti must appear before the commit operation of Tj.
The following schedule is recoverable if T9 commits after T8 commits.

