
Crash Recovery

A DBMS is a highly complex system with hundreds of transactions being executed every second. The durability and robustness of a DBMS depend on its complex architecture and its underlying hardware and system software. If the system fails or crashes in the middle of transactions, it is expected to follow some sort of algorithm or technique to recover the lost data.

Failure Classification

To see where the problem has occurred, we generalize a failure into various
categories, as follows −

Transaction failure

A transaction has to abort when it fails to execute or when it reaches a point from where it cannot proceed further. This is called transaction failure, where only a few transactions or processes are affected.
Reasons for a transaction failure could be −
 Logical errors − The transaction cannot complete because it has some code error or some internal error condition.
 System errors − The database system itself terminates an active transaction because the DBMS is unable to execute it, or because it has to stop due to some system condition. For example, in case of deadlock or resource unavailability, the system aborts an active transaction.

System Crash

There are problems, external to the system, that may cause the system to stop abruptly and crash. For example, an interruption in the power supply may cause the failure of underlying hardware, or a software failure may occur. Examples also include operating system errors.

Disk Failure

In the early days of technology evolution, it was a common problem for hard-disk drives or storage drives to fail frequently.
Disk failures include the formation of bad sectors, unreachability of the disk, a disk head crash, or any other failure that destroys all or part of the disk storage.

Storage Structure

We have already described the storage system. In brief, the storage structure can be divided into two categories −
 Volatile storage − As the name suggests, volatile storage cannot survive system crashes. Volatile storage devices are placed very close to the CPU; normally they are embedded in the chipset itself. For example, main memory and cache memory are volatile storage. They are fast but can store only a small amount of information.
 Non-volatile storage − These memories are made to survive system crashes. They are huge in data storage capacity, but slower in access. Examples include hard disks, magnetic tapes, flash memory, and non-volatile (battery backed up) RAM.

Recovery and Atomicity

When a system crashes, it may have several transactions being executed and various files opened for them to modify data items. Transactions are made of various operations, which are atomic in nature. But according to the ACID properties of a DBMS, atomicity of a transaction as a whole must be maintained, that is, either all the operations are executed or none.
When a DBMS recovers from a crash, it should maintain the following −
 It should check the states of all the transactions that were being executed.
 A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction in this case.
 It should check whether the transaction can be completed now or needs to be rolled back.
 No transaction should be allowed to leave the DBMS in an inconsistent state.
There are two types of techniques that can help a DBMS recover while maintaining the atomicity of a transaction −
 Maintaining a log of each transaction, and writing the log records onto some stable storage before actually modifying the database.
 Shadow paging, where the changes are made on volatile memory, and later the actual database is updated.
Log-Based Recovery

o The log is a sequence of records. The log of each transaction is maintained in some stable storage so that if any failure occurs, the database can be recovered from it.
o If any operation is performed on the database, it is recorded in the log.
o The log record for a change must be stored before the change itself is applied to the database.

Let's assume there is a transaction to modify the City of a student. The following log records are written for this transaction.

o When the transaction is initiated, it writes a 'start' log record:
<Tn, Start>
o When the transaction modifies the City from 'Noida' to 'Bangalore', another log record is written to the file:
<Tn, City, 'Noida', 'Bangalore'>
o When the transaction is finished, it writes another log record to indicate the end of the transaction:
<Tn, Commit>
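
To make the ordering concrete, here is a minimal Python sketch of this write-ahead discipline. The record format follows the toy examples above; the LogManager name, the file path, and the use of fsync to stand in for forcing records to stable storage are illustrative assumptions, not any particular DBMS's API.

import os

class LogManager:
    """Appends log records to a file standing in for stable storage."""

    def __init__(self, path="dbms.log"):
        self.log = open(path, "a")

    def write(self, record):
        # The record is forced out *before* the caller is allowed to
        # touch the database itself (the write-ahead rule).
        self.log.write(record + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())

db = {"City": "'Noida'"}   # toy in-memory "database"
log = LogManager()

log.write("<T1, Start>")
log.write("<T1, City, 'Noida', 'Bangalore'>")  # old value, new value
db["City"] = "'Bangalore'"  # the modification happens only after logging
log.write("<T1, Commit>")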

There are two approaches to modify the database:

1. Deferred database modification:

o In the deferred modification technique, the transaction does not modify the database until it has committed.
o In this method, all the log records are created and stored in stable storage, and the database is updated only when the transaction commits.

2. Immediate database modification:

o In the immediate modification technique, database modification occurs while the transaction is still active.
o In this technique, the database is modified immediately after every operation, each modification following the write of its log record.

Recovery using Log records

When the system crashes, it consults the log to find which transactions need to be undone and which need to be redone.

1. If the log contains both the record <Ti, Start> and the record <Ti, Commit>, then transaction Ti needs to be redone.
2. If the log contains the record <Ti, Start> but contains neither <Ti, Commit> nor <Ti, Abort>, then transaction Ti needs to be undone.
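
A sketch of this classification in Python, assuming the log is simply a list of record strings in the toy format used above (an illustrative format, not a production one):

def classify(log_records):
    """Scan the log and decide which transactions to redo or undo."""
    started, committed, aborted = set(), set(), set()
    for rec in log_records:
        fields = rec.strip("<>").split(", ")
        tid, action = fields[0], fields[-1]
        if action == "Start":
            started.add(tid)
        elif action == "Commit":
            committed.add(tid)
        elif action == "Abort":
            aborted.add(tid)
    redo = committed                       # Start and Commit both seen
    undo = started - committed - aborted   # Start seen, no Commit/Abort
    return redo, undo

log = ["<T1, Start>", "<T1, City, 'Noida', 'Bangalore'>",
       "<T1, Commit>", "<T2, Start>"]
print(classify(log))   # ({'T1'}, {'T2'})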

Checkpoint

o The checkpoint is a mechanism by which all the previous log records are removed from the system and stored permanently on the storage disk.
o The checkpoint is like a bookmark. During the execution of transactions, such checkpoints are marked, and as the transactions execute, the corresponding log records are created.
o When a checkpoint is reached, all the updates recorded in the log so far are applied to the database, and the log file up to that point is removed. The log file is then updated with the steps of new transactions until the next checkpoint, and so on.
o The checkpoint is used to declare a point before which the DBMS was in a consistent state and all transactions were committed.
Recovery using Checkpoint

A recovery system recovers the database from failure in the following manner. The example refers to a log containing transactions T1 to T4, with a checkpoint taken during T1's execution and the crash occurring during T4:

o The recovery system reads the log files from the end to the start, i.e., from T4 back to T1.
o The recovery system maintains two lists: a redo-list and an undo-list.
o A transaction is put into the redo-list if the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>. All the transactions in the redo-list are redone from their log records.
o For example: in the log file, transactions T2 and T3 will have both <Tn, Start> and <Tn, Commit>. Transaction T1 will have only <Tn, Commit> after the checkpoint, because it committed after the checkpoint was crossed. Hence T1, T2 and T3 are put into the redo-list.
o A transaction is put into the undo-list if the recovery system sees a log with <Tn, Start> but no commit or abort record. All the transactions in the undo-list are undone, and their log records are removed.
o For example: transaction T4 will have only <Tn, Start>. So T4 is put into the undo-list, since this transaction is not yet complete and failed in the middle.

Recovery with Concurrent Transaction

o Whenever more than one transaction is being executed, their log records are interleaved. During recovery, it would become difficult for the recovery system to backtrack all the logs and then start recovering.
o To ease this situation, the 'checkpoint' concept is used by most DBMSs.

As we have discussed checkpoints in the Transaction Processing Concepts part of this tutorial, you can go through those concepts again to make things clearer.

Recovery With Concurrent Transactions


Concurrency control means that multiple transactions can be executed at the same time, so their log records are interleaved. Because transaction results may depend on one another, the order of execution of those transactions must be maintained.
During recovery, it would be very difficult for the recovery system to backtrack all the logs and then start recovering.
Recovery with concurrent transactions can be done in the following four ways.
1. Interaction with concurrency control
2. Transaction rollback
3. Checkpoints
4. Restart recovery
Interaction with concurrency control :
In this scheme, the recovery scheme depends greatly on the concurrency control scheme that is used. To roll back a failed transaction, we must undo the updates performed by that transaction.

Transaction rollback :
 In this scheme, we roll back a failed transaction by using the log.
 The system scans the log backward from the failed transaction; for every log record of that transaction found in the scan, the system restores the data item to its old value.
Checkpoints :
 A checkpoint is a process of saving a snapshot of the application's state so that the system can restart from that point in case of failure.
 A checkpoint is a point of time at which a record is written onto the database from the buffers.
 Checkpoints shorten the recovery process.
 When a checkpoint is reached, the transaction's updates are applied to the database, and the log file up to that point is removed. The log file is then updated with the steps of new transactions until the next checkpoint, and so on.
 The checkpoint is used to declare the point before which the DBMS was in a consistent state and all the transactions were committed. To ease recovery, the checkpoint concept is used by most DBMSs.
 In this scheme, we use checkpoints to reduce the number of log records that the system must scan when it recovers from a crash.
 In a concurrent transaction processing system, we require that the checkpoint log record be of the form <checkpoint L>, where L is a list of the transactions active at the time of the checkpoint.
 A fuzzy checkpoint is a checkpoint where transactions are allowed to perform updates even while buffer blocks are being written out.
Restart recovery :
 When the system recovers from a crash, it constructs two lists.
 The undo-list consists of transactions to be undone, and the redo-list consists of transactions to be redone.
 The system constructs the two lists as follows: initially, both are empty. The system scans the log backward, examining each record, until it finds the first <checkpoint L> record.
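
A compact Python sketch of this restart-recovery scan, reusing the toy log format from earlier and assuming a checkpoint record of the form <checkpoint T1, T2> (all parsing details here are illustrative assumptions):

def restart_recovery(log_records):
    """Build undo/redo lists by scanning the log backward until the
    first <checkpoint L> record is found."""
    undo, redo = set(), set()
    for rec in reversed(log_records):
        if rec.startswith("<checkpoint"):
            # L: transactions active at checkpoint time; any of them
            # not later committed must still be undone.
            active = rec.strip("<>").split(" ", 1)[1].split(", ")
            undo |= set(active) - redo
            break
        fields = rec.strip("<>").split(", ")
        tid, action = fields[0], fields[-1]
        if action == "Commit":
            redo.add(tid)
        elif action == "Start" and tid not in redo:
            undo.add(tid)
    return undo, redo

log = ["<T1, Start>", "<checkpoint T1>", "<T1, Commit>", "<T2, Start>"]
print(restart_recovery(log))   # ({'T2'}, {'T1'})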
Introduction of Shadow Paging
Shadow paging is a recovery technique that is used to recover a database. In this recovery technique, the database is considered to be made up of fixed-size logical units of storage, referred to as pages. Pages are mapped into physical blocks of storage with the help of a page table, which has one entry for each logical page of the database. This method uses two page tables, named the current page table and the shadow page table.
The entries in the current page table point to the most recent database pages on disk. The shadow page table is created when the transaction starts, by copying the current page table. After this, the shadow page table is saved on disk, and the current page table is used for the transaction. Entries in the current page table may be changed during execution, but the shadow page table is never changed. After the transaction completes, both tables become identical.
This technique is also known as out-of-place updating.

To understand the concept, consider a transaction in which two write operations are performed, on page 3 and page 5. Before the start of the write operation on page 3, the current page table points to the old page 3. When the write operation starts, the following steps are performed:
1. First, the system searches for an available free block among the disk blocks.
2. After finding a free block, it copies page 3 to the free block, which is represented as Page 3 (New).
3. Now the current page table points to Page 3 (New) on disk, but the shadow page table still points to the old page 3, because it is not modified.
4. The changes are now propagated to Page 3 (New), which is pointed to by the current page table.
COMMIT Operation :
To commit a transaction, the following steps are performed :
1. All the modifications done by the transaction that are present in buffers are transferred to the physical database.
2. The current page table is output to disk.
3. The disk address of the current page table is output to the fixed location in stable storage that contains the address of the shadow page table. This operation overwrites the address of the old shadow page table. With this, the current page table becomes the shadow page table, and the transaction is committed.
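
The atomicity of the scheme rests on step 3 being a single pointer overwrite. Below is a minimal Python sketch of the two page tables; the page contents, block numbers, and function names are illustrative assumptions:

disk = {0: "old page 3", 1: "old page 5", 2: None, 3: None}  # block -> data
free_blocks = [2, 3]

shadow_table = {3: 0, 5: 1}         # logical page -> disk block
current_table = dict(shadow_table)  # copied when the transaction starts

def write_page(page, data):
    """Out-of-place update: write to a fresh block and repoint the
    current page table; the shadow table keeps the old mapping."""
    new_block = free_blocks.pop()
    disk[new_block] = data
    current_table[page] = new_block

write_page(3, "new page 3")
write_page(5, "new page 5")

# Commit: atomically install the current table as the page table.
# In a real system this is one disk write of the table's address.
shadow_table = current_table

# A crash before the line above simply discards current_table; the
# shadow table still maps every page to its old, unmodified block.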
Failure :
If the system crashes during the execution of a transaction, but before the commit operation, it is sufficient to free the modified database pages and discard the current page table. The state of the database before the execution of the transaction is recovered by reinstalling the shadow page table.
If the system crashes after the commit operation has completed, then the crash does not affect the propagation of the changes made by the transaction. These changes are preserved, and there is no need to perform a redo operation.
Advantages :
 This method requires fewer disk accesses to perform an operation.
 In this method, recovery from a crash is inexpensive and quite fast.
 There is no need for operations like undo and redo.
Disadvantages :
 Because updates change the location of pages on disk, it is quite difficult to keep related pages of the database close together on disk.
 During the commit operation, the blocks pointed to by the old shadow page table have to be returned to the collection of free blocks; otherwise they become inaccessible.
 The commit of a single transaction requires writing multiple blocks, which decreases execution speed.
 It is difficult to extend this technique to allow multiple transactions to execute concurrently.
 This is the method where all the transactions are executed on a shadow copy of the database in primary memory. Once all the transactions have completely executed, the changes are applied to the database. Hence, if there is any failure in the middle of a transaction, it will not be reflected in the database; the database is updated only after all the transactions are complete.
 A database pointer always points to the consistent copy of the database, and a copy of the database is used by transactions to make their updates. Once all the transactions are complete, the DB pointer is modified to point to the new copy of the DB, and the old copy is deleted. If there is any failure during the transactions, the pointer still points to the old copy of the database, and the shadow copy is deleted. If the transactions are complete, then the pointer is changed to point to the shadow DB, and the old DB is deleted.
 In this way, the DB pointer always points to a consistent and stable database. This mechanism assumes that there will not be any disk failure, and that only one transaction executes at a time, so that the shadow DB can hold the data for that transaction. It is useful if the DB is comparatively small, because the shadow DB consumes the same storage space as the actual DB; hence it is not efficient for huge DBs. In addition, it cannot handle concurrent execution of transactions; it is suitable for one transaction at a time.

Failure with Loss of Nonvolatile Storage
Until now, we have considered only the case where a failure results in the loss of
information residing in volatile storage while the content of the nonvolatile storage
remains intact. Although failures in which the content of nonvolatile storage is lost
are rare, we nevertheless need to be prepared to deal with this type of failure. In this
section, we discuss only disk storage. Our discussions apply as well to other
nonvolatile storage types.

The basic scheme is to dump the entire content of the database to stable storage
periodically—say, once per day. For example, we may dump the database to one or
more magnetic tapes. If a failure occurs that results in the loss of physical database
blocks, the system uses the most recent dump in restoring the database to a
previous consistent state. Once this restoration has been accomplished, the system
uses the log to bring the database system to the most recent consistent state.

More precisely, no transaction may be active during the dump procedure, and a
procedure similar to checkpointing must take place:

1. Output all log records currently residing in main memory onto stable storage.

2. Output all buffer blocks onto the disk.

3. Copy the contents of the database to stable storage.

4. Output a log record <dump> onto the stable storage.

Steps 1, 2, and 4 correspond to the three steps used for checkpoints in Section 17.4.3.
To recover from the loss of nonvolatile storage, the system restores the database to disk by
using the most recent dump. Then, it consults the log and redoes all the transactions that have
committed since the most recent dump occurred. Notice that no undo operations need to be
executed.
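
A small Python sketch of this restore-then-redo sequence, reusing the toy log format from earlier. The dump is modeled as a plain dictionary snapshot, and only the portion of the log written after the <dump> record is assumed to be passed in; all names are illustrative:

def recover_from_dump(dump, log_records):
    """Restore the database from the most recent dump, then redo every
    transaction that committed after the dump. No undo pass is needed."""
    db = dict(dump)   # reload the archival dump
    committed = {rec.strip("<>").split(", ")[0]
                 for rec in log_records if rec.endswith("Commit>")}
    for rec in log_records:
        fields = rec.strip("<>").split(", ")
        if len(fields) == 4 and fields[0] in committed:
            tid, item, old, new = fields
            db[item] = new   # redo: reapply the new value
    return db

dump = {"City": "'Noida'"}
log = ["<T1, Start>", "<T1, City, 'Noida', 'Bangalore'>", "<T1, Commit>"]
print(recover_from_dump(dump, log))   # {'City': "'Bangalore'"}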
A dump of the database contents is also referred to as an archival dump, since we can archive
the dumps and use them later to examine old states of the database.
Dumps of a database and checkpointing of buffers are similar.
The simple dump procedure described here is costly for the following two reasons.
First, the entire database must be copied to stable storage, resulting in considerable data
transfer. Second, since transaction processing is halted during the dump procedure, CPU
cycles are wasted. Fuzzy dump schemes have been developed, which allow transactions to be
active while the dump is in progress.

Database Backup and Recovery from Catastrophic Failures

So far, all the techniques we have discussed apply to noncatastrophic failures. A key
assumption has been that the system log is maintained on the disk and is not lost as a result of
the failure. Similarly, the shadow directory must be stored on disk to allow recovery when
shadow paging is used. The recovery techniques we have discussed use the entries in the system log or the shadow directory to recover from failure by bringing the database back to a consistent state.

The recovery manager of a DBMS must also be equipped to handle more catastrophic failures
such as disk crashes. The main technique used to handle such crashes is a database backup,
in which the whole database and the log are periodically copied onto a cheap storage medium
such as magnetic tapes or other large capacity offline storage devices. In case of a
catastrophic system failure, the latest backup copy can be reloaded from the tape to the disk,
and the system can be restarted.

Data from critical applications such as banking, insurance, stock market, and other databases
is periodically backed up in its entirety and moved to physically separate safe locations.
Subterranean storage vaults have been used to protect such data from flood, storm,
earthquake, or fire damage. Events like the 9/11 terrorist attack in New York (in 2001) and
the Katrina hurricane disaster in New Orleans (in 2005) have created a greater awareness
of disaster recovery of business-critical databases.

To avoid losing all the effects of transactions that have been executed since the last backup, it
is customary to back up the system log at more frequent intervals than full database backup
by periodically copying it to magnetic tape. The system log is usually substantially smaller
than the database itself and hence can be backed up more frequently. Therefore, users do not
lose all transactions they have performed since the last database backup. All committed
transactions recorded in the portion of the system log that has been backed up to tape can
have their effect on the database redone. A new log is started after each database backup.
Hence, to recover from disk failure, the database is first recreated on disk from its latest
backup copy on tape. Following that, the effects of all the committed transactions whose
operations have been recorded in the backed-up copies of the system log are reconstructed.

Authentication
User authentication is meant to make sure that the person accessing the database is who they claim to be. Authentication can be done at the operating system level or at the database level itself. Many authentication systems, such as retina scanners or biometrics, are used to make sure unauthorized people cannot access the database.
Authorization
Authorization is a privilege provided by the Database Administrator. Users of the database can only view the contents they are authorized to view. The rest of the database is out of bounds to them.
The different permissions for authorizations available are:

 Primary Permission - This is granted to users publicly and directly.


 Secondary Permission - This is granted to groups and automatically
awarded to a user if he is a member of the group.
 Public Permission - This is publicly granted to all the users.
 Context sensitive permission - This is related to sensitive content and only granted to select users.
The categories of authorization that can be given to users are:

 System Administrator - This is the highest administrative authorization for a user. Users with this authorization can also execute some database administrator commands, such as restoring or upgrading a database.
 System Control - This is the highest control authorization for a user. This
allows maintenance operations on the database but not direct access to data.
 System Maintenance - This is the lower level of system control authority. It
also allows users to maintain the database but within a database manager
instance.
 System Monitor - Using this authority, the user can monitor the database
and take snapshots of it.
Database Integrity
Data integrity in a database means the correctness, consistency and completeness of data. Data integrity is enforced using the following three integrity constraints:

 Entity Integrity - This is related to the concept of primary keys. All tables
should have their own primary keys which should uniquely identify a row and
not be NULL.
 Referential Integrity - This is related to the concept of foreign keys. A foreign key is a key of a relation that is referred to in another relation.
 Domain Integrity - This means that there should be a defined domain for all
the columns in a database.

Relational databases have catered to the need for backend databases in the application development arena for quite a long time, but there are other types of databases with the potential to do the same. We should not overlook their merits while dealing with an RDBMS, as they can serve similar purposes equally well, and possibly in a better way. These database models are related to the relational concepts in doing what they do best. This article provides a perspective on the philosophies of database systems that are designed to be different from the relational database model, yet have a similar kind of vibe in their approach.

Introduction

It is important that database application developers have a fairly good idea about general database administration as well as being experts in SQL. In the process of attaining that knowledge, however, developers often have little inclination towards learning the inner workings of the various database management systems. The reason for this keep-aside attitude towards understanding the inner workings of an RDBMS is that such knowledge is not needed for hosting or maintaining database servers, nor for developing applications that use the system. Yet knowing how the system is organized helps in creating better solutions, which is equally important. Sometimes the concepts behind the inner working principles can provide the essential ideas necessary to modify and extend its features in better ways. It is also important to have a good idea of the basic principles behind popular database systems like ORDBMS and OODBMS as compared to RDBMS.

Types of Database Systems

There are quite a few popular variants or types of database system such
as relational (RDBMS), object-relational (ORDBMS) and object-oriented
(OODBMS). The differences lie in the architecture or design. They have
many common as well as distinct features. Here, we provide a brief
overview on each of them.

The Idea of Relational Database (RDBMS)

The structure of a relational database management system is based upon the relational data model introduced by Ted Codd of IBM Research in 1970. This model uses the concept of a mathematical relation. Its building blocks are tables and values. The functionality is based upon set theory and first-order predicate logic. Once this concept was out, it quickly garnered interest among a number of vendors, who built several commercial implementations based upon this model. Today, there are many popular RDBMSs such as IBM DB2, MySQL, Microsoft SQL Server, and Microsoft Access.

The idea of a relational database and its associated features, which form a package called an RDBMS, flowered from the basic concept of a collection of relations. Each relation is represented as a table, and the rows in the table represent a collection of related data values. The table and the column names give a clue to the meaning of the values in each row. For example, each row in the EMPLOYEE table represents a specific real-world entity, such as the record of an employee. The column names, such as emp_id, name, birth_date and join_date, determine how to interpret the data values in each record according to the column they are in. Of course, all values in a column must be of the same data type, otherwise they disturb consistency.

Most commercial relational database management systems implement Structured Query Language (SQL) as the de-facto language to communicate with the database. Some say SQL has no relation with the relational model except as a communication means for availing the storage and retrieval services of the database. However, we must also understand that, just as words cannot be separated from their sound, SQL cannot be separated from the RDBMS. SQL is very much a part of it.

There are many merits and demerits of this system. If we sum up the merits, it is its robustness that has stood the test of time. However, down the years its upkeep has significantly increased both its size and complexity. The addition and modification of features, as well as the tools built to support changing needs, often faced the challenge of coercing different technologies together. This is a clear recipe for unavoidable complexity.

Even with many pros and cons to deal with, relational database management systems (RDBMS) are here to stay. This is true even in the day and age of big data, NoSQL and what not. Professionals have devoted a lot of their expertise to improving it and assimilating it with many new technological systems. When the wave of object-oriented technology came to the forefront of application development, RDBMS seemed too incompatible. But miscible components were conjugated with techniques like Hibernate, JTA and the like. These techniques became so popular and effective that an RDBMS no longer seems like a system of yore and can be quite a reliable backend companion in many projects. The fact is, there are many old systems still in use today and many new systems thriving on the relational model.

The Idea of Object-Relational Database System (ORDBMS)

The object-relational database systems are an attempt to merge two different kinds of systems. An ORDBMS is an object database enhancement of the relational model, a hybrid in design. Perhaps the most visible aspect we might observe is the addition of object database features in the SQL revisions for this hybrid model. One of the pitfalls of the relational model was in describing complex objects. The vibe of object-oriented mechanisms was brought into play with the introduction of type constructors: a row type that corresponds to the tuple constructor, array types for specifying collections, sets and lists, mechanisms for specifying object identity, encapsulation, inheritance, and more.

Note that the core technology used in an ORDBMS is based upon the relational model. The commercial implementations simply added a layer of some object-oriented principles on top of the relational database management system. The simplest example is Microsoft SQL Server. Since this system is based on the relational model, there is one more problem added to it: translating object-oriented concepts to relational mechanisms. However, this problem is mitigated by an object-relational mapping layer that handles the communication between the object-oriented application and the underlying relational database.

Understand that relational and object-oriented principles do not go well together, because they work on different principles. Therefore, it may seem that this model somehow tries to coerce them into a truce for the sake of the developer's convenience. The real reason is to permit storage and retrieval of objects in an RDBMS way, by providing extensions to the query language that work on object-oriented principles.

Some common implementations include Oracle Database, PostgreSQL, and Microsoft SQL Server.

The Idea of Object-Oriented Database System (OODBMS)

The object-oriented database systems are of a different genre. OODBMSs attempt to imbibe object-oriented principles into database functionality right from the core implementation. If an RDBMS carries the rich tradition of the query language and its extensions, and an ORDBMS champions the cause of adding rich data types to the relational model, then an OODBMS provides seamless integration with OOP languages. The query mechanism of an OODBMS focuses on object manipulation using specialized object-oriented query languages such as ODL (Object Definition Language) and OQL (Object Query Language).

An object consists of two components: state, represented by values, and behavior, represented by operations. In a typical object-oriented programming language (OOPL), transient objects exist in memory. An OODBMS extends this idea: it stores objects permanently, beyond program termination, so that they can be retrieved later. The most unusual feature of the OODBMS is that it provides the mechanism to persist complex objects, with both structure and behavior, which can be manipulated via an object-oriented programming interface. This mechanism supports the modelling of real-world objects as accurately as possible, without forcing relationships among entities as we see in the relational model.

Some common implementations include Versant Object Database, Objectivity/DB, ObjectDB, and ObjectStore.

Conclusion

Database technologies have been taken to new heights due to advances in technology, and RDBMS has stayed relevant throughout this evolution. ORDBMS and OODBMS draw many inspirations from RDBMS. ORDBMSs provide some layer of data encapsulation and behavior. Database vendors often build extensions to the statement-response interfaces by extending SQL to contain object descriptors and spatial query mechanisms.

An object-oriented database (OODBMS) or object database management system (ODBMS) is a database that is based on object-oriented programming (OOP). The data is represented and stored in the form of objects. OODBMSs are also called object databases or object-oriented database management systems.

A database is a data store. A software system that is used to manage databases is called a database management system (DBMS). There are many types of database management systems, such as hierarchical, network, relational, object-oriented, graph, and document.

In this article, we will discuss what object-oriented databases are and why they are
useful.

Object-Oriented Database

Object database management systems (ODBMSs) are based on objects in object-oriented programming (OOP). In OOP, an entity is represented as an object and objects are stored in memory. Objects have members such as fields, properties, and methods. Objects also have a life cycle that includes the creation, use, and deletion of an object. OOP has key characteristics: encapsulation, inheritance, and polymorphism. Today, there are many popular OOP languages such as C++, Java, C#, Ruby, Python, JavaScript, and Perl.

The idea of object databases originated in 1985, and support has become common for various OOP languages, such as C++, Java, C#, Smalltalk, and LISP. Common examples are Smalltalk, used in GemStone; LISP, used in Gbase; and COP, used in Vbase.

Object databases are commonly used in applications that require high performance, heavy calculations, and fast results. Some of the common applications that use object databases are real-time systems, architectural and engineering 3D modeling, telecommunications, scientific products, molecular science, and astronomy.

Advantages of Object Databases

ODBMSs provide persistent storage for objects. Imagine creating objects in your program, saving them as-is in a database, and reading them back from the database.

In a typical relational database, the program data is stored in rows and columns. To store and read that data and convert it into program objects in memory requires reading the data, loading it into objects, and storing those in memory. Imagine instead creating a class in your program, saving its objects as-is in a database, reading them back and starting to use them again.
Object databases bring permanent persistence to objects. Objects can be stored in persistent storage forever.

In a typical RDBMS, there is a layer of object-relational mapping that maps database schemas to objects in code. Reading and mapping object database data to objects is direct, without any mapping API or OR tool; hence data access is faster and performance better.

Some object databases can be used with multiple languages. For example, the GemStone database supports the C++, Smalltalk and Java programming languages.
What Is Object Relational Database?

An object-relational database (ORD) is a database management system (DBMS) that is composed of both a relational database (RDBMS) and an object-oriented database (OODBMS). An object-relational database acts as an interface between relational and object-oriented databases because it contains aspects and characteristics of both models.

An object-relational database (ORD) serves two main purposes:

 It bridges the divide between relational databases and the object-oriented modeling techniques that are usually used in programming languages like C#, Java and C++.
 It bridges the gap between conceptual data modeling techniques for relational and object-oriented databases, like entity-relationship diagrams (ERD) and object-relational mapping (ORM).

What Is Object Oriented Database?

An object-oriented database is organized around objects rather than actions, and around data rather than logic. An object database is therefore a database management system in which information is represented in the form of objects, as used in object-oriented programming.

Usually, when an OODBMS is integrated with an object programming language, there is much greater consistency between the database and the programming language, because both use the same model of data representation. Compared to a relational database management system, an object-oriented database stores complex data and relationships between data directly, without mapping to relational rows and columns, whereas a relational database stores information in tables with rows and columns.

Key Differences

Features Of Object Oriented Database (OODBMS)

1. In an object-oriented database, relationships are represented by references via the object identifier (OID).
2. Object-oriented systems employ indexing techniques to locate the disk pages that store an object. Therefore, they are able to provide persistent storage for complex-structured objects.
3. Handles larger and more complex data than an RDBMS.
4. The constraints supported by object-oriented systems vary from system to system.
5. In object-oriented systems, the data management language is typically incorporated into a programming language such as C++.
6. Stored data entries are described as objects.
7. An object-oriented database can handle different types of data.
8. In an object-oriented database, the data is stored in the form of objects.

Features Of Object Relational Database (ORDBMS)

1. In an object-relational database, connections between two relations are represented by foreign key attributes in one relation that reference the primary key of another relation.
2. Relational database systems do not specify any data storage structure; each base relation is implemented as a separate file. Therefore, they are unable to provide persistent storage for complex-structured objects.
3. Handles comparatively simpler data.
4. The relational model has keys, entity integrity and referential integrity.
5. In relational database systems there are data manipulation languages such as SQL, QUEL and QBE, which are based on relational calculus.
6. Stored data entries are described as tables.
7. A relational database can handle a single type of data.
8. In a relational database, data is stored in the form of tables, which contain rows and columns.

This chapter introduces the concept of a DDBMS. In a distributed database, there are a number of databases that may be geographically distributed all over the world. A distributed DBMS manages the distributed database in a manner such that it appears as one single database to users. In the later part of the chapter, we go on to study the factors that lead to distributed databases, and their advantages and disadvantages.
A distributed database is a collection of multiple interconnected databases, which are spread physically across various locations and communicate via a computer network.
Features

 Databases in the collection are logically interrelated with each other. Often
they represent a single logical database.
 Data is physically stored across multiple sites. Data in each site can be
managed by a DBMS independent of the other sites.
 The processors in the sites are connected via a network. They do not have
any multiprocessor configuration.
 A distributed database is not a loosely connected file system.
 A distributed database incorporates transaction processing, but it is not
synonymous with a transaction processing system.

Distributed Database Management System

A distributed database management system (DDBMS) is a centralized software system that manages a distributed database in such a manner that it appears to users as if it were all stored in a single location.

Features

 It is used to create, retrieve, update and delete distributed databases.


 It synchronizes the database periodically and provides access mechanisms by virtue of which the distribution becomes transparent to the users.
 It ensures that the data modified at any site is universally updated.
 It is used in application areas where large volumes of data are processed and
accessed by numerous users simultaneously.
 It is designed for heterogeneous database platforms.
 It maintains confidentiality and data integrity of the databases.

Factors Encouraging DDBMS

The following factors encourage moving over to DDBMS −


 Distributed Nature of Organizational Units − Most organizations in the
current times are subdivided into multiple units that are physically distributed
over the globe. Each unit requires its own set of local data. Thus, the overall
database of the organization becomes distributed.
 Need for Sharing of Data − The multiple organizational units often need to
communicate with each other and share their data and resources. This
demands common databases or replicated databases that should be used in
a synchronized manner.
 Support for Both OLTP and OLAP − Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) work upon diversified systems which may have common data. Distributed database systems aid both kinds of processing by providing synchronized data.
 Database Recovery − One of the common techniques used in DDBMS is replication of data across different sites. Replication of data automatically helps in data recovery if the database at any site is damaged. Users can access data from other sites while the damaged site is being reconstructed. Thus, database failure may become almost inconspicuous to users.
 Support for Multiple Application Software − Most organizations use a
variety of application software each with its specific database support.
DDBMS provides a uniform functionality for using the same data among
different platforms.

Advantages of Distributed Databases

Following are the advantages of distributed databases over centralized databases.


Modular Development − If the system needs to be expanded to new locations or
new units, in centralized database systems, the action requires substantial efforts
and disruption in the existing functioning. However, in distributed databases, the
work simply requires adding new computers and local data to the new site and
finally connecting them to the distributed system, with no interruption in current
functions.
More Reliable − In case of database failures, the total system of centralized databases comes to a halt. However, in distributed systems, when a component fails, the functioning of the system continues, possibly at reduced performance. Hence DDBMS is more reliable.
Better Response − If data is distributed in an efficient manner, then user requests
can be met from local data itself, thus providing faster response. On the other hand,
in centralized systems, all queries have to pass through the central computer for
processing, which increases the response time.
Lower Communication Cost − In distributed database systems, if data is located
locally where it is mostly used, then the communication costs for data manipulation
can be minimized. This is not feasible in centralized systems.

Adversities of Distributed Databases

Following are some of the adversities associated with distributed databases.


 Need for complex and expensive software − DDBMS demands complex
and often expensive software to provide data transparency and co-ordination
across the several sites.
 Processing overhead − Even simple operations may require a large number
of communications and additional calculations to provide uniformity in data
across the sites.
 Data integrity − The need for updating data in multiple sites poses problems of data integrity.
 Overheads for improper data distribution − Responsiveness of queries is
largely dependent upon proper data distribution. Improper data distribution
often leads to very slow response to user requests.
In this part of the tutorial, we will study the different aspects that aid in designing distributed database environments. This chapter starts with the types of distributed databases. Distributed databases can be classified into homogeneous and heterogeneous databases, which have further divisions. The next section of this chapter discusses the distributed architectures, namely client-server, peer-to-peer and multi-DBMS. Finally, the different design alternatives like replication and fragmentation are introduced.

Types of Distributed Databases

Distributed databases can be broadly classified into homogeneous and heterogeneous distributed database environments, each with further sub-divisions.

Homogeneous Distributed Databases

In a homogeneous distributed database, all the sites use identical DBMS and
operating systems. Its properties are −
 The sites use very similar software.
 The sites use identical DBMS or DBMS from the same vendor.
 Each site is aware of all other sites and cooperates with other sites to process
user requests.
 The database is accessed through a single interface as if it is a single
database.

Types of Homogeneous Distributed Database

There are two types of homogeneous distributed database −


 Autonomous − Each database is independent and functions on its own. The databases are integrated by a controlling application and use message passing to share data updates.
 Non-autonomous − Data is distributed across the homogeneous nodes and
a central or master DBMS co-ordinates data updates across the sites.

Heterogeneous Distributed Databases

In a heterogeneous distributed database, different sites have different operating


systems, DBMS products and data models. Its properties are −
 Different sites use dissimilar schemas and software.
 The system may be composed of a variety of DBMSs like relational, network,
hierarchical or object oriented.
 Query processing is complex due to dissimilar schemas.
 Transaction processing is complex due to dissimilar software.
 A site may not be aware of other sites and so there is limited co-operation in
processing user requests.

Types of Heterogeneous Distributed Databases

 Federated − The heterogeneous database systems are independent in


nature and integrated together so that they function as a single database
system.
 Un-federated − The database systems employ a central coordinating module
through which the databases are accessed.

Distributed DBMS Architectures

DDBMS architectures are generally developed depending on three parameters −


 Distribution − It states the physical distribution of data across the different
sites.
 Autonomy − It indicates the distribution of control of the database system
and the degree to which each constituent DBMS can operate independently.
 Heterogeneity − It refers to the uniformity or dissimilarity of the data models,
system components and databases.

Design Alternatives

The distribution design alternatives for the tables in a DDBMS are as follows −

 Non-replicated and non-fragmented


 Fully replicated
 Partially replicated
 Fragmented
 Mixed

Non-replicated & Non-fragmented

In this design alternative, different tables are placed at different sites. Data is placed in close proximity to the site where it is used most. This alternative is most suitable for database systems where the percentage of queries needing to join information in tables placed at different sites is low. If an appropriate distribution strategy is adopted, then this design alternative helps to reduce the communication cost during data processing.

Fully Replicated

In this design alternative, one copy of all the database tables is stored at each site. Since each site has its own copy of the entire database, queries are very fast, requiring negligible communication cost. On the contrary, the massive redundancy in data incurs huge cost during update operations. Hence, this is suitable for systems where a large number of queries must be handled while the number of database updates is low.

Partially Replicated

Copies of tables or portions of tables are stored at different sites. The distribution of the tables is done in accordance with the frequency of access. This takes into consideration the fact that the frequency of access to the tables varies considerably from site to site. The number of copies of the tables (or portions) depends on how frequently the access queries execute and on the sites which generate them.

Fragmented

In this design, a table is divided into two or more pieces, referred to as fragments or partitions, and each fragment can be stored at a different site. This considers the fact that it seldom happens that all the data stored in a table is required at a given site. Moreover, fragmentation increases parallelism and provides better disaster recovery. There is only one copy of each fragment in the system, i.e., no redundant data.
The three fragmentation techniques are −

 Vertical fragmentation
 Horizontal fragmentation
 Hybrid fragmentation
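
To make the first two techniques concrete, here is a small Python sketch that fragments a table represented as a list of dictionaries. The EMPLOYEE columns and the site-selection predicate are illustrative assumptions:

employees = [
    {"emp_id": 1, "name": "Asha",  "city": "Noida",     "salary": 50000},
    {"emp_id": 2, "name": "Ravi",  "city": "Bangalore", "salary": 65000},
    {"emp_id": 3, "name": "Meena", "city": "Noida",     "salary": 58000},
]

def horizontal_fragment(rows, predicate):
    """Horizontal fragmentation: a fragment holds whole rows selected
    by a predicate, e.g. the rows local to one site."""
    return [row for row in rows if predicate(row)]

def vertical_fragment(rows, columns):
    """Vertical fragmentation: a fragment holds a subset of columns;
    the key (emp_id) is kept in every fragment so rows can be rejoined."""
    return [{col: row[col] for col in columns} for row in rows]

noida_site = horizontal_fragment(employees, lambda r: r["city"] == "Noida")
payroll    = vertical_fragment(employees, ["emp_id", "salary"])
print(noida_site)   # rows stored at the Noida site
print(payroll)      # emp_id/salary fragment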

Mixed Distribution

This is a combination of fragmentation and partial replication. Here, the tables are initially fragmented in any form (horizontal or vertical), and then these fragments are partially replicated across the different sites according to the frequency of access to the fragments.
Spatial Database

A spatial database is a database that is optimized for storing and querying data that represents objects defined in a geometric space. Most spatial databases allow representing simple geometric objects such as points, lines and polygons. Some spatial databases handle more complex structures such as 3D objects, topological coverages, linear networks, and TINs. While typical databases have developed to manage various numeric and character types of data, such databases require additional functionality to process spatial data types efficiently, and developers have often added geometry or feature data types.
The Open Geospatial Consortium developed the Simple Features specification (first released in 1997)[1], which sets standards for adding spatial functionality to database systems.[2] The SQL/MM Spatial ISO/IEC standard is a part of the SQL/MM multimedia standard and extends the Simple Features standard with data types that support circular interpolations.

Limitations of Conventional Databases

More Costly
Creating and managing a database is quite costly. High-cost software and hardware are required for the database. Highly trained staff are also required to handle the database, and it needs continuous maintenance. All of this ends up making a database quite a costly venture.
High Complexity

A Database Management System is quite complex, as it involves creating, modifying and editing a database. Consequently, the people who handle or work with a database need to be quite skilled, or valuable data can be lost.
Database handling staff required
As discussed in the previous point, databases and DBMSs are quite complex. Hence, skilled personnel are required to handle the database so that it works in optimum condition. This is a costly venture, as these professionals need to be very well paid.
Database Failure
All the relevant data for any company is stored in a database. So it is imperative that
the database works in optimal condition and there are no failures. A database failure
can be catastrophic and can lead to loss or corruption of very important data.
High Hardware Cost
A database contains a vast amount of data, so large disk storage is required to store it all. Sometimes extra storage may even be needed. All this increases hardware costs considerably and makes a database quite expensive.
Huge Size
A database contains a large amount of data, especially for bigger organisations. This data may grow further as more data is added to the database. All of this leads to a large database.
The bigger the database is, the more difficult it is to handle and maintain. It is also more complex to ensure data consistency and user authentication across big databases.
Upgradation Costs
Often new functionalities are added to the database. This leads to database upgradations. All of these upgradations cost a lot of money. Moreover, it is also quite expensive to train the database managers and users to handle these new upgradations.
Cost of Data Conversion
If the database is changed or modified in some manner, all the data needs to be converted to the new form. This cost may sometimes even exceed the database creation and management costs. This is the reason most organisations prefer to work with their old databases rather than upgrade to new ones.
