Unit VII Advanced Topics
Users
Roles
Grant
Revoke
Concept of Parallel and Distributed
Databases
Parallel Database
Some systems need to handle huge amounts of data at
high transfer rates
For such requirements, a client-server or centralized
system is not efficient.
A parallel database system seeks to improve the
performance of the system by carrying out operations
in parallel
A parallel DBMS is a DBMS that runs across multiple
processors or CPUs and is mainly designed to execute
query operations in parallel, wherever possible.
A parallel DBMS links a number of smaller machines
to achieve the same throughput as expected from a
single large system
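The idea of executing one query in parallel can be sketched as follows. This is a minimal illustration, not how any particular DBMS is implemented: the partitioned data and the names partial_sum and parallel_total are assumptions made for the example. Each worker aggregates its own partition, and the partial results are then combined. Threads are used only to show the structure; a real parallel DBMS would spread this work over separate processors or machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: one column of order amounts, already split into partitions.
partitions = [
    [120, 80, 45],
    [300, 15],
    [60, 60, 60, 20],
    [500],
]

def partial_sum(rows):
    """Work done by one worker: aggregate only its own partition."""
    return sum(rows)

def parallel_total(parts, workers=4):
    """Run the partial aggregations concurrently, then combine the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, parts))
    return sum(partials)

total = parallel_total(partitions)
print(total)
```

The same split, work, combine pattern applies to other operations such as scans and joins.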
Three architectural designs for parallel DBMS
Shared Memory Architecture
Shared Disk Architecture
Shared Nothing Architecture
Shared Memory Architecture
Uses multiple processors that are attached to a global shared
memory via an interconnection channel.
In a shared memory structure, all the processors in the
system use a common memory for execution.
Advantages
Data is easily accessible to any processor
One processor can send messages to another efficiently
Disadvantages
Waiting time of processors increases as the number of
processors grows
Bandwidth problems
Shared Disk System
Uses multiple processors that can all access
multiple disks via interconnection channels;
every processor has its own local memory.
Advantages
Fault tolerance is achieved using a shared disk system. If
a processor or its memory fails, the other processors can
complete the task.
Disadvantages
A shared disk system has limited scalability because a large
amount of data travels through the interconnection
channel
If more processors are added, the existing processors
are slowed down.
Shared Nothing System
Each processor in the shared nothing system has
its own local memory and storage.
Advantages
Processors and disks can be added as required
in a shared nothing system
A shared nothing system can support many
processors, which makes the system scalable
Disadvantages
Data partitioning is required in a shared nothing
system
Cost of communication for accessing a non-local disk is much
higher.
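The data partitioning that a shared nothing system requires can be sketched with hash partitioning. This is a minimal illustration under stated assumptions: the functions node_for and partition, the choice of CRC32 as the hash, and the sample rows are all hypothetical, and real systems may use range or round-robin partitioning instead.

```python
import zlib

def node_for(key, num_nodes):
    """Map a key to a node using a stable hash (CRC32), so the result is repeatable."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def partition(rows, key_field, num_nodes):
    """Distribute rows across nodes; each node stores only its own share."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key_field], num_nodes)].append(row)
    return nodes

# Hypothetical customer table spread over three nodes.
rows = [{"id": i, "name": f"customer{i}"} for i in range(10)]
nodes = partition(rows, "id", 3)

for number, local_rows in enumerate(nodes):
    print(number, [r["id"] for r in local_rows])
```

A lookup by id only has to visit node_for(id, 3), which is a local disk access on that node. A query that needs rows from several nodes must move data over the network, which is the communication cost listed above.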
Distributed Database
A distributed database (DDB) is a collection of multiple, logically
interrelated databases distributed over a computer network
A distributed database management system (DDBMS) is the
software that manages the DDB and provides an access
mechanism that makes this distribution transparent to the users
The terms DDBMS and DDBS are often used interchangeably
Implicit assumptions
Data is stored at a number of sites; each site logically consists of a
single processor
Processors at different sites are interconnected by a computer
network (we do not consider multiprocessors in DDBMS, cf.
parallel systems)
DDBS is a database, not a collection of files (cf. relational data
model). Placement and query of data is impacted by the access
patterns of the user
A DDBMS is a collection of DBMSs (not a remote file system)
Distributed Database Systems deliver the
following advantages:
Higher reliability
Improved performance
Easier system expansion
Transparency of distributed and replicated data
Disadvantages
Complexity
Cost
Security
Integrity control more difficult
Lack of standards
Lack of experience
Database design more complex
Concept of Data warehousing and Data
mining
As the volume of data increases day by day, the
traditional methods used to manage and manipulate
data are becoming obsolete. To overcome this problem
we need a more effective and advanced data storage
system, and data warehouses provide this.
In general terms, a data warehouse is a historical repository
of information collected from multiple sources, stored under
a unified schema, and usually residing at a single site.
A data warehouse stores the historical data of an
organization so that it can analyze its
past performance and plan for the future
A data warehouse (DW) is a digital storage system that
connects and harmonizes large amounts of data from
many different sources.
Its purpose is to feed business intelligence (BI), reporting,
and analytics and to support regulatory requirements, so that
companies can turn their data into insight and make
smart, data-driven decisions.
A data warehouse stores current and historical data in one
place and acts as the single source of truth for an
organization
Data warehousing is the process of constructing and using
data warehouses.
It is the process of extracting operational data, transforming
it into informational data, and loading it into a central
data store (warehouse)
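The extract, transform, and load steps described above can be sketched as follows. This is a minimal illustration using in-memory SQLite databases; the table names sales and daily_sales, the column names, and the function run_etl are assumptions made for the example, not part of any specific warehouse product.

```python
import sqlite3

# Hypothetical operational (OLTP) source with one sales table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (id INTEGER, sold_on TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "2024-01-05", 1250), (2, "2024-01-05", 750), (3, "2024-01-06", 4000)],
)

# Hypothetical warehouse holding one summary row per day.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE daily_sales (sold_on TEXT PRIMARY KEY, total REAL)")

def run_etl(src, dst):
    # Extract: read the raw operational rows.
    rows = src.execute("SELECT sold_on, amount_cents FROM sales").fetchall()
    # Transform: aggregate per day and convert cents to currency units.
    totals = {}
    for sold_on, cents in rows:
        totals[sold_on] = totals.get(sold_on, 0) + cents
    summary = [(day, cents / 100) for day, cents in sorted(totals.items())]
    # Load: write the informational rows into the warehouse.
    dst.executemany("INSERT INTO daily_sales VALUES (?, ?)", summary)
    dst.commit()

run_etl(source, warehouse)
result = warehouse.execute(
    "SELECT sold_on, total FROM daily_sales ORDER BY sold_on"
).fetchall()
print(result)
```

The warehouse keeps only the summarized, analysis-ready rows, while the operational database continues to hold the individual transactions.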
Benefits
Better Business analytics
Faster Queries
Improved Data Quality
Historical insight
Features
Subject Oriented
Integrated
Time Variant
Non-volatile
[Figure: OLTP Database loads into Data Warehouse]