DBMS Unit5
Architecture of RDBMS-
RDBMS stands for Relational Database Management System. It stores data in related tables and
uses SQL as its query language.
In practice, organisations use an RDBMS to collect information, process it, and provide a
service. For example, a ticket booking system collects details about the passenger (e.g. age,
gender) and the journey (e.g. source, destination), and then issues the ticket.
RDBMS Architecture :
Note –
Each term in the diagram is explained below in the point number associated with the term.
1. All data, data about data (metadata) and logs are stored on secondary storage devices
such as disks and tapes. The programs that carry out the day-to-day operations of an
enterprise are called application programs. They are written in high-level languages
(HLL) such as Java or C, which are used together with SQL to communicate with the
database.
2. The RDBMS has a compiler that translates SQL commands into a lower-level form, which
is then executed against the data held on the secondary storage device.
3. The Database Administrator (DBA) sets up the structure of the database using the
command processor. The DBA uses the Data Definition Language (DDL) to create or drop
tables, add columns, etc., and uses other commands to set constraints and access
controls.
4. Application programmers compile their applications with a compiler to create
executable files (compiled application programs), which are then stored on the
secondary storage device.
5. A data analyst uses the Query Compiler and the Query Optimizer (which uses relational
properties to execute queries efficiently) to query and manipulate the data in the database.
6. RDBMS Run Time System executes the compiled queries and application programs and
also interacts with the transaction manager and buffer manager.
7. The Buffer Manager temporarily holds pages of the database in main memory and uses a
paging algorithm to decide which pages stay there, so that operations run faster and
fewer disk accesses are needed.
8. The Transaction Manager enforces the principle of either completing a task entirely or
not doing it at all (the Atomicity property). E.g. suppose a person named Geeks wants to
send money to his sister. He sends the money and the system crashes midway. It must
never happen that the money has left his account but his sister has not received it.
The transaction manager guarantees this: either the money is returned to Geeks or the
transfer to his sister is completed.
9. The Log records information about all transactions, so that whenever a system failure
(disk failure, shutdown due to power loss, etc.) occurs, the partial transactions can
be undone.
10. The Recovery Manager takes control of the system after a failure and brings it back to
a consistent state. It reads the log files, undoes the partial transactions, and makes
sure the completed transactions are reflected in the database.
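The all-or-nothing behaviour described in point 8 can be tried out on a small scale. The sketch below is not how any particular RDBMS implements its transaction manager; it assumes Python's built-in sqlite3 module, a hypothetical accounts table, and a simulated crash (an exception raised between the debit and the credit).

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("Geeks", 100), ("Sister", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, sender, receiver, amount, crash_midway=False):
    """Move money as one transaction; return True if it committed."""
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
            if crash_midway:
                raise RuntimeError("simulated crash between debit and credit")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


bank = make_bank()
print(transfer(bank, "Geeks", "Sister", 40, crash_midway=True), balances(bank))
print(transfer(bank, "Geeks", "Sister", 40), balances(bank))
```

The first call fails midway and leaves both balances untouched, because the debit is rolled back. The second call completes and moves the money in one step.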
Physical Files -
In a Relational Database Management System (RDBMS), physical files are the actual files on
the disk that store data, indexes, and other database objects. Different RDBMS products have
their own file structures and management techniques, but the general types of physical files
commonly found in RDBMS include:
1. Data Files:
o MySQL InnoDB: .ibd files (when innodb_file_per_table is enabled) or the
system tablespace file (ibdata1).
o MySQL MyISAM: .MYD files store table data.
o SQL Server: .mdf (primary data file) and .ndf (secondary data files).
o Oracle: .dbf files.
2. Index Files:
o MyISAM: .MYI files store index data.
o Other RDBMSs: Indexes are typically stored within data files.
3. Log Files:
o Transaction Logs: Store a history of all changes made to the database
(e.g., .ldf in SQL Server, redo logs in Oracle, and binary logs in MySQL).
o Error Logs: Record errors and significant events.
o Relay Logs: Used in MySQL replication to store changes received from a master
server before they are applied to a slave server.
4. Configuration Files:
o Store configuration settings for the database server (e.g., my.cnf for MySQL,
init.ora for Oracle).
5. Control Files (Oracle):
o Contain metadata about the database structure, including the locations of data
files and redo log files.
6. Temporary Files:
o Used for intermediate results of queries, such as sorting operations and
temporary tables (e.g., tempdb in SQL Server).
Memory Structure -
The memory structure of a Relational Database Management System (RDBMS) is crucial for
its performance and efficiency. Memory management involves the allocation and
optimization of memory resources to ensure smooth operation, quick data access, and
efficient query processing. Different RDBMSs have their own specific implementations.
Tablespaces
Tablespaces are logical storage containers in a database. They help manage the storage
allocation for database objects like tables and indexes. The examples in this part use
Oracle syntax.
Sql code -
CREATE TABLESPACE example_tbs
DATAFILE '/path/to/datafile/example01.dbf' SIZE 100M;
Segments
Segments are collections of extents allocated for specific database objects like tables or
indexes. Each segment resides in a tablespace.
Example: Create a table, which automatically creates a data segment in the specified
tablespace:
Sql code -
CREATE TABLE example_table (
id NUMBER,
name VARCHAR2(50)
) TABLESPACE example_tbs;
Extents
Extents are units of space allocation within a segment, consisting of contiguous data blocks.
Extents help manage storage more efficiently by allocating space in larger chunks rather than
one block at a time.
Sql code -
SELECT segment_name, extent_id, file_id, block_id, blocks
FROM dba_extents
WHERE segment_name = 'EXAMPLE_TABLE';
Blocks
Blocks are the smallest units of storage in a database. They store the actual data and are
grouped into extents.
Sql code -
SELECT segment_name, blocks
FROM dba_segments
WHERE segment_name = 'EXAMPLE_TABLE';
Summary of Relationships
A tablespace contains segments, a segment is made up of extents, and an extent is made up of
contiguous blocks. The steps below walk through this hierarchy:
1. Create a Tablespace:
Sql code -
CREATE TABLESPACE example_tbs
DATAFILE '/path/to/datafile/example01.dbf' SIZE 100M;
2. Create a Table (this creates a data segment in the tablespace):
Sql code -
CREATE TABLE example_table (
id NUMBER,
name VARCHAR2(50)
) TABLESPACE example_tbs;
3. Insert Data:
Sql code -
INSERT INTO example_table (id, name) VALUES (1, 'John Doe');
INSERT INTO example_table (id, name) VALUES (2, 'Jane Smith');
COMMIT;
4. Monitor Segments:
Sql code -
SELECT segment_name, segment_type, tablespace_name
FROM dba_segments
WHERE tablespace_name = 'EXAMPLE_TBS';
5. Monitor Extents:
Sql code -
SELECT segment_name, extent_id, file_id, block_id, blocks
FROM dba_extents
WHERE segment_name = 'EXAMPLE_TABLE';
6. Monitor Blocks:
Sql code -
SELECT segment_name, blocks
FROM dba_segments
WHERE segment_name = 'EXAMPLE_TABLE';
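The containment described in this part (tablespace, segment, extent, block) also fixes how sizes add up. The following is a minimal model in Python, not a description of Oracle internals: the 8192-byte block size and the two 8-block extents are assumed values chosen only to make the arithmetic concrete, and the class names are hypothetical.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 8192  # bytes per block; an assumed value for illustration


@dataclass
class Extent:
    blocks: int  # number of contiguous blocks in this extent


@dataclass
class Segment:
    name: str
    extents: list = field(default_factory=list)

    def total_blocks(self):
        return sum(extent.blocks for extent in self.extents)

    def size_bytes(self):
        return self.total_blocks() * BLOCK_SIZE


@dataclass
class Tablespace:
    name: str
    datafile_bytes: int
    segments: list = field(default_factory=list)

    def used_bytes(self):
        return sum(segment.size_bytes() for segment in self.segments)

    def free_bytes(self):
        return self.datafile_bytes - self.used_bytes()


# A 100 MB datafile holding one table segment made of two 8-block extents.
table_segment = Segment("EXAMPLE_TABLE", [Extent(blocks=8), Extent(blocks=8)])
tbs = Tablespace("EXAMPLE_TBS", 100 * 1024 * 1024, [table_segment])
print(table_segment.total_blocks(), table_segment.size_bytes(), tbs.free_bytes())
```

Under these assumptions the segment occupies 16 blocks, i.e. 16 x 8192 bytes, and the rest of the datafile remains free for further extents.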
Dedicated Server -
In a dedicated server configuration, each client connection to the database is handled by its
own dedicated server process.
Advantages:
High Performance: Since each client has a dedicated process, performance per
connection is generally better.
Fault Isolation: Problems with one connection do not affect others.
Simplicity: Easier to configure and manage compared to multithreaded servers.
Disadvantages:
High Resource Consumption: More server resources are used because each
connection has its own process.
Limited Scalability: Scalability is limited because the server can only handle as
many connections as it can run processes.
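Multithreaded Server -
In a multithreaded (shared) server configuration, a small pool of server processes is shared
among many client connections.
Characteristics: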
Shared Processes: Server processes are shared among multiple client connections.
Resource Usage: More efficient in terms of resource usage as fewer server processes
are needed.
Performance: While the performance per connection might be slightly lower due to
shared resources, the overall ability to handle many connections is better.
Complexity: More complex to configure and manage.
Disadvantages:
Potential Performance Drop: Performance per connection may drop slightly due to
shared resources.
Complexity in Management: More complex to set up and manage compared to
dedicated servers.
Resource Contention: Increased potential for contention for resources among
connections.
Comparison Summary
Aspect                Dedicated Server              Multithreaded Server
Connection Handling   One process per connection    Shared processes for multiple connections
Dedicated Server: Ideal for environments with a smaller number of concurrent users
that require high performance and fault isolation.
Multithreaded Server: Suitable for environments with a large number of concurrent
users where efficient resource utilization is crucial.
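The difference in connection handling can be seen in a toy model. The sketch below uses Python threads as stand-ins for server processes, which is a simplification; the function names, the 12 clients and the pool size of 3 are all assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(client_id):
    """Pretend to serve one client and report which worker did the work."""
    return client_id, threading.current_thread().name


def dedicated_server(n_clients):
    """One worker per client connection."""
    results = []
    lock = threading.Lock()

    def serve(client_id):
        outcome = handle_request(client_id)
        with lock:
            results.append(outcome)

    workers = [
        threading.Thread(target=serve, args=(cid,), name=f"dedicated-{cid}")
        for cid in range(n_clients)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


def shared_server(n_clients, pool_size):
    """A fixed pool of workers shared by all client connections."""
    with ThreadPoolExecutor(max_workers=pool_size, thread_name_prefix="shared") as pool:
        return list(pool.map(handle_request, range(n_clients)))


dedicated = dedicated_server(12)
shared = shared_server(12, pool_size=3)
print(len({worker for _, worker in dedicated}), "workers used by the dedicated server")
print(len({worker for _, worker in shared}), "workers used by the shared server")
```

The dedicated model creates as many workers as there are clients, while the shared model never uses more than its pool size, which is the resource trade-off summarised above.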
Distributed Database
A distributed database is a database system where data is stored across multiple locations or
nodes. Each location can manage its own data independently while appearing as a single
unified database to users. This setup enhances data availability, redundancy, and scalability.
Key Features:
Data Distribution: Data is spread across different sites to improve access speed and
reliability.
Transparency: Users interact with the database as if it were a single entity, regardless
of where the data is physically stored.
Replication: Data can be duplicated across multiple locations for backup and fault
tolerance.
Database Links
A database link is a connection that allows one database to access objects (like tables or
views) in another database. This is especially useful in distributed databases, where you
might need to perform queries or data manipulation across different databases.
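As a rough, single-machine analogy, SQLite's ATTACH DATABASE lets one connection query tables that live in a second database. It is not a database link (in Oracle, for instance, a remote table is referenced as table@link_name), and the schema name remote, the tables and the rows below are all hypothetical; the sketch only shows the idea of one query spanning two databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                      # the "local" database
conn.execute("ATTACH DATABASE ':memory:' AS remote")   # a second, separate database

conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE remote.orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)"
)
conn.executemany("INSERT INTO main.customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany(
    "INSERT INTO remote.orders VALUES (?, ?, ?)",
    [(10, 1, 250), (11, 1, 100), (12, 2, 75)],
)

# One query that joins a local table with a table in the other database.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.total)
    FROM main.customers AS c
    JOIN remote.orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)
```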
Snapshots
A snapshot is a read-only copy of data taken from a database at a specific point in time.
Snapshots are commonly used to create copies for reporting and analysis without impacting
the performance of the source database.
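The point-in-time nature of a snapshot can be illustrated with a plain copy of a table. This is only a sketch using Python's sqlite3 module with a hypothetical sales table: CREATE TABLE ... AS SELECT makes the copy, but nothing here enforces that the copy is read-only, which a real snapshot mechanism would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 100), (2, 200)])

# Take the snapshot: a copy of the data as it is right now.
conn.execute("CREATE TABLE sales_snapshot AS SELECT * FROM sales")

# The source keeps changing after the snapshot was taken.
conn.execute("UPDATE sales SET amount = 999 WHERE id = 1")
conn.execute("INSERT INTO sales VALUES (3, 300)")

live_total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
snapshot_total = conn.execute("SELECT SUM(amount) FROM sales_snapshot").fetchone()[0]
print("live:", live_total, "snapshot:", snapshot_total)
```

Reports run against sales_snapshot keep seeing the original two rows, while the live table reflects the later update and insert.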
Dictionary structure -
Prefix   Scope
USER     Only what is in the user's own schema (generally excludes the OWNER column)
ALL      Everything the user can access
DBA      Everything in the database, i.e. the most that any user can access
USER Views
Views with the prefix USER contain data about a particular user's own objects, which
includes data about objects created by that user, grants made by that user, and so on.
They are generally a subset of the ALL views: they hold only data related to the user and
have columns similar to the other views.
Examples:
Some common views with the prefix USER are USER_OBJECTS and USER_TABLES.
Query to print all the objects (name and type) in a user's schema:
Syntax:
SELECT object_name, object_type FROM user_objects;
Query to print all the data about all tables owned by the user:
Syntax:
SELECT * FROM user_tables;
ALL Views
Views with the prefix ALL contain data about not only the objects owned by the user but also
the objects the user can access through public or explicit grants of privileges or roles.
Examples:
Some common views with the prefix ALL are:
ALL_OBJECTS - data about all objects (tables, views, indexes, procedures, etc.)
accessible to the user
ALL_TAB_COLUMNS - data about the columns of all accessible tables and views
ALL_TABLES - data about all accessible tables
ALL_VIEWS - data about all accessible views
ALL_USERS - data about all users in the database
ALL_CONSTRAINTS - data about all constraints (primary keys, foreign keys, unique
constraints, etc.) on accessible tables
ALL_SEQUENCES - data about all accessible sequences
ALL_INDEXES - data about all indexes on accessible tables
ALL_TRIGGERS - data about all accessible triggers
ALL_SYNONYMS - data about all accessible synonyms
Example:
Query to print all the objects (name and type) to which the user has access:
Syntax:
SELECT object_name, object_type FROM all_objects;
DBA Views
Views with the prefix DBA describe every object in the database. They are generally
accessible only to Database Administrators or to users who hold a suitable system privilege
such as SELECT ANY DICTIONARY. They include the OWNER column as well. These views are owned
by SYS, so they can also be referenced as SYS followed by the view name (e.g.
SYS.DBA_OBJECTS).
The views are otherwise similar to the ALL and USER views.
Example:
Query to print all the objects (name and type) in the database:
Syntax:
SELECT object_name, object_type FROM dba_objects;
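The difference in scope between the USER, ALL and DBA prefixes can be mimicked with three filters over one catalog. The sketch below is a toy model in Python, not Oracle's implementation; the catalog entries, the owners ALICE and BOB, and the granted_to field are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbObject:
    owner: str
    name: str
    type: str
    granted_to: frozenset = frozenset()  # grantees; "PUBLIC" means every user


# A hypothetical catalog of five objects owned by three different users.
CATALOG = [
    DbObject("ALICE", "EMPLOYEES", "TABLE"),
    DbObject("ALICE", "EMP_IDX", "INDEX"),
    DbObject("BOB", "ORDERS", "TABLE", frozenset({"ALICE"})),
    DbObject("BOB", "AUDIT_LOG", "TABLE"),
    DbObject("SYS", "DUAL", "TABLE", frozenset({"PUBLIC"})),
]


def user_objects(user):
    """USER scope: only objects the user owns, so no owner column is needed."""
    return [(o.name, o.type) for o in CATALOG if o.owner == user]


def all_objects(user):
    """ALL scope: owned objects plus objects reachable through grants."""
    return [
        (o.owner, o.name, o.type)
        for o in CATALOG
        if o.owner == user or user in o.granted_to or "PUBLIC" in o.granted_to
    ]


def dba_objects():
    """DBA scope: every object in the catalog."""
    return [(o.owner, o.name, o.type) for o in CATALOG]


print(len(user_objects("ALICE")), len(all_objects("ALICE")), len(dba_objects()))
```

For the hypothetical user ALICE the three scopes return 2, 4 and 5 objects, each one a superset of the previous.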
ANSI SQL
Key Features
1. Standardization: ANSI SQL sets guidelines for SQL syntax and behavior, making it
easier to write queries that work across various databases.
2. Core Functions:
o DDL (Data Definition Language): Commands for creating and modifying
database structures (e.g., CREATE, ALTER).
o DML (Data Manipulation Language): Commands for querying and
changing data (e.g., SELECT, INSERT, UPDATE, DELETE).
o DCL (Data Control Language): Commands for managing access to data
(e.g., GRANT, REVOKE).
3. Portability: Code written in ANSI SQL can often be run on different database
systems with little or no modification.
4. Interoperability: Facilitates data sharing and communication between different
databases.
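The three command groups listed under Core Functions can be told apart by the leading keyword of a statement. The helper below is a small illustrative sketch in Python: the function name classify is hypothetical, the keyword sets contain only the commands named in the list above plus DROP, and the approach is a simplification rather than a real SQL parser.

```python
# Leading keywords taken from the list above, plus DROP for DDL.
CATEGORIES = {
    "DDL": {"CREATE", "ALTER", "DROP"},
    "DML": {"SELECT", "INSERT", "UPDATE", "DELETE"},
    "DCL": {"GRANT", "REVOKE"},
}


def classify(statement):
    """Return 'DDL', 'DML', 'DCL' or 'UNKNOWN' based on the first keyword."""
    words = statement.split()
    if not words:
        return "UNKNOWN"
    keyword = words[0].upper()
    for category, keywords in CATEGORIES.items():
        if keyword in keywords:
            return category
    return "UNKNOWN"


for sql in (
    "CREATE TABLE employees (id INT, name VARCHAR(50))",
    "select name from employees",
    "GRANT SELECT ON employees TO analyst",
):
    print(classify(sql), "-", sql)
```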
What is a Cursor?
A cursor is a tool in SQL that allows you to work with database records one row at a time.
It’s useful when you need to perform actions on each row individually, rather than all at once.
Key Points
1. Purpose: Cursors help you process individual records from a query result, which is
handy for complex operations.
2. Types of Cursors:
o Implicit Cursors: Automatically created by the database for DML statements and
for simple queries that return one row.
o Explicit Cursors: Defined by you when you need to work with multiple rows.
3. Cursor Lifecycle:
o Declare: Define the cursor and the SQL query it will use.
o Open: Start the cursor so it can retrieve data.
o Fetch: Get the next row of data from the cursor.
o Close: Stop using the cursor when you're done.
o Deallocate: Remove the cursor definition to free up resources.
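Simple Example
Code-
-- SQL Server (T-SQL); the employees table and its name column are assumed for illustration
DECLARE @name VARCHAR(100);
DECLARE emp_cursor CURSOR FOR SELECT name FROM employees;
OPEN emp_cursor;
FETCH NEXT FROM emp_cursor INTO @name; -- Get the first row
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name; -- Print the employee name
    FETCH NEXT FROM emp_cursor INTO @name; -- Get the next row
END
CLOSE emp_cursor;
DEALLOCATE emp_cursor;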
Use cursors when you need to handle each record individually, like in complex
calculations or detailed processing tasks.
For most other operations, set-based queries are preferred because they are usually
faster and more efficient.
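The cursor lifecycle (declare, open, fetch, close) and the contrast with a set-based statement can also be tried outside SQL Server. The sketch below assumes Python's built-in sqlite3 module and a hypothetical employees table; a sqlite3 cursor object stands in for the SQL cursor and fetchone() returning None plays the role of the fetch status check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Asha", 40000), ("Ravi", 55000), ("Meera", 70000)],
)

# Row-by-row processing: create the cursor, run the query, fetch one row at a time.
cur = conn.cursor()
cur.execute("SELECT name FROM employees ORDER BY name")
names = []
row = cur.fetchone()           # get the first row
while row is not None:         # None means there are no more rows
    names.append(row[0])
    row = cur.fetchone()       # get the next row
cur.close()                    # release the cursor when done
print(names)

# Set-based processing: one statement updates every row, no loop needed.
conn.execute("UPDATE employees SET salary = salary + 1000")
total_salary = conn.execute("SELECT SUM(salary) FROM employees").fetchone()[0]
print(total_salary)
```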
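Nested Cursors
Nested Cursors are cursors opened inside the loop of another cursor. They allow you to
process related data, such as fetching the employees of each department.
Code-
DECLARE @department_id INT, @employee_name VARCHAR(100); -- table names below are assumed
DECLARE dept_cursor CURSOR FOR SELECT department_id FROM departments;
OPEN dept_cursor;
FETCH NEXT FROM dept_cursor INTO @department_id;
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT 'Department ID: ' + CAST(@department_id AS VARCHAR(10));
    -- Inner cursor: employees of the current department
    DECLARE emp_cursor CURSOR FOR
        SELECT name FROM employees WHERE department_id = @department_id;
    OPEN emp_cursor;
    FETCH NEXT FROM emp_cursor INTO @employee_name;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        PRINT ' Employee Name: ' + @employee_name;
        FETCH NEXT FROM emp_cursor INTO @employee_name;
    END
    CLOSE emp_cursor;
    DEALLOCATE emp_cursor;
    FETCH NEXT FROM dept_cursor INTO @department_id; -- Move to the next department
END
CLOSE dept_cursor;
DEALLOCATE dept_cursor;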
Parameterized Cursors
Parameterized Cursors allow you to pass parameters to the cursor's SQL query. This is
useful for filtering data based on specific values.
Here’s how to create a cursor that selects employees with a salary above a specific value:
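Code-
-- T-SQL cursors take no parameter list, so a variable plays that role (value assumed)
DECLARE @min_salary INT = 50000;
DECLARE @name VARCHAR(100);
DECLARE emp_cursor CURSOR FOR
    SELECT name FROM employees WHERE salary > @min_salary;
OPEN emp_cursor;
FETCH NEXT FROM emp_cursor INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name; -- Print the employee name
    FETCH NEXT FROM emp_cursor INTO @name; -- Move to next employee
END
CLOSE emp_cursor;
DEALLOCATE emp_cursor;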
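Both ideas, a cursor nested inside another cursor's loop and a cursor whose query takes a parameter, can be sketched with Python's sqlite3 module. The tables, rows and function names below are hypothetical, and the ? placeholder is SQLite's way of binding a parameter rather than the syntax of any other database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER, department_id INTEGER)")
conn.executemany("INSERT INTO departments VALUES (?, ?)", [(1, "Sales"), (2, "IT")])
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", 40000, 1), ("Ravi", 55000, 2), ("Meera", 70000, 2)],
)


def employees_above(min_salary):
    """Parameterized cursor: the salary threshold is bound at execution time."""
    cur = conn.cursor()
    cur.execute(
        "SELECT name FROM employees WHERE salary > ? ORDER BY name", (min_salary,)
    )
    result = [row[0] for row in cur]
    cur.close()
    return result


def employees_by_department():
    """Nested cursors: the inner cursor is re-run for every department row."""
    report = {}
    outer = conn.cursor()
    outer.execute("SELECT id, name FROM departments ORDER BY id")
    for dept_id, dept_name in outer:
        inner = conn.cursor()
        inner.execute(
            "SELECT name FROM employees WHERE department_id = ? ORDER BY name",
            (dept_id,),
        )
        report[dept_name] = [row[0] for row in inner]
        inner.close()
    outer.close()
    return report


print(employees_above(50000))
print(employees_by_department())
```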
Exception Handling
Exception handling is a programming construct that enables developers to manage errors and
exceptional conditions that may occur during the execution of a program or database
operations. In SQL, this mechanism allows for graceful error management, ensuring that
applications can respond appropriately to issues without crashing or causing data corruption.
Importance
1. Data Integrity: Ensures that operations on the database maintain consistent and valid
data, especially in transactional contexts.
2. Error Management: Allows developers to anticipate potential errors and define how
to respond to them.
3. User Experience: Improves the user experience by providing informative error
messages rather than abrupt failures.
Key Concepts
1. Error Types: Understanding common error types (e.g., constraint violations, syntax
errors, deadlocks) helps in managing them effectively.
2. Error Propagation: Errors can propagate through different layers of an application.
Exception handling allows developers to catch errors at the appropriate level.
3. Transactional Control: In the context of database operations, exception handling is
often paired with transaction management to ensure that changes can be rolled back in
case of failure.
Exception Handling Constructs
1. TRY...CATCH / BEGIN...EXCEPTION
Most SQL dialects provide a mechanism to handle exceptions. The constructs for
different databases are:
SQL Server: BEGIN TRY ... END TRY followed by BEGIN CATCH ... END CATCH.
Oracle (PL/SQL) and PostgreSQL (PL/pgSQL): BEGIN ... EXCEPTION WHEN ... THEN ... END.
2. Error Functions
These functions help retrieve information about the error that occurred:
SQL Server:
o ERROR_MESSAGE(): Returns the error message.
o ERROR_NUMBER(): Returns the error number.
o ERROR_SEVERITY(): Returns the severity level of the error.
Oracle:
o SQLERRM: Returns the error message associated with the last error.
PostgreSQL:
o SQLERRM: Similar to Oracle, it returns the last error message.
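Example Scenarios
Example in SQL Server
Code-
BEGIN TRY
    BEGIN TRANSACTION;
    -- Statement that may fail (hypothetical table and values)
    INSERT INTO employees (name, salary) VALUES ('John Doe', 50000);
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION; -- Undo the partial work
    -- Handle error
    DECLARE @ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE();
    PRINT @ErrorMessage; -- Output the error message
END CATCH;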
Example in Oracle
Code-
BEGIN
-- Attempt to insert a new employee
INSERT INTO employees (name, salary) VALUES ('John Doe', 50000);
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
-- Handle specific error
DBMS_OUTPUT.PUT_LINE('Duplicate entry!');
WHEN OTHERS THEN
-- Handle any other error
DBMS_OUTPUT.PUT_LINE(SQLERRM); -- Output the error message
END;
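The same pattern, one handler for a specific error and one catch-all, paired with a rollback, can be sketched in application code. The example below assumes Python's built-in sqlite3 module and a hypothetical employees table with a primary key; sqlite3.IntegrityError plays roughly the role of DUP_VAL_ON_INDEX and sqlite3.Error that of WHEN OTHERS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def add_employee(emp_id, name):
    """Insert one employee; report errors instead of letting them propagate."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO employees (id, name) VALUES (?, ?)", (emp_id, name)
            )
        return "inserted"
    except sqlite3.IntegrityError as exc:  # specific error: constraint violated
        return f"duplicate entry: {exc}"
    except sqlite3.Error as exc:           # any other database error
        return f"database error: {exc}"


print(add_employee(1, "John Doe"))
print(add_employee(1, "Jane Smith"))  # same id again, so the insert is rejected
```

The second call is rejected by the primary key constraint, the handler turns the exception into a message, and the table still holds only the first row.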
What is a Trigger?
A trigger is a special kind of stored procedure that automatically runs when a specific event
occurs in a database table, like adding, updating, or deleting a record. Triggers help enforce
rules and maintain data integrity without manual intervention.
Types of Triggers
1. Row-Level Triggers: Run for each row affected by the event (e.g., when you update
multiple rows).
2. Statement-Level Triggers: Run once for the entire SQL statement, regardless of how
many rows are affected.
Basic Syntax
Code-
If you want to log every time a new employee is added, you might write:
Code-
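As a runnable stand-in, the logging trigger can be sketched with SQLite through Python's sqlite3 module. SQLite's trigger syntax differs in detail from Oracle's or SQL Server's and it supports only row-level (FOR EACH ROW) triggers; the tables employees and employee_log and the trigger name log_new_employee are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee_log (employee_id INTEGER, action TEXT);

    -- Row-level trigger: runs once for every row inserted into employees.
    CREATE TRIGGER log_new_employee
    AFTER INSERT ON employees
    FOR EACH ROW
    BEGIN
        INSERT INTO employee_log (employee_id, action) VALUES (NEW.id, 'INSERT');
    END;
    """
)

# Plain inserts; the log rows are written by the trigger, not by this code.
conn.execute("INSERT INTO employees (name) VALUES ('John Doe')")
conn.execute("INSERT INTO employees (name) VALUES ('Jane Smith')")

log = conn.execute(
    "SELECT employee_id, action FROM employee_log ORDER BY employee_id"
).fetchall()
print(log)
```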
Mutating Table Errors (ORA-04091) happen in Oracle when a row-level trigger tries to read
from or change the same table that is currently being modified by the triggering statement.
Oracle raises the error because the table's data is in an unstable, partly modified state
during the operation.
Example Scenario
If you have a trigger that checks if a salary is below average while updating the employees
table, it might throw a mutating error:
Code-
Solutions
1. Use Statement-Level Triggers: These run once before or after the whole statement, not
while individual rows are changing, so they can query the table being modified.
2. Compound Triggers: In Oracle, these let you separate logic into different parts that
run at different times.
3. PL/SQL Collections: Use collections to hold data instead of querying the table
directly.
NoSQL Databases
NoSQL databases (Not Only SQL) are databases designed to store and manage large amounts of
data in a flexible way. They differ from traditional SQL databases, which use a fixed
schema and the SQL language. Here's a simple breakdown:
1. Flexible Schema: You can store different types of data without needing a fixed
format. This is useful when data requirements change often.
2. Scalability: NoSQL databases can easily expand by adding more servers to handle
increased data loads, rather than just upgrading a single server.
3. High Performance: They are optimized for fast read and write operations, making
them ideal for applications that need quick data access.
4. Various Data Models: NoSQL databases come in different types:
o Document Stores: Store data as documents (like JSON). Examples:
MongoDB, CouchDB.
o Key-Value Stores: Store data as key-value pairs. Examples: Redis,
DynamoDB.
o Column Family Stores: Organize data into rows whose columns are grouped into
column families. Examples: Cassandra, HBase.
o Graph Databases: Focus on relationships between data, using nodes and
edges. Examples: Neo4j, ArangoDB.
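The four data models can be pictured with ordinary Python data structures. This is only a conceptual sketch using plain dictionaries and lists, not the API of MongoDB, Redis, Cassandra or Neo4j; all keys, names and values are invented for illustration.

```python
import json

# Document store: each record is a self-describing document, and two
# documents in the same collection may have different fields.
documents = [
    {"_id": 1, "name": "Asha", "skills": ["SQL", "Python"]},
    {"_id": 2, "name": "Ravi", "address": {"city": "Pune"}},
]

# Key-value store: an opaque value looked up by its key.
key_value = {"session:42": "user=Asha;cart=3"}

# Column family store: row key -> column family -> column -> value.
column_family = {
    "user:1": {
        "profile": {"name": "Asha"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Graph database: nodes connected by labelled edges.
edges = [("Asha", "MANAGES", "Ravi"), ("Ravi", "WORKS_WITH", "Meera")]


def neighbours(node):
    """Return the nodes reachable from the given node over one edge."""
    return [dst for src, _, dst in edges if src == node]


print(json.dumps(documents[1]))
print(key_value["session:42"])
print(column_family["user:1"]["profile"]["name"])
print(neighbours("Asha"))
```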