DBMS Seminar
QUERY PROCESSING & DISTRIBUTED DATABASES
VALARMATHI M
II – CSE D
OVERVIEW
Query Processing Steps:
1. Scanning, Parsing, Validating and Translation
2. Optimization
3. Evaluation
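The three steps can be observed in a small sketch using Python's built-in sqlite3 module. The employee table and its rows are hypothetical and only serve as an illustration; SQLite is used here because it ships with Python, not because the slides refer to it.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (ssn INTEGER PRIMARY KEY, lname TEXT, dno INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Smith", 5), (2, "Wong", 5), (3, "Zelaya", 4)],
)

query = "SELECT lname FROM employee WHERE dno = 5"

# Steps 1 and 2: SQLite scans, parses and validates the statement, then picks
# a plan. EXPLAIN QUERY PLAN returns that plan instead of running the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# Step 3: evaluation runs the chosen plan and returns the result tuples.
rows = conn.execute(query + " ORDER BY lname").fetchall()

# A statement that fails validation is rejected before any evaluation happens.
try:
    conn.execute("SELECT lname FROM no_such_table")
    validation_error = None
except sqlite3.OperationalError as exc:
    validation_error = str(exc)

print(plan)
print(rows)
print(validation_error)
```

The exact text of the plan rows depends on the SQLite version, so the sketch prints them rather than relying on a fixed format.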
OPTIMIZATION
Query Optimization: Among all equivalent
evaluation plans, choose the one with the lowest cost.
Cost is estimated using statistical information
from the database catalog,
e.g. number of tuples in each relation, size of
tuples, etc.
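As a minimal sketch of this idea, the code below compares two equivalent plans for one join and picks the cheaper one. The catalog numbers, the 4096-byte block size and the block nested-loop cost formula are assumptions chosen for illustration; they are not taken from the slides.

```python
# Hypothetical catalog statistics (not from the slides): tuple counts and sizes.
CATALOG = {
    "Employee": {"tuples": 10_000, "tuple_bytes": 100},
    "Project": {"tuples": 500, "tuple_bytes": 60},
}
BLOCK_BYTES = 4096  # assumed disk block size


def blocks(relation):
    """Number of blocks the relation occupies, derived from catalog statistics."""
    stats = CATALOG[relation]
    tuples_per_block = BLOCK_BYTES // stats["tuple_bytes"]
    return -(-stats["tuples"] // tuples_per_block)  # ceiling division


def block_nested_loop_cost(outer, inner):
    """Worst-case block transfers: read outer once, scan inner per outer block."""
    return blocks(outer) + blocks(outer) * blocks(inner)


# Two equivalent plans for the same join; they differ only in the outer relation.
plans = [("Employee", "Project"), ("Project", "Employee")]
costs = {plan: block_nested_loop_cost(*plan) for plan in plans}
best_plan = min(costs, key=costs.get)

for plan, cost in costs.items():
    print(plan, cost)
print("chosen:", best_plan)
```

Both plans return the same result; only the estimated number of block transfers differs, which is why the smaller relation ends up as the outer one here.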
Final Conclusion:
Rules:
1) Draw the initial query tree.
2) Move SELECT operations down the tree.
3) Apply the more restrictive SELECT operation first.
4) Replace CARTESIAN PRODUCT and SELECT operations with
JOIN operations.
5) Move PROJECT operations down the tree.
E.g.:
Employee (Fname, Lname, ssn, Bdate, Address, Dno)
Works_for (Essn, Pno, hours)
Project (Pname, Pnum, Plocation, Dnum)
Step – 1: Initial (canonical) query tree for SQL query Q.
Step – 2: Moving SELECT operations down the query tree.
Step – 3: Applying the more restrictive SELECT operation first.
Step – 4: Replacing CARTESIAN PRODUCT and SELECT with JOIN operations.
Step – 5: Moving PROJECT operations down the query tree.
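The tree transformations can be sketched in code. The example below covers only Steps 2 and 4: it pushes each SELECT condition as far down as its attributes allow and turns a CARTESIAN PRODUCT into a JOIN when a condition spans both sides. The query Q is not reproduced in the slides, so the four conditions used here are an assumption, and the class and function names are hypothetical. Steps 3 and 5 are not implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rel:
    name: str
    attrs: frozenset


@dataclass(frozen=True)
class Select:
    cond: str
    uses: frozenset  # attributes referenced by the condition
    child: object


@dataclass(frozen=True)
class Product:
    left: object
    right: object


@dataclass(frozen=True)
class Join:
    cond: str
    left: object
    right: object


def attrs_of(node):
    """All attributes available at the output of a node."""
    if isinstance(node, Rel):
        return node.attrs
    if isinstance(node, Select):
        return attrs_of(node.child)
    return attrs_of(node.left) | attrs_of(node.right)


def rebuild(node, left, right):
    """Copy a binary node with new children, keeping its kind."""
    if isinstance(node, Join):
        return Join(node.cond, left, right)
    return Product(left, right)


def push(cond, uses, node):
    """Place one SELECT condition as low in the tree as possible."""
    if isinstance(node, Select):
        return Select(node.cond, node.uses, push(cond, uses, node.child))
    if isinstance(node, (Product, Join)):
        if uses <= attrs_of(node.left):
            return rebuild(node, push(cond, uses, node.left), node.right)
        if uses <= attrs_of(node.right):
            return rebuild(node, node.left, push(cond, uses, node.right))
        if isinstance(node, Product):
            # The condition spans both sides: product + select becomes a join.
            return Join(cond, node.left, node.right)
    return Select(cond, uses, node)


def optimize(node):
    """Strip the SELECTs stacked at the root, then push each one down."""
    conds = []
    while isinstance(node, Select):
        conds.append((node.cond, node.uses))
        node = node.child
    for cond, uses in conds:
        node = push(cond, uses, node)
    return node


def show(node):
    if isinstance(node, Rel):
        return node.name
    if isinstance(node, Select):
        return f"SELECT[{node.cond}]({show(node.child)})"
    if isinstance(node, Join):
        return f"({show(node.left)} JOIN[{node.cond}] {show(node.right)})"
    return f"({show(node.left)} X {show(node.right)})"


employee = Rel("Employee", frozenset({"Fname", "Lname", "ssn", "Bdate", "Address", "Dno"}))
works_for = Rel("Works_for", frozenset({"Essn", "Pno", "hours"}))
project = Rel("Project", frozenset({"Pname", "Pnum", "Plocation", "Dnum"}))

# Assumed conditions for Q; the canonical tree stacks them above the products.
initial = Product(Product(employee, works_for), project)
for cond, uses in [
    ("Bdate > '1957-12-31'", {"Bdate"}),
    ("Essn = ssn", {"Essn", "ssn"}),
    ("Pnum = Pno", {"Pnum", "Pno"}),
    ("Pname = 'Aquarius'", {"Pname"}),
]:
    initial = Select(cond, frozenset(uses), initial)

optimized = optimize(initial)
print(show(initial))
print(show(optimized))
```

In the optimized tree each single-relation condition sits directly above its relation, and both products have become joins, so far fewer intermediate tuples are produced than in the canonical tree.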
COST ESTIMATION
The main aim of query optimization is to
choose the most efficient way of
implementing the relational algebra
operations at the lowest possible cost.
The query optimizer should not depend
solely on heuristic rules; it should also
estimate the cost of executing the different
strategies and choose the strategy with the
minimum cost estimate.
The cost functions are only estimates and
not exact values.
The cost depends on the cardinality of the
inputs.
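A small sketch shows why a cost function is only an estimate. It assumes the optimizer knows just two catalog statistics (number of tuples and number of distinct values) and treats the values as uniformly distributed, which is a common simplifying assumption rather than something stated in the slides. The Dno column below is hypothetical and skewed on purpose.

```python
from collections import Counter

# Hypothetical Dno column of an Employee relation; skewed on purpose.
dno_column = [5] * 70 + [4] * 20 + [1] * 10

n_tuples = len(dno_column)         # catalog statistic: number of tuples
n_distinct = len(set(dno_column))  # catalog statistic: distinct values of Dno


def estimated_matches(n_tuples, n_distinct):
    """Estimated cardinality of 'Dno = constant' under a uniform distribution."""
    return n_tuples / n_distinct


estimate = estimated_matches(n_tuples, n_distinct)
actual = Counter(dno_column)

for value in sorted(actual):
    print(f"Dno = {value}: estimated {estimate:.1f} tuples, actual {actual[value]}")
```

The estimate is the same for every value, while the true cardinalities differ widely, so any cost derived from the estimate inherits that error.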
Cost Components of Query Execution
• The cost of executing the query includes the
following components:
Access cost to secondary storage.
Storage cost.
Computation cost.
Memory usage cost.
Communication cost.
i) Access Cost to Secondary Storage
Disk I/O Cost → Reading/writing tables and indexes from
disk.
Index Lookup Cost → Searching for indexed records.
Sequential vs. Random Access Cost → Sequential
scans are cheaper than random accesses.
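The trade-off between a sequential scan and index lookups can be sketched with a simple model. The seek and transfer times, the index height and the formulas below are assumptions chosen for illustration: a scan pays one seek and then reads every block in order, while each index step or record fetch pays a seek plus one block transfer.

```python
SEEK_US = 4000      # assumed time for one random seek, in microseconds
TRANSFER_US = 100   # assumed time to transfer one block, in microseconds


def seq_scan_cost(table_blocks):
    """One seek, then every block of the table is read sequentially."""
    return SEEK_US + table_blocks * TRANSFER_US


def index_lookup_cost(index_height, matching_records):
    """Each index level and each matching record costs one random access."""
    return (index_height + matching_records) * (SEEK_US + TRANSFER_US)


def cheaper_access(table_blocks, index_height, matching_records):
    scan = seq_scan_cost(table_blocks)
    index = index_lookup_cost(index_height, matching_records)
    return "index" if index < scan else "scan"


# Hypothetical table of 250 blocks with an index of height 3.
for matches in (1, 10, 1000):
    print(
        matches,
        seq_scan_cost(250),
        index_lookup_cost(3, matches),
        cheaper_access(250, 3, matches),
    )
```

Under these assumed numbers the index wins only when very few records match; once more records qualify, the random accesses cost more than reading the whole table sequentially.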
ii) Storage Cost
Data Storage Cost → Space occupied by tables and
indexes.
Index Storage Cost → Extra space required for maintaining
indexes.
Temporary Storage Cost → Space needed for intermediate
query results.