CH 13 Updated

Uploaded by

pratikmalviya2974

Chapter 13: Query Processing

Basic Steps in Query Processing

1. Parsing and translation

2. Optimization
3. Evaluation

Basic Steps in Query Processing (Cont.)
● Parsing and translation
● translate the query into its internal form. This is then translated into relational
algebra.
● Parser checks syntax, verifies relations
● Evaluation
● The query-execution engine takes a query-evaluation plan, executes that plan, and
returns the answers to the query.

Basic Steps in Query Processing: Optimization
● A relational algebra expression may have many equivalent expressions
● E.g., σ_{balance<2500}(∏_{balance}(account)) is equivalent to ∏_{balance}(σ_{balance<2500}(account))
● Each relational algebra operation can be evaluated using one of several different
algorithms
● Correspondingly, a relational-algebra expression can be evaluated in many ways.
● Annotated expression specifying detailed evaluation strategy is called an evaluation-plan.
● E.g., can use an index on balance to find accounts with balance < 2500,
● or can perform complete relation scan and discard accounts with balance ≥ 2500

Basic Steps: Optimization (Cont.)

● Query Optimization: Amongst all equivalent evaluation plans choose the one with
lowest cost.
● Cost is estimated using statistical information from the database catalog
• e.g. number of tuples in each relation, size of tuples, etc.

Measures of Query Cost

● Cost is generally measured as total elapsed time for answering query

● Many factors contribute to time cost
• disk accesses, CPU, or even network communication
● Typically disk access is the main cost, and is also relatively easy to estimate. Measured
by taking into account
● Number of seeks * average-seek-cost
● Number of blocks read * average-block-read-cost
● Number of blocks written * average-block-write-cost
• Cost to write a block is greater than cost to read a block
– data is read back after being written to ensure that the write was successful

Measures of Query Cost (Cont.)
● For simplicity we just use the number of block transfers from disk and the number of
seeks as the cost measures
● tT – time to transfer one block
● tS – time for one seek
● Cost for b block transfers plus S seeks
b * tT + S * tS
● We ignore CPU costs for simplicity
● Real systems do take CPU cost into account
● Also, we do not include the cost of writing output to disk in our cost formulae
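
The cost measure above can be written as a one-line function. This is a minimal sketch; the function name and the timing values in the example call are assumptions chosen for illustration, not figures from the slides.

```python
def disk_cost(block_transfers: int, seeks: int, t_T: float, t_S: float) -> float:
    """Estimated cost of b block transfers plus S seeks: b * tT + S * tS.

    CPU cost and the cost of writing the final output are ignored,
    matching the simplification stated in the slides.
    """
    return block_transfers * t_T + seeks * t_S


# Hypothetical timings (assumed): 1 time unit per block transfer, 40 per seek.
example = disk_cost(block_transfers=100, seeks=1, t_T=1.0, t_S=40.0)
print(example)
```

Because seeks are usually far more expensive than transfers, plans that read many contiguous blocks after a single seek tend to score well under this measure.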

Selection Operation – Algorithms List
● Search Algorithms are used to search and retrieve records that fulfill selection condition.

● File Scan : Entire relation (file) is scanned

● Two basic Algorithms
● Linear Search
● Binary Search
● Index Scan : Indexes are used to search records.
● Four Algorithms
● Primary index, equality on key
● Primary index, equality on nonkey
● Secondary index, equality on key
● Secondary index, equality on nonkey
● Algorithms for selections involving comparisons (<, <=, >, >=)
● Primary index, comparison
● Secondary index, comparison
● Algorithms for complex selections (Conjunction, Disjunction, Negation)
● Conjunctive selection using one index
● Conjunctive selection using composite index
● Conjunctive selection by intersection of identifiers
● Disjunctive selection by union of identifiers

Selection Operation – Algo. (Cont…)

● File scan – search algorithms that locate and retrieve records that fulfill a selection
condition.
● Algorithm A1 (linear search). Scan each file block and test all records to see
whether they satisfy the selection condition.
● Cost estimate = br block transfers + 1 seek (1 seek is required to
access the first block of the file; the remaining blocks can then be read sequentially if they are stored
contiguously. If not stored contiguously, extra seeks may be required.)
• br denotes number of blocks containing records from relation r
● If selection is on a key attribute, can stop on finding record because unique
value exists for key
• The average transfer cost is (br /2) block transfers + 1 seek
• But the worst case is br block transfers + 1 seek
● Linear search can be applied to any file regardless of
• selection condition or
• ordering of records in the file, or
• availability of indices
Selection Operation (Cont.)
● A2 (binary search). Applicable if selection is an equality comparison on
the attribute on which file is ordered.
● Assume that the blocks of a relation are stored contiguously
● Cost estimate (number of disk blocks to be scanned):
• cost of locating the first tuple (satisfying the condition) by a binary
search on the blocks
• ⌈log2(br)⌉ * (tT + tS)
Here, ⌈log2(br)⌉ is the number of blocks to be examined in the worst case and
(tT + tS) is the time cost per block (tT is block transfer time and tS is block seek time).

• If there are multiple records (non-key attribute) satisfying the selection
condition
– Cost of reading extra blocks has to be added
– Add transfer cost of the number of blocks containing records that
satisfy selection condition
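
The two file-scan estimates can be compared numerically. The sketch below simply encodes the A1 and A2 formulas from the slides; the function and parameter names are assumptions introduced for illustration.

```python
import math


def linear_search_cost(b_r: int, t_T: float, t_S: float, key_equality: bool = False) -> float:
    """A1: one seek, then b_r block transfers.

    For an equality test on a key the scan can stop at the match,
    so the average number of transfers is b_r / 2.
    """
    transfers = b_r / 2 if key_equality else b_r
    return t_S + transfers * t_T


def binary_search_cost(b_r: int, t_T: float, t_S: float, extra_blocks: int = 0) -> float:
    """A2: ceil(log2(b_r)) block accesses, each costing one seek plus one transfer.

    extra_blocks is the number of additional blocks holding matching
    records when the attribute is not a key (transfer cost only).
    """
    return math.ceil(math.log2(b_r)) * (t_T + t_S) + extra_blocks * t_T


# Assumed timings for illustration: tT = 1, tS = 4, file of 1024 blocks.
print(linear_search_cost(1024, 1.0, 4.0))
print(binary_search_cost(1024, 1.0, 4.0))
```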
Selections Using Indices
● Index scan – search algorithms that use an index
● selection condition must be on the search-key of an index.
● A3 (primary index on primary key, equality). Retrieve a single record that satisfies
the corresponding equality condition
● Cost = (hi + 1) * (tT + tS) [hi is the height of a B+ Tree]
● A4 (primary index on nonkey, equality) Retrieve multiple records.
● Records will be on consecutive blocks
• Let b = number of blocks containing matching records
● Cost = hi * (tT + tS) + tS + tT * b
● A5 (secondary index on candidate key, equality).
● Retrieve a single record if the search-key is a primary key(or candidate key)
• Cost = (hi + 1) * (tT + tS)
● A6 (secondary index on nonkey, equality).
● Retrieve multiple records if search-key is not a primary key
• each of n matching records may be on a different block
• Cost = (hi + n) * (tT + tS)
– Can be very expensive!
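
To see why A6 can be very expensive, the sketch below compares its cost with a plain linear scan (A1) using the formulas from the slides. The function names, the timing values, and the idea of returning a plan label are assumptions made for illustration only.

```python
def secondary_index_nonkey_cost(h_i: int, n: int, t_T: float, t_S: float) -> float:
    """A6: traverse h_i index levels, then one seek plus one transfer per matching record."""
    return (h_i + n) * (t_T + t_S)


def linear_scan_cost(b_r: int, t_T: float, t_S: float) -> float:
    """A1: one seek followed by b_r sequential block transfers."""
    return t_S + b_r * t_T


def cheaper_plan(h_i: int, n: int, b_r: int, t_T: float, t_S: float) -> str:
    """Return a label for whichever of the two plans has the lower estimate."""
    a6 = secondary_index_nonkey_cost(h_i, n, t_T, t_S)
    a1 = linear_scan_cost(b_r, t_T, t_S)
    return "A6 (secondary index)" if a6 < a1 else "A1 (linear scan)"


# Assumed values: index height 3, file of 500 blocks, tT = 1, tS = 40.
print(cheaper_plan(h_i=3, n=5, b_r=500, t_T=1.0, t_S=40.0))
print(cheaper_plan(h_i=3, n=1000, b_r=500, t_T=1.0, t_S=40.0))
```

With few matching records the index wins; once many records match, the per-record seek makes the full scan cheaper.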

Selections Involving Comparisons

● Can implement selections of the form σ_{A ≤ V}(r) or σ_{A ≥ V}(r) by using
● a linear file scan or binary search,
● or by using indices in the following ways:
● A7 (primary index, comparison). (Relation is sorted on A)
• For σ_{A ≥ V}(r) use index to find first tuple ≥ V and scan relation sequentially from there
• For σ_{A ≤ V}(r) just scan relation sequentially till first tuple > V; do not use index
● A8 (secondary index, comparison).
• For σ_{A ≥ V}(r) use index to find first index entry ≥ V and scan index sequentially from
there, to find pointers to records.
• For σ_{A ≤ V}(r) just scan leaf pages of index finding pointers to records, till first entry > V
• In either case, retrieve records that are pointed to
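
A7 can be illustrated on an in-memory list. In this sketch a sorted Python list stands in for the relation sorted on A, and `bisect_left` stands in for the primary-index lookup; both stand-ins, and the function names, are assumptions made for illustration.

```python
from bisect import bisect_left


def select_ge(sorted_values: list, v) -> list:
    """A >= v: locate the first value >= v (the "index" lookup), then scan onward."""
    start = bisect_left(sorted_values, v)
    return sorted_values[start:]


def select_le(sorted_values: list, v) -> list:
    """A <= v: scan from the beginning until the first value > v; no index is used."""
    result = []
    for value in sorted_values:
        if value > v:
            break
        result.append(value)
    return result


balances = [100, 300, 500, 700]
print(select_ge(balances, 400))
print(select_le(balances, 500))
```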

Implementation of Complex Selections

● Conjunction: σ_{θ1 ∧ θ2 ∧ ... ∧ θn}(r)

● A9 (conjunctive selection using one index).
● Select a combination of θi and algorithms A1 through A8 that results in the least
cost for σ_{θi}(r).
● Test other conditions on tuple after fetching it into memory buffer.
● A10 (conjunctive selection using multiple-key index).
● Use appropriate composite (multiple-key) index if available.
● A11 (conjunctive selection by intersection of identifiers).
● Requires indices with record pointers.
● Use corresponding index for each condition, and take intersection of all the
obtained sets of record pointers.
● Then fetch records from file
● If some conditions do not have appropriate indices, apply test in memory.
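
A11 can be sketched with Python sets standing in for sets of record pointers. The data layout (a dict of records keyed by record id, and per-attribute indices mapping a value to a set of record ids) and the restriction to equality conditions are assumptions made for this illustration.

```python
def conjunctive_select(records: dict, indices: dict, equalities: dict) -> list:
    """Return ids of records satisfying every attr == value condition.

    records:    record id -> dict of attribute values
    indices:    attribute -> {value -> set of record ids}
    equalities: attribute -> required value
    """
    indexed = [a for a in equalities if a in indices]
    unindexed = [a for a in equalities if a not in indices]
    if indexed:
        # Intersect the record-pointer sets obtained from each index.
        pointer_sets = [indices[a].get(equalities[a], set()) for a in indexed]
        rids = set.intersection(*pointer_sets)
    else:
        # No usable index: every record is a candidate (linear scan).
        rids = set(records)
    # Fetch the records and test the remaining conditions in memory.
    return sorted(
        rid for rid in rids
        if all(records[rid][a] == equalities[a] for a in unindexed)
    )


records = {
    1: {"branch": "A", "balance": 100, "city": "X"},
    2: {"branch": "A", "balance": 200, "city": "Y"},
    3: {"branch": "B", "balance": 100, "city": "X"},
}
indices = {
    "branch": {"A": {1, 2}, "B": {3}},
    "balance": {100: {1, 3}, 200: {2}},
}
print(conjunctive_select(records, indices, {"branch": "A", "balance": 100}))
```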

Algorithms for Complex Selections

● Disjunction: σ_{θ1 ∨ θ2 ∨ ... ∨ θn}(r).

● A12 (disjunctive selection by union of identifiers).
● Applicable if all conditions have available indices.
• Otherwise use linear scan.
● Use corresponding index for each condition, and take union of all the obtained sets of
record pointers.
● Then fetch records from file
● Negation: σ_{¬θ}(r)
● Use linear scan on file
● If very few records satisfy ¬θ, and an index is applicable to θ
• Find satisfying records using index and fetch from file
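
A12 follows the same pattern as A11 but takes a union, and it falls back to a linear scan as soon as one condition has no index. As before, the dict-based records and indices and the restriction to equality conditions are assumptions made for this sketch.

```python
def disjunctive_select(records: dict, indices: dict, equalities: dict) -> list:
    """Return ids of records satisfying at least one attr == value condition.

    records:    record id -> dict of attribute values
    indices:    attribute -> {value -> set of record ids}
    equalities: attribute -> value to match
    """
    if any(attr not in indices for attr in equalities):
        # Some condition has no index, so a linear scan is required anyway.
        return sorted(
            rid for rid, rec in records.items()
            if any(rec[a] == v for a, v in equalities.items())
        )
    rids = set()
    for attr, value in equalities.items():
        # Union the record-pointer sets obtained from each index.
        rids |= indices[attr].get(value, set())
    return sorted(rids)


sample_records = {
    1: {"branch": "A", "balance": 100, "city": "X"},
    2: {"branch": "A", "balance": 200, "city": "Y"},
    3: {"branch": "B", "balance": 100, "city": "X"},
}
sample_indices = {
    "branch": {"A": {1, 2}, "B": {3}},
    "balance": {100: {1, 3}, 200: {2}},
}
print(disjunctive_select(sample_records, sample_indices, {"branch": "B", "balance": 200}))
```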

Sorting

● Sorting is important in a DBMS because:

● Queries may ask for tuples to be displayed in sorted order.
● Several relational operations, for example join, can be processed
efficiently if their input is sorted first.

● A relation could be sorted by building an index on the relation.

● This orders the relation logically, not physically. Hence,
reading tuples in sorted order may lead to a disk access (disk seek plus
block transfer) for each tuple (when the number of records is larger than the number of
blocks), which could be very expensive. Therefore, it is desirable to arrange
the records in order physically.

● For relations that fit in memory, techniques like quicksort can be used.

● For relations that don’t fit in memory, external sort-merge is a good
choice.
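
The idea behind external sort-merge can be sketched in memory: sort chunks that fit in memory into runs, then merge the runs. In this sketch Python lists stand in for run files on disk, `memory_capacity` (an assumed parameter) is the number of tuples that fit in memory, and all runs are merged in a single pass, which a real system can do only when the number of runs is smaller than the number of available buffer blocks.

```python
import heapq


def external_sort_merge(tuples: list, memory_capacity: int) -> list:
    """Sort `tuples` using run generation followed by an N-way merge."""
    # Run generation: sort one memory-sized chunk at a time.
    runs = [
        sorted(tuples[i:i + memory_capacity])
        for i in range(0, len(tuples), memory_capacity)
    ]
    # Merge step: repeatedly take the smallest head element among all runs.
    return list(heapq.merge(*runs))


print(external_sort_merge([24, 19, 31, 33, 14, 16, 21, 3, 2, 7], memory_capacity=3))
```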
Example: External Sorting Using Sort-Merge
(Merge-Join)

• Merge-join (also called sort-merge join) is applied only to relations whose join condition uses the = operator, i.e., to equi-joins and natural joins.

Join Operation

● Several different algorithms to implement joins

● Nested-loop join
● Block nested-loop join
● Indexed nested-loop join
● Merge-join
● Hash-join
● Choice based on cost estimate

Nested-Loop Join

● To compute the theta join r ⋈_{θ} s of two relations r and s (θ is the join condition,
which may use any comparison operator, not only =):

for each tuple tr in r do begin
    for each tuple ts in s do begin
        test pair (tr,ts) to see if they satisfy the join condition θ
        if they do, add tr • ts to the result.
    end
end
● r is called the outer relation and s the inner relation of the join.
● Requires no indices and can be used with any kind of join condition. (similar to
linear file scan)
● Expensive since it examines every pair of tuples (nr * ns) in the two relations.
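
The pseudocode above translates almost line for line into Python. In this sketch relations are lists of tuples, the join condition θ is passed as a function, and tr • ts is tuple concatenation; these representation choices are assumptions made for illustration.

```python
from typing import Callable


def nested_loop_join(r: list, s: list, theta: Callable[[tuple, tuple], bool]) -> list:
    """Join r (outer) with s (inner) by testing every pair of tuples."""
    result = []
    for t_r in r:            # outer relation
        for t_s in s:        # inner relation, rescanned for every outer tuple
            if theta(t_r, t_s):
                result.append(t_r + t_s)  # concatenate the matching pair
    return result


r = [(1, "a"), (2, "b")]
s = [(1, "x"), (3, "y")]
# A non-equality condition: first attribute of r less than first attribute of s.
print(nested_loop_join(r, s, lambda t_r, t_s: t_r[0] < t_s[0]))
```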

Block Nested-Loop Join

● Variant of nested-loop join in which every block of inner relation is paired with
every block of outer relation.
for each block Br of r do begin
    for each block Bs of s do begin
        for each tuple tr in Br do begin
            for each tuple ts in Bs do begin
                Check if (tr,ts) satisfy the join condition
                if they do, add tr • ts to the result.
            end
        end
    end
end
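
A sketch of the same loop structure in Python follows. Blocks are simulated by slicing a list into chunks of `tuples_per_block` tuples, and the function also counts how many inner blocks are read so the saving over tuple-at-a-time nested loops is visible; the chunking helper and the counter are assumptions added for illustration.

```python
from typing import Callable, Iterator


def blocks(relation: list, tuples_per_block: int) -> Iterator[list]:
    """Yield the relation one simulated disk block at a time."""
    for i in range(0, len(relation), tuples_per_block):
        yield relation[i:i + tuples_per_block]


def block_nested_loop_join(r: list, s: list,
                           theta: Callable[[tuple, tuple], bool],
                           tuples_per_block: int) -> tuple:
    """Return (result, inner_block_reads) for r joined with s under theta."""
    result = []
    inner_block_reads = 0
    for B_r in blocks(r, tuples_per_block):
        for B_s in blocks(s, tuples_per_block):
            inner_block_reads += 1  # s is read once per outer block, not per outer tuple
            for t_r in B_r:
                for t_s in B_s:
                    if theta(t_r, t_s):
                        result.append(t_r + t_s)
    return result, inner_block_reads


r = [(1,), (2,), (3,), (4,)]
s = [(2,), (4,), (6,), (8,)]
print(block_nested_loop_join(r, s, lambda a, b: a[0] == b[0], tuples_per_block=2))
```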

Indexed Nested-Loop Join

● Used with existing indices.

● In nested-loop join, if an index is available on the inner relation’s join
attribute and the join is an equi-join or natural join, then index lookups can replace file
scans of the inner relation.

● For each tuple tr in the outer relation r, use the index on s (created on the join
attribute) to look up tuples in s that satisfy the join condition with tuple tr.

● If indices are available on join attributes of both r and s, use the relation with
fewer tuples as the outer relation.
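
The lookup-per-outer-tuple idea can be sketched with a dictionary standing in for the index on s. The `build_index` helper exists only to make the example self-contained (the algorithm itself assumes the index already exists), and attribute positions are passed as integers; both are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(s: list, s_attr: int) -> dict:
    """Stand-in for an existing index on s: join-attribute value -> matching tuples."""
    index = defaultdict(list)
    for t_s in s:
        index[t_s[s_attr]].append(t_s)
    return index


def indexed_nested_loop_join(r: list, r_attr: int, s_index: dict) -> list:
    """Equi-join: for each outer tuple, look up matching inner tuples in the index."""
    result = []
    for t_r in r:
        # One index lookup replaces a full scan of s.
        for t_s in s_index.get(t_r[r_attr], []):
            result.append(t_r + t_s)
    return result


customers = [("alice", 10), ("bob", 20), ("carol", 30)]
accounts = [(10, 500), (20, 700), (20, 900)]
print(indexed_nested_loop_join(customers, 1, build_index(accounts, 0)))
```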

Merge-Join (Sort-Merge-Join)

• Applied on sorted relations and for equi/natural join.

• Sort both relations on their join attribute (if not already sorted on the join attributes).
• Merge the sorted relations to join them using pointers pr and ps.
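
The two steps above can be sketched as follows, with `pr` and `ps` as the two pointers. Relations are lists of tuples and join attributes are given by position; the handling of duplicate join values (collecting the group of s tuples with the current value before advancing) is one common way to do it and is an assumption of this sketch.

```python
def merge_join(r: list, s: list, r_attr: int, s_attr: int) -> list:
    """Equi-join r and s on r[r_attr] == s[s_attr] by sorting and merging."""
    # Sort both relations on their join attribute.
    r = sorted(r, key=lambda t: t[r_attr])
    s = sorted(s, key=lambda t: t[s_attr])
    result = []
    pr = ps = 0
    while pr < len(r) and ps < len(s):
        a, b = r[pr][r_attr], s[ps][s_attr]
        if a < b:
            pr += 1
        elif a > b:
            ps += 1
        else:
            # Find the end of the group of s tuples sharing this join value.
            group_end = ps
            while group_end < len(s) and s[group_end][s_attr] == a:
                group_end += 1
            # Pair every r tuple with this value against the whole s group.
            while pr < len(r) and r[pr][r_attr] == a:
                for t_s in s[ps:group_end]:
                    result.append(r[pr] + t_s)
                pr += 1
            ps = group_end
    return result


r = [(3, "c"), (1, "a"), (2, "b"), (2, "B")]
s = [(2, "x"), (2, "y"), (4, "z"), (1, "w")]
print(merge_join(r, s, 0, 0))
```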

Merge-Join (Cont.)

● hybrid merge-join: If one relation is sorted, and the other has a secondary B+-tree
index on the join attribute.
● Merge the sorted relation with the leaf entries of the B+-tree. The result file will
contain tuples from the sorted relation and addresses for tuples of the unsorted
relation.
● Sort the result file on the addresses of the unsorted relation’s tuples
● Scan the unsorted relation in physical address order and merge with previous
result, to replace addresses by the actual tuples

Hash-Join

● Applicable for equi-joins and natural joins.

● A hash function h is used to partition tuples of both relations.
● The tuples of each relation are partitioned into sets that have the same hash value
on the Join Attribute.
● h maps JoinAttrs values to {0, 1, ..., n}, where JoinAttrs denotes the common
attributes of r and s used in the natural join.
● r0, r1, . . ., rn denote partitions of r tuples

• Each tuple tr ∈ r is put in partition ri where i = h(tr [JoinAttrs]).

● s0, s1, . . ., sn denote partitions of s tuples

• Each tuple ts ∈s is put in partition si, where i = h(ts [JoinAttrs]).

Hash-Join (Cont.)

● r tuples in ri need only to be compared with s tuples in si

● Need not be compared with s tuples in any other partition, since:
● an r tuple and an s tuple that satisfy the join condition will have the same
value for the join attributes.
● If that value is hashed to some value i, the r tuple has to be in ri and the s
tuple in si.
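
A minimal in-memory sketch of the partitioning idea follows. Python's built-in `hash` modulo n + 1 plays the role of h (giving partitions 0 through n, as in the slides), and a dictionary built on each si is used to compare it with ri; the dictionary step and all names are assumptions of this sketch rather than something the slides specify.

```python
def partition(relation: list, attr: int, n: int) -> list:
    """Split a relation into partitions 0..n by hashing the join attribute."""
    parts = [[] for _ in range(n + 1)]
    for t in relation:
        parts[hash(t[attr]) % (n + 1)].append(t)
    return parts


def hash_join(r: list, s: list, r_attr: int, s_attr: int, n: int = 3) -> list:
    """Equi-join r and s; tuples in r_i are compared only with tuples in s_i."""
    result = []
    for r_i, s_i in zip(partition(r, r_attr, n), partition(s, s_attr, n)):
        # In-memory lookup table on partition s_i.
        table = {}
        for t_s in s_i:
            table.setdefault(t_s[s_attr], []).append(t_s)
        # Match each tuple of r_i against the table.
        for t_r in r_i:
            for t_s in table.get(t_r[r_attr], []):
                result.append(t_r + t_s)
    return result


r = [(1, "a"), (2, "b"), (5, "e"), (6, "f")]
s = [(2, "x"), (5, "y"), (5, "z"), (7, "w")]
print(sorted(hash_join(r, s, 0, 0)))
```

The output order depends on the partition order, which is why the example sorts the result before printing.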

Handling of Overflows

● Partitioning is said to be skewed if some partitions have significantly more tuples than
some others
● Hash-table overflow occurs in specific partition si if si does not fit in memory.
Reasons could be
● Many tuples in s with same value for join attributes
● Bad hash function
● Overflow resolution
● Partition si is further partitioned using different hash function.
● Partition ri must be similarly partitioned as si.
● Overflow avoidance
● perform partitioning carefully to avoid overflows
● E.g. partition relation into many partitions, then combine them
● Both approaches fail with large numbers of duplicates

Complex Joins

● Join with a conjunctive (“and” operator in where clause) condition:

r ⋈_{θ1 ∧ θ2 ∧ ... ∧ θn} s
● Either use nested loops/block nested loops, or
● Compute the result of one of the simpler joins r ⋈_{θi} s
• final result comprises those tuples in the intermediate result that satisfy
the remaining conditions

θ1 ∧ . . . ∧ θi–1 ∧ θi+1 ∧ . . . ∧ θn
● Join with a disjunctive condition
r ⋈_{θ1 ∨ θ2 ∨ ... ∨ θn} s
● Either use nested loops/block nested loops, or
● Compute as the union of the records in individual joins r ⋈_{θi} s:
(r ⋈_{θ1} s) ∪ (r ⋈_{θ2} s) ∪ . . . ∪ (r ⋈_{θn} s)

Other Operations

● Duplicate elimination can be implemented via hashing or sorting.

● On sorting, duplicates will come adjacent to each other, and all but one copy of each set of
duplicates can be deleted.
● Hashing is similar – duplicates will come into the same bucket.
● Projection:
● perform projection on each tuple followed by duplicate elimination.
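
Both duplicate-elimination strategies, and projection built on top of one of them, can be sketched briefly. A Python set plays the role of the hash buckets, and attributes are given by position; the function names are assumptions made for illustration.

```python
def dedup_by_sorting(tuples: list) -> list:
    """Sort so duplicates become adjacent, then keep one copy of each."""
    result = []
    for t in sorted(tuples):
        if not result or result[-1] != t:
            result.append(t)
    return result


def dedup_by_hashing(tuples: list) -> list:
    """Keep a tuple only the first time its hash bucket sees it."""
    seen = set()
    result = []
    for t in tuples:
        if t not in seen:
            seen.add(t)
            result.append(t)
    return result


def project(relation: list, attrs: list) -> list:
    """Project each tuple onto attrs, then eliminate duplicates."""
    return dedup_by_hashing([tuple(t[a] for a in attrs) for t in relation])


accounts = [("A-101", "Downtown", 500), ("A-102", "Perryridge", 400), ("A-201", "Downtown", 900)]
print(project(accounts, [1]))
```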

Other Operations : Aggregation

● Aggregation can be implemented in a manner similar to duplicate elimination.

● Sorting or hashing can be used to bring tuples in the same group together, and
then the aggregate functions can be applied on each group.
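
Hash-based grouping can be sketched with a dictionary keyed on the grouping attribute. The function name, the positional attributes, and the choice to pass the aggregate as a function such as `sum` or `max` are assumptions made for illustration.

```python
from typing import Callable


def hash_aggregate(relation: list, group_attr: int, value_attr: int,
                   aggregate: Callable = sum) -> dict:
    """Group tuples by relation[group_attr] and apply `aggregate` to each group's values."""
    groups = {}
    for t in relation:
        # Tuples of the same group land in the same bucket.
        groups.setdefault(t[group_attr], []).append(t[value_attr])
    return {key: aggregate(values) for key, values in groups.items()}


accounts = [("Downtown", 500), ("Perryridge", 400), ("Downtown", 900)]
print(hash_aggregate(accounts, group_attr=0, value_attr=1))
print(hash_aggregate(accounts, group_attr=0, value_attr=1, aggregate=max))
```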

End of Chapter
