SQL Join Algorithm

The document describes the basic steps in query processing: (1) translation, (2) optimization, and (3) processing. It provides examples of query execution plans showing how relations can be joined, grouped, and filtered to evaluate a query. Common join methods like nested loop, sort-merge, and hash joins are explained along with operations like sorting, indexing, and pipelining to efficiently process queries.

Uploaded by: murali312
Copyright: © Attribution Non-Commercial (BY-NC)

Steps in Query Processing

1. Translation
   • check SQL syntax
   • check existence of relations and attributes
   • replace views by their definitions
   • generate internal query representation
2. Optimization
   • consider alternative plans for processing the query
   • select an efficient plan
3. Processing
   • execute the plan
4. Data Delivery
Example

Select DeptNo, Deptname, Count(*), Sum(Salary)
From Employee, Department
Where WorkDept = DeptNo And DeptNo Like 'D%'
Group By DeptNo, Deptname
Having Sum(Salary) > 1000000
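
The query can be tried against a small in-memory database. The following is a minimal sketch using Python's built-in `sqlite3` module; the column types, the `LastName` column, the sample rows, and the function name `run_example` are illustrative assumptions, not part of the original slides.

```python
import sqlite3

def run_example():
    """Run the example query on hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Department (DeptNo TEXT PRIMARY KEY, Deptname TEXT);
        CREATE TABLE Employee (LastName TEXT, WorkDept TEXT, Salary INTEGER);
        -- Hypothetical rows, chosen so that only one group passes the HAVING test
        INSERT INTO Department VALUES
            ('D01', 'Development'), ('D02', 'Design'), ('E01', 'Engineering');
        INSERT INTO Employee VALUES
            ('Adams', 'D01', 700000), ('Baker', 'D01', 600000),
            ('Clark', 'D02', 400000), ('Davis', 'E01', 2000000);
    """)
    rows = conn.execute("""
        SELECT DeptNo, Deptname, COUNT(*), SUM(Salary)
        FROM Employee, Department
        WHERE WorkDept = DeptNo AND DeptNo LIKE 'D%'
        GROUP BY DeptNo, Deptname
        HAVING SUM(Salary) > 1000000
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(run_example())
```

With this sample data, department E01 is removed by the LIKE predicate and D02 by the HAVING clause, so only D01 remains.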
An Execution Plan

1. Scan the Employee table, select all tuples for which WorkDept starts with 'D', and call the result R1.
2. Join R1 and Department, eliminating attributes other than DeptNo, Deptname, and Salary. Call the result R2. This may involve:
   • sorting R1 on WorkDept
   • sorting Department on DeptNo
   • joining the two sorted relations to produce R2
3. Group the tuples of R2. Call the result R3. This may involve:
   • sorting R2 on DeptNo and Deptname
   • grouping tuples with identical values of DeptNo and Deptname
   • counting the tuples in each group and adding their Salaries
4. Scan R3 and select all tuples with sum(Salary) > 1000000.
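
The four steps of this plan can be sketched in Python, with each intermediate result (R1, R2, R3) materialized as a list. This is only an illustration: relations are modelled as lists of dictionaries, DeptNo is assumed to be unique in Department, and the function name `execute_plan` is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def execute_plan(employee, department, threshold=1000000):
    # Step 1: scan Employee, keep tuples whose WorkDept starts with 'D' -> R1
    r1 = [e for e in employee if e["WorkDept"].startswith("D")]

    # Step 2: sort both inputs on the join attribute, merge, and project -> R2
    r1.sort(key=itemgetter("WorkDept"))
    depts = sorted(department, key=itemgetter("DeptNo"))
    r2 = []
    i = 0
    for d in depts:  # assumes DeptNo is unique in Department
        # Skip R1 tuples whose key is smaller than the current DeptNo
        while i < len(r1) and r1[i]["WorkDept"] < d["DeptNo"]:
            i += 1
        # Emit every R1 tuple that matches the current DeptNo
        while i < len(r1) and r1[i]["WorkDept"] == d["DeptNo"]:
            r2.append((d["DeptNo"], d["Deptname"], r1[i]["Salary"]))
            i += 1

    # Step 3: sort R2 on (DeptNo, Deptname), then count and sum per group -> R3
    r2.sort(key=itemgetter(0, 1))
    r3 = []
    for (dept_no, dept_name), group in groupby(r2, key=itemgetter(0, 1)):
        salaries = [row[2] for row in group]
        r3.append((dept_no, dept_name, len(salaries), sum(salaries)))

    # Step 4: scan R3, keep groups whose salary total exceeds the threshold
    return [row for row in r3 if row[3] > threshold]
```

A real system would usually avoid materializing every intermediate result; the lists here only make the correspondence to R1, R2, and R3 visible.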
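Pictorial Access Plan

Operators from the root down, with intermediate results labelled as in the execution plan:

• Select (Sum(Salary) > 1000000), applied to R3
• Grouping (DeptNo, Deptname), producing R3 from R2
• Project (DeptNo, Deptname, Salary), producing R2
• Join (DeptNo = WorkDept)
• Select (DeptNo LIKE 'D%'), producing R1
• Base relations: Department, Employee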
Pipelined Plans and Iterators

• In a pipelined plan, tuples stream from one operator to another.
• Pipelining allows for parallel execution of operators, and avoids unnecessary materialization of intermediate results. (Sometimes materialization may be necessary...)
• Iterators are a common model for plan operators:
   • every operator is an iterator
   • an iterator provides the following interface: Open, GetNext, and Close
   • each iterator implements its interface, using calls to the interface functions of its child (or children)
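
The iterator interface can be sketched as follows. This is a minimal illustration, not a real engine: the class names `Scan` and `Select`, the helper `run`, and the use of `None` as the end-of-stream marker are all assumptions made for the example.

```python
class Scan:
    """Leaf operator: iterates over an in-memory list standing in for a table."""

    def __init__(self, rows):
        self.rows = rows
        self.pos = None

    def open(self):
        self.pos = 0

    def get_next(self):
        if self.pos >= len(self.rows):
            return None  # end of stream (assumes no row is itself None)
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        self.pos = None


class Select:
    """Filter operator: implements its interface by calling its child's interface."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def get_next(self):
        # Pull tuples from the child until one satisfies the predicate
        while True:
            row = self.child.get_next()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()


def run(plan):
    """Drive a plan from the root: Open, repeated GetNext, Close."""
    plan.open()
    output = []
    row = plan.get_next()
    while row is not None:
        output.append(row)
        row = plan.get_next()
    plan.close()
    return output
```

For example, `run(Select(Scan([1, 2, 3, 4]), lambda r: r % 2 == 0))` pulls one tuple at a time through both operators without building an intermediate list between them.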
DB2 Access Plan

               FILTER (having)
                  |
               GRPBY (deptno,deptname)
                  |
               MSJOIN
              /      \
         TBSCAN      FILTER
           |           |
 (deptno) SORT       TBSCAN
           |           |
(like D%) TBSCAN     SORT (workdept)
           |           |
      DEPARTMENT     TBSCAN (like D%)
                       |
                    EMPLOYEE
DB2 Access Plan with Index

        FILTER
          |
        GRPBY
          |
        TBSCAN
          |
        SORT
          |
        NLJOIN
       /      \
  TBSCAN      FETCH
    |         /    \
EMPLOYEE  IXSCAN  DEPARTMENT
            |
        DEPTNOIND
Plan Structures

Left-Deep:  ((R1 ⋈ R2) ⋈ R3) ⋈ R4
Right-Deep: R4 ⋈ (R3 ⋈ (R2 ⋈ R1))
Bushy:      (R1 ⋈ R2) ⋈ (R3 ⋈ R4)

Some Basic Query Processing Operations

• Data Access and Filtering
   • Index scans
   • Table scans
• Projection
• Joining
   • nested loop join
   • hash join
   • sort-merge join
   • and others . . .
• Sorting
• Grouping and Duplicate Elimination
   • by sorting
   • by hashing
Joining Relations

select DeptName, LastName
from Department, Employee
where DeptNo = WorkDept

Conceptually, a nested-loop join works like this:

foreach tuple d in Department do
    foreach tuple e in Employee do
        if d.DeptNo = e.WorkDept then
            output d,e
    end
end
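
A direct Python rendering of the nested-loop pseudocode is given below as a sketch. Relations are assumed to be lists of dictionaries, the function name `nested_loop_join` is hypothetical, and the output is projected onto DeptName and LastName to match the query.

```python
def nested_loop_join(department, employee):
    """Compare every Department tuple with every Employee tuple."""
    result = []
    for d in department:          # outer relation
        for e in employee:        # inner relation, rescanned for each outer tuple
            if d["DeptNo"] == e["WorkDept"]:
                result.append((d["DeptName"], e["LastName"]))
    return result
```

The inner relation is scanned once per outer tuple, which is what the other join methods try to avoid.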
Block Nested Loop Join

select DeptName, LastName
from Department, Employee
where DeptNo = WorkDept

Process the outer relation a chunk at a time:

foreach chunk C of Department do
    foreach tuple e in Employee do
        foreach tuple d in C do
            if d.DeptNo = e.WorkDept then
                output d,e
        end
    end
end
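
The block variant can be sketched the same way. In this illustration the chunk is a slice of an in-memory list and `chunk_size` is a hypothetical parameter standing in for the number of outer tuples that fit in memory.

```python
def block_nested_loop_join(department, employee, chunk_size=2):
    """Scan Employee once per chunk of Department instead of once per tuple."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    result = []
    for start in range(0, len(department), chunk_size):
        chunk = department[start:start + chunk_size]   # one chunk of the outer relation
        for e in employee:                             # one scan of the inner relation
            for d in chunk:
                if d["DeptNo"] == e["WorkDept"]:
                    result.append((d["DeptName"], e["LastName"]))
    return result
```

With a chunk size of 1 this degenerates to the plain nested-loop join; larger chunks reduce the number of scans of Employee.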
Other Techniques for Join
• If there is an index on the WorkDept attribute of the Employee relation, an index join can be used:

foreach tuple d in Department do
    use the index to find Employee tuples where WorkDept = d.DeptNo
    foreach such tuple e do
        output d,e
    end
end

• Examples of other join techniques:
   • Sort-Merge Join: sort the tuples of Employee on WorkDept and the tuples of Department on DeptNo, then merge the sorted relations.
   • Hash Join: assign each tuple of Employee and of Department to a “bucket” by applying a hash function to its WorkDept (DeptNo) value. Within each bucket, look for Employee/Department tuple pairs for which WorkDept = DeptNo.
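
The index join and the sort-merge join described above can be sketched in Python as follows. These are illustrative only: a dictionary built on the fly stands in for a pre-existing index on WorkDept, relations are lists of dictionaries, and the function names are hypothetical.

```python
from collections import defaultdict

def index_join(department, employee):
    """Index join: look up matching Employee tuples instead of scanning them."""
    # Stand-in for an existing index on Employee.WorkDept
    index = defaultdict(list)
    for e in employee:
        index[e["WorkDept"]].append(e)
    result = []
    for d in department:
        for e in index.get(d["DeptNo"], []):
            result.append((d["DeptName"], e["LastName"]))
    return result

def sort_merge_join(department, employee):
    """Sort both inputs on the join attribute, then merge them in one pass."""
    depts = sorted(department, key=lambda d: d["DeptNo"])
    emps = sorted(employee, key=lambda e: e["WorkDept"])
    result = []
    i = j = 0
    while i < len(depts) and j < len(emps):
        d_key, e_key = depts[i]["DeptNo"], emps[j]["WorkDept"]
        if d_key < e_key:
            i += 1
        elif d_key > e_key:
            j += 1
        else:
            # Find the run of equal keys on each side and pair them up
            i_end = i
            while i_end < len(depts) and depts[i_end]["DeptNo"] == d_key:
                i_end += 1
            j_end = j
            while j_end < len(emps) and emps[j_end]["WorkDept"] == d_key:
                j_end += 1
            for d in depts[i:i_end]:
                for e in emps[j:j_end]:
                    result.append((d["DeptName"], e["LastName"]))
            i, j = i_end, j_end
    return result
```

Both functions produce the same set of pairs as the nested-loop join, possibly in a different order.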
Hash Join Example

Hash Join Operator
result: (empty)
disk: (empty)
memory: (empty)
outer: 6,1,8,1,2,8,3,3,4,6,6,8,2,8,9,3
inner: 8,2,2,3,4,5,5,6,5,7,7,2,3,7,8,5
Hash Join Example (cont’d)

Hash Join Operator
result: (empty)
disk (inner partitions): 5,5,5,5 | 2,2,6,2 | 3,7,7,3,7
memory: 8, 4, 8
outer: 6,1,8,1,2,8,3,3,4,6,6,8,2,8,9,3
inner: 8,2,2,3,4,5,5,6,5,7,7,2,3,7,8,5
Hash Join Example (cont’d)

Hash Join Operator
result: (8,8),(8,8),(8,8),(8,8),(4,4),(8,8),(8,8),(8,8),(8,8)
disk (inner partitions): 5,5,5,5 | 2,2,6,2 | 3,7,7,3,7
disk (outer partitions): 1,1,9 | 6,2,6,6,2 | 3,3,3
memory: 8, 4, 8
outer: 6,1,8,1,2,8,3,3,4,6,6,8,2,8,9,3
Hash Join Example (cont’d)

Hash Join Operator
result: (empty)
disk (inner partitions): 5,5,5,5 | 2,2,6,2 | 3,7,7,3,7
disk (outer partitions): 1,1,9 | 6,2,6,6,2 | 3,3,3
memory: 5, 5, 5, 5
Hash Join Example (cont’d)

Hash Join Operator
result: (empty)
disk (inner partitions): 2,2,6,2 | 3,7,7,3,7
disk (outer partitions): 1,1,9 | 6,2,6,6,2 | 3,3,3
memory: 5, 5, 5, 5
Hash Join Example (cont’d)

Hash Join Operator
result: (6,6),(2,2),(2,2),(2,2),(6,6),(6,6),(2,2),(2,2),(2,2)
disk (inner partitions): 2,2,6,2 | 3,7,7,3,7
disk (outer partitions): 6,2,6,6,2 | 3,3,3
memory: 2, 2, 6, 2
Hash Join Example (cont’d)

Hash Join Operator
result: (3,3),(3,3),(3,3),(3,3),(3,3),(3,3)
disk (inner partitions): 3,7,7,3,7
disk (outer partitions): 3,3,3
memory: 7, 3, 7, 3, 7
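
The sequence in this example can be reproduced with a short Python sketch. The partitions shown on the slides are consistent with a hash function of `key mod 4`, with partition 0 kept in memory and the others written to disk; that hash function is inferred from the figures rather than stated in them, and the function name `hybrid_hash_join` is hypothetical. Tuples are represented only by their join-key values, as on the slides, and "disk" is modelled with ordinary lists.

```python
from collections import defaultdict

def hybrid_hash_join(outer, inner, num_partitions=4):
    def h(key):
        return key % num_partitions  # assumed hash function (inferred from the slides)

    # Partition the inner input: partition 0 stays in memory, the rest go to "disk".
    memory = defaultdict(list)
    inner_disk = defaultdict(list)
    for key in inner:
        if h(key) == 0:
            memory[key].append(key)
        else:
            inner_disk[h(key)].append(key)

    # Partition the outer input: partition-0 tuples probe memory immediately,
    # the others are written to the matching "disk" partition.
    result = []
    outer_disk = defaultdict(list)
    for key in outer:
        if h(key) == 0:
            result.extend((key, match) for match in memory.get(key, []))
        else:
            outer_disk[h(key)].append(key)

    # Process the remaining partition pairs one at a time:
    # load the inner partition into memory, then probe with the outer partition.
    for p in range(1, num_partitions):
        memory = defaultdict(list)
        for key in inner_disk.get(p, []):
            memory[key].append(key)
        for key in outer_disk.get(p, []):
            result.extend((key, match) for match in memory.get(key, []))
    return result

if __name__ == "__main__":
    outer = [6, 1, 8, 1, 2, 8, 3, 3, 4, 6, 6, 8, 2, 8, 9, 3]
    inner = [8, 2, 2, 3, 4, 5, 5, 6, 5, 7, 7, 2, 3, 7, 8, 5]
    print(hybrid_hash_join(outer, inner))
```

Run on the outer and inner inputs from the slides, the sketch emits the (8,8) and (4,4) pairs first, nothing for the partition holding 5,5,5,5 and 1,1,9, then the (6,6) and (2,2) pairs, and finally the six (3,3) pairs.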
External Merge Sort: Run Formation

[Figure sequence over three slides; surviving labels: disk, memory, disk]

External Merge Sort: Merging Runs

[Figure; surviving labels: disk, memory, disk]
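
The two phases named on these slides, run formation and merging of runs, can be sketched in Python. The slides give no details beyond the titles, so this follows the textbook form of external merge sort; `memory_size`, `fan_in`, and the function names are hypothetical, and lists stand in for disk files.

```python
import heapq

def form_runs(values, memory_size):
    """Run formation: read memory_size values at a time, sort them in memory,
    and write each sorted chunk out as one run."""
    return [sorted(values[i:i + memory_size])
            for i in range(0, len(values), memory_size)]

def merge_runs(runs, fan_in):
    """Merging runs: repeatedly merge up to fan_in runs into one longer run
    until a single sorted run remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
    return runs[0] if runs else []

def external_merge_sort(values, memory_size=4, fan_in=2):
    return merge_runs(form_runs(values, memory_size), fan_in)
```

With `memory_size=4` and sixteen input values, run formation yields four sorted runs, and a fan-in of 2 needs two merge passes to combine them.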
Summary

• A plan describes how a query is executed, including:
   • the sequence of basic operations (select, project, join, sort, etc.) used to process the query
   • how each operation will be implemented, e.g., which join method will be used, which indices will be used to perform a selection.
