Vu Lec 33
Management Systems
Lecture 33
In the previous lecture
• Final phase of QD
• Data Localization: for HF, VF and DF.
In today’s Lecture
[Figure: operator tree over EMP, EMP2, ASG and PROJ]
• Alternatives with N relations are O(N!), based on properties of relations
• So, restrictions are applied
1- Heuristics
- Selection and projection on base relations
- Avoid Cartesian product
2- Shape of Tree
- Linear tree: at least one operand of each operator node is a base relation
- Bushy tree: may have operators whose operands are all intermediate tables; allows parallel execution
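To see why the tree shape is restricted, the sketch below counts candidate join trees. It assumes that "linear" means left-deep orders (N! of them) and that bushy trees are counted as ordered binary trees over N relations, (2(N-1))! / (N-1)!. The function names are illustrative only.

```python
from math import factorial


def linear_tree_count(n: int) -> int:
    """Left-deep join orders for n relations: n! (assumed reading of 'linear')."""
    return factorial(n)


def bushy_tree_count(n: int) -> int:
    """Ordered binary join trees over n relations: (2(n-1))! / (n-1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


if __name__ == "__main__":
    # The bushy search space grows much faster than the linear one.
    for n in (2, 3, 5, 8):
        print(n, linear_tree_count(n), bushy_tree_count(n))
```

Restricting the optimizer to linear trees shrinks the search space, at the price of missing plans whose independent subtrees could run in parallel.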
Search Strategy
• Most popular is Dynamic Programming (DP)
• It starts with base relations and keeps adding relations, calculating the cost at each step
• DP is almost exhaustive, so it produces the best plan
• Too expensive with more than 5 relations
• Other option is a Randomized strategy
• Does not guarantee the best plan
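The dynamic programming idea can be sketched as follows. This is a minimal illustration, not the lecture's algorithm: it assumes left-deep plans only, a toy cost equal to the sum of estimated intermediate result sizes, and hypothetical inputs `cards` (relation cardinalities) and `sel` (join selectivities per relation pair). The cardinalities in the example are made up.

```python
from fractions import Fraction
from itertools import combinations


def dp_join_order(cards, sel):
    """Return (cost, order) of the cheapest left-deep plan.

    cards: {relation name: cardinality}
    sel:   {frozenset({r, s}): join selectivity}; a missing pair means
           a Cartesian product (selectivity 1).
    Toy cost model (assumption): sum of intermediate result sizes.
    """
    rels = sorted(cards)

    def size(subset):
        # Estimated cardinality of joining every relation in subset.
        result = 1
        for r in subset:
            result *= cards[r]
        for r, s in combinations(sorted(subset), 2):
            result *= sel.get(frozenset((r, s)), 1)
        return result

    # Start with the base relations, which cost nothing to "produce".
    best = {frozenset([r]): (0, (r,)) for r in rels}
    # Keep adding one relation at a time, remembering the cheapest plan
    # for every subset of relations.
    for k in range(2, len(rels) + 1):
        for combo in combinations(rels, k):
            subset = frozenset(combo)
            candidates = []
            for r in combo:
                cost, order = best[subset - {r}]
                candidates.append((cost + size(subset), order + (r,)))
            best[subset] = min(candidates)
    return best[frozenset(rels)]


if __name__ == "__main__":
    cards = {"EMP": 400, "ASG": 1000, "PROJ": 100}
    sel = {
        frozenset(("EMP", "ASG")): Fraction(1, 400),
        frozenset(("ASG", "PROJ")): Fraction(1, 100),
    }
    cost, order = dp_join_order(cards, sel)
    print(cost, order)
```

The table `best` holds one entry per subset of relations, which is why the approach becomes too expensive as the number of relations grows.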
Cost Model
• Cost functions for operators, plus statistics of base data to predict the size of intermediate tables
• Cost considered as Total Time and Response Time
• Total time = CPU time + I/O time + tr (transmission) time
• In WAN, major cost is tr time
• Initially the ratio of tr time to I/O time was 20:1; for LAN it is 1:1.6
• Response time = CPU time + I/O time + tr time
• Difference? Total time adds up every cost; response time is elapsed time, so work done in parallel overlaps
• TCPU = time for a CPU instruction
• TI/O = time for a disk I/O
• TMSG = fixed time for initiating and receiving a message
• TTR = time to transmit a data unit from one site to another
[Figure: Site 1 sends x units and Site 2 sends y units to Site 3]
• TT = 2 * TMSG + TTR * (x + y)
• RT = max{TMSG + TTR * x, TMSG + TTR * y}
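A small sketch of the two formulas for this example, where Site 1 and Site 2 each send their data to Site 3 in parallel and only communication cost is counted. The numeric values for TMSG, TTR, x and y are made-up assumptions chosen for easy arithmetic.

```python
def total_time(t_msg, t_tr, x, y):
    """TT: two messages plus the transmission of all x + y units."""
    return 2 * t_msg + t_tr * (x + y)


def response_time(t_msg, t_tr, x, y):
    """RT: the two transfers overlap, so only the slower one counts."""
    return max(t_msg + t_tr * x, t_msg + t_tr * y)


if __name__ == "__main__":
    # Hypothetical values: TMSG = 5, TTR = 2, x = 100 units, y = 300 units.
    print(total_time(5, 2, 100, 300))     # 810
    print(response_time(5, 2, 100, 300))  # 605
```

Total time charges both transfers in full, while response time is bounded by the larger transfer because the two sites transmit at the same time.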
Database Statistics
• Major factor is the size of intermediate tables
• If the intermediate results are to be transmitted, then an estimate of their size is a must
• More precise statistics cost more to maintain
• For each relation R[A1, A2, …, An] fragmented as R1, …, Rr
1. The length of each attribute: length(Ai)
2. The number of distinct values for each attribute in each fragment: card(πAi(Rj))
3. The maximum and minimum values in the domain of each attribute: min(Ai), max(Ai)
4. The cardinality of each domain: card(dom[Ai]), and the cardinality of each fragment: card(Rj)
5. The join selectivity factor for some pairs of relations: SFJ(R, S) = card(R ⋈ S) / (card(R) ∗ card(S))
Cardinalities of Intermediate Results
Selection Operation
• card(σF(R)) = SFS(F) * card(R)
• SFS(A = value) = 1 / card(πA(R))
• SFS(A > value) = (max(A) – value) / (max(A) – min(A))
• SFS(A < value) = (value – min(A)) / (max(A) – min(A))
• SFS(p(Ai) ^ p(Aj)) = SFS(p(Ai)) * SFS(p(Aj))
• SFS(p(Ai) v p(Aj)) = SFS(p(Ai)) + SFS(p(Aj)) – (SFS(p(Ai)) * SFS(p(Aj)))
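The selectivity formulas above can be tried out with a short sketch. The function names and the sample statistics (an attribute ranging from 0 to 100, a relation of 400 tuples) are assumptions for illustration; the formulas themselves assume uniformly distributed, independent attribute values.

```python
def sf_eq(distinct_values):
    """SFS(A = value) = 1 / card(pi_A(R))."""
    return 1 / distinct_values


def sf_gt(value, min_a, max_a):
    """SFS(A > value) = (max(A) - value) / (max(A) - min(A))."""
    return (max_a - value) / (max_a - min_a)


def sf_lt(value, min_a, max_a):
    """SFS(A < value) = (value - min(A)) / (max(A) - min(A))."""
    return (value - min_a) / (max_a - min_a)


def sf_and(sf_i, sf_j):
    """Conjunction of two predicates on independent attributes."""
    return sf_i * sf_j


def sf_or(sf_i, sf_j):
    """Disjunction: add both factors, subtract the overlap."""
    return sf_i + sf_j - sf_i * sf_j


def card_selection(sf, card_r):
    """card(sigma_F(R)) = SFS(F) * card(R)."""
    return sf * card_r


if __name__ == "__main__":
    gt = sf_gt(75, 0, 100)          # 0.25
    print(gt, sf_lt(75, 0, 100))    # 0.25 0.75
    print(sf_and(gt, 0.5), sf_or(gt, 0.5))
    print(card_selection(gt, 400))  # 100.0
```

Note that SFS(A > value) and SFS(A < value) add up to 1 for the same value, which is a quick sanity check on the two range formulas.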
Cardinality of Projection
• Hard to determine precisely
• Two cases when it is trivial
1- When the projection is on a single attribute A: card(πA(R)) = card(A)
2- When the PK is included: card(πA(R)) = card(R)
Cartesian Product
• card(R × S) = card(R) * card(S)
Cardinality of Join
• No general way to calculate it without additional information
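When the additional information is a join selectivity factor, SFJ(R, S) = card(R ⋈ S) / (card(R) * card(S)), the join cardinality can be estimated as in the sketch below. The function names and the cardinalities are hypothetical, and the sketch assumes the factor was measured on an earlier or sampled run of the same join.

```python
def join_selectivity(card_join, card_r, card_s):
    """SFJ: the fraction of the Cartesian product that survives the join."""
    return card_join / (card_r * card_s)


def estimate_join_card(sf_j, card_r, card_s):
    """Estimated card(R join S) = SFJ * card(R) * card(S)."""
    return sf_j * card_r * card_s


if __name__ == "__main__":
    # Hypothetical statistics: R has 512 tuples, S has 1024 tuples, and a
    # previous run of the join produced 1024 tuples.
    sf_j = join_selectivity(1024, 512, 1024)
    print(sf_j)
    # Reuse the factor after both relations have doubled in size.
    print(estimate_join_card(sf_j, 1024, 2048))
```

The Cartesian product cardinality is the upper bound of this estimate, reached when the selectivity factor is 1.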