PDB Partitioning
Database Systems
Data partitioning distributes data over a
number of processing elements.
Each processing element then works on its own
portion of the data, in parallel with the others.
Range partitioning
To apply range partitioning, we first need to identify a
partitioning vector.
Let us choose the following vector for range partitioning
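A minimal sketch of how a partitioning vector assigns tuples to processing elements. The vector [300000, 600000], the (eid, sal) tuples, and the function name range_partition are all hypothetical and chosen only for illustration; they are not taken from the slides.

```python
from bisect import bisect_right


def range_partition(rows, key, vector):
    """Split rows into len(vector) + 1 partitions using a sorted vector.

    For a vector [v0, v1, ..., vk-1]:
      partition 0 holds rows with key <  v0,
      partition i holds rows with v(i-1) <= key < v(i),
      partition k holds rows with key >= v(k-1).
    """
    partitions = [[] for _ in range(len(vector) + 1)]
    for row in rows:
        # bisect_right returns how many vector entries are <= the key,
        # which is exactly the index of the target partition.
        partitions[bisect_right(vector, key(row))].append(row)
    return partitions


# Hypothetical data: (eid, sal) tuples and a two-entry vector on sal.
employees = [(1, 50_000), (2, 300_000), (3, 599_999), (4, 900_000)]
vector = [300_000, 600_000]
parts = range_partition(employees, key=lambda row: row[1], vector=vector)
```

With k entries in the vector there are k + 1 partitions, so a system with 10 processing elements would need a vector of 9 values.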
The mgrid field of Departments is the eid of the manager. Each relation
contains 20-byte tuples, and the sal and budget fields both contain uniformly
distributed values in the range 0 to 1,000,000. The Employees relation
contains 100,000 pages, the Departments relation contains 5,000 pages, and
each processor has 100 buffer pages of 4,000 bytes each. The cost of one
page I/O is td, and the cost of shipping one page is ts; tuples are shipped in
units of one page by waiting for a page to be filled before sending a message
from processor i to processor j. There are no indexes, and all joins that are
local to a processor are carried out using a sort-merge join. Assume that the
relations are initially partitioned using a round-robin algorithm and that there
are 10 processors.
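The sizes that follow from these parameters can be checked with a few lines of arithmetic. This is a sketch under the assumption that pages are completely full and that round-robin spreads the data evenly; the variable names and the round_robin helper are illustrative only.

```python
PAGE_BYTES = 4_000
TUPLE_BYTES = 20
PROCESSORS = 10
EMP_PAGES = 100_000
DEPT_PAGES = 5_000

# Assumes every page is completely filled with 20-byte tuples.
tuples_per_page = PAGE_BYTES // TUPLE_BYTES
emp_tuples = EMP_PAGES * tuples_per_page
dept_tuples = DEPT_PAGES * tuples_per_page

# Round-robin spreads the pages (almost) evenly over the processors.
emp_pages_per_proc = EMP_PAGES // PROCESSORS
dept_pages_per_proc = DEPT_PAGES // PROCESSORS


def round_robin(num_tuples, processors):
    """Tuple counts per processor when tuple i is sent to processor i mod n."""
    base, extra = divmod(num_tuples, processors)
    return [base + (1 if p < extra else 0) for p in range(processors)]


print(tuples_per_page, emp_tuples, dept_tuples)
print(emp_pages_per_proc, dept_pages_per_proc)
```

Under these assumptions each page holds 200 tuples, and each processor stores 10,000 pages of Employees and 500 pages of Departments.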
For each of the following queries, describe the evaluation plan briefly and give
its cost in terms of td and ts. You should compute the total cost across all sites
as well as the ‘elapsed time’ cost (i.e., if several operations are carried out
concurrently, the time taken is the maximum over these operations).
1. Find the highest paid employee.
2. Find the highest paid employee in the department with did
55.
3. Find the highest paid employee over all departments with
budget less than 100,000.
4. Find the highest paid employee over all departments with
budget less than 300,000.
5. Find the average salary over all departments with budget
less than 300,000.
6. Find the salaries of all managers.
7. Find the salaries of all managers who manage a department
with a budget less than 300,000 and earn more than 100,000.
8. Print the eids of all employees, ordered by increasing
salaries. Each processor is connected to a separate printer,
and the answer can appear as several sorted lists, each
printed by a different processor, as long as we can obtain a
fully sorted list by concatenating the printed lists (in some
order).
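As an illustration of the two cost measures, here is one possible plan for query 1, not necessarily the intended answer. It assumes that every processor scans its local Employees partition and finds its local maximum, that one of the 10 processors acts as coordinator, that the other 9 each ship a single page holding their local maximum, and that those shipments overlap in time. The Cost class and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    td: int = 0  # number of page I/Os
    ts: int = 0  # number of pages shipped

    def __str__(self):
        return f"{self.td:,} td + {self.ts:,} ts"


PROCESSORS = 10
EMP_PAGES = 100_000
local_pages = EMP_PAGES // PROCESSORS  # round-robin: even split

# Total cost: every page of Employees is read once somewhere, and each
# non-coordinator processor ships one page with its local maximum.
total = Cost(td=EMP_PAGES, ts=PROCESSORS - 1)

# Elapsed time: the local scans run concurrently, and the shipments are
# assumed to overlap, so only one scan and one shipment are on the
# critical path.
elapsed = Cost(td=local_pages, ts=1)

print("total:  ", total)
print("elapsed:", elapsed)
```

If the coordinator were a separate site, or if it received the pages one after another, the ts terms would change accordingly. The same bookkeeping (total as a sum over sites, elapsed as a maximum over concurrent steps) applies to the remaining queries.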