Capacity Planning for DBMSs with OLAP Workloads
The Computer Measurement Group, commonly called CMG, is a not-for-profit, worldwide organization of data processing professionals committed to the
measurement and management of computer systems. CMG members are primarily concerned with performance evaluation of existing systems to maximize
performance (e.g., response time, throughput) and with capacity management, where planned enhancements to existing systems or the design of new
systems are evaluated to find the necessary resources required to provide adequate performance at a reasonable cost.
This paper was originally published in the Proceedings of the Computer Measurement Group’s 2003 International Conference.
Copyright 2003 by The Computer Measurement Group, Inc. All Rights Reserved. Published by The Computer Measurement Group, Inc. (CMG), a non-profit
Illinois membership corporation. Permission to reprint in whole or in any part may be granted for educational and scientific purposes upon written application to
the Editor, CMG Headquarters, 151 Fries Mill Road, Suite 104, Turnersville, NJ 08012.
BY DOWNLOADING THIS PUBLICATION, YOU ACKNOWLEDGE THAT YOU HAVE READ, UNDERSTOOD AND AGREE TO BE BOUND BY THE
FOLLOWING TERMS AND CONDITIONS:
License: CMG hereby grants you a nonexclusive, nontransferable right to download this publication from the CMG Web site for personal use on a single
computer owned, leased or otherwise controlled by you. In the event that the computer becomes dysfunctional, such that you are unable to access the
publication, you may transfer the publication to another single computer, provided that it is removed from the computer from which it is transferred and its use
on the replacement computer otherwise complies with the terms of this Copyright Notice and License.
Copyright: No part of this publication or electronic file may be reproduced or transmitted in any form to anyone else, including transmittal by e-mail, by file
transfer protocol (FTP), or by being made part of a network-accessible system, without the prior written permission of CMG. You may not merge, adapt,
translate, modify, rent, lease, sell, sublicense, assign or otherwise transfer the publication, or remove any proprietary notice or label appearing on the
publication.
Disclaimer; Limitation of Liability: The ideas and concepts set forth in this publication are solely those of the respective authors, and not of CMG, and CMG
does not endorse, approve, guarantee or otherwise certify any such ideas or concepts in any application or usage. CMG assumes no responsibility or liability
in connection with the use or misuse of the publication or electronic file. CMG makes no warranty or representation that the electronic file will be free from
errors, viruses, worms or other elements or codes that manifest contaminating or destructive properties, and it expressly disclaims liability arising from such
errors, elements or codes.
General: CMG reserves the right to terminate this Agreement immediately upon discovery of violation of any of its terms.
Xilin Cui, Patrick Martin and Wendy Powley
School of Computing, Queen's University, Kingston
This paper focuses on capacity planning for online analytical processing (OLAP)
workloads. We first build a workload model for an OLAP workload based on the TPC-H
benchmark, then investigate the impact of database design and database management
system (DBMS) tuning factors (such as buffer pool, sort heap size, and prefetching) on
OLAP performance. We propose a queueing network model (QNM) to represent the
system under study, present preliminary experimental results to validate this model, and
show how the model can represent various common changes to the CPU and disk.
1. INTRODUCTION
Often more emphasis is placed on the functionality of a computer system than on its performance. In the past, e-commerce companies believed that the most urgent priority they had was to get their business up and running. However, this attitude has changed [VIJA99]. Given infinite resources, the expected quality of service can always be provided; with limited resources, capacity planning and resource management are needed.

Computer capacity planning is the process of monitoring and projecting computer workload and specifying the most cost-effective computing environment that meets the current and future demands for computer services [LAZO84]. Much of the work on capacity planning was done in the 1970s and 1980s [GRAH78] [SAUE81], while CPU speeds were relatively slow. Since then, research has been sparse in this area. One of the reasons behind this lack of interest could be the relatively low cost of PCs and their building components. Starting from 1996, with the development of the World Wide Web (WWW) and the emergence of very large multimedia databases, database demand has doubled every 9-12 months according to Greg's law [PATT98], while processor speed has doubled every 18 months based on Moore's law. The gap between database demand and CPU speed is increasing with time, indicating a need for capacity planning in this area.

Database applications provided 32% of the server volume in 1995 and the share is increasing [STEN97]. Corporate profits and organizational efficiency are becoming increasingly dependent upon their database server systems, and those systems are becoming so complex and diverse that they are difficult to control and maintain. Monitoring and predicting their performance is a challenge for system designers.

Database management systems (DBMSs) are complicated software systems. Their growing complexity stimulates the development of software tools to help in system design and tools to predict and probe system performance. Of specific interest to this paper is database system design and planning. The main goal of the paper is to lay the foundation for a software tool that will allow users to examine their design decisions and to estimate the performance of the resulting database management system.

The database community widely recognizes two major types of commercial database workloads [TPC]: online transaction processing (OLTP) and online analytical processing (OLAP), also called decision support systems (DSS). OLTP systems, such as airline reservation systems, handle the operational aspects of day-to-day business transactions. OLAP systems provide historical support for forming business decisions. The high degree of query complexity in an OLAP workload presents unique challenges for capacity planning, and our focus is on this type of workload.

The remainder of the paper is organized as follows. Section 2 reviews the background and related work. In Section 3, we outline the system specification and workload characterization, and investigate the impact factors in the DBMS, such as buffer pool, sort heap size, and prefetching, on OLAP performance. A baseline queueing network model representing the DBMS system under study is proposed and validated in Section 4. Section 5 extends this model to different hardware configurations to test its flexibility, portability and adaptability. Section 6 presents our conclusions and guidelines for future work.
2. RELATED WORK

DBMS vendors have developed capacity planning tools that allow users to estimate the performance of their products. One of these tools is the DB2/UDB Estimator developed by IBM [DB200]. However, its DB2/UDB orientation limits the scope and usability of this tool. The assumptions and simplifications on the SQL statements, table definitions, and transactions of the database in DB2/UDB Estimator also lead to imprecise estimations of performance.

Previous work in our research group focused on capacity planning for an OLTP workload [ZAWA02]. A queueing network model was proposed to model the behavior of the OLTP transactions and determine their relationship with the DBMS resources. The OLTP workload is partitioned into five classes based on the five different types of transactions in the TPC-C benchmark. A set of experiments was run to validate and use the model. The estimated results were mostly within a 10% error range compared to the actual measured values, with the exception being when the population size is small and the system is under-utilized.

This QNM model is not suitable for an OLAP workload. First, the queries in TPC-H, the benchmark for OLAP workloads, are formalized from real-world heterogeneous business questions. We cannot map all queries one-to-one into the QNM workload, as was previously done with the TPC-C benchmark (there are only five transaction classes in TPC-C). The queries must be characterized and grouped into a few general transactions. Second, the characteristics of OLTP queries are different from those of OLAP queries. OLTP queries often reuse data, making the buffer pool area crucial for good performance. OLAP queries, on the other hand, use the buffer pool in different ways, often scanning large amounts of data and thus requiring only a small buffer pool. The equation derived to represent the relationship between disk demand and buffer pool size for an OLTP workload is inappropriate for an OLAP workload. Furthermore, the TPC-H queries use extensive sorting functions, so the model must account for the use of sort memory. Third, the typical number of clients running simultaneously in an OLAP workload is far smaller than for an OLTP workload.

3. SYSTEM FACTORS INFLUENCING OLAP PERFORMANCE

The first step in conducting capacity planning is to have a clear understanding of the configurations of the systems under study, and a familiarity with the typical system workload. In this section, we introduce the computing environments used for our study, characterize the OLAP workload, and then discuss the DBMS factors that affect OLAP performance.

3.1 System Specification

We carry out our study on three different computer systems to ensure independence from the hardware configuration and the adaptability of our model. The first system is an IBM Netfinity 5000, with dual PIII 400MHz processors with 512KB cache, 1 GB ECC SDRAM memory, Open Bay hard disk drives (four IBM-PCCO DGHS09Y UW (ultra wide) SCSI SCA disks, and one IBM-PSG DNES-309170Y Ultra Wide SCSI disk), and Dual Channel Wide Ultra SCSI and PCI/ISA controllers.

The second computer system is an IBM xSeries 240 with dual 1 GHz CPUs, four PCI/ISA controllers, and 24 Seagate ST318436LC SCSI disks. With two processors on its motherboard, we choose upon startup whether the system runs with a single CPU or dual CPUs.

The third computer system is an IBM PowerServer 704 with a single 200 MHz CPU, 1 GB of RAM, and 16 hard disk drives (Seagate ST34520W SCSI). This system has four disk controllers, and each controller handles a number of the disks. By distributing the database tables over multiple disks, we can choose how many disk controllers and disk drives are used.

Throughout the discussion of our experiments, we refer to the first system as Jaguar, the second system as Baserver, and the third as Cougar. The operating system on all machines is Windows NT Server Version 4.0; Jaguar and Baserver run IBM's DB2 Universal Database (DB2/UDB) V8.1 [DB203], and Cougar runs DB2/UDB V7.2. We assume this difference does not affect our study.

3.2 Workload Characterization

Before modeling DBMS performance, we must first understand the intended workload. OLAP has different characteristics compared with an OLTP workload. OLTP uses short, moderately complex queries that read and/or modify a relatively small portion of the overall database. These access patterns translate into small random disk accesses. OLTP workloads typically have a high degree of multiprogramming due to the large number of concurrent users. In contrast, OLAP queries are typically long-running, moderate to very complex queries, which often scan large portions of the database in a read-only fashion. This access pattern translates into large sequential disk accesses. The multiprogramming level in OLAP systems is typically much lower than that of OLTP systems.
In order to simplify and standardize database performance evaluation, the Transaction Processing Performance Council (TPC) [TPC] defines and maintains several industry-standard database benchmarks. Two common benchmarks, TPC-C and TPC-D, model OLTP and OLAP workloads respectively. In April 1999, two new benchmarks, TPC-R and TPC-H, replaced TPC-D as the industry's standard benchmarks for OLAP applications: TPC-R for a reporting workload and TPC-H for an ad-hoc querying workload. An ad-hoc querying workload simulates an environment in which users connect to the database system and issue individual queries that are not known in advance.

The workload in both TPC-H and TPC-R consists of the execution of 22 read-only queries, in both single and multi-user mode, and two refresh functions. These queries are formalized from real-world business questions. They simulate generated ad-hoc queries (TPC-H) and reporting queries (TPC-R), and generate intensive disk and CPU activity on the database server. Note that since our TPC-H benchmark setup has not been audited per the TPC specifications, our benchmark workloads are referred to as TPC-H-like workloads and are not to be interpreted as official benchmark results.

From the perspective of a queueing network model, the workload is a set of formalized parameters. Among these parameters, the two most important are the service demand (the service time needed from each resource) and the number of workload classes.

The Windows NT performance monitor is used to collect performance data, including the percentage of non-idle processor time spent servicing DB2/UDB (%Processor Time) and the percentage of elapsed time that the selected disk drive is busy servicing read or write requests (%Disk Time). Based on these measurements, the CPU service demand and disk service demand can be calculated using the following utilization law:

    ServiceDemand = (Utilization * Time) / Completions

where Utilization is the percentage of time the resource is busy serving this class, Time is the total time the workload is run, and Completions is the number of completed transactions of this class. We run the 22 TPC-H queries one by one, and collect their %Processor Time and %Disk Time to calculate their CPU demands and disk demands.
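As a concrete illustration of this calculation, the sketch below applies the utilization law to monitor samples. The function name and the sample numbers are ours, not measurements from the paper.

```python
def service_demand(utilization, elapsed_time, completions):
    """Utilization law: ServiceDemand = (Utilization * Time) / Completions.

    utilization:  fraction of time the resource was busy serving this class
    elapsed_time: total time the workload was run (s)
    completions:  number of completed transactions of this class
    """
    return utilization * elapsed_time / completions

# Hypothetical monitor sample: CPU 38% busy and disk 85% busy over a
# 600 s run in which one TPC-H query completed.
cpu_demand = service_demand(0.38, 600.0, 1)    # 228 s of CPU per query
disk_demand = service_demand(0.85, 600.0, 1)   # 510 s of disk per query
```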
In the experiments that we conduct, we use a C language implementation of the TPC-H benchmark driver to simulate a typical OLAP workload on a DBMS. A set of batch files, each file running the 22 TPC-H queries in random order, is used to simulate multiple clients using the DBMS concurrently. We ran these batch files for different system configurations and collected the performance indices using the Windows NT performance monitor. We use only the read-only queries in our experiments. The TPC-H database is stable and consistent, and the DBMS performance is the same across multiple runs.

3.3 DBMS Tuning

Database server performance depends on two things: the hardware environment and the software configuration. In this section, we examine the impact of the software configuration, that is, DBMS tuning parameters such as buffer pool, sort heap, and prefetching, and determine the appropriate settings for these parameters to be used in the later sections.

3.3.1 Buffer Pool

The buffer pool is an area of memory into which database pages are temporarily read and modified. The purpose of the buffer pool is to reduce the number of disk accesses and to improve database system performance, since data can be accessed much faster from memory than from disk. During buffer pool setup and configuration, the general rule of thumb is to keep objects with different access patterns separate. That is, objects with a high reference rate should be kept separate from those with a high sequential rate. TPC-C uses small random disk accesses with a high degree of re-reference, while TPC-H uses large sequential disk accesses. Zawawy [ZAWA02] found that TPC-C performance was greatly impacted by the buffer pool size in an OLTP workload.

We measured the TPC-H response times of the 22 queries using different buffer pool sizes with different system configurations. The experiments are designed and run with all the other parameters fixed except the one being investigated, thus isolating its effect. Figure 1 shows the sum of the 22 TPC-H queries' response times vs. buffer pool size for different sort heap sizes, another factor that will be covered in the next section.

[Figure 1: Sum of the 22 TPC-H query response times vs. buffer pool size (pages) for sort heap sizes of 1K, 10K and 100K pages]

From Figure 1 we can see that the response time decreases drastically at the beginning of the graph, i.e., any change in the smaller buffer pool sizes has a great impact on performance. Once the buffer pool reaches a certain size, the response time becomes constant, which means there is a threshold for the buffer pool size. The performance of TPC-H is independent of the buffer pool size beyond this threshold.

In an OLTP workload, the response time typically decreases with an increase in buffer pool size. The main reason for the different behavior of an OLAP workload is the different data access patterns of the TPC-C and TPC-H workloads. During the large sequential disk accesses of the TPC-H workload, the data are swapped into and out of the buffer pool; the number of disk accesses is independent of the buffer pool size. Another reason is that in TPC-H the database indexes are heavily used. Once these indexes can be held in memory, the performance of TPC-H is significantly improved.

In order to verify our hypothesis, we monitor the [buffer pool hit rates. The hit rates] for indexes are very high (>93%), while the hit rates for data are quite low (<20%). The data hit rates even [...] overall response time of the TPC-H queries increases [...] need to find the exact value and impact factors of this [threshold] size. A DBMS is a complex software system; changing one [...].

Figure 2 shows the results for different CPU speeds (Cougar 200MHz, Jaguar 400MHz, Baserver 1GHz). The impact of a different number of CPUs (1 and 2 processors) and the effect of multiple disks, where the database is distributed across 3, 4, and 12 disks, respectively, were also tested [CUI03]. We observed that TPC-H performance changes along with the hardware configuration, but the buffer pool threshold [does not].

[Figure 2: Effect of CPU speed on the TPC-H response time vs. buffer pool size curve (Cougar 200MHz, Jaguar 400MHz, Baserver 1GHz)]
The third set of experiments is on the relationship between the buffer pool threshold and the database size. Based on the above analysis, the buffer pool threshold should be closely related to the database size, especially the database index size. We test three database sizes: 1GB, 5GB and 10GB. The results (Figure 3) verify our hypothesis and indicate that the buffer pool threshold increases along with database size. However, we cannot derive a quantitative equation for buffer pool threshold and database size without further points of reference.

[Figure 3: Buffer pool size vs. response time for different database sizes (1GB, 5GB, 10GB)]

In summary, there is a threshold for buffer pool size for a TPC-H workload. Beyond this threshold, the performance of TPC-H is independent of the size of the buffer pool. This threshold is not correlated with the sort heap parameter in the DBMS, nor with the hardware configuration. It is closely related to, and increases along with, the database size.

In the next set of experiments, we use a buffer pool size of 5,000 pages for the 1GB database, 7,000 pages for the 5GB database and 20,000 pages for the 10GB database, which are all beyond their buffer pool thresholds, to eliminate the buffer pool effect and to simplify the model.
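Reading such a threshold off a measured curve can be mechanized. The sketch below is our own illustration, not a procedure from the paper; the 2% tolerance is an arbitrary choice.

```python
def find_threshold(sizes, resp_times, tol=0.02):
    """Smallest setting beyond which the response time stops improving
    by more than `tol` (relative) per step, i.e. where a curve such as
    Figure 1 flattens. Assumes `sizes` is sorted in ascending order."""
    for i in range(1, len(sizes)):
        if (resp_times[i - 1] - resp_times[i]) / resp_times[i - 1] <= tol:
            return sizes[i - 1]
    return sizes[-1]

# find_threshold([1000, 3000, 5000, 7000], [9000, 7200, 7150, 7140])
# returns 3000: growing the buffer pool past ~3,000 pages no longer helps.
```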
3.3.2 Sort Heap

OLAP queries are typically long-running, moderate to very complex queries with frequent multi-way joins. Among the 22 queries of the TPC-H benchmark, 18 queries have an "order by" function, and 15 queries have a "group by" function. The sort heap, which is used to sort, join, and calculate, should play an important role [...]. [As with the buffer pool, there is a] threshold. The value of this threshold is related to the workload. Each query uses different amounts of data, and has different requirements for sorting and grouping. When the sort heap size is small, some of the temporary results have to be written to disk. The sort heap threshold is the size of sort heap that satisfies the required memory for that query's data management. We plot the response times of all 22 queries together, all queries except query 9, and query 9 alone in Figure 4, respectively. Query 9 plays a dominant role in the overall 22 queries. If we look at query 9 more closely in the figure, we see that the response time of query 9 drops from 848.1 sec to 308.2 sec when the sort heap size changes from 82,040 to 82,050 pages. A change of only 10 pages improves its performance by more than 60%.

[Figure 4: TPC-H response time vs. sort heap size (pages): sum of all 22 queries, sum without query 9, and query 9 alone]

Another point that should be noted is that not all queries are sensitive to a change in sort heap size, and each query has a different threshold. Query 9 has the largest threshold among the TPC-H queries.

Following this work we investigated the effect of hardware (number of disks, number of CPUs, and CPU speed) and database size on the sort heap threshold. Figure 5 shows the impact of the number of disks, where a 1GB TPC-H database is distributed across 3, 4, 8, and 12 disks, respectively. The response times of the different disk distributions are different, but the shape of the graph and the threshold of sort heap size are the same. The sort heap threshold is independent of the disk configuration.

[Figure 5: TPC-H response time vs. sort heap size for 3, 4, 8, and 12 disks (database size 1GB)]
[The effect of CPU speed is shown in Figure 6; slower processors have larger sort heap thresholds. We found no change in the] threshold when changing from one processor to two processors [CUI03].

Figure 6. Effect of Sort Heap Size on TPC-H Query Response Time for Different CPU Speeds (Database size 1GB)

The impact of database size on the sort heap threshold is shown in Figure 7. As expected, we found that the sort heap threshold increases along with the database size.

Figure 7. Effect of Sort Heap Size on TPC-H Query Response Time for Different Database Sizes (Baserver, One CPU, Four Disks)

In summary, the sort heap is another tunable parameter in a DBMS that impacts TPC-H performance. We found that there is a certain threshold, and beyond this value the TPC-H performance is independent of the sort heap size. This threshold is dependent on the query content, and different queries have different thresholds. The disk configuration has no effect on the sort heap threshold, while CPU speed plays an important role. We found that slower processors usually need more sort heap memory and have a larger sort heap threshold. The required sort heap threshold increases along with the database size.

3.3.3 Prefetching

OLAP queries typically scan large portions of the database. This access pattern translates into large sequential disk accesses. The prefetchers (I/O servers) are designed to deal with this matter, since they prefetch index and data pages into the buffer pool, reducing the time spent waiting for I/O to complete. During the experiments on buffer pool hit rate of Section 3.3.1, we found that prefetching decreases the TPC-H response time by 30%.

Dennis Shasha and Philippe Bonnet [SHAS] tested the influence of prefetching on the throughput of DB2/UDB V7.1. They found that the throughput increases as the prefetching size increases up to a certain point (about 16 pages); then it is constant. Sixteen pages is the default value for the table prefetching size in DB2/UDB. We use this value in our experiments.

The prefetching size is predefined when creating the table space; we cannot modify the prefetching size once the table space is created, but prefetching can be turned on or off via configuration parameters. From Table 1 we can see that prefetching decreases the response time by more than 20% in every set of experiments. We use prefetching on as the default in the rest of our experiments.

Table 1. The Effect of Prefetching on TPC-H Response Time (s) for Different Buffer Pool Sizes and Sort Heap Sizes (pages) (Cougar, Database size 1GB)

    Buffer pool      1K                 5K                 10K
    Sort heap     1K    10K   100K   1K    10K   100K   1K    10K   100K
    Prefetch off  7779  6490  5040   7070  6072  4985   6719  6083  4948
    Prefetch on   5739  5095  3825   4848  4666  3771   4574  4679  3731

I/O servers are used on behalf of the database agents to perform prefetch I/O and asynchronous I/O by utilities such as backup and restore. The number of I/O servers in DB2/UDB can range from 1 to 255. We checked the influence of the number of I/O servers on the performance of the TPC-H queries, and found that one I/O server slowed down the response, while the response times are almost constant starting from two I/O servers. This experiment shows that 2 I/O servers are enough for the current system; we use the value (20) for the number of I/O servers in our experiments.

4. QUEUEING NETWORK MODEL
A queueing network model represents the behavior of queries in the DBMS. We model a query in terms of its use of three main resources, namely CPU, disk and main memory. An OLAP query can be characterized as a series of logical page accesses and page processes. A logical page access first tries the main memory and, if the requested page is not in the buffer pool, it is read from the disk (a physical page access). The probability that a logical page access involves a disk access is very high for OLAP queries, based on the fact that the buffer pool data hit rate is quite low (<20%). The page process includes the sort, group and calculate operations that give the final query result.

[The multiprogramming level of an OLAP system is low; the maximum number of clients], therefore, in our experiments is 15. Each of these clients submits a query, waits for the response of that query, analyzes the response, and composes a new query to be submitted to the server. A closed queueing network model with a finite population is used.

A closed QNM requires three inputs: the number of service centers, the service demand of each service center for every workload class, and the frequency of each workload class. There are three service centers in our DBMS QNM: CPU, disk and main memory. We consider the buffer area and sort heap to be the main memory resource in this study. The time to transfer data in and out of main memory is negligible, so the service demand of the main memory is not significant. As we discussed in the previous section, OLAP performance is independent of the sizes of the buffer pool and sort heap when they are beyond their thresholds. Our QNM can thus be simplified to two service centers (CPU and disk) if the DBMS is well tuned (buffer pool and sort heap larger than their thresholds).
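A closed model of this kind can be solved with standard mean value analysis (MVA). The single-class sketch below aggregates the per-class demands of Table 2 below into one class weighted by request frequency; it is our own simplification, not the multiclass solver (ClosedQN.XLS) actually used in the paper.

```python
def mva(demands, population, think_time=0.0):
    """Exact single-class MVA for a closed QNM of queueing centers.

    demands:    service demand (s) at each center (here: CPU, disk)
    population: number of concurrent clients
    Returns (response time, throughput, per-center utilizations)."""
    queue = [0.0] * len(demands)                # mean queue lengths at n-1
    resp, thruput = 0.0, 0.0
    for n in range(1, population + 1):
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        resp = sum(resid)                       # system response time
        thruput = n / (resp + think_time)
        queue = [thruput * r for r in resid]    # Little's law per center
    return resp, thruput, [thruput * d for d in demands]

# Frequency-weighted demands from Table 2: ~11.4 s CPU, ~88.9 s disk.
resp, x, (u_cpu, u_disk) = mva([11.4, 88.9], population=10)
```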
The computational complexity of a QNM increases exponentially with the number of workload classes. Instead of directly using the 22 queries as our QNM workload, we simplify and classify them into a small number of general workload classes based on their resource usage. The CPU demands and disk demands of all the queries are classified into three categories (simple, complex, very complex) using a K-means [clustering algorithm ...] complex (SVC), etc. All 22 TPC-H queries can be mapped into these general queries based on their CPU and disk demand values. The CPU demand and disk demand of each general query can be calculated [...].
[We found that the] sum of the CPU and disk service demands is always larger than the actual response time due to asynchronous activity, and these asynchronous actions cannot be captured by the QNM. The asynchronous time was estimated as the difference between the sum of the CPU and disk demands and the actual response time. Half of the asynchronous time was deducted from each of the CPU service demand and the disk service demand to reflect the asynchronous actions in the DBMS.
Table 2. The Parameter Values for the QNM (Baserver, Database size 1GB, 1 CPU, 1 Disk, 1 Client)

    Workload class    SS     SC     CS     CC     CVC    VCS    VCC
    No. of requests   0.23   0.05   0.09   0.27   0.10   0.18   0.09
    CPU demand        6.4    5.6    6.1    9.6    19.9   17.5   15.3
    Disk demand       33.0   102.6  26.2   100    335.5  26.4   94.6

We validate our queueing network model against the TPC-H performance using DB2/UDB. System measurements are made and used to calculate the QNM input parameters (Table 2 shows one set of parameters used in the QNM). The performance indices (response time, resource utilization) are estimated using the QNM with these calculated parameters. The model is validated by comparing the calculated indices with the performance indices measured directly on the system. Results of the experiments are shown in Figure 9. In most cases the errors are within 10%.

[Figure 9: comparison of measured and QNM-estimated performance indices for different numbers of clients]

The calculated values, using the MS Excel workbooks (ClosedQN.XLS) developed by Menasce [MENA98] for solving parallel multiple service centers' QNMs, are always smaller than the actual measured values. The actual disk system is not optimized as expected in the QNM.

Figure 10. The QNM Calculated CPU and Disk Utilities for Different Population Sizes (Baserver, Database size 1GB)
5. [EXTENDING THE MODEL]

5.1 [Multiple Disks]

Calculating the weight of the CPU demand and disk demand in the overall service demand (the sum of the CPU demand and disk demand), we found that only 5 TPC-[H queries ...]. [...]

[Amdahl's law is used to estimate the multiple disk effect:]

    A(P) = P / (1 + σ(P − 1))

where P is the number of processors in the system and A(P) is the actual capacity. The parameter σ (0 < σ < 1), known as the seriality constant, refers to the serial fraction of the workload that cannot be made to execute in parallel.
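With the seriality constant measured, the capacity formula gives the effective speed-up for any device count. The sketch below, including the use of A(P) to scale the disk demand, is our reading of the paper's approach rather than its stated procedure.

```python
def capacity(p, sigma=0.30):
    """Amdahl's-law effective capacity A(P) = P / (1 + sigma*(P - 1));
    sigma = 0.30 is the seriality constant measured in this section."""
    return p / (1.0 + sigma * (p - 1))

# Scaling a one-disk demand of 88.9 s to a three-disk configuration:
disk_demand_3 = 88.9 / capacity(3)   # capacity(3) = 1.875 -> about 47.4 s
```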
In order to test the correctness of this strategy, another system (Jaguar with 3 disks) was examined. The results for different population sizes are shown in Figure 12. Compared with the actual measured response time, the estimated error is within 10%. We also found that the seriality constant (σ = 0.30) is the same as for the previous system (Baserver), indicating that σ is related to the workload characteristics, not [the hardware configuration].

Figure 12. The Actual Response Time and the Estimated Values from the Modified QNM for Different Numbers of Clients (Jaguar, Database size 1GB, 3 disks)
5.2 Multiple CPUs

We also investigated the effect of adding processors to the system. We assume that the CPU demands are spread equally among the multiple processors, and the measurements from the NT monitor support this hypothesis. Because the CPU is not the bottleneck of the current system, adding one CPU only improves the TPC-H performance by 5-10%, while adding one disk led to a 20-30% improvement.

An example result is shown in Figure 13, which plots the actual measured response times and the response times calculated using the MS Excel workbooks (ClosedQN.XLS) developed by Menasce [MENA98] for solving parallel multiple service centers' QNMs. The multiple processor effect [GUNT96] is not seen in the current system, perhaps because the CPU is not the bottleneck of the current system, or because the multiple processor effect is not significant in the case of two processors.

Figure 13. The Response Time vs. Different Population Sizes (Baserver, Database size 1GB, 1 disk, two Processors)

5.3 Different Disk Specifications

Every disk can be characterized by three basic quantities: seek time, latency time, and transfer rate. From these characteristics it is possible to estimate the changes in disk service demands that will result from replacing one type of disk with another. The exact change in disk service demand would also [...].

All the TPC-H queries we used are read-only and run without the update function. They can be decomposed into a set of logical reads. If a logical read is not satisfied in the buffer area, it results in a physical disk access, which is further classified into two general disk access patterns: random read (RR) and sequential read (SR). The disk service times for a random read (random read time, RRT) and a sequential read (sequential read time, SRT) can be calculated using the following equations [MENA98]:

    RRT = SeekTime + LatencyTime / 2 + TransferTime

    SRT = SeekTime / RunLength + TransferTime
          + [1/2 + (RunLength − 1)(1 + DiskUtility) / 2] * LatencyTime / RunLength

where RunLength is the block size of a sequential read. For our system we use the prefetch size (64 KB) as the RunLength, and the disk utility is very high (>90%) in our system. In this case, the SRT equation can be simplified into the following equation:

    SRT = SeekTime / RunLength + TransferTime + LatencyTime

From the buffer pool snapshot function in DB2/UDB we can get the total number of physical reads and the total buffer pool read time for each query. Based on [these measurements and the relations]

    PhysicalRead = RR + SR
    ReadTime = RR * RRT + SR * SRT

[we can solve for RR and SR].
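These two relations form a two-by-two linear system, so RR and SR can be recovered from the snapshot counters and then re-priced for a candidate disk. A sketch, with made-up disk parameters (times in ms):

```python
def rrt(seek, latency, transfer):
    # Random read: full seek, half a rotation, then the transfer [MENA98].
    return seek + latency / 2.0 + transfer

def srt(seek, latency, transfer, run_length):
    # Simplified sequential-read time, valid when disk utility is high.
    return seek / run_length + transfer + latency

def split_reads(physical_reads, read_time, t_rr, t_sr):
    """Solve RR + SR = physical_reads and RR*t_rr + SR*t_sr = read_time."""
    sr = (read_time - physical_reads * t_rr) / (t_sr - t_rr)
    return physical_reads - sr, sr                       # (RR, SR)

# Hypothetical current disk and replacement candidate (seek, latency,
# transfer in ms; run length = 16 pages, the 64 KB prefetch size).
t_rr_old, t_sr_old = rrt(8.0, 8.3, 0.5), srt(8.0, 8.3, 0.5, 16)
t_rr_new, t_sr_new = rrt(5.0, 6.0, 0.3), srt(5.0, 6.0, 0.3, 16)

rr, sr = split_reads(1.0e6, 1.0e7, t_rr_old, t_sr_old)   # from snapshot
predicted_read_time = rr * t_rr_new + sr * t_sr_new      # for the new disk
```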
These numbers (random read and sequential read) are associated with the workload queries and will not change for a different disk system. Given a disk specification (seek time, latency time, and transfer rate), we can calculate RRT and SRT, and we can get the buffer pool read time for a specific query using the known RR and SR. We also observed that there is a linear relationship between buffer pool read time and disk demand (Figure 14).
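That linear relationship can be captured with an ordinary least-squares fit over the per-query (buffer pool read time, disk demand) pairs; the sketch and data points below are ours, for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares: slope and intercept of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Placeholder (buffer pool read time, disk demand) pairs, in seconds:
a, b = fit_line([30.0, 90.0, 250.0, 310.0], [33.0, 95.0, 260.0, 335.0])
disk_demand_est = a * 120.0 + b   # estimated demand for a 120 s read time
```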
Figure 14. The Linear Correlation Between Total Buffer Pool Read Time and Disk Demand (Baserver, database size 1GB, 1 disk, 1 CPU)

Based on this linear equation, we can calculate the disk demand from the buffer pool read time. Figure 15 shows the estimated disk demand and the actual measured disk demand for one system (Jaguar).

[Figure 15: actual vs. estimated disk demand for different numbers of clients (Jaguar)]

Consequently, once we know the disk demand, we can plug it into our QNM and estimate the performance for a new disk specification. Figure 16 shows the results.

Figure 16. The response time (actual, calculated using measured disk demand, and calculated using estimated disk demand) for different population sizes (Cougar, database size 1GB, 1 disk, 1 CPU)

5.4 Different CPU Speed

[...]
6. [CONCLUSIONS]

Capacity planning techniques are needed to avoid the pitfalls of inadequate capacity and to meet users' performance expectations in a cost-effective manner. This paper attempts to lay a foundation for carrying out capacity planning studies for DBMSs with OLAP workloads. The main contributions of this paper are a study of the DBMS factors influencing OLAP performance, the design and validation of a queueing network model that captures the main features of the DBMS behavior, and the use of a quantitative approach to project DBMS performance with an OLAP workload.

We first build a workload model for OLAP, which is based on the TPC-H benchmark, and investigate the impact of DBMS tuning (such as buffer pool, sort heap size, and prefetching) on OLAP performance. We found there are certain thresholds on buffer pool and sort heap size in a DBMS with an OLAP workload, and [...].

Based on these results, we then propose a queueing network model (QNM) to represent the system under study and give the results of preliminary experiments to validate this model. We have indicated, through discussion and example, how to modify the parameters of this baseline model to represent various common changes to the hardware (CPU and disk upgrades) and workload (number of users). Amdahl's law is used to estimate the multiple disk effect on DBMS performance when changing the number of disks. A mathematical approach, based on the disk configuration (seek time, latency time, and transfer rate) and DBMS internal indices (physical reads, read time), is proposed and used to predict disk demands for different disks.

A number of related issues require further study. First, systems with larger numbers of CPUs and disks and larger database sizes need to be tested to verify and quantify the multiple processor effect, the relationships between CPU demand and processor speed, CPU demand and database size, and disk demand and database size. The second issue is to integrate our model with the previous QNM for an OLTP workload [ZAWA02], generating a general workload model. Third, similar studies for DBMSs using other benchmark workloads such as TPC-W (a Web workload) should be conducted. The future direction of this model is to allow the user to define his/her database and workload (SQL statements, batch files, utilities such as copy, recover and rebuild), and then use this model to estimate the performance of different systems under the specified workloads.

Like programming, modeling is more of an art than a science. The more information is supplied, the more accurate the model. Also, the more information provided, the more difficult it is to build the model, and the less adaptable the model will be. There is a trade-off between accuracy, cost and adaptability in capacity planning. The challenge is to come up with a scheme that is rich enough to be useful, and yet [...].

REFERENCES

[CUI03] X. Cui. "Capacity Planning for Database Management Systems with OLAP Workloads", M.Sc. Thesis, School of Computing, Queen's University (2003).

[DB200] "DB2/UDB Estimator Help", DB2/UDB Estimator V7, IBM Corp. (2000).

[DB203] "DB2 Universal Database Administration Guide: Performance", IBM Corporation (2003).

[GRAH78] G. S. Graham. "Queueing Network Models of Computer System Performance", Computing Surveys, Vol. 10, No. 3, September (1978).

[GUNT96] N. J. Gunther. "Understanding the MP Effect: Multiprocessing in Pictures", CMG Conference Proceedings (1996).
[LAZO84] E. Lazowska, J. Zahorjan, S. Graham, and K. Sevcik. "Quantitative System Performance: Computer System Analysis Using Queueing Network Models", Prentice Hall, Englewood Cliffs, N.J. (1984).

[MENA98] D. Menasce and V. A. F. Almeida. "Capacity Planning for Web Performance: Metrics, Models, & Methods", Prentice Hall (1998).

[PATT98] D. A. Patterson and K. K. Keeton. "Hardware Technology Trends and Database Opportunities", Keynote address at ACM SIGMOD'98, Seattle, Washington, June (1998). Available from http://www.cs.berkeley.edu/~pattrsn/talks/sigmod98-keynote-color.pdf

[SAUE81] C. H. Sauer and K. M. Chandy. "Computer Systems Performance Modeling", Prentice Hall (1981).

[SHAS] D. Shasha and P. Bonnet. Talk notes, available from http://www.distlab.dk/dbtune/.
[TPC] Transaction Processing Performance Council. http://www.tpc.org

[VIJA99] J. Vijayan. "Capacity Planning More Vital Than Ever", Computerworld, February (1999).

[ZAWA02] H. Zawawy, P. Martin and H. Hassanein. "Supporting Capacity Planning for DB2/UDB", Proceedings of CASCON 2002, pp. 89-97, September (2002).

[IBM and DB2 are trademarks of International Business Machines] Corporation in the United States, other countries, or both.

Microsoft and Windows NT are registered trademarks of Microsoft Corporation in the United States, other countries, or both.

Pentium is a trademark of Intel Corporation in the United States, other countries, or both.