Capacity Planning for DBMSs with OLAP Workloads
The Computer Measurement Group, commonly called CMG, is a not-for-profit, worldwide organization of data processing professionals committed to the
measurement and management of computer systems. CMG members are primarily concerned with performance evaluation of existing systems to maximize
performance (e.g., response time, throughput) and with capacity management, where planned enhancements to existing systems or the design of new
systems are evaluated to find the necessary resources required to provide adequate performance at a reasonable cost.
This paper was originally published in the Proceedings of the Computer Measurement Group’s 2003 International Conference.
Copyright 2003 by The Computer Measurement Group, Inc. All Rights Reserved. Published by The Computer Measurement Group, Inc. (CMG), a non-profit
Illinois membership corporation. Permission to reprint in whole or in any part may be granted for educational and scientific purposes upon written application to
the Editor, CMG Headquarters, 151 Fries Mill Road, Suite 104, Turnersville, NJ 08012.
BY DOWNLOADING THIS PUBLICATION, YOU ACKNOWLEDGE THAT YOU HAVE READ, UNDERSTOOD AND AGREE TO BE BOUND BY THE
FOLLOWING TERMS AND CONDITIONS:
License: CMG hereby grants you a nonexclusive, nontransferable right to download this publication from the CMG Web site for personal use on a single
computer owned, leased or otherwise controlled by you. In the event that the computer becomes dysfunctional, such that you are unable to access the
publication, you may transfer the publication to another single computer, provided that it is removed from the computer from which it is transferred and its use
on the replacement computer otherwise complies with the terms of this Copyright Notice and License.
Copyright: No part of this publication or electronic file may be reproduced or transmitted in any form to anyone else, including transmittal by e-mail, by file
transfer protocol (FTP), or by being made part of a network-accessible system, without the prior written permission of CMG. You may not merge, adapt,
translate, modify, rent, lease, sell, sublicense, assign or otherwise transfer the publication, or remove any proprietary notice or label appearing on the
publication.
Disclaimer; Limitation of Liability: The ideas and concepts set forth in this publication are solely those of the respective authors, and not of CMG, and CMG
does not endorse, approve, guarantee or otherwise certify any such ideas or concepts in any application or usage. CMG assumes no responsibility or liability
in connection with the use or misuse of the publication or electronic file. CMG makes no warranty or representation that the electronic file will be free from
errors, viruses, worms or other elements or codes that manifest contaminating or destructive properties, and it expressly disclaims liability arising from such
errors, elements or codes.
General: CMG reserves the right to terminate this Agreement immediately upon discovery of violation of any of its terms.
Xilin Cui, Patrick Martin and Wendy Powley
School of Computing, Queen's University, Kingston
This paper focuses on capacity planning for online analytical processing (OLAP)
workloads. We first build a workload model for an OLAP workload based on the TPC-H
benchmark, then investigate the impact of database design and database management
system (DBMS) tuning factors (such as buffer pool, sort heap size, and prefetching) on
OLAP performance. We propose a queueing network model (QNM) to represent the
system under study, present preliminary experimental results to validate this model, and
show how the model can represent various common changes to the CPU and disk.
1. INTRODUCTION
Often more emphasis is placed on the functionality of a computer system than on its performance. In the past, e-commerce companies believed that the most urgent priority they had was to get their business up and running. However, this attitude has changed [VIJA99]. Given infinite resources, the expected quality of service can always be provided; with limited resources, capacity planning and resource management are needed.

Computer capacity planning is the process of monitoring and projecting computer workload and specifying the most cost-effective computing environment that meets the current and future demands for computer services [LAZO84]. Much of the work on capacity planning was done in the 1970s and 1980s [GRAH78] [SAUE81], while CPU speeds were relatively slow. Since then, research has been sparse in this area. One of the reasons behind this lack of interest could be the relatively low cost of PCs and their building components. Starting from 1996, with the development of the World Wide Web (WWW) and the emergence of very large multimedia databases, database demand has doubled every 9-12 months according to Greg's law [PATT98], while processor speed has doubled every 18 months based on Moore's law. The gap between database demand and CPU speed is increasing with time, indicating a need for capacity planning in this area.

Database applications provided 32% of the server volume in 1995 and the share is increasing [STEN97]. Corporate profits and organizational efficiency are becoming increasingly dependent upon their database server systems, and those systems are becoming so complex and diverse that they are difficult to control and maintain. Monitoring and predicting their performance is a challenge for system designers.

Database management systems (DBMSs) are complicated software systems. Their growing complexity stimulates the development of software tools to help in system design and tools to predict and probe system performance. Of specific interest to this paper is database system design and planning. The main goal of the paper is to lay the foundation for a software tool that will allow users to examine their design decisions and to estimate the performance of the resulting database management system.

The database community widely recognizes two major types of commercial database workloads [TPC]: online transaction processing (OLTP) and online analytical processing (OLAP), also called decision support systems (DSS). OLTP systems, such as airline reservation systems, handle the operational aspects of day-to-day business transactions. OLAP systems provide historical support for forming business decisions. The high degree of query complexity in an OLAP workload presents unique challenges for capacity planning, and our focus is on this type of workload.

The remainder of the paper is organized as follows. Section 2 reviews the background and related work. In Section 3, we outline the system specification and workload characterization, and investigate the impact factors in the DBMS, such as buffer pool, sort heap size, and prefetching, on OLAP performance. A baseline queueing network model representing the DBMS system under study is proposed and validated in Section 4. Section 5 extends this model to different hardware configurations to test its flexibility, portability and adaptability. Section 6 presents our conclusions and guidelines for future work.
2. RELATED WORK

DBMS vendors have developed capacity planning tools that allow users to estimate the performance of their products. One of these tools is the DB2/UDB Estimator developed by IBM [DB200]. However, its DB2/UDB orientation limits the scope and usability of this tool. The assumptions and simplifications on the SQL statements, table definitions, and transactions of the database in DB2/UDB Estimator also lead to imprecise estimations of performance.

Previous work in our research group focused on capacity planning for an OLTP workload [ZAWA02]. A queueing network model was proposed to model the behavior of the OLTP transactions and determine their relationship with the DBMS resources. The OLTP workload is partitioned into five classes based on the five different types of transactions in the TPC-C benchmark. A set of experiments was run to validate and use the model. The estimated results were mostly within a 10% error range compared to the actual measured values, with the exception being when the population size is small and the system is under-utilized.

This QNM model is not suitable for an OLAP workload. First, the queries in TPC-H, the benchmark for OLAP workloads, are formalized from real-world heterogeneous business questions. We cannot map all queries one-to-one into the QNM workload, as was previously done with the TPC-C benchmark (there are only five transaction classes in TPC-C). The queries must be characterized and grouped into a few general transactions. Second, the characteristics of OLTP queries are different from those of OLAP queries. OLTP queries often reuse data, making the buffer pool area crucial for good performance. OLAP queries, on the other hand, use the buffer pool in different ways, often scanning large amounts of data and thus requiring only a small buffer pool. The equation derived to represent the relationship between disk demand and buffer pool size for an OLTP workload is inappropriate for an OLAP workload. Furthermore, the TPC-H queries use extensive sorting functions, so the model must account for the use of sort memory. Third, the typical number of clients running simultaneously in an OLAP workload is far smaller than for an OLTP workload.

3. SYSTEM FACTORS INFLUENCING OLAP PERFORMANCE

The first step in conducting capacity planning is to have a clear understanding of the configurations of the systems under study, and a familiarity with the typical system workload. In this section, we introduce the computing environments used for our study, characterize the OLAP workload, and then discuss the DBMS factors that affect OLAP performance.

3.1 System Specification

We carry out our study on three different computer systems to ensure independence from the hardware configuration and the adaptability of our model. The first system is an IBM Netfinity 5000, with dual PIII 400MHz processors with 512KB cache, 1 GB ECC SDRAM memory, Open Bay hard disk drives (four IBM-PCCO DGHS09Y UW (ultra wide) SCSI SCA disks, and one IBM-PSG DNES-309170Y Ultra Wide SCSI disk), and Dual Channel Wide Ultra SCSI and PCI/ISA controllers.

The second computer system is an IBM xSeries 240 with dual 1 GHz CPUs, four PCI/ISA controllers, and 24 Seagate ST318436LC SCSI disks. With two processors on its motherboard, we choose upon startup whether the system runs with a single CPU or dual CPUs.

The third computer system is an IBM PowerServer 704 with a single 200 MHz CPU, 1 GB of RAM, and 16 hard disk drives (Seagate ST34520W SCSI). This system has four disk controllers, and each controller handles a number of the disks. By distributing the database tables over multiple disks, we can choose how many disk controllers and disk drives are used.

Throughout the discussion of our experiments, we refer to the first system as Jaguar, the second system as Baserver, and the third as Cougar. The operating system on all machines is Windows NT Server Version 4.0; Jaguar and Baserver run IBM's DB2 Universal Database (DB2/UDB) V8.1 [DB203], and Cougar runs DB2/UDB V7.2. We assume this difference does not affect our study.

3.2 Workload Characterization

Before modeling DBMS performance, we must first understand the intended workload. OLAP has different characteristics compared with an OLTP workload. OLTP uses short, moderately complex queries that read and/or modify a relatively small portion of the overall database. These access patterns translate into small random disk accesses. OLTP workloads typically have a high degree of multiprogramming due to the large number of concurrent users. In contrast, OLAP queries are typically long-running, moderate to very complex queries, which often scan large portions of the database in a read-only fashion. This access pattern translates into large sequential disk accesses. The multiprogramming level in OLAP systems is typically much lower than that of OLTP systems.
In order to simplify and standardize database performance evaluation, the Transaction Processing Performance Council (TPC) [TPC] defines and maintains several industry-standard database benchmarks. Two common benchmarks, TPC-C and TPC-D, model OLTP and OLAP workloads respectively. In April 1999, two new benchmarks, TPC-R and TPC-H, replaced TPC-D as the industry's standard benchmarks for OLAP applications: TPC-R for a reporting workload and TPC-H for an ad-hoc querying workload. An ad-hoc querying workload simulates an environment in which users connect to the database system and issue individual queries that are not known in advance.

The workload in both TPC-H and TPC-R consists of the execution of 22 read-only queries, in both single and multi-user mode, and two refresh functions. These queries are formalized from real-world business questions. They simulate generated ad-hoc queries (TPC-H) and reporting queries (TPC-R), and generate intensive disk and CPU activity on the database server. Note that since our TPC-H benchmark setup has not been audited per the TPC specifications, our benchmark workloads are referred to as TPC-H-like workloads and are not to be interpreted as official benchmark results.

From the perspective of a queueing network model, the workload is a set of formalized parameters. Among these parameters, the two most important are the service demand (the service time needed from each resource) and the number of workload classes.

The Windows NT performance monitor is used to collect performance data, including the percentage of non-idle processor time spent servicing DB2/UDB (%Processor Time) and the percentage of elapsed time that the selected disk drive is busy servicing read or write requests (%Disk Time). Based on these measurements, the CPU service demand and disk service demand can be calculated using the following utilization law:

    ServiceDemand = (Utilization * Time) / Completions

where Utilization is the percentage of time the resource is busy serving this class, Time is the total time the workload is run, and Completions is the number of completed transactions of this class. We run the 22 TPC-H queries one by one, and collect their %Processor Time and %Disk Time to calculate their CPU demands and disk demands.
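As a concrete illustration of this calculation, the sketch below applies the utilization law to monitor samples. The function name and the sample numbers are ours, not measurements from the paper.

```python
def service_demand(utilization, elapsed_time, completions):
    """Utilization law: ServiceDemand = (Utilization * Time) / Completions.

    utilization:  fraction of time the resource was busy serving this class
    elapsed_time: total time the workload was run (s)
    completions:  number of completed transactions of this class
    """
    return utilization * elapsed_time / completions

# Hypothetical monitor sample: CPU 38% busy and disk 85% busy over a
# 600 s run in which one TPC-H query completed.
cpu_demand = service_demand(0.38, 600.0, 1)    # 228 s of CPU per query
disk_demand = service_demand(0.85, 600.0, 1)   # 510 s of disk per query
```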
In the experiments that we conduct, we use a C language implementation of the TPC-H benchmark driver to simulate a typical OLAP workload on a DBMS. A set of batch files, each file running the 22 TPC-H queries in random order, is used to simulate multiple clients using the DBMS concurrently. We ran these batch files for different system configurations and collected the performance indices using the Windows NT performance monitor. We use only the read-only queries in our experiments. The TPC-H database is stable and consistent, and the DBMS performance is the same across multiple runs.

3.3 DBMS Tuning

Database server performance depends on two things: the hardware environment and the software configuration. In this section, we examine the impact of the software configuration, that is, DBMS tuning parameters such as buffer pool, sort heap, and prefetching, and determine the appropriate settings for these parameters to be used in the later sections.

3.3.1 Buffer Pool

The buffer pool is an area of memory into which database pages are temporarily read and modified. The purpose of the buffer pool is to reduce the number of disk accesses and to improve database system performance, since data can be accessed much faster from memory than from disk. During buffer pool setup and configuration, the general rule of thumb is to keep objects with different access patterns separate. That is, objects with a high reference rate should be kept separate from those with a high sequential rate. TPC-C uses small random disk accesses with a high degree of re-reference, while TPC-H uses large sequential disk accesses. Zawawy [ZAWA02] found that TPC-C performance was greatly impacted by the buffer pool size in an OLTP workload.

We measured the TPC-H response times of the 22 queries using different buffer pool sizes with different system configurations. The experiments are designed and run with all the other parameters fixed except the one being investigated, thus isolating its effect. Figure 1 shows the sum of the 22 TPC-H queries' response times vs. buffer pool size for different sort heap sizes, another factor that will be covered in the next section.

[Figure 1: Sum of the 22 TPC-H query response times vs. buffer pool size (pages) for sort heap sizes of 1K, 10K and 100K pages]

From Figure 1 we can see that the response time decreases drastically at the beginning of the graph, i.e., any change in the smaller buffer pool sizes has a great impact on performance. Once the buffer pool reaches a certain size, the response time becomes constant, which means there is a threshold for the buffer pool size. The performance of TPC-H is independent of the buffer pool size beyond this threshold.

In an OLTP workload, the response time typically decreases with an increase in buffer pool size. The main reason for the different behavior of an OLAP workload is the different data access patterns of the TPC-C and TPC-H workloads. During the large sequential disk accesses of the TPC-H workload, the data are swapped into and out of the buffer pool; the number of disk accesses is independent of the buffer pool size. Another reason is that in TPC-H the database indexes are heavily used. Once these indexes can be held in memory, the performance of TPC-H is significantly improved.

In order to verify our hypothesis, we monitor the [buffer pool hit rates. The hit rates] for indexes are very high (>93%), while the hit rates for data are quite low (<20%). The data hit rates even [...] overall response time of the TPC-H queries increases [...] need to find the exact value and impact factors of this [threshold] size. A DBMS is a complex software system; changing one [...].

Figure 2 shows the results for different CPU speeds (Cougar 200MHz, Jaguar 400MHz, Baserver 1GHz). The impact of a different number of CPUs (1 and 2 processors) and the effect of multiple disks, where the database is distributed across 3, 4, and 12 disks, respectively, were also tested [CUI03]. We observed that TPC-H performance changes along with the hardware configuration, but the buffer pool threshold [does not].

[Figure 2: Effect of CPU speed on the TPC-H response time vs. buffer pool size curve (Cougar 200MHz, Jaguar 400MHz, Baserver 1GHz)]
The third set of experiments is on the relationship between the buffer pool threshold and the database size. Based on the above analysis, the buffer pool threshold should be closely related to the database size, especially the database index size. We test three database sizes: 1GB, 5GB and 10GB. The results (Figure 3) verify our hypothesis and indicate that the buffer pool threshold increases along with database size. However, we cannot derive a quantitative equation for buffer pool threshold and database size without further points of reference.

[Figure 3: Buffer pool size vs. response time for different database sizes (1GB, 5GB, 10GB)]

In summary, there is a threshold for buffer pool size for a TPC-H workload. Beyond this threshold, the performance of TPC-H is independent of the size of the buffer pool. This threshold is not correlated with the sort heap parameter in the DBMS, nor with the hardware configuration. It is closely related to, and increases along with, the database size.

In the next set of experiments, we use a buffer pool size of 5,000 pages for the 1GB database, 7,000 pages for the 5GB database and 20,000 pages for the 10GB database, which are all beyond their buffer pool thresholds, to eliminate the buffer pool effect and to simplify the model.
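Reading such a threshold off a measured curve can be mechanized. The sketch below is our own illustration, not a procedure from the paper; the 2% tolerance is an arbitrary choice.

```python
def find_threshold(sizes, resp_times, tol=0.02):
    """Smallest setting beyond which the response time stops improving
    by more than `tol` (relative) per step, i.e. where a curve such as
    Figure 1 flattens. Assumes `sizes` is sorted in ascending order."""
    for i in range(1, len(sizes)):
        if (resp_times[i - 1] - resp_times[i]) / resp_times[i - 1] <= tol:
            return sizes[i - 1]
    return sizes[-1]

# find_threshold([1000, 3000, 5000, 7000], [9000, 7200, 7150, 7140])
# returns 3000: growing the buffer pool past ~3,000 pages no longer helps.
```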
3.3.2 Sort Heap

OLAP queries are typically long-running, moderate to very complex queries with frequent multi-way joins. Among the 22 queries of the TPC-H benchmark, 18 queries have an "order by" function, and 15 queries have a "group by" function. The sort heap, which is used to sort, join, and calculate, should play an important role [...]. [As with the buffer pool, there is a] threshold. The value of this threshold is related to the workload. Each query uses different amounts of data, and has different requirements for sorting and grouping. When the sort heap size is small, some of the temporary results have to be written to disk. The sort heap threshold is the size of sort heap that satisfies the required memory for that query's data management. We plot the response times of all 22 queries together, all queries except query 9, and query 9 alone in Figure 4, respectively. Query 9 plays a dominant role in the overall 22 queries. If we look at query 9 more closely in the figure, we see that the response time of query 9 drops from 848.1 sec to 308.2 sec when the sort heap size changes from 82,040 to 82,050 pages. A change of only 10 pages improves its performance by more than 60%.

[Figure 4: TPC-H response time vs. sort heap size (pages): sum of all 22 queries, sum without query 9, and query 9 alone]

Another point that should be noted is that not all queries are sensitive to a change in sort heap size, and each query has a different threshold. Query 9 has the largest threshold among the TPC-H queries.

Following this work we investigated the effect of hardware (number of disks, number of CPUs, and CPU speed) and database size on the sort heap threshold. Figure 5 shows the impact of the number of disks, where a 1GB TPC-H database is distributed across 3, 4, 8, and 12 disks, respectively. The response times of the different disk distributions are different, but the shape of the graph and the threshold of sort heap size are the same. The sort heap threshold is independent of the disk configuration.

[Figure 5: TPC-H response time vs. sort heap size for 3, 4, 8, and 12 disks (database size 1GB)]
[The effect of CPU speed is shown in Figure 6; slower processors have larger sort heap thresholds. We found no change in the] threshold when changing from one processor to two processors [CUI03].

Figure 6. Effect of Sort Heap Size on TPC-H Query Response Time for Different CPU Speeds (Database size 1GB)

The impact of database size on the sort heap threshold is shown in Figure 7. As expected, we found that the sort heap threshold increases along with the database size.

Figure 7. Effect of Sort Heap Size on TPC-H Query Response Time for Different Database Sizes (Baserver, One CPU, Four Disks)

In summary, the sort heap is another tunable parameter in a DBMS that impacts TPC-H performance. We found that there is a certain threshold, and beyond this value the TPC-H performance is independent of the sort heap size. This threshold is dependent on the query content, and different queries have different thresholds. The disk configuration has no effect on the sort heap threshold, while CPU speed plays an important role. We found that slower processors usually need more sort heap memory and have a larger sort heap threshold. The required sort heap threshold increases along with the database size.

3.3.3 Prefetching

OLAP queries typically scan large portions of the database. This access pattern translates into large sequential disk accesses. The prefetchers (I/O servers) are designed to deal with this matter, since they prefetch index and data pages into the buffer pool, reducing the time spent waiting for I/O to complete. During the experiments on buffer pool hit rate of Section 3.3.1, we found that prefetching decreases the TPC-H response time by 30%.

Dennis Shasha and Philippe Bonnet [SHAS] tested the influence of prefetching on the throughput of DB2/UDB V7.1. They found that the throughput increases as the prefetching size increases up to a certain point (about 16 pages); then it is constant. Sixteen pages is the default value for the table prefetching size in DB2/UDB. We use this value in our experiments.

The prefetching size is predefined when creating the table space; we cannot modify the prefetching size once the table space is created, but prefetching can be turned on or off via configuration parameters. From Table 1 we can see that prefetching decreases the response time by more than 20% in every set of experiments. We use prefetching on as the default in the rest of our experiments.

Table 1. The Effect of Prefetching on TPC-H Response Time (s) for Different Buffer Pool Sizes and Sort Heap Sizes (pages) (Cougar, Database size 1GB)

    Buffer pool      1K                 5K                 10K
    Sort heap     1K    10K   100K   1K    10K   100K   1K    10K   100K
    Prefetch off  7779  6490  5040   7070  6072  4985   6719  6083  4948
    Prefetch on   5739  5095  3825   4848  4666  3771   4574  4679  3731

I/O servers are used on behalf of the database agents to perform prefetch I/O and asynchronous I/O by utilities such as backup and restore. The number of I/O servers in DB2/UDB can range from 1 to 255. We checked the influence of the number of I/O servers on the performance of the TPC-H queries, and found that one I/O server slowed down the response, while the response times are almost constant starting from two I/O servers. This experiment shows that 2 I/O servers are enough for the current system; we use the value (20) for the number of I/O servers in our experiments.

4. QUEUEING NETWORK MODEL
A queueing network model represents the behavior of queries in the DBMS. We model a query in terms of its use of three main resources, namely CPU, disk and main memory. An OLAP query can be characterized as a series of logical page accesses and page processes. A logical page access first tries the main memory and, if the requested page is not in the buffer pool, it is read from the disk (a physical page access). The probability that a logical page access involves a disk access is very high for OLAP queries, based on the fact that the buffer pool data hit rate is quite low (<20%). The page process includes the sort, group and calculate operations that give the final query result.

[The multiprogramming level of an OLAP system is low; the maximum number of clients], therefore, in our experiments is 15. Each of these clients submits a query, waits for the response of that query, analyzes the response, and composes a new query to be submitted to the server. A closed queueing network model with a finite population is used.

A closed QNM requires three inputs: the number of service centers, the service demand of each service center for every workload class, and the frequency of each workload class. There are three service centers in our DBMS QNM: CPU, disk and main memory. We consider the buffer area and sort heap to be the main memory resource in this study. The time to transfer data in and out of main memory is negligible, so the service demand of the main memory is not significant. As we discussed in the previous section, OLAP performance is independent of the sizes of the buffer pool and sort heap when they are beyond their thresholds. Our QNM can thus be simplified to two service centers (CPU and disk) if the DBMS is well tuned (buffer pool and sort heap larger than their thresholds).
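A closed model of this kind can be solved with standard mean value analysis (MVA). The single-class sketch below aggregates the per-class demands of Table 2 below into one class weighted by request frequency; it is our own simplification, not the multiclass solver (ClosedQN.XLS) actually used in the paper.

```python
def mva(demands, population, think_time=0.0):
    """Exact single-class MVA for a closed QNM of queueing centers.

    demands:    service demand (s) at each center (here: CPU, disk)
    population: number of concurrent clients
    Returns (response time, throughput, per-center utilizations)."""
    queue = [0.0] * len(demands)                # mean queue lengths at n-1
    resp, thruput = 0.0, 0.0
    for n in range(1, population + 1):
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        resp = sum(resid)                       # system response time
        thruput = n / (resp + think_time)
        queue = [thruput * r for r in resid]    # Little's law per center
    return resp, thruput, [thruput * d for d in demands]

# Frequency-weighted demands from Table 2: ~11.4 s CPU, ~88.9 s disk.
resp, x, (u_cpu, u_disk) = mva([11.4, 88.9], population=10)
```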
The computational complexity of a QNM increases exponentially with the number of workload classes. Instead of directly using the 22 queries as our QNM workload, we simplify and classify them into a small number of general workload classes based on their resource usage. The CPU demands and disk demands of all the queries are classified into three categories (simple, complex, very complex) using a K-means [clustering algorithm ...] complex (SVC), etc. All 22 TPC-H queries can be mapped into these general queries based on their CPU and disk demand values. The CPU demand and disk demand of each general query can be calculated [...].
[We found that the] sum of the CPU and disk service demands is always larger than the actual response time due to asynchronous activity, and these asynchronous actions cannot be captured by the QNM. The asynchronous time was estimated as the difference between the sum of the CPU and disk demands and the actual response time. Half of the asynchronous time was deducted from each of the CPU service demand and the disk service demand to reflect the asynchronous actions in the DBMS.
Table 2. The Parameter Values for the QNM (Baserver, Database size 1GB, 1 CPU, 1 Disk, 1 Client)

    Workload class    SS     SC     CS     CC     CVC    VCS    VCC
    No. of requests   0.23   0.05   0.09   0.27   0.10   0.18   0.09
    CPU demand        6.4    5.6    6.1    9.6    19.9   17.5   15.3
    Disk demand       33.0   102.6  26.2   100    335.5  26.4   94.6

We validate our queueing network model against the TPC-H performance using DB2/UDB. System measurements are made and used to calculate the QNM input parameters (Table 2 shows one set of parameters used in the QNM). The performance indices (response time, resource utilization) are estimated using the QNM with these calculated parameters. The model is validated by comparing the calculated indices with the performance indices measured directly on the system. Results of the experiments are shown in Figure 9. In most cases the errors are within 10%.

[Figure 9: comparison of measured and QNM-estimated performance indices for different numbers of clients]

The calculated values, using the MS Excel workbooks (ClosedQN.XLS) developed by Menasce [MENA98] for solving parallel multiple service centers' QNMs, are always smaller than the actual measured values. The actual disk system is not optimized as expected in the QNM.

Figure 10. The QNM Calculated CPU and Disk Utilities for Different Population Sizes (Baserver, Database size 1GB)
5. [EXTENDING THE MODEL]

5.1 [Multiple Disks]

Calculating the weight of the CPU demand and disk demand in the overall service demand (the sum of the CPU demand and disk demand), we found that only 5 TPC-[H queries ...]. [...]

[Amdahl's law is used to estimate the multiple disk effect:]

    A(P) = P / (1 + σ(P − 1))

where P is the number of processors in the system and A(P) is the actual capacity. The parameter σ (0 < σ < 1), known as the seriality constant, refers to the serial fraction of the workload that cannot be made to execute in parallel.
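With the seriality constant measured, the capacity formula gives the effective speed-up for any device count. The sketch below, including the use of A(P) to scale the disk demand, is our reading of the paper's approach rather than its stated procedure.

```python
def capacity(p, sigma=0.30):
    """Amdahl's-law effective capacity A(P) = P / (1 + sigma*(P - 1));
    sigma = 0.30 is the seriality constant measured in this section."""
    return p / (1.0 + sigma * (p - 1))

# Scaling a one-disk demand of 88.9 s to a three-disk configuration:
disk_demand_3 = 88.9 / capacity(3)   # capacity(3) = 1.875 -> about 47.4 s
```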
In order to test the correctness of this strategy, another system (Jaguar with 3 disks) was examined. The results for different population sizes are shown in Figure 12. Compared with the actual measured response time, the estimated error is within 10%. We also found that the seriality constant (σ = 0.30) is the same as for the previous system (Baserver), indicating that σ is related to the workload characteristics, not [the hardware configuration].

Figure 12. The Actual Response Time and the Estimated Values from the Modified QNM for Different Numbers of Clients (Jaguar, Database size 1GB, 3 disks)
5.2 Multiple CPUs

We also investigated the effect of adding processors to the system. We assume that the CPU demands are spread equally among the multiple processors, and the measurements from the NT monitor support this hypothesis. Because the CPU is not the bottleneck of the current system, adding one CPU only improves the TPC-H performance by 5-10%, while adding one disk led to a 20-30% improvement.

An example result is shown in Figure 13, which plots the actual measured response times and the response times calculated using the MS Excel workbooks (ClosedQN.XLS) developed by Menasce [MENA98] for solving parallel multiple service centers' QNMs. The multiple processor effect [GUNT96] is not seen in the current system, perhaps because the CPU is not the bottleneck of the current system, or because the multiple processor effect is not significant in the case of two processors.

Figure 13. The Response Time vs. Different Population Sizes (Baserver, Database size 1GB, 1 disk, two Processors)

5.3 Different Disk Specifications

Every disk can be characterized by three basic quantities: seek time, latency time, and transfer rate. From these characteristics it is possible to estimate the changes in disk service demands that will result from replacing one type of disk with another. The exact change in disk service demand would also [...].

All the TPC-H queries we used are read-only and run without the update function. They can be decomposed into a set of logical reads. If a logical read is not satisfied in the buffer area, it results in a physical disk access, which is further classified into two general disk access patterns: random read (RR) and sequential read (SR). The disk service times for a random read (random read time, RRT) and a sequential read (sequential read time, SRT) can be calculated using the following equations [MENA98]:

    RRT = SeekTime + LatencyTime / 2 + TransferTime

    SRT = SeekTime / RunLength + TransferTime
          + [1/2 + (RunLength − 1)(1 + DiskUtility) / 2] * LatencyTime / RunLength

where RunLength is the block size of a sequential read. For our system we use the prefetch size (64 KB) as the RunLength, and the disk utility is very high (>90%) in our system. In this case, the SRT equation can be simplified into the following equation:

    SRT = SeekTime / RunLength + TransferTime + LatencyTime

From the buffer pool snapshot function in DB2/UDB we can get the total number of physical reads and the total buffer pool read time for each query. Based on [these measurements and the relations]

    PhysicalRead = RR + SR
    ReadTime = RR * RRT + SR * SRT

[we can solve for RR and SR].
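These two relations form a two-by-two linear system, so RR and SR can be recovered from the snapshot counters and then re-priced for a candidate disk. A sketch, with made-up disk parameters (times in ms):

```python
def rrt(seek, latency, transfer):
    # Random read: full seek, half a rotation, then the transfer [MENA98].
    return seek + latency / 2.0 + transfer

def srt(seek, latency, transfer, run_length):
    # Simplified sequential-read time, valid when disk utility is high.
    return seek / run_length + transfer + latency

def split_reads(physical_reads, read_time, t_rr, t_sr):
    """Solve RR + SR = physical_reads and RR*t_rr + SR*t_sr = read_time."""
    sr = (read_time - physical_reads * t_rr) / (t_sr - t_rr)
    return physical_reads - sr, sr                       # (RR, SR)

# Hypothetical current disk and replacement candidate (seek, latency,
# transfer in ms; run length = 16 pages, the 64 KB prefetch size).
t_rr_old, t_sr_old = rrt(8.0, 8.3, 0.5), srt(8.0, 8.3, 0.5, 16)
t_rr_new, t_sr_new = rrt(5.0, 6.0, 0.3), srt(5.0, 6.0, 0.3, 16)

rr, sr = split_reads(1.0e6, 1.0e7, t_rr_old, t_sr_old)   # from snapshot
predicted_read_time = rr * t_rr_new + sr * t_sr_new      # for the new disk
```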
These numbers (random read and sequential read) are associated with the workload queries and will not change for a different disk system. Given a disk specification (seek time, latency time, and transfer rate), we can calculate RRT and SRT, and we can get the buffer pool read time for a specific query using the known RR and SR. We also observed that there is a linear relationship between buffer pool read time and disk demand (Figure 14).
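That linear relationship can be captured with an ordinary least-squares fit over the per-query (buffer pool read time, disk demand) pairs; the sketch and data points below are ours, for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares: slope and intercept of y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Placeholder (buffer pool read time, disk demand) pairs, in seconds:
a, b = fit_line([30.0, 90.0, 250.0, 310.0], [33.0, 95.0, 260.0, 335.0])
disk_demand_est = a * 120.0 + b   # estimated demand for a 120 s read time
```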
Figure 14. The Linear Correlation Between Total Buffer Pool Read Time and Disk Demand (Baserver, database size 1GB, 1 disk, 1 CPU)

Based on this linear equation, we can calculate the disk demand from the buffer pool read time. Figure 15 shows the estimated disk demand and the actual measured disk demand for one system (Jaguar).

[Figure 15: actual vs. estimated disk demand for different numbers of clients (Jaguar)]

Consequently, once we know the disk demand, we can plug it into our QNM and estimate the performance for a new disk specification. Figure 16 shows the results.

Figure 16. The response time (actual, calculated using measured disk demand, and calculated using estimated disk demand) for different population sizes (Cougar, database size 1GB, 1 disk, 1 CPU)

5.4 Different CPU Speed

[...]
6. [CONCLUSIONS]

Capacity planning techniques are needed to avoid the pitfalls of inadequate capacity and to meet users' performance expectations in a cost-effective manner. This paper attempts to lay a foundation for carrying out capacity planning studies for DBMSs with OLAP workloads. The main contributions of this paper are a study of the DBMS factors influencing OLAP performance, the design and validation of a queueing network model that captures the main features of the DBMS behavior, and the use of a quantitative approach to project DBMS performance with an OLAP workload.

We first build a workload model for OLAP, which is based on the TPC-H benchmark, and investigate the impact of DBMS tuning (such as buffer pool, sort heap size, and prefetching) on OLAP performance. We found there are certain thresholds on buffer pool and sort heap size in a DBMS with an OLAP workload, and [...].

Based on these results, we then propose a queueing network model (QNM) to represent the system under study and give the results of preliminary experiments to validate this model. We have indicated, through discussion and example, how to modify the parameters of this baseline model to represent various common changes to the hardware (CPU and disk upgrades) and workload (number of users). Amdahl's law is used to estimate the multiple disk effect on DBMS performance when changing the number of disks. A mathematical approach, based on the disk configuration (seek time, latency time, and transfer rate) and DBMS internal indices (physical reads, read time), is proposed and used to predict disk demands for different disks.

A number of related issues require further study. First, systems with larger numbers of CPUs and disks and larger database sizes need to be tested to verify and quantify the multiple processor effect, the relationships between CPU demand and processor speed, CPU demand and database size, and disk demand and database size. The second issue is to integrate our model with the previous QNM for an OLTP workload [ZAWA02], generating a general workload model. Third, similar studies for DBMSs using other benchmark workloads such as TPC-W (a Web workload) should be conducted. The future direction of this model is to allow the user to define his/her database and workload (SQL statements, batch files, utilities such as copy, recover and rebuild), and then use this model to estimate the performance of different systems under the specified workloads.

Like programming, modeling is more of an art than a science. The more information is supplied, the more accurate the model. Also, the more information provided, the more difficult it is to build the model, and the less adaptable the model will be. There is a trade-off between accuracy, cost and adaptability in capacity planning. The challenge is to come up with a scheme that is rich enough to be useful, and yet [...].

REFERENCES

[CUI03] X. Cui. "Capacity Planning for Database Management Systems with OLAP Workloads", M.Sc. Thesis, School of Computing, Queen's University (2003).

[DB200] "DB2/UDB Estimator Help", DB2/UDB Estimator V7, IBM Corp. (2000).

[DB203] "DB2 Universal Database Administration Guide: Performance", IBM Corporation (2003).

[GRAH78] G. S. Graham. "Queueing Network Models of Computer System Performance", Computing Surveys, Vol. 10, No. 3, September (1978).

[GUNT96] N. J. Gunther. "Understanding the MP Effect: Multiprocessing in Pictures", CMG Conference Proceedings (1996).
[LAZO84] E. Lazowska, J. Zahorjan, S. Graham, and K. Sevcik. "Quantitative System Performance: Computer System Analysis Using Queueing Network Models", Prentice Hall, Englewood Cliffs, N.J. (1984).

[MENA98] D. Menasce and V. A. F. Almeida. "Capacity Planning for Web Performance: Metrics, Models, & Methods", Prentice Hall (1998).

[PATT98] D. A. Patterson and K. K. Keeton. "Hardware Technology Trends and Database Opportunities", Keynote address at ACM SIGMOD'98, Seattle, Washington, June (1998). Available from http://www.cs.berkeley.edu/~pattrsn/talks/sigmod98-keynote-color.pdf

[SAUE81] C. H. Sauer and K. M. Chandy. "Computer Systems Performance Modeling", Prentice Hall (1981).

[SHAS] D. Shasha and P. Bonnet. Talk notes, available from http://www.distlab.dk/dbtune/.
[TPC] Transaction Processing Performance Council. http://www.tpc.org

[VIJA99] J. Vijayan. "Capacity Planning More Vital Than Ever", Computerworld, February (1999).

[ZAWA02] H. Zawawy, P. Martin and H. Hassanein. "Supporting Capacity Planning for DB2/UDB", Proceedings of CASCON 2002, pp. 89-97, September (2002).

[IBM and DB2 are trademarks of International Business Machines] Corporation in the United States, other countries, or both.

Microsoft and Windows NT are registered trademarks of Microsoft Corporation in the United States, other countries, or both.

Pentium is a trademark of Intel Corporation in the United States, other countries, or both.