CA Performance Handbook
for DB2 for z/OS
About the Contributors from Yevich, Lawson and Associates Inc.
DAN LUKSETICH is a senior DB2 DBA. He works as a DBA, application architect, presenter,
author, and teacher. Dan has over 17 years of experience working with DB2 as a DB2 DBA, application
architect, system programmer, and COBOL and BAL programmer, working on major
implementations in z/OS, AIX, and Linux environments.
His experience includes DB2 application design and architecture, database administration,
complex SQL and SQL tuning, performance audits, replication, disaster recovery, stored
procedures, UDFs, and triggers.
SUSAN LAWSON is an IBM Gold Consultant for DB2 and zSeries. She has authored the IBM ‘DB2 for
z/OS V8 DBA Certification Guide’, the ‘DB2 for z/OS V7 Application Programming Certification
Guide’, and the ‘DB2 for z/OS V9 DBA Certification Guide’ (2007). She also co-authored several
books including ‘DB2 High Performance Design and Tuning’ and ‘DB2 Answers’, and is a
frequent speaker at user group and industry events. (Visit DB2Expert.com)
About CA
CA (NYSE: CA), one of the world's largest information technology (IT) management
software companies, unifies and simplifies the management of enterprise-wide IT for greater
business results. Our vision, tools and expertise help customers manage risk, improve service,
manage costs and align their IT investments with their business needs. CA Database
Management encompasses this vision with an integrated and comprehensive solution for
database design and modeling, database performance management, database administration,
and database backup and recovery across multiple database systems and platforms.
Copyright © 2007 CA. All Rights Reserved. One CA Plaza, Islandia, N.Y. 11749. All trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.
This handbook assumes a good working knowledge of DB2 and SQL, and is designed to help
you build good performance into the application, database, and the DB2 subsystem. It provides
techniques to help you monitor DB2 for performance, and to identify and tune production
performance problems.
Many people spend a lot of time thinking about DB2 performance, while others spend no time
at all on the topic. Nonetheless, performance is on everyone’s mind when one of two things
happens: an online production system does not respond within the time specified in a service
level agreement (SLA), or an application uses more CPU (and thus costs more money) than is
desired. Most of the efforts involved in tuning during these situations revolve around fire
fighting, and once the problem is solved the tuning effort is abandoned.
This handbook aims to provide information that performance fire fighters and performance
tuners need. It also provides the information necessary to design databases for high
performance, to understand how DB2 processes SQL and transactions, and to tune those
processes for performance. For database performance management, the philosophy of
performance tuning begins with database design, and extends through (database and
application) development and production implementation.
An understanding of DB2 performance, even at a high level, goes a long way toward ensuring
well-performing applications, databases, and subsystems. Among other things, this handbook provides:
• DB2 subsystem performance information, and advice for subsystem configuration and tuning
• Commentary for proper application design for performance
• Real-world tips for performance tuning and monitoring
Most people don’t inherently care about performance. They want their data, they want it now,
they want it accurate, they want it all the time, and they want it to be secured. Of course, once
they get these things they are also interested in getting the data fast, and at the lowest cost
possible. Enter performance design and tuning.
Getting the best possible performance from DB2 databases and applications means a number
of things. First and foremost, it means that users do not need to wait for the information they
require to do their jobs. Second, it means that the organization using the application is not
spending too much money running the application. The methods for designing, measuring, and
tuning for performance vary from application to application. The optimal approach depends
upon the type of application. Is it an online transaction processing (OLTP) system? A data
warehouse? A batch processing system?
Your site may use one or all of the following to measure performance:
• User complaints
• Overall response time measurements
• Service level agreements (SLAs)
• System availability
• I/O time and waits
• CPU consumption and associated charges
• Locking time and waits
In other situations you may be dealing with packaged software applications that cause
performance problems. In these situations it may be difficult to tune the application but quite
possible to improve performance through database and subsystem tuning. (See Chapter 9.)
Designing an application with regards to performance will help avoid many performance
problems after the application has been deployed. Proactive performance engineering can
help eliminate redesign, recoding, and retrofitting during implementation in order to satisfy
performance expectations, or alleviate performance problems. Proactive performance
engineering allows you to analyze and stress test the designs before they are implemented.
Proper techniques (and tools) can turn performance engineering research and testing from
a lengthy process inhibiting development to one that can be performed efficiently prior
to development.
There are many options for proper index and table design for performance. These options
depend on the type of application that is going to be implemented, and specifically on the
type of data access (random versus sequential, read versus update, etc.). In addition, the
application itself has to be properly designed for performance. This involves understanding the
application data access patterns, as well as the units of work (UOWs) and inputs. Taking these
things into consideration, and designing the application and database accordingly is the best
defense against performance problems. (See Chapters 4 and 7.)
When it comes to database access, finding and fixing the worst performing SQL statements is
the top priority. However, how does one prioritize the worst culprits? Are problems caused by
the statements that are performing table space scans, or the ones with matching index access?
Performance, of course, is relative, and finding the least efficient statement is a matter of
understanding how that statement is performing relative to the business need that it serves.
(See Chapter 6.)
CHAPTER 2
SQL and Access Paths
One of the most important things to understand about DB2, and DB2 performance, is that
DB2 offers a level of abstraction from the data. We don’t have to know where exactly the data
physically exists. We don’t need to know where the datasets are, we don’t need to know the
access method for the data, and we don’t need to know what index to use. All we need to know
is the name of the table, and the columns we are interested in. DB2 takes care of the rest. This
is a significant business advantage in that we can quickly build applications that access data
via a standardized language that is independent of any access method or platform. This
enables ultra-fast development of applications, portability across platforms, and all the power
and flexibility that comes with a relational or object-relational design. What we pay in return
is more consumption of CPU and I/O.
Using DB2 for z/OS can be more expensive than more traditional programming and access
methods on the z/OS platform. Yes, VSAM and QSAM are generally cheaper than DB2. Yes,
flat file processing can be cheaper than DB2, sometimes! However, with continually shrinking
people resources and skill sets on the z/OS platform, utilizing a technology such as DB2 goes
a long way in saving programming time, increasing productivity, and portability. This means
leverage, and leverage goes a long way in the business world.
So, with the increased cost of using DB2 over long standing traditional access methods on
the z/OS platform, can we possibly get the flexibility and power of DB2, along with the best
performance possible? Yes, we can. All we need is an understanding of the DB2 optimizer, the
SQL language, and the trade-offs between performance, flexibility, portability, and people costs.
The optimizer is responsible for interpreting your queries, and determining how to access your
data in the most efficient manner. However, it can only utilize the best of the available access
paths; it cannot consider access paths that do not exist. It also can only
deal with the information that it is provided. That is, it is dependent upon the information we
give it. This is primarily the information stored in the DB2 system catalog, which includes the
basic information about the tables, columns, indexes and statistics. It doesn’t know about
indexes that could be built, or anything about the possible inputs to our queries (unless all
literal values are provided in a query), input files, batch sequences, or transaction patterns. This
is why an understanding of the DB2 optimizer, statistics, and access paths is very important,
but also important is the design of applications and databases for performance. These topics
are covered in the remainder of this guide, and so this chapter serves only as an education in
the optimizer, statistics, and access paths, and is designed to build a basis upon which proper
SQL, database, and application tuning can be conducted.
The DB2 optimizer is cost based, and so catalog statistics are critical to proper performance.
The optimizer uses a variety of statistics to determine the cost of accessing the data in tables.
These statistics can be cardinality statistics, frequency distribution statistics, or histogram
statistics. The type of query (static or dynamic), use of literal values, and runtime optimization
options will dictate which statistics are used. The cost associated with various access paths
will affect whether or not indexes are chosen, or which indexes are chosen, the access path,
and table access sequence for multiple table access paths. For this reason, maintaining accurate,
up-to-date statistics is critical to performance.
It should be noted that catalog statistics and database design are not the only things the DB2
optimizer uses to calculate and choose an efficient access path. Additional information, such as
central processor model, number of processors, and size of the RID pool, various installation
parameters, and buffer pool sizes are used to determine the access path. You should be aware
of these factors as you tune your queries.
There are three flavors of DB2 catalog statistics. Each provides different details about the data
in your tables, and DB2 will utilize these statistics to determine the best way to access the data
dependent upon the objects and the query.
Cardinality Statistics
DB2 needs an accurate estimate of the number of rows that qualify after applying various
predicates in your queries in order to determine the optimal access path. When multiple tables
are accessed the number of qualifying rows estimated for the tables can also affect the table
access sequence. Column and table cardinalities provide the basic, but critical information for
the optimizer to make these estimates.
Cardinality statistics reflect the number of rows in a table, or the number of distinct values for a
column in a table. These statistics provide the main source of what is known as the filter factor,
a percentage of the number of rows expected to be returned, for a column or a table. DB2 will
use these statistics to determine such things as whether access is via an index scan
or a table space scan, and, when joining tables, which table to access first. For example, for the
following statement embedded in a COBOL program:
SELECT EMPNO
FROM EMP
WHERE SEX = :SEX
DB2 has to choose what is the most efficient way to access the EMP table. This will depend
upon the size of the table, as well as the cardinality of the SEX column, and any index that
might be defined upon the table (especially one on the SEX column). It will use the catalog
statistics to determine the filter factor for the SEX column. In this particular case, 1/COLCARDF,
COLCARDF being a column in the SYSIBM.SYSCOLUMNS catalog table. The resulting fractional
number represents the number of rows that are expected to be returned based upon the
predicate. DB2 will use this number, the filter factor, to make decisions as to whether or not to
use an index (if available), or table space scan. For example, if the column cardinality of the
SEX column of the EMP table is 3 (male, female, unknown), then we know there are 3 unique
occurrences of a value of SEX in the table. In this case, DB2 determines a filter factor of 1/3, or
33% of the values. If the table cardinality of the table, as reflected in the CARDF column of the
SYSIBM.SYSTABLES catalog table, is 10,000 employees then the estimated number of rows
returned from the query will be 3,333. DB2 will use this information to make decisions about
which access path to choose.
If the predicate in the query above is used in a join to another table, then the filter factor
calculated will be used not only to determine the access path to the EMP table, but it could
also influence which table will be accessed first in the join.
SELECT E.EMPNO, D.DEPTNAME
FROM EMP E
INNER JOIN
DEPT D
ON E.WORKDEPT = D.DEPTNO
WHERE E.SEX = :SEX
In addition to normal cardinality statistics, statistics can be gathered upon groups of columns,
which is otherwise known as columns correlation statistics. This is very important to know
because again the DB2 optimizer can only deal with the information it is given. Say, for
example you have the following query:
SELECT E.EMPNO
FROM EMP E
WHERE E.SALARY > :SALARY
AND E.EDLEVEL = :EDLEVEL
Now, imagine in a truly hypothetical example that the amount of money that someone is paid
is correlated to the amount of education they have received. For example, an intern being paid
minimum wage won’t be paid as much as an executive with an MBA. However, if the DB2
optimizer is not given this information it has to multiply the filter factor for the SALARY
predicate by the filter factor for the EDLEVEL predicate. This may result in an exaggerated filter
factor for the table, and could negatively influence the choice of index, or the table access
sequence of a join. For this reason, it is important to gather column correlation statistics on
any columns that might be correlated. More conservatively, it may be wise to gather column
correlation statistics on any columns referenced together in the WHERE clause.
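As a minimal sketch (using the DB2 sample database names, which may differ at your site), correlation
statistics for the SALARY and EDLEVEL columns of the EMP table could be gathered with the COLGROUP
option of RUNSTATS:
RUNSTATS TABLESPACE DSN8D81A.DSN8S81E
TABLE(DSN8810.EMP)
COLGROUP(SALARY, EDLEVEL)
The COLGROUP keyword tells RUNSTATS to collect cardinality for the group of columns as a whole,
rather than only for each column individually.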
Frequency Distribution Statistics
Skewed distributions can have a negative influence on query performance if DB2 does not
know about the distribution of the data in a table and/or the input values in a query predicate.
In the following query:
SELECT EMPNO
FROM EMP
WHERE SEX = :SEX
DB2 has no idea what the input value of the :SEX host variable is, and so it uses the default
formula of 1/COLCARDF. Likewise, if the following statement is issued:
SELECT EMPNO
FROM EMP
WHERE SEX = 'M'
and most of the values are ‘M’ or ‘F’, and DB2 has only cardinality statistics available, then the
same formula and filter factor applies. This is where distribution statistics can pay off.
DB2 has the ability to collect distribution statistics via the RUNSTATS utility. These statistics
reflect the percentage of frequently occurring values. That is, you can collect the information
about the percentage of values that occur most or least frequently in a column in a table. These
frequency distribution statistics can have a dramatic impact on the performance of the
following types of queries:
• Dynamic or static SQL with embedded literals
• Static SQL with host variables bound with REOPT(ALWAYS)
• Static or dynamic SQL against nullable columns with host variables that cannot be null
• Dynamic SQL with parameter markers bound with REOPT(ONCE), REOPT(ALWAYS), or
REOPT(AUTO)
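As a sketch (again using the sample object names), frequency statistics for the most common values
of the SEX column could be collected like this:
RUNSTATS TABLESPACE DSN8D81A.DSN8S81E
TABLE(DSN8810.EMP)
COLGROUP(SEX) FREQVAL COUNT 3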
So, if we gathered frequency distribution statistics for our SEX column in the previous
examples, we may find the following frequency values in the SYSIBM.SYSCOLDIST table
(simplified in this example):
VALUE FREQUENCY
‘M’ .49
‘F’ .49
‘U’ .01
Now, if DB2 has this information, and is provided with the following query:
SELECT EMPNO
FROM EMP
WHERE SEX = 'U'
It can determine that the value ‘U’ represents about 1% of the values of the SEX column. This
additional information can have dramatic effects on access path and table access sequence
decisions.
Histogram Statistics
Histogram statistics improve on distribution statistics in that they provide value-distribution
statistics that are collected over the entire range of values in a table. This goes beyond the
capabilities of distribution statistics in that distribution statistics are limited to only the most
or least occurring values in a table (unless they are, of course, collected all the time for every
single column which can be extremely expensive and difficult). To be more specific, histogram
statistics summarize data distribution on an interval scale, and span the entire distribution of
values in a table.
Histogram statistics divide values into quantiles. A quantile defines an interval of values. The
number of intervals is determined via a specification of quantiles in a RUNSTATS
execution. Thus a quantile will represent a range of values within the table. Each quantile will
contain approximately the same percentage of rows. This can go beyond the distribution
statistics in that all values are represented.
Now, the previous example is no good because we had only three values for the SEX column of
the EMP table. So, we’ll use a hypothetical example with our EMP sample table. Let’s suppose,
theoretically, that we needed to query on the middle initial of people’s names. Those initials are
most likely in the range of ‘A’ to ‘Z’. If we had only cardinality statistics then any predicate
referencing the middle initial column would receive a filter factor of 1/COLCARDF, or 1/26. If we
had distribution statistics then any predicate that used a literal value could take advantage of
the distribution of values to determine the best access path. That determination would be
dependent upon the value being recorded as one of the most or least occurring values. If it is
not, then the filter factor is determined based upon the difference in frequency of the remaining
values. Histogram statistics eliminate this guesswork.
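Assuming DB2 9 and the sample EMP table, histogram statistics over five quantiles on the MIDINIT
column could be collected with a RUNSTATS execution similar to this sketch:
RUNSTATS TABLESPACE DSN8D91A.DSN8S91E
TABLE(DSN8910.EMP)
COLGROUP(MIDINIT) HISTOGRAM NUMQUANTILES 5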
If we calculated histogram statistics for the middle initial, they may look something like this:
QUANTILE LOW VALUE HIGH VALUE CARDINALITY FREQUENCY
1 A G 5080 20%
2 H L 4997 19%
3 M Q 5001 20%
4 R U 4900 19%
5 V Z 5100 20%
Now, the DB2 optimizer has information about the entire span of values in the table, and any
possible skew. This is especially useful for range predicates such as this:
SELECT EMPNO
FROM EMP
WHERE MIDINIT BETWEEN 'A' AND 'G'
Access Paths
There are a number of access paths that DB2 can choose when accessing the data in your
database. This section describes those access paths, as well as when they are effective. The
access paths are exposed via use of the DB2 EXPLAIN facility. The EXPLAIN facility can be
invoked via one of these techniques:
• EXPLAIN SQL statement
• EXPLAIN(YES) BIND or REBIND parameter
• Visual Explain (DB2 V7, DB2 V8, DB2 9)
• Optimization Service Center (DB2 V8, DB2 9)
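As a brief sketch (the QUERYNO value and the query itself are arbitrary examples), an access path can
be externalized with the EXPLAIN statement:
EXPLAIN PLAN SET QUERYNO = 100 FOR
SELECT EMPNO
FROM EMP
WHERE WORKDEPT = 'D01'
followed by a query against the PLAN_TABLE such as:
SELECT QUERYNO, METHOD, ACCESSTYPE, MATCHCOLS, INDEXONLY, PREFETCH
FROM PLAN_TABLE
WHERE QUERYNO = 100
ORDER BY QBLOCKNO, PLANNO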
Table Space Scan
A table space scan is not necessarily a bad thing. It depends upon the nature of the request
and the frequency of access. Sure, running a table space scan every second in support of
online transactions is probably a bad thing. A table space scan in support of a report that spans
the content of an entire table once per day is probably a good thing.
For example, assume that the EMP table is partitioned on the EMPNO column, and the following
query is issued:
SELECT LASTNAME
FROM EMP
WHERE EMPNO BETWEEN '000100' AND '000200'
DB2 can choose an index based access method if there was an index available on the EMPNO
column. DB2 could also choose a table space scan, but utilize the partitioning key values to
access only the partition in which the predicate matches.
DB2 can also choose a secondary index for the access path, and still eliminate partitions using
the partitioning key values if those values are supplied in a predicate. For example, in the
following query:
SELECT LASTNAME
FROM EMP
WHERE WORKDEPT = 'D11'
AND EMPNO BETWEEN '000100' AND '000200'
DB2 could choose a partitioned secondary index on the WORKDEPT column, and eliminate
partitions based upon the range provided in the query for the EMPNO.
DB2 can employ partition elimination for predicates coded against the partitioning key using:
• Literal values (DB2 V7, DB2 V8, DB2 9)
• Host variables (DB2 V7 with REOPT(VARS), DB2 V8, DB2 9)
• Joined columns (DB2 9)
Index Access
DB2 can utilize indexes on your tables to quickly access the data based upon the predicates in
the query, and to avoid a sort in support of the use of the DISTINCT clause, as well as for the
ORDER BY and GROUP BY clauses, and INTERSECT, and EXCEPT processing.
There are several types of index access. The index access is indicated in the PLAN_TABLE by
an ACCESSTYPE value of either I, I1, N, MX, or DX.
DB2 will match the predicates in your queries to the leading key columns of the indexes
defined against your tables. This is indicated by the MATCHCOLS column of the PLAN_TABLE.
If the number of columns matched is greater than zero, then the index access is considered a
matching index scan. If the number of matched columns is equal to zero, then the index access
is considered a non-matching index scan. In a non-matching index scan all of the key values
and their record identifiers (RIDs) are read. A non-matching index scan is typically used if DB2
can use the index in order to avoid a sort when the query contains an ORDER BY, DISTINCT, or
GROUP BY clause, and also possibly for an INTERSECT, or EXCEPT.
The matching predicates on the leading key columns are equal (=) or IN predicates. This would
correspond to an ACCESSTYPE value of either “I” or “N”, respectively. The predicate that
matches the last index column can be an equal, IN, IS NOT NULL, or a range predicate (<, <=, >,
>=, LIKE, or BETWEEN). For example, for the following query assume that the EMPPROJACT
DB2 sample table has an index on the PROJNO, ACTNO, EMSTDATE, and EMPNO columns:
SELECT EMPTIME
FROM EMPPROJACT
WHERE PROJNO = 'AD3100'
AND ACTNO = 10
AND EMSTDATE > '2002-01-01'
AND EMPNO = '000010'
In the above example DB2 could choose a matching index scan matching on the PROJNO,
ACTNO, and EMSTDATE columns. It does not match on the EMPNO column due to the fact
that the previous predicate is a range predicate. However, the EMPNO column can be applied
as an index screening column. That is, since it is a stage 1 predicate (predicate stages are
covered in Chapter 3) DB2 can apply it to the index entries after the index is read. So, although
the EMPNO column does not limit the range of entries retrieved from the index it can eliminate
the entries as the index is read, and therefore reduce the number of data rows that have to be
retrieved from the table.
If all of the columns specified in a statement can be found in an index, then DB2 may
choose index only access. This is indicated in the PLAN_TABLE with the value “Y” in the
INDEXONLY column.
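For example, assuming a hypothetical index on the LASTNAME and FIRSTNME columns of the EMP
table, the following query could be satisfied with index only access, since every column it references
is contained in the index:
SELECT LASTNAME, FIRSTNME
FROM EMP
WHERE LASTNAME = 'HAAS'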
DB2 is also able to access indexes via multi-index access. This is indicated via an ACCESSTYPE
of “M” in the PLAN_TABLE. A multi-index access involves reading multiple indexes, or the
same index multiple times, gathering qualifying RID values together in a union or intersection
of values (depending upon the predicates in the SQL statement), and then sorting them in data
page number sequence. In this way a table can still be accessed efficiently for more complicated
queries. With multi-index access each operation is represented in the PLAN_TABLE in an
individual row. These steps include the matching index access (ACCESSTYPE = “MX” or “DX”)
for regular indexes or DOCID XML indexes, and then unions or intersections of the RIDs
(ACCESSTYPE = “MU”, “MI”, “DU”, or “DI”).
The following example demonstrates a SQL statement that can perhaps utilize multi-index
access if there is an index on the LASTNAME and FIRSTNME columns of the EMP table.
SELECT EMPNO
FROM EMP
WHERE LASTNAME = 'HAAS'
OR (LASTNAME = 'LUTZ' AND FIRSTNME = 'JENNIFER')
With the above query DB2 could possibly access the index twice, union the resulting RID
values together (due to the OR condition in the WHERE clause), and then access the data.
List Prefetch
The preferred method of table access, when access is via an index, is by utilizing a clustered
index. When an index is defined as clustered, then DB2 will attempt to keep the data in the
table in the same sequence as the key values in the clustering index. Therefore, when the table
is accessed via the clustering index, and the data in the table is well organized (as represented
by the catalog statistics), then the access to the data in the table will typically be sequential
and predictable. When the data in a table is accessed via a non-clustering index, or via the
clustering index with a low cluster ratio (the table data is disorganized), the access to the data
can be very random and unpredictable. In addition, the same data pages could be read
multiple times.
When DB2 detects that access to a table will be via a non-clustering index, or via a clustering
index that has a low cluster ratio, then DB2 may choose an index access method called list
prefetch. List prefetch will also be used as part of the access path for multi-index access and
access to the inner table of a hybrid join. List prefetch is indicated in the PLAN_TABLE via a
value of “L” in the PREFETCH column.
List prefetch will access the index via a matching index scan of one or more indexes, and
collect the RIDs into the RID pool, which is a common area of memory. The RIDs are then
sorted in ascending order by the page number, and the pages in the table space are then
retrieved in a sequential or skip sequential fashion. List prefetch will not be used if the
matching predicates include an IN-list predicate.
List prefetch will not be chosen as the access path if DB2 determines that the RIDs to be
processed will take more than 50% of the RID pool when the query is executed. Also, list
prefetch can terminate during execution if DB2 determines that more than 25% of the rows of
the table must be accessed. In these situations (called RDS failures) the access path changes
to a table space scan.
In general, a list prefetch is useful for access to a moderate number of rows when access to the
table is via a non-clustering or clustering index when the data is disorganized (low cluster
ratio). It is usually less useful for queries that process large quantities of data or very small
amounts of data.
Nested Loop Join
A nested loop join will repeatedly access the inner table as rows are accessed in the outer
table. Therefore, the nested loop join is most efficient when a small number of rows qualify for
the outer table, and is a good access path for transactions that are processing little or no data.
It is important that there exist an index that supports the join on the inner table, and for the
best performance that index should be a clustering index in the same order as the outer table.
In this way the join operation can be a sequential and efficient process. DB2 (DB2 9) can also
dynamically build a sparse index on the inner table when no index is available on the inner
table. It may also sort the outer table if it determines that the join columns are not in the same
sequence as the inner table, or the join columns of the outer table are not in an index or are in a
clustering index where the table data is disorganized.
The nested loop join is the method selected by DB2 when you are joining two tables together,
and do not specify join columns.
Hybrid Join
The preferred method for a join is to join two tables together via a common clustering index. In
these cases it is likely that DB2 will choose a nested loop join as the access method for small
amounts of data. If DB2 detects that the inner table access will be via a non-clustering index or
via a clustering index in which the cluster ratio is low, then DB2 may choose a hybrid join as
the access method. The hybrid join is indicated via a value of “4” in the METHOD column of
the PLAN_TABLE.
In a hybrid join DB2 will scan the outer table either via a table space scan or via an index, and
then join the qualifying rows of the outer table with the RIDs from the matching index entries.
DB2 also creates a RID list, as it does for list prefetch, for all the qualifying RIDs of the inner
table. Then DB2 will sort the outer table and RIDs, creating a sorted RID list and an
intermediate table. DB2 will then access the inner table via list prefetch, and join the inner table
to the intermediate table.
Hybrid join can outperform a nested loop join when the inner table access is via a non-clustering
index or clustering index for a disorganized table. This is especially true if the query
will be retrieving more than a trivial amount of data. That is, the hybrid join is better if the
equivalent nested loop join would be accessing the inner table in a random fashion, and thus
accessing the same pages multiple times. Hybrid join takes advantage of the list prefetch
processing on the inner table to read all the data pages only once in a sequential or skip
sequential manner. Thus, hybrid join works best when the query processes more than a tiny
amount of data (where a nested loop join would be better), but less than a very large amount of
data (where a merge scan join would be better). Since list prefetch is utilized, the hybrid join can experience
RID list failures. This can result in a table space scan of the inner table on initial access to the
inner table, or a restart of the list prefetch processing upon subsequent accesses (all within the
same query). If you are experiencing RID list failures in a hybrid join then you should attempt to
encourage DB2 to use a nested loop join.
Merge Scan Join
In a merge scan join DB2 will scan both tables in the order of the join columns. If there are no
efficient indexes providing the join order for the join columns, then DB2 may sort the outer
table, the inner table, or both tables. The inner table is placed into a work file, and the outer
table will be placed into a work file if it has to be sorted. DB2 will then read both tables, and
join the rows together via a match/merge process.
Merge scan join is best for large quantities of data. The general rule should be that if you are
accessing small amounts of data in a transaction then nested loop join is best. Processing large
amounts of data in a batch program or a large report query? Then merge scan join is generally
the best.
Star Join
A star join is another join method that DB2 will employ in special situations in which it detects
that the query is joining tables that are a part of a star schema design of a data warehouse. The
star join is indicated via the value of “S” in the JOIN_TYPE column of the PLAN_TABLE. The
star join is best described in the DB2 Administration Guide (DB2 V7, DB2 V8) or the DB2
Performance Monitoring and Tuning Guide (DB2 9).
Read Mechanisms
DB2 can apply some performance enhancers when accessing your data. These performance
enhancers can have a dramatic effect on the performance of your queries and applications.
Sequential Prefetch
When your query reads data in a sequential manner, DB2 can elect to use sequential prefetch
to read the data ahead of the query requesting it. That is, DB2 can launch asynchronous read
engines to read data from the index and table page sets into the DB2 buffers. This hopefully
allows the data pages to already be in the buffers in expectation of the query accessing them.
The maximum number of pages read by a sequential prefetch operation is determined by
the size of the buffer pool used. When the buffer pool is smaller than 1000 pages, then the
prefetch quantity is limited due to the risk of filling up the buffer pool. If the buffer pool has
more than 1000 pages then 32 pages can be read with a single physical I/O.
Sequential prefetch is indicated via a value of “S” in the PREFETCH column of the
PLAN_TABLE. DB2 9 will only use sequential prefetch for a table space scan. Otherwise, DB2 9
will rely primarily on dynamic prefetch.
Dynamic Prefetch and Sequential Detection
Dynamic prefetch can be activated for single cursors, or for repetitive queries within a single
application thread. This could improve the performance of applications that issue multiple
SQL statements that access the data as a whole sequentially. It is also important, of course,
to cluster tables that are commonly accessed together to possibly take advantage of
sequential detection.
It should be noted that dynamic prefetch is dependent upon the bind parameter
RELEASE(DEALLOCATE). This is because the area of memory used to track the last eight
pages accessed is destroyed and rebuilt for RELEASE(COMMIT). It should also be noted that
RELEASE(DEALLOCATE) is not effective for remote connections, and so sequential detection
and dynamic prefetch are less likely for remote applications.
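As a sketch (the collection and package names are hypothetical), a locally attached batch package
could be rebound to enable these techniques:
REBIND PACKAGE(BATCHCOL.PGM1) RELEASE(DEALLOCATE)
Keep in mind that RELEASE(DEALLOCATE) holds resources such as table space intent locks until the
thread is deallocated, so it is a trade-off rather than a free performance win.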
Index Lookaside
Index lookaside is another powerful performance enhancer that is active only after an initial
access to an index. For repeated probes of an index DB2 will check the most recently accessed index
leaf page to see if the key is found on that page. If it is, then DB2 will not read from the root page of
the index down to the leaf page, but will simply use the values from the leaf page. This can have a
dramatically positive impact on performance in that getpage operations and I/Os can be avoided.
Index lookaside depends on the query reading index data in a predictable pattern. Typically this
happens for transactions that access the index via a common cluster with other
tables. Index lookaside is also dependent upon the RELEASE(DEALLOCATE) bind parameter,
much in the same way as sequential detection.
Sorting
DB2 may have to invoke a sort in support of certain clauses in your SQL statement, or in
support of certain access paths. DB2 can sort in support of an ORDER BY or GROUP BY clause,
to remove duplicates (DISTINCT, EXCEPT, or INTERSECT), or in join or subquery processing.
Sorting in an application will generally be faster than sorting in DB2 for small result
sets. However, by placing the sort in the program you remove some of the flexibility, power, and
portability of the SQL statement, along with the ease of coding. Perhaps instead you can take
advantage of the ways in which DB2 can avoid a sort.
For example, assume an index exists on the WORKDEPT, LASTNAME, and FIRSTNME columns of the
EMP table, and the following query is issued:
SELECT *
FROM EMP
ORDER BY WORKDEPT, LASTNAME, FIRSTNME
In the above query the sort will be avoided if DB2 uses the index on those three columns to
access the table. DB2 can also avoid a sort if any number of the leading columns of an index
match the ORDER BY clause, or if the query contains an equals predicate on the leading
columns (ORDER BY pruning). So, both of the following queries will avoid a sort if the
index is used:
SELECT *
FROM EMP
ORDER BY WORKDEPT
SELECT *
FROM EMP
WHERE WORKDEPT = 'C01'
ORDER BY LASTNAME
DB2 can also avoid a sort for a GROUP BY if the grouping columns match the leading columns
of an index, as in the following query:
SELECT WORKDEPT, AVG(SALARY)
FROM EMP
GROUP BY WORKDEPT
A DISTINCT, on the other hand, can generally avoid a sort only when a unique index exists on the
selected columns, so a common technique is to code a GROUP BY on all of the selected columns
in place of the DISTINCT and let a non-unique index avoid the sort. That is, instead of:
SELECT DISTINCT WORKDEPT, JOB
FROM EMP
you could code:
SELECT WORKDEPT, JOB
FROM EMP
GROUP BY WORKDEPT, JOB
When using this technique, make sure you document it properly so other programmers
understand why you are grouping on all columns.
Parallelism
DB2 can utilize something called parallel operations for access to your tables and/or indexes.
This can have a dramatic impact on performance for queries that process large
quantities of data across multiple table and index partitions. The response time for these data
or processor intensive queries can be dramatically reduced. There are two types of query
parallelism: query I/O parallelism and query CP parallelism.
I/O parallelism manages concurrent I/O requests for a single query. It can fetch data using
these multiple concurrent I/O requests into the buffers, and significantly reduce the response
time for large, I/O bound queries.
Query CP parallelism can break a large query up into smaller queries, and then run the smaller
queries in parallel on multiple processors. This can also significantly reduce the elapsed time
for large queries.
You can enable query parallelism by utilizing the DEGREE(ANY) parameter of a BIND
command, or by setting the value of the CURRENT DEGREE special register to the value of
‘ANY’. Be sure, however, that you utilize query parallelism only for larger queries, as there is
overhead in starting the parallel tasks. For small queries this can actually be a performance
detriment.
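A minimal sketch of enabling parallelism for dynamic SQL (the query shown is simply an example of a
larger, data intensive statement):
SET CURRENT DEGREE = 'ANY'
SELECT WORKDEPT, AVG(SALARY)
FROM EMP
GROUP BY WORKDEPT
For static SQL the equivalent is the DEGREE(ANY) parameter at bind time.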
DB2 can also utilize something called Sysplex query parallelism. In this situation DB2 can split
a query across multiple members of a data sharing group to utilize all of the processors on the
members for extreme query processing of very large queries.
CHAPTER 3
Predicates and SQL Tuning
This chapter will address the fundamentals of SQL performance and SQL processing in DB2.
In order to write efficient SQL statements, and to tune SQL statements, we need a basic
understanding of how DB2 optimizes SQL statements, and how it accesses your data.
When writing and tuning SQL statements, we should maintain a simple and important
philosophy. That being “filter as much as possible as early as possible”. We have to keep in
mind that the most expensive thing we can do is to travel from our application to DB2. Once
we are processing inside DB2 we need to do our filtering as close to the indexes and data
as possible.
We also need to understand how to reduce repeat processing. Keep in mind that the most
efficient SQL statement is the one that never executes. That is, we should do only what is
absolutely necessary to get the job done.
DB2 predicates are processed in two stages, known as stage 1 and stage 2 (and in this guide we
also describe a “stage 3” for filtering done in the application program). You can apply a very simple
philosophy in order to understand which of your predicates
might fall within one of these classifications. That is, the DB2 stage 1 engine understands your
indexes and tables, and can utilize an index for efficient access to your data. Only a stage 1
predicate can limit the range of data accessed on a disk. The stage 2 engine processes
functions and expressions, but is not able to directly access data in indexes and tables. Data
from stage 1 is passed to stage 2 for further processing, and so stage 1 predicates are generally
more efficient than stage 2 predicates. Stage 2 predicates cannot utilize an index, and thus
cannot limit the range of data retrieved from disk. Finally, a stage 3 predicate is a predicate that
is processed in the application layer. That is, filtering performed once the data is retrieved from
DB2 and processed in the application. Stage 3 predicates are the least efficient.
There is a table in the DB2 Administration Guide (DB2 V7 and DB2 V8) or the DB2
Performance Monitoring and Tuning Guide (DB2 9) that lists the various predicate forms, and
whether they are stage 1 indexable, stage 1, or stage 2. Only examples are given in this guide.
Stage 1 Indexable
The best performance for filtering will occur when stage 1 indexable predicates are used. But
just because a predicate is listed in the chart as stage 1 indexable does not mean it will be used
for index access or even processed in stage 1, as there are many other factors that must also be in
place. The first thing that determines whether a predicate is stage 1 is the syntax of the predicate;
the second is the type and length of constants used in the predicate. This is one area
which has improved dramatically with version 8 in that more data types can be promoted
inside stage 1 thus improving stage 1 matching. If the predicate, even though it would classify
as a stage 1 predicate, is evaluated after a join operation then it is a stage 2 predicate. All
indexable predicates are stage 1 but not all stage 1 predicates are indexable.
Stage 1 indexable predicates are predicates that can be used to match on the columns of an
index. The simplest example of a stage 1 predicate would be of the form <col op value>,
where col is a column of a table, op is an operator (=, >, <, >=, <=) and value represents a non-
column expression (an expression that does not contain a column from the table). Predicates
containing BETWEEN, IN (for a list of values), and LIKE (without a leading wildcard character)
can also be stage 1 indexable. The stage 1 indexable predicate, when an index is utilized,
provides the best filtering in that it can actually limit the range of data accessed from the index.
You should try to utilize stage 1 matching predicates whenever possible. Assuming that an
index exists on the EMPNO column of the EMP table, then the predicate in the following query
is a stage 1 indexable predicate:
SELECT LASTNAME
FROM EMP
WHERE EMPNO = '000010'
Other Stage 1
Just because a predicate is stage 1 does not mean that it can utilize an index. Some stage 1
predicates are not available for index access. These predicates (again, the best reference is the
chart in the IBM manuals) are generally of the form <col NOT op value>, where col is a column
of a table, op is an operator, and value represents a non-column expression, host variable, or
value. Predicates containing NOT BETWEEN, NOT IN (for a list of values), NOT LIKE, or LIKE
(with a leading wildcard character) are also non-indexable stage 1 predicates. Although
non-indexable stage 1 predicates cannot limit the range of data read from
an index, they are available as index screening predicates. The following is an example of a
non-indexable stage 1 predicate:
SELECT LASTNAME
FROM EMP
WHERE EMPNO NOT BETWEEN '000010' AND '000020'
Stage 2
The DB2 manuals give a complete description of when a predicate can be stage 2 versus stage
1. However, generally speaking stage 2 happens after data accesses and performs such things
as sorting and evaluation of functions and expressions. Stage 2 predicates cannot take
advantage of indexes to limit the data access, and are generally more expensive than stage 1
predicates because they are evaluated later in the processing.
Stage 2 predicates are generally those that contain column expressions, correlated subqueries,
and CASE expressions, among others. A predicate can also appear to be stage 1, but processed
as stage 2. For example, any predicate processed after a join operation is stage 2. Also,
although DB2 does a good job of promoting mismatched data types to stage 1 via casting (as
of version 8) some predicates with mismatched data types are stage 2. One example is a range
predicate comparing a character column to a character value that exceeds the length of the
column. The following are examples of stage 2 predicates (EMPNO is a character column of
fixed length 6):
SELECT LASTNAME
FROM EMP
WHERE SUBSTR(EMPNO, 1, 3) = '000'
SELECT LASTNAME
FROM EMP
WHERE EMPNO > '0000100'
Stage 3
A stage 3 predicate is a fictitious predicate we’ve made up to describe filtering that occurs
after the data has been retrieved from DB2. That is, filtering that is done in the application
program. You should have no stage 3 predicates. Performance is best served when all filtering
is done in the data server, and not in the application code. Imagine for a moment that a COBOL
program has read the EMP table. After reading the EMP table, the following statement is
executed:
IF WS-SALARY + WS-COMM < 50000
CONTINUE
ELSE
PERFORM PROCESS-ROW
END-IF.
This is an example of a stage 3 predicate. The data retrieved from the table isn’t used unless
the condition is false. Why did this happen? Because the programmer was told that no stage 2
predicates are allowed in SQL statements as part of a shop standard. The truth is, however,
that the following stage 2 predicate would always outperform a stage 3 predicate:
SELECT EMPNO
FROM EMP
WHERE SALARY + COMM >= 50000
Combining Predicates
It should be known that when simple predicates are connected together by an OR condition,
the resulting compound predicate will be evaluated at the higher stage of the two simple
predicates. The following example contains two simple predicates combined by an OR. The
first predicate is stage 1 indexable, and the second is non-indexable stage 1. The result is that
the entire compound predicate is stage 1 and not indexable:
SELECT EMPNO
FROM EMP
WHERE EMPNO = '000010'
OR SALARY <> 50000
In the next example the first simple predicate is stage 1 indexable, but the second (connected
again by an OR) is stage 2. Thus, the entire compound predicate is stage 2:
SELECT EMPNO
FROM EMP
WHERE EMPNO = '000010'
OR SUBSTR(LASTNAME, 1, 1) = 'H'
A Boolean term predicate is a simple or compound predicate that, when it evaluates to false for
a given row, makes the entire WHERE clause false for that row. Consider the following WHERE clause:
WHERE LASTNAME = 'HAAS'
AND MIDINIT = 'D'
If the predicate on the LASTNAME column evaluates as false, then the entire WHERE clause is
false. The same is true for the predicate on the MIDINIT column. DB2 can take advantage of
this in that it could utilize an index (if available) on either the LASTNAME column or the
MIDINIT column. This is opposed to the following WHERE clause:
WHERE LASTNAME = 'HAAS'
OR MIDINIT = 'D'
In this case if the predicate on LASTNAME or MIDINIT evaluates as false, then the rest of the
WHERE clause must be evaluated. These predicates are non-Boolean term.
We can modify predicates in the WHERE clause to take advantage of this to improve index
access. While the following query contains non-Boolean term predicates:
SELECT EMPNO
FROM EMP
WHERE (LASTNAME = 'HAAS' AND MIDINIT = 'I')
OR (LASTNAME = 'LUTZ' AND MIDINIT = 'D')
A redundant predicate can be added that, while making the WHERE clause functionally
equivalent, contains a Boolean term predicate, and thus could take better advantage of an
index on the LASTNAME column:
SELECT EMPNO
FROM EMP
WHERE ((LASTNAME = 'HAAS' AND MIDINIT = 'I')
OR (LASTNAME = 'LUTZ' AND MIDINIT = 'D'))
AND LASTNAME IN ('HAAS', 'LUTZ')
DB2 can also generate redundant predicates on its own through predicate transitive closure. For
example, given the following join:
SELECT *
FROM T1
INNER JOIN T2
ON T1.A = T2.B
WHERE T1.A = 1
DB2 will generate a redundant predicate on T2.B = 1. This gives DB2 more choices for filtering
and table access sequence. You can add your own redundant predicates, however DB2 will
not consider them when it applies predicate transitive closure, and so they will be redundant
and wasteful.
Predicates of the form <COL op value>, where the operation is an equals or range predicate,
are available for predicate transitive closure. So are BETWEEN predicates. However, predicates
that contain a LIKE, an IN, or subqueries are not available for transitive closure, and so it may
benefit you to code those predicates redundantly if you feel they can provide a benefit.
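For example, because LIKE predicates are not available for transitive closure, the second LIKE predicate
below has to be coded manually if it is to help DB2 filter T2 (the tables and the pattern are hypothetical):
SELECT *
FROM T1
INNER JOIN T2
ON T1.A = T2.B
WHERE T1.A LIKE 'AB%'
AND T2.B LIKE 'AB%'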
You need to look at your SQL statements and ask some questions. Is the statement needed?
Do I really need to run the statement again? Does the statement have a DISTINCT, and are
duplicates possible? Does the statement have an ORDER BY, and is the order important? Are
you repeatedly accessing code tables, and can the codes be cached within the program?
• Reduce the number of SQL statements. Each SQL statement is a programmatic call to the
DB2 subsystem that incurs fixed overhead for each call. Careful evaluation of program
processes can reveal situations in which SQL statements are issued that don’t need to be.
This is especially true for programmatic joins. If separate application programs are retrieving
data from related tables, then this can result in extra SQL statements issued. Code SQL joins
instead of programmatic joins. In one simple test of a programmatic join of two tables in a
COBOL program compared to the equivalent SQL join, the SQL join consumed 30% less CPU.
• Use stage 1 predicates. You should be aware of the stage 1 indexable, stage 1, and stage 2
predicates, and try to use stage 1 predicates for filtering whenever possible. See this section
in this chapter on promoting predicates to help determine if you can convert your stage 2
predicates to stage 1, or stage 1 non-indexable to indexable.
• Never use generic SQL statements. Generic SQL equals generic performance. Generic SQL
generally only has logic that will retrieve some specific data or data relationship, and leaves
the entire business rule processing in the application. Common modules, SQL that is used to
retrieve current or historical data, and modules that read all data into a copy area only to
have the caller use a few fields are all examples we’ve seen. Generic SQL and generic I/O
layers do not belong in high performing systems.
• Avoid unnecessary sorts. Avoiding unnecessary sorting is a requirement in any application, but
more so in a high performance environment, especially when a database is involved. Generally,
if ordering is always necessary then there should probably be indexes to support the
ordering. Sorting can be caused by GROUP BY, ORDER BY, DISTINCT, INTERSECT, EXCEPT,
and join processing. If ordering is required for a join process, it generally is a clear indication
that an index is missing. If ordering is necessary after a join process, then hopefully the result
set is small. The worst possible scenario is where a sort is performed as the result of the
SQL in a cursor, and the application only processes a subset of the data. In this case the sort
overhead needs to be removed. It is more an application of common sense. If the SQL has
to sort 100 rows and only 10 are processed by the application, then it may be better to sort
in the application, or perhaps an index needs to be created to help avoid the sort.
• Only sort truly necessary columns. When DB2 sorts data, the columns used to determine
the sort order actually appear twice in the sort rows. Make sure that you specify the sort
only on the necessary columns. For example, any column specified in an equals predicate in
the query does not need to be in the ORDER BY clause.
• Use the ON clause for all join predicates. By using explicit join syntax instead of implicit join
syntax you can make a statement easier to read, easier to convert between inner and outer
join, and harder to forget to code a join predicate.
• Avoid UNIONs (not necessarily UNION ALL). Are duplicates possible? If not, then replace
the UNION with a UNION ALL. Otherwise, see if it is possible to produce the same result
with an outer join, or in a situation in which the subselects of the UNION are all against the
same table try using a CASE expression and only one pass through the data instead.
• Use joins instead of subqueries. Joins can outperform subqueries for existence checking if
there are good matching indexes for both tables involved, and if the join won’t introduce any
duplicate rows in the result. DB2 can take advantage of predicate transitive closure, and also
pick the best table access sequence. These things are not possible with subqueries.
• Code the most selective predicates first. DB2 processes the predicates in a query in a
specific order:
– Indexable predicates are applied first in the order of the columns of the index
– Then other stage 1 predicates are applied
– Then stage 2 predicates are applied
Within each of the stages above DB2 will process the predicates in this order:
– All equals predicates (including single IN list and BETWEEN with only one value)
– All range predicates and predicates of the form IS NOT NULL
– Then all other predicate types
Within each grouping DB2 will process the predicates in the order they have been coded
in the SQL statement. Therefore, all SQL queries should be written to evaluate the most
restrictive predicates first to filter unnecessary rows earlier, reducing processing cost at a
later stage. This includes subqueries as well (within the grouping of correlated and non-
correlated).
• Use the proper method for existence checking (see the sketches after this list). For existence
checking using a subquery, an EXISTS predicate will generally outperform an IN predicate. For general existence checking
you could code a singleton SELECT statement that contains a FETCH FIRST 1 ROW ONLY
clause.
• Avoid unnecessary materialization. Running a transaction that processes little or no data?
You might want to use correlated references to avoid materializing large intermediate result
sets. Correlated references will encourage nested loop join and index access for transactions.
See the advanced SQL and performance section of this chapter for more details.
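The following sketches illustrate the two existence checking approaches described above (the tables,
columns, and values are only examples):
SELECT D.DEPTNO, D.DEPTNAME
FROM DEPT D
WHERE EXISTS
(SELECT 1
FROM EMP E
WHERE E.WORKDEPT = D.DEPTNO)
and, for general existence checking:
SELECT 1
FROM EMP
WHERE WORKDEPT = 'D01'
FETCH FIRST 1 ROW ONLY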
Promoting Predicates
We need to code SQL predicates as efficiently as possible, and so when you are writing
predicates you should make sure to use stage 1 indexable predicates whenever possible. If
you’ve coded a stage 1 non-indexable predicate, or a stage 2 predicate, then you should be
asking yourself if you can possibly promote those predicates to a more efficient stage.
If you have a stage 1 non-indexable predicate, can it be promoted to stage 1 indexable? Take
for example the following predicate:
WHERE ACCT_END_DATE <> '9999-12-31'
If we use the “end of the DB2 world” as an indicator of data that is still active, then why not
change the predicate to something that is indexable:
WHERE ACCT_END_DATE < '9999-12-31'
Do you have a stage 2 predicate that can be promoted to stage 1, or even stage 1 indexable?
Take for example this predicate:
WHERE ACCT_END_DATE + 30 DAYS > CURRENT DATE
The predicate above is applying date arithmetic to a column. This is a column expression,
which makes the predicate a stage 2 predicate. By moving the arithmetic to the right side of
the inequality we can make the predicate stage 1 indexable:
WHERE ACCT_END_DATE > CURRENT DATE - 30 DAYS
These examples can be applied as general rules for promoting predicates. It should be noted
that as of DB2 9 it is possible to create an index on an expression. An index on an expression
can be considered for improved performance of column expressions when it’s not possible to
eliminate the column expression in the query.
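A sketch of such an index follows; the index name and the expression are hypothetical:
CREATE INDEX XEMPTOTCOMP
ON EMP (SALARY + BONUS + COMM)
A predicate such as WHERE SALARY + BONUS + COMM > 100000 could then potentially use this index
even though it contains a column expression.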
Functions and expressions coded in your SQL statements are not free. For example, the SUBSTR
function in the following query:
SELECT SUBSTR(LASTNAME, 1, 1)
FROM EMP
will be executed for every row processed. If you need the ultimate in ease of coding, time to
delivery, portability, and flexibility, then code expressions and functions like this in your SQL
statements. If you need the ultimate in performance, then do the manipulation of data in your
application program.
CASE expressions can be very expensive, but CASE expressions will utilize “early out” logic
when processing. Take the following CASE expression as an example:
CASE WHEN C1 = 'A'
OR C1 = 'K'
OR C1 = 'T'
OR C1 = 'Z'
THEN 'OK' ELSE 'NOT OK' END
If most of the time the value of the C1 column is a blank, then the following functionally
equivalent CASE expression will consume significantly less CPU:
CASE WHEN C1 <> ' ' AND
(C1 = 'A'
OR C1 = 'K'
OR C1 = 'T'
OR C1 = 'Z')
THEN 'OK' ELSE 'NOT OK' END
DB2 will take the early out from the CASE expression at the first not-true condition of an AND, or the
first true condition of an OR. So, in addition to testing for the blank value above, all the other values
should be tested with the most frequently occurring values first.
CORRELATION In general, correlation encourages nested loop join and index access. This can
be very good for your transaction queries that process very little data. However, it can be bad
for your report queries that process vast quantities of data. The following query is generally a
very good performer when processing large quantities of data:
SELECT TAB1.EMPNO, TAB1.LASTNAME, TAB2.AVGSAL, TAB2.HDCOUNT
FROM
(SELECT EMPNO, LASTNAME, WORKDEPT
FROM EMP
WHERE JOB = 'SALESREP') AS TAB1
INNER JOIN
(SELECT WORKDEPT, AVG(SALARY) AS AVGSAL, COUNT(*) AS HDCOUNT
FROM EMP
GROUP BY WORKDEPT) AS TAB2
ON TAB1.WORKDEPT = TAB2.WORKDEPT
If there is one sales rep per department, or if most of the employees are sales reps, then the
above query would be the most efficient way to retrieve the data. In the query above, the entire
employee table will be read and materialized in the nested table expression called TAB2. It is
very likely that the merge scan join method will be used to join the materialized TAB2 to the
first table expression called TAB1.
Suppose now that the employee table is extremely large, but that there are very few sales reps,
or perhaps all the sales reps are in one or a few departments. Then the above query may not be
the most efficient due to the fact that the entire employee table still has to be read in TAB2, but
most of the results of that nested table expression won’t be returned in the query. In this case,
the following query may be more efficient:
SELECT TAB1.EMPNO, TAB1.LASTNAME, TAB2.AVGSAL, TAB2.HDCOUNT
FROM EMP TAB1,
TABLE(SELECT AVG(E2.SALARY) AS AVGSAL, COUNT(*) AS HDCOUNT
FROM EMP E2
WHERE E2.WORKDEPT = TAB1.WORKDEPT) AS TAB2
WHERE TAB1.JOB = 'SALESREP'
While this statement is functionally equivalent to the previous statement, it operates in a very
different way. In this query the employee table referenced as TAB1 will be read first, and then
the nested table expression will be executed repeatedly in a nested loop join for each row that
qualifies. An index on the WORKDEPT column is a must.
MERGE VERSUS MATERIALIZATION FOR VIEWS AND NESTED TABLE EXPRESSIONS When you
have a reference to a nested table expression or view in your SQL statement DB2 will possibly
merge that nested table expression or view with the referencing statement. If DB2 cannot
merge then it will materialize the view or nested table expression into a work file, and then
apply the referencing statement to that intermediate result. IBM states that merge is more
efficient than materialization. In general, that statement is correct. However, materialization
may be more efficient if your complex queries have the following combined conditions:
• Nested table expressions or view references, especially multiple levels of nesting
• Columns generated in the nested expressions or views via application of functions,
user-defined functions, or other expressions
• References to the generated columns in the outer referencing statement
In general, DB2 will materialize when some sort of aggregate processing is required inside the
view or nested table expression. So, typically this means that the view or nested table expression
contains aggregate functions, grouping (GROUP BY), or DISTINCT. If materialization is not
required then the merge process happens. Take, for example, a query of this general form:

SELECT MAX(CNT)
FROM
   (SELECT WORKDEPT, COUNT(*) AS CNT
    FROM EMP
    GROUP BY WORKDEPT) AS TAB1

DB2 will materialize TAB1 in the above example, because the nested table expression contains
an aggregate function and a GROUP BY. Now, take a look at the next query, in which the nested
table expression generates a column (COL1) from a CASE expression (the ACCT_BAL column
used here is illustrative):

SELECT AVG(COL1)
      ,SUM(COL1)
FROM
   (SELECT CASE WHEN ACCT_BAL < 0 THEN 0 ELSE ACCT_BAL END AS COL1
    FROM YLA.ACCT_TABLE) AS TAB1
In this query the nested table expression contains no DISTINCT, GROUP BY, or aggregate
functions, so DB2 will merge the inner table expression with the outer referencing statement.
Since there are two references to COL1 in the outer referencing statement, the CASE
expression in the nested table expression will be calculated twice during query execution.
The merged statement would look something like this:

SELECT AVG(CASE WHEN ACCT_BAL < 0 THEN 0 ELSE ACCT_BAL END)
      ,SUM(CASE WHEN ACCT_BAL < 0 THEN 0 ELSE ACCT_BAL END)
FROM YLA.ACCT_TABLE
For this particular query the merge is probably more efficient than materialization. However, if
you have multiple levels of nesting and many references to generated columns, merge can be
less efficient than materialization. In these specific cases you may want to introduce a non-
deterministic function into the view or nested table expression to force materialization. We use
the RAND() function.
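For example, a minimal sketch of this technique, assuming a view named MY_VIEW and
columns COL1 and COL2 (all hypothetical names), might look like this:

SELECT COL1, COL2
FROM
   (SELECT COL1, COL2, RAND() AS FORCE_MAT   -- RAND() is non-deterministic, so DB2 materializes TAB1
    FROM MY_VIEW) AS TAB1

The generated FORCE_MAT column is simply never referenced by the outer query.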
UNION IN A VIEW OR NESTED TABLE EXPRESSION You can place a UNION or UNION ALL into a
view or nested table expression. This allows for some really complex SQL processing, but also
enables you to create logically partitioned tables. That is, you can store data in multiple tables
and then reference them all together as one table in a view. This is useful for quickly rolling
through yearly tables, or to create optional table scenarios with little maintenance overhead.
While each SQL statement in a union in view (or table expression) results in an individual
query block, and SQL statements written against our view are distributed to each query block,
DB2 does employ a technique to prune query blocks for efficiency. DB2 can, depending upon
the query, prune (eliminate) query blocks at either statement compile time or during statement
execution. Consider an account history view of this general form, where the range predicates
that divide the data between the two history tables are illustrative:

CREATE VIEW V_ACCOUNT_HISTORY (ACCOUNT_ID, AMOUNT) AS
SELECT ACCOUNT_ID, AMOUNT
FROM HIST1
WHERE ACCOUNT_ID BETWEEN 1 AND 100000000
UNION ALL
SELECT ACCOUNT_ID, AMOUNT
FROM HIST2
WHERE ACCOUNT_ID BETWEEN 100000001 AND 200000000

Queries are then written against the view, for example:

SELECT *
FROM V_ACCOUNT_HISTORY
WHERE ACCOUNT_ID = 12000000
The predicate of this query contains the literal value 12000000, and this predicate is distributed
to both of the query blocks generated. However, DB2 will compare the distributed predicate
against the predicates coded in the UNION inside our view, looking for redundancies. In any
situations in which the distributed predicate renders a particular query block unnecessary, DB2
will prune (eliminate) that query block from the access path. So, when our distributed
predicates look like this:
. . .
. . .
DB2 will prune the query blocks generated at statement compile time based upon the literal
value supplied in the predicate. So, although in our example above two query blocks would
be generated, one of them will be pruned when the statement is compiled. DB2 compares the
literal predicate supplied in the query against the view with the predicates in the view. Any
unnecessary query blocks are pruned. So, since one of the resulting combined predicates is
impossible, DB2 eliminates that query block. Only one underlying table will then be accessed.
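Using the illustrative range predicates from the view definition above, the combined predicates
for the two query blocks would look roughly like this:

-- query block against HIST1: satisfiable, so it is kept
ACCOUNT_ID = 12000000 AND ACCOUNT_ID BETWEEN 1 AND 100000000
-- query block against HIST2: impossible, so it is pruned at bind time
ACCOUNT_ID = 12000000 AND ACCOUNT_ID BETWEEN 100000001 AND 200000000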
Query block pruning can happen at statement compile (bind) time, or at run time if a host
variable or parameter marker is supplied for a redundant predicate. So, let’s take the previous
query example, and replace the literal with a host variable (shown here as :H1):

SELECT *
FROM V_ACCOUNT_HISTORY
WHERE ACCOUNT_ID = :H1
If this statement was embedded in a program, and bound into a plan or package, two query
blocks would be generated. This is because DB2 does not know the value of the host variable in
advance, and distributes the predicate amongst both generated query blocks. However, at run
time DB2 will examine the supplied host variable value, and dynamically prune the query
blocks appropriately. So, if the value 12000000 was supplied for the host variable value, then
one of the two query blocks would be pruned at run time, and only one underlying table would
be accessed. This is a complicated process that does not always work. You should test it by
stopping one of the tables, and then running a query with a host variable that should prune the
query block on that table. If the statement is successful then runtime query block pruning is
working for you.
We can get query block pruning on literals and host variables. However, we can’t get query
block pruning on joined columns. In certain situations with many query blocks (UNIONS),
many rows of data, and many index levels for the inner view or table expression of a join, we
have recommended using programmatic joins in situations in which the query can benefit from
runtime query block pruning using the joining column. Please be aware of the fact that this is
an extremely specific recommendation, and certainly not a general recommendation. You
should also be aware of the fact that there are limits to UNION in view (or table expression),
and that you should always test to see if you get bind time or run time query block pruning. In
some cases it just doesn’t happen, and there are APARs out there that address the problems,
but are not comprehensive. So, testing is important.
You can influence proper use of runtime query block pruning by encouraging distribution of
joins and predicates into the UNION in view (see section below). This is done by reducing the
number of tables in the UNION, or by repeating host variables in predicates instead of, or in
addition to, using correlation. Take a look at the query below, shown in a general form with
illustrative host variable names:

SELECT {columns}
FROM V_ACCOUNT_HISTORY HIST
WHERE HIST.ACCT_ID = :acct-id
  AND HIST.HIST_EFF_DTE = :eff-dte
  AND HIST.UPD_TSP =
      (SELECT MAX(HIST2.UPD_TSP)
       FROM V_ACCOUNT_HISTORY HIST2
       WHERE HIST2.ACCT_ID = :acct-id
         AND HIST2.HIST_EFF_DTE = :eff-dte)
WITH UR;
The predicate on ACCT_ID in the subquery could have been correlated to the outer query, but
it isn't. The same goes for the HIST_EFF_DTE predicate in the subquery. The reason for
repeating the host variable references is to take advantage of runtime pruning; correlated
predicates would not have gotten the pruning.
Be aware of the fact that if you are moving a local batch process to a remote server, you are
going to lose some of the efficiencies that go along with the RELEASE(DEALLOCATE) bind
parameter, in particular sequential detection and index lookaside.
Proper Statistics
What can we say? DB2 utilizes a cost-based optimizer, and that optimizer needs accurate
statistical information about your data. Collecting the proper statistics is a must for good
performance. With each new version of DB2 the optimizer takes more advantage of catalog
statistics. This also means that with each new version DB2 is more dependent upon catalog
statistics. You should have statistics on every column referenced in every WHERE clause in
your shop. If you are using parameter markers and host variables then at the least you need
cardinality statistics. If you have skewed data, or are using literal values in your SQL
statements, then perhaps you need frequency distribution and/or histogram statistics. If you
suspect columns are correlated, you can gather column correlation statistics. To determine
whether two columns are correlated (the CITY and STATE columns of a CUSTOMER table are
used here as an illustration) you can run these two queries (DB2 V8 and DB2 9):

SELECT COUNT(DISTINCT CITY) * COUNT(DISTINCT STATE)
FROM CUSTOMER;

SELECT COUNT(*)
FROM (SELECT DISTINCT CITY, STATE
      FROM CUSTOMER) AS T;
If the number from the second query is lower than the number from the first query then the
columns are correlated.
You can also run GROUP BY queries against tables for columns used in predicates to count the
occurrences of values in these columns. These counts can give you a good indication as to
whether or not you need frequency distribution statistics or histogram statistics, and runtime
reoptimization for skewed data distributions.
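A minimal sketch of such a counting query, using the sample EMP table and its WORKDEPT
column as an illustration:

SELECT WORKDEPT, COUNT(*) AS OCCURRENCES
FROM EMP
GROUP BY WORKDEPT
ORDER BY OCCURRENCES DESC;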
Runtime Reoptimization
If your query contains a predicate with an embedded literal value then DB2 knows something
about the input to the query, and can take advantage of frequency distribution or histogram
statistics if available. This can result in a much improved filter factor, and better access path
decisions by the optimizer. However, what if DB2 doesn't know anything about your input
value, as in this query against the sample EMP table (the host variable name is illustrative):

SELECT *
FROM EMP
WHERE MIDINIT = :hv

In this case, if the values for the MIDINIT column are highly skewed, then DB2 could make an
inaccurate estimate of the filter factor for some input values.
DB2 can employ something called runtime reoptimization to help your queries. For static SQL
the option of REOPT(ALWAYS) is available. This bind option will instruct DB2 to recalculate
access paths at runtime using the host variable parameters. This can result in improved
execution time for large queries. However, if there are many queries in the package then they
will all get reoptimized. This could negatively impact statement execution time for these
queries. In situations where you use REOPT(ALWAYS) consider separating the query that can
benefit into its own package.
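A sketch of such a bind, with hypothetical collection and member names, might look like this:

BIND PACKAGE(BIGQRY_COLL) MEMBER(BIGQRY01) ACTION(REPLACE) REOPT(ALWAYS)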
For dynamic SQL statements there are three options, REOPT(ALWAYS), REOPT(ONCE), and
REOPT(AUTO). REOPT(AUTO) is DB2 9 only.
• REOPT(ALWAYS) This will reoptimize a dynamic statement with parameter markers based
upon the values provided on every execution.
• REOPT(ONCE) Will reoptimize a dynamic statement the first time it is executed based upon
the values provided for parameter markers. The access path will then be reused until the
statement is removed from the dynamic statement cache, and needs to be prepared again.
This reoptimization option should be used with care as the first execution should have good
representative values.
• REOPT(AUTO) Will track how the values for the parameter markers change on every
execution, and will then reoptimize the query based upon those values if it determines that
the values have changed significantly.
There is also a system parameter called REOPTEXT (DB2 9) that enables the REOPT(AUTO)
like behavior, subsystem wide, for any dynamic SQL queries (without NONE, ALWAYS, or
ONCE already specified) that contain parameter markers when it detects changes in the values
that could influence the access path.
The OPTIMIZE FOR clause is a way, within a SQL statement, to tell DB2 how many rows you
intend to process. DB2 can then make access path decisions in order to determine the most
efficient way to access the data for that quantity. The use of this clause will discourage such
things as list prefetch, sequential prefetch, and multi-index access. It will encourage index
usage to avoid a sort, and the nested loop join method. A value of 1 is the strongest
influence on these factors.
You should put the actual number of rows you intend to fetch in the OPTIMIZE FOR clause.
Misrepresenting the number of rows you intend to fetch can result in a more poorly
performing query.
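A minimal sketch (table, column, and host variable names are illustrative) of telling DB2 that a
query will fetch only about 20 rows:

SELECT ACCT_ID, ACCT_BAL
FROM ACCOUNT
WHERE ACCT_STATUS = :status
ORDER BY ACCT_BAL DESC
OPTIMIZE FOR 20 ROWS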
When DB2 joins tables together in an inner join it attempts to select the table that will qualify
the fewest rows first in the join sequence. If, for some reason, DB2 has chosen the incorrect
table first (maybe due to statistics or host variables) then you can attempt to change the table
access sequence by employing one or more of these techniques:
• Enable predicates on the table you want to be first. By increasing potential matchcols on this
table DB2 may select an index for more efficient access and change the table access sequence.
• Disable predicates on the table you don’t want accessed first. Predicate disablers are
documented in the DB2 Administration Guide (DB2 V7, DB2 V8) or the DB2 Performance
Monitoring and Tuning Guide (DB2 9). We do not recommend using predicate disablers.
• Force materialization of the table you want accessed first by placing it into a nested table
expression with a DISTINCT or GROUP BY. This could change the join type as well as the
join sequence. This technique is especially useful when a nested loop join is randomly
accessing the inner table.
• Convert joins to subqueries. When you code subqueries you tell DB2 the table access
sequence. Non-correlated subqueries are accessed first, then the outer query is executed,
and then any correlated subqueries are executed. Of course, this is only effective if the table
moved from a join to a subquery doesn’t have to return data.
• Convert a joined table to a correlated nested table expression. This will force another table to
be accessed first as the data for the correlated reference is required prior to the table in the
correlated nested table expression being accessed.
• Convert an inner join to a left join. By coding a left join you have absolutely dictated the table
join sequence to DB2, as well as the fact that the right table will filter no data.
• Add a CAST function to the join predicate for the table you want accessed first. By placing
this function on that column you will encourage DB2 to access that table first in order to
avoid a stage 2 predicate against the second table.
• Code an ORDER BY clause on the columns of the index of the table that you want to be
accessed first in the join sequence. This may influence DB2 to use that index to avoid the
sort, and access that table first.
• Change the order of the tables in the FROM clause. You can also try converting from implicit
join syntax to explicit join syntax and vice versa.
• Try coding a predicate on the table you want accessed first as a non-transitive closure
predicate, for example a non-correlated subquery against the SYSDUMMY1 table that
returns a single value rather than an equals predicate on a host variable or literal value. Since
the subquery is not eligible for transitive closure, DB2 will not generate the predicate
redundantly against the other table, and has less encouragement to choose that table first.
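As a minimal sketch (T1 and its STATUS column are hypothetical), instead of coding the first
form of the predicate you might code the second:

-- eligible for transitive closure; the predicate may be generated against the joined table
WHERE T1.STATUS = 'A'
-- not eligible for transitive closure; the predicate stays on T1 only
WHERE T1.STATUS = (SELECT 'A' FROM SYSIBM.SYSDUMMY1)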
If DB2 has chosen one index over another, and you disagree, then you can try one of these
techniques to influence index selection:
• Code an ORDER BY clause on the leading columns of the index you want chosen. This may
encourage DB2 to choose that index to avoid a sort.
• Add columns to the index to make the access index-only.
• Increase the index matchcols either by modifying the query or the index.
• You could disable predicates that are matching other indexes. The IBM manuals document
predicate disablers. We don’t recommend them.
• Try using the OPTIMIZE FOR clause.
It is very important to make the correct design choices when designing physical objects such
as tables, table spaces, and indexes — once a physical structure has been defined and implemented,
it is generally difficult and time-consuming to make changes to the underlying structure. The
best way to perform logical database modeling is to use strong guidelines developed by an
expert in relational data modeling, or to use one of the many relational database modeling
tools supplied by vendors. But it is important to remember that just because you can ‘press a
button’ to have a tool migrate your logical model into a physical model, does not mean that
the physical model is the most optimal for performance. There is nothing wrong with twisting
the physical design to improve performance as long as the logical model is not compromised
or destroyed.
DB2 objects need to be designed for availability, ease of maintenance, and overall performance,
as well as for business requirements. There are guidelines and recommendations for achieving
these design goals, but how each of these is measured will depend on the business and the
nature of the data.
DB2 9 introduces a new type of table space called a universal table space. A universal table
space is a table space that is both segmented and partitioned. Two types of universal table
spaces are available: the partition-by-growth table space and the range-partitioned table space.
Before DB2 9, partitioned tables required key ranges to determine the target partition for
row placement. Partitioned tables provide more granular locking and parallel operations by
spreading the data over more data sets. Now, in DB2 9, you have the option to partition
according to data growth, which enables segmented tables to be partitioned as they grow,
without the need for key ranges. As a result, segmented tables benefit from increased table
space limits and SQL and utility parallelism that were formerly available only to partitioned
tables, and you can avoid needing to reorganize a table space to change the limit keys.
A range-partitioned table space is a type of universal table space that is based on partitioning
ranges and that contains a single table. The new range-partitioned table space does not
replace the existing partitioned table space, and operations that are supported on a regular
partitioned or segmented table space are supported on a range-partitioned table space. You
can create a range-partitioned table space by specifying both SEGSIZE and NUMPARTS
keywords on the CREATE TABLESPACE statement. With a range-partitioned table space, you
can also control the partition size, choose from a wide array of indexing options, and take
advantage of partition-level operations and parallelism capabilities. Because the range-
partitioned table space is also a segmented table space, you can run table scans at the
segment level. As a result, you can immediately reuse all or most of the segments of a table
after the table has been dropped or a mass delete has been performed.
Range-partitioned universal table spaces follow the same partitioning rules as for partitioned
table spaces in general. That is, you can add, rebalance, and rotate partitions. The maximum
number of partitions possible for both range-partitioned and partition-by-growth universal
table spaces, as for partitioned table spaces, is controlled by the DSSIZE and page size.
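A minimal sketch of the two flavors of universal table space (database, table space, and buffer
pool names, and the sizes, are all illustrative):

-- range-partitioned universal table space: SEGSIZE plus NUMPARTS
CREATE TABLESPACE TSRANGE IN MYDB
  SEGSIZE 64
  NUMPARTS 12
  DSSIZE 4 G
  BUFFERPOOL BP1;

-- partition-by-growth universal table space: MAXPARTITIONS, no limit keys required
CREATE TABLESPACE TSGROW IN MYDB
  SEGSIZE 32
  MAXPARTITIONS 24
  BUFFERPOOL BP1;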
Clustering and partitioning can be completely independent, and we're given a lot of options
for organizing our data in a single dimension (clustering and partitioning are based on the
same key), dual dimensions (clustering inside each partition by a different key), or multiple
dimensions (combining different tables with different partitioning unioned inside a view). You
should choose a partitioning strategy based upon a concept such as application-controlled
parallelism, separating old and new data, grouping data by time, or grouping data by some
meaningful business entity (e.g. sales region or office location). Then within those partitions
you can cluster the data by your most common sequential access sequence.
There is a way to dismiss clustering for inserts. See the section in this chapter on append
processing.
There are several advantages to partitioning a table space. For large tables, partitioning is the
only way to store large amounts of data, but partitioning also has advantages for tables that
are not necessarily large. DB2 allows us to define up to 4096 partitions of up to 64 GB each
(however, total table size is limited depending on the DSSIZE specified). Non-partitioned table
spaces are limited to 64 GB of data. You can take advantage of the ability to execute utilities on
separate partitions in parallel. This also gives you the ability to access data in certain partitions
while utilities are executing on others. In a data-sharing environment, you can spread partitions
among several members to split workloads. You can also spread your data over multiple
volumes and need not use the same storage group for each data set belonging to the table
space. This also allows you to place frequently accessed partitions on faster devices.
Free Space
The FREEPAGE and PCTFREE clauses are used to help improve the performance of updates
and inserts by allowing free space to exist on table spaces. Performance improvements include
improved access to the data through better clustering of data, faster inserts, fewer row
overflows, and a reduction in the number of REORGs required. Some tradeoffs include an
increase in the number of pages, fewer rows per I/O and less efficient use of buffer pools, and
more pages to scan. As a result, it is important to achieve a good balance for each individual
table space and index space when deciding on free space, and that balance will depend on the
processing requirements of each table space or index space. When inserts and updates are
performed, DB2 will use the free space defined, and by doing this it can keep records in
clustering sequence as much as possible. When the free space is used up, the records must be
located elsewhere, and this is when performance can begin to suffer. Read-only tables do not
require any free space, and tables with a pure insert-at-end strategy (append processing)
generally don’t require free space. Exceptions to this would be tables with VARCHAR columns
and tables using compression that are subject to updates. When DB2 attempts to maintain
cluster during inserting and updating it will search nearby for free space and/or free pages for
the row. If this space is not found DB2 will exhaustively search the table space for a free place
to put the row before extending a segment or a data set. You can notice this activity by
gradually increasing insert CPU times in your application (by examining the accounting records),
as well as increasing getpage counts and relocated row counts. When this happens it's time for
a REORG, and perhaps a reevaluation of your free space quantities.
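A minimal sketch of setting free space (object names and values are illustrative); the new
values are applied the next time the data is loaded or reorganized:

ALTER TABLESPACE MYDB.MYTS
  PCTFREE 10
  FREEPAGE 0;

ALTER INDEX MYSCHEMA.MYINDEX
  PCTFREE 20
  FREEPAGE 31;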
Allocations
The PRIQTY and SECQTY clauses of the CREATE TABLESPACE and ALTER TABLESPACE SQL
statements specify the space that is to be allocated for the table space if the table space is
managed by DB2. These settings influence the allocation by the operating system of the
underlying VSAM data sets in which table space and index space data is stored. The PRIQTY
specifies the minimum primary space allocation for a DB2-managed data set of the table space
or partition. The primary space allocation is in kilobytes, and the maximum that can be
specified is 64 GB. DB2 will request a data set allocation corresponding to the primary space
allocation, and the operating system will attempt to allocate the initial extent for the data set in
one contiguous piece. The SECQTY specifies the minimum secondary space allocation for a
DB2-managed data set of the table space or partition. DB2 will request secondary extents in a
size according to the secondary allocation. However, the actual primary and secondary data set
sizes depend upon a variety of settings and installation parameters.
You can specify the primary and secondary space allocations for table spaces and indexes or
allow DB2 to choose them. Having DB2 choose the values, especially for the secondary space
quantity, increases the possibility of reaching the maximum data set size before running out of
extents. In addition, the MGEXTSZ subsystem parameter will influence the SECQTY
allocations, and when set to YES (NO is the default) changes the space calculation formulas to
help utilize all of the potential space allowed in the table space before running out of extents.
You can alter the primary and secondary space allocations for a table space. The secondary
space allocation will take immediate effect. However, since the primary allocation happens
when the data set is created, that allocation will not take effect until a data set is added
(depends upon the type of table space) or until the data set is recreated via utility execution
(such as a REORG or LOAD REPLACE).
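A minimal sketch of setting the allocations (names and quantities are illustrative; the quantities
are in kilobytes, and a value of -1 lets DB2 choose the allocation):

ALTER TABLESPACE MYDB.MYTS
  PRIQTY 7200
  SECQTY 720;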
Column Ordering
There are two reasons you want to order your columns in specific ways: to reduce CPU
consumption when reading and writing columns with variable length data, and to minimize the
amount of logging performed when updating rows. Which version of DB2 you are using will
impact how you, or how DB2, organizes your columns.
For reduced CPU when using variable length columns you’ll want to put your variable length
columns after all of your fixed length columns (DB2 V7 and DB2 V8). If you mix the variable
length columns and your fixed length columns together then DB2 will have to search for any
fixed or variable length column after the first variable length column, and this will increase CPU
consumption. So, in DB2 V7 or DB2 V8 you want to put the variable length columns after the
fixed length columns when defining your table. This is especially true for any read-only
applications. For applications in which the rows are updated, you may want to organize your
data differently (read on). Things change with DB2 9 as it employs something called reordered
row format. Once you move to new function mode in DB2 9 any new tablespace you create will
automatically have its variable length columns placed after the fixed length columns physically
in the table space, regardless of the column ordering in the DDL. Within each grouping (fixed
and variable) your DDL column order is respected. In addition to new table spaces any table
spaces that are REORGed or LOAD REPLACEd will get the reordered row format.
For reduced logging you'll want to order the columns in your DDL a little differently. For
high-update tables you'll want the columns that never change placed first in the row, followed
by the columns that change less frequently, followed by the columns that change all the
time (e.g. an update timestamp). So, to reduce logging you'll want your variable length columns
that never change in front of the fixed length columns that do change (DB2 V7 and DB2 V8).
This is because DB2 logs from the first byte changed to the last byte changed for fixed
length rows, and from the first byte changed to the end of the row for variable length rows if
the length changes (unless the table has been defined with DATA CAPTURE CHANGES, which
causes the entire before and after image to be logged for updates). This all changes once you've
moved to DB2 9 and the table space is using the reordered row format. In that case you have
no control over the placement of never-changing variable length columns in front of always-
changing fixed length columns. This can possibly mean increased logging for your heavy updaters.
To reduce the logging in these situations you can still order the columns such that the most
frequently updated columns are last, and DB2 will respect the order of the columns within the
grouping. You can also contact IBM about turning off the automatic reordered row format if
this is a concern for you.
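A minimal sketch of ordering columns for reduced logging (the table and columns are
illustrative), with the columns that never change first and the most frequently updated
columns last:

CREATE TABLE ACCT_ACTIVITY
  (ACCT_ID        INTEGER       NOT NULL   -- never changes
  ,ACCT_OPEN_DTE  DATE          NOT NULL   -- never changes
  ,ACCT_NAME      VARCHAR(40)   NOT NULL   -- changes rarely
  ,ACCT_STATUS    CHAR(1)       NOT NULL   -- changes occasionally
  ,ACCT_BAL       DECIMAL(11,2) NOT NULL   -- changes often
  ,LAST_UPD_TSP   TIMESTAMP     NOT NULL   -- changes on every update
  );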
Compression
Compression allows us to get more rows on a page and therefore see many of the following
performance benefits, depending on the SQL workload and the amount of compression:
• Higher buffer pool hit ratios
• Fewer I/Os
• Fewer getpage operations
• Reduced CPU time for image copies
There are also some considerations for processing cost when using compression, but that cost
is relatively low.
• The processor cost to decode a row using the COMPRESS clause is significantly less than the
cost to encode that same row.
• The data access path DB2 uses affects the processor cost for data compression. In general,
the relative overhead of compression is higher for table space scans and less costly for
index access.
Some data will not compress well so you should query the PAGESAVE column in
SYSIBM.SYSTABLEPART to be sure you are getting a savings (at least 50% is average). Data
that does not compress well includes binary data, encrypted data, and repeating strings. Also
you should never compress small tables/rows if you are worried about concurrency issues as
this will put more rows on a page.
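A minimal sketch of checking compression effectiveness from the catalog (PAGESAVE is
populated by RUNSTATS, so statistics must be current):

SELECT DBNAME, TSNAME, PARTITION, PAGESAVE
FROM SYSIBM.SYSTABLEPART
WHERE COMPRESS = 'Y'
ORDER BY PAGESAVE;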
Keep in mind that when you compress, the row is treated as varying length, with a potential
length change on every update. This means there is a potential for row relocation, causing high
numbers in NEARINDREF and FARINDREF. You are then doing more I/O to get to your data
because it has been relocated, and you will have to REORG to get it back to its original position.
Utilizing Constraints
Referential integrity (RI) allows you to define required relationships between and within tables.
The database manager maintains these relationships, which are expressed as referential
constraints, and requires that all values of a given attribute or table column also exist in some
other table column.
In general, DB2-enforced referential integrity is much more efficient than coding the equivalent
logic in your application program. In addition, having the relationships enforced in a central
location in the database is much more robust than making them dependent upon application
logic. Of course, you are going to need indexes to support the relationships enforced by DB2.
Remember that referential integrity checking has a cost associated with it and can become
expensive if used for something like continuous code checking. RI is meant for parent/child
relationships, not code checking. Better options for code checking include check constraints,
or, better still, putting codes in memory and checking them there.
Table check constraints will enforce data integrity at the table level. Once a table-check
constraint has been defined for a table, every UPDATE and INSERT statement will involve
checking the restriction or constraint. If the constraint is violated, the data record will not be
inserted or updated, and a SQL error will be returned.
A table check constraint can be defined at table creation time or later, using the ALTER TABLE
statement. The table-check constraints can help implement specific rules for the data values
contained in the table by specifying the values allowed in one or more columns in every row of
a table. This can save time for the application developer, since the validation of each data value
can be performed by the database and not by each of the applications accessing the database.
However, check constraints should, in general, not be used for data edits in support of data
entry. It's best to cache code values locally within the application and perform the edits local
to the application. This will avoid numerous trips to the database to enforce the constraints.
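A minimal sketch of a table check constraint (the names and the list of allowed values are
illustrative):

ALTER TABLE ACCOUNT
  ADD CONSTRAINT CK_ACCT_STATUS
  CHECK (ACCT_STATUS IN ('A', 'C', 'S'));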
Indexing
Depending upon your application and the type of access, indexing can be a huge performance
advantage or a performance bust. Is your application a heavy reader, or perhaps even a read-
only application? Then lots of indexes can be a real performance benefit. What if your application
is constantly inserting, updating, and deleting from your table? In that case lots of indexes can
be a detriment. When does it matter? Well, of course, it depends. Just remember this simple
rule: if you are adding a secondary index to a table, then for inserts and deletes, and perhaps
even updates, you are adding another random read to these statements. Can your application
afford that in support of the queries that may use the index? That's for you to decide.
Index Compression
As of DB2 9 an index can be defined with the COMPRESS YES option (COMPRESS NO is the
default). Index compression can be used where there is a desire to reduce the amount of disk
space an index consumes. Index compression is recommended for applications that do
sequential insert operations with few or no delete operations; random inserts and deletes can
adversely affect compression. Index compression is also recommended for applications
where the indexes are created primarily for scan operations.
A bufferpool that is used to create the index must be 8K, 16K, or 32K in size. The physical page
size for the index on disk will be 4K. The reason that the bufferpool size is larger than the page
size is that index compression only saves space on disk. The data in the index page is expanded
when read into the pool. So, index compression can possibly save you read time for sequential
operations, and perhaps random (but far less likely).
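A minimal sketch of a compressed index assigned to an 8K buffer pool (the index, table, and
column names are illustrative):

CREATE INDEX XACCT_HIST_SCAN
  ON ACCT_HIST (ACCT_STATUS, HIST_EFF_DTE)
  COMPRESS YES
  BUFFERPOOL BP8K0;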
Index compression can also have a significant impact on REORGs and index rebuilds, resulting
in significant savings there. Keep in mind, however, that if you use the COPY utility to back
up an index, the image copy is actually uncompressed.
Secondary Indexes
There are two types of secondary indexes, non-partitioning secondary indexes and data
partitioned secondary indexes.
NON-PARTITIONING SECONDARY INDEXES NPSIs are indexes that are used on partitioned
tables. They are not the same as the clustered partitioning key, which is used to order and
partition the data, but rather they are for access to the data. NPSIs can be unique or non-
unique. While you can have only one clustered partitioning index, you can have several NPSIs
on a table if necessary. NPSIs can be broken apart into multiple pieces (data sets) by using the
PIECESIZE clause on the CREATE INDEX statement. Pieces can vary in size from 256 KB to 64
GB — the best size will depend on how much data you have and how many pieces you want to
manage. If you have several pieces, you can achieve more parallelism on processes, such as
heavy INSERT batch jobs, by alleviating the bottlenecks caused by contention on a single data
set. As of DB2 V8 and beyond the NPSI can be the clustering index.
NPSIs are great for fast read access as there is a single index b-tree structure. They can,
however, grow extremely large and become a maintenance and availability issue.
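A minimal sketch of spreading an NPSI over multiple data sets (the names and the piece size
are illustrative):

CREATE INDEX XACCT_HIST_NPSI
  ON ACCT_HIST (ACCT_ID)
  PIECESIZE 4 G;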
DATA PARTITIONED SECONDARY INDEXES The DPSI index type provides us with many
advantages for secondary indexes on a partitioned table space over the traditional NPSIs
(Non-Partitioning Secondary Indexes) in terms of availability and performance.
The partitioning scheme of the DPSI will be the same as the table space partitions and the
index keys in ‘x’ index partition will match those in ‘x’ partition of the table space. Some of the
benefits that this provides include:
• Clustering by a secondary index
• Ability to easily rotate partitions
• Efficient utility processing on secondary indexes (no BUILD-2 phase)
• Allow for reducing overhead in data sharing (affinity routing)
DRAWBACKS OF DPSIS While there will be gains in furthering partition independence, some
queries may not perform as well. If a query has predicates that reference columns in a single
partition, and is therefore restricted to a single partition of the DPSI, it will benefit from this
new organization. The queries will have to be designed to allow for partition pruning through
the predicates in order to accomplish this. This means that at least the leading column of the
partitioning key has to be supplied in the query in order for DB2 to prune (eliminate) partitions
from the query access path. However, if the predicates reference only columns in the DPSI, the
query may not perform very well because it may need to probe several partitions of the index.
Other limitations of DPSIs include the fact that they cannot be unique (with some exceptions
in DB2 9) and that they may not be the best candidates for ORDER BYs.
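A minimal sketch of creating a DPSI on a partitioned table (the names are illustrative):

CREATE INDEX XACCT_HIST_DPSI
  ON ACCT_HIST (ACCT_STATUS)
  PARTITIONED;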
Rebuild or Recover?
As of DB2 V8 you can define an index as COPY YES. This means, as with a table space, you
can use the COPY and RECOVER utilities to backup and recover these indexes. This may be
especially useful for very large indexes. Be aware, however, that large NPSIs cannot be copied
in pieces, and the image copies can get very large. You'll need to have large data sets to hold the backup.
This could mean large quantities of tapes, or perhaps even hitting the 59 volume limit for a
data set on DASD. REBUILD will require large quantities of temporary DASD to support sorts,
as well as more CPU than a RECOVER. You should carefully consider whether your strategy for
an index should be backup and recover, or rebuild.
Materialized Query Tables
As of DB2 V8, one solution for expensive queries that repeatedly aggregate or join the same
data is the use of MQTs — Materialized Query Tables. This
allows you to precompute whole or parts of each query and then use computed results to
answer future queries. MQTs provide the means to save the results of prior queries and then
reuse the common query results in subsequent queries. This helps avoid redundant scanning,
aggregating and joins. MQTs are useful for data warehouse type applications.
MQTs do not completely eliminate optimization problems but rather move optimizations
issues to other areas. Some challenges include finding the best MQT for expected workload,
maintaining the MQTs when underlying tables are updated, ability to recognize usefulness of
MQT for a query, and the ability to determine when DB2 will actually use the MQT for a query.
Most of these types of problems are addressed by OLAP tools, but MQTs are the first step.
The main advantage of the MQT is that DB2 is able to recognize a summary query against the
source table(s) for the MQT, and rewrite the query to use the MQT instead. It is, however, your
responsibility to move data into the MQT, either via the REFRESH TABLE statement or by manually
moving the data yourself.
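A minimal sketch of a summary MQT over the sample EMP table (the MQT name and columns
are illustrative):

CREATE TABLE DEPT_SAL_MQT (WORKDEPT, AVGSAL, HDCOUNT) AS
  (SELECT WORKDEPT, AVG(SALARY), COUNT(*)
   FROM EMP
   GROUP BY WORKDEPT)
  DATA INITIALLY DEFERRED
  REFRESH DEFERRED
  MAINTAINED BY SYSTEM
  ENABLE QUERY OPTIMIZATION;

REFRESH TABLE DEPT_SAL_MQT;   -- repopulates the MQT from the source table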
Volatile Tables
As of DB2 V8, volatile tables are a way to prefer index access over table space scans or non-
matching index scans for tables that have statistics that make them appear to be small. They
are good for tables that shrink and grow, allowing matching index scans on tables that have
grown larger without new RUNSTATS.
They also improve support for cluster tables. Cluster tables are those tables that have groups
or clusters of data that logically belong together. Within each group rows need to be accessed
in the same sequence to avoid lock contention during concurrent access. The sequence of access
is determined by the primary key, and if DB2 changes the access path, lock contention can occur.
To best support cluster tables (volatile tables) DB2 will use index access whenever possible. This
will minimize application contention on cluster tables by preserving the access sequence by
primary key. We need to be sure indexes are available for single table access and joins.
The keyword VOLATILE can be specified on the CREATE TABLE or the ALTER TABLE
statements. If specified, you are basically forcing an access path of index access and no
list prefetch.
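A minimal sketch of marking a table as volatile at creation time (the table definition is
illustrative); the keyword can also be added later with ALTER TABLE:

CREATE TABLE WORK_QUEUE
  (QUEUE_ID   INTEGER      NOT NULL
  ,ITEM_KEY   CHAR(10)     NOT NULL
  ,ITEM_DATA  VARCHAR(200))
  VOLATILE;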
Clone Tables
In DB2 9 you can create a clone table on an existing base table at the current server by using
the ALTER TABLE statement. Although ALTER TABLE syntax is used to create a clone table,
the authorization granted as part of the clone creation process is the same as you would get
during regular CREATE TABLE processing. The schema (creator) for the clone table will be the
same as for the base table. You can create a clone table only if the base table is in a universal
table space.
To create a clone table, issue an ALTER TABLE statement with the ADD CLONE option.
The creation or drop of a clone table does not impact applications accessing base table data.
No base object quiesce is necessary and this process does not invalidate plans, packages, or
the dynamic statement cache.
You can exchange the base and clone data by using the EXCHANGE statement. To exchange
table and index data between the base table and clone table issue an EXCHANGE statement
with the DATA BETWEEN TABLE table-name1 AND table-name2 syntax. This is in essence a
method of performing an online load replace!
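A minimal sketch of the clone life cycle (the table names are illustrative):

ALTER TABLE ACCT_HIST
  ADD CLONE ACCT_HIST_CLONE;

-- load or insert the replacement data into ACCT_HIST_CLONE, then swap:
EXCHANGE DATA BETWEEN TABLE ACCT_HIST AND ACCT_HIST_CLONE;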
After a data exchange, the base and clone table names remain the same as they were prior
to the data exchange. No data movement actually takes place. The instance numbers in the
underlying VSAM data sets for the objects (tables and indexes) do change, and this has the
effect of changing the data that appears in the base and clone tables and their indexes. For
example, a base table exists with the data set name *.I0001.*. The table is cloned and the
clone's data set is initially named *.I0002.*. After an exchange, the base objects are named
*.I0002.* and the clones are named *.I0001.*. Each time that an exchange happens, the instance
numbers that represent the base and the clone objects change, which immediately changes the
data contained in the base and clone tables and indexes. You should also be aware of the fact
that when the clone is dropped and an uneven number of EXCHANGE statements have been
executed, the base table will have an *.I0002.* data set name. This could be confusing.
Traditionally, our larger database tables have been placed into partitioned tablespaces.
Partitioning helps with database management because it’s easier to manage several small
objects versus one very large object. There are still some limits to partitioning. For example,
each partition is limited to a maximum size of 64GB, a partitioning index is required (DB2 V7
only), and if efficient alternate access paths to the data are desired then non-partitioning indexes
(NPSIs) are required. These NPSIs are not partitioned, and exist as single large database indexes.
Thus NPSIs can present themselves as an obstacle to availability (i.e. a utility operation against
a single partition may potentially make the entire NPSI unavailable), and as an impediment to
database management as it is more difficult to manage such large database objects.
A UNION in a view can be utilized as an alternative to table partitioning in support of very large
database tables. In this type of design, several database tables can be created to hold different
subsets of the data that would otherwise have been held in a single table. Key values, similar to
what may be used in partitioning, can be used to determine which data goes into which of the
various tables. Take, for example, a view definition of this general form (the leading columns
of the select lists have been abbreviated here):

CREATE VIEW V_ACCOUNT_HISTORY
  (ACCOUNT_ID, ..., PAYMENT_TYPE, INVOICE_NUMBER)
AS
SELECT ACCOUNT_ID, ..., PAYMENT_TYPE, INVOICE_NUMBER
FROM ACCOUNT_HISTORY1
UNION ALL
SELECT ACCOUNT_ID, ..., PAYMENT_TYPE, INVOICE_NUMBER
FROM ACCOUNT_HISTORY2
UNION ALL
SELECT ACCOUNT_ID, ..., PAYMENT_TYPE, INVOICE_NUMBER
FROM ACCOUNT_HISTORY3
UNION ALL
SELECT ACCOUNT_ID, ..., PAYMENT_TYPE, INVOICE_NUMBER
FROM ACCOUNT_HISTORY4
By separating the data into different tables, and creating the view over the tables, we can create
a logical account history table with these distinct advantages over a single physical table:
• We can add or remove tables with very small outages, usually just the time it takes to drop
and recreate the view.
• We can partition each of the underlying tables, creating still smaller physical database
objects.
• NPSIs on each of the underlying tables could be much smaller and easier to manage than
they would be under a single table design.
• Utility operations could execute against an individual underlying table, or just a partition of
that underlying table. This greatly shrinks utility times against these individual pieces, and
improves concurrency. This truly gives us full partition independence.
• The view can be referenced in any SELECT statement in exactly the same way as a physical
table would be.
• Each underlying table could be as large as 16TB, logically setting the size limit of the table
represented by the view at 64TB.
• Each underlying table could be clustered differently, or could be a segmented or partitioned
tablespace.
• DB2 will distribute predicates against the view to every query block within the view, and
then compare the predicates. Any impossible predicates will result in the query block being
pruned (not executed). This is an excellent performance feature.
Append Processing
If you have a single index for read access, then having append processing may mean more
random reads. This may require more frequent REORGs to keep the data organized for read
access. Also, if you are partitioning, and the partitioning key is not the read index, then you will
still have random reads during insert to your non-partitioned index. You'll need to make sure
you have adequate free space to avoid index page splits.
Append processing can also be used to store historical or seldom read audit information. In
these cases you would want to partition based upon an ever-ascending value (e.g. a date) and
have all new data go to the end of the last partition. In this situation all table space maintenance,
such as copies and REORGs, will be against the last partition. All other data will be static and
will not require maintenance. You will possibly need a secondary index, or each read query will
have to be for a range of values within the ascending key domain.
The solution may be to build a look-up table that acts as a sparse index. This look up table will
contain nothing more than your ascending key values. One example would be dates, say one
date per month for every month possible in our database. If the historical data is organized and
partitioned by the date, and we have only one date per month (to further sub-categorize the
data), then we can use our new sparse index to access the data we need. Using user-supplied
dates as starting and ending points the look up table can be used to fill the gap with the dates
in between. This gives us the initial path to the history data. Read access is performed by
constructing a key during SELECT processing. So, in this example we’ll access an account
history table (ACCT_HIST) that has a key on HIST_EFF_DTE, ACCT_ID, and our date lookup
table called ACCT_HIST_DATES, which contains one column and one row for each legitimate
date value corresponding to the HIST_EFF_DTE column of the ACCT_HIST table.
CURRENT DATA ACCESS Current data access is easy; we can retrieve the account history data
directly from the account history table.
SELECT {columns}
FROM ACCT_HIST
RANGE OF DATA ACCESS Accessing a range of data is a little more complicated than simply
getting the most recent account history data. Here we need to use our sparse index history
date table to build the key on the fly. We apply the range of dates to the date range table, and
then join that to the history table (host variable names here are illustrative):

SELECT {columns}
FROM ACCT_HIST HIST
INNER JOIN
     ACCT_HIST_DATES DTE
ON   HIST.HIST_EFF_DTE = DTE.EFF_DTE
WHERE HIST.ACCT_ID = :acct-id
  AND DTE.EFF_DTE BETWEEN :start-date AND :end-date
FULL DATA ACCESS To access all of the data for an account we simply need a version of the
previous query without the date range predicate.
SELECT {columns}
FROM ACCT_HIST HIST
INNER JOIN
     ACCT_HIST_DATES DTE
ON   HIST.HIST_EFF_DTE = DTE.EFF_DTE
WHERE HIST.ACCT_ID = :acct-id
Denormalization “Light”
In many situations, especially those in which there is a conversion from a legacy flat file
based system to a relational database, there is a performance concern (or more importantly
a performance problem in that an SLA is not being met) for reading the multiple DB2 tables.
These are situations in which the application is expecting to read all of the data that was once
represented by a single record, but is now in many DB2 tables. In these situations many people
will begin denormalizing the tables. This is an act of desperation! Remember the reason you're
moving your data into DB2 in the first place, and that is for all the efficiency, portability,
flexibility, and faster time to delivery for your new applications. By denormalizing you are
throwing these advantages away, and you may as well have stayed with your old flat file
based design.
In some situations, however, the performance of reading multiple tables compared to the
equivalent single record read just isn't good enough. Well, instead of denormalization you could
possibly employ a "denormalization light" instead. This type of denormalization can be applied
to parent and child tables, when the child table data is in an optional relationship to the parent.
Instead of denormalizing the optional child table data into the parent table, simply add a column
to the parent table that indicates whether or not the child table has any data for that parent
key. This will require some additional application responsibility in maintaining
that indicator column. However, DB2 can utilize a during join predicate to avoid probing the
child table when there is no data for the parent key.
Take, for example, an account table and an account history table. The account may or may not
have account history, and so the following query would join the two tables together to list the
basic account information (balance) along with the history information if present:

SELECT {columns}
FROM ACCOUNT A
LEFT OUTER JOIN
     ACCT_HIST B
ON   A.ACCT_ID = B.ACCT_ID
In the example above the query will always probe the account history table in support of the
join, whether or not the account history table has data. We can employ our light denormalization
by adding an indicator column to the account table. Then we can use a during join predicate.
DB2 will only perform the join operation when the join condition is true. In this particular case
the access to the account history table is completely avoided when the indicator column
(called HIST_IND here for illustration) has a value not equal to "Y":
SELECT {columns}
FROM ACCOUNT A
LEFT OUTER JOIN
     ACCT_HIST B
ON   A.ACCT_ID = B.ACCT_ID
AND  A.HIST_IND = 'Y'
DB2 is going to test that indicator column first before performing the join operation, and supply
nulls for the account history table when data is not present as indicated.
You can imagine now the benefit of this type of design when you are doing a legacy migration
from a single record system to 40 or so relational tables with lots of optional relationships. This
form of denormalizing can really improve performance in support of legacy system access,
while maintaining the relational design for efficient future applications.
CHAPTER 5
EXPLAIN and Predictive Analysis
It is important to know something about how your application will perform prior to the
application actually executing in a production environment. There are several ways in which we
can predict the performance of our applications prior to implementation, and several tools can
be used. The important thing you have to ask when you begin building your application is “Is
the performance important?” If not, then proceed with development at a rapid pace, and then
fix the performance once the application has been implemented. What if you can’t wait until
implementation to determine performance? Well, then you’re going to have to predict the
performance. This chapter will suggest ways to do that.
EXPLAIN Facility
The DB2 EXPLAIN facility is used to expose query access path information. This enables
application developers and DBAs to see what access path DB2 is going to take for a query, and
if any attempts at query tuning are needed. DB2 can gather basic access path information in a
special table called the PLAN_TABLE (DB2 V7, DB2 V8, DB2 9), as well as detailed information
about predicate stages, filter factor, predicate matching, and dynamic statements that are
cached (DB2 V8, DB2 9).
When EXPLAIN is executed it can populate many EXPLAIN tables. The target set of EXPLAIN
tables depends upon the authorization id associated with the process. So, the creator (schema)
of the EXPLAIN tables is determined by the CURRENT SQLID of the person running the
EXPLAIN statement, or the OWNER of the plan or package at bind time. The EXPLAIN tables
are optional, and DB2 will only populate the tables that it finds under the SQLID or OWNER of
the process invoking the EXPLAIN. There are many EXPLAIN tables, which are documented in
the DB2 Performance Monitoring and Tuning Guide (DB2 9). Some of these tables are not
available in DB2 V7. The following tables can be defined manually, and the DDL can be found
in the DB2 sample library member DSNTESC:
• PLAN_TABLE The PLAN_TABLE contains basic access path information for each query
block of your statement. This includes, among other things, information about index usage,
and the number of matching index columns, which join method is utilized, which access
method is utilized, and whether or not a sort will be performed. The PLAN_TABLE forms the
basis for access path determination.
• DSN_STATEMNT_TABLE The statement table contains estimated cost information for the
cost of a statement. If the statement table is present when you EXPLAIN a query, then it will
be populated with the cost information that corresponds to the access path information for
the query that is stored in the PLAN_TABLE. For a given statement, this table will contain the
estimated processor cost in milliseconds, as well as the estimated processor cost in service
units. It places the cost values into two categories:
– Category A DB2 had enough information to make a cost estimation without using
any defaults.
– Category B DB2 had to use default values to make some cost calculations.
The statement table can be used to compare estimated costs when you are attempting to
modify statements for performance. Keep in mind, however, that this is a cost estimate, and
is not truly reflective of how your statement will be used in an application (given input
values, transaction patterns, etc). You should always test your statements for performance
in addition to using the statement table and EXPLAIN.
• DSN_FUNCTION_TABLE The function table contains information about user-defined
functions that are a part of the SQL statement. Information from this table can be
compared to the cost information (if populated) in the DB2 System Catalog table,
SYSIBM.SYSROUTINES, for the user-defined functions.
• DSN_STATEMENT_CACHE_TABLE This table is not populated by a normal invocation of
EXPLAIN, but instead by the EXPLAIN STMTCACHE ALL statement. Issuing the statement
will result in DB2 reading the contents of the dynamic statement cache, and putting runtime
execution information into this table. This includes information about the frequency of
execution of these statements, the statement text, the number of rows processed by the
statement, lock and latch requests, I/O operations, number of index scans, number of sorts,
and much more! This is extremely valuable information about the dynamic queries executing
in a subsystem. This table is available only for DB2 V8 and DB2 9.
Among the key PLAN_TABLE columns to examine are the following:
• INDEXONLY A value of “Y” in this column indicates that the access required could be fully
served by accessing the index only, and avoiding any table access.
• SORTN####, SORTC#### These columns indicate any sorts that may happen in support
of a UNION, grouping, or a join operation, among others.
• PREFETCH This column indicates whether or not prefetch may play a role in the access.
By manually running EXPLAIN, and examining the PLAN_TABLE, you can get good information
about the access path, indexes that are used, join operations, and any sorting that may be
happening as part of your query. If you have additional EXPLAIN tables created (those created
by Visual Explain among other tools) then those tables are populated automatically either by
using those tools, or by manually running EXPLAIN. You can also query those tables manually,
especially if you don't have the remote access required from a PC to use those tools. The
DB2 Performance Monitoring and Tuning Guide (DB2 9) documents all of these tables. These
tables provide detailed information about such things as predicate stages, filter factor,
partitions eliminated, parallel operations, detailed cost information, and more.
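A minimal sketch of manually running EXPLAIN and reading back the basic access path
information (the query number and the statement are illustrative):

EXPLAIN PLAN SET QUERYNO = 100 FOR
  SELECT *
  FROM ACCT_HIST
  WHERE ACCT_ID = 12345;

SELECT QBLOCKNO, PLANNO, METHOD, TNAME, ACCESSTYPE,
       MATCHCOLS, ACCESSNAME, INDEXONLY, PREFETCH
FROM PLAN_TABLE
WHERE QUERYNO = 100
ORDER BY QBLOCKNO, PLANNO;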
There is some information, however, that EXPLAIN does not tell you about your queries. You
have to be aware of this to effectively do performance tuning and predictive analysis. Here are
some of the things you cannot get from EXPLAIN:
• INSERT indexes EXPLAIN does not tell you the index that DB2 will use for an INSERT
statement. Therefore, it’s important that you understand your clustering indexes, and
whether or not DB2 will be using APPEND processing for your inserts. This understanding is
important for INSERT performance, and the proper organization of your data. See Chapter 4
for more information in this regard.
• Access path information for enforced referential constraints If you have INSERTS,
UPDATES, and DELETES in the program you have EXPLAINed, then any database enforced
RI relationships and associated access paths are not exposed in the EXPLAIN tables.
Therefore, it is your responsibility to make sure that proper indexes in support of the RI
constraints are established and in use.
• Predicate evaluation sequence The EXPLAIN tables do not show you the order in which
predicates of the query are actually evaluated. Please see Chapter 3 of this guide for more
information on predicate evaluation sequence.
• The statistics used The optimizer used catalog statistics to help determine the access path
at the time the statement was EXPLAINed. Unless you have historical statistics that happen
to correspond to the time the EXPLAIN was executed, then you don’t know what the
statistics looked like at the time of the EXPLAIN, or whether they are different now in a way
that could change the access path.
• The input values If you are using host variables in your programs then EXPLAIN knows
nothing about the potential input values to those host variables. This makes it important for
you to understand these values, what the most common values are, and if there is data skew
relative to the input values.
• The SQL statement The SQL statement is not captured in the EXPLAIN tables, although
some of the predicates are. If you EXPLAINed a statement dynamically, or via one of the
tools, then you know what the statement looks like. However, if you’ve EXPLAINed a package
or plan, then you are going to need to see the program source code.
• The order of input to your transactions Sure the SQL statement looks OK from an access
path perspective, but what is the order of the input data to the statement? What ranges are
being supplied? How many transactions are being issued? Is it possible to order the input, or
the data in the tables, in a manner in which access is most efficient? This is not covered in EXPLAIN
output. These things are discussed further, however, throughout this guide.
• The program source code In order to fully understand the impact of the access path that a
statement has used you need to see how that statement is being used in the application
program. So, you should always be looking at the program source code of the program the
statement is embedded in. How many times will the statement be executed? Is the
statement in a loop? Can we avoid executing the statement? Is the program issuing 100
statements on individual key values when a range predicate and one statement would
suffice? Is the program performing programmatic joins? These are questions that can
only be answered by looking at the program source code. The EXPLAIN output may show
a perfectly good and efficient access path, but the statement itself could be completely
unnecessary. (This is where a tool, or even a trace, can be very helpful to verify that the SQL
statements executed really are what is expected.)
The IBM Optimization Service Center for DB2 for z/OS (OSC) can be used to analyze previously
generated EXPLAIN output, or to gather EXPLAIN data and explain dynamic SQL statements.
The OSC is available for DB2 V8 and DB2 9. If you are using DB2 V7, then you can use the
DB2 Visual Explain product, which provides a subset of the functionality of the OSC.
DB2 Estimator
The DB2 Estimator product is another useful predictive analysis tool. This product runs on a
PC, and provides a graphical user interface for entering information about DB2 tables, indexes,
and queries. Table and index definitions and statistics can be entered directly, or imported via
DDL files. SQL statements can be imported from text files.
The table definitions and statistics can be used to accurately predict database sizes. SQL
statements can be organized into transactions, and then information about DASD models, CPU
size, access paths, and transaction frequencies can be set. Once all of this information is input
into Estimator, then capacity reports can be produced. These reports will contain estimates of
the DASD required, as well as the amount of CPU required for an application. These reports
are very helpful during the initial stage of capacity planning, that is, before any actual
programs or test data are available. The DB2 Estimator product will no longer be available after
DB2 V8, so you should download it today!
Another approach for large systems design is to spend little time considering performance
during development, and instead set aside project time for performance testing and tuning. This
frees the developers from having to consider performance in every aspect of their programming,
and gives them incentive to code more logic in their queries. This makes for faster
development time, and more logic in the queries means more opportunities for tuning (if all
the logic was in the programs, then tuning may be a little harder to do). Let the logical
designers do their thing, and let the database have a first shot at deciding the access path.
If you choose to make performance decisions during the design phase, then each performance
decision should be backed by solid evidence, and not assumptions. There is no reason why a
slightly talented DBA or developer can’t make a few test tables, generate some test data, and
write a small program or two to simulate a performance situation and test their assumptions
about how the database will behave. This gives the database the opportunity to tell you how to
design for performance. Reports can then be generated, and given to managers. It’s much
easier to make design decisions based upon actual test results.
Tools that you can use for testing statements, design ideas, or program processes include, but
are not limited to:
• REXX DB2 programs
• COBOL test programs
• Recursive SQL to generate data
• Recursive SQL to generate statements
• Data in tables to generate more data
• Data in tables to generate statements
Generating Data
In order to simulate program access you need data in tables. You could simply type some data
into INSERT statements, insert them into a table, and then use data from that table to generate
more data. Say, for example, that you have to test various program processes against a
PERSON_TABLE table and a PERSON_ORDER table. No actual data has been created yet, but
you need to test the access patterns of incoming files of orders. You can key some INSERT
statements for the parent table, and then use the parent table to propagate data to the child
table. For example, if the parent table, PERSON_TABLE, contained this data:
PERSON_ID   NAME
1           JOHN SMITH
2           BOB RADY
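Those two parent rows could be keyed in with a couple of simple INSERT statements (a sketch
that assumes just the two columns shown):

INSERT INTO YLA.PERSON_TABLE (PERSON_ID, NAME) VALUES (1, 'JOHN SMITH');
INSERT INTO YLA.PERSON_TABLE (PERSON_ID, NAME) VALUES (2, 'BOB RADY');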
Then a statement such as the following could be used to populate the child table, PERSON_ORDER,
with some test data (the PERSON_ORDER column names shown are illustrative):

INSERT INTO YLA.PERSON_ORDER
       (PERSON_ID, ORDER_NUM, PRODUCT_CD, QUANTITY, PRICE)
SELECT PERSON_ID, 1, 'B100', 10, 14.95
FROM   YLA.PERSON_TABLE
UNION ALL
SELECT PERSON_ID, 2, 'B120', 3, 1.95
FROM   YLA.PERSON_TABLE;

The child table would then contain the following rows:

PERSON_ID  ORDER_NUM  PRODUCT_CD  QUANTITY  PRICE
1          1          B100        10        14.95
1          2          B120         3         1.95
2          1          B100        10        14.95
2          2          B120         3         1.95
The statements could be repeated over and over to add more data, or additional statements
can be executed against the PERSON_TABLE to generate more PERSON_TABLE data.
Recursive SQL (DB2 V8, DB2 9) is an extremely useful way to generate test data. Take a look at
the following simple recursive SQL statement:
WITH TEMP (N) AS
  (SELECT 1
   FROM SYSIBM.SYSDUMMY1
   UNION ALL
   SELECT N + 1
   FROM TEMP
   WHERE N < 10)
SELECT N
FROM TEMP;
This statement generates the numbers 1 through 10, one row each. We can use the power of
recursive SQL to generate mass quantities of data that can then be inserted into DB2 tables,
ready for testing. The following is a piece of a SQL statement that was used to insert
300,000 rows of data into a large test lookup table. The table was quickly populated with data,
and a test conducted to determine the performance. It was quickly determined that the
performance of this large lookup table would not be adequate, but that couldn’t have been
known for sure without testing:
WITH LASTPOS (KEYVAL) AS
  (VALUES (0)
   UNION ALL
   SELECT KEYVAL + 1
   FROM LASTPOS
   WHERE KEYVAL < ...)
,STALETBL (STALE_IND) AS
  (VALUES ...)
SELECT ...
      ,CASE ... END AS PART_NUM
FROM LASTPOS
INNER JOIN STALETBL ON 1=1;
Generating Statements
Just as data can be generated so can statements. You can write SQL statements that generate
statements. Say, for example, that you needed to generate singleton select statements against
the EMP table to test a possible application process or scenario. You could possibly write a
statement such as this to generate those statements:
SELECT 'SELECT LASTNAME, SALARY FROM SUSAN.EMP WHERE EMPNO = '''
       CONCAT EMPNO CONCAT ''';'
FROM   SUSAN.EMP
WHERE  WORKDEPT IN ('C01', 'E01')
AND    RAND() < 0.33;
The above query will generate SELECT statements for approximately 33% of the employees in
departments "C01" and "E01". The output would look something like this:

SELECT LASTNAME, SALARY FROM SUSAN.EMP WHERE EMPNO = '000030';
SELECT LASTNAME, SALARY FROM SUSAN.EMP WHERE EMPNO = '000130';
SELECT LASTNAME, SALARY FROM SUSAN.EMP WHERE EMPNO = '000140';
You could also use recursive SQL statements to generate statements. During testing of high
performance INSERTs to an account history table, a recursive SQL statement was used to
generate 50,000 INSERT statements with randomized values.
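A minimal sketch of that technique follows; the ACCT_HIST table name, its columns, and the use
of RAND to randomize the values are assumptions for illustration only:

-- Generate 50,000 rows, then turn each row into an INSERT statement
WITH GEN (N) AS
  (SELECT 1
   FROM SYSIBM.SYSDUMMY1
   UNION ALL
   SELECT N + 1
   FROM GEN
   WHERE N < 50000)
SELECT 'INSERT INTO ACCT_HIST (ACCT_ID, TRAN_AMT) VALUES ('
       CONCAT CHAR(INT(RAND() * 1000000))
       CONCAT ', '
       CONCAT CHAR(DECIMAL(RAND() * 500, 9, 2))
       CONCAT ');'
FROM GEN;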
In one situation it was debated whether an entire application interface should utilize large
joins between parent and child tables, or whether all access should be via individual table
access (programmatic joins) for the greatest flexibility. Coding for both types of access
would be extra programming effort, but what is the cost of the programmatic joins for this
application? Two simple COBOL programs were coded against a test database; one with a two
table programmatic join, and the other with the equivalent SQL join. It was determined that the
SQL join consumed 30% less CPU than the programmatic join.
CHAPTER 6
Monitoring
It’s critical to design for performance when building applications, databases, and SQL
statements. You've designed the correct SQL, avoided programmatic joins, clustered commonly
accessed tables in the same sequence, and avoided inefficient repeat processing. Now,
your application is in production and is running fine. Is there more you can save? Most
certainly!
Which statement is the most expensive? Is it the tablespace scan that runs once per day, or
the matching index scan running millions of times per day? Are all your SQL statements sub-
second responders, and so you don’t need tuning? What is the number one statement in terms
of CPU consumption? All of these questions can be answered by monitoring your DB2
subsystems, and the applications accessing them.
DB2 provides facilities for monitoring the behavior of the subsystem, as well as the applications
that are connected to it. This is primarily via the DB2 trace facility. DB2 has several different
types of traces. In this chapter we’ll discuss the traces that are important for monitoring
performance, as well as how to use them effectively for proactive performance tuning.
DB2 Traces
DB2 provides a trace facility to help track and record events within a DB2 subsystem. There are
six types of traces:
• Statistics
• Accounting
• Audit
• Performance
• Monitor
• Global
This chapter will cover the statistics, accounting and performance traces as they apply to
performance monitoring. These traces should play an integral part in your performance
monitoring process.
Statistics Trace
The data collected in the statistics trace allows you to conduct DB2 capacity planning and to
tune the entire set of DB2 programs. The statistics trace reports information about how much
the DB2 system services and database services are used. It is a system wide trace and should
not be used for charge-back accounting. Statistics trace classes 1, 3, 4, 5, and 6 are the default
classes for the statistics trace if YES is specified for statistics on installation panel DSNTIPN.
If the statistics trace is started using the START TRACE command, then class 1 is the default class.
The statistics trace can collect information about the number of threads connected, the
number of SQL statements executed, the amount of storage consumed within the database
manager address space, deadlocks, timeouts, logging activity, buffer pool utilization, and much more.
This information is collected at regular intervals for an entire DB2 subsystem. The interval is
typically 10 or 15 minutes per record.
Accounting Trace
The accounting trace provides data that allows you to assign DB2 costs to individual
authorization IDs and to tune individual programs. The DB2 accounting trace provides
information related to application programs, including such things as:
• Start and stop times
• Number of commits and aborts
• The number of times certain SQL statements are issued
• Number of buffer pool requests
• Counts of certain locking events
• Processor resources consumed
• Thread wait times for various events
• RID pool processing
• Distributed processing
• Resource limit facility statistics
Accounting times are usually the prime indicator of performance problems, and most often
should be the starting point for analysis. DB2 times are classified as follows:
• Class 1: This time shows the time the application spent since connecting to DB2, including
time spent outside DB2.
• Class 2: This shows the elapsed time spent in DB2. It is divided into CPU time and
waiting time.
• Class 3: This elapsed time is divided into various waits, such as the duration of suspensions
due to waits for locks and latches or waits for I/O.
DB2 trace begins collecting this data at successful thread allocation to DB2, and writes a
completed record when the thread terminates or, in some cases, when the authorization ID
changes. Having the accounting trace active is critical for proper performance monitoring,
analysis, and tuning. When an application connects to DB2 it executes across address
spaces, and the DB2 address spaces are shared by perhaps thousands of users. The
accounting trace provides information about the time spent within DB2, as well as the overall
application time. Class 2 time is a component of class 1 time, and class 3 time is a component
of class 2 time.
Accounting data for class 1 (the default) is accumulated by several DB2 components during
normal execution. This data is then collected at the end of the accounting period; it does not
involve as much overhead as individual event tracing. On the other hand, when you start class
2, 3, 7, or 8, many additional trace points are activated. Every occurrence of these events is
traced internally by DB2 trace, but these traces are not written to any external destination.
Rather, the accounting facility uses these traces to compute the additional total statistics that
appear in the accounting record when class 2 or class 3 is activated. Accounting class 1 must
be active to externalize the information.
We recommend you set accounting classes 1, 2, 3, 7, and 8. Be aware that this can add between 4%
and 5% to your overall system CPU consumption. However, if you are already writing accounting
classes 1, 2, and 3, then adding 7 and 8 typically should not add much overhead. Also, if you are
using an online performance monitor then it could already have these classes started. If that is
the case, then adding SMF as a destination for these classes should not add any CPU overhead.
Performance Trace
The performance trace provides information about a variety of DB2 events, including events
related to distributed data processing. You can use this information to further identify a
suspected problem or to tune DB2 programs and resources for individual users or for DB2 as a
whole. To start a performance trace, you must use the -START TRACE(PERFM) command.
Performance traces cannot be automatically started. Performance traces are expensive to run,
and consume a lot of CPU. They also collect a very large volume of information. Performance
traces are usually run via an online monitor tool, or the output from the performance trace can
be sent to SMF and then analyzed using a monitor reporting tool, or sent to IBM for analysis.
Because performance traces can consume a lot of resources and generate a lot of data, there
are a lot of options when starting the trace to balance the information desired with the
resources consumed. This includes limiting the trace data collected by plan, package, trace
class, and even IFCID.
Performance traces are typically utilized by online monitor tools to track a specific problem for
a given plan or package. Reports can then be produced by the monitor software, and can detail
SQL performance, locking, and many other detailed activities. Performance trace data can also
be written to SMF records, and batch reporting tools can read those records to produce very
detailed information about the execution of SQL statements in the application.
Statistics Report
Since statistics records are typically collected at 10 or 15 minute intervals, quite a few
records can be collected on a daily basis. Your reporting software should be able to produce
either summary reports, which can gather and summarize the data for a period of time, or
detail reports, which can report on every statistics interval. Start with a daily summary report,
and look for specific problems within the DB2 subsystem. Once you detect a problem then
you can produce a detailed report to determine the specific period of time that the problem
occurred, and also coordinate the investigation with detailed accounting reports for the same
time period in an effort to attribute the problem to a specific application or process. Some of
the things to look for in a statistics report:
• RID Pool Failures There should be a section of the statistics report that reports the usage
of the RID pool for things such as list prefetch, multiple index access, and hybrid joins. The
report will also indicate RID failures. There can be RDS failures, DM failures, and failures
due to insufficient size. If you are getting failures due to insufficient storage you can increase
the RID pool size. However, if you are getting RDS or DM failures in the RID pool then there
is a good chance that the access path selected is reverting to a table space scan. In these
situations it is important to determine which applications are getting these RID failures.
Therefore, you need to produce a detailed statistics report that can identify the time of the
failures, and also produce detailed accounting reports that will show which threads are
getting the failures. Further investigation will have to be performed to determine the packages
within the plan, and DB2 EXPLAIN can be used to determine which statements are using list
prefetch, hybrid join, or multi-index access. You may have to test the queries to determine if
they are indeed the ones getting these failures, and if they are, you'll have to try to influence
the optimizer to change the access path (see Chapter 3 for SQL tuning).
• Bufferpool Issues One of the most valuable pieces of information coming out of the
statistics report is the section covering buffer utilization and performance. For each buffer
pool in use the report will include the size of the pool, sequential and random getpages,
prefetch operations, pages written, and number of sequential I/O’s, random I/O’s, and write
I/O’s, plus much more. Also reported are the number of times certain buffer thresholds have
been hit. One of the things to watch for is the number of synchronous reads for sequential
access, which may be an indication that the buffer pool is too small and that pages for a
sequential prefetch are stolen before they are used. Another thing to watch is whether or not
any critical thresholds are reached, if there are write engines not available, and whether or
not deferred write thresholds are triggering. It’s also important to monitor the number of
getpages per synchronous I/O, as well as the buffer hit ratio. Please see Chapter 8 for
information about subsystem tuning.
• Logging Problems The statistics report will give important information about logging. This
includes the number of system checkpoints, number of reads from the log buffer, active log
data sets, or archived log data sets, number of unavailable output buffers, and total log
writes. This could give you an indication as to whether or not you need to increase log buffer
sizes, or investigate frequent application rollbacks or other activities that could cause
excessive log reads. Please see Chapter 8 for information about subsystem tuning.
• EDM Pool Hit Ratio The statistics report will show how often database objects such as
DBD’s, cursor tables, and package tables are requested as well as how often those requests
have to be satisfied via a disk read to one of the directory tables. You can use this to
determine if the EDM pool size needs to be increased. You also get statistics about the use
of dynamic statement cache and the number of times statement access paths were reused
in the dynamic statement cache. This could give you a good indication about the size of
your cache, and its effectiveness, but it could also give you an indication of the potential
reusability of the statements in the cache. Please see Chapter 8 for more information about
the EDM pool, and Chapter 3 for information about tuning dynamic SQL.
• Deadlocks and Timeouts The statistics report will give you a subsystem wide perspective
on the number of deadlocks and timeouts your applications have experienced. You can use
this as an overall method of detecting deadlocks and timeouts across all applications. If the
statistics summary report shows a positive count you can use the detailed report to find out
at what time the problems are occurring. You can also use accounting reports to determine
which applications are experiencing the problem.
This has only been a sample of the fields on the statistics report, and the valuable information
they provide. You should be using your statistics reports on a regular basis, and using your
monitoring software documentation, along with the DB2 Administration Guide (DB2 V7,
DB2 V8) or DB2 Performance Monitoring and Tuning Guide (DB2 9) to interpret the
information provided.
Accounting Report
An accounting report will read the SMF accounting records to produce thread or application
level information from the accounting trace. These reports typically can summarize information
at the level of a plan, package, correlation ID, authorization ID, and more. In addition, you
have the option to produce one report per thread. This accounting detail report can give
very detailed performance information for the execution of a specific application process. If you
have accounting classes 1,2,3,7, and 8 turned on, then the information will be reported at both
the plan and package level.
You can use the accounting report to find specific problems within certain applications,
programs, or threads. Some of the things to look for in an accounting report include:
• Class 1 and Class 2 Timings Class 1 times (elapsed and CPU) include the entire application
time, including the time spent within DB2. Class 2 is a component of class 1, and represents
the amount of time the application spent in DB2. The first question to ask when an
application is experiencing a performance problem is “where is the time being spent?” The
first indication of the performance issue being DB2 will be a high class 2 time relative to the
class 1 time. Within class 2 you could be having a CPU issue (CPU time represents the
majority of class 2 time), or a wait issue (CPU represents very little of the overall class 2
time, but class 3 wait represents most of the time), or maybe your entire system is CPU
bound (class 2 overall elapsed time is not accounted for by class 2 CPU time and class 3 wait
time combined).
• Buffer Usage The accounting report contains buffer usage at the thread level. This
information can be used to determine how a specific application or process is using the
buffer pools. If you have situations in which certain buffers have high random getpage counts
you may want to look at which applications are causing the high number of random
getpages. You can use this thread level information to determine which applications are
accessing buffers randomly versus sequentially. Then perhaps you can see which objects the
application uses, and use this information to separate sequentially accessed objects from
randomly accessed objects into different buffer pools (see Chapter 8 on subsystem tuning).
The buffer pool information in the accounting report will also indicate just how well the
application is utilizing the buffers. The report can be used during buffer pool tuning to
determine the impact of buffer changes on an application.
• Package Execution Times If accounting classes 7 and 8 are turned on, then the accounting
report will show information about the DB2 processing on a package level. This information
is very important for performance tuning because it allows you to determine which programs
in a poorly performing application should be reviewed first.
• Deadlocks, Timeouts, and Lock Waits The accounting report includes information about the
number of deadlocks and timeouts that occurred on a thread level. It also reports the time
the thread spent waiting on locks. This will give you a good indication as to whether or not
you need to do additional investigation into applications that are having locking issues.
• Excessive Synchronous I/O’s Do you have a slow running job or online process? Exactly
what is slow about that job or process? The accounting report will tell you if there are a large
number of excessive random synchronous I/O’s being issued, and how much time the
application spends waiting on I/O. The information in the report can also be used to
approximate the efficiency of your DASD by dividing the total synchronous I/O wait time by
the number of synchronous I/O's to get the average wait per I/O.
• RID Failures The accounting report does give thread level RID pool failure information. This
is important in determining if you have access path problems in a specific application.
• High Getpage Counts and High CPU Time Often it is hard to determine if an application is
doing repeat processing when there are not a lot of I/O’s being issued. You should use the
accounting report to determine if your performance problem is related to an inefficient repeat
process. If the report shows a very high getpage count, or that the majority of elapsed time is
actually class 2 CPU time, then that may be an indication of an inefficient repeat process in
one of the programs for the plan. You can use the package level information to determine
which program uses the most CPU, and try to identify any inefficiencies in that program.
There’s a way to quickly assess application performance and identify the significant offenders
and SQL statements causing problems. You can quickly identify the "low-hanging fruit," report
on it to your boss, and change the application or database to support a more efficient path to
the data. Management support is a must, and an effective manner of communicating
performance tuning opportunities and results is crucial.
There’s been some concern about the performance impact of this level of DB2 accounting.
The IBM DB2 Administration Guide (DB2 V7, DB2 V8) or the DB2 Performance Monitoring
and Tuning Guide (DB2 9) states that the performance impact of setting these traces is
minimal and the benefits can be substantial. Tests performed at a customer site demonstrated
an overall system impact of 4.3 percent for all DB2 activity when accounting classes 1, 2, 3, 7,
and 8 are started. In addition, adding accounting classes 7 and 8 when 1, 2, and 3 are already
started has nominal impact, as does the addition of most other performance monitor
equivalent traces (i.e. your online monitor software).
You can process whatever types of reports you produce so that a concentrated amount of
information about DB2 application performance can be extracted. This information is reduced
to the amount of elapsed time and CPU time the application consumes daily and the number
of SQL statements each package issues daily. This highly specific information will be your first
clue as to which packages provide the best DB2 tuning opportunity. In a package level report,
the areas of interest are the Package Name, Total DB2 Elapsed, Total SQL Count, and Total
DB2 TCB time.
If you lack access to a reporting tool that can filter out just the pieces of information desired,
you can write a simple program in any language to read the standard accounting reports and
pull out the information you need. REXX is an excellent programming language well-suited to
this type of "report scraping," and you can write a REXX program to do such work in a few
hours. You could write a slightly more sophisticated program to read the SMF data directly to
produce similar summary information if you wish to avoid dependency on the reporting
software. Once the standard reports are processed and summarized, all the information for a
specific interval (say one day) can appear in a simple spreadsheet. You can sort the
spreadsheet by CPU descending. With high consumers at the top of the report, the low
hanging fruit is easy to spot. The following spreadsheet can be derived by extracting the fields
of interest from a package level summary report:
Package Executions Total Elapsed Total CPU Total SQL Elaps/Execution CPU/Execution Elapsed/SQL CPU/SQL
ACCT001 246745 75694.2992 5187.4262 1881908 0.3067 0.021 0.0402 0.0027
ACCT002 613316 26277.2022 4381.7926 1310374 0.0428 0.0071 0.02 0.0033
ACCTB01 8833 4654.4292 2723.1485 531455 0.5269 0.3082 0.0087 0.0051
RPTS001 93 6998.7605 2491.9989 5762 75.2554 26.7956 1.2146 0.4324
ACCT003 169236 33439.2804 2198.0959 1124463 0.1975 0.0129 0.0297 0.0019
RPTS002 2686 2648.3583 2130.2409 2686 0.9859 0.793 0.9859 0.793
HRPK001 281 4603.1262 2017.7179 59048 16.3812 7.1804 0.0779 0.0341
HRPKB01 21846 3633.5143 2006.6083 316746 0.1663 0.0918 0.0114 0.0063
HRBKB01 505 2079.5351 1653.5773 5776 4.1178 3.2744 0.36 0.2862
CUSTB01 1 4653.9935 1416.6254 7591111 4653.9935 1416.6254 0.0006 0.0001
CUSTB02 1 3862.1498 1399.9468 7971317 3862.1498 1399.9468 0.0004 0.0001
CUST001 246670 12636.0232 1249.7678 635911 0.0512 0.005 0.0198 0.0019
CUSTB03 280 24171.1267 1191.0164 765906 86.3254 4.2536 0.0315 0.0015
RPTS003 1 5163.3568 884.0148 1456541 5163.3568 884.0148 0.0035 0.0006
CUST002 47923 10796.5509 875.252 489288 0.2252 0.0182 0.022 0.0017
CUST003 68628 3428.4162 739.4523 558436 0.0499 0.0107 0.0061 0.0013
CUSTB04 2 1183.2068 716.2694 3916502 591.6034 358.1347 0.0003 0.0001
CUSTB05 563 1232.2111 713.9306 1001 2.1886 1.268 1.2309 0.7132
Look for some simple things to choose the first programs to address. For example, package
ACCT001 consumes the most CPU per day, and issues nearly 2 million SQL statements.
Although the CPU consumed per statement on average is low, the sheer quantity of
statements issued indicates an opportunity to save significant resources. If just a tiny amount
of CPU can be saved, it will quickly add up. The same applies to package ACCT002 and
packages RPTS001 and RPTS002. These are some of the highest consumers of CPU and they
also have a relatively high average CPU per SQL statement. This indicates there may be some
inefficient SQL statements involved. Since the programs consume significant CPU per day,
tuning these inefficient statements could yield significant savings.
ACCT001, ACCT002, RPTS001, and RPTS002 represent the best opportunities for saving CPU,
so examine those first. Without this type of summarized reporting, it’s difficult to do any sort of
truly productive tuning. Most DBAs and systems programmers who lack these reports and look
only at online monitors or plan table information are really just shooting in the dark.
Reporting to Management
To do this type of tuning, you need buy-in from management and application developers. This
can sometimes be the most difficult part because, unfortunately, most application tuning
involves costly changes to programs. One way to demonstrate the potential ROI for programming
time is to report the cost of application performance problems in terms of dollars. This is
easy and amazingly effective!
The summarized reports can present information at the application level. An in-house
naming standard can be used to combine all the performance information from various
packages into application-level summaries. This lets you classify applications and address
the ones that use the most resources.
For example, if the in-house accounting application has a program naming standard where all
program names begin with "ACCT," then the corresponding DB2 package accounting infor-
mation can be grouped by this header. Thus, the DB2 accounting report data for programs
ACCT001, ACCT002, and ACCT003 can be grouped together, and their accounting information
summarized to represent the "ACCT" application.
Most capacity planners have formulas for converting CPU time into dollars. If you get this
formula from the capacity planner, and categorize the package information by application, you
can easily turn your daily package summary report into an annual CPU cost per application.
A simple chart can be developed using an in-house naming standard and a CPU-to-dollars
formula to show the annual CPU cost per application. Give that chart to the boss and watch
the reaction! It is a really great tool for getting the resources allocated to get the job done.
Make sure you produce a "cost reduction" report, in dollars, once a phase of the tuning has
completed. This makes it perfectly clear to management what the tuning efforts have
accomplished and gives incentive for further tuning efforts. Consider providing a visual
representation of your data. A bar chart with before and after results can be highly effective in
conveying performance tuning impact.
Involve managers and developers in your investigation. It's much easier to tune with a team
approach where different team members can be responsible for different analysis.
Performance traces are expensive, sometimes adding as much as 20 percent to the overall
CPU costs. However, a short-term performance trace may be an effective tool for gathering
information on frequent SQL statements and their true costs.
If plan table information isn't available for the targeted package, then you can rebind that
package with EXPLAIN(YES). If it's hard to get an outage to rebind with EXPLAIN(YES), or if a
plan table is only available under a different owner ID, you could also BIND a copy of the package
with EXPLAIN(YES) (for example, execute a BIND into a special/dummy collection ID) rather than
rebinding it.
In our example, PLAN_TABLE data was extracted for two of the most expensive programs
identified above.
Here, our most expensive program issues a simple SQL statement with matching index access
to the PERSON_ACCT table, and it orders the result, which results in a sort (Method=3). The
programmer, when consulted, advised that the query rarely returns more than a single row of
data. In this case, a bubble sort in the application program replaced the DB2 sort. The bubble
sort algorithm was almost never used because the query rarely returned more than one row,
and the CPU associated with DB2 sort initialization was avoided. Since this query was
executing many thousands of times per day, the CPU savings were substantial.
While these statements may have not caught someone's eye just by looking at EXPLAIN
results, when combined with the accounting data, they screamed for further investigation.
• Trace or Monitor Report You can run a performance trace or watch your online monitor for
the packages identified as high consumers in your package report. This type of monitoring
will help to drill down to the high-consuming SQL statements within these packages.
• Plan Table Report Run extractions of plan table information for the high-consuming programs
identified in your package report. You may quickly find some bad access paths that can be
tuned quickly. Don't forget to consider the frequency of execution as indicated in the package
report. Even a simple thing such as a small sort may be really expensive if executed often.
• Index Report Produce a report of all indexes on tables in the database of interest. This
report should include the index name, table name, column names, column sequence,
cluster ratio, clustering, first key cardinality, and full key cardinality. Use this report when
tuning SQL statements identified in the plan table or trace/monitor report. There may be
indexes you can take advantage of, add, change, or even drop. Unused indexes create
overhead for INSERT, DELETE, and UPDATE processing, as well as for utilities.
• DDL or ERD You're going to need to know about the database. This includes relationships
between tables, column data types, and where the data resides. An Entity Relationship
Diagram (ERD) is the best tool for this, but if none is available, you can print out the Data
Definition Language (DDL) SQL statements used to create the tables and indexes. If the DDL
isn’t available, you can use a tool such as DB2LOOK (yes, you can use this against a
mainframe database) to generate the DDL.
Don’t overlook the importance of examining the application logic. This has to do primarily with
the quantity of SQL statements being issued. The best performing SQL statement is the one
that is never issued, and it's surprising how often application programs will go to the database
when they don't have to. The program may be executing the world’s best-performing SQL
statements, but if the data isn't used, then they're really poor-performing statements.
CHAPTER 7
Application Design and Tuning for Performance
While the majority of performance improvements will be realized via proper application design,
or by tuning the application, nowhere does "it depends" matter more than when dealing
with applications. This is because the performance design and tuning techniques you apply will
vary depending upon the nature, needs, and characteristics of your application. This chapter
will describe some general recommendations for improving performance in applications.
Caching Data
If you are accessing code tables for validating data, editing data, translating values, or
populating drop-down boxes then those code tables should be locally cached for better
performance. This is particularly true for any code tables that rarely change (if the codes are
frequently changing then perhaps it's not a code table). For a batch COBOL job, for example,
read the code tables you'll use into in-core tables. For larger code tables you can employ a
binary or quick search algorithm to quickly look up values. For CICS applications set up the
code in VSAM files, and use them as CICS data tables. This is much faster than using DB2
to look up the value for every CICS transaction. The VSAM files can be refreshed on regular
intervals via a batch process, or on an as-needed basis. If you are using a remote application
or Windows-based clients, you should read the code tables once when the application starts,
and cache the values locally to populate various on-screen fields or to validate the data
input on a screen.
In other situations, and especially for object- or service-oriented applications, you should always
check if an object has already been read in before reading it again. This will avoid blindly
and constantly rereading the data every time a method is invoked to retrieve the data. If you
are concerned that an object is not "fresh" and that you may be updating old data, then you
can employ a concept called optimistic locking. With optimistic locking you don't have to
constantly reread the object (the table or query to support the object). Instead you read it
once, and when you go to update the object you can check the update timestamp to see if
someone else has updated the object before you. This technique is described further in the
locking section of this chapter.
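A minimal sketch of the technique, assuming a hypothetical ACCT_TABLE with an UPDATE_TS
column that is set on every change, might look like this:

-- Read the row once and remember its last update timestamp
SELECT ACCT_NAME, UPDATE_TS
INTO   :ACCT-NAME, :LAST-READ-TS
FROM   ACCT_TABLE
WHERE  ACCT_ID = :ACCT-ID;

-- Later, update only if nobody else has changed the row in the meantime
UPDATE ACCT_TABLE
SET    ACCT_NAME = :NEW-ACCT-NAME
      ,UPDATE_TS = CURRENT TIMESTAMP
WHERE  ACCT_ID   = :ACCT-ID
AND    UPDATE_TS = :LAST-READ-TS;

-- SQLCODE +100 here means someone else updated the row first; reread and retry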
Traditionally people have used a “next-key” table, which contained one row of one column with
a numeric data type, to generate keys. This typically involved reading the column, incrementing
and updating the column, and then using the new value as a key to another table. These next-
key tables typically wound up being a huge application bottleneck.
There are several ways to generate key values inside DB2, two of which are identity columns
(DB2 V7, DB2 V8, DB2 9) and sequence objects (DB2 V8, DB2 9). Identity columns are
attached to tables, and sequence objects are independent of tables. If you are on DB2 V7 then
your only choice of the two is the identity column. However, there are some limitations to the
changes you can make to identity columns in DB2 V7 so you are best placing them in their own
separate table of one row and one column, and using them as a next key generator table.
The high performance solution for key generation in DB2 is the sequence object. Make sure
that when you use sequence objects (or identity columns) you utilize the CACHE and
ORDER settings according to your high performance needs. These settings will impact the
number of values that are cached in advance of a request, as well as whether or not the order
the values are returned is important. The settings for a high level of performance in a data
sharing group would be, for example, CACHE 50 NO ORDER.
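A sequence object tuned this way for a data sharing group might be created as follows (the
sequence name and data type shown are illustrative):

CREATE SEQUENCE ACCT_SEQ
  AS BIGINT
  START WITH 1
  INCREMENT BY 1
  CACHE 50
  NO ORDER;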
When using sequence objects or identity columns (or even default values, triggers, ROWIDs,
the GENERATE_UNIQUE function, the RAND function, and more), you can reduce the number
of SQL statements your application issues by utilizing a SELECT from a result table (DB2 V8,
DB2 9). In this case that result table will be the result of an INSERT. In the following example
we use a sequence object to generate a unique key, and then return that unique key back to
the application (the table, column, and sequence names are illustrative):
SELECT ACCT_ID
FROM FINAL TABLE
     (INSERT INTO ACCT_TABLE (ACCT_ID, ACCT_NAME)
      VALUES (NEXT VALUE FOR ACCT_SEQ, :ACCT-NAME));
Check the tips chapter for a cool tip on using the same sequence object for batch and online
key assignment!
DB2 does provide a feature known as an INSTEAD OF trigger (DB2 9) that allows somewhat of
a mapping between the object world and multiple tables in a database. You can create a view
that joins two tables that are commonly accessed together. Then the object based application
can treat the view as a table. Since the view is on a join, it is a read-only view. However, you
can define an INSTEAD OF trigger on that view to allow INSERTs, UPDATEs, and DELETEs
against the view. The INSTEAD OF trigger can be coded to perform the necessary changes to
the tables of the join using the transition variables from the view.
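A minimal sketch of the idea, reusing the hypothetical PERSON_TABLE and PERSON_ORDER tables
from the previous chapter, might look like this (DB2 9):

CREATE VIEW PERSON_ORDER_V AS
  SELECT P.PERSON_ID, P.NAME, O.ORDER_NUM, O.PRODUCT_CD, O.QUANTITY, O.PRICE
  FROM   YLA.PERSON_TABLE P
  JOIN   YLA.PERSON_ORDER O ON O.PERSON_ID = P.PERSON_ID;

CREATE TRIGGER PERSON_ORDER_INS
  INSTEAD OF INSERT ON PERSON_ORDER_V
  REFERENCING NEW AS N
  FOR EACH ROW MODE DB2SQL
  BEGIN ATOMIC
    -- Assumes the parent row already exists; only the child table is changed
    INSERT INTO YLA.PERSON_ORDER
           (PERSON_ID, ORDER_NUM, PRODUCT_CD, QUANTITY, PRICE)
    VALUES (N.PERSON_ID, N.ORDER_NUM, N.PRODUCT_CD, N.QUANTITY, N.PRICE);
  END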
In our tests of a simple two table SQL join versus the equivalent programmatic join (FETCH
loop within a FETCH loop in a COBOL program), the two table SQL join used 30% less CPU
than the programmatic join.
Multi-Row Operations
In an effort to reduce the quantity of SQL statements the application is issuing DB2 provides
for some multi-row operations. This includes:
• Multi-row fetch (DB2 V8, DB2 9)
• Multi-row insert (DB2 V8, DB2 9)
• MERGE statement (DB2 9)
These are in addition to the possibility of doing a mass INSERT, UPDATE, or DELETE operation.
MULTI-ROW FETCH Multi-row fetching gives us the opportunity to return multiple rows (up to
32,767) in a single API call with a potential CPU performance improvement somewhere around
50%. It works for static or dynamic SQL, and scrollable or non-scrollable cursors. There is also
support for positioned UPDATEs and DELETEs. The sample programs DSNTEP4 (which is
DSNTEP2 with multi-row fetch) and DSNTIAUL also can exploit multi-row fetch.
There are two reasons to take advantage of the multi-row fetch capability:
1. To reduce the number of statements issued between your program address space
and DB2.
2. To reduce the number of statements issued between DB2 and the DDF address space.
The first way to take advantage of multi-row fetch is to program for it in your application code.
The second way to take advantage of multi-row fetch is in our distributed applications that are
using block fetching. Once in compatibility mode in DB2 V8 the blocks used for block fetching
are built using the multi-row capability without any code change on our part. This results in
great savings for our distributed SQLJ applications. In one situation the observed benefit of this
feature was that a remote SQLJ application migrated from DB2 V7 to DB2 V8 did not experience
a CPU increase.
Coding for a multi-row fetch is quite simple. The basic changes, illustrated in the sketch after
this list, include:
• Adding the phrase "WITH ROWSET POSITIONING" to a cursor declaration
• Adding the phrases "NEXT ROWSET" and "FOR n ROWS" to the FETCH statement
• Changing the host variables to host variable arrays (for COBOL this is as simple as adding
an OCCURS clause)
• Placing a loop within your fetch loop to process the rows
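Here is a minimal embedded SQL sketch of those changes; the ACCT_TABLE table, its columns,
and the COBOL host variable array names are assumptions for illustration:

DECLARE ACCT_CSR CURSOR WITH ROWSET POSITIONING FOR
  SELECT ACCT_ID, ACCT_NAME
  FROM   ACCT_TABLE;

OPEN ACCT_CSR;

-- Each FETCH returns up to 50 rows into the host variable arrays
FETCH NEXT ROWSET FROM ACCT_CSR
  FOR 50 ROWS
  INTO :ACCT-ID-ARRAY, :ACCT-NAME-ARRAY;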
These changes are quite simple, and can have a profound impact on performance. In our tests
of a sequential batch program the use of 50 row fetch (the point of diminishing return for our
test) of 39 million rows of a table reduced CPU consumption by 60% over single-row fetch. In
a completely random test where we expected on average 20 rows per random key, our 20 row
fetch used 25% less CPU than the single-row fetch. Keep in mind, however, that multi-row
fetch is a CPU saver, and not necessarily an elapsed time saver.
When using multi-row fetch, the GET DIAGNOSTICS statement is not necessary, and should
be avoided due to high CPU overhead. Instead use the SQLCODE field of the SQLCA to
determine whether your fetch was successful (SQLCODE 000), if the fetch failed (negative
SQLCODE), or if you hit end of file (SQLCODE 100). If you received an SQLCODE 100 then you
can check the SQLERRD3 field of the SQLCA to determine the number of rows to process.
MULTI-ROW INSERT As with multi-row fetch reading multiple rows per FETCH statement,
a multi-row insert can insert multiple rows into a table in a single INSERT statement. The
INSERT statement simply needs to contain the “FOR n ROWS” clause, and the host variables
referenced in the VALUES clause need to be host variable arrays. IBM states that multi-row
inserts can result in as much as a 25% CPU savings over single row inserts. In addition, multi-
row inserts can have a dramatic impact on the performance of remote applications in that the
number of statements issued across a network can be significantly reduced.
The multi-row insert can be coded as ATOMIC, meaning that if one insert fails then the entire
statement fails, or it can be coded as NOT ATOMIC CONTINUE ON SQLEXCEPTION, which means
that the failure of any one of the inserts will only impact that one insert of the set.
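A minimal sketch of a multi-row insert (again assuming a hypothetical ACCT_TABLE and COBOL
host variable arrays) might look like this:

INSERT INTO ACCT_TABLE (ACCT_ID, ACCT_NAME)
  VALUES (:ACCT-ID-ARRAY, :ACCT-NAME-ARRAY)
  FOR :ROW-COUNT ROWS
  NOT ATOMIC CONTINUE ON SQLEXCEPTION;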
As with the multi-row fetch, the GET DIAGNOSTICS statement is not initially necessary, and
should be avoided for performance reasons unless it is needed. In the case of a failed non-atomic
multi-row insert you’ll get a SQLCODE of -253 if one or more of the inserts failed. Only then
should you use GET DIAGNOSTICS to determine which one failed. Remember, if you get a
SQLCODE of zero then all the inserts were a success, and there is no need for additional
analysis.
MERGE STATEMENT Many times applications are interfacing with other applications. In these
situations an application may receive a large quantity of data that applies to multiple rows of
a table. Typically the application would in this case perform a blind update. That is, the
application would simply attempt to update the rows of data in the table, and if any update
failed because a row was not found, then the application would insert the data instead. In other
situations, the application may read all of the existing data, compare that data to the new
incoming data, and then programmatically insert or update the table with the new data.
DB2 9 supports this type of processing via the MERGE statement. The MERGE statement
updates a target (table or view, or the underlying tables of a fullselect) using the specified
input data. Rows in the target that match the input data are updated as specified, and rows
that do not exist in the target are inserted. MERGE can utilize a table or an array of variables
as input.
Since the MERGE operates against multiple rows, it can be coded as ATOMIC or NOT
ATOMIC. The NOT ATOMIC option will allow rows that have been successfully updated or
inserted to remain if others have failed. The GET DIAGNOSTICS statement should be used
along with NOT ATOMIC to determine which updates or inserts have failed.
The following example shows a MERGE of rows on the employee sample table, using host
variable arrays as input:

MERGE INTO EMP AS EXISTING_TBL
  USING (VALUES (:EMPNO-ARRAY, :SALARY-ARRAY, :COMM-ARRAY, :BONUS-ARRAY)
         FOR :ROW-CNT ROWS) AS INPUT_TBL (EMPNO, SALARY, COMM, BONUS)
  ON INPUT_TBL.EMPNO = EXISTING_TBL.EMPNO
  WHEN MATCHED THEN UPDATE SET SALARY = INPUT_TBL.SALARY
                              ,COMM = INPUT_TBL.COMM
                              ,BONUS = INPUT_TBL.BONUS
  WHEN NOT MATCHED THEN INSERT (EMPNO, SALARY, COMM, BONUS)
       VALUES (INPUT_TBL.EMPNO, INPUT_TBL.SALARY,
               INPUT_TBL.COMM, INPUT_TBL.BONUS)
  NOT ATOMIC CONTINUE ON SQLEXCEPTION;
As with the multi-row insert operation, the use of GET DIAGNOSTICS should be limited.
Advanced DB2 features such as user-defined functions, stored procedures, triggers, and
constraints allow applications to be written quickly by pushing some of the logic of the
application into the database server. Most of the time advanced functionality can be
incorporated into the database using these features at a much lower development cost than
coding the feature into the application itself. A feature such as database enforced referential
integrity (RI) is a perfect example of something that is quite easy to implement in the
database, but would take significantly longer time to code in a program.
These advanced database features also allow application logic to be placed as part of the
database engine itself, making this logic more easily reusable enterprise wide. Reusing existing
logic will mean faster time to market for new applications that need that logic, and having the
logic centrally located makes it easier to manage than client code. Also, in many cases having
data intensive logic located on the database server will result in improved performance as that
logic can process the data at the server, and only return a result to the client.
Using advanced SQL for performance was addressed in Chapter 3 of this guide, and so let’s
address the other features here.
USER-DEFINED FUNCTIONS Functions are a useful way of extending the programming power
of the database engine. Functions allow us to push additional logic into our SQL statements.
User-Defined scalar functions work on individual values of a parameter list, and return a single
value result. A table function can return an actual table to a SQL statement for further
processing (just like any other table). User-defined functions (UDF) provide a major
breakthrough in database programming technology. UDFs actually allow developers and
DBAs to extend the capabilities of the database. This allows for more processing to be pushed
into the database engine, which in turns allows these types of processes to become more
centralized and controllable. Virtually any type of processing can be placed in a UDF, including
legacy application programs. This can be used to create some absolutely amazing results, as
well as push legacy processing into SQL statements. Once your processing is inside SQL
statements you can put those SQL statements anywhere. So, anywhere you can run your SQL
statements (say, from a web browser), you can run your programs! Just like complex
SQL statements, UDFs place more logic into the highly portable SQL statements.
Also just like complex SQL, UDFs can be a performance advantage or disadvantage. If the
UDFs process large amounts of data, and return a result to the SQL statement, there may be a
performance advantage over the equivalent client application code. However, if a UDF is used
to process data only then it can be a performance disadvantage, especially if the UDF is
invoked many times or embedded in a table expression, as data type casting (for SQL scalar
UDFs compared to the equivalent expression coded directly in the SQL statement) and task
switch overhead (external UDFs run in a stored procedure address space) are expensive (DB2
V8 relieves some of this overhead for table functions). Converting a legacy program into a UDF
in about a day’s time, invoking that program from a SQL statement, and then placing that SQL
statement where it can be accessed via a client process may just be worth that expense!
Simply put, if the UDF results in the application program issuing fewer SQL statements, or
getting access to a legacy process then chances are that the UDF is the right decision.
STORED PROCEDURES Stored procedures are becoming more prevalent on the mainframe, and
can be part of a valuable implementation strategy. Stored procedures can be a performance
benefit for distributed applications, or a performance problem. In every good implementation
there are trade-offs. Most of the trade-offs involve sacrificing performance for things like
flexibility, reusability, security, and time to delivery. It is possible to minimize the impact of
distributed application performance with the proper use of stored procedures.
Since stored procedures can be used to encapsulate business logic in a central location on
the mainframe, they offer a great advantage as a source of secured, reusable code. By using a
stored procedure the client will only need to have authority to execute the stored procedures
and will not need authority to the DB2 tables that are accessed from the stored procedures. A
properly implemented stored procedure can help improve availability. Stored procedures can be
stopped, queuing all requestors. A change can be implemented while access is prevented, and
the procedures restarted once the change has been made. If business logic and SQL access is
encapsulated within stored procedures there is less dependency on client or application server
code for business processes. That is, the client takes care of things like display logic and edits,
and the stored procedure contains the business logic. This simplifies the change process, and
makes the code more reusable. In addition, like UDFs stored procedures can be used to access
legacy data stores, and quickly web enable our legacy processes.
The major advantage to stored procedures is when they are implemented in a client/server
application that must issue several remote SQL statements. The network overhead involved in
sending multiple SQL commands and receiving result sets is quite significant, therefore proper
use of stored procedures to accept a request, process that request with encapsulated SQL
statements and business logic, and return a result will lessen the traffic across the network and
reduce the application overhead. If a stored procedure is coded in this manner then it can be a
significant performance improvement. Conversely, if the stored procedures contain only a few
or one SQL statement the advantages of security, availability, and reusability can be realized,
but performance will be worse than the equivalent single statement executions from the client
due to task switch overhead.
DB2 9 offers a significant performance improvement for stored procedures with the
introduction of native SQL procedures. These unfenced SQL procedures will execute as run
time structures rather than be converted into external C program procedures (as with DB2 V7
and DB2 V8). Running these native SQL procedures will eliminate the task switch overhead of
executing in the stored procedure address space. This represents a significant performance
improvement for SQL procedures that contain little program logic, and few SQL statements.
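As a simple sketch (the procedure, table, and column names are hypothetical), a DB2 9 native
SQL procedure is created by coding the body directly in the CREATE PROCEDURE statement,
without the FENCED or EXTERNAL options:

CREATE PROCEDURE GET_ACCT_BALANCE
  (IN  P_ACCT_ID  INTEGER
  ,OUT P_BALANCE  DECIMAL(15,2))
  LANGUAGE SQL
BEGIN
  SELECT ACCT_BALANCE
  INTO   P_BALANCE
  FROM   ACCT_TABLE
  WHERE  ACCT_ID = P_ACCT_ID;
END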
TRIGGERS AND CONSTRAINTS Triggers and constraints can be used to move application logic
into the database. The greatest advantage to triggers and constraints is that they are generally
data intensive operations, and these types of operations are better performers when placed
close to the data. These features consist of:
• Triggers
• Database Enforced Referential Integrity (RI)
• Table Check Constraints
A trigger is a database object that contains some application logic in the form of SQL
statements that are invoked when data in a DB2 table is changed. These triggers are installed
into the database, and are then dependent upon the table on which they are defined. SQL
DELETE, UPDATE, and INSERT statements can activate triggers. They can be used to replicate
data, enforce certain business rules, and to fabricate data. Database enforced RI can be used to
ensure that relationships from tables are maintained automatically. Child table data cannot be
created unless a parent row exists, and rules can be implemented to tell DB2 to restrict or
cascade deletes to a parent when child data exists. Table check constraints are used to ensure
values of specific table columns, and are invoked during LOAD, insert, and update operations.
Triggers and constraints ease the programming burden because the logic, in the form of SQL
is much easier to code than the equivalent application programming logic. This helps make
the application programs smaller and easier to manage. In addition, since the triggers and
constraints are connected to DB2 tables, they are centrally located rules that are universally
enforced. This helps to ensure data integrity across many application processes. Triggers can
also be used to automatically invoke UDFs and stored procedures, which can introduce some
automatic and centrally controlled intense application logic.
There are wonderful advantages to using triggers and constraints. They most certainly provide
for better data integrity, faster application delivery time, and centrally located reusable code.
Since the logic in triggers and constraints is usually data intensive their use typically
outperforms the equivalent application logic simply due to the fact that no data has to be
returned to the application when these automated processes fire. There is one trade-off for
performance, however. When triggers, RI, or check constraints are used in place of application
edits they can be a serious performance disadvantage. This is especially true if several edits on
a data entry screen are verified at the server. It could be as bad as one trip to the server and
back per edit. This would seriously increase message traffic between the client and the server.
For this reason, data edits are best performed at the client when possible.
It is important to understand that when you are working with triggers you need to respect the
triggers when performing schema migrations or changes to the triggering tables. The triggers
will, in some situations, have to be recreated in the same sequence they were originally
created. In certain situations trigger execution sequence may be important, and if there are
multiple triggers of the same type against a table then they will be executed in the order they
were defined.
Remember, a locally executing batch process that is processing the input data in the same
sequence as the clustering of your tables, and that is bound with RELEASE(DEALLOCATE), will
utilize several performance enhancers, especially dynamic prefetch and index lookaside, to
significantly improve the performance of these batch processes.
Searching
Search queries, as well as driving cursors (large queries that provide the input data to a batch
process), can be expensive queries. Here there is, once again, a trade off between the amount
of program code you are willing to write and the performance of your application.
If you code a generic search query, you will get generic performance. In the following example
the SELECT statement basically supports a direct read, a range read, and a restart read in one
statement. In order to enable this type of generic access a generic predicate has to be coded.
In most cases this means that for every SQL statement issued more data will be read than is
needed. This is due to the fact that DB2 has a limited ability to match on these types of
predicates. In the following statement the predicate supports a direct read, sequential read,
and restart read for at least one part of a three part compound key:
-- illustrative generic predicate; the column and host variable names are assumed
WHERE (COL1 = :COL1-MIN AND COL2 = :COL2-MIN AND COL3 BETWEEN :COL3-MIN AND :COL3-MAX)
OR (COL1 = :COL1-MIN AND COL2 > :COL2-MIN)
OR (COL1 > :COL1-MIN AND COL1 <= :COL1-MAX)
These predicates are very flexible; however, they are not the best performing. The predicate in
this example most likely results in a non-matching index scan even though an index on COL1,
COL2, COL3 is available. This means that the entire index will have to be searched each time
the query is executed. This is not a bad access path for a batch cursor that is reading an entire
table in a particular order. For any other query, however, it is a detriment. This is especially true
for online queries that are actually providing three columns of data (all min and max values are
equal). For larger tables the CPU and elapsed time consumed can be significant.
The very best solution is to have separate predicates for the various combinations of key
columns provided. This will allow DB2 to use matching index access for each combination of key
parts provided, as shown in the example below.
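For example (the column names follow the COL1, COL2, COL3 index mentioned above; the table name, the extra select column, and the host variables are assumed), the program might issue one of several statements depending on how many key parts are provided:
-- all three key columns provided: three matching index columns
SELECT COL1, COL2, COL3, DATA_COL
FROM KEY_TABLE
WHERE COL1 = :COL1
AND COL2 = :COL2
AND COL3 = :COL3
-- only the first two key columns provided: two matching index columns
SELECT COL1, COL2, COL3, DATA_COL
FROM KEY_TABLE
WHERE COL1 = :COL1
AND COL2 = :COL2
Each variation lets DB2 match on every key column that is actually supplied, at the cost of more statements in the program.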
This will dramatically increase the number of SQL statements coded within the program, but
will also dramatically increase the statement performance.
If the additional statements are not desired then there is another choice for the generic
predicates. This would involve adding a redundant Boolean term predicate. These Boolean
term predicates will enable DB2 to match on one column of the index. Therefore, for this
WHERE clause:
-- illustrative WHERE clause; the column and host variable names are assumed
WHERE ((COL1 = :COL1-MIN AND COL2 = :COL2-MIN AND COL3 BETWEEN :COL3-MIN AND :COL3-MAX)
OR (COL1 = :COL1-MIN AND COL2 > :COL2-MIN)
OR (COL1 > :COL1-MIN AND COL1 <= :COL1-MAX))
AND COL1 >= :COL1-MIN
The addition of this redundant predicate does not affect the result of the query, but allows DB2
to match on the COL1 column.
Name searching can be a challenge. Once again we are faced with multiple queries to solve
multiple conditions, or one large generic query to solve any request. In many cases it pays to
study the common input fields for a search, and then code specific queries that match those
columns only, and are supported by an index. Then, the generic query can support the less
frequently searched on fields.
We have choices for coding our search queries. Let’s say that we need to search for two
variations of a name to try and find someone in our database. The following query can be
coded to achieve that (in this case a name reversal):
-- the LAST_NAME column and predicates are assumed for illustration
SELECT PERSON_ID
FROM PERSON_TBL
WHERE (LAST_NAME = ‘RADY’ AND FIRST_NAME = ‘BOB’) OR
(LAST_NAME = ‘BOB’ AND FIRST_NAME = ‘RADY’);
This query gets the job done, but uses multi-index access at best. Another way you could code
the query is as follows:
-- the WHERE clauses are assumed; they mirror the name reversal search above
SELECT PERSON_ID
FROM PERSON_TBL
WHERE LAST_NAME = ‘RADY’ AND FIRST_NAME = ‘BOB’
UNION ALL
SELECT PERSON_ID
FROM PERSON_TBL
WHERE LAST_NAME = ‘BOB’ AND FIRST_NAME = ‘RADY’;
This query gets better index access, but will probe the table twice. The next query uses a
common table expression (DB2 V8, DB2 9) to build a search list, and then divides that table
into the person table:
-- sketch only; the column list and the join to the person table are assumed
WITH SEARCH_LIST (FIRST_NAME, LAST_NAME) AS
(SELECT ‘BOB’, ‘RADY’ FROM SYSIBM.SYSDUMMY1
UNION ALL
SELECT ‘RADY’, ‘BOB’ FROM SYSIBM.SYSDUMMY1)
SELECT PERSON_ID
FROM PERSON_TBL P, SEARCH_LIST S
WHERE P.FIRST_NAME = S.FIRST_NAME
AND P.LAST_NAME = S.LAST_NAME;
This query gets good index matching and perhaps reduced probes. Finally, the next query
utilizes a during join predicate to probe on the first condition and only apply the second
condition if the first finds nothing. That is, it will only execute the second search if the first finds
nothing and completely avoid the second probe into the table. Keep in mind that this query
may not produce the same results as the previous queries due to the optionality of the search:
-- sketch only; the select list and the search predicates are assumed
SELECT COALESCE(A.PERSON_ID, B.PERSON_ID)
FROM SYSIBM.SYSDUMMY1
LEFT OUTER JOIN
PERSON_TBL A
ON IBMREQD = ‘Y’
AND A.LAST_NAME = ‘RADY’ AND A.FIRST_NAME = ‘BOB’
LEFT OUTER JOIN
(SELECT PERSON_ID
FROM PERSON_TBL
WHERE LAST_NAME = ‘BOB’ AND FIRST_NAME = ‘RADY’) B
ON A.PERSON_ID IS NULL
Which search query is the best for your situation? Test and find out! Just keep in mind that
when performance is a concern there are many choices.
Existence Checking
What is the best for existence checking within a query? Is it a join, a correlated subquery, or a
non-correlated subquery? Of course it depends on your situation, but these types of existence
checks are always better resolved in a SQL statement than with separate queries in your
program. Here are the general guidelines:
SELECT SNAME
FROM S
WHERE S# IN
(SELECT S# FROM SP
WHERE P# = ‘P2’)
Non-correlated subqueries such as this are generally good when there is no available index for the
inner select but there is one on the outer table column (making the predicate indexable), or when
there is no index on either the inner or outer columns. We also like non-correlated subqueries when
a relatively small amount of data is provided by the subquery. Keep in mind that DB2 can transform
a non-correlated subquery to a join.
SELECT SNAME
FROM S
WHERE EXISTS
(SELECT * FROM SP
WHERE SP.S# = S.S#
AND SP.P# = ‘P2’)
-- the correlation predicate is assumed; it parallels the non-correlated example above
Correlated subqueries such as this are generally good when there is a supporting index available on
the inner select and there is a cost benefit in reducing repeated executions against the inner table
and avoiding the distinct sort required for a join. They may also be a benefit if the inner query would
return a large amount of data when coded as a non-correlated subquery, as long as there is a
supporting index for the inner query. Also, the correlated subquery can outperform the equivalent
non-correlated subquery if an index on the outer table is not used (DB2 chose a different index
based upon other predicates) and one exists in support of the inner table.
-- the select list and join predicates are assumed for illustration
SELECT DISTINCT SNAME
FROM S, SP
WHERE S.S# = SP.S#
AND SP.P# = ‘P2’
A join such as this may be best if supporting indexes are available and most rows hook up in the join. Also, if the
join results in no extra rows returned then the DISTINCT can also be avoided. Joins can provide
DB2 the opportunity to pick the best table access sequence, as well as apply predicate
transitive closure.
Which existence check method is best for your situation? We don’t know, but you have choices
and should try them out! It should also be noted that as of DB2 9 it is possible to code an
ORDER BY and FETCH FIRST in a subquery, which can provide even more options for existence
checking in subqueries!
For singleton existence checks you can code FETCH FIRST and ORDER BY clauses in a
singleton select. This could provide the best existence checking performance in a stand
alone query:
-- the select list, predicate, and host variable names are assumed for illustration
SELECT 1 INTO :HV-EXISTS
FROM TABLE
WHERE COL1 = :HV-COL1
FETCH FIRST 1 ROW ONLY
Lock avoidance also needs frequent commits so that other processes do not have to acquire
locks on updated pages, and this also allows for page reorganization to occur to clear the
“possibly uncommitted” (PUNC) bit flags in a page. Frequent commits allow the commit log
sequence number (CLSN) on a page to be updated more often, since it is dependent on the
oldest begin unit of recovery in the log.
The best way to avoid taking locks in your read-only cursors is to read uncommitted. Use the
WITH UR clause in your statements to avoid taking or waiting on locks. Keep in mind however
that using WITH UR can result in the reading of uncommitted, or dirty, data that may eventually
be rolled back. If you are using WITH UR in an application that will update the data, then an
optimistic locking strategy is your best performing option.
Optimistic Locking
With high demands for full database availability, as well as high transaction rates and levels of
concurrency, reducing database locks is always desired. With this in mind, many applications
are employing a technique called “optimistic locking” to achieve these higher levels of
availability and concurrency. This technique traditionally involves reading data with an
uncommitted read or with cursor stability. Update timestamps are maintained in all of the data
tables. This update timestamp is read along with all the other data in a row. When a direct
update is subsequently performed on the row that was selected, the timestamp is used to
verify that no other application or user has changed the data between the point of the read and
the update. This places additional responsibility on the application to use the timestamp on all
updates, but the result is a higher level of DB2 performance and concurrency.
Here is a hypothetical example of optimistic locking. First the application reads the data from a
table with the intention of subsequently updating:
-- the column and host variable names are assumed for illustration
SELECT DATA1, UPDATE_TS
INTO :HV-DATA1, :HV-UPDATE-TS
FROM TABLE1
WHERE KEY1 = :HV-KEY1
WITH UR
Here the data has been changed and the update takes place.
-- the timestamp predicate shown is the essence of the optimistic locking check
UPDATE TABLE1
SET DATA1 = :HV-NEW-DATA1,
UPDATE_TS = CURRENT TIMESTAMP
WHERE KEY1 = :HV-KEY1
AND UPDATE_TS = :HV-UPDATE-TS
If the data has changed then the update will get a SQLCODE of 100, and restart logic will have
to be employed for the update. This requires that all applications respect the optimistic locking
strategy and update timestamp when updating the same table.
As of DB2 9, IBM has introduced built-in support for optimistic locking via the ROW CHANGE
TIMESTAMP. When a table is created or altered, a special column can be created as a row
change timestamp. These timestamp columns will be automatically updated by DB2 whenever
a row of a table is updated. This built-in support for optimistic locking takes some of the
responsibility (that of updating the timestamp) out of the hands of the various applications
that might be updating the data.
Here is how the previous example would look when using the ROW CHANGE TIMESTAMP for
optimistic locking:
-- column and host variable names are assumed; the ROW CHANGE TIMESTAMP expression
-- replaces the user-maintained update timestamp column
SELECT DATA1, ROW CHANGE TIMESTAMP FOR TABLE1
INTO :HV-DATA1, :HV-ROW-TS
FROM TABLE1
WHERE KEY1 = :HV-KEY1
WITH UR
Here the data has been changed and the update takes place.
UPDATE TABLE1
SET DATA1 = :HV-NEW-DATA1
WHERE KEY1 = :HV-KEY1
AND ROW CHANGE TIMESTAMP FOR TABLE1 = :HV-ROW-TS
Heuristic control/restart tables have rows unique to each application process to assist in
controlling the commit scope using the number of database updates or time between commits
as their primary focus. There can also be an indicator in the table to tell an application that it is
time to stop at the next commit point. These tables are accessed every time an application
starts a unit-of-recovery (unit-of-work), which would be at process initiation or at a commit
point. The normal process is for an application to read the control table at the very beginning of
the process to get the dynamic parameters and commit time to be used. The table is then used
to store information about the frequency of commits as well as any other dynamic information
that is pertinent to the application process, such as the unavailability of some particular
resource for a period of time. Once the program is running, it both updates and reads from the
control table at commit time. Information about the status of all processing at the time of the
commit is generally stored in the table so that a restart can occur at that point if required.
Values in these tables can be changed either through SQL in a program or by a production
control specialist to be able to dynamically account for the differences in processes through
time. For example, you would probably want to change the commit scope of a job that is
running during the on-line day vs. when it is running during the evening. You can also set
indicators to tell an application to gracefully shut down, run different queries with different
access paths due to a resource being taken down, or go to sleep for a period of time.
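As a simple sketch of such a control table (the table name and columns are assumed; real implementations vary widely), the definition might look like this:
CREATE TABLE CHKPT_RESTART
(PROGRAM_NAME CHAR(8) NOT NULL,
COMMIT_FREQ INTEGER NOT NULL,
STOP_IND CHAR(1) NOT NULL WITH DEFAULT 'N',
RESTART_KEY VARCHAR(254) NOT NULL WITH DEFAULT,
LAST_COMMIT_TS TIMESTAMP NOT NULL WITH DEFAULT,
PRIMARY KEY (PROGRAM_NAME));
-- a unique index on PROGRAM_NAME must also be created to complete the primary key
The batch program reads its row at start-up to pick up COMMIT_FREQ and any restart position, and it updates RESTART_KEY and LAST_COMMIT_TS at every commit so that a restart can resume from that point.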
CHAPTER 8
Tuning Subsystems
There are several areas in the DB2 subsystem that you can examine for performance
improvements. These areas are components of DB2 that aid in application processing. This
chapter describes those DB2 components and presents some tuning tips for them.
Buffer Pools
Buffer pools are areas of virtual storage that temporarily store pages of table spaces or indexes.
When a program accesses a row of a table, DB2 places the page containing that row in a buffer.
When a program changes a row of a table, DB2 must (eventually) write the data in the buffer back
to disk, normally either at a DB2 system checkpoint or when a write threshold is reached. The write
thresholds are either a vertical threshold at the page set level or a horizontal threshold at the
buffer pool level.
The way buffer pools work is fairly simple by design, but it is tuning these simple operations
that can make all the difference in the world to the performance of our applications. The data
manager issues GETPAGE requests to the buffer manager who hopefully can satisfy the
request from the buffer pool instead of having to retrieve the page from disk. We often trade
CPU for I/O in order to manage our buffer pools efficiently. Buffer pools are maintained by
subsystem, but individual buffer pool design and use should be by object granularity and in
some cases also by application.
DB2 buffer pool management by design allows you to ALTER and DISPLAY
buffer pool information dynamically without requiring a bounce of the DB2 subsystem. This
improves availability by allowing us to dynamically create new buffer pools when necessary
and to also dynamically modify or delete buffer pools. We may find we need to ALTER
buffer pools a couple of times during the day because of varying workload characteristics. We will
discuss this when we look at tuning the buffer pool thresholds. Initial buffer pool definitions are set
at installation/migration, but they are often hard to configure at that time because the access
patterns against the objects are usually not well understood at installation. Regardless of what is
set at installation, we can use ALTER any time after the install to add or delete buffer pools,
resize the buffer pools, or change any of the thresholds. The buffer pool definitions are stored
in the BSDS (bootstrap data set), and we can move objects between buffer pools via an ALTER
INDEX/TABLESPACE and a subsequent START/STOP command of the object, as shown below.
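For example (the buffer pool, database, and table space names here are illustrative), a buffer pool can be resized and an object moved to it with commands such as:
-ALTER BUFFERPOOL(BP7) VPSIZE(50000)
ALTER TABLESPACE MYDB.MYTS BUFFERPOOL BP7;
-STOP DATABASE(MYDB) SPACENAM(MYTS)
-START DATABASE(MYDB) SPACENAM(MYTS)
The table space does not actually begin using the new buffer pool until its data sets are closed and reopened, which is why the STOP and START commands follow the ALTER.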
Pages
There are three types of pages in virtual pools:
• Available pages: pages on an available queue (LRU, FIFO, MRU) for stealing
• In-Use pages: pages currently in use by a process that are not available for stealing. In Use
counts do not indicate the size of the buffer pool, but this count can help determine
residency for initial sizing
• Updated pages: these pages are not ‘in-use’, not available for stealing, and are considered
‘dirty pages’ in buffer pool waiting to be externalized
There are four page sizes and several bufferpools to support each size:
BP0 – BP49 4K pages
BP8K0 – BP8K9 8K pages
BP16K0 – BP16K9 16K pages
BP32K0 – BP32K9 32K pages
Work file table space pages are only 4K or 32K. There is a DSNZPARM called DSVCI that
allows the control interval to match to the actual page size.
Our asynchronous page writes per I/O will change with each page size accordingly.
4K Pages 32 Writes per I/O
8K Pages 16 Writes per I/O
16K Pages 8 Writes per I/O
32K Pages 4 Writes per I/O
With these page sizes we can achieve better hit ratios and have less I/O because we can
fit more rows on a page. For instance, if we have a 2,200 byte row (perhaps in a data warehouse),
a 4K page would only be able to hold 1 row, while an 8K page could hold 3 rows, one more row
than two 4K pages would hold, and one less page lock as well if page locking is used. However,
we do not want to use these page sizes as a band-aid for what may be a poor design. You may
want to consider decreasing the row size based upon usage to get more rows per page.
DB2 breaks up these queues into multiple LRU chains. This way there is less overhead for
queue management because the latch that is taken at the head of the queue (actually on the
hash control block which keeps the order of the pages on the queue) will be latched less
because the queues are smaller. Multiple subpools are created for a large virtual buffer pool
and the threshold is controlled by DB2, not to exceed 4000 VBP buffers in each subpool. The
LRU queue is managed within each of the subpools in order to reduce the buffer pool latch
contention when the degree of concurrency is high. Stealing of these buffers occurs in a
round-robin fashion through the subpools.
FIFO — First-in, first-out can also be used instead of the default of LRU. With this method the
oldest pages are moved out regardless of how recently they were referenced. This decreases the
cost of doing a GETPAGE operation and reduces internal latch contention for high concurrency.
It should only be used where there is little or no I/O and where the table space or index is resident
in the buffer pool. We will have separate buffer pools for LRU and FIFO objects, and this can be set
via the ALTER BUFFERPOOL command with the PGSTEAL option of FIFO. LRU is the PGSTEAL
option default.
Asynchronous reads are several pages read per I/O for prefetch operations such as
sequential prefetch, dynamic prefetch, or list prefetch. Asynchronous writes are several pages
per I/O for operations such as deferred writes.
We want to control page externalization via our DWQT and VDWQT thresholds for best
performance and avoid surges in I/O. We do not want page externalization to be controlled by
DB2 system checkpoints because too many pages would be written to disk at one time causing
I/O queuing delays, increased response time and I/O spikes. During a checkpoint all updated
pages in the buffer pools are externalized to disk and the checkpoint recorded in the log
(except for the work files).
In very general terms during an on-line processing, DB2 should checkpoint about every 5 to 10
minutes, or some other value based on investigative analysis of the impact on restart time after
a failure. There are two real concerns for how often we take checkpoints:
• The cost and disruption of the checkpoints
• The restart time for the subsystem after a crash
Many times the cost and disruption of DB2 checkpoints are overstated. A DB2
checkpoint is a tiny hiccup, and it does not prevent processing from proceeding. However, having a
CHKFREQ setting that is too high along with large buffer pools and high write thresholds, such as
the defaults, can cause enough I/O to make the checkpoint disruptive. In trying to control
checkpoints, some users have increased the CHKFREQ value to make checkpoints less
frequent, but in effect made them much more disruptive. The situation is corrected by
reducing the amount written and increasing the checkpoint frequency, which yields much
better performance and availability. It is not only possible, but has been observed at some
installations, that a checkpoint every minute does not impact performance or availability. The
write efficiency at DB2 checkpoints is the key factor to observe in deciding whether CHKFREQ
can be reduced.
If the write thresholds (DWQT/VDQWT) are doing their job, then there is less work to
perform at each checkpoint. Also using the write thresholds to cause I/O to be performed in a
level, non-disruptive fashion is also helpful for the non-volatile storage in storage controllers.
However, even if we have our write thresholds (DWQT/VDQWT) set properly, as well as our
checkpoints, we could still see an unwanted write problem. This could occur if we do not have
our log datasets properly sized. If the active log data sets are too small then active log switches
will occur often. When an active log switch takes place a checkpoint is taken automatically.
Therefore, our logs could be driving excessive checkpoint processing, resulting in constant
writes. This would prevent us from achieving a high ratio of pages written per I/O because the
deferred write queue would not be allowed to fill as it should.
Sizing
Buffer pool sizes are determined by the VPSIZE parameter. This parameter determines the
number of pages to be used for the virtual pool. DB2 can handle large bufferpools efficiently, as
long as enough real memory is available. If insufficient real storage exists to back the bufferpool
storage requested, then paging can occur. Paging can occur when the bufferpool size exceeds
the available real memory on the z/OS image. DB2 limits the total amount of storage allocated
for bufferpools to approximately twice the amount of real storage (but less is recommended).
There is a maximum of 1TB total for all bufferpools (provided the real storage is available).
In order to size bufferpools it is helpful to know the residency rate of the pages for the object(s)
in the bufferpool.
One tuning option often used is altering VPSEQT to 0 to set the pool up for just random
use. When VPSEQT is altered to 0, the SLRU is no longer valid and the buffer pool becomes
totally random. Since only the LRU is used, all pages on the SLRU have to be freed.
This also disables prefetch operations in this buffer pool, which is beneficial for certain
strategies. However, there are problems with this strategy for certain buffer pools, and this will
be addressed later.
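As a simple illustration (the buffer pool number is arbitrary), the sequential steal threshold can be changed dynamically with a command such as:
-ALTER BUFFERPOOL(BP8) VPSEQT(0)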
Writes
The DWQT (Deferred Write Threshold), also known as the Horizontal Deferred Write
Threshold, is the percentage threshold that determines when DB2 starts turning on write
engines to begin deferred writes (32 pages/Async I/O). The value can be from 0 to 90%.
When the threshold is reached, write engines (up to 600 write engines as of this publication)
begin writing pages out to disk. Running out of write engines can occur if the write thresholds
are not set to keep a constant flow of updated pages being written to disk. This can occur and
if it is uncommon then it is okay, but if this occurs daily then there is a tuning opportunity. DB2
turns on these write engines, basically one vertical pageset, queue at a time, until a 10%
reverse threshold is met. When DB2 runs out of write engines it can be detected in the
statistics reports in the WRITE ENGINES NOT AVAILABLE indicator on Statistics report.
When setting the DWQT threshold a high value is useful to help improve hit ratio for updated
pages, but will increase I/O time when deferred write engines begin. We would use a low value
to reduce I/O length for deferred write engines, but this will increase the number of deferred
writes. This threshold should be set based on the referencing of the data by the applications.
If we choose to set the DWQT to zero so that all objects defined to the buffer pool are
scheduled to be written immediately to disk, then DB2 actually uses its own internal
calculations for exactly how many changed pages can exist in the buffer pool before they are
written to disk.
32 pages are still written per I/O, but it will take 40 dirty pages (updated pages) to trigger the
threshold so that the highly re-referenced updated pages, such as space map pages, remain in
the buffer pool.
When implementing LOBs (Large Objects), a separate buffer pool should be used and this
buffer pool should not be shared (backed by a group buffer pool in a data sharing environment).
The DWQT should be set to 0 so that for LOBS with LOG NO, force-at-commit processing
occurs and the updates continually flow to disk instead of surges of writes. For LOBs defined
with LOG YES, DB2 could use deferred writes and avoid massive surges at checkpoint.
The DWQT threshold works at a buffer pool level for controlling writes of pages to the buffer
pools, but for a more efficient write process you will want to control writes at the pageset/
partition level. This can be controlled via the VDWQT (Vertical Deferred Write Threshold), the
percentage threshold that determines when DB2 starts turning on write engines and begins
deferred writes for a given data set. This helps to keep a particular pageset/partition from
monopolizing the entire buffer pool with its updated pages. The value is 0 to 90%, with a
default of 10%. The VDWQT should always be less than the DWQT.
A good rule of thumb for setting the VDWQT is that if less than 10 pages are written per I/O,
set it to 0. You may also want to set it to 0 to trickle write the data out to disk. It is normally
best to keep this value low in order to prevent heavily updated pagesets from dominating the
deferred write area. Either a percentage of pages or an actual number of pages,
from 0 to 9999, can be specified for the VDWQT. You must set the percentage to 0 to use the
number specified. If you set it to 0,0, the system uses MIN(32, 1%), which is good for trickle I/O.
If we choose to set the VDWQT to zero, 32 pages are still written per I/O, but it will take 40
dirty pages (updated pages) to trigger the threshold so that the highly re-referenced updated
pages, such as space map pages, remain in the buffer pool.
It is a good idea to set the VDWQT using a number rather than a percentage because if
someone increases the buffer pool that means that now more pages for a particular pageset
can occupy the buffer pool and this may not always be optimal or what you want.
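For example (the buffer pool number and values are illustrative only; the right values depend on your workload), both write thresholds can be altered dynamically:
-ALTER BUFFERPOOL(BP2) DWQT(30) VDWQT(0,64)
Here the DWQT is lowered to 30% and the VDWQT is set to an absolute count of 64 pages per pageset/partition rather than a percentage.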
When looking at any performance report showing the amount of activity for the VDWQT
and the DWQT, you would want to see the VDWQT being triggered most of the time
(VERTIC.DEFER.WRITE THRESHOLD), and the DWQT triggered far less often (HORIZ.DEFER.WRITE
THRESHOLD). There can be no general ratios, since that would depend on both the activity
and the number of objects in the buffer pools. The bottom line is that we want to be
controlling I/O via the VDWQT, with the DWQT watching for and controlling activity across the
entire pool and, in general, writing out rapidly queuing pages. This will also assist in limiting
the amount of I/O that checkpoint would have to perform.
Parallelism
The VPPSEQT (Virtual Pool Parallel Sequential Threshold) is the percentage of the VPSEQT setting
that can be used for parallel operations. The value is 0 to 100%, with a default of 50%.
If this is set to 0 then parallelism is disabled for objects in that particular buffer pool. This can
be useful for buffer pools that cannot support parallel operations. The VPXPSEQT (Virtual
Pool Sysplex Parallel Sequential Threshold) is a percentage of the VPPSEQT to use for inbound
queries. It also defaults to 50%, and if it is set to 0, Sysplex query parallelism is disabled for queries
originating from the member to which the pool is allocated. In affinity data sharing environments
this is normally set to 0 to prevent inbound resource consumption of work files and buffer pools.
Stealing Method
The PGSTEAL option allows us to choose a page stealing method for a buffer pool. The default
is LRU (Least Recently Used), but FIFO (First In, First Out) is also an option. The FIFO option turns
off the overhead for maintaining the LRU queue and may be useful for objects that can completely
fit in the buffer pool or if the hit ratio is less than 1%.
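For example (the buffer pool number is illustrative), a pool dedicated to fully resident objects could be altered as follows:
-ALTER BUFFERPOOL(BP9) PGSTEAL(FIFO)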
Page Fixing
You can use the PGFIX keyword with the ALTER BUFFERPOOL command to fix a buffer pool in
real storage for an extended period of time. The PGFIX keyword has the following options:
• PGFIX(YES) The buffer pool is fixed in real storage for the long term. Page buffers are fixed
when they are first used and remain fixed.
• PGFIX(NO) The buffer pool is not fixed in real storage for the long term. Page buffers are
fixed and unfixed in real storage, allowing for paging to disk. PGFIX(NO) is the default option.
The recommendation is to use PGFIX(YES) for buffer pools with a high I/O rate, that is, a high
number of pages read or written. For buffer pools with zero I/O, such as some read-only data
or some indexes with a nearly 100% hit ratio, PGFIX(YES) is not recommended. In these cases,
PGFIX(YES) does not provide a performance advantage.
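For example (the buffer pool number is illustrative), a high-I/O pool could be page fixed with:
-ALTER BUFFERPOOL(BP1) PGFIX(YES)
Keep in mind that a PGFIX change takes effect the next time the buffer pool is allocated, not immediately.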
Internal Thresholds
The following thresholds are a percent of unavailable pages to total pages, where unavailable
means either updated or in use by a process.
SPTH
The SPTH (Sequential Prefetch Threshold) is checked before a prefetch operation is scheduled
and during buffer allocation for a previously scheduled prefetch. If the SPTH threshold is
exceeded, prefetch will either not be scheduled or will be canceled. PREFETCH DISABLED —
NO BUFFER (an indicator on the statistics report) is incremented every time a virtual buffer
pool reaches 90% of active unavailable pages, disabling sequential prefetch. This value should
always be zero. If this value is not 0, then it is a clear indication that you are probably
experiencing degradation in performance due to all prefetch being disabled. To eliminate this
you may want to increase the size of the buffer pool (VPSIZE). Another option may be to have
more frequent commits in the application programs to free pages in the buffer pool, as this will
put the pages on the write queues.
DMTH
The DMTH (Data Manager Threshold, also referred to as the Buffer Critical Threshold) occurs
when 95% of all buffer pages are unavailable (in use). The Buffer Manager will request all
threads to release any possible pages immediately. This occurs by setting GETPAGE/
RELPAGE processing to work by row instead of by page. After a GETPAGE and a single row is
processed, a RELPAGE is issued. This causes CPU to become high for objects in that buffer pool,
and I/O sensitive transactions can suffer. This can occur if the buffer pool is too small. You can
observe when this occurs by seeing a non-zero value in the DM THRESHOLD REACHED
indicator on the statistics reports. This is checked every time a page is read or updated. If this
threshold is not reached, then DB2 will access the page in the virtual pool once for each page
(no matter how many rows are used). If this threshold has been reached, then DB2 will access
the page in the virtual pool once for every row on the page that is retrieved or updated. This can
lead to serious performance degradation.
IWTH
The IWTH (Immediate Write Threshold) is reached when 97.5% of buffers are unavailable
(in use). If this threshold is reached then synchronous writes begin, and this presents a
performance problem. For example, if there are 100 rows in a page and there are 100 updates,
then 100 synchronous writes will occur, one by one for each row. Synchronous writes are not
concurrent with SQL, but serial, so the application will be waiting while the write occurs
(including 100 log writes, which must occur first). This causes large increases in I/O time. It is
not recorded explicitly in a statistics report, but DB2 will appear to be hung and you will see
synchronous writes begin to occur when this threshold is reached. Be careful with monitors
that send exception messages to the console when synchronous writes occur and refer to them
as IWTH reached; not all synchronous writes are caused by this threshold being reached, so
in those cases the condition is simply being reported incorrectly. See the following note.
Note: Be aware that in some performance reports the IWTH counter can also be
incremented when dirty pages on the write queue have been re-referenced, which
causes a synchronous I/O before the page can be used by the new process. This threshold
counter can also be incremented if more than two checkpoints occur before an updated page
is written, since this will cause a synchronous I/O to write out the page.
The output of the DISPLAY BUFFERPOOL command with the DETAIL option contains valuable
information such as prefetch information (Sequential, List, Dynamic Requests), Pages Read,
Prefetch I/O, and Disablement (No buffer, No engine). The incremental detail display shifts the
time frame every time a new display is performed.
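For example, to see the incremental detail statistics for buffer pool BP0 since the last display:
-DISPLAY BUFFERPOOL(BP0) DETAIL(INTERVAL)
Issuing this command on a regular schedule gives a useful sample of buffer pool behavior over time.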
RID Pool
The RID (Row Identifier) pool is used for storing and sorting RIDs for operations such as:
• List Prefetch
• Multiple Index Access
• Hybrid Joins
• Enforcing unique keys while updating multiple rows
The RID pool is used by the optimizer when it considers list prefetch and other RID processing. The
full RID pool can be used by any single user at run time, and if not enough space is available in the
RID pool at run time the result can be a table space scan. For example, if you want to retrieve
10,000 rows from a 100,000,000 row table and no RID pool is available, then a scan of
100,000,000 rows would occur, at any time and without external notification. The optimizer
assumes physical I/O will be less with a large pool.
Sizing
The default size of the RID pool is currently 8 MB, with a maximum size of 10,000 MB, and it is
controlled by the MAXRBLK installation parameter. The RID pool could be set to 0, which
would disable the types of operations that use the RID pool, and DB2 would not choose access
paths that the RID pool supports. The RID pool is created at startup time, but no space is
allocated until RID storage is actually needed. It is then allocated in 32KB blocks as needed,
until the maximum size you specified on installation panel DSNTIPC is reached. There are a
few guidelines for setting the RID pool size. You should have as large a RID pool as required,
because it benefits processing, and a RID pool that is too small can lead to performance
degradation. A commonly used guideline for sizing the RID pool is the number of concurrent
RID processing activities, times the average number of RIDs, times 2, times 5 bytes per RID.
Statistics to Monitor
There are three statistics to monitor for RID pool problems:
RIDS OVER THE RDS LIMIT This is the number of times list prefetch is turned off because the
RID list built for a single set of index entries is greater than 25% of the number of rows in the table.
If this is the case, DB2 determines that instead of using list prefetch to satisfy a query it would
be more efficient to perform a table space scan, which may or may not be good depending on
the size of the table accessed. Increasing the size of the RID pool will not help in this case. This
is an application issue for access paths and needs to be evaluated for queries using list prefetch.
There is one very critical issue regarding this type of failure. The 25% threshold is actually
stored in the package/plan at bind time; therefore, it may no longer match the real 25% value,
and in fact could be far less. It is important to know which packages/plans are using list prefetch,
and on what tables. If the underlying tables are growing, then the packages/plans that are
dependent on them should be rebound after a RUNSTATS utility has updated the statistics.
Key correlation statistics and better information about skewed distribution of data can also
help to gather better statistics for access path selection and may help avoid this problem.
RIDS OVER THE DM LIMIT This occurs when over 28 million RIDS were required to satisfy a
query. Currently there is a 28 million RID limit in DB2. The consequences of hitting this limit
can be fallback to a table space scan. In order to control this, you have a couple of options:
• Fix the index by doing something creative
• Add an additional index better suited for filtering
• Force list prefetch off and use another index
• Rewrite the query
• Maybe it just requires a table space scan
INSUFFICIENT POOL SIZE This indicates that the RID pool is too small.
Sort Pool
DB2 allocates at startup a sort pool in the private area of the DBM1 address space. DB2 uses
a special sorting technique called a tournament sort. During the sorting processes it is not
uncommon for this algorithm to produce logical work files called runs, which are intermediate
sets of ordered data. If the sort pool is large enough then the sort completes in that area.
More often than not the sort cannot complete in the sort pool and the runs are moved into the
work file database, especially if there are many rows to sort. These runs are later merged to
complete the sort. When work file database is used for holding the pages that make up the
sort runs, you could experience performance degradation if the pages get externalized to the
physical work files since they will have to be read back in later in order to complete the sort.
Size
The sort pool size defaults to 2MB unless specified. It can range in size from 240KB to 128MB
and is set with an installation DSNZPARM. The larger the Sort Pool (Sort Work Area) is, the
fewer sort runs are produced. If the sort pool is large enough, then the buffer pools and sort
work files may not be used. If the buffer pools and work file database are not used, performance
will be better due to less I/O. We want to size the sort pool and work file database large
because we do not want sorts to have pages being written to disk.
EDM Pool
Sizing
If the pool is too small, then you will see increased I/O activity in the following DB2 table
spaces, which support the DB2 directory:
DSNDB01.DBD01
DSNDB01.SPT01
DSNDB01.SCT02
Our main goal for the EDM pool is to limit the I/O against the directory and catalog. If the pool
is too small, then you will also see increased response times due to the loading of the SKCTs,
SKPTs, and DBDs, and due to re-preparing dynamic SQL statements because they could not
remain cached. By correctly sizing the EDM pools you can avoid unnecessary I/Os from
accumulating for a transaction. If a SKCT, SKPT or DBD has to be reloaded into the EDM pool,
this is additional I/O. This can happen if the pool pages are stolen because the EDM pool is too
small. Pages in the pool are maintained on an LRU queue, and the least recently used pages
get stolen if required. A DB2 performance monitor statistics report can be used to track the
statistics concerning the use of the EDM pools.
Efficiency
We can measure the following ratios to help us determine if our EDM pool is efficient. Think of
these as EDM pool hit ratios:
• CT requests versus CTs not in EDM pool
• PT requests versus PTs not in EDM pool
• DBD requests versus DBDs not in EDM pool
What you want is a ratio of at least 5 to 1 for each of the above (only 1 request out of 5 results in
a load); in other words, you are aiming for an 80% hit ratio or better.
In addition to the global dynamic statement caching in a subsystem, an application can also cache
statements at the thread level via the KEEPDYNAMIC(YES) bind parameter in combination
with not re-preparing the statements. In these situations the statements are cached at the
thread level in thread storage as well as at the global level. As long as there is not a shortage
of virtual storage the local application thread level cache is the most efficient storage for the
prepared statements. The MAXKEEPD subsystem parameter can be used to limit the amount
of thread storage consumed by applications caching dynamic SQL at the thread level.
Logging
Every system has some component that will eventually become the final bottleneck. Logging is
not to be overlooked when trying to get transactions through the systems in a high performance
environment. Logging can be tuned and refined, but the synchronous I/O associated with
logging and commits will always be there.
Log Reads
When DB2 needs to read from the log it is important that the reads perform well because
reads are normally performed during recovery, restarts and rollbacks — processes that you do
not want taking forever. An input buffer will have to be dedicated for every process requesting
a log read. DB2 will first look for the record in the log output buffer. If it finds the record there it
can apply it directly from the output buffer. If it is not in the output buffer then DB2 will look for
it in the active log data set and then the archive log data set. When it is found the record is
moved to the input buffer so it can be read by the requesting process. You can monitor the
success of reads from the output buffers and active logs in the statistics report. These reads
are the better performers. If the record has to be read in from the archive log, the processing
time will be extended. For this reason it is important to have large output buffers and active logs.
Log Writes
Applications move log records to the log output buffer using two methods — no wait or force.
The ‘no wait’ method moves the log record to the output buffer and returns control to the
application, however if there are no output buffers available the application will wait. If this
happens it can be observed in the statistics report when the UNAVAILABLE ACTIVE LOG BUFF
has a non-zero value. This means that DB2 had to wait to externalize log records due to the
fact that there were no available output log buffers. Successful moves without a wait are
recorded in the statistics report under NO WAIT requests.
A force occurs at commit time and the application will wait during this process, which is
considered a synchronous write.
Log records are then written from the output buffers to the active log datasets on disk either
synchronously or asynchronously. To know how often this happens you can look at the WRITE
OUTPUT LOG BUFFERS in the statistics report.
In order to improve the performance of the log writes there are a few options. First we can
increase the number of output buffers available for writing active log datasets, which is
performed by changing an installation parameter (OUTBUFF). You would want to increase this
if you are seeing that there are unavailable buffers. Providing a large buffer will improve
performance for log reads and writes.
Locking and Contention
SUSPENSION An application process is suspended when it requests a lock that is already held
by another application process and cannot be shared. The suspended process temporarily
stops running.
TIME OUT An application process is said to time out when it is terminated because it has been
suspended for longer than a preset interval. DB2 terminates the process and returns a -911 or
-913 SQL code.
DEADLOCK A deadlock occurs when two or more application processes hold locks on
resources that the others need and without which they cannot proceed.
The way DB2 issues locks is complex. It depends on the type of processing being done, the
LOCKSIZE parameter specified when the table was created, the isolation level of the plan or
package being executed, and the method of data access.
Thread Management
The thread management parameters on DB2 install panel DSNTIPE control how many threads
can be connected to DB2, and determine the main storage size needed. Improper allocation of
the parameters on this panel directly affects main storage usage. If the allocation is too high,
storage is wasted. If the allocation is too low, performance degradation occurs because users
are waiting for available threads.
You can use a DB2 performance trace record with IFCID 0073 to retrieve information about
how often thread create requests wait for available threads. Starting a performance trace can
involve substantial overhead, so be sure to qualify the trace with specific IFCIDS, and other
qualifiers, to limit the data collected.
The MAX USERS field on DSNTIPE specifies the maximum number of allied threads that can
be allocated concurrently. These threads include TSO users, batch jobs, IMS, CICS, and tasks
using the call attachment facility. The maximum number of threads that can be accessing
data concurrently is the sum of this value and the MAX REMOTE ACTIVE specification.
When the number of users trying to access DB2 exceeds your maximum, plan allocation
requests are queued.
Catalog and Directory
You can reorganize the catalog and directory. You should periodically run RUNSTATS on the
catalog and analyze the appropriate statistics that let you know when you need to reorganize.
Also, consider isolating the DB2 catalog and directory into their own buffer pool.
CHAPTER 9
Understanding and Tuning Your Packaged Applications
Many organizations today do not retain the manpower resources to build and maintain their
own custom applications. In many situations, therefore, these organizations are relying on “off
the shelf” packaged software solutions to help run their applications and automate some of
their business activities such as accounting, billing, customer management, inventory
management, and human resources. Most of the time these packaged applications are referred
to as enterprise resource planning (ERP) applications. This chapter will offer some tips as to
how to manage and tune your DB2 database for a higher level of performance with these
ERP applications.
These packaged applications are also typically written using an object oriented design, and
are created in such a way that they can be customized by the organization implementing the
software. This provides a great flexibility in that many of the tables in the ERP applications’
database can be used in a variety of different ways, and for a variety of purposes. With this
great flexibility also comes the potential for reduced database performance. OO design may
lead to an increase in SQL statements issued, and the flexibility of the implementation could
have an impact on table access sequence, random data access, and mismatching of predicates
to indexes.
Finally, when you purchase an application you very typically have no access to the source code,
or the SQL statements issued. Sometimes you can get a vendor to change a SQL statement
based upon information you provided about a performance problem, but this is typically not
the case. So, with that in mind the majority of your focus should be on first getting the
database organized in a manner that best accommodates the SQL statements, and then tune
the subsystem to provide the highest level of throughput. Of course, even though you can’t
change the SQL statements you should first be looking at those statements for potential
performance problems that can be improved by changing the database or subsystem.
The best way to find the programs and SQL statements that have the potential for elapsed and
CPU time savings is to utilize the technique described in Chapter 6 called “Overall Application
Performance Monitoring”. This would be the technique for packaged applications that are using
static embedded SQL statements. However, if your packaged application is utilizing dynamic
SQL then this technique will be less effective. In that case you are better off utilizing one or more
of the techniques outlined in Chapter 3 “Recommendations for Distributed Dynamic SQL”.
Whether the application is using static or dynamic SQL, it is best to capture all of the SQL
statements being issued and catalog them in some sort of document, or perhaps even in a
DB2 table. This can be achieved by querying the SYSIBM.SYSSTMT table by plan for statements in
a plan, or the SYSIBM.SYSPACKSTMT table by package for statements in a package. You can
query these tables using the plan or package names corresponding to the application. If the
application is utilizing dynamic SQL then you can capture the SQL statements by utilizing
EXPLAIN STMTCACHE ALL (DB2 V8 or DB2 9), or by running a trace (DB2 V7, DB2 V8, DB2
9). Of course if you run a trace you can expect to have to parse through a significant amount
of data. Keep in mind that by capturing these dynamic statements you are only seeing the
statements that have been executed during the time you have monitored. For example, the
EXPLAIN STMTCACHE ALL statement will only capture the statements that are residing in
the dynamic statement cache at the time the statement is executed.
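As a rough sketch (the collection ID is assumed, the exact SYSPACKSTMT columns available vary by DB2 version, and statement text may be split across multiple SEQNO rows), a query to pull the statements for an application's packages might look like this:
SELECT COLLID, NAME, STMTNO, SEQNO, STMT
FROM SYSIBM.SYSPACKSTMT
WHERE COLLID = 'ERP_COLLECTION'
ORDER BY NAME, STMTNO, SEQNO;
The result can be loaded into a DB2 table or a document to build the statement inventory described above.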
Once you’ve determined the packages and/or statements that are consuming the most
resources, or have the highest elapsed time, then you can begin your tuning effort. This will
involve subsystem and/or database tuning. You need to make a decision at this point as
subsystem tuning really has no impact on the application and database design. However, if you
make changes to the database, then you need to consider the fact that future product upgrades
or releases can undo the changes you have done.
• Memory and DSMAX These packaged applications typically have a lot of objects. You can
monitor to see which objects are most commonly accessed, and make those objects CLOSE
NO. Make sure your DSMAX installation parameter is large enough to avoid any closing of
datasets, and if you have enough storage make all of the objects CLOSE NO. You can also
make sure your output log buffer is large enough to avoid shortages if the application is
changing data quite often. You can consider using the KEEPDYNAMIC bind option for the
packages, but keep in mind that this can increase thread storage, and you’ll need to balance
the number of threads with the amount of virtual storage consumed by the DBM1 address
space. Make sure your buffer pools are page fixed.
Please refer to Chapter 8 of this guide for more subsystem tuning options. You can also refer to
the IBM redbook entitled “DB2 UDB for z/OS V8: Through the Looking Glass and What SAP
Found There” for additional tips on performance.
Of course you need to always be aware of the fact that any change you make to the database
can be undone on the next release of the software. Therefore, you should document every
change completely!
STATISTICS COLLECTION DB2 is always collecting statistics for database objects. The statistics
are kept in virtual storage and are calculated and updated asynchronously upon externalization.
In order to externalize them the environment must be properly set up. A new set of DB2
objects must be created in order to allow for DB2 to write out the statistics.
There are two tables (with appropriate indexes) that must be created to hold the statistics:
• SYSIBM.TABLESPACESTATS
• SYSIBM.INDEXSPACESTATS
These tables are kept in a database named DSNRTSDB, which must be started in order to
externalize the statistics that are being held in virtual storage. DB2 will then populate the tables
with one row per table space or index space, or one row per partition. For tables that are shared
in a data-sharing environment, each member will write its own statistics to the RTS tables.
Some of the important statistics that are collected for table spaces include: total number of
rows, number of active pages, and time of last COPY, REORG, or RUNSTATS execution. Some
statistics that may help determine when a REORG is needed include: space allocated, extents,
number of inserts, updates, or deletes (singleton or mass) since the last REORG or LOAD
REPLACE, number of unclustered inserts, number of disorganized LOBs, number of overflow
records created since last REORG. There are also statistics to help for determining when
RUNSTATS should be executed. These include: number of inserts/updates/deletes (singleton
and mass) since the last RUNSTATS execution. Statistics collected to help with COPY
determination include: distinct updated pages and changes since the last COPY execution
and the RBA/LRSN of first update since last COPY.
There are also statistics gathered on indexes. Basic index statistics include: total number of
entries (unique or duplicate), number of levels, number of active pages, space allocated and
extents. Statistics that help to determine when a REORG is needed include: time when the last
REBUILD, REORG or LOAD REPLACE occurred. There are also statistics regarding the number
of updates/deletes (real or pseudo, singleton or mass)/inserts (random and those that were
after the highest key) since the last REORG or REBUILD. These statistics are of course very
helpful for determining how our data physically looks after certain processes (e.g., batch inserts)
have occurred so we can take appropriate actions if necessary.
EXTERNALIZING AND USING REAL TIME STATISTICS There are different events that can trigger
the externalization of the statistics. The DSNZPARM STATSINT (default 30 minutes) is used to
control the externalization of the statistics at a subsystem level.
There are several processes that will have an effect on the real time statistics. Those processes
include: SQL, Utilities and the dropping/creating of objects.
Once externalized, queries can then be written against the tables. For example, a query against
the TABLESPACESTATS table can be written to identify when a table space needs to be copied
due to the fact that greater than 30 percent of the pages have changed since the last image
copy was taken.
SELECT NAME
FROM SYSIBM.SYSTABLESPACESTATS
WHERE ((COPYUPDATEDPAGES*100)/NACTIVE) > 30
This table can be used to compare the last RUNSTATS timestamp to the timestamp of the last
REORG on the same object to determine when RUNSTATS is needed. If the date of the last
REORG is more recent than the last RUNSTATS, then it may be time to execute RUNSTATS.
SELECT NAME
FROM SYSIBM.SYSTABLESPACESTATS
WHERE (JULIAN_DAY(REORGLASTTIME) > JULIAN_DAY(STATSLASTTIME))
This last example may be useful if you want to monitor the number of records that were
inserted since the last REORG or LOAD REPLACE that are not well-clustered with respect to
the clustering index. Ideally, ‘well-clustered’ means the record was inserted into a page that
was within 16 pages of the ideal candidate page (determined by the clustering index). The
SYSTABLESPACESTATS table value REORGUNCLUSTINS can be used to determine whether
you need to run REORG after a series of inserts.
SELECT NAME
FROM SYSIBM.SYSTABLESPACESTATS
WHERE ((REORGUNCLUSTINS*100)/TOTALROWS) > 10
There is also a DB2 supplied stored procedure to help with this process, and possibly even
work toward automating the whole determination/utility execution process. This stored
procedure, DSNACCOR, is a sample procedure which will query the RTS tables and determine
which objects need to be reorganized, image copied, updated with current statistics, have
taken too many extents, and those which may be in a restricted status. DSNACCOR creates
and uses its own declared temporary tables and must run in a WLM address space. The output
of the stored procedure provides recommendations by using a predetermined set of criteria
in formulas that use the RTS and user input for their calculations. DSNACCOR can make
recommendations for everything (COPY, REORG, RUNSTATS, EXTENTS, RESTRICT) or for
one or more of your choice and for specific object types (table spaces and/or indexes).
Tuning Tips
Back against a wall? Not sure why DB2 is behaving in a certain way, or do you need an answer
to a performance problem? It is always good to have a few ideas or tricks up your sleeve. Here
are some DB2 hints and tips, in no particular order.
We often see DISTINCT coded when it is not necessary and sometimes it comes from a lack of
understanding of the data and/or the process. Sometimes code generators create a DISTINCT
after every SELECT clause, no matter where the SELECT is in a statement. Also we have seen
some programmers coding DISTINCTs just as a safeguard. When considering the usage of
DISTINCT, the question to first ask is ‘Are duplicates even possible?’ If the answer is no, then
remove the DISTINCT and avoid the potential sort.
If duplicates are possible and not desired, then use DISTINCT wisely. Try to use a unique index
to help avoid a sort. Consider using DISTINCT as early as possible in complex queries in order
to get rid of the duplicates as early as possible. Also, avoid using DISTINCT more than once in
a query. A GROUP BY on all columns can utilize non-unique indexes and possibly avoid a sort.
Coding a GROUP BY on all columns can be more efficient than using DISTINCT, but it should be
carefully documented in your program and/or statement, as in the sketch below.
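As a minimal sketch (the table and column names are assumed), the two forms look like this:
SELECT DISTINCT LAST_NAME, FIRST_NAME
FROM PERSON_TBL;
-- potentially cheaper when a non-unique index on (LAST_NAME, FIRST_NAME) exists
SELECT LAST_NAME, FIRST_NAME
FROM PERSON_TBL
GROUP BY LAST_NAME, FIRST_NAME;
Both return one row per distinct name combination; the GROUP BY form simply gives DB2 another way to avoid the sort.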
Using Read Stability can lead to other known problems, such as the inability to get lock
avoidance, and potential share lock escalations because DB2 may escalate from an S lock on
the page to an S lock on the tablespace. Another problem that has been experienced is that
searched updates can deadlock under isolation RS. When an isolation level of RS is used, the
search operation for a searched update will read the page with an S lock. When it finds data to
update then it will change the S lock to an X lock. This could be exacerbated by a self-referencing
correlated subquery in the update (e.g. updating the most recent history or audit row). This is
because in that situation DB2 will read the data with an S lock, put the results in a workfile with
the RIDs, and then go back to do the update in RID sequence. As the transaction volume
increases these deadlocks are more likely to occur. These problems have been experienced
with applications that are running under Websphere and allowing it to determine the isolation
level, which defaults to RS.
So how do you control this situation? Well some possible solutions include the following.
1. Change the application server connect to use an isolation level of CS rather than RS. This
can be done by setting a JDBC database connection property. This is the best option, but
often the most difficult to get implemented.
2. Rebind the isolation RS package being used, with an isolation level of CS. This is a dirty
solution because there may be applications that require RS, or when the DB2 client
software is upgraded it may result in a new RS package (or a rebind of the package), and
we'll have to track that and perform a special rebind with every upgrade on every client.
3. Have the application change the statement to add WITH CS.
4. Set the RRULOCK=YES subsystem parameter (zparm), which acquires U locks, rather than S locks, for updates and deletes with ISO(RS) or ISO(RR). This could help to avoid deadlocks for some update statements.
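For option 3, a minimal sketch of a searched update with the isolation clause added, assuming (as the tip implies) that the statement accepts an isolation clause; the table, column, and host variable names are hypothetical:

   UPDATE ACCT_HIST
      SET HIST_STATUS = 'P'        -- hypothetical column and value
    WHERE ACCT_ID = :hv_acct_id
    WITH CS;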
Through researching this issue we have found no concrete reason why this is the default in
Websphere, but for DB2 applications this is wasteful and should be changed.
In V7, this was the case if the tablespace was defined with LOCKPART(YES), and the V8 test results were the same because LOCKPART(YES) is now the default for the tablespace. Once an update was issued, the table was no longer readable. This is working as designed, and the reason for this tip is to make sure that applications are aware that just issuing this statement does NOT immediately make a table unreadable.
Recursive SQL is another handy trick: a recursive common table expression can generate rows on the fly. The expression, here named GET_THOUSAND, starts from the single row in SYSIBM.SYSDUMMY1 and repeatedly selects C+1 from itself to build a list of numbers that is then read with a final SELECT FROM GET_THOUSAND.
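A minimal sketch of such an expression; the stopping condition and the final SELECT are reconstructed here (the limit of 1,000 is assumed from the name) and may differ from the original example:

   WITH GET_THOUSAND (C) AS
     (SELECT 1
        FROM SYSIBM.SYSDUMMY1
      UNION ALL
      SELECT C + 1
        FROM GET_THOUSAND
       WHERE C < 1000)          -- assumed stopping condition
   SELECT C
     FROM GET_THOUSAND;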
Another tip concerns joining to views that contain aggregation. A query that joins the ACCOUNT table (correlation name A) to HIST_VIEW (correlation name B), a view built over the ACCT_HIST table, on A.ACCT_ID = B.ACCT_ID can prevent DB2 from using the index on the ACCT_ID column of ACCT_HIST. Consider instead a query that accesses ACCT_HIST directly (correlation name X) with a predicate correlated on A.ACCT_ID, which will strongly encourage DB2 to use the index on the ACCT_ID column of the ACCT_HIST table.
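A hedged sketch of the two forms; the view definition, the MAX aggregation, and the column names are illustrative reconstructions, not the original statements:

   -- Joining to a view that aggregates ACCT_HIST by ACCT_ID
   SELECT A.ACCT_ID, B.LAST_HIST_DATE
     FROM ACCOUNT A
     JOIN HIST_VIEW B
       ON A.ACCT_ID = B.ACCT_ID;

   -- A correlated nested table expression against ACCT_HIST, which
   -- encourages DB2 to use the ACCT_ID index for each qualifying row
   SELECT A.ACCT_ID, B.LAST_HIST_DATE
     FROM ACCOUNT A
     JOIN TABLE (SELECT X.ACCT_ID, MAX(X.HIST_DATE) AS LAST_HIST_DATE
                   FROM ACCT_HIST X
                  WHERE X.ACCT_ID = A.ACCT_ID
                  GROUP BY X.ACCT_ID) AS B
       ON A.ACCT_ID = B.ACCT_ID;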
This recommendation is not limited to aggregate queries in views, but can apply to many
situations. Give it a try!
SECTION 1  About This Supplement
SECTION 2  Ensure Efficient Application SQL for DB2 for z/OS with CA Database Management
           CA SQL-Ease® for DB2 for z/OS
           CA Plan Analyzer® for DB2 for z/OS
SECTION 3  Ensure Efficient Object Access for DB2 for z/OS with CA Database Management
           CA Database Analyzer™ for DB2 for z/OS
           CA Rapid Reorg® for DB2 for z/OS
SECTION 4  Monitor Applications for DB2 for z/OS with CA Database Management
           CA Insight™ Database Performance Monitor for DB2 for z/OS
           CA Detector® for DB2 for z/OS
           CA Subsystem Analyzer for DB2 for z/OS
           CA Thread Terminator Tool
SECTION 5  Further Improve Application Performance for DB2 for z/OS with CA Database Management
           CA Index Expert™ for DB2 for z/OS
SECTION 1
About This Supplement
This supplement to the CA Performance Management Handbook for DB2 for z/OS provides
specific information on how CA Database Management for DB2 for z/OS addresses the
performance management challenges outlined in the handbook with an approach that reduces
CPU and DB2 resource overhead and streamlines labor-intensive tasks with automated
processes. The supplement provides information to ensure efficient application SQL, ensure
efficient object access, monitor applications, and improve application performance.
The primary audiences for the handbook are physical and logical database administrators.
The handbook assumes a good working knowledge of DB2 and SQL, and is designed to help
you build good performance into the application, database, and the DB2 subsystem. It provides
techniques to help you monitor DB2 for performance, and to identify and tune production
performance problems.
SECTION 2
Ensure Efficient Application SQL for DB2 for z/OS with CA Database Management
When developing DB2 applications, it is critical that the SQL statements within the application are as efficient as possible to ensure good database and application performance. Inefficient SQL can lead to poor application response times, high resource use, and high CPU costs within the DB2 system in which it runs.
It is much more cost-effective to develop efficient SQL from the outset, during development, than to identify and fix inefficient SQL running in a live application.
Sometimes, the original design of the SQL is not entirely under your control. You may be
implementing Enterprise Resource Planning (ERP) applications using dynamic SQL and
packaged solutions on your systems. It is still important to ensure the SQL within these
applications is efficient. Tuning SQL in ERP applications can yield substantial performance
gains and cost savings.
Coding efficient SQL is not always easy since there are often a number of ways to obtain the same result set. SQL should be written to filter data effectively while returning the minimum number of rows and columns to the application. Data filtering should be as efficient as possible using correctly coded and ordered predicates. Some SQL functions such as sorting should be avoided if possible. If sorting is necessary, you should ensure your database design supports
indexes to facilitate the sort ordering.
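As a simple illustration (hypothetical table and column names), a predicate that wraps the column in a function is evaluated as stage 2 and cannot use an index, while an equivalent range predicate is indexable:

   -- Stage 2, non-indexable: the column is buried inside a function
   SELECT ACCT_ID
     FROM ACCT_HIST
    WHERE YEAR(HIST_DATE) = 2007;

   -- Stage 1 and indexable: the column stands alone on one side of the predicate
   SELECT ACCT_ID
     FROM ACCT_HIST
    WHERE HIST_DATE BETWEEN '2007-01-01' AND '2007-12-31';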
Developing efficient SQL initially, and subsequently tuning the existing SQL in your application, is very hard to do manually. CA provides two products to assist you with SQL development and
tuning, namely CA SQL-Ease® for DB2 for z/OS and CA Plan Analyzer® for DB2 for z/OS.
CA SQL-Ease® for DB2 for z/OS
The programmer can also use the NOTES function as an online SQL reference manual.
Once the programmer has finished formatting the SQL statement, the SYNTAX function can be used to check that everything is syntactically correct. The STAND function can be used to
convert the SQL statement into a standard format for easy viewing, understanding and
documentation purposes.
The PRED function allows the programmer to check how efficient the predicates coded for
the SQL statement are. It shows whether DB2 can use an index to evaluate the predicate and
whether it will be evaluated as stage 1 or stage 2.
The EXPLAIN function can be used to determine the access path that will be used by DB2 for
the SQL statement. It shows how the data will be accessed, whether by tablespace scan or
index access, whether prefetch is used, and the level of DB2 locking. Estimated costs are shown in ms, service units and TIMERONS.
CA SQL-Ease also provides an Enhanced Explain function offering enhanced access path
information and uses an expert system to provide recommendations to show how the
efficiency of the SQL could be improved. The expert system provides recommendations in three categories: SQL coding guidelines to improve the efficiency of the SQL statement, predicate coding guidelines offering efficiency improvements for predicate filters, and physical object guidelines offering changes to the object which will improve SQL access to it.
CA SQL-Ease allows the programmer to manipulate the DB2 statistics for the object to allow
‘what-if’ scenarios to be played to see what effect different volumes of user data will have on
the efficiency of the SQL they are developing.
CA SQL-Ease also allows the programmer to execute the SQL statement to ensure the
expected result set is returned.
Enhanced Explain also offers eight different reports to fully document the SQL using summary
report, access path, cost estimate, predicate analysis, object dependency, RI dependency, tree
diagram and object statistics.
CA Plan Analyzer® for DB2 for z/OS
CA Plan Analyzer for DB2 for z/OS (CA Plan Analyzer) is designed to improve DB2
performance by efficiently analyzing SQL and provides in-depth SQL reports and
recommendations to show you how to fix resource-hungry and inefficient SQL statements in
your application. SQL can be analyzed from any source and at any level, from a single SQL statement or a group of SQL statements right up to a complete DB2 application.
CA Plan Analyzer can analyze SQL from DB2 plans, packages, DBRMs, SQL statements and
QMF queries from the DB2 Catalog. It can also analyze statements from a DBRMLIB, a file,
an exported QMF query or statements entered online. You can create logical groups of SQL
sources using SQL from any of the sources listed by creating a strategy. The strategy stores
the logical grouping and means you can create a logical grouping for an application or a group
of applications if you wish.
CA Plan Analyzer is also integrated with CA Detector for DB2 for z/OS so SQL can also be
analyzed online directly from within CA Detector’s application performance management
functions. This means that inefficient SQL identified in real time by CA Detector can be
analyzed by CA Plan Analyzer to help you fix your performance problem. This is especially
useful for ERP environments where dynamic SQL is used. CA Detector allows you to capture
problematic dynamic SQL statements and pass them to CA Plan Analyzer to identify the cause
of the problem.
CA Plan Analyzer uses the same Enhanced Explain capability and expert system as CA SQL-Ease to offer recommendations showing how you can improve the efficiency of your SQL by making changes to SQL coding, predicate coding, or physical objects.
CA Plan Analyzer adds further expert system support by providing recommendations to
improve SQL efficiency based on making changes to plans and packages. The same eight
Enhanced Explain reports are available namely: summary report, access path, cost estimate,
predicate analysis, object dependency, RI dependency, tree diagram and object statistics.
One of CA Plan Analyzer’s most powerful features provides you with the ability to monitor SQL
statements and their efficiency over time. Each time you use the Explain features of CA Plan
Analyzer, the results are stored away within an historical database. This means that as you
enhance your applications, or make physical database object changes, CA Plan Analyzer can compare your SQL access paths to those captured previously and highlight any changes which may
affect the performance of your application.
This feature is especially useful when you upgrade to a new DB2 version. Binding your plans
and packages on a new DB2 version may lead to different access paths. CA Plan Analyzer
can evaluate your access paths on the new DB2 version without needing to bind your plans
or packages. The powerful Compare feature allows you to identify changed SQL statements,
changed host variables and changed access paths with powerful filtering based on changes in
statement cost using ms, service units and TIMERONS. By using the Compare feature, you only
need to evaluate SQL statements whose access paths have changed rather than every SQL
statement in your entire application.
CA Plan Analyzer also gives you access to Statistics Manager for analyzing SQL performance
by projecting data statistics so you can create ‘what if’ scenarios based on different data
volumes within your DB2 objects.
CA Plan Analyzer further provides powerful features allowing you to identify problem SQL
within your applications. You can search for SQL whose access characteristics may be
undesirable, so for example you can find all statements that use exclusive (X) DB2 locking or
all statements that perform tablespace scans. Along similar lines you can search for problem
plans or packages which may use undesirable bind characteristics, such as all plans that use
uncommitted read as an isolation level.
In addition to Enhanced Explain, CA Plan Analyzer provides a whole series of plan, package,
DBRM, statement and object reports allowing the DBA to administer the total SQL environment.
Working directly from any report, you can BIND, FREE, or REBIND some or all of the plans or
packages listed by entering primary or line commands. All utilities can be executed both online
and in batch processing.
CA Plan Analyzer also provides a Stored Procedures Maintenance facility giving you the ability to create, update, browse, delete, and start stored procedures and maintain your DB2 SYSPROCEDURES table.
SECTION 3
Ensure Efficient Object Access for DB2 for z/OS with CA Database Management
Tuning your application SQL statements is critical, but all that effort could be wasted if the DB2 Catalog statistics for your application objects do not accurately reflect the actual data volumes stored in those objects, or if your physical DB2 objects become excessively disorganized.
Efficient SQL access paths rely on accurate DB2 Catalog statistics. When you bind your
application plans and packages, or if you are using dynamic SQL, the DB2 optimizer will use
DB2 Catalog statistics to help determine which access path to use for a given SQL statement.
For example, if the number of rows in a table is very small, DB2 may choose to use a
tablespace scan rather than index access. This provides reasonable results as long as the data
volumes in that table remain small. If the actual data volumes in the table become very large,
using a tablespace scan would likely cause application performance to suffer.
Keeping your DB2 Catalog statistics current requires you to collect RUNSTATS on a regular
basis. This is even more important today if you have ERP applications using dynamic SQL because data volumes can vary widely. Some tables can contain 0 rows one minute and then an hour later hold over a million. When would you run RUNSTATS on such an object? Even with dynamic SQL, the DB2 optimizer still has to rely on the DB2 Catalog statistics being
accurate to choose the best access path. The key then is to collect RUNSTATS regularly. Given
the immense volumes of data in large enterprises, running RUNSTATS can be a very expensive
and time-consuming exercise.
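For reference, a minimal IBM RUNSTATS utility control statement might look like the following; the database and tablespace names are illustrative, and the options you choose will depend on your environment:

   RUNSTATS TABLESPACE MYDB.MYTS
     TABLE(ALL) INDEX(ALL)
     SHRLEVEL CHANGE
     UPDATE ALL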
Application performance can also be badly impacted if your physical DB2 objects are
excessively disorganized. The more work that DB2 has to do to physically locate the desired
data rows, the longer your application will wait. ERP applications are notorious for spawning
highly disorganized data. First, ERP applications tend to perform high volumes of INSERT
statements which cause indexes to become badly disorganized and wasteful of space. Next,
since most ERP objects contain only variable length rows, any UPDATE to a row that causes
the row length to change can mean the row gets relocated away from its home page.
Finally, because of the high volumes of transactions common in such ERP systems, physical
objects can quickly become badly disorganized. This can impact the performance of your
application significantly.
CA Database Analyzer™ for DB2 for z/OS
CA Database Analyzer allows you to define Extract Procedures. These define a logical grouping of DB2 objects and support extensive wildcarding of object names. It is simple to specify that you want to include all objects based on a creator ID of, say, SAPR3, thus including every object for an SAP application under that schema name. So, a simple Extract
Procedure definition could include many thousands of DB2 objects for a complete ERP
application. Since object names are evaluated at run time, any newly created objects will
automatically be included.
Extract Procedures can be used to collect statistics, including RUNSTATS statistics, for all
objects which meet the selection criteria. The process uses multi-tasking so multiple objects
are processed in parallel. The process runs very quickly and does not impact DB2 or your DB2
applications.
In addition to collecting RUNSTATS statistics, an Extract Procedure will collect approximately twice the number of data points for each object that RUNSTATS does. The other major feature of statistics collection using an Extract Procedure is that all data points for every execution are stored away within a database. This allows for online reporting and also provides the
opportunity for identifying trends in data volumes and allows you to forecast when an event
will occur based on the current growth.
The reporting function allows you to view the data points which have been collected for each DB2 object. Twenty-one different reports are available for tablespace objects, while sixteen reports are provided for index objects. Each report can be viewed as a data query, a graph, a trend or a forecast, and you can view either the latest data or weekly, monthly, or yearly aggregated values.
Another powerful feature in CA Database Analyzer gives you the ability to selectively generate
DB2 utility jobs based on object data point criteria. So, for example, you can specify that you
want to run a REORG if the cluster ratio of any tablespace is lower than say 60%.
This is achieved by creating an Action Procedure. An Action Procedure defines two things. First, you specify the conditions under which you would like to trigger a DB2 utility. Over 100 example triggers are provided in the tool, and these can be evaluated against DB2 Catalog statistics, data points within the database, and DB2 Real Time Statistics. You may also set a threshold value for many of the triggers. Second, once you have specified your triggers, you specify which DB2 utilities you would like to run. This supports all of the IBM utility tools, all of the CA utility tools, and
various other utility programs and user applications.
By tying an Action Procedure to an Extract Procedure you link the objects you are interested in
to the triggers and trigger thresholds.
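Outside of the tool, a hedged catalog query along the following lines (assuming statistics have been collected recently) can list indexes whose cluster ratio has fallen below 60 percent:

   SELECT CREATOR, NAME, TBCREATOR, TBNAME, CLUSTERRATIOF
     FROM SYSIBM.SYSINDEXES
    WHERE CLUSTERRATIOF BETWEEN 0 AND 0.60   -- stored as a fraction; 0.60 = 60%
    ORDER BY CLUSTERRATIOF;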
Using Extract and Action Procedures in this way, you can not only ensure you have the latest statistics for your application objects, but also automatically generate utility jobs, such as CA Rapid Reorg, to REORG objects which are badly disorganized. This means you can REORG the objects which need to be reorganized and not waste time reorganizing other
objects which are not disorganized. This approach helps you keep your applications performing
at the optimum level whilst not wasting any of your valuable CPU resources.
CA Rapid Reorg® for DB2 for z/OS
High levels of performance are provided by using multi-tasking and parallel processing of partitions. Indexes can be built in parallel to the tablespace partitions, and non-partitioning indexes can either be completely rebuilt or updated. Sorting of data is performed efficiently, and data is sorted in its compressed form. Clever use of z/OS dataspaces allows buffer areas to be placed in fast storage rather than using slow work files. Lastly, a Log Monitor address space can be used to filter log records for use during the high-speed log apply phase of an online REORG.
With 24x7 application availability becoming the norm, online reorganization is a necessity.
An online REORG uses a shadow copy of the DB2 object where the reorganized data is written
and log records are applied. Once the REORG process has finished, the shadow copy of the DB2 object is switched to become the ‘live' copy of the object. This switch phase can cause
problems for applications, particularly ERP applications where long running transactions can
hold locks for many hours without committing.
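For orientation, a minimal IBM REORG TABLESPACE control statement for an online reorganization might look like the sketch below; the object name, mapping table, and limits are illustrative, and the exact options vary by REORG utility and release:

   REORG TABLESPACE MYDB.MYTS
     SHRLEVEL CHANGE
     MAPPINGTABLE MYID.MAP_TBL    -- mapping table must already exist
     MAXRO 30
     DRAIN ALL
     FASTSWITCH YES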
CA Rapid Reorg provides the greatest flexibility to control the switch phase for an online
REORG. Whatever REORG utility you use, there will be a short time when availability of the
DB2 object is suspended while the switch phase occurs. With CA Rapid Reorg, you get to
choose and control when access to the object is suspended. You can specify that the REORG wait until all application claims have completed before it suspends access. Once access is suspended, the switch happens quickly using FASTSWITCH. This approach avoids the situation, seen with other REORG tools, where a DRAIN issued at the switch phase has to wait for long-running transactions to complete while at the same time preventing new SQL claims from the application from running. CA Rapid Reorg provides the
best control over application availability for online REORG.
Using CA Rapid Reorg you can maintain the performance of your applications by ensuring
the data is fully organized. CA Rapid Reorg is so efficient you may find you can afford to run
REORG much more frequently, thus improving your application performance further.
SECTION 4
Monitor Applications for DB2 for z/OS with CA Database Management
After you have spent time and effort tuning your application SQL to be as efficient as possible
and tuning your DB2 objects for high performance, it is always disappointing when your
application performance does not meet your expectations. There could be many reasons for
this. You may have bottlenecks within the DB2 system itself and you need to find them and
take corrective action. Your application workload and data volumes may not be what you
expected and these can cause performance bottlenecks. Additionally, these volumes can
change over time and impact performance at some point in the future. Lastly, if you are running an ERP application that uses dynamic SQL, you are dealing with one of the most dynamic environments to manage and one of the hardest in which to maintain good, stable performance.
CA Insight™ Database Performance Monitor for DB2 for z/OS
CA Insight DPM provides a System Condition Monitor function. The System Condition Monitor
provides a single point for viewing overview status information for all DB2 systems you are
monitoring. This is the starting point for finding performance problems and then drilling down
to identify the cause.
DB2 system activity is collected based on user-defined time intervals. A data collector task is
used to collect the DB2 information for monitoring use. DB2 system information is displayed as
an accumulation of all intervals, or the difference or delta between the current and most recent
interval. Information is provided on buffer pool usage, EDM pool usage, storage usage, locks,
log activity and SQL activity counts. The data collector can be customized for each DB2
subsystem, making it easy to vary the set of collected performance information from one
subsystem to the next and get the precise amount of detail desired.
When viewing DB2 application thread activity, you can drill down into a thread for deeper
analysis of potential problems, including determining how long the thread has been active,
how much of that time is spent in DB2 and how much time is spent waiting for DB2 resources. Thread information includes SQL text, timing information, SQL counts, buffer pool activity, lock activity, Distributed Data Facility (DDF) data and Resource Limit Facility (RLF) data. Having identified a poorly performing SQL statement, you can invoke EXPLAIN to identify the access
path and see why you have that problem. From there you can make the decision as to how you
will fix the problem.
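Outside of the monitor, the same access path information can be obtained with a plain EXPLAIN, assuming a PLAN_TABLE exists under your authorization ID; the statement and query number here are illustrative:

   EXPLAIN PLAN SET QUERYNO = 100 FOR
     SELECT ACCT_ID
       FROM ACCT_HIST
      WHERE ACCT_ID = 12345;

   SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, ACCESSTYPE,
          ACCESSNAME, MATCHCOLS, PREFETCH
     FROM PLAN_TABLE
    WHERE QUERYNO = 100
    ORDER BY QBLOCKNO, PLANNO;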
One of the most powerful features of CA Insight DPM is the Exception Monitor. CA Insight
DPM includes hundreds of predefined exception conditions that can be selectively activated for
system events, application events and SQL events. When a DB2 processing limit is reached or
exceeded, the exception will be highlighted. This allows you to instantly see when a potential
performance issue exists and react to fix the problem. An exception can also be configured to
submit a user-defined Intelligent Module (IMOD) to automatically initiate corrective action.
By setting thresholds in the Exception Monitor, you do not need to constantly watch the
screens looking for potential performance problems. When they occur, CA Insight DPM will
notify you.
One further powerful feature of CA Insight DPM is the implementation of Insight Query
Language (IQL). All screen displays within CA Insight DPM are built using IQL. CA Insight
DPM is supplied with many built-in requests, all of which can be customized using IQL. You may also write new on-demand requests to address new requirements. This
makes CA Insight DPM very flexible and extensible for your environment and particular
monitoring needs.
CA Detector® for DB2 for z/OS
CA Detector® for DB2 for z/OS (CA Detector) is an application performance analysis tool for
DB2 which provides unique insight into your application workload and resource use. It uniquely
allows you to view activity at the application, plan, package, DBRM and SQL statement level.
CA Detector allows you to identify the programs and SQL statements that most significantly
affect your DB2 system performance.
CA Detector collects its data exclusively from accounting trace information. Accounting trace
data imposes a significantly lower monitoring overhead on your DB2 system than performance traces do. Additionally, the use of accounting data allows CA Detector to provide its unique focus on performance, not at the DB2 system level, but at the application level. This means you can tune your application to perform efficiently on your system, rather than tune your system to make up for deficiencies in your application.
CA Detector analyzes all SQL that runs on your system, both dynamic and static. Its low overhead means that you can run CA Detector 24x7 and never miss an application performance problem. All information collected by CA Detector is stored in a datastore which rolls into a history datastore driven by the user-defined interval period. This means you can store information concerning the performance of your system over time. Not only that, you can unload
this if you wish and process it to further refine your performance data.
With CA Detector, you specify which plans, packages, DBRMs and SQL statements belong to which application. This can be done using wildcards to include or exclude SQL sources from your defined application. CA Detector aggregates the information for all of the SQL sources
within your application into a single view of performance data for your application. You can drill
down from the application level to view the plans, from there the packages and from there
down to the individual SQL statements.
The unique capability of CA Detector is that on any display it always shows the most resource
intensive application, plan, package or SQL statement. For example, if you know you have a
poorly performing application, you can drill down to see which is the most resource intensive
plan. From there you can drill down and see which is the most resource intensive package, then
SQL statement. From there you can view the SQL statement text and even jump to CA Plan
Analyzer to take advantage of all of the enhanced explain capabilities of that tool to help solve
the problem.
This capability allows you to find the operations within your application which are consuming
the most DB2 resources based on how your users are using the application. This can highlight, for example, whether that single daily execution of a tablespace scan in your application is actually any worse in performance terms than the index scan which runs millions of times a day.
In addition to viewing data at the application level, CA Detector allows you to view
performance data at the DB2 system level and at the DB2 member level when DB2 data-
sharing is used.
In a similar way, CA Detector has the ability to look for SQL error conditions occurring within
your application. You may have SQL transactions which are constantly receiving a negative
SQL code when trying to update a table. Unless the application is written to notify you of those
failures, you will never know they are happening. CA Detector will trap the SQL errors and
show them to you in the SQL Error display. From here you can proactively go and fix the problem.
With the increased use of dynamic SQL within applications, particularly in ERP applications, it
is even more important that you can manage the performance of dynamic SQL statements in
the same way as static SQL. CA Detector collects performance information for dynamic SQL in
exactly the same way as it does for static SQL. There is no need to set an exception to do this.
CA Detector collects the SQL text for dynamic SQL and can even help you identify when
similar dynamic statements are failing to take advantage of the statement cache because of
the use of constants or literals in the SQL text. Making simple changes to use parameter markers and host variables can significantly improve the performance of your applications by taking full advantage of the dynamic SQL cache.
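For example (the statement is hypothetical), each distinct literal below produces a separate entry in the dynamic statement cache, while the parameter marker form is prepared once and reused for every value:

   -- Literal values: ACCT_ID = 1001, ACCT_ID = 1002, ... each miss the cache
   SELECT ACCT_TYPE
     FROM ACCOUNT
    WHERE ACCT_ID = 1001;

   -- Parameter marker: one cached statement, reused for every value
   SELECT ACCT_TYPE
     FROM ACCOUNT
    WHERE ACCT_ID = ?;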
CA Subsystem Analyzer for DB2 for z/OS
CA Subsystem Analyzer for DB2 for z/OS (CA Subsystem Analyzer) is a complementary
product to CA Detector and uses the same collection interval process and datastore. While CA
Detector looks at how SQL is performing within your applications, CA Subsystem Analyzer is
designed to show you how those applications are impacting your DB2 resources, such as buffer
pools, EDM pools, RID pools, DASD volumes, datasets and even dataset extents.
CA Subsystem Analyzer works in the same way as CA Detector and always shows you the
most active item in the list, whether it is a list of DASD volumes or a list of buffer pools. With
CA Subsystem Analyzer, you can take a broad, system-wide view to identify and examine the
most active DB2 databases, tablespaces, tables, indexes, buffer pools and DASD volumes, then
drill down logically to look at specific details.
Tight process flow integration between CA Subsystem Analyzer and CA Detector means that
you can seamlessly and automatically transition between the two products. For example,
having drilled down to view the busiest DB2 table in your system using CA Subsystem
Analyzer, you can select that you want to view the SQL accessing that table. You are then taken
seamlessly into CA Detector to view those SQL statements, from where you can jump to CA Plan Analyzer, if you like, to take advantage of the enhanced explain capabilities of that tool.
When trying to improve application performance, it is not always the application SQL which is
causing the problem. Sometimes you need to make changes to your DB2 system to improve
performance or remove a processing bottleneck.
It is sometimes hard to determine which objects are being used most frequently and should
be isolated from each other to avoid contention. With CA Subsystem Analyzer it is easy. For
example, you may notice that a certain buffer pool has very high activity. You can drill down to
see which active objects are using that buffer pool. You will see a list of tablespaces showing
the tablespace with the highest activity at the top. You can even drill down further to view the
most active table and from there view the SQL activity occurring on that table if you like. You
may decide that one of the tablespaces should be moved to a different buffer pool to reduce
contention and improve performance.
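If you do decide to move a tablespace, the change itself is a simple ALTER; the names below are hypothetical, the target buffer pool must already be defined with a non-zero size, and the change may not take full effect until the tablespace datasets are next closed and reopened:

   ALTER TABLESPACE MYDB.MYTS
     BUFFERPOOL BP2;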
CA Subsystem Analyzer allows you to view your system performance data at the DB2 system
level and at the DB2 member level when DB2 data-sharing is used.
CA Thread Terminator Tool
CA Thread Terminator is a very powerful tool that helps you manage your DB2 system and applications in real time. The tool allows you to make changes to your DB2 environment immediately, whereas making the same change using normal DB2 techniques might require you to recycle DB2.
You can make a whole range of changes to your system parameters, including changes to buffer pool sizes, security parameters, logging parameters, application values, performance values, storage sizes (including the EDM pool), thread parameters, and operator parameters. You can
also add or delete active log datasets.
CA Thread Terminator can also show you a list of all active DB2 threads for all DB2 systems.
From a thread, you can drill down and view the thread detail, including the SQL text. A large amount of information concerning in-DB2 times, wait times, and I/O counts can be seen for each thread. For dynamic SQL you can see the SQL text and also the contents of the host variables.
A common problem, particularly with ERP applications with dynamic SQL, is run-away threads.
These are SQL requests which may run for many hours and never finish. The problem with
these is that they consume system resources and potentially hold locks preventing other
transactions and utilities, such as REORG, from continuing. With CA Thread Terminator, you can first view the SQL and host variables for the run-away thread. You can evaluate the DB2 statistics and see the amount of resource the thread has consumed. You can then choose to cancel that thread to release the DB2 resources that it holds. The cancel request will work even if the status of the thread is not in DB2.
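The equivalent native DB2 commands for finding and cancelling a thread look like the following (the thread token is illustrative); as noted above, CA Thread Terminator can also cancel work that a plain -CANCEL THREAD cannot reach:

   -DISPLAY THREAD(*) TYPE(ACTIVE)
   -CANCEL THREAD(1234)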
Another useful feature of CA Thread Terminator allows you to terminate all threads accessing a particular table or pageset. This is useful if you have to REORG an object, but locks associated with threads are preventing the REORG from running.
SECTION 5
Further Improve Application Performance for DB2 for z/OS with CA Database Management
After designing and implementing an efficient SQL application in your production system,
sometimes performance does not meet your expectations. Alternatively, you may have
implemented an ERP application only to find that performance is poor.
Monitoring your application using tools like CA Detector and CA Insight DPM can highlight
where the performance problem lies. You may be able to make further SQL performance
improvements highlighted by the enhanced explain capability of CA Plan Analyzer.
Additionally, you may be able to make database and system improvements highlighted
by CA Subsystem Analyzer.
One area where performance can be improved further is effective index design. No doubt you
will have designed your indexes to support fast access to the table data. The problem, however, is that when you designed your indexes, you could never be certain what the distribution of your data would look like. It is only after you are in production, when production data volumes are loaded, that you can see how effective your indexes are. Worse still, if you are using an ERP system such as SAP, you may have had no control over the design of the indexes which are implemented.
Whatever indexes you have in your application, it may be that they do not provide the
performance benefits you expected. You may also have indexes which the DB2 optimizer never
chooses to use. Both of these situations can impact application performance.
CA Index Expert™ for DB2 for z/OS
With CA Index Expert, you can analyze an entire application quickly and efficiently, and make
intelligent index design decisions. This product saves time and reduces errors by analyzing
complex SQL column dependencies and reviewing SQL to determine which DB2 objects
are referenced.
CA Index Expert recommends indexing strategies at the application level with suggestions
based on actual SQL usage. This is an important benefit. Using data imported from CA Detector,
CA Index Expert knows how existing indexes and tables are referenced and how often.
Using CA Detector, it is easy to identify a problem SQL transaction. Analysis of the SQL using
the enhanced explain capability of CA Plan Analyzer may find that index access is not being
used for some reason. Remember, if this is an ERP application you may not be able to change
the SQL to resolve the problem. Using CA Index Expert you can obtain recommendations for changes to the index design which would allow index access to be used. Implementing the recommended index design can then resolve your application performance problem.
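The recommendation itself is implemented with ordinary DDL; for example, a hypothetical index on the columns referenced by the problem statement might look like:

   CREATE INDEX MYID.XACCT_HIST_01
     ON ACCT_HIST (ACCT_ID, HIST_DATE);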
CA, one of the world’s largest information technology (IT)
management software companies, unifies and simplifies
complex IT management across the enterprise for greater
business results. With our Enterprise IT Management vision,
solutions and expertise, we help customers effectively
govern, manage and secure IT.
HB05ESMDBMS01E MP321131007