DB2 Interview Questions
A1. A DB2 bind is a process that builds an access path to DB2 tables.
A2. An access path is the method used to access data specified in DB2 SQL statements.
A3. An application plan or package is generated by the bind to define an access path.
Q4. What is normalization and what are the five normal forms?
A4. Normalization is a design procedure for representing data in tabular format. The
five normal forms are progressive rules to represent the data with minimal redundancy.
A5. Foreign keys are attributes of one table that have matching values in a primary key
in another table, allowing for relationships between tables.
A7. WHERE is used with a relational statement to isolate the object element or row.
Q8. What techniques are used to retrieve data from more than one table in a single SQL
statement?
A8. Joins, unions and nested selects are used to retrieve data.
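The three techniques can be sketched side by side. This is only an illustration: it uses
Python's built-in sqlite3 module as a stand-in for DB2, and the dept and emp tables and
their columns are hypothetical.

```python
import sqlite3

# SQLite stands in for DB2 here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, deptname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, name TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'SALES'), (20, 'AUDIT');
    INSERT INTO emp  VALUES (1, 'ADAMS', 10), (2, 'BAKER', 20), (3, 'CLARK', 10);
""")

# Join: each result row carries columns from both tables.
join_rows = conn.execute("""
    SELECT e.name, d.deptname
    FROM emp e JOIN dept d ON e.deptno = d.deptno
    ORDER BY e.empno
""").fetchall()

# Union: the rows of two selects are combined into one result.
union_rows = conn.execute("""
    SELECT name FROM emp WHERE deptno = 10
    UNION
    SELECT deptname FROM dept
    ORDER BY 1
""").fetchall()

# Nested select: the inner select supplies values to the outer WHERE clause.
nested_rows = conn.execute("""
    SELECT name FROM emp
    WHERE deptno IN (SELECT deptno FROM dept WHERE deptname = 'SALES')
    ORDER BY empno
""").fetchall()

print(join_rows)
print(union_rows)
print(nested_rows)
```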
Q9. What do the initials DDL and DML stand for and what is their meaning?
A9. DDL is data definition language and DML is data manipulation language. DDL
statements include CREATE, ALTER and DROP. DML statements are SELECT,
INSERT, DELETE and UPDATE.
A11. An outer join includes rows from one or both tables even when there are no matching
values in the other table.
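A minimal sketch of the difference between an inner and an outer join, run against SQLite
through Python's sqlite3 module rather than DB2. The dept and emp tables are hypothetical;
department 30 has no employees.

```python
import sqlite3

# SQLite stands in for DB2; the tables and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, deptname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, name TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'SALES'), (20, 'AUDIT'), (30, 'LEGAL');
    INSERT INTO emp  VALUES (1, 'ADAMS', 10), (2, 'BAKER', 20);
""")

# Inner join: only departments that have a matching employee row.
inner_rows = conn.execute("""
    SELECT d.deptname, e.name
    FROM dept d JOIN emp e ON e.deptno = d.deptno
    ORDER BY d.deptno
""").fetchall()

# Left outer join: every department, with NULL where no employee matches.
outer_rows = conn.execute("""
    SELECT d.deptname, e.name
    FROM dept d LEFT OUTER JOIN emp e ON e.deptno = d.deptno
    ORDER BY d.deptno
""").fetchall()

print(inner_rows)
print(outer_rows)
```

The unmatched department appears only in the outer join result, with None (NULL) in the
employee column.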
A12. A subselect is a select which works in conjunction with another select. A nested
select is a kind of subselect where the inner select passes to the where criteria for the
outer select.
A13. GROUP BY collects rows that share the same values in the grouping columns into
summary rows, typically for use with aggregate functions. ORDER BY sorts the rows of the
SELECT statement's result.
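A short sketch of what each clause does, using SQLite via Python's sqlite3 module as a
stand-in for DB2. The orders table and its data are hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; the orders table is invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (cust TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('ACME', 50), ('ZENITH', 20), ('ACME', 30);
""")

# GROUP BY: one summary row per customer, here with a total per group.
grouped = conn.execute("""
    SELECT cust, SUM(amount)
    FROM orders
    GROUP BY cust
    ORDER BY cust
""").fetchall()

# ORDER BY alone: every row is kept, only the sequence changes.
ordered = conn.execute("""
    SELECT cust, amount
    FROM orders
    ORDER BY amount DESC
""").fetchall()

print(grouped)
print(ordered)
```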
A14. The EXPLAIN statement provides information about the optimizer's choice of access
path for the SQL statement.
A15. Tables are stored in tablespaces (hence the name)! There are three types of
tablespaces: simple, segmented and partitioned.
A16. An embedded sql statement may return a number of rows while the programming
language can only access one row at a time. The programming device called a cursor
controls the position of the row.
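The row-at-a-time pattern can be sketched with a Python DB-API cursor, which plays a
similar role to an embedded SQL cursor. This runs against SQLite, not DB2; the emp table
is hypothetical, and the comments mapping calls to OPEN, FETCH and CLOSE are an analogy
rather than the actual embedded SQL syntax.

```python
import sqlite3

# SQLite stands in for DB2; the emp table is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO emp VALUES (1, 'ADAMS'), (2, 'BAKER'), (3, 'CLARK');
""")

cur = conn.cursor()
# Roughly comparable to DECLARE CURSOR plus OPEN: the result set is established.
cur.execute("SELECT empno, name FROM emp ORDER BY empno")

names_seen = []
while True:
    row = cur.fetchone()      # comparable to FETCH: advance one row
    if row is None:           # comparable to the "no more rows" condition
        break
    names_seen.append(row[1])

cur.close()                   # comparable to CLOSE
print(names_seen)
```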
A17. Referential integrity refers to the consistency that must be maintained between
primary and foreign keys, i.e. every foreign key value must have a corresponding
primary key value.
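A minimal sketch of a referential constraint rejecting an orphan foreign key value. It
uses SQLite through Python's sqlite3 module as a stand-in for DB2; the PRAGMA line is
SQLite-specific (SQLite does not enforce foreign keys unless asked), and the table names
are hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; tables are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-only switch, not a DB2 statement
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        deptno INTEGER REFERENCES dept (deptno)
    )
""")
conn.execute("INSERT INTO dept VALUES (10)")
conn.execute("INSERT INTO emp VALUES (1, 10)")   # parent row exists: accepted

try:
    conn.execute("INSERT INTO emp VALUES (2, 99)")  # no dept 99: should fail
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(orphan_rejected, emp_count)
```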
Q18. Usually, which is more important for DB2 system performance - CPU processing or
I/O access?
A18. I/O operations are usually most critical for DB2 performance (or any other
database for that matter).
Q19. Is there any advantage to denormalizing DB2 tables?
A19. Denormalizing DB2 tables reduces the need for processing-intensive relational
joins and reduces the number of foreign keys.
A20. The database descriptor, DBD is the DB2 component that limits access to the
database whenever objects are created, altered or dropped.
A21. To maintain the integrity of DB2 objects the DBD permits access to only one object
at a time. Lock contention happens if several objects are required by contending
application processes simultaneously.
A22. SPUFI stands for SQL processing using file input. It is the DB2 interactive menu-
driven tool used by developers to create database objects.
Q23. What is the significance of DB2 free space and what parameters control it?
A23. The two parameters used in the CREATE statement are the PCTFREE which
specifies the percentage of free space for each page and FREEPAGE which indicates the
number of pages to be loaded with data between each free page. Free space allows room
for the insertion of new rows.
Q24. What is a NULL value? What are the pros and cons of using NULLS?
A24. A NULL value takes up one byte of storage and indicates that a value is not
present as opposed to a space or zero value. It's the DB2 equivalent of TBD on an
organizational chart and often correctly portrays a business situation. Unfortunately, it
requires extra coding for an application program to handle this situation.
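The extra handling that NULLs require can be sketched as follows. The example runs on
SQLite through Python's sqlite3 module, not DB2, and the emp table with its bonus column
is hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; the emp table is invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, bonus INTEGER);
    INSERT INTO emp VALUES ('ADAMS', 100), ('BAKER', NULL), ('CLARK', 0);
""")

# "= NULL" never evaluates to true, so no row qualifies.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM emp WHERE bonus = NULL").fetchone()[0]

# IS NULL is the predicate that actually finds missing values.
is_null = conn.execute(
    "SELECT name FROM emp WHERE bonus IS NULL").fetchall()

# Aggregates skip NULLs: the average is taken over 100 and 0 only.
average = conn.execute("SELECT AVG(bonus) FROM emp").fetchone()[0]

# COALESCE substitutes a default so the program need not test for NULL itself.
defaulted = conn.execute(
    "SELECT name, COALESCE(bonus, 0) FROM emp ORDER BY name").fetchall()

print(equals_null, is_null, average, defaulted)
```

Note that a zero bonus (CLARK) and a missing bonus (BAKER) remain distinct until COALESCE
deliberately merges them.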
A25. A synonym is used to reference a table or view by another name. The other name
can then be written in the application code pointing to test tables in the development
stage and to production entities when the code is migrated. The synonym is linked to the
AUTHID that created it.
A27. A LIKE table is created by using the LIKE parameter in a CREATE table
statement. LIKE tables are typically created for a test environment from the production
environment.
Q28. If the base table underlying a view is restructured, e.g. attributes are added, does
the application code accessing the view need to be redone?
A28. No. The table and its view are created anew, but the programs accessing the view
do not need to be changed if the view and attributes accessed remain the same.
Q29. Under what circumstances will DB2 allow an SQL statement to update more than
one primary key value at a time?
A29. Never. Such processing could produce duplicate values violating entity
integrity. Primary keys must be updated one at a time.
Q30. What is the cascade rule and how does it relate to deletions made with a
subselect?
A30. The cascade rule will not allow deletions based on a subselect that references the
same table from which the deletions are being made.
A31. The self-referencing constraint limits in a single table the changes to a primary key
that the related foreign key defines. The foreign key in a self referencing table must
specify the DELETE CASCADE rule.
A32. Tables related with a foreign key are called delete-connected because a deletion in
the primary key table can affect the contents of the foreign key table.
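A small sketch of a delete-connected pair of tables with a cascading delete rule. It runs
on SQLite via Python's sqlite3 module rather than DB2; the PRAGMA statement is
SQLite-specific and the dept and emp tables are hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; tables and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-only switch, not a DB2 statement
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY);
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        deptno INTEGER REFERENCES dept (deptno) ON DELETE CASCADE
    );
    INSERT INTO dept VALUES (10), (20);
    INSERT INTO emp  VALUES (1, 10), (2, 10), (3, 20);
""")

# Deleting the parent row also removes the dependent rows in emp.
conn.execute("DELETE FROM dept WHERE deptno = 10")

remaining = conn.execute("SELECT empno FROM emp ORDER BY empno").fetchall()
print(remaining)
```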
Q33. When can an insert of a new primary key value threaten referential integrity?
A33. Never. New primary key values are not a problem. However, the values of foreign
key inserts must have corresponding primary key values in their related tables. And
updates of primary key values may require changes in foreign key values to maintain
referential integrity.
Q34. In terms of DB2 indexing, what is the root page?
A34. The simplest DB2 index is the B-tree and the B-tree's top page is called the root
page. The root page entries represent the upper range limits of the index and are
referenced first in a search.
A35. DB2 uses multiple indexes to satisfy multiple predicates in a SELECT
statement that are joined by an AND or OR.
Q36. What are some characteristics of columns that benefit from indexes?
A36. Primary key and foreign key columns; columns that have unique values; columns
that have aggregates computed frequently and columns used to test the existence of a
value.
Q37. What is a composite index and how does it differ from a multiple index?
A37. A multiple index is not one index but two indexes for two different columns of a
table. A composite index is one index made up of combined values from two columns in
a table. If two columns in a table will often be accessed together a composite index will
be efficient.
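The distinction can be sketched in DDL. This example creates the indexes in SQLite through
Python's sqlite3 module, not DB2, and inspects them with SQLite's own PRAGMA index_info;
the emp table, its columns and the index names are all hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; table, column and index names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emp (
        empno INTEGER PRIMARY KEY,
        lastname TEXT, firstname TEXT,
        deptno INTEGER, jobcode TEXT
    )
""")

# Composite index: ONE index built from the combined values of two columns.
conn.execute("CREATE INDEX ix_emp_name ON emp (lastname, firstname)")

# Multiple indexes: TWO separate indexes, one per column.
conn.execute("CREATE INDEX ix_emp_dept ON emp (deptno)")
conn.execute("CREATE INDEX ix_emp_job ON emp (jobcode)")


def index_columns(index_name):
    """Return the column names of an index, in key order (SQLite-specific)."""
    rows = conn.execute(f"PRAGMA index_info({index_name})").fetchall()
    return [name for _seqno, _cid, name in rows]


composite_cols = index_columns("ix_emp_name")
dept_cols = index_columns("ix_emp_dept")
job_cols = index_columns("ix_emp_job")
print(composite_cols, dept_cols, job_cols)
```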
A38. The number of distinct values for a column is called index cardinality. DB2's
RUNSTATS utility analyzes column value redundancy to determine whether to use a
tablespace or index scan to search for data.
A39. For a clustered index DB2 maintains rows in the same sequence as the columns in
the index for as long as there is free space. DB2 can then process that table in that order
efficiently.
Q40. What keyword does an SQL SELECT statement use for a string search?
A40. The LIKE keyword allows for string searches. The % sign is used as a wildcard.
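A brief sketch of LIKE patterns, run on SQLite through Python's sqlite3 module as a
stand-in for DB2. The people table is hypothetical. The underscore wildcard, which matches
exactly one character, is standard SQL although not mentioned above. The data is kept in
upper case because SQLite's LIKE ignores ASCII case by default, which may differ from DB2.

```python
import sqlite3

# SQLite stands in for DB2; the people table is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT);
    INSERT INTO people VALUES ('SMITH'), ('SMYTHE'), ('JONES'), ('BLACKSMITH');
""")


def names_like(pattern):
    """Return the names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM people WHERE name LIKE ? ORDER BY name", (pattern,))
    return [r[0] for r in rows]


starts_with = names_like("SM%")       # % matches any run of characters
contains = names_like("%SMITH%")      # % on both sides: substring search
single_char = names_like("SM_TH")     # _ matches exactly one character
print(starts_with, contains, single_char)
```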
Q41. What are some sql aggregates and other built-in functions?
A41. The common aggregate built-in functions are AVG, SUM, MIN, MAX and COUNT. The
DISTINCT keyword can be used inside them to ignore duplicate values.
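A quick sketch of the aggregate functions on a tiny table, using SQLite via Python's
sqlite3 module in place of DB2. The sales table and its values are hypothetical; DISTINCT
is shown as a keyword inside COUNT rather than as a function of its own.

```python
import sqlite3

# SQLite stands in for DB2; the sales table is invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (amount INTEGER);
    INSERT INTO sales VALUES (10), (20), (20), (50);
""")

avg_amt, sum_amt, min_amt, max_amt, row_count, distinct_count = conn.execute("""
    SELECT AVG(amount), SUM(amount), MIN(amount), MAX(amount),
           COUNT(*), COUNT(DISTINCT amount)
    FROM sales
""").fetchone()

print(avg_amt, sum_amt, min_amt, max_amt, row_count, distinct_count)
```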
Q43. What are the three DB2 date and time data types and their associated functions?
A43. The three data types are DATE, TIME and TIMESTAMP. CHAR can be used to
specify the format of each type. The DAYS function calculates the number of days
between two dates. (It's Y2K compliant).
A44. In DB2 a transaction typically requires a series of updates, insertions and deletions
that represent a logical unit of work. A transaction puts an implicit lock on the DB2
data. Programmers can use the COMMIT WORK statement to terminate the transaction
creating smaller units for recovery. If the transaction fails DB2 uses the log to roll back
values to the start of the transaction or to the preceding commit point.
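The commit and rollback behaviour can be sketched with two units of work. The example
uses SQLite through Python's sqlite3 module rather than DB2, so the locking and logging
details differ; the acct table is hypothetical, and conn.commit() and conn.rollback()
stand in for COMMIT and ROLLBACK.

```python
import sqlite3

# SQLite stands in for DB2; the acct table is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acct (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO acct VALUES (1, 100), (2, 0)")
conn.commit()

# Unit of work 1: both updates succeed, so the transfer is committed.
conn.execute("UPDATE acct SET balance = balance - 40 WHERE id = 1")
conn.execute("UPDATE acct SET balance = balance + 40 WHERE id = 2")
conn.commit()   # commit point: later failures cannot undo this work

# Unit of work 2: the program decides the change is invalid and backs it out.
conn.execute("UPDATE acct SET balance = balance - 500 WHERE id = 1")
conn.rollback() # values return to the preceding commit point

balances = conn.execute("SELECT balance FROM acct ORDER BY id").fetchall()
print(balances)
```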
A45. Deadlock occurs when transactions executing at the same time lock each other out
of data that they need to complete their logical units of work.
A46. DB2 imposes locks of four differing sizes: page, table, tablespace and, for
indexes, subpage.
A47. The three types of locks are shared, update and exclusive. Shared locks allow two or
more programs to read simultaneously but not change the locked space. An exclusive
lock bars all other users from accessing the space. An update lock is less restrictive; it
allows other transactions to read or acquire shared locks on the space.
A48. SQL statements may return any number of rows, but most host languages deal with
one row at a time, so the program declares a cursor that presents the rows one at a time.
A49. An intent lock is at the table level for a segmented tablespace or at the tablespace
level for a nonsegmented tablespace. They indicate at the table or tablespace level the
kinds of locks at lower levels.
Q50. What is the difference between static and dynamic sql?
A50. Static sql is hard-coded in a program when the programmer knows the statements
to be executed. For dynamic sql the program must dynamically allocate memory to
receive the query results.
A51. Cursor stability means that DB2 takes a lock on the page the cursor is accessing
and releases the lock when the cursor moves to another page.
Q52. What is the significance of the CURSOR WITH HOLD clause in a cursor
declaration?
A52. The clause keeps the cursor open across a commit, so the program does not have to
reopen the cursor and reposition it to the last row processed.
Q53. What is the SQL Communications Area and what are some of its key fields?
A53. It is a data structure that must be included in any host-language program using
SQL. It is used to pass feedback about the sql operations to the program. Fields are
return codes, error messages, handling codes and warnings.
A54. The WHENEVER statement is coded once in the host program to control program
actions depending on the SQLCODE returned by each SQL statement within the program.
A55. DCLGEN stands for declarations generator; it is a facility to generate DB2 sql
data structures in COBOL or PL/I programs.
A56. The FREE command can be used to delete plans and/or packages no longer
needed.
Q57. DB2 can implement a join in three ways using a merge join, a nested join or a
hybrid join. Explain the differences.
A57. A merge join requires that the tables being joined be in a sequence; the rows are
retrieved with a high cluster ratio index or are sorted by DB2. A nested join does not
require a sequence and works best on joining a small number of rows. DB2 reads the
outer table values and each time scans the inner table for matches. The hybrid join is a
nested join that requires the outer table be in sequence.
Q58. Compare a subselect to a join.
A58. Any subselect can be rewritten as a join, but not vice versa. Joins are usually more
efficient because join rows can be returned immediately, whereas subselects require a
temporary work area for the inner select's results while processing the outer select.
A59. If there is an index on the attributes tested an IN is more efficient since DB2 uses
the index for the IN. (IN for index is the mnemonic).
A60. A Cartesian product usually results from a faulty query, typically one with a missing
join condition. The result contains a row for every combination of rows in the joined
tables.
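A minimal sketch of how a missing join condition multiplies the result, run on SQLite via
Python's sqlite3 module instead of DB2. The emp and dept tables are hypothetical: three
employees and two departments.

```python
import sqlite3

# SQLite stands in for DB2; tables and data are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, deptname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, name TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'SALES'), (20, 'AUDIT');
    INSERT INTO emp  VALUES (1, 'ADAMS', 10), (2, 'BAKER', 20), (3, 'CLARK', 10);
""")

# No join condition: every employee is paired with every department.
cartesian_count = conn.execute(
    "SELECT COUNT(*) FROM emp, dept").fetchone()[0]

# With the join condition: each employee is paired only with its own department.
joined_count = conn.execute(
    "SELECT COUNT(*) FROM emp, dept WHERE emp.deptno = dept.deptno").fetchone()[0]

print(cartesian_count, joined_count)
```

With 3 employees and 2 departments the unconstrained query yields 3 x 2 = 6 rows, while the
properly joined query yields 3.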
Q61. (4/99, mail from Joseph Howard) In DB2, what is the difference between a
package and a plan? How does one bind 2 versions of a CICS transaction with the same
module name in two different CICS regions that share the same DB2 subsystem?
A61. Package and plan are usually used synonymously, as in this site. Both contain
optimized code for SQL statements - a package for a single program, module or
subroutine contained in the database request module (DBRM) library. A plan may
contain multiple packages and pointers to packages. The one CICS module would then
exist in a package that could be referenced in two different plans.
A62. It is a write to disk that may occur before or long after a commit. The write is
controlled by the buffer manager.
A63. A lock is the mechanism that controls access to data pages and tablespaces.
A64. This is a key concept for any relational database. Isolation level is the manner in
which locks are applied and released during a transaction. For DB2 a 'repeatable read'
holds all locks until the transaction completes or a syncpoint is issued. For transactions
using 'cursor stability' the page lock releases are issued as the cursor 'moves', i.e. as the
transaction releases addressability to the records.
A66. The precompiler is a DB2 facility for static SQL statements - it replaces these
statements with calls to the DB2 language interface module.
A67. The root page is the opposite of a leaf page; it is the highest level index page. An
index can contain only the one root page; all other index pages are associated to the root.
A68. A thread is the connection between DB2 and some other subsystem, such as CICS
or IMS/DC.
SQL
Q1. What is the basic difference between a join and a union?
A1. A join selects columns from 2 or more tables. A union selects rows.
Q2. What is normalization and what are the five normal forms?
A2. Normalization is a design procedure for representing data in tabular format. The
five normal forms are progressive rules to represent the data with minimal redundancy.
A3. Foreign keys are attributes of one table that have matching values in a primary key
in another table, allowing for relationships between tables.
A5. WHERE is used with a relational statement to isolate the object element or row.
Q6. What techniques are used to retrieve data from more than one table in a single SQL
statement?
A6. Joins, unions and nested selects are used to retrieve data.
A7. A view is a virtual table made up of data from base tables and other views, but not
stored separately.
A8. An outer join includes rows from one or both tables even when there are no matching
values in the other table.
A9. A subselect is a select which works in conjunction with another select. A nested
select is a kind of subselect where the inner select passes to the where criteria for the
outer select.
Q10. What is the difference between group by and order by?
A10. GROUP BY collects rows that share the same values in the grouping columns into
summary rows, typically for use with aggregate functions. ORDER BY sorts the rows of the
SELECT statement's result.
Q11. What keyword does an SQL SELECT statement use for a string search?
A11. The LIKE keyword allows for string searches. The % sign is used as a wildcard.
Q12. What are some sql aggregates and other built-in functions?
A12. The common aggregate built-in functions are AVG, SUM, MIN, MAX and COUNT. The
DISTINCT keyword can be used inside them to ignore duplicate values.
A13. SUBSTR is used for string manipulation with column name, first position and
string length used as arguments. E.g. SUBSTR(NAME, 1, 3) refers to the first three
characters in the column NAME.
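A short sketch of SUBSTR with its three arguments, run on SQLite through Python's sqlite3
module as a stand-in for DB2 (SQLite's SUBSTR also counts positions from 1). The people
table and its NAME values are hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; the people table is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT);
    INSERT INTO people VALUES ('JOHNSON'), ('MARTINEZ');
""")

# SUBSTR(column, first position, length): positions start at 1.
first_three = conn.execute(
    "SELECT SUBSTR(name, 1, 3) FROM people ORDER BY name").fetchall()

# Starting further in: three characters beginning at position 4.
middle_three = conn.execute(
    "SELECT SUBSTR(name, 4, 3) FROM people ORDER BY name").fetchall()

print(first_three, middle_three)
```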
A14. The EXPLAIN statement provides information about the optimizer's choice of access
path for the SQL statement.
A15. Referential integrity refers to the consistency that must be maintained between
primary and foreign keys, i.e. every foreign key value must have a corresponding
primary key value.
Q16. What is a NULL value? What are the pros and cons of using NULLS?
A16. A NULL value takes up one byte of storage and indicates that a value is not
present as opposed to a space or zero value. It's the DB2 equivalent of TBD on an
organizational chart and often correctly portrays a business situation. Unfortunately, it
requires extra coding for an application program to handle this situation.
A17. A synonym is used to reference a table or view by another name. The other name
can then be written in the application code pointing to test tables in the development
stage and to production entities when the code is migrated. The synonym is linked to the
AUTHID that created it.
Q19. When can an insert of a new primary key value threaten referential integrity?
A19. Never. New primary key values are not a problem. However, the values of foreign
key inserts must have corresponding primary key values in their related tables. And
updates of primary key values may require changes in foreign key values to maintain
referential integrity.
A20. Static sql is hard-coded in a program when the programmer knows the statements
to be executed. For dynamic sql the program must dynamically allocate memory to
receive the query results.
A21. Any subselect can be rewritten as a join, but not vice versa. Joins are usually more
efficient because join rows can be returned immediately, whereas subselects require a
temporary work area for the inner select's results while processing the outer select.
A22. If there is an index on the attributes tested an IN is more efficient since DB2 uses
the index for the IN. (IN for index is the mnemonic).
A23. A Cartesian product usually results from a faulty query, typically one with a missing
join condition. The result contains a row for every combination of rows in the joined
tables.
A25. Static sql is compiled and optimized prior to its execution; dynamic is compiled
and optimized during execution.
Q26. Any SQL implementation covers data types in a couple of main categories. Which of
the following are those data types? (Check all that apply)
A. NUMERIC
B. CHARACTER
C. DATE AND TIME
D. BLOBS
E. BIT
A26. A, B, C. Not all SQL implementations have BLOB or BIT data types.
Q27. We have a table with a CHARACTER data type field. We apply a ">" row
comparison between this field and another CHARACTER field in another table. What
will be the results for records with field value of NULL? (Check one that applies the
best)
A. TRUE
B. FALSE
C. UNKNOWN
D. Error.
E. Those records will be ignored
Q28. Any database needs to go through a normalization process to make sure that data
is represented only once. This will eliminate problems with creating or destroying data in
the database. The normalization process is usually done in three steps, which result in
first, second and third normal forms. Which best describes the process to obtain the third
normal form? (Check one that applies the best)
A. Each table should have related columns.
B. Each separate table should have a primary key.
C. We have a table with multi-valued key. All columns that are dependent on only one or
on some of the keys should be moved in a different table.
D. If a table has columns not dependent on the primary keys, they need to be moved in a
separate table.
E. Primary key is always UNIQUE and NOT NULL.
A28. D. All columns in a table should be dependent on the primary key. This will
eliminate transitive dependencies, in which a column depends on the primary key only
through another non-key column (A depends on B, and B depends on C, so A depends on C
only indirectly).
Q29. SQL can be embedded in a host program that uses a relational database as a
persistent data repository. Some of the most important pre-defined structures for this
mechanism are SQLDA ("SQL Descriptor Area") and SQLCA ("SQL Communications
Area"). SQLCA contains, among other fields, SQLCODE and SQLSTATE. SQLSTATE is a
standardized five-character code for errors and warnings in which the first two characters
define the class and the last three define the subclass of the error. Which of the following
SQLSTATE codes is interpreted as "No data returned"? (Check one that applies the best)
A. 00xxx
B. 01xxx
C. 02xxx
D. 22xxx
E. 2Axxx
It depends on you. If you are sure you don't want to see the data from multiple members of
one file at once, you can use them to simplify the work.
For example, if you create an invoice database, you can store each month in one member.
Advantages:
backups are simple, since only the current month is backed up (but this can also be done
by journaling)
the other members can be locked, so this is more secure (but this can also be done using
logical files)
Nevertheless, every time you ask "Why should I use members, when there are other ways
to do it?", you can also ask "Why should I use other ways, when I have members?".
Members are also used for source files. This is reasonable, because the AS/400 has no
directory hierarchy.