Chapter_4_Data Warehouse Indexes
Data Warehouse Indexes
Amol D. Vibhute (PhD)
Assistant Professor
• Non-Clustered Index:
– Stores the index structure separately from the table data, so a table can have multiple non-clustered indexes.
– Example: In a sales transaction table, a non-clustered index on the ProductID column lets queries
quickly find the sales rows for a specific product.
– Benefits:
   – Faster data retrieval: Improves the speed of queries that filter or sort on the indexed columns.
   – Flexibility: Can be created on one or more columns and used for various
   query types.
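The ProductID example can be tried out with a minimal sketch using Python's built-in sqlite3 module. SQLite is only a convenient stand-in here, not a data warehouse engine; its secondary indexes are stored separately from the table rows, which is the property described above. The table name sales, its columns, the index name idx_sales_product, and the generated rows are all hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical sales transaction table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "sale_id INTEGER PRIMARY KEY, product_id INTEGER, amount REAL)"
)

# 1,000 made-up rows spread evenly over 50 product ids (0..49).
rows = [(i, i % 50, float(i)) for i in range(1, 1001)]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Secondary index on product_id, stored apart from the table rows.
conn.execute("CREATE INDEX idx_sales_product ON sales (product_id)")

query = "SELECT sale_id, amount FROM sales WHERE product_id = ?"

# Ask SQLite how it would execute the query; the last field of each
# plan row is a human-readable description of the step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (7,)).fetchall()
plan_text = " ".join(str(step[-1]) for step in plan)
print(plan_text)

# Run the query itself and count the sales rows for product 7.
matching = conn.execute(query, (7,)).fetchall()
print(len(matching))
```

The printed plan should mention idx_sales_product, indicating that the rows for the product are located through the index rather than by scanning the whole table. The exact wording of the plan varies between SQLite versions.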
– For example, on a table with one million rows, a column with 10,000 distinct values (only 1% of the row count) is a
candidate for a bitmap index. A bitmap index on this column can outperform a B-tree index, particularly when the
column is often queried in conjunction with other indexed columns. In fact, in a typical data warehouse environment,
a bitmap index can be considered for any non-unique column.
– B-tree indexes are most effective for high-cardinality data: that is, for data with many possible values, such as
customer_name or phone_number. In a data warehouse, B-tree indexes should be used only for unique columns or
other columns with very high cardinalities (that is, columns that are almost unique). The majority of indexes in a data
warehouse should be bitmap indexes.
– In ad hoc queries and similar situations, bitmap indexes can dramatically improve query performance. AND and OR
conditions in the WHERE clause of a query can be resolved quickly by performing the corresponding Boolean
operations directly on the bitmaps before converting the resulting bitmap to rowids. If the resulting number of rows is
small, the query can be answered quickly without resorting to a full table scan.
– Each entry (or bit) in the bitmap corresponds to a single row of the
customers table. The value of each bit depends upon the values of the
corresponding row in the table. For example, the bitmap cust_gender='F'
contains a one as its first bit because the gender is F in the first row of
the customers table. The bitmap cust_gender='F' has a zero for its third
bit because the gender of the third row is not F.
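The bitmap layout and the Boolean combination of bitmaps can be illustrated with a small Python sketch. It is a simplified model, not how a database stores bitmaps internally (real systems compress them). The five sample rows, the cust_marital_status column, and all function names are hypothetical; the sample only keeps the properties stated above (the first row is F, the third row is not). Row positions are 0-based and stand in for rowids.

```python
from collections import defaultdict

# Hypothetical sample of the customers table (one dict per row).
customers = [
    {"cust_gender": "F", "cust_marital_status": "married"},
    {"cust_gender": "F", "cust_marital_status": "single"},
    {"cust_gender": "M", "cust_marital_status": "married"},
    {"cust_gender": "F", "cust_marital_status": "married"},
    {"cust_gender": "M", "cust_marital_status": "single"},
]


def build_bitmap_index(rows, column):
    """Return {value: bitmap}, with one bit per row for each distinct value."""
    index = defaultdict(lambda: [0] * len(rows))
    for position, row in enumerate(rows):
        index[row[column]][position] = 1
    return dict(index)


def bitmap_and(a, b):
    """Bitwise AND of two equal-length bitmaps."""
    return [x & y for x, y in zip(a, b)]


def bitmap_or(a, b):
    """Bitwise OR of two equal-length bitmaps."""
    return [x | y for x, y in zip(a, b)]


def to_row_positions(bitmap):
    """Convert a bitmap to the positions of the rows it selects."""
    return [position for position, bit in enumerate(bitmap) if bit]


gender = build_bitmap_index(customers, "cust_gender")
status = build_bitmap_index(customers, "cust_marital_status")

print(gender["F"])  # [1, 1, 0, 1, 0]

# WHERE cust_gender = 'F' AND cust_marital_status = 'married'
female_and_married = bitmap_and(gender["F"], status["married"])
print(to_row_positions(female_and_married))  # [0, 3]

# WHERE cust_gender = 'F' OR cust_marital_status = 'married'
female_or_married = bitmap_or(gender["F"], status["married"])
print(to_row_positions(female_or_married))  # [0, 1, 2, 3]
```

The table rows are touched only in the final step; the AND and OR conditions are resolved entirely on the bitmaps, which is why such queries can avoid a full table scan when few rows qualify.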
Friday, February 21, 2025 Dr. Amol 9
Cont.…
• Using B-Tree Indexes in Data Warehouses:
– A B-tree index is organized like an upside-down tree. The bottom level of the index holds the actual data values and
pointers to the corresponding rows, much as the index in a book has a page number associated with each index
entry.
– In general, use B-tree indexes when you know that your typical query refers to the indexed column and retrieves a
few rows. In these queries, it is faster to find the rows by looking at the index. However, using the book index
analogy, if you plan to look at every single topic in a book, you might not want to look in the index for the topic and
then look up the page. It might be faster to read through every chapter in the book. Similarly, if you are retrieving
most of the rows in a table, it might not make sense to look up the index to find the table rows. Instead, you might
want to read or scan the table.
– B-tree indexes are most commonly used in a data warehouse to index unique or near-unique keys. In many cases, it
may not be necessary to index these columns in a data warehouse, because unique constraints can be maintained
without an index, and because typical data warehouse queries may not work better with such indexes. B-tree indexes
are more common in environments using third normal form schemas. In general, bitmap indexes should be more
common than B-tree indexes in most data warehouse environments.
– Multi-level tree structure
– Breaks data into pages or blocks
– Should be used for high-cardinality (unique or near-unique) columns
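The multi-level, page-based lookup described above can be sketched in Python. This is a deliberately simplified, static two-level structure rather than a real B-tree: actual B-trees can have more levels and stay balanced as rows are inserted and deleted. The page size, the function names, and the sample keys are hypothetical, and the keys are assumed to be unique.

```python
from bisect import bisect_left, bisect_right

PAGE_SIZE = 4  # hypothetical number of index entries per leaf page


def build_index(entries, page_size=PAGE_SIZE):
    """Build a two-level index from (key, row_position) pairs.

    Leaf pages hold sorted (key, row_position) entries, like the bottom
    level of a B-tree. The root holds the first key of every leaf page.
    """
    ordered = sorted(entries)
    leaves = [ordered[i:i + page_size] for i in range(0, len(ordered), page_size)]
    root = [page[0][0] for page in leaves]
    return root, leaves


def lookup(root, leaves, key):
    """Return the row position stored for key, or None if it is absent."""
    # Level 1: pick the only leaf page whose key range can contain key.
    page_no = bisect_right(root, key) - 1
    if page_no < 0:
        return None
    # Level 2: binary search inside that single page.
    page = leaves[page_no]
    keys = [k for k, _ in page]
    i = bisect_left(keys, key)
    if i < len(page) and page[i][0] == key:
        return page[i][1]
    return None


# Hypothetical unique customer ids 1000, 1003, ..., 1057 in rows 0..19.
entries = [(1000 + 3 * i, i) for i in range(20)]
root, leaves = build_index(entries)

print(len(leaves))                  # 5 leaf pages of 4 entries each
print(lookup(root, leaves, 1027))   # 9
print(lookup(root, leaves, 1028))   # None (no such key)
```

A lookup reads one root and one leaf page instead of all 20 entries, which mirrors the book-index analogy: the index pays off when a query needs a few rows, while retrieving most of the table is better served by scanning it.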