
Hoffer Mdm12e PP Ch09

This chapter discusses data warehousing and summarizes key concepts. It defines a data warehouse as a subject-oriented collection of integrated and time-variant data used for decision making. Data warehouses address the need for an integrated company-wide view of information and the separation of operational and informational systems. Common architectures include dimensional data models with star schemas containing fact and dimension tables. Data warehousing supports online analytical processing and business performance management through techniques like slicing, drilling down, and dashboards.

Uploaded by

zeyad boyka

CHAPTER 9: DATA WAREHOUSING

Modern Database Management
12th Edition, Global Edition

Jeff Hoffer, Ramesh Venkataraman, Heikki Topi

Chapter 9 Copyright © 2016 Pearson Education, Ltd. 9-1


OBJECTIVES
• Define terms
• Give reasons for the gap between information needs and information availability
• List reasons for the need for data warehousing
• Describe three levels of data warehouse architectures
• Describe two components of a star schema
• Estimate fact table size
• Design a data mart
• Develop requirements for a data mart
• Understand future data warehousing trends



DEFINITIONS
• Data Warehouse
  ◦ A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
  ◦ Subject-oriented: e.g. customers, patients, students, products
  ◦ Integrated: consistent naming conventions, formats, encoding structures; from multiple data sources
  ◦ Time-variant: can study trends and changes
  ◦ Non-updatable: read-only, periodically refreshed
• Data Mart
  ◦ A data warehouse that is limited in scope
NEED FOR DATA WAREHOUSING

• Integrated, company-wide view of high-quality information (from disparate databases)
• Separation of operational and informational systems and data (for improved performance)



ISSUES WITH COMPANY-WIDE VIEW

• Inconsistent key structures
• Synonyms
• Free-form vs. structured fields
• Inconsistent data values
• Missing data

See Figure 9-1 for an example.


Figure 9-1 Examples of heterogeneous data
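
The integration issues listed above (synonyms, inconsistent formats, missing data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the source systems, the field names, and the sample records are hypothetical and are not taken from Figure 9-1.

```python
import re

# Synonym map: source-specific field names -> one warehouse name (all hypothetical).
FIELD_SYNONYMS = {
    "student_no": "source_key",
    "patient_id": "source_key",
    "name": "full_name",
    "patient_name": "full_name",
    "phone": "phone",
    "telephone": "phone",
}

WAREHOUSE_FIELDS = ("source_key", "full_name", "phone")


def normalize_phone(value):
    """Keep digits only so '(555) 123-4567' and '5551234567' compare equal."""
    if value is None:
        return None
    digits = re.sub(r"\D", "", str(value))
    return digits or None


def normalize_name(value):
    """Turn 'Lee, Pat' into 'Pat Lee'; other free-form names are only trimmed."""
    if value is None:
        return None
    if "," in value:
        last, first = [part.strip() for part in value.split(",", 1)]
        return f"{first} {last}"
    return value.strip()


def to_warehouse_row(source_system, record):
    """Map one source record onto the common warehouse layout."""
    row = {"source_system": source_system}
    row.update({field: None for field in WAREHOUSE_FIELDS})
    for field, value in record.items():
        target = FIELD_SYNONYMS.get(field)
        if target is not None:
            row[target] = value
    # Inconsistent key structures: store every source key as text.
    if row["source_key"] is not None:
        row["source_key"] = str(row["source_key"])
    row["full_name"] = normalize_name(row["full_name"])
    row["phone"] = normalize_phone(row["phone"])
    # Missing data is flagged rather than silently guessed.
    row["missing"] = sorted(f for f in WAREHOUSE_FIELDS if row[f] is None)
    return row


registrar_row = to_warehouse_row(
    "registrar", {"student_no": "123-45-6789", "name": "Lee, Pat", "phone": "5551234567"}
)
health_row = to_warehouse_row(
    "health", {"patient_id": 987654, "patient_name": "Pat Lee", "telephone": "(555) 123-4567"}
)
```

After normalization the two rows agree on name and phone even though the sources used different field names and formats. The source keys still differ, which is why a warehouse needs its own key to tie the records together.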



SEPARATING OPERATIONAL AND INFORMATIONAL SYSTEMS

• Operational system – a system that is used to run a business in real time, based on current data; also called a system of record
• Informational system – a system designed to support decision making based on historical point-in-time and prediction data for complex queries or data-mining applications

DERIVED DATA
• Objectives
  ◦ Ease of use for decision support applications
  ◦ Fast response to predefined user queries
  ◦ Customized data for particular target audiences
  ◦ Ad-hoc query support
  ◦ Data mining capabilities
• Characteristics
  ◦ Detailed (mostly periodic) data
  ◦ Aggregate (for summary)
  ◦ Distributed (to departmental servers)

Most common data model = dimensional model (usually implemented as a star schema)
Figure 9-9 Components of a star schema

Fact tables contain factual or quantitative data
Dimension tables contain descriptions about the subjects of the business
1:N relationship between dimension tables and fact tables
Dimension tables are denormalized to maximize performance

Excellent for ad-hoc queries, but bad for online transaction processing
Figure 9-10 Star schema example

Fact table provides statistics for sales broken down by product, period and store dimensions
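
A star schema of this shape can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows below are assumptions chosen for illustration and do not reproduce Figure 9-10; only the overall structure (one sales fact table with a 1:N link from each of the product, period, and store dimensions) follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per surrogate key.
CREATE TABLE product (product_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE period  (period_key  INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE store   (store_key   INTEGER PRIMARY KEY, city TEXT);

-- Fact table: quantitative measures, keyed by the combination of dimensions.
CREATE TABLE sales (
    product_key  INTEGER REFERENCES product(product_key),
    period_key   INTEGER REFERENCES period(period_key),
    store_key    INTEGER REFERENCES store(store_key),
    units_sold   INTEGER,
    dollars_sold REAL,
    PRIMARY KEY (product_key, period_key, store_key)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Desk"), (2, "Chair")])
conn.executemany("INSERT INTO period VALUES (?, ?, ?)", [(1, 2016, 1), (2, 2016, 2)])
conn.executemany("INSERT INTO store VALUES (?, ?)", [(1, "Austin"), (2, "Boston")])
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 3, 300.0),
        (2, 1, 1, 5, 250.0),
        (1, 2, 2, 2, 200.0),
        (2, 2, 1, 4, 200.0),
    ],
)


def dollars_by_store_and_month(connection):
    """Ad-hoc style query: join the fact table to two dimensions and total a measure."""
    return connection.execute("""
        SELECT s.city, p.year, p.month, SUM(f.dollars_sold)
        FROM sales AS f
        JOIN store  AS s ON s.store_key  = f.store_key
        JOIN period AS p ON p.period_key = f.period_key
        GROUP BY s.city, p.year, p.month
        ORDER BY s.city, p.year, p.month
    """).fetchall()


result = dollars_by_store_and_month(conn)
```

Every analytical question becomes a join from the fact table outward to whichever dimensions the user wants to group or filter by, which is what makes the layout convenient for ad-hoc queries.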



Figure 9-11 Star schema with sample data



SURROGATE KEYS
• Dimension table keys should be surrogate (non-intelligent and non-business related), because:
  ◦ Business keys may change over time
  ◦ Surrogate keys help keep track of nonkey attribute values for a given production key
  ◦ Surrogate keys are simpler and shorter
  ◦ Surrogate keys can be the same length and format for all keys
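
One way to picture the second reason is a dimension loader that issues a new surrogate key whenever the nonkey attributes for a production (business) key change, so earlier attribute values remain on file. This is a minimal sketch under stated assumptions: the class name `CustomerDimension`, its methods, and the sample keys are hypothetical, and a real warehouse would normally do this inside its load process rather than in application code.

```python
from itertools import count


class CustomerDimension:
    """Toy dimension table that assigns meaningless integer surrogate keys."""

    def __init__(self):
        self._next_key = count(1)   # surrogate keys: 1, 2, 3, ...
        self.rows = []              # every version of every dimension row
        self._current = {}          # business key -> most recent row

    def load(self, business_key, attributes):
        """Return the surrogate key for this business key and attribute set.

        A new row (and key) is created only when the attributes differ from
        the most recent version, so older versions stay available.
        """
        current = self._current.get(business_key)
        if current is not None and current["attributes"] == attributes:
            return current["surrogate_key"]
        row = {
            "surrogate_key": next(self._next_key),
            "business_key": business_key,
            "attributes": dict(attributes),
        }
        self.rows.append(row)
        self._current[business_key] = row
        return row["surrogate_key"]

    def history(self, business_key):
        """All recorded versions for one business key, oldest first."""
        return [r for r in self.rows if r["business_key"] == business_key]


dim = CustomerDimension()
first = dim.load("C-100", {"city": "Austin"})
same = dim.load("C-100", {"city": "Austin"})    # unchanged: key is reused
moved = dim.load("C-100", {"city": "Boston"})   # changed: new surrogate key
```

Fact rows loaded before the change keep pointing at the first surrogate key, so historical sales stay associated with the attribute values that were true at the time.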



GRAIN OF THE FACT TABLE
• Granularity of fact table: what level of detail do you want?
  ◦ Transactional grain: finest level
  ◦ Aggregated grain: more summarized
• Finer grain → better market basket analysis capability
• Finer grain → more dimension tables, more rows in fact table
• In Web-based commerce, finest granularity is a click
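
The trade-off between the two grains can be sketched in a few lines. The transaction rows, store names, and function names below are hypothetical, chosen only to show that rolling up to a monthly grain shrinks the table but discards the information needed for market basket analysis.

```python
from collections import defaultdict

# Transactional grain (hypothetical): one row per product line on a sales ticket.
# Columns: (transaction_id, date, store, product, units)
transactions = [
    (1, "2016-01-03", "Austin", "Desk", 1),
    (1, "2016-01-03", "Austin", "Chair", 2),
    (2, "2016-01-17", "Austin", "Desk", 2),
    (3, "2016-02-05", "Boston", "Desk", 1),
]


def basket(rows, transaction_id):
    """Products bought together in one transaction (needs transactional grain)."""
    return {product for txn, _, _, product, _ in rows if txn == transaction_id}


def roll_up_to_month(rows):
    """Aggregated grain: total units per (month, store, product)."""
    totals = defaultdict(int)
    for _, date, store, product, units in rows:
        month = date[:7]  # 'YYYY-MM' prefix of an ISO date string
        totals[(month, store, product)] += units
    return dict(totals)


monthly = roll_up_to_month(transactions)
```

Here `basket(transactions, 1)` still shows that a desk and a chair were bought together, while `monthly` has fewer rows and no transaction identifier, so that question can no longer be answered from it.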
SIZE OF FACT TABLE
• Depends on the number of dimensions and the grain of the fact table
• Number of rows = product of number of possible values for each dimension associated with the fact table
• Example: Assume the following for Figure 9-11:

• Total rows calculated as follows (assuming only half the products record sales for a given month):
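
As a rough illustration of the row-count rule, the sketch below multiplies the number of possible values per dimension and scales one dimension by the fraction of its values that actually record activity. The dimension sizes are made-up assumptions and are not the values used with Figure 9-11; the function name `estimate_fact_rows` is likewise hypothetical.

```python
from math import prod


def estimate_fact_rows(dimension_sizes, active_fraction=None):
    """Estimate fact table rows as the product of the dimension cardinalities.

    dimension_sizes: mapping of dimension name -> number of possible values.
    active_fraction: optional mapping of dimension name -> share of its values
                     that actually appear in the fact table.
    """
    sizes = dict(dimension_sizes)
    for name, fraction in (active_fraction or {}).items():
        sizes[name] = sizes[name] * fraction
    return round(prod(sizes.values()))


# Assumed sizes for illustration only.
assumed_sizes = {"store": 200, "product": 4000, "period": 12}

upper_bound = estimate_fact_rows(assumed_sizes)
# Only half of the products record sales in a given month.
estimate = estimate_fact_rows(assumed_sizes, {"product": 0.5})
```

With these assumed sizes, the upper bound is 200 × 4,000 × 12 = 9,600,000 rows, and halving the active products gives 200 × 2,000 × 12 = 4,800,000 rows.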



ONLINE ANALYTICAL PROCESSING (OLAP) TOOLS
• The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques
• Relational OLAP (ROLAP)
  ◦ Traditional relational representation
• Multidimensional OLAP (MOLAP)
  ◦ Cube structure
• OLAP Operations
  ◦ Cube slicing: come up with 2-D view of data
  ◦ Drill-down: going from summary to more detailed views
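
The two operations can be sketched on a tiny in-memory cube. This is an illustrative assumption rather than how any particular OLAP product stores data: the cube is a plain dictionary keyed by (product, period, store), and the cell values and function names are hypothetical.

```python
from collections import defaultdict

DIMENSIONS = ("product", "period", "store")

# Hypothetical cube cells: (product, period, store) -> units sold.
cube = {
    ("Desk", "2016-01", "Austin"): 3,
    ("Desk", "2016-02", "Austin"): 2,
    ("Chair", "2016-01", "Austin"): 5,
    ("Desk", "2016-01", "Boston"): 1,
}


def slice_cube(cells, dimension, value):
    """Fix one dimension at a value, leaving a 2-D view keyed by the other two."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cells.items()
        if key[axis] == value
    }


def summarize(cells, dimensions):
    """Total the measure by the listed dimensions.

    Calling this again with one more dimension is a drill-down: the same
    totals broken out at a finer level of detail.
    """
    axes = [DIMENSIONS.index(d) for d in dimensions]
    totals = defaultdict(int)
    for key, measure in cells.items():
        totals[tuple(key[a] for a in axes)] += measure
    return dict(totals)


austin_slice = slice_cube(cube, "store", "Austin")      # product x period view
by_product = summarize(cube, ["product"])               # summary level
by_product_store = summarize(cube, ["product", "store"])  # drill-down by store
```

The slice keeps only the Austin cells and drops the store coordinate, while the drill-down splits the per-product totals into per-product, per-store totals that still add up to the summary.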



Figure 9-21 Slicing a data cube



Figure 9-22 Example of drill-down

Summary report
Starting with summary data, users can obtain details for particular cells
Drill-down with color added



BUSINESS PERFORMANCE MGMT (BPM)

Figure 9-25 Sample Dashboard

BPM systems allow managers to measure, monitor, and manage key activities and processes to achieve organizational goals. Dashboards are often used to provide an information system in support of BPM.

Charts like these are examples of data visualization, the representation of data in graphical and multimedia formats for human analysis.