
An Introduction To Data Warehousing

This document provides an overview of data warehousing, including its definition, evolution, need, and key characteristics. It defines a data warehouse as a subject-oriented, integrated, nonvolatile, and time-variant collection of data used for analysis and decision making. The document outlines the evolution of data warehousing from reporting to querying to analysis-focused systems. It discusses the need for data warehousing to support business intelligence, faster decision making, and strategic advantages. Key differences between online transaction processing (OLTP) systems and data warehouses are highlighted. The roles of data marts and operational data stores are also summarized.

Uploaded by

svdontha

An Introduction to Data Warehousing

Adil Siddiqui
Adil.siddiqui@tcs.com

Course Roadmap
Data Warehousing - An Overview
Characteristics of a Data Warehouse
Evolution of Data Warehousing
Need for Data Warehouse
Data Warehouse and Data Mart
OLTP Vs Data Warehouse
Operational Data Store
OLTP Vs ODS Vs DWH
Data Warehouse Architecture

Objectives
At the end of this lesson, you will know:

What is Data Warehousing
The evolution of Data Warehousing
Need for Data Warehousing
OLTP Vs Warehouse Applications
Data marts Vs Data Warehouses
Operational Data Stores
Overview of Warehouse Architecture

What is a Data Warehouse?

"Can I see the credit report from Accounts, sales from Marketing, and the open order report from Order Entry for this customer?"

Data from multiple sources is integrated for a subject.

"A data warehouse is a subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management's decisions."
- WH Inmon

Identical queries will give the same results at different times.
Supports analysis requiring historical data.

WH Inmon - Regarded As Father Of Data Warehousing

Data is stored for a historical period.
Data is populated in the data warehouse on a daily or weekly basis, depending upon the requirement.
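The customer question on this slide can be sketched in a few lines of Python. This is only an illustration of integrating data from several sources around one subject; the three source lists, their field names, and the function `integrate_customer_view` are assumptions made for the example, not part of the original slides.

```python
from collections import defaultdict

# Hypothetical extracts from three operational applications.
accounts = [{"customer_id": 101, "credit_rating": "A"}]
marketing = [{"customer_id": 101, "sales_ytd": 25000.0}]
order_entry = [
    {"customer_id": 101, "order_no": "SO-1", "status": "open"},
    {"customer_id": 101, "order_no": "SO-2", "status": "closed"},
]


def integrate_customer_view(accounts, marketing, order_entry):
    """Combine per-application rows into one record per customer (the subject)."""
    view = defaultdict(
        lambda: {"credit_rating": None, "sales_ytd": 0.0, "open_orders": []}
    )
    for row in accounts:
        view[row["customer_id"]]["credit_rating"] = row["credit_rating"]
    for row in marketing:
        view[row["customer_id"]]["sales_ytd"] += row["sales_ytd"]
    for row in order_entry:
        if row["status"] == "open":
            view[row["customer_id"]]["open_orders"].append(row["order_no"])
    return dict(view)


customer_view = integrate_customer_view(accounts, marketing, order_entry)
print(customer_view[101])
```

One lookup on the integrated view answers a question that would otherwise need three separate application reports.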

Characteristics of a Data Warehouse: Subject-Oriented

Operational: Leads, Quotes, Prospects, Orders
Data Warehouse: Customers, Products, Regions, Time

Focus is on Subject Areas rather than Applications

Characteristics of a Data Warehouse: Non-volatile

Operational: insert, change, delete, replace
Data Warehouse: load, read only access

Integrated View Is The Essence Of A Data Warehouse

Characteristics of a Data Warehouse: Time Variant

Operational
Current value data
Time horizon: 60-90 days
Key may not have an element of time

Data Warehouse
Snapshot data
Time horizon: 5-10 years
Key has an element of time
Data warehouse stores historical data

Data Warehouse Typically Spans Across Time
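The non-volatile and time-variant characteristics can be illustrated with a small sketch using Python's built-in `sqlite3` module. The table `account_balance_snapshot`, its columns, and the helper functions are hypothetical names chosen for this example. The key includes a snapshot date, data arrives only through loads, and an identical query about a past date returns the same answer after later loads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical snapshot table: the key has an element of time.
conn.execute(
    """
    CREATE TABLE account_balance_snapshot (
        account_id    INTEGER NOT NULL,
        snapshot_date TEXT    NOT NULL,
        balance       REAL    NOT NULL,
        PRIMARY KEY (account_id, snapshot_date)
    )
    """
)


def load_snapshot(conn, snapshot_date, balances):
    """Append one periodic snapshot; existing rows are never changed or deleted."""
    conn.executemany(
        "INSERT INTO account_balance_snapshot VALUES (?, ?, ?)",
        [(acct, snapshot_date, bal) for acct, bal in balances.items()],
    )
    conn.commit()


def balance_on(conn, account_id, snapshot_date):
    """Read-only access: look up the balance recorded for a given snapshot."""
    row = conn.execute(
        "SELECT balance FROM account_balance_snapshot "
        "WHERE account_id = ? AND snapshot_date = ?",
        (account_id, snapshot_date),
    ).fetchone()
    return row[0] if row else None


load_snapshot(conn, "2024-01-31", {1: 500.0, 2: 80.0})
before = balance_on(conn, 1, "2024-01-31")

load_snapshot(conn, "2024-02-29", {1: 650.0, 2: 75.0})
after = balance_on(conn, 1, "2024-01-31")  # same query, asked later
```

An operational system would typically overwrite the single current balance instead, so the January value would no longer be available after the February update.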

Alternate Definitions
"A collection of integrated, subject oriented databases designed to support the DSS function, where each unit of data is relevant to some moment of time."
- Imhoff

Alternate Definitions
"Data Warehouse is a repository of data summarized or aggregated in simplified form from operational systems. End user oriented data access and reporting tools let users get at the data for decision support."
- Babcock

Evolution of Data Warehousing
1960 - 1985 : MIS Era

Unfriendly
Slow
Dependent on IS programmers
Inflexible
Analysis limited to defined reports

Focus on Reporting

Evolution of Data Warehousing
1985 - 1990 : Querying Era

Ad hoc, unstructured access to corporate data
SQL as interface not scalable
Cannot handle complex analysis

Focus on Online Querying

Evolution of Data Warehousing
1990 - 20xx : Analysis Era

Trend Analysis
What If?
Moving Averages
Cross Dimensional Comparisons
Statistical profiles
Automated pattern and rule discovery

Focus on Online Analysis
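As a small illustration of one analysis type listed above, the following sketch computes a simple moving average over a series of historical values. The function name `moving_average` and the sample figures are made up for the example.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over a sliding window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical monthly sales figures taken from historical data.
monthly_sales = [100, 120, 90, 150, 130]
smoothed = moving_average(monthly_sales, 3)
print(smoothed)
```

Each output value averages three consecutive months, which smooths out single-month swings so a trend is easier to see.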

Need for Data Warehousing

Better business intelligence for end-users
Reduction in time to locate, access, and analyze information
Consolidation of disparate information sources
Strategic advantage over competitors
Faster time-to-market for products and services
Replacement of older, less-responsive decision support systems
Reduction in demand on IS to generate reports

OLTP Vs Warehouse

Operational System                    Data Warehouse
Transaction Processing                Query Processing
Time Sensitive                        History Oriented
Operator View                         Managerial View
Organized by transactions             Organized by subject
(Order, Input, Inventory)             (Customer, Product)
Relatively smaller database           Large database size
Many concurrent users                 Relatively few concurrent users
Volatile Data                         Non Volatile Data
Stores all data                       Stores relevant data
Not Flexible                          Flexible

Capacity Planning
[Figure: processing power plotted against time of day]
Processing Load Peaks During the Beginning and End of Day

Examples Of Some Applications

Target Marketing
Market Segmentation
Budgeting
Credit Rating Agencies
Financial Reporting and Consolidation
Market Basket Analysis - POS Analysis
Churn Analysis
Profitability Management
Event tracking

[Figure labels: Manufacturers, Retailers, Customers]

Do we need a separate database?

OLTP and data warehousing require two very differently configured systems
Isolation of Production System from Business Intelligence System
Significant and highly variable resource demands of the data warehouse
Cost of disk space no longer a concern
Production systems not designed for query processing

Data Marts

Enterprise wide data warehousing projects have a very long cycle time
Getting consensus between multiple parties may also be difficult
Departments may not be satisfied with the priority accorded to them
Sometimes individual departmental needs may be strong enough to warrant a local implementation
Application/database distribution is also an important factor

Data Marts

Subject or Application Oriented
Business View of Warehouse
Quick Solution to a specific Business Problem
Finance, Manufacturing, Sales etc.
Smaller amount of data used for Analytic Processing

A Logical Subset of The Complete Data Warehouse
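One way to picture a data mart as a logical subset of the warehouse is a view restricted to a single subject area. The sketch below uses Python's built-in `sqlite3` module; the table `warehouse_facts`, the view `sales_mart`, and the sample rows are assumptions made for the illustration, and real data marts are often separate physical databases rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding facts for several subject areas.
conn.execute(
    "CREATE TABLE warehouse_facts (subject_area TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("Sales", "North", 100.0),
        ("Sales", "South", 40.0),
        ("Sales", "North", 60.0),
        ("Finance", "North", 10.0),
        ("Manufacturing", "South", 7.0),
    ],
)

# The "mart": only Sales data, summarized for analytic processing.
conn.execute(
    """
    CREATE VIEW sales_mart AS
    SELECT region, SUM(amount) AS total_amount
    FROM warehouse_facts
    WHERE subject_area = 'Sales'
    GROUP BY region
    """
)

mart_rows = conn.execute(
    "SELECT region, total_amount FROM sales_mart ORDER BY region"
).fetchall()
print(mart_rows)
```

The Sales department sees a smaller, summarized data set, while Finance and Manufacturing rows stay in the warehouse and out of the mart.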

Data Warehouses or Data Marts

For companies interested in changing their corporate cultures or integrating separate departments, an enterprise wide approach makes sense.

Companies that want a quick solution to a specific business problem are better served by a standalone data mart.

Some companies opt to build a warehouse incrementally, data mart by data mart.

A Logical Subset of The Complete Data Warehouse

Data Warehouse and Data Mart

Scope
  Data Warehouse: Application neutral; centralized, shared; cross LOB/enterprise
  Data Marts: Specific application requirement; LOB, department; business process oriented

Data Perspective
  Data Warehouse: Historical detailed data; some summary
  Data Marts: Detailed (some history); summarized

Subject
  Data Warehouse: Multiple subject areas
  Data Marts: Single partial subject

Data Warehouse and Data Mart

Data Sources
  Data Warehouse: Many; operational/external data
  Data Marts: Few; operational, external data

Implement Time Frame
  Data Warehouse: 9-18 months for first stage; multiple stage implementation
  Data Marts: 4-12 months

Characteristics
  Data Warehouse: Flexible, extensible; durable/strategic; data orientation
  Data Marts: Restrictive, non extensible; short life/tactical; project orientation

Warehouse or Mart First?

Data Warehouse First                  Data Mart First
Expensive                             Relatively cheap
Large development cycle               Delivered in < 6 months
Change management is difficult        Easy to manage change
Difficult to obtain continuous        Can lead to independent
corporate support                     and incompatible marts
Technical challenges in               Cleansing, transformation, modeling
building large databases              techniques may be incompatible

OLTP Systems Vs Data Warehouse: Remember

Between OLTP and Data Warehouse systems:
users are different
data content is different
data structures are different
hardware is different

Understanding The Differences Is The Key

Operational Data Store Definition
[Figure: systems A, B and C, the ODS, and the Data Warehouse, with the labels Operational and DSS]

Operational Data Store - Definition

"Can I see the credit report from Accounts, sales from Marketing, and the open order report from Order Entry for this customer?"

Data from multiple sources is integrated for a subject.

A subject oriented, integrated, volatile, current valued data store containing only corporate detailed data.

Identical queries may give different results at different times.
Supports analysis requiring current data.

Data is stored only for the current period. Old data is either archived or moved to the Data Warehouse.

Operational Data Store

The ODS applies only to the world of operational systems.
The ODS contains current valued and near current valued data.
The ODS contains almost exclusively detail data.
The ODS requires a full function, update, record oriented environment.

Operational Data Store

Functions of an ODS
Converts Data
Decides Which Data of Multiple Sources Is the Best
Summarizes Data
Decodes/encodes Data
Alters the Key Structures
Alters the Physical Structures
Reformats Data
Internally Represents Data
Recalculates Data
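A few of the functions listed above (decoding, reformatting, and deciding which source is best) can be sketched as small Python helpers. The code table, the date format, and the "most recently updated wins" rule are assumptions chosen for illustration; a real ODS would follow whatever rules the business defines.

```python
from datetime import date

# Hypothetical code table: different sources encode the same attribute differently.
GENDER_CODES = {"M": "male", "F": "female", "1": "male", "0": "female"}


def decode_gender(code):
    """Decode a source-specific code into one common representation."""
    return GENDER_CODES.get(str(code).strip().upper(), "unknown")


def reformat_date(text):
    """Reformat a 'DD/MM/YYYY' string into ISO 'YYYY-MM-DD'."""
    day, month, year = (int(part) for part in text.split("/"))
    return date(year, month, day).isoformat()


def pick_best(candidates):
    """Choose among sources: here, the most recently updated non-empty value."""
    usable = [c for c in candidates if c["value"]]
    if not usable:
        return None
    return max(usable, key=lambda c: c["updated"])["value"]


phone = pick_best(
    [
        {"source": "billing", "value": "555-0100", "updated": "2023-06-01"},
        {"source": "crm", "value": "555-0199", "updated": "2024-02-15"},
        {"source": "legacy", "value": "", "updated": "2024-03-01"},
    ]
)
print(decode_gender("m"), reformat_date("05/03/2024"), phone)
```

The ISO-formatted `updated` strings compare correctly as plain text, which is why `max` can pick the latest one without parsing them.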

Different kinds of Information Needs

Current: Is this medicine available in stock?
Recent: What are the tests this patient has completed so far?
Historical: Has the incidence of Tuberculosis increased in the last 5 years in the Southern region?

OLTP Vs ODS Vs DWH

Audience
  OLTP: Operating personnel
  ODS: Analysts
  Data Warehouse: Managers and analysts

Data access
  OLTP: Individual records, transaction driven
  ODS: Individual records, transaction or analysis driven
  Data Warehouse: Set of records, analysis driven

Data content
  OLTP: Current, real-time
  ODS: Current and near-current
  Data Warehouse: Historical

Data structure
  OLTP: Detailed
  ODS: Detailed and lightly summarized
  Data Warehouse: Detailed and summarized

Data organization
  OLTP: Functional
  ODS: Subject-oriented
  Data Warehouse: Subject-oriented

OLTP Vs ODS Vs DWH

Data redundancy
  OLTP: Non-redundant within system; unmanaged redundancy among systems
  ODS: Somewhat redundant with operational databases
  Data Warehouse: Managed redundancy

Data update
  OLTP: Field by field
  ODS: Field by field
  Data Warehouse: Controlled batch

Database size
  OLTP: Moderate
  ODS: Moderate
  Data Warehouse: Large to very large

Development methodology
  OLTP: Requirements driven, structured
  ODS: Data driven, somewhat evolutionary
  Data Warehouse: Data driven, evolutionary

Philosophy
  OLTP: Support day-to-day operation
  ODS: Support day-to-day decisions & operational activities
  Data Warehouse: Support managing the enterprise
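The "Data access" contrast in the comparison can be illustrated with two queries against a toy table, using Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical, and both styles run on one database here only to keep the sketch short; the slides argue for separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, "
    "status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "South", 2022, "closed", 200.0),
        (2, "South", 2023, "closed", 300.0),
        (3, "North", 2023, "closed", 150.0),
        (4, "South", 2023, "open", 50.0),
    ],
)

# OLTP style: one individual record, located by key and updated field by field.
conn.execute("UPDATE orders SET status = 'closed' WHERE order_id = ?", (4,))
status = conn.execute(
    "SELECT status FROM orders WHERE order_id = ?", (4,)
).fetchone()[0]

# Warehouse style: a read-only, analysis-driven query over a set of records.
yearly_south = conn.execute(
    "SELECT year, SUM(amount) FROM orders "
    "WHERE region = 'South' GROUP BY year ORDER BY year"
).fetchall()
print(status, yearly_south)
```

The first statement touches a single row and must be fast for many concurrent users; the second scans and aggregates history, which is the workload a warehouse is configured for.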

Typical Data Warehouse Architecture
[Figure: Multi-tiered Data Warehouse without ODS. Components shown: Operational Systems/Data; Data Preparation (Select, Extract, Transform, Integrate, Maintain); Data Warehouse; Metadata; Data Marts; Middleware/API; EIS/DSS; Query Tools; OLAP/ROLAP; Web Browsers; Data Mining]
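The data preparation steps named in the architecture (extract, transform, integrate, then load into the warehouse) can be sketched as a chain of small Python functions. The source lists, field names, and the in-memory `warehouse` list are assumptions made for the illustration and stand in for real operational systems and a real database.

```python
def extract(source):
    """Copy rows out of an operational source without modifying it."""
    return [dict(row) for row in source]


def transform(rows):
    """Standardize field formats so rows from different systems are comparable."""
    return [
        {"customer": row["customer"].strip().upper(), "amount": float(row["amount"])}
        for row in rows
    ]


def integrate(*row_sets):
    """Merge rows from all sources into one total per customer."""
    totals = {}
    for rows in row_sets:
        for row in rows:
            totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


def load(warehouse, totals, load_date):
    """Append the integrated rows to the warehouse, stamped with the load date."""
    for customer, amount in sorted(totals.items()):
        warehouse.append(
            {"customer": customer, "amount": amount, "load_date": load_date}
        )


# Hypothetical operational sources with inconsistent formats.
order_entry = [{"customer": " acme ", "amount": "120.50"}]
billing = [{"customer": "ACME", "amount": 30}, {"customer": "Globex", "amount": "10"}]

warehouse = []
integrated = integrate(transform(extract(order_entry)), transform(extract(billing)))
load(warehouse, integrated, "2024-01-31")
print(warehouse)
```

Keeping each step as its own function mirrors the separate boxes in the figure and makes it possible to rerun or replace one stage without touching the others.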

Typical Data Warehouse Architecture
[Figure: Multi-tiered Data Warehouse with ODS. Components shown: Operational Systems/Data; Data Preparation (Select, Extract, Transform, Integrate, Maintain); ODS; Data Preparation (Select, Extract, Transform, Load); Data Warehouse; Metadata; Data Marts]

Reference Book
Principles of Data Warehouse Design
Data Warehouses and OLAP: Concepts, Architectures and Solutions

Thank You
