1a Ravi

This document provides an introduction to data warehousing and business intelligence. It discusses the purposes of reporting and analysis and how they differ. It also defines key concepts like data warehousing, business intelligence, the data lifecycle, metadata repositories, and different types of data marts. Reporting organizes data for monitoring performance while analysis explores data for insights to improve business. A data warehouse stores historical data from multiple sources to support analysis and decision making.

Uploaded by

Krishna Chauhan


Introduction to Data Warehousing

and Business Intelligence

Prof. Ravi Patel

IT Department
ADIT
Why Reporting and Analysis?
• Reporting: The process of organizing data into
informational summaries in order to monitor
how different areas of a business are
performing.
• Analysis: The process of exploring data and
reports in order to extract meaningful insights,
which can be used to better understand and
improve business performance.
• Reporting translates raw data into information. Analysis transforms
data and information into insights.
• Reporting helps companies to monitor their online business and be
alerted to when data falls outside of expected ranges. Good
reporting should raise questions about the business from its end
users. The goal of analysis is to answer questions by interpreting
the data at a deeper level and providing actionable
recommendations.
• Through the process of performing analysis you may raise
additional questions, but the goal is to identify answers, or at least
potential answers that can be tested.
• In summary, reporting shows you what is happening while analysis
focuses on explaining why it is happening and what you can do
about it.
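The distinction above can be sketched in a few lines of Python. This is only an illustration: the `orders` records, the function names, and the `expected_min` threshold are hypothetical and are not taken from the slides.

```python
from collections import defaultdict

# Hypothetical order records: (region, channel, revenue)
orders = [
    ("North", "web", 120.0),
    ("North", "store", 80.0),
    ("South", "web", 20.0),
    ("South", "store", 90.0),
]

def report_revenue_by_region(rows):
    """Reporting: organize raw rows into a summary per region."""
    totals = defaultdict(float)
    for region, _channel, revenue in rows:
        totals[region] += revenue
    return dict(totals)

def flag_out_of_range(totals, expected_min):
    """Reporting alert: regions whose total falls below the expected minimum."""
    return sorted(r for r, t in totals.items() if t < expected_min)

def analyze_region(rows, region):
    """Analysis: break a flagged region down by channel to look for a cause."""
    by_channel = defaultdict(float)
    for r, channel, revenue in rows:
        if r == region:
            by_channel[channel] += revenue
    return dict(by_channel)

totals = report_revenue_by_region(orders)                # what is happening
alerts = flag_out_of_range(totals, expected_min=150.0)   # raises a question
breakdown = analyze_region(orders, "South")              # starts to explain why
```

The report shows that one region is below the expected range; the breakdown by channel is a first analytical step toward a testable explanation.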
Data Life Cycle
• The data life cycle provides a high-level overview of
the stages involved in successful management and
preservation of data for use and reuse.
• Plan: description of the data that will be compiled, and how the
data will be managed and made accessible throughout its lifetime
• Collect: observations are made either by hand or with sensors or
other instruments and the data are placed into digital form
• Assure: the quality of the data is assured through checks and
inspections
• Describe: data are accurately and thoroughly described using the
appropriate metadata standards
• Preserve: data are submitted to an appropriate long-term archive
(i.e. data center)
• Discover: potentially useful data are located and obtained, along
with the relevant information about the data (metadata)
• Integrate: data from disparate sources are combined to form one
homogeneous set of data that can be readily analyzed
• Analyze: data are analyzed
What is Business Intelligence?
• Business Intelligence (BI) is a set of processes, architectures,
and technologies that convert raw data into meaningful
information that drives profitable business actions. It is a
suite of software and services to transform data into
actionable intelligence and knowledge.
• BI has a direct impact on an organization's strategic, tactical
and operational business decisions. BI supports fact-based
decision making using historical data rather than
assumptions and gut feeling.
• BI tools perform data analysis and create reports,
summaries, dashboards, maps, graphs, and charts to
provide users with detailed intelligence about the nature of
the business.
• Business Intelligence tools often source the data from data
warehouses. The reason is straightforward: a data warehouse
already has data from various production systems within an
enterprise; the data is cleansed, consolidated, conformed and
stored in one location. Because of this, BI tools are able to
concentrate on analyzing the data.
BI and DW
• Business Intelligence and Data Warehouse
(BI/DW) are two separate but closely linked
technologies that are crucial to the success of
any large or mid-size business. The insights
derived from these systems are vital for an
organization, as they help with revenue
enhancement, cost reduction, and decision
making.
• Data storage and management is an important
managerial activity in any organization today
and has become significant for rational
decision making. A DW acts as a central
repository system where an enterprise stores
all its data (from one or more sources) in one
place. A DW helps organizations with reporting and
data analysis over the current and historical
data it stores, and hence it is considered a
core component of Business Intelligence.
What is a Data Warehouse? Explain its Key
Features.
• Data warehousing provides architectures and tools for business
executives to systematically organize, understand, and use their
data to make strategic decisions.
• A data warehouse refers to a database that is maintained
separately from an organization’s operational databases.
• Data warehouse systems support information processing by
providing a solid platform of consolidated historical data for analysis.
• “A data warehouse is a subject-oriented, integrated, time-variant,
and nonvolatile collection of data in support of
management’s decision making process” (W. H. Inmon).
• The four keywords, subject-oriented, integrated, time-variant,
and nonvolatile, distinguish data warehouses from other data
repository systems, such as relational database systems,
transaction processing systems, and file systems.
• Why Subject-oriented?
• A data warehouse is organized around major
subjects, such as customer, supplier, product, and
sales.
• Rather than concentrating on the day-to-day
operations and transaction processing of an
organization, a data warehouse focuses on the
modeling and analysis of data for decision
makers.
• Data warehouses typically provide a simple and
concise view around particular subject issues by
excluding data that are not useful in the decision
support process.
• Why Integrated?
• A data warehouse is usually constructed by
integrating multiple heterogeneous sources,
such as relational databases, flat files, and
online transaction records.
• Data cleaning and data integration techniques
are applied to ensure consistency in naming
conventions, encoding structures, attribute
measures, and so on.
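A minimal sketch of what such integration can look like, assuming two hypothetical sources (`crm_rows` and `billing_rows`) that disagree on field names, gender encoding, and units of measure. None of these names come from the slides.

```python
# Hypothetical source records with inconsistent naming, encoding, and units.
crm_rows = [{"cust_id": 1, "gender": "M", "height_cm": 180.0}]
billing_rows = [{"customerId": 2, "sex": "female", "height_in": 65.0}]

# One agreed encoding for the warehouse (an assumption for this example).
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}

def conform_crm(row):
    """Map a CRM row onto the warehouse naming convention."""
    return {
        "customer_id": row["cust_id"],
        "gender": GENDER_CODES[row["gender"].lower()],
        "height_cm": row["height_cm"],
    }

def conform_billing(row):
    """Map a billing row onto the same convention, converting inches to cm."""
    return {
        "customer_id": row["customerId"],
        "gender": GENDER_CODES[row["sex"].lower()],
        "height_cm": round(row["height_in"] * 2.54, 1),
    }

integrated = [conform_crm(r) for r in crm_rows] + [
    conform_billing(r) for r in billing_rows
]
```

After conforming, every row uses the same attribute names, the same gender codes, and the same unit, so the rows can be analyzed as one set.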
• Why Time-variant?
• Data are stored to provide information from a historical
perspective (e.g., the past 5–10 years).
• Every key structure in the data warehouse contains,
either implicitly or explicitly, an element of time.
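One way to picture an explicit time element in a key is a history table whose primary key includes a date. The sketch below uses Python's built-in `sqlite3` module; the `price_history` table, its columns, and the `price_as_of` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The key combines the product with the date from which the price is valid.
conn.execute(
    "CREATE TABLE price_history ("
    " product_id INTEGER, valid_from TEXT, price REAL,"
    " PRIMARY KEY (product_id, valid_from))"
)
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?)",
    [(1, "2022-01-01", 9.5), (1, "2023-01-01", 10.0), (1, "2024-01-01", 11.0)],
)

def price_as_of(product_id, day):
    """Return the price in effect on an ISO date, or None if none applies."""
    row = conn.execute(
        "SELECT price FROM price_history"
        " WHERE product_id = ? AND valid_from <= ?"
        " ORDER BY valid_from DESC LIMIT 1",
        (product_id, day),
    ).fetchone()
    return row[0] if row else None

mid_2023_price = price_as_of(1, "2023-06-15")
```

Because older rows are kept rather than overwritten, the same question can be asked for any date in the stored history.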
• Why Nonvolatile?
• A data warehouse is always a physically separate store
of data transformed from the application data found in
the operational environment.
• Due to this separation, a data warehouse does not
require transaction processing, recovery, and
concurrency control mechanisms.
• It usually requires only two operations in data
accessing: initial loading of data and access of data.
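As a toy illustration of the two operations named above, the following class exposes only loading and read access. `NonvolatileStore` is a hypothetical name, and a real warehouse enforces this at the database level rather than in application code.

```python
class NonvolatileStore:
    """Toy store exposing only two operations: load and access."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Loading appends; existing rows are never updated or deleted.
        self._rows.extend(dict(r) for r in rows)

    def query(self, predicate):
        # Return copies so callers cannot modify stored rows in place.
        return [dict(r) for r in self._rows if predicate(r)]

store = NonvolatileStore()
store.load([{"order_id": 1, "amount": 50.0}, {"order_id": 2, "amount": 75.0}])

result = store.query(lambda r: r["amount"] > 60)
result[0]["amount"] = 0.0  # changes only the returned copy, not the store
```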
Metadata Repository
• Metadata are data about data. When used in
a data warehouse, metadata are the data that
define warehouse objects.
• Metadata are created for the data names and
definitions of the given warehouse.
• Additional metadata are created and captured
for time stamping any extracted data, the
source of the extracted data, and missing
fields that have been added by data cleaning
or integration processes.
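The kinds of metadata listed above (time stamp, source, and fields added during cleaning) could be captured at extraction time roughly as follows. The function `extract_with_metadata`, the field names, and the source name `crm_export.csv` are hypothetical.

```python
from datetime import datetime, timezone

def extract_with_metadata(record, source, required_fields, default=None):
    """Fill missing fields and record where and when the data was extracted."""
    filled = [f for f in required_fields if f not in record]
    data = {**record, **{f: default for f in filled}}
    metadata = {
        "source": source,
        "extracted_at": datetime.now(timezone.utc).isoformat(),
        "filled_fields": filled,  # fields added by the cleaning step
    }
    return data, metadata

data, meta = extract_with_metadata(
    {"id": 7, "name": "Ada"},
    source="crm_export.csv",
    required_fields=["id", "name", "email"],
)
```

Keeping `filled_fields` alongside the data lets a later reader tell an original value apart from one supplied by cleaning.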
A Metadata repository should contain the following:
• A description of the structure of the data
warehouse, which includes the warehouse schema,
view, dimensions, hierarchies, and derived data
definitions, as well as data mart locations and
contents.
• Operational metadata, which include data lineage
(history of migrated data and the sequence of
transformations applied to it), and monitoring
information (warehouse usage statistics, error
reports, and audit trails).
• The algorithms used for summarization, which
include measure and dimension definition
algorithms, data on granularity, partitions, subject
areas, aggregation, summarization and predefined
queries and reports.
• The mapping from the operational environment to
the data warehouse, which includes source
databases and their contents, data partitions, data
extraction, cleaning, transformation rules and
defaults, data refresh and purging rules, and
security (user authorization and access control).
• Data related to system performance, which
include indices and profiles that improve data
access and retrieval performance, in addition to
rules for the timing and scheduling of refresh,
update, and replication cycles.
• Business metadata, which include business terms
and definitions, data ownership information, and
charging policies.
Data Mart and Its Types
• Data marts contain a subset of organization-wide data that
is valuable to specific groups of people in an organization.
• A data mart contains only the data that are specific to a
particular group.
• Data marts improve end-user response time by allowing
users to have access to the specific type of data they need
to view most often by providing the data in a way that
supports the collective view of a group of users.
• A data mart is basically a condensed and more focused
version of a data warehouse that reflects the regulations
and process specifications of each business unit within an
organization.
• Each data mart is dedicated to a specific
business function or region.
• For example, the marketing data mart may
contain only data related to items, customers,
and sales. Data marts are confined to subjects.
Three basic types of data marts are dependent,
independent, and hybrid.
• The categorization is based primarily on the data
source that feeds the data mart.
• Dependent data marts draw data from a central
data warehouse that has already been created.
• Independent data marts, in contrast, are
standalone systems built by drawing data directly
from operational or external sources of data or
both.
• Hybrid data marts can draw data from
operational systems, data warehouses, or both.
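To make the dependent case concrete, the sketch below builds a marketing data mart as a view over a central warehouse table, using Python's built-in `sqlite3` module. The table `warehouse_sales`, the view `marketing_mart`, and the sample rows are assumptions for illustration; an independent mart would instead be loaded directly from operational sources.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the central data warehouse (hypothetical schema).
conn.execute(
    "CREATE TABLE warehouse_sales ("
    " region TEXT, department TEXT, item TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        ("East", "marketing", "banner", 100.0),
        ("East", "finance", "audit", 300.0),
        ("West", "marketing", "flyer", 50.0),
    ],
)

# Dependent data mart: a subject-specific subset derived from the warehouse.
conn.execute(
    "CREATE VIEW marketing_mart AS"
    " SELECT region, item, amount FROM warehouse_sales"
    " WHERE department = 'marketing'"
)

mart_rows = conn.execute(
    "SELECT region, item, amount FROM marketing_mart ORDER BY region"
).fetchall()
```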
Dependent Data Marts
• A dependent data mart allows you to unite
your organization's data in one data
warehouse.
• This gives you the usual advantages of
centralization.
• Figure illustrates a dependent data mart.
Independent Data Marts
• An independent data mart is created without
the use of a central data warehouse.
• This could be desirable for smaller groups
within an organization.
• Figure illustrates an independent data mart.
Hybrid Data Marts
• A hybrid data mart allows you to combine
input from a data warehouse with input from
other sources.
• This could be useful for many situations,
especially when you need ad hoc integration,
such as after a new group or product is added
to the organization.
• Figure illustrates a hybrid data mart.
Basic Elements of a Data Warehouse
• Source System
• Data Staging Area
• Presentation Server/area
• Metadata
• End User Application
Source System
• An operational system of record whose
function it is to capture the transactions of
the business.
• A source system is often called a "legacy
system" in a mainframe environment.
Data Staging Area
• A storage area and a set of processes that clean,
transform, combine, de-duplicate, household, archive, and
prepare source data for use in the data warehouse.
• The data staging area is everything in between the source
system and the data presentation server.
• It may be on a single machine or spread across several
machines.
• Data staging is an intermediate storage area used for data
processing during the extract, transform and load (ETL)
process. The data staging area sits between the data
source(s) and the data target(s), which are often data
warehouses, data marts, or other data repositories.
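A small sketch of two of the staging tasks mentioned above, cleaning and de-duplication, applied before data reaches the warehouse. The `stage` function, the choice of email as the matching key, and the sample rows are hypothetical.

```python
def stage(rows):
    """Clean and de-duplicate source rows before loading."""
    seen = set()
    staged = []
    for row in rows:
        # Clean: normalize the key so equivalent values compare equal.
        email = row.get("email", "").strip().lower()
        if not email:
            continue  # reject rows without a usable key
        if email in seen:
            continue  # de-duplicate: keep the first occurrence only
        seen.add(email)
        staged.append(
            {"email": email, "name": row.get("name", "").strip().title()}
        )
    return staged

source_rows = [
    {"email": " Ann@Example.com ", "name": "ann lee"},
    {"email": "ann@example.com", "name": "Ann Lee"},
    {"email": "", "name": "no key"},
    {"email": "bob@example.com", "name": " bob ray"},
]
staged = stage(source_rows)
```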
Presentation Server/area
• The target physical machine on which the data warehouse
data is organized and stored for direct querying by end
users, report writers, and other applications.
• It is on the presentation server that the data should
be presented and stored in a dimensional framework.
• If the presentation server is based on a relational database,
then the tables will be organized as star schemas. If the
presentation server is based on non-relational on-line
analytic processing (OLAP) technology, then the data will
still have recognizable dimensions.
• Most of the large data marts (greater than a few gigabytes)
are implemented on relational databases.
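A star schema on a relational presentation server can be sketched with Python's built-in `sqlite3` module: one fact table surrounded by dimension tables and queried through joins. The table names (`fact_sales`, `dim_product`, `dim_date`) and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);
-- The fact table holds measures plus a foreign key to each dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date VALUES (20230101, 2023), (20240101, 2024);
INSERT INTO fact_sales VALUES
    (1, 20230101, 10.0), (1, 20240101, 15.0),
    (2, 20240101, 7.5), (1, 20240101, 5.0);
""")

# A typical dimensional query: aggregate the facts by dimension attributes.
sales_by_product_year = conn.execute("""
SELECT p.name, d.year, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
JOIN dim_date d ON d.date_key = f.date_key
GROUP BY p.name, d.year
ORDER BY p.name, d.year
""").fetchall()
```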
End User Application
• A collection of tools that query, analyze, and
present information targeted to support a
business need.
• A minimal set of such tools would consist of
an end user data access tool, a spreadsheet, a
graphics package, and a user interface facility
for eliciting prompts and simplifying the
screen presentations to end users.
Components of Data Warehouse
• Source data Component
• Data staging Component
• Data storage Component
• Information Delivery Component
• Metadata Component
• Management and Control Component
Information Delivery Component
• In order to provide information for decision
making to the wide community of data
warehouse users, the information delivery
component includes different methods of
information delivery.
• Provides information to one or more destinations
according to a specified scheduling algorithm.
• Information delivery may be based on time of day
or completion of external events.
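The two triggers described above, time of day and completion of an external event, can be sketched as a simple check over a list of subscriptions. The `due_deliveries` function, the subscription format, and the report and event names are hypothetical; this is not the scheduling algorithm of any particular product.

```python
from datetime import time

def due_deliveries(subscriptions, now, completed_events):
    """Return the reports whose time or event trigger has been met."""
    due = []
    for sub in subscriptions:
        if sub["trigger"] == "time" and now >= sub["at"]:
            due.append(sub["report"])
        elif sub["trigger"] == "event" and sub["event"] in completed_events:
            due.append(sub["report"])
    return due

subscriptions = [
    {"report": "daily_sales", "trigger": "time", "at": time(8, 0)},
    {"report": "load_summary", "trigger": "event", "event": "nightly_load"},
    {"report": "evening_digest", "trigger": "time", "at": time(18, 0)},
]

due = due_deliveries(
    subscriptions, now=time(9, 30), completed_events={"nightly_load"}
)
```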
Management and Control Component
• This component of the data warehouse architecture
sits on top of all the other components.
• The management and control component coordinates
the services and activities within the data warehouse.
• This component controls the data transformation and
the data transfer into the data warehouse storage.
• It works with the database management systems and
enables data to be properly stored in the repositories.
It monitors the movement of data into the staging area
and from there into the data warehouse storage itself.
• The management and control component interacts
with the metadata component to perform the
management and control functions.
