Unit1 Dwbi

The document outlines the architecture and components of data warehousing and business intelligence, including the processes of data extraction, transformation, and loading (ETL), as well as the structure of data warehouses and data marts. It describes the three-tier architecture of data warehouses, the multidimensional data model, and various operations such as roll-up, drill-down, slice, and pivot for data analysis. Additionally, it explains different schema designs like star, snowflake, and fact constellation schemas used in data warehousing.

Uploaded by

22b81a6610


Data Warehousing and Business Intelligence
UNIT-I
Data Warehouse Architecture
• External Sources: An external source is any source from which data is
collected, regardless of the type of data. The data can be structured,
semi-structured, or unstructured.
• Staging Area: Data extracted from the external sources does not follow a
single format, so it must be validated before it is loaded into the data
warehouse. An ETL tool is recommended for this purpose.
• E (Extract): Data is extracted from the external data sources.

• T (Transform): Data is transformed into the standard format.

• L (Load): Data is loaded into the data warehouse after it has been
transformed into the standard format.
• Data Warehouse: After cleansing, the data is stored in the data
warehouse, which acts as the central repository. It holds the integrated
data together with its metadata, and the data marts are populated from it.
Note that in this top-down approach the data warehouse stores the data in
its purest form.

• Data Marts: A data mart is also part of the storage component. It stores
the information of one particular function of an organisation, handled by
a single authority. An organisation can have as many data marts as it has
functions. A data mart can also be described as a subset of the data
stored in the data warehouse.
• Data Mining: Data mining is the practice of analysing the large volumes
of data held in the data warehouse. It uses data mining algorithms to
find hidden patterns in the database or data warehouse.
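The extract, transform, and load steps described above can be sketched in a few lines. This is a minimal illustration only, not a real ETL tool: the record layout, the field names (`store`, `amount`, `date`), and the sample values are all assumptions made up for the example.

```python
# Hypothetical raw records from external sources with inconsistent formats.
raw_records = [
    {"store": " Vancouver ", "amount": "1,200.50", "date": "2024-01-15"},
    {"store": "toronto", "amount": "980", "date": "2024-01-16"},
    {"store": "Toronto", "amount": "n/a", "date": "2024-01-17"},  # invalid amount
]


def extract(source):
    """Extract: read records from the external source (here, an in-memory list)."""
    return list(source)


def transform(records):
    """Transform: validate each record and convert it to one standard format."""
    cleaned = []
    for rec in records:
        try:
            amount = float(rec["amount"].replace(",", ""))
        except ValueError:
            continue  # reject records that fail validation in the staging area
        cleaned.append({
            "store": rec["store"].strip().title(),
            "amount": amount,
            "date": rec["date"],
        })
    return cleaned


def load(records, warehouse):
    """Load: append the standardized records to the warehouse table."""
    warehouse.extend(records)
    return warehouse


warehouse = load(transform(extract(raw_records)), [])
```

Only records that pass validation reach `warehouse`; the third sample record is dropped in the transform step because its amount cannot be parsed.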
• Three-Tier Data Warehouse Architecture
• Data Warehouses usually have a three-level (tier) architecture that includes:
• Bottom Tier (Data Warehouse Server)
• Middle Tier (OLAP Server)
• Top Tier (Front end Tools).
• A bottom tier that consists of the data warehouse server, which is almost always
an RDBMS. It may include several specialized data marts and a metadata repository.
• Data from operational databases and external sources (such as user profile data
provided by external consultants) are extracted using application program
interfaces called gateways. A gateway is provided by the underlying DBMS and
allows client programs to generate SQL code to be executed at a server.
• A middle tier that consists of an OLAP server for fast querying of the data
warehouse.
• The OLAP server is implemented using either
• (1) A Relational OLAP (ROLAP) model, i.e., an extended relational DBMS
that maps functions on multidimensional data to standard relational
operations.
• (2) A Multidimensional OLAP (MOLAP) model, i.e., a special-purpose
server that directly implements multidimensional data and
operations.
• A top-tier that contains front-end tools for displaying results provided by
OLAP, as well as additional tools for data mining of the OLAP-generated data.
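To make the ROLAP idea concrete, the sketch below expresses one multidimensional operation (aggregating sales from city level to country level) as an ordinary relational GROUP BY. It uses Python's built-in `sqlite3` module purely as a stand-in relational engine; the `sales` table, its columns, and the figures are assumptions for illustration.

```python
import sqlite3

# In-memory relational database standing in for the warehouse RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (city TEXT, country TEXT, quarter TEXT, dollars_sold REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Vancouver", "Canada", "Q1", 100.0),
        ("Toronto", "Canada", "Q1", 150.0),
        ("Chicago", "USA", "Q1", 200.0),
        ("New York", "USA", "Q1", 250.0),
    ],
)

# Aggregating by country instead of city: a multidimensional operation
# mapped to a standard relational GROUP BY.
rows = conn.execute(
    "SELECT country, SUM(dollars_sold) FROM sales "
    "GROUP BY country ORDER BY country"
).fetchall()
conn.close()
```

A MOLAP server would instead keep the data in a multidimensional array structure and answer the same question without going through SQL.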
• The metadata repository stores information that defines DW objects. It includes the
following parameters and information for the middle and the top-tier applications:
• A description of the DW structure, including the warehouse schema, dimension,
hierarchies, data mart locations, and contents, etc.
• Operational metadata, which usually describes the currency level of the stored data, i.e.,
active, archived or purged, and warehouse monitoring information, i.e., usage statistics,
error reports, audit, etc.
• System performance data, which includes indices, used to improve data access and
retrieval performance.
• Information about the mapping from operational databases, which provides
source RDBMSs and their contents, cleaning and transformation rules, etc.
• Summarization algorithms, predefined queries, and reports.
• Business data, which include business terms and definitions, ownership
information, etc.
Multidimensional Data Model
• The multidimensional data model is a method for organizing the data in
a database so that its contents are well arranged and easy to
assemble for analysis.
• The multidimensional data model lets users ask analytical questions
about market or business trends, whereas a relational database lets
users access data only in the form of queries.
• It lets users receive answers to their requests quickly, because the
data can be created and examined comparatively fast.
• It represents data in the form of data cubes. Data cubes allow data to
be modeled and viewed from many dimensions and perspectives.
• It is defined by dimensions and facts and is represented by a fact
table.
• Facts are numerical measures. A fact table contains the names of the
facts (the measures) and keys to each of the related dimension tables.
• For example, a shop may create a sales data warehouse to keep
records of the store's sales for the dimensions time, item, and location.
• These dimensions allow the store to keep track of things such as the
monthly sales of items and the locations at which the items were
sold.
• Each dimension has a table related to it, called a dimensional table,
which describes the dimension further.
• For example, a dimensional table for an item may contain the
attributes item_name, brand, and type.
• A multidimensional data model is organized around a central theme, for
example, sales. This theme is represented by a fact table, which contains
the names of the facts (numerical measures) and keys to each of the
related dimensional tables.
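The relationship between a fact table and its dimension tables can be sketched with plain dictionaries. The attributes `item_name`, `brand`, and `type` come from the example above; the keys, the `location_dim` table, the `units_sold` measure, and all values are assumptions added for illustration.

```python
# Dimension tables: each key maps to descriptive attributes.
item_dim = {
    1: {"item_name": "Laptop X", "brand": "Acme", "type": "computer"},
    2: {"item_name": "Phone Y", "brand": "Acme", "type": "phone"},
}
location_dim = {
    10: {"city": "Vancouver"},
    20: {"city": "Toronto"},
}

# Fact table: keys into each dimension table plus a numerical measure.
sales_fact = [
    {"item_key": 1, "location_key": 10, "month": "2024-01", "units_sold": 3},
    {"item_key": 2, "location_key": 10, "month": "2024-01", "units_sold": 5},
    {"item_key": 1, "location_key": 20, "month": "2024-02", "units_sold": 2},
]


def units_by(facts, dim, key_field, attr):
    """Total units_sold per value of one dimension attribute."""
    totals = {}
    for row in facts:
        value = dim[row[key_field]][attr]  # look up the dimension row via its key
        totals[value] = totals.get(value, 0) + row["units_sold"]
    return totals


units_by_type = units_by(sales_fact, item_dim, "item_key", "type")
units_by_city = units_by(sales_fact, location_dim, "location_key", "city")
```

The fact table stays narrow (keys and measures only), while the descriptive detail lives in the dimension tables and is joined in when a question needs it.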
Data Cube
• A data cube is formed when data is grouped or combined into
multidimensional matrices.
• The data cube method has a few alternative names or variants, such as
"multidimensional databases," "materialized views," and "OLAP (On-Line
Analytical Processing)."

• For example, a relation with the schema sales(part, supplier, customer,
sale-price) can be materialized into a set of eight views, as shown in the
figure, where psc indicates a view consisting of aggregate function values
(such as total-sales) computed by grouping on the three attributes part,
supplier, and customer, p indicates a view composed of the corresponding
aggregate function values calculated by grouping on part alone, etc.
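The eight views arise because each of the three grouping attributes is either kept or aggregated away, giving 2^3 = 8 combinations. A minimal sketch, assuming a tiny made-up `sales` relation and using total sale price as the aggregate function:

```python
from itertools import combinations

# Hypothetical rows of the relation sales(part, supplier, customer, sale-price).
sales = [
    {"part": "p1", "supplier": "s1", "customer": "c1", "sale_price": 10},
    {"part": "p1", "supplier": "s2", "customer": "c1", "sale_price": 20},
    {"part": "p2", "supplier": "s1", "customer": "c2", "sale_price": 30},
]
dimensions = ("part", "supplier", "customer")


def group_total(rows, group_by):
    """Total sale_price for each distinct combination of the group_by attributes."""
    totals = {}
    for row in rows:
        key = tuple(row[d] for d in group_by)
        totals[key] = totals.get(key, 0) + row["sale_price"]
    return totals


# One view per subset of the dimensions, named by initial letters
# (psc, ps, pc, sc, p, s, c) plus "none" for the grand total.
views = {}
for k in range(len(dimensions), -1, -1):
    for group_by in combinations(dimensions, k):
        name = "".join(d[0] for d in group_by) or "none"
        views[name] = group_total(sales, group_by)
```

The view named `none` groups on no attribute at all and therefore holds a single value, the total over every row.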
• A data cube enables data to be modeled and viewed in multiple
dimensions. A multidimensional data model is organized around a
central theme, like sales and transactions.
• A fact table represents this theme. Facts are numerical measures.
Thus, the fact table contains measure (such as Rs_sold) and keys to
each of the related dimensional tables.
• Dimensions are the perspectives or entities that define a data cube.
Facts are generally quantities, which are used for analyzing the
relationships between dimensions.
• Example: In the 2-D representation, we look at the All Electronics
sales data for items sold per quarter in the city of Vancouver. The
measure displayed is dollars sold (in thousands).
• Now suppose we would like to view the sales data with a third
dimension.
• For example, suppose we would like to view the data according to
time and item, as well as location, for the cities Chicago, New York,
Toronto, and Vancouver.
• The measure displayed is dollars sold (in thousands). These 3-D data
are shown in the table, where they are represented as a series of 2-D
tables.
• The topmost 0-D cuboid, which holds the highest level of
summarization, is known as the apex cuboid.
• In this example, this is the total sales, or dollars sold, summarized
over all four dimensions.
• The lattice of cuboids forms a data cube. The figure shows the lattice of
cuboids forming a 4-D data cube for the dimensions time, item,
location, and supplier. Each cuboid represents a different degree of
summarization.
• The figure shows a data cube for the sales of a shop.
• The cube contains the dimensions location, time,
and item, where location is aggregated with
respect to city values, time is aggregated with respect
to quarters, and item is aggregated with respect to
item types.
• The roll-up operation (also known as drill-up or
aggregation operation) performs aggregation on a
data cube, either by climbing up a concept hierarchy
for a dimension or by dimension reduction.
• Roll-up is like zooming out on the data cube. The figure
shows the result of a roll-up operation performed on
the dimension location.
• The hierarchy for location is defined as the order
street < city < province or state < country.
• The roll-up operation aggregates the data by
ascending the location hierarchy from the level of
city to the level of country.
• When a roll-up is performed by dimension reduction, one or more
dimensions are removed from the cube.
• For example, consider a sales data cube having two dimensions,
location and time.
• Roll-up may be performed by removing the time dimension,
resulting in an aggregation of the total sales by location, rather
than by location and by time.
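Both forms of roll-up can be sketched on a small cube held as a dictionary keyed by (city, quarter). The city-to-country mapping and the figures are made-up assumptions, not the data from the figure.

```python
# Hypothetical concept hierarchy for location: city -> country.
city_to_country = {
    "Vancouver": "Canada", "Toronto": "Canada",
    "Chicago": "USA", "New York": "USA",
}

# Cube cells keyed by (city, quarter); values are made-up dollars sold (thousands).
cube = {
    ("Vancouver", "Q1"): 100, ("Vancouver", "Q2"): 120,
    ("Toronto", "Q1"): 150, ("Toronto", "Q2"): 130,
    ("Chicago", "Q1"): 200, ("Chicago", "Q2"): 210,
    ("New York", "Q1"): 250, ("New York", "Q2"): 240,
}


def roll_up_location(cells, hierarchy):
    """Roll up by climbing the location hierarchy from city to country."""
    rolled = {}
    for (city, quarter), value in cells.items():
        key = (hierarchy[city], quarter)
        rolled[key] = rolled.get(key, 0) + value
    return rolled


def roll_up_drop_time(cells):
    """Roll up by dimension reduction: remove the time dimension entirely."""
    rolled = {}
    for (city, _quarter), value in cells.items():
        rolled[city] = rolled.get(city, 0) + value
    return rolled


by_country = roll_up_location(cube, city_to_country)
by_city = roll_up_drop_time(cube)
```

In the first case the cube keeps two dimensions but at a coarser level; in the second it loses a dimension and becomes a 1-D total per city.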
• Drill-Down
• The drill-down operation (also called roll-down) is the reverse
operation of roll-up. Drill-down is like zooming in on the data cube. It
navigates from less detailed data to more detailed data.
• Drill-down can be performed either by stepping down a concept
hierarchy for a dimension or by adding dimensions.
• For example, drill-down can be performed by descending the time
hierarchy from the level of quarter to the more detailed level of month.
• Because a drill-down adds more detail to the given data, it can also
be performed by adding a new dimension to a cube.
• For example, a drill-down on the central cube of the figure can occur
by introducing an additional dimension, such as a customer group.
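Drill-down needs the more detailed data to be available, so the sketch below assumes the warehouse keeps monthly cells and derives the quarterly view from them. The month-to-quarter mapping is standard; the figures are made up.

```python
# Detailed cells keyed by (city, month); values are made-up dollars sold (thousands).
monthly = {
    ("Vancouver", "Jan"): 30, ("Vancouver", "Feb"): 40, ("Vancouver", "Mar"): 30,
    ("Vancouver", "Apr"): 50, ("Vancouver", "May"): 35, ("Vancouver", "Jun"): 35,
}
month_to_quarter = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}


def quarterly_view(cells):
    """The less detailed view: monthly cells aggregated to quarters."""
    view = {}
    for (city, month), value in cells.items():
        key = (city, month_to_quarter[month])
        view[key] = view.get(key, 0) + value
    return view


def drill_down(cells, city, quarter):
    """Descend the time hierarchy: return the monthly cells behind one quarter."""
    return {
        month: value
        for (c, month), value in cells.items()
        if c == city and month_to_quarter[month] == quarter
    }


quarterly = quarterly_view(monthly)
q1_detail = drill_down(monthly, "Vancouver", "Q1")
```

The monthly values returned by `drill_down` add up to the quarterly cell they were drilled from, which is what makes the two operations inverses of each other.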
• A slice is a subset of the cube corresponding to a single value for one
or more members of the dimensions.
• For example, a slice operation is performed when the user wants a
selection on one dimension of a three-dimensional cube, resulting in a
two-dimensional slice.
• So, the slice operation performs a selection on one dimension of the
given cube, resulting in a subcube.
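A slice can be sketched as fixing one axis of the cube to a single value and dropping that axis from the result. The cube below is keyed by (city, quarter, item type); the contents are made-up assumptions.

```python
# 3-D cube keyed by (city, quarter, item type); values are made-up units sold.
cube = {
    ("Vancouver", "Q1", "phone"): 10,
    ("Vancouver", "Q1", "computer"): 20,
    ("Vancouver", "Q2", "phone"): 15,
    ("Toronto", "Q1", "phone"): 12,
    ("Toronto", "Q2", "computer"): 18,
}


def slice_cube(cells, axis, value):
    """Select the cells where dimension `axis` equals `value`, then drop that axis."""
    return {
        key[:axis] + key[axis + 1:]: v
        for key, v in cells.items()
        if key[axis] == value
    }


# Fix time to Q1: the 3-D cube becomes a 2-D (city, item type) table.
q1_slice = slice_cube(cube, 1, "Q1")
```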
• Pivot
• The pivot operation is also called rotation. Pivot is a visualization
operation that rotates the data axes in view to provide an
alternative presentation of the data.
• It may involve swapping the rows and columns or moving one of the
row dimensions into the column dimensions.
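Swapping rows and columns, the simplest pivot, can be sketched on a 2-D view held as a nested dictionary. The quarter and item-type labels and the values are assumptions for illustration.

```python
# 2-D view: rows are quarters, columns are item types (made-up units sold).
table = {
    "Q1": {"phone": 10, "computer": 20},
    "Q2": {"phone": 15, "computer": 25},
}


def pivot(view):
    """Rotate the axes: row labels become column labels and vice versa."""
    rotated = {}
    for row_label, columns in view.items():
        for col_label, value in columns.items():
            rotated.setdefault(col_label, {})[row_label] = value
    return rotated


pivoted = pivot(table)
```

The values are unchanged; only the presentation differs, and pivoting twice returns the original view.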
Schemas
• The star schema is well suited to data warehouse database
design because of the following features:
• It creates a denormalized database that can provide query
responses quickly.
• It provides a flexible design that can easily be changed or added to
throughout the development cycle and as the database grows.
• Its design parallels how end users typically think of and
use the data.
• It reduces the complexity of metadata for both developers and end
users.
• The snowflake schema consists of one fact table linked to
many dimension tables, which can in turn be linked to other dimension
tables through many-to-one relationships.
• Tables in a snowflake schema are generally normalized to the third
normal form. Each dimension table represents exactly one level in a
hierarchy.
• A fact constellation schema is a sophisticated database design in
which multiple fact tables share dimension tables, and in which
summarizing information is difficult.
• A fact constellation schema can relate aggregate fact tables to one
another, or decompose a complex fact table into independent, simpler
fact tables.
• The schema contains a fact table for sales that includes keys to each
of the four dimensions, along with two measures: Rupee_sold and
units_sold.
• The shipping table has five dimensions, or keys: item_key, time_key,
shipper_key, from_location, and to_location, and two measures:
Rupee_cost and units_shipped.
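The sales and shipping example can be written down as two fact-table descriptions to show which dimension keys they share. The shipping keys and all four measures are taken from the text above. The text does not name the four sales dimensions, so `branch_key` and `location_key` are assumptions, as is the dictionary layout itself.

```python
# Fact-table descriptions: dimension keys plus measures.
sales_fact = {
    # Assumed keys: the source only says "four dimensions".
    "keys": ["time_key", "item_key", "branch_key", "location_key"],
    "measures": ["Rupee_sold", "units_sold"],
}
shipping_fact = {
    "keys": ["item_key", "time_key", "shipper_key", "from_location", "to_location"],
    "measures": ["Rupee_cost", "units_shipped"],
}

# Dimension keys that appear in both fact tables point at shared dimension tables.
shared_keys = sorted(set(sales_fact["keys"]) & set(shipping_fact["keys"]))
```

Here `shared_keys` lists only keys with identical names. In a real design, `from_location` and `to_location` would typically also refer to the same location dimension table that sales uses, under different key names.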
