DMDW-MDM L8,9

The document discusses the multidimensional data model used in data warehouses and OLAP tools, emphasizing the structure of data cubes defined by dimensions and facts. It outlines various schemas such as Star, Snowflake, and Fact Constellation, detailing their characteristics and use cases. Additionally, it covers OLAP operations, data warehouse architecture, and the benefits and challenges of a three-tier data warehouse system.

Uploaded by

xataje8102

Multidimensional Data Model

❖ Data warehouses and OLAP tools are based on a multidimensional data
model, which views data in the form of a data cube.
❖ A data cube allows data to be modeled and viewed in multiple dimensions. It is
defined by dimensions and facts.
❖ Dimensions are the perspectives or entities with respect to which an
organization wants to keep records.
❖ Example:
➢ A sales data warehouse may keep records of a store's sales with respect to the dimensions
time, item, branch, and location.
Basic terms: Multidimensional Data Model
❖ Facts are numerical measures, such as dollars sold or units sold.
❖ Each dimension may have a table associated with it, called a dimension
table.
❖ Dimension tables can be specified by users or experts, or automatically
generated and adjusted based on data distributions.
❖ The fact table contains the names of the facts, or measures, as well as keys
to each of the related dimension tables.
❖ The 0-D cuboid, which holds the highest level of summarization, is called the
apex cuboid.
❖ The cuboid that holds the lowest level of summarization is called the base
cuboid.
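The relationship between the base cuboid and the apex cuboid can be sketched in a few lines of Python. This is only an illustration: the fact rows, the three dimensions, and the helper names `cuboid` and `all_cuboids` are assumptions made for the example, and concept hierarchies within a dimension are ignored, so n dimensions give 2^n cuboids.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (time, item, location, dollars_sold).
FACTS = [
    ("Q1", "phone", "Vancouver", 100.0),
    ("Q1", "laptop", "Toronto", 250.0),
    ("Q2", "phone", "Vancouver", 150.0),
    ("Q2", "phone", "Toronto", 50.0),
]
DIMENSIONS = ("time", "item", "location")


def cuboid(group_by):
    """Sum dollars_sold grouped by the given subset of dimensions."""
    idx = [DIMENSIONS.index(d) for d in group_by]
    totals = defaultdict(float)
    for row in FACTS:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]
    return dict(totals)


def all_cuboids():
    """Build every cuboid of the lattice, keyed by its group-by dimensions."""
    lattice = {}
    for k in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, k):
            lattice[dims] = cuboid(dims)
    return lattice


lattice = all_cuboids()
apex = lattice[()]          # 0-D cuboid: a single grand total
base = lattice[DIMENSIONS]  # 3-D cuboid: grouped by every dimension
print(len(lattice))         # 8 cuboids for 3 dimensions
print(apex)                 # {(): 550.0}
```

Every cuboid between the two extremes groups by some subset of the dimensions, which is why the apex cuboid has exactly one cell and the base cuboid has the most.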
Schemas for Multidimensional Databases
❖ A data warehouse schema is a description, represented by objects such as
tables and indexes, of how data relates logically within a data warehouse.
❖ The multidimensional model of a data warehouse can take the form of:
➢ Star Schemas
➢ Snowflake Schemas
➢ Fact Constellation Schemas
Star Schema
❖ The star schema is historically one of the most straightforward data
warehouse designs.
❖ A star schema follows distinct design rules: it permits only one central
(fact) table, with a handful of single-dimension tables joined to that table.
❖ A star schema creates denormalized dimension tables.
❖ Denormalization deliberately introduces redundancy into the dimension
tables, as long as doing so improves query performance.
Characteristics of the Star Schema

❖ Star schemas create a denormalized database that enables quick query
responses.
❖ The primary key of each dimension table is referenced by a foreign key in
the fact table.
❖ Each dimension in the star schema maps to one dimension table.
❖ Dimension tables within a star schema are not connected directly to one another.
❖ A star schema creates denormalized dimension tables.
Snowflake Schema

❖ A snowflake schema is a data warehouse schema that encompasses a logical
arrangement of dimension tables.
❖ It builds on the star schema by adding sub-dimension tables that relate to
the first-order dimension tables joined to the fact table.
❖ In the snowflake approach, the primary key of a sub-dimension table is
referenced by a foreign key in the higher-order dimension table.
❖ A snowflake schema creates normalized dimension tables.
❖ The purpose of normalization is to eliminate redundant data and so reduce
overhead.
Characteristics of the Snowflake Schema

❖ A snowflake schema permits dimension tables to be joined to other
dimension tables.
❖ A snowflake schema has only one fact table.
❖ A snowflake schema creates normalized dimension tables.
❖ The normalized schema reduces the disk space required to run and
manage the data warehouse.
❖ A snowflake schema offers an easier way to implement a dimension.
Fact Constellation Schemas

❖ A fact constellation schema is also known as a galaxy schema.
❖ It uses multiple fact tables connected through shared, normalized
dimension tables.
❖ It can be thought of as a collection of interlinked star schemas whose shared
dimension tables are normalized, which avoids redundancy and inconsistency
in the data.
Characteristics of the Fact Constellation Schemas

A fact constellation schema:

❖ Is multidimensional, which makes it a strong design choice for complex
database systems.
❖ Reduces redundancy to near zero as a result of normalization.
❖ Is known for high data quality and accuracy, and lends itself to effective
reporting and analytics.
Question: The schema contains a central fact table for sales that contains keys to
each of the four dimensions, along with two measures: dollars sold and units sold.
To minimize the size of the fact table, dimension identifiers (such as time key and
item key) are system-generated identifiers.
Star Schema DMQL
define cube sales_star [time, item, branch, location]:
    dollars_sold = sum(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier_type)
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city, province_or_state, country)
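As a rough relational counterpart to the star definition, the sketch below builds a small star schema in an in-memory SQLite database and runs one join query. The table names (`sales_fact`, `time_dim`, and so on), the trimmed column lists, and all row values are assumptions chosen for illustration, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (columns trimmed for brevity).
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE branch_dim   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);

-- Central fact table: one foreign key per dimension plus the two measures.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    dollars_sold REAL,
    units_sold   INTEGER
);

INSERT INTO time_dim     VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO item_dim     VALUES (1, 'phone', 'Acme');
INSERT INTO branch_dim   VALUES (1, 'B1');
INSERT INTO location_dim VALUES (1, 'Vancouver', 'Canada'), (2, 'Toronto', 'Canada');
INSERT INTO sales_fact   VALUES (1, 1, 1, 1, 100.0, 2), (1, 1, 1, 2, 250.0, 5), (2, 1, 1, 1, 150.0, 3);
""")

# Each dimension is exactly one join away from the fact table.
star_rows = conn.execute("""
    SELECT t.quarter, l.city, SUM(f.dollars_sold)
    FROM sales_fact f
    JOIN time_dim t     ON t.time_key = f.time_key
    JOIN location_dim l ON l.location_key = f.location_key
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter, l.city
""").fetchall()
print(star_rows)
# [('Q1', 'Toronto', 250.0), ('Q1', 'Vancouver', 100.0), ('Q2', 'Vancouver', 150.0)]
```

Because the dimension tables are denormalized, no query needs more than one join per dimension.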
Snowflake Schema DMQL
define cube sales_snowflake [time, item, branch, location]:
    dollars_sold = sum(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier (supplier_key, supplier_type))
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city (city_key, city, province_or_state, country))
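The normalization of the item dimension into a supplier sub-dimension can be sketched with SQLite as follows. Table names, the reduced fact table, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Sub-dimension: supplier attributes are stored once per supplier.
CREATE TABLE supplier_dim (supplier_key INTEGER PRIMARY KEY, supplier_type TEXT);

-- First-order dimension: holds a foreign key to the sub-dimension
-- instead of repeating supplier_type on every item row.
CREATE TABLE item_dim (
    item_key     INTEGER PRIMARY KEY,
    item_name    TEXT,
    brand        TEXT,
    type         TEXT,
    supplier_key INTEGER REFERENCES supplier_dim(supplier_key)
);

-- Fact table reduced to one dimension key and one measure for brevity.
CREATE TABLE sales_fact (item_key INTEGER REFERENCES item_dim(item_key), dollars_sold REAL);

INSERT INTO supplier_dim VALUES (1, 'wholesale'), (2, 'retail');
INSERT INTO item_dim VALUES
    (10, 'phone',  'Acme', 'electronics', 1),
    (11, 'laptop', 'Acme', 'electronics', 1),
    (12, 'desk',   'Oak',  'furniture',   2);
INSERT INTO sales_fact VALUES (10, 100.0), (11, 250.0), (12, 80.0);
""")

# Reaching supplier_type now takes two joins: fact -> item -> supplier.
snowflake_rows = conn.execute("""
    SELECT s.supplier_type, SUM(f.dollars_sold)
    FROM sales_fact f
    JOIN item_dim i     ON i.item_key = f.item_key
    JOIN supplier_dim s ON s.supplier_key = i.supplier_key
    GROUP BY s.supplier_type
    ORDER BY s.supplier_type
""").fetchall()
print(snowflake_rows)  # [('retail', 80.0), ('wholesale', 350.0)]
```

The trade-off is visible in the query: less repeated data in `item_dim`, at the cost of an extra join compared with the star layout.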
Galaxy Schema DMQL
define cube sales [time, item, branch, location]:
    dollars_sold = sum(sales_in_dollars), units_sold = count(*)
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier_type)
define dimension branch as (branch_key, branch_name, branch_type)
define dimension location as (location_key, street, city, province_or_state, country)
define cube shipping [time, item, shipper, from_location, to_location]:
    dollars_cost = sum(cost_in_dollars), units_shipped = count(*)
define dimension time as time in cube sales
define dimension item as item in cube sales
define dimension shipper as (shipper_key, shipper_name, location as location in cube sales, shipper_type)
define dimension from_location as location in cube sales
define dimension to_location as location in cube sales
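The idea of two fact tables sharing one dimension table can be sketched as follows. Only the shared item dimension is modeled; the table names and all values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One dimension table shared by both fact tables.
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT);

-- Two fact tables, each with its own measure, both keyed on item_key.
CREATE TABLE sales_fact    (item_key INTEGER REFERENCES item_dim(item_key), dollars_sold REAL);
CREATE TABLE shipping_fact (item_key INTEGER REFERENCES item_dim(item_key), dollars_cost REAL);

INSERT INTO item_dim      VALUES (1, 'phone'), (2, 'laptop');
INSERT INTO sales_fact    VALUES (1, 100.0), (1, 150.0), (2, 250.0);
INSERT INTO shipping_fact VALUES (1, 10.0), (2, 20.0), (2, 5.0);
""")

# Aggregate each fact table separately, then line the totals up on the
# shared dimension. Joining the two fact tables directly would multiply rows.
galaxy_rows = conn.execute("""
    SELECT i.item_name,
           (SELECT SUM(s.dollars_sold) FROM sales_fact s    WHERE s.item_key = i.item_key),
           (SELECT SUM(h.dollars_cost) FROM shipping_fact h WHERE h.item_key = i.item_key)
    FROM item_dim i
    ORDER BY i.item_name
""").fetchall()
print(galaxy_rows)  # [('laptop', 250.0, 25.0), ('phone', 250.0, 10.0)]
```

Sharing `item_dim` is what lets sales and shipping figures be compared item by item without keeping two copies of the item data.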
Concept Hierarchies
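❖ A concept hierarchy is a sequence of mappings from a set of low-level concepts to
higher-level, more general concepts.
❖ Example: concept hierarchies for the dimensions location and time.
➢ Location: city values include Vancouver, Toronto, New York, and Chicago; each city maps to
its province or state, which in turn maps to its country.
[Figure: concept hierarchies for the dimensions location and time]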
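A concept hierarchy for location can be represented as a chain of lookup tables, one per mapping step. The sketch below is a minimal illustration; the level names follow the location dimension (city, province_or_state, country), while the dictionaries and the `generalize` helper are assumptions introduced for the example.

```python
# Each dictionary is one mapping step of the hierarchy:
# city -> province_or_state -> country.
CITY_TO_REGION = {
    "Vancouver": "British Columbia",
    "Toronto": "Ontario",
    "New York": "New York",
    "Chicago": "Illinois",
}
REGION_TO_COUNTRY = {
    "British Columbia": "Canada",
    "Ontario": "Canada",
    "New York": "USA",
    "Illinois": "USA",
}
LEVELS = ("city", "province_or_state", "country")


def generalize(city, level):
    """Map a city value up to the requested level of the hierarchy."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    if level == "city":
        return city
    region = CITY_TO_REGION[city]
    if level == "province_or_state":
        return region
    return REGION_TO_COUNTRY[region]


print(generalize("Chicago", "province_or_state"))  # Illinois
print(generalize("Vancouver", "country"))          # Canada
```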
OLAP Process
OLAP Operations: Drill-down

The drill-down operation lets a user zoom in on the data cube, that is, move from
less detailed data to more detailed data. It can be implemented either by
stepping down a concept hierarchy for a dimension or by adding
dimensions to the hypercube.
Roll-up

Roll-up is the opposite of the drill-down operation and is also known as drill-up or
aggregation. It performs aggregation on a data cube and makes the data less
detailed. It can be performed either by climbing up a concept hierarchy for a
dimension or by dimension reduction, that is, by removing one or more
dimensions from the cube.
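A minimal sketch of roll-up on a tiny two-dimensional cube, first by climbing a concept hierarchy (city to country) and then by removing the time dimension. The cube contents and the helper names are assumptions for illustration. Drill-down is the reverse step and needs the more detailed cube to still be available, since the detail cannot be recovered from the aggregates.

```python
from collections import defaultdict

CITY_TO_COUNTRY = {
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "New York": "USA",
    "Chicago": "USA",
}

# Hypothetical cube cells: (city, quarter) -> dollars_sold.
CUBE = {
    ("Vancouver", "Q1"): 100.0,
    ("Toronto", "Q1"): 250.0,
    ("New York", "Q1"): 300.0,
    ("Chicago", "Q2"): 50.0,
    ("Vancouver", "Q2"): 150.0,
}


def roll_up_location(cube):
    """Climb the location hierarchy from city to country."""
    out = defaultdict(float)
    for (city, quarter), value in cube.items():
        out[(CITY_TO_COUNTRY[city], quarter)] += value
    return dict(out)


def roll_up_drop_time(cube):
    """Dimension reduction: aggregate the time dimension away."""
    out = defaultdict(float)
    for (location, _quarter), value in cube.items():
        out[(location,)] += value
    return dict(out)


by_country = roll_up_location(CUBE)
by_country_total = roll_up_drop_time(by_country)
print(by_country[("Canada", "Q1")])  # 350.0
print(by_country_total)              # {('Canada',): 500.0, ('USA',): 350.0}
```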
Dice
The dice operation generates a new sub-cube from the existing hypercube. It
performs a selection on two or more dimensions of the hypercube to produce the
new sub-cube.
Slice

The slice operation performs a selection on a single dimension of the given cube to
generate a new sub-cube. It presents the information from another point of view.
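The difference between dice and slice can be sketched on a small three-dimensional cube. The cell values and the helpers `slice_cube` and `dice_cube` are assumptions made for this illustration: slice fixes one value on one dimension and drops that dimension, while dice keeps all dimensions and filters on two or more of them.

```python
# Hypothetical cube cells: (location, time, item) -> dollars_sold.
CUBE = {
    ("Vancouver", "Q1", "phone"): 100.0,
    ("Vancouver", "Q2", "phone"): 150.0,
    ("Toronto", "Q1", "laptop"): 250.0,
    ("Toronto", "Q2", "phone"): 50.0,
    ("Chicago", "Q1", "laptop"): 300.0,
}


def slice_cube(cube, dim_index, value):
    """Select one value on one dimension and drop that dimension."""
    return {
        key[:dim_index] + key[dim_index + 1:]: v
        for key, v in cube.items()
        if key[dim_index] == value
    }


def dice_cube(cube, allowed):
    """Keep cells whose coordinates fall in the allowed value sets.

    `allowed` maps a dimension index to the set of values to keep.
    """
    return {
        key: v
        for key, v in cube.items()
        if all(key[i] in values for i, values in allowed.items())
    }


q1_slice = slice_cube(CUBE, 1, "Q1")  # fix time = Q1
phone_dice = dice_cube(CUBE, {0: {"Vancouver", "Toronto"}, 2: {"phone"}})
print(len(q1_slice))    # 3 cells, each keyed by (location, item)
print(len(phone_dice))  # 3 cells, still keyed by (location, time, item)
```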
Pivot
The pivot operation provides users with an alternate view of the available data. It is also
known as the rotate operation, because it rotates the cube's orientation to view the data
from different perspectives.
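For a two-dimensional view, pivoting amounts to swapping the row and column axes. A minimal sketch, with the table contents and the `pivot` helper assumed for illustration:

```python
# Hypothetical 2-D view: rows are cities, columns are quarters.
TABLE = {
    "Vancouver": {"Q1": 100.0, "Q2": 150.0},
    "Toronto": {"Q1": 250.0, "Q2": 50.0},
}


def pivot(table):
    """Rotate the view so that columns become rows and rows become columns."""
    rotated = {}
    for row_key, row in table.items():
        for col_key, value in row.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated


rotated = pivot(TABLE)
print(rotated["Q1"])  # {'Vancouver': 100.0, 'Toronto': 250.0}
```

The values are unchanged; only the orientation of the axes differs, and pivoting twice returns the original view.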
Starnet Query Model
❖ The querying of multidimensional databases can be based on a starnet
model.
❖ A starnet model consists of radial lines emanating from a central point, where
each line represents a concept hierarchy for a dimension.
❖ Each abstraction level in the hierarchy is called a footprint.
Data Warehouse Architecture
Principles of Data Warehousing
Single-Tier Data Warehouse Architecture

❖ The single-tier architecture of a data warehouse can be viewed as a combination
of three layers: the physical source layer, the virtual data warehouse, and the
analysis layer, which can include reporting or OLAP tools.
❖ Only the source layer is physically stored. The purpose is to minimize the amount of
data stored, which in turn removes data redundancy.
❖ The primary drawback of the single-tier architecture is that it has no component
that separates analytical and transactional processing.
Two-Tier Data Warehouse Architecture

❖ The two-tier architecture removes the main drawback of the single-tier architecture
by introducing a separation between the layers; this separation is essential to the
two-tier design.
❖ The two-tier architecture of the data warehouse comprises the following two tiers:
➢ Data Tier
➢ Client Tier
Two-Tier Data Warehouse Architecture: Data Tier

❖ The Data Tier in the two-tier architecture is the layer where the actual data is stored
after ETL processes have loaded it into the database or the data warehouse.
❖ The staging area, where the ETL processes run, helps ensure that all data loaded
into the warehouse is cleansed and in the appropriate format.
❖ The Data Tier consists of the following three layers:
➢ The Source Layer
➢ The Data Staging Layer
➢ The Data Warehouse Layer
Three-Tier Data Warehouse Architecture

❖ The three-tier design builds a data warehouse by including the required:
➢ data warehouse schema model,
➢ OLAP server type, and
➢ front-end tools for reporting or analysis purposes.
❖ The three tiers are termed:
➢ Top-Tier
➢ Middle-Tier
➢ Bottom-Tier
Three-Tier Data Warehouse Architecture: Bottom-Tier

❖ The Bottom Tier in the three-tier architecture of a data warehouse consists of the data
repository.
❖ The data repository is the storage space for data extracted from various data sources,
which undergoes a series of activities as part of the ETL process. ETL stands for
Extract, Transform, and Load.
❖ After extraction, the data is cleaned to remove duplicate or junk records carried over
from its source storage units.
❖ The next step is to transform all of the data into a single storage format.

❖ The final step of ETL is to load the data into the repository.

❖ ETL tools used are:
➢ Informatica
➢ Microsoft SSIS
➢ Snaplogic
➢ Confluent
➢ Apache Kafka
➢ Alooma
➢ Ab Initio
➢ IBM Infosphere
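The extract, clean, transform, and load steps of the bottom tier can be sketched with the Python standard library alone. This is not how any of the listed tools work internally; the two in-memory sources, the record layout, and the function names are assumptions chosen to keep the example self-contained.

```python
import sqlite3

# Two hypothetical sources holding overlapping, inconsistently formatted records.
SOURCE_A = [
    {"id": "1", "amount": "100.50", "city": " vancouver "},
    {"id": "2", "amount": "n/a", "city": "Toronto"},       # junk amount
]
SOURCE_B = [
    {"id": "1", "amount": "100.50", "city": "Vancouver"},  # duplicate of id 1
    {"id": "3", "amount": "75", "city": "CHICAGO"},
]


def extract():
    """Extract: gather raw records from every source."""
    return SOURCE_A + SOURCE_B


def clean_and_transform(rows):
    """Drop junk and duplicate records, then convert to one common format."""
    seen_ids = set()
    result = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # junk record: amount is not numeric
        record_id = int(row["id"])
        if record_id in seen_ids:
            continue  # duplicate record
        seen_ids.add(record_id)
        result.append((record_id, amount, row["city"].strip().title()))
    return result


def load(rows, conn):
    """Load: write the prepared rows into the repository."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount REAL, city TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


repository = sqlite3.connect(":memory:")
load(clean_and_transform(extract()), repository)
loaded = repository.execute("SELECT * FROM sales ORDER BY id").fetchall()
print(loaded)  # [(1, 100.5, 'Vancouver'), (3, 75.0, 'Chicago')]
```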
Three-Tier Data Warehouse Architecture: Middle Tier

❖ The Middle Tier is the tier that hosts the OLAP servers.
❖ There are three types of OLAP server models:
➢ ROLAP: Relational online analytical processing performs dynamic multidimensional analysis
of data stored in a relational database, rather than restructuring the relational database
into a multidimensional database.
➢ MOLAP: Multidimensional online analytical processing stores and indexes the data directly
in a multidimensional database system.
➢ HOLAP: Hybrid online analytical processing combines the relational and multidimensional
online analytical processing models.
Three-Tier Data Warehouse Architecture: Top Tier

The Top Tier is the front-end layer, that is, the user interface that allows the user to connect
with the database systems.

This user interface is usually a tool or an API call that fetches the required data for
reporting, analysis, and data mining purposes.
If the Top Tier is equipped with a poorly suited front-end tool, the entire data warehouse architecture can fail to deliver value.

❖ Top Tier tools used are:

➢ IBM Cognos
➢ Microsoft BI Platform
➢ SAP Business Objects Web
➢ Pentaho
➢ Crystal Reports
➢ SAP BW
➢ SAS Business Intelligence
Different Data Warehouse Models

❖ Enterprise Warehouse
➢ A centralised system integrating data from all functions.
➢ Supports extensive queries and analysis for the entire organisation.
➢ Example: A large retail chain tracking inventory, sales, and customer trends across all
stores.
❖ Data Mart
➢ A smaller, department-specific subset of the warehouse.
➢ Designed for quick access and targeted analysis.
➢ Example: A finance team analysing quarterly budgets and expenditures.
❖ Virtual Warehouse
➢ Provides on-demand views of operational data.
➢ Focuses on quick access rather than permanent storage.
➢ Example: An e-commerce company generating real-time reports on daily sales
performance
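The virtual warehouse idea, views computed on demand over operational data rather than a separately stored copy, can be illustrated with a database view. The `orders` table, the `daily_sales` view, and the rows are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Operational table, written to by the transactional system.
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);

-- The view stores no data; it is evaluated each time it is queried.
CREATE VIEW daily_sales AS
    SELECT order_date, SUM(amount) AS total, COUNT(*) AS n_orders
    FROM orders
    GROUP BY order_date;

INSERT INTO orders VALUES (1, '2024-01-01', 20.0), (2, '2024-01-01', 30.0), (3, '2024-01-02', 15.0);
""")

before = conn.execute("SELECT * FROM daily_sales ORDER BY order_date").fetchall()

# A new operational row is visible in the view immediately, with no reload step.
conn.execute("INSERT INTO orders VALUES (4, '2024-01-02', 5.0)")
after = conn.execute("SELECT * FROM daily_sales ORDER BY order_date").fetchall()

print(before)  # [('2024-01-01', 50.0, 2), ('2024-01-02', 15.0, 1)]
print(after)   # [('2024-01-01', 50.0, 2), ('2024-01-02', 20.0, 2)]
```

The cost of this convenience is that every report query runs against the operational tables themselves.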
Benefits of Three-Tier Data Warehouse System

❖ Scalability:
➢ Handles growing data volumes.
➢ Supports an increasing number of users.
❖ Separation of Concerns:
➢ Keeps transactional and analytical processing distinct.
➢ Allows faster analysis without affecting daily operations.
❖ Improved Query Performance:
➢ Prepares data in advance so that queries run quickly.
➢ Shortens the time needed to deliver insights.
❖ Flexibility:
➢ Adapts to new data sources easily.
➢ Integrates with modern tools for enhanced analysis.
❖ Data Quality Assurance:
➢ Cleanses and standardizes data before storage.
➢ Reduces errors and improves reliability.
Challenges of Three-Tier Data Warehouse System

❖ Complexity: The top-down approach can be complex and time-consuming. To mitigate
this, it is essential to have a clear project plan and experienced personnel.
❖ Data Integration Issues: Integrating data from disparate sources can be challenging.
Using robust ETL tools and data integration techniques can help overcome these
challenges.
❖ Scalability: Ensuring the scalability of the enterprise data warehouse (EDW) is crucial.
This can be achieved by using scalable hardware and software architectures.
