DMDW-MDM L8,9
define dimension time as (time_key, day, day_of_week, month, quarter, year)
define dimension item as (item_key, item_name, brand, type, supplier_type)
define dimension location as (location_key, street, city, province_or_state, country)
define cube shipping [time, item, shipper, from_location, to_location]:
    dollars_cost = sum(cost_in_dollars), units_shipped = count(*)
define dimension time as time in cube sales
define dimension item as item in cube sales
define dimension shipper as (shipper_key, shipper_name, location as location in cube sales,
    shipper_type)
define dimension from_location as location in cube sales
define dimension to_location as location in cube sales
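The two measures of the shipping cube can be illustrated with a minimal Python sketch. The fact rows and the helper name build_shipping_cube are hypothetical, chosen only to show how sum(cost in dollars) and count(*) accumulate per cell; this is not a DMQL implementation.

```python
from collections import defaultdict

# Hypothetical fact rows:
# (time, item, shipper, from_location, to_location, cost_in_dollars)
shipping_facts = [
    ("2024-Q1", "item1", "shipperA", "Vancouver", "Toronto", 120.0),
    ("2024-Q1", "item1", "shipperA", "Vancouver", "Toronto", 80.0),
    ("2024-Q1", "item2", "shipperB", "Chicago", "New York", 50.0),
]


def build_shipping_cube(facts):
    """Aggregate fact rows into cells keyed by the five dimension values."""
    cube = defaultdict(lambda: {"dollars_cost": 0.0, "units_shipped": 0})
    for *dims, cost in facts:
        cell = cube[tuple(dims)]
        cell["dollars_cost"] += cost   # sum(cost_in_dollars)
        cell["units_shipped"] += 1     # count(*)
    return dict(cube)


cube = build_shipping_cube(shipping_facts)
```

Each distinct combination of dimension values becomes one cell, so the two identical Vancouver-to-Toronto rows share a cell.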
Concept Hierarchies
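❖ A concept hierarchy is a sequence of mappings from a set of low-level concepts to higher-level, more general concepts.
❖ Example concept hierarchies for the dimensions location and time:
➢ Location: city values such as Vancouver, Toronto, New York, and Chicago map to a province or state, which in turn maps to a country.
➢ Time: day maps to month, month to quarter, and quarter to year.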
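A concept hierarchy for location can be sketched as a chain of lookup tables. The dictionary and function names below are illustrative assumptions; only the four cities come from the slides.

```python
# Each dictionary is one mapping step in the hierarchy:
# city -> province_or_state -> country
CITY_TO_PROVINCE = {
    "Vancouver": "British Columbia",
    "Toronto": "Ontario",
    "New York": "New York",
    "Chicago": "Illinois",
}
PROVINCE_TO_COUNTRY = {
    "British Columbia": "Canada",
    "Ontario": "Canada",
    "New York": "USA",
    "Illinois": "USA",
}


def generalize(city, level):
    """Map a city to the requested level of the location hierarchy."""
    if level == "city":
        return city
    province = CITY_TO_PROVINCE[city]
    if level == "province_or_state":
        return province
    if level == "country":
        return PROVINCE_TO_COUNTRY[province]
    raise ValueError(f"unknown level: {level}")
```

For example, generalize("Toronto", "country") climbs two steps of the hierarchy.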
OLAP Process
OLAP Operations: Drill-down
The drill-down operation lets a user zoom in on the data cube, navigating from
less detailed data to more detailed data. It can be implemented either by
stepping down a concept hierarchy for a dimension or by adding dimensions
to the hypercube.
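As a rough sketch of drill-down by stepping down a concept hierarchy, the example below moves from quarter totals to the months inside one quarter. The figures and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical base data at the month level: (quarter, month) -> units sold
sales_by_month = {
    ("Q1", "Jan"): 100,
    ("Q1", "Feb"): 150,
    ("Q1", "Mar"): 120,
    ("Q2", "Apr"): 90,
}


def summarize_by_quarter(data):
    """Less detailed view: one total per quarter."""
    totals = defaultdict(int)
    for (quarter, _month), units in data.items():
        totals[quarter] += units
    return dict(totals)


def drill_down(data, quarter):
    """More detailed view: step down from quarter to month for one quarter."""
    return {month: units for (q, month), units in data.items() if q == quarter}
```

The quarter view shows a single number for Q1, while drill_down(sales_by_month, "Q1") exposes the monthly values behind it.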
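Roll-up
Roll-up is the inverse of drill-down: it aggregates the data by climbing up a
concept hierarchy for a dimension or by removing dimensions from the cube.
Slice
The slice operation performs a selection on a single dimension of the given cube,
fixing it to one value, to generate a new sub-cube. It presents the information from another point of view.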
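A slice can be sketched as fixing one dimension to a single value and dropping that dimension from the result. The cube contents and the slice_cube helper are hypothetical.

```python
DIMENSIONS = ("time", "item", "location")

# Hypothetical cube: (time, item, location) -> units sold
cube = {
    ("Q1", "phone", "Toronto"): 605,
    ("Q1", "laptop", "Toronto"): 825,
    ("Q2", "phone", "Vancouver"): 680,
}


def slice_cube(cube, dimension, value):
    """Keep cells where `dimension` equals `value`; drop that dimension."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }
```

Slicing on time = "Q1" leaves a two-dimensional sub-cube keyed by (item, location).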
Pivot
It is used to provide an alternate view of the data available to the users. It is also
known as Rotate operation as it rotates the cube’s orientation to view the data
from different perspectives.
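For a two-dimensional view, pivoting amounts to swapping the row and column axes. The sketch below uses a hypothetical nested dictionary as the view; the values are illustrative only.

```python
# Hypothetical 2-D view: rows are locations, columns are items
view = {
    "Toronto": {"phone": 605, "laptop": 825},
    "Vancouver": {"phone": 680, "laptop": 952},
}


def pivot(view):
    """Rotate the view so that columns become rows and rows become columns."""
    rotated = {}
    for row, columns in view.items():
        for column, value in columns.items():
            rotated.setdefault(column, {})[row] = value
    return rotated
```

The data is unchanged; only the orientation differs, so pivoting twice returns the original view.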
Starnet Query Model
❖ The querying of multidimensional databases can be based on a starnet
model.
❖ A starnet model consists of radial lines emanating from a central point, where
each line represents a concept hierarchy for a dimension.
❖ Each abstraction level in the hierarchy is called a footprint.
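One way to picture a starnet model in code is a mapping from each dimension (a radial line) to its ordered footprints. The levels below are taken from the time and location dimensions defined earlier; the structure and helper name are assumptions for illustration.

```python
# Each key is a radial line; each list holds its footprints, finest first
STARNET = {
    "time": ["day", "month", "quarter", "year"],
    "location": ["street", "city", "province_or_state", "country"],
}


def coarser_footprint(dimension, level):
    """Return the next more general footprint on a line, or None at the top."""
    footprints = STARNET[dimension]
    position = footprints.index(level)
    if position + 1 < len(footprints):
        return footprints[position + 1]
    return None
```

Moving outward or inward along a line in this structure corresponds to rolling up or drilling down on that dimension.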
Data Warehouse Architecture
Principles of Data Warehousing
Single-Tier Data Warehouse Architecture
Two-Tier Data Warehouse Architecture
❖ The Data Tier in the two-tier architecture of the data warehouse is the layer where the
actual data is stored after ETL processes have loaded it into the database or the data
warehouse.
❖ The staging area, where the ETL processes run within the Data Tier, ensures
that all data loaded into the warehouse is cleansed and in the appropriate format.
Three-Tier Data Warehouse Architecture: Bottom Tier
❖ The Bottom Tier in the three-tier architecture of a data warehouse consists of the Data
Repository.
❖ The Data Repository is the storage space for data extracted from various data sources,
which passes through a series of activities as part of the ETL (Extract, Transform, and
Load) process.
❖ The extracted data is then cleaned to remove duplicate or junk records carried over from
its source storage units.
❖ The next step is to transform all of this data into a single storage format.
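The extract, clean, transform, and load steps can be sketched as four small functions. The sources, field names, and cleaning rules are hypothetical; a real ETL pipeline would be considerably more involved.

```python
def extract(sources):
    """Extract: gather raw rows from every source into one list."""
    return [dict(row) for source in sources for row in source]


def clean(rows):
    """Clean: drop junk rows (empty customer) and exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if not row["customer"].strip() or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def transform(rows):
    """Transform: bring every row into a single storage format."""
    return [
        {"customer": row["customer"].strip().title(), "amount": float(row["amount"])}
        for row in rows
    ]


def load(rows, repository):
    """Load: append the prepared rows to the data repository."""
    repository.extend(rows)
    return repository


# Hypothetical operational sources with inconsistent formatting
source_a = [{"customer": " alice ", "amount": "10.50"}, {"customer": "Bob", "amount": "20"}]
source_b = [{"customer": "Bob", "amount": "20"}, {"customer": "", "amount": "5"}]

repository = load(transform(clean(extract([source_a, source_b]))), [])
```

Here the duplicate Bob row and the row with no customer are removed before the remaining rows are normalised and stored.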
Three-Tier Data Warehouse Architecture: Middle Tier
❖ The Middle Tier is the tier that hosts the OLAP servers.
❖ There are three OLAP server models:
➢ ROLAP: Relational online analytical processing performs dynamic multidimensional analysis of data
stored in a relational database, instead of reorganising the relational database into a
multidimensional database.
➢ MOLAP: Multidimensional online analytical processing stores and indexes data directly in a
multidimensional database system.
➢ HOLAP: Hybrid online analytical processing combines the relational and multidimensional online
analytical processing models.
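The difference between the ROLAP and MOLAP styles can be sketched with the standard library. The ROLAP-style function aggregates at query time with SQL over a relational table, while the MOLAP-style function precomputes cells in a multidimensional structure. The table name, rows, and the "ALL" marker are assumptions for illustration, not how any particular OLAP server works.

```python
import sqlite3
from collections import defaultdict

# Hypothetical fact rows: (quarter, item, units)
rows = [("Q1", "phone", 605), ("Q1", "laptop", 825), ("Q2", "phone", 680)]

# ROLAP style: data stays relational and is aggregated when queried
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter TEXT, item TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


def rolap_units_by_quarter(conn):
    """Aggregate on demand with a GROUP BY query."""
    query = "SELECT quarter, SUM(units) FROM sales GROUP BY quarter"
    return dict(conn.execute(query))


# MOLAP style: every cell, including subtotals, is precomputed and stored
def build_molap_cube(rows):
    """Store base cells plus 'ALL' aggregates keyed by (quarter, item)."""
    cube = defaultdict(int)
    for quarter, item, units in rows:
        for q in (quarter, "ALL"):
            for i in (item, "ALL"):
                cube[(q, i)] += units
    return dict(cube)


molap_cube = build_molap_cube(rows)
```

A HOLAP-style design would combine the two, for example keeping detailed rows relational and only the summaries in the precomputed cube.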
Three-Tier Data Warehouse Architecture: Top Tier
❖ The Top Tier is the front-end layer, that is, the user interface through which users
connect to the database systems.
❖ This user interface is usually a tool or an API call used to fetch the data required for
reporting, analysis, and data mining.
❖ If the Top Tier relies on a poorly designed front-end tool, users cannot reach the data effectively, and the whole data warehouse architecture can fail however well the lower tiers are built.
Data Warehouse Models
❖ Enterprise Warehouse
➢ A centralised system integrating data from all functions.
➢ Supports extensive queries and analysis for the entire organisation.
➢ Example: A large retail chain tracking inventory, sales, and customer trends across all
stores.
❖ Data Mart
➢ A smaller, department-specific subset of the warehouse.
➢ Designed for quick access and targeted analysis.
➢ Example: A finance team analysing quarterly budgets and expenditures.
❖ Virtual Warehouse
➢ Provides on-demand views of operational data.
➢ Focuses on quick access rather than permanent storage.
➢ Example: An e-commerce company generating real-time reports on daily sales
performance.
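A virtual warehouse can be approximated as a set of views over operational tables: nothing is stored permanently, and each query sees the current data. The sketch below uses SQLite from the standard library; the orders table and daily_sales view are hypothetical names.

```python
import sqlite3

# Hypothetical operational database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_day TEXT, department TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2024-01-01", "finance", 100.0),
        ("2024-01-01", "retail", 250.0),
        ("2024-01-02", "retail", 50.0),
    ],
)

# The view holds no data of its own; it is evaluated on demand
conn.execute(
    "CREATE VIEW daily_sales AS "
    "SELECT order_day, SUM(amount) AS total FROM orders GROUP BY order_day"
)


def daily_sales(conn):
    """Return the current daily totals as seen through the view."""
    query = "SELECT order_day, total FROM daily_sales ORDER BY order_day"
    return conn.execute(query).fetchall()
```

Because the view is computed at query time, a new row inserted into orders appears in the next call to daily_sales without any reload step.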
Benefits of Three-Tier Data Warehouse System
❖ Scalability:
➢ Handles growing data volumes without redesigning the whole system.
➢ Supports an increasing number of users.
❖ Separation of Concerns:
➢ Keeps transactional and analytical processing distinct.
➢ Ensures faster analysis without affecting daily operations.
❖ Improved Query Performance:
➢ Prepares data in advance, for example through aggregation, so that analytical queries run quickly.
➢ Returns results fast enough for interactive analysis.
❖ Flexibility:
➢ Adapts to new data sources easily.
➢ Integrates with modern tools for enhanced analysis.
❖ Data Quality Assurance:
➢ Cleanses and standardises data before storage.
➢ Reduces errors and ensures reliability.
Challenges of Three-Tier Data Warehouse System