Data Warehouse Structure

This document defines and explains multi-dimensional data models, including data cubes, star schemas, snowflake schemas, and fact constellation schemas. A multi-dimensional model views data as cubes with dimensions and facts. Star schemas organize data around a central fact table linked to dimension tables, while snowflake schemas further normalize the dimension tables. Fact constellation schemas involve two or more fact tables sharing dimensions.

Uploaded by

shital7028733151

What is Multi-Dimensional Data Model?
A multidimensional model views data in the form of a data-cube. A data cube
enables data to be modeled and viewed in multiple dimensions. It is defined
by dimensions and facts.

The dimensions are the perspectives or entities concerning which an
organization keeps records. For example, a shop may create a sales data
warehouse to keep records of the store's sales for the dimensions time, item,
and location. These dimensions allow the store to keep track of things such as
monthly sales of items and the locations at which the items were
sold. Each dimension has a table related to it, called a dimension table,
which describes the dimension further. For example, a dimension table for
item may contain the attributes item_name, brand, and type.

A multidimensional data model is organized around a central theme, for
example, sales. This theme is represented by a fact table. Facts are numerical
measures. The fact table contains the names of the facts, or measures, as well
as keys to each of the related dimension tables.

Consider the data of a shop for items sold per quarter in the city of Delhi. The
data is shown in the table. In this 2D representation, the sales for Delhi are
shown for the time dimension (organized in quarters) and the item dimension
(classified according to the types of item sold). The fact or measure
displayed is rupee_sold (in thousands).
What is Data Cube?
Data that is grouped or combined into multidimensional matrices is called a
data cube. The data cube approach is closely associated with several related
terms, such as "multidimensional databases," "materialized views," and "OLAP
(On-Line Analytical Processing)."
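
As a minimal sketch of the idea, the Python below treats a handful of sales
records as cells of a three-dimensional cube (time, item, location) and rolls
the cube up along chosen dimensions. The figures, the item types, the second
city, and the helper name roll_up are all invented for illustration; only the
dimension names and the rupee_sold measure come from the text.

```python
from collections import defaultdict

# Cells of a sales cube: (quarter, item_type, city) -> rupee_sold (in thousands).
# Every number here is made up for the example.
CELLS = {
    ("Q1", "phone", "Delhi"): 100,
    ("Q1", "laptop", "Delhi"): 200,
    ("Q2", "phone", "Delhi"): 150,
    ("Q1", "phone", "Mumbai"): 80,
    ("Q2", "laptop", "Mumbai"): 120,
}

# Order of the coordinates inside each cell key.
DIMENSIONS = ("time", "item", "location")


def roll_up(cells, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for coords, value in cells.items():
        totals[tuple(coords[p] for p in positions)] += value
    return dict(totals)


by_city = roll_up(CELLS, keep=("location",))
by_quarter_and_item = roll_up(CELLS, keep=("time", "item"))
grand_total = roll_up(CELLS, keep=())

print(by_city)
print(by_quarter_and_item)
print(grand_total)
```

Keeping two dimensions reproduces the kind of 2D table described earlier
(quarters against item types), while keeping none collapses the cube to a
single grand total.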

What is Star Schema?


A star schema is the elementary form of a dimensional model, in which data
are organized into facts and dimensions. A fact is an event that is counted or
measured, such as a sale or a login. A dimension includes reference data about
the fact, such as date, item, or customer.

A star schema is a relational schema whose design represents a
multidimensional data model. The star schema is the simplest data
warehouse schema. It is known as a star schema because the entity-relationship
diagram of this schema resembles a star, with points diverging from a central
table. The center of the schema consists of a large fact table, and the points of
the star are the dimension tables.
Fact Tables
A fact table is a table in a star schema that contains facts and is connected
to dimensions. A fact table has two types of columns: those that contain facts
and those that are foreign keys to the dimension tables. The primary key of a
fact table is generally a composite key made up of all of its foreign keys.

A fact table might contain either detail-level facts or facts that have been
aggregated (fact tables that contain aggregated facts are often called
summary tables instead). A fact table generally contains facts at the same
level of aggregation.

Dimension Tables
A dimension is a structure usually composed of one or more hierarchies
that categorize data. If a dimension has no hierarchies or levels, it is
called a flat dimension or list. The primary key of each dimension
table is part of the composite primary key of the fact table. Dimensional
attributes help to describe the dimensional value. They are generally
descriptive, textual values. Dimension tables are usually smaller than fact
tables. Fact tables store data about sales, while dimension tables store data
about the geographic regions (markets, cities), clients, products, times, and
channels.

Characteristics of Star Schema


The star schema is well suited to data warehouse database design
because of the following features:

o It creates a denormalized database that can quickly provide query
responses.
o It provides a flexible design that can be changed easily or added to
throughout the development cycle, and as the database grows.
o It provides a parallel in design to how end users typically think of and
use the data.
o It reduces the complexity of metadata for both developers and end users.

Advantages of Star Schema


Star schemas are easy for end users and applications to understand and
navigate. With a well-designed schema, users can quickly analyze
large, multidimensional data sets.

The main advantages of star schemas in a decision-support environment are:


Example: Suppose a star schema is composed of a fact table, SALES, and
several dimension tables connected to it for time, branch, item, and
geographic locations.

The TIME table has a column for each day, month, quarter, and year. The ITEM
table has columns for each item_key, item_name, brand, type, supplier_type.
The BRANCH table has columns for each branch_key, branch_name,
branch_type. The LOCATION table has columns of geographic data, including
street, city, state, and country.

In this scenario, the SALES table contains only four columns with IDs from the
dimension tables, TIME, ITEM, BRANCH, and LOCATION, instead of four
columns for TIME data, four columns for ITEM data, three columns for
BRANCH data, and four columns for LOCATION data. Thus, the size of the fact
table is significantly reduced. When we need to change an item, we need only
make a single change in the dimension table, instead of making many
changes in the fact table.
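
The example above can be sketched with Python's built-in sqlite3 module. This
is only an illustration under stated assumptions: the _dim table suffix, the
time_key and location_key column names, the rupee_sold measure column on the
fact table, and all row values are invented here; the remaining column names
follow the text.

```python
import sqlite3

# Assumptions for this sketch: the _dim suffix on table names, the time_key and
# location_key columns, the rupee_sold measure, and every inserted value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, quarter TEXT, year INTEGER);
CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT, supplier_type TEXT);
CREATE TABLE branch_dim   (branch_key INTEGER PRIMARY KEY, branch_name TEXT, branch_type TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, street TEXT, city TEXT, state TEXT, country TEXT);

-- Fact table: one foreign key per dimension plus the measure.
CREATE TABLE sales (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    rupee_sold   INTEGER,  -- in thousands, illustrative
    PRIMARY KEY (time_key, item_key, branch_key, location_key)
);

INSERT INTO time_dim VALUES (1, 15, 1, 'Q1', 2024), (2, 20, 4, 'Q2', 2024);
INSERT INTO item_dim VALUES (1, 'Phone X', 'Acme', 'phone', 'wholesale'),
                            (2, 'Laptop Y', 'Acme', 'laptop', 'wholesale');
INSERT INTO branch_dim VALUES (1, 'Main', 'retail');
INSERT INTO location_dim VALUES (1, 'MG Road', 'Delhi', 'Delhi', 'India');
INSERT INTO sales VALUES (1, 1, 1, 1, 100), (1, 2, 1, 1, 200), (2, 1, 1, 1, 150);
""")

# A typical star join: the fact table joined directly to two dimensions.
by_quarter_and_type = conn.execute("""
    SELECT t.quarter, i.type, SUM(s.rupee_sold)
    FROM sales s
    JOIN time_dim t ON t.time_key = s.time_key
    JOIN item_dim i ON i.item_key = s.item_key
    GROUP BY t.quarter, i.type
    ORDER BY t.quarter, i.type
""").fetchall()

# Changing an item touches one dimension row; no fact rows are rewritten.
conn.execute("UPDATE item_dim SET brand = 'Apex' WHERE item_key = 1")
by_brand = conn.execute("""
    SELECT i.brand, SUM(s.rupee_sold)
    FROM sales s
    JOIN item_dim i ON i.item_key = s.item_key
    GROUP BY i.brand
    ORDER BY i.brand
""").fetchall()

print(by_quarter_and_type)
print(by_brand)
```

The single UPDATE on item_dim is enough for every past and future sale of that
item to report the new brand, which is the maintenance benefit the paragraph
above describes.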

We can create even more complex star schemas by normalizing a dimension
table into several tables. The resulting schema, with its normalized dimension
tables, is called a snowflake schema.

What is Snowflake Schema?


A snowflake schema is a variant of the star schema. "A schema is known as a
snowflake if one or more dimension tables do not connect directly to the fact
table but must join through other dimension tables."

The snowflake schema is an expansion of the star schema where each point of
the star explodes into more points. It is called a snowflake schema because the
diagram of the schema resembles a snowflake. Snowflaking is a
method of normalizing the dimension tables in a star schema. When we
normalize all the dimension tables entirely, the resultant structure resembles a
snowflake with the fact table in the middle.

Snowflaking can be used to improve the performance of certain queries. The
schema is diagrammed with each fact surrounded by its associated dimensions,
and those dimensions are related to other dimensions, branching out into a
snowflake pattern.

The snowflake schema consists of one fact table which is linked to many
dimension tables, which can be linked to other dimension tables through a
many-to-one relationship. Tables in a snowflake schema are generally
normalized to the third normal form. Each dimension table represents exactly
one level in a hierarchy.

The following diagram shows a snowflake schema with two dimensions, each
having three levels. A snowflake schema can have any number of dimensions,
and each dimension can have any number of levels.

Example: The figure shows a snowflake schema with a Sales fact table, with
Store, Location, Time, Product, Line, and Family dimension tables. The Market
dimension has two dimension tables, with Store as the primary dimension
table and Location as the outrigger dimension table. The Product dimension
has three dimension tables, with Product as the primary dimension table and
the Line and Family tables as the outrigger dimension tables.
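
The Product, Line, and Family chain from this example can be sketched in
Python with sqlite3 as follows. The column names, the units_sold measure, and
the sample rows are assumptions made for the sketch; the text names only the
tables. The fact table is reduced to a single dimension key to keep the
example short.

```python
import sqlite3

# Hypothetical columns and rows; only the table roles come from the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Outrigger tables: each holds exactly one level of the product hierarchy.
CREATE TABLE family_dim  (family_key INTEGER PRIMARY KEY, family_name TEXT);
CREATE TABLE line_dim    (line_key INTEGER PRIMARY KEY, line_name TEXT,
                          family_key INTEGER REFERENCES family_dim(family_key));
-- Primary dimension table: the only one the fact table points at.
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_name TEXT,
                          line_key INTEGER REFERENCES line_dim(line_key));
CREATE TABLE sales       (product_key INTEGER REFERENCES product_dim(product_key),
                          units_sold INTEGER);

INSERT INTO family_dim  VALUES (1, 'Electronics'), (2, 'Furniture');
INSERT INTO line_dim    VALUES (1, 'Phones', 1), (2, 'Laptops', 1), (3, 'Chairs', 2);
INSERT INTO product_dim VALUES (1, 'Phone X', 1), (2, 'Laptop Y', 2), (3, 'Chair Z', 3);
INSERT INTO sales       VALUES (1, 5), (2, 3), (3, 7), (1, 2);
""")

# Reaching the family level takes a join through every intermediate table.
units_by_family = conn.execute("""
    SELECT f.family_name, SUM(s.units_sold)
    FROM sales s
    JOIN product_dim p ON p.product_key = s.product_key
    JOIN line_dim l    ON l.line_key = p.line_key
    JOIN family_dim f  ON f.family_key = l.family_key
    GROUP BY f.family_name
    ORDER BY f.family_name
""").fetchall()

print(units_by_family)
```

In a star schema the family name would sit directly in the product dimension
table and the same report would need one join instead of three, which is the
trade-off between redundancy and query cost that snowflaking introduces.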

A star schema stores all attributes for a dimension in one denormalized table.
This requires more disk space than a more normalized snowflake schema.
Snowflaking normalizes the dimension by moving attributes with low
cardinality into separate dimension tables that relate to the core dimension
table by using foreign keys. Snowflaking for the sole purpose of minimizing
disk space is not recommended, because it can adversely impact query
performance.

In a snowflake schema, tables are normalized to remove redundancy: each
dimension table is broken down into multiple dimension tables.

The figure shows a simple STAR schema for sales in a manufacturing company.
The sales fact table includes quantity, price, and other relevant metrics.
SALESREP, CUSTOMER, PRODUCT, and TIME are the dimension tables.

The STAR schema for sales, as shown above, contains only five tables, whereas
the normalized version extends to eleven tables. Note that in the
snowflake schema, the attributes with low cardinality in each original
dimension table are removed to form separate tables. These new tables are
connected back to the original dimension table through artificial keys.

A snowflake schema is designed for flexible querying across more complex
dimensions and relationships. It is suitable for many-to-many and one-to-many
relationships between dimension levels.

What is Fact Constellation Schema?


A fact constellation means two or more fact tables sharing one or more
dimensions. It is also called a galaxy schema.

A fact constellation schema describes a logical structure of a data warehouse
or data mart. It can be designed with a collection of denormalized fact
tables and shared, conformed dimension tables.

A fact constellation schema is a sophisticated database design in which it is
difficult to summarize information. It can be implemented between
aggregate fact tables, or by decomposing a complex fact table into independent
simple fact tables.

Example: A fact constellation schema is shown in the figure below.

This schema defines two fact tables, sales and shipping. Sales is analyzed along
four dimensions, namely, time, item, branch, and location. The schema
contains a fact table for sales that includes keys to each of the four
dimensions, along with two measures: rupee_sold and units_sold. The
shipping table has five dimensions, or keys: item_key, time_key, shipper_key,
from_location, and to_location, and two measures: rupee_cost and
units_shipped.
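
A minimal Python sketch of this constellation, again using sqlite3, is shown
below. The fact table columns follow the text; the item_dim and time_dim table
names, their attribute columns, and every row value are assumptions. The
non-shared dimensions (branch, location, shipper) are left as bare integer
keys to keep the sketch short.

```python
import sqlite3

# Two fact tables sharing the item and time dimensions. Dimension table names,
# their attributes, and all values are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT);

CREATE TABLE sales (
    time_key INTEGER REFERENCES time_dim(time_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    branch_key INTEGER, location_key INTEGER,
    rupee_sold INTEGER, units_sold INTEGER
);
CREATE TABLE shipping (
    item_key INTEGER REFERENCES item_dim(item_key),
    time_key INTEGER REFERENCES time_dim(time_key),
    shipper_key INTEGER, from_location INTEGER, to_location INTEGER,
    rupee_cost INTEGER, units_shipped INTEGER
);

INSERT INTO item_dim VALUES (1, 'Phone X'), (2, 'Laptop Y');
INSERT INTO time_dim VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO sales    VALUES (1, 1, 1, 1, 100, 5), (1, 2, 1, 1, 200, 3), (2, 1, 1, 1, 150, 4);
INSERT INTO shipping VALUES (1, 1, 1, 1, 2, 10, 6), (2, 1, 1, 1, 2, 20, 3), (1, 2, 1, 1, 2, 12, 5);
""")

# Aggregate each fact table on its own, then combine the results through the
# shared item dimension. Joining the two fact tables row by row instead would
# pair every sale with every shipment of the same item and inflate the sums.
sold_vs_shipped = conn.execute("""
    SELECT i.item_name, s.units_sold, sh.units_shipped
    FROM item_dim i
    JOIN (SELECT item_key, SUM(units_sold) AS units_sold
          FROM sales GROUP BY item_key) s ON s.item_key = i.item_key
    JOIN (SELECT item_key, SUM(units_shipped) AS units_shipped
          FROM shipping GROUP BY item_key) sh ON sh.item_key = i.item_key
    ORDER BY i.item_name
""").fetchall()

print(sold_vs_shipped)
```

Because both fact tables use the same item_key values, results from sales and
shipping can be lined up side by side; this is what sharing a dimension
between fact tables makes possible.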

The primary disadvantage of the fact constellation schema is that it is a more
challenging design because many variants for specific kinds of aggregation
must be considered and selected.
