DMBI ASSIGNMENT 2
Q.1. Explain STAR schema.
Ans-1
The star schema is the simplest and most fundamental of the data mart schemas. It is widely
used to build data warehouses and dimensional data marts. It consists of one or more fact
tables referencing any number of dimension tables. The star schema is a special case of the
snowflake schema, and it is efficient for handling simple queries.
It is called a star schema because its physical model resembles a star, with a fact table at its
center and the dimension tables around it forming the star's points. Below is an
example to demonstrate the star schema:
In this example, SALES is the fact table. Its attributes (Product ID, Order ID,
Customer ID, Employee ID, Total, Quantity, Discount) include the keys that reference the
dimension tables. The Employee dimension table contains the attributes: Emp ID, Emp Name, Title,
Department and Region. The Product dimension table contains the attributes: Product ID, Product
Name, Product Category, Unit Price. The Customer dimension table contains the attributes:
Customer ID, Customer Name, Address, City, Zip. The Time dimension table contains the
attributes: Order ID, Order Date, Year, Quarter, Month.
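The fact/dimension layout described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: it keeps two of the dimensions (Product and Customer), and the column names and sample rows are made up for the example, not taken from the assignment's diagram.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    product_category TEXT,
    unit_price REAL
);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    customer_name TEXT,
    city TEXT
);
-- Fact table: foreign keys to the dimensions plus numeric measures.
CREATE TABLE sales (
    order_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES product(product_id),
    customer_id INTEGER REFERENCES customer(customer_id),
    quantity INTEGER,
    total REAL
);
INSERT INTO product VALUES (1, 'Pen', 'Stationery', 2.0), (2, 'Lamp', 'Home', 15.0);
INSERT INTO customer VALUES (10, 'Asha', 'Pune'), (11, 'Ravi', 'Mumbai');
INSERT INTO sales VALUES (100, 1, 10, 5, 10.0), (101, 2, 10, 1, 15.0), (102, 1, 11, 3, 6.0);
""")

# A typical star-schema query: one join from the fact table to a dimension,
# then aggregate a measure by a descriptive attribute.
rows = conn.execute("""
    SELECT p.product_category, SUM(s.total)
    FROM sales s
    JOIN product p ON p.product_id = s.product_id
    GROUP BY p.product_category
    ORDER BY p.product_category
""").fetchall()
print(rows)
```

Each dimension is reached from the fact table with a single join, which is why simple queries stay short in this layout.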
Model of Star Schema –
In a star schema, business process data, which holds the quantitative data about a business, is
stored in fact tables, while dimensions hold the descriptive characteristics related to the fact
data. Sales price, sale quantity, distance, speed, and weight measurements are a few
examples of fact data in a star schema.
A star schema with many dimensions is often termed a centipede schema. A star schema whose
dimensions have only a few attributes each is easy to handle.
Ans-2
The snowflake schema is a variant of the star schema. Here, the centralized fact table is
connected to multiple dimensions. In the snowflake schema, dimensions are present in a
normalized form in multiple related tables. The snowflake structure materializes when the
dimensions of a star schema are detailed and highly structured, having several levels of
relationship, and the child tables have multiple parent tables. The snowflake effect affects only
the dimension tables and does not affect the fact tables.
The Employee dimension table now contains the attributes: EmployeeID, EmployeeName,
DepartmentID, Region, Territory. The DepartmentID attribute links the Employee table with
the Department dimension table. The Department dimension is used to provide detail about
each department, such as the Name and Location of the department. The Customer dimension
table now contains the attributes: CustomerID, CustomerName, Address, CityID. The CityID
attribute links the Customer dimension table with the City dimension table.
The City dimension table has details about each city such as CityName, Zipcode, State and
Country.
The main difference between the star schema and the snowflake schema is that the dimension tables
of the snowflake schema are maintained in normalized form to reduce redundancy. The
advantage is that such normalized tables are easy to maintain and save storage space.
However, it also means that more joins are needed to execute a query, which can adversely
affect query performance.
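The extra join cost can be seen in a small sketch using Python's built-in sqlite3 module. The Customer and City tables follow the attributes named above, but the exact column names and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (snowflaked) dimension: city details live in their own table.
CREATE TABLE city (
    city_id INTEGER PRIMARY KEY,
    city_name TEXT,
    state TEXT,
    country TEXT
);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    customer_name TEXT,
    city_id INTEGER REFERENCES city(city_id)
);
CREATE TABLE sales (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    total REAL
);
INSERT INTO city VALUES (1, 'Pune', 'Maharashtra', 'India'), (2, 'Toronto', 'Ontario', 'Canada');
INSERT INTO customer VALUES (10, 'Asha', 1), (11, 'Ravi', 1), (12, 'Liam', 2);
INSERT INTO sales VALUES (100, 10, 10.0), (101, 11, 6.0), (102, 12, 20.0);
""")

# Reaching the country attribute now takes two joins (sales -> customer -> city);
# in a star schema the same attribute would sit directly in the customer table.
sales_by_country = conn.execute("""
    SELECT ci.country, SUM(s.total)
    FROM sales s
    JOIN customer c ON c.customer_id = s.customer_id
    JOIN city ci ON ci.city_id = c.city_id
    GROUP BY ci.country
    ORDER BY ci.country
""").fetchall()
print(sales_by_country)
```

The city name, state and country are stored once per city instead of once per customer, which is the redundancy saving; the second join is the price paid for it.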
Ans-3
Ans-4
OLAP Operations
Since OLAP servers are based on a multidimensional view of data, we will discuss OLAP
operations in multidimensional data. The operations are:
Roll-up
Drill-down
Slice
Dice
Pivot (rotate)
Roll-up
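Roll-up performs aggregation on a data cube in either of the following ways:
By climbing up a concept hierarchy for a dimension
By dimension reduction
For example, take the location dimension, whose concept hierarchy is "street < city < province < country".
On rolling up, the data is aggregated by ascending the location hierarchy from the
level of city to the level of country.
When roll-up is performed by dimension reduction, one or more dimensions are removed from the data cube.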
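Both forms of roll-up can be sketched in plain Python. The cube below is a hypothetical dictionary keyed by (city, quarter); the cities, the city-to-country mapping and the numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter) -> units sold.
cells = {
    ("Toronto", "Q1"): 10, ("Vancouver", "Q1"): 20,
    ("Chicago", "Q1"): 30, ("New York", "Q1"): 40,
    ("Toronto", "Q2"): 5, ("Chicago", "Q2"): 15,
}

# One step of the location hierarchy (city < country), assumed for the example.
city_to_country = {
    "Toronto": "Canada", "Vancouver": "Canada",
    "Chicago": "USA", "New York": "USA",
}

def roll_up_location(cells, hierarchy):
    """Climb the location hierarchy: replace each city by its country and sum."""
    totals = defaultdict(int)
    for (city, quarter), units in cells.items():
        totals[(hierarchy[city], quarter)] += units
    return dict(totals)

def drop_dimension(cells, axis):
    """Dimension reduction: remove one coordinate from every key and sum."""
    totals = defaultdict(int)
    for key, units in cells.items():
        totals[key[:axis] + key[axis + 1:]] += units
    return dict(totals)

by_country = roll_up_location(cells, city_to_country)
by_city_only = drop_dimension(cells, axis=1)  # the quarter dimension is removed
print(by_country)
print(by_city_only)
```

In both cases the result has fewer, coarser cells than the input, and the totals are preserved.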
Drill-down
Drill-down is the reverse operation of roll-up. It is performed in either of the following ways:
By stepping down a concept hierarchy for a dimension
By introducing a new dimension
For example, take the time dimension, whose concept hierarchy is "day < month < quarter < year".
On drilling down, the time dimension is descended from the level of quarter to the
level of month.
When drill-down is performed by introducing a new dimension, one or more dimensions are
added to the data cube.
It navigates the data from less detailed data to highly detailed data.
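A rough sketch of drilling down from quarter to month in plain Python is given here. It assumes the base facts are kept at month level, since drill-down can only show detail that is actually stored; the months, the mapping and the numbers are made up.

```python
from collections import defaultdict

# Hypothetical base facts at month grain: (month, units sold).
facts = [("Jan", 10), ("Feb", 20), ("Mar", 30), ("Apr", 5), ("May", 15)]

# One step of the time hierarchy (month < quarter), assumed for the example.
month_to_quarter = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}

def by_quarter(facts):
    """Summary view at quarter level."""
    totals = defaultdict(int)
    for month, units in facts:
        totals[month_to_quarter[month]] += units
    return dict(totals)

def drill_down(facts, quarter):
    """Descend from one quarter to its months."""
    totals = defaultdict(int)
    for month, units in facts:
        if month_to_quarter[month] == quarter:
            totals[month] += units
    return dict(totals)

quarter_view = by_quarter(facts)
q1_months = drill_down(facts, "Q1")
print(quarter_view)
print(q1_months)
```

The monthly figures for Q1 add up to the Q1 cell of the quarter view, which is what makes the two operations inverses of each other.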
Slice
The slice operation selects a single value on one dimension of a given cube and provides a new
sub-cube with that dimension fixed. Consider the following diagram that shows how slice works.
Here Slice is performed for the dimension "time" using the criterion time = "Q1".
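A slice with the criterion time = "Q1" can be sketched as follows. The cube is a hypothetical dictionary keyed by (time, location, item), and its contents are invented for illustration.

```python
# Hypothetical cube cells: (time, location, item) -> units sold.
cube = {
    ("Q1", "Toronto", "Mobile"): 10,
    ("Q1", "Toronto", "Modem"): 20,
    ("Q1", "Chicago", "Mobile"): 30,
    ("Q2", "Toronto", "Mobile"): 5,
    ("Q2", "Chicago", "Modem"): 15,
}

def slice_cube(cube, axis, value):
    """Keep cells whose coordinate on `axis` equals `value`, then drop that axis."""
    return {
        key[:axis] + key[axis + 1:]: units
        for key, units in cube.items()
        if key[axis] == value
    }

q1_slice = slice_cube(cube, axis=0, value="Q1")  # time = "Q1"
print(q1_slice)
```

The result has one dimension fewer than the cube: only location and item remain as coordinates.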
Dice
Dice selects values on two or more dimensions of a given cube and provides a new sub-cube.
Consider the following diagram that shows the dice operation.
The dice operation on the cube based on the following selection criteria involves three
dimensions.
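As an illustration only, a dice over three dimensions can be sketched in plain Python. The cube contents and the selection criteria below are hypothetical values chosen for the example; they are not taken from the diagram referred to above.

```python
# Hypothetical cube cells: (time, location, item) -> units sold.
cube = {
    ("Q1", "Toronto", "Mobile"): 10,
    ("Q2", "Toronto", "Modem"): 20,
    ("Q3", "Toronto", "Mobile"): 30,
    ("Q1", "Chicago", "Mobile"): 40,
    ("Q2", "Vancouver", "Phone"): 50,
    ("Q2", "Vancouver", "Mobile"): 60,
}

def dice(cube, criteria):
    """Keep cells that satisfy every criterion.

    `criteria` maps an axis index to the set of values allowed on that axis.
    All dimensions are kept, so the result is a smaller cube of the same shape.
    """
    return {
        key: units
        for key, units in cube.items()
        if all(key[axis] in allowed for axis, allowed in criteria.items())
    }

# Made-up criteria on three dimensions: time, location and item.
sub_cube = dice(cube, {
    0: {"Q1", "Q2"},
    1: {"Toronto", "Vancouver"},
    2: {"Mobile", "Modem"},
})
print(sub_cube)
```

Unlike slice, dice does not drop a dimension; it restricts several dimensions at once to a subset of their values.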
Pivot
The pivot operation is also known as rotation. It rotates the data axes in view in order to
provide an alternative presentation of data. Consider the following diagram that shows the
pivot operation.
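For a two-dimensional view, rotation amounts to swapping rows and columns. A minimal sketch in plain Python follows, with a made-up table of locations by quarters.

```python
# Hypothetical 2-D view: rows are locations, columns are quarters.
table = {
    "Toronto": {"Q1": 10, "Q2": 5},
    "Chicago": {"Q1": 30, "Q2": 15},
}

def pivot(table):
    """Rotate the view so that the column labels become the row labels."""
    rotated = {}
    for row, columns in table.items():
        for column, value in columns.items():
            rotated.setdefault(column, {})[row] = value
    return rotated

rotated = pivot(table)
print(rotated)
```

No value is aggregated or filtered; the same cells are only presented along different axes, and pivoting twice returns the original view.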