Data Warehouse and Data Mining

The document discusses two data modeling techniques used in data warehousing: the star schema and the snowflake schema. The star schema features a central fact table surrounded by dimension tables, offering simplicity and optimized query performance but may introduce redundancy. In contrast, the snowflake schema normalizes dimension tables to reduce redundancy and enhance data integrity, though it can lead to increased complexity and slower query performance.

Uploaded by

slathajanuary


Star Schema

The star schema is one of the most popular data modeling techniques used in data warehousing. Its structure is relatively simple, which makes it easy to understand and conducive to good query performance. Here's a brief overview:

1. Central Fact Table:

● At the heart of the star schema is the fact table. This table contains the quantitative data (often called "facts" or "measures") about specific events or transactions. Examples of facts include sales revenue, quantities sold, profit, etc.

● The fact table usually has a composite primary key made up of foreign keys that link to associated dimension tables. This composite key helps in relating facts to their descriptive context.

2. Dimension Tables:

● Surrounding the central fact table are several dimension tables. Each dimension table provides context for the data stored in the fact table.

● Dimension tables typically contain descriptive, textual, or categorical information, often referred to as "attributes." These attributes give context to the quantitative data in the fact table.

● Examples of dimension tables could be: "Time" (with attributes like day, week, month, quarter, year), "Product" (with attributes like product name, category, manufacturer), "Customer" (with attributes like customer name, address, and phone number), and so forth.

● Each dimension table is linked to the fact table by a primary-to-foreign key relationship.

3. Characteristics:

● Simplicity: One of the main advantages of the star schema is its simplicity. The clear distinction between fact and dimension tables makes it easy for end-users and developers to understand the database structure.

● Performance: Due to its denormalized nature, the star schema is optimized for query performance. Queries often require fewer joins in a star schema than in more normalized structures like the snowflake schema.

● Scalability: New dimensions or facts can be added without changing the existing structure, making the star schema flexible and scalable.

4. Drawback:

● Redundancy: Because it's denormalized, the star schema can introduce data redundancy. This can lead to increased storage requirements and potential data integrity issues.
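The redundancy can be made concrete with a small count. The following is a minimal sketch in Python; the customer rows are hypothetical and serve only to show how location values repeat in a denormalized dimension.

```python
# Hypothetical rows of a denormalized Customer dimension:
# (customer name, city, state, country). The data is invented for illustration.
customers = [
    ("Asha", "Pune", "Maharashtra", "India"),
    ("Ravi", "Pune", "Maharashtra", "India"),
    ("Meena", "Pune", "Maharashtra", "India"),
    ("Bob", "Austin", "Texas", "USA"),
]

# Distinct (city, state, country) combinations that a normalized design
# would store exactly once.
locations = {row[1:] for row in customers}

# Number of location values physically stored in each design.
denormalized_location_values = len(customers) * 3
normalized_location_values = len(locations) * 3

print(denormalized_location_values, normalized_location_values)
```

Here the denormalized table stores every location value once per customer, while a normalized design would store each distinct location once and reference it by key.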

5. Usage:

● The star schema is primarily used in OLAP systems, which are designed for complex queries and aggregations, rather than OLTP systems, which are transaction-oriented.

In graphical representations, the structure resembles a star, with the fact table in the center and dimension tables radiating outward, hence the name "star schema," as depicted in Figure 5.7.

SALES is the fact table. Its attributes (Product ID, Order ID, Customer ID, Employee ID, Total, Quantity, Discount) include the keys that reference the dimension tables. The Employee dimension table contains the attributes Emp ID, Emp Name, Title, Department, and Region. The Product dimension table contains the attributes Product ID, Product Name, Product Category, and Unit Price. The Customer dimension table contains the attributes Customer ID, Customer Name, Address, City, and Zip. The Time dimension table contains the attributes Order ID, Order Date, Year, Quarter, and Month.
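The SALES example can be sketched in code. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the snake_case table and column names, the column types, and the sample rows are illustrative assumptions and are not taken from the figure.

```python
import sqlite3

# In-memory database. Names mirror the SALES example; types and rows are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY, product_name TEXT,
    product_category TEXT, unit_price REAL
);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY, customer_name TEXT,
    address TEXT, city TEXT, zip TEXT
);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY, emp_name TEXT,
    title TEXT, department TEXT, region TEXT
);
CREATE TABLE time_dim (
    order_id INTEGER PRIMARY KEY, order_date TEXT,
    year INTEGER, quarter INTEGER, month INTEGER
);
-- Fact table: one foreign key per dimension plus the numeric measures.
CREATE TABLE sales (
    product_id  INTEGER REFERENCES product(product_id),
    order_id    INTEGER REFERENCES time_dim(order_id),
    customer_id INTEGER REFERENCES customer(customer_id),
    emp_id      INTEGER REFERENCES employee(emp_id),
    total REAL, quantity INTEGER, discount REAL,
    PRIMARY KEY (product_id, order_id, customer_id, emp_id)
);
INSERT INTO product VALUES (1, 'Laptop', 'Electronics', 900.0),
                           (2, 'Desk', 'Furniture', 200.0);
INSERT INTO customer VALUES (1, 'Asha', '12 Main St', 'Pune', '411001');
INSERT INTO employee VALUES (1, 'Ravi', 'Sales Rep', 'Sales', 'West');
INSERT INTO time_dim VALUES (100, '2023-01-15', 2023, 1, 1),
                            (101, '2023-04-02', 2023, 2, 4);
INSERT INTO sales VALUES (1, 100, 1, 1, 1800.0, 2, 0.0),
                         (2, 100, 1, 1, 200.0, 1, 0.0),
                         (1, 101, 1, 1, 900.0, 1, 0.0);
""")

# Typical star-schema query: each dimension is exactly one join from the fact table.
rows = conn.execute("""
    SELECT p.product_category, t.year, SUM(s.total)
    FROM sales AS s
    JOIN product  AS p ON p.product_id = s.product_id
    JOIN time_dim AS t ON t.order_id   = s.order_id
    GROUP BY p.product_category, t.year
    ORDER BY p.product_category
""").fetchall()

for category, year, total in rows:
    print(category, year, total)
```

Each dimension is reached with a single join from the fact table, which is the property the "Performance" point above relies on.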

Snowflake Schema

The snowflake schema is another common data warehousing model, closely related to the star schema. Both are used for OLAP (Online Analytical Processing), but they differ structurally; a sample snowflake schema is shown in Figure 5.8. Here's an overview of the snowflake schema:

1. Normalized Dimension Tables:

● In the snowflake schema, dimension tables are normalized. That means the data is organized within the database to reduce redundancy and improve data integrity. This is done by dividing the data into additional tables, creating a structure that looks like a snowflake, hence the name.

● For instance, if you have a "Customer" dimension in a retail scenario, that dimension could be normalized into separate "Customer," "City," and "Country" tables instead of a single denormalized table containing all the information.

2. Complex Structure:

● Because of this normalization, the snowflake schema tends to have a more complex structure than the star schema. Queries can become more complex and involve more table joins, potentially leading to longer query times.

3. Reduced Data Redundancy:

● The main advantage of the snowflake schema is the reduction in data redundancy. This can lead to less storage space usage compared to the star schema.

● However, the space saved may be minimal compared to the overall size of the data warehouse, and this saving might not justify the additional complexity.

4. Enhanced Data Integrity:

● The increase in normalization can improve data integrity, as the chances of inconsistent data are reduced. Any changes to a data point need to be made in just one place, reducing the risk of data anomalies.
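The "one place" property can be illustrated with a short sketch, assuming Python's built-in sqlite3 module. The tables customer_flat, customer_norm, and city, along with their rows, are hypothetical names and data chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized (star-style) dimension: the city name is repeated on every row.
CREATE TABLE customer_flat (
    customer_id INTEGER PRIMARY KEY, customer_name TEXT, city TEXT
);
-- Normalized (snowflake-style) dimension: the city is stored once, referenced by key.
CREATE TABLE city (city_id INTEGER PRIMARY KEY, city_name TEXT);
CREATE TABLE customer_norm (
    customer_id INTEGER PRIMARY KEY, customer_name TEXT,
    city_id INTEGER REFERENCES city(city_id)
);
INSERT INTO customer_flat VALUES (1, 'Asha', 'Bombay'), (2, 'Ravi', 'Bombay'),
                                 (3, 'Meena', 'Pune');
INSERT INTO city VALUES (1, 'Bombay'), (2, 'Pune');
INSERT INTO customer_norm VALUES (1, 'Asha', 1), (2, 'Ravi', 1), (3, 'Meena', 2);
""")

# Renaming a city touches every matching row in the denormalized table...
flat_rows_changed = conn.execute(
    "UPDATE customer_flat SET city = 'Mumbai' WHERE city = 'Bombay'"
).rowcount

# ...but only a single row in the normalized city table.
norm_rows_changed = conn.execute(
    "UPDATE city SET city_name = 'Mumbai' WHERE city_name = 'Bombay'"
).rowcount

# Every customer picks up the new name through the join.
names = conn.execute("""
    SELECT c.customer_name, ci.city_name
    FROM customer_norm AS c
    JOIN city AS ci ON ci.city_id = c.city_id
    ORDER BY c.customer_id
""").fetchall()

print(flat_rows_changed, norm_rows_changed, names)
```

In the denormalized table, an update that misses one of the repeated rows leaves the data inconsistent; in the normalized design there is only one row that can be changed.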

5. Query Performance:

● Query performance can be slower compared to the star schema due to the increased number of joins required by the normalization. However, modern databases are increasingly capable of mitigating this performance difference.

6. Scalability Issues:

● While the snowflake schema can handle changing requirements by adding new dimensions easily, the complexity of the schema might increase significantly as the database scales, making maintenance more challenging.

In practice, the choice between a star schema and a snowflake schema often depends on
specific project requirements, the characteristics of the data being used, and the expected
query performance. While the snowflake schema helps save storage space and ensures data
integrity, it can increase complexity and affect performance. Conversely, the star schema is
simpler and generally offers better query performance, but at the expense of greater storage
space and potential data redundancy.

The Employee dimension table now contains the attributes EmployeeID, EmployeeName, DepartmentID, Region, and Territory. The DepartmentID attribute links the Employee table to the Department dimension table, which provides detail about each department, such as its Name and Location. The Customer dimension table now contains the attributes CustomerID, CustomerName, Address, and CityID. The CityID attribute links the Customer dimension table to the City dimension table, which holds details about each city, such as city name, Zipcode, State, and Country.
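The effect of this normalization on queries can be sketched as follows, again assuming Python's built-in sqlite3 module. The snake_case names, column types, and sample rows are illustrative assumptions, and the fact table is reduced to the keys and one measure needed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY, name TEXT, location TEXT
);
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY, employee_name TEXT,
    department_id INTEGER REFERENCES department(department_id),
    region TEXT, territory TEXT
);
CREATE TABLE city (
    city_id INTEGER PRIMARY KEY, city_name TEXT,
    zipcode TEXT, state TEXT, country TEXT
);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY, customer_name TEXT, address TEXT,
    city_id INTEGER REFERENCES city(city_id)
);
-- Reduced fact table (assumption): only two keys and one measure.
CREATE TABLE sales (
    customer_id INTEGER REFERENCES customer(customer_id),
    employee_id INTEGER REFERENCES employee(employee_id),
    total REAL
);
INSERT INTO department VALUES (1, 'Sales', 'Head Office');
INSERT INTO employee VALUES (1, 'Ravi', 1, 'West', 'Pune Metro');
INSERT INTO city VALUES (1, 'Pune', '411001', 'Maharashtra', 'India'),
                        (2, 'Austin', '73301', 'Texas', 'USA');
INSERT INTO customer VALUES (1, 'Asha', '12 Main St', 1),
                            (2, 'Bob', '5 Oak Ave', 2);
INSERT INTO sales VALUES (1, 1, 500.0), (1, 1, 250.0), (2, 1, 100.0);
""")

# Reaching country and department name now takes two joins per dimension:
# sales -> customer -> city and sales -> employee -> department.
rows = conn.execute("""
    SELECT ci.country, d.name, SUM(s.total)
    FROM sales AS s
    JOIN customer   AS c  ON c.customer_id   = s.customer_id
    JOIN city       AS ci ON ci.city_id      = c.city_id
    JOIN employee   AS e  ON e.employee_id   = s.employee_id
    JOIN department AS d  ON d.department_id = e.department_id
    GROUP BY ci.country, d.name
    ORDER BY ci.country
""").fetchall()

for country, department, total in rows:
    print(country, department, total)
```

In a star schema, the country and department name would sit directly in the Customer and Employee dimensions, so the same report would need two joins instead of four.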
