On-Line Analytical Processing: Analyzing Data Resources
ADITI PAUL
MCS/08/20
REGISTRATION NO – 003834 OF 2008
POST GRADUATE DEPARTMENT OF COMPUTER SCIENCE
ST. XAVIER'S COLLEGE (AUTONOMOUS)
WHAT IS OLAP?
Basic idea:
3 part description
Part 1 – Online
Part 2 – Analytical
Part 3 – Processing
PART 1 – ONLINE
FLASH BACK
Data Stored in a Database
TYPE 1
Operational Data
Measures –
This is Accomplished by
• OLAP Operations
• OLAP Functions
• SQL Extensions for OLAP.
OLAP OPERATIONS
• Dimension Tables
• Market (Market_ID, City , Region)
• Product (Product_ID,Name,Category,Price)
• Time(Time_ID,Week,Month,Quarter)
• Fact table
• Sales(Market_ID, Product_ID,Time_ID,Amount)
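The star schema above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the column types, the primary and foreign keys, and the helper name build_schema are assumptions, since the slide lists column names only.

```python
import sqlite3

# Star schema from the slide: three dimension tables and one fact table.
# Column types and key constraints are assumptions, not from the slide.
SCHEMA = """
CREATE TABLE Market  (Market_ID TEXT PRIMARY KEY, City TEXT, Region TEXT);
CREATE TABLE Product (Product_ID TEXT PRIMARY KEY, Name TEXT, Category TEXT, Price REAL);
CREATE TABLE Time    (Time_ID TEXT PRIMARY KEY, Week TEXT, Month TEXT, Quarter INTEGER);
CREATE TABLE Sales (
    Market_ID  TEXT REFERENCES Market(Market_ID),
    Product_ID TEXT REFERENCES Product(Product_ID),
    Time_ID    TEXT REFERENCES Time(Time_ID),
    Amount     INTEGER
);
"""

def build_schema():
    """Create the schema in a fresh in-memory database (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

conn = build_schema()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Each row of the fact table points at one row in every dimension table, which is what lets the later operations group sales by city, category or quarter.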
1. Aggregation – computing the total of a measure over one or more dimensions.
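Market (dimension table):
MARKET ID   CITY      REGION
M1          KOLKATA   EAST
M2          KOLKATA   EAST
M3          DELHI     NORTH

Time (dimension table):
TIME ID   WEEK   MONTH   QUARTER
T1        1      JAN     1
T2        1      JUNE    2

Sales (fact table, sample row):
MARKET ID   PRODUCT ID   TIME ID   AMOUNT
M3          P2           T1        50

SELECT Market_ID, Product_ID, SUM(Amount)
FROM Sales
GROUP BY Market_ID, Product_ID;

Result:
MARKET ID   PRODUCT ID   SUM(AMOUNT)
M1          P1           600
M2          P2           600
M3          P1           100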
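As a rough, runnable illustration of the aggregation query, the sketch below runs it against an in-memory SQLite database. The Sales rows are invented for the example (the full fact table is not shown on the slide), and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (Market_ID TEXT, Product_ID TEXT, Time_ID TEXT, Amount INTEGER)")

# Illustrative rows only; not the data behind the slide.
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?)", [
    ("M1", "P1", "T1", 400),
    ("M1", "P1", "T2", 200),
    ("M2", "P2", "T1", 600),
    ("M3", "P1", "T2", 100),
])

# Total the Amount measure over the time dimension,
# keeping one row per (market, product) pair.
totals = conn.execute("""
    SELECT Market_ID, Product_ID, SUM(Amount)
    FROM Sales
    GROUP BY Market_ID, Product_ID
    ORDER BY Market_ID, Product_ID
""").fetchall()

for row in totals:
    print(row)
```

The two M1/P1 rows collapse into a single total because Time_ID is left out of the GROUP BY list.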
2. ROLL UP
Grouping along one dimension that moves from a lower level of aggregation to a higher one.
Example: roll sales up from market to city, then from city to region.
SELECT S.Product_ID, M.City, SUM(S.Amount) AS Amount
INTO City_Sales
FROM Sales S, Market M
WHERE M.Market_ID = S.Market_ID
GROUP BY S.Product_ID, M.City;

SELECT T.Product_ID, M.Region, SUM(T.Amount)
FROM City_Sales T,
     (SELECT DISTINCT City, Region FROM Market) M  -- one row per city, so amounts are not counted twice
WHERE T.City = M.City
GROUP BY T.Product_ID, M.Region;
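The two roll-up steps can be tried out with the sketch below, which assumes SQLite and invented Sales rows. SQLite has no SELECT ... INTO, so CREATE TABLE ... AS SELECT stands in for it. Because Market can hold several rows for the same city (M1 and M2 are both Kolkata), the second step joins against distinct (City, Region) pairs so that city totals are not counted once per market.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Market (Market_ID TEXT PRIMARY KEY, City TEXT, Region TEXT);
CREATE TABLE Sales  (Market_ID TEXT, Product_ID TEXT, Time_ID TEXT, Amount INTEGER);
INSERT INTO Market VALUES ('M1','KOLKATA','EAST'), ('M2','KOLKATA','EAST'), ('M3','DELHI','NORTH');
-- Illustrative sales rows only.
INSERT INTO Sales VALUES ('M1','P1','T1',400), ('M2','P1','T1',200),
                         ('M3','P1','T1',100), ('M2','P2','T1',600);

-- Step 1: roll up from market to city (stand-in for SELECT ... INTO).
CREATE TABLE City_Sales AS
SELECT S.Product_ID, M.City, SUM(S.Amount) AS Amount
FROM Sales S JOIN Market M ON M.Market_ID = S.Market_ID
GROUP BY S.Product_ID, M.City;
""")

# Step 2: roll up from city to region, using one row per city.
region_sales = conn.execute("""
    SELECT T.Product_ID, M.Region, SUM(T.Amount)
    FROM City_Sales T
    JOIN (SELECT DISTINCT City, Region FROM Market) M ON T.City = M.City
    GROUP BY T.Product_ID, M.Region
    ORDER BY T.Product_ID, M.Region
""").fetchall()

print(region_sales)
```

Joining City_Sales straight to Market on City would match each Kolkata row twice and report 1200 instead of 600 for product P1 in the East region.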
3. DRILL DOWN
• Finer-grained view of aggregated data, i.e. going from a higher to a lower level of aggregation
• Converse of roll-up
• E.g. disaggregate country sales by region/city.
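A minimal sketch of drilling down, assuming SQLite and made-up data (the city PATNA and all amounts are hypothetical): start from totals per region, then break one region into its cities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Market (Market_ID TEXT PRIMARY KEY, City TEXT, Region TEXT);
CREATE TABLE Sales  (Market_ID TEXT, Product_ID TEXT, Time_ID TEXT, Amount INTEGER);
-- Illustrative data only.
INSERT INTO Market VALUES ('M1','KOLKATA','EAST'), ('M2','PATNA','EAST'), ('M3','DELHI','NORTH');
INSERT INTO Sales  VALUES ('M1','P1','T1',400), ('M2','P1','T1',200), ('M3','P1','T1',100);
""")

# Coarse view: one total per region.
by_region = conn.execute("""
    SELECT M.Region, SUM(S.Amount)
    FROM Sales S JOIN Market M ON M.Market_ID = S.Market_ID
    GROUP BY M.Region
    ORDER BY M.Region
""").fetchall()

# Drill down: split the EAST total into its cities.
east_by_city = conn.execute("""
    SELECT M.City, SUM(S.Amount)
    FROM Sales S JOIN Market M ON M.Market_ID = S.Market_ID
    WHERE M.Region = 'EAST'
    GROUP BY M.City
    ORDER BY M.City
""").fetchall()

print(by_region)
print(east_by_city)
```

The city totals for EAST add back up to the regional figure, which is what makes drill-down the converse of roll-up.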
4. PIVOTING
Select a different dimension (orientation) for analysis.
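One way to picture pivoting is to lay the same sales figures out in two orientations. The sketch below assumes SQLite, which has no PIVOT operator, so conditional aggregation (SUM over CASE) is used instead; the data and the column aliases Q1, Q2, P1 and P2 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time  (Time_ID TEXT PRIMARY KEY, Quarter INTEGER);
CREATE TABLE Sales (Product_ID TEXT, Time_ID TEXT, Amount INTEGER);
-- Illustrative data only.
INSERT INTO Time  VALUES ('T1', 1), ('T2', 2);
INSERT INTO Sales VALUES ('P1','T1',400), ('P1','T2',200), ('P2','T1',50), ('P2','T2',600);
""")

# Orientation 1: products as rows, quarters as columns.
by_product = conn.execute("""
    SELECT S.Product_ID,
           SUM(CASE WHEN T.Quarter = 1 THEN S.Amount ELSE 0 END) AS Q1,
           SUM(CASE WHEN T.Quarter = 2 THEN S.Amount ELSE 0 END) AS Q2
    FROM Sales S JOIN Time T ON T.Time_ID = S.Time_ID
    GROUP BY S.Product_ID
    ORDER BY S.Product_ID
""").fetchall()

# Orientation 2 (pivoted): quarters as rows, products as columns.
by_quarter = conn.execute("""
    SELECT T.Quarter,
           SUM(CASE WHEN S.Product_ID = 'P1' THEN S.Amount ELSE 0 END) AS P1,
           SUM(CASE WHEN S.Product_ID = 'P2' THEN S.Amount ELSE 0 END) AS P2
    FROM Sales S JOIN Time T ON T.Time_ID = S.Time_ID
    GROUP BY T.Quarter
    ORDER BY T.Quarter
""").fetchall()

print(by_product)
print(by_quarter)
```

Both results hold the same four numbers; only the dimension chosen for the rows changes.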
5. SLICE and DICE
Slicing: an equality selection on one dimension of the cube.
Example: sales in week 12.
SELECT S.*
FROM Sales S, Time T
WHERE T.Time_ID = S.Time_ID
AND T.Week = 'Week 12';
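The slice query can be exercised as follows. This is a sketch under assumptions: SQLite is used, the Week column is stored as text such as 'Week 12' to match the comparison in the query, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time  (Time_ID TEXT PRIMARY KEY, Week TEXT);
CREATE TABLE Sales (Market_ID TEXT, Product_ID TEXT, Time_ID TEXT, Amount INTEGER);
-- Illustrative data only.
INSERT INTO Time  VALUES ('T1', 'Week 12'), ('T2', 'Week 13');
INSERT INTO Sales VALUES ('M1','P1','T1',400), ('M2','P2','T1',600), ('M3','P1','T2',100);
""")

# Slice: fix the time dimension at a single value and keep every
# sales row that falls in that week.
week_12 = conn.execute("""
    SELECT S.*
    FROM Sales S JOIN Time T ON T.Time_ID = S.Time_ID
    WHERE T.Week = 'Week 12'
    ORDER BY S.Market_ID
""").fetchall()

print(week_12)
```

The result still has the market and product dimensions intact; only the time dimension has been pinned to one value.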
Dicing: a range selection in a hypercube. Partition or group on one or more dimensions.
Example: "Total sales for each product in each quarter"
Dicing sales in the time dimension:
SELECT S.Product_ID, T.Quarter, SUM(S.Amount)
FROM Sales S, Time T
WHERE T.Time_ID = S.Time_ID
GROUP BY T.Quarter, S.Product_ID;
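The dicing query above can be run against a small in-memory SQLite database as sketched below. The rows are made up for illustration, and an ORDER BY is added so the printed order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time  (Time_ID TEXT PRIMARY KEY, Quarter INTEGER);
CREATE TABLE Sales (Product_ID TEXT, Time_ID TEXT, Amount INTEGER);
-- Illustrative data only: T1 and T3 both fall in quarter 1.
INSERT INTO Time  VALUES ('T1', 1), ('T2', 2), ('T3', 1);
INSERT INTO Sales VALUES ('P1','T1',300), ('P1','T3',100), ('P2','T1',50),
                         ('P1','T2',200), ('P2','T2',600);
""")

# Total sales for each product in each quarter.
per_quarter = conn.execute("""
    SELECT S.Product_ID, T.Quarter, SUM(S.Amount)
    FROM Sales S JOIN Time T ON T.Time_ID = S.Time_ID
    GROUP BY T.Quarter, S.Product_ID
    ORDER BY T.Quarter, S.Product_ID
""").fetchall()

for row in per_quarter:
    print(row)
```

The two quarter-1 rows for P1 are summed into one cell, so the cube is partitioned by quarter rather than by individual Time_ID.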
SQL EXTENSIONS FOR OLAP
ID   FNAME   LNAME     MARKS   SEM
1    ANAL    ACHARYA   300     1
2                      740     1
1                      625     2
3                      585     3
• For the year 2000, the average number of orders was 1787. Four products (700, 601, 600, and 400) sold more than that amount. In 2001, the average number of orders was 1048, and three products exceeded that amount.
OLAP FUNCTIONS
WINDOW FUNCTIONS
Window functions let us analyze our data by computing aggregate values over windows surrounding each row. For every row, the result set returns a summary value representing the set of rows in that row's window.
The query below returns a result set that partitions the data by department and then provides a cumulative summary of employees' salaries, starting with the employee who has been at the company the longest. The result set includes only those employees in departments 100 and 200 who reside in West Bengal, BBSR, Maharashtra, or Arunachal. The column Sum_Salary provides the cumulative total of employees' salaries.
SELECT dept_id, emp_lname, start_date, salary,
       SUM(salary) OVER (PARTITION BY dept_id
                         ORDER BY start_date
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       AS "Sum_Salary"
FROM employee
WHERE state IN ('WB', 'BBSR', 'MH', 'AR') AND dept_id IN ('100', '200')
ORDER BY dept_id, start_date;
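To see the running total in action, the query can be run with Python's sqlite3 module, as in the sketch below. It assumes the bundled SQLite is version 3.25 or later (the first release with window functions). The employee rows are invented; only the column names come from the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    dept_id TEXT, emp_lname TEXT, start_date TEXT, salary INTEGER, state TEXT)""")

# Illustrative rows only.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", [
    ("100", "Acharya", "2001-01-10", 30000, "WB"),
    ("100", "Bose",    "2003-05-01", 40000, "MH"),
    ("100", "Das",     "2005-09-15", 20000, "KA"),   # dropped by the state filter
    ("200", "Sen",     "2002-02-02", 50000, "AR"),
    ("200", "Roy",     "2004-04-04", 25000, "WB"),
    ("300", "Nair",    "2000-01-01", 60000, "WB"),   # dropped by the dept filter
])

# Cumulative salary per department, longest-serving employee first.
running = conn.execute("""
    SELECT dept_id, emp_lname, start_date, salary,
           SUM(salary) OVER (PARTITION BY dept_id
                             ORDER BY start_date
                             RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS "Sum_Salary"
    FROM employee
    WHERE state IN ('WB', 'BBSR', 'MH', 'AR') AND dept_id IN ('100', '200')
    ORDER BY dept_id, start_date
""").fetchall()

for row in running:
    print(row)
```

The WHERE clause is applied before the window is evaluated, so filtered-out employees do not contribute to the running total, and the total restarts at each new dept_id.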
On Line Analytical Processing
Thus Online Analytical Processing as a whole can be understood as a method that takes in raw data, processes it through various functions and operations, and produces information as a response to multidimensional queries in real time.
SERVER ARCHITECTURES
• Analytical Complexity
• Business questions can rarely be answered by a single query
• Complex queries are hard to understand, write and execute efficiently
• Need for good business analysts
• Data cubes can be HUGE
• But they can also be sparse
• Can compute in advance, compute on demand, or some combination.
• OLAP forms the underlying structure of DDAS – Distributed Data Analysis and Dissemination System.
• From On-Line Analytical Processing to On-Line Analytical Mining (OLAP to OLAM)
THANK YOU !