Business Analytics
Definition
Defined as the scientific process of transforming data into insights for
making better decisions
Classification
Descriptive
Diagnostic
Predictive
Prescriptive
Data analytics and Data analysis
Data analytics
Defined as the scientific process of transforming data into insights for
making better decisions
Data analysis
It is the process of examining, transforming, and arranging raw data in
a specific way to generate useful information
Analysis allows data to be evaluated in order to reach a conclusion
It involves a number of steps, approaches, and diverse techniques
Interesting
Non-trivial (not generalized)
Implicit (logically true)
Previously unknown
Potentially useful
Knowledge
Facts and principles, or justified belief
Meaningful and coherent expression that can be represented
Importance of Data mining
Knowledge
Can be represented in terms of
Rules
Patterns
Regularities
Non-trivial (not generalized)
Intelligence
The ability to acquire, understand, and apply knowledge is called
intelligence
Online Analytical Processing (OLAP)
Users: Knowledge Workers
Function: Decision support operation
DB Design: Subject Oriented
Data: Historical, summarised, consolidated, integrated
Usage: Ad hoc
Access: Mostly reads, with lots of scans
Unit of work: Complex queries
Number of Users: Relatively few (hundreds)
DB size: 100 GB to TB
Performance metric: Query throughput
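To illustrate the kind of ad hoc, read-only, summarising query that characterises an OLAP workload, here is a minimal sketch using Python's built-in sqlite3 module. The sales table, its columns, and the figures are hypothetical, and SQLite is used only for convenience; it is not an OLAP engine.

```python
import sqlite3


def yearly_sales_by_region(rows):
    """Summarise historical sales rows into totals per (year, region)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical fact table holding historical, integrated data.
    conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # A read-only aggregate query: it scans many rows and returns a summary.
    result = conn.execute(
        "SELECT year, region, SUM(amount) AS total "
        "FROM sales GROUP BY year, region ORDER BY year, region"
    ).fetchall()
    conn.close()
    return result


# Illustrative data only.
sample_rows = [
    (2022, "North", 100.0),
    (2022, "North", 50.0),
    (2022, "South", 70.0),
    (2023, "North", 120.0),
]

summary = yearly_sales_by_region(sample_rows)
for year, region, total in summary:
    print(year, region, total)
```

The query reads and consolidates existing rows rather than updating them, which is why query throughput matters more than transaction throughput for this kind of workload.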
Data Preprocessing
Requirement of pre-processing
Data may be incomplete:
Attributes of interest may not be available
Some data may not have been considered important at the time of entry
Relevant data may not be recorded due to equipment malfunction
Noisy
Collecting instruments may be noisy
Human error
Errors in data transmission
Technology limitations, such as limited buffer size
Inconsistent
Dirty Data
Syntactically dirty data
Logical error and irregularities
Semantically dirty data
Integrity constraint violations
Redundancy
Coverage anomaly
Missing attributes and missing records
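Some of the dirty-data categories listed above can be detected with simple checks. The following is a rough sketch in plain Python, not a complete data-quality tool: the function name find_dirty_records, the customer records, and the age rule are all hypothetical and chosen only for illustration.

```python
def find_dirty_records(records, required_fields, key_field, constraints):
    """Report missing attributes, redundant keys, and constraint violations."""
    report = {"missing": [], "duplicates": [], "violations": []}
    seen_keys = set()
    for i, rec in enumerate(records):
        # Missing attributes: a required field is absent or empty.
        if any(rec.get(f) in (None, "") for f in required_fields):
            report["missing"].append(i)
        # Redundancy: the same key value has already appeared.
        key = rec.get(key_field)
        if key in seen_keys:
            report["duplicates"].append(i)
        seen_keys.add(key)
        # Integrity constraints: each present value must pass its rule.
        for field, is_valid in constraints.items():
            value = rec.get(field)
            if value is not None and not is_valid(value):
                report["violations"].append((i, field))
    return report


# Hypothetical records, for illustration only.
customers = [
    {"id": 1, "name": "Asha", "age": 34},
    {"id": 2, "name": "", "age": 41},      # missing name
    {"id": 2, "name": "Ravi", "age": 41},  # duplicate id
    {"id": 3, "name": "Mei", "age": -5},   # age outside the allowed range
]

report = find_dirty_records(
    customers,
    required_fields=["name", "age"],
    key_field="id",
    constraints={"age": lambda a: 0 <= a <= 120},
)
print(report)
```

The report holds record positions rather than corrected values, so the decision about how to repair each problem stays with the later cleaning step.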
Data Cleaning
Ignore the records that have missing values
Fill in the missing values manually
Use a global constant to fill missing values
Use the attribute mean to fill missing values
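Three of the missing-value strategies above (ignoring, filling with a global constant, and filling with the attribute mean) can be sketched in a few lines of plain Python. The helper names and the sensor readings are hypothetical, and None is assumed to mark a missing value.

```python
from statistics import mean


def drop_incomplete(records, field):
    """Ignore records whose value for `field` is missing."""
    return [dict(r) for r in records if r.get(field) is not None]


def fill_with_constant(records, field, constant):
    """Replace missing values of `field` with one global constant."""
    return [
        {**r, field: constant} if r.get(field) is None else dict(r)
        for r in records
    ]


def fill_with_mean(records, field):
    """Replace missing values of `field` with the mean of the observed values."""
    observed = [r[field] for r in records if r.get(field) is not None]
    if not observed:
        raise ValueError("no observed values to average")
    return fill_with_constant(records, field, mean(observed))


# Hypothetical readings, for illustration only.
readings = [
    {"sensor": "a", "temp": 20.0},
    {"sensor": "b", "temp": None},
    {"sensor": "c", "temp": 24.0},
]

print(drop_incomplete(readings, "temp"))
print(fill_with_constant(readings, "temp", -1.0))
print(fill_with_mean(readings, "temp"))
```

Each helper returns new records and leaves the input unchanged, which makes it easy to compare strategies on the same data. Dropping records loses information, while a constant or mean keeps every record at the cost of introducing values that were never observed.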
Consistency of Data
ETL model