wk6 - Data Analytics

The document outlines the objectives and content of the Week 6 module on Data Analytic Techniques in the MSc Computer Science program, focusing on various types of data analytics including descriptive, diagnostic, predictive, and prescriptive analytics. It discusses the importance of data analytics in making informed business decisions and highlights the methodologies and tools used in each type of analysis. Additionally, it covers the evaluation of models, challenges in data quality, and available academic and wellbeing support for students.

Uploaded by

binaya Dai


MSc Computer Science

LDS7005M Big Data & Cloud Computing-LDS7005M,

Week 6: Data Analytic Techniques

Module Director(s): Dr Gayathri

Lecturer(s):

Objectives

▪ Learn what data analytics is and how it works
▪ Analyse and understand the lifecycle of data analytics
▪ Overview of statistical and machine learning methods for data analysis
▪ Predictive modelling, clustering, classification, and regression
▪ Evaluating model performance and selection criteria
What is Data Analytics?

▪ Data analytics is the science of analysing raw data in order to draw conclusions about that information and make better business decisions.
▪ Many of the techniques and processes of data analytics have been automated into mechanical processes and algorithms that work over raw data for human consumption.
What is Data Analytics?
▪ Data analytics is a broad term that encompasses many diverse types of
data analysis. Any type of information can be subjected to data analytics
techniques to get insight that can be used to improve things.
▪ For example, content companies use data analytics to keep you clicking, watching, or re-organizing content to get another view or another click.
Data Analytics Cycle

(Naveen, 2023)
Data Analytics Types

(Vinit Kachchi, 2021)

Descriptive analytics: What has happened?

▪ Descriptive analytics answers the “what happened” question by summarizing past data, without explaining causes or predicting future outcomes.
▪ Descriptive analytics is the ability to quantify events and report on them in a human-readable way. It’s the first step in turning big data into actionable insights.
▪ Easy to visualize (bar graphs, pie charts, histograms).
Descriptive analytics: What has happened?
▪ Typically, descriptive analytics takes the form of reports that synthesise the most relevant trends in our data.
▪ These reports usually contain different types of plots, each communicating a different message.
▪ A good example of descriptive analytics is Dashboards.
▪ It involves summarizing historical data to reveal patterns, trends, and key characteristics.
▪ It relies on summarization techniques, visualizations, and performance metrics to provide
insights into past events, serving as the foundational stage in the data analytics process.
▪ Examples include sales analysis, website traffic examination, and financial reporting
▪ Education: Student performance statistics over a semester.
▪ Healthcare: Number of hospital visits per month.

“Descriptive analysis is often the first step in data exploration before moving on to diagnostic, predictive,
or prescriptive analysis.”
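
As a minimal sketch of descriptive analytics, the snippet below summarizes the "hospital visits per month" example using only Python's standard library. The visit counts and the `describe` helper are made up for illustration; they are not taken from the module material.

```python
from statistics import mean, median, stdev

# Hypothetical data: hospital visits per month (values invented for illustration).
monthly_visits = {
    "Jan": 120, "Feb": 135, "Mar": 150,
    "Apr": 110, "May": 160, "Jun": 145,
}

def describe(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "std_dev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

summary = describe(list(monthly_visits.values()))
# The month with the highest number of visits.
busiest_month = max(monthly_visits, key=monthly_visits.get)

for name, value in summary.items():
    print(f"{name:>8}: {value:.1f}")
print(f"busiest month: {busiest_month}")
```

The output only reports what happened; it says nothing about why May was busy, which is the job of diagnostic analytics.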
Diagnostic Analytics: Why did it happen?

▪ This involves investigating and identifying the root causes of specific events or trends
revealed through descriptive analytics. It focuses on understanding why certain
patterns or outcomes occurred in historical data.

▪ When a new problem arises, you may already have collected data about a similar issue that occurred in the past. Diagnostic analytics asks questions that focus on the reason behind the event. Having that data at your disposal avoids repeating work and helps connect related problems.
Diagnostic Analytics: Why did it happen?

▪ A sample question may be: Why were Q2 sales less than Q1 sales?
▪ Diagnostic analytics usually requires collecting data from multiple sources and storing it in a structure that lends itself to drill-down, correlation, regression, and roll-up analysis.
▪ Typical uses: risk analysis, anomaly detection.
▪ Results are viewed via interactive visualisation tools that enable users to identify trends and patterns.
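
The roll-up and drill-down ideas above can be sketched on the "Why were Q2 sales less than Q1 sales?" question. This is an illustrative example in plain Python; the sales records, region names, and the `roll_up` / `drill_down` helpers are assumptions, not part of any specific tool.

```python
from collections import defaultdict

# Hypothetical sales records: (quarter, region, sales). Values are invented.
records = [
    ("Q1", "North", 500), ("Q1", "South", 400), ("Q1", "East", 300),
    ("Q2", "North", 510), ("Q2", "South", 250), ("Q2", "East", 310),
]

def roll_up(rows):
    """Roll-up: aggregate sales to one total per quarter."""
    totals = defaultdict(int)
    for quarter, _region, sales in rows:
        totals[quarter] += sales
    return dict(totals)

def drill_down(rows, from_q, to_q):
    """Drill-down: change in sales per region between two quarters."""
    by_region = defaultdict(dict)
    for quarter, region, sales in rows:
        by_region[region][quarter] = sales
    return {region: q[to_q] - q[from_q] for region, q in by_region.items()}

totals = roll_up(records)                    # confirms that Q2 is lower than Q1
changes = drill_down(records, "Q1", "Q2")    # shows where the drop comes from
main_driver = min(changes, key=changes.get)  # region with the largest decrease

print(totals)
print(changes)
print(f"largest drop: {main_driver}")
```

In this made-up data the roll-up shows the overall decline, while the drill-down localizes it to a single region, which is where a root-cause investigation would start.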
Predictive analytics: What’s probably going to happen?

▪ Predictive analytics leverages statistical algorithms and machine learning techniques to forecast future
outcomes based on historical data patterns. It involves building models that can make predictions,
such as sales forecasting or customer behaviour, aiding in proactive decision-making. The goal is to
anticipate trends and events, enabling organizations to take pre-emptive actions.
▪ It uses historical data to predict future events. Typically, historical data is used to build a mathematical
model that captures important trends. That predictive model is then used on current data to predict
what will happen next, or to suggest actions to take for optimal outcomes.
▪ The accuracy of the predictions is field-dependent; for example, it is less complex to predict whether a machine will fail than to predict cancer.
▪ Cloud Platforms: AWS SageMaker, Google Vertex AI.
▪ Python/R: Scikit-learn, TensorFlow, Prophet.
▪ SQL: Query large datasets for training models.
▪ BI Tools: Power BI
Predictive analytics: What’s probably going to happen?

▪ Machine Learning is a clear example where this type of analysis takes place.
▪ Questions are usually formulated using a what-if rationale, such as:
▪ What are the chances that a customer will default on a loan if they have missed a monthly
payment?
▪ The tools used generally abstract underlying statistical intricacies by providing user-
friendly front-end interfaces.
• Data Quality Matters: Garbage in → garbage out.
• Ethical Risks: Bias in data can lead to unfair predictions (e.g., loan denials).
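
To make the loan-default question concrete, here is a deliberately simple sketch that estimates the chance of default from historical records as a conditional frequency. The `history` data and the `default_rate` helper are hypothetical; a real predictive model would use many more features and a library such as scikit-learn.

```python
# Hypothetical loan history: (missed_a_payment, defaulted). Data is invented.
history = [
    (True, True), (True, False), (True, True), (True, False), (True, True),
    (False, False), (False, False), (False, True), (False, False), (False, False),
]

def default_rate(rows, missed):
    """Estimate P(default | missed_a_payment == missed) from past records."""
    outcomes = [defaulted for m, defaulted in rows if m == missed]
    if not outcomes:
        raise ValueError("no historical records for this group")
    # True counts as 1 and False as 0, so the sum is the number of defaults.
    return sum(outcomes) / len(outcomes)

p_if_missed = default_rate(history, missed=True)
p_if_not_missed = default_rate(history, missed=False)

print(f"P(default | missed payment)    = {p_if_missed:.2f}")
print(f"P(default | no missed payment) = {p_if_not_missed:.2f}")
```

The estimate is only as good as the records behind it: if the history is incomplete or biased, the predicted probabilities inherit those flaws.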
Prescriptive analytics: What to do next?

▪ Prescriptive analytics recommends optimal actions by analysing data, considering possible scenarios, and suggesting decisions that align with organizational goals.
▪ It goes beyond predicting outcomes, guiding decision-makers on the best course of action for desired results.
▪ It requires such a seamless and completely integrated data analytics infrastructure that only a few organizations are able to engage in it in a meaningful way.
Prescriptive analytics: What to do next?

▪ Prescriptive analytics is the frontier of data analysis, combining the insight from all
previous analyses to determine the course of action to take in a current problem or
decision.
▪ AI-driven decision systems are a common example of prescriptive analytics.
▪ Sample question may include: When is the best time to trade a particular stock?

▪ Tools for Prescriptive Analytics

▪ Advanced Software: IBM Decision Optimization, Gurobi, SAS.
▪ AI Platforms: Google Vertex AI, Azure Machine Learning.
▪ Custom Code: Python (PuLP, SciPy) for optimization models.
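
The idea of recommending an optimal action can be sketched as a small optimization problem. The example below uses brute-force search in plain Python as a stand-in for a real solver such as PuLP or Gurobi; the products, profits, and resource limits are all made-up assumptions.

```python
from itertools import product

# Hypothetical problem: how many units of products A and B should we make?
PROFIT = {"A": 30, "B": 50}        # profit per unit
MACHINE_HOURS = {"A": 1, "B": 2}   # machine hours needed per unit
LABOUR_HOURS = {"A": 3, "B": 2}    # labour hours needed per unit
MAX_MACHINE_HOURS = 40
MAX_LABOUR_HOURS = 60

def best_plan(max_units=40):
    """Try every integer plan and keep the feasible one with the highest profit."""
    best, best_profit = None, -1
    for a, b in product(range(max_units + 1), repeat=2):
        machine = a * MACHINE_HOURS["A"] + b * MACHINE_HOURS["B"]
        labour = a * LABOUR_HOURS["A"] + b * LABOUR_HOURS["B"]
        if machine > MAX_MACHINE_HOURS or labour > MAX_LABOUR_HOURS:
            continue  # plan violates a resource constraint
        profit = a * PROFIT["A"] + b * PROFIT["B"]
        if profit > best_profit:
            best, best_profit = (a, b), profit
    return best, best_profit

plan, profit = best_plan()
print(f"make {plan[0]} units of A and {plan[1]} units of B for a profit of {profit}")
```

Exhaustive search is only practical for tiny problems like this one; dedicated optimization libraries solve the same kind of model efficiently at scale.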
Data Analysis

Statistical Methods
▪ Descriptive Statistics (summarizes the data)
▪ Inferential Statistics (hypothesis testing)
▪ Regression Analysis (relationship between variables)
▪ Analysis of Variance (ANOVA): compares groups

Machine Learning Methods
▪ Supervised Learning
▪ Unsupervised Learning
▪ Reinforcement Learning
▪ Deep Learning

Common Tools and Libraries
▪ Python
▪ TensorFlow
▪ PyTorch
Types of Machine Learning
Clustering (Unsupervised Learning)
▪ Clustering aims to group similar data points together based on certain
features, uncovering inherent patterns or relationships.
▪ Applications: Customer segmentation, anomaly detection, or organizing
documents into topics.
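
A minimal sketch of clustering, using k-means written in plain Python. The customer points and the starting centroids are invented for illustration, and the fixed initialization is an assumption made to keep the run deterministic; libraries such as scikit-learn use smarter initialization.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid-update steps."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep the old centroid if a cluster is empty
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: assignments no longer change
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customers described by (visits per month, average spend).
customers = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
centroids, clusters = k_means(customers, centroids=[(1, 1), (8, 8)])

for i, cluster in enumerate(clusters, start=1):
    print(f"Cluster {i}: {cluster}")
```

No labels are supplied anywhere: the two customer segments emerge purely from the distances between points, which is what makes this unsupervised.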

Classification (Supervised Learning)
▪ Classification assigns predefined categories or labels to data points based on their
characteristics.

▪ Applications: Spam detection in emails, sentiment analysis in social media, or identifying fraudulent transactions.
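
As an illustrative sketch of classification, the snippet below labels emails as spam or ham with a k-nearest-neighbours vote. The two features, the training examples, and the `knn_predict` helper are assumptions chosen for brevity; real spam filters use far richer features.

```python
import math
from collections import Counter

# Hypothetical labelled emails: (number of links, number of "!" characters) -> label.
training = [
    ((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"), ((1, 1), "ham"),
    ((6, 5), "spam"), ((7, 8), "spam"), ((5, 9), "spam"), ((8, 6), "spam"),
]

def knn_predict(train, point, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _features, label in neighbours)
    return votes.most_common(1)[0][0]

label_a = knn_predict(training, (6, 7))
label_b = knn_predict(training, (1, 2))
print(label_a, label_b)
```

Unlike clustering, the categories here are predefined and every training example carries a label, which is what makes classification a supervised task.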
Regression (Supervised Learning)
▪ Regression analyses the relationship between variables to predict a
continuous outcome, helping understand the impact of one variable on
another.

▪ Applications: Predicting house prices based on features, estimating sales based on advertising spending, or forecasting temperature based on historical data.
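
A minimal sketch of regression, fitting a straight line by ordinary least squares to the "sales based on advertising spending" example. The numbers and the `fit_line` helper are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: advertising spend and resulting sales (arbitrary units).
ad_spend = [1, 2, 3, 4, 5]
sales = [2.9, 5.1, 7.0, 9.2, 10.8]

intercept, slope = fit_line(ad_spend, sales)
predicted = intercept + slope * 6  # continuous prediction for an unseen spend of 6

print(f"sales = {intercept:.2f} + {slope:.2f} * ad_spend")
print(f"predicted sales at spend 6: {predicted:.2f}")
```

The slope quantifies the impact of one variable on the other: here, the estimated change in sales for each extra unit of advertising spend.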
Research time: 15 mins
Could you provide an in-depth explanation of specific algorithms across different machine learning paradigms? In
your discussion, please cover:

• Supervised Learning: Detail one or two algorithms (e.g., Linear Regression, Support Vector Machines) including
their core concepts, typical applications, advantages, and limitations.
• Unsupervised Learning: Explain specific algorithms (such as K-Means Clustering, Principal Component Analysis, or the Apriori Algorithm) with an emphasis on how they uncover hidden patterns in data.
• Reinforcement Learning: Describe a key algorithm (for instance, Q-Learning or Policy Gradient methods),
outlining how it learns through interactions with the environment and the challenges it faces.

• Semi-Supervised Learning: Discuss an algorithm that leverages both labeled and unlabeled data, and explain its
benefits in scenarios where fully labeled datasets are scarce.
• Self-Supervised Learning: Provide an overview of a self-supervised approach, highlighting how it generates
supervisory signals from the data itself.

Please ensure your explanation includes fundamental concepts, examples, and a discussion on the pros and
cons of each approach.
Model Evaluation
▪ Both statistical and machine learning models need
rigorous evaluation and validation to ensure their
reliability and generalizability.

▪ Techniques include cross-validation, training and testing datasets, and various performance metrics such as accuracy, precision, recall, and F1 score.
▪ Hyperparameter tuning
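
The metrics and the cross-validation idea above can be sketched in plain Python. The labels and predictions are invented, and the helper names `classification_metrics` and `k_fold_splits` are hypothetical; in practice a library such as scikit-learn provides equivalent utilities.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k roughly equal, contiguous folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        end = start + fold_size + (1 if fold < remainder else 0)
        yield indices[:start] + indices[end:], indices[start:end]
        start = end

# Hypothetical true labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
folds = list(k_fold_splits(len(y_true), k=5))

print(metrics)
print(folds[0])
```

In k-fold cross-validation each fold serves once as the test set while the model is trained on the rest, and the metrics are averaged across folds; the folds here are contiguous for simplicity, whereas real data is usually shuffled first.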
Challenges and Considerations
▪ Data Quality: Both statistical and machine learning methods require clean and relevant
data for accurate analysis.
▪ Interpretability: Statistical models often provide interpretable results, while some
complex machine learning models may lack transparency.
▪ Bias and Fairness: Addressing potential biases in data and models is crucial, especially in
machine learning applications.
Seminar Activity
• Part 1: Azure Virtual Machine creation (1 hour)
• Parts 2 and 3: Data Structuring (45 mins)

• Please upload your lab work to the submission links provided

Support available

• Academic Quality to check in with Student Support and Guidance Manager and Head of Student Opportunities

• Academic:
• Academic Skills Session (Compulsory – Scheduled session available on timetable)
• Small Group Academic Writing Tutorials both online and in-person (currently covering: Writing Critically, Essay writing, Report writing, Paraphrasing, and Harvard Style Referencing); 1:1 support available on request
• Targeted 1:1 support for students referred for Academic Misconduct
• Skills Guides available online to support

• Wellbeing:
• 1:1 Wellbeing Appointments
• Mental Health Support (online)
• Welfare Appointments (online)
• Wellbeing Breakfasts
Thank You ☺
