GNR 652 Assignment 2

The document summarizes the analysis of a flight delay dataset. Various visualizations are created to explore trends in the data by carrier, distance, origin, day of week, and destination. Logistic regression models are fitted on preprocessed training and test sets to classify flight delays, achieving up to 88% accuracy. Variable selection identifies five key predictors of delay: delay, destination as LGA, distance of 214 miles, weather of 0 (no issues), and origin as DCA. Refitting the model on the selected variables maintains high accuracy, suggesting ideal conditions for on-time travel from DCA to New York are no weather issues with a 214 mile flight.

Uploaded by

Sayan Rakshit

Report
Roll Number: 183170017

1) Show visualisations to explore the dataset and understand the underlying
trends (often called Exploratory Data Analysis). Choose visualisation
methods you think best represent the data.

A bar graph represents this data best, because we need to compare the number
of flights across the categories of each variable and, within each category,
see how delayed and on-time flights are distributed.

A) Flight Carriers

a) The data contains very few flights from the ‘CO’, ‘OH’ and ‘UA’ carriers,
so their significance in the model will be low. The model would be less
biased if the data were uniformly distributed across carriers.
b) Most carriers have a similar percentage of delayed flights, so the carrier
appears to have little effect on delays.

B) DISTANCE
a) There are more flights at a distance of ‘214’ than at any other distance,
so this category will have more significance than the others.
b) Its percentage of delays is noticeably lower than the others, so these
flights have a lower probability of being delayed.

C) ORIGIN

a) ‘DCA’ has more data points than the other origins and a lower number of
delays, so flights from DCA have a lower probability of delay.

D) DAY_WEEK

a) Although more flights run on the middle days of the week, the percentage
of delays on those days is lower than on the others. Mid-week flights
therefore have a lower chance of delay, while flights at the ends of the week
are delayed more often.

E) DAY_OF_MONTH
a) The percentage of delays increases from week to week through the month,
while the approximate number of flights in each week stays the same.
b) Flights are therefore delayed more often towards the end of the month.

F) DEST

a) ‘LGA’ has many more data points than the other destinations and a lower
percentage of delays, so flights to it have a higher probability of being on
time.
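
The comparisons in A) to F) all rest on two numbers per category: how many flights it has and what percentage of them were delayed. A minimal sketch of that computation follows. The column names (`origin`, `status`), the label `delayed` and the sample rows are made up for illustration and are not taken from the assignment data.

```python
from collections import defaultdict

def delay_summary(rows, column, status_key="status", delayed_value="delayed"):
    """Return {category: (n_flights, pct_delayed)} for one categorical column."""
    counts = defaultdict(lambda: [0, 0])  # category -> [total, delayed]
    for row in rows:
        entry = counts[row[column]]
        entry[0] += 1
        if row[status_key] == delayed_value:
            entry[1] += 1
    return {cat: (total, 100.0 * delayed / total)
            for cat, (total, delayed) in counts.items()}

def text_bars(summary, width=40):
    """Render the flight counts as a crude horizontal bar chart."""
    largest = max(total for total, _ in summary.values())
    lines = []
    for cat, (total, pct) in sorted(summary.items()):
        bar = "#" * max(1, round(width * total / largest))
        lines.append(f"{cat:>5} {bar} n={total}, delayed={pct:.1f}%")
    return "\n".join(lines)

# Hypothetical rows, for illustration only.
flights = [
    {"origin": "DCA", "status": "ontime"},
    {"origin": "DCA", "status": "ontime"},
    {"origin": "DCA", "status": "delayed"},
    {"origin": "DCA", "status": "ontime"},
    {"origin": "IAD", "status": "delayed"},
    {"origin": "IAD", "status": "ontime"},
]
summary = delay_summary(flights, "origin")
print(text_bars(summary))
```

Reading the count and the delay percentage side by side matters here: a category with few flights can show an extreme delay percentage that says little about the category itself.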

2) Preprocess the dataset (to remove null values, generate dummy variables
etc.) and divide the dataset into 60% train and 40% test. Prepare a logistic
model that can obtain accurate classifications of new flights based on their
predictor information.
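
The preprocessing steps named in the question can be sketched as below, using only the standard library. This is an illustration under assumptions, not the code used for the report: the helper names (`drop_nulls`, `one_hot`, `split_rows`), the seed and the sample rows are hypothetical. The dummy columns are named `COLUMN_value`, which is the pattern seen in variable names such as 'ORIGIN_DCA' and 'Weather_0'.

```python
import random

def drop_nulls(rows):
    """Discard rows that contain a missing (None) value."""
    return [row for row in rows if all(v is not None for v in row.values())]

def one_hot(rows, column):
    """Replace a categorical column with 0/1 dummy columns named column_value."""
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded

def split_rows(rows, train_fraction=0.6, seed=0):
    """Shuffle a copy of the rows and split it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical rows, for illustration only; one row has a missing value.
raw = [{"ORIGIN": o, "Weather": w} for o, w in [
    ("DCA", 0), ("IAD", 0), ("BWI", 1), ("DCA", 0), ("DCA", None), ("IAD", 0),
    ("DCA", 1), ("BWI", 0), ("DCA", 0), ("IAD", 0), ("DCA", 0),
]]
clean = drop_nulls(raw)
encoded = one_hot(clean, "ORIGIN")
train, test = split_rows(encoded, train_fraction=0.6)
print(len(train), len(test))
```

The sketch keeps a dummy column for every level. Whether one level per variable should be dropped as a reference category is a modelling choice that the report does not describe.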

After pre-processing the data, I obtained the following results:

A) F1 score = 0.8792
B) Test accuracy = 87.62%
C) Confusion matrix = [[126  43]
                       [ 66 646]]
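
The reported figures can be cross-checked from the confusion matrix alone. The sketch below assumes that rows are true labels and columns are predicted labels, and that the F1 score is averaged over both classes weighted by class support. Neither assumption is stated in the report, but together they reproduce the reported values to rounding.

```python
def metrics_from_confusion(cm):
    """Accuracy and support-weighted F1 from a square confusion matrix.

    Assumes cm[i][j] counts samples of true class i predicted as class j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n_classes)) / total
    weighted_f1 = 0.0
    for i in range(n_classes):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                                 # rest of row i
        fp = sum(cm[r][i] for r in range(n_classes)) - tp    # rest of column i
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = sum(cm[i])
        weighted_f1 += f1 * support / total
    return accuracy, weighted_f1

accuracy, f1 = metrics_from_confusion([[126, 43], [66, 646]])
print(f"accuracy = {accuracy:.4f}, weighted F1 = {f1:.4f}")
```

The matrix sums to 881 test flights, of which 126 + 646 = 772 are classified correctly, giving an accuracy of about 0.876.
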
3) Interpret the model and coefficients and present some insights.
Most of the variables in the above model had very small t-statistics, because
the outcome depends mostly on the delay at the origin.

4) Perform variable selection, and reduce the size of the model, only keeping
the relevant variables based on the analysis done earlier.
A) Based on the t-statistics at an 80% confidence level, the following
variables were found to significantly affect delays:
B) ['Delay', 'DEST_LGA', 'DISTANCE_214', 'Weather_0', 'ORIGIN_DCA']
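
A selection rule of this kind can be sketched as follows: keep a variable when the absolute value of its coefficient divided by its standard error exceeds the two-sided critical value for the chosen confidence level. The sketch uses the normal approximation to the t distribution, which is an assumption that is reasonable for large samples. The coefficients and standard errors below are invented, since the report does not list standard errors.

```python
from statistics import NormalDist

def select_variables(coefs, std_errors, confidence=0.80):
    """Keep variables whose |coef / std_error| exceeds the critical value."""
    # Two-sided critical value; about 1.28 for 80% confidence.
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    selected = []
    for name, beta in coefs.items():
        t_stat = beta / std_errors[name]
        if abs(t_stat) > critical:
            selected.append(name)
    return selected, critical

# Hypothetical values, for illustration only.
example_coefs = {"feature_a": 0.20, "feature_b": 0.01, "feature_c": -0.12}
example_errors = {"feature_a": 0.05, "feature_b": 0.04, "feature_c": 0.06}
selected, critical = select_variables(example_coefs, example_errors)
print(selected, round(critical, 4))
```

An 80% level is a lenient threshold: it keeps more variables than the conventional 95% level, whose critical value is about 1.96.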

5) Conclude the analysis by fitting a new model on these selected variables
and report the same. Report the accuracy.
Based on the above variables, the following results were obtained:
A) F1 score = 0.8821
B) Test accuracy = 88.08%
C) Confusion matrix = [[134  47]
                       [ 58 642]]

6) Find the ideal weather conditions for the highest chance of an ontime flight
from DC to New York.
A) The model predicts that flight delays depend mostly on the following
variables, all of which have positive (+ve) beta values:

'Delay',        [0.09849605],
'DEST_LGA',     [0.0844899 ],
'DISTANCE_214', [0.06678279],
'Weather_0',    [0.19673315],
'ORIGIN_DCA',   [0.09010007]

B) Therefore the best weather condition is 0, i.e. no weather-related issues.
The model shows little correlation between delays and the day of the week or
month, but the visualisations suggest that the middle days of the week and
the early days of the month carry a lower chance of delay.
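
One way to read these beta values: in a logistic regression, raising a predictor by one unit multiplies the odds of the positive class by e^beta. The sketch below converts the reported coefficients into odds ratios. It assumes the coefficients are on the log-odds scale of unscaled predictors. The report does not state which class is coded as positive, whether the features were scaled, or the intercept, so the ratios indicate relative strength only.

```python
import math

# Coefficients as reported in A) above.
betas = {
    "Delay": 0.09849605,
    "DEST_LGA": 0.0844899,
    "DISTANCE_214": 0.06678279,
    "Weather_0": 0.19673315,
    "ORIGIN_DCA": 0.09010007,
}

# exp(beta) is the multiplicative change in odds per unit increase.
odds_ratios = {name: math.exp(beta) for name, beta in betas.items()}

ranked = sorted(odds_ratios.items(), key=lambda item: item[1], reverse=True)
for name, ratio in ranked:
    print(f"{name:<13} beta={betas[name]:.4f}  odds ratio={ratio:.3f}")
```

Under these assumptions 'Weather_0' has the largest ratio, roughly 1.22, which is consistent with the conclusion in B) that weather is the dominant condition.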
