A Machine Learning Model For Flight Delay Prediction: Certificate

This document provides a synopsis for a project that aims to build a machine learning model to predict flight delays. It will apply classification algorithms like decision trees and random forest classifiers to historical flight data to determine if a given flight's arrival will be delayed or not. The goals are to improve understanding of flight delays and help customers. It will focus only on prediction and not solutions. The data will come from publicly available sources and Python will be used for analysis and modeling. Key steps include data preprocessing, training a model, and evaluating accuracy on test data.

Uploaded by

Ramesh Kumar


PROJECT SYNOPSIS ON

A Machine Learning Model for

Flight Delay Prediction
By

PAYAL KUMARI (10900216037)

Under the guidance of:

(ANUPAM BERA)

Department of Information and Technology.

Netaji Subhash Engineering College

Garia, Kolkata – 700152

Certificate
Project group: SAURABH KUMAR, SAMEER AKHTER, PAYAL KUMARI, RAMESH KUMAR
Under my guidance and supervision the synopsis of the project
_____________________________________________________________of 4th
year Information and technology is submitted.

(signature of Project guide)

--------------------------------------
ANUPAM BERA
Information and Technology
Netaji Subhas Engineering College.
Garia, Kolkata - 700152

ACKNOWLEDGEMENT
I owe a deep sense of gratitude to my respected mentor, Prof. ANUPAM
BERA, Department of Information and Technology, Netaji Subhash
Engineering College, Kolkata, for his meticulous and expert guidance,
constructive criticism, patient hearing and benevolent behaviour throughout
the present research. I shall remain grateful to him for his
cordial, cooperative attitude and his wise and knowledgeable counsel, which acted as
an impetus to the successful completion of my project titled MACHINE
LEARNING MODEL FOR FLIGHT DELAY PREDICTION.
I would particularly like to thank the Head of the Department for giving me
guidance and inspiration during my study in the department. I will never forget
the kind help extended by the HOD, nor the help provided by all the
faculty members.
Last but not least, my friends in the department deserve some words
of thanks.

CONTENT

Abstract

Introduction

Project Goals and Scope

Data and Tools
4.1 Data Used
4.1.1 Choosing the Dataset
4.2 Tools
Python and associated packages

Proposed Work
5.1 Classification

System Design

System Architecture
I. Data Collection
II. Pre Processing
III. Training the Machine
IV. Data Scoring

Conclusion

Future work

References

Abstract
Growth in the aviation industry has resulted in air-traffic congestion, causing
flight delays. Flight delays have not only an economic impact but also
harmful environmental effects. Air-traffic management is becoming
increasingly challenging. In this project I apply machine learning
algorithms such as decision tree classifiers to predict whether a given flight's
arrival will be delayed or not.

Introduction

Delay is one of the most remembered performance indicators of any
transportation system. Notably, commercial aviation players understand delay
as the period by which a flight is late or postponed. Thus, a delay may be
represented by the difference between the scheduled and actual times of departure
or arrival of a plane. National regulatory authorities have a multitude of
indicators related to tolerance thresholds for flight delays. Indeed, flight delay
is an essential subject in the context of air transportation systems. In 2013,
36% of flights were delayed by more than five minutes in Europe, 31.1% of flights
were delayed by more than 15 minutes in the United States, and 16.3% of flights
were cancelled or suffered delays greater than 30 minutes in Brazil. This
indicates how relevant this indicator is and how it affects airline networks
regardless of their scale.
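
The definition of delay as the difference between scheduled and actual times can be sketched in a few lines of Python. This is only an illustration: the function names, the timestamp format and the sample times are assumptions, and the default 15-minute threshold is borrowed from the United States figure quoted above, not from any rule fixed by this project.

```python
from datetime import datetime

def delay_minutes(scheduled, actual, fmt="%Y-%m-%d %H:%M"):
    """Return actual minus scheduled time in minutes (negative means early)."""
    s = datetime.strptime(scheduled, fmt)
    a = datetime.strptime(actual, fmt)
    return (a - s).total_seconds() / 60.0

def is_delayed(minutes, threshold=15.0):
    """Flag a flight as delayed when it exceeds the tolerance threshold."""
    return minutes > threshold

# Hypothetical arrival: scheduled 14:30, actual 14:52
arr_delay = delay_minutes("2013-06-01 14:30", "2013-06-01 14:52")
print(arr_delay, is_delayed(arr_delay), is_delayed(arr_delay, threshold=30.0))
```

The same flight counts as delayed under a 15-minute threshold but not under a 30-minute one, which is why the threshold has to be stated whenever delay percentages are compared.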

Project Goals and Scope

A chief goal of this project is to add to the academic
understanding of flight delay prediction. The hope is that, with a greater
understanding of how flights are delayed, customers will be better
equipped to avoid delays.

It is important here to define the scope of the project. This project
will focus exclusively on predicting flight delays. The project will
make no attempt to decide how much money to allocate to each
prediction. Beyond prediction itself, the project will analyse the accuracy of these
predictions.

Data and Tools

4.1 Data Used

4.1.1 Choosing the Dataset

We have selected a dataset available on kaggle.com. The features
contained in the dataset are as follows:

1. Origin
2. Dest
3. Unique_Carrier
4. Day_of_Week
5. Dep_Hour
6. Arr_Delay
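
As a minimal sketch of how these six features might be read and turned into a binary target, the snippet below parses a few rows with Python's standard csv module. The sample rows are synthetic and only illustrate the column layout. Treating Arr_Delay as minutes and using a 15-minute cut-off are assumptions, since the synopsis does not fix either.

```python
import csv
import io

# Synthetic rows for illustration only; not taken from the actual dataset.
SAMPLE_CSV = """Origin,Dest,Unique_Carrier,Day_of_Week,Dep_Hour,Arr_Delay
JFK,LAX,AA,1,9,22
ORD,SFO,UA,3,17,-4
ATL,MIA,DL,5,6,15
"""

DELAY_THRESHOLD_MIN = 15  # assumption: the synopsis does not state a threshold

def load_flights(handle):
    """Parse flight rows and add a binary 'Delayed' label from Arr_Delay."""
    flights = []
    for rec in csv.DictReader(handle):
        arr_delay = float(rec["Arr_Delay"])
        flights.append({
            "Origin": rec["Origin"],
            "Dest": rec["Dest"],
            "Unique_Carrier": rec["Unique_Carrier"],
            "Day_of_Week": int(rec["Day_of_Week"]),
            "Dep_Hour": int(rec["Dep_Hour"]),
            "Arr_Delay": arr_delay,
            "Delayed": int(arr_delay > DELAY_THRESHOLD_MIN),
        })
    return flights

flights = load_flights(io.StringIO(SAMPLE_CSV))
print([f["Delayed"] for f in flights])
```

With a real file, the io.StringIO object would be replaced by an open file handle.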

4.2 Tools

Python and associated packages

Python was the language of choice for this project. This was an easy
decision for multiple reasons.

1. Python as a language has an enormous community behind it. Most
problems that might be encountered can be solved with a
trip to Stack Overflow. Python is among the most popular
languages on the site, which makes it very likely there will be a
direct answer to any query.

2. Python has an abundance of powerful tools ready for scientific
computing. Packages such as NumPy, Pandas, and SciPy are freely
available, performant, and well documented. Packages such as
these can dramatically reduce and simplify the code needed to
write a given program. This makes iteration quick.

3. Python as a language is forgiving and allows for programs that
look like pseudocode. This is useful when pseudocode given in
academic papers needs to be implemented and tested. Using
Python, this step is usually reasonably trivial.

Proposed Work

This project treats flight delay prediction as a classification problem.

5.1 Classification
Classification is an instance of supervised learning in which a set of
observations is analyzed and categorized based on common attributes. From the
given data, classification draws a conclusion about each
observed value. Given one or more inputs, a classifier
tries to predict one or more outcomes for them. The classifiers
used here for flight delay prediction include the random
forest classifier and the SVM classifier.

Random Forest Classification and Logistic Regression

Random Forest Classifier

The random forest classifier is a type of ensemble classifier and a
supervised learning algorithm. It creates a set of decision trees, each of
which yields a result. The basic approach of the random forest classifier is to
aggregate the decisions of trees built on random subsets of the data and yield a
final class or result based on the votes of those decision
trees.
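
The voting step can be sketched as follows. This shows only the aggregation, not the training of the trees. The three "trees" are hypothetical hand-written rules standing in for trained decision trees, and the carrier code "XX" is a placeholder.

```python
from collections import Counter

# Hypothetical stand-ins for trained decision trees: each maps a flight
# record to a class label (1 = delayed, 0 = on time).
def tree_a(flight):
    return int(flight["Dep_Hour"] >= 17)

def tree_b(flight):
    return int(flight["Day_of_Week"] in (5, 7))

def tree_c(flight):
    return int(flight["Unique_Carrier"] == "XX")

def forest_predict(trees, flight):
    """Return the class that receives the most votes from the trees."""
    votes = Counter(tree(flight) for tree in trees)
    return votes.most_common(1)[0][0]

trees = [tree_a, tree_b, tree_c]
evening_friday = {"Dep_Hour": 18, "Day_of_Week": 5, "Unique_Carrier": "AA"}
morning_friday = {"Dep_Hour": 9, "Day_of_Week": 5, "Unique_Carrier": "AA"}
print(forest_predict(trees, evening_friday), forest_predict(trees, morning_friday))
```

An odd number of trees is used here so that a two-class vote cannot tie.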

Parameters

The parameters of the random forest classifier include
n_estimators, which is the total number of decision trees, and other
hyperparameters such as oob_score, which controls whether out-of-bag samples
are used to estimate the generalization accuracy of
the random forest, and max_features, which is the number of
features considered when looking for the best split. min_weight_fraction_leaf is the minimum
weighted fraction of the sum total of weights of all the input samples
required to be at a leaf node. Samples have equal weight when sample
weight is not provided.
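
A minimal sketch of how these parameters might be set, assuming the scikit-learn library (the synopsis does not name the library, but its RandomForestClassifier uses these parameter names). The data comes from scikit-learn's synthetic generator, not the flight dataset, and the chosen values are illustrative rather than tuned.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic two-class data standing in for the processed flight features.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

clf = RandomForestClassifier(
    n_estimators=100,              # total number of decision trees
    oob_score=True,                # estimate accuracy on out-of-bag samples
    max_features="sqrt",           # features considered at each split
    min_weight_fraction_leaf=0.0,  # minimum weighted fraction at a leaf
    random_state=0,                # fixed seed for repeatability
)
clf.fit(X, y)

print(len(clf.estimators_), clf.oob_score_)
```

After fitting, oob_score_ holds the out-of-bag accuracy estimate, which gives a rough check on generalization without touching the test set.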

System Design

The first step is the conversion of the raw data into processed data.
This is done using feature extraction, since the raw data collected
contains many attributes but only a few of them are
useful for prediction. Feature extraction starts from an
initial set of measured data and builds derived values, or features.
These features are intended to be informative and non-redundant,
facilitating the subsequent learning and generalization steps. Feature
extraction is a dimensionality reduction process, in which the initial set of
raw variables is reduced to a more manageable set of features
that still accurately and completely describes the original
data set.

Feature extraction is followed by a
classification process, in which the data obtained after feature
extraction is split into two distinct segments, a training set and a test set.
Classification is the problem of identifying to which of a set of categories a
new observation belongs. The training data set is used to train the model,
whereas the test data is used to estimate the accuracy of the model. The split is
made so that the training data forms a larger proportion than the
test data.

The random forest algorithm uses a collection of randomized
decision trees to analyze the data. In layman's terms, out of the total
number of decision trees in the forest, groups of decision trees
look for specific attributes in the data and split the data on them. In
this case, the end goal of our proposed system is to predict
flight delay from historical data.
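
The feature extraction step described above might look like the following sketch. The raw record and its extra fields (Dep_Time_HHMM, Tail_Num, Distance) are hypothetical, as is the idea that Dep_Hour is derived from an HHMM departure time. The actual raw columns depend on the dataset used.

```python
# Attributes copied unchanged from the raw record.
KEY_ATTRIBUTES = ("Origin", "Dest", "Unique_Carrier", "Day_of_Week", "Arr_Delay")

def extract_features(raw):
    """Keep the key attributes and derive Dep_Hour from a raw HHMM time."""
    features = {name: raw[name] for name in KEY_ATTRIBUTES}
    # Hypothetical raw field: departure time as an HHMM integer, e.g. 1745.
    features["Dep_Hour"] = int(raw["Dep_Time_HHMM"]) // 100
    return features

raw_record = {
    "Origin": "JFK", "Dest": "LAX", "Unique_Carrier": "AA",
    "Day_of_Week": 1, "Arr_Delay": 22,
    "Dep_Time_HHMM": 1745, "Tail_Num": "N123AA", "Distance": 2475,
}
features = extract_features(raw_record)
print(features)
```

Eight raw attributes are reduced to six features, one of which (Dep_Hour) is a derived value rather than a direct copy.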

System Architecture

The various modules of the project would be divided into

the segments as described.

I. Data Collection
Data collection is the most basic module and the initial step of the
project. It deals with the collection of the right dataset. The
dataset that is to be used for prediction has to be filtered
based on various aspects. Data collection can also
enhance the dataset by adding external data. Our data mainly
consists of the previous year's flight timetable. Initially, we will
analyze the Kaggle dataset and, depending on the accuracy obtained, we will
use the model with the data to make accurate predictions.

II. Pre Processing

Data pre-processing is a part of data mining that involves
transforming raw data into a more coherent format. Raw data is
usually inconsistent or incomplete and often contains many errors.
Data pre-processing involves checking for missing values,
handling categorical values, splitting the dataset into training and
test sets and, finally, applying feature scaling to limit the range of variables so
that they can be compared on a common scale.
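
Three of these steps (missing values, categorical values and feature scaling) can be sketched in plain Python as below. The helper names and sample rows are illustrative assumptions. Dropping incomplete rows, one-hot encoding and min-max scaling are one possible choice for each step, not necessarily the ones the final project uses.

```python
def drop_missing(rows):
    """Remove rows in which any value is None or an empty string."""
    return [r for r in rows if all(v not in (None, "") for v in r.values())]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new[f"{column}_{cat}"] = int(r[column] == cat)
        encoded.append(new)
    return encoded

def min_max_scale(rows, column):
    """Rescale a numeric column linearly to the range [0, 1]."""
    values = [float(r[column]) for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [{**r, column: (float(r[column]) - lo) / span} for r in rows]

rows = [
    {"Unique_Carrier": "AA", "Dep_Hour": 6},
    {"Unique_Carrier": "UA", "Dep_Hour": 18},
    {"Unique_Carrier": "", "Dep_Hour": 9},   # missing carrier, will be dropped
    {"Unique_Carrier": "AA", "Dep_Hour": 12},
]
clean = min_max_scale(one_hot(drop_missing(rows), "Unique_Carrier"), "Dep_Hour")
print(clean)
```

In a full pipeline the scaling range would normally be computed on the training set only and then reused for the test set, so that no information from the test data leaks into training.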

III. Training the Machine

Training the machine means feeding the data to the algorithm so that it
can learn from it. The training sets are used to tune and fit the
models. The test sets are left untouched during training, as a model should be
judged on data it has not seen. The training of the model includes
cross-validation, which gives a well-grounded estimate of the performance of
the model using only the training data. Tuning the model means
adjusting hyperparameters such as the number of trees in a
random forest. We perform the entire cross-validation loop on each set
of hyperparameter values and calculate a cross-validated
score for each set. Then we select the best
hyperparameters. The idea behind training the model is that we
start with some initial values and then optimize the parameters
of the model on the dataset. This is repeated until we reach
the optimal values. Finally, we take predictions from the trained
model on the inputs from the test dataset. For this purpose the data is divided
in the ratio 80:20, where 80% forms the training set and the remaining 20% the
test set.
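
The 80:20 split and the cross-validated choice of a hyperparameter can be sketched as follows. To keep the example self-contained, a small nearest-neighbour rule on Dep_Hour stands in for the random forest, and its number of neighbours plays the role of the hyperparameter. The data is synthetic, and all function names are illustrative assumptions.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split off the test fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def k_folds(rows, k):
    """Yield (train, validation) pairs for k-fold cross-validation."""
    for i in range(k):
        validation = rows[i::k]
        train = [r for j, r in enumerate(rows) if j % k != i]
        yield train, validation

def knn_predict(train, hour, n_neighbors):
    """Stand-in model: majority label of the nearest training rows by hour."""
    nearest = sorted(train, key=lambda r: abs(r[0] - hour))[:n_neighbors]
    return int(2 * sum(label for _, label in nearest) > len(nearest))

def accuracy(train, rows, n_neighbors):
    """Fraction of rows whose label the stand-in model predicts correctly."""
    hits = sum(knn_predict(train, h, n_neighbors) == y for h, y in rows)
    return hits / len(rows)

def cross_validated_score(rows, n_neighbors, k=5):
    """Average validation accuracy over the k folds."""
    scores = [accuracy(tr, va, n_neighbors) for tr, va in k_folds(rows, k)]
    return sum(scores) / len(scores)

# Synthetic (Dep_Hour, Delayed) pairs: flights from 16:00 onwards are late.
data = [(hour, int(hour >= 16)) for hour in range(24)] * 5

train, test = train_test_split(data)            # 80:20 split
candidates = (1, 3, 5)                          # hyperparameter values to try
best = max(candidates, key=lambda n: cross_validated_score(train, n))
test_accuracy = accuracy(train, test, best)     # judged once on unseen data
print(len(train), len(test), best, test_accuracy)
```

The test set is used only in the last step. Every comparison between hyperparameter values relies on folds of the training set.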

IV. Data Scoring

The process of applying a predictive model to a set of data is referred to
as scoring the data. The technique used to process the dataset is the
random forest algorithm. Random forest is an ensemble
method that is commonly used for classification as well as
regression. Based on the learned models, we achieve interesting
results. The last module thus describes how the result of the model can
help to predict the probability of a flight delay based on certain
parameters. It also shows the vulnerabilities of a particular entity. A
user authentication control is implemented to make sure that
only authorized entities access the results.
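
A minimal sketch of scoring, under two assumptions: that the probability of delay is taken as the share of trees voting "delayed", and that results are summarized with accuracy and confusion counts. The votes, labels and predictions below are made-up values for illustration only.

```python
def delay_probability(tree_votes):
    """Estimate the delay probability as the share of trees voting 1."""
    return sum(tree_votes) / len(tree_votes)

def score(y_true, y_pred):
    """Return confusion counts and accuracy for binary labels (1 = delayed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": (tp + tn) / len(pairs)}

# Made-up values for illustration only.
probability = delay_probability([1, 1, 0, 1])
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
result = score(y_true, y_pred)
print(probability, result)
```

Reporting the confusion counts alongside accuracy helps when delayed flights are much rarer than on-time ones, since accuracy alone can then look high even for a weak model.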

Conclusion

In this project, I have been able to successfully apply machine learning
algorithms to predict flight arrival delay and to show that simple classifiers such as
decision trees can predict fairly accurately whether a flight's arrival will
be delayed or not.

Future work

In future work I would like to further improve my model, perhaps with
more training data, a deeper neural network, or both, so that
the accuracy can be improved further.

References:

[1] C. Cetek, E. Cinar, F. Aybek, and A. Cavcar, “Capacity and delay analysis for
airport manoeuvring areas using simulation,” Aircraft Engineering and
Aerospace Technology, vol. 86, no. 1, pp. 43–55, 2013. [Online]. Available:
https://doi.org/10.1108/AEAT-04-2012-0058

[2] K. B. Nogueira, P. H. Aguiar, and L. Weigang, “Using ant algorithm to
arrange taxiway sequencing in airport,” International Journal of Computer
Theory and Engineering, vol. 6, no. 4, p. 357, 2014.

[3] R. R. Clewlow, I. Simaiakis, and H. Balakrishnan, “Impact of arrivals on
departure taxi operations at airports,” 2010.

[4] https://www.researchgate.net/publication/315382748_A_Review_on_Flight_Delay_Prediction
