Predictive Analytics For Enhanced Passenger Satisfaction in The Airline Industry: Leveraging Machine Learning To Drive Strategic Decision-Making
979-8-3503-8735-3/24/$31.00 ©2024 IEEE
Authorized licensed use limited to: De Montfort University. Downloaded on February 10,2025 at 03:47:19 UTC from IEEE Xplore. Restrictions apply.

The comprehensive data framework developed in this study includes diverse variables, such as in-flight services and logistical aspects of travel, and measures their impact on passenger satisfaction. Previous research has shown that service quality directly influences passenger satisfaction and loyalty [5].

Research has identified several key factors that significantly impact airline passenger satisfaction. These include service quality attributes such as seat comfort, in-flight entertainment, and the efficiency of check-in services. A study by Walia, Sharma, and Mathur revealed that tangible and intangible service aspects collectively shape the overall passenger experience. Their findings suggest that airlines must focus on both operational efficiencies and customer service improvements to enhance satisfaction levels [10].

Moreover, studies such as that of Archana and Subha emphasize the importance of addressing delays and improving punctuality as critical factors influencing passenger satisfaction. Their research indicates that reducing both departure and arrival delays can substantially enhance customer perceptions of airline reliability and service quality [11].

The application of predictive analytics extends beyond mere prediction to provide actionable recommendations for service improvement. By analyzing comprehensive datasets that include both qualitative and quantitative attributes of passenger experiences, airlines can identify specific areas for enhancement. For example, Kobi and Otieno demonstrated that machine learning models could uncover hidden patterns in passenger feedback, enabling airlines to make informed decisions about service modifications [12].

Despite these advancements, challenges remain in ensuring data quality, handling imbalanced datasets, and addressing privacy concerns related to the use of passenger data. Continuous innovation and adherence to ethical standards will be crucial in leveraging predictive analytics to its full potential in the airline industry.

A. Objectives
The primary objectives of this study are as follows:
• Predictive Analysis: Develop predictive models to accurately determine passenger satisfaction levels based on various factors.
• Service Improvement Insights: Identify key areas for improvement in airline services to enhance passenger satisfaction.
• Strategic Recommendations: Provide actionable recommendations for optimizing airline services to improve customer satisfaction and increase profitability.

The study will utilize a comprehensive dataset sourced from Kaggle, which includes detailed passenger feedback on various aspects of their flight experience. The dataset will be analyzed using RapidMiner software, which provides robust tools for data preprocessing, visualization, and model development [13].

III. RESEARCH METHODOLOGY

A. Dataset Understanding
This study uses a dataset called "Airline Passenger Satisfaction" to analyze factors affecting passenger satisfaction across various airline services. The dataset contains 130,000 entries and 23 features, including demographic information, service ratings, and flight details. Qualitative features include ratings of inflight wifi, food and drink quality, and seat comfort, together with overall satisfaction. Quantitative features include flight distance and delay times. The dataset, originally sourced from Kaggle, has been cleaned and preprocessed for accuracy and relevance to predictive modeling. It provides a multifaceted view of the customer base and service-specific ratings, which makes it a valuable resource for enhancing passenger satisfaction and service efficiency [14].

B. Data Pre-processing
To prepare the dataset for further analysis, we applied the following steps in RapidMiner [15] (see Fig. 1):

i. Importing Dataset
Two "Read CSV" operators import the raw data, because the training and testing datasets are stored in separate files. Keeping the two sets separate supports accurate model validation, and together they cover the full set of flight-experience and customer-demographic features.

ii. Attributes Selection
The "Select Attributes" operators select the attributes used for further analysis, excluding 'att1' and 'id' because they are irrelevant and have no predictive power.

iii. Handling Missing Values
The "Replace Missing Values" operators replace missing values with the average of their respective columns, on the assumption that values are missing at random and the columns are only minimally skewed.

iv. Assigning Role
The "Set Role" operator assigns a role to each attribute for the supervised learning task: "Satisfaction" is defined as the target variable and the remaining attributes as independent variables.
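The four pre-processing steps can be approximated outside RapidMiner. The following is a minimal Python sketch using only the standard library; it is not the RapidMiner process used in the study. The inline CSV values are invented, the feature column names and the lowercase label name `satisfaction` are assumptions, and the helper `preprocess` is hypothetical. Only numeric attributes are shown; categorical attributes would need separate handling.

```python
import csv
import io
from statistics import mean

# Toy stand-in for one of the CSV files; all values are invented for illustration.
RAW_CSV = """att1,id,Age,Flight Distance,Arrival Delay in Minutes,satisfaction
0,101,25,460,18,neutral or dissatisfied
1,102,41,2300,,satisfied
2,103,33,1100,6,satisfied
"""

DROP = {"att1", "id"}      # identifier columns excluded in the paper
LABEL = "satisfaction"     # target column (exact name is an assumption)


def preprocess(csv_text):
    """Return (features, labels) after the four steps described above."""
    # i. Import: read rows as dictionaries keyed by column name.
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # ii. Attribute selection: drop the identifier columns and the label.
    feature_names = [c for c in rows[0] if c not in DROP and c != LABEL]

    # iii. Missing values: compute each column mean from the non-empty cells.
    column_means = {}
    for name in feature_names:
        present = [float(r[name]) for r in rows if r[name] != ""]
        column_means[name] = mean(present)

    # Replace empty cells with the mean of their column.
    features = [
        {n: float(r[n]) if r[n] != "" else column_means[n] for n in feature_names}
        for r in rows
    ]

    # iv. Role assignment: keep the label apart from the predictors.
    labels = [r[LABEL] for r in rows]
    return features, labels


features, labels = preprocess(RAW_CSV)
print(features[1]["Arrival Delay in Minutes"])  # imputed with the mean of 18 and 6
print(labels)
```

In a real workflow the same function would be applied to the training and testing files separately, ideally reusing the training-set means for the test set so that no information leaks from the test data.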
D. Model’s Evaluation
The "Performance" operator evaluates the models' predictions by comparing them to the actual outcomes using various metrics. Accuracy is the most straightforward performance measure, but precision and recall provide deeper insights into the models' predictive abilities. Precision measures the correctness of the positive predictions, while recall, also known as sensitivity, measures the model's ability to identify all actual positives. Other metrics, such as the F1-score, may be considered to balance precision and recall, particularly if there is a class imbalance in the dataset [16].
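To make these metrics concrete, the sketch below computes them from true and predicted labels using the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 as the harmonic mean of the two. It is an illustrative Python example, not the RapidMiner "Performance" operator; the function name `classification_metrics`, the label strings, and the sample predictions are assumptions made for the example.

```python
def classification_metrics(actual, predicted, positive="satisfied"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels, for illustration only.
actual = ["satisfied", "satisfied", "satisfied", "dissatisfied", "dissatisfied"]
predicted = ["satisfied", "satisfied", "dissatisfied", "satisfied", "dissatisfied"]
metrics = classification_metrics(actual, predicted)
print(metrics)
```

Here two of the three predicted positives are correct and two of the three actual positives are found, so precision and recall are both 2/3 while accuracy is 3/5.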
IV. INSIGHTS
The airline passenger satisfaction dataset was analyzed using
RapidMiner's built-in visualization capabilities to identify patterns and
correlations that could impact passenger satisfaction.
• Age Distribution and Satisfaction
Histograms segmented by satisfaction level (see Fig. 3) showed that younger and older passengers are more dissatisfied than middle-aged passengers.
Figure 4: Inflight Service Ratings vs. Satisfaction distribution using box plot
• Flight Distance and Class
We plotted flight distance against travel class (see Fig. 5) to examine trends in passenger distribution and satisfaction. Longer flights in business class showed higher satisfaction rates, indicating that comfort becomes more critical on longer journeys.
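A segmentation of this kind can also be expressed numerically. The Python sketch below groups passengers by travel class and a distance band and reports the share of satisfied passengers in each group. It only illustrates the idea: the records are invented, the 1500 cut-off between short and long flights is an assumed value that does not come from the study, and `satisfaction_rates` is a hypothetical helper.

```python
from collections import defaultdict

# Invented records: (travel class, flight distance, satisfied?)
records = [
    ("Business", 2500, True), ("Business", 3100, True),
    ("Business", 400, False), ("Eco", 2700, False),
    ("Eco", 350, True), ("Eco", 500, False),
]

LONG_HAUL_THRESHOLD = 1500  # assumed cut-off, not taken from the paper


def satisfaction_rates(rows, threshold=LONG_HAUL_THRESHOLD):
    """Share of satisfied passengers per (class, distance band) group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [satisfied, total]
    for travel_class, distance, satisfied in rows:
        band = "long" if distance >= threshold else "short"
        group = counts[(travel_class, band)]
        group[0] += int(satisfied)
        group[1] += 1
    return {key: sat / total for key, (sat, total) in counts.items()}


for group, rate in sorted(satisfaction_rates(records).items()):
    print(group, f"{rate:.2f}")
```

With real data, comparing the rate for long business-class flights against the other groups would quantify the pattern that the chart shows visually.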
Figure 5: Flight Distance and Class distribution using Bar chart
lower at 66.59%, indicating it missed some satisfied passengers. The model's AUC was solid, averaging 0.810.
The Naive Bayes model demonstrated superior accuracy and reliability in predicting passenger satisfaction, outperforming the KNN model in terms of accuracy, precision, recall, and AUC on this dataset.

has a higher AUC value, indicating its superior predictive power. This highlights its superior ability to distinguish between classes.
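The AUC values compared here can be read as the probability that a randomly chosen satisfied passenger receives a higher predicted score than a randomly chosen dissatisfied one. The Python sketch below computes AUC directly from that pairwise definition. It is a small illustration rather than the procedure RapidMiner uses internally; the function name `roc_auc` and the example scores are assumptions, and the pairwise loop is quadratic, so it suits only small samples.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Invented scores for illustration; 1 = satisfied, 0 = not satisfied.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2]
print(roc_auc(labels, scores))
```

In this toy case five of the six positive-negative pairs are ordered correctly, giving an AUC of 5/6. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.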
VI. PERFORMANCE COMPARISON
Table 1 compares the performance metrics of the K-Nearest Neighbors (KNN) and Naive Bayes models.
could help track changes in passenger expectations over time, allowing airlines to adapt their strategies to shifting market conditions. By pursuing these avenues, airlines can refine their approaches to improving passenger satisfaction, ensuring that their service enhancements are both strategic and effective.

VIII. REFERENCES

[1]. M. Salamoura, I. Chaniotakis, and C. Lymperopoulos, "Enhancing Airline Passengers' Satisfaction Through Service Quality: The Importance of the Human Factor," Journal of Air Transport Studies, vol. 8, pp. 54-69, 2017, doi: 10.38008/jats.v8i2.32.
[2]. Rish, "An Empirical Study of the Naïve Bayes Classifier," in IJCAI 2001 Work Empir Methods Artif Intell., vol. 3, 2001.
[3]. G. Guo, H. Wang, D. Bell, and Y. Bi, "KNN Model-Based Approach in Classification," 2004.
[4]. S. Mohamed and H. Klaus, "AI and ML for Data-Driven Insights: Machine learning algorithms can analyze vast amounts of medical data to identify patterns and trends," 2024.
[5]. G. C. Saha and Theingi, "Service quality, satisfaction, and behavioural intentions: A study of low-cost airline carriers in Thailand," Managing Service Quality: An International Journal, vol. 19, no. 3, pp. 350-372, 2009.
[6]. M. Rampersad-Jagmohan and Y. Wang, "Predictive Analytics in Aviation Management," 2024, doi: 10.1007/978-981-97-0665-5_52.
[7]. M.-Y. Park, Y.-J. Kim, and Y.-H. Park, "A Deep Learning Approach to Analyze Airline Customer Propensities: The Case of South Korea," Applied Sciences, vol. 12, no. 4, p. 1916, 2022, doi: 10.3390/app12041916.
[8]. L. Jiang, H. Zhang, and J. Su, "Learning k-Nearest Neighbor Naive Bayes for Ranking," in Lecture Notes in Computer Science, vol. 3584, pp. 175-185, 2005, doi: 10.1007/11527503_21.
[9]. P. Kang and S. Cho, "Locally linear reconstruction for instance-based learning," Pattern Recognition, vol. 41, no. 11, pp. 3507-3518, 2008, doi: 10.1016/j.patcog.2008.04.009.
[10]. S. Walia, D. Sharma, and A. Mathur, "The Impact of Service Quality on Passenger Satisfaction and Loyalty in the Indian Aviation Industry," 2021.
[11]. R. Archana and M. Subha, "A study on service quality and passenger satisfaction on Indian airlines," International Journal of Multidisciplinary Research, vol. 2, 2012.
[12]. J. Kobi and B. Otieno, "Predictive Analytics Applications for Enhanced Customer Retention and Increased Profitability in the Telecommunications Industry," International Journal of Innovative Science and Research Technology, vol. 9, 2024, doi: 10.38124/ijisrt/IJISRT24MAY1148.
[13]. M. S. Mousavian, S. Miah, and Y. Zhong, "A design concept of big data analytics model for managers in hospitality industries," Personal and Ubiquitous Computing, vol. 27, pp. 1-11, 2023, doi: 10.1007/s00779-023-01714-3.
[14]. J. D., "Passenger Satisfaction," 2018. [Online]. Available: https://www.kaggle.com/datasets/teejmahal20/airline-passenger-satisfaction?select=train.csv.
[15]. Mierswa and R. Klinkenberg, "RapidMiner Studio (9.1)," Data science, machine learning, predictive analytics, 2018. [Online]. Available: https://rapidminer.com/.
[16]. Y. Shi, "A Machine Learning Study on the Model Performance of Human Resources Predictive Algorithms," in 2022 4th International Conference on Applied Machine Learning (ICAML), Changsha, China, 2022, pp. 405-409, doi: 10.1109/ICAML57167.2022.00082.