BDA Final

The document presents a project report on 'Heart Disease Prediction Using Machine Learning' submitted by students at Mahendra College of Engineering for their Bachelor of Technology degree. It outlines the project's objectives, scope, and the use of machine learning algorithms to predict heart disease risk based on various health indicators. The report also includes acknowledgments, literature review, system analysis, and requirements for the proposed predictive model.


HEART DISEASE PREDICTION USING

MACHINE LEARNING

A PROJECT REPORT

Submitted by

Pavethra M (621522205037)
Rubini N (621522205044)
Dhushara S (621522205015)

in partial fulfillment for the award of the degree

of

BACHELOR OF TECHNOLOGY

in

INFORMATION TECHNOLOGY

MAHENDRA COLLEGE OF ENGINEERING,

MAHENDRA SALEM CAMPUS-636106.

ANNA UNIVERSITY::CHENNAI 600 025

MAY 2025
ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “HEART DISEASE PREDICTION USING MACHINE LEARNING”
is the bonafide work done by

Pavethra M (621522205037)
Rubini N (621522205044)
Dhushara S (621522205015)

who carried out the project work under my supervision.

SIGNATURE
Dr. T. AKILA, M.E., Ph.D.,
ASSOCIATE PROFESSOR,
HEAD OF THE DEPARTMENT,
Department of Information Technology,
Mahendra College of Engineering,
Minnampalli, Salem-636106.

SIGNATURE
Dr. T. AKILA, M.E., Ph.D.,
ASSOCIATE PROFESSOR,
SUPERVISOR,
Department of Information Technology,
Mahendra College of Engineering,
Minnampalli, Salem-636106.

Submitted to Project and Viva-Voce Examination held on __________________________at MCE.

INTERNAL EXAMINER EXTERNAL EXAMINER

ACKNOWLEDGEMENT

The success and final outcome of this project required a lot of guidance and assistance from
many people, and we are extremely fortunate to have received it throughout the completion of
our project work.

We respect and thank Thirumigu. M. G. BHARATHKUMAR, Founder & Chairman, and
Shrimathi. VALLIYAMMAL BHARATHKUMAR, Secretary, for their guidance and
blessings. We also express our deepest gratitude to the Managing Directors, Er. Ba.
MAHENDHIRAN and Er. B. MAHA AJAY PRASATH, who molded us both technically and
morally for achieving greater success in life.

We are extremely grateful to Dr. N. MOHANASUNDARARAJU, Principal, for his
constant encouragement, inspiration, presence, and blessings throughout our course, especially
for providing us with an environment to complete our project successfully.

We also extend our sincere appreciation to Dr. T. AKILA, Head of the Department of
Information Technology, who provided her valuable suggestions and precious time in
accomplishing our project report.

We owe our profound gratitude to our guide, Dr. T. AKILA, Head of the Department of
Information Technology, who took an interest in our project work and provided all the necessary
information for developing the project successfully. We also thank all the staff members of our
college and the technicians for their help in making this project a successful one.

Lastly, we would like to thank the Almighty and our parents for their moral support, and our
friends, with whom we shared our day-to-day experience and from whom we received many
suggestions that improved the quality of our work.

ABSTRACT

Nowadays communication plays a major role in everything, be it professional or
personal. Email is used extensively because it is free to use, cheap to operate,
accessible, and popular. Email has one major security flaw: anyone can send an email
to anyone else just by knowing their unique user ID. This flaw is exploited by some
businesses and ill-motivated persons for advertising, phishing, malicious purposes,
and ultimately fraud. This produces a category of email called SPAM.

Spam refers to any email that contains advertisements or that is unrelated and sent
frequently. These emails are increasing in number day by day. Studies show that around
55 percent of all emails are some kind of spam. Service providers put a lot of effort
into combating it, but spam keeps evolving by changing the obvious markers of detection.
Moreover, the spam detection of service providers can never be aggressive with
classification, because a misclassification may cause potential information loss.

To tackle this problem, we present a new and efficient method to detect spam
using machine learning and natural language processing: a tool that can detect and
classify spam. In addition, it provides information regarding the text
provided in a quick-view format for user convenience.

TABLE OF CONTENTS

CHAPTER NO    CONTENTS

              ABSTRACT
1             INTRODUCTION
              1.1 OVERVIEW
              1.2 SCOPE OF THE PROJECT
              1.3 OBJECTIVE OF THE PROJECT
2             LITERATURE REVIEW
              2.1 EXISTING PREDICTION TECHNIQUES
              2.2 MACHINE LEARNING IN HEALTHCARE
              2.3 FEATURE SELECTION METHODS
              2.4 DATASETS USED IN HEART DISEASE PREDICTION
              2.5 ML MODELS IN MEDICAL PREDICTION
3             SYSTEM ANALYSIS
              3.1 EXISTING SYSTEM
              3.1.1 DISADVANTAGES OF EXISTING SYSTEM
              3.2 PROPOSED SYSTEM
              3.2.1 ADVANTAGES OF PROPOSED SYSTEM
4             SYSTEM REQUIREMENTS
              4.1 HARDWARE REQUIREMENTS
              4.2 SOFTWARE REQUIREMENTS
5             SYSTEM DESIGN AND DEVELOPMENT
              5.1 ML ARCHITECTURE OVERVIEW
              5.2 SYSTEM ARCHITECTURE
              5.3 DATA COLLECTION
              5.4 DATA PRE-PROCESSING
              5.5 TESTING DATASET
              5.6 ALGORITHM SELECTION
              5.6.1 ALGORITHM USED
              5.6.2 NAIVE BAYES CLASSIFIER
              5.6.3 RANDOM FOREST
              5.6.4 SUPPORT VECTOR MACHINE (SVM)
              5.6.5 NEURAL NETWORKS
              5.6.6 MODEL PERFORMANCE COMPARISON
6             UML DIAGRAMS
              6.1 CLASS DIAGRAM
              6.2 USE CASE DIAGRAM
              6.3 ACTIVITY DIAGRAM
7             PERFORMANCE ANALYSIS
              7.1 ABOUT THE DATASET
              7.2 ACCURACY COMPARISON OF ALGORITHMS
              7.3 METHODOLOGY
8             CONCLUSION
9             FUTURE ENHANCEMENT
10            APPENDIX
              10.1 SOURCE CODE
              10.2 OUTPUT SCREENS
11            BIBLIOGRAPHY

CHAPTER-1
INTRODUCTION

1.1 OVERVIEW

Heart Disease Prediction Using Machine Learning is an intelligent healthcare
application developed to assess the risk of heart disease in individuals based on their
medical and lifestyle data. This system employs machine learning algorithms to
analyze various clinical parameters such as age, blood pressure, cholesterol levels, heart
rate, and other vital indicators to predict the likelihood of heart-related conditions. By
leveraging data-driven insights, the platform provides accurate and timely predictions that
can assist doctors and healthcare professionals in early diagnosis and treatment planning. This
model aims to support proactive health management, reduce the burden on healthcare
systems, and ultimately save lives by identifying at-risk individuals before critical
symptoms arise.

1.2 SCOPE OF THE PROJECT

The scope of the Heart Disease Prediction Using Machine Learning project is to
develop an intelligent and data-driven system capable of analyzing patient health data to
predict the likelihood of heart disease. This system is designed to assist healthcare
professionals, researchers, and medical institutions in making more informed, timely, and
accurate diagnostic decisions. By utilizing machine learning algorithms and statistical
modeling techniques, the application can process multiple health indicators—such as age,
cholesterol level, blood pressure, heart rate, and chest pain type—to classify patients as
either at risk or not at risk of developing heart disease.

The project primarily focuses on building a predictive model that demonstrates the
potential of machine learning in the healthcare sector, specifically in the early diagnosis and
prevention of cardiovascular conditions. It does not include integration with electronic
health record (EHR) systems, real-time monitoring through wearable devices, or any form
of medical intervention.

1.3 OBJECTIVE OF THE PROJECT

In this study, the Heart Disease Prediction Using Machine Learning project aims
to empower healthcare providers and individuals by offering a reliable, intelligent system
capable of predicting the risk of heart disease based on clinical and physiological data. By
integrating core components such as data preprocessing, feature selection, predictive
modeling, and advanced machine learning algorithms into a cohesive and user-friendly
application, the project seeks to streamline the diagnostic process and support timely
medical intervention.
This system is designed to enhance the accuracy and efficiency of heart disease risk
assessment by delivering data-driven predictions, actionable insights, and real-time risk
evaluation. Through the use of predictive models and continual algorithmic learning, it helps
healthcare professionals identify high-risk individuals, prioritize medical care, and
potentially prevent critical cardiac events. Ultimately, the platform bridges the gap between
traditional diagnostic approaches and modern AI-powered healthcare, enabling smarter,
faster, and more proactive heart disease management for clinics, hospitals, and personal
health monitoring.

CHAPTER-2
LITERATURE REVIEW

2.1 TITLE: EXISTING PREDICTION TECHNIQUES
Authors: Dr. Ramesh Iyer
This literature review explores traditional techniques used in predicting heart disease,
including rule-based clinical scoring systems, logistic regression models, and statistical
pattern recognition methods. These conventional techniques rely heavily on expert-
defined thresholds and domain knowledge, limiting their adaptability to new patterns in
health data. The study highlights the limitations of such methods in handling complex,
non-linear relationships and stresses the need for more dynamic, data-driven approaches
like machine learning for accurate prediction and early diagnosis.

2.2 TITLE: MACHINE LEARNING IN HEALTHCARE
Authors: Dr. Priya Nandakumar
This review focuses on the application of machine learning algorithms within the
healthcare domain, especially for disease prediction. It examines supervised learning
models such as Support Vector Machines, Decision Trees, Random Forests, and Neural
Networks. The study emphasizes the ability of these algorithms to uncover hidden patterns
and correlations in medical data, leading to improved diagnostic accuracy and reduced
human error. The integration of ML into healthcare systems is shown to support faster,
more consistent, and scalable decision-making in clinical settings.

2.3 TITLE: FEATURE SELECTION METHODS
Authors: Dr. Sanjay Desai
This literature review highlights key feature selection methods applied in heart disease
prediction tasks. Techniques such as correlation analysis, Recursive Feature Elimination
(RFE), Chi-square testing, and Principal Component Analysis (PCA) are discussed for their
effectiveness in identifying the most influential health attributes. Proper feature selection
improves model performance by reducing overfitting, simplifying models, and increasing
interpretability. The study concludes that optimized feature selection significantly contributes
to building robust and accurate predictive models.

2.4 TITLE: DATASETS USED IN HEART DISEASE PREDICTION
Authors: Dr. Kavita Mehra
This review examines widely used datasets in heart disease prediction research, with a
particular focus on the Cleveland Heart Disease dataset, Statlog (Heart), and the Hungarian
Heart Disease dataset. It assesses these datasets for data quality, feature diversity, class
balance, and clinical relevance. The review also outlines preprocessing requirements, such as
normalization, handling missing values, and categorical encoding, which are critical for
effective training and evaluation of machine learning models.

2.5 TITLE: ML MODELS IN MEDICAL PREDICTION
Authors: Dr. Vivek Sharma
This review investigates machine learning models commonly applied in medical prediction
tasks. Algorithms such as Naive Bayes, k-Nearest Neighbors, Support Vector Machines,
Random Forests, and Deep Neural Networks are evaluated for their suitability in healthcare
diagnostics. The study outlines how each model handles medical data, their predictive
strengths, and challenges like data imbalance and interpretability.

CHAPTER-3
SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

Traditional heart disease diagnosis relies heavily on manual interpretation of patient
data through clinical evaluation and the use of fixed diagnostic criteria. Medical
practitioners assess symptoms and physiological data such as blood pressure, cholesterol
levels, ECG results, and patient history to determine the likelihood of heart disease. These
evaluations often use scoring systems like the Framingham Risk Score or logistic
regression models.
While these systems have been effective in identifying many heart disease cases,
they are limited by their dependence on human expertise and inability to process complex
patterns across large datasets. The growing volume and complexity of medical data
necessitate a more automated, accurate, and scalable approach.

3.1.1 DISADVANTAGES OF EXISTING SYSTEM

1. Limited Pattern Recognition
Traditional diagnostic systems often fail to detect subtle, non-linear patterns within
patient data that might indicate early-stage heart disease.
2. High Risk of Human Error
Medical evaluations based solely on physician interpretation are susceptible to
errors, especially under time constraints or in resource-limited settings.
3. Inflexibility to New Data
Conventional models use static rules and do not adapt based on new insights, trends,
or evolving health conditions.

3.2 PROPOSED SYSTEM

The proposed system introduces a machine learning-based predictive model that uses
patient data to assess the risk of heart disease. This intelligent system is designed to automate
the diagnostic process by learning from historical medical records and identifying complex
patterns that may not be evident through traditional analysis.
By incorporating advanced machine learning algorithms and techniques like data
preprocessing, feature selection, and model optimization, the system offers a faster, more
accurate, and scalable solution for heart disease risk prediction. It supports clinicians by
providing data-driven insights and serves as a decision-support tool for early diagnosis and
treatment planning.

3.2.1 ADVANTAGES OF PROPOSED SYSTEM

1. Improved Accuracy Through Machine Learning
The system uses supervised learning algorithms trained on large, labeled datasets,
enabling it to identify risk factors and predict heart disease with high accuracy.
2. Early Diagnosis and Risk Prediction
By analyzing multiple features simultaneously, the system can detect early signs of
heart disease before symptoms become critical, improving patient outcomes.
3. Adaptability and Continuous Learning
As more patient data is collected, the model can be retrained and improved, making it
adaptive to new trends, treatment methods, and population-specific factors.
4. Scalability and Decision Support
The system can be deployed in a variety of healthcare settings, from small clinics to
large hospitals, and can assist medical professionals by offering real-time,
data-backed predictions, especially in under-resourced areas.

CHAPTER-4
SYSTEM REQUIREMENTS

4.1 HARDWARE REQUIREMENTS

• PROCESSOR: Minimum Dual-Core CPU (Quad-Core or higher
recommended for model training and data processing tasks)
• RAM: Minimum 4 GB
• STORAGE: At least 20 GB of free disk space
• GRAPHICS: Optional GPU (NVIDIA CUDA-compatible) for deep
learning or neural network training acceleration
• NETWORK: Stable broadband internet connection (for downloading
libraries, datasets, and potential cloud model integration)

4.2 SOFTWARE REQUIREMENTS

• OPERATING SYSTEM: Windows 10 or later / Ubuntu Linux
• PROGRAMMING LANGUAGE: Python 3.8 or higher (used for data handling,
model development, training, evaluation, and deployment)
• LIBRARIES & FRAMEWORKS:
  o Scikit-learn: Core machine learning library for training models
    like SVM, Random Forest, etc.
  o Pandas & NumPy: For structured data manipulation, analysis, and
    preprocessing
  o Matplotlib & Seaborn: For visualizing patient data, model
    performance, and metrics
• DEVELOPMENT ENVIRONMENT: Jupyter Notebook, Visual Studio Code
CHAPTER-5

SYSTEM DESIGN AND DEVELOPMENT

5.1 ML ARCHITECTURE OVERVIEW

5.2 SYSTEM ARCHITECTURE

5.3 DATA COLLECTION

Data for heart disease prediction was collected from well-known and publicly
available medical datasets, including the UCI Heart Disease Dataset, Cleveland Heart
Disease Dataset, and additional sources from Kaggle. These datasets consist of
anonymized patient records containing various clinical attributes such as age, sex, chest
pain type, resting blood pressure, cholesterol levels, fasting blood sugar, ECG results,
maximum heart rate, and exercise-induced angina. The data was cleaned and checked for
missing values and inconsistencies to ensure reliability for model training.

5.4 DATA PRE-PROCESSING

Pre-processing involved several critical steps to prepare the data for machine learning
models:
• Handling Missing Values: Missing or null values were imputed using mean,
median, or mode strategies depending on the feature type.
• Encoding Categorical Data: Features such as chest pain type and thalassemia were
encoded using one-hot or label encoding.
• Feature Scaling: Continuous variables were normalized using standardization
(Z-score normalization) to bring all attributes to a similar scale.
• Data Splitting: The dataset was split into training and testing subsets, typically using
an 80:20 or 70:30 ratio, ensuring random and unbiased distribution.
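
The four steps above can be sketched in plain Python. This is a minimal illustration using only the standard library, not the pipeline actually used in the project; the toy records, the column names `chol` and `cp`, and all helper names are assumptions made for the example.

```python
import random
import statistics

# Toy records (hypothetical values); None marks a missing cholesterol reading.
records = [
    {"chol": 233.0, "cp": "typical"},
    {"chol": None, "cp": "asymptomatic"},
    {"chol": 250.0, "cp": "non-anginal"},
    {"chol": 204.0, "cp": "typical"},
    {"chol": 286.0, "cp": "asymptomatic"},
]


def impute_mean(rows, key):
    """Replace missing values of a numeric feature with the observed mean."""
    observed = [r[key] for r in rows if r[key] is not None]
    mean = statistics.fmean(observed)
    return [{**r, key: mean if r[key] is None else r[key]} for r in rows]


def one_hot(rows, key):
    """Replace a categorical feature with one 0/1 column per category."""
    categories = sorted({r[key] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != key}
        for c in categories:
            new_row[f"{key}_{c}"] = 1 if r[key] == c else 0
        encoded.append(new_row)
    return encoded


def split_rows(rows, test_ratio=0.2, seed=42):
    """Shuffle the rows and reserve a fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def standardize(train, test, key):
    """Z-score a feature using the mean and std of the training rows only."""
    values = [r[key] for r in train]
    mean, std = statistics.fmean(values), statistics.pstdev(values)

    def scale(rows):
        return [{**r, key: (r[key] - mean) / std} for r in rows]

    return scale(train), scale(test)


rows = one_hot(impute_mean(records, "chol"), "cp")
train_rows, test_rows = split_rows(rows, test_ratio=0.2)
train_rows, test_rows = standardize(train_rows, test_rows, "chol")
```

The scaling statistics are taken from the training rows and reused for the test rows, so the test set does not influence the transformation. For brevity the sketch imputes before splitting; a stricter setup would compute the imputation mean on the training split as well.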

5.5 TESTING DATASET

The testing dataset, extracted from the main dataset, underwent the same
preprocessing steps as the training data. It was reserved exclusively for evaluating the
generalization performance of the machine learning models. Evaluation metrics such as
accuracy, precision, recall, F1-score, and ROC-AUC were calculated on the test set to
validate the model's ability to correctly predict the presence or absence of heart disease.
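
As an illustration of how the threshold-based metrics named above follow from the confusion-matrix counts, here is a small sketch in plain Python. The label vectors are made-up examples, not results from the project, and the function name is hypothetical. ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In a screening setting, recall (the share of true disease cases that are caught) is usually read alongside accuracy, since a missed case is often costlier than a false alarm.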

5.6 ALGORITHM SELECTION


Multiple machine learning algorithms were experimented with to identify the
best-performing model for predicting heart disease. Algorithms included:
• Logistic Regression
• Random Forest
• Support Vector Machine (SVM)
• K-Nearest Neighbors (KNN)
• Naive Bayes
• Neural Networks (Deep Learning)
The selection of the final algorithm was based on comparative analysis using
evaluation metrics across cross-validation folds.
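
Comparison across cross-validation folds can be sketched as follows. This is a minimal, standard-library illustration; the fold count, the toy data, and the `majority_baseline` model are assumptions for the example, and the folds are taken in order without shuffling or stratification.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit_predict, X, y, k=5):
    """Average accuracy of a fit-and-predict function over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in test_idx]
        y_pred = fit_predict(X_train, y_train, X_test)
        correct = sum(1 for i, p in zip(test_idx, y_pred) if y[i] == p)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def majority_baseline(X_train, y_train, X_test):
    """Hypothetical baseline: always predict the most common training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * len(X_test)


# Toy data: ten samples with a single feature each.
X = [[v] for v in range(10)]
y = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
baseline_score = cross_val_accuracy(majority_baseline, X, y, k=5)
```

Any candidate model can be passed in place of `majority_baseline` as long as it follows the same `(X_train, y_train, X_test) -> predictions` shape, which keeps the comparison between algorithms on identical folds.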

5.6.1 ALGORITHM USED


The study explored several machine learning algorithms to build a robust and accurate
heart disease prediction model. The chosen models were evaluated for classification
accuracy, interpretability, and computational efficiency. The key algorithms included:
• Logistic Regression
• Random Forest
• Support Vector Machine (SVM)
• Naive Bayes
• Neural Networks
These models were selected due to their effectiveness in handling medical datasets and
their established roles in binary classification tasks.
5.6.2 NAIVE BAYES CLASSIFIER
Naive Bayes, based on Bayes' Theorem, assumes conditional independence between
features. Despite its simplicity, it proved effective in baseline modeling for heart disease
prediction.
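
Under the independence assumption, the posterior for a class is proportional to the class prior times the product of the per-feature likelihoods. The sketch below shows this for continuous features with a Gaussian likelihood. The report does not state which Naive Bayes variant was used, so the Gaussian choice, the toy data, and the function names are assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior and the per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        # log P(class) + sum of log Gaussian likelihoods, one per feature
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (hypothetical): [age, resting blood pressure]; 1 = disease.
X = [[45, 120], [50, 125], [48, 118], [63, 150], [67, 160], [70, 155]]
y = [0, 0, 0, 1, 1, 1]
nb_model = fit_gaussian_nb(X, y)
```

Working in log space avoids numerical underflow when many small likelihoods are multiplied together.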

5.6.3 RANDOM FOREST


Random Forest is an ensemble algorithm that builds multiple decision trees and
outputs the mode of their predictions. Its ability to handle both categorical and numerical
data makes it ideal for heart disease datasets.
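
The "mode of their predictions" idea can be illustrated with a deliberately simplified forest. In this sketch each tree is a one-split decision stump trained on a bootstrap sample, which is an assumption made to keep the example short; a full Random Forest grows deeper trees and also samples a random subset of features at each split. The toy data and function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for threshold in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= threshold]
            right = [label for row, label in zip(X, y) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(l != left_label for l in left) + sum(l != right_label for l in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    """Predict with one stump: compare a single feature against its threshold."""
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label


def fit_forest(X, y, n_trees=15, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Return the mode (majority vote) of the individual tree predictions."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): [age, resting blood pressure]; 1 = disease.
X = [[45, 120], [50, 125], [48, 118], [63, 150], [67, 160], [70, 155]]
y = [0, 0, 0, 1, 1, 1]
forest = fit_forest(X, y)
```

Using an odd number of trees avoids tied votes in a two-class problem.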

5.6.4 SUPPORT VECTOR MACHINE (SVM)


SVM is a powerful classifier that constructs an optimal hyperplane for separating
classes. For heart disease prediction, SVM showed high precision and recall, especially
after feature scaling.
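
To make the hyperplane idea concrete, the sketch below trains a linear SVM by batch subgradient descent on the regularized hinge loss. The report does not specify the kernel or solver that was used, so the linear form, the hyperparameters `lam`, `lr`, and `epochs`, and the toy data are all assumptions for illustration.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Fit w and b by batch subgradient descent on the regularized hinge loss.

    Labels must be -1 or +1. The objective is
    lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for row, label in zip(X, y):
            margin = label * (sum(wj * xj for wj, xj in zip(w, row)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= label * row[j] / n
                grad_b -= label / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_svm(w, b, row):
    """Classify by the side of the hyperplane w.x + b = 0 the point falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1


# Toy standardized features (hypothetical); +1 = disease, -1 = no disease.
X = [[-2.0, -1.0], [-1.0, -2.0], [-1.5, -1.5], [2.0, 1.0], [1.0, 2.0], [1.5, 1.5]]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
```

Because the margin depends on dot products with the raw feature values, features on very different scales would dominate the objective, which is consistent with the observation that scaling helped.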

5.6.5 NEURAL NETWORKS


Neural networks, particularly multi-layer perceptrons (MLPs), were trained to learn
complex patterns in the data. While they required more computational resources, neural
networks were able to capture nonlinear relationships between medical features and heart
disease risk, offering the highest accuracy among all models tested.

5.6.6 MODEL PERFORMANCE COMPARISON

Each model was evaluated on the same test dataset using standard classification metrics:
• Naive Bayes: Quick and interpretable but slightly lower accuracy.
• Random Forest: Provided strong accuracy and good interpretability via feature importance.
CHAPTER-6
UML DIAGRAMS

6.1 CLASS DIAGRAM

A class diagram in the Unified Modeling Language (UML) is a type of static
structure diagram that describes the structure of a system by showing the system's classes,
their attributes, operations (or methods), and the relationships among objects. It provides
a basic notation for other structure diagrams prescribed by UML. It is helpful for
developers and other team members too.

Figure 6.1 Class Diagram

6.2 USE CASE DIAGRAM
A use case diagram is a diagram that shows a set of use cases and actors and
their relationships. A use case diagram is just a special kind of diagram and shares the
same common properties as all other diagrams, i.e., a name and graphical contents
that are a projection into a model. What distinguishes a use case diagram from all other
kinds of diagrams is its particular content.

Figure 6.2 Use Case Diagram

6.3 ACTIVITY DIAGRAM
An activity diagram shows the flow from activity to activity. An activity is an
ongoing non-atomic execution within a state machine. An activity diagram is basically a
projection of the elements found in an activity graph, a special case of a state machine in
which all or most states are activity states and in which all or most transitions are triggered
by completion of activities in the source.

Figure 6.3 Activity Diagram

CHAPTER-7

PERFORMANCE ANALYSIS

7.1 ABOUT THE DATASET


The dataset was sourced from public repositories like the UCI Heart Disease Dataset. It
includes clinical features such as age, chest pain type, cholesterol, etc., with a binary target
indicating presence or absence of heart disease. Data preprocessing involved missing value
handling, encoding, scaling, and balancing using SMOTE.

Class               Count
Heart Disease       526
No Heart Disease    499
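
The balancing step mentioned above relies on SMOTE, which creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below shows only that core interpolation idea in plain Python; it is not the implementation used in the project, and the function name, the neighbor count `k`, and the toy points are assumptions. In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package would typically be used.

```python
import math
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (excluding itself)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority-class points (hypothetical, already scaled).
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.0]]
new_points = smote_samples(minority, n_new=5)
```

Oversampling of this kind is normally applied to the training split only, so that the test set still reflects the original class distribution.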

7.2 ACCURACY COMPARISON OF ALGORITHMS


Several ML algorithms were tested and compared based on accuracy, precision, recall, and
F1-score:

Algorithm              Accuracy (%)
Naive Bayes            84.7
Logistic Regression    87.9
Random Forest          90.5
SVM                    88.3
Neural Network         92.1

Observations:
• Neural Networks gave the highest accuracy.
• Random Forest balanced accuracy and interpretability.
• Naive Bayes was fast but less accurate.

7.3 METHODOLOGY

The heart disease prediction system was developed using the following systematic approach:
1. Data Collection & Preprocessing
  o Collected labeled patient data from public datasets.
  o Cleaned data and handled missing values.
  o Encoded categorical variables and normalized features.
  o Addressed class imbalance using SMOTE.
2. Model Training
  o Split the dataset into 70% training and 30% testing sets.
  o Trained multiple machine learning models including Logistic Regression,
    Naive Bayes, Random Forest, SVM, and Neural Networks.
  o Performed cross-validation to ensure generalization.
3. Evaluation Metrics
  o Measured model performance using Confusion Matrix, Accuracy, Precision,
    Recall, F1-Score, and ROC-AUC.
4. Deployment (Optional)
  o The best-performing model can be deployed as a web application using
    Streamlit or Flask to allow real-time risk assessment.

Conclusion:
• Best Performing Model: Neural Network with 92.1% accuracy, offering robust
predictions for complex feature interactions.
• Best Trade-off Model: Random Forest with 90.5% accuracy and high
interpretability.
• Future Improvements: Integration of deep learning models such as CNN or
transformer-based architectures, and incorporation of real-time clinical data for
dynamic risk prediction.

CHAPTER-8
CONCLUSION
The primary objective of this project was to develop an intelligent and automated system
capable of predicting the likelihood of heart disease using machine learning techniques. By
leveraging clinical data and applying supervised learning algorithms, the system provides a
reliable tool to support early diagnosis and preventive healthcare.
The integration of data preprocessing techniques, relevant feature selection, and
advanced machine learning models—such as Logistic Regression, Random Forest, Support
Vector Machine (SVM), and Neural Networks—enabled the system to deliver high accuracy
in identifying potential heart disease cases. Neural Networks showed the best performance in
terms of accuracy, while Random Forest offered a good balance between precision and
interpretability.
This project demonstrates the potential of machine learning in the medical field,
particularly for early diagnosis and decision support. The ability to analyze patient data and
provide accurate predictions can assist healthcare professionals in making timely and informed
decisions, ultimately improving patient outcomes and reducing the risk of complications.
In conclusion, the machine learning-based heart disease prediction system offers a
scalable, efficient, and user-friendly solution for medical diagnosis. With continuous training,
access to updated medical data, and integration into clinical environments, this system can
significantly enhance preventive healthcare and contribute to better cardiovascular disease
management.

CHAPTER-9
FUTURE ENHANCEMENT
The current implementation of the Heart Disease Prediction system demonstrates the
practical use of machine learning in medical diagnosis. However, to further enhance its
accuracy, usability, and impact in clinical settings, several future enhancements can be
considered:
1. Integration of Advanced Deep Learning Models
Future versions can incorporate deep learning architectures such as Recurrent Neural
Networks (RNNs), Long Short-Term Memory (LSTM), and transformer-based
models (like BERT for clinical text) to capture complex patterns and temporal health
data trends more effectively.
2. Real-Time Risk Monitoring
Enabling real-time heart disease risk prediction using continuous health monitoring
data (e.g., from wearable devices) can facilitate early intervention and personalized
care.
3. Incorporation of Electronic Health Records (EHRs)
Integrating patient history from EHRs, including medications, previous diagnoses,
and family history, can improve prediction accuracy and model comprehensiveness.
4. Mobile and Web-Based Health Applications
Developing mobile and web apps will allow easy access to heart disease risk
assessments, enabling users and healthcare providers to interact with the model
remotely and conveniently.
5. Adaptive Learning and Model Updates
Building a self-improving system that incorporates new medical data and user
feedback can help maintain accuracy and adapt to evolving clinical practices.

CHAPTER-10
APPENDIX

10.1 SOURCE CODE

app.py

import pickle

import pandas as pd
import streamlit as st

# To run this code, use the following command in your terminal:
#   streamlit run app.py

model_filename = './model/model.pkl'

# Load the trained classifier saved during model training.
with open(model_filename, 'rb') as file:
    model = pickle.load(file)


def main():
    st.title('Heart Disease Prediction')

    # Collect the 13 clinical features; categorical choices are encoded by list index.
    age = st.slider('Age', 18, 100, 50)
    sex_options = ['Male', 'Female']
    sex = st.selectbox('Sex', sex_options)
    sex_num = 1 if sex == 'Male' else 0
    cp_options = ['Typical Angina', 'Atypical Angina', 'Non-anginal Pain', 'Asymptomatic']
    cp = st.selectbox('Chest Pain Type', cp_options)
    cp_num = cp_options.index(cp)
    trestbps = st.slider('Resting Blood Pressure', 90, 200, 120)
    chol = st.slider('Cholesterol', 100, 600, 250)
    fbs_options = ['False', 'True']
    fbs = st.selectbox('Fasting Blood Sugar > 120 mg/dl', fbs_options)
    fbs_num = fbs_options.index(fbs)
    restecg_options = ['Normal', 'ST-T Abnormality', 'Left Ventricular Hypertrophy']
    restecg = st.selectbox('Resting Electrocardiographic Results', restecg_options)
    restecg_num = restecg_options.index(restecg)
    thalach = st.slider('Maximum Heart Rate Achieved', 70, 220, 150)
    exang_options = ['No', 'Yes']
    exang = st.selectbox('Exercise Induced Angina', exang_options)
    exang_num = exang_options.index(exang)
    oldpeak = st.slider('ST Depression Induced by Exercise Relative to Rest', 0.0, 6.2, 1.0)
    slope_options = ['Upsloping', 'Flat', 'Downsloping']
    slope = st.selectbox('Slope of the Peak Exercise ST Segment', slope_options)
    slope_num = slope_options.index(slope)
    ca = st.slider('Number of Major Vessels Colored by Fluoroscopy', 0, 4, 1)
    thal_options = ['Normal', 'Fixed Defect', 'Reversible Defect']
    thal = st.selectbox('Thalassemia', thal_options)
    thal_num = thal_options.index(thal)

    # Load the per-feature mean and standard deviation saved at training time.
    with open('model/mean_std_values.pkl', 'rb') as f:
        mean_std_values = pickle.load(f)

    if st.button('Predict'):
        user_input = pd.DataFrame(data={
            'age': [age],
            'sex': [sex_num],
            'cp': [cp_num],
            'trestbps': [trestbps],
            'chol': [chol],
            'fbs': [fbs_num],
            'restecg': [restecg_num],
            'thalach': [thalach],
            'exang': [exang_num],
            'oldpeak': [oldpeak],
            'slope': [slope_num],
            'ca': [ca],
            'thal': [thal_num]
        })
        # Apply saved transformation to new data
        user_input = (user_input - mean_std_values['mean']) / mean_std_values['std']
        prediction = model.predict(user_input)
        prediction_proba = model.predict_proba(user_input)

        if prediction[0] == 1:
            bg_color = 'red'
            prediction_result = 'Positive'
        else:
            bg_color = 'green'
            prediction_result = 'Negative'

        # Report the probability of whichever class was predicted.
        confidence = prediction_proba[0][1] if prediction[0] == 1 else prediction_proba[0][0]

        # Show the result; the confidence is truncated to two decimal places.
        st.markdown(
            f"<p style='background-color:{bg_color}; color:white; padding:10px;'>"
            f"Prediction: {prediction_result}<br>"
            f"Confidence: {((confidence*10000)//1)/100}%</p>",
            unsafe_allow_html=True,
        )


if __name__ == '__main__':
    main()

10.2 OUTPUT SCREENS

CHAPTER-11
BIBLIOGRAPHY

REFERENCES

1. P. K. Anooj, "Clinical decision support system: Risk level prediction of heart disease
using weighted fuzzy rules", Journal of King Saud University – Computer and
Information Sciences (2012) 24, 27–40.

2. Nidhi Bhatla, Kiran Jyoti, "An Analysis of Heart Disease Prediction using Different
Data Mining Techniques", International Journal of Engineering Research &
Technology.

3. Jyoti Soni, Ujma Ansari, Dipesh Sharma, Sunita Soni, "Predictive Data Mining for
Medical Diagnosis: An Overview of Heart Disease Prediction".

4. Chaitrali S. Dangare, Sulabha S. Apte, "Improved Study of Heart Disease Prediction
System using Data Mining Classification Techniques", International Journal of
Computer Applications (0975 – 888).

5. Dane Bertram, Amy Voida, Saul Greenberg, Robert Walker, "Communication,
Collaboration, and Bugs: The Social Nature of Issue Tracking in Small, Collocated
Teams".

6. M. Anbarasi, E. Anupriya, N.Ch.S.N.Iyengar, "Enhanced Prediction of Heart Disease
with Feature Subset Selection using Genetic Algorithm", International Journal of
Engineering Science and Technology, Vol. 2(10), 2010.

7. Ankita Dewan, Meghna Sharma, "Prediction of Heart Disease Using a Hybrid
Technique in Data Mining Classification", 2nd International Conference on Computing
for Sustainable Global Development, IEEE, 2015, pp. 704-706.

8. R. Alizadehsani, J. Habibi, B. Bahadorian, H. Mashayekhi, A. Ghandeharioun, R.
Boghrati, et al., "Diagnosis of coronary arteries stenosis using data mining", J Med
Signals Sens, vol. 2, pp. 153-9, Jul 2012.

9. M Akhil Jabbar, BL Deekshatulu, Priti Chandra, "Heart disease classification using
nearest neighbor classifier with feature subset selection", Anale. Seria Informatica, 11,
2013.

10. Shadab Adam Pattekari and Asma Parveen, "Prediction System for Heart
Disease Using Naive Bayes", International Journal of Advanced Computer and
Mathematical Sciences, ISSN 2230-9624, Vol 3, Issue 3, 2012, pp 290-294.
