Disease Prediction using Medical Data

Uttam Mandiwal (22BCS10399), Department of Computer Science and Engineering, Chandigarh University, India, uttammandiwal@gmail.com
Uday Mandiwal (22BCS10407), Department of Computer Science and Engineering, Chandigarh University, India, mandiwaluday@gmail.com
Vivek Poonia (22BCS10478), Department of Computer Science and Engineering, Chandigarh University, India, vivekpoonia2005@gmail.com
Mohit (22BCS12528), Department of Computer Science and Engineering, Chandigarh University, India, sheorankunnu@gmail.com
Ayush Shastri (22BCS10634), Department of Computer Science and Engineering, Chandigarh University, India, shastriayush262003@gmail.com

Abstract—Much of the world's population struggles with disease because people do not know which illness they are suffering from. Some diseases can be cured at an early stage by the patients themselves, but only if they are aware of the disease. The proposed system applies machine learning algorithms to predict the onset, progression, and outcomes of various diseases from comprehensive medical datasets. We test the modified prediction models on collected real-life medical data. The research focuses on utilizing diverse types of medical data, including demographic information, clinical history, laboratory results, imaging data, and genetic markers, to develop accurate predictive models. The study also highlights real-world applications of disease prediction models in clinical practice, such as early detection of chronic diseases, personalized treatment planning, and healthcare resource allocation. Moreover, we examine the potential impact of integrating predictive analytics into healthcare systems for improving patient outcomes and reducing healthcare costs.

Keywords- Disease Prediction, Medical Data, Machine Learning

I. INTRODUCTION

Globally, health is influenced by various factors, including infectious and non-communicable diseases, healthcare disparities, and socio-economic status. In India, challenges include infectious diseases and rising non-communicable diseases due to lifestyle factors. Access to healthcare is hindered by financial constraints and rural disparities. Many individuals face health problems with unknown causes, leading to delayed treatment due to hesitancy and lack of insurance. Conducting surveys can justify the need for disease prediction from medical data by capturing stakeholder perspectives on illness prognosis, disease prevalence, effects of delayed diagnosis, current predictive model effectiveness, data integration challenges, privacy concerns, resource limitations, and multidisciplinary cooperation.

People should put their health first, consulting a doctor about troubling symptoms and undergoing routine examinations to discover illnesses early. Data integrity, heterogeneity, the adaptive nature of diseases, privacy and ethics, resource constraints, verification, and generalization are some of the difficulties in predicting disease from medical data. Conventional disease risk models typically make predictions with a machine learning method (such as logistic regression or other regression analysis), more specifically a supervised learning algorithm that is trained on labeled data. With the advancement of deep learning technology, disease prediction has received more attention from the standpoint of big data analysis. Numerous studies automatically select features from large amounts of medical data in order to increase the accuracy of disease prediction.

In addition, diseases vary significantly by region, mostly because of differences in climate and lifestyle. Therefore, the following difficulties with disease prediction based on medical data analysis remain: How should the medical data be handled? How should the predominant chronic illnesses and disease features within a given region be identified? How can the disease be analyzed and a better model be built using deep learning technology?

To address these issues, we integrate structured and unstructured data in the healthcare domain to evaluate the likelihood of illness. First, we identify the most common chronic illnesses in the area by applying statistical knowledge. Second, we work with hospital specialists to extract valuable information from structured medical data. We use the CNN algorithm to automatically select features from unstructured medical data. Lastly, we propose a novel CNN-based multimodal disease prediction (CNN-MDRP) algorithm that uses medical data.

II. PROS & CONS

Pros:

1. Early Detection: Disease prediction allows for early detection of potential health issues before symptoms manifest. This enables timely intervention and treatment, potentially preventing the progression of the disease.

2. Improved Patient Outcomes: Early detection often leads to better patient outcomes, as treatment can begin sooner, reducing the severity of the illness and its associated complications.

Cons:

1. Privacy Concerns: Disease prediction often relies on access to personal health data, raising concerns about patient privacy and data security. Unauthorized access or misuse of this data could lead to breaches of privacy and confidentiality.

2. Accuracy and Reliability: The accuracy and reliability of predictive models depend on the quality and quantity of the data used for training. Incomplete or biased datasets may lead to inaccurate predictions and unreliable results.
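The accuracy and reliability concern listed under Cons (incomplete or biased datasets) can be checked before any model is trained. The following is a minimal sketch rather than the authors' procedure: the function name `audit_dataset`, the three column meanings, and the toy records are all hypothetical and serve only to show the idea with NumPy.

```python
import numpy as np

def audit_dataset(X, y):
    """Return the missing-value rate per feature and the share of each class label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Fraction of NaN entries in each column (feature).
    missing_rate = np.isnan(X).mean(axis=0)
    # Relative frequency of each label, to expose class imbalance.
    labels, counts = np.unique(y, return_counts=True)
    class_share = {label.item(): float(count / len(y))
                   for label, count in zip(labels, counts)}
    return missing_rate, class_share

# Hypothetical toy records; columns stand for age, blood pressure, glucose.
X = [[54, 130, np.nan],
     [61, np.nan, 110],
     [47, 120, 95],
     [39, 118, np.nan]]
y = [1, 0, 0, 0]  # hypothetical labels: 1 = disease present, 0 = absent

missing_rate, class_share = audit_dataset(X, y)
print("missing rate per feature:", missing_rate)
print("class share:", class_share)
```

A feature with a high missing rate, or a label that covers only a small share of the records, is a sign that reported accuracy should be read with caution.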
III. PROPOSED WORK

A. Collection of medical data:

To compare against the data entered by the user, medical data must be stored in the working model beforehand so that it can be cross-checked and used to predict the disease. The dataset for this project was gathered from a study by Columbia University conducted at New York Presbyterian Hospital during 2004.

B. In our proposed system we used several techniques to develop the prediction model:

1. Decision tree- Decision trees are a highly useful and adaptable classification method, applied to image classification and pattern recognition. Because of their versatility, they are used for classification in very complex problems and can also handle high-dimensional problems. A tree has three main components: the root, the internal nodes, and the leaves. The root and internal nodes test attribute values, and each leaf holds the class label that the tree outputs.

2. Random Forest- This is a supervised learning algorithm used for both classification and regression. It works in four basic steps:

• It chooses random data samples from the dataset.
• It constructs a decision tree for every sample chosen.
• The prediction of every tree is collected and voted on.
• Finally, the most-voted prediction is selected and presented as the classification result.

In this project we used a random forest classifier with 100 random samples, and the resulting accuracy is ~95%.

3. K Nearest Neighbour- This is a basic but essential supervised learning algorithm that finds extensive use in pattern recognition and data mining. It classifies a new record by finding the k most similar records in the training data and assigning the most common label among them. Our dataset was classified by K Nearest Neighbour, and we attained an accuracy of approximately 92%.

4. Naïve Bayes- This is a family of algorithms based on Bayes' theorem. They share a common principle: every pair of features is assumed to be independent of each other, given the class. The method also assumes that each feature makes an independent and equal contribution to the prediction. In our project we used the naïve Bayes algorithm to obtain a prediction accuracy of ~95%.

C. GUI

The GUI made for this project is a simple tkinter GUI consisting of labels, a messagebox, a button, a text widget, a title, and an option menu.

Fig II.1: GUI home interface

Labels are further used for the different sections.

Fig II.2: Options
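The tkinter GUI described in subsection C (title, labels, option menus, a button, and a messagebox) could be laid out roughly as follows. This is a sketch under stated assumptions, not the project's actual code: the symptom list, the number of option menus, the function names, and the rule inside `predict_disease` are hypothetical placeholders for the trained classifier.

```python
# Hypothetical symptom list; the paper does not enumerate the real options.
SYMPTOMS = ["fever", "cough", "headache", "fatigue"]

def predict_disease(selected):
    """Placeholder for the trained model: maps chosen symptoms to a label."""
    chosen = {s for s in selected if s in SYMPTOMS}
    if {"fever", "cough"} <= chosen:
        return "condition A (placeholder)"
    return "no match (placeholder)"

def build_gui():
    """Build the window: a title, labels, three option menus, and a button."""
    # Imported here so predict_disease stays usable where Tk is not installed.
    import tkinter as tk
    from tkinter import messagebox

    root = tk.Tk()
    root.title("Disease Prediction")
    tk.Label(root, text="Select up to three symptoms").grid(
        row=0, column=0, columnspan=2)

    choices = []
    for i in range(3):
        var = tk.StringVar(root, value="none")
        tk.Label(root, text=f"Symptom {i + 1}").grid(row=i + 1, column=0)
        tk.OptionMenu(root, var, "none", *SYMPTOMS).grid(row=i + 1, column=1)
        choices.append(var)

    def on_predict():
        # Read the option menus and show the result in a messagebox.
        result = predict_disease(v.get() for v in choices)
        messagebox.showinfo("Prediction", result)

    tk.Button(root, text="Predict", command=on_predict).grid(
        row=4, column=0, columnspan=2)
    return root
```

Calling `build_gui().mainloop()` on a machine with a display would open the window. Keeping the prediction logic in its own function lets it be exercised without starting the GUI.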
IV. LIBRARY

mpl_toolkits.mplot3d is a Python library used in disease prediction from medical data for:

a. The objective of the mpl_toolkits.mplot3d library is to facilitate the use of the Matplotlib library to create 3D plots. It has classes and functions made especially for creating wireframes, surface plots, scatter plots, and other three-dimensional visualizations. With this toolkit, Matplotlib can handle 3D data visualization tasks.

b. Overall, by offering strong tools for analyzing models, visualizing multidimensional data, and understanding intricate relationships within the data, the mpl_toolkits.mplot3d library can be very helpful in the prediction of disease from medical data.

NumPy (imported as np) is a Python library that provides a large variety of mathematical operations on arrays. These operations can be used to process medical data mathematically, perform statistical calculations, aggregate data, and more.

sklearn.preprocessing is a Python library used for:

a. StandardScaler from sklearn.preprocessing standardizes features by removing the mean and scaling to unit variance. This preprocessing step is essential to many machine learning algorithms because it ensures that every feature contributes equally to model fitting and improves the numerical stability of the optimization methods.

b. All things considered, applying StandardScaler to disease prediction from medical data helps ensure that machine learning models are robust, interpretable, and able to extract valuable patterns from the data without being influenced by feature scale.

tkinter is a Python library used for:

a. Users can create graphical user interfaces (GUIs) with the tkinter module. It offers a collection of widgets and tools that let programmers create windows, menus, buttons, and other GUI components for their Python apps. Tkinter is a popular option for creating basic Python GUIs because it makes it simple to create interactive desktop apps.

b. By using tkinter to develop a GUI for disease prediction from medical data, developers can create an intuitive and interactive interface that facilitates data input, model selection, and result visualization, enhancing usability.

V. SERVICEABILITY

Serviceability refers to the capacity of the disease prediction system to be effectively maintained, updated, and supported over the course of its lifecycle. The following are some elements that affect the serviceability of disease prediction from medical data:

Users can better grasp the features and functionalities of an application by having access to thorough documentation, user manuals, and tutorials. Furthermore, responsive support channels such as email or a dedicated helpdesk ensure that users can get help when they need it.

Fixing bugs and releasing updates frequently is essential for preserving the functionality and dependability of the application. Promptly releasing updates that contain bug fixes, security patches, and feature upgrades shows a dedication to serviceability.

VI. CONCLUSION

In summary, disease prediction from medical data offers a precise and effective way to assess disease. Automated prediction greatly reduces the time and effort needed for manual evaluation. The application also reduces human error, delivering dependable and consistent results. Overall, the system provides a trustworthy and efficient way to predict disease from medical data.

VII. LIMITATION

Our disease prediction model leveraging medical data faces several inherent limitations. Challenges often arise from the quality and availability of medical data, as incomplete or biased datasets can compromise the model's accuracy. Additionally, overfitting poses a risk, where the model performs well on training data but struggles to generalize to new instances. Data privacy and security concerns are paramount, requiring stringent measures to safeguard patient information. Moreover, temporal dynamics and the complexity of diseases add further difficulty, while ethical considerations regarding biases and interpretability underscore the need for careful development and implementation of such models. Addressing these limitations demands robust data governance, continuous model refinement, and adherence to ethical guidelines to ensure both effectiveness and trustworthiness in clinical practice.
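The overfitting risk noted under Limitation can be checked by comparing training accuracy with accuracy on held-out data. The following is a minimal sketch, not the authors' code: a synthetic dataset from scikit-learn's `make_classification` stands in for the medical data, and the split ratio, k = 5, and the random seeds are assumptions. The four models mirror those named in Section III, with 100 trees for the random forest, and k-nearest neighbours is paired with StandardScaler as described in Section IV.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the medical dataset (assumption, not the real data).
X, y = make_classification(n_samples=600, n_features=20,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Distance-based KNN is sensitive to feature scale, so standardize first.
    "k-nearest neighbours": make_pipeline(StandardScaler(),
                                          KNeighborsClassifier(n_neighbors=5)),
    "naive Bayes": GaussianNB(),
}

gaps = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)   # accuracy on seen data
    test_acc = model.score(X_test, y_test)      # accuracy on unseen data
    gaps[name] = train_acc - test_acc
    print(f"{name:22s} train={train_acc:.3f} test={test_acc:.3f} "
          f"gap={gaps[name]:.3f}")
```

A large gap between training and held-out accuracy suggests overfitting. Cross-validation, for example with `sklearn.model_selection.cross_val_score`, would give a steadier estimate than a single split.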
VIII. FUTURE SCOPE

Disease prediction from medical data is set to transform healthcare in the coming years by providing timely, individualized interventions and enhancing patient outcomes. The potential for early disease identification, prevention, and management will increase dramatically as predictive models grow more sophisticated and the available data sources more diverse. This will further our understanding of disease causes and risk factors in addition to improving patient care.

Furthermore, in order to translate research findings into scalable, user-friendly solutions that can be integrated into the current healthcare infrastructure, industry stakeholders must work together. Together, researchers, physicians, legislators, and business partners can overcome obstacles, resolve ethical dilemmas, and fully utilize medical data to anticipate diseases, revolutionizing healthcare delivery and enhancing patient outcomes for years to come.

