Ijarcce 2020 9712
Abstract: Diabetes (diabetes mellitus) is a group of metabolic disorders that affects millions of people. Detecting
diabetes is of great significance because of its serious complications. Many studies have addressed the diagnosis of
diabetes, and most of them are based on one particular data set, the Pima Indian diabetes data set. This data set comes
from a study of women of the Pima Indian population, a Native American community near Phoenix, Arizona, that began in
1965; the onset rate of diabetes in this population is relatively high. Most prior studies focused primarily on one or
two specialized, complex techniques, while a comprehensive study of several general techniques is missing. In this
work, we explore the most popular machine learning techniques (e.g. the KNN algorithm) used to identify diabetes,
together with data pre-processing methods. We evaluate the technique by its cross-validation accuracy on the data set
from the UCI ML repository.
I. INTRODUCTION
The direct sign of diabetes is high blood sugar, with symptoms including increased thirst, increased hunger, weight
loss, and frequent urination. Diabetes is diagnosed when the 2-hour post-load plasma glucose is at least 200 mg/dL, and
various studies on the diagnosis of diabetes stress the need for timely identification. Diabetic patients usually
require continuous treatment; otherwise the disease can lead to dangerous, life-threatening consequences. Detecting
diabetes early and quickly plays a significant part in treating it. The proposed system uses machine learning
techniques for diabetes detection. It is a medical application intended to help patients and diabetes doctors identify
diabetes. The system automates the detection of diabetes by using data from previous diabetes patients.
II. LITERATURE SURVEY
Md. Faisal Faruque, Asaduzzaman, and Iqbal H. Sarker discussed diabetes as one of the most common disorders of the
human body, caused by metabolic dysfunction. They used several important ML algorithms, namely Support Vector Machine
(SVM), Naive Bayes (NB), KNN, and Decision Tree (DT), to predict diabetes [1].
Sidong Wei, Xuejiao Zhao, and Chunyan Miao described diabetes as a disorder in which the glucose level in the body is
high. In their paper they use popular methods such as SVM and deep neural networks to identify the disease, along with
data pre-processing techniques [2].
According to Lakshmi K. S. and G. Santhosh Kumar, hospital databases serve as a rich source of information for
effective medical diagnosis. They used NLP tools combined with data mining algorithms to extract rules [3].
Jian-Xun Chen, Shih-Li Su, and Che-Ha Chang discussed an ontology that generates primary care plans for medical
professionals. The results of their paper show that the model can provide personalized diabetes mellitus care planning
efficiently [4].
M. M. Alotaibi, R. S. H. Istepanian, and A. Sungoor presented an intelligent mobile diabetes management and education
system for patients with diabetes. The system is able to store clinical information about the patient's diabetes, such
as blood sugar level and blood pressure measurements and hypoglycaemia events [5].
Berina Alic, Lejla Gurbeta, and Almir Badnjevic presented an overview of machine learning techniques for the
classification of diabetes and cardiovascular diseases using Bayesian networks (BNs) and artificial neural networks
(ANNs) [6].
M. Durgadevi and Dr. R. Kalpana note that, given the risks of the disease, many classification and detection algorithms
have been developed in the domain of diabetes mellitus (DM). The aim of their paper is to compare the performance of
five classification approaches for diabetes prediction: Ant-Miner, AdaBoost, RBF network, CN2, and Bagging [7].
Elliot B. Sloane, Nilmini Wickramasinghe, and Steve Goldberg presented a wireless diabetes monitoring solution: a
low-cost, innovative, cloud-based coaching platform for diabetes management [8].
Minyechil Alehegn and Rahul Joshi presented ML techniques, implemented with the NB and KNN algorithms, that help
identify diabetes from a data set at an early stage and so help save lives [9].
V. Umatejaswi and P. Suresh Kumar discussed algorithms such as SVM, NB, and DT for identifying diabetes mellitus using
data mining techniques [10].
III. METHODOLOGY
The proposed method uses the KNN algorithm to classify and predict diabetes from training data. The proposed system
also predicts the time of onset of diabetes.
Figure 1: Methodology
Data Constraints
The data is collected from a global data set. In this system, the Pima Indian data set is used to train the model. The
data set contains 21 parameters and around 1000 records. The features (parameters) of the data set include:
• Age
• Gender
• Relation
• DOB
• Sugar tested value
• Symptoms
• Family history etc.
These data are used to train the model for the prediction of diabetes.
Testing data is the input given to the software in order to test it. It shows how the data affects the execution of the
specified module.
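The separation into training and testing data described above can be sketched as follows. This is only a minimal illustration: the function name `train_test_split`, the 80/20 ratio, the fixed seed, and the toy records are assumptions made for the example and are not taken from the proposed system.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle the records and split them into training and testing lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records: (sugar_tested_value, age, label)
records = [(90 + i, 20 + i % 40, "Yes" if i % 3 == 0 else "No") for i in range(10)]
train, test = train_test_split(records, test_fraction=0.2)
print(len(train), "training records,", len(test), "testing records")
```

Keeping the testing records out of training is what allows the later accuracy figure to reflect performance on unseen patients.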
Pre-processing of data
Data preprocessing is the process of converting raw data into a clean data set. In this step the data is transformed or
encoded into a state that the machine can parse easily. The major tasks of data preprocessing in the learning process
are removing unwanted data and filling in missing values, so that the model can be trained easily.
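As an illustration of the preprocessing step, the sketch below fills missing values with the column mean and then rescales each column to the range [0, 1]. Mean imputation and min-max scaling are assumed choices for this example; the paper does not state which methods the system uses, and the column layout shown is hypothetical.

```python
def fill_missing_with_mean(rows):
    """Replace None entries in each column with the mean of the known values."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        known = [row[c] for row in rows if row[c] is not None]
        means.append(sum(known) / len(known) if known else 0.0)
    return [[means[c] if row[c] is None else row[c] for c in range(n_cols)]
            for row in rows]

def min_max_normalize(rows):
    """Rescale every column to the range [0, 1]; constant columns become 0."""
    n_cols = len(rows[0])
    lows = [min(row[c] for row in rows) for c in range(n_cols)]
    highs = [max(row[c] for row in rows) for c in range(n_cols)]
    return [
        [(row[c] - lows[c]) / (highs[c] - lows[c]) if highs[c] > lows[c] else 0.0
         for c in range(n_cols)]
        for row in rows
    ]

# Hypothetical raw rows: [sugar_tested_value, age]; None marks a missing value.
raw = [[100.0, 30.0], [None, 50.0], [200.0, None]]
filled = fill_missing_with_mean(raw)
clean = min_max_normalize(filled)
print(clean)
```

Rescaling matters for a distance-based method, because otherwise a feature with a large numeric range would dominate the distance.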
Feature Extraction
Feature extraction is the method used to transform the input data into a set of features. These features capture the
characteristics of the input patterns that help differentiate between the classes of patterns. The method reduces the
amount of resources required to describe a large set of data; in this sense feature extraction is an attribute
reduction process. It is also used to increase the speed and effectiveness of supervised learning.
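A small sketch of what feature extraction could look like for the parameters listed earlier: a raw patient record is reduced to a short numeric vector. The field names (`dob`, `gender`, `sugar_tested_value`, `symptoms`, `family_history`) and the encodings are hypothetical and chosen only for illustration.

```python
from datetime import date

def extract_features(record, today):
    """Turn one raw patient record (a dict) into a numeric feature vector."""
    dob = record["dob"]
    # Age in whole years, derived from the date of birth.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return [
        float(age),
        1.0 if record["gender"] == "F" else 0.0,   # binary-encode gender
        float(record["sugar_tested_value"]),
        float(len(record["symptoms"])),            # number of reported symptoms
        1.0 if record["family_history"] else 0.0,  # binary-encode family history
    ]

# Hypothetical patient record
patient = {
    "dob": date(1980, 6, 15),
    "gender": "F",
    "sugar_tested_value": 145,
    "symptoms": ["increased thirst", "frequent urination"],
    "family_history": True,
}
features = extract_features(patient, today=date(2020, 1, 1))
print(features)
```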
ML Algorithm: KNN
The k-nearest neighbors (KNN) algorithm is a non-parametric ML method used for regression and classification. It was
first developed by Evelyn Fix and Joseph Hodges and later expanded by Thomas Cover. In industry, the algorithm is
mainly used for classification problems. KNN is a type of instance-based learning. Because the algorithm relies on
distance to classify objects, normalizing the training data can improve its accuracy dramatically. The neighbors are
taken from a set of objects for which the class (for classification) or the object property value (for regression) is
known. This set can be thought of as the training set for the algorithm, although no explicit training step is
required.
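The description above can be made concrete with a minimal KNN sketch in plain Python. It assumes Euclidean distance, k = 3, and already-normalized feature vectors; the toy points, labels, and "years until onset" values are invented for illustration. Using KNN regression for the time prediction is also an assumption: the paper states only that the same KNN algorithm is used for that module.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(train_points, query, k):
    """Return the indices of the k training points closest to the query."""
    order = sorted(range(len(train_points)),
                   key=lambda i: euclidean(train_points[i], query))
    return order[:k]

def knn_classify(train_points, train_labels, query, k=3):
    """Classification: majority vote among the k nearest neighbors."""
    votes = Counter(train_labels[i] for i in k_nearest(train_points, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train_points, train_values, query, k=3):
    """Regression: average of the k nearest neighbors' values."""
    idx = k_nearest(train_points, query, k)
    return sum(train_values[i] for i in idx) / len(idx)

# Hypothetical normalized features: [sugar_tested_value, age]
points = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.30],
          [0.80, 0.70], [0.90, 0.90], [0.85, 0.60]]
labels = ["No", "No", "No", "Yes", "Yes", "Yes"]
years_to_onset = [10.0, 12.0, 9.0, 2.0, 1.0, 3.0]  # invented values

print(knn_classify(points, labels, [0.82, 0.75]))
print(knn_regress(points, years_to_onset, [0.12, 0.20]))
```

There is no fitting step: the stored points are the model, which is why the text says no explicit training is required. An odd k avoids tied votes when there are two classes.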
Result
After taking the input data, the system predicts the outcome by applying the ML algorithm and provides the output that
best supports accurate detection and treatment of diabetes mellitus.
The prediction or detection of diabetes is of major importance because of its severe complications. The complications
of diabetes are shown in the picture below. Detecting diabetes mellitus at an early stage plays a significant role in
treating it.
Detecting diabetes plays a very important role in human life because the disease can lead to death. The proposed system
is used for early detection of diabetes and for time prediction, where time prediction means predicting when the
patient is likely to develop diabetes; this helps patients improve their habits. The proposed system concentrates
mainly on developing a machine learning model, and it is also helpful in the medical sector for identifying the
disease. The system automates the prediction of diabetes using data from previous patients.
System Design
System design is the process of defining the interfaces, modules, and data of a system so that it satisfies the
specified requirements. It can be seen as the application of systems theory. The main goal of designing a system is to
develop the system architecture by providing the data and information necessary to implement the system. In this
project a three-tier architecture is used.
DB Design
Implementation
Implementation can be described as the realization of an application, or the execution of a plan, idea, model, design,
specification, standard, algorithm, or policy. In computer science, an implementation is the realization of a technical
specification or algorithm as a program, a software component, or another computer system through computer programming
and deployment. Many implementations may exist for a given specification or standard.
Result Discussion
In our project the result is classified as Yes or No. If the result is classified as No, we use the time prediction
module, which predicts the "time" of getting diabetes. We analyze the results of the diabetes prediction and check its
accuracy, the time taken to compute it, and the numbers of correctly and incorrectly classified results. We used the
KNN algorithm to predict diabetes, where the result is classified as Yes or No, and the same KNN algorithm is used for
the time prediction module. We compared the predictions on the testing data with the actual data to obtain the accuracy
of our project.
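The quantities discussed above (accuracy, counts of correctly and incorrectly classified results, and time taken) can be computed as in the sketch below. The `evaluate` helper, the `threshold_rule` stand-in predictor, and the test values are hypothetical; in the actual system the predictor would be the trained KNN model.

```python
import time

def evaluate(predict, test_points, test_labels):
    """Compare predictions with actual labels and time the prediction step."""
    start = time.perf_counter()
    predictions = [predict(p) for p in test_points]
    elapsed = time.perf_counter() - start
    correct = sum(1 for p, t in zip(predictions, test_labels) if p == t)
    total = len(test_labels)
    return {
        "correct": correct,
        "incorrect": total - correct,
        "accuracy": correct / total,
        "seconds": elapsed,
    }

def threshold_rule(point):
    """Hypothetical stand-in for the trained model: threshold on the sugar value."""
    return "Yes" if point[0] >= 140 else "No"

# Invented testing data: one feature (sugar tested value) per patient
test_points = [[150], [120], [180], [135], [145]]
test_labels = ["Yes", "No", "Yes", "Yes", "No"]
report = evaluate(threshold_rule, test_points, test_labels)
print(report["correct"], report["incorrect"], report["accuracy"])
```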
Conclusion
The prediction of diabetes is of great importance today because of its severe complications; diabetes is one of the
leading causes of death worldwide. The system model focuses mainly on identifying diabetes from a set of parameters.
The system helps physicians predict diabetes at an early stage, so that conventional treatments and solutions can be
given to patients. The system uses ML techniques for prediction in order to obtain more precise results. There has been
a great deal of research on diabetes prediction. A diabetes prediction system is useful for hospitals and doctors: it
predicts the disease at an early stage, so doctors can treat patients better. The proposed model is a real-time
application meant for multiple hospitals, and it predicts the disease in less time. Because we use machine learning
algorithms for disease prediction, we expect more accurate and efficient results.
Future Scope
The proposed system uses the KNN algorithm to detect diabetes. Data science offers many other classification
algorithms, such as Naive Bayes, SVM, Decision Tree, and ID3. In future, more algorithms can be added and compared to
find the most efficient one. A visitor query module can be added, where visitors post queries to the administrator and
the administrator replies to them. A treatment module can also be added, where doctors upload treatment details for
patients and patients can view them.
REFERENCES
[1] Md. Faisal Faruque, Asaduzzaman, and Iqbal H. Sarker, “Performance Analysis of Machine Learning Techniques to
Predict Diabetes Mellitus,” IEEE, 2019.
[2] Sidong Wei, Xuejiao Zhao, and Chunyan Miao, “A Comprehensive Exploration to the Machine Learning Techniques for
Diabetes Identification,” Shanghai Jiao Tong University, China.
[3] Lakshmi K. S. and G. Santhosh Kumar, “Association Rule Extraction from Medical Transcripts of Diabetic Patients,”
2014.
[4] “Diabetes Care Decision Support System,” 2nd International Conference on Industrial and Information Systems, IEEE,
2010.
[5] M. M. Alotaibi, R. S. H. Istepanian, A. Sungoor, and N. Philip, “An Intelligent Mobile Diabetes Management and
Educational System for Saudi Arabia: System Architecture,” IEEE, 2014.
[6] Berina Alic and Lejla Gurbeta, “Machine Learning Techniques for Classification of Diabetes and Cardiovascular
Diseases,” IEEE, 2017.
[7] M. Durgadevi and R. Kalpana, “Performance Analysis of Classification Approaches for the Prediction of Type II
Diabetes,” IEEE, 2017.
[8] Elliot B. Sloane (Senior Member, IEEE), Nilmini Wickramasinghe, and Steve Goldberg, “Cloud-Based Diabetes Coaching
Platform for Diabetes Management,” 2016.
[9] Minyechil Alehegn and Rahul Joshi, “Analysis and prediction of diabetes diseases using machine learning algorithm,”
International Research Journal of Engineering and Technology, Volume 04, Issue 10, Oct. 2017.
[10] P. Suresh Kumar and V. Umatejaswi, “Diagnosing Diabetes using Data Mining Techniques,” International Journal of
Scientific and Research Publications, Volume 7, Issue 6, June 2017 705 ISSN 2250-3153.
[11] Razan Paul and Abu Sayed Md. Latiful Hoque, “Clustering Medical Data to Predict the Likelihood of Diseases,” IEEE,
2010.
[12] “Robust Parameter Estimation in a Model for Glucose Kinetics in Type 1 Diabetes Subjects,” Proceedings of the 28th
IEEE EMBS Annual International Conference, New York City, USA, Aug. 30-Sept. 3, 2006.
[13] Anjali C. and Veena Vijayan V., “Prediction and Diagnosis of Diabetes Mellitus: A Machine Learning Approach,” 2015
IEEE, Intelligent Computational Systems (RAICS), Trivandrum.
[14] Ridam Pal, Dr. Jayanta Poray, and Mainak Sen, “Application of Machine Learning Algorithms on Diabetic Retinopathy,”
2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology, May
19-20, 2017, India.
[15] Dr. M. Renuka Devi and J. Maria Shyla, “Analysis of Various Data Mining Techniques to Predict Diabetes Mellitus,”
International Journal ISSN 0973-4562, Volume 11, Number 1 (2016), pp. 727-730.