Activity Report On
“MACHINE LEARNING”
(21CS601)
in
Computer Science and Engineering
Under the Guidance of
Dr H M Keerthi Kumar
Associate Professor
Department of Computer Science and Engineering
Malnad College of Engineering
Submitted by
References:
3. Saima Sharleen Islam, Md. Samiul Haque, M. Saef Ullah Miah, Talha Bin Sarwar, Ramdhan Nugraha, "Application of machine learning algorithms to predict the thyroid disease risk: an experimental comparative study".
4. A.K.M Sazzadur Rahman, F. M. Javed Mehedi Shamrat, Zarrin Tasnim, Joy Roy, Syed Akhter Hossain, "A Comparative Study on Liver Disease Prediction Using Supervised Machine Learning Algorithms", International Journal of Scientific & Technology Research, Volume 8, Issue 11, November 2019.
INTRODUCTION:
DOMAIN: HUMAN DISEASES
Datasets in human diseases are collections of data gathered specifically to study various aspects of diseases affecting humans. These datasets can include information such as clinical data, genomic data, imaging data, etc. Human disease datasets are used by researchers, clinicians, and public health officials to improve our understanding of diseases, develop new treatments and interventions, and enhance patient care. They are often stored in databases and repositories and can be accessed for research purposes while ensuring patient privacy and data security.
Decision Tree: A Decision Tree builds a tree structure for the classification problem. The final nodes in the tree are leaf nodes or decision nodes. Decision Tree classification is based on the core ID3 algorithm, which uses entropy and information gain to build the decision tree.
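A minimal sketch of the entropy and information-gain computation at the core of ID3, under the assumption of categorical attributes; the function and parameter names are illustrative:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting on one attribute.
    rows: list of dicts mapping attribute name -> value."""
    total = len(labels)
    gain = entropy(labels)
    for v in set(r[attribute] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[attribute] == v]
        gain -= (len(subset) / total) * entropy(subset)
    return gain
```

ID3 repeatedly picks the attribute with the highest information gain, splits on it, and recurses on each subset until a node is pure.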
Naïve Bayes algorithm: The Naïve Bayes algorithm is widely applied in classification and prediction problems. Due to its simplicity, it is very useful for large datasets and produces promising outcomes. Under Bayes' theorem, the value of an unknown data point in a given dataset is treated as independent of the other unknown data points. The Naïve Bayesian classifier assumes each individual feature has an independent distribution and therefore estimates the probability of each class for an unknown data point as given below:

P(C | x1, ..., xn) ∝ P(C) · P(x1 | C) · ... · P(xn | C)
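A minimal sketch of this independence assumption in code; the prior and likelihood tables are hypothetical inputs assumed to have been estimated beforehand from a training set:

```python
def naive_bayes_posterior(x, prior, likelihood):
    """Unnormalized posterior P(C) * prod_i P(x_i | C) for each class.
    prior: dict class -> P(C)
    likelihood: dict (class, feature_index, value) -> P(x_i = value | C)"""
    scores = {}
    for c, p in prior.items():
        score = p
        for i, v in enumerate(x):
            # small floor stands in for smoothing of unseen feature values
            score *= likelihood.get((c, i, v), 1e-9)
        scores[c] = score
    return scores
```

The class with the largest score is the prediction; normalizing the scores by their sum yields the class probabilities.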
k-NN algorithm: The k-nearest neighbor (k-NN) algorithm has been widely used for classification, estimation, and prediction. Based on the training set, the k-NN algorithm selects the nearest known data points and labels/predicts the unknown data point. The k represents the number of nearest neighbors considered for the unknown data point. For instance, if k=2 then k-NN would choose the two closest known data points to classify the new unknown data point. The similarity between data points is measured by various distance measures.
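A short sketch of this procedure with Euclidean distance (other distance measures can be swapped into the key function); names are illustrative:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=2):
    """Label x by majority vote among its k nearest training points."""
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda pl: math.dist(pl[0], x))
    k_labels = [label for _, label in neighbors[:k]]
    return Counter(k_labels).most_common(1)[0][0]
```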
The performance of the classification model is computed based on the confusion matrix. This matrix is the basis for the common evaluation metrics such as precision, recall, accuracy, and F-measure (Fawcett, 2004).
PRECISION = TP / (TP + FP)
Recall - This is the sensitivity. In the expression of recall, P is the actual total of POSITIVE data points.

RECALL = TP / P
Accuracy - This value presents the correctness of the classification model in predicting unknown data points. In the expression of accuracy, N is the actual total of NEGATIVE data points, and P is the actual total of POSITIVE data points.

ACCURACY = (TP + TN) / (P + N)
F-MEASURE = 2 / (1/PRECISION + 1/RECALL)
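The four formulas translate directly into code; a minimal sketch, assuming the confusion-matrix counts are already known and that there is at least one actual and one predicted positive:

```python
def metrics(tp, fp, tn, fn):
    """Precision, recall, accuracy and F-measure from confusion-matrix counts.
    P = tp + fn (actual positives), N = tn + fp (actual negatives)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 / (1 / precision + 1 / recall)
    return precision, recall, accuracy, f_measure
```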
Results:
To evaluate the performance of each considered classification algorithm, the dataset has been segmented into different cases. Each case represents a distribution of the dataset. The different cases considered in this study allow the performance of the classification model to be evaluated with respect to a varying number of attributes and their varying values in the dataset (a sketch of this attribute selection follows the case list below).
Case 1: In this case, all the attributes (i.e., 14 attributes) of the dataset are used. Attributes: age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal, num
Case 2: A subset (i.e., 8 attributes) of the dataset is used in this case. Attributes: age, sex, num, trestbps, chol, restecg, thalach, ca
Case 3: A subset (i.e., 7 attributes) of the dataset is used in this case. Attributes: ca, fbs, exang, oldpeak, slope, thal, num
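A sketch of how the three cases could be realized with pandas, assuming the heart-disease data sits in a CSV with the column names listed above (the file name is hypothetical):

```python
import pandas as pd

df = pd.read_csv("heart_disease.csv")  # hypothetical file name

cases = {
    "case1": ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
              "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"],
    "case2": ["age", "sex", "num", "trestbps", "chol", "restecg",
              "thalach", "ca"],
    "case3": ["ca", "fbs", "exang", "oldpeak", "slope", "thal", "num"],
}

for name, cols in cases.items():
    subset = df[cols]
    # "num" is the target class; the remaining columns are the features
    X, y = subset.drop(columns="num"), subset["num"]
    # ...train and evaluate a classifier on (X, y) for this case...
```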
Fig. 2. k-NN Classification/Prediction Results
Description of dataset:
This is a classic dataset for training and benchmarking machine learning algorithms. It contains biopsy features for classification of 569 malignant (cancer) and benign (not cancer) breast masses. Features were computationally extracted from digital images of fine needle aspirate biopsy slides and correspond to properties of cell nuclei, such as size, shape, and regularity. The mean, standard error, and worst value of each of 10 nuclear parameters are reported, for a total of 30 features.
Algorithms:
In our project, predictive analysis is performed using machine learning algorithms. The machine learning algorithms applied in our project are:
• Support Vector Machine (SVM) is a classifier which divides the dataset into classes by finding a maximum marginal hyperplane (MMH) via the nearest data points.
• Random forests, or random decision forests, are an ensemble method for classification, regression, and other tasks that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees. Random decision forests correct for decision trees' habit of overfitting to their training set.
• Decision Tree C4.5 is a predictive modeling tool that can be applied across many areas. It can be constructed by an algorithmic approach that splits the dataset in different ways based on different conditions (a training sketch of these three classifiers follows this list).
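A minimal scikit-learn sketch of the three classifiers on the breast-cancer dataset described above; the hyperparameters are illustrative, and note that scikit-learn's DecisionTreeClassifier implements CART rather than C4.5, so the entropy criterion is used as the closest stand-in:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# 569 samples, 30 nuclear features, benign/malignant labels
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    "SVM": SVC(kernel="linear"),
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "Decision Tree": DecisionTreeClassifier(criterion="entropy"),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))  # testing-set accuracy
```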
The classifiers' performances were evaluated using metrics such as Accuracy, Precision, Recall, and F1-Score.
The Accuracy metric defines how correctly the classifiers used in the experiment performed the classification task. Accuracy measures the proportion of correctly classified instances (TP + TN) against the overall number of classified instances. The metric is defined as follows:

Acc. = (TP + TN) / (TP + FP + TN + FN)
Precision evaluates the proportion of the data instances that were predicted as true and were actually true in the experiment, i.e., the fraction of relevant instances among all retrieved instances. It is defined as follows:

Precision = TP / (TP + FP)
Recall evaluates the proportion of the actual true data instances that were predicted correctly as true in the experiment, i.e., the fraction of retrieved instances among all relevant instances. It is defined as follows:

Recall = TP / (TP + FN)
The F1-Score evaluation metric was used to measure the harmonic mean between the precision and recall scores of the classifiers. The metric was used to find a fair balance between the two metric values of the classifiers. It is defined as follows:

F1-score = (2 × Recall × Precision) / (Recall + Precision)
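All four metrics are also available directly from scikit-learn, so they do not need to be computed by hand; a short sketch with hypothetical label arrays:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0, 1]   # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1]   # hypothetical classifier output

print(confusion_matrix(y_true, y_pred))   # rows: actual, columns: predicted
print(accuracy_score(y_true, y_pred))
print(precision_score(y_true, y_pred))
print(recall_score(y_true, y_pred))
print(f1_score(y_true, y_pred))
```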
Results:
The confusion matrix for each model is formulated to evaluate the classifier.
From the results on the training set and testing set, we can see that all the classifiers have varying accuracies, but SVM consistently has a higher testing-set accuracy (97.2%) than the other classifiers. Since confusion matrices are a useful way to assess a classifier, each row in Table 2 represents the rates in an actual class while each column displays the predictions. Table 3 presents the calculated performance measures of the classification models based on the confusion matrix results: precision, sensitivity, and F1-score for the benign and malignant classes.
Description of dataset:
The dataset has 3,162 rows and 25 data columns. The related allbp.data dataset includes only 2,800 instances and contains no missing values. The training set contains 2,800 instances and the test set 972 instances. Although the dataset is sizable, we discovered that relatively few researchers have previously worked with it. As a result, we concentrated on this dataset and studied the results in order to aid future researchers in predicting similar types of multiclass thyroid datasets.
Algorithms:
In our project, predictive analysis is performed using machine learning algorithms. The machine learning algorithms applied in our project are:
KNN
K-Nearest Neighbors (KNN) is a lazy learning technique: all training data is incorporated into the testing process. While this makes training fast, it slows testing and requires a great deal of time and memory. When building the model, the number of neighbors (K) must be specified in KNN. Here, K acts as a controlling variable for the prediction model. K is usually chosen as an odd number when the number of classes is even, to avoid ties.
ANN
The input neurons receive the data that we feed the ANN in the input layer. These neurons transmit the data to the hidden layer, which performs the core computation, and then pass the results to the output neurons in the output layer, which stores the network's final calculations for future use. After training, the ANN can produce outputs even from insufficient data. It can also learn on its own and provide results.
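A minimal sketch of such a feed-forward network using scikit-learn's MLPClassifier; the synthetic data is only a stand-in for the thyroid dataset and the layer size is illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# synthetic multiclass stand-in (hypothetical shapes, not the real data)
X, y = make_classification(n_samples=500, n_features=25, n_classes=3,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# one hidden layer of 16 neurons between the input and output layers
ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
ann.fit(X_train, y_train)
print(ann.score(X_test, y_test))  # testing-set accuracy
```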
Decision tree
The decision tree technique is a subset of supervised machine learning that is based on a continuous data-splitting mechanism across specified parameters. After obtaining the training data, it splits the dataset into small subsets using criteria such as the Gini Index, Information Gain, Entropy, Gain Ratio, and Chi-Square. The Gini Index and Information Gain are measured in the majority of datasets using Eqs. (1) and (2), respectively (their standard forms are given below). The process is repeated for each child tuple until all tuples belong to the same class and no additional attributes are needed. The Gini Index is used to increase precision and diagnostic accuracy.
This attribute then subdivides the dataset into smaller subsets for each child until there are no
more attributes.
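For reference, the standard forms behind Eqs. (1) and (2), with p_i the proportion of class i in a node S and S_v the subset of S taking value v for attribute A:

Gini(S) = 1 − Σ_i p_i²   (Eq. 1)
Entropy(S) = − Σ_i p_i log2 p_i
Gain(S, A) = Entropy(S) − Σ_v (|S_v| / |S|) · Entropy(S_v)   (Eq. 2)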
GaussianNB
The Gaussian Naive Bayes (GaussianNB) technique is predicated on the assumption of predictor independence. As each feature is independent, the inclusion of one feature does not affect the contribution of the other features in the GaussianNB algorithm. The algorithm is based on Bayes' theorem, and the probability can be calculated as follows:
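P(A|B) = P(B|A) · P(A) / P(B)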
The probability P(A|B) that we are interested in computing is referred to as the posterior
probability. P(A) is the probability of occurrence of event A. Similarly, P(B) is the probability
of occurrence of event B.
The classifiers' performances were evaluated using metrics such as Accuracy, Precision, Recall, and F1-Score.
Accuracy:
In classification problems, accuracy refers to the proportion of correct predictions made by the model over all predictions. It is calculated by dividing the number of correct predictions by the total number of predictions and multiplying by 100.
Here, TP is True Positive: a person actually has euthyroid sick syndrome, and the model classifying the case as sick-euthyroid counts as a True Positive.
Precision:
Precision is the ratio of true positives to the total predicted positives. Here, precision is a measure that tells us what proportion of patients diagnosed as having euthyroid sick syndrome actually had euthyroid sick syndrome. The predicted positives (people predicted as sick-euthyroid) are TP and FP, and the people among them actually having euthyroid sick syndrome are TP.
F1-score: Precision and recall are combined in the F1-score metric. Indeed, the F1-score is the harmonic mean of the two. A high F1-score indicates both high precision and high recall. It strikes a good balance between precision and recall and performs well on problems involving imbalanced classification.
Recall:
Recall is the ratio of true positives to all positives in the ground truth. The actual positives (those with euthyroid sick syndrome) are TP and FN, and the patients correctly diagnosed with euthyroid sick syndrome by the model are TP. FN is included since the person did indeed have euthyroid sick syndrome, despite the model's prediction.
Results:
The confusion matrix for each model is formulated to evaluate the classifier.
We used one neural network model (ANN), six tree-based models (CatBoost, XGBoost, Random Forest, LightGBM, Decision Tree, and Extra-Trees), and three statistical models (SVC, KNN, and GaussianNB) in this study. Experimental results are scrutinized using accuracy, precision, recall, F1-score, and learning curves. These evaluation metrics are compared across the ten classification algorithms.
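A sketch of such a comparison loop over the scikit-learn members of that list; CatBoost, XGBoost, and LightGBM come from their own packages but follow the same fit/predict API, and the synthetic data below is only a stand-in for the thyroid dataset:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=3000, n_features=25, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

models = {
    "ANN": MLPClassifier(max_iter=1000),
    "Random Forest": RandomForestClassifier(),
    "Extra-Trees": ExtraTreesClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(),
    "GaussianNB": GaussianNB(),
    "SVC": SVC(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"{name}: acc={accuracy_score(y_test, pred):.3f} "
          f"f1={f1_score(y_test, pred):.3f}")
```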
Description of dataset:
The dataset consists of 583 liver patients' records, of which 75.64% are male patients and 24.36% are female patients. The dataset contains 11 parameters, of which we chose 10 parameters for our further analysis and 1 parameter as the target class.
Algorithms:
In our project, predictive analysis is performed using machine learning algorithms. The machine learning algorithms applied in our project are:
Decision Tree:
The general idea of utilizing a Decision Tree is to build a training model that can be used to predict the class or value of target variables by learning decision rules derived from prior data (training data).
Fig. 4: Sample of the process of Decision Trees.
K-Nearest Neighbors (KNN):
KNN is one of the most fundamental instance-based classification algorithms in Machine Learning. KNN works on the idea that samples that are near each other belong to the same class. A KNN assigns an example to the class that is most common among its K nearest neighbors. K is a parameter for tuning the classification algorithm.
The classifiers' performances were evaluated using metrics such as Accuracy, Precision, Recall, and F1-Score.
Accuracy
This value presents the correctness of the classification model in predicting unknown data points. In the expression of accuracy, N is the actual total of NEGATIVE data points, and P is the actual total of POSITIVE data points.
Precision:
It is otherwise called the positive predictive value. It gives the proportion of correctly predicted positive outcomes by the classifier algorithm.
F1:
It measures the performance of the model as a blend of precision and recall. It accounts for both the FP and FN of a model.
Results:
The confusion matrix for each model is formulated to evaluate the classifier.