Breast Cancer Classification

This document provides a summary of a project to classify breast cancer tumors using machine learning techniques. It includes an introduction to breast cancer and machine learning, a literature review of previous classification studies, a description of the Wisconsin Breast Cancer Dataset used, data visualization, preprocessing steps like standardization and principal component analysis, and evaluations of k-Nearest Neighbors and Naive Bayes classification models. The models are compared using accuracy scores and confusion matrices to determine the most effective approach for this breast cancer classification task.

Uploaded by

Satwik Sridhar Reddy

BREAST CANCER CLASSIFICATION

Project report submitted for EED363 Applied Machine Learning

ENDTERM REPORT

Submitted by

1. Satwik Boojala ( 1610110340 )

2. Kaliki Sai Preetham ( 1610110452 )

Department of Electrical Engineering

School of Engineering

INDEX
1) Introduction

2) Literature review

3) Dataset

4) Data Visualisation

5) Creating Training and Test data

6) Correlation matrix of the Data

7) Standardising the Data

8) Principal component analysis

9) Training and Evaluating different Classification Models

● K- Nearest Neighbors (KNN)

● Naive-Bayes (NB)

10) Confusion Matrix and Accuracy scores

11) ROC curves

12) References

INTRODUCTION
Many women are diagnosed with breast cancer. Second only to lung cancer, breast cancer is the
second most common cause of cancer death in both developed and developing countries. Every
year, one million women are newly diagnosed with breast cancer and, according to a report of
the World Health Organization, half of them will die because the cancer is usually detected
late. Breast cancer is caused by a mutation in a single cell, which is either shut down by the
body or leads to uncontrolled cell division. Breast cancer is characterized by the mutation of
genes, constant pain, and changes in the size, color (redness) and skin texture of the breasts.
Classification of breast cancer helps pathologists reach a systematic and objective
prognosis. The most frequent classification is binary (benign/malignant).

The early diagnosis of breast cancer can significantly improve the prognosis and chance of
survival, as it allows timely clinical treatment of patients. Furthermore, accurate
classification of benign tumors can spare patients unnecessary treatments.
Thus, the correct diagnosis of breast cancer and the classification of patients into malignant
or benign groups is the subject of much research. Because of their ability to detect critical
features in complex breast cancer datasets, Machine Learning (ML) techniques
are widely used for the breast cancer classification problem. They provide high
classification accuracy and effective diagnostic capabilities.

The relation between breast cancer and machine learning is not recent: machine learning has
been used for decades to classify tumors and other malignancies, predict sequences of genes
responsible for cancer, and determine the prognosis. The aim of classification is to assign
each observation to the category it belongs to.

LITERATURE REVIEW:

Many studies have been carried out in the field of breast cancer classification. Some of them
used mammography images, while others classified breast cancers with techniques such as the
Softmax Discriminant Classifier (SDC), Linear Discriminant Analysis (LDA), and Fuzzy C-Means
Clustering. The k-nearest neighbors (KNN) algorithm is one of the most widely used algorithms
in machine learning. In cancer classification, KNN has been evaluated in terms of its
false positive rate. Naive Bayes classifiers (NBC) are generally used to predict biological,
chemical and physiological properties. In cancer classification, NBC are sometimes
combined with other classifiers, such as decision trees, to determine prognoses or build
classification models. Different classification techniques have been developed for breast
cancer diagnosis, and the accuracy of many of them was evaluated using the dataset taken from
the Wisconsin breast cancer database. For example, the optimized learning vector method
achieved 96.7%, the big LVQ method was also evaluated, and an SVM for cancer diagnosis reached
97.13%, the highest accuracy in the literature.
DATASET:

Breast cancer Wisconsin Dataset-

https://www.kaggle.com/uciml/breast-cancer-wisconsin-data

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast
mass. They describe characteristics of the cell nuclei present in the image.
Classification type: Binary
Class distribution: 357 benign, 212 malignant

Attribute Information:
1) ID number
2) Diagnosis (M = malignant, B = benign)

Ten real-valued features are computed for each cell nucleus:

a) Radius (mean of distances from center to points on the perimeter)
b) Texture (standard deviation of gray-scale values)
c) Perimeter
d) Area
e) Smoothness (local variation in radius lengths)
f) Compactness (perimeter^2 / area - 1.0)
g) Concavity (severity of concave portions of the contour)
h) Concave points (number of concave portions of the contour)
i) Symmetry
j) Fractal dimension ("coastline approximation" - 1)
The mean, standard error and "worst" or largest (mean of the three
largest values) of these features were computed for each image,
resulting in 30 features. For instance, field 3 is Mean Radius, field
13 is Radius SE, field 23 is Worst Radius.
All feature values are recorded with four significant digits.
Missing attribute values: none
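
As a quick check of the dataset description above, the following minimal sketch loads the data and counts the classes. It assumes that the copy of the Wisconsin diagnostic dataset bundled with scikit-learn (load_breast_cancer) matches the Kaggle file used in the report. The bundled copy omits the ID column and encodes the diagnosis as 0/1 instead of M/B.

```python
from collections import Counter

from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled copy stands in for the Kaggle CSV.
data = load_breast_cancer()
X, y = data.data, data.target  # X: (n_samples, n_features), y: 0/1 labels

# In scikit-learn's encoding, target_names is ['malignant', 'benign'],
# so label 0 means malignant and label 1 means benign.
class_counts = Counter(str(data.target_names[label]) for label in y)

print(X.shape)
print(dict(class_counts))
```

The shape and the class counts should agree with the figures listed above (30 features, 357 benign and 212 malignant samples).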

Class distribution: 357 benign (B), 212 malignant (M)
DATA VISUALIZATION:

Histograms of all the feature vectors

These histograms show that a feature such as mean fractal dimension plays very little role in
separating malignant from benign cases, whereas worst concave points and worst perimeter are
useful features that give strong hints about the class of a sample. Even a single feature,
e.g. worst perimeter, can go a long way towards separating malignant from benign cases.
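
The visual impression from the histograms can be backed by a number. The sketch below is an illustration rather than part of the original analysis: it computes, for a few features, how much of the malignant and benign histograms overlap. The helper histogram_overlap is a hypothetical name introduced here, and the data is again assumed to be scikit-learn's bundled copy of the dataset.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target          # 0 = malignant, 1 = benign
names = list(data.feature_names)

def histogram_overlap(values, labels, bins=30):
    """Shared area of the two normalised class histograms (0 = disjoint, 1 = identical)."""
    edges = np.histogram_bin_edges(values, bins=bins)
    malignant, _ = np.histogram(values[labels == 0], bins=edges)
    benign, _ = np.histogram(values[labels == 1], bins=edges)
    # Convert counts to fractions so the unequal class sizes do not matter.
    malignant = malignant / malignant.sum()
    benign = benign / benign.sum()
    return float(np.minimum(malignant, benign).sum())

overlaps = {
    name: histogram_overlap(X[:, names.index(name)], y)
    for name in ("mean fractal dimension", "worst perimeter", "worst concave points")
}
for name, value in overlaps.items():
    print(f"{name}: overlap = {value:.2f}")
```

A small overlap means the two classes occupy different ranges of that feature, which is what makes the feature useful for classification.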

CREATE TRAINING AND TESTING DATA:

Scikit-Learn provides a few functions to split datasets into multiple subsets in various
ways. The simplest function is train_test_split, which shuffles the samples and splits them
into a training set and a test set. It has a random_state parameter that allows you to set
the random generator seed, so that the split is reproducible, and you can pass it multiple
datasets with an identical number of rows, which it will split on the same indices (this is
very useful, for example, if you have a separate DataFrame for labels). We used:
80% of the samples for training
20% of the samples for testing
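
A minimal sketch of such a split is shown below. It assumes the conventional arrangement of 80% of the samples for training and 20% for testing. The seed value 42 and the stratify argument (which keeps the benign/malignant ratio equal in both subsets) are illustrative choices that the report does not mention.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Features and labels are passed together, so both are split on the same indices.
# test_size=0.2 and random_state=42 are illustrative choices, not the report's.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```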

CORRELATION MATRIX:

The Pandas corr() function is used to find the pairwise correlation of all columns in the breast
cancer dataframe. Correlation describes the strength of the relationship between two
variables: a high (strong) correlation means that the two variables move closely together.

Correlation matrix size- 30*30

A heatmap is a two-dimensional graphical representation of the data values contained in a
matrix. The seaborn Python package allows the creation of heatmaps, which can be tweaked
using matplotlib tools.
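
The correlation matrix can be computed as in the sketch below, which assumes the 30 features are held in a pandas DataFrame built from scikit-learn's bundled copy of the dataset. The plotting call is left as a comment so that the example runs without seaborn installed.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)

# Pairwise Pearson correlation between all feature columns.
corr = df.corr()
print(corr.shape)

# Strongest off-diagonal correlation for one feature, as a quick sanity check.
partners = corr["mean radius"].drop("mean radius")
print(partners.idxmax(), round(partners.max(), 3))

# With seaborn installed, the matrix could be drawn as a heatmap:
# import seaborn as sns; sns.heatmap(corr, cmap="coolwarm")
```

Geometric features such as radius, perimeter and area are almost perfectly correlated, which is one motivation for the dimensionality reduction step.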

STANDARDISING DATA:

Features are standardized by removing the mean (i.e. centering it at 0) and scaling to unit variance.

The standard score of a sample x is calculated as:

z = (x - u) / s

where u is the mean of the training samples (or zero if with_mean=False) and s is the
standard deviation of the training samples (or one if with_std=False).

Standardization of a dataset is a common requirement for many machine learning estimators:
they might behave badly if the individual features do not more or less look like standard
normally distributed data (e.g. Gaussian with 0 mean and unit variance).
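
The with_mean and with_std options mentioned above belong to scikit-learn's StandardScaler. The following sketch, which assumes an 80/20 split with an arbitrary seed, fits the scaler on the training samples only and checks the result against the formula z = (x - u) / s computed by hand.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training samples only, so u and s come from the training set.
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)  # reuses the training u and s

# The same result by hand: z = (x - u) / s
u = X_train.mean(axis=0)
s = X_train.std(axis=0)  # population standard deviation, as StandardScaler uses
manual = (X_train - u) / s

print(np.allclose(manual, X_train_std))
```

Reusing the training mean and standard deviation for the test set keeps information from the test samples out of the preprocessing step.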

PRINCIPAL COMPONENT ANALYSIS:

Principal component analysis (PCA) is a technique for reducing the dimensionality of large
datasets, increasing interpretability while minimizing information loss. It does
so by creating new uncorrelated variables that successively maximize variance.
Based on the correlation matrix, and since the dimensionality of our dataset is high
(30 features), we use PCA to reduce the dimensions of our data.
PCA scatterplot with two components (legend: Benign, Malignant)
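
A two-component projection of this kind can be obtained as sketched below. The example assumes the features are standardized first and, since it is only meant for visualization, fits PCA on the whole dataset rather than on the training split.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# PCA is sensitive to feature scale, so standardize first.
X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_std)  # columns are the first two principal components

print(X_2d.shape)
print(pca.explained_variance_ratio_)
```

The explained_variance_ratio_ attribute reports the fraction of the total variance captured by each component, which indicates how much information the two-dimensional plot retains.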
K- NEAREST NEIGHBORS CLASSIFIER:

k-Nearest Neighbors (k-NN) is an example of a classification algorithm, i.e. an algorithm
used to place a data point in a particular category. It relies on the idea that points which
lie close to each other in feature space tend to belong to the same class. In this algorithm,
k is the number of neighbors that are consulted when a new data point is classified.

To classify a new point, the algorithm computes its distance to every training point, selects
the k nearest ones and assigns the class that is most common among them. A different value of
k, and therefore a different set of neighbors, can lead to a different result. k-NN makes no
assumption about the distribution of the data, which makes it helpful when only a limited
amount is known about the situation.
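
To make the voting idea concrete, here is a minimal from-scratch sketch of k-nearest-neighbors classification. It is an illustration only, not the scikit-learn implementation. The function name knn_predict and the toy points are hypothetical, and Euclidean distance is assumed.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the labels of the k closest points.
    nearest_labels = [label for _, label in sorted(distances)[:k]]
    # The most common label among the neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical two-feature toy data, not taken from the Wisconsin dataset.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["B", "B", "B", "M", "M", "M"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (2.9, 3.1)))
```

Because every prediction depends on distances, features with large numeric ranges dominate the vote unless the data is standardized first.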

Finding the optimum k value:

From the plot of accuracy against the number of neighbors, k = 3 gives the highest accuracy
on the training and testing data. We therefore set the number of neighbors to 3 and validate
the k-NN score.
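
A search over k of this kind could look like the sketch below. The range 1 to 15, the 80/20 split and the seed are assumptions made for illustration, so the best k found here need not equal the value reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler().fit(X_train)
X_train_std, X_test_std = scaler.transform(X_train), scaler.transform(X_test)

# Record training and testing accuracy for each candidate k.
scores = {}
for k in range(1, 16):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train_std, y_train)
    scores[k] = (knn.score(X_train_std, y_train), knn.score(X_test_std, y_test))

# Pick the k with the highest testing accuracy (smallest k wins ties).
best_k = max(scores, key=lambda k: scores[k][1])
print(best_k, scores[best_k])
```

Choosing k on the test set gives a somewhat optimistic estimate. Cross-validation on the training set is a common alternative when an unbiased test score is needed.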
NAÏVE BAYES CLASSIFIER:

Naive Bayes is a classification technique based on Bayes' theorem with an assumption of
independence among predictors. In simple terms, a Naive Bayes classifier assumes that the
presence of a particular feature in a class is unrelated to the presence of any other feature.

A Naive Bayes model is easy to build and particularly useful for very large data sets. Despite
its simplicity, Naive Bayes can sometimes outperform more sophisticated classification
methods.

Bayes' theorem provides a way of calculating the posterior probability P(c|x) from P(c), P(x)
and P(x|c):

P(c|x) = P(x|c) P(c) / P(x)

where

● P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
● P(c) is the prior probability of class.
● P(x|c) is the likelihood which is the probability of a predictor given class.
● P(x) is the prior probability of the predictor.

Three Naive Bayes models available in the scikit-learn library are:

● Gaussian: used for classification with continuous features, which are assumed to follow a
normal distribution.
● Multinomial: used for discrete counts.
● Bernoulli: useful if your feature vectors are binary (i.e. zeros and ones).

Since the features in our problem are continuous, real-valued measurements, we have used the
Gaussian type of Naïve Bayes classifier.
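
A minimal sketch of training a Gaussian Naive Bayes classifier with scikit-learn follows. The 80/20 split and the seed are assumptions, and the model is fitted on the raw features, since the report does not state which preprocessed version was used.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# GaussianNB estimates a per-class mean and variance for every feature.
nb = GaussianNB().fit(X_train, y_train)

print(nb.class_prior_)                    # P(c) estimated from the training labels
print(nb.predict_proba(X_test[:3]))       # posterior P(c|x) for three test samples
print(nb.score(X_test, y_test))           # accuracy on the held-out samples
```

The class_prior_ and predict_proba outputs correspond to the prior P(c) and the posterior P(c|x) in Bayes' theorem.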
CONFUSION MATRIX:

In the field of machine learning, and specifically the problem of statistical classification,
a confusion matrix, also known as an error matrix, is a specific table layout that allows
visualization of the performance of an algorithm, typically a supervised learning one (in
unsupervised learning it is usually called a matching matrix). Each row of the matrix
represents the instances in a predicted class while each column represents the instances in
an actual class (or vice versa).
The name stems from the fact that it makes it easy to see if the system is confusing two
classes (i.e. commonly mislabelling one as another).
It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and
identical sets of "classes" in both dimensions (each combination of dimension and class is a
variable in the contingency table).

True positive (TP) - number of positives correctly predicted

True negative (TN) - number of negatives correctly predicted

False positive (FP) - number of negatives predicted as positives

False negative (FN) - number of positives predicted as negatives
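
The four counts can be tallied directly from pairs of actual and predicted labels, as in the sketch below. The function confusion_counts and the eight example labels are hypothetical. Malignant (M) is treated as the positive class, which is an assumption, since the report does not say which class it regards as positive.

```python
def confusion_counts(actual, predicted, positive="M"):
    """Return (TP, TN, FP, FN), treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1   # positive correctly predicted
        elif p != positive and a != positive:
            tn += 1   # negative correctly predicted
        elif p == positive:
            fp += 1   # negative predicted as positive
        else:
            fn += 1   # positive predicted as negative
    return tp, tn, fp, fn

# Hypothetical labels for eight tumors (M = malignant, B = benign).
actual    = ["M", "M", "M", "B", "B", "B", "B", "M"]
predicted = ["M", "M", "B", "B", "B", "M", "B", "M"]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)
```

In a diagnostic setting a false negative (a malignant tumor predicted as benign) is usually the more costly error, so the two error counts are worth inspecting separately.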

ACCURACY SCORE:
The accuracy score is calculated from the confusion matrix, which holds the numbers of true
positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). The
formula for the accuracy score is

Accuracy = (TP + TN) / (TP + TN + FP + FN)
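
Accuracy is the share of correct predictions, (TP + TN) / (TP + TN + FP + FN). The sketch below computes it from the confusion matrix for both classifiers and compares it with scikit-learn's accuracy_score. The split, the seed, the use of k = 3 and the recoding of malignant as the positive class are assumptions made for illustration, so the printed numbers are not the report's results.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X = data.data
y = (data.target == 0).astype(int)  # recode so that 1 = malignant (positive class)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3)),
    "NB": GaussianNB(),
}
results = {}
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    # scikit-learn puts actual classes on rows and predicted classes on columns.
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    by_formula = (tp + tn) / (tp + tn + fp + fn)
    results[name] = (by_formula, accuracy_score(y_test, y_pred))
    print(name, (tp, tn, fp, fn), round(by_formula, 3))
```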
ROC CURVES:

A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the
diagnostic ability of a binary classifier system as its discrimination threshold is varied.
It gives us the trade-off between the True Positive Rate (TPR) and the False Positive Rate
(FPR) at different classification thresholds.

The ROC curve is plotted with TPR against the FPR where TPR is on y-axis and FPR is on
the x-axis.
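
The points of a ROC curve and the area under it can be computed as sketched below. As before, the split, the seed, k = 3 and the choice of malignant as the positive class are assumptions. The score that gets thresholded is each model's predicted probability for the positive class.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X = data.data
y = (data.target == 0).astype(int)  # 1 = malignant (positive class)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3)),
    "NB": GaussianNB(),
}
curves = {}
for name, model in models.items():
    # Probability of the positive class is the score that gets thresholded.
    scores = model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    fpr, tpr, thresholds = roc_curve(y_test, scores)  # x-axis: FPR, y-axis: TPR
    curves[name] = (fpr, tpr, roc_auc_score(y_test, scores))
    print(name, "AUC =", round(curves[name][2], 3))
```

Plotting tpr against fpr for each model gives the ROC curves. An area under the curve close to 1 indicates that the classifier ranks malignant cases above benign ones at almost every threshold.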

ROC accuracy scores for both KNN and Naive Bayes

REFERENCES:

[1] M. Amrane, S. Oukid, I. Gagaoua and T. Ensari, "Breast cancer classification using
machine learning," 2018 Electric Electronics, Computer Science, Biomedical Engineerings'
Meeting (EBBT), Istanbul, 2018, pp. 1-4.

[2] S. K. Prabhakar, H. Rajaguru, "Performance Analysis of Breast Cancer Classification with
Softmax Discriminant Classifier and Linear Discriminant Analysis," in: Maglaveras N.,
Chouvarda I., de Carvalho P. (eds), Precision Medicine Powered by pHealth and Connected
Health, IFMBE Proceedings, vol. 66, Springer, Singapore, 2018.

[3] P. Bhuvaneswaria, B. Therese, "Detection of Cancer in Lung with K-NN Classification
Using Genetic Algorithm," Procedia Materials Science, vol. 10, pp. 433-440, 2015.

[4] https://towardsdatascience.com/building-a-simple-machine-learning-model-on-breast-cancer-data-eca4b3b99fa3

We also referred to the following book and websites, which helped us in the implementation.

1. Hands-On Machine Learning with Scikit-Learn and TensorFlow (book)
2. Kaggle
3. Geeks for Geeks
4. Stack Overflow
5. Towards Data Science
6. Medium.com
7. statisticsbyjim.com
8. Analyticstraining.com
9. Levelup.gitconnected.com
10. Github
