Project Report

Breast cancer is the most common health issue among women, with early detection being crucial for reducing mortality rates. This paper discusses the use of machine learning techniques, specifically comparing Logistic Regression, Random Forest, and Decision Trees, to classify breast cancer outcomes using mammogram data. The research aims to improve prediction accuracy and reduce error rates through various datasets and machine learning algorithms.

INTRODUCTION

Breast cancer has become the most frequent health issue among women, particularly women in middle age. Early detection of breast cancer improves the chances of a cure and reduces the death rate [1]. At present, mammograms are the most effective screening technique for observing breast cancer. In this paper, the detection of cancer cells is carried out using machine learning techniques.

Breast cancer is the second leading cause of cancer death in women, after lung cancer. About 246,660 new cases of invasive breast cancer were expected to be diagnosed in women in the US during 2016, with an estimated 40,450 deaths. Breast cancer is a type of cancer that starts in the breast. Cancer starts when cells begin to grow out of control. Breast cancer cells usually form a tumour that can often be seen on an x-ray or felt as a lump. Breast cancer can spread when the cancer cells get into the blood or lymph system and are carried to other parts of the body. Its causes include changes and mutations in DNA. There are many different types of breast cancer; common ones include ductal carcinoma in situ (DCIS) and invasive carcinoma, while others, such as phyllodes tumours and angiosarcoma, are less common. Side effects associated with breast cancer and its treatment include fatigue, headaches, pain and numbness (peripheral neuropathy), and bone loss and osteoporosis. Many algorithms exist for the classification and prediction of breast cancer outcomes. The present paper compares the performance of three classifiers, Logistic Regression, Random Forest and Decision Tree, which are among the most influential data mining algorithms.

Breast cancer can be detected early during a screening examination through mammography or with a portable cancer diagnostic tool. Cancerous breast tissues change with the progression of the disease, and these changes can be directly linked to cancer staging. The stage of breast cancer (I-IV) describes how far a patient's cancer has spread; statistical indicators such as tumour size, lymph node metastasis and distant metastasis are used to determine the stage. To prevent cancer from spreading, patients have to undergo breast cancer surgery, chemotherapy, radiotherapy and endocrine therapy. The goal of the research is to identify and classify malignant and benign patients and to investigate how to parametrize our classification techniques to achieve high accuracy. We look into several datasets and study how further machine learning algorithms can be used to characterize breast cancer, aiming to reduce error rates while maximizing accuracy. A 10-fold cross-validation test, a standard machine learning technique, is run in Jupyter to evaluate and analyse the data in terms of effectiveness and efficiency.

Machine learning is an application of artificial intelligence that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. The basic premise of machine learning is to build algorithms that can receive input data and use statistical analysis to predict an output, while updating those outputs as new data become available. The process of learning begins with observations or data, such as examples, direct experience or instruction, in order to look for patterns in the data and make better decisions in the future based on the examples that we provide. The primary aim is to allow computers to learn automatically, without human intervention or assistance, and to adjust their actions accordingly.

1.1 MOTIVATION
Breast cancer is the most common cancer affecting women worldwide. About 246,660 new cases of invasive breast cancer were expected to be diagnosed in women in the U.S. during 2016, with an estimated 40,450 deaths. The progress being made in breast cancer prediction motivated this work, and the Wisconsin Breast Cancer dataset from the UCI Machine Learning Repository was chosen because it provides a large sample of patients with multivariate attributes.

1.2 RELATED WORK


The causes of breast cancer include changes and mutations in DNA. Cancer starts when cells begin to grow out of control. Breast cancer cells usually form a tumour that can often be seen on an x-ray or felt as a lump. There are many different types of breast cancer; common ones include ductal carcinoma in situ (DCIS) and invasive carcinoma, while others, such as phyllodes tumours and angiosarcoma, are less common. We have used classification methods such as Random Forest, Decision Tree and Logistic Regression. Prediction and prognosis of cancer development focus on three major domains: risk assessment or prediction of cancer susceptibility, prediction of cancer relapse, and prediction of cancer survival rate. The first domain comprises prediction of the probability of developing a certain cancer prior to patient diagnosis. The second concerns prediction of cancer recurrence in terms of diagnostics and treatment, and the third aims to predict several possible parameters characterizing cancer development and treatment after diagnosis of the disease: survival time, life expectancy, progression, drug sensitivity, etc. The survivability rate and cancer relapse depend strongly on the medical treatment and on the quality of the diagnosis. Data pre-processing is a data mining technique used to bring data into a usable format; real-world datasets come in many different formats and are rarely available exactly as required, so they must be transformed into an understandable form. Data pre-processing is a proven method of resolving such issues and converts the dataset into a usable format; for pre-processing we have used the standardization method.

The following is a summary of existing work in this domain:

Each entry lists the title of the paper, date/venue of publication, dataset used, problem domain, software used, techniques used, and the reported accuracy.

1. Skin lesion classification from dermoscopic images using deep learning techniques (2017). Dataset: ISIC Archive. Problem domain: classification of skin lesions as malignant or benign. Software: Python. Techniques: KNN, ANN and SVMs. Accuracy: 81.33%.

2. Breast cancer detection and prediction using machine learning (19th June 2020, ResearchGate.net). Dataset: UCI-ML Repository breast cancer dataset (Xcyt). Problem domain: breast cancer detection. Software: Python, Java (8). Accuracy: 96.50% (Random Forest).

3. Lung cancer detection and classification using machine learning algorithm (2021). Dataset: TCIA and LIDC. Problem domain: lung cancer detection and classification. Techniques: U-Net and Random Forest, with an SVM classifier and a Convolutional Network for classification. Accuracy: 99%.

4. A novel approach to perform analysis and prediction on breast cancer dataset using R (2018, ResearchGate). Problem domain: analysing breast cancer data for efficient prediction. Software: R. Techniques: Decision Tree, Random Forest, SVMs.

5. A deep learning model based on concatenation approach for the diagnosis of brain tumor (5th March 2020, IEEE Access). Dataset: brain tumour dataset. Problem domain: multi-level feature extraction and concatenation for early diagnosis of brain tumour. Software: Python. Techniques: Inception-V3 and DensNet201. Accuracy: 99.34% (Inception-V3), 99.51% (DensNet201).

6. Potential breast cancer drug prediction using machine learning model (27th April 2020, IEEE Xplore). Dataset: KEGG (Kyoto Encyclopedia of Genes and Genomes) and ChEMBL. Problem domain: using a machine learning model to classify a drug as a potent breast cancer drug. Software: Python. Techniques: SVMs and Random Forest. Accuracy: 97.90% (SVMs), 99.20% (Random Forest).

7. Automated breast mass classification system using deep learning in digital mammogram (5th April 2021, IEEE Xplore). Dataset: MIAS (Mammographic Image Analysis Society) and DDSM (Digital Database for Screening Mammography). Problem domain: combination of various techniques to classify breast masses into benign, malignant and normal. Software: Python. Techniques: Random Forest and deep learning. Accuracy: 97.50% (MIAS), 96% (DDSM).

8. Deep learning to improve breast cancer detection on screening mammography (29th August 2019). Dataset: CBIS-DDSM and INbreast. Problem domain: improving cancer detection with a deep learning model. Software: Python. Techniques: CNN-based deep learning. Accuracy: 97%.

9. Attention-enriched deep learning model for breast tumor segmentation in ultrasound images (19th June 2020, Elsevier). Dataset: breast ultrasound images. Problem domain: integrating visual saliency into breast tumour segmentation. Software: Python. Techniques: new deep learning model using an attention block and U-Net. Accuracy: 90.50%.

10. COVID-19 detection through transfer learning using multimodal imaging data (14th July 2020, IEEE Access). Dataset: COVID-19 X-ray and CT scans. Problem domain: COVID-19 detection through transfer learning. Techniques: VGG-19 and DenseNet. Accuracy: 98.40%.

11. Feature selection from colon cancer dataset for cancer classification using Artificial Neural Network (ANN) (2018). Dataset: colon cancer dataset. Problem domain: classification using ANN and SVMs. Software: MATLAB. Techniques: ANN and SVMs. Accuracy: 98.40%.

12. An automated detection of breast cancer diagnosis and prognosis based on machine learning using an ensemble of classifiers (14th April 2022, IEEE Xplore). Dataset: DDSM/323. Problem domain: benign vs malignant classification. Software: CAD (computer-aided diagnostic). Techniques: ANN, SVM and KNN. Accuracy: 98.83%.

13. A sustainable IoTH based computationally intelligent healthcare monitoring system for lung cancer risk detection (9th June 2021, Elsevier). Dataset: lung cancer dataset. Problem domain: addressing the rise of lung cancer diseases. Software: Python. Techniques: Greedy Best First Search (GBFS). Accuracy: 98.80% (Random Forest).

14. Deep-chest: multiclassification deep learning model for diagnosing COVID-19, pneumonia and lung cancer chest diseases (2022, Elsevier). Dataset: COVID-CT, chest X-ray dataset and CT (computed tomography) images. Problem domain: deep learning (AI) model for diagnosing chest diseases. Techniques: VGG19+(CNN) and ResNet152V2. Accuracy: 98.05% and 95.31%.

15. Automatic detection and classification of skin cancer using deep transfer learning (30th May 2022, Sensors). Dataset: HAM10000 (10015 dermoscopic images of skin lesions). Problem domain: raw deep transfer learning for classifying skin lesions. Software: MATLAB R2021a. Techniques: deep transfer learning of a CNN. Accuracy: 82.90%.

16. Prediction and factors for survival of breast cancer patients (19(1), 1-17, 2019). Dataset: hospital-based dataset, n = 8066, with diagnosis information between 1993 and 2016. Problem domain: building models for detecting and visualising significant prognostic indicators of survival rate. Software: Python. Techniques: Decision Tree and Random Forest. Accuracy: 79.8% (Decision Tree), 82.7% (Random Forest).

17. Deep learning model on concatenation approach for the diagnosis of brain tumour (5th March 2020, IEEE Access). Dataset: brain dataset comprising 3064 T1-weighted contrast images of 233 patients. Problem domain: feature concatenation using pre-trained models, compared with current research methods for brain tumour classification. Software: Python. Techniques: Inception-v3 and DensNet201. Accuracy: 99.34% (Inception-v3), 99.51% (DensNet201).

18. Lung cancer prediction from text datasets using machine learning (2020, IEEE Access). Dataset: 32 instances, 57 characteristics and one class attribute in its entirety. Problem domain: comparison with existing SVM and SMOTE methods. Software: Python. Techniques: Support Vector Machines (SVM). Accuracy: 98.8% (SVM).

19. Detection of skin cancer based on skin lesion images using deep learning (30th May 2022, Sensors). Dataset: 3533 skin lesions (benign, malignant and melanocytic tumours) from the ISIC2018 dataset. Problem domain: deep learning (CNN) used to detect malignant and benign lesions. Software: Python. Techniques: CNN, ResNet50, Inception V3. Accuracy: 83.2% (CNN), 83.7% (ResNet50), 85.8% (Inception V3).

20. Comparison of nomogram with machine learning techniques for prediction of overall survival in patients with tongue cancer (31st May 2013, Springer Link). Dataset: 7596 tongue cancer patients. Problem domain: algorithms evaluated in terms of ROC curve and accuracy and compared with a nomogram to predict patient survival. Software: Python. Techniques: Decision Tree and nomogram. Accuracy: 88.7% (Decision Tree), 60.4% (nomogram).

21. Analysis of breast cancer detection using different machine learning classifiers (27th April 2020, IEEE Xplore). Dataset: WBC dataset (699 instances and 11 attributes) and breast cancer dataset (286 instances and 10 attributes). Problem domain: proposing a suitable method that can manage the imbalanced dataset and missing values to enhance classifier performance. Software: WEKA 3.8.3. Techniques: J48, NB, SMO. Accuracy: 98.2% (J48), 99.56% (SMO), 99.24% (NB).

22. Lung cancer prediction using machine learning and advanced imaging techniques (2021, IEEE Access). Dataset: 3593 LUNGRADS. Problem domain: reducing the variability in assessing and reporting lung cancer risk between interpreting physicians. Software: CAD software. Techniques: SVM. Accuracy: 98.56%.

23. Breast cancer prognosis using a machine learning approach (2019). Dataset: n = 318 (training set). Problem domain: predicting outcomes in individual cancer patients. Software: Python. Techniques: kernel-based learning. Accuracy: 96.30%.

24. Breast cancer prediction using machine learning (2020, www.researchgate.net). Dataset: electronic health records. Problem domain: 5-year survivability prediction. Software: WEKA. Techniques: Logistic Regression. Accuracy: 92.30%.

25. Breast cancer prediction using machine learning approach (2020, www.researchgate.net). Dataset: Wisconsin breast cancer. Problem domain: patient features sorted out from the data materials are statistically tested. Software: WEKA. Techniques: KNN and SVM. Accuracy: 96.85% (KNN) and 96.85% (SVM).

26. Breast cancer prediction using machine learning approach (2020, www.researchgate.net). Dataset: UC Irvine machine learning repository. Problem domain: Decision Tree is the best predictor on the holdout sample. Software: WEKA. Techniques: Naïve Bayes, J48 decision tree and bagging algorithm. Accuracy: 96.50%.

27. Breast cancer prediction using machine learning approach (2020, www.researchgate.net). Dataset: Cancer Society. Problem domain: requires fewer input parameters and performs well on low-noise datasets. Software: WEKA. Techniques: AdaBoost. Accuracy: 97.50%.

28. Breast cancer prediction using machine learning approach (2020, www.researchgate.net). Dataset: Cancer Society. Problem domain: achieving higher accuracy. Software: MATLAB. Techniques: Logistic Regression and Neural Network. Accuracy: 96.30%.

29. Breast cancer prediction using machine learning approach (2020, www.researchgate.net). Dataset: BCI dataset. Problem domain: reducing the variability in assessing and reporting cancer risk between interpreting physicians. Software: CAD system. Techniques: Logistic Regression and back propagation. Accuracy: 94.20%.

30. Breast cancer prediction using machine learning approach (2020, www.researchgate.net). Dataset: gene expression dataset collection. Problem domain: building models for detecting and visualising significant prognostic indicators of survival rate. Software: WEKA. Techniques: C4.5 Bagging and AdaBoost decision trees. Accuracy: 93.29% (C4.5 Bagging), 92.62% (AdaBoost).
2. PROPOSED METHODOLOGY

Fig. (1) Phases of Machine Learning: data pre-processing, data preparation, feature selection, feature projection, feature scaling, model selection and prediction. The seven phases are elaborated below:
Phase 1 - Pre-Processing Data

The first step is to collect the data of interest for pre-processing, so that classification and regression methods can be applied to it. Data pre-processing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, lacking in certain behaviours or trends, and likely to contain many errors; data pre-processing is a proven method of resolving such issues and prepares the raw data for further processing. For pre-processing we have used the standardization method on the UCI dataset. This step is very important because the quality and quantity of the data you gather directly determine how good your predictive model can be. In this case we collect the breast cancer samples, which are benign and malignant; this will be our training data.

Phase 2 - DATA PREPARATION

In data preparation we load our data into a suitable place and prepare it for use in machine learning training: we first put all our data together and then randomize the ordering.
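A minimal sketch of this step, assuming the data frame df loaded as in the previous phase, might be:

# put all samples together and shuffle the row order before splitting
df = df.sample(frac=1, random_state=0).reset_index(drop=True)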

Phase 3 - FEATURE SELECTION

In machine learning and statistics, feature selection, also known as variable selection or attribute selection, is the process of selecting a subset of relevant features for use in model construction.

Data file and feature selection: the Breast Cancer Wisconsin (Diagnostic) data set from the Kaggle repository is used, and out of 31 parameters we have selected about 8-9. Our target parameter is the breast cancer diagnosis - malignant or benign. We have used the Wrapper Method for feature selection. The important features found by the study are: Concave points worst, Area worst, Area se, Texture worst, Texture mean, Smoothness worst, Smoothness mean, Radius mean, Symmetry mean.

Attribute Information: 1) ID number, 2) Diagnosis (M = malignant, B = benign), 3-32) thirty real-valued features computed for each cell nucleus (the mean, standard error and worst value of measurements such as radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry and fractal dimension).
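The exact wrapper search used in this project is not listed; as one possible sketch, scikit-learn's SequentialFeatureSelector wraps a classifier and greedily adds features until the requested number is reached (here using scikit-learn's built-in copy of the Wisconsin diagnostic data as a stand-in for the Kaggle CSV):

# sketch of a wrapper-style feature selection step (illustrative setup only)
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=5000),   # the model the wrapper repeatedly refits
    n_features_to_select=9,              # roughly the 8-9 parameters kept in this study
    direction="forward",
    cv=10,
)
selector.fit(X, y)
print(selector.get_support())            # boolean mask of the selected columns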

Phase 4 - Feature Projection

Feature projection is the transformation of data from a high-dimensional space to a lower-dimensional space (with fewer attributes). Both linear and nonlinear reduction techniques can be used, depending on the type of relationships among the features in the dataset.
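A common linear example is principal component analysis (PCA); the sketch below is illustrative and not part of the project code:

# sketch: linear feature projection with PCA
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, _ = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)                # keep only two projected attributes
X_2d = pca.fit_transform(X_scaled)
print(pca.explained_variance_ratio_)     # share of variance kept by each direction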

Phase 5 - Feature Scaling

Most of the time, a dataset will contain features that vary greatly in magnitude, units and range. Since most machine learning algorithms use the Euclidean distance between two data points in their computations, we need to bring all features to the same level of magnitude, which can be achieved by scaling.
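For example, standardization rescales a single feature to zero mean and unit variance; the values below are illustrative only, and the project itself applies scikit-learn's StandardScaler in the code listing of Section 3:

# sketch: z = (x - mean(x)) / std(x)
import numpy as np

radius_mean = np.array([18.0, 20.6, 19.7, 11.4, 20.3])       # illustrative feature values
z = (radius_mean - radius_mean.mean()) / radius_mean.std()
print(z)                                                     # now comparable in magnitude to other features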

Phase 6 - Model Selection

Supervised learning is the method in which the machine is trained on data for which the inputs and outputs are well labelled. The model can learn from the training data and then process future data to predict the outcome. Supervised techniques are grouped into regression and classification. A regression problem is one where the result is a real or continuous value, such as "salary" or "weight". A classification problem is one where the result is a category, such as filtering emails into "spam" or "not spam". Unsupervised learning, by contrast, gives the machine information that is neither classified nor labelled and allows the algorithm to analyse it without any directions; the machine is trained on unlabelled data and must work without explicit instructions. In our dataset the outcome (dependent) variable Y takes only two values, either M (Malignant) or B (Benign), so a classification algorithm from supervised learning is applied. We have chosen three different classification algorithms in machine learning; the simplest starting point is a small linear model.

2.1 METHODS USED:
1) LOGISTIC REGRESSION
Logistic regression was introduced by the statistician David Cox in 1958 and so predates the field of machine learning. It is a supervised machine learning technique employed in classification tasks (for predictions based on training data). Logistic regression uses an equation similar to linear regression, but its outcome is a categorical variable, whereas other regression models produce a continuous value. Binary outcomes can be predicted from the independent variables.

The general workflow, illustrated by the sketch after this list, is:


1) Get a dataset
2) Train a classifier
3) Make a prediction using such classifier
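The sketch below walks through these three steps with scikit-learn; it uses scikit-learn's built-in copy of the Wisconsin diagnostic data as an assumption, whereas the project itself loads data.csv as shown in Section 3:

# sketch of the three-step workflow with logistic regression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)                                    # 1) get a dataset
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

scaler = StandardScaler().fit(X_tr)
clf = LogisticRegression(max_iter=5000).fit(scaler.transform(X_tr), y_tr)     # 2) train a classifier

print(clf.predict(scaler.transform(X_te[:5])))                                # 3) predict with the classifier
print("test accuracy:", clf.score(scaler.transform(X_te), y_te))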

2) RANDOM FOREST:
Random forest, as its name implies, consists of many individual decision trees that operate as an ensemble. Each individual tree in the random forest produces a class prediction, and the class with the most votes becomes the model's prediction.
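The voting idea can be seen directly in scikit-learn, since a fitted RandomForestClassifier exposes its individual trees; the sketch below is illustrative and again uses the built-in copy of the data:

# sketch: a random forest is an ensemble of decision trees whose votes are aggregated
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

sample = X[:1]
votes = [tree.predict(sample)[0] for tree in forest.estimators_]   # one vote per tree
print("individual tree votes:", votes)
print("forest prediction    :", forest.predict(sample)[0])         # aggregated over the ensemble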

3) DECISION TREE:
Decision Tree is a Supervised learning technique that can be used
for both classification and Regression problems, but mostly it is preferred for
solving Classification problems. It is a tree-structured classifier,
where internal nodes represent the features of a dataset, branches represent the
decision rules and each leaf node represents the outcome.

In a Decision tree, there are two nodes, which are the Decision
Node and Leaf Node. Decision nodes are used to make any decision and have
multiple branches, whereas Leaf nodes are the output of those decisions and
do not contain any further branches.

The decisions or tests are performed on the basis of the features of the given dataset.
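The structure of decision and leaf nodes can be inspected with scikit-learn's export_text; the shallow tree below is only a sketch on the built-in copy of the data, not the model trained later in Section 3:

# sketch: print the decision rules of a depth-2 tree (internal nodes test a feature,
# leaves give the predicted class)
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=2, criterion="entropy", random_state=0)
tree.fit(data.data, data.target)
print(export_text(tree, feature_names=list(data.feature_names)))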

3. PROGRAMMING USED:

THE CODE:

# importing libraries

import numpy

import matplotlib.pyplot as plt

import pandas as pd

import seaborn as sns

# reading data from the file

df=pd.read_csv("data.csv")

df.info()

# return all the columns with null values count

df.isna().sum()

# return the size of dataset

df.shape

# remove the column

df=df.dropna(axis=1)

# shape of dataset after removing the null column

df.shape

# describe the dataset

df.describe()

# Get the count of malignant<M> and Benign<B> cells

df['diagnosis'].value_counts()

sns.countplot(x=df['diagnosis'], label="count")

# label encoding(convert the value of M and B into 1 and 0)

from sklearn.preprocessing import LabelEncoder

labelencoder_Y = LabelEncoder()

df.iloc[:,1]=labelencoder_Y.fit_transform(df.iloc[:,1].values)

df.head()

sns.pairplot(df.iloc[:,1:5],hue="diagnosis")

# get the correlation

df.iloc[:,1:32].corr()

# visualize the correlation

plt.figure(figsize=(10,10))

sns.heatmap(df.iloc[:,1:10].corr(),annot=True,fmt=".0%")

# split the dataset into dependent(X) and Independent(Y) datasets

X=df.iloc[:,2:31].values

Y=df.iloc[:,1].values

# splitting the data into training and test datasets

from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.20, random_state=0)

# feature scaling

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()

X_train = sc.fit_transform(X_train)   # fit the scaler on the training data only

X_test = sc.transform(X_test)         # reuse the training statistics for the test data

# models/ Algorithms

def models(X_train, Y_train):

    # logistic regression
    from sklearn.linear_model import LogisticRegression
    log = LogisticRegression(random_state=0)
    log.fit(X_train, Y_train)

    # Decision Tree
    from sklearn.tree import DecisionTreeClassifier
    tree = DecisionTreeClassifier(random_state=0, criterion="entropy")
    tree.fit(X_train, Y_train)

    # Random Forest
    from sklearn.ensemble import RandomForestClassifier
    forest = RandomForestClassifier(random_state=0, criterion="entropy", n_estimators=10)
    forest.fit(X_train, Y_train)

    print('[0]logistic regression accuracy:', log.score(X_train, Y_train))
    print('[1]Decision tree accuracy:', tree.score(X_train, Y_train))
    print('[2]Random forest accuracy:', forest.score(X_train, Y_train))

    return log, tree, forest

model=models(X_train,Y_train)

# testing the models/result

from sklearn.metrics import accuracy_score

from sklearn.metrics import classification_report

for i in range(len(model)):

    print("Model", i)

    print(classification_report(Y_test, model[i].predict(X_test)))

    print('Accuracy :', accuracy_score(Y_test, model[i].predict(X_test)))

# prediction of random-forest

pred=model[2].predict(X_test)

print('Predicted values:')

print(pred)

print('Actual values:')

print(Y_test)

from joblib import dump

dump(model[2],"Cancer_prediction.joblib")
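Once Cancer_prediction.joblib has been written by the listing above, it could be reloaded elsewhere roughly as follows (a usage sketch, not part of the original listing; X_test is assumed to be prepared as above):

# reload the persisted random forest and score new, already-scaled samples
from joblib import load

loaded_forest = load("Cancer_prediction.joblib")
print(loaded_forest.predict(X_test[:5]))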

RESULT AND DISCUSSION OF PROPOSED
METHODOLOGY

The work was implemented on an i5 processor at 2.30 GHz with 8 GB of RAM, and all experiments on the classifiers described in this paper were conducted using machine learning libraries. In the experimental studies we partitioned the data 70-30% for training and testing. The experiments were run in Jupyter notebooks using scikit-learn, which provides a collection of machine learning algorithms for data pre-processing, classification, regression, clustering and association rules; these techniques are applied to a variety of real-world problems. The results of the data analysis are reported. To apply our classifiers and evaluate them, we use the 10-fold cross-validation test, a technique for evaluating predictive models that splits the original set into a training sample to train the model and a test set to evaluate it. After applying the pre-processing and preparation methods, we analyse the data visually and examine the distribution of values in terms of effectiveness and efficiency.

We evaluate all classifiers in terms of the time to build the model, correctly classified instances, incorrectly classified instances and accuracy.

Table No. 1

Algorithms             Accuracy               Recall    F1 Score
Logistic Regression    0.9649122807017544     0.96      0.96
Decision Tree          0.9385964912280702     0.94      0.94
Random Forest          0.9736842105263158     0.97      0.97

Fig. 2: Comparison graphs between pairs of features, where one colour represents Malignant and blue represents Benign.

4.1 CONCLUSION

Breast cancer is one of the diseases responsible for the highest number of deaths every year. At present, only a few accurate prognostic and predictive factors are used clinically for managing patients with breast cancer. Here, by making use of algorithms with a level-set approach, high accuracy can be achieved in detecting affected cell shapes, with exact marking of the detected contours. The proposed system helps to enhance the performance of mammogram retrieval by selecting optimal features.
After creating the prediction model, we can analyse the results obtained when evaluating the efficiency of our algorithms: Random Forest achieved the highest accuracy at 97.36%, compared with 96.49% for Logistic Regression and 93.86% for Decision Tree.

4.2 FUTURE WORK

The analysis of the results signifies that the integration of multidimensional data with different classification, feature selection and dimensionality reduction techniques can provide promising tools for inference in this domain. Further research in this field should be carried out to improve the performance of the classification techniques so that they can predict over more variables. We intend to study how to parametrize our classification techniques to achieve high accuracy, and to look into more datasets and further machine learning algorithms that can be used to characterize breast cancer. We want to reduce the error rates while maximizing accuracy.

REFERENCES

[1] Wang, D. Zhang and Y. H. Huang “Breast Cancer Prediction Using Machine Learning” (2018), Vol. 66,
NO. 7.
[2] B. Akbugday, "Classification of Breast Cancer Data Using Machine Learning Algorithms," 2019 Medical
Technologies Congress (TIPTEKNO), Izmir, Turkey, 2019, pp. 1-4.

[3] Keles, M. Kaya, "Breast Cancer Prediction and Detection Using Data Mining Classification Algorithms:
A Comparative Study." Tehnicki Vjesnik - Technical Gazette, vol. 26, no. 1, 2019, p. 149+.
[4] V. Chaurasia and S. Pal, “Data Mining Techniques: To Predict and Resolve Breast Cancer Survivability”,
IJCSMC, Vol. 3, Issue. 1, January 2014, pg.10 – 22.
[5] Delen, D.; Walker, G.; Kadam, A. Predicting breast cancer survivability: A comparison of three data
mining methods. Artif. Intell. Med. 2005, 34, 113–127.
[6] R. K. Kavitha, D. D. Rangasamy, “Breast Cancer Survivability Using Adaptive Voting Ensemble
Machine Learning Algorithm Adaboost and CART Algorithm” Volume 3, Special Issue 1, February 2014
[7] P. Sinthia, R. Devi, S. Gayathri and R. Sivasankari, “Breast Cancer detection using PCPCET and
ADEWNN”, CIEEE’ 17, p.63-65
[8] Vikas Chaurasia and S.Pal, “Using Machine Learning Algorithms for Breast Cancer Risk Prediction and
Diagnosis”, FAMS 2016, 83 (2016), pp. 1064–1069.
[9] N. Khuriwal, N. Mishra. “A Review on Breast Cancer Diagnosis in Mammography Images Using Deep
Learning Techniques”, (2018), Vol. 1, No. 1.
[10] Y. Khourdifi and M. Bahaj, "Feature Selection with Fast Correlation-Based Filter for Breast Cancer
Prediction and Classification Using Machine Learning Algorithms," 2018 International Symposium on
Advanced Electrical and Communication Technologies (ISAECT), Rabat, Morocco, 2018, pp. 1-6.
[11] R. M. Mohana, R. Delshi Howsalya Devi, Anita Bai, “Lung Cancer Detection using Nearest Neighbour
Classifier”, International Journal of Recent Technology and Engineering (IJRTE), Volume-8, Issue-2S11,
September 2019
[12] Ch. Shravya, K. Pravalika, Shaik Subhani, “Prediction of Breast Cancer Using Supervised Machine
Learning Techniques”, International Journal of Innovative Technology and Exploring Engineering (IJITEE),
Volume-8 Issue-6, April 2019.
[13] Haifeng Wang and Sang Won Yoon, “Breast Cancer Prediction Using Data Mining Method”, Proceedings
of the 2015 Industrial and Systems Engineering Research Conference,
[14] Abdelghani Bellaachia, Erhan Guven, “Predicting Breast Cancer Survivability Using Data Mining
Techniques”

[15] Juhyeon Kim, Hyunjung Shin, Breast cancer survivability prediction using labeled,
unlabeled, and pseudo-labeled patient data, Journal of the American Medical Informatics
Association, Volume 20, Issue 4, July 2013, Pages 613–618.
[16] N. Khuriwal and N. Mishra, "Breast cancer diagnosis using adaptive voting ensemble
machine learning algorithm," 2018 IEEMA Engineer Infinite Conference (eTechNxT),
New Delhi, 2018, pp. 1-5.
[17] M. Amrane, S. Oukid, I. Gagaoua and T. Ensarİ, "Breast cancer classification using
machine learning," 2018 Electric Electronics, Computer Science, Biomedical Engineerings'
Meeting (EBBT), Istanbul, 2018, pp. 1-4.
[18] M. R. Al-Hadidi, A. Alarabeyyat and M. Alhanahnah, "Breast Cancer Detection
Using K-Nearest Neighbor Machine Learning Algorithm," 2016 9th International
Conference on Developments in eSystems Engineering (DeSE), Liverpool, 2016, pp. 35-
39.
[19] Kibeom Jang, Minsoon Kim, Candace A. Gilbert, Fiona Simpkins, Tan A. Ince, Joyce M. Slingerland,
“VEGFA activates an epigenetic pathway regulating ovarian cancer initiating cells”, EMBO Molecular
Medicine, Volume 9, Issue 3 (2017).

[20] Joseph A. Cruz and David S. Wishart, “Applications of Machine Learning in Cancer Prediction and
Prognosis”, Cancer Informatics, 2(3):59-77, February 2007.
[21] S. A. Medjahed, T. A. Saadi, A. Benyettou, “Breast cancer diagnosis by using k-nearest neighbor with
different distances and classification rules”, International Journal of Computer Applications, 62 (1).
