NaiveBayesKfold Report

This document discusses developing a Naive Bayes classification model using k-fold cross validation. Three datasets were used to test the model and obtain accuracy scores for each fold and the highest overall accuracy. The Breast Cancer dataset achieved the highest accuracy of 85.7%.

Naïve Bayes Algorithm Classifier using k-fold cross validation

This project involved developing a supervised machine learning classification model, namely
Naïve Bayes, which uses the feature values of an observation (data point) to predict its
class. Naïve Bayes performs classification by computing the probability of each class label
given the observation's feature values and choosing the most probable class.
By Bayes' theorem, the probability of a class yi is calculated as:

p(yi | x1, x2, … , xn) = p(yi) p(x1, x2, … , xn | yi) / p(x1, x2, … , xn)

Naïve Bayes assumes that all features are independent of one another given the class label, so
the joint likelihood p(x1, x2, … , xn | yi) factors into a product of per-feature probabilities
p(xj | yi). A probability distribution can therefore be estimated for each feature in the
dataset, and the distributions of the features for each class are stored independently: with
2 classes and 10 features there are 20 probability distributions. Each distribution is
modelled as a Gaussian, and the per-feature probabilities for a class are then multiplied
together (equivalently, their logarithms are summed). The disadvantage of this model is the
assumption that features are independent, which is often not the case in real-world data.
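The per-class, per-feature Gaussian idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this report: the function names (gaussian_pdf, fit, predict), the small floors that guard against division by zero and log(0), and the toy data are all assumptions made for the example.

```python
import math
from collections import defaultdict

def gaussian_pdf(x, mean, std):
    """Gaussian probability density of x for the given mean and standard deviation."""
    exponent = math.exp(-((x - mean) ** 2) / (2 * std ** 2))
    return exponent / (math.sqrt(2 * math.pi) * std)

def fit(rows, labels):
    """Estimate the class prior and one (mean, std) pair per feature per class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # Sample variance; the floor on std is an assumption to avoid dividing by zero.
            var = sum((v - mean) ** 2 for v in column) / max(len(column) - 1, 1)
            stats.append((mean, max(math.sqrt(var), 1e-9)))
        model[label] = (len(members) / len(rows), stats)
    return model

def predict(model, row):
    """Return the class with the largest log prior plus summed log likelihoods."""
    scores = {}
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, std) in zip(row, stats):
            # Floor the density so log() never receives zero (an assumption).
            score += math.log(max(gaussian_pdf(x, mean, std), 1e-300))
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up toy data: 2 classes and 2 features, so 4 distributions are stored.
rows = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0], [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]]
labels = [0, 0, 0, 1, 1, 1]
model = fit(rows, labels)
n_distributions = sum(len(stats) for _, stats in model.values())
```

Summing logarithms instead of multiplying raw densities gives the same ranking of classes while avoiding numerical underflow when there are many features.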
The model was evaluated using k-fold cross validation. In this procedure the dataset is split
into K folds (groups) of roughly equal size; each fold is used once as the testing set while the
remaining folds form the training set. This estimates how well the model performs on data it
has not seen. K refers to the number of groups the data sample is split into.
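A minimal sketch of the splitting procedure described above follows. The helper names (k_fold_indices, cross_validate, majority_class_score), the fixed seed, the shuffling step, and the toy data are assumptions for illustration; the report does not say whether its rows were shuffled. A majority-class baseline stands in for the classifier so that the example is self-contained.

```python
import random

def k_fold_indices(n_rows, k, seed=1):
    """Shuffle the row indices and deal them into k folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(rows, labels, k, train_and_score, seed=1):
    """Return {fold number: accuracy in percent}, holding out each fold once."""
    folds = k_fold_indices(len(rows), k, seed)
    accuracies = {}
    for fold_number, test_idx in enumerate(folds):
        held_out = set(test_idx)
        train_idx = [j for j in range(len(rows)) if j not in held_out]
        accuracies[fold_number] = train_and_score(
            [rows[j] for j in train_idx], [labels[j] for j in train_idx],
            [rows[j] for j in test_idx], [labels[j] for j in test_idx],
        )
    return accuracies

def majority_class_score(train_rows, train_labels, test_rows, test_labels):
    """Hypothetical stand-in model: always predict the most common training label."""
    majority = max(set(train_labels), key=train_labels.count)
    correct = sum(1 for label in test_labels if label == majority)
    return 100.0 * correct / len(test_labels)

# Made-up toy data: 20 rows with one feature each, 14 of class 0 and 6 of class 1.
toy_rows = [[i] for i in range(20)]
toy_labels = [0] * 14 + [1] * 6
toy_folds = k_fold_indices(len(toy_rows), 5)
toy_accuracies = cross_validate(toy_rows, toy_labels, 5, majority_class_score)
```

Any function with the same four arguments, for example one that fits a Naïve Bayes model on the training rows and scores it on the test rows, could be passed in place of majority_class_score.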
Combining these two techniques, the model was tested on three datasets. The accuracy obtained
on each fold, together with the highest fold accuracy, is shown below for each dataset.
For Hayes-Roth dataset
Accuracy for each fold is:
{0: 46.15384615384615, 1: 7.6923076923076925, 2: 23.076923076923077, 3:
30.76923076923077, 4: 38.46153846153847, 5: 38.46153846153847, 6:
46.15384615384615, 7: 53.84615384615385, 8: 7.6923076923076925, 9:
53.84615384615385} %

Highest Accuracy: 53.84615384615385

For Breast Cancer dataset


Accuracy for each fold is:
{0: 82.14285714285714, 1: 17.857142857142858, 2: 85.71428571428571, 3:
14.285714285714285, 4: 75.0, 5: 71.42857142857143, 6: 82.14285714285714,
7: 39.285714285714285, 8: 25.0, 9: 71.42857142857143} %

Highest Accuracy: 85.71428571428571

For Car Evaluation dataset


Accuracy for each fold is:
{0: 58.139534883720934, 1: 65.69767441860465, 2: 61.04651162790697, 3:
68.6046511627907, 4: 67.44186046511628, 5: 71.51162790697676, 6:
67.44186046511628, 7: 55.81395348837209, 8: 63.95348837209303, 9:
71.51162790697676} %

Highest Accuracy: 71.51162790697676

Of the three datasets, Breast Cancer had the highest single-fold accuracy (85.7%), compared
with Car Evaluation (71.5%) and Hayes-Roth (53.8%).
The model was trained on the features of each dataset, and the target values (labels) were
taken from the class column of each dataset.
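The per-fold accuracies reported above can be summarised with a short script. The sketch below copies the reported values and computes the highest fold accuracy, plus the mean and standard deviation across folds, which are commonly reported alongside the best fold. The variable and function names are assumptions for illustration.

```python
from statistics import mean, stdev

# Per-fold accuracies (percent) copied from the results reported above.
fold_accuracies = {
    "Hayes-Roth": {0: 46.15384615384615, 1: 7.6923076923076925, 2: 23.076923076923077,
                   3: 30.76923076923077, 4: 38.46153846153847, 5: 38.46153846153847,
                   6: 46.15384615384615, 7: 53.84615384615385, 8: 7.6923076923076925,
                   9: 53.84615384615385},
    "Breast Cancer": {0: 82.14285714285714, 1: 17.857142857142858, 2: 85.71428571428571,
                      3: 14.285714285714285, 4: 75.0, 5: 71.42857142857143,
                      6: 82.14285714285714, 7: 39.285714285714285, 8: 25.0,
                      9: 71.42857142857143},
    "Car Evaluation": {0: 58.139534883720934, 1: 65.69767441860465, 2: 61.04651162790697,
                       3: 68.6046511627907, 4: 67.44186046511628, 5: 71.51162790697676,
                       6: 67.44186046511628, 7: 55.81395348837209, 8: 63.95348837209303,
                       9: 71.51162790697676},
}

def summarise(accuracies):
    """Return the highest, mean, and sample standard deviation of the fold accuracies."""
    values = list(accuracies.values())
    return {"highest": max(values), "mean": mean(values), "stdev": stdev(values)}

summary = {name: summarise(acc) for name, acc in fold_accuracies.items()}
for name, stats in summary.items():
    print(f"{name}: highest={stats['highest']:.2f} "
          f"mean={stats['mean']:.2f} stdev={stats['stdev']:.2f}")
```

The highest fold accuracy reflects the most favourable split only, so the mean and spread give additional context when comparing datasets.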

References
1. https://machinelearningmastery.com/k-fold-cross-validation
2. https://machinelearningmastery.com/naive-bayes-classifier-scratch-python/
