NaiveBayesKfold Report
This project developed a supervised machine learning classification model, Naïve Bayes,
which uses the feature values of an observation (data point) to predict its class. The
model performs classification by estimating the probability of each class label given the
observation's feature values and choosing the most probable class.
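The probability of a class yi given the feature values x1, x2, … , xn is calculated with Bayes' theorem:
p(yi | x1, x2, … , xn) = p(yi) p(x1, x2, … , xn | yi) / p(x1, x2, … , xn)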
Naïve Bayes assumes that all features are conditionally independent given the class label, so
the likelihood p(x1, x2, … , xn | yi) factorizes into the product p(x1 | yi) p(x2 | yi) … p(xn | yi).
A probability distribution can therefore be estimated for each feature in the dataset, and the
distributions of the features for each class are stored independently: with 2 classes and 10
features, for example, 20 probability distributions are stored. Each distribution is modeled as
a Gaussian, and the per-feature probabilities for a class are combined by multiplication
(equivalently, by summing their logarithms). The disadvantage of this model is the assumption
that features are independent, which rarely holds in real-world data.
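The per-class, per-feature Gaussian bookkeeping described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the toy data, and the small floor added to the standard deviation are all assumptions made for the example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus one (mean, std) pair per feature for every class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    model = {}
    for label, members in by_class.items():
        stats = []
        for column in zip(*members):  # one column per feature
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            # Small floor (an assumption) avoids division by zero for constant features.
            stats.append((mean, math.sqrt(var) + 1e-9))
        model[label] = (len(members) / len(rows), stats)
    return model


def log_gaussian(x, mean, std):
    """Log of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def predict(model, row):
    """Return the class with the highest log prior plus summed log likelihoods."""
    scores = {}
    for label, (prior, stats) in model.items():
        scores[label] = math.log(prior) + sum(
            log_gaussian(x, mean, std) for x, (mean, std) in zip(row, stats)
        )
    return max(scores, key=scores.get)


# Toy data (made up for illustration): two features, two classes.
toy_rows = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0], [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]]
toy_labels = ["a", "a", "a", "b", "b", "b"]
toy_model = fit_gaussian_nb(toy_rows, toy_labels)
print(predict(toy_model, [1.1, 2.0]))
print(predict(toy_model, [3.1, 4.0]))
```

With 2 classes and 2 features the fitted model holds 4 (mean, std) pairs, mirroring the "classes times features" count in the text. Working in log space replaces the product of many small probabilities with a sum, which avoids numerical underflow.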
The model was evaluated with k-fold cross-validation, in which the dataset is split into K
folds (groups). Each fold is used once as the test set while the remaining folds form the
training set, which estimates how well the model performs on new data. K is the number of
groups the data sample is split into.
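One way to produce such folds is sketched below. The helper names and the fixed shuffle seed are assumptions for illustration, and this is not necessarily how the project's code splits the data (for example, some implementations discard leftover samples, whereas this one keeps them).

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With 10 samples and K = 5, each split has 8 training and 2 test indices.
for train, test in train_test_splits(n_samples=10, k=5):
    print(len(train), len(test))
```

Every sample appears in exactly one test fold, so each observation is used for testing once and for training K - 1 times.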
Combining the two, the model was run on three datasets, and the accuracy of each fold was
obtained for the chosen number of folds, as shown below.
For the Hayes-Roth dataset
Accuracy for each fold (%):
{0: 46.15384615384615, 1: 7.6923076923076925, 2: 23.076923076923077,
3: 30.76923076923077, 4: 38.46153846153847, 5: 38.46153846153847,
6: 46.15384615384615, 7: 53.84615384615385, 8: 7.6923076923076925,
9: 53.84615384615385}
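The ten per-fold values are easier to compare across datasets when summarized by their mean and range. The snippet below is a small sketch that only does arithmetic on the Hayes-Roth figures reported above; the variable names are illustrative.

```python
# Per-fold accuracies (%) as reported for the Hayes-Roth dataset.
fold_accuracy = {
    0: 46.15384615384615, 1: 7.6923076923076925, 2: 23.076923076923077,
    3: 30.76923076923077, 4: 38.46153846153847, 5: 38.46153846153847,
    6: 46.15384615384615, 7: 53.84615384615385, 8: 7.6923076923076925,
    9: 53.84615384615385,
}

values = list(fold_accuracy.values())
mean_accuracy = sum(values) / len(values)
print(f"Mean accuracy: {mean_accuracy:.2f}%")
print(f"Range: {min(values):.2f}% to {max(values):.2f}%")
```

The mean works out to about 34.62%, with individual folds ranging from 7.69% to 53.85%.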
Of the three datasets, Breast Cancer had the highest accuracy, compared with Car Evaluation
and Hayes-Roth.
The model was trained on the features of each dataset, and the target values (labels) were
taken from the class column of each dataset.
References
1. https://machinelearningmastery.com/k-fold-cross-validation
2. https://machinelearningmastery.com/naive-bayes-classifier-scratch-python/