Project Report

Ms project

Uploaded by

chaitanya.samanchi


1. Technical Approach

We built four models using the Random Forest, Decision Tree, Naïve Bayes, and Support Vector Machine
classifiers, all imported from the scikit-learn library. Each model was trained on the 14 components
retained from the Principal Component Analysis (PCA). Each model was fit on the training set and then
used to predict the target for the independent variables of the test set. The predictions were compared
against the actual test-set values to compute the confusion matrix and the accuracy score. The accuracy
score of each model is as follows:

Technique                   Accuracy Score
Random Forest Classifier    75.00%
Decision Tree Classifier    67.42%
Naïve Bayes Classifier      74.24%
Support Vector Machine      71.96%
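
The workflow described above can be sketched as follows. This is only an illustration under stated
assumptions: the report's dataset is not reproduced here, so a synthetic three-class dataset stands in
for it, GaussianNB is assumed as the Naïve Bayes variant, and the scaling step before PCA is an
assumption the report does not confirm. The accuracies it prints will not match the table above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the report's dataset (assumption): 3 classes, 20 features.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

classifiers = {
    "Random Forest Classifier": RandomForestClassifier(random_state=0),
    "Decision Tree Classifier": DecisionTreeClassifier(random_state=0),
    "Naive Bayes Classifier": GaussianNB(),  # Gaussian variant is an assumption
    "Support Vector Machine": SVC(),
}

results = {}
for name, clf in classifiers.items():
    # Scaling before PCA is an assumption; the report does not say whether it was done.
    model = make_pipeline(StandardScaler(), PCA(n_components=14), clf)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
    print(f"{name}: {results[name]['accuracy']:.2%}")
```

Wrapping the scaler, PCA, and classifier in one pipeline means the PCA projection is learned from the
training set only and then reused unchanged on the test set.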

The accuracy scores are moderate. The Random Forest Classifier has the highest accuracy, closely
followed by the Naïve Bayes Classifier and the Support Vector Machine. The Decision Tree Classifier has
the lowest accuracy of the four classification models tested. The four models were then rebuilt without
applying PCA, i.e. without eliminating any features from the given dataset (except for the Body Mass
Index, which was dropped earlier due to high correlation). The new models show a significant difference
in the accuracy score:

Technique                   Accuracy Score
Random Forest Classifier    84.84%
Decision Tree Classifier    81.06%
Naïve Bayes Classifier      78.78%
Support Vector Machine      74.24%
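
The earlier removal of the Body Mass Index for high correlation can be illustrated with a small check
like the one below. It is a sketch, not the report's actual procedure: the column names, the synthetic
data, the 0.9 threshold, and the helper `highly_correlated_pairs` are all hypothetical, and the report
does not say which feature the Body Mass Index was correlated with (Weight is assumed here).

```python
import numpy as np
import pandas as pd


def highly_correlated_pairs(df, threshold=0.9):
    """Return (col_a, col_b, |r|) for column pairs whose absolute Pearson r exceeds threshold."""
    corr = df.corr().abs()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only, so each pair appears once
            if corr.iloc[i, j] > threshold:
                pairs.append((cols[i], cols[j], float(corr.iloc[i, j])))
    return pairs


# Hypothetical data: BMI is derived from weight and height, so it tracks weight closely.
rng = np.random.default_rng(0)
weight = rng.normal(75, 12, size=200)
height = rng.normal(170, 3, size=200)
df = pd.DataFrame({
    "Weight": weight,
    "Height": height,
    "Body mass index": weight / (height / 100) ** 2,
    "Age": rng.integers(20, 60, size=200),
})

pairs = highly_correlated_pairs(df, threshold=0.9)
print(pairs)

# Drop one feature of the correlated pair before modelling.
reduced = df.drop(columns=["Body mass index"])
```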

Without Principal Component Analysis, the accuracy scores of the Random Forest Classifier and the
Decision Tree Classifier increase markedly (by 9.84 and 13.64 percentage points, respectively). This can
be attributed to the fact that, although PCA served its purpose of reducing the dimensionality of the
data, the models trained on the PCA-transformed data failed to capture the underlying pattern and
returned relatively lower accuracy scores. The Random Forest Classifier has the best accuracy score, so
its parameters were tuned further by applying Grid Search. The parameters tuned were
n_estimators: [6, 100, 30] and max_depth: [5, 7, 10]. Even after tuning, the best parameters found by
Grid Search, {'max_depth': 7, 'n_estimators': 30}, did not show any significant improvement in the
accuracy score. An interesting observation is that the SVM model classifies every observation as either
Group 1 or Group 2 and never predicts Group 0. This can be seen in the confusion matrix.
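
Returning to the tuning step, a grid search over the parameter values listed above might look like the
following sketch. The parameter grid is taken from the report; the synthetic dataset, the 5-fold
cross-validation, and the accuracy scoring are assumptions, so the best parameters it reports need not
match those quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the report's dataset (assumption).
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Parameter values as listed in the report.
param_grid = {"n_estimators": [6, 100, 30], "max_depth": [5, 7, 10]}

# cv=5 and scoring="accuracy" are assumptions; the report does not state them.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                      cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Cross-validated accuracy:", round(search.best_score_, 4))
print("Test accuracy:", round(search.score(X_test, y_test), 4))
```
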
2. Test and Evaluation

The test data for all the models was then replaced with the test dataset provided for evaluation, and
the confusion matrix and accuracy score were computed for each model:

Technique                   Accuracy Score
Random Forest Classifier    71.62%
Decision Tree Classifier    71.62%
Naïve Bayes Classifier      60.81%
Support Vector Machine      60.60%

The Random Forest Classifier and the Decision Tree Classifier tie for the highest accuracy score at
71.62%. The Naïve Bayes Classifier is third at 60.81%. The Support Vector Machine classifies every
observation as Group 1 and never predicts Group 0 or Group 2, which means that the SVM is not a
recommended model for the given problem statement. This can be verified from the confusion matrix.
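
A check of this kind can be sketched in a few lines. The labels and predictions below are made up for
illustration and are not the report's data, and `never_predicted_classes` is a hypothetical helper. In
scikit-learn's confusion matrix, rows are true labels and columns are predicted labels, so a column that
sums to zero marks a class the model never predicts.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix


def never_predicted_classes(y_true, y_pred, labels):
    """Return the labels that never appear among the predictions."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    predicted_counts = cm.sum(axis=0)  # column sums = number of predictions per label
    return [label for label, count in zip(labels, predicted_counts) if count == 0]


# Made-up example: a model that predicts Group 1 for every observation.
labels = [0, 1, 2]
y_true = np.array([0, 0, 1, 1, 1, 1, 1, 2, 2, 2])
y_pred = np.ones_like(y_true)

cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)
missing = never_predicted_classes(y_true, y_pred, labels)
print(missing)                                   # [0, 2]
accuracy = accuracy_score(y_true, y_pred)
print(accuracy)                                  # 0.5
```

In this toy case the accuracy equals the share of Group 1 in the data, which shows why accuracy alone
can hide a model that has collapsed onto a single class.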

An accuracy score above 70% is a good result considering that the problem statement falls within the
Human Resources domain. Human Resources deals directly with human behavior; hence, high accuracy scores
cannot be expected. Moreover, we are working with a limited set of variables to predict employee
behavior. Several other variables, such as employee morale, job satisfaction, relationship with the
manager, and workplace ambience, which are generally considered key indicators of an employee's
performance and absenteeism rate, are not present in the dataset.

Thus, the Random Forest Classifier and the Decision Tree Classifier, which both have an accuracy score
of 71.62%, were selected for further evaluation. As the accuracy score alone is not a sufficient metric
for estimating the discrimination ability of a model, we plotted the Receiver Operating Characteristic
(ROC) curve, which shows the True Positive Rate (probability of detection) against the False Positive
Rate (probability of false alarm), for these two models only.
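
The per-class ROC analysis can be sketched as below. The report does not state how the multi-class ROC
curves were computed, so the one-vs-rest treatment here is an assumption, as are the synthetic dataset
and the Random Forest settings. Plotting is omitted; each `fpr`/`tpr` pair could be passed to a plotting
library to draw the curve.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic stand-in for the report's dataset (assumption): 3 classes.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=30, max_depth=7, random_state=0)
clf.fit(X_train, y_train)

# Class probabilities; columns follow the order of clf.classes_.
proba = clf.predict_proba(X_test)
# One indicator column per class, in the same order (one-vs-rest).
y_bin = label_binarize(y_test, classes=clf.classes_)

class_auc = {}
for k, label in enumerate(clf.classes_):
    # ROC curve for "class k versus the rest".
    fpr, tpr, _ = roc_curve(y_bin[:, k], proba[:, k])
    class_auc[int(label)] = auc(fpr, tpr)
    print(f"Class {label}: AUC = {class_auc[int(label)]:.2f}")
```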

Random Forest Classifier

The Random Forest Classifier has an area under the curve (AUC) of around 0.89 for all the classes. Thus,
the model has strong discrimination ability, with an AUC of 0.89 and an accuracy of 71.62%. The ROC
curve is shown below:

Decision Tree Classifier

The Decision Tree Classifier has an area under the curve of 0.94 for Class 1 but lower values of 0.79
and 0.78 for Class 2 and Class 3, respectively. Hence, the discrimination ability of this model is only
satisfactory, even though its accuracy score is 71.62%. The ROC curve is shown below:

Therefore, the Random Forest Classifier produced the better classification model: its accuracy score
matches that of the Decision Tree Classifier, and its area under the curve is high for all the classes.
