Assignment AnjaliVats 244

1. Various decision tree models were generated using different combinations of independent variables from the Carseats dataset to predict the target variables.
2. The accuracy of the models was evaluated using measures such as mean squared error.
3. The most accurate model was the one predicting whether a person belongs to the US (Yes/No) category, achieving around 89% accuracy.


Assignment

On

Decision Tree Analysis

IN PARTIAL FULFILLMENT OF THE DEGREE OF

Master of Business Administration - Intelligent Data Science (MBA-IDS: 2018-2020)

UNDER GUIDANCE OF

Prof. Keerti Jain

Anjali Vats MB18GID244


Contents
Problem Statement
DataSet
Models of Decision Tree & Model Accuracy
    Using rpart
    Comparing different combinations of models using mean squared error values
    Using the tree package
Best Model Generated
    Comparing Accuracy
    Comparing Means
Problem Statement

1. Build various decision tree models using different combinations of independent variables.
2. Check the accuracy of the models.
3. Find the best model among those generated.
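The three steps above can be sketched end-to-end on a small synthetic data frame before touching the Carseats file. The data below is invented purely for illustration; only the workflow (fit, predict, score) mirrors the assignment:

```r
library(rpart)  # rpart ships with standard R installations

set.seed(1)
# toy data: the outcome depends mainly on x1
d <- data.frame(x1 = runif(200), x2 = runif(200))
d$y <- factor(ifelse(d$x1 > 0.5, "Yes", "No"))

# step 1: build a decision tree model
fit <- rpart(y ~ x1 + x2, data = d, method = "class")

# step 2: check its accuracy
pred <- predict(fit, d, type = "class")
accuracy <- mean(pred == d$y)
accuracy

# step 3: the best model would be chosen by repeating this
# for different predictor combinations and comparing accuracies
```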

DataSet

The Carseats dataset is a dataframe with 400 observations on the following 11 variables:

1. Sales: unit sales (in thousands) at each location
2. CompPrice: price charged by the competitor at each location
3. Income: community income level (in thousands of dollars)
4. Advertising: local advertising budget at each location (in thousands of dollars)
5. Population: regional population (in thousands)
6. Price: price charged for car seats at each site
7. ShelveLoc: quality of the shelving location (Bad, Good or Medium)
8. Age: age level of the population
9. Education: education level at each location
10. Urban: Yes/No
11. US: Yes/No
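For the classification models below, the categorical columns (ShelveLoc, Urban, US) need to be factors; recent R versions no longer convert strings automatically when reading a CSV. A minimal base-R sketch of the conversion, on a stand-in data frame with the same column names:

```r
# stand-in rows with the Carseats categorical columns
df <- data.frame(
  ShelveLoc = c("Bad", "Good", "Medium"),
  Urban     = c("Yes", "No", "Yes"),
  US        = c("No", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# convert the character columns to factors
for (col in c("ShelveLoc", "Urban", "US")) df[[col]] <- factor(df[[col]])

str(df)
levels(df$ShelveLoc)   # factor levels are sorted alphabetically
```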
Models of Decision Tree & Model Accuracy

Using rpart
1. Predicting whether a person belongs to the US (Yes/No) based on the variables Income, Advertising, Population and Price.
############ CODE 1 ##################

# install (once) and load rpart
install.packages("rpart")
library(rpart)

getwd()

# read the Carseats data from a local CSV file
Carseats <- read.csv("C:/Users/intone/Desktop/MBA/T4/MA/After MidTerm/Carseats.csv")
attach(Carseats)
names(Carseats)

# classification tree for US (Yes/No) on four predictors
tree_analysis <- rpart(US ~ Income + Advertising + Population + Price,
                       data = Carseats, method = "class")
tree_analysis

# install (once) and load rpart.plot, then draw the tree
install.packages("rpart.plot")
library(rpart.plot)
rpart.plot(tree_analysis, extra = 1)

2) Predicting whether a store is Urban (Yes/No) based on the other parameters in the dataset, i.e. Income, Advertising, Education, Population, Price, Age, ShelveLoc, US and Sales.

# drop the first (row-index) column left over from the CSV
Carseats <- Carseats[, -1]

# ensure Urban is a Yes/No factor
Carseats$Urban <- factor(Carseats$Urban, levels = c("Yes", "No"))

print(summary(Carseats))

set.seed(1234)

# 70/30 split into training and validation sets
ind <- sample(2, nrow(Carseats), replace = TRUE, prob = c(0.7, 0.3))
trainData <- Carseats[ind == 1, ]
validationData <- Carseats[ind == 2, ]

tree <- rpart(Urban ~ ., data = trainData, method = "class")
rpart.plot(tree)

# evaluate on the validation set: confusion matrix and accuracy
pred <- predict(tree, validationData, type = "class")
table(Predicted = pred, Actual = validationData$Urban)
mean(pred == validationData$Urban)


3) Predicting ShelveLoc (Good, Medium or Bad) based on the other parameters in the dataset (Income, Advertising, Education, Population, Price, Age, Urban, US, Sales).

############ CODE ##################

tree_analysis <- rpart(ShelveLoc ~ Income + Advertising + Education + Population +
                         Price + Age + Urban + US + Sales,
                       data = Carseats, method = "class")

rpart.plot(tree_analysis, extra = 1)

Comparing different combinations of models using mean squared error values
Different combinations of independent variables were used to create models, and their mean squared error values were calculated from the difference between actual and predicted values.

The lowest mean error was obtained for model7 and model8, with 7 and 8 independent variables respectively, i.e. with the following combinations:

tree_model7 = tree(High ~ Advertising + Age + Price + Education + Income + Population + US + ShelveLoc, training_data)
tree_model8 = tree(High ~ Advertising + Age + Price + Education + Income + Population + US + ShelveLoc + Urban, training_data)
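The error measure behind this comparison can be shown in isolation: for classification models it is the proportion of mismatches between predicted and actual labels. The two vectors below are hypothetical, purely to demonstrate the computation:

```r
# hypothetical actual and predicted labels for five observations
actual    <- c("Yes", "No", "Yes", "Yes", "No")
predicted <- c("Yes", "No", "No",  "Yes", "No")

# misclassification rate: one mismatch out of five
error_rate <- mean(predicted != actual)
error_rate   # 0.2
```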
Using the tree package
a) Creating a decision tree model by splitting the dataset into training and test data.

# install (once) and load the tree package
install.packages("tree")
library(tree)

# High is assumed to be derived from Sales, the usual recoding for this dataset
High <- factor(ifelse(Carseats$Sales <= 8, "No", "Yes"))
Carseats <- data.frame(Carseats, High)

# split data into training and test sets
set.seed(2)
train <- sample(1:nrow(Carseats), nrow(Carseats)/2)
training_data <- Carseats[train, ]
testing_data <- Carseats[-train, ]
testing_High <- High[-train]

# fit the tree model using training data (Sales is excluded
# because High was derived from it)
tree_model <- tree(High ~ . - Sales, training_data)
plot(tree_model)
text(tree_model, pretty = 0)

# test-set misclassification rate
tree_pred <- predict(tree_model, testing_data, type = "class")
mean(tree_pred != testing_High)

#PRUNE the tree


## cross-validation to check where to stop pruning
set.seed(3)

cv_tree=cv.tree(tree_model, FUN=prune.misclass)
names(cv_tree)
plot(cv_tree$size, cv_tree$dev, type="b")
##prune the tree

pruned_model=prune.misclass(tree_model, best=9)

plot(pruned_model)

text(pruned_model, pretty=0)

##check how it is doing

tree_pred=predict(pruned_model, testing_data, type="class")

mean(tree_pred !=testing_High)
Best Model Generated

Comparing Accuracy
1) The best model generated is the one in which the US (Yes/No) labels were predicted from the other variables. This model has an accuracy of around 89%.
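To illustrate how an accuracy of roughly 89% arises on 400 observations, here is the computation on a confusion matrix. The counts below are invented for illustration; the report does not list the actual matrix:

```r
# hypothetical confusion matrix for the US (Yes/No) model on 400 observations
conf <- matrix(c(160,  25,
                  20, 195),
               nrow = 2, byrow = TRUE,
               dimnames = list(Predicted = c("No", "Yes"),
                               Actual    = c("No", "Yes")))

# accuracy = correctly classified / total
accuracy <- sum(diag(conf)) / sum(conf)
accuracy   # 0.8875, i.e. about 89%
```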

Comparing Means
The lowest mean error was obtained for model7 and model8, with 7 and 8 independent variables respectively, i.e. with the following combinations:

tree_model7 = tree(High ~ Advertising + Age + Price + Education + Income + Population + US + ShelveLoc, training_data)
tree_model8 = tree(High ~ Advertising + Age + Price + Education + Income + Population + US + ShelveLoc + Urban, training_data)
