Individual Assignments: Unit 2: Values, Data Types and Data Structures in R, Assignment 1
Course: BUSI 4063 Business Intelligence and Analytics 11/11(20U-C-BC-9A)
Assignments will be marked on comprehensiveness, presentation quality, form, and content. Evidence
that a student has demonstrated or mastered these criteria will be assessed according to the
following grade standards. Submissions must be presented in the manner requested for each
assignment. Unless otherwise directed, assignments should be uploaded in R Markdown format, or as
an R script with accompanying files if required.
Sally adds 5 ounces to one of Frank's bags, 3 ounces to another, and 12 ounces to the last.
Use vectors to show how much gold Frank has in each bag.
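A minimal sketch of the vector approach is given here. The starting amounts in each bag are not stated in this excerpt, so the values in `frank_start` are hypothetical placeholders; only the added amounts (5, 3, and 12 ounces) come from the task text. The variable names are likewise illustrative.

```r
# Hypothetical starting amounts (ounces) for Frank's three bags;
# the assignment's actual figures are not reproduced here.
frank_start <- c(bag1 = 10, bag2 = 20, bag3 = 30)

# Amounts Sally adds to each bag, taken from the task text.
sally_adds <- c(bag1 = 5, bag2 = 3, bag3 = 12)

# Vector arithmetic is element-wise, so every bag is updated in one step.
frank_now <- frank_start + sally_adds
print(frank_now)

# Total gold across all bags.
print(sum(frank_now))
```

Naming the elements keeps the printed result readable without a separate lookup of which position belongs to which bag.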
Split the dataframe into two pieces, and print both halves separately.
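One possible way to split a dataframe is by row position, sketched below. The dataframe `df` and its columns are hypothetical stand-ins, and the sketch assumes "two pieces" means a split by rows rather than by columns.

```r
# Hypothetical dataframe standing in for the one built earlier in the assignment.
df <- data.frame(
  id    = 1:6,
  score = c(88, 92, 79, 65, 95, 70)
)

# Row index of the midpoint; with an odd row count the first half gets the extra row.
mid <- ceiling(nrow(df) / 2)

# Positive indices keep the first rows, negative indices drop them.
first_half  <- df[seq_len(mid), ]
second_half <- df[-seq_len(mid), ]

print(first_half)
print(second_half)
```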
Unit 4: Describing and Visualizing Data, Assignment 3
In this assignment, students become familiar with measures of central tendency and with
plotting data visually.
Create a dataframe with 5 columns.
Boxplot
Scatterplot
Histogram
Replace one of the data points with an outlier, and generate a new boxplot showing the
outlier.
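The steps above could be sketched in base R roughly as follows. The column names, the simulated distributions, and the outlier value of 150 are all assumptions chosen for illustration; any dataframe with five numeric columns would work the same way.

```r
set.seed(42)  # make the simulated values reproducible

# Hypothetical dataframe with 5 numeric columns (names and distributions are made up).
df <- data.frame(
  a = rnorm(30, mean = 50, sd = 5),
  b = rnorm(30, mean = 60, sd = 8),
  c = rnorm(30, mean = 40, sd = 4),
  d = runif(30, min = 20, max = 80),
  e = rpois(30, lambda = 45)
)

# Measures of central tendency for each column.
print(sapply(df, mean))
print(sapply(df, median))

boxplot(df, main = "Boxplot of all columns")
plot(df$a, df$b, xlab = "a", ylab = "b", main = "Scatterplot of a vs b")
hist(df$a, xlab = "a", main = "Histogram of a")

# Replace one data point with an extreme value and redraw the boxplot.
df_outlier <- df
df_outlier$a[1] <- 150
boxplot(df_outlier, main = "Boxplot with an outlier in column a")

# boxplot.stats() lists the points drawn beyond the whiskers.
print(boxplot.stats(df_outlier$a)$out)
```

Working on a copy (`df_outlier`) keeps the original data intact, so the two boxplots can be compared directly.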
https://www.youtube.com/watch?v=ePD96i0YHII
#Other Types: l, b, h, s
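The letters in the comment above appear to match values of the `type` argument of base R's `plot()` function; assuming that reading, a small sketch of the four variants follows. The `x` and `y` values are made up for illustration.

```r
# Small made-up series used only to show the different plot types.
x <- 1:10
y <- c(2, 4, 3, 6, 5, 8, 7, 9, 8, 10)

# Draw the four types in a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
plot(x, y, type = "l", main = 'type = "l" (lines)')
plot(x, y, type = "b", main = 'type = "b" (points and lines)')
plot(x, y, type = "h", main = 'type = "h" (vertical lines)')
plot(x, y, type = "s", main = 'type = "s" (stair steps)')
par(old_par)  # restore the previous layout
```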
Change the data type in only two of the columns
Build it entirely in R so that it runs standalone (i.e., do not import an Excel file).
Value to be predicted: Will Nathan mow the lawn? (a FACTOR variable with two
levels).
Use a confusion matrix to demonstrate how accurate the model is, given the variables
available.
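A rough sketch of a confusion matrix for this kind of two-level prediction is shown below. This excerpt does not say which model or predictors the assignment intends, so logistic regression (`glm`) is used here purely as a stand-in, and the predictors `temperature` and `raining`, the simulated outcome, and the 0.5 cutoff are all assumptions.

```r
set.seed(1)  # reproducible simulated data

# Hypothetical predictors; the assignment's actual variables are not shown here.
n <- 100
lawn <- data.frame(
  temperature = round(runif(n, min = 10, max = 35)),
  raining     = factor(sample(c("no", "yes"), n, replace = TRUE))
)

# Simulated outcome: mowing is more likely on warm, dry days (an arbitrary rule).
p_mow <- plogis(-4 + 0.2 * lawn$temperature - 2 * (lawn$raining == "yes"))
lawn$mows <- factor(ifelse(runif(n) < p_mow, "yes", "no"), levels = c("no", "yes"))

# Logistic regression as a stand-in model.
fit <- glm(mows ~ temperature + raining, data = lawn, family = binomial)

# Convert predicted probabilities to class labels with a 0.5 cutoff.
prob <- predict(fit, type = "response")
pred <- factor(ifelse(prob > 0.5, "yes", "no"), levels = c("no", "yes"))

# Confusion matrix: rows are predictions, columns are actual values.
conf_mat <- table(Predicted = pred, Actual = lawn$mows)
print(conf_mat)

# Accuracy is the share of cases on the diagonal.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

Because the sketch scores the model on the same rows it was fitted to, the accuracy is likely optimistic; holding out a test set would give a fairer estimate.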
In essay format explain the differences and similarities between how a decision tree
functions and how K-NN functions.
https://rpubs.com/Nitika/kmeans_Iris#:~:text=Let%E2%80%99s%20begin%20with%20our%20clustering%20task%20on%20Iris,R%20Studio%20Console.%201.%20Load%20and%20view%20dataset.
Build the model, and produce commented code explaining each step.
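Assuming the model in question is k-means clustering on the built-in `iris` dataset, as the linked page suggests, a commented sketch might look like the following. The choice of 3 clusters and `nstart = 20` are assumptions, not requirements taken from the assignment.

```r
# iris ships with R; drop the species label because k-means is unsupervised.
features <- iris[, 1:4]

set.seed(123)  # k-means starts from random centres, so fix the seed

# Fit k-means with 3 clusters; nstart = 20 keeps the best of 20 random starts.
km <- kmeans(features, centers = 3, nstart = 20)

# Cluster sizes and centres.
print(km$size)
print(km$centers)

# Compare clusters with the known species (for inspection only; not used in fitting).
print(table(Cluster = km$cluster, Species = iris$Species))
```

The cluster numbers are arbitrary labels, so the cross-tabulation should be read by looking at which species dominates each row rather than by matching numbers to species.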
Imagine you work in a marketing department and need to divide your customers into
market segments. Which type of machine learning would you use, supervised or
unsupervised, and which R package would you implement?
Import the data1.csv file into RStudio.
Remove the last column, and create a boxplot from the remaining columns.
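The import and boxplot steps might be sketched as below. The contents of data1.csv are not part of this excerpt, so the sketch first writes a small made-up stand-in file to a temporary folder; the column names and values are hypothetical, and the sketch assumes the remaining columns are numeric.

```r
# Stand-in for data1.csv so the sketch runs on its own; the real file's
# columns are unknown, so these names and values are made up.
csv_path <- file.path(tempdir(), "data1.csv")
write.csv(
  data.frame(x1 = c(1.2, 2.4, 3.1, 4.8, 5.0),
             x2 = c(10, 12, 9, 14, 11),
             x3 = c(2.5, 4.9, 6.0, 9.7, 10.2),
             label = c("a", "b", "a", "b", "a")),
  csv_path, row.names = FALSE
)

# Import the file (with the real data, point csv_path at its location instead).
data1 <- read.csv(csv_path)

# Drop the last column by position, whatever its name is.
data1_trimmed <- data1[, -ncol(data1), drop = FALSE]

# boxplot() on a dataframe draws one box per remaining column.
boxplot(data1_trimmed, main = "data1 without the last column")
```

Indexing with `-ncol(data1)` avoids hard-coding a column name, which helps when the file's header is not known in advance.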
Explain the relationship between columns 1 and 3, using your scatterplot and
correlation calculation as evidence.
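A short sketch of the scatterplot and correlation calculation follows. The dataframe here is a made-up stand-in for the imported file, so the numbers carry no meaning beyond illustration; the columns are selected by position to mirror the wording of the task.

```r
# Made-up stand-in for the imported data; only columns 1 and 3 matter here.
data1 <- data.frame(
  x1 = c(1.2, 2.4, 3.1, 4.8, 5.0),
  x2 = c(10, 12, 9, 14, 11),
  x3 = c(2.5, 4.9, 6.0, 9.7, 10.2)
)

# Scatterplot of column 1 against column 3, selected by position.
plot(data1[[1]], data1[[3]],
     xlab = names(data1)[1], ylab = names(data1)[3],
     main = "Column 1 vs column 3")

# Pearson correlation: values near +1 or -1 indicate a strong linear relationship.
r <- cor(data1[[1]], data1[[3]])
print(r)
```

The sign of `r` gives the direction of the relationship and its magnitude the strength, which can then be checked against the shape of the point cloud in the plot.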
If you applied a naïve Bayes classifier to this data to predict the last column, would
you include the last column in the training set? Why or why not?