A1rib T4
Submitting the assignment: Please use this word file as your starting point. Add your
answers in the boxes below the questions. Please also copy-paste the R code that you use if
the question asks you to do so. Once you have completed it, convert this word document to
pdf and submit the pdf as well as the R script that you used to come to the answers in Canvas
-> Assignments -> Assignment 1.
Please name the pdf document and the R script: “A1RIB_TeamName”. For example, if Team
A submitted the files, they would be named A1RIB_TA.pdf & A1RIB_TA.R. Each team
should submit only one file.
To check:
1. Make sure that you create a main folder for this assignment (you can name the folder
something like “A1_RIB” or whatever you like).
2. This folder can consist of sub-folders like code, data…
3. Set the working directory as your main folder (under Session -> Set Working
Directory).
4. Consult the R instructional videos and the document “A very short introduction to R”
by Torfs & Brauer to help you get started.
Questions
Basics
1. Install and load the package “tidyverse”. Please copy in the code you used.
a. Report at least one other way of installing a package.
install.packages("tidyverse")
library(tidyverse)
#Alternative method
## Tools >> Install Packages >> type in the package names separated by commas >> load packages
2. Compute (73 + 4) * 15 / √43 and assign the name calculation to the result. Print
calculation to the console and report the value below. Please copy in the code you used.
a. Now, standardize calculation by subtracting its mean and dividing by its
standard deviation. What is the result and why? Please copy in the code you
used.
[1] 176.1358
a.
[1] NA
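The result is NA because calculation is a single number. A single value has no standard
deviation (sd() returns NA for a vector of length one); only a vector of several numbers,
such as one built with c(), can have one. Dividing by NA then gives NA.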
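The two steps of this question can be sketched as follows. This is a minimal sketch, assuming the expression in the question is (73 + 4) * 15 / √43, which reproduces the reported value; the name standardized is introduced here only for illustration.

```r
# Compute the expression and store it under the name asked for in the question
calculation <- (73 + 4) * 15 / sqrt(43)
print(calculation)

# Standardize: subtract the mean, divide by the standard deviation.
# calculation has length one, so sd() returns NA and the result is NA.
standardized <- (calculation - mean(calculation)) / sd(calculation)
print(standardized)
```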
3. Create a vector called “a” that has the numbers 1 to 50. Then create a vector called
“b” that has the numbers 51 to 100. Assign the two vectors to a matrix called m1 that
has 2 columns. Please copy in the code you used.
a <- 1:50
b <- 51:100
m1 <- cbind(a, b)
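A quick check of the difference between adding and binding the two vectors, as a minimal sketch: a + b adds element by element and returns a plain vector of length 50, whereas cbind(a, b) gives a matrix with 50 rows and 2 columns. cbind() is only one way to do this; matrix(c(a, b), ncol = 2) is another. The name summed is introduced here for illustration.

```r
a <- 1:50
b <- 51:100

# Element-wise addition: still a vector, not a matrix
summed <- a + b
length(summed)
is.matrix(summed)

# Column-binding: a matrix with one column per vector
m1 <- cbind(a, b)
dim(m1)
```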
4. Create a vector called months containing the numbers 76, 32, 84, and 9. Compute a
vector called years from it by dividing months by 12. Report the value of years below
and copy in the code you used.
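One way to approach this question, as a minimal sketch: dividing a vector by a single number divides every element, so no loop is needed.

```r
# Vector of months as given in the question
months <- c(76, 32, 84, 9)

# Division is applied to each element in turn
years <- months / 12
print(years)
```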
5. What happens when you check whether a is larger than b? Explain. Note that you
created these vectors above.
R compares the two vectors element by element and returns a logical vector of 50 FALSE
values, because each element of a is smaller than the element in the same position of b.
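The element-wise comparison can be checked directly. A minimal sketch, assuming a and b are defined as in question 3; the name larger is introduced here for illustration.

```r
a <- 1:50
b <- 51:100

# One TRUE/FALSE per position, not a single answer
larger <- a > b
length(larger)

# any() collapses the 50 results: is a larger than b anywhere?
any(larger)
```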
6. Is the mean of a smaller than or equal to the mean of b? Please copy in the code you used.
How is this operation different than in question 5?
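A possible way to answer this, as a minimal sketch: mean() reduces each vector to one number first, so the comparison returns a single logical value instead of the 50 values produced in question 5. The name mean_check is introduced here for illustration.

```r
a <- 1:50
b <- 51:100

# Each mean is a single number, so the comparison is a single TRUE/FALSE
mean_check <- mean(a) <= mean(b)
print(mean_check)
length(mean_check)
```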
7. Is the vector c(1, “a”, 3) equal to the vector c(1, 2, “3”)? Do you think it makes sense
what R is doing here?
Yes. R again compares the two vectors element by element. Because each vector mixes
numbers and text, c() converts all of its elements to character strings, so R compares
“1”, “a”, “3” with “1”, “2”, “3” as text.
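The coercion can be made visible with class(). A minimal sketch; the names x and y are introduced here for illustration.

```r
x <- c(1, "a", 3)
y <- c(1, 2, "3")

# Mixing numbers and text makes both vectors character vectors
class(x)
class(y)

# Element-wise comparison of the strings: only positions 1 and 3 match
x == y
```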
8. Imagine there is a medical study and patients should be excluded from the study if
they weigh more than 90 kg or if they are younger than 18 years. Define the vector
age as age <- c(50,17,21,16,90) and the vector weight as weight <-
c(80,75,92,106,69). Then write a logical statement involving these two variables
that tests for the exclusion criteria. How many people qualify for the study? Please
copy in the code you used.
sum(criteria)
[1] 2
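The answer uses criteria without showing how it was built. One possible definition, as a sketch under that assumption: first flag the patients who meet an exclusion rule, then negate the flag so that TRUE means "qualifies". The name excluded is introduced here for illustration.

```r
age <- c(50, 17, 21, 16, 90)
weight <- c(80, 75, 92, 106, 69)

# Excluded if heavier than 90 kg OR younger than 18 years
excluded <- weight > 90 | age < 18

# Assumed definition of criteria: TRUE for patients who qualify
criteria <- !excluded

# TRUE counts as 1, so sum() counts the qualifying patients
sum(criteria)
which(criteria)
```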
9. Load the d1.csv dataset into R as an object called data1. Which function do you need to
use and why? Report at least one other way on how you could load this data. Please
copy in the code you used.
Alternative: click on the file in the Files pane > Import Dataset > name the dataset > Import
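In code, a comma-separated file is typically read with read.csv() from base R, since it expects commas as separators and a header row by default; read_csv() from the readr package (part of the tidyverse) is a common alternative. The sketch below writes a tiny stand-in file so that it runs anywhere. The file name d1_standin.csv and its contents are made up for illustration; in the assignment the path would point at the real d1.csv.

```r
# Stand-in CSV (hypothetical contents) so the example is self-contained
csv_path <- file.path(tempdir(), "d1_standin.csv")
writeLines(c("id,ahi01", "1,2", "2,3"), csv_path)

# read.csv() returns a data frame with one column per comma-separated field
data1 <- read.csv(csv_path)
str(data1)
```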
10. Create a vector called new that is the result of a product of the variables "ahi01" and
"ahi02" from the data1 dataset. Why is this vector not another variable in the data1
dataset? Please copy in the code you used.
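One way to approach this, as a minimal sketch: the product is assigned to a free-standing object called new, so it lives in the workspace next to data1 and not inside it. It only becomes a variable of the dataset when assigned with data1$new. The data frame below is a small stand-in with made-up values, not the real d1.csv.

```r
# Hypothetical stand-in for data1 with the two columns named in the question
data1 <- data.frame(ahi01 = c(2, 3, 4), ahi02 = c(1, 5, 2))

# Element-wise product stored as a separate vector
new <- data1$ahi01 * data1$ahi02
"new" %in% names(data1)

# Assigning into data1 adds it as a column
data1$new <- new
"new" %in% names(data1)
```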
11. Create a histogram of the elapsed.days variable from the data1 dataset. What type of
distribution is this?
Negative exponential distribution
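The histogram itself can be drawn with hist(). A minimal sketch: the values below are simulated only so the code runs without d1.csv, and they say nothing about the shape of the real elapsed.days variable. The title and axis label are illustrative choices.

```r
# Simulated stand-in for data1$elapsed.days (not the assignment data)
set.seed(1)
data1 <- data.frame(elapsed.days = rexp(200, rate = 1 / 10))

# hist() draws the plot and also returns the bin counts
h <- hist(data1$elapsed.days,
          main = "Histogram of elapsed.days",
          xlab = "Elapsed days")
```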
12. Create a scatter plot between the variables ahiTotal and cesdTotal from the data1
dataset. Can you already comment on the direction of the relationship between these
two variables? Hint: you can use plot() or a more complex version from the ggplot2
package.
a. For a bonus, try to give the plot a title and change the x and y coordinate
names.
plot(data1$ahiTotal, data1$cesdTotal,
main = "Negative relationship between ahiTotal and cesdTotal",
xlab = "ahiTotal",
ylab = "cesdTotal")
13. If you run the following code: L1 <- list(a,b,data2) what type of error will you get
and why? How would you solve it?
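A sketch of what happens here, assuming no object called data2 has been created in the session: R cannot evaluate the third argument and stops with an error saying the object was not found. tryCatch() is used only to capture the message so the example keeps running. The data2 created afterwards is a hypothetical stand-in; in the assignment the fix would be to load or create the intended object first, or to use an existing one such as data1.

```r
a <- 1:50
b <- 51:100

# data2 does not exist yet, so list() fails when it evaluates it
msg <- tryCatch(
  list(a, b, data2),
  error = function(e) conditionMessage(e)
)
print(msg)

# Possible fix: define the missing object before building the list
data2 <- data.frame(x = 1:3)  # hypothetical stand-in
L1 <- list(a, b, data2)
length(L1)
```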
14. If you run the following code: c <- c(a, b, 5, 6,7 8, 9) where is the mistake? How
would you solve it?
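The problem in this line is a missing comma between 7 and 8, which makes it a syntax error, so R rejects the code before running any of it. A minimal sketch: the faulty line is kept as text and passed to parse() so the example itself stays runnable; the names bad_code and parsed_ok are introduced here for illustration.

```r
a <- 1:50
b <- 51:100

# The faulty line as text: "7 8" has no comma between the two numbers
bad_code <- "c <- c(a, b, 5, 6,7 8, 9)"
parsed_ok <- tryCatch({
  parse(text = bad_code)
  TRUE
}, error = function(e) FALSE)
print(parsed_ok)

# Fixed version: add the missing comma
c <- c(a, b, 5, 6, 7, 8, 9)
length(c)
```

Naming a vector c is legal, but it is easy to confuse with the function c(), so a different name may be clearer in practice.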
[1] NA
a[51] does not exist. a only contains the values 1 to 50, so it has only 50 elements, and
indexing past the end of the vector returns NA.