
R Programming Lab Manual (1)

The document is a laboratory manual for an R Programming course for B.Tech students, detailing various programming experiments to be conducted during the 8th semester. It includes tasks such as checking for leap years, calculating sums, grading students based on marks, and performing operations on matrices and data frames. Each program is accompanied by a description, aim, and example code with expected outputs.

Uploaded by

anshul301003

Laboratory Manual

R Programming
(Laboratory)
D022822 (022)
B.TECH -VIII Semester

Department of Computer Science & Engineering


Columbia Institute of Engineering & Technology, Raipur
(Approved by AICTE, New Delhi and Affiliated to CSVTU, Bhilai)

1. LAB INCHARGE:………………………………………
SESSION-2020-2021
List of Programs as per University, Semester 8

R Programming Laboratory - D022822(022)

S. No.  Experiment

1. Write a program to check whether a year (integer) entered by the user is a leap year or not.

2. Write an R program to find the sum of natural numbers without a formula, using the if–else statement and the while loop.

3. Write a program that prints the grades of the students according to the marks obtained. The grading of the marks should be as follows:
   Marks           Grade
   800 - 1000      A+
   700 - 800       A
   500 - 700       B+
   400 - 500       B
   150 - 400       C
   Less than 150   D

4. Write an R program to make a simple calculator that can add, subtract, multiply and divide using switch cases and functions.

5. Write a program to perform searching within a list (1 to 50). If the number is found in the list, print that the search is successful; otherwise print that the number is not in the list.

6. Create a list and data frame that stores the marks of any three subjects for 10 students. Find out the total marks, average, maximum marks and minimum marks of every subject.

7. Write a program to create two 3 X 3 matrices A and B and perform the following operations: a) Transpose of the matrix b) addition c) subtraction.

8. Write an R program to create a list containing strings, numbers, vectors and logical values and do the following manipulations over the list.
   a. Access the first element in the list
   b. Give the names to the elements in the list
   c. Add element at some position in the list
   d. Remove the element
   e. Print the fourth element
   f. Update the third element

9. Let us use the built-in dataset airquality, which has daily air quality measurements in New York, May to September 1973. Create a histogram by using appropriate arguments for the following statements.
   a. Assigning names, using the airquality data set
   b. Change colors of the histogram
   c. Remove axis and add labels to histogram
   d. Change axis limits of a histogram
   e. Create a histogram with density and add density curve to the histogram

10. Design a data frame in R for storing about 8 employee details. Create a CSV file named “input.csv” that defines all the required information about the employee such as id, name, salary, start_date, dept. Import into R and do the following analysis.
   a. Find the total number of rows & columns
   b. Find the maximum salary
   c. Retrieve the details of the employee with maximum salary
   d. Retrieve all the employees working in the IT Department
   e. Retrieve the employees in the IT Department whose salary is greater than 600 and write these details into another file “output.csv”.

Program 1:

AIM: Write a program to check whether a year (integer) entered by the user is a leap year or not.

Description:

➢ R is a programming language and software environment for statistical analysis, graphics representation and reporting.
➢ R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team.
➢ R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac.
➢ The language was named R after the first letter of the first names of its two authors (Robert Gentleman and Ross Ihaka), and partly as a play on the name of the Bell Labs language S.
➢ R is a widely used data analytics tool because it is open source, flexible, offers many packages and has a large community.

In this program we take a year and check whether it is a leap year or not:

➢ Take the year as input from the user and store it in the variable year.
➢ If the year is divisible by 4 but not by 100, DISPLAY "leap year".
➢ If the year is divisible by 400, DISPLAY "leap year".
➢ Otherwise, DISPLAY "not leap year".

Program:

year = as.integer(readline(prompt="Enter a year: "))

if((year %% 4) == 0) {
  if((year %% 100) == 0) {
    # century years are leap years only when divisible by 400
    if((year %% 400) == 0) {
      print(paste(year,"is a leap year"))
    } else {
      print(paste(year,"is not a leap year"))
    }
  } else {
    print(paste(year,"is a leap year"))
  }
} else {
  print(paste(year,"is not a leap year"))
}

Output:

Enter a year: 1900

[1] "1900 is not a leap year"

Enter a year: 2000

[1] "2000 is a leap year"
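
The three checks listed in the description can also be folded into a single logical expression. The following is only a sketch under that reading of the rules; the helper name is_leap_year is an assumption introduced here and does not appear in the manual.

```r
# Hypothetical helper (not part of the manual): TRUE when `year` is a leap year.
# A year is a leap year if it is divisible by 4 but not by 100,
# or if it is divisible by 400.
is_leap_year <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)
}

# The function is vectorised, so several years can be checked at once.
years <- c(1900, 2000, 2023, 2024)
print(data.frame(year = years, leap = is_leap_year(years)))
```

Because the helper takes its input as an argument instead of calling readline(), it can be checked without typing values at the console.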


Program 2

Aim : Write an R program to find the sum of natural numbers without formula using
the if–else statement and the while loop.

Description:

Here, we ask the user for a number and display the sum of the natural numbers up to that number.

We use a while loop that iterates until the number becomes zero. On each iteration, we add the current value of num to sum and then decrease num by 1, which gives the total sum in the end.

We could also have solved this problem without any loop by using a formula.

From mathematics, we know that the sum of the first n natural numbers is given by n*(n+1)/2.

Program:

# take input from the user
num = as.integer(readline(prompt = "Enter a number: "))

if(num < 0) {
  print("Enter a positive number")
} else {
  sum = 0
  # use while loop to iterate until zero
  while(num > 0) {
    sum = sum + num
    num = num - 1
  }
  # print once, after the loop has finished
  print(paste("The sum is", sum))
}
Output

Enter a number: 10

[1] "The sum is 55"
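
As a quick cross-check, the loop result can be compared with the closed-form value n*(n+1)/2. This is a minimal sketch; the function names sum_by_loop and sum_by_formula are assumptions made for illustration.

```r
# Hypothetical helpers (not part of the manual).

# Adds n, n-1, ..., 1 with a while loop, as in the program above.
sum_by_loop <- function(n) {
  total <- 0
  while (n > 0) {
    total <- total + n
    n <- n - 1
  }
  total
}

# Closed-form sum of the first n natural numbers.
sum_by_formula <- function(n) {
  n * (n + 1) / 2
}

# Both approaches should agree for every n.
for (n in c(1, 10, 100)) {
  cat("n =", n, "loop =", sum_by_loop(n), "formula =", sum_by_formula(n), "\n")
}
```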


Program: 3

AIM: Write a program that prints the grades of the students according to the marks obtained. The grading of the marks should be as follows:
Marks           Grade
800 - 1000      A+
700 - 800       A
500 - 700       B+
400 - 500       B
150 - 400       C
Less than 150   D

Description:

Print the grade of a student using the if else decision statement in R. The program takes the marks as input from the user, compares them against the grade ranges and displays the corresponding grade as output.

Program:

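v <- as.integer(readline(prompt="Enter a Marks: "))
# The ranges in the aim share their end points (e.g. 800), so the
# lower bound of each range is treated as inclusive here.
if(v >= 800 && v <= 1000){
print("A+")
}else if(v >= 700 && v < 800){
print("A")
}else if(v >= 500 && v < 700){
print("B+")
}else if(v >= 400 && v < 500){
print("B")
}else if(v >= 150 && v < 400){
print("C")
}else{
print("D")
}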

Output:

Enter a Marks: 950

[1] "A+"
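
The same grading table can be applied to many students at once with cut(), which maps numbers to labelled intervals. The sketch below is one possible approach, not the manual's method; the helper name grade_for is an assumption, lower bounds are treated as inclusive, and marks above 1000 are not validated.

```r
# Hypothetical helper (not part of the manual): grade for each value in `marks`.
# right = FALSE makes every interval include its lower bound, e.g. [800, Inf).
grade_for <- function(marks) {
  as.character(cut(marks,
                   breaks = c(-Inf, 150, 400, 500, 700, 800, Inf),
                   labels = c("D", "C", "B", "B+", "A", "A+"),
                   right = FALSE))
}

print(grade_for(c(950, 800, 650, 450, 200, 100)))
```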
Program: 4

Aim: Write an R program to make a simple calculator that can add, subtract, multiply
and divide using switch cases and functions

Description:

In this program, we ask the user to choose the desired operation. Options 1, 2, 3 and 4 are valid.

Two numbers are taken from the user and a switch branching is used to execute a particular function.

User-defined functions add(), subtract(), multiply() and divide() evaluate the respective operations.

Program:

add <- function(x, y) {
  return(x + y)
}

subtract <- function(x, y) {
  return(x - y)
}

multiply <- function(x, y) {
  return(x * y)
}

divide <- function(x, y) {
  return(x / y)
}

# take input from the user
print("Select operation.")
print("1.Add")
print("2.Subtract")
print("3.Multiply")
print("4.Divide")

choice = as.integer(readline(prompt="Enter choice[1/2/3/4]: "))
num1 = as.integer(readline(prompt="Enter first number: "))
num2 = as.integer(readline(prompt="Enter second number: "))

# pick the symbol and the function that match the chosen option
operator <- switch(choice,"+","-","*","/")
result <- switch(choice, add(num1, num2), subtract(num1, num2), multiply(num1, num2), divide(num1, num2))

print(paste(num1, operator, num2, "=", result))

Output:

[1] "Select operation."

[1] "1.Add"

[1] "2.Subtract"

[1] "3.Multiply"

[1] "4.Divide"

Enter choice[1/2/3/4]: 4

Enter first number: 20

Enter second number: 4

[1] "20 / 4 = 5"
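
The program above does not say what should happen for a choice outside 1 to 4 or for division by zero. The following sketch shows one way to guard against both; the function name calculate and the error messages are assumptions introduced here.

```r
# Hypothetical wrapper (not part of the manual) around the four operations.
# It stops with an error for an invalid choice or a zero divisor.
calculate <- function(choice, x, y) {
  if (!(choice %in% 1:4)) {
    stop("choice must be 1, 2, 3 or 4")
  }
  if (choice == 4 && y == 0) {
    stop("cannot divide by zero")
  }
  # switch() with a number selects the matching alternative by position
  switch(choice, x + y, x - y, x * y, x / y)
}

print(calculate(1, 20, 4))
print(calculate(4, 20, 4))
```

Wrapping a call in tryCatch() lets a script report the error message instead of stopping.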


Program 5

AIM: Write a program to perform searching within a list (1 to 50). If the number is found
in the list, print that the search is successful otherwise print that the number is not in the
list.

Description:
List:
A list is an R object which can contain many different types of elements, such as vectors, functions and even another list. In this program we will perform searching within a list.

Program:
x <- list(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,
          26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50)

v = 7

if(v %in% x){
print("search is successful")
}else{
print("number is not in the list")
}

Output:

[1] "search is successful"
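
The %in% operator only reports whether the value is present. If the position is also of interest, an explicit linear search can be written by hand. This is a sketch only; the helper name linear_search is an assumption and is not used in the manual.

```r
# Hypothetical helper (not part of the manual): returns the 1-based position
# of `target` in the list `items`, or NA when it is not present.
linear_search <- function(items, target) {
  for (i in seq_along(items)) {
    if (items[[i]] == target) {
      return(i)
    }
  }
  NA_integer_
}

x <- as.list(1:50)

for (v in c(7, 51)) {
  pos <- linear_search(x, v)
  if (is.na(pos)) {
    print(paste(v, "is not in the list"))
  } else {
    print(paste("search is successful:", v, "found at position", pos))
  }
}
```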


Program 6

AIM: Create a list and data frame that stores the marks of any three subjects for 10
students. Find out the total marks, average, maximum marks and minimum marks of every
subject.

Description:
Data frame:
Data frames are tabular data objects. Unlike in a matrix, each column of a data frame can contain a different mode of data: the first column can be numeric while the second column is character and the third is logical. A data frame is a list of vectors of equal length.
Data frames are created using the data.frame() function.

Program:
s=data.frame(Name=c("Ab","Bb","Cb","Ab","A","Aaa","Aac","Abb","Acc","Add"),
S1=c(70,76,60,67,66,56,80,79,88,65),
S2=c(73,78,60,62,66,59,85,77,93,66),
S3=c(74,74,69,55,68,77,87,73,98,66))
print(s)

print(sum(s$S1))
print(max(s$S1))
print(min(s$S1))
print(mean(s$S1))

print(sum(s$S2))
print(max(s$S2))
print(min(s$S2))
print(mean(s$S2))

print(sum(s$S3))
print(max(s$S3))
print(min(s$S3))
print(mean(s$S3))

output:

Name S1 S2 S3
1 Ab 70 73 74
2 Bb 76 78 74
3 Cb 60 60 69
4 Ab 67 62 55
5 A 66 66 68
6 Aaa 56 59 77
7 Aac 80 85 87
8 Abb 79 77 73
9 Acc 88 93 98
10 Add 65 66 66
[1] 707
[1] 88
[1] 56
[1] 70.7
[1] 719
[1] 93
[1] 59
[1] 71.9
[1] 741
[1] 98
[1] 55
[1] 74.1
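
The twelve print() calls can be condensed by applying the same four functions to every subject column with sapply(). The sketch below rebuilds only the marks columns from the program above; the variable names marks and stats are assumptions.

```r
# Marks copied from the data frame in the program above.
marks <- data.frame(
  S1 = c(70, 76, 60, 67, 66, 56, 80, 79, 88, 65),
  S2 = c(73, 78, 60, 62, 66, 59, 85, 77, 93, 66),
  S3 = c(74, 74, 69, 55, 68, 77, 87, 73, 98, 66)
)

# One column of results per subject, one row per statistic.
stats <- sapply(marks, function(col) {
  c(total = sum(col), average = mean(col), maximum = max(col), minimum = min(col))
})

print(stats)
```

The result is a matrix, so a single value can be read as stats["average", "S2"].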
Program 7:

AIM: Write a program to create two 3 X 3 matrices A and B and perform the following
operations a) Transpose of the matrix b) addition c) subtraction.

Program:

# Create two 3x3 matrices.

A = matrix(c(1, 2, 3, 4, 5, 6,1,1,1), nrow = 3)

print("Matrix-1:")

print(A)

B = matrix(c(0, 1, 2, 3, 0, 2,2,2,2), nrow = 3)

print("Matrix-2:")

print(B)

print("Result of Transpose of matrix A")

r = t(A)

print(r)

print("Result of Transpose of matrix B")

r1 = t(B)

print(r1)

result = A %*% B

print("Result of matrix multiplication")

print(result)

result = A + B

print("Result of addition")

print(result)

result = A - B

print("Result of subtraction")

print(result)

Output:

[1] "Matrix-1:"
     [,1] [,2] [,3]
[1,]    1    4    1
[2,]    2    5    1
[3,]    3    6    1
[1] "Matrix-2:"
     [,1] [,2] [,3]
[1,]    0    3    2
[2,]    1    0    2
[3,]    2    2    2
[1] "Result of Transpose of matrix A"
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    1    1    1
[1] "Result of Transpose of matrix B"
     [,1] [,2] [,3]
[1,]    0    1    2
[2,]    3    0    2
[3,]    2    2    2
[1] "Result of matrix multiplication"
     [,1] [,2] [,3]
[1,]    6    5   12
[2,]    7    8   16
[3,]    8   11   20
[1] "Result of addition"
     [,1] [,2] [,3]
[1,]    1    7    3
[2,]    3    5    3
[3,]    5    8    3
[1] "Result of subtraction"
     [,1] [,2] [,3]
[1,]    1    1   -1
[2,]    1    5   -1
[3,]    1    4   -1
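
The program uses %*%, which is the matrix product. It is easy to confuse it with *, which multiplies the two matrices element by element. The short sketch below contrasts the two using the same A and B; the variable names matrix_product and elementwise_product are assumptions made for illustration.

```r
# Same matrices as in the program above (filled column by column).
A <- matrix(c(1, 2, 3, 4, 5, 6, 1, 1, 1), nrow = 3)
B <- matrix(c(0, 1, 2, 3, 0, 2, 2, 2, 2), nrow = 3)

# Matrix product: rows of A combined with columns of B.
matrix_product <- A %*% B

# Element-wise product: each entry of A times the matching entry of B.
elementwise_product <- A * B

print(matrix_product)
print(elementwise_product)

# Transposing twice gives back the original matrix.
print(identical(t(t(A)), A))
```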
Program 8:

AIM:Write an R program to create a list containing strings, numbers, vectors and logical
values and do the following manipulations over the list.
a. Access the first element in the list
b. Give the names to the elements in the list
c. Add element at some position in the list
d. Remove the element
e. Print the fourth element
f. Update the third element

A list is a collection of elements that may be of the same or of different types. In R, we use the list() function to create a list. Each element in a list is associated with a number, known as the list index. We can access elements of a list using the index number (1, 2, 3 …).

Accessing list elements

Elements of the list can be accessed by the index of the element in the list. In the case of named lists they can also be accessed using the names. listname[index] returns a sub-list holding the selected element, while listname[[index]] returns the element itself.

Update list element

The replace() function in R returns a copy of x in which the elements at the given indices are replaced by the given values.

Naming List Elements

Names can be given to list elements with names(), and the elements can then be accessed using the corresponding names.

Add element to the list

To add or append an element to a list in R, use the append() function. This function takes 3 parameters: the input list, the value or list you want to append, and the position after which to insert it.

The rlist package also provides list.append() for appending elements to a list.

Program:

list_data = list("R Program", "PHP", 4, c(5, 7, 9, 11), TRUE, 125.17, 75.83)

print("Data of the list:")

print(list_data)

# (a) Access the first element in the list

print(list_data[1])

# (b) Give the names to the elements in the list (one name per element)

names(list_data) <- c("language", "web development", "number", "vector", "logical", "float1", "float2")

print(list_data)

# (c) Add element at some position in the list

append(list_data, "hi", after = 2)

# (d) Remove the element

list_data[-2]

# (e) Print the fourth element

print(list_data[4])

# (f) Update the third element

replace(list_data, 3, 123)

Output:

> print("Data of the list:")
[1] "Data of the list:"

> print(list_data)
[[1]]
[1] "R Program"

[[2]]
[1] "PHP"

[[3]]
[1] 4

[[4]]
[1]  5  7  9 11

[[5]]
[1] TRUE

[[6]]
[1] 125.17

[[7]]
[1] 75.83

> # (a) Access the first element in the list
> print(list_data[1])
[[1]]
[1] "R Program"

> # (b) Give the names to the elements in the list (one name per element)
> names(list_data) <- c("language", "web development", "number", "vector", "logical", "float1", "float2")
> print(list_data)
$language
[1] "R Program"

$`web development`
[1] "PHP"

$number
[1] 4

$vector
[1]  5  7  9 11

$logical
[1] TRUE

$float1
[1] 125.17

$float2
[1] 75.83

> # (c) Add element at some position in the list
> append(list_data, "hi", after = 2)
$language
[1] "R Program"

$`web development`
[1] "PHP"

[[3]]
[1] "hi"

$number
[1] 4

$vector
[1]  5  7  9 11

$logical
[1] TRUE

$float1
[1] 125.17

$float2
[1] 75.83

> # (d) Remove the element
> list_data[-2]
$language
[1] "R Program"

$number
[1] 4

$vector
[1]  5  7  9 11

$logical
[1] TRUE

$float1
[1] 125.17

$float2
[1] 75.83

> # (e) Print the fourth element
> print(list_data[4])
$vector
[1]  5  7  9 11

> # (f) Update the third element
> replace(list_data, 3, 123)
$language
[1] "R Program"

$`web development`
[1] "PHP"

$number
[1] 123

$vector
[1]  5  7  9 11

$logical
[1] TRUE

$float1
[1] 125.17

$float2
[1] 75.83
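
Note that append(), negative indexing and replace() all return a modified copy, so the calls above leave list_data itself unchanged. To keep a change, the result has to be assigned back. The sketch below illustrates this on a small list; the variable name items and its contents are assumptions chosen for the example.

```r
# A small named list used only for this illustration.
items <- list(language = "R Program", web = "PHP", number = 4)

# Add: assign the result of append() back to the variable.
items <- append(items, list(greeting = "hi"), after = 2)

# Remove: assigning NULL to an element deletes it from the list.
items[["web"]] <- NULL

# Update: assign a new value to the element directly.
items[["number"]] <- 123

print(names(items))
print(items$number)
```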
Program 9:

AIM: Let us use the built-in dataset airquality, which has daily air quality measurements in New York, May to September 1973. Create a histogram by using appropriate arguments for the following statements.

a. Assigning names, using the air quality data set.

b. Change colors of the Histogram

c. Remove Axis and Add labels to Histogram

d. Change Axis limits of a Histogram

e. Create a Histogram with density and Add Density curve to the histogram

Program:

data = airquality

print("Original data: Daily air quality measurements in New York, May to September 1973.")

print(class(data))

print(head(data,10))

result = data[order(data[,1]),]

print("Order the entire data frame by the first column:")

print(result)

Output:

[1] "Original data: Daily air quality measurements in New York, May to September 1973."
[1] "data.frame"
   Ozone Solar.R Wind Temp Month Day
1     41     190  7.4   67     5   1
2     36     118  8.0   72     5   2
3     12     149 12.6   74     5   3
4     18     313 11.5   62     5   4
5     NA      NA 14.3   56     5   5
6     28      NA 14.9   66     5   6
7     23     299  8.6   65     5   7
8     19      99 13.8   59     5   8
9      8      19 20.1   61     5   9
10    NA     194  8.6   69     5  10

[1] "Order the entire data frame by the first column:"

Ozone Solar.R Wind Temp Month Day

21 1 8 9.7 59 5 21

23 4 25 9.7 61 5 23

18 6 78 18.4 57 5 18

11 7 NA 6.9 74 5 11

76 7 48 14.3 80 7 15

147 7 49 10.3 69 9 24

9 8 19 20.1 61 5 9

94 9 24 13.8 81 8 2

114 9 36 14.3 72 8 22

137 9 24 10.9 71 9 14

73 10 264 14.3 73 7 12

13 11 290 9.2 66 5 13

20 11 44 9.7 62 5 20

22 11 320 16.6 73 5 22

3 12 149 12.6 74 5 3

50 12 120 11.5 73 6 19

51 13 137 10.3 76 6 20

138 13 112 11.5 71 9 15


141 13 27 10.3 76 9 18

144 13 238 12.6 64 9 21

14 14 274 10.9 68 5 14

16 14 334 11.5 64 5 16

148 14 20 16.6 63 9 25

151 14 191 14.3 75 9 28

12 16 256 9.7 69 5 12

82 16 7 6.9 74 7 21

95 16 77 7.4 82 8 3

143 16 201 8.0 82 9 20

4 18 313 11.5 62 5 4

15 18 65 13.2 58 5 15

140 18 224 13.8 67 9 17

152 18 131 8.0 76 9 29

8 19 99 13.8 59 5 8

49 20 37 9.2 65 6 18

87 20 81 8.6 82 7 26

130 20 252 10.9 80 9 7

153 20 223 11.5 68 9 30

47 21 191 14.9 77 6 16

113 21 259 15.5 77 8 21

132 21 230 10.9 75 9 9

135 21 259 15.5 76 9 12

108 22 71 10.3 77 8 16

7 23 299 8.6 65 5 7
28 23 13 12.0 67 5 28

44 23 148 8.0 82 6 13

110 23 115 7.4 76 8 18

131 23 220 10.3 78 9 8

145 23 14 9.2 71 9 22

133 24 259 9.7 73 9 10

142 24 238 10.3 68 9 19

74 27 175 14.9 81 7 13

6 28 NA 14.9 66 5 6

105 28 273 11.5 82 8 13

136 28 238 6.3 77 9 13

38 29 127 9.7 82 6 7

19 30 322 11.5 68 5 19

149 30 193 6.9 70 9 26

111 31 244 10.9 78 8 19

24 32 92 12.0 61 5 24

64 32 236 9.2 81 7 3

129 32 92 15.5 84 9 6

17 34 307 12.0 66 5 17

78 35 274 10.3 82 7 17

97 35 NA 7.4 85 8 5

2 36 118 8.0 72 5 2

146 36 139 10.3 81 9 23

31 37 279 7.4 76 5 31

48 37 284 20.7 72 6 17
41 39 323 11.5 87 6 10

93 39 83 6.9 81 8 1

67 40 314 10.9 83 7 6

1 41 190 7.4 67 5 1

104 44 192 11.5 86 8 12

112 44 190 10.3 78 8 20

134 44 236 14.9 81 9 11

29 45 252 14.9 81 5 29

116 45 212 9.7 79 8 24

139 46 237 6.9 78 9 16

128 47 95 7.4 87 9 5

77 48 260 6.9 81 7 16

63 49 248 9.2 85 7 2

90 50 275 7.4 86 7 29

88 52 82 12.0 86 7 27

92 59 254 9.2 81 7 31

109 59 51 6.3 79 8 17

79 61 285 6.3 84 7 18

81 63 220 11.5 85 7 20

66 64 175 4.6 83 7 5

91 64 253 7.4 83 7 30

106 65 157 9.7 80 8 14

98 66 NA 4.6 87 8 6

40 71 291 13.8 90 6 9

118 73 215 8.0 86 8 26


126 73 183 2.8 93 9 3

120 76 203 9.7 97 8 28

68 77 276 5.1 88 7 7

96 78 NA 6.9 86 8 4

125 78 197 5.1 92 9 2

80 79 187 5.1 87 7 19

85 80 294 8.6 86 7 24

89 82 213 7.4 88 7 28

122 84 237 6.3 96 8 30

71 85 175 7.4 89 7 10

123 85 188 6.3 94 8 31

100 89 229 10.3 90 8 8

127 91 189 4.6 93 9 4

124 96 167 6.9 91 9 1

69 97 267 6.3 92 7 8

70 97 272 5.7 92 7 9

86 108 223 8.0 85 7 25

101 110 207 8.0 90 8 9

30 115 223 5.7 79 5 30

121 118 225 2.3 94 8 29

99 122 255 4.0 89 8 7

62 135 269 4.1 84 7 1

117 168 238 3.4 81 8 25

5 NA NA 14.3 56 5 5

10 NA 194 8.6 69 5 10
25 NA 66 16.6 57 5 25

26 NA 266 14.9 58 5 26

27 NA NA 8.0 57 5 27

32 NA 286 8.6 78 6 1

33 NA 287 9.7 74 6 2

34 NA 242 16.1 67 6 3

35 NA 186 9.2 84 6 4

36 NA 220 8.6 85 6 5

37 NA 264 14.3 79 6 6

39 NA 273 6.9 87 6 8

42 NA 259 10.9 93 6 11

43 NA 250 9.2 92 6 12

45 NA 332 13.8 80 6 14

46 NA 322 11.5 79 6 15

52 NA 150 6.3 77 6 21

53 NA 59 1.7 76 6 22

54 NA 91 4.6 76 6 23

55 NA 250 6.3 76 6 24

56 NA 135 8.0 75 6 25

57 NA 127 8.0 78 6 26

58 NA 47 10.3 73 6 27

59 NA 98 11.5 80 6 28

60 NA 31 14.9 77 6 29

61 NA 138 8.0 83 6 30

65 NA 101 10.9 84 7 4
72 NA 139 8.6 82 7 11

75 NA 291 14.9 91 7 14

83 NA 258 9.7 81 7 22

84 NA 295 11.5 82 7 23

102 NA 222 8.6 92 8 10

103 NA 137 11.5 86 8 11

107 NA 64 11.5 79 8 15

115 NA 255 12.6 75 8 23

119 NA 153 5.7 88 8 27

150 NA 145 13.2 77 9 27

# Basic histogram of the Ozone column
hist(airquality$Ozone)

# a. Assigning names, using the airquality data set
hist(airquality$Ozone, main = "Ozone in New York, May to September 1973", xlab = "Ozone")

# b. Change colors of the histogram
hist(airquality$Ozone, col = "blue", border = "white")

# c. Remove axis and add labels to histogram
h = hist(airquality$Ozone, plot = FALSE)
plot(h, xaxt = "n", xlab = "air", ylab = "pollution", main = "", col = "pink")
axis(1, at = h$mids, labels = LETTERS[seq_along(h$mids)], tick = FALSE, padj = -1.5)

# d. Change axis limits of a histogram
temperature = airquality$Temp
hist(temperature,
     main = "Maximum daily temperature at La Guardia Airport",
     xlab = "Temperature in degrees Fahrenheit",
     xlim = c(50,100),
     col = "darkmagenta",
     freq = TRUE)

# e. Create a histogram with density and add density curve to the histogram
hist(temperature,
     main = "Maximum daily temperature at La Guardia Airport",
     xlab = "Temperature in degrees Fahrenheit",
     xlim = c(50,100),
     col = "darkmagenta",
     freq = FALSE)
lines(density(temperature))
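
With freq = FALSE the bars show densities instead of counts, which means the total area of the bars is 1. The sketch below checks this numerically without drawing anything (plot = FALSE); the variable names h and bar_areas are assumptions made for the illustration.

```r
# Temp column of the built-in airquality data set (no missing values).
temperature <- airquality$Temp

# Compute the histogram without plotting it.
h <- hist(temperature, plot = FALSE)

# Area of each bar = density * bin width.
bar_areas <- h$density * diff(h$breaks)

print(sum(h$counts))   # number of observations
print(sum(bar_areas))  # total area under the density histogram
```

This is why a curve drawn with lines(density(temperature)) sits on the same vertical scale as a histogram drawn with freq = FALSE.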
Program 10:

AIM: Design a data frame in R for storing about 8 employee details. Create a CSV file
named “input.csv” that defines all the required information about the employee such
as id, name, salary, start_date, dept. Import into R and do the following analysis.
a. Find the total number rows & columns
b. Find the maximum salary
c. Retrieve the details of the employee with maximum salary
d. Retrieve all the employees working in the IT Department
e. Retrieve the employees in the IT Department whose salary is greater than 600
and write these details into another file “output.csv”.

Description:

Getting and Setting the Working Directory

You can check which directory the R workspace is pointing to using the getwd() function. You can also set a new working directory using the setwd() function.

Program:

# Get and print current working directory.

print(getwd())

# Set current working directory.

setwd("/web/com")

# Get and print current working directory.

print(getwd())

[1] "/web/com/1441086124_2016"

[1] "/web/com"

Input as CSV File

A CSV file is a text file in which the values in the columns are separated by commas. Let's consider the following data present in the file named input.csv.
You can create this file using Windows Notepad by copying and pasting this data. Save the file as input.csv using the Save As, All Files (*.*) option in Notepad.

id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
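
The aim also asks for the data frame to be designed in R. As a sketch of that step, the same eight records can be built with data.frame() and written out with write.csv(). The variable names employees and csv_path are assumptions, and the file is written to a temporary directory here so that nothing in the working directory is overwritten.

```r
# Employee records copied from the CSV data above.
employees <- data.frame(
  id = 1:8,
  name = c("Rick", "Dan", "Michelle", "Ryan", "Gary", "Nina", "Simon", "Guru"),
  salary = c(623.3, 515.2, 611, 729, 843.25, 578, 632.8, 722.5),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                 "2015-03-27", "2013-05-21", "2013-07-30", "2014-06-17"),
  dept = c("IT", "Operations", "IT", "HR", "Finance", "IT", "Operations", "Finance")
)

# Assumed location: a temporary directory instead of the working directory.
csv_path <- file.path(tempdir(), "input.csv")

# row.names = FALSE keeps the row numbers out of the file.
write.csv(employees, csv_path, row.names = FALSE)

# Read the file back to confirm its shape (rows, columns).
data <- read.csv(csv_path)
print(dim(data))
```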

Reading a CSV File

The following is a simple example of the read.csv() function reading a CSV file available in your current working directory:

data<- read.csv("input.csv")

print(data)

Output

  id     name salary start_date       dept
1  1     Rick 623.30 2012-01-01         IT
2  2      Dan 515.20 2013-09-23 Operations
3  3 Michelle 611.00 2014-11-15         IT
4  4     Ryan 729.00 2014-05-11         HR
5  5     Gary 843.25 2015-03-27    Finance
6  6     Nina 578.00 2013-05-21         IT
7  7    Simon 632.80 2013-07-30 Operations
8  8     Guru 722.50 2014-06-17    Finance
Analyzing the CSV File

By default the read.csv() function gives the output as a data frame. This can be easily checked as
follows. Also we can check the number of columns and rows.

data<- read.csv("input.csv")

print(is.data.frame(data))

print(ncol(data))

print(nrow(data))

output

[1] TRUE

[1] 5

[1] 8

Once we have read the data into a data frame, we can apply all the functions applicable to data frames, as shown in the subsequent sections.

Get the maximum salary

# Create a data frame.

data<- read.csv("input.csv")

# Get the max salary from data frame.

sal<- max(data$salary)

print(sal)

Output
[1] 843.25

Get the details of the person with max salary

We can fetch rows meeting specific filter criteria similar to a SQL where clause.
# Create a data frame.

data<- read.csv("input.csv")

# Get the max salary from data frame.

sal<- max(data$salary)

# Get the person detail having max salary.

retval<- subset(data, salary == max(salary))

print(retval)

Output
  id name salary start_date    dept
5  5 Gary 843.25 2015-03-27 Finance

Get all the people working in IT department

# Create a data frame.

data<- read.csv("input.csv")

retval<- subset( data, dept == "IT")

print(retval)

Output
  id     name salary start_date dept
1  1     Rick  623.3 2012-01-01   IT
3  3 Michelle  611.0 2014-11-15   IT
6  6     Nina  578.0 2013-05-21   IT

Get the persons in IT department whose salary is greater than 600

# Create a data frame.


data<- read.csv("input.csv")

info<- subset(data, salary > 600 & dept == "IT")

print(info)

Output
  id     name salary start_date dept
1  1     Rick  623.3 2012-01-01   IT
3  3 Michelle  611.0 2014-11-15   IT
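
Part (e) of the aim asks for exactly these rows to be written to “output.csv”. The sketch below shows that step in a self-contained form; it rebuilds the employee data in memory and writes to a temporary directory, and the names employees, info and out_path are assumptions made for the illustration.

```r
# Employee records copied from input.csv.
employees <- data.frame(
  id = 1:8,
  name = c("Rick", "Dan", "Michelle", "Ryan", "Gary", "Nina", "Simon", "Guru"),
  salary = c(623.3, 515.2, 611, 729, 843.25, 578, 632.8, 722.5),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                 "2015-03-27", "2013-05-21", "2013-07-30", "2014-06-17"),
  dept = c("IT", "Operations", "IT", "HR", "Finance", "IT", "Operations", "Finance")
)

# IT employees whose salary is greater than 600.
info <- subset(employees, dept == "IT" & salary > 600)

# Assumed location: a temporary directory instead of the working directory.
out_path <- file.path(tempdir(), "output.csv")

# row.names = FALSE avoids an extra column of row numbers in the file.
write.csv(info, out_path, row.names = FALSE)

saved <- read.csv(out_path)
print(saved)
```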

Get the people who joined after 1 January 2014

# Create a data frame.

data <- read.csv("input.csv")

retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))

print(retval)

Output
  id     name salary start_date    dept
3  3 Michelle 611.00 2014-11-15      IT
4  4     Ryan 729.00 2014-05-11      HR
5  5     Gary 843.25 2015-03-27 Finance
8  8     Guru 722.50 2014-06-17 Finance

Writing into a CSV File

R can create a CSV file from an existing data frame. The write.csv() function is used to create the CSV file. This file gets created in the working directory.

# Create a data frame.

data<- read.csv("input.csv")

retval<- subset(data, as.Date(start_date) >as.Date("2014-01-01"))


# Write filtered data into a new file.

write.csv(retval,"output.csv")

newdata<- read.csv("output.csv")

print(newdata)

Output
  X id     name salary start_date    dept
1 3  3 Michelle 611.00 2014-11-15      IT
2 4  4     Ryan 729.00 2014-05-11      HR
3 5  5     Gary 843.25 2015-03-27 Finance
4 8  8     Guru 722.50 2014-06-17 Finance

Here the column X holds the row names of retval, which write.csv() stores in the file by default. It can be dropped by passing an additional parameter while writing the file.

# Create a data frame.

data<- read.csv("input.csv")

retval<- subset(data, as.Date(start_date) >as.Date("2014-01-01"))

# Write filtered data into a new file.

write.csv(retval,"output.csv", row.names = FALSE)

newdata<- read.csv("output.csv")

print(newdata)

When we execute the above code, it produces the following result:

  id     name salary start_date    dept
1  3 Michelle 611.00 2014-11-15      IT
2  4     Ryan 729.00 2014-05-11      HR
3  5     Gary 843.25 2015-03-27 Finance
4  8     Guru 722.50 2014-06-17 Finance
