datatypes variables operators in R
13.0 INTRODUCTION
This unit covers the fundamental concepts of R programming. It familiarises
you with the R environment, in particular the global environment. It then
discusses the data types that are associated with every variable and that
determine the memory reserved for storing its values. The unit also describes
the data objects known as factors and the types of operators used in R
programming. It further explains the important elements of decision making,
the general form of a typical decision-making structure, loops and functions.
R’s basic data structures, including vectors, strings, lists, data frames,
matrices and arrays, are also discussed.
13.1 OBJECTIVES
After going through this Unit, you will be able to:
• explain and distinguish between the data types and assign them to
variables;
• explain the different types of operators and factors;
• explain the basics of decision making, and the structure and types of
loops;
• explain functions, their components and their types; and
• explain the data structures, including vectors, strings, lists, data frames,
matrices and arrays.
13.2 ENVIRONMENT OF R
The R programming language has been designed for the statistical analysis of
data. It also has very good support for the graphical representation of data
and a vast set of commands. In this Block, we will cover some of the essential
components of R programming that are useful for data analysis. We will not
cover all aspects of the language; you may refer to the further readings for
more details.
Basics of R Programming
The discussion on R programming will be in the context of RStudio, which is
open-source software. You may try the various commands listed in this unit to
facilitate your learning. The first important concept of R is its environment,
which is discussed next.
An environment can be thought of as a virtual space holding a collection of
objects (variables, functions, etc.). An environment is created when you first
start the R interpreter.
The top-level environment available at the R command prompt is the global
environment, known as R_GlobalEnv; it can also be referred to as .GlobalEnv.
You can use the ls() command to find out which variables and functions are
defined in the working environment. You can also check this in the Environment
pane of RStudio.
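As a brief illustration of the commands mentioned above, the following sketch defines two objects and then lists them. The names x and square are hypothetical and are used only for illustration.

```r
# Define a variable and a function in the current (working) environment
x <- 10
square <- function(n) n^2

# List the names of the objects defined in the working environment
objs <- ls()
print(objs)

# The top-level environment is named R_GlobalEnv
print(environmentName(globalenv()))

# .GlobalEnv refers to the same environment as globalenv()
print(identical(globalenv(), .GlobalEnv))
```

When run at the R prompt, the listing printed by ls() should include x and square along with any other objects already present in your session.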
Every variable in R has an associated data type, which determines the memory
reserved for storing its values. Given below is a list of basic data types
available in R programming:
Data Type    Allowable Values
Integer      Values from the set of integers, Z
Numeric      Values from the set of real numbers, R
Complex      Values from the set of complex numbers, C
Numeric Datatype:
Decimal values are called numeric in R. Numeric is the default data type for
any number in R.
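A minimal sketch of this default behaviour is given below; the variable names a and b are hypothetical.

```r
# A number typed without any suffix is stored as numeric (double precision)
a <- 5
b <- 5.75

class(a)       # "numeric"
class(b)       # "numeric"
is.numeric(a)  # TRUE
```

Note that a is numeric even though it holds a whole number.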
Integer Datatype:
R supports an integer data type. You can create an integer by suffixing a
number with “L”, or convert a value to an integer by passing it to the
as.integer() function.
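Both ways of obtaining an integer can be sketched as follows; the variable names n and m are hypothetical.

```r
# The L suffix creates an integer directly
n <- 25L
class(n)          # "integer"

# as.integer() converts a numeric value; the fractional part is dropped
m <- as.integer(3.9)
m                 # 3
class(m)          # "integer"

# Without the suffix, a whole number is still numeric, not integer
is.integer(25)    # FALSE
```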
Logical Datatype:
R has a logical data type, which takes the value TRUE or FALSE. It is
typically the result of comparing two values in a condition.
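For example, a comparison of two hypothetical variables p and q produces a logical value:

```r
p <- 10
q <- 20

# A comparison returns a logical value
result <- p > q
result          # FALSE
class(result)   # "logical"

# Logical values can also be assigned directly
flag <- TRUE
class(flag)     # "logical"
```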
Complex Datatype:
Complex data types are also supported in R. This data type covers the set of
all complex numbers.
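A short sketch of a complex value, using the hypothetical variable z:

```r
# The suffix i marks the imaginary part
z <- 3 + 4i

class(z)  # "complex"
Re(z)     # 3  (real part)
Im(z)     # 4  (imaginary part)
Mod(z)    # 5  (modulus)
```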
Character Datatype:
R supports a character data type, which includes letters and special
characters. A value of character type must be enclosed in single or double
quotation marks.
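The two quoting styles can be sketched as follows; the variable names s1 and s2 are hypothetical.

```r
# Character values may use double or single quotation marks
s1 <- "Hello, R!"
s2 <- 'single quotes work too'

class(s1)   # "character"
class(s2)   # "character"
nchar(s1)   # 9  (number of characters in the string)
```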
VARIABLES:
OPERATORS:
As is the case with other programming languages, R supports assignment,
arithmetic, relational and logical operators. The logical operators of R
include element-by-element operations. In addition, R supports several other
operators, as explained in this section.
Arithmetic Operators:
• Addition (+): The values at the corresponding positions in the vectors are
added. Please note the difference from C programming: a single operator
adds two complete vectors.
• Subtraction (-): The values at the corresponding positions are subtracted.
Once again, please note that a single operator subtracts the elements of
two vectors.
• Multiplication (*): The values at the corresponding positions are
multiplied.
• Division (/): The values at the corresponding positions are divided.
• Power (^): Each element of the first vector is raised to the exponent
(power) given by the corresponding element of the second.
• Modulo (%%): The remainder after dividing each element of the first vector
by the corresponding element of the second is returned.
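The element-by-element behaviour of the arithmetic operators listed above can be sketched with two small vectors. The names a and b and their values are hypothetical.

```r
a <- c(7, 8, 9)
b <- c(2, 3, 4)

add_res <- a + b    # 9 11 13
sub_res <- a - b    # 5 5 5
mul_res <- a * b    # 14 24 36
div_res <- a / b    # 3.5, 8/3 and 2.25
pow_res <- a ^ b    # 49 512 6561
mod_res <- a %% b   # 1 2 1

print(add_res)
print(mod_res)
```

Each result is a vector of the same length as the operands, which is why no explicit loop is needed.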
Logical Operators:
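As noted in the introduction to this section, the logical operators of R include element-by-element operations. A minimal sketch using the element-wise operators & (AND), | (OR) and ! (NOT) is given below; the vectors p and q are hypothetical.

```r
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)

and_res <- p & q   # TRUE FALSE FALSE
or_res  <- p | q   # TRUE TRUE TRUE
not_res <- !p      # FALSE TRUE FALSE

print(and_res)
print(or_res)
print(not_res)
```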
Relational Operators:
The relational operators can take scalar or vector operands. In the case of
vector operands, the comparison is done element by element and a vector of
TRUE/FALSE values is returned.
• Less than (<): Returns TRUE for each position where the element of the
first operand (scalar or vector) is less than the corresponding element of
the second operand, and FALSE otherwise.
• Less than or equal to (<=): Returns TRUE for each position where the
element of the first operand is less than or equal to the corresponding
element of the second operand, and FALSE otherwise.
• Greater than (>): Returns TRUE for each position where the element of the
first operand is greater than the corresponding element of the second
operand, and FALSE otherwise.
• Greater than or equal to (>=): Returns TRUE for each position where the
element of the first operand is greater than or equal to the corresponding
element of the second operand, and FALSE otherwise.
• Not equal to (!=): Returns TRUE for each position where the element of the
first operand is not equal to the corresponding element of the second
operand, and FALSE otherwise.
• Equal to (==): Returns TRUE for each position where the element of the
first operand is equal to the corresponding element of the second operand,
and FALSE otherwise.
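The element-by-element comparison described above can be sketched as follows. The vectors v1 and v2 are hypothetical.

```r
v1 <- c(2, 5, 8)
v2 <- c(3, 5, 6)

lt <- v1 < v2    # TRUE FALSE FALSE
le <- v1 <= v2   # TRUE TRUE FALSE
gt <- v1 > v2    # FALSE FALSE TRUE
ge <- v1 >= v2   # FALSE TRUE TRUE
ne <- v1 != v2   # TRUE FALSE TRUE
eq <- v1 == v2   # FALSE TRUE FALSE

print(lt)
print(eq)
```

Each operator returns one TRUE/FALSE value per position rather than a single value for the whole vector.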
Assignment Operators:
• Left Assignment (<- or <<- or =): Assigns the value on the right-hand side
to the variable on the left-hand side.
• Right Assignment (-> or ->>): Assigns the value on the left-hand side to
the variable on the right-hand side.
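The assignment operators listed above can be sketched as follows. All names (x, y, z, counter, increment) are hypothetical. The sketch also assumes the usual behaviour of <<-, which assigns to a variable in an enclosing environment rather than creating a local one.

```r
# Left assignment
x <- c(1, 2, 3)
y = c(4, 5, 6)

# Right assignment
c(7, 8, 9) -> z

# <<- updates a variable in an enclosing environment,
# here the one in which counter was first defined
counter <- 0
increment <- function() {
  counter <<- counter + 1
  invisible(counter)
}
increment()
increment()
print(counter)   # 2
```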
Miscellaneous Operators:
FACTORS:
Factors are data objects used for categorising data and storing it as levels.
They can store both strings and integer values. Factors are useful for columns
that have a limited number of unique values, also known as categorical
variables. They are useful in data analysis for statistical modelling. For
example, a categorical variable such as employment type (Unemployed,
Self-Employed, Salaried, Others) can be represented using a factor. More
details on factors can be obtained from the further readings.
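The employment-type example above can be sketched with the factor() function. The sample observations are hypothetical, and the levels are listed explicitly so that their order is predictable.

```r
# Hypothetical observations of employment type
employment <- c("Salaried", "Unemployed", "Self-Employed", "Salaried", "Others")

# Store them as a factor with an explicit set of levels
emp_factor <- factor(
  employment,
  levels = c("Unemployed", "Self-Employed", "Salaried", "Others")
)

levels(emp_factor)      # the four categories, in the order given above
nlevels(emp_factor)     # 4
as.integer(emp_factor)  # 3 1 2 3 4 (position of each value among the levels)
table(emp_factor)       # count of observations per level
```

Internally the factor stores integer codes, and the levels supply the labels for those codes.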
Check Your Progress 1
1. What are the various operators in R?
……………………………………………………………………………
……………………………………………………………………………
2. What does the %*% operator do?
……………………………………………………………………………
……………………………………………………………………………
3. Is .5Var a valid variable name? Give a reason in support of your
answer.
……………………………………………………………………………
……………………………………………………………………………
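[Figure: flowchart of a decision-making structure, showing the condition, the
conditional code executed if the condition is true, and the path taken if the
condition is false.]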
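A decision-making structure of this kind can be sketched in R with an if...else statement. The variable marks and the pass mark of 50 are hypothetical.

```r
marks <- 72   # hypothetical value

# The condition is tested once; only one of the two branches is executed
if (marks >= 50) {
  result <- "Pass"
} else {
  result <- "Fail"
}

print(result)   # "Pass"
```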
A loop is used when a block of code needs to be executed several times.
Within the loop body, the statements are executed sequentially.
[Figure: flowchart of a loop, showing the condition, the conditional code
executed if the condition is true, and the exit taken if the condition is
false.]
Example:
• For loop: Executes the loop body once for each element of a vector or
list, assigning each element in turn to the loop variable.
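A minimal sketch of a for loop that sums the elements of a vector is given below. The names values, total and v are hypothetical.

```r
values <- c(2, 4, 6)
total <- 0

# The body runs once for each element of values
for (v in values) {
  total <- total + v
}

print(total)   # 12
```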
Syntax:
Example:
FUNCTIONS:
A function is a set of instructions that is executed to achieve a task in R.
There are several built-in functions available in R. Users may also create
their own functions based on their requirements.
Definition:
A function can be defined as:
Function Components
• Function Name: The actual name of the function.
• Arguments: Values passed when the function is invoked. They are optional.
• Function Body: The statements that define the logic of the function.
• Return Value: The value of the last expression evaluated in the function
body.
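The components listed above can be identified in the following sketch of a user-defined function. The function name rectangle_area and its arguments len and width are hypothetical.

```r
# Function name: rectangle_area
# Arguments:     len, and width (which has a default value of 1)
rectangle_area <- function(len, width = 1) {
  # Function body: the statements that carry out the task
  area <- len * width
  # Return value: the last expression evaluated in the body
  area
}

rectangle_area(4, 5)   # 20
rectangle_area(4)      # 4, because width takes its default value
```

An explicit return(area) could be used instead of the final expression; the result is the same.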
Built-in functions: These are functions that are already written and can be
used just by calling the function name. Some examples are seq(), mean(),
min(), max(), sqrt() and paste().
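The built-in functions named above can be tried as follows. The vector vals is hypothetical.

```r
# seq() generates a sequence of numbers
odd <- seq(1, 10, by = 2)   # 1 3 5 7 9

vals <- c(4, 9, 16, 25)
mean(vals)   # 13.5
min(vals)    # 4
max(vals)    # 25
sqrt(vals)   # 2 3 4 5

# paste() joins strings, separated by a space by default
label <- paste("R", "programming")   # "R programming"
print(label)
```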
R’s basic data structures include Vectors, Strings, Lists, Data Frames,
Matrices and Arrays.
13.5.2 Lists
Lists are objects in R that can contain elements of different types, such as
numbers, strings, vectors, matrices, functions or even another list. A list is
created by calling the list() function.
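A minimal sketch of a list holding elements of different types is given below. The list name record and its contents are hypothetical.

```r
record <- list(
  id        = 101,                                   # a number
  name      = "Asha",                                # a string
  scores    = c(78, 85, 92),                         # a vector
  address   = list(city = "Delhi", pin = "110001"),  # another list
  summarise = mean                                   # a function
)

length(record)                    # 5
record$name                       # "Asha"
record[["scores"]][2]             # 85
record$address$city               # "Delhi"
record$summarise(record$scores)   # 85
```

Elements can be accessed by name with $ or by name or position with [[ ]].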
Matrix Manipulations:
Mathematical operations such as addition, subtraction, multiplication and
division can be performed on matrices. Please note that matrix division is not
defined mathematically; in R, each element of a matrix is divided by the
corresponding element of the other matrix.
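The element-wise operations described above can be sketched with two small matrices. The matrices A and B are hypothetical. The last line uses the %*% operator, which performs matrix multiplication in the linear-algebra sense, to contrast it with the element-wise * operator.

```r
# matrix() fills the values column by column
A <- matrix(c(2, 4, 6, 8), nrow = 2)   # rows: (2, 6) and (4, 8)
B <- matrix(c(1, 2, 3, 4), nrow = 2)   # rows: (1, 3) and (2, 4)

sum_AB  <- A + B   # element-wise addition
diff_AB <- A - B   # element-wise subtraction
prod_AB <- A * B   # element-wise multiplication
quot_AB <- A / B   # element-wise division; every entry is 2 here

# Matrix product, for comparison with the element-wise product
mat_AB <- A %*% B  # rows: (14, 30) and (20, 44)

print(quot_AB)
print(mat_AB)
```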
Arrays:
An array is a data object in R that can store multi-dimensional data of the
same data type. It is created using the array() function, which accepts
vectors as input and shapes them according to the values passed in the dim
parameter. For instance, if an array is created with dimensions (2,3,5), R
creates 5 rectangular matrices of 2 rows and 3 columns each. All the data
elements of the array are of the same data type.
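The (2,3,5) example above can be sketched as follows. The array name arr and the values 1 to 30 are hypothetical.

```r
# 30 values arranged as 5 matrices of 2 rows and 3 columns each
arr <- array(1:30, dim = c(2, 3, 5))

dim(arr)        # 2 3 5
arr[, , 1]      # the first 2 x 3 matrix (values 1 to 6)
arr[1, 1, 2]    # 7, the first element of the second matrix
arr[2, 3, 5]    # 30, the last element of the last matrix
```

Indexing uses one position per dimension: row, column and then matrix number.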
Dataframe:
A data frame represents a table, that is, a two-dimensional structure similar
to an array. It can be interpreted as a matrix in which each column can be of
a different data type.
The characteristics of a data frame are as follows:
• The names of the columns should not be left blank.
• The row names should be unique.
• The data frame can contain elements of numeric, factor or
character data type.
• Each column should contain the same number of data items.
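A data frame with these characteristics can be sketched using the data.frame() function. The data frame name employees and its contents are hypothetical.

```r
# Each column has a name, a single data type and the same number of items
employees <- data.frame(
  name     = c("Asha", "Ravi", "Meena"),
  age      = c(28, 35, 41),
  salaried = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE   # keep the name column as character
)

str(employees)             # structure: 3 observations of 3 variables
dim(employees)             # 3 3
sapply(employees, class)   # the data type of each column
```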
Specific data can be extracted from a data frame by specifying the column name.
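A minimal sketch of extraction by column name is given below. The data frame employees and its columns are hypothetical.

```r
employees <- data.frame(
  name = c("Asha", "Ravi", "Meena"),
  age  = c(28, 35, 41),
  stringsAsFactors = FALSE
)

# A single column, returned as a vector
ages <- employees$age
print(ages)                 # 28 35 41

# The same column, selected by name with [[ ]]
names_col <- employees[["name"]]
print(names_col)

# One or more columns, returned as a smaller data frame
subset_df <- employees[, c("name", "age")]
print(subset_df)
```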
A data frame can be expanded by adding an additional column.
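A minimal sketch of adding a column is given below. The data frame employees and the new salary column are hypothetical. The new column must have the same number of items as the existing ones.

```r
employees <- data.frame(
  name = c("Asha", "Ravi", "Meena"),
  age  = c(28, 35, 41),
  stringsAsFactors = FALSE
)

# Assigning to a new column name adds that column to the data frame
employees$salary <- c(52000, 61000, 70000)

names(employees)   # "name" "age" "salary"
ncol(employees)    # 3
print(employees)
```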
2. What are the different data structures in R? Briefly explain them.
…………………………………………………………………………………
…………………………………………………………………………………
13.6 SUMMARY
This unit introduces you to the basics of R programming. It explains the
environment of R, a virtual space holding a collection of objects, and how a
new environment can be created within the global environment. The unit also
explains the various data types associated with variables, which determine the
memory space allocated for storing values that can be manipulated. It gives
the details of the five types of operators in R programming and explains
factors, the data objects used for organising and storing data as levels. The
concept of decision making, which requires the programmer to specify one or
more conditions to be evaluated or tested by the program, is also discussed,
along with loops and their types. The unit describes functions in R, which are
sets of instructions executed to achieve a task. There are several built-in
functions available in R, and users may also create functions based on their
requirements. Matrices, arrays, data frames and other data structures are also
discussed.
13.7 ANSWERS
Check Your Progress 1
1. De Vries, A., & Meys, J. (2015). R for Dummies. John Wiley & Sons.
2. Peng, R. D. (2016). R Programming for Data Science (pp. 86-181). Victoria, BC,
Canada: Leanpub.
3. Schmuller, J. (2017). Statistical Analysis with R For Dummies. John Wiley & Sons.
4. Field, A., Miles, J., & Field, Z. (2012). Discovering Statistics Using R. Sage Publications.
5. Lander, J. P. (2014). R for Everyone: Advanced Analytics and Graphics. Pearson
Education.
6. Lantz, B. (2019). Machine Learning with R: Expert Techniques for Predictive Modeling.
Packt Publishing Ltd.
7. Heumann, C., & Schomaker, M. (2016). Introduction to Statistics and Data Analysis.
Springer International Publishing Switzerland.
8. Davies, T. M. (2016). The Book of R: A First Course in Programming and Statistics. No
Starch Press.
9. https://www.tutorialspoint.com/r/index.html