MGN801 Ca1
Learning Outcomes: (Student to write briefly about learnings obtained from the academic tasks)
Declaration:
I declare that this Assignment is my individual work. I have not copied it from any other student’s work or from any
other source except where due acknowledgement is made explicitly in the text, nor has any part been written for me by
any other person.
Student’s Signature:
ggplot(data4, aes(x = car_name, y = engine_displacement_z, label = engine_displacement_z)) +
  geom_bar(stat = "identity", aes(fill = engine_displacement_type), width = 0.5) +
  scale_fill_manual(name = "Engine Performance",
                    labels = c("Above Average", "Below Average"),
                    values = c("above" = "blue", "below" = "red")) +
  labs(subtitle = "Engine Performance w.r.t. Average",
       title = "Graphical Representation") +
  coord_flip()
PLOT
Interpreting:
The graphical representation shown above is a diverging bar chart, which is based on z-score normalisation.
A z-score expresses each value as the number of standard deviations it lies above or below the mean, so
every bar starts at the average (zero) and extends to one side or the other. Hence, the data is easy to
interpret at a glance. In the above plot, many cars have below-average “Engine Displacement” values, shown
in red, while only a few cars, shown in blue, have above-average values. Because the axes are flipped with
coord_flip(), the horizontal axis shows the normalised engine displacement, indicated as
“engine_displacement_z”, and the vertical axis shows the car names. The entire CARS_1 data set contains 203
cars; however, for simplicity and visibility, the data has been sliced so that only 25 cars are selected,
and that data of 25 cars is saved as “data4”.
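The z-score columns used in the plot could be derived along the following lines. This is a minimal sketch in base R, assuming a small made-up data frame (`toy`) in place of data4; the car names, the displacement values and the rounding to two decimals are all assumptions for illustration, not the steps actually used to build data4.

```r
# Hypothetical stand-in for data4; names and values are made up for illustration
toy <- data.frame(
  car_name = c("Car A", "Car B", "Car C", "Car D", "Car E"),
  engine_displacement = c(1000, 1200, 1500, 2000, 3300)
)

# z-score: distance from the mean, measured in standard deviations
m <- mean(toy$engine_displacement)
s <- sd(toy$engine_displacement)
toy$engine_displacement_z <- round((toy$engine_displacement - m) / s, 2)

# Flag each car as above or below the average displacement
toy$engine_displacement_type <- ifelse(toy$engine_displacement_z < 0, "below", "above")

# Sort the rows by z-score, then fix the factor levels to that order,
# because ggplot2 orders bars by factor level rather than by row order
toy <- toy[order(toy$engine_displacement_z), ]
toy$car_name <- factor(toy$car_name, levels = toy$car_name)
toy
```

With the factor levels fixed this way, the bars would run from the most below-average car to the most above-average car, which is what gives a diverging bar chart its characteristic shape.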
3. Ranking
RSCRIPT
library(ggplot2)
library(ggcorrplot)
library(stringi)
library(dplyr)
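CARS_1  # print the full data set
# Keep the first 20 cars so that the axis labels stay readable
data5 = CARS_1 %>% slice(1:20)
# One bar per car; the bar height is the fuel tank capacity
data1barplot = ggplot(data5, aes(x = car_name, y = fuel_tank_capacity)) +
  geom_bar(stat = "identity", color = "red", fill = c(1:20))
data1barplot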
PLOT
Interpreting:
The graph shown above is a bar graph and is used for ranking the cars in the data set. The first 20 rows
have been sliced from the CARS_1 data for visibility, and the result has been saved as “data5”. On the y
axis, we plot “Fuel Tank Capacity”, whereas on the x axis we plot the 20 car names. By default the bars are
drawn in alphabetical order of car name, so the ranking is read from the bar heights. From the graph, it can
be inferred that “Toyota Fortuner” has the largest tank capacity, followed by Mahindra XUV700, Mahindra
Bolero and so on. Thus, this bar plot ranks the vehicles on the basis of their fuel tank capacity.
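A ranking of this kind can also be made explicit in the data and in the bar order. The following is a minimal sketch, assuming a made-up data frame (`toy`) in place of data5; the car names and capacities are illustrative only, and the plotting step runs only if ggplot2 is installed.

```r
# Hypothetical stand-in for data5; values are made up for illustration
toy <- data.frame(
  car_name = c("Car A", "Car B", "Car C", "Car D"),
  fuel_tank_capacity = c(35, 80, 45, 60)
)

# Rank 1 = largest tank; tied cars would share the smallest applicable rank
toy$rank <- rank(-toy$fuel_tank_capacity, ties.method = "min")

# Order the rows from largest to smallest capacity
ranked <- toy[order(toy$rank), ]
ranked

# Optional plot: reorder() sorts the bars by capacity instead of alphabetically
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ranked, aes(x = reorder(car_name, -fuel_tank_capacity),
                          y = fuel_tank_capacity)) +
    geom_bar(stat = "identity", fill = "steelblue") +
    labs(x = "car_name")
  print(p)
}
```

Sorting the bars this way lets the reader see the ranking directly from left to right, without comparing individual bar heights.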
4. Distribution
RSCRIPT
CARS_1
hist(CARS_1$fuel_tank_capacity, xlab = "Fuel Tank Capacity",
     main = "Frequency distribution according to Fuel Tank Capacity", col = c(1:10))
Interpreting:
A histogram is an important graph for showing a frequency distribution. In this case, we are counting the
number of cars whose fuel tank capacity falls in the intervals 0-10, 10-20, 20-30, 30-40, 40-50, 50-60,
60-70, 70-80, 80-90 and 90-100. Hence, we see frequency on the y axis and fuel tank capacity on the x axis.
The highest frequency is seen in the 0-10 group, indicating that more than 40 cars have a fuel tank capacity
less than or equal to 10. Next, we can see that almost 38 cars have fuel tank capacities between 50 and 60.
Similarly, we see that no cars have fuel tank capacities between 10 and 20, and that the 90-100 group has
the fewest cars among the non-empty groups, around 4 or 5.
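The interval counts can be read exactly rather than estimated from the bar heights. The sketch below assumes a made-up vector (`capacity`) in place of CARS_1$fuel_tank_capacity and fixes the breaks at multiples of 10; note that hist() treats intervals as right-closed by default, so a value of exactly 50 is counted in the 40-50 group.

```r
# Hypothetical capacities, made up for illustration (not taken from CARS_1)
capacity <- c(0, 0, 0, 5, 35, 37, 42, 45, 50, 52, 55, 60, 66, 80, 93)

# Fixed 10-wide bins covering 0-100; include.lowest = TRUE puts 0 in the first bin
h <- hist(capacity, breaks = seq(0, 100, by = 10), include.lowest = TRUE, plot = FALSE)

# One row per interval with its frequency
bins <- data.frame(
  lower = head(h$breaks, -1),
  upper = tail(h$breaks, -1),
  count = h$counts
)
bins

# Interval with the highest frequency
bins[which.max(bins$count), ]
```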
5. Composition
RSCRIPT
library(ggplot2)
library(stringi)
library(dplyr)
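# Count the number of cars for each body type
bodytypes = CARS_1 %>% group_by(body_type) %>% summarise(count = n())
bodytypes
# Stacked bar of the counts, wrapped into a pie chart with polar coordinates
ggplot(bodytypes, aes(x = "", y = count, fill = body_type)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y", start = 0)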
PLOT
Interpreting:
Pie or doughnut charts are used for analysing qualitative data, and thus the variable used for this plot is
“body_type”, which is a character vector. Among the 203 cars present in the data set CARS_1, there are cars
of various body types, such as SUV, sedan, hatchback and many more. These body types are shown in the pie
chart above. The larger the area of a specific colour, the greater the share of that body type in the data
set. We can see that violet is the colour that covers the largest portion. According to the legend on the
right side, violet represents SUVs; thus it can be concluded that most of the vehicles in the data set are
SUVs. The second largest body type is sedan, which is represented by purple. The smallest shares in this
data set are the green and deep blue slices, which represent luxury cars and pickup trucks respectively.
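Slice sizes are hard to compare by eye, so it can help to tabulate the shares as percentages. The following is a minimal sketch in base R, assuming a made-up vector (`body_type`) in place of CARS_1$body_type; the categories and their counts are illustrative only.

```r
# Hypothetical body types, made up for illustration (not taken from CARS_1)
body_type <- c("SUV", "SUV", "SUV", "SUV", "Sedan", "Sedan", "Sedan",
               "Hatchback", "Hatchback", "Pickup Truck")

# Frequency of each category and its share of the total, in percent
counts <- table(body_type)
shares <- round(100 * prop.table(counts), 1)

composition <- data.frame(
  body_type = names(counts),
  count = as.vector(counts),
  percent = as.vector(shares)
)

# Largest share first
composition <- composition[order(-composition$count), ]
composition
```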
6. Change
RSCRIPT
library(ggplot2)
CARS_1
ggplot(CARS_1,aes(x=engine_displacement,y=no_cylinder))+
geom_line(color="Red")
PLOT
Interpreting:
A change plot shows how a variable evolves, and it is typically used to depict a trend over time. In this
graph, however, we plot the number of cylinders against engine displacement rather than against time. We can
see a trend that starts at the bottom left of the graph and continues to the top right. The line appears to
rise at roughly 25-35 degrees, although this visual angle depends on the scales of the two axes, and the
values that fall far away from the line can be regarded as outliers.
In this graph, we see that as engine displacement increases, the number of cylinders also increases. In
other words, there is a positive association between the number of cylinders and engine displacement.
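The strength of such an association can be quantified instead of judged from the slope of the drawn line. The sketch below assumes a made-up data frame (`toy`) with the same column names as CARS_1; the values are illustrative only, and the cut-off of 2 for standardised residuals is a common rule of thumb rather than a fixed definition of an outlier.

```r
# Hypothetical values, made up for illustration (not taken from CARS_1)
toy <- data.frame(
  engine_displacement = c(1000, 1200, 1500, 2000, 3000, 4000),
  no_cylinder = c(3, 4, 4, 4, 6, 8)
)

# Pearson correlation: values near +1 indicate a strong positive association
r <- cor(toy$engine_displacement, toy$no_cylinder)

# Least-squares line: the slope is the average change in cylinders
# per unit of engine displacement
fit <- lm(no_cylinder ~ engine_displacement, data = toy)
slope <- coef(fit)[["engine_displacement"]]

# Rows whose standardised residual exceeds 2 in absolute value
outliers <- toy[abs(rstandard(fit)) > 2, ]

r
slope
outliers
```

Unlike the visual angle of the line, the correlation coefficient does not change when the axes are rescaled, which makes it a more reliable summary of the trend.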