
MGN801 Ca1

1. The document describes an academic task submission for a business analytics course. It includes the course code, instructor, task details, student information, evaluation parameters, and spaces for the student's learning outcomes, declaration, and the evaluator's comments. 2. The student is asked to write briefly about their learnings from completing the academic task. 3. The document provides a framework for submitting and grading an assignment, with sections for student work, declaration, and evaluator feedback.

Uploaded by

Atul Kumar

Course Code: MGN801 Course Title: Business Analytics-I

Course Instructor: Dr. Avinash Rana

Academic Task No.: 1 Academic Task Title: CA-1

Date of Allotment: 08/27/2022 Date of submission: 09/17/2022

Student’s Roll no: RQ1969B37 Student’s Reg. no: 11903267


Evaluation Parameters: As mentioned in Rubrics

Learning Outcomes: (Student to write briefly about learnings obtained from the academic tasks)

Declaration:
I declare that this Assignment is my individual work. I have not copied it from any other student’s work or from any
other source except where due acknowledgement is made explicitly in the text, nor has any part been written for me by
any other person.
Student’s Signature:

Evaluator’s comments (For Instructor’s use only)

General Observations Suggestions for Improvement Best part of assignment

Evaluator’s Signature and Date:


Marks Obtained: _______________ Max. Marks: ______________
1. Correlation
RSCRIPT
library(ggplot2)
library(ggcorrplot)
library(stringi)
library(dplyr)
CARS_1
str(CARS_1)
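# Keep only the numeric columns; character columns cannot be correlated
data1=CARS_1 %>%
select(engine_displacement,no_cylinder,seating_capacity,fuel_tank_capacity,rating,
max_torque_nm,max_torque_rpm,max_power_bhp,max_power_rp)
data1
# Pairwise correlation matrix of the selected columns
cordata1=cor(data1)
cordata1
# Heatmap of the matrix, with the coefficient printed in each square
corplot=ggcorrplot(corr = cordata1,lab = TRUE,type = "full",method = "square")
corplot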
PLOT
Interpreting:
The data set comes from Kaggle and describes various car models, their specifications (no. of cylinders,
fuel tank capacity and many more) and their prices. It contains 203 rows and 16 columns or variables.
The plot shows the correlation of the numeric properties of the cars with one another. According to the
legend, the deeper the red colour, the stronger the positive correlation, and the deeper the blue colour,
the stronger the negative correlation. The number written in each box gives the exact value of the
coefficient, which always lies between -1 and 1. The plot is read one pair of variables at a time. For
example, engine displacement and no. of cylinders share a bright red box with a value of 0.95, which shows
a strong positive relation between them. Fuel tank capacity and max power bhp, on the other hand, show a
very weak positive correlation of 0.15, indicated by the light red colour. The correlation of any variable
with itself, such as number of cylinders with number of cylinders or seating capacity with seating
capacity, is always 1.
CARS_1 also contains “character vector” columns, which cannot be used for correlation. Only the numeric
columns were therefore selected, and that data set was named data1.
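As a hedged illustration of the filtering step described above, the numeric columns can also be picked programmatically instead of being listed by name. The sketch below uses a small made-up data frame, toy_cars, as a stand-in for CARS_1; its values are hypothetical and only the column names are borrowed from the script.

```r
# Hypothetical toy data standing in for CARS_1 (values are made up)
toy_cars <- data.frame(
  car_name = c("A", "B", "C", "D", "E"),
  engine_displacement = c(1000, 1200, 1500, 2000, 3000),
  no_cylinder = c(3, 4, 4, 4, 6),
  fuel_tank_capacity = c(35, 37, 45, 50, 80),
  stringsAsFactors = FALSE
)

# Flag the numeric columns; car_name is character and is dropped
numeric_cols <- sapply(toy_cars, is.numeric)
toy_numeric <- toy_cars[, numeric_cols]

# Pearson correlation matrix; the diagonal is 1 by construction
toy_cor <- cor(toy_numeric)
print(round(toy_cor, 2))
```

The resulting matrix could be passed to ggcorrplot() in the same way as cordata1.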
2. Deviation
RSCRIPT
library(ggplot2)
library(ggcorrplot)
library(stringi)
library(dplyr)
CARS_1
# Work on the first 25 cars only, to keep the plot readable
data4=CARS_1 %>% slice(1:25)
# Z-score: number of standard deviations each car lies from the mean of these 25 cars
data4$engine_displacement_z=round((data4$engine_displacement -
mean(data4$engine_displacement))/sd(data4$engine_displacement), 2)
data4$engine_displacement_z
# Label each car as below or above the average
data4$engine_displacement_type=ifelse(data4$engine_displacement_z < 0, "below", "above")
data4$engine_displacement_type
# Sort by z-score and fix the factor order so the bars are drawn in sorted order
data4=data4[order(data4$engine_displacement_z),]
data4$car_name=factor(data4$car_name,levels = unique(data4$car_name))

ggplot(data4,aes(x=car_name,y=engine_displacement_z,label=engine_displacement_z))+
geom_bar(stat = "identity",aes(fill = engine_displacement_type),width = 0.5)+
scale_fill_manual(name="Engine Performance",
labels = c("Above Average","Below Average"),
values = c("above"="Blue","below"="Red"))+
labs(subtitle = "Engine Performance w.r.t Average",
title = "Graphical Representation")+
coord_flip()
PLOT

Interpreting:
The graphical representation above is a diverging bar chart. It is based on z-score standardisation: each
value is expressed as the number of standard deviations it lies above or below the mean, so the bars
diverge from zero in both directions. This rescales the values; it does not remove outliers. The result
is easy to interpret at a glance. In the plot, many cars have a below-average “Engine Displacement”
(red bars), while a few cars, indicated in blue, are above average. Because of coord_flip(), the
standardised engine displacement, “engine_displacement_z”, runs along the horizontal axis and the car
names along the vertical axis. The full CARS_1 data set contains 203 cars; for simplicity and visibility,
only the first 25 cars were selected and saved as “data4”, so the mean and standard deviation refer to
these 25 cars.
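The z-score calculation behind the diverging bars can be checked on a handful of numbers. This is a minimal sketch with made-up displacement values (hypothetical, not taken from CARS_1); it shows that the standardised values have mean 0 and standard deviation 1, and that the sign decides the "below"/"above" label.

```r
# Hypothetical engine displacements (cc); values are made up for illustration
displacement <- c(1000, 1200, 1500, 2000, 3000)

# z = (value - mean) / standard deviation
z <- (displacement - mean(displacement)) / sd(displacement)

# Negative z means below the sample mean, otherwise above
z_type <- ifelse(z < 0, "below", "above")

print(data.frame(displacement, z = round(z, 2), z_type))
print(c(mean_z = mean(z), sd_z = sd(z)))
```

Base R's scale() performs the same centring and scaling in one call.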
3. Ranking
RSCRIPT
library(ggplot2)
library(ggcorrplot)
library(stringi)
library(dplyr)
CARS_1
data5=CARS_1 %>% slice(1:20)
data1barplot=ggplot(data5,aes(x=car_name, y=fuel_tank_capacity)) +
geom_bar(stat="identity", color = "red",fill=c(1:20))
data1barplot

PLOT

Interpreting:
The graph above is a bar chart and is used for ranking cars in the data set. The first 20 rows were
sliced from the CARS_1 data for visibility and saved as “data5”. The y axis shows “Fuel Tank Capacity”
and the x axis shows the 20 car names. Comparing the bar heights, it can be inferred that “Toyota
Fortuner” has the largest tank capacity, followed by Mahindra XUV700, Mahindra Bolero and so on. Thus,
the vehicles can be ranked on the basis of their Fuel Tank Capacity.
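By default a bar chart places car names in alphabetical order, so the ranking has to be read from the bar heights. One possible way to make the ranking explicit is to sort the rows or reorder the factor levels by capacity. The sketch below uses a made-up data frame (car names and capacities are hypothetical) and base R only.

```r
# Hypothetical cars and capacities (litres); values are made up
toy <- data.frame(
  car_name = c("Car A", "Car B", "Car C", "Car D"),
  fuel_tank_capacity = c(45, 80, 35, 60),
  stringsAsFactors = FALSE
)

# Explicit ranking table: largest capacity first
ranked <- toy[order(-toy$fuel_tank_capacity), ]
ranked$rank <- seq_len(nrow(ranked))
print(ranked)

# Factor whose levels are ordered by capacity (ascending); used as the
# x aesthetic, it would make the bars appear in ranked order
toy$car_name_ordered <- reorder(toy$car_name, toy$fuel_tank_capacity)
print(levels(toy$car_name_ordered))
```

In the ggplot2 call, the same effect would come from writing x=reorder(car_name, fuel_tank_capacity) inside aes().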
4. Distribution
RSCRIPT
CARS_1
hist(CARS_1$fuel_tank_capacity,xlab="Fuel Tank Capacity",
main="Frequency distribution according to Fuel Tank Capacity",col = c(1:10))

Interpreting:
A histogram is an important graph for showing a frequency distribution. In this case, we are counting the
number of cars whose fuel tank capacity falls in the intervals 0-10, 10-20, 20-30, 30-40, 40-50, 50-60,
60-70, 70-80, 80-90 and 90-100. Hence, we see frequency on the y axis and fuel tank capacity on the x
axis. The highest frequency is seen in the 0-10 group, indicating that more than 40 cars have a fuel tank
capacity less than or equal to 10. Next, almost 38 cars have fuel tank capacities between 50 and 60. No
cars have fuel tank capacities between 10 and 20, and the 90-100 group holds the fewest cars of the
occupied groups, around 4 or 5.
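The bin counts read off the histogram can also be obtained as numbers. The sketch below is an illustration on made-up capacities (hypothetical values, not CARS_1); it calls hist() with explicit breaks and plot = FALSE. With the default settings each bin includes its upper edge, and the first bin also includes its lower edge, which is why a value of exactly 10 is counted in 0-10.

```r
# Hypothetical fuel tank capacities; values are made up
capacity <- c(0, 0, 10, 35, 37, 45, 50, 55, 60, 100)

# Same 10-wide bins as in the text, computed without drawing the plot
h <- hist(capacity, breaks = seq(0, 100, by = 10), plot = FALSE)

# One row per bin: lower edge, upper edge, number of values
bin_table <- data.frame(
  lower = head(h$breaks, -1),
  upper = tail(h$breaks, -1),
  count = h$counts
)
print(bin_table)
```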
5. Composition
RSCRIPT
library(ggplot2)
library(stringi)
library(dplyr)
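# Count the cars of each body type
bodytypes=CARS_1 %>% group_by(body_type) %>% summarise(count = n())
bodytypes
# Stacked bar of the counts, bent into a pie with polar coordinates
ggplot(bodytypes,aes(x="",y=count,fill=body_type))+
geom_bar(stat = "identity",width = 1)+
coord_polar("y",start = 0)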

PLOT

Interpreting:
Pie or doughnut charts are used for analysing qualitative data, and thus the vector used for this plot is
“body_type”, which is a character vector. The 203 cars in the data set CARS_1 have various body types
such as SUV, sedan, hatchback and many more. These body types are shown in the pie chart above. The
larger the area of a specific colour, the greater the share of that body type in the data set. We can see
that violet covers the largest portion. According to the legend on the right side, violet represents
SUVs, so it can be concluded that most of the vehicles in the data set are SUVs. The second largest body
type is sedan, which is represented by purple. The smallest segments in this data set are the green and
deep blue ones, which represent luxury cars and pickup trucks respectively.
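The shares that the pie chart shows as areas can also be computed directly. This is a minimal sketch on a made-up body_type vector (hypothetical values), using table() for the counts and prop.table() for the proportions.

```r
# Hypothetical body types; values are made up
body_type <- c("SUV", "SUV", "SUV", "Sedan", "Sedan", "Hatchback")

# Number of cars per body type
counts <- table(body_type)

# Share of each body type in percent
shares <- round(100 * prop.table(counts), 1)

# Largest segment first, matching how the pie chart is read
print(sort(shares, decreasing = TRUE))
```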
6. Change
RSCRIPT
library(ggplot2)
CARS_1
ggplot(CARS_1,aes(x=engine_displacement,y=no_cylinder))+
geom_line(color="Red")

PLOT

Interpreting:
Change means evolution, so a change plot is normally used to depict a trend over time. In this graph,
however, we are comparing no. of cylinders against engine displacement. We can see a trend that starts at
the bottom left of the graph and continues to the top right. As drawn, the line seems to rise at roughly
25-35 degrees, and the values that fall far away from the trend can be referred to as outliers.
In this graph, we see that as engine displacement increases, the number of cylinders also increases. In
other words, there is a positive association between number of cylinders and engine displacement.
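The angle of a line on screen depends on how the axes are scaled, so the trend is more safely quantified with a fitted slope or a correlation coefficient. The sketch below is one possible way to do that, using lm() and cor() on made-up values (hypothetical, not CARS_1).

```r
# Hypothetical displacement (cc) and cylinder counts; values are made up
displacement <- c(1000, 1200, 1500, 2000, 3000)
cylinders <- c(3, 4, 4, 4, 6)

# Straight-line fit: cylinders as a function of displacement
fit <- lm(cylinders ~ displacement)

# Slope = expected change in cylinders per extra cc of displacement
slope <- coef(fit)[["displacement"]]

# Strength and direction of the linear association
r <- cor(displacement, cylinders)

print(c(slope = slope, correlation = r))
```

A positive slope and a positive correlation both correspond to the bottom-left to top-right trend described above.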
