Business Analytics and Intelligence Assignment

An assignment done on Business Analytics and Intelligence

Uploaded by

Ayush Srivastava

BUSINESS ANALYTICS AND INTELLIGENCE

ASSIGNMENT SUMS

TEAM – 4

SUBMITTED BY

AYUSH SRIVASTAVA (21MBA1015)


NAVANEETHA KRISHNAN P N (21MBA1018)
RITESH KUMAR (21MBA1033)
SNEHAL (21MBA1043)
PRAVEENA D (21MBA1050)

Under the Guidance of

DR. SUDARSANAM K

Trimester 3
In partial fulfillment for the award of the
Degree of
MASTER OF BUSINESS ADMINISTRATION (VITBS)
VELLORE INSTITUTE OF TECHNOLOGY
CHENNAI – 600127

MAY 2022
DA1-121

4. The cost to company (CTC) of 50 IT professionals, measured in lakhs of rupees, is shown in
Table 4.7.
TABLE 4.7 CTC (in lakhs of rupees)

21.38 12.24 29.06 12.37 8.48 18.76 23.8 28.48 9.56 35.94
28.76 30.76 37.67 34.15 32.53 26.64 24.25 39.66 8.98 26.17
40.54 27.66 18.83 12.87 22.12 28.07 27.15 12.06 5.66 8.44
4.85 11.72 15.18 6.44 28.94 17.71 31.5 26.91 33.93 14.5
38.14 30.87 27.29 6.77 18.43 28.9 22.33 31.41 37.07 32.6

(a) Draw a histogram. Comment on the distribution of CTC using skewness and kurtosis.
(b) Generate 500 random samples of size 10 and plot the histogram of sampling distribution.
(c) What is the mean and standard deviation of the sampling distribution obtained in (b)?
How far is this mean from the mean CTC of values provided in Table 4.7?

Solution:

(a). Histogram of the given dataset

Skewness and Kurtosis of the given dataset

Kurtosis -1.10321
Skewness -0.23672

The skewness of the given dataset is -0.23672. The negative value indicates that the
distribution is slightly left-tailed (negatively skewed).
The kurtosis is -1.10. The negative excess kurtosis indicates a platykurtic distribution, that
is, one that is flatter and lighter-tailed than the normal distribution.
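
The two statistics can be recomputed from Table 4.7. The sketch below is a minimal illustration that assumes the bias-adjusted sample formulas used by spreadsheet functions such as Excel's SKEW and KURT; the function names are illustrative and the original solution does not state which tool produced its figures.

```python
import math

# CTC values (lakhs of rupees) transcribed from Table 4.7.
ctc = [
    21.38, 12.24, 29.06, 12.37, 8.48, 18.76, 23.8, 28.48, 9.56, 35.94,
    28.76, 30.76, 37.67, 34.15, 32.53, 26.64, 24.25, 39.66, 8.98, 26.17,
    40.54, 27.66, 18.83, 12.87, 22.12, 28.07, 27.15, 12.06, 5.66, 8.44,
    4.85, 11.72, 15.18, 6.44, 28.94, 17.71, 31.5, 26.91, 33.93, 14.5,
    38.14, 30.87, 27.29, 6.77, 18.43, 28.9, 22.33, 31.41, 37.07, 32.6,
]


def _mean_and_sd(values):
    """Return the mean and the sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return mean, sd


def sample_skewness(values):
    """Bias-adjusted sample skewness (assumed to match Excel's SKEW)."""
    n = len(values)
    mean, sd = _mean_and_sd(values)
    cubes = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubes


def sample_excess_kurtosis(values):
    """Bias-adjusted excess kurtosis (assumed to match Excel's KURT)."""
    n = len(values)
    mean, sd = _mean_and_sd(values)
    fourths = sum(((x - mean) / sd) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourths - correction


print("Skewness:", round(sample_skewness(ctc), 5))
print("Excess kurtosis:", round(sample_excess_kurtosis(ctc), 5))
```

A negative skewness and a negative excess kurtosis from this sketch would agree in sign with the values reported for the dataset.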

(b).

Skewness 0.050575
Kurtosis -1.1073

The skewness of the generated random samples is 0.05. The small positive value indicates a
very slight right tail, so the distribution is close to symmetric.
The kurtosis is approximately the same as that of the given dataset, so the generated
distribution is likewise flatter and lighter-tailed than the normal distribution.

(c). Mean and standard deviation obtained in (b)

Mean and standard deviation of the dataset in the question:

Standard Deviation 10.24775
Mean 23.1698

Mean and standard deviation of the random samples:

Standard Deviation 13.7857
Mean 25.022

Difference in mean = 25.022 - 23.1698 = 1.8522
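
Parts (b) and (c) can also be simulated directly. The sketch below is one possible approach, not the procedure used in the solution above: it assumes each of the 500 samples is drawn with replacement from Table 4.7 and that the sampling distribution is the distribution of the 500 sample means. The seed and function name are illustrative, so the output need not match the figures reported above. Under these assumptions the standard deviation of the sample means should be close to the population standard deviation divided by √10.

```python
import math
import random
import statistics

# CTC values (lakhs of rupees) transcribed from Table 4.7.
ctc = [
    21.38, 12.24, 29.06, 12.37, 8.48, 18.76, 23.8, 28.48, 9.56, 35.94,
    28.76, 30.76, 37.67, 34.15, 32.53, 26.64, 24.25, 39.66, 8.98, 26.17,
    40.54, 27.66, 18.83, 12.87, 22.12, 28.07, 27.15, 12.06, 5.66, 8.44,
    4.85, 11.72, 15.18, 6.44, 28.94, 17.71, 31.5, 26.91, 33.93, 14.5,
    38.14, 30.87, 27.29, 6.77, 18.43, 28.9, 22.33, 31.41, 37.07, 32.6,
]


def sampling_distribution(values, n_samples=500, sample_size=10, seed=42):
    """Return the means of n_samples random samples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    return [
        statistics.mean(rng.choices(values, k=sample_size))
        for _ in range(n_samples)
    ]


sample_means = sampling_distribution(ctc)
population_mean = statistics.mean(ctc)
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
# Theoretical standard error of the mean for samples of size 10.
standard_error = statistics.pstdev(ctc) / math.sqrt(10)

print("Mean of sample means:", round(mean_of_means, 4))
print("SD of sample means:", round(sd_of_means, 4))
print("Theoretical standard error:", round(standard_error, 4))
print("Distance from dataset mean:", round(abs(mean_of_means - population_mean), 4))
```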
DA2-131
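90% confidence interval for the number of paid news items telecast by English news channels in
India in a week is (20, 30). The sample mean x̄ in this case is

(a) 25 (b) 20 (c) 30 (d) lies between 20 and 30

Solution:

A confidence interval for the mean is symmetric about the sample mean: x̄ ± EBM, where EBM
is the error bound for the mean (margin of error). The sample mean is therefore the midpoint
of the interval:

x̄ = (20 + 30)/2 = 25

Answer: (a) 25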
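
The midpoint rule can be written as a small helper. This is only an illustrative sketch; the function name is hypothetical and it assumes the interval is symmetric about the sample mean.

```python
def interval_summary(lower, upper):
    """Return (sample mean, margin of error) for a symmetric interval."""
    mean = (lower + upper) / 2
    margin = (upper - lower) / 2
    return mean, margin


mean, margin = interval_summary(20, 30)
print("Sample mean:", mean)
print("Margin of error:", margin)
```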

DA3 - 176

2. Peter Poulouse is the Vice President of an e-commerce company called
'We Sell Everything on Earth (WSEOE)'. Peter believes that at least 12%
of WSEOE's customers return the products purchased by them. To
validate his belief, he took a sample of 620 customers and found that 80
customers had returned the products. Carry out an appropriate hypothesis
test at a significance of 0.1% to check whether the return is at least 12%.

Solution:

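H0: proportion of customers who return the product <= 12%
Ha: proportion of customers who return the product > 12%

One-sample Z-test for a proportion

Hypothesized population proportion p0 = 0.12
n = 620
Sample proportion = 80/620 = 0.1290
Alpha = 0.1
Critical Z value (right-tailed, alpha = 0.1) = 1.2816
Z statistic = (0.1290 - 0.12)/sqrt(0.12 × 0.88/620) = 0.0090/0.01305 = 0.692
Here we fail to reject the null hypothesis, as the Z statistic (0.692) is less than the critical
value (1.2816). The sample does not provide sufficient evidence that more than 12% of
customers return the product. The same conclusion holds at a significance level of 0.1%
(alpha = 0.001), where the critical value is 3.0902.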
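
The calculation can be checked with a short script. The sketch below is a minimal right-tailed one-sample Z-test for a proportion, assuming H0: p <= 0.12 against Ha: p > 0.12 as set up in the solution; the function names are illustrative, and the normal CDF is built from the standard library's error function.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_z_test(successes, n, p0):
    """Right-tailed one-sample Z-test for a proportion.

    Returns the sample proportion, the Z statistic and the p-value.
    """
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1.0 - normal_cdf(z)
    return p_hat, z, p_value


p_hat, z, p_value = proportion_z_test(successes=80, n=620, p0=0.12)
print("Sample proportion:", round(p_hat, 4))
print("Z statistic:", round(z, 3))
print("p-value:", round(p_value, 4))
# H0 is rejected only if the p-value is below the chosen significance level.
```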
DA4 – Page 203

Question 1

If 10 t-tests are conducted at α = 0.05 and all are statistically significant, calculate the value
of Type I error that all tests are simultaneously significant.

Solution:
When a hypothesis test yields a p-value that is less than the significance level, the result of
the test is called statistically significant. A Type I error is the rejection of the null
hypothesis when it is in fact true. The significance level α is the probability of making this
wrong decision when the null hypothesis is true.
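
The question can be read in two ways, and the sketch below computes both under the assumption that the 10 tests are independent (an assumption not stated in the question): the probability that all 10 tests commit a Type I error at the same time, and the probability that at least one of them does (the family-wise error rate).

```python
alpha = 0.05   # significance level of each individual test
n_tests = 10   # number of t-tests conducted

# Probability that every test is a false positive, assuming independence.
all_false_positive = alpha ** n_tests

# Probability of at least one false positive (family-wise error rate).
familywise_error = 1 - (1 - alpha) ** n_tests

print("P(all 10 are Type I errors):", all_false_positive)
print("P(at least one Type I error):", round(familywise_error, 4))
```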

DA5 – Page 220

Question 2
Mr. Chellappa is the founder of Oho Productions that produces movies in different languages
of India. Mr. Chellappa believes that the length of the movie (measured in minutes) is not
related to its box-office collection. Table 8.11 shows length of the movie (in minutes) and the
box-office collection (in millions of rupees). Use an appropriate hypothesis test to check
whether there is a correlation between length of the movie and the box-office collection at a
significance level of 0.05.

TABLE 8.11 Data on length of the movie and the box-office collection

Length of the movie      121   79  170  160   77  147  115   76  110  141
Box-office collection   1078  415  441 1192  258 1185  139  427  309  411

Length of the movie      100   82   82  114  110  163   92  172  142  136
Box-office collection    506  441  595 1728 1507  518 1463 1356 1014  422

Length of the movie      143  108  154  140  177   97  106  163  142  115
Box-office collection    508 1262 1783 1281 1253 1178 1103  454  301  296

Solution 2:

Solved Using Excel

t-Test: Paired Two Sample for Means

                               121            1078
Mean                           125.3103448    818.8275862
Variance                       1012.007389    250900.2906
Observations                   29             29
Pearson Correlation            0.183960228
Hypothesized Mean Difference   0
df                             28
t Stat                         -7.52913208
P(T<=t) one-tail               1.67665E-08
t Critical one-tail            1.701130908
P(T<=t) two-tail               3.3533E-08
t Critical two-tail            2.048407115

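The correlation coefficient is 0.184, which indicates a weak positive relationship between the
length of the movie and the box-office collection. The paired t-test compares the two means,
so it does not test the correlation itself. The test statistic for a correlation is
t = r√(n − 2)/√(1 − r²) = 0.184 × √27/√(1 − 0.184²) ≈ 0.97 with 27 degrees of freedom
(n = 29, since Excel treated the first pair of values as column labels). This is below the
two-tailed critical value of about 2.05 at a significance level of 0.05, so the correlation is
not statistically significant.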
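
The t-test for a correlation coefficient can be sketched in plain Python. The example below uses all 30 pairs as transcribed from Table 8.11, so its output may differ slightly from the Excel figures, which are based on 29 observations. The function names are illustrative, and the critical value 2.048 for 28 degrees of freedom is taken from a t-table rather than computed.

```python
import math

# (length in minutes, box-office collection in millions of rupees), Table 8.11.
length = [121, 79, 170, 160, 77, 147, 115, 76, 110, 141,
          100, 82, 82, 114, 110, 163, 92, 172, 142, 136,
          143, 108, 154, 140, 177, 97, 106, 163, 142, 115]
collection = [1078, 415, 441, 1192, 258, 1185, 139, 427, 309, 411,
              506, 441, 595, 1728, 1507, 518, 1463, 1356, 1014, 422,
              508, 1262, 1783, 1281, 1253, 1178, 1103, 454, 301, 296]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_stat(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = len(length)
r = pearson_r(length, collection)
t_stat = correlation_t_stat(r, n)
T_CRITICAL = 2.048  # two-tailed, alpha = 0.05, df = 28 (from a t-table)

print("r =", round(r, 4))
print("t =", round(t_stat, 4))
print("Significant at 0.05:", abs(t_stat) > T_CRITICAL)
```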
DA6 – Page 266

4. Table 9.19 provides the winning margin of all 20 Lok Sabha constituencies of Kerala in
2014 parliament elections of India and maximum delay of top 20 flights (origin−destination)
of Air India between 15 July 2014 and 15 September 2014.

TABLE 9.19 Data on Lok Sabha election winning margin of Kerala constituencies and
maximum delay of top 20 Air India flights
S. No. Constituency Winning Margin Air India Top 20 flights Maximum Delay in Minutes
1 Alappuzha 19407 Bangalore−Mumbai 182
2 Alathur 37312 Ahmedabad−Mumbai 203
3 Attingal 69378 Hyderabad−Mumbai 240
4 Chalakudy 13884 Mumbai−Goa 164
5 Ernakulum 87047 Delhi−Kolkata 265
6 Idukki 50542 Chennai−Delhi 226
7 Kannur 6566 Delhi−Bangalore 156
8 Kasaragod 6921 Mumbai−Chennai 161
9 Kollam 37649 Kolkata−Delhi 219
10 Kottayam 120599 Mumbai–Delhi 328
11 Kozhikode 16883 Hyderabad−Delhi 181
12 Malappuram 194740 Delhi–Mumbai 340
13 Mavelikkara 32737 Mumbai−Ahmedabad 202
14 Palakkad 105300 Mumbai−Hyderabad 284
15 Pathanamthitta 56191 Chennai−Mumbai 234
16 Ponnani 25410 Bangalore−Delhi 199
17 Thiruvananthapuram 15470 Goa−Mumbai 178
18 Thrissur 38228 Delhi−Chennai 225
19 Vadakara 3306 Delhi−Hyderabad 146
20 Wayanad 20870 Mumbai−Bangalore 197

(a) Develop a simple linear regression model between winning margin (Y) and maximum
flight delay (X) and calculate the regression coefficients.
(b) What is the value of R²?
(c) Is the model statistically significant? What can you infer from the regression model?

Solution 4:

(A). Simple Linear Regression Model

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.95888
R Square            0.91945
Adjusted R Square   0.914712
Standard Error      14189.28
Observations        19

ANOVA
             df   SS         MS         F         Significance F
Regression   1    3.91E+10   3.91E+10   194.049   9.96E-11
Residual     17   3.42E+09   2.01E+08
Total        18   4.25E+10

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   -136539        13740.76         -9.9368    1.7E-08    -165530     -107549
182         851.8023       61.14812         13.93015   9.96E-11   722.791     980.8135

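(b) The value of R² is 0.91945.

(c). Yes, the model is statistically significant: the Significance F value (9.96E-11) and the
p-value of the slope coefficient are both far below 0.05. The fitted model is
Winning Margin = -136539 + 851.8023 × Maximum Delay, and the high R² reflects a strong
linear association between the two variables in this sample. A strong association alone does
not imply that flight delays cause winning margins.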
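
The regression coefficients and R² can be reproduced with the least-squares formulas. The sketch below is illustrative only: it uses all 20 rows of Table 9.19, whereas the Excel output is based on 19 observations, so the results may differ slightly. The function name is hypothetical.

```python
# Maximum delay in minutes (X) and winning margin (Y) from Table 9.19.
delay = [182, 203, 240, 164, 265, 226, 156, 161, 219, 328,
         181, 340, 202, 284, 234, 199, 178, 225, 146, 197]
margin = [19407, 37312, 69378, 13884, 87047, 50542, 6566, 6921, 37649, 120599,
          16883, 194740, 32737, 105300, 56191, 25410, 15470, 38228, 3306, 20870]


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = simple_linear_regression(delay, margin)
print("Intercept:", round(intercept, 2))
print("Slope:", round(slope, 4))
print("R squared:", round(r_squared, 5))
```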
DA7- 327
Data for Questions 1−6: The dean of a business school has collected data on their recent
placement. To attract good students, it is important for the school to ensure that the students
are placed with good salary package. The dean of the school believed that the salary earned
by a student at placement depended on several variables. The data collected by the dean is
listed in Table 10.40.

The first regression model is built using degree of specialization as the explanatory variable.
Model 1:
Y = b0 + b1 Degree_Arts + b2 Degree_Commerce + b3 Degree_CompApp + b4 Degree_Engineering + b5 Degree_Management
The model 1 SPSS outputs are shown in Tables 10.41−10.43.
Question 1: Assuming that the salary package is important for the school, should the dean
give more importance to certain degree disciplines while admitting the students to their MBA
programme? Support your answers with precise arguments.

Solution 1:

The constant and the coefficient of Degree_Engineering are significant at the 5% level, as
their significance values are less than 0.05. So the dean can give more importance to
engineering graduates during admission, since the average salary for that discipline is
statistically significant.

Question 2: Is there a significant difference between the average salary earned by a student
with science degree and commerce degree? Clearly state your arguments.

Solution 2:
Inadequate data: the science degree is not used as an independent variable, so no result is
available for it, and the mean and standard deviation of the average salary earned after the
MBA by UG discipline are missing.

Question 3: The dean of the school believes that the engineering students earn on average at
least INR 25,000 more than the science students. Check whether his belief is true at 5%
significance level by conducting an appropriate hypothesis tests.
A new variable, which is the interaction between degree discipline engineering and the
percentage marks in degree, is added to model 1 and the corresponding output is shown in
Table 10.44.

Solution 3:
Inadequate data: the science degree is not used as an independent variable, so no result is
available for it, and the mean and standard deviation of the average salary earned after the
MBA by UG discipline are missing.

Question 4: Interpret the coefficient value for the interaction value ENGPERCENT
(Degree_Engineering × Percent_Degree). Explain possible reason for the salary of
engineering students decreasing as the percentage marks in degree increases. Clearly state
your arguments. A stepwise regression is carried out using SPSS and the results of stepwise
regression are shown in Tables 10.45 and 10.46.
Solution 4:
Degree_Engineering and ENGPERCENT both have high VIF values. A VIF above 4 indicates
multicollinearity, which makes the coefficient estimates unreliable and can lead to
counterintuitive inferences, such as the salary of engineering students decreasing as the
percentage marks in degree increase.

Question 5: What is the R-square value at step 2 of the stepwise regression?

Solution 5:
The increase in R² is given by the square of the part correlation (semipartial correlation).
The part correlation between GENCOM and the dependent variable (Salary) is 0.228, so
0.228² = 0.051984 (increase in R² at step 2).
R² at step 2 = 0.060516 + 0.051984 = 0.1125

Model  R      R-Square  Adjusted R Square  Std. Error of Estimates
1      0.246  0.060516  0.057              82984.946
2             0.1125
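
The arithmetic can be verified in a few lines. The variable names below are illustrative; the two input values are the step 1 R (0.246) and the part correlation of GENCOM (0.228) quoted in the solution.

```python
r_step1 = 0.246            # multiple R at step 1
part_correlation = 0.228   # part (semipartial) correlation of GENCOM with Salary

r2_step1 = r_step1 ** 2                 # R-square at step 1
r2_increase = part_correlation ** 2     # increase in R-square at step 2
r2_step2 = r2_step1 + r2_increase       # R-square at step 2

print("R-square at step 1:", round(r2_step1, 6))
print("Increase at step 2:", round(r2_increase, 6))
print("R-square at step 2:", round(r2_step2, 4))
```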

Question 6: In Table 10.46, GENCOM is the interaction variable between gender and marks
in communication. Which of the following statements is true? Clearly state your arguments.
(a) Salary is more sensitive to marks in communication for females than males.
(b) Salary is more sensitive to marks in communication for males than females.
(c) There is no difference between males and females with respect to marks in
communication.
(d) Can’t say.

Solution 6:
Salary = 96461.563 + 2441.930 (MARKS_COMMUNICATION) + 689.203 (GENDER × MARKS_COMMUNICATION)

If GENDER = Male = 1:
Salary = 96461.563 + (2441.930 + 689.203) (MARKS_COMMUNICATION)
Salary = 96461.563 + 3131.133 (MARKS_COMMUNICATION)

If GENDER = Female = 0:
Salary = 96461.563 + 2441.930 (MARKS_COMMUNICATION)

Answer: (b) Salary is more sensitive to marks in communication for males than females,
because each additional mark raises the predicted salary by 3131.133 for males against
2441.930 for females.
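
The effect of the interaction term can be illustrated with a small sketch. The coefficients are those quoted in the solution; the function names and the coding Male = 1, Female = 0 follow the solution's own convention.

```python
INTERCEPT = 96461.563
B_MARKS = 2441.930      # coefficient of MARKS_COMMUNICATION
B_GENCOM = 689.203      # coefficient of GENDER x MARKS_COMMUNICATION


def predicted_salary(marks, gender):
    """Predicted salary for given communication marks and gender (1 = male, 0 = female)."""
    return INTERCEPT + B_MARKS * marks + B_GENCOM * gender * marks


def marks_slope(gender):
    """Change in predicted salary per additional communication mark."""
    return B_MARKS + B_GENCOM * gender


print("Slope for males:", round(marks_slope(1), 3))
print("Slope for females:", round(marks_slope(0), 3))
```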
