Factor-Hair RV PDF
R Venkataraman
Objective
❖ Perform exploratory data analysis on the dataset. Showcase some charts and graphs; check for outliers and missing values.
❖ Is there evidence of multicollinearity? Showcase your analysis.
❖ Perform simple linear regression for the dependent variable with every independent variable.
❖ Perform PCA/factor analysis by extracting 4 factors. Interpret the output and name the factors.
❖ Perform multiple linear regression with customer satisfaction as the dependent variable and the four factors as independent variables. Comment on the model output and validity; your remarks should be meaningful to a general audience.
Customer Satisfaction (dependent variable)
• Negatively skewed
Both the Shapiro-Wilk test and the density plot show that the dependent variable is approximately normally distributed.
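The normality check described above can be sketched as follows; the data here is a synthetic stand-in for the actual Satisfaction column (the column name and values are assumptions, not the real dataset).

```python
# Shapiro-Wilk normality test plus skewness, as in the slide above.
# The sample below is randomly generated, not the Factor-Hair data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
satisfaction = rng.normal(loc=7, scale=1.2, size=100)  # stand-in for the real column

stat, p_value = stats.shapiro(satisfaction)
skewness = stats.skew(satisfaction)

# p > 0.05 -> fail to reject normality; skew < 0 -> negative skew
print(f"Shapiro-Wilk W={stat:.3f}, p={p_value:.3f}, skew={skewness:.3f}")
```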
Exploratory Data Analysis
• Histograms with density curves of the independent variables
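A minimal sketch of the EDA checks listed in the objective (missing values and IQR-based outliers). The column names and values below are illustrative, not the actual Factor-Hair variables.

```python
# Count missing values and IQR-rule outliers per column.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ProdQual": [7.5, 8.1, 6.9, 9.0, 25.0],   # 25.0 is a planted outlier
    "Ecom":     [3.1, None, 4.0, 3.6, 3.3],   # one missing value
})

missing = df.isna().sum()                      # missing values per column
print(missing)

q1, q3 = df.quantile(0.25), df.quantile(0.75)
iqr = q3 - q1
outliers = ((df < q1 - 1.5 * iqr) | (df > q3 + 1.5 * iqr)).sum()
print(outliers)                                # outlier count per column
```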
Multicollinearity
Regression Model
VIF
As a rule of thumb
• 1 = not correlated.
• Between 1 and 5 = moderately correlated.
• Greater than 5 = highly correlated.
Inference:
• CompRes is moderately correlated (VIF between 1 and 5)
• DelSpeed is highly correlated (VIF greater than 5)
Simple Linear Regression
Simple linear regression was performed between the dependent variable and each of the eleven independent variables; the resulting intercepts and slopes are reported below, along with graphical representations.
PCA/Factor Analysis
Bartlett test: p-value < .05 confirms the possibility of data dimension reduction.
Eigenvalues: the Kaiser rule suggests retaining components with eigenvalue >= 1, so four components can be used.
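Both checks above can be sketched on synthetic data: Bartlett's test of sphericity computed from its standard chi-square formula, and the Kaiser rule applied to the eigenvalues of the correlation matrix. The latent-factor structure below is an assumption for illustration only.

```python
# Bartlett's sphericity test and Kaiser's eigenvalue >= 1 rule.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
base = rng.normal(size=(100, 2))
# Four observed variables built from two latent factors -> correlated data.
X = np.column_stack(
    [base[:, 0] + 0.3 * rng.normal(size=100) for _ in range(2)] +
    [base[:, 1] + 0.3 * rng.normal(size=100) for _ in range(2)])

n, p = X.shape
R = np.corrcoef(X, rowvar=False)

# Bartlett's sphericity: chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|
chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
dof = p * (p - 1) / 2
p_value = stats.chi2.sf(chi2, dof)

eigenvalues = np.linalg.eigvalsh(R)[::-1]        # descending order
n_components = int((eigenvalues >= 1).sum())     # Kaiser rule
print(f"Bartlett p={p_value:.2e}, eigenvalues={np.round(eigenvalues, 2)}, "
      f"keep {n_components}")
```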
Model 1
• p-value < .05 confirms a significant relationship between the independent and dependent variables
• VIFs close to 1 confirm no multicollinearity
• All independent variables except AfterSales, including the intercept, are highly significant
• R-squared of 70%: the model explains 70% of the variation in the dependent variable
• Adjusted R-squared of 68%: R-squared adjusted downward for the number of predictors, allowing fair comparison across models
Model 2
• p-value < .05 confirms a significant relationship between the independent and dependent variables
• VIFs close to 1 confirm no multicollinearity
• All independent variables, including the intercept, are highly significant
• R-squared of 70%: the model explains 70% of the variation in the dependent variable
• Adjusted R-squared of 68%: R-squared adjusted downward for the number of predictors, allowing fair comparison across models
Model 3
• p-value < .05 confirms a significant relationship between the independent and dependent variables
• VIFs close to 1 confirm no multicollinearity
• All independent variables, including the intercept, are highly significant; the Marketing*Segment interaction is moderately significant
• R-squared of 75%: the model explains 75% of the variation in the dependent variable
• Adjusted R-squared of 73%: R-squared adjusted downward for the number of predictors, allowing fair comparison across models