Meet5 Psy 312 Decision-Making Association
[Decision-tree figure, "What type of Test?": branches include Pearson Correlation / Regression and Independent T-test / ANOVA. Source: www.researchgate.net/figure/A-basic-decision-tree-on-how-to-select-the-appropriate-statistical-test-is-shown_fig5_256303889]
Test of Association: Linear Regression / SPSS practice
#6. Eight motorists who own auto insurance policies from the same insurance
company observed that their individual monthly auto insurance premiums seem
to be dependent on the driver’s experience. Listed below are their years of
driving experience and their respective monthly auto insurance premium.
Ho: driving-years experience does not predict monthly insurance premium (bx + a = 0)
Ha: driving-years experience predicts monthly insurance premium (bx + a ≠ 0)
Source: University of Idaho; www.webpages.uidaho.edu
REVIEW: Assumptions for Regression
1. Interval level of measurement *
2. Related pairs / paired values *
3. Absence of outliers *
4. Normality of variables *
5. Linearity / the relationship is linear (straight line)
6. Homoscedasticity (the distances from the points to the straight line show the same variation)
7. Independent and dependent variables are at least moderately correlated
8. IVs NOT multicollinear / not highly correlated (r ≥ .9)
Encode data in SPSS; make sure the measurement level is Scale for all variables.
1. Check the assumption of normality using skewness and kurtosis (click Analyze, Descriptive Statistics, Descriptives).
2. Transfer both variables to the Variables box, click Options and check Mean, SD, Kurtosis and Skewness, then click Continue and OK.
If the skewness and kurtosis values for both years of experience and monthly insurance premium fall within the ranges below, data normality is achieved:
Skewness: between -1 and +1
Kurtosis: between -3 and +3
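As a cross-check outside SPSS, the same skewness/kurtosis screen can be sketched in Python. This is a simple moment-based version (SPSS applies a small-sample correction, so its values will differ slightly), and the data below are hypothetical since the slide's table is not reproduced here:

```python
def moments(xs):
    """Return (skewness, excess kurtosis) using simple moment formulas."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

# hypothetical years-of-experience values (the slide's data are not shown)
years = [5, 2, 12, 9, 15, 6, 25, 16]
skew, kurt = moments(years)
# rule of thumb from the slide: skewness in [-1, +1], kurtosis in [-3, +3]
normal_enough = -1 <= skew <= 1 and -3 <= kurt <= 3
```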
Check the assumption of normality of the DV:
(1) Click Analyze, Descriptive Statistics, Explore.
(2) Transfer the DV to the Dependent List box, click Statistics, check Outliers, click Continue.**
**For assumption-checking purposes, you may also include the IV when transferring to the Dependent List.
* No circles/asterisks/stars indicated in the boxplot; thus, no outliers. The assumption of absence of outliers is met.
* If there are outliers, do not do another round of data cleaning yet; decide after checking other diagnostics such as Cook's distance.
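The flags SPSS draws on a boxplot follow Tukey's fences: circles mark mild outliers beyond 1.5×IQR from the quartiles, asterisks mark extreme outliers beyond 3×IQR. A minimal sketch of that rule (the interpolated-quantile helper here is an approximation; SPSS's hinge calculation can differ slightly on small samples):

```python
def boxplot_outliers(xs):
    """Return (mild, extreme) outliers using the 1.5*IQR and 3*IQR fences."""
    s = sorted(xs)
    n = len(s)

    def quartile(p):
        # linear-interpolation quantile (approximation of SPSS's hinges)
        k = p * (n - 1)
        f = int(k)
        return s[f] + (k - f) * (s[min(f + 1, n - 1)] - s[f])

    q1, q3 = quartile(0.25), quartile(0.75)
    iqr = q3 - q1
    mild = [x for x in xs if not (q1 - 1.5 * iqr <= x <= q3 + 1.5 * iqr)]
    extreme = [x for x in xs if not (q1 - 3 * iqr <= x <= q3 + 3 * iqr)]
    return mild, extreme
```

An empty `mild` list corresponds to a boxplot with no circles or stars, i.e. the assumption of absence of outliers is met.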
Other assumptions of regression can be checked using the ACTUAL REGRESSION ANALYSIS:
1. Click Analyze, Regression, Linear.
2. Transfer the DV to the Dependent box and the IV to the Independent box, then click Statistics.
3. Check: Estimates, Model fit, R-squared change, Descriptives, and Collinearity diagnostics. Also check Casewise diagnostics under Residuals. Click Continue, then click Plots.
Both R² and Adjusted R² indicate how much variance in the outcome the predictors can explain. In this case, the predictor explains 52.1% of the variance in the DV; in other words, years of driving experience explains 52.1% of the variance in the monthly insurance premium.
Use Adjusted R² because it increases only when a new variable that improves the regression model is added, whereas R² keeps increasing whenever new variables are added, regardless of whether those variables are useful.
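The 52.1% quoted above can be reproduced by hand: for a simple regression, R² can be recovered from the reported F-statistic, and the adjusted-R² correction then yields the slide's figure. A quick sketch using only the numbers given in this deck:

```python
# Values from the slide: F(1, 6) = 8.624, n = 8 motorists, k = 1 predictor
F, df1, df2 = 8.624, 1, 6
n, k = 8, 1

# For a regression, R² = F*df1 / (F*df1 + df2)
r2 = (F * df1) / (F * df1 + df2)                # ≈ .590

# Adjusted R² penalizes for the number of predictors
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)   # ≈ .521, the 52.1% quoted above
```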
The R-square change is tested with an F-test, referred to as the F-change. A significant F-change means that the variables added in that step significantly improved the prediction. For this round of regression, because all predictors are entered at once, there is no need to use it. If predictors were entered hierarchically or by block, the change statistics would be more useful.
The F-ratio in the ANOVA table tests whether the overall
regression model is a good fit for the data. The table shows
that the independent variable, driving-years experience,
statistically significantly predicts the dependent variable,
monthly insurance premium, F(1, 6) = 8.624, p = .026.
This shows that the regression model is a good fit of the
data.
Unstandardized and standardized coefficients indicate the change in the dependent variable for every unit change in each IV.
Standardized coefficients are used more often since they allow comparison of the regressors or predictors in the model.
You may write the interpretation for this result in this manner:
“Given the significance value of the predictor, p=.026, it can be said that driving-
years experience significantly predicted the monthly insurance premium. For
every unit increase in driving-years experience, there is a .768 decrease in
monthly insurance premium.”
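The .768 slope is consistent with the model-fit numbers above: in a simple regression, the standardized coefficient β equals Pearson's r, and |r| = √R². Using R² recovered from the slide's F(1, 6) = 8.624:

```python
# Values from the slide's ANOVA table
F, df1, df2 = 8.624, 1, 6

# For simple regression: R² from F, then |β| = |r| = sqrt(R²)
r2 = F * df1 / (F * df1 + df2)
beta = r2 ** 0.5   # ≈ .768, matching the slope quoted above (negative in sign)
```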
Assumption check: MULTICOLLINEARITY
(1) The VIF value should not be more than 10 for variables not to be considered multicollinear. Here, the VIF for the predictor is 1, indicating that the multicollinearity assumption is met.
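The VIF in the table comes from regressing each predictor on the other predictors: VIF = 1 / (1 − R²ⱼ). With a single IV there is nothing to regress it on, so R²ⱼ = 0 and VIF = 1, exactly as reported. A minimal sketch:

```python
def vif(r2_j):
    """VIF for predictor j, given R² from regressing j on the other predictors."""
    return 1.0 / (1.0 - r2_j)

# Single-predictor model: R²_j = 0, so VIF = 1 (as in the slide's table).
# A predictor correlated at r = .95 with the others (R²_j = .9025) would give
# VIF ≈ 10.3, past the usual cutoff of 10.
```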
[Example of a P-P plot with a drastic deviation]
Assumption check: HOMOSCEDASTICITY
Ideally, a plot that looks like the dots/circles were shot out of a shotgun indicates that the data are homoscedastic. There is no pattern, and the points are equally distributed above and below zero on the X axis, and to the left and right of zero on the Y axis.
Homoscedastic data also suggest linearity. In addition, no point should fall outside of -3 and +3.
[Example of a scatterplot that is not homoscedastic]
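The ±3 screen on the residual plot can also be checked numerically: fit the line, divide each residual by the residual standard error (which is what SPSS's standardized ZRESID values are), and flag anything beyond ±3. A sketch with hypothetical data, since the slide's dataset is not reproduced here:

```python
def ols(xs, ys):
    """Least-squares intercept and slope for a simple regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def standardized_residuals(xs, ys):
    """Residuals divided by the residual standard error (sqrt of MSE)."""
    a, b = ols(xs, ys)
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    s = (sum(e ** 2 for e in resid) / (len(resid) - 2)) ** 0.5
    return [e / s for e in resid]

# hypothetical years-of-experience vs. premium values
zres = standardized_residuals([1, 2, 3, 4, 5, 6, 7, 8],
                              [64, 59, 55, 56, 49, 46, 45, 40])
ok = all(-3 < z < 3 for z in zres)  # no point outside ±3
```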
Reporting Regression Analysis results
[Sample APA-style table: M, SD, correlations 1–2]
Given the significance value of the predictor, p=.026, it can be said that driving-
years experience significantly predicted the monthly insurance premium. For
every unit increase in driving-years experience, there is a .768 decrease in
monthly insurance premium.
Reporting Regression Analysis results
(Slides 29-49 are from G. Conway's ppt)
Encode data in SPSS; make sure that the measure is Scale.
Check normality of the DV using Analyze, Descriptive Statistics, Explore. Transfer the DV to the Dependent List box, click Statistics, check Outliers, click Continue.
[Sample APA-style report tables: descriptives (M, SD, correlations 1–4) and regression coefficients]

Predictor                  b      SE B     β       p
Reading Comprehension     .177    .049    .167    .000