
SPSS ANNOTATED OUTPUT: Multiple Regression

Multiple regression is an extension of simple linear regression. It is used when we want to predict
the value of a variable based on the value of two or more other variables. The variable we want
to predict is called the dependent variable (or sometimes, the outcome, target or criterion
variable). The variables we are using to predict the value of the dependent variable are called
the independent variables (or sometimes, the predictor, explanatory or regressor variables).

For example, you could use multiple regression to understand whether exam performance can be
predicted based on revision time, test anxiety, lecture attendance and gender. Alternatively, you
could use multiple regression to understand whether daily cigarette consumption can be predicted
based on smoking duration, age when started smoking, smoker type, income and gender.

Multiple regression also allows you to determine the overall fit (variance explained) of the model
and the relative contribution of each of the predictors to the total variance explained. For
example, you might want to know how much of the variation in exam performance can be explained by
revision time, test anxiety, lecture attendance and gender "as a whole", but also the "relative
contribution" of each independent variable in explaining the variance.

A health researcher wants to be able to predict "VO2max", an indicator of fitness and health.
Normally, to perform this procedure requires expensive laboratory equipment and necessitates that
an individual exercise to their maximum (i.e., until they can no longer continue exercising due to
physical exhaustion). This can put off those individuals who are not very active/fit and those
individuals who might be at higher risk of ill health (e.g., older unfit subjects). For these
reasons, it has been desirable to find a way of predicting an individual's VO2max based on attributes
that can be measured more easily and cheaply. To this end, a researcher recruited 100 participants
to perform a maximum VO2max test, but also recorded their "age", "weight", "heart rate" and
"gender". Heart rate is the average of the last 5 minutes of a 20 minute, much easier, lower
workload cycling test. The researcher's goal is to be able to predict VO2max based on these four
attributes: age, weight, heart rate and gender.
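
If you want to check this kind of analysis outside SPSS Statistics, the sketch below fits the same
sort of model in Python with statsmodels. It is only an illustration: the file name "vo2max.csv"
and its column names are assumptions standing in for the researcher's data, not part of this guide.

    # Hedged sketch: the same model fitted with Python's statsmodels instead of SPSS.
    # "vo2max.csv" and its column names are assumptions standing in for the real data.
    import pandas as pd
    import statsmodels.formula.api as smf

    data = pd.read_csv("vo2max.csv")  # columns: VO2max, age, weight, heart_rate, gender

    # Ordinary least squares with all four predictors entered together,
    # mirroring a standard (simultaneous-entry) multiple regression.
    results = smf.ols("VO2max ~ age + weight + heart_rate + gender", data=data).fit()
    print(results.summary())  # R-squared, F-test and coefficient table in one report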

Setup in SPSS Statistics

In SPSS Statistics, we created six variables: (1) VO2max, which is the maximal aerobic capacity;
(2) age, which is the participant's age; (3) weight, which is the participant's weight
(technically, it is their 'mass'); (4) heart_rate, which is the participant's heart rate;
(5) gender, which is the participant's gender; and (6) caseno, which is the case number.
The caseno variable is used to make it easy for you to eliminate cases (e.g., "significant
outliers", "high leverage points" and "highly influential points") that you have identified when
checking for assumptions. In our enhanced multiple regression guide, we show you how to correctly
enter data in SPSS Statistics to run a multiple regression when you are also checking for
assumptions. You can learn about our enhanced data setup content here. Alternatively, we have a
generic, "quick start" guide to show you how to enter data into SPSS Statistics, available here.

Test Procedure in SPSS Statistics

 Click Analyze > Regression > Linear... on the main menu, as shown below:
Note: Don't worry that you're selecting Analyze > Regression > Linear... on the main menu or that
the dialogue boxes in the steps that follow have the title, Linear Regression. You have not made
a mistake. You are in the correct place to carry out the multiple regression procedure. This is
just the title that SPSS Statistics gives, even when running a multiple regression procedure.

 You will be presented with the Linear Regression dialogue box below:
 Transfer the dependent variable, VO2max, into the Dependent: box and the independent
variables, age, weight, heart_rate and gender, into the Independent(s): box, using
the arrow buttons, as shown below (all other boxes can be ignored):


Note: For a standard multiple regression you should ignore the Previous and Next buttons as they
are for sequential (hierarchical) multiple regression. The Method: option needs to be kept at the
default value, which is Enter. If, for whatever reason, Enter is not selected, you need to
change Method: back to Enter. The Enter method is the name given by SPSS Statistics to
standard regression analysis.
 Click the Statistics button. You will be presented with the Linear Regression: Statistics dialogue
box, as shown below:

 In addition to the options that are selected by default, select Confidence intervals in the
Regression Coefficients area, leaving the Level(%): option at "95". You will end up with the
following screen:
 Click the Continue button. You will be returned to the Linear Regression dialogue box.
 Click the OK button. This will generate the output.
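
For comparison only, the sketch below shows how equivalent 95% confidence intervals for the
coefficients could be obtained in Python with statsmodels, again assuming the hypothetical
"vo2max.csv" file from the earlier sketch:

    import pandas as pd
    import statsmodels.formula.api as smf

    data = pd.read_csv("vo2max.csv")  # hypothetical file, as in the earlier sketch
    results = smf.ols("VO2max ~ age + weight + heart_rate + gender", data=data).fit()

    # 95% confidence intervals for each coefficient, matching the
    # "Confidence intervals" option with Level(%) left at 95.
    print(results.conf_int(alpha=0.05))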

Interpreting and Reporting the Output of Multiple Regression Analysis

SPSS Statistics will generate quite a few tables of output for a multiple regression analysis. In
this section, we show you only the three main tables required to understand your results from the
multiple regression procedure, assuming that no assumptions have been violated. A complete
explanation of the output you have to interpret when checking your data for the eight assumptions
required to carry out multiple regression is provided in our enhanced guide. This includes relevant
scatterplots and partial regression plots, histogram (with superimposed normal curve), Normal P-P
Plot and Normal Q-Q Plot, correlation coefficients and Tolerance/VIF values, casewise diagnostics
and studentized deleted residuals.

However, in this "quick start" guide, we focus only on the three main tables you need to understand
your multiple regression results, assuming that your data has already met the eight assumptions
required for multiple regression to give you a valid result:

Determining how well the model fits

The first table of interest is the Model Summary table. This table provides the R, R2, adjusted R2,
and the standard error of the estimate, which can be used to determine how well a regression model
fits the data:

The "R" column represents the value of R, the multiple correlation coefficient. R can be considered
to be one measure of the quality of the prediction of the dependent variable; in this case, VO2max.
A value of 0.760, in this example, indicates a good level of prediction. The "R Square" column
represents the R2 value (also called the coefficient of determination), which is the proportion of
variance in the dependent variable that can be explained by the independent variables (technically,
it is the proportion of variation accounted for by the regression model above and beyond the mean
model). You can see from our value of 0.577 that our independent variables explain 57.7% of the
variability of our dependent variable, VO2max. However, you also need to be able to interpret
"Adjusted R Square" (adj. R2) to accurately report your data. We explain the reasons for this, as
well as the output, in our enhanced multiple regression guide.
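
As a small worked illustration (not SPSS output), adjusted R2 can be approximated from the values
reported above using the standard adjustment formula; the result is only approximate because the
published R2 is rounded:

    # Worked check using the values reported above:
    # R-squared = 0.577, n = 100 participants, k = 4 predictors.
    r_squared = 0.577
    n, k = 100, 4

    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
    print(round(adj_r_squared, 3))  # about 0.559, subject to rounding of R-squared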

Statistical significance

The F-ratio in the ANOVA table (see below) tests whether the overall regression model is a good
fit for the data. The table shows that the independent variables statistically significantly
predict the dependent variable, F(4, 95) = 32.393, p < .0005 (i.e., the regression model is a good
fit for the data).
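
As a quick sanity check (not SPSS output), the F-ratio can be reconstructed approximately from
R2 and the degrees of freedom using the standard formula F = (R2/k) / ((1 - R2)/(n - k - 1)):

    # Approximate reconstruction of the F-ratio from the reported R-squared
    # and the degrees of freedom F(4, 95).
    r_squared = 0.577
    df_model, df_resid = 4, 95

    f_ratio = (r_squared / df_model) / ((1 - r_squared) / df_resid)
    print(round(f_ratio, 1))  # close to the reported F = 32.393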

Estimated model coefficients

The general form of the equation to predict VO2max from age, weight, heart_rate and gender is:

predicted VO2max = 87.83 – (0.165 × age) – (0.385 × weight) – (0.118 × heart_rate) + (13.208 × gender)
This is obtained from the Coefficients table, as shown below:

Unstandardized coefficients indicate how much the dependent variable varies with an independent
variable when all other independent variables are held constant. Consider the effect of age in
this example. The unstandardized coefficient, B1, for age is equal to -0.165
(see Coefficients table). This means that for each one year increase in age, there is a decrease
in VO2max of 0.165 ml/min/kg.
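
To make the use of the equation concrete, the sketch below applies it to a single hypothetical
participant; the input values are invented purely for illustration:

    # Applying the estimated equation to one hypothetical participant;
    # the input values are invented purely for illustration.
    age, weight, heart_rate, gender = 40, 80.0, 133, 1

    predicted_vo2max = (87.83
                        - 0.165 * age
                        - 0.385 * weight
                        - 0.118 * heart_rate
                        + 13.208 * gender)
    print(round(predicted_vo2max, 1))  # predicted VO2max in ml/min/kg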

Statistical significance of the independent variables

You can test for the statistical significance of each of the independent variables. This tests
whether the unstandardized (or standardized) coefficients are equal to 0 (zero) in the population.
If p < .05, you can conclude that the coefficients are statistically significantly different from 0
(zero). The t-value and corresponding p-value are located in the "t" and "Sig." columns,
respectively, as highlighted below:
You can see from the "Sig." column that all independent variable coefficients are statistically
significantly different from 0 (zero). Although the intercept, B0, is tested for statistical
significance, this is rarely an important or interesting finding.
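
As an illustration of the test being described (not SPSS output), the sketch below computes a
two-sided p-value for a hypothetical t-value using the residual degrees of freedom from this
model (95):

    from scipy import stats

    # Two-sided p-value for a coefficient's t-value with n - k - 1 = 95 residual
    # degrees of freedom; the t-value here is hypothetical (read yours from the "t" column).
    t_value, df_resid = -5.0, 95
    p_value = 2 * stats.t.sf(abs(t_value), df_resid)
    print(p_value < 0.05)  # True here: such a coefficient differs significantly from zero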
