olsrr
URL https://olsrr.rsquaredacademy.com/,
https://github.com/rsquaredacademy/olsrr
BugReports https://github.com/rsquaredacademy/olsrr/issues
Encoding UTF-8
LazyData true
VignetteBuilder knitr
RoxygenNote 6.1.1
LinkingTo Rcpp
NeedsCompilation yes
Author Aravind Hebbali [aut, cre]
Maintainer Aravind Hebbali <hebbali.aravind@gmail.com>
Repository CRAN
Date/Publication 2020-02-10 12:00:02 UTC
R topics documented:
auto
cement
fitness
hsb
olsrr
ols_aic
ols_apc
ols_coll_diag
ols_correlations
ols_fpe
ols_hadi
ols_hsp
ols_launch_app
ols_leverage
ols_mallows_cp
ols_msep
ols_plot_added_variable
ols_plot_comp_plus_resid
ols_plot_cooksd_bar
ols_plot_cooksd_chart
ols_plot_dfbetas
ols_plot_dffits
ols_plot_diagnostics
ols_plot_hadi
ols_plot_obs_fit
ols_plot_reg_line
ols_plot_resid_box
ols_plot_resid_fit
ols_plot_resid_fit_spread
ols_plot_resid_hist
ols_plot_resid_lev
ols_plot_resid_pot
ols_plot_resid_qq
ols_plot_resid_regressor
ols_plot_resid_stand
ols_plot_resid_stud
ols_plot_resid_stud_fit
ols_plot_response
ols_pred_rsq
ols_prep_avplot_data
ols_prep_cdplot_data
ols_prep_cdplot_outliers
ols_prep_dfbeta_data
ols_prep_dfbeta_outliers
ols_prep_dsrvf_data
ols_prep_outlier_obs
ols_prep_regress_x
ols_prep_regress_y
ols_prep_rfsplot_fmdata
ols_prep_rstudlev_data
ols_prep_rvsrplot_data
ols_prep_srchart_data
ols_prep_srplot_data
ols_press
ols_pure_error_anova
ols_regress
ols_sbc
ols_sbic
ols_step_all_possible
ols_step_all_possible_betas
ols_step_backward_aic
ols_step_backward_p
ols_step_best_subset
ols_step_both_aic
ols_step_both_p
ols_step_forward_aic
ols_step_forward_p
ols_test_bartlett
ols_test_breusch_pagan
ols_test_correlation
ols_test_f
ols_test_normality
ols_test_outlier
ols_test_score
rivers
rvsr_plot_shiny
stepdata
surgical
auto Test Data Set

Description
Test Data Set
Usage
auto
Format
An object of class tbl_df (inherits from tbl, data.frame) with 74 rows and 11 columns.
cement Test Data Set

Description
Test Data Set
Usage
cement
Format
An object of class data.frame with 13 rows and 6 columns.
fitness Test Data Set

Description
Test Data Set
Usage
fitness
Format
An object of class data.frame with 31 rows and 7 columns.
hsb Test Data Set

Description
Test Data Set
Usage
hsb
Format
An object of class data.frame with 200 rows and 15 columns.
olsrr olsrr package

Description
Tools for teaching and learning OLS regression
Details
See the README on GitHub
ols_aic Akaike information criterion

Description
Akaike information criterion for model selection.
Usage
ols_aic(model, method = c("R", "STATA", "SAS"))
Arguments
model An object of class lm.
method A character vector; specify the method to compute AIC. Valid options include
R, STATA and SAS.
Details
AIC provides a means for model selection. Given a collection of models for the data, AIC estimates
the quality of each model, relative to each of the other models. R and STATA use loglikelihood to
compute AIC. SAS uses residual sum of squares. Below is the formula in each case:
R & STATA
AIC = −2(loglikelihood) + 2p
SAS
AIC = n ∗ ln(SSE/n) + 2p
where n is the sample size and p is the number of model parameters including intercept.
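The two formulas above can be sketched by hand (using the mtcars model from the examples below; ols_aic() is the supported interface, this is only an illustration):

```r
# Compare the two AIC definitions by hand.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n   <- nobs(model)
p   <- length(coef(model))            # parameters including intercept
sse <- sum(residuals(model)^2)

# R / STATA: log-likelihood based. stats::AIC() also counts the error
# variance as an estimated parameter, hence the p + 1 here.
aic_r <- -2 * as.numeric(logLik(model)) + 2 * (p + 1)

# SAS: residual sum of squares based.
aic_sas <- n * log(sse / n) + 2 * p
```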
Value
Akaike information criterion of the model.
References
Akaike, H. (1969). “Fitting Autoregressive Models for Prediction.” Annals of the Institute of Statistical Mathematics 21:243–247.
Judge, G. G., Griffiths, W. E., Hill, R. C., and Lee, T.-C. (1980). The Theory and Practice of
Econometrics. New York: John Wiley & Sons.
See Also
Other model selection criteria: ols_apc, ols_fpe, ols_hsp, ols_mallows_cp, ols_msep, ols_sbc,
ols_sbic
Examples
# using R computation method
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_aic(model)
ols_apc Amemiya’s prediction error

Description
Amemiya’s prediction error.
Usage
ols_apc(model)
Arguments
model An object of class lm.
Details
Amemiya’s Prediction Criterion penalizes R-squared more heavily than does adjusted R-squared for each additional degree of freedom used on the right-hand side of the equation. The higher the better for this criterion.
APC = ((n + p)/(n − p)) (1 − R²)
where n is the sample size, p is the number of predictors including the intercept and R² is the coefficient of determination.
Value
Amemiya’s prediction error of the model.
References
Amemiya, T. (1976). Selection of Regressors. Technical Report 225, Stanford University, Stanford,
CA.
Judge, G. G., Griffiths, W. E., Hill, R. C., and Lee, T.-C. (1980). The Theory and Practice of
Econometrics. New York: John Wiley & Sons.
See Also
Other model selection criteria: ols_aic, ols_fpe, ols_hsp, ols_mallows_cp, ols_msep, ols_sbc,
ols_sbic
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_apc(model)
ols_coll_diag Collinearity diagnostics

Description
Variance inflation factor, tolerance, eigenvalues and condition indices.
Usage
ols_coll_diag(model)
ols_vif_tol(model)
ols_eigen_cindex(model)
Arguments
model An object of class lm.
Details
Collinearity implies that two variables are near perfect linear combinations of one another. Multicollinearity involves more than two variables. In the presence of multicollinearity, regression estimates are unstable and have high standard errors.
Tolerance
Percent of variance in the predictor that cannot be accounted for by other predictors.
Steps to calculate tolerance:
• Regress the kth predictor on the rest of the predictors in the model.
• Compute R², the coefficient of determination from the above regression.
• Tolerance = 1 − R²
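The tolerance computation can be sketched directly (a minimal illustration using the mtcars model from the examples, with disp as the predictor under examination):

```r
# Tolerance for one predictor: 1 - R^2 from regressing that predictor
# on all the other predictors in the model; VIF is its reciprocal.
r2  <- summary(lm(disp ~ hp + wt + drat, data = mtcars))$r.squared
tol <- 1 - r2
vif <- 1 / tol
```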
Condition Index
Most multivariate statistical approaches involve decomposing a correlation matrix into linear combinations of variables. The linear combinations are chosen so that the first combination has the largest possible variance (subject to some restrictions), the second combination has the next largest variance, subject to being uncorrelated with the first, the third has the largest possible variance, subject to being uncorrelated with the first and second, and so forth. The variance of each of these linear combinations is called an eigenvalue. Collinearity is spotted by finding 2 or more variables that have large proportions of variance (.50 or more) that correspond to large condition indices. A rule of thumb is to label as large those condition indices in the range of 30 or larger.
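As a sketch, condition indices can be computed from an eigen decomposition of the cross-product of the model matrix; the column scaling to unit length is an assumption about the convention used (ols_eigen_cindex() is the supported interface):

```r
# Condition indices: sqrt(largest eigenvalue / each eigenvalue) of the
# scaled cross-product of the model matrix.
X  <- model.matrix(lm(mpg ~ disp + hp + wt + drat, data = mtcars))
Xs <- apply(X, 2, function(col) col / sqrt(sum(col^2)))  # unit-length columns
ev <- eigen(crossprod(Xs), symmetric = TRUE)$values      # decreasing order
cindex <- sqrt(max(ev) / ev)   # values of 30 or more flag collinearity
```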
Value
ols_coll_diag returns an object of class "ols_coll_diag". An object of class "ols_coll_diag"
is a list containing the following components:
References
Belsley, D. A., Kuh, E., and Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential
Data and Sources of Collinearity. New York: John Wiley & Sons.
Examples
# model
model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)
# collinearity diagnostics
ols_coll_diag(model)
ols_correlations Part and partial correlations

Description
Zero-order, part and partial correlations.
Usage
ols_correlations(model)
Arguments
model An object of class lm.
Details
Value
ols_correlations returns an object of class "ols_correlations". An object of class "ols_correlations"
is a data frame containing the following components:
References
Morrison, D. F. 1976. Multivariate statistical methods. New York: McGraw-Hill.
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_correlations(model)
ols_fpe Final prediction error

Description
Estimated mean square error of prediction.
Usage
ols_fpe(model)
Arguments
model An object of class lm.
Details
Computes the estimated mean square error of prediction for each model selected assuming that the
values of the regressors are fixed and that the model is correct.
MSE ((n + p)/n)
where MSE = SSE/(n − p), n is the sample size and p is the number of predictors including the intercept.
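The formula above can be sketched by hand (using the mtcars model from the examples; ols_fpe() is the supported interface):

```r
# Final prediction error by hand.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n   <- nobs(model)
p   <- length(coef(model))                # predictors including intercept
mse <- sum(residuals(model)^2) / (n - p)  # MSE = SSE / (n - p)
fpe <- mse * (n + p) / n
```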
Value
Final prediction error of the model.
References
Akaike, H. (1969). “Fitting Autoregressive Models for Prediction.” Annals of the Institute of Statistical Mathematics 21:243–247.
Judge, G. G., Griffiths, W. E., Hill, R. C., and Lee, T.-C. (1980). The Theory and Practice of
Econometrics. New York: John Wiley & Sons.
See Also
Other model selection criteria: ols_aic, ols_apc, ols_hsp, ols_mallows_cp, ols_msep, ols_sbc,
ols_sbic
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_fpe(model)
ols_hadi Hadi’s influence measure

Description
Measure of influence based on the fact that influential observations can be present in either the response variable or in the predictors, or both.
Usage
ols_hadi(model)
Arguments
model An object of class lm.
Value
Hadi’s measure of the model.
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
See Also
Other influence measures: ols_leverage, ols_pred_rsq, ols_press
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_hadi(model)
ols_hsp Hocking’s Sp
Description
Average prediction mean squared error.
Usage
ols_hsp(model)
Arguments
model An object of class lm.
Details
Hocking’s Sp criterion is an adjustment of the residual sum of squares. Minimize this criterion.
MSE/(n − p − 1)
where MSE = SSE/(n − p), n is the sample size and p is the number of predictors including the intercept.
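The criterion above can be sketched by hand (using the mtcars model from the examples; ols_hsp() is the supported interface):

```r
# Hocking's Sp by hand.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n   <- nobs(model)
p   <- length(coef(model))                # predictors including intercept
mse <- sum(residuals(model)^2) / (n - p)  # MSE = SSE / (n - p)
hsp <- mse / (n - p - 1)
```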
Value
Hocking’s Sp of the model.
References
Hocking, R. R. (1976). “The Analysis and Selection of Variables in a Linear Regression.” Biometrics 32:1–50.
See Also
Other model selection criteria: ols_aic, ols_apc, ols_fpe, ols_mallows_cp, ols_msep, ols_sbc,
ols_sbic
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_hsp(model)
ols_launch_app Launch shiny app

Description
Launches shiny app for interactive model building.
Usage
ols_launch_app()
Examples
## Not run:
ols_launch_app()
## End(Not run)
ols_leverage Leverage
Description
The leverage of an observation is based on how much the observation’s value on the predictor
variable differs from the mean of the predictor variable. The greater an observation’s leverage, the
more potential it has to be an influential observation.
Usage
ols_leverage(model)
Arguments
model An object of class lm.
Value
Leverage of the model.
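As a sketch, the same values are the diagonal of the hat matrix, which base R exposes as hatvalues():

```r
# Leverage values are the diagonal of the hat matrix.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
h <- hatvalues(model)
mean(h)   # averages to p / n for a linear model
```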
References
Kutner, MH, Nachtscheim CJ, Neter J and Li W., 2004, Applied Linear Statistical Models (5th
edition). Chicago, IL., McGraw Hill/Irwin.
See Also
Other influence measures: ols_hadi, ols_pred_rsq, ols_press
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_leverage(model)
ols_mallows_cp Mallow’s Cp
Description
Mallow’s Cp.
Usage
ols_mallows_cp(model, fullmodel)
Arguments
model An object of class lm.
fullmodel An object of class lm.
Details
Mallows’ Cp statistic estimates the size of the bias that is introduced into the predicted responses by
having an underspecified model. Use Mallows’ Cp to choose between multiple regression models.
Look for models where Mallows’ Cp is small and close to the number of predictors in the model
plus the constant (p).
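As a sketch, assuming the conventional formula Cp = SSE_p/MSE_full − (n − 2p) (which may differ in implementation detail from ols_mallows_cp()):

```r
# Mallows' Cp by hand: compare a subset model against the full model.
full <- lm(mpg ~ ., data = mtcars)
sub  <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n    <- nobs(sub)
p    <- length(coef(sub))                              # parameters incl. intercept
mse_full <- sum(residuals(full)^2) / df.residual(full)
cp <- sum(residuals(sub)^2) / mse_full - (n - 2 * p)
```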
Value
Mallow’s Cp of the model.
References
Hocking, R. R. (1976). “The Analysis and Selection of Variables in a Linear Regression.” Biometrics 32:1–50.
Mallows, C. L. (1973). “Some Comments on Cp.” Technometrics 15:661–675.
See Also
Other model selection criteria: ols_aic, ols_apc, ols_fpe, ols_hsp, ols_msep, ols_sbc, ols_sbic
Examples
full_model <- lm(mpg ~ ., data = mtcars)
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_mallows_cp(model, full_model)
ols_msep MSEP
Description
Estimated error of prediction, assuming multivariate normality.
Usage
ols_msep(model)
Arguments
model An object of class lm.
Details
Computes the estimated mean square error of prediction assuming that both independent and dependent variables are multivariate normal.
where MSE = SSE/(n − p), n is the sample size and p is the number of predictors including the intercept.
Value
Estimated error of prediction of the model.
References
Stein, C. (1960). “Multiple Regression.” In Contributions to Probability and Statistics: Essays in
Honor of Harold Hotelling, edited by I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, and H.
B. Mann, 264–305. Stanford, CA: Stanford University Press.
Darlington, R. B. (1968). “Multiple Regression in Psychological Research and Practice.” Psychological Bulletin 69:161–182.
See Also
Other model selection criteria: ols_aic, ols_apc, ols_fpe, ols_hsp, ols_mallows_cp, ols_sbc,
ols_sbic
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_msep(model)
ols_plot_added_variable
Added variable plots
Description
Added variable plot provides information about the marginal importance of a predictor variable,
given the other predictor variables already in the model. It shows the marginal importance of the
variable in reducing the residual variability.
Usage
ols_plot_added_variable(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
The added variable plot was introduced by Mosteller and Tukey (1977). It enables us to visualize
the regression coefficient of a new variable being considered to be included in a model. The plot
can be constructed for each predictor variable.
Let us assume we want to test the effect of adding/removing variable X from a model. Let the
response variable of the model be Y
Steps to construct an added variable plot:
• Regress Y on all variables other than X and store the residuals (Y residuals).
• Regress X on all the other variables included in the model (X residuals).
• Construct a scatter plot of Y residuals and X residuals.
What do the Y and X residuals represent? The Y residuals represent the part of Y not explained
by all the variables other than X. The X residuals represent the part of X not explained by other
variables. The slope of the line fitted to the points in the added variable plot is equal to the regression
coefficient when Y is regressed on all variables including X.
A strong linear relationship in the added variable plot indicates the increased importance of the
contribution of X to the model already containing the other predictors.
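The steps above can be sketched directly; by the Frisch-Waugh-Lovell theorem, the slope of the line fitted to the two residual series recovers the coefficient of X in the full model (here wt in the mtcars model from the examples):

```r
# Construct added-variable-plot residuals by hand and recover the
# coefficient of wt from the full model.
full  <- lm(mpg ~ disp + hp + wt, data = mtcars)
y_res <- residuals(lm(mpg ~ disp + hp, data = mtcars))  # Y residuals
x_res <- residuals(lm(wt  ~ disp + hp, data = mtcars))  # X residuals
slope <- coef(lm(y_res ~ x_res))[["x_res"]]
```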
Deprecated Function
ols_avplots() has been deprecated. Instead use ols_plot_added_variable().
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
Kutner, MH, Nachtscheim CJ, Neter J and Li W., 2004, Applied Linear Statistical Models (5th
edition). Chicago, IL., McGraw Hill/Irwin.
See Also
[ols_plot_resid_regressor()], [ols_plot_comp_plus_resid()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_added_variable(model)
ols_plot_comp_plus_resid
Residual plus component plot
Description
The residual plus component plot indicates whether any non-linearity is present in the relationship
between response and predictor variables and can suggest possible transformations for linearizing
the data.
Usage
ols_plot_comp_plus_resid(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_rpc_plot() has been deprecated. Instead use ols_plot_comp_plus_resid().
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
Kutner, MH, Nachtscheim CJ, Neter J and Li W., 2004, Applied Linear Statistical Models (5th
edition). Chicago, IL., McGraw Hill/Irwin.
See Also
[ols_plot_added_variable()], [ols_plot_resid_regressor()]
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_plot_comp_plus_resid(model)
ols_plot_cooksd_bar Cook’s D bar plot

Description
Bar plot of Cook’s distance to detect observations that strongly influence fitted values of the model.
Usage
ols_plot_cooksd_bar(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
Cook’s distance was introduced by American statistician R Dennis Cook in 1977. It is used to
identify influential data points. It depends on both the residual and leverage i.e it takes it account
both the x value and y value of the observation.
Steps to compute Cook’s distance:
A data point having a large cook’s d indicates that the data point strongly influences the fitted values.
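As a sketch, base R already provides cooks.distance(); the 4/n cutoff below is an assumption about the default threshold used for flagging:

```r
# Cook's distance with a conventional 4/n cutoff.
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
d   <- cooks.distance(model)
thr <- 4 / nobs(model)        # assumed default cutoff
flagged <- which(d > thr)     # observations flagged as influential
```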
Value
ols_plot_cooksd_bar returns a list containing the following components:
outliers a data.frame with observation number and cooks distance that exceed threshold
threshold threshold for classifying an observation as an outlier
Deprecated Function
ols_cooksd_barplot() has been deprecated. Instead use ols_plot_cooksd_bar().
See Also
[ols_plot_cooksd_chart()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_cooksd_bar(model)
ols_plot_cooksd_chart Cook’s D chart

Description
Chart of Cook’s distance to detect observations that strongly influence fitted values of the model.
Usage
ols_plot_cooksd_chart(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
Cook’s distance was introduced by American statistician R Dennis Cook in 1977. It is used to
identify influential data points. It depends on both the residual and leverage i.e it takes it account
both the x value and y value of the observation.
Steps to compute Cook’s distance:
A data point having a large cook’s d indicates that the data point strongly influences the fitted values.
Value
ols_plot_cooksd_chart returns a list containing the following components:
outliers a data.frame with observation number and cooks distance that exceed threshold
threshold threshold for classifying an observation as an outlier
Deprecated Function
ols_cooksd_chart() has been deprecated. Instead use ols_plot_cooksd_chart().
See Also
[ols_plot_cooksd_bar()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_cooksd_chart(model)
ols_plot_dfbetas DFBETAs panel

Description
Panel of plots to detect influential observations using DFBETAs.
Usage
ols_plot_dfbetas(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
DFBETA measures the difference in each parameter estimate with and without the influential point.
There is a DFBETA for each data point i.e if there are n observations and k variables, there will be
n ∗ k DFBETAs. In general, large values of DFBETAS indicate observations that are influential in
estimating a given parameter. Belsley, Kuh,
p and Welsch recommend 2 as a general cutoff value to
indicate influential observations and 2/ (n) as a size-adjusted cutoff.
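As a sketch, base R provides dfbetas(), and the size-adjusted cutoff above is straightforward to apply:

```r
# DFBETAS with the size-adjusted cutoff 2 / sqrt(n).
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
db  <- dfbetas(model)          # one column per parameter, incl. intercept
cut <- 2 / sqrt(nobs(model))   # size-adjusted cutoff
flagged_wt <- which(abs(db[, "wt"]) > cut)  # influential for the wt estimate
```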
Value
list; ols_plot_dfbetas returns a list of data.frames (for the intercept and each predictor) with the observation number and DFBETA of observations that exceed the threshold for classifying an observation as an outlier/influential observation.
Deprecated Function
ols_dfbetas_panel() has been deprecated. Instead use ols_plot_dfbetas().
References
Belsley, David A.; Kuh, Edwin; Welsch, Roy E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons. ISBN 0-471-05856-4.
See Also
[ols_plot_dffits()]
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_plot_dfbetas(model)
ols_plot_dffits DFFITS plot

Description
Plot for detecting influential observations using DFFITs.
Usage
ols_plot_dffits(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
DFFIT - difference in fits, is used to identify influential data points. It quantifies the number of
standard deviations that the fitted value changes when the ith data point is omitted.
Steps to compute DFFITs:
An observation is deemed influential if the absolute value of its DFFITS value is greater than:
p
2 (p + 1)/(n − p − 1)
where n is the number of observations and p is the number of predictors including intercept.
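As a sketch, base R provides dffits(), and the threshold above is simple to compute:

```r
# DFFITS with the threshold from the formula above.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n   <- nobs(model)
p   <- length(coef(model))                 # predictors including intercept
thr <- 2 * sqrt((p + 1) / (n - p - 1))
flagged <- which(abs(dffits(model)) > thr)
```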
Value
ols_plot_dffits returns a list containing the following components:
outliers a data.frame with observation number and DFFITs that exceed threshold
threshold threshold for classifying an observation as an outlier
Deprecated Function
ols_dffits_plot() has been deprecated. Instead use ols_plot_dffits().
References
Belsley, David A.; Kuh, Edwin; Welsch, Roy E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons. ISBN 0-471-05856-4.
See Also
[ols_plot_dfbetas()]
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_plot_dffits(model)
ols_plot_diagnostics Diagnostics panel

Description
Panel of plots for regression diagnostics.
Usage
ols_plot_diagnostics(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_diagnostic_panel() has been deprecated. Instead use ols_plot_diagnostics().
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_plot_diagnostics(model)
ols_plot_hadi Hadi plot

Description
Hadi’s measure of influence based on the fact that influential observations can be present in either
the response variable or in the predictors or both. The plot is used to detect influential observations
based on Hadi’s measure.
Usage
ols_plot_hadi(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_hadi_plot() has been deprecated. Instead use ols_plot_hadi().
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
See Also
[ols_plot_resid_pot()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_hadi(model)
ols_plot_obs_fit Observed vs fitted plot

Description
Plot of observed vs fitted values to assess the fit of the model.
Usage
ols_plot_obs_fit(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
Ideally, all your points should be close to a regressed diagonal line. Draw such a diagonal line
within your graph and check out where the points lie. If your model had a high R Square, all the
points would be close to this diagonal line. The lower the R Square, the weaker the Goodness of fit
of your model, the more foggy or dispersed your points are from this diagonal line.
Deprecated Function
ols_ovsp_plot() has been deprecated. Instead use ols_plot_obs_fit().
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_obs_fit(model)
ols_plot_reg_line Regression line plot

Description
Plot to demonstrate that the regression line always passes through the means of the response and predictor variables.
Usage
ols_plot_reg_line(response, predictor, print_plot = TRUE)
Arguments
response Response variable.
predictor Predictor variable.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_reg_line() has been deprecated. Instead use ols_plot_reg_line().
Examples
ols_plot_reg_line(mtcars$mpg, mtcars$disp)
ols_plot_resid_box Residual box plot

Description
Box plot of residuals to examine if residuals are normally distributed.
Usage
ols_plot_resid_box(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_rsd_boxplot() has been deprecated. Instead use ols_plot_resid_box().
See Also
Other residual diagnostics: ols_plot_resid_fit, ols_plot_resid_hist, ols_plot_resid_qq,
ols_test_correlation, ols_test_normality
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_box(model)
ols_plot_resid_fit Residual vs fitted plot

Description
Scatter plot of residuals on the y axis and fitted values on the x axis to detect non-linearity, unequal
error variances, and outliers.
Usage
ols_plot_resid_fit(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
Characteristics of a well behaved residual vs fitted values plot:
• The residuals spread randomly around the 0 line indicating that the relationship is linear.
• The residuals form an approximate horizontal band around the 0 line indicating homogeneity
of error variance.
• No one residual is visibly away from the random pattern of the residuals indicating that there
are no outliers.
Deprecated Function
ols_rvsp_plot() has been deprecated. Instead use ols_plot_resid_fit().
See Also
Other residual diagnostics: ols_plot_resid_box, ols_plot_resid_hist, ols_plot_resid_qq, ols_test_correlation, ols_test_normality
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_fit(model)
ols_plot_resid_fit_spread
Residual fit spread plot
Description
Plot to detect non-linearity, influential observations and outliers.
Usage
ols_plot_resid_fit_spread(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
Consists of side-by-side quantile plots of the centered fit and the residuals. It shows how much
variation in the data is explained by the fit and how much remains in the residuals. For inappropriate
models, the spread of the residuals in such a plot is often greater than the spread of the centered fit.
Deprecated Function
ols_rfs_plot(), ols_fm_plot() and ols_rsd_plot() have been deprecated. Instead use ols_plot_resid_fit_spread(), ols_plot_fm() and ols_plot_resid_spread() respectively.
References
Cleveland, W. S. (1993). Visualizing Data. Summit, NJ: Hobart Press.
Examples
# model
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_spread(model)
ols_plot_resid_hist Residual histogram

Description
Histogram of residuals for detecting violation of normality assumption.
Usage
ols_plot_resid_hist(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_rsd_hist() has been deprecated. Instead use ols_plot_resid_hist().
See Also
Other residual diagnostics: ols_plot_resid_box, ols_plot_resid_fit, ols_plot_resid_qq,
ols_test_correlation, ols_test_normality
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_hist(model)
ols_plot_resid_lev Studentized residuals vs leverage plot

Description
Graph for detecting outliers and/or observations with high leverage.
Usage
ols_plot_resid_lev(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_rsdlev_plot() has been deprecated. Instead use ols_plot_resid_lev().
See Also
[ols_plot_resid_stud_fit()], [ols_plot_resid_lev()]
Examples
model <- lm(read ~ write + math + science, data = hsb)
ols_plot_resid_lev(model)
ols_plot_resid_pot Potential residual plot

Description
Plot to aid in classifying unusual observations as high-leverage points, outliers, or a combination of
both.
Usage
ols_plot_resid_pot(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_potrsd_plot() has been deprecated. Instead use ols_plot_resid_pot().
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
See Also
[ols_plot_hadi()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_pot(model)

ols_plot_resid_qq Residual QQ plot

Description
Graph for detecting violation of normality assumption.
Usage
ols_plot_resid_qq(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_rsd_qqplot() has been deprecated. Instead use ols_plot_resid_qq().
See Also
Other residual diagnostics: ols_plot_resid_box, ols_plot_resid_fit, ols_plot_resid_hist, ols_test_correlation, ols_test_normality
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_qq(model)
ols_plot_resid_regressor
Residual vs regressor plot
Description
Graph to determine whether a new predictor should be added to a model that already contains other
predictors. The residuals from the model are regressed on the new predictor; if the plot shows a
non-random pattern, consider adding the new predictor to the model.
Usage
ols_plot_resid_regressor(model, variable, print_plot = TRUE)
Arguments
model An object of class lm.
variable New predictor to be added to the model.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_rvsr_plot() has been deprecated. Instead use ols_plot_resid_regressor().
See Also
[ols_plot_added_variable()], [ols_plot_comp_plus_resid()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_regressor(model, 'drat')
Description
Chart for identifying outliers.
Usage
ols_plot_resid_stand(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
A standardized residual (internally studentized residual) is the residual divided by its estimated
standard deviation.
Value
ols_plot_resid_stand returns a list containing the following components:
outliers a data.frame with observation numbers and standardized residuals that exceed the
threshold
Deprecated Function
ols_srsd_chart() has been deprecated. Instead use ols_plot_resid_stand().
See Also
[ols_plot_resid_stud()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_stand(model)
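The standardized (internally studentized) residuals plotted here can be reproduced with base R; a minimal sketch checked against stats::rstandard():

```r
# Sketch: internally studentized residuals by hand, compared
# with stats::rstandard().
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
s     <- summary(model)$sigma                 # residual standard error
r_std <- resid(model) / (s * sqrt(1 - hatvalues(model)))
check <- all.equal(unname(r_std), unname(rstandard(model)))
```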
Description
Graph for identifying outliers.
Usage
ols_plot_resid_stud(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
A studentized deleted residual (externally studentized residual) is the deleted residual divided by
its estimated standard deviation. Studentized residuals are more effective than standardized residuals
for detecting outlying Y observations. If an observation has an externally studentized residual
larger than 3 in absolute value, we can call it an outlier.
Value
ols_plot_resid_stud returns a list containing the following components:
outliers a data.frame with observation numbers and studentized residuals that exceed the
threshold
Deprecated Function
ols_srsd_plot() has been deprecated. Instead use ols_plot_resid_stud().
See Also
[ols_plot_resid_stand()]
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_resid_stud(model)
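The deleted (externally) studentized residuals described above match base R's rstudent(); a sketch of the hand computation, which avoids refitting n models:

```r
# Sketch: externally studentized residuals via the deleted-variance
# shortcut; sigma is re-estimated with the ith observation removed.
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
r <- resid(model)
h <- hatvalues(model)
n <- nobs(model)
p <- length(coef(model))
s2_del <- (sum(r^2) - r^2 / (1 - h)) / (n - p - 1)  # deleted variance
t_ext  <- r / sqrt(s2_del * (1 - h))
check  <- all.equal(unname(t_ext), unname(rstudent(model)))
```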
ols_plot_resid_stud_fit
Deleted studentized residual vs fitted values plot
Description
Plot for detecting violation of assumptions about residuals such as non-linearity, constant variances
and outliers. It can also be used to examine model fit.
Usage
ols_plot_resid_stud_fit(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Details
A studentized deleted residual (externally studentized residual) is the deleted residual divided by
its estimated standard deviation. Studentized residuals are more effective than standardized residuals
for detecting outlying Y observations. If an observation has an externally studentized residual
larger than 2 in absolute value, we can call it an outlier.
Value
ols_plot_resid_stud_fit returns a list containing the following components:
outliers a data.frame with observation numbers, fitted values and deleted studentized
residuals that exceed the threshold for classifying observations as outliers/influential
observations
threshold threshold for classifying an observation as an outlier/influential observation
Deprecated Function
ols_dsrvsp_plot() has been deprecated. Instead use ols_plot_resid_stud_fit().
See Also
[ols_plot_resid_lev()], [ols_plot_resid_stand()], [ols_plot_resid_stud()]
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_plot_resid_stud_fit(model)
Description
Panel of plots to explore and visualize the response variable.
Usage
ols_plot_response(model, print_plot = TRUE)
Arguments
model An object of class lm.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Deprecated Function
ols_resp_viz() has been deprecated. Instead use ols_plot_response().
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_plot_response(model)
Description
Use predicted R-squared to determine how well the model predicts responses for new observations.
Larger values of predicted R-squared indicate models of greater predictive ability.
Usage
ols_pred_rsq(model)
Arguments
model An object of class lm.
Value
Predicted R-squared of the model.
See Also
Other influence measures: ols_hadi, ols_leverage, ols_press
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_pred_rsq(model)
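Predicted R-squared is commonly defined as 1 − PRESS/SST; a minimal base R sketch of that definition (an illustration of the formula, not necessarily the exact internals of ols_pred_rsq()):

```r
# Sketch: predicted R-squared from PRESS, using the leverage
# shortcut for the PRESS residuals.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
press <- sum((resid(model) / (1 - hatvalues(model)))^2)
sst   <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
pred_rsq <- 1 - press / sst
```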
Description
Data for generating the added variable plots.
Usage
ols_prep_avplot_data(model)
Arguments
model An object of class lm.
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_prep_avplot_data(model)
Description
Prepares data for Cook’s D bar plot.
Usage
ols_prep_cdplot_data(model)
Arguments
model An object of class lm.
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_prep_cdplot_data(model)
ols_prep_cdplot_outliers
Cook’s D outlier data
Description
Outlier data for Cook’s D bar plot.
Usage
ols_prep_cdplot_outliers(k)
Arguments
k Cook’s D bar plot data.
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
k <- ols_prep_cdplot_data(model)
ols_prep_cdplot_outliers(k)
Description
Prepares the data for dfbetas plot.
Usage
ols_prep_dfbeta_data(d, threshold)
Arguments
d A tibble or data.frame with dfbetas.
threshold The threshold for outliers.
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
dfb <- dfbetas(model)
n <- nrow(dfb)
threshold <- 2 / sqrt(n)
dbetas <- dfb[, 1]
df_data <- data.frame(obs = seq_len(n), dbetas = dbetas)
ols_prep_dfbeta_data(df_data, threshold)
ols_prep_dfbeta_outliers
DFBETAs plot outliers
Description
Data for identifying outliers in dfbetas plot.
Usage
ols_prep_dfbeta_outliers(d)
Arguments
d A tibble or data.frame.
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
dfb <- dfbetas(model)
n <- nrow(dfb)
threshold <- 2 / sqrt(n)
dbetas <- dfb[, 1]
df_data <- data.frame(obs = seq_len(n), dbetas = dbetas)
d <- ols_prep_dfbeta_data(df_data, threshold)
ols_prep_dfbeta_outliers(d)
Description
Generates data for deleted studentized residual vs fitted plot.
Usage
ols_prep_dsrvf_data(model)
Arguments
model An object of class lm.
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_prep_dsrvf_data(model)
Description
Identify outliers in Cook’s D plot.
Usage
ols_prep_outlier_obs(k)
Arguments
k Cook’s D bar plot data.
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
k <- ols_prep_cdplot_data(model)
ols_prep_outlier_obs(k)
Description
Regress a predictor in the model on all the other predictors.
Usage
ols_prep_regress_x(data, i)
Arguments
data A data.frame.
i A numeric vector (indicates the predictor in the model).
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
data <- ols_prep_avplot_data(model)
ols_prep_regress_x(data, 1)
Description
Regress y on all the predictors except the ith predictor.
Usage
ols_prep_regress_y(data, i)
Arguments
data A data.frame.
i A numeric vector (indicates the predictor in the model).
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
data <- ols_prep_avplot_data(model)
ols_prep_regress_y(data, 1)
ols_prep_rfsplot_fmdata
Residual fit spread plot data
Description
Data for generating the residual fit spread plot.
Usage
ols_prep_rfsplot_fmdata(model)
ols_prep_rfsplot_rsdata(model)
Arguments
model An object of class lm.
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_prep_rfsplot_fmdata(model)
ols_prep_rfsplot_rsdata(model)
ols_prep_rstudlev_data
Studentized residual vs leverage plot data
Description
Generates data for studentized residual vs leverage plot.
Usage
ols_prep_rstudlev_data(model)
Arguments
model An object of class lm.
Examples
model <- lm(read ~ write + math + science, data = hsb)
ols_prep_rstudlev_data(model)
ols_prep_rvsrplot_data
Residual vs regressor plot data
Description
Data for generating residual vs regressor plot.
Usage
ols_prep_rvsrplot_data(model)
Arguments
model An object of class lm.
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_prep_rvsrplot_data(model)
Description
Generates data for standardized residual chart.
Usage
ols_prep_srchart_data(model)
Arguments
model An object of class lm.
Examples
model <- lm(read ~ write + math + science, data = hsb)
ols_prep_srchart_data(model)
Description
Generates data for studentized residual plot.
Usage
ols_prep_srplot_data(model)
Arguments
model An object of class lm.
Examples
model <- lm(read ~ write + math + science, data = hsb)
ols_prep_srplot_data(model)
ols_press PRESS
Description
PRESS (prediction sum of squares) tells you how well the model will predict new data.
Usage
ols_press(model)
Arguments
model An object of class lm.
Details
The prediction sum of squares (PRESS) is the sum of squares of the prediction errors. Each fitted
value for PRESS is obtained by deleting the ith observation, fitting the model to the remaining
n − 1 observations, and using that fit to predict the ith observation. Use PRESS to assess your
model’s predictive ability. Usually, the smaller the PRESS value, the better the model’s predictive
ability.
Value
Prediction sum of squares of the model.
References
Kutner, MH, Nachtscheim CJ, Neter J and Li W., 2004, Applied Linear Statistical Models (5th
edition). Chicago, IL., McGraw Hill/Irwin.
See Also
Other influence measures: ols_hadi, ols_leverage, ols_pred_rsq
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_press(model)
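The definition above can be computed without n refits, since each PRESS residual equals e_i/(1 − h_i); a base R sketch:

```r
# Sketch: PRESS via the leverage shortcut. Each deleted residual
# is the ordinary residual scaled by 1 / (1 - leverage).
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
e <- resid(model)
h <- hatvalues(model)
press <- sum((e / (1 - h))^2)   # sum of squared prediction errors
```

PRESS is always at least as large as the residual sum of squares, since each residual is inflated by 1/(1 − h_i).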
Description
Assess how much of the error in prediction is due to lack of model fit.
Usage
ols_pure_error_anova(model, ...)
Arguments
model An object of class lm.
... Other parameters.
Details
The residual sum of squares resulting from a regression can be decomposed into two components:
lack of fit and pure (random) error. If most of the error is due to lack of fit and not just random
error, the model should be discarded and a new model built.
Value
ols_pure_error_anova returns an object of class "ols_pure_error_anova". An object of class
"ols_pure_error_anova" is a list containing the following components:
mpred data.frame containing data for the response and predictor of the model
df_rss regression sum of squares degrees of freedom
df_ess error sum of squares degrees of freedom
df_lof lack of fit degrees of freedom
df_error pure error degrees of freedom
final data.frame; contains computed values used for the lack of fit f test
resp character vector; name of response variable
preds character vector; name of predictor variable
Note
The lack of fit F test works only with simple linear regression. Moreover, it is important that the
data contain repeat observations, i.e. replicates, for at least one of the values of the predictor x.
This test generally only applies to datasets with plenty of replicates.
References
Kutner, MH, Nachtscheim CJ, Neter J and Li W., 2004, Applied Linear Statistical Models (5th
edition). Chicago, IL., McGraw Hill/Irwin.
Examples
model <- lm(mpg ~ disp, data = mtcars)
ols_pure_error_anova(model)
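The same lack-of-fit decomposition can be cross-checked in base R by comparing the linear model against the saturated model that treats the predictor as a factor; a sketch:

```r
# Sketch: lack-of-fit F test via nested models. The factor model
# absorbs all between-x variation, leaving only pure error in
# its residuals (mtcars$disp has replicated values).
linear    <- lm(mpg ~ disp, data = mtcars)
saturated <- lm(mpg ~ factor(disp), data = mtcars)
lof <- anova(linear, saturated)   # lack of fit vs pure error
```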
Description
Ordinary least squares regression.
Usage
ols_regress(object, ...)
Arguments
object An object of class "formula" (or one that can be coerced to that class): a symbolic
description of the model to be fitted, or an object of class lm.
... Other inputs.
Value
ols_regress returns an object of class "ols_regress". An object of class "ols_regress" is a
list containing the following components:
Interaction Terms
If the model includes interaction terms, the standardized betas are computed after scaling and
centering the predictors.
References
https://www.ssc.wisc.edu/~hemken/Stataworkshops/stdBeta/Getting
Examples
ols_regress(mpg ~ disp + hp + wt, data = mtcars)
Description
Bayesian information criterion for model selection.
Usage
ols_sbc(model, method = c("R", "STATA", "SAS"))
Arguments
model An object of class lm.
method A character vector; specify the method to compute BIC. Valid options include
R, STATA and SAS.
Details
SBC provides a means for model selection. Given a collection of models for the data, SBC estimates
the quality of each model relative to the other models. R and STATA use the log-likelihood to
compute SBC; SAS uses the residual sum of squares. Below is the formula in each case:
R & STATA
SBC = −2(loglikelihood) + p ∗ ln(n)
SAS
SBC = n ∗ ln(SSE/n) + p ∗ ln(n)
where n is the sample size and p is the number of model parameters including the intercept.
Value
The bayesian information criterion of the model.
References
Schwarz, G. (1978). “Estimating the Dimension of a Model.” Annals of Statistics 6:461–464.
Judge, G. G., Griffiths, W. E., Hill, R. C., and Lee, T.-C. (1980). The Theory and Practice of
Econometrics. New York: John Wiley & Sons.
See Also
Other model selection criteria: ols_aic, ols_apc, ols_fpe, ols_hsp, ols_mallows_cp, ols_msep,
ols_sbic
Examples
# using R computation method
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_sbc(model)
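Under the log-likelihood formulation, SBC agrees with base R's BIC() for lm objects; a sketch (R counts the error variance as an extra parameter alongside the coefficients):

```r
# Sketch: SBC/BIC by hand for an lm fit, checked against stats::BIC().
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n <- nobs(model)
k <- length(coef(model)) + 1          # coefficients + error variance
sbc <- -2 * as.numeric(logLik(model)) + k * log(n)
check <- all.equal(sbc, BIC(model))
```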
Description
Sawa’s bayesian information criterion for model selection.
Usage
ols_sbic(model, full_model)
Arguments
model An object of class lm.
full_model An object of class lm.
Details
Sawa (1978) developed a model selection criterion derived from a Bayesian modification of the
AIC criterion. Sawa’s Bayesian Information Criterion (BIC) is a function of the number of
observations n, the SSE, the pure error variance from fitting the full model, and the number of
independent variables including the intercept:
SBIC = n ∗ ln(SSE/n) + 2(p + 2)q − 2q²
where q = n(σ²)/SSE, n is the sample size, p is the number of model parameters including the
intercept, and SSE is the residual sum of squares.
Value
Sawa’s Bayesian Information Criterion
References
Sawa, T. (1978). “Information Criteria for Discriminating among Alternative Regression Models.”
Econometrica 46:1273–1282.
Judge, G. G., Griffiths, W. E., Hill, R. C., and Lee, T.-C. (1980). The Theory and Practice of
Econometrics. New York: John Wiley & Sons.
See Also
Other model selection criteria: ols_aic, ols_apc, ols_fpe, ols_hsp, ols_mallows_cp, ols_msep,
ols_sbc
Examples
full_model <- lm(mpg ~ ., data = mtcars)
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_sbic(model, full_model)
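Sawa's criterion is commonly given as n·ln(SSE/n) + 2(p + 2)q − 2q² with q = nσ²/SSE, where σ² is the pure error variance from fitting the full model. A sketch under that formulation (values may differ from ols_sbic() if the package counts parameters differently):

```r
# Sketch: Sawa's SBIC by hand, assuming the SAS-style formula.
full_model <- lm(mpg ~ ., data = mtcars)
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
n   <- nobs(model)
p   <- length(coef(model))              # parameters incl. intercept
sse <- sum(resid(model)^2)
q   <- n * summary(full_model)$sigma^2 / sse
sbic <- n * log(sse / n) + 2 * (p + 2) * q - 2 * q^2
```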
Description
Fits all regressions involving one regressor, two regressors, three regressors, and so on. It tests all
possible subsets of the set of potential independent variables.
Usage
ols_step_all_possible(model, ...)
Arguments
model An object of class lm.
... Other arguments.
Value
n model number
predictors predictors in the model
rsquare R-squared of the model
adjr adjusted R-squared of the model
predrsq predicted R-squared of the model
cp Mallows’ Cp
aic Akaike information criterion
sbic Sawa’s Bayesian information criterion
sbc Schwarz Bayesian criterion
gmsep estimated MSE of prediction, assuming multivariate normality
jp final prediction error
pc Amemiya’s prediction criterion
sp Hocking’s Sp
Deprecated Function
ols_all_subset() has been deprecated. Instead use ols_step_all_possible().
References
Mendenhall, William and Sincich, Terry, 2012, A Second Course in Statistics: Regression Analysis
(7th edition). Prentice Hall.
See Also
Other variable selection procedures: ols_step_backward_aic, ols_step_backward_p, ols_step_best_subset,
ols_step_both_aic, ols_step_forward_aic, ols_step_forward_p
Examples
model <- lm(mpg ~ disp + hp, data = mtcars)
k <- ols_step_all_possible(model)
k
# plot
plot(k)
ols_step_all_possible_betas
All possible regression variable coefficients
Description
Returns the coefficients for each variable from each model.
Usage
ols_step_all_possible_betas(object, ...)
Arguments
object An object of class lm.
... Other arguments.
Value
ols_step_all_possible_betas returns a data.frame containing the beta coefficients for each
variable from each model.
Examples
## Not run:
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_step_all_possible_betas(model)
## End(Not run)
Description
Build a regression model from a set of candidate predictor variables by removing predictors based
on the Akaike information criterion, in a stepwise manner, until no variable is left to remove.
Usage
ols_step_backward_aic(model, ...)
## Default S3 method:
ols_step_backward_aic(model, progress = FALSE,
details = FALSE, ...)
Arguments
model An object of class lm; the model should include all candidate predictor variables.
... Other arguments.
progress Logical; if TRUE, will display variable selection progress.
details Logical; if TRUE, will print the regression result at each step.
x An object of class ols_step_backward_aic.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Value
ols_step_backward_aic returns an object of class "ols_step_backward_aic". An object of
class "ols_step_backward_aic" is a list containing the following components:
Deprecated Function
ols_stepaic_backward() has been deprecated. Instead use ols_step_backward_aic().
References
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
Other variable selection procedures: ols_step_all_possible, ols_step_backward_p, ols_step_best_subset,
ols_step_both_aic, ols_step_forward_aic, ols_step_forward_p
Examples
# stepwise backward regression
model <- lm(y ~ ., data = surgical)
k <- ols_step_backward_aic(model)
k
# final model
k$model
Description
Build a regression model from a set of candidate predictor variables by removing predictors based
on p values, in a stepwise manner, until no variable is left to remove.
Usage
ols_step_backward_p(model, ...)
## Default S3 method:
ols_step_backward_p(model, prem = 0.3,
progress = FALSE, details = FALSE, ...)
Arguments
model An object of class lm; the model should include all candidate predictor variables.
... Other inputs.
prem p value; variables with a p value greater than prem will be removed from the model.
progress Logical; if TRUE, will display variable selection progress.
details Logical; if TRUE, will print the regression result at each step.
x An object of class ols_step_backward_p.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Value
ols_step_backward_p returns an object of class "ols_step_backward_p".
Deprecated Function
ols_step_backward() has been deprecated. Instead use ols_step_backward_p().
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
See Also
Other variable selection procedures: ols_step_all_possible, ols_step_backward_aic, ols_step_best_subset,
ols_step_both_aic, ols_step_forward_aic, ols_step_forward_p
Examples
# stepwise backward regression
model <- lm(y ~ ., data = surgical)
k <- ols_step_backward_p(model)
k
# final model
k$model
Description
Select the subset of predictors that do the best at meeting some well-defined objective criterion,
such as having the largest R2 value or the smallest MSE, Mallows’ Cp or AIC.
Usage
ols_step_best_subset(model, ...)
Arguments
model An object of class lm.
... Other inputs.
x An object of class ols_step_best_subset.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Value
ols_step_best_subset returns an object of class "ols_step_best_subset". An object of class
"ols_step_best_subset" is a data frame containing the following components:
n model number
predictors predictors in the model
rsquare R-squared of the model
adjr adjusted R-squared of the model
Deprecated Function
ols_best_subset() has been deprecated. Instead use ols_step_best_subset().
References
Kutner, MH, Nachtscheim CJ, Neter J and Li W., 2004, Applied Linear Statistical Models (5th
edition). Chicago, IL., McGraw Hill/Irwin.
See Also
Other variable selection procedures: ols_step_all_possible, ols_step_backward_aic, ols_step_backward_p,
ols_step_both_aic, ols_step_forward_aic, ols_step_forward_p
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_step_best_subset(model)
# plot
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
k <- ols_step_best_subset(model)
plot(k)
Description
Build a regression model from a set of candidate predictor variables by entering and removing
predictors based on the Akaike information criterion, in a stepwise manner, until no variable is
left to enter or remove.
Usage
ols_step_both_aic(model, progress = FALSE, details = FALSE)
Arguments
model An object of class lm.
progress Logical; if TRUE, will display variable selection progress.
details Logical; if TRUE, details of variable selection will be printed on screen.
x An object of class ols_step_both_aic.
print_plot logical; if TRUE, prints the plot else returns a plot object.
... Other arguments.
Value
ols_step_both_aic returns an object of class "ols_step_both_aic". An object of class "ols_step_both_aic"
is a list containing the following components:
Deprecated Function
ols_stepaic_both() has been deprecated. Instead use ols_step_both_aic().
References
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
Other variable selection procedures: ols_step_all_possible, ols_step_backward_aic, ols_step_backward_p,
ols_step_best_subset, ols_step_forward_aic, ols_step_forward_p
Examples
## Not run:
# stepwise regression
model <- lm(y ~ ., data = stepdata)
k <- ols_step_both_aic(model)
k
# final model
k$model
## End(Not run)
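For a cross-check, base R's step() performs a comparable AIC-based stepwise search (the selected model may differ from ols_step_both_aic() in details such as the parameter count used in the AIC):

```r
# Sketch: AIC-based stepwise selection with stats::step(),
# searching both directions from the full model.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
selected <- step(model, direction = "both", trace = FALSE)
formula(selected)   # predictors retained by the search
```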
Description
Build a regression model from a set of candidate predictor variables by entering and removing
predictors based on p values, in a stepwise manner, until no variable is left to enter or remove.
Usage
ols_step_both_p(model, ...)
## Default S3 method:
ols_step_both_p(model, pent = 0.1, prem = 0.3,
progress = FALSE, details = FALSE, ...)
Arguments
model An object of class lm; the model should include all candidate predictor variables.
... Other arguments.
pent p value; variables with a p value less than pent will enter the model.
prem p value; variables with a p value greater than prem will be removed from the model.
progress Logical; if TRUE, will display variable selection progress.
details Logical; if TRUE, will print the regression result at each step.
x An object of class ols_step_both_p.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Value
ols_step_both_p returns an object of class "ols_step_both_p". An object of class "ols_step_both_p"
is a list containing the following components:
Deprecated Function
ols_stepwise() has been deprecated. Instead use ols_step_both_p().
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
Examples
# stepwise regression
model <- lm(y ~ ., data = surgical)
k <- ols_step_both_p(model)
k
# final model
k$model
Description
Build a regression model from a set of candidate predictor variables by entering predictors based
on the Akaike information criterion, in a stepwise manner, until no variable is left to enter.
Usage
ols_step_forward_aic(model, ...)
## Default S3 method:
ols_step_forward_aic(model, progress = FALSE,
details = FALSE, ...)
Arguments
model An object of class lm.
... Other arguments.
progress Logical; if TRUE, will display variable selection progress.
details Logical; if TRUE, will print the regression result at each step.
x An object of class ols_step_forward_aic.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Value
ols_step_forward_aic returns an object of class "ols_step_forward_aic". An object of class
"ols_step_forward_aic" is a list containing the following components:
Deprecated Function
ols_stepaic_forward() has been deprecated. Instead use ols_step_forward_aic().
References
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
Other variable selection procedures: ols_step_all_possible, ols_step_backward_aic, ols_step_backward_p,
ols_step_best_subset, ols_step_both_aic, ols_step_forward_p
Examples
# stepwise forward regression
model <- lm(y ~ ., data = surgical)
k <- ols_step_forward_aic(model)
k
# final model
k$model
Description
Build a regression model from a set of candidate predictor variables by entering predictors based
on p values, in a stepwise manner, until no variable is left to enter.
Usage
ols_step_forward_p(model, ...)
## Default S3 method:
ols_step_forward_p(model, penter = 0.3,
progress = FALSE, details = FALSE, ...)
Arguments
model An object of class lm; the model should include all candidate predictor variables.
... Other arguments.
penter p value; variables with a p value less than penter will enter the model.
progress Logical; if TRUE, will display variable selection progress.
details Logical; if TRUE, will print the regression result at each step.
x An object of class ols_step_forward_p.
print_plot logical; if TRUE, prints the plot else returns a plot object.
Value
ols_step_forward_p returns an object of class "ols_step_forward_p". An object of class "ols_step_forward_p"
is a list containing the following components:
Deprecated Function
ols_step_forward() has been deprecated. Instead use ols_step_forward_p().
References
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley &
Sons, 2012. Print.
Kutner, MH, Nachtscheim CJ, Neter J and Li W., 2004, Applied Linear Statistical Models (5th
edition). Chicago, IL., McGraw Hill/Irwin.
See Also
Other variable selection procedures: ols_step_all_possible, ols_step_backward_aic, ols_step_backward_p,
ols_step_best_subset, ols_step_both_aic, ols_step_forward_aic
Examples
# stepwise forward regression
model <- lm(y ~ ., data = surgical)
k <- ols_step_forward_p(model)
k
# final model
k$model
Description
Test if k samples are from populations with equal variances.
Usage
ols_test_bartlett(data, ...)
## Default S3 method:
ols_test_bartlett(data, ..., group_var = NULL)
Arguments
data A data.frame or tibble.
... Columns in data.
group_var Grouping variable.
Details
Bartlett’s test is used to test whether variances across samples are equal. It is sensitive to departures
from normality. The Levene test is an alternative that is less sensitive to departures from normality.
Value
ols_test_bartlett returns an object of class "ols_test_bartlett". An object of class "ols_test_bartlett"
is a list containing the following components:
fstat f statistic
pval p-value of fstat
df degrees of freedom
Deprecated Function
ols_bartlett_test() has been deprecated. Instead use ols_test_bartlett().
References
Snedecor, George W. and Cochran, William G. (1989), Statistical Methods, Eighth Edition, Iowa
State University Press.
See Also
Other heteroskedasticity tests: ols_test_breusch_pagan, ols_test_f, ols_test_score
Examples
# using grouping variable
library(descriptr)
ols_test_bartlett(mtcarz, 'mpg', group_var = 'cyl')
# using variables
ols_test_bartlett(hsb, 'read', 'write')
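The same hypothesis can also be checked with base R's bartlett.test(); a sketch using a data set shipped with base R:

```r
# Sketch: Bartlett's test of equal variances across groups,
# using the InsectSprays data from base R.
res <- bartlett.test(count ~ spray, data = InsectSprays)
res$statistic   # Bartlett's K-squared
res$p.value
```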
ols_test_breusch_pagan
Breusch pagan test
Description
Test for constant variance. It assumes that the error terms are normally distributed.
Usage
ols_test_breusch_pagan(model, fitted.values = TRUE, rhs = FALSE,
multiple = FALSE, p.adj = c("none", "bonferroni", "sidak", "holm"),
vars = NA)
Arguments
model An object of class lm.
fitted.values Logical; if TRUE, use fitted values of regression model.
rhs Logical; if TRUE, specifies that tests for heteroskedasticity be performed for the
right-hand-side (explanatory) variables of the fitted regression model.
multiple Logical; if TRUE, specifies that multiple testing be performed.
p.adj Adjustment for p value, the following options are available: bonferroni, holm,
sidak and none.
vars Variables to be used for heteroskedasticity test.
Details
The Breusch Pagan test was introduced by Trevor Breusch and Adrian Pagan in 1979. It is used to
test for heteroskedasticity in a linear regression model; it tests whether the variance of the errors
from a regression depends on the values of the independent variables.
Value
ols_test_breusch_pagan returns an object of class "ols_test_breusch_pagan". An object of
class "ols_test_breusch_pagan" is a list containing the following components:
Deprecated Function
ols_bp_test() has been deprecated. Instead use ols_test_breusch_pagan().
References
T.S. Breusch & A.R. Pagan (1979), A Simple Test for Heteroscedasticity and Random Coefficient
Variation. Econometrica 47, 1287–1294
Cook, R. D.; Weisberg, S. (1983). "Diagnostics for Heteroskedasticity in Regression". Biometrika.
70 (1): 1–10.
See Also
Other heteroskedasticity tests: ols_test_bartlett, ols_test_f, ols_test_score
Examples
# model
model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)
ols_test_breusch_pagan(model)
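One common formulation of the statistic (the original, normality-assuming version) regresses the scaled squared residuals on the fitted values and takes half the explained sum of squares; a base R sketch of that formulation (not necessarily olsrr's exact internals):

```r
# Sketch: Breusch-Pagan statistic by hand. Under homoskedasticity
# it is chi-square distributed with 1 df here (one explanatory
# variable in the auxiliary regression).
model <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)
g   <- resid(model)^2 / mean(resid(model)^2)  # scaled squared residuals
aux <- lm(g ~ fitted(model))                  # auxiliary regression
bp  <- sum((fitted(aux) - mean(g))^2) / 2     # explained SS / 2
pval <- pchisq(bp, df = 1, lower.tail = FALSE)
```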
Description
Correlation between observed residuals and expected residuals under normality.
Usage
ols_test_correlation(model)
Arguments
model An object of class lm.
Value
Correlation between fitted regression model residuals and expected values of residuals.
Deprecated Function
ols_corr_test() has been deprecated. Instead use ols_test_correlation().
See Also
Other residual diagnostics: ols_plot_resid_box, ols_plot_resid_fit, ols_plot_resid_hist,
ols_plot_resid_qq, ols_test_normality
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_test_correlation(model)
ols_test_f F test
Description
Test for heteroskedasticity under the assumption that the errors are independent and identically
distributed (i.i.d.).
Usage
ols_test_f(model, fitted_values = TRUE, rhs = FALSE, vars = NULL,
...)
Arguments
model An object of class lm.
fitted_values Logical; if TRUE, use fitted values of regression model.
rhs Logical; if TRUE, specifies that tests for heteroskedasticity be performed for the
right-hand-side (explanatory) variables of the fitted regression model.
vars Variables to be used for heteroskedasticity test.
... Other arguments.
Value
ols_test_f returns an object of class "ols_test_f". An object of class "ols_test_f" is a list
containing the following components:
f f statistic
p p-value of f
fv fitted values of the regression model
rhs names of explanatory variables of fitted regression model
numdf numerator degrees of freedom
dendf denominator degrees of freedom
vars variables to be used for heteroskedasticity test
resp response variable
preds predictors
Deprecated Function
ols_f_test() has been deprecated. Instead use ols_test_f().
References
Wooldridge, J. M. 2013. Introductory Econometrics: A Modern Approach. 5th ed. Mason, OH:
South-Western.
See Also
Other heteroskedasticity tests: ols_test_bartlett, ols_test_breusch_pagan, ols_test_score
Examples
# model
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_test_f(model)
Description
Test for detecting violation of the normality assumption.
Usage
ols_test_normality(y, ...)
Arguments
y A numeric vector or an object of class lm.
... Other arguments.
Value
ols_test_normality returns an object of class "ols_test_normality". An object of class "ols_test_normality"
is a list containing the following components:
kolmogorv Kolmogorov-Smirnov statistic
shapiro Shapiro-Wilk statistic
cramer Cramér-von Mises statistic
anderson Anderson-Darling statistic
Deprecated Function
ols_norm_test() has been deprecated. Instead use ols_test_normality().
See Also
Other residual diagnostics: ols_plot_resid_box, ols_plot_resid_fit, ols_plot_resid_hist,
ols_plot_resid_qq, ols_test_correlation
Examples
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_test_normality(model)
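Two of these checks can be run directly on the residuals with base R alone; a sketch (the Kolmogorov-Smirnov test with estimated parameters is approximate, used here purely for illustration):

```r
# Sketch: normality checks on the residuals using base R only.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
e  <- resid(model)
sw <- shapiro.test(e)                          # Shapiro-Wilk
ks <- ks.test(as.numeric(scale(e)), "pnorm")   # Kolmogorov-Smirnov
```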
Description
Detect outliers using Bonferroni p values.
Usage
ols_test_outlier(model, cut_off = 0.05, n_max = 10, ...)
Arguments
model An object of class lm.
cut_off Bonferroni p-values cut off for reporting observations.
n_max Maximum number of observations to report, default is 10.
... Other arguments.
Examples
# model
model <- lm(y ~ ., data = surgical)
ols_test_outlier(model)
Description
Test for heteroskedasticity under the assumption that the errors are independent and identically
distributed (i.i.d.).
Usage
ols_test_score(model, fitted_values = TRUE, rhs = FALSE, vars = NULL)
Arguments
model An object of class lm.
fitted_values Logical; if TRUE, use fitted values of regression model.
rhs Logical; if TRUE, specifies that tests for heteroskedasticity be performed for the
right-hand-side (explanatory) variables of the fitted regression model.
vars Variables to be used for heteroskedasticity test.
Value
ols_test_score returns an object of class "ols_test_score". An object of class "ols_test_score"
is a list containing the following components:
score score test statistic
p p value of score
df degrees of freedom
fv fitted values of the regression model
rhs names of explanatory variables of fitted regression model
resp response variable
preds predictors
Deprecated Function
ols_score_test() has been deprecated. Instead use ols_test_score().
References
Breusch, T. S. and Pagan, A. R. (1979). A simple test for heteroscedasticity and random coefficient
variation. Econometrica 47, 1287–1294.
Cook, R. D. and Weisberg, S. (1983). Diagnostics for heteroscedasticity in regression. Biometrika
70, 1–10.
Koenker, R. (1981). A note on studentizing a test for heteroskedasticity. Journal of Econometrics
17, 107–112.
See Also
Other heteroskedasticity tests: ols_test_bartlett, ols_test_breusch_pagan, ols_test_f
Examples
# model
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
ols_test_score(model)
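Beyond the default test against the fitted values, the rhs and vars arguments select other regressors to test against. A minimal sketch, assuming olsrr is installed and attached:

```r
library(olsrr)

model <- lm(mpg ~ disp + hp + wt, data = mtcars)

# Test against all right-hand-side (explanatory) variables
# instead of the fitted values.
ols_test_score(model, rhs = TRUE)

# Test against a chosen subset of predictors.
ols_test_score(model, vars = c("disp", "hp"))
```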
Description
Test Data Set
Usage
rivers
Format
An object of class data.frame with 20 rows and 6 columns.
Description
Graph to determine whether a new predictor should be added to a model already containing other
predictors. The residuals from the model are regressed on the new predictor; if the plot shows a
non-random pattern, consider adding the new predictor to the model.
Usage
rvsr_plot_shiny(model, data, variable, print_plot = TRUE)
Arguments
model An object of class lm.
data A data.frame containing the variables in the model.
variable Character; name of the new predictor.
print_plot Logical; if TRUE, prints the plot, else returns the plot object.
Examples
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
rvsr_plot_shiny(model, mtcars, 'drat')
Description
Usage
stepdata
Format
Description
Usage
surgical
Format
A data frame with 54 rows and 9 variables:
bcs blood clotting score
pindex prognostic index
enzyme_test enzyme function test score
liver_test liver function test score
age age, in years
gender indicator variable for gender (0 = male, 1 = female)
alc_mod indicator variable for history of alcohol use (0 = None, 1 = Moderate)
alc_heavy indicator variable for history of alcohol use (0 = None, 1 = Heavy)
y survival time
Source
Kutner, M. H., Nachtsheim, C. J., Neter, J. and Li, W. (2004). Applied Linear Statistical Models
(5th ed.). Chicago, IL: McGraw-Hill/Irwin.
Index

∗ datasets
auto, 3
cement, 4
fitness, 4
hsb, 4
rivers, 71
stepdata, 72
surgical, 72
∗ heteroskedasticity tests
ols_test_bartlett, 63
ols_test_breusch_pagan, 64
ols_test_f, 67
ols_test_score, 70
∗ influence measures
ols_hadi, 11
ols_leverage, 13
ols_pred_rsq, 35
ols_press, 43
∗ model selection criteria
ols_aic, 5
ols_apc, 6
ols_fpe, 10
ols_hsp, 12
ols_mallows_cp, 14
ols_msep, 15
ols_sbc, 47
ols_sbic, 48
∗ residual diagnostics
ols_plot_resid_box, 25
ols_plot_resid_fit, 26
ols_plot_resid_hist, 28
ols_plot_resid_qq, 30
ols_test_correlation, 66
ols_test_normality, 68
∗ variable selection procedures
ols_step_all_possible, 49
ols_step_backward_aic, 52
ols_step_backward_p, 53
ols_step_best_subset, 55
ols_step_both_aic, 56
ols_step_forward_aic, 60
ols_step_forward_p, 61
∗ variable selection_procedures
ols_step_both_p, 58

auto, 3
cement, 4
fitness, 4
hsb, 4
ols_aic, 5, 7, 11, 12, 14, 15, 48, 49
ols_all_subset (ols_step_all_possible), 49
ols_all_subset_betas (ols_step_all_possible_betas), 51
ols_apc, 6, 6, 11, 12, 14, 15, 48, 49
ols_avplots (ols_plot_added_variable), 16
ols_bartlett_test (ols_test_bartlett), 63
ols_best_subset (ols_step_best_subset), 55
ols_bp_test (ols_test_breusch_pagan), 64
ols_coll_diag, 7
ols_cooksd_barplot (ols_plot_cooksd_bar), 18
ols_cooksd_chart (ols_plot_cooksd_chart), 19
ols_corr_test (ols_test_correlation), 66
ols_correlations, 9
ols_dfbetas_panel (ols_plot_dfbetas), 20
ols_dffits_plot (ols_plot_dffits), 21
ols_diagnostic_panel (ols_plot_diagnostics), 22
ols_dsrvsp_plot (ols_plot_resid_stud_fit), 33
ols_step_forward_aic, 50, 53, 54, 56, 57, 60, 62
ols_step_forward_p, 50, 53, 54, 56, 57, 61, 61
ols_stepaic_backward (ols_step_backward_aic), 52
ols_stepaic_both (ols_step_both_aic), 56
ols_stepaic_forward (ols_step_forward_aic), 60
ols_stepwise (ols_step_both_p), 58
ols_test_bartlett, 63, 65, 68, 71
ols_test_breusch_pagan, 64, 64, 68, 71
ols_test_correlation, 25, 26, 28, 30, 66, 69
ols_test_f, 64, 65, 67, 71
ols_test_normality, 25, 26, 28, 30, 66, 68
ols_test_outlier, 69
ols_test_score, 64, 65, 68, 70
ols_vif_tol (ols_coll_diag), 7
olsrr, 5
olsrr-package (olsrr), 5
plot.ols_step_all_possible (ols_step_all_possible), 49
plot.ols_step_backward_aic (ols_step_backward_aic), 52
plot.ols_step_backward_p (ols_step_backward_p), 53
plot.ols_step_best_subset (ols_step_best_subset), 55
plot.ols_step_both_aic (ols_step_both_aic), 56
plot.ols_step_both_p (ols_step_both_p), 58
plot.ols_step_forward_aic (ols_step_forward_aic), 60
plot.ols_step_forward_p (ols_step_forward_p), 61
rivers, 71
rvsr_plot_shiny, 71
stepdata, 72
surgical, 72