Useful Stata Commands
Performing a regression, computing residuals and preparing key plots (run the commands in sequence)
regress y x
predict yhat
(Creates a new variable yhat holding the fitted values, which are the default output of the predict command)
predict residuals, resid
(Creates a new variable residuals holding the residuals)
qnorm residuals
(Produces a normal quantile plot for the residuals)
twoway (scatter residuals x), yline(0)
(Produces a residuals versus x plot
with horizontal line at 0)
twoway (scatter residuals yhat), yline(0)
(Produces a residuals versus predicted y plot
with horizontal line at 0)
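As a worked illustration, the sequence above can be run on the auto dataset that ships with Stata. This is only a sketch: the choice of price as y and weight as x is an assumption made for the example, not part of the original handout.

```stata
* Load the example dataset shipped with Stata (assumed stand-in for your own data)
sysuse auto, clear

* Simple linear regression of price (y) on weight (x)
regress price weight

* Fitted values (the default for predict) and residuals
predict yhat
predict residuals, resid

* Normal quantile plot of the residuals
qnorm residuals

* Residuals versus x and residuals versus fitted values, each with a line at 0
twoway (scatter residuals weight), yline(0)
twoway (scatter residuals yhat), yline(0)
```

Each graph command replaces the previous graph window, so save or export a plot before drawing the next one if all three are needed.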
IPS Chapter 7
Calculating a single sample t confidence interval
ci y
(The default is a 95% confidence interval)
ci y, level(90)
(Gives a 90%, rather than a 95%, confidence interval)
Conducting a single sample t-test
ttest y == 0
(Where 0 is the value for the mean of y specified in the null hypothesis)
Conducting a paired t-test
ttest x1 == x2
(Where x1 and x2 are the paired variables)
Conducting an unpooled 2-sample t-test (assumes unequal variances)
ttest x1 == x2, unpaired unequal
Conducting a pooled 2-sample t-test (assumes equal variances)
ttest x1 == x2, unpaired
Calculating a confidence interval for the 2-sample situation
CIs are embedded within the 2-sample t-test results (unequal or equal variances)
ttest x1 == x2, unpaired level(90)
(The level can be set to something other than 95%; here it is 90%)
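The commands above assume the two samples sit in two separate variables. When the data are instead stored as one measurement variable plus a grouping variable, ttest also accepts a by() option. The sketch below uses the auto dataset shipped with Stata; the variables mpg and foreign and the null value 20 are assumptions chosen only for illustration.

```stata
* Load the example dataset shipped with Stata (assumed stand-in for your own data)
sysuse auto, clear

* Single sample t-test of H0: mean mpg = 20 (20 is an arbitrary null value)
ttest mpg == 20

* Unpooled 2-sample t-test of mpg between the two groups defined by foreign
ttest mpg, by(foreign) unequal

* Same comparison, reporting a 90% confidence interval
ttest mpg, by(foreign) unequal level(90)
```

Dropping the unequal option gives the pooled (equal variances) version of the same test.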
IPS Chapters 8 and 9
Finding frequencies and row and column percentages
tab y
(Quickly gives frequencies and percentages)
tab x y, row col
(Gives a two-way table of cell frequencies with row and column percentages)
IPS Chapter 10
Creating a plot of confidence intervals and prediction intervals from a regression model
regress y x
(Run the regression first)
twoway (lfitci y x, stdp) (scatter y x)
(Plots all 95% confidence intervals for the expected
value of y over the sampling range of x)
twoway (lfitci y x, stdf) (scatter y x)
(Plots all 95% prediction intervals for a future value
of y over the sampling range of x)
Calculating confidence intervals and prediction intervals from a linear regression model
regress y x
(Run the regression first)
adjust x = xo, ci level(95)
(Calculates the 95% confidence interval for the expected
value of y using x = xo in the simple linear model)
adjust x = xo, stdf ci level(95)
(Calculates the 95% prediction interval for a future value of
y using x = xo in the simple linear model)
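To make the two kinds of interval concrete, the limits can also be built by hand from the stdp and stdf options of predict. This is a minimal sketch, not the handout's method: it evaluates the intervals at the observed x values rather than at a chosen xo, and the dataset, variables and new variable names are all assumptions for the example.

```stata
* Load the example dataset shipped with Stata (assumed stand-in for your own data)
sysuse auto, clear
regress price weight

* Fitted values, standard error of the mean response, standard error of a forecast
predict yhat_fit, xb
predict se_mean, stdp
predict se_fore, stdf

* Two-sided 95% critical value from the t distribution on the residual df
scalar tcrit = invttail(e(df_r), 0.025)

* 95% confidence interval for the expected value of y at each observed x
generate ci_lo = yhat_fit - tcrit*se_mean
generate ci_hi = yhat_fit + tcrit*se_mean

* 95% prediction interval for a future value of y at each observed x
generate pi_lo = yhat_fit - tcrit*se_fore
generate pi_hi = yhat_fit + tcrit*se_fore

list weight yhat_fit ci_lo ci_hi pi_lo pi_hi in 1/5
```

The prediction interval is always wider than the confidence interval at the same x, because the forecast standard error adds the residual variance to the variance of the fitted mean.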
IPS Chapter 11
Performing a multiple regression
regress y x1 x2 x3 etc.
Performing a multiple regression, computing residuals and preparing key plots
regress y x1 x2 x3 etc.
predict yhat
(Creates a new variable yhat holding the fitted values, which are the default output of the predict command)
predict residuals, resid
(Creates a new variable residuals holding the residuals)
qnorm residuals
(Produces a normal quantile plot for the residuals)
twoway (scatter residuals x1), yline(0)
(Produces a residuals versus x1 plot
with horizontal line at 0)
twoway (scatter residuals yhat), yline(0)
(Produces a residuals versus predicted y plot
with horizontal line at 0)
Calculating correlations among a set of variables
correlate y x1 x2 x3 etc.
Preparing a scatterplot matrix (pairwise scatterplot of a set of variables)
graph matrix y x1 x2 x3 etc.
IPS Chapters 12 and 13
Converting categorical variable values to numerical values (if necessary)
encode x, gen(xnum)
Performing a one-way ANOVA
oneway y xnum
Performing a two-way or higher order ANOVA
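anova y x1 x2 x3 etc.
(List the factors separated by spaces, not commas. For interactions add x2*x3 etc.; in Stata 11 and later the interaction is written x2#x3)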
IPS Chapter 14
Performing a logistic regression (y is a binary (0,1) random variable [1=event])
logistic y x1 x2 x3 etc.
(Displays the odds ratios)
logit y x1 x2 x3 etc.
(Displays the model coefficients)
Performing a logistic regression when predictor x1 is categorical (2 or more levels)
xi: logistic y i.x1 x2 etc.
(Creates binary (dummy) variable(s) for x1)
Expanding a 2 x 2 table into a rectangular datafile for analysis in a logistic model
input y x Count
1 1 Count11
0 1 Count01
1 0 Count10
0 0 Count00
end
expand Count
(This sequence of commands creates a rectangular datafile from a 2 x 2 table:
         x = 1     x = 0
y = 1    Count11   Count10
y = 0    Count01   Count00
The datafile will have Count rows, where
Count = Count11 + Count01 + Count10 + Count00,
and can then be used for a logistic model analysis)
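A concrete run of this expansion might look like the sketch below. The four cell counts (20, 30, 10, 40) are made-up numbers used only to show the mechanics; substitute the counts from your own table.

```stata
* Start from an empty dataset, since input will not add variables to existing data
clear

* One row per cell of the 2 x 2 table: y, x, and the (hypothetical) cell count
input y x Count
1 1 20
0 1 30
1 0 10
0 0 40
end

* Replace each row with Count copies of itself: 20 + 30 + 10 + 40 = 100 rows
expand Count

* Check the expansion against the original table, then fit the logistic model
tab y x
logistic y x
```

With these counts, the odds ratio reported for x should match the cross-product ratio of the table, (20 x 40) / (30 x 10), or about 2.67.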