
Useful Stata Commands

This document provides a summary of useful Stata commands organized by statistical topic. It lists commands for summarizing data, creating graphs, transforming variables, performing regressions, confidence intervals, t-tests, ANOVA, and logistic regression. For each command, it provides a brief description of its functionality and examples of syntax using hypothetical variables like y, x, and x1. The document is intended as a reference for students in a course on the order in which commands may be needed and their basic usage.

Uploaded by

FlavioNavarrete

Stat 104: 9/3/09

Useful Stata Commands


Stata commands are listed below in approximately the order in which they are needed in the
course. For the following commands, let y be the response variable and, if needed, let x
be the predictor variable or let x1 x2 x3 etc. be the predictor variables.
IPS Chapters 1 and 2
Calculating summary measures
summarize y
(Produces n, mean, SD, min and max)
summarize y, detail
(Produces above plus additional summary measures)
(The median is the 50th percentile and the interquartile range (IQR)
is the 75th percentile minus the 25th percentile.)
bysort x: summarize y, detail
(Calculates summary measures for subgroups formed by a categorical variable x)
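(Illustration, not part of the original handout: a minimal sketch of the summary commands above, assuming Stata's bundled auto dataset, with mpg standing in for y and foreign for x. It reads the median and IQR from the results that summarize, detail leaves in r().)

```stata
* Illustration only: auto.dta ships with Stata; mpg plays the role of y
sysuse auto, clear
summarize mpg, detail
* The median and the IQR can be read from the stored results
display "Median = " r(p50)
display "IQR    = " r(p75) - r(p25)
* Summary measures by subgroup (foreign is a 0/1 categorical variable)
bysort foreign: summarize mpg, detail
```

(The r() results are overwritten by the next command that stores results, so read them immediately after summarize.)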
Creating boxplots
graph box y, aspectratio(4)
graph box y, over(x)
(Creates boxplots side by side for a categorical variable x)
Creating a histogram
histogram y, bin(20) freq
Creating a normal quantile plot
qnorm y
Transforming variables
gen Logy=log10(y)

(or =sqrt(y) or =y*y or =1/y, etc.)

Evaluating power transformations


gladder y
(Prepares a graph with power transformations y^k)
ladder y
(Analyzes power transformations; small chi2(2) values
indicate the best transformations)
qladder y
(Graphical analysis of power transformations using quantile plots)
[Note: the gladder, ladder and qladder commands do not do log transformations when
one of the values of y is zero (log(0) is undefined). A common solution
is to add 1 to y, i.e. create a new variable, say y1, using: gen y1 = y + 1.]
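(Illustration, not part of the original handout: a small sketch of the zero-value problem described in the note, using ten made-up values of y. The data and the variable names logy and logy1 are assumptions chosen for the example.)

```stata
* Hypothetical data: ten values of y, one of which is zero
clear
input y
0
1
2
3
5
8
13
21
34
55
end
* log10(0) is undefined, so Stata stores a missing value for that row
gen logy = log10(y)
count if missing(logy)
* Shift by 1 so that every value is positive, then transform
gen y1 = y + 1
gen logy1 = log10(y1)
* The ladder of powers can now evaluate the log transformation as well
ladder y1
```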
Creating a scatterplot and adding the least-squares line
scatter y x
(Creates a scatterplot)
twoway (scatter y x) (lfit y x) (Adds the least-squares line to the scatterplot)
Calculating correlations
correlate y x
(For >2 variables, correlate y x1 x2 x3 x4 etc.)
Performing a regression
regress y x

Performing a regression, computing residuals and preparing key plots (follow sequence)
regress y x
predict yhat
(Creates a new variable yhat; fitted values are the default for the predict command)
predict residuals, resid (Creates a new variable residuals)
qnorm residuals
(Produces a normal quantile plot for the residuals)
twoway (scatter residuals x), yline(0)
(Produces a residuals versus x plot
with horizontal line at 0)
twoway (scatter residuals yhat), yline(0)
(Produces a residuals versus predicted y plot
with horizontal line at 0)
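(Illustration, not part of the original handout: the sequence above run end to end, assuming Stata's bundled auto dataset with price as y and weight as x. Storing the residuals as double is an addition here, made only to limit rounding error in the check that they average to zero.)

```stata
* Illustration only: auto.dta ships with Stata
sysuse auto, clear
regress price weight
* Fitted values, then residuals (observed minus fitted)
predict yhat
predict double residuals, resid
* Least-squares residuals average to zero, up to rounding
summarize residuals
* Diagnostic plots: normality, then residuals against x and against fitted y
qnorm residuals
twoway (scatter residuals weight), yline(0)
twoway (scatter residuals yhat), yline(0)
```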
IPS Chapter 7
Calculating a single sample t confidence interval
ci y
(The default is a 95% confidence interval)
ci y, level(90)
(Gives a 90%, rather than 95% confidence interval)
Conducting a single sample t-test
ttest y == 0
(Where 0 is the value for the mean of y specified in the null hypothesis)
Conducting a paired t-test
(Where x1 and x2 are the paired variables)
ttest x1 == x2
Conducting an unpooled 2-sample t-test (assumes unequal variances)
ttest x1 == x2, unpaired unequal
Conducting a pooled 2-sample t-test (assumes equal variances)
ttest x1 == x2, unpaired
Calculating a confidence interval for the 2-sample situation
CIs are embedded within the 2-sample t-test results (unequal or equal variances)
(The level can be set to something other than 95%)
ttest x1 == x2, unpaired level(90)
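(Illustration, not part of the original handout: the t-test variants above applied to one dataset, assuming Stata's bundled bpwide dataset, whose variables bp_before and bp_after hold two readings per patient. The unpaired runs are shown only to contrast the syntax; for paired data like these, the paired test is the appropriate one.)

```stata
* Illustration only: bpwide.dta ships with Stata
sysuse bpwide, clear
* Paired t-test: each patient contributes a before and an after reading
ttest bp_before == bp_after
display "paired t = " r(t) ", df = " r(df_t)
* The same two columns treated as independent samples, unequal variances
ttest bp_before == bp_after, unpaired unequal
display "unpooled t = " r(t) ", df = " r(df_t)
* Pooled test reporting a 90% confidence interval for the difference
ttest bp_before == bp_after, unpaired level(90)
```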
IPS Chapters 8 and 9
Finding frequencies and row and column percentages
tab y
(Quickly gives frequencies and percentages)
tab x y, row col
(Gives cell frequencies with row and column percentages)
IPS Chapter 10
Creating a plot of confidence intervals and prediction intervals from a regression model
regress y x
(Run the regression first)
twoway (lfitci y x, stdp) (scatter y x)
(Plots all 95% confidence intervals for the expected
value of y over the sampling range of x)
twoway (lfitci y x, stdf) (scatter y x)
(Plots all 95% prediction intervals for a future value
of y over the sampling range of x)
Calculating confidence intervals and prediction intervals from a linear regression model
regress y x
(Run the regression first)
adjust x = xo, ci level(95)
(Calculates the 95% confidence interval for the expected
value of y using x = xo in the simple linear model)
adjust x = xo, stdf ci level(95)
(Calculates the 95% prediction interval for a future value of
y using x = xo in the simple linear model)
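(Illustration, not part of the original handout: a sketch of what the two intervals contain, computed by hand with predict rather than adjust. It assumes Stata's bundled auto dataset with mpg as y and weight as x, and it evaluates the intervals at each observed x rather than at a chosen xo. The variable names se_mean, se_fcst, ci_lo, ci_hi, pi_lo and pi_hi are made up for the example.)

```stata
* Illustration only: auto.dta ships with Stata
sysuse auto, clear
regress mpg weight
* Critical t value for a 95% interval, using the residual degrees of freedom
scalar tcrit = invttail(e(df_r), 0.025)
predict yhat
* stdp: standard error of the expected value of y at each x
predict se_mean, stdp
* stdf: standard error of a forecast for a single future y at each x
predict se_fcst, stdf
gen ci_lo = yhat - tcrit*se_mean
gen ci_hi = yhat + tcrit*se_mean
gen pi_lo = yhat - tcrit*se_fcst
gen pi_hi = yhat + tcrit*se_fcst
list weight yhat ci_lo ci_hi pi_lo pi_hi in 1/5
```

(The prediction interval is always wider than the confidence interval at the same x, because the forecast standard error adds the residual variance to the variance of the fitted mean.)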

IPS Chapter 11
Performing a multiple regression
regress y x1 x2 x3 etc.
Performing a multiple regression, computing residuals and preparing key plots
regress y x1 x2 x3 etc.
predict yhat
(Creates a new variable yhat; fitted values are the default for the predict command)
predict residuals, resid (Creates a new variable residuals)
qnorm residuals
(Produces a normal quantile plot for the residuals)
twoway (scatter residuals x1), yline(0)
(Produces a residuals versus x1 plot
with horizontal line at 0)
twoway (scatter residuals yhat), yline(0)
(Produces a residuals versus predicted y plot
with horizontal line at 0)
Calculating correlations among a set of variables
correlate y x1 x2 x3 etc.
Preparing a scatterplot matrix (pairwise scatterplot of a set of variables)
graph matrix y x1 x2 x3 etc.
IPS Chapters 12 and 13
Converting categorical variable values to numerical values (if necessary)
encode x, gen(xnum)
Performing a one-way ANOVA
oneway y xnum
Performing a two-way or higher order ANOVA
anova y x1 x2 x3 etc. (For interactions add x2*x3 etc.; in Stata 11 and later, interactions are written x2#x3)
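(Illustration, not part of the original handout: a sketch of encode followed by a one-way ANOVA, using nine made-up observations in three groups labelled a, b and c. The data are hypothetical.)

```stata
* Hypothetical data: a string group variable x and a numeric response y
clear
input str1 x y
"a" 4.1
"a" 5.0
"a" 4.6
"b" 6.2
"b" 5.8
"b" 6.5
"c" 7.9
"c" 8.4
"c" 7.7
end
* encode assigns codes 1, 2, 3 in alphabetical order and keeps the labels
encode x, gen(xnum)
* nolabel shows the underlying numeric codes
tabulate xnum, nolabel
* One-way ANOVA of y across the groups, with a table of group means
oneway y xnum, tabulate
```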
IPS Chapter 14
Performing a logistic regression (y is a binary (0,1) random variable [1=event])
logistic y x1 x2 x3 etc.
(Displays the ORs)
logit y x1 x2 x3 etc.
(Displays the model coefficients)
Performing a logistic regression when predictor x1 is categorical (2 or more levels)
xi: logistic y i.x1 x2 etc.
(Creates binary (dummy) variable(s) for x1)
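(Illustration, not part of the original handout: a sketch showing that logistic and logit fit the same model and differ only in what they display, assuming Stata's bundled auto dataset with foreign as the binary response. The scalar name or_weight is made up for the example.)

```stata
* Illustration only: auto.dta ships with Stata; foreign is coded 0/1
sysuse auto, clear
* logit displays the coefficients on the log-odds scale
logit foreign weight mpg
* Exponentiating a coefficient gives the corresponding odds ratio
scalar or_weight = exp(_b[weight])
display "OR for weight computed from logit: " or_weight
* logistic displays the same fit as odds ratios
logistic foreign weight mpg
```

(After either command, _b[weight] holds the coefficient on the log-odds scale, so exp(_b[weight]) reproduces the odds ratio shown by logistic.)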
Expanding a 2 x 2 table into a rectangular datafile for analysis in a logistic model
input y x Count
1 1 Count11
0 1 Count01
1 0 Count10
0 0 Count00
end
expand Count
(This sequence of commands creates a rectangular datafile from a 2 x 2 table
   Count11  Count10
   Count01  Count00
The datafile will have Count rows, where
Count = Count11 + Count01 + Count10 + Count00,
and can then be used for a logistic model analysis)
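(Illustration, not part of the original handout: the expansion applied to a made-up 2 x 2 table with cell counts 20, 10, 15 and 30. The counts are hypothetical.)

```stata
* Hypothetical 2 x 2 table entered as one row per cell
clear
input y x Count
1 1 20
0 1 10
1 0 15
0 0 30
end
* Replicate each row Count times: one row per subject
expand Count
* Total rows: 20 + 10 + 15 + 30 = 75
count
* Cross-tabulation recovers the original table
tabulate y x
* With a single binary predictor, the fitted odds ratio equals the
* cross-product ratio of the table: (20*30)/(10*15) = 4
logistic y x
```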
