
Regression 3: Logistic Regression

Marco Baroni

Practical Statistics in R
Outline

Logistic regression

Logistic regression in R
Outline

Logistic regression
Introduction
The model
Looking at and comparing fitted models

Logistic regression in R
Modeling discrete response variables

- In a very large number of problems in cognitive science and related fields:
  - the response variable is categorical, often binary (yes/no; acceptable/not acceptable; phenomenon takes place/does not take place)
  - the potential explanatory factors (independent variables) are categorical, numerical or both
Examples: binomial responses

- Is linguistic construction X rated as “acceptable” in the following condition(s)?
- Does sentence S, which has features Y, W and Z, display phenomenon X? (linguistic corpus data!)
- Is it common for subjects to decide to purchase the good X given these conditions?
- Did the subject make more errors in this condition?
- How many people answer YES to question X in the survey?
- Do old women like X more than young men?
- Did the subject feel pain in this condition?
- How often was reaction X triggered by these conditions?
- Do children with characteristics X, Y and Z tend to have autism?
Examples: multinomial responses
- Discrete response variable with a natural ordering of the levels:
  - Ratings on a 6-point scale (depending on the number of points on the scale, you might also get away with a standard linear regression)
  - Subjects answer YES, MAYBE, NO
  - Subject reaction is coded as FRIENDLY, NEUTRAL, ANGRY
  - The cochlear data: the experiment is set up so that possible errors are de facto on a 7-point scale
- Discrete response variable without a natural ordering:
  - Subject decides to buy one of 4 different products
  - We have brain scans of subjects seeing 5 different objects, and we want to predict the seen object from features of the scan
  - We model the chances of developing 4 different (and mutually exclusive) psychological syndromes in terms of a number of behavioural indicators
Binomial and multinomial logistic regression models

- Problems with binary (yes/no, success/failure, happens/does not happen) dependent variables are handled by (binomial) logistic regression
- Problems with more than two discrete outcomes are handled by:
  - ordinal logistic regression, if the outcomes have a natural ordering
  - multinomial logistic regression otherwise
- The output of ordinal and especially multinomial logistic regression tends to be hard to interpret; whenever possible, try to reduce the problem to a binary choice
  - E.g., if the output is yes/maybe/no, treat “maybe” as “yes” and/or as “no”
- Here, I focus entirely on the binomial case
Don’t be afraid of logistic regression!

- Logistic regression seems less popular than linear regression
- This might be due in part to historical reasons:
  - the formal theory of generalized linear models is relatively recent: it was developed in the early nineteen-seventies
  - the iterative maximum likelihood methods used for fitting logistic regression models require more computational power than solving the least squares equations
- Results of logistic regression are not as straightforward to understand and interpret as linear regression results
- Finally, there might also be a bit of prejudice against discrete data as less “scientifically credible” than hard-science-like continuous measurements
Don’t be afraid of logistic regression!

- Still, if it is natural to cast your problem in terms of a discrete variable, you should go ahead and use logistic regression
- Logistic regression might be trickier to work with than linear regression, but it’s still much better than pretending that the variable is continuous or artificially re-casting the problem in terms of a continuous response
The Machine Learning angle

- Classification of a set of observations into 2 or more discrete categories is a central task in Machine Learning
- The classic supervised learning setting:
  - Data points are represented by a set of features, i.e., discrete or continuous explanatory variables
  - The “training” data also have a label indicating the class of the data point, i.e., a discrete binomial or multinomial dependent variable
  - A model (e.g., in the form of weights assigned to the features) is fitted on the training data
  - The trained model is then used to predict the class of unseen data points (where we know the values of the features, but we do not have the label)
The Machine Learning angle

- Same setting as logistic regression, except that the emphasis is placed on predicting the class of unseen data, rather than on the significance of the effect of the features/independent variables (which are often too many – hundreds or thousands – to be analyzed individually) in discriminating the classes
- Indeed, logistic regression is also a standard technique in Machine Learning, where it is sometimes known as Maximum Entropy
Outline

Logistic regression
Introduction
The model
Looking at and comparing fitted models

Logistic regression in R
Classic multiple regression

- The by now familiar model:

  y = β0 + β1 × x1 + β2 × x2 + ... + βn × xn + ε

- Why will this not work if the response variable is binary (0/1)?
- Why will it not work if we try to model proportions instead of responses (e.g., the proportion of YES-responses in condition C)?
Modeling log odds ratios
- Following up on the “proportion of YES-responses” idea, let’s say that we want to model the probability of one of the two responses (which can be seen as the population proportion of the relevant response for a certain choice of the values of the independent variables)
- Probability ranges from 0 to 1, but we can look at the logarithm of the odds ratio instead:

  logit(p) = log(p / (1 − p))

- This is the logarithm of the ratio of the probability of a 1-response to the probability of a 0-response
- It is arbitrary what counts as a 1-response and what counts as a 0-response, although this might hinge on the ease of interpretation of the model (e.g., treating YES as the 1-response will probably lead to more intuitive results than treating NO as the 1-response)
- Log odds ratios are not the most intuitive measure (at least for me), but they range continuously from −∞ to +∞ (see the sketch below)
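A minimal sketch of the logit transform and its inverse in R, using the base functions qlogis() and plogis() (equivalent to hand-coding log(p/(1-p)) and exp(x)/(1+exp(x))); the probability values are made up for illustration:

> p <- c(0.01, 0.25, 0.5, 0.75, 0.99)  # example probabilities
> qlogis(p)                            # logit: unbounded, symmetric around 0
> plogis(qlogis(p))                    # inverse logit maps back to the original probabilities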
From probabilities to log odds ratios

[Figure: logit(p) (y-axis, about −5 to 5) plotted as a function of p (x-axis, 0 to 1)]
The logistic regression model

- Predicting log odds ratios:

  logit(p) = β0 + β1 × x1 + β2 × x2 + ... + βn × xn

- Back to probabilities:

  p = e^logit(p) / (1 + e^logit(p))

- Thus:

  p = e^(β0 + β1 × x1 + β2 × x2 + ... + βn × xn) / (1 + e^(β0 + β1 × x1 + β2 × x2 + ... + βn × xn))
From log odds ratios to probabilities

[Figure: p (y-axis, 0 to 1) plotted as a function of logit(p) (x-axis, −10 to 10): the S-shaped logistic curve]
Probabilities and responses

[Figure: the same S-shaped curve of p against logit(p), with the observed responses plotted as points along the top (1-responses) and bottom (0-responses) of the plot]
A subtle point: no error term

- NB:

  logit(p) = β0 + β1 × x1 + β2 × x2 + ... + βn × xn

- The outcome here is not the observation, but (a function of) p, the expected value of the observation, i.e., the probability of a 1-response given the current values of the independent variables
- This probability has the classic “coin tossing” Bernoulli distribution, and thus the variance is not a free parameter to be estimated from the data, but a model-determined quantity given by p(1 − p)
- Notice that the errors, computed as observation − p, are not independently normally distributed: their magnitude must be near 0 or near 1 for high and low values of p, and near .5 for values of p in the middle (see the simulation below)
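A small simulation of this Bernoulli error structure in R (the probabilities are made up): for a given p, the variance of the 0/1 observations is p(1 − p), and the residuals observation − p can take only the two values 1 − p and −p:

> p <- c(0.05, 0.5, 0.95)                                       # three expected probabilities
> y <- sapply(p, function(pp) rbinom(10000, size=1, prob=pp))   # simulated 0/1 responses
> apply(y, 2, var)                                              # close to p * (1 - p)
> unique(y[, 2] - p[2])                                         # residuals are only 1 - p and -p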
The generalized linear model

- Logistic regression is an instance of a “generalized linear model”
- Somewhat brutally, in a generalized linear model:
  - a weighted linear combination of the explanatory variables models a function of the expected value of the dependent variable (the “link” function)
  - the actual data points are modeled in terms of a distribution function that has the expected value as a parameter
- This is a general framework that uses the same fitting techniques to estimate models for different kinds of data
Linear regression as a generalized linear model

- Linear prediction of a function of the mean:

  g(E(y)) = Xβ

- The “link” function is the identity:

  g(E(y)) = E(y)

- Given the mean, observations are normally distributed, with variance estimated from the data
- This corresponds to the error term with mean 0 in the linear regression model
Logistic regression as a generalized linear model

- Linear prediction of a function of the mean:

  g(E(y)) = Xβ

- The “link” function is the logit:

  g(E(y)) = log(E(y) / (1 − E(y)))

- Given E(y), i.e., p, observations have a Bernoulli distribution with variance p(1 − p)
Estimation of logistic regression models

- Minimizing the sum of squared errors is not a good way to fit a logistic regression model
- The least squares method is based on the assumption that errors are normally distributed and independent of the expected (fitted) values
- As we just discussed, in logistic regression the errors depend on the expected (p) values (large variance near .5, variance approaching 0 as p approaches 1 or 0), and for each p they can take only two values (1 − p if the response was 1, −p otherwise)
Estimation of logistic regression models

- The β terms are estimated instead by maximum likelihood, i.e., by searching for the set of βs that makes the observed responses maximally likely (i.e., a set of βs that will in general assign a high p to 1-responses and a low p to 0-responses)
- There is no closed-form solution to this problem, and the optimal vector of βs is found with iterative “trial and error” techniques
- The two approaches are related: least-squares fitting finds the maximum likelihood estimate for linear regression, and, vice versa, maximum likelihood fitting of logistic regression is done by a form of iteratively (re)weighted least squares fitting (see the sketch below)
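glm() takes care of all of this for you; purely to demystify the procedure, here is a minimal sketch of the iteratively reweighted least squares idea on simulated data (not something you would do in practice):

> set.seed(1)
> x <- rnorm(200)
> y <- rbinom(200, 1, plogis(-1 + 2 * x))   # simulated binary responses
> X <- cbind(1, x)                          # design matrix with an intercept column
> beta <- c(0, 0)                           # starting values
> for (i in 1:10) {                         # a handful of iterations is enough
+   p <- plogis(X %*% beta)                 # current fitted probabilities
+   W <- as.vector(p * (1 - p))             # Bernoulli variances used as weights
+   z <- X %*% beta + (y - p) / W           # "working" response
+   beta <- solve(t(X) %*% (W * X), t(X) %*% (W * z))   # weighted least squares step
+ }
> cbind(irls = beta, glm = coef(glm(y ~ x, family = "binomial")))   # nearly identical estimates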
Outline

Logistic regression
Introduction
The model
Looking at and comparing fitted models

Logistic regression in R
Interpreting the βs

- Again, as a rough-and-ready criterion, if a β is more than 2 standard errors away from 0, we can say that the corresponding explanatory variable has an effect that is significantly different from 0 (at α = 0.05)
- However, p is not a linear function of Xβ, and the same β will correspond to a more drastic impact on p towards the center of the p range than near the extremes (recall the S shape of the p curve)
- As a rule of thumb (the “divide by 4” rule), β/4 is an upper bound on the difference in p brought about by a unit difference in the corresponding explanatory variable (see the example below)
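A quick numerical check of the divide-by-4 rule with a made-up coefficient (β = 1.2): the change in p per unit change in the predictor is largest around p = .5 and never exceeds β/4:

> beta <- 1.2                     # hypothetical coefficient
> beta / 4                        # upper bound on the change in p: 0.3
> plogis(0 + beta) - plogis(0)    # change around p = .5: about 0.27
> plogis(3 + beta) - plogis(3)    # change near the extremes: about 0.03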
Goodness of fit

- Again, measures such as R² based on residual errors are not very informative
- One intuitive measure of fit is the error rate, given by the proportion of data points for which the model assigns p > .5 to 0-responses or p < .5 to 1-responses
- This can be compared to the baseline in which the model always predicts 1 if the majority of data points are 1, or 0 if the majority of data points are 0 (the baseline error rate is given by the proportion of minority responses over the total)
- Some information is lost (a .9 and a .6 prediction are treated equally)
- Other measures of fit have been proposed in the literature, but there is no widely agreed upon standard
Binned goodness of fit

- Goodness of fit can be inspected visually by grouping the ps into equally wide bins (0–0.1, 0.1–0.2, ...) and plotting the average p predicted by the model for the points in each bin against the observed proportion of 1-responses for the data points in the bin (a rough sketch follows)
- We can also compute an R² or another goodness-of-fit measure on these binned data
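A rough sketch of the binning idea for a generic fitted model; my.glm and the 0/1 response vector y are placeholders (the languageR/Design functions used later do a more polished job):

> p.hat <- fitted(my.glm)                                  # predicted probabilities
> bins <- cut(p.hat, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
> plot(tapply(p.hat, bins, mean), tapply(y, bins, mean),   # mean predicted p vs. observed proportion, per bin
+      xlim = c(0, 1), ylim = c(0, 1),
+      xlab = "mean predicted p", ylab = "observed proportion of 1-responses")
> abline(0, 1)                                             # points on this line indicate perfect calibration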
Deviance

- Deviance is an important measure of fit of a model, used also to compare models
- Simplifying somewhat, the deviance of a model is −2 times the log likelihood of the data under the model
  - plus a constant that would be the same for all models for the same data, and so can be ignored, since we always look at differences in deviance
- The larger the deviance, the worse the fit
- As we add parameters, deviance decreases
Deviance

- The difference in deviance between a simpler and a more complex model approximately follows a χ² distribution, with the difference in the number of parameters as degrees of freedom
- This leads to the handy rule of thumb that the improvement is significant (at α = .05) if the deviance difference is larger than the parameter difference (play around with pchisq() in R to see that this is the case; a short example follows)
- A model can also be compared against the “null” model that always predicts the same p (given by the proportion of 1-responses in the data) and has only one parameter (the fixed predicted value)
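For instance, with made-up numbers, a drop in deviance of 10 for 3 extra parameters can be checked against the χ² distribution with 3 degrees of freedom:

> pchisq(10, df = 3, lower.tail = FALSE)   # p-value of about 0.019: a significant improvement
> qchisq(0.95, df = 3)                     # critical value at α = .05: about 7.81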
Outline

Logistic regression

Logistic regression in R
Preparing the data and fitting the model
Practice
Back to Graffeo et al.’s discount study
Fields in the discount.txt file

- subj: unique subject code
- sex: M or F
- age: NB: contains some NAs
- presentation: absdiff (amount of discount), result (price after discount), percent (percentage discount)
- product: pillow, (camping) table, helmet, (bed) net
- choice: Y (buys), N (does not buy) → the discrete response variable
Preparing the data

- Read the file into an R data frame, look at the summaries, etc.
- Note in the summary of age that R “understands” NAs (i.e., it is not treating age as a categorical variable)
- We can filter out the rows containing NAs as follows:

> e<-na.omit(d)

- Compare the summaries of d and e
- na.omit can also be passed as an option to the modeling functions, but I feel uneasy about that
- Attach the NA-free data frame (a possible sequence of commands is sketched below)
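One possible sequence of commands, assuming discount.txt is a tab-delimited file with a header row in the current working directory (adjust the read.delim() arguments if the format differs):

> d <- read.delim("discount.txt")   # read the file into a data frame
> summary(d)                        # note the NA count in the summary of age
> e <- na.omit(d)                   # drop the rows containing NAs
> summary(e)                        # compare with the summary of d
> attach(e)                         # make the columns directly accessible by name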
Logistic regression in R

> sex_age_pres_prod.glm<-glm(choice~sex+age+
presentation+product,family="binomial")

> summary(sex_age_pres_prod.glm)
Selected lines from the summary() output

- Estimated β coefficients, standard errors and z scores (β/std. error):

Coefficients:
                     Estimate Std. Error z value Pr(>|z|)
sexM                -0.332060   0.140008  -2.372  0.01771 *
age                 -0.012872   0.006003  -2.144  0.03201 *
presentationpercent  1.230082   0.162560   7.567 3.82e-14 *
presentationresult   1.516053   0.172746   8.776  < 2e-16 *

- Note the automated creation of binary dummy variables: discounts presented as percentages and as resulting values are significantly more likely to lead to a purchase than discounts expressed as an absolute difference (the default level)
  - use relevel() to set another level of a categorical variable as the default (see the example below)
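For example (assuming presentation was read in as a factor; re-fit and re-inspect the model after releveling):

> e$presentation <- relevel(e$presentation, ref = "percent")   # make "percent" the reference level
> summary(glm(choice ~ sex + age + presentation + product, family = "binomial", data = e))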
Deviance

- For the “null” model and for the current model:

Null deviance:     1453.6 on 1175 degrees of freedom
Residual deviance: 1284.3 on 1168 degrees of freedom

- The difference in deviance (169.3) is much higher than the difference in parameters (7), suggesting that the current model is significantly better than the null model
Comparing models

- Let us add a sex by presentation interaction term:

> interaction.glm<-glm(choice~sex+age+presentation+
product+sex:presentation,family="binomial")

- Are the extra parameters justified?

> anova(sex_age_pres_prod.glm,interaction.glm,
test="Chisq")
...
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1      1168    1284.25
2      1166    1277.68  2     6.57      0.04

- Apparently, yes (although summary(interaction.glm) suggests just a marginal interaction between sex and the percentage dummy variable)
Error rate
- The model makes an error when it assigns p > .5 to an observation where choice is N, or p < .5 to an observation where choice is Y:

> sum((fitted(sex_age_pres_prod.glm)>.5 & choice=="N") |
(fitted(sex_age_pres_prod.glm)<.5 & choice=="Y")) /
length(choice)
[1] 0.2721088

- Compare this to the error rate of a baseline model that always guesses the majority choice:

> table(choice)
choice
  N   Y
363 813
> sum(choice=="N")/length(choice)
[1] 0.3086735

- The improvement in error rate is nothing to write home about...


Binned fit
- The function from the languageR package for plotting binned expected and observed proportions of 1-responses, as well as bootstrap validation, requires a logistic model fitted with lrm(), the logistic regression fitting function from the Design package:

> sex_age_pres_prod.glm<-
lrm(choice~sex+age+presentation+product,
x=TRUE,y=TRUE)

- The languageR version of the binned plot function (plot.logistic.fit.fnc) dies on our model, since it never predicts p < 0.1, so I hacked my own version, which you can find in the r-data-1 directory:

> source("hacked.plot.logistic.fit.fnc.R")
> hacked.plot.logistic.fit.fnc(sex_age_pres_prod.glm,e)

- (Incidentally: in cases like this where something goes wrong, you can peek inside a function simply by typing its name)
Bootstrap estimation

- Validation using the logistic model estimated by lrm() and 1,000 iterations:

> validate(sex_age_pres_prod.glm,B=1000)

- When fed a logistic model, validate() returns various measures of fit we have not discussed: see, e.g., Baayen’s book
- Independently of the interpretation of the measures, the size of the optimism indices gives a general idea of the amount of overfitting (not dramatic in this case)
Mixed model logistic regression

- You can use the lmer() function with the family="binomial" option
- E.g., introducing subjects as random effects:

> sex_age_pres_prod.lmer<-
lmer(choice~sex+age+presentation+
product+(1|subj),family="binomial")

- You can replicate most of the analyses illustrated above with this model
A warning
- Confusingly, the fitted() function applied to a glm object returns probabilities, whereas applied to an lmer object it returns values on the log odds scale
- Thus, to measure the error rate you’ll have to do something like:

> probs<-exp(fitted(sex_age_pres_prod.lmer)) /
(1+exp(fitted(sex_age_pres_prod.lmer)))
> sum((probs>.5 & choice=="N") |
(probs<.5 & choice=="Y")) /
length(choice)

- NB: Apparently, hacked.plot.logistic.fit.fnc dies when applied to an lmer object, on some versions of R (or lme4, or whatever)
- Surprisingly, the fit of the model with the random subject effect is worse than that of the model with fixed effects only
Outline

Logistic regression

Logistic regression in R
Preparing the data and fitting the model
Practice
Practice time

- Go back to Navarrete et al.’s picture naming data (cwcc.txt)
- Recall that the response can be a time (naming latency) in milliseconds, but also an error
- Are the errors randomly distributed, or can they be predicted from the same factors that determine latencies?
- We found a negative effect of repetition and a positive effect of position-within-category on naming latencies – are these factors also leading to fewer and more errors, respectively?
Practice time

- Construct a binary variable from the responses (error vs. any other response)
- Use sapply(), and make sure that R understands this is a categorical variable with as.factor()
- Add the resulting variable to your data frame, e.g., if you called the data frame d and the binary response variable temp, do:

d$errorresp<-temp

- This will make your life easier later on
- Analyze this new dependent variable using logistic regression (both with and without random effects)
- (One possible recipe for building the binary variable is sketched below)
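One possible recipe, assuming (hypothetically) that the response column is called resp and that error trials are marked with the code "err" (check the actual coding in cwcc.txt first):

> temp <- sapply(d$resp, function(r) if (r == "err") "error" else "response")   # hypothetical coding
> temp <- as.factor(temp)      # make sure R treats it as categorical
> d$errorresp <- temp          # add the new variable to the data frame
> summary(d$errorresp)         # sanity check: counts of errors vs. other responses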

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy