Exam_2021

The document is an exam for the 5304 Econometrics course, consisting of three mandatory questions with specific instructions regarding the use of resources and submission format. The questions cover topics such as the relationship between income and democracy, the effectiveness of nudging strategies to reduce electricity consumption, and the principles of causal inference. Students are required to provide detailed analytical responses within a four-hour timeframe, adhering to guidelines on data sources and submission protocols.

Uploaded by

gabbe.css

5304 Econometrics

Exam – 2021

17 December 2021

INSTRUCTIONS:

1. There are three questions, each of which is mandatory. The points allocated to each question are
indicated in the title of the question. The distribution of these points within each question is
indicated in the relevant sub-part.

2. There is no assigned word limit. However, you should aim to finish this exam in no more than four
hours equivalent of writing time.

3. You may consult course lecture slides and assigned reading material including the course textbooks
and any papers that were referenced during lectures. You may also consult any course notes that
you prepared or material shared in the seminars.

4. You may not consult any other sources of information. In particular, you may not consult any
individual, online or offline, to discuss or receive solutions.

5. Some of the questions may rely on previously published research. I expect you to answer based on
the information provided and the instruction received in lectures and through the course. You may
not look up the original papers, even if referenced in the exam, for answers.

6. Answers should be typed out and uploaded through the course webpage on Canvas by 3:00
p.m. CET on 17 December 2021. Students should upload PDF documents with their student
number clearly marked and without their name on the answer script. The file name should be
<student_number>_exam.pdf

(a) Students with an extension of time, agreed by the Examinations Office and communicated
beforehand to the Course Director, may submit at 4 p.m. CET by directly emailing their
answer scripts to abhijeet.singh@hhs.se

Question A: Income and Democracy (40 points)

“Increases in various measures of the standard of living tend to forecast a gradual rise in democracy. In
contrast, democracies that arise without prior development...tend not to last” (Robert Barro, 1999)

1. The figure above shows the association between democracy and per capita GDP in the 1990s – both
the vertical and horizontal axes plot the 10-year average for a country in the 1990s. Is this convincing
evidence for Barro’s point above? If not, list two potential sources of bias that may prevent us from
interpreting this relationship as the causal effect of national GDP per capita on democracy. Illustrate
your answer with a directed acyclic graph and comment also on the (hypothesised) direction of bias.
[10%]

2. Researchers assemble a dataset with annual values of per capita GDP and democracy scores
for all countries from 1965 to 2010. They then estimate the following regression:

$$d_{it} = \gamma y_{it-1} + \beta x_{it-1} + \mu_c + u_{it} \qquad (1)$$

where $d_{it}$ measures democracy for country $i$ in year $t$, $y_{it-1}$ is the lagged value of income per capita,
$x_{it-1}$ is a vector of lagged control variables which include education levels and population size, $\mu_c$ is
a vector of country-level dummy variables and $u_{it}$ is the error term. Does this address any concerns
that you may have had in Q1 above? Provide one example of a potential source of bias that might
still remain. [10%]

3. The researchers are advised to adopt a two-way fixed-effects specification and modify their
specification to include year dummy variables as well:

$$d_{it} = \gamma y_{it-1} + \beta x_{it-1} + \mu_c + \phi_t + u_{it} \qquad (2)$$

where $\phi_t$ is a vector of year dummy variables and all other variables are defined as before.

(a) Would you accept γ as the causal effect of income on democracy in this specification? Why or
why not? [10%]
(b) The researchers do not believe that the $u_{it}$ are independent and identically distributed but are
unsure of the appropriate way to compute standard errors in this instance. What would you
recommend? [5%]
(c) A reviewer points out that the Barro (1999) statement is not about year-to-year fluctuations in
income and democracy but rather about gradual changes. How would you modify the empirical
specification above to allow for longer-run effects of income on democracy (say over a period
of 5 years)? [15%]
(d) A second reviewer points out that, conditional on country and year fixed effects, there is very
little variation that is left in both national GDP per capita and democracy. Is that a cause for
potential concern here? Explain the potential impact of this issue on both remaining bias and
precision in Equation (2). [15%]
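
As a purely mechanical illustration of what the two-way fixed-effects specification in Equation (2) computes (not a model answer to any sub-part), the sketch below applies the within transformation to a simulated balanced panel. Everything in it is an assumption made for illustration: the data are simulated and noise-free, the control vector is omitted, and the function names `two_way_within` and `twfe_slope` are hypothetical.

```python
import random


def two_way_within(values):
    """Double-demean a balanced panel stored as values[country][year]."""
    n, t = len(values), len(values[0])
    unit_mean = [sum(row) / t for row in values]
    time_mean = [sum(values[i][s] for i in range(n)) / n for s in range(t)]
    grand_mean = sum(unit_mean) / n
    return [
        [values[i][s] - unit_mean[i] - time_mean[s] + grand_mean for s in range(t)]
        for i in range(n)
    ]


def twfe_slope(d, y_lag):
    """Slope on y_lag after removing country and year effects.

    Valid as written only for a balanced panel with a single regressor.
    """
    d_w = two_way_within(d)
    y_w = two_way_within(y_lag)
    num = sum(a * b for ry, rd in zip(y_w, d_w) for a, b in zip(ry, rd))
    den = sum(a * a for ry in y_w for a in ry)
    return num / den


# Simulated (hypothetical) data: income is correlated with both sets of effects.
rng = random.Random(0)
n_countries, n_years, true_gamma = 30, 20, 0.5
mu = [rng.gauss(0, 1) for _ in range(n_countries)]   # country effects
phi = [rng.gauss(0, 1) for _ in range(n_years)]      # year effects
y_lag = [[mu[i] + phi[s] + rng.gauss(0, 1) for s in range(n_years)]
         for i in range(n_countries)]
# Noise-free outcome, so the within estimator recovers true_gamma up to rounding.
d = [[true_gamma * y_lag[i][s] + mu[i] + phi[s] for s in range(n_years)]
     for i in range(n_countries)]

gamma_hat = twfe_slope(d, y_lag)
```

With an unbalanced panel or additional controls, the double-demeaning shortcut no longer applies directly and the dummy variables (or an iterative demeaning routine) would be needed instead.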

4. Annoyed by reviewers, the researchers consider potential instrumental variables which can be used
to obtain a consistent estimator for γ in Equation (2). They consider three potential instruments for $y_{it-1}$,
each of which is available for each country in each year in the study period:

(a) Worker strikes: They collate all reports of industrial and political strikes in each country for
each year. Their intuition is that worker strikes reduce national income in a given year.
(b) Past savings rates (as a proportion of GDP): Their intuition is that past savings lead to growth
in the domestic economy in future periods.
(c) Trade-weighted income shocks to other countries: They compute the share of trade between
country $i$ and country $j$ in the GDP of country $i$ using (time-invariant) trade shares in the
1980s, which they denote by $\omega_{ij}$; this is computed for every country pair out of the $N$ countries
in the sample. Using this measure, they compute a trade-weighted income shock measure for
each country which is given by

$$Q_{it} = \sum_{j=1,\, j \neq i}^{N} \omega_{ij} Y_{jt-1} \qquad (3)$$

where $j$ indexes countries, $Y_{jt}$ is (total) GDP for country $j$ at time $t$ and $\omega_{ij}$ are the time-invariant
trade share weights defined above. The intuition is that income shocks to country $i$'s
trading partners in the last period would spill over through international trade to affect country
$i$'s income this period.

For each of the three potential IVs, what is the implied exclusion restriction and do you think
it is plausible? State your reasoning and, if you think the IV might be invalid, provide one
example of a potential violation of the exclusion restriction. [25%]
(d) The researchers are certain that at least one of the instruments they consider is, in fact, valid.
However, they are not sure which one is valid. They use all three IVs to instrument $y_{it-1}$ (at
the same time). Under the assumption that all three IVs are valid, how would you interpret the
2SLS estimate $\hat{\gamma}$? The authors further report an over-identification test (Hansen J statistic)
with a p-value of 0.03. How would you interpret this statistic? [10%]
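
The construction of the trade-weighted income shock in Equation (3) can be sketched in a few lines. This is only an illustration of the arithmetic for a single year: the three-country weights and GDP figures are made up, and the function name `trade_weighted_shock` is hypothetical.

```python
def trade_weighted_shock(weights, lagged_gdp):
    """Return Q_it for every country i in one year.

    weights[i][j] is the time-invariant trade share of partner j in country i's GDP;
    lagged_gdp[j] is partner j's total GDP in the previous period.
    The own-country term (j == i) is excluded, as in Equation (3).
    """
    n = len(lagged_gdp)
    return [
        sum(weights[i][j] * lagged_gdp[j] for j in range(n) if j != i)
        for i in range(n)
    ]


# Made-up example with N = 3 countries (values are illustrative only).
weights = [
    [0.0, 0.2, 0.1],
    [0.3, 0.0, 0.05],
    [0.1, 0.4, 0.0],
]
lagged_gdp = [100.0, 200.0, 50.0]

q = trade_weighted_shock(weights, lagged_gdp)
```

For the first country in this made-up example, the measure is 0.2 × 200 + 0.1 × 50 = 45. Because the weights are fixed over time, all year-to-year variation in the measure comes from the partners' lagged GDP.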

Question B: Nudging to reduce electricity consumption (40 points)

An electricity utility company wants to encourage customers to reduce their energy consumption through,
for example, installing more energy-efficient appliances. To this end, it designs personalized energy reports
(see example below) targeted at high-consumption customers.

The company wants to test their effectiveness in a randomized control trial (RCT) before full-scale
rollout. The company randomly allocates 500 neighbourhoods to the program, out of a total of 2000
neighbourhoods served by the company, retaining the other neighbourhoods as control neighbourhoods.
In treatment neighbourhoods, the reports are sent by email to the top 50% of consumers in the past 12
months. The email is sent out to each treated household along with their monthly (30-day) bill, which is
generated at different dates for households, depending on the date the household had initially signed up
with the company, and a new report is sent after every 3 months for a full year (i.e. a total of 4 reports
per treated customer).¹ Electricity consumption is observed at daily frequency for at least 12 months
before the trial and for at least 12 months after the trial. You are hired to analyze the trial to provide
an estimate of the effectiveness of this program.

¹ Bills are generated at 30-day frequencies. For example, a household that signed up on the 1st of the month and one that signed up on the 20th of the month would receive their bills on different dates.
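
The assignment mechanism described above (500 of 2000 neighbourhoods randomized into treatment, with reports going to the top half of consumers) can be sketched as follows. This is a minimal illustration under stated assumptions, not the company's actual procedure: the neighbourhood identifiers, the seed, the function names, and the "at or above the median" eligibility rule applied to a toy set of households are all hypothetical.

```python
import random
from statistics import median


def assign_treatment(neighbourhood_ids, n_treated, seed):
    """Randomly pick n_treated neighbourhoods; return {id: True if treated}."""
    rng = random.Random(seed)
    treated = set(rng.sample(neighbourhood_ids, n_treated))
    return {nid: (nid in treated) for nid in neighbourhood_ids}


def eligible_households(past_consumption):
    """Households at or above median past consumption (assumed eligibility rule)."""
    cutoff = median(past_consumption.values())
    return {h for h, kwh in past_consumption.items() if kwh >= cutoff}


# Hypothetical identifiers 0..1999 for the 2000 neighbourhoods.
neighbourhoods = list(range(2000))
assignment = assign_treatment(neighbourhoods, n_treated=500, seed=2021)
n_treated = sum(assignment.values())

# Toy past-12-month consumption (kWh) for one treated neighbourhood.
toy_consumption = {"h1": 10.0, "h2": 20.0, "h3": 30.0, "h4": 40.0}
report_recipients = eligible_households(toy_consumption)
```

Note that randomization here happens at the neighbourhood level, while eligibility for the report is determined at the household level within treated neighbourhoods.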

1. Assume that the primary objective is only to see the effects on electricity consumption one year
after the trial has ended. What would be the empirical specification you would suggest? What level
should standard errors be clustered at? [10%]

2. The company believes that the nudges are likely to have significant effects soon after the report arrives
but that these fade quite quickly in the following weeks. Can you suggest an empirical specification
which allows for an analysis of these fade-out effects? Provide the empirical specification and the
interpretation that you would give to the relevant coefficients. [15%]

3. The product development team at the company also digitally tracks whether the email was ever
opened and points out that the intervention could only have affected households which did, in fact,
open the email and see their report. Thus, they suggest comparing those treated households who
opened the email to the control group to estimate the effects of the intervention. Do you agree with
this suggestion? If yes, how would you implement it? If not, explain what you would do instead
to estimate the effect of the intervention while addressing the concern of the product development
team. [20%]

4. An SSE MSc graduate, who is part of the trainee program at this company, points out that it is
possible to estimate effects of the intervention using a regression discontinuity design (RDD) within
the treatment group (since all households at or above median consumption in the neighbourhood
are sent the email but not households in the bottom half of pre-treatment consumption).

(a) Explain how you would use this discontinuity to estimate the treatment effect. Provide the
empirical specification you would employ. [15%]
(b) Using the RDD, the researcher estimates that the intervention led to an average decline in
electricity consumption of 2%. She is confused, however, that the estimate from the RCT that she
obtained in Q2 is a decline of 4%. Assuming that the identification assumptions of both strategies
are satisfied, what could account for the discrepancy between the two estimates? Assuming
sample size is not a constraint, can you verify whether this explanation is plausible? Which of
the two estimates (RCT vs. RDD) would you prefer as the “headline” result and why? [20%]

5. The researchers note that 10% of customers discontinue services between the beginning of the
experiment and the last period in which the outcomes are collected for the study. For each of the following
two scenarios, comment on (i) whether this poses a problem for their analysis and (ii) if so,
a potential way to address it:

(a) Whether a household drops out is uncorrelated with the treatment (the rate of discontinuation of services
is not statistically different for households that were treated vs not), but discontinuation is
higher for households who live in apartments than for households who live in houses (a characteristic
observed by the company). [10%]
(b) 5% of the control group households discontinue services whereas 15% of the treated households
discontinue services. The researchers hypothesise that this is related to annoyance from the
treatment, which they do not observe directly. [10%]

Question C: Causal inference (20 points)

All methods in causal inference rely on identification assumptions that are fundamentally
unverifiable. Hence, we should not treat individual research designs as being more or less
reliable and/or useful than other designs for policy analysis.

Do you agree with this statement? Why or why not? Explain through explicit discussion of (a)
cross-sectional multivariate regression analysis, (b) instrumental variables, (c) difference-in-differences,
(d) regression discontinuity designs and (e) randomized control trials as tools for the analysis of treatment
effects. Clarify which you would choose under what circumstances assuming that you are interested in
internal validity, external validity and statistical precision.
