EC902/907 – Quantitative Methods: Econometrics A
Sample exam questions – Spring 2021

1. Public smoking bans

In recent years, a growing awareness of the deadly
effects of smoking has led most industrialized countries to enact
tobacco control policies. Smoking bans may affect smoking prevalence
within the population, and smoking intensity among smokers. However, to
date, surprisingly little research has been done on whether people
change their smoking habits as a result of smoking bans. In 2007
several German states introduced public smoking bans in the hospitality
industry (bars, restaurants, and dance clubs). The following graph shows
the evolution of the smoking rate over time according to survey
information in those states of Germany that approved the smoking ban
(lower line) and those states of Germany that by 2008 had no smoking ban
(upper line).

a) Would it be appropriate to use a differences-in-differences approach in order to obtain a consistent estimate of the effect of smoking bans? Discuss briefly which assumptions need to be satisfied and whether you expect them to be satisfied in this context.

Probably yes. The crucial assumption for DID is that, in the absence of the ban, the smoking rate would have evolved similarly in the treatment and the control group (the so-called parallel trends assumption). The graph suggests that the smoking rates evolved similarly in the past, which constitutes supportive evidence (although a statistical test would still be needed to verify this). We should also verify whether states in the treatment group simultaneously adopted other policies to reduce smoking (e.g. changes in tobacco tax rates) and whether the treatment may have affected the control group (anticipation effects, spillovers, etc.).

Common mistake: stating that both groups are assumed to be similar. (Note: in a DID setting they are assumed to evolve similarly, which is a less demanding condition.)
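The DID logic discussed in a) can be sketched in a few lines of Python. This is only an illustration: the smoking rates below are hypothetical numbers (they are not read off the graph in the question), and the names `did_estimate` and `pre_trend_gaps` are made up for this example. The sketch computes the simple 2x2 DID estimate and, as an informal look at parallel trends, the treated-minus-control gap in each pre-ban year.

```python
# Hypothetical smoking rates in percent, by survey year.
# These numbers are illustrative assumptions, NOT data from the question.
treated = {2002: 27.0, 2004: 26.0, 2006: 25.0, 2008: 23.5}
control = {2002: 30.0, 2004: 29.0, 2006: 28.0, 2008: 27.0}
BAN_YEAR = 2007  # year in which the bans were introduced (from the question)


def did_estimate(treated, control, pre_year, post_year):
    """2x2 DID: change in the treated group minus change in the control group."""
    change_treated = treated[post_year] - treated[pre_year]
    change_control = control[post_year] - control[pre_year]
    return change_treated - change_control


def pre_trend_gaps(treated, control, ban_year):
    """Treated-minus-control gap in every pre-ban year.

    Under parallel trends this gap should be roughly constant over time.
    """
    return {y: treated[y] - control[y] for y in sorted(treated) if y < ban_year}


effect = did_estimate(treated, control, pre_year=2006, post_year=2008)
gaps = pre_trend_gaps(treated, control, BAN_YEAR)

print("DID estimate (p.p.):", effect)
print("Pre-ban gaps (p.p.):", gaps)
```

A constant pre-ban gap is only suggestive evidence. As noted in the answer, a formal test of pre-trends and a check for simultaneous policies or spillovers would still be needed.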
b) Imagine that you have access to a database with survey information on smoking behavior in Germany during years 2002 and 2008. This database includes information for 85,000 individuals living in treated and control states. If you use this database to estimate the impact of smoking bans on smoking behavior, at which level would you cluster your standard errors: (i) individual level, (ii) individual*year level, (iii) state level, or (iv) state*year level? Explain briefly why.

(iii) State level. The treatment is defined at the state level, so there might be common shocks that affect all individuals within a state in a given year, and outcomes within a state are serially correlated over time. Note: option (iv), state*year level (e.g. Bavaria-2005 would be a cluster), would not account for serial correlation within the same state over time.

c)
The authors find that the smoking rate decreased by 0.4 p.p. faster in the treatment group relative to the control group, with a standard error of 0.8 p.p. Discuss the statistical and the economic magnitude of this result.

Statistical significance: The effect of the ban is not statistically different from zero at standard levels. The 95% confidence interval (-0.4 ± 1.96 × 0.8 p.p.) implies that the public smoking ban may have decreased the smoking rate by up to 2 p.p. or increased it by up to 1.2 p.p.

Economic significance: At best, the ban may have caused a 2 p.p. decrease in smoking rates. That would imply that approximately 8% (2/26) of smokers quit smoking following the introduction of the ban. Personally I would say that, from a public health perspective, an 8% decrease in the number of smokers may be considered a substantial effect, but you were allowed to disagree.

In sum, we cannot rule out that the ban had an economically significant impact, but given the imprecision of the estimate we are unable to draw any conclusions.

2. Does compulsory school attendance affect schooling and earnings? (Angrist and Krueger 1991)

Using data from the 1980 census, Angrist and Krueger (1991) looked at the relationship between educational attainment and quarter of birth for men born from 1930 to 1959. The first figure below displays the relationship between education and quarter of birth for men born in the 1930s. The figure clearly shows that men born early in the calendar year tend to have lower average schooling levels. The second figure displays average earnings by quarter of birth for the same sample. Older cohorts tend to have higher earnings, because earnings rise with work experience. But the figure also shows that, on average, men born in early quarters of the year almost always earn less than those born later in the year. Importantly, this reduced form relationship parallels the quarter-of-birth pattern in schooling.

a) Explain briefly the rationale for the Angrist and Krueger (1991) approach. (Why did people born at the beginning of the year complete less schooling in the US?) (5 marks)

In the US students start school the year they turn six, but they can drop out as soon as they turn 16.
As a result, students born early in the year have the chance to drop out when they turn 16 at the beginning of the academic year, while those born at the end of the year have to complete the academic year before being allowed to drop out.

[Figure 4.1.1: Graphical depiction of first stage and reduced form for IV estimates of the economic return to schooling using quarter of birth (from Angrist and Krueger 1991). Panel A: Average Education by Quarter of Birth (first stage); vertical axis: Years of Education; horizontal axis: Year of Birth (30 to 39). Panel B: Average Weekly Wage by Quarter of Birth (reduced form); vertical axis: Log Weekly Earnings; horizontal axis: Year of Birth (30 to 39).]

b) Would you expect the exogeneity condition to be satisfied? Provide some example of a possible violation. (5 marks)

There might be a selection problem if some people try to strategically choose the timing of births (e.g. working women having children in summer). There might also be other factors that affect the timing of births. For instance, Buckles and Hungerman (REStat 2013) document large changes in maternal characteristics for births throughout the year (e.g. winter births are disproportionately to teenagers and unmarried women). Nonetheless, most authors tend to consider these differences relatively minor.

c) Would you expect the exclusion restriction to be satisfied? Provide some example of a possible violation. (5 marks)

There might be some minor violations.
Being born at the beginning of the year may affect individuals' labor market performance through channels other than length of education. Children born at the beginning of the year are older than their classmates, which may affect their non-cognitive abilities (e.g. Black et al. 2017). They are also older when they take exams, leading to better educational performance (for the same length of studies).

d) Using this empirical strategy, we learn about the impact of educational attainment on earnings for which type of individuals? (Who are the "compliers", in jargon?) (5 marks)

Compliers are those students who are eager to drop out as soon as they are legally allowed to. For them, the instrument (quarter of birth) determines how much of the treatment (years of education) they receive.

Common error (1): defining compliers based on the impact of the treatment, e.g. "compliers are kids that react to the treatment (schooling) by gaining skills". Note that being a complier is unrelated to whether the treatment has an effect on this group (that is an empirical question to be determined).

Common error (2): defining compliers based on when they are born.

3. Should I stay or should I go (to class)?

Andrietti, D’Addazio and Velasco (2008) examine the link between absenteeism and students’ performance using data from a Spanish university. Initially they follow an identification strategy based on observables. Their OLS estimates suggest that, conditional on a number of observable characteristics, students who attend class tend to obtain better grades (beta_OLS = 0.11). The set of observable characteristics includes information about students’ family background, habits and performance in previous courses. In order to deal with endogeneity concerns, they propose to use as instruments for attendance (i) the distance from the student’s house to campus and (ii) a dummy variable that indicates whether the student works.
Please reply to the following questions:

a) Discuss, for each instrument, whether the exogeneity assumption is likely to be satisfied.

Neither instrument is likely to be exogenous. For instance, socio-economic status might affect both the area where people live and whether they work.

b) Discuss, for each instrument, whether the exclusion restriction is likely to be satisfied. Can you think of some way to test empirically whether this assumption is satisfied?

In both cases the exclusion restriction is unlikely to hold. For instance, distance to the university might affect not only attendance but also the time available to study. Students who work may also have less time to study, etc.

c) Can you please verbally characterize who are the always takers, the never takers and the compliers in this context?

The always takers (never takers) always (never) attend lectures, independently of where they live or whether they work. Compliers attend lectures only when they do not work or when they live close enough.

d) The authors find that the IV estimate is equal to 0.50 (substantially larger than the OLS estimate). Can you please provide some reasonable explanation for the difference between IV and OLS estimates in this particular case?

There are two possible explanations. First, the IV estimate is likely to suffer from a weak instruments problem. Second, the IV estimate identifies the LATE, while OLS identifies the ATE. (Note also that omitted variable bias is likely to bias the OLS estimate upwards, and therefore cannot explain why the IV estimate is larger than the OLS one.)

4. The Effect of Peer Salaries

In the paper “Inequality at Work: The
Effect of Peer Salaries on Job Satisfaction”, David Card and co-authors
study the effect of disclosing information on peers’ salaries on
workers’ job satisfaction and job search intentions. They informed a
randomly chosen subset of employees of the University of California
about a new website listing the pay of University employees. Later on,
they surveyed all campus employees, eliciting information about their
use of the website, their pay and job satisfaction, and their job search
intentions. Their intervention had a large impact on access to
information. The fraction of people who used the website was 20
percent among workers who were not informed about the existence of the
website, and it rose to nearly 50 percent among workers who were
informed about it. Furthermore, the
intervention caused an increase in the intention to look for a new job
among workers with pay below the median for their department and
occupation group. In particular, individuals whom the researchers
informed about the existence of the website and whose
pay was below the median were 4.3 p.p. (std. error = 1.8 p.p.) more likely
to report that they were looking for a new job, relative to a benchmark
of 21.9% in the control group.

a) Imagine that we use “provision of information about the website” as an instrument for “access to information about peers’ salaries”. Propose some possible violation of the exclusion restriction.

For instance, some people may dislike learning that their salaries are reported online, even if they do not personally check the website. In that case, being informed would affect job search intentions directly and not only through access to information about peers’ salaries. By contrast, the exclusion restriction would be satisfied if the people who did not check the website had effectively not received the email at all, for instance because it went directly to their spam folder.

b) Let us assume that the exclusion restriction was satisfied. How does access to information on peers’ salaries affect the job search intentions of individuals with pay below the median? (i.e. report the IV estimate)

14.3 p.p. (= 4.3/0.3): the reduced-form effect of 4.3 p.p. divided by the first-stage increase in website use of 30 p.p. (from 20 percent to nearly 50 percent).

c) Quantify the share of always-takers, the share of never-takers and the share of compliers.

Always-takers: 20% (they use the website even when not informed). Never-takers: 50% (they do not use the website even when informed). Compliers: 30% (the remainder, assuming there are no defiers).

5. Regression discontinuity design

A key policy question is whether the
benefits of additional medical expenditures exceed their costs. An
article by Almond et al. (2010)¹ studies the impact of providing
extra care to newborns. They focus on the extra treatments received by
newborns weighing less than 1,500 grams, who are typically classified
by hospitals as “very low birth weight” (VLBW) and receive more
treatments than newborns with slightly larger weight. For instance, the
1,500 g threshold is commonly used as a point below which diagnostic
ultrasounds are used. The authors use a regression discontinuity
strategy that compares health outcomes and medical treatment provision
for newborns on either side of the very low birth weight threshold at
1,500 grams. As shown in the figure below, they find that newborns with
birth weights just below 1,500 grams have lower one-year mortality rates
than do newborns with birth weights just above this cutoff, even though
mortality risk tends to decrease with birth weight. One-year mortality
falls by approximately one percentage point as birth weight crosses
1,500 grams from above, which is large relative to mean infant mortality
of 5.5% just above 1,500 grams.

Figure note: Each point represents the average mortality within one year
for newborns with a certain weight.
¹ Douglas Almond, Joseph J. Doyle, Jr., Amanda E. Kowalski, Heidi Williams, Estimating Marginal Returns to Medical Care: Evidence from At-risk Newborns, The Quarterly Journal of Economics, Volume 125, Issue 2, May 2010, Pages 591–634.

a) Explain briefly what are the key requirements that would make the RDD strategy adequate in this context.

Answer: The crucial assumption for the validity of the regression discontinuity design is that there are no discrete changes in any relevant variable at the threshold, other than the treatment (i.e. being labeled as “very low birth weight”). A possible threat would be that some doctors or parents may be able to manipulate the official weight of newborns. For instance, it would be a problem if wealthy parents could pay doctors to classify their newborns as being below 1,500 grams so that they are entitled to receive special treatments. Another potential threat is the existence of other treatments based on this threshold. Finally, RDD requires a sufficiently large mass of observations around the threshold in order to provide precise estimates.

b) How would you verify the existence of manipulation of the running variable?

Answer: First, we should verify whether the density function is continuous at the discontinuity threshold. It would be worrying if the number of newborns weighing slightly less than 1,500 grams were much larger or smaller than the number of newborns just above the threshold. Second, we may want to verify that predetermined relevant factors evolve “smoothly” with respect to the running variable at the threshold. For instance, families with babies just above and below the threshold should have “similar” socio-economic characteristics, such as income or educational background.
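The two checks described in b) can be illustrated with a rough Python sketch. Everything here is an assumption made for illustration: the records are invented (birth weight in grams, mother's years of education), the 50 gram bandwidth is arbitrary, and the helper names `counts_around_cutoff` and `covariate_gap` are hypothetical. The sketch only compares raw counts and covariate means on each side of the cutoff; it is not a formal density test.

```python
THRESHOLD = 1500  # grams, the VLBW cutoff from the question

# Invented records: (birth weight in grams, mother's years of education).
# Illustrative assumptions only, NOT data from Almond et al. (2010).
records = [
    (1460, 12), (1475, 13), (1490, 11), (1495, 12),
    (1505, 12), (1510, 13), (1530, 11), (1545, 12),
    (1600, 14),
]


def counts_around_cutoff(weights, cutoff=THRESHOLD, bandwidth=50):
    """Number of observations just below and just above the cutoff.

    A large imbalance would hint at manipulation of the running variable.
    """
    below = sum(1 for w in weights if cutoff - bandwidth <= w < cutoff)
    above = sum(1 for w in weights if cutoff <= w < cutoff + bandwidth)
    return below, above


def covariate_gap(records, cutoff=THRESHOLD, bandwidth=50):
    """Mean of a predetermined covariate just below minus just above the cutoff.

    A gap far from zero would suggest that families on the two sides differ.
    """
    below = [x for w, x in records if cutoff - bandwidth <= w < cutoff]
    above = [x for w, x in records if cutoff <= w < cutoff + bandwidth]
    return sum(below) / len(below) - sum(above) / len(above)


n_below, n_above = counts_around_cutoff([w for w, _ in records])
gap = covariate_gap(records)

print("Observations just below / above the cutoff:", n_below, n_above)
print("Gap in mother's education (years):", gap)
```

In practice one would apply these comparisons to the full sample, try several bandwidths, and complement them with a formal test of continuity of the density at the threshold.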