● Difference between correlation and causality
● Introduced the potential outcomes model
● Discussed the parameters of interest: the average treatment effect (ATE) and the average treatment effect on the treated (ATET)
● Policy makers and economists are often interested in causal effects.
● The potential outcomes model provides a statistical framework to analyze causal effects.
● If (in a nonexperimental setting) some outcome is correlated with treatment, this does
not necessarily imply causality.
● Social experiments provide a straightforward approach to estimate causal effects.
● Why are there not more social experiments?
○ 1. Social experiments may be costly.
○ 2. Ethical considerations.
● Selection issues are often present and complicate analysis.
● Selection can (often) be considered as an omitted variables problem.
● Omitted variables may cause biased and inconsistent estimates.
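The omitted variables point can be illustrated with a small simulation. The sketch below is a hypothetical example, not taken from the course material: the confounder `z`, the coefficients and the true effect of 1.0 are all assumed for illustration. It compares a regression that omits `z` with one that controls for it (through Frisch-Waugh-Lovell partialling out), using only the Python standard library.

```python
import random

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def residualize(v, z):
    """Residuals of v after regressing it on z (with intercept)."""
    b = ols_slope(z, v)
    mv = sum(v) / len(v)
    mz = sum(z) / len(z)
    return [vi - mv - b * (zi - mz) for vi, zi in zip(v, z)]

rng = random.Random(0)
n = 20_000
true_effect = 1.0  # assumed causal effect of D on Y

# Z is a confounder (for example ability): it affects both D and Y.
z = [rng.gauss(0, 1) for _ in range(n)]
d = [1.0 if 0.8 * zi + rng.gauss(0, 1) > 0 else 0.0 for zi in z]
y = [true_effect * di + 2.0 * zi + rng.gauss(0, 1) for di, zi in zip(d, z)]

# Regression of Y on D that omits Z: the slope also picks up the effect of Z.
biased = ols_slope(d, y)

# Controlling for Z: regress residualized Y on residualized D.
controlled = ols_slope(residualize(d, z), residualize(y, z))

print(f"omitting Z: {biased:.2f}, controlling for Z: {controlled:.2f}")
```

In this setup the estimate that omits `z` lies well above the assumed effect, while the estimate that controls for `z` is close to it. In practice the confounder is often unobserved, which is why other methods are needed.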
WEEK 2
● There are many reasons why regressors may be endogenous (omitted variables,
reverse causality, measurement error, sample selection).
● Instrumental variables can deal with endogenous regressors.
● What a good instrument is depends on the application.
● Showed that the IV estimate can be obtained using the 2SLS estimator.
● It is important that the instrumental variable is exogenous.
● IV estimators are not unbiased, but are consistent.
● Bias can be severe if the instrument is weak (low predictive power for the endogenous regressor).
● IV estimates the average treatment effect for the compliers (LATE).
● Compliers are individuals who change treatment status when the instrument changes
value.
● Compliers cannot be identified directly.
● The biggest challenge is finding good instruments (exogenous and relevant).
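The 2SLS idea summarized above can be sketched for the simplest case of one endogenous regressor and one instrument. Everything in this example is assumed for illustration (the data-generating process, the true effect `beta`, the first-stage coefficient 0.7); it is a minimal simulation, not the course's own implementation.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (dividing by n)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
n = 50_000
beta = 0.5  # assumed true causal effect of x on y

u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: independent of u
x = [0.7 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]   # endogenous regressor
y = [beta * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]  # outcome

# OLS is inconsistent here because x is correlated with u.
ols = cov(x, y) / cov(x, x)

# Stage 1: regress x on z and compute fitted values.
mz, mx = mean(z), mean(x)
first_stage = cov(z, x) / cov(z, z)
x_hat = [mx + first_stage * (zi - mz) for zi in z]

# Stage 2: regress y on the fitted values.
tsls = cov(x_hat, y) / cov(x_hat, x_hat)

# With a single instrument, 2SLS equals the ratio cov(z, y) / cov(z, x).
iv_ratio = cov(z, y) / cov(z, x)

print(f"OLS: {ols:.3f}, 2SLS: {tsls:.3f}, IV ratio: {iv_ratio:.3f}")
```

Lowering the first-stage coefficient towards zero makes the instrument weak, and the 2SLS estimate then becomes much noisier across simulation runs.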
WEEK 3
● Randomized experiments make treatment assignment independent of potential outcomes, so that causal effects can be estimated.
● Experiments vary in scale and in setting (field or laboratory).
● However, the design is crucial: mistakes in the design cannot be corrected ex post.
● A balancing table is a way to check ex post whether randomization was done correctly.
● There are alternatives to full randomization if that is complicated or infeasible.
● Power analysis computes the required sample size of the experiment.
● Possible complications for simple analysis: nonrandom selection, attrition,
noncompliance, externalities.
● Attrition is only problematic if it is related to potential outcomes (otherwise it only
reduces power).
● External validity (population, context, administration, equilibrium effects, Hawthorne
effect).
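As an illustration of the power analysis mentioned above, the sketch below computes the required sample size per arm for a comparison of two means. It rests on assumptions that are not stated in the notes: two equally sized groups, a common and known standard deviation, a two-sided test and a normal approximation. The function name `required_n_per_arm` is made up for this example.

```python
from math import ceil
from statistics import NormalDist

def required_n_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Sample size per group needed to detect a difference in means `effect`.

    Uses n = 2 * (z_{1 - alpha/2} + z_{power})^2 * sd^2 / effect^2,
    the standard normal-approximation formula for two equal-sized groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / effect ** 2)

# Example: detect an effect of 0.2 standard deviations.
n_small_effect = required_n_per_arm(effect=0.2, sd=1.0)
# Halving the effect size roughly quadruples the required sample.
n_smaller_effect = required_n_per_arm(effect=0.1, sd=1.0)
print(n_small_effect, n_smaller_effect)
```

The formula makes the trade-off explicit: smaller expected effects, noisier outcomes or higher desired power all increase the required size of the experiment.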
WEEK 4
● Regression discontinuity allows estimation of causal effects in cases where treatment
is endogenous
● Requires a discontinuous jump of treatment probability in the running variable
● If the probability of treatment jumps from 0 to 1, discontinuity is sharp, otherwise it is
fuzzy
● Important to check:
○ Specification of the relationship between outcome and running variable (may be non-linear)
○ Bunching (manipulating running variable)
○ Continuity of other covariates around the threshold
● Various models exist for binary (dummy) dependent variables.
● Linear probability models are easy to analyze and easy to interpret.
● However, the linear functional form may be inconvenient: predicted probabilities can fall outside the interval from 0 to 1.
● Logit and Probit models guarantee that probabilities are bounded between 0 and 1.
● However, the interpretation of their coefficients is not straightforward.
● Link to consumer choice: the choice is modeled as based on latent utility (utility is unobserved;
only the chosen outcome is observed)
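The contrast between the linear probability model and the logit model can be made concrete with a few lines of code. The coefficients below are hypothetical and chosen only for illustration. The logit probability corresponds to a latent utility model in which the treatment or choice equals 1 when a + b·x + ε > 0 and ε follows a logistic distribution.

```python
from math import exp

def lpm_probability(a, b, x):
    """Linear probability model: P(D = 1 | x) = a + b * x (not bounded)."""
    return a + b * x

def logit_probability(a, b, x):
    """Logit model: P(D = 1 | x) = 1 / (1 + exp(-(a + b * x)))."""
    return 1 / (1 + exp(-(a + b * x)))

def logit_marginal_effect(a, b, x):
    """Derivative of the logit probability with respect to x: b * p * (1 - p)."""
    p = logit_probability(a, b, x)
    return b * p * (1 - p)

# Hypothetical coefficients, for illustration only.
for x in (-4, 0, 4):
    print(
        x,
        round(lpm_probability(0.5, 0.2, x), 3),       # leaves [0, 1] for large |x|
        round(logit_probability(0.0, 1.0, x), 3),     # always between 0 and 1
        round(logit_marginal_effect(0.0, 1.0, x), 3), # depends on x
    )
```

The last column shows why logit coefficients are harder to interpret: the effect of x on the probability is not constant but depends on where it is evaluated, whereas in the linear probability model it equals the coefficient everywhere.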
WEEK 5
● Panel data describe observations of individuals/regions/firms/etc. over time.
● Panel data models can deal with unit specific effects and can solve a lot of omitted
variable bias problems.
● Fixed effects or random effects model: the random effects model is more efficient but needs
stronger assumptions.
● Usual panel data models assume strict exogeneity of regressors.
● Models with lagged endogenous (dependent) variables as regressors are dealt with using instrumental variable methods.
● Policy changes can be used for evaluation with observational data
● The before-after estimator compares outcomes before and after a policy is
implemented
● To correct for other things that change over time, subtract the change in the control group:
the difference-in-differences (DD) estimator
● The DD estimator identifies the causal effect if the common trend assumption holds
● Look at pre-trends and do placebo tests to investigate the plausibility of this assumption
● The DD estimate can be obtained by performing a simple regression
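The equivalence between the difference-in-differences calculation and a regression can be checked on a toy data set. The eight observations below are invented for illustration. The regression includes a constant, a treatment-group dummy, a post-period dummy and their interaction; the small `solve` and `ols` helpers are written out only to keep the sketch free of external libraries.

```python
def mean(v):
    return sum(v) / len(v)

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: (treated group, post period, outcome).
data = [
    (0, 0, 10), (0, 0, 12), (0, 1, 13), (0, 1, 15),
    (1, 0, 20), (1, 0, 22), (1, 1, 28), (1, 1, 30),
]

def cell_mean(group, period):
    return mean([y for g, t, y in data if g == group and t == period])

# Before-after estimator for the treated group.
before_after = cell_mean(1, 1) - cell_mean(1, 0)
# Difference-in-differences: subtract the change in the control group.
dd_means = before_after - (cell_mean(0, 1) - cell_mean(0, 0))

# Regression: y = b0 + b1 * treated + b2 * post + b3 * treated * post.
X = [[1, g, t, g * t] for g, t, _ in data]
y = [outcome for _, _, outcome in data]
coefficients = ols(X, y)
dd_regression = coefficients[3]  # coefficient on the interaction term

print(before_after, dd_means, round(dd_regression, 6))
```

In this toy example the before-after estimator overstates the effect because outcomes also rise in the control group, and the interaction coefficient reproduces the difference of the four cell means.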
WEEK 6
● The heritability of social science outcomes is considerable
● Genetic markers influence social science outcomes through non-deterministic
pathways that are likely to be difficult to disentangle
● Social science outcomes are likely influenced by a very large number of SNPs, each
with tiny effect sizes
● Large sample sizes are key to achieving well-powered analyses
● GWAS methodology in combination with increasingly large sample sizes has
resulted in the discovery of many genome-wide significant SNPs for social science
outcomes
● The causality of genetic effects found in current GWAS can be questioned
● Principal components control for population stratification, but imperfectly
● Better solutions are possible when genetic data of family members is included in the
analysis, but such data is still too scarce for use in GWAS
● Polygenic scores are a summary measure of genetic endowments at the individual
level and are sufficiently predictive to be used in “regular” econometric analyses
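A polygenic score is commonly constructed as a weighted sum of an individual's allele counts, with weights taken from GWAS effect size estimates. The sketch below shows that calculation for made-up numbers: the three SNP weights and the genotypes are hypothetical, and real scores use far more SNPs and are usually standardized afterwards.

```python
def polygenic_score(genotypes, weights):
    """Weighted sum of allele counts (0, 1 or 2 copies of the effect allele per SNP)."""
    if len(genotypes) != len(weights):
        raise ValueError("one weight is needed per SNP")
    return sum(w * g for w, g in zip(weights, genotypes))

# Hypothetical GWAS effect size estimates for three SNPs.
weights = [0.02, -0.01, 0.03]

# Hypothetical individuals with their allele counts at the same three SNPs.
person_a = [2, 0, 1]
person_b = [0, 2, 0]

score_a = polygenic_score(person_a, weights)
score_b = polygenic_score(person_b, weights)
print(score_a, score_b)
```

The resulting score is a single number per individual, which is what allows it to enter a regression as an ordinary explanatory variable.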
Week 1.1 - Instrumental variables
Difference between correlation and causality
Causality is about questions such as:
● What would have happened
● What would happen
→ This requires knowing about unobserved outcomes, because we only observe one potential
outcome.
E.g. ‘Do people earn more if they complete university education?’
Correlation is a measure of the association between two variables.
→ One approach would be to compare those with university education to those without
university education.
However, correlation between D and Y can be caused by
1. A causal effect of D on Y
2. A causal effect of Y on D
3. Omitted variables: Z affects both D and Y
Potential outcome model
= general model to think about causal effects.
● Assume a treatment or choice variable can take two values (0/1)
● Each individual i has two potential outcomes, Y1i* with treatment and Y0i* without
treatment.
● Only one potential outcome is observed (factual). The unobserved outcome is the
counterfactual outcome.
● For an individual the effect of participating in the treatment equals:
∆i = Y1i* − Y0i*
● ∆i is always an unobserved random variable, because only one of the random
variables Y1i* and Y0i* is observed. (= the fundamental problem of causal inference)
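The potential outcomes model can be made concrete with a simulation in which, unlike in real data, both potential outcomes are generated and therefore known. The sketch below is hypothetical: the distributions, the selection rule and all variable names are assumptions for illustration. It compares the naive difference in observed means under self-selection and under random assignment with the true ATE and ATET.

```python
import random

rng = random.Random(42)
n = 100_000

def mean(values):
    return sum(values) / len(values)

# Both potential outcomes for every individual (never jointly observed in practice).
y0 = [rng.gauss(10, 2) for _ in range(n)]    # outcome without treatment
gain = [rng.gauss(2, 1) for _ in range(n)]   # individual effect: Y1 - Y0
y1 = [a + g for a, g in zip(y0, gain)]       # outcome with treatment

# Self-selection: individuals with a high Y0 or a high gain are more likely to participate.
d_self = [1 if (a - 10) + (g - 2) + rng.gauss(0, 1) > 0 else 0
          for a, g in zip(y0, gain)]
# Random assignment: treatment is independent of the potential outcomes.
d_rand = [rng.randint(0, 1) for _ in range(n)]

def naive_difference(d):
    """Difference in mean observed outcomes between treated and untreated."""
    treated = [y1[i] for i in range(n) if d[i] == 1]    # observe Y1 if treated
    untreated = [y0[i] for i in range(n) if d[i] == 0]  # observe Y0 otherwise
    return mean(treated) - mean(untreated)

ate = mean(gain)                                              # average treatment effect
atet = mean([gain[i] for i in range(n) if d_self[i] == 1])   # effect on the treated
naive_self = naive_difference(d_self)
naive_rand = naive_difference(d_rand)

print(f"ATE {ate:.2f}, ATET {atet:.2f}, "
      f"naive (self-selected) {naive_self:.2f}, naive (randomized) {naive_rand:.2f}")
```

Under self-selection the naive comparison mixes the treatment effect with pre-existing differences in Y0, while under random assignment it is close to the ATE.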