Detailed notes for EC220 Introduction to Econometrics (Michaelmas Term) by a student that achieved 77% in the exam. The notes have been broken down by each week to make it easier to follow with on-going lectures, and equally when it comes to revising for the exam.
Written for: London School of Economics (LSE), BSc Economics, EC220
Seller: tc_econ
EC220 – MT Week 1
Lecture 1: Causality and Correlation
In MT, we’ll focus on applied econometrics, particularly causal questions such as “what-if” questions.
We can almost always frame a question as a “what-if” question:
• What happens to a country if it withdraws from a trade agreement?
• What is the impact on your health if you go to a hospital?
Does ice cream cause deaths by drowning?
The data show a strong correlation between ice cream consumption and drowning. However, what really
causes drowning? A likely common cause is hot weather: when it is hot, more people go swimming, so
the probability of a death by drowning increases. When it is hot, people also eat more ice cream.
A causal relationship between two variables
If we say A causes B, we mean that event A contributes to/influences the occurrence of event B.
Remember, we’re saying that the cause A is partly (not totally) responsible for the effect B, and that
the effect B is partly dependent on the cause A.
A can be necessary for the occurrence of B (e.g. did austerity cause a vote to leave?) but another
definition is that fluctuations in A can simply lead to fluctuations in B (e.g. when the central bank
changes interest rates, the next month we may see the inflation rate change).
A third variable can cause both the events – confounder bias
We have a couple of reasons why two correlated variables may not form a causal
relationship – one is that we have another common event that causes the two
events (see ice cream example) → therefore, we cannot claim correlation as
causation.
In formal terminology: event A is called the treatment; event B is called the outcome; the third
variable that causes the two events to happen is called the confounder.
So, we might observe a correlation between the treatment and outcome, but this relationship is not
necessarily causal. One of the reasons is that we might have a confounder C. Even when the
relationship is causal, the existence of the confounder C might “bias” the causal relationship we’re
interested in (see illness example below).
Note: timing doesn’t always help establish causality. Ice cream production predates drowning, since
the producers watch a forecast and predict temperature, but that doesn’t mean ice creams cause
drowning.
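The ice cream example can be illustrated with a small simulation. This is only a sketch under assumed numbers: the variable names, coefficients and noise levels below are hypothetical and not taken from the lecture. Temperature drives both ice cream sales and drownings, and neither affects the other, yet the two are clearly correlated until temperature is held fixed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process: temperature is the confounder C.
temperature = rng.normal(20.0, 5.0, n)
ice_cream = 2.0 * temperature + rng.normal(0.0, 5.0, n)   # depends on heat only
drownings = 0.5 * temperature + rng.normal(0.0, 2.0, n)   # depends on heat only


def residualise(y, x):
    """Return the part of y that is not linearly explained by x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (intercept + slope * x)


# Raw correlation between treatment (ice cream) and outcome (drownings).
raw_corr = np.corrcoef(ice_cream, drownings)[0, 1]

# Correlation after removing the part of each variable explained by temperature.
partial_corr = np.corrcoef(
    residualise(ice_cream, temperature),
    residualise(drownings, temperature),
)[0, 1]

print(f"raw correlation: {raw_corr:.2f}")
print(f"correlation holding temperature fixed: {partial_corr:.2f}")
```

In this made-up setting the raw correlation is strongly positive, while the correlation after accounting for temperature is close to zero, which is what we would expect if the confounder is the only link between the two variables.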
Reasons that A and B are correlated
1. Direct causation – A causes B
2. Reverse causation – B causes A
3. Confounder – A and B are consequences of a common cause, but do not cause each other
4. Bidirectional causation – A causes B and B causes A
5. Indirect causation – A causes C which then causes B
6. Pure coincidence – no connection between A and B
All statistical techniques only establish correlation (the extent to which A and B tend to decrease and
increase at the same time), not causation which requires interpretation. This is where econometrics
comes in.
Causation can occur without correlation
We might not observe correlation, even if we have causation.
For example, illness (A) can cause death (B), but nowadays healthcare (C)
can eliminate the correlation between common illness and death.
Here “+1” and “−1” simply mark a positive or a negative correlation. Strictly speaking, “+1/−1” should
refer to “perfectly positive/perfectly negative correlation”, but ignore this. Remember, we can only
observe a correlation from a statistical output; causality comes from our interpretation. So, illness
positively correlates with death (+1), as it should in theory, and we can infer causality from this
correlation using medical knowledge.
But if we have a good healthcare system (C), the number of medicine treatments should be
positively correlated with the number of illness cases (hence we have a +1 relationship between A
and C). That is, nearly every ill person gets treated. In this case, although it is incorrect to say that
the number of treatments increases illness, we want to make the statement as general as possible,
so we use causal language.
The number of medical treatments should be negatively correlated with the number of deaths
(obviously), hence B is negatively correlated with C (−1). And using medical knowledge, we can
interpret this relationship as C reduces B.
So, we might observe that illness and death are uncorrelated and question whether illness causes
death. But this is a critical mistake as we have omitted a confounder (a bias) in the analysis. This
confounder C has cancelled out the positive effect of A on B by confounding both events A and B. In
our example, this confounder has positive and negative correlations with the treatment and the
outcome that might have caused the correlation of interest to disappear.
We will learn more about this, and how to correct for this confounder bias, in the next few lectures.
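A minimal sketch of the illness example, under assumed numbers: the coefficients (+1 for illness, −1 for treatment) and noise levels are hypothetical and chosen so that the two channels cancel exactly. It shows that A and B can be uncorrelated even though A has a positive effect on B once C is included in the analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical data-generating process for the illness example.
illness = rng.normal(0.0, 1.0, n)                      # A
treatment = illness + rng.normal(0.0, 0.3, n)          # C: nearly every ill person is treated
death_risk = (1.0 * illness                            # B: illness raises risk ...
              - 1.0 * treatment                        # ... treatment lowers it
              + rng.normal(0.0, 1.0, n))

# Looking at A and B alone, the correlation is roughly zero.
corr_ab = np.corrcoef(illness, death_risk)[0, 1]

# Regressing B on both A and C recovers the two opposing effects.
X = np.column_stack([np.ones(n), illness, treatment])
coef, *_ = np.linalg.lstsq(X, death_risk, rcond=None)

print(f"corr(A, B): {corr_ab:.2f}")
print(f"effect of A holding C fixed: {coef[1]:.2f}")
print(f"effect of C holding A fixed: {coef[2]:.2f}")
```

The exact cancellation here is a feature of the chosen coefficients; with other values the correlation between A and B would be weakened rather than eliminated.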
Lecture 2: Counterfactuals
What is the effect of health insurance on health?
This is a causal question. We want to compare:
- The health of someone with insurance (𝑌1𝑖 )
- The health of the same person without insurance (𝑌0𝑖 )
However, the issue is that you can’t observe someone having insurance and not having insurance at
the same time. So, in practice we get some data from surveys (e.g. 2009 US NHIS):
- “Would you say your health in general is excellent, very good, good, fair or poor?” (health
outcome) → excellent ==5, …, poor ==1
- “Are you covered by private hospital insurance?” (treatment)
However, we see differences from the survey (e.g. the health of married couples with health
insurance has a higher average health index).
What can cause such differences?
i) True causation: observed difference is the causal effect of treatment we’re looking for
a. Having health insurance may lead to better health because of better healthcare
ii) Reverse causation: the decision to get treated (buying insurance) is caused by the
outcome (the health status of the insured)
a. Adverse selection: the less healthy are more likely to buy insurance → so their
health outcomes can be lower regardless of health insurance
iii) Confounders: there are some variables that cause the individuals to get the treatment, and
those variables also affect the outcome
a. The more educated tend to buy insurance more often and they know how to live
healthier
iv) Other situations e.g. pure coincidence
Now let’s go through some terminology:
- Health insurance coverage for individual 𝑖 is described by a binary random variable, the
treatment: 𝐷𝑖 = {0,1}
o 0: not treated (i.e. no insurance); 1: treated (i.e. has insurance)
- The outcome of interest: a measure of health status, denoted by 𝑌𝑖
- Potential outcomes (counterfactual): describe what would have happened to someone
under the scenarios where they had/had not been treated
o There are two potential outcomes before the treatment assignment is made:
\[
\text{Potential (counterfactual) outcome} =
\begin{cases}
Y_{1i} & \text{if } D_i = 1 \\
Y_{0i} & \text{if } D_i = 0
\end{cases}
\]
o The counterfactual for a treated person is their outcome had they not been treated (𝑌0𝑖 ), and the
counterfactual for an untreated person is their outcome had they been treated (𝑌1𝑖 )
Potential and observed outcomes
Before the treatment decision is made, any outcome is a potential outcome: 𝑌1𝑖 and 𝑌0𝑖 . Until the
treatment is assigned, all the potential outcomes are (potentially) counterfactual outcomes →
referring to counterfactual outcomes without a conditional statement is equivalent to referring to
potential outcomes.
Once the treatment decision is made (i.e. a conditional statement is made), one of them becomes
the actual (observed) outcome. The other becomes the counterfactual (unobserved potential)
outcome.
The potential outcome without insurance is 𝑌0𝑖 because this is the notation for the potential
outcome of a person 𝑖 (without knowing whether he will be treated), if he is not treated. If 𝑖 is
indeed treated, his observed (actual) outcome is denoted as: 𝑌𝑖 |𝐷𝑖 = 1, or 𝑌1𝑖 |𝐷𝑖 = 1. Either way is
correct to describe his actual outcome. His counterfactual outcome, knowing he is treated, is
𝑌0𝑖 |𝐷𝑖 = 1.
After the treatment is assigned, there is only one potential outcome which is unobserved and
becomes the counterfactual outcome for the treated (or untreated) individual. The other has
become the actual outcome.
To summarise, after the treatment is assigned, the outcomes for 𝑖 if he is treated (𝐷𝑖 = 1) are:
- actual outcome: 𝑌𝑖 |𝐷𝑖 = 1 or 𝑌1𝑖 |𝐷𝑖 = 1
- counterfactual outcome: 𝑌0𝑖 |𝐷𝑖 = 1
Now that we’ve cleared that up, let’s move on.
- The effect of having insurance for an individual 𝑖 (i.e. the causal effect) is the difference
between the potential outcomes when 𝑖 is treated and when 𝑖 is not treated: 𝑌1𝑖 − 𝑌0𝑖
- The observed outcome, 𝑌𝑖 , can be written in terms of potential outcomes:
𝑌𝑖 = 𝑌0𝑖 + (𝑌1𝑖 − 𝑌0𝑖 )𝐷𝑖
o RHS: what would’ve happened if 𝑖 didn’t get treatment + the individual causal effect
of the treatment, which is switched on only when 𝑖 is treated (𝐷𝑖 = 1)
▪ If treated: 𝐷𝑖 = 1 so 𝑌𝑖 = 𝑌0𝑖 + 𝑌1𝑖 − 𝑌0𝑖 = 𝑌1𝑖
▪ If not treated: 𝐷𝑖 = 0 so 𝑌𝑖 = 𝑌0𝑖 + 0 = 𝑌0𝑖
- We only observe either 𝑌1𝑖 or 𝑌0𝑖 for a single individual
- In practice, we have to look at average outcomes for particular groups: 𝐸[ 𝑌𝑖 |𝐷𝑖 = 1] and
𝐸[ 𝑌𝑖 |𝐷𝑖 = 0] i.e. the average outcome for the treated and control groups
o Note that: 𝐸[ 𝑌𝑖 |𝐷𝑖 = 1] = 𝐸[ 𝑌1𝑖 |𝐷𝑖 = 1] ≠ 𝐸[ 𝑌1𝑖 ]
▪ The expected outcome of those who were actually treated might differ from
the expected outcome of any person who could potentially be treated
because some may never get treated
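The bullets above can be checked on a toy example. This is a sketch only: the four individuals and their potential outcomes are made up for illustration, and in real data we would never see both 𝑌1𝑖 and 𝑌0𝑖 for the same person.

```python
import numpy as np

# Hypothetical potential outcomes on the 1-5 health scale for four people.
y0 = np.array([3, 4, 2, 5])   # Y_0i: health without insurance
y1 = np.array([4, 4, 3, 5])   # Y_1i: health with insurance
d = np.array([1, 0, 1, 0])    # D_i: 1 = insured, 0 = not insured

# Observed outcome: Y_i = Y_0i + (Y_1i - Y_0i) * D_i
y_obs = y0 + (y1 - y0) * d

# The same rule stated directly: pick Y_1i if treated, Y_0i otherwise.
y_pick = np.where(d == 1, y1, y0)

# Group averages: E[Y_i | D_i = 1] equals E[Y_1i | D_i = 1],
# but it need not equal the average of Y_1i over everyone, E[Y_1i].
mean_obs_treated = y_obs[d == 1].mean()
mean_y1_treated = y1[d == 1].mean()
mean_y1_all = y1.mean()

print(y_obs)                               # [4 4 3 5]
print(mean_obs_treated, mean_y1_treated)   # 3.5 3.5
print(mean_y1_all)                         # 4.0
```

Here the treated happen to have lower 𝑌1𝑖 than the untreated would have had, so the average over the treated group (3.5) differs from the average over everyone (4.0).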
The selection problem
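How do we derive the observed difference in average health? The trick is to add and subtract the
counterfactual of those who were treated, had they not been treated, 𝐸[ 𝑌0𝑖 |𝐷𝑖 = 1]:
\[
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
= \underbrace{E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]}_{\text{treatment effect on the treated}}
+ \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
\]
Comparisons of treated and untreated observations may not reveal the causal effect of treatment:
other factors may be in place → the observed difference is the sum of a treatment effect and
a selection bias.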
The selection bias is the difference between the average outcome of those who were treated had
they not been treated, and the average outcome of those who were not treated. In essence, it
shows the difference between those who were treated and not treated in the absence of the
treatment. It arises from the rule that assigns people to the treatment and control groups, for
example when some of the participants self-select into the treatment (or control) group.
For example, if I were treated and my friend were not, the selection bias would be my health had I
not been treated minus the health of my untreated friend. This difference is why I
chose to go to the hospital and my friend didn’t.
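The split of the observed difference into a treatment effect and a selection bias can be sketched numerically. Everything below is an assumption made for illustration: the constant treatment effect of 0.5, the distribution of health, and the adverse-selection rule under which less healthy people are more likely to buy insurance.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Hypothetical potential outcomes.
y0 = rng.normal(3.0, 1.0, n)   # health without insurance
y1 = y0 + 0.5                  # assumed constant causal effect of 0.5

# Assumed adverse selection: lower untreated health -> more likely to insure.
p_insure = 1.0 / (1.0 + np.exp(y0 - 3.0))
d = rng.random(n) < p_insure   # True = insured

# Observed outcome: Y_1i for the insured, Y_0i for everyone else.
y = np.where(d, y1, y0)

# Observed difference in average health between the two groups.
observed_diff = y[d].mean() - y[~d].mean()

# Treatment effect on the treated: E[Y_1i - Y_0i | D_i = 1].
effect_on_treated = (y1[d] - y0[d]).mean()

# Selection bias: E[Y_0i | D_i = 1] - E[Y_0i | D_i = 0].
selection_bias = y0[d].mean() - y0[~d].mean()

print(f"observed difference: {observed_diff:.2f}")
print(f"treatment effect:    {effect_on_treated:.2f}")
print(f"selection bias:      {selection_bias:.2f}")
```

In this setup the selection bias is negative, so the naive comparison understates the benefit of insurance and can even make the insured look less healthy. The simulation can compute the selection bias only because it generates both potential outcomes; with survey data the term 𝐸[ 𝑌0𝑖 |𝐷𝑖 = 1] is unobserved.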