Summary ARM
Quantitative part
WEEK 1
Segment 1:
• You will be able…
• to explain the potential outcomes approach in causal inference and apply it in
thinking about causal effect estimation
• to define ‘causal effects’
• to apply the concepts of consistency, positivity, and exchangeability in randomised
and non-randomised settings
Causal inference framework.
• What was the question?
• What was the underlying question?
• What was actually estimated?
• Is the estimate biased or unbiased?
• Is this an estimate of a full or partial effect?
• Is the estimate really an answer to the question?
• How was the analysis designed?
• Were statistical methods applied correctly?
• What is the estimate? Is that big, small, good, bad, etc?
• How uncertain is the estimate?
• What do the researchers conclude? Is that conclusion justified?
• Is this strong or weak evidence for something?
• How does it compare with what we (thought we) knew?
Causation: towards a usable, formal framework
• Formal definition (Hernán/Robins): ‘In an individual, a treatment has a causal effect if the
outcome under treatment 1 would be different from the outcome under treatment 2.’
• What would have happened? --> What if they had not been given the true match mineral?
• What will happen? --> What will happen if they use it?
• We want to know both what happens when someone gets the treatment and what would have
happened if they had not got it. That is the "what if" question.
• Average treatment effect: average of individual effects in a population
To make convincing claims, a control group often has to be present. Then you can say whether a
particular product plays a role in achieving a particular outcome.
A causal effect: the product does not give person y the same outcome as the person who does not
use the product. In other words, the outcome of treatment 1 is not the same as the outcome of
treatment 2.
No causal conclusion can be drawn when the research group is too small, when a commercial
organization performs the research, or when there is no control group available.
Potential outcomes of individuals in treatments:
- Causal effect: Y_i^{a=1} ≠ Y_i^{a=0}
o Meaning: There is a causal effect if the outcome under treatment 1 is different from the
outcome under treatment 0 (no treatment).
• Y = outcome
• a = treatment
• 1 = yes (received treatment)
• 0 = no (received no treatment)
• i = individual
• ≠ = does not equal
Potential outcomes:
- For each patient, Y^{a=1} and Y^{a=0} are the potential outcomes with and without the
treatment, and the individual treatment effect is their difference, ΔY. Patient K used the
product (a=1). If K used the product she would see an improvement (Y^{a=1} = 1), and if she did
not use the product she would not have seen an improvement (Y^{a=0} = 0), so her individual
treatment effect is 1 - 0 = 1. Patient L also used the product. Her skin would improve with the
product, but also without it, so her individual treatment effect is 0.
• Y_K^{a=1} = 1 (improvement with treatment)
• Y_K^{a=0} = 0 (no improvement without treatment)
• Treatment effect for K: ΔY_K = 1 – 0 = 1 = individual treatment effect.
• Average treatment effect = average of ΔY_i
• But we do not observe both outcomes. We only observe one, because a subject either receives
the treatment or does not. The other one is the counterfactual outcome: the potential outcome
that is not observed because the subject did not experience that treatment (counter to the fact).
Not knowing the outcomes under both treatments is called the fundamental problem of causal
inference:
• The individual causal effect cannot be observed, except under really strong
assumptions, and the average causal effect cannot be inferred from the individual
estimates.
• Causal inference is a missing data problem: some outcomes, the counterfactuals, cannot
be observed.
• So we need a different approach to causal effects: if certain conditions apply, it is still
possible to know what would have happened.
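The example of patients K and L can be written out as a small table of potential outcomes. The sketch below is only an illustration: the variable names are made up, and the full table of both potential outcomes is something you never have in real data.

```python
# Potential outcomes for patients K and L as described in the notes.
# Y1 = outcome with treatment (a=1), Y0 = outcome without treatment (a=0).
# Variable names are hypothetical, chosen for this illustration.
potential = {
    "K": {"Y1": 1, "Y0": 0},
    "L": {"Y1": 1, "Y0": 1},
}

# Both patients actually used the product, so both received a=1.
received = {"K": 1, "L": 1}

# Individual treatment effects and their average.
# This is only computable here because the toy table lists BOTH outcomes.
individual_effect = {p: po["Y1"] - po["Y0"] for p, po in potential.items()}
average_effect = sum(individual_effect.values()) / len(individual_effect)

# In real data only the outcome under the received treatment is observed.
observed = {
    p: potential[p]["Y1"] if a == 1 else potential[p]["Y0"]
    for p, a in received.items()
}
# The other potential outcome is the counterfactual and is missing.
missing = {p: "Y0" if a == 1 else "Y1" for p, a in received.items()}

print("individual effects:", individual_effect)
print("average effect:", average_effect)
print("observed outcomes:", observed)
print("unobserved (counterfactual):", missing)
```

K and L have the same observed outcome, so the observed data alone cannot tell their individual effects apart.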
Identifiability conditions:
• This is “observing the counterfactual”: trying to find out what would have happened if people
had received the treatment or had not received it. Based on population averages, causal
effects can be estimated if three identifiability conditions hold:
• Positivity
• Consistency
• Exchangeability
• If the conditions are met, the association (the difference between two groups) between
exposure and outcome is an unbiased estimate of the causal effect.
• So the average causal effect can still be determined under certain conditions: ‘observing’ the
counterfactual.
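The step from association to causal effect can be written as a short derivation. This is a sketch in standard potential-outcomes notation. The symbol A (the treatment actually received) and the expectation E are assumptions introduced here and do not appear in the notes.

```latex
\begin{align*}
E[Y^{a=1}] - E[Y^{a=0}]
  &= E[Y^{a=1} \mid A=1] - E[Y^{a=0} \mid A=0] && \text{(exchangeability)} \\
  &= E[Y \mid A=1] - E[Y \mid A=0]             && \text{(consistency)}
\end{align*}
```

Positivity is what makes the last line usable: both groups, A=1 and A=0, must actually occur, otherwise one of the two averages cannot be computed.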
- Positivity: units are assigned to all relevant treatments or combinations.
o This is about the sample and the way in which it was composed.
o Positivity: a positive probability of being assigned to each of the treatment levels.
o People are assigned to all relevant treatment groups: you need a control group.
We need:
smokers with cigarette lighter
smokers without cigarette lighter
non-smokers with cigarette lighter
non-smokers without cigarette lighter
o The assumption is that all combinations of exposure and adjustment variables are
observed.
o Positivity means that we must have results for all treatment groups in order to
make the analysis possible. More exactly: results for all treatment groups in all
strata of the adjustment variables. Every individual has a non-zero chance of
receiving each treatment (the chance does not have to be equal for everyone). Anyone
could have been assigned to the treatment or control group --> as with flipping a coin.
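A positivity check amounts to counting whether every combination of stratum and treatment occurs in the data. The sketch below uses the smoker/lighter example. The records and the function name `empty_cells` are hypothetical and only serve as illustration.

```python
from itertools import product

# Hypothetical records (smoker, lighter), invented for illustration.
records = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False),
]


def empty_cells(rows, strata=(True, False), treatments=(True, False)):
    """Return the (stratum, treatment) combinations without any observation."""
    seen = set(rows)
    return [cell for cell in product(strata, treatments) if cell not in seen]


# All four combinations occur: positivity holds in this sample.
print(empty_cells(records))

# Only smokers in the sample: the non-smoker cells are empty,
# so positivity is violated within the non-smoker stratum.
print(empty_cells(records[:3]))
```

An empty cell means the comparison within that stratum cannot be made from the data at all.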
- Consistency
o Consistency means that the treatment is well defined and can always be
expected to lead to the same results (on average).
• So you have to be very clear about what you mean by certain concepts: a well
defined outcome and exposure variable. Example of broccoli: what do
you mean by broccoli? And compared to what? Also: what do we mean by not
broccoli?
o Observe ‘what would have happened if…’. Define the ‘if’: give a clear definition of the
‘treatments’.
o Carrying a cigarette lighter? Carrying: yes or no.
- Exchangeability
o Treatment groups are exchangeable: otherwise you could not compare them. -->
Are the groups that you examine the same, and are they interchangeable? If you
swap the treatment groups, the estimated results should not change.
• It does not matter who gets treatment a and who gets treatment b.
• ‘Potential outcomes are independent of the treatment that was actually received.’
Are people with and without lighters exchangeable (similar in other
respects)? No: it is likely that those with lighters are smokers, and smoking is bad
for your health. Giving a lighter to someone who is not a smoker has no
effect on health.
• However, it may be necessary to take other factors into account (statistical
adjustment) by comparing within groups. This is called stratification.
• This shows a complication with positivity: we need units that are assigned to all relevant
treatments within each level of the adjustment factors.
• Positive probability: smokers with and without lighters, and non-smokers
with and without lighters. How: stratification.
• Adjustment to improve exchangeability:
• Small number of factors: stratification
• If not: matching, weighting, regression analysis
• To achieve exchangeability you can use stratification: you adjust for certain groups. For
example, you adjust by looking at people's smoking status.
Stratification
Without adjustment you would conclude that it is unhealthy to carry a lighter: an unrealistic
conclusion that points to a problem with exchangeability.
With adjustment for smoking status there is positivity (all possible groups are included), and
outcomes are the same for people with and without a lighter within each group (smokers and
non-smokers).
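The lighter example can be sketched with a small table of counts. The numbers below are invented for illustration and are not data from the course. They are chosen so that lighters are more common among smokers, and so that a lighter makes no difference within either smoking group.

```python
# Hypothetical counts, invented for illustration only.
# Key: (smoker, lighter) -> (number of people, number with poor health)
counts = {
    (True, True): (80, 40),
    (True, False): (20, 10),
    (False, True): (10, 1),
    (False, False): (90, 9),
}


def risk(cells):
    """Proportion with poor health across the given cells."""
    n = sum(counts[c][0] for c in cells)
    events = sum(counts[c][1] for c in cells)
    return events / n


# Crude comparison: lighter vs no lighter, ignoring smoking status.
crude_rd = risk([(True, True), (False, True)]) - risk([(True, False), (False, False)])

# Stratified comparison: lighter vs no lighter within each smoking stratum.
stratum_rd = {
    smoker: risk([(smoker, True)]) - risk([(smoker, False)])
    for smoker in (True, False)
}

print(f"crude risk difference: {crude_rd:.3f}")
for smoker, rd in stratum_rd.items():
    print(f"smoker={smoker}: risk difference {rd:.3f}")
```

In this sketch the crude difference is clearly positive, while the difference within each smoking stratum is zero: the apparent effect of the lighter comes entirely from smoking.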
Example: workgroup 1 case 3, on SES and health, in which the 3 identifiability conditions are not
guaranteed:
- Positivity: observations are unavailable for some groups, for instance the lack of a control
group, depending on whether you have a disease or not.
- Consistency: it should be very clear what is compared to what. Not only the exposure of
interest, but also the comparator. For instance, the effect of increasing health expenditure.
- Exchangeability: this is about unresolved confounding (omitted variables). The exposure has to
be something that people are exposed to and that you cannot simply swap for another one. For
example, with exposure to asbestos, you cannot just expose people to it in order to do your
research.
Difference between exchangeability and counterfactual, and how they are linked:
- Exchangeability concerns the differences between the treatment groups. Here in the example:
could the mothers have been part of another group? The problem here is that they cannot
be switched: other factors (education and smoking) influence how they behave and whether
they breastfeed.
- Counterfactual concerns the outcome. In this case: what would have happened to the kids if
they had received a different treatment = the real situation in the population
- Is it then correct to say that if exchangeability is not achieved, this becomes a problem
because you cannot observe the counterfactual? Can both of these be addressed by an RCT?
They are closely related. Positivity, consistency and exchangeability are the 3 identifiability
conditions that have to hold for an estimate to be unbiased. An unbiased estimate is as close as
you can get to observing the counterfactual: the counterfactual can almost never be observed, and
performing an analysis in which all conditions hold is the closest you can get to observing it.
Exchangeability can be achieved in non-RCTs by correctly adjusting for everything. That is why it
was an issue in the workgroup: there may have been more confounders than were adjusted for.
RCTs are exchangeable by design, because treatment is assigned at random.
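Why randomisation gives exchangeable groups can be illustrated with a small simulation. This is a minimal sketch under made-up assumptions: a hypothetical population with about 30% smokers and a fair coin flip for treatment assignment.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n = 10_000
# Hypothetical population: about 30% smokers (invented figure).
smoker = [random.random() < 0.3 for _ in range(n)]
# Randomised assignment: a fair coin flip, independent of smoking status.
treated = [random.random() < 0.5 for _ in range(n)]


def share_smokers(arm):
    """Proportion of smokers among people in the given treatment arm."""
    group = [s for s, t in zip(smoker, treated) if t == arm]
    return sum(group) / len(group)


imbalance = abs(share_smokers(True) - share_smokers(False))

print(f"smokers among treated:   {share_smokers(True):.3f}")
print(f"smokers among untreated: {share_smokers(False):.3f}")
print(f"difference:              {imbalance:.3f}")
```

Because the coin flip does not depend on smoking, both arms end up with roughly the same share of smokers. The same holds for confounders that were never measured, which adjustment in a non-randomised study cannot guarantee.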