Advanced Research Methods
GW4003MV
S533735 – Thomas van Waesberge
August 30th 2022 – October 10th 2022
1.0 Online Research Methods Modules
Quantitative statistics
The appropriate choice of the type of statistic mainly depends on the measurement level of the
variables involved. There are four measurement levels:
1. Nominal. The scale points are distinguishable, but there is no ranking. E.g. place of birth or
marital status. Variables with two scale points (yes/no) are called binary variables or dummy
variables.
2. Ordinal. The scale points are distinguishable and can be ranked, but the distance between the
scale points is not necessarily the same. E.g. self-assessed health status (bad, average,
good, very good, excellent) or education level (low, middle, high). The difference between
‘bad’ and ‘average’ is not necessarily as big as the difference between ‘very good’ and
‘excellent’.
3. Interval. The scale points are distinguishable and can be ranked, and the distances between the
scale points have a meaning, but there is no absolute zero. E.g. temperature (Celsius and
Fahrenheit) or IQ.
4. Ratio. The scale points are distinguishable and can be ranked, the distances between the scale
points have a meaning, and there is an absolute zero point. E.g. age, height, speed.
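To make the link between measurement level and choice of statistic concrete, here is a minimal Python sketch. All data values and variable names are hypothetical, and the pairing of level and summary (mode for nominal, median for ordinal, mean for interval, ratios for ratio) is the common textbook convention rather than something prescribed in these notes.

```python
from statistics import mean, median, mode

# Hypothetical example data, one variable per measurement level.
marital_status = ["married", "single", "married", "divorced", "married"]  # nominal
health_order = ["bad", "average", "good", "very good", "excellent"]       # ordinal scale
self_assessed_health = ["good", "bad", "very good", "good", "excellent"]  # ordinal
temperature_celsius = [18.5, 21.0, 19.5, 22.0, 20.0]                      # interval
age_years = [23, 35, 41, 29, 52]                                          # ratio

# Nominal: categories can only be counted, so the mode is the natural summary.
nominal_summary = mode(marital_status)

# Ordinal: categories can be ranked, so convert to ranks and take the median.
ranks = [health_order.index(value) for value in self_assessed_health]
ordinal_summary = health_order[int(median(ranks))]

# Interval: differences are meaningful, so the mean can be interpreted.
interval_summary = mean(temperature_celsius)

# Ratio: an absolute zero exists, so statements like "x times as old" make sense.
ratio_summary = max(age_years) / min(age_years)

print(nominal_summary, ordinal_summary, interval_summary, ratio_summary)
```

Taking a mean of the marital status codes, by contrast, would produce a number without any interpretation, which is why the measurement level has to be settled first.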
Another important distinction is between outcome and exposure variables. The outcome variable is the
variable that is the focus of our attention, whose variation or occurrence we are seeking to understand.
Exposure variables (or identifying factors or determinants) may influence the size or the occurrence of
the outcome variable. Outcome variables are also called dependent variables and exposure variables
are also called independent variables.
1.2 Introduction to causal inference
Magazine advertising
- Scientific report
o Proven clinical results: 70% less imperfections in 4 weeks
o True Match Minerals foundations tested under dermatological control with 41 women
o Average reduction of the most visible imperfections linked to oily skin.
- Is the powder really that good?
o ‘Improves’ implies a causal effect: A leads to B
(use of True Match Minerals powder leads to better skin)
Would you buy the powder? Is the scientific evidence convincing? What are arguments for
and against buying the powder? Problems?
o Small sample size (n=41)
o Study performed or financed by commercial company
o No control group
Essential omission
Potential regression to the mean
What would happen without treatment?
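Regression to the mean can be illustrated with a small simulation. The sketch below is an assumption-laden toy model, not a reconstruction of the L’Oréal study: every person has a stable level of imperfections plus random day-to-day noise, nobody is treated, and only people with a high baseline measurement are selected. The function name, the parameter values and the selection threshold are all hypothetical.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible


def average(values):
    return sum(values) / len(values)


def simulate(n=10_000, threshold=12.0):
    """Toy model without any treatment: measurement = stable level + noise."""
    baseline_selected, followup_selected = [], []
    for _ in range(n):
        true_level = random.gauss(10, 2)            # stable level per person
        baseline = true_level + random.gauss(0, 3)  # first measurement
        followup = true_level + random.gauss(0, 3)  # measurement 4 weeks later
        # Only people with many imperfections at baseline enter the "study".
        if baseline > threshold:
            baseline_selected.append(baseline)
            followup_selected.append(followup)
    return average(baseline_selected), average(followup_selected)


baseline_mean, followup_mean = simulate()
print(f"selected group at baseline:  {baseline_mean:.2f}")
print(f"selected group at follow-up: {followup_mean:.2f}")
```

In this toy model the selected group scores lower at follow-up even though no one was treated, because extreme baseline values are partly due to chance. Without a control group, such a drop cannot be separated from a treatment effect.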
What do we want to know?
- In causal inference:
o we are not interested in the outcome per se (i.e., 70% fewer imperfections), but ...
o we are interested in the role of the treatment in achieving this outcome (i.e., without
True Match Minerals powder, would there also have been fewer skin imperfections?)
- Conclusion:
o We do not have that information
o No causal claim can be made based on the L’Oréal study
Causal effect
- Formal definition by Hernán and Robins (2020):
o ‘In an individual, a treatment has a causal effect if the outcome under treatment 1
would be different from the outcome under treatment 2.’
- To assess this, we need information on:
o What would have happened?
o What will happen?
- Assume that we know what would have happened in the L’Oréal study:
o Woman A treated with True Match Minerals powder: 2 bad spots
o Had woman A not been treated with True Match Minerals powder: 5 bad spots
- Individual treatment effect: -3 spots (or 60% fewer imperfections)
- Average treatment effect: average of individual effects in a population
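The arithmetic for woman A can be written out as a short Python sketch. Only woman A's numbers (2 and 5 spots) come from the notes; women B and C and all variable names are hypothetical additions so that an average can be computed. The sketch assumes the unrealistic situation in which both outcomes are known for every woman.

```python
# Number of bad spots per woman: (with treatment, without treatment).
potential_outcomes = {
    "A": (2, 5),  # from the notes
    "B": (3, 3),  # hypothetical
    "C": (1, 4),  # hypothetical
}

# Individual treatment effect: outcome with treatment minus outcome without.
individual_effects = {
    name: with_treatment - without_treatment
    for name, (with_treatment, without_treatment) in potential_outcomes.items()
}

# Average treatment effect: mean of the individual effects.
average_effect = sum(individual_effects.values()) / len(individual_effects)

# Relative change for woman A: -3 spots out of 5 untreated spots.
relative_change_a = individual_effects["A"] / potential_outcomes["A"][1]

print(individual_effects)   # {'A': -3, 'B': 0, 'C': -3}
print(average_effect)       # -2.0
print(relative_change_a)    # -0.6
```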
Potential outcomes approach
- Formal notation of causal effect:
o $Y_i^{a=1} \neq Y_i^{a=0}$
o Y = outcome
o a = treatment
o 1 = yes (received treatment)
o 0 = no (received no treatment)
o i = individual
- $Y_K^{a=1} = 1$ (improvement with treatment)
- $Y_K^{a=0} = 0$ (no improvement without treatment)
- Treatment effect for K: $\Delta Y_K = 1 - 0 = 1$ (positive effect)
- Average treatment effect = average of $\Delta Y_i$
- A table with both potential outcomes for every individual is the ideal situation, but it is
unrealistic, because you cannot give and withhold a treatment for the same person at the same time.
Not all potential outcomes are observed
- Use of a control group would improve the experiment
- Counterfactual outcome: potential outcome that is not observed because the subject did not
experience the treatment (‘counter the fact’); marked with “?” in the table
- Potential outcome $Y^{a=1}$ is factual for some subjects, and counterfactual for others
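A table of factual and counterfactual outcomes can be sketched in Python as follows. The subjects, their outcomes and the function name are hypothetical, and `None` plays the role of the “?”. The sketch also assumes that the outcome we observe equals the potential outcome under the treatment the subject actually received.

```python
# Each subject: treatment received (a) and the one observed outcome (1 = improvement).
# All values are hypothetical.
subjects = [
    {"id": "K", "a": 1, "y": 1},
    {"id": "L", "a": 1, "y": 0},
    {"id": "M", "a": 0, "y": 0},
    {"id": "N", "a": 0, "y": 1},
]


def potential_outcome_table(rows):
    """Place each observed outcome in the column of the treatment received.

    The other column is counterfactual and stays None (the "?" in the table).
    """
    table = []
    for row in rows:
        table.append({
            "id": row["id"],
            "Y_a1": row["y"] if row["a"] == 1 else None,
            "Y_a0": row["y"] if row["a"] == 0 else None,
        })
    return table


table = potential_outcome_table(subjects)
missing = sum(r["Y_a1"] is None for r in table) + sum(r["Y_a0"] is None for r in table)

for r in table:
    print(r)
print("unobserved potential outcomes:", missing)
```

Exactly one of the two potential outcomes is missing for every subject, so no individual effect $Y_i^{a=1} - Y_i^{a=0}$ can be computed from such a table.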
Fundamental problem
- Individual causal effect cannot be observed
o Except under extremely strong (and generally unreasonable) assumptions
- Average causal effect (i.e., in a population) cannot be determined based on individual
estimates
o Causal inference as a missing data problem
- We need a different approach to estimate causal effects
Identifiability conditions
- Average causal effect can be determined if, and only if, three identifiability conditions are met:
o Positivity
o Consistency
o Exchangeability
- If all conditions are met (and an association is found in the data) the association between
exposure and outcome is an unbiased estimate of a causal effect
Cigarette lighter example
- Research question (RQ):
o What is the effect of carrying a cigarette lighter on people’s health?
- 1. Go to the Beurstraverse (or Koopgoot) in Rotterdam
- 2. Ask people: ‘Are you carrying one plastic cigarette lighter (yes/no)?’
- 3. Get back in touch after 20 years and assess which group is healthier
Positivity
- This identifiability condition means that:
o Each individual has to have a ‘positive probability’ of being assigned to each of the
treatment arms (i.e., Pr(A=a)>0 for all treatment arms)
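As a rough illustration, the condition Pr(A=a)>0 can be checked descriptively in a sample by confirming that every treatment arm actually occurs. The function name and the example data below are hypothetical. A nonzero sample proportion does not prove positivity in the population; the sketch only flags arms that are empty in the data at hand.

```python
from collections import Counter


def check_positivity(assignments, arms):
    """Return the sample proportion per arm and whether every arm is non-empty."""
    counts = Counter(assignments)
    n = len(assignments)
    proportions = {arm: counts[arm] / n for arm in arms}
    return proportions, all(p > 0 for p in proportions.values())


# Hypothetical survey answers: 1 = carries a lighter, 0 = does not.
lighter = [1, 0, 0, 1, 0]
proportions, ok = check_positivity(lighter, arms=[1, 0])
print(proportions, ok)   # {1: 0.4, 0: 0.6} True

# If nobody in the sample carries a lighter, the arm A=1 is empty.
proportions_empty, ok_empty = check_positivity([0, 0, 0], arms=[1, 0])
print(proportions_empty, ok_empty)   # {1: 0.0, 0: 1.0} False
```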