SUTVA (stable unit treatment value assumption): treatment of any individual does not affect the outcome of other individuals (ICE to ACE), and for all units there are no versions of treatment levels leading to different potential outcomes (violated in multilevel settings)
SUTVA = no interference + consistency
Consistency: the observed outcome $Y_i$ equals the potential outcome under the treatment level that was actually observed; $Y_i = Y_i^{x_i}$ for $X_i = x_i$
Requires well defined treatment (and typically one version/type of treatment)
Common causal: single treatment, not another
Normality assumption: distribution of (independent) sample means is normal
Unconfoundedness (exchangeability): the treatment is independent of the potential outcomes, $Y_i^0, Y_i^1 \perp\!\!\!\perp X_i$ (not: people who took aspirin because of a severe headache)
Met in an RCT: individuals receiving treatment ($X = 1$) are exchangeable (with respect to potential outcomes) with those who do not receive treatment ($X = 0$)
No unobserved confounding (conditional exchangeability): $Y_i^0, Y_i^1 \perp\!\!\!\perp X_i \mid Z_i$, MAR not MCAR
Sufficiency: no unobserved common cause

Simple difference in means (naïve approach): prima facie effect $PFE = E(Y_i^1 \mid X_i = 1) - E(Y_i^0 \mid X_i = 0)$
Equal to ACE if randomised experiment
1) Relation confounders $Z_i$, outcome $Y_i$
Regression and ANCOVA: from coefficients, $\theta = E(Y_i \mid X_i = 1, Z_i) - E(Y_i \mid X_i = 0, Z_i)$
If $Y_i$ and the covariates in $Z_i$ have a linear relation
If the difference in mean response between $X = 0$ and $X = 1$ does not vary with covariates
Extrapolation problem: if covariates do not overlap sufficiently → propensity score matching
Regression estimation: splitting datasets, $\widehat{ACE}_1 = \frac{1}{N_1} \sum_{i:\, X_i = 1} (\hat{Y}_i^1 - \hat{Y}_i^0)$
Comparable to ANCOVA with interactions ($Y_i^1$ and $Y_i^0$, Baseline × Treatment); nonlinear models possible
Standard errors from the regression do not depend on the variances of the potential outcomes
2) Relation confounders $Z_i$, treatment $X_i$
Propensity scores: replacing covariates with $\pi_i$, $Y_i^0, Y_i^1 \perp\!\!\!\perp X_i \mid \pi_i$
$\pi_i = P[X_i = 1 \mid Z_i] = \frac{\exp(Z_i^\top \gamma)}{1 + \exp(Z_i^\top \gamma)}$

All methods: identifiability assumptions
Method-specific assumptions:
a) Simple difference in means: no confounding at all
b) ANCOVA and regression estimation: require correct specification of the outcome model
c) Matching, IPW, and stratification: require that use of $\hat{\pi}_i$ balances the confounder distribution; correct specification of the propensity score model
d) Dual modelling: requires correct specification of the outcome model or the propensity score model
1) Conditional (in)dependence: find the Markov equivalence set, the DAG skeleton, and colliders whose causes are unrelated (immorality; unmarried + child)
Assumptions: causal Markov; faithfulness; no latent confounder, sufficiency, no selection bias
PC: (A) original true causal graph (B) fully connected undirected graph (C) remove $X - Y$ edge because $X \perp\!\!\!\perp Y$ (D) remove $X - W$ and $Y - W$ edges because $X \perp\!\!\!\perp W \mid Z$ and $Y \perp\!\!\!\perp W \mid Z$ (E) finding v-structures (F) orientation propagation
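The gap between the prima facie effect and the covariate-conditional difference $\theta$ can be illustrated with a small sketch. It assumes a single binary confounder $Z$, so the outcome model is saturated and no regression fit is needed. The toy data and the function names `prima_facie_effect` and `adjusted_effect` are hypothetical, chosen so that the effect is 2 within each level of $Z$.

```python
from statistics import mean

# Hypothetical toy data: (x, z, y) = (treatment, binary confounder, outcome).
# Units with z = 1 are treated more often and also have higher outcomes.
data = [
    (1, 0, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 7.0), (1, 1, 7.0), (1, 1, 7.0), (0, 1, 5.0),
]

def prima_facie_effect(rows):
    """Naive difference in means: E(Y | X = 1) - E(Y | X = 0)."""
    treated = [y for x, _, y in rows if x == 1]
    control = [y for x, _, y in rows if x == 0]
    return mean(treated) - mean(control)

def adjusted_effect(rows):
    """Average the within-Z differences, weighted by the share of each Z level."""
    n = len(rows)
    total = 0.0
    for z in sorted({z for _, z, _ in rows}):
        stratum = [r for r in rows if r[1] == z]
        total += len(stratum) / n * prima_facie_effect(stratum)
    return total

pfe = prima_facie_effect(data)
theta = adjusted_effect(data)
print(pfe, theta)
```

In this toy data the naive difference is twice the within-stratum effect, because treatment and outcome both depend on $Z$.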
Positivity (experimental treatment assignment): exposed and unexposed participants are present at all combinations of covariates of observed confounders in the population (in an RCT present by design)
Strong ignorability = positivity + exchangeability
Identifiability: exchangeability, conditional positivity, consistency (/SUTVA)
Randomised experiment (RCT): (no backdoor) 1) exchangeability 2) conditional exchangeability 3) covariates $X_i$ are measured after treatment and influenced by treatment
Quasi experiment: randomly assigned and self-chosen treatment → compare research design
Modularity, localised intervention: an intervention on the variable $p(X)$ (do-operator) does not change the relation to other variables $p(Y \mid X)$ → forced treatment has the same effect as incidental treatment

Validate $Z_i \perp\!\!\!\perp X_i \mid \pi_i$ to mimic RCT
1) Reduce selection bias arising from non-random treatment assignment
2) Useful for causal inference because they balance the distributions ($\log(\pi_i)$) of covariates → non-overlap violates positivity → extrapolation
3) Include insignificant predictors in propensity modelling (to reduce Type 1 errors):
3a) Include variables (scientifically) predicting $X_i$ and $Y_i$ and any $p > 0.10 \lor .015$
3b) Number of predictors and sample size must balance (dimensionality), otherwise non-overlap
Matching (of propensity scores): works best when one of the two groups (typically the control) is substantially larger; typically 1:1 matching, then compare with an unpaired t-test
Balancing property (mimic RCT): $P(Z_i \mid \pi_i = c_i, X_i = 1) = P(Z_i \mid \pi_i = c_i, X_i = 0)$
If the smaller group is the treated persons, then the estimate represents $ACE_1$, vice versa $ACE_0$
If no matches are found for some treated persons, the estimate is the causal effect of a subpopulation
Inverse propensity weighting (IPW):
$\widehat{ACE} = \frac{\sum_i X_i Y_i / \hat{\pi}_i}{\sum_i X_i / \hat{\pi}_i} - \frac{\sum_i (1 - X_i) Y_i / (1 - \hat{\pi}_i)}{\sum_i (1 - X_i) / (1 - \hat{\pi}_i)}$
Pseudo population: correct for over/under-representation by weighting treated units (probability $\pi_i$) with $1/\pi_i$ and untreated units (probability $1 - \pi_i$) with $1/(1 - \pi_i)$
In the pseudo population $\pi_i = 0.5$ for each observation and thus $Z_i \perp\!\!\!\perp X_i$; the new sample is double the size; outlier sensitive

FCI (fast causal inference): variant of PC, tolerates and sometimes discovers unknown confounding variables (some directions remain unclear)
GES (greedy equivalence search): start with an empty graph, add currently needed edges, and then eliminate unnecessary edges in some pattern
2) Restricting causal models: assume type of relation (non-linear) or noise (non-Gaussian)
FCM (functional causal models): continuous variables: effect $Y$ as a function of direct causes $X$ and an unmeasurable factor/noise: $Y = f(X, \varepsilon, \theta_1)$
Assumptions: sufficiency, and the transformation $(X, \varepsilon)$ to $(X, Y)$ is invertible, so the noise can be recovered uniquely from the observed variables $X, Y$ (causal asymmetry)
LiNGAM (linear, non-Gaussian model): $\varepsilon \perp\!\!\!\perp X$, causal direction assumed $Y = bX + \varepsilon$; when swapping coordinates the error term "flips 45°"; sufficiency
3) Invariance, data from different environments
Invariant Causal Prediction: search for invariant models under normal conditions and do-interventions to identify direct causes (means and conditional dependencies which are invariant do not change)
Assumption: need different environments, modularity and localised interventions, and: e.g., $X_1, X_2 \to Y$ is invariant under interventions
Posttreatment selection bias: covariates measured after the treatment should not be regarded as confounders because of time order
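The IPW estimator and the pseudo population described in these notes can be sketched as follows. This is only an illustration under stated assumptions: the confounder is binary, so the propensity score is estimated as the treated share within each level of $Z$ rather than with the logistic model, and the data and function names (`propensity_by_stratum`, `ipw_ace`) are hypothetical.

```python
# Hypothetical toy data: (x, z, y) = (treatment, binary confounder, outcome).
data = [
    (1, 0, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 7.0), (1, 1, 7.0), (1, 1, 7.0), (0, 1, 5.0),
]

def propensity_by_stratum(rows):
    """Estimate pi = P(X = 1 | Z = z) as the treated share within each Z level.

    Assumption: Z is discrete; the notes use a logistic model instead.
    """
    scores = {}
    for z in {z for _, z, _ in rows}:
        xs = [x for x, zz, _ in rows if zz == z]
        scores[z] = sum(xs) / len(xs)
    return scores

def ipw_ace(rows, scores):
    """Normalised IPW estimate: weighted treated mean minus weighted control mean.

    Requires positivity: every score must lie strictly between 0 and 1.
    """
    num1 = den1 = num0 = den0 = 0.0
    for x, z, y in rows:
        pi = scores[z]
        num1 += x * y / pi
        den1 += x / pi
        num0 += (1 - x) * y / (1 - pi)
        den0 += (1 - x) / (1 - pi)
    return num1 / den1 - num0 / den0

scores = propensity_by_stratum(data)
ace_hat = ipw_ace(data, scores)

# Size of the pseudo population: the sum of all inverse-probability weights.
pseudo_n = sum(
    1 / scores[z] if x == 1 else 1 / (1 - scores[z]) for x, z, _ in data
)
print(scores, ace_hat, pseudo_n)
```

With these eight units the weights sum to roughly sixteen, which matches the note that the pseudo population is double the size of the original sample.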
Markov Condition: every variable, $X$, in a directed acyclic graph, is independent of its non-descendants conditional on its parents (the variables with edges directed into $X$)
Causal Markov: when the Markov condition is assumed to hold for a causal graph and its associated population distribution
Global Markov condition: if two variables are d-separated given $S$, then $X \perp\!\!\!\perp Y \mid S$ (DAG → statistics)
Faithfulness: if two variables are statistically independent (through conditioning), $X \perp\!\!\!\perp Y \mid S$, then they are d-separated (statistics → DAG)
Violated if a confounder cancels out the direct effect ($-0.25$ and $0.50 \times 0.50$) → unobserved mediator becomes collider in causal discovery
Causal Faithfulness: when no clear d-separation but "faithful" to the graph
Common trend, equality condition (time-invariant confounding): $\alpha$ assumed non-zero, confounding bias cancels out if unobserved $A$ affects the pretest $P$ and the posttest $Y$ to the same extent, $\beta_1 = \beta_2$

Subclass- / block- / stratification: estimate ACE from multiple (5-10) strata, take the weighted average, $\widehat{ACE} = \sum_k \frac{N_k}{N} \hat{\theta}_k$
Strata should be narrow, so that covariates do not make a difference within a stratum; mimic RCT
3) Dual modelling (doubly-robust)
Regression estimation with propensity-related covariates: dummy treatment variable made from propensity scores (post hoc correction)
Standardised mean difference: $\Delta Z = \frac{(\bar{Z} \mid X = 1) - (\bar{Z} \mid X = 0)}{\sqrt{\frac{1}{2}\left[(S^2 \mid X = 1) + (S^2 \mid X = 0)\right]}}$
If $\Delta Z > 0.10$: meaningful imbalance of covariates; after fix not $< 0.10$? delete matches/large weights
If $\Delta Z > 0.30$: linear extrapolation is problematic: use interactions, transformations of $Z$, kernel regression, local linear, propensity scores
Note: a wrong covariate is harmful (collider)

(Common cause) confounding bias: failure to condition on a common cause (fork) of treatment and outcome
Unconfoundedness bias: potential outcomes should be independent from the treatment (if necessary, conditionally) because otherwise a third variable might be confounding the effect
Endogenous selection bias from sampling: collider bias (spurious associations) resulting from the sampling procedure, and not from, e.g., the inclusion of inadequate covariates, two types:
1) Nonresponse bias: only completed questionnaires are analysed, and variables of interest are associated with survey completion (MNAR)
2) Attrition bias: over time (longitudinal), respondents inevitably drop out of the study, and this attrition is likely selective; those remaining differ from those who dropped out
Non-representative samples are problematic
Endogenous selection (collider) bias: from conditioning on a collider (or its descendant) which is on a noncausal path linking treatment and outcome (more general than sample bias)
Overcontrol bias: this type of bias results from conditioning on a variable on a causal path between treatment and outcome
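The standardised mean difference and its 0.10 and 0.30 thresholds from the notes can be sketched as a small balance check. The function names, the use of the absolute value when applying the thresholds, and the toy covariate values are assumptions made for illustration; the variance is the sample variance from Python's `statistics` module.

```python
from math import sqrt
from statistics import mean, variance

def standardised_mean_difference(z_treated, z_control):
    """Difference in covariate means divided by the pooled standard deviation."""
    pooled_sd = sqrt(0.5 * (variance(z_treated) + variance(z_control)))
    return (mean(z_treated) - mean(z_control)) / pooled_sd

def balance_label(delta):
    """Apply the 0.10 and 0.30 thresholds to |delta| (absolute value is an assumption)."""
    size = abs(delta)
    if size > 0.30:
        return "linear extrapolation problematic"
    if size > 0.10:
        return "meaningful imbalance"
    return "balanced"

# Hypothetical covariate values for treated and control units.
z_treated = [1.0, 2.0, 3.0]
z_control = [0.0, 1.0, 2.0]

delta = standardised_mean_difference(z_treated, z_control)
print(delta, balance_label(delta))

# Identical covariate distributions give a difference of zero.
delta_same = standardised_mean_difference(z_treated, z_treated)
print(delta_same, balance_label(delta_same))
```

In practice the check would be run per covariate, before and after matching or weighting, to see whether the adjustment brought each difference below 0.10.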