GZW3024: Advanced statistics and research methods
Measures of disease frequency
Epidemiology = quantifying the existence/frequency of a disease. It studies the
occurrence of diseases: descriptive (who is diseased?), analytical (examines cause-effect,
observational), and experimental (examines cause-effect, experimental).
Etiology: searching for causes.
Diagnosis: establishing the frequency in the population; predictive factors (in screening
there are no factors (yet), in diagnosis there are).
Prognosis: disease course.
Epidemiology in public health: the science of understanding how what we’re exposed to, or
what we do, may affect the overall/collective health of society (which makes it harder to
draw conclusions than for individuals). Use science to gain evidence that shows that
something is bad for health: in vivo (animals; the effect is not necessarily the same in
humans) or in vitro (lab; cells outside the body behave differently). Experimental research
in humans is often unethical, which leaves us with observing the real world (connections
between exposures/behaviours and health). Just because someone was exposed to something
and got sick doesn’t mean the two are related. An observed association can be explained by
a real cause, chance, bias (e.g. an error in the design), or confounding (some other factor
distorts the interpretation).
- Correlation: two events that happen together/at the same time.
- Causation: one event causes the other. This is not the same as correlation.
Use statistics to see what information matters. P-value: are the results real or due to
chance? A result is conventionally called statistically significant when p < 0.05, i.e. when
results at least this extreme would occur less than 5% of the time if there were no real
effect. The p-value doesn’t show how strong the association is, or how important the health
implications are. That’s why it’s also important to look at the effect size (how large an
effect an exposure has), and not just at whether an effect is likely to exist.
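To illustrate the difference between p-value and effect size, a minimal sketch with made-up numbers. All counts are hypothetical, and the pooled two-proportion z-test (normal approximation) is an assumption chosen for simplicity, not a method taken from the course material.

```python
import math

def two_proportion_test(cases_exp, n_exp, cases_unexp, n_unexp):
    """Return (risk difference, two-sided p-value) using a pooled z-test."""
    p1 = cases_exp / n_exp          # risk among the exposed
    p0 = cases_unexp / n_unexp      # risk among the non-exposed
    pooled = (cases_exp + cases_unexp) / (n_exp + n_unexp)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_exp + 1 / n_unexp))
    z = (p1 - p0) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 - p0, p_value

# Hypothetical large study: tiny effect (0.5 percentage points), yet significant.
rd_large, p_large = two_proportion_test(10_500, 100_000, 10_000, 100_000)
# Hypothetical small study: large effect (15 percentage points), yet not significant.
rd_small, p_small = two_proportion_test(6, 20, 3, 20)

print(f"large study: risk difference {rd_large:.3f}, p = {p_large:.4f}")
print(f"small study: risk difference {rd_small:.3f}, p = {p_small:.4f}")
```

The large study has the smaller p-value but the smaller effect, which is why both numbers should be reported.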
Measures of disease frequency (MODF) describe the occurrence of diseases.
- Prevalence and incidence (the most common ones) are about the ratio between the
number of disease cases and the population ‘at risk’.
Prevalence = number of existing cases.
- Point prevalence (part of population at risk diseased at certain point in time).
o Diseased = no longer ‘at risk’.
- Period prevalence (diseases within time period).
o Mid-term population at risk.
- Lifetime prevalence (part of population that was diseased during life).
Incidence = number of new cases
- Cumulative incidence (CI)/Incidence Proportion: absolute risk for getting the
disease within a population.
o Exclude prevalent cases at T0.
o Loss to follow-up is unwanted; you need a complete follow-up (period P).
o Absolute risk = mean individual risk (in P).
o Disadvantage: the CI itself doesn’t show how long P is, so always report the period.
Closed population (cohort): everyone starts at the same moment. Nobody joins, but people
can leave (loss to follow-up). Everyone is in the population for the same length of time,
unless they drop out earlier; the size decreases over time.
E.g. birth cohort.
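A small worked example of point prevalence at T0 and cumulative incidence in a closed cohort. The cohort size and case counts are hypothetical, and complete follow-up over the period is assumed.

```python
def cumulative_incidence(new_cases, at_risk_at_t0):
    """CI = new cases during period P / persons at risk (disease-free) at T0."""
    if at_risk_at_t0 <= 0:
        raise ValueError("population at risk must be positive")
    return new_cases / at_risk_at_t0

# Hypothetical closed cohort of 1000 people, 50 of whom are already diseased at T0.
cohort_size = 1000
prevalent_at_t0 = 50
new_cases_in_5_years = 95

point_prevalence_t0 = prevalent_at_t0 / cohort_size   # 50 / 1000
at_risk = cohort_size - prevalent_at_t0               # prevalent cases are excluded
ci_5y = cumulative_incidence(new_cases_in_5_years, at_risk)  # 95 / 950

print(f"point prevalence at T0: {point_prevalence_t0:.1%}")
print(f"5-year cumulative incidence: {ci_5y:.1%}")
```

The period (here 5 years) has to be stated next to the CI, because the proportion alone does not reveal it.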
- Incidence Density (ID)/Incidence Rate: the pace at which new cases occur in the
population.
o Person-time is the disease-free time of persons at risk.
o Complete follow-up is not required (as it is for CI). People are observed
until diagnosis, end of study, or loss to follow-up (e.g. death, refusal to
continue, migration out of the population).
o Applicable to closed and dynamic (open) populations. In a dynamic population
people can enter at a relative point in time (e.g. 1 week after diagnosis); time
spent in the population differs and characteristics of the members change over
time. E.g. inhabitants of a city.
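A minimal sketch of how incidence density follows from person-time. The follow-up records below are hypothetical; each person contributes only disease-free time at risk.

```python
# Each tuple: (years of disease-free follow-up, developed the disease?)
# Observation stops at diagnosis, end of study, or loss to follow-up.
follow_up = [
    (5.0, False),  # followed until the end of the study
    (2.5, True),   # diagnosed after 2.5 years
    (1.0, False),  # lost to follow-up after 1 year
    (4.0, True),   # diagnosed after 4 years
    (5.0, False),  # followed until the end of the study
    (2.5, False),  # entered the (dynamic) population halfway through
]

person_years = sum(years for years, _ in follow_up)
new_cases = sum(1 for _, diseased in follow_up if diseased)
incidence_density = new_cases / person_years  # cases per person-year

print(f"{new_cases} cases / {person_years} person-years "
      f"= {incidence_density * 1000:.0f} per 1000 person-years")
```

Unlike CI, people with incomplete follow-up still count, but only for the time they were actually observed.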
Prevalence = incidence x average disease duration. Treating people faster (shorter disease
duration) therefore lowers the prevalence. The relation is an approximation (it assumes a
stable situation), so it can’t always be applied.
- Prevalence can be calculated in closed and dynamic populations.
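A short sketch of the relation prevalence = incidence x average disease duration. The incidence and durations are hypothetical, and the calculation assumes a stable situation with low prevalence.

```python
def approximate_prevalence(incidence_rate, mean_duration):
    """Approximate prevalence as incidence rate x average disease duration."""
    return incidence_rate * mean_duration

incidence_per_person_year = 0.002  # hypothetical: 2 new cases per 1000 person-years
slow_treatment_duration = 5.0      # hypothetical average disease duration in years
fast_treatment_duration = 2.0      # same disease, but patients are cured faster

p_slow = approximate_prevalence(incidence_per_person_year, slow_treatment_duration)
p_fast = approximate_prevalence(incidence_per_person_year, fast_treatment_duration)

print(f"prevalence with slow treatment: {p_slow:.1%}")
print(f"prevalence with fast treatment: {p_fast:.1%}")
```

With the same incidence, shortening the disease duration lowers the prevalence.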
Study designs
Study design = plan/protocol for conducting scientific research. It enables the researcher
to translate a conceptual hypothesis (abstract) into an operational one (more specific, tells
what/how to measure), and statistically test the hypothesis.
Holes in theory → hypothesis → study design.
PICOT = Population, Intervention, Control group, Outcome, Time.
Observational study design
Observational, not manipulating.
Population level
For example: time-trend studies (study a birth cohort and establish the trend) and
geographical correlation studies (distribution pattern of phenomena (+ correlation) across a
geographical area; no causality).
Ecological study (RR?): correlation between (frequency of) exposure and disease.
- Advantages: low costs, easy to perform.
- Disadvantages: cause-effect relation unclear, regional differences, confounders,
‘ecological fallacy’ (wrong conclusion about individuals or causality drawn from
group-level data).
Individual level
Cross-sectional (transversal) and longitudinal (case-control, cohort and hybrid
design).
Case-report study (no measure of association): research on 1 patient (can be extended to a
case series). No control group, so no correlation.
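- Case-fatality rate (CFR) = number of deaths from the disease / number of patients with
the disease.
- Mortality rate (incidence of death, prevalence is impossible) = number of deaths /
population at risk in a given period.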
Cross-sectional study (OR): measure cause and effect at 1 point in time.
- Advantages: low costs, easy to perform, ideal for diagnostic study, individual data.
- Disadvantages: cause-effect is hard to determine (1 moment; does the
cause/determinant precede the effect?).
Cohort study (RR, AR, APe, APt, OR): select sample population; research exposure and
non-exposure and follow the group over time → disease.
- Here the cause precedes the effect.
- Prospective: starts now – follow-up – results in future.
- Retrospective: data from the past – results now.
o Lower-quality data, less control over data collection.
- Advantages: individual data, cause-effect relation, study multiple diseases in
relation to exposure(s).
- Disadvantages: costs and organization, duration, feasibility of a cohort for rare diseases.
(2x2 table: A = exposed and diseased, B = exposed and not diseased, C = non-exposed and
diseased, D = non-exposed and not diseased.)
Incidence: new cases / population at risk.
  Exposed: A / (A+B)
  Non-exposed: C / (C+D)
RR: incidence exposed / incidence non-exposed.
Odds: diseased / not diseased.
  Exposed: A / B
  Non-exposed: C / D
OR: odds exposed / odds non-exposed.
AR: incidence exposed – incidence non-exposed. Also called risk difference (RD).
AP: attributable proportion; the part of the outcome caused by the determinant. Can be
calculated for the exposed or for the total population.
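The measures above can be sketched in a few lines. The counts are hypothetical; a = exposed and diseased, b = exposed and not diseased, c = non-exposed and diseased, d = non-exposed and not diseased. The formulas used for APe and APt are the usual textbook ones and are an assumption here, since the notes do not spell them out.

```python
def two_by_two_measures(a, b, c, d):
    """Association measures from a 2x2 table.

    a: exposed, diseased        b: exposed, not diseased
    c: non-exposed, diseased    d: non-exposed, not diseased
    """
    inc_exposed = a / (a + b)
    inc_unexposed = c / (c + d)
    inc_total = (a + c) / (a + b + c + d)
    return {
        "RR": inc_exposed / inc_unexposed,
        "OR": (a / b) / (c / d),
        "AR": inc_exposed - inc_unexposed,
        # Assumed formulas: attributable proportion among the exposed (APe)
        # and in the total population (APt).
        "APe": (inc_exposed - inc_unexposed) / inc_exposed,
        "APt": (inc_total - inc_unexposed) / inc_total,
    }

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 non-exposed become diseased.
measures = two_by_two_measures(a=30, b=70, c=10, d=90)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

In this example the risk is three times as high among the exposed (RR), while the OR for the same table is larger than the RR.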
Case-control study (OR): select patients, find a similar control group (representative of
the population at risk) → matching strategy.
- The occurrence of the determinant in patients is compared with its occurrence in
the population they come from (controls). Here you start from the effect and look
back for the cause.
- Advantages: individual data, suitable for rare diseases, efficient (time/money).
- Disadvantages: not suitable for rare exposures, sensitive to bias (information (from
the past) and recall), only 1 outcome can be studied.
- Use OR to estimate RR (OR is harder to interpret and lies further from 1 than RR, so
it overstates the effect; the two are close when the disease is rare).
o 0 < RR < 1: protective factor.
o RR = 1: no effect.
o RR > 1: risk factor.
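To see when the OR is a reasonable estimate of the RR, a minimal comparison on two hypothetical 2x2 tables with the same RR, one for a common and one for a rare disease (a, b = exposed diseased / not diseased; c, d = non-exposed diseased / not diseased).

```python
def rr_and_or(a, b, c, d):
    """Return (RR, OR) for a 2x2 table."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio

# Hypothetical common disease: incidence 40% (exposed) vs 20% (non-exposed).
rr_common, or_common = rr_and_or(a=400, b=600, c=200, d=800)
# Hypothetical rare disease: incidence 0.4% (exposed) vs 0.2% (non-exposed).
rr_rare, or_rare = rr_and_or(a=4, b=996, c=2, d=998)

print(f"common disease: RR = {rr_common:.2f}, OR = {or_common:.2f}")
print(f"rare disease:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
```

For the common disease the OR clearly overstates the RR; for the rare disease the two are almost equal.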
Nested design = variation of the case-control study in which cases and controls are drawn
from the population of a fully enumerated cohort. Usually, the exposure is only
measured among cases and controls. It’s statistically less efficient (less precise) than a
full cohort analysis. It’s often used when exposure data are difficult or expensive to obtain.
Cross-over design: subjects receive a sequence of different treatments/exposures
(longitudinal – observational/controlled experiment), repeated measures. There are two
groups of participants; both receive the intervention as well as the placebo, but in a
different order. In between there is a wash-out period.
- Advantages: less influence of confounders (each patient is their own control), fewer
subjects needed.
- Disadvantages: order of treatments, carry-over between treatments (washout).
N=1 study: this can tell you whether a treatment is good for one specific patient. You
start with one patient. This patient is blinded and doesn’t know which treatment he is
getting (against bias). Then you measure the outcomes of the treatments the patient is
getting (first A, then B). The measurer should also be blinded. After all results are
collected, you can see which treatment option is the best for that specific patient.
- Disadvantages: not all therapeutic options lend themselves to randomisation for
an individual patient (the ideal treatment options have rapid onset and short
duration of effect), you have to ensure that there is a clinically relevant outcome
measure that you’re going to look at, time (N=1 studies can take very long), there