Statistics and Methodology
summary
Recap of RBMS
Regression I
ANOVA I
ANOVA II
Regression II
Scientific integrity
Power
Systematic review
Critical perspectives on statistics and scientific literature
Recap of RBMS
THE PROCESS OF NULL HYPOTHESIS TESTING
1. Research question (based on population)
2. Hypotheses (based on population)
3. Study design and data collection
- Data is collected in a sub-sample of the population of interest
4. Descriptive statistics
- Is limited to sample
5. Inferential statistics: make inferences from the sample to the population of interest
- Based on the hypothesis, how likely is it that what was observed in the sample also holds
for the population?
6. Conclusion
7. Look back at RQ and possibly start whole process again
- The null hypothesis is tested (instead of Ha) because a hypothesis cannot be proven directly, but data
can provide evidence against it: support for Ha comes from rejecting H0
- Hypothesis testing does not reveal reality: it gives an estimate of how likely it is to observe
what we observed if the null hypothesis is true
- Because observations/data come with a level of uncertainty, you can never accept H0 as
true/false: rather ‘retain/reject H0’
- In science, you can never prove things: you can only find support for/against a certain hypothesis
RESEARCH QUESTION
- A well-formulated research question describes:
- Population
- Intervention
- Comparison
- Outcome (dependent) variables
- Study design
- Should not be too general
HYPOTHESES
- In general, two-sided hypotheses are used:
- Null hypothesis (H0): ‘no effect’
- …=…
- Alternative hypothesis (H1 or Ha): ‘an effect’ (can go in either direction)
- …≠…
- Direction of effect is not specified in the hypotheses
- If two-sided hypotheses are (biologically) implausible, one-sided hypotheses are used:
- Null hypothesis (H0): ‘smaller than or equal to’ or ‘larger than or equal to’
- … ≤ … or … ≥ …
- Alternative hypothesis (H1 or Ha): ‘larger than’ or ‘smaller than’
- … > … or … < …
- Direction of effect is specified in the hypotheses
RESEARCH DESIGN
- RQ: causal effect or association?
- Dependent variable(s)
- Measurement
- Type (nominal, ordinal, discrete, continuous)
- How many?
- Independent variable(s)
- Measurement
- Type (nominal, ordinal, discrete, continuous)
- How many?
- Manipulation
- Compare groups or conditions? How many?
- Are measurements/manipulations: dependent/within-subjects/paired or independent/
between-subjects/unpaired?
- Types of research designs in biomedical research:
- Observational: involves observations without manipulation and without randomization
(observe as is), and does not allow conclusions on causal effects (only on associations)
- Cross-sectional: all measurements happened at the same time
- Case control: measure outcome and look back in time to find possible predictors
- Prospective: follow sample over time for a certain period
- Experimental: includes some sort of manipulation and randomization, and allows
conclusions on causal effects
- Randomized controlled trial: participants are randomly assigned to one of two or more
groups, and each participant takes part in only 1 condition (intervention or control)
- Cross-over design: participants are randomly assigned to an order of 2 or more
conditions, and each participant takes part in all conditions
- The order in which a participant goes through the conditions is randomly assigned
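The difference between the two experimental designs can be sketched in code. This is a minimal illustration, not a trial-grade randomization procedure: the function names `assign_parallel` and `assign_crossover`, the participant labels and the seed are all made up for the example.

```python
import random

def assign_parallel(participants, conditions, seed=None):
    """RCT-style: each participant is randomly assigned to exactly one condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants round-robin so group sizes differ by at most one.
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

def assign_crossover(participants, conditions, seed=None):
    """Cross-over: each participant gets all conditions, in a randomly assigned order."""
    rng = random.Random(seed)
    orders = {}
    for p in participants:
        order = list(conditions)
        rng.shuffle(order)
        orders[p] = order
    return orders

# Hypothetical participants and conditions.
participants = [f"P{i}" for i in range(1, 9)]
conditions = ["intervention", "control"]

parallel = assign_parallel(participants, conditions, seed=1)
crossover = assign_crossover(participants, conditions, seed=1)
```

In `parallel` every participant maps to a single condition (between-subjects), while in `crossover` every participant maps to a full ordering of the conditions (within-subjects).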
DESCRIPTIVE STATISTICS
- Goal: to present, organize and summarize data observed in the sample
- Measures of central tendency: mean, median, mode
- Measures of dispersion/variability: (interquartile) range, variance, standard deviation
- Graphs and figures
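The measures listed above can be computed with Python's standard `statistics` module. The sketch below uses made-up blood pressure values purely for illustration. Note that `statistics.variance` and `statistics.stdev` are the sample versions (dividing by n - 1), and that the quartiles depend on the interpolation method used by `statistics.quantiles`.

```python
import statistics

# Made-up systolic blood pressures (mmHg) for a small sample.
sample = [118, 122, 125, 125, 130, 134, 141]

# Measures of central tendency.
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)

# Measures of dispersion/variability.
data_range = max(sample) - min(sample)
q1, q2, q3 = statistics.quantiles(sample, n=4)  # quartiles (default method)
iqr = q3 - q1                                   # interquartile range
variance = statistics.variance(sample)          # sample variance (n - 1)
sd = statistics.stdev(sample)                   # sample standard deviation

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={data_range} IQR={iqr} variance={variance:.2f} sd={sd:.2f}")
```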
INFERENTIAL STATISTICS
- Goal: to draw conclusions about a population based on data observed in a sample, by using
statistical tests
- Statistical test: a procedure to decide whether a hypothesis about the population may or
may not be supported by the results of the sample
- How likely are we to observe the data we observed in our sample, if our null hypothesis is
true?
- Pr(data|H0)?
- = very unlikely -> reject H0
- = likely -> retain H0
- A statistical test results in a p-value: the probability of observing data at least as extreme as the observed data, given that the null hypothesis is true
- Very unlikely: reject the null hypothesis in favor of the alternative/experimental hypothesis
- ‘Unlikely’ is defined by a threshold of α=0.05 (5%): a p-value <0.05 is regarded as
‘unlikely enough’ to reject the null hypothesis
- Test statistic = (point estimate - expected value) / SE
- Test statistic (e.g. Z, χ², t): deviation of the data from what is expected under the null hypothesis
- Point estimate (e.g. mean or proportion): observed point estimate of the sample
- Expected value: expected value under the null hypothesis
- SE (standard error): precision of the point estimate
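As a rough sketch of how the general formula leads to a decision, the example below computes a test statistic and converts it into a two-sided p-value. It assumes the statistic follows a standard normal distribution under H0 (a Z-test); a t statistic would need the t distribution instead, which is not in the standard library. The function names and the input numbers are invented for illustration.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance threshold from the notes

def compute_test_statistic(point_estimate, expected_value, standard_error):
    """Test statistic = (point estimate - expected value) / SE."""
    return (point_estimate - expected_value) / standard_error

def two_sided_p_from_z(z):
    """Two-sided p-value, assuming the statistic is standard normal under H0."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Made-up numbers: observed mean 123, value under H0 120, standard error 1.5.
z = compute_test_statistic(point_estimate=123.0, expected_value=120.0, standard_error=1.5)
p = two_sided_p_from_z(z)
decision = "reject H0" if p < ALPHA else "retain H0"

print(f"z={z:.2f} p={p:.4f} -> {decision}")
```

With these inputs the statistic lies two standard errors from the expected value, which gives a p-value just under 0.05.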
- One-sample t-test:
- Null hypothesis: μ = μ0 (a specific value)
- t = (x̄ - μ0) / se
- se = sd / √n
- x̄: mean of sample
- μ0: value under the null hypothesis
- E.g.: Do students have a healthy blood pressure?
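The one-sample t-test formulas above translate directly into a few lines of code. This is a sketch under assumptions: the blood pressure values and the reference value of 120 are made up, and the degrees of freedom (n - 1) are added here as a standard fact that the notes do not state. Turning t into a p-value requires the t distribution, which is left out.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: mu = mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)   # sample mean
    sd = statistics.stdev(sample)     # sample standard deviation (n - 1)
    se = sd / math.sqrt(n)            # standard error of the mean
    t = (x_bar - mu0) / se
    return t, n - 1

# Made-up systolic blood pressures (mmHg); 120 is an assumed reference value.
bp = [118, 122, 125, 125, 130, 134, 141]
t, df = one_sample_t(bp, mu0=120)

print(f"t={t:.3f} df={df}")
```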
- Independent samples t-test (aka 2-sample t-test):
- Null hypothesis: means from 2 groups are equal: μ1 = μ2
- t = ((x̄1 - x̄2) - (μ1 - μ2)) / se_pooled
- se_pooled = √((sd1² / n1) + (sd2² / n2))
- x̄1: mean of group 1
- x̄2: mean of group 2
- μ1 - μ2 = 0 under the null hypothesis
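The independent samples t statistic can be sketched the same way, using the standard error exactly as written in these notes (each group's own variance divided by its own sample size). The function name and the two groups of measurements are hypothetical.

```python
import math
import statistics

def independent_samples_t(group1, group2):
    """t statistic for H0: mu1 = mu2, with the standard error from the notes."""
    n1, n2 = len(group1), len(group2)
    x1, x2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)  # sd squared
    se = math.sqrt(var1 / n1 + var2 / n2)
    expected_diff = 0.0  # mu1 - mu2 under the null hypothesis
    return ((x1 - x2) - expected_diff) / se

# Made-up measurements for two independent groups.
group1 = [5.1, 4.9, 5.6, 5.8, 6.0]
group2 = [4.2, 4.8, 4.5, 5.0, 4.4]
t = independent_samples_t(group1, group2)

print(f"t={t:.3f}")
```

Swapping the two groups only flips the sign of t, which is why the direction of the difference has to be read together with the hypotheses.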