Field summary (5th ed.)
Chapter 3: NHST and effect sizes
NHST
researcher degrees of freedom
statistical significance
effect sizes: Cohen’s d and Pearson’s r
odds ratio
Chapter 6: Statistical bias
bias
outlier
violation of assumption: linearity and additivity, normality, homogeneity/homoscedasticity,
independence
trimming
winsorizing
bootstrapping
Chapter 8: Correlations
covariation
bivariate correlation
one-tailed and two-tailed probability
directional and non-directional hypothesis
point-biserial correlation and biserial correlation
discrete dichotomy and continuous dichotomy
semi-partial (part) correlation and partial correlation
Chapter 9: Linear regression model
single regression
predicted value
multiple regression
parameter
total error
residual sum of squares, SSR
goodness of fit
sum of squared differences, SST
proportion of improvement, R²
F-statistic
mean squares, MS
unstandardized residuals, standardized residuals, and studentized residuals
adjusted predicted value
Cook’s distance
leverage
Mahalanobis distance
multicollinearity, VIF, and tolerance
cross-validation
hierarchical regression
forced entry
stepwise, forward and backward
F-change
eigenvalues
Chapter 11: Moderation, mediation, and multi-category predictors
moderation
grand mean centring
simple slopes analysis
zone of significance
mediation
simple relationship and mediated relationship
Chapter 12: Comparing several independent means
F-statistic
SST, SSM, dfM, SSR, MS, MSM, MSR
Welch’s F and Brown-Forsythe’s F
contrast coding
post hoc tests
weights for planned contrasts
grand mean
non-orthogonal contrasts
standard contrasts: deviation (first and last), simple (first and last), repeated, Helmert, difference
(reverse Helmert)
between-group effects table
within-group effects table
harmonic mean sample size
Chapter 13: ANCOVA
covariates
adjusted means
partial eta-squared
omega squared
Chapter 14: Factorial designs
independent factorial design (Ch. 14)
repeated-measures (related) factorial design (Ch. 15)
mixed design (Ch. 16)
slope interaction term
Chapter 15: Repeated-measures designs
random intercept model
sphericity and Mauchly’s test
Greenhouse-Geisser and Huynh-Feldt
lower-bound estimate of sphericity
SSw, dfSSW
MSM, MSR, and F
factorial repeated-measures design
Chapter 16: Mixed designs
mixed designs
Chapter 3: NHST and effect sizes
Null hypothesis significance testing (NHST)
Tests the null hypothesis (H0) against the alternative hypothesis (Ha) to judge whether Ha is likely to
be true. p is used as an index of the weight of evidence against H0.
Problem with NHST: reliance on merely refuting H0 (all-or-nothing thinking, with p<.05 or >.05)
Researcher degrees of freedom: analytic choices that let researchers show their results in the most
favorable light possible (so the Type I error rate is no longer controlled)
- p-hacking: selectively reporting significant p-values
- HARKing: hypothesizing after the results are known
Statistical significance
p = the probability of getting a test statistic at least as large as the one observed if H0 were true,
relative to all possible values of the test statistic from an infinite number of identical replications.
p does not measure the size or importance of an effect, and it is not the probability that the
hypothesis in question is true or false.
A small p suggests the data are hard to reconcile with H0 (and so favors Ha); a large p suggests the data
are compatible with H0, which is not proof that H0 is true.
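The "relative to all possible values of the test statistic" idea can be made concrete with a small sketch. It does not use the replication framing from the notes; it swaps in an exact permutation test, which re-assigns the group labels in every possible way and counts how often the difference in means is at least as large as the observed one. The function name and the data are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-tailed p for a difference in means, by exact permutation.

    Assumption: under H0 the group labels are interchangeable, so every
    split of the pooled scores into groups of the same sizes is equally likely.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    indices = range(len(pooled))
    at_least_as_large = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            at_least_as_large += 1
        total += 1
    return at_least_as_large / total


# Hypothetical scores: only 2 of the 20 possible splits are as extreme.
print(exact_permutation_p([1, 2, 3], [4, 5, 6]))  # 0.1
```

With only six scores the smallest possible two-tailed p is .1, which shows how much p depends on sample size and not just on the size of the effect.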
Effect sizes
Effect sizes = objective and (usually) standardized measures of the magnitude of observed effects.
Standardized: allows effect sizes to be compared across different studies that have measured different
variables or used different scales of measurement.
The size of the effect should be placed within the research context. However, rules of thumb are:
          Cohen’s d   Pearson’s r   variance explained by r
small     .2          .1            1% of total variance (r = .1)
medium    .5          .3            9% of total variance (r = .3)
large     .8          .5            25% of total variance (r = .5)
Cohen’s d: Calculated by dividing the difference between means by the standard deviation:
d = (mean1 – mean2)/s
d is expressed in standard-deviation units (hence standardized) and is less biased than Pearson’s r
when group sizes are discrepant
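As a minimal sketch of the formula above: the notes only say "the standard deviation" (s), so which standard deviation to use is an assumption here. The code uses the pooled standard deviation of the two groups; the control group's standard deviation is another common choice. The function name and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Cohen's d = (mean1 - mean2) / s, with s taken as the pooled SD (assumption)."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


# Hypothetical scores: means 4 and 3, both variances 4, so s = 2.
treatment = [2, 4, 6]
control = [1, 3, 5]
print(cohens_d(treatment, control))  # 0.5, "medium" by the rule of thumb
```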
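Pearson’s correlation coefficient r: A measure of the strength of the relationship between 2 variables
(either both continuous, or one continuous and one categorical with two categories).
r is not measured on a linear scale, meaning r = .60 is not twice as strong as r = .30. As an effect size,
its magnitude is constrained to lie between 0 (no effect) and 1 (a perfect effect); the sign gives the
direction of the relationship.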
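One way to see why r is not on a linear scale is to compare the proportion of variance explained (r squared) for different values of r. The sketch below computes r from its definition and then squares it; the function names and data are made up for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson's r: the covariation of x and y scaled by their separate variation."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def variance_explained(r):
    """Proportion of total variance shared by the two variables."""
    return r ** 2


# A perfectly linear (hypothetical) relationship has magnitude 1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))

# Doubling r from .30 to .60 takes variance explained from 9% to 36%.
for r in (0.3, 0.6):
    print(r, round(variance_explained(r) * 100), "% of variance")
```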
Odds ratio: A popular effect size when the outcome is a count, e.g. the number of participants choosing
each option on two categorical variables.
The counts are summarized in a [2x2] contingency table.
odds = P(event) / P(no event) = (n event / total) / (n no event / total)
odds ratio = odds in one group / odds in the other group
odds ratio = 1 indicates that the odds of the outcome are the same in both groups
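The odds and the odds ratio for a 2x2 table can be sketched as follows. The table layout (rows are groups, columns are "event" and "no event"), the function names, and the counts are assumptions made for illustration.

```python
def odds(n_event, n_no_event):
    """Odds of the event: P(event) / P(no event); the totals cancel out."""
    return n_event / n_no_event


def odds_ratio(table):
    """Odds ratio for a 2x2 table laid out as [[a, b], [c, d]].

    Assumed layout: each row is a group, the first column counts
    'event' and the second column counts 'no event'.
    """
    (a, b), (c, d) = table
    return odds(a, b) / odds(c, d)


# Hypothetical counts: 20 of 30 in group 1 and 10 of 30 in group 2 show the event.
counts = [[20, 10],
          [10, 20]]
print(odds_ratio(counts))  # 4.0: the odds of the event are 4 times higher in group 1
```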
Effect sizes as an alternative to NHST:
- effects are interpreted on a continuum rather than with a rule (of p < .05)
- effect sizes are affected by sample size, but have no decision rule attached (e.g. p < .05)
- due to the absence of a decision rule: fewer researcher degrees of freedom.
Effect sizes can be used in a meta-analysis, which gives the average effect size across studies:
average effect size = ∑(effect sizes) / N(effect sizes)
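A minimal sketch of this average, assuming the simple unweighted mean given in the notes. Meta-analyses in practice usually weight each study (for example by its precision), which this sketch does not do. The function name and the effect sizes are made up for illustration.

```python
def mean_effect_size(effect_sizes):
    """Unweighted average effect size: sum of effect sizes / number of effect sizes."""
    return sum(effect_sizes) / len(effect_sizes)


# Hypothetical Cohen's d values from three studies.
print(round(mean_effect_size([0.2, 0.5, 0.8]), 2))  # 0.5
```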