Evaluating Association Claims
Understanding Statistical Significance
➔ significance testing
◆ decide which decision rules will be used to test the hypothesis before analyzing the data
◆ null hypothesis significance testing (NHST): set of decision rules that help the researcher
use the margin of error to determine whether an observed effect is extreme enough to “reject the
null” and conclude that the researcher’s alternative hypothesis is supported (never proved)
➔ statistically significant
◆ an effect is observed, even after factoring in the margin of error
◆ not statistically significant if the margin of error is so large that it calls into question
whether the effect exists
➔ what is an effect
◆ specific outcome being tested
◆ group comparisons: type of effect that compares two or more groups
◆ correlation: type of effect that examines the association between variables
➔ to test significance of an effect, start with the assumption that no effect exists (null
hypothesis)
◆ the opposite of the researcher’s hypothesis that there is an effect
◆ null = no effect ; alternative (researcher’s hypothesis) = effect
➔ general approaches for testing significance
◆ confidence interval approach
◆ p-values approach
Confidence Interval Approach
➔ construct a confidence interval around the effect (hand calculations or computer program)
◆ e.g. mean difference or correlation coefficient
➔ assess whether the confidence interval around effect includes zero
◆ no zero, results are statistically significant
● reject the null
● researcher hypothesis supported
◆ yes zero, results are not statistically significant
● fail to reject the null
● researcher hypothesis not supported
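The confidence interval check can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the scores are made up, the function names are hypothetical, and the interval uses a large-sample normal (z) approximation, which is an assumption (a t-based interval would be slightly wider for small samples).

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Return (difference, lower, upper) for mean(group_a) - mean(group_b).

    Assumption: large-sample normal (z) approximation for two independent groups.
    """
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference between two independent means
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    # Critical value for the chosen confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff, diff - z * se, diff + z * se

def is_significant(lower, upper):
    """The effect is statistically significant when the interval excludes zero."""
    return lower > 0 or upper < 0

# Hypothetical scores, invented for illustration only
treatment = [14, 15, 16, 17, 18, 15, 16, 17, 16, 16]
control = [12, 13, 14, 15, 13, 12, 14, 13, 14, 13]

diff, lower, upper = mean_difference_ci(treatment, control)
print(f"difference = {diff:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
print("significant" if is_significant(lower, upper) else "not significant")
```

Comparing a group with itself gives a difference of zero and an interval centered on zero, so the same check reports "not significant".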
P-Values Approach
➔ 1. set significance level
◆ similar to confidence level
◆ identified using confidence level
● significance level = 100% minus the confidence level
○ e.g. 95% confidence = 5% significance
◆ also called alpha (α)
➔ 2. calculate effect and p-value
◆ use hand calculations or a computer program to calculate the effect and the probability
of observing an effect at least that large if the true effect were actually zero
● the probability value = p-value
➔ 3. compare p-value to significance level
◆ if the probability of an effect that large occurring when the true effect is actually zero is
lower than the significance level, the results are statistically significant
◆ p-value < alpha = statistically significant
● reject the null
● take next steps to explore hypothesis
◆ p-value > alpha = not statistically significant
● fail to reject the null
● alternative hypothesis not supported
➔ though two approaches were presented, they are rooted in the same foundation
◆ one can be used to make inferences about the other
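The three steps of the p-value approach can be sketched as follows. The data are made up, the function name is hypothetical, and the p-value comes from a two-sided z-test (normal approximation), which is an assumption made to keep the example dependency-free; a t-test would be the more usual choice for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def two_sided_p_value(group_a, group_b):
    """Return (difference, p) for the null hypothesis that the true difference is zero.

    Assumption: normal (z) approximation for two independent groups.
    """
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    z = diff / se
    # Probability of a difference at least this extreme if the true effect were zero
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, p

# Step 1: set the significance level (95% confidence corresponds to alpha = .05)
alpha = 0.05

# Hypothetical scores, invented for illustration only
treatment = [14, 15, 16, 17, 18, 15, 16, 17, 16, 16]
control = [12, 13, 14, 15, 13, 12, 14, 13, 14, 13]

# Step 2: calculate the effect and its p-value
diff, p = two_sided_p_value(treatment, control)

# Step 3: compare the p-value to alpha
decision = "reject the null" if p < alpha else "fail to reject the null"
print(f"difference = {diff:.2f}, p = {p:.4g}, decision: {decision}")
```

Under this approximation, p falls below alpha exactly when the matching confidence interval (here 95%) excludes zero, which is why one approach can be used to make inferences about the other.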
Understanding Effect Size
➔ considers the size of the group difference and/or strength of the association
➔ Cohen’s d = effect size used to compare the means across two groups
◆ small: |d| = .20
◆ medium: |d| = .50
◆ large: |d| = .80
➔ most effects in psychology are small to medium
◆ rarely large effects
◆ meaning differences are usually subtle rather than huge
➔ power analysis: estimates the sample size needed to detect an effect of a given size
◆ larger sample = less statistical error
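Cohen's d and a rough power analysis can be sketched like this. The data and function names are hypothetical. The "below small" label, the 80% power default, and the normal-approximation sample size formula are assumptions of this sketch rather than part of the notes; dedicated power software would give slightly different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def effect_size_label(d):
    """Map |d| onto the .20 / .50 / .80 benchmarks."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"  # label chosen for this sketch

def sample_size_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group needed to detect effect size d (two-sided test).

    Assumption: normal approximation, two equal-sized independent groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / abs(d)) ** 2)

# Hypothetical scores, invented for illustration only
treatment = [14, 15, 16, 17, 18, 15, 16, 17, 16, 16]
control = [12, 13, 14, 15, 13, 12, 14, 13, 14, 13]

d = cohens_d(treatment, control)
print(f"d = {d:.2f} ({effect_size_label(d)})")

# Smaller effects need larger samples to be detected
for benchmark in (0.20, 0.50, 0.80):
    print(f"d = {benchmark:.2f}: about {sample_size_per_group(benchmark)} per group")
```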
Cautions of Statistical Significance
➔ only tells if effect is likely to differ from zero in the population that the sample represents
➔ doesn’t say anything about effect size
◆ in large sample sizes, small effect sizes can be significant
➔ not reliable when sample size is low
➔ data from samples are merely estimates of true population parameters, so there is always a
risk of making an error
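The caution that small effects can be significant in large samples can be made concrete with a short calculation. This sketch holds a hypothetical standardized difference fixed at d = .10 and varies only the sample size; the function name is invented, and the p-value again relies on a normal approximation with equal group sizes, which is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def p_value_for_effect(d, n_per_group):
    """Two-sided p-value for a standardized mean difference d.

    Assumption: equal group sizes and a normal approximation, so the
    test statistic is z = |d| * sqrt(n / 2).
    """
    z = abs(d) * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(z))

alpha = 0.05
small_effect = 0.10  # hypothetical effect, well below the "small" benchmark

for n in (20, 200, 2000):
    p = p_value_for_effect(small_effect, n)
    verdict = "significant" if p < alpha else "not significant"
    print(f"n per group = {n}: p = {p:.4f} ({verdict})")
```

The effect size never changes across the three rows; only the sample size does, which is why significance alone says nothing about how large an effect is.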
Two Types of Statistical Error
➔ Type 1 error (false positive)