ALL POSSIBLE ANALYSES
Correlation
- Regular (Pearson/Spearman)
- Partial
Reliability analysis
- (Cronbach’s Alpha)
Simple linear regression
Comparing means
- Dependent samples t-test
- Independent samples t-test
- Wilcoxon signed-rank test
- Mann-Whitney test
ANOVA
- One-way ANOVA
- Two-way/factorial ANOVA
- Kruskal-Wallis test
Chi-Square
WHEN TO CHOOSE WHICH TEST
Correlation
- Regular correlation
o A way of measuring the relationship between continuous variables.
E.g., “The more TV a person watches, the more positive that person’s
attitudes about male homosexuality are”.
- Partial correlation
o A way of measuring the relationship between two variables while controlling for the
effects of a third variable on both variables in the original correlation.
Reliability
- A way of measuring the reliability of a scale consisting of multiple items
o Cronbach’s Alpha value should be larger than .7 for the reliability to be “good” (see the sketch below).
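The formula behind Cronbach’s alpha is simple enough to check by hand. A minimal Python sketch, assuming the items sit in the columns of a NumPy array (the scale data below is hypothetical):

```python
import numpy as np

def cronbach_alpha(items):
    # alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 respondents answering a 3-item scale
scale = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [3, 3, 2], [4, 4, 4]]
print(cronbach_alpha(scale))  # above .7 counts as "good" reliability
```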
Simple linear regression
- A way of predicting the value of one variable from another
o E.g., “A person’s age predicts that person’s view on women”
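Outside SPSS this is a one-call analysis. A minimal Python sketch with scipy.stats.linregress; the age and attitude numbers are made up for illustration:

```python
from scipy import stats

# Hypothetical data: age (predictor) and an attitude score (outcome)
age = [21, 34, 45, 52, 63, 70]
attitude = [5.1, 4.8, 4.2, 3.9, 3.1, 2.8]

result = stats.linregress(age, attitude)
print(f"attitude = {result.intercept:.2f} + {result.slope:.2f} * age")
print(f"r = {result.rvalue:.3f}, p = {result.pvalue:.4f}")
```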
Comparing two means
- Dependent samples t-test
o Compares two means, when those means have come from the same entities.
Compares two means based on related data.
E.g., “People think they will feel more negative after negative feedback on a
test than is actually the case”.
- Independent samples t-test
o Compares two means when those means have come from different groups of
entities. Compares the two means based on independent data.
E.g., “People who are confronted with a real spider show more anxiety than
people who are confronted with a picture of a spider”.
- Wilcoxon signed-rank test
o This is the non-parametric equivalent of the dependent t-test and should be used
instead when normality is broken IN COMBINATION with extreme outliers.
- Mann-Whitney test
o This is the non-parametric equivalent of the independent t-test and should be used
instead when normality is broken IN COMBINATION with extreme outliers.
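All four tests have direct equivalents in Python’s scipy.stats, which can help keep the pairings straight. A minimal sketch with invented scores (pre/post from the same people, a/b from different groups):

```python
from scipy import stats

# Hypothetical scores: pre/post from the same entities, a/b from different groups
pre  = [4.1, 5.0, 3.8, 4.6, 5.2, 4.4]
post = [3.6, 4.7, 3.9, 4.1, 4.8, 4.0]
a = [2.9, 3.4, 3.1, 3.8, 2.7]
b = [4.2, 4.8, 3.9, 4.5, 5.0]

print(stats.ttest_rel(pre, post))    # dependent samples t-test (same entities)
print(stats.ttest_ind(a, b))         # independent samples t-test (different groups)
print(stats.wilcoxon(pre, post))     # non-parametric equivalent of the dependent t-test
print(stats.mannwhitneyu(a, b))      # non-parametric equivalent of the independent t-test
```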
ANOVA
- One-way ANOVA
o Compares several (more than 2) means, when those means have come from
different groups of people; for example, if you have several experimental conditions
and have used different participants in each condition.
E.g., “People with a low average grade are better in affective forecasting
than people with an average or high average grade”.
- Kruskal-Wallis test
o This is the non-parametric equivalent of the one-way ANOVA and should be used
instead when normality is broken IN COMBINATION with extreme outliers.
- Two-way/factorial ANOVA
o Compares several means when there are two independent variables and different
groups have been used in all experimental conditions.
E.g., testing the effects of alcohol and gender on perceived attractiveness of
mates.
IV 1 (alcohol): None, 2 pints, 4 pints
IV 2 (gender): Male, female
DV: attractiveness of partner
o Typically, two-way ANOVAs come with three hypotheses:
Main effect 1 (the effect of alcohol)
Main effect 2 (the effect of gender)
Interaction effect (the effect of alcohol depends on gender)
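A minimal Python sketch of the one-way case and its non-parametric equivalent, with invented scores for the three grade groups (a two-way/factorial ANOVA would typically be fitted with a model-based package such as statsmodels instead):

```python
from scipy import stats

# Hypothetical affective-forecasting scores for three grade groups
low  = [6.1, 5.8, 6.4, 5.9]
mid  = [5.2, 4.9, 5.5, 5.1]
high = [4.8, 4.4, 5.0, 4.6]

print(stats.f_oneway(low, mid, high))  # one-way ANOVA across the three groups
print(stats.kruskal(low, mid, high))   # non-parametric equivalent (Kruskal-Wallis)
```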
Chi-Square
- A way of measuring the relationship between two categorical variables
o E.g., Are men more likely than women to have hair on their chest?
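A minimal Python sketch of the chest-hair example as a contingency-table test; the counts are invented:

```python
from scipy import stats

# Hypothetical 2x2 contingency table: gender (rows) x chest hair (columns)
observed = [[43,  9],    # men:   hair, no hair
            [ 2, 48]]    # women: hair, no hair

chi2, p, dof, expected = stats.chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
```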
EFFECT SIZES
1. Small effect
- r = .1 (The effect accounts for 1% of the total variance)
2. Medium effect
- r = .3 (The effect accounts for 9% of the total variance)
3. Large effect
- r = .5 (The effect accounts for 25% of the variance)
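These percentages are simply r squared times 100, as a quick check confirms:

```python
# Shared variance in % = r squared x 100
for r in (.1, .3, .5):
    print(f"r = {r}: {r ** 2 * 100:.0f}% of the total variance")
```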
STEP-BY-STEP PROCEDURE OF ALL TESTS
Correlation
A way of measuring the relationship between continuous variables (e.g., “The more TV a person
watches, the more positive that person’s attitudes about male homosexuality are”).
Relevant assumptions
Normally distributed DATA
- Checked by calculating z-scores of skewness and kurtosis
o Skewness or kurtosis values divided by their SE
o If this value is larger than 1.96 or smaller than -1.96, the data is not normally distributed
- Checked by observing Q-Q plot or histogram
If assumption is violated
o Report that the data is not normally distributed and that the p-value might be
biased.
o Report bootstrapped confidence intervals and proceed with a Pearson’s correlation
analysis OR select Spearman’s correlation instead.
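A minimal Python sketch of the z-score check. Note that SPSS reports exact standard errors for skewness and kurtosis, while the sketch uses the common large-sample approximations SE ≈ √(6/n) and SE ≈ √(24/n); the data is simulated:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of scale scores
x = np.random.default_rng(1).normal(loc=3.5, scale=0.8, size=200)

z_skew = stats.skew(x) / np.sqrt(6 / len(x))       # SE of skewness ~ sqrt(6/n)
z_kurt = stats.kurtosis(x) / np.sqrt(24 / len(x))  # SE of excess kurtosis ~ sqrt(24/n)
print(z_skew, z_kurt)  # outside +/-1.96 -> normality assumption violated
```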
Steps of the analysis
1. Get your data ready
- Check reversed items
- Check for (impossible) outliers (ascending in data view)
- Check reliability of scales (α)
- Compute mean values or other necessary variables
2. Explore the data to get mean values and standard deviations, test the assumption of
normality
- Report the mean value and the standard deviation of the relevant variables
- For each variable, divide the skewness and kurtosis values by their SE. If this
value is larger than 1.96 or smaller than -1.96, the assumption of normality is violated.
3. Computing the correlation
- Select: Analyze > Correlate > Bivariate, and transfer the relevant variables into the box
labelled Variables. All options are fine. Click on Bootstrap > Perform bootstrapping when
dealing with non-normally distributed data to generate bootstrapped confidence
intervals, or select Spearman instead of Pearson. Finally, click on OK to run the analysis.
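For comparison, the same analysis (coefficient plus percentile-bootstrap confidence interval, mirroring SPSS’s Perform bootstrapping option) can be sketched in Python; the TV/attitude data below is simulated:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated paired scores: TV watching time and attitude
tv = rng.normal(150, 90, 300)
attitude = 4 - 0.003 * tv + rng.normal(0, 0.8, 300)

r, p = stats.pearsonr(tv, attitude)  # or stats.spearmanr(tv, attitude)

# Percentile bootstrap: resample pairs with replacement, recompute r each time
boot = []
n = len(tv)
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(stats.pearsonr(tv[idx], attitude[idx])[0])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"r = {r:.3f}, p = {p:.4f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```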
4. Output from the correlation shows
- The correlation coefficient
- The significance of the correlation
- With non-normality the bootstrapped confidence intervals
o If they do not cross 0, the correlation is significant, even if the p-value is larger
than .05.
5. Effect size of the correlation analysis
- Correlation coefficients are effect sizes. YAY!
- Do report how much variance in one variable is shared by the other variable in %
o By squaring r and multiplying by 100 (e.g., (-.093)² = .008649; .008649 x 100 = 0.86%).
6. Reporting the results
To test this hypothesis, a correlation analysis was conducted. The amount of time that a person
watches TV was measured on a scale consisting of three items, which had a satisfactory reliability (α
= .78). A new variable was computed by combining these three items into one variable which
represents the amount of time a person watches TV across all days of the week. Attitudes about male
homosexuality were measured on a scale consisting of seven items, which also had a good reliability
(α = .86). Item ‘v3139’, one of these items, was recoded because it was a reverse-worded
statement. These seven items were combined into one new variable which represents
people’s average attitude about male homosexuality. The mean score on the TV watching scale was M
= 149.39 (SD = 102.22). However, an inspection of the data showed that respondent 502 had produced
a major outlier. After removing this outlier new descriptive statistics were generated, which showed a
mean score on the TV watching scale of M = 148.11 (SD = 93.79). The mean score on the attitudes
about male homosexuality scale was M = 3.56 (SD = 0.82). Neither the amount of time people watch TV
nor attitudes about male homosexuality was normally distributed. The amount of time people watch
TV showed significant skewness and kurtosis (z-score skewness = 23.84, z-score kurtosis = 44.55).
This was also the case for attitudes about male homosexuality (z-score skewness = -6.70, z-score
kurtosis = 2.23). Therefore, the p-value may have been biased and bootstrapped confidence intervals
were used to interpret the relation between both variables. Results demonstrate a small but significant
negative correlation between the amount of time people watch TV and their attitude about male
homosexuality, r = -.093, 95% CI [-.175, -.013], p = .003. Therefore, these data do not support the
hypothesis that the more TV a person watches the more positive that person’s attitude about male
homosexuality is. This is explained by the fact that a lower score on attitude about male
homosexuality represents a higher agreement with the negative statements about male homosexuality.
Thus, results indicate that the more a person watches TV, the more negative that person’s attitude
about male homosexuality is.
Partial Correlation
A way of measuring the relationship between two variables while controlling for the effects of a third
variable on both variables in the original correlation.
Relevant assumptions
Normally distributed DATA
- Checked by calculating z-scores of skewness and kurtosis
o Skewness or kurtosis values divided by their SE
o If this value is larger than 1.96 or smaller than -1.96, the data is not normally distributed
- Checked by observing Q-Q plot or histogram
If assumption is violated
o Report that the data is not normally distributed and that the p-value might be
biased.
o Report bootstrapped confidence intervals and proceed with the correlation analysis.
Steps of the analysis
1. Get your data ready
- Check reversed items
- Check for (impossible) outliers (ascending in data view)
- Check reliability of scales (α)
- Compute mean values or other necessary variables
2. Explore the data to get mean values and standard deviations, test the assumption of
normality
- Report the mean value and the standard deviation of the relevant variables
- For each variable, divide the skewness and kurtosis values by their SE. If this
value is larger than 1.96 or smaller than -1.96, the assumption of normality is violated.
3. Computing the partial correlation
Select: Analyze > Correlate > Partial, and transfer the variables that you want to correlate
into the box labelled Variables, and transfer variables that you want to control into the box
labelled Controlling for. Click on Options and request Zero-order correlations. Click on
Bootstrap > Perform bootstrapping when dealing with non-normally distributed data to
generate bootstrapped confidence intervals. Finally, click on OK to run the analysis.
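SPSS does the partialling internally, but a first-order partial correlation can also be computed directly from the three pairwise correlations. A minimal Python sketch with simulated age/TV/attitude data:

```python
import numpy as np
from scipy import stats

def partial_corr(x, y, z):
    # r_xy.z = (r_xy - r_xz*r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    r_xy = stats.pearsonr(x, y)[0]
    r_xz = stats.pearsonr(x, z)[0]
    r_yz = stats.pearsonr(y, z)[0]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Simulated data: TV watching and attitude both driven partly by age
rng = np.random.default_rng(0)
age = rng.uniform(18, 80, 300)
tv = 2 * age + rng.normal(0, 30, 300)
attitude = 5 - 0.02 * age + rng.normal(0, 0.5, 300)
print(partial_corr(tv, attitude, age))  # correlation with age partialled out
```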
4. Output from the partial correlation shows
- The controlled variable
- The correlation coefficient while controlling for another variable (e.g., age)
- The significance of the correlation
- With non-normality the bootstrapped confidence intervals
o If they do not cross 0, the correlation is significant, even if the p-value is larger
than .05.
- The zero-order correlations