MVDA SPSS Practicals
WEEK 1: Multiple Regression Analysis (MRA)
1) MVDA_MRA_1.sav.
Calculate the Pearson correlations between the five variables.
- Analyse ---> Correlate ---> Bivariate
- Variables: GPA, IQ, Age, Gender, SC
a) What is the sample size N ?
N = 78
b) Does it make sense to perform a linear regression of GPA on IQ, age, gender
and/or self-concept?
Correlations:
GPA - IQ: r = 0.579, p < 0.001 (high correlation, significant)
GPA - Age: r = 0.092, p = 0.422 (low correlation, not significant)
GPA - Gender: r = -0.051, p = 0.657 (low correlation, not significant)
GPA - SC: r = 0.484, p < 0.001 (medium correlation, significant)
- Because GPA correlates significantly and substantially with IQ and SC, it makes
sense to perform a linear regression
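The significance of each correlation can be checked by hand with the t-test for a Pearson correlation, t = r * sqrt(N - 2) / sqrt(1 - r²) with N - 2 degrees of freedom. A minimal sketch using only the r values and N reported above; the critical value 1.99 is an approximate two-sided 5% cutoff for df = 76 and is an assumption, not SPSS output.

```python
import math

N = 78  # sample size reported in a)

# Pearson correlations with GPA, copied from the correlation table above
correlations = {"IQ": 0.579, "Age": 0.092, "Gender": -0.051, "SC": 0.484}

# Assumption: approximate two-sided 5% critical t value for df = N - 2 = 76
T_CRIT = 1.99


def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_values = {name: t_from_r(r, N) for name, r in correlations.items()}
significant = {name: abs(t) > T_CRIT for name, t in t_values.items()}

for name, t in t_values.items():
    print(f"GPA - {name}: t({N - 2}) = {t:.2f}, significant: {significant[name]}")
```

Only IQ and SC exceed the cutoff, which agrees with the p-values in the SPSS table.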
c) Which variable is likely to be a good predictor of GPA?
The predictor that has the strongest significant Pearson correlation with the
dependent variable (Y) is likely to be the best predictor of GPA.
Both IQ (p < 0.001) and SC (p < 0.001) have a significant relationship with GPA,
but IQ has the strongest Pearson correlation with GPA (r = 0.579)
Perform a linear regression of GPA on IQ, age, gender and self-concept
- Analyse ---> Regression ---> Linear
- Dependent: GPA
- Independent: IQ, Age, Gender, SC
- Statistics: Part and partial correlations, collinearity diagnostics
- Save: Cook’s Distance, Leverage Values
d) Can the null hypothesis of no relationship between GPA and IQ, age, gender
and/or self-concept be rejected?
ANOVA Table
F(4, 73) = 23.117, p < 0.001
Yes, we can reject the null hypothesis (at least one of the predictors is
significantly related to GPA)
e) How much variance of GPA is explained by IQ, age, gender and SC together?
Model Summary
R = 0.748, R² = 0.559 (proportion of variance explained)
0.559 x 100 = 55.9%
---> 55.9% of the variance in GPA is explained by IQ, Age, Gender, and SC together
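The Model Summary and the ANOVA table are linked: R² is the square of R, and the F statistic can be recovered from R² as F = (R² / P) / ((1 - R²) / (N - P - 1)). A small consistency check under the assumption that the rounded values above are used as input, so the result only approximates the F that SPSS reports.

```python
R = 0.748    # multiple correlation from the Model Summary
R2 = 0.559   # R squared as reported by SPSS
N = 78       # sample size
P = 4        # number of predictors (IQ, Age, Gender, SC)

# R squared recomputed from R (differs slightly because R is rounded)
r2_from_r = R ** 2
percent_explained = R2 * 100

# Degrees of freedom of the ANOVA table: F(P, N - P - 1)
df_model = P
df_resid = N - P - 1

# F statistic implied by R squared
f_from_r2 = (R2 / df_model) / ((1 - R2) / df_resid)

print(f"R^2 from R = {r2_from_r:.3f}, explained = {percent_explained:.1f}%")
print(f"F({df_model}, {df_resid}) is approximately {f_from_r2:.2f}")
```

The implied F is close to the reported F(4, 73) = 23.117; the small gap comes from rounding R² to three decimals.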
f) What predictor explains the most unique variance?
Coefficients
Correlations: Part (unique correlations)
IQ: 0.487
Age: 0.402
Gender: -0.200
SC: 0.269
---> Biggest part correlation = IQ (0.487)
---> Unique variance = (0.487)² = 0.237, so IQ uniquely explains about 23.7% of
the variance in GPA
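The same squaring step applies to every predictor: the unique variance explained is the squared part (semipartial) correlation. A minimal sketch using the part correlations listed above.

```python
# Part (semipartial) correlations from the Coefficients table
part = {"IQ": 0.487, "Age": 0.402, "Gender": -0.200, "SC": 0.269}

# Unique proportion of variance explained = squared part correlation
unique_variance = {name: r ** 2 for name, r in part.items()}

# Predictor with the largest unique contribution
best = max(unique_variance, key=unique_variance.get)

for name, v in unique_variance.items():
    print(f"{name}: {v:.3f} ({v * 100:.1f}% unique variance)")
print("Most unique variance:", best)
```

Note that the sign of the part correlation does not matter here, because squaring removes it, and that the squared part correlations need not add up to R².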
g) Is there evidence of multicollinearity in the predictors?
Coefficients
Collinearity Statistics
Rule: VIF < 10, Tolerance > 0.1
IQ: Tolerance = 0.647, VIF = 1.546
Age: Tolerance = 0.814, VIF = 1.229
Gender: Tolerance = 0.951, VIF = 1.051
SC: Tolerance = 0.690, VIF = 1.449
---> All tolerance scores > 0.1 and all VIF scores < 10, no evidence of
multicollinearity
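Tolerance and VIF carry the same information, since VIF = 1 / Tolerance. A short sketch that recomputes VIF from the tolerances above and applies the rule from these notes (Tolerance > 0.1, VIF < 10); small differences from the SPSS values are due to rounding.

```python
# Collinearity statistics from the Coefficients table
tolerance = {"IQ": 0.647, "Age": 0.814, "Gender": 0.951, "SC": 0.690}
reported_vif = {"IQ": 1.546, "Age": 1.229, "Gender": 1.051, "SC": 1.449}

# VIF is the reciprocal of tolerance
vif = {name: 1 / tol for name, tol in tolerance.items()}

# Rule used in these notes: a problem if Tolerance <= 0.1 or VIF >= 10
multicollinear = [
    name for name in tolerance
    if tolerance[name] <= 0.1 or vif[name] >= 10
]

for name in tolerance:
    print(f"{name}: VIF = {vif[name]:.3f} (SPSS: {reported_vif[name]})")
print("Predictors flagged:", multicollinear)
```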
h) Do Cook’s distances and Leverage values suggest the presence of outliers?
Residuals Statistics
Centered Leverage Value: cutoff = 3 (P + 1) / N
3 x (number of predictors + 1) / total number of participants
3 (4 + 1) / 78 ≈ 0.19 (highest value a centered leverage value is allowed to have)
---> Maximum: 0.929 (bigger than 0.19 = outliers present)
Cook’s Distance: tells us whether an outlier in the data is influential
Cook’s Distance should not be higher than 1
---> Maximum: 7.918 (bigger than 1 = influential outliers present)
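Both outlier checks are simple comparisons against a cutoff. A minimal sketch using the maxima from the Residuals Statistics table; the cutoffs are the rules of thumb used in these notes (3 (P + 1) / N for centered leverage, 1 for Cook's distance), not universal standards.

```python
N = 78  # sample size
P = 4   # number of predictors

# Maximum values from the Residuals Statistics table
max_leverage = 0.929
max_cook = 7.918

# Rule-of-thumb cutoffs used in these notes
leverage_cutoff = 3 * (P + 1) / N
COOK_CUTOFF = 1.0

leverage_flag = max_leverage > leverage_cutoff
cook_flag = max_cook > COOK_CUTOFF

print(f"Leverage cutoff = {leverage_cutoff:.3f}, exceeded: {leverage_flag}")
print(f"Cook's distance cutoff = {COOK_CUTOFF}, exceeded: {cook_flag}")
```

Both maxima exceed their cutoffs, so at least one case is an influential outlier.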
If one or more outliers are detected, steps 1 a) - h) are repeated with exclusion
of the outlier(s). Use Selection to get rid of the outlier(s).
- COO_1: Sort descending (highest values go on top)
- The case with ID = 78 has the highest Cook’s Distance (7.918)
- Data ---> Select Cases ---> If condition is satisfied (use a condition that
excludes this case, e.g. ID ~= 78)