Summary ARMS assessment SPSS - Grasple Notes

This is the English summary of the Grasple lessons that are tested in the SPSS practical (Statistics) assessment on March 11th for the ARMS course.

willemijnvanes
ARMS assessment SPSS
Grasple Notes



Week 1 Multiple linear regression

Refresh

Residual = Y – Ŷ
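The residual formula above can be sketched in a few lines (the observed and predicted scores are made-up example values, not data from the notes):

```python
# Residual = Y - Ŷ: the difference between an observed and a predicted score.
# The scores below are made-up example values.

y     = [4.0, 6.0, 5.0, 8.0]   # observed Y
y_hat = [4.5, 5.5, 5.5, 7.5]   # predicted Y (Ŷ) from some fitted model

residuals = [obs - pred for obs, pred in zip(y, y_hat)]
print(residuals)  # [-0.5, 0.5, -0.5, 0.5]
```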

Assumptions

One condition for a multiple regression is that the dependent variable is a continuous measure
(interval or ratio). The independent variables should be continuous or dichotomous.
A second condition for a multiple regression is that there are linear relationships between the
dependent variable and all continuous independent variables.
→ Check this assumption by creating a scatterplot in SPSS with a continuous independent
variable on the X-axis and the dependent variable on the Y-axis.
Graphs > Chart builder > Scatter/Dot
A third assumption is the absence of outliers. This can be assessed visually by examining the
scatter plots. If this assumption is not met, consider carefully whether you can remove the
outlier; if so, remove the outlier from your dataset.
→ When you have to make a choice about whether or not to remove an outlier, a number of
things are important:
● Does this participant belong to the group about which you want to make inferences?
○ If not, do not include the participant in the analysis.
● Is the extreme value of the participant theoretically possible?
○ If not, do not include the participant in the analysis.
○ If so, run the analysis with and without the participant, report the results of both
analyses and discuss any differences.

So before performing a multiple regression you should check the following assumptions:
● Measurement levels
● Linearity
● Absence of outliers
→ Violations of these conditions can influence the statistical results. Always visualise your
data!

There are also assumptions that can be evaluated during a regression analysis. This step is
taken before the results can be interpreted.
Multiple linear regression in SPSS:

Analyze > Regression > Linear
If you want to check various assumptions, tick the following boxes:
● Absence of outliers: Click on Save and check: Standardised Residuals, Mahalanobis
Distance and Cook's Distance.
● Absence of Multicollinearity: Click on Statistics and check: Collinearity Diagnostics.
● Homoscedasticity: Click on Plots, place the variable *ZPRED (the standardised
predicted values) on the X-axis and the variable *ZRESID (the standardised
residuals) on the Y-axis.
● Normally Distributed Residuals: Click on Plots and check Histogram.

Read the output to see if the assumptions are met.
Absence of outliers: This can be determined through a scatter plot or box plot, but more
formally it can be assessed whilst performing the analysis. Look at the Residuals Statistics
table and view the minimum and maximum values of the standardised residuals, the
Mahalanobis distance and Cook's distance. On the basis of these values, it is possible to
assess whether there are outliers in the Y-space, X-space and XY-space, respectively.
● Standardised residuals: With these we check whether there are outliers in the Y-space. As
a rule of thumb, the values must be between -3.3 and +3.3. Values smaller than -3.3 or
greater than +3.3 indicate outliers.
● Mahalanobis distance: This is used to check whether there are outliers within the
X-space. An outlier in X-space is an extreme score on a predictor or combination of
predictors. As a rule of thumb, the values of the Mahalanobis distance must be lower
than 10 + 2 * (number of independent variables). Values higher than this critical value
indicate outliers.
● Cook's distance: Using this, it is possible to check whether there are outliers within the
XY-space. An outlier in the XY-space is an extreme combination of X and Y scores.
Cook's distance indicates the overall influence of a respondent on the model. As a rule
of thumb, values for Cook's distance must be lower than 1. Values higher than 1
indicate influential respondents (influential cases).
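The three rules of thumb above can be combined into one check per respondent; a minimal sketch, where the function name and the example values are illustrative:

```python
def flag_outliers(std_residual, mahalanobis, cooks, n_predictors):
    """Apply the three outlier rules of thumb from the notes to one case."""
    return {
        "y_space":  abs(std_residual) > 3.3,              # standardised residual
        "x_space":  mahalanobis > 10 + 2 * n_predictors,  # Mahalanobis distance
        "xy_space": cooks > 1,                            # Cook's distance
    }

# A case with an extreme standardised residual but unremarkable X and XY scores:
print(flag_outliers(std_residual=-3.5, mahalanobis=12.0, cooks=0.4, n_predictors=3))
# {'y_space': True, 'x_space': False, 'xy_space': False}
```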
Absence of Multicollinearity: The Coefficients table contains information on multicollinearity in
the last columns. This indicates whether the relationship between two or more independent
variables is too strong (r > .8). If you include overly related variables in your model, this has
three consequences:
1. The regression coefficients (B) are unreliable.
2. It limits the magnitude of R (the correlation between Y and Ŷ).
3. The importance of individual independent variables can hardly be determined, if at all.
Determining whether multicollinearity is an issue can be done on the basis of statistics provided
by SPSS in the last two columns of the Coefficients table. You can use the following rule of
thumb:
● Values for the Tolerance smaller than .2 indicate a potential problem.
● Values for the Tolerance smaller than .1 indicate a problem.
● The variance inflation factor (VIF) is equal to 1/Tolerance. So for the VIF, values greater
than 10 indicate a problem.
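Since the VIF is simply 1/Tolerance, both cutoffs can be applied in one step; a small sketch (the function name is illustrative):

```python
def collinearity_check(tolerance):
    """Translate a Tolerance value into a VIF and a verdict, per the rules of thumb."""
    vif = 1 / tolerance
    if tolerance < .1:
        status = "problem"            # Tolerance < .1, i.e. VIF > 10
    elif tolerance < .2:
        status = "potential problem"  # Tolerance < .2
    else:
        status = "ok"
    return vif, status

vif, status = collinearity_check(0.05)
print(round(vif, 1), status)  # 20.0 problem
```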

Homoscedasticity: The condition of homoscedasticity means that the spread of the residuals
must be approximately equal across all values of X. We assess this by plotting the
standardised residuals against the standardised predicted values. If for every predicted value
(X-axis) there is approximately the same amount of spread in the residuals (Y-axis), the
condition is met.
Normally Distributed Residuals: We assess this via the frequency distribution of the standardised
residuals. Most of the time this histogram does not exactly follow the curve of a perfect normal
distribution, but a small deviation is not enough to conclude that the condition of normally
distributed residuals has been violated.

In conclusion, when performing a (multiple) regression analysis, the following assumptions
should be checked:
● Absence of outliers
● Absence of multicollinearity
● Homoscedasticity
● Normally distributed residuals

Performing and interpreting

If the assumptions are met, the regression model can be interpreted. To this end, we look at the
first four tables from the SPSS output:
1. The first table shows what the independent and dependent variables are.
2. The second table shows the general quality of the regression model.
3. The third table shows the outcome of the F-test for the model.
4. The fourth table contains information about the regression coefficients.

Table 2: 'Model summary'
The multiple correlation coefficient R within the regression model indicates the correlation
between the observed dependent variable scores (Y) and the predicted dependent variable
scores (Ŷ). It is used to say something about how good the model is at predicting the dependent
variable.

Normally, the squared version of R, R square (R²), is used to assess how much variance of the
dependent variable is explained by the model: x% of the variance in the dependent variable
can be explained by the independent variables, while the remaining 100 − x% of the variance is
explained by other factors.
In addition to R square, the Adjusted R square is given. The adjusted R² is an estimate of the
proportion of explained variance in the population. It adjusts the value of R² on the basis of the
sample size (n) and the number of predictors in the model (k). The estimated proportion of
explained variance in the population is always somewhat lower than the proportion of explained
variance in the sample.
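The adjustment itself is not spelled out here; the usual formula is adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). A sketch with made-up numbers, which also shows that the adjusted value is somewhat lower:

```python
def adjusted_r_square(r_square, n, k):
    """Standard adjustment of R² for sample size n and number of predictors k."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# Made-up example: R² = .30 in a sample of n = 100 with k = 3 predictors.
print(round(adjusted_r_square(r_square=0.30, n=100, k=3), 3))  # 0.278
```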

Table 3: 'ANOVA'

This table shows the outcome of the F-test that tests whether the model as a whole is
significant. So here we look at whether the three independent variables together can explain a
significant part of the variance in satisfaction.

Table 4: 'Coefficients'
This table provides information about the regression coefficients. We consider for each
independent variable whether it is a significant predictor of satisfaction. A predictor is
significant if its p-value is smaller than α = .05.
The standardized coefficients, the Betas, can be used to determine the most important predictor
of satisfaction. The independent variable with the largest absolute beta is the most important
predictor, where absolute means that the sign (+ or -) does not matter: we are only
interested in the most extreme value.
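Picking the most important predictor then comes down to taking the largest absolute beta; a minimal sketch with made-up coefficient values for the predictors from the example:

```python
# Made-up standardized coefficients (Betas) for three predictors:
betas = {"age": 0.12, "gender": -0.45, "sports participation": 0.30}

# The largest absolute beta wins; the sign (+ or -) is ignored.
most_important = max(betas, key=lambda name: abs(betas[name]))
print(most_important)  # gender
```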

A hierarchical multiple regression analysis extends the previous model with the variables
parental support and teacher support. This will allow us to answer the question whether this
addition provides a significantly better prediction of satisfaction compared to a model with only
age, gender and sports participation.
→ Note that since we are adding new predictors to the regression model, we should check for
assumption violations.

Perform the hierarchical regression analysis in SPSS:
Analyze > Regression > Linear
● The independent variables of the original model should go into Block 1 of 1.
● Click Next to add another block with independent variables. You only need to define the
new predictors (it is not necessary to select the variables from the first block again). So,
in this case, you should add parental support and teacher support.
● Under Statistics, ask SPSS for the R squared change.
● Click OK.

The output we now get is similar to the output of the previous multiple regression. This (new)
information is shown in the first two tables:
1. Once again, the first table shows what the independent variables and dependent
variables are. Specifically, for each model (1 and 2) the predictors that were added in
that step are given.
2. The second table again shows the general quality of the regression model.
The left part of the table contains information about the quality of each model (in separate lines
for model 1 and model 2). The right part of the table contains the Change Statistics. These
statistics show how the quality changes as we move from one model to the next. Sig F change
indicates whether this difference in explained variance is significant.
→ The R square change for model 1 is equal to the value of R square for model 1, because
SPSS compares model 1 to model 0, a model that does not explain any variance.
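The Sig F change that SPSS reports is based on the standard F-test for a change in R²: F = (ΔR² / m) / ((1 − R²_full) / (n − k_full − 1)), with m the number of added predictors. A sketch with made-up values:

```python
def f_change(r2_full, r2_reduced, n, k_full, m):
    """Standard F-test statistic for the change in R² when m predictors are added."""
    delta_r2 = r2_full - r2_reduced
    return (delta_r2 / m) / ((1 - r2_full) / (n - k_full - 1))

# Made-up example: model 1 (3 predictors, R² = .30) vs model 2 (5 predictors, R² = .40),
# estimated on n = 100 respondents, so m = 2 predictors were added.
print(round(f_change(r2_full=0.40, r2_reduced=0.30, n=100, k_full=5, m=2), 2))  # 7.83
```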
