Summary Methods
Hair, J.F. Jr., Anderson, R.E., Tatham, R.L. (1987) Multivariate Data
Analysis, second edition, pp. 20-40.
Multiple regression analysis is a statistical technique that can be used to analyse the relationship
between a single dependent (criterion) variable and several independent (predictor) variables.
The objective of multiple regression analysis is to use the several independent variables whose values
are known to predict the single dependent value the researcher wishes to know.
Multiple regression normally requires metric variables. However, under certain circumstances,
nonmetric data can be included in the analysis as dummy-coded independent variables.
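Dummy coding, as mentioned above, can be sketched in a few lines. This is a minimal illustration, not a method taken from the book: the variable `region`, its levels, and the helper `dummy_code` are hypothetical. A predictor with k levels is turned into k - 1 columns of 0/1 values, with one level left out as the reference category.

```python
def dummy_code(values, reference):
    """Return (column_names, rows) with one 0/1 column per non-reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows


# Hypothetical nonmetric predictor with three levels.
region = ["north", "south", "west", "south", "north"]

# "north" is the reference category, so it is coded as all zeros.
columns, coded = dummy_code(region, reference="north")

for original, row in zip(region, coded):
    print(original, dict(zip(columns, row)))
```

The resulting 0/1 columns can then be entered into the regression like any other independent variable.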
Multiple regression analysis can be used for four different purposes:
1. Determine the appropriateness of using the regression procedure with the problem.
2. Examine the statistical significance of the attempted prediction.
3. Examine the strength of the association between the single dependent variable and the one or
more independent variables.
4. Predict the values of one variable from the values of others.
Simple regression = when the problem involves a single dependent variable that is predicted by a
single independent variable.
Multiple regression = when the problem involves a single dependent variable predicted by two or
more independent variables.
Prediction using a single measure – The Average
Errors.
By simply adding the errors (actual value minus predicted value), we might expect to obtain a measure
of the prediction accuracy. However, we would not, because when the average is used as the prediction
the errors always sum to zero. To overcome this problem, we can square each error and then add the
results together. The result, referred to as the sum of squared errors, provides a good measure of the
prediction accuracy of the arithmetic average. We wish to obtain the smallest possible sum of
squared errors, since a smaller sum means that our prediction is more accurate.
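The reasoning about errors can be checked with a small sketch. The data values are hypothetical and chosen only for illustration. The sketch shows that errors around the average sum to zero, and that the average gives a smaller sum of squared errors than another single-value prediction (here the average plus one).

```python
# Hypothetical values of the dependent variable for eight cases.
y = [4, 6, 6, 5, 8, 7, 9, 10]


def mean(values):
    return sum(values) / len(values)


def sum_of_squared_errors(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


y_bar = mean(y)

# Errors of prediction when every case is predicted by the average.
errors = [value - y_bar for value in y]
sum_errors = sum(errors)

# Sum of squared errors for the average and for a different constant prediction.
sse_mean = sum_of_squared_errors(y, [y_bar] * len(y))
sse_other = sum_of_squared_errors(y, [y_bar + 1] * len(y))

print("average:", y_bar)
print("sum of errors:", sum_errors)
print("sum of squared errors (average):", sse_mean)
print("sum of squared errors (average + 1):", sse_other)
```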
Prediction using two measures – Simple Regression
Simple regression follows the same rule: minimizing the sum of the squared errors of prediction.
Predicted number of … = Average number of …
$\hat{Y} = \bar{y}$
$\hat{Y} = b_1 X_1$
$\hat{Y} = b_0 + b_1 X_1$
b0 and b1 are regression coefficients.
b0 = the constant
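A minimal sketch of how b0 and b1 can be estimated by least squares, assuming one predictor and hypothetical data. The function names are illustrative. The estimates use the standard formulas b1 = S_xy / S_xx and b0 = mean(y) - b1 * mean(x), and the sketch compares the sum of squared errors with the one obtained from the average alone.

```python
# Hypothetical data: one predictor (x) and the dependent variable (y).
x = [2, 3, 4, 5, 6, 7, 8, 9]
y = [4, 6, 6, 5, 8, 7, 9, 10]


def mean(values):
    return sum(values) / len(values)


def sum_of_squared_errors(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def fit_simple_regression(x, y):
    """Least-squares estimates (b0, b1) for Y-hat = b0 + b1 * X1."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_simple_regression(x, y)
predicted = [b0 + b1 * xi for xi in x]

# Baseline: predict every case by the average of y.
sse_average = sum_of_squared_errors(y, [mean(y)] * len(y))
sse_regression = sum_of_squared_errors(y, predicted)

print("b0 (constant):", b0)
print("b1:", b1)
print("sum of squared errors, average only:", sse_average)
print("sum of squared errors, regression:", sse_regression)
```

With a predictor that is related to the dependent variable, the regression line gives a smaller sum of squared errors than the average alone.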
Major assumptions
We have shown how improvements in prediction are possible, but in doing so we had to make several
assumptions about the relationship between the variable to be predicted and the variables we
wanted to use for predicting.
- Statistical relationship: we are assuming that our description of … is statistical, not functional.
Functional relationship = calculates an exact value.
Statistical relationship = estimates an average value.
- Equal variance of the criterion variable: the variance of the criterion (dependent) variable is
assumed to be the same at every value of the predictor.
- Lack of correlation of errors: we want the errors we make in prediction to be
uncorrelated with each other.
Fixed versus random predictors
Fixed predictor variable = the levels of the predictor are set by the researcher. Our interest is
only in the levels examined.
Random predictor variable = the levels of the predictor are selected at random. Our interest is not
just in the levels examined but rather in the larger population of possible predictor levels from which
we selected a sample.
Most regression models based on survey data are random effects models.
Prediction using several measures: multiple regression analysis
- Independence: no correlation between predictor variables.
- Interaction: the effect of one predictor on the dependent variable is not constant but depends on
the level of another predictor.
- Correlation: two predictor variables are perfectly correlated when their correlation is 1.0.
No predictor should be included that is more closely related to the best predictor than it is to
the dependent variable.
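Correlation between predictor variables can be inspected before the regression is run. The sketch below is one possible check, not a procedure from the book: the predictor names and values are hypothetical, and `x3` is deliberately constructed as exactly 2 * `x1` so that one pair is perfectly correlated.

```python
from itertools import combinations
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)


# Hypothetical predictor variables; x3 is exactly 2 * x1.
predictors = {
    "x1": [2, 3, 4, 5, 6, 7, 8, 9],
    "x2": [5, 1, 4, 2, 6, 3, 8, 7],
    "x3": [4, 6, 8, 10, 12, 14, 16, 18],
}

correlations = {}
for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
    r = pearson_r(a, b)
    correlations[(name_a, name_b)] = r
    flag = "  <- perfect correlation" if abs(r) > 0.9999 else ""
    print(f"r({name_a}, {name_b}) = {r:.3f}{flag}")
```

A flagged pair carries the same information twice, so only one of the two predictors can contribute to the prediction.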
Four purposes of multiple regression analysis:
1. Determining the appropriateness of our predictive model.
2. Examining the statistical significance of our model.
3. Predicting with the model.
4. Examining the strength of association between the variables.
Determining the appropriateness of our predictive model
We should note that residuals are an artefact of the particular predictive model we are using; they
are not equivalent to the random error in the population. These residuals should reflect the
properties of the population random error if the model is appropriate. Analysis of residuals may be
used to examine the appropriateness of the predictive models in terms of:
1. The linearity of the phenomenon measured: by plotting the residuals or by partitioning the error.
2. The constant variance of the error terms
3. The independence of the error terms
4. The normality of the error term distribution
5. The addition of other variables
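A minimal sketch of a residual analysis for a simple regression, using hypothetical data. It prints residuals next to predicted values as a text substitute for a residual plot. For the independence check it computes the Durbin-Watson statistic, a technique not named in the summary and brought in here as one common option. That statistic only makes sense if the cases are listed in the order they were collected, which is assumed here.

```python
# Hypothetical data, assumed to be listed in order of collection.
x = [2, 3, 4, 5, 6, 7, 8, 9]
y = [4, 6, 6, 5, 8, 7, 9, 10]


def mean(values):
    return sum(values) / len(values)


# Least-squares fit of Y-hat = b0 + b1 * X1.
x_bar, y_bar = mean(x), mean(y)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Text version of a residual plot: look for curves (nonlinearity)
# or a spread that widens or narrows (non-constant variance).
for pi, ei in zip(predicted, residuals):
    print(f"predicted {pi:6.2f}   residual {ei:6.2f}")

# Durbin-Watson statistic: values near 2 suggest uncorrelated errors,
# values near 0 or 4 suggest positive or negative serial correlation.
numerator = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
denominator = sum(e ** 2 for e in residuals)
durbin_watson = numerator / denominator

print("sum of residuals:", sum(residuals))
print("Durbin-Watson:", durbin_watson)
```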
Examining the statistical significance of our model
- Test of coefficients:
Hypothesis 1: the intercept (constant term) value of … arose by sampling error, and the real
constant term appropriate to the population is zero. (We test whether the constant term
should be considered appropriate for our predictive model)
Hypothesis 2: the coefficient … indicated that an increase of one unit in … is associated with
an increase in the average of … by …
We can test whether this coefficient differs significantly from 0.
- Test of the variation explained (coefficient of determination)
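Both tests can be sketched for a simple regression with hypothetical data. Each coefficient is divided by its standard error to give a t statistic, which is compared with a critical value. The value 2.447 used below is the two-tailed 5% critical t for 6 degrees of freedom (n - 2 with eight cases), and the choice of a 5% level is an assumption. The coefficient of determination is computed as 1 - SSE / SST.

```python
from math import sqrt

# Hypothetical data: one predictor (x) and the dependent variable (y).
x = [2, 3, 4, 5, 6, 7, 8, 9]
y = [4, 6, 6, 5, 8, 7, 9, 10]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares coefficients.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Error and total sums of squares.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)

# Standard errors of the coefficients (residual variance uses n - 2 df).
mse = sse / (n - 2)
se_b1 = sqrt(mse / s_xx)
se_b0 = sqrt(mse * (1 / n + x_bar ** 2 / s_xx))

t_b0 = b0 / se_b0
t_b1 = b1 / se_b1

# Two-tailed 5% critical value of t for 6 degrees of freedom.
T_CRITICAL = 2.447

# Coefficient of determination: share of variation explained.
r_squared = 1 - sse / sst

print(f"b0 = {b0:.3f}, t = {t_b0:.3f}, differs from 0: {abs(t_b0) > T_CRITICAL}")
print(f"b1 = {b1:.3f}, t = {t_b1:.3f}, differs from 0: {abs(t_b1) > T_CRITICAL}")
print(f"R squared = {r_squared:.3f}")
```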