Summary Methods
Hair, J.F. Jr., Anderson, R.E., Tatham, R.L. (1987) Multivariate Data
Analysis, second edition, pp. 20-40.
Multiple regression analysis is a statistical technique that can be used to analyse the relationship
between a single dependent (criterion) variable and several independent (predictor) variables.
The objective of multiple regression analysis is to use the several independent variables whose values
are known to predict the single dependent value the researcher wishes to know.
The variables are normally metric. However, under certain circumstances, it is possible to use
dummy-coded (nonmetric) independent variables in the analysis.
Multiple regression analysis can be used for four different purposes:
1. Determine the appropriateness of using the regression procedure with the problem.
2. Examine the statistical significance of the attempted prediction.
3. Examine the strength of the association between the single dependent variable and the one or
more independent variables.
4. Predict the values of one variable from the values of others.
Simple regression = when the problem involves a single dependent variable that is predicted by a
single independent variable.
Multiple regression = when the problem involves a single dependent variable predicted by two or
more independent variables.
Prediction using a single measure – The Average
Errors.
An error is the difference between an observed value and its predicted value. By simply adding the
errors, we might expect to obtain a measure of the prediction accuracy. However, this does not work:
the errors around the average always sum to zero. To overcome this problem, we can square each error
and then add the results together. The result, referred to as the sum of squared errors, provides a
good measure of the prediction accuracy of the arithmetic average. We wish to obtain the smallest
possible sum of squared errors, since this would mean that our prediction is more accurate.
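A small sketch of this idea in Python. The numbers are made up for illustration and are not from the
book; the helper name sum_of_squared_errors is also just an assumption of this sketch. It shows that
the raw errors around the average cancel out, and that the average gives a smaller sum of squared
errors than another single-value guess.

```python
# Hypothetical observations of the variable we want to predict.
observed = [2, 3, 4, 7, 9]

# Predict every observation with the arithmetic average.
average = sum(observed) / len(observed)

# Error = observed value minus predicted value.
errors = [y - average for y in observed]

# The raw errors cancel each other out, so their sum is not a useful measure.
sum_of_errors = sum(errors)


def sum_of_squared_errors(values, prediction):
    """Square each error first, then add, so positive and negative errors cannot cancel."""
    return sum((y - prediction) ** 2 for y in values)


sse_average = sum_of_squared_errors(observed, average)
# Any other single guess (here 4, chosen arbitrarily) does worse than the average.
sse_other_guess = sum_of_squared_errors(observed, 4)

print(sum_of_errors, sse_average, sse_other_guess)
```

With these made-up numbers the average is 5, the errors sum to zero, and the sum of squared errors is
34 for the average against 39 for the guess of 4.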
Prediction using two measures – Simple Regression
Simple regression follows the same rule: minimizing the sum of the squared errors of prediction.
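Predicted number of … = Average number of …
$\hat{Y} = \bar{y}$ (prediction using only the average)
$\hat{Y} = b_1 X_1$ (prediction using one predictor, without a constant)
$\hat{Y} = b_0 + b_1 X_1$ (prediction using one predictor and a constant)
$b_0$ and $b_1$ are regression coefficients.
$b_0$ = the constant (intercept)
$b_1$ = the slope: the change in $\hat{Y}$ for a one-unit increase in $X_1$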
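A minimal sketch of simple regression in Python, using the standard least-squares formulas for one
predictor. The data are made up for illustration, and the variable names are assumptions of this
sketch, not notation from the book. It compares the sum of squared errors of the regression line with
that of the average.

```python
# Hypothetical data: one predictor X1 and the variable Y we want to predict.
x = [1, 2, 3, 4, 5]
y = [2, 3, 4, 7, 9]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates: these values minimize the sum of squared errors.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
b1 = s_xy / s_xx           # slope
b0 = mean_y - b1 * mean_x  # constant (intercept)

# Predictions from the regression line and from the average alone.
predicted = [b0 + b1 * xi for xi in x]
sse_regression = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
sse_average = sum((yi - mean_y) ** 2 for yi in y)

print(b0, b1, sse_regression, sse_average)
```

For these numbers the line is roughly Y = -0.4 + 1.8 X1, and the sum of squared errors drops from 34
(average only) to about 1.6 (regression line).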
Major assumptions
We have shown how improvements in prediction are possible, but in doing so we had to make several
assumptions about the relationship between the variable to be predicted and the variables we
wanted to use for predicting.
- Statistical relationship: we are assuming that our description of … is statistical, not functional.
Functional relationship = calculates an exact value.
Statistical relationship = estimates an average value.
- Equal variance of the criterion variable: the variance of the criterion (dependent) variable is
assumed to be the same at every value of the predictor.
- Lack of correlation of errors: any errors we make in prediction should be uncorrelated with each
other.
Fixed versus random predictors
Random predictor variable = the levels of the predictor are selected at random. Our interest is not
just in the levels examined but rather in the larger population of possible predictor levels from which
we selected a sample.
Most regression models based on survey data are random effects models.
Prediction using several measures: multiple regression analysis
- Independence: no correlation between predictor variables.
- Interaction: the effect of one predictor on the dependent variable is not constant, but depends on
the level of another predictor.
- Collinearity: correlation between two predictor variables (perfect collinearity = a correlation of 1.0).
No predictor should be included that is more closely related to the best predictor than it is to
the dependent variable.
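The correlation between predictors can be checked before they are entered into the model. A minimal
sketch in Python, assuming made-up predictors x1, x2 and x3 and a hand-written helper pearson_r
(both hypothetical, not from the book):

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


# Hypothetical predictor variables.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 10]  # exactly 2 * x1, so it adds no new information
x3 = [5, 3, 4, 1, 2]

r_x1_x2 = pearson_r(x1, x2)  # perfect correlation between two predictors
r_x1_x3 = pearson_r(x1, x3)  # strong, but not perfect, negative correlation

print(r_x1_x2, r_x1_x3)
```

Here x2 correlates perfectly (1.0) with x1, so only one of the two can be used; x1 and x3 correlate
at about -0.8, which is high but still leaves x3 some information of its own.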
Four purposes of multiple regression analysis:
1. Determining the appropriateness of our predictive model.
2. Examining the statistical significance of our model.
3. Predicting with the model.
4. Examining the strength of association between the variables.
Determining the appropriateness of our predictive model
We should note that residuals are an artefact of the particular predictive model we are using; they
are not equivalent to the random error in the population. These residuals should reflect the
properties of the population random error if the model is appropriate. Analysis of residuals may be
used to examine the appropriateness of the predictive models in terms of:
1. The linearity of the phenomenon measured: by plotting the residuals or by partitioning the error.
2. The constant variance of the error terms
3. The independence of the error terms
4. The normality of the error term distribution
5. The addition of other variables
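A rough numerical sketch of residual analysis in Python, for a simple regression on made-up data. In
practice these checks are usually done by plotting the residuals; the helper fit_line and the
particular checks chosen here are assumptions of this sketch, not the book's procedure.

```python
def fit_line(x, y):
    """Least-squares constant b0 and slope b1 for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data, already ordered by the predictor.
x = [1, 2, 3, 4, 5]
y = [2, 3, 4, 7, 9]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# With a constant in the model, least-squares residuals sum to (about) zero.
residual_sum = sum(residuals)

# Independence: the Durbin-Watson statistic compares successive residuals.
# Values near 2 suggest no first-order correlation between the errors.
squared_steps = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
durbin_watson = squared_steps / sum(e ** 2 for e in residuals)

# Constant variance: compare the average squared residual for low and high X.
half = len(residuals) // 2
low_spread = sum(e ** 2 for e in residuals[:half]) / half
high_spread = sum(e ** 2 for e in residuals[-half:]) / half

print(residual_sum, durbin_watson, low_spread, high_spread)
```

With only five observations these numbers say very little; the sketch is meant to show what is being
compared, not to reach a conclusion.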
Examining the statistical significance of our model
- Test of coefficients:
Hypothesis 1: the intercept (constant term) value of … arose by sampling error, and the real
constant term appropriate to the population is zero. (We test whether the constant term
should be considered appropriate for our predictive model)
Hypothesis 2: the coefficient … indicated that an increase of one unit in … is associated with
an increase in the average of … by …
We can test whether this coefficient differs significantly from 0.
- Test of the variation explained (coefficient of determination)
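Both tests can be sketched in Python for a simple regression, using the standard formulas for the
standard errors of the coefficients and for the coefficient of determination. The data are made up
and the variable names are assumptions of this sketch.

```python
import math

# Hypothetical data: one predictor X1 and the dependent variable Y.
x = [1, 2, 3, 4, 5]
y = [2, 3, 4, 7, 9]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b1 = s_xy / s_xx           # slope
b0 = mean_y - b1 * mean_x  # constant (intercept)

# Sum of squared errors of the model, and of the average alone.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - mean_y) ** 2 for yi in y)

# Coefficient of determination: share of the variation explained by the model.
r_squared = 1 - sse / sst

# Standard errors of the coefficients (n - 2 degrees of freedom).
mse = sse / (n - 2)
se_b1 = math.sqrt(mse / s_xx)
se_b0 = math.sqrt(mse * (1 / n + mean_x ** 2 / s_xx))

# t statistics for the hypotheses that the true coefficient is zero.
t_b0 = b0 / se_b0
t_b1 = b1 / se_b1

print(r_squared, t_b0, t_b1)
```

Each t statistic is compared with the critical value of the t distribution with n - 2 degrees of
freedom. A large absolute value means the coefficient differs significantly from 0.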