2.1 Predictive Regression
Explanation vs prediction
The goal of scientific psychology is to understand human behaviour. Historically, this has meant being
able both to explain behaviour - that is, to accurately describe its causal underpinnings - and to predict
behaviour - that is, to accurately forecast behaviours that have not yet been observed.
In practice these two goals are rarely distinguished!
● It might seem that the best explanatory model is also the best predictive model
● But from a statistical point of view this is simply not true (see this lecture)
→ building the best explanation and building the best prediction are different tasks
Regression
The regression model $Y = f(X_1, X_2) = a + b_1 X_1 + b_2 X_2$ can be used for explanation or prediction.
● Explanation: how are the X’s related to Y?
We test the regression weights (betas) for significance to see which predictors explain
variance in the outcome variable.
● Prediction: given new X’s, what is the predicted value of Y and how accurate is that
prediction? → We try to predict as accurately as possible and are less interested in
which variables are important.
● In explanation you usually use the whole sample to fit the explanatory model, while in
prediction you usually split the data set, use one part to train the model, and use the
other part to check how well it predicts the values.
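The split described in the last bullet can be sketched in a few lines of Python. This is only an illustration and not part of the lecture material: the helper name split_train_test, the 50/50 split and the fixed seed are assumptions made for the example.

```python
import random

def split_train_test(n_cases, train_fraction=0.5, seed=0):
    """Randomly assign case indices 0..n_cases-1 to a training and a test part.

    Hypothetical helper for illustration; the 50/50 default is an assumption.
    """
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)          # random order, reproducible via seed
    n_train = int(round(n_cases * train_fraction))
    return sorted(indices[:n_train]), sorted(indices[n_train:])

train_idx, test_idx = split_train_test(10)
print("train:", train_idx)   # cases used to estimate the model
print("test: ", test_idx)    # cases held back to check the predictions
```

Every case ends up in exactly one of the two parts, so the model is never evaluated on a case it was trained on.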
Explanatory Regression
● Explanatory regression starts with a theory about the data. The regression model is a
translation of the theory into mathematical form.
○ For example: gender and neuroticism have an effect on depression.
● $\text{Depression}_i = 2 + 0.5 \cdot \text{gender}_i + 1.5 \cdot \text{neuroticism}_i$
● The hypotheses generated from the theory can be examined in terms of statistical tests on
the regression weights
● In explanatory regression it is important that the regression weights are estimated
accurately, i.e. they should be unbiased.
Given the data that you have, you try to explain the outcome variable as well as possible.
● The regression model itself is the object of interest.
● Explanatory regression heavily depends on assumptions
E.g. normality, independence, etc. (for prediction they are usually not very important)
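As a rough sketch of what "testing the regression weights" involves, the Python example below simulates data from the depression model above and computes estimates, standard errors and t-values by ordinary least squares. The data are simulated, not real: the sample size, the 0/1 coding of gender, the standardised neuroticism score and the error SD of 1 are all assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000                                         # assumed sample size

# Simulated predictors (assumptions: gender coded 0/1, neuroticism standardised)
gender = rng.integers(0, 2, size=n)
neuroticism = rng.normal(0.0, 1.0, size=n)

# Outcome generated from the example model, plus normal error with SD 1 (assumed)
depression = 2 + 0.5 * gender + 1.5 * neuroticism + rng.normal(0.0, 1.0, size=n)

# Design matrix with an intercept column; ordinary least squares estimates
X = np.column_stack([np.ones(n), gender, neuroticism])
beta, *_ = np.linalg.lstsq(X, depression, rcond=None)

# Standard errors from the residual variance and (X'X)^-1
residuals = depression - X @ beta
df = n - X.shape[1]
sigma2_hat = residuals @ residuals / df
se = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))
t_values = beta / se                             # compare against a t distribution with df

for name, b, s, t in zip(["intercept", "gender", "neuroticism"], beta, se, t_values):
    print(f"{name:12s} b = {b:6.3f}  SE = {s:5.3f}  t = {t:7.2f}")
```

The standard errors and t-values are only trustworthy if the assumptions hold, which is why explanatory regression depends on them so heavily.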
Funny use of “prediction” in psychology
● In psychology we often see papers with titles like
1. Impulsivity predicts problem gambling ...
2. Trait rumination predicts onset of Post-traumatic stress disorder ...
3. Predicting reading and mathematics from neural activity …
● Often the words explanation and prediction are used interchangeably.
● In genuine prediction (as the weatherman does) we try to predict certain variables as well
as possible, without particularly caring about which variables actually explain those
predictions. In explanation, by contrast, we look at what explains a certain score, i.e.
which variables have a significant beta weight for the outcome variable.
Predictive Regression
● Usually we split a data set into two parts: one is used to train the model (i.e. to
estimate the model and see which variables are good predictors) and the other to test the
model (does it predict the scores well enough?).
● Suppose we have data and obtain estimates. This is the training phase.
$\hat{y}_i = 2 + 0.5 X_{1i} + 1.5 X_{2i}$
● Further suppose we have a new observation with $X_1 = 2$ and $X_2 = 3$:
$\hat{y} = 2 + 0.5 \cdot 2 + 1.5 \cdot 3 = 7.5$
(so we focus on how close the predicted value 7.5 is to the observed value)
● Prediction focusses on the accuracy of the prediction. Therefore, we compare the predicted
value ($\hat{y}$) against the observed value ($y$). This is the testing phase.
● It is important that training and testing are performed on two different data sets. This provides
out-of-sample prediction accuracy.
● When we fit only one explanatory regression and use it to “predict” values in the same
sample, the $R^2$ value usually overestimates how well the model predicts new data
(overfitting), because the predictions are evaluated on the same data on which the model
was built. In that case you would need to use an adjusted $R^2$.
● More generally, we have a population in which the means of Y are given by a function of the
predictor variable(s) X: $Y = f(X) + \varepsilon$
● Often we collect data for a sample of n persons. These data, $(x_1, y_1), \ldots, (x_n, y_n)$,
are used to train a model: $y_i = \hat{f}(x_i) + \varepsilon_i$
● Suppose we have new observations $(x_0, y_0)$ from the population.
● Based on the model $\hat{f}$ that we estimated on the training data, we can make predictions
$\hat{f}(x_0)$ for the newly observed data.
● We can compare the predictions against the observations using the mean squared
prediction error (PE): $\mathrm{PE}(\hat{f}(x_0)) = E[(y_0 - \hat{f}(x_0))^2]$
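The training and testing phases above can be sketched in Python as follows. This is a minimal illustration on simulated data, not the lecture's own code: the sample size, the 50/50 split, the error SD of 1 and the use of the example coefficients 2, 0.5 and 1.5 as the "true" population values are assumptions. The mean squared error on the held-out half serves as an estimate of the prediction error PE.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                          # assumed sample size

# Simulated population model (assumption): Y = 2 + 0.5*X1 + 1.5*X2 + error, SD 1
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2 + 0.5 * x1 + 1.5 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# Random 50/50 split into training and test cases
order = rng.permutation(n)
train_idx, test_idx = order[: n // 2], order[n // 2 :]

# Training phase: estimate the regression weights on the training part only
beta, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)

# Testing phase: predict the held-out cases and compare with the observed values
y_hat_test = X[test_idx] @ beta
mse_test = np.mean((y[test_idx] - y_hat_test) ** 2)      # out-of-sample estimate of PE

# In-sample error on the training part, for comparison
mse_train = np.mean((y[train_idx] - X[train_idx] @ beta) ** 2)

print(f"in-sample MSE:     {mse_train:.3f}")
print(f"out-of-sample MSE: {mse_test:.3f}")
```

On average the in-sample error is too optimistic, because the weights were tuned to exactly those cases; in a single split the two numbers can still fall either way.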
Prediction error
● The prediction error decomposes into (important!)
○ bias: the difference between the average estimate $\hat{f}$ (over repeated samples) and the true $f$
○ variance: the variability of the estimated $\hat{f}$ across repeated samples
(this cannot be measured from a single fitted model. If you repeatedly sample data
and fit the model each time, the outcomes will differ, and they differ more for a
more complex model. So more complex models have larger variance.)
○ irreducible term: the variance of Y at a specific value of X, $\sigma^2$ (which you cannot reduce)
○ So the prediction error can be decomposed into those three components:
$\mathrm{PE}(\hat{f}(x_0)) = [\mathrm{Bias}(\hat{f}(x_0))]^2 + \mathrm{Var}(\hat{f}(x_0)) + \sigma^2$
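The decomposition can be checked numerically with a small simulation, sketched below in Python. Everything specific here is an assumption chosen for illustration and does not come from the lecture: the "true" function f = sin, the fixed design of 30 points on [-2, 2], the error SD of 0.5, the evaluation point x0 = 1 and the helper name bias_variance_at_x0. The model is refitted on many simulated training samples, which is what makes bias and variance observable.

```python
import numpy as np

def bias_variance_at_x0(degree, x0=1.0, sigma=0.5, n=30, reps=5000, seed=0):
    """Simulate bias^2, variance, sigma^2 and PE of a polynomial fit at x0.

    Hypothetical helper: true f = sin, fixed design, normal errors (all assumed).
    """
    rng = np.random.default_rng(seed)
    f = np.sin
    x = np.linspace(-2.0, 2.0, n)                # same x values in every sample
    X = np.vander(x, degree + 1)                 # polynomial design matrix

    # Each column of Y is one simulated training sample
    Y = f(x)[:, None] + rng.normal(0.0, sigma, size=(n, reps))
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)         # shape (degree + 1, reps)

    # f_hat(x0) for every training sample
    preds = (np.vander([x0], degree + 1) @ coef)[0]

    bias_sq = (preds.mean() - f(x0)) ** 2        # squared bias at x0
    variance = preds.var()                       # variance of f_hat(x0) over samples

    # New observations y0 at x0, independent of the training data
    y0 = f(x0) + rng.normal(0.0, sigma, size=reps)
    pe = np.mean((y0 - preds) ** 2)              # simulated prediction error
    return bias_sq, variance, sigma ** 2, pe

for degree in (1, 5):
    b2, var, s2, pe = bias_variance_at_x0(degree)
    print(f"degree {degree}: bias^2 = {b2:.4f}, variance = {var:.4f}, "
          f"sigma^2 = {s2:.4f}, sum = {b2 + var + s2:.4f}, simulated PE = {pe:.4f}")
```

In this setup the simple (degree 1) model should show the larger bias and the more complex (degree 5) model the larger variance, while in both cases the three components add up to approximately the simulated PE.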