Advanced Research Methods and Statistics – lectures
Lecture 1 – Introduction: Multiple linear regression
Week 1
Always critically review the way studies are performed
o Is there a representative sample?
o Are the measures or variables reliable?
o Is the analysis correct and are the results interpreted correctly?
Always critically consider alternative explanations for the statistical association
o Association is NOT causation
o Does the effect remain when additional variables are included?
Simple linear regression: involves 1 outcome (Y) and 1 predictor (X)
o Outcome = DV = dependent variable (e.g. IQ)
o Predictor = IV = independent variable (e.g. Birth order)
EQUATIONS ARE NEVER TESTED! Models and output are what matter -> equations of plots are
tested
The question is whether the model is a good way to describe the data and whether the predictor is
useful for predicting your outcome. 2 main things are evaluated:
1. The relevance of a predictor: to what extent does the model explain variation in the data?
-> can the predictor explain the outcome?
2. The B-value (slope of the line): the larger it is (in absolute value), the more relevant the
predictor. How important is my predictor for predicting the outcome?
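The two evaluation points above can be illustrated with a minimal sketch in plain Python. The data are made up for illustration (they are not from the lecture), and the function name `fit_simple_regression` is hypothetical. The sketch computes the least-squares intercept and slope (B) and the proportion of variance explained (R-squared).

```python
def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, plus R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope B = covariation of X and Y divided by variation in X
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (unexplained variation / total variation)
    predicted = [intercept + slope * xi for xi in x]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Made-up illustration data: birth order (X) and IQ (Y)
birth_order = [1, 2, 3, 4, 5]
iq = [110, 106, 104, 99, 96]

intercept, slope, r_squared = fit_simple_regression(birth_order, iq)
print(f"intercept = {intercept}, B = {slope}")  # intercept = 113.5, B = -3.5
print(f"R-squared = {r_squared:.4f}")
```

Here the slope answers "how much does the predicted outcome change per unit of X", while R-squared answers "how much of the variation in Y does the model account for".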
Multiple linear regression (MLR): examines a model where multiple predictors are included to check
their unique linear effect on Y
Things you need to know about MLR:
o The model (different trends)
o The types of variables in MLR
o MLR and hierarchical MLR
Hypotheses
Output
Model fit: R2, adjusted R2, R2-change
Regression coefficients: B and Beta (standardized B)
o Exploratory MLR (stepwise) vs. Confirmatory MLR (forced entry)
o Model assumptions important to MLR
The model
Outcome variable: y, because it is placed on the y-axis when you plot things
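Intercept: B0, the predicted outcome when all predictors are 0
Slope: B1 (B2, ... for further predictors), the change in the predicted outcome per unit of the predictor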
Residual: some error in the prediction
Observed outcome: the prediction based on the model plus some error in prediction
Y hat (Y with a little hat on top): the prediction! -> will probably not be exactly equal to the
observed outcome -> this is called the statistical model, e.g. MLR
Subscript i: denotes that each individual can have a different score
Terms without subscript i: parameters, which stay the same across all individuals
Additive linear model: multiple predictors, and we assume that their contributions are additive! (+, +) -> different
from e.g. models that include interaction effects
Main effects: X1 and X2; we look at a model in which both are included
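Written out for two predictors, the additive model described above could look as follows. The symbols are an assumption based on the B-notation used elsewhere in these notes: B0 for the intercept, B1 and B2 for the slopes, and e for the residual.

```latex
\begin{align*}
Y_i       &= B_0 + B_1 X_{1i} + B_2 X_{2i} + e_i  && \text{observed outcome} \\
\hat{Y}_i &= B_0 + B_1 X_{1i} + B_2 X_{2i}        && \text{predicted outcome} \\
e_i       &= Y_i - \hat{Y}_i                      && \text{residual}
\end{align*}
```

Only the terms with subscript i differ between individuals; B0, B1 and B2 are the same for everyone.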
Types of variables
Formal distinction in 4 measurement levels, in logical order (lowest to highest level of complexity)
o Nominal
o Ordinal
o Interval
o Ratio
For choice of analysis we usually distinguish:
o Nominal + ordinal: categorical or qualitative
o Interval + ratio: continuous or quantitative or numerical -> you are allowed to do computations
with these variables
Rule 1 in MLR: the outcome is always continuous AND the predictors are continuous!
o MLR was developed for the situation where all the variables are continuous
o One exception: if you want to include a categorical predictor, that’s possible, but you have to
use dummy variables
Dummy coding in MLR models: e.g., is gender a predictor of grade?
o Gender: create a dummy variable, e.g. 0 = male, 1 = female (ALWAYS a 1/0 variable!!)
More than two categories? Create more dummy variables!
o E.g., one to denote red (1) or not red (0)
o One to denote blue (1) or not blue (0)
o One to denote green (1) or not green (0)
o If all the dummies are 0, you know the case belongs to the remaining category -> reference group
(the group with 0's on all dummies)
For the reference group, the predicted score on the outcome is just the intercept -> the average on Y
for the reference group (0's on all dummies, so the 3 dummy terms disappear)
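Dummy coding can be sketched in plain Python as below. The data are made up for illustration, the helper names `dummy_code` and `fit_simple_regression` are hypothetical, and "yellow" is an assumed fourth color standing in for the unnamed reference group. With a single 0/1 dummy, the least-squares intercept equals the mean of the reference group and B equals the difference between the two group means.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per category, skipping the reference category."""
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


def fit_simple_regression(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up illustration data: is gender a predictor of grade?
gender = ["male", "female", "female", "male", "female", "male"]
grade = [6.0, 7.5, 8.0, 6.5, 7.0, 7.0]

# 0 = male (reference group), 1 = female
female = dummy_code(gender, reference="male")["female"]
intercept, b_female = fit_simple_regression(female, grade)

male_grades = [g for g, s in zip(grade, gender) if s == "male"]
male_mean = sum(male_grades) / len(male_grades)

print(female)                # [0, 1, 1, 0, 1, 0]
print(intercept, male_mean)  # the intercept is the mean of the reference group
print(b_female)              # difference between the female and male means

# Four categories need three dummies; the reference row has 0's everywhere
colors = ["red", "blue", "green", "yellow"]
color_dummies = dummy_code(colors, reference="yellow")
print(color_dummies)
```

A variable with k categories needs k - 1 dummies; the category left out becomes the reference group against which the others are compared.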
Hierarchical MLR
Output 1
For each model, the alternative hypothesis is HA: R2 (or R2-change) > 0
o R-squared change > 0 means that the additional predictors improve the model
For each predictor X within each model: HA: B1 is not 0 -> X has a unique effect within this model
Output 1: you can see 2 models. Always read the titles, columns and footnotes!
o In the model summary you can see R, R squared, adjusted R squared
R-squared: proportion of variance in the outcome variable explained by the model -> computed for
your sample
o Inferential statistics: using a sample to say something about the population
o The sample R-squared is not a very good estimate of the population R-squared: it is always a little bit too optimistic/high
More predictors, more optimistic! (Bias)
R: square root of R-squared. This is called the multiple correlation coefficient: the correlation between
observed Y's and predicted Y's (capital R to denote that it's a multiple correlation and not bivariate!)
Adjusted R-squared: somewhat smaller than the unadjusted R-squared.
o R-squared corrected for the bias of the sample gives the adjusted R-squared
o It is your best guess of the proportion of variance explained in the population!
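The relation between R, R-squared and adjusted R-squared can be sketched as follows. The sketch assumes the commonly used correction 1 - (1 - R2)(n - 1)/(n - k - 1), with n the sample size and k the number of predictors. The notes do not state which formula the output uses, and the numbers below are made up for illustration.

```python
import math


def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for a model with k predictors fitted on n cases."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical values: the same sample R-squared with more and more predictors
r_squared = 0.50
n = 20

multiple_r = math.sqrt(r_squared)  # R, the multiple correlation coefficient
print(f"R = {multiple_r:.3f}")

adjusted = {k: adjusted_r_squared(r_squared, n, k) for k in (1, 3, 6)}
for k, value in adjusted.items():
    print(f"k = {k}: adjusted R-squared = {value:.3f}")
```

For a fixed sample R-squared, the adjusted value drops as predictors are added, which matches the point that more predictors make the unadjusted R-squared more optimistic.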
R-squared change describes the difference between two models. For example, an R-squared change of
0.127 for model 2 is the difference between model 2 and model 1 (here a significant
improvement)
Model summary: shows what the addition of new variables does to the model. How do the models
compare to each other? Is it a significant addition?
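Whether an R-squared change is a significant addition is usually judged with an F test for nested models. The sketch below assumes the standard formula F = (R2-change / df1) / ((1 - R2 of the larger model) / df2). Only the R-squared change of 0.127 comes from the notes; the R-squared values of the two models, the sample size and the numbers of predictors are hypothetical.

```python
def r_squared_change_f(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the R-squared change between two nested models."""
    df1 = k_full - k_reduced  # number of predictors added in the larger model
    df2 = n - k_full - 1      # residual degrees of freedom of the larger model
    r2_change = r2_full - r2_reduced
    f_value = (r2_change / df1) / ((1 - r2_full) / df2)
    return r2_change, f_value, df1, df2


# Hypothetical setup: model 1 has 1 predictor, model 2 adds a second one
r2_change, f_value, df1, df2 = r_squared_change_f(
    r2_reduced=0.200, r2_full=0.327, n=100, k_reduced=1, k_full=2
)
print(f"R-squared change = {r2_change:.3f}")
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

The resulting F value is compared with the F distribution with df1 and df2 degrees of freedom; statistical software reports the corresponding p-value as "Sig. F Change" or similar.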