ARMS general part
Lectures, Seminars and Summary


Lecture 1: Multiple Linear Regression

It is important to critically review articles and the way studies are performed before accepting a
reported outcome. Things you can look at:
● Is it a representative sample? This is necessary for generalising the results - external
validity
● Are the variables measured in a reliable way? And do we really measure what we intend
to measure? - construct validity
● Correct analyses and correct interpretation of results? - statistical validity
● Critically consider alternative explanations for the statistical association! - internal validity
○ Association ≠ causation: if variables are related, that does not mean that one causes
the other; an alternative explanation may be possible.
○ Does the effect remain when additional variables are included?
Investigate this with multiple regression!

A simple linear regression model involves one outcome (Y) and one predictor (X). You assume
a linear relation between the predictor and the outcome.
○ Outcome = DV = dependent variable
○ Predictor = IV = independent variable

Yᵢ = b₀ + b₁Xᵢ + eᵢ

A multiple linear regression model extends the simple model by adding more predictors: it
involves one outcome and multiple predictors.

Yᵢ = b₀ + b₁X₁ᵢ + b₂X₂ᵢ + … + eᵢ

Two things are important to check in regression to decide whether you have a good statistical
model, i.e. whether the model describes the data well and whether the predictors are useful for
predicting the outcome. Two main things are evaluated (illustrated in the sketch after this list):
1. The amount of variance explained (R²), i.e. the size of the residuals: how well do the
predictors (X) explain the outcome variable (Y)? With large residuals, less variance is
explained.
→ Larger R²: the points lie close to the regression line.
→ Smaller R²: the points are more scattered.
2. The slope of the regression line (b₁): an increase of 1 unit on the X variable leads to an
increase/decrease of b₁ in the predicted outcome. If the slope is steep, the b-value is
relatively large and the X variable has a stronger effect on Y.
→ The slope is also called the regression coefficient.
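As a concrete illustration of these two checks (made-up data, not from the notes), the sketch below fits a straight line, computes R² from the residuals (R² = 1 − SS_residual / SS_total) and prints the slope b₁.

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 50)
y = 2.0 + 1.5 * x + rng.normal(0, 1, 50)   # simulated data with a true slope of 1.5

b1, b0 = np.polyfit(x, y, 1)               # least-squares slope and intercept
y_hat = b0 + b1 * x                        # predicted values
residuals = y - y_hat

ss_res = np.sum(residuals ** 2)            # unexplained (residual) variance
ss_tot = np.sum((y - y.mean()) ** 2)       # total variance of Y
r_squared = 1 - ss_res / ss_tot            # large residuals -> smaller R²

print(f"slope b1 = {b1:.2f}, R² = {r_squared:.2f}")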

Multiple linear regression (MLR) examines a model where multiple predictors are included to
check their unique linear effect on Y.
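As an illustration of such a model (not part of the lecture material), a minimal Python sketch using statsmodels; the variable names stress, support and wellbeing and the simulated data are assumptions made purely for the example.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 100
stress = rng.normal(5, 2, n)               # predictor X1
support = rng.normal(3, 1, n)              # predictor X2
wellbeing = 10 - 0.8 * stress + 0.5 * support + rng.normal(0, 1, n)  # outcome Y

# Multiple linear regression: Y = b0 + b1*X1 + b2*X2 + e
X = sm.add_constant(np.column_stack([stress, support]))  # adds the intercept column b0
fit = sm.OLS(wellbeing, X).fit()

print(fit.params)     # b0, b1, b2: each slope is the unique effect of that predictor
print(fit.rsquared)   # proportion of variance in Y explained by the model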

The model

The full equation can be shortened: the observed score is predicted by the model, but always
has some error (the residual), because the model will not predict perfectly.

Yᵢ = Ŷᵢ + eᵢ

So the observed outcome is a prediction based on the model plus some error in the prediction.
The predicted part is called the statistical model (multiple linear regression) and is denoted by Ŷ.

Ŷᵢ = b₀ + b₁X₁ᵢ + b₂X₂ᵢ + …

Every person has a different error (eᵢ) and as a result a different outcome (Yᵢ). The subscript i
belongs to the variables on which people vary; the terms without it are the model parameters:
they describe the relation for the whole group!
→ The multiple linear regression model is also called an additive linear model, because you are
adding multiple predictors (and the effect is additive, as you can see in the equation).
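As a tiny numeric illustration (invented numbers, not the lecture's example), the following sketch shows that each person's observed score equals the model's prediction plus that person's own error.

import numpy as np

b0, b1, b2 = 2.0, 0.5, -0.3          # model parameters: the same for everyone
x1 = np.array([4.0, 7.0, 5.5])       # person-specific predictor scores
x2 = np.array([1.0, 3.0, 2.0])
y_obs = np.array([3.9, 4.8, 4.2])    # observed outcomes

y_hat = b0 + b1 * x1 + b2 * x2       # predicted part (the statistical model)
e = y_obs - y_hat                    # residual: differs per person

print(np.allclose(y_obs, y_hat + e)) # True: observed = predicted + error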

Types of variables

What type of variables can you include in a multiple regression? (model assumption)
There are formal distinctions in 4 measurement levels: nominal, ordinal, interval and ratio. But
the most important distinction is nominal/ordinal vs. interval/ratio.
● Nominal and ordinal scores create categories (a.k.a. categorical or qualitative).
● Interval and ratio scores have numerical meaning (a.k.a. continuous, quantitative or
numerical).
Multiple linear regression always requires a continuous outcome, and the predictors also need
to be continuous.
→ Multiple linear regression was created for the situation where all the variables are continuous.
But categorical predictors can be included as dummy variables. A dummy variable is a
variable with only two possible values: 0 or 1. You can write the equation of the
multiple linear regression with the intercept, the regression coefficient and the dummy predictor.
For example:

Ŷ = b₀ + b₁·Dmale      (Dmale = 1 for males, 0 for females)

The coefficients have a clear interpretation, because they are multiplied by either 0 or 1. That
leaves the equation with either the intercept plus the regression coefficient (because it is
multiplied by 1) or only the intercept (because the regression coefficient is multiplied by 0).

b₀ can be interpreted, in this case, as the average grade of the females; b₀ + b₁ as the average
grade of the males. This means that b₁ denotes the difference between the predicted average
grades of males and females. So b₁ has another interpretation with dummy variables
(a difference between groups, instead of a slope per unit increase)!
Note! Don't treat a categorical variable with more than two categories as a single numerical
predictor in the regression equation. Instead, create multiple dummy variables, again coded 0
or 1: if your first dummy is zero, that term disappears (b₁·0); if your second dummy is zero,
that term disappears as well (b₂·0); and if you score one on one of the dummies, the rest is
automatically zero (because it cannot be that category anymore)!
The last category doesn't need a dummy variable: this is the reference group. If you
score zero on every dummy, you are automatically in the last category (only b₀ remains; the
interpretation of the intercept is the average on Y for the reference group). For example:

Ŷ = b₀ + b₁·Dred + b₂·Dblue + b₃·Dgreen      (yellow = reference group)



→ The category yellow does not appear in the equation, but if you score 0 on every dummy then
there is no red, no blue and no green, so it must be yellow (the reference group). A sketch of
dummy coding in Python follows below.
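A sketch of how dummy coding could look in practice, using hypothetical colour and grade data with pandas and statsmodels; the column names and values are assumptions for illustration only.

import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "colour": ["red", "blue", "green", "yellow", "red", "yellow"],
    "grade":  [6.5,    7.0,    5.5,     8.0,      6.0,   7.5],
})

# One dummy per category, except the reference group (here: yellow).
dummies = pd.get_dummies(df["colour"], prefix="d")[["d_red", "d_blue", "d_green"]]
X = sm.add_constant(dummies.astype(float))
fit = sm.OLS(df["grade"], X).fit()

# b0 = mean grade of the reference group (yellow);
# each dummy coefficient = difference between that group and yellow.
print(fit.params)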

MLR and hierarchical MLR

With a hierarchical multiple linear regression model you can test whether your first predictors are
good predictors (research question 1), and whether adding predictors improves the model
significantly and relevantly in explaining the outcome Y (research question 2); see the sketch
after the hypotheses below.
There are a lot of hypotheses you can test:
For each model (and research question 1):
● H₀: R² = 0 (the predictors of the model do not predict Y)
● Hₐ: R² > 0
For research question 2:
● H₀: R²-change = 0 (the additional predictors do not improve the model)
● Hₐ: R²-change > 0
For each predictor x within each model:
● H₀: b₁ = 0 (no unique effect of x₁ within this model)
● Hₐ: b₁ ≠ 0
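A rough sketch (with assumed variable names and simulated data) of how a hierarchical MLR and the R²-change test can be run outside SPSS; statsmodels' compare_f_test gives an F-test for the improvement of the larger model over the smaller one, comparable to the "R Square Change" output.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 120
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.6 * x1 + 0.4 * x2 + rng.normal(size=n)

X1 = sm.add_constant(np.column_stack([x1]))          # model 1: only x1
X2 = sm.add_constant(np.column_stack([x1, x2]))      # model 2: x1 and x2

m1 = sm.OLS(y, X1).fit()
m2 = sm.OLS(y, X2).fit()

r2_change = m2.rsquared - m1.rsquared                # improvement in fit
f_stat, p_value, df_diff = m2.compare_f_test(m1)     # H0: R²-change = 0

print(f"R² model 1 = {m1.rsquared:.3f}, R² model 2 = {m2.rsquared:.3f}")
print(f"R²-change = {r2_change:.3f}, F = {f_stat:.2f}, p = {p_value:.4f}")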

Output

→ Always read footnotes in the SPSS output!
R-values:
● R: multiple correlation coefficient; the correlation between the observed Y and the predicted Y.
● R²: proportion of variance of the outcome variable explained by the model; it is computed on
the sample and is not a good estimate of the population value (biased: the more predictors, the
higher it gets).
Inferential statistics means using the sample to say something more general; in that case use:
● Adjusted R²: proportion of explained variance corrected for the bias; used to say something
about the population.
● R² Change: improvement of fit compared to the previous model.
○ For the first model this is the same as R², because there is no previous model
to compare it to (only model zero).
○ For the second model it is the difference in R² compared to the previous model, with its
significance tested.
Regression coefficients:
● B: unstandardized coefficients; the relation/slope between the predictor and the outcome
within a model with x predictors (it changes when predictors are added or removed!); it includes
the scale of the variables that you are measuring.
→ Unique contribution of that predictor, given that the other predictors are part of the
model.
○ Controlling for other variables: the change in the predicted outcome for a change in
one variable, while the other variables are held fixed (the same for the whole group).
○ NOT the bivariate correlation: how X is related to Y (which ignores other variables).
● Beta: standardized coefficients; they show which predictor has the strongest contribution to
the outcome, because the scale of the predictor is removed by standardization (so they are
comparable). A sketch of how Beta can be computed follows below.
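As a sketch of the difference between B and Beta (made-up data, not from the SPSS output discussed above): standardized coefficients can be obtained by z-scoring the outcome and the predictors and refitting the model.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(50, 10, n)          # predictor on a large scale
x2 = rng.normal(0, 1, n)            # predictor on a small scale
y = 5 + 0.2 * x1 + 1.5 * x2 + rng.normal(0, 2, n)

def zscore(a):
    return (a - a.mean()) / a.std(ddof=1)

X_raw = sm.add_constant(np.column_stack([x1, x2]))
X_std = sm.add_constant(np.column_stack([zscore(x1), zscore(x2)]))

b_raw = sm.OLS(y, X_raw).fit().params          # B: depends on the variables' scales
beta = sm.OLS(zscore(y), X_std).fit().params   # Beta: scale-free, comparable

print("B    :", np.round(b_raw, 3))
print("Beta :", np.round(beta, 3))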

Exploration or theory evaluation

When you do research, you have to think carefully about which variables to include. Otherwise
there could be effects that don't make sense, because other variables are in play. By
adding them to your multiple linear regression model, you control for these variables and see
the unique effects.
When adding a lot of predictors to a multiple linear regression model, there are two ways of
doing this:
● Method enter (forced entry): based on theory, you include a selection of (or all) the
predictors in the MLR.
● Stepwise method: all predictors are explored for their contribution to predicting Y, and the
final model is based on the observed relations in the data set (a simplified sketch of this idea
follows below).
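To illustrate the idea behind the stepwise method, here is a simplified forward-selection sketch; note that the stopping rule used here (improvement in adjusted R²) is an assumption for illustration and not the exact criterion SPSS uses.

import numpy as np
import statsmodels.api as sm

def forward_selection(y, predictors):
    """predictors: dict mapping a name to a 1-D array of scores."""
    selected, remaining = [], list(predictors)
    best_adj_r2 = 0.0                       # adjusted R² of the intercept-only model
    while remaining:
        scores = {}
        for name in remaining:
            cols = [predictors[p] for p in selected + [name]]
            X = sm.add_constant(np.column_stack(cols))
            scores[name] = sm.OLS(y, X).fit().rsquared_adj
        best = max(scores, key=scores.get)
        if scores[best] <= best_adj_r2:     # no improvement left: stop
            break
        best_adj_r2 = scores[best]
        selected.append(best)
        remaining.remove(best)
    return selected, best_adj_r2

# Example use with simulated data (assumed, for demonstration only):
rng = np.random.default_rng(11)
x1, x2, x3 = rng.normal(size=(3, 80))
y = 0.7 * x1 + rng.normal(size=80)
print(forward_selection(y, {"x1": x1, "x2": x2, "x3": x3}))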

Model assumptions

Statistical inference is based on many assumptions. Serious violations lead to incorrect results,
such as wrong p-values or wrong confidence intervals. This is why you always have to check
whether your data set is fit for a multiple linear regression analysis.
→ The model assumptions are discussed in the Grasple lessons.
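The assumption checks themselves are covered in the Grasple lessons; purely as an illustration, here is a minimal sketch (with simulated data) of two common residual diagnostics for a fitted model: a residuals-vs-fitted plot and a normal Q-Q plot.

import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=80)
y = 1 + 0.5 * x + rng.normal(size=80)
fit = sm.OLS(y, sm.add_constant(x)).fit()

# Linearity / equal variance: residuals vs. fitted values should show no pattern.
plt.scatter(fit.fittedvalues, fit.resid)
plt.axhline(0, color="grey")
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.show()

# Normality of residuals: points should follow the straight line in the Q-Q plot.
stats.probplot(fit.resid, plot=plt)
plt.show()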
