1.1. Points of attention
Reliability, internal and external validity
• Operationalisation of theories and concepts
—> concepts/theories —> dimensions —> indicators
• Questionnaire design, measurement and scale construction
• Interpretation
—> correlation vs causality (think through)
—> synthesis and generalisation
1.2. Variable types and methods of analysis
• response variable (dependent variable) vs. explanatory (independent variable)
• manifest variable (measured directly) vs. latent variable (not directly measurable —> needs to be studied
on the basis of “help” variables)
Features of variables
• qualitative (colours) (non-numeric)
• quantitative (numeric) —> discrete (number of goals) or continuous (grades)
1.3. Types of variables: level of measurement
Measurement levels determine the possibilities of the analysis.
• Numeric information = high information density
• Ideal information = number that can take every value.
• Non-numeric information = more restrictive information
• Variables can take fewer values, so studying the relationship between levels is harder.
1.4. Linear regression: the model
• The conditional expectation of a continuous (dependent) variable Y is expressed as a linear function of
the explanatory variables X1, X2, …, Xm
• Where Xi stands for the values (Xi1, …, Xim) of the explanatory variables for observation i
• Specific observations deviate randomly from the expected value, so we add a random error term (ε) to the
model
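The formula itself is missing from these notes. In its usual textbook form, the model described above can be written as below. The notation is an assumption, not taken from the source: β0, …, βm are the regression coefficients and ε_i is the random error term of observation i.

```latex
\begin{align}
E(Y_i \mid X_{i1}, \dots, X_{im}) &= \beta_0 + \beta_1 X_{i1} + \dots + \beta_m X_{im} \\
Y_i &= \beta_0 + \beta_1 X_{i1} + \dots + \beta_m X_{im} + \varepsilon_i
\end{align}
```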
1.5. The model graphically
Linear regression: least squares method (OLS)
• The regression line is estimated with the help of the least squares method: take the line for which the sum of
squared residuals is as small as possible
• Minimise the sum of squared residuals
• Residual “e” is the difference between the observed and predicted value
• A perfect line would have a sum of squared residuals of 0, which means no deviation: the prediction is the
same as the actual observation. This is never the case in practice; the model is always a prediction.
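A minimal sketch of the least squares method for one explanatory variable, in plain Python. The data (hours studied and grades) and the function name `fit_line` are made up for illustration; the course itself works with SPSS.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: hours studied (x) and grade (y).
hours = [1, 2, 3, 4, 5]
grades = [2, 4, 5, 4, 5]

intercept, slope = fit_line(hours, grades)

# Residual e = observed value minus predicted value.
residuals = [g - (intercept + slope * h) for h, g in zip(hours, grades)]
sum_squared_residuals = sum(e ** 2 for e in residuals)

print(intercept, slope)          # approximately 2.2 and 0.6
print(sum_squared_residuals)     # approximately 2.4
```

For a least-squares line with an intercept, the plain (unsquared) residuals always add up to zero because positive and negative deviations cancel out. That is why the squared residuals are the quantity being minimised.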
1.6. R-squared
To determine whether the line is a good line, we need to understand variance. Variance = how far the
observations are from the mean (the average of the observations). The total variation
= sum of the squared differences between each data point and the mean.
• R-square (goodness-of-fit) measures how well the model fits the observations, the share of the variation
of Y that is explained by the model
• How much (%) is explained by the model
• The share of explained variation out of the total
• The higher the R-square the better
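As a sketch, R-square can be computed as 1 minus the unexplained share of the variation. The observed grades and the predictions (from a hypothetical fitted line y = 2.2 + 0.6x) are made-up illustration values, and `r_squared` is a helper name chosen here.

```python
def r_squared(observed, predicted):
    """Share of the total variation in the observations explained by the predictions."""
    mean_y = sum(observed) / len(observed)
    # Total variation: squared distances of the observations from their mean.
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    # Unexplained variation: squared residuals.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


grades = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]  # hypothetical predictions from y = 2.2 + 0.6x

r2 = r_squared(grades, predicted)
print(round(r2, 3))  # 0.6, i.e. 60% of the variation is explained
```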
1.7. Check model assumptions (4)
1. The sample consists of independent observations
Meaning that there is no relationship between the observations (one observation does not influence another).
This has to do with your research design.
2. Linear model is suitable
The relationship between the dependent and independent variable is linear.
—> check with a scatterplot in SPSS
3. The variance of the residuals is equal for all possible values of the explanatory variables
Constant variance or homoscedasticity.
—> The parameter is an average effect of adding 1 to the explanatory variable
(in the example graph the effect is higher at the end and lower at the beginning; the variance
increases there, as the dots get further away from the line)
4. The residuals are normally distributed
Check this with histograms. They have to be bell-shaped.
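Assumption 3 can be illustrated with a crude numerical check: fit a line, then compare the average squared residual for low and high values of x. This is only an illustration on made-up data, not a formal test, and the names `fit_line` and `variance_ratio` are chosen here. In practice you would inspect a residual plot.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data, already sorted by x, where the spread grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 1.9, 3.2, 3.8, 5.5, 5.0, 8.5, 6.5]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Split the residuals into the lower and upper half of x.
half = len(x) // 2
low_spread = sum(e ** 2 for e in residuals[:half]) / half
high_spread = sum(e ** 2 for e in residuals[half:]) / (len(x) - half)

# A ratio far above 1 suggests the variance is not constant (heteroscedasticity).
variance_ratio = high_spread / low_spread
print(variance_ratio)
```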
1.8. Outliers
• Observations that stick out from the rest
• Different from the ‘average’ observations
• Outliers can have a big influence on the data outcome
We need to know how these influence our model/parameters.
Tests
• box-plots: show you the extreme observations
• Scatterplots, to identify certain observations
• DF-beta
• Goodness-of-fit measure
• Cook's distance: a value > 1 indicates an influential observation
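A sketch of Cook's distance for a regression with one explanatory variable, using the standard formula based on residuals and leverage. The data are made up, with one deliberately deviating point, and `cooks_distances` is a name chosen here. SPSS reports this measure directly.

```python
def cooks_distances(x, y):
    """Cook's distance of every observation in a simple linear regression."""
    n = len(x)
    p = 2  # number of estimated parameters: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Estimated residual variance.
    s2 = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage: how far this x value lies from the mean of x.
        leverage = 1 / n + (xi - mean_x) ** 2 / sxx
        distances.append(e ** 2 / (p * s2) * leverage / (1 - leverage) ** 2)
    return distances


# Hypothetical data: the last point has an extreme x and breaks the trend.
x = [1, 2, 3, 4, 5, 10]
y = [1, 2, 3, 4, 5, 2]

distances = cooks_distances(x, y)
influential = [i for i, d in enumerate(distances) if d > 1]
print(influential)  # [5]: only the last observation exceeds the threshold of 1
```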
1.9. Test for Multicollinearity
When there are multiple explanatory variables, there can be a relation between these variables. If the
correlation between two explanatory variables is high, we have a problem (rule of thumb: r > 0.8 or 0.9).
Problem:
• Standard errors of regression coefficients increase —> untrustworthy coefficients
• Determining the relevance of individual explanatory variables becomes impossible
Rules of thumb for detecting
• VIF: variance inflation factor. A VIF > 10 indicates the presence of multicollinearity
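A sketch of the VIF for the simplest case of two explanatory variables. In general, VIF = 1 / (1 − R²), where R² comes from regressing one explanatory variable on all the others. With only two explanatory variables, that R² equals their squared correlation. The data and the name `vif_two_predictors` are made up for illustration.

```python
def vif_two_predictors(x1, x2):
    """VIF when there are exactly two explanatory variables: 1 / (1 - r^2)."""
    n = len(x1)
    mean1 = sum(x1) / n
    mean2 = sum(x2) / n
    s11 = sum((a - mean1) ** 2 for a in x1)
    s22 = sum((b - mean2) ** 2 for b in x2)
    s12 = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))
    r_squared = s12 ** 2 / (s11 * s22)  # squared correlation between x1 and x2
    return 1 / (1 - r_squared)


# Hypothetical explanatory variables: x2 is almost exactly twice x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

vif = vif_two_predictors(x1, x2)
print(vif > 10)  # True: far above the rule-of-thumb threshold
```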
1.10. Dummy variables
To include qualitative variables in regression
For women:
For men:
When to use dummy variables:
Independent variable               Use of dummy variable
continuous                         not necessary
ordinal                            not necessary if linear trend
dichotomous                        yes
nominal (more than 2 categories)   create help variables using dummies
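A sketch of how a nominal variable with more than two categories can be turned into dummies: one 0/1 help variable per category, except for a reference category that is coded as all zeros. The variable (colour), the choice of reference category and the name `make_dummies` are assumptions for illustration.

```python
def make_dummies(values, reference):
    """Create one 0/1 dummy per category, leaving out the reference category."""
    categories = sorted(set(values) - {reference})
    return {
        f"is_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical nominal variable with three categories.
colour = ["red", "blue", "green", "blue", "red"]

dummies = make_dummies(colour, reference="red")
print(dummies)
# {'is_blue': [0, 1, 0, 1, 0], 'is_green': [0, 0, 1, 0, 0]}
```

Three categories give two dummies. The coefficient of each dummy is then read as the difference from the reference category.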
1.11. Interaction term
• We speak of an interaction if the effect of an independent variable is influenced by a second
independent variable
• Example: the effect of study hours on grade is different for students with a higher level of prior
education than for students with a lower level of prior education.
• In the linear model an interaction term is added
• When the coefficient of the interaction term is significant, the regression lines are not parallel; we speak
of an interaction.
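A sketch of the study-hours example with made-up data. When the second variable is a 0/1 dummy (here: high vs. low prior education) and the model contains the dummy, the explanatory variable and their product, the interaction coefficient equals the difference between the slopes of the two groups' own regression lines. The sketch computes it that way. It does not test significance, which would need the standard error of that coefficient.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: study hours and grades for two groups of students.
hours_low, grades_low = [1, 2, 3, 4], [4, 5, 6, 7]      # lower prior education
hours_high, grades_high = [1, 2, 3, 4], [5, 7, 9, 11]   # higher prior education

_, slope_low = fit_line(hours_low, grades_low)
_, slope_high = fit_line(hours_high, grades_high)

# Difference in slopes = coefficient of the interaction term (hours x group dummy).
interaction = slope_high - slope_low
print(slope_low, slope_high, interaction)  # 1.0 2.0 1.0
```

An interaction of 0 would mean two parallel lines. Here an extra study hour adds one more grade point for the higher-education group than for the lower-education group.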