Lecture 1 - Introduction and Path Models
~ How to translate a theory into a statistical model, apply the model to research
data, and subsequently draw substantive conclusions based on the applied model.
Path model = a hypothesis about the underlying causal processes that explain the
observed correlation between two or more variables
Correlation is NOT causation!!!
- When variables are associated, it does not mean that there is a causal
relationship; there may also be a common cause → Spurious effect
A theory of the relation between the number of Facebook friends and subjective
well-being…
● Possible explanatory mechanisms
○ People with many friends experience a lot of social support, which leads
to a higher degree of well-being
○ Social support and well-being are both determined by
a. the degree to which someone has a positive self-image,
b. the degree to which someone has a realistic (honest) self-image
We can represent these hypotheses in a path diagram, and investigate to what
extent this model fits the data.
The goal of path analysis → Explain why variables correlate
● Investigation of explanatory mechanisms:
○ It is useful to know that variables correlate, for example, to predict risks
or to take preventive measures. But the more interesting question is:
why do the variables correlate? In particular, we want to distinguish sham
effects (spurious effects) from real effects
○ Important for theory development, and for the development of effective
interventions (therapies, education)
■ Ex: If you know that introverts are more susceptible to burnout,
you can pay extra attention to them. But if you don't know why
introverts are more susceptible, it becomes more difficult to really
do something about it
Experimental research = active manipulation of the independent variables and
random assignment to experimental conditions
Correlational research = studying associations between variables obtained from
surveys and field observations. Unlike in (quasi-)experimental research, no
manipulation has taken place.
Variables
● Variables are properties of the research units that you are interested in
and on which the research units vary.
● If there is no variation in a property, it is called a constant.
○ Notice that whether a property is a variable depends on the specific
research population envisaged.
■ If your target population is only girls, then gender is not a
variable in this, but a constant.
● Common mistake: confusing the values of a variable with the variable
itself:
→ Example: “rich” and “poor” are two values of the same variable
(income); they should not be modeled as separate variables.
→ The same confusion can sometimes be seen in the way hypotheses
are formulated.
● For example, it is incorrect to say: “High education is related to
income” → Better: “education is related to income” OR “highly
educated earn on average more than low educated”.
Basic descriptive statistics
● Mean
● Variance (a measure of dispersion; never negative)
● Standard deviation (a measure of dispersion)
● Covariance (a measure of linear association; cov > 0 positive association, cov
< 0 negative association) ~ Associations: negative, positive, or no association
● Correlation (standardized measure of linear association, ranging from -1 to
1) ~ How strong the association is
● Standardized scores (Z-scores; mean = 0 and SD = 1, so the variance is also
1)
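As a minimal sketch, the statistics listed above can be computed by hand in Python. The data are hypothetical scores made up for illustration (number of Facebook friends and a well-being rating for five people), and the helper names are not part of the lecture. The sample versions of the formulas (dividing by n - 1) are assumed.

```python
import math

def mean(xs):
    """Arithmetic mean."""
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def variance(xs):
    """Variance is the covariance of a variable with itself; never negative."""
    return covariance(xs, xs)

def sd(xs):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(xs))

def correlation(xs, ys):
    """Covariance rescaled by both standard deviations."""
    return covariance(xs, ys) / (sd(xs) * sd(ys))

def z_scores(xs):
    """Standardized scores: subtract the mean, divide by the SD."""
    m, s = mean(xs), sd(xs)
    return [(x - m) / s for x in xs]

# Hypothetical scores for five people (illustration only, not real data).
friends = [50, 120, 200, 310, 420]
wellbeing = [4.0, 5.5, 5.0, 7.0, 7.5]

z_friends = z_scores(friends)
z_wellbeing = z_scores(wellbeing)

print("covariance: ", covariance(friends, wellbeing))
print("correlation:", correlation(friends, wellbeing))
print("mean and variance of z-scores:", mean(z_friends), variance(z_friends))
print("covariance of z-scores:", covariance(z_friends, z_wellbeing))
```

The covariance of the Z-scores equals the correlation of the raw scores, which is why the correlation is called a standardized measure of association.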
A hypothesis is a statement about the supposed relationship between two (or
more) variables. It can be a causal hypothesis or a correlational hypothesis.
The building blocks of the path model
● Each path model can be broken down into fundamental relations.
● The basic relationships provide possible explanations for the observed
covariance/correlation between variables
○ The five fundamental relations include:
1. Direct effect
2. Indirect effect
3. Spurious relation
4. Unknown effect
5. Reciprocal effect
The direct causal effect
● Hypothesis: A change in X causes a change in Y (X → Y)
● The reverse does not hold! A change in Y does not affect X (!)
● According to this hypothesis, X is the cause, and Y is the effect
The indirect effect (still a causal hypothesis, but with a third variable)
● Causal Hypotheses: A change in X directly causes a change in M (X→M),
and a change in M directly causes a change in Y (M→Y)
● Result: A change in X causes a change in Y, but this effect flows indirectly
through M. So, variable X has an indirect impact on Y.
● Variable M is called the mediator, it mediates the relationship between X and
Y.
● Variable M is also called the intervening variable.
~ Example of an indirect effect
● Hypothesis 1: The more negative thoughts, the less well people (on average)
take care of themselves.
● Hypothesis 2: The less well people take care of themselves, the less good
(on average) their health is.
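The indirect effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, the effect sizes a and b are invented, all relations are linear, and X affects Y only through M. Here X stands for negative thoughts, M for self-care, and Y for health.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs: cov(x, y) / var(x)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 20000
a = -0.6  # assumed effect X -> M: more negative thoughts, less self-care
b = 0.5   # assumed effect M -> Y: less self-care, worse health

# Simulate the chain X -> M -> Y; Y never depends on X directly.
x = [rng.gauss(0, 1) for _ in range(n)]
m = [a * xi + rng.gauss(0, 1) for xi in x]
y = [b * mi + rng.gauss(0, 1) for mi in m]

a_hat = slope(x, m)   # estimated X -> M
b_hat = slope(m, y)   # estimated M -> Y
total = slope(x, y)   # estimated overall effect of X on Y

print("a_hat:", a_hat)
print("b_hat:", b_hat)
print("a_hat * b_hat:", a_hat * b_hat)
print("slope of Y on X:", total)
```

In this setup the slope of Y on X comes out close to the product of the two path coefficients (a × b), even though X never enters the equation for Y directly.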
Spurious relation
● We assume a direct effect of Z on X and a direct effect of Z on Y
○ Variable Z is a common cause of both X and Y.
● Because X and Y have the same common cause, there is an association
between the two variables.
~ But a change in X has no effect on Y, nor vice versa. So it looks as if one
affects the other, but this is not the case
→ Spurious relation between X and Y
● Variable Z is often referred to as a confounder
● Correlational research is often aimed at detecting or excluding possible
confounders
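A spurious relation can also be shown with simulated data. The sketch below assumes invented effect sizes (0.7 for both Z → X and Z → Y) and linear relations. It uses the first-order partial correlation as one common way to control for Z; that technique is not described in these notes and is used here only for illustration.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def correlation(xs, ys):
    """Pearson correlation between two lists of equal length."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(2)  # fixed seed so the run is reproducible
n = 20000

# Z is the common cause; X and Y never influence each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.7 * zi + rng.gauss(0, 1) for zi in z]  # assumed effect Z -> X
y = [0.7 * zi + rng.gauss(0, 1) for zi in z]  # assumed effect Z -> Y

r_xy = correlation(x, y)
r_xz = correlation(x, z)
r_yz = correlation(y, z)

# Partial correlation of X and Y, controlling for Z.
partial_xy_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print("correlation of X and Y:", r_xy)
print("partial correlation given Z:", partial_xy_z)
```

X and Y are clearly correlated, yet the partial correlation is close to zero once Z is taken into account: the association is produced entirely by the common cause.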
~ Example of a spurious relationship
● Research hypothesis: “Playing video games
increases aggression”