QMAE 2019-2020
Intro Week 1
Economist’s goal: To infer that one variable has a causal effect on another variable.
Ceteris paribus: Other (relevant) things being equal.
Causality: If we succeed in holding all relevant factors fixed, then we can find the causal link between the two variables.
E.g.: What is the effect of medical care on mortality, holding other relevant factors fixed?
Simple regression model
Two variables, y and x: Interested in “explaining y in terms of x” or “how y varies with changes in x”.
E.g.: community crime and police officers; wages and education.
Assumptions for unbiasedness of OLS:
1. MLR1: Linear in parameters: In the population model, the dependent variable, y, is related to the independent variable, x, and the error, u, as $y = \beta_0 + \beta_1 x + u$
2. MLR2: Random sampling: We have a random sample of size n, following
the population model
3. MLR3: Sample variation in the explanatory variable: The sample
outcomes on x are not all the same value
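4. MLR4: Zero conditional mean: The error u has an expected value of zero given any value of the explanatory variable, $E(u \mid x) = 0$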
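The four assumptions above can be illustrated with a small simulation. This is a minimal sketch, not part of the original notes: the true parameters (1.0 and 2.0), the sample size and the helper name `ols_simple` are all hypothetical. The error is drawn independently of x, so the zero conditional mean assumption holds by construction, and the function refuses to run when x has no sample variation.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Total sample variation in x; must be positive (MLR3).
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    if sst_x == 0:
        raise ValueError("no sample variation in x, slope is not defined")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    intercept = y_bar - slope * x_bar
    return intercept, slope


rng = random.Random(0)
BETA0, BETA1 = 1.0, 2.0                       # hypothetical true parameters
x = [rng.gauss(0, 1) for _ in range(5000)]    # random sample (MLR2)
u = [rng.gauss(0, 1) for _ in range(5000)]    # independent of x, so E(u|x) = 0 (MLR4)
y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]  # linear in parameters (MLR1)

b0_hat, b1_hat = ols_simple(x, y)
print(b0_hat, b1_hat)
```

With a large sample the estimates should land close to the hypothetical true values; unbiasedness itself is a statement about the average over many such samples.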
Multiple regression model
MLR vs SLR: It is difficult to draw ceteris paribus conclusions using SLR.
MLR allows us to control for many other factors that simultaneously affect the dependent variable (and gives better predictions).
Assumptions: Same as SLR.
Zooming in on a few:
MLR 1 : Linearity in parameters:
NOT POSSIBLE: a model that is nonlinear in the parameters.
However, the assumption is linearity in parameters! So there can be nonlinearities in the variables, such as logarithms or squared terms.
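The distinction can be made concrete with a few textbook-style equations. The specific examples in the original notes were lost, so the ones below are illustrative assumptions, not a reconstruction.

```latex
% Not allowed under MLR1: nonlinear in the parameters
y = \beta_0 + x_1^{\beta_1} + u
\qquad
y = \frac{1}{\beta_0 + \beta_1 x_1} + u

% Allowed under MLR1: linear in the parameters, nonlinear in the variables
y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + u
\qquad
\log(y) = \beta_0 + \beta_1 \log(x_1) + u
```

In the allowed cases the transformed variables ($x_1^2$, $\log(x_1)$, $\log(y)$) are simply treated as new variables, and each $\beta_j$ still enters additively and to the first power.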
MLR 3: Sample variation in the explanatory variables (= no perfect collinearity)
In the sample (and therefore in the population), none of the independent variables
is constant, and there are no exact linear relationships among the independent
variables
That is, there is no perfect linear relationship between the explanatory variables.
Examples of perfect collinearity:
If there is perfect collinearity, the OLS estimator cannot be computed.
Stata drops one variable automatically/arbitrarily and then estimates a model that does not suffer from this problem.
But it may not be the variable you would prefer to drop, so (i) start by defining the model properly and, only then, (ii) estimate it.
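Why the estimator breaks down can be sketched numerically: OLS needs the matrix X'X to be invertible, and an exact linear relationship among the regressors makes its determinant zero. The data and variable names below (income in thousands of euros, the same income in euros, age) are hypothetical and chosen only for illustration.

```python
def gram(columns):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


const = [1, 1, 1, 1, 1]                    # intercept column
income_k = [20, 35, 50, 42, 61]            # hypothetical income, thousands of euros
income = [1000 * v for v in income_k]      # same income in euros: exact multiple
age = [30, 45, 28, 52, 39]                 # hypothetical, not a linear function of income_k

# Same information entered twice: X'X is singular, OLS has no unique solution.
det_collinear = det3(gram([const, income_k, income]))
# Regressors that are not exact linear combinations: X'X is invertible.
det_ok = det3(gram([const, income_k, age]))

print(det_collinear, det_ok)
```

Integer data are used so that the singular case gives a determinant of exactly zero rather than a tiny floating-point number.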
Binary (or dummy) variables
Are house prices in Rotterdam different from those in the rest of the Netherlands, ceteris paribus?
Variable Rotterdam: 1 if Rotterdam, 0 if other
Variable other: 1 if other, 0 if Rotterdam
We can’t include both Rotterdam and other in the same equation (together with an intercept): Rotterdam + other = 1 for every observation, which is perfect collinearity.
There are two regression lines: one represents Rotterdam and the other represents the rest of the Netherlands.
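The two lines can be written out explicitly. This is an illustrative sketch rather than the equation from the original notes: the regressor income and the symbols $\beta_0$, $\delta_0$, $\beta_1$ are assumptions.

```latex
\begin{align*}
\text{price} &= \beta_0 + \delta_0\,\text{Rotterdam} + \beta_1\,\text{income} + u \\
\text{Rotterdam} = 0:\quad E(\text{price} \mid \text{income}) &= \beta_0 + \beta_1\,\text{income} \\
\text{Rotterdam} = 1:\quad E(\text{price} \mid \text{income}) &= (\beta_0 + \delta_0) + \beta_1\,\text{income}
\end{align*}
```

Under this specification the two lines are parallel, and $\delta_0$ is the ceteris paribus difference in house prices between Rotterdam and the reference group.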
Interaction: The coefficient on the interaction term allows us to test whether the effect of income is the same in Rotterdam as in other regions, ceteris paribus.
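What the interaction coefficient measures can be sketched with simulated data. The model assumed here is price = β0 + δ0·Rotterdam + β1·income + δ1·Rotterdam·income + u, and every number in the data-generating process is hypothetical. Because this model is fully interacted, its OLS coefficients coincide with those from two separate regressions, one per group, so δ1 is estimated by the difference between the two income slopes.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    return y_bar - slope * x_bar, slope


rng = random.Random(1)
groups = {0: ([], []), 1: ([], [])}   # dummy value -> (incomes, prices)
for _ in range(4000):
    rotterdam = 1 if rng.random() < 0.5 else 0
    income = rng.uniform(20, 80)
    # Hypothetical truth: beta0=100, delta0=30, beta1=2.0, delta1=0.5
    price = (100 + 30 * rotterdam + 2.0 * income
             + 0.5 * rotterdam * income + rng.gauss(0, 5))
    groups[rotterdam][0].append(income)
    groups[rotterdam][1].append(price)

b0_other, b1_other = ols_simple(*groups[0])   # line for the rest of the country
b0_rdam, b1_rdam = ols_simple(*groups[1])     # line for Rotterdam

delta0_hat = b0_rdam - b0_other   # estimated intercept shift
delta1_hat = b1_rdam - b1_other   # estimated interaction coefficient
print(delta0_hat, delta1_hat)
```

The sketch only recovers point estimates; testing whether δ1 equals zero would additionally need its standard error.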
Categorical variables
Perc_privat4 is the reference category
Intercept:
Inference
What does it mean? Testing hypotheses about a single population parameter.
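For reference, the standard textbook form of such a test is sketched below. The formula in the original notes was lost, so the notation ($\beta_j$, $a_j$, $k$ regressors) is an assumption.

```latex
H_0:\ \beta_j = a_j
\qquad
t = \frac{\hat{\beta}_j - a_j}{\operatorname{se}(\hat{\beta}_j)}
\qquad
t \sim t_{n-k-1} \ \text{under } H_0
```

The most common case is $a_j = 0$, which asks whether $x_j$ has any effect on $y$ once the other explanatory variables are held fixed.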
Two additional assumptions:
5. MLR 5: Homoskedasticity: The error u has the same variance given any value of the explanatory variables.
If this assumption does not hold, we have heteroskedasticity. Then:
OLS estimates are still unbiased but no longer efficient.
The usual OLS standard errors are incorrect, so t-statistics and confidence intervals are no longer valid.
Standard errors, t-statistics, confidence intervals and F-statistics can easily be adjusted.
ALWAYS use heteroskedasticity-robust standard errors.
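The consequences listed above can be sketched for the simple regression case. The code compares the usual standard error of the slope with the basic heteroskedasticity-robust (White, HC0) version. The data-generating process is hypothetical: the error standard deviation is set equal to |x|, which violates MLR 5 by construction.

```python
import math
import random


def slope_with_ses(x, y):
    """Return (slope, usual SE, heteroskedasticity-robust HC0 SE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sst_x = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sst_x
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Usual formula: valid only under homoskedasticity.
    sigma2_hat = sum(r * r for r in resid) / (n - 2)
    se_usual = math.sqrt(sigma2_hat / sst_x)
    # Robust formula: weights each squared residual by (x_i - x_bar)^2.
    se_robust = math.sqrt(sum(d * d * r * r for d, r in zip(dx, resid))) / sst_x
    return slope, se_usual, se_robust


rng = random.Random(2)
x = [rng.gauss(0, 1) for _ in range(5000)]
# Hypothetical heteroskedastic error: sd(u | x) = |x|, but E(u | x) = 0 still holds.
y = [1.0 + 2.0 * xi + xi * rng.gauss(0, 1) for xi in x]

slope, se_usual, se_robust = slope_with_ses(x, y)
print(slope, se_usual, se_robust)
```

In this setup the slope estimate stays close to its true value of 2.0, while the usual standard error understates the uncertainty relative to the robust one. In Stata, robust standard errors are requested with the `vce(robust)` option of `regress`; Stata applies a small-sample correction, so its numbers would differ slightly from the HC0 formula used here.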