Econometrics Theory (from lectures & book)
Chapter 1
Types of questions for tests
- Quantitative questions answered by quantitative answers
For example: by how much will US GDP grow next year? How much do cigarette taxes
reduce smoking?
All such questions require a numerical answer; the multiple regression model provides a
mathematical way to quantify how a change in one variable affects another variable.
- Causality
A specific action (e.g., applying fertilizer) leads to a specific, measurable consequence (more
tomatoes).
- Randomized controlled experiment
There is both a control group that receives no treatment (no fertilizer) and a treatment
group that receives the treatment (100g/m2 of fertilizer).
randomized in the sense that the treatment is assigned randomly
- Causal effect
Effect on an outcome of a given action or treatment, as measured in an ideal randomized
controlled experiment.
Data sources and types
Data Sources:
- Experimental data:
Experiments designed to evaluate a treatment or policy or to investigate a causal effect.
- Observational data:
Data obtained by observing actual behavior outside of an experimental setting (challenging
when real-world data are used to estimate causal effects).
Data types:
- Cross-sectional data
Data on different entities (workers, consumers, firms, governmental units, etc.) for a single
time period is called cross-sectional data.
with cross-sectional data we can learn about relationships among variables by studying
differences across people, firms, or other economic entities during a single time period.
- Time series data
Data for a single entity collected at multiple time periods.
by tracking a single entity over time, time series data can be used to study the evolution
of variables over time and to forecast future values of those variables.
- Panel data
Data for multiple entities in which each entity is observed at two or more time periods (also
known as longitudinal data).
panel data can be used to learn about economic relationships from the experiences of the
many different entities in the data sets and from the evolution over time of the variables for
each entity.
Chapter 4
The OLS model
𝑌𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝑢𝑖
Usually the intercept 𝛽0 and the slope 𝛽1 of the population regression line are unknown, and
therefore we must use data to estimate them.
The most common way is to choose the line that produces the “least squares” fit to this data
ordinary least squares estimator:
- OLS regression line: 𝑌̂𝑖 = 𝛽̂0 + 𝛽̂1 𝑋𝑖 (fitted value), where 𝑌̂𝑖 = predicted value of 𝑌𝑖
- Vertical deviations: 𝑢̂𝑖 = 𝑌𝑖 − 𝑌̂𝑖 = 𝑌𝑖 − (𝛽̂0 + 𝛽̂1 𝑋𝑖 ) (residual)
- Minimizing OLS: the estimators 𝛽̂0 and 𝛽̂1 are chosen to minimize
𝑆𝑆𝑅 = ∑𝑛𝑖=1 𝑢̂𝑖 2 = ∑𝑛𝑖=1 (𝑌𝑖 − 𝛽̂0 − 𝛽̂1 𝑋𝑖 )2
The estimators of the intercept and slope that minimize the sum of squared mistakes are
called the OLS estimators of 𝛽0 and 𝛽1 .
The OLS estimator of 𝛽0 is 𝛽̂0
The OLS estimator of 𝛽1 is 𝛽̂1
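As a sketch, the OLS estimators above have a closed-form solution: the slope is the sample covariance of X and Y divided by the sample variance of X, and the intercept follows from the sample means. The data below are hypothetical, purely for illustration:

```python
import numpy as np

# Hypothetical data (illustrative numbers only)
X = np.array([18.0, 20.0, 22.0, 24.0, 26.0, 28.0])
Y = np.array([680.0, 675.0, 670.0, 662.0, 655.0, 650.0])

# Closed-form OLS estimators that minimize the sum of squared residuals:
# beta1_hat = sample covariance(X, Y) / sample variance(X)
# beta0_hat = Ybar - beta1_hat * Xbar
beta1_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
beta0_hat = Y.mean() - beta1_hat * X.mean()

Y_hat = beta0_hat + beta1_hat * X   # fitted values (the OLS regression line)
residuals = Y - Y_hat               # u_hat_i

print(beta0_hat, beta1_hat)
```

A useful sanity check: the residuals of an OLS fit with an intercept always sum to zero.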
Why use the OLS estimator?
- Optimal statistical properties (under certain theoretical assumptions)
- BLUE: best linear unbiased estimator (not estimate! Watch out in the MC!) (under the
Gauss-Markov assumptions)
- Recall: an estimator is random and therefore has an expectation, a population variance, and a
population distribution; clearly distinguish population and sample parameters.
Measures of fit (SER, R2)
1) Standard error of the regression (SER)
SER is an estimator of the standard deviation of the regression error ui.
A measure of the spread of the observations around the regression line, measured in the
units of the dependent variable (magnitude of a typical regression error).
Since the regression errors u1, … , un are unobserved, the SER is computed using their
sample counterparts, the OLS residuals û1, … , ûn.
𝑆𝐸𝑅 = 𝑠𝑢̂ = √𝑠𝑢̂ 2 where 𝑠𝑢̂ 2 = 1/(𝑛 − 2) ∑𝑛𝑖=1 𝑢̂𝑖 2 = 𝑆𝑆𝑅/(𝑛 − 2)
Explanations:
no mean is subtracted in 𝑠𝑢̂ 2 since the sample average of the residuals is 0 (proven in Appendix 4.3)
𝑛 − 2 since 2 coefficients are estimated (𝛽0 and 𝛽1 )
2) Regression R2
The fraction of the sample variance of Yi explained by (or predicted by) Xi
𝑌𝑖 = 𝑌̂𝑖 + 𝑢̂𝑖
R2 is the ratio of the sample variance of 𝑌̂𝑖 to the sample variance of 𝑌𝑖 .
mathematically the R2 can be written as the ratio of the explained sum of squares to the
total sum of squares.
The explained sum of squares (ESS) is the sum of squared deviations of the predicted value
𝑌̂𝑖 from its average: 𝐸𝑆𝑆 = ∑𝑛𝑖=1(𝑌̂𝑖 − 𝑌̅ )2
The total sum of squares (TSS) is the sum of squared deviations of 𝑌𝑖 from its average:
𝑇𝑆𝑆 = ∑𝑛𝑖=1(𝑌𝑖 − 𝑌̅ )2
The R2 is the ratio of the ESS to the TSS: 𝑅2 = 𝐸𝑆𝑆/𝑇𝑆𝑆
Plus, since 𝑇𝑆𝑆 = 𝑆𝑆𝑅 + 𝐸𝑆𝑆: 𝑅2 = 1 − 𝑆𝑆𝑅/𝑇𝑆𝑆 = 𝑣𝑎𝑟(𝑌̂𝑖 )/𝑣𝑎𝑟(𝑌𝑖 )
And: 0 ≤ 𝑅2 ≤ 1. The higher the 𝑅2 , the better the fit.
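The decomposition TSS = ESS + SSR and both expressions for R2 can be checked numerically; the data below are hypothetical and the fit uses NumPy's least-squares polynomial fit:

```python
import numpy as np

# Hypothetical data (illustrative numbers only)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

beta1, beta0 = np.polyfit(X, Y, 1)   # OLS slope and intercept
Y_hat = beta0 + beta1 * X            # predicted values

ESS = np.sum((Y_hat - Y.mean()) ** 2)   # explained sum of squares
TSS = np.sum((Y - Y.mean()) ** 2)       # total sum of squares
SSR = np.sum((Y - Y_hat) ** 2)          # sum of squared residuals

R2 = ESS / TSS   # equivalently: 1 - SSR / TSS
print(R2)
```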
The least squares assumptions
1) Assumption 1: The conditional Distribution of ui given Xi has a mean of zero
ui is a random variable with 𝐸(𝑢𝑖 |𝑋𝑖 ) = 0
▪ In a randomized controlled experiment:
Subjects are randomly assigned to the treatment group (X=1) or to the control group
(X=0). The random assignment is typically done using a computer program that uses
no information about the subject, ensuring that X is distributed independently of all
personal characteristics of the subject.
random assignment makes X and u independent, which in turn implies that the
conditional mean of u given X is zero.
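A small simulation can illustrate this: when assignment to treatment ignores the subject's characteristics, the average of u is close to zero in both groups. The setup below is a minimal sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u = rng.normal(0.0, 1.0, n)    # unobserved personal characteristics (the error)
X = rng.integers(0, 2, n)      # coin-flip assignment that uses no info about u

# Because X is assigned independently of u, the conditional mean of u
# given X is (approximately) zero in both the treatment and control group.
print(u[X == 1].mean(), u[X == 0].mean())
```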