Econometrics lectures
Week 1
Ch. 1 & Ch. 2: Introduction and simple linear regression
Econometrics:
• Empirical estimation of economic relationships.
• Testing economic theories.
• Making economic predictions.
• Evaluating government and business policy.
Characteristics:
• Non-experimental (observational) data.
• Regression analysis (in a wide sense) as the major tool.
Steps in econometric analysis:
1. Economic model:
Specify the theoretical dependence between the variables of interest.
• Example: wage function W = f (Educ)
o W is hourly wages
o Educ is years of education
o f is some function
2. Econometric model:
Specify the functional form of f.
• For example: W = β0 + β1Educ + u
o β0 and β1 are parameters of the model
o u is random error or disturbance term
3. Hypothesis: state hypotheses of interest for unknown parameters.
4. Collect or choose data.
5. Apply econometric methods to estimate parameters.
6. Use estimated parameters to make predictions.
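The six steps above can be sketched end to end on simulated data. This is a minimal illustration rather than the lecture's own code: the parameter values (β0 = 5, β1 = 1.5), the sample size, and the distributions of Educ and u are all assumptions chosen for the demo, and NumPy's `lstsq` is used as a generic least squares solver.

```python
import numpy as np

# Assumed "true" parameters of the simulated wage equation W = b0 + b1*Educ + u.
TRUE_B0, TRUE_B1 = 5.0, 1.5

def simulate_wages(n, seed=0):
    """Step 4 stand-in: draw a hypothetical sample instead of collecting data."""
    rng = np.random.default_rng(seed)
    educ = rng.integers(8, 21, size=n).astype(float)  # years of education, 8..20
    u = rng.normal(0.0, 2.0, size=n)                  # error term with E[u] = 0
    wage = TRUE_B0 + TRUE_B1 * educ + u
    return educ, wage

def estimate(educ, wage):
    """Step 5: least squares fit of wage on a constant and educ."""
    X = np.column_stack([np.ones_like(educ), educ])
    coef, *_ = np.linalg.lstsq(X, wage, rcond=None)
    return coef[0], coef[1]

educ, wage = simulate_wages(5000)
b0_hat, b1_hat = estimate(educ, wage)

# Step 6: predicted hourly wage for 12 years of education.
wage_hat_12 = b0_hat + b1_hat * 12
print(f"b0_hat={b0_hat:.2f}, b1_hat={b1_hat:.2f}, prediction={wage_hat_12:.2f}")
```

Because the data are simulated, the estimates can be compared with the assumed true values; with real data the true parameters stay unknown.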
Correlation: (linear) association between two random variables x and y, measured by Corr(x, y).
Causality: cause and effect, x → y.
Endogeneity: people with higher ability choose a higher education level and also have higher wages.
• Education (educ) is endogenous.
• Unobserved variable bias: ability is not observed, so it ends up in the error term.
Confounding factors: experience, marital status, having children.
• Observed variable bias: these factors can be measured and controlled for.
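The ability example can be made concrete with a small simulation. This is a sketch under assumed values: the coefficients (a true education effect of 1, an ability effect of 3) and the way ability drives education are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000

# Assumed data-generating process (illustrative values only):
# ability raises both education and wages.
ability = rng.normal(0.0, 1.0, size=n)
educ = 12.0 + 2.0 * ability + rng.normal(0.0, 1.0, size=n)
wage = 5.0 + 1.0 * educ + 3.0 * ability + rng.normal(0.0, 1.0, size=n)

def ols(y, *regressors):
    """Least squares coefficients of y on a constant plus the given regressors."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

short = ols(wage, educ)          # ability omitted: it sits in the error term
full = ols(wage, educ, ability)  # ability observed and controlled for

print(f"slope on educ, ability omitted:  {short[1]:.2f}")
print(f"slope on educ, ability included: {full[1]:.2f}")
```

Under these assumed values the slope with ability omitted converges to 1 + 3 · Cov(educ, ability) / Var(educ) = 1 + 3 · 2/5 = 2.2, well above the true effect of 1. Once ability is included, the slope on educ is close to 1.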
Types of data in empirical social science research:
• experimental:
o Data generated in laboratory (or quasi-laboratory) setting.
o Experimental economics: experiments in 'labs' and in the 'field'.
• non-experimental/observational:
o Data generated in real life from households, firms, etc.
o Typically collected in surveys or from administrative records.
o Econometric techniques often deal with problems arising in observational data.
• Cross-sectional data:
o Data collected at given point of time.
• Time series data:
o Data consist of observations on variables over time.
• Panel data (longitudinal data):
o Time series for each cross-sectional member (i.e. same individuals) in the data.
➔ Important feature for observational data: observations consist of a random sample from the
underlying population.
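The three data layouts (cross-sectional, time series, panel) differ in how many units and how many periods they cover. A minimal sketch with hypothetical records; every name and number below is made up for illustration.

```python
# Cross-sectional data: many units, one point in time.
cross_section = [
    {"person": "A", "year": 2020, "wage": 18.5},
    {"person": "B", "year": 2020, "wage": 22.0},
    {"person": "C", "year": 2020, "wage": 15.2},
]

# Time series data: one variable observed over several periods.
time_series = [
    {"year": 2018, "avg_wage": 17.9},
    {"year": 2019, "avg_wage": 18.3},
    {"year": 2020, "avg_wage": 18.6},
]

# Panel data: the same units followed over several periods.
panel = [
    {"person": "A", "year": 2019, "wage": 17.8},
    {"person": "A", "year": 2020, "wage": 18.5},
    {"person": "B", "year": 2019, "wage": 21.1},
    {"person": "B", "year": 2020, "wage": 22.0},
]

def describe(rows, unit_key="person", time_key="year"):
    """Return (number of distinct units, number of distinct periods).

    Rows without a unit key (the time series here) count as a single unit.
    """
    units = {r.get(unit_key) for r in rows}
    periods = {r[time_key] for r in rows}
    return len(units), len(periods)

for name, rows in [("cross-section", cross_section),
                   ("time series", time_series),
                   ("panel", panel)]:
    print(name, describe(rows))
```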
Random sample: {(xi, yi) : i = 1, ..., n}, with observations drawn independently from the same population.
Linear regression model: y = β0 + β1x + u
The error term u:
u is the error, or "disturbance", term and contains everything that we do not control for:
• Omitted variables: observed and unobserved variables.
• Measurement error: errors in measuring y and x are in u.
• Non-linearities: if the relation between y and x is in fact not linear.
• Unpredictable effects: u includes all unpredictable, random effects.
Simple Normalization Assumption: E[u] = 0
Zero conditional mean assumption: E[u | x] = E[u] = 0
• Fundamental relevance:
Population regression function (PRF): E[y | x] = β0 + β1x
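The PRF follows in one step from the model and the zero conditional mean assumption. A short derivation, assuming the simple regression model y = β0 + β1x + u used in these notes and the standard statement of the assumption, E[u | x] = E[u] = 0:

```latex
\begin{aligned}
E[y \mid x] &= E[\beta_0 + \beta_1 x + u \mid x] \\
            &= \beta_0 + \beta_1 x + E[u \mid x] \\
            &= \beta_0 + \beta_1 x .
\end{aligned}
```

So β1 is the change in the expected value of y for a one-unit increase in x.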
Estimating parameters:
• In the simple regression model y = β0 + β1x + u, the population parameters β0 and β1 are not observed.
• Next steps:
o 1. Derive general estimators for β0 and β1 that are optimal (in a sense to be defined).
o 2. Given a sample of observations, compute estimates of the population parameters
using the estimators.
• Take a sample of observations (xi , yi), i = 1, ..., n where n is the total number of observations
in the sample.
• Assumption: the observations (xi, yi) are independent and identically distributed (i.i.d.).
• Write the model now adding index i: yi = β0 + β1xi + ui
• Estimated parameters are denoted with a "hat", e.g. β̂0 is the estimate of the (unknown)
population parameter β0.
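The distinction between an estimator (a rule) and an estimate (the number the rule returns for one sample) can be illustrated as follows. This is a sketch under assumptions: the population y = 1 + 2x + u is hypothetical, and `np.polyfit` with `deg=1` (a least squares line fit) is used only as a stand-in rule.

```python
import numpy as np

def slope_estimator(x, y):
    """An estimator is a rule: it maps any sample (x_i, y_i) to a number."""
    # np.polyfit returns coefficients from the highest power down: [slope, intercept].
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope

def draw_sample(n, rng):
    """Hypothetical population: y = 1 + 2x + u (values assumed for the demo)."""
    x = rng.normal(0.0, 1.0, size=n)
    u = rng.normal(0.0, 1.0, size=n)
    return x, 1.0 + 2.0 * x + u

rng = np.random.default_rng(1)

# Each new sample yields a different estimate of the same population parameter.
estimates = np.array(
    [slope_estimator(*draw_sample(200, rng)) for _ in range(500)]
)

print("first three estimates:", np.round(estimates[:3], 3))
print("mean of 500 estimates:", round(float(estimates.mean()), 3))
```

The individual estimates differ from sample to sample, while their average sits near the assumed population slope of 2.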
Least squares estimators:
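For reference, the ordinary least squares (OLS) estimators for the simple regression model are usually written as below. This is the standard textbook form, stated under the assumption that x varies in the sample (so the denominator is nonzero); x̄ and ȳ denote the sample means.

```latex
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
                     {\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```

These values minimise the sum of squared residuals, Σ (yi − β̂0 − β̂1xi)², over all candidate intercepts and slopes.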