Basic Concepts
1 PURPOSE REGRESSION ANALYSIS
Regression analysis is concerned with the study of the dependence of one variable, the dependent
variable, on one or more other variables, the explanatory variables, with a view to estimating and/or
predicting the population mean or average value of the former in terms of known or fixed (in
repeated sampling) values of the latter.
2 THE POPULATION REGRESSION FUNCTION (PRF)
The population regression curve = the locus of the conditional expectations of the dependent
variable for fixed values of the independent variable.
→ In principle the population is infinitely large, i.e. for each value of X there is an infinite number of observations on Y
Mathematical specification: 𝐸(𝑌|𝑋𝑖 ) = 𝑓(𝑋𝑖 )
Linear population regression function (PRF): 𝐸(𝑌|𝑋𝑖 ) = 𝛽1 + 𝛽2 𝑋𝑖
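As a small illustration of the idea of a locus of conditional expectations, the sketch below uses a made-up finite population. The numbers, the names `population` and `prf_points`, and the values 17 and 0.6 are assumptions chosen for illustration, not data from these notes. It computes the mean of Y for each fixed X and compares it with the assumed linear PRF.

```python
from statistics import mean

# Hypothetical toy population: all Y values observed at each fixed X.
population = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
    120: [79, 84, 90, 94, 98],
}

# The population regression curve is the set of points (X, E(Y|X)).
prf_points = {x: mean(ys) for x, ys in population.items()}

# Assumed linear PRF for this toy population: E(Y|X) = beta1 + beta2 * X.
beta1, beta2 = 17.0, 0.6

for x, cond_mean in prf_points.items():
    # Conditional mean next to the value of the assumed linear PRF.
    print(x, cond_mean, beta1 + beta2 * x)
```

In this toy population the conditional means lie exactly on a straight line, which is what the linear PRF specification assumes.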
2.1 INTERPRETATION ‘LINEAR’
Two alternative interpretations:
• Linear in the variables
o The conditional expectations of Y are a linear function of Xi
NOT: 𝐸(𝑌|𝑋𝑖 ) = 𝛽1 + 𝛽2 𝑋𝑖 ²
• Linear in the parameters
o The conditional expectations of Y are a linear function of βs
NOT: 𝐸(𝑌|𝑋𝑖 ) = 𝛽1 + √𝛽2 𝑋𝑖
The basic theory of regression analysis supposes linearity in the parameters!
Non-linearity in the variables is permitted: e.g. 𝐸(𝑌|𝑋𝑖 ) = 𝛽1 + 𝛽2 𝑋𝑖 ² is non-linear in X but linear in the βs.
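To see why only linearity in the parameters matters, the sketch below fits a model with a squared regressor by treating Z = X² as the explanatory variable. The data and the helper `ols` are hypothetical; the helper uses the least squares formulas given later in these notes.

```python
def ols(x, y):
    """Least squares intercept and slope for a single regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Hypothetical data generated exactly by Y = 2 + 3 * X^2 (no error term).
X = [0, 1, 2, 3, 4]
Y = [2 + 3 * x ** 2 for x in X]

# The model is non-linear in X but linear in the parameters:
# regress Y on the transformed variable Z = X^2.
Z = [x ** 2 for x in X]
b1, b2 = ols(Z, Y)
print(b1, b2)
```

The transformation happens in the data, not in the parameters, so the same linear estimation method still applies.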
Econometrics 2019-2020 Casier Tessa
2.2 STOCHASTIC SPECIFICATION OF THE PRF
The population regression function is only correct on average.
The deviation of an individual value Yi (e.g. an individual consumption expenditure) from its conditional expectation can be
represented as µ𝑖 = 𝑌𝑖 − 𝐸(𝑌|𝑋𝑖 ), with µi the stochastic error term, hence:
𝑌𝑖 = 𝐸(𝑌|𝑋𝑖 ) + µ𝑖 = 𝛽1 + 𝛽2 𝑋𝑖 + µ𝑖
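The split of each individual value into a conditional mean plus an error term can be sketched for a single fixed X as follows. The Y values are made up for illustration, and the variable names are assumptions.

```python
from statistics import mean

# Hypothetical Y values at one fixed X (illustrative numbers only).
y_values_at_x = [55, 60, 65, 70, 75]

# E(Y|X) for this X.
cond_mean = mean(y_values_at_x)

# Error term for each individual: u_i = Y_i - E(Y|X_i).
errors = [y - cond_mean for y in y_values_at_x]

# Each Y_i is recovered as conditional mean plus error term.
rebuilt = [cond_mean + u for u in errors]

print(errors)       # individual deviations from the conditional mean
print(sum(errors))  # deviations from a mean always sum to zero
```

Within a group the errors average out to zero, which is the sense in which the PRF is correct on average.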
2.3 SOURCE OF THE STOCHASTIC ERROR TERM
The error term = a collection of all variables/factors that affect Y but are not included in the model.
Possible explanations:
• Vague theory
• No proper data available
• Simplicity: other variables only have a marginal (and random) influence
• Measurement errors in the data
• Wrong functional form
• …
(Later: the properties of the error terms determine the properties of the estimators!)
3 THE SAMPLE REGRESSION FUNCTION (SRF)
If we had data for the entire population:
• The parameters (βs) of the population regression function could simply be calculated
• There would be no need for estimation methods (econometrics)
In practice, we typically only have a sample drawn randomly from the population:
• ‘Randomly’ = for each X-value we draw exactly one Y-value
• Let n denote the sample size
The goal is to reconstruct the population regression curve/function based on this sample.
We do this based on:
• The sample regression curve
• The sample regression function (SRF)
$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$, where:
$\hat{Y}_i$ = an estimator of $E(Y|X_i)$
$\hat{\beta}_1$ = an estimator of $\beta_1$
$\hat{\beta}_2$ = an estimator of $\beta_2$
These are based on an estimator, which will be determined later.
3.1 TERMINOLOGY: ESTIMATOR VS. ESTIMATE
An estimator = a method (typically based on a formula) to estimate a population parameter
using the information in a sample of data
An estimate = the numerical result of applying the estimator to the available sample
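In code terms, the distinction can be sketched as follows: the estimator is the function, and the numerical result (the estimate) is what the function returns for one particular sample. The function name, the sample, and the choice of the least squares slope formula (given later in these notes) are assumptions for illustration.

```python
def slope_estimator(x, y):
    """Estimator: a rule that maps any sample (x, y) to a number."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical sample (illustrative numbers only).
sample_x = [80, 100, 120]
sample_y = [70, 65, 90]

# The numerical result of applying the rule to this one sample.
slope_estimate = slope_estimator(sample_x, sample_y)
print(slope_estimate)
```

The same function applied to a different sample would return a different number, while the rule itself stays the same.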
3.2 THE SAMPLE REGRESSION FUNCTION
Using the sample, we cannot exactly reconstruct the population regression function!
Reasons:
• The sample regression function (SRF) is merely an approximation of the population
regression function (PRF)
o $\hat{\beta}_1 \neq \beta_1$
o $\hat{\beta}_2 \neq \beta_2$
o $\hat{Y}_i \neq Y_i$
o $\hat{\mu}_i \neq \mu_i$
• An estimator is stochastic (= it varies over repeated sampling)
o An alternative sample results in a different sample regression function
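The sampling variability of an estimator can be made concrete by listing every sample that could be drawn from a small made-up population, one Y per X value, and computing the least squares slope for each. The population, the helper `ols_slope`, and the use of the least squares slope formula (given later in these notes) are assumptions for illustration.

```python
from itertools import product
from statistics import mean

# Hypothetical toy population: all Y values at each fixed X.
population = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
    120: [79, 84, 90, 94, 98],
}

def ols_slope(x, y):
    """Least squares slope in deviation form."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

xs = sorted(population)

# Every possible sample with exactly one Y per X value.
slopes = [ols_slope(xs, ys) for ys in product(*(population[x] for x in xs))]

# Number of samples, smallest slope, largest slope, average slope.
print(len(slopes), min(slopes), max(slopes), mean(slopes))
```

Each sample yields its own sample regression function, so the slope ranges widely across samples. In this toy setup the conditional means lie on a line with slope 0.6, and the slopes average out to that value.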
3.3 PURPOSE REGRESSION ANALYSIS
Approximate the parameters of the population regression function $Y_i = \beta_1 + \beta_2 X_i + \mu_i$
using the sample regression function $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{\mu}_i$,
and make sure that $\hat{\beta}_1$ and $\hat{\beta}_2$ approximate $\beta_1$ and $\beta_2$ ‘as closely as possible’, even though we do not
know the population regression function.
To do this, we use an estimator (the least squares method).
How well the estimator performs as an approximation will be formalised by its statistical properties.
Estimating the Sample Regression Function
1 THE ORDINARY LEAST SQUARES METHOD (OLS)
How do we estimate the sample regression function based on the sample data?
• Pragmatic approach: find a sample regression line such that the distance between this line
and the observed data points becomes as small as possible
o Minimize the distance between $\hat{Y}_i$ and $Y_i$
• Possible criteria:
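o $\min_{\hat{\beta}_1, \hat{\beta}_2} \sum \hat{\mu}_i$: not possible (negative and positive error terms cancel out)
o $\min_{\hat{\beta}_1, \hat{\beta}_2} \sum |\hat{\mu}_i|$: possible (less interesting properties)
o $\min_{\hat{\beta}_1, \hat{\beta}_2} \sum \hat{\mu}_i^2$: ordinary least squares (OLS) method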
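The three criteria can be compared on a small made-up sample. The sketch below evaluates them for several candidate lines that all pass through the point of sample means. The data and the helper `residuals` are assumptions for illustration.

```python
# Hypothetical sample (illustrative numbers only).
X = [80, 100, 120]
Y = [70, 65, 90]

x_bar = sum(X) / len(X)
y_bar = sum(Y) / len(Y)

def residuals(slope):
    """Residuals of the line with this slope that passes through (x_bar, y_bar)."""
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(X, Y)]

for slope in (0.0, 0.5, 2.0):
    res = residuals(slope)
    sum_res = sum(res)                  # criterion 1: plain sum of residuals
    sum_abs = sum(abs(r) for r in res)  # criterion 2: sum of absolute residuals
    sum_sq = sum(r * r for r in res)    # criterion 3: sum of squared residuals
    print(slope, sum_res, sum_abs, sum_sq)
```

The plain sum is zero for every one of these lines because positive and negative residuals cancel, so it cannot tell a good fit from a bad one. The sum of squares does distinguish them.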
1.1 SOLUTION MINIMIZATION PROBLEM
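Starting from
$\min_{\hat{\beta}_1, \hat{\beta}_2} \sum_{i=1}^{n} (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$
set the partial derivatives with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$ equal to zero (first-order conditions).
This gives a system of 2 equations in 2 unknowns.
Hence $\hat{\beta}_1$ and $\hat{\beta}_2$ are identified.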
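Written out in the usual normal-equation form, the two equations are $\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i$ and $\sum X_i Y_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2$. A minimal sketch, assuming this form and a made-up sample, solves the 2x2 system with Cramer's rule; the variable names are illustrative.

```python
# Hypothetical sample (illustrative numbers only).
X = [80, 100, 120]
Y = [70, 65, 90]
n = len(X)

# Sums that appear in the two normal equations.
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_xx = sum(x * x for x in X)

# Normal equations as a 2x2 linear system in (b1, b2):
#   n * b1     + sum_x  * b2 = sum_y
#   sum_x * b1 + sum_xx * b2 = sum_xy
# The determinant is non-zero as long as the X values are not all equal.
det = n * sum_xx - sum_x ** 2

# Cramer's rule gives the unique solution.
b1 = (sum_y * sum_xx - sum_x * sum_xy) / det
b2 = (n * sum_xy - sum_x * sum_y) / det

print(b1, b2)
```

The system has a unique solution whenever the determinant is non-zero, which is the sense in which the two estimators are identified.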
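OLS estimator $\hat{\beta}_2$ (on formula sheet)
$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}$
where
$\bar{X} = \frac{1}{n} \sum X_i$ and $x_i = X_i - \bar{X}$
$\bar{Y} = \frac{1}{n} \sum Y_i$ and $y_i = Y_i - \bar{Y}$
OLS estimator $\hat{\beta}_1$ (on formula sheet)
$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$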
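The two formulas translate almost line by line into code. The sketch below is one possible implementation in plain Python, applied to a made-up sample. The function name `ols_fit` and the data are assumptions for illustration.

```python
def ols_fit(X, Y):
    """Return (beta1_hat, beta2_hat) from the deviation-form OLS formulas."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    x_dev = [xi - x_bar for xi in X]  # x_i = X_i - X_bar
    y_dev = [yi - y_bar for yi in Y]  # y_i = Y_i - Y_bar
    beta2_hat = sum(x * y for x, y in zip(x_dev, y_dev)) / sum(x * x for x in x_dev)
    beta1_hat = y_bar - beta2_hat * x_bar
    return beta1_hat, beta2_hat

# Hypothetical sample (illustrative numbers only).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

beta1_hat, beta2_hat = ols_fit(X, Y)
fitted = [beta1_hat + beta2_hat * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

print(beta1_hat, beta2_hat)
print(sum(residuals))  # approximately zero, up to floating-point rounding
```

Because the intercept is computed as the mean of Y minus the slope times the mean of X, the fitted line passes through the point of sample means and the residuals sum to zero.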