Term 1
Week 1
Simple linear regression
Conditional distributions
• E[ay+b]=aE[y]+b
• E[a(x)y+b(x)|x] = a(x)E[y|x] + b(x)
• Var(y|x) = E[y²|x] - (E[y|x])²
• LIE: E[y]=E[E[y|x]]
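The LIE and the conditional variance formula can be checked by hand on a small discrete distribution. The sketch below is only an illustration: the joint probabilities and the helper names (marginal_x, cond_moment) are made up for this example.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(x, y); the numbers are illustrative only.
joint = {
    (0, 1): F(1, 4),
    (0, 3): F(1, 4),
    (1, 2): F(1, 8),
    (1, 6): F(3, 8),
}

def marginal_x(joint):
    """P(x), obtained by summing the joint probabilities over y."""
    px = {}
    for (x, _), p in joint.items():
        px[x] = px.get(x, 0) + p
    return px

def cond_moment(joint, x, power=1):
    """E[y**power | x], computed from the joint distribution."""
    px = marginal_x(joint)[x]
    return sum(p * y**power for (xx, y), p in joint.items() if xx == x) / px

px = marginal_x(joint)
e_y = sum(p * y for (_, y), p in joint.items())            # E[y] computed directly
e_y_lie = sum(px[x] * cond_moment(joint, x) for x in px)   # E[E[y|x]]

# Var(y|x=1) = E[y²|x=1] - (E[y|x=1])²
var_y_x1 = cond_moment(joint, 1, 2) - cond_moment(joint, 1) ** 2
```

Exact fractions are used so that both sides of E[y]=E[E[y|x]] can be compared without rounding error.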
Causality, ceteris paribus, and counterfactual reasoning
• If we hold the factors other than x that affect y constant, i.e. ceteris paribus, then the change in y
that accompanies a change in x can be attributed to x, so we can conclude that x has a causal effect on y
• Counterfactual reasoning means considering counterfactual outcomes: comparing the outcome that would
have occurred in a different state with the outcome that actually occurred
Definitions
y= β₀ + β₁x + u
• So, if ceteris paribus holds, Δu=0 and Δy=β₁Δx
• u contains all factors affecting y other than x
• SLR.1 Linear in parameters, y= β₀ + β₁x + u
• SLR.2 Random sampling, {xᵢ,yᵢ}, i=1,…,n are independently and identically distributed
• SLR.3 Sample variation in x: xᵢ, i=1,…,n, exhibit variation, so that SSTₓ= ∑(xᵢ-x̄)²>0
• SLR.4 Zero conditional mean, E[u|x]=0
• x conveys no information about u on average, by mean independence E[u|x]=E[u]
• Implies E[u]=0
• E[u] = E[E[u|x]] (by LIE) = E[0] (by SLR.4) = 0
• Implies E[ux]=0
• E[ux] = E[E[ux|x]] (by LIE) = E[xE[u|x]] = E[x·0] (by SLR.4) = 0
• SLR.5 Homoscedasticity, Var[u|x]=σ² for all values x
• This means that the spread of the unobserved factors in u is the same at every value of x (that u is uncorrelated with x comes from SLR.4, not from SLR.5)
Deriving OLS
yᵢ=β₀+β₁xᵢ+uᵢ
Computing the estimates with these methods only requires SLR.3 (SSTₓ>0) to hold
Method of moments:
Population moments:
(1a) E[u] = E[y-β₀-β₁x] = 0 (by SLR.4)
(2a) E[xu] = E[x(y-β₀-β₁x)] = 0 (by SLR.4)
Sample moments (the sample analogues of (1a) and (2a), set to zero to define β̂₀ and β̂₁):
(1) n⁻¹∑[yᵢ-β̂₀-β̂₁xᵢ] = 0
• This gives n⁻¹[∑yᵢ-∑β̂₀-∑β̂₁xᵢ] = ȳ-β̂₀-β̂₁x̄ = 0, so β̂₀ = ȳ-β̂₁x̄
(2) n⁻¹∑[xᵢ(yᵢ-β̂₀-β̂₁xᵢ)] = 0
• Substituting β̂₀ = ȳ-β̂₁x̄ into (2) gives β̂₁ = n⁻¹∑(yᵢ-ȳ)xᵢ / n⁻¹∑(xᵢ-x̄)xᵢ = n⁻¹∑(yᵢ-ȳ)(xᵢ-x̄) / n⁻¹∑(xᵢ-x̄)²
• The second equality holds because n⁻¹∑(yᵢ-ȳ)x̄ = 0 and n⁻¹∑(xᵢ-x̄)x̄ = 0
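The method-of-moments formulas can be applied directly to data. The following is a minimal sketch, assuming NumPy and a made-up five-point sample; it computes β̂₁ and β̂₀ and then checks that both sample moment conditions hold at the estimates.

```python
import numpy as np

# Hypothetical sample (illustrative numbers only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x_bar, y_bar = x.mean(), y.mean()

# Slope: sum of cross-deviations over SST of x (the n⁻¹ factors cancel).
beta1_hat = np.sum((y - y_bar) * (x - x_bar)) / np.sum((x - x_bar) ** 2)
# Intercept from sample moment (1): ȳ - β̂₀ - β̂₁x̄ = 0.
beta0_hat = y_bar - beta1_hat * x_bar

resid = y - beta0_hat - beta1_hat * x
moment_1 = resid.mean()          # sample analogue of E[u] = 0
moment_2 = (x * resid).mean()    # sample analogue of E[xu] = 0

print(beta0_hat, beta1_hat, moment_1, moment_2)
```

For this sample x̄=3, ȳ=4, ∑(yᵢ-ȳ)(xᵢ-x̄)=6 and SSTₓ=10, so the slope is 0.6 and the intercept is 2.2; both moments are zero up to floating-point error.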
Outcome:
Predicted/fitted values: ŷᵢ=β̂₀+β̂₁xᵢ, i=1,…,n
OLS regression line: ŷ=β̂₀+β̂₁x
Residuals: ûᵢ=yᵢ-ŷᵢ, i.e. the difference between the actual yᵢ and its fitted value
Sum of squared residuals:
Under this method we seek to minimise ∑ûᵢ² = ∑(yᵢ-β̂₀-β̂₁xᵢ)²
This yields the first-order conditions (FOC) with respect to:
β̂₀: 0=-2 ∑(yᵢ-β̂₀-β̂₁xᵢ)
β̂₁: 0=-2 ∑(yᵢ-β̂₀-β̂₁xᵢ)xᵢ
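To see the minimisation at work, the sketch below (assuming NumPy and the same kind of made-up sample; the function name ssr is illustrative) solves the least-squares problem numerically, evaluates both first-order conditions at the solution, and checks that small moves away from the solution raise the sum of squared residuals.

```python
import numpy as np

# Hypothetical sample (illustrative numbers only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

def ssr(b0, b1):
    """Sum of squared residuals for a candidate intercept b0 and slope b1."""
    return np.sum((y - b0 - b1 * x) ** 2)

# Least-squares solution of y ≈ b0 + b1*x using the design matrix [1, x].
X = np.column_stack([np.ones_like(x), x])
(b0_hat, b1_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

# First-order conditions evaluated at the minimiser (both should be ~0).
foc_b0 = -2 * np.sum(y - b0_hat - b1_hat * x)
foc_b1 = -2 * np.sum((y - b0_hat - b1_hat * x) * x)

# Moving away from the minimiser should increase the SSR.
ssr_min = ssr(b0_hat, b1_hat)
ssr_perturbed = [ssr(b0_hat + d0, b1_hat + d1)
                 for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]]
```

The two FOCs are the sample moment conditions multiplied by -2n, which is why minimising the SSR and the method of moments give the same estimates.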
Working with the numerator of β̂₁, ∑(xᵢ-x̄)yᵢ, and substituting yᵢ=β₀+β₁xᵢ+uᵢ:
∑(β₀+β₁xᵢ+uᵢ)(xᵢ-x̄) = β₀∑(xᵢ-x̄) + β₁∑(xᵢ-x̄)xᵢ + ∑(xᵢ-x̄)uᵢ = 0 + β₁∑(xᵢ-x̄)xᵢ + ∑(xᵢ-x̄)uᵢ = β₁SSTₓ + ∑(xᵢ-x̄)uᵢ
As ∑(xᵢ-x̄)=0, β₀∑(xᵢ-x̄)=0
And ∑(xᵢ-x̄)xᵢ = ∑(xᵢ-x̄)² = SSTₓ
Overall, dividing by SSTₓ:
β̂₁ = β₁ + ∑(xᵢ-x̄)uᵢ/SSTₓ
∑(xᵢ-x̄)uᵢ/SSTₓ = sampling error, the coefficient from regressing uᵢ on xᵢ
β̂₁ = β₁ + ∑wᵢuᵢ
wᵢ = (xᵢ-x̄)/SSTₓ
Conditional expectation:
E[β̂₁|X]=E[β₁ +∑wᵢuᵢ|X] = β₁ + ∑E[wᵢuᵢ|X] = β₁ + ∑wᵢE[uᵢ|X] = β₁
E[wᵢuᵢ|X] = wᵢE[uᵢ|X] because wᵢ depends on X alone, and E[uᵢ|X] = E[uᵢ|xᵢ] (by SLR.2) = 0 (by SLR.4)
LIE:
E[β̂₁]=E[E[β̂₁|X]]= E[β₁]=β₁
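Unbiasedness is a statement about the average of β̂₁ over repeated samples, which a small Monte Carlo experiment can illustrate. This is a sketch under assumed values: the population parameters, sample size, number of replications and the normal distributions are all hypothetical choices, and the regressors are held fixed to mimic conditioning on X.

```python
import numpy as np

rng = np.random.default_rng(0)               # fixed seed so the sketch is reproducible
beta0, beta1, n, reps = 1.0, 2.0, 50, 2000   # hypothetical population values

x = rng.normal(size=n)                       # regressors held fixed: we condition on X
w = (x - x.mean()) / np.sum((x - x.mean()) ** 2)   # weights wᵢ = (xᵢ - x̄)/SSTₓ

estimates = np.empty(reps)
for r in range(reps):
    u = rng.normal(size=n)                   # errors drawn independently of x, so E[u|x] = 0
    y = beta0 + beta1 * x + u
    estimates[r] = np.sum(w * y)             # β̂₁ = ∑wᵢyᵢ = β₁ + ∑wᵢuᵢ

mean_estimate = estimates.mean()             # should be close to beta1
print(mean_estimate)
```

Individual estimates scatter around β₁ because of the sampling error ∑wᵢuᵢ, but their average settles near β₁ as the number of replications grows.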
Variance of estimators
Using:
β̂₁=β₁ +∑wᵢuᵢ
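β₁ is a constant, so it does not affect Var(β̂₁|X)
Cov(uᵢ,uⱼ|X) = E[uᵢuⱼ|xᵢ,xⱼ] = 0 for i≠j (by SLR.2 and SLR.4)
Var(uᵢ|X) = Var[uᵢ|xᵢ] = σ² (by SLR.2 and SLR.5)
Hence Var(β̂₁|X) = ∑wᵢ²Var(uᵢ|X) = σ²∑wᵢ² = σ²/SSTₓ, since ∑wᵢ² = SSTₓ/SSTₓ² = 1/SSTₓ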
Variance of u
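σ² = E[u²] (since E[u]=0)
If the errors were observable, σ² could be estimated by n⁻¹∑uᵢ²
As uᵢ is unobservable, we replace it with the residual ûᵢ and use σ̂² = (n-2)⁻¹∑ûᵢ² = SSR/(n-2), dividing by n-2 because two parameters are estimated
Standard error(β̂₁) = σ̂/√SSTₓ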
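The error variance estimate and the standard error of the slope follow mechanically from the residuals. A minimal sketch, assuming NumPy and a made-up five-point sample:

```python
import numpy as np

# Hypothetical sample (illustrative numbers only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)

sst_x = np.sum((x - x.mean()) ** 2)
beta1_hat = np.sum((x - x.mean()) * y) / sst_x
beta0_hat = y.mean() - beta1_hat * x.mean()

u_hat = y - beta0_hat - beta1_hat * x            # residuals stand in for the unobserved errors
ssr = np.sum(u_hat ** 2)
sigma2_hat = ssr / (n - 2)                       # n - 2: two parameters were estimated
se_beta1 = np.sqrt(sigma2_hat) / np.sqrt(sst_x)  # σ̂/√SSTₓ

print(sigma2_hat, se_beta1)
```

Here SSR=2.4 and n-2=3, so σ̂²=0.8, and with SSTₓ=10 the standard error is √0.08.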
Variance of y
Var(y) = Var(β₀+β₁x+u) = β₁²Var(x) + Var(u), using Cov(x,u)=0 (implied by SLR.4)
Goodness of fit
SST = ∑(yᵢ-ȳ)² = total variation in y
SSE = ∑(ŷᵢ-ȳ)² = variation in y explained by x
SSR = ∑ûᵢ² = unexplained variation in y
SST=SSE+SSR
R²= SSE/SST= 1-SSR/SST = fraction of sample variation in y that is explained by x
Or, = SST total/ SST xᵢ
A larger R² indicates a better fit of the OLS line
Note: (n-1)⁻¹SST = σ̂_y², the sample variance of y
Note: causality is about whether the explanatory variables are independent of u (x ⫫ u), which R² says
nothing about; R² only describes the fit of the model
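The decomposition SST=SSE+SSR and the two expressions for R² can be verified numerically. A minimal sketch, assuming NumPy and a made-up sample; np.polyfit is used here only as a convenient way to obtain the OLS line.

```python
import numpy as np

# Hypothetical sample (illustrative numbers only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# polyfit returns the highest-degree coefficient first: slope, then intercept.
b1_hat, b0_hat = np.polyfit(x, y, deg=1)

y_hat = b0_hat + b1_hat * x          # fitted values
u_hat = y - y_hat                    # residuals

sst = np.sum((y - y.mean()) ** 2)        # total variation in y
sse = np.sum((y_hat - y.mean()) ** 2)    # variation explained by x
ssr = np.sum(u_hat ** 2)                 # unexplained variation

r_squared = sse / sst
r_squared_alt = 1 - ssr / sst

print(sst, sse, ssr, r_squared)
```

For this sample SST=6, SSE=3.6 and SSR=2.4, so R²=0.6 under either formula.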
Interpretation of coefficients
Week 2
Multiple linear regression
Definitions
ŷᵢ=β̂₀+β̂₁xᵢ₁+β̂₂xᵢ₂
yᵢ= β̂₀+β̂₁xᵢ₁+β̂₂xᵢ₂ + ûᵢ
• The β̂ⱼ measure the ceteris paribus change in y given a one-unit change in xⱼ
• This interpretation is only valid if MLR.4 holds
Gauss-Markov assumptions:
• MLR.1 Linear in parameters
• MLR.2 Random sample
• MLR.3 No perfect collinearity: no xⱼ is constant and there is no exact linear relationship among the xⱼ in the
population
• i.e. none of the explanatory variables are constant, and there are no exact linear relationships
among the independent variables
• MLR.4 Mean independence, E[u|x₁,x₂,…,xₖ]=0
• i.e. other factors affecting y are not related on average to x₁,…,xₖ
• If this holds then the explanatory variables are exogenous
• This implies E[u]=0 and E[xⱼu]=0
• MLR.5 Homoscedasticity, the variance of u is constant and does not depend on x, i.e. Var(u|x₁,…,xₖ) = E[u²|x₁,x₂,…,xₖ] = σ²
• MLR.6 Normality, the population error u is independent of X and normally distributed, u~N(0, σ²)
• This implies MLR.4 and MLR.5, so it is a very strong assumption
• The argument for it is the CLT; however, its key weakness is that it assumes that all
unobservable factors affect y in a separate, additive way
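A violation of MLR.3 shows up as a design matrix without full column rank. The sketch below, assuming NumPy and simulated regressors (all variable names and the coefficients 2 and -1 are hypothetical), adds a regressor that is an exact linear combination of the others and compares matrix ranks.

```python
import numpy as np

rng = np.random.default_rng(1)     # fixed seed so the sketch is reproducible
n = 20
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - x2                 # exact linear combination of x1 and x2: violates MLR.3

ones = np.ones(n)
X_ok = np.column_stack([ones, x1, x2])        # constant, x1, x2
X_bad = np.column_stack([ones, x1, x2, x3])   # adds the redundant regressor

rank_ok = np.linalg.matrix_rank(X_ok)     # equals the number of columns
rank_bad = np.linalg.matrix_rank(X_bad)   # one less than the number of columns

print(rank_ok, X_ok.shape[1], rank_bad, X_bad.shape[1])
```

When the rank falls short of the number of columns, the OLS first-order conditions no longer have a unique solution, so the coefficients on the collinear variables cannot be separately estimated.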
Deriving OLS
Sum of squared residuals:
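The estimates are found by solving min SSR = ∑(yᵢ-β̂₀-β̂₁xᵢ₁-β̂₂xᵢ₂)²
This yields the first-order conditions:
∑(yᵢ-β̂₀-β̂₁xᵢ₁-β̂₂xᵢ₂) = 0
∑xᵢ₁(yᵢ-β̂₀-β̂₁xᵢ₁-β̂₂xᵢ₂) = 0
∑xᵢ₂(yᵢ-β̂₀-β̂₁xᵢ₁-β̂₂xᵢ₂) = 0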
Partialling-out method:
1) Estimate an SLR of x₁ on x₂
xᵢ₁ = α₀+α₁xᵢ₂+rᵢ₁
And use this to compute the residual
r̂ᵢ₁ = xᵢ₁-α̂₀-α̂₁xᵢ₂
This partials out the variation of xᵢ₁ explained by xᵢ₂
Properties of r̂ᵢ₁:
• ∑r̂ᵢ₁=0
• ∑r̂ᵢ₁xᵢ₂=0
• ∑r̂ᵢ₁xᵢ₁=∑r̂ᵢ₁²
2) Regress yᵢ on r̂ᵢ₁ and a constant
yᵢ = θ+β₁r̂ᵢ₁+vᵢ
Giving β̂₁ = ∑r̂ᵢ₁yᵢ/∑r̂ᵢ₁²
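The two partialling-out steps can be compared against a direct multiple regression. This is a sketch on simulated data, assuming NumPy; the data-generating coefficients and the helper ols are hypothetical and only serve the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)               # fixed seed so the sketch is reproducible
n = 200
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)           # x1 is correlated with x2
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients for design matrix X (illustrative helper)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)

# Step 1: regress x1 on a constant and x2, and keep the residual.
Z = np.column_stack([ones, x2])
r1_hat = x1 - Z @ ols(Z, x1)

# Step 2: slope from regressing y on the residual and a constant.
# Because the residuals sum to zero, the slope is ∑r̂ᵢ₁yᵢ / ∑r̂ᵢ₁².
beta1_partial = np.sum(r1_hat * y) / np.sum(r1_hat ** 2)

# Direct multiple regression of y on a constant, x1 and x2 for comparison.
beta_full = ols(np.column_stack([ones, x1, x2]), y)

print(beta1_partial, beta_full[1])
```

The two slope estimates agree up to floating-point error, and the residual r1_hat satisfies the three listed properties in the sample.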