ASSIGNMENT 1: Solutions
Theoretical Exercises
Note that in some of the solutions, we did not use the bold notation for matrices and vectors.
Solution Theoretical Exercise 1 (each 5 points)
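a) Unbiasedness: Taking the expected value of both sides, $E[\hat{\mu}] = \frac{n}{n+1} E[\bar{y}] = \frac{n}{n+1}\mu \neq \mu$. Hence, $\hat{\mu}$ is biased. In this solution, $E[\bar{y}] = E\left[\frac{y_1 + y_2 + \dots + y_n}{n}\right] = \frac{1}{n}\left(E[y_1] + E[y_2] + \dots + E[y_n]\right) = \frac{1}{n} n\mu = \mu$.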
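Consistency: $\operatorname{plim}_{n\to\infty} \hat{\mu} = \operatorname{plim}_{n\to\infty} \frac{n}{n+1}\bar{y} = \operatorname{plim}_{n\to\infty} \frac{n}{n+1} \cdot \operatorname{plim}_{n\to\infty} \bar{y} = \mu$. The last equality makes use of the product rule of probability limits, and the facts that $\operatorname{plim}_{n\to\infty} \frac{n}{n+1} = \operatorname{plim}_{n\to\infty} \frac{1}{1+\frac{1}{n}} = \lim_{n\to\infty} \frac{1}{1+\frac{1}{n}} = 1$, and that $\operatorname{plim}_{n\to\infty} \bar{y} = \mu$ by the W.L.L.N. if we assume that $y_i$ are i.i.d. Hence, $\hat{\mu}$ is consistent.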
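The result in a) can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original solution; the normal distribution, the true mean `mu = 2`, the sample sizes and the helper name `sim_mu_hat` are all illustrative assumptions. The simulated mean of the estimator should be close to $\frac{n}{n+1}\mu$ for small n and approach $\mu$ as n grows, while its spread shrinks.

```r
# Monte Carlo sketch for mu_hat = n/(n+1) * ybar (all settings are assumptions)
set.seed(1)
mu    <- 2      # assumed true mean
n_sim <- 2000   # number of simulated samples per sample size

# Hypothetical helper: returns n_sim draws of mu_hat for samples of size n
sim_mu_hat <- function(n) {
  # one column per simulated sample; colMeans gives ybar of each sample
  y <- matrix(rnorm(n * n_sim, mean = mu, sd = 1), nrow = n)
  n / (n + 1) * colMeans(y)
}

for (n in c(5, 50, 500)) {
  est <- sim_mu_hat(n)
  cat(sprintf("n = %3d  mean(mu_hat) = %.4f  n/(n+1)*mu = %.4f  sd(mu_hat) = %.4f\n",
              n, mean(est), n / (n + 1) * mu, sd(est)))
}
```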
b) Unbiasedness: Taking the expected value, $E[\hat{\mu}] = E\left[\frac{2}{n}\sum_{i=1}^{n/2} y_{2i}\right] = \frac{2}{n}\sum_{i=1}^{n/2} E[y_{2i}] = \frac{2}{n} \cdot \frac{n}{2} E[y_{2i}] = \mu$.
The estimator is unbiased.
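Consistency: Define $m = \frac{n}{2}$. Then, $\operatorname{plim}_{n\to\infty} \hat{\mu} = \operatorname{plim}_{n\to\infty} \frac{1}{m}\sum_{i=1}^{m} y_{2i}$. By the W.L.L.N., $\operatorname{plim}_{n\to\infty} \frac{1}{m}\sum_{i=1}^{m} y_{2i} = E[y_{2i}] = \mu$ if we assume that $y_{2i}$ is i.i.d. (note that $m \to \infty$ as $n \to \infty$). The estimator is consistent.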
c) Unbiasedness: $E[\hat{\mu}] = \frac{0.1}{100}\sum_{i=1}^{100} E[y_i] + \frac{0.9}{n-100}\sum_{i=101}^{n} E[y_i] = \frac{0.1}{100} \cdot 100\mu + \frac{0.9}{n-100}(n-100)\mu = \mu$. The estimator is unbiased.
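Consistency: $\operatorname{plim}_{n\to\infty} \hat{\mu} = \operatorname{plim}_{n\to\infty} \frac{0.1}{100}\sum_{i=1}^{100} y_i + \operatorname{plim}_{n\to\infty} \frac{0.9}{n-100}\sum_{i=101}^{n} y_i = \frac{0.1}{100}\sum_{i=1}^{100} y_i + 0.9\mu$. The first term does not depend on $n$, and the probability limit of a quantity that is constant in $n$ is that quantity itself; the second term follows from the W.L.L.N., assuming that $y_i$ are i.i.d. The limit is a random variable rather than $\mu$, so the estimator is inconsistent.
An alternative route rewrites the first term as $\frac{0.1n}{100} \cdot \frac{1}{n}\sum_{i=1}^{100} y_i$. The product rule of probability limits cannot be applied to this factorisation, because $\lim_{n\to\infty} \frac{0.1n}{100} = \infty$ while $\operatorname{plim}_{n\to\infty} \frac{1}{n}\sum_{i=1}^{100} y_i = 0$ (a fixed sum of 100 terms divided by $n$), not $\mu$. The product equals $\frac{0.1}{100}\sum_{i=1}^{100} y_i$ for every $n$, so the probability limit is again $\frac{0.1}{100}\sum_{i=1}^{100} y_i + 0.9\mu$, not $\infty$, and the estimator is inconsistent.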
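The inconsistency in c) can be illustrated with a small simulation. This is a sketch under assumed settings (normal data with `mu = 2` and unit variance, the sample sizes, and the helper name `mu_hat_c` are illustrative and not from the original solution). With unit variance, the variance of the estimator is $0.1^2 \cdot \frac{1}{100} + \frac{0.9^2}{n-100}$, so its standard deviation should level off near 0.01 instead of shrinking to zero as n grows.

```r
# Simulation sketch for the estimator in c) (all settings are assumptions)
set.seed(1)
mu    <- 2     # assumed true mean
n_sim <- 500   # number of simulated samples per sample size

# Hypothetical helper: weight 0.1 on the mean of the first 100 observations,
# weight 0.9 on the mean of the remaining n - 100 observations
mu_hat_c <- function(y) {
  n <- length(y)
  0.1 / 100 * sum(y[1:100]) + 0.9 / (n - 100) * sum(y[101:n])
}

for (n in c(200, 2000, 20000)) {
  est <- replicate(n_sim, mu_hat_c(rnorm(n, mean = mu, sd = 1)))
  cat(sprintf("n = %5d  mean = %.4f  sd = %.4f  theoretical sd = %.4f\n",
              n, mean(est), sd(est), sqrt(0.0001 + 0.81 / (n - 100))))
}
```

The mean stays close to `mu` (unbiasedness), while the spread does not vanish, which is what inconsistency means here.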
Solution Theoretical Exercise 2 (10 points)
Note: no bold notation used to indicate matrices.
First, consider the following matrix derivative rules. If $a$ and $x$ are column vectors and $A$ is a matrix, then according to the denominator layout notation, $\frac{\partial a'x}{\partial x} = a$, $\frac{\partial x'a}{\partial x} = a$, and $\frac{\partial x'Ax}{\partial x} = (A + A')x$, which reduces to $\frac{\partial x'Ax}{\partial x} = 2Ax$ if $A$ is symmetric (see for example exercise c) from week 4). Expanding the objective function, $S(\beta) = (y - X\beta)'(y - X\beta) = y'y - 2y'X\beta + \beta'X'X\beta$. Realising that $X'X$ is symmetric and using the matrix derivative rules, the first order condition requires that $\frac{\partial S(\beta)}{\partial \beta} = -2X'y + 2X'X\beta = 0$. Then, $2X'X\hat{\beta} = 2X'y$. Assuming that the inverse of $X'X$ exists, $\hat{\beta} = (X'X)^{-1}X'y$. For this to be the unique minimum, the second order condition requires that $\frac{\partial^2 S(\beta)}{\partial \beta\, \partial \beta'}$ is positive definite. Here $\frac{\partial^2 S(\beta)}{\partial \beta\, \partial \beta'} = 2X'X$. $X'X$ is positive definite if we assume that the explanatory variables are not perfectly collinear, so that the columns of $X$ are linearly independent: in that case $Xc \neq 0$ for every $c \neq 0$, and hence $c'X'Xc = (Xc)'(Xc) > 0$.
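The derivation can be verified numerically. The sketch below uses simulated data; the design matrix, the coefficients and the sample size are illustrative assumptions, not part of the exercise. It solves the normal equations, checks that the gradient $-2X'y + 2X'X\beta$ vanishes at the solution, checks that the eigenvalues of $2X'X$ are positive, and compares the result with `lm()`.

```r
# Numerical check of the OLS first and second order conditions (assumed data)
set.seed(1)
n    <- 200
X    <- cbind(1, rnorm(n), rnorm(n))   # assumed design: intercept and two regressors
beta <- c(1, 0.2, 0.5)                 # assumed true coefficients
y    <- drop(X %*% beta + rnorm(n))

# Solve the normal equations X'X b = X'y
b_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# First order condition: the gradient should be numerically zero at b_hat
grad <- -2 * crossprod(X, y) + 2 * crossprod(X) %*% b_hat

# Second order condition: all eigenvalues of the Hessian 2X'X should be positive
hess_eig <- eigen(2 * crossprod(X), symmetric = TRUE, only.values = TRUE)$values

print(b_hat)
print(max(abs(grad)))
print(hess_eig)
print(all.equal(b_hat, unname(coef(lm(y ~ X - 1)))))
```

Using `solve(A, b)` rather than forming the inverse explicitly gives the same estimate and is usually the numerically safer choice.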
Solution Theoretical Exercise 3 (10 points)
Start with $M_2 = I - P_2 = I - x_2(x_2'x_2)^{-1}x_2'$. Post-multiply both sides of the equation with $x_1$ to obtain $M_2 x_1 = x_1 - x_2(x_2'x_2)^{-1}x_2'x_1$. Since $x_1$ and $x_2$ are orthogonal, $x_2'x_1 = 0$. Hence, $M_2 x_1 = x_1$. In the regression of $y$ on $x_1$ and $x_2$, $b_1 = ((M_2 x_1)'M_2 x_1)^{-1}(M_2 x_1)'y$. Since $M_2 x_1 = x_1$, $b_1 = (x_1'x_1)^{-1}x_1'y$.
In the regression of $y$ on $x_1$ and $x_2$, $b_1 = ((M_2 x_1)'M_2 x_1)^{-1}(M_2 x_1)'y$ gives the effect of $x_1$ on $y$ where the effect of $x_2$ is 'partialled out' from $x_1$, because $M_2 x_1$ is orthogonal to $x_2$. However, since we know that $x_1$ is orthogonal to $x_2$, we do not need to partial out the effect of $x_2$ from $x_1$. Hence, the transformation $M_2 x_1$ is not necessary. We do not need to control for $x_2$ while studying the effect of $x_1$ on $y$, because $x_2$ explains none of the variation in $x_1$.
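The argument can be illustrated with a short numerical sketch. The data below are simulated under assumptions that are not part of the exercise (sample size, coefficients, and the construction of `x1` by removing its projection on `x2` so that the two regressors are exactly orthogonal). The coefficient on `x1` should then be the same whether or not `x2` is included.

```r
# Sketch: with orthogonal regressors, partialling out x2 changes nothing (assumed data)
set.seed(1)
n  <- 100
x2 <- rnorm(n)
z  <- rnorm(n)
x1 <- z - x2 * sum(x2 * z) / sum(x2^2)   # make x1 exactly orthogonal to x2
y  <- 0.2 * x1 + 0.5 * x2 + rnorm(n)     # assumed coefficients

# Residual maker M2 = I - x2 (x2'x2)^{-1} x2'
M2   <- diag(n) - tcrossprod(x2) / sum(x2^2)
M2x1 <- drop(M2 %*% x1)

b1_short <- sum(x1 * y) / sum(x1^2)                   # y on x1 only
b1_long  <- unname(coef(lm(y ~ x1 + x2 - 1))["x1"])   # y on x1 and x2
b1_fwl   <- sum(M2x1 * y) / sum(M2x1^2)               # partialled-out formula

print(all.equal(M2x1, x1))
print(c(short = b1_short, long = b1_long, fwl = b1_fwl))
```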
Solution Theoretical Exercise 4 (5 pts for a), 10 for b) )
a) Code Plot:
rm(list=ls())
set.seed(1)
N_sim = 1000
N_obs = 1000
B_true = rbind(0.2,0.5)
N_par = length(B_true)
Cov_sim = c(0.1,0.9) # defines Covariances we use
B_hat_sim_1 = matrix(NA,nrow=N_sim,ncol=length(Cov_sim))
# install.packages("mvtnorm")
library("mvtnorm") # load package to draw multivariate normal
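for (j in 1:length(Cov_sim)) {
  # covariance matrix of (x1, x2): unit variances, covariance Cov_sim[j]
  VCOV = matrix(c(1,Cov_sim[j],Cov_sim[j],1),nrow=2,ncol=2,byrow=TRUE)
  X = rmvnorm(N_obs,mean=c(0,0),sigma=VCOV) # regressors are drawn once per covariance value
  for (i in 1:N_sim) {
    y = X%*%B_true + rnorm(N_obs,mean=0,sd=1) # new error draw in every replication
    B_hat_sim = solve(t(X)%*%X)%*%t(X)%*%y # OLS estimate (X'X)^{-1} X'y
    B_hat_sim_1[i,j] = B_hat_sim[1] # keep the estimate of the first coefficient
  }
}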
plot(density(B_hat_sim_1[,1],bw = 0.03),ylim = c(0,15),
main = "Sampling distributions of the OLS estimator",
xlab = "B_hat", col = "steelblue", lwd = 3)
lines(density(B_hat_sim_1[,2],bw = 0.03),col = "firebrick",lwd = 3)
abline(v=B_true[1],col="black",lwd=2)
legend("topleft",c("COV(x1,x2) = 0.1","COV(x1,x2) = 0.9"),lty = 1,