Empirical Methods of Finance
Part 2 – Block 2: (More on) Panel data
Panel data have both a time-series and a cross-sectional dimension:
they measure the same collection of people, firms, etc. over several periods.
Simplest setup: 𝑦𝑖𝑡 = 𝛼 + 𝛽1𝑥1,𝑖𝑡 + ⋯ + 𝛽𝑘𝑥𝑘,𝑖𝑡 + 𝑢𝑖𝑡
𝑦𝑖𝑡 could be the stock return of firm i in year t
The explanatory variables could be:
o firm-level variables (such as earnings growth of firm i in year t)
o but also pure time-series variables such as GDP growth
o or pure cross-sectional variables (such as industry dummies)
Advantages of using panel data:
Panel data can address a broader range of issues and tackle more complex problems
than pure time-series or pure cross-sectional data alone, for example:
o Difference-in-difference (as discussed in part 1)
o Regression discontinuity (as discussed in part 1)
It is often of interest to examine how the relationships between variables change over
time.
By structuring the model in an appropriate way, we can remove the impact of certain
forms of omitted-variable bias in regression results.
Estimation models/methods for panel data: (*Most important methods)
1. Pooled OLS
2. Seemingly unrelated regressions (SUR)
3. Fixed effects estimator
4. Random effects estimator
5. Fama-MacBeth estimator (see point III.)
1) Pooled OLS
This approach is advisable in case of a small sample (both N and T small).
Estimate a single, pooled regression on all the observations together;
Pooling the data in this way assumes that the constant term and slope coefficients do
not vary across i and t.
Regression: 𝑦𝑖𝑡 = 𝛼 + 𝛽1𝑥1,𝑖𝑡 + ⋯ + 𝛽𝑘𝑥𝑘,𝑖𝑡 + 𝑢𝑖𝑡
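A minimal sketch of pooled OLS with a single regressor, to make the "stack everything and fit one line" idea concrete. The function name pooled_ols and the toy panel (two firms, three years, constructed so that y = 1 + 2x exactly) are assumptions for illustration, not part of the course material.

```python
# Pooled OLS with one regressor: ignore the panel structure, stack all
# (i, t) observations and fit a single intercept and a single slope.

def pooled_ols(panel):
    """panel: list of (entity, period, x, y) rows. Returns (alpha, beta)."""
    xs = [row[2] for row in panel]
    ys = [row[3] for row in panel]
    n = len(panel)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Standard simple-regression formulas applied to the stacked data
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta

# Hypothetical toy panel: firms A and B, years 1..3, y = 1 + 2x exactly.
panel = [
    ("A", 1, 0.0, 1.0), ("A", 2, 1.0, 3.0), ("A", 3, 2.0, 5.0),
    ("B", 1, 1.0, 3.0), ("B", 2, 3.0, 7.0), ("B", 3, 4.0, 9.0),
]
alpha, beta = pooled_ols(panel)
```

The entity and period labels are carried along but never used, which is exactly the pooling assumption: the same constant and slope for every i and t.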
3) Fixed effects model
Slope coefficients the same across i (firms), but constant term allowed to differ across i.
By including fixed effects you control for unobservable, time-invariant differences across
the units (firms, individuals) you analyze, and so avoid the omitted-variable bias they would cause.
Can think of 𝛼𝑖 as capturing all (omitted) variables that affect 𝑦𝑖𝑡 cross-sectionally but
do not vary over time.
Fixed effects models:
a. One approach: incorporate N dummy variables
b. The Within Transformation;
c. Time Fixed Effects Models;
a) One approach
Regression: 𝑦𝑖𝑡 = 𝛼1𝐷1𝑖 + 𝛼2𝐷2𝑖 + ⋯ + 𝛼𝑁𝐷𝑁𝑖 + 𝛽1𝑥1,𝑖𝑡 + ⋯ + 𝛽𝑘𝑥𝑘,𝑖𝑡 + 𝑢𝑖𝑡
𝐷1𝑖 is a dummy variable equal to 1 for observations on the first entity in
the sample and zero otherwise.
Not practical if N is large (so use the "within" transformation instead)
b) Within transformation
Take the time-series mean of each entity (firm): $\bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}$
Subtract this (step above) from the values of the variable 𝑦𝑖1, ... , 𝑦𝑖𝑇
Do this for all entities i (firms) and for all explanatory variables
The model containing the demeaned variables is:
$y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + u_{it} - \bar{u}_i$
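The demeaning steps above can be sketched as follows for a single regressor. The function name within_estimator and the toy data are assumptions for illustration: firm A has intercept 0, firm B has intercept 10, and both share the slope 2.

```python
from collections import defaultdict

def within_estimator(panel):
    """panel: list of (entity, period, x, y). Returns the fixed-effects slope."""
    # Step 1: time-series mean of x and y for each entity
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for entity, _, x, y in panel:
        s = sums[entity]
        s[0] += x
        s[1] += y
        s[2] += 1
    means = {e: (sx / n, sy / n) for e, (sx, sy, n) in sums.items()}

    # Step 2: subtract the entity mean from every observation
    x_dm = [x - means[e][0] for e, _, x, _ in panel]
    y_dm = [y - means[e][1] for e, _, _, y in panel]

    # Step 3: OLS on the demeaned data (no intercept: it has been removed)
    return sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)

# Hypothetical toy panel: y = alpha_i + 2x with alpha_A = 0 and alpha_B = 10.
panel = [
    ("A", 1, 1.0, 2.0), ("A", 2, 2.0, 4.0), ("A", 3, 3.0, 6.0),
    ("B", 1, 4.0, 18.0), ("B", 2, 5.0, 20.0), ("B", 3, 6.0, 22.0),
]
beta_fe = within_estimator(panel)
```

On this toy panel the within estimator recovers the slope of 2, whereas a pooled regression on the same six observations gives a slope of about 4.57, because the firm with the higher intercept also has the higher x values.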
c) Time fixed effects model
Use this model when the average value of 𝑦𝑖𝑡 changes over time but not cross-sectionally.
By including time fixed effects, you control for any common time-series variation
in the variables.
Time-fixed effects model: 𝑦𝑖𝑡 = 𝛼 + 𝜆𝑡 + 𝛽1𝑥1,𝑖𝑡 + ⋯ + 𝛽𝑘𝑥𝑘,𝑖𝑡 + 𝑢𝑖𝑡
o 𝜆𝑡 = a time-varying intercept
Also possible to allow for both entity fixed effects and time fixed effects within the
same model. Such a model would contain both cross-sectional and time dummies.
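As a rough sketch of a model with both entity and time fixed effects: in a balanced panel, including both sets of dummies is equivalent to demeaning each variable by its entity mean and its period mean and adding back the grand mean. The function name two_way_within and the toy effects below are assumptions for illustration, and the shortcut relies on the panel being balanced.

```python
def two_way_within(panel):
    """panel: dict {(entity, period): (x, y)} for a BALANCED panel.
    Returns the slope after removing entity and time fixed effects."""
    entities = sorted({i for i, _ in panel})
    periods = sorted({t for _, t in panel})

    def demean(k):
        # k = 0 selects x, k = 1 selects y
        grand = sum(v[k] for v in panel.values()) / len(panel)
        ent_mean = {i: sum(panel[i, t][k] for t in periods) / len(periods)
                    for i in entities}
        per_mean = {t: sum(panel[i, t][k] for i in entities) / len(entities)
                    for t in periods}
        return {(i, t): panel[i, t][k] - ent_mean[i] - per_mean[t] + grand
                for (i, t) in panel}

    x_dm, y_dm = demean(0), demean(1)
    sxy = sum(x_dm[key] * y_dm[key] for key in panel)
    sxx = sum(x_dm[key] ** 2 for key in panel)
    return sxy / sxx

# Hypothetical toy data: y = firm effect + time effect + 2x.
firm_effect = {"A": 0.0, "B": 10.0}
time_effect = {1: 0.0, 2: 5.0, 3: -3.0}
x_values = {("A", 1): 1.0, ("A", 2): 2.0, ("A", 3): 4.0,
            ("B", 1): 2.0, ("B", 2): 5.0, ("B", 3): 3.0}
panel = {(i, t): (x, firm_effect[i] + time_effect[t] + 2.0 * x)
         for (i, t), x in x_values.items()}
beta_two_way = two_way_within(panel)
```

Both the firm effects and the time effects drop out of the demeaned data, so the slope of 2 is recovered even though neither set of effects is estimated explicitly.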
4) Random effects model
Under the random effects model, the intercepts for each cross-sectional unit are
assumed to arise from:
o a common intercept (the same for all cross-sectional units and over time); and
o plus a random variable $\varepsilon_i$ that varies cross-sectionally but is constant over time.
Random effects panel model: $y_{it} = \alpha + \beta x_{it} + \omega_{it}$, with
$\omega_{it} = \varepsilon_i + v_{it}$
Heterogeneity (variation) in the cross-sectional dimension occurs via the $\varepsilon_i$ terms (and
not via dummies). So, this framework requires that $\varepsilon_i$:
o Has zero mean and constant variance;
o Is independent of the individual observation error term $v_{it}$;
o Is independent of the explanatory variables.
Advantages of the random effects model:
Fewer parameters to be estimated, compared to fixed effects.
One can still estimate the effect of variables that are constant over time.
Standard errors
A common issue:
o error terms across firms, households, etc. are positively correlated; and
o error terms over time are positively correlated (persistence)
In these cases, if you neglect these positive correlations, standard errors are too low.
Correct the standard errors for such patterns by using clustered standard errors.
Define clusters: Error terms are then allowed to be correlated within each cluster
but assumed not to be correlated across clusters.
If you cluster standard errors by firm: each firm is a cluster;
– So error terms within each firm are allowed to be correlated across time:
$\mathrm{Cov}(u_{is}, u_{it}) \neq 0$
– But no correlation across firms: $\mathrm{Cov}(u_{is}, u_{jt}) = 0$ for $i \neq j$
Can also cluster by firm and year (double or two-way clustering):
– $\mathrm{Cov}(u_{it}, u_{jt}) \neq 0$: correlation across firms within a year
– $\mathrm{Cov}(u_{is}, u_{it}) \neq 0$: correlation across years within a firm
– But $\mathrm{Cov}(u_{is}, u_{jt}) = 0$ for $i \neq j$ and $s \neq t$: no correlation across both year and firm
Usually, two-way clustering is used.
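To illustrate why ignoring within-firm correlation understates standard errors, here is a minimal sketch of one-way clustering for a regression with a single regressor. It sums the scores (demeaned x times residual) within each cluster before squaring. The function name, the toy data and the absence of any small-sample correction are assumptions of this sketch; statistical packages typically apply such a correction, and two-way clustering needs an extra step that is not shown here.

```python
import math
from collections import defaultdict

def ols_with_clustered_se(rows, cluster_key):
    """rows: list of (firm, year, x, y); cluster_key maps a row to its cluster.
    Returns (alpha, beta, se_beta), with no small-sample correction."""
    n = len(rows)
    x_bar = sum(r[2] for r in rows) / n
    y_bar = sum(r[3] for r in rows) / n
    sxx = sum((r[2] - x_bar) ** 2 for r in rows)
    beta = sum((r[2] - x_bar) * (r[3] - y_bar) for r in rows) / sxx
    alpha = y_bar - beta * x_bar

    # Add up (demeaned x) * residual within each cluster, so that errors
    # inside a cluster may be correlated with each other.
    score = defaultdict(float)
    for r in rows:
        resid = r[3] - alpha - beta * r[2]
        score[cluster_key(r)] += (r[2] - x_bar) * resid

    var_beta = sum(s ** 2 for s in score.values()) / sxx ** 2
    return alpha, beta, math.sqrt(var_beta)

# Hypothetical toy data: x and the error term are both constant within a firm.
rows = [
    ("A", 1, 1.0, 2.0), ("A", 2, 1.0, 2.0),
    ("B", 1, 2.0, 1.0), ("B", 2, 2.0, 1.0),
    ("C", 1, 3.0, 2.0), ("C", 2, 3.0, 2.0),
    ("D", 1, 4.0, 5.0), ("D", 2, 4.0, 5.0),
]
# Cluster by firm: each firm is one cluster.
_, beta, se_firm = ols_with_clustered_se(rows, lambda r: r[0])
# For comparison: every observation is its own cluster (no clustering).
_, _, se_obs = ols_with_clustered_se(rows, lambda r: (r[0], r[1]))
```

On this toy data the slope is 1 in both cases, but the firm-clustered variance of the slope (0.2) is twice the unclustered one (0.1), which is the "standard errors are too low" effect described above.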
Panel data of asset returns:
Methods for panel data of asset returns:
I. The cross-sectional approach to test the CAPM
II. Additional test of the CAPM: An extended second-stage regression
III. Fama-MacBeth procedure.
I. The cross-sectional approach to test the CAPM
CAPM says: $E(R_i) = R_f + \beta_i [E(R_m) - R_f]$
Risk premium = $\beta_i [E(R_m) - R_f]$
Steps of testing the CAPM:
1. Estimate each stock's beta;
2. Analyze whether average returns indeed increase with beta (in the cross-section).
Step 1: Calculating the beta:
Approach 1: Calculate it directly => $\beta_i = \frac{\mathrm{Cov}(R^e_i, R^e_m)}{\mathrm{Var}(R^e_m)}$
Approach 2: Run a time-series regression of the excess portfolio returns on the
excess market returns => $R^e_{i,t} = a_i + \beta_i R^e_{m,t} + u_{i,t}$
o Do this separately for each portfolio, and the slope estimate will be the
beta.
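The two approaches in Step 1 give the same number, which the following sketch checks on made-up data. The function names and the return series are assumptions for illustration; the portfolio's excess return is constructed as 0.002 + 1.5 times the market's excess return.

```python
def beta_cov_over_var(asset_excess, market_excess):
    """Approach 1: sample covariance divided by sample variance."""
    n = len(market_excess)
    a_bar = sum(asset_excess) / n
    m_bar = sum(market_excess) / n
    cov = sum((a - a_bar) * (m - m_bar)
              for a, m in zip(asset_excess, market_excess)) / (n - 1)
    var = sum((m - m_bar) ** 2 for m in market_excess) / (n - 1)
    return cov / var

def beta_time_series_regression(asset_excess, market_excess):
    """Approach 2: OLS of asset excess returns on a constant and the
    market excess return. Returns (a_i, beta_i)."""
    n = len(market_excess)
    a_bar = sum(asset_excess) / n
    m_bar = sum(market_excess) / n
    slope = (sum((m - m_bar) * (a - a_bar)
                 for a, m in zip(asset_excess, market_excess))
             / sum((m - m_bar) ** 2 for m in market_excess))
    intercept = a_bar - slope * m_bar
    return intercept, slope

# Hypothetical excess returns: portfolio = 0.002 + 1.5 * market.
market_excess = [0.01, -0.02, 0.03, 0.00]
asset_excess = [0.002 + 1.5 * m for m in market_excess]

beta_direct = beta_cov_over_var(asset_excess, market_excess)
a_i, beta_reg = beta_time_series_regression(asset_excess, market_excess)
```

In practice this is repeated for every portfolio in the cross-section, giving one estimated beta per portfolio for use in Step 2.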
Step 2: Single cross-sectional regression of average portfolio return (over time) on a
constant and the betas:
- $\bar{R}^e_i = \lambda_0 + \lambda_1 \hat{\beta}_i + v_i$
- The number of observations for this cross-sectional regression is the size of the
cross-section.
- The regression will provide estimates for $\lambda_0$ and $\lambda_1$.
- According to CAPM:
o $\lambda_0 = 0$
o $\lambda_1 = E(R_m) - R_f$
CAPM-predicted value: $E(R_i) - R_f = \beta_i \bar{R}^e_m$
o $\bar{R}^e_m$ = sample average of the excess market return
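Step 2 can be sketched as a single cross-sectional regression of average excess returns on the estimated betas, followed by a comparison with the CAPM-predicted values. All numbers below are made up for illustration (four portfolios and an assumed average excess market return of 6%), and the sketch only produces point estimates; it does not test whether the deviations from the CAPM are statistically significant.

```python
def cross_sectional_regression(avg_excess_returns, betas):
    """Regress average excess returns on a constant and the betas.
    Returns (lambda_0, lambda_1)."""
    n = len(betas)
    b_bar = sum(betas) / n
    r_bar = sum(avg_excess_returns) / n
    lambda_1 = (sum((b - b_bar) * (r - r_bar)
                    for b, r in zip(betas, avg_excess_returns))
                / sum((b - b_bar) ** 2 for b in betas))
    lambda_0 = r_bar - lambda_1 * b_bar
    return lambda_0, lambda_1

# Hypothetical first-stage output: one beta and one average excess
# return per portfolio (the number of observations is the cross-section).
betas = [0.5, 1.0, 1.5, 2.0]
avg_excess_returns = [0.03, 0.05, 0.07, 0.09]
avg_market_excess = 0.06  # assumed sample average excess market return

lambda_0, lambda_1 = cross_sectional_regression(avg_excess_returns, betas)

# CAPM-predicted average excess return for each portfolio: beta_i * market
capm_predicted = [b * avg_market_excess for b in betas]
```

In this made-up example the estimated intercept is 0.01 rather than 0 and the estimated slope is 0.04 rather than the market's 0.06, which is the kind of deviation (too high an intercept, too flat a slope) the cross-sectional test is designed to reveal.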