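In all previous chapters we used asymptotic theory, e.g. the CLT, $\bar X = \frac{1}{N}\sum_i X_i \overset{a}{\sim} N(\mu, \sigma^2/N)$, and chi-square results, $\sum_{j=1}^{v} z_j^2 \sim \chi^2(v)$ for independent $z_j \sim N(0,1)$.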
An alternative approximation is provided by the bootstrap.
Bootstrapping is used to estimate (by simulation) the sampling distribution, by repeatedly sampling from the observed sample/data. The goal of bootstrapping is to make inference about a population without making strong assumptions about the underlying distribution.
2 approaches
1. Simple bootstrapping: draws asymptotic conclusions when theory is hard to implement
2. Bootstrap with asymptotic refinements: provides asymptotic refinements that lead to better approximations
notation
· $\hat\theta$: estimate
· $w_i = (y_i, x_i)$: sample
· $s_{\hat\theta}$: standard error
· $t = (\hat\theta - \theta_0)/s_{\hat\theta}$: t-statistic
· $\theta_0$: value of $\theta$ under the null hypothesis
basics of bootstrapping
1. Simple bootstrapping
Suppose $y_i \sim F(\mu, \sigma^2)$, where F is a population distribution, e.g. Normal or Chi-square. Hence:
> real population: $F(\mu, \sigma^2)$
> sample/bootstrap population: $\{y_1, \ldots, y_N\}$
> bootstrap sample: $\{y_1^*, \ldots, y_N^*\}$; we can generate B bootstrap samples (drawing with replacement!)
Then we can calculate
> mean of means: $\bar y^* = \frac{1}{B}\sum_{b=1}^{B} \bar y_b^*$
> variance of means: $\widehat{Var}(\bar y) = \frac{1}{B-1}\sum_{b=1}^{B} (\bar y_b^* - \bar y^*)^2$
In general, for an estimator $\hat\theta$, we can use bootstrapping to estimate $Var(\hat\theta)$ when analytic formulas for $Var(\hat\theta)$ are complex. Such bootstraps are valid and have similar properties to estimates obtained from the usual theory.
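The mean of means and variance of means can be sketched in a few lines of NumPy. This is only an illustration: the data are simulated, and the names `bootstrap_means`, `B` and `seed` are assumptions made for the example, not part of the notes.

```python
import numpy as np

def bootstrap_means(y, B=999, seed=0):
    """Draw B bootstrap samples (with replacement) and return their means."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    N = y.size
    # Each row of idx selects one bootstrap sample of size N from the data.
    idx = rng.integers(0, N, size=(B, N))
    return y[idx].mean(axis=1)

# Hypothetical data, for illustration only.
y = np.random.default_rng(42).normal(loc=2.0, scale=1.0, size=50)

means = bootstrap_means(y)
mean_of_means = means.mean()           # (1/B) * sum_b ybar*_b
var_of_means = means.var(ddof=1)       # (1/(B-1)) * sum_b (ybar*_b - ybar*)^2
analytic_var = y.var(ddof=1) / y.size  # usual estimate s^2 / N, for comparison

print(mean_of_means, var_of_means, analytic_var)
```

For the sample mean the analytic variance is easy, so the two variance estimates should be close; the bootstrap becomes useful when no such formula is available.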
2. Bootstrapping with asymptotic refinements
In some cases it is possible to improve on simple bootstrapping and obtain estimates that may better approximate the finite-sample distribution of $\hat\theta$, using refined asymptotic theory.
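Until now we know the following from asymptotic theory (Taylor expansion): $\Pr\left[\sqrt{N}(\hat\theta - \theta_0)/\sigma \le z\right] = \Phi(z) + R_1$
We now look at the Edgeworth expansion: $\Pr\left[\sqrt{N}(\hat\theta - \theta_0)/\sigma \le z\right] = \Phi(z) + \frac{g_1(z)\phi(z)}{\sqrt{N}} + R_2$
Here $\sigma$ is the asymptotic standard deviation of $\sqrt{N}(\hat\theta - \theta_0)$, and $\Phi$ and $\phi$ are the standard normal cdf and density.
The Edgeworth expansion is a better approximation but difficult to implement theoretically. A bootstrap with asymptotic refinement provides a simple computational method to implement the Edgeworth expansion.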
For asymptotic refinement to occur, the statistic being bootstrapped must be an asymptotically pivotal statistic:
> a statistic whose limit distribution does not depend on unknown parameters
Ex. $y_i \sim F(\mu, \sigma^2)$ depends on F, $\mu$ and $\sigma^2$. Then $\bar y \overset{a}{\sim} N(\mu, \sigma^2/N)$ depends on $\mu$ and $\sigma^2$. Under $H_0: \mu = \mu_0$, the distribution still depends on $\sigma^2$; using $t = \frac{\bar y - \mu_0}{SE(\bar y)} \overset{a}{\sim} N(0, 1)$, we find a pivotal statistic.
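bootstrap algorithm
step 1: we have the given data $\{w_1, \ldots, w_N\}$; draw a bootstrap sample $\{w_1^*, \ldots, w_N^*\}$
step 2: calculate the appropriate statistics, ex. $\hat\theta^*$, $s_{\hat\theta^*}$, $t^* = (\hat\theta^* - \hat\theta)/s_{\hat\theta^*}$
step 3: repeat steps 1 and 2 B independent times and obtain ex. $\hat\theta_1^*, \ldots, \hat\theta_B^*$ or $t_1^*, \ldots, t_B^*$
step 4: use these bootstrapped values to obtain a bootstrapped version of the statistics
ex. bias: $\bar{\hat\theta}^* - \hat\theta$, which approximates $E(\hat\theta) - \theta_0$
standard error: $SE_{boot}(\hat\theta) = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B} (\hat\theta_b^* - \bar{\hat\theta}^*)^2}$
2-sided equal-tail CI: $\left(\hat\theta - t^*_{[(1-\alpha/2)(B+1)]}\, SE(\hat\theta),\ \hat\theta - t^*_{[(\alpha/2)(B+1)]}\, SE(\hat\theta)\right)$
bootstrapped p-value: $2\min\left(\frac{1}{B}\sum_{b} 1(t_b^* \le t),\ \frac{1}{B}\sum_{b} 1(t_b^* > t)\right)$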
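The four steps can be sketched for the simplest case, where the estimator is the sample mean and its standard error is $s/\sqrt{N}$. This is a minimal sketch under those assumptions; the function name `percentile_t_bootstrap`, the simulated data and the choice B = 999 are hypothetical, and the p-value uses the equal-tail form 2·min(share of $t^*$ at or below $t$, share above $t$).

```python
import numpy as np

def percentile_t_bootstrap(x, theta0, B=999, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    N = x.size
    theta_hat = x.mean()
    se_hat = x.std(ddof=1) / np.sqrt(N)
    t_obs = (theta_hat - theta0) / se_hat

    theta_star = np.empty(B)
    t_star = np.empty(B)
    for b in range(B):                       # step 3: repeat B times
        xb = x[rng.integers(0, N, size=N)]   # step 1: resample with replacement
        theta_star[b] = xb.mean()            # step 2: statistics on the resample
        se_b = xb.std(ddof=1) / np.sqrt(N)
        t_star[b] = (theta_star[b] - theta_hat) / se_b  # centred at theta_hat

    # step 4: bootstrapped versions of the statistics
    bias = theta_star.mean() - theta_hat
    se_boot = theta_star.std(ddof=1)
    t_sorted = np.sort(t_star)
    q_lo = t_sorted[int(round((alpha / 2) * (B + 1))) - 1]
    q_hi = t_sorted[int(round((1 - alpha / 2) * (B + 1))) - 1]
    ci = (theta_hat - q_hi * se_hat, theta_hat - q_lo * se_hat)
    p_value = 2 * min(np.mean(t_star <= t_obs), np.mean(t_star > t_obs))
    return {"theta_hat": theta_hat, "bias": bias, "se_boot": se_boot,
            "ci": ci, "p_value": p_value}

# Hypothetical skewed data, for illustration only.
x = np.random.default_rng(1).exponential(scale=1.0, size=30)
res = percentile_t_bootstrap(x, theta0=1.0)
print(res)
```

Note that the upper quantile of $t^*$ sets the lower CI bound and vice versa, which is why the interval need not be symmetric around $\hat\theta$.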
Example: $\{0.07, 0.031, 0.338, 1.690, 3.392, 0.411, 0.479, 3.572, 0.637, 0.434\}$
$H_0: \mu = 1$ vs $H_a: \mu \neq 1$, $\alpha = 5\%$
Bootstrap without refinement (when standard error is hard to determine)
$t = \frac{\bar x - 1}{SE_{boot}} \overset{a}{\sim} N(0, 1)$, $t_{obs} = \frac{1.100 - 1}{0.399} = 0.251 \notin (-\infty, -1.96] \cup [1.96, \infty)$, so $H_0$ is not rejected.
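The test without refinement can be reproduced approximately in NumPy. The observations below are copied as they appear in the example; the first value is hard to read in the source, and B = 999 and the seed are assumptions, so the output need not match the quoted figures exactly.

```python
import numpy as np

# Observations as transcribed from the example (first value uncertain).
x = np.array([0.07, 0.031, 0.338, 1.690, 3.392,
              0.411, 0.479, 3.572, 0.637, 0.434])
mu0 = 1.0                       # value of the mean under H0
B = 999                         # assumed number of bootstrap samples
rng = np.random.default_rng(0)  # assumed seed

# Resample with replacement and take the mean of each bootstrap sample.
idx = rng.integers(0, x.size, size=(B, x.size))
boot_means = x[idx].mean(axis=1)

se_boot = boot_means.std(ddof=1)    # bootstrap standard error of the mean
t_obs = (x.mean() - mu0) / se_boot  # compared to N(0, 1) critical values
reject = abs(t_obs) >= 1.96         # two-sided test at the 5% level

print(f"SE_boot = {se_boot:.3f}, t = {t_obs:.3f}, reject H0: {reject}")
```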
Bootstrap with refinement
$s^2 = \frac{1}{N-1}\sum_{i} (x_i - \bar x)^2$
further details about bootstrapping
Types of bootstrapping
In step 1 of the algorithm, we can use different kinds of bootstrapping:
> Paired bootstrap / non-parametric bootstrap / empirical distribution function bootstrap: draw the bootstrap sample $\{w_1^*, \ldots, w_N^*\}$ from $\{w_1, \ldots, w_N\}$, with $w_i = (y_i, x_i)$
> Parametric bootstrap: draw randomly from $F(x_i, \hat\theta)$
> Residual bootstrap: bootstrap from the residuals $(\hat u_1, \ldots, \hat u_N)$ to get $(\hat u_1^*, \ldots, \hat u_N^*)$
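The three resampling schemes differ only in how step 1 produces a new data set. The sketch below contrasts them for an OLS slope; the linear model, the simulated data, the normal-error assumption in the parametric version and all function names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data from y = 1 + 2x + u, for illustration only.
N = 100
x = rng.uniform(0, 1, size=N)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=N)
X = np.column_stack([np.ones(N), x])

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

beta_hat = ols(X, y)
u_hat = y - X @ beta_hat
sigma_hat = np.sqrt(u_hat @ u_hat / (N - X.shape[1]))

def paired_sample():
    # Resample whole observations w_i = (y_i, x_i) with replacement.
    idx = rng.integers(0, N, size=N)
    return X[idx], y[idx]

def parametric_sample():
    # Assumes normal errors: draw y* from N(x'beta_hat, sigma_hat^2), x fixed.
    return X, X @ beta_hat + rng.normal(scale=sigma_hat, size=N)

def residual_sample():
    # Resample fitted residuals with replacement, x fixed.
    u_star = u_hat[rng.integers(0, N, size=N)]
    return X, X @ beta_hat + u_star

B = 499
se = {}
for name, draw in [("paired", paired_sample),
                   ("parametric", parametric_sample),
                   ("residual", residual_sample)]:
    slopes = np.array([ols(*draw())[1] for _ in range(B)])
    se[name] = slopes.std(ddof=1)  # bootstrap standard error of the slope

print(se)
```

With homoskedastic normal errors, as simulated here, the three standard errors should be similar; the paired bootstrap makes the fewest assumptions about the model.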
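Optimal number of bootstraps
The bootstrap remains valid for finite B, as it relies on $N \to \infty$:
$\sqrt{B}\,(\hat\lambda_B - \hat\lambda_\infty)/\hat\lambda_\infty \overset{d}{\to} N(0, \omega)$
> $\hat\lambda_B$: quantity of interest computed with B bootstraps
> $\hat\lambda_\infty$: quantity of interest computed with $B = \infty$
Rule of thumb: $B = 384\,\omega$
ex. standard deviation: $\omega = (2 + \gamma_4)/4$, where $\gamma_4$ is the excess kurtosis of the bootstrap estimates
symmetric two-sided test/CI: $\omega = \frac{\alpha(1-\alpha)}{\left[2\, z_{\alpha/2}\, \phi(z_{\alpha/2})\right]^2}$
Looking at the loss in power when choosing B, we find that when testing we should choose B such that $\alpha(B+1)$ is an integer.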
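A small calculation makes the rule of thumb concrete. The sketch assumes the rule reads B = 384ω, with ω = (2 + excess kurtosis)/4 for a standard error and ω = α(1 − α)/[2 z φ(z)]² for a symmetric two-sided test, where z is the upper α/2 standard normal quantile. Rounding up and the helper names are choices made for this example.

```python
from fractions import Fraction
from math import ceil
from statistics import NormalDist

def omega_two_sided(alpha):
    """omega for a symmetric two-sided test or CI at level alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return alpha * (1 - alpha) / (2 * z * NormalDist().pdf(z)) ** 2

def omega_standard_error(excess_kurtosis):
    """omega for standard error estimation."""
    return (2 + excess_kurtosis) / 4

def rule_of_thumb_B(omega):
    # Rounding up to the next integer is a choice made for this sketch.
    return ceil(384 * omega)

def next_B_for_test(B_min, alpha):
    """Smallest B >= B_min such that alpha * (B + 1) is an integer."""
    a = Fraction(str(alpha))  # exact arithmetic avoids float round-off
    B = B_min
    while (a * (B + 1)).denominator != 1:
        B += 1
    return B

B_se = rule_of_thumb_B(omega_standard_error(0.0))  # zero excess kurtosis
B_test = rule_of_thumb_B(omega_two_sided(0.05))
B_adj = next_B_for_test(B_test, 0.05)
print(B_se, B_test, B_adj)
```

For α = 5% the integer condition means B + 1 must be a multiple of 20, which is why values such as 399 or 999 are common choices.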
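Standard error estimation
When it is hard to estimate the standard error, we can do this using bootstrapping:
$SE_{boot}(\hat\theta) = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B} (\hat\theta_b^* - \bar{\hat\theta}^*)^2}$, with $\bar{\hat\theta}^* = \frac{1}{B}\sum_{b=1}^{B} \hat\theta_b^*$ (bootstrap estimate of the standard error)
As this bootstrap estimate is consistent, we can use it in place of $s_{\hat\theta}$.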
Hypothesis testing
Tests with asymptotic refinement
Note that the usual test statistic is $T = \frac{\hat\theta - \theta_0}{s_{\hat\theta}} \overset{a}{\sim} N(0, 1)$
> it is a pivotal statistic, hence it has potential for asymptotic refinement
> percentile-t method: produce $t_1^*, \ldots, t_B^*$ using $t_b^* = \frac{\hat\theta_b^* - \hat\theta}{s_{\hat\theta_b^*}}$
these values, ordered from smallest to largest, are used to approximate the distribution of T
Then we can specify the bootstrap critical values:
$H_a: \theta < \theta_0$: reject if $t < t^*_{[\alpha(B+1)]}$
$H_a: \theta > \theta_0$: reject if $t > t^*_{[(1-\alpha)(B+1)]}$
$H_a: \theta \neq \theta_0$: reject if $t < t^*_{[(\alpha/2)(B+1)]}$ or $t > t^*_{[(1-\alpha/2)(B+1)]}$
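The rejection rules of the percentile-t method reduce to picking order statistics of the bootstrapped t-values. The sketch below assumes the t-values have already been computed; `percentile_t_test` and the simulated `t_star` are hypothetical, and B is assumed to be chosen so that the relevant multiples of (B + 1) are integers.

```python
import numpy as np

def percentile_t_test(t_obs, t_star, alpha=0.05):
    """Apply the bootstrap critical values to an observed t-statistic."""
    t_sorted = np.sort(np.asarray(t_star, dtype=float))
    B = t_sorted.size

    def crit(p):
        # The p*(B+1)-th smallest bootstrapped t-value (1-based position).
        k = int(round(p * (B + 1)))
        return t_sorted[k - 1]

    return {
        "less": bool(t_obs < crit(alpha)),         # Ha: theta < theta0
        "greater": bool(t_obs > crit(1 - alpha)),  # Ha: theta > theta0
        "two_sided": bool(t_obs < crit(alpha / 2)
                          or t_obs > crit(1 - alpha / 2)),
    }

# Hypothetical bootstrapped t-values, for illustration only.
t_star = np.random.default_rng(0).standard_t(df=9, size=999)
print(percentile_t_test(0.25, t_star))
```

With B = 999 and α = 5%, the two-sided rule uses the 25th and 975th ordered values, so the critical values need not be symmetric around zero.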
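Tests without asymptotic refinement
1. compute $t = \frac{\hat\theta - \theta_0}{SE_{boot}(\hat\theta)}$ and compare to the critical value of the standard normal distribution
2. percentile method: find the lower $\alpha/2$ and upper $\alpha/2$ quantiles of the bootstrap values $\hat\theta_1^*, \ldots, \hat\theta_B^*$ and reject $H_0$ if $\theta_0$ falls outside this region:
reject $H_0$ if $\theta_0$ is not contained in $\left(\hat\theta^*_{[(\alpha/2)(B+1)]},\ \hat\theta^*_{[(1-\alpha/2)(B+1)]}\right)$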
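The percentile method needs only the bootstrapped estimates themselves. A minimal sketch, assuming the sample mean as estimator; `percentile_method_test`, the simulated data and B = 999 are hypothetical.

```python
import numpy as np

def percentile_method_test(theta_star, theta0, alpha=0.05):
    """Reject H0 if theta0 lies outside the central bootstrap interval."""
    s = np.sort(np.asarray(theta_star, dtype=float))
    B = s.size
    lower = s[int(round((alpha / 2) * (B + 1))) - 1]
    upper = s[int(round((1 - alpha / 2) * (B + 1))) - 1]
    reject = not (lower < theta0 < upper)
    return reject, (lower, upper)

# Hypothetical data and bootstrap means, for illustration only.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=40)
idx = rng.integers(0, x.size, size=(999, x.size))
theta_star = x[idx].mean(axis=1)

reject, interval = percentile_method_test(theta_star, theta0=1.0)
print(reject, interval)
```

Unlike the percentile-t method, this uses the distribution of the estimates rather than of a pivotal statistic, so it does not deliver an asymptotic refinement.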