Statistical Inference
Summary
EBB075A05
Semester IA
Wouter Voskuilen
S4916344
Lecture 1
Random sample
Definition (Random sample)
If the random variables Y = (Y1, ..., Yn)′ are independent and identically distributed, we write Yi ∼ F0 (i.i.d.) and refer to their realization y = (y1, ..., yn)′ as a (simple) random sample (r.s.) from Y.
Sample space: the set of all samples you could draw. Notation: Y, with y ∈ Y.
Parametric model
Definition
The parametric statistical model (or parametric class) F is a set of pdfs with the same given functional form, whose elements differ only in the value of some finite-dimensional parameter θ:
F := {f(·; θ) | θ ∈ Θ ⊂ R^k}, k < ∞
where Θ is called the parameter space.
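For concreteness, a minimal sketch (my own illustration, not from the lecture): the normal family with θ = (μ, σ) is such a class with k = 2. Assuming Python with scipy available:

from scipy.stats import norm

def f(y, theta):
    # one member of F: the N(mu, sigma^2) density evaluated at y
    mu, sigma = theta
    return norm.pdf(y, loc=mu, scale=sigma)

print(f(0.5, (0.0, 1.0)))   # density of N(0, 1) at y = 0.5
print(f(0.5, (2.0, 0.5)))   # density of N(2, 0.25) at y = 0.5

Both calls evaluate the same functional form; only the value of θ differs.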
The likelihood function
If θ is known and y is unknown, then the observation Y = y has density f(y; θ).
If θ is unknown and y is known, we define a function that switches the roles of variable and parameter: L(θ; y) = f(y; θ).
Definition
The likelihood function for the parametric statistical model F is a function L : Θ → R+ ,
defined as
L(θ; y) := c(y) f(y; θ),
where f is the joint pdf and c(·) is a generic function of the data, for any fixed y ∈ Y.
Definition
The log-likelihood function for the parametric statistical model F is a function l : Θ → R
defined as
l(θ) = ln[L(θ)] = c + ln[f(y; θ)], with c := ln[c(y)],
using the convention l(θ) = −∞ if L(θ) = 0.
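As a numerical illustration (a hypothetical i.i.d. Bernoulli(θ) sample of my own, taking c(y) = 1 so that c = 0; assuming Python with numpy):

import numpy as np

def likelihood(theta, y):
    # L(theta; y) = prod_i theta^{y_i} * (1 - theta)^{1 - y_i}, with c(y) = 1
    return np.prod(theta**y * (1 - theta)**(1 - y))

def log_likelihood(theta, y):
    # l(theta) = sum_i [ y_i ln(theta) + (1 - y_i) ln(1 - theta) ]
    return np.sum(y * np.log(theta) + (1 - y) * np.log(1 - theta))

y = np.array([1, 0, 1, 1, 0])
print(likelihood(0.6, y))        # 0.6^3 * 0.4^2
print(np.isclose(log_likelihood(0.6, y), np.log(likelihood(0.6, y))))  # True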
Definition
Likelihoods for different samples are said to be equivalent if their ratio does not depend on θ. Notation: L(θ; y) ∝ L(θ; z).
Definition
Given two samples y and z, if their likelihood functions are equivalent, then statistical inference based on y and based on z is the same.
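A minimal check of equivalence (same hypothetical Bernoulli setup, assuming numpy): two samples with the same number of successes, in a different order, have a likelihood ratio that is constant in θ, so they carry the same information about θ.

import numpy as np

def likelihood(theta, y):
    # Bernoulli(theta) likelihood with c(y) = 1
    return np.prod(theta**y * (1 - theta)**(1 - y))

y = np.array([1, 1, 0, 0, 0])    # 2 successes out of 5
z = np.array([0, 0, 1, 0, 1])    # same count, different order

for theta in (0.2, 0.5, 0.9):
    print(likelihood(theta, y) / likelihood(theta, z))   # 1.0 for every theta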
Lecture 2
Sufficient statistics
Definition
A statistic is a function T : Y → R^r, r ∈ N+, such that T(y) does not depend on θ, with t = T(y) its realization, or sample value.
Definition
For some F, a statistic T (y) is sufficient for θ if it takes the same value at two points
y, z ∈ Y only if y and z have equivalent likelihoods; i.e., if, for all y, z ∈ Y,
T(y) = T(z) =⇒ L(θ; y) ∝ L(θ; z) for all θ ∈ Θ.
Theorem (Neyman’s factorization)
For some F, T (·) is sufficient for θ if and only if we can factorize
f(y; θ) = h(y) g(T(y); θ).
So, if T is sufficient for θ, we can factor the joint density into a part that depends only
on the data, and a part that depends on θ and on the data, but only through T.
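To make the factorization concrete (Bernoulli sketch again, my own example): for i.i.d. Bernoulli(θ) data the joint pmf is θ^T(y) (1 − θ)^(n − T(y)), so the theorem holds with h(y) = 1, T(y) = Σ yi, and g(t; θ) = θ^t (1 − θ)^(n − t). Assuming numpy:

import numpy as np

def T(y):
    return y.sum()                            # depends on the data only

def g(t, theta, n):
    return theta**t * (1 - theta)**(n - t)    # depends on y only through t

y = np.array([1, 0, 1, 1, 0])
theta = 0.3
joint = np.prod(theta**y * (1 - theta)**(1 - y))          # f(y; theta)
print(np.isclose(joint, 1.0 * g(T(y), theta, len(y))))    # True: h(y) = 1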
Minimal sufficient statistics
Definition
For some F, a sufficient statistic T (y) is minimal sufficient for θ if it takes distinct values
only at points in Y with non-equivalent likelihoods; i.e., if, for all y, z ∈ Y,
T(y) = T(z) ⇐⇒ L(θ; y) ∝ L(θ; z) for all θ ∈ Θ.
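A minimal check (same Bernoulli sketch, assuming numpy): T(y) = Σ yi is minimal sufficient, because the ratio L(θ; y)/L(θ; z) is free of θ exactly when the success counts match.

import numpy as np

def L(theta, y):
    return np.prod(theta**y * (1 - theta)**(1 - y))

y = np.array([1, 1, 0, 0])        # T(y) = 2
z_same = np.array([0, 1, 0, 1])   # T(z) = 2: equivalent likelihoods
z_diff = np.array([1, 1, 1, 0])   # T(z) = 3: not equivalent

for theta in (0.2, 0.5, 0.8):
    print(L(theta, y) / L(theta, z_same))    # constant (1.0) across theta
    print(L(theta, y) / L(theta, z_diff))    # equals (1 - theta)/theta, varies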