Lecture overview
Lecture 1: Probability spaces and conditional probability
Lecture 2: Bayes’ rule and independence
Lecture 3: Discrete RVs
Lecture 4: Continuous RVs
Lecture 5: Expectation and variance
Lecture 6: Jensen’s inequality and joint distributions
Lecture 7: Independence, covariance and correlation
Lecture 8: Poisson process
Lecture 9: Central Limit Theorem
Lecture 10: Statistical models
Lecture 11: Estimators and efficiency
Lecture 12: Maximum likelihood
Lecture 13: Linear regression
Lecture 14: Confidence intervals
Lecture 15: Testing hypotheses
Lecture 16: t-test
Lecture 1: Probability spaces and conditional probability
§ Material: 2.1, 2.2, 2.3, 2.4, 2.5, 3.1, 3.2.
§ Set up a probability model and compute probabilities.
§ Compute and interpret conditional probabilities.
§ Apply various probability rules.
2.1 Sample spaces
Sample spaces: sets whose elements describe the outcomes of the experiment in which we are
interested.
There exist 𝑛! possible permutations of 𝑛 objects.
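A minimal Python sketch (illustrative, not part of the original notes) that verifies this count by explicit enumeration for a small 𝑛:

```python
from itertools import permutations
from math import factorial

# Enumerate all orderings of n distinct objects and compare the count with n!.
n = 4
objects = list(range(n))
assert len(list(permutations(objects))) == factorial(n)  # 24 orderings for n = 4
```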
2.2 Events
Event: a subset of the sample space; an event 𝐴 occurs if the outcome of the experiment is an
element of the set 𝐴.
Set theory: 𝐴 ∩ 𝐵 (intersection), 𝐴 ∪ 𝐵 (union), 𝐴ᶜ (complement), 𝐴 ∩ 𝐵 = ∅ ↔ 𝐴 and 𝐵 are disjoint
(or mutually exclusive), 𝐴 ⊂ 𝐵 (𝐴 implies 𝐵).
De Morgan’s laws: (𝐴 ∪ 𝐵)ᶜ = 𝐴ᶜ ∩ 𝐵ᶜ and (𝐴 ∩ 𝐵)ᶜ = 𝐴ᶜ ∪ 𝐵ᶜ
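As a quick check (a Python sketch with an assumed sample space, one roll of a die), De Morgan’s laws can be verified on concrete sets:

```python
# Sample space: one roll of a fair die; A = "even outcome", B = "outcome at least 4".
omega = {1, 2, 3, 4, 5, 6}
A, B = {2, 4, 6}, {4, 5, 6}

complement = lambda E: omega - E
assert complement(A | B) == complement(A) & complement(B)  # (A ∪ B)^c = A^c ∩ B^c
assert complement(A & B) == complement(A) | complement(B)  # (A ∩ B)^c = A^c ∪ B^c
```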
2.3 Probability
Definition. A probability function 𝑃 on a finite sample space Ω assigns to each event 𝐴 in Ω a
number 𝑃(𝐴) in [0,1] such that:
§ 𝑃(Ω) = 1, and
§ 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) if 𝐴 and 𝐵 are disjoint.
The number 𝑃(𝐴) is called the probability that 𝐴 occurs.
Probability of a union. For any two events 𝐴 and 𝐵 we have: 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)
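For example (an illustrative numerical check, not from the notes): for a fair die with 𝐴 = “even outcome” and 𝐵 = “outcome at least 4”, 𝑃(𝐴 ∪ 𝐵) = 1/2 + 1/2 − 1/3 = 2/3.

```python
from fractions import Fraction

# Fair die, equally likely outcomes: A = even, B = at least 4.
omega = {1, 2, 3, 4, 5, 6}
A, B = {2, 4, 6}, {4, 5, 6}
P = lambda E: Fraction(len(E), len(omega))

# Left-hand side uses the set union A | B, i.e. the event A ∪ B.
assert P(A | B) == P(A) + P(B) - P(A & B)  # 2/3 = 1/2 + 1/2 - 1/3
```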
2.4 Products of sample spaces
When we perform an experiment 𝑛 times, the corresponding sample space is
Ω = Ω₁ × Ω₂ × … × Ωₙ, where Ωᵢ for 𝑖 = 1, …, 𝑛 is a copy of the sample space of the original
experiment. We assign probabilities to the outcomes: 𝑃((𝜔₁, 𝜔₂, …, 𝜔ₙ)) = 𝑝₁ ∗ 𝑝₂ ∗ … ∗ 𝑝ₙ, if each
𝜔ᵢ has probability 𝑝ᵢ.
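A small sketch (assumed example: three tosses of a coin with 𝑃(heads) = 0.3) of how the product rule assigns probabilities on Ω₁ × Ω₂ × Ω₃:

```python
from itertools import product

# Three tosses of a biased coin; the per-toss probabilities 0.3 / 0.7 are assumed values.
p = {"H": 0.3, "T": 0.7}
prob = {w: p[w[0]] * p[w[1]] * p[w[2]] for w in product("HT", repeat=3)}

print(round(prob[("H", "H", "T")], 3))  # 0.3 * 0.3 * 0.7 = 0.063
print(round(sum(prob.values()), 10))    # the probabilities over Ω sum to 1
```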
2.5 An infinite sample space
Definition (extension of the previous one). A probability function 𝑃 on an infinite (or finite) sample space Ω
assigns to each event 𝐴 in Ω a number 𝑃(𝐴) in [0,1] such that:
§ 𝑃(Ω) = 1, and
§ 𝑃(𝐴₁ ∪ 𝐴₂ ∪ 𝐴₃ ∪ …) = 𝑃(𝐴₁) + 𝑃(𝐴₂) + 𝑃(𝐴₃) + ⋯ if 𝐴₁, 𝐴₂, 𝐴₃, … are disjoint events.
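An illustration (a sketch with an assumed experiment: toss a fair coin until the first heads): the events 𝐴ᵢ = “first heads on toss 𝑖” are disjoint, 𝑃(𝐴ᵢ) = (1/2)ⁱ, and countable additivity gives 𝑃(𝐴₁ ∪ 𝐴₂ ∪ …) = 1.

```python
# P(first heads on toss i) = (1/2)**i; the partial sums of these disjoint
# event probabilities converge to 1, as countable additivity requires.
partial_sum = sum(0.5**i for i in range(1, 51))
print(partial_sum)  # ≈ 1.0
```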
3.1 Conditional probability
𝑃(𝐴|𝐵), we call this the conditional probability of 𝐴 given 𝐵. Computing the probability of an event
𝐴, given that an event 𝐵 occurs, means finding which fraction of the probability of 𝐵 is also in the
event 𝐴.
Definition. The conditional probability of 𝐴 given 𝐵 is given by 𝑃(𝐴|𝐵) = 𝑃(𝐴 ∩ 𝐵) / 𝑃(𝐵), provided 𝑃(𝐵) > 0.
The rule 𝑃(𝐴) + 𝑃(𝐴ᶜ) = 1 also holds for conditional probabilities, so 𝑃(𝐴|𝐵) + 𝑃(𝐴ᶜ|𝐵) = 1.
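A worked check (illustrative, with a fair die): for 𝐵 = “even outcome” and 𝐴 = “outcome at least 4”, 𝑃(𝐴|𝐵) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐵) = (2/6)/(3/6) = 2/3.

```python
from fractions import Fraction

# Fair die: B = "even outcome", A = "outcome at least 4".
omega = {1, 2, 3, 4, 5, 6}
A, B = {4, 5, 6}, {2, 4, 6}
P = lambda E: Fraction(len(E), len(omega))

P_A_given_B = P(A & B) / P(B)              # (2/6) / (3/6) = 2/3
P_Ac_given_B = P((omega - A) & B) / P(B)   # 1/3
assert P_A_given_B + P_Ac_given_B == 1     # P(A|B) + P(A^c|B) = 1
```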
3.2 The multiplication rule
For any events 𝐴 and 𝐵: 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴|𝐵) ∗ 𝑃(𝐵)
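For instance (an assumed example, not from the notes): when drawing two cards without replacement, the probability of two aces is 𝑃(first ace) ∗ 𝑃(second ace | first ace) = (4/52) ∗ (3/51) = 1/221; a brute-force enumeration agrees.

```python
from fractions import Fraction
from itertools import permutations

# Deck encoded by rank only: 4 aces ("A") among 52 cards.
deck = ["A"] * 4 + ["x"] * 48
draws = list(permutations(range(52), 2))   # ordered draws without replacement
both_aces = sum(deck[i] == "A" and deck[j] == "A" for i, j in draws)

assert Fraction(both_aces, len(draws)) == Fraction(4, 52) * Fraction(3, 51)  # 1/221
```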
Lecture 2: Bayes’ rule and independence
§ Material: 3.3, 3.4, 4.1, 4.2
§ Compute conditional probabilities with Bayes’ rule.
§ Check whether events are independent.
§ Derive and use probability mass function and the distribution function of discrete RVs.
3.3 The law of total probability and Bayes’ rule
The law of total probability. Suppose 𝐶₁, 𝐶₂, …, 𝐶ₘ are disjoint events such that 𝐶₁ ∪ 𝐶₂ ∪ … ∪ 𝐶ₘ =
Ω. The probability of an arbitrary event 𝐴 can be expressed as:
𝑃(𝐴) = 𝑃(𝐴|𝐶₁)𝑃(𝐶₁) + 𝑃(𝐴|𝐶₂)𝑃(𝐶₂) + ⋯ + 𝑃(𝐴|𝐶ₘ)𝑃(𝐶ₘ)
Bayes’ rule. Suppose the events 𝐶₁, 𝐶₂, …, 𝐶ₘ are disjoint and 𝐶₁ ∪ 𝐶₂ ∪ … ∪ 𝐶ₘ = Ω. The
conditional probability of 𝐶ᵢ, given an arbitrary event 𝐴, can be expressed as:
𝑃(𝐶ᵢ|𝐴) = 𝑃(𝐴|𝐶ᵢ)𝑃(𝐶ᵢ) / (𝑃(𝐴|𝐶₁)𝑃(𝐶₁) + 𝑃(𝐴|𝐶₂)𝑃(𝐶₂) + ⋯ + 𝑃(𝐴|𝐶ₘ)𝑃(𝐶ₘ))
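A classic illustration (with hypothetical numbers chosen here, not taken from the notes): a test for a condition with 1% prevalence, 99% sensitivity and a 5% false-positive rate. The denominator of Bayes’ rule is exactly the law of total probability.

```python
# Hypothetical numbers: C1 = "has condition", C2 = C1^c, A = "test is positive".
P_C1 = 0.01
P_C2 = 1 - P_C1
P_A_given_C1 = 0.99   # sensitivity
P_A_given_C2 = 0.05   # false-positive rate

P_A = P_A_given_C1 * P_C1 + P_A_given_C2 * P_C2   # law of total probability
P_C1_given_A = P_A_given_C1 * P_C1 / P_A          # Bayes' rule
print(round(P_C1_given_A, 3))                     # ≈ 0.167
```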
3.4 Independence
Definition. An event 𝐴 is called independent of 𝐵 if 𝑃(𝐴|𝐵) = 𝑃(𝐴).
𝐴 independent of 𝐵 ↔ 𝐴ᶜ independent of 𝐵.
𝐴 independent of 𝐵 ↔ 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴)𝑃(𝐵).
𝐴 independent of 𝐵 ↔ 𝐵 independent of 𝐴.
Independence. To show that 𝐴 and 𝐵 are independent it suffices to prove just one of the following:
𝑃(𝐴|𝐵) = 𝑃(𝐴)
𝑃(𝐵|𝐴) = 𝑃(𝐵)
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴)𝑃(𝐵)
where 𝐴 may be replaced by 𝐴ᶜ and 𝐵 replaced by 𝐵ᶜ, or both. If one of these statements holds, all
of them are true. If two events are not independent, they are called dependent.
Independence of two or more events. Events 𝐴₁, 𝐴₂, …, 𝐴ₘ are called independent if
𝑃(𝐴₁ ∩ 𝐴₂ ∩ … ∩ 𝐴ₘ) = 𝑃(𝐴₁)𝑃(𝐴₂) ⋯ 𝑃(𝐴ₘ)
and this statement also holds when any number of the events 𝐴₁, …, 𝐴ₘ are replaced by their
complements throughout the formula.
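A quick check (illustrative): for two rolls of a fair die, 𝐴 = “first roll is even” and 𝐵 = “second roll is at least 5” are independent, which can be verified via 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴)𝑃(𝐵).

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # two rolls of a fair die
P = lambda E: Fraction(len(E), len(omega))

A = [w for w in omega if w[0] % 2 == 0]        # first roll even
B = [w for w in omega if w[1] >= 5]            # second roll at least 5
AB = [w for w in omega if w in A and w in B]   # A ∩ B

assert P(AB) == P(A) * P(B)                    # 1/6 = 1/2 * 1/3
```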
4.1 Random variables
Definition. Let Ω be a sample space. A discrete random variable is a function 𝑋: Ω → ℝ that takes
on a finite number of values 𝑎₁, 𝑎₂, …, 𝑎ₙ or a countably infinite number of values 𝑎₁, 𝑎₂, …
In a way, a discrete random variable “transforms” a sample space to a more “tangible” sample
space whose events are more directly related to what you are interested in.
However, one has to determine the probability distribution of 𝑋, i.e., to describe how the probability
mass is distributed over possible values of 𝑋.
4.2 The probability distribution of a discrete random variable
Definition. The probability mass function 𝑝 of a discrete random variable 𝑋 is the function
𝑝: ℝ → [0,1], defined by 𝑝(𝑎) = 𝑃(𝑋 = 𝑎) for −∞ < 𝑎 < ∞
Definition. The distribution function 𝐹 of a random variable 𝑋 is the function 𝐹: ℝ → [0,1], defined
by 𝐹(𝑎) = 𝑃(𝑋 ≤ 𝑎) for −∞ < 𝑎 < ∞
Properties of the distribution function 𝐹 of a random variable 𝑋:
1. For 𝑎 ≤ 𝑏 one has that 𝐹(𝑎) ≤ 𝐹(𝑏).
2. Since 𝐹(𝑎) is a probability, the value of the distribution function is always between 0 and 1.
Moreover, lim_{𝑎→∞} 𝐹(𝑎) = lim_{𝑎→∞} 𝑃(𝑋 ≤ 𝑎) = 1 and lim_{𝑎→−∞} 𝐹(𝑎) = lim_{𝑎→−∞} 𝑃(𝑋 ≤ 𝑎) = 0.
3. 𝐹 is right-continuous, i.e., one has lim_{𝜀↓0} 𝐹(𝑎 + 𝜀) = 𝐹(𝑎).
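A short sketch (illustrative, with 𝑋 = sum of two fair dice as an assumed example) of the probability mass function and the distribution function, showing 𝐹(𝑎) = 𝑃(𝑋 ≤ 𝑎) as a non-decreasing step function:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# X = sum of two fair dice.
rolls = list(product(range(1, 7), repeat=2))
counts = Counter(d1 + d2 for d1, d2 in rolls)

p = {a: Fraction(counts[a], 36) for a in range(2, 13)}   # probability mass function p(a) = P(X = a)
F = lambda a: sum(p[k] for k in p if k <= a)             # distribution function F(a) = P(X <= a)

print(p[7])     # 1/6
print(F(6.5))   # P(X <= 6.5) = 15/36 = 5/12
print(F(12))    # 1
```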