“Probabilistic Machine Learning: An Introduction”
Kevin Murphy
1 Solutions
Part I
Foundations
2 Solutions
2.1 Conditional independence
PRIVATE
1. Bayes’ rule gives
$$P(H \mid E_1, E_2) = \frac{P(E_1, E_2 \mid H)\, P(H)}{P(E_1, E_2)} \tag{1}$$
Thus the information in (ii) is sufficient. In fact, we don’t need $P(E_1, E_2)$, because it is just the
normalization constant that makes the posterior sum to one over $H$. (i) and (iii) are insufficient.
2. Now the equation simplifies to
$$P(H \mid E_1, E_2) = \frac{P(E_1 \mid H)\, P(E_2 \mid H)\, P(H)}{P(E_1, E_2)} \tag{2}$$
so (i) and (ii) are obviously sufficient. (iii) is also sufficient, because we can compute $P(E_1, E_2)$ by
normalization: $P(E_1, E_2) = \sum_h P(E_1 \mid h)\, P(E_2 \mid h)\, P(h)$.
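As a numerical illustration of part 2, the following minimal sketch computes the posterior from the quantities in (iii) alone, recovering $P(E_1, E_2)$ by normalization. The probability values and the names `prior`, `p_e1_given_h`, `p_e2_given_h`, and `posterior` are hypothetical, chosen only for this example; both pieces of evidence are assumed to be observed as true.

```python
from fractions import Fraction as F

# Hypothetical numbers, chosen only for illustration.
prior = {0: F(7, 10), 1: F(3, 10)}          # P(H = h)
p_e1_given_h = {0: F(1, 5), 1: F(9, 10)}    # P(E1 = 1 | H = h)
p_e2_given_h = {0: F(1, 2), 1: F(3, 5)}     # P(E2 = 1 | H = h)


def posterior(prior, lik1, lik2):
    """Return P(H | E1 = 1, E2 = 1) and P(E1 = 1, E2 = 1).

    Assumes E1 and E2 are conditionally independent given H.
    """
    # Numerator of equation (2) for each value of H.
    unnorm = {h: lik1[h] * lik2[h] * prior[h] for h in prior}
    # The denominator is the sum of the numerators over H.
    evidence = sum(unnorm.values())
    return {h: u / evidence for h, u in unnorm.items()}, evidence


post, evidence = posterior(prior, p_e1_given_h, p_e2_given_h)
print(post, evidence)
```

Exact fractions are used so that the sum-to-one property can be checked without floating-point tolerance.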
2.2 Pairwise independence does not imply mutual independence
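We provide two counterexamples.
Let $X_1$ and $X_2$ be independent binary random variables, each uniform on $\{0, 1\}$, and let
$X_3 = X_1 \oplus X_2$, where $\oplus$ is the XOR operator. We have $p(X_3 \mid X_1, X_2) \neq p(X_3)$, since $X_3$ can be
deterministically calculated from $X_1$ and $X_2$. So the variables $\{X_1, X_2, X_3\}$ are not mutually independent.
However, we also have $p(X_3 \mid X_1) = p(X_3)$: because $X_2$ is uniform, $X_3$ is equally likely to be 0 or 1
whatever the value of $X_1$, so $X_1$ alone provides no information about $X_3$. So $X_1 \perp X_3$ and similarly
$X_2 \perp X_3$. Hence $\{X_1, X_2, X_3\}$ are pairwise independent.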
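The XOR example can be checked by brute-force enumeration. The sketch below assumes $X_1$ and $X_2$ are fair coins (each uniform on $\{0, 1\}$); the helper names `marginal` and `factorizes` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction as F
from itertools import product

# Joint over (x1, x2, x3), assuming X1 and X2 are fair and independent,
# and X3 = X1 XOR X2.
joint = {(x1, x2, x1 ^ x2): F(1, 4) for x1, x2 in product((0, 1), repeat=2)}


def marginal(joint, idx):
    """Marginal distribution over the variables at positions idx."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0) + p
    return out


def factorizes(joint, idx):
    """True if the joint over idx equals the product of single-variable marginals."""
    sub = marginal(joint, idx)
    singles = [marginal(joint, (i,)) for i in idx]
    for values in product((0, 1), repeat=len(idx)):
        prod = F(1)
        for single, v in zip(singles, values):
            prod *= single.get((v,), 0)
        if sub.get(values, 0) != prod:
            return False
    return True


pairwise = all(factorizes(joint, pair) for pair in [(0, 1), (0, 2), (1, 2)])
mutual = factorizes(joint, (0, 1, 2))
print(pairwise, mutual)
```

Every pair factorizes, but the three-way joint does not, since for example $p(0, 0, 0) = 1/4$ while the product of the marginals is $1/8$.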
Here is a different example. Let there be four balls in a bag, numbered 1 to 4. Suppose we draw one at
random. Define 3 events as follows:
• $X_1$: ball 1 or 2 is drawn.
• $X_2$: ball 2 or 3 is drawn.
• $X_3$: ball 1 or 3 is drawn.
We have $p(X_1) = p(X_2) = p(X_3) = 0.5$. Also, $p(X_1, X_2) = p(X_2, X_3) = p(X_1, X_3) = 0.25$. Hence
$p(X_1, X_2) = p(X_1)\, p(X_2)$, and similarly for the other pairs. Hence the events are pairwise independent.
However, $p(X_1, X_2, X_3) = 0 \neq 1/8 = p(X_1)\, p(X_2)\, p(X_3)$.
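The probabilities in the ball example can be verified by counting outcomes. This is a minimal sketch that represents each event as the set of ball numbers for which it occurs; the names `balls`, `events`, and `prob` are hypothetical.

```python
from fractions import Fraction as F

balls = {1, 2, 3, 4}  # one ball is drawn uniformly at random
events = {"X1": {1, 2}, "X2": {2, 3}, "X3": {1, 3}}


def prob(event):
    """Probability that the drawn ball lies in the given set."""
    return F(len(event & balls), len(balls))


p_single = {name: prob(ev) for name, ev in events.items()}
pairs = [("X1", "X2"), ("X2", "X3"), ("X1", "X3")]
# A joint event is the intersection of the corresponding sets.
p_pair = {(a, b): prob(events[a] & events[b]) for a, b in pairs}
p_triple = prob(events["X1"] & events["X2"] & events["X3"])
p_product = p_single["X1"] * p_single["X2"] * p_single["X3"]

print(p_single, p_pair, p_triple, p_product)
```

No ball belongs to all three sets, which is why the triple intersection has probability zero even though each pair shares exactly one ball.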
2.3 Conditional independence iff joint factorizes
PRIVATE
Independence $\Rightarrow$ Factorization. Let $g(x, z) = p(x \mid z)$ and $h(y, z) = p(y \mid z)$. If $X \perp Y \mid Z$ then
$$p(x, y \mid z) = p(x \mid z)\, p(y \mid z) = g(x, z)\, h(y, z) \tag{3}$$