Exam (worked solutions)

Stanford University STATS 231 hw1-solutions.

Grade: A+
Uploaded on: 06-07-2021
Written in: 2020/2021


Written for

Institution: Stanford University
Course: STATS 231

Document information

Uploaded on: July 6, 2021
Number of pages: 10
Written in: 2020/2021
Type: Exam (worked solutions)
Contains: Questions and answers


Content preview

Homework 1 solutions
CS229T/STATS231 (Fall 2018–2019)
Note: please do not copy or distribute.

Due date: Wed, Oct 10, 11pm

1. Value of labeled data (11 points)
In many applications, labeled data is expensive and therefore limited, while unlabeled data is cheap and
therefore abundant. For example, there are tons of images on the web, but getting labeled images is much
harder. What is the statistical value of having labeled data versus unlabeled data? This problem will explore
this formally using asymptotics.
Specifically, suppose we have an exponential family model over a discrete latent variable h and a discrete
observed variable x:
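$$p_\theta(h, x) = \exp\{\theta \cdot \phi(h, x) - A(\theta)\},$$
where $A(\theta) = \log \sum_{h,x} \exp\{\theta \cdot \phi(h, x)\}$ is the usual log-partition function.
Suppose that $n$ examples $(h^{(1)}, x^{(1)}), \dots, (h^{(n)}, x^{(n)})$ are drawn i.i.d. from some true distribution $p_{\theta^*}$.
Define the following two estimators:
$$\hat\theta_{\text{sup}} = \arg\max_{\theta \in \mathbb{R}^d} \frac{1}{n} \sum_{i=1}^{n} \log p_\theta(h^{(i)}, x^{(i)}) \tag{1}$$
$$\hat\theta_{\text{unsup}} = \arg\max_{\theta \in \mathbb{R}^d} \frac{1}{n} \sum_{i=1}^{n} \log \sum_{h} p_\theta(h, x^{(i)}). \tag{2}$$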

The supervised estimator $\hat\theta_{\text{sup}}$ uses the variable $h^{(i)}$ and maximizes the joint likelihood, while the unsupervised estimator $\hat\theta_{\text{unsup}}$ marginalizes out the latent variable $h$.
One important caveat: our results will hold when we assume that data is actually generated from our
model family and that unsupervised learning is possible. Otherwise, labeled data is worth a lot more.
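
As a concrete illustration of objectives (1) and (2), here is a minimal Python sketch. It assumes the features are stored in a NumPy array `phi` of shape `(H, X, d)` with `phi[h, x]` holding $\phi(h, x)$; that layout, the function names, and the toy feature $\phi(h, x) = hx$ are hypothetical choices made for this example.

```python
import numpy as np

def log_partition(theta, phi):
    """A(theta) = log sum_{h,x} exp(theta . phi(h, x)); phi has shape (H, X, d)."""
    scores = phi @ theta                      # shape (H, X)
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def supervised_objective(theta, phi, hs, xs):
    """Average joint log-likelihood, i.e. the objective maximized in (1)."""
    scores = phi @ theta
    return np.mean(scores[hs, xs]) - log_partition(theta, phi)

def unsupervised_objective(theta, phi, xs):
    """Average marginal log-likelihood, i.e. the objective maximized in (2)."""
    scores = phi @ theta
    m = scores.max(axis=0)
    # log sum_h exp(theta . phi(h, x)) for every value of x, shape (X,)
    log_marg = m + np.log(np.exp(scores - m).sum(axis=0))
    return np.mean(log_marg[xs]) - log_partition(theta, phi)

# Toy model (an assumption for illustration): h, x in {0, 1}, phi(h, x) = h * x.
phi = np.array([[[0.0], [0.0]], [[0.0], [1.0]]])   # shape (2, 2, 1)
theta = np.array([0.0])
hs = np.array([1, 0, 1])                            # made-up labels
xs = np.array([1, 1, 0])                            # made-up observations

sup = supervised_objective(theta, phi, hs, xs)
unsup = unsupervised_objective(theta, phi, xs)
```

Because the marginal likelihood sums the joint over $h$, the unsupervised objective is never smaller than the supervised one at the same $\theta$.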

a. (2 points) (supervised asymptotic variance) Compute the asymptotic variance of $\hat\theta_{\text{sup}}$: that is, given that $\sqrt{n}(\hat\theta_{\text{sup}} - \theta^*) \xrightarrow{d} \mathcal{N}(0, V_{\text{sup}})$, write an expression for $V_{\text{sup}}$ that depends on expectations/variances involving $\phi$.

Solution:
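Using notation from class, let $\ell$ denote the log-likelihood and $L$ be the expected log-likelihood. Recall that

$$V_{\text{sup}} = \nabla^2 L(\theta^*)^{-1} \, \mathrm{Cov}_{\theta^*}[\nabla \ell(z, \theta^*)] \, \nabla^2 L(\theta^*)^{-1} = \mathrm{Cov}_{\theta^*}[\nabla \ell(z, \theta^*)]^{-1},$$

where the second equality follows from Bartlett's identity, $\nabla^2 L(\theta^*) = -\mathrm{Cov}_{\theta^*}[\nabla \ell(z, \theta^*)]$, since we have assumed the model is well specified.
Now, letting $z = (h, x)$ and using $\nabla A(\theta^*) = E_{\theta^*}[\phi(h, x)]$,

$$\nabla \ell(z, \theta^*) = \nabla(\theta^* \cdot \phi(h, x) - A(\theta^*)) = \phi(h, x) - E_{\theta^*}[\phi(h, x)],$$

so, since subtracting a constant does not change the covariance,

$$V_{\text{sup}} = \mathrm{Cov}_{\theta^*}\big[\phi(h, x) - E_{\theta^*}[\phi(h, x)]\big]^{-1} = \mathrm{Cov}_{\theta^*}[\phi(h, x)]^{-1}.$$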
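
The result above can be sanity-checked numerically. The sketch below compares $\mathrm{Cov}_{\theta^*}[\phi]$, computed by enumerating all $(h, x)$, with a finite-difference Hessian of $A(\theta)$, then inverts it to get $V_{\text{sup}}$. The random feature array, the value of `theta_star`, and all function names are assumptions made for illustration.

```python
import numpy as np

def log_partition(theta, phi):
    """A(theta) = log sum_{h,x} exp(theta . phi(h, x)); phi has shape (H, X, d)."""
    scores = phi @ theta
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def cov_phi(theta, phi):
    """Cov_theta[phi(h, x)] by exact enumeration over all (h, x)."""
    p = np.exp(phi @ theta - log_partition(theta, phi))   # joint p_theta(h, x)
    mean = np.einsum("hx,hxd->d", p, phi)
    dev = phi - mean
    return np.einsum("hx,hxi,hxj->ij", p, dev, dev)

def hessian_fd(f, theta, eps=1e-4):
    """Central finite-difference Hessian of a scalar function f at theta."""
    d = theta.size
    hess = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d)
            ej = np.zeros(d)
            ei[i] = eps
            ej[j] = eps
            hess[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                          - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * eps**2)
    return hess

rng = np.random.default_rng(0)
phi = rng.normal(size=(2, 3, 2))      # hypothetical: 2 values of h, 3 of x, d = 2
theta_star = np.array([0.5, -0.3])    # hypothetical true parameter

fisher = cov_phi(theta_star, phi)
hess_A = hessian_fd(lambda t: log_partition(t, phi), theta_star)
v_sup = np.linalg.inv(fisher)
```

Since $\nabla^2 \ell = -\nabla^2 A$ for this family, agreement between `fisher` and `hess_A` is the numerical counterpart of Bartlett's identity used in the solution.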

b. (2 points) (unsupervised asymptotic variance) Compute the asymptotic variance of $\hat\theta_{\text{unsup}}$: that is, given that $\sqrt{n}(\hat\theta_{\text{unsup}} - \theta^*) \xrightarrow{d} \mathcal{N}(0, V_{\text{unsup}})$, write an expression for $V_{\text{unsup}}$ that depends on expectations/variances involving $\phi$.

Solution:
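Similar to the previous part, we have

$$\begin{aligned}
\nabla \ell(z, \theta^*) &= \nabla \log \sum_{h} \exp\{\theta^* \cdot \phi(h, x) - A(\theta^*)\} \\
&= \frac{\sum_{h} p_{\theta^*}(h, x) \, \big(\phi(h, x) - E_{\theta^*}[\phi(h, x)]\big)}{\sum_{h} p_{\theta^*}(h, x)} \\
&= E_{\theta^*}[\phi(h, x) \mid x] - E_{\theta^*}[\phi(h, x)].
\end{aligned}$$

Therefore,

$$V_{\text{unsup}} = \mathrm{Cov}_{\theta^*}\big[E_{\theta^*}[\phi(h, x) \mid x] - E_{\theta^*}[\phi(h, x)]\big]^{-1} = \mathrm{Cov}_{\theta^*}\big[E_{\theta^*}[\phi(h, x) \mid x]\big]^{-1}.$$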
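
The gradient formula in this derivation can be checked against finite differences. The following sketch does that for each value of $x$ and then assembles $V_{\text{unsup}}$ from the resulting scores. The random features, `theta_star`, and the helper names are hypothetical and chosen only for the example.

```python
import numpy as np

def joint(theta, phi):
    """Joint table p_theta(h, x); phi has shape (H, X, d)."""
    w = np.exp(phi @ theta)
    return w / w.sum()

def marginal_loglik(theta, phi, x):
    """log sum_h p_theta(h, x) for a single observed x."""
    return np.log(joint(theta, phi)[:, x].sum())

def score_unsup(theta, phi, x):
    """Closed form from the derivation: E[phi | x] - E[phi]."""
    p = joint(theta, phi)
    cond = p[:, x] / p[:, x].sum()                  # p(h | x)
    return cond @ phi[:, x, :] - np.einsum("hx,hxd->d", p, phi)

def grad_fd(f, theta, eps=1e-6):
    """Central finite-difference gradient of a scalar function f."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
phi = rng.normal(size=(2, 3, 2))      # hypothetical: 2 values of h, 3 of x, d = 2
theta_star = np.array([0.5, -0.3])    # hypothetical true parameter

px = joint(theta_star, phi).sum(axis=0)             # marginal p(x)
scores = np.stack([score_unsup(theta_star, phi, x) for x in range(3)])
# The score has mean zero, so its covariance is E[s s^T] under p(x).
fisher_unsup = np.einsum("x,xi,xj->ij", px, scores, scores)
v_unsup = np.linalg.inv(fisher_unsup)
```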

c. (3 points) (comparing estimators) Prove that $\hat\theta_{\text{sup}}$ has lower (or equal) asymptotic variance compared to $\hat\theta_{\text{unsup}}$. That is, show that

$$V_{\text{sup}} \preceq V_{\text{unsup}}.$$

Solution:
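We have

$$\begin{aligned}
V_{\text{sup}}^{-1} = \mathrm{Cov}_{\theta^*}[\phi(h, x)] &= E_{\theta^*}\big[\mathrm{Cov}_{\theta^*}[\phi(h, x) \mid x]\big] + \mathrm{Cov}_{\theta^*}\big[E_{\theta^*}[\phi(h, x) \mid x]\big] \\
&= E_{\theta^*}\big[\mathrm{Cov}_{\theta^*}[\phi(h, x) \mid x]\big] + V_{\text{unsup}}^{-1} \\
&\succeq V_{\text{unsup}}^{-1},
\end{aligned}$$

where the first line is the law of total covariance and the last inequality follows since the covariance is positive semi-definite. Because matrix inversion reverses the semidefinite order on positive definite matrices, $V_{\text{sup}} \preceq V_{\text{unsup}}$; i.e., $\hat\theta_{\text{sup}}$ has lower (or equal) asymptotic variance.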
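
To see the decomposition and the resulting ordering on a concrete case, the sketch below computes the three covariance terms by enumeration for a randomly generated model and inspects the eigenvalues of $V_{\text{unsup}} - V_{\text{sup}}$. The model sizes, the random seed, and the function name are assumptions for illustration only.

```python
import numpy as np

def covariances(theta, phi):
    """Return (Cov[phi], E[Cov[phi | x]], Cov[E[phi | x]]) under p_theta."""
    scores = phi @ theta
    p = np.exp(scores - scores.max())
    p /= p.sum()                                          # joint p(h, x), shape (H, X)
    px = p.sum(axis=0)                                    # marginal p(x)
    cond_mean = np.einsum("hx,hxd->xd", p / px, phi)      # E[phi | x], shape (X, d)
    mean = np.einsum("hx,hxd->d", p, phi)                 # E[phi]

    dev = phi - mean
    total = np.einsum("hx,hxi,hxj->ij", p, dev, dev)
    dev_within = phi - cond_mean[None, :, :]
    within = np.einsum("hx,hxi,hxj->ij", p, dev_within, dev_within)
    dev_between = cond_mean - mean
    between = np.einsum("x,xi,xj->ij", px, dev_between, dev_between)
    return total, within, between

rng = np.random.default_rng(0)
phi = rng.normal(size=(3, 4, 2))      # hypothetical: 3 values of h, 4 of x, d = 2
theta_star = rng.normal(size=2)       # hypothetical true parameter

total, within, between = covariances(theta_star, phi)
v_sup = np.linalg.inv(total)
v_unsup = np.linalg.inv(between)
gap_eigs = np.linalg.eigvalsh(v_unsup - v_sup)   # should be nonnegative up to round-off
```

Here `total` equals `within + between`, and the eigenvalues in `gap_eigs` being nonnegative is exactly the statement $V_{\text{sup}} \preceq V_{\text{unsup}}$ for this instance.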

d. (4 points)
Consider the exponential family
$$p_\theta(h, x) = \frac{1}{Z} \exp(\theta h x),$$
where $h, x \in \{0, 1\}$ and $Z = \sum_{(h, x) \in \{0, 1\}^2} \exp(\theta h x)$.[1] Essentially, $(h, x)$ is a pair of correlated biased coin flips, where $p_\theta(1, 1) = \exp(\theta)/Z$ and $p_\theta(0, 0) = p_\theta(0, 1) = p_\theta(1, 0) = 1/Z$.

[1] $Z$ is often referred to as the partition function.
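
The remainder of part d is not included in this preview, so the following Python sketch does not attempt to answer it. It only tabulates the coin-flip model and, as an illustration, plugs it into the formulas from parts a and b with the single feature $\phi(h, x) = hx$. The function names are hypothetical.

```python
import numpy as np

def coin_model(theta):
    """Joint table p[h, x] for p_theta(h, x) = exp(theta * h * x) / Z, h, x in {0, 1}."""
    h, x = np.meshgrid([0, 1], [0, 1], indexing="ij")
    phi = (h * x).astype(float)           # single feature phi(h, x) = h * x
    unnorm = np.exp(theta * phi)
    Z = unnorm.sum()                      # partition function, 3 + exp(theta)
    return unnorm / Z, phi, Z

def asymptotic_variances(theta):
    """V_sup = 1 / Var[phi] and V_unsup = 1 / Var[E[phi | x]] for the coin model."""
    p, phi, _ = coin_model(theta)
    mean = (p * phi).sum()
    fisher_sup = (p * (phi - mean) ** 2).sum()             # Var[phi]
    px = p.sum(axis=0)                                      # marginal of x
    cond_mean = (p * phi).sum(axis=0) / px                  # E[phi | x]
    fisher_unsup = (px * (cond_mean - mean) ** 2).sum()     # Var[E[phi | x]]
    return 1.0 / fisher_sup, 1.0 / fisher_unsup

p, _, Z = coin_model(0.0)
v_sup, v_unsup = asymptotic_variances(0.0)
```

At $\theta = 0$ all four outcomes have probability $1/4$, and the sketch gives $V_{\text{sup}} = 16/3$ and $V_{\text{unsup}} = 16$, consistent with the ordering proved in part c.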
