Exam

STSCI 4740 Machine Learning and Data Mining_HW1_solutions Cornell University STSCI 4740

Course
STSCI 4740

Institution
STSCI 4740

Published on February 16, 2023
Number of pages 7
Written in 2022/2023
Type Exam
Contains Questions and answers

Seller: Themanehoppe

STSCI 4740 Machine Learning and Data Mining Fall
Dr. Yang Ning Homework 1

Problem 1 (6 points)

1. Express $\mathrm{Var}(X_1 - X_2)$ through the variances and covariances of $X_1, X_2$ (assuming all variances exist).
Answer:

\[
\begin{aligned}
\mathrm{Var}(X_1 - X_2) &= E((X_1 - X_2)^2) - (E(X_1 - X_2))^2 \\
&= E(X_1^2) - 2E(X_1 X_2) + E(X_2^2) - E(X_1)^2 + 2E(X_1)E(X_2) - E(X_2)^2 \\
&= \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2)
\end{aligned}
\]
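The identity in part 1 can be checked numerically. The sketch below is only an illustration: the two data lists are hypothetical, and the helper names `mean`, `cov`, and `var` are introduced here, not taken from the homework. It treats the paired observations as an empirical distribution (moments normalized by 1/n), for which the identity holds exactly, and uses exact fractions to avoid rounding.

```python
from fractions import Fraction


def mean(values):
    """Exact mean of a list of Fractions."""
    return sum(values, Fraction(0)) / len(values)


def cov(a, b):
    """Covariance of paired values under the empirical distribution (1/n)."""
    mean_a, mean_b = mean(a), mean(b)
    return mean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])


def var(a):
    """Variance as the covariance of a variable with itself."""
    return cov(a, a)


# Hypothetical paired observations, deliberately correlated.
x1 = [Fraction(v) for v in (1, 4, 2, 8, 5)]
x2 = [Fraction(v) for v in (2, 3, 1, 9, 4)]

diff = [a - b for a, b in zip(x1, x2)]
lhs = var(diff)                                # Var(X1 - X2)
rhs = var(x1) + var(x2) - 2 * cov(x1, x2)      # Var(X1) + Var(X2) - 2 Cov(X1, X2)

print(lhs, rhs)
```

Because the covariance of these lists is nonzero, the check also shows that dropping the covariance term would give a different value.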

2. Assume that $X_1, \dots, X_n$ are i.i.d. real-valued random variables with finite variances. Show that

\[
\mathrm{Var}\Big(\frac{1}{n}\sum_{i=1}^{n} X_i\Big) = \frac{1}{n}\,\mathrm{Var}(X_1).
\]

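Answer: From part 1, applied to $X_1$ and $-X_2$, we get $\mathrm{Var}(X_1 + X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2)$. If $X_1$ and $X_2$ are independent, the covariance is zero, so the variance of a sum of independent random variables is the sum of their variances.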

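\[
\begin{aligned}
\mathrm{Var}\Big(\frac{1}{n}\sum_{i=1}^{n} X_i\Big) &= \frac{1}{n^2}\,\mathrm{Var}\Big(\sum_{i=1}^{n} X_i\Big) \\
&= \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) \quad \text{(the $X_i$ are independent)} \\
&= \frac{1}{n^2} \cdot n\,\mathrm{Var}(X_1) \quad \text{(the $X_i$ are identically distributed)} \\
&= \frac{1}{n}\,\mathrm{Var}(X_1)
\end{aligned}
\]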
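As a sanity check of this result, the variance of a sample mean can be computed exactly for a small discrete distribution by enumerating every possible sample. The sketch below assumes a fair six-sided die and n = 3; both choices, and the helper name `exact_variance`, are illustrative assumptions rather than part of the homework.

```python
from fractions import Fraction
from itertools import product


def exact_variance(outcomes):
    """Variance of a discrete variable given as (value, probability) pairs."""
    m = sum(v * p for v, p in outcomes)
    return sum((v - m) ** 2 * p for v, p in outcomes)


# Assumed distribution of each X_i: a fair six-sided die.
die = [(Fraction(k), Fraction(1, 6)) for k in range(1, 7)]
n = 3  # assumed sample size

# Enumerate all 6**n equally structured samples of independent draws.
mean_outcomes = []
for combo in product(die, repeat=n):
    value = sum(v for v, _ in combo) / n   # sample mean of this combination
    prob = Fraction(1)
    for _, p in combo:
        prob *= p                          # independence: probabilities multiply
    mean_outcomes.append((value, prob))

var_x1 = exact_variance(die)
var_mean = exact_variance(mean_outcomes)

print(var_x1, var_mean, var_x1 / n)
```

The enumeration relies on independence (probabilities multiply) and identical distributions (the same `die` for every draw), which are the two properties used in the derivation.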

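3. Assume that $X, Y$ are independent random variables with $E[X] = 0$, $E[Y] = 1$, $\mathrm{Var}(X) = 1$, $\mathrm{Var}(Y) = 2$. Compute $E[(3X + Y)(5Y + 2X - 1)]$.
Answer:

\[
\begin{aligned}
E[(3X + Y)(5Y + 2X - 1)] &= E(17XY + 5Y^2 + 6X^2 - Y - 3X) \\
&= 17E(XY) + 5E(Y^2) + 6E(X^2) - E(Y) - 3E(X) \\
&= 17\,E(X)E(Y) + 5(\mathrm{Var}(Y) + E(Y)^2) + 6(\mathrm{Var}(X) + E(X)^2) - E(Y) - 3E(X) \quad \text{($X$ and $Y$ are independent)} \\
&= 17 \cdot 0 + 5(2 + 1) + 6(1 + 0) - 1 - 3 \cdot 0 \\
&= 20
\end{aligned}
\]

The cross term collects $15XY$ from $3X \cdot 5Y$ and $2XY$ from $Y \cdot 2X$; since $E(XY) = E(X)E(Y) = 0$, its coefficient does not affect the result.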
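The value 20 depends only on the first two moments of X and Y, so it can be confirmed with any pair of independent distributions that match them. The sketch below picks two hypothetical two-point distributions (X is -1 or 1 with probability 1/2 each; Y is 0 with probability 2/3 and 3 with probability 1/3) and computes the expectation exactly. These distributions are an assumption made for the check, not something the problem specifies.

```python
from fractions import Fraction
from itertools import product

# Assumed distributions matching the stated moments:
# X: E[X] = 0, Var(X) = 1.   Y: E[Y] = 1, E[Y^2] = 3, so Var(Y) = 2.
x_dist = [(Fraction(-1), Fraction(1, 2)), (Fraction(1), Fraction(1, 2))]
y_dist = [(Fraction(0), Fraction(2, 3)), (Fraction(3), Fraction(1, 3))]


def expectation(g):
    """E[g(X, Y)] for independent X and Y (joint probability = product)."""
    return sum(px * py * g(x, y)
               for (x, px), (y, py) in product(x_dist, y_dist))


result = expectation(lambda x, y: (3 * x + y) * (5 * y + 2 * x - 1))
print(result)
```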

This study source was downloaded by 100000850872992 from CourseHero.com on 02-16-2023 08:49:35 GMT -06:00
https://www.coursehero.com/file/47582188/STSCI-4740-HW1-solpdf/

Problem 2 (8 points)

Assume that we have the regression model

\[
Y = f(X) + \varepsilon,
\]

where $\varepsilon$ is independent of $X$ and $E(\varepsilon) = 0$, $E(\varepsilon^2) = \sigma^2$. Assume that the training data $(x_1, y_1), \dots, (x_n, y_n)$ are used to construct an estimate of $f(x)$, denoted by $\hat{f}(x)$. Given a new random vector $(X, Y)$ (i.e., test data independent of the training data),

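1. Show that $E[(f(X) - \hat{f}(X))^2 \mid X = x] = \mathrm{var}(\hat{f}(x)) + [E[\hat{f}(x)] - f(x)]^2$.
Answer:

\[
\begin{aligned}
E[(f(X) - \hat{f}(X))^2 \mid X = x] &= E[(f(x) - \hat{f}(x))^2] \quad \text{($X$ and the estimate of $f$ are independent)} \\
&= E[(f(x) - E\hat{f}(x) + E\hat{f}(x) - \hat{f}(x))^2] \\
&= E[(f(x) - E\hat{f}(x))^2] + E[(\hat{f}(x) - E\hat{f}(x))^2] + 2E[(f(x) - E\hat{f}(x))(E\hat{f}(x) - \hat{f}(x))] \\
&= [f(x) - E\hat{f}(x)]^2 + E[(\hat{f}(x) - E\hat{f}(x))^2] + 2(f(x) - E\hat{f}(x))\,E[E\hat{f}(x) - \hat{f}(x)] \quad \text{($f(x)$ and $E\hat{f}(x)$ are constants)} \\
&= [f(x) - E\hat{f}(x)]^2 + \mathrm{var}(\hat{f}(x)) \quad \text{(since $E[E\hat{f}(x) - \hat{f}(x)] = 0$)}
\end{aligned}
\]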

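2. Show that $E[(Y - \hat{f}(X))^2 \mid X = x] = \mathrm{var}(\hat{f}(x)) + [E[\hat{f}(x)] - f(x)]^2 + \sigma^2$.

Answer: We have shown in class that

\[
\begin{aligned}
E[(Y - \hat{f}(X))^2 \mid X = x] &= E[(f(x) + \varepsilon - \hat{f}(x))^2] \\
&= E[(f(x) - \hat{f}(x))^2] + E(\varepsilon^2) + 2E[\varepsilon(f(x) - \hat{f}(x))] \\
&= \mathrm{var}(\hat{f}(x)) + [E[\hat{f}(x)] - f(x)]^2 + \sigma^2 + 2E[\varepsilon]\,E[f(x) - \hat{f}(x)] \quad \text{(from 2.1)} \\
&= \mathrm{var}(\hat{f}(x)) + [E[\hat{f}(x)] - f(x)]^2 + \sigma^2
\end{aligned}
\]

The cross term factors because $\varepsilon$ is independent of $\hat{f}(x)$ (and $f(x)$ is a constant), and it vanishes because $E[\varepsilon] = 0$.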
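The decomposition in 2.2 can be illustrated with a small Monte Carlo experiment. The sketch below is built on assumptions chosen for simplicity, none of which come from the homework: all training responses are observed at the same point x with f(x) = 2, the noise is Gaussian with standard deviation 1, and the estimator is a shrunken sample mean, `shrink * mean(y)`, which is deliberately biased. Under these assumptions the variance is 0.8² · 1/10 = 0.064, the squared bias is 0.4² = 0.16, and the expected test error is 0.064 + 0.16 + 1 = 1.224.

```python
import random

random.seed(0)

# Illustrative assumptions, not part of the original problem.
f_x = 2.0        # true regression value f(x) at the point of interest
sigma = 1.0      # noise standard deviation, so sigma^2 = 1
n = 10           # training sample size
shrink = 0.8     # shrinkage factor; values below 1 introduce bias
reps = 20000     # number of simulated training sets

fits = []        # f_hat(x) for each simulated training set
sq_errors = []   # (Y - f_hat(x))^2 for an independent test response

for _ in range(reps):
    train_y = [f_x + random.gauss(0.0, sigma) for _ in range(n)]
    f_hat = shrink * sum(train_y) / n
    y_new = f_x + random.gauss(0.0, sigma)   # test data, independent of training
    fits.append(f_hat)
    sq_errors.append((y_new - f_hat) ** 2)

mean_fit = sum(fits) / reps
variance = sum((v - mean_fit) ** 2 for v in fits) / reps   # var(f_hat(x))
bias_sq = (mean_fit - f_x) ** 2                            # [E f_hat(x) - f(x)]^2
test_mse = sum(sq_errors) / reps                           # E[(Y - f_hat(x))^2]
decomposition = variance + bias_sq + sigma ** 2

print(round(variance, 3), round(bias_sq, 3), round(test_mse, 3), round(decomposition, 3))
```

The simulated test error and the sum of the three components agree only up to Monte Carlo error, so a small gap between the last two printed numbers is expected.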

3. Explain the bias-variance trade-off based on the above equation.
Answer: The expected test error equals the squared bias plus the variance plus the irreducible error $\sigma^2$. Our goal is to minimize this total error to attain an accurate model. However, there is a trade-off between bias and variance: flexible models have low bias and high variance, whereas relatively rigid models have high bias and low variance. The model with the best predictive capability is the one that strikes the best balance between squared bias and variance.
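One way to see the trade-off concretely is with a toy estimator whose bias and variance are known in closed form. The sketch below assumes a shrunken sample mean, λ · mean(y), of n noisy observations of f(x); this estimator, the parameter name `lam`, and the numeric values are illustrative assumptions. Here λ acts as a crude flexibility knob: λ = 1 gives zero bias and the largest variance, while λ = 0 gives zero variance and the largest bias.

```python
# Illustrative assumptions, not part of the original problem.
f_x = 2.0        # true value f(x)
sigma_sq = 1.0   # irreducible error sigma^2
n = 10           # training sample size


def squared_bias(lam):
    """[E f_hat(x) - f(x)]^2 for f_hat = lam * sample mean."""
    return ((lam - 1.0) * f_x) ** 2


def variance(lam):
    """var(f_hat(x)) for f_hat = lam * sample mean of n observations."""
    return lam ** 2 * sigma_sq / n


def expected_test_mse(lam):
    """Variance + squared bias + irreducible error, as in part 2."""
    return variance(lam) + squared_bias(lam) + sigma_sq


grid = [i / 100 for i in range(101)]          # candidate values of lam in [0, 1]
best_lam = min(grid, key=expected_test_mse)   # best balance on this grid

for lam in (0.0, 0.5, best_lam, 1.0):
    print(lam, round(squared_bias(lam), 4), round(variance(lam), 4),
          round(expected_test_mse(lam), 4))
```

In this toy setting the minimizer lies strictly between the two extremes, and every value of the expected test error stays at or above `sigma_sq`.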

4. Explain the difference between training MSE and test MSE. Can the expected test MSE be smaller than $\sigma^2$?
Answer: Training MSE is computed on the training data set and can reach 0 if the model fits the training data very closely. Test MSE is computed by applying the fitted model to test observations that were not used for fitting. A model that performs well with respect to training MSE may not have the same predictive ability on the test data. Our goal is to find the model that minimizes the expected test MSE.
As 2.2 shows, the expected test MSE is the sum of the variance of the predictor, the squared bias, and $\sigma^2$. The first two terms are nonnegative, so the expected test MSE cannot be smaller than $\sigma^2$.
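The gap between training and test MSE is easy to reproduce with a very flexible method. The sketch below uses 1-nearest-neighbor regression on simulated data; the true function f(x) = 2x, the noise level, the sample sizes, and the helper names are all assumptions made for illustration. Each training point is its own nearest neighbor, so the training MSE is exactly 0, while the test MSE stays well above zero.

```python
import random

random.seed(1)

sigma = 1.0  # assumed noise standard deviation, so sigma^2 = 1


def f(x):
    """Assumed true regression function."""
    return 2.0 * x


def sample(n):
    """Draw n pairs (x, y) with y = f(x) + Gaussian noise."""
    xs = [random.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, f(x) + random.gauss(0.0, sigma)) for x in xs]


def predict_1nn(train, x):
    """Predict with the response of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def mse(train, data):
    """Mean squared error of the 1-NN fit from `train`, evaluated on `data`."""
    return sum((y - predict_1nn(train, x)) ** 2 for x, y in data) / len(data)


train = sample(200)
test = sample(2000)

train_mse = mse(train, train)   # evaluated on the data used for fitting
test_mse = mse(train, test)     # evaluated on independent observations

print(train_mse, round(test_mse, 3))
```

A test MSE computed on one finite test set is itself random, so it could in principle dip below σ² by chance; the bound from 2.2 applies to its expectation.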
