Class notes

Advanced machine learning

Course
ADVANCED MACHINE LEARNING

Institution
Massachusetts Institute Of Technology

These are notes for advanced machine learning, focusing on measurement space, features, typical learning problems, key concepts, etc.

Uploaded on July 4, 2022
Number of pages 11
Written in 2021/2022
Type Class notes
Professor(s) Jessedunietz
Contains All classes

Advanced Machine Learning
Lectures

1 - Introduction
Measurement space, features, typical learning problems, key concepts, what you should
know
Supervised vs unsupervised learning, generative vs discriminative modeling

2 - Representations
Expected risk (R): conditional and total expected risk
Empirical risk (R̂): training error, empirical risk minimizer, test error

Distinguish between test error and expected risk
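In symbols, the quantities above can be sketched as follows. The notation is an assumption, since the notes do not fix one: L is a loss function, f a predictor from a hypothesis class F, and the data (x_i, y_i) are drawn i.i.d. from a joint distribution over (X, Y).

```latex
\begin{align*}
R(f \mid x) &= \mathbb{E}_{Y \mid X = x}\big[ L(Y, f(x)) \big]
  && \text{(conditional expected risk)} \\
R(f) &= \mathbb{E}_{X}\big[ R(f \mid X) \big] = \mathbb{E}_{X,Y}\big[ L(Y, f(X)) \big]
  && \text{(total expected risk)} \\
\hat{R}(f) &= \frac{1}{n} \sum_{i=1}^{n} L\big(y_i, f(x_i)\big)
  && \text{(empirical risk on the training set)} \\
\hat{f} &\in \arg\min_{f \in \mathcal{F}} \hat{R}(f)
  && \text{(empirical risk minimizer)}
\end{align*}
```

The test error is the same kind of average as R̂, but computed on a held-out sample that was not used to choose f̂. It is a finite-sample estimate of the expected risk R(f̂), not the expected risk itself.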
Taxonomy of data, object space, measurement

Monadic, dyadic (e.g. pairwise), polyadic
Scales

Nominal (categorical): qualitative, but without quantitative measurements

Ordinal: measurement values are meaningful only with respect to other
measurements, i.e., the rank order of measurements carries the information, not the
numerical differences

Quantitative scale
Interval: the relation of numerical differences carries the information. Invariance
w.r.t. translation and scaling
Ratio: zero value of the scale carries information but not the measurement unit
Absolute: absolute values are meaningful
Mathematical spaces: topological, metric, Euclidean vector, metrizable

Probability spaces: elementary event, sample space, family of sets, algebra of events,
probability of events, probability model (triplet)

Stackexchange: Where a distinction is made between probability function and density,
the pmf applies only to discrete random variables, while the pdf applies to continuous
random variables
ml2016tutorial1: Note: expected value ≠ most likely value
Describing dependencies in data by covariance is equivalent to approximation of data
distribution by a Gaussian model.
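A small numerical illustration of the note that the expected value need not be the most likely value. The skewed three-point distribution below is made up for this example.

```python
import numpy as np

# Hypothetical discrete distribution: most mass at 0, a small chance of 10.
values = np.array([0.0, 1.0, 10.0])
probs = np.array([0.6, 0.3, 0.1])

# Expected value: probability-weighted average of the outcomes.
expected_value = float(np.sum(values * probs))

# Most likely value (mode): the outcome with the largest probability.
most_likely = float(values[np.argmax(probs)])

print("expected value:", expected_value)
print("most likely value:", most_likely)
```

Here the mode is 0, while the expected value is about 1.3, a number the variable never actually takes.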

3 - Density Estimation in Regression: Parametric Models
Modeling assumptions for regression, different approaches, Bayesianism and frequentism

Maximum Likelihood Estimation, ML estimation for normal distributions
Procedure: Find the extremum of the log-likelihood function
Wikipedia: Under the additional assumption that the errors are normally distributed,
ordinary least squares (OLS) is the maximum likelihood estimator.
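A short derivation sketch of why OLS coincides with the maximum likelihood estimator under Gaussian errors. The symbols (x_i, y_i, β, σ²) are notation assumed here, with the model y_i = x_i^T β + ε_i and ε_i i.i.d. N(0, σ²).

```latex
\begin{align*}
\log p(y \mid X, \beta, \sigma^2)
  &= \sum_{i=1}^{n} \log \mathcal{N}\big(y_i \mid x_i^\top \beta, \sigma^2\big) \\
  &= -\frac{n}{2} \log\big(2\pi\sigma^2\big)
     - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \big(y_i - x_i^\top \beta\big)^2 , \\
\hat{\beta}_{\mathrm{ML}}
  &= \arg\max_{\beta} \log p(y \mid X, \beta, \sigma^2)
   = \arg\min_{\beta} \sum_{i=1}^{n} \big(y_i - x_i^\top \beta\big)^2
   = \hat{\beta}_{\mathrm{OLS}} .
\end{align*}
```

The first term does not depend on β, so maximizing the log-likelihood over β is the same as minimizing the residual sum of squares.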

Wikipedia: Gauss-Markov Theorem states that in a linear regression model in which
the errors have expectation zero, are uncorrelated and have equal variances, the best
linear unbiased estimator (BLUE) of the coefficients is given by the ordinary least
squares (OLS) estimator, provided it exists. The errors do not need to be normal, nor do
they need to be independent and identically distributed (only uncorrelated with mean
zero and homoscedastic with finite variance).
ml2016tutorial3: Note that if we don't know the true value of μ, we can use its
estimate μ̂ to calculate σ̂; however, in this case the variance estimate is biased,
i.e. E[σ̂²] ≠ σ² (the true variance).
The James-Stein estimator is a biased estimator of the mean of Gaussian random
vectors. It can be shown that, in three or more dimensions, the James-Stein estimator
dominates the "ordinary" least squares approach, i.e., it has lower mean squared error.
It is the best-known example of Stein's phenomenon.
Maximum likelihood estimation of variance is biased, but it is nevertheless consistent.
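A simulation sketch of the bias of the ML variance estimate discussed above. The sample size, true variance, and number of trials are arbitrary choices for illustration. The ML estimate divides by n and uses the estimated mean, so its expectation is (n - 1)/n times the true variance. Dividing by n - 1 removes the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0
n = 5              # small sample size makes the bias visible
trials = 200_000   # number of repeated experiments

# Each row is one experiment with n Gaussian observations.
samples = rng.normal(loc=1.0, scale=np.sqrt(true_var), size=(trials, n))

# ML estimate: divide by n (ddof=0), using the sample mean of each row.
var_mle = samples.var(axis=1, ddof=0)
# Bessel-corrected estimate: divide by n - 1 (ddof=1).
var_unbiased = samples.var(axis=1, ddof=1)

mean_mle = var_mle.mean()
mean_unbiased = var_unbiased.mean()

print("average ML estimate:       ", mean_mle)
print("theory, (n - 1) / n * var: ", (n - 1) / n * true_var)
print("average corrected estimate:", mean_unbiased)
```

Because the factor (n - 1)/n tends to 1 as n grows, the bias vanishes in the limit, which is consistent with the estimator being biased but consistent.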
Rao-Cramér inequality (Cramér-Rao bound), Fisher information, score, etc.

Wikipedia: In its simplest form, the bound states that the variance of any unbiased
estimator is at least as high as the inverse of the Fisher information.
Wikipedia: An unbiased estimator which achieves this lower bound is said to be (fully)
efficient. Such a solution achieves the lowest possible mean squared error among all
unbiased methods, and is therefore the minimum variance unbiased (MVU) estimator.
Wikipedia: The Cramér–Rao bound can also be used to bound the variance of biased
estimators of given bias. In some cases, a biased approach can result in both a variance
and a mean squared error that are below the unbiased Cramér–Rao lower bound.
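A numerical check of the bound in a case where it is attained, as a sketch under stated assumptions: n i.i.d. observations from a Gaussian with unknown mean and known standard deviation sigma. For this model the Fisher information about the mean is n / sigma², and the sample mean is an unbiased estimator whose variance equals the bound sigma² / n. The specific numbers are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0        # known standard deviation (assumed for this example)
n = 10             # observations per experiment
trials = 200_000   # number of repeated experiments

# Fisher information about the mean from n i.i.d. N(mu, sigma^2) samples.
fisher_information = n / sigma**2
cramer_rao_bound = 1.0 / fisher_information

# Empirical variance of the sample mean across many experiments.
samples = rng.normal(loc=0.0, scale=sigma, size=(trials, n))
sample_means = samples.mean(axis=1)
empirical_variance = sample_means.var()

print("Cramér-Rao bound:       ", cramer_rao_bound)
print("variance of sample mean:", empirical_variance)
```

The two numbers should agree up to simulation noise, which is what it means for the sample mean to be an efficient estimator in this model.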
Importance of the Maximum Likelihood Method, realizable model
Summary of MLEs

Consistency, equivariance, asymptotic efficiency, asymptotic normality
Bayesian Learning, on normal distribution, recursive Bayesian estimation
Exercise 2: Having determined the functional form of the prior and likelihood, we want
to compute the posterior. Doing it analytically can be hard in general, but it is easy if
the prior and likelihood form a conjugate pair. Then the posterior will have the same
functional form as the prior, only the parameters differ.

Wikipedia: In Bayesian probability theory, if the posterior distributions are in the same
probability distribution family as the prior probability distribution, the prior and
posterior are then called conjugate distributions, and the prior is called a conjugate
prior for the likelihood function

ml2016tutorial3: Conjugate priors:

the gamma distribution is conjugate to the exponential likelihood (prior on the rate parameter)
the normal distribution is conjugate to the normal likelihood (prior on the mean, with known variance)
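A minimal sketch of the normal-normal conjugate pair, assuming a Gaussian likelihood with known noise variance and a Gaussian prior on the unknown mean. The function name `update_normal_mean` and all numbers are made up for illustration. Processing the observations one at a time (recursive Bayesian estimation) gives the same posterior as a single batch update.

```python
import numpy as np

def update_normal_mean(prior_mean, prior_var, x, noise_var):
    """Posterior over the mean after one observation x ~ N(mu, noise_var)."""
    # Precisions (inverse variances) add up.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = post_var * (prior_mean / prior_var + x / noise_var)
    return post_mean, post_var

rng = np.random.default_rng(2)
noise_var = 1.0
prior_mean, prior_var = 0.0, 10.0
data = rng.normal(loc=3.0, scale=np.sqrt(noise_var), size=50)

# Recursive estimation: yesterday's posterior is today's prior.
mean, var = prior_mean, prior_var
for x in data:
    mean, var = update_normal_mean(mean, var, x, noise_var)

# Batch update using all observations at once.
n = len(data)
batch_var = 1.0 / (1.0 / prior_var + n / noise_var)
batch_mean = batch_var * (prior_mean / prior_var + data.sum() / noise_var)

print("recursive posterior:", mean, var)
print("batch posterior:    ", batch_mean, batch_var)
```

The posterior stays Gaussian at every step and only its two parameters change, which is the practical benefit of conjugacy.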
ML-Bayes estimation differences
The maximum likelihood method yields only point estimates of the parameters, μ̂ and σ̂,
not a distribution over them!
ml2016tutorial3: simple linear regression corresponds to MLE, regularized linear
regression corresponds to MAP.
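A sketch of the MLE/MAP correspondence for linear regression, under assumptions not stated in the notes: Gaussian noise with variance `noise_var` and an isotropic zero-mean Gaussian prior on the weights with variance `prior_var`. In that setting the MAP estimate coincides with ridge regression using the regularization strength `noise_var / prior_var`. The data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 40, 3
X = rng.normal(size=(n, d))
true_w = np.array([1.5, -2.0, 0.5])   # hypothetical ground-truth weights
noise_var = 0.25                      # variance of the Gaussian noise
prior_var = 1.0                       # variance of the Gaussian prior on w
y = X @ true_w + rng.normal(scale=np.sqrt(noise_var), size=n)

# MLE: ordinary least squares via the normal equations.
w_mle = np.linalg.solve(X.T @ X, X.T @ y)

# MAP: mean (and mode) of the Gaussian posterior over the weights.
post_cov = np.linalg.inv(X.T @ X / noise_var + np.eye(d) / prior_var)
w_map = post_cov @ (X.T @ y) / noise_var

# Ridge regression with the matching regularization strength.
lam = noise_var / prior_var
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

print("MLE:  ", w_mle)
print("MAP:  ", w_map)
print("ridge:", w_ridge)
```

The MAP and ridge vectors agree up to floating-point error, and both are shrunk toward zero relative to the MLE.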
Schematic behaviour of bias and variance

4 - Regression
Linear regression models, least squares, residual sum of squares (RSS)
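A minimal least-squares sketch on synthetic data, assuming a straight-line model with an intercept. The data-generating coefficients (2 and 3) are arbitrary. The fit minimizes the residual sum of squares, so the residuals come out orthogonal to the columns of the design matrix.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
x = rng.uniform(-1.0, 1.0, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=n)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(n), x])

# Least-squares solution: minimizes ||y - X beta||^2.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residual sum of squares at the fitted coefficients.
residuals = y - X @ beta
rss = float(residuals @ residuals)

print("fitted coefficients:", beta)
print("RSS:", rss)
```

Any other coefficient vector, including the one used to generate the data, yields an RSS at least as large on this sample.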
