Machine Learning
Table of Contents
Week 1
  Lecture 1 Introduction
  Lecture 2 Linear Models and Search – 1
Week 2
  Lecture 3 Methodology – 1
  Lecture 4 Methodology – 2
Week 3
  Lecture 5 Probability – 1
  Lecture 6 Linear Models – 2
Week 4
  Lecture 7 Deep learning
  Lecture 8 Probability – 2
Week 5
  Lecture 9 Deep Generative Models
  Lecture 10 Probability – 1
Week 6
  Lecture 11 Probability – 1
  Lecture 12 Probability – 1
Week 7
  Lecture 13 Reinforcement Learning
Quizzes
  Test quiz
  Quiz 1
  Quiz 2
  Quiz 3




Exam: tests recall, applied knowledge and active knowledge (i.e. calculating things).
It is an open-book exam.

Week 1
Lecture 1 Introduction
What is machine learning?

Machine learning is often used in other software, in analytics, data mining, data science and statistics.

Machine learning: provides systems the ability to automatically learn and improve from experience without
being explicitly programmed.

Reinforcement learning: taking actions in a world based on delayed feedback.
Online learning: predicting and learning at the same time.

Offline learning: separate learning, predicting and acting.
- Take a fixed dataset of examples (aka instances).
- Train a model to learn from these examples.
- Test the model to see if it works, by checking its predictions.
- If the model works, put it into production, i.e. use its predictions to take action (a sketch follows below).
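A minimal sketch of this recipe in Python (assuming scikit-learn; the dataset and model class here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical fixed dataset: 100 instances with 2 features each, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Separate learning and prediction: first 80 instances train, last 20 test.
X_train, y_train = X[:80], y[:80]
X_test, y_test = X[80:], y[80:]

model = LogisticRegression()   # the chosen model class
model.fit(X_train, y_train)    # train on the fixed dataset of examples

# Test the model by checking its predictions on instances it has not seen.
print(accuracy_score(y_test, model.predict(X_test)))
```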

In machine learning we have a problem (chess, driving a car) → break it into an abstract task (classification,
regression, clustering, etc.) → build an algorithm to solve the abstract task (linear model, kNN, etc.).

Supervised tasks: explicit examples of input and output. Learn to predict the output for an unseen input. E.g.
classification and regression. Use linear models, tree models and NN models.
Unsupervised tasks: only inputs provided. Find any pattern that explains something about the data. E.g.
clustering, density estimation, generative modelling.

Two supervised tasks:
Classification: assign a class to each example.
Regression: assign a number to each example.

ML is not AI, data science, data mining, information retrieval, statistics or deep learning:
Data science but not ML: gathering data, harmonising data and interpreting data.
More data mining than ML: finding common clickstreams in web logs; finding fraud in transaction networks.
More ML than data mining: spam classification, predicting stock prices, learning to control a robot.
Statistics but not ML: analysing research results, experiment design, courtroom evidence.
More ML than statistics: spam classification, movie recommendation.

Classification
Two spaces of machine learning: the feature space (2D in the lecture's illustration) and the model space (3D in the illustration).

Loss function: loss_data(model) = performance of the model on the data (the lower, the better). It maps a choice of
model to a loss for the current data. For classification: e.g. the number of misclassified examples.
Loss function for regression, aka mean-squared-error loss: $\mathrm{loss}(f) = \sum_i \left(f(x_i) - y_i\right)^2$
The lower the loss, the better.
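Both losses are easy to spell out in code; a small NumPy sketch (the function names are mine, not from the lectures):

```python
import numpy as np

def misclassification_loss(predictions, labels):
    # Classification loss: the number of misclassified examples.
    return int(np.sum(predictions != labels))

def squared_error_loss(f, X, y):
    # Regression loss: sum over i of (f(x_i) - y_i)^2.
    return float(np.sum((f(X) - y) ** 2))

# Example: the linear model f(x) = 2x + 1 on three instances.
f = lambda X: 2 * X + 1
print(squared_error_loss(f, np.array([0.0, 1.0, 2.0]), np.array([1.0, 3.2, 4.9])))
```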

A few variations of classification (features are usually numerical or categorical):
Binary classification: two classes. vs. Multiclass classification: more than two classes.
Multilabel classification: none, some or all classes may be true.
Class probabilities/scores: the classifier reports a probability or score for each class (see the example below).
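For example, scikit-learn classifiers can report class probabilities next to hard labels (an illustrative sketch with made-up data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])  # one feature
y = np.array([0, 0, 1, 1])                  # binary labels

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.5]]))        # a hard class label
print(clf.predict_proba([[1.5]]))  # a probability for each class
```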

Offline machine learning recipe:
- Abstract (part of) your problem to a standard task (e.g. classification, etc.).
- Choose your instances & their features (for supervised learning, also choose a target).
- Choose your model class (linear model, decision tree, kNN).
- Search for a good model.


Regression: features of instance i = $x_i$; true label for $x_i$ = $y_i$; model output = $f(x_i)$: $\mathrm{loss}(f) = \sum_i \left(f(x_i) - y_i\right)^2$

Unsupervised learning: clustering (e.g. k-means), generative modelling and density estimation.
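As an example of the clustering task, k-means finds groups in unlabelled data (a sketch assuming scikit-learn; the two blobs are made up):

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabelled data: two rough blobs in 2D.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, size=(50, 2)),
               rng.normal(5, 1, size=(50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10).fit(X)
print(kmeans.labels_[:10])      # cluster assignment per instance
print(kmeans.cluster_centers_)  # the two learned cluster centres
```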

Semi-supervised learning: e.g. self-training: you have a small set of labelled data (XL) and a large set of
unlabelled data (XU). Train classifier C on XL, then loop: label XU with C and retrain C on XL plus the newly labelled XU (sketched below).
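A minimal self-training sketch, assuming a scikit-learn-style classifier; the fixed number of rounds is an illustrative stopping rule, not the course's:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_training(X_labelled, y_labelled, X_unlabelled, rounds=5):
    # Train classifier C on the small labelled set XL.
    clf = LogisticRegression().fit(X_labelled, y_labelled)
    for _ in range(rounds):
        # Label XU with C ...
        pseudo_labels = clf.predict(X_unlabelled)
        # ... and retrain C on XL plus the newly labelled XU.
        X_all = np.vstack([X_labelled, X_unlabelled])
        y_all = np.concatenate([y_labelled, pseudo_labels])
        clf = LogisticRegression().fit(X_all, y_all)
    return clf
```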

Self-supervised learning: a large set of unannotated data is used without much manual annotation, e.g. for a
natural language program; the model is thus built on the structure of the data itself. Often deep-learning models.




The question is whether to include sensitive attributes in the data or not. To study bias we need these
attributes to be annotated. If we remove them, they may still be inferred from other features (postcode,
shopping habits, profile picture). Directly using a sensitive attribute may be preferable to using it indirectly. There are valid
use cases (race and sex affect how medicines work; such use often requires a causal link).

What is input and what is target is not always clearly separated (embeddings, clustering, semi-supervised
learning, link prediction). Showing that sensitive attributes can be inferred may serve as a warning to those
who are vulnerable.

Use sensitive attributes with extreme care: consider communicating with users rather than just predicting; check the
distribution. Do not imply causality or overstate what your predictions mean.

Machine learning is shallow: classification is a simplistic abstraction (male/female, race vs. ethnicity,
gay/straight, sex vs. gender). Models pick up on surface features first (even if deeper features are available).
Interpretability and responsibility are hard (we don't know what models look at or how to make them look
elsewhere). 95% accuracy is not as impressive as it sounds (= 1 mistake in 20 attempts).

Never judge your model's performance on the training data.
Solution: split your data into training and test sets (see the sketch below). Choose your model based on the training data. The aim is not to
minimise the loss on the training data, but to minimise the loss on your test data. You don't get to see the
test data until you've chosen your model.
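A sketch of this split in scikit-learn (the dataset and model choice are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X.sum(axis=1) > 0).astype(int)

# Withhold the test data until the model has been chosen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# Only now look at the test data.
print(accuracy_score(y_test, model.predict(X_test)))
```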

Machine learning is an empirical science.

Deductive reasoning: all men are mortal; Socrates is a man; therefore Socrates is mortal (discrete,
unambiguous, provable, known rules).
Inductive reasoning: the sun has risen in the east every day of my life, so it will do so again tomorrow (fuzzy,
ambiguous, experimental, unknown rules).

Simplicity, Occam's razor: all else being equal, prefer the simpler solution.





Lecture 2 Linear Models and Search – 1
Linear Regression

Notation used:
- lowercase non-bold for scalars: x, y, z (i.e. single numbers)
- lowercase bold for vectors: $\mathbf{x}, \mathbf{y}, \mathbf{z}$ (columns of numbers)
- uppercase bold for matrices: $\mathbf{X}, \mathbf{Y}, \mathbf{Z}$ (grids of numbers)
- $x_i$: scalar element of vector $\mathbf{x}$; $X_{ij}$: scalar element of $\mathbf{X}$
- $\mathbf{X}_i$: instance i in the data; $x_j$: feature j (of some instance)

Features can be represented in a vector, with each element being a feature.
Model for one feature: $f_{w,b}(x) = wx + b$
w is the weight (= coefficient) and b the bias (= intercept):
b determines where the line crosses the vertical axis (when x = 0);
w determines how much the line rises if we move one step to the right.
Model for two features: $f_{w_1,w_2,b}(\mathbf{x}) = w_1 x_1 + w_2 x_2 + b$ (it spans a plane)
Model for n features: $f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b$, with $\mathbf{w} = (w_1, \dots, w_n)^T$ and $\mathbf{x} = (x_1, \dots, x_n)^T$

Example: try to predict blood pressure from job stress, healthy diet and age → the three features.




$f_{\mathbf{w},b}(\mathbf{x}) = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b = \mathbf{w}^T \mathbf{x} + b$
where $\mathbf{w}^T \mathbf{x}$ is a dot product: $\mathbf{w}^T \mathbf{x} = \mathbf{w} \cdot \mathbf{x} = \sum_i w_i x_i = \|\mathbf{w}\| \|\mathbf{x}\| \cos \alpha$
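In NumPy the n-feature model is a single dot product (the numbers here are made up):

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])  # weights, one per feature
b = 0.1                         # bias
x = np.array([1.0, 2.0, 3.0])   # one instance with n = 3 features

# f_{w,b}(x) = w^T x + b
y_hat = np.dot(w, x) + b
print(y_hat)  # 0.5*1 - 1.0*2 + 2.0*3 + 0.1 = 4.6
```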

Which model fits best?
Use two more ingredients: loss function and search method.

Search for a well-fitting model: try to reduce the mean squared error (also called sum-of-squares loss).
Slight variations on the mean squared error exist (e.g. with or without the 1/n factor):
Mean squared error loss:
$\mathrm{loss}_{X,T}(p) = \frac{1}{n} \sum_i \left(f_p(\mathbf{x}_i) - t_i\right)^2$
$\mathrm{loss}_{X,T}(\mathbf{w}, b) = \frac{1}{n} \sum_i \left(\mathbf{w}^T \mathbf{x}_i + b - t_i\right)^2$

The loss function maps every point in the model space to a loss value. Here, the instance space is just the x
axis.
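A direct NumPy translation of this loss (illustrative values):

```python
import numpy as np

def mse_loss(w, b, X, t):
    # Mean squared error of the linear model w^T x + b over the data (X, t).
    predictions = X @ w + b
    return np.mean((predictions - t) ** 2)

X = np.array([[1.0, 2.0], [2.0, 0.0], [0.0, 1.0]])  # 3 instances, 2 features
t = np.array([1.0, 2.0, 0.0])                       # regression targets
print(mse_loss(np.array([0.5, 0.5]), 0.0, X, t))    # 0.5
```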

2. Searching for a good model

Model space and feature space: the two most important spaces in machine learning.
Feature space: every example in your data is a point in this space.
Model space: the space where every model is a point. A line in feature space
($wx + b$) corresponds to a point in model space (with w on one axis and b on the other).

Loss surface (or loss landscape): plot the loss for every point in the model
space (sketched below).
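One way to compute and draw such a surface for the one-feature model $wx + b$ (a sketch assuming NumPy and Matplotlib; the data points are made up):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([0.0, 1.0, 2.0, 3.0])  # instances (the x axis)
t = np.array([0.2, 1.1, 1.9, 3.2])  # targets

# Evaluate the MSE loss at every point (w, b) of the model space.
ws = np.linspace(-1, 3, 100)
bs = np.linspace(-2, 2, 100)
W, B = np.meshgrid(ws, bs)
loss = ((W[..., None] * x + B[..., None] - t) ** 2).mean(axis=-1)

plt.contourf(W, B, loss, levels=30)
plt.xlabel("w"); plt.ylabel("b"); plt.title("loss surface")
plt.show()
```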

