Summary: Exam preparation - Machine Learning for the Quantified Self (XM_40012)


Institution: Vrije Universiteit Amsterdam (VU)

All the important terms, formulas, examples, and facts you need to know for the exam for Machine Learning for the Quantified Self.


Uploaded on 5 October 2023
Number of pages: 14
Written in 2022/2023
Type: Summary


Machine Learning for the Quantified Self
Terminology
A measurement is one value for an attribute recorded at a specific time point. E.g., heart
rate, velocity, etc.
A time series is a series of measurements in temporal order.
Supervised learning is the machine learning task of inferring a function from a set of labeled
training data.
In unsupervised learning, there is no target measure (or label), and the goal is to describe
the associations and patterns among the attributes.
Reinforcement learning tries to find optimal actions in a given situation so as to maximize a
numerical reward that does not immediately come with the action but later in time.

An example of an instance is $x_1 = [0,45,\ \text{low},\ 0]$. A target for the instance is $g_1 = [\text{inactive}]$.
Outlier detection
An outlier is an observation point that is distant from other observations. There can be two
causes of an outlier:
- Measurement error (Arnold with a heart rate of 400)
- Variability (Arnold trying to push his limits with a heart rate of 190)

Outliers can be detected and removed using two types of outlier detection:
- Distribution based (we assume a certain distribution of the data)
- Distance based (we only look at the distance between data points)
Distribution-based outlier detection
Chauvenet’s criterion assumes a normal distribution of a single attribute. The mean and
variance of the dataset are used as parameters of the normal distribution. A measurement is
rejected if the probability of observing it is less than $\frac{1}{c \cdot N}$, where $c$ is a
parameter indicating the certainty of the outlier, and $N$ is the size of the dataset.
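The rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `chauvenet_outliers`, the default `c=2.0`, and the choice to read "the probability of observing it" as the two-sided tail probability of a deviation at least this large are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def chauvenet_outliers(values, c=2.0):
    """Return one boolean per value, True where the value is rejected.

    Assumption: the probability of a measurement is taken as the two-sided
    tail probability under the fitted normal distribution.
    """
    n = len(values)
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * n
    threshold = 1.0 / (c * n)
    flags = []
    for v in values:
        z = abs(v - mu) / sigma
        # P(|Z| >= z) for a standard normal variable Z
        prob = math.erfc(z / math.sqrt(2))
        flags.append(prob < threshold)
    return flags


heart_rates = [62, 64, 61, 65, 63, 60, 66, 62, 64, 400]
flags = chauvenet_outliers(heart_rates, c=2.0)
```

With these ten hypothetical heart-rate values, only the measurement of 400 falls below the threshold $\frac{1}{2 \cdot 10} = 0.05$.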
Mixture models assume that the data can be described by $K$ normal distributions
$\{N(\mu_1, \sigma_1), \dots, N(\mu_K, \sigma_K)\}$. All $2K$ parameters can be estimated by
maximizing the likelihood of observing the data. Points with the lowest probabilities are
candidates for removal.
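The last step, ranking points by their probability under the mixture, might look as follows. This sketch assumes the parameters have already been estimated and skips the fitting step entirely. It also gives each component a mixing weight, which the text above does not mention. The helper names and the numbers are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at x, with sigma the standard deviation."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def mixture_density(x, components):
    """Density of x under a mixture given as (weight, mu, sigma) triples.

    Assumption: the weights are non-negative and sum to 1.
    """
    return sum(w * normal_pdf(x, mu, sigma) for w, mu, sigma in components)


def lowest_density_points(values, components, n_candidates=1):
    """Return the n_candidates values with the lowest mixture density."""
    ranked = sorted(values, key=lambda v: mixture_density(v, components))
    return ranked[:n_candidates]


# Hypothetical, already-fitted mixture: resting and exercising heart rate.
components = [(0.5, 60.0, 5.0), (0.5, 150.0, 10.0)]
values = [58, 61, 63, 148, 152, 155, 105]
candidates = lowest_density_points(values, components)
```

Here the value 105 lies far from both component means, so it gets the lowest density and becomes the candidate for removal.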
Distance-based outlier detection
The simple distance-based approach calls two points close if they are within distance $d_{min}$
of each other. A point is an outlier when more than a fraction $f_{min}$ of the points lie
farther than $d_{min}$ from it.
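A minimal sketch of this approach, assuming Euclidean distance and taking the fraction over all points in the dataset (the point itself is never counted as far, since its distance to itself is 0). The function name and the example values are hypothetical.

```python
import math


def distance_based_outliers(points, d_min, f_min):
    """Flag points for which more than a fraction f_min of all points
    lie farther away than d_min (Euclidean distance)."""
    n = len(points)
    flags = []
    for p in points:
        far = sum(1 for q in points if math.dist(p, q) > d_min)
        flags.append(far / n > f_min)
    return flags


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
flags = distance_based_outliers(points, d_min=2.0, f_min=0.5)
```

In this example the four points on the unit square each have only one distant point (a fraction of 0.2), while (10, 10) is far from four of the five points (0.8) and is flagged.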
The local outlier factor also takes the density of the surrounding points into account, to
prevent all points in a less dense cluster from being flagged as outliers. The first step is to
define the $k$-distance $k_{dist}(x_i)$ of a point $x_i$. This is the largest distance among the
distances to the $k$ closest points. In other words, there should be at most $k-1$ points with a
distance less than $k_{dist}(x_i)$ and at least one point which is exactly $k_{dist}(x_i)$ away.
These two groups of points together form the neighborhood set $k_{dist\_nh}(x_i)$.
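The $k$-distance and its neighborhood set can be sketched as below, assuming Euclidean distance and points given as tuples. The function names are hypothetical. Because ties at exactly the $k$-distance are included, the neighborhood can contain more than $k$ points.

```python
import math


def k_distance(points, i, k):
    """Distance from points[i] to its k-th closest other point."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def k_neighborhood(points, i, k):
    """Indices of all other points within the k-distance of points[i]."""
    kd = k_distance(points, i, k)
    return [j for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) <= kd]


line = [(0,), (1,), (2,), (4,), (10,)]
kd = k_distance(line, 0, 2)      # distances from 0 are 1, 2, 4, 10
nh = k_neighborhood(line, 0, 2)  # the points at 1 and 2

tied = [(0,), (1,), (-1,), (5,)]
nh_tied = k_neighborhood(tied, 0, 1)  # two points tie at distance 1
```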
The reachability of a point $x_i$ to another point $x$ is:

$$k_{reach\_dist}(x_i, x) = \max\bigl(k_{dist}(x),\ d(x, x_i)\bigr)$$

This expresses that the reachability distance is the real distance if the point $x_i$ is not
among the $k$ nearest points of $x$ (in that case the value for $d(x, x_i)$ will be larger than
$k_{dist}(x)$); otherwise it is the $k_{dist}$ of that point. In effect, the distance value of
all points within $k_{dist}(x)$ is set equal to $k_{dist}(x)$.
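Next, the local reachability density around our point $x_i$ is:

$$k_{lrd}(x_i) = \left( \frac{\sum_{x \in k_{dist\_nh}(x_i)} k_{reach\_dist}(x_i, x)}{|k_{dist\_nh}(x_i)|} \right)^{-1}$$

Here $k_{reach\_dist}(x_i, x)$ is the reachability of $x_i$ to $x$, and $k_{dist\_nh}(x_i)$ is
the neighborhood set of $x_i$. Intuitively, this says something about how close $x_i$ is to its
neighbors. If a point is part of $x_i$’s nearest $k$ neighbors, but this relationship does not
hold the other way, $x_i$ might be an outlier. The lower the average distance to the neighbors,
the higher the local reachability density becomes.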
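The two quantities described above can be sketched together. The formulas are assumptions based on the standard definitions and the surrounding text: the reachability is taken as the maximum of the neighbor's $k$-distance and the real distance, and the local reachability density as the inverse of the average reachability over the neighborhood set. The function names are hypothetical, and duplicate points (which would give an average of 0) are not handled.

```python
import math


def k_distance(points, i, k):
    """Distance from points[i] to its k-th closest other point."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def reach_dist(points, i, j, k):
    """Reachability of points[i] to points[j]: the real distance, but never
    smaller than the k-distance of points[j]."""
    return max(k_distance(points, j, k), math.dist(points[i], points[j]))


def local_reachability_density(points, i, k):
    """Inverse of the average reachability of points[i] to its neighbors."""
    kd = k_distance(points, i, k)
    neighbors = [j for j in range(len(points))
                 if j != i and math.dist(points[i], points[j]) <= kd]
    avg = sum(reach_dist(points, i, j, k) for j in neighbors) / len(neighbors)
    return 1.0 / avg


points = [(0,), (1,), (2,), (3,), (10,)]
lrd_inlier = local_reachability_density(points, 1, k=2)   # the point at 1
lrd_isolated = local_reachability_density(points, 4, k=2)  # the point at 10
```

For the isolated point at 10, the average reachability to its two neighbors is 7.5, against 1.5 for the point at 1, so its local reachability density is much lower.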


Sold on Stuvia by seller sandervanwerkhooven.