Summary

Pattern Recognition Summary & Course Notes

Course
Pattern Recognition

Institution
Universiteit Utrecht (UU)

This document contains notes and summaries covering the content of the course Pattern Recognition within the Artificial Intelligence Master at Utrecht University. It covers the following topics: - LMs for regression - LMs for classification 1&2 - intro to DL and NN - CNNs - RNNs

Uploaded on 22 November 2022
Number of pages: 16
Written in 2022/2023
Type: Summary

Seller: massimilianogarzoni (member for 7 years, 17 documents sold)

Pattern Recognition Class Notes

PR CLASS 1 - LINEAR MODELS FOR REGRESSION

– what is statistical pattern recognition --> the field of PR/ML concerned
with the automatic discovery of regularities in data through the use of
computer algorithms, and with the use of these regularities to take
actions such as classifying data into categories
– ML approach to do this --> use a training set of N labeled examples
and fit a model to this data; the model can subsequently be used to
predict the class of new input vectors; the ability to categorise new
examples correctly is called generalization
– supervised: numeric target (regression), discrete unordered target
(classification), discrete ordered target (ordinal classification, ranking)
– unsupervised learning: clustering, density estimation
– always in data analysis --> DATA = STRUCTURE + NOISE --> we want
to capture the structure but not the noise
– coefficients (weights) of a linear or nonlinear function are learned or
estimated from the data
– maximum likelihood estimation --> e.g. the probability of observing x
events (successes) in y trials: given N independent observations from
the same probability distribution, choose the parameter value under
which the observed data are most likely, i.e. the value that
maximizes the probability of the observations actually sampled
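
As an illustration of the maximum likelihood idea above, here is a minimal sketch for the binomial case (x successes in y trials). The function name `binomial_log_likelihood` and the numbers are made up for this example and do not come from the course slides. The sketch evaluates the log-likelihood on a grid of candidate probabilities and compares the maximizer with the closed-form estimate x / y.

```python
import numpy as np

def binomial_log_likelihood(p, successes, trials):
    """Log-likelihood of success probability p (constant binomial coefficient dropped)."""
    return successes * np.log(p) + (trials - successes) * np.log(1.0 - p)

# Hypothetical data: 7 successes in 20 trials.
successes, trials = 7, 20

# Candidate values for p on an open interval, to avoid log(0).
grid = np.linspace(0.001, 0.999, 999)
log_lik = binomial_log_likelihood(grid, successes, trials)

p_grid = grid[np.argmax(log_lik)]   # maximizer found by grid search
p_closed = successes / trials       # closed-form maximum likelihood estimate

print(p_grid, p_closed)
```

The grid search is only there to make the "most likely parameter value" idea visible; in practice the closed form is used directly.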
– score = quality of fit - complexity (a penalty for model complexity)
– decision theory --> when making a decision in a situation involving
uncertainty, there are two steps:
– 1) inference: learn p(x,t) from data (main subject of this course)
– 2) decision: given the estimate of p(x,t), determine the optimal
decision (not the focus of this course)
– suppose we know the joint probability distribution p(x,t):
– the task is --> given value for x, predict class labels t
– Lkj is the loss of predicting class j when the true class is k, K is the
number of classes
– then, to minimize the expected loss, predict the class j that
minimizes the function at page 37 of lecture 1 slides
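
The slide referred to above is not reproduced in these notes. The sketch below assumes the standard expected-loss rule: for a given x, the expected loss of predicting class j is the sum over k of L[k, j] * p(t = k | x), and the prediction is the j with the smallest value. The loss matrix and the posterior probabilities are made-up numbers for illustration.

```python
import numpy as np

# Hypothetical loss matrix L[k, j]: loss of predicting class j when the true class is k.
# Predicting 0 when the truth is 1 is ten times as costly as the opposite mistake.
L = np.array([[0.0, 1.0],
              [10.0, 0.0]])

# Hypothetical posterior p(t = k | x) for one input x.
posterior = np.array([0.8, 0.2])

# Expected loss of predicting j: sum over k of L[k, j] * p(t = k | x).
expected_loss = posterior @ L

best_class = int(np.argmin(expected_loss))
print(expected_loss, best_class)
```

With these numbers the rule predicts class 1 even though class 0 is the more probable one, because the loss matrix makes missing class 1 much more expensive.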
– useful properties of expectation and variance:
– E[c] = c for constant c.
– E[cx] = cE[x].
– E[x ± y] = E[x] ± E[y].
– var[c] = 0 for constant c.
– var[cx] = c^2 var[x].
– var[x ± y] = var[x] + var[y] if x and y are independent.
– to minimize the expected squared error we should predict the mean of
t at the point x0
– generally y(x) = E_t[t | x] --> population regression function
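
Why the mean is the best prediction under squared error can be seen from a short decomposition. For any constant prediction c at a fixed input x, and using the expectation properties listed earlier:

```latex
\begin{aligned}
E\big[(t - c)^2 \mid x\big]
  &= E\big[(t - E[t \mid x] + E[t \mid x] - c)^2 \mid x\big] \\
  &= \operatorname{var}[t \mid x] + \big(E[t \mid x] - c\big)^2
\end{aligned}
```

The cross term vanishes because E[t - E[t | x] | x] = 0. The first term does not depend on c, so the expected squared error is smallest when c = E[t | x].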
– simple approach to regression:
– predict the mean of the target values of all training observations
with x = x0
– problem: because x is numeric, there will rarely be any training
observations with exactly x = x0, so this does not work in practice
– possible solution --> look at data points that are close to where
we want to make the prediction, then take the mean of the targets of
those training points
– this is the idea of KNN for regression, e.g. with input variable x and
output variable t:
– 1) for each x define a neighborhood Nk(x) containing the
indices n of the k points (xn, tn) whose xn are closest to x
– 2) predict the mean of the tn with n in Nk(x)
– with small k (e.g. 2) the fitted line is very wiggly, with larger
k (e.g. 20) it is much smoother because we are averaging
over more nearest neighbors
– k is a complexity hyperparameter for KNN
– for a regression problem, the best predictor of the output variable is
its mean at the given input x, but this cannot be applied directly
with a finite data sample, so:
– the NN function approximates the mean by averaging over training data
– the NN function relaxes conditioning on a specific input to
conditioning on a neighborhood of that input
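
The KNN regression idea above can be sketched in a few lines. This is a minimal illustration for a one-dimensional input, not the course's implementation; the function name `knn_regress` and the toy sine data are assumptions made for the example.

```python
import numpy as np

def knn_regress(x_train, t_train, x_query, k):
    """Predict t at each query point as the mean target of the k nearest training points."""
    x_train = np.asarray(x_train, dtype=float)
    t_train = np.asarray(t_train, dtype=float)
    preds = []
    for x0 in np.atleast_1d(x_query):
        dist = np.abs(x_train - x0)          # distances in a one-dimensional input space
        neighbors = np.argsort(dist)[:k]     # indices in the neighborhood N_k(x0)
        preds.append(t_train[neighbors].mean())
    return np.array(preds)

# Hypothetical toy data: noisy samples of a sine curve.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 1.0, size=50))
t_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.3, size=50)

x_grid = np.linspace(0.0, 1.0, 200)
wiggly = knn_regress(x_train, t_train, x_grid, k=2)    # small k: follows the noise
smooth = knn_regress(x_train, t_train, x_grid, k=20)   # large k: averages over more points
```

Setting k equal to the number of training points gives the overall mean of t for every query, which is the other extreme of the complexity range controlled by k.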

PR CLASS 2 - LINEAR MODELS FOR CLASSIFICATION 1

– the assumption in the linear regression model is that the expected value
of t is a linear function of x
– the slope and intercept have to be estimated from data
– the observed values of t are equal to the linear function of x plus some
random error
– given a set of N training points, we have N pairs of input variable x
and target variable t
– find the values for w0 and w1 that minimize the sum of squared errors
– the average value of a variable is denoted with a bar above the variable
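
For a single input variable, minimizing the sum of squared errors has a well-known closed form: w1 is the sum of (xn - x̄)(tn - t̄) divided by the sum of (xn - x̄)^2, and w0 = t̄ - w1 x̄. The sketch below implements this; the function name and the data are hypothetical, chosen so that the expected result is obvious.

```python
import numpy as np

def fit_simple_linear_regression(x, t):
    """Return (w0, w1) minimizing the sum of squared errors for t ~ w0 + w1 * x."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    x_bar, t_bar = x.mean(), t.mean()
    w1 = np.sum((x - x_bar) * (t - t_bar)) / np.sum((x - x_bar) ** 2)
    w0 = t_bar - w1 * x_bar
    return w0, w1

# Hypothetical data lying exactly on t = 1 + 2x, so the fit should recover w0 = 1, w1 = 2.
x = np.array([0.0, 1.0, 2.0, 3.0])
t = 1.0 + 2.0 * x
w0, w1 = fit_simple_linear_regression(x, t)
print(w0, w1)
```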

– r2 (coefficient of determination):
– metric for evaluating regression predictions
– define three terms:
– 1) total sum of squares (SST) --> the total variation of t in the
sample data
– 2) explained sum of squares (SSR) --> the part explained by the
regression, i.e. the variation in t explained by the regression function
– 3) error sum of squares (SSE) --> the part not explained by the
regression: the sum over all observations of the squared difference
between the actual value of t and the prediction
– therefore we have SST = SSR (part explained by regression) + SSE
– R squared = proportion of variation in t explained by x:
– r2 = SSR/SST = 1 - SSE / SST
– output is a number between 0 and 1
– if = 1, all data points are on the regression line
– if = 0, the regression explains none of the variation in t
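
The sum-of-squares decomposition can be checked numerically. The sketch below uses made-up data and `np.polyfit` for the least squares fit; the identity SST = SSR + SSE relies on the fit being a least squares fit that includes an intercept.

```python
import numpy as np

# Hypothetical data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
t = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least squares fit of t ~ w0 + w1 * x (np.polyfit returns the highest-degree coefficient first).
w1, w0 = np.polyfit(x, t, deg=1)
y = w0 + w1 * x

sst = np.sum((t - t.mean()) ** 2)   # total sum of squares
ssr = np.sum((y - t.mean()) ** 2)   # explained (regression) sum of squares
sse = np.sum((t - y) ** 2)          # error sum of squares

r2 = ssr / sst
print(r2, 1.0 - sse / sst)
```

Both printed values agree, which is the numerical counterpart of r2 = SSR/SST = 1 - SSE/SST.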
– the above is the calculus-based solution
– but let's check the geometrical approach:
– suppose the population regression line goes through the origin, meaning
that the intercept w0 = 0
– then follow the same approach; the result is the same, only the
regression error is different
– the error vector should be perpendicular to the vector x, meaning that
the dot product of x and the error vector should be 0
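
Written out, with x the vector of inputs, t the vector of observed targets, and a single weight w (no intercept), the orthogonality condition gives the least squares weight directly:

```latex
\mathbf{x}^{\top}(\mathbf{t} - w\,\mathbf{x}) = 0
\quad\Longrightarrow\quad
w = \frac{\mathbf{x}^{\top}\mathbf{t}}{\mathbf{x}^{\top}\mathbf{x}}
  = \frac{\sum_{n} x_n t_n}{\sum_{n} x_n^{2}}
```

This is the same weight one obtains by setting the derivative of the sum of squared errors with respect to w to zero.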
– in the least squares solution:
– y is a linear combination of the columns of X
– y = Xw
– we want the error vector to be as small as possible
– the observed target values are in general not a linear combination of
the predictor variables --> if they were, we'd get a perfect regression
fit (all the points on it), which is practically never the case with
real data
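
With several predictors the geometric picture is the same: the fitted vector y = Xw lies in the column space of X and the error vector is orthogonal to every column. The sketch below checks this numerically; the data, the true weights and the intercept column of ones are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: N = 30 observations, intercept column plus two predictors.
N = 30
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
t = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=N)

# Least squares weights; lstsq minimizes ||t - Xw||^2.
w, *_ = np.linalg.lstsq(X, t, rcond=None)

y = X @ w          # fitted values: a linear combination of the columns of X
e = t - y          # error vector

# The error vector is orthogonal to every column of X (up to rounding error).
print(X.T @ e)
```

The same weights solve the normal equations X^T X w = X^T t, which is just the orthogonality condition X^T (t - Xw) = 0 rearranged.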
– general linear model:
– the term linear means that it has to be a linear function of the
coefficients/weights, but not necessarily of the input variables
– it is linear in the features, but not necessarily in the input variables
that generate them
– an interaction between two variables means that the effect of one on the
target depends on the value of the other (e.g. age and income on pizza
expenditure)
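
A small sketch of "linear in the weights, not in the inputs": the features below include a squared term and an age-income interaction, yet the model is still fitted with ordinary least squares. The feature choice, the variable scales and the weights are invented for illustration, loosely following the pizza expenditure example.

```python
import numpy as np

def make_features(age, income):
    """Feature matrix with intercept, main effects, a squared term and an interaction term."""
    age = np.asarray(age, dtype=float)
    income = np.asarray(income, dtype=float)
    return np.column_stack([
        np.ones_like(age),   # intercept
        age,
        income,
        age ** 2,            # nonlinear in the input, still linear in its weight
        age * income,        # interaction between age and income
    ])

# Hypothetical noise-free data generated from known weights (age in decades).
rng = np.random.default_rng(2)
age = rng.uniform(1.8, 7.0, size=100)
income = rng.uniform(1.0, 6.0, size=100)
true_w = np.array([5.0, 0.3, 2.0, -0.04, -0.2])
t = make_features(age, income) @ true_w

# Ordinary least squares on the features recovers the weights.
w, *_ = np.linalg.lstsq(make_features(age, income), t, rcond=None)
print(w)
```

Because of the interaction weight, the effect of income on the target is 2.0 - 0.2 * age here, so it changes with age.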
– regularized least squares:
– add a regularization term to control overfitting, e.g. in cases
where the number of predictor variables is very large
– here use ridge regression (known as weight decay in NN)
– penalize the size of the weights
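
A minimal sketch of ridge regression, assuming the usual objective ||t - Xw||^2 + lam * ||w||^2, whose minimizer is w = (lam * I + X^T X)^(-1) X^T t. For simplicity there is no intercept column, so every weight is penalized; the function name `ridge_fit`, the data and the value of lam are hypothetical.

```python
import numpy as np

def ridge_fit(X, t, lam):
    """Weights minimizing ||t - Xw||^2 + lam * ||w||^2 (closed form)."""
    n_features = X.shape[1]
    return np.linalg.solve(lam * np.eye(n_features) + X.T @ X, X.T @ t)

# Hypothetical data: few observations relative to the number of predictors.
rng = np.random.default_rng(3)
X = rng.normal(size=(20, 15))
t = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=20)

w_ols = ridge_fit(X, t, lam=0.0)      # lam = 0 gives ordinary least squares
w_ridge = ridge_fit(X, t, lam=10.0)   # lam > 0 shrinks the weights toward zero

print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

The norm of the ridge weights is smaller than that of the unregularized weights, which is what "penalizing the size of the weights" amounts to. In practice lam is a hyperparameter chosen on held-out data.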
