Complete summary of the exam material for the course Introduction to Computer Vision in the 2nd year of the bachelor's programmes Informatica & Kunstmatige Intelligentie (Computer Science & Artificial Intelligence) at the Universiteit van Amsterdam. The summary is in English, like most of the course.
All lectures and lec...
Introduction to Computer Vision Summary CHoogteijling
Introduction to Computer Vision
Contents
1 Interpolation
2 Point operators
3 Histogram based image operations
4 Least Squares Estimators
5 Geometric operators
6 Homogeneous coordinates
7 Local operators
8 Local structure
9 Image stitching using SIFT
10 Pinhole camera
11 Convolutional neural networks
12 Motion
1 Interpolation
Interpolation allows us to find the value of a function in between the points where the image is
sampled.
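Nearest neighbor interpolation: given the samples F(k), the value of the interpolated function
$\hat{f}$ at coordinate x is the value of the sample closest to x. The resulting function is
neither continuous nor differentiable.
$$\hat{f}(x) = F\left(\left\lfloor x + \tfrac{1}{2} \right\rfloor\right)$$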
Linear interpolation: between adjacent sample points k and k + 1 we assume that the function is
linear. The interpolated function is continuous but not differentiable at the sample points.
Between the sample points, the second and higher order derivatives are equal to zero.
$$k \le x \le k + 1: \quad \hat{f}(x) = (1 - (x - k))\,F(k) + (x - k)\,F(k + 1)$$
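The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, assuming the samples are stored in a list `F` indexed by k and that x lies inside the sampled domain; the function names are hypothetical and not taken from the course material.

```python
import math

def nearest_neighbor(F, x):
    """Nearest neighbor interpolation: f_hat(x) = F(floor(x + 1/2))."""
    if not 0 <= x <= len(F) - 1:
        raise ValueError("x lies outside the sampled domain")
    return F[math.floor(x + 0.5)]

def linear(F, x):
    """Linear interpolation between the samples F[k] and F[k + 1]."""
    if len(F) < 2 or not 0 <= x <= len(F) - 1:
        raise ValueError("x lies outside the sampled domain")
    # Clamp k so that x equal to the last sample index is still handled.
    k = min(math.floor(x), len(F) - 2)
    t = x - k
    return (1 - t) * F[k] + t * F[k + 1]

F = [0.0, 10.0, 20.0, 15.0]
print(nearest_neighbor(F, 1.4))  # 10.0
print(linear(F, 1.25))           # 12.5
```

Note that `nearest_neighbor` jumps from one sample value to the next at x = k + 1/2, which is where the discontinuity mentioned above shows up.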
Cubic interpolation uses two samples on the left and two on the right. To interpolate the
function value f(x) for x between x = k and x = k + 1, we fit a cubic polynomial to the
sample points {k − 1, k, k + 1, k + 2}.
$$k \le x \le k + 1: \quad \hat{f}(x) = a(x - k)^3 + b(x - k)^2 + c(x - k) + d$$
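As a sketch of how the coefficients could be obtained: requiring the polynomial to pass through the four samples at offsets x − k = −1, 0, 1, 2 gives four linear equations in a, b, c and d, which can be solved by hand. The code below assumes exactly this "pass through all four samples" fit; the course may use a different cubic scheme, and the function names are hypothetical.

```python
import math

def cubic_coefficients(p0, p1, p2, p3):
    """Coefficients of the cubic through (-1, p0), (0, p1), (1, p2), (2, p3)."""
    a = (p3 - 3 * p2 + 3 * p1 - p0) / 6
    b = (p0 + p2) / 2 - p1
    c = (p2 - p0) / 2 - a
    d = p1
    return a, b, c, d

def cubic_interpolate(F, x):
    """Evaluate a*(x-k)^3 + b*(x-k)^2 + c*(x-k) + d with k = floor(x)."""
    k = math.floor(x)
    # The fit needs the samples k-1 .. k+2, so no border handling is done here.
    if k < 1 or k + 2 > len(F) - 1:
        raise ValueError("need two samples on each side of x")
    a, b, c, d = cubic_coefficients(F[k - 1], F[k], F[k + 1], F[k + 2])
    t = x - k
    return ((a * t + b) * t + c) * t + d

# Samples of x^3: a cubic fit reproduces the original function.
F = [k ** 3 for k in range(5)]
print(cubic_interpolate(F, 1.5))  # 3.375
```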
There are better interpolation methods, for example using more samples. A disadvantage of higher
order polynomials is overfitting of the original function: higher order polynomials tend to fluctuate
wildly in between the sample points.
For 2D functions we can also use nearest neighbor, cubic and spline interpolation. We first
interpolate in the x-direction and then in the y-direction.
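The "first x, then y" idea can be sketched as follows. For brevity the example uses linear interpolation in both directions (bilinear interpolation), which is an assumption made here for illustration; the same two-pass structure applies to the other 1D methods. The image is assumed to be a list of rows, so `img[y][x]` is the sample at column x and row y.

```python
import math

def lerp(a, b, t):
    """1D linear interpolation between the values a and b."""
    return (1 - t) * a + t * b

def bilinear(img, x, y):
    """Interpolate in the x-direction on two rows, then in the y-direction."""
    h, w = len(img), len(img[0])
    if h < 2 or w < 2 or not (0 <= x <= w - 1 and 0 <= y <= h - 1):
        raise ValueError("point lies outside the sampled domain")
    i = min(math.floor(x), w - 2)  # column of the left samples
    j = min(math.floor(y), h - 2)  # row of the upper samples
    tx, ty = x - i, y - j
    top = lerp(img[j][i], img[j][i + 1], tx)
    bottom = lerp(img[j + 1][i], img[j + 1][i + 1], tx)
    return lerp(top, bottom, ty)

img = [[0.0, 10.0],
       [20.0, 30.0]]
print(bilinear(img, 0.5, 0.5))  # 15.0
```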
Extrapolation allows us to find the value of a function outside the domain of the image. To find
the value we can:
• Set the value of a point outside the bounds of the grid to a constant value (often zero).
• Pick a point that is within the bounds of the grid and use the (interpolated) value in that
point.
– Closest point. We select the point inside the bounds of the grid that is closest to the
outside point.
– Mirrored point. We mirror the outside point in the line through the last sample points
of the grid, for example in the vertical line through the last samples in the horizontal direction.
– Wrapping. We select the same point from inside the bounds. Imagine a tiled wall and
each tile showing the same image. This is what the discrete Fourier transform implicitly
assumes.
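The options listed above can be sketched for a 1D row of samples. This is a minimal illustration for integer indices only; the function name `sample` and the mode names are hypothetical. The mirror mode reflects about the first or last sample itself, following the description above.

```python
def sample(F, i, mode="constant", cval=0.0):
    """Return F[i], extrapolating when the index i lies outside the grid."""
    n = len(F)
    if 0 <= i < n:
        return F[i]
    if mode == "constant":
        return cval                      # fixed value outside the grid
    if mode == "closest":
        return F[min(max(i, 0), n - 1)]  # clamp to the nearest border sample
    if mode == "mirror":
        if n == 1:
            return F[0]
        period = 2 * (n - 1)             # mirroring repeats with this period
        j = i % period
        return F[period - j] if j >= n else F[j]
    if mode == "wrap":
        return F[i % n]                  # tiled copies of the same signal
    raise ValueError("unknown mode: " + mode)

F = [1, 2, 3, 4]
print(sample(F, 4, "mirror"))  # 3
print(sample(F, 4, "wrap"))    # 1
```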
1.1 Image histograms
A histogram of the scalar pixel values in an image summarizes how those values are distributed
over the range of possible values. There is no definitive rule for choosing an appropriate bin size.
One rule of thumb is Sturges' rule, $k = \lceil \log_2 n \rceil + 1$, with k the number of bins and n the number of samples.
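The function for a univariate histogram is given below, where the $e_i$ are the bin edges, the sum
runs over all sample points x in E, and the bracket $[\cdot]$ equals 1 if the condition holds and 0 otherwise:
$$h_f[i] = \sum_{x \in E} \left[ e_i \le f(x) < e_{i+1} \right]$$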
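The histogram definition and Sturges' rule can be sketched directly. This is a minimal illustration, assuming the pixel values are given as a flat list and the bin edges as an increasing list; the function names are hypothetical.

```python
import math

def sturges_bins(n):
    """Sturges' rule of thumb for the number of bins: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1

def histogram(values, edges):
    """Count, for each bin i, the values v with edges[i] <= v < edges[i + 1]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

pixels = [0, 1, 1, 2, 3, 3, 3, 4]
print(histogram(pixels, [0, 2, 4]))  # [3, 4]
print(sturges_bins(1000))            # 11
```

The value 4 is not counted in the example because the last bin is half-open, exactly as in the formula.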
To capture a histogram of data that is multi-dimensional, we can compute a multivariate
histogram.
A point operator is a function that is constructed by pointwise lifting a value operator to the
image domain. Consider two images f : D → R and g : D → R′, and let γ : R × R′ → R′′ be an
operator on values. The operator γ can be lifted to work on images:
$$\forall x \in D: \quad \gamma(f, g)(x) = \gamma(f(x), g(x))$$
α-blending takes the weighted average of two images. Let f and g be two colour images defined
on the same spatial domain. A sequence of images that shows a smooth transition from f to g can
be obtained from the following equation for α-values increasing from 0 to 1.
$$h_\alpha = (1 - \alpha) f + \alpha g$$
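Pointwise lifting and α-blending can be sketched together. The example assumes scalar images stored as lists of rows (colour images would apply the same operation per channel); `lift` and `alpha_blend` are hypothetical names chosen for this illustration.

```python
def lift(gamma):
    """Lift a value operator gamma(a, b) to an operator on two images."""
    def lifted(f, g):
        return [[gamma(a, b) for a, b in zip(row_f, row_g)]
                for row_f, row_g in zip(f, g)]
    return lifted

def alpha_blend(f, g, alpha):
    """h_alpha = (1 - alpha) * f + alpha * g, applied pixel by pixel."""
    return lift(lambda a, b: (1 - alpha) * a + alpha * b)(f, g)

f = [[0.0, 100.0]]
g = [[200.0, 100.0]]
print(alpha_blend(f, g, 0.25))  # [[50.0, 100.0]]

# A transition from f to g for alpha = 0, 0.25, 0.5, 0.75, 1.
frames = [alpha_blend(f, g, step / 4) for step in range(5)]
```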
Unsharp masking uses α-blending to sharpen an image. Let f be an image and g an unsharp
version of the image. The result is obtained by adding β times the difference f − g to the original
image. This is the blending equation with α = −β, since f + β(f − g) = (1 + β)f − βg.
$$h = f + \beta (f - g)$$
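A minimal sketch of the unsharp masking formula, assuming scalar images stored as lists of rows. How the unsharp image g is produced (for example by smoothing f) is not specified here, so g is simply passed in; the function name is hypothetical.

```python
def unsharp_mask(f, g, beta):
    """h = f + beta * (f - g), applied pixel by pixel."""
    return [[a + beta * (a - b) for a, b in zip(row_f, row_g)]
            for row_f, row_g in zip(f, g)]

f = [[10.0, 20.0]]  # original image
g = [[12.0, 16.0]]  # an unsharp version of f (made-up values)
print(unsharp_mask(f, g, 0.5))  # [[9.0, 22.0]]
```

In the example the difference between the two pixels grows from 10 to 13, which is the intended increase in contrast.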
Image thresholding uses a relational operator. Let f be a scalar image; then [f > t], for a
constant t, is a binary image that is 1 where f(x) > t and 0 elsewhere.
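Thresholding can be sketched in the same pixel-by-pixel style, assuming a scalar image stored as a list of rows; the function name is hypothetical.

```python
def threshold(f, t):
    """Binary image [f > t]: 1 where the pixel value exceeds t, else 0."""
    return [[1 if value > t else 0 for value in row] for row in f]

f = [[1, 5],
     [7, 3]]
print(threshold(f, 3))  # [[0, 1], [1, 0]]
```

The comparison is strict, so a pixel with value exactly t maps to 0.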
3 Histogram based image operations
A monadic point operator is an operator that changes the pixel value f(x) independently of the
position x and independently of all other pixel values in the neighbourhood.