Complete summary of the contents for the exam for the course Introduction to Computer Science in the 2nd year of the Computer Science & Artificial Intelligence bachelor's degrees at the University of Amsterdam. The summary is in English, like the course.
All lectures and lecture notes are in the s...
Introduction to Computer Vision Summary CHoogteijling
Introduction to Computer Vision
Contents
1 Interpolation
2 Point operators
3 Histogram based image operations
4 Least Squares Estimators
5 Geometric operators
6 Homogeneous coordinates
7 Local operators
8 Local structure
9 Image stitching using SIFT
10 Pinhole camera
11 Convolutional neural networks
12 Motion
1 Interpolation
Interpolation allows us to find the value of a function in between the points where the image is
sampled.
Nearest neighbor interpolation assigns to the interpolated function $\hat{f}$ at coordinate x the value
of the nearest sample F(k). The resulting function is piecewise constant: it is discontinuous, and
hence not differentiable, halfway between the sample points.
$$\hat{f}(x) = F\left(\left\lfloor x + \tfrac{1}{2} \right\rfloor\right)$$
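A minimal Python sketch of the nearest neighbor rule, which rounds x to the nearest sample index by taking the floor of x + 1/2. The function name `nearest_neighbor` and the list-of-samples representation are assumptions made for illustration, and x is assumed to lie inside the sampled range.

```python
import math

def nearest_neighbor(samples, x):
    """Return F(floor(x + 1/2)) for samples F(0), F(1), ..., F(n - 1)."""
    k = math.floor(x + 0.5)  # index of the sample closest to x
    return samples[k]

F = [10, 20, 40]
print(nearest_neighbor(F, 0.4))  # closest sample is F(0)
print(nearest_neighbor(F, 0.5))  # halfway points round up to F(1)
print(nearest_neighbor(F, 1.7))  # closest sample is F(2)
```

The jump from F(0) to F(1) at x = 0.5 is the discontinuity mentioned above.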
Linear interpolation assumes that between adjacent sample points k and k + 1 the function is
linear. The interpolated function is continuous but not differentiable at the sample points. Between
the sample points, the second and higher order derivatives are equal to zero.
$$k \le x \le k + 1: \quad \hat{f}(x) = (1 - (x - k))\,F(k) + (x - k)\,F(k + 1)$$
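As a rough sketch, linear interpolation weights the two neighbouring samples F(k) and F(k + 1) by the distance of x to each of them. The name `linear_interpolate` is hypothetical, and x is assumed to lie between the first and last sample.

```python
import math

def linear_interpolate(samples, x):
    """Linearly interpolate samples F(0..n-1) at a real coordinate x."""
    k = math.floor(x)
    if k == len(samples) - 1:  # x coincides with the last sample
        return float(samples[k])
    t = x - k                  # fractional offset in [0, 1)
    return (1 - t) * samples[k] + t * samples[k + 1]

F = [10, 20, 40]
print(linear_interpolate(F, 0.5))   # halfway between 10 and 20
print(linear_interpolate(F, 1.25))  # a quarter of the way from 20 to 40
```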
With cubic interpolation we look at two pixels on the left and two on the right. To interpolate
the function value f (x) for x in between x = k and x = k + 1 we fit a cubic polynomial to the
sample points {k − 1, k, k + 1, k + 2}.
$$k \le x \le k + 1: \quad \hat{f}(x) = a(x - k)^3 + b(x - k)^2 + c(x - k) + d$$
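The coefficients a, b, c and d can be found by solving four linear equations. The sketch below assumes that "fit" means the polynomial passes exactly through the four samples at offsets −1, 0, 1 and 2 from k; other cubic schemes exist and the course may use a different one. The names `cubic_coefficients` and `cubic_interpolate` are hypothetical.

```python
import math

def cubic_coefficients(p0, p1, p2, p3):
    """Coefficients of a*t^3 + b*t^2 + c*t + d through (-1, p0), (0, p1), (1, p2), (2, p3)."""
    a = (p3 - 3 * p2 + 3 * p1 - p0) / 6
    b = (p0 + p2) / 2 - p1
    c = (p2 - p0) / 2 - a
    d = p1
    return a, b, c, d

def cubic_interpolate(samples, x):
    """Interpolate at x, assuming 1 <= floor(x) <= len(samples) - 3."""
    k = math.floor(x)
    t = x - k
    a, b, c, d = cubic_coefficients(samples[k - 1], samples[k],
                                    samples[k + 1], samples[k + 2])
    return ((a * t + b) * t + c) * t + d  # Horner evaluation

# Samples of f(k) = k^3: a cubic fit should reproduce the function itself.
cubes = [k ** 3 for k in range(5)]
print(cubic_interpolate(cubes, 1.5), 1.5 ** 3)
```

Sampling a function that is itself a cubic is a convenient sanity check, because the fitted polynomial must then coincide with it between the samples.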
There are better interpolation methods, for example using more samples. A disadvantage of higher
order polynomials is overfitting of the original function: higher order polynomials tend to fluctuate
wildly in between the sample points.
For 2D functions we can also use nearest neighbor, cubic and spline interpolation. We first
interpolate in the x-direction and then in the y-direction.
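The "first x, then y" idea can be illustrated with linear interpolation as the 1D building block (often called bilinear interpolation). This choice is an assumption made to keep the sketch short; the same two-pass structure applies to the other methods. The image is assumed to be a list of rows indexed as `image[y][x]`, and (x, y) is assumed to lie strictly inside the grid.

```python
import math

def bilinear_interpolate(image, x, y):
    """Interpolate image[y][x] at real coordinates, first along x, then along y."""
    kx, ky = math.floor(x), math.floor(y)
    tx, ty = x - kx, y - ky
    # Pass 1: interpolate along x on the two rows that enclose y.
    top = (1 - tx) * image[ky][kx] + tx * image[ky][kx + 1]
    bottom = (1 - tx) * image[ky + 1][kx] + tx * image[ky + 1][kx + 1]
    # Pass 2: interpolate the two intermediate values along y.
    return (1 - ty) * top + ty * bottom

image = [[0, 10],
         [20, 30]]
print(bilinear_interpolate(image, 0.5, 0.5))  # centre of the four pixels
```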
Extrapolation allows us to find the value of a function outside the domain of the image. To find
the value we can:
• Set the value of a point outside the bounds of the grid to a constant value (often zero).
• Pick a point that is within the bounds of the grid and use the (interpolated) value in that
point.
– Closest point. We select the point inside the bounds of the grid that is closest to the
outside point.
– Mirrored point. We mirror the outside point in the vertical line through the last
sample points in horizontal direction of the grid.
– Wrapping. We select the same point from inside the bounds. Imagine a tiled wall and
each tile showing the same image. This is what the discrete Fourier transform implicitly
assumes.
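The options above can be sketched for a 1D row of samples by mapping an out-of-bounds index to either a constant or an in-bounds index. The function name `extrapolate` and the mode names are hypothetical; the mirror mode reflects about the first or last sample and assumes at least two samples.

```python
def extrapolate(samples, k, mode="constant", constant=0):
    """Return a value for integer index k, which may lie outside 0..n-1."""
    n = len(samples)
    if 0 <= k < n:
        return samples[k]
    if mode == "constant":
        return constant
    if mode == "closest":
        return samples[min(max(k, 0), n - 1)]  # clamp to the border
    if mode == "mirror":
        period = 2 * (n - 1)                   # reflection repeats with this period
        m = k % period
        return samples[m if m < n else period - m]
    if mode == "wrap":
        return samples[k % n]                  # tiled copies of the signal
    raise ValueError("unknown mode: " + mode)

row = [1, 2, 3, 4]
for mode in ("constant", "closest", "mirror", "wrap"):
    print(mode, extrapolate(row, -1, mode), extrapolate(row, 4, mode))
```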
1.1 Image histograms
A histogram of the scalar pixel values in an image summarizes how those values are distributed
over the range of possible values. There is no definitive rule for choosing an appropriate bin size.
One rule of thumb is Sturges' rule k = ⌈log2 n⌉ + 1, where n is the number of samples and k the
number of bins.
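The univariate histogram h_f counts, for each bin i with edges e_i and e_{i+1}, the pixels x ∈ E whose
value f(x) falls in that bin. The square brackets are Iverson brackets: 1 if the condition holds and
0 otherwise.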
$$h_f[i] = \sum_{x \in E} \left[\, e_i \le f(x) < e_{i+1} \,\right]$$
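A small sketch of both ideas: Sturges' rule for the number of bins, and a histogram that counts the values falling in each half-open bin [e_i, e_{i+1}). The function names and the flat list of pixel values are assumptions made for illustration.

```python
import math

def sturges_bins(n):
    """Number of bins suggested by Sturges' rule for n samples."""
    return math.ceil(math.log2(n)) + 1

def histogram(values, edges):
    """Count the values v with edges[i] <= v < edges[i + 1] for each bin i."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break  # each value falls in at most one bin
    return counts

pixels = [0, 1, 1, 2, 3, 3, 3]
print(histogram(pixels, [0, 2, 4]))  # two bins: [0, 2) and [2, 4)
print(sturges_bins(100))
```

Values outside the outermost edges are silently ignored in this sketch, which follows from the half-open bin definition.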
To capture a histogram of data that is multi-dimensional, we can compute a multivariate histogram.
A point operator γ is a function that is constructed by lifting a value operator pointwise to the
image domain. Consider two images f : D → R and g : D → R′, and let γ : R × R′ → R′′ be an
operator on pixel values. The operator γ can be lifted to work on images:
∀x ∈ D : γ(f, g)(x) = γ(f (x), g(x))
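Lifting can be sketched as a higher-order function that turns an operator on two values into an operator on two images. The name `lift` and the list-of-rows image representation are assumptions for illustration; both images are assumed to share the same domain.

```python
def lift(gamma):
    """Turn a value operator gamma(a, b) into an operator on two images."""
    def image_operator(f, g):
        return [[gamma(a, b) for a, b in zip(row_f, row_g)]
                for row_f, row_g in zip(f, g)]
    return image_operator

f = [[1, 5], [3, 2]]
g = [[4, 0], [3, 9]]
pixelwise_max = lift(max)        # max on values becomes max on images
print(pixelwise_max(f, g))
print(lift(lambda a, b: a + b)(f, g))
```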
α-blending takes the weighted average of two images. Let f and g be two colour images defined
on the same spatial domain. A sequence of images that shows a smooth transition from f to g is
obtained from the following equation for α-values increasing from 0 to 1.
hα = (1 − α)f + αg
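A minimal sketch of α-blending, assuming scalar (grey value) images stored as lists of rows for brevity; for colour images the same weighting would be applied per channel. The name `alpha_blend` is hypothetical.

```python
def alpha_blend(f, g, alpha):
    """Return (1 - alpha) * f + alpha * g, computed pixel by pixel."""
    return [[(1 - alpha) * a + alpha * b for a, b in zip(row_f, row_g)]
            for row_f, row_g in zip(f, g)]

f = [[0, 100]]
g = [[200, 100]]
# A short transition sequence from f (alpha = 0) to g (alpha = 1).
for alpha in (0.0, 0.25, 0.5, 1.0):
    print(alpha, alpha_blend(f, g, alpha))
```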
Unsharp masking sharpens an image with the same kind of blend. Let f be an image and g an
unsharp (blurred) version of it. The result is obtained by adding β times the difference f − g to
the original image, which corresponds to α-blending with α = −β.
h = f + β(f − g)
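The effect is easiest to see on a single row containing an edge. In the sketch below the blurred image g is simply given by hand rather than computed, and the name `unsharp_mask` is hypothetical.

```python
def unsharp_mask(f, g, beta):
    """Return f + beta * (f - g), where g is a blurred version of f."""
    return [[a + beta * (a - b) for a, b in zip(row_f, row_g)]
            for row_f, row_g in zip(f, g)]

f = [[10, 10, 50, 50]]  # a step edge
g = [[10, 20, 40, 50]]  # the same edge, smoothed (values chosen by hand)
print(unsharp_mask(f, g, 1))
```

The result undershoots on the dark side of the edge and overshoots on the bright side, which is what makes the edge look sharper. In practice the output may leave the valid value range and need clipping.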
Image thresholding uses a relational operator. Let f be a scalar image, then [f > t], for constant
t, results in a binary image.
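A short sketch of thresholding, with the binary result represented by the integers 0 and 1. The name `threshold` and the list-of-rows image are assumptions for illustration.

```python
def threshold(f, t):
    """Return the binary image [f > t] as rows of 0s and 1s."""
    return [[1 if value > t else 0 for value in row] for row in f]

f = [[10, 200],
     [128, 129]]
print(threshold(f, 128))  # only values strictly greater than 128 map to 1
```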
3 Histogram based image operations
A monadic point operator is an operator that changes the pixel value f (x) independently of the
position x and independently of all other pixel values in the neighbourhood.
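As an illustration, a monadic point operator can be sketched as a function of a single value that is applied to every pixel. The helper `apply_monadic` and the choice of image inversion on an assumed 0 to 255 value range are hypothetical examples, not taken from the course material.

```python
def apply_monadic(f, op):
    """Apply a value operator op to every pixel, ignoring position and neighbours."""
    return [[op(value) for value in row] for row in f]

def invert(value):
    """Negative of a pixel value, assuming values in the range 0..255."""
    return 255 - value

f = [[0, 64],
     [128, 255]]
print(apply_monadic(f, invert))
```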