CV midterm to final
optical flow, goals and problems - ✅✅ -goal: Optical flow is the pattern of apparent
motion of objects, surfaces, and edges in a visual scene caused by the relative
motion between an observer (an eye or a camera) and the scene.
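problem: aperture problem (observing through a small window makes it impossible to
perceive motion parallel to an edge; only the component normal to the edge is visible).
under-constrained: one equation per pixel but two unknowns (u, v)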
constant brightness ....
that's why motion field is different from optical flow
Horn & Schunck overview and pseudo code - ✅✅
-global constraint of smoothness
to solve aperture problem
method:
assume flow is smooth over the entire image, compute the velocity field [u v] that
minimizes the energy E (brightness constancy error plus a weighted smoothness penalty)
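A minimal sketch of the classic Horn & Schunck iteration, assuming the derivative images fx, fy, ft are already computed. The names `neighbor_mean` and `horn_schunck`, the 4-neighbor average with wrap-around borders, and the default `alpha` are illustrative assumptions, not part of the notes.

```python
import numpy as np

def neighbor_mean(a):
    """Average of the 4 direct neighbors (wrap-around borders, a simplification)."""
    return (np.roll(a, 1, 0) + np.roll(a, -1, 0) +
            np.roll(a, 1, 1) + np.roll(a, -1, 1)) / 4.0

def horn_schunck(fx, fy, ft, alpha=1.0, n_iter=100):
    """Iteratively estimate a smooth flow field (u, v) from image derivatives."""
    u = np.zeros_like(fx, dtype=float)
    v = np.zeros_like(fx, dtype=float)
    denom = alpha ** 2 + fx ** 2 + fy ** 2
    for _ in range(n_iter):
        u_bar = neighbor_mean(u)   # smoothness: pull each pixel toward its neighbors
        v_bar = neighbor_mean(v)
        # how badly the averaged flow violates fx*u + fy*v + ft = 0
        residual = (fx * u_bar + fy * v_bar + ft) / denom
        u = u_bar - fx * residual
        v = v_bar - fy * residual
    return u, v
```

A larger `alpha` weights smoothness more heavily and gives a smoother, slower-converging field.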
motion field vs optical flow - ✅✅ -motion field is the true 3D motion of scene points
projected onto the 2D image plane, forming a vector field.
optical flow is the apparent motion of brightness patterns in an image sequence. It is
what we can observe of the motion field, and the two can differ.
optical flow equation - ✅✅ -f(x,y,t)=f(x+dx,y+dy,t+dt)
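A short worked derivation, assuming small displacements so that a first-order Taylor expansion is valid. The symbols u and v (pixel velocities) follow the [u v] notation used in the Horn & Schunck card.

```latex
\begin{aligned}
f(x+dx,\,y+dy,\,t+dt) &\approx f(x,y,t) + f_x\,dx + f_y\,dy + f_t\,dt \\
\Rightarrow\quad f_x\,dx + f_y\,dy + f_t\,dt &= 0 \\
\Rightarrow\quad f_x\,u + f_y\,v + f_t &= 0, \qquad u = \frac{dx}{dt},\; v = \frac{dy}{dt}
\end{aligned}
```

This is one equation in two unknowns per pixel, which is why the problem is under-constrained.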
Lucas kanade optical flow naive - ✅✅-assume constant motion inside a
neighborhood
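assume we have a window of 5x5 around the pixel of interest (25 pixels), giving us Ax=b
where A has 25 rows (one [fx fy] row per pixel), x=[u v]^T and b holds the -ft values.
Then use the least squares method to solve: A^T A x = A^T b
this minimizes the squared error with respect to u and v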
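A minimal sketch of the least-squares step for one window, assuming the derivative patches fx, fy, ft are already computed. The function name `lucas_kanade_window` is illustrative.

```python
import numpy as np

def lucas_kanade_window(fx, fy, ft):
    """Solve A^T A x = A^T b for the flow x = [u, v] of one window."""
    A = np.stack([fx.ravel(), fy.ravel()], axis=1)  # 25 x 2 for a 5x5 window
    b = -ft.ravel()                                 # 25 values
    AtA = A.T @ A                                   # 2 x 2
    Atb = A.T @ b                                   # 2 values
    # np.linalg.solve raises LinAlgError when AtA is singular,
    # e.g. a flat patch; a pure edge (aperture problem) is ill-conditioned.
    return np.linalg.solve(AtA, Atb)
```

In practice the window is accepted only when both eigenvalues of A^T A are large enough, which is the same condition that makes a point a good corner.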
pyramid lucas kanade: goal and method - ✅✅ -deal with cases where naive LK fails: large
motion, a point that doesn't move like its neighbors, brightness not constant
method: the lower-resolution levels let the algorithm capture large, global motions,
while the subsequent finer levels refine the flow to capture smaller, localized
movements. Pyramid Lucas-Kanade is more robust to large motion in images and is one
of the standard approaches used in computer vision for estimating optical flow.
at level i:
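take flow u, v from level i-1 (the coarser level)
interpolate to create u, v matrices of twice the resolution, multiply u, v by 2
compute f_t from the block displaced by u, v
apply LK to get the flow correction
add the correction to u, v to get the flow at level i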
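A minimal sketch of only the "interpolate and multiply by 2" step, assuming nearest-neighbor interpolation for brevity (bilinear is the more usual choice). The name `upsample_flow` is illustrative.

```python
import numpy as np

def upsample_flow(u, v):
    """Double the resolution of a flow field and double its displacements."""
    block = np.ones((2, 2))
    # np.kron repeats every value in a 2x2 block; the factor 2 accounts for
    # one coarse pixel spanning two fine pixels.
    return 2.0 * np.kron(u, block), 2.0 * np.kron(v, block)
```

A displacement of 1 pixel at the coarse level becomes 2 pixels at the next finer level, so the LK step at each level only has to find a small correction.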
simple KLT tracking algorithm - ✅✅ -basic algorithm
1. detect Harris corners in the 1st frame
2. for each corner, compute the motion between consecutive frames
3. link motion vectors in successive frames to get a track
4. introduce new Harris points every m frames (because corners disappear)
5. track new and old Harris points using steps 1-3
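A minimal sketch of the corner detection used in step 1, assuming a plain box window instead of the usual Gaussian weighting. The names `box_sum` and `harris_response` and the defaults r=2 and k=0.04 are illustrative assumptions.

```python
import numpy as np

def box_sum(a, r):
    """Sum of each (2r+1) x (2r+1) neighborhood, zero-padded at the borders."""
    padded = np.pad(a, r, mode="constant")
    h, w = a.shape
    out = np.zeros((h, w))
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, r=2, k=0.04):
    """Harris score R = det(M) - k * trace(M)^2 at every pixel."""
    iy, ix = np.gradient(img.astype(float))  # derivatives along rows, columns
    sxx = box_sum(ix * ix, r)
    syy = box_sum(iy * iy, r)
    sxy = box_sum(ix * iy, r)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2
```

R is positive at corners, negative along edges and zero in flat regions, so corners are the pixels to track.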
jacobian - ✅✅ -a matrix of partial derivatives that represents how small changes in the
parameters p of the warp W(x; p) affect its output (the warped pixel coordinates).
bridges the gap between the parameters of the warp function and the actual pixel displacement
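A minimal sketch for one concrete warp, assuming a 6-parameter affine warp parameterized so that p = 0 is the identity. The parameter ordering and the names `affine_warp` and `affine_jacobian` are assumptions for illustration.

```python
import numpy as np

def affine_warp(x, y, p):
    """W(x; p): warped coordinates of pixel (x, y) under affine parameters p."""
    p1, p2, p3, p4, p5, p6 = p
    return np.array([(1 + p1) * x + p3 * y + p5,
                     p2 * x + (1 + p4) * y + p6])

def affine_jacobian(x, y):
    """dW/dp: a 2 x 6 matrix, one column per parameter."""
    return np.array([[x, 0, y, 0, 1, 0],
                     [0, x, 0, y, 0, 1]], dtype=float)
```

Column i of the Jacobian says how far the warped pixel moves when only parameter p_i changes by a small amount.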
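baker extend KLT method - ✅✅ -1. warp I with W(x; p)
2. subtract the warped image I from the template image T to get the error image
3. compute the image gradient (of the warped I in the forward-additive form; of T in
the inverse-compositional form)
4. evaluate the Jacobian of the warp function with respect to p
5. compute the steepest descent images (gradient times Jacobian) and, from them, the
inverse Hessian
6. multiply the steepest descent images with the error
7. compute the update Δp and apply it to p; repeat until Δp is small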
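The steps above can be written compactly, assuming the forward-additive formulation. The symbols e (error image) and S (steepest descent images) are shorthand introduced here for illustration.

```latex
\begin{aligned}
\text{error:}\quad & e(\mathbf{x}) = T(\mathbf{x}) - I(W(\mathbf{x};\mathbf{p})) \\
\text{steepest descent:}\quad & S(\mathbf{x}) = \nabla I \,\frac{\partial W}{\partial \mathbf{p}} \\
\text{Hessian:}\quad & H = \sum_{\mathbf{x}} S(\mathbf{x})^{\top} S(\mathbf{x}) \\
\text{update:}\quad & \Delta\mathbf{p} = H^{-1} \sum_{\mathbf{x}} S(\mathbf{x})^{\top} e(\mathbf{x}),
\qquad \mathbf{p} \leftarrow \mathbf{p} + \Delta\mathbf{p}
\end{aligned}
```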
baker extend KLT method-math version - ✅✅-
ANN-overview - ✅✅ -simulate biological neural networks
Each neuron computes a weighted sum of its inputs and applies its own activation function
deep learning has more layers, able to handle more complex tasks
reLU function -✅✅-max(0,x)
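A minimal sketch of one fully connected layer with ReLU, assuming NumPy arrays for the inputs and weights. The names `relu` and `dense_layer` are illustrative.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(0, x)

def dense_layer(x, weights, bias):
    """Each row of `weights` is one neuron: weighted sum of inputs, then ReLU."""
    return relu(weights @ x + bias)
```

Stacking several such layers, each feeding the next, gives the deeper networks described in the following card.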
Deep learning overview - ✅✅ -a lot more hidden layers than a classic ANN
often higher accuracy on complex tasks
tends to perform better with larger datasets
CNN overview - ✅✅ -CONVOLUTION LAYER convolve the filter with the image
Convolving an image means sliding a filter over the image and multiplying the filter
values with the image values under it. The products sum up to a single value per position,
so a 32x32x3 image becomes a 28x28x1 map, for example (one 5x5x3 filter, stride 1, no
padding: 32-5+1=28)
note that the filter depth matches the depth of the output from the step directly before
can have multiple channels
ACTIVATION LAYER apply an activation function, for example ReLU: f(x)=max(0,x), where x is the input
POOLING reduces size further (max-pooling for example takes the max of each
region)
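A minimal sketch of the three stages on the 32x32x3 example, assuming a single 5x5x3 filter, stride 1, no padding and 2x2 max-pooling. The function names and the random data are illustrative; like most deep learning libraries, the "convolution" here does not flip the filter.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one filter over the image; each position yields one summed value."""
    h, w, _ = image.shape
    kh, kw, _ = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw, :] * kernel)
    return out

def relu(x):
    return np.maximum(0, x)

def max_pool_2x2(fmap):
    """Keep the maximum of every non-overlapping 2x2 region."""
    h, w = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32, 3))
kernel = rng.normal(size=(5, 5, 3))
fmap = relu(conv2d_valid(image, kernel))
pooled = max_pool_2x2(fmap)
print(fmap.shape, pooled.shape)  # (28, 28) (14, 14)
```

Using several filters would produce one such map per filter, stacked as the channels of the layer output.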