Summary Sensation and Perception, Chapters 8 through 15
Chapter 8 – Visual Motion Perception
Fundamental perceptual dimensions: characteristics of visual stimuli that are directly encoded by
neurons fairly early in the visual system.
Motion Aftereffects
Motion is a low-level perceptual phenomenon. Many cells in the primary visual cortex are selective for motion direction. The Waterfall Illusion demonstrates that motion is processed in a 'special' way in the brain.
→ The motion aftereffect (MAE) = The illusion of motion of a stationary object that occurs after prolonged exposure to a moving object. The illusory motion is in the direction opposite to the adapting motion. This is analogous to color aftereffects: just as color aftereffects are caused by opponent processes for color vision, MAEs are caused by opponent processes for motion detection.
Neurons tuned to different directions of motion generally do not respond to a stationary object; they simply continue to fire at their spontaneous rates, and the spontaneous rates of upward- and downward-sensitive cells balance each other out. That is, neurons sensitive to upward motion fire at about the same rate as neurons sensitive to downward motion, so the signals cancel out and no motion is perceived.
Example MAE: When we look at a waterfall for a prolonged period, the detectors sensitive to
downward motion become adapted. When we then switch our gaze to a stationary object, such as
the rocks next to the waterfall, the neurons sensitive to upward motion fire faster than the adapted
downward-sensitive neurons, and we therefore perceive the rocks as drifting up. Our eyes are
constantly drifting around, so there is always a small amount of retinal motion to stimulate motion
detectors at least slightly.
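The opponent-process account above can be sketched as a toy firing-rate model. All numbers (baseline rate, adaptation factor, decision threshold) are hypothetical values chosen for illustration, not measured ones:

```python
# Toy opponent-process account of the MAE. All firing rates are
# hypothetical spikes/s chosen purely for illustration.

BASELINE = 10.0  # spontaneous rate of both direction channels

def perceived_direction(up_rate, down_rate, threshold=0.5):
    """The sign of the population imbalance decides perceived motion."""
    diff = up_rate - down_rate
    if diff > threshold:
        return "up"
    if diff < -threshold:
        return "down"
    return "none"

# Before adaptation the two channels cancel: no motion is seen.
print(perceived_direction(BASELINE, BASELINE))      # none

# Staring at a waterfall adapts the downward channel below baseline,
# so a stationary rock face now yields a net upward signal.
adapted_down = 0.6 * BASELINE
print(perceived_direction(BASELINE, adapted_down))  # up
```

The same comparison logic runs in both directions: adapting the upward channel instead would make a stationary scene appear to drift down.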
Interocular transfer = The transfer of an effect (such as adaptation) from one eye to the
other. Could motion aftereffects be subserved by neurons in the retinas? What about neurons in the
lateral geniculate nucleus (LGN)?
The fact that a strong MAE is obtained when one eye is adapted and the other is tested means that the effect must reflect the activity of neurons in a part of the visual system where information collected from the two eyes is combined. Input from both eyes is not combined until the primary visual cortex (V1), where neurons show a preference for input from one eye or the other but respond to some extent to stimuli in both eyes. Emerging evidence suggests that the MAE in humans is caused by the same brain region shown to be responsible for global-motion detection in monkeys: the middle temporal area (MT; V5) of the cortex.
It was demonstrated that the direction-selective adaptation produced a selective imbalance
in the functional magnetic resonance imaging (fMRI) signal in human MT, providing evidence that
MAEs are due to a population imbalance in area MT. Remarkably, more recent work shows that
MAEs can occur even after very brief exposures (25 ms). These can be explained by direction-
selective responses of neurons in MT to subsequently presented stationary stimuli.
Computation of Visual Motion
Start with two adjacent receptors (neurons A and B) separated by a fixed distance. A bug moving from left to right would first pass through neuron A's receptive field and, a short time later, enter neuron B's receptive field (see Figure 8.3A). A third cell, M, would then listen to neurons A and B and should be able to detect this movement.
But this motion detector cell M cannot simply add up excitatory inputs from A and B, because then M would fire in response to the moving bug but also to two stationary bugs, one in each receptive field (Figure 8.3B). So we need two additional components: a cell D and a neuron X (Figure 8.3C). Cell D receives input from neuron A and delays transmission of this input for a short period of time. It also has a fast adaptation rate, so after an initial burst it quickly stops firing if the light remains shining on A's receptive field. Cells B and D are connected to neuron X, a multiplication cell that fires only when both B and D are active. By delaying receptor A's response (via D) and then multiplying it by receptor B's response (via X), we create a mechanism that is sensitive to motion (M).
This mechanism would be direction-selective. It is also tuned to speed: only when the bug moves at just the right speed do the delayed response from receptor A and the direct response from receptor B arrive at the same time and reinforce each other. If the bug moves too fast or too slow, the outputs of B and D fall out of sync and no motion is perceived. A realistic circuit would include additional receptors to detect longer-range motion (see Figure 8.3D).
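A minimal sketch of this delay-and-multiply circuit, with cells named A, B, D, X, and M as in the text. The one-time-step delay and the binary on/off signals are illustrative assumptions:

```python
# Minimal Reichardt-style detector sketch. Cells A, B, D, X, M follow the
# text; the one-step delay and binary signals are illustrative assumptions.

def motion_cell_M(a, b, delay=1):
    """Summed output of X (D times B); high only for A-then-B motion."""
    total = 0.0
    for t in range(len(b)):
        i = t - delay
        # Cell D: delayed, fast-adapting copy of A. It responds only to
        # onsets, so a light left shining on A drives it just briefly.
        onset = a[i] - a[i - 1] if i >= 1 else 0.0
        d = max(onset, 0.0)
        total += d * b[t]  # cell X multiplies D's and B's responses
    return total

rightward  = motion_cell_M(a=[0, 1, 0, 0], b=[0, 0, 1, 0])  # A then B
leftward   = motion_cell_M(a=[0, 0, 1, 0], b=[0, 1, 0, 0])  # B then A
stationary = motion_cell_M(a=[1, 1, 1, 1], b=[1, 1, 1, 1])  # two still bugs
print(rightward, leftward, stationary)  # only rightward motion drives M
```

Because D responds only to onsets (fast adaptation), two stationary bugs produce no output, and motion in the wrong direction or at the wrong speed misses the delay window.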
Apparent Motion
One possible objection to the Reichardt model is that it does not, in fact, require continuous motion in order to fire. However, this very property provides an excellent explanation for a visual illusion, called apparent motion, that modern humans experience on a daily basis.
Apparent motion = The illusory impression of smooth motion resulting from objects appearing in different locations in rapid succession. A movie consists of still frames, and if we see these at a sufficiently fast rate (about 30 frames per second), we perceive the position changes over time as motion.
The Correspondence Problem – Viewing through an Aperture
→ The correspondence problem = In reference to motion detection, the problem faced by the motion detection system of knowing which feature in Frame 2 corresponds to a particular feature in Frame 1. Did a given feature move vertically or horizontally? The motion detectors for the different directions compete to determine our overall perception.
Example: the aperture problem, which is the fact that when a moving object is viewed through an aperture (or a receptive field), the direction of motion of a local feature or part of the object may be ambiguous (see Figure 8.6). The motion direction of the grating is ambiguous, because the grating could be moving up and to the left, but it could also be moving just up or just to the left. The motion component parallel to the grating cannot be inferred from the visual input, because there are no perpendicular contours or features on the grating. These different motions can therefore cause identical responses in a motion-sensitive neuron in the visual system. Without the aperture (in this case the window in Figure 8.6), there is no ambiguity and no problem.
To understand the broader implications of the correspondence and aperture problems,
consider the fact that every neuron in V1 has a limited receptive field. In other words, every V1 cell
sees the world through a small aperture. The solution to this problem is to have another set of
neurons listen to the V1 neurons and integrate the potentially conflicting signals.
Detection of Global Motion in Area MT
Lesions to the magnocellular layers of the LGN impair the perception of large, rapidly moving objects. Information from magnocellular neurons feeds into V1 and is then passed on to (among other places) the middle temporal area of the cortex (MT) in nonhuman primates, and then to the medial superior temporal area (MST). MT and MST are considered to be the hub for motion processing. The human homolog of MT is labeled hMT+ (or V5). When MT neurons with a rightward motion preference were electrically stimulated, the monkey showed a strong tendency to report motion in that direction, even though the dots were actually moving in the opposite direction. So MT is critically involved in the processing of global motion.
This motion-sensitive area may have at least two separate maps located on the lateral
surface at the temporal-occipital (TO) boundary. Most neurons in MT are motion direction-selective,
but they show little selectivity for form or color.
In Figure 8.9 you see a correlated-dot-motion display. Just as in the aperture problem demonstration, no single dot in these displays is sufficient to determine the overall direction of correlated motion. So, to detect the correlated direction, a neuron must integrate information from many local-motion detectors (in the original stimuli there were no arrows).
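The pooling idea can be sketched as a toy readout over local-motion detector votes. The 20% coherence level and the unit-length, evenly spread noise votes are illustrative assumptions, not the parameters of the actual experiments:

```python
import math

# Toy MT-style pooling over local-motion votes. The 20% coherence level
# and unit-length votes are illustrative; real displays use random dots.

def pooled_vector(directions):
    """Vector sum of unit-length local-motion detector votes (radians)."""
    x = sum(math.cos(a) for a in directions)
    y = sum(math.sin(a) for a in directions)
    return x, y

coherent = [0.0] * 20                              # 20 dots voting "rightward"
noise = [2 * math.pi * i / 80 for i in range(80)]  # 80 votes that cancel out
x, y = pooled_vector(coherent + noise)
print(round(x, 6), round(y, 6))  # net vote points rightward
```

No single vote determines the outcome; only the integrated population signal reveals the correlated direction, which is the point of the display.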
Lesion studies are often less than completely compelling, because lesions may be incomplete
or may influence other structures.
First-order motion = The motion of an object that is defined by changes in luminance.
Luminance-defined object = An object that is delineated by differences in reflected light.
Second-order motion = The motion of an object that is defined by changes in contrast or texture, but not by luminance.
Texture-defined object (or contrast-defined object) = An object that is defined by differences in contrast or texture, but not by luminance.
Second-Order Motion
As in first-order apparent-motion displays, nothing actually moves in second-order motion. The only thing that changes in our second-order-motion movie is that stripes of dots are inverted from one frame to another (see Figure 8.11). Just as random dot stereograms prove that matching discrete objects across the two eyes is not necessary for stereoscopic depth perception, second-order motion proves that matching discrete objects across movie frames is not necessary for motion perception. Some evidence suggests that the visual system includes specialized mechanisms for second-order motion.
→ A double dissociation = The phenomenon in which one of two functions, such as first- and second-order motion, can be damaged without harm to the other, and vice versa.
The second-order MAE has been found to transfer even more completely between eyes than the first-order MAE does. Second-order-like motion does occur in the real world, especially when an object is effectively camouflaged.
Motion-Induced Blindness (MIB)
If you carefully fixate a central target, stationary targets in the periphery will simply disappear when
a global moving pattern is superimposed.
MIB seems to be somewhat related to the well-known Troxler effect, in which an unchanging
target in peripheral vision will fade and disappear if you steadily fixate a central target. The retinal
image needs to be (relatively) stabilized, so the eyes cannot move the target onto new receptive
fields. The result is that the target is effectively not changing, and the underlying neurons become
adapted.
Using Motion Information
Going with the Flow: Using Motion Information to Navigate
Safe navigation is one of the primary functions of the visual system.
The optic array = The collection of light rays that interact with objects in the world in front of a viewer. Some of these rays strike our retinas, enabling us to see.
Optic flow = The changing angular positions of points in a perspective image that we experience as we move through the world.
Focus of expansion = The point in the center of the horizon from which, when we're in motion, all points in the perspective image seem to emanate. The focus of expansion is one aspect of optic flow (see Figure 8.12).
This only occurs when the head and eyes remain fixed and pointed straight ahead. As soon as gaze shifts to one side, a new radial component is introduced into the optic flow. If the radial shift is relatively slow, observers can compensate for simulated eye movements just as readily as they do for real eye movements, but at faster simulated eye-movement speeds, performance breaks down. This implies that the visual system makes use of copies of eye muscle signals when it processes optic flow information.
A number of optic flow heuristics:
1. At the most basic level, the mere presence of optic flow indicates locomotion, and a lack of
flow is a signal that you are stationary.
2. Outflow (flow towards the periphery) indicates that you are approaching a particular
destination; inflow indicates retreat.
3. And the focus of expansion (or focus of contraction, if you're looking forward while driving in reverse) tells you where you're going to or coming from.
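These heuristics can be sketched on a synthetic flow field. The 3×3 sample grid and the pure expansion/contraction fields below are illustrative constructions, not actual retinal data:

```python
# Sketch of the three optic-flow heuristics on a toy flow field. The 3x3
# sample grid and pure expansion/contraction fields are synthetic examples.

def classify(flow):
    """flow: list of ((x, y), (vx, vy)) optic-flow samples."""
    if all(vx == 0 and vy == 0 for _, (vx, vy) in flow):
        return "stationary"          # heuristic 1: no flow, no locomotion
    outward = sum(vx * x + vy * y for (x, y), (vx, vy) in flow)
    # heuristics 2-3: outflow means approach, inflow means retreat
    return "approaching" if outward > 0 else "retreating"

grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
outflow = [((x, y), (x, y)) for (x, y) in grid]    # expansion from center
inflow  = [((x, y), (-x, -y)) for (x, y) in grid]  # contraction
still   = [((x, y), (0, 0)) for (x, y) in grid]
print(classify(outflow), classify(inflow), classify(still))
```

The sign of the summed outward component is enough to separate outflow from inflow, which is why these heuristics are cheap for the visual system to apply.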
Avoiding Imminent Collision: The Tao of Tau
How do we estimate the time to collision (TTC) of an approaching object? TTC is the time required for a moving object to hit a stationary object: TTC = distance / rate.
The most direct way to estimate TTC would be to estimate the distance and speed of the
moving object. However, determining absolute distances in depth is tricky, and humans are far better
at judging TTC than would be predicted on the basis of their ability to judge distance.
There is an alternative source of information in the optic flow that could signal TTC without the need to estimate absolute distances or rates → Tau = The information in the optic flow that could signal TTC without the necessity of estimating either absolute distances or rates. Tau is the ratio of the retinal image size at any moment to the rate at which that image is expanding (tau = θ / (dθ/dt)), and TTC is proportional to tau. The great advantage of using tau to estimate TTC is that it relies solely on information available directly from the retinal image.
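As a sketch, tau can be computed from simulated retinal measurements alone. The object size, starting distance, and speed below are hypothetical values used only to generate the image; the tau computation itself never reads them:

```python
# Tau sketch: read time-to-collision off the retinal image alone. The
# object size S, start distance Z0, and speed v are hypothetical values
# used only to simulate what the retina sees; tau never uses them directly.

S, Z0, v, dt = 0.5, 100.0, 20.0, 0.01  # m, m, m/s, s

def image_size(t):
    """Small-angle retinal size of the approaching object (radians)."""
    return S / (Z0 - v * t)

t = 1.0
theta = image_size(t)
theta_dot = (image_size(t + dt) - image_size(t)) / dt  # rate of expansion
tau = theta / theta_dot            # available directly from the image
true_ttc = (Z0 - v * t) / v        # would require distance and speed

print(round(tau, 2), round(true_ttc, 2))  # tau closely tracks true TTC
```

Note that tau uses only the image size and its rate of change, which is exactly why the observer needs no estimate of absolute distance or speed.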