From Perception to Consciousness
Hi!
Thank you for buying my notes! Please keep in mind that, while these notes are
extensive, I do recommend you to watch the lectures for yourself and only keep this
document as extra learning material.
Good luck!
Contents
Lecture 9: Attention
Lecture 10: Attention
Lecture 11: States of Consciousness
Lecture 12: Conscious Control
Lecture 13: Free Will
Lecture 14: To See or Not To See
Lecture 15: Theories of Consciousness
Lecture 16: What is it like to be a fly?
Lecture 9: Attention
Monday the 1st of March – Victor Lamme
You cannot translate every sensory input into a motor output. Some sort of selection has
to be made there. That is the prime function of attention. The processing of sensory
input goes from low to high level sensory cortex, to extract ever more meaningful
features from the environment. These are then translated to motor outputs in a reverse
hierarchy, going from abstract motor commands to the activation of – eventually –
muscles that in turn will influence the environment.
In the Dichotic Listening experiment, participants hear two different auditory streams in
the two ears. They can only reproduce (and remember) one stream at a time: the
stream they attend to. There is a limited capacity in going from perception to action. It is
not possible to attend to and remember both streams. This, again, points to the fact that
the core function of attention is selection.
In the visual domain, we see something similar. Here, there are two types of attention:
overt attention, which is moving your eyes towards something that you want to see, and
covert attention, which is moving your attention towards something without any
external, overt signs, while maintaining fixation on another point. Covert attention also
exists in the auditory domain, as illustrated by the cocktail party effect: you may
pretend to listen to the person you are talking to while actually focusing on what is
said in another conversation. It can also happen that you are listening to the person in
front of you, but your attention is suddenly grabbed by someone else mentioning your
name. This is called attentional capture.
Different types of attention
There are multiple types of attention.
Top-down attention, or voluntary attention. This is when you instruct a subject
to focus their attention on some location of the visual field. Reaction times to
targets presented at the attended location are shorter than when attention is
focused elsewhere.
Bottom-up attention, or capture. Here, a stimulus suddenly appears and
automatically captures attention, even though the subjects are instructed to ignore
it. Reaction times are shorter at the primed location, even when the subjects know
the cue is mostly invalid. Attentional capture can also turn into 'inhibition of
return'. With the same prime and targets, but an interval of at least 300 ms
between them, you see the opposite effect: the valid trial is slower than the
invalid trial. The prime appears and you focus your attention there, but you realize
that you shouldn't, so you withdraw your attention from the primed location (within
300 ms). After 300 ms, this withdrawal leaves a sort of negative attention for that
location: the location is inhibited because you have returned from it to your
original point of focus.
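The crossover between capture and inhibition of return can be sketched as a toy reaction-time model. All numbers below are illustrative assumptions, not data from the lecture; only the qualitative pattern (valid faster at short intervals, valid slower beyond roughly 300 ms) comes from the text:

```python
def cued_rt(valid, cue_target_interval_ms):
    """Toy RT model for an exogenous (nonpredictive) cueing experiment.

    At short cue-target intervals the cued (valid) location is
    facilitated; beyond ~300 ms the effect flips (inhibition of
    return) and the valid location becomes slower.
    All values are illustrative milliseconds, not measured data.
    """
    base = 350  # assumed baseline RT for an uncued target
    if cue_target_interval_ms < 300:
        # attentional capture: valid location is faster
        return base - 30 if valid else base + 10
    # inhibition of return: valid location is now slower
    return base + 25 if valid else base - 5

# Short interval: valid < invalid (capture)
print(cued_rt(True, 100), cued_rt(False, 100))   # 320 360
# Long interval: valid > invalid (inhibition of return)
print(cued_rt(True, 500), cued_rt(False, 500))   # 375 345
```

The point of the sketch is only the sign flip of the validity effect around the 300 ms interval mentioned in the lecture.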
Object based attention. Here, you focus on an object instead of a location. You
can see this on the moving image on slide 13 (19:49 in the lecture). You see the
stationary house and the moving face, which have the same location in space but
you can either focus your attention on the face (which is easier because it’s
moving), or you can ignore the face and focus your attention on the house. Other
experiments show more elegantly that there is something like object-based
attention. Here (a), you have two objects and people are given a cue on location C.
When you then present a target on location S or D, where will they turn their
attention faster? If you present a cue on C, people are fastest
when the target appears on C, but they are also faster when the target appears
on S than on D, because it’s on the same object. It is even more nicely
demonstrated in (b). If you present a cue on C, the target on S will still be
detected faster than on D.
In another experiment, they also presented two objects: two white bars over a
textured background. You can also see them as two slits or holes. You can make
a subject see them either as two separate objects, or as one background. If
people perceive them as holes (one background), there is no difference in reaction
time between invalid-within and invalid-between trials. However, if they see the bars
as two separate objects, the invalid-within target is recognized faster than the
invalid-between target.
Feature based attention. We also have feature-based attention, which means
that the reaction time is shorter to (objects with) features that are attended to.
The effect of attention on processing
The effect of attention is always: bigger responses. The amplitude of the visual evoked
potentials recorded from the human scalp increases when the participants focus on the
cue where the target appears. This increase is seen in the P1 component: the EEG
response is bigger.
A possibility is that attention could generate some sort of general increase of
responses, no matter where, but that is not the case: responses are enhanced only at
the attended locations. In that sense, the effect is location-specific. The same holds
for attentional capture. If you have such a nonpredictive prime, you get the same
effect: if the prime happens to be at the same location as the upcoming target, the
response will be stronger; if it happens to be at a different location, the response
will be weaker. The same thing happens for inhibition of return. These effects of
capture (and inhibition of return) again show spatial specificity, in that they occur
only at the cortical locations that attention is directed to (or drawn away from).
It is a bit more complicated if you look at the effect of attention on the responses of single
neurons. This was demonstrated in an experiment where the blue stimulus was the
preferred stimulus of the neuron, and the white stimulus was not preferred (slide 23). In
the monkey area V4, Desimone found that if you have a receptive field of a V4 neuron,
and the blue (preferred) stimulus was attended, then the response would be larger. If
the white (ineffective) stimulus was attended, the response would be weaker.
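This V4 finding is often summarized in a biased-competition style sketch: with two stimuli in one receptive field, the neuron's response is pulled toward the attended stimulus. The weighting below is a toy assumption for illustration, not Desimone's actual data or model equations:

```python
def v4_response(drive_attended, drive_unattended, attention_weight=0.8):
    """Toy biased-competition sketch: with two stimuli inside the
    receptive field, the response is a weighted average biased
    toward the attended stimulus.

    drive values: how well each stimulus matches the neuron's
    preference (1.0 = preferred, 0.2 = ineffective).
    The 0.8 attention weight is an illustrative assumption.
    """
    return (attention_weight * drive_attended
            + (1 - attention_weight) * drive_unattended)

preferred, ineffective = 1.0, 0.2
# Attending the preferred (blue) stimulus -> larger response
print(round(v4_response(preferred, ineffective), 2))   # 0.84
# Attending the ineffective (white) stimulus -> weaker response
print(round(v4_response(ineffective, preferred), 2))   # 0.36
```

Both stimuli stay physically identical; only which one is attended changes, and that alone moves the response up or down, as in the experiment.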
Feature-based attention is different, in the sense that it is not location-specific. It’s more
of an effect that occurs all over the visual field. You also see this in terms of
neurophysiology. In this experiment (slide 24), the monkey is fixating on point +, and
on the left side, the monkey is seeing transparent motion: some of the dots move
upwards, some move downwards. The monkey is instructed to either focus on the dots
moving down or the dots moving up: feature-based attention. On the right side, there is
a recording of a receptive field in area MT. If the neuron prefers upward motion, you
will see that the response is stronger when the monkey is focusing on things moving
upwards than when it is focusing on things moving downwards.
Slide 25 presents another experiment on feature-based attention, but now in area V4.
The monkey has to fixate on the dot in the middle, and there is a recording in the circle.
Then, the monkey is shown three white bars and three black bars. Because the fixation
spot was white, in the next stimulus, the monkey has to make a saccadic eye movement
to the white stimulus. On the right, it is the opposite (a black fixation spot, and a black
stimulus). This is feature-based attention. If what is happening in the receptive field is
the same as where the attention is, you'll see a stronger response than when there is a
non-match.
In object-based attention, we have the same story. Experiments have been done
regarding the FFA and the PPA. Here, the participant is shown the same image with the
stationary house and the moving face. Attending to faces enhances activity in the FFA,
while attention to houses enhances activity in the PPA.
Another experiment focuses on the monkey area V4. Here, there is a stimulus that is
going to be attended and a stimulus that is not going to be attended. The recording is
from two separate neurons. One stimulus is stimulating two receptive fields, and will
activate two neurons. When the monkey attends that stimulus, the response is larger
than when he attends the other stimulus. If you look at the amount of synchrony
between the action potentials of the two neurons, you see that the synchrony is of an
oscillatory nature: apparently these neurons fire their action potentials at an
oscillatory frequency. The synchrony is larger in the attended case than in the
unattended case.
Feature Integration Theory
Synchrony is a potential label for assembly coding, as we have seen before: it is
thought to play a role in perceptual grouping and binding. We have also observed late
modulations, which reflect figure-ground segregation. Synchrony and increased
activation have thus both been associated with what we call perceptual binding. This is
relevant to a theory about what attention does: the Feature Integration Theory, which,
as the name says, is about the integration of features.
We can see this in the Where’s Waldo picture. Waldo is uniquely defined by a
combination of different features. We have to make use of conjunction search, which is a
form of visual search. This is demonstrated on slide 32: the question is whether there is
a red T in the display. Here, it is easy: there are not a lot of distractors, and so you will
be quite fast.
A typical finding is that a feature search is equally fast regardless of the number of
distractors. This is also called pop-out search: the object immediately pops out.
Reaction time is flat as a function of display size: a zero slope. In a conjunction
search, the more distractors there are, the longer the search takes: a non-zero slope.
Conjunction targets are found more slowly when there are more items, because the
targets are defined by multiple features (both shape and color, for example), which
requires the binding of features. You have to search in a serial way instead of a
parallel way. This is the central point of the Feature Integration Theory (FIT).
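The zero-slope versus non-zero-slope pattern can be written down as a minimal RT model. The intercept and the ~40 ms-per-item slope below are illustrative assumptions (the lecture only states flat vs. increasing search functions):

```python
def search_rt(n_items, search_type):
    """Toy visual-search RT model (illustrative numbers, in ms).

    Feature search: the target pops out, so RT does not depend on
    the number of items (zero slope; parallel search).
    Conjunction search: items must be inspected one by one to bind
    their features, so RT grows with display size (non-zero slope;
    serial search).
    """
    intercept = 400  # assumed base RT in ms
    if search_type == "feature":
        return intercept                 # flat: parallel, zero slope
    if search_type == "conjunction":
        return intercept + 40 * n_items  # serial: ~40 ms per item (assumed)
    raise ValueError(f"unknown search type: {search_type}")

for n in (4, 8, 16):
    print(n, search_rt(n, "feature"), search_rt(n, "conjunction"))
# 4 400 560
# 8 400 720
# 16 400 1040
```

Plotting RT against display size would give the classic pair of search functions: a horizontal line for feature search and a rising line for conjunction search.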
FIT states that there are different features, which are detected in parallel across the
visual field. To know which color goes with which orientation, you have to bind the two
features together, and for that you have to attend at that location. The attending gives
you the binding (see slide 36). This has parallels with the notion of assembly coding that
has been discussed, in that they both propose some sort of mechanism to integrate
features into an object representation. The key thing of FIT is that the mechanism of
attention is the glue that holds the assembly together.
Why does enhanced sensory activity (or enhanced synchrony) lead to a faster
response? This is formalized in a horse-race model. Stronger responses reach a
particular level of activation sooner than weaker responses. Because this level of
activation is a threshold for making a movement, you can understand why the stronger
response gives a faster reaction.
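The horse-race idea can be sketched as a simple accumulator that builds activation toward a movement threshold. The rates and the threshold value are illustrative assumptions; the lecture only gives the qualitative claim that stronger responses reach threshold sooner:

```python
def time_to_threshold(rate, threshold=100.0):
    """Toy horse-race/accumulator sketch: activation builds up at a
    constant rate per millisecond, and a movement is triggered once
    it crosses the threshold. A stronger (attended) response has a
    higher rate and therefore wins the race."""
    t, activation = 0, 0.0
    while activation < threshold:
        activation += rate
        t += 1
    return t  # ms needed to reach the movement threshold

# An attended stimulus drives a stronger response (higher rate)...
print(time_to_threshold(rate=2.0))   # 50
# ...than an unattended one, so it reaches threshold sooner.
print(time_to_threshold(rate=1.25))  # 80
```

The stronger response crosses the same threshold earlier, which is exactly why an attentional boost in firing rate translates into a shorter reaction time.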