ECB II – HC 1 – Proteomics
Proteomics is the study of proteins. The proteome is the entire set of proteins that is produced or
modified by an organism. The information gained through proteomics can be helpful in treating
patients with diseases that are caused by a defect in the proteome of a certain organ or the entire
organism, such as the formation of amyloid-beta plaques in the brain causing Alzheimer’s disease.
Topics in this summary: Proteomics & Proteomics in Neurodegenerative Disorders.
Proteomics
Proteomics relies on mass spectrometry (MS). A mass spectrometer is an instrument that can identify
and quantify peptides/proteins, along with carbohydrates, fatty acids and so on (the study of
carbohydrates, fatty acids and other metabolites has its own names: glycomics, lipidomics,
metabolomics etc.). This is done by injecting a molecule (protein/peptide) into the MS. The
molecule undergoes a series of processing steps (explained later) and is eventually detected by a
detector. The output of the detector is a graph that shows the intensity and mass (strictly, the
mass-to-charge ratio, m/z) of the molecule being analyzed. The intensity indicates how many of the
molecules have been detected and the mass indicates the mass detected per molecule. So, if the same
molecule were injected at a different concentration, the intensity would change and the mass would
remain the same. The intensity, which correlates with concentration, helps to quantify the
peptides/proteins, and the mass helps to identify a peptide/protein.
A peptide is a short chain of amino acids, whereas a protein can be cut into many peptides with
specific masses. In proteomics, proteins are almost always digested into peptides with trypsin
before analysis, because proteins are difficult to measure with MS. Protein measurements give an
output with low resolution and poor mass accuracy, whereas peptide measurements are more accurate
and have a higher resolution.
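The digestion step can be imitated on a computer. The sketch below is only an illustration: it assumes the commonly used rule that trypsin cuts after lysine (K) or arginine (R) unless the next residue is proline (P), which is not stated in the lecture, and the protein sequence and the function name `tryptic_digest` are made up.

```python
import re

def tryptic_digest(protein):
    """Split a sequence after every K or R that is not followed by P.

    Assumed cleavage rule (not from the lecture); missed cleavages are ignored.
    """
    # Zero-width split point: a K/R just behind, and no P just ahead.
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

# Hypothetical protein fragment, for illustration only.
peptides = tryptic_digest("MKWVTFISLLRPSAYSRGVFK")
print(peptides)
```

Note that the R in "LLRPS" is not cut, because it is followed by P.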
The mass of each amino acid is known, and the mass of a peptide is a characteristic of that peptide.
However, not much can be said about a peptide from its mass alone. Two peptides can have the same
mass and the same amino acids, but if the amino acids are arranged in a different order the
peptides may take a different shape and have a different function. For high-fidelity
peptide/protein identification, the amino acid sequence needs to be determined.
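A small calculation shows why the mass alone is not enough. This is a minimal sketch that assumes standard monoisotopic residue masses (in Da) and adds one water per peptide; the table `MONO` and the function `peptide_mass` are names chosen here, not from the lecture.

```python
# Monoisotopic residue masses in Da (standard values, assumed here).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O is added for the free N- and C-terminus

def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER

# Same amino acids, different order: the masses are identical.
for seq in ("YSK", "KSY", "SYK"):
    print(seq, round(peptide_mass(seq), 4))
```

All three peptides give the same mass, so only the fragment information from tandem MS can tell them apart.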
See picture. A tandem MS (two MSs together) is required to figure out the amino acid sequence.
The protein is first digested with trypsin into different peptides. In the first MS, the masses
and charges of these peptides are measured. In the transition to the second MS, one of the
peptides is filtered through. This single peptide is then fragmented by energy in the
fragmentation chamber, in such a manner that it breaks only once (fragments are not fragmented
further). The breaks occur at the weakest covalent bonds between the amino acids, the peptide
bonds, which are all of practically the same strength. The result is a variety of randomly
fragmented, charged peptide pieces. The second MS then measures the mass and charge of each of
these pieces. The information from the graphs that tandem MS produces can be used to deduce the
amino acid sequence of a peptide.
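Because the MS measures both mass and charge, the signal of a peptide depends on its charge state. The sketch below assumes that each charge comes from one extra proton (1.00728 Da), which is the usual situation in positive-mode electrospray but is not stated in the lecture; the peptide mass of 1500 Da is an arbitrary example.

```python
PROTON = 1.00728  # mass of a proton in Da

def mz_from_mass(neutral_mass, charge):
    """m/z observed for a peptide carrying `charge` extra protons."""
    return (neutral_mass + charge * PROTON) / charge

def mass_from_mz(mz, charge):
    """Neutral peptide mass recovered from an observed m/z and charge state."""
    return charge * (mz - PROTON)

example_mass = 1500.0  # illustrative value, not from the lecture
for z in (1, 2, 3):
    print(z, round(mz_from_mass(example_mass, z), 4))
```

The same peptide therefore appears at a different position on the mass axis for each charge state, which is why the charge has to be known to get back to the peptide mass.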
During fragmentation, every type of fragment can be made, amino acid by amino acid, from the
N-terminus to the C-terminus and vice versa. Fragments starting from the N-terminus are called
the b-series, and fragments starting from the C-terminus are called the y-series. Since the mass
of each amino acid is known, subtracting the masses of successive fragment ions reveals the
sequence of the peptide (you know how much K and S weigh, so the weight of S-K is also known, and
with the weight of Y known, that of Y-S-K is also known, etc.).
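The Y-S-K example can be worked out in a few lines. This is a sketch under assumptions: singly charged fragments, standard monoisotopic residue masses, and helper names (`b_ions`, `y_ions`, `residue_for`) that are invented here for illustration.

```python
# Monoisotopic residue masses in Da (standard values, assumed here).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def b_ions(seq):
    """Singly charged N-terminal fragments: first 1, 2, ... residues."""
    return [sum(MONO[a] for a in seq[:i]) + PROTON for i in range(1, len(seq))]

def y_ions(seq):
    """Singly charged C-terminal fragments: last 1, 2, ... residues."""
    return [sum(MONO[a] for a in seq[-i:]) + WATER + PROTON
            for i in range(1, len(seq))]

def residue_for(delta, tol=0.01):
    """Residues whose mass matches the gap between two neighbouring peaks."""
    return [a for a, m in MONO.items() if abs(m - delta) <= tol]

y = y_ions("YSK")                  # y1 = K, y2 = S-K
print([round(m, 4) for m in b_ions("YSK")])
print([round(m, 4) for m in y])
print(residue_for(y[1] - y[0]))    # the gap between y2 and y1 is one residue
```

The gap between the two y-ions matches serine, which is how the ladder is read. Leucine and isoleucine have the same mass, so a gap of about 113.084 Da cannot distinguish them.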
See graph. The letters under the graph are an attempt to reconstruct the amino acid sequence. One
color indicates the N-terminus direction and the other color the C-terminus direction (he did not
say which color was which). Sometimes, because of how the fragmentation goes, some fragments are
missing or too few ions of a fragment are detected, which makes it difficult to find the amino
acid sequence. So the more ions that are detected for each fragment, the higher the confidence
with which a certain amino acid sequence can be identified.
Once the fragmentation spectrum, the precursor mass and the charge state are known, a computer
determines the amino acid sequence. It does this by:
• Selecting peptides from a database whose mass equals the measured precursor mass
• Calculating the theoretical fragments of each candidate peptide
• Comparing the theoretical fragments to the acquired spectrum
• Generating a score
• Ranking by score and displaying the best matches
Consequently, the peptide is identified.
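The steps above can be sketched as a toy search. Everything here is a simplified assumption rather than how a real search engine scores spectra: the score is just the number of theoretical b/y fragments found in the spectrum, the four-peptide database is invented, and the names `search`, `fragments` and `peptide_mass` are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, assumed here).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def peptide_mass(seq):
    return sum(MONO[a] for a in seq) + WATER

def fragments(seq):
    """Theoretical singly charged b- and y-ions of a peptide."""
    b = [sum(MONO[a] for a in seq[:i]) + PROTON for i in range(1, len(seq))]
    y = [sum(MONO[a] for a in seq[-i:]) + WATER + PROTON
         for i in range(1, len(seq))]
    return b + y

def search(precursor_mass, spectrum, database, tol=0.02):
    """Return (score, peptide) pairs, best match first."""
    hits = []
    for pep in database:
        # Step 1: keep only peptides with the measured precursor mass.
        if abs(peptide_mass(pep) - precursor_mass) > tol:
            continue
        # Steps 2-4: theoretical fragments, comparison, score.
        score = sum(any(abs(t - o) <= tol for o in spectrum)
                    for t in fragments(pep))
        hits.append((score, pep))
    # Step 5: rank by score.
    return sorted(hits, reverse=True)

database = ["YSK", "KSY", "SYK", "GVFK"]           # made-up database
spectrum = [147.1128, 164.0706, 234.1448, 251.1026]  # made-up peak list
print(search(396.2009, spectrum, database))
```

Three of the database peptides have the same precursor mass, but only YSK explains all four peaks, so it ranks first.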
In today’s proteomics analyses, a sample may contain more than 3,000 proteins. If each protein
generates roughly 30 peptides, 3,000 proteins yield about 90,000 peptides. These 90,000 peptides
cannot be injected into the mass spectrometer in one go, as the mixture would be too complex.
Therefore, the complexity of the sample is often reduced by high-pressure liquid chromatography
(HPLC). This process lasts about 2 hours (each peptide elutes over about 30 seconds). In most
cases the peptides are eluted according to their hydrophobicity. The complexity can also be
reduced by separating the proteins with SDS-PAGE and choosing the protein of interest. The gel
can be cut into several slices, and MS analysis performed on each slice separately, to further
reduce sample complexity.
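The numbers in this paragraph can be combined into a rough estimate of how much HPLC helps. The sketch assumes that peptides are spread evenly over the run, which is a simplification made here and not a claim from the lecture.

```python
proteins = 3000
peptides_per_protein = 30
run_seconds = 2 * 60 * 60      # HPLC run of about 2 hours
elution_seconds = 30           # each peptide elutes over about 30 seconds

total_peptides = proteins * peptides_per_protein
elution_windows = run_seconds // elution_seconds
# Assumption: peptides are evenly distributed over the run.
peptides_per_window = total_peptides / elution_windows

print(total_peptides, elution_windows, peptides_per_window)
```

Under this assumption the MS sees a few hundred peptides at a time instead of 90,000 at once.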
HPLC is a type of liquid chromatography, which means that a solvent is involved. The solvent is
pumped into the system, after which the sample is introduced. The sample is then separated by
hydrophobicity on a separation column. Hydrophobic peptides bind to the column more strongly and
are retained longer, whereas hydrophilic peptides pass through very quickly. This separates the
peptides in time according to their hydrophobicity. A detector detects the peptides as they leave
the column (most hydrophilic first, most hydrophobic last). The time frame between peptides is
about 30 seconds. When the HPLC is linked to an MS, the mass and intensity can be visualized.
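The elution order can be roughly predicted from the sequence. The sketch below uses the average Kyte-Doolittle hydropathy value per residue as a crude stand-in for retention; this scale is an assumption brought in for illustration (real retention-time models are more refined), and the three peptides are invented.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(seq):
    """Average hydropathy per residue, a crude proxy for column retention."""
    return sum(HYDROPATHY[a] for a in seq) / len(seq)

def elution_order(peptides):
    """Most hydrophilic peptide first, most hydrophobic last."""
    return sorted(peptides, key=mean_hydropathy)

# Made-up peptides, for illustration only.
print(elution_order(["LIVFK", "DEKR", "GASK"]))
```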
The peptide sample used in the MS has to be in the gas phase. Proteins are digested into peptides
in solution, so the sample starts out in the liquid phase. To bring the sample into the gas
phase, there are two different types of desorption/ionization techniques: electrospray and
MALDI (Matrix-Assisted Laser Desorption Ionization).
In electrospray, the sample is first separated in solution by HPLC. The solution containing the
peptides then reaches the tip of a thin tube that is held at a very high voltage. The energy of
this high voltage allows the peptides to pass from the liquid phase into the gas phase as a
charged spray. This spray is caught by the MS, and the mass and intensity are measured.
In MALDI, a crystal is made that contains the peptides to be analyzed together with small
molecules (the matrix). The crystal protects the peptides from fragmenting when they are hit by
a laser beam. The crystal sits on the positively charged side of an electric field. Once hit
with a strong laser beam, the material of the crystal becomes positively charged and goes into
the gas phase. These positively charged ions are then attracted to the negative side of the
electric field and consequently caught by the MS. The time the ions take to reach the detector
can be measured and used to distinguish between heavier and lighter ions. This is called
time-of-flight (TOF) MS. All ions are given the same energy to move, so for ions of equal charge
the time of flight differs only with mass.
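The relation between mass and flight time follows from giving every ion the same energy: z·e·U = ½·m·v², so t = d / v = d·√(m / (2·z·e·U)). The sketch below puts numbers on this; the accelerating voltage of 20 kV and the flight tube of 1 m are assumed example values, not figures from the lecture.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge in coulomb
DALTON = 1.66053906660e-27    # 1 Da in kg

def flight_time(mass_da, charge=1, voltage=20_000.0, length_m=1.0):
    """Time (s) for an ion to cross the flight tube after acceleration.

    voltage and length_m are assumed example settings.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return length_m / velocity

for m in (500.0, 1000.0, 2000.0):
    print(m, round(flight_time(m) * 1e6, 2), "microseconds")
```

A four times heavier ion of the same charge takes twice as long to reach the detector, because the flight time scales with the square root of the mass.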
Two other MS techniques are Orbitrap and Q-TOF. In an Orbitrap analyzer, the peptide ions are
moved around in an orbit and the waves they create are captured. From these waves, the mass and
intensity of the peptides can be deduced accurately. Q-TOF is not explained further in the
lecture.
In conclusion, the success of protein analyses depends on the type of mass spectrometer you
have. The mass is determined first, and then the protein is sequenced. These experiments are
hypothesis-free: you do not need a hypothesis, but you do need an aim (why are you doing this?),
namely gathering information that can be used in a functional study.
Proteomics in Neurodegenerative Disorders
The information gained from proteomics can be used to study several diseases that involve
protein dysfunction or loss of function. As an example, we take Alzheimer’s disease (AD). AD is
the most common cause of dementia and leads to memory impairment, disorientation, personality
changes, cognitive decline and complete dependency on other people for even the simplest tasks
in life. A definitive diagnosis of AD is only possible post-mortem and no cure is currently
available.
The underlying cause of AD is not clearly known. However, mutations in APP, PSEN1 and PSEN2 and
the APOE4 allele seem to be predisposing genetic factors. Another major risk factor is
increasing age: in the US, 40% of people over the age of 85 have AD.
Pathological hallmarks of AD are: 1) plaque formation by amyloid-beta 42 proteins, which results
from the accumulation of misfolded amyloid-beta 42 (these act like prions and can misfold other
proteins of the same type: amyloid cascade hypothesis). Chaperone inefficiency, proteasome
inefficiency and endosome dysfunction contribute to amyloid plaque formation. It happens when
gamma-secretase (in which presenilin 1 and 2 are involved) cuts the amyloid precursor protein
(APP) in such a way that amyloid proteins are made that can aggregate by sticking to each other
(amyloidogenic pathway). 2) neurofibrillary tau tangle formation due to hyperphosphorylated tau
proteins. There are two types of AD: early-onset and late-onset (late-onset is also called
sporadic AD). AD is a multifactorial disease, so all the causes have to be considered.
Proteomics may help to understand more about AD. (Other pathology in picture, see later.)
The stages of AD pathology, ranging from minor to severe, are indicated by Braak stages 1 to 6.
AD pathology correlates strongly with the clinical symptoms (see graph). By the time cognitive
impairment occurs, neuronal cells have started dying, and this is irreversible.
Proteomics analysis of the CA1 and subiculum regions of the hippocampus from human post-mortem
brains of AD patients gave more insight into the disease mechanisms and allowed the
identification of potential early biomarkers and drug targets for the diagnosis and treatment
of AD.
Selection of the brains for studying AD is based on two things: 1) The clinical and
neuropathological report: no secondary diseases (e.g. Lewy body dementia), no vessel
abnormalities etc. This is important so that the analyses do not measure the influence of other
diseases. 2) Staining for amyloid-beta plaques, p-tau (phosphorylated tau), astrocytes (GFAP)
and microglia (CR3/43). This is important for determining which Braak stage the AD brain is in.