Proteomics is the study of proteins. The proteome is the entire set of proteins that is produced or modified by an organism. The
information gained through proteomics can be helpful in treating patients with diseases that are caused by a defect in the proteome of a
certain organ or the entire organism, such as the formation of amyloid-beta plaques in the brain causing Alzheimer’s disease.
1. Proteomics:
Proteomics relies on mass spectrometry (MS). A mass spectrometer is a device that can identify and quantify peptides/proteins, along with
carbohydrates, fatty acids and so on (the study of carbohydrates, lipids and other metabolites each has its
own name: glycomics, lipidomics, metabolomics etc.). This is done by injecting a molecule (which can be a protein/peptide) into the
mass spectrometer. The molecule undergoes a series of processing steps (explained later) and is eventually detected by a detector. The
output of the detector is a graph that shows the intensity and mass (strictly, the mass-to-charge ratio, m/z) of the molecule. The peak
intensities on the graph correlate with the peptide concentration (quantification).
- The intensity: indicates how many molecules (peptides/proteins) have been detected by the MS = concentration. → QUANTIFY PROTEIN
- The mass: indicates the mass that has been detected per molecule. → IDENTIFY PROTEIN
So if the same molecule were to be injected at a different concentration, the intensity would change
and the mass would remain the same. The intensity, correlating to concentration, can help quantify the
peptides/proteins and the mass can help identify the peptide/protein.
A peptide is a short chain of amino acids, whereas a protein can be cut into many peptides with specific
masses. In proteomics, proteins are almost always digested into peptides with trypsin before analysis.
Trypsin specifically cleaves the peptide bond between the carboxyl group of arginine or lysine
and the amino group of the adjacent amino acid. These peptides are derived from the protein: identify the
peptides = identify the protein. Peptides are measured instead of intact proteins because proteins are difficult to measure with MS.
Protein measurements give an output with low resolution and poor mass accuracy, whereas peptide measurements are more
accurate and have higher resolution.
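The cleavage rule described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular software does it: the function name `tryptic_digest` and the toy sequence are made up, and the rule is simplified to "cut after every K or R" (real tools usually add exceptions, for example no cut when the next residue is proline).

```python
def tryptic_digest(protein):
    """Return the peptides produced by cutting after every K or R.

    Simplified rule (assumption): every lysine (K) and arginine (R)
    is cleaved, with no exceptions and no missed cleavages.
    """
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            # Cut on the C-terminal side of K/R.
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Whatever is left after the last K/R is the C-terminal peptide.
        peptides.append(protein[start:])
    return peptides


# Hypothetical toy sequence, not a real protein.
print(tryptic_digest("AGKSLRTWYKEF"))
```

Every peptide except the last one ends in K or R, which is why tryptic peptides are easy to recognise.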
The mass of each amino acid is known. The mass of a peptide is a characteristic of the peptide itself. However, not much can be said about a
peptide from its mass alone. Two peptides can have the same mass and the same amino acid composition, but if the
amino acids are arranged in a different order, they are different peptides that may take a different shape and have a different function.
For high-fidelity peptide/protein identification, the amino acid sequence should be determined; the mass alone is not enough.
So DNA → RNA → amino acid sequence → peptide → protein.
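As a small illustration of why mass alone is not enough, the sketch below adds up amino acid masses for two peptides that contain the same amino acids in a different order. The names `MONO`, `WATER` and `peptide_mass` are chosen for this example only; the values are the commonly tabulated monoisotopic residue masses (in Da), rounded to five decimals.

```python
# Monoisotopic residue masses in Da (amino acid minus one water).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added for the free N- and C-terminus


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER


# Same amino acids, different order: the masses are identical.
for seq in ("AYSK", "YASK", "SAYK"):
    print(seq, round(peptide_mass(seq), 4))
```

All three lines print the same mass, so only the fragmentation step described next can tell these peptides apart.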
MS measurement of peptides is excellent, and MS measurement of proteins is possible, but with low resolution and poor mass accuracy.
A tandem MS (two MS stages in series, MS/MS) is required for determining the amino acid sequence of the peptides. Before the
measurement, the protein is digested with trypsin into different peptides. In the first MS, the masses and charges of these different
peptides (from the same protein) are measured. In the transition to the second MS, one of the peptides is filtered through. This single
peptide is then fragmented by energy in the fragmentation chamber. This is done in such a manner that it breaks only once (fragments are
not further fragmented). The breaks occur at the weakest covalent bonds between the amino acids of the peptide, the peptide bonds,
which all have practically the same strength. The result is a collection of charged peptide pieces, broken at random positions along the
peptide. The second MS then measures the mass and charge of each of these fragmented peptide pieces. The information from the graphs
created by the tandem MS can be used to deduce the amino acid sequence of a peptide.
During peptide fragmentation, every type of fragment can be made, from the N-terminus to the C-terminus and vice versa. Fragments
starting from the N-terminus (left) are called the b-series and fragments starting from the C-terminus are called the y-series.
Since the mass of each amino acid is known, the sequence of the peptide can be revealed by subtracting the masses of neighbouring
fragment ions: the mass difference between two consecutive fragments of the same series equals the mass of one amino acid. For
example, the masses of K and S (= amino acids) are known, so the mass of S-K is also known, and with the mass of Y known, the mass of
Y-S-K is also known, etc. See graph: the letters under the graph are an attempt to reconstruct the amino acid sequence. One colour
indicates the N-terminus direction and the other colour indicates the C-terminus direction.
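The subtraction idea can be made concrete with a short sketch. It assumes singly charged fragments, unmodified residues and the usual monoisotopic masses. The function names (`b_ions`, `y_ions`, `read_ladder`) and the example peptide AYSK are made up for illustration, chosen so that the y-series contains the K, S-K and Y-S-K fragments mentioned above.

```python
from itertools import accumulate

# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.010565, 1.007276


def b_ions(seq):
    """m/z of singly charged N-terminal fragments (length 1 to n-1)."""
    return [m + PROTON for m in accumulate(MONO[aa] for aa in seq[:-1])]


def y_ions(seq):
    """m/z of singly charged C-terminal fragments (length 1 to n-1)."""
    sums = accumulate(MONO[aa] for aa in reversed(seq[1:]))
    return [m + WATER + PROTON for m in sums]


def read_ladder(mz_values, tol=0.01):
    """Turn differences between consecutive peaks into amino acids."""
    peaks = sorted(mz_values)
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        diff = high - low
        matches = [aa for aa, m in MONO.items() if abs(m - diff) <= tol]
        # L and I have the same mass, so they are reported together.
        residues.append("/".join(matches) if matches else "?")
    return residues


y = y_ions("AYSK")     # fragments K, SK, YSK
print(read_ladder(y))  # amino acids added going from K to SK to YSK
```

Reading the y-series gives the sequence from the C-terminus backwards, while the b-series gives it from the N-terminus forwards. A missing peak shows up as a mass difference that matches no single amino acid ("?").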
Sometimes, due to how the fragmentation goes, some fragments are missing or are too low in abundance, which makes it
difficult to find the amino acid sequence. So the more fragments are detected for a single peptide, the higher the confidence with which
certain amino acid sequences can be identified. MORE FRAGMENTS = BETTER!
Once you have the fragmentation spectrum and the precursor mass and charge state, a computer will calculate the amino acid sequence. It
does so by:
- Selecting peptides from databases that are equal in mass to the precursor
- Generating theoretical fragment peptides for these candidate peptides
- Comparing the theoretical fragments to the acquired spectrum
- Generating a score
- Ranking the scores and displaying the best matches.
Consequently, the peptide is identified.
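The steps listed above can be sketched as a toy database search. This is only an illustration under simple assumptions: singly charged b- and y-fragments, a score equal to the fraction of theoretical fragments found in the spectrum, and a made-up four-peptide "database". Real search engines use more elaborate scoring; all names here are hypothetical.

```python
from itertools import accumulate

# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.010565, 1.007276


def peptide_mass(seq):
    return sum(MONO[aa] for aa in seq) + WATER


def theoretical_fragments(seq):
    """Singly charged b- and y-fragment m/z values of a peptide."""
    b = [m + PROTON for m in accumulate(MONO[aa] for aa in seq[:-1])]
    y = [m + WATER + PROTON
         for m in accumulate(MONO[aa] for aa in reversed(seq[1:]))]
    return b + y


def score(seq, peaks, tol):
    """Fraction of theoretical fragments that match a measured peak."""
    theo = theoretical_fragments(seq)
    if not theo:
        return 0.0
    hits = sum(1 for t in theo if any(abs(t - p) <= tol for p in peaks))
    return hits / len(theo)


def search(precursor_mass, peaks, database, mass_tol=0.02, frag_tol=0.02):
    # 1. Select database peptides with the same mass as the precursor.
    candidates = [s for s in database
                  if abs(peptide_mass(s) - precursor_mass) <= mass_tol]
    # 2-4. Generate theoretical fragments, compare, and score.
    scored = [(score(s, peaks, frag_tol), s) for s in candidates]
    # 5. Rank the scores, best match first.
    return sorted(scored, reverse=True)


# Hypothetical database; the "measured" spectrum is simulated from AYSK.
DATABASE = ["AYSK", "YASK", "SAYK", "GGGK"]
spectrum = theoretical_fragments("AYSK")
results = search(peptide_mass("AYSK"), spectrum, DATABASE)
for value, peptide in results:
    print(peptide, round(value, 2))
```

Three candidates pass the mass filter because they have the same composition. Only the comparison of the fragments separates the correct sequence from the other two.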
Sample complexity
In today’s proteomics analysis, a sample may consist of more than 3000 proteins. If each protein generates roughly 30 peptides, 3000
proteins generate about 90000 peptides. These 90000 peptides cannot be injected into the mass
spectrometer in one go; this would be too complex. Therefore, the complexity of the sample is often reduced by high pressure liquid
chromatography (HPLC). An HPLC run takes about one to two hours, and each peptide elutes over about 30
seconds. In most cases, the peptides are separated based on their hydrophobicity. The complexity can also be reduced by separating
the proteins with SDS-PAGE and choosing the protein of interest. The SDS-PAGE gel can be cut into several slices, and MS analysis
performed on each slice separately, to further reduce sample complexity. Liquid chromatography tandem mass spectrometry (LC-MS/MS) is
the most commonly used methodology for proteomics analysis.
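A rough back-of-the-envelope calculation with the numbers above shows what chromatography does to the complexity. It assumes a 2-hour run and peptides spread evenly over the run, which is a simplification; the function name is made up for this example.

```python
def peptides_per_elution_window(proteins, peptides_per_protein,
                                run_minutes, peak_seconds):
    """Estimate how many peptides reach the MS at the same time."""
    total_peptides = proteins * peptides_per_protein
    # Number of non-overlapping elution windows that fit in one run.
    windows = (run_minutes * 60) // peak_seconds
    return total_peptides, windows, total_peptides / windows


total, windows, per_window = peptides_per_elution_window(
    proteins=3000, peptides_per_protein=30,
    run_minutes=120,   # assumption: 2-hour HPLC run
    peak_seconds=30,   # elution time of one peptide
)
print(total, windows, per_window)
```

Under these assumptions the mass spectrometer sees a few hundred peptides at a time instead of 90000 at once. Several hundred peptides still arrive together, which is why additional steps such as SDS-PAGE slicing help.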
High pressure liquid chromatography (used to reduce sample complexity)
HPLC is first of all a type of liquid chromatography. This means that a solvent is used. The
solvent is pumped into a system. Afterwards, the sample (containing peptides/proteins) is
introduced into the system as well. This sample is then separated based on hydrophobicity
on a column. Hydrophobic peptides bind to the column material more strongly and are retained
longer, whereas hydrophilic peptides pass through the column quickly towards the detector. This
separates the peptides in time based on hydrophobicity. A detector will detect the different
peptides (most hydrophilic first, most hydrophobic last). The time frame between the
peptides is about 30 seconds. When the HPLC is linked to an MS, the mass and intensity can be
visualized: the peptides are fragmented into peptide fragments and reach the detector, and the
detector sends the information from the peptide fragments to software on the
computer, which converts the information into a chromatogram.
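To get a feeling for the elution order, peptides can be ranked by a simple hydrophobicity estimate. The sketch below uses the Kyte-Doolittle hydropathy scale and takes the average over the peptide. This is only a crude proxy chosen for illustration (real retention times depend on the column, the gradient and the peptide length), and the peptide sequences are made up.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average hydropathy of a peptide (crude hydrophobicity proxy)."""
    return sum(HYDROPATHY[aa] for aa in peptide) / len(peptide)


def predicted_elution_order(peptides):
    """Most hydrophilic first, most hydrophobic last."""
    return sorted(peptides, key=mean_hydropathy)


# Hypothetical tryptic peptides.
print(predicted_elution_order(["LVFIK", "SGDEK", "AGSTK"]))
```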
Mass spectrometer (schematic representation): sample → inlet system (sample goes inside the system) → ion source (ionises the
peptides and brings them into the gas phase) → mass analyser (separates the ions, including the fragmented peptides, by their
mass-to-charge ratio) → detector (registers the ions and sends this information to the computer, which compares the measured fragment
masses to the theoretical fragment masses) → a graph is obtained in which a higher score (higher peak) = closer match between the
theoretical and the measured peptides.
From liquid phase → gas phase: electrospray and MALDI.
Peptides can only be analysed in the MS when they are in the gas
phase. When digesting a protein into peptides, it is done in a solution
that contains trypsin and the sample is in the liquid phase. To make a
gas phase out of the sample, there are 2 types of desorption/ionisation
techniques: electrospray and MALDI (matrix-assisted laser desorption
ionization).
- Electrospray: in electrospray, the peptide sample is first separated in solution by high-pressure liquid chromatography.
This reduces the sample complexity (the peptides move through the column at different speeds based on their hydrophobicity). The
solution with the peptides then reaches the tip of a thin tube. A very high voltage is applied to this tip. The high voltage charges the
liquid, which is sprayed as fine charged droplets; as the solvent evaporates, the charged peptides end up in the gas phase. These
gas-phase ions are then drawn into the MS, where the mass and intensity of the sample are measured.
- MALDI: in MALDI, a crystal is made containing the peptides to analyse and small molecules (the matrix). The matrix protects the
peptides from fragmenting when the crystal is hit by a laser beam. The crystal is in an electric field on the positively charged side. Once
the crystal is hit by the laser beam, the material becomes positively charged and goes into the gas phase (peptide-containing
crystal becomes gas). These positively charged ions are then attracted to the negative side of the electric field and consequently
enter the MS. The time the ions take to reach the detector can be measured and used to distinguish between
heavy and light ions. This is called time-of-flight (TOF) MS. All ions are given the same energy, so the time of flight differs only
with mass (for ions of equal charge): heavy ions (peptides in the gas phase) move more slowly towards the ion detector than
light ions.
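The time-of-flight principle follows from basic physics: an ion with charge z accelerated through a voltage U gains the kinetic energy z·e·U = ½·m·v², so its flight time over a tube of length L is t = L·sqrt(m / (2·z·e·U)). The sketch below evaluates this formula. The 20 kV acceleration voltage and 1 m flight tube are arbitrary illustrative values, not the settings of any particular instrument.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulomb
DALTON = 1.66053906660e-27   # 1 Da in kilogram


def flight_time(mass_da, charge=1, voltage=20000.0, length_m=1.0):
    """Flight time in seconds of an ion in an idealised linear TOF tube."""
    mass_kg = mass_da * DALTON
    # Energy conservation: z * e * U = 0.5 * m * v**2
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return length_m / velocity


for mass in (1000.0, 2000.0, 4000.0):
    print(mass, "Da:", flight_time(mass) * 1e6, "microseconds")
```

Because the flight time scales with the square root of the mass, an ion that is four times heavier takes twice as long to arrive.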
Mass spectrometers (e.g. TripleTOF 5600+ and timsTOF Pro 2) measure the masses of the analytes. There are many types of MS:
- Orbitrap MS: with an Orbitrap, you can accurately deduce the mass and intensity of the peptides by looking at the waves
(oscillation signals) the peptide ions create. In an Orbitrap analyser, the peptide ions move around in an orbit and the waves are recorded.
- Q-TOF
In conclusion, the success of protein analyses depends on the type of mass spectrometer you have. The mass of the peptides is
determined first, and then the peptides are sequenced (their amino acid sequence is determined). These experiments are hypothesis-free:
you don’t need a hypothesis, but it’s important to have an aim: why are you doing this? → gathering information that can be used in a
functional study.
We don’t only look at transcriptomics (the study of RNA transcripts) because post-translational modifications (modifications that occur
on a protein after it has been translated) are not revealed by the mRNA. If we really want to identify a protein, we need
proteomics in order to take these modifications into consideration as well.
2. Proteomics in neurodegenerative disorders
The information gained from proteomics can be used in studying several diseases that involve protein dysfunction or loss of protein
function. As an example, we take Alzheimer’s disease (AD). AD is the most common cause of dementia and leads to memory impairment,
disorientation, personality changes, cognitive decline and complete dependency on other people for even the simple tasks in life. A
definitive diagnosis of AD is only possible post-mortem and no cure is available. The underlying cause of AD is not clearly known. However,
mutations in APP, PSEN1 and PSEN2 and the APOE4 variant seem to be predisposing genetic factors. Remember, if APP remains longer on the
endosome, it will interact with the BACE protein (β-secretase) on the endosome, and this initiates the formation of amyloid-beta (Aβ)
plaques: β-secretase cleaves APP as the first step in producing amyloid-beta, which then accumulates into plaques. Another major
risk factor is increasing age: 40% of the people beyond the age of 85 in the US have AD. The APOE4 gene variant is the strongest
genetic risk factor for AD.
Pathological hallmarks of AD are:
- 1. Plaque formation by amyloid-beta 42 proteins, which is the result of the accumulation of misfolded amyloid-beta 42 proteins
(they act like prions and can misfold other proteins of the same type: amyloid cascade hypothesis).
Chaperone inefficiency, proteasome inefficiency and endosome dysfunction contribute to amyloid plaque formation. The way it
happens is that gamma-secretase cuts APP (the amyloid precursor protein) in such a way (presenilin 1 and 2 are involved too) that
amyloid proteins are made that can aggregate by sticking to each other (amyloid pathway).
- 2. Neurofibrillary tau tangle formation due to hyperphosphorylated tau proteins.
There are different types of AD pathology:
One finding was that AD is also partially caused by aggregation of amyloid-beta proteins around vessels in and around the brain. This
would lead to insufficient nutrient transfer to the brain. This is called cerebral amyloid angiopathy (CAA).
The researchers wanted to find proteins that are highly expressed in CAA and that are perhaps involved in pathological stages of AD (so
CAA-specific proteins). A protein, MTP, was found to be highly expressed around blood vessels where accumulation/aggregation of
amyloid-beta was seen (CAA). In this way, markers can be found that may prove useful in diagnosis and treatment.
Some neurons show granulovacuolar degeneration (GVD). In principle, nobody knows what has happened to these neurons, but one
hypothesis is that these neurons are a precursor stage of cell death, which leads to neurodegeneration.
The neurofibrillary tangles (insoluble, twisted fibers) are found inside the brain’s cells (intracellular) and can accumulate in these
brain cells (neurons). These tangles consist of phosphorylated tau (protein), and the accumulation of these tangles can lead to
neuronal death → neurodegeneration.
There are two types of AD: early-onset and late-onset (late-onset is also called sporadic AD). AD is a multifactorial disease, so you have to
look at all the causes. Proteomics may help understand more about AD. Types of AD:
- Early-onset (familial) AD: young onset at 50-60 years, caused by mutations in the APP, PSEN1 and PSEN2 genes.
- Late-onset (sporadic) AD: 30-40% occurrence in the 80+ population; more than 85% of AD patients are late-onset.
The different stages of AD pathology, ranging from minor to severe, are indicated by Braak stages 0-6. As you
can see, clinical diagnosis often happens between stage 4 and 5, which is too late. From stage 2 onwards,
cognitive impairment starts to increase, but the accumulation of neurofibrillary tangles inside the brain cells already
starts at stage 0. AD pathology correlates highly with the clinical symptoms (see graph). When cognitive impairment
occurs, the neuronal cells have started dying and this is irreversible. The Aβ oligomer can induce cellular changes
which can lead to neuronal cell death/synapse loss and therefore cognitive impairment. BUT AD IS A MULTIFACTORIAL
DISEASE; it’s not only caused by Aβ oligomers.
Study:
Proteomics analysis of CA1 and Subiculum regions of the hippocampus from human
post-mortem brains of AD patients revealed more insight into the disease mechanisms
and allowed the identification of potential early biomarkers and drug targets for
diagnosis and treatment of AD.
We receive all the post-mortem brains from the Netherlands Brain Bank. Selection of the brains for studying AD is
based on 2 things:
- Clinical and neuropathological report: there should be no secondary diseases, no vascular abnormalities
(infarcts) etc. This is important to avoid measuring the influence of other diseases in the AD analyses.
- Staining for: amyloid-beta plaques, p-tau (phosphorylated tau), astrocytes (GFAP) and microglia
(CR3/43). This is important to determine which Braak stage the AD brain is in.
Of note, the same tissue region that is selected for the AD study should be used in every type of experiment. If you choose a
different part of the brain, or even a different part of the hippocampus, for one experiment than for another, then you will have
performed a really bad experiment: every part (even every cell) expresses different proteins. In order to get the same tissue for the
various experiments, you use laser capture microdissection (LCM). This allows specific parts of the tissue to be cut out. By using
LCM, we can isolate the CA1 and subiculum regions from the post-mortem hippocampal tissue and thereby use the same tissue for all
the experiments.
SDS-PAGE is used to separate the protein of interest from all the other proteins (reducing sample complexity); the protein is then
cleaved into peptides (by trypsin). The peptides are separated by liquid chromatography and then brought from the liquid phase into
the gas phase via electrospray or MALDI, after which they enter the mass spectrometer (liquid chromatography mass spectrometry).
The data are generated as follows: the complexity of the protein sample is
reduced with SDS-PAGE, the proteins are digested with trypsin, and the peptides are analysed by MS.
It is also important to make sure that the variation between AD samples
from the same Braak stage is small (see picture with boxplot). As a test, the intensity of p-tau and GFAP should go up when the Braak
stage becomes higher. In other words, phosphorylated tau protein and GFAP (astrocytes) increase with Braak stage. In the control, you
will not see much of these markers, as it should be.
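The two checks described here (small variation within a Braak stage, and p-tau rising with Braak stage) could be sketched as follows. The intensity values are invented for illustration and are not data from the study. The 20% threshold for the coefficient of variation is an arbitrary example cut-off, not a value from the source.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Spread of replicate measurements relative to their mean."""
    return stdev(values) / mean(values)


def rises_with_stage(intensities_by_stage):
    """True if the median intensity never drops from one stage to the next."""
    stages = sorted(intensities_by_stage)
    medians = [median(intensities_by_stage[s]) for s in stages]
    return all(a <= b for a, b in zip(medians, medians[1:]))


# Hypothetical p-tau intensities (arbitrary units) per Braak stage.
P_TAU = {
    0: [1.0, 1.2, 0.9],
    2: [2.1, 1.9, 2.3],
    4: [4.8, 5.1, 5.0],
    6: [9.7, 10.2, 10.1],
}

for stage, values in P_TAU.items():
    cv = coefficient_of_variation(values)
    print("Braak", stage, "CV:", round(cv, 3), "OK" if cv < 0.2 else "check")
print("p-tau rises with stage:", rises_with_stage(P_TAU))
```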
Cluster analysis will also be conducted (see picture below with green and
red). Such an analysis looks at the level of expression of certain proteins.
Green: low intensity. Red: high intensity. This way, the protein expression per
Braak stage can be measured and visualized.
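The green/red colouring of such a picture is usually based on scaled intensities. The sketch below shows only that scaling and colour-mapping step, not a full clustering algorithm. Each protein's intensities are converted to z-scores across the Braak stages, and values below the protein's own average are called green and values above it red. The protein names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Scale one protein's intensities to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)  # no variation across stages
    return [(v - mu) / sigma for v in values]


def colour(z):
    """Map a z-score to the heat-map colour convention used above."""
    if z > 0:
        return "red"     # higher than this protein's average
    if z < 0:
        return "green"   # lower than this protein's average
    return "black"       # exactly average


# Hypothetical intensities at four increasing Braak stages.
INTENSITIES = {
    "p-tau": [1.0, 2.0, 4.0, 8.0],
    "GFAP": [2.0, 2.5, 4.0, 6.0],
    "protein X": [8.0, 6.0, 3.0, 1.0],
}

heatmap = {name: [colour(z) for z in z_scores(values)]
           for name, values in INTENSITIES.items()}
for name, row in heatmap.items():
    print(name, row)
```

Proteins with similar rows (here p-tau and GFAP) would end up next to each other in a cluster analysis, while proteins with the opposite trend form a separate group.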
Some proteins are synthesized more and others less as a result of AD. Further
research can be conducted to find out why. Some findings are: expression
changes in microtubule-associated proteins; in Braak stages 1 and 2 the proteins
involved in synaptic vesicle release go up and then down; for the ionotropic glutamate receptors, all go down except one, which goes up a lot.
By analysing these proteins, we can make hypotheses about AD and further research can be done. We can also see in which cells which
protein changes take place.