Concepts of protein technology
and applications: partim XVO
Introduction
I. Definition of proteomics
Proteomics = determination of the complete set of proteins that is present in a system under
specific circumstances
- System: protein complex, subcellular compartment, cell, tissue (e.g. blood, saliva…),
organism…
- Circumstances: treatment (e.g. chemotherapy), time after treatment (e.g. different effects
after 1h and after 6h), condition of the cell (e.g. age, normal, infected, tumor…)…
II. Why proteomics?
Different reasons:
1. Compared to genomics, proteomics is ‘the real thing’
There is no direct correlation between the order of complexity of an organism and the
number of genes in the genome of that organism (genomics can be misleading!)
↔ DNA, RNA, proteins (‘workhorses’ of the organism), metabolites and their interactions
determine complexity
→ ‘Proteogenomics’: since it is difficult to predict the exact sequence, structure and
modification state of the final protein product from its gene, verification of gene
products by proteomic analysis is still necessary
2. mRNA vs. protein profiling
Microarrays are insufficient to measure protein expression since there is no direct
correlation between amount of mRNA and protein expression (transcriptomics can be
misleading!)
3. Multiple proteins (6 - 8) per gene
As a result of:
- Posttranslational modifications (PTMs) e.g. methylation, glycosylation, ubiquitination,
sumoylation (involved in signaling processes)
- Alternative splicing into isoforms
→ Proteoforms: any of several different forms of the same protein, arising from
PTMs, alternative splicing or single nucleotide polymorphisms (SNPs)
4. Protein interaction networks
Most cellular processes are regulated by protein complexes instead of individual proteins
BUT the number of components per complex does not increase drastically in organisms
with a higher order of complexity, e.g. 78% of yeast proteins are involved in complexes
→ Functional proteomics = definition of a protein as an element in an interaction network
(‘contextual function’), rather than ascribing it to one function
5. Cellular localization
- Depending on the biological state of the cell, a protein can be localized in one or several
cellular locations (nucleus, cytosol, plasma membrane, mitochondria, ER…)
- A protein can have different binding partners in different locations
THUS one protein can have several functions, depending on its localization in the cell
We need proteomics to achieve a higher level of contextual information e.g. not only ‘Is the
SNP causing a lower expression of my protein of interest?’, but also ‘What is the effect of this
lower expression on cellular processes like phosphorylation?’
III. Proteomics as part of systems biology
In order to understand the dynamic complexity of an organism, an integrated picture of all
aspects of proteins needs to be developed:
• mRNA and protein profiles and how these change over time e.g. during development or
changing (pathological) conditions
• Knowledge of the state and properties of all proteins:
o PTMs
o Cellular localization
o Binding of ‘metabolomic’ ligands e.g. haem ring, metal ions, glucose, ATP, ADP,
GTP, GDP…
o Alternative splicing
o Proteolytic degradation (synthesis, localization and activity status of a protease are
regulating factors)
o Oligomeric state and participation in complexes
o Structure, conformation and allosteric mechanisms
• All protein-protein interactions in space and time in one cell
BUT so far only the average of all possible states is measured
→ Systems biology = proteomic + genomic + metabolomic data in space and time
IV. The different faces of proteomics
Proteomics sensu stricto = large-scale identification and characterization of proteins, including
their PTMs
Differential proteomics = large-scale comparison of protein expression levels
→ ‘Shotgun’ proteomics = proteomics sensu stricto + differential proteomics
Cell-mapping proteomics = protein-protein interaction studies
V. Identification of proteins: principles
- PAGE: polyacrylamide gel electrophoresis
- HPLC: high-performance (formerly high-pressure) liquid chromatography
- GO: gene ontology
V.1. Sample preparation
Good sample preparation is needed to obtain good results: ‘crap comes in, crap comes out’
Different steps:
1. Break up tissues or cells
2. Extract protein fraction
3. Modification (e.g. denaturation, reduction…) of proteins for further analysis (depending
on the subsequent methods for separation/purification and identification)
Important variables in sample preparation that determine the success of
separation/purification and identification:
• Method of cell lysis; type of detergent
• pH
• Temperature
• Addition of protease inhibitors to prevent proteolytic degradation
o Proteases hydrolyse (specific) peptide bonds:
- Trypsin: cleaves C-terminal to Lys and Arg residues
- Chymotrypsin: cleaves C-terminal to large hydrophobic residues (Tyr, Phe, Trp)
o Digestion of proteins by proteases (before or after protein separation) is necessary
for mass spectrometry (MS)
In many cases, sample preparation is a matter of trial and error!
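The cleavage rules listed above can be illustrated with a small in silico digestion. This is a minimal sketch only: the function name `digest`, the two residue sets and the toy sequence are hypothetical, and the sketch assumes complete cleavage after every listed residue (it ignores missed cleavages and exceptions such as a following proline).

```python
def digest(sequence, cleave_after):
    """Split a protein sequence C-terminal to every residue in cleave_after."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            # The cleaved residue stays at the C-terminus of the peptide
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Remaining C-terminal stretch without a cleavage site
        peptides.append(sequence[start:])
    return peptides

# Residue sets taken from the rules in the notes (one-letter codes)
TRYPSIN = set("KR")        # Lys, Arg
CHYMOTRYPSIN = set("YFW")  # Tyr, Phe, Trp

toy = "MAKWLLRFGYK"  # made-up sequence, for illustration only
print(digest(toy, TRYPSIN))       # ['MAK', 'WLLR', 'FGYK']
print(digest(toy, CHYMOTRYPSIN))  # ['MAKW', 'LLRF', 'GY', 'K']
```

The same protein yields different peptide sets with different proteases, which is why the choice of protease matters for the subsequent MS analysis.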
V.2. Separation of proteins/peptides
Protein enrichment (concentration of proteins) of the sample is needed
- The dynamic range of proteins in a cell is very wide: between 10 and 1 000 000 copies/cell; e.g. the
concentrations of plasma proteins differ by 11 orders of magnitude
- Low-abundance proteins are extremely difficult to detect: the black curve only reaches the detection
limit of silver staining when a huge number of cells is analyzed
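To relate the copy numbers quoted above to the 'orders of magnitude' used for plasma, the range can be expressed as a base-10 logarithm. This is a small sketch; the function name `orders_of_magnitude` is hypothetical.

```python
import math

def orders_of_magnitude(highest, lowest):
    """Dynamic range as the base-10 logarithm of highest/lowest abundance."""
    return math.log10(highest / lowest)

# Copy numbers per cell quoted in the notes: 1 000 000 down to 10
cell_range = orders_of_magnitude(1_000_000, 10)
print(f"cellular proteins: about {cell_range:.0f} orders of magnitude")
```

On this scale the cellular range corresponds to about 5 orders of magnitude, compared with the 11 orders quoted for plasma proteins.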
V.2.1. Increasing separation capacity (peak capacity)
Use sequential separation techniques (‘dimensions’), each based on a different physicochemical
characteristic of the proteins or peptides (= orthogonal separation). The separation capacities
(peak capacities) of the individual dimensions multiply, which increases the total separation
capacity
Total peak capacity = CS1 × CS2 × … × CS(n-1) × CSn
with CSn = peak capacity of chromatographic system n
E.g. separation in the 1st dimension by color with CS1 = 100, combined with separation in the
2nd dimension by calorie count with CS2 = 250 → total peak capacity = 100 × 250 = 25 000
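The multiplication rule can be checked with a short sketch. The function name `total_peak_capacity` is hypothetical, and the product should be read as an idealized upper bound that assumes the dimensions are fully orthogonal.

```python
import math

def total_peak_capacity(capacities):
    """Multiply the peak capacities of the individual separation dimensions."""
    return math.prod(capacities)

# Example from the notes: CS1 = 100 (1st dimension), CS2 = 250 (2nd dimension)
print(total_peak_capacity([100, 250]))  # 25000
```

Adding a third dimension simply means appending its peak capacity to the list.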