APPLIED COMPUTATIONAL AND SYSTEMS BIOLOGY: Structural computational Biology
Lecture 1: Protein architecture/structure of proteins
1. Introduction
• Protein architecture = changing of protein structure
• Central dogma = from digital information (DNA sequences) => to analog information (folded proteins)
• Proteins are extremely diverse: in size, shape, function + more diverse than genes because of post-translational modification
2. Amino Acids
• 20 ‘natural’ Amino acids
o The side chains, R, determine the differences in the structural and chemical properties
o Many ways to group AA’s into groups of AA with similar properties
▪ The 20 amino acids can be classified as follows: aliphatic/hydrophobic, polar, alcoholic,
sulfur-containing, aromatic, charged, special
• Aliphatic: only connected carbons in the sidechain
▪ Several amino acids belong in more than one category
• Polar: charged (pos, neg), non charged
• Hydrophobic: aromatic, aliphatic
• Small: tiny
3. Connectivity and secondary structure
• Proteins = linear polymers of amino acids (polypeptides):
o Peptide bond connects AA’s via condensation/dehydration reaction (-H2O)
o In this polypeptide backbone: amino acids are restricted
▪ Dihedral angles = describe the degrees of freedom of the backbone
▪ Overall fold is determined by 3 angles (phi, psi, omega):
• => these 3 angles determine where the chain goes
• => sidechain does not determine where chain goes
• Omega: rotation around the peptide bond is very difficult (not much degree of freedom); it can be 0° (cis) or 180° (trans), both planar => E.g. proline can have both orientations, 0° & 180°: proline's side chain reconnects with the backbone => restriction
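To make the notion of a dihedral angle concrete, here is a minimal sketch that computes the signed dihedral defined by four points (for phi, psi or omega these would be four consecutive backbone atoms). The function name and the toy coordinates are assumptions for illustration, not part of the lecture.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) around the p1-p2 bond for four 3D points."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1 = sub(p1, p0)          # first bond vector
    b2 = sub(p2, p1)          # central bond (rotation axis)
    b3 = sub(p3, p2)          # last bond vector
    n1 = cross(b1, b2)        # normal of the plane p0-p1-p2
    n2 = cross(b2, b3)        # normal of the plane p1-p2-p3
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (hypothetical): the central bond runs along the z axis.
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))  # magnitude 180
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))     # 0
```

A trans peptide bond corresponds to an omega of magnitude 180°, a cis bond to 0°; the same function applied to other atom quadruplets gives phi and psi.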
• Rotational prohibition
o = comes from sidechain & backbone bumping into each other as you make rotations around the 2
angles you can rotate (psi, phi)
o Ramachandran map : Phi vs psi angle map
▪ Blue = region accessible to most AA, where there is no rotational prohibition
▪ Grey = large non-accessible regions, where there is rotational prohibition
▪ => when rotating, only a restricted set of configurations is possible
• E.g. B-sheet region, alpha-helical region of AA configurations
▪ => AA’s differ in Ramachandran maps
• E.g. differences for Arg (standard AA), Gly (lacks a side chain, so can reach parts of the plot other residues cannot reach), Pro (fewer degrees of freedom; has restraints on movement of main chain/backbone, so has a restricted range of phi values)
• Hbonding (large EN): order & directionality
o 1) Peptide backbone has nitrogen groups as H donors & carbonyl groups as H acceptors
o 2) Side chains can act as H donors & H acceptors
o => this gives a high degree of order & directionality: you need the donor lined up with the acceptor, which gives a specific structure (the Hbond thus also restricts the peptide conformation)
• Hbonding: specific conformation
o The carbonyl group lies in a plane: the nitrogen H donor should approach the carbonyl oxygen H acceptor at a certain angle & distance => to allow formation of an Hbond
o Consequence: the Hbond keeps the N-H donor & O acceptor in a specific orientation = basis of protein secondary structure
• Levels of protein structure
o Primary
▪ Linear AA sequence
▪ Covalent bonds
o Secondary
▪ Local structure; certain “motifs” are common (alpha helix, beta sheet)
▪ Mostly H-bonds
o Tertiary
▪ Complete 3D shape
▪ H-bonds, hydrophobic interactions, ionic bonds, van der Waals interactions, disulfide bonds
o Quaternary
▪ >1 peptide chain
▪ Mostly H-bonds, but also other interaction types
3.1 Alpha helix
• Alpha helix
o (30-35% of AA)
o R. map: phi = -57°, psi = -47°
o Discovered by Pauling: 1951
o α-helix formers: A,C,L,M,E,Q,H,K (AA that favor & stabilize alpha helix)
o Tightly wound, repeating sequence
o “Right-handed”
o Each twist ∼ 5.4 Å; 3.6 residues
o Average length = 18 residues (5 turns)
o R-groups are on outside of helix
o Stabilized by H-bonds between C=O (residue i) and N-H (residue i + 4)
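The helix numbers above can be turned into a small worked calculation. This is a sketch using only the values in these notes (5.4 Å and 3.6 residues per turn, H-bonds from residue i to i + 4); the function names are made up for illustration.

```python
PITCH_ANGSTROM = 5.4      # rise per helical turn, from the notes
RESIDUES_PER_TURN = 3.6   # residues per helical turn, from the notes

def rise_per_residue():
    """Axial rise per residue in angstrom (about 1.5)."""
    return PITCH_ANGSTROM / RESIDUES_PER_TURN

def rotation_per_residue():
    """Rotation around the helix axis per residue in degrees (about 100)."""
    return 360.0 / RESIDUES_PER_TURN

def helix_length(n_residues):
    """Approximate length of an ideal helix in angstrom."""
    return n_residues * rise_per_residue()

def backbone_hbonds(n_residues):
    """1-based (i, i + 4) pairs: C=O of residue i bonded to N-H of residue i + 4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]

average_length = helix_length(18)   # about 27 angstrom, i.e. 5 turns
pairs = backbone_hbonds(18)         # 14 pairs, from (1, 5) to (14, 18)
```

The last four residues have no i + 4 partner for their C=O groups, and the first four N-H groups have none either, which is one way to see why the helix ends have less H-bonding.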
• Alpha helix, cont.
o Deviates from ideal conformation at ends (less H-bonding)
o Some amino acids are “α-helix breakers”
▪ Repeating like-charges/ charged side chains
▪ Repeating “bulky” groups/ bulky side chains
▪ Pro and Gly
o Effects on helical stability:
▪ Electrostatic interactions between adjacent residues
▪ Steric interference between adjacent residues
▪ Interactions between residues 3-4 amino acids away
▪ Polarity of residues at both ends of helix (positive at amino end; negative at carboxyl)
• Propensity of an AA to take up an alpha-helix conformation: how compatible the AA is with an alpha helix
o ΔΔG = free energy change of mutating from alanine, in an alpha helix
▪ the greater the value = the more the AA destabilizes the alpha helix
o alanine = 0 = most preferred in an alpha helix (default); proline = highest score = hates alpha helix
o conclusion: by adding up these values over a sequence => estimate the probability of an alpha helix
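The "adding up" idea can be sketched as follows. The ΔΔG table is an assumption: the values (kcal/mol, relative to alanine) roughly follow the Pace and Scholtz helix propensity scale and should be replaced by whatever table the course uses. The function names are hypothetical.

```python
# Assumed helix propensity scale: ddG in kcal/mol relative to Ala
# (roughly the Pace and Scholtz values; substitute the course table if it differs).
HELIX_DDG = {
    "A": 0.00, "L": 0.21, "R": 0.21, "M": 0.24, "K": 0.26,
    "Q": 0.39, "E": 0.40, "I": 0.41, "W": 0.49, "S": 0.50,
    "Y": 0.53, "F": 0.54, "H": 0.61, "V": 0.61, "N": 0.65,
    "T": 0.66, "C": 0.68, "D": 0.69, "G": 1.00, "P": 3.16,
}

def helix_penalty(sequence):
    """Sum of ddG values over a sequence; higher means less helix-compatible."""
    return sum(HELIX_DDG[aa] for aa in sequence.upper())

def mean_helix_penalty(sequence):
    """Per-residue average, so sequences of different length can be compared."""
    return helix_penalty(sequence) / len(sequence)

good = mean_helix_penalty("AALEKAAL")   # low score: mostly helix formers
bad = mean_helix_penalty("AAPGKAAL")    # higher score: Pro and Gly break helices
```

A sum or average like this is only a crude score, not a calibrated probability; it ignores the position-dependent effects listed above (helix ends, i to i + 3/4 interactions).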
• Alpha helix: Helix dipole
o The electric dipole of a peptide bond is transmitted along an alpha-helical segment through Hbonds => resulting in an overall helix dipole = net pos. charge at N terminus & net neg. charge at C terminus
3.2 B-sheet/B-strand
• B-sheet/B-strand
o Extended, zigzag conformation (20-25% of AA)
▪ Hydrogen bond between groups across strands
▪ Forms parallel and antiparallel pleated sheets
▪ Residues alternate above and below β-sheet
▪ β-sheet formers: V, I, P, T, W
o Interstrand H-bonding
o Average 6 residues/strand; up to 15
o 2-12 strands/sheet; average 6
o R-groups alternate on opposite sides of sheet
o Distortions:
▪ Beta-bulge = extra residue
▪ Kink = Pro
• Anti-parallel vs parallel.
o Anti-parallel b-sheet
▪ Opposite orientation
▪ R map: phi = -140°, psi = 135°
▪ More stable: a middle strand running in the opposite direction is stitched to its neighbors on both sides by double Hbonds at optimal angles (2 Hbonds, nothing, 2 Hbonds, nothing, etc.)
▪ Can be twisted
▪ 7.0 Å per two amino acid residues
▪ Can withstand distortions and exposure to solvent
o Parallel b-sheet
▪ Same amino-carboxyl direction
▪ R map: phi = -120°, psi = 115°
▪ Less stable: Hbonds have suboptimal angles
▪ Less twisted
▪ Tend to be buried in inside of protein
▪ 6.5 Å per two amino acid residues
o Can have mix of parallel and anti-parallel
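As a small illustration of how the ideal R-map values in these notes can be used, the sketch below assigns a (phi, psi) pair to the nearest ideal conformation. The 40° cutoff and the function names are assumptions chosen for illustration; real assignment tools also use H-bond patterns, not angles alone.

```python
import math

# Ideal (phi, psi) values in degrees, as listed in these notes.
IDEAL = {
    "alpha helix": (-57.0, -47.0),
    "antiparallel beta": (-140.0, 135.0),
    "parallel beta": (-120.0, 115.0),
}

def angular_diff(a, b):
    """Smallest signed difference between two angles, in the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def nearest_ideal(phi, psi, cutoff=40.0):
    """Name of the closest ideal conformation, or 'other' beyond the (assumed) cutoff."""
    best_name, best_dist = "other", float("inf")
    for name, (ideal_phi, ideal_psi) in IDEAL.items():
        dist = math.hypot(angular_diff(phi, ideal_phi), angular_diff(psi, ideal_psi))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= cutoff else "other"

label = nearest_ideal(-60.0, -45.0)   # falls in the alpha-helical region
```

Because the parallel and antiparallel ideals lie close together on the map, angles alone separate them poorly; strand direction is needed to tell the two sheet types apart.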
• B-turns
o Interacting strands can be many amino acids apart, so you need folding/B-turn
o Turns are 180° & “connect” strands in folded (globular) proteins (& are for antiparallel sheets?)
o Interaction = between carbonyl oxygen of AA 1 and amino hydrogen of AA 4
▪ Short turn (needs 4 AA residues)
▪ Hydrogen bond between C=O & NH groups within strand (3 positions apart)
▪ Usually polar, found near surface
▪ β-turn formers: S, D, N, P, R
o Pro and Gly are often present
▪ Gly: small and flexible (Type II turns)
▪ Pro: cis conformation makes inclusion in tight turn favorable
3.3 Others
• Loop
o Regions between α-helices and β-sheets
o On the surface, vary in length and 3D configurations
o Do not have regular periodic structures
o Loop formers: small polar residues
• Coil (40-50%)
o Generally speaking, anything besides α-helix, β-sheet, β-turn
3.4 Methods to detect secondary structures
• 1) Circular dichroism can detect secondary structure
o = look at the difference in absorption of left- vs right-circularly polarized (UV) light by a sample => the dichroism spectrum gives a typical signature for each secondary structure
▪ dichroic material = absorbs light more or less strongly depending on its polarization
o Alpha helix: double minimum; beta conformation: single minimum; random coil: maximum ABS
• 2) Fourier transform infrared spectroscopy also detects secondary structure
o = absorption of infrared light
o 1) Hbonding groups in proteins have resonance frequencies in the infrared range
o 2) Hbonding groups thus absorb at slightly different infrared frequencies depending on whether they are in an alpha helix, B-sheet, or random coil
• Propensity of AA’s for being in alpha helix, B-conformation, B-turn (used for prediction models)
o E.g. Glu prefers alpha helix, disfavors B-conformation, tolerates B-turn
4. Hydrophobicity
• Hydrophobicity = effect mediated by solvent (ie water)
o Water-oil experiment: add AA to water & oil => mix water & oil => you find some AA in the water, some in the oil, some at the interface (~ cytosol = water, membrane = oil)
• Measure hydrophobicity scales by doing the water-oil experiment
o = measure the difference in free energy when an AA goes from water to oil (ΔG)
o Typical hydrophobicity scale: charged AA prefer water (high transfer energy); aromatic and small aliphatic AA prefer the hydrophobic phase (low transfer energy); polar AA prefer the interface
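The transfer energy in the water-oil experiment follows from how the amino acid partitions between the two phases. A minimal sketch, assuming equilibrium concentrations are measured in both phases and using the standard relation ΔG = -RT ln K with K = [AA]oil / [AA]water; the function name and the example concentrations are hypothetical.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal / (mol K)

def transfer_free_energy(conc_oil, conc_water, temperature_k=298.15):
    """Free energy of transfer water -> oil in kcal/mol from a partition measurement.

    Positive: the amino acid prefers water; negative: it prefers oil.
    """
    partition_coefficient = conc_oil / conc_water
    return -GAS_CONSTANT * temperature_k * math.log(partition_coefficient)

# Hypothetical measurements (arbitrary but equal concentration units):
hydrophobic_like = transfer_free_energy(conc_oil=10.0, conc_water=1.0)   # negative
charged_like = transfer_free_energy(conc_oil=0.01, conc_water=1.0)       # positive
```

Each factor of 10 in the partition coefficient corresponds to roughly 1.4 kcal/mol at room temperature, which gives a feel for the size of the numbers on a hydrophobicity scale.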
• Hydrophobic effect
o 1) dissolving 2 hydrophobic groups (aromatic) in water
o 2) ordering of water around these hydrophobic groups in a highly ordered Hbond network = iceberg
▪ Consequence: water structure gets perturbed & entropy is low (in the iceberg compared with bulk water)
o 3) the hydrophobic groups cluster via stable VDW interactions => 1 iceberg is freed
▪ Consequence: water structure perturbation is minimal & entropy is higher
• = driving force pushing hydrophobic AA towards each other = drives protein folding
• Hydrophobic core = hydrophobic AA are tightly packed together inside, away from water/solvent (no H2O)
o Hydrophilic AA are on the outside, towards water/solvent
• Imprint on sequence
o Evolution has selected sequences that can fold because they intersperse the right amount of hydrophobic & hydrophilic AA in the right order (= specific sequence imprint) => right connectivity in the sequence (red line), and thus the right structure for folding of globular proteins
o Consequence: a random polypeptide will not fold
• Not every sequence will fold, e.g. intrinsically disordered proteins (IDPs)
o IDPs often acquire some folded structure upon binding to a binding partner at an Extended Linear Motif in the IDP
o IDPs can have large interaction surfaces
o IDPs can be very rapidly turned over
o IDPs are a recent invention in evolution: folded proteins were invented first, IDPs later
▪ Eukaryotes have many IDPs in their genomes, bacteria far fewer
5. Putting hydrophobicity and secondary structure together
• 1) Amphiphilic alpha-helix: side chains point outwards, which makes one hydrophilic side (point towards
water), one aliphatic side (point towards oil)
• 2) Same in B-sheets: hydrophobic top band, hydrophilic bottom band
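One common way to quantify this amphiphilicity is a hydrophobic moment: each residue's hydrophobicity is treated as a vector pointing in the direction of its side chain, and the vectors are summed. The sketch below assumes the Kyte-Doolittle hydropathy scale and a side-chain rotation of 100° per residue for a helix (360° / 3.6) and 180° for a strand (alternating sides); the function name and the test sequences are made up for illustration.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(sequence, angle_per_residue_deg):
    """Mean hydrophobic moment for a given side-chain rotation per residue."""
    sum_x = sum_y = 0.0
    for index, aa in enumerate(sequence.upper()):
        angle = math.radians(index * angle_per_residue_deg)
        sum_x += KYTE_DOOLITTLE[aa] * math.cos(angle)
        sum_y += KYTE_DOOLITTLE[aa] * math.sin(angle)
    return math.hypot(sum_x, sum_y) / len(sequence)

helix_like = "LKKLLKKLLKLLKKLLKK"   # hypothetical: Leu on one helix face, Lys on the other
strand_like = "VKVKVKVK"            # hypothetical: alternating hydrophobic/hydrophilic

helix_score = hydrophobic_moment(helix_like, 100.0)     # large at helix periodicity
strand_score = hydrophobic_moment(strand_like, 180.0)   # large at strand periodicity
```

A sequence that scores high at 100° but low at 180° has the periodicity of an amphiphilic helix; the reverse pattern points to an amphiphilic strand.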