Proteome
Comparative genomics
Comparative genomics is the systematic comparison of the genomic sequences of different species.
Conserved sequences are likely to be functionally important, whereas rapidly evolving sequences can
tell us why species differ. Comparative genomics is important in human genetic research because it
allows the identification of new genes and other genomic elements. It also helps us to understand
gene function and the effects (pathogenicity) of mutations: the more strongly an amino acid is
conserved, the stronger the expected effect of a change at that position.
Comparative genomics is most often performed at the protein level. Many different kinds of genomes
are being sequenced:
New genomes of industrially and agriculturally relevant organisms (e.g. plant-pathogenic fungi)
New genomes of medically relevant organisms (pathogens)
New genomes of evolutionarily interesting organisms
The human genome
Three reasons to compare the proteins in your genome of interest to proteins in other organisms:
1. An alignment reveals important residues (those under negative/purifying selection), and a
good alignment needs orthologs
2. Find information about “the same” gene (= ortholog) in model organisms, or find the same
gene in a model organism to do experiments on
3. Copy/hypothesize functional information from an experimentally characterized homolog to
the gene of interest
For all three purposes you want the same gene in other species/genomes: not just homologs but
orthologs. Homologs are genes or sequences that are derived from a common ancestor. There are two
types of homologs. Orthologs are genes in different species that diverged through speciation, so
they trace back to a single gene in the common ancestor of those species. Paralogs are genes that
arose through gene duplication, for example two copies of a gene within the same species.
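The distinction can be made concrete with a small sketch. It assumes a toy gene tree in which every internal node is already labelled as a speciation or a duplication event; the tree, the gene names and the function names are hypothetical and only serve as illustration. Two genes are then called orthologs if their last common ancestor is a speciation node, and paralogs if it is a duplication node.

```python
# Hypothetical toy representation: a leaf is a gene name (str),
# an internal node is a tuple (event, left_subtree, right_subtree).

def leaves(node):
    """Return the set of gene names below a node."""
    if isinstance(node, str):
        return {node}
    _event, left, right = node
    return leaves(left) | leaves(right)

def relation(tree, gene_a, gene_b):
    """Classify two different genes by the event at their last common ancestor."""
    event, left, right = tree
    for child in (left, right):
        if {gene_a, gene_b} <= leaves(child):
            # Both genes sit in the same subtree: look deeper.
            return relation(child, gene_a, gene_b)
    # The genes sit in different subtrees, so this node is their last common ancestor.
    return "orthologs" if event == "speciation" else "paralogs"

# An ancestral gene duplicated into copies A and B before human and mouse split.
gene_tree = ("duplication",
             ("speciation", "human_A", "mouse_A"),
             ("speciation", "human_B", "mouse_B"))

print(relation(gene_tree, "human_A", "mouse_A"))  # orthologs
print(relation(gene_tree, "human_A", "human_B"))  # paralogs
print(relation(gene_tree, "human_A", "mouse_B"))  # paralogs
```

Note that human_A and mouse_B are homologs in different species, yet they are paralogs: this is why a homolog found in another species is not automatically "the same" gene.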
Phylogenetic trees
To distinguish orthologs from other homologs we need to look at the relationships between genes.
We infer these relationships from trees and summarize them in trees. A taxonomic tree is a
hierarchical classification: order → family → genus → species. A phylogenetic tree represents the
historical pattern of relationships among organisms.
Bioinformatica & Genoomanalyse, Evelien Floor
A phylogenetic tree can be rooted and can assume a molecular clock. With a uniform clock, all leaves
lie at the same distance from the root (ultrametric tree). With a non-uniform evolutionary clock,
the leaves have different distances to the root (additive tree).
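As a minimal sketch of the difference, the code below sums branch lengths from the root to every leaf and checks whether they are all equal. The tree encoding (a leaf is a name, an internal node is a list of (child, branch_length) pairs), the example trees and the function names are assumptions made for this illustration.

```python
def root_to_leaf_distances(node, dist_so_far=0.0):
    """Map each leaf name to its summed branch length from the root."""
    if isinstance(node, str):
        return {node: dist_so_far}
    distances = {}
    for child, branch_length in node:
        distances.update(root_to_leaf_distances(child, dist_so_far + branch_length))
    return distances

def is_ultrametric(tree, tolerance=1e-9):
    """True if every leaf lies at the same distance from the root."""
    values = list(root_to_leaf_distances(tree).values())
    return max(values) - min(values) <= tolerance

# Uniform clock: A, B and C all end up at distance 3.0 from the root.
clock_tree = [([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)]
# Non-uniform clock: same topology, but unequal root-to-leaf distances.
no_clock_tree = [([("A", 0.5), ("B", 2.0)], 2.0), ("C", 3.0)]

print(is_ultrametric(clock_tree))     # True
print(is_ultrametric(no_clock_tree))  # False
```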
Without a molecular clock, a phylogenetic reconstruction method infers only relationships and no
direction of evolution, so the analysis gives you an unrooted tree. To go from an unrooted to a
rooted tree you introduce a root somewhere in the tree; to go back you remove it. Because the root
can be placed on any branch, one unrooted tree can be turned into multiple rooted trees.
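How many rooted trees one unrooted tree gives can be counted directly. The sketch below assumes a fully resolved (binary) unrooted tree, which has 2n - 3 branches for n leaves, each of which is a possible position for the root; the function name is hypothetical.

```python
def count_rootings(n_leaves):
    """Number of rooted trees obtainable from one unrooted binary tree.

    An unrooted binary tree with n leaves has 2n - 3 branches, and the
    root can be placed on any one of them.
    """
    if n_leaves < 2:
        raise ValueError("need at least two leaves")
    return 2 * n_leaves - 3

print(count_rootings(4))  # 5
```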
The first step in making a molecular phylogenetic tree is aligning the sequences. From the aligned
sequences of the different species you can make a radial unrooted tree. To go from unrooted to
rooted you add an outgroup: an organism that is known to lie outside the group formed by the other
species. You then introduce the root on the branch leading to the outgroup.
There are two ways to make a molecular phylogenetic tree:
1. Alignment → distances → clustering
2. Alignment → best fitting tree
• Parsimony
• Maximum likelihood
Phylogenetic tree by distance methods
After alignment you start by making a distance matrix based on the differences between the aligned
sequences. To build a phylogenetic tree from this matrix, the UPGMA algorithm can be used:
Initialization:
• Fill distance matrix with pairwise distances
• Start with N clusters of 1 element (gene) each
Iteration:
• Merge the clusters Ci and Cj for which the distance dij is minimal
• Place the internal node connecting Ci and Cj at height dij/2
• Delete Ci and Cj; replace them by a new cluster C with group-average distances to all other clusters
Termination:
• When only two clusters Ci and Cj remain, put the root at dij/2
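The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the three toy sequences are made up, the distance used is the simple fraction of differing alignment positions (an assumption; the text only says "alignment differences"), and the tree is returned as nested tuples (left, right, node_height).

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def upgma(names, matrix):
    """Cluster N genes; returns nested tuples (left, right, node_height)."""
    # Initialization: N clusters of one gene each, keyed by an integer id.
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (subtree, size)
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Iteration: merge the two clusters with minimal distance.
        i, j = min(dist, key=dist.get)
        height = dist.pop((i, j)) / 2  # internal node is placed at dij/2
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # Group-average distance from the new cluster to each remaining cluster k.
        for k in clusters:
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        clusters[next_id] = ((tree_i, tree_j, height), size_i + size_j)
        next_id += 1
    # Termination: the last merge joined the final two clusters; its node is the root.
    tree, _size = next(iter(clusters.values()))
    return tree

# Hypothetical toy alignment, for illustration only.
alignment = {"human": "ACGTACGTAC", "chimp": "ACGTACGTAT", "mouse": "ACGAACGTGT"}
names = list(alignment)
matrix = [[p_distance(alignment[a], alignment[b]) for b in names] for a in names]
tree = upgma(names, matrix)
print(tree)
```

In this toy example human and chimp differ at 1 of 10 positions, so they are merged first at height 0.05; mouse then joins that cluster at the root. Because UPGMA places every node at half the cluster distance, the resulting tree is always ultrametric, which is only appropriate when a uniform molecular clock is a reasonable assumption.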
Phylogenetic tree by best fitting tree methods
There are two ways to find the best fitting tree directly from the alignment: maximum parsimony and
maximum likelihood. Maximum parsimony: the tree that requires the fewest evolutionary events to
explain the alignment, i.e. the simplest explanation of the observations. Maximum likelihood: the
tree most likely to have led to the alignment given a certain model of evolution.
With maximum parsimony you could, in principle, draw all possible trees for the sequences/species
in your multiple alignment. For each tree, you identify where the mutations must have taken place,
and you then choose the tree with the minimum number of required mutations. However, the number of
possible trees grows explosively: for 50 species there are far more than billions, on the order of
10^74 unrooted trees. Therefore, the method does not examine all trees but only a selection
(heuristic search).
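Both points of this paragraph can be illustrated with a short sketch. The first function counts unrooted binary trees with the standard formula (2n - 5)!!. The second scores a given tree with Fitch's counting procedure, a common way to obtain the minimum number of changes per alignment column; the source does not name a specific scoring algorithm, so this choice is an assumption. The four toy sequences and all function names are hypothetical.

```python
def count_unrooted_trees(n_species):
    """Number of distinct unrooted binary trees: (2n - 5)!! for n >= 3."""
    count = 1
    for k in range(3, 2 * n_species - 4, 2):
        count *= k
    return count

def fitch_score(tree, column):
    """Minimum number of changes for one alignment column.

    tree: a leaf name (str) or a pair (left, right); column: leaf name -> residue.
    """
    def visit(node):
        if isinstance(node, str):
            return {column[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            return common, left_cost + right_cost
        # No shared state: at least one change is needed at this node.
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]

def parsimony_score(tree, alignment):
    """Sum the per-column minimum number of changes over the whole alignment."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(length)
    )

# Hypothetical toy alignment of four species.
alignment = {"w": "AAC", "x": "AAC", "y": "GAT", "z": "GAT"}
# With four species there are only three unrooted topologies to compare.
candidates = [(("w", "x"), ("y", "z")),
              (("w", "y"), ("x", "z")),
              (("w", "z"), ("x", "y"))]
best = min(candidates, key=lambda t: parsimony_score(t, alignment))
print(best, parsimony_score(best, alignment))
print(count_unrooted_trees(10))  # 2027025
```

For four species the exhaustive comparison is trivial, but `count_unrooted_trees(50)` returns a number of about 75 digits, which is why real programs score only a heuristically chosen subset of trees.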