How to compare two sequences?
To compare two sequences, you first need to determine their alignment. Two sequences are
either homologous or not. Homologous sequences descend from a common ancestor, which is
why they tend to look alike. Orthologues are homologues in different species (separated by
a speciation event); paralogues are homologues within a species (arising from gene
duplication).
Identity: A residue is exactly the same (either nucleotide or amino acid)
Conservation: An amino acid is replaced by another amino acid that preserves the
physicochemical properties of the original residue (for nucleotides, conservation has
no meaning, as two residues are only considered equal or not)
Similarity: The extent to which sequences are related, based on identity and
conservation.
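The three definitions above can be turned into numbers for a pair of already aligned sequences. The sketch below is only an illustration: the conservative groups in `CONSERVATIVE_GROUPS` are a hypothetical, simplified grouping (not taken from these notes), and columns containing a gap (`-`) are assumed to be left out of the count.

```python
# Hypothetical, simplified groups of amino acids with similar physicochemical
# properties. Real tools use a substitution matrix instead.
CONSERVATIVE_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def is_conservative(a, b):
    """True if a and b differ but fall in the same illustrative group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns with a gap in either sequence are ignored.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    conserved = sum(is_conservative(a, b) for a, b in columns)
    n = len(columns)
    return 100 * identical / n, 100 * (identical + conserved) / n


# I/I is an identity, L/V and K/R are conservative, D/A is neither.
print(identity_and_similarity("ILKD", "IVRA"))  # (25.0, 75.0)
```

Similarity counts identities plus conservative replacements, so it is always at least as high as identity.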
Local alignment: Only the best-matching region of each sequence is aligned; the rest of
the sequences is ignored.
Global alignment: The entire length of both sequences is taken into account when finding
the alignment.
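The difference between the two can be made concrete with dynamic programming. The notes do not name an algorithm; the sketch below uses the classic Needleman-Wunsch (global) and Smith-Waterman (local) recurrences as an assumption, returns only the score, and uses arbitrary example values for `match`, `mismatch` and `gap`.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best alignment score of a and b with a linear gap penalty.

    local=False: Needleman-Wunsch style, both sequences aligned end to end.
    local=True:  Smith-Waterman style, best-scoring pair of subsequences.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for gaps at the start of either sequence.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            cell = max(diag, up, left)
            if local:
                # A local alignment may restart anywhere, so scores never drop below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


print(alignment_score("GGGACGT", "ACGT"))              # -2
print(alignment_score("GGGACGT", "ACGT", local=True))  # 4
```

The global score is lowered by the three unmatched G residues, while the local score only reflects the shared ACGT region.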
o DNA
Alignment of nucleotide sequences is useful, for example to:
Study noncoding regions of DNA (promoter regions, intergenic regions)
Confirm the identity of a cDNA
Study DNA polymorphisms
Study genomes (compare genomes, create genome annotations)
o Proteins
Alignment of protein sequences is often more informative than alignment of nucleotide
sequences, especially when the evolutionary distance between the sequences is large.
There are several reasons for this:
There are 20 different amino acids, but only 4 different nucleotide bases
For amino acids we can take their biophysical similarities into account
Codons are degenerate: changes in the third position of coding sequences
often do not alter the amino acid that is specified
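The effect of codon degeneracy can be shown with a small example. The sketch below uses only an excerpt of the standard genetic code, and the two DNA sequences are made up for illustration: they differ at every third position but encode the same peptide.

```python
# Excerpt of the standard genetic code (DNA codons); not the full table.
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "GAT": "D", "GAC": "D",                          # aspartate
    "GAA": "E", "GAG": "E",                          # glutamate
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Made-up example sequences that differ only at third codon positions.
seq_1 = "GCTGGAGAT"
seq_2 = "GCCGGGGAC"

nucleotide_matches = sum(x == y for x, y in zip(seq_1, seq_2))
print(nucleotide_matches, "of", len(seq_1), "nucleotides identical")  # 6 of 9
print(translate(seq_1), translate(seq_2))                             # AGD AGD
```

At the nucleotide level the sequences are only two-thirds identical, yet the encoded proteins are the same.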
Query NLYENFVQATFNALTAEKV
NY ENF+Q+ + + +
Subject NYAENTIQSIISTVEPAQR
The centerline provides the following information: A letter designates an identity (or
high similarity) between the two sequences. A "+" means the two residues are
similar but not highly similar. If no symbol is given between the two sequences, then a
dissimilar substitution has occurred.
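A centerline like the one described above can be generated from any pairwise scoring function. The sketch below assumes that a "+" is printed for non-identical pairs with a positive score. The scores in `TOY_SCORES` are hypothetical values chosen for illustration, not a real substitution matrix.

```python
# Hypothetical scores for a few similar amino acid pairs; every other
# non-identical pair is assumed to score -1.
TOY_SCORES = {
    frozenset("IV"): 3,
    frozenset("KR"): 2,
    frozenset("DE"): 2,
    frozenset("ST"): 1,
}


def pair_score(a, b):
    """Look up the toy score for an unordered pair of different residues."""
    return TOY_SCORES.get(frozenset((a, b)), -1)


def centerline(query, subject):
    """Build the line printed between two aligned sequences."""
    marks = []
    for q, s in zip(query, subject):
        if q == s:
            marks.append(q)      # identity: repeat the residue
        elif pair_score(q, s) > 0:
            marks.append("+")    # similar: positive score
        else:
            marks.append(" ")    # dissimilar: leave blank
    return "".join(marks)


print("Query   KIDSA")
print("        " + centerline("KIDSA", "RIEAG"))
print("Subject RIEAG")
```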
Substitution matrix: A table of how often each pair of amino acids is observed to replace
one another in alignments of related proteins. From these counts a scoring matrix can be
derived, with:
Positive values for combinations that are more likely than random chance,
Negative values for combinations that are less likely than random chance,
Zero values for combinations that are as likely as expected from random
chance
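The sign of a score follows from comparing the observed frequency of a pair with the frequency expected by chance, usually as a log-odds ratio. The sketch below assumes the expected frequency is simply the product of the two background frequencies and uses a scaling factor of 2 (half-bit units); all frequencies in the example are made up.

```python
import math


def log_odds_score(observed_freq, freq_a, freq_b, scale=2):
    """Rounded log-odds score for an aligned pair of residues.

    observed_freq: how often the pair is seen aligned in related sequences.
    freq_a, freq_b: background frequencies of the two residues.
    Assumption: expected frequency by chance is freq_a * freq_b.
    """
    expected = freq_a * freq_b
    return round(scale * math.log2(observed_freq / expected))


# Made-up frequencies: both residues have a background frequency of 0.1,
# so the pair is expected by chance with frequency 0.01.
print(log_odds_score(0.02, 0.1, 0.1))   # seen twice as often as chance:  2
print(log_odds_score(0.01, 0.1, 0.1))   # seen as often as chance:        0
print(log_odds_score(0.005, 0.1, 0.1))  # seen half as often as chance:  -2
```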
- BLOSUM matrix
A substitution scoring matrix in which scores for each position are derived
from observations of the frequencies of substitutions in blocks of local
alignments in related proteins. Each matrix is tailored to a particular
evolutionary distance. In the BLOSUM62 matrix, for example, the alignment