16-06-23 HC Intro Blast
HC Quantifying Sequence Similarity
HC Sequence Conservation
HC Analysis of omics data
HC Phylogenetic Trees
HC Phylogenetic Inference
HC Bootstrapping
HC GWAS
WC Blast
WC Probabilities
WC Quantifying Molecular Evolution
WC Sequence Conservation
WC Analysis of omics data
WC UPGMA & ArLV1
WC Phylogenetic Trees
WC Phylogenetic inference
WC A bioinformatic murder mystery
WC Bootstrapping
WC GWAS
Practice exam 1
Practice exam 2
Practice exam 3
Understand the formula sheet
ALSO WATCH THE KNOWLEDGE CLIPS!
Intro to Bioinformatics, Blast
Modern genetic research generates ever more, and ever more complex, data and is inextricably
linked to bioinformatics. In this course we combine these two subjects.
The ‘Omics’: Sequence everything of something
Genomics: Sequence all of the DNA of one organism
Transcriptomics: Sequence all of the mRNA in an organism/tissue/cell
Proteomics: Sequence all of the proteins in an organism/tissue/cell
Metagenomics: Sequence the DNA of all organisms in a sample
Metatranscriptomics: Sequence the mRNA of all organisms in a sample
Metaproteomics: Sequence the proteins of all organisms in a sample
Meta = All organisms.
Omics solves a major problem in science: Reducing biases by measuring all of a thing.
People are mostly interested in:
- Their diseases
- Their food
- Themselves
This causes biases in our general understanding of biology, and biases our databases.
Plants have some of the largest genomes, partly because their genomes are often duplicated.
Top-down = Question first. Given a biological question, a good bioinformatician will immediately
think about which dataset could be used to answer it.
Bottom-up = Data first. Given a dataset, a good bioinformatician will immediately think about which
biological hypothesis it could help to test.
List the factors, including heuristics, that make BLAST fast.
Looking something up in a database:
Query:
TGCTGCAGGA
CAACAGTT
Database:
AATGAGGTTAAGACTAAGCAATGCATGTGTAAGTATGAACTCTTGTATCATAGATTAAGC
CATGCATGTGTGATATCATGGTTGTGGTGGTATGACTTATT
Step 1: We have to break down the search because of possible mutations.
We do that with k-mers:
K-mer searches
- Sequences can be divided into shorter subsequences or k-mers
- k-mers consist of k nucleotides or amino acids
- We can make an index of all k-mers that occur in the database sequences
- If we split a query sequence into k-mers of the same length, we can rapidly identify all the
database sequences containing them
- But: we limit ourselves to exact matches
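The k-mer index described above can be sketched in a few lines of Python. This is a minimal illustration, not how BLAST itself stores its index; the toy database, the sequence names and the choice of k = 4 are assumptions made for the example.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return every overlapping k-mer of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(database, k):
    """Map each k-mer to the names of the database sequences containing it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def lookup(query, index, k):
    """Count, per database sequence, how many query k-mers match exactly."""
    hits = defaultdict(int)
    for kmer in kmers(query, k):
        for name in index.get(kmer, ()):
            hits[name] += 1
    return dict(hits)


# Hypothetical toy database; names and sequences are made up for illustration.
database = {
    "seq1": "AATGAGGTTAAGAC",
    "seq2": "CATGCATGTGTGAT",
}
index = build_index(database, k=4)
print(lookup("GCATGTGA", index, k=4))
```

The index is built once and reused for every query, which is where the speed comes from. A query k-mer carrying a single mutation no longer matches anything in the index, which is the exact-match limitation noted above.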
Sequence alignment: the difference between global and local
Sequence alignment: we try to match two sequences as well as possible.
There are two ways to search a database for a match: a k-mer search (very fast, but limited to exact
matches) or pairwise alignments against every database sequence (these also find distantly related
sequences, but take a very long time).
The solution is to combine the best of both worlds: Quickly find potential hits using k-mers stored in
an index. Make pairwise alignment, but only for potential hits.
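A rough sketch of this filter-then-align idea follows. The function names, the toy sequences and k = 4 are assumptions, and `best_ungapped_score` is a deliberately simple stand-in for a real pairwise aligner: it only counts identical positions at the best gap-free offset.

```python
from collections import defaultdict


def build_index(database, k):
    """Map each k-mer to the names of the sequences that contain it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def best_ungapped_score(a, b):
    """Stand-in for a pairwise aligner: slide a along b without gaps and
    return the highest number of identical positions."""
    best = 0
    for offset in range(-len(a) + 1, len(b)):
        matches = sum(
            1
            for i, ch in enumerate(a)
            if 0 <= i + offset < len(b) and b[i + offset] == ch
        )
        best = max(best, matches)
    return best


def search(query, database, index, k):
    """Run the slow comparison only on sequences sharing a k-mer with the query."""
    candidates = set()
    for i in range(len(query) - k + 1):
        candidates |= index.get(query[i:i + k], set())
    return {name: best_ungapped_score(query, database[name]) for name in candidates}


# Hypothetical toy data, made up for illustration.
database = {
    "seq1": "AATGAGGTTAAGAC",
    "seq2": "CATGCATGTGTGAT",
    "seq3": "GGGGCCCCGGGG",
}
index = build_index(database, k=4)
print(search("GCATGTCTGAT", database, index, k=4))
```

In this example only one of the three sequences passes the k-mer filter, so the slow comparison runs once instead of three times, and it still tolerates the mismatch in the middle of the query.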
There is a tool that does this for you:
Basic Local Alignment Search Tool (BLAST)
BLAST finds similar sequences at reasonable speed
– 10-50x faster than previous algorithms
Terminology:
– Query: sequence we search the database with (like the word typed into a search bar)
– Hit or Subject: similar sequence found in the database
BLAST is the most used bioinformatics program.
Even faster algorithms are now available.
If you look up a sequence you BLAST it.
If you make a poster and you BLAST something, do cite it!
Heuristics: shortcuts that make the whole process a lot faster, at the cost that you are not
guaranteed to find the best possible hit.
The BLAST search algorithm:
1. Identifies all words (k-mers of length W) in the query: W = 3 for protein, W = 11 for DNA.
2. Scores words with a substitution matrix by summing the scores of the aligned residue pairs,
for example 7 + 5 + 6 = 18 for a three-residue word.
3. Quickly finds similar words in the database. Similar words are defined using the
substitution matrix: they are the words that score highly enough against the query word, not
only exact copies. The index quickly locates all potential hit sequences.
4. Extends the seeds in both directions to find HSPs (high-scoring segment pairs) between query
and hit.
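The word scoring and the "similar words" step can be illustrated with a small sketch. Several things here are assumptions rather than part of the notes: the example word PQG (chosen because its BLOSUM62 self-score is 7 + 5 + 6 = 18), the threshold of 13, the reduced six-letter alphabet, and the flat fallback score for pairs not listed. Only the six listed values are taken from BLOSUM62.

```python
from itertools import product

# A few BLOSUM62 entries, limited to the pairs this toy example needs.
# A real search would load the full 20 x 20 matrix.
SCORES = {
    ("P", "P"): 7, ("Q", "Q"): 5, ("G", "G"): 6,
    ("Q", "E"): 2, ("Q", "K"): 1, ("Q", "R"): 1,
}
FALLBACK = -4  # hypothetical score for any pair missing from SCORES


def pair_score(a, b):
    """Look up a residue pair in either order, falling back to FALLBACK."""
    return SCORES.get((a, b), SCORES.get((b, a), FALLBACK))


def word_score(word, other):
    """Sum the substitution scores of two equally long words, position by position."""
    return sum(pair_score(a, b) for a, b in zip(word, other))


def neighbourhood(word, alphabet, threshold):
    """Return every word over alphabet that scores at least threshold against word."""
    result = {}
    for letters in product(alphabet, repeat=len(word)):
        candidate = "".join(letters)
        score = word_score(word, candidate)
        if score >= threshold:
            result[candidate] = score
    return result


print(word_score("PQG", "PQG"))
print(neighbourhood("PQG", "PQGEKR", threshold=13))
```

Each word in the resulting neighbourhood is then looked up in the index as an exact match, so the search stays fast while still tolerating conservative substitutions such as Q to E.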
Global and local sequence alignments
- Are the sequences completely or partially homologous (= are they in the same ‘family’, do they
share a common ancestor)?
- Local alignment (what BLAST does): finds the optimal sub-alignment within two sequences. Use
it for partial homologs.
- Global alignment (our goal): aligns two sequences from end to end. Use it if you know two
sequences are full homologs, e.g. resulting from gene duplication.
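The difference between the two alignment types shows up clearly in a small dynamic-programming sketch. The scoring scheme (match +1, mismatch -1, gap -1) and the function name are assumptions for illustration. The global variant follows the Needleman-Wunsch recurrence and the local variant the Smith-Waterman recurrence; only the score is returned, not the alignment itself.

```python
def alignment_score(a, b, local, match=1, mismatch=-1, gap=-1):
    """Score the best global (local=False) or local (local=True) alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for gaps at the sequence ends.
        for i in range(1, rows):
            table[i][0] = i * gap
        for j in range(1, cols):
            table[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = table[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, table[i - 1][j] + gap, table[i][j - 1] + gap)
            if local:
                # Local alignment may restart anywhere, so scores never drop below 0.
                cell = max(cell, 0)
            table[i][j] = cell
            best = max(best, cell)
    # Local: best cell anywhere in the table. Global: bottom-right corner.
    return best if local else table[-1][-1]


# ACGT occurs inside the longer sequence, flanked by unrelated bases.
print(alignment_score("ACGT", "TTACGTTT", local=True))
print(alignment_score("ACGT", "TTACGTTT", local=False))
```

The local score rewards the shared ACGT stretch and ignores the flanks, while the global score is pulled down by the four gaps needed to cover the longer sequence from end to end.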
BLAST flavors: direct searches
1. Nucleotide-nucleotide searches
o Blastn (W = 11 nucleotides): finds homologous genes in different species.
o Megablast (W = 28 nucleotides): designed to find longer alignments between very
similar nucleotide sequences. Best tool to find highly identical hits for a query
sequence, for example sequences from the same species.
o Discontiguous megablast (W = 11 nucleotides): can focus the search on codons.
Best tool to find nucleotide-nucleotide hits at larger evolutionary distances for
protein-coding query sequences.
2. Protein-protein searches
o Blastp (W = 3 amino acids): finds homologous proteins in different species.
BLAST flavors: translated searches
- These allow more sensitive searches that detect homology at greater evolutionary
distances.
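One way to see why comparing at the protein level is more sensitive: different codons can encode the same amino acid, so two DNA sequences can drift apart while the protein stays the same. The sketch below translates a nucleotide sequence in all six reading frames with the standard genetic code, which is the general idea behind a translated search. The function names and example sequences are assumptions for illustration, not BLAST's own implementation.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Translate three frames of the forward strand, then three of the reverse complement."""
    reverse = dna.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:]) for strand in (dna, reverse) for offset in range(3)]


print(six_frames("ATGGCCTAA"))
# Two DNA sequences that differ at 2 of 6 positions but encode the same peptide.
print(translate("CTTGAA"), translate("CTGGAG"))
```

At the nucleotide level the last two sequences share no exact 11-mer and differ in a third of their positions, yet their translations are identical, so a protein-level comparison still recognises them as related.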