Lecture notes

Summary Bioinformatics (Genomics) Biology UU

Course
Genomics (Genomica, BB1GENO20)

Institution
Utrecht University (UU)

Notes from the Bioinformatics lectures, with the accompanying slides and tables.

Uploaded on 3 July 2021
Number of pages: 18
Written in 2020/2021
Type: Lecture notes
Lecturer(s): -
Contains: All lectures

Learning goals: Genomics
HC1 (lecture 1): Intro, BLAST
Why study bioinformatics?
▪ Explain why a biologist should know bioinformatic data analysis

▪ Describe the ‘omics: (meta-)genomics, (meta-)transcriptomics, (meta-)proteomics, metabolomics, etc.

Genomics: Sequence all of the DNA of one organism

Transcriptomics: Sequence all of the mRNA in an organism/tissue/cell

Proteomics: Identify and quantify all of the proteins in an organism/tissue/cell (usually by mass spectrometry rather than by literal sequencing)

Metagenomics: Sequence the DNA of all organisms in a sample

Metatranscriptomics: Sequence the mRNA of all organisms in a sample

Metaproteomics: Identify and quantify the proteins of all organisms in a sample

▪ Explain the biology behind the ‘omics revolution: reduce bias by measuring all of a thing
Omics solves a major problem in science: biases
- People are mostly interested in: 1. Their diseases 2. Their food 3. Themselves
- This causes biases in our general understanding of biology, and biases in our databases
- For example, most studied bacteria are associated with humans

▪ Compare the two ways a bioinformatician exploits existing data to make new discoveries (top-down and bottom-up)

Sequence similarity searches
▪ Explain what a sequence alignment is and the difference between a global and local sequence alignment
In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or
protein to identify regions of similarity that may be a consequence of functional, structural,
or evolutionary relationships between the sequences. Aligned sequences of nucleotide or
amino acid residues are typically represented as rows within a matrix. Gaps are inserted
between the residues so that identical or similar characters are aligned in successive
columns.

Local alignment: finds the optimal sub-alignment within two sequences. Use it for partial homologs.
Global alignment: aligns two sequences from end to end. Use it if you know two sequences are full homologs, e.g. resulting from gene duplication.
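
The difference can be sketched in a few lines of dynamic programming. This is a minimal illustration, not the lecture's own code: the function name alignment_score and the scoring scheme (match +1, mismatch -1, gap -2) are assumptions chosen for the example. Global scoring follows the Needleman-Wunsch idea (end gaps are penalised, the score is read from the last cell), local scoring follows the Smith-Waterman idea (no cell drops below 0, the best cell anywhere is reported).

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-2):
    """Return the best alignment score of a and b.

    Global (default): both sequences are aligned end to end.
    Local (local=True): only the best-matching sub-region is scored.
    The scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global: gaps at the start cost something, so fill row 0 and column 0.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # local: the alignment may restart anywhere
            score[i][j] = cell
            best = max(best, cell)
    return best if local else score[-1][-1]


print(alignment_score("ACGT", "TTACGTTT", local=True))  # 4: the shared ACGT region
print(alignment_score("ACGT", "TTACGTTT"))              # -4: four gaps are forced
```

The same pair of sequences scores well locally and poorly globally, which is why local alignment suits partial homologs.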

▪ Explain the BLAST algorithm
1. Identifies all words (length W) in the query
– Default lengths: W = 3 for protein, W = 11 for DNA
– Based on substitution scores
2. Quickly finds similar words in the database
– “Similar” words are defined by using the substitution matrix (e.g. BLOSUM62)
– The index quickly locates all potential hit sequences
3. Extends seeds in both directions to find HSPs (high-scoring segment pairs) between query and hit
– HSP: region that can be aligned with a score above a certain threshold
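
The three steps can be sketched as a toy seed-and-extend search. This is a strongly simplified illustration under stated assumptions, not BLAST itself: it only uses exact word matches (no neighbourhood of similar words from a substitution matrix), extends without gaps, and computes no E-values. The names build_index, extend and find_hsps, and the values for w, threshold, x_drop and the match/mismatch scores, are hypothetical.

```python
from collections import defaultdict


def build_index(subject, w):
    """Step 2 helper: map every word of length w to its start positions."""
    index = defaultdict(list)
    for i in range(len(subject) - w + 1):
        index[subject[i:i + w]].append(i)
    return index


def extend(query, subject, qi, si, w, match=1, mismatch=-2, x_drop=3):
    """Step 3: extend an exact seed in both directions without gaps.

    Extension stops once the running score falls x_drop below the best score.
    """
    best = score = w * match       # an exact seed scores w matches
    q_start, q_end = qi, qi + w    # best region in the query, end exclusive
    # Extend to the right.
    q, s = qi + w, si + w
    while q < len(query) and s < len(subject) and best - score < x_drop:
        score += match if query[q] == subject[s] else mismatch
        q, s = q + 1, s + 1
        if score > best:
            best, q_end = score, q
    # Extend to the left, starting again from the best score so far.
    score = best
    q, s = qi - 1, si - 1
    while q >= 0 and s >= 0 and best - score < x_drop:
        score += match if query[q] == subject[s] else mismatch
        if score > best:
            best, q_start = score, q
        q, s = q - 1, s - 1
    return best, q_start, q_end


def find_hsps(query, subject, w=4, threshold=6):
    """Steps 1-3: list (score, query_start, query_end, subject_start) per HSP."""
    index = build_index(subject, w)
    hsps = set()
    for qi in range(len(query) - w + 1):          # step 1: all words in the query
        for si in index.get(query[qi:qi + w], []):  # step 2: look up seeds
            score, q_start, q_end = extend(query, subject, qi, si, w)
            if score >= threshold:
                hsps.add((score, q_start, q_end, si - (qi - q_start)))
    return sorted(hsps, reverse=True)


print(find_hsps("ACGTACGT", "GGGACGTACGTGGG"))  # [(8, 0, 8, 3)]
```

Several seeds land in the same region and extend to the same HSP, so the results are collected in a set.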

▪ List the factors, including heuristics, that make BLAST fast
The fastest algorithms generally use heuristics.
Heuristic: a practical method that is not guaranteed to be optimal, but is sufficient for the present goals.

Running BLAST
▪ Evaluate BLAST output/results

▪ Decide which BLAST flavor to use for your similarity search
BLAST flavors: direct searches
o Nucleotide-nucleotide searches
- Nucleotide database & nucleotide query
- blastn (default: W = 11 nucleotides)
⇒ Find homologous genes in different species
- megablast (default: W = 28 nucleotides)
⇒ Designed to efficiently find longer alignments between very similar nucleotide sequences
⇒ Best tool to find highly identical hits for a query sequence
• For example: find sequences from the same species
- Discontiguous megablast
⇒ Uses discontiguous words (e.g. W = 11 nucleotides: AT-GT-AC-CG-CG-T)
⇒ For example, this can focus the search on codons (the third nucleotide of codons is less conserved due to the degeneracy of the genetic code)
⇒ Best tool to find nucleotide-nucleotide hits at larger evolutionary distances for protein-coding query sequences
o Protein-protein searches
- Protein database & protein query sequences
- blastp (default: W = 3 amino acids)
⇒ Find homologous proteins in different species

BLAST flavors: translated searches

o We can exploit the conservation of protein sequences when aligning DNA sequences, by using translated searches
o This allows for more sensitive searches that detect homology at greater evolutionary distances
– For example: homologous genes in distantly related species
o blastx and tblastx first translate the nucleotide query into protein (in all six reading frames) before identifying high-scoring words
o tblastn and tblastx search a nucleotide database that is translated into protein in all six reading frames
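
The translation step behind these searches can be illustrated with a six-frame translation. This is a minimal sketch, assuming the standard genetic code and DNA written with the letters A, C, G, T; the names CODON_TABLE, reverse_complement, translate and six_frames are illustrative and not part of BLAST.

```python
from itertools import product

# Standard genetic code, written in the usual T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become X, stops become *."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate a DNA sequence in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Only one of the six frames usually corresponds to the real protein, which is why a translated search has to consider all of them.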

HC2 (lecture 2): Quantifying Sequence Similarity
Evolution
▪ List the mechanisms of DNA mutation
Nucleotide substitutions
- Replication error
- Physical or chemical reaction
Insertions or deletions (indels)
- Unequal crossing over during meiosis
- Replication slippage
Inversions or rearrangements
Duplications of:
- Partial or whole gene
- Partial (polysomy) or whole chromosome (aneuploidy, polysomy)
- Whole genome (polyploidy)
Horizontal gene transfer (HGT)
- Transfer of genetic material between individuals other than from parent to offspring, e.g. between individuals of the same generation
▪ Define homology, similarity, and identity
Homology
- Property of two sequences that have a shared ancestor
- Homology is TRUE or FALSE: either you’re family or you’re not
Identity
- Percentage of identical residues in an alignment
- Used for amino acids or nucleotides.
Similarity
- Percentage of amino acid residues in an alignment with a positive substitution score
- Not used for DNA
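
Identity and similarity can both be read off an existing alignment. The sketch below makes two assumptions that are not from the lecture: percentages are taken over the full alignment length (gap columns included), and toy_score is a made-up substitution score that groups a few amino acids by rough physicochemical similarity. It stands in for a real matrix such as BLOSUM62 and does not reproduce its values.

```python
# Illustrative groups of similar amino acids (an assumption, not a real matrix).
SIMILAR_GROUPS = ("ILVM", "KR", "DE", "ST", "FYW")


def toy_score(x, y):
    """Return +1 for identical or same-group residues, otherwise -1."""
    if x == y:
        return 1
    return 1 if any(x in group and y in group for group in SIMILAR_GROUPS) else -1


def identity_and_similarity(row_a, row_b, score=toy_score):
    """Return (% identity, % similarity) for two aligned rows with '-' gaps."""
    if len(row_a) != len(row_b) or not row_a:
        raise ValueError("aligned rows must be non-empty and of equal length")
    identical = positive = 0
    for x, y in zip(row_a, row_b):
        if "-" in (x, y):
            continue  # gap column: counts toward the length only
        if x == y:
            identical += 1
        if score(x, y) > 0:
            positive += 1
    length = len(row_a)
    return 100 * identical / length, 100 * positive / length


identity, similarity = identity_and_similarity("MKL-DE", "MRIADQ")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Identical residues always have a positive score, so similarity is never lower than identity.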
▪ List four properties of amino acids that might be important in determining their physicochemical similarity
Size, polarity, hydrophobicity, preferred protein fold

Probability & Permutation Statistics
▪ Work with P-values obtained using permutation statistics
P-value: the probability of observing a hit as good as, or better than, your score by chance.
In permutation statistics, this corresponds to the fraction of permutations in which the permuted score is equal to or higher than your score.
A meaningful observation has a low P-value: randomly permuted data rarely scores as high as or higher than your score.
The minimum P-value depends on the number of random permutations.
Example: for 100 permutations, the best P-value: <0.01
For 1000 permutations, the best P-value: <0.001
▪ Explain how permutation statistics help us evaluate the strength of a result
Statistics are not well defined for many bioinformatic analyses. A simple solution is data
permutation:
- Permute (shuffle) the sequences 1000* times
- Make 1000* new alignment matrices
- Register if the alignment score of the permuted sequences is equal to or higher than your score
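
The permutation procedure can be sketched as follows. This is an illustration under assumptions, not the course's implementation: local_score is a small Smith-Waterman-style score with made-up parameters (match +1, mismatch -1, gap -2), only the second sequence is shuffled, and the names permutation_test and report are hypothetical. When no permutation reaches the observed score, the result is reported as P < 1/N, in line with the minimum P-value described above.

```python
import random


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (illustrative scoring values)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for x in a:
        current = [0]
        for j, y in enumerate(b, start=1):
            cell = max(
                0,
                previous[j - 1] + (match if x == y else mismatch),
                previous[j] + gap,
                current[j - 1] + gap,
            )
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


def permutation_test(a, b, n_perm=1000, seed=0):
    """Shuffle b n_perm times; count permuted scores >= the observed score."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = local_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(letters)
        if local_score(a, "".join(letters)) >= observed:
            hits += 1
    return observed, hits, n_perm


def report(hits, n_perm):
    """Format the P-value; with zero hits only an upper bound is known."""
    if hits == 0:
        return f"P < {1 / n_perm:g}"
    return f"P = {hits / n_perm:g}"


observed, hits, n_perm = permutation_test("ACGTTGCAAGCTTAGC", "ACGTTGCAAGCTTAGC")
print(observed, report(hits, n_perm))
```

Shuffling keeps the composition of the sequence but destroys the order, so the permuted scores show what the scoring scheme yields by chance alone.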

Seller on Stuvia: milofonville
