Algorithms for Sequence Analysis
Day  Date        Time   Room    Lecturer  Type                      Topic

Week 44 (2020-10-26 - 2020-11-01)
Mon  26-10-2020  11:00  -       JH        Lecture                   Intro – Pairwise Alignment 1
Mon  26-10-2020  13:30  -       LR, DM    Practical computer class  Practical Dynamic Programming 1 (DP)
Fri  30-10-2020  13:30  -       JH        Lecture                   Pairwise Alignment 2
Fri  30-10-2020  15:30  -       LR, DM    Practical computer class  DP

Week 45 (2020-11-02 - 2020-11-08)
Mon  2-11-2020   11:00  -       JH        Lecture                   Substitution Matrices
Mon  2-11-2020   13:30  -       LR, DM    Practical computer class  DP
Fri  6-11-2020   13:30  -       JH        Lecture                   Multiple Sequence Alignment 1
Fri  6-11-2020   15:30  -       LR, DM    Practical computer class  DP
Sun  8-11-2020   23:00  -       -         Deadline                  DP

Week 46 (2020-11-09 - 2020-11-15)
Mon  9-11-2020   11:00  -       JH + BS?  Lecture                   BWA and QDNAseq
Mon  9-11-2020   13:30  -       SL, DM    Practical computer class  Start Practical BWA/QDNAseq
Fri  13-11-2020  13:30  -       BS        Lecture                   Multiple Sequence Alignment 2
Fri  13-11-2020  15:30  -       SL, DM    Practical computer class  BWA and QDNAseq

Week 47 (2020-11-16 - 2020-11-22)
Mon  16-11-2020  11:00  -       JH        Lecture                   Homology Searching 1
Mon  16-11-2020  13:30  -       SL, DM    Practical computer class  BWA and QDNAseq
Fri  20-11-2020  13:30  -       JH        Lecture                   Homology Searching 2
Fri  20-11-2020  15:30  -       SL, DM    Practical computer class  BWA and QDNAseq
Sun  22-11-2020  23:00  -       -         Deadline                  BWA/QDNAseq

Week 48 (2020-11-23 - 2020-11-29)
Mon  23-11-2020  11:00  -       JH        Lecture                   Hidden Markov Models 1
Mon  23-11-2020  13:30  -       DM, BS    Practical computer class  Start Practical HMM1/2
Fri  27-11-2020  13:30  -       JH        Lecture                   Hidden Markov Models 2
Fri  27-11-2020  15:30  -       DM, BS    Practical computer class  HMM1/2

Week 49 (2020-11-30 - 2020-12-06)
Mon  30-11-2020  11:00  -       Solon     Lecture                   Suffix trees/arrays
Mon  30-11-2020  13:30  -       DM, BS    Practical computer class  HMM1/2
Fri  4-12-2020   13:30  -       JH        Lecture                   Phylogeny 1
Fri  4-12-2020   15:30  -       DM, BS    Practical computer class  HMM1/2

Week 50 (2020-12-07 - 2020-12-13)
Mon  7-12-2020   11:00  -       JH        Lecture                   Phylogeny 2
Mon  7-12-2020   13:30  -       DM, BS    Practical computer class  HMM1/2
Fri  11-12-2020  13:30  -       JH        Lecture                   Question hour
Fri  11-12-2020  15:30  -       DM, BS    Practical computer class  HMM1/2
Fri  11-12-2020  23:00  -       -         Deadline                  HMM

Week 51 (2020-12-14 - 2020-12-20)
Mon  14-12-2020  15:30  Online  -         Written exam              -
Contents
Optional extra reading material
Week 1
Lecture 1 Introduction – Pairwise Alignment
Lecture 2 Alignment
Practical
Week 2
Lecture 3 Substitution Matrices
Lecture 4 Multiple Sequence Alignment I
Week 3
Lecture 5 Genome sequencing (lecture 5a 2018)
Lecture 6 Multiple Sequence Alignment II
Week 4
Lecture 7 Homology Searching I
Lecture 8 Homology Searching II
Week 5
Lecture 9 Markov Chain Models
Lecture 10 Hidden Markov Models
Week 6
Lecture 11 Suffix tree: applications
Lecture 12 Part 1: Gene Family Evolution
Week 7
Lecture 13 Introduction to phylogenetic/phylogenomic concepts and methods
Notes Deadlines
Optional extra reading material
Lecture 1 Pairwise alignment
Background biology: chapter 1 Zvelebil and Baum.
Convergent/divergent evolution: box 4.2 in chapter 4 Zvelebil and Baum.
Homology/orthology/paralogy/xenology: chapter 7.2 Zvelebil and Baum and chapter 1.1 Durbin.
Alignments: chapter 5.1 Zvelebil and Baum and chapter 2.2 Durbin.
Substitution matrices: chapter 4.4 and 5.1 Zvelebil and Baum.
Dynamic programming: chapter 5.2 Zvelebil and Baum and chapter 2.3-2.4 Durbin.
Gap functions: chapter 4.4 and 5.1 Zvelebil and Baum.
Lecture 2 Pairwise alignment
Dot plots: chapter 4.2 Zvelebil and Baum.
Measuring alignment similarity: chapter 4.2 Zvelebil and Baum.
Statistics of alignments: chapter 2.7 Durbin.
Lecture 3 Substitution matrices
From Understanding Bioinformatics (Zvelebil and Baum):
Substitution matrices (PAM, BLOSUM) and log-odds ratio: Chapter 4.3 and 5.1
Observed sequence distance: Chapter 7.2 p236-237 (Fig. 7.7) and 8.1
Jukes-Cantor and Kimura: Chapter 8.1 p271-273
Lecture 4 MSA 1
MSA: Chapter 6.3 (Durbin) and Lipman, D., Altschul, S., & Kececioglu, J. (1989). A Tool for Multiple
Sequence Alignment.
DCA: Chapter 4.5 p90-91 (Zvelebil and Baum)
Progressive alignment: Chapter 6.4 (Zvelebil and Baum) and Chapter 6.4 (Durbin)
PSSM: Chapter 6.1 p167-168 (Zvelebil and Baum)
Lecture 5 Sequencing, BWA, QDNAseq
Human genome project, sequencing, de novo assembly (papers):
• Initial sequencing and analysis of the human genome (PMID:11237011,
DOI:10.1038/35057062) (mainly the first part of the paper, "Background of the Human Genome
Project" and "Strategic Issues"; the other sections give more insight into other topics
discussed in the lectures) link: https://www.nature.com/articles/35057062
• The Sequence of the Human Genome (PMID:11181995,
DOI:10.1126/science.1058040) (mainly paragraphs
1 and 2) link: https://science.sciencemag.org/content/291/5507/1304
Genome assembly (papers):
• How to apply de Bruijn graphs to genome assembly (DOI:10.1038/nbt.2023) (de Bruijn
graph) link: https://www.nature.com/articles/nbt.2023
• Sense from sequence reads: methods for alignment and assembly (alignment, BWT,
assembly, de Bruijn graph) link: https://www.nature.com/articles/nmeth.1376
• Fast and accurate short read alignment with Burrows-Wheeler transform
(DOI:10.1093/bioinformatics/btp324) (BWT,
BWA) link: https://academic.oup.com/bioinformatics/article/25/14/1754/225615
QDNAseq (papers):
• DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome
sequencing with identification and exclusion of problematic regions in the genome assembly
(PMID:25236618,
DOI:10.1101/gr.175141.114) link: https://genome.cshlp.org/content/24/12/2022
Lecture 6 MSA 2
T-COFFEE: Chapter 6.4 (Zvelebil and Baum) and paper:
Notredame, C., Higgins, D., and Heringa, J. (2000) T-Coffee: A novel method for fast and accurate
multiple sequence alignment. J. Mol. Biol., 302, 205-217.
Link: https://pubmed.ncbi.nlm.nih.gov/10964570/
PRALINE: All PRALINE papers mentioned in the slides.
Sum-of-pairs score: Chapter 6.4 Fig. 6.14 p200-201 (Zvelebil and Baum) and Chapter 6.2 (Durbin)
Lecture 7 + 8 Homology Searching
Understanding Bioinformatics (Zvelebil and Baum):
• Chapter 4.6 & 4.7 - p. 93 - 102, 108
• Chapter 5.3 & 5.4 - p. 141 - 156
• Chapter 6.1 & 6.2 - p. 167 - 185
Biological Sequence Analysis (Durbin):
• Chapter 2.4 & 2.5 - p. 29 - 35
Lecture 9 + 10 Hidden Markov Models
• Prokaryote gene prediction: An Introduction to Hidden Markov Models for Biological
Sequences by Anders Krogh. Chapter 4.4.
Biological Sequence Analysis (Durbin):
• Chapter 3
• Chapter 4
Lecture 11 Guest lecture Solon: Suffix arrays/trees
Lecture 12 Phylogeny
Lecture 13 Guest lecture Kimmen: Satchmo algorithm
Recommended reading:
• Delsuc, Brinkmann & Philippe, Phylogenomics and the reconstruction of the tree of life, Nature
Reviews Genetics 2005 (presents phylogenomic methods for reconstructing species phylogenies,
including gene matrix and supertree approaches)