1. DNA sequencing & PCR – Hans van der Spek
DNA has a direction (polarity). The two strands of the double helix have opposite polarity: they are
antiparallel (one runs 5’→3’, the other 3’→5’). 5’→3’ is also the direction of synthesis. In proteins
the direction is N→C terminus.
DNA synthesis/replication:
Semi-conservative: There is a daughter and a parent strand. Each new DNA molecule consists of both; one strand is
always the template for the other strand (the parent is read 3’→5’ because synthesis is 5’→3’).
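The antiparallel pairing can be sketched in a few lines of Python. This is only an illustration: the function name `reverse_complement` and the example sequence are made up for these notes. Given a strand written 5’→3’, it returns the partner strand, also written 5’→3’.

```python
# Watson-Crick pairing rules for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand_5_to_3):
    """Return the partner strand, also written 5'->3'."""
    # Complement every base, then reverse the order, because the partner
    # strand runs in the opposite direction (antiparallel).
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3))

parent = "ATGGCA"                      # hypothetical example sequence
daughter = reverse_complement(parent)
print(parent, "pairs with", daughter)  # ATGGCA pairs with TGCCAT
```

Applying the function twice gives back the original strand, which is the code version of "one strand is always the template for the other".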
Energy for addition of a base comes from the substrate molecule itself, the nucleoside triphosphate. The energy is used
to make a covalent phosphodiester bond in the sugar backbone.
If a mistake is made, the polymerase can just remove the nucleotide with its proofreading activity and
add a new one. If synthesis ran the other way (adding at the 5’ end), the triphosphate would sit on the
growing chain: removing the wrong nucleotide would mean you lose the
triphosphate and you don’t have the energy for making a new bond.
Enzymes are coupled to each other, they are linked.
Helicase: DNA is a double helix, helicase opens up the DNA.
Topoisomerase: Makes a nick in the DNA (to release tension).
Ligase: Seals nicks in the backbone, for example between the fragments of the lagging strand (topoisomerase reseals its own nick).
Leading strand polymerase: Continuous replication.
Lagging strand polymerase: Replicated in parts because synthesis cannot run 3’→5’ along with the
replication fork. DNA polymerase needs a 3’ OH to start; RNA polymerase can start without one. Primase is an
RNA polymerase. A short RNA primer is made, after which DNA polymerase can start. The primer is later
eaten away by DNA polymerase and the gap is filled in with DNA. These little parts are Okazaki fragments;
ligase joins them.
This process goes really fast → mistakes. These can be fixed by the proofreading activity of the DNA
polymerase complex. It has an editing site which recognises mistakes immediately and fixes them
(eats back like a Pac-Man: 3’→5’ exonuclease activity). There’s also strand-directed mismatch
repair: MutS recognises the mismatch and, together with MutL (both in bacteria), a stretch of the strand
around the mismatch is removed because this is easier to repair than just one base. DNA
polymerase then fills the gap, using the other strand as a template. The repair system knows which strand carries the
mistake by methylation: the parent strand is methylated, the new strand not yet (in E. coli this is methylation of A in GATC sites).
Without proofreading: 1 error in 10^5 bases.
With proofreading: 1 error in 10^7 bases.
With strand-directed mismatch repair: 1 error in 10^10 bases.
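To get a feeling for these numbers, a small calculation of the expected number of errors per round of replication. The error rates are the ones listed above; the genome size of 3 × 10^9 bp is an assumption chosen for illustration and does not come from the lecture.

```python
# Error rates per base, as listed in the notes.
ERROR_RATES = {
    "without proofreading": 1e-5,
    "with proofreading": 1e-7,
    "with mismatch repair": 1e-10,
}

# Assumed genome size for illustration only (not from the lecture).
GENOME_SIZE = 3_000_000_000

def expected_errors(genome_size_bp, error_rate):
    """Expected number of wrong bases when the genome is copied once."""
    return genome_size_bp * error_rate

for label, rate in ERROR_RATES.items():
    errors = expected_errors(GENOME_SIZE, rate)
    print(f"{label}: about {errors:g} errors per replication")
```

With these assumptions, only the combination of proofreading and mismatch repair brings the expected number of errors below one per replication.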
Chemical changes in DNA can cause mutations: depurination & deamination.
Deamination: an NH2 group is changed to an O (by addition of water). C becomes U, which pairs like T,
so after replication you get C→T (or A→G when an A is deaminated).
Depurination: a purine base (A or G) is lost from the backbone. You get a deletion because polymerase
does not recognise the site and just skips it.
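Why deamination of C ends up as a C→T change can be followed step by step. A minimal sketch, assuming no repair takes place and using made-up helper names (`deaminate_c`, `copy_strand`); strands are kept aligned base by base instead of being written 5’→3’, to make the positions easy to compare.

```python
# U pairs with A, just like T does.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def deaminate_c(strand, position):
    """Replace the C at the given position by U."""
    if strand[position] != "C":
        raise ValueError("only C is deaminated in this sketch")
    return strand[:position] + "U" + strand[position + 1:]

def copy_strand(template):
    """Partner strand, kept aligned with the template (not reversed)."""
    return "".join(PAIRS[base] for base in template)

original = "ACGT"                        # hypothetical example
damaged = deaminate_c(original, 1)       # "AUGT"
partner = copy_strand(damaged)           # "TACA": an A is built in opposite U
next_generation = copy_strand(partner)   # "ATGT": the C has become a T
print(original, "->", next_generation)
```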
DNA sequencing
Dideoxy sequencing (Sanger): DNA synthesis while incorporating chain terminators, in a separate
reaction for each of the four bases. The chain terminator (ddNTP) lacks the 3’ OH necessary for strand extension. You use a
high concentration of all dNTPs and a low concentration of one ddNTP. Synthesis occasionally stops at that base →
visualise the fragments on a gel. Bottom bands are short fragments, top bands are long fragments. You have to read the gel
from bottom to top (5’→3’). Bigger fragments are harder to separate on a gel (max 400 bp). Instead
of reading the gel yourself with radioactive labels → fluorescent labels with detectors. Better: 4 different
fluorescent labels, one per ddNTP, so you can pool everything in one reaction. With pooled reactions you do not need gels anymore → capillary
(automated dideoxy sequencing). High G-C regions are hard; you need many different reads to
confirm them.
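The logic of the four lanes and reading from bottom to top can be sketched as follows. This is a simplified model with made-up function names: fragment lengths are counted from the first base after the primer, and every possible stop is assumed to give a visible band.

```python
def sanger_lanes(new_strand):
    """For each ddNTP lane, list the fragment lengths that end on that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length exists only in the lane whose ddNTP
        # matches the base at this position of the newly made strand.
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Read the bands from bottom (shortest) to top (longest)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)

lanes = sanger_lanes("GATTC")   # hypothetical newly synthesised strand
print(lanes)                    # {'A': [2], 'C': [5], 'G': [1], 'T': [3, 4]}
print(read_gel(lanes))          # GATTC
```

Reading the shortest fragment first gives the 5’ end of the new strand, which is why the gel is read from bottom to top.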
Genome (shotgun) sequencing: Genomic DNA is divided into a BAC (bacterial artificial chromosome)
library → large contigs → cut up into smaller pieces (shotgun clones) → sequence and align later on
by overlapping sequences.
We have long stretches of repetitive DNA, which are hard to sequence. You will always end up with gaps;
technique to fix this: end-to-end cloning. You want to link all these parts together. BAC clones are
big, you know their order and can align them to a known genome.
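Aligning shotgun reads by their overlapping sequences can be illustrated with a very small greedy assembler. This is a toy sketch, not how real assemblers work: it assumes error-free reads from one strand, and the function names and the minimum overlap of 3 bases are choices made for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no overlaps left: the remaining pieces are separated by gaps
        merged = reads[i] + reads[j][size:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))  # ['ATGGCGTACCTGA']
print(greedy_assemble(["AAAA", "CCCC"]))                   # ['AAAA', 'CCCC']
```

The second call shows the gap problem: without an overlap the pieces cannot be linked. Repeats cause a related problem, because the same overlap then fits at more than one place.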
Massive parallel sequencing: Major advance, you do not need to clone DNA anymore (cloning = cutting in
pieces, sequencing and aligning). You just PCR the fragments, but you need to be absolutely sure that
each specific part of DNA is kept separate from other parts of DNA. You can attach
your DNA to beads and amplify it there. Put these in a grid → individual spots (millions) with different parts of
DNA. Sequencing techniques:
- Pyrosequencing: You need to detect how the sequencing reaction proceeds (no gels). dNTP
incorporation gives PPi → ATP. ATP can be used by luciferase → flash of light. You have to
do this one nucleotide at a time: you put A on everything, and only where A can be
incorporated you see light (camera → lots of space on the PC & computer time). You need to
remove the leftover nucleotides and ATP with apyrase, wash, and add the next nucleotide. The strength of the signal shows how
many of the same bases are incorporated.
- Illumina sequencing: Different fluorescent labels at the same time, but you still have to take a
picture. Wash the fluorescent labels away → do it again. Problem: when you have 6/7 of the
same bases it’s harder to see how many there are.
- Ion torrent sequencing: Detection is not with pictures. You separate the fragments in a grid and you
make use of the fact that not only diphosphate is released but also a proton upon
incorporation. The resulting pH change is detected. You can also measure multiple of the same base.
You have to add the nucleotides one by one again!!
- Nanopore: USB stick; a motor protein (walks on DNA) pulls the DNA strand through the nanopore.
It measures the disturbance of the ion flow (current) through the pore, since this is
disturbed when the DNA is pulled through it (different disturbance for every base). Can make
reads of 100,000 bp (others max 500).
- Bionano: Ultra long read scanning. Isolate DNA and label parts of the DNA (like restriction sites,
repeats etc.). It can find these labels, so you can make a profile/map and compare samples, e.g. for
translocations.
Short but many overlapping reads → awesome.
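The "one nucleotide at a time" readout used in pyrosequencing and Ion Torrent can be sketched as a flowgram: per nucleotide flow you record how many bases were incorporated (light intensity or pH change). The flow order "TACG" and the function names are assumptions for this example, and the signal is idealised as an exact count.

```python
def flowgram(new_strand, flow_order="TACG"):
    """Return (nucleotide, number incorporated) for every flow."""
    if set(new_strand) - set(flow_order):
        raise ValueError("strand contains a base that is never flowed")
    signals = []
    position = 0
    while position < len(new_strand):
        for base in flow_order:
            count = 0
            # A run of identical bases is incorporated in a single flow,
            # which gives a proportionally stronger signal.
            while position < len(new_strand) and new_strand[position] == base:
                count += 1
                position += 1
            signals.append((base, count))
            if position == len(new_strand):
                break
    return signals

def decode(signals):
    """Rebuild the sequence from the flow signals."""
    return "".join(base * count for base, count in signals)

signals = flowgram("TTACCCG")   # hypothetical strand being synthesised
print(signals)                  # [('T', 2), ('A', 1), ('C', 3), ('G', 1)]
print(decode(signals))          # TTACCCG
```

In this idealised version a run of 7 identical bases is no problem. In practice the signal is not perfectly proportional, so long runs are harder to count.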
DNA amplification:
PCR (polymerase chain reaction): Amplification of 1 specific DNA fragment. You define the fragment length by the
primers (you need sequence information), synthesise DNA and you have exact amplification of the
fragment. Separate strands → anneal primers → add DNA polymerase & nucleotides → synthesis.
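How the two primers define the fragment can be tried out in silico. A minimal sketch, assuming exact primer matches only (real primers can also anneal with mismatches); the function names, template and primers are made up for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the amplified fragment (top strand, 5'->3'), or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand, so on the top strand
    # we search for its reverse complement, downstream of the forward primer.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]

template = "GGGATGCCATTTGACCGGTAAACCC"   # hypothetical template, 5'->3'
product = in_silico_pcr(template, "ATGCC", "TTACC")
print(product, len(product))             # ATGCCATTTGACCGGTAA 18
```

The product runs from the 5’ end of one primer to the 5’ end of the other, so the primers fix both the position and the length of the fragment.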
First cycle: the template is still quite big. In the third cycle you get your specific
fragment. Third cycle: 8 fragments, but only 2 of them are the one you want → you need
more cycles. Fast, sensitive, <20 kb. Risk with homologous DNA sequences (chance
that primers will anneal somewhere else).
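The count of 8 fragments, 2 of them the wanted one, after three cycles can be checked with a small bookkeeping simulation. A sketch under ideal assumptions (every strand is copied in every cycle); the strand labels "long", "half" and "target" are names invented here: "long" is an original strand, "half" has one end defined by a primer, "target" has both ends defined.

```python
# What kind of new strand is made when a strand of a given kind is the template.
PRODUCT_OF = {"long": "half", "half": "target", "target": "target"}

def pcr_cycles(n_cycles):
    """Return the list of double-stranded molecules after n cycles."""
    strands = ["long", "long"]          # one double-stranded template
    duplexes = [("long", "long")]
    for _ in range(n_cycles):
        # Denature: every strand becomes a template and gets a new partner.
        duplexes = [(strand, PRODUCT_OF[strand]) for strand in strands]
        strands = [strand for pair in duplexes for strand in pair]
    return duplexes

def count_targets(duplexes):
    """Molecules in which both strands have exactly the target length."""
    return sum(1 for a, b in duplexes if a == "target" and b == "target")

for n in range(1, 6):
    molecules = pcr_cycles(n)
    print(n, len(molecules), count_targets(molecules))
```

In this model the total doubles every cycle (2^n) and the number of exact target fragments is 2^n minus 2n, so the unwanted longer products quickly become a negligible fraction.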
RT-PCR (Reverse transcription PCR): Cloning of cDNA (copy of mRNA). You know what is expressed,
and there are no introns → you know what the exons are.
AFLP (Amplified fragment length polymorphism): Makes use of PCR. You cannot PCR entire genomes, so
you have to make a (random) subset to compare. You make use of restriction sites (for example
MseI) and add a linker/adaptor to them (for the PCR primer). You make sets of primers that amplify different
sequences. This gives a selection of fragments that you can visualise and compare between different
species/individuals.
VNTR (Variable number tandem repeats): We have lots of repeats; the sequence is pretty similar but the
number of repeats is not. You can do PCR around such a repeat area and see the number of repeats
→ forensics.
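Counting the repeats in a PCR product that spans a VNTR could look roughly like this. The flanking sequences, the repeat unit and the function name are all hypothetical; on a gel you would not read the sequence but infer the same number from the fragment length.

```python
def count_tandem_repeats(amplicon, unit, left_flank, right_flank):
    """Number of copies of `unit` between the two flanking sequences."""
    if not (amplicon.startswith(left_flank) and amplicon.endswith(right_flank)):
        raise ValueError("flanking sequences not found at the ends")
    repeat_region = amplicon[len(left_flank):len(amplicon) - len(right_flank)]
    count, remainder = divmod(len(repeat_region), len(unit))
    if remainder or repeat_region != unit * count:
        raise ValueError("region is not a clean tandem repeat of the unit")
    return count

# Two hypothetical individuals with the same flanks but different repeat numbers.
person_1 = "ACGT" + "GATA" * 5 + "TTGC"
person_2 = "ACGT" + "GATA" * 8 + "TTGC"
print(count_tandem_repeats(person_1, "GATA", "ACGT", "TTGC"))  # 5
print(count_tandem_repeats(person_2, "GATA", "ACGT", "TTGC"))  # 8
print(len(person_2) - len(person_1))                           # 12
```

The length difference (12 bp here, 3 extra repeats of 4 bp) is what shows up as a different band position.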
SNP (Single nucleotide polymorphisms): Can be linked to certain traits. We can compare genomes of
people and look for traits. Not a mutation, it’s just variation between all of us → some of them have a
meaning because they are linked to something else.
Barcodes: Sequences added to samples to separate them later on.
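Separating pooled samples by barcode (demultiplexing) can be sketched like this. The barcodes, sample names and function name are invented for the example, and barcodes are assumed to sit at the start of each read and to match exactly.

```python
def demultiplex(reads, barcodes):
    """Sort reads per sample and strip the barcode from each read."""
    samples = {name: [] for name in barcodes.values()}
    samples["unassigned"] = []
    for read in reads:
        for barcode, name in barcodes.items():
            if read.startswith(barcode):
                samples[name].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            samples["unassigned"].append(read)
    return samples

barcodes = {"AACC": "sample_1", "GGTT": "sample_2"}    # hypothetical barcodes
reads = ["AACCATGCA", "GGTTCCGTA", "AACCTTTGA", "CCCCATATA"]
print(demultiplex(reads, barcodes))
```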