Possible exam questions Laura van den End
Possible exam questions
Sequencing of the genome of broad bean or faba bean (Vicia faba), 13 Gbp long (~4 times
the human genome), has lagged almost a decade behind that of other legume species,
including that of common bean (Phaseolus vulgaris, 0.6 Gbp). A high-quality reference
genome has now finally been generated.
A. Speculate why it might have taken so long.
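• The faba bean has a very large genome (13 Gbp) compared to the common bean
(0.6 Gbp), roughly 20 times larger. This brings challenges: with whole-genome shotgun
sequencing, for example, the genome has to be fragmented, sequenced, and then
reassembled from the overlapping reads, which becomes far more time-consuming as
genome size grows. Repetitive regions, which are hard to place with short reads,
probably added to the difficulty.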
-
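To put the size difference in numbers, here is a rough sketch of how many reads a given coverage would require. The genome sizes come from the question; the 30x coverage and the 150 bp read length are illustrative assumptions, not values from these notes.

```python
# Rough estimate of the sequencing effort for a genome of a given size.
# Coverage depth and read length below are illustrative assumptions.

def reads_needed(genome_bp: float, coverage: float, read_len_bp: float) -> int:
    """Number of reads so that coverage = reads * read length / genome size."""
    return round(genome_bp * coverage / read_len_bp)

FABA_BP = 13e9      # Vicia faba, from the question
COMMON_BP = 0.6e9   # Phaseolus vulgaris, from the question

size_ratio = FABA_BP / COMMON_BP
faba_reads = reads_needed(FABA_BP, coverage=30, read_len_bp=150)
common_reads = reads_needed(COMMON_BP, coverage=30, read_len_bp=150)

print(f"faba bean genome is about {size_ratio:.1f}x larger")
print(f"reads needed: faba {faba_reads:.2e}, common bean {common_reads:.2e}")
```

The number of reads scales linearly with genome size, but assembly effort grows faster than that because every read has to be compared against many others.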
B. Describe schematically the sequencing strategy and equipment you would propose
today to obtain an independent high-quality version of the sequence in much shorter
time. Discuss your approach critically.
• Illumina is fast and high-throughput but produces short reads. For long reads,
PacBio or nanopore sequencing can be used. Combining short and long reads helps
to overcome problems with repetitive regions. The genome can then be assembled
computationally. Sometimes gaps still remain, but they can be filled with the walking
method, using primers flanking the gap.
-
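Why long reads help with repetitive regions can be shown with a toy example: a read that lies entirely inside a repeat matches several positions, while a read that spans the repeat and its unique flanks matches only one. The sequences and the helper `find_all` are made up for this sketch.

```python
# Toy illustration: a read shorter than a repeat maps to several places,
# a read that spans the repeat plus unique flanks maps to exactly one.
# The sequence below is made up for the example.

def find_all(genome: str, read: str) -> list[int]:
    """Return every start position where the read matches the genome exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions

REPEAT = "ACGTACGTAC"
genome = "TTGCA" + REPEAT + "GGATC" + REPEAT + "CCTAG"

short_read = REPEAT[:8]                 # lies entirely inside the repeat
long_read = "GGATC" + REPEAT + "CCTAG"  # spans the repeat and both flanks

short_hits = find_all(genome, short_read)
long_hits = find_all(genome, long_read)
print("short read positions:", short_hits)
print("long read positions:", long_hits)
```

The short read has two equally good placements, so an assembler cannot tell which copy of the repeat it came from; the long read has a single placement.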
C. Describe the operating principles of the proposed sequencing platform(s).
• Illumina: adapters are ligated to the DNA fragments; these adapters are complementary
to primers attached to a glass slide (the flow cell). The fragments hybridize to the
primers and the primers are extended; denaturation and washing leave DNA strands
attached to the glass slide with two known ends. Next, bridge amplification is used to
amplify the fragments and form polonies (clusters). In paired-end sequencing, each
fragment is sequenced from both ends. Illumina uses the sequencing-by-synthesis
method, in which the nucleotides are fluorescently labeled and the fluorescence
indicates which nucleotide was incorporated.
• PacBio (SMRT sequencing) is performed in wells in an aluminium plate. It is a
third-generation sequencing technique that uses a phi29 polymerase immobilized in the
well. Fluorescently labeled nucleotides are incorporated one by one, and each
incorporation is detected as fluorescence. This is also sequencing by synthesis. The
well is so small that only the polymerase fits, and aluminium is used so that light
cannot pass through the well. To reduce the error rate, consensus sequences can be
used: the DNA is circularized so that the polymerase can read the same piece
multiple times.
• Nanopore sequencing is also a third-generation sequencing technique. It uses a
pore in a biological or synthetic membrane through which a specific current flows. The
individual nucleotides that pass through the pore, driven by an electric field, cause a
characteristic change in the current that can be detected.
• Walking method: use 2 primers flanking the gap/fragment and sequence about 500 nt
from each; 2 new primers are then designed based on the newly obtained sequence.
This continues until the reads from both sides overlap.
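The walking method can be sketched as a loop: design a primer on the end of the known sequence, read on from there, and repeat. This is a simplified toy that walks from one side only (the notes describe walking from both sides); the random template, the 20 nt primer length, and the function `walk_gap` are assumptions for illustration. Only the roughly 500 nt per round comes from the notes.

```python
import random

# Toy simulation of primer walking across a gap, from the left side only.
# The template is random and all sizes are illustrative assumptions,
# except the ~500 nt per round mentioned in the notes.

READ_LEN = 500    # nt obtained per sequencing round
PRIMER_LEN = 20   # nt of known sequence used to design the next primer

def walk_gap(template: str, left_known: int, right_known_start: int):
    """Extend the known left part round by round until it overlaps the right flank.

    template          : the true sequence (unknown to the experimenter in reality)
    left_known        : number of bases already known on the left of the gap
    right_known_start : position where the known right flank begins
    """
    known = template[:left_known]
    primers = []
    while len(known) < right_known_start + PRIMER_LEN:
        # The primer is designed on the last PRIMER_LEN known bases.
        primers.append(known[-PRIMER_LEN:])
        start = len(known)  # the new read continues right after the primer
        new_bases = template[start:start + READ_LEN]
        if not new_bases:   # end of template reached, nothing left to read
            break
        known += new_bases
    return known, primers

random.seed(1)
template = "".join(random.choice("ACGT") for _ in range(3000))
closed, primers = walk_gap(template, left_known=400, right_known_start=2100)
n_rounds = len(primers)
print(f"gap closed after {n_rounds} rounds, {len(closed)} nt known")
```

Each round depends on the result of the previous one, which is why the method is slow and is used only to close the last remaining gaps.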
Genome, proteome, metabolome analysis 1/7
This year, a high-quality reference genome was published of the broad bean (Vicia faba), 13
Gbp long (~4 times the human genome). A next step would be to analyse the transcriptome
of different plant tissues to further unravel relevant properties of this crop species
at the molecular and functional level.
A. What state-of-the-art methodology and equipment would you propose to analyse the
transcriptome? Briefly describe the working principles.
• Previously, transcriptome analysis was done with hybridization-based techniques (for
example cDNA or oligonucleotide microarrays), sequence-based techniques (SAGE, MPET,
DGE), or PCR-based techniques. Today, second-generation (454, Illumina) or
third-generation (SMRT, nanopore) sequencing techniques would be more appropriate.
These sequencing techniques are applied to cDNA to give insight into expression levels.
• Illumina: adapters are ligated to the DNA fragments; these adapters are complementary
to primers attached to a glass slide (the flow cell). The fragments hybridize to the
primers and the primers are extended; denaturation and washing leave DNA strands
attached to the glass slide with two known ends. Next, bridge amplification is used to
amplify the fragments and form polonies (clusters). In paired-end sequencing, each
fragment is sequenced from both ends. Illumina uses the sequencing-by-synthesis
method, in which the nucleotides are fluorescently labeled and the fluorescence
indicates which nucleotide was incorporated.
• PacBio (SMRT sequencing) is performed in wells in an aluminium plate. It is a
third-generation sequencing technique that uses a phi29 polymerase immobilized in the
well. Fluorescently labeled nucleotides are incorporated one by one, and each
incorporation is detected as fluorescence. This is also sequencing by synthesis. The
well is so small that only the polymerase fits, and aluminium is used so that light
cannot pass through the well. To reduce the error rate, consensus sequences can be
used: the DNA is circularized so that the polymerase can read the same piece
multiple times.
• Nanopore sequencing is also a third-generation sequencing technique. It uses a
pore in a biological or synthetic membrane through which a specific current flows. The
individual nucleotides that pass through the pore, driven by an electric field, cause a
characteristic change in the current that can be detected.
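The consensus idea described for PacBio can be sketched as a majority vote over repeated passes of the same circular template. This toy assumes the passes are already aligned, equally long, and contain substitution errors only, which real data does not guarantee; the sequences and the function `consensus` are made up for illustration.

```python
from collections import Counter

# Toy consensus over repeated passes of the same circular template.
# Passes are assumed to be already aligned and of equal length
# (substitution errors only), which real data does not guarantee.

def consensus(passes: list[str]) -> str:
    """Majority vote per position across equally long, aligned passes."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))

true_seq = "ATGGCGTACGTTAGC"
passes = [
    "ATGGCGTACGTTAGC",  # error-free pass
    "ATGGAGTACGTTAGC",  # error at position 4
    "ATGGCGTACGTTCGC",  # error at position 12
    "TTGGCGTACGTTAGC",  # error at position 0
    "ATGGCGTACCTTAGC",  # error at position 9
]

ccs = consensus(passes)
print(ccs, ccs == true_seq)
```

Because the errors fall at different positions in each pass, the vote recovers the correct base everywhere even though four of the five passes contain a mistake.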
-
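How sequencing reads on cDNA translate into expression levels can be illustrated with a minimal TPM (transcripts per million) calculation. TPM is one common normalisation and is not named in these notes; the gene names, lengths, and counts are made-up illustration values.

```python
# Minimal TPM (transcripts per million) calculation from read counts.
# Gene names, lengths and counts are made-up illustration values.

def tpm(counts: dict[str, int], lengths_bp: dict[str, int]) -> dict[str, float]:
    """Normalise read counts for transcript length, then scale to one million."""
    rate = {g: counts[g] / (lengths_bp[g] / 1000) for g in counts}  # reads per kb
    total = sum(rate.values())
    return {g: r / total * 1e6 for g, r in rate.items()}

counts = {"geneA": 1000, "geneB": 1000, "geneC": 500}
lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 500}

values = tpm(counts, lengths_bp)
for gene, value in values.items():
    print(f"{gene}: {value:.0f} TPM")
```

geneA and geneB have the same read count, but geneB is twice as long, so its expression level per transcript is half that of geneA.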
B. Critically discuss the strengths and weaknesses of the suggested approaches.
• Second-generation sequencing generally produces short reads, while third-generation
sequencing generally produces long reads. Illumina, however, is a high-throughput
technique, which makes it suitable for routine transcriptome studies.
• Short reads may pose challenges in resolving complex transcript structures,
while long reads can give insight into isoforms and alternative splicing.
• Third-generation sequencing has higher error rates than second-generation
sequencing, but it does not require PCR amplification, which
gives a more direct read of the mRNA.
• Nanopore sequencing has the advantage of real-time sequencing, but its
base-level accuracy is lower than that of Illumina.
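The point about isoforms can be made concrete with a toy compatibility check: a read is compatible with an isoform if its exons appear there as one contiguous stretch. The exon names, isoform structures, and the function `compatible` are hypothetical.

```python
# Toy example: which isoforms is a read compatible with?
# Exon names and isoform structures are hypothetical.

ISOFORMS = {
    "isoform_1": ["E1", "E2", "E3", "E4"],
    "isoform_2": ["E1", "E3", "E4"],        # exon E2 skipped
    "isoform_3": ["E1", "E2", "E4"],        # exon E3 skipped
}

def compatible(read_exons: list[str]) -> list[str]:
    """Isoforms containing the read's exons as one contiguous stretch."""
    hits = []
    n = len(read_exons)
    for name, exons in ISOFORMS.items():
        if any(exons[i:i + n] == read_exons for i in range(len(exons) - n + 1)):
            hits.append(name)
    return hits

short_read = ["E1"]                       # short read inside a shared exon
junction_read = ["E1", "E2"]              # short read over one junction
long_read = ["E1", "E2", "E3", "E4"]      # long read covering the full transcript

print(compatible(short_read))
print(compatible(junction_read))
print(compatible(long_read))
```

The short reads remain ambiguous between several isoforms, while the full-length read identifies a single one.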