Following the HGP, it was reported that the human genome contains a number of repetitive DNA
elements, binding sites for various ligands, non-protein coding regions and also protein-coding
genes. In relation to the protein coding genes, describe their characteristics, with specific
examples where applicable, as observed in the human genome. (16)
Protein-coding genes are transcribed into mRNA. After processing, ribosomes translate mature
mRNA to polypeptide chains. Each triplet of nucleotides in the message corresponds to one amino
acid. Protein-coding genes occupy a small fraction of the human genome (2-3%) and are distributed
unevenly across the different chromosomes. Many protein-coding genes appear in multiple copies,
either identical or diverged into families. Their expression follows the central dogma of molecular
biology, as encapsulated by Francis Crick: DNA makes RNA makes protein. Transcription of a protein-coding
gene into RNA is followed in eukaryotes by splicing to form a mature mRNA. The ribosome
synthesizes a polypeptide chain according to the sequence of codons in mRNA. The protein folds
spontaneously to a native three-dimensional structure that accounts for its biological function.
Amino acid sequences are always stated in order from N-terminal to C-terminal. This is also the
order in which ribosomes synthesize proteins: ribosomes add amino acids to the free carboxyl
terminus of the growing chain.
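The codon-by-codon reading described above can be illustrated with a minimal sketch. The codon table below is deliberately partial (only the codons used in the example, taken from the standard genetic code), and the mRNA string and the function name `translate` are hypothetical, chosen purely for illustration.

```python
# Partial codon table from the standard genetic code.
# Only the codons needed for the illustrative mRNA below are listed.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCA": "Ala", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def translate(mrna):
    """Read an mRNA 5'->3' from the first AUG, one triplet at a time.

    Returns the residues in N-terminal to C-terminal order, which is
    also the order in which they are appended to the growing chain.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = CODON_TABLE.get(codon, "Xaa")  # Xaa = codon not in this partial table
        if residue == "Stop":
            break
        peptide.append(residue)  # new residue joins the C-terminal end
    return peptide

# Hypothetical message: 5'-GG AUG UUU GGC GCA AAA UAA GG-3'
print("-".join(translate("GGAUGUUUGGCGCAAAAUAAGG")))  # Met-Phe-Gly-Ala-Lys
```

Each new residue is appended to the end of the list, mirroring the addition of amino acids to the free carboxyl terminus.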
Which two general approaches are typically followed when sequencing genomes? (2)
BAC-to-BAC genome sequencing and the whole-genome shotgun method (WGS).
Distinguish between the steps applied when performing the two approaches. (6)
BAC-to-BAC:
First, cut the DNA into fragments of about 150 kb.
Then, clone the fragments into BACs (bacterial artificial chromosomes).
Identify a series of clones in the library that contain overlapping fragments, a process known as
‘fingerprinting’.
Using the overlaps, order the clones according to their position along the original large target DNA
molecule. Subfragment each clone, sequence the fragments and assemble them.
WGS method:
First, cut the genomic DNA into fragments of 2 kb and 10 kb in length.
Then, clone the fragments into plasmid vectors to create a plasmid library.
Partially sequence the insert DNA to generate reads of about 1500 bp (750 bp from each end)
from each clone.
Last, assemble all the sequence data in silico by identifying overlaps
between random reads.
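The final in silico step can be sketched as a toy greedy overlap assembler. This is a simplified illustration, not the algorithm used by any real assembler: it assumes error-free reads from a single strand, and the function names, the `min_len` parameter, and the example reads are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(reads) > 1:
        best = (0, None, None)  # (overlap length, index of a, index of b)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

# Hypothetical random reads covering the target ATGCGTACGTTAGC
print(greedy_assemble(["GTTAGC", "ATGCGTAC", "GTACGTTA"]))  # ['ATGCGTACGTTAGC']
```

Because the merge decision rests only on overlap length, a repeat longer than a read would produce equally good but incorrect overlaps, which is why repetitive regions complicate assembly.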
Step                                          ‘BAC-to-BAC’ method   WGS
1. Make random cuts to produce fragments of   150 kb                2-10 kb (2000-10 000 bp)
2. Make library in                            BACs                  plasmids
3. Fingerprint, overlap, and order clones     BAC clones            Skip
4. Partially sequence 1500 bp subfragments of individual clones (both methods).
5. Assemble overlaps by computer (both methods).
What concerns were raised about the applicability of the approach developed at Celera
Corporation by Craig Venter and his co-workers? (2)
First, it was believed that the WGS method would not work for genomes that contain many repetitive
sequences (i.e. complex eukaryotic genomes), since these regions are known to create problems
during assembly. Second, it was believed that highly skewed base compositions would also
complicate application of the WGS method; e.g. the genome of Plasmodium falciparum
is approximately 80% AT. However, many such genomes have since been sequenced
despite these concerns.
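Base-composition skew of the kind mentioned above can be quantified as the AT fraction of a sequence. The following is a minimal sketch; the function name `at_content` and the short example sequence are hypothetical and do not represent real Plasmodium falciparum data.

```python
def at_content(seq):
    """Fraction of A and T among the unambiguous bases (A, C, G, T) in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0  # empty or fully ambiguous sequence
    return sum(base in "AT" for base in counted) / len(counted)

# Hypothetical 10-base fragment with 8 A/T bases
print(at_content("ATATTAGCAT"))  # 0.8
```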
Is there any correlation between the complexity of an organism and the nature of the genetic
information it contains within its cells? Explain. (4)
Yes, correlations exist between organism complexity and both the amount of DNA per cell and the
estimated number of genes per genome. For the former (i.e. amount of DNA per cell), prokaryotes
generally have less DNA per cell than eukaryotes; for the latter, microbes generally have fewer
genes per genome than metazoans. Also, within eukaryotes, yeast has less DNA per cell and fewer
genes than mammals.
Are there any deviations from the general rule explained above? Elaborate with examples. (4)
Yes, there are deviations from both rules. For example, Amoeba dubia (a single-celled organism) and
the marbled lungfish have genomes that are respectively 200 and 43 times larger than that of humans.
Similarly, the pufferfish (while having a significantly smaller genome than humans) seems to have
about the same number of genes as humans, whereas the nematode C. elegans (100 Mbp) has
more genes in its genome than Drosophila melanogaster, the fruit fly (122 Mbp). Regarding plants,
clivia has a genome that is 6 times larger than that of humans.
Based on comparative studies in many genomes, what evolutionary routes can a gene take?
Describe. (5)
A gene may pass to descendants, accumulating favourable (or unfavourable) mutations or
drifting neutrally.
A gene may be lost.
A gene may be duplicated, followed by divergence or by loss of one of the pair.
A gene may undergo horizontal transfer to an organism of another species.
A gene may undergo complex patterns of fusion, fission, or rearrangement, perhaps involving
regions encoding individual protein domains.
Describe and elaborate the major results that were gathered when determining the taxonomy of
living organisms based on sequence data. (14)
First, all life on earth has enough general similarity to show that all life forms had a common origin.
This conclusion was made following observations that the basic structures and general biological
roles of DNA, RNA, and proteins are universally similar. In addition, the genetic code is nearly
universal in terms of its interpretation. Second, on the basis of 16S rRNA sequences, Carl Woese
divided living things most fundamentally into three domains, represented in the tree of life:
bacteria, archaea, and eukarya. From this tree, it can be seen that while archaea and bacteria are both unicellular
organisms that lack nuclei, at the molecular level, archaea are somewhat more closely related to