Haplotype
Clusters of SNPs that appear to be inherited in tandem are called haplotypes. Haplotypes
are local combinations of genetic polymorphisms that tend to be co-inherited.
Sequence coverage vs assembly vs contig
Sequence coverage is the ratio of the total number of bases sequenced to the genome
length. Achieving a complete and accurate assembly of a novel genome requires collecting
data with a coverage of 30 or 50 (i.e. each base is sequenced 30 or 50 times on average).
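The coverage ratio defined above can be sketched in a few lines. The function names and the read and genome sizes below are hypothetical, chosen only to illustrate the arithmetic.

```python
import math


def sequence_coverage(read_lengths, genome_length):
    """Coverage = total number of bases sequenced / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(read_lengths) / genome_length


def reads_needed(target_coverage, genome_length, read_length):
    """Number of fixed-length reads needed to reach a target coverage."""
    return math.ceil(target_coverage * genome_length / read_length)


# Hypothetical run: 1,000,000 reads of 150 bases over a 5 Mb genome.
coverage = sequence_coverage([150] * 1_000_000, 5_000_000)
print(coverage)  # 30.0
```

Read the other way round, reads_needed(30, 5_000_000, 150) gives the number of 150-base reads required for a coverage of 30 on the same hypothetical genome.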
Sequence assembly is the inference of the complete sequence of a region from the data
on individual fragments from the region, by piecing together overlaps.
A contig is the result of merging the sequences of overlapping fragments to form a long
connected region.
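As a minimal sketch of "piecing together overlaps", the code below merges two fragments when the end of one matches the start of the other. It assumes error-free reads on the same strand and an arbitrary minimum overlap; real assemblers handle sequencing errors, repeats and reverse complements, so this only illustrates the idea.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, max(min_overlap, 1) - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_fragments(left, right, min_overlap=3):
    """Merge two fragments into one contig, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# Hypothetical fragments sharing the 4-base overlap "GTAC".
contig = merge_fragments("ATGCGTAC", "GTACCTGA")
print(contig)  # ATGCGTACCTGA
```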
Single-end read vs paired-end read
Single-end read – determination of the nucleotide sequence from only one end of a
template DNA molecule.
Paired-end read – determination of the nucleotide sequence from both ends of a template
DNA molecule (with a stretch of undetermined bases between the two reads, whose length
is known only approximately).
De novo sequencing vs resequencing vs exome sequencing
Determining the complete sequence of the first genome from a species is called de novo
sequencing. Resequencing is used to determine the genomic variations of a sample in
relation to a common reference sequence. Exome sequencing is a genomic technique for
sequencing all of the protein-coding regions (exons) in a genome.
Linkage vs linkage disequilibrium
Linkage concerns the distribution of loci among chromosomes, whereas linkage
disequilibrium (LD) concerns the distribution of allelic patterns in populations.
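One common way to quantify LD between two biallelic loci is the coefficient D = p(AB) - p(A)p(B), often reported together with r². Neither measure is named in these notes, so treat the sketch below as an illustration under that assumption; the haplotype counts are hypothetical.

```python
def linkage_disequilibrium(counts):
    """Return (D, r_squared) from haplotype counts keyed 'AB', 'Ab', 'aB', 'ab'."""
    total = sum(counts.values())
    p_ab = counts["AB"] / total
    p_a = (counts["AB"] + counts["Ab"]) / total  # frequency of allele A
    p_b = (counts["AB"] + counts["aB"]) / total  # frequency of allele B
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denominator


# Hypothetical sample of 100 haplotypes in which A and B tend to occur together.
d, r_squared = linkage_disequilibrium({"AB": 40, "Ab": 10, "aB": 10, "ab": 40})
print(round(d, 3), round(r_squared, 3))  # 0.15 0.36
```

A value of D near zero means the allelic combinations occur at the frequencies expected if the loci were independent in the population.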
Transposon vs retrotransposon
Retrotransposon: a self-amplifying sequence in a genome, derived from reverse
transcription (it copies itself via an RNA intermediate).
Transposon: a DNA sequence that can change position within a genome.
Restriction fragments vs restriction map
Restriction map – specification of the distribution in a DNA molecule of cutting sites of
one or more restriction enzymes.
Restriction fragment – a short fragment of DNA produced by cutting DNA with a
restriction endonuclease.
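The relationship between the two terms can be sketched as follows: the cut positions form a simple restriction map, and slicing the sequence at those positions yields the restriction fragments. The example assumes a linear molecule and uses EcoRI (recognition site GAATTC, cut after the first G); the input sequence is hypothetical.

```python
def cut_positions(sequence, site, cut_offset):
    """Positions at which the enzyme cuts: a minimal restriction map."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence, site, cut_offset):
    """Restriction fragments produced by cutting a linear molecule at every site."""
    boundaries = [0] + cut_positions(sequence, site, cut_offset) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(boundaries, boundaries[1:])]


# EcoRI recognises GAATTC and cuts between G and AATTC (offset 1).
dna = "AAGAATTCGGGGAATTCTT"
print(cut_positions(dna, "GAATTC", 1))  # [3, 12]
print(digest(dna, "GAATTC", 1))  # ['AAG', 'AATTCGGGG', 'AATTCTT']
```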
C-value
The C-value is the amount of DNA, in picograms, contained within a haploid nucleus.
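A C-value in picograms can be turned into an approximate genome size in base pairs. The sketch below assumes the commonly used conversion of roughly 978 million base pairs per picogram of double-stranded DNA, which is not stated in these notes; the function name and input value are hypothetical.

```python
# Assumed conversion factor: 1 pg of double-stranded DNA is roughly 978 Mbp.
BASE_PAIRS_PER_PICOGRAM = 0.978e9


def c_value_to_base_pairs(c_value_pg):
    """Approximate haploid genome size in base pairs from a C-value in picograms."""
    if c_value_pg < 0:
        raise ValueError("C-value cannot be negative")
    return c_value_pg * BASE_PAIRS_PER_PICOGRAM


# Hypothetical haploid nucleus containing 2 pg of DNA.
print(c_value_to_base_pairs(2.0) / 1e9, "Gbp")
```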
Homologues vs orthologues vs paralogues
Homologues – regions of genomes, or portions of proteins, that are derived from a
common ancestor.
Paralogues – related genes (i.e. homologues) that have diverged to provide separate
functions within the same species.
Orthologues – homologues that perform the same function in different species.
For instance, the α and β chains of human haemoglobin are paralogues, but human and
horse myoglobin are orthologues.
Neo- vs sub- vs nonfunctionalization
Nonfunctionalization
Following duplication, one copy may simply become silenced by degenerative/deleterious
mutations, while the other copy retains the original function
Neofunctionalization
Following duplication, one copy may acquire a novel, beneficial function and become
preserved by natural selection, while the other copy retains the original function
Subfunctionalization
Following duplication, both copies may become partially compromised by mutation
accumulation, to the point at which their combined capacity is reduced to the level of the
single-copy ancestral gene, so both copies must be retained.
For example, the haemoglobin genes: the α and β genes descend from an ancestral gene,
and neither can function independently as a monomeric protein molecule
(i.e. 2α or 2β versus 2α2β).
Ka/Ks = 1 vs >1 vs <1; neutral evolution vs +ve selection vs -ve selection
The ratio Ka/Ks distinguishes the roles of selective pressure and drift in the divergence
of genes after duplication. Ka is the rate of nonsynonymous (amino-acid-changing)
substitutions and Ks is the rate of synonymous (silent) substitutions:
Ka/Ks ≈ 1 neutral evolution: silent and amino-acid-changing substitutions have occurred
to approximately equal extents.
Ka/Ks > 1 positive (adaptive) selection: amino-acid-changing substitutions are more
prevalent than silent ones, implying that selective pressures are active and the
substitutions are advantageous.
Ka/Ks < 1 purifying (negative) selection: amino-acid-changing substitutions are
underrepresented, implying that the sequence is optimized fairly rigidly, with relatively
little tolerance for mutation.
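The three cases above can be expressed as a small classifier. The tolerance used to decide when a ratio counts as "approximately 1" is an arbitrary assumption made for this sketch, not a standard threshold, and the Ka and Ks values are hypothetical.

```python
def classify_ka_ks(ka, ks, tolerance=0.1):
    """Classify the selection regime from Ka and Ks using the Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if abs(ratio - 1) <= tolerance:  # assumed cut-off for "approximately 1"
        return "neutral evolution"
    if ratio > 1:
        return "positive selection"
    return "purifying selection"


# Hypothetical gene pairs.
print(classify_ka_ks(0.02, 0.20))  # purifying selection
print(classify_ka_ks(0.30, 0.10))  # positive selection
print(classify_ka_ks(0.10, 0.10))  # neutral evolution
```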
Polyploid vs autopolyploid vs allopolyploid
Polyploids contain multiple complete sets of chromosomes.
There are two types of polyploids:
Autopolyploids contain multiple copies of genomes from the same parental species
(e.g. highbush blueberry).
Allopolyploids contain multiple copies of genomes from different parental species
(e.g. Triticum aestivum).
Contrast between the challenges of gene identification in prokaryotes vs eukaryotes
Gene identification is easier in prokaryotes than in eukaryotes. Prokaryotes have smaller
genomes and contain fewer genes. Genes in bacteria are contiguous: they lack the introns
that eukaryotic genes have. For example, 90% of the E. coli genome is protein coding.
Protein-coding genes in higher eukaryotes are sparsely distributed and most are
interrupted by introns. Identifying the exons is one problem and assembling them is
another. Alternative splicing presents an additional difficulty. Simpler eukaryotes are
easier, e.g. the yeast genome is 67% coding.
Distinguish between two general methods of gene identification
A priori methods
Seek to recognise sequence patterns within expressed genes and the regions flanking
them. Protein coding regions will have distinctive patterns of codon statistics, including
the absence of stop codons.
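The "absence of stop codons" signal can be illustrated with a minimal open reading frame scan. This sketch looks only at the three forward-strand reading frames, treats ATG as the only start codon, and uses an arbitrary minimum length. It suits the contiguous genes of prokaryotes and ignores introns, codon statistics and the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for each ATG...stop run on the forward strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts the
    codons before the stop, including the ATG.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning this frame
    return orfs


# Hypothetical sequence with one short ORF in frame 2.
dna = "CCATGAAACCCGGGTAGTT"
print(find_orfs(dna))  # [(2, 17, 2)]
print(dna[2:17])  # ATGAAACCCGGGTAG
```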
‘Been there seen that’
Recognise regions corresponding to previously known genes, from the similarity of their
translated amino acid sequences to known proteins in another species, or by matching
expressed sequence tags.
Describe useful features of gene identification in addition to codon usage: what to
look for in the beginning, middle and end of genes (10 marks)
The 5’ exon starts with a transcription start site preceded by a core promoter site (e.g. a
TATA box at roughly -30 bp, i.e. 30 bp upstream). It is free of in-frame stop codons and
ends immediately before a GT splice signal. Rarely, a noncoding exon occurs before the
5’ exon that contains the ATG start codon (see also the Kozak sequence: consensus
ACCAUGG).