1. Open reading frame
A region of a genome that putatively codes for a protein
2. Single nucleotide polymorphism
A substitution mutation at a single nucleotide site
3. Haplotype
A group of closely-linked genes that tend to be inherited as a block
4. CRISPR/Cas
A defence mechanism against viral infection in prokaryotes, since developed into a powerful
laboratory tool for genome editing in eukaryotes, including humans
5. Read length
The number of consecutive nucleotides determined in a single sequencing read
6. Contig
The result of merging the sequences of overlapping fragments to form one long connected region
7. (Sequence) assembly vs coverage
Sequence assembly refers to aligning and merging fragments from a longer DNA sequence in
order to reconstruct the original sequence. Coverage (or depth) in DNA sequencing is the
number of reads that include a given nucleotide in the reconstructed sequence.
8. Single-end read vs paired-end read
Single-read sequencing involves sequencing DNA from only one end. Paired-end sequencing
allows users to sequence both ends of a fragment.
9. De novo sequencing vs resequencing vs exome sequencing
Determining the complete sequence of the first genome from a species is called de novo
sequencing. Resequencing is used to determine the genomic variations of a sample in relation to
a common reference sequence. Exome sequencing is a genomic technique for sequencing all of
the protein-coding regions (exons) of the genes in a genome.
10. Duplication vs divergence
Gene duplication is a major mechanism through which new genetic material is generated during
molecular evolution. It can be defined as any duplication of a region of DNA that contains a gene.
Genetic divergence is the process in which two or more populations of an ancestral species
accumulate independent genetic changes (mutations) through time, often after the populations
have become reproductively isolated for some period of time.
11. Pseudogenes vs processed pseudogenes
Pseudogenes are DNA sequences that resemble functional genes but have been inactivated by
mutations; some have since developed an alternative function. Processed pseudogenes
(retrotransposed pseudogenes) arise by retrotransposition: a portion of the mRNA or
hnRNA transcript of a gene is spontaneously reverse transcribed back into DNA and inserted into
chromosomal DNA. Because they are derived from an RNA product, processed
pseudogenes lack the upstream promoters of normal genes; thus, they are considered
"dead on arrival", becoming non-functional pseudogenes immediately upon the
retrotransposition event.
12. Post-transcriptional vs post-translational modifications
Post-transcriptional modification is the process in eukaryotic cells where primary transcript RNA
is converted into mature RNA. (5’ capping, polyadenylation, splicing)
Post-translational modification refers to the covalent and generally enzymatic modification of
proteins following protein biosynthesis.
13. Alternative splicing vs RNA editing
Alternative splicing: process that creates proteins containing amino acid sequences encoded by
different combinations of exons from a gene. RNA editing: alteration of the nucleotide sequence
of mRNA between transcription and translation.
14. Transposon vs retrotransposon
Retrotransposon: a self-amplifying sequence in a genome that spreads through reverse
transcription of an RNA intermediate.
Transposon: a DNA sequence that can change position within a genome.
15. Antisense RNA vs RNA interference
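Antisense RNA: single-stranded RNA molecule complementary to a region of mRNA. Binding of
antisense RNA to the messenger can block translation.
RNA interference: silencing of gene expression by short RNA molecules (e.g. siRNA, miRNA)
that direct the degradation or translational repression of complementary mRNA.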
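Coverage, as defined in item 7 above, is simple to compute once reads have been placed on the reconstructed sequence. The sketch below is only an illustration: it assumes each read is given as a hypothetical (start, length) pair with a 0-based start position, and the function names are made up for this example.

```python
def per_base_coverage(reads, reference_length):
    """Count how many reads include each position of the sequence.

    reads: list of (start, length) tuples, 0-based start (assumed format).
    """
    depth = [0] * reference_length
    for start, length in reads:
        # Clip the read to the bounds of the reference before counting.
        for pos in range(max(start, 0), min(start + length, reference_length)):
            depth[pos] += 1
    return depth


def mean_coverage(reads, reference_length):
    """Average depth over all positions of the sequence."""
    depth = per_base_coverage(reads, reference_length)
    return sum(depth) / reference_length


if __name__ == "__main__":
    example_reads = [(0, 4), (2, 4), (3, 5)]  # three short reads on an 8-base sequence
    print(per_base_coverage(example_reads, 8))
    print(mean_coverage(example_reads, 8))
```

When all reads lie fully inside the sequence, the mean coverage equals the total number of read bases divided by the sequence length.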
Contrast between the challenges of gene identification in prokaryotes vs eukaryotes
Gene identification is easier in prokaryotes than in eukaryotes. Prokaryotes have smaller genomes
and contain fewer genes. Genes in bacteria are contiguous: they lack the introns that eukaryotic
genes have. E.g. 90% of the E. coli genome is protein coding.
Protein-coding genes in higher eukaryotes are sparsely distributed and most are interrupted by
introns. Identifying the exons is one problem and assembling them is another. Alternative splicing
presents an additional difficulty. Simpler eukaryotes are easier, e.g. the yeast genome is 67% coding.
Distinguish between two general methods of gene identification
A priori methods
Seek to recognise sequence patterns within expressed genes and the regions flanking them. Protein
coding regions will have distinctive patterns of codon statistics, including the absence of stop codons.
‘Been there seen that’
Recognise regions corresponding to previously known genes, from the similarity of their translated
amino acid sequences to known proteins in another species, or by matching expressed sequence
tags.
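The a priori idea of looking for stretches free of stop codons can be sketched as a simple open reading frame scan. This is a minimal illustration, not a real gene finder: it looks only at the forward strand, assumes the standard stop codons and an ATG start, and the function name and the min_codons threshold are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs (0-based, end exclusive) of ATG...stop ORFs.

    Scans the three forward reading frames only. min_codons is the minimum
    number of codons before the stop codon, counting the ATG itself.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG in this frame opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    dna = "CCATGAAATTTGGGTAACC"
    for begin, end in find_orfs(dna):
        print(begin, end, dna[begin:end])
```

A scan like this suits contiguous prokaryotic genes; in eukaryotes, introns interrupt the reading frame, so a genomic ORF scan alone misses most genes.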
Describe useful features of gene identification in addition to codon usage: what to look for in the
beginning, middle and end of genes
The initial 5’ exon starts with a transcription start site preceded by a core promoter site (e.g. a
TATA box roughly 30 bp upstream). It is free of in-frame stop codons and ends immediately
before a GT splice signal. Rarely, a non-coding exon precedes the exon that contains the ATG
start codon (which typically lies within a Kozak sequence: consensus ACCAUGG).
Internal exons are also free of in-frame stop codons. They begin immediately after an AG splice
signal and end immediately before a GT splice signal.
The final 3’ exon starts immediately after an AG splice signal and ends with a stop codon, followed
by a polyadenylation signal sequence. Sometimes a non-coding exon follows the exon with the stop
codon.
All coding regions have non-random sequence characteristics, based partly on codon usage
preferences.
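The signals listed above for an internal exon (AG immediately before, GT immediately after, no in-frame stop codons) can be checked with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the candidate exon is given as 0-based coordinates into a genomic string, and because the reading frame of an internal exon is not known in advance, it accepts the exon if at least one of the three frames is free of stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(exon):
    """Return the frames (0, 1, 2) in which the exon contains no stop codon."""
    exon = exon.upper()
    frames = []
    for frame in range(3):
        codons = (exon[i:i + 3] for i in range(frame, len(exon) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames


def looks_like_internal_exon(genomic, start, end):
    """Check a candidate exon genomic[start:end] against the splice signals.

    Requires AG immediately upstream, GT immediately downstream, and at
    least one reading frame without a stop codon.
    """
    genomic = genomic.upper()
    if start < 2 or start >= end or end + 2 > len(genomic):
        return False
    has_acceptor = genomic[start - 2:start] == "AG"
    has_donor = genomic[end:end + 2] == "GT"
    return has_acceptor and has_donor and bool(open_frames(genomic[start:end]))


if __name__ == "__main__":
    region = "TTTCAG" + "GCTGAAACC" + "GTAAGT"  # intron end, exon, intron start
    print(open_frames("GCTGAAACC"))
    print(looks_like_internal_exon(region, 6, 15))
```

AG and GT dinucleotides occur very often by chance, so a check like this only narrows down candidates; codon usage statistics and similarity to known genes are still needed to rank them.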