Lecture notes From Genome to Function (AM_1290) FGTF
Course
From Genome to Function (AM_1290)
Institution
Vrije Universiteit Amsterdam (VU)
Extensive summary combining notes and lecture slides from 2023 and 2024. Part of the Master's programme Biomolecular Sciences, year 1, P1. Answers to practice exams are incorporated in the notes. Practice exam and final exam questions can be answered based on this document.
From Genome to Function
Table of contents
From Genome to Function
01-11-2023 – Genome of eukaryotes
article discussion
31/10/2024 – Transcription in Eukaryotes
06/11/2023 – Control of gene expression from transcription to RNA processing – 6/11/2024
Chapter 6/7 Alberts
09/11/2023 – The lessons to learn on epigenetics
10/11/2023 – Organelle synthesis and function/Lysosomes and Autophagy (Parkinson’s disease/PD)
13/11/2023 – The power of Life and Death (Nick Lane video)
15/11/2023 – Metabolomic analyses to understand the TCA cycle fluxes – 11/11/24
17/11/2023 – When the genome does not explain function
01-11-2023 – Genome of eukaryotes
GENERAL: genome via transcription to transcriptome, via translation to proteome.
Genome: biological information needed to construct and maintain a living organism. Most
genomes are made of DNA; a few viruses have RNA genomes.
Eukaryotes: the genome is the haploid set of chromosomes. Haploid cells have only one set of
chromosomes, so one copy of each gene, for example reproductive cells.
Genes are the functional units: DNA segments containing biological information and coding
for an RNA or polypeptide molecule.
Eukaryotic vs prokaryotic genomes: different cell organization; eukaryotes have a nucleus and
are larger. The nucleolus of eukaryotes contains the genetic information for transcribing the
ribosomal RNA molecules that make up the ribosomal subunits. Nucleus and mitochondria contain DNA.
The rest of the nuclear DNA is packaged into chromatin.
Chromatin is cytologically divided into euchromatin and heterochromatin based on
chromosomal stains. Euchromatin is characterized by a low degree of condensation and is
transcriptionally active. Heterochromatin maintains a condensed state throughout interphase
and comprises facultative and constitutive heterochromatin. Largely composed of repetitive
DNA, heterochromatin forms dark bands (AT-rich regions) after Giemsa staining.
Chromatin consists of dsDNA helices complexed with histones to form nucleosomes, each of
which contains a core of 8 histone proteins (a histone octamer). Chromatosome = nucleosome + H1 histone.
The chromatosome folds up, producing a 30-nm fiber which forms loops (300 nm). The looped
fibers are compressed and folded into a wide fiber, which is tightly coiled and forms the
chromatid of a chromosome.
c-value: amount, in picograms, of DNA contained within a haploid nucleus.
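Because the C-value is a mass, it helps to be able to convert it to a genome size in base pairs. A minimal sketch follows; the conversion factor (1 pg of double-stranded DNA is roughly 978 Mb) is a commonly used value that is not in the lecture notes, and the function names are made up for illustration.

```python
# Assumption (not from the notes): 1 pg of dsDNA corresponds to about 0.978e9 bp.
BP_PER_PG = 0.978e9


def c_value_to_bp(c_value_pg: float) -> float:
    """Convert a C-value in picograms to an approximate haploid genome size in bp."""
    return c_value_pg * BP_PER_PG


def bp_to_c_value(genome_bp: float) -> float:
    """Convert a haploid genome size in bp to an approximate C-value in picograms."""
    return genome_bp / BP_PER_PG


if __name__ == "__main__":
    # Illustrative input: a haploid genome of about 3.2e9 bp (roughly human-sized).
    print(f"{bp_to_c_value(3.2e9):.2f} pg")
```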
Genome size does not correlate with organism complexity. Similar genome sizes can be spread over
very differently sized chromosomes (and numbers of chromosomes), even in evolutionarily
related species. There is no simple relationship between chromosome number, complexity of the
organism and total genome size. The number of protein-coding genes does not track
organism complexity either.
Number of genes as a function of genome size: for a variety of bacteria and archaea, the slope of
the data line confirms a simple rule of thumb relating genome size and gene number: in these
organisms the number of genes increases linearly with genome size. The rule does not fully
hold for multicellular organisms.
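The linear rule can be written as a one-line estimate. The notes do not state the slope; the sketch below assumes the commonly quoted value of about one gene per 1,000 bp for bacteria and archaea, and the function name and example genome size are hypothetical.

```python
# Assumption (not from the notes): roughly 1 gene per 1,000 bp in prokaryotes.
GENES_PER_KB = 1.0


def estimate_gene_count(genome_size_bp: int, genes_per_kb: float = GENES_PER_KB) -> int:
    """Estimate gene number from genome size with a linear rule of thumb."""
    return round(genome_size_bp / 1000 * genes_per_kb)


if __name__ == "__main__":
    # Hypothetical bacterial genome of 4.6 Mb.
    print(estimate_gene_count(4_600_000))
```

The same function clearly overestimates for multicellular organisms, where most of the genome is non-coding, which is why the rule is only used for prokaryotes.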
Mendelian: gene = any heritable trait.
One gene can encode multiple polypeptides (alternative promoters/splicing, RNA editing
etc.). DNA can also encode functional non-coding RNAs such as snRNA and miRNA. All of these
are transcribed from DNA, but not all are protein-encoding.
Newer definition: A gene is a DNA sequence (whose component segments do not necessarily
need to be physically contiguous) that specifies one or more sequence-related RNAs/proteins
that are both evoked by GRNs and participate as elements in GRNs, often with indirect
effects, or as outputs of GRNs, the latter yielding more direct phenotypic effects.
GRN: gene regulatory network
All eukaryotes have the same basic gene set, but the number of genes in each set can differ
between species based on complexity (higher number for more complex organisms).
Complexity can still arise with similar set of genes, the more complex the organism is the
more subtle the regulation is.
Genes are not evenly distributed within a genome. Protein-encoding gene distribution is very
uneven, also within chromosomes.
Genes and gene-related sequences together make up only about 28% of the human genome; exons
account for only about 2% of the genome. Gene-related sequences include:
- Pseudogene: sequence of nucleotides that resembles a gene, but does not specify a functional
RNA/protein.
- Gene fragments: short, isolated regions from within a gene
- Truncated genes: lacking part of a complete gene
Noncoding RNAs: tRNA, rRNA, microRNA, snRNA, long non-coding RNA (lncRNA)
ncRNAs are critical elements in gene regulation and expression and contribute to
epigenetics/transcription/splicing/translation machinery.
Enhancers/promoters/silencers etc. have no single defined sequence and can be found in introns
or up/downstream of genes. These regulatory DNA sequences are crucial for control of gene expression.
Repetitive DNA: about 50% of human genome.
Interspersed repeats: individual repeats dispersed throughout genome, are a result of mobile
genetic/transposable elements; pattern set up by transposition. Vast majority is inactive or
relics. The elements can move throughout the genome by:
1. Conservative transposition: involves the excision of the sequence from its original
position followed by its reinsertion elsewhere (DNA transposon).
2. Replicative transposition: increases the number of copies. The original element
remains in place while a copy of it is inserted at a new position (retrotransposon
replicated via an RNA intermediate).
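The difference between the two modes is easiest to see as cut-and-paste versus copy-and-paste. A toy sketch, assuming the genome is modelled as a simple list of labelled segments (the function names and the "TE" label are hypothetical, and no real transposition biochemistry is modelled):

```python
def conservative_transposition(genome, element, new_index):
    """Cut-and-paste: excise the element, then reinsert it elsewhere.

    new_index refers to a position in the genome after excision.
    """
    result = list(genome)
    result.remove(element)             # excision from the original position
    result.insert(new_index, element)  # reinsertion at the new position
    return result


def replicative_transposition(genome, element, new_index):
    """Copy-and-paste: the original stays in place and a copy is inserted."""
    if element not in genome:
        raise ValueError("element not present in genome")
    result = list(genome)
    result.insert(new_index, element)  # copy number increases by one
    return result


if __name__ == "__main__":
    genome = ["geneA", "TE", "geneB", "geneC"]
    print(conservative_transposition(genome, "TE", 3))
    print(replicative_transposition(genome, "TE", 3))
```

Only the replicative mode changes the copy number, which is why retrotransposons account for so much of the interspersed repeat content.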
Tandem repeats: tandemly repeated DNA, also called satellite DNA. Repeat units are placed
next to each other in an array. The name comes from the satellite bands that DNA fragments
containing tandemly repeated sequences form when genomic DNA is fractionated by density gradient
centrifugation. This repetitive DNA is made up of long series of tandem repeats. Examples:
centromeric (and pericentromeric) regions of the chromosome, and telomeres.
Other tandem repeats: mini- and microsatellites.
- Mini: clusters up to 20 kb in length, repeat units up to 25 bp (e.g. telomeric DNA:
TTAGGG motif repeats)
- Micro: shorter, usually clusters of less than 150 bp, repeat units <13 bp. Dinucleotide
repeats are most common. Unique genetic profiles are generated by examining microsatellites,
because the combination of repeat lengths is unique to each individual, except in monozygotic
twins.
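A genetic profile is in essence a set of repeat counts at several microsatellite loci. The sketch below counts the longest uninterrupted run of a motif in a sequence; the function name and the example sequences are made up for illustration, and real profiling measures fragment lengths after PCR rather than scanning sequence text.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(match.group(0)) // len(motif) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


if __name__ == "__main__":
    # Two hypothetical alleles of the same locus, differing in CA repeat number.
    allele_1 = "GGT" + "CA" * 5 + "TT"
    allele_2 = "GGT" + "CA" * 9 + "TT"
    print(longest_tandem_run(allele_1, "CA"), longest_tandem_run(allele_2, "CA"))
```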
These tandem repeats also occur in centromeres and telomeres.
- Centromeric heterochromatin contains long tandem arrays of higher-order repeats
made up of a set of 171 bp alpha monomers. Tandem repeats are typically found in the
centromeric and pericentromeric regions of the chromosomes.
- Telomere: end of chromosome. Telomeric heterochromatin: TTAGGG repeats
constitute tandem arrays ending with a single-strand tail, G-overhang. Shelterin
complex composed of six proteins in humans associates with telomeres to form
protective T-loop.
Chain-termination DNA sequencing (Sanger sequencing)
“Sanger sequencing is the process of selective incorporation of chain-terminating
dideoxynucleotides by DNA polymerase during in vitro DNA replication.”
“Determine the order of DNA building blocks (A, T, C, G) in a DNA molecule. It works by
using modified nucleotides to terminate DNA copying, creating fragments of different
lengths. These fragments are separated and read, allowing scientists to determine the DNA
sequence.”
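The read-out logic of chain termination can be sketched in a few lines: every fragment ends in a labelled terminator, and sorting the fragments by length gives the sequence. This is an idealized toy model that assumes termination happens at least once at every position; the function names are hypothetical.

```python
import random


def chain_terminated_fragments(synthesized: str):
    """Return (length, terminal_base) for a termination event at every position."""
    return [(position + 1, base) for position, base in enumerate(synthesized)]


def read_sequence(fragments):
    """Order fragments by length (as electrophoresis does) and read the terminal labels."""
    return "".join(base for _, base in sorted(fragments))


if __name__ == "__main__":
    fragments = chain_terminated_fragments("GATTACA")
    random.Random(0).shuffle(fragments)  # the reaction tube holds an unordered mixture
    print(read_sequence(fragments))
```

Size separation alone restores the order, which is why the method needs only one label per terminating base.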
Shotgun sequencing: bacterial vectors are used to clone random fragments of long DNA
molecules. Fragments are then sequenced in parallel; reads are assembled using overlaps.
Assembly is difficult for large genomes, which poses a bottleneck. Also: data analysis becomes
disproportionately more complex as the number of fragments increases (for n fragments the
number of possible overlaps is given by 2n^2 - 2n). The error rate is high when repetitive
regions are analyzed, because repetitive sequences are hard to reassemble without
leaving out portions of a repetitive region, and separate pieces of the
same or different chromosomes can be wrongly connected.
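To see how quickly the assembly problem grows, the overlap formula can be evaluated directly, next to a naive suffix-prefix overlap check for two reads. This is a minimal sketch: the function names, the minimum overlap of 3 bases and the example reads are assumptions for illustration, and real assemblers also handle sequencing errors and reverse complements.

```python
def possible_overlaps(n: int) -> int:
    """Number of possible overlaps among n fragments, using 2n^2 - 2n from the notes."""
    return 2 * n**2 - 2 * n


def overlap_length(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of left that equals a prefix of right."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge(left: str, right: str, min_len: int = 3):
    """Join two reads on their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_len)
    return left + right[size:] if size else None


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, possible_overlaps(n))
    print(merge("ATGCGTAC", "GTACCTGA"))
```

The quadratic growth is the reason the notes call data analysis the bottleneck: ten times more fragments means roughly a hundred times more candidate overlaps.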
The mitochondrial genome was sequenced with the shotgun approach in 1981, making it the first
part of the genome to be sequenced.
De novo genome assembly: assembly without an existing reference. This worked for 83-84% of the
genome, covering the most important parts, but centromeres, telomeres and other highly
repetitive sequences are difficult.
Human Genome Project: large fragments of the human genome were cloned into bacterial artificial
chromosomes (BACs), physical clones used as an intermediate step in sequencing/mapping. BACs are
derived from the E. coli F plasmid and can accommodate >300 kb fragments. BACs are selected and
ordered to cover a chromosomal region or whole chromosomes based on physical and genetic maps.
The BACs are then fragmented, size-selected and subcloned. DNA from individual clones is used
as template for automated Sanger sequencing.
Hierarchical shotgun sequencing can avoid problems with repeat sequences by using BACs.
Assembly of a clone with two copies of a repeat sequence can result in the segment between the
copies being deleted. This is avoided by using a third clone that has just one copy of the repeat,
which makes clear that its sequence fits only after the 2nd repeat and not after the 1st.
GRC (Genome Reference Consortium): its goal is to correct the small number of regions in the
reference that are currently misrepresented, to close remaining gaps and to produce alternative
assemblies of structurally variant loci.
Reference assembly is clone-based assembly of DNA from multiple individuals.
HRG (human reference genome): haploid, composite sequence not corresponding to
any human individual. Standardized representation/model used for comparative
functions.
Second/next-generation sequencing:
Uses contigs: contiguous sequences formed by several overlapping reads with no gaps
(computational/in silico, used in the final step of genome assembly). Difficult for de novo
assembly, good for re-sequencing, e.g. the 1000 Genomes Project (finding common genetic variants).
Workflow: genomic DNA – fragmented DNA – adaptor ligation – amplification – detection. This
happens massively in parallel.
For de novo: use short reads to create contigs (based on sequence overlap), which are linked
into scaffolds and then correlated to a chromosome.
Paired-end reads: both ends of a fragment are sequenced, so a read in unique sequence before a
repeat anchors its partner. This helps the assembly go right when repetitive sequences are present.
3rd gen sequencing:
1. Oxford Nanopore sequencing
a. A transposon complex cleaves the DNA into shorter fragments and adds adaptor
sequences, which are then linked to other adaptor complexes carrying sequencing adaptors.
After clean-up, the DNA is passed through the pore.
b. Problem: error-prone, so the DNA has to be sequenced multiple times
2. Real-time DNA sequencing from single polymerase molecules; HiFi
a. Medium-long reads, very accurate. Prone to PCR bias (during library
generation): GC-rich regions will be poorly covered.
b. Start with sheared dsDNA, anneal ssDNA primers and bind
DNA polymerase. Circularized DNA is sequenced in repeated passes;
polymerase reads are trimmed of adaptors to yield subreads. Consensus and
methylation status are called from the repeated passes. Nucleotide incorporation kinetics
are measured in real time. Accuracy is about 99.9%.
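Why repeated passes raise accuracy can be illustrated with a per-position majority vote over subreads, together with the standard Phred relation Q = -10 log10(1 - accuracy), under which 99.9% accuracy corresponds to about Q30. This is a toy sketch that assumes the subreads are already aligned and of equal length; real consensus calling works on alignments with a statistical model, and the function names and example subreads are hypothetical.

```python
import math
from collections import Counter


def consensus(subreads):
    """Majority vote per position across equal-length, pre-aligned subreads."""
    if len({len(read) for read in subreads}) != 1:
        raise ValueError("subreads must be non-empty and of equal length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*subreads))


def phred_quality(accuracy: float) -> float:
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred quality score."""
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    # Five hypothetical passes over the same molecule, each with at most one error.
    passes = ["ACGTTA", "ACGATA", "ACGTTA", "TCGTTA", "ACGTTA"]
    print(consensus(passes))
    print(round(phred_quality(0.999)))
```

Random errors fall at different positions in different passes, so they are outvoted; systematic errors are not removed this way.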