Full summary MOBC Molecular Biology of the Cell course Hans vd spek: the basics, omics, regulation of gene expression, translation of mRNA into protein, gene cloning and manipulation and transgenesis. Maike Stam: epigenetics chromatin, DNA methylation, chromatin histone modifications, histone varia...
3, 4, 5, 6, 7, 8, 9 and 17
September 19, 2020
87
2019/2020
Summary
Subjects
alberts
dna
rna
protein
histones
michaelis menten
long non coding rna
epigenetics
transcription
replication
translation
molecular biology of the cell
Written for
Universiteit van Amsterdam (UvA)
Biomedische wetenschappen
Molecular biology of the cell (MOBC)
Seller
cvaarting
Lecture 1 MOLECULAR BIOLOGY THE BASICS – HANS VAN DER SPEK
1. THE CENTRAL DOGMA
The central dogma describes how the information in genes is expressed. DNA is
replicated to pass its information on, and transcribed and translated to express
it. The information flow: DNA → RNA → protein → metabolite → phenotype.
1. Replication → DNA synthesis = linking nucleotides
together to form a strand of DNA (polarity 5’ – 3’).
2. Transcription → RNA synthesis = DNA sequences are used
as a template for the synthesis of new RNA strands (polarity
5’ – 3’).
3. Translation → protein synthesis = RNA molecules direct the synthesis of
polymers of a radically different chemical class (proteins;
polarity N- to C-terminus).
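As a rough illustration of the information flow above, the transcription and translation steps can be sketched in Python. The function names and the example sequence are made up for illustration; the codon table is the standard genetic code, and replication is left out.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA (5'->3') that matches a DNA coding strand (5'->3')."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first stop codon (N- to C-terminus)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCAAGTAA"      # hypothetical coding strand
mrna = transcribe(dna)    # "AUGGCCAAGUAA"
protein = translate(mrna)  # "MAK"
```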
2. REPLICATION DNA SYNTHESIS
• DNA synthesis is semiconservative → every newly synthesized DNA double strand
consists of one old (template) and one new (daughter) strand of DNA. The old
strand is read as template from 3’ – 5’ because DNA polymerase only synthesizes from 5’ –
3’, so the daughter strand grows from 5’ – 3’.
• Each old DNA strand serves as the template for a new complementary strand. Helicase
separates the DNA helix into two strands. Topoisomerase makes nicks in
the DNA to relieve the torsional tension that builds up in the helix ahead of the replication fork,
so that DNA polymerase can continue synthesizing the new strand.
o Leading strand → easy to replicate since it is used as a continuous template.
o Lagging strand → difficult because of its opposite orientation: polymerases are not
able to form DNA from 3’ – 5’, so replication restarts repeatedly from the
replication fork. Primase makes short RNA primers, DNA polymerase extends them into
Okazaki fragments, and ligase joins the fragments.
• DNA synthesis is the formation of phosphodiester bonds while
hydrolyzing the matching dNTP molecules → the energy for the formation of the covalent
bond comes from the substrate itself. Hydrolysis of the substrate (dNTP) results in the
release of a pyrophosphate (PPi) molecule, and the nucleotide is incorporated into the
sugar-phosphate backbone.
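The strand polarity in semiconservative replication can be made concrete with a small Python sketch. It is only a toy model under stated assumptions: both strands are written 5’ to 3’, the function names and the example sequence are hypothetical, and primers, Okazaki fragments and enzymes are ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_daughter(template_5to3: str) -> str:
    """Return the daughter strand (written 5'->3') for a template written 5'->3'."""
    # The polymerase reads the template 3'->5', so walk through it in reverse.
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3.upper()))


def replicate(duplex):
    """Semiconservative replication: each new duplex keeps one old strand."""
    top, bottom = duplex  # both written 5'->3'
    return (top, synthesize_daughter(top)), (synthesize_daughter(bottom), bottom)


old_top = "ATGCC"
old_bottom = synthesize_daughter(old_top)  # "GGCAT"
duplex_1, duplex_2 = replicate((old_top, old_bottom))
```

Both daughter duplexes carry the same sequence as the parent, and each contains exactly one of the two old strands.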
3. MISTAKES IN DNA REPLICATION
Proofreading activity: Mistakes in DNA synthesis can be corrected through the
proofreading activity of the DNA polymerase complex. The complex contains an editing
site. Whenever a mistake is detected, the hydrogen bonds can be opened
up and the wrong nucleotide replaced by the right one. Synthesis runs from 5’
to 3’; the proofreading activity works from 3’ to 5’ and thereby eats back the
wrong nucleotide (exonuclease activity), after which the correct one is
incorporated.
Mismatch repair: MutS and MutL scan the DNA and detect a mismatch and a nick
in the new DNA strand → a whole stretch of nucleotides around the incorrect
base is removed (±12 nucleotides) → DNA polymerase is recruited again to
make a correct copy.
The new strand can be distinguished from the old strand because the old strand already
contains DNA methylation and/or other modifications.
4. MUTATIONS IN DNA
Chemical changes in DNA bases can cause mutations in the DNA.
Depurination → removal of a whole purine base (A or G). When the damage is not repaired, a
nucleotide pair is deleted, which results in a frameshift leading to no or different expression of proteins.
Deamination → deamination of cytosine (removal of NH3) into uracil, which can lead to missense
mutations. The C-G base pair is changed into a T-A base pair.
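How an unrepaired cytosine deamination turns into a fixed base-pair change over two rounds of replication can be sketched as follows. This is a simplified illustration: the helper names and the sequence are hypothetical, and the complementary strand is written so that position i pairs with position i rather than in 5’ to 3’ order.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}  # U pairs like T


def deaminate(strand: str, position: int) -> str:
    """Replace the cytosine at `position` by uracil."""
    if strand[position] != "C":
        raise ValueError("only cytosine is deaminated in this sketch")
    return strand[:position] + "U" + strand[position + 1:]


def copy_strand(template: str) -> str:
    """Return the complementary strand, aligned position by position."""
    return "".join(PAIRS[base] for base in template)


top = "GACTG"                      # original pair at position 2 is C-G
damaged = deaminate(top, 2)        # "GAUTG"
new_bottom = copy_strand(damaged)  # an A is placed opposite the U
new_top = copy_strand(new_bottom)  # a T is placed opposite that A
```

After the second round the original C-G pair at position 2 has become a T-A pair.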
5. APPLICATION OF DNA BIOSYNTHESIS
Determine the order of the bases A, G, C and T by DNA sequencing (whole genome),
or amplify DNA by the use of PCR (polymerase chain reaction).
DNA sequencing: Sanger sequencing
Detection of the nucleotides that form the sequence.
Process: addition of dideoxynucleotides (ddNTPs) to the DNA synthesis mix. An incorporated
ddNTP stops synthesis of that DNA fragment. Via electrophoresis the fragments can be
separated by size to visualize where synthesis terminated (bottom is 5’, top is 3’), and the
newly synthesized strand can be read from bottom to top.
Innovations: replace radioactivity by 4 fluorescent dyes and use detectors to measure the
different fluorophores to register which nucleotide is incorporated. This makes it possible to pool
all fragments together in one lane of the gel.
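The chain-termination principle can be simulated in a few lines of Python. This is a minimal sketch, assuming an arbitrary termination probability, a fixed random seed and made-up function names; it only tracks the length and the terminal base of each fragment.

```python
import random


def sanger_fragments(new_strand: str, n_molecules: int = 2000,
                     p_stop: float = 0.1, seed: int = 0):
    """Simulate chain termination; return a set of (length, terminal base)."""
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for i, base in enumerate(new_strand):
            if rng.random() < p_stop:  # a ddNTP was built in instead of a dNTP
                fragments.add((i + 1, base))
                break
    return fragments


def read_gel(fragments) -> str:
    """Read terminal bases from the shortest fragment (bottom) to the longest (top)."""
    return "".join(base for _, base in sorted(fragments))


sequence = read_gel(sanger_fragments("ATGCCGTAAGCT"))
```

Sorting by fragment length plays the role of electrophoresis: the shortest fragment corresponds to the 5’ end of the newly synthesized strand.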
Whole genome sequencing:
• Shotgun sequencing by BACs: gDNA is fragmented, cloned into bacterial
artificial chromosomes (BACs) and sequenced. The genome is constructed by stitching nucleotide
sequences together using overlaps between clones.
Contig: assembly of small DNA sequences into a continuous strand.
Disadvantage: difficult to fill gaps between contigs
Innovation: massive parallel sequencing
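Stitching reads into contigs through their overlaps can be sketched with a greedy merge. This is a toy illustration, not how production assemblers work: the reads, the minimum overlap of 3 bases and the function names are assumptions, and sequencing errors and repeats are ignored.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(reads, min_len: int = 3):
    """Greedily merge the best-overlapping pair until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # whatever is left cannot be joined: gaps remain
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


contigs = assemble(["ATGGCGTA", "CGTACGTT", "CGTTAGC"])
```

With these three overlapping reads a single contig is produced; adding a read without any overlap leaves a second contig, which corresponds to the gap problem mentioned above.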
• Pyrosequencing (massively parallel sequencing): when a dNTP is incorporated,
pyrophosphate is released and converted into ATP. The ATP is consumed by luciferase, which
releases a flash of light that can be detected.
Disadvantage: need to wash with apyrase after each flash before new dNTPs can be added.
Repeats of the same nucleotide result in brighter flashes.
Innovation: Illumina sequencing.
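The link between repeated nucleotides and flash brightness in pyrosequencing can be shown with a small flowgram simulation. This is a sketch under assumptions: the flow order "TACG", the function names and the example strand are hypothetical, and light intensity is taken to be exactly proportional to the number of incorporated nucleotides.

```python
def pyrosequence(strand: str, flow_order: str = "TACG", max_flows: int = 100):
    """Return (nucleotide, light intensity) per flow for the strand being synthesized."""
    flowgram, pos = [], 0
    for flow in range(max_flows):
        if pos >= len(strand):
            break
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # Every matching base in a row is incorporated during the same flow.
        while pos < len(strand) and strand[pos] == nucleotide:
            run += 1
            pos += 1
        flowgram.append((nucleotide, run))  # run == 0 means no flash
    return flowgram


def decode(flowgram) -> str:
    """Rebuild the sequence from the flash intensities."""
    return "".join(nucleotide * n for nucleotide, n in flowgram)


flowgram = pyrosequence("TTAGGGC")
```

In this example the G flow gives an intensity of 3 because three G nucleotides are incorporated in one flow.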
• Illumina sequencing: 4 fluorescently labelled dNTPs are added at the same time and can be
distinguished by their differences in wavelength.
• Ion torrent method: Beads coated with a DNA molecule that has been amplified many
times are placed in wells. As nucleotides are sequentially washed over the beads,
each incorporation by the polymerase causes a pH change (a proton is
released). The sequence of the DNA on each bead can be read from the pattern of pH
fluctuations, which can be measured on a chip.
• Nanopore: DNA strands are pulled through the pore by motor proteins, leading to
disruption of the ion flow. Each different nucleotide has a different effect on the
ion flow. Changes in the ion flow can be detected and validated.
• Bionano: rough scanning of differences in DNA localization.
Polymerase chain reaction (PCR)
With PCR you decide yourself which sequence you want to amplify, because you
design the primers yourself → exact amplification of a desired fragment (so the sequence
has to be known already).
1. Denaturation → separation of dsDNA into 2x ssDNA (95 °C)
2. Annealing of the primers to the ssDNA strands (50-55 °C)
3. DNA synthesis/elongation (72 °C), starting from the primer.
With PCR, the third cycle is the first to yield the specific fragment of interest.
PCR STEP BY STEP:
1. Heat denaturation of dsDNA; it needs to be separated into two strands.
2. Hybridization of specific primers, designed to amplify specific regions of DNA. Cool
down the sample to allow the primers to hybridize to complementary sequences.
3. Add a mixture of DNA polymerase and dNTPs to synthesize DNA. DNA synthesis starts
from the two primers.
4. To amplify the DNA, the cycle is repeated many times by reheating the sample to
separate the newly synthesized DNA strands. Nowadays DNA polymerases from a thermophilic
bacterium are used; these are heat stable, so the enzyme is not denatured
by the heat treatment.
In the first cycle, synthesis from each primer continues to the end of the template
(chromosome-wide). After separation, in the second cycle the other primer produces strands of the specific
shorter length, but only after the third cycle is the double-stranded fragment of interest
present and amplified.
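Why the specific fragment only appears in the third cycle can be checked by counting strand types per cycle. The sketch below assumes one double-stranded starting molecule and perfect efficiency in every cycle; the function and variable names are chosen for illustration.

```python
def pcr_strand_counts(cycles: int):
    """Count single strands per cycle, starting from one double-stranded template.

    original: the two starting strands
    long:     start at a primer but run on to the end of the template
    target:   bounded by both primers (length of the fragment of interest)
    Returns tuples (cycle, long strands, target strands, double-stranded targets).
    """
    original, long_, target = 2, 0, 0
    history = []
    for cycle in range(1, cycles + 1):
        new_long = original          # copying an original strand gives a long strand
        new_target = long_ + target  # copying a long or target strand gives a target strand
        duplex_targets = target      # each target template pairs with a new target strand
        long_ += new_long
        target += new_target
        history.append((cycle, long_, target, duplex_targets))
    return history


for row in pcr_strand_counts(5):
    print(row)
```

In this model the first double-stranded molecules of exactly the target length exist after cycle 3, and from then on they grow almost exponentially while the long strands only increase by two per cycle.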
PCR APPLICATIONS
TECHNIQUES FOR DETECTING DNA POLYMORPHISMS (not so interesting said teacher)
DNA Gelblot based:
RFLP restriction fragment length polymorphism
PCR based:
AFLP amplified fragment length polymorphism
RAPD random amplified polymorphic DNA
VNTR variable number tandem repeats
PCR and sequencing based/ melting curve:
SNP single nucleotide polymorphism
RT-PCR: mRNA is isolated from cells. The first primer, reverse transcriptase and dNTPs are
added, which creates a double-stranded mRNA-DNA hybrid. The strands are separated and
the second primer (complementary to the DNA strand) is added, yielding a double-stranded cDNA molecule.
cDNA does not contain intronic information, so you can easily distinguish between introns
and exons.
AFLP-PCR: Digestion of total cellular DNA with one or more chosen restriction enzymes →
ligation of adaptor sequences to the restriction fragments → selective amplification of some of
these fragments → separation on gel. Compare/score differences: absence or presence of certain
fragments.
VNTR (variable number tandem repeats): VNTRs are minisatellites, locations in a genome
where a short nucleotide sequence is organized as a tandem repeat. They can be found on many
chromosomes and often show variations in length among individuals. Each variant acts as an
inherited allele, allowing them to be used for personal or paternal identification.
SNP (single nucleotide polymorphism): variation of a single base pair. For each SNP
there are usually two frequent alleles. Multiple SNPs need to be studied to identify an individual.
The collection of all SNPs from a certain genomic area is called a haplotype. Three tag SNPs
are enough to summarize the haplotype. DNA polymorphisms can be used in linkage
analysis to establish which gene fragment is linked to a certain trait.
SNPs are not mutations but variants within the population. SNPs can be linked to
traits/diseases.
LECTURE 2: FROM GENOME TO GENE FUNCTION: OMICS TECHNOLOGIES AND
BIOINFORMATICS
Bioinformatics: how to handle the enormous amount of information produced by: DNA
sequencing, RNA expression, protein patterns, metabolite contents, digitalized phenotypes.
ANNOTATION OF DNA SEQUENCES
DNA annotation or gene annotation is the process of identifying the locations of genes and
all of the coding regions in a genome and determining their function. Only 2% of the human
genome codes for proteins.
Chromosomes contain many duplicated segments: intra- and interchromosomal duplications
(within a chromosome but also between different chromosomes).
WHAT IS A GENE? HOW CAN WE IDENTIFY IT? WHAT LANDMARKS ARE USED FOR
ANNOTATION?
Genome annotation: the process of identifying the locations of genes and all of the coding
regions in a genome and determining what those genes do.
Annotation of a DNA sequence: a gene is everything involved in the expression of exons:
• From 5’ to 3’, a gene starts with an upstream regulatory region, e.g. an enhancer.
• Promoter region containing a CAAT box and a TATA box.
o TATA box → bound by the TATA-binding protein, which is essential for the start of transcription.
o CAAT box → signals the binding site for a transcription factor (consensus
GGCCAATCT).
• Transcription start site → location where transcription of RNA starts.
• AUG → start site for protein synthesis (AUG in RNA, ATG in DNA).
• Exon-intron borders.
• Translational stop codon → termination codon for protein synthesis: UGA, UAA or UAG.
• Poly-A tail → the AAUAAA sequence near the 3’ end is the signal at which the transcript
is ended and where poly-A polymerase adds the poly-A tail.
• 5’ UTR and 3’ UTR are regions of the gene that are transcribed but not translated.
Splice sites/splice sequences – the information that determines whether something is an exon
or an intron is encoded by only a few nucleotides. The consensus sequence for splice sites
is: GU at the 5’ end of the intron, a branch-point A within the intron, and AG at the 3’ end of the intron.
Hydroxyl attack → the GU at the 5’ end of the intron becomes physically coupled to the A (lariat
formation). The 3’ end of the first exon is then physically attached to the 5’ end of the second
exon and the intron/lariat is removed.
Alternative splicing:
• Choose whether a splicing event occurs or not.
• Exon skipping or incorporation of extra exons. Example: the last exon often codes for
a membrane anchor domain. If this exon is removed, the protein has the same function but a different location.
• Different protein functions and locations can be cell or tissue specific.
Genome analysis:
• How to identify protein-coding regions?
• Look for an open reading frame (ORF). A DNA sequence has six possible reading frames that
determine where reading and translation of a protein start: the first codon can begin at the first,
second or third nucleotide, on either of the two strands (each read 5’ to 3’).
• Only one of the reading frames is the right one; usually the long continuous sequence without
premature stop codons is the coding sequence that will be
translated into protein.
• After identifying the ORF, you can start looking for all the other essential components
(introns, codon usage and TATA box).
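Searching all six reading frames for the longest open reading frame can be sketched as follows. This is a minimal illustration that assumes an intron-free DNA sequence and defines an ORF as ATG up to the first in-frame stop codon; the function names and the test sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def longest_orf(dna: str):
    """Return (strand, frame, orf) for the longest ATG...stop ORF in six frames."""
    best = ("+", 0, "")
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon in this frame
                elif start is not None and codon in STOP_CODONS:
                    orf = seq[start:i + 3]
                    if len(orf) > len(best[2]):
                        best = (strand, frame, orf)
                    start = None  # look for the next ORF in the same frame
    return best


strand, frame, orf = longest_orf("GGATGGCCAAGTTTTAAC")
```

For this example the ORF lies on the given strand in the third frame (frame index 2), since the ATG starts at the third nucleotide.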
NATURAL SELECTION
SYNTENY: the preserved order of genes between related organisms. Since the order of
genes mostly has a neutral effect in eukaryotes, an organism will generally have no ill effects from
having genes rearranged. The order of genes is generally preserved best between closely
related species. Conservation of the order of a cluster of genes suggests a functional
relation.
NATURAL SELECTION: changes in the DNA that do or do not affect the encoded protein.
Ka = non-synonymous substitution rate (base change leads to a different amino acid)
Ks = synonymous substitution rate (base change leads to the same amino acid)
Ka/Ks < 1 → strong (purifying) selection: the same amino acid needs to be kept for the function of the
protein
Ka/Ks ≈ 1 → no selection (neutral); Ka/Ks > 1 → positive selection
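A crude way to estimate Ka and Ks for two aligned coding sequences is to count synonymous and non-synonymous differences and divide by the number of sites of each type. The sketch below follows that idea under loud assumptions: the sequences are gap-free and in frame, codons that differ at more than one position are ignored, no correction for multiple substitutions is applied, and the function names and sequences are hypothetical.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_sites(codon: str) -> float:
    """Number of synonymous sites in a codon (each possible change counts 1/3)."""
    sites = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    sites += 1 / 3
    return sites


def ka_ks(seq1: str, seq2: str):
    """Return (Ka, Ks, Ka/Ks) as uncorrected proportions of differences per site."""
    syn_sites = nonsyn_sites = syn_diffs = nonsyn_diffs = 0.0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s = (synonymous_sites(c1) + synonymous_sites(c2)) / 2
        syn_sites += s
        nonsyn_sites += 3 - s
        if sum(a != b for a, b in zip(c1, c2)) == 1:  # assumption: skip multi-hit codons
            if CODON_TABLE[c1] == CODON_TABLE[c2]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    ka = nonsyn_diffs / nonsyn_sites if nonsyn_sites else 0.0
    ks = syn_diffs / syn_sites if syn_sites else 0.0
    return ka, ks, (ka / ks if ks else float("inf"))


ka, ks, ratio = ka_ks("ATGGCTAAAGGTCTG", "ATGGCCAAAGGTCTG")  # GCT -> GCC, both alanine
```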
Most of these changes will have no significant biological effect, so identifying the genomic
differences that underlie the characteristics of humanness is difficult.
There are three prevailing hypotheses to account for the evolution of humanness traits:
• protein evolution,
• the less is more hypothesis,
• change in the regions of the genome that regulate gene activity.
DEFINITIONS
Homolog: a gene related to a second gene by descent from a common ancestral DNA
sequence. The term, homolog, may apply to the relationship between genes separated by
the event of speciation (ortholog) or to the relationship between genes separated by the
event of genetic duplication (paralog).
Orthologs are genes in different species that evolved from a common ancestral gene by
speciation. Normally, orthologs retain the same function in the course of evolution. Identification of
orthologs is critical for reliable prediction of gene function in newly sequenced genomes.
Paralogs are genes related by duplication within a genome. Paralogs often evolve new functions, even if
these are related to the original one.
RNA TRANSCRIPTOMICS
EST database: expressed sequence tags → random cDNA sequences from different tissues.
RNA-seq analysis → high-throughput cDNA sequencing.
MICROARRAY analysis: choose gene-specific DNA molecules and print them onto a fixed slide.
mRNA of two different tissues/conditions is labelled with fluorophores (red and green) and
washed over the microarray. The mRNA hybridizes to the DNA if the sequence is complementary.
The non-hybridized sequences are washed away. Scan for red and green signals and
combine the images to determine whether a certain mRNA is expressed in both conditions or
whether mRNA expression is, for example, tissue specific.
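Combining the red and green scans per spot usually comes down to comparing the two intensities, for example as a log2 ratio. The sketch below illustrates that idea; the background cut-off, the two-fold threshold, the function name and the intensity values are all assumptions chosen for the example.

```python
import math


def classify_spot(red: float, green: float,
                  threshold: float = 1.0, background: float = 50.0) -> str:
    """Classify one microarray spot from its red and green intensities."""
    if red < background and green < background:
        return "not expressed"
    log_ratio = math.log2(max(red, 1.0) / max(green, 1.0))
    if log_ratio > threshold:    # more than two-fold higher in the red sample
        return "higher in red sample"
    if log_ratio < -threshold:   # more than two-fold higher in the green sample
        return "higher in green sample"
    return "expressed in both (yellow)"


spots = {"geneA": (4000, 500), "geneB": (500, 4000),
         "geneC": (1000, 1200), "geneD": (10, 20)}
calls = {gene: classify_spot(r, g) for gene, (r, g) in spots.items()}
```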
MICRO ARRAY VS RNAseq
1. Microarrays can only detect known sequences; you must know in advance what to put on the
chip.
2. Certain analyses are not possible with microarrays:
• distinguishing mature mRNA from unspliced RNA,
• different isoforms/splice variants,
• strandedness, single-cell analyses.
3. RNAseq gives a fuzzy overview; facilitates novel transcript discovery. RNAseq lends
itself to further and confirmatory analyses. Lower error rate, and problems like
cross-hybridization are avoided in RNAseq.
PROTEIN: PROTEOMICS
2D GEL ANALYSES AND PROTEIN SPOT IDENTIFICATION
• All the proteins in a cell are separated on this gel; each spot corresponds to a different
polypeptide chain.
• 1st separation on a pH gradient by isoelectric point.
• 2nd separation on molecular mass by electrophoresis.
• Note that different proteins are present in very different amounts.
• The bacteria were fed with a mixture of radioisotope-labeled amino acids so that all
of their proteins were radioactive and could be detected by autoradiography. You can
take one spot and analyse it individually by mass spectrometry.
MASS SPECTROMETRY AND TANDEM MASS SPECTROMETRY
Peptide mass databases to find proteins.
• Mass spectrometers used in biology contain an ion source that generates gaseous
peptides or other molecules under conditions that render most molecules positively
charged.
• The two major types of ion source are MALDI and electrospray.
• Ions are accelerated into a mass analyser, which separates the ions on the basis of
their mass and charge (mass-to-charge ratio, m/z).
Tandem mass spectrometry: two mass analysers separated by a chamber containing an
inert, high-energy gas. The electric field of the first analyser selects a precursor ion, which is then directed to the
chamber. Collision of the peptide with gas molecules causes random peptide fragmentation,
primarily at the peptide bonds, resulting in a highly complex mixture of fragments containing
1 or more amino acids from throughout the original peptide. The second mass analyser is then used
to measure the masses of the fragments (called product or daughter ions). With computer
assistance, the pattern of fragments can be used to deduce the amino acid sequence of the
original peptide.
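How a fragment pattern leads back to a sequence can be illustrated with an idealized example: if a complete series of N-terminal fragments (b-ions) is observed, the mass difference between neighbouring fragments identifies each residue. The sketch below assumes singly charged ions, a complete and noise-free ladder, and a reduced table of monoisotopic residue masses; the function names and the peptide are hypothetical. Real spectra are far messier.

```python
RESIDUE_MASS = {  # monoisotopic residue masses in Da (subset; I omitted, same mass as L)
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "L": 113.08406, "N": 114.04293, "D": 115.02694,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "F": 147.06841, "R": 156.10111,
}
PROTON = 1.00728


def b_ions(peptide: str):
    """Singly charged b-ion m/z values (N-terminal fragments) of a peptide."""
    values, total = [], 0.0
    for amino_acid in peptide:
        total += RESIDUE_MASS[amino_acid]
        values.append(total + PROTON)
    return values


def read_ladder(ion_mz, tolerance: float = 0.01) -> str:
    """Deduce the sequence from mass differences between successive b-ions."""
    sequence, previous = [], PROTON
    for mz in sorted(ion_mz):
        delta = mz - previous
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - delta) <= tolerance]
        sequence.append(matches[0] if matches else "?")  # "?" = no residue fits
        previous = mz
    return "".join(sequence)


fragments = b_ions("SAMPLE")
sequence = read_ladder(fragments)
```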
PDB
Protein structure database. Linking amino-acid sequences to 3D structure.
Molecule modelling software
METABOLITES, METABOLOMICS
Determining the identity and quantity of any given metabolite in an extract. Molecular mass
database for metabolites: NIST library with 130,000 compounds. Modelling metabolic
pathways using enzyme activity data and metabolite concentrations: KEGG database.
LECTURE 3 REGULATION OF GENE EXPRESSION, FROM DNA TO RNA
Transcription and translation are the main regulated steps in gene expression. External factors influence
gene expression through a network of receptors and signalling proteins. Ultimately this leads to transcriptional
activation; selected genes are expressed.
External regulatory proteins detect an environmental signal. This leads to the activation of gene regulatory
proteins inside the nucleus, which bind to regulatory DNA and activate a gene to produce another protein.
That protein binds to other regulatory regions to produce more proteins, including additional gene regulatory
proteins.
BIOCHEMISTRY OF RNA
RNA differs from DNA: RNA is mainly single stranded, but is
always synthesized from a single-stranded DNA template. RNA
contains uracil instead of thymine, and its ribose carries a hydroxyl (OH) group at the 2’ position that
deoxyribose lacks, which makes RNA less stable than DNA.
There are multiple types of RNA:
• mRNA: messenger RNA, codes for proteins (5% of all RNA)
• rRNA: ribosomal RNA, forms the structure of ribosomes and catalyses protein synthesis
• tRNA: transfer RNA, central to protein synthesis as adaptor between mRNA and
amino acids
• snRNA: small nuclear RNA, important for the splicing of pre-mRNA
• snoRNA: small nucleolar RNA, located in the nucleolus and complementary to
rRNA; takes care of post-transcriptional modification
• miRNA: microRNA, regulates gene expression by blocking translation of specific
mRNAs and causing their degradation
• siRNA: small interfering RNA, turns off gene expression by directing the degradation
of selective mRNAs and the establishment of compact chromatin structures
(RNA interference, leading to silencing of genes); mainly produced after interaction
with (viral) dsRNA
• piRNA: piwi-interacting RNA, binds to piwi proteins and protects the germ line from
transposable elements
• lncRNA: long noncoding RNA, many of which function as scaffolds; they regulate diverse
cell processes including X-chromosome inactivation
• The majority of RNA types are involved in the regulation of genes and not in the
formation of proteins.
RNA POLYMERASE
• RNA is made with the help of RNA polymerase.
• RNA polymerase binds to the DNA and transcribes one of the two DNA strands.
• Either strand can serve as the template, so two different products can be transcribed from one stretch of DNA.
• RNA polymerase does not need a primer to start
with.
• The RNA chain is extended by one nucleotide at a time in the 5’ to 3’
direction.
• There are three forms of RNA polymerase:
o RNA polymerase I: rRNA genes
o RNA polymerase II: all protein-coding genes, miRNA, siRNA, LncRNA,
snRNA
o RNA polymerase III: tRNA genes, other small RNAs
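That the choice of template strand determines the RNA product can be shown with a short sketch. It assumes both DNA strands are written 5’ to 3’, ignores promoters and terminators, and uses a hypothetical function name and sequence.

```python
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_from_template(template_5to3: str) -> str:
    """Return the RNA (5'->3') made by reading a DNA template strand 3'->5'."""
    return "".join(RNA_PAIR[base] for base in reversed(template_5to3.upper()))


top = "ATGGCCTAA"     # 5'->3'
bottom = "TTAGGCCAT"  # 5'->3', the complementary strand

rna_from_bottom = transcribe_from_template(bottom)  # "AUGGCCUAA", same sequence as top
rna_from_top = transcribe_from_template(top)        # "UUAGGCCAU", a different product
```

The RNA always has the sequence of the non-template strand (with U instead of T), so the same stretch of DNA yields two different transcripts depending on which strand is read.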