OMICS in de biomedische wetenschappen (5052OIB12Y)
YaraBMW
OMICS lectures
For the Physiology exam → email that woman + the lecturer who teaches that tutorial
Lecture 1 – Perry – Next-generation sequencing
DNA → genomics – genome: full/partial DNA
mRNA → transcriptomics – transcriptome: all the mRNA
Proteins → proteomics – proteome: many proteins
Metabolites → metabolomics – metabolome: many metabolites
What can you learn with the same technique? Which transcripts are present, the splice variants, DNA methylation.
Each nucleotide is represented by its own fluorescent label
Electropherogram
High-throughput sequencing
• Also known as
  ▪ Next Generation Sequencing (NGS)
  ▪ Massively Parallel Sequencing (MPS)
An important breakthrough was made with
• Development of methods that could record the DNA sequence while a DNA strand was being synthesized by a polymerase from a single-stranded DNA template.
→ The sequencing method was able to monitor the incorporation of each nucleotide in the growing DNA chain and to identify which nucleotide was being incorporated at each step.
At the heart of the first such approach was a method known as pyrosequencing. This approach is
used by the 454 Life Sciences sequencer (we will not discuss this approach)
4 steps:
1. Library preparation → the DNA is isolated and fragmented; adapters are then ligated to the fragments (single strands)
2. Cluster amplification → the fragments bind via their adapters to a flow cell, after which a polymerase copies them. A whole series of PCR reactions follows, and this repeats itself
3. Sequencing → fluorescent probes are used to produce an image; nucleotides are incorporated one at a time and read out. The nucleotide sequence is read per cluster. 100-300 nucleotides are read.
A terminator ensures that the next nucleotide is not incorporated immediately, so that the current one can be read first
4. Alignment & data analysis – overlapping the reads from the clusters to determine the sequence – can also identify mutations
With Sanger sequencing you know exactly where you started, but here you no longer know that
A = adapter
SP = sequencing primer
Single-end sequencing – you sequence only one strand; the reverse strand is removed
Paired-end sequencing – first all reverse strands are removed and the remaining strand is sequenced (from the 5’ end); the process is then repeated with the forward strand removed.
Differences between Sanger sequencing and NGS
• Parallelism.
  ▪ NGS methods have the ability to process millions of sequence reads in parallel rather than 96 (capillaries) at a time.
• Library construction.
  ▪ NGS sequence reads (= nucleotide sequences) are produced from fragment ‘libraries’ that have not been subject to the conventional cell-based DNA cloning used in capillary sequencing.
  ▪ The workflow to produce next-generation sequence-ready libraries is straightforward: DNA fragments that may originate from a variety of front-end processes are prepared for sequencing by ligating specific adaptor oligos to both ends of each DNA fragment.
• Read lengths.
  ▪ NGS: 35–250 bp (depending on the platform)
  ▪ Capillary sequencers: 650–800 bp
  ▪ This may affect the use of the data in various applications.
• Sequencing errors
  ▪ Sanger accuracy: 99.999%.
  ▪ NGS accuracy: ~85–99.9%
• Sequence costs
  ▪ Sanger: $0.50 per kilobase.
  ▪ NGS: assuming a full genome sequence for $1000, then $0.0003 per kilobase.
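The NGS cost per kilobase follows from simple arithmetic. A minimal sketch, assuming a human genome of roughly 3.1 billion base pairs (this genome size and the function name are assumptions for illustration, not taken from the lecture):

```python
# Assumed haploid human genome size in base pairs (approximate; not from the lecture).
GENOME_SIZE_BP = 3.1e9


def cost_per_kilobase(total_cost_usd: float, n_bases: float) -> float:
    """Return the cost in dollars per 1000 bases of genome."""
    return total_cost_usd / (n_bases / 1000)


ngs_cost = cost_per_kilobase(1000, GENOME_SIZE_BP)
sanger_cost = 0.50  # dollars per kilobase, as listed in the notes

print(f"NGS: ${ngs_cost:.4f} per kb")
print(f"Sanger costs roughly {sanger_cost / ngs_cost:.0f} times more per kb")
```

With these assumed numbers the result rounds to the $0.0003 per kilobase quoted in the notes.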
Sequence alignment: every position is sequenced multiple times, and all reads overlap
The number of reads that overlap a nucleotide is called the coverage. For a reliable base call you want an average coverage of about 30.
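Coverage can be made concrete with a toy pile-up. The sketch below assumes reads are already aligned and are given as hypothetical (start, length) pairs with 0-based start positions; real pipelines read this from alignment files instead.

```python
def coverage_per_position(reads, reference_length):
    """Count how many aligned reads overlap each reference position.

    `reads` is a list of (start, length) tuples with 0-based starts.
    """
    depth = [0] * reference_length
    for start, length in reads:
        # Clip the read at the end of the reference.
        for pos in range(start, min(start + length, reference_length)):
            depth[pos] += 1
    return depth


def mean_coverage(reads, reference_length):
    """Average coverage = total aligned bases / reference length."""
    return sum(coverage_per_position(reads, reference_length)) / reference_length


# Toy example: three 5-base reads aligned to a 10-base reference.
toy_reads = [(0, 5), (2, 5), (5, 5)]
depth = coverage_per_position(toy_reads, 10)
print(depth)
print(mean_coverage(toy_reads, 10))
```

In this toy case the mean coverage is far below 30, so no position would count as reliably sequenced.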
Paired-end sequencing and alignment:
• Because the distance between each paired read is known, alignment algorithms can use this information to map the reads over repetitive regions more precisely.
• This results in much better alignment of the reads, especially across difficult-to-sequence, repetitive regions of the genome.
Multiplexing
• Maximize sequencing capacity and reduce workflow of sample preparation
  ▪ Perform a single sequencing run containing multiple biological samples
• This requires 'multiplexing'
  ▪ 'Barcodes' have been developed.
  ▪ Unique 5-10 base sequences that are added at the 3’ end of the template.
• Sets of up to 96 barcodes have been designed and can be assigned to up to 96 individual samples.
The barcode is unique: each barcode belongs to one specific sample
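Sorting pooled reads back to their samples (demultiplexing) can be sketched as follows. This is an illustration only: the barcode sequences and sample names are invented, and for simplicity the barcode is assumed to be the first 6 bases of each read, which is not necessarily how a given platform reports it.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table (sequences invented for illustration).
BARCODES = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
BARCODE_LENGTH = 6


def demultiplex(reads):
    """Group reads by sample and strip the barcode from each read.

    Assumes the barcode is the first BARCODE_LENGTH bases of the read;
    reads with an unknown barcode are collected under 'undetermined'.
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:BARCODE_LENGTH], read[BARCODE_LENGTH:]
        sample = BARCODES.get(barcode, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


pooled = ["ACGTACGGGTTT", "TGCATGAAACCC", "ACGTACTTTGGG", "NNNNNNACGT"]
result = demultiplex(pooled)
print(result)
```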
Sequencing errors
• Defined as the percentage of bases that are incorrectly called.
  ▪ If 0.8% error rate then for every 1000 bases coming off the sequencer, 8 of them will report the incorrect base.
• When considered alone, an error is indistinguishable from a sequence variant (e.g., SNP).
• This problem can be overcome by increasing the number of sequencing reads
  ▪ Chance that same position is an error multiple times is low.
• Increased coverage (sequence depth) therefore ‘rescues’ inadequacies in sequencing methods
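The effect of depth on errors can be illustrated with a simple majority vote over the reads covering one position. This is a deliberately naive sketch with hypothetical function names; real variant callers use statistical models rather than a plain vote.

```python
from collections import Counter


def expected_errors(error_rate, n_bases):
    """Expected number of miscalled bases for a given per-base error rate."""
    return error_rate * n_bases


def consensus_base(pileup):
    """Return the most frequent base among the reads covering one position."""
    return Counter(pileup).most_common(1)[0][0]


# 0.8% error rate over 1000 bases, as in the example above.
print(expected_errors(0.008, 1000))

# Ten reads cover the same position; one of them reports an erroneous G.
pileup = list("AAAAAAAGAA")
print(consensus_base(pileup))
```

With a single read the G would be indistinguishable from a variant; with ten reads the isolated error is outvoted.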
Unique molecular identifiers (UMI)
• Improve accuracy of NGS method
• Account for sequencing and PCR errors
• UMIs act as a molecular memory of the number of molecules in the starting sample
• Can be combined with sample multiplexing
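How UMIs "remember" the number of starting molecules can be sketched by collapsing reads that share a UMI. The example assumes each read arrives as a hypothetical (UMI, sequence) pair and that reads with the same UMI have equal length; in practice reads are usually grouped by UMI together with their alignment position.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Collapse PCR duplicates: one consensus sequence per UMI.

    `reads` is a list of (umi, sequence) tuples. Reads sharing a UMI are
    treated as copies of the same original molecule.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Majority vote per column removes isolated PCR/sequencing errors.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


umi_reads = [
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACG", "ACCTAC"),  # copy carrying an error at the third base
    ("TTGA", "ACGTAC"),
]
molecules = collapse_by_umi(umi_reads)
print(len(molecules), molecules)
```

Four reads collapse to two original molecules, and the erroneous copy is corrected by its siblings.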
Only about 1.5% of the genome is protein-coding
Mendelian disease
• Genetic disease
• Single gene disorder
• Follows simple Mendelian patterns of inheritance
  ▪ autosomal, sex-linked
  ▪ dominant, recessive
Example: cystic fibrosis
• Autosomal recessive disorder.
• Caused by the presence of mutations in both copies of the gene for the protein cystic fibrosis transmembrane conductance regulator (CFTR).
Exome: the part of the genome formed by exons
• Single Nucleotide Polymorphism (SNP): point mutation that has persisted in the population
You often see:
  ▪ MUTATION: <1% of the total population
  ▪ SNP: >1% of the total population
• Allele: version of a gene at a given locus (e.g., SNPs)
• SNP / mutation == gene variant == gene with different allele
• Indel: small insertion or deletion
Justification of exome sequencing
• Linkage analysis/positional cloning studies that focused on protein-coding sequences were highly successful at identification of variants underlying monogenic diseases (when adequately powered)
• Allelic variants known to underlie Mendelian disorders disrupt protein-coding sequences
• A large fraction of rare non-synonymous variants in the human genome are predicted to be deleterious
• Splice acceptor and donor sites are also enriched for highly functional variation and are therefore targeted as well
Everything that does not bind is washed away
To know whether it is a mutation or a sequencing error → compare with the database; in addition, some positions can show 2 nucleotides (heterozygous → dominant diseases)
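The distinction between an error and a heterozygous position can be sketched by looking at the fraction of reads that carry the non-reference base. The function name and the 20% threshold below are illustrative assumptions, not values from the lecture, and the sketch leaves out the database comparison mentioned above.

```python
from collections import Counter


def classify_position(pileup, reference_base, min_fraction=0.2):
    """Roughly classify one position from the bases of all covering reads.

    `min_fraction` is an illustrative threshold: below it, the alternative
    base is treated as a likely sequencing error.
    """
    counts = Counter(pileup)
    alt_counts = {b: n for b, n in counts.items() if b != reference_base}
    if not alt_counts:
        return "homozygous reference"

    # Take the most frequent non-reference base and its read fraction.
    alt_base, alt_n = max(alt_counts.items(), key=lambda kv: kv[1])
    fraction = alt_n / len(pileup)

    if fraction < min_fraction:
        return "likely sequencing error"
    if fraction > 1 - min_fraction:
        return f"homozygous variant ({alt_base})"
    return f"heterozygous variant ({reference_base}/{alt_base})"


print(classify_position("A" * 29 + "G", "A"))       # 1 of 30 reads differs
print(classify_position("A" * 15 + "G" * 15, "A"))  # about half the reads differ
```

At a heterozygous position roughly half of the reads show each nucleotide, which is why such a position shows two nucleotides in the data.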