DNA and protein sequences are stored in FASTA files, with extensions such as .faa, .fna, .fasta and .fa.
Preferred extension for protein FASTA files is .faa (FASTA Amino Acid)
Preferred extension for DNA FASTA files is .fna (FASTA Nucleic Acid)
The two file types use different letters.
In a DNA file:
N = the letter (base) is unknown, but it is not a gap
In a protein file:
X = the letter (amino acid) is unknown, but it is not a gap
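As an illustration of the FASTA layout and the N/X convention, here is a minimal sketch that reads FASTA records and counts unknown letters. It assumes the usual layout (a header line starting with ">" followed by one or more sequence lines). The function names and the example record are made up for this note.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_unknown(sequence, is_protein):
    """Count unknown letters: X in a protein sequence, N in a DNA sequence."""
    symbol = "X" if is_protein else "N"
    return Counter(sequence.upper())[symbol]


# Made-up DNA record (as it could appear in a .fna file), not real data.
example_fna = StringIO(">seq1 made-up example\nACGTNNAC\nGTNA\n")
records = list(read_fasta(example_fna))
unknown_counts = {h: count_unknown(s, is_protein=False) for h, s in records}
print(unknown_counts)
```

For a real file, `open("my_file.fna")` (a hypothetical path) can be passed instead of the `StringIO` object.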
Metadata
Name of the organism
Research Group that generated the sequence
Geographic coordinates and date/time the sample was collected
Environment (biome)
Methods, such as the nucleic acid extraction protocol and the DNA sequencing technology
Much of this metadata is stored in the GenBank format.
A GenBank record also states where the sequence was published.
The function is not known for every sequence; such sequences are given the label 'hypothetical'.
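To get a feel for how common the 'hypothetical' label is in a dataset, one could count the headers that contain it. This is only a sketch: it assumes the label appears as plain text in the header line, and the headers below are invented.

```python
def fraction_hypothetical(headers):
    """Return the share of headers that carry the 'hypothetical' label."""
    if not headers:
        return 0.0
    flagged = [h for h in headers if "hypothetical" in h.lower()]
    return len(flagged) / len(headers)


# Invented headers for illustration only.
headers = [
    "prot_1 hypothetical protein",
    "prot_2 DNA polymerase",
    "prot_3 Hypothetical protein",
    "prot_4 ribosomal protein",
]
share = fraction_hypothetical(headers)
print(f"{share:.0%} of the proteins are labeled hypothetical")
```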
Error propagation
- people can make mistakes
- computers can copy these mistakes
- this is error propagation
Scientific literature databases
- PubMed / Google Scholar
These can also be searched by keyword.
Using databases in biology
When you use a database, record the date on which you accessed it, because databases change every day.
Include the identifiers in your publication.
In the article:
- cite the article describing the database
- note the name, version number, and/or date
- list the identifiers
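One way to keep track of these details is to store them next to the results at the moment of the query. The sketch below bundles name, version, access date and identifiers into one record. The database name, version string and identifiers are placeholders, not real entries.

```python
import json
from datetime import date


def database_usage_record(name, version, accessed, identifiers):
    """Bundle the details a methods section should report about a database query."""
    return {
        "database": name,
        "version": version,
        "accessed": accessed.isoformat(),
        "identifiers": sorted(identifiers),
    }


record = database_usage_record(
    name="ExampleDB",                        # placeholder database name
    version="release-0",                     # placeholder version
    accessed=date(2024, 1, 15),              # example date; date.today() in practice
    identifiers=["EX_000002", "EX_000001"],  # made-up identifiers
)
print(json.dumps(record, indent=2))
```

Saving this record as a small JSON file alongside the analysis output makes it easy to copy the information into the article later.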
People are mostly interested in:
1. Themselves
2. Their food
3. Their diseases
Consequently, these are also the topics with the most data.
Genome: all genes
Genomics: study of all genes
Transcriptomics: study of all RNA transcripts
Proteome: all proteins (proteomics is the study of them)
Microbiome: all microorganisms in a given environment
First generation
- Chain termination sequencing
o Sanger
Second generation
- Massively parallel sequencing
o Illumina (MiSeq)
o Ion Torrent
Third generation
- Single molecule sequencing
o Oxford Nanopore (MinION)
o Pacific Biosciences (PacBio)
Data can simply be shared over the internet; sharing your results is very important.
Bioinformaticians use data in two different ways:
1. Given a biological question, a good bioinformatician will immediately think about which datasets could be used to answer it.
2. Given a dataset, a good bioinformatician will immediately think about which new biological question it could answer.
Virome: all viruses
Assembly: joining short reads into longer contigs
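The idea of joining reads into contigs can be shown with a toy greedy merge: repeatedly join the two reads with the longest suffix-prefix overlap. This is a simplified sketch, not how production assemblers work; it ignores sequencing errors, repeats and reverse complements, and the reads and the `min_len` threshold are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads pairwise by largest overlap until no overlap of min_len remains."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return reads  # whatever is left are the contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)


# Made-up reads that overlap by four letters each.
contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"])
print(contigs)
```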
Using a database may influence your results, because databases are biased: a great deal is known about some organisms and nothing yet about others.