Notes on Chapter 21
CONCEPT 21.1: The Human Genome Project fostered development of faster, less
expensive sequencing techniques
Genomics is the study of whole sets of genes and their interactions
Bioinformatics is the application of computational methods to the storage and analysis
of biological data
Officially begun as the Human Genome Project in 1990, the sequencing of the human
genome was largely completed in 2003, and the sequence of the last chromosome was
published in 2006
The sequenced DNA was pooled from a few individuals
Scientists reviewed the results and agreed on a reference genome, a full sequence
that best represents the genome of a species
The goal in mapping any genome is to determine the complete nucleotide sequence of
each chromosome
The human genome sequence was completed using sequencing machines and the dideoxy
chain-termination method
Two approaches complemented each other in obtaining the complete sequence
The initial approach ordered each fragment based on earlier mapping of the human
genome
Then, molecular biologist J. Craig Venter set up a company to sequence the entire
genome using an alternative whole-genome shotgun approach
This used cloning and sequencing of fragments of randomly cut DNA followed by
assembly into a single continuous sequence
Figure 21.2
The whole-genome shotgun approach is widely used today
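The assembly step of the shotgun approach can be illustrated with a toy example. The sketch below is a minimal greedy overlap merge, assuming error-free reads from a single strand with no repeated regions; real assemblers are far more elaborate. The names `overlap` and `greedy_assemble`, the `min_len` parameter, and the short example sequence are all hypothetical and chosen only for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing overlaps any more; leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical 20-base "genome" cut into three overlapping reads,
# supplied out of order as randomly cut fragments would be.
genome = "ATGCGTACGTTAGCCTAGGA"
reads = [genome[9:20], genome[0:9], genome[4:14]]
assembled = greedy_assemble(reads)
print(assembled)
```

Because each neighboring pair of reads shares five bases, the merge recovers the original sequence as a single contig regardless of the order in which the reads are supplied.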
A major thrust of the Human Genome Project was the development of technologies for
faster sequencing
These “next-generation” techniques do not require a cloning step
These techniques have also facilitated a metagenomics approach, in which DNA from
a group of species in an environmental sample is sequenced
Making sense of massive amounts of data from many genome sequences has
necessitated new analytical approaches
CONCEPT 21.2: Scientists use bioinformatics to analyze genomes and their
functions
The Human Genome Project established databases and refined analytical software to
make data available on the Internet
Centralized Resources for Analyzing Genome Sequences
Bioinformatics resources are provided by a number of sources
The National Library of Medicine (NLM) and the National Institutes of Health
(NIH) maintain the National Center for Biotechnology Information (NCBI)
European Molecular Biology Laboratory
DNA Data Bank of Japan
BGI in Shenzhen, China
The NCBI database of sequences is called GenBank
As of August 2019, it included the sequences of 214 million fragments of genomic DNA,
totaling 366 billion base pairs
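As a quick check on scale, the two GenBank figures above imply an average fragment length. The short calculation below uses only those two numbers; the variable names are illustrative.

```python
# Figures quoted in the notes for GenBank as of August 2019.
fragments = 214_000_000          # 214 million fragments
base_pairs = 366_000_000_000     # 366 billion base pairs

mean_length = base_pairs / fragments
print(f"Average fragment length: about {mean_length:.0f} bp")
```

The result is roughly 1,700 base pairs per fragment, which shows that most entries are short stretches of DNA rather than whole chromosomes.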
A widely used software program on the NCBI website is called BLAST (Basic Local
Alignment Search Tool)
Users of this tool can compare a DNA sequence with every sequence in GenBank
Another program allows comparison of protein sequences
A third program can search any protein sequence for conserved (common) stretches of
amino acids (domains) for which a function is known or suspected
It can show a three-dimensional model of the domain alongside other relevant
information
Figure 21.3
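The kind of sequence comparison BLAST performs can be sketched with a small local alignment example. BLAST itself uses a faster heuristic; the code below swaps in the simpler Smith-Waterman dynamic programming method, which finds the best-scoring local match exactly but is too slow for a database the size of GenBank. The function names `local_alignment_score` and `rank_hits`, the scoring values, and the three-entry toy database are assumptions made for illustration only.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, so an alignment can start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_hits(query: str, database: dict[str, str]) -> list[tuple[str, int]]:
    """Score the query against every database entry, best hit first."""
    scores = {name: local_alignment_score(query, seq)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy database standing in for GenBank.
database = {
    "seq_a": "CCGATTACAGG",   # contains the query exactly
    "seq_b": "TTTTCCCC",      # unrelated
    "seq_c": "GATCACA",       # one mismatch relative to the query
}
ranked = rank_hits("GATTACA", database)
for name, score in ranked:
    print(name, score)
```

The entry containing an exact copy of the query ranks first, the entry with one mismatch ranks second, and the unrelated sequence ranks last, which mirrors how a database search reports its closest matches.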
Rutgers University and the University of California, San Diego, maintain a worldwide
database of all three-dimensional protein structures that have been determined
It is called the Protein Data Bank
There is a vast array of resources available for researchers anywhere in the world to
use free of charge
Identifying Protein-Coding Genes and Understanding Their Functions
Using available DNA sequences, geneticists can study genes directly
The identification of protein-coding genes within DNA sequences in a database is called
gene annotation
Gene annotation uses three lines of evidence to identify a gene
First, computers search for patterns that indicate the presence of genes
These include translational start and stop signals, RNA splicing sites, and other signs,
such as promoter sequences
The software also looks for short sequences that correspond to known mRNAs
The second step is to obtain clues about the identities and functions of the candidate
genes
Software is used to compare the sequence of each predicted protein to the products of
known genes from other organisms
The final step is to use RNA-seq or some other method to show that the relevant RNA is
actually expressed from the proposed gene
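The pattern search in the first annotation step can be sketched as a scan for open reading frames, meaning a start codon followed in the same reading frame by a stop codon. This is a minimal illustration, assuming a forward-strand DNA sequence with no introns; real annotation software also weighs splice sites, promoters, and mRNA evidence. The function name `find_orfs` and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each open reading frame found.

    Scans all three forward reading frames. An ORF runs from an ATG to the
    next in-frame stop codon; min_codons counts both the start and the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos                      # translational start signal
            elif start is not None and codon in STOP_CODONS:
                end = pos + 3                    # include the stop codon
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None                     # look for the next ORF
    return sorted(orfs)


# Hypothetical 16-base sequence containing one short ORF.
example = "GGATGAAACCCTAGTT"
orfs = find_orfs(example)
print(orfs)
```

A candidate found this way is only a prediction; as the notes describe, it still needs support from similarity to known proteins and from evidence that the RNA is actually expressed.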
Understanding Genes and Gene Expression at the Systems Level
Genomics is a rich source of new insights into questions about genome organization,
regulation of gene expression, embryonic development, and evolution
The ENCODE (Encyclopedia of DNA Elements) project ran from 2003 to 2012
The aim was to learn about the functionally important elements in the human genome