1 The legend of Amel X and Y
1.1 AmelX and Y
The amelogenin (AMEL) locus encodes a matrix protein that forms tooth enamel. Mutations cause an enamel
defect (amelogenesis imperfecta), which results in deformed tooth enamel. AMEL is used for sex
determination because intron 1 of AMELX contains a 6 bp deletion compared to intron 1 of AMELY, so the two PCR products differ in length:
- Female (XX): 106 bp
- Male (XY): 106 bp & 112 bp
This method of sex determination is not 100% accurate. A male sample is misidentified as female when:
o Y (and with it AMELY) is deleted
o A mutation in the regions of AMELY intron 1 commonly used as primer annealing sites disables
PCR amplification
o A 6 bp insertion into AMELY intron 1 gives an amplicon identical in length to that of AMELX.
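The read-out logic above can be sketched in a few lines of Python. This is only an illustration: the function name `call_sex` and the 1 bp sizing tolerance are assumptions, not part of the course material.

```python
# Expected AMEL amplicon sizes from the notes (bp).
AMELX_BP = 106
AMELY_BP = 112

def call_sex(band_sizes_bp, tolerance_bp=1):
    """Turn observed band sizes into a call.

    tolerance_bp is a hypothetical sizing tolerance, chosen only for illustration.
    """
    has_x = any(abs(size - AMELX_BP) <= tolerance_bp for size in band_sizes_bp)
    has_y = any(abs(size - AMELY_BP) <= tolerance_bp for size in band_sizes_bp)
    if has_x and has_y:
        return "male (XY)"
    if has_x:
        # Caveat from the notes: a male whose AMELY is deleted or does not
        # amplify also ends up here and is misidentified as female.
        return "female (XX)"
    return "inconclusive"

print(call_sex([106]))
print(call_sex([106, 112]))
```

A single 106 bp band is therefore only "female" under the assumption that AMELY would have amplified if it were present.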
Extra exercises 1.1 digibook
What does ‘locus’ mean?: In general, a locus is a place, space or locality, especially a centre of an activity. In
genetics, a locus (plural: loci) is a fixed position on a chromosome, such as the position of a gene or a
genetic marker.
Does AMEL PCR work only with human samples?: AMEL PCR works on many other species.
1.2 NCBI
NCBI = National Center for Biotechnology Information; largest biomedical research facility in the world, a
division of the National Library of Medicine (NLM) at the National Institutes of Health (NIH), established in 1988. Also called ‘the Google
for life sciences’. Its tasks:
- Establish public databases
- Research in computational biology
- Develop software tools for sequence analysis
- Disseminate biomedical information
Everything surrounding the central dogma of life is in the database. If the data is not in NCBI it must be linked.
Data from NCBI is human curated:
- Original DNA sequences (genomes)
- Expressed DNA sequences ( = mRNA Sequences = cDNA sequences)
- Expressed Sequence Tags (ESTs)
- Protein Sequences (Inferred & Direct sequencing)
- Protein structures (Experiments & Models (homologues))
- Literature info
In the search details of NCBI, “amelx[All Fields] AND alive[prop]” means that all fields of the gene records are
searched for AmelX and that only the “alive” (current, not discontinued or replaced) records are shown, so the information received is reliable.
RefSeq vs GenBank
Difference between GenBank and RefSeq: GenBank sequence records are owned by the original submitter (and cannot be
altered by others). RefSeq sequences are derived from GenBank sequences to provide non-redundant, curated data
representing our current knowledge of known genes.
A traditional GenBank record includes:
- Accession number = ID of the gene
o Stable, Reportable, Universal
- Version number: tracks changes in sequence
- GI number: NCBI internal use
Accession numbers of RefSeq records include a two-letter prefix, an underscore and a numeric portion (NT = contig
assemblies produced by NCBI, NW = supercontig assemblies from WGS).
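As a hedged sketch, the accession layout described above (two-letter prefix, underscore, numeric portion, optionally followed by ".version") can be checked with a regular expression. The function name `parse_refseq` and the example accessions are made up for illustration.

```python
import re

# Two uppercase letters, underscore, digits, optional ".version" suffix.
REFSEQ_PATTERN = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")

def parse_refseq(accession):
    """Split a RefSeq-style accession into (prefix, number, version or None).

    Returns None when the string does not follow the RefSeq layout.
    """
    match = REFSEQ_PATTERN.match(accession)
    if match is None:
        return None
    prefix, number, version = match.groups()
    return prefix, number, (int(version) if version is not None else None)

print(parse_refseq("NT_000001.2"))  # made-up RefSeq-style accession
print(parse_refseq("U12345"))       # made-up accession without underscore
```

The underscore is the quick visual cue: a string without it is not a RefSeq accession.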
Benefits of RefSeq:
- Non-redundancy
- Updates to reflect current sequence data and biology
- Data validation
- Format consistency
- Distinct accession series
- Stewardship by NCBI staff and collaborators
Extra exercises 1.2 digibook
What does CCDS mean? What are these records? What can you do with them?
The Consensus CDS (CCDS) project is a collaborative effort to identify a core set of human and mouse
protein-coding regions that are consistently annotated and of high quality. The long-term goal is to
support convergence towards a standard set of gene annotations. Knowing the CCDS of a gene helps
in molecular biology, e.g. to know which DNA sequences truly belong to a "standardized version" of AMELX,
so that we can design better primers.
1.3 Analyze AmelX and AmelY DNA seq., where are those legendary “deletions”
BLAST = Basic Local Alignment Search Tool; finds regions of local similarity between sequences (helps find
homologous genes and proteins). It compares nucleotide (nt) or protein sequences to databases & calculates the statistical
significance of the matches. Used to analyse functional and evolutionary relationships & to help identify members of gene
families.
Homologues: have a common ancestor (related), similar structures, similar functions.
- Proteins are homologous if their amino acid sequences are at least 25% identical
o Note: sequences must be over 100 aa (or bp) in length. Length <35 aa peptide
- DNA sequences are homologous if they are at least 70% identical
It is better to compare proteins instead of genes.
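To make the identity thresholds above concrete, here is a minimal sketch that computes percent identity from two already aligned sequences. Skipping gap columns is one of several possible conventions and is an assumption here, as are the function name and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of non-gap alignment columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (an assumed convention)
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0

# Toy DNA sequences, invented for illustration.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity, identity >= 70)  # 70% is the DNA threshold from the notes
```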
Different types of BLAST
- Nucleotide BLAST (blastn): nucleotide query > nucleotide database
- Protein BLAST (blastp): protein query > protein database
- blastx: translated nucleotide query > protein database
- tblastn: protein query > translated nucleotide database
BLAST results interpretation
- Sequence alignment is an arrangement of two or more sequences,
highlighting their similarity.
- The sequences are padded with gaps (dashes) so that wherever possible,
columns contain identical characters from the sequences involved
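Because gaps are written as dashes, a deletion shows up as a run of dashes in one of the aligned sequences. The sketch below reports such runs. The function name `gap_runs` and the toy alignment are assumptions for illustration and are not the real AMEL intron sequences.

```python
def gap_runs(aligned_seq):
    """Return (start_column, length) for each run of '-' (columns are 1-based)."""
    runs = []
    start = None
    for column, char in enumerate(aligned_seq, start=1):
        if char == "-":
            if start is None:
                start = column  # a new gap run opens here
        elif start is not None:
            runs.append((start, column - start))
            start = None
    if start is not None:  # gap run that reaches the end of the alignment
        runs.append((start, len(aligned_seq) - start + 1))
    return runs

# Toy alignment, invented for illustration.
toy_y = "ACGTTTGACCAGGTACCA"
toy_x = "ACGTTT------GTACCA"
print(gap_runs(toy_x))  # one 6-column gap starting at column 7
```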
1.4 Know what to expect with your AMELX and AMELY PCR experiment
Blast settings: somewhat similar sequences
- Query = primer
- Subject = position in gene sequence
The result page will usually show multiple alignment hits. To predict the expected PCR product length
when blasting a primer set (2x BLAST) against a FASTA sequence: largest subject number – smallest subject number + 1.
The + 1 is needed because both the first and the last position belong to the product (inclusive counting, in Dutch ‘tot en met’); without it you miss the first nt.
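The length rule above, written out as a small Python sketch. The function name and the subject coordinates are invented for illustration.

```python
def expected_product_length(forward_hit, reverse_hit):
    """Predict the amplicon length from the subject coordinates of two BLAST hits.

    Each hit is a (subject_start, subject_end) pair. The reverse primer usually
    matches the minus strand, so its start can be larger than its end; taking
    max and min over all four numbers handles both orientations.
    """
    positions = list(forward_hit) + list(reverse_hit)
    return max(positions) - min(positions) + 1  # +1: inclusive counting

# Invented coordinates: forward primer at 201..220, reverse primer at 306..287.
print(expected_product_length((201, 220), (306, 287)))  # 306 - 201 + 1 = 106
```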
Extra exercises 1.6 digibook
How to get the sequence of your PCR product?
Go to BLAST, enter the forward primer, select the correct database (nucleotide collection) and run BLAST.
In the results, search for the accession code you need (e.g. AmelX) and note the query and subject numbers.
Click on the accession code, use the subject numbers to find where the primer anneals in the nucleotide
sequence, then mark and save this. Repeat for the reverse primer (! Use the same accession code).
o Now you know where the primers anneal, the sequence in between and how long the product
is.
o Note! The reverse primer (RV) is always written from 5' to 3', so you need to reverse-complement it in
order to find it in the gene sequence.
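The same procedure can be mimicked offline once the template sequence is saved. The sketch below is a simplification that assumes both primers match the template exactly (BLAST itself tolerates mismatches). All names and sequences are made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]

def extract_product(template, forward_primer, reverse_primer):
    """Return the amplicon (primers included), or None if a primer is not found.

    Both primers are given 5' to 3'; the reverse primer is reverse-complemented
    before it is searched for on the template strand.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    rv_site = reverse_complement(reverse_primer)
    rv_start = template.find(rv_site, start + len(forward_primer))
    if rv_start == -1:
        return None
    return template[start:rv_start + len(rv_site)]

# Toy template and primers, invented for illustration.
template = "TTTTACGTACGGGGGCATTGAAAAA"
product = extract_product(template, "ACGTAC", "TCAATG")
print(product, len(product))
```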
2 1001 ways to design 2002 primers
2.1 PCR and Primer design
Primer design is possibly the most common dry-lab experiment
(computer work/Bioinformatics) a lab technician will have to conduct.
Basic steps:
General rule of primer design
Primer length: Generally accepted that the optimal length of PCR
primers is 18-22 bp. = long enough for adequate specificity & short
enough for primers to bind easily to the template at the Ta.
Primer Melting Temperature (Tm): the temperature at which ½ of the dsDNA dissociates into ssDNA; it indicates the
duplex stability. The GC content gives a fair indication of the primer Tm. Tm is mostly calculated with nearest-neighbor
thermodynamic theory; a quick rule of thumb (the Wallace rule) is Tm = 4 °C × (# G/C nt) + 2 °C × (# A/T nt).
- Primers with a Tm in the range 52-58 °C generally produce the best results.
- Primers with a Tm > 65 °C have a tendency for secondary annealing.
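The 4 °C / 2 °C rule of thumb above (often called the Wallace rule) is easy to compute by hand or in code. The sketch below is only that quick estimate, not a nearest-neighbor calculation. The function name and the example primer are made up.

```python
def wallace_tm(primer):
    """Rule-of-thumb Tm in °C: 4 per G/C base plus 2 per A/T base."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    at = primer.count("A") + primer.count("T")
    return 4 * gc + 2 * at

# Invented 20-mer with 10 G/C and 10 A/T bases.
print(wallace_tm("AGCTAGCTAGCTAGCTAGCT"))  # 4*10 + 2*10 = 60
```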
Primer Annealing Temperature (Ta): Tm is the estimate of the DNA-DNA hybrid stability & is critical in determining
the Ta.
- Too high Ta → insufficient primer-template hybridization, resulting in low PCR product yield
- Too low Ta → non-specific products caused by a high number of bp mismatches
o Mismatch tolerance is found to have the strongest influence on PCR specificity.
Formula of Rychlik: Ta = 0.3 × Tm (primer) + 0.7 × Tm (product) – 14.9
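A worked example of the Rychlik formula, as a minimal sketch. The input temperatures are invented and the function name is an assumption.

```python
def rychlik_ta(tm_primer, tm_product):
    """Annealing temperature (°C) from primer Tm and product Tm (both in °C)."""
    return 0.3 * tm_primer + 0.7 * tm_product - 14.9

# Invented values: primer Tm 56 °C, product Tm 80 °C.
# 0.3*56 + 0.7*80 - 14.9 = 16.8 + 56.0 - 14.9 = 57.9
print(round(rychlik_ta(56, 80), 1))
```

Note that the product Tm carries the larger weight (0.7), so a GC-rich amplicon pushes the Ta up more than the primer does.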
GC Content: the number of G's and C's in the primer as a percentage of the total bases. The GC content of primers
should be 40-60%.
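A minimal sketch of the GC content calculation and the 40-60% check. The function names and the example primer are made up for illustration.

```python
def gc_content(primer):
    """GC content of a primer as a percentage of all bases."""
    if not primer:
        raise ValueError("primer must not be empty")
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)

def gc_content_ok(primer, low=40.0, high=60.0):
    """True if the GC content lies in the recommended 40-60% window."""
    return low <= gc_content(primer) <= high

# Invented 20-mer: 10 of 20 bases are G or C.
print(gc_content("AGCTAGCTAGCTAGCTAGCT"), gc_content_ok("AGCTAGCTAGCTAGCTAGCT"))
```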
GC Clamp: the presence of G or C bases within the last five bases at the 3' end of a primer promotes specific binding at
the 3' end, due to the stronger bonding of G-C pairs.
- More than 3 G's or C's should be avoided in the last 5 bases at the 3' end of the primer.
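The two GC clamp statements above can be combined into one check: at least one, but no more than three, G/C bases in the last five bases. Reading "presence" as "at least one" is an interpretation. The function name and the example primers are made up.

```python
def gc_clamp_ok(primer):
    """True if the last five 3' bases contain between 1 and 3 G/C bases."""
    tail = primer.upper()[-5:]  # primer is written 5' to 3', so the 3' end is last
    gc = tail.count("G") + tail.count("C")
    return 1 <= gc <= 3

# Invented primers that differ only in their last five bases.
print(gc_clamp_ok("ACGTACGTACGTATTAG"))  # tail ATTAG: 1 G/C
print(gc_clamp_ok("ACGTACGTACGTAGGCC"))  # tail AGGCC: 4 G/C
print(gc_clamp_ok("ACGTACGTACGTAATTA"))  # tail AATTA: 0 G/C
```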
Primer Secondary Structures: secondary structures produced by intermolecular or intramolecular interactions can
lead to poor or no yield of the product. They adversely affect primer-template annealing, and thus amplification,
because they greatly reduce the availability of primers to the reaction.