Translational Genomics: Lecture 1
Introduction and Genomic Architecture – 07-09-2023
In the near future, we expect to be able to sequence the genome at birth. This will allow us to
immediately diagnose monogenic diseases, use genomic data in personalized treatment protocols
and build predictive profiles, and thus give health advice, for late-onset diseases. However, it also
opens the door to DNA-food, DNA-dating, DNA-jobs and DNA-discrimination.
Currently, personalized medicine can
be divided into 4 parts: assess risk,
refine assessment, predict/diagnose
and monitor progression/prevent
events/inform therapeutics.
The baseline risk is assessed from the stable genome via haplotype mapping, gene sequencing and
single-nucleotide polymorphisms. Combined with preclinical progression, which is measured through
gene expression, proteomics, metabolomics and clinical risk models, this leads to a treatment plan:
a therapeutic decision.
Currently, getting a diagnosis can take a long time and many different doctors before the patient
really knows what the disease is. In the future, we would want personal diagnosis, prognosis, disease
management and treatment.
This lecture will talk about the human genome, functional DNA, “junk” DNA and the epigenome.
The Human Genome
DNA is found in the nucleus and in the mitochondria of cells. The nucleus contains 22 pairs of
autosomes and 2 sex chromosomes (either XX or XY). The human genome contains ~20,000 protein-coding
genes and ~25,000 non-coding genes. To build a mouse model of a human deletion, one needs to create
the equivalent deletion on the corresponding mouse chromosome. It is therefore important to know
which chromosomal regions correspond between human and mouse.
The base pairs are either C-G or A-T. The start of a gene (exon 1) contains a lot of G-C base pairs,
which makes an accidental ATG, and thus a premature start of translation of the gene, less likely.
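The idea that a G-C-rich stretch leaves little room for an accidental ATG can be checked on a toy example. The sketch below is only an illustration: both sequences are invented, and the helper names `gc_fraction` and `atg_positions` are hypothetical, not taken from the lecture.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atg_positions(seq: str) -> list[int]:
    """Return the 0-based start index of every ATG in the sequence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


# Hypothetical toy sequences, invented for illustration only.
gc_rich = "GCGCCGGCGCGGCCGC"
at_rich = "ATATGATAATGTTAAT"

for label, seq in (("GC-rich", gc_rich), ("AT-rich", at_rich)):
    print(label, gc_fraction(seq), atg_positions(seq))
```

A sequence made only of G and C cannot contain ATG at all, whereas the AT-rich toy sequence contains two.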
Functional DNA
Functional DNA can be divided into protein coding genes, non-coding genes and regulatory elements.
Protein Coding Genes
The different parts of protein coding genes can be seen in the image to the left. A gene can be
spliced differently depending on the organ in which it is expressed (e.g. liver and brain),
resulting in different isoforms.
Non-Coding Genes
Non-coding genes can be divided into long
non-coding RNAs and small non-coding
RNAs. The latter can be subdivided into
piRNAs, siRNAs and miRNAs. They all play a
role in the regulation of genes and thus also
in disease.
miRNA, siRNA and piRNA precursors are transcribed from small RNA loci and then processed. They
undergo a Dicer-dependent (miRNA and siRNA) or a Dicer-independent (piRNA) process to turn into
mature small RNAs. They can then form the RNA-induced silencing complex (RISC) and target genes.
miRNA targets coding genes, siRNA targets transposons and exogenous genes and piRNA targets
transposons and other genes. The RISC complexes can inhibit translation initiation or elongation,
or cause mRNA deadenylation. In genome browsers, miRNAs and siRNAs look like little blocks.
Feingold syndrome 2 is caused by a miR-17~92 deletion.
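The three small RNA classes and their properties can be summarised as a small lookup table. This is a minimal sketch that only encodes what is stated in the notes above; the names `SmallRNA`, `SMALL_RNAS` and `classes_targeting` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallRNA:
    name: str
    dicer_dependent: bool      # True if maturation requires Dicer
    targets: tuple[str, ...]   # targets as listed in the notes


# Values copied from the notes; nothing here is measured data.
SMALL_RNAS = {
    "miRNA": SmallRNA("miRNA", True, ("coding genes",)),
    "siRNA": SmallRNA("siRNA", True, ("transposons", "exogenous genes")),
    "piRNA": SmallRNA("piRNA", False, ("transposons", "other genes")),
}


def classes_targeting(target: str) -> list[str]:
    """Return the small RNA classes that list the given target, sorted by name."""
    return sorted(name for name, rna in SMALL_RNAS.items() if target in rna.targets)


print(classes_targeting("transposons"))
print([name for name, rna in SMALL_RNAS.items() if not rna.dicer_dependent])
```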
Long non-coding RNAs (lncRNAs) can be intronic, intergenic or natural antisense transcripts (NATs).
They can be found anywhere in the genome. They exert their effects through protein binding, DNA
binding or RNA binding. This can result in blocking of RNA pol II initiation, different splicing or
nuclear retention, in which the gene expression is inhibited.
Regulatory Elements
When talking about regulatory elements, we’re talking about
the locus control region (LCR), insulators, silencers and
enhancers, but also the proximal promoter elements and core
promoters.
The core promoters contain:
• the BRE (B recognition element), to which TFIIB (transcription factor II B) binds;
• the TATA box, to which TBP (TATA-binding protein) binds;
• the Inr (initiator element/motif), to which TAF1/2 (TATA-box binding protein associated
factor 1/2) binds;
• the MTE (motif ten element);
• the DPE (downstream promoter element), to which TAF6/9 binds;
• the DCE (downstream core element), to which TAF1 binds.
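The list above is essentially a mapping from core promoter element to binding factor. A minimal sketch of that mapping, using only the pairs given in the notes (the names `CORE_PROMOTER_ELEMENTS`, `binding_factor` and `elements_without_listed_factor` are hypothetical):

```python
from typing import Optional

# Element -> binding factor, as listed in the notes.
# None means the notes do not name a binding factor for that element.
CORE_PROMOTER_ELEMENTS: dict[str, Optional[str]] = {
    "BRE": "TFIIB",
    "TATA box": "TBP",
    "Inr": "TAF1/2",
    "MTE": None,
    "DPE": "TAF6/9",
    "DCE": "TAF1",
}


def binding_factor(element: str) -> Optional[str]:
    """Look up the factor listed for a core promoter element."""
    return CORE_PROMOTER_ELEMENTS[element]


def elements_without_listed_factor() -> list[str]:
    """Return the elements for which the notes give no binding factor."""
    return [name for name, factor in CORE_PROMOTER_ELEMENTS.items() if factor is None]


for name, factor in CORE_PROMOTER_ELEMENTS.items():
    print(f"{name}: {factor or 'not listed'}")
```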
“Junk” DNA
“Junk” DNA mostly refers to transposons (transposable elements), which are important for the
structural integrity of the genome. About 45% of the human genome consists of transposable elements,
but less than 0.05% is active. The most abundant are the Alu elements, which make up about 10% of
the human genome.
The Epigenome
The most extreme example of the effect of epigenetic modification on gene expression is
X-inactivation. DNA methylation is the process in which a methyl group is placed on a C base that
is followed by a G base (a CpG site). This leads to imprinting and thus inactivation. During
gametogenesis (e.g. spermatogenesis), the old imprints are erased and new, sex-specific imprints
are made.
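Because methylation is described here as acting on a C that is followed by a G (often written CpG), candidate methylation sites can be located by a simple scan of the sequence. The sketch below is an illustration only: the sequence is invented and the function name `cpg_sites` is hypothetical. It finds candidate positions and says nothing about whether a site is actually methylated.

```python
def cpg_sites(seq: str) -> list[int]:
    """Return the 0-based index of every C that is directly followed by a G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


# Hypothetical toy sequence, invented for illustration only.
toy = "ACGTTCGGACCGA"
print(cpg_sites(toy))
```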
Genomic imprinting is essential for normal development, so deregulation of this process results in
complex genetic diseases such as Prader-Willi syndrome (PWS) and Angelman syndrome (AS), which can
both be caused by a deletion in the 15q11-q13 region. There are about 100 imprinted genetic loci.
Angelman syndrome is caused by a maternal deletion, leading to loss of the maternal gene
expression, and Prader-Willi syndrome is caused by a paternal deletion, leading to loss of the
paternal gene expression.
Angelman syndrome is characterized by intellectual disability, unprovoked laughter, ataxia, absence
of speech, epilepsy, characteristic facial features and a friendly demeanour. PWS causes hypotonia
and feeding problems in newborns, followed by obesity, reduced growth, mild intellectual disability,
hypogonadism and behavioural problems in the first decade of life.
Uniparental disomy (UPD), in which both members of a chromosome pair are inherited from one parent,
also leads to AS or PWS, depending on which parent contributed both copies: maternal UPD leaves no
paternal copy and causes PWS, whereas paternal UPD leaves no maternal copy and causes AS.
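The deletion and uniparental disomy cases follow the same rule: what matters is which parental copy of the region is missing. A minimal sketch of that reasoning is shown below; the function names and the string labels ("deletion", "upd", "maternal", "paternal") are hypothetical choices made for this illustration.

```python
def lost_parental_copy(event: str, parent: str) -> str:
    """Return which parental copy of 15q11-13 is missing.

    event:  "deletion" (parent = origin of the deleted chromosome) or
            "upd" (parent = the parent who contributed both chromosomes).
    """
    if parent not in ("maternal", "paternal"):
        raise ValueError(f"unknown parent: {parent!r}")
    if event == "deletion":
        return parent
    if event == "upd":
        # Both copies come from one parent, so the other parent's copy is absent.
        return "paternal" if parent == "maternal" else "maternal"
    raise ValueError(f"unknown event: {event!r}")


def syndrome(event: str, parent: str) -> str:
    """Map the missing parental copy to the syndrome named in the notes."""
    lost = lost_parental_copy(event, parent)
    return "Angelman syndrome" if lost == "maternal" else "Prader-Willi syndrome"


for event in ("deletion", "upd"):
    for parent in ("maternal", "paternal"):
        print(event, parent, "->", syndrome(event, parent))
```

Writing it this way makes explicit why maternal UPD and a paternal deletion give the same outcome: in both cases the paternal copy is the one that is missing.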
Questions
Question 1: There are multiple regions in which the causative gene might reside. In one of the
regions, on chr18, a deletion of MIR122HG has been found. Argue based on Fig.1 what type of gene
this is and how it is processed.
It is a non-coding RNA gene: it does not code for protein, because the lines in the figure aren’t
“thick”, and the name of the gene starts with MIR, indicating that it is a miRNA gene. HG stands for
host gene. The transcript contains the MIR122 and MIR3591 hairpins, which are processed into mature
miRNAs through Drosha and Dicer.