Data analysis promises potential target genes for the treatment of alzheimer's disease

This document provides an example for the report for the course big data including: GWAS, gene expression, proteomics and PPI, integration with brain gene expression, induced pluripotent stem cells, predictive analysis and a conclusion. We had a really good grade and hope you will learn from our re...

  • December 25, 2022
  • 2021/2022
Arzu Burak (2635213) and Karima Allach (2706743)



DATA ANALYSIS PROMISES POTENTIAL TARGET GENES FOR
THE TREATMENT OF ALZHEIMER’S DISEASE
INTRODUCTION
Alzheimer’s disease (AD), the most common cause of dementia, is a neurological disorder that causes
brain shrinkage and cell death. Because the disease remains common in the elderly, we are
interested in finding gene associations to better understand it and to make a start on finding
a potential treatment. By examining genes at specific loci that could be risk factors for AD
during the Artificial Intelligence Master programme, we hope to gain further knowledge of the
disease and, after obtaining our master’s degree, to work with pharmaceutical companies on genes
that could be targets for treatment. By making our code available online at the end of this
analysis, we hope to help other researchers develop their own research. We expect gene
expression in specific brain regions to differ between AD patients and controls. To test
this, we combine genome-wide association studies (GWAS), gene expression in patients, brain
gene expression data, protein-protein interactions (PPIs) and MRI data.

GENOME-WIDE ASSOCIATION STUDIES
Before analyzing the data, we visualized and explored the GWAS results obtained from FUMA. Next,
we cleaned the data in Matlab, retaining only the columns GENE, CHR, ZSTAT, SYMBOL and P to get
a smaller dataset. For the data analysis, we plotted the distribution of the p-values in a
histogram to find potentially valuable candidate genes. We wrote code that selects the genes
reaching the significance threshold of 5*10^-8. To see which genes have the largest effect, we
sorted the rows by ZSTAT. We also found in FUMA that the most significant lead SNP, rs41289512,
lies closest to the APOE gene.
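
The cleaning, thresholding and sorting steps described above can be sketched as follows. The analysis itself was done in Matlab; this is only an illustrative Python sketch. The column names and the threshold come from the text, while the rows, the extra NSNPS column and the function names are made-up placeholders.

```python
# Hypothetical gene-level rows (placeholder values, not results from the report).
rows = [
    {"GENE": "g1", "CHR": 19, "ZSTAT": 9.1, "SYMBOL": "GENE_A", "P": 4.5e-20, "NSNPS": 12},
    {"GENE": "g2", "CHR": 2, "ZSTAT": 1.2, "SYMBOL": "GENE_B", "P": 0.11, "NSNPS": 30},
    {"GENE": "g3", "CHR": 19, "ZSTAT": 12.4, "SYMBOL": "GENE_C", "P": 1.3e-35, "NSNPS": 8},
    {"GENE": "g4", "CHR": 11, "ZSTAT": 5.0, "SYMBOL": "GENE_D", "P": 2.9e-7, "NSNPS": 21},
]

KEEP = ("GENE", "CHR", "ZSTAT", "SYMBOL", "P")  # columns retained in the text
ALPHA = 5e-8                                    # significance threshold from the text


def clean(table):
    """Keep only the columns of interest to get a smaller dataset."""
    return [{key: row[key] for key in KEEP} for row in table]


def significant_genes(table, alpha=ALPHA):
    """Select genes with P below alpha, sorted by ZSTAT from largest to smallest."""
    hits = [row for row in table if row["P"] < alpha]
    return sorted(hits, key=lambda row: row["ZSTAT"], reverse=True)


top = significant_genes(clean(rows))
for row in top:
    print(row["SYMBOL"], row["CHR"], row["ZSTAT"], row["P"])
```

Sorting after filtering keeps the ranking restricted to genes that already pass the threshold, which matches the order of the steps in the text.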




Figure 1: GWAS meta-analysis for AD risk (N=455,258 participants, N=2,357 SNPs). Notice the remarkably high peak on
chromosome 19. α = 5*10^-8.


GWAS showed that TOMM40 is the gene with the largest effect (Z = 21.921, P = 8.1347*10^-107),
implying a genetic association with AD risk. A literature search showed that TOMM40 is in
linkage disequilibrium (LD) with APOE. Previous GWAS showed that APOE is indeed a strong risk
factor for late-onset AD (Mise et al., 2017). Both genes are located on chromosome 19, as is
also visible in figure 1 and table 1.
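
As a sanity check on the reported statistics, the relation between a Z-statistic and its p-value can be worked out directly. The sketch below assumes that ZSTAT is a one-sided transform of the gene p-value, as is common for MAGMA-style gene-based output; the text does not state this, so treat it as an assumption. The function name is hypothetical.

```python
import math


def upper_tail_p(z):
    """One-sided p-value P(Z > z) for a standard normal variable."""
    # erfc stays accurate far into the tail, where 1 - cdf(z) would round to 0.
    return 0.5 * math.erfc(z / math.sqrt(2))


p = upper_tail_p(21.921)  # Z reported for TOMM40 in the text
print(f"{p:.3e}")
```

Under that assumption the result is on the order of 10^-107, the same order as the reported P. A two-sided test would double it.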




Table 1: Top 4 genes with largest effects. TOMM40 is the gene with the largest effect in this part of the analysis.





GENE EXPRESSION
The goal of the gene expression part of our analysis is to investigate whether the expression
levels of TOMM40 and APOE differ between AD patients and controls in the entorhinal cortex (EC),
which is involved in long-term memory formation (Puthiyedth et al., 2016). After visualizing the
separate data files obtained from Liang et al. (2008), we cleaned them up and merged them into
one structured table, using ‘probe_id’ as the unique identifier. Next, we tested for a potential
difference between patients and controls with a t-test (T = 2.5047). As the bar plot in figure 2
also shows, gene expression is lower in AD patients than in controls, suggesting that lower
gene expression may be a marker of AD.
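
The merge on ‘probe_id’ and the group comparison can be sketched as follows. This is an illustrative Python sketch, not the Matlab code used for the report. All probe names, sample names and values are invented placeholders, and the equal-variance (pooled) two-sample t-test is an assumption, since the text does not say which variant was used.

```python
import math
import statistics

# Hypothetical stand-ins for the separate data files, both keyed by probe_id.
expression = {
    "probe_1": {"s1": 8.1, "s2": 7.9, "s3": 8.4, "s4": 6.9, "s5": 7.2, "s6": 6.6},
    "probe_2": {"s1": 5.0, "s2": 5.2, "s3": 4.9, "s4": 5.1, "s5": 5.0, "s6": 4.8},
}
annotation = {"probe_1": "GENE_A", "probe_3": "GENE_C"}
groups = {"s1": "control", "s2": "control", "s3": "control",
          "s4": "AD", "s5": "AD", "s6": "AD"}


def merge_on_probe_id(expr, annot):
    """Inner join: keep only probes that appear in both tables."""
    return {pid: {"symbol": annot[pid], "values": expr[pid]}
            for pid in expr if pid in annot}


def pooled_t(a, b):
    """Two-sample t-statistic with pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))


merged = merge_on_probe_id(expression, annotation)
for pid, record in merged.items():
    control = [v for s, v in record["values"].items() if groups[s] == "control"]
    patients = [v for s, v in record["values"].items() if groups[s] == "AD"]
    print(pid, record["symbol"], round(pooled_t(control, patients), 3))
```

With control as the first argument, a positive statistic means the control mean is higher than the patient mean.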


Figure 2: Bar plot. Statistical comparison between controls and
patients affected with AD. Error bars show 1 standard deviation.


PROTEOMICS AND PPI
For the proteomics analysis, we were interested in which other proteins APOE interacts with, to
better understand the function of the APOE gene in human brain cells. This required a PPI list.
HENA, a heterogeneous network-based data set for Alzheimer’s disease, provided this list from the
IntAct molecular interaction database. To draw a conclusion regarding the fidelity of the PPIs,
we first searched for the correct ENSG identifier for APOE and looked up the corresponding
protein. This resulted in a list of the PPIs of APOE only. These can also be seen in the
PPI network (figure 3) built in the STRING database.

HENA also provided a data set, downloadable as an individual txt file, containing a list of
proteins that could possibly have a connection to Alzheimer's disease. We used it to determine
the protein with the most interactions in the disease, presuming that all PPIs in this list are
bidirectional. The amyloid precursor protein (APP) turned out to have the most interactions with
the other proteins in the network, making it the central player within the PPI network
(figure 3). As the name suggests, this protein is the precursor of a peptide called β-amyloid
(Aβ). Mutations in the APP gene cause aberrant cleavage of the APP protein, leading to abnormal
Aβ generation, which in turn leads to the formation of plaques in the brain and causes
early-onset Alzheimer's disease (Wang et al., 2017).

Figure 3: Protein-protein interaction network of APOE. The lines indicate the interactions
between genes. Blue and pink lines show known interactions.
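
The two PPI steps above, listing the partners of one protein and finding the protein with the most interactions, can be sketched as follows. This is an illustrative Python sketch under the stated assumption that every interaction is bidirectional. The edge list is a toy example with placeholder partner names, not the HENA data, and the real file uses ENSG identifiers rather than symbols.

```python
from collections import defaultdict

# Toy edge list (hypothetical); the last pair repeats an edge in reverse order.
edges = [
    ("APOE", "PROT_1"),
    ("APP", "APOE"),
    ("APP", "PROT_1"),
    ("APP", "PROT_2"),
    ("PROT_1", "APP"),
]


def build_network(pairs):
    """Undirected adjacency sets: A-B and B-A count as one interaction."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


network = build_network(edges)

# Interaction partners of a single query protein.
apoe_partners = sorted(network["APOE"])

# Protein with the largest number of distinct partners (highest degree).
hub = max(network, key=lambda protein: len(network[protein]))

print(apoe_partners)
print(hub, len(network[hub]))
```

Storing neighbours in sets means duplicated or reversed pairs do not inflate the interaction count.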
