Notes lecture 6: machine learning. Introduction to Bioinformatics
Course
Introduction to Bioinformatics
Institution
Vrije Universiteit Amsterdam (VU)
Introduction to bioinformatics
Lecture 6 – 24 January 2024
Machine learning
What is machine learning?
- Machine learning is a form of Artificial Intelligence
- Finding interesting patterns in “big data” using statistics.
o Machine learning is the study of computer algorithms that improve
automatically through experience. [...] Machine learning algorithms build a
model based on sample data, known as "training data", in order to make
predictions or decisions without being explicitly programmed to do so.
- Examples:
o Email filtering
o Speech recognition (Alexa, Siri, etc.)
o Those personalized ads for sneakers that keep following you because you
bought a pair online once.
Example of machine learning: personalized medicine
- What makes patients similar to each other?
- Which patient should get which drug?
- Based on (molecular) profiling of the patient
Unsupervised clustering reveals structure in the samples
Supervised learning reveals that good-prognosis and poor-prognosis patients have different gene
expression profiles
We can use this information to predict the outcome of new patients
Good prognosis: no need for chemo
Train a model to predict
Classification rule
Unsupervised learning: discover interesting structure in the
data
- Clustering
- Dimensionality reduction: PCA, tSNE, UMAP
- Uses “unlabeled” data
o You have the genes (the measured features)
o Label: yes or no …
o Unsupervised: no label
Two groups of patients: one with low expression and one with high
expression
Supervised learning: make predictions on new data, given
training data:
- Classification & Regression
- Uses “labeled” data (i.e. samples from healthy &
diseased patients)
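As a rough illustration of supervised classification, the sketch below trains a nearest-centroid classifier on labeled samples and then predicts the label of a new sample. The choice of classifier, the function names `train_centroids` and `predict`, and the toy expression values are all assumptions made for illustration; the lecture does not prescribe a particular method.

```python
from collections import defaultdict

def train_centroids(samples, labels):
    """Return the mean feature vector (centroid) of each class label."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in grouped.items()}

def predict(centroids, x):
    """Classification rule: assign x to the class with the nearest centroid."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )

# Hypothetical toy data: expression levels of two genes per patient.
train_x = [[1.0, 5.0], [1.2, 4.8], [0.8, 5.1],
           [4.0, 1.0], [4.2, 0.9], [3.9, 1.2]]
train_y = ["healthy"] * 3 + ["diseased"] * 3

model = train_centroids(train_x, train_y)
print(predict(model, [1.1, 4.9]))  # new, unlabeled patient
print(predict(model, [4.1, 1.1]))  # another new patient
```

The labels are what make this supervised: without `train_y` there would be no classes to compute centroids for.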
Unsupervised machine learning: clustering
How?
- Hierarchical clustering
- K-means clustering
- Fuzzy clustering
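To give an idea of how hierarchical clustering can work, here is a minimal sketch of the agglomerative (bottom-up) variant with single linkage: every point starts as its own cluster, and the two closest clusters are merged until the requested number remains. The linkage choice, the function name `hierarchical_clustering` and the toy points are assumptions for illustration, not taken from the lecture.

```python
def hierarchical_clustering(points, n_clusters):
    """Agglomerative single-linkage clustering.

    Returns a list of clusters, each a list of point indices.
    Assumes n_clusters >= 1 and at least one point.
    """
    def dist2(i, j):
        # Squared Euclidean distance between points i and j.
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))

    # Start with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters whose closest members are nearest.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist2(i, j) for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair into a single cluster.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical 2D toy data with two visually separate groups.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
              (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
print(hierarchical_clustering(toy_points, 2))
```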
Why?
- Are there subgroups of a disease? (e.g. multiple types of breast cancer, which require
different treatment)
Can we split the data meaningfully into different groups?
Do we find a pattern here?
The example data is 2D; normally D is much larger (e.g. the number of genes measured), but I can’t
draw that
There is no single correct answer. It is not clear how many groups there are.
Algorithm to put data into two groups
K-means clustering algorithm
Decide how many clusters you should have
This is not always straightforward
1. The algorithm starts by randomly assigning each of your observations to only one of
the k clusters.
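Step 1 can be sketched in Python as follows. The notes only give this first step, so the loop after it uses the standard k-means iteration as an assumption: compute the centroid of each cluster, reassign every point to its nearest centroid, and repeat until the assignment no longer changes. The function name `k_means`, the seed, the handling of empty clusters and the toy points are hypothetical.

```python
import random

def k_means(points, k, seed=0, max_iter=100):
    """Cluster points into k groups; returns one cluster index per point."""
    rng = random.Random(seed)
    # Step 1 (from the notes): randomly assign each observation
    # to one of the k clusters.
    labels = [rng.randrange(k) for _ in points]
    for _ in range(max_iter):
        # Assumed step: compute the centroid (mean) of every cluster.
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                # Assumption: re-seed an empty cluster with a random point.
                members = [rng.choice(points)]
            centroids.append(
                tuple(sum(col) / len(members) for col in zip(*members))
            )
        # Assumed step: reassign each point to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignment is stable, so stop
        labels = new_labels
    return labels

# Hypothetical 2D toy data with two visually separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
print(k_means(points, k=2))
```

Which group is called cluster 0 and which cluster 1 depends on the random start, so only the grouping itself is meaningful, not the index.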