Elements of Machine Learning summary + practice exam
Course
Elements of Machine Learning
Institution
Tilburg University (UVT)
Summary of all lessons from Elements of Machine Learning, enhanced with explanations by ChatGPT to deepen your understanding. Perfect for your open book exam in Elements of Machine Learning. Additionally, a practice exam based on last year's test is included.
Elements of machine learning
Book: https://www.nrigroupindia.com/e-book/Introduction%20to%20Machine%20Learning%20with%20Python%20(%20PDFDrive.com%20)-min.pdf
Machine learning
Intelligence in machine learning = being able to do a task
Greyhound vs Labrador
We can collect thousands of images for the computer,
or we could describe each dog to the computer:
- Height of the dog
- Weight
- Colour
These are called features.
Adding informative features gives the computer more to go on and makes its predictions more precise.
Branches of ML
- Supervised learning: learn from examples plus the desired ‘target’ behaviour of the computer.
E.g. along with the input data, we give the model examples of how it needs to behave: if the input is a dog, it needs to recognize it as a dog. We provide both the input and its label (the target).
- Unsupervised learning: useful if you want to find structure in data. Even if you don’t tell the computer what cats and dogs are, it might find structure and tell you that there are classes. E.g. fraudulent transactions: using structure to find out that a transaction made in Nigeria or Cambodia is different from the normal pattern.
- Reinforcement learning: learning a behaviour by interacting with an environment and receiving rewards and punishments based on the current behaviour. Connected to reinforcement learning in animals. The algorithm decides how to change its behaviour to maximize the amount of reward.
Supervised vs unsupervised learning
- Supervised learning requires labelled data, i.e. data samples that come with a target label
- Labelled data is expensive: someone must look at each sample and assign a label. This has to be done by real people, who can label at most about 1000 samples a day and need to be paid for it.
- Unlabelled data is cheap: e.g. it is easy to collect millions of images and texts from the internet
Examples
- Email spam filter: given an email learn to recognize whether it is spam or legit
- Speech recognition: Transcribe spoken sentences into text
- Language translation: a simple language prediction model that uses its knowledge to predict what to answer
- Recommender systems: given data on the online behaviour of users in general (and possibly of a specific user), provide suggestions for items that are most pertinent to a particular user.
- Sentiment analysis: Given some text (e.g. a tweet or a product review) determine whether the
content is positive, negative, or neutral
- Time series forecasting: Make future predictions based on historical data (regression)
- Fraud detection
Types of data
Images
- Computers work with numbers, so data has to be represented accordingly.
- Pictures/visual data: arrays of numbers (RGB values for each pixel)
- Colour images: 3 colour channels, commonly red, green and blue
- Grayscale: light intensity, integer values between 0 and 255 (for 8-bit images)
- Binary: each pixel is either 0 or 1, useful to find contours
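The three image types above can be sketched with plain nested lists. This is only an illustration: the pixel values are made up, averaging the channels is one simple grayscale convention assumed here, and the helper names `to_grayscale` and `to_binary` are hypothetical.

```python
# A tiny 2x2 colour image: each pixel is an (R, G, B) triple of integers in 0..255.
# The pixel values are made up for illustration.
colour_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def to_grayscale(image):
    """Average the three channels of each pixel (one simple convention, assumed here)."""
    return [[sum(pixel) // 3 for pixel in row] for row in image]

def to_binary(gray, threshold=128):
    """Map each intensity to 1 if it is at least `threshold`, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]

gray_image = to_grayscale(colour_image)
binary_image = to_binary(gray_image)
print(gray_image)    # [[85, 85], [85, 255]]
print(binary_image)  # [[0, 0], [0, 1]]
```

In practice images are usually stored in array libraries rather than nested lists, but the idea is the same: a grid of numbers per pixel.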
Text/language:
- We need to convert letters or words into a format computers can understand.
- Usually we split text into individual letters or words and convert each into a vector.
i. Example: each letter can become a vector with 26 elements, all set to 0 except for the element at the index of the character, which is set to 1. E.g. C = (0,0,1,0,...,0)
ii. With words we can do the same thing using a dictionary and a vector of the same length as the dictionary.
iii. The resulting vector is usually too large and sparse to use directly, so it is often passed through an ‘embedding’ function to compress it.
- This applies to any categorisation: we need to use a numerical representation rather than the category names.
- One-hot encoding represents categorical data with a vector of numbers such that the elements of the vector are all 0 except for the correct category, which is given a 1.
- Categorical data: discrete and countable.
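One-hot encoding as described above can be sketched in a few lines. This is a minimal illustration, not a library implementation: the function names and the three-word vocabulary are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z', 26 letters

def one_hot_letter(letter):
    """Return a 26-element vector that is 1 at the letter's index and 0 elsewhere."""
    vector = [0] * len(ALPHABET)
    vector[ALPHABET.index(letter.upper())] = 1
    return vector

def one_hot_word(word, vocabulary):
    """Same idea for words: the vector is as long as the vocabulary (the 'dictionary')."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(word)] = 1
    return vector

c_vector = one_hot_letter("C")
print(c_vector[:5])   # [0, 0, 1, 0, 0]
print(sum(c_vector))  # 1

vocabulary = ["cat", "dog", "labrador"]  # hypothetical toy vocabulary
print(one_hot_word("dog", vocabulary))   # [0, 1, 0]
```

With a realistic vocabulary of tens of thousands of words, each vector would be almost entirely zeros, which is why such vectors are typically compressed with an embedding.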
Often we work with data that consists of multiple different fields.
Example: Credit card fraud detection, objective: Detect whether a given transaction is legit or not
The Iris dataset
- A dataset is a collection of data
- Each instance in the dataset has features/attributes that describe it, and may come with a label/class
- The features of each instance make it possible to determine which species of flower that instance is
Wrapping up
- Machine learning is not magic
- We can work with different types of data, but we have to organize them in a way that
computers can understand (vectors of numbers)
- In simple cases we can see patterns in data by eye
- Machine learning methods can do that for us, even when we can't
Tutorial 1
Exploring the first data set:
- First import the pandas package: import pandas as pd
- Make sure the file you want to read is in the same folder as your code
- Define the variable like this:
i. iris = pd.read_csv("iris.csv"), where pd is the module, read_csv the function and "iris.csv" the name of the file
- Look at the first x rows -> iris.head(x)
- Explore the shape by using iris.shape -> (150, 5): 150 rows and 5 columns
- Explore how many instances there are of each type by using iris["class"].value_counts()
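The exploration steps above can be tried without the file at hand. The sketch below reads a small in-memory stand-in for iris.csv; the feature column names and the four illustrative rows are assumptions (only the "class" column name comes from these notes), so the printed shape differs from the full dataset.

```python
import io
import pandas as pd

# Stand-in for iris.csv: a few illustrative rows in the same 5-column layout.
# With the real file you would call pd.read_csv("iris.csv") instead.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,class
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
iris = pd.read_csv(io.StringIO(csv_text))

print(iris.head(2))                  # first 2 rows
print(iris.shape)                    # (4, 5) here; (150, 5) for the full dataset
print(iris["class"].value_counts())  # number of rows per class
```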
Lecture 2
Supervised learning and k-NN
Recap
- Methods that extract knowledge from the observed data
- Looking for patterns that can be exploited to solve a task
- Closely related to statistics and optimization
- We usually want to predict something
i. What object is in a picture
ii. Which direction to steer a self-driving car
iii. What sentence a user is saying
iv. Etc...
Many branches of ML
- Supervised learning
i. Learn from examples + the desired ‘target’ behaviour of the computer. Finding patterns in data
- Unsupervised learning
- Reinforcement learning -> no dataset but an environment: an agent (e.g. a robot) acts in it and gets a reward.
i. The robot acts in cycles and gets a grade (reward) after each one
ii. It gets +1 if it completes the task and -1 if it doesn’t
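The +1/-1 reward cycle can be illustrated with a deliberately tiny sketch. It is not a full reinforcement learning algorithm: the environment, its two actions and the try-everything-then-pick-the-best strategy are all assumptions made for illustration.

```python
def environment(action):
    """Hypothetical environment: only the action 'right' completes the task."""
    return 1 if action == "right" else -1

actions = ["left", "right"]
total_reward = {action: 0 for action in actions}

# Exploration: try every action a few times and record the rewards received.
for _ in range(3):
    for action in actions:
        total_reward[action] += environment(action)

# Exploitation: from now on prefer the action that earned the most reward.
best_action = max(actions, key=lambda a: total_reward[a])
print(total_reward)  # {'left': -3, 'right': 3}
print(best_action)   # right
```

The point is only that the agent never sees labelled examples: it learns which behaviour is better from the rewards it collects.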
Supervised learning
- Works by giving examples
- In supervised learning we want to find a model that maps inputs to outputs given examples of correct pairs.
- The model is a mathematical function: it takes an input and returns an output
- Example: a cat vs dog classifier takes images as input and outputs whether the image is of a cat or a dog
i. A dataset is collected containing a large number of pictures of cats and dogs
ii. Classes are encoded as vectors like {0,0,1}
iii. For each picture a human writes a label to tell whether the picture contains a cat or a dog
iv. The dataset is a collection of {image, label} pairs
v. The {image, label} pairs are used to show our machine learning model how it should behave
vi. The objective is for the chosen ML method to find a function that behaves as shown and that generalizes to new, unseen images.
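The idea of learning a function from {input, label} pairs can be sketched with a 1-nearest-neighbour rule, which the lecture title mentions but these notes do not spell out here. To keep it short, the sketch assumes each animal is described by two made-up features (height and weight) instead of an image; the name `predict` and all numbers are hypothetical.

```python
import math

# Labelled examples: (features, label) pairs. The features are
# (height in cm, weight in kg); all numbers are made up for illustration.
training_pairs = [
    ((25.0, 4.0), "cat"),
    ((23.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest_features, nearest_label = min(
        training_pairs,
        key=lambda pair: math.dist(pair[0], features),
    )
    return nearest_label

# Inputs that are not in the training data: the function has to generalize.
print(predict((24.0, 4.5)))   # cat
print(predict((58.0, 28.0)))  # dog
```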
Supervised learning: two tasks
Classification
- The model trained on the data defines a decision boundary that separates the data
Regression
- The model fits the data to describe the relation between 2 features or between a feature (e.g. height) and the label (e.g. yes/no)
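The two tasks can be contrasted in a small sketch. It assumes a least-squares straight line for the regression part and a single threshold on one feature as the decision boundary; both are simple stand-ins chosen for illustration, and the data points are made up.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def classify(x, boundary):
    """A one-feature decision boundary: everything above it is class 1."""
    return 1 if x > boundary else 0

# Regression: made-up points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0

# Classification: the boundary splits the feature axis into two classes.
print(classify(1.5, boundary=2.0))  # 0
print(classify(2.5, boundary=2.0))  # 1
```

Regression returns a number on a continuous scale, while classification returns one of a fixed set of classes.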
Who am I buying these notes from?
Stuvia is a marketplace, so you are not buying this document from us, but from seller noahveldhoen. Stuvia facilitates payment to the seller.