All lectures (theory part) in one document. I've added many extra sources & links for better understanding of the topics discussed in the lectures.
Check my other summaries for a full guide through all the practicals!
- Scikit-learn tutorial materials. May be hard to follow for the less advanced students
- The Coursera course on Machine Learning with Andrew Ng. Uses Matlab/Octave rather than Python
Part 2, week 1
Can we automate problem solving?
For example: flagging spam in your e-mail
Spam versus non-Spam
Page 3 gives 2 sets of e-mail headers with different examples of titles:
- ‘your 5 million released, lottery winner’, ‘your overdue payment’ etc..
- ‘thesis evaluation’, ‘Link to the available data’ etc..
What makes you notice which ones are likely to be spam?
Machine learning is about learning from experience.
- Find examples of spam and non-spam
- Come up with a learning algorithm
- A learning algorithm infers rules from examples
- These rules can be applied to new data (emails)
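The four steps above can be sketched in a few lines of Python. This is only a toy illustration, not the method from the lecture: the "learning algorithm" here simply collects words that occur in spam titles but never in non-spam titles. The function names `learn_spam_words` and `flag_spam` are made up for this sketch, and the labels assume that the first group of titles on page 3 is spam and the second is not.

```python
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into words, ignoring commas."""
    return title.lower().replace(",", " ").split()


def learn_spam_words(examples):
    """Infer a rule from labelled examples: words seen only in spam titles."""
    spam_words, normal_words = Counter(), Counter()
    for title, is_spam in examples:
        target = spam_words if is_spam else normal_words
        target.update(tokenize(title))
    return {word for word in spam_words if word not in normal_words}


def flag_spam(title, spam_words):
    """Apply the learned rule to a new e-mail title."""
    return any(word in spam_words for word in tokenize(title))


# Step 1: examples of spam and non-spam (labels are an assumption of this sketch)
examples = [
    ("your 5 million released, lottery winner", True),
    ("your overdue payment", True),
    ("thesis evaluation", False),
    ("Link to the available data", False),
]

# Steps 2 and 3: the algorithm infers rules from the examples
rules = learn_spam_words(examples)

# Step 4: the rules are applied to new data
print(flag_spam("lottery payment released", rules))  # True
print(flag_spam("data for thesis", rules))           # False
```

A real spam filter would use many more examples and a statistical model, but the workflow is the same: examples in, rules out, rules applied to new emails.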
Types of learning problems:
- Regression
o Predict person’s age
- Binary classification: (predict the answer to a yes/no question)
o Detect spam
- Multiclass classification: (one of finite options)
o Recognize type of birds
o Classify newspaper articles as… (categories)
- Multilabel classification: (a finite set of yes/no answers)
o Assign songs to one or more genres
- Ranking: (order objects according to relevance)
o Google ranking search results
- Sequence labelling: (input: sequence of elements, output: corresponding sequence of labels)
o Translate between languages
o Summarize text
- Autonomous behaviour: (input: measurements from sensors, output: instructions for actuators, e.g. steering, accelerating etc.)
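One way to keep these problem types apart is to look at what the output looks like in each case. The snippet below is only a sketch: every value in it is invented for illustration, and the dictionary `targets` is a hypothetical name, not something from the lectures.

```python
# Hypothetical example outputs, one per problem type (all values are made up).
targets = {
    "regression": 34.5,                              # a number, e.g. age in years
    "binary classification": True,                   # yes/no, e.g. is it spam?
    "multiclass classification": "sparrow",          # exactly one of a finite set
    "multilabel classification": {"rock", "blues"},  # any subset of a finite set
    "ranking": ["doc3", "doc1", "doc2"],             # items ordered by relevance
    "sequence labelling": ["A", "B", "A"],           # one label per input element
    "autonomous behaviour": {"steering": -0.1, "accelerate": 0.3},  # actuator instructions
}

for problem, output in targets.items():
    print(f"{problem}: {type(output).__name__}")
```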
Part 3, week 1
How well is the algorithm learning?
Evaluation: how well is it performing its task?
It's good to keep in mind that different classification/learning problems call for different evaluation methods.
For predicting age (regression):
- Mean absolute error: the average (absolute) difference between true value and predicted value
- Mean squared error: the average square of the difference between true value and predicted value
For predicting gender (binary classification if you assume 2 genders):
- The error rate: proportion of mistakes
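The three metrics above are short enough to write out by hand. The sketch below uses plain Python and invented numbers (the ages and labels are not from the lecture). Scikit-learn offers ready-made versions of the first two in `sklearn.metrics`, but no library is needed to see how they work.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def error_rate(y_true, y_pred):
    """Proportion of predictions that differ from the true label."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up ages: the errors are 2, 0, 5 and 3 years
true_ages = [20, 30, 40, 50]
pred_ages = [22, 30, 35, 53]
print(mean_absolute_error(true_ages, pred_ages))  # (2 + 0 + 5 + 3) / 4 = 2.5
print(mean_squared_error(true_ages, pred_ages))   # (4 + 0 + 25 + 9) / 4 = 9.5

# Made-up class labels: 2 of the 4 predictions are wrong
true_labels = ["a", "a", "b", "b"]
pred_labels = ["a", "b", "b", "a"]
print(error_rate(true_labels, pred_labels))       # 2 / 4 = 0.5
```

Note that the squared error punishes the 5-year mistake much more heavily than the 2-year one, which is the main practical difference between the two regression metrics.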
Kinds of mistakes:
- False positive
o Flagged as spam, but is not spam
- False negative
o Not flagged, but is spam
False positives are a bigger problem for flagging spam.
Precision and Recall
- Metrics which focus on one kind of mistake
- Precision: what fraction of flagged emails were real spam?
- Recall: what fraction of real spam was flagged?
For spam classification we aim for high precision: a spam email that slips into your normal inbox (a recall error) is less harmful than a normal email that gets marked as spam (a precision error).
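F = true positives + false positives (everything that was flagged as spam)
S = true positives + false negatives (everything that really is spam)
Precision = true positives / F
Recall = true positives / S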
Example 1:
#   True       Predicted   Outcome
1   Spam       Spam        True positive
2   Spam       Not spam    False negative
3   Not spam   Not spam    True negative
4   Not spam   Spam        False positive
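Precision and recall for Example 1 can be checked with a small sketch. It follows the definitions F = true positives + false positives and S = true positives + false negatives, with precision as true positives divided by F and recall as true positives divided by S. The function name `precision_recall` and the choice to return 0.0 when nothing is flagged (or nothing is spam) are assumptions of this sketch.

```python
def precision_recall(y_true, y_pred, positive="spam"):
    """Return (precision, recall) for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    flagged = tp + fp      # F in the notes: everything flagged as spam
    real_spam = tp + fn    # S in the notes: everything that really is spam
    # Assumption: define the score as 0.0 when the denominator is zero
    precision = tp / flagged if flagged else 0.0
    recall = tp / real_spam if real_spam else 0.0
    return precision, recall


# The four emails from Example 1
y_true = ["spam", "spam", "not spam", "not spam"]
y_pred = ["spam", "not spam", "not spam", "spam"]

precision, recall = precision_recall(y_true, y_pred)
print(precision)  # 1 true positive / 2 flagged = 0.5
print(recall)     # 1 true positive / 2 real spam = 0.5
```

Email 3 (the true negative) does not appear in either formula, which is why these metrics behave differently from the plain error rate.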
F-score:
- Harmonic mean between precision and recall, a kind of average
- Aka F-Measure
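Fβ = (1 + β²) · precision · recall / (β² · precision + recall)
- Parameter β quantifies how much more we care about recall than precision
- For example, F0.5 is the metric to use if we care half as much about recall as about precision
If β = 1 it is the standard F-score (also written F1), the harmonic mean of precision and recall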
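To see the effect of the parameter, here is a minimal sketch using the standard definition Fβ = (1 + β²) · precision · recall / (β² · precision + recall). The function name `f_beta`, the example values (precision 0.5, recall 1.0) and the choice to return 0.0 when both inputs are zero are assumptions made for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    b2 = beta ** 2
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0  # assumption: score is 0 when precision and recall are both 0
    return (1 + b2) * precision * recall / denominator


p, r = 0.5, 1.0  # made-up values: low precision, perfect recall

print(round(f_beta(p, r, beta=1.0), 3))  # 0.667, plain harmonic mean
print(round(f_beta(p, r, beta=0.5), 3))  # 0.556, pulled towards precision
print(round(f_beta(p, r, beta=2.0), 3))  # 0.833, pulled towards recall
```

With β below 1 the score moves towards the (lower) precision, with β above 1 it moves towards the (higher) recall, which matches the idea of caring half as much or twice as much about recall.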
Multiclass classification:
Who am I buying these notes from?
Stuvia is a marketplace, so you are not buying this document from us, but from seller jeroenverboom. Stuvia facilitates payment to the seller.