Class notes

Complete WEEK1 note: Machine Learning & Learning Algorithms(BM05BAM)


Course
Machine Learning & Learning Algorithms (BM05BAM)

Institution
Erasmus Universiteit Rotterdam (EUR)

Book
An Introduction to Statistical Learning

Uploaded on March 12, 2024
Number of pages 10
Written in 2023/2024
Type Class notes
Professor(s) Jason Roos
Contains All classes

machine learning
parametric
non parametric
supervised learning
unsupervised learning
regression
classification
statistical learning
islr2
islr
ml

Book Title: An Introduction to Statistical Learning

Author(s): Gareth James, Daniela Witten

Edition: Unknown
ISBN: 9781071614204

Education
Business Analytics and Management
HLM: Chapter 1, The Machine Learning Landscape

Machine learning = field of study that gives computers the ability to learn without being
explicitly programmed.

Types of machine learning systems: added to ISLR Chapter 2
Main Challenges of Machine Learning:
Generalization
Generalization problems can be caused by sampling bias or overfitting.

It is crucial to use a training set that is representative of the cases you want to generalize
to. This is often harder than it sounds: if the sample is too small, you will have sampling
noise (i.e., nonrepresentative data as a result of chance, outliers, or data errors).

However, even very large samples can be nonrepresentative if the sampling method is
flawed. This is called sampling bias.
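
The difference between sampling noise and sampling bias can be illustrated with a small
simulation. This is only a sketch: the population, the sample sizes, and the "only values
above 45 can be selected" sampling flaw are all made-up assumptions, not part of the
lecture material.

```python
import random
import statistics

random.seed(0)

# Hypothetical population: 100,000 values with a true mean close to 50.
population = [random.gauss(50, 10) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# Sampling noise: means of small random samples scatter widely around the true mean.
small_means = [statistics.fmean(random.sample(population, 10)) for _ in range(200)]
# Means of larger random samples scatter much less.
large_means = [statistics.fmean(random.sample(population, 1000)) for _ in range(200)]

# Sampling bias: a very large sample drawn with a flawed method
# (assumption for illustration: only values above 45 can ever be selected).
biased_pool = [value for value in population if value > 45]
biased_mean = statistics.fmean(random.sample(biased_pool, 50_000))

print(f"true mean:                    {true_mean:.2f}")
print(f"spread of small-sample means: {statistics.stdev(small_means):.2f}")
print(f"spread of large-sample means: {statistics.stdev(large_means):.2f}")
print(f"biased large-sample mean:     {biased_mean:.2f}")
```

Increasing the sample size shrinks the noise, but it does nothing for the biased sample:
its mean stays well above the true mean no matter how many values are drawn.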

Regularization: Constraining a model to make it simpler and reduce the risk of overfitting.
The amount of regularization to apply during learning can be controlled by a
hyperparameter. A hyperparameter is a parameter of the learning algorithm (not of the
model).
- It must be set prior to training and remains constant during training.
- If you set the regularization hyperparameter to a very large value, you will get an
almost flat model (a slope close to zero).
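
A minimal sketch of this effect, assuming a one-feature linear model with a ridge-style
penalty lam * slope**2 on the slope (the data, the function name fit_ridge_line, and the
lam values are illustrative assumptions). Here lam is the hyperparameter, fixed before
training, while the slope and intercept are the model parameters estimated from the data.

```python
import random

random.seed(1)

# Hypothetical data: y = 3x + noise.
xs = [random.uniform(0, 10) for _ in range(50)]
ys = [3 * x + random.gauss(0, 2) for x in xs]

def fit_ridge_line(xs, ys, lam):
    """Fit y = intercept + slope * x while penalizing lam * slope**2.

    The intercept is not penalized. For this one-feature case the
    minimizer has the closed form slope = Sxy / (Sxx + lam).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return intercept, slope

# lam is set before training; the slope is learned during training.
slopes = {}
for lam in [0, 10, 1_000, 1_000_000]:
    _, slopes[lam] = fit_ridge_line(xs, ys, lam)
    print(f"lam={lam:>9}: slope={slopes[lam]:.4f}")
```

With lam = 0 the fit is ordinary least squares; as lam grows, the slope is pulled toward
zero and the fitted line becomes almost flat.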

Hyperparameter vs Parameter
- A model parameter is
  o estimated during model training.
  o optimized internally.
- A hyperparameter must be
  o specified before model training.
  o optimized externally.

Concept drift: This happens when the relationship that the model has estimated changes
after the model is trained, because the external circumstances have changed.
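
A toy illustration of concept drift, assuming a relationship y = slope * x + noise whose
true slope changes from 2.0 to 0.5 after training. All numbers and function names are
hypothetical.

```python
import random

rng = random.Random(3)

def make_rows(slope, n):
    """Hypothetical data: y = slope * x + noise."""
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        rows.append((x, slope * x + rng.gauss(0, 1)))
    return rows

def fit_slope(rows):
    """Least-squares slope for a line through the origin."""
    return sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)

def mse(slope, rows):
    """Mean squared error of the prediction slope * x."""
    return sum((slope * x - y) ** 2 for x, y in rows) / len(rows)

# The model is trained while the true slope is 2.0.
model_slope = fit_slope(make_rows(slope=2.0, n=200))

# New data from the same relationship: the model still works.
mse_before = mse(model_slope, make_rows(slope=2.0, n=200))
# New data after the relationship has changed: the same model is now badly off.
mse_after = mse(model_slope, make_rows(slope=0.5, n=200))

print(f"learned slope: {model_slope:.2f}")
print(f"MSE before drift: {mse_before:.2f}, after drift: {mse_after:.2f}")
```

The model itself has not changed; only the world it describes has, which is why the error
rises on the later data.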

Testing and Validating:
A better option than testing directly on new data is to split your data into two sets: the
training set and the test set. This allows you to test performance before moving on to
actual practice. As these names imply, you train your model using the training set,
and you test it using the test set.
- It is common to use 80% of the data for training and hold out 20% for testing.
However, this depends on the size of the dataset.

The error rate on new cases is called the generalization error (or out-of-sample error), and
by evaluating your model on the test set, you get an estimate of this error.
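
A minimal sketch of an 80/20 split and the resulting estimate of the generalization
error, using only the Python standard library. The helper names (train_test_split,
fit_line, mse) and the data are assumptions made for illustration.

```python
import random

def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of rows and hold out a test_ratio share for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(rows):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(rows)
    x_mean = sum(x for x, _ in rows) / n
    y_mean = sum(y for _, y in rows) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in rows)
    sxx = sum((x - x_mean) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

def mse(model, rows):
    """Mean squared error of a fitted (intercept, slope) model on rows."""
    intercept, slope = model
    return sum((intercept + slope * x - y) ** 2 for x, y in rows) / len(rows)

# Hypothetical data: y = 2x + 1 + noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2 * x + 1 + rng.gauss(0, 1)))

train_set, test_set = train_test_split(data)   # 160 training rows, 40 test rows
model = fit_line(train_set)                    # train on the training set only
test_mse = mse(model, test_set)                # estimate of the generalization error
print(f"train MSE: {mse(model, train_set):.3f}, test MSE: {test_mse:.3f}")
```

The test rows play no part in fitting, so the test MSE is an estimate of how the model
would do on new cases.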

Hyperparameter Tuning and model selection
Suppose you are hesitating between two types of models (say, a linear model and a
polynomial model): how can you decide between them?

When you want to compare just two models: train both on the same training data and
compare their generalization performance on the test data.
When you want to find the best-performing hyperparameter value among 100 options: you
cannot do the same.
- When you measure the generalization error many times on the test set, you
adapt the model and hyperparameters to produce the best model for that
particular set, so the model is unlikely to perform as well on new data.
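
The following sketch shows why repeated selection on the test set is misleading. It
assumes an extreme case: labels are pure coin flips, so no candidate can truly beat 50%
accuracy. The 200 candidate rules are hypothetical stand-ins for many hyperparameter
options.

```python
import random

rng = random.Random(0)
N_FEATURES = 100

def make_rows(n):
    """Rows of random 0/1 features with a coin-flip label unrelated to them."""
    return [([rng.randint(0, 1) for _ in range(N_FEATURES)], rng.randint(0, 1))
            for _ in range(n)]

def accuracy(rows, feature, flip):
    """Accuracy of the rule 'predict the value of one feature', optionally inverted."""
    hits = sum((features[feature] ^ flip) == label for features, label in rows)
    return hits / len(rows)

test_set = make_rows(50)       # small test set that gets reused for every candidate
fresh_data = make_rows(5000)   # stands in for genuinely new cases

# Hypothetical candidates: one rule per feature, plus its inverse.
candidates = [(j, flip) for j in range(N_FEATURES) for flip in (0, 1)]

# Choosing the winner by test-set accuracy adapts the choice to that particular set.
best = max(candidates, key=lambda c: accuracy(test_set, *c))
best_test_acc = accuracy(test_set, *best)
fresh_acc = accuracy(fresh_data, *best)

print(f"accuracy of the selected rule on the reused test set: {best_test_acc:.2f}")
print(f"accuracy of the same rule on fresh data:              {fresh_acc:.2f}")
```

The selected rule looks clearly better than chance on the test set, yet it is no better
than a coin flip on fresh data: the test-set score has stopped being an honest estimate.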

A common solution to this problem is called holdout validation: you simply hold out part
of the training set to evaluate several candidate models and select the best one. The new
held-out set is called the validation set (or sometimes the development set, or dev set).

Process
1. You train multiple models with various hyperparameters on the reduced training
set (the full training set minus the validation set).
2. You select the model that performs best on the validation set (holdout validation
process).
a. If the model performs poorly on the train-dev set (data held out from the
training set), then it must have overfit the training set, so you should try to
simplify or regularize the model, get more training data, and clean up the
training data.
3. You train the best model on the full training set, including the validation set.
4. You estimate the generalization error on the test set.
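
The steps above can be sketched as follows. This is an illustration under assumptions:
the model is a simple k-nearest-neighbours regressor, the hyperparameter is k, and the
data and the 60/20/20 split are made up.

```python
import math
import random

rng = random.Random(7)

# Hypothetical data: y = sin(x) + noise.
data = []
for _ in range(300):
    x = rng.uniform(0, 6)
    data.append((x, math.sin(x) + rng.gauss(0, 0.3)))

test_set = data[:60]               # set aside, used only once at the very end
full_train = data[60:]             # full training set (240 rows)
validation_set = full_train[:60]   # held out from the training set
reduced_train = full_train[60:]    # reduced training set (180 rows)

def knn_predict(train, x, k):
    """Predict y as the mean target of the k training rows closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mse(train, eval_rows, k):
    """Mean squared error on eval_rows of a k-NN model that uses train."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in eval_rows]
    return sum(errors) / len(errors)

candidate_ks = [1, 3, 5, 9, 15, 31, 61]

# Steps 1 and 2: fit each candidate on the reduced training set and
# keep the one with the lowest validation error.
val_errors = {k: mse(reduced_train, validation_set, k) for k in candidate_ks}
best_k = min(val_errors, key=val_errors.get)

# Step 3: retrain on the full training set. For k-NN, "training" is just
# storing the rows, so this means predicting from full_train.
# Step 4: estimate the generalization error once, on the test set.
test_error = mse(full_train, test_set, best_k)

print(f"best k on the validation set: {best_k}")
print(f"test MSE of the final model:  {test_error:.3f}")
```

The test set is touched exactly once, after the hyperparameter has been chosen, so its
error is not biased by the selection.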

The validation set should not be too small: model evaluations will then be imprecise.
The validation set should not be too large: the remaining training set will then be much
smaller than the full training set, so the ranking of candidate models may no longer hold
once the chosen model is retrained on the full training set.

One way to solve this problem is repeated cross-validation, which uses many small
validation sets. Each model is evaluated once per validation set after it is trained on the
rest of the data. By averaging the evaluations of the model, you get a much more accurate
measure of its performance.
- The drawback is that training time is multiplied by the number of validation sets.

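
A minimal k-fold cross-validation sketch in plain Python. The helper names, the linear
model, and the data are illustrative assumptions; the point is only the loop structure:
one training run and one evaluation per validation fold, then an average.

```python
import random

def k_fold_splits(rows, n_folds):
    """Yield (train_rows, validation_rows) pairs, one pair per fold."""
    for i in range(n_folds):
        validation = rows[i::n_folds]
        train = [row for j, row in enumerate(rows) if j % n_folds != i]
        yield train, validation

def fit_line(rows):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(rows)
    x_mean = sum(x for x, _ in rows) / n
    y_mean = sum(y for _, y in rows) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in rows)
    sxx = sum((x - x_mean) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

def mse(model, rows):
    """Mean squared error of a fitted (intercept, slope) model on rows."""
    intercept, slope = model
    return sum((intercept + slope * x - y) ** 2 for x, y in rows) / len(rows)

# Hypothetical data: y = 2x + 1 + noise (rows are already in random order).
rng = random.Random(0)
data = []
for _ in range(100):
    x = rng.uniform(0, 10)
    data.append((x, 2 * x + 1 + rng.gauss(0, 1)))

# One training run per fold: training time grows with the number of folds.
fold_scores = []
for train_rows, validation_rows in k_fold_splits(data, n_folds=5):
    model = fit_line(train_rows)
    fold_scores.append(mse(model, validation_rows))

cv_score = sum(fold_scores) / len(fold_scores)
print("per-fold MSE:", [round(score, 3) for score in fold_scores])
print(f"cross-validated MSE: {cv_score:.3f}")
```

Every row is used for validation exactly once and for training in the other four folds,
which is why the averaged score is steadier than a single small validation set.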
No Free Lunch Theorem
David Wolpert demonstrated that if you make absolutely no assumption about the data,
then there is no reason to prefer one model over any other. This is called the No Free
Lunch (NFL) theorem.
