Lecture notes

Complete WEEK1 note: Machine Learning & Learning Algorithms(BM05BAM)

Institution
Erasmus Universiteit Rotterdam (EUR)

THIS IS A COMPLETE NOTE FROM ALL BOOKS + LECTURE! Save your time for internships and other courses by studying from this note! Are you a 1st/2nd-year Business Analytics Management student at RSM who wants to survive the block 2 Machine Learning module? Are you overwhelmed with 30 pages of re...

Uploaded on 12 March 2024
Number of pages 10
Written in 2023/2024
Type Lecture notes
Lecturer(s) Jason Roos
Contains All lectures

HLM: Chapter 1 The Machine Learning Landscape

Machine learning = field of study that gives computers the ability to learn without being
explicitly programmed.

Types of machine learning systems: covered in the ISLR chapter 2 notes
Main Challenges of Machine Learning:
Generalization
Generalization problems can be caused by sampling bias or overfitting.

It is crucial to use a training set that is representative of the cases you want to generalize
to. This is often harder than it sounds: if the sample is too small, you will have sampling
noise (i.e., nonrepresentative data as a result of chance, outliers, or data errors).

However, even very large samples can be nonrepresentative if the sampling method is
flawed. This is called sampling bias.
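
The difference between sampling noise and sampling bias can be sketched with a small simulation. The population, the sample sizes, and the "only values above 50" selection rule below are illustrative assumptions, not taken from the course material.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 values drawn uniformly from [0, 100).
population = [random.uniform(0, 100) for _ in range(100_000)]


def mean(values):
    return sum(values) / len(values)


true_mean = mean(population)

# Sampling noise: a tiny random sample can land far from the true mean by chance.
small_sample = random.sample(population, 10)

# Sampling bias: a large sample drawn with a flawed method
# (assumed flaw: only values above 50 can ever be selected).
biased_pool = [v for v in population if v > 50]
large_biased_sample = random.sample(biased_pool, 20_000)

# A large sample drawn with a sound (uniform random) method, for comparison.
large_random_sample = random.sample(population, 20_000)

print(f"true mean:           {true_mean:.2f}")
print(f"small random sample: {mean(small_sample):.2f}")
print(f"large biased sample: {mean(large_biased_sample):.2f}")
print(f"large random sample: {mean(large_random_sample):.2f}")
```

In this sketch, making the biased sample larger does not help: its mean stays near 75 because the selection rule itself is wrong, while the large random sample stays close to the population mean.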

Regularization: Constraining a model to make it simpler and reduce the risk of overfitting.
The amount of regularization to apply during learning can be controlled by a
hyperparameter. A hyperparameter is a parameter of a learning algorithm (not of the
model).
- It must be set prior to training and remains constant during training.
- If you set the regularization hyperparameter to a very large value, you will get an
almost flat model (a slope close to zero).
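
As a minimal sketch of the "almost flat model" effect, the code below fits a one-feature line with an L2 (ridge) penalty on the slope. Ridge is used here only as one example of regularization; the function name `fit_ridge_line`, the toy data, and the `alpha` values are assumptions made for illustration.

```python
def fit_ridge_line(xs, ys, alpha):
    """Fit y = intercept + slope * x, minimizing squared error + alpha * slope**2.

    Closed-form solution for a single feature; the intercept is not penalized.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)          # larger alpha shrinks the slope toward 0
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Toy data (made up), roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

fits = {}
for alpha in [0.0, 10.0, 1_000_000.0]:   # alpha is the hyperparameter, set before fitting
    fits[alpha] = fit_ridge_line(xs, ys, alpha)
    intercept, slope = fits[alpha]       # intercept and slope are the model parameters
    print(f"alpha={alpha:>9}: intercept={intercept:.3f}, slope={slope:.5f}")
```

Here `alpha` is fixed before fitting and never changes during it, while `intercept` and `slope` are estimated from the data. With a very large `alpha` the slope is close to zero and the line sits near the mean of `ys`.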

Hyperparameter vs Parameter
- A model parameter is
o estimated during model training.
o internally optimized
- A hyperparameter must be
o specified before model training.
o optimized externally.

Concept drift: It happens when the relationship that the model estimates changes after
the model has been trained, because of an external change in the circumstances.
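
A toy illustration of concept drift, assuming a made-up relationship that shifts from y = 2x at training time to y = 3x later on. The helper names and numbers are hypothetical.

```python
def fit_slope(xs, ys):
    """Least-squares slope for a line through the origin: y = slope * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_abs_error(slope, xs, ys):
    return sum(abs(y - slope * x) for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Relationship at training time (assumed): y = 2x.
ys_at_training = [2.0 * x for x in xs]
slope = fit_slope(xs, ys_at_training)

# Relationship some time after training (assumed): y = 3x.
ys_after_drift = [3.0 * x for x in xs]

error_before = mean_abs_error(slope, xs, ys_at_training)
error_after = mean_abs_error(slope, xs, ys_after_drift)

print(f"learned slope: {slope:.2f}")
print(f"error before drift: {error_before:.2f}")
print(f"error after drift:  {error_after:.2f}")
```

The model itself is unchanged between the two measurements; only the world it describes has moved, which is why the error rises.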

Testing and Validating:
A better option than testing directly on new data is to split your data into two sets: the
training set and the test set. This lets you check performance before moving on to
actual practice. As the names imply, you train your model using the training set,
and you test it using the test set.
- It is common to use 80% of the data for training and hold out 20% for testing.
However, this depends on the size of the dataset: with a very large dataset, a much
smaller held-out share can still give a good estimate.

The error rate on new cases is called the generalization error (or out-of-sample error), and
by evaluating your model on the test set, you get an estimate of this error.
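
A minimal sketch of an 80/20 split and a test-set estimate of the generalization error, using only the standard library. The synthetic data (y = 3x + 1 plus noise), the helper names, and the seeds are assumptions for illustration.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of rows and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(rows)
    x_mean = sum(x for x, _ in rows) / n
    y_mean = sum(y for _, y in rows) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in rows)
    sxx = sum((x - x_mean) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def mean_squared_error(model, rows):
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in rows) / len(rows)


# Synthetic data (made up): y = 3x + 1 with Gaussian noise.
rng = random.Random(0)
rows = []
for _ in range(100):
    x = rng.uniform(0, 10)
    rows.append((x, 3.0 * x + 1.0 + rng.gauss(0, 1)))

train_rows, test_rows = train_test_split(rows)   # 80 train, 20 test
model = fit_line(train_rows)                     # train on the training set only
train_error = mean_squared_error(model, train_rows)
test_error = mean_squared_error(model, test_rows)  # estimate of generalization error

print(f"train MSE: {train_error:.3f}")
print(f"test MSE:  {test_error:.3f}")
```

The test rows are never seen by `fit_line`, so `test_error` stands in for the error on new cases.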

Hyperparameter Tuning and Model Selection
Suppose you are hesitating between two types of models (say, a linear model and a
polynomial model): how can you decide between them?

When you want to compare just two different models: train both models on the same
training data and compare their generalization performance on the test data.
When you want to find the best-performing hyperparameter value among 100 options: you
cannot do the same.
- When you measure the generalization error multiple times on the test set, you
adapt the model and hyperparameters to produce the best model for that
particular set, so it won't perform as well on new data.
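
Why repeated measurement on the test set is misleading can be sketched with a deliberately extreme simulation: 100 candidate "models" that only guess, on labels that are coin flips. The set sizes and the guessing models are assumptions chosen to isolate the selection effect.

```python
import random

rng = random.Random(1)

N_TEST, N_NEW, N_CANDIDATES = 50, 2000, 100

# Labels are pure coin flips, so no model can truly beat 50% accuracy.
test_labels = [rng.randint(0, 1) for _ in range(N_TEST)]
new_labels = [rng.randint(0, 1) for _ in range(N_NEW)]


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Each hypothetical hyperparameter setting yields a model that just guesses.
candidates = []
for _ in range(N_CANDIDATES):
    test_preds = [rng.randint(0, 1) for _ in range(N_TEST)]
    new_preds = [rng.randint(0, 1) for _ in range(N_NEW)]
    candidates.append(
        (accuracy(test_preds, test_labels), accuracy(new_preds, new_labels))
    )

# Pick the candidate with the best test-set accuracy, as if tuning on the test set.
best_test_acc, best_new_acc = max(candidates, key=lambda c: c[0])

print(f"best accuracy on the test set: {best_test_acc:.2f}")
print(f"same model on new data:        {best_new_acc:.2f}")
```

The winning candidate looks clearly better than chance on the test set only because it was selected on that set; on fresh data it falls back to roughly 50%.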

A common solution to this problem is called holdout validation: you simply hold out part
of the training set to evaluate several candidate models and select the best one. The new
held-out set is called the validation set (or sometimes the development set, or dev set).

Process
1. You train multiple models with various hyperparameters on the reduced training
set (the full training set minus the validation set).
2. You select the model that performs best on the validation set (holdout validation
process).
a. If the model performs poorly on the train-dev set (a further part of the training
data held out from training), then it must have overfit the training set, so you
should try to simplify or regularize the model, get more training data, and
clean up the training data.
3. You train the best model on the full training set, including the validation set.
4. You estimate the generalization error on the test set.
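
The four steps can be sketched end to end as follows. The candidate models here are ridge fits with different penalty strengths; the data, split sizes, `candidate_alphas`, and helper names are all assumptions for illustration rather than the course's own setup.

```python
import random


def fit_ridge_line(xs, ys, alpha):
    """Fit y = intercept + slope * x with an L2 penalty alpha on the slope."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return y_mean - slope * x_mean, slope


def mse(model, xs, ys):
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def split_xy(rows):
    return [r[0] for r in rows], [r[1] for r in rows]


# Synthetic data (made up): y = 2x + 1 with noise; rows are already in random order.
rng = random.Random(7)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0, 1)))

test = data[:40]                    # held back until the very end
full_train = data[40:]              # 160 rows
validation = full_train[:32]        # held out from the training set
reduced_train = full_train[32:]     # 128 rows actually used in step 1

candidate_alphas = [0.0, 1.0, 10.0, 100.0, 1000.0]

# Steps 1 and 2: train each candidate on the reduced training set,
# then select the one with the lowest validation error.
val_errors = {}
for alpha in candidate_alphas:
    candidate = fit_ridge_line(*split_xy(reduced_train), alpha)
    val_errors[alpha] = mse(candidate, *split_xy(validation))
best_alpha = min(val_errors, key=val_errors.get)

# Step 3: retrain the selected configuration on the full training set.
final_model = fit_ridge_line(*split_xy(full_train), best_alpha)

# Step 4: a single evaluation on the untouched test set.
test_error = mse(final_model, *split_xy(test))

print(f"validation errors: {val_errors}")
print(f"selected alpha: {best_alpha}, test MSE: {test_error:.3f}")
```

The test set is touched exactly once, after the choice of `best_alpha` is already fixed, so `test_error` is not biased by the selection.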

The validation set should not be too small: model evaluations will then be imprecise.
The validation set should not be too large: the remaining training set will then be much
smaller than the full training set, so a model selected on it may perform differently
once it is retrained on the full training set.

One way to solve this problem is cross-validation, which uses many small validation sets.
Each model is evaluated once per validation set after it is trained on the rest of the data.
By averaging the evaluations of a model, you get a much more accurate measure of its
performance.
- The drawback is that training time is multiplied by the number of validation sets.
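
A minimal k-fold cross-validation sketch in plain Python. The fold construction, the least-squares model, and the synthetic data are assumptions for illustration; libraries such as scikit-learn provide their own implementations.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def mse(model, xs, ys):
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k):
    """Return (mean validation MSE, per-fold MSEs); the model is trained k times."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        val_x = [xs[i] for i in fold]
        val_y = [ys[i] for i in fold]
        model = fit_line(train_x, train_y)       # trained on the other folds
        scores.append(mse(model, val_x, val_y))  # evaluated on the held-out fold
    return sum(scores) / len(scores), scores


# Synthetic data (made up): y = 2x + 1 with noise, generated in random order.
rng = random.Random(3)
xs, ys = [], []
for _ in range(50):
    x = rng.uniform(0, 10)
    xs.append(x)
    ys.append(2.0 * x + 1.0 + rng.gauss(0, 1))

mean_score, fold_scores = cross_validate(xs, ys, k=5)
print(f"per-fold MSE: {[round(s, 3) for s in fold_scores]}")
print(f"mean MSE:     {mean_score:.3f}")
```

With `k=5` the model is fitted five times, which is the training-time cost noted above; in exchange, every row is used for validation exactly once.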

No Free Lunch Theorem
David Wolpert demonstrated that if you make absolutely no assumption about the data,
then there is no reason to prefer one model over any other. This is called the No Free
Lunch (NFL) theorem.
