Summary

Summary Machine Learning

Author: liekebuuron


Course
Machine Learning

Institution
Tilburg University (UVT)

English Summary of Machine Learning course of Master Data Science and Society at Tilburg University. A summary of lecture materials, readings, and notes.


Uploaded on 1 February 2023
Number of pages 61
Written in 2022/2023
Type Summary

Machine learning
Lecture 1

Machine learning is about automation of problem solving. It is the study of computer
algorithms that improve automatically through experience. It involves becoming better at a
task T, based on some experience E, with respect to some performance measure P.
Examples:
- Spam detection
- Movie recommendation
- Speech recognition
- Credit risk analysis
- Autonomous driving
- Medical diagnosis.
It comes up with a learned algorithm. It is about learning from experience.

What does it involve?
- ML may involve a notion of generalization. When the machine learns relationships
between the input and the output, we want these to hold on unseen data as well, which is
the concept of generalization. Is it safe to assume that current observations
generalize to future observations?
- Some critical components are annotated data, an objective, an optimization algorithm,
features/representations, and assumptions.
- We assume the dataset represents the population. With more data, the output
usually becomes better.
- There is an optimization algorithm that incrementally works towards the best
outcome.
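
One common way to check the generalization idea above is to hold out part of the annotated data and treat it as "unseen". The sketch below is only an illustration: the helper name hold_out_split, the 20% test fraction, and the toy data are assumptions, not something taken from the lecture.

```python
import random


def hold_out_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold out a fraction as unseen test data.

    Hypothetical helper for illustration; the fraction and seed are assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled examples become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Toy annotated dataset: (input, output) pairs.
data = [(x, 2 * x + 1) for x in range(10)]
train, test = hold_out_split(data)
print(len(train), "training examples,", len(test), "held-out examples")
```

A model would be fitted on train only and scored on test, so the score says something about data it has not seen.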

Different types of learning:
Starting points:
- Supervised learning: annotated/labelled dataset / ground truth
o Classification: discrete variable
o Regression: continuous variable
- Unsupervised learning: unlabeled dataset
o clustering

Examples:
Spam vs non-spam?

This is usually a text mining problem. The emails have to be pre-processed in such a way
that we can create features from the dataset. This is a binary classification problem. The
learning algorithm should come up with a function that maps the representation of an
email to its label (spam or non-spam).
- Find examples of spam and non-spam
- Come up with a learning algorithm
- A learning algorithm infers rules from examples: if (A or B or C) and not D, then spam
- These rules can then be applied to new data (emails)
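
To make the rule format above concrete, here is a minimal sketch that applies a rule of the form "if (A or B or C) and not D, then spam" to new emails. The rule is written by hand here; a learning algorithm would infer it from examples. The meanings given to A, B, C and D are assumptions chosen for illustration.

```python
def extract_features(email: str) -> dict:
    """Turn an email into boolean features A-D (assumed, illustrative cues)."""
    text = email.lower()
    return {
        "A": "free" in text,        # assumed spam cue
        "B": "winner" in text,      # assumed spam cue
        "C": "click here" in text,  # assumed spam cue
        "D": "meeting" in text,     # assumed non-spam cue
    }


def is_spam(email: str) -> bool:
    f = extract_features(email)
    # The rule from the notes: if (A or B or C) and not D, then spam.
    return (f["A"] or f["B"] or f["C"]) and not f["D"]


new_emails = [
    "You are a WINNER, click here to claim your prize",
    "Free slot for our project meeting on Monday?",
    "Lecture slides are attached",
]
predictions = [is_spam(email) for email in new_emails]
print(predictions)
```

The second email contains "free" but also "meeting", so the "not D" part of the rule keeps it out of the spam folder.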

Learning algorithms:
- See several different learning algorithms
- Implement 2-3 simple ones from scratch in Python
- Learn about Python libraries for ML (scikit-learn)
- How to apply them to real-world problems

Machine learning examples:
- Recognize handwritten numbers and letters
- Recognize faces in photos
- Determine whether text expresses positive, negative or no opinion
- Guess a person’s age based on a sample of writing
- Flag suspicious credit-card transactions
- Recommend books and movies to users based on their own and others’ purchase
history
- Recognize and label mentions of people’s or organizations’ names in text

Types of learning problems:
Regression:
- Response: a (real) number
- Predict a person’s age
- Predict price of stock
- Predict student’s score on exam
Binary classification:
- Response: Yes/No answer
- Detect spam
- Predict polarity of product review: positive vs negative
Multiclass classification:
- Response: one of a finite set of options
- Classify newspaper article as:
o Politics, sports, science, technology, health, finance
- Detect species based on photo
o Passer domesticus, Calidris alba, Streptopelia decaocto, Corvus corax
Multilabel classification:
- The output does not have to be a single label; it can be several labels at once
(this is the difference from multiclass classification)
- Assign songs to one or more genres (rock, pop, metal)
- You are not trying to find all of the labels correctly, but you are trying to find the
most correct labels during training.
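
The difference between a multiclass and a multilabel target can be sketched as follows. The genre list comes from the example above; the indicator-vector encoding and the helper name to_indicator are assumptions made for illustration.

```python
GENRES = ["rock", "pop", "metal"]


def to_indicator(labels):
    """Encode a set of genres as a 0/1 vector, one position per genre."""
    return [1 if genre in labels else 0 for genre in GENRES]


# Multiclass: exactly one label per song.
multiclass_target = "rock"

# Multilabel: any subset of the labels per song.
multilabel_target = {"rock", "metal"}
encoded = to_indicator(multilabel_target)
print(encoded)
```

With this encoding, each position can be treated as its own yes/no question about the song.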
Autonomous behavior (example of a car)
- Input: measurements from sensors – camera, microphone, radar, accelerometer.
- Response: instructions for actuators – steering, accelerator, brake.

Evaluation:
- Choose a baseline, choose a metric, compare!
- Different tasks, different metrics:
o Predicting age
o Flagging spam

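Two metrics that we often use in regression problems:
- Mean absolute error (MAE): the average absolute difference between the true value and
the predicted value (y_n is the true value (ground truth), ŷ_n the predicted value, N the
number of examples):

$$\mathrm{MAE} = \frac{1}{N} \sum_{n=1}^{N} |y_n - \hat{y}_n|$$

- Mean squared error (MSE): the average squared difference between the true value and
the predicted value. It is more sensitive to outliers than MAE, but it is differentiable
everywhere (MAE is not differentiable where the error is zero):

$$\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2$$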
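
As a minimal sketch of the two regression metrics, the functions below compute MAE and MSE from their definitions. The ages are made-up values; the last prediction is deliberately far off to show how strongly one outlier affects MSE.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)


# Made-up example: true ages and predicted ages.
y_true = [20, 30, 40, 50]
y_pred = [22, 28, 40, 70]  # the last prediction is an outlier (off by 20)

mae = mean_absolute_error(y_true, y_pred)  # (2 + 2 + 0 + 20) / 4
mse = mean_squared_error(y_true, y_pred)   # (4 + 4 + 0 + 400) / 4
print(mae, mse)
```

The single large error contributes 20 of the 24 units in the MAE sum, but 400 of the 408 units in the MSE sum.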

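For binary classification problems, the metrics often used are:
- Accuracy: the fraction of examples that are classified correctly
- Error rate: the fraction of examples that are classified incorrectly (1 minus accuracy)
These are not very informative, especially if the dataset is not balanced.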
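
A small sketch of why accuracy can mislead on unbalanced data: a baseline that never flags anything still scores high. The 5% spam proportion is a made-up assumption.

```python
# Made-up unbalanced dataset: 5 spam emails (1) and 95 non-spam emails (0).
y_true = [1] * 5 + [0] * 95

# Baseline that always predicts the majority class (non-spam).
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
error_rate = 1 - accuracy
spam_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(f"accuracy={accuracy:.2f}, error rate={error_rate:.2f}, spam found={spam_found}")
```

The baseline reaches 95% accuracy while catching no spam at all, which is why a metric that focuses on the positive class is needed.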

Classification:
- False positive – flagged as spam, but not spam
- False negative – not flagged, but is spam
- False positives are a bigger issue for this problem!
- True positive – spam classified as spam
- True negative – not-spam classified as not-spam

Precision and recall:
- Metrics which focus on one kind of mistake
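- Precision: what fraction of flagged emails were real spam? With TP = true positives
and FP = false positives:

$$\text{Precision} = \frac{TP}{TP + FP}$$

- Recall: what fraction of real spam emails were flagged? With FN = false negatives:

$$\text{Recall} = \frac{TP}{TP + FN}$$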

Confusion matrix example:
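
A minimal sketch of such an example, using made-up labels rather than the figures from the lecture: the function counts the four cells of the confusion matrix for the spam class, and precision and recall follow from those counts.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # flagged as spam, is spam
        elif p == positive:
            fp += 1  # flagged as spam, but not spam
        elif t == positive:
            fn += 1  # not flagged, but is spam
        else:
            tn += 1  # not flagged, not spam
    return tp, fp, fn, tn


# Made-up labels: 1 = spam, 0 = not spam.
y_true = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)  # fraction of flagged emails that were real spam
recall = tp / (tp + fn)     # fraction of real spam that was flagged

print("            flagged  not flagged")
print(f"spam        {tp:7d}  {fn:11d}")
print(f"not spam    {fp:7d}  {tn:11d}")
print(f"precision={precision:.2f}, recall={recall:.2f}")
```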

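F-score:
- Harmonic mean of precision and recall (a kind of average):

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

- Also known as the F-measure

Fβ:

$$F_\beta = \frac{(1 + \beta^2) \cdot \text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision} + \text{Recall}}$$

- The parameter β quantifies how much more we care about recall than about precision.
When β is greater than 1, recall is weighted more; when it is smaller than 1,
precision is weighted more.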
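
The effect of β can be checked with a short sketch based on the standard F-beta definition. The precision and recall values are made up, with recall deliberately higher than precision.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta > 1 favours recall, beta < 1 favours precision."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up values: low precision, perfect recall.
precision, recall = 0.5, 1.0

f1 = f_beta(precision, recall)             # harmonic mean
f2 = f_beta(precision, recall, beta=2.0)   # recall weighted more
f05 = f_beta(precision, recall, beta=0.5)  # precision weighted more
print(f"F0.5={f05:.3f}, F1={f1:.3f}, F2={f2:.3f}")
```

Because recall is the stronger of the two here, the score rises as β increases and falls as β decreases.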

Multiclass classification:
You can still make a confusion matrix with multiclass classification as well.

When there are more than two classes, you have to come up with alternatives when it
comes to evaluating the learning outcomes. You can use macro-average and micro-average.

Macro-average:
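Precision is true positives over labeled (predicted) positives; recall is true positives over actual positives.
- You can only use this if the data is balanced.
- Compute precision and recall per class, and average over the k classes:

$$P_{\text{macro}} = \frac{1}{k} \sum_{i=1}^{k} P_i, \qquad R_{\text{macro}} = \frac{1}{k} \sum_{i=1}^{k} R_i$$
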
- Rare classes have the same impact as frequent classes

Micro-average:
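- Gives every data point equal importance (this is the difference from the macro-average).
- Micro-averaging treats the entire set of data as an aggregate result, and calculates 1
metric rather than k metrics that get averaged together. For precision over the k classes:

$$P_{\text{micro}} = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k} (TP_i + FP_i)}$$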
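
The difference between the two averages shows up when one class is rare. The sketch below uses recall and made-up labels (a frequent class "a" and a rare class "b"); the helper names are assumptions for illustration.

```python
def per_class_counts(y_true, y_pred, labels):
    """Return {label: (tp, fp, fn)}, treating each label in turn as the positive class."""
    counts = {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        counts[c] = (tp, fp, fn)
    return counts


def macro_recall(counts):
    # One recall per class, then a plain average: every class counts equally.
    recalls = [tp / (tp + fn) for tp, fp, fn in counts.values()]
    return sum(recalls) / len(recalls)


def micro_recall(counts):
    # Pool the counts over all classes first: every data point counts equally.
    total_tp = sum(tp for tp, fp, fn in counts.values())
    total_fn = sum(fn for tp, fp, fn in counts.values())
    return total_tp / (total_tp + total_fn)


# Made-up data: 8 examples of class "a", 2 of class "b"; one "b" is misclassified.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 8 + ["a", "b"]

counts = per_class_counts(y_true, y_pred, labels=["a", "b"])
macro = macro_recall(counts)  # (8/8 + 1/2) / 2
micro = micro_recall(counts)  # 9 / 10
print(f"macro recall={macro:.2f}, micro recall={micro:.2f}")
```

The single mistake on the rare class pulls the macro-average down much more than the micro-average.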
