Summary: all lectures summarised!


This summary is extensive, but it contains everything you need for a good grade on the exam! I scored an 8.5 myself, so this should work out! In addition to the summary, I would also do the practice exam (for the application questions); then you are completely ready to go!


  • 13 December 2022
  • 111 pages
  • 2021/2022
  • Summary
gideonrouwendaal
Lecture 1
Machine Learning means getting a computer to infer rules that are never explicitly stated. You do
this by giving the computer a number of examples and letting the computer do the work.

What makes a suitable ML problem?

- We can’t solve it explicitly
- Approximate solutions are fine
- Limited reliability, predictability, interpretability is fine
- Plenty of examples should be available
- Good examples: Recommending a movie, clinical decision support, predicting driving time
and recognising a user
- Bad examples: computing taxes, clinical decisions, parole decision, unlocking phone

Where do we use ML?

- Inside other software: unlock your phone with your face, search with a voice command
- In analytics, data mining, data science: find typical clusters of users, predict spikes in web
traffic
- In science/statistics: if any model can predict A from B, there must be some relation

Machine learning provides systems the ability to automatically learn and improve from experience
without being explicitly programmed.

Reinforcement learning: taking actions in a world based on delayed feedback.




Online learning: predicting and learning at the same time

Offline learning: Separate learning, predicting and acting:

- Take a fixed dataset of examples (instances)
- Train a model to learn from these examples
- Test the model to see if it works by checking its predictions
- If the model works, put it into production: use its predictions to take actions
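
The four steps above can be sketched in a few lines. This is only an illustration: the dataset values, the placeholder "majority class" model and the GOOD_ENOUGH threshold are assumptions made for the example, not part of the lecture.

```python
from collections import Counter

# Hypothetical toy dataset of (features, label) pairs; the values are made up.
dataset = [
    ((1.0, 2.0), "ham"), ((0.5, 1.5), "ham"), ((3.0, 0.2), "spam"),
    ((0.8, 1.9), "ham"), ((2.9, 0.1), "spam"), ((1.1, 2.2), "ham"),
]

# 1. Take a fixed dataset and split it into training and test examples.
train, test = dataset[:4], dataset[4:]

# 2. Train: this placeholder model only remembers the most common training label.
def train_majority(examples):
    most_common_label = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda features: most_common_label

model = train_majority(train)

# 3. Test: compare the predictions with the true labels of the held-out examples.
correct = sum(model(x) == y for x, y in test)
accuracy = correct / len(test)

# 4. Only a model that tests well enough would be put into production.
GOOD_ENOUGH = 0.9  # assumed threshold, chosen only for illustration
deploy = accuracy >= GOOD_ENOUGH
print(accuracy, deploy)
```

Here the placeholder model gets one of the two test examples right, so it would not be deployed.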

ML problems are not solved one by one (we do not spend years searching for a chess algorithm, an
algorithm for driving a car, etc.). Instead, we define abstract tasks such as classification and
regression, and find solutions (algorithms) for these tasks, e.g. linear models, kNN.

Abstract tasks are divided into supervised and unsupervised tasks:

Supervised: explicit examples of input and output. Learn to predict output for unseen input.
Learning tasks:

- Classification: assign a class to each example
- Regression: assign a number to each example

Unsupervised: Only inputs provided. Find any pattern that explains something about the data.

AI is general and is about building intelligent agents. ML is a subset of AI.

Data science is about data in general. ML is a subset of data science.

Data mining overlaps with ML (they are very close to each other). E.g. finding common clickstreams
in web logs or finding fraud in transaction networks is more DM. Spam classification, predicting
stock prices and learning to control a robot are more ML. Data mining is more about being given a
large database and finding patterns in it. ML focuses more on (prediction) tasks.

Information retrieval: not the same but can benefit from one another.

Stats vs ML: most of the rules of stats also apply in ML. The main difference is what we want from
the model once it is fitted. Stats: the model should fit reality. ML: we are only interested in
predictions that are likely to be true.

Deep learning: a subset of ML.

An example of classification is to mark an email as spam or ham (2 classes). Data goes into a
learner, and the learner produces a model. Here the model is a classifier.

Linear classifier: classification algorithm (for example with 2 features in 2D), that draws a line
through a certain space. An example: everything above this line is X and everything underneath the
line is classified as Y. In 2D it is a line, 3D a plane and in 4D+ it is called a hyperplane.
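
A minimal sketch of such a classifier, assuming the line (or hyperplane) is written as w·x + b = 0. The names `linear_classify`, `w` and `b` and the example line are illustrative assumptions; how to find a good `w` and `b` is not shown here.

```python
def linear_classify(x, w, b):
    """Return "X" if the point x lies on the positive side of w.x + b = 0, else "Y"."""
    score = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return "X" if score > 0 else "Y"

# Assumed example line x2 = x1, written as -1*x1 + 1*x2 + 0 = 0.
w, b = (-1.0, 1.0), 0.0
print(linear_classify((1.0, 3.0), w, b))  # point above the line
print(linear_classify((3.0, 1.0), w, b))  # point below the line
```

The same function works unchanged for 3 or more features, where the boundary is a plane or a hyperplane.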

Loss function: a function that expresses how well a particular model fits our data:

loss_data(model) = performance of the model on the data (the lower the better). For classification:
e.g. the number of misclassified examples.
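
As a small illustration, the classification loss mentioned above can be computed by counting mistakes. The data and the two threshold models below are made up for the example.

```python
def misclassification_loss(model, data):
    """Count the examples in `data` that `model` labels incorrectly."""
    return sum(1 for x, y in data if model(x) != y)

# Hypothetical one-feature dataset of (features, label) pairs.
data = [((1.0,), "ham"), ((2.0,), "ham"), ((3.0,), "spam"), ((4.0,), "spam")]

# Two candidate models that differ only in their threshold.
model_a = lambda x: "spam" if x[0] > 2.5 else "ham"
model_b = lambda x: "spam" if x[0] > 3.5 else "ham"

loss_a = misclassification_loss(model_a, data)
loss_b = misclassification_loss(model_b, data)
print(loss_a, loss_b)
```

The model with the lower loss (here `model_a`) fits this data better.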

Decision tree classifier: classification algorithm. The leaves of the tree are labelled with classes.

k-Nearest Neighbours classifier: classification algorithm, also called a “lazy classifier”. It does not
do any learning, but just remembers the dataset. Once it gets a new point, it looks at the k nearest
points in the dataset and assigns the most frequent class among these k nearest neighbours. k is a
hyperparameter (it has to be chosen by the programmer).
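
A minimal sketch of kNN classification, assuming numerical features and Euclidean distance (the notes do not say which distance is used). The function name and the toy points are illustrative.

```python
import math
from collections import Counter

def knn_classify(query, dataset, k=3):
    """Return the most frequent label among the k points closest to `query`."""
    # "Lazy": nothing is learned up front, the stored dataset is searched per query.
    neighbours = sorted(dataset, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical 2D dataset with two well-separated classes.
dataset = [
    ((1.0, 1.0), "A"), ((1.0, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 6.0), "B"), ((6.0, 7.0), "B"),
]
print(knn_classify((1.5, 1.5), dataset, k=3))
print(knn_classify((6.5, 6.5), dataset, k=3))
```

With two classes, an odd k avoids tied votes.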

Classification algorithms mostly work with numerical or categorical features (e.g. an algorithm only
works with numerical values). Binary classification: 2 classes. Multiclass classification: more than 2
classes. Multilabel classification: none, some, or all classes may be true. Class probabilities/scores:
the classifier reports a probability for each class: helpful property for a classifier to have.

Offline machine learning: the basic recipe:

- Abstract (part of) your problem to a standard task: classification, regression, clustering….
- Choose your instances and their features. For supervised learning: choose a target
- Choose your model class: linear models, decision trees, kNN
- Search for a good model: usually a model comes with its own search method. Sometimes
multiple options are available.

Classification vs regression: in classification the target is a class and in regression the target is a
number. x_i is the feature vector of instance i, y_i is the true label for x_i, and f(x_i) is the
model’s prediction for x_i. The model is a function from the feature space to the output. In a plot of
a regression model the feature is on one axis (for example the x-axis) and the target on the other
(for example the y-axis), whereas a plot of a classification problem has only features as axes. A loss
function that is often used for regression models is the mean squared error (MSE):
loss(f) = 1/n * sum over i of (f(x_i) - y_i)^2. The differences are squared so that positive and
negative errors do not cancel each other out. Regression can also be done with a regression tree or
with kNN regression.
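
The MSE and the reason for squaring can be checked on a tiny example. The data and the constant model below are assumptions chosen so that the raw errors are +1 and -1.

```python
def mse(model, xs, ys):
    """Mean squared error: 1/n * sum over i of (f(x_i) - y_i)^2."""
    n = len(xs)
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / n

def mean_error(model, xs, ys):
    """Mean of the raw (unsquared) errors, shown only for comparison."""
    return sum(model(x) - y for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0]
ys = [2.0, 4.0]
constant_model = lambda x: 3.0  # always predicts 3: errors are +1 and -1

raw = mean_error(constant_model, xs, ys)
squared = mse(constant_model, xs, ys)
print(raw, squared)
```

The raw errors cancel out to 0 even though both predictions are wrong, while the MSE is 1.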

Unsupervised abstract tasks: clustering, density estimation and generative modelling.

Clustering is a lot like classification: divide the dataset or the feature space into a finite set of
groups (clusters). The difference is that in this case we are not given target values: features are
given but no classes. The learner has to decide how to separate the dataset purely by finding patterns.

Density estimation: a dataset of instances represented by features. The learner discovers patterns of
density. The task of the learner is to produce a model that outputs a number for each instance, and
that number should indicate how likely the instance is according to the distribution of the data. If
the features are categorical: a probability. If the features are numerical: a probability density.
Example: fitting a normal distribution to a set of numbers.
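
A minimal sketch of that example, assuming the normal distribution is fitted by taking the mean and the (population) standard deviation of the data. The numbers are made up.

```python
import math
import statistics

def fit_normal(values):
    """Fit a normal distribution by taking the mean and population std of the data."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return mu, sigma

def normal_density(x, mu, sigma):
    """Probability density of x under a normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

values = [4.8, 5.1, 5.0, 4.9, 5.2]  # hypothetical measurements
mu, sigma = fit_normal(values)

typical = normal_density(5.0, mu, sigma)  # an instance close to the data
unusual = normal_density(8.0, mu, sigma)  # an instance far from the data
print(typical, unusual)
```

The density at 5.0 is larger than 1 here, which is fine: for numerical features the model outputs a density, not a probability.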

Generative modelling: a model that learns a probability distribution from which new instances can be
sampled. Example: thispersondoesnotexist.com

You can combine unsupervised and supervised learning: semi-supervised learning. Unlabelled data
is cheap to get (internet). An example of this kind of training is self-training.

Self-supervised learning: large unlabelled dataset is used to train a model without requiring a large
amount of manual annotation.




Sensitive attributes: features or targets that are associated with instances of data that require
careful consideration. Examples: sexual orientation, race, ethnic identity, cultural identity, gender.
What makes an attribute sensitive?

- Can it be used for harm?
- Can mischaracterizing relations become offensive?
- Is it commonly used to discriminate? → explicitly, as in apartheid regimes, or implicitly
through structural inequality.

Training data bias: where do you get your data from?

Bias from technological legacy: relying on existing technology, which might have unexpected biases.

Amplifying bias: this happens with gendered words, for example. Google Translate used the male form,
translating “my friend is a doctor” as “mi amigo es doctor”. It might be that more doctors in general
are male, but still not 100%. This can be fixed by showing 2 results.

Are you predicting what you think you’re predicting? Results that are obtained from surveys can be
false (e.g. people lie). We may think we have found a predictor, but it only predicts what people
answered in a survey.

It matters where you are predicting from! Persistence: the weather forecasting tactic of predicting
that tomorrow’s weather will be the same as today’s. Such a simple baseline can already score well,
hence accuracy is not all that matters.
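
To make the persistence tactic concrete, here is a small sketch on an invented ten-day weather record. The data is hypothetical and only serves to show how such a baseline is scored.

```python
# Hypothetical weather record for ten consecutive days.
weather = ["sun", "sun", "sun", "rain", "rain", "sun", "sun", "sun", "sun", "rain"]

# Persistence: the prediction for each day is simply the weather of the day before.
predictions = weather[:-1]
actual = weather[1:]

correct = sum(p == a for p, a in zip(predictions, actual))
accuracy = correct / len(actual)
print(correct, len(actual), accuracy)
```

On this record the baseline is right on 6 of the 9 predicted days without learning anything, so a learned model should be compared against it before its accuracy is called good.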

Can predictions be offensive or hurtful? There is a difference between being able to make a guess
and choosing to do so (predicting and acting). If a behaviour is not acceptable in a social context of
people, then humans will be upset if a computer does it. E.g. a website asking about your email
before allowing you to take a look (asking personal information before speaking to you).

Should we include sensitive attributes in the data at all? To study bias, we need these attributes to
be annotated. If we remove them, they may still be inferred from other features. Directly using a
sensitive attribute (SA) may be preferable to indirectly doing so. There are valid use cases (e.g.
race and sex affect medicine), but these often require a causal link.

Should we stop using SA as targets? What is input and what is target is not always clearly separated.
Showing that sensitive attributes can be inferred may serve as a warning to those who are
vulnerable. Building a proof-of-concept in a controlled setting is sometimes the best way to warn the
world that something can be built, e.g. using an algorithm to warn people that they might be seen as
gay in places where homosexuality is not accepted.

Summary: use SA with extreme care. Consider user communication over prediction. Check the
distribution. Do not imply causality, and do not overrepresent what your predictions mean.

The aim of ML is to find a model that generalises (one that works on unseen data, not only on the
training set). Hence, you should not overfit, i.e. fit the random noise in the data. If overfitting
happens, the model is memorising the data instead of generalising. Hence: never judge your model’s
performance on the training data. The easiest way to guard against overfitting is to withhold part
of your data. Hence, from all the data there is a part that is training data and a part that is test
data. The aim is not to minimise the loss on the training data, but to minimise the loss on the test
data. You don’t get to see the test data until you’ve chosen your model. Find the pattern in the data
and discard the noise. Machine learning is an empirical science.
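
The difference between memorising and generalising can be seen in a small experiment. This sketch assumes a 1-nearest-neighbour model and a synthetic dataset whose labels are pure noise, so there is no pattern to learn; the names and sizes are arbitrary choices for the example.

```python
import math
import random

def nearest_label(query, examples):
    """1-nearest-neighbour: copy the label of the closest stored example."""
    return min(examples, key=lambda ex: math.dist(query, ex[0]))[1]

def accuracy(examples, train):
    """Fraction of `examples` that the 1-NN model built on `train` labels correctly."""
    return sum(nearest_label(x, train) == y for x, y in examples) / len(examples)

random.seed(0)
# Synthetic data: random 2D points with labels that are pure noise.
data = [((random.random(), random.random()), random.choice("AB")) for _ in range(200)]

# Withhold part of the data as a test set.
train, test = data[:150], data[150:]

train_acc = accuracy(train, train)  # every training point is its own nearest neighbour
test_acc = accuracy(test, train)    # unseen points: nothing real was learned
print(train_acc, test_acc)
```

The model scores perfectly on the training data because it has memorised it, while on the withheld test data it can do no better than guessing. Judging the model on the training data would hide this.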

The problem of induction. Inductive reasoning is learning: observe something a couple of times and
infer that it will probably happen the next time. Deductive reasoning is rule following. Deductive
reasoning: all men are mortal; Socrates is a man; therefore Socrates is mortal. Inductive reasoning:
the sun has risen in the east my entire life; so it will do so tomorrow.

General heuristic: all else being equal, prefer the simpler solution.

Lecture 2
Linear regression
Notation:

- Lower case, non-bold letter (x, y, z) → scalar (i.e. a single number)
