Summary Lecture Slides & notes Real-life Machine learning (300363-B-6)


This document contains all the lecture slides and notes of the course 'Real-life Machine learning (300363-B-6)', given at Tilburg University as a premaster course for JADS. It is complete and covers everything needed for the exam. Good luck with the course!

Preview: 4 of 71 pages. Summary, academic year 2023/2024, dated 17 December 2023, seller Dee25.

Lecture 1
What are we going to learn today?
- What is machine learning?
- What are supervised and unsupervised machine learning?
- What are the most common types of machine learning problems?
- What are the basic steps of the Cross-Industry Standard Process for Data Mining
(CRISP-DM)?

Machine learning is the field of study that gives computers the ability to learn without being
explicitly programmed.

Machine learning
Assume that you repeat an exercise over and over again.
What should be constant in your exercise?
- Learning! - machine learning applies strategies and algorithms, combined with data
and statistics
- Improving! - machine learning applies statistical measures to quantify how closely the
ML predictions match the expected results
When you are doing it, it is human learning.
When a machine does it, it is machine learning!
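
One simple measure of the overlap between predictions and expected results is accuracy, the fraction of predictions that are correct. The sketch below is only an illustration: the function name `accuracy` and the labels are made up, and the lecture does not say which measure it has in mind.

```python
def accuracy(predicted, expected):
    """Return the fraction of predictions that match the expected results."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty list")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


# Hypothetical labels, made up for illustration only.
expected = ["spam", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "ham"]
score = accuracy(predicted, expected)
print(score)
```

Three of the four predictions match here, so a higher score after retraining would indicate that the model has improved.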




An example of supervised learning
Supervised learning - classification
Given a labelled dataset, the model learns to
predict the labels of new examples
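
To make the idea concrete, here is a minimal sketch of classification using a 1-nearest-neighbour rule: a new example gets the label of the closest labelled example. This technique is chosen here only because it is short; it is not taken from the lecture. The dataset and the name `predict_nearest_neighbour` are hypothetical.

```python
import math


def predict_nearest_neighbour(training_data, new_point):
    """Return the label of the training example closest to new_point."""
    _, closest_label = min(
        training_data,
        key=lambda example: math.dist(example[0], new_point),
    )
    return closest_label


# Hypothetical labelled dataset: (height in cm, weight in kg) -> label.
training_data = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((70.0, 35.0), "dog"),
]

label_small = predict_nearest_neighbour(training_data, (22.0, 4.5))
label_large = predict_nearest_neighbour(training_data, (65.0, 32.0))
print(label_small, label_large)
```

The labels in `training_data` are what makes this supervised: the model can only predict classes it has seen labelled examples of.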




An example of unsupervised learning
Unsupervised learning - clustering,
dimensionality reduction, anomaly detection
and novelty detection
Given a dataset without labels, the model
learns to cluster/group similar data
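
As a rough illustration of clustering, the sketch below runs plain k-means on one-dimensional values. K-means is used here as an assumed example of a clustering algorithm; the lecture text does not name one. The function name `k_means_1d`, the starting centroids and the data are all made up.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (plain k-means)."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its assigned values.
        for i in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == i]
            if members:  # keep the old centroid if the cluster is empty
                centroids[i] = sum(members) / len(members)
    return centroids, assignments


# Hypothetical unlabelled data with two visible groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, assignments = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids, assignments)
```

No labels are passed in: the two groups emerge only from the distances between the values.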

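CRISP-DM process model
The process model consists of six phases: business understanding, data understanding, data preparation, modeling, evaluation and deployment.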




Business understanding in the CRISP-DM process




Determine business objectives and success criteria
Business objectives and measures to evaluate the results have to be established

Business objectives:
● What is the customer’s primary objective?
● Increase the number of loyal customers
● Selling more of a certain product
● Have a positive marketing campaign

Business success criteria:
● Objective measure to establish success (e.g. return on investment)

Main steps in a data mining project
1. Define the goals:
Business and data mining experts together have to define the goals. For each goal, a
measure must be defined to judge its success
2. Obtain the models:
Pre-process the data and apply data mining algorithms
3. Evaluate results:
Use the pre-specified measures to evaluate the models
4. Deploy:
If the evaluation is successful, the model can be deployed
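
The steps above can be sketched as a small decision routine: the success measure and its threshold are fixed first, and deployment depends on whether the evaluation meets it. This is a simplified sketch under assumptions: accuracy as the measure, the threshold of 0.80 and the model outputs are all hypothetical.

```python
def evaluate_and_decide(predicted, expected, success_threshold):
    """Evaluate a model with a pre-specified measure and decide on deployment."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    measured = correct / len(expected)
    return measured, measured >= success_threshold


# Step 1 (define the goals): hypothetical success criterion, agreed up front.
SUCCESS_THRESHOLD = 0.80

# Step 2 (obtain the models) is assumed done; these are made-up model outputs.
expected = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1]

# Steps 3 and 4: evaluate with the pre-specified measure, then decide.
measured, deploy = evaluate_and_decide(predicted, expected, SUCCESS_THRESHOLD)
print(f"accuracy={measured:.2f}, deploy={deploy}")
```

Fixing the threshold before looking at the results keeps the evaluation honest: the criterion cannot be adjusted afterwards to fit the model.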

Costs & benefits
Perform a cost-benefit analysis
Compute the benefits of the project (e.g. return on investment)
Compute the costs of the project - main factors:
● Data sources
● Data mining problem to be solved
● Available tools
● Expertise of the development team

Quantify the risk that the project fails:
● Knowledge not available
● Data not available
● Missing tools
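
A cost-benefit analysis of this kind can be sketched with the usual return on investment formula, ROI = (benefit - cost) / cost. The risk adjustment below is an added assumption, not something from the lecture: it treats a failed project as yielding no benefit at all. All figures and function names are hypothetical.

```python
def return_on_investment(benefit, cost):
    """ROI as a fraction: net gain divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


def risk_adjusted_benefit(benefit, failure_probability):
    """Expected benefit, assuming a failed project yields nothing."""
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    return benefit * (1.0 - failure_probability)


# Hypothetical figures in euros, made up for illustration.
cost = 40_000.0
benefit = 100_000.0
failure_probability = 0.25

roi = return_on_investment(benefit, cost)
expected_roi = return_on_investment(
    risk_adjusted_benefit(benefit, failure_probability), cost
)
print(roi, expected_roi)
```

Comparing the two numbers shows how much of the projected return depends on the project not failing.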

Quality data & feature engineering
What are we going to learn today?
- What kind of data exists?
- How to prepare data?
- What is data balancing?
- How to apply data cleaning and feature scaling?
- What is feature selection?

What kind of data exists?
- Structured data
- Unstructured data
- Semi-structured data

Structured data
Tabular data (rows and columns) that is very well defined.
We know which columns there are and what kind of data they contain (the format is very
strict).
Such data is often stored in databases that also represent the relationships between the
data. Questions about the data can be answered with a query language.
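
As a small illustration of answering a question with a query language, the sketch below uses SQL through Python's built-in `sqlite3` module. The table `customers`, its columns and its rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical, strictly typed customer table.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "city TEXT NOT NULL, orders INTEGER NOT NULL)"
)
connection.executemany(
    "INSERT INTO customers (name, city, orders) VALUES (?, ?, ?)",
    [("Anna", "Tilburg", 5), ("Bram", "Eindhoven", 2), ("Cleo", "Tilburg", 7)],
)

# Question: how many orders were placed per city?
rows = connection.execute(
    "SELECT city, SUM(orders) FROM customers GROUP BY city ORDER BY city"
).fetchall()
connection.close()
print(rows)
```

Because every row has the same well-defined columns, the question can be stated declaratively without writing any loop over the data.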

Unstructured data
The rawest form of data; it can be any type of file.
Extracting value from this kind of data is hard, since you first need to extract structured
features from it.
For example, you might want to extract topics from movies.
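
A very rough sketch of turning unstructured text into a structured feature is shown below: it counts the most frequent non-trivial words in a made-up movie synopsis. Real topic extraction is considerably more involved; the stop-word list, the synopsis and the name `extract_keywords` are assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real project would use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "in", "of", "to", "is", "his", "who"}


def extract_keywords(text, top_n=3):
    """Turn free text into a structured feature: its most frequent words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up movie synopsis used as the unstructured input.
synopsis = (
    "A detective hunts a thief in the city. The thief escapes, "
    "and the detective follows the thief across the city."
)
keywords = extract_keywords(synopsis)
print(keywords)
```

The result is a short list of (word, count) pairs, which fits in a table column even though the input was free text.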

Semi-structured data
This format sits between structured and unstructured data.
A consistent format is defined. However, the structure is not very strict. For example, it may
not be tabular, or parts of the data may be missing.
Semi-structured data are often stored as files. However, some kinds of semi-structured data
can be stored in document-oriented databases. Such databases allow you to query the
semi-structured data.
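
JSON is a common example of such a format. The sketch below queries a few JSON records in which some fields are missing. It uses plain Python rather than a document-oriented database, and the records and field names are hypothetical.

```python
import json

# Hypothetical semi-structured records: consistent format, but fields may be missing.
raw = """
[
  {"title": "Movie A", "year": 2019, "genres": ["drama", "crime"]},
  {"title": "Movie B", "genres": ["comedy"]},
  {"title": "Movie C", "year": 2021}
]
"""
movies = json.loads(raw)

# Query: titles released after 2020, tolerating records without a "year".
recent = [m["title"] for m in movies if m.get("year", 0) > 2020]

# Flatten to a tabular view, filling the gaps explicitly with None.
table = [(m["title"], m.get("year"), len(m.get("genres", []))) for m in movies]
print(recent, table)
```

The calls to `get` with a default are what handle the loose structure: each record is read on its own terms instead of against a fixed schema.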
