Machine Learning - Summary

A detailed summary of the lessons of Machine Learning taught by David Martens at the University of Antwerp. This is a summary of my own notes, the slides and the book Data Science for Business.

  • Whole book summarized: No
  • Chapters summarized: 1-10, 12
  • Uploaded on: January 2, 2024
  • Number of pages: 64
  • Written in: 2023/2024
  • Type: Summary
MACHINE LEARNING
SUMMARY

1. Introduction
2. Predictive modeling
2.1. Explaining versus predicting
2.2. Data preprocessing
2.3. Terminology
2.4. Finding informative variables from the data
2.5. Decision trees
2.6. Mathematical models
2.6.1. Logistic regression
2.6.2. Support vector machines (SVM)
2.7. Overfitting and its avoidance
3. Assessing model performance
3.1. Evaluating classifiers
3.2. Expected value
3.3. Evaluation and baseline performance
4. Visualizing model performance
4.1. Profit curves
4.2. ROC curve
4.3. Cumulative response and lift curves
5. Naive Bayes
5.1. Bayes
5.2. A model of evidence lift
6. Descriptive modeling
6.1. Nearest-neighbor
6.2. Clustering
6.3. Frequent itemsets and association rules
6.4. Recommender systems
6.5. Conclusion and exercises
7. Ensemble methods and artificial neural networks
7.1. Ensemble methods
7.2. Artificial neural networks
7.3. Deep learning
8. Text mining
8.1. Why text mining?
8.2. Text processing
8.3. Document classification and clustering
8.4. Topic modeling and word embeddings
8.5. Case study in politics
9. Data science ethics
9.1. Data gathering: privacy, A/B testing and bias
9.2. Data preprocessing: proxies, government backdoors
9.3. Modeling: ZK proofs, discrimination
9.4. Model evaluation: explain




1. Introduction
Data science = set of fundamental principles that guide extraction of knowledge from data
Data mining = the process of extracting knowledge from data
AI = methods for improving knowledge of an agent over time due to experience
Generative AI: generates text by predicting the next word based on the prompt and the previous words

ML = automatic extraction of patterns from large amounts of data
Ex: Wal-Mart learned which products sell more before hurricanes
Ex: recommendation system → “frequently bought together”
Ex: market basket analysis → give a coupon for milk if bread and butter are bought together
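
The bread-and-butter ⇒ milk example can be checked with two standard association-rule measures, support and confidence. The sketch below is only an illustration: the transactions are invented, and the helper names `support` and `confidence` are assumptions, not something taken from the course material.

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# Rule under test: {bread, butter} => {milk}
rule_support = support({"bread", "butter", "milk"}, transactions)
rule_confidence = confidence({"bread", "butter"}, {"milk"}, transactions)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A high confidence would suggest that customers who buy bread and butter often buy milk too, which is the kind of pattern a coupon campaign could act on.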

Goal: find non-obvious patterns ⇒ improve decision making (data-driven decision making, DDD)

Ex: Credit scoring
→ labels for the target variable are needed so the algorithm can learn to make distinctions
⇒ a classification model is built from the data (data mining) and then used for predictions
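
A minimal sketch of this idea, assuming a single attribute (income) and a one-threshold rule as the classification model. The labeled history and the names `fit_threshold` and `predict_default` are hypothetical and only serve to show how labels drive the learning step.

```python
# Hypothetical labeled history: (monthly income, defaulted?). Invented data.
history = [
    (1200, True), (1500, True), (1800, True), (2100, False),
    (2400, False), (2600, False), (3000, False), (3500, False),
]


def fit_threshold(rows):
    """Pick the income cut-off that misclassifies the fewest labeled rows.

    The rule being learned is: predict 'default' when income < threshold.
    """
    best_threshold, best_errors = None, len(rows) + 1
    for candidate in sorted({income for income, _ in rows}):
        errors = sum((income < candidate) != defaulted for income, defaulted in rows)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold, best_errors


# Learning step: the target labels decide where the cut-off ends up.
threshold, training_errors = fit_threshold(history)


def predict_default(income):
    """Prediction step: apply the learned rule to a new applicant."""
    return income < threshold


print(threshold, training_errors, predict_default(1000), predict_default(4000))
```

Without the `defaulted` labels there would be nothing to count errors against, which is why the target variable is needed.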




End-user is engine of discovery
- You know what you look for
- Querying = request for a subset of data or for statistics, e.g. averages, graphs, …
- Tools: SQL (Structured Query Language) + GUI (Graphical User Interface)
- OLAP (Online Analytical Processing) = advanced query and reporting
Business intelligence = getting the right info to the right person at the right time
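
Querying can be sketched with Python's built-in `sqlite3` module. The table `customers`, its columns and its rows are made up for this example. The first query asks for a subset of the data, the second for a statistic.

```python
import sqlite3

# In-memory database with a hypothetical customers table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, region TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ann", "North", 1200.0),
        ("Bob", "North", 800.0),
        ("Cem", "South", 500.0),
        ("Dee", "South", 900.0),
    ],
)

# Query for a subset: all customers in one region.
north = conn.execute(
    "SELECT name FROM customers WHERE region = ? ORDER BY name", ("North",)
).fetchall()

# Query for a statistic: average balance per region.
avg_by_region = dict(
    conn.execute("SELECT region, AVG(balance) FROM customers GROUP BY region").fetchall()
)
conn.close()

print(north)
print(avg_by_region)
```

In both queries the user already knows what to ask for, which is what sets querying apart from pattern discovery.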


End-user isn’t engine of discovery
- You don’t know what you look for ⇒ new knowledge
- Computer finding patterns → ML

AI
● A computer interacts through data
● Learning from data ⇒ intelligence
● Big data + ML = AI
● Mainly used for predictions, e.g. Facebook likes ⇒ political preference

CRISP-DM (Cross-Industry Standard Process for Data Mining)


DDD: has proven value ⇒ automated decisions

Data science roles
● Computer science: Python, database creation, …
● Domain knowledge
● Communication skills

Data + ability to extract knowledge = key strategic assets
Ex: Facebook's value stems from its data
Ex: Robinhood's income: selling trading data to hedge funds

Big data = datasets that are too large for traditional data processing systems
Data warehouse: collect and combine data from across an enterprise

Fundamental concepts of data science
● CRISP-DM
● Find informative descriptive attributes of entities of interest from a large mass of data using information technology
○ Finding variables that correlate with target
○ Recursively: predict target based on attributes
● Overfitting: finding patterns that don’t generalize
● Formulating solutions and evaluating relies on context of usage
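
Finding variables that correlate with the target can be sketched by ranking attributes on the absolute value of their Pearson correlation with a 0/1 target. Correlation is used here only as a simple stand-in for "informative"; the data, the attribute names and the `pearson` helper are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical entities (loan applicants) described by two attributes.
attributes = {
    "income": [1200, 1500, 1800, 2100, 2400, 2600, 3000, 3500],
    "shoe_size": [42, 38, 44, 40, 43, 39, 41, 37],
}
# Hypothetical target: 1 = defaulted, 0 = repaid.
target = [1, 1, 1, 0, 0, 0, 0, 0]

# Rank attributes from most to least correlated with the target.
ranking = sorted(
    attributes,
    key=lambda name: abs(pearson(attributes[name], target)),
    reverse=True,
)

for name in ranking:
    print(name, round(pearson(attributes[name], target), 2))
```

On a dataset this small a weak correlation can appear by chance, which ties in with the overfitting point: a pattern found in the data at hand does not have to generalize.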



