Summary of all the lectures, with additional information from the book added to each concept. The most complete summary available, intended to help you achieve a high grade.
Written for
Tilburg University (UVT)
Strategic Management
Strategy Analytics
Seller
danterodrigo
Strategy analytics summary
Fundamental concepts:
Data science
= involves principles, processes, and techniques for understanding phenomena via the
(automated) analysis of data
Data-driven decision-making (DDD)
= basing decisions on the analysis of data, rather than purely on intuition
Big Data
= simply a very large dataset, but one with 3 characteristics
- Volume: the quantity of the data
- Variety: the type and nature of the data
- Velocity: the speed at which the data is generated and processed
Data mining
= the extraction of knowledge from data, via technologies that incorporate the principles of
data science
Data analytics
= the process of examining datasets in order to draw conclusions about the useful information
they may contain
- Descriptive Analytics: What has happened?
- Predictive Analytics: What could happen?
  - Segmentation, regressions
- Prescriptive Analytics: What should we do?
  - Complex models for product planning and stock optimization
Business problem → Data mining tasks
- Collaborative problem-solving between business stakeholders and data scientists
- Decomposing a business problem into solvable subtasks
- Matching the subtasks with known tasks for which tools are available
- Solving the remaining non-matched subtasks (by creativity!)
- Putting the subtasks together to solve the overall problem
Supervised vs unsupervised methods
The key question:
- Is there a specific target variable?
- Yes! → supervised
- No! → unsupervised
Unsupervised learning
- Training data provides “examples” but no specific “outcome”
- The machine tries to find specific patterns in the data
- Algorithms
  - Clustering
  - Anomaly detection
  - Association discovery
  - Topic modeling
- Because there is no “outcome”, the model cannot be evaluated against known values
Examples (question: training data)
- Are these customers similar? Training data: customer profiles
- Is this transaction unusual? Training data: previous transactions
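The "Is this transaction unusual?" example can be illustrated with a minimal sketch. It is only one simple way to do anomaly detection, not a method taken from the lectures: it assumes a transaction is "unusual" when its amount lies more than 3 standard deviations from the mean of the previous transactions. The function name `is_unusual`, the threshold, and the amounts are hypothetical.

```python
from statistics import mean, stdev

def is_unusual(amount, previous_amounts, threshold=3.0):
    """Flag an amount that lies more than `threshold` sample standard
    deviations away from the mean of the previous transactions."""
    mu = mean(previous_amounts)
    sigma = stdev(previous_amounts)
    if sigma == 0:
        # All previous amounts are identical: any different amount stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Hypothetical previous transactions. There is no "outcome" column:
# the data only contains examples, which is what makes this unsupervised.
previous = [42.0, 38.5, 40.0, 45.0, 39.5, 41.0]

print(is_unusual(43.0, previous))   # close to the usual amounts
print(is_unusual(400.0, previous))  # far away from the usual amounts
```

Note that nothing in the data says which transactions really were fraudulent or erroneous, so the flags cannot be checked against known values.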
Supervised learning
- Training data has one feature that is the “outcome”
- Goal is to build a model to predict the outcome (machine learns to predict)
- Because the outcome has a known value, the model can be evaluated
  - Split the data into a training set and a test set
  - Fit the model on the training set, then predict the test set
  - Compare the predictions to the known values
- Algorithms
  - Model/ensemble
  - Logistic regression
  - Time series
Examples (question: training data)
- How much is this home worth? Training data: previous home sales
- Will this customer default on his loan? Training data: previous loans that were paid or defaulted
- How many customers will apply for a loan? Training data: previous months of loan applications
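The evaluation procedure for supervised learning (split, fit on the training set, predict the test set, compare with the known values) can be sketched as follows. This is a simplified illustration under stated assumptions, not the course's method: the loan records are made up, the single feature (a debt-to-income ratio) is hypothetical, and the "model" is just a cut-off value learned from the training set.

```python
import random

# Hypothetical loan records: (debt-to-income ratio, defaulted? 1 = yes, 0 = no)
loans = [(0.10, 0), (0.15, 0), (0.20, 0), (0.25, 0), (0.30, 0), (0.35, 1),
         (0.40, 0), (0.45, 1), (0.50, 1), (0.55, 1), (0.60, 1), (0.65, 1)]

def split(records, test_fraction=0.25, seed=0):
    """Shuffle the records and hold out a fraction of them as the test set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """Pick the cut-off that misclassifies the fewest training records."""
    def errors(t):
        return sum((x >= t) != bool(y) for x, y in train)
    return min(sorted(x for x, _ in train), key=errors)

def accuracy(threshold, records):
    """Share of records whose predicted outcome equals the known outcome."""
    correct = sum(int(x >= threshold) == y for x, y in records)
    return correct / len(records)

train, test = split(loans)
threshold = fit_threshold(train)          # the model only sees the training set
print("learned threshold:", threshold)
print("test accuracy:", accuracy(threshold, test))
```

The test accuracy is computed only on records the model did not see during fitting, which is the reason for splitting the data first.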
Consider two similar questions we might ask about a customer population:
- Do our customers naturally fall into different groups? → no specific purpose
or target for the grouping → unsupervised
- Can we find groups of customers who have particularly high likelihoods of
canceling their service soon after their contracts expire? → here a specific
target is defined: will a customer leave when her contract expires? →
supervised
Business understanding = the stage where the analyst's creativity plays a large role. The key
to success is a creative problem formulation: how to cast the business problem as one or more
data science problems.
→ high-level knowledge of the fundamentals helps creative business analysts see
novel formulations
Data understanding = it is important to understand the strengths and limitations of the data,
because there is rarely an exact match with the problem. This means understanding the different
information within a database, which may cover different intersecting populations and have
varying degrees of reliability
Data preparation = analytic technologies often require data to be in a form different from how it
is provided naturally, so some conversion is necessary. This process proceeds along with data
understanding: data are manipulated and converted into forms that yield better results
Modeling = the output of modeling is some sort of model or pattern capturing regularities in the
data
Evaluation = assess the data mining results rigorously and gain confidence that they are valid
and reliable before moving on. Evaluation includes satisfying stakeholders with the quality of
the model's decisions. They want to see whether the model does more good than harm.
Deployment = results of data mining are put into real use to realize return on investment
Case Capital One → see other summary
Lecture 2
Learning goals:
- Supervised segmentation
- Classification trees
- Parametric models
- Linear discriminant function
- Logistic regression
- Support vector machine
An intuitive way of thinking about extracting patterns from data in a supervised manner is to try
to segment the population into subgroups that have different values for the target variable
= supervised segmentation
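A minimal sketch of this idea: split the population on one attribute and compare the target rate in each subgroup. The customers, the attribute (contract type), and the function name `target_rate_by_segment` are hypothetical and only serve as an illustration.

```python
from collections import defaultdict

# Hypothetical customers: (contract type, churned? 1 = yes, 0 = no)
customers = [
    ("monthly", 1), ("monthly", 1), ("monthly", 0), ("monthly", 1),
    ("yearly", 0), ("yearly", 0), ("yearly", 1), ("yearly", 0),
]

def target_rate_by_segment(records):
    """Return the share of positive target values within each segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [positives, total]
    for segment, target in records:
        counts[segment][0] += target
        counts[segment][1] += 1
    return {seg: pos / total for seg, (pos, total) in counts.items()}

rates = target_rate_by_segment(customers)
print(rates)  # {'monthly': 0.75, 'yearly': 0.25}
```

An attribute is useful for supervised segmentation when the resulting subgroups differ clearly in their target values, as the two contract types do in this made-up data.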
Model
= a simplified representation of reality created to serve a purpose
- Abstracts away irrelevant details
Models serve different purposes
- Unsupervised setting: to identify (classes, groups, patterns, etc.)
  - Descriptive
- Supervised setting: to predict (“to estimate an unknown value”)
  - Predictive
Induction
= “generalizing from specific cases to general rules”