Strategy analytics
Table of Contents
Strategy analytics...............................................................................................................1
Lecture 1: Introduction........................................................................................................5
Chapter 1: Data analytics thinking......................................................................................6
Fundamental concepts: data science and data driven decision making (DDD)..............................6
Fundamental concepts: Big data...................................................................................................6
Fundamental concepts: data science, data mining, and machine learning....................................7
Business problems.........................................................................................................................8
From business problems to data mining tasks...............................................................................8
Data mining and its results............................................................................................................9
Data mining process – CRISP.........................................................................................................9
Boundaries of data mining..........................................................................................................10
Case: Capital one.........................................................................................................................10
Lecture 2: Supervised segmentation..................................................................................10
Before we start, a warning on terminology…..............................................................................10
Chapter 3: predictive modeling: from correlation to supervised segmentation..................11
Models, induction & deduction...................................................................................................11
Supervised segmentation: an explanation...................................................................................11
Access to historical data..............................................................................................................12
Data mining question..................................................................................................................12
Let’s look at some data….............................................................................................................12
A peek into the dataset…............................................................................................................12
Visualize - scatter plot.................................................................................................................12
Step by step approach.................................................................................................................12
Entropy........................................................................................................................................12
Entropy (H)..................................................................................................................................13
Information gain (IG)...................................................................................................................13
Formal definition of IG..............................................................................................................13
Illustration IG..............................................................................................................................14
Iterative process to find maximum IG..........................................................................................14
Classification tree: explanation...................................................................................................14
Classification tree: development and advantages.......................................................................15
Decision boundaries....................................................................................................................16
Predicting the end of the coronavirus pandemic – optimistic or not?..........................................16
Chapter 4: Fitting a model to data.....................................................................................16
Classification via mathematical functions....................................................................................16
Linear discriminant models.........................................................................................................16
Logistic regression.......................................................................................................................17
Estimating probabilities...............................................................................................................17
Probability estimation and Laplace correction............................................................................17
Probability estimation and Laplace correction............................................................................18
Linear discriminant functions (the book has errors on this).........................................................18
Support vector machine (SVM)....................................................................................................19
Support Vector Machine (SVM) - Loss function...........................................................................19
Support vector machine (SVM) method......................................................................................19
Linear regression.........................................................................................................................20
Logistic regression.......................................................................................................................20
Logistic regression vs. SVM..........................................................................................................20
Comparison between logistic regression (function fitting) vs decision tree (tree induction)........21
Case: Classification trees and decision-analytic feedforward control: a case study from the video
game industry (Brydon & Gemino, 2008)....................................................................................21
Lecture 3: FIFA world cup & Easyjet cases..........................................................................21
Recap from last week..................................................................................................................22
Chapter 5: Overfitting and its avoidance............................................................................22
Chapter 5.1: Overfitting: example...............................................................................................22
Overfitting: Explanation..............................................................................................................23
Overfitting: bias-variance tradeoff...............................................................................................24
Overfitting: complexity and prediction error...............................................................................24
Overfitting: logistic regression vs. SVM.......................................................................................25
Chapter 5.2: Avoiding overfitting: how to measure generalizability............................................25
Avoiding overfitting: cross-validation..........................................................................................26
Learning curves...........................................................................................................................26
Avoiding overfitting: methods for classification trees..................................................................26
Avoiding overfitting: classification tree ensemble methods........................................................27
Avoiding overfitting: bagging.......................................................................................................28
Avoiding overfitting: boosting.....................................................................................................28
Avoiding overfitting: random forest............................................................................................28
Avoiding overfitting: random forest application..........................................................................29
Avoiding overfitting: logistic regression.......................................................................................29
General method for avoiding overfitting – nested hold-out / cross-validation............................29
Case: FIFA world cup 2018...........................................................................................................30
Discussion questions...................................................................................................................30
Chapter 6: similarity, neighbors and clusters.....................................................................30
Chapter 6.1 Similarity and distance.............................................................................................30
Similarity: motivation..................................................................................................................30
Distance: measures (definitions).................................................................................................31
Distance: example I.....................................................................................................................31
Distance: example II....................................................................................................................32
Chapter 6.2 Nearest neighbors: example on US voters................................................................32
Nearest neighbors: technique.....................................................................................................32
Nearest neighbors: influence of k................................................................................................33
Nearest neighbors: challenges.....................................................................................................33
Chapter 6.3 Clustering.................................................................................................................34
Clustering: definition...................................................................................................................34
Hierarchical clustering: method...................................................................................................34
Hierarchical clustering: visualization...........................................................................................34
K-means clustering: method........................................................................................................34
K-means clustering: specifications...............................................................................................35
Case: Generation Easyjet.............................................................................................................35
Learning goals lecture 3...............................................................................................................35
Lecture 4: Management: evaluating & visualizing the performance of analytical strategies
.........................................................................................................................................36
Chapter 7: decision-analytic thinking – model performance metrics..................................36
Classifier Accuracy.......................................................................................................................36
Confusion matrix.........................................................................................................................36
Unbalanced classes.....................................................................................................................37
Unbalanced classes- problems.....................................................................................................37
Problems with unequal costs and benefits..................................................................................38
Model performance metrics: metrics..........................................................................................38
Class discussion...........................................................................................................................39
Decision-analytic thinking – expected value framework..............................................................39
Expected value for classifier use..................................................................................................39
Expected value for classifier evaluation - probabilities................................................................40
Expected value for classifier evaluation – sample not random/representative, class priors known
....................................................................................................................................................40
Expected value for classifier evaluation – costs and benefits.......................................................41
Expected value framework: comparison.....................................................................................41
Chapter 8: Visualizing model performance........................................................................41
Classification vs. Ranking.............................................................................................................41
Visualizations..............................................................................................................................42
Visualizations: ROC graphs..........................................................................................................42
Visualizations: cumulative response curves.................................................................................43
Visualizations: Lift curve..............................................................................................................43
Model evaluation........................................................................................................................44
Case: predicting healthcare needs...............................................................................................44
Chapter 11: Complex decision-analytic thinking................................................................45
Example: targeting the best prospects for a charity mailing........................................................45
Lecture 5: methods: Bayesian, text mining, co-occurrence, profiling..................................46
Learning goals.............................................................................................................................46
Chapter 9: Naïve Bayes Classifier.......................................................................................46
9. 1 – example on cancer screening.............................................................................................46
9.1 Probabilities & Bayes’ rule.....................................................................................................47
9.1 Back to our example..............................................................................................................47
9.1 Advancing Bayes’ rule............................................................................................................47
9.1 A simplified example.............................................................................................................48
9.1 Lift.........................................................................................................................................48
9.2 Conditional probabilities in practice......................................................................................48
9.2 Benefits of Naive Bayes.........................................................................................................49
9.2 Disadvantage of Naïve Bayes.................................................................................................49
Chapter 10: Text analysis...................................................................................................49
10.1 Text as data.........................................................................................................................49
10.1 Text analysis........................................................................................................................49
10.2 ‘Bag of words’ approach......................................................................................................50
10.2 Bag of words example I........................................................................................................50
10.3 Advanced methods..............................................................................................................52
10.3 Advanced methods: topic models........................................................................................53
Case: Twitter and stock returns...................................................................................................53
Chapter 12: Co-occurrence, associations, profiling, link prediction, latent dimensions.......53
12.1 Co-occurrence and association rules....................................................................................53
12.1 Co-occurrence measure comparison....................................................................................54
12.1 Co-occurrence measures visualization.................................................................................54
12.1 Examples.............................................................................................................................54
12.2 Profiling and link prediction.................................................................................................55
12.3 Data reduction and latent dimensions.................................................................................55
Bias, variance, and ensemble methods........................................................................................56
Causal explanation......................................................................................................................56
Lecture 6...........................................................................................................................56
Learning goals.............................................................................................................................56
Chapter 13: Data science and business strategy................................................................57
Important factors to get the most from your data.......................................................................57
Thinking data analytically & creating a conducive culture...........................................................57
Achieving competitive advantage with data science...................................................................57
Sustaining competitive advantage with data science..................................................................57
Attracting data scientists.............................................................................................................58
Proposal evaluation for data science projects.............................................................................58
Questions to ask..........................................................................................................................58
An example for data mining........................................................................................................58
Evaluation step 1: Business understanding..................................................................................59
Evaluation step 2: data understanding / data preparation..........................................................59
Evaluation step 3: Modeling........................................................................................................59
Evaluation step 4: evaluation......................................................................................................60
Evaluation step 5: Deployment....................................................................................................60
Chapter 14: Conclusion......................................................................................................60
What data can’t do: Humans in the loop.....................................................................................61
Privacy, ethics and mining data about individuals.......................................................................62
Case: The Dark Side of Customer Analytics (Davenport & Harris, 2007 HBR)...............................62
Discussion questions...................................................................................................................62
Review..............................................................................................................................62
Lecture 1: Introduction
- Chapter 1 + 2; Case: Capital One
Data science: set of fundamental principles that guide the extraction of knowledge from
data. Data mining: extraction of knowledge from data, via technologies that incorporate
these principles.
Chapter 1: Data analytics thinking
The ubiquity of data opportunities in the digital era
Over the last 25 years, many devices have become linked with each other through data. The
cost of storing this data has decreased.
Some observations in our daily life
- Marketing
o Online advertising
o Recommendations for cross selling
o Customer relationship management
- Finance
o Credit scoring and trading
o Fraud detection
o Workforce management
- Retail
o Marketing
o Supply chain management
Data is used daily in many organizations. Different technologies enable more use of data for
better decisions.
Fundamental concepts: data science and data driven decision making (DDD)
Data-driven decision making (DDD): practice of basing decisions on the analysis of data,
rather than purely on intuition.
- Relying on data and analysis. DDD assumes there is a lot of data, which is not
always the case. In a pandemic, for example, we learn as we go: in the initial stages
there is often little data.
Data science: Involves principles, processes and techniques for understanding phenomena
via the (automated) analysis of data.
- E.g.: What can we do to retain customers? Predict customer churn.
Data science supports DDD but also overlaps with it. Business decisions are
increasingly being made automatically. Data engineering and processing support data science
but are useful for much more.
The sort of decisions of interest:
1. Decisions which need discovery (non-obvious) within data
2. Repetitive decisions (especially at massive scale)
The decisions that are interesting for a company require insights that are not
obvious: they need discovery and are not intuitive. The other important element is
repetitive decisions. If there is a problem you face frequently, data science is
important.
Fundamental concepts: Big data
Data vs. information
Data can be converted into information, which gives it meaning. Data in itself has no meaning.
Big data: simply put, very large datasets, with three distinct characteristics (3Vs):
- Volume: quantity of generated and stored data
- Variety: type and nature of data
- Velocity: Speed at which the data is generated and processed
Big Data 1.0 is transforming into Big Data 2.0 (social networking components, rise of the
voice of the individual consumer).
Fundamental concepts: data science, data mining, and machine learning
Data science: involves principles, processes and techniques for understanding phenomena
via the (automated) analysis of data.
- Involves storage, collection, analysis and implementation.
Data mining: Extraction of knowledge from data, via technologies that incorporate these
principles. Data mining is one aspect of data science, extracting the knowledge.
We focus on 1) business understanding (how do you translate business problems into
data problems) and 2) data analysis (what kind of models can you use; understanding data
analysis).
Data analytics: Process of examining datasets in order to draw conclusions about the useful
information they may contain. What value do these models have? What is the framework?
- How much value does it create for a manager and how can you use it for the future?
Types of data analysis:
- Descriptive analysis (BI): what has happened?
o Simple descriptive statistics, dashboard, charts, diagrams.
- Predictive analysis: what could happen? (main focus of this course)
o Segmentation, regression
- Prescriptive analysis: what should we do?
o Complex models for product planning and stock optimization
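The three types of analysis can be illustrated with a minimal sketch. The sales figures, the current stock level and the ordering rule below are all made up for illustration; a least-squares trend line is only one of many possible predictive models.

```python
from statistics import mean

# Hypothetical monthly sales figures (made up for illustration).
months = [1, 2, 3, 4]
sales = [100.0, 110.0, 120.0, 130.0]

# Descriptive analysis: what has happened?
average_sales = mean(sales)

# Predictive analysis: what could happen?
# Fit a least-squares trend line: sales = intercept + slope * month.
m_bar, s_bar = mean(months), mean(sales)
slope = (
    sum((m - m_bar) * (s - s_bar) for m, s in zip(months, sales))
    / sum((m - m_bar) ** 2 for m in months)
)
intercept = s_bar - slope * m_bar
forecast_month_5 = intercept + slope * 5

# Prescriptive analysis: what should we do?
# Hypothetical rule: order enough to cover the forecast, given current stock.
current_stock = 30.0
order_quantity = max(0.0, forecast_month_5 - current_stock)

print(f"Average sales so far: {average_sales}")
print(f"Forecast for month 5: {forecast_month_5}")
print(f"Suggested order quantity: {order_quantity}")
```

The descriptive step only summarizes the past, the predictive step extrapolates it, and the prescriptive step turns the prediction into an action.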
Big data analysis as a strategic asset – video
Data comes from both internal and external sources, structured and unstructured. The idea
is to make meaning of the data. Combining big data with analysis is key to gaining a
competitive advantage.
Strategic asset: data and the capability to extract useful knowledge from data can be a
strategic asset. One has to think of data science as a strategic asset you invest in. You need
good people, good infrastructure and a process over years. It’s like R&D: a long-term project
that pays off in the long run.
Fundamental concepts
Classification: predict, for each individual in a population, which of a set of classes this
individual belongs to. It predicts whether something will happen. Often a binary target
(categorical, not numerical as in regression).
Scoring (class probability estimation): apply a score representing the probability that an
individual belongs to each class.
Regression: predict, for each individual, the numerical value of some variable for that
individual. It predicts how much something will happen.
Similarity matching: attempts to identify similar individuals based on data known about them.
Often used for product recommendations.
Clustering: group individuals in a population together by their similarity, but not driven by
any specific purpose. Useful in preliminary domain explorations to see which natural groups
exist.
Co-occurrence grouping (frequent itemset mining, association rule discovery, market-basket
analysis): find associations between entities based on transactions involving them. It
considers similarity of objects based on their appearing together in transactions (instead of
the objects’ attributes).
Profiling (behavior description): characterize the typical behavior of an individual, group or
population. Useful for establishing behavioral norms for anomaly detection applications such
as fraud systems.
Link prediction: predict connections between data items, usually by suggesting that a link
should exist, and possibly also estimating the strength of the link. Often used in social
networking systems.
Data reduction: replace a large set of data with a smaller set, which may be easier to deal
with but contains much of the important information.
Causal modelling: understand what events/actions actually influence others. Understand the
difference between two situations (treatment event vs no treatment).
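The difference between classification, scoring and regression can be sketched in a few lines. The models below are hypothetical: the weights are made up rather than fitted to data, and the function names are chosen only for this illustration.

```python
import math


def churn_score(months_inactive: float) -> float:
    """Scoring: estimated probability that a customer churns.

    Hypothetical logistic model; the weights 0.8 and -2.0 are made up.
    """
    return 1.0 / (1.0 + math.exp(-(0.8 * months_inactive - 2.0)))


def classify_churn(months_inactive: float, threshold: float = 0.5) -> str:
    """Classification: will the customer churn or not (binary target)?"""
    return "churn" if churn_score(months_inactive) >= threshold else "stay"


def predict_spend(months_inactive: float) -> float:
    """Regression: how much will the customer spend (numerical target)?

    Hypothetical linear model, floored at zero.
    """
    return max(0.0, 50.0 - 10.0 * months_inactive)


for months in (0, 2, 5):
    print(
        months,
        round(churn_score(months), 3),
        classify_churn(months),
        predict_spend(months),
    )
```

Classification answers whether something will happen, scoring attaches a probability to that answer, and regression answers how much.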
Business problems
From business problems to data mining tasks
A collaborative problem-solving process between business stakeholders and data scientists:
- Decomposing a business problem into (solvable) subtasks
- Matching subtasks with known tasks for which tools are available
- Solving the remaining non-matched subtasks by creativity
- Putting the subtasks together to solve the overall problem
Is there any pattern among the customers? Is there something common? To answer that,
you need data on customers. You need to get data, understand the models you want to
generate and run them. Choosing the right tools for the subtasks is important. There is a lot
of subjective evaluation (human evaluation) in this process. Your prior knowledge of the
domain becomes crucial.
Typology of methods
The key question: Is there a specific target variable?
- Yes → Supervised learning
- No → Unsupervised learning (e.g., are there clusters of customers that differ from the
others?)
Unsupervised learning = clustering, co-occurrence grouping, profiling
- Training data provides examples – no specific outcomes
- The machine tries to find specific patterns in the data
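The role of the target variable can be shown with a small sketch. The customer data is made up, and both methods (a single-threshold rule for the supervised case, a one-dimensional k-means with k = 2 for the unsupervised case) are simple stand-ins chosen for illustration, not the specific methods of the course.

```python
from statistics import mean

# Hypothetical monthly spend per customer (made up for illustration).
spend = [10.0, 12.0, 11.0, 50.0, 52.0, 49.0]
# Target variable: only available in the supervised setting.
churned = [True, True, True, False, False, False]


def fit_threshold(xs, ys):
    """Supervised: find the threshold t so that 'x < t means churn' makes the fewest errors."""
    best_t, best_errors = None, len(xs) + 1
    for t in sorted(xs):
        errors = sum((x < t) != y for x, y in zip(xs, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors


def two_means(xs, iterations=10):
    """Unsupervised: split values into two clusters without using any target.

    Assumes xs contains at least two distinct values.
    """
    c_low, c_high = min(xs), max(xs)
    for _ in range(iterations):
        low = [x for x in xs if abs(x - c_low) <= abs(x - c_high)]
        high = [x for x in xs if abs(x - c_low) > abs(x - c_high)]
        c_low, c_high = mean(low), mean(high)
    return sorted(low), sorted(high)


threshold, errors = fit_threshold(spend, churned)
print(f"Supervised: churn if spend < {threshold} ({errors} training errors)")
print(f"Unsupervised clusters: {two_means(spend)}")
```

The supervised rule is judged against the known outcomes, while the clustering only finds groups; whether those groups mean anything for the business is left to human interpretation.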