Customer Analytics
Lecture 1 - Introduction
Marketing: then and now
Marketing used to be about selling more: very product-centric, focused on short-term profitability. Around 1990, things started shifting toward relationships (relationship-focused). Customers are assets that generate profits over time. Instead of a short-term mindset, the question becomes: how do I grow profitability over a customer's lifetime?
Customer Lifecycle
Customers go through different stages. There is a product lifecycle, but also a customer lifecycle. Marketing is about acquiring (first contact with the firm), developing (changing behavior, upselling/cross-selling), and retaining (preventing from leaving) customers!
We are going to attack different parts of this in this course.
Customer Analytics
= Using customer data and statistical models to make business decisions:
Who should be targeted for… a marketing campaign, churn prevention (customers about to leave), cross-selling, acquisition?
Should we do a test before we roll it out? How big?
How many subscriptions/transactions can we predict over time for a cohort of customers?
How valuable is a customer to the firm over his or her lifecycle? How does it differ across customers?
Lectures 1-5: Short-term analytics
Testing and Uncertainty: Why test? Quantifying uncertainty; how large should the test be?
Models for selecting customers to target: which customers should be selected for e.g. acquisition, retention, direct mailing?
Models for customer development: collaborative filtering, cross-selling
o Guest lecture: Barrie Kersbergen (Bol) on Recommender systems in practice
Lectures 6-9: Long-term analytics
How does the customer base change over time as customers drop out? Why does retention increase over
time?
Customer lifetime value (CLV): who are the most valuable customers: how do you calculate the value to the
firm of the customer over his or her lifecycle?
o Guest lecture: Coolblue, implementing CLV
Grading:
1. Individual assignments 30%
2. Computer exam (individual) 70%
Pass course:
1. Final grade > 6
2. Exam grade > 5
3. The assignment grade still counts
Assignments
Each lecture has an assignment
Due the following Sunday, one week after the lecture
o Late assignments not accepted
It is OK if you discuss with others, but all assignments are to be done individually
Testvision software (if you have problems with any part, be sure to email Anne)
Data sets & software
Course is organized around several data sets that illustrate an important concept.
o All these examples will be “hands-on” and have an emphasis on real-time problem solving.
We’re using R this year (NOT SPSS as past years)
o Advantages: widely used & lots of contributed software, free
o Disadvantages: requires programming; packages and updates can be unpredictable
R notebooks in the computer lab
READINGS:
Book: Blattberg, Robert C., Byung-Do Kim, and Scott A. Neslin, “Why Database Marketing?”
Articles: other articles and material you can find on canvas under modules.
Module 1 - Testing and Uncertainty
E-Beer
E-Beer sells beer over the Internet and currently has about 50,000 customers
A customer selects a favorite brand and pays, and within 1 hour the ordered amount of beer is delivered to the specified address
To boost sales, E-beer developed a mailing to send to their customers
Each mailing contains a flyer to remind customers of the offered service and a key ring with the name and
web address of the company
Campaign costs
Each mailing costs: €1.50
Sending it to all customers would mean total costs of: €1.50 × 50,000 = €75,000
Is it worth it? Benefits > costs?
The problem is that the benefit is uncertain!
Testing
The objective of testing is to obtain more information before committing a large amount of resources and, hence,
reduce the risk of possible failure.
1. Randomly select some customers; call this test sample (size = n)
a. Split up your customer base and take a proportion of it.
2. Send them the mailing, collect the data & analyze responses
3. Use results to decide whether to send to the rest of the population (size N – n, rollout sample).
a. The key thing here is that you want the people in the sample to be representative of the people outside the sample → use simple random sampling!
Results of test
Assume we choose a test sample (n) of size 5,000. So, we randomly select 5,000 customers and send them the mailing
Results of test mailing:
o 175 out of 5,000 respond. So, the estimated response rate: p̂ = 175/5000 = 0.035
o We assume the margin, or the profit per response (profitability), is €50: m = 50
So should we do the rollout? How much would we expect to make if we send to the rest (rollout sample)?
Expected rollout profit = (N − n) × (p̂ × m − c) = 45,000 × (0.035 × 50 − 1.50) = 11,250
Rollout profit is positive (€11,250) → roll out to the rest of the customers
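The rollout calculation above can be checked with a quick sketch (Python here for illustration; the course itself uses R, but the arithmetic is identical):

```python
# Expected rollout profit for the E-Beer test (figures from the notes).
n, N = 5000, 50000          # test sample size and total customer base
responses = 175
p_hat = responses / n       # estimated response rate = 0.035
m, c = 50.0, 1.50           # margin per response, cost per mailing

rollout_size = N - n                    # 45,000 remaining customers
profit_per_customer = p_hat * m - c     # 0.035 * 50 - 1.50 = 0.25
rollout_profit = rollout_size * profit_per_customer
print(rollout_profit)                   # 11250.0
```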
Option value
Therefore, because our expected rollout profit is positive, we roll it out to the rest of the customer base.
o Bad campaigns are only tested
o Good campaigns are tested and then
rolled out
The test gives us the option, not the obligation, to roll out.
We only roll out when the results of the test are positive: p̂ > c/m.
p̂ is what we estimate from the test.
If the estimated response rate is greater than the break-even threshold, then it’s profitable to roll out this campaign!
What’s the threshold? The cost divided by the margin: 1.5/50 = 0.03 = 3%.
How big is the option value (DECISION-TREE)?
How valuable is having a test? Make a few assumptions and build a decision tree on them. A1: the test measures the response rate exactly (no error). A2: there are only two states of the world (success and failure); a success earns €1.00 per customer, a failure loses €1.00 per customer. A3: 30% of the time it is a success, 70% of the time a failure. Without a test, the expected value is negative, because 0.3 × 1.00 (success) + 0.7 × (−1.00) (failure) = −0.40 per customer.
Bottom branch: the company decides to do the test. There are two outcomes: either a success or a failure. On success, they do the test and then the rollout! On failure, they lose a euro per customer, but only on the test sample (so −€5,000, not that bad): losses are limited to the test.
Now we take one step backwards: what is the actual value? Value of the test = 11,500!
If the cost of the test is lower than 11,500 → DO THE TEST!
Success occurs 30% of the time (= assumption).
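Under these assumptions (success = +€1.00 per customer, failure = −€1.00 per customer, P(success) = 0.3, and a perfectly informative test), the decision tree can be sketched as:

```python
# Decision-tree value of testing, under the notes' assumptions.
n, N = 5000, 50000
p_success = 0.3

# No test: mail everyone and hope.
ev_no_test = N * (p_success * 1.00 + (1 - p_success) * -1.00)  # -20,000 -> don't mail

# Test first: roll out only after a successful test.
ev_success = n * 1.00 + (N - n) * 1.00   # test profit + rollout profit = 50,000
ev_failure = n * -1.00                   # lose only the test: -5,000
ev_test = p_success * ev_success + (1 - p_success) * ev_failure
print(ev_test)                           # 11500.0 -> the test is worth up to 11,500
```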
Uncertainty
The true, unobserved population response rate is p.
(From the example above) we tested on 5,000 people and have an estimated response rate from our test sample. The true response rate remains unobserved. Different samples from the same population would give different estimates of the response rate (there is sampling variation).
What we observe: the sample mean estimate p̂ = (number of responses)/n = 175/5000.
Its standard error: SE(p̂) = √(p̂(1 − p̂)/n).
There might be sampling error! By calculating the standard error (the variance p̂(1 − p̂) divided by our sample size, and then the square root of that) we can check how much variation there is.
Central limit theorem: for a large enough sample, the distribution of the sample mean is approximately normal. The estimate p̂ of our response rate has a distribution that is approximately normal, centered on the true p, with variance equal to the standard error of p̂ squared. p̂ is our response estimate; we would get different values under different samples (see the graph of the range of p̂ under random sampling). What is interesting for us is whether it is less than our break-even threshold. We can use the normal distribution to calculate the probability that p̂ is less than the break-even: the shaded area to the left of the line is about 0.027.
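With the notes' numbers (p̂ = 0.035, n = 5,000, break-even 0.03), the normal-approximation tail area works out to about 0.027; a minimal sketch:

```python
from statistics import NormalDist
from math import sqrt

n = 5000
p_hat = 175 / n                        # estimated response rate = 0.035
p_be = 1.50 / 50                       # break-even threshold = 0.03
se = sqrt(p_hat * (1 - p_hat) / n)     # standard error, about 0.0026

# CLT: p-hat is approximately Normal(p, se^2), so the probability of
# being below break-even is the normal area to the left of p_be.
prob_below = NormalDist(mu=p_hat, sigma=se).cdf(p_be)
print(round(prob_below, 3))            # about 0.027
```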
Bootstrap
= creating alternative samples from your original sample (sampling with replacement). Do this B times (B new samples, e.g. ten thousand datasets drawn from your test set; for each of these datasets you calculate the response rate: p̂₁, p̂₂, …, p̂_B) and use the spread of these as a measure of sampling variation. Of those ten thousand p̂’s we simulated, how many are less than our break-even threshold?
Sample with replacement from the original sample, using the same sample size.
For b = 1 … B bootstrap samples:
1. Resample with replacement: X₁*, …, Xₙ*
2. Calculate the estimate p̂_b using this resampled set
You now have a distribution, p̂₁, …, p̂_B.
The bootstrap is especially useful when you do not want to rely on the sample being large enough for the normal (CLT) approximation.
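The resampling steps above can be sketched as follows (a minimal Python sketch; B is reduced to 2,000 here so it runs quickly, where the notes use 10,000):

```python
import random
random.seed(0)

n, responses = 5000, 175
data = [1] * responses + [0] * (n - responses)  # the original test sample
p_be = 0.03                                     # break-even threshold
B = 2000                                        # number of bootstrap samples

boot_rates = []
for _ in range(B):
    resample = random.choices(data, k=n)        # sample WITH replacement, same size n
    boot_rates.append(sum(resample) / n)        # p-hat for this bootstrap sample

# Fraction of bootstrap response rates below break-even
frac_below = sum(r < p_be for r in boot_rates) / B
print(frac_below)   # close to the CLT answer of roughly 0.027
```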
Taking a step back:
Tests are useful: if the response rate is large/good enough → roll out.
The response rate is measured with error; different samples would give slightly different response rates. How slightly depends on what n is (variability in the response rate).
2 ways of assessing the variability of the response rate:
Central limit theorem: the estimate is approximately normally distributed
Bootstrap: resample the data and calculate the response rate for each resample
How big should the test (size) be?
The test was previously 5,000, but where did we get that number from? Should it be 10,000 or 1,000? What’s the right size? So far we have only talked about error in one direction: the estimated response rate is greater than the break-even (0.03) on average, but the actual response rate is less than 0.03 (so less than break-even) → bottom-left cell, a type 1 error! We roll out when we shouldn’t! What about the other type of error?
The estimate looks low (so no rollout) but the true rate is actually bigger than the break-even → a type 2 error: we don’t roll out but really should have rolled out!
The idea for choosing the right sample size: how much error are we going to tolerate for alpha (type 1) and how much for beta (type 2)? If we set those two things and have an idea of the effect size (how much larger p is than the break-even), we can then say how large our sample size should be.
How should you determine the sample size?
Use the excellent (free) software package GPower.
You set how much alpha and beta you are going to tolerate (type 1 and type 2 errors):
Go to Test family = “Exact”
Statistical test = “Proportion: Difference from a constant”
Set power (1 − β) = 0.95 (so β = 0.05)
Set α = 0.05
Set constant proportion = pBE (the break-even)
Set the effect size equal to how much over the break-even your best guess is: p − pBE = 0.035 − 0.030 = 0.005 (an estimate of how much the response rate exceeds the break-even!)
Put all of these things into the software; to keep the error rates at the desired levels, the sample size we need is about 13,615 (as the software gives).
We want enough data, so we can reliably tell whether something is 0.005 greater.
If the difference (p − pBE) were bigger, we would need a smaller sample.
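GPower's exact binomial calculation can be approximated in a few lines with the standard normal-approximation sample-size formula for a one-sided, one-sample proportion test; it lands close to the 13,615 the software reports:

```python
from statistics import NormalDist
from math import sqrt, ceil

alpha, beta = 0.05, 0.05
p0 = 0.030    # break-even response rate (constant proportion)
p1 = 0.035    # best guess; effect size p1 - p0 = 0.005

z_a = NormalDist().inv_cdf(1 - alpha)   # about 1.645
z_b = NormalDist().inv_cdf(1 - beta)    # about 1.645

# Normal-approximation sample size for testing p = p0 vs p = p1 (one-sided)
n = ((z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
print(ceil(n))   # roughly 13,600 -- same ballpark as GPower's exact 13,615
```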
Note from the lecturer: a student asked whether material mentioned in the readings but not on the slides (e.g. a lot of hard mathematical equations) is expected exam material.
If it’s not covered in the slides, you do NOT need to know it.
Making smarter use of the test results
So far, we’ve considered an “all or nothing” approach.
o Either you use your test sample, calculate the response rate, and send to EVERYONE (rollout) or to NO ONE (all or nothing approach)
o Not a really efficient use of a test; not every individual has the same response rate (too generalized)
What if we used the test to identify profitable groups, and target the mailing to them?
o No all-or-nothing approach!
o Taking into account that different individuals have different response rates (distinguishing between them)
Data for the groups
Where do we get this data from?
Of course, we can build a model to select
customers on the basis of many variables! Better predictions.
Most Common
Demographics: gender, ethnicity, age, income, family size, occupation, marital status, education,
homeowner or renter, length of residence (typically available for prospects)
Transaction data: past purchases, amounts, dates, discounts (already your customers!)
o Best predictor, but unavailable for prospects (more expensive data)
Marketing: past mailings, content mailings, date, costs
(Survey data, e.g. Psychographics): attitudes, interests, activities.
Example: better targeting with model
Let’s evaluate how well it predicts
Let’s say we use our test data to build a model that predicts each customer’s response probability
Profit contribution per response (m) = €80.00
Each mailing costs €0.70
Rollout sample size = 1,000,000
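With these figures, the break-even threshold works out much lower than in the E-Beer example, because the margin is higher and the mailing cheaper; a quick sketch:

```python
# Break-even response rate for the second campaign in the notes.
m, c = 80.00, 0.70          # margin per response, cost per mailing
N_rollout = 1_000_000       # rollout sample size
p_be = c / m
print(p_be)   # 0.00875 -> any group responding above 0.875% is worth mailing
```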