Customer Analytics
Lecture 1 - Introduction
Marketing: then and now
Marketing used to be about selling more: very product-centric, focused on short-term profitability. Around 1990, things started shifting toward relationships (relationship-focused). Customers are assets that generate profits over time. Instead of a short-term mindset, the question becomes: how do I grow profitability over a customer's lifetime?
Customer Lifecycle
Customers go through different stages. There is a product lifecycle, but also a customer lifecycle. Marketing is about acquiring (first contact with the firm), developing (changing behavior, upselling/cross-selling), and retaining (preventing from leaving) customers!
We are going to attack different parts of this in this course.
Customer Analytics
= Using customer data and statistical models to make business decisions:
Who should be targeted for… a marketing campaign, churn prevention (customers about to leave), cross-selling, acquisition?
Should we do a test before we roll it out? How big?
How many subscriptions/transactions can we predict over time for a cohort of customers?
How valuable is a customer to the firm over his or her lifecycle? How does it differ across customers?
Lectures 1-5: Short-term analytics
Testing and Uncertainty: Why test? Quantifying uncertainty; how large should the test be?
Models for selecting customers to target: which customers should be selected for e.g. acquisition, retention, direct mailing?
Models for customer development: collaborative filtering, cross-selling
o Guest lecture: Barrie Kersbergen (Bol) on Recommender systems in practice
Lectures 6-9: Long-term analytics
How does the customer base change over time as customers drop out? Why does retention increase over
time?
Customer lifetime value (CLV): who are the most valuable customers: how do you calculate the value to the
firm of the customer over his or her lifecycle?
o Guest lecture: Coolblue, implementing CLV
Grading:
1. Individual assignments 30%
2. Computer exam (individual) 70%
Pass course:
1. Final grade > 6
2. Exam grade > 5
3. The assignment grade still counts
Assignments
Each lecture has an assignment
Due the following Sunday, one week after the lecture
o Late assignments not accepted
It is OK if you discuss with others, but all assignments are to be done individually
Testvision software (if you have problems with any part, be sure to email Anne)
Data sets & software
Course is organized around several data sets that illustrate an important concept.
o All these examples will be “hands-on” and have an emphasis on real-time problem solving.
We’re using R this year (NOT SPSS as past years)
o Advantages: widely used & lots of contributed software, free
o Disadvantages: requires programming; packages and updates can be unpredictable
R notebooks in the computer lab
READINGS:
Book: Blattberg, Robert C., Byung-Do Kim, and Scott A. Neslin, “Why Database Marketing?”
Articles: other articles and material you can find on canvas under modules.
Module 1 - Testing and Uncertainty
E-Beer
E-Beer sells beer over the Internet and currently has about 50,000 customers
A customer selects a favorite brand and pays, and within 1 hour the ordered amount of beer is delivered to the specified address
To boost sales, E-beer developed a mailing to send to their customers
Each mailing contains a flyer to remind customers of the offered service and a key ring with the name and
web address of the company
Campaign costs
Each mailing costs: €1.50
Sending it to all customers would mean total costs of: €1.50 × 50,000 = €75,000
Is it worth it? Benefits > costs?
The problem is that the benefit is uncertain!
Testing
The objective of testing is to obtain more information before committing a large amount of resources and, hence,
reduce the risk of possible failure.
1. Randomly select some customers; call this test sample (size = n)
a. Split up your customer base and take a proportion of it.
2. Send them the mailing, collect the data & analyze responses
3. Use results to decide whether to send to the rest of the population (size N – n, rollout sample).
a. The key thing here is that you want the people in the sample to be representative of the people outside the sample → use simple random sampling!
Results of test
Assume we choose a test sample (n) of size 5,000. So, we randomly select 5,000 customers and send them the mailing
Results of test mailing:
o 175 out of 5,000 respond. So, the estimated response rate: p̂ = 175/5000 = 0.035
o We assume the margin, or the profit per response (profitability), is €50: m = 50
So should we do the rollout? How much would we expect to make if we send to the rest (rollout sample)?
Expected rollout profit = (N − n) × (p̂ × m − c) = 45,000 × (0.035 × 50 − 1.50) = 11,250
Rollout profit is positive (€11,250) → roll out to the rest of the customers
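The rollout calculation above can be checked with a quick sketch (Python here for illustration; the course itself uses R, but the arithmetic is identical):

```python
# Expected rollout profit for the E-Beer test (figures from the notes).
n, N = 5000, 50000          # test sample size and total customer base
responses = 175
p_hat = responses / n       # estimated response rate = 0.035
m, c = 50.0, 1.50           # margin per response, cost per mailing

rollout_size = N - n                    # 45,000 remaining customers
profit_per_customer = p_hat * m - c     # 0.035 * 50 - 1.50 = 0.25
rollout_profit = rollout_size * profit_per_customer
print(rollout_profit)                   # 11250.0
```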
Option value
Therefore, because our expected rollout profit is positive, we roll it out to the rest of the customer base.
o Bad campaigns are only tested
o Good campaigns are tested and then
rolled out
The test gives us the option, not the obligation, to roll out.
We only roll out when the results of the test are positive: p̂ > c/m.
p̂ is what we estimate from the test.
If the estimated response rate is greater than the break-even threshold, then it’s profitable to roll out this campaign!
What’s the threshold? The cost divided by the margin: 1.5/50 = 0.03 = 3%.
How big is the option value (DECISION-TREE)?
How valuable is having a test? Make a few assumptions and build a decision tree on them. A1: the test measures the response rate exactly (no error). A2: there are only two states of the world (success and failure); a success earns €1.00 per customer, a failure loses €1.00 per customer. A3: 30% of the time it is a success, 70% of the time a failure. Without a test, the expected value is negative, because 0.3 × 1.00 (success) + 0.7 × (−1.00) (failure) = −0.40 per customer.
Bottom branch: the company decides to do the test. There are two outcomes: either a success or a failure. On success, they do the test and then the rollout! On failure, they lose a euro per customer, but only on the test sample (so −€5,000, not that bad): losses are limited to the test.
Now we take one step backwards: what is the actual value? Value of the test = 11,500!
If the cost of the test is lower than 11,500 → DO THE TEST!
Success occurs 30% of the time (= assumption).
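Under these assumptions (success = +€1.00 per customer, failure = −€1.00 per customer, P(success) = 0.3, and a perfectly informative test), the decision tree can be sketched as:

```python
# Decision-tree value of testing, under the notes' assumptions.
n, N = 5000, 50000
p_success = 0.3

# No test: mail everyone and hope.
ev_no_test = N * (p_success * 1.00 + (1 - p_success) * -1.00)  # -20,000 -> don't mail

# Test first: roll out only after a successful test.
ev_success = n * 1.00 + (N - n) * 1.00   # test profit + rollout profit = 50,000
ev_failure = n * -1.00                   # lose only the test: -5,000
ev_test = p_success * ev_success + (1 - p_success) * ev_failure
print(ev_test)                           # 11500.0 -> the test is worth up to 11,500
```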
Uncertainty
The true, unobserved population response rate is p.
(From the example above) we tested on 5,000 people and have an estimated response rate from our test sample. The true response rate remains unobserved. Different samples from the same population would give different estimates of the response rate (there is sampling variation).
What we observe: the sample mean estimate p̂ = (number of responses)/n = 175/5000.
Its standard error: SE(p̂) = √(p̂(1 − p̂)/n).
There might be sampling error! By calculating the standard error (the variance p̂(1 − p̂) divided by our sample size, and then the square root of that) we can check how much variation there is.
Central limit theorem: for a large enough sample, the distribution of the sample mean is approximately normal. The estimate p̂ of our response rate has a distribution that is approximately normal, centered on the true p, with variance equal to the standard error of p̂ squared. p̂ is our response estimate; we would get different values under different samples (see the graph of the range of p̂ under random sampling). What is interesting for us is whether it is less than our break-even threshold. We can use the normal distribution to calculate the probability that p̂ is less than the break-even: the shaded area to the left of the line is about 0.027.
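With the notes' numbers (p̂ = 0.035, n = 5,000, break-even 0.03), the normal-approximation tail area works out to about 0.027; a minimal sketch:

```python
from statistics import NormalDist
from math import sqrt

n = 5000
p_hat = 175 / n                        # estimated response rate = 0.035
p_be = 1.50 / 50                       # break-even threshold = 0.03
se = sqrt(p_hat * (1 - p_hat) / n)     # standard error, about 0.0026

# CLT: p-hat is approximately Normal(p, se^2), so the probability of
# being below break-even is the normal area to the left of p_be.
prob_below = NormalDist(mu=p_hat, sigma=se).cdf(p_be)
print(round(prob_below, 3))            # about 0.027
```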
Bootstrap
= creating alternative samples from your original sample (sampling with replacement). Do this B times (B new samples, e.g. ten thousand datasets drawn from your test set; for each of these datasets you calculate the response rate: p̂₁, p̂₂, …, p̂_B) and use the spread of these as a measure of sampling variation. Of those ten thousand p̂’s we simulated, how many are less than our break-even threshold?
Sample with replacement from the original sample, using the same sample size.
For b = 1 … B bootstrap samples:
1. Resample with replacement: X₁*, …, Xₙ*
2. Calculate the estimate p̂_b using this resampled set
You now have a distribution, p̂₁, …, p̂_B.
The bootstrap is especially useful when you do not want to rely on the sample being large enough for the normal (CLT) approximation.
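The resampling steps above can be sketched as follows (a minimal Python sketch; B is reduced to 2,000 here so it runs quickly, where the notes use 10,000):

```python
import random
random.seed(0)

n, responses = 5000, 175
data = [1] * responses + [0] * (n - responses)  # the original test sample
p_be = 0.03                                     # break-even threshold
B = 2000                                        # number of bootstrap samples

boot_rates = []
for _ in range(B):
    resample = random.choices(data, k=n)        # sample WITH replacement, same size n
    boot_rates.append(sum(resample) / n)        # p-hat for this bootstrap sample

# Fraction of bootstrap response rates below break-even
frac_below = sum(r < p_be for r in boot_rates) / B
print(frac_below)   # close to the CLT answer of roughly 0.027
```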
Taking a step back:
Tests are useful: if the response rate is large/good enough → roll out.
The response rate is measured with error; different samples would give slightly different response rates. How slightly depends on what n is (variability in the response rate).
2 ways of assessing the variability of the response rate:
Central limit theorem: the estimate is approximately normally distributed
Bootstrap: resample the data and calculate the response rate for each resample
How big should the test (size) be?
The test was previously 5,000, but where did we get that number from? Should it be 10,000 or 1,000? What’s the right size? So far we have only talked about error in one direction: the estimated response rate is greater than the break-even (0.03) on average, but the actual response rate is less than 0.03 (so less than break-even) → bottom-left cell, a type 1 error! We roll out when we shouldn’t! What about the other type of error?
The estimate looks low (so no rollout) but the true rate is actually bigger than the break-even → a type 2 error: we don’t roll out but really should have rolled out!
The idea for choosing the right sample size: how much error are we going to tolerate for alpha (type 1) and how much for beta (type 2)? If we set those two things and have an idea of the effect size (how much larger p is than the break-even), we can then say how large our sample size should be.
How should you determine the sample size?
Use the excellent (free) software package GPower.
You set how much alpha and beta you are going to tolerate (type 1 and type 2 errors):
Go to Test family = “Exact”
Statistical test = “Proportion: Difference from a constant”
Set power (1 − β) = 0.95 (so β = 0.05)
Set α = 0.05
Set constant proportion = pBE (the break-even)
Set the effect size equal to how much over the break-even your best guess is: p − pBE = 0.035 − 0.030 = 0.005 (an estimate of how much the response rate exceeds the break-even!)
Put all of these things into the software; to keep the error rates at the desired levels, the sample size we need is about 13,615 (as the software gives).
We want enough data, so we can reliably tell whether something is 0.005 greater.
If the difference (p − pBE) were bigger, we would need a smaller sample.
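GPower's exact binomial calculation can be approximated in a few lines with the standard normal-approximation sample-size formula for a one-sided, one-sample proportion test; it lands close to the 13,615 the software reports:

```python
from statistics import NormalDist
from math import sqrt, ceil

alpha, beta = 0.05, 0.05
p0 = 0.030    # break-even response rate (constant proportion)
p1 = 0.035    # best guess; effect size p1 - p0 = 0.005

z_a = NormalDist().inv_cdf(1 - alpha)   # about 1.645
z_b = NormalDist().inv_cdf(1 - beta)    # about 1.645

# Normal-approximation sample size for testing p = p0 vs p = p1 (one-sided)
n = ((z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
print(ceil(n))   # roughly 13,600 -- same ballpark as GPower's exact 13,615
```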
Note from the lecturer: a student asked whether material mentioned in the readings but not on the slides (e.g. a lot of hard mathematical equations) is expected exam material.
If it’s not covered in the slides, you do NOT need to know it.
Making smarter use of the test results
So far, we’ve considered an “all or nothing” approach.
o Either you use your test sample, calculate the response rate, and send to EVERYONE (rollout) or to NO ONE (all or nothing approach)
o Not a really efficient use of a test; not every individual has the same response rate (too generalized)
What if we used the test to identify profitable groups, and target the mailing to them?
o No all-or-nothing approach!
o Taking into account that different individuals have different response rates (distinguishing between them)
Data for the groups
Where do we get this data from?
Of course, we can build a model to select
customers on the basis of many variables! Better predictions.
Most Common
Demographics: gender, ethnicity, age, income, family size, occupation, marital status, education,
homeowner or renter, length of residence (typically available for prospects)
Transaction data: past purchases, amounts, dates, discounts (already your customers!)
o Best predictor, but unavailable for prospects (more expensive data)
Marketing: past mailings, content mailings, date, costs
(Survey data, e.g. Psychographics): attitudes, interests, activities.
Example: better targeting with model
Let’s evaluate how well it predicts
Let’s say we use our test data to build a model that predicts each customer’s response probability
Profit contribution per response (m) = €80.00
Each mailing costs €0.70
Rollout sample size = 1,000,000
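With these figures, the break-even threshold works out much lower than in the E-Beer example, because the margin is higher and the mailing cheaper; a quick sketch:

```python
# Break-even response rate for the second campaign in the notes.
m, c = 80.00, 0.70          # margin per response, cost per mailing
N_rollout = 1_000_000       # rollout sample size
p_be = c / m
print(p_be)   # 0.00875 -> any group responding above 0.875% is worth mailing
```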