Consumer analytics using big data
Reading Hofacker et al.: Big data and consumer behaviour: Imminent
opportunities
Big Data is commonly characterised by three V’s: volume, velocity and variety (Meta Group, 2001).
Volume refers to the “bigness” property, while velocity refers to the rate at which digital processes
make Big Data even bigger. Variety refers to new formats and types of data.
In marketing, the main driver of the interest in Big Data is its potential usefulness for informing
marketing decisions and executing marketing campaigns.
The steps of the consumer decision process used in the paper are problem recognition, search,
alternative evaluation, purchase behaviour, consumption, post-purchase evaluation and
post-purchase engagement.
Problem recognition:
- In the first stage of consumer problem-solving, the consumer sees a gap between what he or
she has, or has experienced, and what he or she wants, or wants to experience.
Search:
- In the digital world, the problem is having too many alternatives. The enriched search
process now throws off digital data at every turn.
Alternative evaluation:
- The e-tailer will have data on alternative evaluations, including consideration sets and
inferred choice rules based on navigation sequences.
Purchase:
Consumption:
- Consumption increasingly happens online, for example posting about a product before
consuming it.
Post-purchase evaluation:
- The step by itself does not create Big Data, but writing a review or sharing pictures
does.
Post-purchase engagement:
- Product reviews are the prototypical Big Data exemplar. These exhibit all of the three V’s
mentioned in the opening section.
- Reviews and comments, and their consequences for others, take us full circle in Figure 1,
back to problem recognition and alternative evaluation, as one consumer’s behaviour
becomes the antecedent for another’s.
Problems with big data
- Most information is about the past. Without a model or theory, you cannot use the
information to predict the future
- Big data records what behaviour occurs, but not why:
o Surveys will continue to be used, especially for understanding prospective
customers, but the advantages of using records of customer behaviours are
intriguing and create opportunities to extend consumer behaviour (CB) research
- Big Data quality cannot be assumed
o Having a database does not mean that it can be used for marketing purposes.
Maintaining a clean database requires substantial effort, and the task of preparing a
data set for analysis will often take longer than the analysis itself.
- Big data sets may not be representative
o Marketers should not be impressed by the size of a data set alone, and should
inquire about how the data were sampled and potential biases created by the
sampling procedure
o Keeping only long-term customers creates survival bias; keeping only certain files
on a systematic basis creates selection bias
- Big Data may not generalize
o Another consideration of descriptive studies is the extent to which they generalize
to other periods or situations, i.e. external validity
- Big Data may have omitted variables
o Not all factors influencing consumer decisions will be recorded in Big Data sets,
creating potential omitted variable biases
o Examples: mass advertising, competitive action, individual differences
- Big data can be volatile
o The value of certain types of big data is perishable and may vanish in minutes
(e.g. a consumer's current distance to a shop)
- Big data show associations, but not causation
o See text
- Big data raises consumer privacy concerns and ethical issues
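The survival-bias point in the list above can be illustrated with a small sketch. The customer records and the 12-month cut-off below are made up for illustration and are not from the paper; the point is only that a data set containing long-term customers alone can look very different from the full customer base.

```python
from statistics import mean

# Hypothetical customers: months active and satisfaction on a 1-10 scale.
customers = [
    {"months_active": 1, "satisfaction": 2},
    {"months_active": 2, "satisfaction": 3},
    {"months_active": 3, "satisfaction": 2},
    {"months_active": 24, "satisfaction": 8},
    {"months_active": 36, "satisfaction": 9},
    {"months_active": 48, "satisfaction": 7},
]

# Assumed cut-off: the database only keeps customers active for a year or more.
long_term = [c for c in customers if c["months_active"] >= 12]

all_mean = mean(c["satisfaction"] for c in customers)
long_term_mean = mean(c["satisfaction"] for c in long_term)

print(f"All customers:       {all_mean:.2f}")
print(f"Long-term customers: {long_term_mean:.2f}")
```

The long-term sample reports a higher average satisfaction simply because dissatisfied customers left early and never made it into the data.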
TED talk: The era of blind faith in big data must end
What situations lead to success (learned from old data)?
- First rule: algorithms are opinions embedded in code
o When making a meal, a mom sees success when her child eats vegetables, while
the child sees success when it gets to eat Nutella
- It is not scientific, it is an opinion
Algorithms are not fair: they repeat our past practices and patterns, and they automate the status quo.
That would be great if the status quo were good, but it is not.
- Look at Fox News, where women did not get a fair chance. An algorithm that screens
applicants on having stayed 4 years with 1 promotion learns from that past and produces a
job market without women
- The same goes for neighbourhoods. Say police only patrol minority areas. A model predicting
where the next crime will be then points to the minority areas, because that is the only place they check
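The policing example above describes a feedback loop, which can be sketched in a few lines. This is a toy model under stated assumptions, not something taken from the talk: the area names, the starting counts and the rule "patrol the area with the most recorded crime" are all hypothetical, and both areas are given the same true crime rate on purpose.

```python
def simulate_patrols(initial_records, true_crimes_per_round, rounds):
    """Send the patrol to the area with the most recorded crime each round.

    Crime is only recorded where the patrol goes, so the records reflect
    where police looked, not where crime actually happened.
    """
    records = dict(initial_records)
    for _ in range(rounds):
        # The "prediction": the area with the most recorded crime so far.
        target = max(records, key=records.get)
        # Only crime in the patrolled area is observed and recorded.
        records[target] += true_crimes_per_round[target]
    return records


# Hypothetical starting point: area_1 has slightly more records from past focus.
initial = {"area_1": 6, "area_2": 5}
# Assumption: both areas have exactly the same true crime rate.
true_rate = {"area_1": 3, "area_2": 3}

final = simulate_patrols(initial, true_rate, rounds=10)
print(final)
```

Although the true rates are identical, every patrol goes to area_1, so its recorded crime keeps growing while area_2 never changes. The small initial difference is locked in and amplified by the data the model itself generates.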
Weapons of Math Destruction
Private companies are building private algorithms for private ends.
The algorithms are their secret sauce; it is private power.
Steps to check the algorithm:
- Data integrity check
- Definition of success
- Consider accuracy: look at the errors and ask for whom the model fails
- Consider the long-term effects of algorithms, the feedback loops
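The third step above (for whom does the model fail?) can be made concrete by splitting the errors per group instead of reporting a single accuracy figure. The sketch below is only one possible way to do this; the function name `error_rates_by_group`, the groups "A" and "B" and the records are hypothetical.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return overall accuracy and per-group false positive/negative rates.

    Each record is a tuple (group, actual, predicted) with boolean outcomes.
    """
    if not records:
        raise ValueError("records must not be empty")
    counts = defaultdict(lambda: {"pos": 0, "neg": 0, "fp": 0, "fn": 0})
    correct = 0
    for group, actual, predicted in records:
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1  # a real positive that the model missed
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1  # a real negative that the model flagged
        correct += actual == predicted
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else None,
            "false_negative_rate": c["fn"] / c["pos"] if c["pos"] else None,
        }
    return correct / len(records), rates


# Hypothetical screening results: (group, truly suitable, model says suitable).
records = [
    ("A", True, True), ("A", True, True), ("A", False, False), ("A", False, True),
    ("B", True, False), ("B", True, False), ("B", True, True), ("B", False, False),
]
accuracy, rates = error_rates_by_group(records)
print(accuracy)
print(rates)
```

In this made-up data the overall accuracy hides that suitable candidates in group B are rejected far more often than those in group A, which is the kind of asymmetry an audit is meant to surface.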
For the data scientists: they should not be the arbiters of truth, but the translators of the ethical
discussions that happen in larger society.
For the rest, the non-data scientists: this is not a math test, it is a political fight. We need to
demand accountability from our algorithmic overlords.
Introduction lecture
Key features of Big data:
- Volume
o Large amounts of data
o Hard to analyse using traditional statistical software and techniques
- Variety
o Facebook has data on what you write, your photos, your location, your clicks, your
mouse trajectories and who your friends are
- Velocity
o Data is generated continuously
o There are ~228 million Google searches per hour and ~6 thousand tweets per
second
How to analyse Big Data:
Machine learning is “a set of methods that can automatically detect patterns in data, and then use
the uncovered patterns to predict future data”
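As a minimal illustration of "detect patterns, then predict future data", the sketch below fits a straight line to past observations and applies it to new inputs. Simple least-squares regression is used here only because it is the smallest possible example; the definition above does not prescribe any particular method, and the visit and purchase numbers are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) model to a new input."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical past data: site visits per week and purchases per week.
past_visits = [1, 2, 3, 4, 5]
past_purchases = [3, 5, 7, 9, 11]

# Step 1: detect the pattern in existing data.
model = fit_line(past_visits, past_purchases)

# Step 2: use the pattern to predict data that has not been seen yet.
future_visits = [6, 7]
predictions = [predict(model, x) for x in future_visits]
print(model)
print(predictions)
```

In practice the model would be judged on how well it predicts held-out data, not on how well it describes the data it was fitted to.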
Statistics vs machine learning
Aspect | Traditional statistics | Machine learning
Main goal | Explain relationships in an existing dataset | Build a model that can predict future data
Motivation | Guided by researcher’s intuitions and hypotheses | Guided primarily by data (use whatever works best, even if it doesn’t make sense)
Methods | Transparent (can be explained and described) | Can be a “black box” (even the programmer might not know how a model makes predictions)
Ethical issues:
- Not giving actual consent for the collection of your data
- Algorithmic bias
o Example: Google Photos labelled Black people as gorillas; as a result, "gorilla" was
removed as a label
- Privacy concerns
Building a better Facebook:
- Identify a measure of success: more followers, more rentals, etc.
Going viral
- Think of a cause or a product
- Make it go viral
Reading Gladstone: Can Psychological Traits Be Inferred From
Spending? Evidence From Transaction Data
Digital music libraries or spending records, on the other hand, can be better described as
behavioural residues: subtle cues about people’s preferences inadvertently conveyed as a result of
their activity (Gosling et al., 2002).
- On the one hand, because spending is recorded passively and includes information often
hidden from other people, it might be less influenced by social desirability and might
therefore be a more accurate reflection of people’s psychological traits.
- On the other hand, spending records, like other behavioural residues, may be less predictive
of people’s psychological traits because they are weaker indicators of how people perceive
themselves and desire to be perceived by others
Big Five (OCEAN) plus two additional traits, materialism and self-control:
- People with materialistic tendencies prefer material goods over experiential ones
(Howell, Pchelin, & Iyer, 2012), and people with greater self-control spend
less on impulsive purchases and save more
We used participants’ relative spending across categories, rather than the raw amounts.
- To calculate this, we divided participants’ spending in an individual category by their overall
spending, giving us the relative amount spent in each specific category.
- Wealthier people are expected to spend relatively less on necessities than on other
products
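The relative-spending calculation described above can be sketched as follows. This is an illustration only, not the authors' code: the helper name `relative_spending`, the category names and the amounts are all made up.

```python
def relative_spending(spending):
    """Convert raw amounts per category into shares of overall spending."""
    total = sum(spending.values())
    if total == 0:
        # No spending at all: define every share as 0 to avoid dividing by zero.
        return {category: 0.0 for category in spending}
    return {category: amount / total for category, amount in spending.items()}


# Hypothetical participants (illustrative numbers, not data from the paper).
participant_a = {"groceries": 300.0, "travel": 100.0, "charity": 100.0}
participant_b = {"groceries": 600.0, "travel": 1800.0, "charity": 600.0}

shares_a = relative_spending(participant_a)
shares_b = relative_spending(participant_b)
print(shares_a)
print(shares_b)
```

Participant B spends twice as much on groceries in absolute terms but a much smaller share of the total, which shows why shares rather than raw amounts make participants with different budgets comparable.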
Discussion:
- Our findings contribute to research on the automatic prediction of psychological traits by
illustrating that digital records of spending can be used to predict personality at
unprecedented scale