MARKETING ANALYTICS FOR BIG DATA
LECTURE 1: BIG DATA
CLIP 1.1: WHAT ARE BIG DATA?
Observing Behavior
Digital traces (or exhaust): a record created and stored of some behavior
• Click on a website, call or location on a phone, buy with credit card, like or share, watch,
internet of things (sensor data), image, text, …
Big Data
The amount of data being stored keeps getting larger, and computing power is increasing rapidly too: there is an explosion in both the amount of information and the computation power available.
Data and two notions of big
Columns = the variables. Rows = the observations. The width of the data is the number of variables = p. The number of observations on those p variables (how many people you have data on, how many visits are made, how many customers) = n.
Data can be big in different ways
If you have a lot of variables, you need a lot of observations to estimate the effects and learn about the relationships between variables. If you are not interested in looking at many variables, it doesn’t make sense to collect many observations.
What is really big?
• Tall data: n is much larger than p (many observations, relatively few variables)
• Wide data: p is much larger than n (few observations, many variables)
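The two notions of big can be made concrete with a small sketch. It treats a dataset as a list of rows, counts n and p, and labels the shape. The function name `describe_shape` and the toy numbers are assumptions made for illustration, and the plain comparison n > p is a simplification of "much larger".

```python
def describe_shape(rows):
    """Return (n, p, label) for a dataset given as a list of equal-length rows."""
    n = len(rows)                      # number of observations (rows)
    p = len(rows[0]) if rows else 0    # number of variables (columns)
    if n > p:
        label = "tall"
    elif p > n:
        label = "wide"
    else:
        label = "square"
    return n, p, label

# Hypothetical toy data: 5 customers, 2 variables each (visits, purchases)
tall = [[3, 1], [10, 4], [1, 0], [7, 2], [2, 2]]
# Hypothetical toy data: 2 customers, 6 variables each
wide = [[3, 1, 0, 5, 2, 9], [10, 4, 1, 0, 0, 3]]

print(describe_shape(tall))  # (5, 2, 'tall')
print(describe_shape(wide))  # (2, 6, 'wide')
```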
Big data
Early definition: too large to be loaded into one machine, “distributed-data big”, domain of computer
engineering. (Average would have to be calculated over different machines and aggregated)
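A minimal sketch of what "calculated over different machines and aggregated" could look like for an average: each machine reports only its local sum and count, and those are combined afterwards. The function names and the three chunks of numbers are hypothetical, and the "machines" here are just Python lists.

```python
def partial_summary(chunk):
    """Computed locally on one machine: the sum and count of its values."""
    return sum(chunk), len(chunk)

def combine_mean(summaries):
    """Aggregate the per-machine (sum, count) pairs into one overall mean."""
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    return total / count

# Hypothetical data split unevenly over three "machines"
chunks = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
summaries = [partial_summary(c) for c in chunks]
overall_mean = combine_mean(summaries)
print(overall_mean)  # 5.0
```

Sums and counts are passed around rather than per-machine averages, because averaging the three local means directly would weight the unequal chunks equally and give a different answer.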
More our focus: “Big data is a shorthand label that typically means applying the tools of artificial
intelligence, like machine learning, to vast new troves of data beyond that captured in standard
databases. The new data sources include web-browsing data trails, social network communications,
sensor data and surveillance data”
“The term itself is vague, but it is getting at something that is real” (Jon Kleinberg)
Primary vs. Secondary
• Primary data (custom-made): data collected to answer a specific research question.
• Secondary data (readymade): data collected for a non-research purpose (e.g., generating
profits, administering laws)
- Bit by Bit: big data is a form of secondary data
While most big data applications are secondary or “readymade”, not all are (disagreement with the reading)
Types of Business data
At least 80% of the data in organizations is unstructured. It has to be organized first, and only then can you do
something with it.
CLIP 1.2: USES OF BIG DATA
Big data for personalization
“Recommendation algorithms are at the core of the Netflix product. They provide our members with
personalized suggestions to reduce the amount of time and frustration to find some great content to
watch.”
They use big data on a customer’s preferences and choices, together with those of other customers, to make automated
recommendations → very important for these types of companies.
Big data for boosting engagement
“At any given point in time, there isn’t just one version of Facebook running, there are probably
10,000” (Mark Zuckerberg).
Example: two different types of ads, one showing how many people like it and one showing how many of your friends like it. If your
friends like it, does that make you like it too and boost engagement? Big data can also be primary data:
data collected to investigate that specific issue, not for another purpose (they use massive A/B
tests).
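One common way to read the outcome of an A/B test is a two-proportion z-test on the engagement rates of the two versions. The sketch below is an illustration under assumptions, not a description of how Facebook analyzes its tests: the counts are made up, and the function names are hypothetical.

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Return (rate_a, rate_b, z) for a two-proportion z-test with pooled variance."""
    rate_a = clicks_a / n_a
    rate_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    return rate_a, rate_b, z

def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Made-up counts: version A shows total likes, version B shows friends' likes
rate_a, rate_b, z = two_proportion_z(clicks_a=200, n_a=10_000,
                                     clicks_b=260, n_b=10_000)
p_value = two_sided_p_value(z)
print(f"rate A = {rate_a:.3f}, rate B = {rate_b:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

Because users are assigned to versions at random, a difference in rates that is large relative to its standard error can be attributed to the ad version rather than to who happened to see it.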
Big data for new product development
Tastewise: used to come up with new product ideas by seeing what food is trending.
“We monitor social chatter, nearly 100% of online recipes and the country’s most influential
restaurants & menus in order to understand how food is prepared, loved, and shared.”
Big data for reducing churn
• Customer churn: a customer quits some service (the rate of attrition, or churn rate, is the
rate at which customers stop doing business with an entity)
• Use past data to estimate a model that predicts churn
- Predictors: length of time as a customer (tenure), number of other services subscribed to,
demographics
• Use that model to predict the probability of churn for current customers
• Intervene on those most likely to churn: prioritize these people and offer them something (a
discount, etc.) to make them stay
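The steps above can be sketched with a small logistic regression fitted by gradient descent. Everything specific here is an assumption for illustration: the notes do not say which model is used, the past and current customers are made-up toy data, demographics are left out, and the function names are hypothetical.

```python
import math

def predict_proba(w, x):
    """Predicted churn probability for features x given weights w = [intercept, ...]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent on the average log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi   # prediction error for this customer
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Step 1: past data (made up). Features: [tenure in years, other services]
X_past = [[0.5, 0], [1.0, 1], [1.0, 0], [2.0, 0], [2.0, 2], [3.0, 1],
          [4.0, 0], [4.0, 2], [5.0, 3], [6.0, 1], [0.8, 2], [3.0, 0]]
y_past = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1]   # 1 = churned, 0 = stayed

# Step 2: estimate the model on past customers
weights = fit_logistic(X_past, y_past)

# Step 3: predict churn probability for current customers (made up)
current = {"A": [0.5, 0], "B": [3.0, 1], "C": [6.0, 3]}
scores = {name: predict_proba(weights, x) for name, x in current.items()}

# Step 4: rank customers so the most likely churners are contacted first
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, prob in ranked:
    print(f"customer {name}: predicted churn probability {prob:.2f}")
```

The ranking, rather than the exact probabilities, is what drives the intervention: a retention offer goes to the customers at the top of the list.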
Big data for public policy & economy
• “Now the mobile phone has become a primary source of public data intelligence”
• After the pandemic:
• Are people going back to work?
• Are people going back to restaurants?
Surveys take time. Often we want to know the results in real time, as things are happening. The advantage of big data
is that it helps answer these questions quickly.
Google mobility data
This is for Noord-Brabant. If you have left your location services switched on, they can record everything. You can see trends, workplaces, everything. Many people are still working from home. You can see all of this through big data.
OpenTable restaurant reservations
During the pandemic, reservations dropped by 100%. Now they are going up again and seem to be back at the earlier level.
This highlights how big data can be used to answer public policy questions.