Big Data Analytics using GIS - LECTURE NOTES (2019)
Course
Big Data Analytics using GIS
Institution
Vrije Universiteit Amsterdam (VU)
Lecture notes for Big Data Analytics using GIS, previously called GeoMarketing, at the Vrije Universiteit Amsterdam. Including practical manuals with screenshots on how to do stuff in QGIS! Lectures were given by Jaap Boter.
Big Data Analytics using GIS – Notes of Lectures
Danique Levering – Master Marketing – Vrije Universiteit
Course grade
- Exam (70%): min. 5.0
o Theory: papers on BB and lecture notes
o Practice: various materials
o Note that this methods course focuses more on knowledge and testing method – including
individual knowledge of QGIS and Data Science
- Team Paper (30%): min. 5.0
o Mini BDA thesis with all the components
o Exploring a (spatial) thesis topic, supported by papers
o Using and connecting multiple datasets
o Using the tools learned in this course (QGIS)
o Don’t care about significance (certainly not with a synthetic DV), but can you show how you
do it?
Overall score min. 5.5
Course material
1. Obligatory articles: reference list is provided during the exam
a. Wedel & Kannan (2016) – Marketing Analytics for Data-Rich Environments
b. Bronnenberg et al. (2007) – Consumer Packaged Goods in the United States: National
Brands, Local Branding
c. Kessler & McKenzie (2018) – A Geoprivacy Manifesto, Transactions in GIS (Early View)
d. Martin et al. (2017) – Data Privacy: Effects on Customer and Firm Performance
e. Forman et al. (2009) – Competition between Local and Electronic Markets: How the Benefit
of Buying Online Depends on Where You Live
f. Choi & Bell (2011) – Preference Minorities and the Internet
g. Hofstede et al. (2002) – Identifying Spatial Segments in International Markets
h. McCrea (2009) – Explaining Sociospatial Patterns in South East Queensland, Australia:
Social Homophily versus Structural Homophily, Environment and Planning
i. Garber et al. (2004) – From Density to Destiny: Using Spatial Dimension of Sales Data for
Early Prediction of New Product Success
j. Fong et al. (2015) – Geo-Conquesting: Competitive Locational Targeting of Mobile
Promotions
k. Ozimec et al. (2010) – Geographical Information Systems-Based Marketing Decisions:
Effects of Alternative Visualizations on Decision Quality
l. Elmer (2012) – Symbol Considerations for Bivariate Thematic Mapping (Introduction,
Theory, Results, Conclusion)
2. Lecture slides: only know the stuff from the additional papers that was used in the lecture
3. These lecture notes :)
a. Part 1 = Theory (p. 3 – p. 40)
b. Part 2 = Practice (QGIS) (p. 46 – p. ?)
Last Exam Tips
- 92% pass rate
- Digital exam: open up QGIS on your computer and do the following assignment. Don’t rely on your
team to use the program.
Big Data Analytics using GIS – Lecture Notes | Danique Levering
- Jaap: Lectures are not meant to summarize the papers for you, you should study the papers too. I
may indicate what is important in the lectures, but I focus mostly on the “so now what?” [= next step]
- The exam is different from previous years. It is now digital and covers both QGIS practice and theory (so
fewer theory questions). Old exams were mostly theory, which meant that I could ask one major set of
questions for each week. Our (new) exam will have a pretty big QGIS assignment, so there is less room for
theory (the exam still takes the same time)
- We can use plugins in the exam if we want
- Reference list is provided during the exam
- Not all papers are discussed in detail in the lectures. That doesn’t mean you can skip parts of them:
you need to learn the whole paper. In particular, he looks for a number of
particular steps: those 6 steps (see exam prep lecture).
- Jaap says his questions will be fair. They will always be about a part of a specific article you had to read,
or about an article we didn’t have to read but did discuss during the lecture. In that case you
should know what we discussed in the lecture.
- How many minutes to spend on the QGIS part: 45 minutes is a lot, but it would still be fine.
- Not every week or paper will be represented! It is going to be a random choice: 3 or 4 other questions next
to the QGIS part.
Green = listed literature
Dark green = unlisted literature
Blue = example
Red = exam hint from Jaap
BIG DATA ANALYTICS & MARKETING
What is so big about big data?
- Practical
o Just “more than you are accustomed to”, requiring new methods, hardware, solutions
o The three V’s (volume, velocity, variety) or more (value, veracity, variability, et cetera)
- Conceptual
o Computation as 3rd approach to scientific discovery (next to theory and experimentation) → who needs
theory if you have big data? Why come up with a theory first if you can just look at the data?
o The Fourth Paradigm (book) – free PDF
It’s the thought that counts
- Your data is unlikely to be really big (as in Big Data), though it is likely a lot bigger than an experiment (or
survey)
- What is relevant and important:
o The thought process: your thesis is based on secondary data. i.e., collected for other
purposes than your thesis. You are inferring meaning by seeing it as proxy for something.
Implications!
o The data handling: you need to wrangle data. There’s a lot of cleaning (thus first
exploratory data analysis). There is a lot of joining different data sets (and you’ll learn some
advanced options)
o Tools: there is a whole range of (software) tools that are useful in this process. The SPSS of
CMA is not sufficient.
The role of your theory knowledge
1. Critically reviewing data quality
How was data collected? Is this biased?
2. Operationalizing theoretical concepts
E.g. # of bars as proxy for social cohesion?
3. Selecting the right concepts/variables
You have an informed hunch
4. Understanding the results
What story are the data telling us?
5. Translation into action and consequences
In principle not very different
- Theory driven vs. data driven
- Both possible for thesis (with same result), but not very different
- In Data Science this often goes a lot further
o Many black box methods (e.g. neural networks)
o “But hey, if it works, it works, right?”
Example of a thesis: Tilburg box office data.
As the event approaches in time, the circle gets smaller. But maybe the red areas are red simply because more people live there in general. If there are differences in density, maybe you should do a relative analysis. It also depends on the genre of the event: with a Beyoncé concert, more people from further away will come, while a smaller event would be less interesting for people living further away. Finally, there could also be other factors, like income or driving time (construal level theory).
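The relative analysis mentioned above can be sketched as follows. This is only a minimal illustration, not the analysis from the thesis: the postcode areas, visitor counts and population figures are invented, and `visitors_per_1000` is a hypothetical helper name.

```python
# Hypothetical data: ticket buyers and inhabitants per postcode area (invented numbers).
areas = {
    "5011": {"visitors": 120, "population": 12000},
    "5041": {"visitors": 90, "population": 3000},
    "5171": {"visitors": 30, "population": 1500},
}

def visitors_per_1000(visitors, population):
    """Relative measure: visitors per 1,000 inhabitants."""
    return 1000 * visitors / population

rates = {
    code: visitors_per_1000(a["visitors"], a["population"])
    for code, a in areas.items()
}

# Rank areas by absolute count and by relative rate to see how the picture changes.
by_count = sorted(areas, key=lambda code: areas[code]["visitors"], reverse=True)
by_rate = sorted(rates, key=rates.get, reverse=True)

print("Ranked by absolute visitors:", by_count)
print("Ranked by visitors per 1,000:", by_rate)
```

In this made-up example the area with the most visitors in absolute terms ends up last once population is taken into account, which is exactly why a densely populated area can look "red" on a map of raw counts.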
Steps of Big Data Analytics
1. Assume you have the data first
2. What concepts does it relate to?
3. When exploring: what strikes you?
4. Theoretically, what might be involved?
5. Use this theory to build model and test effect
→ Possibly need to find additional data
6. Report your findings
→ It’s not a fishing expedition, where you just run multiple analyses until something is statistically
significant.
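Why running analyses until something is significant is a problem can be shown with a little arithmetic. The sketch below assumes independent tests and a significance level of 0.05; both are assumptions for illustration and were not given in the lecture. The function name is hypothetical.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability of at least one 'significant' result among n_tests
    independent tests when no real effect exists."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 5, 20):
    chance = chance_of_false_positive(n)
    print(f"{n:>2} tests: {chance:.0%} chance of at least one false positive")
```

Under these assumptions, 20 unrelated analyses already give roughly a 64% chance of finding at least one "significant" effect by accident, which is why theory should guide which effects you test.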
Note:
- Many variations on this are possible
- Consult your thesis supervisor
MARKETING ANALYTICS FOR DATA-RICH ENVIRONMENTS (WEDEL & KANNAN, 2016)
Comprehensive and recent overview of (the history of) Big Data in Marketing, with a focus on location.
This paper is important for the overview and the broader perspective.
Interesting (do learn!), but not tested in exam:
- All the different techniques in Marketing Analytics that you have never heard of before
- Similarly, the statistical methods
- Software and platforms for Big Data
!!! EXAM: the relationship to GeoMarketing
1. Can you place GeoMarketing in broader context?
E.g. what are trends or issues in the rise of Big Data and which also apply to GeoMarketing?
2. Do you know areas in Marketing Analytics where GeoMarketing can be a helpful approach?
E.g. which elements in Figure 5 might be supported how by spatial analysis?
Ethics as concern of its own → next week
- Wedel & Kannan (2016) explicitly mention the dark side of Big Data: privacy and security concerns
- Also, very true in GeoMarketing as it often ties together many sources to create more information
- Plus, it is difficult to anonymize data when coordinates are part of it
- Location can reveal a lot OR the wrong things.
CONSUMER PACKAGED GOODS IN THE UNITED STATES: NATIONAL BRANDS, LOCAL BRANDING
(BRONNENBERG ET AL., 2007)
Very simple (but large) dataset:
- Three years of ACNielsen scanner data
- On 31 product categories in 50 regions
- Sales figures and marketing variables
- Focus on the #1 and #2 in each category
1. First: basic test for regional differences
o Market share: national ≠ local
o Market structures: national ≠ local
o Pattern cross-region dispersion differs per brand
o This phenomenon is persistent over time
2. Second: what best explains variance in data? (Market region,
brand, time, interaction)
o Next: what, then, are national brands?
o Series of analyses similar to step 1
o Results:
§ National brands lead in multiple markets
§ But: numerous cases of local brands securing leadership in spite of small scale.
o Potential explanations:
§ Consumer differences (e.g. regional tastes → is this lifestyle or distance?)
§ Retailer/distribution differences
§ Manufacturer differences (e.g. turf division or historic order of entry)
o Why important?
§ Marketing standard: modeling sales to explain tactics
§ Current practice: time series analyses
§ This paper: variance across regions
§ Relevance paper: sales often modeled on just a few regions. This paper sheds light
on notions like ‘National’ brands
3. Result: time plays a very minor role.
Simple correlation: e.g. the #2 brand in diapers has a 25% national market share. What they then did was plot 50 dots, one per region, showing what the local market share was (e.g. locally a 50% market share). They won an award with this
simple correlation figure.
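The contrast between national and local market share can be sketched in a few lines. The regions and sales figures below are invented for illustration (the paper itself uses ACNielsen data for 50 regions), and the function names are hypothetical.

```python
# Hypothetical sales (in units) of one brand and of the whole category per region.
sales = {
    "Region A": {"brand": 500, "category": 1000},
    "Region B": {"brand": 100, "category": 1000},
    "Region C": {"brand": 150, "category": 1000},
}

def local_shares(data):
    """Market share of the brand within each region."""
    return {region: d["brand"] / d["category"] for region, d in data.items()}

def national_share(data):
    """Market share of the brand over all regions combined."""
    total_brand = sum(d["brand"] for d in data.values())
    total_category = sum(d["category"] for d in data.values())
    return total_brand / total_category

national = national_share(sales)
for region, share in local_shares(sales).items():
    # One line per region: the equivalent of one dot in the figure.
    print(f"{region}: local {share:.0%} vs national {national:.0%}")
```

With these made-up numbers the brand has a 25% national share, yet holds 50% in one region and only 10% in another, so the national figure hides most of the regional variation.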
WHAT IS GEOMARKETING?
There is no real definition, more a collective term for any geo/spatial aspects in marketing.
Spatial data in marketing isn’t really new
- Location planning (where to open a new store?)
- Geodemographics (where does who live?)
- Now: Spatial aspects of marketing concepts – coupon on your phone for the supermarket you are
in
More attention for GeoMarketing because of Big Data
- More online ordering (retailers have addresses now)
- Mobile phones with GPS (WOM now has coordinates)
- New PCs + (GIS) software (we can now use this data)
→ With this data you can now predict/understand/model:
o Adoption of innovations
o Loyalty
o Channel choice
o Etc.
So, location is the new data, and GIS is the new method to understand, visualize, combine and report it.
What is GIS? A computerized system to capture, store, update, manipulate, analyse and display geographically referenced information. You can look at it as different layers: store sales, number of inhabitants within 5 miles, etc. These programs are made so you can explore and visualize big data.
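The idea of combining layers, such as a store layer and a population layer, can be illustrated outside a GIS. The sketch below is plain Python rather than QGIS: the coordinates and inhabitant counts are invented, the function names are hypothetical, and distances are measured to neighbourhood centre points with the haversine formula as a simplifying assumption.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

# Layer 1 (hypothetical): one store location as (latitude, longitude).
store = (52.3340, 4.8650)

# Layer 2 (hypothetical): neighbourhood centre points with number of inhabitants.
neighbourhoods = [
    {"name": "N1", "lat": 52.3400, "lon": 4.8700, "inhabitants": 8000},
    {"name": "N2", "lat": 52.3700, "lon": 4.9000, "inhabitants": 12000},
    {"name": "N3", "lat": 52.0900, "lon": 5.1200, "inhabitants": 15000},
]

def inhabitants_within(store, areas, max_miles=5):
    """Sum inhabitants of all areas whose centre lies within max_miles of the store."""
    return sum(
        a["inhabitants"]
        for a in areas
        if distance_miles(store[0], store[1], a["lat"], a["lon"]) <= max_miles
    )

print(inhabitants_within(store, neighbourhoods))
```

A GIS does the same kind of work with proper geometries and map projections; the point here is only that each layer is a dataset with locations, and that location is the key used to combine them.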
Vector data: layers made up of lines and points that you can adjust individually, like objects in other programs. The other type is raster data, which you can compare with a normal photo. If you want to adjust it, you have to take a part of the photo and drag it.
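The difference between the two data types can be made concrete with a small sketch. It uses plain Python dictionaries and lists as stand-ins (real GIS file formats are more elaborate), and all coordinates and cell values are invented.

```python
# Vector data: features stored as coordinates plus attributes.
store_point = {"type": "Point", "coordinates": (4.865, 52.334), "name": "Store 1"}
road_line = {"type": "LineString", "coordinates": [(4.860, 52.330), (4.870, 52.338)]}

# Editing a vector feature means changing its coordinates directly.
store_point["coordinates"] = (4.866, 52.335)

# Raster data: a grid of cells, each holding one value (like pixels in a photo).
population_grid = [
    [0, 12, 30],
    [5, 40, 22],
    [0, 8, 15],
]

# Editing a raster means overwriting cell values; there are no separate objects to move.
population_grid[1][1] = 45

total_population = sum(sum(row) for row in population_grid)
print(store_point["coordinates"], total_population)
```

In the vector case the store is an object you can select and move; in the raster case there is only a grid of values, which is why it behaves like a photo.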
There are three main packages: MapInfo (business-oriented, easy to
use), ArcGIS (market leader, advanced) and QGIS (open source, great
for visualization). Different scales: desktop software (locally installed,
like QGIS), database server options or webservices. We use QGIS 3.4
‘Madeira’.
!!! EXAM: what is a GIS? (has been asked in exams in the past to get easy points)