Summary Online Data Collection and Management (ODCM) 2022/2023 - All Lectures, Readings, Tutorials


Summary of all the readings, lectures, and tutorials (incl. answers) for the course Online Data Collection and Management (ODCM). Not in a bullet-list type of way so you have to figure everything out yourself, but in clear, concise language. This file is a must-have for the open-book exam of this c...


  • December 22, 2022
Table of Contents
WEEK 0: PREPARATION BEFORE THE COURSE STARTS (22/08-28/08)
Datacamp – Introduction to Python
Chapter 1: Python Basics
Chapter 2: Python Lists
Chapter 3: Functions and Packages
Table: All Functions from the Chapters

WEEK 1 – LECTURE: GETTING STARTED WITH PYTHON & WEB DATA (30/08)
Lecture Notes
Tutorial: Python Bootcamp for Web Data

WEEK 2 – TUTORIAL: WEB SCRAPING FOR DUMMIES (08/09)
Video lecture: What is Web Scraping and What are Application Programming Interfaces (APIs)? (20:43)
What is web scraping?
What is an API?
Summary
Webinar: Boegershausen, J., Datta, H., Borah, A., & Stephen, A.T. (2022). Fields of Gold: Scraping Web Data for Marketing Insights. Journal of Marketing, 86(5), 1-20.
Web Data in Academic Marketing Research and How to Extract It
Pathways for Creating New Marketing Knowledge (rewatch at 6:00)
Managing the Idiosyncratic Legal, Technical, and Validity Challenges of Web Data
& Focus on Three Key Stages: Source Selection, Design, Extraction
Paper: Boegershausen, J., Datta, H., Borah, A., & Stephen, A.T. (2022). Fields of Gold: Scraping Web Data for Marketing Insights. Journal of Marketing, 86(5), 1-20.
Abstract
Introduction
Using Web Data to Advance Marketing Thought
• Studying New Phenomena
• Boosting Ecological Value
• Facilitating Methodological Advancement
• Improving Measurement
• Summary
Methodological Framework for Collecting Web Data
Data Source Selection
Designing the Data Collection
Collecting the Data
Summary tables
Future Research Opportunities with Web Data
Web Appendix A: Comparing Web Scraping and APIs
Web Appendix C: Marketing Research using Web Data
Web Appendix D: Legal Considerations
Web Appendix F: Calculation of Technically Feasible Sample Sizes
In-Class Tutorial: Web Data for Dummies
After-Class Tutorial: Web Data for Dummies

WEEK 3 – TUTORIAL: WEB SCRAPING 101 (15/09)
In-Class Tutorial: Web Scraping 101
After-Class Tutorial: Web Scraping 101

WEEK 4 – TUTORIAL: APIS 101 (22/09)
In-Class Tutorial: APIs 101
After-Class Exercises: APIs 101

WEEK 6 – LECTURE: TEAM COACHING #5 (06/10)
Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J., Wallach, H., Daumé III, H., & Crawford, K. (2018). Datasheets for Datasets (cite arxiv:1803.09010). Working paper.
1. Introduction
1.1 Objectives
3. Questions and Workflow

WEEK 0: PREPARATION BEFORE THE COURSE STARTS (22/08-28/08)

Literature
Datacamp – Introduction to Python:
• Chapter 1: Python Basics
• Chapter 2: Python Lists
• Chapter 3: Functions and Packages


Datacamp – Introduction to Python
Chapter 1: Python Basics
The Python shell is a place where you can type Python code and immediately see the results. You can
also have Python run so-called Python scripts, which are simply text files with the extension .py.
Q: For which applications can you use Python?
a. You want to do some quick calculations.
b. For your new business, you want to develop a database-driven website.
c. Your boss asks you to clean and analyze the results of the latest satisfaction survey.
d. All of the above.
A: D.
You can add comments to your Python script with the # symbol; Python ignores everything after it on that line.
You define a variable with the equals sign (=), which assigns a value to a name. For example, height = 1.79.
The following data types are common in Python:
• A float is a real number, i.e., a number that has both an integer part and a fractional part (e.g.,
1.1).
• An integer (int) is a number without a fractional part (e.g., 100).
• A string (str) is Python’s way to represent text.
• A Boolean (bool) is a type that can either be True or False.
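The four types above can be checked with type(). The sketch below is only an illustration: apart from height = 1.79, all variable names and values are assumptions and do not come from the course material.

```python
# Only height = 1.79 appears in the notes; the other values are made up.
height = 1.79        # float: has a fractional part
weight = 68          # int: no fractional part (assumed value)
name = "Sam"         # str: text (assumed value)
is_student = True    # bool: True or False (assumed value)

# Arithmetic that mixes an int and a float produces a float.
bmi = weight / height ** 2

print(type(height))
print(type(weight))
print(type(name))
print(type(is_student))
print(type(bmi))
```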
Q: Which one of these will throw an error?
a. “I can add integers, like ” + str(5) + “ to strings.”
b. “I said ” + (“Hey ” * 2) + “Hey!”
c. “The correct answer to this multiple choice exercise is the answer number ” + 2
d. True + False
A: C.
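To see why only option c fails, the four expressions can be run side by side. This is a small sketch with straight quotes (Python does not accept curly quotes); the variable names a, b, c, and d are just labels for the answer options.

```python
# a: str(5) converts the integer to a string first, so + joins strings.
a = "I can add integers, like " + str(5) + " to strings."

# b: multiplying a string by an integer repeats the string.
b = "I said " + ("Hey " * 2) + "Hey!"

# d: Booleans behave like 1 and 0 in arithmetic.
d = True + False

# c: adding an integer directly to a string raises a TypeError.
try:
    c = "The correct answer to this multiple choice exercise is the answer number " + 2
except TypeError as err:
    c = None
    print("Option c failed:", err)

print(a)
print(b)
print(d)
```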

Chapter 2: Python Lists
A list is a compound data type. You can build a list using square brackets. For example, fam = [1.73, 1.68,
1.71, 1.89] (avoid naming a variable list, because that hides Python’s built-in list() function). A list can
contain any Python type. Although it’s not really common, a list can also contain a mix of Python types,
including strings, floats, Booleans, etc. A list can also contain a list.
Q: Which of the following lines of Python code are valid ways to build a list?
a. [1, 3, 4, 2]
b. [[1, 2, 3], [4, 5, 7]]
c. [1 + 2, “a” * 5, 3]
A: A, B, and C.
To select an element from a list, you use square brackets with an index. Python indexing starts at 0, so
fam[2] returns the element at index 2, which is the third item in the list. It is also possible to slice
your list, which means selecting multiple elements at once. For example, fam[3:5] returns the elements at
index 3 and 4 (the fourth and fifth items), but not the element at index 5: the start index is inclusive
and the end index is exclusive.
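Indexing and slicing can be tried out as follows. The contents of fam are an assumption (the notes never show the list itself), so treat the values as placeholders.

```python
# Assumed example list; the notes only mention the name fam.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

third_item = fam[2]       # index 2 is the third item
middle = fam[3:5]         # indices 3 and 4; index 5 is excluded
last_item = fam[-1]       # negative indices count from the end
first_four = fam[:4]      # omitting the start means "from index 0"

print(third_item)
print(middle)
print(last_item)
print(first_four)
```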
Q: Remove the poolhouse (the string and float) from the areas list. Which of the code chunks will do
the job for us?
a. del(areas[10]); del(areas[11])
b. del(areas[10:11])
c. del(areas[-4:-2])
d. del(areas[-3]); del(areas[-4])
A: C.
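The question does not show the areas list, so the sketch below assumes a 14-element list in which "poolhouse" and its area are followed by "garage" and its area. Under that assumption it shows why option c works and option a does not: after the first del, every later element shifts one position to the left.

```python
# Assumed contents of areas; the notes do not list them.
areas = ["hallway", 11.25, "kitchen", 18.0, "chill zone", 20.0,
         "bedroom", 10.75, "bathroom", 10.50,
         "poolhouse", 24.5, "garage", 15.45]

# Option c: the slice [-4:-2] covers "poolhouse" and 24.5 in one step.
option_c = list(areas)      # work on a copy
del option_c[-4:-2]

# Option a: deleting index 10 shifts the rest, so index 11 is now "garage".
option_a = list(areas)
del option_a[10]
del option_a[11]

print(option_c)
print(option_a)
```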

Chapter 3: Functions and Packages
A function is a piece of reusable code, aimed at solving a particular task. The inputs of functions are
called arguments.
Q: Use the IPython Shell to open up the documentation on pow(). Which of the following statements is
true?
a. pow() takes three arguments: base, exp, and mod. If you don’t specify mod, the function will
return an error.
b. pow() takes three arguments: base, exp, and None. All of these arguments are required.
c. pow() takes three arguments: base, exp, and mod. base and exp are required arguments, mod is
an optional argument.
d. pow() takes two arguments: exp and mod. If you don’t specify exp, the function will return an
error.
A: C.
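The difference between required and optional arguments can be checked directly with the built-in pow(). The numbers are arbitrary example values.

```python
# Two arguments: base raised to the power exp.
without_mod = pow(2, 5)       # 2 ** 5

# Three arguments: the result is additionally reduced modulo mod.
with_mod = pow(2, 5, 3)       # (2 ** 5) % 3

print(without_mod)
print(with_mod)

# help(pow) prints the documentation, including the optional mod argument.
```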
Values or data structures like strings, floats, and lists are all so-called Python objects. These objects
come with object-specific methods. You can think of methods as functions that “belong to” Python
objects. To call a method, you use the dot notation (see the table below).
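A short sketch of the dot notation for string and list methods. The values of sister and fam are assumptions chosen to match the method calls used in these notes.

```python
# Assumed example values.
sister = "liz"
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# String methods return a new string; the original is left unchanged.
capitalized = sister.capitalize()
replaced = sister.replace("z", "sa")
z_position = sister.index("z")

# List methods are called the same way, but belong to list objects.
mom_position = fam.index("mom")
count_173 = fam.count(1.73)

print(capitalized, replaced, z_position)
print(mom_position, count_173)
print(sister)   # still the original string
```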
A package can be thought of as a directory of Python scripts. Each such script is a so-called module.
These modules specify functions, methods, and new Python types aimed at solving particular problems.
To import a package, you can type import [package]. To use a function from a package, you have to use
the dot notation with the package name in front of it. You can also abbreviate package names to make
coding a bit less time consuming, by using import [package] as [abbreviation]. If you only want to use a
specific part of a package, you can also type from [package] import [function].
Q: Suppose you want to use the function inv(), which is in the linalg subpackage of the scipy package.
You want to be able to use this function as follows: my_inv([[1,2], [3,4]]). Which import statement will
you need in order to run the above code without an error?
a. import scipy
b. import scipy.linalg
c. from scipy.linalg import my_inv
d. from scipy.linalg import inv as my_inv
A: D.
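The import styles can be compared with the standard-library math module, used here instead of scipy so the sketch runs without extra installations. The alias names m and my_sqrt are made up for illustration; my_sqrt mirrors the my_inv pattern from the question.

```python
import math                        # use as math.sqrt(...)
import math as m                   # abbreviated package name
from math import pi                # import one specific name
from math import sqrt as my_sqrt   # import one name under an alias

root_full = math.sqrt(16)
root_short = m.sqrt(16)
root_alias = my_sqrt(16)
circumference = 2 * pi * 0.5       # radius 0.5 is an arbitrary example value

print(root_full, root_short, root_alias)
print(circumference)
```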

Table: All Functions from the Chapters
| Ch. | Function | Description | Example |
|---|---|---|---|
| 1 | print() | Prints what is inside the brackets. | print(5+3) |
| 1 | type() | Check the type of a value. | type(bmi) |
| 1 | str() | Convert a variable to a string. | str(savings) |
| 1 | int() | Convert a variable to an integer. | int(savings) |
| 1 | float() | Convert a variable to a float. | float(savings) |
| 1 | bool() | Convert a variable to a Boolean. | bool(savings) |
| 2 | del() | Delete elements from a list. | del(x[1]) |
| 2 | list() | Make a copy of a list. | areas_copy = list(areas) |
| 3 | max() | Find the highest value in a list. | max(fam) |
| 3 | round() | Round a number. | round(1.68, 1) |
| 3 | help() | Get information about a function. | help(max) (or ?max in IPython) |
| 3 | len() | Get the length of a list. | len(var1) |
| 3 | sorted() | Return a sorted copy of a list. | full_sorted = sorted(full, reverse = True) |
| 3 | index() (list method) | Get the index of the first element with a given value. | fam.index("mom") |
| 3 | count() (list method) | Count how many elements in a list have a specific value. | fam.count(1.73) |
| 3 | capitalize() (str method) | Capitalize the first letter of a string. | sister.capitalize() |
| 3 | replace() (str method) | Replace part of a string. | sister.replace("z", "sa") |
| 3 | index() (str method) | Get the index of the first occurrence of a substring. | sister.index("z") |
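A few of the functions from the table can be combined in one short sketch. The list full and its values are assumptions. The last part shows why list() is used to copy a list: a plain assignment only creates a second name for the same list.

```python
# Assumed example values.
full = [11.25, 18.0, 20.0, 10.75, 9.5]

full_sorted = sorted(full, reverse=True)   # new list, largest first
highest = max(full)
n_items = len(full)
rounded = round(1.68, 1)

# list() creates an independent copy; plain assignment does not.
alias = full
full_copy = list(full)
full_copy[0] = 5.0                         # changes only the copy

print(full_sorted)
print(highest, n_items, rounded)
print(full[0], alias is full, full_copy is full)
```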
