Summary Lecture slides Computational Analysis of Digital Communication

Course
Computational Analysis of Digital Communication (S_C21C)

Institution
Vrije Universiteit Amsterdam (VU)

Lecture slides of Computational Analysis of Digital Communication (S_CADC)

Uploaded on November 29, 2022
Number of pages 59
Written in 2022/2023
Type Summary

cadc
phillip
masur
communication science
computational analysis of digital communication
scadc
master of communication science

Vustudentt

Computational analysis of digital communication
Week 1 – Introduction

What is computational social science and why should we care?

Example: surprising sources of information
• In 2009, Blumenstock and colleagues (2015, Science) wanted to study wealth and poverty in
Rwanda.
• They conducted a survey with a random sample of 1,000 customers of the largest mobile phone
provider.
• They collected demographics, social, and economic characteristics
• Traditional social science survey, right?
• The authors also had access to complete call records from 1.5 million people
• Combining both data sources, they used the survey data to “train” a machine learning model to
predict a person’s wealth based on their call records.
• They also estimated the places of residence based on the geographic information embedded in call
records.

Blumenstock, Cadamuro, & On, 2015

What is computational social science
- Field of Social Science that uses algorithmic tools and large/unstructured data to understand
human and social behavior
- Computational methods as “microscope”: Methods are not the goal, but contribute to theoretical
development and/or data generation
- Complements rather than replaces traditional methodologies
- Includes methods such as:
o Advanced data wrangling/data science
o Combining different data sets
o Automated Text Analysis
o Machine Learning (supervised and unsupervised)
o Actor-based modelling
o Simulations
o …

Typical workflow

Why is this important now?
- In the past, collecting data was expensive (surveys, observations…)
- In the digital age, the behaviors of billions of people are recorded, stored, and therefore analyzable
- Every time you click on a website, make a call on your mobile phone, or pay for something with
your credit card, a digital record of your behavior is created and stored
- Because (meta-)data are a byproduct of people’s everyday actions, they are often called digital
traces
- Large-scale records of persons or businesses are often called big data.

10 characteristics of big data

1. Big: The scale or volume of some current datasets is often impressive. However, big datasets are not an end in themselves; they can enable certain kinds of research, including the study of rare events, the estimation of heterogeneity, and the detection of small differences.
2. Always-on: Many big data systems constantly collect data and thus make it possible to study unexpected events and to measure in real time.
3. Nonreactive: Participants are generally not aware that their data are being captured, or they have become so accustomed to this data collection that it no longer changes their behavior.
4. Incomplete: Most big data sources are incomplete, in the sense that they don’t have the information that you will want for your research. This is a common feature of data that were created for purposes other than research.
5. Inaccessible: Data held by companies and governments are difficult for researchers to access.
6. Nonrepresentative: Despite their size, most big datasets are not representative of certain populations. Out-of-sample generalizations are hence difficult or impossible.
7. Drifting: Many big data systems are changing constantly, thus making it difficult to study long-term trends.
8. Algorithmically confounded: Behavior in big data systems is not natural; it is driven by the engineering goals of the systems.
9. Dirty: Big data often include a lot of noise (e.g., junk, spam, spurious data points…).
10. Sensitive: Some of the information that companies and governments have is sensitive.

(Salganik, 2017, chap 2.3)

Example data
Smartphone log data (Masur, 2018)
- Incredibly detailed log of each person’s smartphone use
- Big data?
• BIG: Thousands of rows per person, but not many columns
• ALWAYS-ON: Recorded smartphone use at all times
• INCOMPLETE: Did not record use of apps with higher privacy standards (e.g., Signal)
• DIRTY: Depending on what you want to study, lots of noise (e.g., phone on/off)

Typical computational research strategies
1. Counting things
In the age of big data, researchers can “count” more than ever
- How often do people use their smartphone per day?
- About which topics do news websites write most often?
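
As a rough sketch of what "counting" looks like in practice, the snippet below counts how often each person used their smartphone per day. The log format (person ID plus an unlock timestamp) and all values are hypothetical and invented for illustration; real log data would be far larger and messier.

```python
from collections import Counter
from datetime import datetime

# Hypothetical smartphone log: one row per screen unlock (person ID, timestamp).
log = [
    ("p1", "2022-11-01 08:15:00"),
    ("p1", "2022-11-01 12:40:10"),
    ("p1", "2022-11-01 22:05:33"),
    ("p1", "2022-11-02 07:59:02"),
    ("p2", "2022-11-01 09:30:45"),
    ("p2", "2022-11-02 10:11:12"),
    ("p2", "2022-11-02 18:47:27"),
]

# Count the rows per (person, calendar day).
uses_per_day = Counter(
    (person, datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date().isoformat())
    for person, timestamp in log
)

for (person, day), n_uses in sorted(uses_per_day.items()):
    print(person, day, n_uses)
```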

2. Forecasting and nowcasting
Big data allow for more accurate predictions both in the present and in the future
- Investigate when people disclose themselves in computer-mediated communication
- Crime prediction

3. Approximating experiments
Computational methods provide opportunities to conduct “natural experiments”
• Compare smartphone log data of people who use their smartphone naturally vs. those who abstain
from certain apps (e.g., social media apps)
• Investigate the potential of nudges to make users select certain news

Advantages and disadvantages
Advantages of Computational Methods
- Actual behavior vs. self-report
- Social context vs. lab setting
- Small N to large N

Disadvantages of Computational Methods
- Techniques often complicated
- Data often proprietary
- Samples often biased
- Insufficient metadata

Computational communication science. Why computational methods are important for (future)
communication research

Definition
“Computational Communication Science (CCS) is the label applied to the emerging subfield that
investigates the use of computational algorithms to gather and analyze big and often semi- or unstructured
data sets to develop and test communication science theories”
Van Atteveldt & Peng, 2018

Promises
The recent acceleration in the use of computational methods for communication science is primarily fueled
by the confluence of at least three developments:
- vast amounts of digitally available data, ranging from social media messages and other digital
traces to web archives and newly digitized newspaper and other historical archives
- improved tools to analyze this data, including network analysis methods and automatic text
analysis methods such as supervised text classification, topic modeling, word embeddings,
and syntactic methods
- powerful and cheap processing power, and easy to use computing infrastructure for processing
these data, including scientific and commercial cloud computing, sharing platforms such as Github
and Dataverse, and crowd coding platforms such as Amazon MTurk and Crowdflower

Example 1: simulating search queries
- Numbers of drug-overdose deaths have been increasing in the United States
- Google spotlights counselling services as helpful resources when users query for suicide-related
search terms
- However, the search engine does so at varying display rates, depending on terms used
- Display rates in the drug-overdose deaths domain are unknown
- Haim and colleagues (2021) emulated suicide-related potentially harmful searches at large scale
across the U.S. to explore Google’s response to search queries including or excluding additional
drug-related terms
- They conducted 215,999 search requests with varying combinations of search terms
- Counseling services were displayed at high rates after suicide-related potentially harmful search
queries (e.g., “how to commit suicide”)
- Display rates were substantially lower when drug-related terms, indicative of users’ suicidal
overdosing tendencies, were added (e.g., “how to commit suicide fentanyl”)

Haim, Scherr, & Arendt, 2021
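
The design idea behind such a study (systematically varying combinations of search terms and then comparing display rates per condition) can be sketched as below. This is not the authors' code: the placeholder terms, the locations, the `display_rate` helper, and the example outcomes are all hypothetical, and the step of actually sending requests to a search engine is deliberately left out.

```python
from itertools import product

# Placeholder inputs (hypothetical); a real study would use its own term lists.
base_terms = ["base term A", "base term B"]
added_terms = ["", "added term X", "added term Y"]  # "" means no additional term
locations = ["location 1", "location 2"]

# Build every combination of base term, additional term, and location.
queries = [
    {
        "query": f"{base} {extra}".strip(),
        "has_added_term": bool(extra),
        "location": location,
    }
    for base, extra, location in product(base_terms, added_terms, locations)
]


def display_rate(shown):
    """Share of search requests whose result page displayed the help resource."""
    return sum(shown) / len(shown)


# Invented outcomes for four requests in one condition (True = resource shown).
example_outcomes = [True, True, False, True]

print(len(queries), "query combinations")
print("display rate:", display_rate(example_outcomes))
```

Comparing `display_rate` between the conditions with and without an additional term corresponds to the comparison reported in the bullets above.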

Example 2: analyzing news coverage
- Jacobi and colleagues (2016) analyzed the coverage of nuclear technology from 1945 to the present
in the New York Times
- Analysis of 51,528 news stories (headline and lead): Way too much for human coding!
- Used “topic modeling” to extract latent topics and analyzed their occurrence over time
