Summary: Lectures Computational Analysis of Digital Communication

A summary of lectures 1 through 4 of Computational Analysis of Digital Communication at the VU.

Preview: 4 of 71 pages

  • 8 December 2022
  • 71 pages
  • 2022/2023
  • Summary
Sterrevermond
Lecture 1 - Introduction to Computational Methods - 31/10/2022
Computational Social Science: Field of Social Science that uses algorithmic tools and
large/unstructured data to understand human and social behavior. Computational methods
as “microscope”: Methods are not the goal, but contribute to theoretical development and/or
data generation. Complements rather than replaces traditional methodologies. Includes
methods such as:
★ Advanced data wrangling/data science
★ Combining of different data sets
★ Automated Text Analysis
★ Machine Learning (supervised and unsupervised)
★ Actor-based modeling
★ Simulations

Typical Workflow




Why is this important now?
★ In the past, collecting data was expensive (surveys, observations…).
★ In the digital age, the behaviors of billions of people are recorded, stored, and
therefore analyzable.
★ Every time you click on a website, make a call on your mobile phone, or pay for
something with your credit card, a digital record of your behavior is created and
stored.
★ Because (meta-)data are a byproduct of people’s everyday actions, they are often
called digital traces.
★ Large-scale records of persons or businesses are often called big data.

10 characteristics of big data
★ Big: The scale or volume of some current datasets is often impressive. However,
big datasets are not an end in themselves; they enable certain kinds of research,
including the study of rare events, the estimation of heterogeneity, and the
detection of small differences.
★ Always-on: Many big data systems collect data constantly, which enables the
study of unexpected events and allows for real-time measurement.
★ Nonreactive: Participants are generally not aware that their data are being
captured, or they have become so accustomed to this data collection that it no
longer changes their behavior.
★ Incomplete: Most big data sources are incomplete, in the sense that they don’t
have the information that you will want for your research. This is a common
feature of data that was created for purposes other than research.
★ Inaccessible: Data held by companies and governments are difficult for
researchers to access.
★ Nonrepresentative: Most big datasets are not representative of certain
populations. Out-of-sample generalizations are hence difficult or impossible.
★ Drifting: Many big data systems are changing constantly, thus making it difficult
to study long-term trends.
★ Algorithmically confounded: Behavior in big data systems is not natural; it is
driven by the engineering goals of the systems.
★ Dirty: Big data often includes a lot of noise (e.g., junk, spam, spurious data
points).
★ Sensitive: Some of the information that companies and governments have is
sensitive.


Example: Smartphone log data:
★ Big: Thousands of rows per person, but not many columns.
★ Always-on: Recorded smartphone use at all times.
★ Incomplete: Did not record app use with higher privacy standards
★ Dirty: Depending on what you want to study, lots of noise.
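
To make these characteristics concrete, here is a minimal Python sketch of what such log data might look like. The field names (`person_id`, `app`, `start`, `seconds`), the sample rows, and the rule for what counts as noise are all assumptions made for illustration; they are not the format of any actual logging app from the lecture.

```python
from datetime import datetime

# Hypothetical smartphone log: one row per app session.
# "Big" here means many rows per person but only a few columns.
log = [
    {"person_id": "p1", "app": "messenger", "start": datetime(2022, 10, 31, 8, 0, 5), "seconds": 120},
    {"person_id": "p1", "app": "browser", "start": datetime(2022, 10, 31, 8, 2, 30), "seconds": 0},
    {"person_id": "p1", "app": None, "start": datetime(2022, 10, 31, 9, 15, 0), "seconds": 45},
    {"person_id": "p2", "app": "social", "start": datetime(2022, 10, 31, 22, 40, 0), "seconds": 900},
]

# Incomplete: the app name was never recorded (e.g., apps with higher privacy standards).
incomplete = [row for row in log if row["app"] is None]

# Dirty: assumed noise rule, a session with no duration is treated as a spurious record.
dirty = [row for row in log if row["seconds"] <= 0]

# Rows that remain usable for an analysis of app use.
usable = [row for row in log if row["app"] is not None and row["seconds"] > 0]

print(f"{len(log)} rows, {len(log[0])} columns, {len(usable)} usable rows")
```

Which rows count as noise depends on the research question, so the filter above is one possible choice rather than a general rule.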

Typical computational research strategies
1. Counting things: In the age of big data, researchers can “count” more than ever
- How often do people use their smartphone per day?
- About which topics do news websites write most often?
2. Forecasting and nowcasting: Big data allow for more accurate predictions both in
the present and in the future
- Investigate when people disclose themselves in computer-mediated
communication
- Crime prediction
3. Approximating experiments: Computational methods provide opportunities to
conduct “natural experiments”
- Compare smartphone log data of people who use their smartphone naturally
vs. those who abstain from certain apps (e.g., social media apps)
- Investigate the potential of nudges to make users select certain news
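
As a rough illustration of strategies 1 and 3, the sketch below counts smartphone pickups per person per day and then compares the average daily count between two groups. The data, the group labels (`natural`, `abstain`), and the variable names are hypothetical and chosen only for this example.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hypothetical log: one tuple (person_id, day, group) per smartphone pickup.
# "group" marks whether the person used the phone naturally or abstained
# from social media apps.
pickups = [
    ("p1", date(2022, 10, 31), "natural"),
    ("p1", date(2022, 10, 31), "natural"),
    ("p1", date(2022, 11, 1), "natural"),
    ("p2", date(2022, 10, 31), "abstain"),
    ("p2", date(2022, 11, 1), "abstain"),
]

# Strategy 1, counting things: pickups per person per day.
per_person_day = Counter((person, day) for person, day, _ in pickups)

# Strategy 3, approximating experiments: compare average daily pickups by group.
group_of = {person: group for person, _, group in pickups}
daily_counts = {"natural": [], "abstain": []}
for (person, _day), n in per_person_day.items():
    daily_counts[group_of[person]].append(n)

avg_daily = {group: mean(counts) for group, counts in daily_counts.items()}
print(per_person_day)
print(avg_daily)
```

A difference between the two averages is descriptive only. People who choose to abstain may differ from other users in other ways, so the comparison approximates an experiment only to the extent that group membership is unrelated to those differences.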

Advantages and disadvantages
★ Advantages of Computational Methods: Actual behavior versus self-report, social
context versus lab setting, large N versus small N.
★ Disadvantages of Computational Methods: Techniques often complicated, data often
proprietary, samples often biased, insufficient metadata.

Computational Communication Science (CCS): the label applied to the emerging subfield
that investigates the use of computational algorithms to gather and analyze big and often
semi- or unstructured data sets to develop and test communication science theories.

Promises
The recent acceleration in the use of computational methods for communication science is
primarily fueled by the confluence of at least three developments:
★ vast amounts of digitally available data, ranging from social media messages and
other digital traces to web archives and newly digitized newspaper and other
historical archives.
★ improved tools to analyze this data, including network analysis methods and
automatic text analysis methods such as supervised text classification, topic
modeling, word embeddings, and syntactic methods.
★ powerful and cheap processing, and easy-to-use computing infrastructure for
processing these data, including scientific and commercial cloud computing, sharing
platforms such as GitHub and Dataverse, and crowd coding platforms such as
Amazon MTurk and Crowdflower.
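
To give a feel for what automatic text analysis does at its most basic, here is a small word-frequency sketch in Python. It is deliberately far simpler than the methods listed above (it is not topic modeling or supervised classification), and the headlines, the stopword list, and the `tokenize` helper are assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical headlines; real input would be scraped or archived news text.
headlines = [
    "Election results spark debate on climate policy",
    "Climate summit ends without agreement",
    "New smartphone app tracks election spending",
]

# Assumed stopword list, kept tiny for illustration.
STOPWORDS = {"on", "without", "new"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Count how often each non-stopword token occurs across all headlines.
word_counts = Counter(
    token
    for headline in headlines
    for token in tokenize(headline)
    if token not in STOPWORDS
)
print(word_counts.most_common(3))
```

Counts like these only say which words occur often; methods such as topic modeling or supervised classification build on the same kind of token counts to group or label texts.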

Ethical problems with computational methods
★ More power over participants than in the past
- Data collection without awareness/consent
- Manipulation without awareness/consent
- Data potentially sensitive, individual users identifiable
★ Guiding principles:
- Respect for persons: Treating people as autonomous and honoring their
wishes.

- Beneficence: Understanding and improving the risk/benefit profile of a study.
- Justice: Risks and benefits should be evenly distributed.
- Respect for law and public interest

Challenges of computational communication science
★ Simply data-driven research questions might not be theoretically interesting
★ Proprietary data threatens accessibility and reproducibility
★ ‘Found’ data not always representative, threatening external validity
★ Computational method bias and noise threaten accuracy and internal validity
★ Inadequate ethical standards/procedures

Preliminary summary
★ Computational communication research holds manifold promises.
★ We can harness unusual sources of information and large amounts of data,
particularly because people constantly leave digital traces.
★ New methods allow us to structure, aggregate, and make sense of these data and
extract meaningful information to study communication behavior and phenomena.
★ However, computational communication research comes with ethical challenges
related to consent, privacy, and autonomy of the participants.

Exam: 40% - 35 MC, 6 open-ended
Homework: 30%
Group presentation: 30% - 10 minute talk

Example exam question (MC)
Why is the “Facebook Manipulation Study” by Kramer et al. ethically problematic?
A. People didn't know that they took part in a study (no informed consent)
B. It overly manipulated people’s emotions
C. Both A and B are true
D. The study was not ethically problematic

Example exam question (Open format)
Name and explain two characteristics of big data.
1. Big data is often “incomplete”: This means they do not have the information that you
will want for your research. This is a common feature of data that was created for
purposes other than research. For example, log data (e.g., browser history) includes
all links a person has visited over time, but does not provide any additional
information. Moreover, it may contain gaps where the software failed or the person
purposefully hid their browsing behavior.
2. Big data is often “algorithmically confounded”: Behavior in big data systems is not
natural; it is driven by the engineering goals of the systems. For example, what you
see in a Facebook news feed depends on an algorithm that Facebook has built into
their platform. Behavior of individuals is thus also driven by these system-immanent
features.
