DIGITAL METHODS
CLASS 1: INTRODUCTION DIGITAL METHODS
DIGITAL METHODS: CLOSE READING, DISTANT READING AND COMMON CHARACTERISTICS OF BIG DATA
• Data: online activities
• Can have risks: like overgeneralizing
SITUATING THE COURSE
“The best way to create this powerful hybrid is not to focus on abstract social theory or fancy machine learning. The best
place to start is research design. If you think of social research as the process of asking and answering questions about
human behavior, then research design is the connective tissue; research design links questions and answers.”
(Salganik, 2017, p. 6)
CLOSE READING (QUALITATIVE)
• Offline-online continuum
• Danah Boyd making sense of teen life
o Research approach
   - Immersion in teen pop culture and subculture
     = Consuming media that is popular among teens
   - Participant observation and content analysis of traces on social media
     = Digital traces: observing public/semi-public behavior on social media (looking at things that are trending)
   - Deep hanging out in physical spaces
     = Deep understanding of what they are doing (going to playgrounds, schools, malls)
   - Semistructured face-to-face interviews (with teens)
Social media certainly make it much easier to peek into people’s lives, but it is also quite easy to misinterpret online traces.
Example: Gang insignia on a MySpace profile -> indication of gang involvement?
• Essay on gangs: a teen from a gang community wrote an essay about wanting to leave the gang and go to college (an Ivy League college)
• BUT his MySpace account appears to show that he is still involved in the gang community
> Online content can give you more information, but you need to take the context into account
• Close reading: detailed analysis of content is necessary
• Carefully reading and reflecting on a piece of content
ALTERNATIVE EXPLANATION:
“Without knowing the specific boy involved, I surmised that he was probably focused on fitting in, staying safe, or, more
directly, surviving in his home environment. Most likely he felt as though he needed to perform gang affiliation online—
especially if he was not affiliated—in order to make certain that he was not physically vulnerable.’
• Importance of close reading: interpretation based on detailed analysis
o Importance of context in combination with quantitative data
o Try to understand the meaning (close reading = careful reading + reflecting)
OVERVIEW
• Lecture 7: Walk-through, going along and scrolling back
• Lecture 8: Digital ethnography I
• Lecture 9: Digital ethnography II
DISTANT READING (QUANTITATIVE)
“Big Data”: Advances in technology & analysis
• Huge amounts of data left behind by people and then aggregated by companies
- Data that is left behind
- How that data is used/analyzed by big tech corporations
- Not about one dataset but about multiple datasets and combinations of datasets
- Gather, analyze and link datasets
- Use the large datasets to identify trends/patterns and draw conclusions
• Example: Investigating health and wealth in Rwanda
- Traditional social science survey: asking people specific questions (demographics)
- Call records (of approx. 1.5 million people)
- Combined the 2 data sources: used the survey to train a machine learning model that predicts a person’s wealth from their call records
Result: high-resolution maps of the distribution of wealth in Rwanda
Importance of distant reading: analyzing large numbers of data(sets)
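The logic of the Rwanda example (train on the small surveyed group, then predict for everyone with call records) can be sketched as follows. This is a minimal illustration, not the method used in the study: the single feature `calls_per_week`, the wealth scores and the person names are all invented, and a one-variable least-squares line stands in for the machine learning model.

```python
# Hypothetical toy data: every name and number below is invented for illustration.
# Each pair is (calls_per_week, surveyed_wealth_score) for a surveyed person.
surveyed = [(5, 1.0), (10, 2.1), (20, 3.9), (40, 8.2)]

# People who appear in the call records but were never surveyed.
unsurveyed_calls = {"person_a": 8, "person_b": 30}


def fit_line(pairs):
    """Ordinary least squares with one feature: wealth = intercept + slope * calls."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Step 1: learn the relation between call behavior and wealth from the survey.
intercept, slope = fit_line(surveyed)

# Step 2: apply that relation to everyone who only has call records.
predicted_wealth = {
    person: intercept + slope * calls for person, calls in unsurveyed_calls.items()
}
print(predicted_wealth)
```

The point of the design is that the expensive custommade data (the survey) is only needed for a small group, while the readymade data (call records) carries the prediction to the rest.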
OVERVIEW
• Lecture 2: Computational Social Science and Open Science
• Lecture 3: Data Visualization
• Lecture 4: Collecting data from the web
• Lecture 5: The use of AI in Social Science Research
• Lecture 6: Working with Sensors and VR
• Lecture 10: The rise of the person-specific paradigm
• Lecture 11: Digital trace data
Practicals
• Practical 1: Assignment information and Installing R
• Practical 2: Getting acquainted with R
• Practical 3: Visualizing data with ggplot
• Practical 4: Webscraping and APIs
• Practical 5: Working with sensors and VR
• Practical 6: Walk-through method
• Practical 7: Working with intensive longitudinal data
READYMADE VERSUS CUSTOMMADE DATA
• Readymade: Duchamp (artist) takes an ordinary object and repurposes it as art (urinal)
• Custommade: Michelangelo (artist) created the statue of David through more than 3 years of labor
• Same when talking about data
o Readymade: repurpose existing datasets
o Custommade: create your own dataset to answer your research question
10 CHARACTERISTICS OF BIG DATA SOURCES (DIVIDED INTO HELPFUL AND PROBLEMATIC)
Digital traces -> Big Data -> Digital methods
• Generally helpful for research:
- Big
- Always-on
- Nonreactive
• Generally problematic for research:
- Incomplete
- Inaccessible
- Nonrepresentative
- Drifting
- Algorithmically confounded
- Dirty
- Sensitive
Big data
• Repurposing
• Found versus designed data
- Found: data that researchers find and reuse, but that was created by someone else for another purpose
- Designed: data that the researcher collects specifically for their own research
• What should the ideal data set look like?
• ‘Twitter’ versus ‘social survey’ data
1. BIG (HELPFUL)
• Is all that data really doing anything?
- More data does not always mean better data
• Big datasets are never an end in themselves, but do allow for the study of rare cases, detection of small differences,
and estimation of heterogeneity
Example: Analysis of the 2016 US Presidential Campaign on Twitter (Kollanyi, Howard, & Woolley, 2016)
• A total of 18,910,250 tweets were analyzed (2 hashtags analyzed)
- 39.1% of debate tweets used a pro-Trump hashtag (e.g., #MAGA)
- 13.6% of debate tweets used a pro-Clinton hashtag (e.g., #ImWithHer)
• However, many accounts were not real:
- 32.7% of pro-Trump tweets originated from bots
- 22.3% of pro-Clinton tweets originated from bots
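To see why the bot shares matter, the figures above can be worked through in a few lines. This is only a rough sketch under two assumptions that the notes do not confirm: that both hashtag shares refer to the full set of analyzed tweets, and that each bot share applies within its own camp. The variable and function names are made up for this example.

```python
# Figures copied from the notes above. ASSUMPTION: hashtag shares apply to the
# full set of analyzed tweets, and bot shares apply within each camp.
total_tweets = 18_910_250
share = {"trump": 0.391, "clinton": 0.136}
bot_share = {"trump": 0.327, "clinton": 0.223}


def split_bots(camp):
    """Return (all tweets, bot tweets, remaining tweets) for one camp."""
    camp_tweets = total_tweets * share[camp]
    bots = camp_tweets * bot_share[camp]
    return camp_tweets, bots, camp_tweets - bots


for camp in share:
    all_tweets, bots, remaining = split_bots(camp)
    print(f"{camp}: {all_tweets:,.0f} tweets, {bots:,.0f} from bots, {remaining:,.0f} remaining")

# How lopsided is the debate with and without bot traffic?
ratio_with_bots = split_bots("trump")[0] / split_bots("clinton")[0]
ratio_without_bots = split_bots("trump")[2] / split_bots("clinton")[2]
print(f"pro-Trump : pro-Clinton = {ratio_with_bots:.2f} with bots, {ratio_without_bots:.2f} without")
```

Under these assumptions the pro-Trump to pro-Clinton ratio drops from about 2.9 to about 2.5 once bot tweets are excluded, which illustrates that a big dataset is not automatically a better one.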
2. ALWAYS-ON (HELPFUL)
(we are living in media, always connected, constant stream of data)
• Unexpected events
• Real-time measurements
Example: Hurricane Sandy-related Twitter and Foursquare data
• Data scientists combined Foursquare data and Twitter data
• Surprising findings (that could otherwise not be found)
- People went out grocery shopping more the night before the storm
- People went out after the storm (kind of celebrating that it was over)
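Because always-on data is already being recorded when an unexpected event happens, activity can be compared before and after the event. The sketch below shows that idea with a handful of invented check-in records; the venue categories, the counts and the helper `counts_by_offset` are hypothetical and are not taken from the actual Sandy analysis.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (day, venue category). Invented for illustration only.
checkins = [
    (date(2012, 10, 27), "grocery"),
    (date(2012, 10, 28), "grocery"),
    (date(2012, 10, 28), "grocery"),
    (date(2012, 10, 28), "grocery"),
    (date(2012, 10, 29), "grocery"),
    (date(2012, 10, 27), "nightlife"),
    (date(2012, 10, 30), "nightlife"),
    (date(2012, 10, 30), "nightlife"),
]
event_day = date(2012, 10, 29)  # event date used for this toy example


def counts_by_offset(records, event, category):
    """Count check-ins of one category per day, keyed by days relative to the event."""
    return Counter((day - event).days for day, cat in records if cat == category)


grocery = counts_by_offset(checkins, event_day, "grocery")
nightlife = counts_by_offset(checkins, event_day, "nightlife")

# Day offset with the most grocery check-ins (negative = before the event).
peak_grocery_offset = max(grocery, key=grocery.get)
print("grocery check-ins by day offset:", dict(grocery))
print("nightlife check-ins by day offset:", dict(nightlife))
print("grocery peak at day offset:", peak_grocery_offset)
```

A survey could only ask about this afterwards; the always-on stream already contains the days before the event.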
3. NONREACTIVE (HELPFUL)
• Measurement in big data sources is much less likely to change behavior.
- People are not aware that their data is being captured, so they do not change their behavior
• However, social desirability bias can still be present
- Platforms are not neutral
- Example: on social media we focus on the good things in our lives (personal milestones, ...)
4. INCOMPLETE (PROBLEMATIC)
• Usually the following information is missing/incomplete:
- Demographic information about participants
- Behavior on other platforms
- Data to operationalize theoretical constructs
• Construct validity
o A type of validity: the degree to which a measurement tool or method accurately measures
the underlying construct or concept that it is intended to measure.
Example: measuring social capital
• Articulated networks: contacts
• Behavioral networks: communication
• Construct validity?
- For example: spending more time on the phone with your colleague than with your spouse does not imply that
the colleague is more important
- Will not always be perfect: context is important
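The difference between the two ways of measuring social capital can be made concrete with a small comparison. This is a hedged sketch with invented data for one person: `articulated` stands for a listed contact network, `call_minutes` for logged communication, and neither comes from a real dataset.

```python
# Hypothetical data for one person; all names and numbers are invented.
articulated = {"spouse", "colleague", "old_classmate"}          # listed contacts
call_minutes = {"colleague": 240, "spouse": 35, "plumber": 12}  # logged communication

# The behavioral network is everyone this person actually communicated with.
behavioral = set(call_minutes)

in_both = articulated & behavioral
only_articulated = articulated - behavioral  # listed, but no communication observed
only_behavioral = behavioral - articulated   # communication, but not a listed contact

# Ranking ties by call time is one possible operationalization of "importance".
ranked_by_minutes = sorted(call_minutes, key=call_minutes.get, reverse=True)

print("in both networks:", sorted(in_both))
print("only articulated:", sorted(only_articulated))
print("only behavioral:", sorted(only_behavioral))
print("ranked by call time:", ranked_by_minutes)
```

In this toy case the colleague ranks above the spouse on call time, which is exactly the kind of result that raises the construct validity question: the measure captures communication volume, not how important a relationship is.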
5. INACCESSIBLE (PROBLEMATIC)
• Data held by companies and governments are difficult for researchers to access.
Example: Facebook reportedly provided inaccurate data to misinformation researchers
• Sometimes data are only semi-publicly available
6. NONREPRESENTATIVE (PROBLEMATIC)
• Nonrepresentative data are bad for out-of-sample generalizations, but can be quite useful for within-sample
comparisons.
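A small numerical sketch can show the difference between the two uses. All numbers are invented, and the example assumes that each group behaves the same inside the sample as in the population; only the mix of groups differs.

```python
# Hypothetical numbers, invented for illustration.
# ASSUMPTION: each group has the same mean inside the sample as in the population.
population = {
    "young": {"share": 0.5, "mean_use": 20.0},
    "old": {"share": 0.5, "mean_use": 10.0},
}
sample_share = {"young": 0.9, "old": 0.1}  # e.g. platform users skew young


def weighted_mean(shares):
    """Overall mean when the groups are mixed according to the given shares."""
    return sum(shares[group] * population[group]["mean_use"] for group in population)


population_mean = weighted_mean({g: v["share"] for g, v in population.items()})
sample_mean = weighted_mean(sample_share)

# Within-sample comparison: the gap between the groups does not depend on the mix.
group_gap = population["young"]["mean_use"] - population["old"]["mean_use"]

print("population mean:", population_mean)
print("sample mean:", sample_mean)
print("young minus old:", group_gap)
```

The overall sample mean is a poor estimate of the population mean because the sample has the wrong mix of groups, while the comparison between groups inside the sample is unaffected by that mix.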