DIGITAL METHODS
CLASS 1: INTRODUCTION DIGITAL METHODS
DIGITAL METHODS: CLOSE READING, DISTANT READING AND COMMON CHARACTERISTICS OF BIG DATA
• Data: online activities
• Can have risks: like overgeneralizing
SITUATING THE COURSE
“The best way to create this powerful hybrid is not to focus on abstract social theory or fancy machine learning. The best
place to start is research design. If you think of social research as the process of asking and answering questions about
human behavior, then research design is the connective tissue; research design links questions and answers (Salganik, 2017
p.6).”
CLOSE READING (QUALITATIVE)
• Offline-online continuum
• Danah Boyd making sense of teen life
o Research approach
Immersion in teen pop culture and subculture
= Consuming media that is popular among teens
Participant observation and content analysis of traces on social media
= Digital traces: observe behavior on social media that is public/semi-public (looking at things
that are trending)
Deep hanging out in physical spaces
= Deep understanding of what they are doing (going to playgrounds, schools, malls)
Semistructured face-to-face interviews (with teens)
Social media certainly make it much easier to peek into people’s lives, but it is also quite easy to misinterpret online traces.
Example: Gang insignia on a MySpace profile -> indication of gang involvement?
• Essay on gangs: someone from a gang wrote essay about wanting to leave the gang and go to college (Ivy League
College)
• BUT His MySpace account reveals that he is still involved in the gang community
> Online content can give you more information, but you need to take the context into account
• Close reading: detailed analysis of content is necessary
• Carefully reading and reflecting on a piece of content
ALTERNATIVE EXPLANATION:
“Without knowing the specific boy involved, I surmised that he was probably focused on fitting in, staying safe, or, more
directly, surviving in his home environment. Most likely he felt as though he needed to perform gang affiliation online—
especially if he was not affiliated—in order to make certain that he was not physically vulnerable.’
• Importance of close reading: interpretation based on detailed analysis
o Importance of context in combination with quantitative data
o Try to understand the meaning (close reading = carefully reading + reflecting)
OVERVIEW
• Lecture 7: Walk-through, going along and scrolling back
• Lecture 8: Digital ethnography I
• Lecture 9: Digital ethnography II
DISTANT READING (QUANTITATIVE)
“Big Data”: Advances in technology & analysis
• Huge amounts of data left behind by people and then aggregated by companies
- Data that is left behind
- How that data is being used/analysed by big tech corporations
- Not about one dataset but about multiple datasets and combinations of datasets
Gather, analyze and link datasets
Using large datasets to identify trends/patterns and draw conclusions
• Example: Investigating health and wealth in Rwanda
- Traditional social science survey: asking people specific questions (demographics)
- Call records (of approx. 1.5 million people)
- Combined the 2 data sources: used the survey to train a machine learning model to predict a person’s wealth
Result: high-resolution maps of the distribution of wealth in Rwanda, obtained by combining the 2 data sources
Importance of distant reading: analyzing large numbers of data(sets)
OVERVIEW
• Lecture 2: Computational Social Science and Open Science
• Lecture 3: Data Visualization
• Lecture 4: Collecting data from the web
• Lecture 5: The use of AI in Social Science Research
• Lecture 6: Working with Sensors and VR
• Lecture 10: The rise of the person-specific paradigm
• Lecture 11: Digital trace data
Practicals
• Practical 1: Assignment information and Installing R
• Practical 2: Getting acquainted with R
• Practical 3: Visualizing data with ggplot
• Practical 4: Webscraping and APIs
• Practical 5: Working with sensors and VR
• Practical 6: Walk-through method
• Practical 7: Working with intensive longitudinal data
READYMADE VERSUS CUSTOMMADE DATA
• Readymade: Duchamp (artist): takes an ordinary object and repurposes it as art (urinal)
• Custommade: Michelangelo (artist): statue of David, more than 3 years of labor
• Same when talking about data
o Readymade: repurpose existing data sets
o Custommade: start your own data set to answer a research question
10 CHARACTERISTICS OF BIG DATA SOURCES (DIVIDED IN HELPFUL AND PROBLEMATIC)
Digital traces Big Data Digital methods
• Generally helpful for research:
- Big
- always-on
- nonreactive
• Generally problematic for research:
- Incomplete
- Inaccessible
- Nonrepresentative
- Drifting
- algorithmically confounded
- dirty
- sensitive
Big data
• Repurposing
• Found versus designed data
- Found: data that has been found by researchers but designed by someone else
- Designed: data designed by researcher to then use it in their research
• What should the ideal data set look like?
• ‘Twitter’ versus ‘social survey’ data
1. BIG (HELPFUL)
• Is all that data really doing anything?
- More data does not always mean better data
• Big datasets are never an end in themselves, but do allow for the study of rare cases, detection of small differences,
and estimation of heterogeneity
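Why size helps with detecting small differences can be made concrete with the standard error of a difference between two proportions, which shrinks with the square root of the sample size. A minimal sketch, assuming two independent groups of equal size and made-up proportions (0.500 versus 0.505); the function name is hypothetical.

```python
import math


def standard_error_of_difference(p1, p2, n_per_group):
    """Standard error of (p1 - p2) for two independent groups of equal size."""
    return math.sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)


# Made-up proportions: a difference of half a percentage point.
p1, p2 = 0.500, 0.505

for n in (1_000, 100_000, 10_000_000):
    se = standard_error_of_difference(p1, p2, n)
    ratio = (p2 - p1) / se  # how many standard errors the difference spans
    print(f"n per group = {n:>10,}: SE = {se:.5f}, difference / SE = {ratio:.1f}")
```

With 1,000 people per group the difference is far smaller than one standard error, so it cannot be told apart from noise; with 10 million per group it spans more than twenty standard errors. The same reasoning is why more data does not help when the measurement itself is biased: a larger sample only reduces random error.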
Example: Analysis of the 2016 US Presidential Campaign on Twitter (Kollanyi, Howard, Woolley, 2016)
• A total of 18,910,250 tweets were analyzed (analyzed 2 hashtags)
- 39.1% of debate tweets used pro-Trump hashtags (e.g., #MAGA)
- 13.6% of debate tweets used pro-Clinton hashtags (e.g., #ImWithHer)
• However… (many accounts were not real)
- 32.7% of pro-Trump tweets originated from bots
- 22.3% of pro-Clinton tweets originated from bots
2. ALWAYS-ON (HELPFUL)
(we are living in media, always connected, constant stream of data)
• Unexpected events
• Real-time measurements
Example: Hurricane Sandy-related Twitter and Foursquare data
• Data scientists combined Foursquare data and Twitter data
• Surprising findings (that could otherwise not be found)
- People went out grocery shopping more the night before the storm
- People went out after the storm (kind of celebrating that it is over)
3. NONREACTIVE (HELPFUL)
• Measurement in big data sources is much less likely to change behavior.
- People are not aware that their data is being captured, so they do not change their behavior
• However, social desirability bias can still be present
- Platforms are not neutral
- Example: on social media, people focus on the good things in their lives (personal milestones, ...)
4. INCOMPLETE (PROBLEMATIC)
• Usually the following information is missing/incomplete:
- Demographic information about participants
- Behavior on other platforms
- Data to operationalize theoretical constructs
• Construct validity
o a type of validity that refers to the degree to which a measurement tool or method accurately measures
the underlying construct or concept that it is intended to measure.
Example: measuring social capital
• Articulated networks: contacts
• Behavioral networks: communication
• Construct validity?
- For example: spending more time on the phone with your colleague does not imply that they are more
important than your spouse
- Will not always be perfect: context is important
5. INACCESSIBLE (PROBLEMATIC)
• Data held by companies and governments are difficult for researchers to access.
Example: Facebook reportedly provided inaccurate data to misinformation researchers
• Sometimes data are only semi-publicly available
6. NONREPRESENTATIVE (PROBLEMATIC)
• Nonrepresentative data are bad for out-of-sample generalizations, but can be quite useful for within-sample
comparisons.
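The contrast between out-of-sample generalization and within-sample comparison can be illustrated with a small simulation. This is a sketch under made-up assumptions: the two age groups, their average hours online (4 versus 2), the share of young people (30% in the population, 90% on the platform) and all function names are hypothetical.

```python
import random
import statistics

random.seed(1)


def hours_online(is_young):
    # Made-up behavior: young people average 4 hours, older people 2 hours.
    return random.gauss(4.0 if is_young else 2.0, 1.0)


def simulate(n, share_young):
    """Draw n people as (is_young, hours) pairs with the given share of young people."""
    people = []
    for _ in range(n):
        is_young = random.random() < share_young
        people.append((is_young, hours_online(is_young)))
    return people


def mean_hours(people, young=None):
    """Mean hours for everyone (young=None) or for one age group only."""
    return statistics.mean(h for is_young, h in people if young is None or is_young == young)


population = simulate(50_000, share_young=0.3)
platform_sample = simulate(5_000, share_young=0.9)  # nonrepresentative sample

# Out-of-sample generalization: the sample mean overstates the population mean.
print("population mean:", round(mean_hours(population), 2))
print("platform sample mean:", round(mean_hours(platform_sample), 2))

# Within-sample comparison: the young-versus-old gap is still recovered.
gap_population = mean_hours(population, True) - mean_hours(population, False)
gap_sample = mean_hours(platform_sample, True) - mean_hours(platform_sample, False)
print("gap in population:", round(gap_population, 2))
print("gap in platform sample:", round(gap_sample, 2))
```

In this sketch the comparison survives because people are selected into the sample on age only; if selection also depended on the outcome itself, the within-sample comparison could be distorted as well.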