Lecture 1 – Introduction
The internet combines several previously separate communication tools into one, which led
(and still leads) to an unprecedented explosion of information and data. It also offers new
and interactive ways to communicate with each other.
The internet is interesting for sociologists because certain sociological processes now take
place online. Furthermore, these processes can worsen societal problems (for example around
inequality, cohesion, and culture). Three reasons why sociologists use the internet:
- New way to study old questions (for example homophily, social influence, and
network information)
- New type of explanation: online behaviour may impact the “offline” world
- Online behaviour as interesting behaviour as such, as a new form of culture or social
behaviour
Internet: the physical network of computers connected by the TCP/IP protocol.
WWW: the world wide web. This is the network of documents connected by hyperlinks.
- Deep web: the part of the WWW not indexed by search engines
- Dark web: the part of the deep web only accessible through special software (e.g.
Tor)
Typically, we do not distinguish between WWW and the internet.
Social and policy issues
Cohesion:
- Does the internet make us lonely?
- Does the internet foster social integration?
- Applications: loneliness, online segregation
Inequality and poverty:
- Is there (still) a digital divide?
- Is LinkedIn the new “old boys network”?
- Applications: mobilisation, echo chambers
Social order:
- When do social media pose a threat to public safety?
- Are social media a threat to democracy?
- How to combat cybercrime?
- Applications: online social support, social capital
Motivation of this course
This course mediates between social scientists, data scientists, and policy makers.
Network perspective
Sociological questions about the internet are often about diffusion of information and
structures of social relations.
Lecture 2 – Social Network Analysis (SNA)
Key sociological questions about the internet are relational:
- How well are people connected? (social cohesion)
- How do opinions spread from one person to the other? (social influence)
- How do people derive benefits from being connected? (social inequality)
Attribute data: characteristics of individuals
Relational data: characteristics of relations between individuals
Terminology of networks:
- (Social) network/graph
- Node: these are the actors in a network
- Edge/tie: these are the relations between actors in a network
Ties
Properties of ties:
- Direction:
o Directed ties: ties run from one actor to another
Examples: liking, friendship, citation, etc.
Directed ties may be reciprocated, but do not have to be
o Undirected ties: ties run in both directions by definition
Examples: marriage, kinship (family relationships), collaboration
Directed ties are often studied as undirected for simplicity
o Related property: asymmetry
Example: marriage relations vs authority relations (being the boss of
someone is always asymmetrical: if I’m your boss, you cannot be my
boss)
- Strength:
o Weighted networks: we can assign a strength to a tie
Examples: good friends vs regular friends, call volume in telephone
networks
o Unweighted networks: ties have no inherent strength; they either exist or do
not exist
Example: marriage (check this later!)
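The tie properties above can be illustrated with a minimal sketch. The actors, the strengths, and the function names are made up for illustration; this is one possible way to store directed, weighted ties, not a prescribed format.

```python
# Hypothetical example data: directed, weighted ties stored as
# (sender, receiver) -> strength.
ties = {
    ("ann", "bob"): 3,  # ann names bob as a good friend
    ("bob", "ann"): 1,  # bob reciprocates, but with a weaker tie
    ("ann", "cat"): 2,  # not reciprocated by cat
}


def is_reciprocated(ties, sender, receiver):
    """A directed tie is reciprocated if the reverse tie also exists."""
    return (sender, receiver) in ties and (receiver, sender) in ties


def to_undirected_unweighted(ties):
    """Drop direction and strength: keep one unordered pair per connected dyad."""
    return {frozenset(pair) for pair in ties}


undirected = to_undirected_unweighted(ties)
```

Treating the ties as undirected merges ann -> bob and bob -> ann into a single pair, which is the simplification mentioned above for directed ties.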
How to measure social networks?
- Ethnographic/direct observation
- Survey methods
o Ego networks research: among a sample of individuals from a population
o Complete networks (sociometric approach): among a delineated subset of
individuals
- From records of behaviour (for example online social networks)
Challenges in network data collection
- Setting boundaries: where does the network stop?
- Sampling: observe all nodes (and related ties) or only a random subset?
- Missing data: missing information about nodes and/or ties. Also, MaR can be
problematic!
Types of network data
- Ego networks: ties between egos and alters
- Complete networks: ties between all members of the population (sociometric)
- Two-mode networks: relations between, but not within, two (or more) types of
entities (the example on the slide was the two-mode data on women attending clubs)
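A two-mode network can be sketched as follows. The persons and clubs are hypothetical (not the data from the slide), and the helper `shared_clubs` is only an illustration of how nodes of the same type are linked indirectly through the other type.

```python
# Hypothetical two-mode data: which person (type 1) attends which club (type 2).
attendance = {
    "ann": {"chess", "choir"},
    "bob": {"choir"},
    "cat": {"chess", "choir", "rowing"},
}

# Ties as (person, club) pairs: every tie links the two types,
# never two persons or two clubs directly.
two_mode_ties = sorted(
    (person, club) for person, clubs in attendance.items() for club in clubs
)


def shared_clubs(attendance, a, b):
    """Clubs attended by both persons: an indirect link between two persons."""
    return attendance[a] & attendance[b]
```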
Storing network data
Adjacency matrix: a matrix with the actors as both rows and columns. Each cell shows whether
the row actor has a relation with the column actor: 1 means the tie exists, 0 means the tie
is possible but does not (currently) exist. It can be used for directed relations, in which
case the matrix does not have to be symmetrical.
Ties-as-cases: also known as the edge list or arc list. Contains the same information as the
adjacency matrix but is a more concise representation: it lists only the existing ties, not
the possible ties that do not exist.
Friend list format: you can use this for example for the question of who one’s three best
friends are. You mention the actor on the left, and the three best friends on the right. You
can also use this format to combine attribute and relational data.
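The three storage formats can be sketched with a small example. The actors and ties are made up, and the function names are assumptions for illustration only.

```python
actors = ["ann", "bob", "cat"]  # hypothetical actors

# Adjacency matrix: row = sender, column = receiver; 1 = tie exists, 0 = no tie.
adjacency = [
    [0, 1, 1],  # ann -> bob, ann -> cat
    [1, 0, 0],  # bob -> ann
    [0, 0, 0],  # cat sends no ties
]


def to_edge_list(actors, adjacency):
    """Ties-as-cases: one (sender, receiver) row per existing tie."""
    return [
        (actors[i], actors[j])
        for i, row in enumerate(adjacency)
        for j, cell in enumerate(row)
        if cell == 1
    ]


def to_friend_list(actors, adjacency):
    """Friend list: each actor followed by the actors they send ties to."""
    return {
        actors[i]: [actors[j] for j, cell in enumerate(row) if cell == 1]
        for i, row in enumerate(adjacency)
    }


def is_symmetric(adjacency):
    """True if every tie i -> j is matched by a tie j -> i."""
    n = len(adjacency)
    return all(adjacency[i][j] == adjacency[j][i] for i in range(n) for j in range(n))


edge_list = to_edge_list(actors, adjacency)
friend_list = to_friend_list(actors, adjacency)
```

The matrix stores all nine cells, while the edge list stores only the three existing ties. Because ann -> cat is not reciprocated, the matrix is not symmetrical.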
Levels of analysis during social network analysis
- Network level: structure of relations in the entire population
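o Density: which proportion of all possible ties exist? With n actors and L existing ties,
this can be calculated by the following formulas:
Directed network: density = L / (n(n - 1))
Undirected network: density = 2L / (n(n - 1))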
- Dyad level: properties of pairs of nodes
o Distance: how many steps does it take to get from actor i to actor j via the
shortest path? Direction is important!
Average distance: the distance averaged over all pairs of actors
- Individual level: position of the individual in the network
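o Degree: number of ties of an actor (for example the number of friends). In directed
networks we distinguish:
Indegree: number of relations being received by one actor
Outdegree: number of relations being sent from one actor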
o Local density: also called clustering, transitivity, or closure. It measures the
proportion of possible ties between your friends that exist, or in other words:
to what extent are your friends also friends with each other?
o Centrality: how central is an actor in a network? Who are the crucial
connectors or information brokers in a network? Who is most likely (to be
worked out later)
Degree centrality: number of ties of an actor
Betweenness centrality: how many shortest paths run via an actor?
Closeness centrality: based on the distance to all other nodes (the shorter the
distances, the more central the actor)
Eigenvector centrality: centrality weighted by centrality of your
friends (also: Google’s PageRank)
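Some of the measures above can be sketched in plain Python for a small directed network. The network and all function names are hypothetical, and the choice to return 0.0 for actors with fewer than two friends is a convention picked here, not something from the lecture. Betweenness, closeness, and eigenvector centrality are left out of this sketch.

```python
from collections import deque

# Hypothetical directed network stored as a friend list (sender -> receivers).
network = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat"},
    "cat": {"dan"},
    "dan": set(),
}


def density(network):
    """Network level: proportion of all possible directed ties that exist."""
    n = len(network)
    ties = sum(len(receivers) for receivers in network.values())
    return ties / (n * (n - 1))


def outdegree(network, actor):
    """Individual level: number of ties sent by the actor."""
    return len(network[actor])


def indegree(network, actor):
    """Individual level: number of ties received by the actor."""
    return sum(actor in receivers for receivers in network.values())


def distances_from(network, source):
    """Dyad level: shortest path lengths from source, following tie direction.

    Uses breadth-first search; actors that cannot be reached are left out.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in network[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def local_density(network, actor):
    """Proportion of possible directed ties among the actor's friends that exist."""
    friends = network[actor]
    k = len(friends)
    if k < 2:
        return 0.0  # assumption: no pairs of friends, so report 0.0
    ties = sum(1 for a in friends for b in friends if a != b and b in network[a])
    return ties / (k * (k - 1))
```

In this example ann reaches dan in two steps (via cat), but dan reaches nobody, which shows why direction matters for distance.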