Lecture 1 Introduction
The Internet is essentially a physical network of computers connected through the TCP/IP protocols. The web is
a network of documents connected by hyperlinks. The web is thus built on top of the Internet.
The deep web is the part of the web that is not indexed by search engines. It is basically all
web content that cannot be found through a search engine such as Google, so even our own student mail
is part of the deep web. The dark web is the part of the deep web that is only accessible through
special software.
Typically we make no distinction between the internet and the web. In this course the internet is
seen as a socially constructed phenomenon that both reflects social, economic and political processes
and impacts on these processes.
In social science, there is often talk about the “impact of the internet”. This basically means the
impact of internet usage.
Cyberspace is used as a term to describe the space that people interact in on the internet. However,
this term is quite metaphorical and it is hard to measure. Social scientists have created the term
virtual communities to have a more measurable term for a space that people interact in online.
There is also the term online/virtual groups, which means a group of people with shared interests
who communicate via the Internet, but where collective identity does not exist. The difference
between virtual groups and virtual communities is that people in virtual communities have shared
values, norms and understandings.
In “normal” social communities, three aspects are used to define a community of people:
homogeneity, proximity and social ties. However in virtual communities, it should be noted that a
virtual community might exist even in the absence of homogeneity, proximity and social ties if the
topic of interest that draws the community together is one that itself is value driven, that is, the
members would not be interested in the topic if they did not share common values. This is best
explained using an example. The rec.pets.cats newsgroup (where people discuss their cats, like how
to care for them) is a good example of an online group while alt.non.racism (a newsgroup devoted to
discussing racism, presumably from the point of view that it is morally wrong) is an example of a
virtual community.
With the rise of Facebook and other social media, the term online social networks has become
increasingly popular. An online social network is the formal representation of a social network in
which the data on ties and nodes are the result of online interactions between individuals (as in a
social network site such as Facebook).
An online social network is not necessarily a virtual community. For example, if you extracted a
network of real-world friends from their Facebook profiles, you could represent and analyze the data
as an online social network. But this would not be an example of a virtual community since it is not
the case that these people necessarily share common values and norms leading to collective identity.
Online social networks are the type studied most by sociologists, which is why this course focuses
on them.
With online social networks the small world hypothesis can be tested at a country or even global
scale. The small world phenomenon means that people's networks are highly clustered (your contacts
tend to know each other), while the distance between any two people, counted in steps, is still
short. The small world phenomenon on the internet has been confirmed to exist.
Rather than being a force that is shaping human behavior, the web can best be viewed as a tool that
people use to achieve social, economic and political outcomes. The web provides social scientists
with a unique data source for studying this behavior, thus providing new insights into long-standing
questions in social science.
The network perspective: sociological questions about the internet are often about diffusion of
information and structures of social relations.
Lecture 2 Social Network Analysis
Social Network Analysis comprises methods to study networks between people.
The principal types of data used in social science are:
- Attribute data: relate to the attitudes, opinions and behavior of agents. These are regarded
as the properties, qualities or characteristics that belong to them as individuals or groups.
The items collected through surveys and interviews, for example, are often regarded simply
as attributes of particular individuals that can be quantified and analyzed through many of
the available statistical procedures. The methods most appropriate for attribute data are
those of variable and multivariate analysis, whereby attributes are measured as values of
particular variables such as income, occupation and education.
- Relational data: concern the contacts, ties and connections, and the group attachments and
meetings that relate one agent to another and that cannot be reduced to the properties of
the individual agents themselves. The methods appropriate for relational data are those of
network analysis, in which the relations are treated as expressing the linkages that run
between agents. So in social network analysis (and in this course), we look at relational data.
A friendship network is an example of relational data.
In relational data, we analyze the structure of social interaction. This structure can be visualized
in a graph (which is referred to as a network). Examples of structural questions are: how cohesive is
a group? Who is most important in a group?
Terminology:
- Points in a graph: node or actor
- Line in a graph: relation, edge, link or tie (examples being friendship, acquaintances, love,
hate, trade/exchange, authority or proximity).
Different types of ties:
- Directed: tie runs from one actor to another (examples being liking or friendship)
- Undirected: tie runs in both directions (examples being marriage, kinship, collaboration)
Note: directed ties can run from both sides (for example two friends who like each other), but these
are still directed ties. What makes a tie undirected is that it runs in both directions by definition,
for example in marriage, where people are always in pairs.
Strength of ties:
- Weighted networks: we can assign a strength to a tie (examples being best friends versus
“regular” friends)
- Unweighted networks: ties have no inherent strength; they either exist or do not exist
(examples being marriage or being colleagues: you cannot be very married or a little married.
You can of course study the strength of the relation in a marriage, but that would then make
it a weighted network).
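The distinctions between directed/undirected and weighted/unweighted ties can be sketched in a few lines of Python. This is only an illustration: the names, the ties and the strength values (2 for best friends, 1 for regular friends) are made up, and the helper functions are hypothetical.

```python
# Hypothetical example data; names and ties are invented for illustration.

# Directed, unweighted: "Anna likes Bob" does not imply "Bob likes Anna".
likes = {("Anna", "Bob"), ("Bob", "Anna"), ("Anna", "Carl")}

# Undirected, unweighted: each pair is stored once, without an order.
married = {frozenset(("Dina", "Emil"))}

# Directed, weighted: the value is the assumed strength of the tie
# (2 = best friend, 1 = regular friend).
friendship_strength = {("Anna", "Bob"): 2, ("Bob", "Anna"): 1, ("Anna", "Carl"): 1}


def is_reciprocated(directed_ties, a, b):
    """True if a directed tie runs both from a to b and from b to a."""
    return (a, b) in directed_ties and (b, a) in directed_ties


def has_undirected_tie(undirected_ties, a, b):
    """The order of a and b does not matter for an undirected tie."""
    return frozenset((a, b)) in undirected_ties


print(is_reciprocated(likes, "Anna", "Bob"))     # mutual liking: two directed ties
print(is_reciprocated(likes, "Anna", "Carl"))    # one-sided liking
print(has_undirected_tie(married, "Emil", "Dina"))
```

Note that the mutual liking between Anna and Bob is stored as two separate directed ties, whereas the marriage is stored as a single tie without direction.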
There are different ways of measuring social networks:
- Ethnographic/direct observation: like observing a group and taking notes on who talks to
whom, et cetera.
- Survey methods: among a sample of individuals from a population (ego networks) or among
a delineated subset of individuals (complete networks). You can choose between a roster (a
list of people that the respondent can choose from) or nomination. When a roster is used,
the researcher must have identified the relevant members of the network before constructing
and implementing the questionnaire or schedule.
- Records of behavior: online social networks like Twitter following/retweeting or Facebook
friendships and even website hyperlinks.
Challenges in network data collection:
- Setting boundaries: where does the network stop? If you study a certain group, they of
course always have connections to people outside of that group which then will not be
considered. But you have to draw a line somewhere (otherwise you could take the whole
world as your research group because everyone has connections to people outside of a
group).
- Sampling: observe all nodes (and ties) or only a random subset?
- Missing data: missing information about nodes and/or ties.
Different methods of storing network data:
Adjacency matrix: a square matrix with one row and one column per person, where each cell shows
whether a tie exists between the row person and the column person.
Ties-as-cases/edgelist/arclist: a matrix where every row indicates a relationship (for example A – B,
B – D and D – A et cetera).
Friend list format: a format where nominations from different friends are shown. For example with
the question “who are your three best friends?”, the format shows every person and then the three
best friends. A friend list format is useful for small datasets but not for big ones.
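The three storage formats contain the same information and can be converted into each other. Below is a minimal sketch, assuming a small made-up set of undirected friendship ties; the function names are hypothetical. With directed nominations the adjacency matrix would not be mirrored and the friend list would only contain the people each respondent named.

```python
# Hypothetical undirected friendship ties, stored as an edge list (one row per tie).
edge_list = [("A", "B"), ("B", "D"), ("D", "A"), ("C", "D")]


def to_adjacency_matrix(edges):
    """Return (sorted node labels, square 0/1 matrix) for undirected ties."""
    nodes = sorted({n for edge in edges for n in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror the tie
    return nodes, matrix


def to_friend_list(edges):
    """Return {person: sorted list of friends}, one entry per person."""
    friends = {}
    for a, b in edges:
        friends.setdefault(a, set()).add(b)
        friends.setdefault(b, set()).add(a)
    return {person: sorted(names) for person, names in sorted(friends.items())}


nodes, matrix = to_adjacency_matrix(edge_list)
print(nodes)
for label, row in zip(nodes, matrix):
    print(label, row)
print(to_friend_list(edge_list))
```

The adjacency matrix grows with the square of the number of people, while the edge list only grows with the number of ties, which is one reason the edge list is often preferred for large, sparse networks.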
In social network analysis there are different levels of analysis:
- Network level: structure of ties in the entire population.
- Dyad level: properties of pairs of nodes (are two nodes connected, how many steps?).
In dyad level analysis you can look at distance: how many steps does it take to get from
actor A to actor B? Average distance: distance averaged over all pairs of actors.
- Individual level: positions of the individuals in the network. There are different terms in
individual level analysis:
Degree: the number of ties (friends) of an actor. In directed networks this splits into indegree
(incoming ties) and outdegree (outgoing ties).
Local density/clustering: to what extent are your friends also friends with each other?
Centrality: how central is an actor in a network? You can see who is most important in a
network. Centrality can be used to answer questions such as: who is most influential, or who is
most likely to be informed first about something?
Degree centrality: the number of ties of an actor.
Betweenness centrality: how many shortest paths between other actors run via an actor.
Closeness centrality: how close an actor is to all other nodes, based on the distances to them.
Eigenvector centrality: centrality weighted by centrality of your friends.
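Several of the measures above can be computed by hand on a small network. The sketch below assumes a made-up undirected, connected network of five actors stored in friend list format; the function names are hypothetical. Closeness is computed with one common definition, (N - 1) divided by the summed distance to all other nodes; other definitions exist. Betweenness and eigenvector centrality are left out to keep the example short.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected, connected friendship network (friend list format).
network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}


def distances_from(graph, start):
    """Breadth-first search: number of steps from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_distance(graph):
    """Distance averaged over all pairs of actors (assumes a connected network)."""
    pairs = list(combinations(sorted(graph), 2))
    total = sum(distances_from(graph, a)[b] for a, b in pairs)
    return total / len(pairs)


def degree(graph, node):
    """Number of ties of an actor in an undirected network."""
    return len(graph[node])


def local_clustering(graph, node):
    """Share of pairs of a node's friends who are also friends with each other."""
    pairs = list(combinations(sorted(graph[node]), 2))
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)


def closeness(graph, node):
    """Assumed definition: (N - 1) / summed distance to all other nodes."""
    dist = distances_from(graph, node)
    return (len(graph) - 1) / sum(dist.values())


print(distances_from(network, "A")["E"])   # steps from A to E
print(average_distance(network))
print(degree(network, "C"), local_clustering(network, "C"), closeness(network, "C"))
```

In this example C has the highest degree and the highest closeness, but a low local clustering, because only one of the three pairs of C's friends (A and B) are friends with each other.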
Calculating density: which proportion of all possible ties exist?
Number of possible ties (directed networks): N*(N-1)
Number of possible ties (undirected networks): N*(N-1)/2
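The two formulas above can be turned into a small function. This is a minimal sketch; the function name and the example numbers are made up for illustration.

```python
def density(n_nodes, n_ties, directed=False):
    """Proportion of all possible ties that actually exist."""
    if n_nodes < 2:
        raise ValueError("density needs at least two nodes")
    possible = n_nodes * (n_nodes - 1)   # directed: N*(N-1)
    if not directed:
        possible //= 2                   # undirected: N*(N-1)/2
    return n_ties / possible


print(density(5, 5))                  # undirected: 5 of 10 possible ties
print(density(5, 5, directed=True))   # directed: 5 of 20 possible ties
```

The same number of observed ties gives a lower density in a directed network, because every pair of actors can have two ties instead of one.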
Visualization is a popular method in social network analysis to highlight the structure of the network
visually. A caveat is that there are many different layout algorithms, and these can produce different
pictures of the same network, which can then lead to different conclusions. So as a researcher you
should not base your conclusions solely on your visualization.