Table of contents
Week 1: Data-Driven Technologies: what are they?
Boyd and Crawford (2012)
Mayer-Schonberger & Cukier (2013)
Dalton and Thatcher (2014)
Week 2: How do they work?
Raghupathi & Raghupathi (2014)
Silver (2012)
Ziewitz (2017)
Week 3: How can we use them in healthcare?
Schutt & O’Neil (2014)
Menger et al (2016)
Baru (2019)
Week 4: What sort of knowledge do they produce?
Anderson (2008)
Stevens et al (2018)
Kitchin (2014)
Stevens et al (2020)
Week 5: How does regulation differ in European countries?
Rieder (2018)
Custers et al (2018)
Starkbaum & Felt (2019)
Week 6: What are the ethical dilemmas?
Mittelstadt & Floridi (2016)
Zwitter (2014)
Zook et al (2017)
Grote & Berens (2020)
Morley et al (2020)
Week 1: Data-Driven Technologies: what are they?
Boyd and Crawford (2012)
Big Data is less about data that is big than it is about a capacity to search, aggregate, and
cross-reference large data sets.
We define Big Data as a cultural, technological, and scholarly phenomenon that rests on
the interplay of:
(1) Technology: maximizing computation power and algorithmic accuracy to gather, analyze,
link, and compare large data sets.
(2) Analysis: drawing on large data sets to identify patterns in order to make economic,
social, technical, and legal claims.
(3) Mythology: the widespread belief that large data sets offer a higher form of intelligence
and knowledge that can generate insights that were previously impossible, with the aura of
truth, objectivity, and accuracy.
Features like personalization allow rapid access to more relevant information, but they
present difficult ethical questions and fragment the public in troubling ways.
A critical view of Big Data:
1. Big Data changes the definition of knowledge
a. Big data stakes out new terrains of objects, methods of knowing, and
definitions of social life.
b. Some data is hard to access so the focus lies on more recent or present data.
c. If we are observing the automation of particular kinds of research functions,
then we must consider the inbuilt flaws of the machine tools. It is not enough
to simply ask, as Anderson has suggested ‘what can science learn from
Google?’, but to ask how the harvesters of Big Data might change the
meaning of learning, and what new possibilities and new limitations may
come with these systems of knowing.
2. Claims to objectivity and accuracy are misleading
a. A model may be mathematically sound, an experiment may seem valid, but as
soon as a researcher seeks to understand what it means, the process of
interpretation has begun. This is not to say that all interpretations are created
equal, but rather that not all numbers are neutral.
b. All data needs to be interpreted, which undermines claims that the data
are objective.
c. ‘Interpretation is at the center of data analysis’ (p. 668).
3. Bigger data are not always better data
a. Twitter does not represent ‘all people’, and it is an error to assume ‘people’
and ‘Twitter users’ are synonymous: they are a very particular sub-set.
Neither is the population using Twitter representative of the global population.
Nor can we assume that accounts and users are equivalent. Some
users have multiple accounts, while some accounts are used by multiple
people.
b. ‘Big Data and whole data are also not the same’ (p. 669).
c. The size of data should fit the research question being asked; in some cases,
small is best.
4. Taken out of context, Big Data loses its meaning
a. Because large data sets can be modeled, data are often reduced to what can
fit into a mathematical model. Yet, taken out of context, data lose meaning
and value.
b. Behavioral and articulated networks have great value but are not the same as
personal networks.
c. Not every connection is equivalent to every other connection, and neither
does frequency of contact indicate strength of relationship.
5. Just because it is accessible does not make it ethical
a. Accountability requires rigorous thinking about the ramifications of Big Data,
rather than assuming that ethics boards will necessarily do the work of
ensuring that people are protected.
b. Data may be public (or semi-public) but this does not simplistically equate
with full permission being given for all uses.
6. Limited access to big data creates new digital divides
a. While the explosion of research using data sets from social media sources
would suggest that access is straightforward, it is anything but.
b. Big Data researchers with access to proprietary data sets are less likely to
choose questions that are contentious to a social media company if they think
it may result in their access being cut.
c. The current ecosystem around Big Data creates a new kind of digital divide:
the Big Data rich and the Big Data poor.
The era of Big Data has only just begun, but it is already important that we start questioning
the assumptions, values, and biases of this new wave of research.
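Point 3 above (bigger data are not always better data) can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population size, the share of platform users, and the opinion rates are all hypothetical numbers chosen for illustration and do not come from Boyd and Crawford. It compares an estimate based on every user of a platform (a large but unrepresentative sample) with an estimate based on a small random sample of the whole population.

```python
import random


def simulate(seed=0):
    """Compare a large biased sample with a small random sample.

    All numbers below are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical population of 100,000 people. About 20% use the platform,
    # and platform users hold the opinion far more often than non-users.
    population = []
    for _ in range(100_000):
        on_platform = rng.random() < 0.20
        p_opinion = 0.60 if on_platform else 0.225
        holds_opinion = rng.random() < p_opinion
        population.append((on_platform, holds_opinion))

    # Share of the whole population that holds the opinion.
    true_rate = sum(op for _, op in population) / len(population)

    # "Big" data: every platform user, but platform users only.
    platform = [op for on, op in population if on]
    big_estimate = sum(platform) / len(platform)

    # "Small" data: 1,000 people drawn at random from the whole population.
    small = rng.sample(population, 1000)
    small_estimate = sum(op for _, op in small) / len(small)

    return true_rate, big_estimate, small_estimate, len(platform)


true_rate, big_estimate, small_estimate, n_platform = simulate()
print(f"true rate in population:        {true_rate:.3f}")
print(f"all {n_platform} platform users:       {big_estimate:.3f}")
print(f"random sample of 1000 people:   {small_estimate:.3f}")
```

Under these assumptions the platform sample is many times larger than the random sample, yet its estimate sits much further from the population rate, because collecting more platform users does nothing to remove the selection bias.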
Mayer-Schonberger & Cukier (2013)
Things really are speeding up. The amount of stored information grows four times faster
than the world economy, while the processing power of computers grows nine times faster.
Sometimes the constraints that we live with, and presume are the same for everything, are
really only functions of the scale in which we operate.
With information, as with physics, size matters. Hence, Google is able to identify the
prevalence of the flu just about as well as official data based on actual patient visits to the
doctor. It can do this by combing through hundreds of billions of search terms, and it can
produce an answer in near real time, far faster than official sources.
Big data is not about trying to “teach” a computer to “think” like humans. Instead, it’s about
applying math to huge quantities of data in order to infer probabilities: the likelihood that an
email message is spam; that the typed letters “teh” are supposed to be “the”; that the
trajectory and velocity of a person jaywalking mean he’ll make it across the street in time.
The key is that these systems perform well because they are fed with lots of data on which
to base their predictions. Moreover, the systems are built to improve themselves over
time, by keeping a tab on what are the best signals and patterns to look for as more data is
fed in.
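The idea of inferring probabilities from data, such as the likelihood that “teh” was meant to be “the”, can be sketched with simple word counts. This is a minimal illustration, not the method described by Mayer-Schonberger & Cukier: the tiny corpus, the one-edit candidate rule, and the function names are all assumptions made for the example, and real systems rely on vastly more text.

```python
from collections import Counter

# Tiny hypothetical corpus standing in for "lots of data".
CORPUS = "the cat sat on the mat and the dog saw the ten cats then the tea was hot"
COUNTS = Counter(CORPUS.split())


def one_edit_away(word):
    """Return all strings reachable from `word` by one simple edit."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)


def correction_probabilities(word):
    """Estimate how likely each known word is to be the intended one.

    Candidates are known words at most one edit away from `word`; each
    candidate's probability is its share of the candidates' total count.
    """
    candidates = {w for w in one_edit_away(word) | {word} if w in COUNTS}
    seen = sum(COUNTS[w] for w in candidates)
    if seen == 0:
        return {}
    return {w: COUNTS[w] / seen for w in candidates}


probs = correction_probabilities("teh")
for candidate, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{candidate}: {p:.2f}")
```

The sketch never learns why “the” is the better guess; it only counts how often each candidate appears. Feeding in more text changes the counts and therefore the probabilities, which loosely mirrors the point that such systems improve as more data is fed in.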
As scale increases, the number of inaccuracies increases as well.
Big data is about what, not why. We don’t always need to know the cause of a phenomenon;
rather, we can let data speak for itself.
Moreover, because of the data’s vast size, decisions may often be made not by humans but
by machines.
Meanwhile the danger to us as individuals shifts from privacy to probability: algorithms will
predict the likelihood that one will get a heart attack (and pay more for health insurance),
default on a mortgage (and be denied a loan), or commit a crime (and perhaps get arrested
in advance). It leads to an ethical consideration of the role of free will versus the dictatorship
of data.
What does it mean if a doctor cannot justify a medical intervention without asking the
patient to defer to a black box, as the physician must do when relying on a big-data-driven
diagnosis?
Although they build upon the values that were developed and enshrined for the world of
small data, it’s not simply a matter of refreshing old rules for new circumstances, but
recognizing the need for new principles altogether.
As the world shifts from causation to correlation, how can we pragmatically move forward
without undermining the very foundations of society, humanity, and progress based on
reason?
Dalton and Thatcher (2014)
1. Situating ‘big data’ in time and space
2. Technology is never as neutral as it appears
3. Big data does not determine social forms: confronting hard technological
determinism
a. A technology designed by one group of stakeholders for a particular purpose
may be adopted by different stakeholders and used against its original
intended function.
4. Data is never raw