Research Methodology II
T&D Chapter 11. Quantitative understanding of content
Content analysis = a quantitative, systematic, and objective technique for describing the
manifest content of communication.
Quantitative = counting instances.
Systematic = we must count all relevant aspects of the sample; we cannot arbitrarily
pick what aspects get analyzed.
Objective = validity, reliability, clear unit of analysis.
Manifest = tangible and observable.
Advantages:
Content analysis can be used with almost any form of content: it is possible to analyze
almost any recorded medium, e.g. press, radio, video, web, billboards.
Content analysis has been used to study representations in news, advertising, and
entertainment media of demographic, social, minority, and occupational groups as well
as of health, parenthood, food, religion. The contexts of content analysis range from an
individual’s communication through interpersonal, group, and organizational
communication to social networks involving thousands of individuals.
Content analysis is unobtrusive. Human participants are not involved.
Emphasis on systematic sampling, clear definitions of units, and counting. The
procedures should be explicit, precise and replicable so that other researchers can verify
the results of the research.
Disadvantages:
Human coders can read for meaning, but because this involves judgments about content,
the challenge is to have them code with 100% reliability.
Content analysis only addresses the questions of content.
The method is mostly useful when it is applied to comparisons.
Validity in content analysis can be problematic in terms of relating its findings to the
external world. Detecting that the frequency with which the word ‘patriotism’ appears in
a politician’s speeches has increased over time does not entitle us to assume that the
politician has become more patriotic over time.
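The reliability challenge noted above can be made concrete by comparing the codes of two coders. The following is a minimal sketch, not a prescribed procedure: the codes are hypothetical, and the two measures shown (simple percent agreement and Cohen's kappa, which corrects for chance agreement) are only two of several common options.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of units to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability that both coders pick the same code at random,
    # given how often each coder used each code.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical codes for ten units (e.g. "does this ad contain a metaphor?").
coder_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
coder_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Percent agreement is easy to read but overstates reliability when one code dominates; kappa is lower because it discounts matches that would occur by chance.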
Content analysis = assigning units of content to predetermined categories and then
counting the number of units in each category.
1. Develop a hypothesis or research question about communication content.
2. Define the content to be analyzed.
3. Sample the content.
4. Select units for coding.
5. Develop a coding scheme.
6. Assign each occurrence of a unit in the sample to a code in the coding scheme.
7. Count occurrences of the coded units.
8. Report results, patterns of data, and inferences from data.
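Steps 4 to 7 above can be sketched in a few lines. This is an illustration under stated assumptions: the unit of analysis is a sentence, the coding scheme and sample sentences are hypothetical, and keyword matching stands in for the judgment a human coder would make.

```python
from collections import Counter

# Step 5: hypothetical coding scheme; each category lists words that signal it.
CODING_SCHEME = {
    "economy": {"jobs", "taxes", "inflation"},
    "security": {"police", "crime", "border"},
}

def code_unit(sentence):
    """Step 6: assign one unit (here, a sentence) to a predetermined category."""
    words = set(sentence.lower().replace(".", "").split())
    for category, keywords in CODING_SCHEME.items():
        if words & keywords:
            return category
    return "other"  # units that match no category are still counted

def count_codes(units):
    """Step 7: count how many units fall into each category."""
    return Counter(code_unit(unit) for unit in units)

# Step 4: the units selected for coding (hypothetical sample).
sample = [
    "We will create jobs.",
    "Crime is falling.",
    "Taxes are too high.",
    "Thank you all for coming.",
]
counts = count_codes(sample)
print(counts)
```

Keeping an explicit "other" category means every unit in the sample is coded, which matches the requirement that the analysis be systematic.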
Interaction analysis = seeks to capture and understand interactions among members of a
group and the different roles that group members play.
Robert Bales (1950) outlines three broad categories of group behaviour:
Task-oriented = focus on the group’s work.
Group-oriented = work to ensure that the group remains cohesive.
Self-centered = may refuse to participate or may dominate discussions.
Content, be it massive web data or more modest examples such as speeches or newspaper
editorials, needs to be prepared for computer analysis. Hand in hand with this preparation,
the analyst will need to train the analytic software to recognize content.
Disambiguation = the process of examining a word in its context and assigning it the most
appropriate out of several possible meanings.
Stemming = changing all variations of a word to its basic stem, e.g. fish, fishing, fished,
fisherman can all be stemmed to fish.
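A toy sketch of stemming, using the example words above. The suffix list is an assumption chosen only to cover these words; established stemmers such as the Porter algorithm use far more elaborate rules and may not reduce every variant to the same stem.

```python
# Toy suffix list chosen to cover the example words; not a real stemming algorithm.
SUFFIXES = ("erman", "ing", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

stems = [naive_stem(w) for w in ["fish", "fishing", "fished", "fisherman"]]
print(stems)
```

After stemming, all four forms are counted as one item, so their frequencies add up instead of being spread over separate words.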
Lemmatization = grouping words together based on their basic dictionary definition so that
they can be analyzed as a single item, e.g. car and automobile can be described as vehicle.
Stop words = high frequency words such as pronouns (I, we, you) and prepositions (in,
under).
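The last two preparation steps can be illustrated together. This is a minimal sketch with deliberately tiny, hypothetical word lists: stop words are dropped, and words are grouped under a shared label in the sense of the lemmatization definition above, before frequencies are counted.

```python
from collections import Counter

# Hypothetical, deliberately tiny lists for illustration only.
STOP_WORDS = {"i", "we", "you", "in", "under", "the", "a", "my"}
GROUPS = {"car": "vehicle", "automobile": "vehicle"}

def prepare(text):
    """Lowercase, strip punctuation, drop stop words, and group related words."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    return [GROUPS.get(t, t) for t in tokens]

tokens = prepare("We parked the car under a bridge. You left my automobile in town.")
frequencies = Counter(tokens)
print(frequencies.most_common(3))
```

Without these steps the most frequent items in almost any text would be stop words, and "car" and "automobile" would be counted as unrelated words.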
Lecture 1. Introduction
Corpus research
Experimental research
Survey
Abstract 1 Experiment
- Something’s manipulated
- Something’s caused by something else
- One scenario versus another scenario
Abstract 2 Survey
- No two different scenarios
-
Abstract 3 Corpus research
- No two different scenarios
- Analysis of 21 materials
Abstract 4 Experiment
- Different conditions
- Manipulation
“Content analysis is a research technique for the objective, systematic, and quantitative
description of the manifest content of communication.”
Step 1. Formulate a research question or hypothesis
Topic/literature research
- Metaphor in some advertisements
- Difference between products and services
Hypothesis: The use of metaphor (present/absent) will differ in advertisements for products
versus services.
Step 2. Determine what content you will analyze
I will only take into account advertisements that were published in print, not online.
Step 3. Create your corpus
I will select all advertisements that were published in the 5 most-read magazines in the
Netherlands in the past 5 years.
Step 4. Decide what you are going to code
I will code for presence of metaphor in the image displayed in the ad.
Not interested in type of metaphor, nor interested in language use.
I will code for products/services.
Not interested in specific brands/companies.
Step 5. Establish how you are going to code the data
- Presence of metaphor: yes/no
- Type: product/service/other
Step 6. Annotate the corpus
Step 7. Count occurrences
Step 8. Report results
- 66% of all product ads contain a metaphor
- 40% of all service ads contain a metaphor
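Steps 5 to 8 of this example can be sketched as follows. The annotations below are hypothetical and are not the data behind the percentages reported above; the function name and data layout are assumptions made for illustration.

```python
from collections import defaultdict

# Step 5: the coding scheme. Type is one of these values; metaphor is True/False.
ALLOWED_TYPES = {"product", "service", "other"}

def metaphor_share(annotations):
    """Steps 7-8: percentage of ads containing a metaphor, per ad type."""
    totals = defaultdict(int)
    with_metaphor = defaultdict(int)
    for ad_type, has_metaphor in annotations:
        if ad_type not in ALLOWED_TYPES:
            raise ValueError(f"unknown ad type: {ad_type!r}")
        totals[ad_type] += 1
        if has_metaphor:
            with_metaphor[ad_type] += 1
    return {t: 100 * with_metaphor[t] / totals[t] for t in totals}

# Step 6: hypothetical annotated corpus of (type, metaphor present) pairs.
annotations = [
    ("product", True), ("product", True), ("product", True), ("product", False),
    ("service", True), ("service", False), ("service", False), ("service", False),
]
shares = metaphor_share(annotations)
for ad_type, pct in shares.items():
    print(f"{pct:.0f}% of all {ad_type} ads contain a metaphor")
```

Rejecting values outside the coding scheme catches annotation typos before they distort the counts.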
How would you investigate these questions? Through corpus analysis!
1. What content will you analyze?
2. What will be your corpus?
3. What are you going to code and how will you code it?
Guest lecture: Research Data Management
Research data = all information, digital and non-digital, generated as part of the scientific
process, on which scientific conclusions are based.
In some disciplines, data is a pretty straightforward concept, such as survey data, interview
transcripts and statistical data. For other types of data this is less clear.
Data as part of a publication = extensive, structured reference lists may be valuable
databases on their own, and could/should be archived in a data repository.
Primary data = audio/video/text data that you collected/recorded yourself, raw data.
Secondary data = derived data, e.g. analysis schemes, scripts, codebooks.
Annotations = may be databases on their own, and could/should be archived in a data
repository (check copyright if the original text is included).
Why research data management?
Data is research output, and is part of your publication. RDM helps you to make conscious
decisions about the data in your project.
Findable Accessible Interoperable Reusable
The international FAIR principles are guidelines for the way of describing, storing and
publishing scientific data.
Findable = data and metadata have a unique identifier.
Accessible = data and metadata are accessible through a protocol.