Lecture 12: Logic and quality of research designs: Experimental designs
Positivists and realists are interested in finding and observing cause-effect relationships (independent variable / explanatory variable = cause; dependent variable / outcome variable = effect).
1. What is the essence of an experimental design?
Experimental research is now perhaps the fastest growing area of political research. Experimental designs are often
thought to most closely resemble the true scientific method, and as such they are regarded as being the most effective
design for testing whether or not two variables are causally related. They manage to do this thanks to the rigorous use
of experimental control. This helps researchers to isolate the impact of a specific variable and to overcome one of the
main problems that researchers face when they want to investigate causal relationships.
Causality in experimental research:
Counterfactual understanding of causality: “X caused Y” means that Y is present, but Y would not have been
present if X were not present.
Probabilistic interpretation of causation: X increases the likelihood of Y occurring
Effect of cause: What is the effect of X, rather than what causes Y?
Research strategy – systematic testing of causal claims:
Controlled intervention/treatment:
o Manipulation of the independent variable of interest. The intervention itself is called the treatment.
Ceteris paribus = all else equal:
o Control of environment (lab):
It relies on the use of control groups (in which no intervention takes place) and experimental
groups (in which interventions do take place) and the random assignment of subjects to control
and experimental groups:
The control group provides a point of reference against which the effect of the intervention can be compared. The random assignment of subjects to control and experimental groups ensures that the two groups are similar to each other.
o Random assignment to treatment and control group:
Ensures comparability of treatment and control group
Minimizes confounding factors: factors other than the treatment that may cause the outcome;
Omitted variable bias
Can help prevent reactivity (if participants don’t know the group they are in – blind)
Can help to prevent the researcher from treating treatment and control groups differently
(Rosenthal effect) (if researchers do not know who is in which group - double blind)
o How to randomize?
Isolation of ‘effect’’:
o Difference between treatment and control groups in Y (outcome)
o Average effects rather than individual effects
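The assignment-and-comparison logic above can be sketched in a few lines of Python. Everything here (the group size, the outcome model, the numbers) is invented for illustration, not taken from the lecture:

```python
import random
import statistics

random.seed(42)

subjects = list(range(20))        # 20 hypothetical participants
random.shuffle(subjects)          # random assignment: shuffle, then split
treated, control = subjects[:10], subjects[10:]

# Simulated post-test outcomes: the treatment shifts the mean up by 1
outcome = {s: random.gauss(5.0 + (1.0 if s in treated else 0.0), 1.0)
           for s in subjects}

# Average effect: difference between group means, not individual effects
effect = (statistics.mean(outcome[s] for s in treated)
          - statistics.mean(outcome[s] for s in control))
print(round(effect, 2))
```

Because assignment is random, any systematic difference in the outcome between the two groups can be attributed to the treatment rather than to pre-existing differences.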
2. Three designs to illustrate how we exploit comparison
Post-test only:
Possible comparison:
o A – of post-test results in both groups. Is the intervention effective?
BUT: Was randomization effective? What does the change for individuals look like, on average?
Pre-test/ Post-test Two Groups Design:
The classic version of the experimental design comprises five steps:
o Two groups: one group that is exposed to the intervention and one that is not.
o Random allocation of subjects to the groups before the pre-test.
o One pre-intervention (pre-test) measure on the outcome variable of interest, the dependent variable Y:
o One intervention (test/treatment)
o One post-intervention (post-test) measure on the outcome variable:
To see whether the hypothesis is confirmed or not we need to carry out a post-test on both
groups on the dependent variable. If our hypothesis is supported, then we should observe that
the test statistic for our dependent variable Y has changed for the experimental group but has
not changed for the control group.
Possible comparisons:
o A – of post-test results in both groups. Is the intervention effective?
o B – of pre-test/post-test in both groups. Does treatment group change over time (B)? Are there
confounding factors at play, e.g. changing environments (B1)?
o C – of pre-tests in both groups. Did randomization work well?
BUT:
o Could the pre-test have affected the post-test?
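The comparisons A, B/B1, and C can be illustrated with invented group means; the difference-in-differences at the end nets the control group's change over time out of the treatment group's change:

```python
# Invented mean scores for a pre-test/post-test two-group design
pre  = {"treatment": 4.0, "control": 4.1}   # mean pre-test scores on Y
post = {"treatment": 6.0, "control": 4.2}   # mean post-test scores on Y

A  = post["treatment"] - post["control"]    # comparison A: is the intervention effective?
B  = post["treatment"] - pre["treatment"]   # comparison B: change in the treatment group
B1 = post["control"]   - pre["control"]     # comparison B1: change in the control group
C  = pre["treatment"]  - pre["control"]     # comparison C: did randomization work?

# Netting out change over time: difference-in-differences
effect = B - B1
print(round(effect, 2))
```

Here C is close to zero (randomization worked), B1 is small (little change from the environment alone), and the difference-in-differences isolates the treatment's contribution.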
[Figure: diagram of the designs with the comparisons A–G marked between groups]
Solomon Four Groups Design:
Possible comparisons (in addition to A-C):
o D – of post-tests between groups 1&2 and 3&4; if they differ, pre-testing has possibly affected the outcome
o E – of pre-test in group 2 and post-test in group 4: if there is a difference, an external distortion may have caused the effect over time; causality?
o F (G) – of post-tests between groups 1&3 (2&4) to see whether pre-testing has affected the outcome
Laboratory experiments:
Subjects are recruited to a common location designed to ensure that the researcher has as much control as
possible over the environment to which subjects are exposed.
o The experimental group and the control group can be exposed to exactly the same environment except
for the experimental intervention.
Three main strengths:
o Allow researcher to have a great deal of control. Different stimuli can be manipulated one at a time,
holding everything else constant.
o Allow the researcher a great deal of control over what variables are manipulated, and even to
manipulate variables that might be difficult to vary in the real world.
o Laboratory experiments allow the researcher to create environments that simply don’t exist in the real
world.
High internal validity, low external and ecological validity (because of reactivity: participants know they are being observed, so they change their behaviour).
3. Some examples of political science field and natural experiments
Field experiments:
Controlled treatment taking place in the field, i.e. in real-world environments.
Attempt to reproduce as closely as possible the conditions under which different political phenomena occur, thus increasing the external validity or generalizability of the findings.
Two main strengths:
o RCT = randomized controlled trial; assures unbiased inference about cause and effect.
o The natural settings ensure that the results will tell us something useful about the real world.
Can try to uncover the way in which variables are related to each other when the direction of causality is uncertain.
External validity, but might have issues with internal validity.
Quasi-experiments/natural experiments:
No control of treatment:
o Instead: search for natural variation in X1, i.e. rely on naturally occurring events or interventions rather than on interventions controlled by the researcher.
There is no problem of artificiality associated with the laboratory experiment. Since the intervention occurs independently of the actions of the researcher, the ethical issues of deliberate intervention do not arise.
No randomization:
o But ‘as-if’ randomization: The researcher might be able to make a plausible claim that the allocation is as
good as random.
o Strategies to still satisfy the ceteris paribus assumption: case selection, matching, etc. But: difficult to achieve.
Comparative research designs
4. Assessment: internal & external validity, ethics
Internal validity: Can the effect truly be attributed to the treatment?
Depends on: How convincing is the ceteris paribus assumption?
Possible threats to ceteris paribus assumption:
o Are control and treatment groups comparable?
Randomization; blindness of participants
o Are effects over time possible?
Pre-test/post-test design
o Are repeated measurements affecting the outcome?
Solomon Four Group Design
o Do researchers treat participants in treatment/control groups equally?
Blindness of researcher
Tradeoffs:
o Between having repeated measurement and measurements affecting the outcome
o Between establishing more control groups and resources needed
External validity: Can we generalize the result beyond the participants of the experiment?
How have the participants been sampled into the experiment? If there was no random sampling, in what direction does the bias go?
Do participants react to experimental situation (reactivity, e.g. placebo effects)? Blindness of participants
Ecological validity: Does experiment convincingly simulate real-life situation?
Limitations / tradeoffs
o between recruiting representative samples and the resources available
o between controlling the situation (lab) and creating a real-life context
Ethical considerations:
Deception is inherent in experimental research: Debriefing participants afterwards is essential
Can unequal treatment be justified? List randomization designs (in which everyone eventually enrolls in the
assignment, but with different sequencing)
Lecture 13: Comparative case studies and research designs
Key message:
Comparison is not an end in itself!
o We may want to compare for descriptive purposes:
‘False uniqueness’ vs ‘False universalism’
False uniqueness emphasizes the specificity of the case, entirely ignoring the general
social forces at work, and does not move beyond thick description.
False universalism assumes that the theory tested in one country/context will be equally
applicable to other countries.
o Or for causal investigation, but there are theoretical reasons for a particular comparison
Broadly speaking, comparative methods can be used in three main ways:
o To apply existing theory to new cases
o To develop new theory or hypothesis
o To test theory
1. Classification of comparative designs
Do we compare over time or over space?
Cross-sectional designs: comparison across space; ‘between variation’
Longitudinal (time-series) designs: comparison over time; ‘within variation’
Time-series cross-sectional design: comparison across space and over time
o Panel study: same participants over time
o Cohort study: same participants over time; participants share a defining characteristic such as age, year
of graduation, etc.
How many cases are compared?
Terminology:
o Case: A (spatially or temporally) delimited phenomenon (a unit) that represents the phenomenon we
are interested in.
o Population: All the cases that an inference is said to apply to.
o Sample: All cases chosen for a study.
o N: The number of cases in a sample (small-N) or observations in a sample (large-N)
→ These terms are definable only by reference to your hypothesis and research design
o Case study (Gerring 2006: 20): An intensive study of a single case or several cases ‘where the purpose of that study is – at least in part – to shed light on a larger class of cases (a population)’.
“However, at a certain point it will no longer be possible to investigate [each of] those cases
intensively. At the point where the emphasis of a study shifts from the individual case to a
sample of cases, we shall say that a study is cross-case. […]. The fewer cases there are, and the
more intensively they are studied, the more a work merits the appellation “case study.” […] All
empirical work may be classified as either case study (comprising one or a few cases) or cross-case study (comprising many cases).”
Single-N/ single-case study designs:
o Research activities:
In-depth investigation of one specific case (e.g. election, welfare state, leader)
Using various forms of evidence e.g. documents, macro-economic data, surveys, interviews
→ Focus on depth (internal validity), rather than breadth (external validity)
o Single-N descriptive/explorative designs:
Extreme (unusual) case
o Single-N designs to investigate causality:
Most-likely crucial case; least-likely crucial case; deviant case
o Case selection is crucial: based on theoretical considerations (-> Research Methods next year)
Small-N (multiple case studies) designs:
o Research activities:
in-depth investigation of several specific cases
using various forms of evidence e.g. documents, macro-economic data, surveys, interviews
→ Focus on depth (internal validity), rather than breadth (external validity)
o Typically, 2-6 cases
o Also referred to as ‘comparative’ (case study) design
Most-similar systems design
Most-different systems design
o Comparison as a means, rather than end in itself
o Attempt to approximate the experimental condition through case selection -> after the break
Combination of both – Large-N/ Statistical analysis:
o Example: A researcher wants to know whether economic welfare leads to more liberal political attitudes.
She uses data from a national survey on income and political attitudes for 1,115 individuals.
o Research activities: collection of numerical data on many cases; statistical analysis
o Correlation between one dependent and one or more independent variables (size of effect)
o Case selection: representative (random) sample
o ‘Control’ of alternative explanations (control variables)
→ Focus on breadth (external validity), rather than depth (internal validity)
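As a hedged sketch of what such a large-N analysis computes, the snippet below simulates income and attitude scores for 1,115 respondents (the data-generating process is an assumption) and computes Pearson's correlation coefficient by hand:

```python
import random
import statistics

random.seed(0)
n = 1115
income = [random.gauss(50, 15) for _ in range(n)]
# Attitude scores weakly increasing in income, plus individual noise
attitudes = [0.02 * x + random.gauss(0, 1) for x in income]

# Pearson correlation coefficient: covariance over product of std. devs
mx, my = statistics.mean(income), statistics.mean(attitudes)
cov = sum((x - mx) * (y - my) for x, y in zip(income, attitudes)) / (n - 1)
r = cov / (statistics.stdev(income) * statistics.stdev(attitudes))
print(round(r, 2))
```

A positive r is consistent with the hypothesis, but without experimental control it is only an association; control variables are needed to rule out alternative explanations.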
2. Zooming in on small-N case study designs
Most-similar systems design (MSSD):
John Stuart Mill: A System of Logic (1843): method of difference
Two or more cases
Cases need to be
o similar in X2 (controlled causes)
‘control’ (= cases that happen to be similar in these aspects) of alternative explanations (X2, X3,
X4)
o different in X1 (cause of interest)
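Mill's method of difference can be sketched with toy data: two cases that match on the controlled causes X2–X4 but differ on X1 and on the outcome Y, singling out X1 as the candidate cause (all values are invented):

```python
# Two hypothetical cases, coded 0/1 on each variable
cases = {
    "Case A": {"X1": 1, "X2": 0, "X3": 1, "X4": 0, "Y": 1},
    "Case B": {"X1": 0, "X2": 0, "X3": 1, "X4": 0, "Y": 0},
}

a, b = cases["Case A"], cases["Case B"]
# Which candidate causes differ between the otherwise similar cases?
differing = [v for v in ("X1", "X2", "X3", "X4") if a[v] != b[v]]
print(differing)           # only X1 differs across the two cases
print(a["Y"] != b["Y"])    # the outcomes differ too -> X1 is the candidate cause
```

This is the logic the most-similar systems design tries to approximate through case selection, standing in for the experimental control that observational comparison lacks.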