Conducting a survey
Lecture 1: introduction to survey methodology
H10: Research process
Developing and conducting a survey is an iterative process:
1. Defining the problem and research objectives
2. Developing the research plan for collecting information: planning, designing
a questionnaire, pre-testing the questionnaire
3. Implementing the research plan (collecting and analysing the data):
distribute the questionnaire to respondents (offline methods, online
methods). You can use login procedures to restrict participation
to those who are invited and to ensure that each person fills in the
questionnaire only once:
a. Automatic login procedures = use a unique ID and/or password
embedded in the URL, resulting in a unique URL for each respondent.
b. Manual login approaches = use an ID in the URL but require a
respondent to fill out a password or to fill out both ID and password.
Automatic login procedures increase response rates. After that you clean
your data and start analysing.
4. Interpreting and reporting the findings
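The automatic login procedure from step 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method from the lecture: the address survey.example.org, the parameter names id and token, and the function names make_invitation_urls and check_login are all hypothetical.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

BASE_URL = "https://survey.example.org/q"  # hypothetical survey address


def make_invitation_urls(respondent_ids):
    """Return {respondent_id: personal URL}, each with a random token embedded."""
    urls = {}
    for rid in respondent_ids:
        token = secrets.token_urlsafe(16)  # hard-to-guess stand-in for a password
        urls[rid] = f"{BASE_URL}?{urlencode({'id': rid, 'token': token})}"
    return urls


def check_login(url, issued_urls, completed):
    """Accept a URL only if it was issued and its owner has not finished yet."""
    query = parse_qs(urlparse(url).query)
    rid = query.get("id", [None])[0]
    if rid is None or issued_urls.get(rid) != url:
        return False  # not an invited respondent
    if rid in completed:
        return False  # questionnaire already filled in once
    return True


invitations = make_invitation_urls(["R001", "R002"])
for rid, url in invitations.items():
    print(rid, url)
```

Because the ID and token travel inside the link, the respondent does not have to type anything. A manual login approach would instead show a form asking for the token (or for both ID and token).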
Doing surveys online
Chapter 1: introduction to online surveys
Online surveys can be used to answer both qualitative and quantitative research questions,
although they are more commonly used for answering quantitative research questions.
- Quantitative research = quantify a research problem by generating numerical data that
can be used for statistical testing. Most questions are close-ended.
- Qualitative research = gaining an understanding of underlying reasons, opinions or
motivations
a. Passive analysis = analysing interactions and information on the Internet without
actively communicating with people in Internet communities themselves. Researchers
often do not reveal their own identity as a researcher, e.g. when observing interactions
in chat rooms.
b. Active analysis = analysing interactions and information on the Internet and actively
getting involved in the interactions, often without revealing their own identity as a
researcher.
c. Web surveys = only method in which the researcher informs the participants of his
or her identity, making it more ethical than passive or active analysis. Most questions
are open-ended.
Advantages of online surveys:
1. Wide geographical reach: makes it possible to quickly and easily reach a large
sample, which supports generalization = making inferences from the sample to a
general population.
2. Low costs
3. Internet communities = ability to approach a group that one would not easily find
outside of the Internet by using chat rooms, mailing lists or discussion boards.
Total survey error
Total survey error = framework describing statistical error properties of sample surveys: all
sources of bias and error that may affect the validity of survey data. This can be used as a
criterion.
1. Coverage error (selection error) = discrepancies between the sampling frame (the
list of the target population used to draw a sample) and the actual target population.
The sampling frame does not match the population exactly: it misses subjects who are
part of the population, contains subjects who are no longer members of the defined
population, has missing or inaccurate information, or lists subjects more than once.
E.g. when your survey population is 18 to 80 year old citizens of Amsterdam and you
use the Amsterdam telephone book as a sampling frame, not all 18 to 80 year old
citizens of Amsterdam have a telephone or are registered in the telephone book.
Address-based sampling is more appropriate. The coverage error in web surveys is
influenced by the digital divide = groups that do not have access to the Internet have
significantly different demographics than groups that do have access to the Internet.
2. Sampling error (selection error) = discrepancies between the sample and the population:
different samples will have different results. Sampling error is influenced by sample size:
as the sample gets larger, the distribution of possible sample outcomes gets tighter
around the true population figure.
a. Sampling bias = some members of a target population are less likely to be included in
the sample than others. This is not influenced by sample size, but can be minimised
by taking random samples.
3. Non-response error (selection error) = errors due to missing answers. Non-response
only leads to error when the characteristics of non-respondents differ systematically
from those of respondents. When the non-response is random it is not a threat. High
response rates do not guarantee the absence of non-response error: when a specific
group is not represented there is still a non-response error. Non-response is a greater
problem for web surveys than for other modes. Three types of non-response error:
a. Unit non-response = a sampled unit (e.g. household, individual) does not
participate in the survey
b. Partial non-response = a sampled unit prematurely ends its participation
c. Item non-response = respondents fail to complete an individual item/question
within the survey
Non-response error can be minimized by weighting.
4. Measurement error = a deviation between the answers obtained from the respondents
and the true values: the instrument does not measure what we want it to measure, or
the respondent introduces the error.
Random error relates to reliability = the extent to which a scale produces consistent
results if the measurement is repeated.
Systematic error relates to validity.
Different types of measurement error:
a. Misreporting (validity)
b. Loss in precision (loss in reliability, increase in variance)
c. Satisficing = selecting the first answer category that seems appropriate instead of
reading all the options: choosing the first option that fits, straight-lining, answering
"Don't know" when an internal answer is available.
This may be caused by: design issues (lay-out, response scales, question designs),
interviewer effects, social desirability. Measurement error in web surveys:
- In web surveys the display of the questionnaire might differ between respondents.
- Web surveys can be completed by respondents at any time and place, meaning
that investigators cannot control for environmental factors.
- In web surveys there is no interviewer to guide the response process.
5. Post-survey processing error = errors in processing surveys, such as data entry and
computer errors.
6. Adjustment error = error due to the statistical adjustment of the collected data
Four fundamental concepts to design a good survey are: accurate measurement, high
response rate, high coverage, proper sampling.
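The claim under sampling error, that larger samples give outcomes that lie closer to the true population figure, can be checked with a small simulation. This is a minimal sketch and not part of the lecture material: the population values are made up, and the function name spread_of_sample_means is hypothetical.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, n_samples=200, seed=1):
    """Draw many random samples and return the standard deviation of their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)  # simple random sample
        means.append(sum(sample) / sample_size)
    return statistics.stdev(means)


# Made-up population of 20,000 values between 0 and 100.
pop_rng = random.Random(0)
population = [pop_rng.uniform(0, 100) for _ in range(20_000)]

for n in (10, 100, 1000):
    # The spread of the sample means shrinks as the sample size n grows.
    print(n, round(spread_of_sample_means(population, n), 2))
```

The simulation only covers sampling error under random sampling. Sampling bias would not shrink in the same way, because a biased selection stays biased however many units are drawn.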
Althoff: using iPhone data to determine the average number of steps one takes
- Coverage: only people with phones are included. In some countries more people
have phones than in others; in some countries owning a phone is associated with
status
- Sampling error: only volunteers
- Measurement error: device malfunction, not always carrying your phone with you
- Adjustment error: weighting the data by using demographic data of the population;
however, the demographic characteristics are not relevant for the distance walked,
so you cannot correct for adjustment error
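Weighting by demographic data, as mentioned for non-response error and in the Althoff example, can be sketched as follows. The sketch assumes one simple form of weighting (post-stratification: population share divided by sample share), and all numbers, group labels and function names are invented for illustration; they are not taken from the Althoff study.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Weight per group = share in the population / share in the sample."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}


def weighted_mean(values, groups, weights):
    """Mean of the values, with each value weighted by its group's weight."""
    total_weight = sum(weights[g] for g in groups)
    return sum(v * weights[g] for v, g in zip(values, groups)) / total_weight


# Hypothetical sample: "young" is over-represented (6 of 8 respondents),
# while the assumed population is split 50/50.
groups = ["young"] * 6 + ["old"] * 2
steps = [6000, 6200, 5800, 6100, 5900, 6000, 4000, 4200]
shares = {"young": 0.5, "old": 0.5}

weights = poststratification_weights(groups, shares)
print("unweighted mean:", sum(steps) / len(steps))
print("weighted mean:  ", weighted_mean(steps, groups, weights))
```

In this invented example the weighting changes the estimate only because the two groups differ in their step counts. If the weighting variable is unrelated to the outcome, as the notes say about demographics and distance walked, the weighted and unweighted means stay close together and the adjustment corrects nothing.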
Dimension differences between modes
Dimension differences between modes:
1. Degree of interviewer involvement:
Advantages interviewer involvement: interviewer can provide guidance and explanation.
Low interviewer involvement can lead to non-observation and measurement error, due
to the lack of respondent assistance. Disadvantages interviewer involvement:
I. Interviewer effects
a. Social desirability = respondents giving socially desirable answers rather than
the truth. This can be influenced by the interviewer's gender, age,
interviewing style.
b. Response rates = interviewer's expertise can influence response rates: the higher
the expertise, the higher the response rate
II. High costs
2. Degree of interaction with the respondent: this is highest in face-to-face surveys because
they allow direct contact with verbal and non-verbal cues. Telephone interviews also have a
high degree of interaction, but they only allow verbal cues. In web surveys direct contact
can be completely absent, but it can also be incorporated by including pictures of the
interviewers or the reading of the survey questions by the interviewer.
3. Degree of privacy
4. Channels of communication: usage of aural channels and/or visual channels.
5. Technology: variation in the degree to which computer technology is used. In some
survey modes no technology is used (paper-and-pencil), in others only the interviewer
uses technology (telephone) or both the interviewer and respondent use technology
(web survey).
                          Web    Face-to-face    Telephone    Paper-and-pencil
Interviewer involvement   -      +               +            -
Interaction               +/-    ++              +            -