1. INTRODUCTION, DATA EXPLORATION AND VISUALIZATION
Total error What you observe = true value + sampling error + measurement error + statistical
framework error (the error terms themselves are not observed). If the errors are handled badly, your results will be biased
Statistics Characteristics of the sample (estimates the parameters)
Parameters Characteristics of the population
Coverage error Going from target population (voters) to frame population (everyone with a telephone)
Sampling error Going from frame population (everyone with a telephone) to sample population (random digit dialing)
Non-response error Going from sample population (random digit dialing) to respondents (those who accept the call), the biggest error
Post-stratification weights Making your sample close to your population (e.g. when the population is 50% female
and the sample 80% female) by weighting, to arrive at a better sample and outcome
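The 50%/80% example can be worked through in a few lines. This is a minimal sketch, assuming the usual weight of population share divided by sample share; the function names and the response scores are hypothetical and only serve the illustration.

```python
def poststratification_weights(population_share, sample_share):
    """Weight per group = population share / sample share."""
    return {g: population_share[g] / sample_share[g] for g in population_share}


def weighted_mean(values, groups, weights):
    """Mean of values where each observation counts with its group weight."""
    total_weight = sum(weights[g] for g in groups)
    return sum(v * weights[g] for v, g in zip(values, groups)) / total_weight


population = {"female": 0.5, "male": 0.5}   # shares in the population
sample = {"female": 0.8, "male": 0.2}       # shares in the sample
weights = poststratification_weights(population, sample)

# Hypothetical responses: 8 women answering 4.0 and 2 men answering 2.0.
values = [4.0] * 8 + [2.0] * 2
groups = ["female"] * 8 + ["male"] * 2

unweighted = sum(values) / len(values)             # dominated by the women
weighted = weighted_mean(values, groups, weights)  # both groups count 50%
print(weights, unweighted, weighted)
```

Women are weighted down (below 1) and men are weighted up (above 1), so the weighted mean moves toward what a 50/50 sample would have given.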
Measurement scales Non-metric and metric (continuous), right statistical technique depends on this
Non-metric Nominal (categorical) and ordinal -> outcomes can be categorical (labels) or
directional (only measure direction of response, e.g. yes/no)
Metric Interval and ratio -> continuous scales measure not only direction or
classification, but intensity as well (e.g. strongly agree or somewhat agree)
Nominal Number only serves as a label for identifying objects in mutually exclusive (not at the
same time) and collectively exhaustive (at least one) categories (e.g. SNR, gender)
Ordinal Numbers are assigned to objects to indicate relative positions of characteristics
of objects, but not magnitude of difference between them (e.g. preference, ranks)
Interval Numbers are assigned to objects to indicate relative positions of some
characteristics of objects with differences between objects being comparable,
zero point is arbitrary (e.g. Likert scale, temperature Fahrenheit/Celsius)
Ratio Most precise scale, absolute zero point, has all advantages of other scales (e.g.
weight, height, age, income, temperature Kelvin)
Summated scales Measuring attitudes/feelings/beliefs that are more abstract and difficult to measure than age
and income (multiple questions to capture everything and reduce measurement error)
Validity and reliability Validity: does it measure what it is supposed/wanted to measure? Does it make sense?
Reliability: is the outcome stable? Do results change when the measurement is repeated or the items are varied?
Statistical error (hypothesis testing) Two outcomes: fail to reject null and reject null. Two types of error:
- Type I: in reality nothing is going on (null true) but the data show something is
going on (reject null), false positive (doctor says a man is pregnant, but it is not true)
- Type II: in reality something is going on (null false) but the data show nothing is going on
(fail to reject null), false negative (a woman is pregnant, but the doctor says she is not);
controlled by setting power (power = 1 - probability of a Type II error)
P-value (alpha) Probability of the observed data/statistic (or something more extreme) given that the null hypothesis is true,
so what is the chance that we found the data that we did if the null is true in reality; reject the null when p < alpha (usually 0.05)
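To make the decision rule concrete, here is a minimal sketch that turns a test statistic into a two-sided p-value and compares it with alpha. It assumes the statistic follows a standard normal distribution under the null (a z-test), which is an assumption made for the illustration; the value of `z` is hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance level: accepted probability of a Type I error


def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. assuming the null is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


z = 2.5                      # hypothetical observed test statistic
p = two_sided_p_value(z)
reject_null = p < ALPHA      # reject only if the data are unlikely under the null
print(round(p, 4), reject_null)
```

A small p-value does not say the null is false with certainty; it says data this extreme would be rare if the null were true.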
Exploration Always explore data before running any model (recode missings, reverse code
negatively worded questions, check the range of variables, check mutual consistency)
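The first three exploration steps can be sketched as small helper functions. The missing-value code 99, the 1 to 5 answer scale and all function names are assumptions chosen for the example, not part of the original notes.

```python
MISSING_CODE = 99  # hypothetical code used for "no answer" in the raw data


def recode_missing(values, missing_code=MISSING_CODE):
    """Replace the missing-value code by None so it is not treated as a score."""
    return [None if v == missing_code else v for v in values]


def reverse_code(values, low=1, high=5):
    """Reverse a negatively worded item: low becomes high and vice versa."""
    return [None if v is None else low + high - v for v in values]


def out_of_range(values, low=1, high=5):
    """Return the non-missing values that fall outside the allowed range."""
    return [v for v in values if v is not None and not low <= v <= high]


raw = [1, 5, 99, 2, 7]
clean = recode_missing(raw)                    # 99 becomes None
problems = out_of_range(clean)                 # 7 is not a valid answer
reversed_item = reverse_code([1, 5, None, 2])  # 1 <-> 5, 2 <-> 4
print(clean, problems, reversed_item)
```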
Visualization Exploration, understanding/making sense of data, communicating results (charts)
2. ANOVA
1. Defining objectives Test if there are differences in the mean of a metric (interval/ratio) dependent
variable across different levels of one or more non-metric (nominal/ordinal)
independent variables (‘factors’), one-way/two-way ANOVA (experiments)
2. Designing study 2.1 Sample size
Determine effect size from previous literature or using Cohen’s f
Signal = variation between groups
Noise = variation within a group
The smaller the effect, the larger the sample needs to be and vice versa
Sensitivity analysis: how does my required sample size change if the effect size changes?
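The signal/noise idea behind Cohen's f can be sketched as the spread of the group means divided by the within-group standard deviation. This is a minimal sketch assuming equal group sizes and a common within-group standard deviation; the group means, the value 4.0 and the helper names are hypothetical, and the cut-offs 0.10, 0.25 and 0.40 are Cohen's conventional benchmarks for small, medium and large effects.

```python
from statistics import pstdev


def cohens_f(group_means, within_sd):
    """Signal (SD of the group means) divided by noise (within-group SD)."""
    return pstdev(group_means) / within_sd


def label_effect(f):
    """Cohen's conventional benchmarks; 'negligible' is a label added here."""
    if f >= 0.40:
        return "large"
    if f >= 0.25:
        return "medium"
    if f >= 0.10:
        return "small"
    return "negligible"


# Hypothetical expected group means from earlier literature, common SD of 4.
f_wide = cohens_f([10, 12, 14], within_sd=4.0)
f_narrow = cohens_f([10, 11, 12], within_sd=4.0)
print(round(f_wide, 3), label_effect(f_wide))
print(round(f_narrow, 3), label_effect(f_narrow))
```

Halving the distance between the group means halves f, which is why a smaller expected effect calls for a larger sample to reach the same power.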
2.2 Interactions
Interaction: the effect of one variable on the DV depends on another variable
(interaction effect), interaction between IVs (treatment/categorical variables)
2.3 Use covariates (control variables) by doing ANCOVA
Covariates affect DV separately from treatment variables (IVs), requirements:
- Pre-measure (before outcome, otherwise they may intervene)
- Independent of treatment
- Limited number: fewer than (0.1 * # observations) - (# populations - 1)
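The limit on the number of covariates is simple arithmetic. The sketch below reads "# populations" as the number of treatment groups, which is an assumption; the function name and the numbers are hypothetical.

```python
def max_covariates(n_observations, n_groups):
    """Upper bound: (0.1 * number of observations) - (number of groups - 1)."""
    return 0.1 * n_observations - (n_groups - 1)


# Hypothetical study: 100 respondents spread over 3 treatment groups.
limit = max_covariates(n_observations=100, n_groups=3)
print(limit)  # the number of covariates should stay below this bound
```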
3. Checking assumptions 3.1 Independence (most important)
Are the observations independent? -> when there is no pattern in the plot
Affects your estimates and standard errors
- “Between-subjects” design: each unit of analysis (row, respondent) sees
only one combination of IVs
- “Within-subjects” design: each unit of analysis sees all treatments
(counterbalance order of treatments, allow differences)
3.2 Equality of variance/homoscedasticity (Levene’s test)
Is the variance equal across treatment groups? -> do not reject null (p > 0.05)
Affects your standard errors
What if homoscedasticity is rejected?
1. If sample size is similar across treatment groups -> ANOVA robust (ok)
2. Transform dependent variable (e.g. take log(DV)) -> redo test
3. Add covariate -> ANCOVA, redo test
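Levene's test can be sketched as a one-way ANOVA on the absolute deviations of each observation from its own group mean. The code below is a sketch under that definition using only the standard library; it returns the test statistic only, and the two groups are hypothetical data. The p-value requires the F distribution with (k - 1, N - k) degrees of freedom, which the standard library does not provide.

```python
from statistics import mean


def levene_statistic(groups):
    """Levene's W: F-ratio computed on absolute deviations from group means."""
    abs_dev = []
    for g in groups:
        centre = mean(g)
        abs_dev.append([abs(y - centre) for y in g])

    k = len(abs_dev)                            # number of groups
    n_total = sum(len(z) for z in abs_dev)      # total number of observations
    grand = mean(v for z in abs_dev for v in z)

    between = sum(len(z) * (mean(z) - grand) ** 2 for z in abs_dev)
    within = sum((v - mean(z)) ** 2 for z in abs_dev for v in z)
    return (between / (k - 1)) / (within / (n_total - k))


# Hypothetical DV scores: same mean, clearly different spread.
tight = [4, 5, 6, 5, 5]
wide = [1, 9, 5, 2, 8]
w = levene_statistic([tight, wide])
print(w)
```

A large W relative to the critical F value means the null of equal variances is rejected. If SciPy is available, `scipy.stats.levene` returns both the statistic and the p-value; note that it centres on the median by default, so `center='mean'` is needed to match this sketch.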
3.3 Normality (least important)
Is the DV approximately normally distributed? -> do not reject null (p > 0.05)
Affects your standard errors only if sample is small
What if normality is rejected?
1. Large sample -> ANOVA robust
2. Small sample -> transform DV to make distribution more symmetric
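How a transformation makes a distribution more symmetric can be checked with the sample skewness before and after taking logs. This is a minimal sketch with a hypothetical right-skewed DV; skewness is computed as the mean of the cubed standardized values, and it is a descriptive check, not a formal normality test.

```python
from math import log
from statistics import mean, pstdev


def skewness(values):
    """Mean cubed z-score: 0 for a symmetric sample, > 0 for a right tail."""
    m = mean(values)
    s = pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)


dv = [1, 2, 4, 8, 16, 32, 64]      # hypothetical right-skewed DV (all > 0)
log_dv = [log(v) for v in dv]      # log transform pulls in the long right tail

skew_before = skewness(dv)
skew_after = skewness(log_dv)
print(round(skew_before, 2), round(skew_after, 2))
```

The log is only defined for positive values, so a DV containing zeros or negatives needs a different transformation or a shift first.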
4. Estimating model Calculation of the F-value (is variation between groups larger than within groups?):
F = mean sum of squares between groups (MSSB) / mean sum of squares within groups (MSSW)
Large F (high signal/low noise): reject null of no differences across groups
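The F-value for a one-way ANOVA can be computed by hand from the group data. This is a minimal sketch using only the standard library; the three groups are hypothetical, and the degrees of freedom are k - 1 between groups and N - k within groups.

```python
from statistics import mean


def one_way_f(groups):
    """Return (MSSB, MSSW, F) for a one-way ANOVA on a list of groups."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total number of observations
    grand_mean = mean(y for g in groups for y in g)

    # Signal: how far the group means lie from the grand mean.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Noise: how far observations lie from their own group mean.
    ssw = sum((y - mean(g)) ** 2 for g in groups for y in g)

    mssb = ssb / (k - 1)
    mssw = ssw / (n_total - k)
    return mssb, mssw, mssb / mssw


# Hypothetical DV scores for three treatment groups.
mssb, mssw, f_value = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(mssb, mssw, f_value)
```

Here the group means (2, 3 and 6) differ much more than the observations vary within each group, so F is large; whether it is large enough to reject the null depends on the F distribution with (k - 1, N - k) degrees of freedom.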