Lecture 1 - Introduction, Data Exploration, and Visualization
What you observe = True value + Sampling error + Measurement error + Statistical error
→ If any of these is messed up, results are biased and recommendations are wrong
Statistics estimate parameters
→ Statistics: characteristics of the sample
→ Parameters: characteristics of the population
Target population (voters) → Coverage error → Frame population (everyone with a telephone) →
Sampling error → Sample population (random digit dialing) → Non-response error → Respondents (those
who accept the call)
Post-stratification weights: make the sample closer to the population
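A minimal sketch of how post-stratification weights could be computed, assuming a single stratification variable (a hypothetical age group) and made-up population shares and sample counts. Each group's weight is its population share divided by its sample share.

```python
# Hypothetical population shares and sample counts per age group (illustrative numbers only).
population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
sample_counts = {"18-34": 100, "35-54": 150, "55+": 250}

n = sum(sample_counts.values())

# Weight = population share / sample share, so under-represented groups get weights above 1.
weights = {
    group: population_share[group] / (sample_counts[group] / n)
    for group in sample_counts
}

# After weighting, each group's share of the weighted sample equals its population share.
weighted_share = {
    group: weights[group] * sample_counts[group] / n for group in sample_counts
}

for group in sample_counts:
    print(group, round(weights[group], 2), round(weighted_share[group], 2))
```

Here the youngest group is under-represented (20% of the sample vs. 30% of the population), so its respondents count 1.5 times; the oldest group is over-represented and is weighted down to 0.7.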
Non-metric scales: outcomes are categorical (labels) or directional; they can only measure the
direction of the response (yes/no)
→ Nominal scale: numbers serve only as labels or tags for identifying or classifying objects in mutually
exclusive and collectively exhaustive categories (SNR, gender)
→ Ordinal scale: numbers are assigned to objects to indicate the relative positions of some characteristic of
objects, but not the magnitude of difference between them (brand preference ranking)
Metric (continuous) scales: not only measure the direction or classification, but the intensity as
well (strongly agree, somewhat disagree)
→ Interval scale: numbers are assigned to objects to indicate the relative positions of some characteristic of
objects with differences between objects being comparable; zero point is arbitrary (Likert scale, satisfaction
scale, perceptual constructs, temperature (Fahrenheit/Celsius))
→ Ratio scale: most precise scale; absolute zero point (weight, height, age, income, temperature (Kelvin))
In summated scales (satisfaction with purchase experience, Likert scale), more than one question
is needed to capture all facets (to reduce measurement error).
Validity: does it measure what it’s supposed to measure?
→ (Face) validity: do these coefficients make sense? (do the effect sizes and signs give
plausible model results?)
Reliability: is it stable?
→ How much do these results change if …
→ we add additional control variables to the model
→ we take away some observations (outliers)
→ we estimate the same model on a new dataset
Type I error: null is falsely rejected (false positive)
Type II error: null is falsely retained, i.e. not rejected although it is false (false negative)
p-value: probability of the observed data or statistic (or more extreme) given that the null
hypothesis is true (not a good measure of evidence)
Data preparation: explore data before running any model
→ Recode missing observations (9999=missing)
→ Reverse code negatively worded questions
→ Check that variables have the correct range/are not invalid
→ Check mutual consistency (age=18, date of birth=4/30/1901)
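The four checks above can be sketched as follows. This is a minimal illustration on made-up records, not a general cleaning routine: the field names (`age`, `birth_year`, `q1`, `q2`), the 1-7 answer scale, the survey year, and the choice of `q2` as the negatively worded item are all assumptions.

```python
# Hypothetical survey records; 9999 is the missing-value code, q2 is a negatively worded 1-7 item.
records = [
    {"id": 1, "age": 34, "birth_year": 1990, "q1": 6, "q2": 2},
    {"id": 2, "age": 9999, "birth_year": 1985, "q1": 5, "q2": 9999},
    {"id": 3, "age": 18, "birth_year": 1901, "q1": 8, "q2": 7},
]
SURVEY_YEAR = 2024  # assumed year of data collection
MISSING_CODE = 9999
SCALE_MIN, SCALE_MAX = 1, 7


def clean(record):
    row = dict(record)
    # 1. Recode the missing-value code to None so it never enters a mean or a model.
    for key, value in row.items():
        if value == MISSING_CODE:
            row[key] = None
    # 2. Reverse code the negatively worded item: 1 <-> 7, 2 <-> 6, ...
    if row["q2"] is not None:
        row["q2"] = SCALE_MIN + SCALE_MAX - row["q2"]
    problems = []
    # 3. Flag answers outside the valid range.
    for item in ("q1", "q2"):
        if row[item] is not None and not SCALE_MIN <= row[item] <= SCALE_MAX:
            problems.append(f"{item} out of range")
    # 4. Flag mutually inconsistent values (allowing one year of slack for birthdays).
    if row["age"] is not None and row["birth_year"] is not None:
        if abs((SURVEY_YEAR - row["birth_year"]) - row["age"]) > 1:
            problems.append("age inconsistent with birth year")
    row["problems"] = problems
    return row


cleaned = [clean(r) for r in records]
for row in cleaned:
    print(row)
```

Flagging problems instead of silently dropping rows keeps the decision about what to do with them (correct, set to missing, exclude) explicit.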
Data visualization: explore the data, understand/make sense of the data, communicate results
Choosing the right chart type
→ Showing the composition or distribution of one variable
→ Comparing data points or variables across multiple subunits
Lecture 2 - ANOVA
Step 1: Defining Objectives
ANOVA: testing if there are differences in the mean of a metric DV across different levels of one or
more non-metric IVs
“How much do you like this ad? 1-2-3-4-5-6-7” → Interval scale as it has no natural zero point; a
scale from -3 to +3 wouldn’t have made a difference
ANOVA allows for more than 2 levels; a t-test doesn’t (it handles only 1 IV with 2 levels)
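To make the idea concrete, here is a minimal sketch of the one-way ANOVA F statistic computed by hand for one IV with three levels. The ad-liking scores are made up, and the function name `one_way_anova_f` is only illustrative.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of metric DV values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far the group means lie from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ad-liking scores (1-7 scale) for three ad versions.
ad_a = [1, 2, 3]
ad_b = [2, 3, 4]
ad_c = [5, 6, 7]
f_stat, df1, df2 = one_way_anova_f([ad_a, ad_b, ad_c])
print(f_stat, df1, df2)
```

For these scores the F statistic is about 13 on (2, 6) degrees of freedom. With only two groups, the same F equals the square of the pooled two-sample t statistic, which is why the t-test is the special case with 1 IV and 2 levels.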
Step 2: Designing The ANOVA
                          Reality
                          Null        Alternative
Decision   Null           1-α         β
           Alternative    α           1-β
p-value: probability of getting data/a statistic that is as extreme or more extreme if the null
hypothesis is true
→ If the null is true in reality, what is the chance that we see the current data (or data even further away
from what would be expected under the null)?
→ If the p-value is low, the data are unlikely under the null, and the null can be rejected (rejecting only at
low p-values keeps the chance of a type I error low)
→ For a type I error, an error rate of 5% is typically allowed (α=0.05, reject the null if p-value < α)
→ For a type II error, an error rate of 20% is typically allowed
→ Power of a study (1 - P(Null | Alt) = 1 - β) is typically set to 0.8
→ In 80% of the cases when the null is not true, you can correctly reject it
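The link between α and the type I error rate can be checked with a small simulation. This is a sketch under stated assumptions: a two-sided z-test on a sample mean with known standard deviation, 30 observations per study, and 2000 simulated studies in which the null is true by construction.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is reproducible

ALPHA = 0.05
N = 30            # observations per simulated study (assumed)
SIGMA = 1.0       # known standard deviation (assumed, so a z-test applies)
N_STUDIES = 2000


def two_sided_p_value(sample, mu0=0.0, sigma=SIGMA):
    # z-statistic for H0: population mean = mu0, with known sigma.
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    # Probability of a z at least this extreme in either direction if H0 is true.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Generate studies in which the null is true (the population mean really is 0).
rejections = 0
for _ in range(N_STUDIES):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    if two_sided_p_value(sample) < ALPHA:
        rejections += 1

type_i_rate = rejections / N_STUDIES
print(type_i_rate)
```

Because every rejection here is a false rejection, the printed share should land close to α = 0.05, up to simulation noise.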
Power depends on
→ Effect size
→ Sample size
→ α (typically fixed)
Thus, for a large effect, a small sample is sufficient to find the effect, and for a small effect, you
need a large sample to find the effect.
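The trade-off between effect size and sample size can be illustrated with a rough calculation. The sketch below assumes the simplest case, a two-sided comparison of two equally sized groups, and uses a normal approximation; `d` is the standardized mean difference (difference in means divided by the common standard deviation), a symbol introduced here for illustration. Exact t-test or ANOVA calculations give slightly larger sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_groups(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-group comparison (normal approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected z-statistic under the alternative
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(d, power=0.8, alpha=0.05):
    """Approximate n per group needed for the desired power (same approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


for d in (0.2, 0.5, 0.8):
    n = n_per_group_for_power(d)
    print(d, n, round(power_two_groups(d, n), 3))
```

Under this approximation, a standardized difference of 0.8 needs about 25 observations per group for 80% power, 0.5 needs about 63, and 0.2 needs about 393.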
Step 2.1: Sample Size
Inputs to determine sample size
→ Effect size
→ Desired power
→ Alpha (α)
Cohen’s f (signal-to-noise ratio) = Standard deviation of group means / Common (within-group) standard
deviation = Signal / Noise (not important, only to illustrate)
→ f=0.1 is a small effect, f=0.25 is a medium effect, f=0.4 is a large effect by Cohen’s conventions (mostly small to medium)
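A minimal sketch of the ratio above, assuming equally sized groups and made-up values for the group means and the common standard deviation. The spread of the group means is taken as a population standard deviation (dividing by the number of groups).

```python
from statistics import pstdev


def cohens_f(group_means, common_sd):
    """Cohen's f = SD of the group means / common within-group SD (equal group sizes assumed)."""
    return pstdev(group_means) / common_sd


# Hypothetical group means for three groups and a common standard deviation of 8.
f = cohens_f([10, 12, 14], common_sd=8)
print(round(f, 3))
```

With these numbers f is about 0.204, i.e. between the small (0.1) and medium (0.25) benchmarks.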