1. Comment
7 December 2019 at 13:48:58
Median of lower half
2. Comment
7 December 2019 at 13:49:07
Median of upper half
Statistics 1: Description and Inference
Lecture 1 - Distributions, Means and Deviations
Variable: anything that can be measured and can differ across entities or across time
• Independent (x): cause; does not depend on the other variable
• Dependent (y): outcome; changes with x
Levels of measurement:
• Categorical
  • Nominal: no natural order
  • Ordinal: natural order/rank
• Continuous vs discrete:
  • Interval: 0 is arbitrary (e.g. °C)
  • Ratio: 0 is meaningful (e.g. Kelvin)
Frequency distributions:
• Measure of central tendency: central position of data set
  • Mean: average of numbers
    • $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
    • μ: mean of population
    • x̄: mean of sample
    • Sensitive to extreme values/outliers
  • Median: middle score when data is arranged by magnitude
  • Mode: most frequent score
• Measure of dispersion: stretch/squeeze of data set
  • Range: maximum value - minimum value
  • Interquartile range (IQR): range of middle 50%
    • IQR = Q3 - Q1, with Q1 the median of the lower half and Q3 the median of the upper half (comments 1, 2)
  • Deviance: how much the data deviate from the mean
  • Variance: $s^2 = \frac{SS}{N-1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{N-1}$
    • SS: sum of squared errors
  • Standard deviation: $s = \sqrt{\text{variance}}$
    • σ: standard deviation of population
    • s: standard deviation of sample
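The measures of central tendency and dispersion above can be checked by hand on a small data set. The sketch below is only an illustration: the `scores` list is made up, and the quartile rule (leave the middle value out of both halves when n is odd) is one common convention that the notes do not spell out.

```python
import statistics

# Hypothetical sample, chosen only for illustration
scores = [2, 4, 4, 5, 7, 9, 11]
n = len(scores)

# Central tendency
mean = sum(scores) / n                 # x-bar = sum of x_i divided by n
median = statistics.median(scores)     # middle score after sorting
mode = statistics.mode(scores)         # most frequent score

# Dispersion
data_range = max(scores) - min(scores)

# Quartiles as medians of the lower and upper halves
# (assumption: the middle value is excluded from both halves when n is odd)
ordered = sorted(scores)
half = n // 2
q1 = statistics.median(ordered[:half])
q3 = statistics.median(ordered[-half:])
iqr = q3 - q1

# Sum of squared errors, sample variance and standard deviation
ss = sum((x - mean) ** 2 for x in scores)
variance = ss / (n - 1)
sd = variance ** 0.5

print(mean, median, mode, data_range, iqr, ss, variance, round(sd, 3))
```

The hand-computed `variance` and `sd` should agree with `statistics.variance(scores)` and `statistics.stdev(scores)`, which also divide by N - 1.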
3. Comment
7 December 2019 at 15:10:34
Multiple of σ (e.g. 1.96)
4. Comment
7 December 2019 at 14:08:27
Don’t do both smaller or both larger
5. Comment
9 December 2019 at 12:40:50
When categories are not substituted by numbers
6. Comment
7 December 2019 at 14:22:47
Instead of computed as 0.
Missing values are given an arbitrary number (e.g. -8) in Data View, which is identified as a value to be excluded in Variable View.
• Normal distribution: symmetrical distribution where mean = median = mode; allows us to calculate probabilities of outcome values
  • Ranges:
    • 68% within 1σ of μ
    • 95% within 1.96σ of μ
    • 99.7% within 3σ of μ
  • Standardizing normal distribution:
    • Z-score: $z = \frac{x - \bar{x}}{s}$ (comment 3)
    • Refer to table of standard normal distribution to identify probability
  • Finding ranges:
    • If both values are on the same side of the mean, subtract like normal
    • If the values are on opposite sides of the mean, take the larger portion for one value and the smaller portion for the other, then subtract (comment 4)
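As a rough check on the z-score steps, the table lookup can be replaced by the cumulative area of the standard normal distribution. The sketch below assumes Python 3.8+ (for `statistics.NormalDist`). The helper names and the mean of 100 with standard deviation 15 are hypothetical. Subtracting cumulative areas covers both the same-side and the opposite-side case, because the cumulative area is the larger portion for a positive z and the smaller portion for a negative z.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_score(x, mean, sd):
    """Distance of x from the mean, expressed in standard deviations."""
    return (x - mean) / sd


def prob_between(low, high, mean, sd):
    """Probability that a normally distributed value falls between low and high."""
    z_low = z_score(low, mean, sd)
    z_high = z_score(high, mean, sd)
    # Area to the left of z_high minus area to the left of z_low
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


# Hypothetical variable with mean 100 and standard deviation 15
mu, sigma = 100, 15
within_1sd = prob_between(mu - sigma, mu + sigma, mu, sigma)
within_196sd = prob_between(mu - 1.96 * sigma, mu + 1.96 * sigma, mu, sigma)
within_3sd = prob_between(mu - 3 * sigma, mu + 3 * sigma, mu, sigma)

print(round(within_1sd, 3), round(within_196sd, 3), round(within_3sd, 3))
```

The three results should land close to the 68%, 95% and 99.7% ranges listed in the notes.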
7. Comment
7 December 2019 at 14:30:16
Opens up syntax
Necessary to prevent technical issues?
8. Comment
7 December 2019 at 14:30:32
Opens up output/viewer
SPSS 1
Windows:
• Data editor: input data
  • Tabs:
    • Variable View: defining variables (and their characteristics)
      • Type: numeric, string (categorical) (comment 5)
      • Label: full name of variable
      • Values: allows categories to be represented as numbers
      • Missing: identifies values to be excluded from data (comment 6)
      • Measure: scale (interval-ratio), ordinal, nominal
    • Data View: defining values within each variable
• Output/viewer: interpret data (displays graphs, tables, special values)
11. Comment
7 December 2019 at 14:38:50
E.g. deviance
12. Comment
7 December 2019 at 14:34:43
Measured data
13. Comment
7 December 2019 at 14:34:55
Estimated data (from variables)
14. Comment
7 December 2019 at 15:01:51
Where means of samples are their own data values
15. Comment
7 December 2019 at 15:07:55
Most normal.
Interval range in which 95% of sample means fall.
Or there is a 5% chance that the range does not include the population mean
Lecture 2 ??
Statistical models: summarize data (observed) and predict real world (expected)
$outcome_i = (model) + error_i$ (comments 9, 11)
• Combination of variables and parameters (comments 12, 13)
Goodness of fit:
• Tradeoff between simplicity and accuracy
• Mean squared error: $MSE = \frac{SS}{N-1} = \frac{\sum_{i=1}^{n} (outcome_i - model_i)^2}{\text{degrees of freedom}}$
  • Aka variance (more general)
  • Degrees of freedom = N - 1
  • $outcome_i = x_i$
  • $model_i = \bar{x}$
• $outcome_i = b_0 + b_1 x_i + error_i$
  • Linear equation ($y = ax + b + error_i$)
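To make the goodness-of-fit idea concrete, the sketch below treats the mean as the simplest model and computes its sum of squared errors and MSE, which then equals the sample variance. It also fits a straight line for comparison. The data are made up, and the least-squares formulas for b0 and b1 are the standard textbook ones, not something stated in these notes.

```python
import statistics

# Hypothetical paired observations, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(y)


def sum_of_squares(outcomes, predictions):
    """SS: sum over i of (outcome_i - model_i) squared."""
    return sum((o - m) ** 2 for o, m in zip(outcomes, predictions))


# Model 1: the mean predicts every outcome (model_i = y-bar)
y_bar = sum(y) / n
ss_mean = sum_of_squares(y, [y_bar] * n)
mse_mean = ss_mean / (n - 1)        # degrees of freedom = N - 1
variance_check = statistics.variance(y)

# Model 2: straight line, outcome_i = b0 + b1 * x_i + error_i
# (assumption: ordinary least-squares estimates of b0 and b1)
x_bar = sum(x) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
ss_linear = sum_of_squares(y, [b0 + b1 * xi for xi in x])

print(ss_mean, mse_mean, variance_check, round(b0, 2), round(b1, 2), round(ss_linear, 2))
```

A smaller SS for the line than for the mean shows the tradeoff in the notes: the line is less simple (two parameters instead of one) but leaves less error.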
16. Comment
7 December 2019 at 15:09:52
From Z-score.
17. Comment
7 December 2019 at 15:19:34
More prone to produce values far from mean
18. Comment
7 December 2019 at 15:20:40
As N increases, t-distribution more similar to normal distribution.
Sampling:
• Samples: estimate population parameters
  • Allow us to generalize about population
• Sampling distribution: theoretical distribution of sample means across infinitely many samples
  • Central limit theorem: when samples become large, average of sample means = population mean
    • Approximately normally distributed
  • Standard error ($\sigma_{\bar{x}}$): standard deviation of sampling distribution (comment 14)
    • $\sigma_{\bar{x}} = \frac{s}{\sqrt{N}}$
• Confidence interval: range in which true population mean likely exists
  • Format: CI = {lower bound; upper bound}
  • CI = x̄ ± threshold value × σx̄
  • Usually 90%, 95%, 99% (comment 15)
    • Higher CIs are wider ranges
  • Using z-score (sample > 100):
    • 95% CI = x̄ ± 1.96 × σx̄ (comment 16)
    • Central limit theorem allows us to use z-score
  • Using t-distribution (sample < 100)
    • Symmetric/bell-shaped (like normal distribution) but heavier tails (comment 17)
    • Shape depends on degrees of freedom (df = N - 1) (comment 18)
    • CI = x̄ ± $t_{N-1}$ × σx̄
    • $t_{N-1}$ found in table of t-distribution
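The standard error and confidence interval formulas can be tried on a small sample. In this sketch the sample is made up, and `T_TABLE_95` is a hypothetical excerpt of a two-tailed 95% t-table standing in for the printed table the notes refer to. With N = 10 the notes call for the t-distribution; the z-based interval is computed only for comparison.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of N = 10 observations
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
n = len(sample)

x_bar = mean(sample)
s = stdev(sample)            # sample standard deviation (divides by N - 1)
se = s / n ** 0.5            # standard error = s / sqrt(N)

# Threshold value from the standard normal distribution (about 1.96 for 95%)
z_crit = NormalDist().inv_cdf(0.975)

# Assumed excerpt of a two-tailed 95% t-table, keyed by df = N - 1
T_TABLE_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}
t_crit = T_TABLE_95[n - 1]

ci_z = (x_bar - z_crit * se, x_bar + z_crit * se)
ci_t = (x_bar - t_crit * se, x_bar + t_crit * se)

print("SE:", round(se, 3))
print("95% CI (z):", tuple(round(b, 3) for b in ci_z))
print("95% CI (t):", tuple(round(b, 3) for b in ci_t))
```

The t-based interval comes out wider than the z-based one, which matches the heavier tails of the t-distribution at small df.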
SPSS 2