The most important parts of the Statistiek IV course summarized, plus some key formulas that are not in the formula sheet. Perfect to review just before your exam, so that all the essentials are fresh again!
Summary of Statistiek IV
Chapter 2: The good old one-way ANOVA
ANOVA = analysis of variance
Used to make inferences about means
Analyzing data always starts with an exploratory analysis
IOT test = interocular trauma test (the pattern in the data is so obvious that no further statistical analysis is needed)
Notation and interpretation:
- Person i in condition j
o i = 1 … mj (mj persons in condition j)
o j = 1 … a (a conditions = levels of a factor)
o Balanced (number of persons across conditions is equal) or unbalanced
Statistical inferences:
1. Models and hypotheses
Full model (systematic part, i.e. population mean muj, plus a random deviation, i.e. noise) = the means can differ across conditions
Reduced model (the condition means are all equal to each other; nested in the full model)
a. Parameter estimation (population means = unknown)
Least squares estimation (= minimizes the sum of squared differences between what is observed and what the model says it should be)
Fitted value = best guess for an observation based on the model
Difference between yij and muj = difference between an observation and what the model tells us = residual = eij (the bigger the residual, the worse the model)
Reduced model: yij = mu + eij
Full model: yij = muj + eij
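To make the two models concrete, here is a minimal numerical sketch in Python; the data, group sizes, and variable names are made up for illustration and are not from the course.

import numpy as np

# Illustrative data: a = 3 conditions with (possibly unequal) numbers of persons
groups = [np.array([4.0, 5.0, 6.0]),       # condition j = 1
          np.array([7.0, 6.0, 8.0, 7.0]),  # condition j = 2
          np.array([5.0, 4.0, 6.0])]       # condition j = 3
y = np.concatenate(groups)

# Reduced model: y_ij = mu + e_ij -> one fitted value for everyone, the grand average
fitted_reduced = np.full_like(y, y.mean())

# Full model: y_ij = mu_j + e_ij -> the fitted value is the condition average
fitted_full = np.concatenate([np.full_like(g, g.mean()) for g in groups])

# Residuals = observation minus fitted value (the larger, the worse the model fits)
resid_reduced = y - fitted_reduced
resid_full = y - fitted_full
print(round(resid_full.sum(), 10))   # raw residuals sum to zero (without squaring)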
b. Sum of squares
Single number needed that expresses how large the residuals are =
minimized sum of squares = error sum of squares = residual sum of squares =
SSEred/SSEfull
SSEfull = how much variability is left unexplained under full model, variability
within conditions/groups because considering the differences between
conditions does not imply that all data within a condition are exactly the
same
Total sum of squares = SStot = measures the total variation present in the data (deviation of the observations from the grand sample average; an index of the total variability in the sample) = SSEred = the variability to be explained
One-way ANOVA: SStot = SSEreduced
SSEred is greater than or equal to SSEfull
SSEff = SSEred – SSEfull = expresses how much we can decrease the error by considering the different groups (or conditions) (between-groups variance, i.e. variance between the conditions) (difference between the variability to be explained and the unexplained variability) (a measure of explained variability) (what is the effect of the full model?)
Problems:
Problem of scaling (squaring) = a sum of squares can only be interpreted relative to another one
The error sum of squares of the reduced model is always larger than or equal to that of the full model (full model is more complex -> more flexible -> smaller residuals). If H0 is true the difference will be small, but what is small/large? -> degrees of freedom
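A small Python sketch of the sum-of-squares bookkeeping, using the same made-up data as in the sketch above:

import numpy as np

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 6.0, 8.0, 7.0]),
          np.array([5.0, 4.0, 6.0])]
y = np.concatenate(groups)

# SSEred = SStot: squared deviations of every observation from the grand average
sse_red = ((y - y.mean()) ** 2).sum()

# SSEfull: squared deviations from the condition averages (within-group variability)
sse_full = sum(((g - g.mean()) ** 2).sum() for g in groups)

# SSEff: how much the error drops when the condition means are allowed to differ
sse_eff = sse_red - sse_full

print(round(sse_red, 2), round(sse_full, 2), round(sse_eff, 2))  # 15.6 6.0 9.6
assert sse_red >= sse_full                      # the reduced model never fits better
assert np.isclose(sse_red, sse_full + sse_eff)  # SStot = SSEff + SSEfull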
c. Degrees of freedom = complexity of the models
Raw residuals sum to zero (without squaring)
Dfred: n-1
Dffull: n-a
General: number of observations – number of freely estimated parameters
(more parameters = smaller df)
Dfred > dffull
d. Mean squares
Sum of squares / degrees of freedom
Df SSEff = a-1 (= difference between df red and full)
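A short continuation of the sketch, plugging in the SS values computed above, to show the degrees of freedom and mean squares:

# Degrees of freedom = number of observations - number of freely estimated parameters
a, n = 3, 10                # conditions and total sample size of the sketch above
df_red = n - 1              # reduced model estimates only mu
df_full = n - a             # full model estimates mu_1 ... mu_a
df_eff = df_red - df_full   # = a - 1

# Mean squares = sum of squares / degrees of freedom (SS values from the sketch above)
sse_eff, sse_full = 9.6, 6.0
ms_between = sse_eff / df_eff    # "between groups" mean square
ms_within = sse_full / df_full   # "within groups" mean square
print(df_red, df_full, df_eff, ms_between, ms_within)   # 9 7 2 4.8 0.857...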
e. Alternative model parameterization with effect parameters
Effect parameters have to sum to zero
Alpha = estimated effect parameter = muj – mu
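A tiny illustration of the effect parameterization, assuming the made-up group means from the sketch above and the unweighted sum-to-zero side constraint:

import numpy as np

group_means = np.array([5.0, 7.0, 5.0])   # estimated mu_j from the sketch above
mu = group_means.mean()                   # grand mean under the sum-to-zero constraint
alpha = group_means - mu                  # estimated effect parameters alpha_j = mu_j - mu
print(alpha, round(alpha.sum(), 10))      # the effects sum to (numerically) zero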
2. Choice of the test statistic
o Fit of the model to the data + complexity of model
o Is the decrease in error sum of squares (or fit) of full model large enough to justify its
increase in complexity? If additional number of parameters lowers the error sum of
squares sufficiently, then yes
o SSEff: not scale invariant + model complexity not taken into account
o F statistic (combines systematic and sampling (i.e. random) variability)
o Numerator: systematic differences between conditions + sampling variability
o Denominator: sampling variability
o If the systematic differences increase, the F statistic also increases
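Continuing the illustrative numbers from the sketches above, the F statistic is simply the ratio of the two mean squares:

# F = (systematic differences + sampling variability) / sampling variability
#   = MS between / MS within, with the mean squares from the sketch above
ms_between, ms_within = 4.8, 6.0 / 7.0
F = ms_between / ms_within
print(round(F, 3))   # 5.6: the drop in error per extra parameter, relative to the leftover noise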
3. The sampling distribution of F under H0 and what to conclude
a. Sampling distribution
If H0 is true: F distribution with a-1 and n-a df
P value = the probability, given H0, of finding an equally or more extreme F value
The p value is conditional: it is defined given H0
b. What to conclude?
The smaller the p value, the more evidence against H0
c. ANOVA table:
Between groups = SSEff (treatment)
Within groups = SSEfull (residuals) (error)
Total = SStot = SSEred (also: total sample variance · (n-1))
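A self-contained sketch that assembles these ANOVA table pieces and computes the p value; the cross-check with scipy.stats.f_oneway assumes SciPy is available, and the data are still the made-up example:

import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 6.0, 8.0, 7.0]),
          np.array([5.0, 4.0, 6.0])]
y = np.concatenate(groups)
a, n = len(groups), len(y)

ss_total = ((y - y.mean()) ** 2).sum()                        # SStot = SSEred
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)  # SSEfull
ss_between = ss_total - ss_within                             # SSEff

F = (ss_between / (a - 1)) / (ss_within / (n - a))
p = stats.f.sf(F, a - 1, n - a)       # P(F >= observed value | H0), upper-tail probability

# Cross-check against the ready-made one-way ANOVA in SciPy: F and p should agree
F_check, p_check = stats.f_oneway(*groups)
print(round(F, 3), round(p, 4), round(F_check, 3), round(p_check, 4))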
4. Determine the size of your effect
o Reporting effect sizes is crucial! Studies with a very small effect but an enormous amount of data can still give a very small p value
o = practical significance
a. Biased estimator of the proportion of variance explained: eta^2 (anova) / R^2 (regr)
= ratio of amount of explained variability over variability to be explained
0 < SSEff < SStot -> 0 < eta^2 < 1
BUT: biased estimator of the true proportion of variance explained (its expected value is larger than 0, even when the true effect is zero)
b. Unbiased estimator of the proportion of variance explained: w^2
Smaller than eta^2
BUT: can become negative (in that case it is set to zero)
Preferred over eta^2
c. Remarks on effect sizes: unitless + between 0 and 1 -> what is large/small?
1% = small
6% = medium
14% = large
d. Why not use the F statistic or p value as a measure of effect size?
F depends on sample size + effect size
e. Uncertainty of effect sizes
Effect sizes are statistics, so depend on sample size
CI
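A small sketch of both effect-size estimators, using the illustrative SS values from above; the omega^2 formula used here is the usual one for one-way ANOVA (assumed, not quoted from the course):

# Effect sizes from the ANOVA quantities of the sketch above (illustrative numbers)
ss_between, ss_within, ss_total = 9.6, 6.0, 15.6
a, n = 3, 10
ms_within = ss_within / (n - a)

eta_sq = ss_between / ss_total                                    # biased upwards
omega_sq = (ss_between - (a - 1) * ms_within) / (ss_total + ms_within)
omega_sq = max(omega_sq, 0.0)                                     # truncate negative values to zero
print(round(eta_sq, 3), round(omega_sq, 3))                       # omega^2 is smaller than eta^2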
Chapter 3: Contrasts, be more specific!
F test: the conditions differ, but which conditions? And how much do they differ?
Contrast = a difference in which the averages of two or more conditions are involved
o Pairwise contrast = simple difference between the averages of two conditions
o Complex contrast = difference between two elements, and one or both elements are
averages of several conditions
Contrast = linear combination of sample averages, such that the coefficients sum to zero (cj)
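A minimal illustration of pairwise versus complex contrasts; the coefficients and group means are made up:

import numpy as np

group_means = np.array([5.0, 7.0, 5.0])    # sample averages of a = 3 conditions

c_pairwise = np.array([1.0, -1.0, 0.0])    # pairwise contrast: condition 1 vs condition 2
c_complex = np.array([-0.5, 1.0, -0.5])    # complex contrast: condition 2 vs the average of 1 and 3

for c in (c_pairwise, c_complex):
    assert np.isclose(c.sum(), 0.0)        # contrast coefficients must sum to zero
    print(c @ group_means)                 # g = sum_j c_j * ybar_j, the estimated contrast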
A single planned contrast
- Derivation of the sampling distribution of g
o Distribution of g: if yij is normally distributed, then the sample average ybarj is also normally distributed -> every linear combination of the sample averages ybarj (= a contrast) is also normally distributed
E(g) = gamma -> g is an unbiased estimator of gamma
Var(g) = variance of a sum = sum of the variances, because the terms are independent; the variance of a sample average equals the variance of a single observation divided by the number of observations in that sample average
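A sketch of the estimate g and its standard error, assuming the pooled within-group variance (MS within) is used to estimate sigma^2; data and contrast are the made-up example:

import numpy as np

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 6.0, 8.0, 7.0]),
          np.array([5.0, 4.0, 6.0])]
c = np.array([-0.5, 1.0, -0.5])                    # complex contrast, coefficients sum to zero

means = np.array([g.mean() for g in groups])
m = np.array([len(g) for g in groups])             # persons per condition
n, a = m.sum(), len(groups)

g_hat = c @ means                                  # estimate of gamma; E(g) = gamma

# Var(g) = sum_j c_j^2 * sigma^2 / m_j, with sigma^2 estimated by MS within
ms_within = sum(((grp - grp.mean()) ** 2).sum() for grp in groups) / (n - a)
se_g = np.sqrt(ms_within * (c ** 2 / m).sum())
print(round(float(g_hat), 3), round(float(se_g), 3))   # roughly 2.0 and 0.6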
- Statistical inference for gamma
o Confidence interval for gamma
o Hypothesis test for gamma (H0: gamma = C) (if H0 is true -> the t statistic follows a t distribution with dffull degrees of freedom)
o Effect size: Cohen's d (= the difference between two means divided by the estimate of the corresponding within-group standard deviation)
Around .2 small
.5 medium
.8 large
o Street-fighting statistics: if the sample size is large enough (dffull > 30), the t distribution ≈ the normal distribution
CI: roughly g ± 2·SE(g)
Rough hypothesis test: compare the absolute value of the t statistic with 2 to evaluate significance (alpha = .05)
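A final sketch pulling the contrast inference together (t test, exact and rough CI, Cohen's d); the numbers are the illustrative values from the previous sketch:

import numpy as np
from scipy import stats

# Illustrative values from the contrast sketch above
g_hat, se_g, ms_within = 2.0, 0.598, 6.0 / 7.0
df_full = 7                                        # n - a = 10 - 3

t_stat = (g_hat - 0.0) / se_g                      # H0: gamma = C with C = 0
p = 2 * stats.t.sf(abs(t_stat), df_full)           # two-sided p value

t_crit = stats.t.ppf(0.975, df_full)
ci = (g_hat - t_crit * se_g, g_hat + t_crit * se_g)     # exact 95% CI
rough_ci = (g_hat - 2 * se_g, g_hat + 2 * se_g)         # street-fighting version: +/- 2 SE

d = g_hat / np.sqrt(ms_within)                     # Cohen's d: difference / within-group SD
print(round(t_stat, 2), round(p, 4), ci, rough_ci, round(float(d), 2), abs(t_stat) > 2)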