Summary of Statistics IV: a short and concise review
Course
Statistiek voor Psychologen: deel IV (P0P74A)
Institution
Katholieke Universiteit Leuven (KU Leuven)
The most important parts of the Statistics IV course summarized, plus a few important formulas that are not in the formulary. Perfect for reviewing right before your exam, so that all the key points are fresh again!
Seller
evamariedelarbre
Summary of Statistics IV
Chapter 2: The good old one-way ANOVA
ANOVA = analysis of variance
Used to make inferences about means
Analyzing data always starts with exploratory analysis
IOT test = interocular trauma test (pattern in data is so obvious that no further statistical analysis is
needed)
Notation and interpretation:
- Person i in condition j
o i = 1 … mj (mj persons in condition j)
o j = 1 … a (a conditions = levels of a factor)
o Balanced (number of persons across conditions is equal) or unbalanced
Statistical inferences:
1. Models and hypotheses
Full (systematic part, i.e. the population mean muj, plus a random deviation, i.e. noise) = means
can differ across conditions
Reduced (condition means are all equal to each other) (nested in the full model)
a. Parameter estimation (population means = unknown)
Least squares estimation (= minimizes the sum of squared differences between
what is observed and what the model says it should be)
Fitted value = best guess for an observation based on the model
Difference between yij and the estimated mu(j) = difference between an observation and
what the model tells us = residual = eij (the bigger the residuals, the worse the
model)
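Reduced model: yij = mu + eij; fitted value = grand sample average
Full model: yij = muj + eij; fitted value = sample average of condition j (ȳj)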
b. Sum of squares
Single number needed that expresses how large the residuals are =
minimized sum of squares = error sum of squares = residual sum of squares =
SSEred/SSEfull
SSEfull = how much variability is left unexplained under full model, variability
within conditions/groups because considering the differences between
conditions does not imply that all data within a condition are exactly the
same
Total sum of squares = SStot = measures the total variation present in the data
(deviations of the observations from the grand sample average; an index of
the total variability in the sample) = SSEred = variability to be explained
One-way ANOVA: SStot = SSEred
SSEred ≥ SSEfull
SSEff = SSEred – SSEfull = expresses how much we can decrease the error by
considering the different groups (or conditions) (between-condition
variability) (difference between the variability to be explained and the
unexplained variability) (measure of explained variability) (what is the effect of the
full model?)
Problems:
Problem of scaling (squaring) = a sum of squares can only be interpreted
relative to another one
The error sum of squares of the reduced model is always larger than or equal
to that of the full model (full model is more complex -> more flexible -> residuals
smaller). If H0 is true, the difference will be small, but what is small/large?
-> degrees of freedom
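The sums of squares described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function name `sums_of_squares` and the data set are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
from statistics import mean


def sums_of_squares(groups):
    """Return (sse_red, sse_full, sse_ff) for a one-way layout.

    `groups` is a list of lists: one inner list of observations per condition.
    """
    all_obs = [y for g in groups for y in g]
    grand_mean = mean(all_obs)

    # Reduced model: every observation is fitted by the grand sample average.
    sse_red = sum((y - grand_mean) ** 2 for y in all_obs)

    # Full model: every observation is fitted by its own condition average.
    sse_full = 0
    for g in groups:
        group_mean = mean(g)
        sse_full += sum((y - group_mean) ** 2 for y in g)

    # Effect sum of squares: how much error the full model removes.
    return sse_red, sse_full, sse_red - sse_full


# Hypothetical data: a = 3 conditions, 3 persons each (balanced).
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sse_red, sse_full, sse_ff = sums_of_squares(data)
print(sse_red, sse_full, sse_ff)
```

For these numbers SSEred = 60, SSEfull = 6 and SSEff = 54, so most of the variability to be explained lies between the conditions.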
c. Degrees of freedom = complexity of the models
Raw residuals sum to zero (without squaring)
Dfred: n-1
Dffull: n-a
General: number of observations – number of freely estimated parameters
(more parameters = smaller df)
Dfred > dffull
d. Mean squares
Sum of squares / degrees of freedom
Df SSEff = a-1 (= difference between df red and full)
e. Alternative model parameterization with effect parameters
Effect parameters have to sum to zero
alphaj = effect parameter = muj – mu; estimated as ȳj minus the grand sample average
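As a small sketch of the effect parameterization, the estimated effects can be computed as condition averages minus the grand average. The function name `effect_estimates` and the data are hypothetical, and the example assumes a balanced design, where the estimates sum to zero without any weighting.

```python
from statistics import mean


def effect_estimates(groups):
    """Estimated effect parameters: condition average minus grand average."""
    grand_mean = mean([y for g in groups for y in g])
    return [mean(g) - grand_mean for g in groups]


# Hypothetical balanced data: 3 conditions, 3 persons each.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
effects = effect_estimates(data)
print(effects, sum(effects))
```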
2. Choice of the test statistic
o Fit of the model to the data + complexity of model
o Is the decrease in error sum of squares (or fit) of full model large enough to justify its
increase in complexity? If additional number of parameters lowers the error sum of
squares sufficiently, then yes
o SSEff: not scale invariant + model complexity not taken into account
o F statistic = (SSEff / (a-1)) / (SSEfull / (n-a)) (compares systematic with sampling (i.e. random) variability)
o Numerator: systematic differences between conditions + sampling variability
o Denominator: sampling variability
o If the systematic differences increase, the F statistic also increases
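The trade-off between fit and complexity can be made concrete with a short function. This is only a sketch: `f_statistic` is a hypothetical helper, and the input values (SSEred = 60, SSEfull = 6, n = 9, a = 3) are made-up numbers for illustration.

```python
def f_statistic(sse_red, sse_full, n, a):
    """F = mean square of the effect divided by mean square of the error."""
    df_eff = a - 1    # extra parameters of the full model (df_red - df_full)
    df_full = n - a   # observations minus freely estimated means
    ms_eff = (sse_red - sse_full) / df_eff
    ms_full = sse_full / df_full
    return ms_eff / ms_full


# Hypothetical values: 9 observations spread over 3 conditions.
print(f_statistic(sse_red=60, sse_full=6, n=9, a=3))
```

Dividing each sum of squares by its degrees of freedom is what makes the ratio scale invariant and accounts for model complexity.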
3. The sampling distribution of F under H0 and what to conclude
a. Sampling distribution
H0 is true: F distribution with a-1 and n-a df
P value = probability, given H0, to find an equally or more extreme F value
P value is conditional, defined given H0
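To illustrate what "conditional on H0" means, the p value can be approximated by simulation instead of looking it up in the F distribution. The sketch below is a Monte Carlo approximation, not the analytic F test: it repeatedly draws data in which all conditions share the same normal distribution (H0 true) and counts how often the simulated F is at least as large as the observed one. The helper names, the data, the number of simulations and the seed are all assumptions made for the example.

```python
import random


def f_stat(groups):
    """One-way ANOVA F statistic for a list of lists of observations."""
    all_obs = [y for g in groups for y in g]
    n, a = len(all_obs), len(groups)
    grand = sum(all_obs) / n
    sse_red = sum((y - grand) ** 2 for y in all_obs)
    sse_full = 0.0
    for g in groups:
        m = sum(g) / len(g)
        sse_full += sum((y - m) ** 2 for y in g)
    return ((sse_red - sse_full) / (a - 1)) / (sse_full / (n - a))


def simulated_p_value(groups, n_sim=2000, seed=1):
    """Share of H0-simulated F values that are >= the observed F."""
    rng = random.Random(seed)
    f_obs = f_stat(groups)
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_sim):
        # Under H0 every condition has the same mean; F does not depend on
        # the location or scale of the data, so N(0, 1) is sufficient.
        fake = [[rng.gauss(0.0, 1.0) for _ in range(m)] for m in sizes]
        if f_stat(fake) >= f_obs:
            hits += 1
    return hits / n_sim


# Hypothetical data: 3 conditions, 3 persons each.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
p = simulated_p_value(data)
print(f_stat(data), p)
```

The simulated proportion only approximates the tail area of the F distribution with a-1 and n-a df; more simulations give a more precise approximation.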
b. What to conclude?
The smaller p value, the more evidence against H0
c. ANOVA table:
Between groups = SSEff (treatment)
Within groups = SSEfull (residuals) (error)
Total = SStot = SSEred (also: total sample variance · (n-1))
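The rows of the ANOVA table listed above can be assembled as follows. This is a sketch with a hypothetical helper `anova_table` and made-up input values; the column order (source, sum of squares, df, mean square, F) is an assumption about a typical layout.

```python
def anova_table(sse_ff, sse_full, n, a):
    """Rows of a one-way ANOVA table: (source, SS, df, MS, F)."""
    df_eff, df_full = a - 1, n - a
    ms_eff, ms_full = sse_ff / df_eff, sse_full / df_full
    return [
        ("Between groups", sse_ff, df_eff, ms_eff, ms_eff / ms_full),
        ("Within groups", sse_full, df_full, ms_full, None),
        # The total row is the reduced model: SStot = SSEred, df = n - 1.
        ("Total", sse_ff + sse_full, n - 1, None, None),
    ]


# Hypothetical values: SSEff = 54, SSEfull = 6, n = 9, a = 3.
table = anova_table(54, 6, n=9, a=3)
for row in table:
    print(row)
```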
4. Determine the size of your effect
o Reporting effect sizes is crucial! A study of a very small effect with an enormous amount of
data -> very small p value
o = practical significance
a. Biased estimator of the proportion of variance explained: eta^2 (ANOVA) / R^2 (regression)
= ratio of the amount of explained variability to the variability to be explained
= SSEff / SStot
0 ≤ SSEff ≤ SStot -> 0 ≤ eta^2 ≤ 1
BUT: biased estimator of the true proportion of variance explained
(its expected value is greater than 0, even when the true effect is zero)
b. (Nearly) unbiased estimator of the proportion of variance explained: omega^2
Smaller than eta^2
BUT: can become negative (-> set to zero)
Preferred over eta^2
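Both effect sizes can be computed from the ANOVA sums of squares. The notes do not give the formula for omega^2, so the sketch below assumes the usual textbook form, (SSEff - (a-1)·MSfull) / (SStot + MSfull); the function names and input values are hypothetical.

```python
def eta_squared(sse_ff, ss_tot):
    """Explained variability divided by variability to be explained."""
    return sse_ff / ss_tot


def omega_squared(sse_ff, sse_full, n, a):
    """Assumed textbook formula for omega^2, truncated at zero."""
    ms_full = sse_full / (n - a)
    ss_tot = sse_ff + sse_full
    omega2 = (sse_ff - (a - 1) * ms_full) / (ss_tot + ms_full)
    # The raw estimate can be negative; report zero in that case.
    return max(0.0, omega2)


# Hypothetical values: SSEff = 54, SSEfull = 6, n = 9, a = 3.
print(eta_squared(54, 60), omega_squared(54, 6, n=9, a=3))
```

With these numbers omega^2 comes out below eta^2, in line with the remark that it is the smaller of the two.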
c. Remarks on effect sizes: unitless + between 0 and 1 -> what is large/small?
1% = small
6% = medium
14% = large
d. Why not use the F statistic or the p value as a measure of effect size?
F depends on sample size + effect size
e. Uncertainty of effect sizes
Effect sizes are statistics, so they vary from sample to sample and their precision depends on sample size
-> report a CI
Chapter 3: Contrasts, be more specific!
F test: conditions differ, but which conditions? And by how much do they differ?
Contrast = a difference in which the averages of two or more conditions are involved
o Pairwise contrast = simple difference between the averages of two conditions
o Complex contrast = difference between two elements, and one or both elements are
averages of several conditions
Contrast g = linear combination of the sample averages, g = sum over j of cj · ȳj, with coefficients cj that sum to zero
A single planned contrast
- Derivation of the sampling distribution of g
o Distribution of g: if yij is normally distributed, then the sample average ȳj is also
normally distributed -> every linear combination of the sample averages ȳj (= contrast) is also
normally distributed
E(g) = gamma -> g is an unbiased estimator of gamma
Var(g) = variance of a sum = sum of the variances, because the terms are
independent; the variance of a sample average equals the variance of a single
observation divided by the number of observations in that sample average:
Var(g) = sigma^2 · sum over j of cj^2 / mj
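The estimate g and its standard error can be sketched as below. The code assumes that the unknown error variance sigma^2 is estimated by the mean square error of the full model (SSEfull / (n - a)); the function name, the data and the contrast coefficients are hypothetical.

```python
from math import sqrt
from statistics import mean


def contrast_estimate(groups, coefs):
    """Return (g, SE(g)) for a contrast with coefficients `coefs`."""
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    n = sum(len(g) for g in groups)
    a = len(groups)

    # Estimate sigma^2 by the mean square error of the full model.
    sse_full = 0.0
    for g in groups:
        m = mean(g)
        sse_full += sum((y - m) ** 2 for y in g)
    ms_full = sse_full / (n - a)

    # g is a linear combination of the condition averages.
    g_hat = sum(c * mean(g) for c, g in zip(coefs, groups))
    # Var(g) = sigma^2 * sum of cj^2 / mj, with sigma^2 replaced by MSfull.
    var_g = ms_full * sum(c ** 2 / len(g) for c, g in zip(coefs, groups))
    return g_hat, sqrt(var_g)


# Hypothetical complex contrast: condition 1 versus the average of 2 and 3.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
g_hat, se_g = contrast_estimate(data, [1, -0.5, -0.5])
print(g_hat, se_g)
```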
- Statistical inference for gamma
o Confidence interval for gamma
o Hypothesis test for gamma (H0: gamma = C) (if H0 is true, t = (g - C) / SE(g) follows a t distribution with dffull df)
o Effect size: Cohen's d (= difference between two means divided by the estimate of the
corresponding within-group standard deviation)
Around .2 small
.5 medium
.8 large
o Street fighting statistics: if the sample size is large enough (dffull > 30), the t distribution ≈
the normal distribution
CI: g ± 2·SE(g)
Rough hypothesis test: compare the absolute value of the t statistic
with 2 to evaluate significance (alpha = .05)
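The rough rules above, together with Cohen's d for a pairwise contrast, can be sketched as follows. The helpers `rough_inference` and `cohens_d` are hypothetical names, the input numbers are made up, and the sketch assumes that the within-group standard deviation is estimated by the square root of the mean square error of the full model. The ±2 shortcut is only reasonable when dffull is larger than about 30.

```python
from math import sqrt


def rough_inference(g, se, c0=0.0):
    """Approximate t statistic, 95% CI and significance using the value 2."""
    t = (g - c0) / se
    ci = (g - 2 * se, g + 2 * se)
    significant = abs(t) > 2  # rough two-sided test at alpha = .05
    return t, ci, significant


def cohens_d(mean_1, mean_2, ms_full):
    """Difference between two means over the within-group SD estimate."""
    return (mean_1 - mean_2) / sqrt(ms_full)


# Hypothetical values: contrast estimate -4.5 with standard error 0.5,
# and two condition averages (2 and 5) with MSfull = 4.
t, ci, significant = rough_inference(g=-4.5, se=0.5)
print(t, ci, significant)
print(cohens_d(2, 5, ms_full=4))
```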