W1.1
CHAPTER 12.1-12.7 (FIELD) – COMPARING SEVERAL INDEPENDENT MEANS
Analysis of variance/ANOVA: The inferential method for comparing means of several groups
- Allows for a comparison of more than two groups at the same time to determine
whether a relationship exists between them
- F statistic/F-ratio: The result of the ANOVA formula – Allows for the analysis of
multiple groups of data to determine the variability between samples and within
samples
- Factors: Categorical explanatory variables in multiple regression and in ANOVA
- If groups truly differ, the between-group variability must be larger than the within-
group variability!
Why we cannot use multiple T-tests for more than two groups → Multiple comparisons
problem: Conducting multiple T-tests without adjusting for multiple comparisons increases
the chance of a Type I Error (i.e. a false positive)
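The inflation described above can be made concrete with a small sketch. Assuming each t-test uses alpha = 0.05 and the tests are independent (a simplifying assumption), the probability of at least one false positive across k comparisons is 1 − (1 − alpha)^k:

```python
# Sketch: familywise Type I error rate for k unadjusted, independent
# t-tests, each at alpha = 0.05 (independence is a simplifying assumption).
def familywise_error(n_comparisons, alpha=0.05):
    """P(at least one false positive) across n independent tests."""
    return 1 - (1 - alpha) ** n_comparisons

# Three groups -> 3 pairwise comparisons; five groups -> 10 comparisons.
print(round(familywise_error(3), 3))   # 0.143, well above 0.05
print(round(familywise_error(10), 3))  # 0.401
```

Even with only three groups the familywise error rate almost triples, which is why a single omnibus ANOVA is preferred over repeated t-tests.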
ANOVA → One categorical predictor variable (with two or more groups) and one continuous
outcome variable
- Variance = Comparing the between-groups variance with the within-groups variance
- If the means differ → the F-statistic will be > 1 → the P-value will likely be small
- If the means do not differ → the F-statistic will be < 1 → the P-value will likely be large
Total sum of squares: One model for all data points – Represents the total variation in the
dependent variable from its mean (= SSM + SSE), regardless of the group from which the
scores come
Model sum of squares (SSM): How much of the variation can be explained by the model –
Difference between the grand mean and the group means – How much improvement there is
when looking at the group means instead of the grand mean = Between-groups variability
- SSM = n[(x̄1 − x̄)² + (x̄2 − x̄)² + ⋯ + (x̄g − x̄)²] for g groups of equal size n, where x̄
is the grand mean and x̄1 … x̄g are the group means
- Mean square model (MSM): MSM = SSM / (g − 1)
Error sum of squares (SSE)/residual sum of squares (RSS): How much of the variation cannot
be explained by the model = Within-groups variability
- SSE = Σ (x_ig − x̄g)², summing each score's squared distance from its own group mean
- Mean square residual (MSR): MSR = SSE / (N − g)
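The decomposition above (SST = SSM + SSE) can be verified numerically. A minimal sketch with three made-up groups of scores:

```python
# Sketch: computing SST, SSM (between-groups) and SSE (within-groups)
# for three small hypothetical groups, then checking SST = SSM + SSE.
groups = [
    [4.0, 5.0, 6.0],   # group 1 (made-up scores)
    [7.0, 8.0, 9.0],   # group 2
    [5.0, 6.0, 10.0],  # group 3
]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)
group_means = [sum(g) / len(g) for g in groups]

# SSM: squared distance of each group mean from the grand mean,
# weighted by that group's size (between-groups variability).
ssm = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# SSE: squared distance of each score from its own group mean
# (within-groups variability).
sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
# SST: squared distance of every score from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_scores)

print(ssm, sse, sst)  # 14.0, 18.0, 32.0 — and indeed 14 + 18 = 32
```

The identity holds for any data set: the total variation always splits exactly into the part the group means explain (SSM) and the part they do not (SSE).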
Omnibus test: ANOVA can indicate whether there are significant differences among the
groups, but it does not tell you which specific groups differ → If the ANOVA result
indicates that there are significant differences between groups, one can perform post-hoc tests
to determine which specific groups differ from each other
ANOVA significance test:
1. Assumptions:
1. Applicable in cases of a categorical explanatory variable and a quantitative
response variable – The explanatory variable should have at least 3 groups
2. The population distributions of the response variable for the g groups are
approximately normal → Shapiro-Wilk test
3. Homogeneity of variances → Levene’s test
↓
Variations of the traditional F-statistic designed to address situations where the
assumption of homogeneity of variance is violated:
- Brown-Forsythe F
- Welch’s F
4. Independent random samples
2. Hypotheses:
- H0: μ1 = μ2 = … = μg
- H1: at least two of the population means are different
3. Test statistic:
↓
F = MSM / MSE = [SSM / df1] / [SSE / df2] = signal / noise
(df1 = number of groups − 1; df2 = total sample size − number of groups)
↓
MSM = n[(x̄1 − x̄)² + (x̄2 − x̄)² + ⋯ + (x̄g − x̄)²] / (g − 1)
MSE = Σ (x_ig − x̄g)² / (N − g)
4. P-value: 1 − F.DIST(F-score; df1; df2; TRUE)
↓
Degrees of freedom:
- df1 = g − 1
- df2 = N − g → N = total number of subjects
5. Conclusion: The smaller the P-value, the more unusual the sample data is, the stronger
the evidence against H 0, and the stronger the evidence in favour of H 1
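The five steps above can be sketched end to end in pure Python on made-up data. The p-value step (F.DIST in the notes) is omitted here, since it needs the F-distribution CDF; only the F-ratio and degrees of freedom are computed:

```python
# Sketch: one-way ANOVA F-ratio on three hypothetical groups.
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [5.0, 6.0, 10.0],
]
n_groups = len(groups)
n_total = sum(len(g) for g in groups)

grand_mean = sum(x for g in groups for x in g) / n_total
group_means = [sum(g) / len(g) for g in groups]

# Between-groups and within-groups sums of squares.
ssm = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

df1 = n_groups - 1          # g - 1
df2 = n_total - n_groups    # N - g
msm = ssm / df1             # between-groups mean square ("signal")
mse = sse / df2             # within-groups mean square ("noise")
f_ratio = msm / mse

print(df1, df2, round(f_ratio, 3))  # 2 6 2.333
```

An F-ratio above 1 means the between-groups variability exceeds the within-groups variability; the p-value would then come from the F-distribution with (df1, df2) degrees of freedom, e.g. via F.DIST in Excel.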
Source   df          SS                                     MS         F                    P
Model    df1         SS_model = MS_model × df1              MS_model   MS_model / MS_error  P-value
                     (between-groups estimate)
Error    df2         SS_error = MS_error × df2              MS_error
                     (within-groups estimate)
Total    df1 + df2   Between-groups SS + Within-groups SS
Dummy variable: A variable that takes a binary value (0 or 1) to indicate the absence
or presence of some categorical effect that may be expected to shift the outcome – Dummy
variables allow us to include categorical variables in analyses in which they would otherwise
be difficult to include due to their non-numeric nature
Following up a significant F-statistic by looking at model parameters, which provide
information about specific differences between means:
- Dummy coding: The simplest form of contrast coding, in which one group is
designated as the reference category (typically mentioned first) and the other groups are
compared to this reference (E.g.: with three groups (A, B and C), one might
code them as A = 0, B = 1 and C = 0, which allows one to test whether group B differs
from the reference group, i.e. group A)
- Issues with two dummy variables:
- Performing two t-tests inflates the familywise error rate
- The dummy variables might not make all the comparisons that we
want to make
- Planned contrasts: A specific type of contrast coding where one predefines the
comparisons one wants to test before conducting the analysis – Allows one to test a
limited number of comparisons that are meaningful for the research questions (E.g.: in
a study with three groups (A, B and C), one might be interested in comparing A to B
and A to C while ignoring the comparison between B and C)
- Three rules for contrast coding using planned contrasts:
1. If you have a control group, this is usually because you want to
compare it against any other groups.
2. Each contrast must compare only two ‘chunks’ of variation
3. Once a group has been singled out in a contrast it can’t be used in
another contrast – Once a piece of variance has been split from a larger
piece, it cannot be attached to any other pieces of variance, it can only
be subdivided into smaller pieces
- Five rules for assigning values to dummy variables to obtain contrasts:
1. Choose sensible contrasts
2. Groups coded with positive weights will be compared against groups
coded with negative weights – It does not matter which way round this
is done
3. If the weights for a given contrast are added up, the result should be
zero
4. If a group is not involved in a contrast, automatically assign it a weight
of zero, which will eliminate it from the contrast
5. For a given contrast, the weights assigned to the group(s) in one chunk
of variation should be equal to the number of groups in the opposite
chunk of variation
- Post hoc analysis: A statistical analysis specified after a study has been concluded and
the data collected
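The dummy-coding and planned-contrast rules above can be illustrated with a short sketch. The three groups A, B, C, their mean scores, and the contrast (A against B and C combined) are all hypothetical:

```python
# Sketch: dummy coding and a planned-contrast check for three
# hypothetical groups A, B, C, with A as the reference category.
labels = ["A", "A", "B", "B", "C", "C"]

# Dummy coding: g - 1 = 2 dummy variables; the reference group A is
# coded 0 on both.
dummy_b = [1 if lab == "B" else 0 for lab in labels]
dummy_c = [1 if lab == "C" else 0 for lab in labels]

# Planned contrast: compare A against B and C combined.
# Rule 5: A sits alone opposite a chunk of two groups, so A gets weight 2
# (the number of groups in the opposite chunk) and B, C each get -1.
weights = {"A": 2, "B": -1, "C": -1}
assert sum(weights.values()) == 0  # rule 3: weights must sum to zero

group_means = {"A": 5.0, "B": 8.0, "C": 7.0}  # made-up group means
contrast_value = sum(weights[g] * group_means[g] for g in weights)
print(contrast_value)  # 2*5 - 8 - 7 = -5.0
```

A contrast value far from zero (relative to its standard error) would indicate that the reference group differs from the combined other groups; the sign only reflects which chunk was given the positive weights.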