Summary CAT Gelissen
Chapter 6: One-Way between Subjects Analysis of Variance
6.1 Research situations where One-Way Between-Subjects Analysis of Variance (ANOVA) is used
An ANOVA is used in research situations where the researcher wants to compare means on a quantitative Y outcome
variable across two or more groups. The X predictor variable in an ANOVA is categorical. This categorical predictor
variable is called a factor; the groups are called the levels of this factor.
ANOVA is a generalization of the t test; a t test provides information about the distance between the means on a
quantitative outcome variable for just two groups, whereas a one-way ANOVA compares means on a quantitative
variable across any number of groups. Running a separate t test for each pair of groups is not recommended, because
performing several t tests on the same sample leads to an inflated risk of a Type I error.
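The inflation of the Type I error risk can be illustrated with a small calculation. The sketch below is not part of the original summary: it uses the standard approximation 1 - (1 - alpha)^c for c comparisons, which assumes the tests are independent (only roughly true for t tests on the same sample), and the function names are hypothetical.

```python
def n_pairwise_comparisons(k):
    """Number of distinct pairs among k groups: k(k - 1) / 2."""
    return k * (k - 1) // 2


def familywise_error_rate(k, alpha=0.05):
    """Approximate risk of at least one Type I error across all pairwise t tests.

    Assumption: the tests are treated as independent, which is only
    approximately true for several t tests on the same sample.
    """
    c = n_pairwise_comparisons(k)
    return 1 - (1 - alpha) ** c


for k in (2, 3, 5):
    print(k, n_pairwise_comparisons(k), round(familywise_error_rate(k), 3))
```

Under this approximation, three groups (3 comparisons) already give a risk of about .14 instead of .05, and five groups (10 comparisons) give about .40.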
Nonexperimental design: when the means of naturally occurring groups are compared.
Experimental design: when the groups are formed by the researcher and the researcher administers a different type
or amount of treatment to each group while controlling extraneous variables.
Between-S (subjects): each participant is a member of one and only one group and the members of samples are not
matched or paired.
One way to limit the risk of Type I error is to perform a single omnibus test that examines all the comparisons in the
study as a set. The F test in a one-way ANOVA provides a single omnibus test of the hypothesis that the means of all k
populations are equal, in place of many t tests for all possible pairs of groups.
F test: used to assess differences among a set of more than two group means. It is usually combined with one of the
more popular procedures for follow-up comparisons among group means.
The overall null hypothesis for one-way ANOVA is that the means of the k
populations that correspond to the groups in the study are all equal:
H0: µ1 = µ2 = … = µk
The alternative hypothesis: there is at least one (and possibly more than one) difference between the population
means.
A significant F tells us that there is probably at least one significant difference among group means; by itself, it does
not tell us where that difference lies.
When we do an ANOVA, we identify the part of each individual score that is associated with group membership (and,
therefore, with the treatment or interventions, if the study is an experiment) and the part of each individual score that
is not associated with group membership and that is, therefore, attributable to a variety of other variables that are
not controlled in the study, or “error”. Researchers usually hope to show that a large part of the variability in scores
is associated with, or predictable from, group membership or treatment.
6.3 Assumptions of One-Way Between-Subjects ANOVA
The assumptions for one-way ANOVA are similar to those described for the independent sample t test:
1. Quantitative dependent variable of interval/ratio (continuous) measurement level, independent variable of
nominal measurement level.
2. In the full sample and in each group, scores on the dependent variable are approximately normally distributed.
3. No outliers.
4. Variances of scores on the dependent variable are equal across groups (homogeneity of variance assumption;
checked with Levene’s test).
5. Observations have been selected by random sampling and are independent.
In practice, assumptions 1, 3, 4, and 5 are the most important ones.
- Assumption 1: you must be able to calculate a meaningful mean for each group.
- Assumption 2: if samples are large, ANOVA is robust against violations of the normality assumption.
- Outliers: always check for them, because the mean is crucial in ANOVA calculations.
Assumption of independent observations:
- The score of a person does not provide information about (“is independent of”) other scores in the dataset.
- Between groups: persons belong to only one group (no double membership).
- Within each group: random sampling.
- Persons in the same group (for example, the same team) share group membership, so there is some dependency
between their observations; by including ‘group’ as the independent variable we take this dependency into
account (“control for” it).
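The homogeneity of variance check (assumption 4) can be sketched in code. The example below is an illustration and not SPSS's implementation: it assumes the mean-centered form of Levene's test, which is a one-way ANOVA F ratio computed on the absolute deviations of each score from its group mean. It uses the six heart-rate scores from the example in section 6.6, the function names are hypothetical, and it reports only the F ratio, not a p value.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between, df_within = k - 1, n_total - k
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def levene_f(groups):
    """Levene's statistic (mean-centered): ANOVA F on |score - group mean|."""
    abs_dev = [[abs(y - sum(g) / len(g)) for y in g] for g in groups]
    return one_way_f(abs_dev)


# HR scores: females, then males
f_value, df1, df2 = levene_f([[81, 85, 92], [70, 63, 68]])
print(round(f_value, 3), df1, df2)
```

A small F value here means the spread of scores is similar in the groups, which is consistent with the equal-variance assumption.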
6.6 Partition of Scores into Components
When we do a one-way ANOVA, the analysis involves partition of each score into two components:
- A component of the score that is associated with group membership
- A component of the score that is not associated with group membership
The sums of squares (SS) summarize how much of the variability in scores is associated with group membership and how
much is not related to group membership.
Example:
We measure heart rate (HR) for each of 6 persons in a small sample. The HR scores are 81, 85, and 92 for the 3 female
participants and 70, 63, and 68 for the 3 male participants. X = gender (group membership), coded 1 = female
and 2 = male. Y = HR (quantitative variable).
We can partition HR score into a component that is related to group membership (gender) and a component that is
not related to group membership.
We will denote the HR score for person j in group i as Yij. For example, Cathy’s HR of 92 corresponds to Y13, the score
for the third person in group 1.
The grand mean (i.e., the mean HR for all N = 6 persons in this dataset) is denoted My; for this set of scores,
My = 76.5.
The group means are M1 = 86 (the female group) and M2 = 67 (the male group).
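These means can be checked with a few lines of code. This is a minimal sketch using the six HR scores from the example; the variable names are chosen for illustration.

```python
from statistics import mean

# HR scores from the example, keyed by group code (1 = female, 2 = male)
hr = {1: [81, 85, 92], 2: [70, 63, 68]}

all_scores = [y for scores in hr.values() for y in scores]
grand_mean = mean(all_scores)                                  # My
group_means = {i: mean(scores) for i, scores in hr.items()}    # M1, M2

print(grand_mean, group_means)
```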
For each person, we can compute a deviation of that person’s individual Yij score from the grand mean; this
tells us how far above (or below) the grand mean of HR each person’s score was.
The total deviation of the individual score from the grand mean, DevGrand, can be divided into two components:
the deviation of the individual score from the group mean, DevGroup, and the deviation of the group mean
from the grand mean, Effect:
DevGrand = DevGroup + Effect, that is, (Yij - My) = (Yij - Mi) + (Mi - My)
Another way to look at these numbers is to set up a predictive equation based on a theoretical model. When we do
a one-way ANOVA, we seek to predict each individual score (Yij) on the quantitative outcome variable from the
following theoretical components: the population mean, the ‘effect’ for group i, and the residual associated with the
score for person j in group i:
Yij = µ + αi + εij
In the sample, these components are estimated as:
µ = My
αi = Mi - My
εij = Yij - Mi
This equation says that we can predict (or reconstruct) the observed score for person j in group i by taking the grand
mean, adding the effect that is associated with membership in group i, and finally, adding the residual that tells us
how much individual j’s score differed from the mean of the group that person j belonged to.
This is a more formal notation: (Yij - My) = (Mi - My) + (Yij - Mi)
Total deviation for person j in group i = effect for group i + residual for person j.
This equation says that we can predict each person’s HR from the following information: Person j’s HR = grand mean
+ effect of person j’s gender on HR + effect of all other variables that influence person j’s HR.
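The decomposition can be verified score by score. The sketch below splits each of the six example HR scores into grand mean, group effect, and residual, and checks that the three parts add back up to the observed score; the variable names are chosen for illustration.

```python
# HR scores from the example, keyed by group code (1 = female, 2 = male)
hr = {1: [81, 85, 92], 2: [70, 63, 68]}

n_total = sum(len(scores) for scores in hr.values())
grand_mean = sum(sum(scores) for scores in hr.values()) / n_total   # My

rows = []
for i, scores in hr.items():
    group_mean = sum(scores) / len(scores)     # Mi
    effect = group_mean - grand_mean           # alpha_i = Mi - My
    for j, y in enumerate(scores, start=1):
        residual = y - group_mean              # epsilon_ij = Yij - Mi
        # The three components reconstruct the observed score
        assert grand_mean + effect + residual == y
        rows.append((i, j, y, effect, residual))

for row in rows:
    print(row)   # (group i, person j, Yij, effect, residual)
```

For Cathy (Y13 = 92), the parts are 76.5 + 9.5 + 6 = 92.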
When we do an ANOVA, we summarize the information about the sizes of these two deviations or components, (Yij - Mi)
and (Mi - My), across all the scores in the sample.
ANOVA begins by computing the following deviations for each score in the dataset:
- (Yij - My): the deviation of the score from the grand mean
- (Yij - Mi): the deviation of the score from its group mean
- (Mi - My): the deviation of the group mean from the grand mean
To summarize information about the magnitudes of these score components across all participants, we square each term
and then sum the squared deviations across all the scores in the entire dataset:
SStotal: the sum of the squared deviations of each score from the grand mean.
SSwithin: the sum of the squared deviations of each score from its group mean, which summarizes information about
within-group variability of scores.
SSbetween: the sum of the squared deviations of the group means from the grand mean (counted once for every score in
the group), which summarizes information about the distances (variability) among group means.
These terms are additive: SStotal = SSbetween + SSwithin.
Researchers hope that SSbetween will be relatively large because this would be evidence that the group means are far
apart and that the different participant characteristics or different types or amounts of treatments for each group are,
therefore, predictively related to scores on the Y variable.
The difference between the actual HR and the HR predicted from the model is called a residual or error. Ideally,
residuals are small, so SSwithin is ideally small as well.
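For the six HR scores in the example, the three sums of squares can be computed as follows. This is a minimal sketch; the variable names are chosen for illustration.

```python
# HR scores from the example, keyed by group code (1 = female, 2 = male)
hr = {1: [81, 85, 92], 2: [70, 63, 68]}

all_scores = [y for scores in hr.values() for y in scores]
grand_mean = sum(all_scores) / len(all_scores)                 # My

# Squared deviations of every score from the grand mean
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

ss_within = 0.0
ss_between = 0.0
for scores in hr.values():
    group_mean = sum(scores) / len(scores)                     # Mi
    # Squared deviations of each score from its own group mean
    ss_within += sum((y - group_mean) ** 2 for y in scores)
    # Squared deviation of the group mean, counted once per score in the group
    ss_between += len(scores) * (group_mean - grand_mean) ** 2

print(ss_total, ss_between, ss_within)
```

Here SSbetween (541.5) is much larger than SSwithin (88), and the two add up to SStotal (629.5).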
6.7 Computations for the One-Way Between-S ANOVA
6.7.1 Comparison Between the Independent Samples t Test and one-way between-s ANOVA
The one-way between-s ANOVA is a generalization of the independent samples t test.
For a one way ANOVA, the same computations as for the t Test are performed; the only difference is that we have to
obtain (and summarize) this information for k groups (instead of only two groups as in the t test). The information that
is needed is: the sample mean, the standard deviation, and the number of scores in each group.
For each analysis, we need to obtain information about differences between groups. The variance among the k group
means is called MSbetween.
For the t test, a summary of within-group score variability is provided by the pooled variance, s²p; for ANOVA, a
summary of within-group score variability is called MSwithin.
For an F ratio, we compare MSbetween with MSwithin; we need to have a separate df for each of these mean square terms
to indicate how many independent deviations from the mean each MS term was based on.
The by-hand computation for one-way ANOVA:
1. Compute SSbetween, SSwithin, and SStotal.
2. Compute MSbetween by dividing SSbetween by its df, k - 1.
3. Compute MSwithin by dividing SSwithin by its df, N - k, where N is the total number of scores.
4. Compute an F ratio: MSbetween/MSwithin.
5. Compare the obtained F value with the critical value of F from a table of the F distribution with (k - 1) and
(N - k) df. If the F value obtained exceeds the tabled critical value of F for the predetermined alpha level and the
available degrees of freedom, reject the null hypothesis that all the population means are equal.
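The five steps above can be sketched in code for the HR example. This is an illustration under stated assumptions, not a replacement for statistical software: the function name `one_way_anova` is hypothetical, and the critical value is entered by hand from a standard F table (about 7.71 for 1 and 4 df at alpha = .05) rather than computed.

```python
def one_way_anova(groups):
    """By-hand one-way between-subjects ANOVA for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Step 1: sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    # Steps 2 and 3: mean squares, each SS divided by its df
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    # Step 4: F ratio
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova([[81, 85, 92], [70, 63, 68]])

# Step 5: compare with the tabled critical value (entered by hand, see lead-in)
F_CRITICAL = 7.71   # F table, df = (1, 4), alpha = .05
reject_h0 = result["F"] > F_CRITICAL
print(round(result["F"], 2), reject_h0)
```

For these scores, MSbetween = 541.5 and MSwithin = 22, so F is about 24.61, which exceeds the tabled critical value.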
6.8 Effect-Size Index for One-Way between-s ANOVA
The proportion of the total variability (SStotal) that is due to between-group differences is given by eta squared:
η² = SSbetween / SStotal
An eta squared is an effect-size index given as a proportion of variance; if eta squared =.50, then 50% of the variance
in the Yij scores is due to between-group differences. An eta squared is interpreted as the proportion of variance in
scores on the Y outcome that is predictable from group membership (i.e., from the score on X, the predictor variable).
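As a worked illustration, eta squared for the six HR scores from the example in section 6.6 can be computed as below. This is a minimal sketch; the function name `eta_squared` is hypothetical.

```python
def eta_squared(groups):
    """Proportion of total variability due to between-group differences."""
    all_scores = [y for g in groups for y in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((y - grand_mean) ** 2 for y in all_scores)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# HR scores: females, then males
eta_sq = eta_squared([[81, 85, 92], [70, 63, 68]])
print(round(eta_sq, 2))
```

For these scores, SSbetween = 541.5 and SStotal = 629.5, so about 86% of the variance in HR is predictable from gender in this small sample.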
6.11 SPSS output and model results
1. Look at the test of homogeneity of variances, at the “based on mean” sig. value. If this sig. value is higher than
the alpha level (0.05), the H0 is retained. This means that the groups can be assumed to have equal variances.
2. If this H0 is retained, look at the ANOVA table, at the between groups sig. value. If this value is lower
than the alpha level of 0.05, there is a significant difference between the means of at least two
groups. If it is not lower than the alpha level, no significant difference between the means
of the groups has been found.
3. If the H0 of equal variances is rejected, the groups do not have equal variances. In that case, look at the robust
tests of equality of means, at the Welch sig. value. If this sig. value is smaller than the alpha level,
reject the H0 that all group means are equal.
a. Whether to look at the Welch test or the Brown-Forsythe test depends on the difference in the N’s of the
groups. If the N’s of the groups differ a lot, look at the Welch test.
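The reading order described above can be summarized as a small decision sketch. It is an illustration only: the function name `read_spss_output` is hypothetical, the sig. values passed in are made-up inputs rather than real output, and it simplifies step 3 by always using the Welch test when the variances are unequal.

```python
def read_spss_output(levene_sig, anova_sig, welch_sig, alpha=0.05):
    """Return which test to read and whether H0 (equal means) is rejected."""
    if levene_sig > alpha:
        # Equal variances can be assumed: read the regular ANOVA table
        test, sig = "ANOVA F test", anova_sig
    else:
        # Unequal variances: read the robust test of equality of means
        test, sig = "Welch test", welch_sig
    return test, sig < alpha


# Made-up sig. values, for illustration only
print(read_spss_output(levene_sig=0.40, anova_sig=0.01, welch_sig=0.02))
print(read_spss_output(levene_sig=0.01, anova_sig=0.03, welch_sig=0.20))
```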