PSYCHOLOGY YEAR 2
STATISTICS
Explaining and Predicting
What you should do:
Read the chapters mentioned below
Work through the worked examples
Complete the homework assignments
Complete the practical assignments (these will be graded)
Do the SPSS sessions
Learning objectives:
All materials discussed in Course 1.3
Proportions (Chapter 8)
Crosstabs (Chapter 9)
Relationships between variables (Chapter 2)
Single and Multiple Regression (Chapters 10 & 11)
1-factor ANOVA (Chapter 12)
2-factor ANOVA (Chapter 13)
Repeated measures ANOVA (Chapter 14, Field)
Basic elements of social science research (including: experimental research, confounding variables, within-subject versus
between-subject designs, factorial design, internal validity)
Exam components during the course:
Attendance requirement (100% attendance)
Course exam on November 10th (multiple-choice questions)
Practical SPSS exam on October 9th
Practical assignments on November 13th (group assignment)
Meeting 2.1
M. Passer: Chapter 3 Conducting Ethical Research
Moore, McCabe & Craig: Chapter 8 Proportions
MOORE CHAPTER 8 PROPORTIONS
The goal of inference for proportions is to use sample proportions to say something about the corresponding population proportions.
You can either focus on one population or compare two of them.
Single Population Proportions
A sample proportion is used when a random variable has only two possible outcomes (e.g. do you own an iPhone? Yes or no).
It is calculated as p = X/n (e.g. 100 people tested and 36 own an iPhone → 36/100 = 0,36 → 36% of the sample).
X in this formula is the number of successes (keep in mind that X in math is any unknown number, it could apply to anything).
Proportions always have a value between 0 and 1 (0% and 100%).
If you take a sample from a large population, the number of successes is binomially distributed (comparable to a normal
distribution, but with two possible outcomes per observation). The bigger the sample gets, the more closely the distribution of p
resembles a normal distribution (see picture).
The standard deviation of the sample proportion (σ) tells you how far a random p typically lies from the mean. It is calculated as
σ = √(p(1 − p)/n), where p is the population proportion. The standard error has the same form, SE = √(p(1 − p)/n), but uses the
sample proportion because the population proportion is usually unknown.
(e.g. p=0,36 and n=100 → σ = √(0,36(1 − 0,36)/100) = 0,048)
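As a quick check of the iPhone example, here is a minimal Python sketch. The function name `proportion_sd` is made up for illustration and is not part of the course material. Note that the standard deviation itself is 0,048; the value 0,096 is twice that.

```python
import math

def proportion_sd(p: float, n: int) -> float:
    """Standard deviation of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

successes, n = 36, 100            # iPhone example from the notes
p = successes / n                 # sample proportion
sd = proportion_sd(p, n)
print(f"p = {p:.2f}, sd = {sd:.3f}")   # p = 0.36, sd = 0.048
```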
A 95% confidence interval (C) gives the range in which a random p (the p is one outcome) falls when it is not an extreme (so not
in the two tails of the normal distribution). For this you calculate 2√(p(1 − p)/n), or if you want to be very precise
1,96√(p(1 − p)/n). You choose 2 (or 1,96) because in the perfect normal distribution 34,1 + 34,1 = 68,2% lies between −1σ and 1σ,
and 68,2 + 13,6 + 13,6 = 95,4% lies between −2σ and 2σ. In general, you calculate the margin of error m = z√(p(1 − p)/n). The z
can be found in table D (e.g. confidence level of 95% → z = 1,96). You then take the sample proportion p and add and subtract the
margin of error: p ± m (for large samples this is called the approximate level C confidence interval).
(e.g. p=0,36 and n=100 → m = 2√(0,36(1 − 0,36)/100) = 0,096 → 0,36 + 0,096 = 0,456 and 0,36 − 0,096 = 0,264 → so we are 95%
confident that the true proportion of people in the population who have an iPhone lies between 26,4% and 45,6%).
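The interval p ± m can be sketched in Python as follows. The function name `proportion_ci` is an illustrative assumption. The sketch takes z from the standard normal distribution (about 1,96) instead of the rounded value 2, so the margin comes out at about 0,094 rather than 0,096.

```python
import math
from statistics import NormalDist

def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Large-sample confidence interval: p ± z * sqrt(p(1 - p) / n)."""
    p = successes / n
    # z leaves (1 - confidence) / 2 in each tail; about 1.96 for 95%
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

low, high = proportion_ci(36, 100)     # iPhone example from the notes
print(f"{low:.3f} to {high:.3f}")      # 0.266 to 0.454
```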
Large-sample confidence intervals (for the proportion of the population) can only be used if the numbers of successes and failures
are both at least 10. If a sample does not have at least 10 successes and 10 failures, you use the plus four estimate. This is a
method that works very well and simply adds 2 successes and 2 failures. This leads to the formula p = (X + 2)/(n + 4). It also
affects the σ, which becomes σ = √(p(1 − p)/(n + 4))
(e.g. 4 out of 12 people have an iPhone → p = (4 + 2)/(12 + 4) = 0,375 → σ = √(0,375(1 − 0,375)/(12 + 4)) = 0,121 →
1,96 × 0,121 = 0,237 → 0,375 − 0,237 = 0,138 and 0,375 + 0,237 = 0,612 → between 13,8% and 61,2%)
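The plus four calculation can be checked with a short Python sketch. The function name `plus_four_ci` is hypothetical. Clamping the interval to the range 0 to 1 is an added assumption, since a proportion cannot fall outside that range.

```python
import math
from statistics import NormalDist

def plus_four_ci(successes: int, n: int, confidence: float = 0.95):
    """Plus four interval: add 2 successes and 2 failures before computing."""
    p = (successes + 2) / (n + 4)                    # plus four estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / (n + 4))
    # Clamp to [0, 1] because a proportion cannot lie outside that range
    return p, max(0.0, p - margin), min(1.0, p + margin)

p, low, high = plus_four_ci(4, 12)     # 4 of 12 people own an iPhone
print(f"p = {p:.3f}, interval {low:.3f} to {high:.3f}")
# p = 0.375, interval 0.138 to 0.612
```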
The standard error of the sample mean (SE) shows how much the results of different samples from the same population differ from
each other. Say you draw 40 people from the population to test whether people have an iPhone or not, and then 40 people again,
and so on. You get a p for every sample (e.g. 0,36 – 0,40 – 0,20 – 0,38). You can calculate how different the means of those
samples are by calculating the standard error of the mean. The formula is SEx = σ/√n or SEx = sd/√n. Which one you use depends
on whether you are given the σ of the population or the sd of a sample (if you have both, choose σ, because it is more accurate).
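Both versions of the standard error formula can be sketched in Python. The function names and the numbers below are made up for illustration and do not come from the course material.

```python
import math
import statistics

def se_from_sigma(sigma: float, n: int) -> float:
    """Standard error of the mean when the population sigma is known."""
    return sigma / math.sqrt(n)

def se_from_sample(values) -> float:
    """Standard error of the mean using the sample standard deviation."""
    return statistics.stdev(values) / math.sqrt(len(values))

print(se_from_sigma(10, 25))                # 2.0

scores = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical sample of 8 scores
print(round(se_from_sample(scores), 3))     # 0.756
```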
Significance Testing for a Single Proportion
The H0 is always that there is no difference (H0: p=p0). The Ha claims that there is a difference (Ha: p>p0 or p<p0 or p≠p0).
Note that a two-sided test (≠) is usually the safer choice compared to a one-sided test (< or >), because with a one-sided test
you risk ruling out one side of the distribution whilst there is still an effect going on there.
Note that if you use a two-sided test, you either divide the alpha level by 2 (e.g. 0,05 → 0,025) or you multiply the one-sided
p value by 2 (never do both). If it is a one-sided test, you don't need to do this.
The test statistic is z = (p − p0) / √(p0(1 − p0)/n). Once you have calculated the z, you can look up the p value in table A.
You can either first look up the minimal z score you need in order for the H0 to get rejected (e.g. look up a significance level
of 0,10/0,05/0,025 in the table) and then reject or fail to reject according to the z score, or you can calculate the z score,
look up the p value in the table and see if it is more or less than alpha. Both approaches lead to the same conclusion, so use
whichever you like best.
If the p value is below the alpha/significance level (e.g. 0,001<0,10), H0 gets rejected. That means the data give evidence for a
difference, in favor of Ha. But if it is higher (0,40>0,10), you fail to reject H0: there is not enough evidence for a difference
(this does not prove that H0 is true).
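The test procedure can be sketched in Python, using the normal distribution from the standard library instead of table A. The function name, the `alternative` labels and the null value p0 = 0,30 are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes: int, n: int, p0: float,
                          alternative: str = "two-sided"):
    """z test for H0: p = p0. Returns (z, p_value)."""
    p = successes / n
    z = (p - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf(z)               # area to the left of z
    if alternative == "greater":            # Ha: p > p0
        p_value = 1 - cdf
    elif alternative == "less":             # Ha: p < p0
        p_value = cdf
    else:                                   # Ha: p != p0, double the smaller tail
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

# Hypothetical question: do more or fewer than 30% of people own an iPhone?
z, p_value = one_proportion_z_test(36, 100, p0=0.30)
alpha = 0.05
print(f"z = {z:.2f}, p value = {p_value:.2f}")       # z = 1.31, p value = 0.19
print("reject H0" if p_value < alpha else "fail to reject H0")
```

With these numbers the p value is above alpha, so H0 is not rejected.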
Choosing a Sample Size for a Single Proportion
You can arrange to have both a high confidence and a small margin of error by adjusting the number of people you use in your
sample (n).
Because you want to know which n you should use before you actually do the experiment, you will have to guess p. You can
either use a p that resulted from another, similar study or use p=0,5. Since we don't have information about other studies on the
exam, we are going to use p=0,5. The margin of error is the greatest at p=0,5 and changes only a little between p=0,3 and p=0,7,
so p=0,5 is a safe guess.
The formula is n = (z*/m)² × p*(1 − p*), where p* is the guessed proportion. So say you want a margin of error of 3% and a 95%
confidence interval: n = (1,96/0,03)² × 0,25 = 1067,1. You always round up, so you will need 1068 subjects in order to get the
margin of error and confidence level that you would like.
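The sample size formula can be sketched in Python as follows. The function name `sample_size` is hypothetical. Rounding up with `math.ceil` matches the idea that a smaller sample would give a margin of error that is too large.

```python
import math
from statistics import NormalDist

def sample_size(margin: float, confidence: float = 0.95,
                p_guess: float = 0.5) -> int:
    """Smallest n for which z * sqrt(p(1 - p) / n) is at most the margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)        # always round up to a whole person

print(sample_size(0.03))       # 1068
```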
Comparing Two Proportions
If you compare two populations, which happens a lot in science, you use the same basic formulas but you add some elements.
To calculate the difference between the two sample proportions, you use the formula D = p1 − p2. This sample difference is the
estimate of the difference between the two population proportions. You usually put the highest number as p1 because it is easier
to work with positive numbers than negative numbers.
When you calculate the σ of the difference, you add the variances of the two samples: σD² = σp1² + σp2², or
σD = √(p1(1 − p1)/n1 + p2(1 − p2)/n2), which is the same thing, because taking the square root of σD² from the first formula gives
σD. Here n1 and n2 are the sizes of the two samples.
The rest of the process stays the same: m = z√(p1(1 − p1)/n1 + p2(1 − p2)/n2) → D ± m.
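A minimal Python sketch of the interval D ± m for two proportions. The function name and the counts (45 of 100 versus 30 of 100) are made up for illustration. The sketch allows a separate sample size for each group.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1: int, n1: int, x2: int, n2: int,
                      confidence: float = 0.95):
    """Large-sample interval for p1 - p2: D ± z * sigma_D."""
    p1, p2 = x1 / n1, x2 / n2
    d = p1 - p2
    # Variances of the two sample proportions add up
    sigma_d = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma_d
    return d, d - margin, d + margin

# Hypothetical data: 45 of 100 in group 1 and 30 of 100 in group 2
d, low, high = two_proportion_ci(45, 100, 30, 100)
print(f"D = {d:.2f}, interval {low:.3f} to {high:.3f}")
# D = 0.15, interval 0.017 to 0.283
```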
This interval is also only used when the successes and failures in each sample are at least 10; otherwise you need to use the plus
four estimate. For a comparison of two proportions this adds 1 success and 1 failure to each sample:
D = (X1 + 1)/(n1 + 2) − (X2 + 1)/(n2 + 2) and σD = √(p1(1 − p1)/(n1 + 2) + p2(1 − p2)/(n2 + 2)), where p1 and p2 are now the plus
four estimates. Again the rest of the process stays the same: m = z√(p1(1 − p1)/(n1 + 2) + p2(1 − p2)/(n2 + 2)) → D ± m.
When the margin of error is relatively big (so a lot bigger than D), you can conclude that you would need larger samples in order
to make the results generalizable.
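The plus four version for two proportions can be sketched in Python. The function name and the small samples (6 of 10 versus 3 of 10) are hypothetical. With samples this small the margin of error is larger than D, which illustrates the last point above.

```python
import math
from statistics import NormalDist

def plus_four_two_proportion_ci(x1: int, n1: int, x2: int, n2: int,
                                confidence: float = 0.95):
    """Plus four interval for p1 - p2: add 1 success and 1 failure per sample."""
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    d = p1 - p2
    sigma_d = math.sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma_d
    return d, margin

# Hypothetical small samples: 6 of 10 versus 3 of 10
d, margin = plus_four_two_proportion_ci(6, 10, 3, 10)
print(f"D = {d:.2f}, margin = {margin:.3f}")    # D = 0.25, margin = 0.386
print(f"interval {d - margin:.3f} to {d + margin:.3f}")
```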
Significance Testing for Two Proportions
It is preferred to compare the proportions with a confidence interval (as we did above), but you can also test a hypothesis. You
can conclude that the difference is significant when the confidence interval does not include 0.
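The notes do not spell out the test statistic for two proportions. One common textbook version is the pooled z test, in which both samples are combined to estimate the shared proportion under H0: p1 = p2. The Python sketch below assumes that version; the function name and the counts are made up for illustration.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-sided z test for H0: p1 = p2. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)      # shared proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    cdf = NormalDist().cdf(z)
    p_value = 2 * min(cdf, 1 - cdf)     # two-sided: double the smaller tail
    return z, p_value

# Hypothetical data: 45 of 100 in group 1 and 30 of 100 in group 2
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.2f}, p value = {p_value:.3f}")   # z = 2.19, p value = 0.028
```

For the same counts, a 95% confidence interval for the difference runs from about 0,017 to 0,283, which does not include 0. Both approaches therefore agree that the difference is significant at alpha = 0,05.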
Property of Sarina Verwijmeren