Chapter 18: Confidence intervals and tests to compare two parameters
The topic of this chapter is the difference between two unknown parameters of the same type: two population means μ1, μ2 (compared via their difference), two population proportions p1, p2 (compared via their difference), or two population variances σ1², σ2² (compared via their ratio). The starting point is that we now have two populations and one variable.
18.1 Some problems with two parameters
The following problems are examples of such research questions:
Difference between two μ:
- For the course Statistics 2: is the mean grade for IBA students larger than the mean grade for BA students?
- Is the lifespan of a newly developed type of tires larger than the lifespan of the older type of tires?
Difference between two p:
- Do consumers prefer cola P to cola C?
- Has the percentage of people that approve a certain government measure changed after the discussion on television?
Difference between two σ²:
- Is the risk of portfolio 1 larger than the risk of portfolio 2?
- Is the new machine more accurate?
We now have two populations, population 1 and population 2, and one variable X that is considered on both. If X is quantitative, we are interested in comparing the means μ1, μ2 or the variances σ1², σ2² of X on the two populations. If X is qualitative, in particular binary (yes = 1, no = 0), we are interested in comparing the success proportions p1, p2 on the two populations. The comparisons are based on two random samples, one from each population.
18.2 The difference between two population means
Two random samples are drawn, one from population 1 and one from population 2. The precise description of the way the sampling has to be done is called the experimental design. Two experimental designs will be considered:
1. Independent samples design: the two random samples are drawn independently of each other. For example, randomly select 50 cars; 25 get type 1 tires and 25 get type 2 tires. The sample sizes can be unequal.
2. Matched-pairs design (paired observations design): the two random samples are paired, for instance by way of another variable, so the samples are dependent. For example, randomly select 25 cars; on each car, put type 1 tires on the left-front and left-rear wheels and type 2 tires on the right-front and right-rear wheels. The sample sizes have to be equal because of the pairing.
Table 1: How can the sample data be stored?
- Independent samples design: the data of sample 1 in column 1 of a dataset, those of sample 2 in column 2.
- Matched-pairs design: the data of both samples in column 1, while column 2 indicates whether the observations come from sample 1 or sample 2.
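As a quick illustration of Table 1, the sketch below builds both storage layouts for a small, made-up tire-lifespan dataset (pandas assumed available; all numbers and column names such as lifespan_type1 are hypothetical).

# Hypothetical illustration of the two storage formats in Table 1.
import pandas as pd

# Independent samples design: one column per sample (here both samples happen to have 5 observations).
independent = pd.DataFrame({
    "lifespan_type1": [61, 58, 63, 60, 59],   # sample 1: tires of type 1
    "lifespan_type2": [55, 57, 54, 58, 56],   # sample 2: tires of type 2
})

# Matched-pairs design: all measurements in one column, a second column tells which sample each belongs to.
matched = pd.DataFrame({
    "lifespan": [61, 55, 58, 57, 63, 54],
    "sample":   [1, 2, 1, 2, 1, 2],
})

print(independent)
print(matched)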
In this subsection, the independent samples design will be analysed. The two samples are drawn independently of each other; sample 1 has size n1 and sample 2 has size n2. The usual estimator of μ1 − μ2 is simply the difference X̄1 − X̄2. Since both samples are random samples, the estimators X̄1 and X̄2 are unbiased and consistent.
It can be concluded that:
\[ E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2 \]
The estimator X̄1 − X̄2 will approximate the parameter μ1 − μ2 very precisely if both sample sizes are large enough.
Because both samples are random samples and are drawn independently, the following can be concluded about the variance and the standard deviation:
\[ V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]
\[ SD(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
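As a rough check of these two formulas, the short simulation below draws many pairs of independent samples for arbitrarily chosen values μ1 = 10, μ2 = 8, σ1 = 3, σ2 = 2, n1 = 40, n2 = 30 (all hypothetical) and compares the simulated mean and standard deviation of X̄1 − X̄2 with μ1 − μ2 and with the SD formula.

# Simulation sketch: check E(X1bar - X2bar) = mu1 - mu2 and SD(X1bar - X2bar) = sqrt(s1^2/n1 + s2^2/n2).
# All parameter values below are arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(1)
mu1, sigma1, n1 = 10.0, 3.0, 40
mu2, sigma2, n2 = 8.0, 2.0, 30

diffs = np.array([
    rng.normal(mu1, sigma1, n1).mean() - rng.normal(mu2, sigma2, n2).mean()
    for _ in range(100_000)
])

print("simulated mean of X1bar - X2bar:", diffs.mean())    # close to mu1 - mu2 = 2
print("simulated SD of X1bar - X2bar:  ", diffs.std())     # close to the theoretical SD
print("theoretical SD:                 ", np.sqrt(sigma1**2/n1 + sigma2**2/n2))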
When one of the following situations is valid:
1. The normal distributions N(μ1, σ1²) and N(μ2, σ2²) are good models for the variable X on, respectively, population 1 and population 2.
2. Both sample sizes n1 and n2 are large.
then it follows that X̄1 − X̄2 is also (approximately) normally distributed:
\[ \bar{X}_1 - \bar{X}_2 \approx N\!\left(\mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right) \]
\[ Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \approx N(0, 1) \]
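As a small numerical illustration (all numbers hypothetical, and the population variances treated as known), the realized value of this pivot can be computed as follows.

# Hypothetical numerical example for the pivot Z (population variances assumed known).
import math

xbar1, xbar2 = 10.4, 9.1
sigma1_sq, sigma2_sq = 9.0, 4.0
n1, n2 = 40, 30
mu_diff = 0.0   # value of mu1 - mu2 that is plugged in (e.g. a hypothesized value)

z = (xbar1 - xbar2 - mu_diff) / math.sqrt(sigma1_sq/n1 + sigma2_sq/n2)
print(z)   # about 2.17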
Z is a pivot. To obtain interval estimators for μ1 − μ2, the unknown population variances have to be replaced by good estimators; an obvious choice is to use the sample variances S1² and S2².
The case that the two population variances are equal (σ1² = σ2²)
When the population variances are equal, only one unknown parameter, σ², has to be replaced by a good estimator, and it is wise to use both samples to obtain that estimator. The Z-value can then be written as:
\[ Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \approx N(0, 1) \]
The estimator of σ² that combines both samples is:
\[ S_p^2 = \frac{n_1 - 1}{n_1 + n_2 - 2}\, S_1^2 + \frac{n_2 - 1}{n_1 + n_2 - 2}\, S_2^2 \]
The estimator above is called the pooled sample variance; it always falls between S1² and S2². If the two sample sizes are equal (a balanced design), the pooled variance is precisely the average of S1² and S2². If the first sample is larger than the second, the pooled variance lies closer to S1² than to S2².
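A minimal sketch of the pooled sample variance for hypothetical sample results s1² = 16, s2² = 25 with n1 = 12, n2 = 8; because sample 1 is larger, the result lies closer to s1².

# Pooled sample variance: a weighted average of s1^2 and s2^2 (hypothetical numbers).
n1, n2 = 12, 8
s1_sq, s2_sq = 16.0, 25.0

sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
print(sp_sq)   # 19.5, which is closer to s1^2 than to s2^2 because sample 1 is larger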
Replacing σ² in Z by the estimator Sp² yields a statistic T that has a t-distribution with n1 + n2 − 2 degrees of freedom:
\[ T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]
This leads to the following interval estimators (the bounds of a confidence interval for μ1 − μ2 with confidence level 1 − α) in the case of equal variances:
\[ L = \bar{X}_1 - \bar{X}_2 - t_{\alpha/2;\, n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
\[ U = \bar{X}_1 - \bar{X}_2 + t_{\alpha/2;\, n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
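The sketch below turns these bounds into numbers for a hypothetical example (x̄1 = 72.1, x̄2 = 68.4, the pooled variance 19.5 from the earlier sketch, n1 = 12, n2 = 8, confidence level 95%); scipy is assumed available for the t-quantile.

# 95% confidence interval for mu1 - mu2, equal-variances case (hypothetical data summaries).
import math
from scipy.stats import t

n1, n2 = 12, 8
xbar1, xbar2 = 72.1, 68.4
sp_sq = 19.5          # pooled sample variance from the earlier sketch
alpha = 0.05

half_width = t.ppf(1 - alpha/2, n1 + n2 - 2) * math.sqrt(sp_sq * (1/n1 + 1/n2))
L = xbar1 - xbar2 - half_width
U = xbar1 - xbar2 + half_width
print(L, U)   # roughly (-0.5, 7.9)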
A five-step test procedure about μ1 − μ2 (the equal-variances t-test) can also be used:
1. Test problems:
a) Test H0: μ1 − μ2 ≤ h against H1: μ1 − μ2 > h
b) Test H0: μ1 − μ2 ≥ h against H1: μ1 − μ2 < h
c) Test H0: μ1 − μ2 = h against H1: μ1 − μ2 ≠ h
2. Test statistic:
\[ T = \frac{\bar{X}_1 - \bar{X}_2 - h}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]
3. Rejection:
a) Reject H0 ↔ t ≥ t_{α; n1+n2−2}
b) Reject H0 ↔ t ≤ −t_{α; n1+n2−2}
c) Reject H0 ↔ t ≥ t_{α/2; n1+n2−2} or t ≤ −t_{α/2; n1+n2−2}
4. Calculate val, the realized value of the test statistic (fill in the sample data).
5. Draw the statistical conclusion.
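The sketch below walks through these five steps for test problem a) with h = 0 and α = 0.05, using two small hypothetical samples; the manually computed t-value can be cross-checked against scipy.stats.ttest_ind with equal_var=True, which performs the same pooled-variance test.

# Five-step equal-variances t-test of H0: mu1 - mu2 <= 0 against H1: mu1 - mu2 > 0 (hypothetical data).
import math
import numpy as np
from scipy import stats

x1 = np.array([72.5, 69.8, 74.1, 71.0, 73.3, 70.2])   # sample 1
x2 = np.array([68.4, 70.1, 66.9, 69.5, 67.8])          # sample 2
n1, n2 = len(x1), len(x2)
h, alpha = 0.0, 0.05

# Steps 2 and 4: the test statistic and its realized value val.
sp_sq = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
val = (x1.mean() - x2.mean() - h) / math.sqrt(sp_sq * (1/n1 + 1/n2))

# Step 3: one-sided rejection region t >= t_{alpha; n1+n2-2}.
critical = stats.t.ppf(1 - alpha, n1 + n2 - 2)

# Step 5: statistical conclusion (True means: reject H0).
print(val, critical, val >= critical)

# Cross-check of the t-value with scipy's pooled-variance test.
print(stats.ttest_ind(x1, x2, equal_var=True))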
The case that the two population variances are unequal (σ1² ≠ σ2²)
Interval estimators:
\[ L = \bar{X}_1 - \bar{X}_2 - t_{\alpha/2;\, m}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \]
\[ U = \bar{X}_1 - \bar{X}_2 + t_{\alpha/2;\, m}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \]
Here m = min(n1, n2) − 1, the smaller of the two sample sizes minus 1. For example, with one sample of 150 observations and one of 100 observations, m = 100 − 1 = 99.
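A sketch of these bounds with the conservative choice m = min(n1, n2) − 1, for hypothetical summary statistics; note that statistical software often uses a different (Welch) degrees-of-freedom formula, so this follows only the simple rule stated above.

# 95% confidence interval for mu1 - mu2, unequal-variances case with m = min(n1, n2) - 1
# (hypothetical summary statistics).
import math
from scipy.stats import t

n1, n2 = 150, 100
xbar1, xbar2 = 41.2, 38.7
s1_sq, s2_sq = 30.0, 55.0
alpha = 0.05

m = min(n1, n2) - 1   # 99 degrees of freedom
half_width = t.ppf(1 - alpha/2, m) * math.sqrt(s1_sq/n1 + s2_sq/n2)
print(xbar1 - xbar2 - half_width, xbar1 - xbar2 + half_width)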
A five-step test procedure about μ1 − μ2 (the unequal-variances test) can also be used:
1. Test problems:
a) Test H0: μ1 − μ2 ≤ h against H1: μ1 − μ2 > h
b) Test H0: μ1 − μ2 ≥ h against H1: μ1 − μ2 < h
c) Test H0: μ1 − μ2 = h against H1: μ1 − μ2 ≠ h
2. Test statistic:
\[ G = \frac{\bar{X}_1 - \bar{X}_2 - h}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]
3. Rejection:
a) Reject H0 ↔ g ≥ t_{α; m}
b) Reject H0 ↔ g ≤ −t_{α; m}