Management Research Methods 1 (MRM1)
Exam – Friday, 23 October 2015 – 17:00-20:00 – 3 hours
Question 1 – Quartiles and Transformation
A small study produced data on 20 household incomes (in thousands of euros):
19.0 18.5 22.5 20.5 28.0 26.0 33.5 29.0 35.0 34.0
36.5 35.0 45.0 39.5 50.0 46.0 53.5 50.5 89.5 74.0
1a. (5) If we wanted to draw a box plot, then we would need to draw whiskers. What is the starting point
and endpoint of the upper whisker? Use Tukey's hinges.
First, put the data in ascending order:
Rank:    1     2     3     4     5     6     7     8     9     10
Income:  18.5  19.0  20.5  22.5  26.0  28.0  29.0  33.5  34.0  35.0
Rank:    11    12    13    14    15    16    17    18    19    20
Income:  35.0  36.5  39.5  45.0  46.0  50.0  50.5  53.5  74.0  89.5
$L_1 = \tfrac{1}{2} \cdot (1 + 10) = 5.5 \implies Q_1 = X_5 + 0.5 \cdot (X_6 - X_5) = 27.0$
$L_3 = \tfrac{1}{2} \cdot (11 + 20) = 15.5 \implies Q_3 = X_{15} + 0.5 \cdot (X_{16} - X_{15}) = 48.0$
The upper whisker starts at $Q_3 = 48.0$.
$\mathit{IQR} = Q_3 - Q_1 = 21.0 \implies 1.5 \cdot \mathit{IQR} = 31.5$
Upper fence: $Q_3 + 1.5 \cdot \mathit{IQR} = 79.5$
Therefore, there is one outlier (89.5 > 79.5), and the upper whisker ends at 74.0, the largest
observation that does not exceed the upper fence.
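
The calculation above can be checked with a short Python sketch. It assumes the usual reading of Tukey's hinges (each hinge is the median of one half of the sorted data, with the overall median counted in both halves when n is odd); the variable names are illustrative only.

```python
from statistics import median

# Household incomes in thousands of euros, as given in the question.
incomes = [19.0, 18.5, 22.5, 20.5, 28.0, 26.0, 33.5, 29.0, 35.0, 34.0,
           36.5, 35.0, 45.0, 39.5, 50.0, 46.0, 53.5, 50.5, 89.5, 74.0]

data = sorted(incomes)
n = len(data)

# Tukey's hinges: medians of the lower and upper halves of the sorted data.
# For odd n the middle observation is included in both halves.
half = (n + 1) // 2
q1 = median(data[:half])
q3 = median(data[n - half:])

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr

# The upper whisker runs from Q3 to the largest observation within the fence.
whisker_end = max(x for x in data if x <= upper_fence)
outliers = [x for x in data if x > upper_fence]

print(q1, q3, iqr, upper_fence, whisker_end, outliers)
# -> 27.0 48.0 21.0 79.5 74.0 [89.5]
```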
1b. (3) Researchers in financial statistics often use 'median household income' instead of 'mean household
income'. Why?
The income distribution is skewed to the right, and the mean is sensitive to the extremely high
incomes, while the median is not. Using the median rather than the mean therefore gives a more
representative picture of the typical income level, because it is not pulled upward by a few
extreme values at the high end.
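
As an illustration with the 20 incomes from this question (a sketch only, not part of the model answer), removing the single highest income changes the mean noticeably but leaves the median where it was:

```python
from statistics import mean, median

# Household incomes in thousands of euros, as given in the question.
incomes = [19.0, 18.5, 22.5, 20.5, 28.0, 26.0, 33.5, 29.0, 35.0, 34.0,
           36.5, 35.0, 45.0, 39.5, 50.0, 46.0, 53.5, 50.5, 89.5, 74.0]

# Same data without the highest income (89.5).
without_top = sorted(incomes)[:-1]

mean_all, median_all = mean(incomes), median(incomes)
mean_trimmed, median_trimmed = mean(without_top), median(without_top)

print(f"all 20 incomes:  mean = {mean_all:.3f}, median = {median_all:.1f}")
print(f"without 89.5:    mean = {mean_trimmed:.3f}, median = {median_trimmed:.1f}")
```

The mean of the full sample (39.275) lies well above the median (35.0), which is the pattern expected for a right-skewed distribution.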
As the SPSS table below shows, the Income distribution is quite skewed. In order to perform a test, the
data have been transformed using: $\mathit{LnIncome} = \ln(\mathit{Income})$.
Descriptive Statistics
            N    Mean     Std. Deviation   Skewness                 Kurtosis
                                           Statistic   Std. Error   Statistic   Std. Error
Income      20   39.275   18.123           1.381       0.512        2.159       0.992
LnIncome    20   3.581    0.427            0.314       0.512        -0.174      0.992
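
The LnIncome row can be reproduced from the raw data with a small Python sketch. It assumes the table reports the sample standard deviation (divisor n − 1); the results should agree with the table up to rounding.

```python
import math
from statistics import mean, stdev

# Household incomes in thousands of euros, as given in Question 1.
incomes = [19.0, 18.5, 22.5, 20.5, 28.0, 26.0, 33.5, 29.0, 35.0, 34.0,
           36.5, 35.0, 45.0, 39.5, 50.0, 46.0, 53.5, 50.5, 89.5, 74.0]

# Natural-log transformation: LnIncome = ln(Income).
ln_incomes = [math.log(x) for x in incomes]

ln_mean = mean(ln_incomes)
ln_sd = stdev(ln_incomes)  # sample standard deviation (n - 1 in the denominator)

print(f"LnIncome: mean = {ln_mean:.3f}, std. deviation = {ln_sd:.3f}")
```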
1c. (10) Is the population median of household income lower than € 40 000? This research question about
the median could be tested with a significance level of 5%. But first, motivate in detail (include
relevant statistical tests!) why it is better to test the median of Income, instead of the mean.
Motivation for test of the median and not the mean:
The sample size is small, so for a valid test of the population mean, the test variable must be normally
distributed.
However, the distribution of Income has severe positive skewness, as its skewness is 1.381 > 1.0.
Test Income for skewness: $z = 1.381/0.512 = 2.70$, $P(z < 2.70) = 0.9965$, p-value $= 2(1 - 0.9965) =
0.007 < 0.10$, so the variable Income is significantly skewed. Therefore we cannot do a valid test of the
population mean.
LnIncome has no severe skewness: 0.314 < 1.0. Test LnIncome for skewness: $z = 0.314/0.512 = 0.61$,
$P(z < 0.61) = 0.7291$, p-value $= 2(1 - 0.7291) = 0.5418 > 0.10$, so LnIncome is not significantly skewed.
Therefore we can do a valid test of the population mean of LnIncome. Because ln is strictly increasing,
the median of LnIncome equals the ln of the median of Income, and for a symmetric distribution the mean
and median of LnIncome coincide. Testing the mean of LnIncome therefore corresponds with testing the
median of Income.
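
The two skewness tests can be sketched in Python as follows. The helper name `skewness_z_test` is hypothetical, and the code uses the unrounded z-values, so the p-values differ slightly from the table-based figures in the answer.

```python
from statistics import NormalDist

def skewness_z_test(skewness, std_error):
    """Return z and the two-sided p-value for H0: population skewness = 0."""
    z = skewness / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.10  # significance level used for the skewness tests in the answer

# Skewness statistics and standard errors taken from the SPSS table.
z_income, p_income = skewness_z_test(1.381, 0.512)
z_ln, p_ln = skewness_z_test(0.314, 0.512)

print(f"Income:   z = {z_income:.2f}, p = {p_income:.3f}, significant: {p_income < alpha}")
print(f"LnIncome: z = {z_ln:.2f}, p = {p_ln:.3f}, significant: {p_ln < alpha}")
```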
1d. (6) We could do the test whether the population median of household income is lower than € 40 000.
We will only do parts of the test: Give the hypotheses, calculate the test statistic t and interpret its
value.
Hypotheses:
H0: $PM_{Income} = 40$    H1: $PM_{Income} < 40$
H0: $\mu_{LnIncome} = \ln(40) = 3.689$    H1: $\mu_{LnIncome} < \ln(40) = 3.689$
Test statistic:
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \sim t[df = n - 1 = 19]$
$t = \dfrac{3.581 - 3.689}{0.427 / \sqrt{20}} = -1.131$
Interpretation: The sample mean of LnIncome lies 1.131 standard errors below the value given in the
null hypothesis.
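
The test statistic can be reproduced from the summary statistics in the SPSS table with a minimal Python sketch. It rounds ln(40) to 3.689, as the answer does; with the unrounded value of ln(40) the result is approximately -1.13.

```python
import math

# Summary statistics for LnIncome, taken from the SPSS table.
n = 20
mean_ln = 3.581
sd_ln = 0.427

# Hypothesised mean of LnIncome under H0, rounded to three decimals as in the answer.
mu0 = round(math.log(40), 3)

standard_error = sd_ln / math.sqrt(n)
t = (mean_ln - mu0) / standard_error
df = n - 1

print(f"t = {t:.3f}, df = {df}")
# -> t = -1.131, df = 19
```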