Introduction to the practice of statistics
Chapter 6: Introduction to inference
Statistical inference draws conclusions about a population or process from sample data. It also
provides a statement of how much confidence we can place in our conclusions.
Types of statistical inference:
1. Confidence intervals: estimate the value of a population parameter.
2. Tests of significance: assess the evidence for a claim about a population.
Formal inference emphasizes substantiating our conclusions via probability calculations.
Sampling distribution of a statistic: describes what would happen if we used the inference method
many times.
6.1: Estimating with confidence
The sample mean (x-bar) is the natural estimator of the unknown population mean (mu).
Unbiasedness of an estimator concerns the center of its sampling distribution, but questions about
variation are answered by looking at its spread.
Confidence interval= estimate +/- margin of error
Two important things about a confidence interval are common to all settings:
1. It is an interval of the form (a,b), where a and b are numbers computed from the sample
data.
2. It has a property called a confidence level that gives the probability of producing an interval
that contains the unknown parameter.
A level C confidence interval for a parameter is an interval computed from sample data by a method
that has probability C of producing an interval containing the true value of the parameter.
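The confidence level describes the method, not one particular interval. A minimal simulation sketch
of that idea follows. It assumes a Normal population with known standard deviation and uses the
interval x-bar +/- z* × sigma/sqrt(n); the population values, the seed, and the function name
coverage are made up for illustration.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, reps=2000, seed=1):
    """Fraction of simulated level-C intervals that contain the true mean mu."""
    rng = random.Random(seed)
    # z* leaves area (1 - C)/2 in each tail of the standard Normal curve.
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    m = z_star * sigma / n ** 0.5  # margin of error, the same for every sample
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean.
        x_bar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - m <= mu <= x_bar + m:
            hits += 1
    return hits / reps

print(coverage())  # should be close to 0.95
```

Over many repetitions, roughly 95% of the intervals capture mu, which is what "probability C" means
here.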
Margin of error: m = z* × sigma / sqrt(n), where sigma is the population standard deviation
Confidence interval = x-bar +/- m
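As a rough sketch, the interval can be computed with Python's standard library. The function name
z_interval and the example numbers (x-bar = 100, sigma = 15, n = 36) are hypothetical; the
calculation assumes sigma is known.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, level=0.95):
    """Return (low, high) for the interval x_bar +/- z* * sigma / sqrt(n)."""
    # Critical value z*: area (1 - C)/2 to its right under the standard Normal curve.
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    m = z_star * sigma / sqrt(n)  # margin of error
    return x_bar - m, x_bar + m

low, high = z_interval(x_bar=100.0, sigma=15.0, n=36, level=0.95)
print(round(low, 2), round(high, 2))
```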
Ways to reduce a margin of error that is too large:
- Use a lower level of confidence (smaller C)
- Choose a larger sample size (larger n)
- Reduce the standard deviation (sigma)
Sample size for desired margin of error: n = (z* × sigma / m)^2
z*: the smaller C, the smaller z*, and the narrower the confidence interval.
Standard deviation: the smaller the standard deviation, the narrower the confidence interval.
n: the larger n, the smaller sigma/sqrt(n), and the narrower the confidence interval.
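The sample-size formula can be sketched as follows. The function name sample_size and the example
values are illustrative assumptions. The result is rounded up to the next whole number, because
rounding down would give a margin of error slightly larger than the one requested.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, m, level=0.95):
    """Smallest n whose margin of error z* * sigma / sqrt(n) is at most m."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return ceil((z_star * sigma / m) ** 2)

# Hypothetical example: sigma = 15, desired margin of error m = 3.
print(sample_size(15.0, 3.0, level=0.95))
print(sample_size(15.0, 3.0, level=0.90))  # a lower confidence level needs fewer observations
```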
6.2: Tests of significance
Null hypothesis (H0): the statement being tested in a test of significance. The test of significance is
designed to assess the strength of the evidence against the null hypothesis.
- H0: there is no difference in the population means.
- H0: the difference in population means is zero.
Alternative hypothesis (Ha): the statement we hope or suspect is true instead of H0.
- Ha: the population means are not the same.
- Ha: the difference in population means is not zero.
H0 and Ha are always stated in terms of population parameters, not sample statistics.
Test statistic: measures compatibility between the null hypothesis and the data.
Test statistic = (estimate - hypothesized value) / (standard deviation of the estimate)
P-value: the probability, assuming H0 is true, that the test statistic would take a value as extreme or
more extreme than that actually observed. The smaller the P-value, the stronger the evidence
against H0 provided by the data.
Significance level: when we choose alpha = 0.05, we are requiring that the data give evidence against
H0 so strong that it would happen no more than 5% of the time when H0 is true.
If the P-value is as small as or smaller than alpha, we say that the data are statistically significant
at level alpha.
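The meaning of the significance level can be checked with a small simulation. This is only a
sketch: it assumes a Normal population with known sigma, a two-sided z test, and made-up values for
the population, the seed, and the function name false_rejection_rate. When H0 is true, about 5% of
samples should be significant at alpha = 0.05.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def false_rejection_rate(mu0=0.0, sigma=1.0, n=25, alpha=0.05, reps=4000, seed=7):
    """Fraction of samples drawn with H0 true that are significant at level alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(reps):
        # The data really come from a population with mean mu0, so H0 is true.
        x_bar = mean(rng.gauss(mu0, sigma) for _ in range(n))
        z = (x_bar - mu0) / (sigma / sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided P-value
        if p_value <= alpha:
            rejections += 1
    return rejections / reps

print(false_rejection_rate())  # should be close to 0.05
```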
Four steps common to all tests of significance are as follows:
1. State the null hypothesis H0 and the alternative hypothesis Ha. The test is designed to assess
the strength of the evidence against H0; Ha is the statement that we will accept if the
evidence enables us to reject H0.
2. Calculate the value of the test statistic on which the test will be based. This statistic usually
measures how far the data are from H0. z = (x-bar - mu0) / (sigma / sqrt(n))
3. Find the P-value for the observed data. This is the probability, calculated assuming that H0 is
true, that the test statistic will weigh against H0 at least as strongly as it does for these
data. Use Table A to find it.
4. State a conclusion. Choose a significance level alpha, which states how much evidence against H0
you regard as decisive. If the P-value is less than or equal to alpha, you reject H0 in favor of
the alternative hypothesis; if it is greater than alpha, you conclude that the data do not
provide sufficient evidence to reject the null hypothesis.
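The four steps can be sketched for the z test of a population mean. The function name z_test, the
alternative labels, and the example numbers are assumptions for illustration; the standard Normal
distribution from Python's statistics module takes the place of the table lookup.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, P-value) for H0: mu = mu0 with known sigma."""
    # Step 2: test statistic.
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # Step 3: P-value from the standard Normal distribution.
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: mu > mu0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: mu < mu0
        p_value = std_normal.cdf(z)
    else:                             # Ha: mu != mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value

# Step 1 (hypothetical example): H0: mu = 100 against Ha: mu != 100.
z, p_value = z_test(x_bar=104.0, mu0=100.0, sigma=15.0, n=36)
# Step 4: compare the P-value with alpha.
alpha = 0.05
print(round(z, 2), round(p_value, 4), "significant" if p_value <= alpha else "not significant")
```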
A two-sided test at significance level alpha can be carried out directly from a confidence interval with
confidence level C= 1 – alpha.
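A short sketch of this link, assuming the z procedures with known sigma: H0: mu = mu0 is rejected
at level alpha when mu0 falls outside the level 1 - alpha confidence interval. Both function names
and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def rejects_by_interval(x_bar, mu0, sigma, n, alpha=0.05):
    """Reject H0 when mu0 lies outside the level (1 - alpha) confidence interval."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    m = z_star * sigma / sqrt(n)
    return not (x_bar - m <= mu0 <= x_bar + m)

def rejects_by_p_value(x_bar, mu0, sigma, n, alpha=0.05):
    """Reject H0 when the two-sided P-value is at most alpha."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value <= alpha

# The two decisions agree for each hypothetical sample mean.
for x_bar in (104.0, 106.0):
    print(x_bar, rejects_by_interval(x_bar, 100.0, 15.0, 36),
          rejects_by_p_value(x_bar, 100.0, 15.0, 36))
```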
Critical value: a value z* with a specified area to its right under the standard Normal curve.
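Critical values can be computed instead of read from a table. A minimal sketch, where the function
name critical_value is an assumption: for a level C confidence interval, z* has area (1 - C)/2 to
its right.

```python
from statistics import NormalDist

def critical_value(area_right):
    """Value z* with the given area to its right under the standard Normal curve."""
    return NormalDist().inv_cdf(1 - area_right)

for level in (0.90, 0.95, 0.99):
    z_star = critical_value((1 - level) / 2)
    print(level, round(z_star, 3))
```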
6.3: Use and abuse of tests
Ha is the research hypothesis asserting that some effect or difference is present. The null hypothesis
H0 says that there is no effect or no difference. A low P-value represents good evidence that the
research hypothesis is true.
When a null hypothesis can be rejected at the usual level alpha = 0.05, there is good evidence that an
effect is present. That effect, however, can be extremely small. When large samples are available,
even tiny deviations from the null hypothesis will be statistically significant.
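The effect of sample size can be illustrated with a sketch that holds a tiny observed difference
fixed and lets n grow. The numbers (a sample mean of 100.1 against mu0 = 100 with sigma = 15) and
the function name two_sided_p are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(x_bar, mu0, sigma, n):
    """Two-sided P-value of the z test for H0: mu = mu0 with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same difference of 0.1 goes from far from significant to highly significant.
for n in (100, 10_000, 1_000_000):
    print(n, two_sided_p(100.1, 100.0, 15.0, n))
```

The difference itself never changes; only the standard deviation of x-bar shrinks as n increases.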
Statistical significance is not the same as practical significance. Statistical significance rarely tells us
about the importance of the experimental results.