Biostatistics: An Applied Introduction for the Public Health Practitioner, 1e
Heather M. Bush (Test Bank)
Chapter 1: An Overview of Statistical Concepts
MULTIPLE CHOICE
1. Which of the following is NOT an advantage of a retrospective study?
a. Sample size required is relatively small compared to that of a prospective study.
b. Efficient when the outcome is rare.
c. Efficient when the outcome requires a long time to develop.
d. Investigators have control over the way the variables are collected.
ANS: D
A retrospective study typically uses the data that are obtained before the initiation of the study. Thus,
the investigators have no control over the way the variables are collected.
PTS: 1
2. The central limit theorem states that the distribution of _________ will be approximately normal when the
samples are large enough, regardless of the shape of the population distribution.
a. a sample size
b. sample means
c. a continuous random variable
d. a discrete random variable
ANS: B PTS: 1
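The idea behind question 2 can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the exponential population, the sample size of 50, the 2000 repetitions, and the helper names `sample_means` and `skewness` are all choices made here, not part of the test bank. It compares the skewness of single observations with the skewness of means of 50 observations.

```python
import random
import statistics

def sample_means(n, draws=2000, seed=0):
    """Return `draws` sample means, each computed from n exponential(1) observations."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(draws)]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

skew_n1 = skewness(sample_means(1))    # single observations: strongly right-skewed
skew_n50 = skewness(sample_means(50))  # means of 50 observations: far more symmetric
print(round(skew_n1, 2), round(skew_n50, 2))
```

The population itself stays skewed; only the distribution of the sample means moves toward a symmetric, bell-like shape as the sample size grows.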
3. Which of the following statements is considered a null hypothesis?
a. Children who watch more than 2 hours of TV have the same weight gain as children
who watch less than 2 hours of TV.
b. Smoking is related to poorer health status.
c. There is a greater amount of fatigue in health-care professionals in urban areas.
d. Increases in activity are associated with decreases in cognition.
ANS: A
The null hypothesis is a statement of no effect, status quo, or no difference.
PTS: 1
4. Which of the following is most likely a type 2 error?
a. Correctly concluding that an effect exists.
b. Correctly concluding that no effect exists.
c. Falsely concluding that an effect exists.
d. Falsely concluding that no effect exists.
ANS: D
A type 2 error is failing to reject a null hypothesis that is false; its probability is usually
denoted beta. Since a null hypothesis typically states that there is no effect, a type 2 error means
concluding that there is no effect when an effect truly exists.
PTS: 1
5. Power, type 1 error, and type 2 error are all related. Which of the following statements is NOT true?
a. Probability of type 2 error = 1-power.
b. If the probability of type 1 error increases, the power also increases.
c. Probability of type 1 error is inversely related to the probability of type 2 error.
d. Probability of type 1 error increases with the probability of type 2 error.
ANS: D
For a fixed sample size and effect size, the probabilities of type 1 and type 2 errors are inversely
related. This means that as one increases, the other decreases.
PTS: 1
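The relationships in question 5 can be illustrated numerically. The following is a minimal sketch, assuming a one-sided z-test with known standard deviation; the function name `power_one_sided_z` and the effect size, standard deviation, and sample size are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

def power_one_sided_z(alpha, effect, sigma, n):
    """Power of a one-sided z-test of H0: mu = 0 when the true mean is `effect` > 0."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)       # rejection cutoff on the z scale
    shift = effect / (sigma / n ** 0.5)   # true mean of the test statistic
    return 1 - z.cdf(critical - shift)

# Hypothetical design: effect = 0.5, sigma = 1.0, n = 20
for alpha in (0.01, 0.05, 0.10):
    power = power_one_sided_z(alpha, effect=0.5, sigma=1.0, n=20)
    print(f"alpha={alpha:.2f}  power={power:.3f}  type 2 error={1 - power:.3f}")
```

With the design held fixed, raising the type 1 error rate raises the power and lowers the type 2 error rate, which is the trade-off described in choices b and c.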
SHORT ANSWER
1. A group of investigators are interested in conducting a clinical trial to determine whether taking
bio-identical treatments prevents postmenopausal osteoporosis in women. They obtained a list of
physicians in the region and recruited their patients by mailing out response cards to all eligible
women; 60% of the cards were returned and 75% of those respondents entered the study. They were
equally divided into the treatment group and placebo group, and were followed for 5 years to
determine if they develop osteoporosis. Identify the population, sample, sampling frame, and type of
study.
ANS:
Population: all postmenopausal women in the region seeking treatment from these physicians
Sample: (0.60*0.75)*100%=45% of the eligible women; the women who received and returned the
response cards and were willing to enter the study; voluntary-response sample
Sampling frame: eligible patient mailing lists from physicians
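Type of study: prospective clinical trial (an experimental study; the participants are assigned to
treatment or placebo and followed forward in time, as in a cohort study)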
PTS: 1
2. An outbreak of food poisoning occurred at a high school a few hours after 100 students ate lunch at the
school’s cafeteria. Most of them developed symptoms including nausea, vomiting, and diarrhea. An
investigation was conducted to identify the contaminated food source. Interviews were conducted with
students who ate at the cafeteria, and a food history was collected. Describe this study design
(case-control, cohort, cross-sectional). Explain.
ANS:
A cohort study would not be appropriate because the event has already occurred. The cross-sectional
study is typically used to determine the prevalence. A case-control study is reasonable because the
students with and without disease were identifiable. It allows investigators to compare the occurrence
of the disease among those who ate certain food with those who did not.
PTS: 1
3. A 2-year study was conducted to investigate bicycle safety at a city’s 10 most congested intersections.
The primary variable of interest was the number of accidents that involved a bicycle. Describe the
number of accidents involving a bicycle as a continuous and ordinal variable.
ANS:
The number of bicycle accidents can be treated as a continuous variable if the number of accidents
is large. A mean can then be used to summarize the average number of accidents across the 10 intersections.
When the number of accidents is grouped as “large,” “medium,” and “small,” an ordinal variable can
be used to describe the number of accidents.
PTS: 1
4. Briefly describe the difference between a bar graph and a histogram. Consider your own area of
interest. Provide an example where a bar graph would be most appropriate. Provide an example where
a histogram would be most appropriate.
ANS:
A bar graph is often used to compare the categories of a categorical variable visually. A histogram is
typically used to describe the distribution of a continuous variable grouped into intervals. Generally,
bar graphs have a gap between the bars, but the bars in a histogram touch each other.
PTS: 1
5. Twenty fibromyalgia patients are asked to register their pain on a visual analog scale (VAS), where 0
represents no pain and 5 represents the worst pain imaginable. The responses are {0, 0, 0, 1, 1, 2, 2, 3,
3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5}.
a. What are the frequency distribution and probability distribution of this sample?
b. What are the mean, median, and mode for the VAS?
ANS:
a. The frequency distribution of the VAS would be 3 zeros, 2 ones, 2 twos, 3 threes, 5
fours, and 5 fives. The probability distribution would be 15% zeros, 10% ones, 10%
twos, 15% threes, 25% fours, and 25% fives.
b. The mean would be (0*3+1*2+2*2+3*3+4*5+5*5)/20 = 3. The median is 3.5, the average of the
10th and 11th ordered values (3 and 4). The data are bimodal: 4 and 5 each occur five times, so
both are modes.
PTS: 1
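The summary statistics for the VAS responses in question 5 can be reproduced with Python's standard library. This is a small sketch using the data given in the question; the variable names are chosen here for illustration.

```python
from collections import Counter
import statistics

# VAS responses from question 5
vas = [0, 0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5]

freq = Counter(vas)  # frequency distribution: score -> count
rel_freq = {score: count / len(vas) for score, count in sorted(freq.items())}

mean = statistics.mean(vas)
median = statistics.median(vas)    # average of the 10th and 11th ordered values
modes = statistics.multimode(vas)  # every value tied for the highest count

print(dict(sorted(freq.items())))
print(rel_freq)
print(mean, median, modes)
```

Because the scores 4 and 5 each appear five times, `multimode` reports both values rather than a single mode.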
6. Consider a study where the majority of the study subjects have high body mass index (BMI) and only
a few have low BMI. When a histogram is constructed to describe the distribution of BMI, what shape
of the distribution is most likely to be observed (right-skewed, left-skewed, or symmetric)?
ANS:
The distribution would be left-skewed because there would be a peak in the higher BMI categories and
few people in the lower and medium BMI categories.
PTS: 1
7. You suspect that the salary of study participants might be skewed. A statistician told you that the
skewed data should not be described by a mean. Why?
ANS:
In a skewed data set, the mean is likely to be pulled towards the tail and does not provide a robust
measure of the center. A median would be more appropriate.
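A quick numerical illustration of question 7 follows. The salary figures below are hypothetical, invented only to show how one extreme value in the tail moves the mean far more than the median.

```python
import statistics

# Hypothetical salaries in thousands of dollars; the single very high value
# creates a long right tail.
salaries = [32, 35, 38, 40, 41, 43, 45, 48, 52, 400]

mean_salary = statistics.mean(salaries)      # pulled toward the tail
median_salary = statistics.median(salaries)  # stays near the typical salary

print(mean_salary, median_salary)
```

Nine of the ten salaries fall below the mean, so the mean describes almost nobody in this sample, while the median sits in the middle of the bulk of the data.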
PTS: 1
8. Suppose you are told that your BMI is 32, the 70th percentile for your age and sex. Interpret this
percentile.
ANS:
This means that if 100 typical people of your age and sex were ranked from lowest to highest BMI, you
would rank about 70th. In other words, about 70% of people of your age and sex have a BMI less than 32.
PTS: 1
9. John waited 30 minutes to be treated in an emergency room. A 30-minute wait is at the 20th percentile
of wait times. Did he have a comparatively long or short wait time? Interpret this percentile.
ANS:
John had a short wait time. In the emergency room, 20% of the patients were treated in less than 30
minutes and 80% of the patients had to wait more than 30 minutes.
PTS: 1
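The percentile interpretations in questions 8 and 9 can be sketched in code. The helper `percentile_rank` and the wait times below are hypothetical, and the definition used here (percentage of observations strictly below a value) is one of several conventions in use.

```python
def percentile_rank(values, x):
    """Percentage of observations strictly below x (one common convention)."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Hypothetical emergency room wait times in minutes
waits = [10, 20, 30, 40, 45, 50, 60, 75, 90, 120]

rank = percentile_rank(waits, 30)
print(rank)  # 2 of the 10 waits are shorter than 30 minutes
```

A low percentile rank for a wait time means most patients waited longer, so the wait was comparatively short.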
10. How does the shape of the sampling distribution of means with sample sizes of N=10 and N=100
differ? How are they the same?
ANS:
The sampling distribution with N=10 is wider than that with N=100 because the variability (standard
error) of the sample mean for the N=10 samples is much larger than for the N=100 samples. They are
the same in that both have the same overall shape (approximately a bell curve) and both peaks are
located in the same place, at the population mean.
PTS: 1
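The comparison in question 10 can be simulated directly. This is a rough sketch under assumptions made here: a normal population with mean 50 and standard deviation 10, 3000 simulated samples per sample size, and a hypothetical helper named `simulate_means`.

```python
import random
import statistics

MU, SIGMA = 50.0, 10.0  # assumed population mean and standard deviation

def simulate_means(n, draws=3000, seed=1):
    """Return `draws` sample means, each from n normal(MU, SIGMA) observations."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
            for _ in range(draws)]

results = {}
for n in (10, 100):
    means = simulate_means(n)
    center = statistics.fmean(means)  # where the sampling distribution peaks
    spread = statistics.stdev(means)  # empirical standard error
    theory = SIGMA / n ** 0.5         # theoretical standard error
    results[n] = (center, spread, theory)
    print(f"N={n}: center={center:.2f}  spread={spread:.2f}  theory={theory:.2f}")
```

Both sampling distributions are centered near the population mean, while the spread for N=10 is roughly the square root of 10 times the spread for N=100.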
11. A study was conducted to examine the gender difference in daily smoking prevalence.
a. State the null and research hypotheses for this test.
b. Suppose the p-value associated with the test is 0.04. What can you conclude?
ANS:
a. Research hypothesis: there is a difference in smoking prevalence between men and women.
Null hypothesis: there is no difference in smoking prevalence between men and women.
Equivalently, it can be written as H0: p_male = p_female, where p_male = the true proportion of male
smokers and p_female = the true proportion of female smokers.
b. Since the p-value (0.04) is less than the conventional 0.05 significance level, it is considered
small and the null hypothesis should be rejected. We would conclude that there is a difference in
smoking prevalence between men and women.
PTS: 1
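One common way to test the hypotheses in question 11 is a two-proportion z-test; the test bank does not say which test the study used, so this choice is an assumption. The counts below are hypothetical, and the function name `two_proportion_z_test` is chosen only for this sketch.

```python
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 90 of 400 men and 66 of 400 women smoke daily
z, p_value = two_proportion_z_test(x1=90, n1=400, x2=66, n2=400)
print(f"z={z:.2f}  p-value={p_value:.3f}")

if p_value < 0.05:
    print("Reject H0: the smoking prevalence differs between men and women.")
else:
    print("Fail to reject H0.")
```

With these made-up counts the p-value falls below 0.05, so the decision matches the reasoning in part b.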
12. A study was conducted to investigate whether the use of vitamins during prostate cancer treatment
would improve the prostate-specific antigen (PSA). PSA in study patients was measured before
treatment and four months after treatment. Suppose the change in PSA is 3.0 and there is a p-value of
0.07. How would you interpret this result?
ANS:
When a p-value is slightly larger than 0.05, the change is often described as “marginally
significant.” We avoid dismissing the result as not significant, because doing so risks
overlooking a change of 3.0 in PSA, which might be clinically significant to the researcher.
PTS: 1
13. In preparing a grant, you meet with a biostatistician to discuss sample sizes. The biostatistician tells
you that if you want more power, you will have to increase your sample size. Explain why this is the
case. Provide additional alternatives for increasing the power of the study without changing the sample
size.
ANS: