BSNS 112 Final Exam: Questions from Actual Past Exams with Correct Answers
Which of the following statements is/are true when comparing the Wilcoxon Rank Sum test to
the more commonly used parametric methods for testing differences in means?
(a) An advantage of the Wilcoxon Rank Sum test is that the null can be rejected due to
differences in the distribution other than just differences in the mean.
(b) An advantage of the Wilcoxon Rank Sum test is that it requires no distributional
assumptions.
(c) A disadvantage of the Wilcoxon Rank Sum test is that there is generally a higher chance of
rejecting the null hypothesis when the null is in fact not true.
(d) An advantage of the Wilcoxon Rank Sum test is that it usually has a higher power. -----------
Correct Answer ---------- (b) An advantage of the Wilcoxon Rank Sum test is that it requires no
distributional assumptions.
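As a rough illustration of why the Wilcoxon Rank Sum test leans so lightly on the shape of the data, the sketch below computes the rank-sum statistic by hand. The function name `rank_sum` and the two small samples are made up for this example; they are not from the exam. Only the ordering of the pooled observations enters the statistic, so an extreme outlier does not change it.

```python
def rank_sum(sample_a, sample_b):
    """Sum of the pooled ranks of sample_a (tied values share the average rank)."""
    pooled = sorted(sample_a + sample_b)
    avg_rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; their mean is (i + 1 + j) / 2.
        avg_rank[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(avg_rank[x] for x in sample_a)


# Hypothetical data, chosen only for illustration.
a = [1.1, 2.3, 3.8]
b = [2.9, 4.5, 5.0, 6.2]
w = rank_sum(a, b)

# Replacing the largest value with an extreme outlier leaves every rank unchanged.
w_outlier = rank_sum(a, [2.9, 4.5, 5.0, 620.0])
print(w, w_outlier)
```

A t-test on the same two datasets would react strongly to the outlier, because it works with the raw values rather than the ranks.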
Recall that in the ANOVA F-test, a numeric variable is measured across three or more
populations, and the test statistic can be interpreted as the ratio of "variance explained by
differences in group means" to "within-group variation." All else equal, which of the following
would unambiguously decrease the p-value?
(a) An increase in the sample mean for one treatment group whose mean was previously above
the combined sample mean (i.e., the "grand mean").
(b) An increase in the sample mean for one treatment group whose mean was previously below
the combined sample mean (i.e., the "grand mean").
(c) An increase in the within group variance for one of the treatment groups whose variance was
previously above the combined sample variance (i.e., the "grand variance"). ----------- Correct
Answer ---------- (a) An increase in the sample mean for one treatment group whose mean was
previously above the combined sample mean (i.e., the "grand mean").
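The effect described in answer (a) can be checked numerically. The sketch below is a minimal hand-rolled one-way ANOVA F statistic; the helper `f_statistic` and the three small groups are assumptions made for illustration. Shifting every observation in the top group upward leaves the within-group variation untouched and pushes that group's mean further from the grand mean, so F rises and, with the degrees of freedom unchanged, the p-value falls.

```python
def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical data: the third group's mean (10) is already above the grand mean.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_before = f_statistic(groups)

# Raise every value in the third group by 2; within-group spread is unchanged.
shifted = [groups[0], groups[1], [x + 2 for x in groups[2]]]
f_after = f_statistic(shifted)
print(f_before, f_after)
```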
Given a continuous random variable x, what is Pr(x = 0.5)?
(a) 0.00
(b) 0.25
(c) 0.50
(d) 1.00
(e) More information is needed to answer this question. ----------- Correct Answer ---------- (a)
0.00
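One way to see why the answer is 0.00 is to shrink an interval around 0.5 and watch its probability vanish. The sketch below assumes, purely for illustration, that x is Uniform(0, 1); the question itself does not name a distribution, and the helper `uniform_interval_prob` is hypothetical.

```python
def uniform_interval_prob(lo, hi):
    """Pr(lo <= X <= hi) for X ~ Uniform(0, 1): interval length clipped to [0, 1]."""
    lo, hi = max(lo, 0.0), min(hi, 1.0)
    return max(hi - lo, 0.0)


# The probability of landing within eps of 0.5 shrinks to zero as eps does.
for eps in (0.1, 0.01, 0.001, 0.0):
    print(eps, uniform_interval_prob(0.5 - eps, 0.5 + eps))
```

The same argument holds for any continuous distribution, since a single point is an interval of zero width.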
What is the greatest concern about the regression below?
(a) It has a small slope coefficient.
(b) It has a high R^2 value.
(c) Linear regression should not be used on these data.
(d) The residuals are too large.
(e) The regression line does not pass through zero. ----------- Correct Answer ---------- (c) Linear
regression should not be used on these data.
Assuming a large sample and unknown population variance:
The sample mean will have the _______ distribution.
The sample proportion will have the ________ distribution.
The test statistic for a single mean will have the ______ distribution.
The test statistic for a single proportion will have the ______ distribution.
(a) normal; normal; normal; normal
(b) normal; normal; Student's t; Student's t
(c) Student's t; normal; Student's t; normal
(d) normal; normal; Student's t; normal
(e) normal; Student's t; normal; Student's t ----------- Correct Answer ---------- (d) normal;
normal; Student's t; normal
Given a set of data, what might be apparent if the data is poor quality? ----------- Correct Answer
---------- You might observe inconsistencies, errors, missing values, outliers, or biases. There
could be duplicated records, inaccurate measurements, or incomplete information. Patterns might
be unclear.
What would you expect of data that was good quality? ----------- Correct Answer ----------
It would be accurate, complete, consistent, and reliable, and free from errors, biases, and
inconsistencies. Patterns are discernible, relationships are clear, and the data is suitable for
analysis and modeling.
If a dataset has a large number of explanatory (independent) variables (features), why is it
desirable to select a subset of these for modelling? ----------- Correct Answer ---------- Too many
features can lead to overfitting, making the model less generalizable to new data.
Explain one method using correlation to reduce the number of features in a dataset. -----------
Correct Answer ---------- Identify pairs of variables with high correlation coefficients (e.g., above
a chosen threshold) and remove one variable from each highly correlated pair.
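A correlation filter of this kind can be sketched in a few lines. The code below is one possible reading, not a prescribed procedure: the helper names `pearson` and `drop_correlated`, the 0.9 threshold, and the toy feature table are all assumptions. It walks the features in order and keeps one only if its absolute correlation with every already-kept feature is at or below the threshold. Constant columns would divide by zero and are not handled here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_correlated(features, threshold=0.9):
    """Keep features in order, skipping any too correlated with one already kept."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: the two height columns carry the same information.
features = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [30, 22, 41, 35, 28],
}
selected = drop_correlated(features)
print(selected)
```

Which member of a correlated pair survives depends on column order in this sketch; in practice one might instead keep the variable that is easier to measure or interpret.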
Explain how stepwise regression can be used to reduce the number of variables in a linear
regression model. ----------- Correct Answer ---------- Stepwise regression selects a subset of
variables for inclusion in a regression model by iteratively adding or removing variables
based on statistical criteria such as p-values, AIC (Akaike Information Criterion), or BIC
(Bayesian Information Criterion).
How might we use the principal components to visualise the structure in a dataset? -----------
Correct Answer ---------- Principal components reduce the dimensionality of a dataset by
transforming the original variables into a new set of orthogonal variables. Plotting the
observations against the first two principal components gives a two-dimensional view of the
structure.
Explanatory variables which are categorical (qualitative) can be used in linear (and logistic)
regression models by constructing dummy variables. Describe what a dummy variable encoding
would look like for an explanatory variable that took on 3 values (say low, medium, high) when the
variable had the value medium. ----------- Correct Answer ---------- If the variable is "medium",
both dummy variables are 0. Here Dummy Variable 1 represents "low" (1 if low, 0 otherwise) and
Dummy Variable 2 represents "high" (1 if high, 0 otherwise), with "medium" as the baseline level.
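The encoding in this answer can be written out as a small function. This is a sketch under the answer's own choice of "medium" as the baseline; the function name `dummy_encode` and the `is_low` / `is_high` column names are hypothetical.

```python
def dummy_encode(value, levels=("low", "medium", "high"), baseline="medium"):
    """Return one 0/1 indicator for each level other than the baseline."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {f"is_{lvl}": int(value == lvl) for lvl in levels if lvl != baseline}


for v in ("low", "medium", "high"):
    print(v, dummy_encode(v))
```

Three levels need only two dummy variables, because the baseline level is identified by both indicators being 0.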
What is the issue with a normal linear model when the response is binary (i.e. predicts true/false
or buy/sell or some other 2 valued categorical value)? ----------- Correct Answer ----------
A normal linear model assumes a continuous, linear relationship between the predictor variables
and the response, so its fitted values are not restricted to the two possible outcomes.
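One practical symptom of fitting a straight line to a 0/1 response is that the fitted values can fall below 0 or above 1, which makes no sense as a probability. The sketch below fits ordinary least squares with a single predictor to made-up data; `fit_line` and the data are assumptions for illustration only.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope


# Hypothetical binary response: 0 for small x, 1 for large x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]
intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]

# The fitted values at the extremes fall outside the interval [0, 1].
print(min(fitted), max(fitted))
```

Logistic regression avoids this by modelling the log-odds, which keeps predicted probabilities between 0 and 1.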
Which of the following would not assist you in checking to see if a sample could have come from
a normal distribution?
(a) Looking at the shape of a histogram.
(b) Calculating the skewness coefficient.
(c) Calculating the coefficient of variation.
(d) Looking at box and whiskers plots.
(e) Calculating the kurtosis coefficient. ----------- Correct Answer ---------- (c) Calculating the
coefficient of variation.
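The reason the coefficient of variation is the odd one out can be seen by shifting a sample: skewness and kurtosis describe shape and do not move, while the coefficient of variation changes because it depends on where the data sit. The sketch below uses simple population-style moment formulas (dividing by n); the helper `shape_summary` and the data are hypothetical.

```python
from math import sqrt


def shape_summary(data):
    """Moment-based skewness, excess kurtosis, and coefficient of variation."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
        "cv": sqrt(m2) / mean,
    }


# Hypothetical right-skewed sample, then the same sample shifted by 100.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]
original = shape_summary(data)
shifted = shape_summary([x + 100 for x in data])
print(original)
print(shifted)
```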
A 95% confidence interval is an interval calculated from ________ data and will cover the true
________ in 95% of all samples of the same size randomly drawn from the same population.
----------- Correct Answer ---------- (e) sample; population parameter
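The "95% of all samples" interpretation can be checked by simulation. The sketch below makes several simplifying assumptions that are not in the question: the population is normal with a known standard deviation, the interval is the z-interval for the mean, and the parameter values, seed, and function name `coverage` are arbitrary.

```python
import random
from math import sqrt


def coverage(mu=10.0, sigma=2.0, n=30, trials=2000, z=1.96, seed=1):
    """Fraction of simulated 95% z-intervals for the mean that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


rate = coverage()
print(rate)
```

Each interval is computed from sample data, and it is the interval that varies from sample to sample while the population parameter stays fixed.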
As the term has been used in lectures, the term standard error can refer to:
(a) The standard deviation of the population mean.
(b) The standard deviation of the sample proportion.
(c) The standard deviation of the population proportion.
(d) The standard deviation of the difference in population means.
(e) All of the above. ----------- Correct Answer ---------- (b) The standard deviation of the sample
proportion.
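To make answer (b) concrete, the sketch below compares the usual formula for the standard error of a sample proportion, sqrt(p(1 - p)/n), with the standard deviation of many simulated sample proportions. The values p = 0.3 and n = 100, the seed, and the function name `simulate_se` are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import pstdev


def simulate_se(p=0.3, n=100, trials=5000, seed=1):
    """Standard deviation of the sample proportion across many simulated samples."""
    rng = random.Random(seed)
    p_hats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(trials)]
    return pstdev(p_hats)


formula_se = sqrt(0.3 * 0.7 / 100)
simulated_se = simulate_se()
print(formula_se, simulated_se)
```

The two numbers should be close, which is the sense in which a standard error is the standard deviation of a sample statistic rather than of a population quantity.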
How are the degrees of freedom for this test calculated?
(a) One less than the number of categories.
(b) (The row total x the column total)/total sample size.
(c) (The number of rows -1) x (the number of columns -1)
(d) (Observed frequency - Expected frequency)^2/Expected frequency.
(e) The number of columns. ----------- Correct Answer ---------- (a) One less than the number of
categories.
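The source does not say which test "this test" is. Assuming it is a chi-square goodness-of-fit test, which is the setting where answer (a) applies, the statistic and its degrees of freedom could be computed as sketched below. The function name `chi_square_gof` and the counts are hypothetical. The product formula in option (c) belongs instead to the chi-square test of independence on a contingency table.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic and degrees of freedom (categories - 1)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df


# Hypothetical counts over five categories, with equal expected frequencies.
observed = [18, 22, 20, 25, 15]
expected = [20.0] * 5
stat, df = chi_square_gof(observed, expected)
print(stat, df)
```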
Which of the following three statements about p-values is not correct? ----------- Correct Answer
---------- (b) The p-value can be interpreted as the probability of making a Type I Error in
repeated sampling.
If we wish to examine the relationship between total weekly household food spending and
market spending for market visitors we should use what type of graph/chart?
(a) A clustered bar chart