Statistics II Regression Analysis - Answers to all assignments, including final assignment
Course
Regression analysis, Statistics part 2
Institution
Maastricht University (UM)
Here are the answers to all weekly assignments for the PhD course Statistics II at Maastricht University, which focuses on regression analysis. With these answers you can assess your own learning and check whether you understand the material covered so far.
1. Correlation
Data file:
Open the SPSS file: colton6re.sav. In this file, age in years (Age) and systolic blood pressure
in mmHg (SBP) are given for 33 women from a Canadian study. An interesting research
question in this study is whether there is an association between age and systolic blood
pressure. Moreover, how strong is this association?
Before you start with the following questions, make your own analysis strategy: what steps
should be taken to answer this research question?
Task 1.1
Make a scatterplot with systolic blood pressure on the y-axis and age on the x-axis to check
that there are no impossible values and that any association between systolic
blood pressure and age is linear.
Question 1a.
Are there any impossible values for systolic blood pressure or for age?
Systolic blood pressure ranges from about 100 to 220 (exact: 99 to 217), which are all
plausible values. Age ranges from about 20 to 80 (exact 22 to 81). Thus, there are also no
impossible values for age.
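The range check in Question 1a can also be done numerically. The sketch below uses Python rather than SPSS. The plausibility limits and the sample values are assumptions chosen for illustration; they are not the contents of colton6re.sav and only reuse the reported minimum and maximum.

```python
# Hypothetical plausibility limits; adjust them to the study population.
AGE_LIMITS = (0, 120)    # years
SBP_LIMITS = (50, 300)   # mmHg

def out_of_range(values, limits):
    """Return the values that fall outside the closed interval `limits`."""
    low, high = limits
    return [v for v in values if not (low <= v <= high)]

# Made-up illustrative values (NOT the study data). Only the extremes
# match the answer above: 22 and 81 years, 99 and 217 mmHg.
ages = [22, 45, 63, 81]
sbps = [99, 130, 160, 217]

bad_ages = out_of_range(ages, AGE_LIMITS)
bad_sbps = out_of_range(sbps, SBP_LIMITS)
print("Impossible ages:", bad_ages)
print("Impossible SBP values:", bad_sbps)
```

Both lists are empty when every value lies within the limits, which matches the conclusion drawn from the scatterplot.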
PhD course Statistics part 2: Regression analysis and SPSS 1
Question 1b.
Is the association positive or negative? Is there a straight-line association or does it deviate
substantially from a straight line? Why does the latter matter?
Positive association (if age increases, SBP also increases); no clear deviation from linearity,
therefore you may assume linearity. Linearity matters because the Pearson
correlation coefficient is a measure of linear association. If this assumption is clearly
violated, one should not use the Pearson correlation coefficient on the raw data. A
transformation to make the association linear, or the Spearman correlation (if the association
is monotonically increasing or decreasing), can then be considered.
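To illustrate why linearity matters, here is a minimal Python sketch that compares both coefficients on an artificial, strictly increasing but clearly non-linear relation. The data are invented for the illustration and are unrelated to the study. The helper functions are written out by hand so that the block has no dependencies.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Artificial example (not study data): y grows exponentially with x.
x = list(range(1, 11))
y = [math.exp(v) for v in x]
r = pearson(x, y)
rs = spearman(x, y)
print(f"Pearson r = {r:.3f}, Spearman rs = {rs:.3f}")
```

Because the relation is perfectly monotone, the Spearman correlation equals 1, while the Pearson correlation is noticeably lower: it only captures the linear part of the association.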
Question 1c.
Is the association strong or weak? Give an estimation of the correlation coefficient (do not
calculate it).
Whether the association is strong or weak, in other words whether the points lie close to a
straight line, is hard to tell by eye. I would say it is a medium to rather strong correlation.
Estimate: the points lie fairly close to a straight line, so the correlation coefficient is
probably between 0.5 and 0.8.
Task 1.2
Compute with SPSS the Pearson and Spearman correlation between age and systolic blood
pressure. Test whether these correlations are significant.
Correlations
                                                      Age in years   Systolic blood pressure in mmHg
Age in years                     Pearson Correlation  1              .718**
                                 Sig. (2-tailed)                     .000
                                 N                    33             33
Systolic blood pressure in mmHg  Pearson Correlation  .718**         1
                                 Sig. (2-tailed)      .000
                                 N                    33             33
**. Correlation is significant at the 0.01 level (2-tailed).
Correlations
                                                                          Age in years   Systolic blood pressure in mmHg
Spearman's rho  Age in years                     Correlation Coefficient  1.000          .659**
                                                 Sig. (2-tailed)          .              .000
                                                 N                        33             33
                Systolic blood pressure in mmHg  Correlation Coefficient  .659**         1.000
                                                 Sig. (2-tailed)          .000           .
                                                 N                        33             33
**. Correlation is significant at the 0.01 level (2-tailed).
Question 2a.
How large are the Pearson and Spearman correlations? Are these correlations significant,
assuming a significance level (α) of 0.05? Which null hypothesis belongs to these tests?
Pearson: r = 0.718; p < 0.001 (SPSS displays .000), thus significant (smaller than 0.05). H0: ρ = 0
(correlation in population = 0).
Spearman: rs = 0.659; p < 0.001 (SPSS displays .000), thus significant (smaller than 0.05). H0: ρs = 0
(Spearman correlation in population = 0).
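As a check on the Pearson result, the test can be reproduced by hand. The sketch below assumes the usual t statistic for H0: ρ = 0, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom, and obtains the two-sided p-value by numerically integrating the t density. The function names are illustrative and the integration is a simple approximation, not SPSS's own routine.

```python
import math

def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value; Simpson's rule on the density between 0 and |t|.

    `steps` must be even for Simpson's rule.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_half)

r, n = 0.718, 33                          # values from the SPSS output
t = correlation_t(r, n)
p = two_sided_p(t, n - 2)
print(f"t = {t:.2f}, df = {n - 2}, p < 0.001: {p < 0.001}")
```

With r = 0.718 and n = 33 the statistic is about 5.74 on 31 degrees of freedom, which is consistent with the p-value below 0.001 in the SPSS table.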
Question 2b.
Which correlation (Pearson or Spearman) is preferred? What information is additionally
required to answer this question? Make sure that you obtain this information using SPSS and
try to explain why the Pearson or Spearman correlation is preferred.
The Pearson correlation is preferred if both variables are (approximately) normally
distributed (and the relation is linear, which is checked in question 1). Thus, normality has to
be checked for both variables.
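Normality is usually judged from histograms or Q-Q plots, possibly supplemented by a formal test. As a rough numerical complement, the Python sketch below computes moment-based skewness and excess kurtosis, both of which are close to 0 for normally distributed data. The two samples are artificial and only illustrate the behaviour of the measures. Note that SPSS reports bias-adjusted versions of these statistics, so its values will differ slightly.

```python
def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3

# Artificial samples (not study data).
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 2, 2, 3, 4, 20]

for name, sample in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(f"{name}: skewness = {skewness(sample):.2f}, "
          f"excess kurtosis = {excess_kurtosis(sample):.2f}")
```

A clearly non-zero skewness, as in the second sample, suggests that the Spearman correlation is the safer choice for that variable.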