Every statistical method is developed based on assumptions.
The validity of results derived from a given method depends
on how well the model assumptions are met. Many statistical
procedures are “robust”, which means that only extreme
violations of the assumptions impair the ability to draw valid
conclusions. Linear regression falls in the category of robust
statistical methods. However, this does not relieve the
investigator of the burden of verifying that the model
assumptions are met or, at least, not grossly violated. In
addition, it is always important to demonstrate how well the
model fits the observed data; this is assessed in part
with the techniques we’ll learn in this lecture.
Regression diagnostics – p. 2/48
Different types of residuals
Recall that the residuals in regression are defined as $y_i - \hat{y}_i$,
where $y_i$ is the observed response for the $i$th observation,
and $\hat{y}_i$ is the fitted response at $x_i$.
There are other types of residuals that will be useful in our
discussion of regression diagnostics. We define them on the
following slide.
Different types of residuals (cont.)
Raw residuals: $r_i = y_i - \hat{y}_i$
Standardized residuals: $z_i = \dfrac{r_i}{s}$, where $s$ is the estimated
error standard deviation (i.e. $s = \hat{\sigma} = \sqrt{\mathrm{MSE}}$).
Studentized residuals: $r_i^{*} = \dfrac{z_i}{\sqrt{1 - h_i}}$, where $h_i$ is called the
leverage. (More later about the interpretation of $h_i$.)
Jackknife residuals: $r_{(-i)} = r_i^{*}\,\dfrac{s}{s_{(-i)}}$, where $s_{(-i)}$ is the
estimated error standard deviation computed with the $i$th
observation deleted.
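The four definitions above can be sketched in code. The following is a minimal illustration, not part of the original slides, and it assumes a simple linear regression with one predictor. The function name `residual_types`, the data values, and the symbol `p` (number of estimated parameters, here 2) are all made up for the example. For this model the leverage has the closed form h_i = 1/n + (x_i − x̄)² / Σ(x_j − x̄)², and s(−i) is obtained from a standard deletion identity instead of refitting the model n times.

```python
import math

def residual_types(x, y):
    """Raw, standardized, studentized and jackknife residuals for a
    simple linear regression of y on x (intercept plus one slope)."""
    n = len(x)
    p = 2  # assumption: two estimated parameters (intercept, slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Raw residuals: observed minus fitted
    raw = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in raw)
    s = math.sqrt(sse / (n - p))  # s = sqrt(MSE)

    # Leverage of each observation (closed form for one predictor)
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    standardized = [r / s for r in raw]
    studentized = [z / math.sqrt(1 - h) for z, h in zip(standardized, leverage)]

    # s_(-i): error SD with observation i deleted, using the identity
    # SSE_(-i) = SSE - r_i^2 / (1 - h_i), on n - p - 1 degrees of freedom
    s_deleted = [math.sqrt((sse - r ** 2 / (1 - h)) / (n - p - 1))
                 for r, h in zip(raw, leverage)]
    jackknife = [rs * s / sd for rs, sd in zip(studentized, s_deleted)]

    return {"raw": raw, "leverage": leverage, "standardized": standardized,
            "studentized": studentized, "jackknife": jackknife}

# Hypothetical data: roughly linear, with the sixth response set too high
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 7.6, 7.1, 7.9]
res = residual_types(x, y)
for i in range(len(x)):
    print(i + 1, round(res["raw"][i], 3), round(res["standardized"][i], 3),
          round(res["studentized"][i], 3), round(res["jackknife"][i], 3))
```

In this made-up data set the sixth observation stands out in every column, and its jackknife residual is the largest in magnitude because deleting that point shrinks the estimated error standard deviation.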
Which residual to use?
The standardized, studentized and jackknife residuals are all
scale independent and are therefore preferred to raw
residuals. Of these, jackknife residuals are the most sensitive to
outliers and are better at revealing other
problems with the data. For that reason, most diagnostics rely
upon the use of jackknife residuals. Whenever we have a
choice in the residual analysis, we will select jackknife
residuals.
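One way to see why jackknife residuals react more strongly to outliers is a standard identity linking them to the studentized residuals. It is added here as a supplement and is not stated on the original slides. The symbols $n$ (number of observations) and $p$ (number of estimated regression parameters, including the intercept) are introduced for this illustration.

```latex
r_{(-i)} \;=\; r_i^{*}\,\sqrt{\frac{n-p-1}{\,n-p-\left(r_i^{*}\right)^{2}\,}}
```

When $|r_i^{*}| > 1$ the square-root factor exceeds 1, so a large studentized residual becomes an even larger jackknife residual. When $|r_i^{*}| < 1$ the factor is below 1 and the residual is shrunk slightly. Observations that fit poorly are therefore pushed further from the rest.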
Analysis of residuals - Normality
Recall that an assumption of linear regression is that the error
terms are normally distributed. That is, $\varepsilon \sim \mathrm{Normal}(0, \sigma^2)$. To
assess this assumption, we will use the residuals to look at:
• histograms
• normal quantile-quantile (qq) plots
• Shapiro-Wilk test
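The first two checks in the list above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the lecture's own code. The residual values are hypothetical, the helper names (`normal_qq_points`, `qq_correlation`, `histogram_counts`) are invented for the example, and the plotting positions (k − 0.5)/n are one common convention among several.

```python
import math
from statistics import NormalDist

def normal_qq_points(residuals):
    """Pair each sorted residual with a standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (k - 0.5) / n: an assumed, commonly used choice
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def qq_correlation(points):
    """Pearson correlation of the qq points; near 1 suggests normality."""
    n = len(points)
    q_bar = sum(q for q, _ in points) / n
    r_bar = sum(r for _, r in points) / n
    sqr = sum((q - q_bar) * (r - r_bar) for q, r in points)
    sqq = sum((q - q_bar) ** 2 for q, _ in points)
    srr = sum((r - r_bar) ** 2 for _, r in points)
    return sqr / math.sqrt(sqq * srr)

def histogram_counts(residuals, edges):
    """Count residuals in bins [edges[b], edges[b+1]); last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for r in residuals:
        for b in range(len(counts)):
            is_last = b == len(counts) - 1
            if edges[b] <= r < edges[b + 1] or (is_last and r == edges[-1]):
                counts[b] += 1
                break
    return counts

# Hypothetical jackknife residuals, for illustration only
resid = [0.2, -1.1, 0.5, -0.4, 2.1, -0.1, 0.9, -1.8, 0.0, 1.3, -0.7]

points = normal_qq_points(resid)
corr = qq_correlation(points)
edges = [-3, -2, -1, 0, 1, 2, 3]
counts = histogram_counts(resid, edges)

print("qq correlation:", round(corr, 4))
for b, c in enumerate(counts):
    print(f"[{edges[b]:>2}, {edges[b + 1]:>2}) " + "*" * c)
```

If the residuals are roughly normal, the qq points fall close to a straight line and the histogram looks approximately symmetric and bell-shaped. For the third item in the list, statistical libraries provide ready-made implementations. SciPy's `scipy.stats.shapiro`, for example, returns a test statistic and a p-value.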