Summary

Summary Statistical Data Analysis CH 6, 7, 8


Course
Statistical Data Analysis

Institution
Vrije Universiteit Amsterdam (VU)

Everything you need to know for DSA chapters 6, 7 and 8!


Uploaded on March 21, 2022
Number of pages 10
Written in 2021/2022
Type Summary

Education
Business Analytics

Femke Stokkink Statistical Data Analysis
2666619
2022-03-21
Summary

1 Chapter 6: Nonparametric Methods
1.1 The One Sample Problem

Nonparametric tests:
• Nonparametric tests make no (parametric) assumptions about the underlying distribution F of the data.
So, for example no normality assumption.
• These tests are applicable for broad classes of distributions, and have actual level α. The distribution of
the test statistic under H0 is the same for each distribution F belonging to H0 .
• Nonparametric tests are robust with respect to the level: they have the intended level α for a large class
of distributions.
• Nonparametric tests are often more efficient (have higher power) than parametric tests when the (normality)
assumptions are not fulfilled.

One sample problem: assume we have a sample X1 ,...,Xn from an unknown distribution F , and we want to
test the location of F . Which test would you use?

T-test:
• Assumption: X1 ,...,Xn ∼ N(µ, σ²)
• Null hypothesis H0 : µ = µ0
• Test statistic T = √n (X̄ − µ0)/SX

• Distribution: under H0 we have T ∼ tn−1 .
• This is a parametric test (assumes normality) for a composite H0 , consisting of all normal distributions
with expectation µ0
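
As a minimal sketch of the t-statistic above, assuming SX denotes the sample standard deviation (with divisor n − 1); the function name t_statistic is only illustrative and not from the course material:

```python
import math
import statistics

def t_statistic(xs, mu0):
    """One-sample t statistic T = sqrt(n) * (mean(xs) - mu0) / S_X."""
    n = len(xs)
    xbar = statistics.fmean(xs)
    s = statistics.stdev(xs)  # sample standard deviation, divisor n - 1
    return math.sqrt(n) * (xbar - mu0) / s

# Example: five observations, testing H0: mu = 2
t = t_statistic([1, 2, 3, 4, 5], mu0=2)
```

Under H0 the value t would be compared with a quantile of the tn−1 distribution, which is not computed here.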

Sign Test:
• Assumption: underlying distribution F has a unique median m, such that P(Xi < m) = P(Xi > m) = 1/2.
• Null hypothesis H0 : m = m0 . This is a composite null hypothesis.
• Test statistic T = #(Xi > m0 ) = Σ_{i=1}^n 1{Xi > m0}
• Distribution: under H0 we have T ∼ bin(n, 1/2). This is a nonparametric test, since T has this distribution
for all F in H0 .
• If k of the Xi's are equal to m0 , delete these k values and perform the test conditionally on the remaining
n − k observations: T ∼ bin(n − k, 1/2) under H0 .
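
The sign test can be sketched in a few lines. This is an illustrative version only: it assumes a right-sided alternative (m > m0), and the function name sign_test is hypothetical. Observations equal to m0 are dropped, and the p-value is the exact binomial tail probability.

```python
from math import comb

def sign_test(xs, m0):
    """Right-sided sign test of H0: m = m0 (assumed alternative: m > m0).

    Returns (t, n, p_value), where n counts the observations not equal to m0.
    """
    kept = [x for x in xs if x != m0]      # drop observations equal to m0
    n = len(kept)
    t = sum(1 for x in kept if x > m0)     # T = #(Xi > m0)
    # P(T >= t) for T ~ bin(n, 1/2)
    p_value = sum(comb(n, j) for j in range(t, n + 1)) / 2 ** n
    return t, n, p_value

t, n, p = sign_test([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], m0=3)
# t = 7, n = 9, p = 46/512
```

For a two-sided alternative the tail probability would have to be adapted; that variant is not shown here.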

(Wilcoxon) Signed Rank Test:
• Assumption: Underlying distribution F is continuous and symmetric around m.
• Null hypothesis H0 : m = m0 . This is a composite null hypothesis.
• Test statistic V is based on the ranks Ri of the absolute differences |Xi − m0 |: V = Σ_{i=1}^n Ri sgn(Xi − m0 )
• Distribution: relatively large values of V indicate that m is larger than m0 . Under H0 , V is distributed as
Σ_{i=1}^n Qi Ri with
– Qi a random variable with P(Qi = −1) = P(Qi = 1) = 1/2.
– R1 ,...,Rn a random permutation of {1,...,n}.
• Since this distribution is the same for all distributions under H0 , this is a nonparametric test.
• The Wilcoxon signed rank test is usually more powerful than the sign test.
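
A small sketch of the signed rank statistic and its exact null distribution. It assumes there are no ties among the absolute differences and no observation equal to m0 (plausible for continuous F), and a right-sided alternative; both function names are illustrative. Because V sums over all ranks, the null distribution can be found by enumerating the 2^n sign patterns attached to the ranks 1,...,n, which is only feasible for small n.

```python
from itertools import product

def signed_rank_statistic(xs, m0):
    """V = sum of Ri * sgn(Xi - m0); assumes no ties and no zero differences."""
    diffs = [x - m0 for x in xs]
    # indices ordered by absolute difference: the first gets rank 1
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    v = 0
    for rank, i in enumerate(order, start=1):
        v += rank if diffs[i] > 0 else -rank
    return v

def exact_right_p_value(v, n):
    """P(V >= v) under H0, by enumerating all 2^n sign patterns (small n only)."""
    count = sum(
        1
        for signs in product((-1, 1), repeat=n)
        if sum(q * r for q, r in zip(signs, range(1, n + 1))) >= v
    )
    return count / 2 ** n

xs = [1.5, 2.7, 0.2, 3.1, 4.4]
v = signed_rank_statistic(xs, m0=1)      # ranks 1, 3, 2, 4, 5 with one negative sign
p = exact_right_p_value(v, len(xs))
```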

1.2 Asymptotic Efficiency

A more efficient test needs fewer observations to obtain the same power as the less efficient test. Like in the
case of (robust) estimators we will consider the asymptotic case for the number of observations n tending to
infinity, and we will see that the asymptotic variance (now of the test statistic) plays a role in determining the
asymptotic efficiency.

Let X1 ,...,Xn be observations from a distribution F . According to H0 , F belongs to a class F0 (for example, all
distributions with median m), whereas according to H1 , F belongs to a class F1 . The power of a test is the
function π(F ) = PF (H0 is rejected).

For a test to be good π(F ) should be small when F ∈ F0 and large when F ∈ F1 .

It is difficult to compare the power of two tests for every possible F , therefore we focus on shift alternatives:
alternatives that can be obtained by shifting a distribution that belongs to F0 over a certain distance θ. Such
alternatives therefore have a location that is shifted over this distance θ, whereas the scale stays the same. We
restrict attention to right-sided testing problems, so θ > 0.

If, for example, one would consider hypotheses about the median, F0 could be a distribution with median m0 ,
and then the shifted distribution Fθ would have median m0 + θ. The power for the class of shift alternatives Fθ
can be written as πn (θ) = Pθ (H0 is rejected).

For a suitable right-sided test the value πn (0) of the power function under H0 is small, whereas πn (θ) is ‘large’
for θ > 0, so under the alternative hypothesis.

Suppose that H0 is rejected for a large value of the test statistic Tn and assume Tn is asymptotically normally
distributed. This means that for large n, Tn is approximately distributed as N(µ(θ), σ²(θ)/n), when θ is the
true value of the parameter. Here µ(θ) is the asymptotic mean and σ²(θ) the asymptotic variance of Tn .
The power of the test can then be rewritten to: πn (θ) ≈ 1 − Φ(ξ1−α − √n (µ′(0)/σ(0)) θ)

• ξ1−α is the 1-α quantile of the N(0,1) distribution.
• µ′(0)/σ(0) is the slope of the test. The larger the slope, the better the test.

• A sequence of tests Tn is consistent when for a fixed α the power tends to 1 for each alternative when n
goes to ∞. In this case the sequence of tests is consistent for the shifted alternatives if µ(θ) > µ(0).
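
The normal approximation of the power can be evaluated numerically. The sketch below takes the slope µ′(0)/σ(0) as a given number; the function name approx_power and the default α = 0.05 are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(theta, n, slope, alpha=0.05):
    """Approximate power 1 - Phi(xi_{1-alpha} - sqrt(n) * slope * theta).

    slope stands for mu'(0) / sigma(0) of the test statistic.
    """
    z = NormalDist()                 # standard normal N(0, 1)
    xi = z.inv_cdf(1 - alpha)        # the (1 - alpha) quantile
    return 1 - z.cdf(xi - sqrt(n) * slope * theta)

# At theta = 0 the approximation returns the level alpha;
# for theta > 0 it grows with n and with the slope.
level = approx_power(0.0, n=50, slope=1.0)
power = approx_power(0.3, n=50, slope=1.0)
```

This also illustrates consistency for shift alternatives: for fixed θ > 0 and a positive slope, the approximate power tends to 1 as n grows.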

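The asymptotic relative efficiency of the test Tn with respect to T̃m is
are(Tn , T̃m ) = ( (µ′(0)/σ(0)) / (µ̃′(0)/σ̃(0)) )² = m/n,
where n and m are the numbers of observations that Tn and T̃m need, respectively, to reach the same power.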

• Here T̃m is the test statistic of a second test
• If are(Tn ,T̃m ) > 1, then Tn is more efficient than T̃m .

If the a.r.e. of test 1 with respect to test 2 is 2/π, then in order to obtain similar power in both tests, the
sample sizes should satisfy ntest2 / ntest1 = 2/π ≈ 0.64.
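
To make the a.r.e. arithmetic concrete, here is a short sketch. The slopes used in the example are standard textbook values for standard normal data (slope 1 for the test based on the mean, slope √(2/π) for the sign test); they are not taken from these notes, and the function names are illustrative.

```python
from math import ceil, pi, sqrt

def are(slope, slope_other):
    """a.r.e. of a test with respect to another: squared ratio of their slopes."""
    return (slope / slope_other) ** 2

def matching_sample_size(n, are_value):
    """Observations m the other test needs to match a test that uses n
    observations, from are = m / n (rounded up)."""
    return ceil(are_value * n)

# Assumed example: sign test relative to the mean-based test, standard normal data
e = are(sqrt(2 / pi), 1.0)             # 2/pi, about 0.64
m = matching_sample_size(100, e)       # 64
```

In this assumed setting the sign test with 100 observations has roughly the power of the mean-based test with 64 observations.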

1.3 Two Sample Problems

Two samples can either be paired or independent:
• Paired samples: (X1 , Y1 ), (X2 , Y2 ),..., (Xn , Yn )
• Independent samples: X1 , X2 , ..., Xm and Y1 , Y2 , ..., Yn
For paired samples: if one is interested whether one sample is stochastically larger, consider differences Zi =
Yi - Xi . This then becomes a one sample problem.

Median test:
• Assumption: X1 , .., Xm ∼ F, Y1 , .., Yn ∼ G, F and G continuous, and both samples are i.i.d.
