Column headings: 1 response variable; 2 response variables (x and y); 2 samples; Rank Sum (Mann-Whitney test); hypergeometric distribution
Chi-square tests: goodness of fit = 1 random sample; independence = 1 random sample; homogeneity = r samples
π=success proportion
t-test (1 response variable); t-test on x and y, or on d = x − y; t-test (2 samples)
Non-parametric alternative for the 2-independent-sample t-test
Non-parametric alternative for the paired t-test
1 nominal variable; 2 nominal variables; 1 nominal variable
Data
  t-test, 1 response variable: data for the response variable (y)
  t-test, paired: data for the response variable (d)
  t-test, 2 samples: data for the response variable (y1 − y2)
  Rank Sum: two independent random samples of sizes n1 and n2 from two populations (or a comparison of two treatments). The response is a continuous variable.
  Signed rank: n paired observations, pairs (x, y), with d = x − y. Normality of the differences is doubtful. The question is whether there is a systematic difference.
  1 proportion, example: the hypothesis that the proportion of something in the C-group (placebo) in this study differs significantly from π = 0.9.
  2 proportions: 2x2 table; study the difference in proportion between 2 groups. Example: "Does A have an effect on the proportion of survival as compared to C?"
  Chi-square: independence (Ho) versus association (Ha); r random samples; 1 nominal variable. ESTIMATED EXPECTED COUNT = i*j/n (row total i times column total j, divided by n).
Assumptions
  t-test, 1 response variable: independence, normality
  t-test, paired: independence, normality
  t-test, 2 samples: independence, normality, equal variances σ1 = σ2
var.equal=TRUE
N(10, 2) = normal distribution with mean 10 and standard deviation 2
Situation numbers: 1, 2, 3, 4, 3a, 2a, 10, 11, 12, 13, 14
1. Parameter
  t-test, 2 samples: μ1 − μ2
  Rank Sum: the Ho is not given in terms of parameters, but in terms of distributions. The Ha states the so-called shift alternative: two distributions of the same form that are shifted relative to each other.
  1 proportion: μx = π
  2 proportions: π1 − π2
2. Estimator
  1 proportion: π1 = y1/n1
  2 proportions: π1 − π2 = y1/n1 − y2/n2
  Rank Sum: ranks for the observations (of both samples together) are calculated: 1 for the lowest observation, ..., rank (n1 + n2) for the highest.
3. se (standard error)
  1 proportion: se = √(π(1 − π)/n)
4. df
  t-test: n − 1
Signed rank: under H0, T+ and T- have the same expected value: E_H0(T-) = E_H0(T+) = n(n+1)/4 (= 10.5 in the example).
ATTENTION (situation 12): if percentages are given for the expected values and x1 has to be found, use the conversion π = y/n to find y.
Chi-square: E = expected count. The approximation is adequate if 100% of the Ei ≥ 1 and 80% of the Ei ≥ 5.
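The expected value n(n+1)/4 of the signed-rank statistic can be checked numerically. The following is a minimal sketch in Python (the notes themselves use R); the function names and the data are hypothetical, chosen so that n = 6 non-zero differences remain and the expected value equals the 10.5 quoted in the notes.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_sums(x, y):
    """Return (T+, T-, n) for paired samples; pairs with d = 0 are left out."""
    d = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(v) for v in d])
    t_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    t_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    return t_plus, t_minus, len(d)


def expected_t(n):
    """Expected value of T+ (and of T-) under H0."""
    return n * (n + 1) / 4


# Hypothetical paired data (not from the notes); one pair has d = 0.
x = [12, 15, 9, 20, 14, 11, 8]
y = [10, 16, 9, 15, 11, 12.5, 4]
t_plus, t_minus, n = signed_rank_sums(x, y)
print(t_plus, t_minus, n, expected_t(n))
```

T+ and T- always add up to n(n+1)/2, which is why both have expected value n(n+1)/4 when the differences are symmetric around 0.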
1. Ho & Ha
  t-test, 1 response variable: Ho: μ = parameter value; Ha: μ >, < or ≠ that value
  t-test, paired: Ho: μd = parameter value; Ha: μd >, < or ≠ that value
  t-test, 2 samples: H0: μ1 − μ2 = 0; Ha: μ1 − μ2 < 0
  Rank Sum: H0: the distributions of the response are the same for the two populations. Ha: treatment A or population A has systematically lower/higher/different values than B. Example: W1 = 8, W2 = 7.
  Signed rank: H0: the distribution of the differences d is symmetrical around 0 (H0: the median of the distribution of d is 0). The alternative can be one-sided, e.g. Ha: median < 0, or two-sided: Ha: median ≠ 0.
  1 proportion (π = the population proportion of successes): Ho: π1 = 0.9; Ha: π1 >, < or ≠ 0.9
  2 proportions: Ho: π1 − π2 = 0; Ha: π1 − π2 ≠ 0
  Goodness of fit: Ho: π1 = ..., π2 = …, π3 = ..., π4 = …; Ha: Ho is false
  Independence: H0: rows and columns are independent; Ha: they are not independent, they are associated, so x matters
  Homogeneity: Ho: the probabilities are the same; Ha: the probabilities vary between classes
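The rank sums W1 and W2 come from ranking both samples together. The following is a minimal Python sketch (the notes themselves use R); the function name and the data are hypothetical, with the data chosen so that n1 = 2 and n2 = 3 give the W1 = 8 and W2 = 7 mentioned in the notes.

```python
def rank_sums(sample1, sample2):
    """Rank the pooled observations (1 = lowest) and return (W1, W2).

    Tied observations share the average of the ranks they occupy.
    """
    pooled = [(v, 0) for v in sample1] + [(v, 1) for v in sample2]
    pooled.sort(key=lambda pair: pair[0])
    w = [0.0, 0.0]
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            w[pooled[k][1]] += avg_rank
        i = j + 1
    return w[0], w[1]


# Hypothetical data: sample 1 takes ranks 3 and 5, sample 2 takes ranks 1, 2 and 4.
w1, w2 = rank_sums([5.1, 7.3], [3.2, 4.0, 6.5])
print(w1, w2)
```

Whatever the data, W1 + W2 equals N(N+1)/2 with N = n1 + n2 (here 15), so either rank sum carries the same information.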
2. TS (equation)
  t-tests: TS = (estimator − μ)/se(y); write in the numbers for μ and n. Same for the difference.
  Rank Sum: test statistic W = W1 or W2 (sum of ranks in one of the samples). When R output is available, choose the one indicated in the output. Here we choose: W1.
  Signed rank: test statistic T-: sum of ranks of the negative d's (d = x − y); or: test statistic T+ (sum of ranks of the positive d's; R gives T+). Pairs with d = 0 are left out.
  1 proportion: y = number of successes
  2 proportions: y1, from the table
  Goodness of fit: χ2 = Σ(i=1..k) (Oi − Ei)^2 / Ei; add all the categories. The approximation is good if E ≥ 5.
  Independence: χ2 = Σ (Oij − Eij)^2 / Eij; each y has an expected count, add all of them (the total as in the table). If the table is 3*3 there are 9 fractions to add.
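The goodness-of-fit statistic and the rule of thumb for the approximation can be sketched as follows. This is a minimal Python illustration (the notes themselves use R); the function names and the counts are hypothetical, and the adequacy check uses the rule stated in the notes (all expected counts at least 1, and 80% of them at least 5).

```python
def chi_square_gof(observed, probs):
    """Return (chi2, df, expected) for a goodness-of-fit test.

    observed: observed counts per class; probs: class probabilities under Ho.
    """
    n = sum(observed)
    expected = [n * p for p in probs]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # K - 1 classes
    return chi2, df, expected


def approximation_ok(expected):
    """Rule of thumb: all expected counts >= 1 and at least 80% of them >= 5."""
    share_at_least_5 = sum(e >= 5 for e in expected) / len(expected)
    return all(e >= 1 for e in expected) and share_at_least_5 >= 0.8


# Hypothetical counts for K = 4 classes, with Ho: all probabilities equal 0.25.
chi2, df, expected = chi_square_gof([30, 20, 25, 25], [0.25, 0.25, 0.25, 0.25])
print(chi2, df, approximation_ok(expected))
```

The value of chi2 is then compared with the table value for df = K − 1, or turned into a right-tail p-value.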
3. Behaviour of the TS under H0
  t-test, 1 response variable and paired: under Ho, t ~ t(n − 1) (t ~ t_df)
  t-test, 2 samples: t ~ t_df with df = n1 + n2 − 2
  Rank Sum: under H0, W1 ~ Wilcoxon rank-sum (n1, n2) distribution. Here n1 = 2, n2 = 3.
  Signed rank: under H0, T- ~ Wilcoxon signed-rank (n) distribution, with n = number of pairs with a non-zero difference.
  1 proportion: y ~ Binomial(n, π), with n = total number of samples and π = 0.9 = Ho
  2 proportions: y1 ~ Hypergeometric(N, N1, n), i.e. (TOTAL, #success, 14)
  Goodness of fit: χ2 approximately follows a χ2 distribution with df = K − 1 (categories − 1); χ2 ~ χ2_df (approx.)
  Independence and homogeneity: approximately a χ2 distribution with df = (r − 1)(c − 1), i.e. (rows − 1)(columns − 1); χ2 ~ χ2_df (approx.)
  Exam Q: How many degrees of freedom has the distribution under H0 for the TS? Count the rows and columns.
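For an r x c table, the expected counts and the degrees of freedom follow directly from the row and column totals. The following is a minimal Python sketch (the notes themselves use R); the function name and the 2x2 table are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi2, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under Ho: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical 2x2 table of observed counts.
chi2, df, expected = chi_square_independence([[10, 20], [30, 40]])
print(chi2, df, expected)
```

A 2x2 table gives df = 1 and a 3x3 table gives df = 4, which is the "count rows and columns" answer to the exam question.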
4. Behaviour of the TS under Ha
  t-tests: under Ha, t tends to smaller / larger / 2-sided values.
  Signed rank: under Ha we expect T+ (or T-) to be large / small / 2-sided.
  1 proportion: y tends to smaller or larger values.
  2 proportions: y1 tends to smaller or larger values.
  Chi-square tests (same for all three): under Ha, χ2 always tends to be larger, so always use the RPV.
5. Right / left / two-tailed p-value (PV); right / left / two-sided rejection region (RR)
  t-tests: left if Ha: μ < ...; right if Ha: μ > ...; 2-sided if Ha: μ ≠ .... If Ha is > or <, use PV/2. RR: t ≤ or ≥ t_df(α), the critical value.
  Signed rank: LPV, RPV or 2-sided. Here we use the RPV, so we reject H0 if RPV ≤ 0.05 (= α).
  Two-tailed PV: do not divide.
  Chi-square: RR: χ2 ≥ χ2_df(α), where χ2_df(α) is the number from table 7 (p. 1096); or use the RPV.
  GOODNESS OF FIT: 1 random sample and 1 classification in K classes. Ho specifies the probabilities associated with the classes (based on some 'external' idea, e.g. proportions found xx years ago). The approximation is good if 80% of the expected counts are >= 5 and all are at least 1.
6. Outcome of the TS
  t-tests: t = number
  Signed rank: T+ = 25
    > wilcox.test(first, second, alternative="greater", paired = TRUE, exact=TRUE)
    alternative hypothesis: true location shift is greater than 0
  1 proportion: y = given, e.g. y = 87, X = 97, π = 87/97
  Chi-square: if χ2 ≥ the table number, the TS is in the RR and Ho is rejected.
7. p-value > or < α
  If PV < α, Ho is rejected. RR: TS > or < the critical value.
  2 proportions: if y1 > expected, use the RPV; if y1 < expected, use the LPV.
  Chi-square: we always use the RPV; it IS NOT DIVIDED by 2.
8. Conclusion
  If Ho is rejected, Ha is accepted. If Ho is not rejected, Ha is not proven. Compare the p-value with 0.05.
Power
  Power depends on: α, n (sample size), σ, the true parameter value, and the minimum relevant difference Δ.
  Exam: If the difference in mean ………. between factor 1 and 2 that you want to be able to detect is 5 (= Δ), how many pots (n = sample size) should be used per group if the power of the test should be at least 0.9? Here β = 0.1 and the standard deviation is σ.
  Check whether the samples are independent or paired; if the question says "greater", take the 1-sided formula. "Detect a difference in mean" = 2-sided.
2 proportions
  Data: 2 independent random samples, with sizes n1 and n2, N = n1 + n2.
  Model: the binary observations have success probabilities π1 and π2.
Chi-square
  Approximate RPV = P_Ho(χ2 ≥ table)
REMEMBER: give estimator and estimate of the mean … in category 2
  Estimator: ȳ2 = mean (response variable) in category 2. Estimate: the standard deviation of category 2 comes from the table, but the standard error of the estimate is se(ȳ2) = √MSE / √n; for the CI, df = df of the error.
  Mr. A stated that the mean donation in category 2 = 4.0. (if 4 is not
  If a CI is asked for a single category, use the CI formula.
REMEMBER: plot of Estimated Marginal Means
  a. If there is interaction, it is not or hardly meaningful to draw a general conclusion about one factor.
  b. See (only) the profile plot above; the plot only SUGGESTS.
Sample size equation
  1-sided
    1 response variable: n = σ^2 (zα + zβ)^2 / Δ^2
    Paired: n = σd^2 (zα + zβ)^2 / Δ^2
    2 samples: n = 2 sp^2 (zα + zβ)^2 / Δ^2
  2-sided
    1 response variable: n = σ^2 (zα/2 + zβ)^2 / Δ^2
    Paired: n = σd^2 (zα/2 + zβ)^2 / Δ^2
    2 samples: n = 2 sp^2 (zα/2 + zβ)^2 / Δ^2
  Pooled variance: sp^2 = ((n1 − 1) s1^2 + (n2 − 1) s2^2) / (n1 + n2 − 2)
  If it is a 2-sample problem, the n found is for 1 sample. If the total is asked: total = n*2 (n1 = n2, total = n1 + n2).
  β = 1 − power (e.g. power 0.9); usually β = 0.1.
  1 proportion (2-sided n and CI): n = (zα/2)^2 * π(1 − π) / E^2, where the variance σx^2 = π(1 − π) comes from the binomial; or use π = 0.5 (maximum possible n).
ODDS RATIO
  OR = odds(1) / odds(2) = (π1 / (1 − π1)) / (π2 / (1 − π2))
  OR < 1 -> π1 < π2
  OR = 1 -> π1 = π2
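The sample-size formulas and the odds ratio can be evaluated with the standard normal quantiles. The following is a minimal Python sketch (the notes themselves use R); the function names and the value sp = 4 are hypothetical, while Δ = 5 and β = 0.1 follow the exam question in the notes. Results are rounded up to the next whole number.

```python
from math import ceil
from statistics import NormalDist


def z_upper(p):
    """Upper-tail standard normal quantile z_p, e.g. z_upper(0.025) is about 1.96."""
    return NormalDist().inv_cdf(1 - p)


def n_one_sample(sigma, delta, alpha=0.05, beta=0.10, two_sided=False):
    """n = sigma^2 (z_alpha + z_beta)^2 / delta^2; also fits the paired case with sigma_d."""
    za = z_upper(alpha / 2) if two_sided else z_upper(alpha)
    return ceil(sigma ** 2 * (za + z_upper(beta)) ** 2 / delta ** 2)


def n_per_group(sp, delta, alpha=0.05, beta=0.10, two_sided=False):
    """n per group for two samples: n = 2 sp^2 (z_alpha + z_beta)^2 / delta^2."""
    za = z_upper(alpha / 2) if two_sided else z_upper(alpha)
    return ceil(2 * sp ** 2 * (za + z_upper(beta)) ** 2 / delta ** 2)


def n_proportion(error_margin, pi=0.5, alpha=0.05):
    """n = (z_{alpha/2})^2 pi (1 - pi) / E^2; pi = 0.5 gives the largest n."""
    return ceil(z_upper(alpha / 2) ** 2 * pi * (1 - pi) / error_margin ** 2)


def odds_ratio(pi1, pi2):
    """OR = (pi1 / (1 - pi1)) / (pi2 / (1 - pi2))."""
    return (pi1 / (1 - pi1)) / (pi2 / (1 - pi2))


# Assumed values: sp = 4 is invented for illustration; delta = 5 and beta = 0.1 as in the notes.
n = n_per_group(sp=4, delta=5)
print(n, 2 * n)  # per group, and total when n1 = n2
print(n_per_group(sp=4, delta=5, two_sided=True))
print(n_proportion(0.05), odds_ratio(0.2, 0.5))
```

The 2-sided version replaces zα by zα/2 and therefore always asks for at least as many observations as the 1-sided version.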
CI equation: Estimate ± table value * se (of the estimator) = Estimate ± ERROR MARGIN
  Error margin = table value * se (of the estimator). A CI is always 2-sided.
  Expected width: W = 2 * error margin.
  1 response variable: ȳ ± t_df(α/2) * s_y/√n, df = n − 1
  Paired: d̄ ± t_df(α/2) * s_d/√n, df = n − 1
  2 samples: (ȳ1 − ȳ2) ± t_df(α/2) * se, df = n1 + n2 − 2
  1 proportion: π ± zα/2 * √(π(1 − π)/n). The approximation is good if nπ = μy = E(y) > 5 and n(1 − π) > 5.
  2 proportions: normal approximation if, over both samples, all 4 y's (successes and failures) are > 5.
Expected count < 5: find the smallest number in the table and calculate its expected count, N*N1/N; check whether it is < 5. Then count how many expected cells are < 5 and divide by the total number of cells (number of expected counts < 5 / total cells).
Correlation: strength of the relationship between two quantitative variables (positive or negative).
• Pearson: linear correlation. T-test for H0: ρxy = .., TS ~ t(n − 2) (one-sided alternatives possible).
• Spearman rank: 1. any monotonic relation; 2. less sensitive to outliers than Pearson.
• A significant Pearson t-test does not necessarily mean the relation is linear.
• A high |r| in the sample is not necessarily significant; it depends on the sample size.
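The confidence interval for one proportion, "estimate ± error margin", can be sketched as follows. This is a minimal Python illustration (the notes themselves use R); the function names are hypothetical, and the numbers reuse y = 87 from the notes, taking the 97 quoted there as the sample size n, which is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(y, n, conf=0.95):
    """Return (lower, upper, error_margin) for the normal-approximation CI."""
    p_hat = y / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se  # error margin = table value * se
    return p_hat - margin, p_hat + margin, margin


def normal_approx_ok(y, n):
    """Rule from the notes: successes and failures should both exceed 5."""
    return y > 5 and (n - y) > 5


# y = 87 successes out of n = 97 (n = 97 is an assumed reading of the notes).
lower, upper, margin = proportion_ci(87, 97)
print(lower, upper, 2 * margin, normal_approx_ok(87, 97))
```

The width of the interval is twice the error margin, so halving the margin requires roughly four times as many observations.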