Complete summary Statistics 3


Complete summary of Statistics 3, based on the literature from the PPP book and the lectures.

  • 30 June 2021
  • 2020/2021
veracreemers
Summary Statistics 3, exam 2021
Unit 1: contingency tables, odds ratios, stratification, confounding and interaction
The 2X2 contingency table for association

                     X=0 / control   X=1 / treated   Total
Y=0 / no recovery    13 (A)          7 (B)           20 (A+B)
Y=1 / recovery       12 (C)          18 (D)          30 (C+D)
Total                25 (A+C)        25 (B+D)        N=50

The null hypothesis is that there is no effect of therapy on the probability of recovery. This is the same as:

P (recovered | therapy) = P (recovered | control)

The probability of recovery (i.e. the marginal probability, which is the probability of a single event
occurring regardless of other events) is 30/50 = 0.6, or 60%. Under the H0 we expect 60% of each group
to recover: 60% x 25 (the number of people in each group) = 15. So in each group we expect 15 people
to recover and 10 not to recover.

An alternative null hypothesis is that there is no association between therapy and recovery, which
is P (recovery AND treated) = P (recovery) * P (treated). Given that 30 out of 50 people recover and 25 out
of 50 are treated, the probability of a patient being treated and recovering is (30/50) * (25/50) = 30%. The
total sample size is 50, so we expect 30% x 50 = 15 people in the recovery AND treated cell. By the same
logic we expect 15 people in the recovery AND control cell, and 10 patients in each of the other 2 cells.

What we have used here is the product rule for independent events. If there’s no association between X
and Y this implies that therapy and recovery are statistically independent. If they are, the probability of
both occurring simultaneously is the product of the unconditional probabilities.

The table above shows the observed frequencies, the values we have calculated are the expected
frequencies under the H0. We calculate the expected frequency as follows: row total * column total/
grand total:

        X0              X1              Total
Y0      20*25/50 = 10   20*25/50 = 10   20
Y1      30*25/50 = 15   30*25/50 = 15   30
Total   25              25              50
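
The expected frequencies can be checked with a few lines of Python. This is only a minimal sketch of the "row total * column total / grand total" rule applied to the observed table; the variable names are illustrative and not part of the course material.

```python
# Observed counts from the 2X2 table: rows are Y=0, Y=1; columns are X=0, X=1.
observed = [[13, 7],
            [12, 18]]

row_totals = [sum(row) for row in observed]          # totals for Y=0 and Y=1
col_totals = [sum(col) for col in zip(*observed)]    # totals for X=0 and X=1
grand_total = sum(row_totals)                        # N

# Expected frequency per cell under the H0: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

print(expected)  # [[10.0, 10.0], [15.0, 15.0]]
```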

Test of association in contingency table

$$\chi^2 = \sum_{r=1}^{2} \sum_{c=1}^{2} \frac{(O_{rc} - E_{rc})^2}{E_{rc}}, \qquad df = 1$$

  • For each cell r, c (r=row, c=column) we take the difference between the observed frequency Orc and the expected frequency Erc
  • We raise this difference to the power of 2 and then divide by Erc
  • We sum these terms over the 2 rows and 2 columns

This test is an approximation and requires that all E's are at least 5.

        X0                    X1                    Total
Y0      (13-10)^2/10 = 0.9    (7-10)^2/10 = 0.9     1.8
Y1      (12-15)^2/15 = 0.6    (18-15)^2/15 = 0.6    1.2
Total   1.5                   1.5                   3 = χ²

Under the H0 this test statistic has a Chi-square distribution with df=1. Since the deviations O-E are raised
to the power of 2, a violation of the H0 leads to large Chi-square values, so the critical area is on the right
of the distribution. For df=1 and α=0.05 the critical value is 3.84; values larger than this lead to rejection
of the H0. Our value of 3 is smaller than 3.84.
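
The test statistic can be reproduced as follows. This is a minimal sketch in plain Python that sums (O-E)^2/E over the four cells and compares the result with the critical value of 3.84 (df=1, α=0.05); the variable names are illustrative.

```python
observed = [[13, 7], [12, 18]]           # observed frequencies O
expected = [[10.0, 10.0], [15.0, 15.0]]  # expected frequencies E under the H0

# Sum (O - E)^2 / E over all rows and columns.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

critical_value = 3.84                    # Chi-square critical value for df=1, alpha=0.05
reject_h0 = chi_square > critical_value

print(round(chi_square, 2), reject_h0)   # 3.0 False
```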

We cannot reject our H0, but we cannot accept it either: the power to detect a true treatment effect
may be too small with N=50. The 95% confidence interval for the difference in recovery probability runs
from -0.02 to +0.5, so the true difference can be 0, as the H0 claims, or anything up to 50%.
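
The summary does not say how this interval was obtained. As an assumption, the sketch below uses the standard Wald interval for a difference between two independent proportions, which reproduces the reported bounds for these data.

```python
import math

recovered_treated, n_treated = 18, 25
recovered_control, n_control = 12, 25

p_treated = recovered_treated / n_treated   # recovery probability, treated group
p_control = recovered_control / n_control   # recovery probability, control group
difference = p_treated - p_control

# Wald standard error of the difference (assumed method, not stated in the summary).
standard_error = math.sqrt(
    p_treated * (1 - p_treated) / n_treated
    + p_control * (1 - p_control) / n_control
)

z = 1.96  # standard normal quantile for a two-sided 95% interval
lower = difference - z * standard_error
upper = difference + z * standard_error

print(round(lower, 2), round(upper, 2))  # -0.02 0.5
```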


Measures of association for 2X2 contingency table
The effect of treatment on recovery probability can be expressed in 3 different ways:

1. The difference in recovery probability
2. The correlation between treatment and recovery
3. The odds ratio

Applying the formula for Pearson’s correlation r to 0/1 variables (dichotomous) and rewriting the formula
gives us what is known as the phi-coefficient (φ):

$$\varphi = \frac{(A \times D) - (B \times C)}{\sqrt{(A+B)(A+C)(B+D)(C+D)}}$$

The A, B, C and D are the same as in the first table. Note that A (00, not treated and not recovered) and D
(11, treated and recovered) contribute to a positive correlation between treatment and recovery. B and C
contribute to a negative correlation.

The Odds Ratio (OR) is defined as follows:

  • The odds are defined as P (Y=1) / P (Y=0), i.e. the probability of success divided by the probability of failure
  • The Odds Ratio is the ratio of the odds of the group (X=1) to the odds of the group (X=0)

$$OR = \frac{D/B}{C/A} = \frac{A \times D}{B \times C}$$

The OR is equal to the number of concordant pairs (which contribute to a positive correlation) divided by the
number of discordant pairs (which contribute to a negative correlation).

Situation    Phi-coefficient   Odds ratio   Association?
A*D > B*C    > 0               > 1          Positive association
A*D = B*C    = 0               = 1          No association
A*D < B*C    < 0               < 1          Negative association

The OR in our example is thus (13*18)/(7*12) = 2.79. This value is larger than 1, so there is a positive
association. The phi coefficient is about 0.25 (0.245 to three decimals).
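
Both measures follow directly from the four cell counts. The sketch below is a minimal illustration with the A, B, C, D labels from the first table; the last line uses the standard identity that N times phi squared equals the Chi-square statistic of a 2X2 table, which gives back the value of 3 found earlier.

```python
import math

# Cell counts from the first table.
A, B, C, D = 13, 7, 12, 18
n = A + B + C + D

# Odds ratio: concordant product divided by discordant product.
odds_ratio = (A * D) / (B * C)

# Phi coefficient: Pearson correlation between two 0/1 variables.
phi = (A * D - B * C) / math.sqrt((A + B) * (A + C) * (B + D) * (C + D))

print(round(odds_ratio, 2))    # 2.79
print(round(phi, 3))           # 0.245
print(round(n * phi ** 2, 2))  # 3.0, the Chi-square statistic
```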

But what does this mean when we have binary variables? We need to think about the coding: a positive
association means that the value X=1 tends to go with the value Y=1.


The 2X2X2 contingency table: stratification
In analysing the relationship between predictor X and outcome Y, we might want to adjust for a 3rd
variable C. We need to distinguish between different causal models and roles for C.

The confounding model
X and C can both affect Y, and X and C are correlated with each other (they are confounded), but neither
of the 2 has an effect on the other. In this case we suppose that C=1 is for the mild cases of depression
and C=0 for the severe cases.

  • We need to adjust the effect of X on Y for C, because otherwise the effect of X will be biased.

The mediation model
X affects C, which in turn affects Y. X can also still affect Y directly (this concept is not really discussed
in the course). In this case we suppose C=compliance.

The difference between mediation and confounding is that we always want to correct for confounding,
whereas with mediation this is not always the case, because the mediated effect itself might be of
scientific interest.

The moderation (interaction) model
The effect of X on Y depends on the value of C. C in this case could be the level of depression at pretest
where Y is the level of depression at posttest. We need to test the simple effect of X per value of C.

Now what do we do with a confounder or moderator? We break down the contingency table, i.e. we
stratify it for all levels of C, and perform logistic regression analyses.
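
To make the idea of stratifying concrete, the sketch below computes the odds ratio within each level of C and again after collapsing over C. The counts are HYPOTHETICAL and chosen only for illustration (they are not from the course material). They are constructed so that there is no association within either stratum, while the collapsed table suggests a strong one, which is what confounding looks like. The sketch stops at the stratum-specific odds ratios and does not perform the logistic regression itself.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (A*D)/(B*C) for a 2X2 table with cells A, B, C, D."""
    return (a * d) / (b * c)

# HYPOTHETICAL counts per stratum of C, given as (A, B, C, D):
# A = control/no recovery, B = treated/no recovery,
# C = control/recovery,    D = treated/recovery.
strata = {
    "C=0 (severe)": (16, 4, 4, 1),
    "C=1 (mild)": (1, 4, 4, 16),
}

# Odds ratio within each stratum of C.
stratum_ors = {label: odds_ratio(*cells) for label, cells in strata.items()}

# Collapsing over C: add the two strata cell by cell.
collapsed = tuple(sum(cells) for cells in zip(*strata.values()))
collapsed_or = odds_ratio(*collapsed)

print(stratum_ors)                         # both stratum ORs equal 1.0
print(collapsed, round(collapsed_or, 2))   # (17, 8, 8, 17) 4.52
```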


Working with logarithms
For reasons to be seen in the following unit, we often take the natural logarithm of the odds ratio.

  • Ln (a) + ln (b) = ln (a*b)
  • Ln (a) - ln (b) = ln (a/b)
  • These hold if a and b > 0
  • Ln (a^b) = b * ln (a)
  • Ln (1/a) = -ln (a)
  • These hold if a > 0

Special logarithms:

  • Ln (odds) = ln (P) - ln (1-P)
  • Ln (OR) = ln (odds_{X=1}) - ln (odds_{X=0})
  • Ln (1) = 0, ln (e) = 1 (e ≈ 2.72)

Now why would we use these (they look confusing at first)? Probability is bounded between 0 and 1, but
log odds are not: they go from minus infinity to plus infinity. This allows us to analyse the data as if we
were working with quantitative variables, i.e. it allows for easier interpretation.

If we work with log odds transformations and X = 0/1, the slope of the logistic regression equation is
equal to ln (OR).
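
For the example data this can be checked without fitting a model. The sketch below assumes the usual form ln (odds) = b0 + b1 * X with a single 0/1 predictor, so that b0 is the log odds in the control group and b1 is the difference in log odds between the two groups; the variable names are illustrative.

```python
import math

# Cell counts from the first table.
A, B, C, D = 13, 7, 12, 18

odds_treated = D / B   # odds of recovery for X=1
odds_control = C / A   # odds of recovery for X=0

intercept = math.log(odds_control)                        # b0: log odds when X=0
slope = math.log(odds_treated) - math.log(odds_control)   # b1: change in log odds from X=0 to X=1

log_or = math.log((A * D) / (B * C))

print(round(slope, 2), round(log_or, 2))  # 1.02 1.02
```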


Working with exponentials
  • Exp (a) * exp (b) = exp (a+b)
  • Exp (a) / exp (b) = exp (a-b)
  • [exp (a)]^b = exp (a*b)
  • 1 / exp (a) = exp (-a)

An exponential is the inverse of a logarithm: taking exp of ln (a), or ln of exp (a), returns a.

Special powers of e:

  • Exp (ln (a)) = a if a > 0, and ln (exp (a)) = a for any a
  • Exp (0) = 1
  • Exp (1) = e ≈ 2.72

The logarithms and exponentials together allow us to switch from one scale to another.
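
As an illustration of switching scales, the sketch below moves the recovery probability of the treated group (18/25 = 0.72) from the probability scale to the log odds scale and back. The helper names `logit` and `inverse_logit` are illustrative choices, not names used in the course.

```python
import math

def logit(p):
    """Probability -> log odds: ln (P) - ln (1-P)."""
    return math.log(p) - math.log(1 - p)

def inverse_logit(log_odds):
    """Log odds -> probability: odds / (1 + odds), written with exp."""
    return 1 / (1 + math.exp(-log_odds))

p = 0.72                         # recovery probability in the treated group
log_odds = logit(p)              # unbounded scale
odds = math.exp(log_odds)        # exp undoes ln, giving the odds 18/7
p_back = inverse_logit(log_odds)

print(round(log_odds, 2), round(odds, 2), round(p_back, 2))  # 0.94 2.57 0.72
```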
