100% tevredenheidsgarantie Direct beschikbaar na je betaling Lees online óf als PDF Geen vaste maandelijkse kosten 4.2 TrustPilot
logo-home
College aantekeningen

Statistics II Lecture Notes

Beoordeling
4,0
(2)
Verkocht
6
Pagina's
31
Geüpload op
21-10-2020
Geschreven in
2019/2020

This document contains lecture notes from the Statistics II: Applied Quantitative Analysis course, which is mandatory for all International Relations and Organizations students.

Instelling
Vak











Oeps! We kunnen je document nu niet laden. Probeer het nog eens of neem contact op met support.

Gekoppeld boek

Geschreven voor

Instelling
Studie
Vak

Documentinformatie

Geüpload op
21 oktober 2020
Aantal pagina's
31
Geschreven in
2019/2020
Type
College aantekeningen
Docent(en)
Onbekend
Bevat
Alle colleges

Onderwerpen

Voorbeeld van de inhoud

I. COMPARING TWO MEANS: Steps of statistical inference
1. Hypothesis
a. Null hypothesis: ∆= 0
b. Alternative hypothesis: ∆≠ 0
2. Test statistic
"

a. T-test: % = " in this example %̂ = 3.45
%(∆)
#$
3. Sampling distribution of the test statistic
a. T-distribution with 11202 (+()$*(+$,( + +-.,()./ − 2 012345) degrees of freedom
4. Look up/calculate p=value for %̂ = 3.45; 67 = 11202
a. p=0.0006
5. Conclusion
a. Reject the null hypothesis at the 5% significance level (because p < 0.05)
b. Earnings are different from those who followed the training program

II. ANOVA: Comparing more than two means
• If we want to compare more than two means, we cannot use a simple t-test
• ANOVA considers the differences between groups and the differences within groups

EXAMPLE: Is there a statistically significant difference between number of TV appearances for MPs of different parties?
Figure 1. Number of TV show entries




Figure 2. Total sum of squares (990 ) | 990 = 991 + 992
6
990 = ∑7
389;<3 − <̅4)*,5 >


<̅4)*,5 = 3 + 2 + 4 + 7 + 5 + 6 + 8 + 5 + 7 = 47 ÷ 9 = 5.22

990 = (3 − 5.22)6 + (2 − 5.22)6 + (4 − 5.22)6
+(7 − 5.22)6 + (5 − 5.22)6 + (6 − 5.22)6
+(8 − 5.22)6 + (5 − 5.22)6 + (7 − 5.22)6 = 31.55

FF: = GH. II



Figure 3. Model sum of squares (991 ) - 99;$(<$$,
CDA: <̅9 = (3 + 2 + 4) ÷ 3 = 3
VVD: <̅6 = (7 + 5 + 6) ÷ 3 = 6
PvdA: <̅= = (8 + 5 + 7) ÷ 3 = 6.67
(With k for the group (here: political party) and <̅ > the mean for that group
>
6
991 = J +> ;<̅> − <̅4)*,5 >
>89
= 3(3 − 5.22)6 + 3(6 − 5.22)6 + 3(6.67 − 5.22)6 = 22.89

FF? = KK. LM


Figure 4. Residual sum of squares (992 ) - 99@3(A3,
992 = ∑(<3> − <̅> )6
= (3 − 3)6 + (2 − 3)6 + (4 − 3)6
+(7 − 6)6 + (5 − 6)6 + (6 − 6)6
+(8 − 6.67)6 + (5 − 6.67)6 + (7 − 6.67)6 = 8.67

FFB = L. NO

,991 is good to answer the question: Which part of the total sum of squares can we explain by using the group means?
992 is good to answer the question: Which part of the total sum of squares cannot be explained by using the group means?


Mean squares
• The model sum of squares (991 ) is based on the difference between 3 group means and the grand mean.
o The degrees of freedom is the number of groups minus 1 for the grand mean
991 22.89
P91 = = = 11.44
671 2
671 = 3 − 1 = 2
• The residual sum of squares (992 ) is based on the difference between each value and its group mean
o The degrees of freedom is based on the number of observations (minus the number of groups)
992 8.67
P92 = = = 1.44
672 6
672 = 9 − 3 = 6
F statistic
• The ratio between the variance explained by the model (P91 ) and the variance NOT explained by the model (P92 )
• If Q > 1, the model can explain more than what it leaves unexplained
P91 11.44
Q= = = 7.92
P92 1.44

Inference: conclusion about population
Null hypothesis: the mean of all groups is the same

We compare this score for the F-test to the F-distribution.
This distribution has two sets of degrees of freedom: 671 and 672 . Here: 2 and 6.

Critical value for a significance level (a-level) of 0.05 and 2 and 6 degrees of freedom is 5.14.




SCDEFECGH compared to SIJKLDMLN
• The observed value of F (Q.O#$)P$5 = 7.92) is greater than the correspond ding critical value (Q-)3(3-*/ = 5.14)
• Therefore, we reject the null hypothesis (null hypothesis: the mean of all groups is the same)




Reporting: There was a statistically significant difference (at the 5% level) between parties in terms of the average number of tv show entries by their
politicians, F(2, 6) = 7.92, p = 0.021.

,REGRESSION ANALYSIS
Why do we use regression for statistical inference?
• To express uncertainty about our conclusions about the relation between 2 concepts
• Assessing the strength of a relation
• Understand the population (based on a sample)
Why regression?
• What if we are not just interested in the difference between two means, but in how the mean values of a variable change as another
variable changes
• Example: Have available incomes increased in rich and poor countries, or have poor countries remained poor?




• How can we describe the strength of this association? Correlation? r = 0.961

Regression is related to correlation
• But regression can assess the impact of several independent variables on one specific dependent variable
o Not just strength of the association, but size of the effect: the expected change in Y as a result of a 1-unit change in X
• By assuming a linear association exists
• Regression can assess the null hypothesis: incomes are unrelated to incomes in the past

EXAMPLE: What is the relationship between the number of seats a party has in parliament and the number of motions it tables?




‘Line of best fit’
• Minimizing the distances between points and the line; your best guess given the data available

REGRESSION EQUATION: T = U + V<
• Intercept (constant): a; if the number of seats is 0, how many motions can we expect (according to the model)?
• Slope: b; if the number of seats increases by 1, what is the expected change in the number of motions (according to the model)?




Intercept: Slope:
• If a party has 30 seats, how many motions can we expect?
o W2%X2+5 = U + V ∗ 5ZU%5
o W2%X2+5 = 38.11 + 7.17 ∗ 5ZU%5
o \ = 38.11 + 7.17 ∗ 30 = 253.3
W2%[2+5
• We often use VQ and V9 instead of use U and V
o T3 = VQ + V9 <3
o The subscript X stands for the number of the observation,
T9 is the value of the response variable T for the first observation in the dataset,
T3 is the value of the response variable T for any observation X in the dataset.

ERROR: There are observations not on the regression line, there is error! All models are wrong

, Including error in the equation
• T3 = VQ + V9 <3 + ]3 | All models are wrong, but we make assumptions about error (e.g. it is random for all cases)
• Ε[T3 |<3 ] = VQ + V9 <3 | That’s why we work with the expected value of T3 given a value of bE

HOW DO WE DRAW THE REGRESSION LINE?
• Ordinary Least Squares: Minimizes the residual sum of squares; a residual is the difference between a data point and the regression line




• Squaring these residuals gives us squared residuals, or squares; the sum of the squared residuals is 992 = 24680.2
• The regression line is chosen in such a way that the residual sum of squares is as small as possible, least squares




Calculating the regression line
• 992 = ∑(T3 − Tc3 )6
• 992 = ∑(T3 − VQ − V9 <3 )6
• Tc3 = VQ − V9 <3 ; Tc3 refers to the predicted value of y according to the regression model

Analyze > Correlate > Bivariate > Select Options > Cross-
product deviations and covariances

eR (predicted/estimated dR ) in our example
d

∑(<3 − <̅ )(T3 − Tg) 25908
Vf9 = = = 7.17
(<3 − <̅ )6 3612
Vf9 = 7.17




eS (predicted/estimated dS ) in our example
d

h
VQ = Tg − Vf9 <̅
hQ = 199.5 − 7.17 ∗ 22.5 = 38.17
V
h
VQ = 38.17

Multiple explanatory variables: If you have more than one explanatory variable in your model,
you can still calculate the ‘least squares’, this is what SPSS is for!

Regression: Key assumptions
1. It makes sense to treat the relationship between Ε[T3 |<3 ] and the x variable as linear and additive
2. Ε[T3 |<3 ] = 0, error exists but is assumed to be random, so not relevant for estimating point-values
T3 = VQ + V9 <3 + ]3
Ε[T3 |<3 ] = VQ + V9 <3
What variables are suitable for regression?
• Dependent variable: Interval-ratio scale response variables
o Must have the same substantive meaning anywhere on the scale, e.g. profit, GDP
• Otherwise, modification is needed:
o Nominal/Ordinal scale: Logistic regression (blue/brown, agree, strongly agree)
o Count scale (non-negative integers): Poisson and negative binomial regression models; NOT in this course (war casualties)
• Explanatory variables can be of any type (with modification)
• Variable values must vary (variance cannot be zero)
€4,99
Krijg toegang tot het volledige document:

100% tevredenheidsgarantie
Direct beschikbaar na je betaling
Lees online óf als PDF
Geen vaste maandelijkse kosten


Ook beschikbaar in voordeelbundel

Beoordelingen van geverifieerde kopers

Alle 2 reviews worden weergegeven
4 jaar geleden

4 jaar geleden

4,0

2 beoordelingen

5
0
4
2
3
0
2
0
1
0
Betrouwbare reviews op Stuvia

Alle beoordelingen zijn geschreven door echte Stuvia-gebruikers na geverifieerde aankopen.

Maak kennis met de verkoper

Seller avatar
De reputatie van een verkoper is gebaseerd op het aantal documenten dat iemand tegen betaling verkocht heeft en de beoordelingen die voor die items ontvangen zijn. Er zijn drie niveau’s te onderscheiden: brons, zilver en goud. Hoe beter de reputatie, hoe meer de kwaliteit van zijn of haar werk te vertrouwen is.
polscinotes Universiteit Leiden
Volgen Je moet ingelogd zijn om studenten of vakken te kunnen volgen
Verkocht
176
Lid sinds
5 jaar
Aantal volgers
116
Documenten
41
Laatst verkocht
6 maanden geleden
IRO Lecture notes and Book summaries

Hi! I'm a third-year International Relations and Organizations student at Leiden University who is also part of Honors College. I sell my class notes and book summaries. I have a GPA of 8.6.

4,1

30 beoordelingen

5
10
4
16
3
2
2
0
1
2

Recent door jou bekeken

Waarom studenten kiezen voor Stuvia

Gemaakt door medestudenten, geverifieerd door reviews

Kwaliteit die je kunt vertrouwen: geschreven door studenten die slaagden en beoordeeld door anderen die dit document gebruikten.

Niet tevreden? Kies een ander document

Geen zorgen! Je kunt voor hetzelfde geld direct een ander document kiezen dat beter past bij wat je zoekt.

Betaal zoals je wilt, start meteen met leren

Geen abonnement, geen verplichtingen. Betaal zoals je gewend bent via Bancontact, iDeal of creditcard en download je PDF-document meteen.

Student with book image

“Gekocht, gedownload en geslaagd. Zo eenvoudig kan het zijn.”

Alisha Student

Veelgestelde vragen