Statistics II Lecture Notes

Pages: 31
Published: 21-10-2020
Written in: 2019/2020

This document contains lecture notes from the Statistics II: Applied Quantitative Analysis course, which is mandatory for all International Relations and Organizations students.

I. COMPARING TWO MEANS: Steps of statistical inference
1. Hypothesis
a. Null hypothesis: ∆= 0
b. Alternative hypothesis: ∆≠ 0
2. Test statistic
a. T-test: $t = \frac{\hat{\Delta}}{SE(\hat{\Delta})}$; in this example $\hat{t} = 3.45$
3. Sampling distribution of the test statistic
a. T-distribution with 11202 ($n_{treatment} + n_{control} - 2$ groups) degrees of freedom
4. Look up/calculate the p-value for $\hat{t} = 3.45$; $df = 11202$
a. p=0.0006
5. Conclusion
a. Reject the null hypothesis at the 5% significance level (because p < 0.05)
b. Earnings of those who followed the training program differ from earnings of those who did not
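
The p-value in step 4 can be checked with a short sketch. It assumes that, with 11202 degrees of freedom, the t-distribution is close enough to the standard normal for the normal tail to stand in for the exact t tail; that approximation and the variable names are assumptions of this sketch, not part of the lecture.

```python
import math

t_hat = 3.45   # observed test statistic from the example
df = 11202     # degrees of freedom from the example (large, so t is close to normal)
alpha = 0.05   # significance level

# Two-sided p-value under the normal approximation:
# P(|Z| > |t|) = erfc(|t| / sqrt(2))
p_value = math.erfc(abs(t_hat) / math.sqrt(2))

reject_null = p_value < alpha

print(f"p = {p_value:.4f}, reject H0 at the 5% level: {reject_null}")
```

The result rounds to the p = 0.0006 reported in the notes, so the null hypothesis is rejected at the 5% level.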

II. ANOVA: Comparing more than two means
• If we want to compare more than two means, we cannot use a simple t-test
• ANOVA considers the differences between groups and the differences within groups

EXAMPLE: Is there a statistically significant difference between number of TV appearances for MPs of different parties?
Figure 1. Number of TV show entries




Figure 2. Total sum of squares ($SS_T$) | $SS_T = SS_M + SS_R$

$SS_T = \sum_{i=1}^{N} (x_i - \bar{x}_{grand})^2$

$\bar{x}_{grand} = (3 + 2 + 4 + 7 + 5 + 6 + 8 + 5 + 7) \div 9 = 47 \div 9 = 5.22$

$SS_T = (3 - 5.22)^2 + (2 - 5.22)^2 + (4 - 5.22)^2$
$+ (7 - 5.22)^2 + (5 - 5.22)^2 + (6 - 5.22)^2$
$+ (8 - 5.22)^2 + (5 - 5.22)^2 + (7 - 5.22)^2 = 31.56$

$SS_T = 31.56$



Figure 3. Model sum of squares ($SS_M$), also called $SS_{between}$
CDA: $\bar{x}_1 = (3 + 2 + 4) \div 3 = 3$
VVD: $\bar{x}_2 = (7 + 5 + 6) \div 3 = 6$
PvdA: $\bar{x}_3 = (8 + 5 + 7) \div 3 = 6.67$
(With $k$ for the group (here: political party), $n_k$ the number of observations in that group, and $\bar{x}_k$ the mean for that group)

$SS_M = \sum_{k=1}^{K} n_k (\bar{x}_k - \bar{x}_{grand})^2$
$= 3(3 - 5.22)^2 + 3(6 - 5.22)^2 + 3(6.67 - 5.22)^2 = 22.89$

$SS_M = 22.89$


Figure 4. Residual sum of squares ($SS_R$), also called $SS_{within}$
$SS_R = \sum (x_{ik} - \bar{x}_k)^2$
$= (3 - 3)^2 + (2 - 3)^2 + (4 - 3)^2$
$+ (7 - 6)^2 + (5 - 6)^2 + (6 - 6)^2$
$+ (8 - 6.67)^2 + (5 - 6.67)^2 + (7 - 6.67)^2 = 8.67$

$SS_R = 8.67$

$SS_M$ answers the question: Which part of the total sum of squares can we explain by using the group means?
$SS_R$ answers the question: Which part of the total sum of squares cannot be explained by using the group means?
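
The three sums of squares can be reproduced from the nine observations in the example. The following is a minimal sketch; the dictionary layout and variable names are illustrative assumptions, while the data values are the ones used in the figures.

```python
# TV show entries per MP, grouped by party (values from the example)
groups = {
    "CDA": [3, 2, 4],
    "VVD": [7, 5, 6],
    "PvdA": [8, 5, 7],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Total sum of squares: every observation against the grand mean
ss_t = sum((x - grand_mean) ** 2 for x in all_values)

# Model (between-group) sum of squares: group means against the grand mean,
# weighted by the number of observations in each group
ss_m = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

# Residual (within-group) sum of squares: observations against their group mean
ss_r = sum(
    (x - sum(values) / len(values)) ** 2
    for values in groups.values()
    for x in values
)

print(f"SS_T = {ss_t:.2f}, SS_M = {ss_m:.2f}, SS_R = {ss_r:.2f}")
```

Running this also confirms the decomposition: the model and residual sums of squares add up to the total.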


Mean squares
• The model sum of squares ($SS_M$) is based on the difference between the 3 group means and the grand mean.
o The degrees of freedom is the number of groups minus 1 (for the grand mean): $df_M = 3 - 1 = 2$
o $MS_M = \frac{SS_M}{df_M} = \frac{22.89}{2} = 11.44$
• The residual sum of squares ($SS_R$) is based on the difference between each value and its group mean.
o The degrees of freedom is the number of observations minus the number of groups: $df_R = 9 - 3 = 6$
o $MS_R = \frac{SS_R}{df_R} = \frac{8.67}{6} = 1.44$
F statistic
• The ratio between the variance explained by the model ($MS_M$) and the variance NOT explained by the model ($MS_R$)
• If $F > 1$, the model can explain more than what it leaves unexplained
$F = \frac{MS_M}{MS_R} = \frac{11.44}{1.44} \approx 7.92$ (using the unrounded mean squares)
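
The mean squares and the F statistic follow directly from the sums of squares. This sketch takes the rounded sums of squares from the example as inputs; the variable names are assumptions made for illustration.

```python
ss_m = 22.89   # model sum of squares from the example
ss_r = 8.67    # residual sum of squares from the example
n_groups = 3   # CDA, VVD, PvdA
n_obs = 9      # total number of observations

df_m = n_groups - 1        # model degrees of freedom
df_r = n_obs - n_groups    # residual degrees of freedom

ms_m = ss_m / df_m         # variance explained by the model
ms_r = ss_r / df_r         # variance not explained by the model

f_stat = ms_m / ms_r

print(f"df_M = {df_m}, df_R = {df_r}")
print(f"MS_M = {ms_m:.2f}, MS_R = {ms_r:.2f}, F = {f_stat:.2f}")
```

Dividing the mean squares before rounding them is what yields 7.92; dividing the already rounded 11.44 by 1.44 would give 7.94.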

Inference: conclusion about population
Null hypothesis: the mean of all groups is the same

We compare this score for the F-test to the F-distribution.
This distribution has two sets of degrees of freedom: $df_M$ and $df_R$. Here: 2 and 6.

The critical value for a significance level (α-level) of 0.05 with 2 and 6 degrees of freedom is 5.14.




$F_{observed}$ compared to $F_{critical}$
• The observed value of F ($F_{observed} = 7.92$) is greater than the corresponding critical value ($F_{critical} = 5.14$)
• Therefore, we reject the null hypothesis (null hypothesis: the mean of all groups is the same)




Reporting: There was a statistically significant difference (at the 5% level) between parties in terms of the average number of TV show entries by their politicians, F(2, 6) = 7.92, p = 0.021.
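
The critical value and the reported p-value can be verified without statistical tables. This sketch relies on a special case: when the numerator degrees of freedom equal 2, the upper tail of the F-distribution has the closed form $P(F > f) = (1 + 2f / df_R)^{-df_R / 2}$. The function names are illustrative assumptions, and the shortcut does not apply to other numerator degrees of freedom.

```python
def f_upper_tail_df1_2(f_value: float, df_r: float) -> float:
    """P(F > f_value) for an F-distribution with (2, df_r) degrees of freedom."""
    return (1 + 2 * f_value / df_r) ** (-df_r / 2)


def f_critical_df1_2(alpha: float, df_r: float) -> float:
    """Critical value: the f for which the upper tail equals alpha (df_M = 2 only)."""
    return (df_r / 2) * (alpha ** (-2 / df_r) - 1)


f_observed = 7.92
df_r = 6
alpha = 0.05

p_value = f_upper_tail_df1_2(f_observed, df_r)
f_critical = f_critical_df1_2(alpha, df_r)

print(f"p = {p_value:.3f}, critical value = {f_critical:.2f}")
print("Reject H0:", f_observed > f_critical)
```

Both routes lead to the same decision: the observed F exceeds the critical value, and equivalently the p-value is below 0.05.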

REGRESSION ANALYSIS
Why do we use regression for statistical inference?
• To express uncertainty about our conclusions about the relation between 2 concepts
• Assessing the strength of a relation
• Understand the population (based on a sample)
Why regression?
• What if we are not just interested in the difference between two means, but in how the mean values of a variable change as another variable changes?
• Example: Have available incomes increased in rich and poor countries, or have poor countries remained poor?




• How can we describe the strength of this association? Correlation? r = 0.961

Regression is related to correlation
• But regression can assess the impact of several independent variables on one specific dependent variable
o Not just strength of the association, but size of the effect: the expected change in Y as a result of a 1-unit change in X
• By assuming a linear association exists
• Regression can assess the null hypothesis: incomes are unrelated to incomes in the past

EXAMPLE: What is the relationship between the number of seats a party has in parliament and the number of motions it tables?




‘Line of best fit’
• Minimizing the distances between points and the line; your best guess given the data available

REGRESSION EQUATION: $y = a + bx$
• Intercept (constant): a; if the number of seats is 0, how many motions can we expect (according to the model)?
• Slope: b; if the number of seats increases by 1, what is the expected change in the number of motions (according to the model)?




• If a party has 30 seats, how many motions can we expect?
o $motions = a + b \times seats$
o $motions = 38.11 + 7.17 \times seats$
o $\widehat{motions} = 38.11 + 7.17 \times 30 \approx 253.3$ (consistent with unrounded coefficients; the rounded values shown give 253.21)
• We often use $\beta_0$ and $\beta_1$ instead of $a$ and $b$
o $y_i = \beta_0 + \beta_1 x_i$
o The subscript $i$ stands for the number of the observation: $y_1$ is the value of the response variable $y$ for the first observation in the dataset, and $y_i$ is the value of the response variable $y$ for any observation $i$ in the dataset.

ERROR: Some observations do not lie on the regression line, so there is error! All models are wrong

Including error in the equation
• $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ | All models are wrong, but we make assumptions about the error (e.g. it is random for all cases)
• $E[y_i | x_i] = \beta_0 + \beta_1 x_i$ | That's why we work with the expected value of $y_i$ given a value of $x_i$

HOW DO WE DRAW THE REGRESSION LINE?
• Ordinary Least Squares: Minimizes the residual sum of squares; a residual is the difference between a data point and the regression line




• Squaring these residuals gives us squared residuals, or squares; the sum of the squared residuals is $SS_R = 24680.2$
• The regression line is chosen in such a way that the residual sum of squares is as small as possible, least squares




Calculating the regression line
• $SS_R = \sum (y_i - \hat{y}_i)^2$
• $SS_R = \sum (y_i - \beta_0 - \beta_1 x_i)^2$
• $\hat{y}_i = \beta_0 + \beta_1 x_i$; $\hat{y}_i$ refers to the predicted value of $y$ according to the regression model
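
To make the least-squares idea concrete, the sketch below fits a line to a small made-up dataset and checks that nudging either coefficient away from the fitted values increases the residual sum of squares. The data points and function names are hypothetical and chosen only for illustration; they are not the seats and motions data from the lecture.

```python
# Hypothetical data, NOT the seats/motions data from the lecture
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]


def residual_sum_of_squares(b0: float, b1: float) -> float:
    """SS_R for the line y_hat = b0 + b1 * x on the data above."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Ordinary least squares estimates for one explanatory variable
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

best = residual_sum_of_squares(b0, b1)

# Any other line leaves a larger residual sum of squares
worse_intercept = residual_sum_of_squares(b0 + 0.1, b1)
worse_slope = residual_sum_of_squares(b0, b1 + 0.1)

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SS_R = {best:.2f}")
print(f"SS_R with shifted intercept = {worse_intercept:.2f}")
print(f"SS_R with shifted slope = {worse_slope:.2f}")
```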

In SPSS: Analyze > Correlate > Bivariate > Select Options > Cross-product deviations and covariances

$\hat{\beta}_1$ (predicted/estimated $\beta_1$) in our example

$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{25908}{3612} = 7.17$
$\hat{\beta}_1 = 7.17$




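$\hat{\beta}_0$ (predicted/estimated $\beta_0$) in our example

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
$\hat{\beta}_0 = 199.5 - 7.17 \times 22.5 = 38.17$
(With the unrounded slope, $25908 \div 3612 \approx 7.1728$, the intercept is 38.11, the value used in the regression equation earlier.)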
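
The slope and intercept can be recomputed from the summary quantities given in the example. This sketch keeps the coefficients unrounded, which shows where the 38.11 intercept and the prediction of about 253.3 motions for 30 seats come from; the variable names are assumptions made for illustration.

```python
# Summary quantities from the example
sum_cross_products = 25908.0   # sum of (x_i - x_bar) * (y_i - y_bar)
sum_squares_x = 3612.0         # sum of (x_i - x_bar) ** 2
x_bar = 22.5                   # mean number of seats
y_bar = 199.5                  # mean number of motions

# Least-squares estimates, kept unrounded
beta_1 = sum_cross_products / sum_squares_x
beta_0 = y_bar - beta_1 * x_bar

# Expected number of motions for a party with 30 seats
seats = 30
predicted_motions = beta_0 + beta_1 * seats

print(f"slope = {beta_1:.2f}, intercept = {beta_0:.2f}")
print(f"predicted motions for {seats} seats = {predicted_motions:.1f}")

# Rounding the slope to 7.17 before computing the intercept gives 38.17 instead
intercept_from_rounded_slope = y_bar - 7.17 * x_bar
print(f"intercept with rounded slope = {intercept_from_rounded_slope:.2f}")
```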

Multiple explanatory variables: If you have more than one explanatory variable in your model, you can still calculate the 'least squares'; this is what SPSS is for!

Regression: Key assumptions
1. It makes sense to treat the relationship between $E[y_i | x_i]$ and the x variable as linear and additive
2. $E[\varepsilon_i | x_i] = 0$: error exists but is assumed to be random, so it is not relevant for estimating point values
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
$E[y_i | x_i] = \beta_0 + \beta_1 x_i$
What variables are suitable for regression?
• Dependent variable: Interval-ratio scale response variables
o Must have the same substantive meaning anywhere on the scale, e.g. profit, GDP
• Otherwise, modification is needed:
o Nominal/Ordinal scale: Logistic regression (e.g. blue/brown; agree/strongly agree)
o Count scale (non-negative integers, e.g. war casualties): Poisson and negative binomial regression models; NOT covered in this course
• Explanatory variables can be of any type (with modification)
• Variable values must vary (variance cannot be zero)
Author: polscinotes (Universiteit Leiden)