Introduction to Statistics - Lecture Notes

Author: ilariamonese

Course: Introduction to Statistics
Institution: Universiteit van Amsterdam (UvA)

Introductory course on statistics for the first year of Sociology by the lecturer Thijs Bol at the UvA.

Uploaded on 1 August 2019
Number of pages: 26
Written in 2018/2019
Type: Lecture notes
Lecturer(s): Unknown
Contains: All lectures

INTRODUCTION TO STATISTICS – Lecture 0 19/11/2018

Types of variables
Different types of variables:

Measurement level   Description                                       Example
NOMINAL             No rank order                                     Religion, political party voted for
ORDINAL             Rank order, but unequal distances                 Disagree completely – Agree completely
INTERVAL            Rank order with equal distances                   Celsius, hourly wage
RATIO               Rank order with equal distances and a natural 0   Age, weight, height

Nominal: closed (categorical) questions
Ordinal: closed questions

DICHOTOMOUS VARIABLES
There are just two categories, such as yes or no, usually coded 0 and 1.
Example: Sex? 0 = Female, 1 = Male

Different types of variables require different types of description.
We want to describe data. We can’t do this by showing all answers to a survey.
A core function of statistics is to describe (survey) data: centrality and dispersion.

CENTRALITY
Where is the center of the variable?
Three common ways to address centrality:
- Mode indicates the most common value
- Median indicates the middle value
- Mean 𝑦̅ indicates the average value: the sum of all values divided by the number of observations
  𝑦̅ = (∑ 𝑦𝑖) / 𝑛

For dichotomous variables coded 0 and 1, the mean equals the proportion 𝜋̂ of observations coded 1.
The proportion is the percentage expressed as a fraction: proportion = percentage / 100
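
The three centrality measures and the proportion can be tried out with Python's standard `statistics` module. This is a minimal sketch; the lists `ages` and `answers` are made-up illustration data, not numbers from the lecture.

```python
from statistics import mean, median, mode

# Hypothetical ages of seven respondents (illustration only)
ages = [19, 20, 20, 21, 23, 25, 40]

age_mode = mode(ages)      # most common value
age_median = median(ages)  # middle value of the sorted data
age_mean = mean(ages)      # sum of all values divided by n

# Hypothetical dichotomous variable coded 0 = no, 1 = yes
answers = [1, 0, 1, 1, 0]
proportion = mean(answers)     # mean of a 0/1 variable = share of 1s
percentage = proportion * 100  # proportion = percentage / 100

print(age_mode, age_median, age_mean)
print(proportion, percentage)
```

Note how the single high age (40) pulls the mean above the median, while the mode and median stay close to where most respondents are.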

The type of variable defines the centrality measure that we can use.
Nominal: mode
Ordinal: mode and median. The mean is not strictly allowed, but everyone uses it
Interval/ratio: mode, median, mean
Dichotomous: mean

DISPERSION
Knowing the center of the data tells us very little about how the data are distributed. Data have a
certain level of dispersion, and there are different measures for it:
- Frequencies: how often do we see each answer?
- Range: what are the minimum and maximum values?
- Standard deviation s
- Variance s²

Standard deviation s
The standard deviation is based on the sum of all squared distances to the mean.
If all observations are clustered around the mean, the sum of distances will be small.
If observations are widely dispersed around the mean, the sum of distances will be larger.
Dividing this sum by n – 1 gives the variance s²; its square root is the standard deviation s.

The standard deviation is a summary measure of the typical distance to the mean.
If there is more dispersion, the standard deviation s will be higher.
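
The dispersion measures above can be sketched in a few lines of Python. The two score lists are invented so that they share the same mean but differ in spread. The helper `sample_sd` is a hypothetical name, and it assumes the sample formula with n – 1 in the denominator, which the notes do not state explicitly.

```python
from collections import Counter
from statistics import stdev

# Two hypothetical sets of scores with the same mean (5)
scores_a = [4, 5, 5, 6]  # clustered around the mean
scores_b = [1, 3, 7, 9]  # widely dispersed around the mean

frequencies = Counter(scores_a)                # how often each value occurs
value_range = (min(scores_b), max(scores_b))   # minimum and maximum value


def sample_sd(values):
    """Standard deviation from the sum of squared distances to the mean."""
    n = len(values)
    y_bar = sum(values) / n
    squared_distances = sum((y - y_bar) ** 2 for y in values)
    variance = squared_distances / (n - 1)  # assumption: sample variance
    return variance ** 0.5


print(frequencies)
print(value_range)
print(sample_sd(scores_a), sample_sd(scores_b))
```

The hand-written function should agree with `statistics.stdev`, which uses the same n – 1 convention.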

Comparing distributions
If we want to compare positions in different distributions, we can use Z-SCORES.

A z-score is the number of standard deviations between an observation and the mean:
z = (y – 𝑦̅) / s
Z-scores take into account that different distributions might have a different mean and a
different level of dispersion, so a z-score is a standardized measure that can be compared
across distributions.
It is useful for inferential statistics.
It all depends on the reference group: context matters, because a z-score only says where an
observation stands relative to its own distribution.
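
As an illustration of comparing positions across distributions, the sketch below standardizes two results from exams with different scales. All numbers are hypothetical, and `z_score` is just an illustrative helper name.

```python
def z_score(y, y_bar, s):
    """Number of standard deviations between observation y and the mean."""
    return (y - y_bar) / s


# Hypothetical exam A: graded 1-10, mean 6.0, standard deviation 1.0
z_a = z_score(7.5, 6.0, 1.0)

# Hypothetical exam B: graded 0-100, mean 70, standard deviation 10
z_b = z_score(80, 70, 10)

print(z_a, z_b)
```

Although 80 is the larger raw number, the 7.5 on exam A is the better result relative to its own reference group, because it lies further above its mean in standard deviations.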

INTRODUCTION TO STATISTICS – Lecture 1 Week 1 – 07/01/2019

On probability, z-scores and distributions

Distribution of data
Data can be distributed in different ways. We can have a skewed distribution or a bell-
shaped distribution. In a perfect bell-shaped distribution, the distribution is perfectly
symmetrical around the mean 𝑦̅. This means that the right and left tail are symmetrical.

Empirical Rule: in bell-shaped distributions we can summarize where the observations lie:
- 68% of all observations are between 𝑦̅ – s and 𝑦̅ + s
- 95.4% of all observations are between 𝑦̅ – 2s and 𝑦̅ + 2s
- 99.7% of all observations are between 𝑦̅ – 3s and 𝑦̅ + 3s
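
The percentages in the Empirical Rule can be checked against a perfectly normal curve using `statistics.NormalDist` from the Python standard library. This is a sketch under the assumption of an exactly normal variable; the mean of 170 and standard deviation of 10 are arbitrary illustration values, and the resulting shares are the same for any other choice.

```python
from statistics import NormalDist

# Hypothetical bell-shaped variable (e.g. height in cm); values are made up
y_bar, s = 170, 10
dist = NormalDist(mu=y_bar, sigma=s)


def share_within(k):
    """Share of observations between y_bar - k*s and y_bar + k*s."""
    return dist.cdf(y_bar + k * s) - dist.cdf(y_bar - k * s)


for k in (1, 2, 3):
    print(f"within {k} s of the mean: {share_within(k):.1%}")
```

Real data only approximate these shares, and the approximation gets worse the more skewed the distribution is.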

Probabilities and probability distributions
We can think of frequency distributions as probability distributions as well. If we pick one
random inhabitant of De Pijp, for example, what is the probability that he/she is older than
35? We can determine this on the basis of the distribution.
The probability p is the area under the curve for the range of values we are interested in
(here, the area to the right of 35).
We can apply this to all normal distributions.
We can also apply this and the Empirical Rule to the standard normal distribution, which is a
theoretical distribution used in inferential statistics. Empirical distributions are hardly ever
exactly normally distributed. We use the standard normal distribution for calculations.
Characteristics of the standard normal distribution:
- Bell-shaped
- Perfectly symmetrical
- Mean 𝑦̅ = 0 and standard deviation s = 1
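
The question about a random inhabitant being older than 35 can be sketched as an area under a normal curve. The notes give no actual figures, so the mean age of 38 and standard deviation of 12 below are invented, and treating ages as normally distributed is itself a simplifying assumption.

```python
from statistics import NormalDist

# Assumption for illustration only: ages roughly normal, mean 38, sd 12
ages = NormalDist(mu=38, sigma=12)

# P(age > 35) is the area under the curve to the right of 35
p_older_than_35 = 1 - ages.cdf(35)

# Same probability via the standard normal distribution (mean 0, sd 1)
z_35 = (35 - ages.mean) / ages.stdev
p_via_z = 1 - NormalDist().cdf(z_35)

print(round(p_older_than_35, 3), round(p_via_z, 3))
```

Both routes give the same probability, which is why calculations are usually done on the standard normal distribution after converting to z-scores.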

Z-scores and probabilities
Probabilities can be linked to z-scores. In the standard normal distribution a value is its own
z-score (z = y), because 𝑦̅ = 0 and s = 1. Every position in a normal distribution has a z-score
with a corresponding probability that we can check in the Z-table. For normally distributed
variables we can convert z-scores to probabilities (and the other way around).
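
Instead of a printed Z-table, the conversion in both directions can be sketched with `statistics.NormalDist`. The helper names `z_to_p` and `p_to_z` are hypothetical, and the sketch assumes the cumulative convention P(Z ≤ z); a printed Z-table may tabulate a different area (for example the right tail), so check which one it shows.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_to_p(z):
    """Probability of a value at or below z, i.e. the area to the left of z."""
    return std_normal.cdf(z)


def p_to_z(p):
    """z-score below which a share p of the distribution lies."""
    return std_normal.inv_cdf(p)


p = z_to_p(1.96)   # close to 0.975
z = p_to_z(0.975)  # close to 1.96
print(round(p, 3), round(z, 2))
```

Right-tail probabilities follow as 1 minus the left-tail value, because the total area under the curve is 1.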
