Summary Introduction to the Practice of Statistics (Extended Version) - Statistics

Course
Statistics

Institution
Vrije Universiteit Amsterdam (VU)

This summary, based on the classic statistics textbook 'Introduction to the Practice of Statistics', helps students to correctly produce and interpret data found in a real-world context. The summary can be seen as a guide through the different types of data gathering and analysis. U...

Whole book summarized? Yes
Uploaded on 9 October 2023
Number of pages 43
Written in 2022/2023
Type Summary
Author Myrtevdbergh

Statistics Exam Notes

CHAPTER 1 - Looking at Data Distributions
Terms:
- cases: the objects described by a set of data → usually people in global health, but could also be villages, tractors etc.
- variable: a characteristic of a case → e.g. height
- value: different cases can have different values of a variable → e.g. the height in cm
- label (unique ID): a special variable used to distinguish or uniquely identify cases within the dataset → e.g. a participant ID number (gender would not work as a label, since many cases share the same value)
- the key characteristics of a data set answer the questions: who, what and why?

Examining distributions:
- overall pattern:
● shape: e.g. normally distributed
● center
● spread
- deviations
- symmetry - skewed to the left / skewed to the right
● In a negatively skewed (also known as left-skewed) distribution, most values are concentrated on the right side of the distribution graph, while the left tail is longer.
● In a positively skewed (right-skewed) distribution the opposite holds: most values are concentrated on the left side and the right tail is longer.

Measuring center:
1. The mean
- symbolized by x̄
- sensitive to outliers and skew
2. The median
- represented by M
- midpoint of a distribution
● half of the observations are smaller, the other half larger
- resistant to outliers and skew
- with an even number of observations there are two numbers in the middle → take their average: e.g. middle values 3 and 4 → M = 3.5
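The difference between a sensitive and a resistant measure of center can be checked with a few lines of Python. This is only a sketch: the heights are made-up values, and the single extreme observation of 250 is added purely to illustrate the effect of an outlier.

```python
from statistics import mean, median

# Made-up, roughly symmetric data (heights in cm).
heights_cm = [160, 165, 170, 175, 180]
# The same data plus one extreme value (a hypothetical outlier).
with_outlier = heights_cm + [250]

mean_before, median_before = mean(heights_cm), median(heights_cm)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# With six observations the median is the average of the two middle values.
print("without outlier:", mean_before, median_before)
print("with outlier:   ", round(mean_after, 1), median_after)
```

The mean moves from 170 to about 183.3, while the median only moves from 170 to 172.5.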

Measuring spread: the quartiles
● works with the median (not the mean)
● splitting data into quartiles means splitting into 4 parts
● the median splits the data into 2 halves; Q1 is the median of the lower half and Q3 is the median of the upper half
● IQR (interquartile range) = Q3 - Q1
● 1.5 x IQR rule for identifying outliers → any value greater than Q3 + (1.5 x IQR) or smaller than Q1 - (1.5 x IQR) is a suspected outlier
- Multiplying the interquartile range (IQR) by 1.5 gives a margin for deciding whether a value is an outlier. If we subtract 1.5 x IQR from the first quartile, any data values below that number are considered outliers. Likewise, if we add 1.5 x IQR to the third quartile, any data values above that number are considered outliers.
● Order (five-number summary): minimum - quartile 1 - median/quartile 2 - quartile 3 - maximum

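Boxplots
- a boxplot is a graph of the five-number summary: a box from Q1 to Q3 with a line at the median, and whiskers extending to the minimum and maximum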

Measuring spread: the standard deviation
- works with the mean (not the median)
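- symbolized by s (or Sx)
- roughly the typical distance of the observations from the mean: the square root of the sum of squared deviations from the mean divided by n - 1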
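As a sketch of how the standard deviation follows from the deviations around the mean, the snippet below computes it step by step and compares the result with `statistics.stdev` from the Python standard library, which also divides by n - 1. The observations are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

observations = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data

x_bar = mean(observations)                                  # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in observations]
variance = sum(squared_deviations) / (len(observations) - 1)  # divide by n - 1
s_by_hand = sqrt(variance)

s_library = stdev(observations)   # sample standard deviation from the stdlib
print(round(s_by_hand, 3), round(s_library, 3))
```

Here the mean is 5, the squared deviations sum to 32, and the standard deviation is about 2.138.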

Choosing measures of center and spread:
NOTE: The median and IQR are usually better than the mean and standard deviation for
describing a skewed distribution or a distribution with outliers.
→ use mean and standard deviation only for reasonably symmetric distributions that
do not have outliers

Models
A model: a simplified representation of something more complex that helps us to understand it
1. density curve:
- smooth curve drawn over the distribution
- it is a model of the distribution
- it is a model of what value the variable takes and how often
- if a smooth curve is always on or above the x-axis and the total area under the curve is scaled to 1, it is a density curve

Area under the curve:
● total area under a density curve is 1
● EXAMPLE: in a model of the vocabulary scores of 947 seventh graders, the shaded proportion of the density curve (scores of 6 and below) is equal to 0.293 → how to interpret? About 29.3% of the vocabulary scores of the 947 seventh graders are 6 or lower.
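The idea that areas under a density curve are proportions can be illustrated numerically. The sketch below uses a hypothetical triangular density, f(x) = x / 2 on the interval from 0 to 2, which is not the vocabulary model from the example, and approximates areas with a midpoint Riemann sum. The function names are illustrative only.

```python
def density(x):
    """Hypothetical density curve: f(x) = x / 2 on [0, 2], and 0 elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with a midpoint Riemann sum."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total_area = area_under(density, 0, 2)      # a density curve has total area 1
share_below_1 = area_under(density, 0, 1)   # proportion of values below 1
print(round(total_area, 4), round(share_below_1, 4))
```

The total area comes out as 1, and the area to the left of 1 is 0.25, so under this model 25% of the values lie below 1.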

Greek letters
● When the mean and standard deviation come from a model of the data rather than from the data themselves, Greek letters are used: μ for the mean and 𝜎 for the standard deviation

Normal density curve:
- mathematical model for normally distributed data
- symmetric, single-peaked, and bell-shaped
- completely described by two numbers: μ (mean) and 𝜎 (standard deviation)
- N (μ,𝜎)

The 68-95-99.7 rule
In the Normal distribution with mean μ and standard deviation 𝜎:
- approximately 68% of the observations fall within 1𝜎 of μ
- approximately 95% of the observations fall within 2𝜎 of μ
- approximately 99.7% of the observations fall within 3𝜎 of μ
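The three percentages can be checked against the Normal model with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a sketch: the helper name `share_within` is made up, and the default mean and standard deviation are arbitrary, since the shares are the same for every Normal distribution.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Proportion of a Normal(mu, sigma) distribution within k standard deviations of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} SD: {share:.4f}")
```

The exact shares are about 0.6827, 0.9545 and 0.9973, which the rule rounds to 68%, 95% and 99.7%.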

Standard normal distribution
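● N (0,1): mean 0 and standard deviation 1
● Simply easier to work with
● All normal distributions can be transformed (standardized) to N (0,1)

--> standardized value of x (z-score): z = (x - μ) / 𝜎; the z-score tells how many standard deviations x lies from the mean, and the standard normal distribution then gives the corresponding probability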
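Standardizing can be sketched as follows, assuming the usual definition of the z-score as the distance from the mean in units of the standard deviation. The mean of 170 and standard deviation of 10 are hypothetical values chosen for illustration, and `z_score` is a made-up helper name.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

mu, sigma = 170.0, 10.0   # hypothetical model parameters
x = 185.0

z = z_score(x, mu, sigma)
# The proportion below x under N(mu, sigma) equals the proportion below z under N(0, 1).
p_model = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist(0, 1).cdf(z)
print(z, round(p_model, 4), round(p_standard, 4))
```

Here z = 1.5, and both calculations give a proportion of about 0.9332, which is why a single table for N (0,1) is enough for any Normal distribution.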
