This is a summary of the material for lecture 1 of Beschrijvende Statistiek (Descriptive Statistics) in the pre-master Orthopedagogiek at the Universiteit van Amsterdam. It covers chapters 1.1 through 1.3 and 2.1 through 2.3 of Statistics by Agresti & Franklin.
1. Gathering and exploring data
1.1. Using data to answer statistical questions
Statistical problem solving is an investigative process that involves 4 components:
- Formulate a statistical question
- Collect data
- Analyse data
- Interpret results
3 main components of statistics for answering a statistical question:
- Design = stating the goals and/or statistical question of interest and planning how to obtain
data that will address them
- Description = summarizing and analysing the data that are obtained
- Inference = making decisions and predictions based on the data for answering the statistical
question
Probability = framework for quantifying how likely various possible outcomes are
1.2. Sample versus population
Subjects = the entities measured in a study
Population = total set of all the subjects of interest
Sample = subset of the population for whom we (plan to) have data
Descriptive statistics refers to methods for summarizing the collected data. The summaries usually
consist of graphs and numbers such as averages and percentages.
Inferential statistics refers to methods of making decisions or predictions about a population, based
on data obtained from a sample of that population.
- An important aspect of this involves reporting the likely precision of a prediction: how close
is the sample value to the true value of the population? This is expressed by the margin of error
Parameter = numerical summary of the population
Statistic = numerical summary of a sample taken from the population
Random sampling = every subject in the population has the same chance of being included in the
sample
- Allows us to make powerful inferences about populations
Randomness is also crucial to performing experiments well (randomization)
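A minimal sketch of random sampling, assuming a made-up population of 1,000 ages (the data and the names `population` and `sample` are illustrative and not from the book). It draws a sample with Python's `random.sample`, which gives every subject the same chance of being included, and compares the sample mean (a statistic) with the population mean (a parameter).

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: ages of 1,000 subjects (made-up data)
population = [random.randint(18, 65) for _ in range(1000)]

# Simple random sample: every subject has the same chance of being chosen
sample = random.sample(population, k=100)

parameter = statistics.mean(population)  # numerical summary of the population
statistic = statistics.mean(sample)      # numerical summary of the sample

print(f"Population mean (parameter): {parameter:.1f}")
print(f"Sample mean (statistic):     {statistic:.1f}")
```

The two values will usually be close but not identical; that gap is the random variation the margin of error describes.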
Margin of error = measure of the expected variability from one random sample to the next random
sample
‘Very likely’ typically means 95 times out of 100 (95% confidence interval)
Approximate margin of error = $\frac{1}{\sqrt{n}} \times 100\%$
Random variation is roughly the size of the margin of error (formula above)
- The difference expected through ordinary random variation is smaller with larger samples
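To see how the approximate margin of error, 1/√n × 100%, shrinks as the sample grows, here is a small sketch. The function name `approx_margin_of_error` and the chosen sample sizes are assumptions made for illustration.

```python
import math


def approx_margin_of_error(n: int) -> float:
    """Approximate margin of error in percentage points: (1 / sqrt(n)) * 100."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return 100 / math.sqrt(n)


# Illustrative sample sizes (not from the book)
for n in (100, 400, 1000, 2500):
    print(f"n = {n:>4}: about ±{approx_margin_of_error(n):.1f}%")
```

With n = 100 the approximation gives 10%, and with n = 400 it gives 5%: quadrupling the sample size halves the margin of error.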
Statistically significant = when the difference between results of treatment and control group is so
large that it would be rare to see such a difference by ordinary random variation
1.3. Using calculators and computers
To make statistical analysis easier, large sets of data are organised in a data file
Two basic rules for constructing a data file:
- Any one row contains measurements for a particular subject
- Any one column contains measurements for a particular characteristic
Database = archived collection of data files
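The two rules for constructing a data file can be illustrated with a tiny example. This is only a sketch: the column names and values are invented, and the file is held in a string rather than on disk so the block runs on its own.

```python
import csv
import io

# Hypothetical data file: one row per subject, one column per characteristic
raw = """subject,age,gender,hours_studied
1,23,female,12.5
2,31,male,8.0
3,27,female,15.0
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# A row holds all measurements for one particular subject
print(rows[0])

# A column holds one particular characteristic across all subjects
ages = [int(row["age"]) for row in rows]
print(ages)
```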
2. Exploring data with graphs and numerical summaries
2.1. Different types of data
Variable = any characteristic observed in a study
- A variable is called quantitative if observations on it take numerical values that represent
different magnitudes of the variable
o Key features to describe: center and variability (AKA spread)
o A quantitative variable is discrete if its possible values form a set of separate numbers
o A quantitative variable is continuous if its possible values form an interval (infinite
continuum of possible values)
- A variable is called categorical if each observation belongs to one of a set of distinct
categories.
o Key feature to describe: relative number of observations in the various categories
Observations = data values that we observe for a variable
The distribution of a variable describes how the observations fall (are distributed) across the range
of possible values
- Can be displayed by a graph or a table
- Features to look for in distribution of categorical variables:
o Modal category = the category with the largest frequency
o And more generally how frequently each category was observed
- Features to look for in distribution of quantitative variables:
o Shape = do observations cluster in certain intervals and/or are they spread thin in
others?
o Center = where does a typical observation fall?
o Variability = how tightly are the observations clustering around a center?
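The features listed above can be computed for small made-up data sets as follows. This is a sketch under assumptions: the observations are invented, and the mean and the range are used here as just one possible choice of measures for center and variability.

```python
from collections import Counter
import statistics

# Hypothetical observations (made-up data)
eye_color = ["brown", "blue", "brown", "green", "brown", "blue"]  # categorical
hours_sleep = [6.5, 7.0, 7.5, 8.0, 7.0, 6.0, 9.0]                 # quantitative

# Categorical: frequency table and modal category
freq = Counter(eye_color)
modal_category, modal_count = freq.most_common(1)[0]
for category, count in freq.items():
    print(f"{category:<6} {count}  ({count / len(eye_color):.0%})")
print("Modal category:", modal_category)

# Quantitative: one measure of center and one of variability
center = statistics.mean(hours_sleep)         # where a typical observation falls
spread = max(hours_sleep) - min(hours_sleep)  # range: largest minus smallest value
print(f"Mean: {center:.2f}, range: {spread:.1f}")
```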