Tutorial 1: Population, sample, variables, frequency table
Before you start a study, you first have to define several things, such as:
o Research question
o Population
o Unit
o Sample
o Variable (= property of a unit from the sample)
Population: every member of a group (persons, objects, etc.) for which we would like to collect
information.
Research question: question that we want to answer
Sample: part of the population that we will study and collect information for.
Why sample? Studying the whole population is too expensive or time-consuming, so we draw a
sample.
We want to draw conclusions about the population, so the sample should be representative of
the population.
Units: the elements of a sample from which we collect the information.
Variable: measured property of an element of the sample.
↳ Quantitative variable (continuous/discrete)
Height, weight at birth (c = any value within a range can occur and can be measured very precisely)
Number of children in a household, number of diseased plants in a field, number of
cigarettes each day for a pregnant woman (d = a specific count)
↳ Qualitative variable (nominal/ordinal)
Hair colour, bachelor program, province, place of residence (n = categories, not
numbers; the categories cannot be put in a meaningful order)
Grade of eggs, highest level of education completed, annual salary (in classes) (o = categories,
not numbers, so you cannot calculate with them, but they can be put in order)
Simple Random Sampling (SRS):
In SRS, units are drawn at random from a population. Every possible sample of the same size has an equal chance of being selected.
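A minimal sketch of drawing a simple random sample, assuming a hypothetical population of 100 numbered units (the population, the sample size of 10 and the seed are illustrative choices, not part of the notes):

```python
import random

# Hypothetical population: unit IDs 1..100 (not from the notes).
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed so the draw is reproducible

# random.sample draws without replacement; every subset of size 10
# has the same chance of being selected, which is the SRS property.
sample = rng.sample(population, k=10)
print(sorted(sample))
```

Because the draw is without replacement, no unit can appear twice in the sample.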
Bias: certain parts of the population might be overrepresented as compared to other parts.
↳ Undersampling (undercoverage): certain groups are excluded from the sample. For example, not all
women give birth in the hospital; some give birth at home. So hospital records alone are not
enough for a study of women giving birth.
↳ Non-response: not participating, or not successfully contacted
↳ Voluntary participation: might result in particularly positive or negative answers
↳ Response bias: social desirability bias (self-reported personal traits, questions about
income). There is a big chance that people will give a socially acceptable answer instead of an
honest answer.
Observational study: observe the unit/process without influencing it.
Experimental study: apply a treatment to the unit in order to observe a reaction.
↳ A cause-effect relationship can only be concluded from an experimental study.
Frequency: an absolute count (how often a value occurs)
Relative frequency (fraction) = frequency / total
↳ advantage: easier to compare data
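As a rough illustration, a frequency table with absolute and relative frequencies could be built as follows. The hair-colour observations are made up for the example:

```python
from collections import Counter

# Hypothetical observations of a nominal variable (hair colour).
hair = ["brown", "blond", "brown", "black", "brown", "blond", "red", "brown"]

freq = Counter(hair)          # absolute frequencies
total = sum(freq.values())    # total number of observations

# Relative frequency = frequency / total
rel_freq = {value: count / total for value, count in freq.items()}

print(f"{'value':<8}{'freq':>6}{'rel. freq':>11}")
for value, count in freq.most_common():
    print(f"{value:<8}{count:>6}{rel_freq[value]:>11.3f}")
```

The relative frequencies always add up to 1, which is what makes samples of different sizes comparable.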
Tutorial 2 – Numerical summary of data: measures of centre and dispersion, probability, the
law of large numbers, consistency
Mean (= average): first add all the values and then divide by the number of values.
Median (M): first order the data from smallest to largest. With an odd number of data the median is
the middle value. [e.g. 1,3,4,6,7,7,8 M=6]
With an even number of data the median is the mean of the two values in the middle. [e.g.
1,3,4,5,6,7,7,8 M=(5+6)/2=5.5]
↳ The difference between the mean and the median: the median is not affected by
outliers, but the mean is.
[e.g. 4,5,6,7,9 M=6, mean=6.2 | but when the last value changes: 4,5,6,7,110 M=6, mean=26.4]
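The examples above can be checked with Python's standard `statistics` module. This is only a quick sketch using the numbers from the notes:

```python
from statistics import mean, median

data = [4, 5, 6, 7, 9]
with_outlier = [4, 5, 6, 7, 110]   # last value replaced by an outlier

print(mean(data), median(data))                  # 6.2 6
print(mean(with_outlier), median(with_outlier))  # 26.4 6

# Even number of observations: mean of the two middle values (5 and 6).
print(median([1, 3, 4, 5, 6, 7, 7, 8]))          # 5.5
```

The outlier moves the mean from 6.2 to 26.4, while the median stays at 6.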
Standard deviation (sd): for each value, compute the difference between the value and the
mean and square it, then add up these squared differences over all values. Dividing this sum by
n – 1 gives the variance, and the standard deviation is its square root. It indicates how far an
observation lies from the mean on average. If the values lie close together the deviation is
small; if they lie far apart the deviation is large.
$s = \sqrt{s^2}$          Variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
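A small worked sketch of the variance and standard deviation, assuming the usual sample formula with divisor n – 1 (the same convention that `statistics.variance` uses). The data are the five values from the mean/median example:

```python
from math import sqrt
from statistics import mean, stdev, variance

data = [4, 5, 6, 7, 9]
n = len(data)
x_bar = mean(data)  # 6.2

# Squared difference between each value and the mean.
squared_deviations = [(x - x_bar) ** 2 for x in data]

s2 = sum(squared_deviations) / (n - 1)  # sample variance (assumed divisor n - 1)
s = sqrt(s2)                            # standard deviation

print(f"variance = {s2:.2f}, sd = {s:.2f}")  # variance = 3.70, sd = 1.92

# The library functions follow the same n - 1 convention.
print(f"{variance(data):.2f} {stdev(data):.2f}")
```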
Range: the difference between the maximum and minimum.
First quartile (Q1/lower quartile/25th percentile) = the middle value between the minimum and
median.
Third quartile (Q3/upper quartile/75th percentile) = the middle value between the median and
maximum
Interquartile range (IQR = Q3 – Q1) = the length of the box in a box plot
↳ The interquartile range is not sensitive to outliers in contrast to the variance, and therefore
also in contrast to the standard deviation.
The pth percentile of a set of n ordered observations (from smallest to largest) is the value where at
most p% of the observations are smaller than it and at most (100-p)% of the observations are larger.
[e.g. with 15 values in total: 1 of the 15 values is lower than 1.83, so 1/15 = 0.0667 = 6.67% | 13 of the 15
values are higher than 1.83, so 13/15 = 0.8667 = 86.67%]
Q1 = the value with at most 25% of the observations below it and at most 75% above it = nr. 4
Nr.   Value   % below   % above
1     1.78     0.00     93.33
2     1.83     6.67     86.67
3     1.98    13.33     80.00
4     2.04    20.00     73.33
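The percentile rule above can be sketched directly in code. The function name `percentile` and the choice to average two qualifying values (which happens when p% of n is a whole number) are assumptions made for this illustration; other textbooks and software use slightly different conventions.

```python
def percentile(values, p):
    """p-th percentile: at most p% of the values below, at most (100 - p)% above."""
    n = len(values)
    candidates = set()
    for v in values:
        below = sum(1 for x in values if x < v)
        above = sum(1 for x in values if x > v)
        # Integer comparison avoids floating-point rounding at the boundary.
        if below * 100 <= p * n and above * 100 <= (100 - p) * n:
            candidates.add(v)
    # Assumption: if two distinct values qualify, report their mean.
    return sum(candidates) / len(candidates)


# Hypothetical ordered sample of 15 values (simply 1..15).
sample = list(range(1, 16))
print(percentile(sample, 25))                    # 4.0, i.e. observation nr. 4
print(percentile([1, 3, 4, 5, 6, 7, 7, 8], 50))  # 5.5, the median
```

With 15 observations, only the fourth one has at most 25% below it (3/15 = 20%) and at most 75% above it (11/15 = 73.33%), which agrees with the table.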
- Five-number summary
1. The sample minimum (=smallest observation)
2. The lower quartile (=first quartile/Q1)
3. The median (=middle value/second quartile/Q2)
4. The upper quartile (=third quartile/Q3)
5. The sample maximum (=largest observation)
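A possible sketch of the five-number summary, taking Q1 and Q3 as the medians of the lower and upper half of the ordered data (one common convention; software packages may give slightly different quartiles). The helper name `five_number_summary` is made up for this example:

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least 2 values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]  # values before the median position
    # With an odd n the median itself is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


minimum, q1, med, q3, maximum = five_number_summary([1, 3, 4, 6, 7, 7, 8])
print(minimum, q1, med, q3, maximum)  # 1 3 6 7 8
iqr = q3 - q1
print(iqr)                            # 4
```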
Law of large numbers: the relative frequency of an outcome stabilizes around its probability as an experiment is repeated more and more often.
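The law of large numbers can be made visible with a small simulation. This sketch assumes a fair coin (probability 0.5 of heads) and a fixed seed; both are illustrative choices, not from the notes:

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
checkpoints = {10, 100, 1_000, 10_000, 100_000}
results = {}

heads = 0
for n in range(1, 100_001):
    heads += rng.random() < 0.5  # one toss: counts 1 for heads, 0 for tails
    if n in checkpoints:
        results[n] = heads / n   # relative frequency of heads after n tosses
        print(f"n = {n:>6}: relative frequency = {results[n]:.4f}")
```

For small n the relative frequency can be far from 0.5; as n grows it settles close to 0.5.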
Statistical notation:
n = sample size = number of persons in the sample
y = number of persons in the sample that have the property of interest
p = probability / chance