This document contains notes from all the lectures (1 to 12), some interesting notes and tips from the computer practicals, as well as notes from the pen and paper practicals (PPP). Information from some knowledge clips is already included in the notes.
R output is included, so it is easier to kn...
Lecture 1a – advanced statistics
Main aim: Inference (= draw conclusions about a population or about a general phenomenon based on a
limited number of observations, which are the sample data)
3 different situations for t-procedures (confidence interval and t-tests):
- one sample, one mean (e.g. the mean body weight of all 6-year-old boys in the Netherlands)
- paired observations, mean difference (e.g. data of twins or before and after study)
- two independent samples, difference in means (e.g. two populations: difference in exam scores between
males and females, which is a typical observational study)
1. Inference (1 sample)
Take a random sample (sample data that are representative of the whole population). The population is the same
for each sample, but random noise makes the samples a bit different from each other.
→ Conclusions of inference are partly based on ‘noise’, introducing a level of uncertainty in the conclusions.
That is why we do tests with a ‘significance level α’ and use 0.95 confidence intervals (both account for the
uncertainty that random sampling introduces)
2. Confidence intervals
1) Explain what a confidence interval for a parameter means
2) Specify the general pattern of a confidence interval (the 4 elements of t-procedures)
a. Parameter of interest = what you want to know, what you want to draw a conclusion about
= something that describes the population
b. Estimator (= method of estimation) – how to estimate the parameter from the data (it’s a
method, a formula)
c. Standard error of the estimator (= how certain we can be about the estimate)
d. Degrees of freedom (= in estimating the spread) for the t-distribution
3) Apply this pattern to a specific problem (calculate the limits of the interval) → know “which
situation” to apply
Situation 1 – 1 sample situation
E.g. What is the mean body height of Wageningen students?
→ answered by constructing a confidence interval
Step 1: take a random sample of 25 male students
→ draw conclusions about a large population based on the 25 observations
Sampling terminology
• We are interested in the mean of one trait (body height) in one population (e.g. all male WUR
students)
• The students are the sampling units
• The response is body height, measured per student (so the student is also the observed or
measurement unit)
• The scientist draws conclusions about the population mean (of body height) based on one random
sample = ‘one-sample situation’ = one population, one mean
• The population is a physical population
• The type of research is observational
Parameter of interest: mean body height of all male WUR students = mu or μy with y being the height
Step 2: to determine the confidence interval, we need the summary statistics of the data set
Sample size: n=25
Sample mean: ȳ (y-bar) = 184
Sample standard deviation: s=9 (= how variable the values are)
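As a worked sketch, the four elements of the t-procedure can be filled in with these summary statistics. The standard error formula s/√n, the degrees of freedom n − 1 and the table value t ≈ 2.064 (two-sided 0.95, 24 df) are standard results that are not stated in the notes above, so treat them as assumptions of this example.

$$
\begin{aligned}
\text{estimate: } & \bar{y} = 184 \\
\text{standard error: } & SE(\bar{y}) = \frac{s}{\sqrt{n}} = \frac{9}{\sqrt{25}} = 1.8 \\
\text{degrees of freedom: } & df = n - 1 = 24 \\
\text{0.95 interval: } & \bar{y} \pm t_{0.975,\,24} \cdot SE(\bar{y}) \approx 184 \pm 2.064 \times 1.8 \approx (180.3,\ 187.7)
\end{aligned}
$$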
• A confidence interval is a range of values for a parameter: the range of values for the parameter that
we have “confidence” in
• The confidence level (1- α) is often 0.95 (α is 0.05 = 5%)
• The width of a confidence interval reflects the precision of the estimate: precise estimate = narrow
interval
• Bounds or limits of the interval are random: they depend on the units that are drawn in the sample.
• The 0.95 (1- α): the interval is constructed such that the probability that the interval will contain the
true parameter value is 0.95. Imagine many repeats of the experiment. In each repeat we have new
data and a new interval. Of all these intervals, 95% will then contain the true parameter value. In
practice we only have one sample, so the 0.95 is a property of the method, not of the one interval
we happen to obtain
• A CI is typically of the form: best guess (estimate) ± error margin
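The "many repeats" interpretation can be checked with a small simulation. The sketch below is only an illustration: the normal population with true mean 184 and standard deviation 9, the number of repeats, and the hard-coded table value t(0.975, df = 24) = 2.064 are all assumptions chosen for the example, not values taken from the lecture.

```python
import random
import statistics

# Assumed values for the simulation only (not from the notes):
# a normal population with true mean 184 and standard deviation 9.
TRUE_MEAN = 184.0
TRUE_SD = 9.0
N = 25            # sample size, as in the body height example
T_CRIT = 2.064    # table value t(0.975, df = 24), hard-coded assumption
REPEATS = 4000


def confidence_interval(sample, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * standard error."""
    n = len(sample)
    estimate = statistics.mean(sample)
    std_error = statistics.stdev(sample) / n ** 0.5
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


def coverage(repeats, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Each repeat draws new data and therefore gives a new interval.
        sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
        lower, upper = confidence_interval(sample, T_CRIT)
        if lower <= TRUE_MEAN <= upper:
            hits += 1
    return hits / repeats


if __name__ == "__main__":
    print(f"Fraction of intervals containing the true mean: {coverage(REPEATS):.3f}")
```

The printed fraction should land close to 0.95; each individual interval either contains the true mean or it does not, which is why the 0.95 describes the procedure.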
E.g. Is there a difference in mean body height of male students compared to 1980 (when it was 180 cm)?
→ answered by doing a t-test
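A sketch of how this test could be carried out, assuming the summary statistics of the sample of 25 male students (n = 25, ȳ = 184, s = 9) and a two-sided test at α = 0.05. The hypotheses and the table value t(0.975, df = 24) ≈ 2.064 are assumptions added for illustration; they are not given in the notes.

$$
H_0: \mu_y = 180 \quad \text{vs.} \quad H_a: \mu_y \neq 180, \qquad
t = \frac{\bar{y} - 180}{s/\sqrt{n}} = \frac{184 - 180}{9/\sqrt{25}} = \frac{4}{1.8} \approx 2.22, \quad df = 24
$$

Because |t| ≈ 2.22 exceeds 2.064, H0 would be rejected at α = 0.05 under these assumptions.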
Situation 2 – paired data
Blood pressure change: a physician records the blood pressure before (x) and after 2 weeks (y) of medication
use for 16 patients: d = x-y (regarded as a random sample)
Q1: What is (in general, or ‘in the population’) the change in mean blood pressure after medication use (μx – μy),
or what is the mean change in blood pressure (μd) after medication use?
→ μx – μy is the change in mean and μd is the mean change → the two are the same
→ we make a two-sided confidence interval for μd
→ parameter of interest is the difference in mean blood pressure before and after medication use, μd
Q2: does mean blood pressure in the population go down after medication use? = μx – μy > 0? or μd > 0? → we
need to do a one-sample t-test
NB1: for paired data, the observations (x and y) within the pair are not independent; they belong to the same
unit and will be correlated. This ‘problem’ is solved by using the d-values (values of the differences)
NB2: If the sample were random, the patients would be independent units. In this case it was not random,
which is why it is important that the sample is regarded as random
Paired data design = 1 sample situation for d
• Patients were not randomly selected. We should check gender, age, weight... to see if the sample
may well represent the population.
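Treating paired data as a one-sample problem for d can be sketched as follows. The function name `paired_mean_difference_ci` and the four data pairs are hypothetical and only illustrate the mechanics; they are not the physician's measurements. The t-table value is supplied by the caller, here t(0.975, df = 3) = 3.182.

```python
import statistics


def paired_mean_difference_ci(before, after, t_crit):
    """Confidence interval for mu_d using the differences d = x - y.

    t_crit must be the t-table value for df = n - 1 (supplied by the caller).
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    d = [x - y for x, y in zip(before, after)]   # one d-value per patient
    n = len(d)
    mean_d = statistics.mean(d)                  # estimate of mu_d
    std_error = statistics.stdev(d) / n ** 0.5   # s_d / sqrt(n)
    margin = t_crit * std_error
    return mean_d - margin, mean_d + margin


# Hypothetical blood pressure values for 4 patients (illustration only,
# NOT the physician's data); t(0.975, df = 3) = 3.182 from a t-table.
before = [150.0, 142.0, 160.0, 155.0]
after = [144.0, 140.0, 151.0, 152.0]
lower, upper = paired_mean_difference_ci(before, after, t_crit=3.182)
print(f"0.95 CI for mu_d: ({lower:.2f}, {upper:.2f})")
```

For the 16 patients in the example above, the same function would be called with the table value for df = 15 (about 2.131 for a two-sided 0.95 interval).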