Advanced statistics - Notes from all the lectures

This document contains notes from all the lectures (1 to 12), some interesting notes and tips from the computer practicals, as well as notes from the pen and paper practicals (PPP). Information from some knowledge clips is already included in the notes. R output is included, so it is easier to kn...


  • October 28, 2020 · 106 pages · 2020/2021 · Class notes

Lecture 1a – advanced statistics
Main aim: Inference (= draw conclusions about a population or about a general phenomenon based on a
limited number of observations, which are the sample data)

3 different situations for t-procedures (confidence interval and t-tests):
- one sample, one mean (e.g. the mean body weight of all 6-year-old boys in the Netherlands)
- paired observations, mean difference (e.g. data of twins or before and after study)
- two independent samples, difference in mean (e.g. two populations: difference in exam scores in
males and females, which is a typical observational study/ research)

1. Inference (1 sample)
Take a random sample (sample data that are representative of the whole population). The underlying signal is
the same for each sample, but random noise makes the samples a bit different from one another.
→ Conclusions of inference are partly based on ‘noise’, introducing a level of uncertainty in the conclusions.
That is why we do tests with a ‘significance level α’ and use 0.95 confidence intervals (to account for the
uncertainty that random sampling introduces)

2. Confidence intervals
1) Explain what a confidence interval for a parameter means
2) Specify the general pattern of a confidence interval (the 4 elements of t-procedures)
a. Parameter of interest = what you want to know, what you want to draw a conclusion about
= something that describes the population
b. Estimator (= method of estimation) – how to estimate the parameter from the data (it’s a
method, a formula)
c. Standard error of the estimator (= how certain we can be about the estimate)
d. Degrees of freedom (= in estimating the spread) for the t-distribution
3) Apply this pattern to a specific problem (calculate the limits of the interval) → know “which
situation” to apply

Situation 1 – 1 sample situation
E.g. What is the mean body height of Wageningen students?
→ answered by constructing a confidence interval

Step 1: take a random sample of 25 male students
→ draw conclusions about a large population based on the 25 observations

Sampling terminology
• We are interested in the mean of one trait (body height) in one population (e.g. all male WUR
students)
• The students are the sampling units
• The response is body height, measured per student (so the student is also the observed or
measurement unit)
• The scientist draws conclusions about the population mean (of body height) based on one random
sample = ‘one-sample situation’ = one population, one mean
• The population is a physical population
• The type of research is observational

Parameter of interest: mean body height of all male WUR students = mu or μy with y being the height

Step 2: to determine the confidence interval, we need the summary statistics of the data set
Sample size: n=25
Sample mean: ȳ = 184
Sample standard deviation: s=9 (= how variable the values are)
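
As a rough worked example, the summary statistics above can be plugged into the usual one-sample pattern, estimate ± t × standard error, with standard error s/√n and n − 1 degrees of freedom. The critical value 2.064 is an assumption taken from a t-table (0.975 quantile, 24 df); it is not given in the notes, and the variable names are only illustrative.

```python
import math

# Summary statistics from the notes
n = 25        # sample size
y_bar = 184.0 # sample mean
s = 9.0       # sample standard deviation

# Assumption: 0.975 quantile of the t-distribution with n - 1 = 24 df,
# read from a t-table (not stated in the notes)
T_CRIT_DF24 = 2.064

se = s / math.sqrt(n)        # standard error of the mean
margin = T_CRIT_DF24 * se    # error margin
lower = y_bar - margin
upper = y_bar + margin

print(f"SE = {se:.2f}, error margin = {margin:.2f}")
print(f"0.95 CI for mu: ({lower:.2f}, {upper:.2f})")
# prints: SE = 1.80, error margin = 3.72
# prints: 0.95 CI for mu: (180.28, 187.72)
```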




• A confidence interval is a range of values for a parameter that we have “confidence” in
• The confidence level (1- α) is often 0.95 (α is 0.05 = 5%)
• The width of a confidence interval reflects the precision of the estimate: precise estimate = narrow
interval
• Bounds or limits of the interval are random: they depend on the units that are drawn in the sample.

• The 0.95 (1- α): the interval is constructed such that the probability that the interval will contain the
true parameter value is 0.95. Imagine many repeats of the experiment. In each repeat we have new
data and a new interval. Of all these intervals, 95% will then contain the true parameter value. In
practice we only have one sample, so the 0.95 describes the method and not the one interval we
happen to calculate

• A CI is typically of the form: best guess (estimate) ± error margin
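
The “many repeats of the experiment” idea can be tried out in a small simulation. This is only a sketch: the “true” population values (mean 184, sd 9, normally distributed) are hypothetical, and the critical value 2.064 (t-table, 0.975 quantile, 24 df) is an assumption. In real research the true mean is unknown and we have just one sample.

```python
import math
import random
import statistics

# Hypothetical "true" population values, chosen only for this simulation
MU, SIGMA = 184.0, 9.0
N = 25                 # sample size per repeat
T_CRIT_DF24 = 2.064    # assumption: t-table value, 0.975 quantile, 24 df
REPEATS = 2000         # number of imagined repeats of the experiment

def interval_covers_mu(rng):
    """Draw one random sample, build its 0.95 CI, report whether it contains MU."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    y_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    margin = T_CRIT_DF24 * se
    return y_bar - margin <= MU <= y_bar + margin

rng = random.Random(1)  # fixed seed so the run is reproducible
coverage = sum(interval_covers_mu(rng) for _ in range(REPEATS)) / REPEATS
print(f"Share of intervals containing mu: {coverage:.3f}")
```

The printed share should be close to 0.95. Each repeat gives a different interval, which is why the limits are called random.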




E.g. Is there a difference in mean body height of male students compared to 1980 (when it was 180cm)?
→ answered by doing a t-test
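
A minimal sketch of such a t-test, assuming the summary statistics of Situation 1 (n = 25, ȳ = 184, s = 9) are used and the test is two-sided with α = 0.05. The critical value 2.064 (t-table, 24 df) is an assumption and not taken from the notes.

```python
import math

# Summary statistics from Situation 1 (assumed to apply to this question)
n, y_bar, s = 25, 184.0, 9.0
mu_0 = 180.0           # mean body height in 1980, the value under H0

# Assumption: two-sided critical value for alpha = 0.05 with 24 df (t-table)
T_CRIT_DF24 = 2.064

se = s / math.sqrt(n)            # standard error of the mean
t_stat = (y_bar - mu_0) / se     # test statistic
reject_h0 = abs(t_stat) > T_CRIT_DF24

print(f"t = {t_stat:.2f}, df = {n - 1}, reject H0: {reject_h0}")
# prints: t = 2.22, df = 24, reject H0: True
```

Under these assumptions the test statistic is larger than the critical value, so H0 (mean height is still 180) would be rejected at the 0.05 level.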

Situation 2 – paired data
Blood pressure change: a physician records the blood pressure before (x) and after 2 weeks (y) of medication
use for 16 patients: d = x-y (regarded as a random sample)
Q1: What is (in general, or ‘in the population’) the change in mean blood pressure after medication use (μx – μy),
or what is the mean change in blood pressure (μd) after medication use?
→ μx – μy is the change in mean and μd is the mean change → the two are the same
→ we make a two-sided confidence interval for μd
→ parameter of interest is the difference in mean blood pressure before and after medication use μd

Q2: Does mean blood pressure in the population go down after medication use? = μx – μy > 0? or μd > 0? → we
need to do a one-sample t-test

NB1: for paired data, the observations (x and y) within the pair are not independent; they belong to the same
unit and will be correlated. This ‘problem’ is solved by using the d-values (values of the differences)
NB2: If the sample were random, the patients would be independent units. In this case it was not, which is
why it is important that the sample is regarded as random

Paired data design = 1 sample situation for d
• Patients were not randomly selected. We should check gender, age, weight... to see if the sample
may well represent the population.
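
The paired design can be sketched in a few lines: compute d = x − y per patient and treat the d-values as one sample. The blood pressure values below are made up for illustration (5 hypothetical patients, not the 16 from the study), and the critical value 2.776 (t-table, 0.975 quantile, 4 df) is an assumption.

```python
import math
import statistics

# Hypothetical blood pressures for 5 patients (illustration only, not real data)
before = [150, 142, 160, 155, 148]   # x
after = [144, 140, 151, 150, 147]    # y

# Paired data: one difference per patient, d = x - y
d = [x - y for x, y in zip(before, after)]

mean_d = statistics.mean(d)
diff_of_means = statistics.mean(before) - statistics.mean(after)
print(f"mean of d = {mean_d}, difference of means = {diff_of_means:.1f}")
# The mean change equals the change in means (both 4.6 here)

# One-sample t-procedure on the d-values
n = len(d)
se = statistics.stdev(d) / math.sqrt(n)
T_CRIT_DF4 = 2.776   # assumption: t-table value, 0.975 quantile, n - 1 = 4 df
lower = mean_d - T_CRIT_DF4 * se
upper = mean_d + T_CRIT_DF4 * se
print(f"0.95 CI for mu_d: ({lower:.2f}, {upper:.2f})")
```

Only the d-values enter the standard error, so the correlation between x and y within a patient is no longer a problem.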


