The p-value is the probability of obtaining the observed data, or more extreme data, under the assumption that the null
hypothesis is true. If your data are very surprising under the null hypothesis (a small p-value), you reject the null hypothesis.
So the p-value tells you something about the data, given the null hypothesis.
You should never base the significance of your study on the p-value alone.
PART 2
Population vs sample
- Standard deviation vs standard error of the mean
- Confidence intervals
Population
- Collection of all values of a variable in our study: IQ, body weights, plasma [Ca²⁺], ...
- The population’s values are characterized by parameters:
○ Mean, μ
○ Standard deviation, σ
- We cannot sample the entire population
- So μ and σ (and the population size N) are unknown
- We have to take a sample to obtain information about the population.
↓
Sample
- A representative subset of values from the population: IQ, body weight etc.
- Sample size (n) is known
- The sample’s values are characterized by statistics:
○ Mean, x̅
○ Standard deviation, s
- x̅ and s are used to estimate μ and σ.
For a population you do not know the mean, the standard deviation or the size of the population. So you take a sample,
for which the size is known and the mean and standard deviation can be calculated.
Samples vary: the population is very large, so the chance that you draw the same observations a second time is very small.
To see how reliable a sample mean is, you take many samples (a sample is a group of n organisms) and you calculate
the mean of each sample.
Then you can make a pdf (probability density function) of the many individual observations and one of the many sample
means, and the two graphs will differ.
In red (the sample means) the graph is still centered around 5, but the width of the curve is much smaller. This
means that the variation of sample means around the mean of 5 is much smaller than the variation of individual
measurements/observations. When the sample size becomes larger, the width of the curve becomes smaller.
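The difference between the two curves can be reproduced with a small simulation. This is only a sketch: the population mean of 5 follows the figure described above, but the population standard deviation, the sample size and the number of repetitions are assumed values chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

MU = 5.0          # population mean, as in the figure described above
SIGMA = 2.0       # assumed population standard deviation (illustrative)
N = 25            # assumed sample size
N_SAMPLES = 2000  # assumed number of repeated samples

# Many individual observations drawn from the population
individuals = [random.gauss(MU, SIGMA) for _ in range(N_SAMPLES)]

# The mean of each of many samples of size N
sample_means = [
    statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(N_SAMPLES)
]

sd_individuals = statistics.stdev(individuals)
sd_means = statistics.stdev(sample_means)

print(f"spread of individual values: {sd_individuals:.2f}")
print(f"spread of sample means:      {sd_means:.2f}")
print(f"sigma / sqrt(N):             {SIGMA / N ** 0.5:.2f}")
```

Both lists are centered around 5, but the sample means are spread much more narrowly, close to σ / √n.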
The standard error of the mean from one sample is the standard deviation of the mean. A standard error is the
standard deviation of a statistic, and the mean is a statistic. So the standard deviation of a mean is named the
standard error of the mean (SEM). It is estimated from one sample as SEM = s / √n.
The standard deviation tells you something about the variation in the sample.
The standard error of the mean reflects how reliable your sample mean is.
You cannot be sure about the exact mean of a population, but you can give a 95% confidence interval around the
sample mean. You take the z value which encloses 95% of the observations (z = 1.96) and you multiply it by the SEM.
The range x̅ ± z * SEM is the 95% confidence interval.
A confidence interval describes the uncertainty of a mean. If you take many samples from the population and construct
a 95% CI from each one, 95% of these confidence intervals contain the true population mean.
The larger the sample size, the smaller the confidence interval becomes.
The smaller the confidence interval, the more reliable the estimation of the population mean is.
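As a minimal sketch of the recipe above, the following computes a 95% confidence interval from one sample. The sample values are made up for illustration, and the z-based interval follows these notes; with a small sample and an estimated standard deviation, a t value would normally replace z.

```python
import math
import statistics

# Hypothetical sample (made-up values, for illustration only)
sample = [4.1, 5.3, 6.0, 4.8, 5.5, 4.9, 5.2, 4.4, 5.8, 5.0]

n = len(sample)
mean = statistics.fmean(sample)
s = statistics.stdev(sample)   # sample standard deviation (divides by n - 1)
sem = s / math.sqrt(n)         # standard error of the mean

# z value that encloses the central 95% of a standard normal distribution
z = statistics.NormalDist().inv_cdf(0.975)

lower = mean - z * sem
upper = mean + z * sem

print(f"mean = {mean:.2f}, SEM = {sem:.3f}, z = {z:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```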
Standard deviation, s
- Measure of variability
- Roughly the average distance of a data point from the mean value
- It is a fixed property of a sample or a population (it doesn’t matter how large or small the sample is, the
standard deviation will be about the same).
- The sample’s standard deviation is used to estimate the unknown standard deviation sigma of a
population.
- When you estimate σ from a sample you divide the sum of squared deviations by n − 1. When you just want to describe the sample itself, you divide by n.
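The two ways of dividing can be compared side by side. This is a sketch with made-up data; `statistics.pstdev` (divide by n) and `statistics.stdev` (divide by n − 1) are the standard-library functions for the two cases.

```python
import math
import statistics

# Made-up data, for illustration only
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(sample)
mean = statistics.fmean(sample)
sum_sq = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations

sd_describe = math.sqrt(sum_sq / n)        # describes this sample only
sd_estimate = math.sqrt(sum_sq / (n - 1))  # estimates the population sigma

print(f"divide by n:     {sd_describe:.3f}")
print(f"divide by n - 1: {sd_estimate:.3f}")
```

Dividing by n − 1 always gives a slightly larger value, and the difference shrinks as n grows.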
Standard error
- Measure of reliability of a statistic
- A standard error is a standard deviation of a statistic.
- The standard deviation of the mean is the standard error of the mean (SEM).
- We will encounter standard errors of other statistics as well.
- Standard errors decrease with increasing sample size.
- Standard errors are used in the construction of confidence intervals.
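How quickly the standard error shrinks can be illustrated with SEM = s / √n. The standard deviation of 2.0 and the sample sizes below are assumed values; the point of the sketch is that the standard deviation stays the same while the SEM halves each time the sample size is quadrupled.

```python
import math

s = 2.0  # assumed sample standard deviation (illustrative)

# SEM for a series of assumed sample sizes
sems = {n: s / math.sqrt(n) for n in (4, 16, 64, 256)}

for n, sem in sems.items():
    print(f"n = {n:3d}  SEM = {sem:.3f}")
```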
The bottom line of any statistical test:
The value of the test statistic is fully and completely determined by the results of your experiment.
The test statistic can be seen as the ultimate summary of your experimental results.
You can calculate the z value for a sample mean instead of an individual observation: z = (x̅ − μ) / (σ / √n).
You can look up this z value in a standard normal distribution table to find the p-value.
So when the population mean and standard deviation are known, do this.
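A sketch of this calculation, with `statistics.NormalDist` standing in for the standard normal table. All numbers are assumptions chosen for illustration (a population mean of 100 and standard deviation of 15, a sample of 36 with mean 105), and the p-value is computed two-sided.

```python
import math
from statistics import NormalDist

mu = 100.0           # assumed population mean under the null hypothesis
sigma = 15.0         # assumed known population standard deviation
n = 36               # assumed sample size
sample_mean = 105.0  # hypothetical observed sample mean

sem = sigma / math.sqrt(n)    # standard error of the mean
z = (sample_mean - mu) / sem  # z value of the sample mean

# Two-sided p-value: probability of a z at least this extreme under H0
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, p = {p_two_sided:.4f}")
```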
When the population standard deviation is not known, you estimate it from the sample standard deviation (this
does cause more uncertainty). When σ is not known you cannot use the z-distribution; you have to use the
t-distribution.
So when you don't know the population mean and standard deviation, you use the sample’s standard deviation and mean
to make an estimation, and you use the t-distribution.
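A sketch of the same test when σ is estimated by s. The sample and the hypothesised mean of 5 are made up for illustration. Python's standard library has no t-distribution, so the statistic is compared with the two-sided 5% critical value for 9 degrees of freedom (2.262) taken from a standard t table.

```python
import math
import statistics

mu0 = 5.0  # hypothesised population mean (assumption for this sketch)

# Hypothetical sample (made-up values, for illustration only)
sample = [4.1, 5.3, 6.0, 4.8, 5.5, 4.9, 5.2, 4.4, 5.8, 5.0]

n = len(sample)
mean = statistics.fmean(sample)
s = statistics.stdev(sample)  # estimates the unknown sigma
sem = s / math.sqrt(n)

t = (mean - mu0) / sem  # t statistic
df = n - 1              # degrees of freedom

T_CRIT = 2.262  # two-sided 5% critical value for df = 9, from a t table
reject = abs(t) > T_CRIT

print(f"t = {t:.3f}, df = {df}, reject H0: {reject}")
```

The critical value 2.262 is larger than the 1.96 of the z-distribution, which reflects the extra uncertainty from estimating σ.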