ST201 Statistical Models and Data Analysis (ST201)
Seller: henryrayner
ST201 Notes
1. Nominal/Categorical: formed by categories that cannot be ranked, e.g. eye colour, religion.
2. Dichotomous/Binary: nominal variables with only two categories, e.g. male and female,
alive and dead.
3. Ordinal: categories can be ranked but the distance between categories may not be equal
across the range; e.g. gears (1,2,3,4,5), exam results (A, B, C, D, E, U).
4. Continuous: variables can take any value in a range, e.g. speed, weight, time.
Mode: the value that occurs most frequently, if one exists
Median: the (½(n+1))th value of the ordered data; it is robust to outliers
Mean: x̄ = (1/n)∑x; it is sensitive to outliers
Range: the difference between the highest and lowest values
Standard deviation or S.D.: the square root of the variance; it measures how spread out the values are around the mean
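As a quick check of the definitions above, here is a minimal sketch using Python's standard `statistics` module. The data set is hypothetical and includes one extreme value (40) to show how the mean reacts to an outlier while the median barely moves. The notes do not say which divisor the S.D. uses, so the n - 1 (sample) version computed by `statistics.stdev` is an assumption here.

```python
import statistics

# Hypothetical sample; the 40 is a deliberate outlier.
data = [2, 3, 3, 4, 5, 6, 40]
n = len(data)

mode = statistics.mode(data)          # most frequent value -> 3
median = statistics.median(data)      # middle value of the ordered data -> 4
mean = statistics.mean(data)          # (1/n) * sum of the values -> 9
value_range = max(data) - min(data)   # highest minus lowest -> 38
sd = statistics.stdev(data)           # sample S.D. with n - 1 divisor (assumption)

# For odd n, the median is the (1/2)(n + 1)th ordered value (1-based position).
median_by_position = sorted(data)[(n + 1) // 2 - 1]

# Dropping the outlier moves the mean a lot and the median only slightly.
trimmed = [x for x in data if x != 40]
mean_without_outlier = statistics.mean(trimmed)      # 23/6, about 3.83
median_without_outlier = statistics.median(trimmed)  # 3.5
```

Removing the single outlier pulls the mean from 9 down to about 3.83, while the median only shifts from 4 to 3.5.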
Pearson’s correlation r: measures the strength of linear association; ranges from -1 to +1
Spearman’s ρ also ranges from -1 to +1 but is based on the ranking of the values.
Specifically, it is defined as the Pearson correlation coefficient between the ranked variables
and is a more general correlation which can be applied to non-linear but monotonic
relationships.
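The definition of Spearman's ρ as "Pearson on the ranks" can be sketched directly. The helper names below (`pearson`, `ranks`, `spearman`) are illustrative, not from the notes, and tied values are given the average of their ranks, which is a common convention assumed here. The example data are hypothetical: y = x³ is monotonic but not linear.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranked variables."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # monotonic but non-linear relationship

r = pearson(x, y)         # below 1, because the relationship is not linear
rho = spearman(x, y)      # 1, because the ordering is perfectly preserved
```

Pearson's r comes out at roughly 0.94 here, while Spearman's ρ equals 1 because the two variables are ranked identically.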
Skewness
It is important to extend univariate statistics to express the symmetry, or lack of symmetry, of
the data.
For left skewness, mean < median < mode
For right skewness, mode < median < mean
For symmetry, mode = median = mean
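A small illustration of the three orderings, using hypothetical data sets chosen so that the relationships hold exactly. The left-skewed set is simply the mirror image of the right-skewed one. Real data will not always follow these orderings so neatly, so treat this as a sketch of the rule rather than a test for skewness.

```python
import statistics

def centre(data):
    """Return (mean, median, mode) for a data set."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)

# Hypothetical data: long tail to the right.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image of the above (15 - x): long tail to the left.
left_skewed = [15 - x for x in right_skewed]
# Symmetric about 3.
symmetric = [1, 2, 3, 3, 3, 4, 5]

r_mean, r_median, r_mode = centre(right_skewed)  # 4.5, 3, 2
l_mean, l_median, l_mode = centre(left_skewed)   # 10.5, 12, 13
s_mean, s_median, s_mode = centre(symmetric)     # 3, 3, 3
```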
Statistical Inference
The analysis of sample data to draw conclusions (inferences) about the population from
which the sample was taken.
Important terminology:
− Parameter: a value, usually unknown, used to represent a population characteristic
– within a population, a parameter is a fixed value
– normally represented by a Greek letter
− Estimator: a rule for calculating an estimate based on sample data and used to approximate a
parameter from the population
– normally represented by a Roman letter
− Estimate: the value obtained by applying an estimator to the data from a particular
sample
– unlike a parameter, estimates are not fixed
– they vary across samples reflecting sampling variability
We can draw different samples and from each of them obtain an estimate to assess the
properties of a particular estimator.
In doing so it is vital that we use randomly picked samples (simple random samples), to
ensure that the samples are representative of the population.
The distribution of the different estimates obtained from each sample is known as the
sampling distribution.
The mean of this distribution serves as the point estimate, and the spread of the estimates
around it represents the uncertainty.
Ideally, we want unbiased and precise estimators.
The former requires the expectation of the estimator to equal the population parameter,
e.g. E(X̄) = μ.
The latter requires the standard deviation of the sampling distribution, i.e. the standard
error of the estimator (or SE), to be as small as possible.
Put another way, we require estimators to be close to the target population parameter and to
have small variance.
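The ideas of a sampling distribution, unbiasedness and standard error can be checked by simulation. The sketch below assumes a hypothetical normal population with μ = 50 and σ = 10, draws many simple random samples of size 25, and records the sample mean of each. The seed, population and sample sizes are arbitrary illustrative choices, not values from the notes.

```python
import math
import random
import statistics

rng = random.Random(42)     # fixed, arbitrary seed so the run is repeatable

MU, SIGMA = 50.0, 10.0      # hypothetical population parameters (fixed values)
N = 25                      # size of each sample
NUM_SAMPLES = 5000          # number of repeated samples

def sample_mean():
    """Draw one random sample of size N and return its mean (one estimate)."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    return statistics.mean(sample)

# The collection of estimates approximates the sampling distribution of the mean.
estimates = [sample_mean() for _ in range(NUM_SAMPLES)]

centre_of_estimates = statistics.mean(estimates)   # close to MU if unbiased
empirical_se = statistics.stdev(estimates)         # spread of the estimates
theoretical_se = SIGMA / math.sqrt(N)              # sigma / sqrt(n) = 2.0
```

With these settings the estimates should centre near 50 and their standard deviation should sit near 2, which is how unbiasedness and the standard error show up in practice. Increasing `N` shrinks the standard error, i.e. makes the estimator more precise.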
Central Limit Theorem
The key to statistical inference is the sampling distribution of an estimator.
According to the central limit theorem:
− for large samples (in practice, a sample size n greater than about 30)
− from a population with mean μ and standard deviation σ
− the sample mean will be approximately normally distributed
− with mean μ and standard deviation σ/√n
− regardless of how the population is distributed.
So, although we will usually just take one sample and obtain one single point estimate,
invoking the central limit theorem and the properties of a normal distribution, we can assess
the uncertainty surrounding such an estimate.
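To see how a single sample can carry its own measure of uncertainty, the sketch below draws from a deliberately skewed population (exponential with mean 1, a hypothetical choice) and builds an interval of the point estimate plus or minus 1.96 estimated standard errors. The 1.96 multiplier is the usual normal-distribution value for 95% and is an assumption here, since the notes do not name a confidence level. The repeated-sampling loop is only there to check how often such intervals capture the true mean.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed, arbitrary seed so the run is repeatable
n = 40                   # "large" sample by the n > 30 rule of thumb
TRUE_MEAN = 1.0          # mean of the exponential population used below

def draw_sample():
    # Skewed population: exponential with mean 1 and S.D. 1 (hypothetical).
    return [rng.expovariate(1.0) for _ in range(n)]

def interval_95(sample):
    """Point estimate plus/minus 1.96 estimated standard errors."""
    x_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))  # estimates sigma/sqrt(n)
    return x_bar - 1.96 * se, x_bar + 1.96 * se

# In practice we have one sample and therefore one interval.
low, high = interval_95(draw_sample())

# Simulation check: how often does such an interval contain the true mean?
trials = 2000
hits = 0
for _ in range(trials):
    lo, hi = interval_95(draw_sample())
    if lo <= TRUE_MEAN <= hi:
        hits += 1
coverage = hits / trials
```

Even though the population is far from normal, the coverage should land reasonably close to 95%, which is the central limit theorem at work. It tends to fall slightly short for skewed populations at moderate n, so the approximation improves as n grows.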