Statistics Pre-MSc
Types of data
There are four main types of data: nominal, ordinal, interval, and ratio. Each type has
specific characteristics that determine the kind of analysis that can be performed on it.
Nominal Data:
Nominal data consists of categories or labels without any inherent order. The categories are
distinct and mutually exclusive.
Examples:
• Colors (e.g., red, blue, green)
• Types of animals (e.g., cat, dog, bird)
• Marital status (e.g., single, married, divorced)
Ordinal Data:
Ordinal data involves categories with a meaningful order or ranking. The intervals between
categories are not necessarily equal.
Examples:
• Education levels (e.g., high school, college, graduate school)
• Star ratings for a movie (e.g., 1 star, 2 stars, 3 stars)
• Socioeconomic status (e.g., low, middle, high)
Interval Data:
Interval data has ordered categories with equal intervals between them.
However, it lacks a true zero point, meaning that ratios (e.g., twice as much) are not
meaningful.
Examples:
• Temperature in Celsius or Fahrenheit (e.g., 0°C doesn't mean "no temperature")
• IQ scores (e.g., 100 is not "twice as intelligent" as 50)
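The missing true zero can be illustrated with a small sketch. The conversion to Kelvin is an addition for illustration only (the notes mention Celsius and Fahrenheit), and the two temperatures are made-up values.

```python
# Celsius is an interval scale: differences are meaningful, ratios are not.
# Kelvin is used here only as an illustrative scale with a true zero.

def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0  # hypothetical readings

# The "ratio" on the Celsius scale depends on where zero was placed.
naive_ratio = warm_c / cold_c

# On a scale with a true zero, the ratio is very different.
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

# Differences, by contrast, are the same on both scales.
difference_c = warm_c - cold_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

print(f"Celsius ratio: {naive_ratio:.3f}")
print(f"Kelvin ratio:  {kelvin_ratio:.3f}")
print(f"Differences:   {difference_c:.1f} vs {difference_k:.1f}")
```

20°C divided by 10°C gives 2, yet the same two temperatures in Kelvin give a ratio close to 1, so "twice as warm" is not a valid reading of the Celsius values.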
Ratio Data:
Ratio data has ordered categories with equal intervals, and it also has a true zero point.
This allows for meaningful ratios and mathematical operations.
Examples:
• Height (e.g., 0 cm represents "no height")
• Weight (e.g., 0 kg means "no weight")
These distinctions are crucial in choosing appropriate statistical analyses. For instance,
nominal data may only allow for frequency counts and percentages, while ratio data can be
used in complex mathematical operations like calculating means and standard deviations.
Remember, the type of data you have will influence the choice of statistical tests and
analyses you can perform on that data.
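As a minimal sketch of what "frequency counts and percentages" look like for nominal data, the following uses Python's collections.Counter. The list of marital statuses is invented for illustration.

```python
from collections import Counter

# Hypothetical nominal observations (labels with no inherent order).
statuses = ["single", "married", "married", "divorced", "single", "married"]

# Frequency count per category.
counts = Counter(statuses)

# Percentage of observations falling in each category.
total = len(statuses)
percentages = {label: 100 * n / total for label, n in counts.items()}

for label, n in counts.most_common():
    print(f"{label}: {n} ({percentages[label]:.1f}%)")
```

Counting is about all that makes sense here: averaging the labels would require assigning them arbitrary numbers.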
Measures of central tendency (location)
Mode: The mode is the value that occurs most frequently in a dataset. Unlike the mean and
median, the mode can be used with both numerical and categorical data.
A dataset can have one mode (unimodal), more than one mode (multimodal), or no mode if
all values occur with the same frequency.
The mode is the only measure of central tendency that can be used for nominal variables.
Median: The median is the middle value when a dataset is sorted in ascending order. If there
is an even number of observations, the median is the average of the two middle values. The
median is less affected by extreme values compared to the mean, making it a more robust
measure in such cases.
The median can be used for ordinal and higher measurement levels.
Mean: The mean is the most commonly used measure of central tendency. It is calculated by
adding up all the values in a dataset and then dividing by the number of observations.
The mean should be used for interval and ratio variables only.
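The three measures above can be sketched with Python's standard statistics module. The datasets are made up for illustration.

```python
import statistics

# Hypothetical numerical dataset with an even number of observations.
scores = [2, 3, 3, 5, 7, 10]

mode_value = statistics.mode(scores)      # most frequent value
median_value = statistics.median(scores)  # average of the two middle values
mean_value = statistics.mean(scores)      # sum divided by number of observations

# The mode also works for categorical (nominal) data.
colors = ["red", "blue", "red", "green"]
mode_color = statistics.mode(colors)

# multimode returns every value that shares the highest frequency.
all_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mode_value, median_value, mean_value, mode_color, all_modes)
```

Here the sorted scores have 3 and 5 in the middle, so the median is 4, while the mean is 30 / 6 = 5 and the mode is 3.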
Importance and Relevance:
1. Summarization: Measures of central tendency provide a concise summary of a dataset,
making it easier to understand and compare different datasets.
2. Interpretation: They help in interpreting the meaning of the data by providing a "typical"
value around which the data tends to cluster.
3. Comparison: Central tendency measures allow for easy comparison between different
datasets or different parts of a dataset.
4. Decision-Making: They play a crucial role in decision-making processes, especially in fields
like economics, finance, and public policy.
When to Use:
You would use measures of central tendency when you want to:
- Provide a representative value that summarizes a dataset.
- Compare different datasets or parts of a dataset.
- Make informed decisions based on the central value of the data.
It's important to note that the choice of which measure to use depends on the nature of the
data and the specific questions you're trying to answer. For example, if your data contains
extreme values, the median might be a more appropriate measure of central tendency than
the mean.
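The effect of an extreme value on the mean and the median can be seen in a short sketch. The income figures are invented for illustration.

```python
import statistics

# Hypothetical incomes (in thousands), first without and then with an outlier.
incomes = [30, 32, 35, 38, 40]
incomes_with_outlier = incomes + [500]

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)

mean_after = statistics.mean(incomes_with_outlier)
median_after = statistics.median(incomes_with_outlier)

print(f"mean:   {mean_before} -> {mean_after}")
print(f"median: {median_before} -> {median_after}")
```

A single extreme value moves the mean from 35 to 112.5, while the median only shifts from 35 to 36.5.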
Measures of Variability
Measures of variability, also known as measures of dispersion, quantify the spread or
dispersion of data points in a dataset. They provide information about the extent to which
the data deviates from the central tendency (mean, median, or mode). There are several
common measures of variability:
Range: The range is the simplest measure of variability. It is the difference between the
maximum and minimum values in a dataset. While easy to calculate, it is sensitive to outliers
and may not provide a comprehensive understanding of dispersion.
Interquartile Range (IQR): The IQR is the range between the first quartile (25th percentile)
and the third quartile (75th percentile) of the data. It gives a measure of the spread of the
middle 50% of the data, making it less sensitive to extreme values than the range.
Variance: The variance is the average of the squared differences between each data point
and the mean. It quantifies the overall variability in the dataset. However, because it
involves squaring the differences, the variance is in squared units, which can be less
interpretable.
Standard Deviation: The standard deviation is the square root of the variance. It is often
preferred over the variance because it is in the same units as the original data, making it
more interpretable. It can be read as the typical distance of a data point from the mean.
Coefficient of Variation (CV): The CV is the ratio of the standard deviation to the mean,
expressed as a percentage. It is used to compare the relative variability between different
datasets, especially when the means are different.
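The measures of variability above can be sketched with Python's standard statistics module. The dataset is made up. The code uses the population formulas (dividing by the number of observations), which match the "average of the squared differences" wording. The sample versions, statistics.variance and statistics.stdev, divide by n - 1 instead. Quartiles can be computed under several conventions, so the IQR may differ slightly between tools.

```python
import statistics

# Hypothetical dataset.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: maximum minus minimum.
value_range = max(data) - min(data)

# IQR: third quartile minus first quartile
# (statistics.quantiles uses its default "exclusive" method here).
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Population variance: average squared difference from the mean.
pop_variance = statistics.pvariance(data)

# Population standard deviation: square root of the variance.
pop_std = statistics.pstdev(data)

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv_percent = 100 * pop_std / statistics.mean(data)

print(f"range={value_range}, IQR={iqr}, variance={pop_variance}, "
      f"std={pop_std}, CV={cv_percent:.1f}%")
```

For this dataset the mean is 5, the squared differences sum to 32, so the population variance is 32 / 8 = 4, the standard deviation is 2, and the CV is 40%.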