Review from previous notes:
Measures of Centre (also called Location):
• Mean (denoted by 𝑥̅ ): sum of data values divided by number of data values
• Median: the middle value, after arranging the data in ascending order
• Mode: the most frequently occurring value, i.e. the highest point in the histogram
(by far the least useful of the three summaries and not often used as a summary measure)
The mean can be thought of as the “centre of gravity” of the distribution.
The median can be thought of as the middle of the data; 50% are on each side of the
median. Note: If there is an even number of data values, the median is the average of
the two middle values.
For symmetric distributions, the mean equals the median.
For asymmetric or skewed distributions, the mean and the median are not equal.
If the distribution has a long right-hand tail (i.e. skewed right) the mean is greater
than the median.
If the distribution has a long left-hand tail (i.e. skewed left), the mean is less than
the median.
The median is a more “robust” measure than the mean; that is, it is not highly affected by
the presence of outliers.
The Excel function for the mean is AVERAGE, and for the median is MEDIAN.
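As a quick check of the robustness point above, here is a minimal Python sketch using the standard `statistics` module in place of Excel. The data values are made up for illustration and are not from the notes.

```python
import statistics

# Hypothetical data set (not from the notes): five values, then the same
# values with one extreme observation added.
data = [30, 32, 35, 38, 40]
with_outlier = data + [200]

mean_before = statistics.mean(data)              # plays the role of Excel's AVERAGE
median_before = statistics.median(data)          # plays the role of Excel's MEDIAN
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)   # even count: average of the two middle values

print("without outlier:", mean_before, median_before)
print("with outlier:   ", mean_after, median_after)
```

In this example the single outlier moves the mean from 35 to 62.5, while the median only moves from 35 to 36.5. The outlier also creates a long right-hand tail, and the mean ends up greater than the median, as described above.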
New material
Measures of Spread (also called Scale or Dispersion):
Range = Maximum value – Minimum value
(There is no Excel function for Range, but you can use the functions MAX and MIN.)
Variance ($s^2$) and Standard Deviation ($s$ or SD)
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n-1}}$
A computational formula is: $s = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n-1}}$
The standard deviation (sometimes referred to as the SD) can best be interpreted as
“the typical distance from a data value to the mean”.
(In Excel, use STDEV.S; the final “S” refers to this being a “sample” of data. Or you can
use an older version called STDEV. DO NOT USE: STDEV.P; it divides by n instead of n – 1 and
is only for a complete “population,” which rarely occurs in real-life applied situations.
We’ll discuss samples and populations later.)
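The following Python sketch computes the sample standard deviation three ways: from the definition (squared deviations from the mean divided by n – 1), from the computational formula, and with the library function `statistics.stdev`, which also uses the n – 1 divisor. The data set is hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample, not from the notes
n = len(data)
mean = sum(data) / n               # 40 / 8 = 5.0

# Definitional formula: squared deviations from the mean, divided by n - 1.
sum_sq_dev = sum((x - mean) ** 2 for x in data)             # 32.0
s_def = math.sqrt(sum_sq_dev / (n - 1))

# Computational formula: uses only the sum of x and the sum of x squared.
sum_x = sum(data)                                           # 40
sum_x_sq = sum(x * x for x in data)                         # 232
s_comp = math.sqrt((sum_x_sq - sum_x ** 2 / n) / (n - 1))   # (232 - 200) / 7

# Library version (sample SD, n - 1 divisor, like Excel's STDEV.S).
s_lib = statistics.stdev(data)

print(round(s_def, 4), round(s_comp, 4), round(s_lib, 4))
```

All three values agree (about 2.14 here). `statistics.pstdev` is the n-divisor counterpart, analogous to STDEV.P.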
Neither the mean nor the standard deviation is resistant to outliers. They are also a poor
choice of summary if the distribution is highly skewed.
Percentiles: The pth percentile is a value below which p% of the data values fall. Some
percentiles have special names:
75th percentile = 3rd or upper quartile = Q3
25th percentile = 1st or lower quartile = Q1
50th percentile = Median
The Interquartile Range (IQR) = Q3 – Q1.
Note: If there is an odd number of observations, the median is indeed the middle one;
but if there is an even number of observations, the convention is to take the average of
the two middle ones as the median. This can be extended to computing the quartiles: if a
quartile lies between two observations, take the average.
The text explains that the first quartile is the “median” of the values to the left of the full
data set median and the third quartile is the “median” of the values to the right of the full
data set median. Beware, however, that different software packages (including Excel)
have different conventions for computing the quartiles. But since you are only using
them as summaries, all the slight variations are acceptable.
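The "median of each half" rule described above can be sketched in Python as follows. The function name `quartiles_by_halves` is hypothetical, and one detail is an assumption: when the number of observations is odd, this sketch leaves the overall median out of both halves (some textbooks include it instead).

```python
import statistics

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median is
    excluded from both halves.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]       # values to the left of the overall median
    upper = ordered[-half:]      # values to the right of the overall median
    return statistics.median(lower), statistics.median(upper)

# Hypothetical data, not from the notes.
q1, q3 = quartiles_by_halves([1, 3, 5, 7, 9, 11, 13, 15])
iqr = q3 - q1
print(q1, q3, iqr)
```

For these eight values the lower half is 1, 3, 5, 7 and the upper half is 9, 11, 13, 15, so Q1 = 4, Q3 = 12 and IQR = 8.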
(In Excel, use QUARTILE.INC, or the older version which is simply QUARTILE. For
reasonably large data sets, the difference between QUARTILE.INC and QUARTILE.EXC
will be small. Excel also has a similar choice between PERCENTILE.INC [or just
PERCENTILE] and PERCENTILE.EXC.)
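To see how much the conventions can differ on a small data set, here is a Python sketch using `statistics.quantiles` (Python 3.8 or later). Its `method="inclusive"` and `method="exclusive"` options are, to the best of my understanding, the counterparts of Excel's .INC and .EXC functions; treat that correspondence as an assumption and check against Excel if exact agreement matters. The data set is hypothetical.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13, 15]   # hypothetical data, not from the notes

# n=4 asks for the three cut points that split the data into quarters.
q1_inc, q2_inc, q3_inc = statistics.quantiles(data, n=4, method="inclusive")
q1_exc, q2_exc, q3_exc = statistics.quantiles(data, n=4, method="exclusive")

iqr_inc = q3_inc - q1_inc
iqr_exc = q3_exc - q1_exc

print("inclusive:", q1_inc, q2_inc, q3_inc, "IQR =", iqr_inc)
print("exclusive:", q1_exc, q2_exc, q3_exc, "IQR =", iqr_exc)
```

With only eight values the two conventions give visibly different quartiles, while both agree on the median. The gap shrinks as the data set gets larger.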
IMPORTANT:
For symmetric distributions ➔ Mean and SD
For skewed distributions ➔ Median and IQR