Important facts/overview: every test of BRM
Written for: Erasmus Universiteit Rotterdam (EUR), International Business Administration, Statistics (BT1111)
Seller: medestudentlisa
Chapter 1 What is Statistics?
1.1 Key Statistical Concepts
Statistics = A way to get information from data. A statistics practitioner is a
person who uses statistical techniques properly. The term statistician refers to
an individual who works with the mathematics of statistics. His or her work
involves research that develops techniques and concepts that may in the future
help statistics practitioners.
Descriptive statistics deals with methods of organising,
summarising, and presenting data in a convenient and
informative way.
Inferential statistics is a body of methods used to draw
conclusions or inferences about characteristics of populations
based on sample data. Uncertainty is involved.
Exit polls = A random sample of voters exiting the polling booth is asked for
whom they voted.
Statistical inference problems involve three key concepts:
Population: the group of all items of interest to a statistics practitioner. It is
frequently very large and may, in fact, be infinitely large. It does not
necessarily refer to a group of people. A descriptive measure of a population
is called a parameter.
Sample: a set of data drawn from the studied population. A descriptive
measure of a sample is called a statistic. We use statistics to make inferences
about parameters.
Statistical inference: the process of making an estimate, prediction, or
decision about a population based on sample data. As estimates are not
always correct, we build into the statistical inference a measure of reliability,
two to be exact:
o The confidence level: the proportion of times that an estimating procedure
will be correct.
o The significance level: measures how frequently the conclusion will be
wrong.
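The difference between a parameter and a statistic can be illustrated with a small sketch. The population below (the integers 1 to 1000), the sample size of 50, and the seed are all made up for illustration and are not taken from the textbook.

```python
import random

def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

# Hypothetical population: every item of interest is known here,
# which is rarely the case in practice.
population = list(range(1, 1001))
parameter = mean(population)        # descriptive measure of the population

# Draw a random sample without replacement; the seed only makes
# the illustration repeatable.
rng = random.Random(42)
sample = rng.sample(population, 50)
statistic = mean(sample)            # descriptive measure of the sample

print("parameter (population mean):", parameter)
print("statistic (sample mean):   ", statistic)
```

The statistic will usually differ somewhat from the parameter, which is why an inference comes with a measure of reliability.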
Chapter 2 Graphical Descriptive Techniques I
2.1 Types of Data and Information
Variable = Some characteristic of a population or sample. As the name suggests,
its value varies from one item to the next.
The values of the variable are the possible observations of the variable. Data
are the observed values of a variable. Data is the plural of datum.
When people think of data, they think of sets of numbers. However, there are
three different types:
Interval data: real numbers, such as heights, weights, incomes, and
distances. We also refer to this type of data as quantitative or numerical.
Nominal data: the values are not numbers, but instead are words that
describe categories. We often record nominal data by arbitrarily assigning a
number to each category. We also refer to this type of data as qualitative or
categorical. Calculations based on the codes used to store this type of data
are meaningless.
Ordinal data: appear to be nominal, but the order of their values has meaning,
for example because a higher value indicates a higher rating. It is not the
magnitude but the order of the values that is important. The appropriate
measure of central location for ordinal data is the median.
Hierarchy of data: interval, ordinal, nominal. Higher-level data types may be
treated as lower-level ones, but not the other way around. A fourth type is
ratio data, which is interval data with an absolute zero.
2.2 Describing a Set of Nominal Data
A frequency distribution summarises the data in a table, which presents the
categories and their counts. A relative frequency distribution lists the categories
and the proportion with which each occurs. To present this (qualitative) data, we
can use either a bar chart or a pie chart:
Bar charts: emphasize the frequency of occurrence of the different categories.
Used when the order in which qualitative data are presented is meaningful.
Pie charts: emphasize the proportion of occurrences of each category
(relative frequencies). Especially popular for showing the proportions in
which the categories of nominal data occur.
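A frequency distribution and a relative frequency distribution can be built in a few lines. This is a minimal sketch; the function name `frequency_table` and the survey answers are made up for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (count, proportion)} for a list of nominal values."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, n / total) for category, n in counts.items()}

# Hypothetical answers to "How do you travel to campus?"
answers = ["bus", "car", "bike", "car", "bike", "bike", "car", "bike"]
table = frequency_table(answers)

# Print one row per category: name, frequency, relative frequency.
for category, (count, proportion) in sorted(table.items()):
    print(f"{category:<5} {count:>2} {proportion:.3f}")
```

The counts are what a bar chart would plot, and the proportions are what a pie chart would plot.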
Chapter 3 Graphical Descriptive Techniques II
3.1 Graphical Techniques to Describe a Set of Interval Data
A histogram is a graph that is created by drawing rectangles whose bases are
the intervals and whose heights are the frequencies. Classes are a series of
non-overlapping intervals that together cover the complete range of
observations, so that each observation falls into exactly one class. Sturges's
formula gives the number of class intervals as 1 + 3.3 log10(n), where n is the
number of observations. We determine the approximate width of the classes by
subtracting the smallest observation from the largest and dividing the
difference by the number of classes.
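As a rough sketch of the two calculations above, assuming the logarithm in Sturges's formula is base 10: the function names and the ten observations are made up, and rounding the result to a whole number of classes is a choice made here, not something the summary prescribes.

```python
import math

def sturges_classes(n):
    """Number of class intervals by Sturges's formula: 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)

def class_width(observations, num_classes):
    """Approximate class width: (largest - smallest) / number of classes."""
    return (max(observations) - min(observations)) / num_classes

# Hypothetical interval data.
data = [12, 15, 21, 24, 30, 33, 38, 41, 47, 52]

k = round(sturges_classes(len(data)))   # 1 + 3.3 * log10(10) = 4.3, rounded to 4
width = class_width(data, k)            # (52 - 12) / 4

print("number of classes:", k)
print("approximate class width:", width)
```

In practice the width is often adjusted to a convenient number so that the class limits are easy to read.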
A histogram is said to be symmetric if, when we
draw a vertical line down the centre of the
histogram, the two sides are identical in shape and
size. A skewed histogram is one with a long tail
extending to either the right (positively skewed) or
the left (negatively skewed). A modal class is the
class with the largest number of observations. A
unimodal histogram is one with a single peak. A
bimodal histogram is one with two peaks, not
necessarily equal in height.
The stem-and-leaf display is a method that overcomes the loss of information
that occurs when observations are grouped into the classes of a histogram. The
first step in developing such a display is to split each observation into a
stem and a leaf. The stem-and-leaf display is similar to a histogram turned on
its side. The advantage of the display over the histogram is that we can see
the actual observations.
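A stem-and-leaf display can be sketched as follows. The split used here (last digit as leaf, the remaining digits as stem) is one common choice and assumes non-negative whole numbers; the scores are made up.

```python
from collections import defaultdict

def stem_and_leaf(observations):
    """Map each stem to its sorted leaves.

    Assumes non-negative integers: the leaf is the last digit and the
    stem is everything in front of it.
    """
    display = defaultdict(list)
    for value in sorted(observations):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)

# Hypothetical exam scores.
scores = [35, 23, 51, 38, 27, 42, 31, 46, 35]

for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Each printed row corresponds to one class of a histogram with width 10, but the individual observations remain readable.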
A cumulative relative frequency distribution gives, for each class, the
proportion of observations that fall in that class or in any class before it:
the relative frequencies are added up class by class. This distribution can be
displayed as an ogive, which is a graphical representation of the cumulative
relative frequencies.
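The cumulative relative frequencies can be computed from the class counts with a running total. This is a minimal sketch; the function name and the class counts are made up for illustration.

```python
from itertools import accumulate

def cumulative_relative_frequencies(class_counts):
    """Running total of the class counts, divided by the total count."""
    total = sum(class_counts)
    return [running / total for running in accumulate(class_counts)]

# Hypothetical frequencies for five consecutive classes.
counts = [2, 5, 8, 4, 1]
cumulative = cumulative_relative_frequencies(counts)

print(cumulative)   # the last value is always 1.0
```

Plotting these values against the upper limits of the classes and connecting the points gives the ogive.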
3.2 Describing Time-Series Data
Data can be classified according to when they are measured:
Cross-sectional data: observations measured at the same point in time;
Time-series data: observations measured at successive points in time.
Time-series data are often graphically depicted on a line chart, which is a plot of
the variable over time. It is created by plotting the value of the variable on the
vertical axis and the time periods on the horizontal axis.
3.4 Art and Science of Graphical Presentations
Graphical excellence = A term applied to techniques that are informative and
concise and that impart information clearly to their viewers. You can call a
graph excellent when:
The graph presents large data sets concisely and
coherently.
The ideas and concepts the statistics practitioner wants to
deliver are clearly understood by the viewer.
The graph encourages the viewer to compare two or more
variables.
The display induces the viewer to address the substance of
the data and not the form of the graph.
There is no distortion of what the data reveal.
There are multiple graphical deceptions we must be aware of:
Graphs without scales;
Graphs with different captions;
Graphs with stretched x- and y-axes;
Graphs with size distortions, particularly in pictograms.
Chapter 4 Numerical Descriptive Techniques
4.1 Measures of Central Location
The aim is to describe the most typical outcome of a distribution and to
provide a benchmark against which other observations can be judged. Examples of
measures of central location are the mean, median and mode. The suitability of
these measures depends on the type of data (interval, ordinal and nominal).
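The most popular and useful measure of central location is the arithmetic mean:
$\text{mean} = \dfrac{\text{sum of all observations}}{\text{number of observations}}$
The geometric mean is a measure of the average growth rate:
$R_g = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$
where $R_i$ is the growth rate in period $i$ and $n$ is the number of periods.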
The median is the value that falls in the middle when the measurements are
arranged in order of magnitude. For an odd number of observations, locate the
value in the middle. For an even number of observations, there are two middle
values, thus take the average of these two numbers. One advantage that the
median holds is that it is not as sensitive to extreme values as is the mean.
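The three measures can be compared in a short sketch. The function names and both data sets are made up for illustration; the two-period example (a 100% gain followed by a 50% loss) is chosen because it leaves the investment exactly where it started.

```python
import math
import statistics

def arithmetic_mean(observations):
    """Sum of all observations divided by the number of observations."""
    return sum(observations) / len(observations)

def geometric_mean_rate(rates):
    """Average growth rate: n-th root of the product of (1 + R_i), minus 1."""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1

# Hypothetical returns: +100% in period 1, -50% in period 2.
returns = [1.0, -0.5]
print("arithmetic mean of returns:", arithmetic_mean(returns))     # 0.25
print("geometric mean of returns: ", geometric_mean_rate(returns)) # 0.0

# Hypothetical observations with one extreme value.
values = [1, 2, 3, 100]
print("mean:  ", arithmetic_mean(values))      # 26.5
print("median:", statistics.median(values))    # 2.5, average of 2 and 3
```

The arithmetic mean suggests a 25% average return even though nothing was gained over the two periods, which the geometric mean reflects. The second data set shows how a single extreme value pulls the mean far more than the median.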