Introduction to Statistical Analysis
Chapter One Introduction to Statistics
Statistics: a branch of mathematics used to summarize, analyse, and interpret what we observe to make
sense or meaning of our observations.
Example: we make sense of how good a soccer player is by observing the number of goals the player scores.
There are two ways to evaluate the information that scientists gather:
1. Descriptive statistics: scientists organize and summarize information such that the information is
meaningful to those who read about the observations scientists make in a study.
2. Inferential statistics: scientists use information to answer a question or make an actionable
decision. Example: “Is diet related to obesity?” and “Should we implement this policy?”
Mark Twain: “There are lies, damned lies, and statistics.” Statistics can be deceiving, and so can
interpretations of them. Two general types of statistics:
1. Descriptive statistics: applying statistics to organize and summarize information, and to make sense
of a set of scores or observations. These are usually presented graphically, in tabular form, or as
summary statistics (e.g., an average).
2. Inferential statistics: applying statistics to interpret the meaning of information.
Data (plural): measurements or observations that are typically numeric. Datum (singular): a single
measurement or observation, usually referred to as a score or raw score.
FIGURE 1.1???
Descriptive statistics:
Data are generally presented in summary form, either graphically, in tabular form (tables), or as summary
statistics -> average (mean), middle (median), or most common (mode). These summarize the scores of all
individuals measured.
Advantage of using tables: they can clarify findings in a research study.
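The three summary statistics named above can be sketched in a few lines of Python. The scores are hypothetical values invented for illustration, and the functions come from Python's standard statistics module.

```python
import statistics

# Hypothetical scores, invented purely for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of the scores divided by how many there are
median = statistics.median(scores)  # middle of the sorted scores (average of the two middle ones here)
mode = statistics.mode(scores)      # most frequently occurring score

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

With these scores the three summaries all differ, which is one reason a table often reports more than one of them.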
Inferential statistics:
It is often not possible to identify all individuals in a population, so researchers require statistical
procedures (inferential statistics) to infer that observations made with a sample are also likely to be
observed in the larger population from which the sample was selected.
Example: we want to test whether a new learning tool can improve learning among U.S. students.
Population: the set of all individuals, items, or data of interest. This is the group about which scientists will
generalize. Example: all U.S. students
Population parameter: a characteristic (usually numeric) that describes a population. Example: learning
Population of interest: the populations researchers are interested in.
Sample: set of individuals, items, or data selected from a population of interest.
Example: a portion of these students.
Sample statistic: a characteristic (usually numeric) that describes a sample. Example: learning.
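The difference between a population parameter and a sample statistic can be illustrated with a small simulation. This is only a sketch: the population of learning scores is simulated (mean 70, spread 10, both arbitrary assumptions), and the mean is used as the characteristic of interest.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: simulated learning scores for 10,000 students.
population = [random.gauss(70, 10) for _ in range(10_000)]

# Sample: a randomly selected portion of the population.
sample = random.sample(population, 100)

population_parameter = statistics.mean(population)  # describes the whole population
sample_statistic = statistics.mean(sample)          # describes only the sample

print(f"population mean (parameter): {population_parameter:.2f}")
print(f"sample mean (statistic):     {sample_statistic:.2f}")
```

The two values are usually close but not identical. Inferential statistics deals with how far a sample statistic can be trusted as an estimate of the population parameter.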
Science: the study of phenomena, such as behaviour, through strict observation, evaluation, interpretation,
and theoretical explanation.
The research method/scientific method: a set of systematic techniques used to acquire, modify, and
integrate knowledge concerning observable and measurable phenomena.
Three research methods commonly used in behavioural science:
1. Experimental method:
Experiment: the use of methods and procedures to make observations in which the researcher fully
controls the conditions and experiences of participants by applying three required elements of control
(manipulation, randomization, and comparison/control) to isolate cause-and-effect relationships between
variables.
Random assignment is used to assign participants to groups (element 2, randomization). This is a random
procedure used to ensure that participants in a study have an equal chance of being assigned to a
particular group or condition.
The researcher must be able to manipulate the levels of the independent variable (element 1, manipulation).
The independent variable (IV) is the variable that is manipulated in an experiment. This variable remains
unchanged (or “independent”) between conditions being observed in an experiment. It is the “presumed
cause”. The specific conditions of an IV are referred to as the levels of the independent variable.
The dependent variable (DV) is the variable that is measured in each group of a study and is believed to
change in the presence of the independent variable. It is the “presumed effect”. These can be measured in
many ways and therefore require an operational definition: a description of some observable event in
terms of the specific process or manner by which it was observed or measured.
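Random assignment as described above can be sketched as shuffling the participants and dealing them out to the groups in turn. The function name, participant labels, and the two levels of the IV are hypothetical and chosen only for illustration.

```python
import random

def randomly_assign(participants, groups, seed=None):
    """Shuffle the participants, then deal them out to the groups in turn."""
    rng = random.Random(seed)       # a seed is used here only to make the example repeatable
    shuffled = list(participants)   # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for i, person in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(person)
    return assignment

# Hypothetical participants and two levels of the independent variable.
participants = [f"P{i}" for i in range(1, 9)]
groups = ["new learning tool", "no tool (control)"]

assignment = randomly_assign(participants, groups, seed=42)
for group, members in assignment.items():
    print(group, "->", members)
```

Because the order is shuffled before dealing, every participant has the same chance of ending up in either group, and the groups come out equal in size.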
2. Quasi-experimental method:
The research study is structured similarly to an experiment but either includes a quasi-independent
variable or lacks a comparison/control group (or both). A quasi-independent variable is a pre-existing
variable that is often a
characteristic inherent to an individual, which differentiates the groups or conditions being compared in a
research study. Because the levels of the variables are pre-existing, it is not possible to randomly assign
participants to groups. Example: sex -> male, female. The variables cannot be manipulated.
Without a comparison/control group, the differences between two levels of an independent variable
cannot be compared.
3. Correlational method:
The correlational method involves measuring the relationship between pairs of scores. No variable is
manipulated to create different conditions or groups to which participants can be randomly assigned. Two
variables are measured, and the extent to which those variables are related is determined.
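One common way to express how strongly two measured variables are related is the Pearson correlation coefficient r. The notes above do not name a specific statistic, so using r here is an assumption. The paired scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired scores x and y (lists of equal length)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical pairs of scores: one pair per participant, nothing manipulated.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]

r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.2f}")
```

A value of r near +1 or -1 indicates a strong relationship and a value near 0 a weak one. Because no variable was manipulated, a strong r on its own does not show cause and effect.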
Scale of measurement: rules for how the properties of numbers can change with different uses. These
rules imply that the extent to which a number is informative depends on how it was used or measured.
Order: does a larger number indicate a greater value than a smaller number?
Difference: does subtracting two numbers represent some meaningful value?
Ratio: does dividing (or taking the ratio of) two numbers represent some meaningful value?
Scale of Measurement
Property     Nominal   Ordinal   Interval   Ratio
Order        No        Yes       Yes        Yes
Difference   No        No        Yes        Yes
Ratio        No        No        No         Yes
Nominal scales are measurements in which a number is assigned to represent something or someone.
Example: credit card numbers, ZIP codes, etc. This is different from nominal variables like gender. Coding
is the procedure of converting a nominal or categorical variable to a numeric value.
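Coding as described above can be sketched as a simple lookup table. The variable (eye colour) and the code numbers are hypothetical. The numbers act only as labels, so their order and size carry no meaning.

```python
# Hypothetical coding scheme: each category gets an arbitrary numeric label.
codes = {"brown": 1, "blue": 2, "green": 3}

# Hypothetical observations of a nominal variable.
eye_colours = ["blue", "brown", "brown", "green"]

# Coding: convert each category to its numeric value.
coded = [codes[colour] for colour in eye_colours]
print(coded)

# The coding can be reversed because the numbers are only labels.
labels = {number: name for name, number in codes.items()}
decoded = [labels[value] for value in coded]
print(decoded)
```

Swapping the code numbers around would change nothing about the data, which is why order, difference, and ratio are all "No" for the nominal scale.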
Ordinal scales are measurements that convey order or rank alone. Example: educational level, finishing
order in a competition, etc. Ranks convey only the order, not the meaning of the differences between
them. They indicate that one rank is greater or less than another, but not by how much.
Interval scales are measurements that have no true zero and are distributed in equal units. A true zero is
when the value 0 truly indicates nothing on a scale of measurement. Interval scales do NOT have a true
zero. Example: temperature in degrees Celsius or Fahrenheit has a 0, but this does not mean there is no
temperature.
Ratio scales are measurements that DO have a true zero and are distributed in equal units. Example:
length, age.
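The role of a true zero can be checked numerically. In the sketch below the temperatures and lengths are hypothetical. The point is that a ratio of two interval-scale values changes when the unit changes, whereas a ratio of two ratio-scale values does not.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale: temperature (no true zero).
warm_c, cool_c = 20.0, 10.0
warm_f = celsius_to_fahrenheit(warm_c)
cool_f = celsius_to_fahrenheit(cool_c)

ratio_c = warm_c / cool_c   # 20 / 10
ratio_f = warm_f / cool_f   # 68 / 50, a different ratio for the same two temperatures

# Ratio scale: length (a length of 0 means no length at all).
long_cm, short_cm = 20.0, 10.0
long_in, short_in = long_cm / 2.54, short_cm / 2.54

ratio_cm = long_cm / short_cm
ratio_in = long_in / short_in   # same ratio whichever unit is used

print("temperature ratios:", ratio_c, ratio_f)
print("length ratios:", ratio_cm, ratio_in)
```

Since the temperature ratio depends on the unit chosen, saying that 20 degrees is "twice as hot" as 10 degrees is not meaningful, whereas 20 cm really is twice as long as 10 cm.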
There are two common types of variables for which data are measured:
1. Continuous and discrete measures:
A continuous variable is measured along a continuum at any place beyond the decimal point. A continuous
variable can thus be measured in fractional units. Example: time measured to 1/100 or 1/1000 of a second.
A discrete variable is measured in whole units or categories that are not distributed along a continuum.
Example: number of brothers and sisters you have.
2. Quantitative and qualitative variables
A quantitative variable varies by amount. This variable is often measured numerically and is often collected
by measuring or counting. Both continuous and discrete measures can be quantitative, for example calories
(continuous) or number of pieces of food (discrete).
A qualitative variable varies by class. This variable is often represented as a label and describes
nonnumeric aspects of phenomena. Only discrete variables can fall into this category, for example
socioeconomic class or categories of depression.
Chapter Three Summarizing Data; Central Tendency
Measures of central tendency are statistical measures for locating a single score that is most representative
or descriptive of all scores in a distribution. They have a tendency to be near the centre of a distribution.
N = population size: the number of individuals who constitute an entire group or population.
n = sample size: the number of individuals who constitute a subset of those selected from a larger
population.
There are three measures of central tendency:
1. The mean/arithmetic mean/average: It is the sum (Σ) of a set of scores (x) in a distribution, divided
by the total number of scores summed.
A population mean is the mean for a set of scores in an entire population -> $\mu = \frac{\sum x}{N}$
A sample mean is the mean for a sample (or subset of scores from a population) -> $M = \frac{\sum x}{n}$
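The mean is not necessarily the middle value or the exact centre of a distribution; it is the balance point.
Example: a pen balances on a finger at the point where the weight on both sides is equal, which need not
be its midpoint.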
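A small sketch with hypothetical scores shows the mean acting as a balance point without being the middle value. The deviations of the scores from the mean always sum to zero, even when one extreme score pulls the mean far from the middle.

```python
# Hypothetical sample of n = 5 scores with one extreme value.
scores = [1, 2, 3, 4, 90]
n = len(scores)

M = sum(scores) / n                        # sample mean: sum of x divided by n
middle_value = sorted(scores)[n // 2]      # middle score of the sorted list (n is odd)

# Distances of each score from the mean: negative below it, positive above it.
deviations = [x - M for x in scores]

print("mean:", M)
print("middle value:", middle_value)
print("sum of deviations:", sum(deviations))
```

Here four of the five scores lie below the mean, yet the deviations below and above cancel exactly, which is the sense in which the mean balances the distribution.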
The weighted mean is the combined mean of two or more groups of scores in which the number of scores
in each group is disproportionate or unequal. Disproportionate refers to the fact that some samples have
more scores than others. The formula is as follows:
$M_w = \frac{\sum (M \times n)}{\sum n}$ or $M_w = \frac{\text{weighted sum}}{\text{combined } n}$
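The weighted mean formula can be sketched directly in Python. The two group means and group sizes are hypothetical, and the function name is chosen only for illustration.

```python
def weighted_mean(means, sizes):
    """Combined mean of several groups: sum of (M x n) divided by the sum of n."""
    weighted_sum = sum(m * n for m, n in zip(means, sizes))
    combined_n = sum(sizes)
    return weighted_sum / combined_n

# Hypothetical groups of unequal size.
group_means = [80.0, 70.0]   # M for each group
group_sizes = [10, 30]       # n for each group

M_w = weighted_mean(group_means, group_sizes)
unweighted = sum(group_means) / len(group_means)   # ignores the group sizes

print("weighted mean:", M_w)
print("simple average of the two means:", unweighted)
```

The weighted mean lies closer to the mean of the larger group, whereas simply averaging the two group means treats both groups as if they contained the same number of scores.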