Week 1
Chapter 1
1.1 Definitions of Statistics
In this case, we are using the term statistics as a shortened version of statistical procedures.
Research in the behavioral sciences (and other fields) involves gathering information.
Definitions: The term statistics refers to a set of mathematical procedures for organizing,
summarizing, and interpreting information.
2 General Purposes of Statistics
1. Statistics are used to organize and summarize the information so that the researcher can
see what happened in the research study and can communicate the results to others.
2. Statistics help the researcher to answer the questions that initiated the research by
determining exactly what general conclusions are justified based on the specific results
that were obtained.
Statistical procedures help ensure that the information or observations are presented and
interpreted in an accurate and informative way.
Statistics provide researchers with a set of standardized techniques that are recognized and
understood throughout the scientific community.
Populations and Samples
Research in the behavioral sciences typically begins with a general question about a specific
group (or groups) of individuals.
For example, a researcher may want to know what factors are associated with academic
dishonesty among college students. Here the researcher is interested in a group of individuals: college students.
In statistical terminology, the entire group that a researcher wishes to study is called a
population.
Definition: A population is the set of all the individuals of interest in a particular study.
Populations can obviously vary in size from extremely large to very small, depending on how the
investigator defines the population.
Because populations tend to be very large, it usually is impossible for a researcher to examine
every individual in the population of interest. Therefore, researchers typically select a smaller,
more manageable group from the population and limit their studies to the individuals in the
selected group. In statistical terms, a set of individuals selected from a population
is called a sample. A sample is intended to be representative of its population, and a sample
should always be identified in terms of the population from which it was selected.
Definition: A sample is a set of individuals selected from a population, usually intended to
represent the population in a research study.
When a researcher finishes examining the sample, the goal is to generalize the results back
to the entire population.
Variables and Data
Researchers are interested in specific characteristics of the individuals in the population (or the
sample), or they’re interested in the factors that may influence individuals or their behaviours.
Definition: A variable is a characteristic or condition that changes or has different values for
different individuals.
Variables can be characteristics that differ from one individual to another, such as weight,
gender, personality, or fast-food preferences.
To demonstrate changes in variables, it is necessary to make measurements of the variables being
examined.
The measurement obtained for each individual is called a datum, or more commonly, a score or
raw score. The complete set of scores is called the data set or simply the data.
Definition: Data (plural) are measurements or observations.
A data set is a collection of measurements or observations.
A datum (singular) is a single measurement or observation and is commonly called a score or
raw score.
Because research typically involves measuring each individual to obtain a score, every sample
(or population) of individuals produces a corresponding sample (or population) of scores.
Parameters and Statistics
When describing data, it is necessary to specify whether the data come from a population
or a sample.
Definition: A parameter is a value—usually a numerical value—that describes a population.
A parameter is usually derived from measurements of the individuals in the
population.
Definition: A statistic is a value—usually a numerical value—that describes a sample. A
statistic is usually derived from measurements of the individuals in the sample.
Every population parameter has a corresponding sample statistic.
Descriptive and Inferential Statistical Methods
Researchers have developed two general categories of statistical procedures.
The first category:
Definition: Descriptive statistics are statistical procedures used to summarize, organize, and
simplify data.
Descriptive statistics are techniques that take raw scores and organize or summarize them in a form
that is more manageable.
Often the scores are organized in a table or graph, or a set of scores is summarized by computing an average.
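As a rough illustration of the two descriptive techniques mentioned above (organizing scores in a table and summarizing them with an average), here is a minimal Python sketch. The scores and the helper name frequency_table are made up for this example and are not from the notes.

```python
from collections import Counter
from statistics import mean

# Hypothetical raw scores (number correct on a 10-item quiz); illustrative only.
scores = [7, 8, 8, 9, 6, 8, 7, 10, 9, 8]

def frequency_table(data):
    """Return (score, frequency) pairs ordered from lowest to highest score."""
    counts = Counter(data)
    return sorted(counts.items())

average = mean(scores)           # summarize the whole data set with one value
table = frequency_table(scores)  # organize the raw scores into a table

print(f"average = {average}")
for score, freq in table:
    print(f"{score:>2}  {freq}")
```

Ten raw scores are reduced to a five-row table and a single average, which is the sense in which descriptive statistics make data "more manageable".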
The second category:
Definition: Inferential statistics consist of techniques that allow us to study samples and then
make generalizations about the populations from which they were selected.
A problem with samples is that they provide only limited information about the population.
Although a sample is generally representative of its population, it is not perfectly accurate.
This results in a discrepancy between the sample statistic and the corresponding population
parameter. This discrepancy is called:
Definition: Sampling error is the naturally occurring discrepancy, or error, that exists between a
sample statistic and the corresponding population parameter.
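Sampling error can be seen in a small simulation. The sketch below assumes a made-up population of 1,000 scores and uses the mean as both the parameter and the statistic. The population values, sample size, and the helper name sampling_error are assumptions for illustration only.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 scores; the values are invented.
population = [random.gauss(100, 15) for _ in range(1000)]
parameter = mean(population)  # describes the population, so it is a parameter

def sampling_error(sample, population_value):
    """Sample statistic (here, the mean) minus the population parameter."""
    return mean(sample) - population_value

# Two samples of 25 individuals selected from the same population.
sample_a = random.sample(population, 25)
sample_b = random.sample(population, 25)

error_a = sampling_error(sample_a, parameter)
error_b = sampling_error(sample_b, parameter)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample A mean (statistic):   {mean(sample_a):.2f}  error: {error_a:+.2f}")
print(f"sample B mean (statistic):   {mean(sample_b):.2f}  error: {error_b:+.2f}")
```

Each sample gives a slightly different statistic, and neither is expected to match the parameter exactly. The only "sample" with no sampling error is the whole population.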
1.2 Variables and Measurement
Constructs and Operational Definitions
Often the variables being studied are internal characteristics that people use to help describe and
explain behaviour. Variables like intelligence, anxiety, and hunger are called constructs, and
because they are intangible and cannot be directly observed, they are often called hypothetical
constructs.
Although a construct cannot be observed directly, it is possible to observe and measure behaviors
that are representative of the construct.
The external behaviors can then be used to create an operational definition for the construct. An
operational definition defines a construct in terms of external behaviors that can be observed and
measured.
Definition: Constructs are internal attributes or characteristics that cannot be directly
observed but are useful for describing and explaining behavior.
Definition: An operational definition identifies a measurement procedure (a set of operations) for
measuring an external behavior and uses the resulting measurements as a definition and a
measurement of a hypothetical construct. Note that an operational definition has two
components: First, it describes a set of operations for measuring a construct. Second, it defines the
construct in terms of the resulting measurements.
Discrete and Continuous Variables
A discrete variable consists of separate, indivisible categories. For this type of variable, there are
no intermediate values between two adjacent categories.
Consider the number of questions that each student answers correctly on a 10-item multiple-
choice quiz. Between neighboring values—for example, seven correct and eight correct—no
other values can ever be observed.
Definition: A discrete variable consists of separate, indivisible categories. No values can exist
between two neighboring categories. (whole, countable values)
VS
Variables that are not discrete, such as time, height, and weight, are not limited to a fixed set of
separate, indivisible categories.
Definition: For a continuous variable, there are an infinite number of possible values that fall
between any two observed values. A continuous variable is divisible into an infinite number of
fractional parts.
2 Other factors apply to Continuous Variables:
1. When measuring a continuous variable, it should be very rare to obtain identical
measurements for two different individuals. Because a continuous variable has an infinite
number of possible values, it should be almost impossible for two people to have exactly
the same score. If the data show a substantial number of tied scores, then you should
suspect that the measurement procedure is very crude or that the variable is not really
continuous.
2. When measuring a continuous variable, researchers must first identify a series of
measurement categories on the scale of measurement. Measuring weight to the nearest
pound, for example, would produce categories of 149 pounds, 150 pounds, and so on.
However, each measurement category is actually an interval that must be defined by
boundaries.
Definition: Real limits are the boundaries of intervals for scores that are represented on a
continuous number line. The real limit separating two adjacent scores is located exactly halfway
between the scores. Each score has two real limits. The upper real limit is at the top of the
interval, and the lower real limit is at the bottom.
The concept of real limits applies to any measurement of a continuous variable, even when the
score categories are not whole numbers. For example, if you were measuring time to the nearest
tenth of a second, the measurement categories would be 31.0, 31.1, 31.2, and so on.
Real limits are used for constructing graphs and for various calculations with continuous scales.
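The halfway rule in the definition can be written as a short calculation: each real limit lies half a measurement unit above or below the score. The sketch below is only an illustration. The function name real_limits and its string arguments are assumptions, and Decimal is used so that values such as 0.1 stay exact.

```python
from decimal import Decimal

def real_limits(score, unit):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`.

    Both arguments are passed as strings so Decimal can represent them exactly.
    """
    half_unit = Decimal(unit) / 2
    value = Decimal(score)
    return value - half_unit, value + half_unit

# Weight measured to the nearest pound
print(real_limits("150", "1"))

# Time measured to the nearest tenth of a second
print(real_limits("31.1", "0.1"))
```

For a weight of 150 pounds the real limits are 149.5 and 150.5. For a time of 31.1 seconds they are 31.05 and 31.15, so the upper real limit of 31.0 is the same point as the lower real limit of 31.1.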
Whenever you are free to choose the degree of precision or the number of categories for
measuring a variable, the variable must be continuous.
Scales of Measurement
Measurement assigns individuals or events to categories
• The categories can be names, such as male/female or employed/unemployed
• They can be numerical values, such as 68 inches or 175 pounds
The complete set of categories makes up a scale of measurement
Relationships between the categories determine different types of scales
The distinctions among the scales are important because they identify the limitations of certain
types of measurements and because certain statistical procedures are appropriate for scores that
have been measured on some scales but not on others.