INTRODUCTION TO STATISTICS
Statistics
Set of mathematical procedures for organizing, summarizing, and interpreting information.
Purpose of statistics
1. Organize and summarize the information so that the researcher can see what happened in the
research study and can communicate the results to others.
2. Help the researcher to answer the questions that initiated the research by determining
exactly what general conclusions are justified based on the specific results that were
obtained.
Variable
A characteristic or condition that changes or has different values for different individuals.
Data
Measurements or observations.
Data set
Collection of measurements or observations.
Datum
A single measurement or observation, commonly called a score or raw score.
(measurement obtained for each individual)
Parameter
Value that describes a population that is derived from measurements of the individuals in the
population. (Example: the average score for the population)
Statistic
A value that describes a sample that is derived from measurements of the individuals in the
sample.
Descriptive statistics
Statistical procedures used to summarize, organize, and simplify data.
Inferential statistics
Techniques that allow us to study samples and then make generalizations about the
populations from which they were selected.
Sampling error
Error that exists between a sample statistic and the corresponding population parameter.
Unpredictable, unsystematic differences that exist from one sample to another are an example
of sampling error.
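The idea of sampling error can be sketched with a small simulation. This is only an illustration under stated assumptions: the population (the integers 1 to 100), the sample size, and the helper name `sampling_error_demo` are all hypothetical and not part of the notes.

```python
import random
import statistics

def sampling_error_demo(population, n, seed=0):
    """Draw one random sample and compare its mean to the population mean.

    Returns (parameter, statistic, error), where error = statistic - parameter.
    """
    rng = random.Random(seed)               # seeded so the run is repeatable
    sample = rng.sample(population, n)      # sample without replacement
    parameter = statistics.mean(population) # describes the population
    statistic = statistics.mean(sample)     # describes the sample
    return parameter, statistic, statistic - parameter

# Hypothetical population: the scores 1 through 100
population = list(range(1, 101))
parameter, statistic, error = sampling_error_demo(population, n=10)
print("parameter (population mean):", parameter)
print("statistic (sample mean):    ", statistic)
print("sampling error:             ", error)
```

Changing `seed` draws a different sample, and the sample mean will usually change with it while the population mean stays fixed. That sample-to-sample variation is the unsystematic difference described above.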
Correlational method
Two different variables are observed to determine whether there is a relationship between
them.
Cannot demonstrate a cause-and-effect relationship.
Techniques to control other variables
1. Random assignment
2. Matching
3. Holding them constant
Quasi-independent variable
In a nonexperimental study, the ‘independent variable’ that is used to create the different
groups of scores.
Construct
Internal attributes or characteristics that cannot be directly observed but are useful for
describing and explaining behavior.
Discrete variable
Separate, indivisible categories. No values can exist between two neighboring categories.
Example: class attendance from day to day
Continuous variables
There are an infinite number of possible values that fall between any two observed values.
Example: time, height, weight
Can be divided into any number of fractional parts!
Two other factors apply to continuous variables:
1. When measuring a continuous variable, it should be very rare to obtain identical
measurements for two different individuals. Because a continuous variable has an
infinite number of possible values, it should be almost impossible for two people to
have exactly the same score. If the data show a substantial number of tied scores, then
you should suspect that the measurement procedure is very crude or that the variable
is not really continuous.
2. When measuring a continuous variable, each measurement category is actually an
interval that must be defined by boundaries. For example, two people who both claim
to weigh 150 pounds are probably not exactly the same weight. However, they are
both around 150 pounds. One person may actually weigh 149.6 and the other 150.3.
Thus, a score of 150 is not a specific point on the scale but instead is an interval (see
Figure 1.7). To differentiate a score of 150 from a score of 149 or 151, we must set up
boundaries on the scale of measurement. These boundaries are called real limits and
are positioned exactly halfway between adjacent scores. Thus, a score of X = 150
pounds is actually an interval bounded by a lower real limit of 149.5 at the bottom
and an upper real limit of 150.5 at the top. Any individual whose weight falls between
these real limits will be assigned a score of X = 150.
Real limits
Boundaries of intervals for scores that are represented on a continuous number line. The real
limit separating two adjacent scores is located exactly halfway between the scores. Each score
has two real limits: the upper real limit is at the top of the interval, and the lower real limit is
at the bottom.
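A minimal sketch of the real-limits rule, assuming scores are recorded in steps of one measurement unit. The function names `real_limits` and `assigned_score` and the `unit` parameter are hypothetical, introduced only for illustration.

```python
import math

def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits: half a unit below and above the score."""
    half = unit / 2
    return score - half, score + half

def assigned_score(measurement):
    """Whole-number score whose real limits contain the measurement (unit = 1)."""
    return math.floor(measurement + 0.5)

lower, upper = real_limits(150)
print(lower, upper)            # 149.5 150.5

# The two weights from the example both fall inside the interval for X = 150
print(assigned_score(149.6))   # 150
print(assigned_score(150.3))   # 150
```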
SCALES OF MEASUREMENT
1. Nominal scale
Set of categories that have different names. Measurements on a nominal scale label and
categorize observations, but do not make any quantitative distinctions between
observations.
Example: classifying people by race, gender, occupation
2. Ordinal scale
Categories that are organized in an ordered sequence.
Rank order.
3. Interval scale
Ordered categories that are all intervals of exactly the same size. The zero point on an
interval scale is arbitrary and does not indicate a zero amount of the variable being
measured.
4. Ratio scale
Interval scale with the additional feature of an absolute zero point.
Notation
N = The number of scores in a population
n = The number of scores in a sample
Σ = Summation
Order of Mathematical Operations
1. Any calculation contained within parentheses is done first.
2. Squaring (or raising to other exponents) is done second.
3. Multiplying and/or dividing is done third. A series of multiplication and/or division
operations should be done in order from left to right.
4. Summation using the Σ notation is done next.
5. Finally, any other addition and/or subtraction is done.
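The order of operations matters most when Σ is combined with squaring or subtraction. The sketch below uses a hypothetical set of four scores (the variable `X` and its values are illustrative only) to show how expressions that look similar give different results.

```python
# Hypothetical scores, for illustration only
X = [3, 1, 7, 4]

sum_x = sum(X)                                # ΣX      = 3 + 1 + 7 + 4   = 15
sum_of_squares = sum(x ** 2 for x in X)       # ΣX²     = 9 + 1 + 49 + 16 = 75 (square first, then sum)
square_of_sum = sum(X) ** 2                   # (ΣX)²   = 15²             = 225 (parentheses first)
sum_then_subtract = sum(X) - 1                # ΣX - 1  = 15 - 1          = 14 (sum before subtraction)
sum_of_differences = sum(x - 1 for x in X)    # Σ(X-1)  = 2 + 0 + 6 + 3   = 11
sum_sq_differences = sum((x - 1) ** 2 for x in X)  # Σ(X-1)² = 4 + 0 + 36 + 9 = 49

print(sum_x, sum_of_squares, square_of_sum)
print(sum_then_subtract, sum_of_differences, sum_sq_differences)
```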
CHAPTER 2
A frequency distribution takes a disorganized set of scores and places them in order from
highest to lowest, grouping together individuals who all have the same score. If the highest
score is X = 10, for example, the frequency distribution groups together all the 10s, then all
the 9s, then the 8s, and so on.
Σf = N
proportion = p = f/N (relative frequency)
percentage = p(100) = (f/N)(100)
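As a rough illustration, a frequency distribution table with proportions and percentages could be built as follows. The scores are made up, whole-number scores are assumed, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

def frequency_table(scores):
    """Rows of (X, f, p, percent), listed from the highest score to the lowest.

    Assumes whole-number scores; values with f = 0 are still listed.
    """
    counts = Counter(scores)
    n = len(scores)                       # N = Σf
    rows = []
    for x in range(max(scores), min(scores) - 1, -1):
        f = counts[x]                     # Counter returns 0 for missing values
        p = f / n                         # proportion = f/N
        rows.append((x, f, p, p * 100))   # percentage = p(100)
    return rows

# Hypothetical set of N = 20 quiz scores
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8]
rows = frequency_table(scores)

print(" X   f     p      %")
for x, f, p, pct in rows:
    print(f"{x:2d}  {f:2d}  {p:.2f}  {pct:5.1f}")
```

Adding up the f column recovers N, which is a quick check that no score was dropped from the table.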
Grouped frequency distribution table guidelines
1. The grouped frequency distribution table should have about 10 class intervals. With too few or too many
intervals, the table will not provide a clear picture.
2. The width of each interval should be a relatively simple number.
3. The bottom score in each class interval should be a multiple of the width.
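The guidelines above can be sketched in code. This is a minimal example assuming whole-number scores and a width chosen by hand; the data set and the name `grouped_frequency_table` are illustrative, not taken from the notes.

```python
def grouped_frequency_table(scores, width):
    """Rows of (bottom, top, f), listed from the highest interval to the lowest.

    Assumes whole-number scores. Each interval starts at a multiple of `width`.
    """
    lowest, highest = min(scores), max(scores)
    bottom = (lowest // width) * width    # largest multiple of width <= lowest score
    rows = []
    while bottom <= highest:
        top = bottom + width - 1          # apparent upper limit of the interval
        f = sum(1 for x in scores if bottom <= x <= top)
        rows.append((bottom, top, f))
        bottom += width
    rows.reverse()                        # highest interval first
    return rows

# Hypothetical set of N = 25 exam scores ranging from 53 to 94
scores = [82, 75, 88, 93, 53, 84, 87, 58, 72, 94, 69, 84, 61,
          91, 64, 87, 84, 70, 76, 89, 75, 80, 73, 78, 60]

table = grouped_frequency_table(scores, width=5)
for bottom, top, f in table:
    print(f"{bottom}-{top}: {f}")
```

With a width of 5 these scores fall into 9 intervals, which is close to the suggested 10. A width of 2 would give more than 20 intervals, and a width of 10 would give only 5.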