Chapter 1. Introduction
Observations: the individual pieces of data that are collected.
Data: the collection of gathered observations on the characteristics of
interest. Existing archived collections of data are called databases.
Statistics: a body of methods for obtaining and analyzing data. It
provides methods for:
1. Design, planning how to gather data for research studies
2. Description, summarizing the data
3. Inference, making predictions based on the data
Description and inference are the two types of statistical analysis,
that is, the two ways of analyzing the data.
Descriptive Statistics: graphs, tables, numerical summaries, etc.
Entities of statistical analysis:
1. Subject: an entity that a study observes (an individual member of
the population or sample).
2. Population: the total set of subjects of interest in a study.
3. Element: A single member of the population.
4. Sample: a subset of the population on which the study collects
data.
Descriptive statistics: summarizes the information in a collection of
data (it describes the observations made).
Inferential statistics: provides predictions about a population, based on
data from a sample of that population (makes predictions derived from the
observations).
Parameter: a numerical summary of the population. (Applicable to the
whole)
Statistic: a numerical summary of the sample data. (Applicable to a part
of the whole)
Precision: the extent to which the sample statistics approach the
population parameters.
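The relationship between population, sample, parameter and statistic can
be sketched in a few lines of Python. This is only an illustration: the
population of ages, the sample size of 200 and the fixed seed are
assumptions made for the example, not values from the text.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed (assumption) so the sketch is reproducible

# Hypothetical population: the ages of 10,000 subjects of interest.
population = [rng.randint(18, 90) for _ in range(10_000)]

# Parameter: a numerical summary of the whole population.
parameter_mean = statistics.mean(population)

# Sample: a subset of the population on which data is collected.
sample = rng.sample(population, 200)

# Statistic: the same numerical summary, computed on the sample only.
statistic_mean = statistics.mean(sample)

# The smaller this gap, the closer the statistic is to the parameter.
gap = abs(statistic_mean - parameter_mean)
print(f"parameter={parameter_mean:.2f} statistic={statistic_mean:.2f} gap={gap:.2f}")
```

In a real study the parameter is usually unknown, because the whole
population cannot be observed; it is computed here only because the
population is simulated.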
Data file: a spreadsheet with separate rows of data for each subject and
separate columns for each characteristic.
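A minimal sketch of such a data file, assuming a small CSV layout. The
column names and the three rows are made up for illustration.

```python
import csv
import io

# Hypothetical data file: one row per subject, one column per characteristic.
raw = """subject,age,transport,social_class
1,34,bike,middle
2,51,foot,low
3,27,public transportation,high
"""

# Each row becomes a dict mapping column name -> value for one subject.
rows = list(csv.DictReader(io.StringIO(raw)))

# A column holds the observations of one characteristic across all subjects.
ages = [int(row["age"]) for row in rows]

print(len(rows), "subjects; ages:", ages)
```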
Chapter 2. Sampling and Measurement
Validity: Does the measure describe what it is intended to measure, and
does it accurately reflect the concept?
Reliability: Are the measurements produced by the measure consistent
and stable?
Variable: a characteristic that can vary in value among subjects in a
sample or population.
Measurement scales: different types of variables require different scales
of measurement.
Different types of scales:
1. Nominal Scale: unordered categories, without a “high” or “low”
end, from which one category is chosen. This type of categorical scale
offers only nominal information, i.e. names or labels. (E.g. type of
transportation to work: foot, bike, public transportation, etc.).
2. Ordinal Scale: a categorical scale whose values have a natural
ordering. Each level has a greater or smaller magnitude than another
level, but the distance between levels is not defined. (E.g. social
class: low, middle, high, etc.)
3. Interval Scale: the possible numerical values of a quantitative
variable form an interval scale. There is a specific numerical
distance (interval) between each pair of levels. (E.g. annual income).
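The three scales differ in which summaries are meaningful. The sketch
below illustrates this with made-up values: the mode is meaningful for
nominal data, a median becomes meaningful once there is an ordering, and
means and differences require numerical distances. The variable names
and data are assumptions for the example.

```python
import statistics

# Nominal: categories without order, so only counting is meaningful.
transport = ["foot", "bike", "bike", "public transportation"]
most_common = statistics.mode(transport)

# Ordinal: categories with a natural order, so a median is meaningful.
levels = ["low", "middle", "high"]  # order from low to high
social_class = ["middle", "low", "high", "middle", "low"]
ranks = [levels.index(value) for value in social_class]
median_class = levels[statistics.median_low(ranks)]

# Interval: numerical distances exist, so differences and means are meaningful.
income = [28_000, 35_000, 52_000]
mean_income = statistics.mean(income)
income_gap = income[2] - income[0]

print(most_common, median_class, income_gap)
```

Note that the ranks 0, 1, 2 only encode the order of the social classes.
Averaging them would treat the distance from low to middle as equal to
the distance from middle to high, which an ordinal scale does not
guarantee.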
Quantitative: a variable is quantitative when the measurement scale has
numerical values, which represent different magnitudes of the variable.
Categorical: a variable is categorical when the measurement scale is a
set of categories. These variables are often called Qualitative.
Discrete and Continuous variables: a variable is discrete if its possible
values form a set of separate numbers (E.g. 0, 1, 2, 3, ...). It is
continuous if it can take any value in an infinite continuum of possible
real number values.
Randomization: a mechanism for achieving good sample representation.
It is a process of randomly assigning subjects to different samples, or
randomly picking subjects to be included in the sample. Keep in mind
though, that even if a study uses randomization, the results of the study