Chapter 1: Introduction to Psychometrics
Psychological measurement: the assignment of numerals to psychological characteristics of individuals
(according to rules)
- Characteristic = a psychological variable or construct
- A psychological test is a systematic procedure for comparing the behavior of two or more
people
- The purpose of measurement in psychology is to identify and, if possible, quantify
inter-individual and intra-individual differences
In a norm-referenced test, the performance of each examinee is interpreted in reference to a relevant
standardization sample
- I.e., a reference/normative sample
However, in a criterion-referenced test, the objective is to determine where the examinee stands with
respect to very tightly defined educational objectives
- There is no comparison to the normative performance of others; no reference group
A speed test is time-limited. In general, people who take speeded tests are not expected to complete
the entire test in the allotted time
- Speeded tests are scored by counting the number of questions answered in the allotted time
period
A power test is not time-limited; the examinees are expected to answer all the test questions
- Often, power tests are scored by counting the number of correct answers made on the test
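As a minimal illustration of the two scoring rules above, the sketch below scores one made-up response record both ways. The record format (True, False, or None for an item that was not answered) and the function names are assumptions made for this example, not part of any real test.

```python
from typing import List, Optional

# Hypothetical response record: True = correct, False = incorrect,
# None = item not answered within the time limit.
Responses = List[Optional[bool]]


def speed_score(responses: Responses) -> int:
    """Speeded-test scoring: count the items answered in the allotted time."""
    return sum(r is not None for r in responses)


def power_score(responses: Responses) -> int:
    """Power-test scoring: count the items answered correctly."""
    return sum(r is True for r in responses)


record = [True, True, False, True, None, None]
print(speed_score(record))  # 4 items answered
print(power_score(record))  # 3 items correct
```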
Psychometrics: the science concerned with the attributes of psychological tests
- E.g., is it reliable? Is it valid? Does it display a construct bias?
- Psychometric analysis: analysis of individual differences in item responses
Psychological variables/constructs are latent traits (i.e., hypothetical constructs; the psychological
constructs driving responses to a test)
- Operational definitions: the procedure used to measure hypothetical constructs (e.g., the
number of recalled digits is an operational definition of working memory)
- The measurement of psychological variables depends on the measurement of observable
behavior
- But how do we link observable behavior (e.g., do you party much?) to psychological
(unobservable) variables (e.g., extraversion)?
- What is required:
1. A psychological theory linking psychological characteristics, processes, or
states to an observable behavior (e.g., how often a person goes out) that is
thought to reflect differences in the psychological attribute (e.g.,
extraversion)
a. Here, the relevant psychological theory is the Big Five; you could ask an
expert which items are good for measuring the latent variable (here,
the example is extraversion)
b. A theory also prescribes the distribution of the latent variable
(continuous distribution (Big 5) versus classification (Piaget’s stages
of development; nominal level))
i. I.e., we make distributional assumptions based on the theory
ii. However, the distribution is not always clear (see image; the
different theories/individuals need not agree on the type of
distribution the latent variable displays)
iii. Related to statistics
2. Causality
a. Reflective versus formative items (always an exam question; see page
___)
3. Statistics
a. Necessary to examine (1) individual differences in item responses and
(2) the association between items
4. An explicit graphical representation of the latent-observed variable relation
(path diagram)
A path diagrammatic representation of a latent variable:
- The latent variable (e.g., depression, spatial IQ, attachment style, etc.) is depicted in a circle
- Psychometric analysis based on variance and standard deviation (interval scale)
- The observed variables (performance and self-report ratings; e.g., do you drink alcohol daily?)
are shown in boxes; linked to the latent variable
- The observed variables correspond to items in, for example, a questionnaire
- Psychometric analysis based on variance, standard deviation, mean, and (conditional)
probability
- Most often ordinal scale (0, 1, 2, 3, 4), ordinal binary (0/1), or ordinal but
treated as interval (0/1/2/3/4)
- The Pearson product moment correlation coefficient expresses the
linear relationship between the item responses (i.e., variables)
- I.e., what is the correlation between the items
- The correlation matrix provides input for common factor analysis
- Single common factor model: the items are correlated because they are all
dependent upon a single latent variable
- Every box is accompanied by an error term; the observed variable is related to the latent trait,
but never perfectly (no item has a reliability of 1; there is always measurement error)
- I.e., all measurement is subject to measurement error
- Each arrow (specifically, latent variable → observed variable/item responses) represents a
linear regression (the prediction of Y (dependent variable) from X (predictor variable))
- Linear regression model: y_i = b0 + b1 × x_i + e_i (also: y_i = a + b × x_i + e_i), where
- b0 = intercept (i.e., a)
- b1 = regression coefficient (i.e., b; the slope)
- b0 and b1 are parameters that determine the line through the data
(remember the scatterplot)
- i = person index
- i = 1, 2, 3, ...
- e = error
- Application: item1_i = b0 + b1 × LV_i + error1_i
- LV = latent variable
- Linear regression: the dependent variable is continuously distributed
- The scores are on an interval level
- Logistic regression: the dependent variable (item) is binary or dichotomous (e.g., 0 or 1)
- A logistic regression is the counterpart of linear regression for a dichotomous
dependent variable: the log-odds of responding 1 are modeled as a linear function of the predictor
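The single common factor idea above can be sketched with a small simulation: each item is generated as a linear regression on the same latent variable plus its own error, and the items then correlate only because they share that latent variable. All parameter values, the normal distributions, and the function name `logistic_item_probability` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # fixed seed so the run is reproducible
n_persons = 5000

# Latent variable (LV), assumed standard normal for this illustration.
lv = rng.normal(0.0, 1.0, size=n_persons)

# Hypothetical item parameters: intercepts b0 and slopes b1 for three items.
b0 = np.array([2.0, 2.5, 3.0])
b1 = np.array([0.8, 0.6, 0.7])
error_sd = 0.6

# item_ji = b0_j + b1_j * LV_i + error_ji, with independent errors per item.
errors = rng.normal(0.0, error_sd, size=(n_persons, 3))
items = b0 + np.outer(lv, b1) + errors

# Pearson correlation matrix of the items (columns are the variables).
corr = np.corrcoef(items, rowvar=False)
print(np.round(corr, 2))

# Correlation implied by the model for items 1 and 2:
# b1_1 * b1_2 / (sd_1 * sd_2), where var_j = b1_j**2 + error variance.
sd = np.sqrt(b1**2 + error_sd**2)
implied_12 = b1[0] * b1[1] / (sd[0] * sd[1])
print(round(float(implied_12), 2))


def logistic_item_probability(lv_value: float, b0_j: float, b1_j: float) -> float:
    """P(item = 1 | LV) when the log-odds are linear in the latent variable."""
    return float(1.0 / (1.0 + np.exp(-(b0_j + b1_j * lv_value))))
```

The observed correlation between items 1 and 2 should land close to the model-implied value, which is the sense in which the correlation matrix carries information about the common factor.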
My position on the latent variable (e.g., depression) is the cause of my response to the (depression)
item
- Highly depressed → respond ‘yes’ to the item ‘I feel worthless all the time’
- Causal model: the item responses are directly and causally related to the latent
variable (such items are called reflective items/indicators because they directly reflect
the latent variable)
- Remember, reflectivity is a theoretical assumption
- Example: latent variable = whether or not you have a viral infection, reflective items =
sore throat, cough, fever, etc. (influenza → symptoms)
- Also, working memory and depression (but some psychologists may argue
otherwise)
Not all items are reflective! (e.g., APGAR score)
- Formative model: the item scores determine the APGAR score, but the APGAR score does not
cause the item scores
- The items are formative indicators
- I.e., item scores are not causally dependent on the APGAR index; e.g., the child’s general
health is not causing the respiration to be poor
- A change in the index may not lead to a change in the
variables
- Examples: SES (moving to a different neighborhood will affect my SES (latent), but it will not
affect my income (indicator)), general physical fitness
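A minimal sketch of the formative direction, using the Apgar score as the example: the index is computed from the item scores, so the arrows run from the items to the index. The five criteria rated 0 to 2 follow the standard Apgar scheme (only respiration is mentioned in these notes); the ratings and the function name `apgar_index` are made up for illustration.

```python
from typing import Dict

# The five Apgar criteria, each rated 0, 1, or 2 (the ratings are made up).
ratings: Dict[str, int] = {
    "appearance": 2,
    "pulse": 2,
    "grimace": 1,
    "activity": 2,
    "respiration": 1,
}


def apgar_index(item_scores: Dict[str, int]) -> int:
    """Formative index: the total is built from (defined by) the item scores."""
    if any(score not in (0, 1, 2) for score in item_scores.values()):
        raise ValueError("each criterion must be rated 0, 1, or 2")
    return sum(item_scores.values())


print(apgar_index(ratings))  # 8

# Changing an indicator changes the index ...
worse = dict(ratings, respiration=0)
print(apgar_index(worse))  # 7
# ... but nothing in this model lets the index change an indicator.
```

In a reflective model the code would run the other way: the item scores would be generated from the latent variable.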
Causality is also important for the causal interpretation of group differences
- ‘On average, males have better spatial abilities than females’ → this statement refers to the
latent variable spatial ability; so, males and females differ on average with respect to the
latent variable
- Sex predicts spatial ability
- spatial = b0 + bsex × sex + e
- bsex = 0 → no difference between the female distribution and
the male distribution of the latent trait spatial ability (no
mean difference)
- Sex differences in spatial ability (bsex > 0) → implies sex differences in the item
responses; sex is predictive of spatial ability (causal model)
- Mediation model: the latent variable mediates the relationship
between sex and the item scores. Causal path:
- Sex is predictive of spatial ability (latent variable; coefficient
bsex)
- Spatial ability predicts item response (coefficients b1, b2, and b3)
- If coefficient bsex2 (a direct path from sex to an item, here item 3) ≠ 0, this counts as a
violation of the causal mediation model
- There is a sex difference (bsex2 > 0) on item 3 that is not caused by spatial
ability (bsex = 0; no sex difference in spatial ability)
- A sex difference on item 3 but not on the other items is not what you’d
expect → embarrassing; a t-test on the test scores would show a mean difference
between males and females, suggesting a sex difference in spatial ability (which is
false!)
- You wouldn’t know this unless you investigated the model
- Consider this example: Maria has a better spatial ability than Mario. If the test is
unbiased, we expect Maria to have a greater probability of correct item responses
than Mario. If the test is biased (bsex2 ≠ 0), we might find that Mario has a higher
probability of a correct item response, even though he has a lower latent variable
score
- I.e., the basis for the formal definition of construct bias and prediction bias
(Furr chapter 11)
- Issue of dimensionality: are the item responses causally and
directly dependent on one latent variable (i.e., one
dimension)?
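The biased-item scenario above can be sketched with simulated data: there is no sex difference on the latent variable (bsex = 0), but item 3 receives a direct effect of sex (bsex2 > 0). The group coding, sample size, normal distributions, and all coefficient values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=7)  # fixed seed so the run is reproducible
n = 20000  # persons per group (large, so sampling noise is small)

sex = np.repeat([0, 1], n)  # hypothetical coding: 0 = female, 1 = male
b_sex = 0.0                 # no sex difference on the latent variable
b_sex2 = 0.5                # direct effect of sex on item 3 (the bias)

# Latent variable: spatial = b0 + b_sex * sex + e (with b0 = 0 here).
spatial = b_sex * sex + rng.normal(0.0, 1.0, size=2 * n)

# Items: item_j = b_j * spatial + error_j; item 3 also depends on sex directly.
b = np.array([0.8, 0.7, 0.6])
items = np.outer(spatial, b) + rng.normal(0.0, 0.5, size=(2 * n, 3))
items[:, 2] += b_sex2 * sex

# Male-minus-female mean difference per item and for the sum score.
item_diff = items[sex == 1].mean(axis=0) - items[sex == 0].mean(axis=0)
sum_diff = float(item_diff.sum())
print(np.round(item_diff, 2))
print(round(sum_diff, 2))
```

Items 1 and 2 should show a difference near zero and item 3 a difference near bsex2, so the sum score differs between the groups even though the latent means are equal. Comparing only sum scores would hide where the difference comes from.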