2015-2016
Chapters 1-5, 7-8, 11-12, 14.2-14.3, and 15
Note: consult the book for insight into the formulas!
R.B.M.
W.O.
J.A.
Chapter 1: Introduction
- Psychological and educational tests are important tools of the behavioural
sciences. They are used in scientific research (testing theories and
hypotheses, studying development and change, and predicting behaviour)
as well as in practice (diagnoses and psychological and educational
decision making).
1.1 Origins of psychometrics
- In 1905 Binet and Simon published an intelligence test in France, which is
generally considered to be the first modern test. However, this test had
many different roots, which are described by Dubois (1970). Dubois
mentions three roots of modern testing:
1) Civil service examinations (selection of candidates for government
positions)
2) The assessment of academic achievement (universities and
schools)
3) The study of individual differences in behaviour
- Based on Binet’s earlier work on individual differences, Binet and Simon
published in 1905 a ‘psychological method’ to differentiate between
normal and retarded children, which was the first modern intelligence test.
- The development of psychometric theories of test scores originates from
the same period as the development of modern tests.
1.2 Test definitions
- Psychometric terminology sometimes differs depending on the types of
test applications.
- This section summarizes the definitions of some terms that will be used in
this book.
- Definition 1.1 Test: a psychological or educational test is an instrument for
the measurement of a person’s maximum or typical performance under
standardized conditions, where the performance is assumed to reflect one
or more latent attributes.
! This definition has a number of elements:
1) A test is defined to be a measurement instrument. The
definition expresses the view that a test is for measurement in the
first place, and that other uses, such as prediction, are important
applications of the measurement.
2) A test is defined to measure a performance. Cronbach (1960)
distinguished two types of performance test: a maximum performance
test (e.g. intelligence and achievement tests) and a typical performance
test (e.g. personality and attitude tests).
3) The definition also states that performance is measured under
standardized conditions (e.g. instructions and test materials are fixed for
different test takers and on different administration occasions). The
reason for standardization is that test performance must be
comparable between persons and between occasions.
4) Finally, the definition states that it is assumed that the test
performance reflects one or more latent attributes. The test
performance is observable, but latent attributes cannot be observed.
It is assumed that one or more latent attributes underlie the test
performance, and that the latent attributes affect the test
performance.
! Tests are distinguished from surveys. Although surveys contain questions
that are answered by a respondent, it is not assumed that these
questions reflect a latent attribute.
- Definition 1.2 Subtest: an independent part of a test.
- Definition 1.3 Item: the smallest possible subtest of a test.
- Definition 1.4 Dimensionality: the dimensionality of a test or subtest is equal
to the number of latent attributes (variables) that affect test
performance.
! Unidimensional test: a test that predominantly measures one latent
attribute (variable).
! Multidimensional test: a test that measures more than one latent attribute
(variable).
1.3 Test types
- Different types of tests are described in the psychometric literature. As
mentioned earlier, a test is defined as a measurement instrument. Tests
are divided into mental and physical tests.
- Cronbach (1960) distinguished between maximum and typical
performance tests:
Maximum performance: a performance can be considered maximum in
two different respects:
1) The performance is accurate (power test). The test consists of
problems that the test taker tries to solve, and the test taker has
ample time to work on the items. The emphasis is on measuring the
accuracy of solving the problems.
2) The performance is fast (speed test). The emphasis is on measuring
the time taken to solve the problems. Note: the items are easy, so
that they can be solved by all test takers.
! Maximum performance tests are also classified according to the
attributes they measure. The main types are ability (aptitude)
and achievement tests. An ability test measures a person’s best performance
in an area that is not explicitly taught in training and/or educational
programmes. In contrast, an achievement test measures performance
that is explicitly taught in training and/or educational programmes.
Typical performance: a typical performance test is an instrument for
measuring behaviour that is typical of the person.
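As a rough illustration of the power/speed distinction described earlier in this section, the sketch below scores the same set of responses in two ways. Everything in it is an assumption made for illustration and is not taken from the book: the names `ItemResponse`, `power_score` and `speed_score`, the number-correct score for a power test, and the total-time score for a speed test.

```python
from dataclasses import dataclass


@dataclass
class ItemResponse:
    """One test taker's response to one item (hypothetical record)."""
    correct: bool    # was the item solved correctly?
    seconds: float   # time spent on the item


def power_score(responses):
    """Power test: emphasis on accuracy, so count correct items and ignore time."""
    return sum(r.correct for r in responses)


def speed_score(responses):
    """Speed test: emphasis on time, so sum the time taken over all items.

    Assumes the items are easy enough to be solved by all test takers,
    so that accuracy carries little information.
    """
    return sum(r.seconds for r in responses)


# Made-up responses of a single test taker to three items.
responses = [
    ItemResponse(correct=True, seconds=12.0),
    ItemResponse(correct=False, seconds=30.5),
    ItemResponse(correct=True, seconds=8.5),
]

print(power_score(responses))  # 2 items correct
print(speed_score(responses))  # 51.0 seconds in total
```

The same responses give two different scores because the two test types emphasise different aspects of maximum performance.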
Chapter 2: Developing maximum performance tests
- Maximum performance tests ask the test takers to do the best they can to
perform a task. These tests are used to assess a wide variety of abilities,
aptitudes, knowledge and skills.
- The development of a test starts with making a plan. The plan
specifies a number of essential elements of the test development (see §2.1
through §2.7). These 7 elements need not be specified in the given order;
they can be considered simultaneously or in another order.
2.1 The construct of interest
- The test developer must specify the latent variable (construct) of interest
that has to be measured by the test.
- A good way to start a test development project is to define the construct
that has to be measured by the test. This definition describes the construct
of interest, and distinguishes it from other, related constructs. Usually, the
literature on the construct needs to be studied before the definition can be
given.
2.2 The measurement mode
- Different modes can be used to measure constructs. Some modes of
maximum performance tests are:
Self-performance mode: the test taker is asked to perform a mental or
physical task (the most common mode)
Self-evaluation mode: the test taker is asked to evaluate his/her
ability to perform the task
Other-evaluation mode: others are asked to evaluate a person’s ability
to perform a task
2.3 The objectives
- The test developer must specify the objectives of the test. Some distinctions
between objectives are:
1) Tests can be used for scientific (e.g. studying human intellectual
functioning) or practical (e.g. selecting job applicants) purposes.
2) A 2nd distinction is between objectives at the level of individual
test takers (e.g. rejecting or accepting a job applicant) and at the level of a
group of test takers (e.g. using mean scores to compare groups).
3) A 3rd distinction between test objectives is between description,
diagnosis and decision making.
2.4 The population
- The target population of a test is the set of persons to whom the test has
to be applied. The test developer must define the target population, and
must provide criteria for the inclusion and exclusion of persons.
- A target population can be split into distinct subpopulations.
2.5 The conceptual framework
- Test development starts with a definition or description of the construct
that has to be measured by the test. However, the definition or description
is usually not concrete enough to write test items.
- A conceptual framework gives the item writer a handle for writing items. The
framework gives more specific information.