GGZ2030 – Psychodiagnostics
Task 1 – A difficult patient
Task 2 – Fool the assistant
Task 3 – Who is right?
Task 4 – Andrew’s Problems Become Clear
Task 5 – What’s the fuss?
Task 6 – This isn’t helpful!
Lecture 1 – The empirical circle
Lecture 2
Lecture 3 – Rules of Bayes in Psychodiagnostics
Lecture 4 – Psychodiagnostic assessment in a mental health care setting
Lecture – Psychological report writing
Task 1 – A difficult patient
ANSWERS
1. What is a good psychological test? What are the features we have to look
into?
The rating system evaluates the quality of a test on 7 criteria:
- Theoretical basis
o The information in this criterion should enable the prospective test user
to judge whether the test is suitable for his/her purposes
o Contains 3 items:
1.1 – Key item
Asks whether the test manual clarifies the construct that
the test purports to measure, the groups for which the test
is meant, and the applications of the test
1.2
Deals with the theoretical elaboration of the test
construction process
Is the test based on an existing psychological theory or on
new ideas that have brought about changes in this
theory?
The manual has to supply definitions of the constructs to
be measured
If both a paper-and-pencil and a computer-based version
exist, differences in item content or item wording must be
specified
1.3
Asks for information about the operationalization of the
construct
Deals with content validity of the test
- Quality of the testing materials
o If a test can be administered both on paper and on a computer, both
sets of rating items have to be completed. Each set contains eight items,
of which three are key items.
o The key items deal with:
2.1: Standardization of the test content
2.2: The objectivity of the scoring system
2.3: The presence of unnecessary culture-bound words or
content that may be offensive to specific ethnic, gender or other
groups
o The other 5 items refer to:
2.4 + 2.7: Design, content and form of the test materials
2.5: The instructions for the test taker
2.6: The quality of the items
2.8: The scoring of the items
o In adaptive tests, test takers receive different sets of items. For
adaptive tests it is required that:
2.9: The decision rules for the selection of the next item are
specified
o Computer-based tests:
2.10: Information has to be provided that enables the rater to
check the correctness of the scoring.
2.12: Special attention is given to the resistance of the software
to user errors
2.15: The quality of the design of the user interface
2.16: The security of the test materials and the test results
- Comprehensiveness of the manual
o In addition to the User’s Guide (information that the test user needs to
administer and interpret the test), the manual should also supply a
summary of the construction process and the relevant research
(Technical Manual)
o Contains 7 items:
3.1 (key item): Asks whether there is a manual at all
3.2: Deals with the completeness of the instructions for successful
test administration
3.3: Information on restrictions for the use of the test
3.4: The availability of a summary of results of research performed
with the test
3.5: The inclusion of case descriptions
3.6: The availability of indications for test-score interpretation
3.7: Statements on user qualifications
o For computer-based tests 3 extra items are available:
3.8: Asks whether sufficient information is supplied with respect
to the installation of the software
3.9: Asks whether there is sufficient information regarding the
operation of the software and the opportunities provided by the
software
3.10: Asks for the availability of technical support for practical
software use
- Norms
o Scoring a test results in a raw score. This raw score is partly
determined by characteristics of the test, such as number of items, time
limits, item difficulty or item popularity, and test conditions. Thus, the
raw score is difficult to interpret and unsuited for practical use. To give
meaning to a raw score, two ways of scaling or categorizing raw scores
can be distinguished:
Norm-referenced interpretation: A set of scaled scores or norms
may be derived from the distribution of raw scores of a reference
group (a short Python sketch further below works out an example)
Standards may be derived from a domain of skills or subject
matter to be mastered (domain-referenced interpretation), or cut
scores may be derived from the results of empirical validity
research (criterion-referenced interpretation)
Raw scores will be categorized in two (pass/fail) or more
different score ranges.
o The provision of norms, standards, or cut scores is a basic requirement
for the practical use of most tests, but there are exceptions
o This criterion is assessed using two key items and 3 separate sections
on norm-referenced, domain-referenced and criterion-referenced
interpretation
The two key items apply to all sections
4.1: Checks whether norms, standards, or cut scores are
provided
4.2: Asks in which year or period the data were collected.
An important innovation is the addition of the comment
“the norms are out-dated” when norms are older than 15
years. This warning should alert test users to be careful
interpreting the norms.
5 items refer to norm-referenced interpretation
4.3 (key item): Deals with the size and the
representativeness of the norm group or norm groups.
o Since the first version of the rating system, clear-
cut rules were formulated with respect to the size of
the norm groups.
o For tests intended for making important decisions,
a norm group smaller than 300 is considered
“insufficient,” between 300 and 400 “sufficient,” and
larger than 400 “good.” For tests intended for
making less important decisions, corresponding
group sizes are 200 and 300, respectively. These
rules apply to the “classical norming” approach in which norms are
constructed for separate (age or year) groups (a short sketch further
below applies these rules together with the 15-year rule from item 4.2).
o Concerning the representativeness of the norm
groups, test constructors should at least provide
evidence with respect to age, gender, ethnic group
and region.
4.4: Asks for information on the norm scale used
4.5: Means, standard deviations and other information
with respect to the score distributions
4.6: Differences between various subgroups
4.7: The standard error of measurement, the standard
error of estimate, or the test information function
3 items refer to domain-referenced interpretation
In domain-referenced interpretation, the specific method
chosen, the procedures for determining the cut scores
(4.9) and the training and selection procedure of the
judges (4.10) have to be described.
4.8: Inter-rater agreement with respect to the
determination of the critical score
3 items refer to criterion-referenced interpretation
In criterion-referenced interpretation, cut scores or
expectancy tables are derived from empirical research.
This concerns research on the criterion validity of the test,
which serves to set the norms empirically
4.11: Checks whether the results show sufficient validity
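
To make the distinction between raw and scaled scores more concrete, here is a minimal Python sketch of norm-referenced interpretation. It assumes the norm group can be summarized by just its mean, standard deviation and a reliability coefficient (all numbers and names below are illustrative, not taken from the rating system): a raw score is converted to a z-score, a T-score and a percentile, and the standard error of measurement from item 4.7 is computed as SD × √(1 − reliability).

```python
from math import sqrt
from statistics import NormalDist

# Illustrative norm-group summary (made-up values, not from the rating system)
NORM_MEAN = 50.0     # mean raw score in the reference group
NORM_SD = 10.0       # standard deviation of raw scores in the reference group
RELIABILITY = 0.85   # e.g. an internal-consistency estimate of the test

def norm_referenced_scores(raw_score: float) -> dict:
    """Convert a raw score to scaled scores relative to the norm group."""
    z = (raw_score - NORM_MEAN) / NORM_SD      # standard (z) score
    t = 50 + 10 * z                            # T-score (M = 50, SD = 10)
    percentile = NormalDist().cdf(z) * 100     # percentile rank, assuming normality
    sem = NORM_SD * sqrt(1 - RELIABILITY)      # standard error of measurement (item 4.7)
    return {"z": z, "T": t, "percentile": percentile, "SEM": sem}

# A raw score of 63 lies 1.3 SD above the norm-group mean: T = 63, roughly the 90th percentile.
print(norm_referenced_scores(63))
```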
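
The norm-group size rules from item 4.3 and the 15-year rule from item 4.2 can likewise be written out as a small decision helper. The thresholds are the ones quoted above for the “classical norming” approach; the function names and the handling of values exactly on a boundary are illustrative assumptions, not part of the rating system itself.

```python
from datetime import date

def rate_norm_group_size(n: int, important_decisions: bool) -> str:
    """Norm-group size rules quoted under item 4.3 (classical norming).

    Important decisions: < 300 insufficient, 300-400 sufficient, > 400 good.
    Less important decisions: the corresponding thresholds are 200 and 300.
    """
    lower, upper = (300, 400) if important_decisions else (200, 300)
    if n < lower:
        return "insufficient"
    if n <= upper:
        return "sufficient"
    return "good"

def norms_outdated(year_collected: int, current_year: int = date.today().year) -> bool:
    """Item 4.2: norms older than 15 years get the comment 'the norms are out-dated'."""
    return current_year - year_collected > 15

print(rate_norm_group_size(350, important_decisions=True))  # -> sufficient
print(norms_outdated(2005))  # -> True when run more than 15 years after 2005
```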