Task 1. A difficult patient
Learning goals
How can I interpret the Beck Depression Inventory (BDI-II) and its cut off scores?
Beck’s Depression Inventory (1978)
Interpreting the Beck Depression Inventory
After completing the questionnaire, add up the scores for the 21 questions by
counting the number circled for each question. The highest possible total for the whole test
is 63, which means that number 3 was circled on all questions. Since the lowest possible score
for each question is 0, the lowest possible total for the test is 0. Depression can be
evaluated as follows:
Total Score    Levels of Depression
1-10           These ups and downs are considered normal
11-16          Mild mood disturbance
17-20          Borderline clinical depression
21-30          Moderate depression
31-40          Severe depression
over 40        Extreme depression
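The scoring rule and the table above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the example answers are invented, and because the table starts at 1, placing a total of 0 in the lowest band is an assumption.

```python
from typing import Sequence

# Bands copied from the table above as (lower, upper, label).
# Assumption: a total of 0 is not listed in the table; it is placed
# in the lowest band here.
BANDS = [
    (0, 10, "These ups and downs are considered normal"),
    (11, 16, "Mild mood disturbance"),
    (17, 20, "Borderline clinical depression"),
    (21, 30, "Moderate depression"),
    (31, 40, "Severe depression"),
    (41, 63, "Extreme depression"),
]


def total_score(responses: Sequence[int]) -> int:
    """Sum the 21 item responses, each of which must be 0, 1, 2 or 3."""
    if len(responses) != 21:
        raise ValueError("expected 21 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2 or 3")
    return sum(responses)


def depression_level(total: int) -> str:
    """Look up the band of the table that contains the total score."""
    for lower, upper, label in BANDS:
        if lower <= total <= upper:
            return label
    raise ValueError("total must be between 0 and 63")


if __name__ == "__main__":
    # Hypothetical answers: fifteen items scored 1, three scored 2, three scored 0.
    answers = [1] * 15 + [2] * 3 + [0] * 3
    total = total_score(answers)
    print(total, depression_level(total))
```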
Von Glischinski M, von Brachel R, Hirschfeld G. (2019). How depressed is “depressed”? A
systematic review and diagnostic meta-analysis of optimal cut points for the Beck Depression
Inventory revised (BDI-II). 1111-1118
Abstract
The Beck Depression Inventory revised (BDI-II) is a widely used tool to screen for depression.
Method - We identified 27 studies that tried to identify optimal cut points for the BDI-II.
Study quality was assessed using QUADAS criteria. Cut points and their variability were
analyzed descriptively, via simulation and synthesized with a diagnostic meta-analysis.
Analysis was performed on all studies and subgroups based on the setting (psychiatric,
somatic, healthy).
Results - Cut points identified as optimal ranged from 10 to 25 across all studies.
Simulation-based estimations of the variability inherent in studies show that much of the
between-study differences may be attributed to random fluctuations. Diagnostic meta-analysis
across all studies revealed that a cut point of 14.5 (95% CI 12.75–16.44) is optimal, yielding a
sensitivity of 0.86 and a specificity of 0.78. Analyses within the different settings suggest
using sample-specific cut points, specifically 18.18 in psychiatric settings and 12.9 in primary
care settings and healthy populations.
Conclusion - Most studies aimed at determining optimal cut points fail to acknowledge that
reported results are only estimates and subject to random fluctuations resulting in
conflicting recommendations for practitioners. Taking into account these fluctuations, we
find that practitioners should use different cut points to screen for depression in primary
care and healthy populations (a score of 13 and higher indicates depression) and psychiatric
settings (a score of 19 and higher indicates depression). Methods to describe this variability
and meta-analysis to synthesize findings across studies should be used more widely.
Introduction
Major depressive disorder (MDD) is a pervasive, common condition affecting people of all
ages and races. It significantly impairs mental, physical, and social functioning; is highly
comorbid with other mental disorders; and is associated with enormous economic costs. A
substantial number of patients with MDD remain undiagnosed and do not receive proper
treatment. The best available test for the detection of depressive symptoms is a full
psychiatric interview, carried out by a health professional. However, in clinical practice this
method is often too time-consuming and therefore uneconomical. Many standardized
self-report measures have been developed to grade the severity of depressive symptoms and to
identify individuals with MDD. A widely used measure of depressive symptoms is the Beck
Depression Inventory revised (BDI-II). Reviews agree that the BDI-II exhibits excellent
psychometric performance and is able to classify patients as “depressed” vs. “non-
depressed”. However, they also noted that different cut points emerged as optimal in
different samples. Cut points are not only used to classify patients in clinical practice, but are
also used to classify patients’ outcomes in clinical trials (e.g. by defining all patients as
remitted who attain a post-score below 10). Methods to establish optimal cut points require
that both the BDI-II and a reference test (e.g. structured clinical interviews) are
administered to all study participants. Once both scores are available,
receiver-operating-characteristic (ROC)-based methods are used to determine the “optimal” cut
point. The cut point with the highest Youden index (sensitivity + specificity - 1), and
therefore the highest sum of sensitivity and specificity, is identified as optimal.
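As a rough illustration of this ROC-based selection, the sketch below tries every possible BDI-II cut point and keeps the one with the highest Youden index. The function names and the example scores are hypothetical, a score at or above the cut is treated as a positive screen, and ties are resolved in favour of the lowest cut point, which is an assumption rather than something the study specifies.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[int], depressed: Sequence[bool], cut: int) -> Tuple[float, float]:
    """Sensitivity and specificity when a score >= cut counts as a positive screen."""
    tp = sum(1 for s, d in zip(scores, depressed) if d and s >= cut)
    fn = sum(1 for s, d in zip(scores, depressed) if d and s < cut)
    tn = sum(1 for s, d in zip(scores, depressed) if not d and s < cut)
    fp = sum(1 for s, d in zip(scores, depressed) if not d and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)


def optimal_cut(scores: Sequence[int], depressed: Sequence[bool]) -> Tuple[int, float]:
    """Return the cut point (0..63) with the highest Youden index and that index."""
    best_cut, best_j = 0, -2.0
    for cut in range(64):
        sens, spec = sens_spec(scores, depressed, cut)
        j = sens + spec - 1  # Youden index
        if j > best_j:  # strict comparison: the lowest cut wins ties
            best_cut, best_j = cut, j
    return best_cut, best_j


if __name__ == "__main__":
    # Hypothetical data: BDI-II totals and the reference-test result for each person.
    scores = [20, 25, 14, 30, 16, 11, 5, 8, 12, 15, 9, 13, 17]
    depressed = [True] * 6 + [False] * 7
    print(optimal_cut(scores, depressed))  # cut point 14 for this toy data
```

Both groups must contain at least one person, otherwise sensitivity or specificity is undefined and the division fails.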
The aims of this study are: (1) to describe the study quality of studies that used ROC-based
methods to determine cut points for the BDI-II, (2) to describe the proposed cut points and
the level of variability inherent to their estimation, and (3) to synthesize the existing studies
using diagnostic meta-analysis.
Methods
Optimal cut points were defined as those cut points which maximized the Youden index
(sensitivity + specificity - 1, which is equivalent to maximizing the sum of sensitivity and
specificity). First, the reported sample size, mean and standard deviation of participants
with and without depression were used to generate simulated samples and calculate the
optimal cut point for each. From the 5,000 calculated optimal cut points for each study, we
calculated the median cut point and the 2.5% and 97.5% quantiles. Second, we performed a
meta-analysis of the optimal cut points using the multiple cut point model to obtain
pooled estimates for sensitivity and specificity, the summary ROC curve and the optimal cut
point. This method has the advantage of including multiple cut points per study and
accounting for study heterogeneity, thereby improving precision.
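The simulation step can be sketched as follows. This is not the authors' code: drawing scores from a normal distribution, rounding them and clamping them to the 0-63 range are all assumptions, the group sizes, means and standard deviations in the example are invented, and the function names are hypothetical.

```python
import random
import statistics


def best_cut(dep, non):
    """Integer cut (0..63) maximising sensitivity + specificity; score >= cut is positive."""
    dep_hist = [0] * 64
    non_hist = [0] * 64
    for s in dep:
        dep_hist[s] += 1
    for s in non:
        non_hist[s] += 1
    best, best_sum = 0, -1.0
    dep_at_or_above = len(dep)  # at cut 0 every depressed person screens positive
    non_below = 0               # at cut 0 nobody screens negative
    for cut in range(64):
        total = dep_at_or_above / len(dep) + non_below / len(non)
        if total > best_sum:  # strict comparison: the lowest cut wins ties
            best, best_sum = cut, total
        dep_at_or_above -= dep_hist[cut]
        non_below += non_hist[cut]
    return best


def draw_scores(rng, n, mean, sd):
    """Assumption: normally distributed scores, rounded and clamped to 0..63."""
    return [min(63, max(0, round(rng.gauss(mean, sd)))) for _ in range(n)]


def simulate_cut_points(n_dep, mean_dep, sd_dep, n_non, mean_non, sd_non,
                        reps=5000, seed=0):
    """Median and 2.5% / 97.5% quantiles of the optimal cut over simulated samples."""
    rng = random.Random(seed)
    cuts = [
        best_cut(draw_scores(rng, n_dep, mean_dep, sd_dep),
                 draw_scores(rng, n_non, mean_non, sd_non))
        for _ in range(reps)
    ]
    q = statistics.quantiles(cuts, n=40, method="inclusive")  # steps of 2.5%
    return statistics.median(cuts), q[0], q[-1]


if __name__ == "__main__":
    # Hypothetical study: 50 depressed (mean 25, SD 10), 150 non-depressed (mean 9, SD 7).
    print(simulate_cut_points(50, 25.0, 10.0, 150, 9.0, 7.0, reps=500))
```

The width of the interval between the two quantiles gives a feel for how much an "optimal" cut point from a single study of this size can move through chance alone.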
Results
Literature search
Twenty-seven articles met the inclusion criteria and were included in the analyses. Overall,
these studies included 2044 participants who were classified as “depressed” using clinical
criteria and 8979 participants classified as “not depressed.”
Study quality
All studies showed low risk of bias with regard to patient selection and index testing; in most
cases, the reference standard was determined by a clinical interview; the exclusion of study
participants was always explained in sufficient detail; and information about participant
demographic characteristics (e.g. age, gender), their medical/psychiatric condition, and
recruitment setting were provided. However, there are some concerns: the order of testing
(BDI-II or reference standard first) was sometimes described only vaguely; results of the
BDI-II might not always have been interpreted without knowledge of the reference standard;
and it was not always clear whether all participants received the same reference standard.
This happens when
researchers rely on a known-groups validation approach. Furthermore, there were many
differences concerning sample characteristics (e.g., healthy, somatic, or psychiatric sample)
and recruitment setting (e.g., outpatient or inpatient) between studies, which might be a
source of systematic bias.
Cut point in included studies
The original authors of the BDI-II recommended the following rules for the interpretation of
their instrument: a score of 0-13 indicates minimal or no depression; 14-19, mild depression;
20-28, moderate depression; and a score of 29-63 indicates severe depression. These cut
points might vary, depending on the type of sample and study purpose. The included studies
which determined cut points for the BDI-II mostly recruited samples in outpatient settings
(55%) and consisted of English-speaking (74%) medical patients (59%). The remaining studies
comprised several psychiatric (26%) and a few non-clinical (15%) samples. Overall, these
27 studies identified cut points between 10 and 25 as optimal, with a median and mode of 16.
Also, when looking at the three sample types separately, the optimal cut points still varied,
with ranges of 10-22 for healthy, 7-22 for somatic, and 13.5-25 for psychiatric samples.
Random fluctuations seem to be the best explanation for the differences in optimal cut
points between studies. However, taking into account these random fluctuations and
constructing confidence intervals, one can estimate a true optimal cut point. For somatic
samples, a cut point of 14 is contained in all but one of the ranges and would thus be
consistent with the studies.
Diagnostic meta-analysis of cut points
The meta-analysis across all samples showed that the optimal cut point was 14.48, with an
associated sensitivity of 0.86 and specificity of 0.78. Separate meta-analyses on studies from
the three different samples showed the following:
- For psychiatric patients, a cut point of 18.18 would be optimal, yielding a sensitivity of
0.87 and a specificity of 0.77.
- For somatic samples, a cut point of 12.48 would be optimal, yielding a sensitivity of 0.88
and a specificity of 0.79.
- For healthy samples, a cut point of 14.06 would result in a sensitivity of 0.79 and a
specificity of 0.76.
Due to the highly similar and overlapping cut points in the latter two groups, we combined
them and recalculated the meta-analysis. This resulted in an optimal cut point of 12.93,
associated with a sensitivity of 0.86 and a specificity of 0.78.
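In practice, the abstract rounds these pooled cut points up to whole scores: 19 and higher in psychiatric settings, 13 and higher in primary care and healthy populations. A minimal sketch of that screening rule is shown below; the setting labels and the function name are hypothetical, and a positive screen is only an indication of depression, not a diagnosis.

```python
# Rounded cut points taken from the abstract; the dictionary keys are
# hypothetical labels chosen for this sketch.
CUT_POINTS = {
    "psychiatric": 19,
    "primary_care": 13,
    "healthy": 13,
}


def screens_positive(total: int, setting: str) -> bool:
    """True if the BDI-II total reaches the cut point for the given setting."""
    if not 0 <= total <= 63:
        raise ValueError("BDI-II total must be between 0 and 63")
    if setting not in CUT_POINTS:
        raise ValueError(f"unknown setting: {setting!r}")
    return total >= CUT_POINTS[setting]


if __name__ == "__main__":
    # The same total of 15 screens positive in primary care but not in psychiatry.
    print(screens_positive(15, "primary_care"))
    print(screens_positive(15, "psychiatric"))
```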
Sensitivity: the proportion of people who actually have depression that the test correctly
identifies. Sensitivity of 0.86: out of 100 people who have depression, 86 would be correctly
identified by the test. The other 14 people do have depression, but are missed by the test.
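The arithmetic behind sensitivity, and its counterpart specificity, can be written out as a short sketch. The counts below are hypothetical groups of 100 people, chosen only to reproduce the pooled values of 0.86 and 0.78.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people WITH depression whom the test flags as depressed."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people WITHOUT depression whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical: 100 people with depression, 86 flagged and 14 missed.
    print(sensitivity(true_pos=86, false_neg=14))
    # Hypothetical: 100 people without depression, 78 cleared and 22 wrongly flagged.
    print(specificity(true_neg=78, false_pos=22))
```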