Measuring depression severity in general practice: discriminatory performance of the PHQ-9, HADS-D, and BDI-II

Isobel Mary Cameron; Amanda Cardy; John R Crawford; Schalk W du Toit; Steven Hay; Kenneth Lawton; Kenneth Mitchell; Sumit Sharma; Shilpa Shivaprasad; Sally Winning; Ian C Reid

doi:10.3399/bjgp11X583209

Royal Cornhill Hospital

Research output: Contribution to journal › Article › peer-review

89 Citations (Scopus)

Abstract

Background
The UK Quality and Outcomes Framework (QOF) rewards practices for measuring symptom severity in patients with depression, but the endorsed scales have not been comprehensively validated for this purpose.

Aim
To assess the discriminatory performance of the QOF depression severity measures.

Design and setting
Psychometric assessment in nine Scottish general practices.

Method
Adult primary care patients diagnosed with depression were invited to participate. The HADS-D, PHQ-9, and BDI-II were assessed against the HRSD-17 interview. Discriminatory performance was determined relative to the HRSD-17 cut-offs for symptoms of at least moderate severity, as per criteria set by the American Psychiatric Association (APA) and NICE. Receiver operating characteristic curves were plotted and area under the curve (AUC), sensitivity, specificity, and likelihood ratios (LRs) calculated.
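As a rough illustration of the quantities named in the Method, the sketch below computes sensitivity and specificity at a cut-off and the AUC for a questionnaire score judged against a binary reference standard. It is not the authors' analysis code. The scores and case labels are invented, the function names are hypothetical, and treating a score at or above the cut-off as test-positive is an assumption.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[float], is_case: Sequence[bool],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff counts as positive (assumed rule)."""
    tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    fn = sum(1 for s, c in zip(scores, is_case) if c and s < cutoff)
    tn = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
    fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], is_case: Sequence[bool]) -> float:
    """AUC as the share of (case, non-case) pairs in which the case scores higher.

    Ties count as half a pair; this matches the area under the empirical ROC curve.
    """
    cases = [s for s, c in zip(scores, is_case) if c]
    non_cases = [s for s, c in zip(scores, is_case) if not c]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in cases for b in non_cases)
    return wins / (len(cases) * len(non_cases))


# Invented example data: questionnaire totals and whether the reference
# interview placed the patient at or above the severity threshold.
example_scores = [3, 8, 11, 14, 9, 17, 21, 6]
example_is_case = [False, False, False, True, True, True, True, False]

example_sens, example_spec = sens_spec(example_scores, example_is_case, cutoff=12)
example_auc = auc(example_scores, example_is_case)
```

Repeating `sens_spec` over every candidate cut-off and plotting sensitivity against 1 - specificity gives the ROC curve.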

Results
A total of 267 were recruited per protocol, mean age = 49.8 years (standard deviation [SD] = 14.1), 70% female, mean HRSD-17 = 12.6 (SD = 7.62, range = 0-34). For APA criteria, AUCs were: HADS-D = 0.84; PHQ-9 = 0.90; and BDI-II = 0.86. Optimal sensitivity and specificity were reached where HADS-D = 9 (74%, 76%); PHQ-9 = 12 (77%, 79%); and BDI-II = 23 (74%, 75%). For NICE criteria, AUCs were: HADS-D = 0.89; PHQ-9 = 0.93; and BDI-II = 0.90. Optimal sensitivity and specificity were reached where HADS-D = 10 (82%, 75%); PHQ-9 = 15 (89%, 83%); and BDI-II = 28 (83%, 80%). LRs did not provide evidence of sufficient accuracy for clinical use.
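To show how likelihood ratios follow from the figures above, this sketch derives LR+ and LR- from the sensitivity and specificity pairs quoted in the Results. The inputs are the rounded percentages from this abstract, so the outputs are approximations and may differ from the LRs reported in the full paper. The function and variable names are hypothetical.

```python
from typing import Dict, Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) using the standard definitions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# (criteria, scale) -> (sensitivity, specificity), copied from the abstract.
reported: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("APA", "HADS-D"): (0.74, 0.76),
    ("APA", "PHQ-9"): (0.77, 0.79),
    ("APA", "BDI-II"): (0.74, 0.75),
    ("NICE", "HADS-D"): (0.82, 0.75),
    ("NICE", "PHQ-9"): (0.89, 0.83),
    ("NICE", "BDI-II"): (0.83, 0.80),
}

derived = {key: likelihood_ratios(*pair) for key, pair in reported.items()}

for (criteria, scale), (lr_pos, lr_neg) in derived.items():
    print(f"{criteria:4s} {scale:6s} LR+ = {lr_pos:.2f}  LR- = {lr_neg:.2f}")
```

A widely used rule of thumb treats LR+ above 10 or LR- below 0.1 as strong diagnostic evidence. Whether the authors applied that benchmark is an assumption here, but none of the values derived this way reaches it.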

Conclusion
As selecting treatment according to depression severity is informed by an evidence base derived from trials using HRSD-17, and none of the measures tested aligned adequately with that tool, they are inappropriate for use.

Original language	English
Pages (from-to)	e419-e426
Number of pages	8
Journal	The British Journal of General Practice
Volume	61
Issue number	588
DOIs	https://doi.org/10.3399/bjgp11X583209
Publication status	Published - 1 Jul 2011

Keywords

depression
primary care
sensitivity
severity
specificity

Depressive Disorder Research
Ian Reid (Coordinator), Isobel Cameron (Coordinator), Kenneth Lawton (Coordinator) & John Crawford (Coordinator)
Impact

Cite this

@article{666fc5001c364361aa3b52dbdd6d966f,

title = "Measuring depression severity in general practice: discriminatory performance of the PHQ-9, HADS-D, and BDI-II",

abstract = "Background The UK Quality and Outcomes Framework (QOF) rewards practices for measuring symptom severity in patients with depression, but the endorsed scales have not been comprehensively validated for this purpose. Aim To assess the discriminatory performance of the QOF depression severity measures. Design and setting Psychometric assessment in nine Scottish general practices. Method Adult primary care patients diagnosed with depression were invited to participate. The HADS-D, PHQ-9, and BDI-II were assessed against the HRSD-17 interview. Discriminatory performance was determined relative to the HRSD-17 cut-offs for symptoms of at least moderate severity, as per criteria set by the American Psychiatric Association (APA) and NICE. Receiver operating characteristic curves were plotted and area under the curve (AUC), sensitivity, specificity, and likelihood ratios (LRs) calculated. Results A total of 267 were recruited per protocol, mean age = 49.8 years (standard deviation [SD] = 14.1), 70% female, mean HRSD-17=12.6 (SD = 7.62, range = 0-34). For APA criteria, AUCs were: HADS-D = 0.84; PHQ-9 = 0.90; and BDI-II = 0.86. Optimal sensitivity and specificity were reached where HADS-D =9 (74%, 76%); PHQ-9 =12 (77%, 79%), and BDI-II =23 (74%, 75%). For NICE criteria: HADS-D AUC = 0.89; PHQ-9 AUC = 0.93; and BDI-II AUC = 0.90. Optimal sensitivity and specificity were reached where HADS-D =10 (82%, 75%), PHQ-9 =15 (89%, 83%), and BDI-II =28 (83%, 80%). LRs did not provide evidence of sufficient accuracy for clinical use. Conclusion As selecting treatment according to depression severity is informed by an evidence base derived from trials using HRSD-17, and none of the measures tested aligned adequately with that tool, they are inappropriate for use. ",

keywords = "depression, primary care, sensitivity, severity, specificity",

author = "Cameron, {Isobel Mary} and Amanda Cardy and Crawford, {John R} and {du Toit}, {Schalk W} and Steven Hay and Kenneth Lawton and Kenneth Mitchell and Sumit Sharma and Shilpa Shivaprasad and Sally Winning and Reid, {Ian C}",

year = "2011",

month = jul,

day = "1",

doi = "10.3399/bjgp11X583209",

language = "English",

volume = "61",

pages = "e419--e426",

journal = "The British Journal of General Practice",

issn = "0960-1643",

publisher = "Royal College of General Practitioners",

number = "588",

}

TY - JOUR

T1 - Measuring depression severity in general practice: discriminatory performance of the PHQ-9, HADS-D, and BDI-II

AU - Cameron, Isobel Mary

AU - Cardy, Amanda

AU - Crawford, John R

AU - du Toit, Schalk W

AU - Hay, Steven

AU - Lawton, Kenneth

AU - Mitchell, Kenneth

AU - Sharma, Sumit

AU - Shivaprasad, Shilpa

AU - Winning, Sally

AU - Reid, Ian C

PY - 2011/7/1

Y1 - 2011/7/1

N2 - Background The UK Quality and Outcomes Framework (QOF) rewards practices for measuring symptom severity in patients with depression, but the endorsed scales have not been comprehensively validated for this purpose. Aim To assess the discriminatory performance of the QOF depression severity measures. Design and setting Psychometric assessment in nine Scottish general practices. Method Adult primary care patients diagnosed with depression were invited to participate. The HADS-D, PHQ-9, and BDI-II were assessed against the HRSD-17 interview. Discriminatory performance was determined relative to the HRSD-17 cut-offs for symptoms of at least moderate severity, as per criteria set by the American Psychiatric Association (APA) and NICE. Receiver operating characteristic curves were plotted and area under the curve (AUC), sensitivity, specificity, and likelihood ratios (LRs) calculated. Results A total of 267 were recruited per protocol, mean age = 49.8 years (standard deviation [SD] = 14.1), 70% female, mean HRSD-17=12.6 (SD = 7.62, range = 0-34). For APA criteria, AUCs were: HADS-D = 0.84; PHQ-9 = 0.90; and BDI-II = 0.86. Optimal sensitivity and specificity were reached where HADS-D =9 (74%, 76%); PHQ-9 =12 (77%, 79%), and BDI-II =23 (74%, 75%). For NICE criteria: HADS-D AUC = 0.89; PHQ-9 AUC = 0.93; and BDI-II AUC = 0.90. Optimal sensitivity and specificity were reached where HADS-D =10 (82%, 75%), PHQ-9 =15 (89%, 83%), and BDI-II =28 (83%, 80%). LRs did not provide evidence of sufficient accuracy for clinical use. Conclusion As selecting treatment according to depression severity is informed by an evidence base derived from trials using HRSD-17, and none of the measures tested aligned adequately with that tool, they are inappropriate for use.

KW - depression

KW - primary care

KW - sensitivity

KW - severity

KW - specificity

U2 - 10.3399/bjgp11X583209

DO - 10.3399/bjgp11X583209

M3 - Article

SN - 0960-1643

VL - 61

SP - e419

EP - e426

JO - The British Journal of General Practice

JF - The British Journal of General Practice

IS - 588

ER -
