Predictive accuracy of the 2- and 8-item versions of the PHQ in measuring symptoms of depression: Brazilian National Health Survey (PNS) 2013-2019
Abstract
Objective: This study aimed to evaluate the predictive accuracy of the abbreviated 2-item (PHQ-2) and 8-item (PHQ-8) versions of the Patient Health Questionnaire compared to the standard 9-item version (PHQ-9) for depression screening in a nationally representative sample of Brazilian adults. Additionally, we examined sociodemographic and health-related factors associated with depressive symptoms across all three scales. Methods: Using cross-sectional data from 148,733 participants in the 2013 and 2019 Brazilian National Health Survey (PNS), we conducted receiver operating characteristic (ROC) curve analyses to identify optimal cutoffs for the PHQ-8 and PHQ-2, using the PHQ-9 (cutoff ≥10) as the reference standard. We calculated sensitivity, specificity, area under the curve (AUC), Youden index, and weighted kappa statistics. Associations between depressive symptom classifications and sociodemographic and health-related variables were examined using logistic regression models adjusted for the complex sampling design. Results: The PHQ-8 (cutoff ≥10) demonstrated near-perfect agreement with the PHQ-9, with an AUC of 0.982 (95% CI: 0.977–0.987), sensitivity of 96.5%, specificity of 100%, and a weighted kappa of 0.980. For the PHQ-2, a cutoff of ≥3 optimized specificity (96.4%) and overall accuracy (94.7%), yielding moderate sensitivity (79.0%) and substantial agreement (kappa = 0.713; AUC = 0.877, 95% CI: 0.866–0.888). Both abbreviated versions identified similar risk profiles: women had more than twice the odds of depressive symptoms compared to men (OR = 2.61, 95% CI: 2.44–2.81), while individuals with chronic diseases (OR = 4.20, 95% CI: 3.89–4.55) and those with low income (≤½ minimum wage: OR = 1.70, 95% CI: 1.54–1.88) also showed elevated risks. The PHQ-2 slightly overestimated prevalence compared to the PHQ-9 (+1.03%), particularly among rural residents (+1.12%) and tobacco users (+1.70%). Internal consistency was high across all scales: PHQ-9 (α = 0.862), PHQ-8 (α = 0.862), and PHQ-2 (α = 0.708). Conclusion: The PHQ-8 represents a psychometrically equivalent alternative to the PHQ-9 for depression screening in Brazil, while the PHQ-2 serves as a viable and efficient brief screening tool. Both scales maintain consistent epidemiological associations with key sociodemographic and health-related factors, supporting their applicability across diverse clinical and public health settings.
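The cutoff evaluation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `screen_metrics` and `best_cutoff` and the ten toy records are hypothetical, the kappa is the plain (unweighted) Cohen's kappa for a 2×2 table, and the survey weights and complex sampling design used in the study are ignored. The sketch picks the cutoff by Youden index only, whereas the abstract reports that the PHQ-2 cutoff of ≥3 was chosen for specificity and overall accuracy.

```python
from typing import Dict, Iterable, Sequence


def screen_metrics(scores: Sequence[int], reference: Sequence[bool], cutoff: int) -> Dict[str, float]:
    """Compare 'score >= cutoff' against a binary reference classification.

    Assumes both reference classes are present (otherwise a division by zero occurs).
    """
    tp = fp = fn = tn = 0
    for score, ref_positive in zip(scores, reference):
        screen_positive = score >= cutoff
        if screen_positive and ref_positive:
            tp += 1
        elif screen_positive and not ref_positive:
            fp += 1
        elif not screen_positive and ref_positive:
            fn += 1
        else:
            tn += 1

    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    youden = sensitivity + specificity - 1

    # Unweighted Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": youden,
        "accuracy": observed,
        "kappa": kappa,
    }


def best_cutoff(scores: Sequence[int], reference: Sequence[bool], candidates: Iterable[int]) -> int:
    """Return the candidate cutoff with the highest Youden index."""
    return max(candidates, key=lambda c: screen_metrics(scores, reference, c)["youden"])


# Hypothetical toy data: PHQ-2 totals (0-6) and whether PHQ-9 >= 10 for the same person.
phq2 = [0, 1, 1, 2, 3, 3, 3, 4, 5, 6]
phq9_positive = [False, False, False, False, False, True, True, True, True, True]

cutoff = best_cutoff(phq2, phq9_positive, range(1, 7))
print(cutoff, screen_metrics(phq2, phq9_positive, cutoff))
```

On real PNS data, each record would additionally carry a sampling weight, and the counts in the 2×2 table would be replaced by weighted totals.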