Machine learning models for predicting work-related sickness absence due to mental disorders using national surveillance data in Brazil


Abstract

Purpose: To develop and compare supervised machine learning models to predict sickness absence among workers notified with work-related mental disorders in the Brazilian National Notifiable Diseases Information System (SINAN), and to identify the most influential predictors associated with this outcome.

Methods: A cross-sectional study was conducted using SINAN records from 2006–2024. The analytical sample comprised 4,217 workers aged ≥ 18 years with ICD-10 mental or behavioral disorders (F00–F99, Z73.0). Three supervised algorithms (Decision Tree, Random Forest, and Extreme Gradient Boosting [XGBoost]) were trained using an 80/20 stratified split. Performance was evaluated using accuracy, sensitivity, specificity, precision, F1-score, and AUC-ROC, accompanied by 95% confidence intervals. Model interpretability and feature importance were assessed using SHAP values.

Results: The three models exhibited comparable performance, with overlapping 95% CIs. AUC-ROC values ranged from 0.697 (Decision Tree) to 0.745 (XGBoost), and accuracy ranged from 0.665 to 0.691. SHAP analyses identified structural and service-related variables as the primary drivers of prediction, specifically the issuance of work accident reports, referral to psychosocial care centers, geographic region, psychotropic medication use, and employment status.

Conclusions: Supervised machine learning models demonstrated robust predictive capacity and represent promising tools for occupational health surveillance. Predictions within the SINAN context were driven primarily by structural and organizational factors rather than individual characteristics, underscoring the critical role of institutional and territorial determinants in work-related mental health outcomes.
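The performance metrics named in the abstract (accuracy, sensitivity, specificity, precision, F1-score, and AUC-ROC with 95% confidence intervals) can be illustrated with a minimal standard-library sketch. This is not the authors' code. The abstract does not say how the intervals were computed, so the percentile bootstrap used here is an assumption, the function names are hypothetical, and the labels and scores are invented toy values, not SINAN data.

```python
import random

def classification_metrics(y_true, y_pred):
    """Threshold-based metrics computed from binary labels (1 = sickness absence)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }

def auc_roc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(y_true, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, not confirmed by the source)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # resample lacks one class, so AUC is undefined; draw again
        stats.append(metric(yt, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy held-out set: invented labels and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
auc_low, auc_high = bootstrap_ci(y_true, scores, auc_roc)
```

Read this way, overlapping intervals between two models, as reported in the Results, mean the observed AUC difference could plausibly arise from sampling variation alone.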
