Tools for Helping Identify Behavior Disorders: Comparing Bayesian Evidence-Based and Machine Learning Approaches
Abstract
Objective: Disruptive behavior problems such as Oppositional Defiant Disorder (ODD) and Conduct Disorder (CD) are high-burden conditions for which accurate diagnosis is crucial to effective treatment. We evaluated probability nomograms and machine learning (ML) algorithms (logistic regression, LASSO, SVM, random forest) on accuracy, calibration, and generalizability for predicting current ODD or CD diagnoses.

Method: Youth aged 5–18 receiving outpatient care were recruited from academic (n=622) and community (n=506) clinics. Predictors included demographics, comorbid diagnoses, and mood and behavioral indicators. The criterion was K-SADS diagnoses of ODD/CD. Models were trained in the academic sample and externally validated in the community sample. AUC quantified accuracy, Spiegelhalter's z indexed calibration, and cross-site performance measured generalizability.

Results: For ODD prediction, the nomogram selected CBCL Aggression as the strongest predictor and reached AUC≈.70 in both samples. ML models achieved good discrimination (AUC≈.86) with good-to-moderate calibration in the academic sample, but deteriorated to fair accuracy with severe miscalibration in the community sample. For CD prediction, the nomogram built on CBCL Rule-Breaking showed good accuracy (AUC≈.82) in both samples. ML models showed near-excellent accuracy (AUC≈.88) in the academic sample but underperformed in the community sample (AUC=.73–.76), although LASSO, SVM, and random forest maintained good calibration.

Conclusion: Across both disorders, nomograms demonstrated lower accuracy but strong cross-site stability, whereas ML models, which used more predictors, achieved superior accuracy in the academic sample that did not generalize to the community sample. The nomogram offers a low-cost and parsimonious tool. Machine learning is best suited to well-resourced clinics where the model training data closely resemble the new cases encountered in practice.
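For readers unfamiliar with two of the quantities named above, the following minimal Python sketch may help. It is not the authors' code. It shows the odds-form Bayes calculation that a probability nomogram performs graphically, and the standard formula for Spiegelhalter's z as a calibration index. The function names and all input numbers are hypothetical, chosen only for illustration, and are not taken from the study.

```python
import math


def posttest_probability(pretest_prob, likelihood_ratio):
    """Odds-form Bayes update: posttest odds = pretest odds * likelihood ratio.

    Hypothetical helper; this is the arithmetic a probability nomogram
    lets a clinician read off by drawing a straight line.
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


def spiegelhalter_z(outcomes, predicted_probs):
    """Spiegelhalter's z statistic and its two-sided p-value.

    outcomes: 0/1 observed diagnoses; predicted_probs: model probabilities.
    Under perfect calibration z is approximately standard normal, so a
    large |z| suggests miscalibration.
    """
    if len(outcomes) != len(predicted_probs):
        raise ValueError("inputs must have the same length")
    numerator = sum((y - p) * (1.0 - 2.0 * p)
                    for y, p in zip(outcomes, predicted_probs))
    variance = sum((1.0 - 2.0 * p) ** 2 * p * (1.0 - p)
                   for p in predicted_probs)
    if variance == 0.0:
        raise ValueError("z is undefined when every prediction is 0, 0.5, or 1")
    z = numerator / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value


# Illustrative inputs only (not study data): a 20% base rate combined with
# a scale score whose likelihood ratio is assumed to be 4.
print(posttest_probability(0.20, 4.0))

# Illustrative calibration check on six made-up cases.
y_obs = [1, 0, 1, 0, 0, 1]
p_hat = [0.8, 0.2, 0.7, 0.4, 0.1, 0.6]
print(spiegelhalter_z(y_obs, p_hat))
```

The first function needs only a base rate and one likelihood ratio, which is consistent with the abstract's description of the nomogram as parsimonious. The second function needs a full set of predicted probabilities, so it applies to either approach once predictions are available.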