Investigating Algorithmic Bias in Machine Learning Prediction Models of Suicide Attempts in Multiple Clinical Settings by Race/Ethnicity and Gender
Abstract
Importance: Machine learning models reflect their training data and may thus learn and perpetuate healthcare disparities.

Objective: To evaluate whether the performance of a validated machine learning model predicting suicide attempts from electronic health records (EHRs) varies by race/ethnicity or gender.

Design: In this prognostic study, we re-analyzed previously validated landmark prediction models predicting suicide attempts within 18 months after a healthcare visit. Prediction models were estimated with regularized Cox regression in three cohorts: (1) general outpatient; (2) psychiatric emergency department (ED); and (3) psychiatric inpatient. Model performance (area under the curve [AUC], sensitivity, and positive predictive value [PPV]) was evaluated separately by race/ethnicity and by gender in all three cohorts, and at the intersection of race/ethnicity and gender in the general outpatient cohort.

Setting: EHR data were drawn from the Research Patient Data Registry at Mass General Brigham.

Participants: Individuals ages 15–85 years seen in at least one of three clinical settings between Jan 1, 2016 and Dec 31, 2018: general outpatient (N=1,210,222), psychiatric ED (N=13,098), and psychiatric inpatient (N=7,825).

Main Outcomes and Measures: The primary outcome was suicide attempt, ascertained with validated ICD codes during the 18 months after a randomly sampled "landmark visit" in one of the three settings.

Results: When considering gender alone, models performed consistently better for male than for female patients. When considering race/ethnicity alone, results were equivocal: in the general outpatient cohort, models had a higher AUC for White than for Hispanic patients, whereas in the psychiatric ED, AUC was highest for Asian patients. When considering the intersection of race/ethnicity and gender in the general outpatient cohort, models performed better for White men than for Hispanic and White women across all metrics.
There were also gender differences within racial/ethnic groups, with higher PPV for Black men than for Black women and for Hispanic men than for Hispanic women, suggesting that gender differences largely drove these disparities.

Conclusions and Relevance: We observed modest evidence of disparities in suicide prediction models by gender and limited evidence of disparities by race/ethnicity alone. More consistent patterns of bias emerged at the intersection of race/ethnicity and gender. Future work should replicate these findings in larger, more diverse samples to ensure fair deployment of such models.
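The subgroup evaluation described in the abstract (AUC, sensitivity, and PPV computed within each demographic stratum) can be sketched as follows. This is an illustrative sketch, not the authors' code: the column names, decision threshold, and toy data are all hypothetical assumptions.

```python
# Illustrative sketch of per-subgroup model evaluation (AUC, sensitivity, PPV).
# Assumes a DataFrame with a predicted-risk column and a binary outcome column;
# column names and the 0.5 decision threshold are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

def subgroup_metrics(df, group_col, score_col="risk", label_col="attempt",
                     threshold=0.5):
    """Compute AUC, sensitivity, and PPV within each level of group_col."""
    rows = []
    for group, sub in df.groupby(group_col):
        y, s = sub[label_col], sub[score_col]
        pred = (s >= threshold).astype(int)
        tp = int(((pred == 1) & (y == 1)).sum())
        fp = int(((pred == 1) & (y == 0)).sum())
        fn = int(((pred == 0) & (y == 1)).sum())
        rows.append({
            "group": group,
            # AUC is undefined if a subgroup has only one outcome class.
            "auc": roc_auc_score(y, s) if y.nunique() > 1 else float("nan"),
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        })
    return pd.DataFrame(rows)

# Toy example with hypothetical data:
df = pd.DataFrame({
    "gender":  ["F", "F", "F", "M", "M", "M"],
    "risk":    [0.2, 0.7, 0.6, 0.1, 0.8, 0.9],
    "attempt": [0, 1, 0, 0, 1, 1],
})
print(subgroup_metrics(df, "gender"))
```

Comparing the resulting rows across strata (or across intersectional strata, by grouping on a combined race/ethnicity-gender column) is the kind of comparison the abstract reports.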