Fairness analysis of machine learning predictions of aggression in acute psychiatric care
Abstract
Managing patient aggression is a major challenge in acute psychiatry, and machine learning (ML) applications are increasingly being developed to support individualized risk assessment and de-escalation. However, ML algorithms have been shown to exhibit unfair behavior based on protected characteristics, such as an individual’s sex or ethnicity. This is especially worrying in psychiatric contexts, as social and systemic inequities - such as disparities in access to psychiatric care or racial profiling in admissions to hospital by police - can become embedded in training datasets. Despite the potential for ML algorithms to replicate and amplify such inequities, the fairness of ML-based predictions of aggression in acute psychiatry has received limited investigation. To address this gap, we trained an ML algorithm to predict aggressive incidents from structured electronic health records corresponding to 17,703 patients receiving acute care at a large psychiatric hospital between January 2016 and May 2022 (n = 42,719 observation days). We analyzed predictions for fairness by assessing disparities in false positive rates (FPR) and true positive rates (TPR) (i.e., the equalized odds criterion) based on patient race/ethnicity, gender, admission mode, citizenship, and housing status, as well as intersections of race/ethnicity and gender. The random forest algorithm performed best (ROC-AUC = 0.812). Fairness analyses revealed significant disparities in FPR and TPR across subgroups, such that FPRs were higher for Middle Eastern and Black patients, men, those admitted into emergency care by the police, and those with unstable or supportive forms of housing. Middle Eastern men had the highest FPR of any intersectional group. Our analysis demonstrates the potential for ML algorithms to exhibit unfairness across multiple demographic and social groups in predictions of inpatient aggression, reflecting known social and structural inequities. To prevent the reinforcement and amplification of existing disparities, it will be critical to apply strategies to mitigate unfairness in this context. At the same time, evaluating and exploring unfair ML behavior can reveal unique insights into underlying inequities that might be impacting patient experiences and care.
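To illustrate the equalized odds criterion referenced above, the following is a minimal sketch (not the authors' actual pipeline) of how per-group FPR and TPR could be computed from model predictions. It assumes a pandas DataFrame with hypothetical columns `y_true` (observed aggression), `y_pred` (binary prediction), and a protected-attribute column such as `race_ethnicity`; equalized odds holds when FPR and TPR are approximately equal across the levels of that attribute.

```python
import pandas as pd

def group_rates(df, group_col, y_true_col="y_true", y_pred_col="y_pred"):
    """Compute false positive and true positive rates per subgroup.

    Equalized odds is (approximately) satisfied when FPR and TPR
    are similar across all levels of the protected attribute.
    """
    rows = []
    for level, sub in df.groupby(group_col):
        y_true = sub[y_true_col].astype(bool)
        y_pred = sub[y_pred_col].astype(bool)
        fp = (y_pred & ~y_true).sum()   # predicted positive, actually negative
        tp = (y_pred & y_true).sum()    # predicted positive, actually positive
        negatives = (~y_true).sum()
        positives = y_true.sum()
        rows.append({
            group_col: level,
            "FPR": fp / negatives if negatives else float("nan"),
            "TPR": tp / positives if positives else float("nan"),
            "n": len(sub),
        })
    return pd.DataFrame(rows)

# Example usage with the hypothetical column names assumed above:
# rates = group_rates(predictions_df, group_col="race_ethnicity")
# print(rates.sort_values("FPR", ascending=False))
```

Comparing the resulting FPR and TPR values across rows (e.g., their maximum pairwise gap) gives one simple way to quantify the subgroup disparities the abstract describes.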