Multimodal Physiological Assessment for Clinical Competency Classification in Simulation-Based Medical Education: A Machine Learning Approach
Abstract
Background
Medical errors remain a leading cause of preventable harm, yet current competency assessments often rely on subjective evaluations that overlook critical performance indicators, particularly learners' responses to clinical stress. Although physiological stress markers have been linked to performance outcomes, no widely adopted or scalable framework has integrated these biomarkers with performance data to identify learners requiring additional training before real-world practice.

Methods
This prospective observational study developed machine learning models to classify clinical competency using multimodal data from healthcare learners. Data were collected from 152 learners (74 Emergency Medicine residents, 70 Anesthesiology residents, 8 Emergency Medical Services students) across 470 high-fidelity simulation scenarios. A multimodal assessment platform synchronized physiological signals (electrodermal activity, heart rate, skin temperature) from Empatica E4 wristbands with expert evaluations. A genetic algorithm was used for feature selection, and neural network models were evaluated using multiple leave-N-out strategies to assess generalizability across learners and scenarios.

Results
The neural network achieved 84–85% balanced accuracy across thresholds of 0.45–0.70, with sensitivity of 93.3–95.4% and specificity of 72.9–76.2%. Despite class imbalance (80.6% competent, 19.4% novice), performance remained robust, with Matthews correlation coefficients (MCC) of 0.687–0.706 and precision–recall area-under-the-curve (PR-AUC) values of 0.969–0.970 across thresholds.

Conclusions
This study demonstrates that integrating physiological metrics with machine learning supports objective, data-driven competency assessment. By capturing stress-performance relationships that traditional evaluations often overlook, this framework may provide an early warning system for identifying learners who require additional training and may lay the foundation for more precise, data-informed medical education.
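To make the reported metrics concrete, the following is a minimal sketch of how sensitivity, specificity, balanced accuracy, and the Matthews correlation coefficient follow from a confusion matrix at a single threshold. The function name `classification_metrics` and all counts are hypothetical illustrations, not the study's data or code, and treating the majority ("competent") class as the positive class is an assumption made only for this example.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute threshold-dependent metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not taken from the study.
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Balanced accuracy averages the two rates, so each class counts equally
    # regardless of how many samples it contains.
    balanced_accuracy = (sensitivity + specificity) / 2
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts with an 80/20 class split (assumed, not study data).
model = classification_metrics(tp=380, fn=20, tn=75, fp=25)

# A trivial classifier that labels every sample as the majority class.
trivial = classification_metrics(tp=400, fn=0, tn=0, fp=100)

for name, metrics in (("model", model), ("trivial", trivial)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

In this illustration the trivial classifier still reaches 80% plain accuracy, but its balanced accuracy falls to 0.5 and its MCC to 0. This is why imbalance-aware metrics are informative when one class makes up roughly four fifths of the samples.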