Improving Real-Time Prediction of Intradialytic Hypotension through Recurrence Pattern Integration and Evaluation Refinement
Abstract
Background: Real-time prediction of intradialytic hypotension (IDH) using deep learning has shown high accuracy; however, existing models typically treat all IDH events equally, without distinguishing between initial and recurrent occurrences within a dialysis session. This conventional approach neglects the distinct underlying physiological mechanisms and clinical intervention requirements of each occurrence type. This study systematically examined IDH recurrence patterns to evaluate their impact on model performance and to identify methodological improvements.

Methods: We retrospectively analyzed 12,767 hemodialysis sessions from 66 patients. Recurrent IDH was defined as an event occurring ≥30 minutes after the initial IDH. Deep learning models, including ConvMixer, temporal convolutional network (TCN), and long short-term memory (LSTM) with attention, were compared with a rule-based naïve baseline that predicted IDH solely from prior occurrence. Modeling strategies explicitly incorporating recurrence information were implemented. Model robustness across systolic blood pressure (SBP) subgroups was evaluated and enhanced using adversarial training.

Results: The probability of IDH increased markedly from 0.7–10.4% before initial events to 11.7–65.7% thereafter. Conventional evaluation that aggregated all IDH events overestimated performance, with differences of up to 0.389 in F1 score and 0.438 in AUPRC between initial and recurrent predictions. The naïve baseline achieved an AUROC of 0.798 without training, highlighting the strong influence of recurrence patterns on predictive performance. Incorporating recurrence information improved AUROC by 3.3–8.0 percentage points across architectures and substantially narrowed variability between models. ConvMixer achieved the highest and most stable performance, consistently exceeding 0.90 AUROC across all event types and definitions. The combination of recurrence-aware features and loss weighting yielded the largest gains, particularly for recurrent events. Adversarial training further reduced subgroup disparities (e.g., AUROC gap from 0.231 to 0.168) while preserving overall model performance.

Conclusions: Incorporating recurrence patterns into IDH prediction models improves accuracy, robustness, and comparability across studies. We recommend standardized evaluation protocols that explicitly account for recurrence to enhance clinical applicability and reliability.
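
The recurrence definition, the rule-based naïve baseline, and the idea of scoring initial and recurrent events separately can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the per-session list inputs, and the "same_episode" label for events that fall within 30 minutes of the initial one are all assumptions made for illustration; the abstract specifies only the ≥30-minute threshold and that the baseline predicts IDH solely from prior occurrence.

```python
from typing import Dict, List

# From the abstract: a recurrent event occurs >= 30 minutes after the initial IDH.
RECURRENCE_GAP_MIN = 30.0


def label_events(event_minutes: List[float],
                 gap: float = RECURRENCE_GAP_MIN) -> List[str]:
    """Label the IDH events of one session, in chronological order."""
    if not event_minutes:
        return []
    times = sorted(event_minutes)
    first = times[0]
    labels = ["initial"]
    for t in times[1:]:
        # Assumption: events closer than `gap` to the initial event are
        # treated as part of the same episode rather than as recurrences.
        labels.append("recurrent" if t - first >= gap else "same_episode")
    return labels


def naive_baseline(idh_flags: List[int]) -> List[int]:
    """Predict 1 at step i only if IDH was observed at an earlier step."""
    preds, seen = [], False
    for flag in idh_flags:
        preds.append(1 if seen else 0)
        seen = seen or bool(flag)
    return preds


def recall_by_type(event_types: List[str],
                   predicted: List[int]) -> Dict[str, float]:
    """Recall computed per event type instead of pooled over all events."""
    out = {}
    for kind in set(event_types):
        hits = [p for k, p in zip(event_types, predicted) if k == kind]
        out[kind] = sum(hits) / len(hits)
    return out


# Hypothetical session: IDH at minutes 45, 60, and 100.
print(label_events([45, 60, 100]))
# Hypothetical per-step IDH flags for one session.
print(naive_baseline([0, 0, 1, 0, 1, 0]))
```

By construction, such a baseline can never flag an initial event, because nothing precedes it in the session. Any score it earns on pooled events therefore comes entirely from recurrent ones, which is why reporting metrics per event type, as `recall_by_type` does in simplified form, exposes what a pooled metric hides.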