InfEHR: Resolving Clinical Uncertainty through Deep Geometric Learning on Electronic Health Records

Abstract

Electronic health records (EHRs) contain multimodal data that can inform diagnostic and prognostic clinical decisions but are often unsuited to advanced machine learning (ML)-based patient-specific analyses. ML models and clinical heuristics learn generalizable relationships from predefined factors, yet many patients may not benefit if those factors are missing from the EHR or if the patients differ, however subtly, from typical training populations. Clinical heuristics are limited to low-complexity, often linear, relationships between clinical variables. ML approaches to EHRs significantly expand pattern sophistication but require large, labeled datasets, which are often unattainable, especially in low-prevalence diseases, and they are limited by sources of random and non-random variation in EHRs. Deep learning (DL), in contrast with ML and clinical heuristics, learns features without predefinition but requires even greater label access for predictions. While DL can construct unsupervised EHR representations, the patterns and characteristics of less prevalent examples are poorly resolved, and downstream clinical applications still require labels. We present InfEHR, a framework that automatically computes clinical likelihoods from the whole EHRs of patients from diverse clinical settings without requiring large volumes of labeled training data. We apply deep geometric learning to EHRs through a novel procedure that converts whole EHRs to temporal graphs. These graphs naturally capture phenotypic temporal dynamics, leading to unbiased representations. Using only a few labeled examples, InfEHR computes and automatically revises likelihoods, leading to highly performant inferences, especially in low-prevalence diseases, which are often the most clinically ambiguous.
To demonstrate utility, we use EHRs from the Mount Sinai Health System and the University of California, Irvine Medical Center and compare InfEHR's performance with physician-provided clinical heuristics across two diseases with no clinical or epidemiological overlap: a rare disease (neonatal culture-negative sepsis) with a prevalence of 2% in neonates, and a more common disease (adult post-operative acute kidney injury) with a prevalence of 22%. We show that InfEHR is superior to existing clinical heuristics both for culture-negative sepsis (sensitivity: 0.65 vs 0.041, specificity: 0.99 vs 0.98) and for post-operative acute kidney injury (sensitivity: 0.72 vs 0.20, specificity: 0.91 vs 0.97). We present the first application of geometric deep learning in EHRs that can be used in real-world clinical settings at scale for improving phenotype identification and resolving clinical uncertainty.
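
For readers less familiar with the metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from confusion-matrix counts, and how they combine with prevalence to give a positive predictive value through Bayes' rule. It is not part of InfEHR; the function names are hypothetical, and the example call assumes that the evaluation cohort has the same prevalence as the 2% stated in the abstract, which the abstract does not confirm.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: diseased patients flagged / missed.
    tn/fp: healthy patients cleared / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of non-cases correctly cleared
    return sensitivity, specificity


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' rule: probability of disease given a positive call."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Inputs taken from the abstract; prevalence in the test cohort is an assumption.
    ppv = positive_predictive_value(sensitivity=0.65, specificity=0.99, prevalence=0.02)
    print(f"Illustrative PPV at 2% prevalence: {ppv:.3f}")
```

The sketch shows why specificity matters so much at low prevalence: with few true cases, even a small false-positive rate among the many healthy patients can dominate the positive calls.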
