Development and Validation of Natural Language Processing Algorithms in the ENACT National Electronic Health Record Research Network

Abstract

Electronic health record (EHR) data are a rich and invaluable source of real-world clinical information, enabling detailed insights into patient populations, treatment outcomes, and healthcare practices. The availability of large volumes of EHR data is critical for advancing translational research and developing innovative technologies such as artificial intelligence. The Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network, established in 2015 with funding from the National Center for Advancing Translational Sciences (NCATS), aims to accelerate translational research by democratizing access to EHR data for all Clinical and Translational Science Awards (CTSA) hub investigators. The ENACT network currently provides access to structured EHR data, enabling cohort discovery and translational research across the network. However, a substantial amount of critical information is contained in clinical narratives, and natural language processing (NLP) is required to extract this information to support research. To address this need, the ENACT NLP Working Group was formed to make NLP-derived clinical information accessible and queryable across the network. This article describes the implementation and deployment of NLP infrastructure across ENACT. First, we describe the formation and goals of the Working Group, the practices and logistics involved in implementation and deployment, and the specific NLP tools and technologies used. Then, we describe how we extended the ENACT ontology to standardize and query NLP-derived data, and how we conducted multisite evaluations of the NLP algorithms. Finally, we reflect on the experience and lessons learnt, which may be useful for other national data networks that are deploying NLP to unlock the potential of clinical text for research.
