A Systematic Process for Assessing Fitness-for-Purpose of Health Outcomes for Computable Phenotyping with Electronic Health Record Data

Abstract

Purpose

Information from electronic health records (EHRs) may be incorporated into computable phenotype algorithms in efforts to overcome inaccuracies of algorithms based on administrative claims data alone. However, such efforts can be resource-intensive and unsuccessful. Assessing the feasibility of computable phenotyping for a health outcome of interest (HOI) before proceeding is therefore recommended.

Methods

We developed a systematic fitness-for-purpose (FFP) assessment process to implement concepts outlined in a previously described general framework for computable phenotyping incorporating EHR data. Our process includes verifying the HOI is well-defined, reviewing clinical information about the HOI, identifying existing algorithms and their performance, evaluating HOI clinical and data complexity, and determining an overall FFP conclusion and recommendation. We applied this process to ten HOIs lacking high-performing claims-based algorithms, selecting HOIs of public health importance that varied in clinical and data complexity, including neutropenia, pericardial effusion and drug-induced liver injury.

Results

HOIs assessed as having moderate (vs. easy) overall difficulty had characteristics such as the need for natural language processing, integration of multiple laboratory test results, or longitudinal EHR data. HOIs assessed as having high difficulty required using data from multiple EHR sources, ruling out many other potential causes, or relying on low-sensitivity diagnostic tests. Input from experts in EHR data and clinical care was crucial.
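
The characteristics listed above can be read as a simple checklist. The following is a minimal illustrative sketch only, not the authors' tool: the class `HoiCharacteristics`, its field names, and the function `overall_difficulty` are hypothetical, and the rule that any single high-difficulty characteristic decides the rating is an assumption. The assessment described here relied on expert judgment rather than a mechanical rule.

```python
from dataclasses import dataclass


@dataclass
class HoiCharacteristics:
    """Hypothetical checklist of the characteristics named in the Results."""
    # Characteristics reported for HOIs of moderate difficulty
    needs_nlp: bool = False
    multiple_lab_results: bool = False
    longitudinal_ehr_data: bool = False
    # Characteristics reported for HOIs of high difficulty
    multiple_ehr_sources: bool = False
    many_causes_to_rule_out: bool = False
    low_sensitivity_tests: bool = False


def overall_difficulty(c: HoiCharacteristics) -> str:
    """Return 'high', 'moderate', or 'easy' (assumed precedence: high first)."""
    if c.multiple_ehr_sources or c.many_causes_to_rule_out or c.low_sensitivity_tests:
        return "high"
    if c.needs_nlp or c.multiple_lab_results or c.longitudinal_ehr_data:
        return "moderate"
    return "easy"


# A made-up profile, not one of the ten HOIs assessed in the study
example = HoiCharacteristics(needs_nlp=True, longitudinal_ehr_data=True)
print(overall_difficulty(example))
```

In practice such a checklist could at most structure the discussion with EHR data and clinical experts; it would not replace their input.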

Conclusion

EHR data have potential to enhance accuracy of defining certain HOIs for research and surveillance compared to administrative claims data. The process and tools we created will support others in assessing FFP of HOIs for computable phenotyping.

Five key points

  • Incorporating electronic health record (EHR) data into computable phenotypes could improve the accuracy of identifying health outcomes of interest (HOIs), but such work can be resource-intensive.

  • We developed a systematic fitness-for-purpose (FFP) process and tools to assess the feasibility of computable phenotyping for HOIs.

  • Steps include identifying existing algorithms and their performance, ensuring the HOI is well-defined, evaluating clinical and data complexity, and determining a feasibility recommendation.

  • Difficulty increased with a need for natural language processing, multiple laboratory tests, longitudinal EHR data, multiple EHR sources or ruling out other potential causes.

  • Input from EHR data and clinical care experts was crucial to the FFP assessment process.

Plain Language Summary (PLS)

Attempts to identify diseases and health conditions by applying computer programs to information easily gleaned from the insurance claims of tens of thousands of patients (as is done in FDA’s ongoing safety monitoring of approved drugs or medical products) are often unsuccessful because the data lack nuance. Incorporating information from electronic health records (EHRs) and patient chart notes may make the identification of health outcomes more accurate. Because this can be resource-intensive, we designed a process and tools to assess the feasibility of including EHR data in computer algorithms that identify health outcomes. Steps included identifying existing algorithms and their performance, building familiarity with the outcome and making sure it is well-defined, evaluating clinical and data complexity, and reaching a conclusion about feasibility. We applied our process to ten health outcomes of public health importance. Health outcomes were considered moderately difficult for computerized algorithms if they required natural language processing, integration of multiple laboratory tests, or EHR data from multiple timepoints. Health outcomes considered highly difficult required using multiple EHR data types, ruling out many alternative causes of the health outcome (other than medications), or relying on diagnostic tests of low accuracy. Input from EHR data and clinical care experts was crucial for the assessment process.
