Medical Diagnosis Coding Automation: Similarity Search vs. Generative AI

Abstract

Objective

This study aims to predict ICD-10-CM codes for medical diagnoses from short diagnosis descriptions and to compare two distinct approaches: similarity search and a generative model with few-shot learning.

Materials and Methods

The text-embedding-ada-002 model was used to embed textual descriptions of the 2023 ICD-10-CM diagnosis codes provided by the Centers for Medicare & Medicaid Services. GPT-4 was applied with few-shot learning. Both models were tested on 666 data points from the eICU Collaborative Research Database.
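
The similarity-search approach can be illustrated with a minimal sketch. It assumes the code descriptions and the query diagnosis have already been embedded as vectors and that candidates are ranked by cosine similarity; the abstract does not state the similarity measure or the number of candidates retrieved, so both are assumptions here. The function name `top_k_codes`, the three-dimensional toy vectors, and the example codes are hypothetical and stand in for real text-embedding-ada-002 embeddings.

```python
import numpy as np

def top_k_codes(query_vec, code_vecs, codes, k=3):
    """Return the k codes whose embeddings are most cosine-similar to the query.

    Hypothetical helper: the study's actual retrieval setup is not
    described in the abstract.
    """
    # Normalize so that a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    m = code_vecs / np.linalg.norm(code_vecs, axis=1, keepdims=True)
    sims = m @ q
    # Indices of the k highest similarities, best first.
    order = np.argsort(-sims)[:k]
    return [(codes[i], float(sims[i])) for i in order]

# Toy stand-ins for embedded ICD-10-CM code descriptions (illustrative only).
codes = ["I10", "E11.9", "J18.9"]
code_vecs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Toy stand-in for an embedded diagnosis description.
query_vec = np.array([0.9, 0.1, 0.0])

result = top_k_codes(query_vec, code_vecs, codes, k=2)
print(result)
```

In a real pipeline the vectors would come from the embedding model and the code list would cover the full ICD-10-CM release; a vector index would typically replace the brute-force matrix product.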

Results

The text-embedding-ada-002 model successfully identified the relevant code from a set of similar codes 80% of the time, while GPT-4 achieved 50% accuracy in predicting the correct code.
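
The two figures appear to measure different things: whether the correct code occurs anywhere in a retrieved set of candidates, versus whether a single predicted code is exactly right. The sketch below shows how the two metrics could be computed, assuming that reading of the results. The function names and all data are hypothetical toy values, not the study's data.

```python
def hit_rate_at_k(candidate_sets, gold):
    """Fraction of cases where the gold code appears among the retrieved candidates."""
    hits = sum(g in cands for cands, g in zip(candidate_sets, gold))
    return hits / len(gold)

def exact_match_accuracy(predictions, gold):
    """Fraction of cases where the single predicted code equals the gold code."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Toy gold labels (illustrative codes only).
gold = ["I10", "E11.9", "J18.9", "N17.9"]

# Toy candidate sets, as a similarity search might return them.
candidate_sets = [
    ["I10", "I11.9"],
    ["E11.65", "E11.9"],
    ["J15.9", "J12.9"],
    ["N17.9", "N18.9"],
]

# Toy single-code predictions, as a generative model might return them.
predictions = ["I10", "E11.65", "J18.9", "N18.9"]

retrieval_score = hit_rate_at_k(candidate_sets, gold)
generation_score = exact_match_accuracy(predictions, gold)
print(retrieval_score, generation_score)
```

Because a hit among several candidates is an easier target than a single exact prediction, the two numbers are not directly comparable unless the candidate set size is taken into account.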

Discussion

The results suggest that text-embedding-ada-002 could automate medical coding better than GPT-4, and they point to potential limitations of generative language models for complex tasks such as this one.

Conclusion

The research shows that text-embedding-ada-002 outperforms GPT-4 in medical coding, highlighting embedding models’ usefulness in the domain of medical coding.
