Medical Diagnosis Coding Automation: Similarity Search vs. Generative AI
Abstract
Objective
This study aims to predict ICD-10-CM codes for medical diagnoses from short diagnosis descriptions and to compare two distinct approaches: similarity search and a generative model with few-shot learning.
Materials and Methods
The text-embedding-ada-002 model was used to embed the textual descriptions of the 2023 ICD-10-CM diagnosis codes provided by the Centers for Medicare & Medicaid Services. GPT-4 was used with few-shot learning. Both models were tested on 666 data points from the eICU Collaborative Research Database.
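The similarity-search step can be illustrated with a minimal sketch. It assumes each code description and each diagnosis description has already been turned into an embedding vector; the three-dimensional vectors below are made-up stand-ins for real text-embedding-ada-002 embeddings, and the helper names (cosine_similarity, top_k_codes) are hypothetical, not taken from the study. Cosine similarity is also an assumption here, since the abstract does not name the similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_codes(query_vec, code_vecs, k=3):
    """Return the k (code, score) pairs most similar to the query vector."""
    scored = [
        (code, cosine_similarity(query_vec, vec))
        for code, vec in code_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors (assumption): real embeddings would come from an embedding model.
CODE_VECS = {
    "I10": [0.9, 0.1, 0.0],    # Essential (primary) hypertension
    "E11.9": [0.1, 0.9, 0.1],  # Type 2 diabetes mellitus without complications
    "J18.9": [0.0, 0.2, 0.9],  # Pneumonia, unspecified organism
}

# Toy embedding of a short diagnosis description (assumption).
query = [0.8, 0.2, 0.1]
ranked = top_k_codes(query, CODE_VECS, k=2)
print(ranked)
```

In this sketch the search returns a short ranked list of candidate codes rather than a single prediction.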
Results
The text-embedding-ada-002 model successfully identified the relevant code from a set of similar codes 80% of the time, while GPT-4 achieved 50% accuracy in predicting the correct code.
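The two figures measure slightly different things: whether the relevant code appears in a set of retrieved candidates, versus whether a single predicted code is correct. A minimal sketch of how such scores might be computed follows. The function names and the toy data are hypothetical, and the abstract does not specify the exact scoring procedure, so this is an illustration under those assumptions rather than the study's evaluation code.

```python
def hit_rate_at_k(candidate_lists, gold):
    """Fraction of cases where the gold code appears among the candidates."""
    hits = sum(g in cands for cands, g in zip(candidate_lists, gold))
    return hits / len(gold)


def exact_match_accuracy(predictions, gold):
    """Fraction of cases where the single predicted code equals the gold code."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Toy evaluation data (assumption): four diagnoses with known gold codes.
gold = ["I10", "E11.9", "J18.9", "I10"]

# Candidate sets as a similarity search might return them.
candidates = [
    ["I10", "E11.9"],
    ["J18.9", "E11.9"],
    ["I10", "E11.9"],
    ["E11.9", "J18.9"],
]

# Single predictions as a generative model might return them.
predictions = ["I10", "J18.9", "J18.9", "E11.9"]

print(hit_rate_at_k(candidates, gold))
print(exact_match_accuracy(predictions, gold))
```

Because a candidate set gives the method several chances to include the right code, a hit rate and a single-prediction accuracy are not directly interchangeable and are best reported side by side.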
Discussion
The work implies that text-embedding-ada-002 could automate medical coding better than GPT-4, highlighting potential limitations of generative language models for complicated tasks like this.
Conclusion
The research shows that text-embedding-ada-002 outperforms GPT-4 in medical coding, highlighting the usefulness of embedding models in this domain.