Automated phenotyping of ophthalmologic diseases from routine medical records using small language models and the Human Phenotype Ontology (HPO)

Binh Duong Thai
Sebastian Arens
Thomas Reinhard
Daniel Böhringer


Abstract

Automated phenotyping in ophthalmology requires accurate standardization of clinical terms to facilitate interoperability and research. This study evaluates the suitability of the Human Phenotype Ontology (HPO) for automated extraction of ophthalmic phenotypes from narrative documentation. We developed a locally operated AI pipeline that combines text segmentation and negation detection, both based on a small language model (PHI-4), with a dense retrieval approach using an augmented multilingual HPO catalog. Synonyms were incorporated into the HPO during training on anonymized consecutive physician letters. To validate the pipeline, 175 anterior segment and fundus descriptions from randomly selected medical records were manually annotated with HPO terms as ground truth. Overall, 342 HPO terms were identified manually (on average 2.53 terms per document), and 341 were retrieved by the pipeline (on average 2.52 terms per document). Performance metrics showed a median Jaccard similarity of 0.67, precision of 0.83, recall of 0.82, and F1 score of 0.80. These results demonstrate that our AI pipeline effectively extracts standardized HPO terms from free-text ophthalmic findings. Integrating this pipeline into clinical information systems may enhance data interoperability and reduce manual coding workload in ophthalmology practices.
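
The set-based metrics reported above can be illustrated with a minimal sketch. It assumes that each document is scored by comparing its manually annotated set of HPO term IDs against the set returned by the pipeline, and that the per-document scores are then summarized by their median. The abstract does not spell out this aggregation, so the function names `set_metrics` and `median_metrics`, the handling of empty sets, and the example IDs are illustrative assumptions rather than the authors' implementation.

```python
from statistics import median


def set_metrics(gold: set[str], predicted: set[str]) -> dict[str, float]:
    """Compare one document's gold HPO terms with the predicted terms."""
    # Assumption: two empty sets count as perfect agreement.
    if not gold and not predicted:
        return {"jaccard": 1.0, "precision": 1.0, "recall": 1.0, "f1": 1.0}

    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    jaccard = true_positives / len(gold | predicted)
    return {"jaccard": jaccard, "precision": precision, "recall": recall, "f1": f1}


def median_metrics(pairs: list[tuple[set[str], set[str]]]) -> dict[str, float]:
    """Median of each per-document metric across all (gold, predicted) pairs."""
    per_document = [set_metrics(gold, predicted) for gold, predicted in pairs]
    return {
        name: median(scores[name] for scores in per_document)
        for name in ("jaccard", "precision", "recall", "f1")
    }


# Hypothetical annotations for three documents (IDs are placeholders).
documents = [
    ({"HP:0000001", "HP:0000002", "HP:0000003"}, {"HP:0000002", "HP:0000003", "HP:0000004"}),
    ({"HP:0000005"}, {"HP:0000005"}),
    ({"HP:0000006", "HP:0000007"}, {"HP:0000006"}),
]
print(median_metrics(documents))
```

Because the median is taken per metric, the reported median F1 need not equal the harmonic mean of the reported median precision and recall.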

Version published to 10.21203/rs.3.rs-8881215/v1 on Research Square
Mar 30, 2026

Related articles

OncoCITE: Multimodal Multi-Agent Reconstruction of Clinical Oncology Knowledge Bases from Scientific Literature

This article has 6 authors:
1. Mujahid Quidwai
2. Santiago Thibaud
3. Dennis Shasha
4. Sundar Jagannath
5. Samir Parekh
6. Alessandro Laganà
This article has no evaluations. Latest version: Mar 31, 2026

Performance of Vision–Language Models Compared with 252 Medical Students on Text-only and Image-based Dermatology Examinations

This article has 8 authors:
1. Ozan Erdem
2. Abdurrahim Yilmaz
3. Ahmet Sait Sahin
4. Bugra Burc Dagtas
5. Ece Gokyayla
6. Melek Aslan Kayıran
7. Vefa Aslı Erdemir
8. Mehmet Salih Gurel
This article has no evaluations. Latest version: Apr 9, 2026

How are doctors across specialties using commercial large language models? Insights from the Anthropic Economic Index

This article has 4 authors:
1. Izabella Mancewicz
2. Yufei Xu
3. Jeff R. Ma
4. Khoa N. Cao
This article has no evaluations. Latest version: Mar 24, 2026
