OphthUS-GPT: Multimodal AI for Automated Reporting in Ophthalmic B-Scan Ultrasound
Abstract
IMPORTANCE The rapid advancement of AI in ophthalmology is transforming traditional diagnostic approaches, especially in resource-limited settings. The shortage of specialized ophthalmologists and the lack of standardized reporting in primary care create an urgent need for AI systems capable of both automated diagnostic report generation and interactive clinical decision support.
OBJECTIVE To develop a multimodal artificial intelligence system (OphthUS-GPT) that integrates BLIP and DeepSeek models to enable automated diagnostic report generation and interactive clinical decision support from ophthalmic B-scan ultrasound images.
DESIGN, SETTING, AND PARTICIPANTS This retrospective study was conducted at the Affiliated Eye Hospital of Jiangxi Medical College, Nanchang University, using ophthalmic B-scan ultrasound examination reports collected between June 2017 and March 2024. The study included 54,696 ophthalmic B-scan ultrasound images and 9,392 reports from 31,943 patients with a mean age of 49.14 years, 50.15% of whom were male.
MAIN OUTCOMES AND MEASURES Evaluation comprised two components: assessment of the diagnostic report generation and evaluation of the question-answering system. Diagnostic report generation was evaluated using text generation quality metrics (such as ROUGE-L and CIDEr), disease classification performance metrics (accuracy, sensitivity, specificity, precision, and F1 score), and human assessment by three ophthalmologists for accuracy and completeness on a five-point scale. The question-answering system was evaluated by three ophthalmologists, who rated generated answers on accuracy, completeness, potential harm, and satisfaction.
RESULTS OphthUS-GPT demonstrated excellent performance in diagnostic report generation, with ROUGE-L and CIDEr scores of 0.6131 and 0.9818, respectively. In disease classification, the system achieved accuracy exceeding 90% for common conditions such as vitreous opacities, retinal detachment, posterior scleral staphyloma, and cataracts, with precision greater than 70%. For rarer conditions such as choroidal detachment and phthisis bulbi, diagnostic accuracy reached 0.9893 and 0.9962, respectively, with specificity exceeding 99%. Assessment by three ophthalmologists showed that over 90% of reports scored 3 or higher for correctness, and 96% scored 3 or higher for completeness. In the question-answering system evaluation, the DeepSeek model exhibited excellent performance in accuracy, completeness, potential harm, and satisfaction; it was comparable to GPT-4o and OpenAI-o1 and significantly outperformed the other models.
CONCLUSIONS AND RELEVANCE The multimodal OphthUS-GPT system developed in this study, combining BLIP and DeepSeek models, demonstrated excellent performance in automatically generating diagnostic reports from ophthalmic B-scan ultrasound images and in providing intelligent question-answering capabilities, offering a novel solution for medical imaging diagnosis and clinical decision support.
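As an illustration of the disease classification metrics named under MAIN OUTCOMES AND MEASURES, the following is a minimal Python sketch using the standard definitions of accuracy, sensitivity, specificity, precision, and F1 score for a single disease treated as a one-vs-rest binary label. The function name `classification_metrics` and the counts are hypothetical and chosen for illustration only; they are not taken from the study, and the authors' actual evaluation code is not described in the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives for one disease (one-vs-rest). Returns 0.0 for any
    metric whose denominator is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for a rare condition (illustrative only, not study data):
# 10 true cases among 1,000 examinations.
rare = classification_metrics(tp=6, fp=4, tn=986, fn=4)
for name, value in rare.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts, accuracy is 0.992 and specificity is about 0.996, while sensitivity, precision, and F1 are each 0.6. For rare conditions, true negatives dominate the accuracy figure, so it is usually read alongside the other metrics.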