Leveraging Multimodal Large Language Models to Extract Mechanistic Insights from Biomedical Visuals: A Case Study on COVID-19 and Neurodegenerative Diseases


Abstract

Background

The COVID-19 pandemic has intensified concerns about its long-term neurological impact, with growing evidence linking SARS-CoV-2 infection to neurodegenerative diseases (NDDs) such as Alzheimer’s (AD) and Parkinson’s (PD). Patients with these conditions not only face a higher risk of severe COVID-19 outcomes but may also undergo accelerated cognitive and motor decline following infection. Proposed mechanisms—ranging from neuroinflammation and blood–brain barrier disruption to abnormal protein aggregation—closely mirror core features of neurodegenerative pathology. Yet current knowledge is fragmented across text, figures, and pathway diagrams, hindering its integration into computational models capable of uncovering systemic patterns.

Results

To address this gap, we applied GPT-4 Omni (GPT-4o), a multimodal large language model, to extract mechanistic insights from biomedical figures. Over 10,000 images were retrieved through targeted searches on COVID-19 and neurodegeneration; after automated and manual filtering, a curated subset was analyzed. GPT-4o extracted biological relationships as semantic triples, which were grouped into six mechanistic categories—including microglial activation and barrier disruption—using ontology-guided similarity and assembled into a Neo4j knowledge graph.
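The triple-to-graph step described above can be sketched as follows. This is not the authors' code: the entity names, the `Entity` label, and the `category` property are illustrative placeholders, and the sketch emits Cypher `MERGE` statements (the idiomatic way to build a deduplicated Neo4j graph) rather than calling a live database.

```python
# Sketch (illustrative, not the study's implementation): converting
# GPT-4o-extracted semantic triples into Cypher MERGE statements that
# assemble a Neo4j knowledge graph. MERGE deduplicates nodes and edges
# so repeated extractions collapse into a single graph element.

from dataclasses import dataclass

@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str
    category: str  # one of the six mechanistic categories

def to_cypher(t: Triple) -> str:
    """Emit one Cypher statement merging both entities and their relationship."""
    rel = t.predicate.upper().replace(" ", "_")  # e.g. "activates" -> ACTIVATES
    return (
        f'MERGE (a:Entity {{name: "{t.subject}"}}) '
        f'MERGE (b:Entity {{name: "{t.obj}"}}) '
        f'MERGE (a)-[:{rel} {{category: "{t.category}"}}]->(b)'
    )

# Example triples (hypothetical, for illustration only)
triples = [
    Triple("SARS-CoV-2", "activates", "microglia", "microglial activation"),
    Triple("microglia", "disrupts", "blood-brain barrier", "barrier disruption"),
]
statements = [to_cypher(t) for t in triples]
```

Each statement could then be executed through the Neo4j Python driver; emitting plain Cypher keeps the extraction step decoupled from the database.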

Accuracy was evaluated against a gold-standard dataset of expert-annotated images using BioBERT-based semantic matching. This evaluation also enabled prompt tuning, threshold optimization, and hyperparameter assessment. Results demonstrate that GPT-4o successfully recovers both established and novel mechanisms, yielding interpretable outputs that illuminate complex biological links between SARS-CoV-2 and neurodegeneration.
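The evaluation logic above (embedding-based matching against a gold standard, gated by a tunable similarity threshold) can be sketched as below. In the study the embeddings come from BioBERT; here `embed` is a stand-in function supplied by the caller, so only the matching and thresholding logic is shown.

```python
# Sketch of threshold-based semantic matching for evaluation (illustrative).
# A predicted triple counts as correct if the cosine similarity between its
# embedding and any gold-standard triple's embedding meets the threshold.
# The embed() callable is a placeholder for a BioBERT sentence encoder.

import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def match_precision(predicted, gold, embed, threshold=0.85):
    """Fraction of predicted triples semantically matching some gold triple."""
    gold_vecs = [embed(g) for g in gold]
    hits = 0
    for p in predicted:
        pv = embed(p)
        if any(cosine(pv, gv) >= threshold for gv in gold_vecs):
            hits += 1
    return hits / len(predicted) if predicted else 0.0
```

Sweeping `threshold` over a validation split is one simple way to perform the threshold optimization the abstract mentions: too low inflates spurious matches, too high rejects valid paraphrases of the same mechanism.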

Conclusions

This study showcases the potential of multimodal LLMs to mine biomedical visual data at scale. By complementing text mining and integrating figure-derived knowledge, our framework advances understanding of COVID-19–related neurodegeneration and supports future translational research.
