Biological Database Mining for LLM-Driven Alzheimer’s Disease Drug Repurposing
Abstract
BACKGROUND
This study presents a software pipeline that uses large language models (LLMs) to apply knowledge stored in natural language (for example, in pharmacological texts) and in ontologies within a transparent Drug Repurposing (DR) information structure.
METHODS
Alzheimer’s Disease (AD)-related entries in Gene Ontology and DrugBank were integrated into a Knowledge Graph database to inform LLM prompts. The LLM Llama3:8b screened 16,581 drugs for their DR potential. The vector embedding representation of the drugs in the LLM was investigated to assess whether LLMs store pharmacological information in alignment with domain experts' understanding of pharmacological groups. The performance of the DR pipeline was examined by quantitatively measuring the semantic similarity of drugs. A manual hallucination check was performed to assess the impact of the ontology-database combination on the LLM's hallucination rate. The results were compared against registered clinical trials (RCTs) and medications proposed in meta-analyses to evaluate their predictive value.
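As an illustration of how semantic similarity between drug embeddings could be quantified, the sketch below compares mean cosine similarity within and between pharmacological groups. It is a minimal example under stated assumptions, not the pipeline's implementation: the abstract does not name the similarity metric, so cosine similarity is assumed here, and the drug names, group labels, and three-dimensional vectors are hypothetical toy values rather than Llama3:8b embeddings.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_group_similarity(embeddings, groups):
    """Return (mean within-group, mean between-group) cosine similarity.

    embeddings: dict mapping drug name -> vector
    groups:     dict mapping drug name -> pharmacological group label
    """
    within, between = [], []
    for d1, d2 in combinations(sorted(embeddings), 2):
        sim = cosine_similarity(embeddings[d1], embeddings[d2])
        if groups[d1] == groups[d2]:
            within.append(sim)
        else:
            between.append(sim)
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical toy data: placeholders, not real drug embeddings.
toy_embeddings = {
    "drug_a": [1.0, 0.1, 0.0],
    "drug_b": [0.9, 0.2, 0.0],
    "drug_c": [0.0, 0.1, 1.0],
    "drug_d": [0.1, 0.0, 0.9],
}
toy_groups = {
    "drug_a": "group_1",
    "drug_b": "group_1",
    "drug_c": "group_2",
    "drug_d": "group_2",
}

within, between = mean_group_similarity(toy_embeddings, toy_groups)
print(f"within-group: {within:.3f}, between-group: {between:.3f}")
```

If embeddings cluster by pharmacological group, the within-group mean should exceed the between-group mean; with real embeddings, a permutation test over group labels would be one way to check whether the gap is larger than chance.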
RESULTS
The embedding analysis showed that the vector representations of drugs in the LLM form clusters that align with pharmacological groups. The output of the ontology-enhanced prompt was closer to the domain experts' proposals than that of a zero-shot control prompt without that knowledge. Responses to the ontology-based prompt also contained fewer hallucinations than responses to the zero-shot control prompt.
CONCLUSIONS
Ontology-augmented LLM interaction leads to fewer hallucinations and to output closer to expert assessment than a zero-shot control. We propose retrospective analyses of the highly rated drugs and their effect on AD patients as a starting point for further (prospective) research.