Bridging Data Gaps in Oncology: Large Language Models and Collaborative Filtering for Cancer Treatment Recommendations
Abstract
Background
Patients with rare cancers face substantial challenges due to limited evidence-based treatment options, resulting from sparse clinical trials. Advances in large language models (LLMs) and recommendation algorithms offer new opportunities to utilize all clinical trial information to improve clinical decisions.
Methods
We used an LLM to systematically extract and standardize more than 100,000 cancer trials from ClinicalTrials.gov. Each trial was annotated using a customized scoring system reflecting cancer-treatment interactions based on clinical outcomes and trial attributes. Using this structured data set, we implemented three state-of-the-art collaborative filtering algorithms to recommend potentially effective treatments across different cancer types.
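To make the collaborative filtering step concrete, the following is a minimal sketch of one common approach: matrix factorization trained by stochastic gradient descent on (cancer, treatment, score) triples. It is an illustration under stated assumptions, not necessarily one of the three algorithms used in the study. The toy scores, the index encoding, and the names `factorize`, `predict`, `rmse`, and `recommend` are all hypothetical.

```python
import numpy as np

# Hypothetical toy data: (cancer index, treatment index, interaction score).
# The score values are invented and do not reflect the study's scoring system.
TRIPLES = [
    (0, 0, 2.0), (0, 1, 1.0), (0, 3, 0.5),
    (1, 0, 2.0), (1, 1, 1.0), (1, 2, -1.0),
    (2, 2, -1.0), (2, 3, 0.5),
]
N_CANCERS, N_TREATMENTS = 3, 4


def factorize(triples, n_cancers, n_treatments, rank=2,
              lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fit score ~ mu + P[cancer] . Q[treatment] by SGD with L2 regularization."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_cancers, rank))     # cancer factors
    Q = rng.normal(scale=0.1, size=(n_treatments, rank))  # treatment factors
    mu = float(np.mean([s for _, _, s in triples]))       # global mean score
    for _ in range(epochs):
        for c, t, s in triples:
            err = s - (mu + P[c] @ Q[t])
            p_old = P[c].copy()  # use the pre-update factor for Q's step
            P[c] += lr * (err * Q[t] - reg * P[c])
            Q[t] += lr * (err * p_old - reg * Q[t])
    return mu, P, Q


def predict(model, c, t):
    """Predicted score for cancer c and treatment t."""
    mu, P, Q = model
    return float(mu + P[c] @ Q[t])


def rmse(model, triples):
    """Root-mean-square error of the model on the given triples."""
    errs = [s - predict(model, c, t) for c, t, s in triples]
    return float(np.sqrt(np.mean(np.square(errs))))


def recommend(model, triples, c, n_treatments):
    """Rank treatments with no observed score for cancer c, best first."""
    seen = {t for cc, t, _ in triples if cc == c}
    candidates = [t for t in range(n_treatments) if t not in seen]
    return sorted(candidates, key=lambda t: predict(model, c, t), reverse=True)


model = factorize(TRIPLES, N_CANCERS, N_TREATMENTS)
print("training RMSE:", rmse(model, TRIPLES))
print("untested treatments for cancer 2, ranked:",
      recommend(model, TRIPLES, 2, N_TREATMENTS))
```

In this sketch, a cancer type with few observed trials borrows strength from cancers that share similar score patterns, which is the property that makes collaborative filtering attractive for sparse data. A realistic evaluation would compute the error on held-out triples rather than on the training data.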
Results
The LLM-driven data extraction process successfully generated a comprehensive and rigorously curated database from fragmented clinical trial information, covering 78 cancer types and 5,315 distinct interventions. Recommendation models demonstrated high predictive accuracy (cross-validated RMSE: 0.49–0.62) and identified clinically meaningful new treatments for melanoma, independently validated by oncology experts.
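For reference, RMSE is conventionally defined as below. The notation is an assumption introduced here for illustration: $\mathcal{T}$ is a held-out set of cancer-treatment pairs, $s_{ct}$ is the annotated score for cancer $c$ and treatment $t$, and $\hat{s}_{ct}$ is the model's prediction. Under cross-validation, the value is typically averaged over folds.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{|\mathcal{T}|} \sum_{(c,t) \in \mathcal{T}} \left( s_{ct} - \hat{s}_{ct} \right)^{2}}
```

RMSE is expressed in the same units as the scores, so the reported range is interpretable only relative to the scale of the scoring system.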
Conclusions
Our study establishes a proof of concept demonstrating that the combination of LLMs with sophisticated recommendation algorithms can systematically identify novel and clinically plausible cancer treatments. This integrated approach may accelerate the identification of effective therapies for rare cancers, ultimately improving patient outcomes by generating evidence-based treatment recommendations where traditional data sources remain limited.