Variantscape: Using Large Language Models to Build a Comprehensive Landscape of Cancer Variants for Precision Oncology

Abstract

Precision oncology depends on accurate interpretation of molecular variants, yet novel insights are often buried in unstructured literature and described using heterogeneous nomenclature. To address this, we developed “Variantscape,” an automated, large-scale pipeline and open-access web tool that integrates natural language processing and large language models to explore variant-cancer-treatment co-associations. Of over 2.7 million titles and abstracts processed, 7,524 mention all three entity types, spanning 4,029 unique variants, 98 cancer types, and 377 treatments. Co-occurrence and network analyses revealed 15,577 significant co-associations within a graph comprising 4,504 nodes and 48,470 edges. Canonical variants in common cancers, such as BRAF V600E, had high-confidence treatment associations, while some rare variants showed strong literature-derived signals. By automating discovery and co-association detection, “Variantscape” offers a systematic overview of the variant landscape in the literature, enabling scalable insight generation that supports hypothesis generation, uncovers underrecognized connections, reveals novel applications of existing therapies, and advances precision oncology.
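As a rough illustration of what a co-occurrence analysis of this kind can look like, the sketch below counts how often a variant and a treatment are mentioned in the same abstract and scores each pair with a one-sided hypergeometric (Fisher-style) enrichment test. The abstract does not state which statistical test, threshold, or data model Variantscape uses, so the test, the function names (`cooccurrence_counts`, `enrichment_p`), and the toy records are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
from math import comb


def cooccurrence_counts(docs):
    """Count per-entity and per-pair document frequencies.

    `docs` is an iterable of (variants, treatments) pairs, one per abstract.
    This input shape is hypothetical, not Variantscape's actual data model.
    """
    variant_n, treatment_n, pair_n = Counter(), Counter(), Counter()
    total = 0
    for variants, treatments in docs:
        total += 1
        variants, treatments = set(variants), set(treatments)
        variant_n.update(variants)
        treatment_n.update(treatments)
        # Every variant-treatment combination in one abstract counts once.
        pair_n.update(product(variants, treatments))
    return total, variant_n, treatment_n, pair_n


def enrichment_p(k, n_variant, n_treatment, total):
    """One-sided hypergeometric tail P(X >= k).

    Probability of seeing at least `k` shared abstracts if the variant
    (in `n_variant` abstracts) and the treatment (in `n_treatment`
    abstracts) were mentioned independently among `total` abstracts.
    """
    upper = min(n_variant, n_treatment)
    tail = sum(
        comb(n_treatment, i) * comb(total - n_treatment, n_variant - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(total, n_variant)


# Toy records invented for this example; they are not data from the study.
docs = [
    ({"BRAF V600E"}, {"vemurafenib"}),
    ({"BRAF V600E"}, {"vemurafenib"}),
    ({"BRAF V600E"}, {"vemurafenib"}),
    ({"EGFR L858R"}, {"erlotinib"}),
    ({"EGFR L858R"}, {"erlotinib"}),
    ({"KRAS G12C"}, {"erlotinib"}),
]

total, variant_n, treatment_n, pair_n = cooccurrence_counts(docs)
for (variant, treatment), k in sorted(pair_n.items()):
    p = enrichment_p(k, variant_n[variant], treatment_n[treatment], total)
    print(f"{variant} + {treatment}: shared={k}, p={p:.3f}")
```

Pairs whose p-value falls below a chosen threshold could then become edges in a variant-treatment graph. In practice a correction for multiple testing would also be needed given the number of pairs involved, which this sketch leaves out.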
