Variance-based Prioritization Reveals a Clinically Validated Antigen Discovery Space Systematically Inaccessible to Mean-Based Methods

Abstract

Background. Mean-based transcriptomic prioritization (differential expression analysis, DEG) dominates cancer target discovery but is optimized for driver gene identification rather than therapeutic antigen discovery. Whether variance-based prioritization captures a complementary and clinically relevant discovery space has not been systematically evaluated.

Methods. We applied variance-based prioritization (TANK) and four comparator methods (DEG, MAD, coefficient of variation, mean expression) to genome-wide transcriptomic data from TCGA gastric adenocarcinoma (n = 443 tumor samples). We evaluated recall of a gold standard set of 28 clinically validated therapeutic antigens (FDA-approved and Phase 2+ ADC/CAR-T/TCR-T targets) at three ranking thresholds. Mechanistic specificity was assessed by comparing surface protein enrichment, driver oncogene enrichment, and therapeutic antigen recall between high-variance and low-variance gene sets.

Results. Across all thresholds, TANK substantially outperformed all comparator methods in therapeutic antigen recall (top 5%: TANK 25%, DEG 3.6%, MAD 3.6%, CV 0%, Mean 3.6%). In the primary analysis using the full gene universe (60,654 genes), TANK recovered 50% of gold standard targets at top 5% versus 3.6% for DEG (OR = 27.0, p = 0.000071). High-variance genes were not globally enriched for surface proteins (9.1% vs 9.2%, OR = 0.98, p = 0.58), ruling out surface protein abundance as an explanatory factor. Instead, high variance specifically depleted canonical driver oncogenes (OR = 0.48, p = 0.0004) while achieving extreme enrichment of therapeutic antigens over low-variance genes (OR = infinity, p = 0.002). Three targets nominated prospectively by TANK prior to literature review subsequently converged on FDA-approved or Phase 2+ clinical programs.

Conclusions. Transcriptomic variance encodes a specific biological signal for therapeutic antigenicity that is orthogonal to driver gene biology and systematically inaccessible to all mean-based approaches tested. Integration of variance-based prioritization into target discovery workflows may substantially expand the accessible space for immunotherapy antigen development.
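
The evaluation described in the abstract (rank genes, take the top fraction, measure recall of a gold standard set, compare methods by odds ratio) can be illustrated with a minimal sketch. This is not the TANK algorithm, which the abstract does not specify: the ranking here is plain per-gene variance, and the function names, toy expression values, and gene labels are all hypothetical. The odds ratio is assumed to be the unpaired 2x2 ratio of recovered to missed targets, and the counts 14 of 28 and 1 of 28 are inferred from the reported 50% and 3.6%.

```python
from statistics import pvariance


def rank_by_variance(expr):
    """Order genes from highest to lowest variance across samples.

    expr maps gene name -> list of expression values (hypothetical input).
    """
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)


def recall_at_top(ranking, gold, fraction):
    """Fraction of gold standard genes found in the top `fraction` of a ranking."""
    k = max(1, int(len(ranking) * fraction))
    top = set(ranking[:k])
    return len(top & gold) / len(gold)


def odds_ratio(hits_a, hits_b, n):
    """Unpaired odds ratio for method A vs method B, each scored on n targets.

    Assumes 0 < hits_b and hits_a < n so that no cell is zero.
    """
    return (hits_a * (n - hits_b)) / ((n - hits_a) * hits_b)


# Toy data: four genes across four samples (invented for illustration).
toy_expr = {
    "G1": [1, 9, 1, 9],  # high variance
    "G2": [5, 5, 5, 5],  # zero variance
    "G3": [4, 6, 4, 6],
    "G4": [2, 3, 2, 3],
}
toy_gold = {"G1", "G3"}

ranking = rank_by_variance(toy_expr)
recall_25 = recall_at_top(ranking, toy_gold, 0.25)
recall_50 = recall_at_top(ranking, toy_gold, 0.50)

# Inferred counts: 14/28 = 50% and 1/28 = 3.6% of gold standard targets.
or_inferred = odds_ratio(14, 1, 28)
```

Under those inferred counts the odds ratio comes out to 27.0, matching the value reported for the top 5% comparison in the full gene universe. The sketch does not attempt to reproduce the reported p-values, since the abstract does not name the test used.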
