OncoSML: A Self-Maintaining Machine Learning Framework for Personalized Cancer Vaccine Research
Abstract
Cancer kills nearly 10 million people annually and costs the global economy over $200 billion (€184 billion / ₹16.7 lakh Crore) in direct medical expenditure. Conventional treatment modalities (surgery, chemotherapy, radiotherapy, targeted therapy, and immune checkpoint inhibitors) are clinically valuable but share fundamental limitations: non-specificity, treatment-induced resistance, fixed model parameters, inability to adapt to tumor evolution, and substantial toxicity burdens that significantly degrade patient quality of life. The economic consequences are equally severe: treatments such as checkpoint inhibitors cost $150,000 (€138,000 / ₹1.25 Crore) per patient per year, which is financially catastrophic for the majority of the global population. This paper presents OncoSML (Oncology Self-Maintaining Learning System), an open-source, research-grade, end-to-end machine learning pipeline developed by Aditya Roy Bardhan at the Department of Data Science and Artificial Intelligence, Indian Institute of Technology Guwahati. OncoSML uniquely integrates the complete personalized cancer vaccine development workflow within a single, self-improving software ecosystem: raw genomic input (FASTQ/BAM/VCF), somatic variant identification, multi-parameter neoantigen scoring, mRNA vaccine construct synthesis, multi-gate biological safety validation, and clinical genomics stack orchestration. A continuous learning loop governs the system, updating model parameters as new genomic data and validated outcomes are incorporated.
This comprehensive edition makes six primary contributions: (1) a detailed technical description of OncoSML's modular architecture, algorithmic design, and self-maintaining learning loop; (2) a comprehensive genome-to-vaccine biological walkthrough supported by six original scientific diagrams explaining DNA structure, somatic mutation types, mRNA vaccine construction, the complete immune response cascade, and patient recovery; (3) an evidence-based argument that personalized mRNA vaccines are the optimal cancer treatment modality on medical, biological, economic, and patient-wellbeing grounds; (4) a detailed patient recovery analysis comparing quality of life, physical function, and multi-domain wellbeing; (5) a comparative analysis against six existing neoantigen prediction tools showing that OncoSML is the only system supporting all 10 key pipeline capabilities simultaneously; and (6) a tri-currency economic analysis (USD / EUR / INR) demonstrating a 36% reduction in total per-patient cost. Key findings across 24 original visualizations: OncoSML achieves a binding affinity AUC of 0.921 after 52 weeks of self-maintaining operation, versus 0.841 for static tools; mRNA vaccine therapy produces <5% Grade 3-4 adverse events versus 62% for chemotherapy (15× safer); projected 5-year survival improves by 50–136% relative to conventional treatment across major cancer types; lifecycle cost savings reach $66,000 (€60,720 / ₹55.1 lakh) per patient; and the 2030 India target is ₹2 lakh ($2,395 / €2,204), which would make it the only advanced cancer therapy affordable for India's middle class of 300 million people.
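The tri-currency figures quoted in this abstract can be cross-checked with a short Python sketch. The exchange rates below are assumptions inferred from the quoted figure pairs (for example, $150,000 alongside €138,000 and ₹1.25 Crore); the abstract does not state the rates the analysis actually uses, and the function names are hypothetical.

```python
# Assumed exchange rates, inferred from the figure pairs in the abstract.
# They are NOT stated by the paper and are used only for a consistency check.
EUR_PER_USD = 0.92
INR_PER_USD = 83.5

LAKH = 100_000        # Indian numbering: 1 lakh = 10^5
CRORE = 10_000_000    # 1 crore = 10^7


def from_usd(usd: float) -> dict:
    """Convert a USD amount to all three currencies under the assumed rates."""
    return {"USD": usd, "EUR": usd * EUR_PER_USD, "INR": usd * INR_PER_USD}


def from_inr(inr: float) -> dict:
    """Convert an INR amount to all three currencies under the assumed rates."""
    usd = inr / INR_PER_USD
    return {"USD": usd, "EUR": usd * EUR_PER_USD, "INR": inr}


if __name__ == "__main__":
    savings = from_usd(66_000)        # lifecycle savings per patient
    checkpoint = from_usd(150_000)    # checkpoint inhibitor cost per year
    target = from_inr(2 * LAKH)       # 2030 India target

    print(f"Savings: EUR {savings['EUR']:,.0f}, INR {savings['INR'] / LAKH:.1f} lakh")
    print(f"Checkpoint: EUR {checkpoint['EUR']:,.0f}, INR {checkpoint['INR'] / CRORE:.2f} crore")
    print(f"Target: USD {target['USD']:,.0f}, EUR {target['EUR']:,.0f}")
```

Under these assumed rates, the converted amounts round to the same EUR and INR values the abstract reports, which suggests the analysis applies a single fixed rate pair throughout.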