Visual analytics framework for survival analysis and biomarker discovery
Abstract
We introduce a visual analytics methodology for survival analysis and propose a framework that defines a reusable set of visualization and modeling components to support exploratory and hypothesis-driven biomarker discovery. Survival analysis, essential in biomedicine, evaluates patients’ survival rates and the onset of medically relevant events, given their clinical profiles, genetic profiles, and genetic predispositions. Existing approaches often require programming expertise or rely on inflexible analysis pipelines, limiting their usability among biomedical researchers. The lack of advanced, user-friendly tools also hinders problem solving and restricts interactive data exploration. Our methodology emphasizes functionality-driven design and modularity, akin to combining LEGO bricks to build tailored visual workflows. We (1) define a minimal set of reusable visualization and modeling components that support common survival analysis tasks, (2) implement interactive visualizations for discovering survival cohorts and their characteristic features, and (3) demonstrate integration within an existing visual analytics platform. We implemented the methodology as an open-source add-on to Orange Data Mining and validated it through use cases ranging from Kaplan–Meier estimation to biomarker discovery. The resulting framework illustrates how methodological design can drive intuitive, transparent, and effective survival analysis.
Author summary
When studying diseases like cancer, it is important to understand how factors such as genes or treatments influence how long patients live after medical intervention. Survival analysis is a well-established statistical and bioinformatics approach for uncovering such insights, but it often requires advanced programming skills, which can be a barrier for many clinicians and life science researchers. To help overcome this, we developed a visual analytics tool that makes the analysis of censored data easier, more interactive, and more accessible. Instead of writing code, researchers can use our framework to assemble analysis components into workflows through visual programming. They can explore their data by comparing survival curves, identifying meaningful patient subgroups, and discovering potential biomarkers. The tool is implemented as part of Orange Data Mining, a free and open-source data analytics platform. We tested it on real-world cancer datasets to demonstrate how easily and quickly researchers can prototype powerful data analysis workflows. Our goal is to make survival analysis more accessible, empowering researchers to generate and test hypotheses without programming, and contributing to the broader democratization of data science.