Visualising the Truth: A Composite Evaluation Framework for Score-Based Predictive Model Selection

Abstract

Model selection in machine learning applications for biomedical prediction is often constrained by reliance on conventional global performance metrics such as area under the ROC curve (AUC), sensitivity, and specificity. When these metrics are closely clustered across multiple candidate models, distinguishing the most suitable model for real-world application becomes challenging. We propose a novel composite evaluation framework that combines a visual analysis of prediction score distributions with a new performance metric, the MSDscore, tailored for classifiers producing continuous prediction scores. This approach enables detailed inspection of true positive, false positive, true negative, and false negative distributions across the scoring space, capturing local performance patterns often overlooked by standard metrics. Implemented within the Digital Phenomics platform as the MSDanalyser (MSDA) tool, our methodology was applied to 27 predictive models developed for breast, lung, and renal cancer prognosis. Although conventional metrics showed minimal variation between models, the MSDA methodology revealed critical differences in score-region behaviour, allowing the identification of models with greater real-world suitability. While our study focuses on oncology, the methodology is generalisable to other domains involving threshold-based classification. We conclude that integrating this composite framework alongside traditional performance metrics offers a complementary, more nuanced approach to model selection in clinical and biomedical settings.
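The core idea of inspecting true positive, false positive, true negative, and false negative distributions across the scoring space can be sketched as follows. Note that the abstract does not define the MSDscore formula or the MSDanalyser's implementation, so this is only a hypothetical illustration of per-region confusion counts for a classifier producing scores in [0, 1]; the function name and parameters are assumptions, not the authors' API.

```python
# Hypothetical sketch: bin continuous prediction scores and tally
# TP/FP/TN/FN per score region at a fixed decision threshold.
# This is NOT the MSDscore itself (which the abstract does not specify);
# it only illustrates the kind of score-region analysis described.

def score_region_counts(labels, scores, threshold=0.5, n_bins=10):
    """Count TP/FP/TN/FN per score bin for a binary classifier.

    labels : iterable of 0/1 ground-truth labels
    scores : iterable of prediction scores in [0, 1]
    """
    bins = [{"tp": 0, "fp": 0, "tn": 0, "fn": 0} for _ in range(n_bins)]
    for y, s in zip(labels, scores):
        i = min(int(s * n_bins), n_bins - 1)  # score bin index
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            bins[i]["tp"] += 1
        elif pred == 1 and y == 0:
            bins[i]["fp"] += 1
        elif pred == 0 and y == 0:
            bins[i]["tn"] += 1
        else:
            bins[i]["fn"] += 1
    return bins

# Toy example: errors concentrated near the threshold are visible
# per region even when global accuracy looks acceptable.
labels = [0, 0, 1, 1, 1, 0]
scores = [0.10, 0.45, 0.55, 0.90, 0.48, 0.52]
regions = score_region_counts(labels, scores, threshold=0.5, n_bins=5)
```

Two models with near-identical AUC can produce very different per-region tables: one may concentrate its false positives and false negatives in the ambiguous mid-score region, while another spreads errors into high-confidence bins, which is the kind of local behaviour the framework is designed to surface.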
