Visualising the Truth: A Composite Evaluation Framework for Score-Based Predictive Model Selection

Abstract

Model selection in machine learning applications for biomedical prediction is often constrained by reliance on conventional global performance metrics such as area under the ROC curve (AUC), sensitivity, and specificity. When these metrics are closely clustered across multiple candidate models, distinguishing the most suitable model for real-world application becomes challenging. We propose a novel composite evaluation framework that combines a visual analysis of prediction score distributions with a new performance metric, the MSDscore, tailored for classifiers producing continuous prediction scores. This approach enables detailed inspection of true positive, false positive, true negative, and false negative distributions across the scoring space, capturing local performance patterns often overlooked by standard metrics. Implemented within the Digital Phenomics platform as the MSDanalyser (MSDA) tool, our methodology was applied to 27 predictive models developed for breast, lung, and renal cancer prognosis. Although conventional metrics showed minimal variation between models, the MSDA methodology revealed critical differences in score-region behaviour, allowing the identification of models with greater real-world suitability. While our study focuses on oncology, the methodology is generalisable to other domains involving threshold-based classification. We conclude that integrating this composite framework alongside traditional performance metrics offers a complementary, more nuanced approach to model selection in clinical and biomedical settings.
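The core idea of inspecting true positive, false positive, true negative, and false negative distributions across the scoring space can be sketched as follows. Note that the abstract does not define the MSDscore formula or the MSDanalyser's implementation, so this is only a hypothetical illustration of per-region confusion counts for a classifier producing scores in [0, 1]; the function name and parameters are assumptions, not the authors' API.

```python
# Hypothetical sketch: bin continuous prediction scores and tally
# TP/FP/TN/FN per score region at a fixed decision threshold.
# This is NOT the MSDscore itself (which the abstract does not specify);
# it only illustrates the kind of score-region analysis described.

def score_region_counts(labels, scores, threshold=0.5, n_bins=10):
    """Count TP/FP/TN/FN per score bin for a binary classifier.

    labels : iterable of 0/1 ground-truth labels
    scores : iterable of prediction scores in [0, 1]
    """
    bins = [{"tp": 0, "fp": 0, "tn": 0, "fn": 0} for _ in range(n_bins)]
    for y, s in zip(labels, scores):
        i = min(int(s * n_bins), n_bins - 1)  # score bin index
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            bins[i]["tp"] += 1
        elif pred == 1 and y == 0:
            bins[i]["fp"] += 1
        elif pred == 0 and y == 0:
            bins[i]["tn"] += 1
        else:
            bins[i]["fn"] += 1
    return bins

# Toy example: errors concentrated near the threshold are visible
# per region even when global accuracy looks acceptable.
labels = [0, 0, 1, 1, 1, 0]
scores = [0.10, 0.45, 0.55, 0.90, 0.48, 0.52]
regions = score_region_counts(labels, scores, threshold=0.5, n_bins=5)
```

Two models with near-identical AUC can produce very different per-region tables: one may concentrate its false positives and false negatives in the ambiguous mid-score region, while another spreads errors into high-confidence bins, which is the kind of local behaviour the framework is designed to surface.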
