WELDE: A Weighted Ensemble Loss with Diversity Enhancement for Imbalanced Object Detection in Medical Imaging

Rao Farhat Masood
Imtiaz Ahmad Taj

Abstract

Class imbalance in medical imaging datasets remains a key challenge for reliable object detection, particularly when rare yet clinically significant pathologies coexist with prevalent findings. In spinal MRI, common classes such as Normal Intervertebral Disc (IVD) may constitute over 45% of annotated objects, whereas findings like Spondylolisthesis account for fewer than 2% of instances. Conventional loss functions, including Focal Loss, Class-Balanced Loss, and Label-Distribution-Aware Margin Loss, each address isolated facets of this imbalance but do not provide a unified, adaptive solution. Inspired by ensemble loss strategies recently advanced in Deep Metric Learning (DML), we propose WELDE (Weighted Ensemble Loss with Diversity Enhancement), a framework that combines four complementary loss functions via per-head adapter projections, EMA-based normalization, and learnable adaptive weighting with a relaxed sum-to-one penalty. Each loss component receives a dedicated classification head with an independent adapter projection from a shared frozen backbone, enabling feature specialization without backbone fine-tuning. We provide theoretical analysis of WELDE's properties, including gradient magnitude balancing across loss components and weight non-degeneracy. Applied to a lumbar mid-sagittal spinal MRI dataset with six classes and a 33.9:1 imbalance ratio, WELDE achieves the highest classification performance among all evaluated methods. It outperforms all single-loss baselines (mAP 0.702 vs. 0.689 for the best baseline, CE; mAP\(_{\text{tail}}\) 0.509 vs. 0.472, a +8.1% relative improvement on tail classes) and an architecture-matched CE ensemble control (mAP\(_{\text{tail}}\) 0.509 vs. 0.496), confirming that the improvement derives from diverse loss composition rather than increased model capacity.
External cross-domain validation on the DermaMNIST skin lesion benchmark (7 classes, \(\rho = 58.3\)) confirms that WELDE generalizes robustly, achieving the highest mAP (0.709) and mAP\(_{\text{tail}}\) (0.651) among all methods and outperforming both the single-head baselines (+11.5% mAP over CE) and the architecture-matched CE ensemble control.
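
The combination step described in the abstract (EMA-based normalization followed by adaptive weighting with a relaxed sum-to-one penalty) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `EnsembleLossCombiner`, the decay of 0.9, the penalty coefficient, and the squared form of the penalty are all assumptions made for illustration, and only the forward computation is shown. In the actual method the weights are learnable, which would require an autograd framework.

```python
class EnsembleLossCombiner:
    """Illustrative only: combines K per-head loss values into one scalar.

    All names and hyperparameters here are hypothetical, not taken from the paper.
    """

    def __init__(self, num_losses, ema_decay=0.9, penalty_coeff=1.0, eps=1e-8):
        self.ema_decay = ema_decay          # assumed decay for the running average
        self.penalty_coeff = penalty_coeff  # assumed strength of the sum-to-one penalty
        self.eps = eps                      # guards against division by zero
        # Weights start uniform; in a real model they would be trainable parameters.
        self.weights = [1.0 / num_losses] * num_losses
        self.ema = [None] * num_losses      # running average of each loss magnitude

    def combine(self, losses):
        if len(losses) != len(self.weights):
            raise ValueError("expected one loss value per head")
        # 1. Update the exponential moving average of each loss.
        for k, value in enumerate(losses):
            if self.ema[k] is None:
                self.ema[k] = value
            else:
                self.ema[k] = (self.ema_decay * self.ema[k]
                               + (1.0 - self.ema_decay) * value)
        # 2. Normalize each loss by its running average so that losses on very
        #    different scales contribute comparable magnitudes.
        normalized = [v / (avg + self.eps) for v, avg in zip(losses, self.ema)]
        # 3. Weighted sum of the normalized losses.
        weighted = sum(w * n for w, n in zip(self.weights, normalized))
        # 4. Relaxed constraint: penalize deviation of the weight sum from 1
        #    instead of enforcing it exactly (squared form is an assumption).
        penalty = self.penalty_coeff * (sum(self.weights) - 1.0) ** 2
        return weighted + penalty


# Four loss values on very different scales, one per classification head.
combiner = EnsembleLossCombiner(num_losses=4)
total = combiner.combine([2.0, 0.5, 8.0, 0.1])
```

On the first call each running average equals the loss itself, so every normalized loss is close to 1 regardless of its raw scale. A soft penalty, rather than a softmax or a hard constraint, lets the weights move freely while still discouraging their sum from drifting far from 1.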

Version published to 10.21203/rs.3.rs-9019468/v1 on Research Square
Mar 24, 2026

Related articles

ML-ConvNet: A Lightweight and Interpretable Unified Architecture for Medical Image Classification Across Modalities

This article has 10 authors:
1. Williams Ayivi
2. Xiaoling Zhang
3. Yeong Hyeon Gu
4. Amil Aligayev
5. Ali Alqahtani
6. Wisdom Xornam Ativi
7. Francis Sam
8. Muhammed Amin Abdullah
9. Emmanuel Sarpong Addai Gyarteng
10. Mugahed A. Al-antari
This article has no evaluations. Latest version: Mar 17, 2026

Self-Supervised Kidney Tumor Segmentation Using Random Block Reconstruction on 3D CT Scans

This article has 4 authors:
1. Adrian Krenzer
2. Patrick Straßer
3. Tobias Friedetzki
4. Frank Puppe
This article has no evaluations. Latest version: Feb 11, 2026

MRAN: A Reconstructive Attention Network for Handling Modality Sparsity in Multimodal Cancer Survival Analysis

This article has 3 authors:
1. Djaafer GHERBI
2. Mohammed Lamine BENOMAR
3. Mohammed Said KADDOUR
This article has no evaluations. Latest version: Feb 18, 2026
