Enhancing Early Skin Cancer Detection: A Deep Learning Approach with Multi-Scale Feature Refinement and Fusion
Abstract
The global incidence of skin cancer is rising, making it an increasingly critical public health issue. Malignant skin tumors such as melanoma originate from pathological alterations of skin cells, and accurate early-stage segmentation of lesions is crucial for quantitative analysis, early diagnosis, and successful treatment. However, achieving precise and efficient segmentation remains a major challenge, as existing methods often struggle to balance computational efficiency with the ability to capture complex lesion characteristics. To address this challenge, we propose a novel deep learning framework that integrates the PVT v2 backbone with two key modules: Spatial-Aware Feature Enhancement (SAFE) and Multiscale Dual Cross-attention Fusion (MDCF). The SAFE module refines multi-scale encoder features through a dual-branch architecture that bridges the feature discrepancy across network depths, combining fine-grained shallow-layer details with deep semantic information via adaptive offset prediction. The MDCF module establishes bidirectional cross-attention between decoder and encoder features, followed by multi-scale deformable convolutions that capture lesion boundaries and small fragments across heterogeneous receptive fields, thereby enriching semantic details while suppressing background responses. The proposed model was evaluated on two public benchmark datasets (ISIC 2016 and ISIC 2018), achieving Intersection over Union (IoU) scores of 87.33% and 83.67%, respectively, outperforming current state-of-the-art methods. These results indicate that our framework significantly enhances skin lesion image analysis and offers a promising tool for improving early detection of skin cancer.
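To make the bidirectional cross-attention idea behind MDCF concrete, the following is a minimal NumPy sketch (not the authors' implementation): decoder tokens query encoder features and encoder tokens query decoder features, and the two refined maps are fused by averaging. All function names, the fusion rule, and the single-head formulation are illustrative assumptions; the paper's module additionally applies multi-scale deformable convolutions afterward, which are omitted here.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d):
    """Single-head cross-attention: each query token attends
    over all key/value tokens (scaled dot-product)."""
    scores = queries @ keys_values.T / np.sqrt(d)   # (N_q, N_kv)
    return softmax(scores, axis=-1) @ keys_values   # (N_q, d)

def bidirectional_fuse(enc, dec):
    """Hypothetical bidirectional fusion: decoder features attend
    to encoder features and vice versa, then the two refined
    feature maps are averaged (an assumed fusion rule)."""
    d = enc.shape[1]
    dec_refined = cross_attention(dec, enc, d)  # decoder queries encoder
    enc_refined = cross_attention(enc, dec, d)  # encoder queries decoder
    return 0.5 * (dec_refined + enc_refined)

# Toy example: a 4x4 feature map flattened to 16 tokens, 8 channels.
rng = np.random.default_rng(0)
enc = rng.standard_normal((16, 8))
dec = rng.standard_normal((16, 8))
fused = bidirectional_fuse(enc, dec)
print(fused.shape)
```

In a real network the queries, keys, and values would pass through learned linear projections, and the attention would typically be multi-headed; the sketch keeps only the data flow between the encoder and decoder streams.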