RLDSCP: Reducing Label Dependency with Self-Attention and Contrastive Pretraining
Abstract
This paper introduces RLDSCP, a transformer-based architecture specifically designed for tabular data modeling. RLDSCP employs a dual attention mechanism, combining feature-level self-attention with intersample (row-level) attention to learn both inter-feature relationships within a sample and relationships across samples. To address data-scarce and semi-supervised settings, RLDSCP incorporates a contrastive self-supervised pretraining strategy, enhanced with CutMix and Mixup-style augmentations, together with a denoising reconstruction objective. This hybrid learning approach enables the model to extract robust, generalizable feature representations. Empirical evaluations across 16 diverse tabular datasets, covering both binary and multiclass classification tasks, demonstrate that RLDSCP consistently outperforms existing deep learning models and achieves robustness comparable to gradient-boosted ensemble methods such as XGBoost and LightGBM, particularly when labeled data is limited. The results establish RLDSCP as a scalable, interpretable, and high-performing solution for real-world tabular data scenarios.
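To make the two central ideas in the abstract concrete, the sketch below illustrates (1) a dual-attention block that applies feature-level self-attention within each embedded row and then intersample attention across the rows of a batch, and (2) a CutMix/Mixup-style corruption used to generate views for contrastive pretraining. This is a minimal illustration under our own assumptions, not the authors' released implementation; all class names, function names, and hyperparameters are illustrative.

```python
# Minimal sketch (illustrative, not the authors' code) of dual attention over
# tabular feature tokens plus a CutMix/Mixup-style augmentation for pretraining.
import torch
import torch.nn as nn


class DualAttentionBlock(nn.Module):
    def __init__(self, n_features: int, d_model: int = 32, n_heads: int = 4):
        super().__init__()
        # Feature-level self-attention: the column tokens of one row attend to each other.
        self.feature_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Intersample attention: each row is flattened to a single token of size
        # n_features * d_model, and rows of the batch attend to one another.
        self.row_attn = nn.MultiheadAttention(n_features * d_model, n_heads, batch_first=True)
        self.norm_feat = nn.LayerNorm(d_model)
        self.norm_row = nn.LayerNorm(n_features * d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, d_model) -- one embedded token per column.
        h, _ = self.feature_attn(x, x, x)
        x = self.norm_feat(x + h)
        b, f, d = x.shape
        rows = x.reshape(1, b, f * d)              # rows become the sequence dimension
        h, _ = self.row_attn(rows, rows, rows)     # each row attends to the other rows
        rows = self.norm_row(rows + h)
        return rows.reshape(b, f, d)


def cutmix_mixup(x: torch.Tensor, cut_prob: float = 0.3, lam: float = 0.8) -> torch.Tensor:
    """Corrupt a batch of raw feature vectors to create a contrastive view:
    swap a random subset of feature values with another row (CutMix-style),
    then linearly blend with a shuffled batch (Mixup-style)."""
    perm = torch.randperm(x.size(0))
    mask = (torch.rand_like(x) < cut_prob).float()
    x_cut = x * (1 - mask) + x[perm] * mask
    return lam * x_cut + (1 - lam) * x_cut[torch.randperm(x.size(0))]


if __name__ == "__main__":
    batch, n_features, d_model = 8, 10, 32
    block = DualAttentionBlock(n_features, d_model)
    tokens = torch.randn(batch, n_features, d_model)
    print(block(tokens).shape)      # torch.Size([8, 10, 32])
    raw = torch.randn(batch, n_features)
    print(cutmix_mixup(raw).shape)  # torch.Size([8, 10])
```

In a contrastive pretraining loop of this kind, the clean batch and its corrupted view would both be encoded by the dual-attention backbone, with a contrastive loss pulling matching rows together and a denoising head reconstructing the original features; the exact loss weighting and projection heads used by RLDSCP are described in the paper itself.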