An Artificial Immune Network Combined with Improved SMOTE Oversampling Technique-Incorporated Robust Feature Selection Method for Imbalanced Medical Data


Abstract

Accurate identification of key diagnostic features is crucial for effective medical treatment. However, medical datasets are usually imbalanced and noisy, which significantly compromises the performance and generalizability of diagnostic models. To overcome these challenges, we propose a robust feature selection method that incorporates a bio-inspired oversampling approach (Bi-RFS). The main idea of Bi-RFS is twofold: 1) to generate reliable synthetic samples and ensure a balanced sample distribution, we propose an oversampling method that combines an artificial immune network with an improved SMOTE; 2) to mitigate the impact of noise and outliers, we propose a robust Relief-F algorithm that takes sample similarity into account. Furthermore, we present a detailed algorithm design for the proposed method to enhance the overall feature evaluation process. Finally, empirical studies on six available real-world medical datasets demonstrate that the proposed method outperforms state-of-the-art feature selection techniques.
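For readers unfamiliar with SMOTE, the following is a minimal sketch of the classic interpolation step that the oversampling component builds on. It is not the improved SMOTE or the artificial immune network proposed in the article, whose details the abstract does not give. The function name `smote_oversample`, its parameters `n_new`, `k`, and `seed`, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Classic SMOTE step (illustrative only, not the article's improved variant).

    Each synthetic sample lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate with a random gap in [0, 1).
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (made-up values).
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_oversample(minority, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority samples, plain SMOTE can amplify noisy or outlying samples, which is presumably one reason the article pairs oversampling with a noise-robust feature evaluator.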
