Online streaming feature selection for high-dimensional small-sample data

Abstract

Classifying high-dimensional small-sample data poses several significant challenges: the feature space typically has high dimensionality, the features may be extracted sequentially, and the distribution of data categories is imbalanced. To address these challenges, this paper presents a novel online feature selection algorithm designed specifically for this scenario. First, an adaptive neighborhood relation based on class density is proposed, which makes full use of the distribution of the target sample's category. Second, a neighborhood consistency metric is defined based on the proposed neighborhood relation. The resulting online feature selection algorithm consists of three main phases: significance judgment, correlation analysis, and redundancy update. Comprehensive experiments on 12 datasets show that our method significantly improves the prediction of minority class samples compared with several popular online streaming feature selection algorithms.
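
The three phases named in the abstract can be illustrated with a small sketch. The abstract does not give the paper's definitions, so everything concrete here is an assumption: the score `consistency` is a plain nearest-neighbor label-agreement rate, the "adaptive" rule simply shrinks the neighborhood size for small classes, and the names `select_streaming`, `min_significance`, and `k_max` are hypothetical. It is meant only to show how a significance test, a gain test, and a pruning pass might fit together as features arrive one at a time, not to reproduce the authors' method.

```python
from collections import Counter


def consistency(X, y, feats, k_max=3):
    """Mean fraction of each sample's nearest neighbors that share its label.

    Stand-in for the paper's neighborhood consistency metric (assumed form).
    """
    if not feats:
        return 0.0
    counts = Counter(y)
    n = len(y)
    total = 0.0
    for i in range(n):
        # Assumed adaptive rule: samples from small classes get smaller neighborhoods.
        k = max(1, min(k_max, counts[y[i]] - 1))
        # Squared Euclidean distance over the chosen features; ties break by index.
        dist = sorted(
            (sum((X[i][f] - X[j][f]) ** 2 for f in feats), j)
            for j in range(n) if j != i
        )
        total += sum(y[j] == y[i] for _, j in dist[:k]) / k
    return total / n


def select_streaming(X, y, min_significance=0.6, k_max=3):
    """Process columns of X one at a time, as if they arrived in a stream."""
    selected = []
    for f in range(len(X[0])):
        # Phase 1, significance judgment: drop features that are weak on their own.
        if consistency(X, y, [f], k_max) < min_significance:
            continue
        # Phase 2, correlation analysis: keep f only if it strictly improves the subset.
        candidate = selected + [f]
        score = consistency(X, y, candidate, k_max)
        if score <= consistency(X, y, selected, k_max):
            continue
        selected = candidate
        # Phase 3, redundancy update: remove older features that no longer help.
        for g in selected[:-1]:
            reduced = [h for h in selected if h != g]
            reduced_score = consistency(X, y, reduced, k_max)
            if reduced_score >= score:
                selected, score = reduced, reduced_score
    return selected


# Toy data (made up): six majority samples, two minority samples.
y = [0, 0, 0, 0, 0, 0, 1, 1]
weak = [0, 1, 2, 3, 4, 20, 21, 5]      # separates the classes only partly
strong = [0, 1, 2, 3, 4, 5, 20, 21]    # separates the classes cleanly
noise = [0, 1, 0, 1, 0, 1, 0, 1]       # unrelated to the labels
duplicate = list(strong)               # exact copy of the strong feature
X = [list(row) for row in zip(weak, strong, noise, duplicate)]

selected = select_streaming(X, y)
print(selected)
```

On this toy stream the weak column is admitted first and then pruned once the strong column arrives, the noise column fails the significance test, and the duplicate is rejected because it adds no gain, so only column 1 remains selected.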
