Experimenting with Two Recent Feature Selection Methods for High-Dimensional Biological Data

Minzhe Zhang
Xiao Yang

Abstract

Feature selection in high-dimensional biological data, where the number of features far exceeds the number of samples, has long posed a significant methodological challenge. This study evaluates two recently developed feature selection methods, Stabl and Nullstrap, under a simulation framework designed to replicate regression, classification, and non-linear regression tasks across varying feature dimensions and noise levels. Our results demonstrate that Nullstrap consistently outperforms Stabl and other benchmarked methods across all evaluated scenarios. Furthermore, Nullstrap proved significantly faster and more scalable in high-dimensional settings, underscoring its suitability for large-scale omics data applications. These findings establish Nullstrap as a robust, accurate, and computationally efficient feature selection tool for modern omics data analysis.
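To make the evaluation setting concrete, the following is a minimal sketch of one simulated regression scenario in which features far outnumber samples. It is not the authors' simulation framework and implements neither Stabl nor Nullstrap: the selector is a simple marginal-correlation ranking used as a placeholder, and the function names, the parameter values, and the choice of false discovery proportion and power as metrics are all assumptions made for illustration.

```python
import random


def simulate_regression(n_samples, n_features, n_informative, noise_sd, seed=0):
    """Simulate y = X.beta + noise; only the first n_informative features matter.

    All settings here are illustrative assumptions, not the preprint's design.
    """
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    beta = [1.0] * n_informative + [0.0] * (n_features - n_informative)
    y = [
        sum(b * x for b, x in zip(beta, row)) + rng.gauss(0.0, noise_sd)
        for row in X
    ]
    truth = set(range(n_informative))
    return X, y, truth


def top_k_by_correlation(X, y, k):
    """Placeholder selector: keep the k features most correlated with y.

    This stands in for a real method; it is NOT Stabl or Nullstrap.
    """
    n = len(y)
    y_mean = sum(y) / n
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        cov = sum((c - c_mean) * (v - y_mean) for c, v in zip(col, y))
        var = sum((c - c_mean) ** 2 for c in col)
        # Proportional to |correlation|, since the spread of y is shared by all features.
        scores.append(abs(cov) / var ** 0.5 if var > 0 else 0.0)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def fdp_and_power(selected, truth):
    """Return (false discovery proportion, power) for a selected feature set."""
    false_pos = len(selected - truth)
    true_pos = len(selected & truth)
    fdp = false_pos / max(len(selected), 1)
    power = true_pos / max(len(truth), 1)
    return fdp, power


if __name__ == "__main__":
    # 50 samples, 500 features, 10 truly informative: features far exceed samples.
    X, y, truth = simulate_regression(50, 500, 10, noise_sd=1.0, seed=1)
    selected = top_k_by_correlation(X, y, k=10)
    fdp, power = fdp_and_power(selected, truth)
    print(f"FDP={fdp:.2f}, power={power:.2f}")
```

Repeating such a run across feature dimensions and noise levels, and swapping the placeholder for an actual selection method, would give the kind of comparison the abstract describes, though the preprint's own settings and metrics may differ.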

Version published to 10.1101/2025.09.09.675248 on bioRxiv
Sep 15, 2025

Related articles

Classification of Bio-Data with Interval Dissimilarities: A Multidimensional Scaling Framework

This article has 4 authors:
1. Md. Anwarul Islam Bhuiyan
2. Sohana Jahan
3. Md. Babul Hasan
4. Md. Maruf Hossain
This article has no evaluations
Latest version Jan 21, 2026

A Reproducible and Unified Benchmark of Deep Learning Feature Selection Across Simulations and Multi-Omics datasets

This article has 6 authors:
1. Yalu Wen
2. Qingyu Meng
3. Xiaoyan Sun
4. Ning Li
5. Long Liu
6. Deqiang Zheng
This article has no evaluations
Latest version Jan 21, 2026

An Oracle Approach to Goodness-of-Fit Testing via Linear Combination of Degenerate V-Statistics

This article has 3 authors:
1. Katarina Halaj Mileusnić
2. Bojana Milošević
3. Marko Obradović
This article has no evaluations
Latest version Dec 11, 2025
