High-dimensional Biomarker Identification for Scalable and Interpretable Disease Prediction via Machine Learning Models

Yifan Dai
Fei Zou
Baiming Zou

Abstract

Omics data generated by high-throughput technologies and clinical features jointly impact many complex human diseases. Identifying key biomarkers and clinical risk factors is essential for understanding disease mechanisms and for advancing early diagnosis and precision medicine. However, the high dimensionality of omics profiles and their intricate associations with disease outcomes present significant analytical challenges. To address these challenges, we propose an ensemble, data-driven biomarker identification tool, Hybrid Feature Screening (HFS), which constructs a candidate feature set for downstream advanced machine learning models. The candidate features pre-screened by HFS are further refined using a computationally efficient permutation-based feature importance test; together, the two steps form the High-dimensional Feature Importance Test (HiFIT) framework. Through extensive numerical simulations and real-world applications, we demonstrate HiFIT’s superior performance in both outcome prediction and feature importance identification. An R package implementing HiFIT is available on GitHub (https://github.com/BZou-lab/HiFIT).
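The two-stage idea described in the abstract (screen a large feature set down to candidates, then assess the candidates with a permutation-based importance measure) can be illustrated with a minimal sketch. This is not the HFS or HiFIT algorithm, whose details the abstract does not give: the screening step here is plain absolute Pearson correlation, the downstream model is closed-form ridge regression, and the importance score is the mean increase in test error after permuting a feature, with no formal test attached. The function names `screen_features`, `fit_ridge`, and `permutation_importance`, the simulated data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def screen_features(X, y, keep):
    """Stand-in screening step: keep the `keep` features with the largest
    absolute Pearson correlation with y (not the HFS ensemble itself)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns get correlation 0 instead of a division by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return np.argsort(corr)[::-1][:keep]


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression; returns a prediction function."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]),
                           Xc.T @ (y - mu_y))
    return lambda Z: (Z - mu_x) @ beta + mu_y


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in squared error when one column is shuffled."""
    rng = np.random.default_rng(seed)
    base = np.mean((y - predict(X)) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += np.mean((y - predict(Xp)) ** 2) - base
    return scores / n_repeats


# Simulated data (hypothetical): 200 samples, 500 features, 2 true signals.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))
y = 2.0 * X[:, 3] - 1.5 * X[:, 10] + rng.normal(scale=0.5, size=200)

# Screen and fit on the training split only, to avoid leaking test data.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

kept = screen_features(X_train, y_train, keep=20)
predict = fit_ridge(X_train[:, kept], y_train)
scores = permutation_importance(predict, X_test[:, kept], y_test)

# Map the two highest importance scores back to original column indices.
top_two = [int(j) for j in kept[np.argsort(scores)[::-1][:2]]]
print("screened features:", sorted(int(j) for j in kept))
print("top two by permutation importance:", top_two)
```

Screening first keeps the permutation step cheap: importance is computed for 20 candidates rather than all 500 columns, which is the scalability argument the abstract makes for combining the two stages.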

Version published to 10.1101/2024.10.04.616748v1 on bioRxiv
Oct 7, 2024

Related articles

Bio-primed machine learning to enhance discovery of relevant biomarkers

This article has 9 authors:
1. David Henke
2. Alexander Renwick
3. Joseph Zoeller
4. Jitendra Meena
5. Nicholas Neill
6. Elizabeth Bowling
7. Kristen Karlin
8. Thomas Westbrook
9. Lukas Simon
This article has no evaluations
Latest version Oct 17, 2024

LP-Micro Offers Interpretable Disease Outcome Prediction by Leveraging Microbial Biomarkers and Their Time-Varying Effects

This article has 11 authors:
1. Yifan Dai
2. Yunzhi Qian
3. Yixiang Qu
4. Wyliena Guan
5. Jialiu Xie
6. Catherine Butler
7. Stuart Dashper
8. Ian Carroll
9. Kimon Divaris
10. Yufeng Liu
11. Di Wu
This article has no evaluations
Latest version Oct 22, 2024

RCoxNet: deep learning framework for enhanced cancer survival prediction integrating random walk with restart with mutation and clinical data

This article has 6 authors:
1. Stuti Kumari
2. Sakshi Gujral
3. Smruti Panda
4. Prashant Gupta
5. Gaurav Ahuja
6. Debarka Sengupta
This article has no evaluations
Latest version Sep 20, 2024
