AIMarkerFinder: AI-Assisted Marker Discovery Based on an Integrated Approach of Autoencoders and Kolmogorov-Arnold Networks
Abstract
In modern bioinformatics, the analysis of high-dimensional data (genomic, metabolomic, etc.) remains a critical challenge because of the "curse of dimensionality": feature redundancy reduces both classification efficiency and model interpretability. This study introduces AIMarkerFinder, a novel method for analyzing metabolomic data to identify key biomarkers. The method is based on a denoising autoencoder with an attention mechanism (DAE), which extracts informative features and eliminates redundant ones. In experiments on metabolomic data from glioblastoma and adjacent tissue, AIMarkerFinder reduced dimensionality from 446 features to 4 key features while improving classification accuracy. Using the selected metabolites (Malonyl-CoA, Glycerophosphocholine, SM(d18:1/22:0 OH), GC(18:1/24:1)), the Random Forest and Kolmogorov-Arnold Network (KAN) models achieved accuracies of 0.904 and 0.937, respectively. The analytical formulas derived by KAN make the model interpretable, which is critical for biomedical research. The proposed approach is also applicable to genomics, transcriptomics, proteomics, and the study of how exogenous factors affect biological processes. The results open new prospects for personalized medicine and early disease diagnosis. AIMarkerFinder is available at https://github.com/dps123/AIMarkerFinder
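To make the feature-selection idea concrete, the following is a minimal sketch of a denoising autoencoder with a feature-wise attention gate, written in plain Python. It is not the AIMarkerFinder implementation: the abstract does not specify the architecture, so the linear encoder and decoder, the softmax gate, the synthetic data, and every name and hyperparameter here (`train_attention_dae`, `make_synthetic`, `hidden`, `noise_std`, `top_k`) are assumptions chosen for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_synthetic(n=60, d=6, seed=1):
    """Hypothetical data: two correlated features plus independent noise features."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.gauss(0, 1)
        correlated = [latent + rng.gauss(0, 0.1) for _ in range(2)]
        independent = [rng.gauss(0, 1) for _ in range(d - 2)]
        rows.append(correlated + independent)
    return rows


def train_attention_dae(data, hidden=2, noise_std=0.3, lr=0.005, epochs=30, seed=0):
    """Train a linear denoising autoencoder with a softmax feature gate.

    Returns the final attention weights, one per input feature.
    """
    rng = random.Random(seed)
    d = len(data[0])
    attn = [0.0] * d  # attention logits; the gate starts uniform
    w_enc = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(hidden)]
    w_dec = [[rng.gauss(0, 0.1) for _ in range(hidden)] for _ in range(d)]
    for _ in range(epochs):
        for x in data:
            # Corrupt the input; the reconstruction target is the clean sample.
            noisy = [v + rng.gauss(0, noise_std) for v in x]
            gate = softmax(attn)
            z = [d * gate[j] * noisy[j] for j in range(d)]
            h = [sum(w_enc[k][j] * z[j] for j in range(d)) for k in range(hidden)]
            recon = [sum(w_dec[i][k] * h[k] for k in range(hidden)) for i in range(d)]
            err = [recon[i] - x[i] for i in range(d)]
            # Manual backpropagation of the loss 0.5 * sum(err ** 2).
            grad_h = [sum(w_dec[i][k] * err[i] for i in range(d)) for k in range(hidden)]
            grad_gate = [
                d * noisy[j] * sum(grad_h[k] * w_enc[k][j] for k in range(hidden))
                for j in range(d)
            ]
            mean_grad = sum(gate[j] * grad_gate[j] for j in range(d))
            for i in range(d):
                for k in range(hidden):
                    w_dec[i][k] -= lr * err[i] * h[k]
            for k in range(hidden):
                for j in range(d):
                    w_enc[k][j] -= lr * grad_h[k] * z[j]
            # Softmax Jacobian: dL/da_j = g_j * (dL/dg_j - sum_l g_l * dL/dg_l).
            for j in range(d):
                attn[j] -= lr * gate[j] * (grad_gate[j] - mean_grad)
    return softmax(attn)


data = make_synthetic()
weights = train_attention_dae(data)
# Rank features by attention weight and keep the two highest as candidate markers.
top_k = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:2]
```

In this sketch the attention weights serve as a feature ranking, and the retained features would then be passed to a downstream classifier such as a Random Forest or a KAN. Which features end up on top depends on the data, the noise level, and the training schedule, so the ranking from a toy run like this one should not be read as meaningful.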