A Support Vector Machine Based Artificial Intelligence Technique Using Genetic Algorithms to Screen Metabolites Associated with Heart Disease in the Qatari Population
Abstract
Feature selection algorithms are attracting growing interest among researchers aiming to connect specific features in a dataset with specific classifications. Recent developments in Support Vector Machine-based artificial intelligence algorithms have demonstrated excellent classification performance on highly nonlinear data. However, identifying which features contribute most to classification remains challenging, especially when datasets include hundreds of variables. As a first step, features must be screened to narrow down the set for deeper analysis. Metabolomics datasets are one such case, where many features must be examined to determine those associated with heart disease diagnosis. This work applies a Genetic Algorithm, incorporating a penalized likelihood approach with Support Vector Machines for mutation, to stochastically search the feature space. A large-scale simulation study demonstrates that the proposed method achieves a high true feature identification rate while maintaining a reasonable false identification rate. The method is then applied to a Qatar BioBank dataset focused on heart disease, reducing the number of candidate metabolites from 232 to 37.
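The general idea of a stochastic search over feature subsets can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the SVM is a small linear one trained by Pegasos-style sub-gradient descent, it is used to score candidates rather than inside the mutation step as the abstract describes, and the penalized likelihood is replaced by a simpler stand-in (hold-out accuracy minus a per-feature penalty). The names `train_linear_svm`, `fitness`, `genetic_search`, and `make_data`, all parameter values, and the synthetic data are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=20):
    """Pegasos-style sub-gradient descent for a linear SVM without bias; labels in {-1, +1}."""
    rng = random.Random(0)  # fixed seed so repeated fits are reproducible
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink weights (regularization), then step on hinge-loss violations.
            w = [(1 - eta * lam) * wj for wj in w]
            if margin < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def fitness(mask, X_tr, y_tr, X_te, y_te, penalty=0.01):
    """Assumed stand-in objective: hold-out accuracy minus a penalty per selected feature."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    sub_tr = [[row[j] for j in cols] for row in X_tr]
    sub_te = [[row[j] for j in cols] for row in X_te]
    w = train_linear_svm(sub_tr, y_tr)
    preds = [1 if sum(wj * xj for wj, xj in zip(w, row)) >= 0 else -1 for row in sub_te]
    acc = sum(p == t for p, t in zip(preds, y_te)) / len(y_te)
    return acc - penalty * len(cols)

def genetic_search(X_tr, y_tr, X_te, y_te, pop_size=12, generations=8, p_mut=0.1, seed=0):
    """Evolve boolean feature masks with elitist selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(X_tr[0])
    score = lambda m: fitness(m, X_tr, y_tr, X_te, y_te)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover position
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=score)
    return best, score(best)

def make_data(n_samples=80, n_features=6, seed=1):
    """Synthetic data in which only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.choice([-1, 1])
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_data()
best_mask, best_score = genetic_search(X[:60], y[:60], X[60:], y[60:])
print(best_mask, round(best_score, 3))
```

The per-feature penalty plays the role of a sparsity pressure: a mask that adds uninformative features must buy enough accuracy to offset the cost, which is the intuition behind screening a large candidate set down to a smaller one. A real metabolomics application would need cross-validation rather than a single hold-out split, and a kernel SVM if the class boundary is nonlinear.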