Arctic Puffin Algorithm with Local Escaping Operator for Feature Selection in High-Dimensional Datasets

Abstract

High dimensionality makes data mining on large datasets more difficult. Feature selection reduces dimensionality by choosing the most informative features, with the aim of maximizing classification accuracy. This study introduces a binary adaptation of the enhanced Arctic Puffin Optimization (EAPO) algorithm, which builds on the Arctic Puffin Optimization (APO) algorithm, within a wrapper-based framework to identify the optimal subset of features for classification tasks. A core component of the binary EAPO is the transfer function, which maps the continuous search space onto a binary domain; three families of transfer functions, S-shaped, V-shaped, and U-shaped, are investigated. In addition, a local escaping operator is incorporated to improve the performance of APO by balancing exploration and exploitation more effectively. The proposed EAPO method is assessed on 12 benchmark datasets, with the k-Nearest Neighbor classifier used to evaluate candidate feature subsets. To control overfitting, k-fold cross-validation is used: the dataset is divided into k folds, and each fold serves as the test set once while the remaining folds form the training set. Comparisons are made with state-of-the-art methods, including the butterfly optimization algorithm, sine cosine algorithm, Kepler optimization algorithm, grey wolf optimizer, cuckoo optimization method, and salp swarm algorithm. The experimental results show that EAPO outperforms these methods in consistency, feature reduction, and classification accuracy. Statistical analysis further confirms the effectiveness and robustness of the proposed method.
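
As an illustration of two mechanisms the abstract describes, the sketch below shows how a transfer function can turn a continuous position vector into a binary feature mask, and how a dataset can be partitioned into k folds. The abstract does not give the exact formulas, so everything here is an assumption: the S-shaped form is the logistic sigmoid, the V-shaped form is |tanh(x)|, and the U-shaped form is alpha·|x|^beta clipped to 1, all common choices in the binary metaheuristic literature. The function names, the default `alpha` and `beta` values, and the thresholding rule in `binarize` are hypothetical and not taken from the article.

```python
import math
import random


def s_shaped(x):
    """Assumed S-shaped transfer: logistic sigmoid, computed stably."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x):
    """Assumed V-shaped transfer: |tanh(x)|, zero at x = 0."""
    return abs(math.tanh(x))


def u_shaped(x, alpha=1.0, beta=2.0):
    """Assumed U-shaped transfer: alpha * |x|**beta, clipped to 1."""
    return min(1.0, alpha * abs(x) ** beta)


def binarize(position, transfer, rng):
    """Select feature j (bit 1) when a uniform draw falls below transfer(x_j).

    This simple thresholding rule is an assumption; V-shaped functions are
    often paired with a bit-flip rule instead.
    """
    return [1 if rng.random() < transfer(x) else 0 for x in position]


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) so each fold is the test set once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


if __name__ == "__main__":
    rng = random.Random(0)
    position = [-2.0, -0.5, 0.0, 0.5, 2.0]  # hypothetical continuous position
    for name, fn in [("S", s_shaped), ("V", v_shaped), ("U", u_shaped)]:
        print(name, binarize(position, fn, rng))
    for train, test in k_fold_splits(10, 3):
        print("train", train, "test", test)
```

In a wrapper setting, each binary mask would select the columns passed to the classifier, and the classifier's cross-validated error over these folds would feed the fitness value. That wiring is left out because the abstract does not specify the fitness function.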
