Fuzzy Feature Selection Using Fuzzy C-Means Clustering and Recursive Feature Elimination (FCM-RFE)

Phichsinee Khongja
Amit Kumar Saxena
Damodar Patel
Phumin Sumalai

Abstract

In machine learning, feature selection is crucial for reducing dimensionality and computing costs, improving generalization, and making models easier to interpret. Traditional approaches often struggle with high-dimensional data because of multicollinearity and redundancy among features. To address these problems, we propose a hybrid framework, Fuzzy Feature Selection using Fuzzy C-Means Clustering and Recursive Feature Elimination (FCM-RFE), which combines fuzzy logic with filter and wrapper approaches. First, fuzzy C-means clustering groups related features into soft clusters to capture complex relationships among them. Then, within each cluster, Recursive Feature Elimination with Random Forest (RFE-RF) repeatedly eliminates the less significant features. For more precise selection, a fuzzy membership-based scoring system ranks features by the strength of their association with their cluster. Experiments on 18 benchmark datasets with KNN and SVM classifiers evaluated metrics including accuracy, precision, recall, F1-score, specificity, and AUC-ROC. The proposed approach maintained or improved performance while substantially reducing dimensionality, selecting on average only 4.1% of the original features. With FCM-RFE, the maximum accuracy was 92.75% for SVM and 89% for KNN. The method outperformed eight state-of-the-art techniques and was computationally efficient. Because it improves interpretability and scalability as well as classification performance, the framework is suitable for high-dimensional data analysis across disciplines.
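
The first stage described in the abstract, soft clustering of features, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names, the z-score preprocessing, the farthest-point seeding, and the fuzzifier m = 2 are all assumptions made for illustration. The abstract does not give the membership-based scoring formula, so the sketch only reports each feature's strongest cluster and its membership value; the RFE-RF stage is omitted.

```python
import math


def standardize(column):
    """Z-score one feature column so clustering compares shape, not scale."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column)) or 1.0
    return [(v - mean) / std for v in column]


def farthest_point_init(points, n_clusters):
    """Deterministic seeding (an assumption, not from the paper)."""
    centers = [list(points[0])]
    while len(centers) < n_clusters:
        # Pick the point whose nearest chosen center is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def update_memberships(points, centers, m):
    """Standard FCM update: u_ik is proportional to d_ik ** (-2 / (m - 1))."""
    memberships = []
    for p in points:
        inv = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


def update_centers(points, memberships, m):
    """Each center is the mean of all points weighted by u_ik ** m."""
    n_clusters, dim = len(memberships[0]), len(points[0])
    centers = []
    for k in range(n_clusters):
        weights = [u[k] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


def cluster_features(columns, n_clusters=2, m=2.0, n_iter=100, tol=1e-6):
    """Treat each feature (column) as one point and return its soft memberships."""
    points = [standardize(col) for col in columns]
    centers = farthest_point_init(points, n_clusters)
    memberships = update_memberships(points, centers, m)
    for _ in range(n_iter):
        centers = update_centers(points, memberships, m)
        new = update_memberships(points, centers, m)
        shift = max(
            abs(a - b) for r_new, r_old in zip(new, memberships) for a, b in zip(r_new, r_old)
        )
        memberships = new
        if shift < tol:
            break
    return memberships


# Hypothetical toy data: f1 roughly tracks f0, and f3 roughly tracks f2.
features = {
    "f0": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "f1": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "f2": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    "f3": [10.2, 2.1, 7.9, 4.0, 12.1, 6.2],
}
memberships = cluster_features(list(features.values()))
for name, row in zip(features, memberships):
    cluster = max(range(len(row)), key=row.__getitem__)
    print(name, "cluster", cluster, "membership", round(row[cluster], 3))
```

In a fuller pipeline, each cluster's features would then go through recursive feature elimination with a random forest (for example scikit-learn's `RFE` wrapped around `RandomForestClassifier`), and the membership values could feed whatever scoring rule the full article defines.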

Version published to 10.21203/rs.3.rs-7984249/v1 on Research Square
Nov 13, 2025

Related articles

Hybrid Machine Learning and Nature-Inspired Optimization for Robust and Accurate Product Recommendations

This article has 2 authors:
1. V. Jenifer
2. T. Kamala Kannan
This article has no evaluations
Latest version Oct 7, 2025

Unsupervised and Supervised Approaches for Breast Cancer Subtype Classification: Hierarchical Clustering and Machine Learning with Hyperparameter Optimization

This article has 3 authors:
1. Ana Beatriz Miranda Valentin
2. Glaucia Maria Bressan
3. Elisângela Lizzi
This article has no evaluations
Latest version Nov 18, 2025

KOS: Kernel-based Optimal Subspaces Method for Data Classification

This article has 1 author:
1. Lakhdar Remaki
This article has no evaluations
Latest version Oct 14, 2025
