PSVM-MR: A Parallel Support Vector Machine Algorithm Based on MapReduce

Abstract

In big data environments, a parallel Support Vector Machine (SVM) can significantly speed up training. However, prior parallel SVM approaches suffer from three problems: excessive deviation of the subset distributions, inadequate parallel training performance, and poor filtering of non-support vectors. To address these problems, this paper proposes a parallel SVM algorithm based on MapReduce (PSVM-MR). First, a data partition method based on relative entropy (DP-RE) computes relative entropy to prevent excessive deviation of the subset distributions. Second, a redundancy level-removing method based on cosine similarity (RLR-CS) improves parallel training performance by eliminating redundant levels in the cascade structure. Finally, a non-support vector filtering method (NSVF) improves the filtering of non-support vectors by combining rough identification with singular vector identification. The proposed algorithm demonstrates higher parallel efficiency and lower training costs than the general parallel SVM algorithm.
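The abstract names two standard measures, relative entropy and cosine similarity, without giving the details of how DP-RE or RLR-CS apply them. The sketch below only illustrates the measures themselves. It is not the paper's procedure: the function names, the use of class-label distributions, and the `eps` smoothing constant are all assumptions made for illustration.

```python
import math
from collections import Counter


def label_distribution(labels, classes):
    """Empirical class distribution of `labels`, ordered by `classes` (hypothetical helper)."""
    counts = Counter(labels)
    total = len(labels)
    return [counts[c] / total for c in classes]


def relative_entropy(p, q, eps=1e-12):
    """Relative entropy (KL divergence) D(p || q) in nats.

    `eps` is an assumed smoothing constant that avoids division by zero
    when q assigns zero probability to a class.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data: a balanced full dataset and two candidate subsets.
classes = [1, -1]
full = label_distribution([1] * 50 + [-1] * 50, classes)
balanced_subset = label_distribution([1] * 10 + [-1] * 10, classes)
skewed_subset = label_distribution([1] * 18 + [-1] * 2, classes)

# A subset whose distribution matches the full data has zero relative entropy;
# a skewed subset has a strictly positive value.
balanced_score = relative_entropy(balanced_subset, full)
skewed_score = relative_entropy(skewed_subset, full)
```

Under these assumptions, a partitioner could reject or rebalance subsets whose relative entropy against the full dataset is large, and a cascade could compare vectors from adjacent levels with `cosine_similarity`. Which quantities the paper actually compares is not stated in the abstract.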
