A Comparative Analysis of Optimization Methods for Classification on Various Datasets
Abstract
Optimization, the study of the conditions under which a variety of mathematical structures can be analyzed through the minimization or maximization of a function, is often seen as the heart of mathematics. In deep learning (DL), optimization broadly encompasses hyperparameter tuning and the adjustment of weights and biases until the loss or cost function (J) converges, with the aim of improving a model's performance, prediction accuracy, and reliability in tasks such as classification and regression. In recent years, the stochastic gradient descent algorithm and its variants, collectively known as adaptive gradient methods, have become widely used, each offering varying degrees of success. This study provides a thorough comparison of these methods with respect to convergence speed and Cross-Entropy Loss (CEL) on classification tasks, evaluating SGD, Momentum SGD, RMSProp, Adam, Adagrad, Adadelta, Adamax, Nadam, and AMSGrad across three CNN architectures on the MNIST, Fashion-MNIST, and CIFAR-10 datasets for 30 epochs. On MNIST (CNN-2), Momentum SGD achieved an accuracy of 1.0000 with a loss of 0.0000, and reached 0.9672 on CIFAR-10 (CNN-2); RMSProp achieved 0.9714 on Fashion-MNIST (CNN-1) and 0.9582 on CIFAR-10 (CNN-1); Adam reached 0.9898 on Fashion-MNIST (CNN-2) and 0.9733 on CIFAR-10 (CNN-2). Nadam also performed relatively well on all three datasets. In contrast, Adagrad and Adadelta showed poor results; for example, Adadelta reached only 0.2576 accuracy with a loss of 2.0727 on CIFAR-10 (CNN-1) and 0.6890 accuracy on Fashion-MNIST (CNN-1). Overall, the best-performing optimizers were SGD, RMSProp, Adam, and Nadam, while Adagrad and Adadelta consistently underperformed.
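To make the experimental protocol concrete, the sketch below illustrates one way such a comparison could be set up in Keras: the same small CNN is retrained from scratch with each optimizer on MNIST for 30 epochs, and the final cross-entropy loss and accuracy are recorded. This is a minimal sketch under assumed settings, not the paper's actual code; the architecture, learning rates, and batch size shown are illustrative stand-ins for the CNN-1/CNN-2 configurations described in the study.

```python
# Minimal sketch (assumptions, not the authors' implementation): compare
# optimizers on MNIST by retraining the same small CNN with each one.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0

def build_cnn():
    # Illustrative stand-in for one of the paper's CNN architectures.
    return keras.Sequential([
        keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])

# Optimizers compared in the study; hyperparameters here are assumptions.
optimizers = {
    "SGD": keras.optimizers.SGD(learning_rate=0.01),
    "Momentum SGD": keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    "RMSProp": keras.optimizers.RMSprop(),
    "Adam": keras.optimizers.Adam(),
    "Adagrad": keras.optimizers.Adagrad(),
    "Adadelta": keras.optimizers.Adadelta(),
    "Adamax": keras.optimizers.Adamax(),
    "Nadam": keras.optimizers.Nadam(),
    "AMSGrad": keras.optimizers.Adam(amsgrad=True),
}

results = {}
for name, opt in optimizers.items():
    model = build_cnn()
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",  # Cross-Entropy Loss (CEL)
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train, epochs=30, batch_size=128,
                        validation_data=(x_test, y_test), verbose=0)
    results[name] = (history.history["val_loss"][-1],
                     history.history["val_accuracy"][-1])

for name, (loss, acc) in results.items():
    print(f"{name}: val loss {loss:.4f}, val accuracy {acc:.4f}")
```

The same loop could be repeated for Fashion-MNIST and CIFAR-10 (with the input shape adjusted to 32x32x3) and for each CNN architecture to reproduce the kind of optimizer-by-dataset comparison summarized above.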