S3 and S4: Novel Hybrid Activation Functions for Deep Neural Networks

Abstract

Activation functions are critical components of deep neural networks, directly influencing gradient flow, training stability, and model performance. Traditional functions such as ReLU suffer from the dead neuron problem, while sigmoid and tanh exhibit vanishing gradients. We introduce two novel hybrid activation functions: S3 (Sigmoid-Softsign) and its improved version S4 (smoothed S3). S3 combines sigmoid for negative inputs with Softsign for positive inputs, while S4 employs a smooth transition mechanism controlled by a steepness parameter k. Comprehensive experiments were conducted across binary classification, multi-class classification, and regression tasks using three different neural network architectures. S4 outperformed nine baseline activation functions, achieving 97.4% accuracy on MNIST, 95.4% accuracy on the Iris dataset, and an MSE of 18.7 ± 1.2 on the Boston Housing regression task. Moreover, S4 converged faster and maintained more stable gradient flow, reducing training time by 28.8% on average compared with ReLU across the tested network depths. Comparative analysis showed that S4 kept gradient magnitudes within the range [0.24, 0.59] while avoiding the dead neuron problem that affects 18% of ReLU neurons. The S4 activation function addresses key limitations of existing functions through its hybrid design and smooth transition mechanism, and its tunable parameter k allows adaptation to different tasks and network depths, making S4 a versatile choice for deep learning applications. These findings indicate that hybrid activation functions are a viable approach to improving neural network training dynamics.
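The abstract does not give the closed forms of S3 and S4, so the sketch below is only an illustration of the described design: a piecewise function that uses the sigmoid branch for negative inputs and the Softsign branch for positive inputs, and a smoothed variant that blends the two branches with a gate of steepness k. The gating form sigmoid(k·x) and the names `s3` and `s4` are assumptions for illustration, not the authors' exact definitions.

```python
import numpy as np

def s3(x):
    """Hypothetical S3 sketch: sigmoid branch for x < 0, Softsign branch for x >= 0."""
    x = np.asarray(x, dtype=float)
    sigmoid = 1.0 / (1.0 + np.exp(-x))      # sigmoid branch
    softsign = x / (1.0 + np.abs(x))        # Softsign branch
    return np.where(x < 0, sigmoid, softsign)

def s4(x, k=1.0):
    """Hypothetical S4 sketch: smooth blend of the two S3 branches.

    A sigmoid gate sigma(k*x) interpolates from the sigmoid branch (negative
    inputs) to the Softsign branch (positive inputs); larger k gives a sharper
    transition around zero. The gating form is an assumption, not the paper's
    published formula.
    """
    x = np.asarray(x, dtype=float)
    gate = 1.0 / (1.0 + np.exp(-k * x))     # smooth switch between branches
    sigmoid = 1.0 / (1.0 + np.exp(-x))
    softsign = x / (1.0 + np.abs(x))
    return (1.0 - gate) * sigmoid + gate * softsign

# Quick check of both functions on a few sample points.
xs = np.linspace(-3.0, 3.0, 7)
print("S3:     ", np.round(s3(xs), 3))
print("S4 (k=2):", np.round(s4(xs, k=2.0), 3))
```

Under these assumptions, the piecewise S3 is not continuous at zero (sigmoid(0) = 0.5 versus Softsign(0) = 0), which is consistent with the abstract's motivation for S4: the gated blend removes the hard switch, and k controls how quickly the function moves from one branch to the other.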
