S3 and S4: Novel Hybrid Activation Functions for Deep Neural Networks
Abstract
Activation functions are critical components of deep neural networks, directly influencing gradient flow, training stability, and model performance. Traditional functions such as ReLU suffer from the dead neuron problem, while sigmoid and tanh exhibit vanishing gradients. We introduce two novel hybrid activation functions: S3 (Sigmoid-Softsign) and its improved version S4 (smoothed S3). S3 combines sigmoid for negative inputs with Softsign for positive inputs, while S4 employs a smooth transition mechanism controlled by a steepness parameter k. Comprehensive experiments were conducted across binary classification, multi-class classification, and regression tasks using three different neural network architectures. S4 outperformed nine baseline activation functions, achieving 97.4% accuracy on MNIST, 95.4% on the Iris dataset, and 18.7 ± 1.2 MSE on Boston Housing regression. Moreover, S4 converged faster and maintained more stable gradient flow, reducing training time by 28.8% on average compared with ReLU across the tested network depths. Comparative analysis revealed that S4 maintained gradient magnitudes within the [0.24, 0.59] range while avoiding the dead neuron problem that affects 18% of ReLU neurons. The S4 activation function addresses key limitations of existing functions through its hybrid design and smooth transition mechanism. The tunable parameter k allows adaptation to different tasks and network depths, making S4 a versatile choice for deep learning applications. These findings indicate that hybrid activation functions are a viable approach to improving neural network training dynamics.
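A minimal sketch of the hybrid design described above, assuming a piecewise form for S3 and a sigmoid-gated blend for S4's smooth transition. The abstract does not give the exact formulas, so the branch offset, the gating mechanism, and the names `s3`, `s4`, and `k` below are illustrative assumptions, not the paper's definitions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softsign(x):
    return x / (1.0 + np.abs(x))

def s3(x):
    # Assumed piecewise form: sigmoid branch for negative inputs,
    # Softsign branch shifted/scaled to meet sigmoid(0) = 0.5 at x = 0.
    return np.where(x < 0, sigmoid(x), 0.5 + 0.5 * softsign(x))

def s4(x, k=1.0):
    # Assumed smoothed version: a sigmoid gate with steepness k blends
    # the two branches instead of switching abruptly at x = 0.
    gate = sigmoid(k * x)
    return (1.0 - gate) * sigmoid(x) + gate * (0.5 + 0.5 * softsign(x))

if __name__ == "__main__":
    xs = np.linspace(-5.0, 5.0, 11)
    print(s3(xs))
    print(s4(xs, k=2.0))
```

Under this assumed construction, larger values of k make the S4 blend approach the piecewise S3 switch, while smaller values spread the transition over a wider input range, which is consistent with the abstract's description of k as a tunable steepness parameter.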