S3 and S4: Novel Hybrid Activation Functions for Deep Neural Networks
Abstract
Activation functions are critical components of deep neural networks, directly influencing gradient flow, training stability, and model performance. Traditional functions such as ReLU suffer from the dead neuron problem, while sigmoid and tanh exhibit vanishing gradients. We introduce two novel hybrid activation functions: S3 (Sigmoid-Softsign) and its improved version S4 (smoothed S3). S3 combines sigmoid for negative inputs with Softsign for positive inputs, while S4 employs a smooth transition mechanism controlled by a steepness parameter k. Comprehensive experiments were conducted across binary classification, multi-class classification, and regression tasks using three different neural network architectures. S4 outperformed nine baseline activation functions, achieving 97.4% accuracy on MNIST, 95.4% on the Iris dataset, and 18.7 ± 1.2 MSE on Boston Housing regression. Moreover, S4 converged faster and maintained more stable gradient flow, reducing training time by 28.8% on average compared with ReLU across the tested network depths. Comparative analysis revealed that S4 maintained gradient magnitudes within the [0.24, 0.59] range while avoiding the dead neuron problem that affects 18% of ReLU neurons. The S4 activation function addresses key limitations of existing functions through its hybrid design and smooth transition mechanism. The tunable parameter k allows adaptation to different tasks and network depths, making S4 a versatile choice for deep learning applications. These findings indicate that hybrid activation functions are a viable approach to improving neural network training dynamics.
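A minimal sketch of the hybrid design described above, assuming a piecewise form for S3 and a sigmoid-gated blend for S4's smooth transition. The abstract does not give the exact formulas, so the branch offset, the gating mechanism, and the names `s3`, `s4`, and `k` below are illustrative assumptions, not the paper's definitions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softsign(x):
    return x / (1.0 + np.abs(x))

def s3(x):
    # Assumed piecewise form: sigmoid branch for negative inputs,
    # Softsign branch shifted/scaled to meet sigmoid(0) = 0.5 at x = 0.
    return np.where(x < 0, sigmoid(x), 0.5 + 0.5 * softsign(x))

def s4(x, k=1.0):
    # Assumed smoothed version: a sigmoid gate with steepness k blends
    # the two branches instead of switching abruptly at x = 0.
    gate = sigmoid(k * x)
    return (1.0 - gate) * sigmoid(x) + gate * (0.5 + 0.5 * softsign(x))

if __name__ == "__main__":
    xs = np.linspace(-5.0, 5.0, 11)
    print(s3(xs))
    print(s4(xs, k=2.0))
```

Under this assumed construction, larger values of k make the S4 blend approach the piecewise S3 switch, while smaller values spread the transition over a wider input range, which is consistent with the abstract's description of k as a tunable steepness parameter.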