Bayesian PASA: Provably Stable Adaptive Activation with Uncertainty Quantification


Abstract

The choice of activation function is a fundamental design decision in deep learning, yet most popular options, such as ReLU, GELU, and Swish, are static and treat all inputs uniformly. This one-size-fits-all approach breaks down in the presence of noisy or corrupted data, where the optimal non-linearity should depend on the input's statistical context. In this paper, we introduce Bayesian Probabilistic Adaptive Sigmoidal Activation (Bayesian PASA), a novel activation function that dynamically adapts its behavior based on the input's uncertainty. Bayesian PASA frames activation selection as a Bayesian model averaging problem, adaptively mixing sigmoidal, linear, and noise-aware behaviors. The mixing weights are derived from a principled variational evidence lower bound (ELBO), regularized by a stable ψ-function that guarantees bounded influence from noise estimates. We provide three formal theorems proving its Lipschitz continuity, gradient stability, and convergence under standard training assumptions. On the challenging CIFAR-100 benchmark, Bayesian PASA achieves a test accuracy of 76.38%, outperforming ReLU (75.68%), GELU (75.98%), and the original PASA (75.53%). On the corrupted CIFAR-10-C dataset, the full Bayesian PASA model combined with Bayesian R-LayerNorm achieves an average accuracy of 53.91%, a +1.87% improvement over the ReLU+LayerNorm baseline. This work provides a drop-in replacement for existing activations, offering not only improved performance but also built-in uncertainty quantification for more robust deep learning systems.
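The mixing mechanism described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the function names, the evidence scores, and the `noise_scale` parameter are assumptions, since the abstract does not give the ELBO-derived weights or the exact ψ-function. The sketch only shows the general shape of the idea, namely a convex combination of sigmoidal, linear, and bounded-influence branches with input-dependent weights.

```python
import math

def bayesian_pasa_sketch(x: float, noise_scale: float = 0.1) -> float:
    """Illustrative mix of sigmoidal, linear, and noise-damped branches.

    The real method derives the mixing weights from a variational ELBO;
    here we use simple hand-crafted evidence scores passed through a
    softmax, purely to show the model-averaging structure.
    """
    # Three candidate behaviors for the activation.
    sigmoidal = 1.0 / (1.0 + math.exp(-x))      # bounded, saturating
    linear = x                                   # identity pass-through
    damped = x / (1.0 + noise_scale * abs(x))    # bounded-influence, psi-like

    # Hypothetical evidence scores: prefer saturation for large |x|,
    # linearity near zero, and damping when the noise estimate is large.
    scores = [abs(x) - 1.0, -abs(x), math.log(noise_scale + 1e-8) + 2.0]

    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    w = [e / z for e in exps]

    # Bayesian-model-averaging style convex combination.
    return w[0] * sigmoidal + w[1] * linear + w[2] * damped
```

Because the weights form a convex combination, the output always lies between the smallest and largest branch value, which is the intuition behind the bounded-influence guarantee the paper formalizes.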
