Functional Trust Regions (FTR): A Lagrangian Framework for Stability-Constrained Continual Learning
Abstract
The stability-plasticity tradeoff in continual learning is widely assumed to be architecture-dependent: models with higher loss-landscape curvature or larger parameter counts should require stronger regularization. We empirically challenge this assumption through extensive experiments on Functional Trust Regions (FTR), a method that enforces explicit KL-divergence constraints on functional drift during sequential task learning. Conducting 1,200 experiments across eight architectures spanning a twenty-four-fold parameter range and a forty-eight-fold variation in Hessian trace, we identify a stability crossover at ε* = 7.15 ± 0.35 (coefficient of variation: 4.96 percent) that is architecture-independent to within measurement precision. A formal F-test for constancy yields p = 0.786, indicating that between-architecture variance is statistically indistinguishable from measurement noise. Crucially, all ten tested curvature-based normalizations (including Hessian trace, Fisher trace, spectral norm, and effective dimensionality) increase cross-architecture dispersion rather than reduce it, and no curvature metric achieves a statistically significant correlation with ε* (all p > 0.06). Cross-method analysis reveals that Learning without Forgetting (LwF) exhibits moderately architecture-dependent transitions (coefficient of variation approximately 14 percent), while Elastic Weight Consolidation (EWC) shows no phase transition across four orders of magnitude of regularization strength. These results indicate that stability crossovers in distillation-based constrained learning arise from task structure rather than model geometry, and that widely accepted curvature-based intuitions fail to predict the critical stability budget.
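As a minimal sketch of the constrained objective the abstract describes (our notation, not necessarily the paper's: $f_{\theta_{t-1}}$ denotes the model after the previous task, $\mathcal{L}_{\text{task}}$ the current-task loss, and $\mathcal{D}$ the input distribution over which functional drift is measured):

$$
\min_{\theta}\ \mathcal{L}_{\text{task}}(\theta)
\quad \text{subject to} \quad
\mathbb{E}_{x \sim \mathcal{D}}\!\big[\mathrm{KL}\big(f_{\theta_{t-1}}(x)\,\big\|\,f_{\theta}(x)\big)\big] \le \varepsilon,
$$

handled via the Lagrangian $\mathcal{L}(\theta,\lambda) = \mathcal{L}_{\text{task}}(\theta) + \lambda\big(\mathbb{E}_{x}\big[\mathrm{KL}(f_{\theta_{t-1}}(x)\,\|\,f_{\theta}(x))\big] - \varepsilon\big)$ with multiplier $\lambda \ge 0$. On this reading, the crossover $\varepsilon^{*}$ reported above is the value of the stability budget $\varepsilon$ at which solutions transition between the stability-dominated and plasticity-dominated regimes.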