Streaming Propagation Through Time: A New Computational Paradigm for Recurrent Neural Networks
Abstract
Recurrent Neural Networks (RNNs) are foundational to numerous advances in artificial intelligence, yet their training has for decades relied predominantly on Backpropagation Through Time (BPTT), a paradigm that imposes substantial computational and memory demands on long sequences. The inherently batch-oriented nature of BPTT further constrains RNNs' ability to learn from streaming or online data. Earlier efforts toward online RNN training have been hindered by prohibitive costs in both computation and memory. Here, we introduce Streaming Propagation Through Time (SPTT), a new computational paradigm for RNN training. SPTT employs a streaming low-rank matrix decomposition to decouple gradient computation into two independent components: an optimization direction and an update magnitude. This incremental exploration of the gradient landscape enables efficient long-sequence processing while maintaining learning continuity. Across diverse sequence modeling benchmarks, SPTT outperforms BPTT, with stronger generalization and improved computational efficiency, thereby opening new possibilities for real-time and resource-constrained RNN applications.
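The abstract does not specify SPTT's exact decomposition, so the following is only a minimal rank-1 sketch of the general idea it describes: maintaining a streaming low-rank estimate of the recurrent gradient and decoupling the weight update into a direction (from the smoothed low-rank estimate) and a magnitude (from the current error). The toy task, variable names, and one-step gradient truncation are all illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_h = 4, 8
W_x = rng.normal(0, 0.3, (n_h, n_in))   # input weights (held fixed here)
W_h = rng.normal(0, 0.3, (n_h, n_h))    # recurrent weights, trained online
w_out = rng.normal(0, 0.3, n_h)         # readout weights (held fixed here)

h = np.zeros(n_h)
u = np.zeros(n_h)                       # streaming rank-1 factor (left)
v = np.zeros(n_h)                       # streaming rank-1 factor (right)
beta, lr = 0.9, 0.05
losses = []

for t in range(200):
    x = rng.normal(size=n_in)
    target = x.sum()                    # toy streaming regression target
    h_prev = h
    h = np.tanh(W_x @ x + W_h @ h_prev)
    err = w_out @ h - target
    losses.append(err ** 2)

    # One-step-truncated instantaneous gradient of the squared error w.r.t.
    # W_h is rank-1: outer(delta, h_prev), with delta backpropped through tanh.
    delta = err * w_out * (1.0 - h ** 2)
    u = beta * u + (1.0 - beta) * delta  # exponential streaming factor updates
    v = beta * v + (1.0 - beta) * h_prev

    G = np.outer(u, v)                  # low-rank (rank-1) gradient estimate
    g_norm = np.linalg.norm(G)
    if g_norm > 1e-12:
        direction = G / g_norm          # optimization direction (unit norm)
        magnitude = abs(err)            # update magnitude, decoupled from direction
        W_h -= lr * magnitude * direction

print("final loss:", losses[-1])
```

Because the factors `u` and `v` are vectors, the memory cost of the gradient state is O(n_h) rather than the O(n_h * T) of storing hidden states for BPTT over a length-T window, which is the efficiency property the abstract emphasizes.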