GRAIN: Gated Recurrent Adaptive Integration Network
Abstract
In deep learning, recurrent architectures such as the GRU and LSTM are widely used for sequential data processing, but they can suffer from overfitting, poor generalisation, and unstable hidden-state transitions during training. In this paper, we introduce a modified GRU architecture that incorporates a dynamic exponentially weighted moving average (EWMA) of previous hidden states in order to stabilise the evolution of hidden-state transitions, improve generalisation, and reduce abrupt fluctuations in the hidden state, thereby increasing performance. Experiments on benchmark datasets show that the proposed architecture achieves significant accuracy improvements over existing architectures such as the vanilla LSTM, LSTM with dropout, and vanilla GRU, indicating that incorporating adaptive temporal smoothing within recurrent updates can enhance the robustness and stability of deep sequence models without significant computational overhead.
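To make the idea concrete, the following is a minimal sketch of a GRU cell whose recurrent input is an EWMA of past hidden states rather than the raw previous state. The class name, weight layout, and the smoothing factor `beta` are illustrative assumptions for exposition; the paper's exact gating and smoothing formulation may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class EWMASmoothedGRUCell:
    """Sketch of a GRU cell that feeds an EWMA of past hidden states
    into the gate computations (hypothetical formulation)."""

    def __init__(self, input_size, hidden_size, beta=0.9, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_size)
        # One weight matrix per gate: update (z), reset (r), candidate (n).
        # Each acts on the concatenation [x_t ; h_smoothed].
        self.Wz = rng.uniform(-s, s, (hidden_size, input_size + hidden_size))
        self.Wr = rng.uniform(-s, s, (hidden_size, input_size + hidden_size))
        self.Wn = rng.uniform(-s, s, (hidden_size, input_size + hidden_size))
        self.beta = beta  # EWMA smoothing factor (assumed fixed here)

    def forward(self, xs):
        hidden_size = self.Wz.shape[0]
        h = np.zeros(hidden_size)
        h_ewma = np.zeros(hidden_size)  # running EWMA of hidden states
        for x in xs:
            # Smooth the recurrent input with the EWMA of all past states,
            # damping abrupt jumps in the hidden-state trajectory.
            h_ewma = self.beta * h_ewma + (1.0 - self.beta) * h
            xh = np.concatenate([x, h_ewma])
            z = sigmoid(self.Wz @ xh)  # update gate
            r = sigmoid(self.Wr @ xh)  # reset gate
            n = np.tanh(self.Wn @ np.concatenate([x, r * h_ewma]))  # candidate
            h = (1.0 - z) * h_ewma + z * n  # smoothed state transition
        return h
```

Because the EWMA is a running average, the extra cost per step is one elementwise blend of two vectors, consistent with the claim that the smoothing adds no significant computational overhead.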