GRAIN: Gated Recurrent Adaptive Integration Network

Abstract

In deep learning, recurrent architectures such as GRU and LSTM are widely used for sequential data processing, but they are prone to overfitting, poor generalisation, and unstable hidden-state transitions during training. In this paper, we introduce a modified GRU architecture that incorporates a dynamic exponentially weighted moving average (EWMA) of previous hidden states to stabilise the evolution of hidden-state transitions, improve generalisation, and reduce abrupt fluctuations in the hidden state, thereby improving performance. Experiments on benchmark datasets show that the proposed architecture achieves significant accuracy improvements over existing architectures such as the vanilla LSTM, LSTM with dropout, and the vanilla GRU, indicating that adaptive temporal smoothing within recurrent updates can enhance the robustness and stability of deep sequence models without significant computational overhead.
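As a rough illustration of the idea described above, the sketch below blends each raw GRU update with an EWMA of past hidden states. The abstract does not give the paper's exact formulation, so the learnable per-unit smoothing coefficient (`beta_logit`), how the EWMA is made "dynamic", and the choice to reuse the smoothed state as the next hidden state are all assumptions made here for illustration, not the authors' definitive method.

```python
import torch
import torch.nn as nn

class EWMAGRUCell(nn.Module):
    """Minimal sketch: a GRU cell whose hidden state is smoothed by an
    exponentially weighted moving average (EWMA) of previous states.

    Assumption: the smoothing coefficient is a learnable per-unit
    parameter squashed to (0, 1); the paper's actual dynamic scheme
    is not specified in the abstract.
    """

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        # Learnable logit per hidden unit; sigmoid keeps beta in (0, 1).
        self.beta_logit = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, h_prev, m_prev):
        """x: (batch, input_size); h_prev, m_prev: (batch, hidden_size).
        m_prev carries the running EWMA of past hidden states."""
        h_raw = self.cell(x, h_prev)              # standard GRU update
        beta = torch.sigmoid(self.beta_logit)     # smoothing coefficient
        m = beta * m_prev + (1.0 - beta) * h_raw  # EWMA update
        # Reusing the smoothed state as the next hidden state damps
        # abrupt step-to-step transitions (an assumed design choice).
        return m, m

# Usage over one step of a toy sequence:
cell = EWMAGRUCell(input_size=8, hidden_size=16)
x = torch.randn(4, 8)
h = m = torch.zeros(4, 16)
h, m = cell(x, h, m)
```

With beta close to 1 the state changes slowly and is heavily smoothed; with beta close to 0 the cell reduces to a standard GRU, so the learnable coefficient lets each unit choose how much temporal smoothing it receives.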
