Artificial Discourse

Were RNNs All We Needed?


This research paper revisits traditional recurrent neural networks (RNNs), specifically LSTMs and GRUs, and shows how to adapt them for modern parallel training. The authors demonstrate that by removing the gates' dependence on the previous hidden state, these models can be trained with the parallel scan algorithm, making training significantly faster than for their traditional counterparts. The paper then compares these simplified LSTMs and GRUs (minLSTMs and minGRUs) against recent state-of-the-art sequence models on several tasks, including Selective Copying, reinforcement learning, and language modeling. The results show that minLSTMs and minGRUs achieve performance comparable to or better than these models while being far more efficient to train, suggesting that RNNs may still be a viable option even in the era of Transformers.
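To make the idea concrete, here is a minimal sketch (not the authors' code) of a minGRU-style layer. Because the update gate z_t and the candidate state depend only on the input x_t, the recurrence h_t = (1 - z_t) * h_{t-1} + z_t * h~_t is linear in h and can be evaluated for all time steps at once; the cumprod/cumsum closed form below is just the simplest way to show that parallelism (the paper itself uses a numerically stable log-space parallel scan), and all layer names and sizes are illustrative.

```python
import torch
import torch.nn as nn


class MinGRU(nn.Module):
    """Sketch of a minimal GRU whose gate and candidate ignore h_{t-1}."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.to_z = nn.Linear(input_size, hidden_size)  # gate from input only
        self.to_h = nn.Linear(input_size, hidden_size)  # candidate from input only

    def forward(self, x: torch.Tensor, h0: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_size), h0: (batch, hidden_size)
        z = torch.sigmoid(self.to_z(x))          # update gate, (B, T, H)
        h_tilde = self.to_h(x)                   # candidate state, (B, T, H)
        a, b = 1.0 - z, z * h_tilde              # h_t = a_t * h_{t-1} + b_t

        # Closed-form solution of the linear recurrence, computable without a
        # sequential loop: h_t = A_t * (h_0 + sum_{k<=t} b_k / A_k), A_t = prod_{k<=t} a_k.
        # The clamp only guards against underflow in this toy version.
        A = torch.cumprod(a, dim=1)
        h = A * (h0.unsqueeze(1) + torch.cumsum(b / A.clamp_min(1e-12), dim=1))
        return h                                  # hidden states for every time step


if __name__ == "__main__":
    model = MinGRU(input_size=8, hidden_size=16)
    x = torch.randn(2, 32, 8)
    h0 = torch.zeros(2, 16)
    print(model(x, h0).shape)  # torch.Size([2, 32, 16])
```

Since none of the per-step quantities feed back into the gates, every time step's a_t and b_t can be computed in one batched pass, which is what allows the scan-based training the episode describes.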


Artificial Discourse, by Kenpachi