This research paper argues that traditional recurrent neural networks (RNNs) such as LSTMs and GRUs, although largely displaced by Transformers in recent years, can still be effective for long-sequence modeling. The authors show that by simplifying these older architectures, specifically by removing the hidden-state dependencies from their gates and candidate states, the resulting recurrences become linear in the previous hidden state and can therefore be trained in parallel with the parallel prefix scan algorithm. This yields significant gains in training time and memory usage while achieving performance comparable to modern, more complex models such as Mamba and Transformers. The paper presents these simplified variants, called minLSTM and minGRU, and demonstrates their effectiveness on a range of tasks, including language modeling, reinforcement learning, and the Selective Copying task.
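
As a rough illustration of why dropping the gates' dependence on the previous hidden state enables parallel training, here is a minimal sketch of a minGRU-style recurrence, h_t = (1 - z_t) * h_{t-1} + z_t * h̃_t, solved with jax.lax.associative_scan. This is not the paper's implementation: the function and parameter names (mingru_parallel, w_z, w_h) are hypothetical, plain weight matrices stand in for the model's linear layers, and the log-space formulation the authors use for numerical stability is omitted.

```python
import jax
import jax.numpy as jnp

def mingru_parallel(x, w_z, w_h, h0):
    """Sketch of a minGRU-style layer trained with a parallel scan.

    Because the update gate z_t and candidate state h_tilde_t depend only
    on the input x_t (not on h_{t-1}), the recurrence
        h_t = (1 - z_t) * h_{t-1} + z_t * h_tilde_t
    is linear in h_{t-1} and can be computed with an associative scan.
    """
    z = jax.nn.sigmoid(x @ w_z)   # (T, d) update gate, input-only
    h_tilde = x @ w_h             # (T, d) candidate state, input-only
    a = 1.0 - z                   # coefficient on h_{t-1}
    b = z * h_tilde               # additive term

    # Fold the initial state into the first step: h_1 = a_1 * h_0 + b_1.
    b = b.at[0].add(a[0] * h0)

    def combine(left, right):
        # Compose two affine maps h -> a*h + b (left applied first).
        a_l, b_l = left
        a_r, b_r = right
        return a_l * a_r, a_r * b_l + b_r

    _, h = jax.lax.associative_scan(combine, (a, b))
    return h                      # all hidden states h_1 .. h_T

# Toy usage with random weights (illustrative only).
key = jax.random.PRNGKey(0)
T, d_in, d = 8, 4, 16
x = jax.random.normal(key, (T, d_in))
w_z = 0.1 * jax.random.normal(key, (d_in, d))
w_h = 0.1 * jax.random.normal(key, (d_in, d))
h = mingru_parallel(x, w_z, w_h, jnp.zeros(d))
print(h.shape)  # (8, 16)
```

The combine function composes the per-step affine maps h -> a_t * h + b_t; since composition of affine maps is associative, the scan can evaluate all T steps in O(log T) parallel depth instead of a strictly sequential loop, which is the source of the training-time gains the summary describes.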