This paper provides a thorough and detailed explanation of Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks, two widely used architectures for processing sequential data. It begins by deriving the canonical RNN equations from differential equations, establishing a clear foundation for understanding the behaviour of these networks. It then explores the concept of "unrolling" an RNN, demonstrating how a long sequence can be approximated by a series of shorter, independent sub-sequences. Next, it addresses the challenges encountered when training RNNs, particularly the vanishing and exploding gradient problems. Building on this foundation, the paper carefully constructs the Vanilla LSTM cell from the canonical RNN, introducing gating mechanisms that control the flow of information within the cell and mitigate the vanishing gradient problem. It also presents an extended version of the Vanilla LSTM cell, the Augmented LSTM, which incorporates features such as a recurrent projection layer, non-causal input context windows, and an input gate. Finally, the paper details the backward-pass equations for the Augmented LSTM, which are used to train the network with the Backpropagation Through Time algorithm.
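To connect the description of gating mechanisms above to concrete computations, the following short NumPy sketch implements a single forward step of a generic Vanilla LSTM cell. It is an illustration under assumed conventions, not the paper's exact formulation: the weight names (W_f, W_i, W_o, W_c), the concatenated-input layout, and the dimensions are illustrative choices.

```python
# Minimal sketch of one Vanilla LSTM cell step (illustrative, not the paper's notation).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a Vanilla LSTM cell.

    x_t:    input vector at time t, shape (input_dim,)
    h_prev: previous hidden state,  shape (hidden_dim,)
    c_prev: previous cell state,    shape (hidden_dim,)
    params: dict of weight matrices and bias vectors (assumed names).
    """
    z = np.concatenate([x_t, h_prev])                  # combined input and recurrence

    f_t = sigmoid(params["W_f"] @ z + params["b_f"])   # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])   # input gate
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])   # output gate
    g_t = np.tanh(params["W_c"] @ z + params["b_c"])   # candidate cell update

    c_t = f_t * c_prev + i_t * g_t                     # gated cell-state update
    h_t = o_t * np.tanh(c_t)                           # gated hidden state
    return h_t, c_t

# Example usage with random parameters (input_dim=3, hidden_dim=4).
rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4
params = {name: rng.standard_normal((hidden_dim, input_dim + hidden_dim)) * 0.1
          for name in ("W_f", "W_i", "W_o", "W_c")}
params.update({f"b_{k}": np.zeros(hidden_dim) for k in ("f", "i", "o", "c")})

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for x in rng.standard_normal((5, input_dim)):          # short input sequence
    h, c = lstm_step(x, h, c, params)
```

The additive cell-state update (c_t = f_t * c_prev + i_t * g_t) is the mechanism the summary refers to: because the gradient can flow through this sum largely unattenuated, the gated cell helps mitigate the vanishing gradient problem that affects the canonical RNN.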