Recurrent Neural Network
A recurrent neural network takes two inputs at each time step: its own output from the previous step (t-1) and the current external input. This means the order in which information is fed to an RNN matters. Because of this internal memory and time dependence, RNNs are preferred for speech recognition, language modeling, and translation.
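A minimal sketch of this two-input step, using PyTorch's built-in RNN cell; the sizes and tensors are arbitrary assumptions for illustration:

```python
import torch
import torch.nn as nn

# One RNN step combines the current input x_t with the previous hidden
# state h_prev (the "t-1 output from itself"). Sizes are illustrative.
rnn_cell = nn.RNNCell(input_size=8, hidden_size=16)

x_t = torch.randn(1, 8)      # current input
h_prev = torch.zeros(1, 16)  # previous hidden state (output at t-1)

# h_t = tanh(W_ih x_t + b_ih + W_hh h_prev + b_hh)
h_t = rnn_cell(x_t, h_prev)
print(h_t.shape)             # torch.Size([1, 16])
```

Feeding the same inputs in a different order produces different hidden states, which is exactly why sequence order matters to an RNN.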
Long Short-Term Memory
The LSTM is an extension of the RNN. LSTMs can remember inputs over long periods of time and learn from important experiences separated by long time lags. Fundamentally, an LSTM cell has three gates: input, forget, and output. Because these gates are analog (sigmoid-valued between 0 and 1), the cell state can carry error signals through backpropagation largely unchanged, which counters the vanishing/exploding gradient problem by keeping gradients effectively steep, so training stays relatively short and accurate.
See Simeon Kostadinov’s detailed explanation of LSTM on Towards Data Science
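To make the three gates concrete, here is an illustrative sketch that spells them out explicitly; the class name, layer names, and sizes are assumptions for clarity, and in practice torch.nn.LSTMCell would be used instead:

```python
import torch
import torch.nn as nn

class LSTMCellSketch(nn.Module):
    """Illustrative LSTM cell with the input, forget, and output gates written out."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.i_gate = nn.Linear(input_size + hidden_size, hidden_size)  # input gate
        self.f_gate = nn.Linear(input_size + hidden_size, hidden_size)  # forget gate
        self.o_gate = nn.Linear(input_size + hidden_size, hidden_size)  # output gate
        self.cand   = nn.Linear(input_size + hidden_size, hidden_size)  # candidate cell state

    def forward(self, x_t, h_prev, c_prev):
        z = torch.cat([x_t, h_prev], dim=1)
        i = torch.sigmoid(self.i_gate(z))   # how much new information to write
        f = torch.sigmoid(self.f_gate(z))   # how much of the old cell state to keep
        o = torch.sigmoid(self.o_gate(z))   # how much of the cell state to expose
        c_t = f * c_prev + i * torch.tanh(self.cand(z))  # updated long-term memory
        h_t = o * torch.tanh(c_t)                        # hidden state / output
        return h_t, c_t
```

The forget gate is what lets gradients flow through c_t across many time steps without shrinking toward zero.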
Gated Recurrent Unit
The Gated Recurrent Unit is similar to the LSTM, but instead of the input, forget, and output gates, it has an update gate and a reset gate. Together, these gates determine how much past information to carry forward and how much to discard. LSTM and GRU architectures are closely related and are often used interchangeably.
See Simeon Kostadinov’s detailed explanation of GRU on Towards Data Science
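A comparable sketch for the GRU, with the update and reset gates written out; again, the class and layer names are assumptions, and torch.nn.GRUCell is the practical choice:

```python
import torch
import torch.nn as nn

class GRUCellSketch(nn.Module):
    """Illustrative GRU cell with explicit update and reset gates."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.update_gate = nn.Linear(input_size + hidden_size, hidden_size)
        self.reset_gate  = nn.Linear(input_size + hidden_size, hidden_size)
        self.candidate   = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x_t, h_prev):
        z_in = torch.cat([x_t, h_prev], dim=1)
        z = torch.sigmoid(self.update_gate(z_in))  # how much past information to pass on
        r = torch.sigmoid(self.reset_gate(z_in))   # how much past information to drop
        h_tilde = torch.tanh(self.candidate(torch.cat([x_t, r * h_prev], dim=1)))
        h_t = (1 - z) * h_prev + z * h_tilde       # blend old state with the new candidate
        return h_t
```

Compared with the LSTM sketch above, the GRU merges the cell and hidden state and uses one gate fewer, which is why the two are often treated as interchangeable in practice.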