A perceptron is a single-layer neural network: it takes in inputs, computes a weighted sum (a linear combination), and passes the result through an activation function to produce an output. Perceptrons are linear classifiers, used in supervised learning to classify input data.
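A minimal sketch of that weighted-sum-plus-activation idea (the function names and the AND example are illustrative, not from the article):

```python
import numpy as np

# A perceptron: weighted sum of inputs plus a bias, passed through a
# step activation that yields a binary class label.
def perceptron(x, w, b):
    return 1 if np.dot(w, x) + b > 0 else 0

# With weights (1, 1) and bias -1.5, the perceptron implements logical AND.
w, b = np.array([1.0, 1.0]), -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), w, b))
```

Because the decision rule is a single linear threshold, a perceptron can only separate classes that are linearly separable.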
Very similar to a perceptron, but with a hidden layer. Activation flows from input to output without backward loops. This type of network is usually trained with backpropagation, a method for computing gradients. FFs are more flexible than binary perceptrons because there is an intermediate stage of evaluation.
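The forward pass through one hidden layer can be sketched like this (layer sizes and weights are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass of a one-hidden-layer feed-forward network:
# input -> hidden activations -> output, with no loops back.
def forward(x, W1, b1, W2, b2):
    h = sigmoid(W1 @ x + b1)     # the intermediate stage of evaluation
    return sigmoid(W2 @ h + b2)  # output layer

rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # 3 inputs -> 4 hidden units
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # 4 hidden units -> 1 output
y = forward(x, W1, b1, W2, b2)
print(y)
```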
Radial Basis Function networks are feed-forward neural networks that use a radial basis activation instead of a logistic function. RBF networks are typically faster to train and easier to interpret; however, classification at run time can take more time.
See Ramraj Chandradevan’s detailed explanation of RBF on Towards Data Science
Deep feed forwards are multi-layer perceptrons: feed-forward neural networks with multiple hidden layers, usually trained with gradient descent. Adding hidden layers lets the network capture more specific and complex patterns, but it also becomes slower to train.
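Depth is just the same affine-plus-nonlinearity step repeated, as this sketch shows (layer sizes and the ReLU choice are illustrative):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# A deep feed-forward pass: each hidden layer applies a linear map
# followed by a nonlinearity to the previous layer's output.
def deep_forward(x, layers):
    h = x
    for W, b in layers:
        h = relu(W @ h + b)
    return h

rng = np.random.default_rng(1)
sizes = [3, 8, 8, 2]  # input, two hidden layers, output
layers = [(rng.normal(size=(m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]
out = deep_forward(rng.normal(size=3), layers)
print(out.shape)  # (2,)
```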
A recurrent neural network takes two inputs at each step: its own output from the previous time step (t-1) and the current input. This means the order in which information is given to an RNN matters. Because of this internal memory and time dependence, RNNs are preferred for speech recognition, language modeling, and translation.
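One recurrent step combines exactly those two inputs, the previous hidden state and the current input (weights and sizes here are illustrative):

```python
import numpy as np

# One RNN step: the new hidden state mixes the current input x_t
# with the previous hidden state h_prev (the t-1 output).
def rnn_step(x_t, h_prev, Wx, Wh, b):
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(2)
Wx, Wh, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)                       # initial hidden state
for x_t in rng.normal(size=(5, 3)):   # a sequence of 5 inputs, in order
    h = rnn_step(x_t, h, Wx, Wh, b)
print(h.shape)  # (4,)
```

Feeding the same five inputs in a different order would generally give a different final `h`, which is why sequence order matters to an RNN.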
The LSTM is an extension of the RNN. LSTMs remember inputs over long periods of time and can learn from important experiences separated by long time lags. Fundamentally, there are three gates: input, forget, and output. Because the gates are analog (sigmoid-valued), the cell state lets gradients flow across many time steps during backpropagation, which mitigates the vanishing/exploding gradient problem.
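A single LSTM step showing the three gates at work (the stacked-weight layout and sizes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One LSTM step with input (i), forget (f), and output (o) gates plus a
# candidate (g). The cell state c is carried forward nearly unchanged,
# which is what lets gradients survive long time lags.
def lstm_step(x, h, c, W, b):
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)                        # the gate pre-activations
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # forget old, add new
    h_new = sigmoid(o) * np.tanh(c_new)                # gated output
    return h_new, c_new

rng = np.random.default_rng(3)
n_in, n_hid = 3, 4
W = rng.normal(size=(4 * n_hid, n_in + n_hid))  # all four blocks stacked
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
print(h.shape, c.shape)  # (4,) (4,)
```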
The Gated Recurrent Unit is similar to the LSTM, but instead of input, forget, and output gates it has an update gate and a reset gate. Together these help the model decide how much past information to pass on and how much to discard. LSTMs and GRUs are closely related and often used interchangeably.
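A single GRU step, showing how the update and reset gates replace the LSTM's three (weight shapes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One GRU step: the update gate z blends the old state with a candidate;
# the reset gate r controls how much past state feeds the candidate.
def gru_step(x, h, Wz, Wr, Wc):
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                              # update gate
    r = sigmoid(Wr @ xh)                              # reset gate
    cand = np.tanh(Wc @ np.concatenate([x, r * h]))   # candidate state
    return (1 - z) * h + z * cand                     # keep vs. replace

rng = np.random.default_rng(4)
n_in, n_hid = 3, 4
Wz, Wr, Wc = (rng.normal(size=(n_hid, n_in + n_hid)) for _ in range(3))
h = gru_step(rng.normal(size=n_in), np.zeros(n_hid), Wz, Wr, Wc)
print(h.shape)  # (4,)
```

Note there is no separate cell state: the hidden state `h` plays both roles, which is the main structural difference from the LSTM.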
These networks have the same number of input and output neurons. By making the hidden layer smaller than the input layer, the encoder forces the input data into a compressed representation. The decoder then reconstructs the data using only the compressed hidden-layer outputs. Autoencoders specialize in unsupervised learning: unlabelled data without input-output pairs. They are mainly used to reduce the dimensionality of data.
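The encode-then-decode shape can be sketched with plain linear maps (weights here are random for illustration; training would minimise the reconstruction error):

```python
import numpy as np

# Linear autoencoder sketch: squeeze a 4-D input through a 2-D
# bottleneck, then reconstruct 4 dimensions from the code alone.
rng = np.random.default_rng(5)
W_enc = rng.normal(size=(2, 4))   # encoder: 4 inputs -> 2 hidden units
W_dec = rng.normal(size=(4, 2))   # decoder: 2 hidden units -> 4 outputs

x = rng.normal(size=4)
code = W_enc @ x                  # compressed representation
x_hat = W_dec @ code              # reconstruction, same size as the input
print(code.shape, x_hat.shape)    # (2,) (4,)
```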
See this website for more information.
A variational AE stores a probabilistic range of values in its hidden layer, rather than the vanilla AE's discrete codes. Because of these distributions, a VAE can generate hybrid outputs from distinct inputs. VAEs are useful for producing synthetic human faces, text, and interpolative results.
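The probabilistic hidden layer can be sketched with the commonly used reparameterisation trick: the encoder outputs a mean and log-variance per latent dimension, and the code is sampled from that distribution (the `mu` and `log_var` values below are illustrative stand-ins for encoder outputs):

```python
import numpy as np

# VAE latent sampling: instead of a fixed code, each latent dimension
# has a mean and a (log-)variance, and the code z is drawn from it.
rng = np.random.default_rng(6)
mu = np.array([0.5, -1.0])        # latent means (would come from the encoder)
log_var = np.array([0.0, 0.2])    # latent log-variances (likewise)

eps = rng.standard_normal(2)              # noise ~ N(0, 1)
z = mu + np.exp(0.5 * log_var) * eps      # z ~ N(mu, sigma^2)
print(z.shape)  # (2,)
```

Sampling different `eps` values for the same input yields nearby codes, which is what lets the decoder produce smooth, interpolative outputs.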
See Irhum Shafkat’s detailed explanation of VAE on Towards Data Science