Recurrent Neural Networks
Overview
Recurrent neural networks (RNNs) are neural networks designed to process sequences (often sequences in time, such as a time series).
Traditional feed-forward neural networks handle fixed-length inputs; if every sequence is the same length (say, 5 items), there is little need for a recurrent architecture.
Many sequential datasets, however, contain items of varying length. Language models, for instance, must handle sentences of variable length.
Inputs
Suppose we are given a sequence of inputs {% (x_1, x_2, \ldots, x_n) %} with corresponding desired outputs
{% (y_1, y_2, \ldots, y_n) %}. The algorithm starts with an initial state vector {% \vec{h}_1 %}
(typically either set to zero or randomly initialized) and applies the recurrence
{% \vec{h}_{i+1} = RNN(\vec{h}_i, \vec{x}_i) %}
That is, each input {% \vec{x}_i %} is presented to the network sequentially, together with the current state vector, to produce the next state vector.
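As a minimal sketch of this recurrence (the tanh nonlinearity, the weight matrices W_h and W_x, and the bias b are one common parameterization, assumed here for illustration), the loop below consumes a sequence of any length in NumPy:

```python
import numpy as np

def rnn_step(h, x, W_h, W_x, b):
    """One recurrence step: combine the current state h with the
    input x to produce the next state vector (assumed tanh cell)."""
    return np.tanh(W_h @ h + W_x @ x + b)

# Toy dimensions, assumed for illustration.
state_size, input_size = 4, 3
rng = np.random.default_rng(0)
W_h = rng.normal(size=(state_size, state_size))
W_x = rng.normal(size=(state_size, input_size))
b = np.zeros(state_size)

xs = [rng.normal(size=input_size) for _ in range(5)]  # any length works
h = np.zeros(state_size)  # initial state vector, set to zero
for x in xs:
    h = rnn_step(h, x, W_h, W_x, b)
print(h)  # final state after consuming the whole sequence
```

Because the same step function is reused at every position, the loop handles sequences of any length with a fixed set of parameters.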
Types
- Simple
  - the simplest recurrent neural network, in which a state vector is maintained in the network and fed into each successive iteration.
- LSTM
  - short for Long Short-Term Memory; designed to help deal with the exploding/vanishing gradient problem that simple RNNs sometimes run into (see the sketch after this list).
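As a comparison of the two types (assuming PyTorch, which is not mentioned above; nn.RNN and nn.LSTM are its built-in recurrent modules), the sketch below runs a toy sequence through both:

```python
import torch
import torch.nn as nn

input_size, hidden_size = 3, 4
seq = torch.randn(1, 5, input_size)  # (batch, time steps, features)

# Simple RNN: a single state vector carried across time steps.
rnn = nn.RNN(input_size, hidden_size, batch_first=True)
out, h_n = rnn(seq)

# LSTM: adds a gated cell state, which helps mitigate the
# exploding/vanishing gradient problem on long sequences.
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
out, (h_n, c_n) = lstm(seq)
```

Both modules implement the same state-carrying recurrence described in the Inputs section; the LSTM's additional cell state and gates are what make it more robust on long sequences.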