Recurrent Neural Networks

Overview


Recurrent neural networks are neural networks designed to deal with sequences (sometimes sequences in time, such as a time series).

Traditional neural networks can handle fixed-length sequences. That is, if every sequence is the same length (say, 5 items), there is little need for a recurrent neural network.

Many sequential datasets, however, contain sequences of varying length. For instance, language models have to deal with sentences of variable length.

Inputs


Given a set of inputs {% (x_1, x_2, \ldots, x_n) %} with corresponding desired outputs {% (y_1, y_2, \ldots, y_n) %}, the algorithm starts with an initialized state vector {% \vec{h}_1 %} (typically either set to zero or randomly initialized).

{% \vec{h}_{i+1} = RNN(\vec{h}_i, \vec{x}_i) %}
That is, each input {% \vec{x}_i %} is presented to the network sequentially, along with the current state vector, to produce the next state vector.
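
A minimal sketch of one such step in Python, assuming a simple tanh cell; the weight matrices W_h and W_x and the bias b are illustrative names, not defined above:

```python
import numpy as np

def rnn_step(h, x, W_h, W_x, b):
    """One step of a simple (Elman-style) RNN: combine the previous
    state h with the current input x to produce the next state."""
    return np.tanh(W_h @ h + W_x @ x + b)

# Toy dimensions: 4-dimensional state, 3-dimensional inputs.
rng = np.random.default_rng(0)
W_h = rng.normal(size=(4, 4))
W_x = rng.normal(size=(4, 3))
b = np.zeros(4)

# A variable-length sequence is handled by simply looping over it.
xs = [rng.normal(size=3) for _ in range(5)]
h = np.zeros(4)  # initial state vector, set to zero here
for x in xs:
    h = rnn_step(h, x, W_h, W_x, b)
```

Because the same step is applied at every position, the loop works unchanged for sequences of any length, which is what makes the recurrent formulation a natural fit for variable-length data.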

Types


  • Simple - the simplest recurrent neural network, where a state vector is maintained in the network and fed into each successive iteration.
  • LSTM - long short-term memory, designed to help deal with the exploding/vanishing gradient problem that simple RNNs sometimes run into (see the sketch after this list).
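
An illustrative sketch of the gating idea behind the LSTM; the single stacked weight matrix W, the bias b, and the gate layout are my own choices for brevity, not taken from the text above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h, c, x, W, b):
    """One LSTM step: gates decide what to forget from the cell state c,
    what new information to write, and what to expose as the state h."""
    n = h.shape[0]
    z = W @ np.concatenate([h, x]) + b  # all four gate pre-activations at once
    f = sigmoid(z[0:n])        # forget gate
    i = sigmoid(z[n:2*n])      # input gate
    o = sigmoid(z[2*n:3*n])    # output gate
    g = np.tanh(z[3*n:4*n])    # candidate cell update
    c = f * c + i * g          # gated, additive cell-state update
    h = o * np.tanh(c)         # new hidden state
    return h, c

# Toy dimensions: 4-dimensional state/cell, 3-dimensional input.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 7))  # 4 gates x state size 4; input is h (4) + x (3)
b = np.zeros(16)
h, c = np.zeros(4), np.zeros(4)
h, c = lstm_step(h, c, rng.normal(size=3), W, b)
```

The additive update of the cell state (c = f * c + i * g) is what lets gradients flow across many steps more easily than the purely multiplicative update of the simple RNN.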
