Deep Learning - Activation Functions

Overview


Activation functions are functions applied (usually element-wise) to the outputs of one layer of a neural network before those outputs are passed on as inputs to the next layer.
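
A minimal NumPy sketch of this idea (the layer sizes, random weights, and the choice of sigmoid here are illustrative assumptions, not taken from the text above):

    import numpy as np

    def sigmoid(z):
        # Element-wise logistic sigmoid: squashes each value into (0, 1).
        return 1.0 / (1.0 + np.exp(-z))

    # Illustrative shapes only: 4 inputs, 3 hidden units, 2 outputs.
    rng = np.random.default_rng(0)
    x = rng.normal(size=4)             # input vector
    W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
    W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)

    z1 = W1 @ x + b1                   # raw (pre-activation) output of the first layer
    a1 = sigmoid(z1)                   # activation applied to that output...
    z2 = W2 @ a1 + b2                  # ...before it is passed to the next layer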

Sigmoid Type Functions


  • Sigmoid (Logistic) Function
  • Tanh - Hyperbolic Tangent
  • Softmax Function

Hockey Stick Functions


  • ReLU - Rectified Linear Unit
  • PReLU - Parametric Rectified Linear Unit
  • ELU - Exponential Linear Unit (all three are sketched below)
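
A minimal NumPy sketch of these three functions (the fixed alpha values here stand in for what, in PReLU's case, is normally a learned parameter):

    import numpy as np

    def relu(z):
        # max(0, z): passes positive values through, zeroes out negatives.
        return np.maximum(0.0, z)

    def prelu(z, alpha=0.25):
        # Like ReLU, but negative inputs are scaled by a slope alpha
        # (learned per-channel in the original PReLU formulation).
        return np.where(z > 0, z, alpha * z)

    def elu(z, alpha=1.0):
        # Smooth for z < 0: alpha * (exp(z) - 1) instead of a hard zero.
        return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))

    z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(relu(z), prelu(z), elu(z), sep="\n")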

Choice of Activation Function


  • Single Layer For networks with a single layer, it is common to use either the sigmoid function or the softmax function for classification tasks, and the identity function for regression tasks.
  • Two Layer It is common to use a tanh activation on the inner (hidden) layer (sometimes the sigmoid), and then a sigmoid or softmax on the outputs for classification problems. For regression problems, the second layer usually uses the identity function as its activation.
  • Deep Networks For deep networks with more than 2 layers, ReLU-type functions are the more common choice for the inner layers. When many layers are stacked, optimization can run into exploding or vanishing gradients, and ReLU-type activations help mitigate this (particularly vanishing gradients). ReLU networks also often converge more quickly. (See the configuration sketch below.)
(see Trask)
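
A rough configuration sketch of these conventions, assuming PyTorch as the framework (the layer sizes are arbitrary and the models are untrained):

    import torch.nn as nn

    # Single layer: softmax output for classification.
    single_layer = nn.Sequential(nn.Linear(20, 3), nn.Softmax(dim=-1))

    # Two-layer classifier: tanh on the hidden layer, softmax on the output.
    two_layer = nn.Sequential(
        nn.Linear(20, 16),
        nn.Tanh(),
        nn.Linear(16, 3),
        nn.Softmax(dim=-1),   # in practice often folded into the loss (e.g. CrossEntropyLoss)
    )

    # Deep network: ReLU on the inner layers to ease vanishing gradients.
    deep = nn.Sequential(
        nn.Linear(20, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 3),     # identity here for regression; softmax (or the loss) handles classification
    )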
