Perceptron Multi-Class Classification
Overview
A perceptron splits the input space into two half-spaces separated by a hyperplane. By definition, then, a single perceptron can only distinguish between two classes.
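When more than two classes need to be recognized, it is standard to build one perceptron per class, each trained to separate its own class from all the others (a one-vs-rest scheme). A new input is then typically assigned to the class whose perceptron produces the largest output.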
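The one-perceptron-per-class idea can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function names `train_one_vs_rest` and `predict`, the bias handling (a constant 1.0 appended to each input), and the choice to pick the class with the largest raw score are all assumptions made for the example.

```python
def train_one_vs_rest(samples, labels, num_classes, epochs=100, lr=1.0):
    """Train one perceptron per class with the classic perceptron update rule."""
    dim = len(samples[0])
    # One weight vector per class; the last entry acts as the bias.
    weights = [[0.0] * (dim + 1) for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            xb = list(x) + [1.0]  # append constant input for the bias
            for k in range(num_classes):
                # Perceptron k treats class k as +1 and every other class as -1.
                target = 1 if y == k else -1
                score = sum(w * v for w, v in zip(weights[k], xb))
                predicted = 1 if score > 0 else -1
                if predicted != target:
                    # Move the hyperplane toward the misclassified point.
                    weights[k] = [w + lr * target * v
                                  for w, v in zip(weights[k], xb)]
    return weights


def predict(weights, x):
    """Return the index of the class whose perceptron gives the largest score."""
    xb = list(x) + [1.0]
    scores = [sum(w * v for w, v in zip(wk, xb)) for wk in weights]
    return max(range(len(scores)), key=scores.__getitem__)
```

With three well-separated clusters of 2D points, for instance, `train_one_vs_rest(samples, labels, 3)` yields three weight vectors and `predict(weights, x)` returns the index of the winning class. The scheme only works reliably when each class is linearly separable from the rest.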
Softmax Activation
As a further extension to the perceptron, it is common to replace the {% \text{sign} %} activation function with the softmax activation function, defined as
{% \text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)} %}
where {% x %} is the vector of raw outputs (weighted sums) of the set of perceptrons, with one entry per class. Because softmax is differentiable, unlike the sign function, the perceptrons can be trained with gradient descent or other common optimization routines rather than with the original perceptron learning algorithm.
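As a rough sketch of how softmax enables gradient-based training, the code below computes the softmax of the perceptron outputs and takes one gradient-descent step for a single example. The cross-entropy loss is an assumption made here (the text does not name a loss), as are the function name `gradient_step`, the learning rate, and the bias handling.

```python
import math


def softmax(scores):
    """Map a list of raw scores to probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow and leaves the result unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_step(weights, x, label, lr=0.1):
    """One gradient-descent step on the (assumed) cross-entropy loss.

    Updates `weights` in place and returns the loss measured before the update.
    """
    xb = list(x) + [1.0]  # append constant input for the bias
    scores = [sum(w * v for w, v in zip(wk, xb)) for wk in weights]
    probs = softmax(scores)
    for k, wk in enumerate(weights):
        # Gradient of cross-entropy w.r.t. the weights of class k:
        # (probability of class k - one-hot target) * input.
        error = probs[k] - (1.0 if k == label else 0.0)
        weights[k] = [w - lr * error * v for w, v in zip(wk, xb)]
    return -math.log(probs[label])
```

Calling `gradient_step` repeatedly over the training examples lowers the loss gradually, in contrast to the perceptron rule, which only updates on misclassified points.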