Perceptron

Overview


The perceptron is one of the earliest and simplest machine learning models. It is a classifying model (as opposed to a regression model; see classification vs regression), meaning that its goal is to assign data points to distinct categories. The simple perceptron classifies points into one of two categories, based on a linear boundary. (You could extend this to multiple categories by using multiple perceptrons.) It is an example of a parametric model.

We assume that we have a number of data points, each represented by a vector of numbers. Call the ith point {% x_i %}. The perceptron is then specified by a weight vector {% w %} and a bias {% b %}, such that
{% h(x) = \text{sign}(w^T x + b) %}
indicates the category of the point.
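As a minimal sketch of this decision rule (the weight and bias values here are illustrative assumptions, not taken from the text):

```python
import numpy as np

# Illustrative weights and bias for a two-dimensional input;
# these particular values are assumptions for the example.
w = np.array([2.0, -1.0])
b = 0.5

def h(x):
    # The classical sign/step decision: +1 on one side of the
    # linear boundary w.x + b = 0, -1 on the other.
    return 1 if w @ x + b > 0 else -1

print(h(np.array([1.0, 1.0])))   # 2 - 1 + 0.5 = 1.5 > 0, so prints 1
print(h(np.array([-1.0, 1.0])))  # -2 - 1 + 0.5 = -2.5, so prints -1
```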

Math


The perceptron starts with a set of inputs. Here they are arranged as a vector, that is, a list of numbers.
{% x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \end{bmatrix} %}
Then we define a set of weights, one for each input:
{% w = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ \end{bmatrix} %}
The value of the perceptron is calculated by multiplying each input by the corresponding weight and summing, then adding a bias. Finally, this value is passed through a function, and the result becomes the output of the perceptron. In the classical perceptron, this function is the step function, which returns -1 for values less than 0 and +1 for values greater than 0.
{% y = f(w^T x + b) %}
The bias term {% b %} can be eliminated from the notation by adding an {% x_0 = 1 %} term to the input vector. Then {% w_0 = b %}, and we can write
{% y = f(w^T x) %}
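A short sketch of this bias-folding trick (the weight values are illustrative assumptions):

```python
import numpy as np

def augment(x):
    # Prepend x_0 = 1 so the bias becomes an ordinary weight.
    return np.concatenate(([1.0], x))

# Illustrative weights; w[0] plays the role of the bias b.
w = np.array([0.5, 2.0, -1.0])

def f(s):
    # Step function: -1 below zero, +1 above.
    return 1 if s > 0 else -1

def y(x):
    return f(w @ augment(x))

print(y(np.array([1.0, 1.0])))   # 0.5 + 2 - 1 = 1.5 > 0, prints 1
```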

Learning Algorithm


  • initialize the weights - typically the weights are initialized to random values, though they could all be set to zero as well.
  • select an input/output pair from the sample. This should be done according to a selection scheme: one option is to take each item in the sample in turn; another is to select a pair at random.
  • If the computed output (y) on the selected item differs from the target output, then adjust the weights as follows:
    • If y < 0 (the target is +1), add {% \nu x_i %} to each {% w_i %}
    • If y > 0 (the target is -1), subtract {% \nu x_i %} from each {% w_i %}
  • repeat until all samples are correctly classified (this loop terminates only if the data are linearly separable)
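The steps above can be sketched as follows (the toy data, learning rate {% \nu %}, and in-turn selection scheme are illustrative assumptions):

```python
import numpy as np

def train(X, t, nu=0.1, max_epochs=100):
    # Fold the bias into the weights by prepending x_0 = 1.
    X = np.hstack([np.ones((len(X), 1)), X])
    w = np.zeros(X.shape[1])  # weights may also start at zero
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(X, t):
            y = 1 if w @ x > 0 else -1
            if y != target:
                # If y < 0 (target +1) this adds nu*x to w;
                # if y > 0 (target -1) it subtracts nu*x.
                w += nu * target * x
                errors += 1
        if errors == 0:  # all samples correctly classified
            break
    return w

# Linearly separable toy data: class given by the sign of the
# first coordinate (an assumption for this example).
X = np.array([[2.0, 1.0], [1.5, -0.5], [-1.0, 0.5], [-2.0, -1.0]])
t = np.array([1, 1, -1, -1])
w = train(X, t)
```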

Demo


The following is a graphical demonstration of the perceptron algorithm. The algorithm is run to classify the points in the chart. At startup, the line incorrectly classifies the points. When you click run, the algorithm is iterated. In this example, the algorithm finds a solution after a single iteration.
