Logistic Regression Classifier

Overview


Logistic Regression is a statistical tool that is typically used to model the probability of a binary outcome. However, using one-hot encoding it can be configured as a machine learning classifier.

One Hot Encoding


One hot encoding is a classifier that is designed to return a column vector which consists of zeros in each place, except a one in the index that represents the correct category.

The following column vector represents one hot encoding in a class with 4 category labels.
{% \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ \end{bmatrix} %}
A logistic regression is run for each category, where the outcome equals 1 for data points in the category and 0 for everything else. For a sample point, the regression is used to return a number between 0 and 1 for each category label. These numbers are then arranged as a column vector as above, and finally, they are normalized so that they sum to one.

In general, a given data point is forecast to belong to the category with the highest value in the output vector.

Contents