Logistic Regression Classifier
Overview
Logistic Regression
is a statistical tool that is typically used to model the probability of a binary outcome.
However, using one-hot encoding it can be configured as a
machine learning classifier.
One Hot Encoding
One hot encoding is a
classifier that is designed to return a
column vector
which consists of zeros in each place, except a one in the index that represents the correct category.
The following column vector represents one hot encoding in a class with 4 category labels.
{%
\begin{bmatrix}
0 \\
0 \\
1 \\
0 \\
\end{bmatrix}
%}
A logistic regression is run for each category, where the outcome equals 1 for data points in the category and 0 for everything else.
For a sample point, the regression is used to return a number between 0 and 1 for each category label. These numbers are then
arranged as a column vector as above, and finally, they are normalized so that they sum to one.
In general, a given data point is forecast to belong to the category with the highest value in the output vector.