Machine Learning Theory
Overview
Machine learning is framework for allowing machines to update their behaviour based on seeing new data. That is,
the machine "learns" from its environment in order to optimize its functioning.
Classification vs Regression
Broadly speaking, machine learning can be categorized into two categories based on the
nature of the range of the learning function.
- Classification: consists
of algorithms that assign a label to a datapoint from a finite set
of possible labels.
- Regression: assigns a
numeric value from a continuous range of values.
Response Function
The goal of machine learning is to "learn" a function
{% f:X \mapsto Y %}
where X is the set of possible inputs (or feature vectors),
and Y is the set of classifications or forecasts. This function is the response function.
As a general rule, both x and y are taken to be
column vectors.
Loss Function
The process of learning a dataset requires some way to measure how close a given prediction is to the actual value.
That is, for a given data point with a given input x (feature vetor), there is an ossociated value of y, and
an output for the function under consideration, f(x). If f(x) equals y, then we say that there is zero error.
If f(x) does not equal y, there is some error, which is typically termed a loss.
The size of the loss is defined by a loss function, which assigns the loss to each triple.
{% loss \; = L(x,y,f(x)) %}
Then, given a finite dataset, one can measure the total amount of error, or loss as
{% Empirical \; Loss = \sum_i L(x_i,y_i, f(x_i)) %}
The empirical loss measures the loss on a given sample set. Of course, the goal is to minimize the
loss on datapoints that the algorithm has not seen. From a statistical perspective, this is interpreted to
mean that the goal of machine learning is to minimize the expected loss, which is termed the risk.
{% Risk = \mathbb{E}[ L(x,y, f(x)) ] %}
Loss Minimization
The goal of machine learning, as stated above, is risk minimization. However, the full dataset is never available.
That is, the only data we have available is the sample dataset, the one from we train on.
The basic recipe of machine learning is the following two steps
- Constrain the Set of Functions as valid candidates for the response function
- Pick the function from among the candidates that minimizes the loss on the training set
For a detailed discussion, see
loss minimization
Topics