Generalization Bounds

Overview


Generalization bounds are a mathematical framework for showing when an empirical risk minimization algorithm achieves learnability.

PAC Learnability


Given a hypothesis set {% \mathcal{F} %}, a problem is PAC learnable if, for any given {% \delta %} and {% \epsilon %}, there is a number {% N %} of sample data points such that for {% n > N %} the following holds:
{% \mathbb{P}\left[\sup_{f\in \mathcal{F}}|R(f) - \hat{R}(f)| > \epsilon\right] < \delta %}
where

  • {% R(f) %} is the true (expected) risk of hypothesis {% f %}
  • {% \hat{R}(f) %} is the empirical risk of hypothesis {% f %}
Note: The formal definition of PAC learnability in the literature is somewhat more technical, and also involves computational aspects of the algorithm that achieves PAC learnability.
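The uniform-convergence probability in the definition above can be estimated numerically. The sketch below is illustrative only: it assumes a toy setup (not from the text) of threshold classifiers on {% [0, 1] %} with data {% x \sim \mathrm{Uniform}(0,1) %} and true label {% y = \mathbb{1}[x > 0.5] %}, and uses Monte Carlo to approximate {% \mathbb{P}[\sup_{f} |R(f) - \hat{R}(f)| > \epsilon] %} as the sample size grows.

```python
import random

# Hypothetical toy setup: finite class F of threshold classifiers f_t(x) = 1[x > t],
# data x ~ Uniform(0, 1), true label y = 1[x > 0.5], 0-1 loss.
random.seed(0)

thresholds = [i / 10 for i in range(11)]  # the hypothesis class F

def risk(t):
    # True risk R(f_t): mass of the interval between t and 0.5 where f_t disagrees
    # with the true labeling rule.
    return abs(t - 0.5)

def empirical_risk(t, sample):
    # Empirical risk R_hat(f_t): fraction of sample points misclassified.
    return sum(((x > t) != (x > 0.5)) for x in sample) / len(sample)

def sup_deviation(n):
    # sup over f in F of |R(f) - R_hat(f)| on one fresh sample of size n.
    sample = [random.random() for _ in range(n)]
    return max(abs(risk(t) - empirical_risk(t, sample)) for t in thresholds)

eps, trials = 0.05, 500
for n in (50, 500, 5000):
    # Monte Carlo estimate of P[sup deviation > eps] over `trials` samples.
    p = sum(sup_deviation(n) > eps for _ in range(trials)) / trials
    print(f"n={n:5d}  P[sup deviation > {eps}] ~ {p:.3f}")
```

With this toy class the estimated probability shrinks toward zero as {% n %} grows, which is exactly the behavior the PAC definition requires of {% N %} for a given {% (\epsilon, \delta) %} pair.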