Universal Approximation Theorem

Overview


The universal approximation theorem shows that, in general, a neural network needs no more than two layers to approximate any reasonable function. However, adding further layers can improve the network's efficiency in terms of computing resources.
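For concreteness, "two layers" is taken here to mean one hidden ReLU layer followed by an affine output layer (an assumption about the counting convention). The following is a minimal NumPy sketch of such a map {% h:\mathbb{R}^n \rightarrow \mathbb{R}^m %}; the dimensions, the random weights, and the names relu and two_layer_network are illustrative choices, not taken from the text.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def two_layer_network(x, W1, b1, W2, b2):
    # h(x) = W2 @ ReLU(W1 @ x + b1) + b2, mapping R^n -> R^m
    return W2 @ relu(W1 @ x + b1) + b2

# Illustrative dimensions: n = 3 inputs, 16 hidden ReLU units, m = 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 3)), rng.normal(size=16)
W2, b2 = rng.normal(size=(2, 16)), rng.normal(size=2)
print(two_layer_network(rng.normal(size=3), W1, b1, W2, b2))
```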

Theorem


Let {% K \subset \mathbb{R}^n %} be closed and bounded (compact) and let {% f:K \rightarrow \mathbb{R}^m %} be continuous with {% f_i(x) \geq 0 %} for all {% x \in K %} and each {% i %}. Then for every {% \epsilon > 0 %} there exists an artificial neural network {% h:\mathbb{R}^n \rightarrow \mathbb{R}^m %} with two layers using the ReLU activation function such that
{% \sup_{x \in K} \| h(x) - f(x) \| < \epsilon. %}
See Berlyand.
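As an illustration of the theorem (not a construction from the source), the sketch below picks the target {% f(x) = x^2 %} on {% K = [0,1] %} and builds the weights of a two-layer ReLU network by piecewise-linear interpolation; the names f, h, and the knot count N are assumptions of the example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Illustrative target: f(x) = x^2 on the compact set K = [0, 1] (note f >= 0, as required).
def f(x):
    return x**2

# Knots of a piecewise-linear interpolant; more knots -> smaller sup-norm error.
N = 50
t = np.linspace(0.0, 1.0, N + 1)
slopes = np.diff(f(t)) / np.diff(t)      # slope of the interpolant on each piece
c = np.diff(slopes, prepend=0.0)         # change of slope at each knot t_k, k = 0..N-1

def h(x):
    # Two-layer ReLU network: hidden units ReLU(x - t_k), affine output layer.
    return f(t[0]) + relu(np.subtract.outer(x, t[:-1])) @ c

xs = np.linspace(0.0, 1.0, 10_001)
print("sup error:", np.max(np.abs(h(xs) - f(xs))))  # about (1/N)^2 / 4 = 1e-4
```

Doubling N roughly quarters the sup-norm error here, matching the {% h^2/4 %} interpolation bound for {% x^2 %} on a mesh of width {% h = 1/N %}; driving the error below any given {% \epsilon %} only requires enough hidden units, as the theorem asserts.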

Contents