Overview
Matrix differentiation gives meaning to a derivative when the variables involved are matrices or vectors. The simplest example is a scalar function of multiple inputs. You may then want to define the derivative of this function with respect to the vector of its inputs, that is
{% \partial{f(x_1,x_2,...,x_n)}/\partial{\vec{x}} %}
One thing you might mean is the column vector of partial derivatives (shown here for three inputs):
{%
\begin{bmatrix}
\partial f/ \partial{x_1} \\
\partial f/ \partial{x_2} \\
\partial f/ \partial{x_3} \\
\end{bmatrix}
%}
However, you could equally mean the row vector:
{%
\begin{bmatrix}
\partial f/ \partial{x_1}, & \partial f/ \partial{x_2}, & \partial f/ \partial{x_3} \\
\end{bmatrix}
%}
That is, do you want the answer to be a row or a column vector? This is
just a notational matter, but you need to be consistent
in your conventions. Things get more complex when you
differentiate a vector by a vector. In that case, the result is a matrix
of terms such as
{%
\begin{bmatrix}
\partial f_1/ \partial{x_1}, & \partial f_1/ \partial{x_2}, & \partial f_1/ \partial{x_3} \\
\partial f_2/ \partial{x_1}, & \partial f_2/ \partial{x_2}, & \partial f_2/ \partial{x_3} \\
\partial f_3/ \partial{x_1}, & \partial f_3/ \partial{x_2}, & \partial f_3/ \partial{x_3} \\
\end{bmatrix}
%}
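The matrix above can be approximated numerically, which is a handy way to check which entry goes where. The following is a minimal sketch using central differences and only the Python standard library. The helper `jacobian` and the example function `f` are hypothetical names chosen for illustration; entry `[i][j]` holds the derivative of the i-th output with respect to the j-th input, matching the matrix shown above.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the matrix of partial derivatives of f at x.

    Entry [i][j] approximates d f_i / d x_j (rows follow the outputs,
    columns follow the inputs), using central differences.
    """
    m = len(f(x))
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input, up and down by h.
        x_plus = list(x)
        x_plus[j] += h
        x_minus = list(x)
        x_minus[j] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


def f(x):
    # Hypothetical example function from R^3 to R^3.
    x1, x2, x3 = x
    return [x1 * x2, x2 + x3, x1 * x3 * x3]


J = jacobian(f, [1.0, 2.0, 3.0])
for row in J:
    print([round(v, 4) for v in row])
```

For this example the exact partial derivatives at (1, 2, 3) are easy to work out by hand (for instance, the first row is x2, x1, 0), so the printed rows can be compared against them directly.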
The choice of how to lay out a matrix derivative generally falls into one of two conventions, described below.
Layouts
- Numerator Layout - in the numerator layout, the derivative follows the layout of the numerator: it is a column vector if the numerator is a column vector.
- Denominator Layout - in the denominator layout, the derivative follows the layout of the denominator and transposes the layout of the numerator.
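For a vector differentiated by a vector, the two layouts differ only by a transpose. The sketch below illustrates this with a hypothetical linear map from three inputs to two outputs, whose partial derivatives are constants. The names `transpose`, `shape`, `numerator_layout`, and `denominator_layout` are illustrative, not part of any library.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def shape(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(M), len(M[0]))


# Hypothetical example: y1 = x1 + 2*x2, y2 = 3*x3.
# Numerator layout: rows follow the outputs y, columns follow the inputs x.
numerator_layout = [
    [1.0, 2.0, 0.0],  # dy1/dx1, dy1/dx2, dy1/dx3
    [0.0, 0.0, 3.0],  # dy2/dx1, dy2/dx2, dy2/dx3
]

# Denominator layout: rows follow the inputs x, columns follow the outputs y.
denominator_layout = transpose(numerator_layout)

print(shape(numerator_layout))
print(shape(denominator_layout))
```

Whichever layout is used, the individual partial derivatives are the same; only their position in the matrix changes.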
Formulas
Below we list some common formulas for differentiating vectors and matrices. The results are written in the denominator layout.
Bold capital letters are matrices
Bold lowercase letters are vectors (column vectors)
Plain letters are scalars
{% \partial{\textbf{b}^T \textbf{a}} / \partial{\textbf{a}} = \textbf{b} %}
{% \partial{\textbf{a}^T \textbf{A} \textbf{a}} / \partial{\textbf{a}} = (\textbf{A} + \textbf{A}^T)\textbf{a} %}
(for symmetric {% \textbf{A} %}, this reduces to {% 2 \textbf{A}\textbf{a} %})
{% \partial{tr(\textbf{BA})}/\partial{\textbf{A}} = \textbf{B}^T %}
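Formulas like these are easy to get wrong by a transpose, so a numerical check can be useful. The following is a minimal sketch that compares each formula against central differences, using only the Python standard library. All helper names (`grad`, `matrix_grad`, `dot`, `matvec`, `trace_BA`) and the test values are assumptions made for illustration. The matrix `A` is chosen symmetric so that the gradient of the quadratic form equals 2Aa.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function of a vector."""
    g = []
    for j in range(len(x)):
        xp = list(x)
        xp[j] += h
        xm = list(x)
        xm[j] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g


def matrix_grad(f, M, h=1e-6):
    """Central-difference gradient of a scalar function of a matrix.

    Entry [i][j] approximates d f / d M[i][j], so the result has the
    same shape as M (denominator layout).
    """
    G = [[0.0] * len(M[0]) for _ in M]
    for i in range(len(M)):
        for j in range(len(M[0])):
            Mp = [row[:] for row in M]
            Mp[i][j] += h
            Mm = [row[:] for row in M]
            Mm[i][j] -= h
            G[i][j] = (f(Mp) - f(Mm)) / (2 * h)
    return G


def trace_BA(B, A):
    """tr(BA) = sum over i, k of B[i][k] * A[k][i], for square matrices."""
    n = len(A)
    return sum(B[i][k] * A[k][i] for i in range(n) for k in range(n))


a = [1.0, -2.0, 0.5]
b = [3.0, 4.0, -1.0]
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, -1.0],
     [0.0, -1.0, 4.0]]   # symmetric on purpose
B = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]    # not symmetric, so B and B^T differ

g_linear = grad(lambda v: dot(b, v), a)            # should be close to b
g_quad = grad(lambda v: dot(v, matvec(A, v)), a)   # should be close to 2 A a
g_trace = matrix_grad(lambda M: trace_BA(B, M), A)  # should be close to B^T

expected_quad = [2 * t for t in matvec(A, a)]
expected_trace = [list(col) for col in zip(*B)]     # B transposed
```

Using a non-symmetric `B` matters here: with a symmetric matrix the check could not tell the result B^T apart from B.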