Overview
The affine function represents a standard (non-activation) layer in a neural network. It is given by the following equation:
{% \vec{y} = \textbf{W} \vec{x} + \vec{b} %}
where {% \textbf{W} %} is the weight matrix and {% \vec{b} %} is a vector called the bias.
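As a quick worked example of {% \vec{y} = \textbf{W} \vec{x} + \vec{b} %}, here is the computation done with plain nested arrays, storing vectors as n-by-1 column matrices (the same convention the code below uses):

```javascript
// Worked example of y = W x + b with a 2x3 weight matrix.
// Vectors are stored as n-by-1 column matrices.
const W = [[1, 2, 3],
           [4, 5, 6]];
const x = [[1], [0], [-1]];
const b = [[10], [20]];

// y_i = sum_j W[i][j] * x[j] + b[i]
const y = W.map((row, i) =>
    [row.reduce((sum, w, j) => sum + w * x[j][0], 0) + b[i][0]]
);
console.log(y); // [[8], [18]]
```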
Implementation
The following code is a basic implementation of the affine function, using the linear algebra library `la`.
function affine(matrix, bias){
    return {
        type: 'affine',
        matrix: matrix,
        bias: bias,
        // evaluate the function on the inputs
        evaluate: function(input){
            let result = la.multiply(this.matrix, input);
            result = la.add(result, this.bias);
            return result;
        },
    };
}
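The `la` helpers that `evaluate` relies on are not shown here. Assuming vectors are n-by-1 column matrices, minimal stand-ins could look like the following sketch:

```javascript
// Minimal stand-ins for the two `la` helpers that `evaluate` uses.
// Assumption: vectors are n-by-1 column matrices.
const la = {
    // matrix product of an m-by-n matrix and an n-by-p matrix
    multiply: (A, B) => A.map(row =>
        B[0].map((_, j) => row.reduce((s, a, k) => s + a * B[k][j], 0))
    ),
    // elementwise sum of two same-shaped matrices
    add: (A, B) => A.map((row, i) => row.map((a, j) => a + B[i][j])),
};

function affine(matrix, bias){
    return {
        type: 'affine',
        matrix: matrix,
        bias: bias,
        evaluate: function(input){
            let result = la.multiply(this.matrix, input);
            result = la.add(result, this.bias);
            return result;
        },
    };
}

const layer = affine([[1, 2], [3, 4]], [[1], [1]]);
console.log(layer.evaluate([[1], [1]])); // [[4], [8]]
```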
Input Gradient
The gradient with respect to the input {% \vec{x} %} is the transpose of the weight matrix:
{% \frac{d \textbf{W} \vec{x}}{d \vec{x}} = \textbf{W}^T %}
inputGradient: function(input){
    return la.transpose(this.matrix);
},
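Under this convention, entry (j, k) of the returned matrix is {% \partial y_k / \partial x_j %}. A standalone finite-difference check of that (flat arrays for brevity; `matVec` is a local helper, not part of `la`):

```javascript
// Finite-difference check that d(Wx)/dx matches W^T under the
// convention that entry (j, k) of the gradient is dy_k / dx_j.
const W = [[1, 2], [3, 4], [5, 6]];   // 3x2: y has 3 entries, x has 2
const matVec = (A, v) => A.map(row => row.reduce((s, a, j) => s + a * v[j], 0));

const x = [0.5, -1.0];
const eps = 1e-6;
const numeric = x.map((_, j) => {     // row j: perturb x_j
    const xp = x.slice(); xp[j] += eps;
    const xm = x.slice(); xm[j] -= eps;
    const yp = matVec(W, xp), ym = matVec(W, xm);
    return yp.map((v, k) => (v - ym[k]) / (2 * eps));
});
console.log(numeric); // approximately [[1, 3, 5], [2, 4, 6]], i.e. W transposed
```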
Parameter Gradient
For the parameter gradient, we vectorize the weight matrix. The linear algebra library provides a function which calculates the vectorized derivative of matrix multiplication.
parameterGradient: function(input){
    let grad1 = la.vectorGradient(this.matrix, input);
    let identity = la.identity(this.bias.length);
    for(let row of identity){
        grad1.push(row);
    }
    return grad1;
},
Note that we append the vectorized gradient of the bias onto the bottom of the vectorized weight gradient; it is given by the following:
{% \frac{d \vec{b}}{d \vec{b}} = \mathbb{I} %}
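The internals of `la.vectorGradient` are not shown. Assuming {% \mathrm{vec}(\textbf{W}) %} stacks the rows of {% \textbf{W} %}, and that the result has one row per parameter and one column per output (matching the {% \textbf{W}^T %} convention above), a function with that behavior could be sketched as:

```javascript
// Sketch of what la.vectorGradient(matrix, input) plausibly computes,
// under the assumption that vec(W) stacks the rows of W.
// d y_k / d W_ij = x_j when k == i, and 0 otherwise,
// so the result is the Kronecker product I (x) x.
function vectorGradient(matrix, input){
    const m = matrix.length, n = matrix[0].length;
    const grad = [];
    for (let i = 0; i < m; i++) {       // row block for W's row i
        for (let j = 0; j < n; j++) {   // one gradient row per W[i][j]
            const row = new Array(m).fill(0);
            row[i] = input[j][0];       // only output i depends on W[i][j]
            grad.push(row);
        }
    }
    return grad;
}

console.log(vectorGradient([[1, 2], [3, 4]], [[5], [7]]));
// [[5, 0], [7, 0], [0, 5], [0, 7]]
```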
Update
Each layer that has parameters (and therefore a parameter gradient) must implement an update method that takes a parameterGradient and updates its parameters.
In the case of the affine function, the parameter gradient has been vectorized, so the weight and bias gradients need to be extracted from the vectorized form.
/*
    Adds to the target matrix
*/
function addTo(target, matrix){
    for(let i = 0; i < target.length; i++){
        for(let j = 0; j < target[0].length; j++){
            target[i][j] += matrix[i][j];
        }
    }
}
update: function(parameterGradient){
    // the bias gradient occupies the last bias.length rows of the
    // vectorized gradient, in order
    let newBias = [];
    for(let i = 0; i < this.bias.length; i++){
        newBias.push([parameterGradient[parameterGradient.length - this.bias.length + i][0]]);
    }
    let weightGradient = la.unvec(parameterGradient, this.matrix.length, this.matrix[0].length);
    // `step` is the learning rate, defined elsewhere
    addTo(this.bias, la.multiply(step, newBias));
    addTo(this.matrix, la.multiply(step, weightGradient));
},
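`la.unvec` is assumed to invert the row-stacking vectorization: it reshapes the first rows·cols entries of the vectorized gradient back into a rows-by-cols matrix, leaving the trailing bias entries behind. A sketch under that assumption:

```javascript
// Sketch of la.unvec, assuming vec(W) stacks the rows of W:
// reshape the first rows*cols entries of a column vector back into
// a rows-by-cols matrix, row by row (trailing bias entries ignored).
function unvec(vector, rows, cols){
    const out = [];
    for (let i = 0; i < rows; i++) {
        const row = [];
        for (let j = 0; j < cols; j++) {
            row.push(vector[i * cols + j][0]);
        }
        out.push(row);
    }
    return out;
}

console.log(unvec([[1], [2], [3], [4], [9]], 2, 2));
// [[1, 2], [3, 4]]  (the trailing [9] is a bias entry, left out)
```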
Clone
Each layer should also implement a clone function that creates a layer exactly like the current layer.
Full Implementation
/*
    Adds to the target matrix
*/
function addTo(target, matrix){
    for(let i = 0; i < target.length; i++){
        for(let j = 0; j < target[0].length; j++){
            target[i][j] += matrix[i][j];
        }
    }
}
function affine(matrix, bias){
    let layer = {
        type: 'affine',
        matrix: matrix,
        bias: bias,
        // evaluate the function on the inputs
        evaluate: function(input){
            let result = la.multiply(this.matrix, input);
            result = la.add(result, this.bias);
            return result;
        },
        // give the gradient with respect to the inputs
        inputGradient: function(input){
            return la.transpose(this.matrix);
        },
        // give the gradient with respect to the vectorized parameters
        parameterGradient: function(input){
            let grad1 = la.vectorGradient(this.matrix, input);
            let identity = la.identity(this.bias.length);
            for(let row of identity){
                grad1.push(row);
            }
            return grad1;
        },
        update: function(parameterGradient){
            // the bias gradient occupies the last bias.length rows of the
            // vectorized gradient, in order
            let newBias = [];
            for(let i = 0; i < this.bias.length; i++){
                newBias.push([parameterGradient[parameterGradient.length - this.bias.length + i][0]]);
            }
            let weightGradient = la.unvec(parameterGradient, this.matrix.length, this.matrix[0].length);
            // `step` is the learning rate, defined elsewhere
            addTo(this.bias, la.multiply(step, newBias));
            addTo(this.matrix, la.multiply(step, weightGradient));
        },
        clone: function(){
            // copy the parameters so that updating the clone does not
            // modify the original layer
            return affine(
                this.matrix.map(row => row.slice()),
                this.bias.map(row => row.slice())
            );
        }
    };
    return layer;
}