We usually build neural networks with one of the many existing frameworks. But when first learning, it can be worth building one from scratch.

One of the key pieces of a neural network is the backpropagation algorithm. As I explained before, the algorithm can be viewed as a form of dynamic programming, and we can take advantage of that when writing the code.

The example code is written in Julia.

class design

A network has to keep a lot of internal data, such as weights, consistent, so it is better to treat that data as an object. We will therefore have a Layer class and a Network class.

Specifically, we have a perceptron layer and a stack network class, because we are building a simple feedforward neural network.


To treat a layer as an independent unit, each layer keeps its own inputs and outputs, even though these could be shared, since the previous layer's outputs are the next layer's inputs. Keeping both is useful when computing the partial derivatives requires the inputs and outputs.
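As a minimal sketch of such a layer, assuming a sigmoid activation and a weight matrix of size (n_out, n_in): the type name `PerceptronLayer`, its field names, and `forward!` are illustrative choices, not taken from the original code.

```julia
# Hypothetical layer type; names and fields are assumptions for illustration.
mutable struct PerceptronLayer
    W::Matrix{Float64}        # weights, size (n_out, n_in)
    b::Vector{Float64}        # biases, length n_out
    input::Vector{Float64}    # cached input of the last forward pass
    output::Vector{Float64}   # cached output of the last forward pass
end

# Convenience constructor: random weights, zero biases, empty caches.
PerceptronLayer(n_in::Int, n_out::Int) =
    PerceptronLayer(randn(n_out, n_in), zeros(n_out), zeros(n_in), zeros(n_out))

sigmoid(x) = 1 / (1 + exp(-x))

# Forward pass: store the input and output so that a later backward
# pass can compute partial derivatives from them.
function forward!(layer::PerceptronLayer, x::Vector{Float64})
    layer.input = x
    layer.output = sigmoid.(layer.W * x .+ layer.b)
    return layer.output
end
```

Calling `forward!(PerceptronLayer(3, 2), [1.0, 2.0, 3.0])` returns a vector of length 2 and leaves the input and output cached on the layer.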

If we implement batch updates, we need to store the pending updates for all weights temporarily before actually applying any of them.
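One way to sketch this, assuming plain gradient descent averaged over the batch: the type `BatchLayer`, the buffer `dW`, and the functions `accumulate!` and `batch_update!` are hypothetical names used only for illustration.

```julia
# Hypothetical layer that separates "collect updates" from "apply updates".
mutable struct BatchLayer
    W::Matrix{Float64}    # current weights
    dW::Matrix{Float64}   # accumulated updates, same shape as W
end

BatchLayer(W::Matrix{Float64}) = BatchLayer(W, zeros(size(W)))

# Add one sample's gradient to the buffer; W itself is left untouched.
function accumulate!(layer::BatchLayer, grad::Matrix{Float64})
    layer.dW .+= grad
    return layer
end

# Apply the averaged updates in a single step, then clear the buffer.
function batch_update!(layer::BatchLayer, rate::Float64, batch_size::Int)
    layer.W .-= rate .* layer.dW ./ batch_size
    fill!(layer.dW, 0.0)
    return layer
end
```

The weights stay fixed while the whole batch is processed, so every sample in the batch sees the same weights.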


A stack net is a composition of layers, so forward, backward and batch_update just loop over the layers and call the corresponding function of each one. The one exception is that before backward, the network must compute the derivative of the loss function w.r.t. the outputs of the last layer. This is the duty of the network; the layers are not aware of the loss.
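The loop structure might look like the sketch below. It assumes sigmoid layers without biases and a squared-error loss; `Layer`, `StackNet`, and the function names are illustrative and not necessarily those of the original code.

```julia
sigmoid(x) = 1 / (1 + exp(-x))

# Hypothetical layer with cached input/output and an update buffer.
mutable struct Layer
    W::Matrix{Float64}
    input::Vector{Float64}
    output::Vector{Float64}
    dW::Matrix{Float64}
end
Layer(n_in::Int, n_out::Int) =
    Layer(0.1 .* randn(n_out, n_in), zeros(n_in), zeros(n_out), zeros(n_out, n_in))

function forward!(l::Layer, x::Vector{Float64})
    l.input = x
    l.output = sigmoid.(l.W * x)
    return l.output
end

# `err` is dLoss/dOutput for this layer; returns dLoss/dInput,
# which is the error passed to the previous layer.
function backward!(l::Layer, err::Vector{Float64})
    delta = err .* l.output .* (1 .- l.output)  # sigmoid derivative from cached output
    l.dW .+= delta * l.input'                   # accumulate only, do not apply yet
    return l.W' * delta
end

function batch_update!(l::Layer, rate::Float64)
    l.W .-= rate .* l.dW
    fill!(l.dW, 0.0)
    return l
end

struct StackNet
    layers::Vector{Layer}
end

function forward!(net::StackNet, x::Vector{Float64})
    for l in net.layers
        x = forward!(l, x)
    end
    return x
end

function backward!(net::StackNet, target::Vector{Float64})
    # Only the network knows the loss, assumed here to be 0.5 * sum((y - t)^2).
    err = net.layers[end].output .- target
    for l in reverse(net.layers)
        err = backward!(l, err)
    end
    return nothing
end

batch_update!(net::StackNet, rate::Float64) =
    foreach(l -> batch_update!(l, rate), net.layers)
```

A training step is then `forward!(net, x)`, `backward!(net, t)`, and, once the batch is complete, `batch_update!(net, rate)`.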

activation functions

Existing functions are defined for a single number, so they need to be broadcast over the whole array. The partial derivative of each function has to be provided as well.
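In Julia, the dot syntax broadcasts a scalar function over an array. A small sketch follows, where `sigmoid`, `dsigmoid`, and `dtanh` are helper names chosen for illustration (`tanh` and `exp` come from Base):

```julia
# Scalar activation functions and their derivatives.
sigmoid(x::Real) = 1 / (1 + exp(-x))
dsigmoid(x::Real) = sigmoid(x) * (1 - sigmoid(x))
dtanh(x::Real) = 1 - tanh(x)^2

z = [-1.0, 0.0, 1.0]

# The trailing dot applies the scalar function element-wise.
a  = sigmoid.(z)
da = dsigmoid.(z)
t  = tanh.(z)
dt = dtanh.(z)
```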


Random initialization of the weights is the common choice. But to keep the activation function from saturating, it may help to normalize the weights of the edges pointing to the same output neuron so that they sum to 1.
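A minimal sketch of that normalization, assuming the weight matrix has size (n_out, n_in) so that row i holds the weights into output neuron i. The function name `init_weights` is hypothetical, and drawing from a uniform distribution is an assumption made here so that every row sum is positive.

```julia
# Hypothetical initializer: random weights, normalized per output neuron.
function init_weights(n_in::Int, n_out::Int)
    W = rand(n_out, n_in)          # uniform in [0, 1), so each row sum is positive
    return W ./ sum(W, dims=2)     # divide each row by its sum; rows now sum to 1
end

W = init_weights(4, 3)
```

With this choice the weighted input to a neuron is a weighted average of the previous layer's outputs, which keeps it in the same range as those outputs.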