# Artificial Neural Networks/Feed-Forward Networks

## Feedforward Systems

Feed-forward neural networks are the simplest form of ANN. Shown below, a feed-forward neural net contains only forward paths. The multilayer perceptron (MLP) is an example of a feed-forward neural network. The figure below shows a feed-forward network with four hidden layers.

Figure: Feed-forward networks with four hidden layers

## Connection Weights

In a feed-forward system, processing elements (PEs) are arranged into distinct layers, with each layer receiving input from the previous layer and sending output to the next layer. There is no feedback: signals from one layer are never transmitted back to a previous layer. This can be stated mathematically as:

${\displaystyle w_{ij}=0{\mbox{ if }}i=j}$
${\displaystyle w_{ij}=0{\mbox{ if }}layer(i)\leq layer(j)}$

Weights of direct feedback paths, from a neuron to itself, are zero. Weights from a neuron to a neuron in a previous layer are also zero. Notice that weights for the forward paths may also be zero, depending on the specific network architecture, but they do not need to be. A network that lacks some of the possible forward connections is known as a sparsely connected network, or a non-fully connected network. The percentage of available connections that are actually used is known as the connectivity of the network.
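These constraints can be checked directly on a full weight matrix. The following is a minimal sketch, assuming w[i][j] denotes the weight from neuron j to neuron i; the layer assignment and weight values are illustrative, not taken from the text.

```python
import numpy as np

# Neuron index -> layer number (illustrative 2-2-1 architecture)
layer = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2}

# w[i][j] is the weight from neuron j to neuron i
w = np.zeros((5, 5))
w[2, 0], w[2, 1], w[3, 1], w[4, 2] = 0.5, -0.3, 0.8, 1.2  # forward paths only

# Feed-forward constraint: w[i][j] = 0 whenever layer(i) <= layer(j)
feedforward = all(w[i, j] == 0
                  for i in range(5) for j in range(5)
                  if layer[i] <= layer[j])

# Connectivity: fraction of the available forward connections that are used
available = [(i, j) for i in range(5) for j in range(5)
             if layer[i] > layer[j]]
connectivity = sum(w[i, j] != 0 for i, j in available) / len(available)
```

Here 4 of the 8 available forward connections carry nonzero weight, so the connectivity is 50% and the network is sparsely connected.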

## Mathematical Relationships

The weights from each neuron in layer l - 1 to the neurons in layer l are arranged into a matrix wl: each column corresponds to a neuron in layer l - 1, and each row corresponds to a neuron in layer l. The input signal passed from layer l - 1 to layer l is the vector xl. If ρl is a vector of activation functions [σ1 σ2 … σn] that acts elementwise on its input, and bl is an arbitrary offset vector (for generalization), then the total output of layer l is given as:

${\displaystyle {\mathbf {y}}_{l}=\rho _{l}({\mathbf {w}}_{l}{\mathbf {x}}_{l}+{\mathbf {b}}_{l})}$
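This formula translates directly into one matrix-vector product per layer. A minimal sketch, assuming a sigmoid activation and illustrative sizes (two neurons in layer l fed by three neurons in layer l - 1):

```python
import numpy as np

def sigmoid(z):
    """Elementwise activation, standing in for the rho_l vector of sigmas."""
    return 1.0 / (1.0 + np.exp(-z))

w_l = np.array([[0.2, -0.5, 0.1],
                [0.7,  0.3, -0.4]])   # rows: neurons in l, columns: neurons in l-1
x_l = np.array([1.0, 0.5, -1.0])      # input vector from layer l-1
b_l = np.array([0.1, -0.2])           # offset (bias) vector

y_l = sigmoid(w_l @ x_l + b_l)        # one output per neuron in layer l
```

The result y_l has one entry per row of wl, i.e. one per neuron in layer l.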

The output of two layers can be calculated by substituting the output of the first layer for the input of the second:

${\displaystyle {\mathbf {y}}_{l}=\rho _{l}({\mathbf {w}}_{l}\rho _{l-1}({\mathbf {w}}_{l-1}{\mathbf {x}}_{l-1}+{\mathbf {b}}_{l-1})+{\mathbf {b}}_{l})}$

This method can be continued to calculate the output of a network with an arbitrary number of layers. Notice that as the number of layers increases, so does the complexity of this calculation. Sufficiently large neural networks can quickly become too complex for direct mathematical analysis.
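In code, this repeated substitution is just a loop that feeds each layer's output into the next. A sketch under assumed, randomly initialized weights and a sigmoid activation (the 3-4-4-2 architecture is illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
sizes = [3, 4, 4, 2]                  # input size followed by three layers

# One weight matrix and bias vector per layer: rows index layer l,
# columns index layer l-1, matching the matrix convention above.
weights = [rng.standard_normal((n, m))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n) for n in sizes[1:]]

def forward(x):
    y = x
    for w, b in zip(weights, biases):
        y = sigmoid(w @ y + b)        # each layer's output becomes the
    return y                          # next layer's input

output = forward(np.array([0.5, -1.0, 2.0]))
```

Each pass through the loop is one application of the layer formula; nesting the calls symbolically, as in the two-layer equation above, quickly becomes unwieldy, which is why deep networks are analyzed numerically rather than in closed form.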