# Artificial Neural Networks/Feed-Forward Networks

## Feedforward Systems

Feed-forward neural networks are the simplest form of ANN. Shown below, a feed-forward neural net contains only forward paths. The multilayer perceptron (MLP) is an example of a feed-forward neural network. The figure below shows a feed-forward network with four hidden layers.

Figure: Feed-forward networks with four hidden layers

## Connection Weights

In a feed-forward system, processing elements (PEs) are arranged into distinct layers, with each layer receiving input from the previous layer and sending output to the next layer. There is no feedback: signals from one layer are never transmitted back to a previous layer. This can be stated mathematically as:

${\displaystyle w_{ij}=0{\mbox{ if }}i=j}$
${\displaystyle w_{ij}=0{\mbox{ if }}layer(i)\leq layer(j)}$

Weights of direct feedback paths, from a neuron to itself, are zero. Weights from a neuron to a neuron in a previous layer are also zero. Notice that weights for the forward paths may also be zero, depending on the specific network architecture, but they do not need to be. A network that lacks some of the possible forward connections is known as a sparsely connected network, or a non-fully connected network. The percentage of available connections that are actually used is known as the connectivity of the network.
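These constraints can be checked directly on a full weight matrix. The following is a minimal sketch, assuming w[i][j] denotes the weight from neuron j to neuron i; the layer assignment and weight values are illustrative, not taken from the text.

```python
import numpy as np

# Neuron index -> layer number (illustrative 2-2-1 architecture)
layer = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2}

# w[i][j] is the weight from neuron j to neuron i
w = np.zeros((5, 5))
w[2, 0], w[2, 1], w[3, 1], w[4, 2] = 0.5, -0.3, 0.8, 1.2  # forward paths only

# Feed-forward constraint: w[i][j] = 0 whenever layer(i) <= layer(j)
feedforward = all(w[i, j] == 0
                  for i in range(5) for j in range(5)
                  if layer[i] <= layer[j])

# Connectivity: fraction of the available forward connections that are used
available = [(i, j) for i in range(5) for j in range(5)
             if layer[i] > layer[j]]
connectivity = sum(w[i, j] != 0 for i, j in available) / len(available)
```

Here 4 of the 8 available forward connections carry nonzero weight, so the connectivity is 50% and the network is sparsely connected.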

## Mathematical Relationships

The weights from each neuron in layer l - 1 to the neurons in layer l are arranged into a matrix wl: each column corresponds to a neuron in layer l - 1, and each row corresponds to a neuron in layer l. The input signal passed from layer l - 1 to layer l is the vector xl. If ρl is a vector of activation functions [σ1 σ2 … σn] that acts elementwise on its input, and bl is an arbitrary offset vector (for generalization), then the total output of layer l is given as:

${\displaystyle {\mathbf {y}}_{l}=\rho _{l}({\mathbf {w}}_{l}{\mathbf {x}}_{l}+{\mathbf {b}}_{l})}$
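This formula translates directly into one matrix-vector product per layer. A minimal sketch, assuming a sigmoid activation and illustrative sizes (two neurons in layer l fed by three neurons in layer l - 1):

```python
import numpy as np

def sigmoid(z):
    """Elementwise activation, standing in for the rho_l vector of sigmas."""
    return 1.0 / (1.0 + np.exp(-z))

w_l = np.array([[0.2, -0.5, 0.1],
                [0.7,  0.3, -0.4]])   # rows: neurons in l, columns: neurons in l-1
x_l = np.array([1.0, 0.5, -1.0])      # input vector from layer l-1
b_l = np.array([0.1, -0.2])           # offset (bias) vector

y_l = sigmoid(w_l @ x_l + b_l)        # one output per neuron in layer l
```

The result y_l has one entry per row of wl, i.e. one per neuron in layer l.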

The output of two layers can be calculated by substituting the output of the first layer for the input of the second:

${\displaystyle {\mathbf {y}}_{l}=\rho _{l}({\mathbf {w}}_{l}\rho _{l-1}({\mathbf {w}}_{l-1}{\mathbf {x}}_{l-1}+{\mathbf {b}}_{l-1})+{\mathbf {b}}_{l})}$

This method can be continued to calculate the output of a network with an arbitrary number of layers. Notice that as the number of layers increases, so does the complexity of this calculation. Sufficiently large neural networks can quickly become too complex for direct mathematical analysis.
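In code, this repeated substitution is just a loop that feeds each layer's output into the next. A sketch under assumed, randomly initialized weights and a sigmoid activation (the 3-4-4-2 architecture is illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
sizes = [3, 4, 4, 2]                  # input size followed by three layers

# One weight matrix and bias vector per layer: rows index layer l,
# columns index layer l-1, matching the matrix convention above.
weights = [rng.standard_normal((n, m))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n) for n in sizes[1:]]

def forward(x):
    y = x
    for w, b in zip(weights, biases):
        y = sigmoid(w @ y + b)        # each layer's output becomes the
    return y                          # next layer's input

output = forward(np.array([0.5, -1.0, 2.0]))
```

Each pass through the loop is one application of the layer formula; nesting the calls symbolically, as in the two-layer equation above, quickly becomes unwieldy, which is why deep networks are analyzed numerically rather than in closed form.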