Artificial Neural Networks/Boltzmann Machines


Boltzmann learning compares the distribution of the input data, P, with the distribution of the machine's output, Q [24]. The distance between these distributions is the Kullback-Leibler divergence, G, and the weights are adjusted by gradient descent on G:

w_{ij}[n+1] = w_{ij}[n] - \frac{\partial G}{\partial w_{ij}}

Where:

\frac{\partial G}{\partial w_{ij}} = -\frac{1}{T}[p_{ij} - q_{ij}]

Here, pij is the probability that units i and j are both on when the network runs with the training data clamped (the positive phase), and qij is the probability that both are on when the network runs freely (the negative phase). The probability that unit i is on, pi, is given by:

p_i = \frac{1}{1 + e^{-\Delta E_i / T}}

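The update rule and the activation probability above can be sketched numerically. This is a minimal illustration, not a full training loop: the co-activation probabilities p and q are random placeholders standing in for measured positive- and negative-phase statistics, and a learning rate eta is assumed (the text folds it into the update).

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1.0    # temperature
eta = 0.1  # assumed learning rate

# Hypothetical symmetric co-activation probabilities for a 4-unit network:
# p[i, j] from the clamped (positive) phase, q[i, j] from the free-running
# (negative) phase.
p = rng.uniform(size=(4, 4)); p = (p + p.T) / 2
q = rng.uniform(size=(4, 4)); q = (q + q.T) / 2

w = np.zeros((4, 4))

# Gradient step: w <- w - eta * dG/dw, with dG/dw = -(1/T) * (p - q),
# so the weight between i and j grows when p_ij exceeds q_ij.
w = w - eta * (-(1.0 / T) * (p - q))

def p_on(delta_E, T=1.0):
    """Probability that a unit turns on, given its energy gap Delta E_i."""
    return 1.0 / (1.0 + np.exp(-delta_E / T))

print(p_on(0.0))  # 0.5: a zero energy gap makes the unit a fair coin
```

Note how raising the temperature T flattens p_on toward 0.5 for any energy gap, while lowering it makes the unit nearly deterministic.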
T is a scalar constant known as the temperature of the system. Boltzmann learning is very powerful, but the cost of training grows exponentially as more neurons are added to the network. To reduce this effect, a restricted Boltzmann machine (RBM) can be used: unlike in a general Boltzmann machine, the hidden units of an RBM have no connections among themselves. Once trained on a particular feature set, these RBMs can be stacked into larger, more diverse machines.
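The payoff of removing hidden-to-hidden connections is that, given the visible layer, every hidden unit is conditionally independent, so the whole hidden layer can be sampled in one vectorized step. A sketch, with illustrative random weights and the bias vector b_h assumed for completeness:

```python
import numpy as np

rng = np.random.default_rng(1)

n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # illustrative weights
b_h = np.zeros(n_hidden)                               # assumed hidden biases

def sample_hidden(v, W, b_h, T=1.0):
    """Sample the entire hidden layer at once. Because an RBM has no
    hidden-to-hidden connections, each h_j depends only on the visible
    units, so one sigmoid over the whole layer suffices."""
    prob = 1.0 / (1.0 + np.exp(-(v @ W + b_h) / T))      # p(h_j = 1 | v)
    return (rng.uniform(size=prob.shape) < prob).astype(int), prob

v = rng.integers(0, 2, size=n_visible)  # a binary visible configuration
h, prob = sample_hidden(v, W, b_h)
```

In a general Boltzmann machine the hidden units would have to be updated one at a time, each sample depending on the current states of the others; the restriction turns that sequential Gibbs sweep into a single matrix operation.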

Because Boltzmann machine weight updates depend only on the expected co-activation of connected neurons, the learning rule is local, which makes it a plausible model for how actual biological neural networks learn.