Figure NN.11
(a) The result of combining two opposite-facing soft threshold functions to produce a ridge.
(b) The result of combining two ridges to produce a bump.
This turns out to be an easy problem if we think of a network the right way: as a function hw(x) parameterized by the weights w. Given an input vector x = (x1, x2), the activations of the input units are set to (a1, a2) = (x1, x2). The output at unit 5 is given by

a5 = g(w3,5 a3 + w4,5 a4) = g(w3,5 g(w1,3 x1 + w2,3 x2) + w4,5 g(w1,4 x1 + w2,4 x2)).
Thus, we have the output expressed as a function of the inputs and the weights. A similar expression holds for unit 6. As long as we can calculate the derivatives of such expressions with respect to the weights, we can use the gradient-descent loss-minimization method to train the network.
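The sketch below makes this computation concrete. It assumes a logistic sigmoid for the activation function g and the 2-2-2 layout implied above (units 1 and 2 are inputs, units 3 and 4 are hidden, units 5 and 6 are outputs); the weight-dictionary representation and the omission of bias weights are choices made for illustration, not details taken from the text.

```python
import numpy as np

def g(z):
    """Soft threshold activation, here taken to be the logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))

def h_w(x, w):
    """Forward pass for a 2-2-2 network: inputs 1-2, hidden units 3-4, outputs 5-6.

    w is a dict keyed by (i, j), the weight on the link from unit i to unit j.
    Returns the activations (a5, a6)."""
    a1, a2 = x                                   # input activations set to the inputs
    a3 = g(w[(1, 3)] * a1 + w[(2, 3)] * a2)      # hidden unit 3
    a4 = g(w[(1, 4)] * a1 + w[(2, 4)] * a2)      # hidden unit 4
    a5 = g(w[(3, 5)] * a3 + w[(4, 5)] * a4)      # output unit 5
    a6 = g(w[(3, 6)] * a3 + w[(4, 6)] * a4)      # output unit 6
    return a5, a6

# Example: random weights and one input vector.
rng = np.random.default_rng(0)
w = {(i, j): rng.normal() for i in (1, 2) for j in (3, 4)}
w.update({(i, j): rng.normal() for i in (3, 4) for j in (5, 6)})
print(h_w((0.5, -1.0), w))
```

Because h_w is an ordinary differentiable composition of weighted sums and activations, its derivatives with respect to each weight can be computed and used directly in gradient descent.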
Before delving into learning rules, let us look at the ways in which networks generate complicated functions. For example, by adding two opposite-facing soft threshold functions and thresholding the result, we can obtain a "ridge" function as shown in Figure NN.11(a). Combining two such ridges at right angles to each other (i.e., combining the outputs from four hidden units), we obtain a "bump" as shown in Figure NN.11(b).
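A short sketch of this construction follows, again taking the soft threshold to be a logistic sigmoid; the steepness, offsets, and the 1.5 threshold used to combine the two ridges are illustrative values, not ones taken from the figure.

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def ridge(x, steepness=10.0, lo=-1.0, hi=1.0):
    """Two opposite-facing soft thresholds added together: the result is
    close to 1 only in the band between lo and hi, as in Figure NN.11(a)."""
    return g(steepness * (x - lo)) + g(-steepness * (x - hi)) - 1.0

def bump(x, y):
    """Two ridges at right angles, summed and thresholded again, so only
    the central overlap region stays near 1, as in Figure NN.11(b)."""
    return g(10.0 * (ridge(x) + ridge(y) - 1.5))

# Evaluate the bump on a coarse grid: values near 1 appear only at the center.
xs, ys = np.meshgrid(np.linspace(-3, 3, 7), np.linspace(-3, 3, 7))
print(np.round(bump(xs, ys), 2))
```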
With more hidden units, we can produce more bumps of different sizes in more places. In fact, with a single, sufficiently large hidden layer, it is possible to represent any continuous function of the inputs with arbitrary accuracy; with two layers, even discontinuous functions can be represented. Unfortunately, for any particular network structure, it is harder to characterize exactly which functions can be represented and which ones cannot.