Neural Networks
Since its inception in the 1940s, when it was hailed as "The Electronic Brain," the digital computer has been expected to replicate cognitive functions. While developments in hardware technology and software techniques have enabled limited successes in this direction, many cognitive capabilities remain uniquely human. The field of artificial intelligence (AI) seeks to infer general principles of "intelligence" by analyzing human cognitive processes, and then to encode them in software. The standard approach to AI is grounded in the assumption that human cognition is based on the ability to manipulate symbols using logical operators. An alternative approach, generally known as neural networks, is motivated by the observation that the biological brain is the only device capable of intelligent behavior.
Thus, the neural network framework is rooted in principles abstracted from our knowledge of neurobiological function. Some of these follow:
- Computation is performed by a network of interconnected units, each representing a neuron or a set of neurons.
- Each unit generates an output signal that is a simple parametric function of the signals on its input lines, which can be either output signals from other neurons or external signals (environmental input to the network).
- A given unit cannot transmit different signals to different units; its output signal is broadcast to a set of units that can comprise the entire network or any subset.
- Each input line to a neuron has a corresponding weight parameter, a value that determines the influence of that line on the neuron's output.
- The function ultimately computed by a network is determined by the connectivity of the units, their individual input-output functions, and the weights among them.
A diagram of a neural network is shown in Figure 1. The nine units in the diagram are numbered, and the curved lines with arrows indicate the connectivity of the network. Units 1 and 8 receive no input from other units in the network; their activity values are set by stimuli external to the network, as with the light-detecting cells in the retina. Similarly, units 3, 5, and 7 are output units; these units do not activate any other units. Generally, output units generate the ultimate result of the network computation. Biologically, they would correspond to effector cells in the nervous system, that is, cells that directly influence the environment, such as motor neurons that stimulate muscle tissue.
The thickness of each link represents the magnitude of the connection weight, and dashed links denote negative weights. An N-by-N matrix is a more precise (and, for a computer program, more useful) representation of the connectivity in a network with N units. The table that follows refers to the network depicted in Figure 1. (The weight from unit A to unit B is the Bth element in row A, where A and B range from 1 through 9.)
| From \ To | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0.8 | 0 | 0 | 0.3 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | -0.2 | 0 | 0.9 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 1.8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 0 | -0.1 | -1.1 | 0 | 0 | 0.4 | 0 | 0 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0 | 0 | -1.3 | 0 | 1.9 | 0 | 0 | 0 | 1.8 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | -0.7 | 0 | 0 |
The network computation proceeds by computing activity values for each unit according to a function that combines its inputs multiplied by the connection weights. Once the modeler has selected the unit function and set the connectivity of the units, the function computed by the network is determined by the weight values. These values can either be set by the modeler or learned by the network to give a function that is extracted from observations.
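The computation described above can be sketched in Python. This is only an illustration under stated assumptions: the weights are transcribed from the table above as read here, while the logistic unit function, the synchronous update schedule, the number of sweeps, and the names `step` and `run` are choices made for this sketch and are not specified in the article.

```python
import math

# W[a][b] is the weight from unit a+1 to unit b+1
# (rows are sending units, columns are receiving units).
W = [
    [0, 0.8, 0, 0, 0.3, 0, 0, 0, 0],
    [0, 0, 0, -0.2, 0, 0.9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1.8, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, -0.1, -1.1, 0, 0, 0.4, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, -1.3, 0, 1.9, 0, 0, 0, 1.8],
    [0, 0, 0, 0, 0, 0, -0.7, 0, 0],
]
OUTPUT_UNITS = (2, 4, 6)  # units 3, 5, and 7 as zero-based indices


def logistic(x):
    """Assumed unit function: squashes the summed input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def step(activity, external):
    """One synchronous update of every unit.

    `external` maps an input unit's index to its externally set activity.
    """
    n = len(W)
    new_activity = []
    for b in range(n):
        if b in external:
            new_activity.append(external[b])  # clamped by the environment
        else:
            net = sum(W[a][b] * activity[a] for a in range(n))
            new_activity.append(logistic(net))
    return new_activity


def run(external, sweeps=20):
    """Start from zero activity and update repeatedly."""
    activity = [0.0] * len(W)
    for _ in range(sweeps):
        activity = step(activity, external)
    return activity


# Drive unit 1 with activity 1.0 and unit 8 with activity 0.0.
activity = run({0: 1.0, 7: 0.0})
outputs = {u + 1: activity[u] for u in OUTPUT_UNITS}
```

Because the table as read here contains a feedback loop between units 2 and 4, a single pass is not enough; the sketch simply repeats the update a fixed number of times rather than testing for convergence.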
Hard-Wired Neural Networks
Consider the task of adjusting several interdependent parameters of a complex system so that its output function is optimal. An example is scheduling classes at a large university so that conflicts among classroom use, student course choices, and faculty teaching times are minimized. Another is the coordination of the many muscles in the human body during a complex task such as riding a bicycle or juggling. For a task with multiple simultaneous constraints among the variables, it is often impossible to find a solution that is guaranteed to be optimal (that is, the absolute best). But there are several approaches to finding very good solutions, and for many real-world situations this is sufficient.
A neural network can be designed to approach such a problem by defining each unit to represent the "truth value" of a specific hypothesis. For example, in the class-scheduling task, a unit may represent the plausibility that "English 101 will be taught in Room 312 by Professor Miller from 10 A.M. to 11 A.M. on Monday, Wednesday, and Friday." Another unit would represent another hypothesis, such as "Psychology 400 will be taught in Room 312 by Professor Wu from 10 A.M. to 11 A.M. on Monday, Wednesday, and Friday." Since there is a conflict between these two hypotheses, the connection between them would have a negative value. Thus, a network can represent hypotheses as nodes and constraints among hypotheses as weight values between pairs of nodes. The success of this approach depends on the skills and intuitions of the network designer. While there are some guiding principles to network design, it remains very much an art form.
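A toy version of this idea can be sketched as follows. Everything specific here is hypothetical: the third hypothesis, the weight and bias values, and the binary threshold update rule (in the style of a Hopfield network) are assumptions chosen for illustration, not a design given in the article.

```python
# Each unit stands for one scheduling hypothesis (1 = accepted, 0 = rejected).
hypotheses = [
    "English 101 in Room 312, MWF 10-11",
    "Psychology 400 in Room 312, MWF 10-11",
    "Psychology 400 in Room 114, MWF 10-11",  # hypothetical extra hypothesis
]

# Symmetric weights between pairs of units; negative where they conflict.
weights = {
    (0, 1): -2.0,  # two classes cannot share a room at the same time
    (1, 2): -2.0,  # one class cannot meet in two rooms at once
}

# Assumed prior plausibility of each hypothesis taken on its own.
bias = [1.0, 0.5, 1.0]


def weight(i, j):
    """Look up the symmetric weight between units i and j (0 if unconnected)."""
    return weights.get((i, j), weights.get((j, i), 0.0))


def settle(state, max_sweeps=10):
    """Update units one at a time until no unit changes its value."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            net = bias[i] + sum(
                weight(i, j) * state[j] for j in range(len(state)) if j != i
            )
            new_value = 1 if net > 0 else 0
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


final = settle([1, 1, 1])
accepted = [h for h, s in zip(hypotheses, final) if s == 1]
```

Starting with all three hypotheses accepted, the network settles on the first and third, which do not conflict with each other. With larger problems the state reached this way is a good solution in the sense of satisfying many constraints, but it is not guaranteed to be the best one.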
Neural networks have been applied to many problems of this kind, often providing a better solution than other heuristic techniques. A classic example of a difficult (NP-hard) problem is the so-called traveling salesperson problem (TSP). Given a set of N locations, the goal of the TSP is to find the shortest route that visits every location and returns to the starting point. A neural network can be used to find a "pretty good" route; that is, the solution is good, though not guaranteed to be the best. Experimental results published in 1985 by John J. Hopfield and David W. Tank showed that a neural network could find routes very close to optimal.
Learning in Neural Networks
One of the most attractive features of neural network models is their capacity to learn. Since the response properties of the network are determined by connectivity and weight values, a learning procedure can be expressed mathematically as a function that determines the amount by which each weight changes in terms of the activities of the neurons. Most neural network learning procedures are based on a postulate put forward by Donald Hebb (1904–1985) in his classic book, The Organization of Behavior: "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
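Hebb stated his postulate in words, not as an equation. One common way to turn it into a weight-change function is to increase the weight from A to B in proportion to the product of their activities. The sketch below assumes that formalization; the function name `hebbian_update` and the learning rate of 0.1 are illustrative choices.

```python
def hebbian_update(weight, pre_activity, post_activity, rate=0.1):
    """Return the A-to-B weight after one Hebbian step.

    The change is rate * pre_activity * post_activity, so the weight grows
    only when the sending unit (A) and the receiving unit (B) are active
    together.
    """
    return weight + rate * pre_activity * post_activity


# A and B fire together on five occasions: the connection strengthens.
w_together = 0.0
for _ in range(5):
    w_together = hebbian_update(w_together, pre_activity=1.0, post_activity=1.0)

# A fires but B stays silent: the connection is unchanged.
w_apart = 0.0
for _ in range(5):
    w_apart = hebbian_update(w_apart, pre_activity=1.0, post_activity=0.0)
```

In this plain form the weight can only grow, which is one reason practical learning rules add further terms such as decay or normalization.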
Various mathematical and computational enhancements to Hebb's original theory have led to many approaches to machine learning. Of these, a technique known as backpropagation, originally discovered by Paul J. Werbos (1974), has been the primary driving force in neural networks since the mid-1980s.
Typically, a network that "learns" is trained to capture the relationship between a set of input and output variables that have been independently measured across many examples. For example, the input might be a set of blood test results and the output might be a binary value that indicates whether the patient was later found to have a particular form of cancer. A learning network could be defined as a set of input units (each representing a blood test result) that activate a set of intermediate units, which in turn activate a single output unit (representing the development of a tumor). Initially, the weights are randomly assigned, and so the network output is not useful for medical diagnosis. Training proceeds by presenting input data to the network and comparing the resulting network output to the "target" value (the corresponding output value in the data). The backpropagation technique specifies how to adjust the network weights based on the network error (target minus network output) and the input activation values.
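The training loop described above can be sketched with a very small network. This is a minimal illustration, not a diagnostic model: the data are made-up numbers standing in for two normalized test results, and the logistic units, squared-error gradient, learning rate, network size, and the class name `TinyNetwork` are all assumptions of this sketch.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Input units -> intermediate (hidden) units -> one output unit."""

    def __init__(self, n_inputs, n_hidden, rng):
        # Random initial weights; the extra weight in each list is a bias.
        self.w_hidden = [
            [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_hidden)
        ]
        self.w_output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        x = list(inputs) + [1.0]  # constant 1.0 feeds the bias weights
        hidden = [
            logistic(sum(w * xi for w, xi in zip(row, x)))
            for row in self.w_hidden
        ]
        h = hidden + [1.0]
        output = logistic(sum(w * hi for w, hi in zip(self.w_output, h)))
        return x, h, output

    def train_step(self, inputs, target, rate=0.5):
        x, h, output = self.forward(inputs)
        error = target - output  # network error: target minus output
        delta_out = error * output * (1.0 - output)
        # Propagate the error back through the output weights
        # (using their values before this step's update).
        delta_hidden = [
            delta_out * self.w_output[j] * h[j] * (1.0 - h[j])
            for j in range(len(self.w_hidden))
        ]
        for j in range(len(self.w_output)):
            self.w_output[j] += rate * delta_out * h[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] += rate * delta_hidden[j] * x[i]
        return error


def mean_squared_error(net, data):
    return sum((t - net.forward(x)[2]) ** 2 for x, t in data) / len(data)


# Made-up examples: (two normalized test values, target 0 or 1).
data = [
    ([0.1, 0.2], 0), ([0.9, 0.8], 1), ([0.2, 0.9], 0),
    ([0.8, 0.1], 0), ([0.7, 0.9], 1), ([0.3, 0.3], 0),
]

net = TinyNetwork(n_inputs=2, n_hidden=3, rng=random.Random(0))
error_before = mean_squared_error(net, data)
for _ in range(2000):
    for inputs, target in data:
        net.train_step(inputs, target)
error_after = mean_squared_error(net, data)
```

Comparing `error_before` with `error_after` shows whether training reduced the mismatch between the network output and the targets on these examples; it says nothing about how the network would perform on new data.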
This technique is applicable to pattern recognition problems across a wide range of domains. These include military applications, such as detecting submarines from sonar signals; financial applications, including stock market prediction; medical diagnosis; and speech recognition. Neural network learning algorithms have also shed light on the neurobiological mechanisms underlying human learning.
see also Artificial Intelligence; Expert Systems; Pattern Recognition.
Paul Munro
Bibliography
Hebb, Donald O. The Organization of Behavior. New York: Wiley, 1949.
Hopfield, John J., and David W. Tank. "'Neural' Computation of Decisions in Optimization Problems." Biological Cybernetics 52 (1985): 141–152.
Werbos, Paul J. "Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences." Ph.D. Thesis, Harvard University, 1974.
neural network
neu·ral net·work (also neu·ral net) • n. a computer system modeled on the human brain and nervous system.