Neural networks are a way of processing information that is used in both biological nervous systems and in artificial intelligence, or machine learning. Digital computers follow a von Neumann model, moving data from memory, performing clearly defined calculations on it, and then storing the resulting transformed data back in memory. Neural networks, on the other hand, use a network of connections to both store and process information. They are able to learn to perform operations given only an objective, with no precise instructions as to how those operations should be carried out. In many ways they function like a highly complex statistical regression, fitting a model in many dimensions. Neural networks are particularly well suited to applications such as image recognition, speech recognition and adaptive control.
Biological Neural Networks
Biological neural networks consist of nerve cells called neurons, each with many nerve fibres protruding from it; the fibres that carry signals away from the cell are known as axons. The fibres of different neurons meet at electrochemical junctions known as synapses. Synapses act as variable attenuators, allowing the strength of the connections between neurons to be adjusted. It is thought that both memories and skills are formed by creating pathways through which signals can flow via synapses with lowered resistance. Although this basic theory of the brain was suggested in the 19th century, it was the mathematical models of its functioning developed in the 1940s that led directly to the development of artificial neural networks.
Artificial Neural Networks
Artificial neural networks are simplified mathematical models of biological ones, which can be simulated within a conventional digital computer. These models have a set of numerical inputs and a set of numerical outputs. The inputs are linked to the outputs through a network of connections arranged in layers of nodes. The weights of the connections can be adjusted to make particular paths through the network more or less influential. The neural network can then be presented with a training set of inputs for which the correct outputs are already known, and the weights are adjusted to minimize the difference between the network's outputs and the correct outputs. This adjustment, known as training, is typically carried out by gradient-based optimization and involves large matrix computations.
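As a concrete illustration of this training process, the sketch below builds a tiny network in Python with NumPy and adjusts its weights by gradient descent on a toy training set. The network shape, data, learning rate and step count are illustrative assumptions, not a prescribed recipe.

```python
# A minimal sketch of the training process described above, using NumPy.
# The network shape, toy data, learning rate and step count are
# illustrative assumptions rather than a reference implementation.
import numpy as np

rng = np.random.default_rng(0)

# Training set of inputs for which the correct outputs are known (XOR).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights of the connections between the layers, plus bias terms.
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

learning_rate = 1.0
for step in range(5000):
    # Forward pass: propagate the inputs through the weighted connections.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Difference between the network's outputs and the correct outputs.
    error = output - y

    # Backward pass: nudge the weights to reduce that difference
    # (gradient descent on the squared error).
    grad_out = error * output * (1 - output)
    grad_hid = (grad_out @ W2.T) * hidden * (1 - hidden)

    W2 -= learning_rate * hidden.T @ grad_out
    b2 -= learning_rate * grad_out.sum(axis=0, keepdims=True)
    W1 -= learning_rate * X.T @ grad_hid
    b1 -= learning_rate * grad_hid.sum(axis=0, keepdims=True)

print(np.round(output, 2))  # should end close to the targets 0, 1, 1, 0
```

Each pass through the loop computes the outputs, measures the error against the known correct outputs, and then adjusts the connection weights in the direction that reduces that error, which is the essence of training.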
Neural networks form non-linear statistical models. This is much like a regression fitting a curve to data in two dimensions. However, a neural network can fit models in many dimensions, spotting patterns in numerical data that humans cannot see. For example, it can learn a mapping from sensor inputs to outputs in complex control applications. One limitation of neural networks is that, because of the high dimensionality of the models, they require very large datasets to train them effectively.
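To make the regression analogy concrete, the short sketch below fits a noisy non-linear curve with a small neural network. It assumes the scikit-learn library, and the layer size and iteration count are arbitrary illustrative choices; fitting in many dimensions only changes the shape of the input array, which is where the need for much larger datasets comes from.

```python
# A sketch of neural-network regression in one input dimension,
# assuming scikit-learn is installed; the settings are illustrative only.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))                # inputs
y = np.sin(X).ravel() + 0.1 * rng.normal(size=500)   # noisy non-linear target

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=5000, random_state=1)
model.fit(X, y)                                      # fit the curve to the data

print(model.predict([[1.5]]))  # should be close to sin(1.5) ≈ 1.0
```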
Networks may be shallow or deep, referring to the number of layers of nodes between the inputs and the outputs. The application of deep neural networks is known as deep learning and has proved to be one of the most productive areas of artificial intelligence in recent years.
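In code, the distinction is simply how many hidden layers sit between the inputs and the outputs; the sketch below reuses scikit-learn's MLPRegressor purely as an illustration, with arbitrary layer sizes.

```python
# Shallow versus deep: one hidden layer against several, using the same
# assumed scikit-learn estimator as above; layer sizes are arbitrary.
from sklearn.neural_network import MLPRegressor

shallow_net = MLPRegressor(hidden_layer_sizes=(16,))           # one hidden layer
deep_net = MLPRegressor(hidden_layer_sizes=(64, 64, 64, 64))   # four hidden layers
```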