Syed Najib Syed Abdul Bahari
Manufacturing Engineering Department
Kuliyyah of Engineering, International Islamic University
Revised January 24, 2009
Definition (1)
An artificial neural network (ANN), often just called a "neural network" (NN), is a mathematical model or computational model based on biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.
In more practical terms neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
A neural network is an interconnected group of nodes, akin to the vast network of neurons in the human brain.
Definition (2)
An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well.
Architecture of neural networks
Feed-forward ANNs (figure 1) allow signals to travel one way only, from input to output. There are no feedback loops, i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organization is also referred to as bottom-up or top-down.
Feedback networks (figure 1) can have signals traveling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' is changing continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organizations.
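The settling behaviour described above can be sketched in a few lines. The symmetric weight matrix and start state below are made-up illustrative values, not taken from the text; units update one at a time until no unit changes, i.e. an equilibrium is reached.

```python
import numpy as np

# Toy feedback (recurrent) network: units repeatedly update from a start
# state until nothing changes, i.e. the network settles at an equilibrium.
# The symmetric weight matrix W is a made-up example.
W = np.array([[ 0.0,  1.0, -1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0, -1.0,  0.0]])

def settle(state):
    """Update one unit at a time until the state stops changing."""
    state = np.array(state, dtype=float)
    changed = True
    while changed:
        changed = False
        for i in range(len(state)):
            new_val = 1.0 if W[i] @ state >= 0 else -1.0
            if new_val != state[i]:
                state[i] = new_val
                changed = True
    return state

print(settle([1.0, -1.0, 1.0]))   # the state the network settles into
```

With a symmetric weight matrix and zero self-connections, this one-unit-at-a-time update is guaranteed to reach a fixed point, which is why it is used here rather than updating all units simultaneously.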
The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of "output" units. (see Figure 4.1)
The activity of the input units represents the raw information that is fed into the network.
The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units.
The behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units.
This simple type of network is interesting because the hidden units are free to construct their own representations of the input. The weights between the input and hidden units determine when each hidden unit is active, and so by modifying these weights, a hidden unit can choose what it represents.
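The input-to-hidden-to-output flow described above can be sketched as a simple forward pass. All weight values here are illustrative placeholders, not from the text:

```python
import numpy as np

# Minimal sketch of a three-layer feed-forward pass (input -> hidden -> output).
# Weights are illustrative values only.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.5, -1.0])              # input unit activities (raw information)
W_ih = np.array([[0.1, 0.4],           # weights: input -> hidden (3 hidden units)
                 [-0.2, 0.3],
                 [0.5, -0.1]])
W_ho = np.array([[0.7, -0.6, 0.2]])    # weights: hidden -> output (1 output unit)

hidden = sigmoid(W_ih @ x)             # hidden activity from inputs and weights
output = sigmoid(W_ho @ hidden)        # output activity from hidden layer
print(output)
```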
We also distinguish single-layer and multi-layer architectures. The single-layer organization, in which all units are connected to one another, constitutes the most general case and has greater potential computational power than hierarchically structured multi-layer organizations. In multi-layer networks, units are often numbered by layer, instead of following a global numbering.
The Learning Process
All learning methods used for adaptive neural networks can be classified into two major categories:

Supervised learning incorporates an external teacher, so that each output unit is told what its desired response to input signals ought to be. During the learning process global information may be required. Paradigms of supervised learning include error-correction learning, reinforcement learning and stochastic learning.

An important issue concerning supervised learning is the problem of error convergence, i.e. the minimization of the error between the desired and computed unit values. The aim is to determine a set of weights which minimizes the error. One well-known method, common to many learning paradigms, is least mean square (LMS) convergence.
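A minimal sketch of LMS error-correction learning for a single linear unit is given below; the data, learning rate, and "true" weights are invented for illustration:

```python
import numpy as np

# Sketch of least-mean-squares (LMS) error-correction learning: the weights
# of one linear unit move down the gradient of the squared error between
# the desired and computed unit values. Data and learning rate are made up.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])          # hypothetical "teacher" relationship
X = rng.normal(size=(200, 2))
d = X @ true_w                          # desired (teacher-supplied) outputs

w = np.zeros(2)
eta = 0.05                              # learning rate
for x, target in zip(X, d):
    y = w @ x                           # computed unit value
    w += eta * (target - y) * x         # LMS weight update

print(w)                                # approaches true_w as the error shrinks
```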
Unsupervised learning uses no external teacher and is based upon only local information. It is also referred to as self-organization, in the sense that it self-organizes data presented to the network and detects their emergent collective properties. Paradigms of unsupervised learning are Hebbian learning and competitive learning.
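As a sketch of such purely local, teacherless learning, the following uses Hebbian updates with Oja's normalisation (a common stabilised form of Hebb's rule); the correlated input data are made up:

```python
import numpy as np

# Sketch of unsupervised Hebbian learning (Oja's stabilised form): the unit
# uses only its local input x and output y, yet its weight vector drifts
# toward the dominant direction of the data. The data are invented.
rng = np.random.default_rng(1)
base = rng.normal(size=500)
# correlated 2-D inputs: most variance lies along the [1, 1] direction
X = np.column_stack([base + 0.1 * rng.normal(size=500),
                     base + 0.1 * rng.normal(size=500)])

w = np.array([1.0, 0.0])
eta = 0.01
for x in X:
    y = w @ x                           # unit output (local information only)
    w += eta * y * (x - y * w)          # Hebbian term minus a decay term

print(w)                                # roughly aligned with [1, 1] / sqrt(2)
```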
Another aspect of learning concerns the distinction, or not, of a separate phase during which the network is trained and a subsequent operation phase. We say that a neural network learns off-line if the learning phase and the operation phase are distinct. A neural network learns on-line if it learns and operates at the same time. Usually, supervised learning is performed off-line, whereas unsupervised learning is performed on-line.
There are three main types of learning:

• Supervised learning, where the neuron (or NN) is provided with a data set consisting of input vectors and a target (desired output) associated with each input vector. This data set is referred to as the training set. The aim of supervised training is then to adjust the weight values such that the error between the real output, o = f(net − θ), of the neuron and the target output, t, is minimized.

• Unsupervised learning, where the aim is to discover patterns or features in the input data with no assistance from an external source. Many unsupervised learning algorithms basically perform a clustering of the training patterns.

• Reinforcement learning, where the aim is to reward the neuron (or parts of a NN) for good performance, and to penalize the neuron for bad performance.
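The supervised case for a single neuron with output o = f(net − θ) can be sketched as a perceptron-style training loop. The AND-gate training set, step activation, and learning rate are illustrative choices, not from the text:

```python
import numpy as np

# Supervised training of a single neuron with output o = f(net - theta):
# weights (and the threshold, treated like a bias) are nudged to shrink the
# error between o and the target t. The AND-gate task is illustrative.
def f(net):                              # step activation function
    return 1.0 if net >= 0 else 0.0

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)  # targets: logical AND

w = np.zeros(2)
theta = 0.0
eta = 0.1
for _ in range(20):                      # passes over the training set
    for x, target in zip(X, t):
        o = f(w @ x - theta)             # o = f(net - theta)
        w += eta * (target - o) * x      # adjust weights to reduce the error
        theta -= eta * (target - o)      # the threshold learns like a bias

print([f(w @ x - theta) for x in X])     # outputs after training
```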
Applications for Neural Networks
Neural networks are applicable in virtually every situation in which a relationship between the predictor variables (independents, inputs) and predicted variables (dependents, outputs) exists, even when that relationship is very complex and not easy to articulate in the usual terms of "correlations" or "differences between groups." Representative examples of problems to which neural network analysis has been applied successfully are given in the sections below.
Future Application - Application of Neural Networks in unit trust market prediction.
Integration of Neural Networks with computing / AI / fuzzy logic etc.
Real life applications
The tasks to which artificial neural networks are applied tend to fall within the following broad categories:
Application areas include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition, etc.), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, data mining (or knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering.
In investment analysis:
To attempt to predict the movement of stocks, currencies, etc., from previous data. There, they are replacing earlier, simpler linear models.
In signature analysis:
As a mechanism for comparing signatures made (e.g. in a bank) with those stored. This is one of the first large-scale applications of neural networks.
In process control:
There are clearly applications to be made here: most processes cannot be determined as computable algorithms. Newcastle University Chemical Engineering Department is working with industrial partners (such as Zeneca and BP) in this area.
In monitoring:
Networks have been used to monitor
In marketing:
Networks have been used to improve marketing mail shots. One technique is to run a test mailshot, and look at the pattern of returns from this. The idea is to find a predictive mapping from the data known about the clients to how they have responded. This mapping is then used to direct further mail shots.
Given this description of neural networks and how they work, what real world applications are they suited for? Neural networks have broad applicability to real world business problems. In fact, they have already been successfully applied in many industries.
Since neural networks are best at identifying patterns or trends in data, they are well suited for prediction or forecasting needs including:
· sales forecasting
· industrial process control
· customer research
· data validation
· risk management
· target marketing
To give some more specific examples, ANNs are also used in the following paradigms: recognition of speakers in communications; diagnosis of hepatitis; recovery of telecommunications from faulty software; interpretation of multi-meaning Chinese words; undersea mine detection; texture analysis; three-dimensional object recognition; hand-written word recognition; and facial recognition.
Pen PCs
PCs where one can write on a tablet, and the writing will be recognized and translated into (ASCII) text.
Speech and Vision recognition systems
Not new, but neural networks are becoming an increasingly common part of such systems. They are used as a system component, in conjunction with traditional computers.
White goods and toys
As Neural Network chips become available, the possibility of simple cheap systems which have learned to recognize simple entities (e.g. walls looming, or simple commands like Go, or Stop), may lead to their incorporation in toys and washing machines etc. Already the Japanese are using a related technology, fuzzy logic, in this way. There is considerable interest in the combination of fuzzy and neural technologies.
Using a Neural Network
The previous section describes in simplified terms how a neural network turns inputs into outputs. The next important question is: how do you apply a neural network to solve a problem?
The type of problem amenable to solution by a neural network is defined by the way they work and the way they are trained. Neural networks work by feeding in some input variables, and producing some output variables. They can therefore be used where you have some known information, and would like to infer some unknown information (see Patterson, 1996; Fausett, 1994). Some examples are:
Stock market prediction. You know last week's stock prices and today's DOW, NASDAQ, or FTSE index; you want to know tomorrow's stock prices.
Credit assignment. You want to know whether an applicant for a loan is a good or bad credit risk. You usually know applicants' income, previous credit history, etc. (because you ask them these things).
Control. You want to know whether a robot should turn left, turn right, or move forwards in order to reach a target; you know the scene that the robot's camera is currently observing.
Needless to say, not every problem can be solved by a neural network. You may wish to know next week's lottery result, and know your shoe size, but there is no relationship between the two. Indeed, if the lottery is being run correctly, there is no fact you could possibly know that would allow you to infer next week's result. Many financial institutions use, or have experimented with, neural networks for stock market prediction, so it is likely that any trends predictable by neural techniques are already discounted by the market, and (unfortunately), unless you have a sophisticated understanding of that problem domain, you are unlikely to have any success there either!
Another important requirement for the use of a neural network is that you know (or at least strongly suspect) that there is a relationship between the proposed known inputs and unknown outputs. This relationship may be noisy (you certainly would not expect that the factors given in the stock market prediction example above could give an exact prediction, as prices are clearly influenced by other factors not represented in the input set, and there may be an element of pure randomness) but it must exist.
In general, if you use a neural network, you won't know the exact nature of the relationship between inputs and outputs - if you knew the relationship, you would model it directly. The other key feature of neural networks is that they learn the input/output relationship through training. There are two types of training used in neural networks, with different types of networks using different types of training. These are supervised and unsupervised training, of which supervised is the most common and will be discussed in this section (unsupervised learning is described in a later section).
In supervised learning, the network user assembles a set of training data. The training data contains examples of inputs together with the corresponding outputs, and the network learns to infer the relationship between the two. Training data is usually taken from historical records. In the above examples, this might include previous stock prices and DOW, NASDAQ, or FTSE indices, records of previous successful loan applicants, including questionnaires and a record of whether they defaulted or not, or sample robot positions and the correct reaction.
The neural network is then trained using one of the supervised learning algorithms (of which the best known example is back propagation, devised by Rumelhart et al., 1986), which uses the data to adjust the network's weights and thresholds so as to minimize the error in its predictions on the training set. If the network is properly trained, it has then learned to model the (unknown) function that relates the input variables to the output variables, and can subsequently be used to make predictions where the output is not known.
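A toy sketch of such supervised training with back propagation is given below, using a small 2-2-1 sigmoid network. The XOR task and all hyperparameters are illustrative choices; the point is the error-gradient weight update.

```python
import numpy as np

# Sketch of supervised training by back propagation on a tiny 2-2-1 sigmoid
# network. Task (XOR), learning rate and iteration count are illustrative.
rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 2)); b1 = np.zeros(2)    # input -> hidden
W2 = rng.normal(size=(2, 1)); b2 = np.zeros(1)    # hidden -> output
eta = 0.5

def forward(X):
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

_, y = forward(X)
err_before = np.mean((y - t) ** 2)                # error before training
for _ in range(5000):
    h, y = forward(X)
    # backward pass: propagate the output error toward the input layer
    delta2 = (y - t) * y * (1 - y)
    delta1 = (delta2 @ W2.T) * h * (1 - h)
    W2 -= eta * h.T @ delta2;  b2 -= eta * delta2.sum(axis=0)
    W1 -= eta * X.T @ delta1;  b1 -= eta * delta1.sum(axis=0)

_, y = forward(X)
err_after = np.mean((y - t) ** 2)
print(err_before, err_after)                      # error shrinks as weights adapt
```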
The computing world has a lot to gain from neural networks. Their ability to learn by example makes them very flexible and powerful. Furthermore there is no need to devise an algorithm in order to perform a specific task; i.e. there is no need to understand the internal mechanisms of that task. They are also very well suited for real time systems because of their fast response and computational times which are due to their parallel architecture.
Neural networks also contribute to other areas of research such as neurology and psychology. They are regularly used to model parts of living organisms and to investigate the internal mechanisms of the brain.
Perhaps the most exciting aspect of neural networks is the possibility that some day 'conscious' networks might be produced. There are a number of scientists arguing that consciousness is a 'mechanical' property and that 'conscious' neural networks are a realistic possibility.
Finally, I would like to state that even though neural networks have a huge potential we will only get the best of them when they are integrated with computing, AI, fuzzy logic and related subjects.
Recommended Textbooks
Bishop, C. (1995). Neural Networks for Pattern Recognition.
Carling, A. (1992). Introducing Neural Networks.
Fausett, L. (1994). Fundamentals of Neural Networks.
Haykin, S. (1994). Neural Networks: A Comprehensive Foundation.
Patterson, D. (1996). Artificial Neural Networks.
Ripley, B.D. (1996). Pattern Recognition and Neural Networks.