A neural network is a powerful data modeling tool that is able to capture and represent complex input/output relationships. The motivation for the development of neural network technology stemmed from the desire to develop an artificial system that could perform "intelligent" tasks similar to those performed by the human brain. Neural networks resemble the human brain in the following two ways:

- A neural network acquires knowledge through learning
- A neural network's knowledge is stored within inter-neuron connection strengths known as synaptic weights

The Neural Network algorithm is an artificial intelligence technique that explores more possible data relationships than other algorithms. Because it is so thorough, it usually takes longer to process than other classification algorithms.

A neural network consists of basic units modeled after biological neurons. Each unit has many inputs that it combines into a single output value. The units are connected together, so the outputs of some units are used as inputs into other units. The network can have one or more middle layers called hidden layers. The simplest networks are feed-forward networks (pictured), where there is only a one-way flow through the network from the inputs to the outputs. There are no cycles in feed-forward networks.

As mentioned, units combine inputs into a single output value. This combination is called the unit’s activation function. Consider this example: the human ear can function near a working jet engine, yet if it were only 10 times more sensitive, you would be able to hear a single molecule hitting the membrane in your ear. What does that mean? The response is not linear: going from 0.01 to 0.02 should produce a difference comparable to going from 100 to 200. Biology exhibits many types of such non-linear behavior.

Thus, an activation function has two parts. The first part is the combination function, which merges all of the inputs into a single value (a weighted sum, for example). The second part is the transfer function, which maps the value of the combination function to the output value of the unit. A linear transfer function would amount to plain linear regression. Typical transfer functions are S-shaped, like the sigmoid function:

$$\mathrm{Sigmoid}(x) = \frac{1}{1 + e^{-x}}$$
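The two parts of an activation function can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm's actual implementation; the function names and the `bias` parameter are assumptions introduced here.

```python
import math


def weighted_sum(inputs, weights, bias=0.0):
    """Combination function: merge all inputs into a single value."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def sigmoid(x):
    """Transfer function: squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def unit_output(inputs, weights, bias=0.0):
    """Activation of one unit: transfer function applied to the combination."""
    return sigmoid(weighted_sum(inputs, weights, bias))


if __name__ == "__main__":
    # Hypothetical inputs and weights, chosen only for demonstration.
    print(unit_output([1.0, 2.0], [0.5, -0.25]))
```

Replacing `sigmoid` with the identity function would leave only the weighted sum, which is the linear case mentioned above.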

In most cases a single hidden layer is sufficient, so the Neural Network algorithm uses at most one hidden layer (with zero hidden layers, it reduces to Logistic Regression).

The Neural Network algorithm uses the hyperbolic tangent activation function in the hidden layer and the sigmoid function in the output layer. You can see a Neural Network with a single hidden layer in the following picture.
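As a rough sketch of how a value flows through such a network, the following Python function computes one forward pass with a hyperbolic tangent hidden layer and a single sigmoid output unit. The function name, the parameter layout (one weight list per hidden unit), and the single output unit are assumptions made for illustration; the actual algorithm's internals may differ.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a network with a single hidden layer."""
    # Hidden layer: weighted sum of the inputs, then hyperbolic tangent.
    hidden = [
        math.tanh(sum(x * w for x, w in zip(inputs, unit_weights)) + bias)
        for unit_weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: weighted sum of the hidden outputs, then sigmoid.
    z = sum(h * w for h, w in zip(hidden, output_weights)) + output_bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical weights for two inputs and two hidden units.
    result = forward(
        inputs=[1.0, 2.0],
        hidden_weights=[[0.4, -0.2], [0.1, 0.3]],
        hidden_biases=[0.0, -0.5],
        output_weights=[0.7, -0.6],
        output_bias=0.1,
    )
    print(result)
```

Because the flow is strictly from inputs to hidden units to the output, there are no cycles, which is what makes this a feed-forward network.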

Training a neural network is the process of setting the best weights on the inputs of each of the units. This *backpropagation* process does the following:

- Gets a training example and calculates outputs
- Calculates the error – the difference between the calculated and the expected (known) result
- Adjusts the weights to minimize the error
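The three steps above can be sketched as a small training loop. This is a minimal illustration under stated assumptions, not the algorithm's actual implementation: it assumes a squared-error measure, plain per-example gradient descent, one sigmoid output unit, and a hypothetical `train` function with made-up defaults for the number of hidden units, learning rate, and number of epochs.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, n_hidden=3, learning_rate=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network; returns a prediction function."""
    rng = random.Random(seed)
    n_in = len(examples[0][0])
    # Small random starting weights; the last weight of each unit is its bias.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        xb = list(x) + [1.0]  # append constant input for the bias
        hidden = [math.tanh(sum(w * v for w, v in zip(ws, xb)))
                  for ws in w_hidden]
        hb = hidden + [1.0]
        out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
        return xb, hb, out

    for _ in range(epochs):
        for x, target in examples:
            # Step 1: get a training example and calculate the output.
            xb, hb, out = forward(x)
            # Step 2: calculate the error against the known result.
            error = out - target
            # Step 3: propagate the error backward and adjust the weights.
            delta_out = error * out * (1.0 - out)  # sigmoid derivative
            delta_hidden = [delta_out * w_out[j] * (1.0 - hb[j] ** 2)  # tanh derivative
                            for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                w_out[j] -= learning_rate * delta_out * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] -= learning_rate * delta_hidden[j] * xb[i]

    return lambda x: forward(x)[2]


if __name__ == "__main__":
    # Toy data set (logical OR), used only for demonstration.
    data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
    predict = train(data)
    for x, target in data:
        print(x, target, round(predict(x), 3))
```

The hidden-layer deltas are computed before the output weights are changed, so each update uses the weights that produced the error being corrected.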

Like the Decision Trees algorithm, you can use the Neural Network algorithm for classification and prediction. The interpretation of the Neural Network algorithm results is somewhat more complex than the interpretation of the Decision Trees algorithm results. Consequently, the Decision Trees algorithm is more popular.

## Comments

## tom said:

I don't think this addressed Data Mining at all.

Data Mining: Sift through tons of data to find those that will allow some prediction to be made.

NNet's would be part of the "prediction" I'm talking about and have nothing really to do per se with DMing.

However, NNet's could be used to search for data that would improve the prediction. Candidate data would be used as Inputs to NNet's and some algorithm would be used to evaluate that input and how to select the next candidate.

This would be useful information, however, rehashing NNets serves no purpose.

## Dejan Sarka said:

Tom,

I am sorry, but I don't see your point. An artificial neural network is one of the data mining and/or machine learning algorithms.

This article is part of a series of articles in which I describe the algorithms. I am not talking about the data mining process here. I wrote a series of articles about the data mining process in a very detailed fraud detection example. You can find them on this site.

Dejan Sarka