AI News

The various types of neural networks are explained and demonstrated, applications of artificial neural networks (ANNs) in medicine are described, and a detailed historical background is provided.

It is composed of a large number of highly interconnected processing elements (neurones) working in unison to solve specific problems.

In 1969 Minsky and Papert published a book in which they summed up a general feeling of frustration with neural networks among researchers, and its conclusions were accepted by most without further analysis.

The first artificial neuron was produced in 1943 by the neurophysiologist Warren McCulloch and the logician Walter Pitts.

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques.

The network is composed of a large number of highly interconnected processing elements (neurones) working in parallel to solve a specific problem.

Moreover, a large number of tasks require systems that use a combination of the two approaches (normally a conventional computer is used to supervise the neural network) in order to perform at maximum efficiency.

In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites.

The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches.

At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurones.

When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity down its axon.

However because our knowledge of neurones is incomplete and our computing power is limited, our models are necessarily gross idealisations of real networks of neurones.

In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output.

If the input pattern does not belong in the taught list of input patterns, the firing rule is used to determine whether to fire or not.

The rule goes as follows: Take a collection of training patterns for a node, some of which cause it to fire (the 1-taught set of patterns) and others which prevent it from doing so (the 0-taught set).

Then the patterns not in the collection cause the node to fire if, on comparison, they have more input elements in common with the 'nearest' pattern in the 1-taught set than with the 'nearest' pattern in the 0-taught set.

For example, a 3-input neuron is taught to output 1 when the input (X1, X2 and X3) is 111 or 101 and to output 0 when the input is 000 or 001.

Take the pattern 010 as an example: it differs from 000 in 1 element, from 001 in 2 elements, from 101 in 3 elements and from 111 in 2 elements. The nearest taught pattern is therefore 000, which belongs in the 0-taught set, so the firing rule says the neuron should not fire when the input is 010.

Therefore the firing rule gives the neuron a sense of similarity and enables it to respond 'sensibly' to patterns not seen during training.
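As a small illustration of this rule, the sketch below (a hypothetical helper, not part of the original text) picks the nearest taught pattern by Hamming distance and fires accordingly, using the taught patterns from the example above:

```python
def hamming(a, b):
    """Number of positions in which two equal-length bit patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def fires(pattern, one_taught, zero_taught):
    """Apply the firing rule: fire if the pattern has more elements in common
    with the nearest 1-taught pattern than with the nearest 0-taught pattern.
    Ties are left undecided (None)."""
    d1 = min(hamming(pattern, p) for p in one_taught)
    d0 = min(hamming(pattern, p) for p in zero_taught)
    if d1 < d0:
        return 1
    if d0 < d1:
        return 0
    return None  # equally distant: the rule does not decide

# Taught to fire on 111 and 101, and not to fire on 000 and 001.
one_taught = ["111", "101"]
zero_taught = ["000", "001"]
print(fires("010", one_taught, zero_taught))  # 0: the nearest taught pattern is 000
```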

If we represent black squares with 0 and white squares with 1, then the truth tables for the 3 neurones after generalisation are as follows:

[Truth tables for the top, middle and bottom neurons after generalisation.] From these tables the following associations can be extracted:

Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organisations.

The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of "output" units.

The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units.

The weights between the input and hidden units determine when each hidden unit is active, and so by modifying these weights, a hidden unit can choose what it represents.

The single-layer organisation, in which all units are connected to one another, constitutes the most general case and is of more potential computational power than hierarchically structured multi-layer organisations.

The perceptron (figure 4.4) turns out to be an MCP model (neuron with weighted inputs) with some additional, fixed, pre-processing.

Units labelled A1, A2, Aj, Ap are called association units and their task is to extract specific, localised features from the input images.

The book was very well written and showed mathematically that single layer perceptrons could not do some basic pattern recognition operations like determining the parity of a shape or determining whether a shape is connected or not.

associative mapping in which the network learns to produce a particular pattern on the set of output units whenever another particular pattern is applied on the set of input units.

This is used to provide pattern completion, i.e. to produce a pattern whenever a portion of it or a distorted pattern is presented.

nearest-neighbour recall, where the output pattern produced corresponds to the stored input pattern closest to the pattern presented, and interpolative recall, where the output pattern is a similarity-dependent interpolation of the stored patterns corresponding to the pattern presented.

Yet another paradigm, which is a variant of associative mapping, is classification, i.e. when there is a fixed set of categories into which the input patterns are to be classified.

Whereas in associative mapping the network stores the relationships among patterns, in regularity detection the response of each unit has a particular 'meaning'.

Supervised learning incorporates an external teacher, so that each output unit is told what its desired response to input signals ought to be.

An important issue concerning supervised learning is the problem of error convergence, i.e. the minimisation of error between the desired and computed unit values.

For threshold units, the output is set at one of two levels, depending on whether the total input is greater than or less than some threshold value.

Sigmoid units bear a greater resemblance to real neurones than do linear or threshold units, but all three must be considered rough approximations.
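The three unit types can be sketched roughly as follows; the particular inputs, weights, bias and threshold values here are illustrative assumptions, not taken from the text:

```python
import math

def total_input(inputs, weights, bias=0.0):
    """Weighted sum of a unit's inputs plus an optional bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def linear_unit(net):
    """Output proportional to the total weighted input."""
    return net

def threshold_unit(net, threshold=0.0):
    """Output at one of two levels, depending on whether the total input
    exceeds the threshold."""
    return 1.0 if net > threshold else 0.0

def sigmoid_unit(net):
    """Smooth output between 0 and 1; a closer, though still rough,
    approximation to a real neurone's response."""
    return 1.0 / (1.0 + math.exp(-net))

net = total_input([0.5, 1.0, 0.2], [0.8, -0.4, 0.3])
print(linear_unit(net), threshold_unit(net), sigmoid_unit(net))
```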

To make a neural network that performs some specific task, we must choose how the units are connected to one another (see figure 4.1), and we must set the weights on the connections appropriately.

We can teach a three-layer network to perform a particular task by using the following procedure: Assume that we want a network to recognise hand-written digits.

The network would therefore need 256 input units (one for each sensor), 10 output units (one for each kind of digit) and a number of hidden units.

For each kind of digit recorded by the sensors, the network should produce high activity in the appropriate output unit and low activity in the other output units.

To train the network, we present an image of a digit and compare the actual activity of the 10 output units with the desired activity.

Next we change the weight of each connection so as to reduce the error. We repeat this training process for many different images of each kind of digit until the network classifies every image correctly.

To implement this procedure we need to calculate the error derivative for the weight (EW) in order to change the weight by an amount that is proportional to the rate at which the error changes as the weight is changed.
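A minimal sketch of that update rule, assuming a squared-error measure and an illustrative learning rate (neither is specified in the text):

```python
def output_error(actual, desired):
    """Squared difference between actual and desired activities,
    summed over the output units (an illustrative error measure)."""
    return sum((a - d) ** 2 for a, d in zip(actual, desired))

def update_weights(weights, error_derivatives, learning_rate=0.1):
    """Change each weight by an amount proportional to its error derivative EW,
    in the direction that reduces the error."""
    return [w - learning_rate * ew for w, ew in zip(weights, error_derivatives)]

weights = [0.2, -0.5, 0.1]
ews = [0.4, -0.1, 0.0]
print(update_weights(weights, ews))  # [0.16, -0.49, 0.1]
```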

It was developed independently by two teams, one (Fogelman-Soulie, Gallinari and Le Cun) in France, the other (Rumelhart, Hinton and Williams) in the U.S.

In order to train a neural network to perform some task, we must adjust the weights of each unit in such a way that the error between the desired output and the actual output is reduced.

To compute the EA for a hidden unit in the layer just before the output layer, we first identify all the weights between that hidden unit and the output units to which it is connected.

After calculating all the EAs in the hidden layer just before the output layer, we can compute in like fashion the EAs for other layers, moving from layer to layer in a direction opposite to the way activities propagate through the network.
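That backward pass can be sketched as follows for sigmoid units with a squared-error measure (both illustrative assumptions); EA is the error derivative with respect to a unit's activity and EW the derivative with respect to a weight:

```python
import numpy as np

def backpropagate(activations, weights, desired):
    """Return the error derivatives EW for every weight matrix.

    activations: list of layer activation vectors, input layer first.
    weights: list of matrices; weights[layer][i, j] connects unit i of layer
             `layer` to unit j of layer `layer + 1`.
    desired:  target activities for the output layer.
    Assumes sigmoid units and a squared-error measure (illustrative choices)."""
    # EA for the output layer: how the error changes as each output activity changes.
    EA = activations[-1] - desired
    EWs = []
    # Move layer by layer, opposite to the way activities propagate forwards.
    for layer in range(len(weights) - 1, -1, -1):
        # EI: derivative w.r.t. the unit's total input (sigmoid slope times EA).
        EI = EA * activations[layer + 1] * (1.0 - activations[layer + 1])
        # EW: derivative w.r.t. each incoming weight.
        EWs.insert(0, np.outer(activations[layer], EI))
        # EA for the previous layer: each unit's outgoing weights times EI, summed.
        EA = weights[layer] @ EI
    return EWs
```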

Since neural networks are best at identifying patterns or trends in data, they are well suited for prediction or forecasting needs including:

Neural networks are ideal for recognising diseases from scans since there is no need to provide a specific algorithm for how to identify the disease.

Diagnosis can be achieved by building a model of the cardiovascular system of an individual and comparing it with the real time physiological measurements taken from the patient.

If this routine is carried out regularly, potentially harmful medical conditions can be detected at an early stage, thus making the process of combating the disease much easier.

A model of an individual's cardiovascular system must mimic the relationship among physiological variables (i.e., heart rate, systolic and diastolic blood pressures, and breathing rate) at different physical activity levels.

Sensor fusion enables the ANNs to learn complex relationships among the individual sensor values, which would otherwise be lost if the values were individually analysed.

In medical modelling and diagnosis, this implies that even though each sensor in a set may be sensitive only to a specific physiological variable, ANNs are capable of detecting complex medical conditions by fusing the data from the individual biomedical sensors.
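A hedged sketch of this kind of sensor fusion with a small multilayer network (scikit-learn); the readings, labels and network size are placeholders, not clinical data:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder data: each row fuses readings from several biomedical sensors
# (e.g. heart rate, systolic and diastolic blood pressure, breathing rate)
# plus the activity level; labels stand in for clinical annotations.
rng = np.random.default_rng(1)
readings = rng.normal(size=(300, 5))
labels = (readings.sum(axis=1) > 0).astype(int)  # synthetic stand-in labels

# The network sees all sensors at once, so it can pick up joint patterns
# that analysing each sensor value on its own would miss.
model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000).fit(readings, labels)
print(model.predict(readings[:5]))
```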

trained an autoassociative memory neural network to store a large number of medical records, each of which includes information on symptoms, diagnosis, and treatment for a particular case.

There is also a strong potential for using neural networks for database mining, that is, searching for patterns implicit within the explicitly stored information in databases.

A feedforward neural network was integrated with the AMT and trained using back-propagation to assist the marketing control of airline seat allocations.

Stephens, 1987] While it is significant that neural networks have been applied to this problem, it is also important to see that this intelligent technology can be integrated with expert systems and other approaches to make a functional system.

Finally, I would like to state that even though neural networks have a huge potential, we will only get the best out of them when they are integrated with computing, AI, fuzzy logic and related subjects.

Feedforward neural network

The method involves measuring the natural fluorescence of these drugs in the micellar media of sodium dodecyl sulfate (SDS), combined with principal component analysis–feed-forward neural networks.

This proposed method was applied for the simultaneous determination of atenolol, propranolol, amiloride, and dipyridamole at concentrations in the ranges 10–400, 6–200, 5.6–280, and 5–100 ng/mL, respectively, by means of absolute values of the first derivative of nonlinear variable-angle synchronous scan at λex/λem = 228.8/300, 287.2/340, 366.4/412.8, and 288/487.2 nm for atenolol, propranolol, amiloride, and dipyridamole, respectively.
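The general shape of such a principal component analysis plus feed-forward network calibration can be sketched as below, assuming scikit-learn; the spectra, concentrations, component count and network size are placeholders, not the published calibration data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline

# Placeholder calibration set: rows are fluorescence spectra of mixtures,
# columns are wavelengths; targets are the four analyte concentrations (ng/mL).
rng = np.random.default_rng(0)
spectra = rng.random((60, 200))
concentrations = rng.random((60, 4)) * 100

# Compress each spectrum to a few principal components, then regress the
# concentrations with a small feed-forward (multilayer perceptron) network.
model = make_pipeline(PCA(n_components=5),
                      MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000))
model.fit(spectra, concentrations)
print(model.predict(spectra[:3]))
```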

The adopted strategy combines the use of PARAFAC, for extraction of the pure analyte signal, with the standard addition method for determination of the presence of an individual matrix effect caused by the quenching action of the proteins present in the plasma and urine.

Using a propranolol concentration of 260 ng/mL, good results were obtained for determinations in the mole fraction range from 0.50 to 0.80 of (R)-propranolol, providing absolute errors between 0.4% and 3.6% for plasma and between 0.9% and 6.0% for urine.

BLLS performed better than PARAFAC, presenting relative mean error in the order of 3.5%, analytical sensitivity of 0.07% and 0.08%, detection limit of 0.23% and 0.28%, and quantification limit of 0.71% and 0.84% for the pure form and pharmaceutical preparations, respectively.

The adopted strategy combined the use of PARAFAC, for extraction of the pure analyte signal, with the standard addition method, for a determination of the presence of an individual matrix effect caused by the quenching action of the proteins present in the urine.

A specific PARAFAC model was built for each triplicate analysis of each sample, from three-way arrays formed by 231 emission wavelengths, 8 excitation wavelengths, and 5 measurements (the sample plus 4 additions).

Types of artificial neural networks

Particularly, they are inspired by the behaviour of neurons and the electrical signals they convey between input (such as from the eyes or nerve endings in the hand), processing, and output from the brain (such as reacting to light, touch, or heat).

The way neurons semantically communicate is an area of ongoing research.[1][2][3][4] Most artificial neural networks bear only some resemblance to their more complex biological counterparts, but are very effective at their intended tasks (e.g.

Then, using the probability density function (PDF) of each class, the class probability of a new input is estimated and Bayes' rule is employed to allocate it to the class with the highest posterior probability.[10] It was derived from the Bayesian network[11] and a statistical algorithm called Kernel Fisher discriminant analysis.[12] It is used for classification and pattern recognition.
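A minimal sketch of that classification scheme, using a Gaussian Parzen-window density estimate per class followed by Bayes' rule; the kernel, smoothing width and toy data are illustrative assumptions:

```python
import numpy as np

def pnn_classify(x, class_samples, sigma=1.0, priors=None):
    """Estimate each class's density at x with a Gaussian Parzen window over
    that class's training samples, then pick the class with the highest
    posterior (density times prior)."""
    classes = sorted(class_samples)
    if priors is None:
        total = sum(len(v) for v in class_samples.values())
        priors = {c: len(class_samples[c]) / total for c in classes}
    posteriors = {}
    for c in classes:
        samples = np.asarray(class_samples[c], dtype=float)
        sq_dists = np.sum((samples - x) ** 2, axis=1)
        density = np.mean(np.exp(-sq_dists / (2.0 * sigma ** 2)))
        posteriors[c] = priors[c] * density
    return max(posteriors, key=posteriors.get)

# Toy 2-D example with two classes.
data = {"A": [[0.0, 0.0], [0.2, 0.1]], "B": [[3.0, 3.0], [2.8, 3.2]]}
print(pnn_classify(np.array([0.1, 0.2]), data))  # expected: "A"
```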

It has been implemented using a perceptron network whose connection weights were trained with back propagation (supervised learning).[13] In a convolutional neural network (CNN, also called ConvNet, or shift invariant or space invariant artificial neural network[14][15]) the unit connectivity pattern is inspired by the organization of the visual cortex.

Unit response can be approximated mathematically by a convolution operation.[16] They are variations of multilayer perceptrons that use minimal preprocessing.[17] They have wide applications in image and video recognition, recommender systems[18] and natural language processing.[19]

Regulatory feedback networks started as a model to explain brain phenomena found during recognition, including network-wide bursting and difficulty with similarity found universally in sensory recognition.[20] This approach can also perform mathematically equivalent classification as feedforward methods and is used as a tool to create and modify networks.[21][22]

Radial basis functions are functions that have a distance criterion with respect to a center.

In classification problems the output layer is typically a sigmoid function of a linear combination of hidden layer values, representing a posterior probability.

A common solution is to associate each data point with its own centre, although this can expand the linear system to be solved in the final layer and requires shrinkage techniques to avoid overfitting.

All three approaches use a non-linear kernel function to project the input data into a space where the learning problem can be solved using a linear model.

Alternatively, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may outweigh the closest ninth (negative) point.

The Euclidean distance is computed from the new point to the center of each neuron, and a radial basis function (RBF) (also called a kernel function) is applied to the distance to compute the weight (influence) for each neuron.
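For instance, a Gaussian RBF applied to those Euclidean distances might look like the following sketch (the centres and the width parameter are illustrative):

```python
import numpy as np

def rbf_weights(x, centers, beta=1.0):
    """Distance from the new point to each neuron's centre, passed through a
    Gaussian radial basis function to give each neuron's weight (influence).
    `beta` controls how quickly influence falls off with distance."""
    dists = np.linalg.norm(centers - x, axis=1)  # Euclidean distances
    return np.exp(-beta * dists ** 2)            # Gaussian RBF

centers = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
print(rbf_weights(np.array([0.5, 0.5]), centers))
```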

For supervised learning in discrete time settings, training sequences of real-valued input vectors become sequences of activations of the input nodes, one input vector at a time.

At each time step, each non-input unit computes its current activation as a nonlinear function of the weighted sum of the activations of all units from which it receives connections.
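A minimal sketch of one such time step, assuming tanh as the nonlinear function and illustrative dimensions:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_in, W_rec, b):
    """One time step: each non-input unit's new activation is a nonlinear
    function (tanh here) of the weighted sum of the activations it receives
    from the input units and from the previous step's non-input units."""
    return np.tanh(W_in @ x_t + W_rec @ h_prev + b)

# Toy dimensions: 3 input units, 4 non-input (hidden) units.
rng = np.random.default_rng(0)
W_in, W_rec, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):  # a sequence of 5 input vectors
    h = rnn_step(x_t, h, W_in, W_rec, b)
print(h)
```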

To minimize total error, gradient descent can be used to change each weight in proportion to its derivative with respect to the error, provided the non-linear activation functions are differentiable.

The standard method is called 'backpropagation through time' or BPTT, a generalization of back-propagation for feedforward networks.[23][24] A more computationally expensive online variant is called 'Real-Time Recurrent Learning' or RTRL.[25][26] Unlike BPTT this algorithm is local in time but not local in space.[27][28] An online hybrid between BPTT and RTRL with intermediate complexity exists,[29][30] with variants for continuous time.[31]

A major problem with gradient descent for standard RNN architectures is that error gradients vanish exponentially quickly with the size of the time lag between important events.[32][33] The Long short-term memory architecture overcomes these problems.[34]

In reinforcement learning settings, no teacher provides target signals.

Instead a fitness function or reward function or utility function is occasionally used to evaluate performance, which influences its input stream through output units connected to actuators that affect the environment.

These units connect from the hidden layer or the output layer with a fixed weight of one.[35] At each time step, the input is propagated in a standard feedforward fashion, and then a backpropagation-like learning rule is applied (not performing gradient descent).

ESNs are good at reproducing certain time series.[36] A variant for spiking neurons is known as liquid state machines.[37] The long short-term memory (LSTM)[34] has no vanishing gradient problem.

LSTM RNNs outperformed other RNNs and other sequence learning methods such as HMM in applications such as language learning[38] and connected handwriting recognition.[39] Bi-directional RNNs, or BRNNs, use a finite sequence to predict or label each element of the sequence based on both the past and the future context of the element.[40] This is done by adding the outputs of two RNNs: one processing the sequence from left to right, the other one from right to left.

Because neural networks suffer from local minima, starting with the same architecture and training but using randomly different initial weights often gives vastly different results.[citation needed] A CoM tends to stabilize the result.

The CoM is similar to the general machine learning bagging method, except that the necessary variety of machines in the committee is obtained by training from different starting weights rather than training on different randomly selected subsets of the training data.
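A rough sketch of a committee of machines in this spirit, assuming scikit-learn; the dataset, network size and committee size are placeholders:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Toy data standing in for a real training set.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Same architecture and training data, different random initial weights.
committee = [MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                           random_state=seed).fit(X, y) for seed in range(5)]

# Committee output: average the members' predicted probabilities.
avg_proba = np.mean([m.predict_proba(X) for m in committee], axis=0)
committee_prediction = avg_proba.argmax(axis=1)
print((committee_prediction == y).mean())
```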

SNNs are also a form of pulse computer.[46] Spiking neural networks with axonal conduction delays exhibit polychronization, and hence could have a very large memory capacity.[47] SNNs, and the temporal correlations of neural assemblies in such networks, have been used to model figure/ground separation and region linking in the visual system.

It uses multiple types of units (originally two, called simple and complex cells) as a cascading model for use in pattern recognition tasks.[49][50][51] Local features are extracted by S-cells whose deformation is tolerated by C-cells.

Local features in the input are integrated gradually and classified at higher layers.[52] Among the various kinds of neocognitron[53] are systems that can detect multiple patterns in the same input by using back propagation to achieve selective attention.[54] It has been used for pattern recognition tasks and inspired convolutional neural networks.[55]

Dynamic neural networks address nonlinear multivariate behaviour and include (learning of) time-dependent behaviour, such as transient phenomena and delay effects.

Instead of just adjusting the weights in a network of fixed topology,[56] Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure.

It is done by creating a specific memory structure, which assigns each new pattern to an orthogonal plane using adjacently connected hierarchical arrays.[57] The network offers real-time pattern recognition and high scalability;

HTM combines and extends approaches used in Bayesian networks, spatial and temporal clustering algorithms, while using a tree-shaped hierarchy of nodes that is common in neural networks.

Neural networks

Neural networks are parallel and distributed information processing systems that are inspired and derived from biological learning systems such as human brains.

Figure 6 demonstrates the architecture for a supervised neural network, which includes three layers, namely an input layer, an output layer, and a hidden middle layer.

Neural networks are used in a wide variety of applications in pattern classification, language processing, complex systems modeling, control, optimization, and prediction.92 Neural networks have also been actively used in many bioinformatics applications such as DNA sequence prediction, protein secondary structure prediction, gene expression profiles classification, and analysis of gene expression patterns.93 Neural networks have been applied widely in biology since the 1980s.94 For example, Stormo et al.95 reported prediction of the translation initiation sites in DNA sequences.

Neural networks have also been applied to the analysis of gene expression patterns as an alternative to hierarchical cluster methods.75,100,102,103 Narayanan et al.104 demonstrated the application of single-layer neural networks to analyze gene expression.

Besides SVMs and neural networks, there are also machine learning methods for gene selection such as 'discriminant analysis', which distinguishes a selected dataset from the rest of the data, and the 'k-nearest neighbor (KNN) algorithm', which is based on a distance function for pairs of observations, such as the Euclidean distance.
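A minimal sketch of such a KNN classifier using the Euclidean distance (the toy data and choice of k are illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(x, X_train, y_train, k=3):
    """Classify x by majority vote among the k training observations with the
    smallest Euclidean distance to x."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [2.9, 3.1], [3.0, 3.0]])
y_train = ["low", "low", "high", "high"]
print(knn_predict(np.array([0.2, 0.1]), X_train, y_train))  # "low"
```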

Neural Network Calculation (Part 2): Activation Functions & Basic Calculation

In this part we see how to calculate one section of a neural network. This calculation will be repeated many times to calculate a larger neural network...

Recurrent Neural Networks - Ep. 9 (Deep Learning SIMPLIFIED)

Our previous discussions of deep net applications were limited to static patterns, but how can a net decipher and label patterns that change with time? For example, could a net be used to scan...

Convolutional Neural Networks - Ep. 8 (Deep Learning SIMPLIFIED)

Out of all the current Deep Learning applications, machine vision remains one of the most popular. Since Convolutional Neural Nets (CNN) are one of the best available tools for machine vision,...

Artificial Neural Networks (Part 2) - -Classification using Multi-Layer Perceptron Model


4.3.3 Neural Networks - Multi-class Classification

Week 4 (Neural Networks: Representation) - Applications - Multi-class Classification, from the Machine Learning course on Coursera by Andrew Ng.

Neural Network Calculation (Part 1): Feedforward Structure

In this series we will see how a neural network actually calculates its values. This first video takes a look at the structure of a feedforward neural network.

But what *is* a Neural Network? | Chapter 1, deep learning


1. Overview of Ways to Improve Generalization

Video from Coursera - University of Toronto - Course: Neural Networks for Machine Learning:

Lec-5 Learning Mechanisms-Hebbian,Competitive,Boltzmann

Lecture Series on Neural Networks and Applications by Prof. S. Sengupta, Department of Electronics and Electrical Communication Engineering, IIT Kharagpur. For more details on NPTEL visit

Dynamic Attractor Computing: Handwriting Demo

Note that if a trajectory that leads to the words "Chaos" or "Neuron" is "knocked-off" its path (perturbed) it returns to the "dynamic attractor" and completes the word. Event 1: Input 1. Event...