On Thursday, June 7, 2018
## What is the difference between a Perceptron, Adaline, and neural network model?

The Perceptron and Adaline learning algorithms can both be summarized in 4 simple steps, given that we use stochastic gradient descent for Adaline. In each iteration, the weights are updated based on the error between the true class label and the model's output.
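As a minimal sketch of a single weight update in each model, assuming class labels in {-1, 1}, a learning rate `eta`, and a separate bias term (the function names here are illustrative, not from the original text): the only difference is whether the error is computed from the thresholded class label (Perceptron) or from the continuous linear output (Adaline).

```python
def net_input(weights, bias, x):
    """Weighted sum of the inputs plus the bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def threshold(z):
    """Unit step function mapping the net input to a class label in {-1, 1}."""
    return 1 if z >= 0.0 else -1


def perceptron_update(weights, bias, x, target, eta=0.1):
    """One Perceptron step: the error uses the predicted class label."""
    error = target - threshold(net_input(weights, bias, x))
    new_weights = [w + eta * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias + eta * error


def adaline_update(weights, bias, x, target, eta=0.1):
    """One Adaline SGD step: the error uses the continuous linear output."""
    error = target - net_input(weights, bias, x)
    new_weights = [w + eta * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias + eta * error
```

Note that the Perceptron leaves the weights unchanged whenever the sample is classified correctly, whereas Adaline keeps adjusting them as long as the continuous output differs from the target.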

In a multi-layer neural network, the activation function is not linear as it is in Adaline. Instead, we use a non-linear activation function such as the logistic sigmoid (the one that we use in logistic regression) or the hyperbolic tangent, or a piecewise-linear activation function such as the rectified linear unit (ReLU).
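For reference, these activation functions can be sketched in a few lines of plain Python. The function names are chosen here for illustration, and `linear` stands in for Adaline's identity activation.

```python
import math


def linear(z):
    """Identity activation, as used in Adaline."""
    return z


def sigmoid(z):
    """Logistic sigmoid: squashes the net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes the net input into (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)
```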

In addition, we often use a softmax function (a generalization of the logistic sigmoid for multi-class problems) in the output layer, and a threshold function to turn the probabilities predicted by the softmax into class labels.
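A small sketch of this output stage, assuming the network produces one raw score per class and that the class label is taken to be the index of the largest probability (one common way to realize the thresholding step; the helper names are hypothetical):

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow in exp()
    # without changing the result.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def sigmoid(z):
    """Logistic sigmoid, for comparison with the two-class softmax."""
    return 1.0 / (1.0 + math.exp(-z))
```

With two classes and scores `[z, 0.0]`, the first softmax probability equals `sigmoid(z)`, which is the sense in which softmax generalizes the logistic sigmoid.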

By connecting the artificial neurons in this network through non-linear activation functions, we can create complex, non-linear decision boundaries that allow us to tackle problems where the different classes are not linearly separable.
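XOR is a classic example of a problem that is not linearly separable. The sketch below uses hand-picked (not learned) weights for a tiny network with two ReLU hidden units, purely to illustrate that a hidden layer with a non-linear activation can represent such a decision boundary. The weights and the 0.5 cutoff are assumptions made for this example.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def xor_network(x1, x2):
    """Two ReLU hidden units followed by a linear output and a threshold."""
    h1 = relu(x1 + x2)        # active when at least one input is 1
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are 1
    output = h1 - 2.0 * h2    # yields 0, 1, 1, 0 for the four input pairs
    return 1 if output >= 0.5 else 0


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

A single Perceptron or Adaline unit cannot reproduce this mapping, because no single straight line separates the inputs (0, 1) and (1, 0) from (0, 0) and (1, 1).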

On Thursday, September 19, 2019

