AI News, What Does a Neural Network Actually Do?

What Does a Neural Network Actually Do?

There has been a lot of renewed interest lately in neural networks (NNs) due to their popularity as a model for deep learning architectures (there are also non-NN-based deep learning approaches, built on sum-product networks and support vector machines with deep kernels, among others).

To gain an intuitive understanding of what a learning algorithm does, I usually like to think about its representational power, as this provides insight into what can, if not necessarily what does, happen inside the algorithm to solve a given problem.

In other words, the NN computes a linear combination of the two inputs $x_1$ and $x_2$, weighted by $w_1$ and $w_2$ respectively, adds an arbitrary bias term $b$ and then passes the result through a function $f$, known as the activation function.
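As a minimal sketch of that computation, the snippet below evaluates a single two-input neuron. The names `x1`, `x2`, `w1`, `w2`, `b` and the choice of a step function that fires at zero or above are assumptions made for illustration, as are the example parameter values.

```python
def step(z):
    """Assumed activation with no linear regime: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def neuron(x1, x2, w1, w2, b, f=step):
    """Return f(w1*x1 + w2*x2 + b) for a single two-input unit."""
    return f(w1 * x1 + w2 * x2 + b)


# Illustrative (made-up) parameters: the unit fires when x1 + x2 >= 1.
print(neuron(0.2, 0.3, w1=1.0, w2=1.0, b=-1.0))  # 0, below the boundary
print(neuron(0.8, 0.7, w1=1.0, w2=1.0, b=-1.0))  # 1, above the boundary
```

With a step activation, the line `w1*x1 + w2*x2 + b = 0` acts as the decision boundary that splits the input plane into a firing half and a silent half.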

The linear regime of an activation function can also be exploited by a neural network, but for the sake of simplifying our discussion even further, we will choose an activation function without a linear regime.

Depending on the values of $x_1$ and $x_2$, one region of this two-dimensional input space yields a response of $0$ (white in the original figure) and the other a response of $1$ (shaded).

If we set all weights in the middle layer to $1/3$ and the bias of the middle layer to $-1$, the output neuron outputs $1$ whenever the input lies in the intersection of all three half-spaces defined by the decision boundaries, and $0$ otherwise.
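The construction can be sketched for one concrete triangle. Everything specific here is an assumption for illustration: the three boundaries (the triangle with corners (0,0), (1,0), (0,1)), a step activation that fires at zero or above, output weights of 1/3, an output bias of -1, and a small tolerance `EPS` to absorb floating-point round-off at the threshold.

```python
EPS = 1e-9  # tolerance for floating-point round-off at the threshold


def step(z):
    """Assumed step activation: 1 if z >= 0 (within EPS), else 0."""
    return 1 if z >= -EPS else 0


# One hidden neuron per decision boundary, stored as (w1, w2, b).
# Hypothetical triangle with corners (0, 0), (1, 0) and (0, 1).
HIDDEN = [
    (1.0, 0.0, 0.0),    # fires when x1 >= 0
    (0.0, 1.0, 0.0),    # fires when x2 >= 0
    (-1.0, -1.0, 1.0),  # fires when x1 + x2 <= 1
]


def in_triangle(x1, x2):
    """Output neuron: weights 1/3 on each hidden unit, bias -1."""
    h = [step(w1 * x1 + w2 * x2 + b) for w1, w2, b in HIDDEN]
    w_out, b_out = 1.0 / len(HIDDEN), -1.0
    return step(sum(w_out * hi for hi in h) + b_out)


print(in_triangle(0.2, 0.2))   # 1: inside all three half-spaces
print(in_triangle(0.8, 0.8))   # 0: violates x1 + x2 <= 1
print(in_triangle(-0.1, 0.5))  # 0: violates x1 >= 0
```

The weighted sum reaches 1 only when all three hidden units fire, so the bias of -1 leaves the output at the threshold exactly in that case.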

Since there was nothing special about our choice of decision boundaries, we are able to carve out any convex polygon and have the NN fire precisely when the input is inside the polygon (in the general case we set the weights to $1/n$, where $n$ is the number of hyperplanes defining the polygon).

Creating regions composed of multiple polygons, even disjoint ones, can be achieved by adding a set of neurons for each polygon and setting the weights of their respective edges to $1/n$, where $n$ is the number of hyperplanes defining the polygon.

Any combination of weights that we assign to the middle layer in the above NN will result in a discrete set of values, up to one unique value per region formed by the union or intersection of the half-spaces defined by the decision boundaries, that are fed into the output node.

Since the bias can only adjust the threshold at which the output node fires, the resulting behavior of any weight assignment is activation over some union of polygons defined by the shaded regions.
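To make the role of the bias concrete, here is a small sketch that keeps the output weights fixed at 1/n and varies only the bias. The three-unit hidden layer, the step activation that fires at zero or above, and the tolerance `EPS` are assumptions for illustration.

```python
EPS = 1e-9  # tolerance for floating-point round-off at the threshold


def step(z):
    """Assumed step activation: 1 if z >= 0 (within EPS), else 0."""
    return 1 if z >= -EPS else 0


def output(h, bias):
    """Output neuron with equal weights 1/n over hidden activations h."""
    n = len(h)
    return step(sum(hi / n for hi in h) + bias)


# k is the number of hidden units (half-spaces) that are active.
for k in range(4):
    h = [1] * k + [0] * (3 - k)
    all_three = output(h, bias=-1.0)          # fires only when k == 3
    at_least_one = output(h, bias=-1.0 / 3.0)  # fires when k >= 1
    print(k, all_three, at_least_one)
```

With the same weights, a bias of -1 selects the intersection of the three half-spaces, while a bias of -1/3 selects their union; the bias moves the threshold but cannot create new region boundaries.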

Regardless, NNs still provide far more expressive power than most other machine learning techniques, and my focus on two-dimensional inputs disguises the fact that even simple decision boundaries, operating in high-dimensional spaces, can be surprisingly powerful.

The three-layer architecture discussed here is equal in representational power to a neural network of arbitrary depth, as long as the hidden layer is made sufficiently wide.

We can set up a four-layer NN such that the second layer defines the edges, the third layer defines the polygons, and the fourth layer contains the 8 possible activation patterns.
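The layering can be sketched in a simplified form. This is not the configuration from the text: it assumes two hypothetical axis-aligned boxes instead of three polygons, and a single output unit that fires on the union of the boxes rather than 8 pattern units. The step activation and the tolerance `EPS` are likewise assumptions.

```python
EPS = 1e-9  # tolerance for floating-point round-off at the threshold


def step(z):
    """Assumed step activation: 1 if z >= 0 (within EPS), else 0."""
    return 1 if z >= -EPS else 0


def box_edges(x_lo, x_hi, y_lo, y_hi):
    """Four half-plane neurons (w1, w2, b) whose intersection is a box."""
    return [
        (1.0, 0.0, -x_lo),  # x1 >= x_lo
        (-1.0, 0.0, x_hi),  # x1 <= x_hi
        (0.0, 1.0, -y_lo),  # x2 >= y_lo
        (0.0, -1.0, y_hi),  # x2 <= y_hi
    ]


# Two disjoint, made-up polygons.
POLYGONS = [box_edges(0, 1, 0, 1), box_edges(2, 3, 0, 1)]


def fires(x1, x2):
    polygon_units = []
    for edges in POLYGONS:
        # Layer 2: one step neuron per edge.
        h = [step(w1 * x1 + w2 * x2 + b) for w1, w2, b in edges]
        # Layer 3: weights 1/n and bias -1, fires only if every edge fires.
        polygon_units.append(step(sum(h) / len(h) - 1.0))
    # Layer 4: weights 1 and bias -1, fires if any polygon unit fires.
    return step(sum(polygon_units) - 1.0)


print(fires(0.5, 0.5))  # 1: inside the first box
print(fires(2.5, 0.5))  # 1: inside the second box
print(fires(1.5, 0.5))  # 0: in the gap between the boxes
```

Giving each polygon its own unit in the third layer keeps partial activations from different polygons from adding up in the final layer.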

While the three-layer approach is just as expressive as the four-layer one, it is not as efficient: the three-layer NN has a 2-10-8 configuration, resulting in 100 parameters (20 edges connecting first to second layer plus 80 edges connecting second to third layer), while the four-layer NN, with a 2-10-3-8 configuration, only has 74 parameters.
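The parameter counts quoted above can be checked with a few lines. The helper name `count_weights` is hypothetical, and, like the text, it counts only the edges between consecutive fully connected layers, not the biases.

```python
def count_weights(layer_sizes):
    """Count edges between consecutive fully connected layers (no biases)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


print(count_weights([2, 10, 8]))     # 100 = 20 + 80
print(count_weights([2, 10, 3, 8]))  # 74 = 20 + 30 + 24
```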
