AI News, Improved Q-learning with continuous actions

Improved Q-learning with continuous actions

But we can also think of a neural network as a high-dimensional manifold in the infinite-dimensional space of all possible functions, and we can, at least conceptually, run gradient descent in function space, subject to the constraint that we stay on the neural network manifold.
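
To make this picture concrete (a hedged aside: the formalization below is the standard one from the natural-gradient literature, not something spelled out in this post), if the network's outputs define a predictive distribution p_theta(y | x), then the function-space steepest-descent direction, pulled back to parameter space, is the Fisher-preconditioned gradient:

    \tilde{\nabla}_\theta L \;=\; F(\theta)^{-1}\,\nabla_\theta L,
    \qquad
    F(\theta) \;=\; \mathbb{E}_{x,\;y \sim p_\theta}\!\left[\nabla_\theta \log p_\theta(y \mid x)\,\nabla_\theta \log p_\theta(y \mid x)^{\top}\right].

Because the Fisher matrix F transforms as a metric under any smooth reparameterization, this direction depends only on the manifold of functions the network can represent, not on how that manifold happens to be parameterized.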

Thus, the choice of sigmoid versus tanh will affect the backpropagation updates, but it will not affect the idealized natural gradient, since the natural gradient depends only on the neural network manifold, and we already established that the neural network manifold is unaffected by the choice of sigmoid versus tanh.
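
As a quick sanity check of that claim (my own illustration, not from the post; the array shapes and names are arbitrary), the identity tanh(z) = 2*sigmoid(2z) - 1 lets any tanh hidden layer be rewritten exactly as a sigmoid hidden layer by rescaling the surrounding weights and biases, so the two activation choices trace out the same set of functions:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 3))                      # small batch of inputs
    W, b = rng.normal(size=(3, 4)), rng.normal(size=4)
    V, c = rng.normal(size=(4, 2)), rng.normal(size=2)

    # Two-layer network with a tanh hidden layer.
    out_tanh = np.tanh(x @ W + b) @ V + c

    # The same function with sigmoid hidden units: double the first layer's
    # weights and biases, then fold the affine map h -> 2h - 1 into the
    # second layer's weights and bias.
    out_sigmoid = sigmoid(x @ (2 * W) + 2 * b) @ (2 * V) + (c - V.sum(axis=0))

    print(np.allclose(out_tanh, out_sigmoid))        # True

The backpropagated gradients of the two parameterizations differ, but the functions they compute (and hence the manifold) are identical, which is exactly the point above.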

But the relevant fact about the natural gradient is that it behaves much more stably and benignly in a variety of settings (for example, it is relatively insensitive to the order of the data in the training set, and it is highly amenable to data parallelism), which suggests that the natural gradient could improve the stability of Q-learning as well.
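
To illustrate what such an update could look like, here is a minimal sketch under my own assumptions (a linear Q-function over state-action features, an empirical-Fisher preconditioner with damping, and a semi-gradient TD loss); it is not the method proposed here, only one plausible way to precondition the TD update:

    import numpy as np

    def q_value(theta, phi):
        # Linear Q-function on state-action features (an assumption of this sketch).
        return theta @ phi

    def natural_td_update(theta, batch, gamma=0.99, lr=0.1, damping=1e-3):
        # One damped natural-gradient step on the squared TD error.
        # batch: list of (phi_sa, reward, phi_next_greedy) tuples, where
        # phi_next_greedy are the features of the greedy next action.
        grads = []
        for phi, r, phi_next in batch:
            target = r + gamma * q_value(theta, phi_next)   # bootstrapped target, held fixed
            td_error = q_value(theta, phi) - target
            grads.append(td_error * phi)                    # per-sample semi-gradient
        g = np.mean(grads, axis=0)
        # Empirical Fisher: averaged outer product of per-sample gradients, plus damping.
        F = np.mean([np.outer(gi, gi) for gi in grads], axis=0)
        F += damping * np.eye(len(theta))
        return theta - lr * np.linalg.solve(F, g)           # theta - lr * F^{-1} g

The key design choice is the F^{-1} solve: it converts the raw parameter-space gradient into a step measured against the function-space geometry, which is where the stability benefits described above are supposed to come from.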

Lecture 6 | Training Neural Networks I

In Lecture 6 we discuss many practical issues for training modern neural networks. We discuss different activation functions, the importance of data ...

Large Scale Machine Learning

Dr. Yoshua Bengio's current interests are centered on a quest for AI through machine learning, and include fundamental questions on deep learning and ...

Lecture 16 | Adversarial Examples and Adversarial Training

In Lecture 16, guest lecturer Ian Goodfellow discusses adversarial examples in deep learning. We discuss why deep networks and other machine learning ...

On Characterizing the Capacity of Neural Networks using Algebraic Topology

The learnability of different neural architectures can be characterized directly by computable measures of data complexity. In this talk, we reframe the problem of ...

Synthetic Gradients Tutorial - How to Speed Up Deep Learning Training

Synthetic Gradients were introduced in 2016 by Max Jaderberg and other researchers at DeepMind. They are designed to replace backpropagation, and they ...

TensorFlow Tutorial #15 Style Transfer

How to implement the Style Transfer algorithm in TensorFlow for combining the style and content of two images.

Lecture 13 | Generative Models

In Lecture 13 we move beyond supervised learning, and discuss generative modeling as a form of unsupervised learning. We cover the autoregressive ...

Exploiting Structure Information in Machine Learning

Machine learning has recently witnessed revolutionary success in a wide spectrum of domains. Most of these applications involve learning with complex inputs ...

Machine Learning and AI for the Sciences - Towards Understanding

Klaus-Robert Müller, Technische Universität Berlin. Abstract: In recent years, machine learning (ML) and artificial intelligence (AI) methods have begun to play a ...