Improved Q-learning with continuous actions

But we can also think of a neural network as a high-dimensional manifold in the infinite-dimensional space of all possible functions, and we can, at least conceptually, run gradient descent in function space, subject to the constraint that we stay on the neural network manifold.
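
To make that picture concrete (a standard formulation, not taken verbatim from the post): write the manifold as $\mathcal{M} = \{ f_\theta : \theta \in \mathbb{R}^n \}$ and ask for the parameter step whose induced move in function space is steepest descent restricted to $\mathcal{M}$. Pulling the function-space metric back to parameter space gives a matrix $F(\theta)$, and the idealized update is the natural gradient step

\[
\theta_{t+1} = \theta_t - \eta\, F(\theta_t)^{-1} \nabla_\theta L(\theta_t),
\qquad
F(\theta) = \mathbb{E}_{x}\!\left[\, \nabla_\theta f_\theta(x)\, \nabla_\theta f_\theta(x)^{\top} \right],
\]

where this particular form of $F$ assumes a squared-error ($L^2$) geometry on function space; for probabilistic outputs the Fisher information plays the same role.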

Thus, the choice of sigmoid versus tanh will affect the backpropagation algorithm, but it will not affect the idealized natural gradient, since the natural gradient depends only on the neural network manifold, which, as we already established, is unaffected by the choice of sigmoid versus tanh.
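
The invariance claim can be checked directly (a routine calculation, added here for completeness): if $\theta = \phi(\psi)$ is a smooth, invertible reparameterization with Jacobian $J = \partial\theta/\partial\psi$, then the gradient and the pulled-back metric transform as

\[
\nabla_\psi L = J^{\top}\nabla_\theta L,
\qquad
F_\psi = J^{\top} F_\theta\, J,
\qquad\text{so}\qquad
\Delta\psi = -\eta\, F_\psi^{-1}\nabla_\psi L = -\eta\, J^{-1} F_\theta^{-1}\nabla_\theta L,
\]

and hence $\Delta\theta = J\,\Delta\psi = -\eta\, F_\theta^{-1}\nabla_\theta L$: to first order, both parameterizations take the same step along the manifold. The ordinary gradient has no such guarantee, which is exactly why sigmoid versus tanh matters for backpropagation but not for the idealized natural gradient.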

But the relevant fact about the natural gradient is that its behavior is much more stable and benign across a variety of settings (for example, it is relatively insensitive to the ordering of the training data and highly amenable to data parallelism), which suggests that it could improve the stability of the Q-learning algorithm as well.
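
As a rough illustration of that suggestion (a minimal NumPy sketch, not the method from the post): the snippet below does TD learning for continuous actions with a NAF-style Q-function, $Q(s,a) = V(s) - \tfrac{1}{2} p\,(a - \mu(s))^2$, so the greedy action $\mu(s)$ is available in closed form, and it preconditions each update with a damped Gauss–Newton approximation to the function-space metric in place of the exact Fisher. The toy task, feature map, precision $p$, and all hyperparameters are made up for the example.

import numpy as np

rng = np.random.default_rng(0)
GAMMA, PREC, DAMP, LR = 0.95, 1.0, 1e-3, 0.5

def phi(s):
    # simple polynomial state features (an arbitrary choice for this sketch)
    return np.array([1.0, s, s * s])

theta = np.zeros(6)  # [w_v (3 weights for V), w_mu (3 weights for mu)]

def q_and_grad(s, a, theta):
    f = phi(s)
    w_v, w_mu = theta[:3], theta[3:]
    v, mu = w_v @ f, w_mu @ f
    q = v - 0.5 * PREC * (a - mu) ** 2               # quadratic advantage in a
    grad = np.concatenate([f, PREC * (a - mu) * f])  # dQ/dtheta
    return q, v, mu, grad

def natural_td_update(batch, theta):
    # One damped Gauss-Newton ("natural gradient"-style) semi-gradient TD step.
    grads, errs = [], []
    for s, a, r, s_next in batch:
        q, _, _, g = q_and_grad(s, a, theta)
        v_next = theta[:3] @ phi(s_next)   # max over a' of Q(s',a') equals V(s') here
        errs.append(q - (r + GAMMA * v_next))
        grads.append(g)
    G = np.stack(grads)
    F = G.T @ G / len(batch) + DAMP * np.eye(len(theta))  # metric approximation
    euclid_grad = G.T @ np.array(errs) / len(batch)
    return theta - LR * np.linalg.solve(F, euclid_grad)

# Made-up 1-D task: s' = clip(s + a), reward pulls the state toward 0.
for step in range(2000):
    batch = []
    for _ in range(32):
        s = rng.uniform(-2.0, 2.0)
        _, _, mu, _ = q_and_grad(s, 0.0, theta)
        a = mu + rng.normal(scale=0.3)     # exploration noise around the greedy action
        s_next = np.clip(s + a, -2.0, 2.0)
        r = -s_next ** 2 - 0.01 * a ** 2
        batch.append((s, a, r, s_next))
    theta = natural_td_update(batch, theta)

# The learned greedy action should roughly counteract the state.
print("greedy action at s=1.5:", theta[3:] @ phi(1.5))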

Tom Goldstein: "What do neural loss surfaces look like?"

New Deep Learning Techniques 2018. "What do neural loss surfaces look like?" Tom Goldstein, University of Maryland. Abstract: Neural network training relies ...

Lecture 16 | Adversarial Examples and Adversarial Training

In Lecture 16, guest lecturer Ian Goodfellow discusses adversarial examples in deep learning. We discuss why deep networks and other machine learning ...

On Characterizing the Capacity of Neural Networks using Algebraic Topology

The learnability of different neural architectures can be characterized directly by computable measures of data complexity. In this talk, we reframe the problem of ...

Large Scale Machine Learning

Dr. Yoshua Bengio's current interests are centered on a quest for AI through machine learning, and include fundamental questions on deep learning and ...

Lecture 13 | Generative Models

In Lecture 13 we move beyond supervised learning, and discuss generative modeling as a form of unsupervised learning. We cover the autoregressive ...

On Gradient-Based Optimization: Accelerated, Stochastic and Nonconvex

Many new theoretical challenges have arisen in the area of gradient-based optimization for large-scale statistical data analysis, driven by the needs of ...

Mod 3 Lec 4 Indirect Adaptive Control of a Robot Manipulator

Lectures by Prof. Laxmidhar Behera, Department of Electrical Engineering, Indian Institute of Technology, Kanpur. For more details on NPTEL visit ...

Geometric Deep Learning | Michael Bronstein || Radcliffe Institute

As part of the 2017–2018 Fellows' Presentation Series at the Radcliffe Institute for Advanced Study, Michael Bronstein RI '18 discusses the past, present, and ...