The Unreasonable Effectiveness of Recurrent Neural Networks

Within a few dozen minutes of training, my first baby model (with rather arbitrarily chosen hyperparameters) started to generate very nice-looking descriptions of images that were on the edge of making sense.

What made this result so shocking at the time was the common wisdom that RNNs were supposed to be difficult to train (with more experience I’ve in fact reached the opposite conclusion).

We’ll train RNNs to generate text character by character and ponder the question “how is that even possible?” By the way, together with this post I am also releasing code on GitHub that allows you to train character-level language models based on multi-layer LSTMs.
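
As a rough sketch of what such a character-level model looks like (this is not the released code; the layer sizes, optimizer settings, and random stand-in data below are placeholders), one training step in PyTorch might be:

```python
import torch
import torch.nn as nn

# Placeholder sizes; a real run would use a text corpus and tuned hyperparameters.
vocab_size, hidden_size, num_layers = 65, 128, 2

class CharLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

model = CharLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3)
loss_fn = nn.CrossEntropyLoss()

# One step on a (batch, seq_len) tensor of character indices; the target is
# the same sequence shifted one character to the right.
x = torch.randint(0, vocab_size, (16, 100))
logits, _ = model(x[:, :-1])
loss = loss_fn(logits.reshape(-1, vocab_size), x[:, 1:].reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Sampling then works by feeding the model’s own predicted character back in as the next input, one step at a time.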

A few examples may make this more concrete: sequences can appear in the input (e.g. sentiment classification), in the output (e.g. image captioning), or in both (e.g. machine translation). As you might expect, the sequence regime of operation is much more powerful compared to fixed networks that are doomed from the get-go by a fixed number of computational steps, and hence also much more appealing for those of us who aspire to build more intelligent systems.

Moreover, as we’ll see in a bit, RNNs combine the input vector with their state vector using a fixed (but learned) function to produce a new state vector.
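
Concretely, the vanilla RNN update is h_t = tanh(W_hh h_{t-1} + W_xh x_t), with an output read off the new state. A minimal numpy sketch (the weight names and initialization here are for illustration; the matrices are assumed to be learned by backpropagation):

```python
import numpy as np

class VanillaRNN:
    def __init__(self, input_size, hidden_size, output_size):
        # Learned parameters (random initialization for illustration).
        self.W_xh = np.random.randn(hidden_size, input_size) * 0.01
        self.W_hh = np.random.randn(hidden_size, hidden_size) * 0.01
        self.W_hy = np.random.randn(output_size, hidden_size) * 0.01
        self.h = np.zeros(hidden_size)

    def step(self, x):
        # Combine the input vector with the state vector via a fixed,
        # learned function to produce the new state vector.
        self.h = np.tanh(self.W_hh @ self.h + self.W_xh @ x)
        return self.W_hy @ self.h  # output computed from the new state
```

The same two matrices are applied at every time step, which is what makes the function fixed even though its parameters are learned.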

Lecture 8: Recurrent Neural Networks and Language Models

Lecture 8 covers traditional language models, RNNs, and RNN language models. It also reviews important training problems and tricks, RNNs for other ...

Lecture 10 | Recurrent Neural Networks

In Lecture 10 we discuss the use of recurrent neural networks for modeling sequence data. We show how recurrent neural networks can be used for language ...

4.1: Early Stopping and Encoding a Feature Vector for Deep Neural Networks (Module 4, Part 1)

How to use a training and validation split for a Keras neural network. The validation set can be used to implement early stopping. Also see how to encode a ...
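
For reference, the recipe described there looks roughly like the following in Keras (the architecture, patience value, and toy data are assumptions, not the video's exact code):

```python
import numpy as np
from tensorflow import keras

# Toy data standing in for the video's dataset.
x = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=(1000,))

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Hold out 20% of the data for validation and stop training once
# validation loss stops improving, keeping the best weights seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)
model.fit(x, y, validation_split=0.2, epochs=100, callbacks=[early_stop])
```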

Practical 4.3 – RNN training

Recurrent Neural Networks – training example. Full project (RNN training example): ..

Office Automation Part 2 - Using Pre-Trained Word-Embedded Vectors

Second video of three; here we use pre-trained word-embedding vectors and Python to find clear logical and thematic clusters in the Enron email dataset.
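
A plausible sketch of that pipeline (the embedding file name, the averaging scheme, the stand-in emails, and the cluster count are assumptions, not the video's exact code): represent each email as the mean of its word vectors, then cluster the results.

```python
import numpy as np
from gensim.models import KeyedVectors
from sklearn.cluster import KMeans

# Hypothetical pre-trained embedding file; any word2vec-format vectors work.
vectors = KeyedVectors.load_word2vec_format("embeddings.bin", binary=True)

def email_vector(text):
    # Represent an email as the mean of its in-vocabulary word vectors.
    words = [w for w in text.lower().split() if w in vectors]
    return np.mean([vectors[w] for w in words], axis=0)

# Stand-in texts, not actual Enron emails.
emails = [
    "meeting moved to friday",
    "gas trading desk update",
    "friday meeting agenda attached",
]
features = np.stack([email_vector(e) for e in emails])

# Cluster count is a placeholder; the video looks for thematic clusters.
labels = KMeans(n_clusters=2, n_init=10).fit_predict(features)
print(labels)
```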

Practical 4.0 – RNN, vectors and sequences

Recurrent Neural Networks – vectors and sequences. Full project, with links to the paper Vinyals et al. (2016): ..

Lecture 10: Neural Machine Translation and Models with Attention

Lecture 10 introduces translation, machine translation, and neural machine translation. Google's new NMT system is highlighted, followed by sequence models with ...

RNN3. Recurrent Neural Network Model

Deep Learning Chapter 10 Sequence Modeling: Recurrent and Recursive Nets presented by Ian Goodfellow

This is a Deep Learning Book Club discussion of Chapter 10, Sequence Modeling: Recurrent and Recursive Nets. The chapter is presented by author Ian ...

Neural networks [10.6]: Natural language processing - neural network language model