AI News, aigamedev/scikit-neuralnetwork

aigamedev/scikit-neuralnetwork

By importing the sknn package provided by this library, you can easily train deep neural networks as regressors (to estimate continuous outputs from inputs) and classifiers (to predict discrete labels from features).
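As a minimal sketch of how that looks in practice (the layer names and hyperparameters below are illustrative, and the random arrays only stand in for real training data):

    import numpy as np
    from sknn.mlp import Classifier, Layer

    # Stand-in data: 100 samples with 20 features each, labels in {0, 1, 2}.
    X_train = np.random.rand(100, 20)
    y_train = np.random.randint(0, 3, size=(100,))

    nn = Classifier(
        layers=[
            Layer("Rectifier", units=100),   # hidden layer with rectified linear units
            Layer("Softmax")],               # output layer for discrete labels
        learning_rate=0.02,
        n_iter=10)

    nn.fit(X_train, y_train)     # scikit-learn style training
    y_pred = nn.predict(X_train) # predicted labels

A Regressor from sknn.mlp works the same way, with a linear output layer instead of Softmax.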

Thanks to the underlying Lasagne implementation, the code supports a wide range of neural network features, exposed in an intuitive and well-documented API. If a feature you need is missing, consider opening a GitHub Issue with a detailed explanation of the use case, and we'll see what we can do.

Then, you can run the samples and benchmarks available in the examples/ folder, or launch the tests to check that everything is working. We strive to maintain 100% test coverage for all code paths, to ensure that rapid changes in the underlying backend libraries are caught automatically.

To run the example that generates the visualization above using our sknn.mlp.Classifier, just run the plotting script from the project's root folder. There are multiple parameters you can plot as well, for example iterations, rules or units.

Using Lasagne for training Deep Neural Networks

After using the libraries cuda-convnet and Caffe for a while, I found that I needed more flexibility in the models, both in defining the objective functions and in controlling the way samples are selected and augmented during training.

Lasagne is a library built on top of Theano, but it does not hide the Theano symbolic variables, so you can manipulate them very easily to modify the model or the learning procedure in any way you want.

The samples in the dataset look like this (see the figure of samples from the MNIST dataset). At a high level, what we want to do is define a model that identifies the digit y \in \{0, 1, \ldots, 9\} from an image x \in \mathbb{R}^{28 \times 28}.

We then define a cost function that measures how wrong our model is on a set of images: we show the model a bunch of images and check whether it is accurate in predicting y.
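Concretely, assuming the usual categorical cross-entropy (the standard choice for this kind of classifier, and the one used in the training sketches below), the cost over a batch of N labelled images is the average negative log-probability assigned to the correct class:

L = -\frac{1}{N} \sum_{i=1}^{N} \log P(y_i \vert x_i)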

The first step is creating symbolic variables for the input of the network (images) and for the output, 10 neurons predicting the probability of each digit (0-9) given the image. In this example, we name the inputs input_var and the outputs target_var.
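A sketch of what that declaration looks like, following the standard Theano/Lasagne MNIST setup (the 4D input shape assumes mini-batches of 1-channel 28x28 images):

    import theano.tensor as T

    # Symbolic variables: nothing is computed yet, these are placeholders
    # that Theano will later compile into functions.
    input_var = T.tensor4('inputs')    # batch of images: (batch, channels, rows, cols)
    target_var = T.ivector('targets')  # integer class label for each image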

This may be hard to grasp initially, but it is what allows Theano to automatically calculate gradients (derivatives), which is great for trying out new things, and it also enables the library to optimize your code.

The library implements the most commonly used layer types, and their declaration is very straightforward. Lasagne does not specify a “model” class, so the convention is to create a dictionary that contains all the layers (called net in this example).
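A sketch of such a dictionary, built from Lasagne's layer classes (the exact architecture here, filter counts and hidden units, is illustrative rather than the one from the post):

    from lasagne.layers import InputLayer, Conv2DLayer, MaxPool2DLayer, DenseLayer
    from lasagne.nonlinearities import rectify, softmax

    net = {}
    net['input'] = InputLayer((None, 1, 28, 28), input_var=input_var)
    net['conv1'] = Conv2DLayer(net['input'], num_filters=32, filter_size=5,
                               nonlinearity=rectify)
    net['pool1'] = MaxPool2DLayer(net['conv1'], pool_size=2)
    net['fc1'] = DenseLayer(net['pool1'], num_units=256, nonlinearity=rectify)
    net['output'] = DenseLayer(net['fc1'], num_units=10, nonlinearity=softmax)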

In order to add weight decay, we simply sum the weight-decay term into the loss variable.
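A sketch of the loss with an L2 weight-decay term added via lasagne.regularization (the 1e-4 coefficient is an arbitrary example value; net, input_var and target_var come from the sketches above):

    from lasagne.layers import get_output
    from lasagne.objectives import categorical_crossentropy
    from lasagne.regularization import regularize_network_params, l2

    prediction = get_output(net['output'])
    loss = categorical_crossentropy(prediction, target_var).mean()

    # Weight decay: sum the L2 penalty over all network weights into the loss.
    weight_decay = 1e-4 * regularize_network_params(net['output'], l2)
    loss = loss + weight_decay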

Here is where Theano really shines: since we defined the computations using symbolic math, it can automatically calculate the derivatives of an arbitrary loss function with respect to the weights.
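For example, a single call builds the gradient-based update rule, and theano.function compiles the whole training step (again assuming the variables defined in the sketches above):

    import theano
    from lasagne.layers import get_all_params
    from lasagne.updates import adam

    params = get_all_params(net['output'], trainable=True)
    # The update rule differentiates the loss with respect to the parameters
    # symbolically, so we never write the derivatives by hand.
    updates = adam(loss, params)

    # One compiled function that performs a full training step on a mini-batch.
    train_fn = theano.function([input_var, target_var], loss, updates=updates)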

We have defined two functions for testing: the first, val_fn, returns the average loss and classification accuracy on a set of images and labels (x, y); the second, get_preds, returns the predictions P(y \vert x) given a set of images x.
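A sketch of how these two functions can be compiled (deterministic=True disables dropout and other training-only behaviour at test time):

    import theano
    import theano.tensor as T
    from lasagne.layers import get_output
    from lasagne.objectives import categorical_crossentropy

    test_prediction = get_output(net['output'], deterministic=True)
    test_loss = categorical_crossentropy(test_prediction, target_var).mean()
    test_acc = T.mean(T.eq(T.argmax(test_prediction, axis=1), target_var),
                      dtype=theano.config.floatX)

    # Average loss and accuracy on a batch (x, y).
    val_fn = theano.function([input_var, target_var], [test_loss, test_acc])
    # Predicted probabilities P(y | x) for a batch of images x.
    get_preds = theano.function([input_var], test_prediction)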

Let’s now take a look at some cases where the model failed to predict the correct class (see the figure of incorrect predictions in the test set). There is certainly room for improvement in the model, but it is entertaining to see that the cases the model gets wrong are mostly hard to recognize.

Other libraries (such as cuda-convnet) require that you specify the parameters in a configuration file, which makes it harder to, for instance, try out different numbers of neurons in a given layer in an automated way.

Here is a plot of the training error over time, measured in epochs, i.e. the number of passes through the training set (see the figure comparing training progress with different optimization algorithms). For this dataset and model, ADAM was far superior to classical Stochastic Gradient Descent: for instance, the second pass over the training set with ADAM reached the same performance as 10 epochs of SGD.
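Because the update rule is just another symbolic expression, switching optimizers is a one-line change (the SGD learning rate here is an illustrative value):

    from lasagne.updates import sgd, adam

    updates_sgd = sgd(loss, params, learning_rate=0.01)  # classical SGD
    updates_adam = adam(loss, params)                    # ADAM with default hyperparameters
    # Pass either one as updates= when compiling train_fn to compare training curves.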

Introduction to Deep Learning with Python

Alec Radford, Head of Research at indico Data Solutions, speaking on deep learning with Python and the Theano library. The emphasis of the talk is on high performance computing...