AI News, Picasso: A free open-source visualizer for Convolutional Neural Networks

Picasso: A free open-source visualizer for Convolutional Neural Networks

While it’s easier than ever to define and train deep neural networks (DNNs), understanding the learning process remains somewhat opaque.

Monitoring the loss or classification error during training won’t always prevent your model from learning the wrong thing or learning a proxy for your intended classification task.

To understand what we mean, consider the (possibly apocryphal) story of a neural network trained to detect camouflaged tanks [1], excerpted in the paper below. Regardless of the veracity of this tale, the point is familiar to machine learning researchers: training metrics don’t always tell the whole story.

And the stakes are higher than ever before: for emerging applications of deep learning like autonomous vehicles, these kinds of training errors can be deadly [2].

We developed Picasso to make it easy to see standard visualizations across our models in our various verticals, including applications in automotive, such as understanding when road segmentation or object detection fail.

We already know this model is pretty good at classifying tanks: can we use these visualizations to check that the model is actually classifying based on the tank and not, say, the sky?

Picasso: A Modular Framework for Visualizing the Learning Process of Neural Network Image Classifiers

Neural networks (NNs) [1] and convolutional neural networks (CNNs) [2, 3, 4] are subject to unique training pitfalls [5, 6].

The researchers ran the neural network on the remaining 100 photos, and without further training the neural network classified all remaining photos correctly.

The researchers handed the finished work to the Pentagon, which soon handed it back, complaining that in their own tests the neural network did no better than chance at discriminating photos.

It turned out that in the researchers’ dataset, photos of camouflaged tanks had been taken on cloudy days, while photos of plain forest had been taken on sunny days. [emphasis added]

While this story may be apocryphal, it nonetheless illustrates a common pitfall in machine learning: training on a proxy feature instead of the intended feature.

We developed Picasso to help protect against situations where evaluation metrics like loss and accuracy may not tell the whole story in training neural networks on image classification tasks.

Picasso makes it easy to see standard visualizations across our models in various fields, including applications in automotive, such as understanding when road segmentation or object detection fail.

Other visualization packages exist to help bring transparency to the learning process, most notably the Deep Visualization Toolbox [15] and keras-vis [16], which can also generate saliency maps.
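For intuition about what such a saliency map computes, here is a minimal sketch using tf.keras and GradientTape. This is a generic gradient-based saliency, not Picasso’s or keras-vis’s actual implementation, and model, image, and class_idx are placeholder inputs.

import numpy as np
import tensorflow as tf

def saliency_map(model, image, class_idx):
    """Gradient of one class score with respect to the input pixels.

    `model` is any Keras image classifier and `image` is a single
    preprocessed input of shape (H, W, C); both are placeholders here.
    """
    x = tf.convert_to_tensor(image[np.newaxis], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        score = model(x)[0, class_idx]          # score for the class of interest
    grads = tape.gradient(score, x)             # shape (1, H, W, C)
    return tf.reduce_max(tf.abs(grads), axis=-1)[0].numpy()  # shape (H, W)

Pixels with large values are the ones the prediction is most sensitive to, which is what lets you check whether a “tank” score is driven by the tank or by the sky.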

We furthermore required an application that would easily allow us to add new visualizations, which may in the future include visualizations such as class activation mapping [17, 18] and image segmentation [19, 20].

Visualizing parts of Convolutional Neural Networks using Keras and Cats

It is well known that convolutional neural networks (CNNs or ConvNets) have been the source of many major breakthroughs in the field of deep learning in the last few years, but they are rather unintuitive for most people to reason about.

In its first few layers, a CNN identifies patterns like lines and corners; as those patterns are passed deeper through the network, it starts recognizing more complex features.

Also of note, images are sometimes padded with zeros around the perimeter when performing convolutions, which dampens the values of the convolution around the edges of the image (the idea being that the center of a photo typically matters more).

This works because of filters: stacks of weights, represented as a vector, which are multiplied by the values output by the convolution. These weights change during training, so when it is time to evaluate a new image, they return high values if the network thinks it is seeing a pattern it has seen before.
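To make the mechanics concrete, here is a small NumPy sketch of a single filter being slid over a grayscale image, with optional zero padding as described above; the filter values and the toy image are made up for illustration.

import numpy as np

def conv2d(image, kernel, pad=0):
    """Slide one 2-D kernel (a filter's weights) over a grayscale image.

    Each output value is the sum of the element-wise product between the
    kernel and the image window beneath it; `pad` zero-pads the border.
    """
    if pad:
        image = np.pad(image, pad, mode="constant")
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 edge-detecting kernel on a toy 5x5 "image".
edge = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])
toy = np.random.rand(5, 5)
print(conv2d(toy, edge, pad=1).shape)  # (5, 5): padding of 1 keeps the size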

Pooling works very much like convolution: we take a kernel and move it over the image; the only difference is that the function applied to the kernel and the image window isn’t linear.

Max pooling takes the largest value from the window of the image currently covered by the kernel, while average pooling takes the average of all values in the window.
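Here is a small NumPy sketch of both pooling variants on a toy 4x4 image; the window size and values are illustrative.

import numpy as np

def pool2d(image, size=2, mode="max"):
    """Non-overlapping pooling with a `size` x `size` window.

    Unlike convolution, the window function is not a weighted sum:
    max pooling keeps the largest value, average pooling the mean.
    """
    h, w = image.shape[0] // size, image.shape[1] // size
    windows = image[:h * size, :w * size].reshape(h, size, w, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    return windows.mean(axis=(1, 3))

toy = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(toy, 2, "max"))   # [[ 5.  7.] [13. 15.]]
print(pool2d(toy, 2, "avg"))   # [[ 2.5  4.5] [10.5 12.5]]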

ImageNet has become one of the research world’s standards for comparing CNN models, with the current best models successfully detecting the objects in 94+% of its images.

The first viable example of a CNN applied to ImageNet was AlexNet in 2012. Before that, researchers attempted to use traditional computer vision techniques, but AlexNet outperformed everything else up to that point by ~15%.

Anyway, let’s look at LeNet. The diagram doesn’t show the activation functions, but the architecture is: Input image → ConvLayer → ReLU → MaxPooling → ConvLayer → ReLU → MaxPooling → Hidden Layer → Softmax (activation) → Output layer.
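As a concrete sketch of that layer ordering, here is roughly what it looks like as a tf.keras Sequential model; the filter counts, kernel sizes and the 10-class softmax output are illustrative placeholders, since the post does not specify them.

from tensorflow.keras import layers, models

# A minimal LeNet-style model following the layer order listed above.
model = models.Sequential([
    layers.Conv2D(6, (3, 3), activation="relu",
                  input_shape=(320, 400, 3)),   # the 320x400 RGB cat image
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(120, activation="relu"),        # the "hidden layer"
    layers.Dense(10, activation="softmax"),      # softmax output (placeholder class count)
])
model.summary()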

Here is an image of a cat: our picture of the cat has a height of 320px, a width of 400px, and 3 channels of color (RGB). And here is the cat after a convolution with a kernel size of 3x3 and 3 filters (if we had more than 3 filters we couldn’t plot a 2D image of the cat).

We add a pooling layer (dropping the activation just to make it a bit easier to show). As expected, the cat is blockier, but we can go even blockier!
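Here is a hedged sketch of how to reproduce the blocky-cat pictures: one untrained Conv2D with 3 filters followed by max pooling, applied to a photo loaded from the placeholder path cat.jpg (the layer sizes are illustrative, not the post’s exact code).

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing import image

# "cat.jpg" is a placeholder path; any RGB photo will do.
cat = image.img_to_array(image.load_img("cat.jpg")) / 255.0
batch = cat[np.newaxis]                          # shape (1, H, W, 3)

# 3 random, untrained 3x3 filters so the output can still be shown as RGB.
conv = models.Sequential([layers.Conv2D(3, (3, 3), padding="same",
                                        input_shape=cat.shape)])
conv_out = conv.predict(batch)

# Re-use the same filters and add max pooling, which makes the cat blockier;
# an 8x8 pool exaggerates the effect so it is easy to see.
conv.add(layers.MaxPooling2D((8, 8)))
pooled = conv.predict(batch)

for img, title in [(conv_out, "3x3 conv, 3 filters"),
                   (pooled, "conv + 8x8 max pooling")]:
    plt.figure()
    plt.title(title)
    plt.imshow(np.clip(img[0], 0, 1))
plt.show()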

Lecture 12 | Visualizing and Understanding

In Lecture 12 we discuss methods for visualizing and understanding the internal mechanisms of convolutional networks. We also discuss the use of ...

Lecture 1 | Introduction to Convolutional Neural Networks for Visual Recognition

Lecture 1 gives an introduction to the field of computer vision, discussing its history and key challenges. We emphasize that computer vision encompasses a ...

Jeff Abrahamson - WTF am I doing? An introduction to NLP and ANN's

Filmed at PyData London 2017. This talk will be a playful but serious introduction to natural language processing and image ..

Extraordinary Variations of the Mind: Geschwind: Our Brains; Berman: Williams Syndrome; Fisher: Language

1:39 - Our Brains: Life on a Continuum; 21:07 - From Genes to Neural Circuits; 35:07 - Language at the Extremes. The human mind is one ..

Google I/O'17: Channel 7

Technical sessions and deep dives into Google's latest developer products and platforms. Watch more Firebase talks at I/O '17 here: See ..

Collections as Data: Impact

Building on the success of its “Collections as Data” symposium last year, the Library of Congress National Digital Initiatives (NDI) again will host a daylong ...

ACDH Lecture 3.1- Björn Ommer- Visual Analytics: Enabling Images to Speak for Themselves

ACDH Lecture 3.1, Visual Analytics: Enabling Images to Speak for Themselves. Date: 7 March 2017. Place: ACDH, OEAW, Vienna, Austria. REGISTRATION ...

espyconnect NDIS Support Item modification 2017

We have made some changes to our NDIS Support Items category. Watch our video to see how easy it is to update your listing!