AI News, A Primer in Adversarial Machine Learning – The Next Advance in AI
Consider two rising security applications: spam filtering and retinal identification. It is possible for a knowledgeable adversary to exploit this weakness to defeat a CNN spam filter. It was originally believed that the spammer would need deep knowledge of the original system, but it has recently been shown that even cleverly introduced random noise in the spam message, undetectable to the reader, can be used to defeat CNN spam filters. The same concern applies to retinal images. Although retinal images are believed to be highly specific to individuals, how do we know that the CNN is correctly identifying that a retinal image is actually yours?
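To make the evasion idea concrete, here is a minimal sketch under loud assumptions: a three-feature linear scorer stands in for the CNN spam filter, and the perturbation follows the sign of the weights (the fast-gradient-sign idea) rather than the random noise mentioned above. The weights, bias, feature values and epsilon are all hypothetical.

```python
# Toy linear "spam filter": a stand-in for the CNN discussed above.
WEIGHTS = [2.0, -1.0, 0.5]   # hypothetical learned feature weights
BIAS = -0.5                  # hypothetical bias term


def spam_score(features):
    """Return the linear score; a positive score means 'spam'."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def is_spam(features):
    return spam_score(features) > 0.0


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def adversarial_example(features, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(WEIGHTS, features)]


message = [0.6, 0.2, 0.4]    # hypothetical feature vector of a spam message
evasive = adversarial_example(message, epsilon=0.25)

print(is_spam(message), is_spam(evasive))
```

No feature moves by more than epsilon, yet the decision flips. Small, targeted changes are what make this kind of attack hard to notice.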
A third possibility is to intentionally poison, say, an intrusion detection system. These systems are designed to constantly retrain on the most current system observations. If those observations are intentionally tainted with noise designed to defeat the CNN recognition, the system will be trained to draw incorrect conclusions about whether a malevolent intrusion is occurring.
We’ll stick with image processing for this example. Suppose we want to train our GAN to recognize 18th Century paintings. This is completely unsupervised with the system simply being shown a large sample of these paintings. Remember that CNNs reduce images through a series of layers to simple numerical vectors that describe edges, colors, and the like and can also create images in the reverse process by simply starting with a random selection of these numerical vectors.
But AML offers a path forward for the machine to demonstrate human-like prediction and planning based on its own experience of the world. In a thought experiment, Yann LeCun and Soumith Chintala, both doing AI research at Facebook, propose to show several video frames of a billiards game to their AML and have it learn the rules of physics describing what will happen next, that is, to anticipate. They plan to augment this experiment by training their AML on images set a few inches apart, like binocular vision; in essence, to let it learn about the 3D world. From this they expect their AML to develop the ‘common sense’ that, for example, it cannot walk out a door without first opening it, or how far someone would have to reach across a table to grasp an object.
API-driven services bring intelligence to any application
Developed by AWS and Microsoft, Gluon provides a clear, concise API for defining machine learning models using a collection of pre-built, optimized neural network components.
More seasoned data scientists and researchers will value the ability to build prototypes quickly and utilize dynamic neural network graphs for entirely new model architectures, all without sacrificing training speed.
Applied Deep Learning - Part 3: Autoencoders
In Part 2 we applied deep learning to real-world datasets, covering the 3 most commonly encountered problems as case studies: binary classification, multiclass classification and regression.
The code for this article is available as a Jupyter notebook; feel free to download it and try it out yourself.
The encoder compresses the input and produces the code; the decoder then reconstructs the input using only this code.
To build an autoencoder we need 3 things: an encoding method, a decoding method, and a loss function to compare the output with the target.
Autoencoders are mainly a dimensionality reduction (or compression) algorithm with a couple of important properties. Let’s explore the details of the encoder, code and decoder.
The hyperparameters are: 128 nodes in the hidden layer, a code size of 32, and binary crossentropy as the loss function.
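As a rough illustration of the loss named above, the following sketch computes binary crossentropy between target pixels and reconstructed pixels in plain Python. It assumes pixel values scaled to the range 0 to 1; the helper name, the clipping constant `eps` and the sample values are hypothetical and not taken from the article's notebook.

```python
import math


def binary_crossentropy(targets, outputs, eps=1e-7):
    """Mean binary crossentropy between target values and predicted values in [0, 1]."""
    total = 0.0
    for t, p in zip(targets, outputs):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


pixels = [0.0, 1.0, 1.0, 0.0]             # hypothetical target pixels
good = [0.1, 0.9, 0.8, 0.2]               # close reconstruction
bad = [0.9, 0.1, 0.2, 0.8]                # poor reconstruction

print(binary_crossentropy(pixels, good) < binary_crossentropy(pixels, bad))
```

The closer reconstruction gets the lower loss, which is the signal the optimizer follows during training.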
Previously we added layers using the sequential API as follows:
    model.add(Dense(16, activation='relu'))
    model.add(Dense(8, activation='relu'))
With the functional API we do this:
    layer_1 = Dense(16, activation='relu')(input)
    layer_2 = Dense(8, activation='relu')(layer_1)
It’s more verbose but a more flexible way to define complex models.
Also note the call to the fit function. Before, with ANNs, we used to do:
    model.fit(x_train, y_train)
But now we do:
    model.fit(x_train, x_train)
Remember that the targets of the autoencoder are the same as the input.
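The point that inputs double as targets can be sketched without Keras. The toy below is an assumption-laden stand-in rather than the article's model: a linear autoencoder with a code size of 1, trained by plain gradient descent on mean squared error (not binary crossentropy), on four made-up 2-D points that lie on a single line. All names and hyperparameters are hypothetical.

```python
# Four 2-D points on one line, so a single code value can describe each of them.
x_train = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]

encoder = [0.1, 0.1]   # weights mapping 2 inputs -> 1 code value
decoder = [0.1, 0.1]   # weights mapping 1 code value -> 2 outputs


def reconstruct(x):
    """Encode x into a single code value, then decode it back to 2 outputs."""
    code = encoder[0] * x[0] + encoder[1] * x[1]
    return code, [decoder[0] * code, decoder[1] * code]


def loss(inputs, targets):
    """Mean over samples of the summed squared reconstruction error."""
    total = 0.0
    for x, y in zip(inputs, targets):
        _, out = reconstruct(x)
        total += sum((o - t) ** 2 for o, t in zip(out, y))
    return total / len(inputs)


def fit(inputs, targets, epochs=2000, lr=0.02):
    """Full-batch gradient descent on the loss above."""
    n = len(inputs)
    for _ in range(epochs):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        for x, y in zip(inputs, targets):
            code, out = reconstruct(x)
            err = [o - t for o, t in zip(out, y)]
            d_code = sum(2.0 * e * d for e, d in zip(err, decoder))
            for i in range(2):
                g_dec[i] += 2.0 * err[i] * code / n
                g_enc[i] += d_code * x[i] / n
        for i in range(2):
            encoder[i] -= lr * g_enc[i]
            decoder[i] -= lr * g_dec[i]


initial_loss = loss(x_train, x_train)
fit(x_train, x_train)            # the targets are the inputs themselves
final_loss = loss(x_train, x_train)
print(initial_loss, final_loss)
```

Because the data has structure (every point satisfies y = 2x), a single code value is enough to reconstruct it, and the loss falls well below its starting value.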
The autoencoder will reconstruct the training data perfectly, but it will be overfitting without being able to generalize to new instances, which is not what we want.
If the input data has a pattern, for example the digit “1” usually contains a somewhat straight line and the digit “0” is circular, it will learn this fact and encode it in a more compact form.
If the input data was completely random without any internal correlation or dependency, then an undercomplete autoencoder won’t be able to recover it perfectly.
There is another way to force the autoencoder to learn useful features, which is adding random noise to its inputs and making it recover the original noise-free data.
This way the autoencoder can’t simply copy the input to its output because the input also contains random noise.
We trained the regular autoencoder as follows:
    autoencoder.fit(x_train, x_train)
The denoising autoencoder is trained as:
    autoencoder.fit(x_train_noisy, x_train)
Simple as that; everything else is exactly the same.
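One plausible way to build the noisy inputs is sketched below, assuming pixel values scaled to the range 0 to 1 and Gaussian noise. The `noise_factor` value, the helper name and the tiny sample "images" are hypothetical; the article's notebook may construct `x_train_noisy` differently.

```python
import random


def add_noise(pixels, noise_factor=0.3, rng=None):
    """Add Gaussian noise to each pixel and clip the result back into [0, 1]."""
    rng = rng or random.Random()
    noisy = []
    for value in pixels:
        corrupted = value + noise_factor * rng.gauss(0.0, 1.0)
        noisy.append(min(1.0, max(0.0, corrupted)))
    return noisy


# Two hypothetical 4-pixel "images".
x_train = [[0.0, 0.5, 1.0, 0.25], [0.9, 0.1, 0.4, 0.6]]

rng = random.Random(0)           # fixed seed so the sketch is repeatable
x_train_noisy = [add_noise(image, rng=rng) for image in x_train]

# The noisy copies are the inputs; the clean originals stay the targets.
print(x_train_noisy)
```

Only the inputs are corrupted. The clean `x_train` remains the target, which is what forces the network to learn structure instead of copying.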
We introduced two ways to force the autoencoder to learn useful features: keeping the code size small and denoising autoencoders.
We can regularize the autoencoder by using a sparsity constraint such that only a fraction of the nodes would have nonzero values, called active nodes.
This forces the autoencoder to represent each input as a combination of a small number of nodes, and requires it to discover interesting structure in the data.
As a reminder, previously we created the code layer without any regularization. We now add another parameter called activity_regularizer, specifying the regularization strength.
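What an activity penalty adds to the loss can be sketched in a few lines. This assumes an L1 penalty on the code values, a common choice for encouraging sparsity. The text only names the regularization strength, so the penalty type, the default strength and the sample code vectors here are assumptions.

```python
def l1_activity_penalty(code_values, strength):
    """Penalty proportional to the summed absolute activations of the code layer."""
    return strength * sum(abs(v) for v in code_values)


def regularized_loss(reconstruction_loss, code_values, strength=1e-5):
    """Total training loss: reconstruction error plus the sparsity penalty."""
    return reconstruction_loss + l1_activity_penalty(code_values, strength)


dense_code = [1.2, 0.8, 1.5, 0.9]     # hypothetical code with every node active
sparse_code = [0.0, 0.0, 2.0, 0.0]    # hypothetical code with one active node

print(l1_activity_penalty(dense_code, 0.01))
print(l1_activity_penalty(sparse_code, 0.01))
```

Because every nonzero activation costs something, the optimizer is nudged toward codes where most nodes stay at zero.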
If we look at the histogram of code values for the images in the test set, the mean for the standard model is 6.6 but for the regularized model it’s 0.8, a pretty big reduction.
As a compression method, autoencoders don’t perform better than the alternatives; for example, JPEG does photo compression better than an autoencoder.
Advancing state-of-the-art image recognition with deep learning on hashtags - Facebook Code
Our researchers and engineers aim to push the boundaries of computer vision and then apply that work to benefit people in the real world — for example, using AI to generate audio captions of photos for visually impaired users.
In order to improve these computer vision systems and train them to consistently recognize and classify a wide range of objects, we need data sets with billions of images instead of just millions, as is common today.
Our researchers and engineers have addressed this by training image recognition networks on large sets of public images with hashtags, the biggest of which included 3.5 billion images and 17,000 hashtags.
By training our computer vision system with a 1 billion-image version of this data set, we achieved a record-high score — 85.4 percent accuracy — on ImageNet, a common benchmarking tool.
Along with enabling this genuine breakthrough in image recognition performance, this research offers important insight into how to shift from supervised to weakly supervised training, where we use existing labels — in this case, hashtags — rather than ones that are chosen and applied specifically for AI training.
For image recognition purposes, tags function as weakly supervised data, and vague and/or irrelevant hashtags appear as incoherent label noise that can confuse deep learning models.
That included dealing with multiple labels per image (since people who add hashtags tend to use more than one), sorting through hashtag synonyms, and balancing the influence of frequent hashtags and rare ones.
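The three issues listed above can be sketched as follows. This is one possible treatment, not necessarily the one used in the research: synonyms are merged through a lookup table, each of an image's k hashtags gets an equal 1/k share of the target, and frequent hashtags are down-weighted by the inverse square root of their count. The synonym table, function names and sample tags are all hypothetical.

```python
from math import sqrt

# Hypothetical synonym table mapping variants to one canonical hashtag.
SYNONYMS = {"#doggo": "#dog", "#pupper": "#dog"}


def canonicalize(hashtags):
    """Lowercase, merge synonyms, and drop duplicates while keeping order."""
    seen = []
    for tag in hashtags:
        tag = SYNONYMS.get(tag.lower(), tag.lower())
        if tag not in seen:
            seen.append(tag)
    return seen


def soft_targets(hashtags, vocabulary):
    """Give each in-vocabulary hashtag of an image an equal share of the target."""
    present = [t for t in canonicalize(hashtags) if t in vocabulary]
    share = 1.0 / len(present) if present else 0.0
    return [share if tag in present else 0.0 for tag in vocabulary]


def sampling_weight(tag_count):
    """Down-weight frequent hashtags so rare ones are not drowned out."""
    return 1.0 / sqrt(tag_count)


vocabulary = ["#cat", "#dog", "#park"]
print(soft_targets(["#Doggo", "#dog", "#park"], vocabulary))
print(sampling_weight(100), sampling_weight(4))
```

With these made-up counts, a hashtag seen 100 times is sampled at one fifth the per-image rate of one seen 4 times, which narrows the gap between frequent and rare labels without erasing it.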
On the ImageNet image recognition benchmark — one of the most common benchmarks in the field — our best model achieved 85.4 percent accuracy by training on 1 billion images with a vocabulary of 1,500 hashtags.
On the other hand, for tasks with greater visual variety, the performance improvements of models trained with 17,000 hashtags became much more pronounced, indicating that we should increase the number of hashtags in our future training.
For example, an audio caption for a photo that mentions a bird in a tree is useful, but a caption that can pinpoint the exact species, such as a cardinal perched in a sugar maple tree, provides visually impaired users with a significantly better description.
Since it involves a first-of-its-kind level of scale, the observations detailed in this paper will pave the way for a range of new research directions, including the need to develop a new generation of deep learning models that are complex enough to effectively learn from billions of images.
The work also suggests that, as widely used as benchmarks such as ImageNet are, we need to develop new benchmarks that allow us to better gauge the quality and limitations of today’s image recognition systems and the larger, less supervised ones to come.
- On Sunday, March 24, 2019
Anomaly Detection: Algorithms, Explanations, Applications
Anomaly detection is important for data cleaning, cybersecurity, and robust AI systems. This talk will review recent work in our group on (a) benchmarking ...