AI News, Pre-trained Models with Keras in TensorFlow

Pre-trained Models with Keras in TensorFlow

This is great for building new models, but we also get the pre-trained models of keras.applications (also available elsewhere).

The ResNet50 model expects input images that are 224 pixels on a side, so we have to resize our image to that size.

It's pretty fun that this kind of super-easy access to quite good pre-trained models is now available all within the TensorFlow package.

The thousand ImageNet categories this model knows about include some things that are commonly associated with people, but not a 'person' class.

There's also another function, resnet50.preprocess_input, which in theory should help the model work better, but my tests gave seemingly worse results when using that pre-processing.
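
As a minimal sketch of how little code this takes (the filename is a placeholder, and the exact import path varies across TensorFlow versions):

    import numpy as np
    from tensorflow.keras.applications import resnet50
    from tensorflow.keras.preprocessing import image

    # Load ResNet50 with ImageNet weights.
    model = resnet50.ResNet50(weights='imagenet')

    # The model expects 224x224 RGB input, so resize at load time.
    img = image.load_img('my_photo.jpg', target_size=(224, 224))
    x = np.expand_dims(image.img_to_array(img), axis=0)

    # Optional: the pre-processing step the post found to give mixed results.
    x = resnet50.preprocess_input(x)

    # Top-3 ImageNet labels for the image.
    print(resnet50.decode_predictions(model.predict(x), top=3)[0])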

How to train your own Object Detector with TensorFlow’s Object Detector API

This is a follow-up post on “Building a Real-Time Object Recognition App with Tensorflow and OpenCV” where I focus on training my own classes.

After my last post, a lot of people asked me to write a guide on how they can use TensorFlow’s new Object Detector API to train an object detector with their own dataset.

For training, you need a dataset in TFRecord format, a label map that maps class IDs to class names, and a training pipeline configuration file. Note: the data_augmentation_options setting is very interesting if your dataset doesn't have much variability, like different scale, pose, etc.
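
As a rough illustration, augmentation is switched on in the train_config section of the API's pipeline configuration; the two options shown here are just examples of what the API provides:

    train_config {
      batch_size: 24
      data_augmentation_options {
        random_horizontal_flip {
        }
      }
      data_augmentation_options {
        ssd_random_crop {
        }
      }
    }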

In total, I ran training for about one hour (22k steps) with a batch size of 24, but I already achieved good results after about 40 minutes.

The evolution of the total loss is shown as a plot in the original post; since I only had one class, it was enough to just look at total mAP (mean average precision), and the post also includes an example evaluation of one image taken while training the model. I obtained quite decent results for such a short training time, but this is due to the fact that the detector was trained on a single class only.

In many other cases, even the model I used would be too simple to capture all the variability across multiple classes, so more complex models would have to be used.

Attacking Machine Learning with Adversarial Examples

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they're like optical illusions for machines.

At OpenAI, we think adversarial examples are a good aspect of security to work on because they represent a concrete problem in AI safety that can be addressed in the short term, and because fixing them is difficult enough that it requires a serious research effort.

(Though we'll need to explore many aspects of machine learning security to achieve our goal of building safe, widely distributed AI.) To get an idea of what adversarial examples look like, consider this demonstration from Explaining and Harnessing Adversarial Examples: starting with an image of a panda, the attacker adds a small perturbation that has been calculated to make the image be recognized as a gibbon with high confidence.

For example, attackers could target autonomous vehicles by using stickers or paint to create an adversarial stop sign that the vehicle would interpret as a 'yield' or other sign, as discussed in Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples.

When we think about the study of AI safety, we usually think about some of the most difficult problems in that field — how can we ensure that sophisticated reinforcement learning agents that are significantly more intelligent than human beings behave in ways that their designers intended?

Defensive distillation creates a model whose surface is smoothed in the directions an adversary will typically try to exploit, making it difficult for them to discover adversarial input tweaks that lead to incorrect categorization.

(Distillation was originally introduced in Distilling the Knowledge in a Neural Network as a technique for model compression, where a small model is trained to imitate a large one, in order to obtain computational savings.) Yet even these specialized algorithms can easily be broken by giving more computational firepower to the attacker.

In other words, they look at a picture of an airplane, they test which direction in picture space makes the probability of the “cat” class increase, and then they give a little push (in other words, they perturb the input) in that direction.

If the model’s output is “99.9% airplane, 0.1% cat”, then a little tiny change to the input gives a little tiny change to the output, and the gradient tells us which changes will increase the probability of the “cat” class.
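
A minimal sketch of this idea is the fast gradient sign method from the paper above; the helper below is hypothetical and assumes a Keras classifier that outputs class probabilities plus integer class labels:

    import tensorflow as tf

    def fgsm_perturb(model, x, y_true, eps=0.01):
        # Differentiate the loss with respect to the *input*, not the weights.
        x = tf.convert_to_tensor(x)
        with tf.GradientTape() as tape:
            tape.watch(x)
            loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, model(x))
        grad = tape.gradient(loss, x)
        # Step along the sign of the gradient: a tiny change per pixel that
        # collectively pushes the prediction away from the true class.
        return x + eps * tf.sign(grad)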

Let’s run a thought experiment to see how well we could defend our model against adversarial examples by running it in “most likely class” mode instead of “probability mode.” The attacker no longer knows where to go to find inputs that will be classified as cats, so we might have some defense.

The defense strategies that perform gradient masking typically result in a model that is very smooth in specific directions and neighborhoods of training points, which makes it harder for the adversary to find gradients indicating good candidate directions to perturb the input in a damaging way for the model.

Neither algorithm was explicitly designed to perform gradient masking, but gradient masking is apparently a defense that machine learning algorithms can invent relatively easily when they are trained to defend themselves and not given specific instructions about how to do so.

Neural Network Toolbox

Neural Network Toolbox™ provides algorithms, pretrained models, and apps to create, train, visualize, and simulate both shallow and deep neural networks.

Deep learning networks include convolutional neural networks (ConvNets, CNNs), directed acyclic graph (DAG) network topologies, and autoencoders for image classification, regression, and feature learning.

For small training sets, you can quickly apply deep learning by performing transfer learning with pretrained deep network models (including Inception-v3, ResNet-50, ResNet-101, GoogLeNet, AlexNet, VGG-16, and VGG-19) and models imported from TensorFlow™ Keras or Caffe.

Time Series Prediction with LSTM Recurrent Neural Networks in Python with Keras

Time series prediction problems are a difficult type of predictive modeling problem.

The Long Short-Term Memory network or LSTM network is a type of recurrent neural network used in deep learning because very large architectures can be successfully trained.

In this post, you will discover how to develop LSTM networks in Python using the Keras deep learning library to address a demonstration time-series prediction problem.

After completing this tutorial you will know how to implement and develop LSTM networks for your own time series prediction problems and other more general sequence problems.

The example in this post is quite dated; I have better examples available for using LSTMs on time series elsewhere. The problem we are going to look at in this post is the International Airline Passengers prediction problem.

The downloaded dataset also has footer information that we can exclude with the skipfooter argument to pandas.read_csv() set to 3 for the 3 footer lines.
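
In pandas this looks roughly like the following (the filename is whatever you saved the dataset as; skipfooter requires the python parsing engine):

    import pandas as pd

    # Keep only the passenger-count column and drop the 3 footer lines.
    dataframe = pd.read_csv('international-airline-passengers.csv',
                            usecols=[1], engine='python', skipfooter=3)
    dataset = dataframe.values.astype('float32')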

As such, it can be used to create large recurrent networks that in turn can be used to address difficult sequence problems in machine learning and achieve state-of-the-art results.

A block operates upon an input sequence and each gate within a block uses the sigmoid activation units to control whether they are triggered or not, making the change of state and addition of information flowing through the block conditional.

There are three types of gates within a unit: the forget gate, the input gate, and the output gate. Each unit is like a mini-state machine where the gates have weights that are learned during the training procedure.

We can write a simple function to convert our single column of data into a two-column dataset: the first column containing this month’s (t) passenger count and the second column containing next month’s (t+1) passenger count, to be predicted.
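
A sketch of such a function, following the description above:

    import numpy as np

    def create_dataset(dataset, look_back=1):
        # Pair look_back past values (X) with the value that follows them (Y).
        dataX, dataY = [], []
        for i in range(len(dataset) - look_back - 1):
            dataX.append(dataset[i:(i + look_back), 0])
            dataY.append(dataset[i + look_back, 0])
        return np.array(dataX), np.array(dataY)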

After we model our data and estimate the skill of our model on the training dataset, we need to get an idea of the skill of the model on new unseen data.

The code below calculates the index of the split point and separates the data into the training datasets with 67% of the observations that we can use to train our model, leaving the remaining 33% for testing the model.
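
Because this is a time series, the split preserves temporal order rather than shuffling:

    # First 67% of observations for training, last 33% for testing.
    train_size = int(len(dataset) * 0.67)
    train, test = dataset[:train_size, :], dataset[train_size:, :]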

The function takes two arguments: the dataset, a NumPy array that we want to convert, and look_back, the number of previous time steps to use as input variables to predict the next time period (which defaults to 1).

We can transform the prepared train and test input data into the expected structure using numpy.reshape().
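
A sketch, reusing the array names from the snippets above:

    # Keras LSTMs expect input shaped [samples, time steps, features];
    # here each sample is 1 time step with look_back features.
    trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
    testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))

With the data reshaped, we are now ready to design and fit our LSTM network for this problem.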

The network has a visible layer with 1 input, a hidden layer with 4 LSTM blocks or neurons, and an output layer that makes a single value prediction.
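
A sketch of that architecture (the hyperparameters are illustrative; the original post trains for around 100 epochs with a batch size of 1):

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, LSTM

    look_back = 1
    model = Sequential()
    model.add(LSTM(4, input_shape=(1, look_back)))  # 4 LSTM blocks in the hidden layer
    model.add(Dense(1))                             # single-value prediction
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)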

Note that we invert the predictions before calculating error scores to ensure that performance is reported in the same units as the original data (thousands of passengers per month).
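
Roughly, and assuming the series was normalized earlier with a scikit-learn MinMaxScaler held in a variable named scaler (a step the excerpt above doesn't show):

    import math
    from sklearn.metrics import mean_squared_error

    trainPredict = model.predict(trainX)
    testPredict = model.predict(testX)

    # Undo the scaling so errors are in thousands of passengers per month.
    trainPredict = scaler.inverse_transform(trainPredict)
    trainY_inv = scaler.inverse_transform([trainY])
    testPredict = scaler.inverse_transform(testPredict)
    testY_inv = scaler.inverse_transform([testY])

    print('Train RMSE: %.2f' % math.sqrt(mean_squared_error(trainY_inv[0], trainPredict[:, 0])))
    print('Test RMSE: %.2f' % math.sqrt(mean_squared_error(testY_inv[0], testPredict[:, 0])))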

Once prepared, the data is plotted, showing the original dataset in blue, the predictions for the training dataset in green, and the predictions on the unseen test dataset in red.

For example, given the current time (t) we want to predict the value at the next time in the sequence (t+1), we can use the current time (t), as well as the two prior times (t-1 and t-2) as input variables.

The create_dataset() function we created in the previous section allows us to create this formulation of the time series problem by increasing the look_back argument from 1 to 3.
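
Reusing the create_dataset() sketch from above:

    # Three past months as input features for each prediction.
    look_back = 3
    trainX, trainY = create_dataset(train, look_back)
    testX, testY = create_dataset(test, look_back)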

Like above in the window example, we can take prior time steps in our time series as inputs to predict the output at the next time step.

Instead of phrasing the past observations as separate input features, we can use them as time steps of the one input feature, which is indeed a more accurate framing of the problem.

We can do this using the same data representation as in the previous window-based example, except when we reshape the data, we set the columns to be the time steps dimension and change the features dimension back to 1.
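
In the sketch's terms, only the reshape changes:

    # [samples, time steps, features]: look_back time steps of 1 feature,
    # instead of 1 time step with look_back features.
    trainX = np.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
    testX = np.reshape(testX, (testX.shape[0], testX.shape[1], 1))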

Finally, when the LSTM layer is constructed, the stateful parameter must be set to True, and instead of specifying the input dimensions, we must hard-code the number of samples in a batch, the number of time steps in a sample, and the number of features in a time step by setting the batch_input_shape parameter. For example:
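
A sketch of the stateful setup, reusing the imports and time-steps reshape above (a batch size of 1 keeps the batch arithmetic simple):

    batch_size = 1
    model = Sequential()
    model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')

    # With stateful=True, Keras no longer resets state after each batch,
    # so we run the epochs ourselves and reset state between passes.
    for _ in range(100):
        model.fit(trainX, trainY, epochs=1, batch_size=batch_size,
                  verbose=2, shuffle=False)
        model.reset_states()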

Lecture 12 | Visualizing and Understanding

In Lecture 12 we discuss methods for visualizing and understanding the internal mechanisms of convolutional networks. We also discuss the use of ...