
Anomaly Detection for Time Series Data with Deep Learning

More recently, machine learning has entered the public consciousness because of advances in "deep learning"; these include AlphaGo's defeat of Go grandmaster Lee Sedol and impressive new products for image recognition and machine translation.

The increasing accuracy of deep neural networks for solving problems such as speech and image recognition has stoked attention and research devoted to deep learning and AI more generally.

This article introduces neural networks, including brief descriptions of feed-forward neural networks and recurrent neural networks, and describes how to build a recurrent neural network that detects anomalies in time series data.

By connecting many artificial neurons, we obtain systems that can be trained to learn higher-level patterns in data and to perform useful functions such as regression, classification, clustering, and prediction.

An artificial neural network is a collection of compute nodes. Data, represented as a numeric array, is passed into the network's input layer and proceeds through the network's so-called hidden layers until an output, a decision about the data, is generated; the process is described briefly below.

The net’s resulting output is then compared to expected results (ground-truth labels applied to the data, for example), and the difference between the network’s guess and the right answer is used to incrementally correct the activation thresholds of the net’s nodes.

The input data passes through the coefficients, or parameters, of the net; through multiplication, those coefficients amplify or mute the input according to its learned importance, that is, whether or not a given pixel should affect the net's decision about the entire input.

The error is calculated by comparing the net's guess to the true answer contained in the labeled training data, and using that error, the coefficients of the network are updated to change how the net assigns importance to different pixels in the image.
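As a minimal sketch of that process (the layer sizes and numbers below are illustrative assumptions, not a real image pipeline), a forward pass is repeated multiply-and-sum, and the squared difference between the guess and the right answer is the quantity that drives the coefficient updates:

```python
import numpy as np

np.random.seed(0)
pixels = np.random.rand(4)                 # a tiny "image", flattened to 4 inputs
weights = np.random.randn(3, 4)            # coefficients: one row per hidden node
hidden = np.maximum(0, weights @ pixels)   # multiply and sum, then a ReLU activation
out_weights = np.random.randn(3)
guess = 1 / (1 + np.exp(-out_weights @ hidden))  # output squashed to (0, 1)

truth = 1.0                                # ground-truth label for this input
error = (guess - truth) ** 2               # this difference drives the updates
```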

While deep learning is a complicated process involving matrix algebra, derivatives, probability and intensive hardware utilization as large matrices of coefficients are modified, the end user does not need to be exposed to all the complexity.

Computing power has grown with the advent of GPUs, which speed up matrix operations, and with larger distributed computing frameworks, making it possible to train neural nets faster and to iterate quickly through many combinations of hyperparameters in search of the right architecture.

Finally, advances in how we understand and build the neural network algorithms have resulted in neural networks consistently setting new accuracy records in competitions for computer vision, speech recognition, machine translation and many other machine perception and goal-oriented tasks.

In a convolutional network, the first filters learn to recognize simple visual elements such as lines; those lines are known as features, and as the filters pass over the image, they construct feature maps that locate each kind of line wherever it occurs in the image.

Convolutional networks have proven very useful in the field of image and video recognition (and because sound can be represented visually in the form of a spectrogram, convolutional networks are widely used for voice recognition and machine transcription tasks as well).

While a convolutional neural network steps through overlapping sections of the image and trains by learning to recognize features in each section, a feed-forward network trains on the complete image.

A feed-forward network trained on images where a feature always appears in a particular position or orientation may fail to recognize that feature when it shows up in an uncommon position, while a well-trained convolutional network would recognize it.
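To make feature maps concrete, here is a small sketch (the toy image and filter values are assumptions) in which a single filter slides over an image containing a vertical line; the large-magnitude entries of the resulting feature map mark where that line occurs:

```python
import torch
import torch.nn.functional as F

# A toy 6x6 grayscale image with a bright vertical line in column 2
# (batch and channel dimensions added for conv2d).
image = torch.zeros(1, 1, 6, 6)
image[..., 2] = 1.0

# A hand-made 2x2 filter that responds to vertical edges.
vertical_filter = torch.tensor([[[[-1.0, 1.0],
                                  [-1.0, 1.0]]]])

# Sliding the filter over the image yields a feature map; its
# large-magnitude entries locate the vertical line.
feature_map = F.conv2d(image, vertical_filter)
print(feature_map.squeeze())
```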

Unlike feed-forward neural networks, the hidden-layer nodes of a recurrent neural network maintain an internal state, a memory, that is updated as the network is fed new input.

Although remembering earlier events might be possible with a typical feed-forward network that ingests a window of events and subsequently moves that window through time, such an approach would limit us to dependencies captured within the window, and the solution would not be flexible.
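At its core, that memory is a simple recurrence. Here is a minimal sketch (the dimensions and weights are arbitrary assumptions) of the hidden state folding each new time step into what it has already seen:

```python
import numpy as np

np.random.seed(0)
W_x = np.random.randn(8, 4)        # weighs the new input at each step
W_h = np.random.randn(8, 8)        # weighs the memory carried forward
h = np.zeros(8)                    # internal state starts empty

sequence = np.random.randn(5, 4)   # five time steps of 4 features each
for x_t in sequence:
    # Each step mixes new input with the prior state, so h comes to
    # summarize everything the network has seen so far.
    h = np.tanh(W_x @ x_t + W_h @ h)
```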

The internet has many examples of RNNs that generate text one character at a time, after being trained on a corpus of text to predict the next letter given what has gone before.

Just as a character generator understands the structure of data well enough to generate a simulacrum of it, an RNN used for anomaly detection understands the structure of the data well enough to know whether what it is fed looks normal or not.

Suppose we wanted to detect network anomalies, with the understanding that an anomaly might point to hardware failure, application failure, or an intrusion. An RNN trained on normal network activity would perceive a network intrusion to be as anomalous as a sentence without punctuation.

By feeding a large volume of network activity logs to the RNN, with each log line a time step, we let the neural net learn what normal, expected network activity looks like.
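Concretely, that means parsing each log line into a fixed-length numeric vector and stacking consecutive lines into training sequences. The log format and chosen fields below are hypothetical, for illustration only:

```python
import numpy as np

def featurize(line: str) -> np.ndarray:
    """Turn one log line into a numeric vector (hypothetical format).

    Example line: "1499 GET /index 200 0.12" -> bytes, status, latency.
    """
    parts = line.split()
    return np.array([float(parts[0]), float(parts[3]), float(parts[4])])

def to_sequences(lines, window=20):
    """Stack consecutive log lines into (window, n_features) sequences,
    one time step per log line."""
    vectors = np.stack([featurize(line) for line in lines])
    return np.stack([vectors[i:i + window]
                     for i in range(len(vectors) - window + 1)])
```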

Training a neural net to recognize expected behavior has an advantage: abnormal data is rare, and there is seldom enough of it to accurately characterize all abnormal behavior.

As an aside, the trained network does not necessarily note that certain activities happen at certain times (it does not know that a particular day is Sunday), but it does notice those more obvious temporal patterns we would be aware of, along with other connections between events that might not be apparent.

Running training on GPUs will lead to a significant decrease in training time, especially for image recognition, but additional hardware comes with additional cost, so it’s important that your deep-learning framework use hardware as efficiently as possible.

When performing network anomaly detection in production, log files need to be serialized into the same format the model was trained on; based on the output of the neural network, you would get reports on whether the current activity falls within the range of normal, expected network behavior.
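That production loop might look like the sketch below, where model_predict, the sequence format, and the threshold (presumably calibrated on held-out normal traffic) are all assumptions:

```python
import numpy as np

def anomaly_report(model_predict, sequences, threshold=0.05):
    """Score each serialized log sequence by the net's prediction error
    and flag those that fall outside the range of normal behavior.

    model_predict: function mapping a sequence's first n-1 steps to the
    net's guesses for the following steps (assumed interface).
    """
    reports = []
    for seq in sequences:
        predicted = model_predict(seq[:-1])              # net's next-step guesses
        error = float(np.mean((predicted - seq[1:]) ** 2))
        reports.append({"error": error, "anomalous": error > threshold})
    return reports
```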

The configuration of a recurrent neural network might look something like the example below, and a few of its important lines are worth describing. Setting a random seed fixes how the neural net's weights are initialized, in order to obtain reproducible results.
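A minimal sketch, assuming the PyTorch API; the layer sizes, seed value, and learning rate are illustrative assumptions rather than values from the original listing:

```python
import torch
import torch.nn as nn

torch.manual_seed(12345)  # seed weight initialization for reproducible results

class LogAnomalyRNN(nn.Module):
    def __init__(self, n_features: int, hidden_size: int = 64):
        super().__init__()
        # The LSTM hidden layer maintains the internal state (memory)
        # that is updated at every time step.
        self.lstm = nn.LSTM(input_size=n_features,
                            hidden_size=hidden_size,
                            batch_first=True)
        # A linear output layer predicts the next time step's feature vector.
        self.head = nn.Linear(hidden_size, n_features)

    def forward(self, x):          # x: (batch, time, n_features)
        out, _ = self.lstm(x)      # out: (batch, time, hidden_size)
        return self.head(out)      # a prediction for each time step

model = LogAnomalyRNN(n_features=16)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # stochastic gradient descent
loss_fn = nn.MSELoss()  # error between the net's guess and the actual next step
```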

When using stochastic gradient descent, the error gradient (that is, the relation of a change in coefficients to a change in the net’s error) is calculated and the weights are moved along this gradient in an attempt to move the error towards a minimum.
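A self-contained toy sketch of one such step (the numbers are arbitrary assumptions): compute the error, take its gradient with respect to the coefficients, and move the weights a small distance along it:

```python
import torch

w = torch.randn(3, requires_grad=True)   # coefficients
x = torch.tensor([1.0, 2.0, 3.0])        # one input example
y = torch.tensor(10.0)                   # its true answer

error = (w @ x - y) ** 2                 # squared error of the net's guess
error.backward()                         # gradient of error w.r.t. each coefficient
with torch.no_grad():
    w -= 0.01 * w.grad                   # step along the gradient toward lower error
```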
