AI News, Auto-Generating Clickbait With Recurrent Neural Networks
- On Sunday, September 30, 2018
Auto-Generating Clickbait With Recurrent Neural Networks
What if we could automate the writing of clickbait headlines, thus freeing up clickbait writers to do useful work? If this sort of writing truly is formulaic and unoriginal, we should be able to produce it automatically.
Recently, as people have figured out how to train deep (multi-layered) neural nets, very powerful models have been created, increasing the hype surrounding this so-called deep learning.
So, given a string of words like “Which Disney Character Are __”, we want the network to produce a reasonable guess like “You”, rather than, say, “Spreadsheet”. If this model can learn to predict the next word with some accuracy, we get a language model that tells us something about the texts we trained it on.
If we ask this model to guess the next word, and then add that word to the sequence and ask it for the next word after that, and so on, we can generate text of arbitrary length.
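The generation loop described above can be sketched roughly as follows. This is only an illustration: a hypothetical bigram count table stands in for the trained RNN, and the toy headlines and function names are invented for the example.

```python
import random
from collections import Counter, defaultdict

# Toy corpus standing in for the real training headlines (hypothetical data).
HEADLINES = [
    "which disney character are you",
    "which disney princess are you",
    "which pokemon are you",
]

def train_bigram_model(headlines):
    """Count which word follows which; a crude stand-in for the trained RNN."""
    counts = defaultdict(Counter)
    for line in headlines:
        words = line.split() + ["<end>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(model, words, rng):
    """Sample the next word in proportion to how often it followed the last word."""
    options = model[words[-1]]
    if not options:
        return "<end>"
    candidates = sorted(options)
    weights = [options[w] for w in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def generate(model, seed, max_words=10, rng=None):
    """Predict a word, append it, and feed the longer sequence back in."""
    rng = rng or random.Random(0)
    words = seed.split()
    while len(words) < max_words:
        nxt = predict_next(model, words, rng)
        if nxt == "<end>":
            break
        words.append(nxt)
    return " ".join(words)

model = train_bigram_model(HEADLINES)
print(generate(model, "which disney character"))
```

A real RNN conditions on the whole sequence rather than only the last word, but the outer loop (predict, append, repeat) has the same shape.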
During training, we tweak the weights of this network so as to minimize the prediction error, maximizing its ability to guess the right next word.
The hope is that having a continuous rather than discrete representation for words will allow the network to make better mistakes, as long as similar words get similar vectors.
Whereas traditional neural nets are built around stacks of simple units that do a weighted sum followed by some simple non-linear function (like a tanh), we’ll use a more complicated unit called Long Short-Term Memory (LSTM).
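To make the contrast concrete, here is a minimal sketch of both kinds of unit in plain Python. It assumes the standard LSTM formulation with input, forget, and output gates, and it uses scalar inputs and state for brevity; the parameter layout (`params` as a dict of tuples) is a choice made for this example, not something taken from the article.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def simple_unit(inputs, weights, bias):
    """Traditional unit: weighted sum followed by tanh."""
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)

def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-cell LSTM with scalar input and state.

    params maps each gate name to (input weight, recurrent weight, bias).
    """
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x + w_h * h_prev + b)

    i = gate("input", sigmoid)        # how much new information to write
    f = gate("forget", sigmoid)       # how much old memory to keep
    o = gate("output", sigmoid)       # how much memory to expose
    g = gate("candidate", math.tanh)  # proposed new memory content

    c = f * c_prev + i * g            # updated cell state (the memory)
    h = o * math.tanh(c)              # updated hidden state (the output)
    return h, c

# Run the cell over a short sequence with arbitrary illustrative weights.
params = {name: (0.5, 0.1, 0.0) for name in ("input", "forget", "output", "candidate")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

The separate cell state `c` is what lets the unit carry information across many time steps, which a plain tanh unit has no mechanism for.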
Even if it can learn to generate text with correct syntax and grammar, it surely can’t produce headlines that contain any new knowledge of the real world?
It’s not clear that these headlines are much more than a semi-random concatenation of topics their userbase likes, and as seen in the latter case, 100% correct grammar is not a requirement.
This is what it produces after having seen about 40,000 headlines:
However, after multiple passes through the data, the training converges and the results are remarkably better.
Here are the first 10 completions of “Barack Obama Says”:
And here are the first 10 completions of “Kim Kardashian Says”:
By getting the RNN to complete our sentences, we can effectively ask questions of the model.
During training, we can follow the gradient down into these word vectors and fine-tune the vector representations specifically for the task of generating clickbait, thus further improving the generalization accuracy of the complete model.
It turns out that if we then take the word vectors learned from this model of 2 recurrent layers, and stick them in an architecture with 3 recurrent layers, and then freeze them, we get even better performance.
To summarize the word vector story: Initially, some good guys at Stanford invented GloVe, ran it over 6 billion tokens, and got a bunch of vectors.
I found this to be a Big Deal: It cut the training time almost in half, and found better optima, compared to using rmsprop with exponential decay.
It’s possible that similar results could be obtained with rmsprop had I found a better learning rate and decay rate, but I’m very happy not having to do that tuning.
In practice, this can look like the following in PostgreSQL:
The articles are the result of three separate language models: one for the headlines, one for the article bodies, and one for the author name.
The article body neural network was seeded with the words from the headline, so that the body text has a chance to be thematically consistent with the headline.
If I remember correctly from economics class, this should drive the market value of useless journalism down to zero, forcing other producers of useless journalism to produce something else.
Getting started with the Keras functional API
The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.
The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc.
At this point, we feed into the model our auxiliary input data by concatenating it with the LSTM output:
This defines a model with two inputs and two outputs:
We compile the model and assign a weight of 0.2 to the auxiliary loss.
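The effect of weighting the auxiliary loss by 0.2 can be sketched in plain Python. This is not Keras code: it assumes both outputs are scored with binary cross-entropy and that the main loss keeps a weight of 1.0, neither of which is stated in the text above, and the function names are invented for the illustration.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def combined_loss(y_true, main_pred, aux_pred, main_weight=1.0, aux_weight=0.2):
    """Weighted sum of the losses from the two outputs."""
    main_loss = binary_crossentropy(y_true, main_pred)
    aux_loss = binary_crossentropy(y_true, aux_pred)
    return main_weight * main_loss + aux_weight * aux_loss

# Hypothetical labels and predictions from the two output heads.
labels = [1, 0, 1]
main_predictions = [0.9, 0.2, 0.7]
aux_predictions = [0.6, 0.4, 0.5]
print(combined_loss(labels, main_predictions, aux_predictions))
```

With a small auxiliary weight, errors on the auxiliary output still shape the shared layers, but the main output dominates the gradient.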
The input is a sequence of 280 vectors of size 256, where each dimension in the 256-dimensional vector encodes the presence or absence of a character (out of an alphabet of 256 frequent characters).
To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want:
Let's pause to take a look at how to read the shared layer's output or output shape.
Whenever you are calling a layer on some input, you are creating a new tensor (the output of the layer), and you are adding a 'node' to the layer, linking the input tensor to the output tensor.
The same is true for the properties input_shape and output_shape: as long as the layer has only one node, or as long as all nodes have the same input/output shape, then the notion of 'layer output/input shape' is well defined, and that one shape will be returned by layer.output_shape/layer.input_shape.
But if, for instance, you apply the same Conv2D layer to an input of shape (32, 32, 3), and then to an input of shape (64, 64, 3), the layer will have multiple input/output shapes, and you will have to fetch them by specifying the index of the node they belong to:
Code examples are still the best way to get started, so here are a few more.
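As one small illustration of the node bookkeeping described above, here is a toy class in plain Python. It is not the Keras implementation: `ToyLayer` and `output_shape_at` are hypothetical names, and the shape rule assumes a convolution that keeps height and width and replaces the channel count.

```python
class ToyLayer:
    """Toy stand-in for a shareable layer; this is NOT the Keras implementation."""

    def __init__(self, filters):
        self.filters = filters
        self.nodes = []  # one (input_shape, output_shape) pair per call

    def __call__(self, input_shape):
        # Assumed shape rule: height and width kept, channels replaced by filters.
        height, width, _channels = input_shape
        output_shape = (height, width, self.filters)
        self.nodes.append((input_shape, output_shape))
        return output_shape

    @property
    def output_shape(self):
        """Well defined only if every node agrees on the output shape."""
        shapes = {out for _inp, out in self.nodes}
        if len(shapes) != 1:
            raise AttributeError("layer has multiple output shapes; ask by node index")
        return shapes.pop()

    def output_shape_at(self, node_index):
        """Output shape recorded for one specific call of the layer."""
        return self.nodes[node_index][1]

conv = ToyLayer(filters=16)
conv((32, 32, 3))
print(conv.output_shape)        # one node, so the shape is unambiguous
conv((64, 64, 3))
print(conv.output_shape_at(0))  # two nodes now, so ask by index
print(conv.output_shape_at(1))
```

Each call adds a node, so once the layer has been applied to inputs of different shapes, only the indexed lookup has a single answer.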
Going deeper with recurrent networks: Sequence to Bag of Words Model
Harnessing the vast data troves of the digital world can help us understand people more directly, going beyond the limitations of collecting data points through measurements and survey results.
For speed, we combined the dot products between the RNN projection and the positive word and the negative words into a single training sample, using a softmax layer that predicts only the positive word, instead of training separate embedding pairs as in Word2vec.
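A rough sketch of that training objective in plain Python: the dot products with one positive word and several negative words are scored together, and a softmax cross-entropy rewards picking the positive one. The vectors and function names are hypothetical, and the actual model's details (how negatives are sampled, how many are used) are not given here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sampled_softmax_loss(projection, positive_vec, negative_vecs):
    """Cross-entropy for picking the positive word out of [positive] + negatives."""
    scores = [dot(projection, positive_vec)]
    scores += [dot(projection, neg) for neg in negative_vecs]
    probs = softmax(scores)
    return -math.log(probs[0])  # index 0 is the positive word

# Hypothetical 2-dimensional vectors for illustration.
rnn_projection = [1.0, 0.0]
positive_word = [1.0, 0.0]
negative_words = [[0.0, 1.0], [0.0, 1.0]]
print(sampled_softmax_loss(rnn_projection, positive_word, negative_words))
```

The loss falls as the projection aligns with the positive word's vector and moves away from the negatives, all within one training sample.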
Using our interest vocabulary trained with Word2vec, we can enter any interest keyword, look up its vector and find all the words belonging to the closest vectors:
Marketing Content: Content generation, Corporate blogging, Content syndication, Bylined articles, Social media strategy, Online content creation, Content curation, Content production
Juggling: Roller skating, Ventriloquism, Circus arts, Unicycle, Street dance, Swing dance, Comedic timing, Acrobatics
Brain Surgery: Medical research, Neurocritical care, Skull base surgery, Endocrine surgery, Brain tumors, Medical education, Pediatric cardiology, Hepatobiliary surgery
With the Seq2BoW title model, we can find related interests, given any title:
Marketing Analytics: Marketing mix modeling, Adobe insight, Lifetime value, Attribution modeling, Customer analysis, Spss clementine, Data segmentation, Spss modeler
Data Engineer: Spark, Apache Pig, Hive, Pandas, Map Reduce, Apache Spark, Octave, Vertica
Winemaker: Viticulture, Winemaking, Wineries, Red wine, Wine tasting, Food pairing, Champagne, Beer
We can create a separate title vocabulary by computing and storing the vectors for the most frequent titles. Then, we can query among these vectors to find related titles:
CEO: Chairman, General Partner, Chief Executive, Coo, President, Founder/Ceo, President/Ceo, Board Member
Dishwasher: Crew Member, Crew, Kitchen Staff, Busser, Barback, Shift Leader, Carhop, Sandwich Artist
Code Monkey: Senior Software Development Engineer, Lead Software Developer, Senior Software Engineer II, Software Designer, Software Engineer III, Lead Software Engineer, Technical Principal, Lead Software Development Engineer
We can also find titles near any interest:
Cold Calling: Account management, Sales presentations, Direct sales, Sales process, Sales operations, Outside sales, Sales, Sales management
Baking: Chef Instructor, Culinary Arts Instructor, Culinary Instructor, Baker, Head Baker, Pastry Chef, Pastry, Assistant Pastry Chef
Neural Networks: Senior Data Scientist, Principal Data Scientist, Machine Learning, Data Scientist, Algorithm Engineer, Quantitative Researcher, Research Programmer, Lead Scientist
We can extend beyond relating interests and titles and add various inputs or outputs to the Seq2BoW model.
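The lookups above amount to a nearest-neighbor search over stored vectors. Here is a minimal sketch, assuming cosine similarity as the closeness measure (the text does not say which measure was used) and using made-up 3-dimensional vectors in place of the learned embeddings.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query, vocabulary, top_k=3):
    """Return the top_k vocabulary entries closest to the query word's vector."""
    query_vec = vocabulary[query]
    scored = [
        (cosine_similarity(query_vec, vec), word)
        for word, vec in vocabulary.items()
        if word != query
    ]
    scored.sort(reverse=True)
    return [word for _score, word in scored[:top_k]]

# Hypothetical 3-dimensional vectors; real embeddings have far more dimensions.
VOCABULARY = {
    "Winemaker": [0.9, 0.1, 0.0],
    "Viticulture": [0.8, 0.2, 0.1],
    "Wine tasting": [0.7, 0.3, 0.0],
    "Spark": [0.0, 0.1, 0.9],
    "Hive": [0.1, 0.0, 0.8],
}
print(nearest("Winemaker", VOCABULARY, top_k=2))
```

Because titles and interests live in the same vector space, the same function serves for title-to-interest, title-to-title, and interest-to-title queries; only the vocabulary being searched changes.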
- On Sunday, March 24, 2019
Lecture 2 | Word Vector Representations: word2vec
Lecture 2 continues the discussion on the concept of representing words as numeric vectors and popular approaches to designing word vectors. Key phrases: ...
How to Make a Text Summarizer - Intro to Deep Learning #10
I'll show you how you can turn an article into a one-sentence summary in Python with the Keras machine learning library. We'll go over word embeddings, ...
Lecture 6: Dependency Parsing
Lecture 6 covers dependency parsing which is the task of analyzing the syntactic dependency structure of a given input sentence S. The output of a dependency ...
Lecture 9: Machine Translation and Advanced Recurrent LSTMs and GRUs
Lecture 9 recaps the most important concepts and equations covered so far followed by machine translation and fancy RNN models tackling MT. Key phrases: ...
How Publishers Can Take Advantage of Machine Learning (Cloud Next '18)
Hearst Newspapers uses Google Cloud Machine Learning infrastructure to automate and create value in the newspaper business. A recent case study has been ...
We Were There - Hantavirus
Twenty-five years ago, a new and deadly type of Hantavirus swept through parts of southwestern U.S. Join us to hear fascinating stories about the discovery of ...
The Human Brain as the Center of Our Universe – Miguel Nicolelis | Vototalks Festival 2018
For more information, visit the event page. The Vototalks Festival is a collaborative online event with talks and debates about ..
Crypto Defenses for Real-World System Threats - Kenn White - Ann Arbor
Modern encryption techniques provide several important security properties, well known to most practitioners. Or are they? What are in fact the guarantees of, ...
Keynote (Google I/O '18)
Learn about the latest product and platform innovations at Google in a Keynote led by Sundar Pichai. This video is also subtitled in Chinese, Indonesian, Italian, ...