AI News, Catching Star Wars surprises and other spoilers with Machine Learning

Like many postdocs and grad students, when I wasn’t trying to discover the basic laws of matter (i.e., debugging my code), I spent a lot of time surfing the Internet.

I chose Tumblr for three reasons:

To explain what I mean by 3, here is what a Star Wars: The Force Awakens post might look like on Tumblr, with the spoiler text redacted:

[Image: example Tumblr post with the spoiler text redacted]

Despite posting spoilers, this Tumblr user has followed an important informal community rule: in the hashtags below the post, they’ve added “#star wars spoilers” and “#tfa spoilers”.

This allows other users to avoid spoilers, usually via a Chrome Extension like Tumblr Savior that can block pages containing user-specified phrases (in this case, obviously, “star wars spoilers” and variants).

My goal was to find a set of features which could best describe these spoiler-labeled posts, and then to train a model to pick up the “spoileriness” of unlabeled posts much like the above one.

Using the scikit-learn Python library, I made my first set of features by calculating the frequency of each of the 500 most common words in the body of the posts (also weighted by the words’ inverse frequencies in the dataset, to reduce the importance of words like “Jedi” appearing in nearly every post in both classes).
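As a rough sketch of this kind of feature extraction, the snippet below uses scikit-learn's `TfidfVectorizer` capped at 500 terms. The toy posts are invented for illustration, and the article does not say which vectorizer settings were actually used, so treat the parameters as assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented stand-ins for post bodies; the real dataset is not shown in the article.
posts = [
    "the jedi returns and kylo does something shocking",
    "jedi fan art of rey and finn",
    "cannot believe what happens to han in the jedi temple",
    "new jedi poster looks great",
]

# max_features keeps only the most frequent terms in the corpus (500 in the article).
# Each term frequency is then scaled by the term's inverse document frequency.
vectorizer = TfidfVectorizer(max_features=500)
X = vectorizer.fit_transform(posts)  # sparse matrix: one row per post, one column per term

vocab = vectorizer.vocabulary_  # maps term -> column index
idf = vectorizer.idf_           # learned inverse-document-frequency weight per column

# "jedi" appears in every post, so it gets a lower weight than a rarer word like "kylo".
print(idf[vocab["jedi"]], idf[vocab["kylo"]])
```

The down-weighting is the point: a word that shows up in nearly every post, spoiler or not, tells the model very little.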

To validate and test the model’s performance, I held back a sample of both spoiler and non-spoiler posts, with all spoiler labeling removed. In other words, I made my own unmarked spoilers.
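One way such a hold-out could be built is sketched below, using only the standard library. The post structure (`text`, `tags`, `is_spoiler`), the 25% test fraction, and both helper names are hypothetical; the article does not describe how the split was actually made.

```python
import random

def strip_spoiler_tags(tags):
    """Drop any hashtag that mentions 'spoiler', so the label cannot leak into the features."""
    return [t for t in tags if "spoiler" not in t.lower()]

def stratified_holdout(posts, test_fraction=0.25, seed=0):
    """Split posts into (train, test), holding back the same fraction of each class."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (True, False):
        group = [p for p in posts if p["is_spoiler"] == label]
        rng.shuffle(group)
        n_test = int(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Invented posts: 8 tagged as spoilers, 12 not.
posts = (
    [{"text": f"spoiler post {i}", "tags": ["tfa spoilers", "star wars"], "is_spoiler": True}
     for i in range(8)]
    + [{"text": f"other post {i}", "tags": ["star wars"], "is_spoiler": False}
       for i in range(12)]
)

train, test = stratified_holdout(posts)

# Held-out posts keep their true label for scoring but lose the visible spoiler tags.
unmarked = [{**p, "tags": strip_spoiler_tags(p["tags"])} for p in test]
```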

By selecting posts using various cutoff values for this predictor, here’s what the resulting True Positive and False Positive rates look like for the test sample:

[Figure: True Positive rate versus False Positive rate on the test sample]

For a cutoff which picks up 80% of the original spoilers (the true positives), the algorithm mis-identifies 25% of the non-spoiler posts as spoilers (the false positives).
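To make the cutoff trade-off concrete, here is a minimal sketch that computes both rates for a given cutoff. The scores and labels are toy values, deliberately chosen so that one cutoff reproduces the 80% / 25% figures quoted above; they are not the article's data.

```python
def rates_at_cutoff(scores, labels, cutoff):
    """Return (true positive rate, false positive rate) when posts scoring >= cutoff are flagged."""
    flagged = [s >= cutoff for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y)
    fp = sum(1 for f, y in zip(flagged, labels) if f and not y)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives

# Toy model scores (higher = more spoiler-like) with their true labels (1 = spoiler).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0]

for cutoff in (0.75, 0.5, 0.25):
    tpr, fpr = rates_at_cutoff(scores, labels, cutoff)
    print(f"cutoff={cutoff:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

Lowering the cutoff catches more spoilers but also flags more harmless posts, which is exactly the curve traced out by sweeping the cutoff.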

You’ll get back a page of search results, with links to individual blog posts, the date of the posting, a column saying whether or not the post was tagged as a spoiler, and the results of FanGuard, i.e.

In the drop-down menu, you have access to a spoiler filter not just for Star Wars, but for seven other movies, TV shows, and games taken from lists of the most popular reblogged content on Tumblr in 2015.

The “How careful should I be?” buttons give different levels of filtering, catching 60%, 80% (default), and 90% of spoilers, but with increasing false positive rates.
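A filtering level like this can be thought of as a score cutoff chosen to hit a target catch rate. The sketch below picks, for each target, the highest cutoff that still flags that share of known spoilers, then reports the false positive rate that comes with it. The function names and all scores are invented; how FanGuard actually maps buttons to cutoffs is not described in the article.

```python
import math

def cutoff_for_catch_rate(spoiler_scores, target):
    """Return the highest cutoff that still flags at least `target` of the known spoilers."""
    ranked = sorted(spoiler_scores, reverse=True)
    # round() guards against floating-point noise in the product before taking the ceiling.
    needed = max(1, math.ceil(round(target * len(ranked), 9)))
    return ranked[needed - 1]

def false_positive_rate(non_spoiler_scores, cutoff):
    """Fraction of non-spoiler posts that would be flagged at this cutoff."""
    return sum(s >= cutoff for s in non_spoiler_scores) / len(non_spoiler_scores)

# Made-up validation scores (higher = more spoiler-like).
spoiler_scores = [0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.1]
non_spoiler_scores = [0.7, 0.5, 0.35, 0.2, 0.1, 0.05, 0.02, 0.01]

levels = {}
for target in (0.6, 0.8, 0.9):  # the three catch rates offered by the buttons
    cutoff = cutoff_for_catch_rate(spoiler_scores, target)
    levels[target] = (cutoff, false_positive_rate(non_spoiler_scores, cutoff))
    print(f"catch {target:.0%}: cutoff={cutoff}, false positive rate={levels[target][1]:.3f}")
```

On this toy data, each stricter setting lowers the cutoff and raises the false positive rate, mirroring the trade-off the buttons expose.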

First, the most predictive feature of a spoiler by far is the total number of words in a post, with spoiler posts being on average over twice as long as non-spoiler posts.

Here is a graph of the most important features appearing in at least three of the models (the variable averaged on the y-axis is a numerical measure of the classification power of a feature in the Random Forest):

[Figure: average importance of the features shared by at least three models]

When they talk about spoilers, Tumblr users seem to be employing a common underlying vocabulary and grammar, regardless of the movie, TV show, or game.
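The aggregation behind such a graph might look like the sketch below: collect each model's per-feature importance, keep the features that appear in at least three models, and average them. All model names, feature names, and numbers here are invented for illustration. If the models were scikit-learn Random Forests, the raw values would typically come from the fitted estimator's `feature_importances_` attribute, but that is an assumption.

```python
from collections import defaultdict

def shared_feature_importances(per_model, min_models=3):
    """Average each feature's importance over the models that use it,
    keeping only features that appear in at least `min_models` models."""
    collected = defaultdict(list)
    for importances in per_model.values():
        for feature, value in importances.items():
            collected[feature].append(value)
    averaged = {
        feature: sum(values) / len(values)
        for feature, values in collected.items()
        if len(values) >= min_models
    }
    # Sort from most to least important for plotting.
    return dict(sorted(averaged.items(), key=lambda kv: kv[1], reverse=True))

# Hypothetical importances from three per-title models (values are made up).
per_model = {
    "movie_a": {"word_count": 0.30, "dies": 0.05, "jedi": 0.02},
    "show_b": {"word_count": 0.25, "dies": 0.04},
    "game_c": {"word_count": 0.35, "dies": 0.06, "boss": 0.03},
}

shared = shared_feature_importances(per_model)
print(shared)
```

Title-specific words such as the hypothetical "jedi" or "boss" drop out because they appear in only one model, leaving the vocabulary the models have in common.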
