# AI News, Classifying Handwritten Digits with TF.Learn - Machine Learning Recipes #7

## Classifying Handwritten Digits with TF.Learn - Machine Learning Recipes #7

I have a couple alternate ways of introducing them that I think would be helpful (and I put some exceptional links below for you to check out to learn more, esp.

Finally, I’ll show you how to reproduce those nifty images of weights from TensorFlow.org's Basic MNIST tutorial.

- Jupyter Notebook: https://goo.gl/NNlMNu
- Docker images: https://goo.gl/8fmqVW
- MNIST tutorial: https://goo.gl/GQ3t7n
- Visualizing MNIST: http://goo.gl/ROcwpR (this blog is outstanding)
- More notebooks: https://goo.gl/GgLIh7
- More about linear classifiers: https://goo.gl/u2f2NE
- Much more about linear classifiers: http://goo.gl/au1PdG (this course is outstanding, highly recommended)
- More TF.Learn examples: https://goo.gl/szki63

Thanks for watching, and have fun!

## Wiki: Lesson 1

I am building a simple human race image classifier. I have created a folder called ~/data/people_original containing three folders called caucasian, african and asian, and populated each one by running the command google-images-download download 'african man' 'african woman' --keywords '' --download-limit 100 from within it.

I was wondering if anyone has any munging code that could be repurposed for splitting these images into the required folder structure that is present in the dogscats folder i.e.
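A minimal sketch of such munging code, using only the Python standard library. It assumes the target layout is `train/<class>/` and `valid/<class>/` (an assumption about the dogscats folder, since the post is cut off), and the function name `split_into_train_valid` and the 20% validation fraction are hypothetical choices for illustration.

```python
import random
import shutil
from pathlib import Path


def split_into_train_valid(src, dest, valid_fraction=0.2, seed=0):
    """Copy src/<class>/* into dest/train/<class>/ and dest/valid/<class>/.

    Returns a dict mapping class name -> (n_train, n_valid).
    The train/valid layout is an assumption, not taken from the post.
    """
    src, dest = Path(src), Path(dest)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_fraction)
        for i, f in enumerate(files):
            split = "valid" if i < n_valid else "train"
            target = dest / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target / f.name)  # copy, so the originals stay intact
        counts[class_dir.name] = (len(files) - n_valid, n_valid)
    return counts


# Example call (paths are illustrative):
# split_into_train_valid("~/data/people_original", "~/data/people")
```

Copying rather than moving keeps the downloaded originals untouched, so the split can be redone with a different fraction or seed.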

## Deep Reinforcement Learning: Pong from Pixels

You may have noticed that computers can now automatically learn to play ATARI games (from raw game pixels!), they are beating world champions at Go, simulated quadrupeds are learning to run and leap, and robots are learning how to perform complex manipulation tasks that defy explicit programming.

I also became interested in RL myself over the last ~year: I worked through Richard Sutton’s book, read through David Silver’s course, watched John Schulman’s lectures, wrote an RL library in JavaScript, over the summer interned at DeepMind working in the DeepRL group, and most recently pitched in a little with the design/development of OpenAI Gym, a new RL benchmarking toolkit.

Of course, it takes a lot of skill and patience to get it to work, and multiple clever tweaks on top of old algorithms have been developed, but to a first-order approximation the main driver of recent progress is not the algorithms but (similar to Computer Vision) compute/data/infrastructure.

In this case I’ve seen many people who can’t believe that we can automatically learn to play most ATARI games at human level, with one algorithm, from pixels, and from scratch - and it is amazing, and I’ve been there myself!

Anyway, as a running example we’ll learn to play an ATARI game (Pong!) with PG, from scratch, from pixels, with a deep neural network, and the whole thing is 130 lines of Python only using numpy as a dependency (Gist link).

On the low level the game works as follows: we receive an image frame (a 210x160x3 byte array (integers from 0 to 255 giving pixel values)) and we get to decide if we want to move the paddle UP or DOWN (i.e.

After every single choice the game simulator executes the action and gives us a reward: Either a +1 reward if the ball went past the opponent, a -1 reward if we missed the ball, or 0 otherwise.

As our favorite simple block of compute we’ll use a 2-layer neural network that takes the raw image pixels (100,800 numbers total (210*160*3)), and produces a single number indicating the probability of going UP.
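As a rough sketch of what such a network could look like in numpy: the 200 hidden units and the ReLU are mentioned later in this post, while the sigmoid output, the weight initialization, float32 storage, and the names `policy_forward` and `model` are assumptions made here for illustration.

```python
import numpy as np

D = 210 * 160 * 3  # raw input size from the text: 100,800 numbers
H = 200            # hidden layer units

rng = np.random.default_rng(0)
model = {
    # Scaled random initialization (an assumed choice); float32 keeps memory modest.
    "W1": rng.standard_normal((H, D), dtype=np.float32) / D ** 0.5,
    "W2": rng.standard_normal(H, dtype=np.float32) / H ** 0.5,
}


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def policy_forward(x, model):
    """Map a flattened frame to the probability of going UP."""
    h = np.maximum(0.0, model["W1"] @ x)  # hidden layer with ReLU
    logit = model["W2"] @ h               # single output number
    return sigmoid(logit), h


# A random stand-in for one game frame, scaled to [0, 1].
frame = rng.integers(0, 256, size=(210, 160, 3)).astype(np.float32) / 255.0
p_up, h = policy_forward(frame.ravel(), model)

# The policy is stochastic: sample the action instead of taking the argmax.
action = "UP" if rng.random() < p_up else "DOWN"
```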

I’m showing log probabilities (-1.2, -0.36) for UP and DOWN instead of the raw probabilities (30% and 70% in this case) because we always optimize the log probability of the correct label (this makes math nicer, and is equivalent to optimizing the raw probability because log is monotonic).

At this point notice one interesting fact: We could immediately fill in a gradient of 1.0 for DOWN as we did in supervised learning, and find the gradient vector that would encourage the network to be slightly more likely to do the DOWN action in the future.

So if we fill in -1 for log probability of DOWN and do backprop we will find a gradient that discourages the network to take the DOWN action for that input in the future (and rightly so, since taking that action led to us losing the game).
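To make “filling in” a gradient concrete, here is a small sketch that assumes the network ends in a sigmoid over a single logit. Under that assumption the gradient of the log probability of the sampled action with respect to the logit is `y - p_up`, and the +1 or -1 simply scales it. The function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_prob(logit, action_up):
    """Log probability of the sampled action under the policy."""
    p_up = sigmoid(logit)
    return np.log(p_up if action_up else 1.0 - p_up)


def grad_logit(logit, action_up, outcome):
    """outcome * d/dlogit log p(action); y is 1 for UP and 0 for DOWN."""
    y = 1.0 if action_up else 0.0
    return outcome * (y - sigmoid(logit))


logit = np.log(0.3 / 0.7)  # gives p(UP) = 0.3 and p(DOWN) = 0.7

# Filling in +1 for DOWN: negative gradient on the logit, so DOWN becomes more likely.
encourage_down = grad_logit(logit, action_up=False, outcome=+1.0)

# Filling in -1 for DOWN: the sign flips, so DOWN becomes less likely.
discourage_down = grad_logit(logit, action_up=False, outcome=-1.0)
```

Backprop then carries this scalar from the logit down to every weight in the network, exactly as it would carry a supervised gradient.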

And that’s it: we have a stochastic policy that samples actions and then actions that happen to eventually lead to good outcomes get encouraged in the future, and actions taken that lead to bad outcomes get discouraged.

Let’s assume that each game is made up of 200 frames, so in total we’ve made 20,000 decisions for going UP or DOWN, and for each one of these we know the parameter gradient, which tells us how we should change the parameters if we wanted to encourage that decision in that state in the future.

We’ll take all 200*12 = 2400 decisions we made in the winning games and do a positive update (filling in a +1.0 in the gradient for the sampled action, doing backprop, and parameter update encouraging the actions we picked in all those states).
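The bookkeeping here can be sketched in a few lines. The 88 losing games are inferred from the numbers in the text (20,000 decisions at 200 frames per game is 100 games, of which 12 were won), and the variable names are illustrative.

```python
import numpy as np

frames_per_game = 200           # from the text
games_won, games_lost = 12, 88  # 88 is inferred: 20,000 / 200 = 100 games in total

# One outcome per game, then repeated for every decision made in that game.
outcomes = np.array([+1.0] * games_won + [-1.0] * games_lost)
advantages = np.repeat(outcomes, frames_per_game)

n_decisions = advantages.size
n_positive = int((advantages > 0).sum())  # decisions that get encouraged
n_negative = int((advantages < 0).sum())  # decisions that get discouraged
print(n_decisions, n_positive, n_negative)  # 20000 2400 17600
```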

However, when you consider the process over thousands/millions of games, then doing the first bounce correctly makes you slightly more likely to win down the road, so on average you’ll see more positive than negative updates for the correct bounce and your policy will end up doing the right thing.

In my explanation above I use terms such as “fill in the gradient and backprop”, which I realize is a special kind of thinking if you’re used to writing your own backprop code, or using Torch where the gradients are explicit and open for tinkering.

In vanilla supervised learning the objective is to maximize $$\sum_i \log p(y_i \mid x_i)$$ where $$x_i, y_i$$ are training examples (such as images and their labels).

Policy gradients is exactly the same as supervised learning with two minor differences: 1) We don’t have the correct labels $$y_i$$ so as a “fake label” we substitute the action we happened to sample from the policy when it saw $$x_i$$, and 2) We modulate the loss for each example multiplicatively based on the eventual outcome, since we want to increase the log probability for actions that worked and decrease it for those that didn’t.

So in summary our loss now looks like $$\sum_i A_i \log p(y_i \mid x_i)$$, where $$y_i$$ is the action we happened to sample and $$A_i$$ is a number that we call an advantage.
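A minimal sketch of this objective for the two-action Pong case, assuming the policy outputs the probability of UP for each frame. The name `pg_objective` is hypothetical.

```python
import numpy as np


def pg_objective(p_up, actions_up, advantages):
    """Sum over i of A_i * log p(y_i | x_i), where y_i is the sampled action."""
    p_up = np.asarray(p_up, dtype=np.float64)
    # Probability the policy assigned to the action that was actually sampled.
    p_taken = np.where(actions_up, p_up, 1.0 - p_up)
    return float(np.sum(np.asarray(advantages) * np.log(p_taken)))


# Two decisions: UP in a game we won, DOWN in a game we lost.
value = pg_objective(
    p_up=[0.3, 0.3],
    actions_up=[True, False],
    advantages=[+1.0, -1.0],
)
```

Setting every advantage to 1 and using the true labels as the actions recovers the supervised objective from above.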

One common choice is to use a discounted reward, so the “eventual reward” in the diagram above would become $$R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k}$$, where $$\gamma$$ is a number between 0 and 1 called a discount factor (e.g.
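The discounted sum can be computed in one backward pass over an episode. This is a sketch of the formula as written; the default `gamma=0.99` and the name `discount_rewards` are assumptions for illustration.

```python
import numpy as np


def discount_rewards(rewards, gamma=0.99):
    """Return R_t = sum over k of gamma**k * r_{t+k} for every timestep t."""
    rewards = np.asarray(rewards, dtype=np.float64)
    discounted = np.zeros_like(rewards)
    running = 0.0
    # Walk backwards so each R_t reuses R_{t+1}: R_t = r_t + gamma * R_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted


print(discount_rewards([0.0, 0.0, 1.0], gamma=0.5))  # [0.25 0.5  1.  ]
```

Actions taken just before a reward get most of the credit, and earlier actions get exponentially less.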

Hint hint, $$f(x)$$ will become our reward function (or advantage function more generally) and $$p(x)$$ will be our policy network, which is really a model for $$p(a \mid I)$$, giving a distribution over actions for any image $$I$$.
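The identity in question is the standard score function gradient estimator. For reference, a sketch of its derivation, written for a discrete $$x$$ and assuming $$p(x;\theta) > 0$$ wherever we sample:

$$
\begin{aligned}
\nabla_{\theta} E_{x \sim p(x;\theta)}[f(x)] &= \nabla_{\theta} \sum_x p(x;\theta) f(x) \\
&= \sum_x \nabla_{\theta} p(x;\theta) f(x) \\
&= \sum_x p(x;\theta) \frac{\nabla_{\theta} p(x;\theta)}{p(x;\theta)} f(x) \\
&= \sum_x p(x;\theta) \nabla_{\theta} \log p(x;\theta) f(x) \\
&= E_{x \sim p(x;\theta)}[f(x) \nabla_{\theta} \log p(x;\theta)]
\end{aligned}
$$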

In particular, it says that look: draw some samples $$x$$, evaluate their scores $$f(x)$$, and for each $$x$$ also evaluate the second term $$\nabla_{\theta} \log p(x;\theta)$$.

This will make it so that samples with a higher score “tug” on the probability density more strongly than samples with a lower score, so if we were to do an update based on several samples from $$p$$ the probability density would shift around in the direction of higher scores, making highly-scoring samples more likely.

This little piece of math is telling us that the way to change the policy’s parameters is to do some rollouts, take the gradient of the sampled actions, multiply it by the score and add everything, which is what we’ve done above.
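A quick numerical sanity check of this recipe on a toy problem, not on Pong. The setup is an assumption chosen because the answer is known in closed form: $$p$$ is a Gaussian with mean `mu` and unit variance, and the score is $$f(x) = -(x-3)^2$$, so the exact gradient of the expected score with respect to `mu` is $$-2(\mu - 3)$$.

```python
import numpy as np

rng = np.random.default_rng(0)


def f(x):
    """Toy score function: higher is better, best at x = 3."""
    return -(x - 3.0) ** 2


mu, sigma, n = 0.0, 1.0, 200_000

# 1) Draw samples ("rollouts") from the current distribution.
x = rng.normal(mu, sigma, size=n)

# 2) Gradient of log p(x; mu) with respect to mu, for a Gaussian.
grad_log_p = (x - mu) / sigma**2

# 3) Multiply by the score and average.
grad_estimate = float(np.mean(f(x) * grad_log_p))

# Closed form for comparison: E[f] = -((mu - 3)^2 + sigma^2).
grad_exact = -2.0 * (mu - 3.0)
print(grad_estimate, grad_exact)
```

The estimate is noisy for small sample counts, which hints at why policy gradients need many rollouts.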

I trained a 2-layer policy network with 200 hidden layer units using RMSProp on batches of 10 episodes (each episode is a few dozen games, because the games go up to score of 21 for either player).
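For readers unfamiliar with RMSProp, a sketch of one update step follows, shown on a toy one-parameter objective instead of the policy network. The hyperparameter values and the function name are assumptions for illustration; the step is written as ascent because we want to increase the objective.

```python
import numpy as np


def rmsprop_ascent_step(param, grad, cache, lr=1e-2, decay=0.99, eps=1e-5):
    """One RMSProp step that moves param uphill along grad."""
    # Running average of squared gradients, one entry per parameter.
    cache = decay * cache + (1.0 - decay) * grad**2
    # Divide by the root of that average so every parameter takes similar-sized steps.
    param = param + lr * grad / (np.sqrt(cache) + eps)
    return param, cache


# Toy objective -(w - 2)^2, maximized at w = 2.
w = np.array([0.0])
cache = np.zeros_like(w)
for _ in range(2000):
    grad = -2.0 * (w - 2.0)
    w, cache = rmsprop_ascent_step(w, grad, cache)
```

In the batched setting described above, `grad` would be the policy gradient summed over all decisions in the 10 episodes.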

The alternating black and white is interesting because as the ball travels along the trace, the neuron’s activity will fluctuate as a sine wave and due to the ReLU it would “fire” at discrete, separated positions along the trace.

The approach is a fancy form of guess-and-check, where the “guess” refers to sampling rollouts from our current policy, and the “check” refers to encouraging actions that lead to good outcomes.

In particular, anything with frequent reward signals that requires precise play, fast reflexes, and not too much long-term planning would be ideal, as these short-term correlations between rewards and actions can be easily “noticed” by the approach, and the execution meticulously perfected by the policy.

You can see hints of this already happening in our Pong agent: it develops a strategy where it waits for the ball and then rapidly dashes to catch it just at the edge, which launches it quickly and with high vertical velocity.

The idea was first introduced in Williams 1992 and more recently popularized by Recurrent Models of Visual Attention under the name “hard attention”, in the context of a model that processed an image with a sequence of low-resolution foveal glances (inspired by our own human eyes).

More generally, consider a neural network from some inputs to outputs: Notice that most arrows (in blue) are differentiable as normal, but some of the representation transformations could optionally also include a non-differentiable sampling operation (in red).

Therefore, during training we will produce several samples (indicated by the branches below), and then we’ll encourage samples that eventually led to good outcomes (in this case for example measured by the loss at the end).

In other words we will train the parameters involved in the blue arrows with backprop as usual, but the parameters involved with the red arrow will now be updated independently of the backward pass using policy gradients, encouraging samples that led to low loss.

However, with Policy Gradients and in cases where a lot of data/compute is available we can in principle dream big - for instance we can design neural networks that learn to interact with large, non-differentiable modules such as Latex compilers (e.g.

We saw that the algorithm works through a brute-force search where you jitter around randomly at first and must accidentally stumble into rewarding situations at least once, and ideally often and repeatedly before the policy distribution shifts its parameters to repeat the responsible actions.

We also saw that humans approach these problems very differently, in what feels more like rapid abstract model building - something we have barely even scratched the surface of in research (although many people are trying).

One related line of work intended to mitigate this problem is deterministic policy gradients - instead of requiring samples from a stochastic policy and encouraging the ones that get higher scores, the approach uses a deterministic policy and gets the gradient information directly from a second network (called a critic) that models the score function.

This approach can in principle be much more efficient in settings with very high-dimensional actions where sampling actions provides poor coverage, but so far seems empirically slightly finicky to get working.

For example AlphaGo first uses supervised learning to predict human moves from expert Go games and the resulting human mimicking policy is later finetuned with policy gradients on the “real” objective of winning the game.

And if you insist on trying out Policy Gradients for your problem make sure you pay close attention to the tricks section in papers, start simple first, and use a variation of PG called TRPO, which almost always works better and more consistently than vanilla PG in practice.

The core idea is to avoid parameter updates that change your policy too much, as enforced by a constraint on the KL divergence between the distributions predicted by the old and the new policy on a batch of data (instead of conjugate gradients the simplest instantiation of this idea could be implemented by doing a line search and checking the KL along the way).
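The line search variant mentioned in the parenthesis could look roughly like this. It is a simplified sketch and not TRPO itself: it assumes a linear softmax policy, takes the update direction as given, and only checks the KL constraint (a fuller version would also check that the objective improves). All names and the `max_kl` value are hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mean_kl(p_old, p_new):
    """Average KL(old || new) over a batch of action distributions."""
    return float(np.mean(np.sum(p_old * (np.log(p_old) - np.log(p_new)), axis=-1)))


def kl_line_search(theta, direction, states, max_kl=0.01, shrink=0.5, max_tries=20):
    """Shrink the step along direction until the policy changes by at most max_kl."""
    p_old = softmax(states @ theta)
    step = 1.0
    for _ in range(max_tries):
        candidate = theta + step * direction
        if mean_kl(p_old, softmax(states @ candidate)) <= max_kl:
            return candidate, step
        step *= shrink
    return theta, 0.0  # no acceptable step found: keep the old policy


rng = np.random.default_rng(0)
states = rng.standard_normal((64, 5))            # batch of 64 states, 5 features each
theta = np.zeros((5, 3))                         # linear policy over 3 actions
direction = 5.0 * rng.standard_normal((5, 3))    # deliberately too large a step

new_theta, step = kl_line_search(theta, direction, states)
kl_after = mean_kl(softmax(states @ theta), softmax(states @ new_theta))
```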

## S.P.E.E.D. Writing: 5 Tips to Double Your Writing Productivity

They get an idea, pound it out in minutes, post it to their blog, and move on to something else.

Anyone can write faster if they follow a 5-step formula for writing more efficiently.

I serve dozens of clients, maintain two of my own blogs, write for a political blog, write articles for half a dozen other blogs, and do other miscellaneous writing.

In fact, I rewrote this paragraph that you’re reading right now three times before moving on.

Then, because you know it’s not relevant, you’ll just spend more time deleting it later.

By sticking to one and only one idea, you’ll force yourself to stay on-point, which will shorten your writing time and give your readers a better post.

When you find yourself staring helplessly at your computer screen, it’s almost always because you don’t have facts at hand.

I keep a simple text file on my computer desktop and jot down ideas as I get them.

I don’t tear out magazine bits anymore because that creates clutter that I have to sort through later.

Just take all your facts or ideas and arrange them in the order you want them to appear in your finished piece, using your chosen structure as a guide.

For this article, I decided to use an easy to remember acronym, S.P.E.E.D., to give me five points to cover.

Turn off the TV, mute the phone, close your email program, get off your social networks, and just write.

When I get stumped, I often go back and read what I’ve written to create momentum that can carry me forward.

(I have to laugh at myself for giving this advice, because if this were a crime, I’d get life in jail.) If you follow this formula, you’ll quickly end up with a written post.

## Write This Down: Note-Taking Strategies for Academic Success

For most of your classes (especially lecture-heavy social science courses) I recommend taking notes with a laptop.

You can type faster than you can write, it makes organizing your notes easier, and your notes will always be in legible type instead of the chicken scratch you call handwriting.

While you could just use your computer’s default text file editor or word processor program, I recommend using a program specifically designed for note-taking.

Just scan your handwritten notes into Evernote, and Evernote will use the magic of image recognition technology to allow you to search for your handwritten notes within the app.

It also lets you record your professor using your computer’s microphone (just make sure to ask your professor first if it’s okay to record him or her).

As you take notes during class, you’ll probably want to bold, underline, or italicize certain points and words.

- To italicize text: Control+I (Command+I on Mac), then type what you want italicized
- To create a bulleted list: Depends on the platform
- To create a numbered list: Depends on the platform
- To find text: Control+F (Command+F on Mac). This is handy whenever you’re reviewing notes and want to find instances where you wrote about a specific topic.

If you find yourself typing certain phrases or words over and over again, save yourself time by using a text expander program.

For example, when I was taking Torts during my first year of law school, instead of typing out “intentional infliction of emotional distress”

Here are some text expander programs for the various operating systems out there:

- PhraseExpress (Windows 7)
- Texter (All other versions of Windows)
- TextExpander (Mac)
- AutoKey (Linux)
- AutoHotKey (Windows/Mac/Linux)

### Pen and Paper

To keep students from surfing around during class and force them to actually pay attention, some professors are starting to ban the use of laptops during their classes.

If you find yourself in one of these classes, you’ll need to use the note-taking tools your dad and grandpa used: good old fashioned pen and paper.

Even if your professor doesn’t ban laptops, there are some classes where it’s actually better to take notes by hand.

Classes that are heavy on numbers, equations, and formulas–calculus, chemistry, physics, economics, symbolic logic, etc.–are best suited for handwritten notes.

Being familiar with the material will better enable you to understand the professor’s lecture and separate out the important points.

Your goal isn’t to transcribe your professor’s lecture word for word; rather, it’s to extract and record the main points of it.

Write the professor’s summary at the end of class and his review at the beginning of the next class. At the end of the class, your professor will often summarize the main takeaway points.

At the beginning of the next class, your professor may give a quick review of the previous class and then provide a preview of how those points are related to the day’s lecture.

If you didn’t get a point, make a note of it, and wait until after class to ask. If you missed a point, make a note to remind yourself to ask the professor about it after class.

I don’t know how many times I wrote a note in class that later left me scratching my head and wondering, “What the heck did I mean by that?”

If you don’t understand a note, clarify it by reviewing the reading material or by asking a fellow classmate or the professor. Reviewing your notes after class also aids in memory retention.

It requires you to look at different bits of information, figure out the main ideas and how they relate, and organize them in a way that makes sense.

Over the years, professors and learning experts have suggested various note-taking styles to help students organize their notes.

### Rough Outline Method

My typical note-taking style is to simply create a rough outline of the lecture using bullet points.

Advocates of mind mapping argue that the non-linear, visual format of mind maps allows students to find connections they’d otherwise miss when using traditional note-taking strategies.

Also, because mind mapping is a somewhat creative activity, by engaging both the left and right hemispheres of your brain, learning retention is supposed to improve (a claim that some brain researchers dispute).

To mind map a lecture, you simply write the main topic of the day’s lecture at the center of a piece of paper.

I tried it a few times during my academic career, but never found it very helpful for recording lecture notes.

Classifying Handwritten Digits with TF.Learn - Machine Learning Recipes #7

Last time we wrote an image classifier using TensorFlow for Poets. This time, we'll write a basic one using TF.Learn. To make it easier for you to try this out, ...

Let’s Write a Decision Tree Classifier from Scratch - Machine Learning Recipes #8

Hey everyone! Glad to be back! Decision Tree classifiers are intuitive, interpretable, and one of my favorite supervised learning algorithms. In this episode, I'll ...
