
“Dog Cam” Trains Computer Vision Software for Robot Dogs

A dog’s purpose can take on new meaning when humans strap a GoPro camera to her head.

The idea behind DECADE, described as “a dataset of ego-centric videos from a dog’s perspective,” is to directly model the behavior of intelligent beings based on how they see and move around within the real world.

The dataset also enabled the researchers to predict the appropriate sequence of dog limb and joint movements that gets a dog from point A to point B: an early step toward programming a robotic dog to perform the same motions and behaviors.

The idea of modeling the overall behavior of “visually intelligent agents” differs from traditional computer vision research that trains machine learning algorithms (including more specialized deep learning algorithms) on very specific vision tasks such as object detection or body pose estimation.

They even showed that data from a single dog’s behavior can train algorithms to perform more generalized computer vision tasks: a demonstration of using dog behavior to supervise representation learning in algorithms.

The more impressive part of the demonstration involved training ResNet-18—a deep learning model for image recognition—on both their dog dataset and a standard ImageNet classification dataset in order to accomplish a particular computer vision task.

The task, known as “walkable surface estimation,” required the deep learning model to figure out what parts of any given image represent a walkable surface such as a carpet or floor area.
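Below is a minimal PyTorch sketch of what such a setup might look like, assuming an ImageNet-pretrained ResNet-18 encoder with a simple per-pixel classification head bolted on; the `WalkableSurfaceNet` name and the head design are illustrative, not the paper’s actual architecture.

```python
# Minimal sketch: fine-tune a pretrained ResNet-18 as the encoder of a
# binary "walkable vs. not walkable" segmentation model. Illustrative only.
import torch
import torch.nn as nn
from torchvision import models

class WalkableSurfaceNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights="IMAGENET1K_V1")
        # Keep everything up to the final conv stage; drop avgpool + fc.
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        # 1x1 conv maps the 512-channel feature map to a per-pixel logit.
        self.head = nn.Conv2d(512, 1, kernel_size=1)

    def forward(self, x):
        feats = self.encoder(x)              # (N, 512, H/32, W/32)
        logits = self.head(feats)            # (N, 1, H/32, W/32)
        # Upsample back to input resolution for a dense prediction.
        return nn.functional.interpolate(
            logits, size=x.shape[-2:], mode="bilinear", align_corners=False)

model = WalkableSurfaceNet()
criterion = nn.BCEWithLogitsLoss()           # walkable = 1, not walkable = 0
images = torch.randn(2, 3, 224, 224)         # dummy batch
masks = torch.randint(0, 2, (2, 1, 224, 224)).float()
loss = criterion(model(images), masks)
loss.backward()
```

The interesting comparison in the article is which pre-training, the dog data or ImageNet, gives this kind of model better starting features.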

“This can be scaled up to more dogs because the whole setup is adjustable for different sizes and breeds of dogs,” Ehsani said. “There isn't going to be any additional costs for the dog owners as well: They just need to attach the sensors and start playing with their dog, or go for a walk, and the setup will do the data collection automatically.”

The 10 coolest papers from CVPR 2018

Of course, there are always those papers that publish new groundbreaking results and bring great new knowledge into the field.

They may not be the most fundamentally groundbreaking works, but they’re fun to see and offer a creative and enlightening perspective on the field, often sparking new ideas from the new angle they present.

The real key is that they randomize many of the variables that training data can have. They showed some pretty promising results that demonstrate the effectiveness of pre-training with synthetic data.
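The snippet below is an illustrative domain-randomization generator, not the paper’s rendering pipeline: it shows the core trick of sampling nuisance variables (background color, object placement, brightness, noise) at random so that synthetic pre-training data doesn’t commit to any one rendering style.

```python
# Illustrative domain-randomization sketch: synthesize training images whose
# nuisance variables are sampled at random so a model pre-trained on them
# can't overfit to one rendering style. Not the paper's actual pipeline.
import numpy as np

rng = np.random.default_rng(0)

def random_synthetic_sample(size=64):
    # Random background color.
    img = np.full((size, size, 3), rng.uniform(0, 1, 3), dtype=np.float32)
    # Random square "object" with random color, position, and scale.
    s = rng.integers(8, size // 2)
    x, y = rng.integers(0, size - s, 2)
    img[y:y + s, x:x + s] = rng.uniform(0, 1, 3)
    # Random global brightness and pixel noise.
    img = img * rng.uniform(0.5, 1.5) + rng.normal(0, 0.05, img.shape)
    img = np.clip(img, 0, 1)
    bbox = (x, y, s, s)          # ground-truth annotation comes for free
    return img, bbox

image, bbox = random_synthetic_sample()
```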

All you need to train the network is a set of “good” looking images (for the output ground truth) and a set of “bad” looking images that you want to enhance (for the input images).
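Since the two sets are unpaired, one natural training recipe is adversarial: an enhancer maps “bad” images toward the distribution of “good” ones while a discriminator supplies the learning signal. The sketch below assumes that setup with tiny placeholder networks; it is not the paper’s actual architecture or loss.

```python
# Minimal unpaired-training sketch, assuming an adversarial setup: the
# enhancer maps "bad" images toward the distribution of "good" ones and a
# discriminator supplies the training signal. Both networks are tiny
# placeholders, not the paper's actual architectures.
import torch
import torch.nn as nn

enhancer = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 3, 3, padding=1))
discriminator = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                              nn.Flatten(),
                              nn.Linear(16 * 31 * 31, 1))  # for 64x64 input
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(enhancer.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

good = torch.rand(4, 3, 64, 64)   # stand-ins for the two unpaired image sets
bad = torch.rand(4, 3, 64, 64)

# Discriminator step: "good" images are real, enhanced "bad" ones are fake.
fake = enhancer(bad).detach()
loss_d = (bce(discriminator(good), torch.ones(4, 1)) +
          bce(discriminator(fake), torch.zeros(4, 1)))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Enhancer step: make enhanced images look "good" to the discriminator.
loss_g = bce(discriminator(enhancer(bad)), torch.ones(4, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```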

Polygon-RNN++ allows you to set rough polygon points around each object in the image, and then the network will automatically generate the segmentation annotation!

The paper shows that this method actually generalizes quite well and can be used to create quick and easy annotations for segmentation tasks!

In this paper, the authors design a model that, given an inventory of candidate garments and accessories, can assemble a minimal set of items that provides maximal mix-and-match outfits.

It’s trained using objective functions designed to capture the key ingredients of visual compatibility, versatility, and user-specific preference.
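As a rough illustration of the selection side of the problem, the sketch below greedily grows a small capsule that maximizes a pairwise-compatibility score; the embeddings and scoring functions are stand-ins (the user-preference term is omitted), not the paper’s formulation.

```python
# Hedged sketch of the selection idea: greedily grow a small "capsule" that
# maximizes a mix-and-match score. The scoring here is a placeholder.
import itertools
import random

random.seed(0)
inventory = [f"item_{i}" for i in range(12)]
embed = {item: [random.random() for _ in range(4)] for item in inventory}

def compatibility(a, b):
    # Placeholder: dot product of (here, random) style embeddings.
    return sum(x * y for x, y in zip(embed[a], embed[b]))

def capsule_score(capsule):
    if len(capsule) < 2:
        return 0.0
    # Versatility proxy: total pairwise compatibility across the capsule,
    # i.e. how many good mix-and-match outfits it supports.
    return sum(compatibility(a, b)
               for a, b in itertools.combinations(capsule, 2))

capsule, budget = [], 5
while len(capsule) < budget:
    best = max((i for i in inventory if i not in capsule),
               key=lambda i: capsule_score(capsule + [i]))
    capsule.append(best)
print(capsule)
```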

Their CNN estimates intermediate video frames and is capable of transforming standard 30fps videos into awesome looking slow motion at 240fps!

The model estimates the optical flow between frames and uses it to cleanly interpolate video frames so that the slow motion video looks crisp and sharp.
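The warping step can be sketched in a few lines, assuming the flow between the two frames has already been estimated by some flow network; the real method also blends a backward warp and reasons about occlusions, which this sketch skips.

```python
# Sketch of flow-based frame interpolation: warp frame0 toward an
# intermediate time t using an already-estimated optical flow field.
import torch
import torch.nn.functional as F

def warp_to_time(frame0, flow_0to1, t):
    """frame0: (N,3,H,W); flow_0to1: (N,2,H,W) in pixels; 0 < t < 1."""
    n, _, h, w = frame0.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float()     # (H,W,2), x then y
    # A pixel at time t maps back into frame 0 along roughly -t * flow_0to1.
    src = grid - t * flow_0to1.permute(0, 2, 3, 1)   # (N,H,W,2)
    # Normalize coordinates to [-1, 1] for grid_sample.
    src[..., 0] = 2 * src[..., 0] / (w - 1) - 1
    src[..., 1] = 2 * src[..., 1] / (h - 1) - 1
    return F.grid_sample(frame0, src, align_corners=True)

frame0 = torch.rand(1, 3, 128, 128)
flow = torch.zeros(1, 2, 128, 128)   # dummy flow field
# Seven intermediate frames per pair is exactly the 30 fps -> 240 fps ratio.
frames = [warp_to_time(frame0, flow, t / 8) for t in range(1, 8)]
```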

A set of CNN feature extractors is used to get image features from the video frames, which are then passed to a set of LSTMs, along with the sensor data, to learn and predict the dog’s actions.
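A rough PyTorch rendition of that pipeline might look like the following; the feature, sensor, and action dimensions are assumptions for illustration rather than the paper’s exact configuration.

```python
# Rough sketch of the described pipeline: per-frame CNN features plus sensor
# readings feed an LSTM that predicts the dog's action at each step.
import torch
import torch.nn as nn
from torchvision import models

class DogActionPredictor(nn.Module):
    def __init__(self, num_actions=10, sensor_dim=12, hidden=256):
        super().__init__()
        cnn = models.resnet18(weights="IMAGENET1K_V1")
        cnn.fc = nn.Identity()                 # 512-d feature per frame
        self.cnn = cnn
        self.lstm = nn.LSTM(512 + sensor_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_actions)

    def forward(self, frames, sensors):
        # frames: (N, T, 3, H, W); sensors: (N, T, sensor_dim)
        n, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(n, t, -1)
        out, _ = self.lstm(torch.cat([feats, sensors], dim=-1))
        return self.classifier(out)            # per-step action logits

model = DogActionPredictor()
logits = model(torch.randn(2, 5, 3, 224, 224), torch.randn(2, 5, 12))
```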

The fact that it can produce generally strong baseline segmentations of unseen object classes is critical for deploying such segmentation networks in the wild, where many unseen object classes may appear.

In a nutshell, the authors trained a model that, given a video of a soccer game, can output a dynamic 3D reconstruction of that game.

At test time, the bounding boxes, poses, and trajectories (across multiple frames) of the players are extracted in order to segment the players.

The basic idea behind NAS is that instead of manually designing the network architecture, we can use another network to “search” for the best model structure.

This will be huge in the future, especially for design-specific applications, since all we’ll really have to focus on is designing a good NAS algorithm rather than hand-designing a specific network for our specific application.
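As a toy illustration of the search loop, the sketch below uses plain random search in place of the learned controllers used in actual NAS work: propose an architecture, score it, keep the best.

```python
# Toy illustration of the NAS idea: a "controller" (here, plain random
# search, far simpler than learned controllers) proposes architectures,
# each is scored, and the best one wins.
import random

random.seed(0)
SEARCH_SPACE = {
    "depth": [2, 4, 6],
    "width": [32, 64, 128],
    "kernel": [3, 5],
}

def sample_architecture():
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(arch):
    # Stand-in for "train briefly and measure validation accuracy";
    # a made-up score so the sketch runs end to end.
    return 1.0 / (abs(arch["depth"] - 4) + abs(arch["width"] - 64) / 32 + 1)

best = max((sample_architecture() for _ in range(20)), key=evaluate)
print("best architecture found:", best)
```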


Demystifying AI

This session is a primer to introduce the concepts of deep learning with a specific focus on computer vision. It covers concepts including CNNs (Convolutional ...

DATA & ANALYTICS - Build smart applications with your new superpower: cloud machine learning

Recorded on Mar 24 2016 at GCP NEXT 2016 in San Francisco. Visual effects rendering is a computationally intensive process where one second of ...

BEEtag: A Low-Cost, Image-Based Tracking System for the Study of Animal Behavior and Locomotion

BEEtag: A Low-Cost, Image-Based Tracking System for the Study of Animal Behavior and Locomotion. James D. Crall et al. (2015), PLoS ONE ...

Lecture 7 | Training Neural Networks II

Lecture 7 continues our discussion of practical issues for training neural networks. We discuss different update rules commonly used to optimize neural networks ...

Build a Game AI - Machine Learning for Hackers #3

This video will get you up and running with your first game AI in just 10 lines of Python. The AI can theoretically learn to master any game you train it on, but has ...

Using Artificial Intelligence to Enhance Your Game (1 of 2)

Machine learning has revolutionized many important fields, ranging from computer vision and natural language processing to healthcare and robotics.

Lecture 3 | Loss Functions and Optimization

Lecture 3 continues our discussion of linear classifiers. We introduce the idea of a loss function to quantify our unhappiness with a model's predictions, and ...

Lecture 8 | Deep Learning Software

In Lecture 8 we discuss the use of different software packages for deep learning, focusing on TensorFlow and PyTorch. We also discuss some differences ...

Lesson 1: Deep Learning 2018

NB: Please go to to view this video since there is important updated information there. If you have questions, use the forums at ..

Seminar 9: Surya Ganguli - Statistical Physics of Deep Learning

MIT RES.9-003 Brains, Minds and Machines Summer Course, Summer 2015 View the complete course: Instructor: Surya ..