AI News, How to deliver on Machine Learning projects
- On Saturday, October 6, 2018
How to deliver on Machine Learning projects
Success for an ML team often means delivering a highly performing model within given constraints — for example, one that achieves high prediction accuracy, while subject to constraints on memory usage, inference time, and fairness.
That said, this framework is still immensely valuable for even the most experienced engineers when uncertainty increases: for example, when a model unexpectedly fails to meet requirements, when the team’s goals are suddenly altered (e.g., the test set is changed to reflect changes in product needs), or when team progress stalls just short of the goal.
For instance, if we’re building a tree detector to survey tree populations in an area, we might use an off-the-shelf training set from a similar Kaggle competition, and a hand-collected set of photos from the target area for the development and test sets.
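The split described in the tree-detector example can be sketched as follows. This is a minimal illustration, not the article's own code: the function name `split_dev_test`, the 50/50 split, and the file names are all assumptions made for the example.

```python
import random


def split_dev_test(target_items, dev_fraction=0.5, seed=0):
    """Shuffle target-domain items and split them into dev and test sets.

    The 50/50 default and the fixed seed are illustrative assumptions.
    """
    items = list(target_items)
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(items)
    cut = int(len(items) * dev_fraction)
    return items[:cut], items[cut:]


# Hypothetical file names standing in for hand-collected photos of the target area.
target_photos = [f"target_area_{i:03d}.jpg" for i in range(10)]
dev_set, test_set = split_dev_test(target_photos)

# The off-the-shelf data (e.g. from a similar competition) would be used for training only.
```

Drawing both the development and test sets from the same hand-collected pool keeps them representative of the target area, while the training data can come from elsewhere.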
Don’t get bogged down trying to develop a complete understanding of every shortcoming. Aim instead to understand the biggest factors, since many of the smaller issues will change or even disappear as you make improvements to your model.
If training set error is the current limiting factor, several issues might be contributing. If development set error is the current limiting factor, an analogous set of issues may be the cause. If test set error is the current limiting factor, this is often due to the development set being too small or the team overfitting to the development set over the course of many experiments.
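One way to make "the current limiting factor" concrete is to compare each set's error against the target, in order. The sketch below is an assumed heuristic for illustration, not a rule stated in the article; the function name and the example numbers are hypothetical.

```python
def limiting_factor(train_err, dev_err, test_err, target_err):
    """Return the first set whose error misses the target.

    Assumed heuristic: check training, then development, then test,
    because a problem on an earlier set usually has to be fixed first.
    """
    if train_err > target_err:
        return "training"
    if dev_err > target_err:
        return "development"
    if test_err > target_err:
        return "test"
    return "none"


# Hypothetical error rates: the model fits the training data but not the dev set.
bottleneck = limiting_factor(train_err=0.05, dev_err=0.20, test_err=0.21, target_err=0.10)
print(bottleneck)
```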
For any of the above situations, you can understand the failures of your model by manually inspecting a random set of examples it gets wrong. (You should generally not do this for the test set, to avoid “training” your system on those examples.)
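Pulling a random sample of errors for manual inspection might look like the sketch below. The function name `sample_errors`, the sample size, and the toy data are assumptions for illustration; in practice the inputs would be development set examples, not test set examples.

```python
import random


def sample_errors(examples, labels, predictions, k=20, seed=0):
    """Return up to k randomly chosen (example, label, prediction) triples the model got wrong."""
    wrong = [(x, y, p) for x, y, p in zip(examples, labels, predictions) if y != p]
    rng = random.Random(seed)  # fixed seed so the same sample can be revisited
    return rng.sample(wrong, min(k, len(wrong)))


# Toy development set: hypothetical image names with true and predicted labels.
dev_examples = [f"img_{i}" for i in range(6)]
dev_labels = ["tree", "tree", "no_tree", "tree", "no_tree", "tree"]
dev_preds = ["tree", "no_tree", "no_tree", "no_tree", "tree", "tree"]

to_inspect = sample_errors(dev_examples, dev_labels, dev_preds, k=2)
```

Sampling at random, rather than reading errors in file order, avoids drawing conclusions from whichever slice of the data happens to come first.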
We still find it useful to mentally separate the analysis stage and selection stage (below) since it is easy to slip into trying random approaches without really digging into the underlying issues.
Examples
At Insight, for example, when AI Fellow Jack Kwok was building a segmentation system to help with disaster recovery, he noticed that while his segmentation model performed well on his training set of satellite imagery, it performed poorly on the development set, which contained cities that were flooded by hurricanes.
For speech recognition systems, an in-depth error analysis on the development set may reveal that speakers with strong accents that are very different from those of the majority of users account for a disproportionate number of errors.
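Whether a group is responsible for a "disproportionate" share of errors can be checked by comparing its share of errors with its share of examples. The following is a minimal sketch under assumed names; the group labels and the toy data are hypothetical, not results from the article.

```python
from collections import Counter


def error_share_by_group(groups, labels, predictions):
    """Return {group: (share of all errors, share of all examples)}."""
    totals = Counter(groups)
    errors = Counter(g for g, y, p in zip(groups, labels, predictions) if y != p)
    n = len(groups)
    n_err = sum(errors.values())
    return {
        g: (errors[g] / n_err if n_err else 0.0, totals[g] / n)
        for g in totals
    }


# Toy development set: 8 majority-accent speakers and 2 strong-accent speakers.
groups = ["majority"] * 8 + ["strong_accent"] * 2
labels = [1] * 10
preds = [1] * 7 + [0] + [0, 0]  # one majority error, two strong-accent errors

shares = error_share_by_group(groups, labels, preds)
```

A group whose error share is well above its example share, as with `strong_accent` in this toy data, is a candidate for targeted data collection or further analysis.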
While more sophisticated approaches might look like they will get more done in one swing, we often find that improvements from many quick iterations swamp the gains from tinkering with state-of-the-art or bespoke solutions that take longer to get right.
What to try depends on the diagnosis: whether you need to tune the optimizer to better fit the data, the model is unable to fit the training data well, or the model isn’t generalizing to the development set. You know what to try, and you’ve made it simple for yourself; now it’s just a matter of implementing… “just”.
These decisions are easier to make if each cycle of the ML Loop is relatively cheap: you haven’t put too much energy into making your code perfect, and another attempt won’t take too long, so you can decide what to do based on the risk and value of the idea instead of sunk cost.
It is not at all magical, unfortunately: you will need to develop your ability to make good choices in each stage, such as identifying the performance bottleneck, deciding which solutions to try and how to implement them correctly, and measuring performance for your application.
While it’s hard to hold yourself accountable for hitting a specific accuracy goal when the fate of your experiment is uncertain, you can at least hold yourself accountable for finishing that error analysis, drawing up a list of ideas, coding them, and seeing how they work.