AI News, Live Metrics Stream with custom metrics and diagnostics in Azure ... artificial intelligence
- On Wednesday, May 8, 2019
Decrappification, DeOldification, and Super Resolution
In this article we introduce the idea of “decrappification”, a deep learning method implemented in fastai on PyTorch that can do some pretty amazing things, such as colorizing classic black and white movies, even ones from the silent era. The same approach can make your old family photos look like they were taken on a modern camera, and even improve the clarity of microscopy images taken with state of the art equipment at the Salk Institute, resulting in 300% more accurate cellular analysis.
In recent years generative models have advanced at an astonishing rate, largely due to deep learning, and particularly due to generative adversarial networks (GANs).
However, GANs are notoriously difficult to train: they require a large amount of data, need many GPUs and a lot of training time, and are highly sensitive to minor hyperparameter changes.
fast.ai has been working in recent years towards making a range of models easier and faster to train, with a particular focus on using transfer learning.
Transfer learning refers to pre-training a model using readily available data and quick and easy to calculate loss functions, and then fine-tuning that model for a task that may have fewer labels, or be more expensive to compute.
The pre-training approach that fast.ai selected was this: start with an image dataset and “crappify” the images, for example by reducing the resolution, adding JPEG artifacts, and obscuring parts with random text.
Then, the loss function was replaced with a combination of other loss functions used in the generative modeling literature (more details in the f8 video), and the model was trained for another couple of hours.
Jason Antic, the creator of DeOldify, had the ambition to successfully colorize real world old images despite the noise, contrast, and brightness problems caused by film degradation.
He discovered that just a tiny bit of GAN fine-tuning on top of the process developed with fast.ai could create colorized movies in just a couple of hours, at a quality beyond any automated process that had been built before.
Meanwhile, Uri Manor, Director of the Waitt Advanced Biophotonics Core (WABC) at the Salk Institute, was looking for ways to simultaneously improve the resolution, speed, and signal-to-noise of the images taken by the WABC’s state of the art ZEISS scanning electron and laser scanning confocal microscopes.
These three parameters are notably in tension with one another - a variant of the so-called “triangle of compromise”, the bane of existence for all photographers and imaging scientists alike.
The advanced microscopes at the WABC are heavily used by researchers at the Salk (as well as several neighboring institutions including Scripps and UCSD) to investigate the ultrastructural organization and dynamics of life, ranging anywhere from carbon capturing machines in plant tissues to synaptic connections in brain circuits to energy generating mitochondria in cancer cells and neurons.
The scanning electron microscope is distinguished by its ability to serially slice and image an entire block of tissue, resulting in a 3-dimensional volumetric dataset at nearly nanometer resolution.
The so-called “Airyscan” scanning confocal microscopes at the WABC boast a cutting-edge array of 32 hexagonally packed detectors that facilitate fluorescence imaging at nearly double the resolution of a normal microscope while also providing 8-fold sensitivity and speed.
Using carefully acquired high resolution images for training, the group validated “generalized” models for super-resolution processing of electron and fluorescence microscope images, enabling faster imaging with higher throughput, lower sample damage, and smaller file sizes than ever reported.
Since the models are able to restore images acquired on relatively low-cost microscopes, this approach also presents an opportunity to “democratize” high resolution imaging for those not working at elite institutions that can afford the latest cutting edge instrumentation.
Taking inspiration from a blog post about stabilizing neural style transfer in video, he was able to add a “stability” measure to the loss function being used for creating single image super-resolution.
This stability, when combined with information about the preceding and following frames of video, significantly reduces flicker and improves the quality of output when processing low resolution movies.
Additionally, high quality video colorization in DeOldify is made possible by advances in training that greatly increase stability and quality of rendering, largely brought about by employing NoGAN training.
“Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other.” This same approach was adapted to the critic and U-Net based generators used in DeOldify.
We’ve also employed Gaussian noise augmentation in video model training in order to reduce model sensitivity to meaningless noise (grain) in film.
In the NoGAN training process used for DeOldify, GAN training up to the “inflection point” requires iterating through only about 1% to 3% of ImageNet data (or roughly 2600 to 7800 iterations on a batch size of five).
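The iteration counts quoted above can be sanity-checked with a little arithmetic. The sketch below assumes that “ImageNet data” means the ILSVRC-2012 training split of 1,281,167 images; the text does not say which subset was used, so treat the dataset size and the helper name `iterations_for_fraction` as assumptions.

```python
# Assumption: ILSVRC-2012 ImageNet training split (1,281,167 images).
IMAGENET_TRAIN_IMAGES = 1_281_167
BATCH_SIZE = 5  # batch size quoted in the text


def iterations_for_fraction(fraction, n_images=IMAGENET_TRAIN_IMAGES,
                            batch_size=BATCH_SIZE):
    """Number of batches needed to see `fraction` of the dataset once."""
    return round(n_images * fraction / batch_size)


low = iterations_for_fraction(0.01)   # 1% of the data
high = iterations_for_fraction(0.03)  # 3% of the data
print(low, high)
```

Under that assumption, 1% and 3% of the data come to roughly 2,560 and 7,690 batches of five, which is in line with the "2600 to 7800" range given above.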
However, there’s still the issue of artifacts and discoloration being introduced after the “inflection point”, and it’s suspected that this could be reduced or eliminated by mitigating batch size related issues.
In just fifteen minutes of direct GAN training (or 1500 iterations), the output of the feature loss based super resolution training from fast.ai’s Part 1 Lesson 7 is noticeably sharpened (see the original lesson Jupyter notebook).
If you took the original DeOldify model and merely colorized each frame just as you would any other image, the result was a flickering mess. The immediate solution that usually springs to mind is temporal modeling of some sort, whereby you enforce constraints on the model during training over a window of frames to keep colorization decisions more consistent.
For example, a larger resnet backbone (resnet101) makes a noticeable difference in how accurately and robustly features are detected, and therefore how consistently the frames are rendered as objects and scenes move.
In contrast, simpler loss functions such as MSE and L1 loss tend to produce dull colorizations as they encourage the networks to “play it safe” and bet on gray and brown by default.
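A toy calculation may help show why these losses “play it safe”. The numbers are hypothetical: suppose the same gray input pixel is, in the training data, equally often strongly red (chroma +60) and strongly green (chroma -60), with 0 meaning neutral gray. This is only an illustration of the averaging effect, not the DeOldify training code.

```python
# Hypothetical ground-truth chroma values for one and the same gray input:
# half the examples are vivid red (+60), half are vivid green (-60).
targets = [60.0, -60.0, 60.0, -60.0]


def mse(pred, ys):
    """Mean squared error of a constant prediction against all targets."""
    return sum((pred - y) ** 2 for y in ys) / len(ys)


def l1(pred, ys):
    """Mean absolute error of a constant prediction against all targets."""
    return sum(abs(pred - y) for y in ys) / len(ys)


# Try every whole-number prediction from -60 to +60 and keep the MSE winner.
candidates = [float(c) for c in range(-60, 61)]
best_mse_pred = min(candidates, key=lambda c: mse(c, targets))

print("MSE-optimal prediction:", best_mse_pred)
print("MSE at gray vs. vivid red:", mse(0.0, targets), mse(60.0, targets))
print("L1 at gray vs. vivid red:", l1(0.0, targets), l1(60.0, targets))
```

In this toy setup MSE is minimized by the neutral value, and L1 gives the same score to gray as to a vivid guess, so neither loss rewards the network for committing to a saturated color.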
In creating a model to produce high resolution microscopy videos from low resolution sequences, we have been fortunate at Salk because we have produced decent results using only synthetic low resolution data.
This is important because acquiring perfectly aligned pairs of high resolution and low resolution images is time consuming and such pairs are rare, and it would be even harder or impossible with video (live cells).
The crappifier is a function that transforms a high resolution image into a low resolution image that approximates the real low resolution images we will be working with once our model is trained.
For example, we found that if our crappifier injected too much high frequency noise into the training data, the trained model would have a tendency to eliminate thin and fine structures like those of neurons.
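As a rough illustration of what such a crappifier might look like, here is a minimal sketch using NumPy. It is not the function used at Salk, whose details the text does not give; the block-average downsampling, the additive Gaussian noise, and the parameter names `scale` and `noise_std` are all assumptions chosen for simplicity.

```python
import numpy as np


def crappify(hr, scale=4, noise_std=0.05, rng=None):
    """Turn a high-resolution grayscale image (2-D float array in [0, 1])
    into a synthetic low-resolution one: block-average, then add noise."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = hr.shape
    h, w = h - h % scale, w - w % scale            # crop so blocks tile evenly
    blocks = hr[:h, :w].reshape(h // scale, scale, w // scale, scale)
    lr = blocks.mean(axis=(1, 3))                  # average each scale x scale block
    lr = lr + rng.normal(0.0, noise_std, lr.shape) # simulated acquisition noise
    return np.clip(lr, 0.0, 1.0)


hr = np.random.default_rng(0).random((64, 64))     # stand-in for a real image
lr = crappify(hr, scale=4, noise_std=0.05, rng=np.random.default_rng(1))
print(hr.shape, "->", lr.shape)
```

In a sketch like this, `noise_std` is the knob that the observation above is about: set it too high and the training pairs teach the model that fine, thin detail is noise to be removed.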
This image shows an example from a training run in which we use 5 sequential images (t-2, t-1, t0, t+1, t+2) to predict a single super-resolution output image (also at time t0). For the movies we used bundles of 3 images and predicted the high resolution image at the corresponding middle time.
We chose 3 images because that conveniently allowed us to easily use pre-existing super-resolution network architectures, data loaders and loss functions that were written for 3 channels of input.
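The bundling described above can be sketched in a few lines of plain Python. The helper name `make_bundles` is hypothetical, and the strings stand in for actual low resolution frames; in practice each bundle of three frames would be stacked as the three input channels of the network.

```python
def make_bundles(frames, window=3):
    """Return (bundle, target_index) pairs for every full sliding window.

    Each bundle holds `window` consecutive frames; the target is the index
    of the middle frame, whose high-resolution version is to be predicted.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so there is a middle frame")
    half = window // 2
    return [(frames[t - half:t + half + 1], t)
            for t in range(half, len(frames) - half)]


frames = ["f0", "f1", "f2", "f3", "f4"]  # stand-ins for low-resolution frames
for bundle, t in make_bundles(frames):
    print(bundle, "-> predict high-res frame", t)
```

Note that the first and last frames never appear as a middle frame, so a sequence of N frames yields N - 2 predictions with a window of 3.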
Given the low resolution image sequence X that we will use to predict the true high resolution image T, we create X1 and X2 which result from two separate applications of the random noise generating crappifier function.
We used L1 loss, but you could also use a feature loss or some other approach to measure the difference between Y1 and Y2, the model outputs for X1 and X2:
LossStable = loss(Y1, Y2)
Our final training loss is therefore:
loss = L1 + L2 + LossStable
Here L1 and L2 presumably denote the losses of Y1 and Y2 against the target T, not the L1 norm mentioned above.
Now that we have a trained model, generating high resolution output from low resolution input is simply a matter of running the model across a sliding window of, in this case, three low resolution input images at a time.
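The stability loss described above can be sketched as follows. This is an illustration in NumPy rather than the PyTorch training code, so it computes the loss value but is not differentiable. The function names are hypothetical, and it assumes that the first two terms of the final loss are the losses of each output against the target T, and that both inputs are produced by crappifying the same source image.

```python
import numpy as np


def l1_loss(a, b):
    """Mean absolute difference between two arrays."""
    return float(np.mean(np.abs(a - b)))


def stable_training_loss(model, crappify, source, target):
    """loss(Y1, T) + loss(Y2, T) + loss(Y1, Y2) for two random degradations."""
    x1 = crappify(source)            # first random degradation
    x2 = crappify(source)            # second, independent degradation
    y1, y2 = model(x1), model(x2)
    loss_1 = l1_loss(y1, target)     # first output against the target
    loss_2 = l1_loss(y2, target)     # second output against the target
    loss_stable = l1_loss(y1, y2)    # penalizes outputs that react to the noise
    return loss_1 + loss_2 + loss_stable


rng = np.random.default_rng(0)


def toy_crappify(img):
    """Toy stand-in for a crappifier: only adds random noise."""
    return img + rng.normal(0.0, 0.1, img.shape)


def identity_model(x):
    """Toy stand-in for a network: returns its input unchanged."""
    return x


target = np.zeros((8, 8))
print(stable_training_loss(identity_model, toy_crappify, target, target))
```

A model that passes the noise straight through, like the toy identity model here, pays for it three times: once in each reconstruction term and once more in the stability term.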
- On Friday, October 18, 2019
Smart Insights with Machine Learning from Azure Monitoring : Build 2018
In today's world of multi-layered applications generating a huge amount of telemetry data, getting Insights that will significantly shorten your Root Cause Analysis ...
Microsoft Business Applications Virtual Launch Event April 2019
Watch the on-demand footage of the virtual launch event to experience the new capabilities and features across Dynamics 365, Power BI, PowerApps, Microsoft ...
DevOps for AI : Deploying everywhere : Build 2018
“AI Lifecycle for App developers”: how to manage artifacts that come from DS's and Cognitive Services integrate.
Build Intelligent Apps with the Microsoft Data & AI Platform : Build 2018
Join Rohan Kumar, Corporate Vice President of Data Platform, to learn how Microsoft provides the most comprehensive data platform for your modern, intelligent ...
Machine learning at scale : Build 2018
AI/ML expert Paige Bailey takes you on a tour of the powerful services available on Azure. You'll see how to take your predictive model to production, ...
Visual Studio 2019 Launch Event
Join us on April 2 for the launch of Visual Studio 2019. Learn about how Visual Studio 2019 is more productive, modern, and innovative, participate in live Q&As, ...
Business Applications keynote | Microsoft Ignite 2018
Learn how customers are driving digital transformation with Microsoft business applications including Artificial Intelligence (AI) and mixed reality. Find out more: ...
Microsoft & Prism Skylabs: Using AI to help organizations search visual data
Prism Skylabs is using Microsoft Cognitive Services to help businesses search, analyze and categorize their videos automatically with artificial intelligence.
Build a Solution on Azure Gov: The Connected Officer Bringing IoT to Policing (GOV)
MyAnalytics: Help employees thrive with AI and productivity insights in Office 365 - BRK1038
As knowledge work becomes more complex, many employees struggle to master their time and relationships. Nearly half of meeting time is seen as ...